Re: new sigil

2005-11-04 Thread Larry Wall
On Fri, Nov 04, 2005 at 08:14:11PM +0100, TSa wrote:
: HaloO,
: 
: Larry Wall wrote:
: >On Tue, Oct 25, 2005 at 10:25:48PM -0600, Luke Palmer wrote:
: >: Yeah, I didn't really follow his argument on that one.  I, too, think
: >: that the one() junction in general is silly, especially for types.
: >
: >Well, I think it's silly too.  I'm just trying to see if we need to
: >reserve the syntax in case someone figures out a way to make it
: >unsilly in some dialect.
: 
: So, here are three non-trivial types that make excelent use of the
: one() junction:
: 
:   1) The parse tree of a packrat parser
:   2) a n-ary, recursive decision tree
:   3) a multi method
: 
: I wouldn't call these silly :)

Hmm, yes, one might even go as far as to put 4) union types.

: >And that's why I'm kind of
: >pushing for a natural bundling via juxtaposition that can be viewed
: >as ANDing the constraints:
: >
: >:(Any Dog Cat Fish ¢T $dc)
: >
: >That says much the same thing as
: >
: >:($ where {
: > .does(Any) and
: > .does(Dog) and
: > .does(Cat) and
: > .does(Fish) and
: > .does(Class) and ¢T := $dc.class and
: > .does(Scalar) and $dc := $_;
: > }
: >)
: 
: This is a very nice way to avoid explicit infix & syntactically,
: which is a great achievement in its own right. BTW, does a sub
: name in there count as a type constraint?

Which "in there" are you referring to?  Syntactically the inside of
a "where" is an ordinary expression, so "Dog" has to be predeclared
or use :: on the front.

: Or are only package kinds applicable? I mean the ones that would get
: a :: sigil if the sigil were required.

If you mean the "where" expression, the inside of .does() is just
evaluating an ordinary expression, so you could certainly put a sub
call or anything else in there.  If you're asking about the proposed
stacked type constraint syntax, I don't think a sub will work there
because of syntactic ambiguity with wanting to declare &blocks
and such.

Larry


Re: new sigil

2005-11-04 Thread TSa

HaloO,

Larry Wall wrote:

On Tue, Oct 25, 2005 at 10:25:48PM -0600, Luke Palmer wrote:
: Yeah, I didn't really follow his argument on that one.  I, too, think
: that the one() junction in general is silly, especially for types.

Well, I think it's silly too.  I'm just trying to see if we need to
reserve the syntax in case someone figures out a way to make it
unsilly in some dialect.


So, here are three non-trivial types that make excelent use of the
one() junction:

  1) The parse tree of a packrat parser
  2) a n-ary, recursive decision tree
  3) a multi method

I wouldn't call these silly :)



: When you say Dog^Cat, you're saying "I want something that either
: conforms to the Dog interface or the Cat interface, but *definitely
: not both*!"  Why the heck would you care about that?  Does there
: really arise a situation in which your code will be erroneous when the
: variable conforms to both interfaces?


Hmm, I think MMD in a certain way picks one of many targets out of
a collection of applicable ones. The whole point of the excluded middle
is *avoiding ambiguity* up-front.

The question is at what point on the time line of a concrete program
instance one is wanting to resolve the ambiguity. Note that it is easy
to shift it *after* a successful run with a certain set of arguments!



: And in fact, its very existence defies another implicit principle of
: mine, that is, the "principle of partial definition":  Defining a new
: type or instance can only break a previously typechecking program by
: making it ambiguous.  The idea behind that is that at some time you
: may realize that oen of your types already obeys another type, and
: declare that it conforms to that interface.  But you don't go the
: other way around, undeclaring that an interface holds, without your
: program having been erroneous in the first place.  Declaring that a
: new interface holds (so long as it actually does) shouldn't break
: anything that was already correct.


I think the situation is a bit more complicated than that. In this
chain of arguments is a hidden assumption of asymmetry in the sense
that one interface was first and therefore the other has to adapt.
Thus putting two self-consistent systems together could break the
whole thing. Take e.g. the right versus left driving in Great Britain
and continental Europe. They are both self-consistent but mutually
exclusive. I can hardly imagine a shared lane approach for disambiguation.
This works for pedestrians but not cars ;)

Another example where 1 + 1 = 1 is two laola waves in a stadion
running in opposite directions. For the individual standing up
where the waves permeate each other it makes no difference, but
the people outside that area get up and down in a higher frequent
pattern than for a single wave. That is they stand up because
1 + 0 = 0 + 1 = 1 for waves from the left and right.



And that's basically what we decided in Portland when we went to
set types rather than junctional types.  And that's why I'm kind of
pushing for a natural bundling via juxtaposition that can be viewed
as ANDing the constraints:

:(Any Dog Cat Fish ¢T $dc)

That says much the same thing as

:($ where {
.does(Any) and
.does(Dog) and
.does(Cat) and
.does(Fish) and
.does(Class) and ¢T := $dc.class and
.does(Scalar) and $dc := $_;
}
)


This is a very nice way to avoid explicit infix & syntactically,
which is a great achievement in its own right. BTW, does a sub
name in there count as a type constraint? Or are only package
kinds applicable? I mean the ones that would get a :: sigil if
the sigil were required.



And basically, if | can be used to construct type sets, it ends up
meaning exactly the same thing:

:(Any|Dog|Cat|Fish ¢T $dc)

But maybe that just means we don't need it.


I thought, theory theory is about lifting these issues onto a higher
level by introducing an explicit calculus with predicates on type
variables. And then letting the type system point out (self-)
inconsistencies and more important incompatibilities between two
self-consistent modules forced to play together in an importing
module or program.

I further thought that junctions are the value or runtime shadow
of the theory level. Just as the grammar engine is always there,
the type system is also hanging out for occassional runtime
interference---that is e.g. throwing exceptions or on the fly
theory instanciations or some such. In both cases compile time is
in a certain way the extreme case of "interference" of these two
systems producing the code of the program :)


The following puzzle might serve to illustrate the point I try
to make. There is a given square of side length a. At the top left
and the bottom right corners a line of length b is attached in the
direction of the left and right sides of the square. The endpoints
of these extensions are connected with the respective opposite corners,
thus forming a para

Re: new sigil

2005-10-26 Thread Larry Wall
On Tue, Oct 25, 2005 at 10:25:48PM -0600, Luke Palmer wrote:
: Yeah, I didn't really follow his argument on that one.  I, too, think
: that the one() junction in general is silly, especially for types.

Well, I think it's silly too.  I'm just trying to see if we need to
reserve the syntax in case someone figures out a way to make it
unsilly in some dialect.

: When you say Dog^Cat, you're saying "I want something that either
: conforms to the Dog interface or the Cat interface, but *definitely
: not both*!"  Why the heck would you care about that?  Does there
: really arise a situation in which your code will be erroneous when the
: variable conforms to both interfaces?
: 
: And in fact, its very existence defies another implicit principle of
: mine, that is, the "principle of partial definition":  Defining a new
: type or instance can only break a previously typechecking program by
: making it ambiguous.  The idea behind that is that at some time you
: may realize that oen of your types already obeys another type, and
: declare that it conforms to that interface.  But you don't go the
: other way around, undeclaring that an interface holds, without your
: program having been erroneous in the first place.  Declaring that a
: new interface holds (so long as it actually does) shouldn't break
: anything that was already correct.

And that's basically what we decided in Portland when we went to
set types rather than junctional types.  And that's why I'm kind of
pushing for a natural bundling via juxtaposition that can be viewed
as ANDing the constraints:

:(Any Dog Cat Fish ¢T $dc)

That says much the same thing as

:($ where {
.does(Any) and
.does(Dog) and
.does(Cat) and
.does(Fish) and
.does(Class) and ¢T := $dc.class and
.does(Scalar) and $dc := $_;
}
)

And basically, if | can be used to construct type sets, it ends up
meaning exactly the same thing:

:(Any|Dog|Cat|Fish ¢T $dc)

But maybe that just means we don't need it.

Larry


Re: new sigil

2005-10-26 Thread Rob Kinyon
> And in fact, its very existence defies another implicit principle of
> mine, that is, the "principle of partial definition":  Defining a new
> type or instance can only break a previously typechecking program by
> making it ambiguous.  The idea behind that is that at some time you
> may realize that oen of your types already obeys another type, and
> declare that it conforms to that interface.  But you don't go the
> other way around, undeclaring that an interface holds, without your
> program having been erroneous in the first place.  Declaring that a
> new interface holds (so long as it actually does) shouldn't break
> anything that was already correct.
>
> The principle also has strong implications with library code:
> including a new library but doing nothing with it shouldn't start
> randomly breaking stuff.  (Unless, of course, it breaks the rules and
> does crazy stuff, in which case anything goes)

Is this better expressed as side-effect-free programming or loose
coupling/tight cohesion?

Rob


Re: new sigil

2005-10-25 Thread Luke Palmer
On 10/25/05, Eric <[EMAIL PROTECTED]> wrote:
> I would just like to mention that 'class' is confusing because you
> don't realy mean class there.  The whole conversation is about types
> so why not have it be 'type'?

If you read the introduction to theory.pod[1], you'll find that we are
actually talking about classes.  First, let me define what I mean:

class: A concrete, instantiable type.  Each value has exactly one of these.
type: A behavioral interface.  Each value probably has more than
one of these.

One of the big ideas behind theories is that you can never write the
name of a class in your program... ever.  The only way you can talk
about classes is through variables with constraints.  So, when I say:

sub foo(Int $bar) { $bar + 1 }

I actually mean:

sub foo(¢T $bar where ¢T (in) Int) { $bar + 1 }

(Which is pseudosyntax)

That is, foo accepts any *class* which obeys the Int *type*.  That's
the most specific you can get.


> BTW didn't you contradict your own real world usage of type1^type2 ?
> Even if we use ^ as a sigil why would it get confused on that?  I
> don't think type1 ^type2 could have any realy meaning so it should be
> easy for the parser to know the difference.

Yeah, I didn't really follow his argument on that one.  I, too, think
that the one() junction in general is silly, especially for types.

When you say Dog^Cat, you're saying "I want something that either
conforms to the Dog interface or the Cat interface, but *definitely
not both*!"  Why the heck would you care about that?  Does there
really arise a situation in which your code will be erroneous when the
variable conforms to both interfaces?

And in fact, its very existence defies another implicit principle of
mine, that is, the "principle of partial definition":  Defining a new
type or instance can only break a previously typechecking program by
making it ambiguous.  The idea behind that is that at some time you
may realize that oen of your types already obeys another type, and
declare that it conforms to that interface.  But you don't go the
other way around, undeclaring that an interface holds, without your
program having been erroneous in the first place.  Declaring that a
new interface holds (so long as it actually does) shouldn't break
anything that was already correct.

The principle also has strong implications with library code: 
including a new library but doing nothing with it shouldn't start
randomly breaking stuff.  (Unless, of course, it breaks the rules and
does crazy stuff, in which case anything goes)

Luke

[1] I'm still thinking in terms of this proposal.  If it turns out to
be wrong, disregard my comments.


Re: new sigil

2005-10-25 Thread Eric
On 10/25/05, Larry Wall <[EMAIL PROTECTED]> wrote:
> On Wed, Oct 26, 2005 at 01:17:10AM +0200, Juerd wrote:
> : Larry Wall skribis 2005-10-25 15:51 (-0700):
> : > ^T would still have to be a placeholder variable.
> :
> : Which it is, in a way.
>
> Though we don't currently allow placeholders in ordinary sigs, or even
> in conjunction with ordinary sigs.
>
> : Still, I don't think ^ as a sigil needs to mean the same thing as ^ as a
> : twigil. Visually similar pairs are also not related:
> :
> : ?foo$?foo
> : *foo$*foo
> : +foo$+foo
> : =foo$=foo
> :$
> True, though it would be the first time we used the same character as
> both a sigil and a twigil.  ^^T would certainly be a placeholder
> in that case, but that might be confusing.
>
> : I think that it would help, and in different ways, even, to see ¢ as a
> : prefix operator with special syntax, instead of as a sigil. It doesn't
> : fit well in the list of sub, hash, array, scalar.
>
> But then you can't have ¢*T, ¢+T, ¢^T, ¢?T, ¢.T, etc.
>
> By the way, the meaning of the + twigil just changed.  Last week it
> meant "The $? of the currently compiling unit".  That's been taken
> over by the COMPILING::<$?foo> notation, partly because I wanted to
> steal $+ for generalized "environment" variables, that is, dynamically
> visible lexicals, such as $_.  See
>
> http://svn.perl.org/perl6/doc/trunk/design/syn/S02.pod
>
> I've also checked in various changes to
>
> http://svn.perl.org/perl6/doc/trunk/design/syn/S06.pod
>
> [Everyone please remember to start new threads for unrelated topics.]
>
> : Making it a prefix op would allow whitespace after it, which would make
> : the "class" keyword not seem so desperato. (I think it's a bad keyword
> : for this, and picking ^ instead would render it unnecessary, but more
> : about why I think "class" is bad for this in a later post.)
>
> Let's delete "sub" too while we're at it.  :-)
>
> I'm not stuck on "class" as the ASCII workaround.  It's not quite the
> same as "sub" since we don't allow "sub X" everywhere we allow "&X".
>
> : > And it might conflict with infix ^ if we ever allow xor'ed types,
> : > since declarations contain lots of things that look like juxtaposed
> : > terms.
> :
> : Is this the same "conflict" that occurs in %foo % %bar?
>
> Nope.  Normal expressions always know whether they're expecting a term
> or operator.  I'm talking about in signatures where (it seems to me)
> terms are juxtaposed to mean "and", and the available operators
> can only be ones that can't be confused with a term prefix or zone
> marker.  We actually had that problem back when we had junctional
> types:
>
> T1&T2
>
> Is that a sub T2 returning a T1?
>
> T1 &T2
>
> Or is a type constraint that must match both types?
>
> T1 & T2
>
> Currently we've said that such junctions should be down in a "where"
> clause, but it would be nice to leave the door open for
>
> Int|Str
>
> type constraints.  But it's somewhat problematic to have a junctional
> return type, so maybe that will never happen, and the most complicated
> thing we'll see (in the absence of where clauses and subsignatures),
> is
>
> sub foo (Mammal ¢T $fido)
>
> In that case ^ or | could be forced to work.  But I'd still rather use
> something that doesn't shout "operator", especially if the notation can
> sneak into rvalue code.  And I'd prefer to leave the ASCII characters
> available for real operators and metaoperators later.
>
> : (I cannot imagine needing a one() junction for types, by the way. If
> : someone can come up with a good real-life example, please do so.)
>
> Oh, that's easy, for some definition of "real".  In real life, cats and
> dogs don't overlap, so if you say Cat|Dog, you really mean Cat^Dog.
> But these junctional types are only good for constraints.  A given
> object may only have a set of types, not a junction of types.
>
> Larry
>
I would just like to mention that 'class' is confusing because you
don't realy mean class there.  The whole conversation is about types
so why not have it be 'type'?  On the whole i think ^ makes good sense
and looks pretty good.  It also meets the needs of being a single
letter, not already being a sigil, etc.  'class ' is no longer a
twigil and leads to the confusing reasoning you had about where it
needs class and where it doesn't\

sub test (^T $x) {
my ^T $y = $x + 5;
}

Is pretty easy and straight forward.  With class i can't figure and
obvious DWIM.

sub test (class T $x) {
my class T $y = $x + 5;
}

That doesn't look right because you have 'my class' so you would think
the type is 'class' except the type isn't class, its T.  If you
bareword T then it might overlap (collide) with a real class named T.

In the following case it is not clear wether Dog is defined in the
signature, or a realy seperate class.

sub test (class Dog $x) {
my Dog $y = $x + 5;
}

The same example with ^ becomes much clearer to me.
sub test

Re: new sigil

2005-10-25 Thread Larry Wall
On Wed, Oct 26, 2005 at 01:17:10AM +0200, Juerd wrote:
: Larry Wall skribis 2005-10-25 15:51 (-0700):
: > ^T would still have to be a placeholder variable.
: 
: Which it is, in a way.

Though we don't currently allow placeholders in ordinary sigs, or even
in conjunction with ordinary sigs.

: Still, I don't think ^ as a sigil needs to mean the same thing as ^ as a
: twigil. Visually similar pairs are also not related:
: 
: ?foo$?foo
: *foo$*foo
: +foo$+foo
: =foo$=foo
:$ notation, partly because I wanted to
steal $+ for generalized "environment" variables, that is, dynamically
visible lexicals, such as $_.  See

http://svn.perl.org/perl6/doc/trunk/design/syn/S02.pod

I've also checked in various changes to

http://svn.perl.org/perl6/doc/trunk/design/syn/S06.pod

[Everyone please remember to start new threads for unrelated topics.]

: Making it a prefix op would allow whitespace after it, which would make
: the "class" keyword not seem so desperato. (I think it's a bad keyword
: for this, and picking ^ instead would render it unnecessary, but more
: about why I think "class" is bad for this in a later post.)

Let's delete "sub" too while we're at it.  :-)

I'm not stuck on "class" as the ASCII workaround.  It's not quite the
same as "sub" since we don't allow "sub X" everywhere we allow "&X".

: > And it might conflict with infix ^ if we ever allow xor'ed types,
: > since declarations contain lots of things that look like juxtaposed
: > terms.
: 
: Is this the same "conflict" that occurs in %foo % %bar? 

Nope.  Normal expressions always know whether they're expecting a term
or operator.  I'm talking about in signatures where (it seems to me)
terms are juxtaposed to mean "and", and the available operators
can only be ones that can't be confused with a term prefix or zone
marker.  We actually had that problem back when we had junctional
types:

T1&T2

Is that a sub T2 returning a T1?

T1 &T2

Or is a type constraint that must match both types?

T1 & T2

Currently we've said that such junctions should be down in a "where"
clause, but it would be nice to leave the door open for

Int|Str

type constraints.  But it's somewhat problematic to have a junctional
return type, so maybe that will never happen, and the most complicated
thing we'll see (in the absence of where clauses and subsignatures),
is

sub foo (Mammal ¢T $fido)

In that case ^ or | could be forced to work.  But I'd still rather use
something that doesn't shout "operator", especially if the notation can
sneak into rvalue code.  And I'd prefer to leave the ASCII characters
available for real operators and metaoperators later.

: (I cannot imagine needing a one() junction for types, by the way. If
: someone can come up with a good real-life example, please do so.)

Oh, that's easy, for some definition of "real".  In real life, cats and
dogs don't overlap, so if you say Cat|Dog, you really mean Cat^Dog.
But these junctional types are only good for constraints.  A given
object may only have a set of types, not a junction of types.

Larry


Re: Avoid the Yen Sign [Was: Re: new sigil]

2005-10-25 Thread Juerd
Jan Dubois skribis 2005-10-25 12:33 (-0700):
> Just something to keep in mind in case you are tempted to use the Won
> sign as a sigil or operator in the future.

I don't know what stitch() will do, but this will have to be its infix
operator :)

zip ¥   Y
stitch w


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


RE: Avoid the Yen Sign [Was: Re: new sigil]

2005-10-25 Thread Jan Dubois
On Tue, 25 Oct 2005, Larry Wall wrote:
> As for the ¥ pitfall, so far we've intentionally been careful to use
> it only where an operator is expected, whereas \ is legal only where a
> term is expected.  So at least for Perl code, we can translate legacy
> ¥ to different codepoints.  (Whether the Japanese font distinguishes
> them is another issue, of course.  I have a "Unicode" font on my
> machine that prints backslash as ¥, which I find slightly irritating,
> but doubtless will be par for the course in Japan for the foreseeable
> future.  Maybe that's a good reason to allow the doublewith backslash
> as an alias for normal backslash.  Maybe not.)

BTW, the exact same thing happens with the Won sign ₩ on Korean Windows
systems; it is also mapped to 0x5c in the default codepage, and paths
are displayed with the Won sign instead of the backslash as separators.
Just something to keep in mind in case you are tempted to use the Won
sign as a sigil or operator in the future.

Cheers,
-Jan




Re: new sigil

2005-10-25 Thread Juerd
Larry Wall skribis 2005-10-25 15:51 (-0700):
> ^T would still have to be a placeholder variable.

Which it is, in a way.

Still, I don't think ^ as a sigil needs to mean the same thing as ^ as a
twigil. Visually similar pairs are also not related:

?foo$?foo
*foo$*foo
+foo$+foo
=foo$=foo
   $ And it might conflict with infix ^ if we ever allow xor'ed types,
> since declarations contain lots of things that look like juxtaposed
> terms.

Is this the same "conflict" that occurs in %foo % %bar? 

(I cannot imagine needing a one() junction for types, by the way. If
someone can come up with a good real-life example, please do so.)


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: new sigil

2005-10-25 Thread Larry Wall
On Tue, Oct 25, 2005 at 11:44:35PM +0200, Juerd wrote:
: Larry Wall skribis 2005-10-25 14:35 (-0700):
: > On Thu, Oct 20, 2005 at 11:18:14AM -0600, Eric wrote:
: > : Actualy i think ^ might be my favorite so far.
: > : sub sametype (^T $x, ^T $y) {...}
: > I thought that, too, until I realized it wouldn't work as an rvalue:
: > ^T.count# 1's complement of number of T instances
: 
: Ehm, isn't that +^, ~^ (and ?^, perhaps) nowadays?

Er.  It's, um, uh...a generic 1's complement operator!  Yeah, that's
the ticket...

But leaving aside my tendencies toward senility, ^T would still have
to be a placeholder variable.  And it might conflict with infix ^
if we ever allow xor'ed types, since declarations contain lots of
things that look like juxtaposed terms.

Larry


Re: new sigil

2005-10-25 Thread Juerd
Larry Wall skribis 2005-10-25 14:35 (-0700):
> On Thu, Oct 20, 2005 at 11:18:14AM -0600, Eric wrote:
> : Actualy i think ^ might be my favorite so far.
> : sub sametype (^T $x, ^T $y) {...}
> I thought that, too, until I realized it wouldn't work as an rvalue:
> ^T.count  # 1's complement of number of T instances

Ehm, isn't that +^, ~^ (and ?^, perhaps) nowadays?


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html



Re: new sigil

2005-10-25 Thread Larry Wall
On Thu, Oct 20, 2005 at 09:59:49AM -0600, Luke Palmer wrote:
: How about this:
: 
: sub foo(c|T $x) {
: my sub util (c|T $in) {...}
: util($x)
: }
: 
: Is that c|T in util() a new, free type variable, or am I asserting
: that the type of util()'s argument must be the same type as $x?

It's a new T according to the current thinking.  Just use T if you
want the same one.  (But that does force util to be recloned on every
entry to foo, I expect.)

Larry


Re: new sigil

2005-10-25 Thread Larry Wall
On Thu, Oct 20, 2005 at 11:18:14AM -0600, Eric wrote:
: Actualy i think ^ might be my favorite so far.
: 
: sub sametype (^T $x, ^T $y) {...}

I thought that, too, until I realized it wouldn't work as an rvalue:

^T.count# 1's complement of number of T instances

On top of which, if it did work, it should be a placeholder variable,
not something you see in a signature.

Larry


Re: new sigil

2005-10-25 Thread Benjamin Smith
On Tue, Oct 25, 2005 at 02:02:58PM -0500, Jonathan Scott Duff wrote:
> On Tue, Oct 25, 2005 at 12:18:41PM -0600, Luke Palmer wrote:



> I don't think so. In the first example all the T (or ¢T) are the same
> type after the first ¢T (where the type is bound). In the second one
> you'd get two separate types ¢T and ¢U. But ¢U would probably get bound
> to the same type as ¢T as that's the type of thing that it returns
> (assuming perl can figure that out).

We have (or have had?) parameterised classes where you can specify
parameters to the class enclosed in [].

eg. class Foo[...] { ... }

So couldn't the same be used for functions?  This way you wouldn't need
a special sigil for classes declared in such a way.

sub foo[Bar] (Bar $tab) { ... }

Since perl6 isn't really a static language, I don't think you need to be
allowed to have non-type variables in the [] (dependent-typing, or where
you can use primitive types like int in template parameters in C++),
since being parameters in [] means only that they're types, and not that
they are always bound at compile time.

(apologies for breaking the unicode)

-- 
Benjamin Smith <[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED]>
Christ's College - Mathematics Part 1B
IRC: integral on irc.perl.org, and irc.freenode.net (channel: #perl)


Re: new sigil

2005-10-25 Thread Jonathan Scott Duff
On Tue, Oct 25, 2005 at 12:18:41PM -0600, Luke Palmer wrote:
> I like that symmetry between &foo and ¢foo.  So to get the behavior
> that an outer type variable applies to an inner sub, could I do this:
> 
> # a complicated identity function :-)
> sub foo (¢T $x --> ¢T) {
> my sub bar (T $z --> T) {
> $z;
> }
> bar $x;
> }
> 
> Because omitting the ¢ would not bind T.  

You can even do 

 sub foo (¢T $x --> T) {
 my sub bar (T $z --> T) {
$z;
}
bar $x;
}

I do believe.

> Whereas if I wrote:
> 
> sub foo (¢T $x --> ¢T) {
> my sub bar (¢T $z --> T) {
> $z;
> }
> bar $x;
> }

It would be semantically the same as above. (just like C
would only declare one C<$x>, so too C<¢T $x ... ¢T $y> should only
bind one type to T (or ¢T) for the duration of the scope.

> It would be a totally new variable in both spots in the inner sub, and
> if I wrote:
>
> sub foo (¢T $x --> ¢T) {
> my sub bar (T $z --> ¢T) {
> $z;
> }
> bar $x;
> }
>
> It would be equivalent to:
>
> sub foo (¢T $x --> ¢T) {
> my sub bar (T $z --> ¢U) {
> $z;
> }
> bar $x;
> }

I don't think so. In the first example all the T (or ¢T) are the same
type after the first ¢T (where the type is bound). In the second one
you'd get two separate types ¢T and ¢U. But ¢U would probably get bound
to the same type as ¢T as that's the type of thing that it returns
(assuming perl can figure that out).

That's if I understand Larry correctly.

-Scott
-- 
Jonathan Scott Duff
[EMAIL PROTECTED]


Re: new sigil

2005-10-25 Thread Larry Wall
On Tue, Oct 25, 2005 at 01:57:52PM -0400, Rob Kinyon wrote:
: I'm assuming that when you allow
: 
: my ¢T := sometype();
: 
: you're also allowing
: 
: my class T := sometype();

Yes, that's the idea.

: So, what happens when stupid me names a class "class" through
: symbol-table craziness?

How much class could a class class class if a class class could class class?

What happens is either that you have to say "class class" or "¢class"
or you redefine the "class" keyword to something else like "frobnitz".
I think "class" and "sub" are keywords in the, er, class of things
that trump mere symbol table entries.  Either that, or "class" is merely
the name of the metaclass, and you'll get a class collision when
you try to redefine it.  But I expect "class" is really a declarator
of the same status as "sub", at least syntactically.

Larry


Re: new sigil

2005-10-25 Thread Luke Palmer
On 10/25/05, Larry Wall <[EMAIL PROTECTED]> wrote:
> We're probably converging on a general rule that two or more
> declarations of the same variable in the same scope refer to the
> same entity:
>
> my $x = 1;  # outer $x;
> {
> $x = 2; # bound to OUTER::<$x>
> if my $x = foo() {...}  # new $x declared here
> if my $x = bar() {...}  # same $x, "my" is optional
> baz();  # baz sees single inner CALLER::<$x>.
> }

...

...

Cool!

> So too these would mean the same thing:
>
> sub Bool eqv (¢T $x, T $y) { my T $z; }
> sub Bool eqv (¢T $x, ¢T $y) { my ¢T $z; }

I like that symmetry between &foo and ¢foo.  So to get the behavior
that an outer type variable applies to an inner sub, could I do this:

# a complicated identity function :-)
sub foo (¢T $x --> ¢T) {
my sub bar (T $z --> T) {
$z;
}
bar $x;
}

Because omitting the ¢ would not bind T.  Whereas if I wrote:

sub foo (¢T $x --> ¢T) {
my sub bar (¢T $z --> T) {
$z;
}
bar $x;
}

It would be a totally new variable in both spots in the inner sub, and
if I wrote:

sub foo (¢T $x --> ¢T) {
my sub bar (T $z --> ¢T) {
$z;
}
bar $x;
}

It would be equivalent to:

sub foo (¢T $x --> ¢T) {
my sub bar (T $z --> ¢U) {
$z;
}
bar $z;
}

(Thus causing a "signature too general" type error)

Right?  Totally off?

Luke


Re: new sigil

2005-10-25 Thread Rob Kinyon
> Basically, ¢T is a close analog of &t, which is the variableish form
> for "sub t".  When used in a declaration, both of them introduce a
> bare name as an alias into whatever scope the declaration is inserting
> symbols, albeit with different syntactic slots.  So just as
>
> my &t := { ... }
>
> introduces the possibility of
>
> t 1,2,3
>
> so also a
>
> my ¢T := sometype();
>
> introduces the possibility of
>
> my T $x;

I'm assuming that when you allow

my ¢T := sometype();

you're also allowing

my class T := sometype();

So, what happens when stupid me names a class "class" through
symbol-table craziness?

Rob


Re: Avoid the Yen Sign [Was: Re: new sigil]

2005-10-25 Thread Larry Wall
: On 10/23/05, Autrijus Tang <[EMAIL PROTECTED]> wrote:
: > In addition to your handy table, the >> and << french quotes, which are used
: > quite heavily in Perl 6 for both bracketing and hyper operators, also have
: > full width equivalents:
: >
: > 300A;LEFT DOUBLE ANGLE BRACKET;Ps;0;ON;Y;OPENING DOUBLE ANGLE 
BRACKET
: > 300B;RIGHT DOUBLE ANGLE BRACKET;Pe;0;ON;Y;CLOSING DOUBLE ANGLE 
BRACKET
: >
: > Half width: «»
: > Full width: 《》

I think we actually speculated about that identity in the Apocalypse.

: > One way to approach it is to make Perl 6 accept both full- and
: > half-width variants.
: >
: > Another way would be to use ASCII fallbacks exclusively in real programs, 
and
: > reserve unicode variants for pretty-printing, the same way that PLT Scheme 
and
: > Haskell recognizes λ in literatures, but actually write "lambda" and
: > "\" respectively
: > in everyday coding.

I think we should enable both approaches.  Restricting Unicode characters
to literature is wrong, but so is forcing Unicode on someone prematurely.

On Sun, Oct 23, 2005 at 07:07:33PM -0400, Rob Kinyon wrote:
: Isn't this starting to be the question of why we have the Unicode
: operators instead of just functions? Would it be possible to have a
: function be infix?

At which precedence level?

Larry


Re: Avoid the Yen Sign [Was: Re: new sigil]

2005-10-25 Thread Larry Wall
On Sun, Oct 23, 2005 at 10:55:34PM +0900, Dan Kogai wrote:
: To make the matter worse, there are not just one "yen sign" in  
: Unicode. Take a look at this.
: 
: ¥ U+00A5 YEN SIGN
: ¥ U+FFE5 FULLWIDTH YEN SIGN
: 
: Tough they look and groks the same to human, computers handle them  
: differently.  This happened when Unicode Consortium decided to make  
: BMP round-trippable against legacy encodings.  They were distinct in  
: JIS standards, so happened Unicode.
: 
: Maybe we should avoid other symbols like this for sigils -- those not  
: in ASCII that have 'fullwidth' variations.  q($) and q(\) are okay  
: (or too late) because they are already in ASCII.  q(¥) should be  
: avoided because you can hardly tell the difference from q(¥) in the  
: display.
: 
: But this will also outlaw the cent sign.  I have attached a list of  
: those affected.  As you see, most are with ASCII equivalents but some  
: are not.

We'd have to outlaw A..Z as well.  :-)

I think a better plan might just be to say that we'll treat any fullwidth
character as equivalent to its narrow companion, at least when used as
an operator.  Canonicalizing identifiers may be another matter though.

On the other hand, certain of the double-width characters are likely to
be confused with two singles, such as 

=   FF1DFULLWIDTH EQUALS SIGN
_   FF3FFULLWIDTH LOW LINE

so maybe they should be equivalent to == and __, or outlawed.

And one could (un)reasonably argue that

~   FF5EFULLWIDTH TILDE

ought to mean ~~ rather than ~.  But in general we need to go slow
on such decisions.  For now just sticking our toe into Latin-1
is enough, as long as we're looking ahead for visual pitfalls.

As for the ¥ pitfall, so far we've intentionally been careful to use
it only where an operator is expected, whereas \ is legal only where a
term is expected.  So at least for Perl code, we can translate legacy
¥ to different codepoints.  (Whether the Japanese font distinguishes
them is another issue, of course.  I have a "Unicode" font on my
machine that prints backslash as ¥, which I find slightly irritating,
but doubtless will be par for the course in Japan for the foreseeable
future.  Maybe that's a good reason to allow the doublewith backslash
as an alias for normal backslash.  Maybe not.)

Anyway, I think people will be able to distinguish visually between
"A ¥ B" and "¥X" as long as we keep the operator/term distinction.

Larry


Re: new sigil

2005-10-25 Thread Larry Wall
On Sat, Oct 22, 2005 at 06:00:38AM -0400, Damian Conway wrote:
: Autrijus wrote:
: 
: >Indeed.  Somehow I think this makes some sense:
: >
: >sub Bool eqv (|T $x, |T $y) { ... }
: 
: Except that it prevents anyone from ever writing:
: 
: multi sub circumfix:<| |> (Num $x) { return abs $x }
: multi sub circumfix:<| |> (Vec $x) { return $x.mag }
: 
: which many mathematically inclined folks might find annoying.
: 
: (It also precludes intriguing possibilities like:
: 
: multi sub circumfix:«| >» ($q) { return Quantum::State.new(val => $q) 
: }
: 
: which I personally would find irritating. ;-)

I considered | last week, but decided it was better to hold unary | in
reserve, especially since it's likely to be confusing with junctions.
And if we use | for type set notation, then unary | would preclude
the ability to stack types, and I've been treating an utterance like

my Mammal ¢T $fido where Bark :($a,$b,$c --> Wag)

as having at least five implicitly ANDed type specifications:

must do Mammal
must do Class
must do Scalar
must do Bark
must do Wag

plus there must be three components that are Scalar, plus whatever
extra type constraints Wag puts onto those three components.  Having
Mammal |T be ambiguous with Mammal|T would be bad, at least visually.

Anyway, having mulled over all this while off in Amsterdam and
Budapest, my current thinking is that the ascii shortcut for ¢T is
simply "class T", so you could write any of:

sub Bool eqv (¢T $x, T $y)
sub Bool eqv (class ¢T $x, T $y)
sub Bool eqv (Any ¢T $x, T $y)
sub Bool eqv (Any class T $x, T $y)

and mean the same thing.

Basically, ¢T is a close analog of &t, which is the variableish form
for "sub t".  When used in a declaration, both of them introduce a
bare name as an alias into whatever scope the declaration is inserting
symbols, albeit with different syntactic slots.  So just as

my &t := { ... }

introduces the possibility of

t 1,2,3

so also a

my ¢T := sometype();

introduces the possibility of

my T $x;

Use as an rvalue can be either T or ¢T without declaring a new type.
We're probably converging on a general rule that two or more
declarations of the same variable in the same scope refer to the
same entity:

my $x = 1;  # outer $x;
{
$x = 2; # bound to OUTER::<$x>
if my $x = foo() {...}  # new $x declared here
if my $x = bar() {...}  # same $x, "my" is optional
baz();  # baz sees single inner CALLER::<$x>.
}

So too these would mean the same thing:

sub Bool eqv (¢T $x, T $y) { my T $z; }
sub Bool eqv (¢T $x, ¢T $y) { my ¢T $z; }

Only the first declarative ¢ actually installs a new symbol T.
An inner scope would of course establish its own type space, but
the formal parameters to a block count as part of the block, which
is why the second form above applies the existing T to $z rather
than capturing the type of $z.  But it's a bit like writing &foo()
when you could just say foo() instead.

Larry


Re: new sigil

2005-10-24 Thread Luke Palmer
On 10/24/05, TSa <[EMAIL PROTECTED]> wrote:
> Does this capturing of the type into ¢T also involve runtime
> code template expansion? That is, if sametype(Int,Int) didn't
> exist it would be compiled on the fly for a call sametype(3,2)?

I think that's up to the implementation.  From the language
perspective, no, it behaves as though it was compiled once.  But an
implementation is free to instantiate the routine for various types
for optimization.

> Which brings up the question if ¢T will be allowed in multi defs?

Good question.  I believe the ordering multi algorithm can be extended
to handle it, but I'll have to think about what it means.

> > So it's a type position thing if it can be.  Good.  (I wonder if,
> > since it's allowed in term position, we will come up with ambiguities)
> >
> > How about this:
> >
> > sub foo(c|T $x) {
> > my sub util (c|T $in) {...}
> > util($x)
> > }
> >
> > Is that c|T in util() a new, free type variable, or am I asserting
> > that the type of util()'s argument must be the same type as $x?
>
> I would guess there are two distinct ¢foo::T and ¢foo::util::T free
> type variables.

Hmm, yeah, that makes sense, but it can also be annoying.  For
instance, in Haskell, I wrote this:

closure :: (Ord a) => (a -> [a]) -> [a] -> [a]
clsoure f init = closure' Set.empty init
where
closure' :: (Ord a) => Set a -> [a] -> [a]
closure' set [] = []
closure' set (x:xs) = ...

This gives me a type error on closure', because the inner "a" is
different from the outer "a".  Incidentally, there is no signautre
that closure' can possibly have.  So I was forced to leave off the
signature and let the type inferencer do the work.  In this case it
would have been nice to have the variable carry over to inner clauses.

But letting that happen also has problems.  You can't freely move code
around, because you depend on the type variables that were bound in
outer scopes.  However, if the number of "type topicalizers" (as it
were) is small, then maybe that's okay.

> In the call of util($x) the type reference is handed
> or rebound down the call chain just like value refs. BTW, will there
> be a topic type ¢_, grammar type ¢/ and the exception type ¢! as well?

The topic type ¢_ is discussed in theory.pod.  I don't see much use
for the others (there is no @/ or @!, for instance).

> What operations are available for type variables? E.g. ¢foo <= ¢bar could
> be the subtype relation. But what would ¢foo + ¢bar mean?

Nothing.

Perhaps ¢foo (+) ¢bar is a union type, but I don't think it should be.
 Again, see theory.pod for formalisms of the difference between things
that are in type variables and the types you declare in the program. 
Essentially the things that are in type variables are only
instantiable, concrete types, whereas the types you declare in the
program are more like interfaces.   There is no concept of a subtype
in the concrete world; only in the interface world.  But theory.pod
isn't gospel (yet ;-).

> Is ¢foo - ¢bar the dispatch distance?

Especially not since that concept doesn't exist anymore.

> Is the compiler obliged to separate type variables from value variables? Or 
> does
>
>$foo = \¢bar;
>
> produce a type reference? How would that be dereferenced then? Is the type
> inferencer in the compiler automatically calculating a supertype bound
> for every expression? If yes, how is that accessable?

Hmm, don't know about that.  Exactly how "first-class" are type variables?

Luke


Re: new sigil

2005-10-24 Thread TSa

HaloO,

Luke Palmer wrote:

On 10/20/05, Larry Wall <[EMAIL PROTECTED]> wrote:


Another thing I didn't mention is that that binds both the variable
and its class.  But the $ variable is of course optional after the
type, so you could just write that

   sub sametype (¢T, ¢T) {...}

if you don't actually care about $x and $y.  Basically, ¢T captures
the type of the associated scalar in any lvalue or declarative context,
whether or not hte scalar itself is captured.


Does this capturing of the type into ¢T also involve runtime
code template expansion? That is, if sametype(Int,Int) didn't
exist it would be compiled on the fly for a call sametype(3,2)?
Which brings up the question if ¢T will be allowed in multi defs?
And how does it influence dispatch then? Can type variables be
constrained with where clauses?



So it's a type position thing if it can be.  Good.  (I wonder if,
since it's allowed in term position, we will come up with ambiguities)

How about this:

sub foo(c|T $x) {
my sub util (c|T $in) {...}
util($x)
}

Is that c|T in util() a new, free type variable, or am I asserting
that the type of util()'s argument must be the same type as $x?


I would guess there are two distinct ¢foo::T and ¢foo::util::T free
type variables. In the call of util($x) the type reference is handed
or rebound down the call chain just like value refs. BTW, will there
be a topic type ¢_, grammar type ¢/ and the exception type ¢! as well?

What operations are available for type variables? E.g. ¢foo <= ¢bar could
be the subtype relation. But what would ¢foo + ¢bar mean? Is ¢foo - ¢bar
the dispatch distance? Is the compiler obliged to separate type variables
from value variables? Or does

  $foo = \¢bar;

produce a type reference? How would that be dereferenced then? Is the type
inferencer in the compiler automatically calculating a supertype bound
for every expression? If yes, how is that accessable?
--
$TSa.greeting := "HaloO"; # mind the echo!



Re: new sigil

2005-10-24 Thread Michele Dondi

On Sat, 22 Oct 2005 [EMAIL PROTECTED] wrote:


If we find a lot of yen signs as zip-operators in the standard
library, Japanese would have a big question: "Give up either
Perl6 or Windows.  Which do we need?"  And I suppose the answer


Hmmm, begins to sound interesting... ;-P


Michele
--
voices
you're letting voices tell you what to do
when you yourself don't know
- Pennywise, "Come Out Fighting".


Re: Avoid the Yen Sign [Was: Re: new sigil]

2005-10-23 Thread Rob Kinyon
On 10/23/05, Autrijus Tang <[EMAIL PROTECTED]> wrote:
> Dan Kogai wrote:
> > To make the matter worse, there are not just one "yen sign" in  Unicode.
> > Take a look at this.
> >
> > ¥ U+00A5 YEN SIGN
> > ¥ U+FFE5 FULLWIDTH YEN SIGN
> >
> > Tough they look and groks the same to human, computers handle them
> > differently.  This happened when Unicode Consortium decided to make  BMP
> > round-trippable against legacy encodings.  They were distinct in  JIS
> > standards, so happened Unicode.
>
> In addition to your handy table, the >> and << french quotes, which are used
> quite heavily in Perl 6 for both bracketing and hyper operators, also have
> full width equivalents:
>
> 300A;LEFT DOUBLE ANGLE BRACKET;Ps;0;ON;Y;OPENING DOUBLE ANGLE BRACKET
> 300B;RIGHT DOUBLE ANGLE BRACKET;Pe;0;ON;Y;CLOSING DOUBLE ANGLE BRACKET
>
> Half width: «»
> Full width: 《》
>
> There is no way to type out the half-width yen and double angle brackets under
> MSWin32, under either the traditional or simplified code pages; only full 
> width
> variants are available.
>
> One way to approach it is to make Perl 6 accept both full- and
> half-width variants.
>
> Another way would be to use ASCII fallbacks exclusively in real programs, and
> reserve unicode variants for pretty-printing, the same way that PLT Scheme and
> Haskell recognizes λ in literatures, but actually write "lambda" and
> "\" respectively
> in everyday coding.

Isn't this starting to be the question of why we have the Unicode
operators instead of just functions? Would it be possible to have a
function be infix?

Rob


Re: Avoid the Yen Sign [Was: Re: new sigil]

2005-10-23 Thread Autrijus Tang
Dan Kogai wrote:
> To make the matter worse, there are not just one "yen sign" in  Unicode.
> Take a look at this.
>
> ¥ U+00A5 YEN SIGN
> ¥ U+FFE5 FULLWIDTH YEN SIGN
>
> Tough they look and groks the same to human, computers handle them
> differently.  This happened when Unicode Consortium decided to make  BMP
> round-trippable against legacy encodings.  They were distinct in  JIS
> standards, so happened Unicode.

In addition to your handy table, the >> and << french quotes, which are used
quite heavily in Perl 6 for both bracketing and hyper operators, also have
full width equivalents:

300A;LEFT DOUBLE ANGLE BRACKET;Ps;0;ON;Y;OPENING DOUBLE ANGLE BRACKET
300B;RIGHT DOUBLE ANGLE BRACKET;Pe;0;ON;Y;CLOSING DOUBLE ANGLE BRACKET

Half width: «»
Full width: 《》

There is no way to type out the half-width yen and double angle brackets under
MSWin32, under either the traditional or simplified code pages; only full width
variants are available.

One way to approach it is to make Perl 6 accept both full- and
half-width variants.

Another way would be to use ASCII fallbacks exclusively in real programs, and
reserve unicode variants for pretty-printing, the same way that PLT Scheme and
Haskell recognizes λ in literatures, but actually write "lambda" and
"\" respectively
in everyday coding.

TIMTOWTDI. :)

Thanks,
/Autrijus/


Avoid the Yen Sign [Was: Re: new sigil]

2005-10-23 Thread Dan Kogai

Maeda-san and the list members,

Thank you for raising this issue and sorry for not raising this myself.

On Oct 22, 2005, at 19:42 , Kaoru Maeda wrote:

If we find a lot of yen sign as zip-operator in the standard library,
we have a big question: "Give up either Perl6 or Windows.  Which do  
we abandon?"
And I suppose the answer would be "We have a lot of substitutes to  
Perl6:

Ruby, Perl5, etc."

In Japan, yes is synonym to backslash.  We wish to retain this legacy.
Zip-operator is far less important than regex-escape, string- 
escape, and

take-reference operator.


To make the matter worse, there are not just one "yen sign" in  
Unicode. Take a look at this.


¥ U+00A5 YEN SIGN
¥ U+FFE5 FULLWIDTH YEN SIGN

Tough they look and groks the same to human, computers handle them  
differently.  This happened when Unicode Consortium decided to make  
BMP round-trippable against legacy encodings.  They were distinct in  
JIS standards, so happened Unicode.


Maybe we should avoid other symbols like this for sigils -- those not  
in ASCII that have 'fullwidth' variations.  q($) and q(\) are okay  
(or too late) because they are already in ASCII.  q(¥) should be  
avoided because you can hardly tell the difference from q(¥) in the  
display.


But this will also outlaw the cent sign.  I have attached a list of  
those affected.  As you see, most are with ASCII equivalents but some  
are not.


Dan the Man with Too Many Signs to Deal With

% grep FULLWIDTH /usr/local/lib/perl5/5.8.7/unicore/Name.pl | perl - 
Mencoding=utf8 -aple '$_=chr(hex($F[0]))."\t".$_'

!   FF01FULLWIDTH EXCLAMATION MARK
"   FF02FULLWIDTH QUOTATION MARK
#   FF03FULLWIDTH NUMBER SIGN
$   FF04FULLWIDTH DOLLAR SIGN
%   FF05FULLWIDTH PERCENT SIGN
&   FF06FULLWIDTH AMPERSAND
'   FF07FULLWIDTH APOSTROPHE
(   FF08FULLWIDTH LEFT PARENTHESIS
)   FF09FULLWIDTH RIGHT PARENTHESIS
*   FF0AFULLWIDTH ASTERISK
+   FF0BFULLWIDTH PLUS SIGN
,   FF0CFULLWIDTH COMMA
-   FF0DFULLWIDTH HYPHEN-MINUS
.   FF0EFULLWIDTH FULL STOP
/   FF0FFULLWIDTH SOLIDUS
0   FF10FULLWIDTH DIGIT ZERO
1   FF11FULLWIDTH DIGIT ONE
2   FF12FULLWIDTH DIGIT TWO
3   FF13FULLWIDTH DIGIT THREE
4   FF14FULLWIDTH DIGIT FOUR
5   FF15FULLWIDTH DIGIT FIVE
6   FF16FULLWIDTH DIGIT SIX
7   FF17FULLWIDTH DIGIT SEVEN
8   FF18FULLWIDTH DIGIT EIGHT
9   FF19FULLWIDTH DIGIT NINE
:   FF1AFULLWIDTH COLON
;   FF1BFULLWIDTH SEMICOLON
<   FF1CFULLWIDTH LESS-THAN SIGN
=   FF1DFULLWIDTH EQUALS SIGN
>   FF1EFULLWIDTH GREATER-THAN SIGN
?   FF1FFULLWIDTH QUESTION MARK
@   FF20FULLWIDTH COMMERCIAL AT
A   FF21FULLWIDTH LATIN CAPITAL LETTER A
B   FF22FULLWIDTH LATIN CAPITAL LETTER B
C   FF23FULLWIDTH LATIN CAPITAL LETTER C
D   FF24FULLWIDTH LATIN CAPITAL LETTER D
E   FF25FULLWIDTH LATIN CAPITAL LETTER E
F   FF26FULLWIDTH LATIN CAPITAL LETTER F
G   FF27FULLWIDTH LATIN CAPITAL LETTER G
H   FF28FULLWIDTH LATIN CAPITAL LETTER H
I   FF29FULLWIDTH LATIN CAPITAL LETTER I
J   FF2AFULLWIDTH LATIN CAPITAL LETTER J
K   FF2BFULLWIDTH LATIN CAPITAL LETTER K
L   FF2CFULLWIDTH LATIN CAPITAL LETTER L
M   FF2DFULLWIDTH LATIN CAPITAL LETTER M
N   FF2EFULLWIDTH LATIN CAPITAL LETTER N
O   FF2FFULLWIDTH LATIN CAPITAL LETTER O
P   FF30FULLWIDTH LATIN CAPITAL LETTER P
Q   FF31FULLWIDTH LATIN CAPITAL LETTER Q
R   FF32FULLWIDTH LATIN CAPITAL LETTER R
S   FF33FULLWIDTH LATIN CAPITAL LETTER S
T   FF34FULLWIDTH LATIN CAPITAL LETTER T
U   FF35FULLWIDTH LATIN CAPITAL LETTER U
V   FF36FULLWIDTH LATIN CAPITAL LETTER V
W   FF37FULLWIDTH LATIN CAPITAL LETTER W
X   FF38FULLWIDTH LATIN CAPITAL LETTER X
Y   FF39FULLWIDTH LATIN CAPITAL LETTER Y
Z   FF3AFULLWIDTH LATIN CAPITAL LETTER Z
[   FF3BFULLWIDTH LEFT SQUARE BRACKET
\   FF3CFULLWIDTH REVERSE SOLIDUS
]   FF3DFULLWIDTH RIGHT SQUARE BRACKET
^   FF3EFULLWIDTH CIRCUMFLEX ACCENT
_   FF3FFULLWIDTH LOW LINE
`   FF40FULLWIDTH GRAVE ACCENT
a   FF41FULLWIDTH LATIN SMALL LETTER A
b   FF42FULLWIDTH LATIN SMALL LETTER B
c   FF43FULLWIDTH LATIN SMALL LETTER C
d  

Re: new sigil

2005-10-23 Thread Kaoru Maeda

> Luke Palmer wrote:
>
>> limited access to system settings.
>> And in those kinds of corporate environments, you're not going to be
>> working with any code but code written in-house.  Which means that
>> nobody is going to be using Latin-1, and everyone will be using the
>> ASCII synonyms.  What's the problem?

Dave Whipp wrote:
> My experience is that this isn't true: we use lots of external code,
> but I still need to file requests with IT to get system-settings changed.

Right.  We rely on Perl libraries from CPAN, and elsewhere.
You have to make sure that the code you are looking at is
transfered via utf-8 aware systems only.
It is not safe that we decide to use ASCII synonyms ourselves.
We have to be sure that all the modules, which happen to
have Unicode sigils/ops, should be installed without intervening
legacy systems.

Explanation of the situation in Japan follows.  Those who are not
interested in Japan can skip.  Seemingly this problem is very unique
to Japan.  It's already one year since yen sign became zip-operator.
This is not to kick a discussion, just a whining of mine. :P

Ancient ISO-646 allowed variants, which substitute certain part of ASCII 
characters
with local symbols.  Currency signs were the first candidates of this.
http://en.wikipedia.org/wiki/ISO_646
This legacy convention is still alive in Japan as JIS/ShiftJIS encodings.
I hope Unicode supercedes them and the "backslash-yen" confusion would 
disappear,
but the movement is not quick enough.

The problem doesn't reside in writing code but in carrying files.
  - You cannot tell whether a text file is in US-ASCII, utf8,
or ShiftJIS, when all the code points are below 0x7f.  It is too
late when you receive a code snippet from your colleague by mail.
  - If we convert yen from Latin-1 (0xa5) to Unicode
(utf8=c2a5), then to "the default coding system,
which is believed to be ASCII but actually
ShiftJIS", it becomes 0x5c.  There's no way to tell
whether the byte was a bachslash or a yen at the beginning.

Grepping for yen signs doesn't help because at the time you
run grep, they are already backslashes.

If we find a lot of yen sign as zip-operator in the standard library,
we have a big question: "Give up either Perl6 or Windows.  Which do we abandon?"
And I suppose the answer would be "We have a lot of substitutes to Perl6:
Ruby, Perl5, etc."

In Japan, yes is synonym to backslash.  We wish to retain this legacy.
Zip-operator is far less important than regex-escape, string-escape, and
take-reference operator.

--
Kaoru Maeda
[EMAIL PROTECTED]


Re: new sigil

2005-10-23 Thread maeda
Luke Palmer wrote:
>> limited access to system settings.
>> And in those kinds of corporate environments, you're not going to be
>> working with any code but code written in-house.  Which means that
>> nobody is going to be using Latin-1, and everyone will be using the
>> ASCII synonyms.  What's the problem?

Dave Whipp wrote:
> My experience is that this isn't true: we use lots of external code,
> but I still need to file requests with IT to get system-settings changed.

Right.  We rely on Perl libraries from CPAN, and elsewhere.  You
have to make sure that the code you are looking at is transfered
via utf-8 aware systems only.  It is not safe that we decide to
use ASCII synonyms ourselves.  We have to be sure that all the
modules, which happen to have Unicode sigils/ops, should be
installed without intervening legacy systems.

Explanation of the situation in Japan follows.  Those who are not
interested in Japan can skip.  Seemingly this problem is very unique
to Japan.

(It's already one year since yen sign became zip-operator.
This is not to kick an argument, just a whining of mine. :P)

The problem doesn't reside in writing code but in carrying files.
   - You cannot tell whether a text file is in US-ASCII, utf8,
 or ShiftJIS, when all the code points are below 0x7f.  It
 is too late when you receive a code snippet from your
 colleague by mail.
   - If we convert yen from Latin-1 (0xa5) to Unicode
 (utf8=c2a5), then to "the default coding system, which is
 believed to be ASCII but actually ShiftJIS", it becomes
 0x5c.  There's no way to tell whether the byte was a
 bachslash or a yen at the beginning.

Grepping for yen signs doesn't help because at the time you run
grep, they are already backslashes.

If we find a lot of yen signs as zip-operators in the standard
library, Japanese would have a big question: "Give up either
Perl6 or Windows.  Which do we need?"  And I suppose the answer
would be "We have a lot of substitutes to Perl6: Ruby, Perl5,
etc."

In <[EMAIL PROTECTED]> Larry wrote:
> (Of course, we'll leave out the little problem that half the people
> in Japan would read it as a backslash wannabe...that's not really
> a problem since a zipper would only be used where an operator is
> expected, and backslash is illegal there (so far).)

It is not the people who read a yen as a backslash, but the
legacy systems.  We might define backslash as a synonym for the
zip op, but it's too risky.  "Yen as zip" has the same magnitude
of risk in Japan.

-- 
Kaoru Maeda
[EMAIL PROTECTED]


Re: new sigil

2005-10-23 Thread Damian Conway

Autrijus wrote:


Indeed.  Somehow I think this makes some sense:

sub Bool eqv (|T $x, |T $y) { ... }


Except that it prevents anyone from ever writing:

multi sub circumfix:<| |> (Num $x) { return abs $x }
multi sub circumfix:<| |> (Vec $x) { return $x.mag }

which many mathematically inclined folks might find annoying.

(It also precludes intriguing possibilities like:

multi sub circumfix:«| >» ($q) { return Quantum::State.new(val => $q) }

which I personally would find irritating. ;-)

Damian


Re: new sigil

2005-10-22 Thread Darren Duncan

At 3:26 PM +0100 10/22/05, Nicholas Clark wrote:

At the risk of re-enforcing my apparent optimism.

On Thu, Oct 20, 2005 at 04:02:10PM -0700, Darren Duncan wrote:


 that the next best one to exploit is ¤ (euro;
 unicode=20AC; utf8=E282AC), and the next best is


Woah. You've just demonstrated why Euro is far worse than any of the other
"Unicode" characters so far suggested. You mail headers say:

Content-Type: text/plain; charset="iso-8859-1" ; format="flowed"

The symbol in your message *as sent* is the international currency symbol,
U00A4. The Euro symbol is not part of ISO-8859-1.
(ISO-8859-15 yes, but that's about 10 years more recent)


Actually, what you point out in my message is a 
limitation of my email client, which I didn't 
realize existed until now.


I then did a bit of research, and apparently the 
newest Eudora doesn't support customization of 
what character set messages are composed with, 
always sending them using ISO-8859-1.  This is 
apparently a an issue that many Eudora users 
requested fixed but haven't been addressed.


This said, sending UTF8 files as email 
attachments, rather than UTF8 in the message 
body, still works fine, AFAIK, as does 
transmitting them by other ways such as http or 
ftp etc.


And my normal text editor handles UTF8 correctly.

Also, apparently some other email clients handle UTF8 properly.

So my email client failed me, but my point still 
stands that Unicode characters should still be 
embraced in Perl 6.  I just need to replace my 
email client if I want to type them into the 
message body.


-- Darren Duncan


Re: new sigil

2005-10-22 Thread John Macdonald
On Fri, Oct 21, 2005 at 09:35:12AM -0400, Rob Kinyon wrote:
> On 10/21/05, Steve Peters <[EMAIL PROTECTED]> wrote:
> > On Fri, Oct 21, 2005 at 02:37:09PM +0200, Juerd wrote:
> > > Steve Peters skribis 2005-10-21  6:07 (-0500):
> > > > Older versions of Eclipse are not able to enter these characters.  
> > > > That's
> > > > where the copy and paste comes in.
> > >
> > > That's where upgrades come in.
> > >
> > That's where lots of money to update to the next version of WSAD becomes the
> > limiting factor.
> 
> So, you are proposing that the Perl of the Unicode era be limited to
> ASCII because a 15 year old editor cannot handle the charset? That's
> like suggesting that operating systems should all be bootable from a
> single floppy because not everyone has access to a CD drive.

Um, that's not what I'm hearing.

To type in a Unicode character requires machinations beyond just
hitting a labelled key on the keybourd.  There are no standards
for these machinations - what must be done is different for
Windows vs. Linux, and different for specific applications
(text-mode mutt vs. xvi vs. Eclipse vs. ...).

So, a book can't just show code and expect the reader to be
able to use it, and no book is going to be able to tell all
of its users how to type the characters because there are so
many different ways.

Any serious programmer will be able to sort out how to do
things but casual programmers won't be typing the extended
characters enough to learn how to do it without looking it
up each time.  Proprammers that use many different computers
and applications will have difficulty remembering which of
the varous incantations happen to work on the system they're
currently using.  People who do sort out a good working
environment will be at a loss when they occassionally have to do
something on a different system and no longer know how to type
"basic" characters.  (But since in their normal environment they
do know how, they may never have known the ASCII workarounds,
so they'll have to look them up.)  I've gotten away from
programming enough that I often have to look up a function
or operator definition to check on details; but that is much
less disruptive to the thought process than having to look up
how to type a character.

I think that the reasons for using Unicode characters are good
ones and that there is no good alternative.  However, doing
so does make Perl less accessable for casual programmers.
(While we may deride the Learn to Web Program in 5 Minutes
crowd, that did get many people involved with Perl, and I'm
sure some of them evolved beyond those limited roots, just
as an earlier generation of programmers had some who evolved
beyond their having started with Basic into nonetheless becoming
competent and knowledgeable craftsmen.)

We need to have a "Why Unicode is the lesser of evils" document
to refer to whenever this issue rizes up again.  The genuine
problems involved ensure that the issue will continue to arise,
so we can't just get mad at the people who raise it.

-- 


Re: new sigil

2005-10-22 Thread Aaron Crane
Kaoru Maeda writes:
> Darren Duncan wrote:
> > the next best is £
> Isn't that 0x23 in UK?  I imagine that someday all the comment lines 
> cause syntax errors in UK...

U+00A3 "POUND SIGN" is at 0x23 in ISO 646-GB (aka BS 4730), true.
Fortunately, that character set is almost never used.  I think the last
time I encountered it was on a dot-matrix printer manufactured in the
1980s.

Hmmm.  Encode.pm doesn't seem to have support available for any of the
ISO 646 character sets.  I feel a patch coming on.

-- 
Aaron Crane


Re: new sigil

2005-10-22 Thread Nicholas Clark
At the risk of re-enforcing my apparent optimism.

On Thu, Oct 20, 2005 at 04:02:10PM -0700, Darren Duncan wrote:

> that the next best one to exploit is ¤ (euro; 
> unicode=20AC; utf8=E282AC), and the next best is 

Woah. You've just demonstrated why Euro is far worse than any of the other
"Unicode" characters so far suggested. You mail headers say:

Content-Type: text/plain; charset="iso-8859-1" ; format="flowed"

The symbol in your message *as sent* is the international currency symbol,
U00A4. The Euro symbol is not part of ISO-8859-1.
(ISO-8859-15 yes, but that's about 10 years more recent)

ISO-8859-1 has been the default standard for the character set on most
Internet protocols for a long time, and many systems for the past 10+ years
have supported it by default (most Unix variants, Windows 3.1, I think.
DOS boxes were CP437, but native Windows was (extended) ISO-8859-1)

This cannot be said for ISO-8859-15. So I can see little reason why any
currently operational system will be incapable of displaying the ISO-8859-1
operators in scripts or CPAN modules correctly, even if the editor the
maintenance programmer (or sysdamin) is constrained to entering the ASCII
digraphs.

But there will be a lot of systems out there where this is not true for the
Euro symbol, and the assumption of ISO-8859-1 defaults will mean that this
won't be the last time that Euro symbols are going to get mangled during
transit, with all the ensuing pain, frustration, losses and defections to
other languages that this will cause.

Perl 5 runs everywhere: http://www.cpan.org/ports/index.html

Perl 6 is intended to be an improvement on Perl 5. It would be a shame to
design in restrictions on portability.

Nicholas Clark



Re: new sigil

2005-10-22 Thread John Adams
-Original Message-
From: Nicholas Clark <[EMAIL PROTECTED]>

> And for anyone who says "upgrade", please note that many firms in the real
world are still forcing a base perl version of 5.005_03 or 5.6.1 for
development. Still.

My weekend project is to demonstrate that you are an optimist. Really.


Re: new sigil

2005-10-22 Thread Nicholas Clark
On Fri, Oct 21, 2005 at 09:42:00AM +0100, Carl Franks wrote:
> Where did you get ALT-155 from?

Code page 437:

http://www.kostis.net/charsets/cp437.htm

On Fri, Oct 21, 2005 at 06:07:47AM -0500, Steve Peters wrote:
> On Fri, Oct 21, 2005 at 09:42:00AM +0100, Carl Franks wrote:
> > Where did you get ALT-155 from?
> > 
> > I've just checked the windows Character Map, and ¢ (cent) is ALT-0162
> > ( If it's not in your startmenu, do start -> run -> charmap )
> 
> Actually, both work.  That's where the issus with the documentation starts.

"what he says"

This is going to be hard to document well.

For example, *I* know why the leading zero is significant on ALT-0162, but
how many people are going to assume that it's not?

Anyone care to save to a file called AUX.TXT on Windows?

And for anyone who says "upgrade", please note that many firms in the real
world are still forcing a base perl version of 5.005_03 or 5.6.1 for
development. Still.

The active perl "community" is not wholly representitive of the global usage
of perl, and would do well to remember this. For example, see
http://use.perl.org/~barbie/journal/27098

Nicholas Clark


Re: new sigil

2005-10-21 Thread Autrijus Tang
Juerd wrote:
> I do not see why $ and @ couldn't be both a sigil and an infix
> operator, and the same goes for whatever ASCII equivalent ¢ gets.
> 
> ^ and | are available for sigil use. (All the closing brackets are too,
> but that would be very confusing because we tend to visually parse those
> in pairs.)
> 
> Using the an infix operator's symbol as a sigil is not weird, not wrong,
> not confusing and mostly: not a new idea.

Indeed.  Somehow I think this makes some sense:

sub Bool eqv (|T $x, |T $y) { ... }

Thanks,
/Autrijus/


Re: new sigil

2005-10-21 Thread Luke Palmer
On 10/21/05, Dave Whipp <[EMAIL PROTECTED]> wrote:
> Luke Palmer wrote:
> > And in those kinds of corporate environments, you're not going to be
> > working with any code but code written in-house.  Which means that
> > nobody is going to be using Latin-1, and everyone will be using the
> > ASCII synonyms.  What's the problem?
>
> My experience is that this isn't true: we use lots of external code, but
> I still need to file requests with IT to get system-settings changed.

Oh good, reduce the number of fears I have of working in a tightly
controlled corporate environment by one...  bringing it to 499.

Luke


Re: new sigil

2005-10-21 Thread Juerd
TSa skribis 2005-10-21 18:54 (+0200):
> My 2¢ is that we should reap ^ from the one junction and promote it to
> become the 'runtime type information carrier' sigil---like the wings
> on the feet of Hermes/Mercury :)

It is not necessary (or sane, but that's an opinion) to reap it from the
junction, because that's in infix/op position, while sigils are in
prefix/term position.

In Perl 5:

- % is a sigil and an infix operator

- * is a sigil and an infix operator

- & is a sigil and an infix operator

I do not see why $ and @ couldn't be both a sigil and an infix
operator, and the same goes for whatever ASCII equivalent ¢ gets.

^ and | are available for sigil use. (All the closing brackets are too,
but that would be very confusing because we tend to visually parse those
in pairs.)

Using the an infix operator's symbol as a sigil is not weird, not wrong,
not confusing and mostly: not a new idea.


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: new sigil

2005-10-21 Thread TSa

HaloO,

Brent 'Dax' Royal-Gordon wrote:

Steve Peters <[EMAIL PROTECTED]> wrote:


~ seems to be available for a sigil, if my reading of S02 is correct, and
the cent sign is replacing :: in all cases.  If not (that is $::foo is
still the global variable named foo) then * may also be available.



Sigils can't conflict with unary operators (like, say, the
stringification and flattening operators, ~ and *) and ideally
shouldn't conflict with binary ops either (although % breaks this
rule).


My 2¢ is that we should reap ^ from the one junction and promote it to
become the 'runtime type information carrier' sigil---like the wings
on the feet of Hermes/Mercury :)

And we should find an alternative to binary % which isn't very well
defined in it's abstract meaning---but I find that the 0/0 connotation
that it spawns in my infinitly twisted brain matches nicely with infinite
precision nums and I get the identities:

   Undef ::=   0/0;
   One   ::= Any/Any  # actually $x = any(1..Inf) && 1 == $x/$x
   Inf   ::= Inf/Inf  # the other Undef :)
   Type  ::= All  # the concept that is shared by all instances
  # and represented by the one meta representative

and of course some mixed cases like

   0 ::= 0/Any
 Inf ::= Any/0

The none junction hasn't one single char infix creator either. Also the
all junction is in partial conflict with the & sigil. OTOH, many fear
that junctive auto-threading enters their functions. And the junctions
have got very well picked short names.

In other words a comparison like

  if $x != $x { ... }

should *never* hit the nada operator. While

  if &x != &x { ... }

could depending on the evaluation of the code &x refers to.
--
$TSa.greeting := "HaloO"; # mind the echo!



Re: new sigil

2005-10-21 Thread Dave Whipp

Luke Palmer wrote:

As I mentioned earlier, most programmers in a corporate environment

>> have

limited access to system settings.
And in those kinds of corporate environments, you're not going to be
working with any code but code written in-house.  Which means that
nobody is going to be using Latin-1, and everyone will be using the
ASCII synonyms.  What's the problem?


My experience is that this isn't true: we use lots of external code, but 
I still need to file requests with IT to get system-settings changed.


That said, I have no objection to Latin-1 sigils. So it's only your 
argument that's bogus, not the conclusion ;-).


Re: new sigil

2005-10-21 Thread Spider Boardman
On Thu, 20 Oct 2005 16:52:04 -0600, Thom Boyer wrote (in part):

Thom> On 10/20/05, Juerd <[EMAIL PROTECTED]> wrote:

>> 2. How can it be typed with X character composition, vim's digraphs
>> and major international keyboards?

For X11 composition, where getting into compose state is up to your X
environment:
/c

In my case (for a more concrete example), that's "/c".

--s.


Re: new sigil

2005-10-21 Thread Steve Peters
On Fri, Oct 21, 2005 at 05:27:53PM +0200, Schneelocke wrote:
> On 21/10/05, Steve Peters <[EMAIL PROTECTED]> wrote:
> > I honestly don't know or care what flavor of vi I using, since it usually
> > changes depending on what *nix flavor I'm working on.  I also don't think 
> > that
> > it should make a difference what editor I'm using with a programming 
> > language.
> > Others seem to think differently.  C'est la vie.
> 
> It won't make a difference. Even if you're in an environment where you
> can neither type nor copy'n'paste the cent sign, you can still use the
> ASCII version of the sigil. Sure, it's going to be one extra
> keystroke, but that's not really a big issue - and even less so when
> you consider that you probably won't be using the class sigil as often
> as the others, anyway.
> 
> The amount of typing that was required for your emails in this thread
> so far probably exceeds the amount of extry typing you'll have to do
> to use the ASCII version of the sigil for your entire life already. :)

For me, all that you have written above is correct.  That still does not
fix that potential advocacy and documentation issues that are created by
this.  Someone who is new to Perl 6 after its released may not know the
difference.  That's the problem.

Steve Peters
[EMAIL PROTECTED]


Re: new sigil

2005-10-21 Thread Schneelocke
On 21/10/05, Steve Peters <[EMAIL PROTECTED]> wrote:
> I honestly don't know or care what flavor of vi I using, since it usually
> changes depending on what *nix flavor I'm working on.  I also don't think that
> it should make a difference what editor I'm using with a programming language.
> Others seem to think differently.  C'est la vie.

It won't make a difference. Even if you're in an environment where you
can neither type nor copy'n'paste the cent sign, you can still use the
ASCII version of the sigil. Sure, it's going to be one extra
keystroke, but that's not really a big issue - and even less so when
you consider that you probably won't be using the class sigil as often
as the others, anyway.

The amount of typing that was required for your emails in this thread
so far probably exceeds the amount of extry typing you'll have to do
to use the ASCII version of the sigil for your entire life already. :)

> Steve Peters
> [EMAIL PROTECTED]

--
schnee


Re: new sigil

2005-10-21 Thread Steve Peters
On Fri, Oct 21, 2005 at 10:30:40AM -0400, Rob Kinyon wrote:
> > > So, you are proposing that the Perl of the Unicode era be limited to
> > > ASCII because a 15 year old editor cannot handle the charset? That's
> > > like suggesting that operating systems should all be bootable from a
> > > single floppy because not everyone has access to a CD drive.
> >
> > I saying that, since my up-to-date version of vi on my up-to-date OpenBSD
> > can't type, much less even allow me to paste in, a Latin-1 character, this
> > is an issue.
> 
> You're still using the base vi vs. vim?!? I didn't know people did
> that when it wasn't 3am on Sunday when trying to fix a borked /etc ...
> Huh!

I honestly don't know or care what flavor of vi I using, since it usually
changes depending on what *nix flavor I'm working on.  I also don't think that
it should make a difference what editor I'm using with a programming language.
Others seem to think differently.  C'est la vie.

Steve Peters
[EMAIL PROTECTED]


Re: new sigil

2005-10-21 Thread Thom Boyer
On 10/20/05, Juerd <[EMAIL PROTECTED]> wrote:
>
> Larry Wall skribis 2005-10-20 7:56 (-0700):
> > the new sigil is the cent sign, so ::T is now written ¢T instead.
>
> 1. What does it look like? I've never used a cent sign, and have seen
> several.


It looks like a lowercase c with a vertical line through it -- though the
vertical line is often slanted forward, so it looks like a c overtyped with
a slash ("/").

2. How can it be typed with X character composition, vim's digraphs and
> major international keyboards?


For vim, use CTRL-K C t

I can't address the other contexts.
=thom

"A painting in a museum hears more ridiculous opinions than anything else in
the world."
Edmond de Goncourt


Re: new sigil

2005-10-21 Thread Juerd
Steve Peters skribis 2005-10-21  9:10 (-0500):
> I saying that, since my up-to-date version of vi on my up-to-date OpenBSD
> can't type, much less even allow me to paste in, a Latin-1 character, this
> is an issue.

You should report this bug. Hopefully, it will then be fixed before Perl
6 is released.


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: new sigil

2005-10-21 Thread Rob Kinyon
> > So, you are proposing that the Perl of the Unicode era be limited to
> > ASCII because a 15 year old editor cannot handle the charset? That's
> > like suggesting that operating systems should all be bootable from a
> > single floppy because not everyone has access to a CD drive.
>
> I saying that, since my up-to-date version of vi on my up-to-date OpenBSD
> can't type, much less even allow me to paste in, a Latin-1 character, this
> is an issue.

You're still using the base vi vs. vim?!? I didn't know people did
that when it wasn't 3am on Sunday when trying to fix a borked /etc ...
Huh!

Rob


Re: new sigil

2005-10-21 Thread Mark Reed

On 2005-10-21 10:10 AM, "Steve Peters" <[EMAIL PROTECTED]> wrote:
> I saying that, since my up-to-date version of vi on my up-to-date OpenBSD
> can't type, much less even allow me to paste in, a Latin-1 character, this
> is an issue.

If you're using stock vi rather than vim or elvis or at least nvi,
"up-to-date" doesn't apply. :)   But the pasting problem has more to do with
your windowing and terminal environment, and I'd be surprised if there
weren't a simple tweak that would make it work for you.

 




Re: Y [was: "Re: new sigil"]

2005-10-21 Thread Mark Reed
> Speaking of which, the advantage of, say, « over << is that the former
> is _one_ character. But Y, compared to ¥, is one character only as
> well, and is even more visually distinctive with most fonts I know of,
> afaict, so is there any good reason to keep the latter as the
> "official" one?!?

I can't speak for anyone else, but personally I prefer ¥ because I don't
like infix operators that look like identifiers.  It's idiosyncratic,
admittedly, but I dislike Pascal's "mod" and Perl5's "x" for the same
reason.  Even with the ability to use Unicode names, ¥ can't be an
identifier, because it's not a letter, it's a currency symbol.  Now that
we've opened up the Pandora's box of Unicode, we have lots more letters, but
also lots more non-letters, and I'd rather see the latter used for
operators.

Just my 2¢. :)




Re: Y [was: "Re: new sigil"]

2005-10-21 Thread Michele Dondi

On Thu, 20 Oct 2005, Rutger Vos wrote:

 _one_ charachter. But Y, compared to ¥, is one charachter only as well, 
 and is even more visually distinctive with most fonts I know of, afaict, 
 so is there any good reason to keep the latter as the "official" one?!?


Do you even need to ask? It's because it *looks cool* :)


Does it? Guillemets _do_ look kool, but I don't by the argument for the 
Yen symbol...



Michele
--

Is e+pi a rational or irrational number?

Yes, it is.
- Robert Israel in sci.math, "Re: A Number Problem"

Re: new sigil

2005-10-21 Thread Steve Peters
On Fri, Oct 21, 2005 at 09:35:12AM -0400, Rob Kinyon wrote:
> On 10/21/05, Steve Peters <[EMAIL PROTECTED]> wrote:
> > On Fri, Oct 21, 2005 at 02:37:09PM +0200, Juerd wrote:
> > > Steve Peters skribis 2005-10-21  6:07 (-0500):
> > > > Older versions of Eclipse are not able to enter these characters.  
> > > > That's
> > > > where the copy and paste comes in.
> > >
> > > That's where upgrades come in.
> > >
> > That's where lots of money to update to the next version of WSAD becomes the
> > limiting factor.
> 
> So, you are proposing that the Perl of the Unicode era be limited to
> ASCII because a 15 year old editor cannot handle the charset? That's
> like suggesting that operating systems should all be bootable from a
> single floppy because not everyone has access to a CD drive.

I saying that, since my up-to-date version of vi on my up-to-date OpenBSD
can't type, much less even allow me to paste in, a Latin-1 character, this
is an issue.


Re: new sigil

2005-10-21 Thread Stefan Lidman
> For me AltGr + C gives Copyright-symbol "(c)".

For me too, but AltGr + shift + E gives ¢.

/Stefan Lidman


Re: new sigil

2005-10-21 Thread Braňo Tichý


- Original Message - 
From: "Steve Peters" <[EMAIL PROTECTED]>

To: "Luke Palmer" <[EMAIL PROTECTED]>
Cc: 
Sent: Friday, October 21, 2005 4:21 AM
Subject: Re: new sigil




But I may have to support your code.  That's the issue.



Isn't perl6 assuming the source file is in UTF-8 unless explicitly specified 
differently?


Also it's quite interesting how often was Latin-1 and UTF-8 used in the 
discussion interchangeably;
"every source is Latin-1" is marginally better than "every source is ASCII", 
but we can do better.


As for keyboard layouts: I don't think there is Yen sign on US keyboard 
either.
I also use Slovak layout, which does not have backtick (only grave accent) 
and all sigils but % are written with AltGr. So what. I got used to it.
On the other hand, there is ¤ sign. (That's U+00A4 Currency Sign -- hey, it 
looks like little o. If ¢ is maimed c for class, then ¤ may be o for object. 
Or universal-unspecified-i-dont-care-sigil.)


braňo

P.S. this e-mail should be sent in UTF-8.

 



Re: Y [was: "Re: new sigil"]

2005-10-21 Thread Rutger Vos


Speaking of which the advantage of, say, « over << is that the former 
is _one_ charachter. But Y, compared to ¥, is one charachter only as 
well, and is even more visually distinctive with most fonts I know of, 
afaict, so is there any good reason to keep the latter as the 
"official" one?!?



Do you even need to ask? It's because it *looks cool* :)

We need *more* of these. I can't wait until the day when I can finally 
code in overloaded Tagalog or Gujarati:


http://www.iam.uni-bonn.de/~alt/html/unicode_23.html


Re: new sigil

2005-10-21 Thread Rob Kinyon
On 10/21/05, Steve Peters <[EMAIL PROTECTED]> wrote:
> On Fri, Oct 21, 2005 at 02:37:09PM +0200, Juerd wrote:
> > Steve Peters skribis 2005-10-21  6:07 (-0500):
> > > Older versions of Eclipse are not able to enter these characters.  That's
> > > where the copy and paste comes in.
> >
> > That's where upgrades come in.
> >
> That's where lots of money to update to the next version of WSAD becomes the
> limiting factor.

So, you are proposing that the Perl of the Unicode era be limited to
ASCII because a 15 year old editor cannot handle the charset? That's
like suggesting that operating systems should all be bootable from a
single floppy because not everyone has access to a CD drive.

Rob


Re: new sigil

2005-10-21 Thread Steve Peters
On Fri, Oct 21, 2005 at 02:37:09PM +0200, Juerd wrote:
> Steve Peters skribis 2005-10-21  6:07 (-0500):
> > Older versions of Eclipse are not able to enter these characters.  That's
> > where the copy and paste comes in.
> 
> That's where upgrades come in.
> 
That's where lots of money to update to the next version of WSAD becomes the
limiting factor.

Steve Peters
[EMAIL PROTECTED]


Re: new sigil

2005-10-21 Thread Carl Franks
On 21/10/05, Steve Peters <[EMAIL PROTECTED]> wrote:
> On Fri, Oct 21, 2005 at 09:42:00AM +0100, Carl Franks wrote:
> > Where did you get ALT-155 from?
> >
> > I've just checked the windows Character Map, and ¢ (cent) is ALT-0162
> > ( If it's not in your startmenu, do start -> run -> charmap )
>
> Actually, both work.  That's where the issus with the documentation starts.

Strange, in any windows app on my machine, ALT-155 prints "o" with a
diagonal line through it (bottom left to upper right).
cent: ¢
not: ø
I wonder if it's a font issue?

Carl


Re: new sigil

2005-10-21 Thread Juerd
Steve Peters skribis 2005-10-21  6:07 (-0500):
> Older versions of Eclipse are not able to enter these characters.  That's
> where the copy and paste comes in.

That's where upgrades come in.


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: new sigil

2005-10-21 Thread Juerd
Brent 'Dax' Royal-Gordon skribis 2005-10-20 21:42 (-0700):
> @   Array sigil Array sigil
> $   Scalar sigilScalar sigil
> %   Hash sigil  Hash sigil, modulo

In non-term, it's not a sigil. There cannot be two subsequent terms.

This is why it makes no sense to want sigils to be free in infix/op
position, and why % and ^ would work well (without ambiguity) as sigils. 

> ^   (Not sure)  one() junction

^ is available in prefix/term

> (   Open paren  Subroutine call

open paren: grouping. The paren is the glyph, not its function.

Also, for the subcall to work, it's not all possible infix/op, but only
postix with no whitespace in between. Same for other .[] where [] is any
set of brackets, and the dot is implied.

> {   Block   Hash index

Block/hash

> <   quote words Less than

Also, hash subscript.

> There are very few unary operators available, and none (besides the
> user-defined backticks operator) unused in both term and operator
> context.

But that isn't necessary. It's not as if % used in two ways is new, and
was already overstepping a boundary. It's perfectly normal to have one
glyph do very different things according to how/where it's used.


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: new sigil

2005-10-21 Thread Steve Peters
On Fri, Oct 21, 2005 at 09:42:00AM +0100, Carl Franks wrote:
> Where did you get ALT-155 from?
> 
> I've just checked the windows Character Map, and ¢ (cent) is ALT-0162
> ( If it's not in your startmenu, do start -> run -> charmap )

Actually, both work.  That's where the issus with the documentation starts.
> 
> It displays in Eclipse (3.1.1) whether the Text File Encoding is set to
> Cp1252 (default) or UTF-8 or ISO-8859-1

Older versions of Eclipse are not able to enter these characters.  That's
where the copy and paste comes in.

Steve Peters
[EMAIL PROTECTED]


Re: new sigil

2005-10-21 Thread Steve Peters
On Fri, Oct 21, 2005 at 11:03:07AM +0200, Bra??o Tichý wrote:
> 
> - Original Message - 
> From: "Steve Peters" <[EMAIL PROTECTED]>
> To: "Luke Palmer" <[EMAIL PROTECTED]>
> Cc: 
> Sent: Friday, October 21, 2005 4:21 AM
> Subject: Re: new sigil
> 
> 
> >
> >But I may have to support your code.  That's the issue.
> >
> 
> Isn't perl6 assuming the source file is in UTF-8 unless explicitly 
> specified differently?

My point is that there is a difference between the source file being in
Unicode and depending on characters outside of ASCII.  If someone wants
to code using whatever Unicode characters they want, that's fine.  Not
every computer or editor can do Unicode out of the box.  The issue 
starts when people are required to write code outside of ASCII and that
is not available.

> 
> Also it's quite interesting how often was Latin-1 and UTF-8 used in the 
> discussion interchangeably;
> "every source is Latin-1" is marginally better than "every source is 
> ASCII", but we can do better.
> 
> As for keyboard layouts: I don't think there is Yen sign on US keyboard 
> either.

And that is as much of an issue.

> bra??o
> 
> P.S. this e-mail should be sent in UTF-8.

And I see your name as "bra??o" :)


Re: new sigil

2005-10-21 Thread Carl Franks
Where did you get ALT-155 from?

I've just checked the windows Character Map, and ¢ (cent) is ALT-0162
( If it's not in your startmenu, do start -> run -> charmap )

It displays in Eclipse (3.1.1) whether the Text File Encoding is set to
Cp1252 (default) or UTF-8 or ISO-8859-1

Cheers,
Carl


Re: new sigil

2005-10-21 Thread Kaoru Maeda

Darren Duncan wrote:

In this case, I support the use of any international currency symbol 
for use as Perl sigils and/or operators as appropriate.  Eg, we 
already use $ (dollar; unicode=0024; utf8=24) and ¥ (yen; 
unicode=00A5; utf8=C2A5), and I suggest that the next best one to 
exploit is ¤ (euro; unicode=20AC; utf8=E282AC), and the next best is £ 
(pound; unicode=00A3; utf8=C2A3).  In my experience, the ¢ (cent; 
unicode=00A3; utf8=C2A3) is no harder to type than either of those. 


I haven't read this list for quite a long time, but do we already have 
the yen sign as a sigil?
In Japan, there has been a big confusion between backslashes and yen 
signs over two decades.
The code point 0x5c is a backslash in ASCII but it is the yen sign in 
JISX0201.
When I display ASCII Perl program with my Japanese Windows' notepad, it 
shows all the backslashes as yen signs.

Japanese Perl books sometimes tell:
 "If you cannot find a backslash on your keyboard, use the yen sign".
Thus we usually think yen = ascii 005c,
my eyes are optimized to unify a backslash and a yen sign in program codes,
my finger is optimized to hit the yen key when my brain thinks of a 
backslash. 
It's already merged into my reflection :P


Yes, I know.  Careful configuration of your editor should allow you to 
distinguish ASCII 0x5c from JISX0201 0x5c.
But in Japan, only a very keen coding-system/character-set wizard can do 
that.


Don't you have similar confusions with the pound sign in ISO-646 British 
version?

> the next best is £ (pound; unicode=00A3; utf8=C2A3)
Isn't that 0x23 in UK?  I imagine that someday all the comment lines 
cause syntax errors in UK...


Sorry if this is an already discussed and solved issue.

--
Kaoru Maeda
[EMAIL PROTECTED]



Re: new sigil

2005-10-20 Thread Markus Laire

Sam Vilain wrote:

ps, X11 users, if you have any key bound to "AltGr", then "AltGr" + C
might well give you a ¢ sign without any extra reconfiguration.


For me AltGr + C gives Copyright-symbol "©".
(SuSe 9.1, tested in konsole, kwrite and thunderbird)

--
Markus Laire


Re: new sigil

2005-10-20 Thread Brent 'Dax' Royal-Gordon
Steve Peters <[EMAIL PROTECTED]> wrote:
> ~ seems to be available for a sigil, if my reading of S02 is correct, and
> the cent sign is replacing :: in all cases.  If not (that is $::foo is
> still the global variable named foo) then * may also be available.

Sigils can't conflict with unary operators (like, say, the
stringification and flattening operators, ~ and *) and ideally
shouldn't conflict with binary ops either (although % breaks this
rule).

This has been done before several times on p6l, but I'll do it again:

Chr TermOperator
=== 
~   Stringify   Concatenate
`   Reserved for user   Reserved for user
!   Not
@   Array sigil Array sigil
#   Comment Comment
$   Scalar sigilScalar sigil
%   Hash sigil  Hash sigil, modulo
^   (Not sure)  one() junction
&   Subroutine sigilall() junction
*   Unary splat Multiplication
(   Open paren  Subroutine call
)   (technically unused)Close paren
-   Negate  Subtract
_   Identifier  (technically unused)
=   Iteration   Assign
+   Numify  Add
\   Take reference
|   any() junction
[   Anonymous array Array index
{   Block   Hash index
]   (technically unused)Close square bracket
}   (technically unused)Close curly bracket
;   (technically unused)Statement delimiter, anonymous array
:   Pair"super comma"
'   Single quotes   (technically unused)
"   Double quotes   (technically unused)
,   (technically unused)List items
<   quote words Less than
.   Method call on topicMethod call
>   (technically unused)Greater than
/   Anonymous rule  Divide
?   Boolify

There are very few unary operators available, and none (besides the
user-defined backticks operator) unused in both term and operator
context.

--
Brent 'Dax' Royal-Gordon <[EMAIL PROTECTED]>
Perl and Parrot hacker


Re: new sigil

2005-10-20 Thread Steve Peters
On Thu, Oct 20, 2005 at 10:40:44PM -0400, Rob Kinyon wrote:
> > Surely you aren't suggesting that these non-English speakers do not have
> > access to the ASCII (or EBCDIC) character sets for their editors, are you?
> 
> Surely you aren't suggesting that your editor doesn't have access to
> the Latin-1 charset, are you? Let's take a look at popular editors:
> vi - check
> emacs - check
> eclipse - check
> mutt - check (http://www.rano.org/mutt.html)
> Notepad - check
> A bazillion other editors - check
> (http://www.alanwood.net/unicode/utilities_editors.html)

Not every installed version of the above can handle Latin-1 by default.  Since
many programmers have little control over their installed software, this 
remains an issue.  Also, the ability to do this within the application is
not well documented within many editors.  Finally, most will of the above allow 
you to paste in Latin-1 or even UTF-8 data, but the ability to actually 
enter it from a keyboard using the editor is a completely different issue.  

> > As I mentioned earlier, most programmers in a corporate environment have
> > limited access to system settings.  Changing them in some cases can cause
> > reprimands or dismissal.  Systems are often set up with the bare minimum
> > of locales and character sets necessary to do the job.  Also, you have to
> > deal with the situations where programmers are connecting to *nix servers
> > through a variety of Windows-based XWindows servers (Exceed, Cygwin, etc.)
> > complicates what character sets are available immensely.
> 
> I have worked as a contractor in almost a dozen settings, most of them
> corporate lockdowns, and I've always been able to go to my manager and
> say "To be more productive, I need this tool" and it would be loaded
> the next day. The few times I've had to talk to an IT person to
> explain the tool, I'd do it over lunch (my treat) and it would be on
> my desktop the next morning. Saying you cannot get a tool you need
> loaded on your machine is, essentially, saying that you cannot play
> corporate politics. I'm assuming you can, which means this is a straw
> man.

I don't think a programmer's skill (or lack thereof) in corporate politics
should be a prerequisite to experimenting in Perl 6.  My bigger point is
about system settings which are typically locked down and not usually 
sweet-talkable.  Also, getting new software purchased can be a painfully
slow depending on the bureaucracy involved, and generally requires lots
of beers and lunches, or the right catastrophe, which could have been 
prevented and/or repaired with the tool you want, to speed up the process.

Steve Peters
[EMAIL PROTECTED]


Re: new sigil

2005-10-20 Thread Sebastian
> Please let that the sigil looks like a certain leter not be a reason.
>
>
> Juerd

They make for good mnemonics, which isn't necessarily a bad thing for
people coming from languages without them or with fewer

- sebastian


Re: new sigil

2005-10-20 Thread Rob Kinyon
> Surely you aren't suggesting that these non-English speakers do not have
> access to the ASCII (or EBCDIC) character sets for their editors, are you?

Surely you aren't suggesting that your editor doesn't have access to
the Latin-1 charset, are you? Let's take a look at popular editors:
vi - check
emacs - check
eclipse - check
mutt - check (http://www.rano.org/mutt.html)
Notepad - check
A bazillion other editors - check
(http://www.alanwood.net/unicode/utilities_editors.html)

> I have worked on an app that needed to work with English (US and GB),
> German, and Japanese.  I do not, however, remember having to write my
> code in anything but ASCII.

No, I had to edit my -templates- and -data files- that included French
text. Some of them could use HTML entities, but the datafiles intended
for the DB couldn't.

> As I mentioned earlier, most programmers in a corporate environment have
> limited access to system settings.  Changing them in some cases can cause
> reprimands or dismissal.  Systems are often set up with the bare minimum
> of locales and character sets necessary to do the job.  Also, you have to
> deal with the situations where programmers are connecting to *nix servers
> through a variety of Windows-based XWindows servers (Exceed, Cygwin, etc.)
> complicates what character sets are available immensely.

I have worked as a contractor in almost a dozen settings, most of them
corporate lockdowns, and I've always been able to go to my manager and
say "To be more productive, I need this tool" and it would be loaded
the next day. The few times I've had to talk to an IT person to
explain the tool, I'd do it over lunch (my treat) and it would be on
my desktop the next morning. Saying you cannot get a tool you need
loaded on your machine is, essentially, saying that you cannot play
corporate politics. I'm assuming you can, which means this is a straw
man.

Rob


Re: new sigil

2005-10-20 Thread Steve Peters
On Thu, Oct 20, 2005 at 04:23:44PM -0600, Luke Palmer wrote:
> On 10/20/05, Steve Peters <[EMAIL PROTECTED]> wrote:
> > Like the old joke goes "Doctor, Doctor, it hurts when I try to type a 
> > Latin-1
> > character."  "So don't try to type Latin-1 characters!"  Instead, many
> > programmers will to use the ASCII equivolents that will require additional
> > keystrokes.
> 
> You mean additional keystroke.  We haven't yet developed any ASCII
> equivalent that takes more than two characters.  For most cases, the
> ASCII equivalents are easier to type than the Latin-1 versions. 
> However, being a Perl 6 programmer myself, I still use the Latin-1
> versions because I like how they look and feel better.  But nobody is
> forcing you to do the same.

But I may have to support your code.  That's the issue.

> 
> The one thing you have to worry about is if you use an editor that
> doesn't support Latin-1 to read somebody else's code.  However, many
> many popular editors are capable of doing this, and any editor that
> doesn't probably will soon.  We've been over this and over this.

I'd say a lot more editors support ASCII than Latin-1.  Also, you are also
assuming that programmers have control over what tools they have available,
and have the ability to upgrade whenever they wish.  I've found this to be
very far from reality.  I understand that the ability to process the code
as Unicode is an important feature of Perl 6.  There is a big difference
between allowing it and requiring it.  Writing off a large number of 
editors, and even operating systems, seems like a big shot in the foot.

My biggest concern, however, relates to advocacy.  There will need to be
books, magazine articles, tutorials, etc. written to announce the arrival
of Perl 6.  If the code uses Latin-1 characters, and people are unable to 
look at the example code in their favorite editor or type in some of the
example code, we'll lose that person to Perl 6.  The other alternative is 
to preface every article with the explanation of the separate ASCII/Latin-1
sigils.  That doesn't sound practical, and cannot be policed or enforced.

> 
> Also, don't think of the class sigil as a sigil.  You won't be writing
> it very often.  Just think of it as an operator.
> 
> My final point:  we don't introduce unicode characters lightly.  We do
> so when we think it is the best symbol for the job, optimizing, for
> once, for readability rather than writability.

As you mentioned above, readibility is a big issue.  If I can't tell one sigil
from another, or cannot even see it, how can I support the code?

> If you don't think the
> class sigil should be a unicode character, come up with a better one. 
> We're not going to say "You're right, Steve.  No more unicode sigils!"
> until wee see a good alternative to the unicode sigil that we have.

~ seems to be available for a sigil, if my reading of S02 is correct, and
the cent sign is replacing :: in all cases.  If not (that is $::foo is
still the global variable named foo) then * may also be available.

Steve Peters
[EMAIL PROTECTED]


Re: new sigil

2005-10-20 Thread Juerd
Schneelocke skribis 2005-10-21  1:07 (+0200):
> I think the reason why Larry proposed the "¢" is much simpler - it
> looks a bit like a c, which one could associate with "class", similar
> to how $ looks like S (scalar) and @ looks like a (array). :)

And how % looks like h (hash).

I dislike things like "$calar" and "@rray", and now some people will use
"¢lass" in examples.

Please let that the sigil looks like a certain leter not be a reason.


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: new sigil

2005-10-20 Thread Luke Palmer
On 10/20/05, Steve Peters <[EMAIL PROTECTED]> wrote:
> As I mentioned earlier, most programmers in a corporate environment have
> limited access to system settings.

And in those kinds of corporate environments, you're not going to be
working with any code but code written in-house.  Which means that
nobody is going to be using Latin-1, and everyone will be using the
ASCII synonyms.  What's the problem?

Please, just use the ASCII synonyms.  We've argued over the unicodity
of Perl 6 many times; you're certainly not the first with this
concern.  The result every time is: Unicode will be in Perl 6, nobody
is forcing you to use it.

Luke


Re: new sigil

2005-10-20 Thread Steve Peters
On Thu, Oct 20, 2005 at 05:03:27PM -0400, Rob Kinyon wrote:
> On 10/20/05, Steve Peters <[EMAIL PROTECTED]> wrote:
> > I have some serious concerns about using Latin-1 sigils within Perl 6 and
> > the ASCII multi-character aliases.  Am I not understanding something that
> > I should see this as an advantage?
> 
> I had the same concern a few months back. I've come to see the light
> in this fashion:
> 1) more and more Perl programmers come from non-English countries.
> Heck, the Pugs effort is at least 50% non-US, if not more. None of the
> are on US soil and very few of the leaders are US citizens.

Surely you aren't suggesting that these non-English speakers do not have
access to the ASCII (or EBCDIC) character sets for their editors, are you?

> 2) More and more of us are programming with internationalization
> (i18n) in mind. Just recently, I had to edit french text within the
> templates of an app I work on. If you haven't already, you will be
> doing so in the near future, within the next 3 years.

I have worked on an app that needed to work with English (US and GB), 
German, and Japanese.  I do not, however, remember having to write my
code in anything but ASCII.

> 3) Every editor (with very few exceptions) can display Latin-1
> and, with a few more exceptions, can input Latin-1. If your favorite
> editor cannot, then that's something to bring up with the authors.

As I mentioned earlier, most programmers in a corporate environment have
limited access to system settings.  Changing them in some cases can cause
reprimands or dismissal.  Systems are often set up with the bare minimum
of locales and character sets necessary to do the job.  Also, you have to
deal with the situations where programmers are connecting to *nix servers
through a variety of Windows-based XWindows servers (Exceed, Cygwin, etc.)
complicates what character sets are available immensely.

Also, what settings changes do I need to make to get Latin-1 on 
?  Welcome to your documentation
nightmare!  In Perl 5, we have a nearly impossible time keeping track of where
Microsoft has put their free compiler tools.  Now multiply that by the 
number of Linux distributions, BSD distributions, and various other operating
systems.  Don't forget different versions will do it differently, and have
documentation in different places.  Some of the documentation won't even be
available on the Internet, so Perl 6 would need to reference it in some way.
Are you beginning to get the magnitude of the documentation problem?

> 
> Windows ... yeah. As you pointed out, the old joke goes "Doctor,
> it hurts when I use Windows . . . then, don't use Windows!"

Well over 95% of the desktop computers in a corporate environment are using
Windows.  If you are suggesting Perl 6 ignores Windows, then we should all
start writing Perl 6's obituary.  This sort of attitude does nothing to
advance Perl 6.

> With the availability of dual-booting into FreeBSD/Linux (given the
> near-complete migration of all the necessary Office products) and both
> gvim and emacs having been successfully ported to WIn32, there is a
> way to do it. gvim on WinXP will do all Latin-1 charset with the vim
> keys. (I don't know about emacs, but I'd be shocked if it didn't.) If
> your IT department's policy is rigid, a quick discussion with your
> manager's manager will solve that problem immediately. Or, the cost of
> a few lunches with your favorite IT person will exempt your computer
> from the nightly audit. ($50 goes a long way ...)
> 
Again, I'd prefer not to be fired.  Everything you have written above is
not an option for the majority of the programmers out there.  Also, not
to helpful if you write your programs in TSO on an IBM mainframe.

> Personally, I plan on using every single Latin-1 operator I am
> given access to. All the cool kids will ...
 
Famous last words have never been more finely spoken.  Ignoring Windows and
other environments without ready access to Latin-1 seems like a horrible 
mistake to me.  While the cool kids are playing with their Latin-1 sigils, 
programmers in corporate environments where Latin-1 isn't available will 
start writing their new systems in Java, Ruby, or .NET.  

Steve Peters
[EMAIL PROTECTED]


Re: new sigil

2005-10-20 Thread Schneelocke
On 21/10/05, Darren Duncan <[EMAIL PROTECTED]> wrote:
> On the other hand, if you want to use the ¢ due
> to its being conceptually tied to $, that they
> are different units of currency meant to be used
> together, then the ¢ is fine.

I think the reason why Larry proposed the "¢" is much simpler - it
looks a bit like a c, which one could associate with "class", similar
to how $ looks like S (scalar) and @ looks like a (array). :)

--
schnee


Re: new sigil

2005-10-20 Thread Darren Duncan
Speaking briefly, Unicode is the way of the 
future, and even many modern systems have strong 
support for it.  Perl 6 is a language of the 
future plus present, not of the past, and 
shouldn't be limited by things that are only 
issues for older systems while even then being 
easy to work-around on them.


I say that we should exploit all the Unicode 
characters reasonably possible to make for a more 
elegant language, and any tools currently behind 
will catch up before long.


In this case, I support the use of any 
international currency symbol for use as Perl 
sigils and/or operators as appropriate.  Eg, we 
already use $ (dollar; unicode=0024; utf8=24) and 
¥ (yen; unicode=00A5; utf8=C2A5), and I suggest 
that the next best one to exploit is ¤ (euro; 
unicode=20AC; utf8=E282AC), and the next best is 
£ (pound; unicode=00A3; utf8=C2A3).  In my 
experience, the ¢ (cent; unicode=00A3; utf8=C2A3) 
is no harder to type than either of those.


In some cases, typing a ¢ is easier than most of 
those characters.  On a Macintosh keyboard, 
typing opt-4 will get a ¢ as shift-4 gets a $. 
For that matter, Macintosh keyboards and their 
'option' key allows one to type twice as many 
characters without entering special codes or 
using an input palette as other keyboards having 
only a 'shift' key do.  So in that respect, if 
you want a sigil that is meant to be discouraged 
due to being harder to type, then ¢ may be a 
worse choice than some other options.


On the other hand, if you want to use the ¢ due 
to its being conceptually tied to $, that they 
are different units of currency meant to be used 
together, then the ¢ is fine.


All this being said, if you explicitly want to 
have ASCII alternatives for all Unicode 
characters being used, then I suggest it is best 
to keep the use of Unicode characters mainly in 
operators, because those are always surrounded by 
whitespace and can easily be substituted for 
latin words.


Whereas, because sigils are always right next to 
ordinary word characters, I suggest that they 
should always be ASCII characters, or that the 
ASCII equivalent should not contain any word 
characters.  My impression is that sigils 
containing alphanumerics just look wrong.


Perhaps a solution here for an ASCII equivalent 
is something combining the $ and something else. 
How about this twigil, which combines '::' and 
'$':


  :$:

Does that conflict with anything?

-- Darren Duncan


Re: new sigil

2005-10-20 Thread Sam Vilain
On Thu, 2005-10-20 at 08:45 -0700, Larry Wall wrote:
> More info.  ¢T is a scalar variable just like $T, but enforces a
> class view, so you can use it as a class parameter, and pass any
> object to it, but only access the classish aspects of the object.
> The only other big difference is that you can use it in the class
> syntactic slot, so it's legal to say ¢T $x where it would be illegal
> to say $T $x.

Is this necessary?  Isn't putting a variable before another variable
like that in the correct context (subroutine declaration, in this case),
enough to imply that the variable "does Class" ?

While I'm not arguing against another sigil type, I think this would
distinguish it from the other sigils % and @, which are just an implicit
(does Hash) / (does Array), as well as being a part of the unique name,
as I understand it so far.

This makes me wonder which language feature is used to describe sigils
themselves.  Can I define my own sigils with their own type
implications?

Sam.

ps, X11 users, if you have any key bound to "AltGr", then "AltGr" + C
might well give you a ¢ sign without any extra reconfiguration.





Re: new sigil

2005-10-20 Thread Luke Palmer
On 10/20/05, Steve Peters <[EMAIL PROTECTED]> wrote:
> Like the old joke goes "Doctor, Doctor, it hurts when I try to type a Latin-1
> character."  "So don't try to type Latin-1 characters!"  Instead, many
> programmers will to use the ASCII equivolents that will require additional
> keystrokes.

You mean additional keystroke.  We haven't yet developed any ASCII
equivalent that takes more than two characters.  For most cases, the
ASCII equivalents are easier to type than the Latin-1 versions. 
However, being a Perl 6 programmer myself, I still use the Latin-1
versions because I like how they look and feel better.  But nobody is
forcing you to do the same.

The one thing you have to worry about is if you use an editor that
doesn't support Latin-1 to read somebody else's code.  However, many
many popular editors are capable of doing this, and any editor that
doesn't probably will soon.  We've been over this and over this.

Also, don't think of the class sigil as a sigil.  You won't be writing
it very often.  Just think of it as an operator.

My final point:  we don't introduce unicode characters lightly.  We do
so when we think it is the best symbol for the job, optimizing, for
once, for readability rather than writability.  If you don't think the
class sigil should be a unicode character, come up with a better one. 
We're not going to say "You're right, Steve.  No more unicode sigils!"
until wee see a good alternative to the unicode sigil that we have.

Luke


Re: new sigil

2005-10-20 Thread Rob Kinyon
On 10/20/05, Steve Peters <[EMAIL PROTECTED]> wrote:
> I have some serious concerns about using Latin-1 sigils within Perl 6 and
> the ASCII multi-character aliases.  Am I not understanding something that
> I should see this as an advantage?

I had the same concern a few months back. I've come to see the light
in this fashion:
1) more and more Perl programmers come from non-English countries.
Heck, the Pugs effort is at least 50% non-US, if not more. None of the
are on US soil and very few of the leaders are US citizens.
2) More and more of us are programming with internationalization
(i18n) in mind. Just recently, I had to edit french text within the
templates of an app I work on. If you haven't already, you will be
doing so in the near future, within the next 3 years.
3) Every editor (with very few exceptions) can display Latin-1
and, with a few more exceptions, can input Latin-1. If your favorite
editor cannot, then that's something to bring up with the authors.

Windows ... yeah. As you pointed out, the old joke goes "Doctor,
it hurts when I use Windows . . . then, don't use Windows!" With the
availability of dual-booting into FreeBSD/Linux (given the
near-complete migration of all the necessary Office products) and both
gvim and emacs having been successfully ported to WIn32, there is a
way to do it. gvim on WinXP will do all Latin-1 charset with the vim
keys. (I don't know about emacs, but I'd be shocked if it didn't.) If
your IT department's policy is rigid, a quick discussion with your
manager's manager will solve that problem immediately. Or, the cost of
a few lunches with your favorite IT person will exempt your computer
from the nightly audit. ($50 goes a long way ...)

Personally, I plan on using every single Latin-1 operator I am
given access to. All the cool kids will ...

Rob


Re: new sigil

2005-10-20 Thread Steve Peters
On Thu, Oct 20, 2005 at 10:24:23AM -0700, chromatic wrote:
> On Thu, 2005-10-20 at 10:32 -0500, Steve Peters wrote:
> 
> > The idea of punishing programmers who choose to use certain operating system
> > or locales just doesn't seem right to me.
> 
> Haven't they already acclimated to the punishment of those operating
> systems already?
> 

I have rarely considered working in Eclipse or WSAD to be a punishment, but
I still can't type a Latin-1 sigil on its editor.  Here on my OpenBSD box,
I can't even cut and paste a Latin-1 sigil here in mutt.  There are many
things that I get punished for by being required to use Windows at work,
but I've not programmed in a language that punishes me for the characters
available on my system.

Since there have been some concerns regarding the lack of suggestions in this
thread, my suggestion is to avoid non-ASCII sigils completely.  There are
a couple of reasons I see for this.  

The first reason is efficiency.  I started programming with Perl 5 because of 
its efficiency. The lack of the code-compile-run loop helped to shorten 
development 
and feedback times.  The fact that I had to go to Google to figure out how to
type a cent character doesn't bode well for my efficiency in Perl 6.  The best
way I can see currently on my current desktop setup is:
* Start up Microsoft Word
* Type the character as ALT-155 (the 155 must be typed on the numeric 
keypad)
* Copy and paste the character into my editor
Like the old joke goes "Doctor, Doctor, it hurts when I try to type a Latin-1 
character."  "So don't try to type Latin-1 characters!"  Instead, many 
programmers will to use the ASCII equivolents that will require additional
keystrokes.  Ideally, a lazy programmer will develop shortcuts to make this
easier, but this, of course, takes time and the right editor.  

The second reason is in educating the average programmer.  There may be books 
written on Perl 6 that don't explain the ASCII equivolents for the Latin-1
sigils and vice-versa.  If you don't think that will be the case, lets take 
Perl 5 as the example.  There are many beginning Perl 5 books, even those 
written by reputable authors, that treat "for" and "foreach" very differently,
when they are identical in every way.  I would hope the book editors will be  
good enough to catch the sigil differences in Perl 6, but this seems rather 
naive on my part.  

These both cause problems with advocacy.  The high-end Perl programmers who
these sigils are supposed to be for are also, typically, the best advocates and 
the ones trusted in a typical programming shop.  If this programmer has to
advocate changes in the entire development environment to get the most 
efficient 
environment they can get along with migrating to Perl 6, this programmer is
going to have a tough fight, especially when competing against the likes of
Java with Eclipse/WSAD or Ruby on Rails.

I have some serious concerns about using Latin-1 sigils within Perl 6 and
the ASCII multi-character aliases.  Am I not understanding something that 
I should see this as an advantage?

Steve Peters
[EMAIL PROTECTED]


Re: new sigil

2005-10-20 Thread Juerd
Luke Palmer skribis 2005-10-20 10:07 (-0600):
> You seem to be forgetting that we do have the longest token rule.  So,
> the only way this destroys a dream (and likewise, the only way c|
> doesn't work), is if you have the poor package or class name c and you
> insist on writing c|d or c!d without spaces.

I'd like to be able to use these without whitespace, and I expect it to
be commonly written without whitespace for simple cases, because 1|2|3
isn't any less clear than 1 | 2 | 3, while it's a lot easier to type.

> Still, if you'd like to make a suggestion instead of just telling us
> why our ideas don't work in very specific circumstances, feel free.

I've already suggested two. Is that not enough?

(a) ^
(b) 1c


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: new sigil

2005-10-20 Thread chromatic
On Thu, 2005-10-20 at 10:32 -0500, Steve Peters wrote:

> The idea of punishing programmers who choose to use certain operating system
> or locales just doesn't seem right to me.

Haven't they already acclimated to the punishment of those operating
systems already?

-- c



Re: new sigil

2005-10-20 Thread Eric
What about something like:

c\

Then you get

sub sametype (c\T $x, c\T $y) {...}

Not exactly pretty though. c\T

Actualy i think ^ might be my favorite so far.

sub sametype (^T $x, ^T $y) {...}

--
Eric


Re: new sigil

2005-10-20 Thread Michele Dondi

On Thu, 20 Oct 2005, Larry Wall wrote:


: > c| or C| maybe.

[snip]

: if $foo eq c|d { ... }

Other suggestions welcome.


<| maybe? And what will we make |> do?


Michele
--
Se non te ne frega nulla e lo consideri un motore usa e getta, vai
pure di avviatore, ma e' un vero delitto. Un po' come vedere un 
cavallo che sodomizza un criceto!!!

- Cesare/"edizioni modellismo sas" in it.hobby.modellismo


Y [was: "Re: new sigil"]

2005-10-20 Thread Michele Dondi

On Thu, 20 Oct 2005, Juerd wrote:


All non-ASCII operators have ASCII equivalents:

   ¥   Y
   «   <<
   »   >>


Speaking of which the advantage of, say, « over << is that the former is 
_one_ charachter. But Y, compared to ¥, is one charachter only as well, 
and is even more visually distinctive with most fonts I know of, afaict, 
so is there any good reason to keep the latter as the "official" one?!?



Michele
--
Commander Helena Braddock: So, where is everybody?
Melanie Ballard: Yeah, Friday night, the whole place should be packed. A
whole twelve hours before sun up and there's money to burn, whores to fuck
and drugs to take.
- Gosts of Mars (2001)

Re: new sigil

2005-10-20 Thread Jonathan Scott Duff
On Thu, Oct 20, 2005 at 08:55:46AM -0700, Larry Wall wrote:
> On Thu, Oct 20, 2005 at 05:53:00PM +0200, Juerd wrote:
> : Larry Wall skribis 2005-10-20  8:46 (-0700):
> : > On Thu, Oct 20, 2005 at 05:35:10PM +0200, Juerd wrote:
> : > : I'm sure ¢ will have its equivalent too.
> : > c| or C| maybe.
> : 
> : But 
> : 
> : sub c { ... }
> : sub d { ... }
> : 
> : if $foo eq c|d { ... }
> 
> Other suggestions welcome.

I don't know ... since we're still using ::T for classy things, I'd
kind of like to see something with a : in it. I also get the feeling
that these are type/class placeholders, so I wouldn't mind a ^ either.
Here are some suggestions:

:$T
:^T
^^T
:&T
$::T
$:T 
[T] # these next 3 don't evoke "variable" as much as 
 # parametric type (ala C++)
(T)

And yes, I know several of those are already "taken".  I'm suggesting
that we at least think about reassigning them.

-Scott
-- 
Jonathan Scott Duff
[EMAIL PROTECTED]


Re: new sigil

2005-10-20 Thread Luke Palmer
On 10/20/05, Juerd <[EMAIL PROTECTED]> wrote:
> Schneelocke skribis 2005-10-20 18:00 (+0200):
> > Would c! be an option?
>
> In current Perl 6: Yes, because infix ! does not exist.
>
> But several people want ! to be a chainy none() constructor, and this
> would destroy a dream.

You seem to be forgetting that we do have the longest token rule.  So,
the only way this destroys a dream (and likewise, the only way c|
doesn't work), is if you have the poor package or class name c and you
insist on writing c|d or c!d without spaces.

Still, if you'd like to make a suggestion instead of just telling us
why our ideas don't work in very specific circumstances, feel free.

Luke


Re: new sigil

2005-10-20 Thread Larry Wall
On Thu, Oct 20, 2005 at 05:35:10PM +0200, Juerd wrote:
: I'm sure ¢ will have its equivalent too.

c| or C| maybe.

Larry


Re: new sigil

2005-10-20 Thread Juerd
Schneelocke skribis 2005-10-20 18:00 (+0200):
> Would c! be an option?

In current Perl 6: Yes, because infix ! does not exist.

But several people want ! to be a chainy none() constructor, and this
would destroy a dream.


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: new sigil

2005-10-20 Thread Schneelocke
On 20/10/05, Larry Wall <[EMAIL PROTECTED]> wrote:
> : But
> :
> : sub c { ... }
> : sub d { ... }
> :
> : if $foo eq c|d { ... }
>
> Other suggestions welcome.

Would c! be an option?

--
schnee


Re: new sigil

2005-10-20 Thread Luke Palmer
On 10/20/05, Larry Wall <[EMAIL PROTECTED]> wrote:
> Another thing I didn't mention is that that binds both the variable
> and its class.  But the $ variable is of course optional after the
> type, so you could just write that
>
> sub sametype (¢T, ¢T) {...}
>
> if you don't actually care about $x and $y.  Basically, ¢T captures
> the type of the associated scalar in any lvalue or declarative context,
> whether or not hte scalar itself is captured.

So it's a type position thing if it can be.  Good.  (I wonder if,
since it's allowed in term position, we will come up with ambiguities)

How about this:

sub foo(c|T $x) {
my sub util (c|T $in) {...}
util($x)
}

Is that c|T in util() a new, free type variable, or am I asserting
that the type of util()'s argument must be the same type as $x?

Luke


Re: new sigil

2005-10-20 Thread Larry Wall
On Thu, Oct 20, 2005 at 05:53:00PM +0200, Juerd wrote:
: Larry Wall skribis 2005-10-20  8:46 (-0700):
: > On Thu, Oct 20, 2005 at 05:35:10PM +0200, Juerd wrote:
: > : I'm sure ¢ will have its equivalent too.
: > c| or C| maybe.
: 
: But 
: 
: sub c { ... }
: sub d { ... }
: 
: if $foo eq c|d { ... }

Other suggestions welcome.

Larry


Re: new sigil

2005-10-20 Thread Larry Wall
On Thu, Oct 20, 2005 at 07:56:09AM -0700, Larry Wall wrote:
: I don't know how long this EuroOSCON net is going to stay up, so I'll be
: brief.  I think we're having a new "class" sigil.  Where we've been
: writing ::T, that will revert to meaning "an existing class T that
: we just might not see the declaration of for dynamic reasons".  Instead,
: the new sigil is the cent sign, so ::T is now written ¢T instead.
: 
: In addition, it doesn't automatically bind to T like we were making ::T
: do, so you have to use it consistently:
: 
: sub sametype (¢T $x, ¢T $y) {...}

Another thing I didn't mention is that that binds both the variable
and its class.  But the $ variable is of course optional after the
type, so you could just write that

sub sametype (¢T, ¢T) {...}

if you don't actually care about $x and $y.  Basically, ¢T captures
the type of the associated scalar in any lvalue or declarative context,
whether or not hte scalar itself is captured.

Sorry for all the short notes--we still don't know how long this OSCON
net will be up before they take it down.

Larry


Re: new sigil

2005-10-20 Thread Juerd
Larry Wall skribis 2005-10-20  8:46 (-0700):
> On Thu, Oct 20, 2005 at 05:35:10PM +0200, Juerd wrote:
> : I'm sure ¢ will have its equivalent too.
> c| or C| maybe.

But 

sub c { ... }
sub d { ... }

if $foo eq c|d { ... }


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: new sigil

2005-10-20 Thread Larry Wall
On Thu, Oct 20, 2005 at 11:46:30AM -0400, John Siracusa wrote:
: On 10/20/05 11:37 AM, Larry Wall wrote:
: > On Thu, Oct 20, 2005 at 10:32:14AM -0500, Steve Peters wrote:
: > : The idea of punishing programmers who choose to use certain operating 
system
: > : or locales just doesn't seem right to me.
: > 
: > That's why we provide ugly ASCII workarounds for all of them.  We just
: > haven't decided what the appropriate ugly ASCII workaround for ¢ should be.
: 
: So...no joy on the class(T) builtin/macro/whatever?  Does it look too much
: like a cast?

It looks too much like a class declaration, and we're not declaring
a class.  We're just declaring a variable that holds something that
"does class".

Larry


Re: new sigil

2005-10-20 Thread Larry Wall
On Thu, Oct 20, 2005 at 08:45:25AM -0700, Larry Wall wrote:
: More info.  ¢T is a scalar variable just like $T, but enforces a
: class view, so you can use it as a class parameter, and pass any
: object to it, but only access the classish aspects of the object.

And a nice side effect of that is that declaring the invocant ¢T
doesn't commit to whether you are thinking in a class-based or
prototype-based model.  And you wouldn't care until you got down
to a .clone or a .bless.

Larry


Re: new sigil

2005-10-20 Thread John Siracusa
On 10/20/05 11:37 AM, Larry Wall wrote:
> On Thu, Oct 20, 2005 at 10:32:14AM -0500, Steve Peters wrote:
> : The idea of punishing programmers who choose to use certain operating system
> : or locales just doesn't seem right to me.
> 
> That's why we provide ugly ASCII workarounds for all of them.  We just
> haven't decided what the appropriate ugly ASCII workaround for ¢ should be.

So...no joy on the class(T) builtin/macro/whatever?  Does it look too much
like a cast?

-John




Re: new sigil

2005-10-20 Thread Larry Wall
More info.  ¢T is a scalar variable just like $T, but enforces a
class view, so you can use it as a class parameter, and pass any
object to it, but only access the classish aspects of the object.
The only other big difference is that you can use it in the class
syntactic slot, so it's legal to say ¢T $x where it would be illegal
to say $T $x.

Larry


Re: new sigil

2005-10-20 Thread Larry Wall
On Thu, Oct 20, 2005 at 10:32:14AM -0500, Steve Peters wrote:
: The idea of punishing programmers who choose to use certain operating system
: or locales just doesn't seem right to me.

That's why we provide ugly ASCII workarounds for all of them.  We just
haven't decided what the appropriate ugly ASCII workaround for ¢ should be.

Larry


Re: new sigil

2005-10-20 Thread Juerd
Steve Peters skribis 2005-10-20 10:32 (-0500):
> The idea of punishing programmers who choose to use certain operating system
> or locales just doesn't seem right to me.

All non-ASCII operators have ASCII equivalents:

¥   Y
«   <<
»   >>

I'm sure ¢ will have its equivalent too.

(It's ^KCt in vim, btw)


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


  1   2   >