Re: Vocabulary
On Tuesday, December 16, 2003, at 05:36 PM, Chip Salzenberg wrote:

> Speed is for users. PR is for non-users. You want speed? OK, we can
> talk about the actual speed you actually need based on your actual
> usage patterns. But from a design perspective you're a collection of
> anecdote, not a user base; so your usage patterns may be irrelevant
> to Perl in the big picture. In a separate matter, non-users may
> perceive Perl {5,6} to be too slow for their needs; more to the
> point, they may *assume* that it is too slow without research and
> testing. That assumption is a public relations issue -- ironically,
> one which is fundamentally disconnected from the question of Perl's
> _actual_ efficiency.

Well, just for clarification; in my anecdotal case (server-side web applications), the speed I actually need is "as much as I can get", and "all the time". Every N cycles I save represents an increase in peak traffic capacity per server, which is, from a marketing perspective, essential.

If a potential client company needs to decide between two server-based products -- my Perl-based product, and a competing Java-based one -- one of the first questions they ask is "how much traffic can it handle for X dollars of hardware and software". I don't have to win that benchmark, but I have to be close. Otherwise I don't get to play.

I agree, it is frequently the case that the question of speed is made critical by people who most assuredly do not need it. But they still decide that way, and I have found that asserting to them that speed is not important has been... well, less than effective. I do not doubt that P6 will be much more competitive, speed-wise, than P5 -- but if it could actually _win_ a few benchmarks, it would turn my company's use of Perl from a PR problem to a PR advantage.

> your usage patterns may be irrelevant to Perl in the big picture.

The thought has crossed my mind repeatedly, believe me.

MikeL
Re: Vocabulary
On Tuesday, December 16, 2003, at 04:01 PM, Chip Salzenberg wrote:

> According to Michael Lazzaro:
> > As someone who has 90% of their projects relying very critically on
> > speed ...
>
> an anecdote ...

Yes.

> > and who has had to battle a number of clients' IT departments over
> > the years in defense of said speed compared to other popular
> > languages which, out of spite, I will not name,
>
> ... and a public relations issue.

Yes, again.

> Let us not confuse them.

I'm not sure I understand which part of that is in conflict. Is it the premise that some people use Perl in environments in which speed is an issue, the premise that Perl5 has a public relations issue about being inappropriate for speed-critical environments, or the conflation that someone who works in speed-critical environments, and wishes to use Perl, is going to run up against the public-relations issue?

MikeL
Re: Vocabulary
On Tuesday, December 16, 2003, at 03:00 PM, Luke Palmer wrote:

> But Perl hinges on laziness, doesn't it? Eh, I trust that Perl 6
> will make it easy to figure that out in most cases. I was coming
> from the perspective that 90% of my projects don't need speed; but I
> can say no such thing on account of my users.

And what about that un-accounted-for 10%? As someone who has 90% of their projects relying very critically on speed, and who has had to battle a number of clients' IT departments over the years in defense of said speed compared to other popular languages which, out of spite, I will not name, I beg you to never speak or think that sentence again. ;-)

MikeL
Re: Vocabulary
On Tuesday, December 16, 2003, at 12:20 PM, Gordon Henriksen wrote:

> finally by default? None for me; thanks, though.

I don't think so; we're just talking about whether you can extend a class at _runtime_, not _compiletime_ -- whether or not Perl can have some degree of confidence that, once a program is compiled, it won't have to assume the worst-case possibility of runtime alteration of every class, upon every single method call, just in case you've screwed with something.

They still aren't "final" classes, in that you can subclass them at will. You just can't subclass them at _runtime_, via C<eval>, unless you've specifically marked that you want to allow that for that _specific_ class. As Larry hypothesized:

> The other reason for "final" is to make it easy for the compiler to
> optimize. That's also problematical. As implemented by Java, it's a
> premature optimization. The point at which you'd like to know this
> sort of thing is just after parsing the entire program and just
> before code generation. And the promises have to come from the users
> of interfaces, not the providers, because the providers don't know
> how their services are going to be used. Methods, roles, and classes
> may never declare themselves final. They may be declared final only
> by the agreement of all their users. But the agreement could be
> implied by silence. If, by the time the entire program is parsed,
> nobody has said they want to extend an interface, then the interface
> can be considered closed. In other words, if you think you *might*
> want to extend an interface at run time, you'd better say so at
> compile time somehow. I think that's about as far as we can push it
> in the "final" direction.

-and-

> Actually, I think making people declare what they want to extend
> might actually provide a nice little safety mechanism for what can
> be modified by the eval and what can't. It's not exactly Safe, but
> it's a little safer.

-and-

> Seriously, I hope we can provide a framework in which you can screw
> around to your heart's content while modules are being compiled, and
> to a lesser extent after compilation. But we'll never get to a
> programming-in-the-large model if we can't limit most of the
> screwing around to the lexical scope currently being compiled, or at
> least to a known subset of the code.

So, if I may interpret that: it might not be so bad to have to declare whether or not you were going to extend/alter a class at runtime, in order that Perl could optimize what it knows at compile-time for the 99.5% of the classes that you wouldn't be doing that for.

MikeL
Re: Vocabulary
On Tuesday, December 16, 2003, at 09:07 AM, Larry Wall wrote:

> Seriously, I hope we can provide a framework in which you can screw
> around to your heart's content while modules are being compiled, and
> to a lesser extent after compilation. But we'll never get to a
> programming-in-the-large model if we can't limit most of the
> screwing around to the lexical scope currently being compiled, or at
> least to a known subset of the code. Modules that turn off
> optimization for all other modules are going to be about as popular
> as $&. So the general declaration should probably be something easy
> to see like:
>
>     use STRANGE_SEMANTICS_THAT_SLOW_EVERYONE_DOWN;
>
> That will encourage people to be more specific about what they want
> to pessimize. Certainly, your fancy module should be encouraged to
> declare these things on behalf of its users if it can. I'm not
> suggesting that Lukian or Damianly modules force such declarations
> onto the users unless it's impossible for the module to know. And it
> seems to me that with sufficient control over the user's grammar,
> you can often get that information into your own fancy module
> somehow. Might take a few macros though, or analysis of the user's
> code at CHECK time (or maybe just before). And in general, it's
> probably not necessary to declare all the new interfaces, but only
> those interfaces known at compile time that want to stay open. Any
> interfaces added at run time are probably assumed to be open. So in
> some cases you might find yourself deriving a single open class at
> compile time from which you can derive other open classes later.

Exactly, assuming I correctly understand. :-)

My own first instinct would be that the run-time extensibility of a particular interface/class would simply be a trait attached to that class... by default, classes don't get it. By limiting or not limiting the amount of runtime screwin' around you can do with the class, that trait therefore controls the level of optimization that calls to methods, etc., are given -- but specific to that particular interface/class, not to the module, and certainly not to the program in general.

    class Wombat is runtime_extensible { ... };

So everything is closed, except the specific classes which are not. Even when you are (to use an example from my own code) making runtime subclasses on-the-fly, you're almost always starting from some common base class. (And 'almost' is probably an unneeded qualifier, there. As is 'probably'.)

As far as users of your class being able to specify that they want something runtime-extensible, when your original module didn't call for it, I don't see that as a problem, if they can just add the trait to your class shortly after they C<use> the package containing it, if such things are possible -- or, for that matter, simply subclass your original into a runtime_extensible class:

    class Wombat { ... };                     # Not runtime extensible
    class MyWombat is Wombat
        is runtime_extensible { ... };        # Runtime extensible

Now, it might be that declaring MyWombat to be runtime_extensible actually silently disables some compile-time optimizations not only for it, but for all its superclasses/roles/etc., depending on how intelligent and far-reaching those optimizations may be. Not sure. Still, the fact that you are _requesting_ that happen is specific to the particular class that needs it -- and should be associated with that class, such that if that class later falls into disuse, the optimizations silently reappear.

(At first glance, I am less sure of the need to have similar functionality for entire modules, as opposed to classes, but perhaps someone out there can come up with an example.)

MikeL
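[The "runtime subclasses on-the-fly" case above is something Perl 5 can already do today, and it is exactly the kind of alteration an optimizer must assume is possible unless a class is declared closed. A minimal Perl 5 sketch -- the Wombat/HairyWombat classes and their methods are invented for illustration:]

```perl
use strict;
use warnings;

package Wombat;
sub new   { my $class = shift; return bless {}, $class }
sub speak { return "generic wombat noise" }

package main;

# Build a subclass of Wombat at runtime by poking the symbol table.
{
    no strict 'refs';
    @{'HairyWombat::ISA'}    = ('Wombat');
    *{'HairyWombat::speak'} = sub { return "hairy wombat noise" };
}

my $w = HairyWombat->new;           # inherits new() from Wombat
print $w->speak, "\n";              # hairy wombat noise
```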
Re: Vocabulary
On Sunday, December 14, 2003, at 06:14 PM, Larry Wall wrote:

> But the agreement could be implied by silence. If, by the time the
> entire program is parsed, nobody has said they want to extend an
> interface, then the interface can be considered closed. In other
> words, if you think you *might* want to extend an interface at run
> time, you'd better say so at compile time somehow. I think that's
> about as far as we can push it in the "final" direction.

That seems a very fair rule, especially if it adds a smidge more speed. Runtime extension will likely be very unusual -- requiring it to be explicit seems reasonable.

> I'm probably spouting nonsense. I just hope it's good-sounding
> nonsense...

It's beyond good-sounding, it's frickin' awesome.

MikeL
Re: enums and bitenums
On Thursday, December 11, 2003, at 10:04 AM, Larry Wall wrote:

> Explicitly:
>
>     $bar.does(Color)    # does $bar know how to be a Color?
>     $bar.as(Color)      # always cast to Color
>
> Implicitly boolean:
>
>     $bar ~~ Color       # $bar.does(Color)
>     ?$bar.Color         # $bar.does(Color)
>     if $bar.Color       # if $bar.does(Color)
>
> Implicitly non-boolean:
>
>     +$bar.Color         # +$bar.as(Color)
>     ~$bar.Color         # ~$bar.as(Color)
>     $($bar.Color)       # $($bar.as(Color))
>     @($bar.Color)       # @($bar.as(Color))

So C<as> would be for casting, not coercion, right?

Suppose you have a class Foo, such that:

    class Foo does (Bar, Baz) { ... }

... or however that looks. May I then presume that

    $foo.Bar.zap        # ($foo.as(Bar)).zap

calls the method C<zap> of role C<Bar>, with $foo as the invocant?

MikeL
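[Perl 5 already has a close analogue of "call a particular class's method with this invocant": the fully-qualified method call. A sketch, with invented Foo/Bar/Baz packages standing in for the roles above:]

```perl
use strict;
use warnings;

package Bar;
sub zap { return "Bar::zap" }

package Baz;
sub zap { return "Baz::zap" }

package Foo;
our @ISA = ('Bar', 'Baz');    # rough stand-in for "does (Bar, Baz)"
sub new { return bless {}, shift }

package main;

my $foo = Foo->new;
print $foo->zap, "\n";          # Bar::zap -- normal dispatch order
print $foo->Baz::zap, "\n";     # Baz::zap -- explicitly pick Baz's method,
                                #             $foo still the invocant
```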
Re: >>OP<< [was: Re: Properties]
On Tuesday, December 2, 2003, at 12:37 PM, Luke Palmer wrote:

> Michael Lazzaro writes:
> > There were also vaguely threatening proposals to have <<op>> and
> > >>op<< do slightly different things. I assume that is also dead,
> > and that <<op>> is (typically) a syntax error.
>
> Ack. No, slightly different things would be a very bad idea. At the
> moment, as most of you probably know, they do *very* different
> things. >>op<< vectorizes the operator, and <<some stuff>> is
> equivalent to qw{some stuff}.

Sorry, right. I therefore deduce that the proposal to have, for example, <<+>> mean a different flavor of vectorization than >>+<<, e.g. for these to do different things:

    @a >>+<< @b;
    @a <<+>> @b;

is quite soundly and completely dead. Excellent. Let us not speak of it again.

MikeL
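[For readers following along in Perl 5, the effect of the surviving >>op<< vectorization can be sketched with a plain map over indices:]

```perl
use strict;
use warnings;

my @a = (1, 2, 3);
my @b = (10, 20, 30);

# Rough Perl 5 equivalent of the Perl 6 expression:  @a >>+<< @b
my @sum = map { $a[$_] + $b[$_] } 0 .. $#a;

print "@sum\n";    # 11 22 33
```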
>>OP<< [was: Re: Properties]
On Monday, December 1, 2003, at 01:05 PM, Hodges, Paul wrote:

> Didn't know "is" would do that. Good to know! And in my meager
> defense, I did reference MikeL's operator synopsis as of 3/25/03,
> which said ^[op] might be a synonym for <<op>> or >>op<< (Sorry, no
> fancy chars here. :)

Hey, that was *March*! ;-) The fossil records from that time are fragmentary, at best.

I don't think I ever saw any further reference to the ^[op] syntax staying alive; I assume that means it's dead. Last I heard, which was admittedly around the same time frame, we'd have the non-Unicode-using >>op<<, and a Unicode synonym »op«, and that's it.

There were also vaguely threatening proposals to have <<op>> and >>op<< do slightly different things. I assume that is also dead, and that <<op>> is (typically) a syntax error.

If anyone in the know knows otherwise, plz verify for Piers' summary and the future fossil record.

MikeL
Re: 'Core' Language Philosophy [was: Re: 'catch' statement modifier]
On Wednesday, November 26, 2003, at 12:29 PM, Larry Wall wrote:

> If you contrast it with an explicit try block, sure, it looks
> better. But that's not what I compare it with. I compare it with
> Perl 5's:
>
>     $opus.write_to_file($file) or die "Couldn't write to $file: $!";
>
> That has some known problems with false positives, er, negatives,
> which Perl 6 addresses with things like:
>
>     $opus.write_to_file($file) err fail "Couldn't write to $file: $!";
>
> But if we're going to talk about philosophy, we should talk about
> Perl's notion of not forcing people to escalate all the way to
> exceptions when a lesser form of undefinedness or falseness will do.
> Perl 6 doesn't back off from this. In fact, it takes it further by
> allowing you to return unthrown exceptions as "undef". And by
> providing a "fail" that either returns or throws the exception
> depending on the preferences of the caller.

Well, yes, hmm, har, but...

Speaking only for myself, my own (database-heavy) code already makes pretty extensive use of the differences between "false", "unknown/undefined/NULL", and "worthy of exception" -- all three of those conditions may exist at various times, and no two of them can reasonably be lumped together as being logically identical. There are plenty of cases where a piece of data being "undefined" or "NULL" means something different from it being zero, for example -- and neither case represents an actual error condition. Just undefinedness. So, for those cases, I'm forced into using full-fledged exception handling.

So as an abstract example, I would consider these to be entirely different, but each of them to be useful:

    foo(...) or    blah(...);
    foo(...) err   blah(...);
    foo(...) catch blah(...);

I read the first one as executing blah() if the result of foo() is a false value; the second as executing blah() if the result is something with an undefined or NULL value; the third as executing blah() if an otherwise fatal condition arises during the execution of foo(). All three can be considered constructs useful for quick-n-dirty recovery from the corresponding -- but very different -- exceptional conditions.

> If you want to promote a catch modifier to me, you'd better market
> it as a form of "or" rather than a form of "try". I could see a form
> of "err" that implies a try around its left argument and coerces any
> caught exception to an undef exception.

Yes, precisely that.

> But that sounds like a philosophical decision to be made
> consistently by the calling module, like "use fatal" in reverse. So
> I think I'd rather see a pragma that does that to "err" instead of
> adding yet another keyword.

I would explicitly not want that, personally, for the above reasons; there are many circumstances in which I'd rather use "undefined" to mean "undefined", rather than "exceptional/error condition". Again, tho, TMTOWTDI. It's hardly a crisis if it doesn't exist. It just seems like an obvious simplification of try/CATCH when only one statement is being wrapped.

MikeL
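[For comparison, the three-way distinction above maps fairly directly onto later Perl 5 (5.10+, for // and say); a sketch in which the foo-like calls are inlined as anonymous subs so the block is self-contained:]

```perl
use strict;
use warnings;
use feature 'say';

sub blah { return 'recovered' }

# false: 0, '', and undef all trigger the right-hand side
my $r1 = sub { return 0 }->() || blah();       # like: foo(...) or  blah(...)

# undefined: only undef triggers the right-hand side
my $r2 = sub { return 0 }->() // blah();       # like: foo(...) err blah(...)
                                               # $r2 stays 0: defined but false

# exception: a die() inside eval yields undef, triggering the right-hand side
my $r3 = eval { die "boom\n" } // blah();      # like: foo(...) catch blah(...)

say "$r1 $r2 $r3";    # recovered 0 recovered
```

Note that the last line also fires blah() if the eval'd code merely returns undef -- which is precisely the "coerces any caught exception to an undef exception" conflation Larry describes.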
Re: 'Core' Language Philosophy [was: Re: 'catch' statement modifier]
On Wednesday, November 26, 2003, at 01:50 PM, Michael Lazzaro wrote:

> my $c = (defined($a) ? $a : $b);

Sorry, P5. Before the grammar police attack...

    my $c = (defined($a) ?? $a :: $b);

Parens for clarity.

MikeL
Re: 'Core' Language Philosophy [was: Re: 'catch' statement modifier]
On Wednesday, November 26, 2003, at 12:13 PM, chromatic wrote:

> On Wed, 2003-11-26 at 11:39, Michael Lazzaro wrote:
> > I think we also need to be skeptical of the false economy of
> > putting such sugar into CP6AN, if a sizable portion of the
> > community is going to download it anyway.
>
> A more interesting question is *when* to put something into the core
> language or libraries. Consider Perl 5, where File::Find is a core
> module. While the interface may have been nice in 1995 (though I
> doubt even that), it's been widely regarded as awful for at least
> three years. It's likely never to be removed from the core.
> File::Find::Rule is immensely nicer.

Agreed, but I hope I made it clear I was talking about a different level of beast -- a bit of pure syntactic/semantic sugar that rests solely upon other core things, not a bit of extended functionality. File::Find is an excellent example of something that wouldn't belong in core, because it does not represent the Only Good Way To Do It. You can think of plenty of valid interfaces to something as complex as a File::Find-like module, and each would have vigorous supporters.

I'm talking about things on the level of, for example, C<unless>. I can say:

    foo() if not $a;

or

    foo() unless $a;

The presence of C<unless> in the language, functionality-wise, is utterly meaningless; it adds nothing aside from a very slight but useful linguistic nuance. I wager most of us regularly use both C<if> and C<unless> now, interspersed liberally, depending on what precisely we are trying to convey. We could be trained to always say C<if not>, however, and eliminating it from P6 would save a keyword. But it would be a hollow savings; nobody would realistically then use a func/method/op/whatever called 'unless' in their code -- and if they did use it, it would almost certainly be to produce a behavior identical to the existing functionality.

Similarly, the much-needed new C<//> operator:

    my $c = $a // $b;

or

    my $c = (defined($a) ? $a : $b);

Again, a functionally meaningless bit of fluff which exists solely to provide a simpler visual reading of a ridiculously common construct. It could be eliminated easily; to do so would be an overall loss. Ditto ==>, or even C/C.

When I use the term 'sugar', it is things of this level of primitiveness that I mean to convey.

MikeL
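[The defined-or operator did eventually land in Perl 5 itself, as // in 5.10; a minimal sketch of the equivalence described above:]

```perl
use strict;
use warnings;

my $a;         # undef
my $b = 5;

my $c1 = $a // $b;                  # the sugar
my $c2 = defined($a) ? $a : $b;     # the spelled-out version

print "$c1 $c2\n";    # 5 5

# Unlike ||, the // operator keeps defined-but-false values:
my $zero = 0;
print $zero // $b, "\n";    # 0
print $zero || $b, "\n";    # 5
```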
'Core' Language Philosophy [was: Re: 'catch' statement modifier]
On Monday, November 24, 2003, at 03:28 PM, Luke Palmer wrote:

> Damian Conway writes:
> > In which case I think we just fall back to:
> >
> >     try{$opus.write_to_file($file); CATCH {die "Couldn't write to $file: $!"}}
> >
> > which is, after all, only 5 characters longer than:
> >
> >     $opus.write_to_file($file) catch die "Couldn't write to $file: $!\n";
> >
> > ;-)
>
> Fair enough :-) No, I wasn't implying that C<write_to_file> could
> validly return undef. I just failed to realize how short the "long"
> version was. But you have to admit, the latter version makes the
> important part of the expression stand out, and is more natural to
> write (IMHO, as always). But it's moduleable, so I won't worry about
> it.

A small point of order, philosophically...

While there is a justifiable urge to deny entry into the core language of as much syntactic sugar as possible -- since merely looking at the operator list proves P6 will be a behemoth of a language, when you consider all of its nooks and crannies -- I think we also need to be skeptical of the false economy of putting such sugar into CP6AN, if a sizable portion of the community is going to download it anyway.

In my mind, Luke's proposed C<catch> modifier quite easily fits the qualifications of something that should be in core, for the following reasons:

- It serves a common purpose, for which there is Only One (Good) Way To Do It. While you perhaps _can_ say

      try { $opus.write_to_file($file); CATCH { die "Couldn't write to $file: $!" } }

  in other golfish ways, the above is clearly (ignoring preferences in whitespace formatting) the Obvious and Recommended way to do it, and so the above phrase will appear _wherever_ a single statement needs to be wrapped with a try/CATCH.

- It has one, and only one, obvious meaning. Nobody should be making their own 'catch' modifiers that do different things -- that would be far more annoying, for shared code, than reserving the keyword in advance to do the One Obvious Thing.

- It is consistent with the philosophy of similar syntax.

- It poses no significant harm to novice users. They can program in P6 effectively without ever using it, but if they do see it in someone else's code, it will be reasonably obvious what it is doing. And whether it is core or commonly used CP6AN, they _will_ see it in other people's code.

It is true, the difference between the two constructs:

    try { $opus.write_to_file($file); CATCH {die "Couldn't write to $file: $!"} }

    $opus.write_to_file($file) catch die "Couldn't write to $file: $!";

is only 7 characters. But four of those characters are curly brackets, visually introducing two blocks. That's kind of a yuck.

MikeL
Re: Control flow variables
On Tuesday, November 18, 2003, at 12:15 PM, Luke Palmer wrote:

> Oh, and if you really want to do that return thing without using a
> C<do>, you can just:
>
>     sub blah {
>         return $a || goto CONT;
>         CONT:
>         ...
>     }
>
> I don't see what's wrong with that. :-p

Umm... refresh my/our memory. Did we end up having a post- form of C<given>, such that:

    return $_ if given $big.long.calculation.{ with }{ some }{ stuff };

does what I might suppose it does, or does it have to be... longer?

(The point of that old thread was to try and find the smallest possible way to write annoyingly common constructs like:

    method foo ($self: $a,$b,$c) {
        return $self.cached.{ $a }{ $b }{ $c }    # short-circuit calculation,
            if $self.cached.{ $a }{ $b }{ $c };   # if possible
        ... otherwise do actual stuff ...
    }

but I don't recall the official recommended solution.)

Forgive the craptacularness of my current P6 knowledge; I have been preoccupied with life-involving stuff, with items and things and doohickies.

MikeL
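[In Perl 5.10+ terms, the cache-then-compute idiom that message is golfing collapses nicely with //= ; a sketch in which $self->{cached} and expensive_calc() are invented stand-ins:]

```perl
use strict;
use warnings;

sub expensive_calc {
    my ($a, $b, $c) = @_;
    return $a + $b + $c;    # stand-in for real work
}

sub foo {
    my ($self, $a, $b, $c) = @_;
    # Memoize and return in one go, without spelling the
    # three-level hash lookup twice:
    return $self->{cached}{$a}{$b}{$c} //= expensive_calc($a, $b, $c);
}

my $self = {};
print foo($self, 1, 2, 3), "\n";    # 6 (computed)
print foo($self, 1, 2, 3), "\n";    # 6 (cached)
```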
Re: Control flow variables
On Wednesday, November 19, 2003, at 12:28 PM, Smylers wrote:

> Larry Wall writes:
> : Michael Lazzaro wrote:
> : > return if $a { $a }
> :
> : No, it's a syntax error. You must write

Excellent! I too was quietly hoping someone would say that. These hurt my admittedly ever-shrinking brain:

    return if $a;          # our old friend
    return if $a { $a }    # ow! me noggin!  Always returns, or not?

The C<do> encapsulation helps clarify enormously:

    return do { if $a { $a } }

... well, OK, maybe not enormously... I'd still be annoyed if anyone actually wrote that with a straight face, but there's nothing offensive about it being legal.

MikeL
Re: Control flow variables
> > Would that then imply that
> >
> >     sub blah {
> >         ...              # 1
> >         return if $a;    # 2
> >         ...              # 3
> >     }
> >
> > ...would return $a if $a was true, and fall through to (3) if it
> > was false?
>
> It sure should, provided there were a correct context waiting, which
> would quite nicely address another WIBNI thread a couple of months
> back about a quick return under those conditions.

> I don't think so. I say that all the time to mean precisely:
>
>     if $a { return }
>
> And I don't think people are ready to give that up. In particular,
> if we kept our bottom-up parser around, this particular construct
> would cause an infinite-lookahead problem. So for ambiguity's sake,
> C<if> should not be a valid term without a block following.

So, just to make sure, these two lines are both valid, but do completely different things:

    return if $a;
    return if $a { $a }

MikeL
Re: Control flow variables
On Tuesday, November 18, 2003, at 06:38 AM, Simon Cozens wrote:

> Given that we've introduced the concept of "if" having a return
> status:
>
>     my $result = if ($a) { $a } else { $b };

Would that then imply that

    sub blah {
        ...              # 1
        return if $a;    # 2
        ...              # 3
    }

...would return $a if $a was true, and fall through to (3) if it was false?

> [EMAIL PROTECTED] (Dan Sugalski) writes:
> > Luke Palmer:
> > > That's illegal anyway. Can't chain statement modifiers :-)
> >
> > Will be able to.

I was under the strong impression that Larry had decided that syntactic ambiguities prevented this from happening. (Now, of course, you will ask me for a cite to the thread, which I can't even begin to find at this point...)

MikeL
Re: Apocalypses and Exegesis...
On Thursday, August 14, 2003, at 07:00 AM, Alberto Manuel Brandão Simões wrote:

> On Thu, 2003-08-14 at 14:49, Simon Cozens wrote:
> > Just a hint: don't try writing it and revising it as the language
> > changes. I wrote a Perl 6 chapter for a book in December and it is
> > now almost unusable due to the pace of change.
>
> Yes. That's why I'm asking :-) I can start looking to apocalypses
> and exegeses to have an idea of the structure and content, but not
> really write them. I would need a running prototype, too. And that's
> more difficult to find :)

I add a hearty "Amen" to Simon's advice. In my own opinion -- which is worth approximately what you paid for it :-) -- things are probably pretty slushy until A12/E12 "Objects" comes out. AFAIK, that's due to be the next official A/E. I expect that _after_ that one, things will solidify rather rapidly; but be wary of doing too much before that, IMHO.

The A12/E12 problem is that the core concepts and syntax related to objects and types have wide repercussions on the syntax of everything else -- control structures, subroutines, operators, etc., etc. Since *everything* can be described as being an operation upon a set of objects/types (and, after all, even subs/ops are themselves "objects", in the larger sense), until those object/type concepts are nailed down *quite* firmly, I would be a little wary of counting on the stability of anything else. We've seen several examples already of things changing -- for the better! -- long after the A&Es for them have come out. I confidently prophesy at least one more big, scary round of that.

My personal advice is to wait until E12 comes out and is polished, and then go whole-hog. I wouldn't expect any major changes to happen after that, because the rest of the A&Es are less far-flung in scope.

MikeL
Re: printf-like formatting in interpolated strings
On Monday, June 16, 2003, at 11:49 AM, Austin Hastings wrote:

> --- Michael Lazzaro <[EMAIL PROTECTED]> wrote:
> > Or, if we have "output rules" just like we have "input rules",
> > could something quite complex be expressed simply as:
> >
> >     "You have <$x as MoneyFormat>"
> >
> > having previously defined your MoneyFormat "formatting rule" in
> > some other location?
>
> "You have ", no?

Yeah. Though I'd actually hope both forms were acceptable, personally. I really like the visual karma of the first, representing a "type or format conversion", more than the second, representing the "creation of a formatted object" -- though in practice the two notions are of course identical. :-)

MikeL
Re: Type Conversion Matrix, Pragmas (TAKE 4)
On Monday, June 16, 2003, at 11:04 AM, David Storrs wrote:

> On Mon, Jun 16, 2003 at 10:15:57AM -0700, Michael Lazzaro wrote:
> > (I've been operating under the assumption that an "untyped scalar"
> > doesn't _remove_ the type of something, it just can store values
> > of _any_ type, and is by default much more generous about
> > autoconverting them for you, so that you could use $c as an Int,
> > int, Num, num, Str, str, etc., without warning or error... but
> > internally, it's actually still storing the value as an C<Int>,
> > because that's what you assigned to it.)
>
> Seems reasonable. Although I would assume that it would store and
> pull the value from an Int slot, then create a new value of the
> "converted to" type, and use that.

Yeah, I would think so. A better example of what I was driving at would be this:

    my $a = 'foo' but purple;
    my Int $b = $a;

In other words: I've just created an untyped Scalar. This Scalar is (presumably in its Str slot) storing a string value which happens to have a property set on it. I am almost positive that the assignment would perform a copy/clone of the original value, but would preserve the properties while doing so. So if we try to assign C<'foo' but purple> to the Int $b, it:

- clones a new Scalar C<'foo' but purple>, but...
- identifies the copied C<'foo' but purple> as being a Scalar, not an Int, so...
- converts the copied value to an Int, resulting in C<0 but purple>, and...
- assigns C<0 but purple> to $b.

... or something like that. But I would expect that the property _would_ be preserved... my gut feeling is that otherwise, they'd be _way_ too easy to accidentally lose, yes?

> OT afterthought: In the past, whenever we've gotten embroiled in one
> of these thorny, knotty issues, @Larry has pulled a stunningly
> beautiful, elegant rabbit out of their hats. And when I thought
> that, I had this vision of a single quantum rabbit simultaneously
> coming out of multiple hats with widely divergent spatial
> coordinates

Yeah. What I wouldn't give for a quantum bunny, right about now. It's not even that type conversion is a particularly difficult issue, it's just so *very* all-encompassing that it's got to be done precisely, because it chains through everything else about the language... what happens when calling subroutines, what happens in the multimethod dispatcher, what simple lines of code do or don't give big honkin' errors, etc...

MikeL
Re: printf-like formatting in interpolated strings
On Monday, June 16, 2003, at 10:39 AM, Edwin Steiner wrote:

> I'm content if this will be revisited (hopefully by someone with
> better overview than mine). It just should not be ignored.

Oh, it definitely won't be ignored. :-) It's come up several times before -- try searching for "stringification", IIRC -- and has always sort of fizzled because the higher-ups were never quite ready for it yet. And there are some primitive type and type conversion questions that are still unclear -- until those are fleshed out, the stringification proposals have been a bit "stuck".

But there is broad support for the idea that the somewhat elderly printf syntax is a PITA, and that printf, in general, should be completely unnecessary since we already *have* interpolated strings, fer pete's sake.

If you really want to make your brain hurt, consider this: stringification can be thought of, obliquely, as the "inverse" of regexes. One puts strings together, the other takes them apart. And Perl6 introduces shiny, clean-looking rule syntax:

    /here is a <rule>/

Oooh, pretty. So if I were in an evil mood, which I almost always am, I'd ask: what's the inverse of a rule? Is it possible that interpolated strings could benefit from the same angle-bracket syntax? __Is it possible that there are "output rules" just like there are "input rules"?__

So what would "The value of x is <$x>" mean, from the interpolation end of things? _Could_ it mean something? Is it possible that it is in fact a cleaner, more elegant syntax than:

    "The value of x is $(expr but formatted(...))"

Or, if we have "output rules" just like we have "input rules", could something quite complex be expressed simply as:

    "You have <$x as MoneyFormat>"

having previously defined your MoneyFormat "formatting rule" in some other location?

MikeL
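[Absent output rules, the effect being asked for is what sprintf does today; a Perl 5 sketch in which money_format is an invented stand-in for the hypothetical MoneyFormat rule:]

```perl
use strict;
use warnings;

# A hypothetical "MoneyFormat" output rule, written as an ordinary sub:
sub money_format {
    my ($amount) = @_;
    return sprintf '$%.2f', $amount;
}

my $x = 1234.5;

# Roughly what "You have <$x as MoneyFormat>" might desugar to:
print 'You have ', money_format($x), "\n";    # You have $1234.50
```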
Re: Type Conversion Matrix, Pragmas (TAKE 4)
On Friday, June 13, 2003, at 10:26 PM, David Storrs wrote:

> On the subject of untyped scalars... what does it mean to say that
> the conversion is 'lossless'? For example:

I've been using the word to mean that a conversion is "lossless" if, for a particular A-->B conversion, you can recreate the typed value A *completely* from B, including value, definedness, and properties. So if you can say A-->B-->A, and always get _exactly_ the same thing in A that you started with, for _any_ valid starting value of A, it's "lossless". Which is darn rare, looking at the matrix, because of range issues, etc.

>     my $a = 'foo';
>     my Int $b = $a;    # legal; $b is now 0; is there a warning?
>     my $c = $b;        # is $c 0, or 'foo'?

0, I think. (I've been operating under the assumption that an "untyped scalar" doesn't _remove_ the type of something; it just can store values of _any_ type, and is by default much more generous about autoconverting them for you, so that you could use $c as an Int, int, Num, num, Str, str, etc., without warning or error... but internally, it's actually still storing the value as an C<Int>, because that's what you assigned to it.)

>     my Str $d = $a;    # no loss
>     my $a = $d;        # no effective change in $a
>     my $e = $b;        # what is $d? $e?

Still a Str, I would think. And $e would be Int 0, same as $c.

> In the above, I would expect that $c is 0, and:
>
>     my $e = 7 but false;
>     my Str $f = $e;    # ??? What value does $f end up with?
>
> (My vote would be '7'.)

My understanding is that properties go with the values (just like traits go with the variables), so I would expect $f to be C<7 but false>. So if a value is C<false>, it stays C<false> until you say otherwise.

> Are any warnings emitted?

Yeah, I dunno. I think we need somebody smart to tell us at this point. I have no idea how close or how far we are on our musings about pragmas and defaults... I sure hope somebody does. :-)

MikeL
Re: Type Conversion Matrix, Pragmas (TAKE 4)
On Wednesday, June 11, 2003, at 05:48 AM, Tim Bunce wrote: (vi) Conversions of User Defined Types/Classes It may be useful to allow the same level of pragma-based control for user-defined types and classes. For example, a given class Foo may wish to be "silently" convertible to an C<int>. One proposed syntax to declare the method of coercion/conversion might be:

    class Foo {
        ...
        to int {...}   # or C<int()>?
    }

However, users of such a class could adjust the warning level of the given conversion using the alternate syntax given above (v):

    use strict conversions warn { Foo => int };

For my int $int = $foo; isn't the int() method (vtable entry) called on $foo? (effectively) So the $foo object is asked to provide an int for assignment to $int. So the Foo class gets to decide how to do that.

Yep. The general question that has come up before is to what extent the user of a given class or module should be able to influence the "strictness" of the interface of that class/module -- without altering the module, obviously. The general feeling was that people wanted to be able to do it, because they didn't want to be bound to a particular CP6AN author's decisions on interface strictness. The specific issue discussed previously, IIRC, was something like this:

    class Foo {
        method bar(str $s);
    }

If you said:

    my int $i = 5;
    $foo->bar($i);

what should happen? Should bar() convert $i to a str, or give a compiletime warning or error because it's not a str? And is that determined by the strictness level of the Foo class, or the strictness level of the calling code, or --shudder-- possibly both?

Please correct me if I'm wrong (which I could easily be as I've not been following any of this closely). I think this is an important topic because it may reflect back on how the specific Int/Num/Str classes should also be handled. Tim [quite possibly talking nonsense]

Talking perfect sense. It's a nasty issue. MikeL
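The core idea here -- the class itself supplies the coercion, and the conversion goes through a well-known method (the int() vtable entry) -- corresponds to Python's __int__ hook. A minimal sketch for illustration only:

```python
class Foo:
    """A class that decides for itself how to become an int."""
    def __init__(self, cents):
        self.cents = cents

    def __int__(self):
        # the class's own coercion method, analogous to the int() vtable entry
        return self.cents

foo = Foo(250)
i = int(foo)   # the conversion is delegated to Foo.__int__
assert i == 250
```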
Type Conversion Matrix, Pragmas (TAKE 4)
Seeing as how lots of folks are on the road, and you can hear the on-list crickets chirping, I'm not sure if anything can be accomplished, but I'll repost this as one of those perennial things-which-really-need-to-be-decided-because-lots-of-stuff-is-dependent-on-it-in-terms-of-basic-A2-examples-and-best-practices. Namely: how do the most basic Perl6 types interact with each other, which types can and can't be converted automatically, and what is the philosophy behind this? Is it supposed to be silent and DWIMmy? Is it supposed to be pedantic enough to prohibit "lossy" conversions? Is there more than one level of strictness? Who controls it -- the caller, or the callee?

(i) Type Matrix

The following matrix depicts the basic P6 scalar types, and the "kinds" of conversions that may/must take place between them. The key is as follows:

    +: automatic conversion; conversion is not lossy
    *: undefness and properties will be lost
    N: numeric range or precision may be lost (esp. bigints, bignums)
    F: numeric (float) conversion -- conversion to int is lossy
    S: string conversion -- if string is not *entirely* numeric, is lossy
    B: boolean conversion -- loses all but true/false
    J: junction type; coercing to non-junction type may be lossy

    FROM ->  str  Str  int  Int  num  Num  bit  Bit bool Bool Scalar
    TO:
    str       -   *+    +   *+    +   *+    +   *+    +   *+    *J
    Str       +    -    +    +    +    +    +    +    +    +     J
    int       S   *S    -   *N    F  *NF    +   *+    +   *+    *J
    Int       S    S    +    -    F    F    +    +    +    +     J
    num       S   *S    +   *+    -   *N    +   *+    +   *+    *J
    Num       S    S    +    +    +    -    +    +    +    +     J
    bit       B   *B    B   *B    B   *B    -   *+    +   *+    *J
    Bit       B    B    B    B    B    B    +    -    +    +     J
    bool      B   *B    B   *B    B   *B    +   *+    -   *+    *J
    Bool      B    B    B    B    B    B    +    +    +    -     J
    Scalar    +    +    +    +    +    +    +    +    +    +     -

(ii) Initial Assumptions

I previously proposed simplifying the matrix using the following Big Assumptions. This was not universally agreed upon, however: it may be that these assumptions are controlled by pragma (see next section). I include them here separately for reference.

*: (undefness and properties lost) Using/converting an uppercase type as/to a lowercase (primitive) type is silently allowed.
If you're sending an Int to something that requires an C<int>, you know that the 'something' can't deal with the undef case anyway -- it doesn't differentiate between undef and zero. Thus, you meant to do that: it's an "intentionally destructive" narrowing, and the C<undef> becomes a C<0>.

    my Int $a = undef;
    my int $b = $a;   # $b is now C<0>, NOT C<undef>

B: (conversion to boolean) Converting to/from a bit or bool value is silently allowed. The Perl5 rules for "truth" are preserved, such that:

    my bool $b = undef;   # $b is C<0>
    my bool $b = 0;       # $b is C<0>
    my bool $b = 1;       # $b is C<1>
    my bool $b = -5;      # $b is C<1>
    my bool $b = 'foo';   # $b is C<1>

Converting a C<bit> or C<bool> to any other type always results in C<0> or C<1> for numeric conversions, or C<'0'> or C<'1'> for string conversions.

J: (scalar junctive to typed scalar) A scalar junctive, e.g. an "untyped" scalar, can always be silently used as and/or converted to a more specific primitive type. This will quite frequently result in the loss of information; for example, saying:

    my $a = 'foo';
    my int $b = $a;   # $b is now C<0>

works, but silently sets $b to 0, because the numeric value of C<'foo'> is C<0 but true>. This means that using untyped scalars gets you back to Perl5 behavior of 'silently' accepting pretty much any conversion you can think of.

    my str $a = 'foo';
    my $b = $a;
    my int $c = $a;   # COMPILE TIME ERROR
    my int $c = $b;   # OK

If you are using typed variables to enforce strict conversions, you probably want to be warned if you are using any untyped variables, anywhere. Something like:

    use strict types;

(iii) Simplified Matrix

The above assumptions result in a simplified conversion matrix, as follows:

    FROM ->  str  Str  int  Int  num  Num  bit  Bit bool Bool Scalar
    TO:
    str       -    +    +    +    +    +    +    +    +    +    +
    Str       +    -    +    +    +    +    +    +    +    +    +
    int       S    S    -    N    F   NF    +    +    +    +    +
    Int       S    S    +    -    F    F    +    +    +    +    +
    num       S    S    +    +    -    N    +    +    +    +    +
    Num       S    S    +    +    +    -    +    +    +    +    +
    bit       +    +    +    +    +    +    -    +    +    +    +
    Bit       +    +    +    +    +    +    +    -    +    +    +
    bool      +    +    +    +    +    +    +    +    -    +    +
    Bool      +    +    +    +    +    +    +    +    +    -    +
    Scalar    +    +    +    +    +    +    +    +    +    +    -
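The simplified matrix in (iii) is small enough to encode directly as a lookup table. A sketch in Python (purely an illustration of the table; the conversion function name is mine, not proposed Perl 6):

```python
# Rows are the TO type, columns the FROM type, per the simplified matrix (iii).
TYPES = ["str", "Str", "int", "Int", "num", "Num",
         "bit", "Bit", "bool", "Bool", "Scalar"]

MATRIX = {
    "str":    list("-++++++++++"),
    "Str":    list("+-+++++++++"),
    "int":    ["S", "S", "-", "N", "F", "NF", "+", "+", "+", "+", "+"],
    "Int":    ["S", "S", "+", "-", "F", "F",  "+", "+", "+", "+", "+"],
    "num":    ["S", "S", "+", "+", "-", "N",  "+", "+", "+", "+", "+"],
    "Num":    ["S", "S", "+", "+", "+", "-",  "+", "+", "+", "+", "+"],
    "bit":    list("++++++-++++"),
    "Bit":    list("+++++++-+++"),
    "bool":   list("++++++++-++"),
    "Bool":   list("+++++++++-+"),
    "Scalar": list("++++++++++-"),
}

def conversion(frm, to):
    """Look up what kind of conversion FROM -> TO entails."""
    return MATRIX[to][TYPES.index(frm)]

assert conversion("num", "int") == "F"    # float -> int drops the fraction
assert conversion("Num", "int") == "NF"   # bignum -> int: range AND fraction
assert conversion("int", "Int") == "+"    # widening is always safe
```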
Re: MMD [was Re: This week's summary]
On Monday, June 9, 2003, at 03:45 PM, Dave Whipp wrote: "Michael Lazzaro" <[EMAIL PROTECTED]> wrote multi bar (Foo $self, int $i : ); # semicolon optional I think you meant "colon optional". The semi-colon is, I think, a syntax error. You need the yada-yada-yada thing: "{...}". Sigh. Yes, thank you. This, not that: multi bar (Foo $self, int $i : ) {...} # colon optional It's been a bad day. :-/ MikeL
Re: MMD [was Re: This week's summary]
On Monday, June 9, 2003, at 09:19 AM, Mark A. Biggar wrote: On Mon, Jun 09, 2003 at 01:26:22PM +0100, Piers Cawley wrote:

    multi factorial (0) { 1 }
    multi factorial ($n) { $n * factorial($n - 1) }

That's a bad example, as it's really not MMD. It's a partially pre-memoized function instead.

It's MMD if you think of the number 0 as being a "subclass" of C<Int> or C<int>. In other words, you have an C<Int> class, and then a subclass of C<Int> that binds the value to always be zero. In a not-too-twisted fashion, you can think of any constant as being a "subclass" of its base type, overridden to store exactly one possible value. It's like instance-based (classless) inheritance, which we haven't discussed much, but which I hope we eventually get to, because it's bloody useful... Sigh...

Which brings up an issue. Is it really MMD if you're only dispatching on a single invocant? Most of the examples I've seen for MMD so far use only a single invocant and are really either regular dispatch or simple overloading instead. MMD only becomes really interesting if you have multiple invocants possibly with best-match signature matching involved.

I think it's a matter of semantics: a single-invocant routine is still a "multi", and still semantically MMD, because it uses the same internal dispatcher as an N-invocant one, and checks the same list of possible variants. So you can have:

    multi bar (Baz $b : ...);          # one invocant
    multi bar (Foo $f : ...);          # one invocant, but different!
    multi bar (Foo $f, Baz $b : ...);  # two invocants

All three of those are multimethod variants of a routine named C<bar>. The MMD mechanism has to determine which of those three variants to use, based on the invocant(s) -- of which there may be one, or several, for any given call to C<bar>. Even if there only happens to be one invocant, it's still the same dispatcher, sifting through the same possible variants.

The thing I still find confusing at this point is that, for example, you can't actually have C<method> multis! That is, you can't do this:

    class Foo {
        method bar (int $i);
        method bar (str $s);             # ERROR
        method bar (str $s1, str $s2);
    }

You'd have to do this:

    class Foo {
        multi bar (Foo $self, int $i : );   # semicolon optional
        multi bar (Foo $self, str $s : );
        multi bar (Foo $self, str $s1, str $s2 : );
    }

Which, internally, makes some sense -- they have to go to a more complicated dispatcher than normal methods -- but is semantically icky, IMO, and I hope/wish we could find a better way of expressing that. Perhaps E6 will help. MikeL
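The claim that one-invocant and N-invocant multis go through the very same dispatcher can be made concrete with a toy registry keyed on the argument types. A Python sketch (an illustration of the dispatch model only, not of any real Perl 6 implementation; the registry and dispatch names are mine):

```python
# One shared registry: (routine name, tuple of invocant types) -> variant.
registry = {}

def multi(*types):
    """Register a function as one variant of a multi routine."""
    def register(fn):
        registry[(fn.__name__, types)] = fn
        return fn
    return register

def dispatch(name, *args):
    """One dispatcher for 1-invocant and N-invocant calls alike."""
    fn = registry[(name, tuple(type(a) for a in args))]
    return fn(*args)

class Foo: pass
class Baz: pass

@multi(Baz)
def bar(b): return "one Baz invocant"

@multi(Foo)
def bar(f): return "one Foo invocant, but different!"

@multi(Foo, Baz)
def bar(f, b): return "two invocants"

assert dispatch("bar", Foo()) == "one Foo invocant, but different!"
assert dispatch("bar", Foo(), Baz()) == "two invocants"
```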
Re: MMD [was Re: This week's summary]
On Monday, June 9, 2003, at 07:13 AM, Adam Turoff wrote: On Mon, Jun 09, 2003 at 01:26:22PM +0100, Piers Cawley wrote: Assuming I'm not misunderstanding what Adam is after, this has come up before (I think I asked about value based dispatch a few months back) and I can't remember if the decision was that MMD didn't extend to dispatching based on value, or if that decision hasn't been taken yet. If it's not been taken, I still want to be able to do multi factorial (0) { 1 } multi factorial ($n) { $n * factorial($n - 1) } The most recent semi-official opinion given onlist, AFAIK, was from Damian on 3/13/03: On Thursday, March 13, 2003, at 06:15 PM, Damian Conway wrote: Piers Cawley wrote: Speaking of multis and constants, Greg McCarroll wondered on IRC if this would work: multi factorial (Int 0) { 1 } multi factorial (Int $n) { $n * factorial($n-1) } Probably not. We did discuss whether multimethods should be able to be overloaded by value, but concluded (for that week, at least ;-) that this might prove syntactically excessive. See the rest of his message for a marginally scary workaround. MikeL
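Value-based dispatch of the factorial(0) flavor can be emulated with guarded variants tried in order: literal-value guards first, a catch-all last. A Python sketch (the multi_factorial decorator is invented for illustration):

```python
# Each variant carries a guard predicate; dispatch tries them in order.
variants = []

def multi_factorial(guard):
    def register(fn):
        variants.append((guard, fn))
        return fn
    return register

def factorial(n):
    for guard, fn in variants:
        if guard(n):
            return fn(n)
    raise TypeError("no applicable variant")

@multi_factorial(lambda n: n == 0)      # multi factorial (0) { 1 }
def fact_base(n):
    return 1

@multi_factorial(lambda n: True)        # multi factorial ($n) { $n * ... }
def fact_general(n):
    return n * factorial(n - 1)

assert factorial(0) == 1
assert factorial(5) == 120
```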
Re: Threads and Progress Monitors
On Thursday, May 29, 2003, at 04:48 PM, Luke Palmer wrote: To nitpick:

    my $result is lazy::threaded := { slow_fn_imp @_ };

Pursuing this lazy-threaded variables notion, a question. Given:

    sub slow_func is threaded {   # me likey this auto-parallelizing syntax!
        ...
    }

Would we want to say that _both_ of these have the lazy-blocking behavior?

    my $result := slow_func();
    print $result;

    my $result = slow_func();
    print $result;

Or would the first one block at the C<print>, but the second block immediately at the C<=>? The obvious answer is that the := binding "passes through" the laziness, but the = assignment doesn't. But I wonder if that isn't a bit too obscure, to put it mildly. MikeL
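The := vs. = distinction mused about here resembles the difference between holding a future and forcing its value. A Python analogy using concurrent.futures (purely illustrative; slow_func is a stand-in):

```python
from concurrent.futures import ThreadPoolExecutor
import time

def slow_func():
    time.sleep(0.05)
    return 42

pool = ThreadPoolExecutor()

# The ':=' flavor: bind a promise; nothing blocks until the value is used.
result = pool.submit(slow_func)   # analogous to  my $result := slow_func()
# ... other work could run here, in parallel ...
print(result.result())            # this "print" is where we finally block

# The '=' flavor: force the value immediately, blocking right at the '='.
value = pool.submit(slow_func).result()
assert value == 42

pool.shutdown()
```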
Re: Threads and Progress Monitors
On Thursday, May 29, 2003, at 12:45 PM, Dave Whipp wrote: "Michael Lazzaro" <[EMAIL PROTECTED]> wrote in

    # But if you want to get the thread object, so you can monitor its
    # progress, call it this way:
    {
        ...
        my $tid = thread &slow_func_impl(...);
        while $tid.active {
            status_monitor($tid.progress);
            sleep 60;
        }
        return $tid.result;
    }

To my eye, that looks pretty darn slick.

You might be a bit frustrated if the &slow_func_impl took 61 seconds :-(. How do we interrupt the C<sleep>? Possibly in the same way as we'd timeout a blocking IO operation.

Personally, I'd be happy with just making the C<sleep> a smaller number, like one second, or a fifth of a second, or whatever. You want the status_monitor to be updated no more often than it needs to be, but often enough that it's not lagging. But if you really wanted wake-immediately-upon-end, I'd add that as a variant of C<sleep>. For example, you might want a variant that blocked until a given variable "changed", just like in debuggers; that would allow:

    {
        my $tid = thread &slow_func_impl(...);
        while $tid.active {
            status_monitor($tid.progress);
            sleep( 60, watch => \($tid.progress) );   # do you even need the '\'?
        }
        return $tid.result;
    }

... which would sleep 60 seconds, or until the .progress attribute changed, whichever came first. You could make more builtins for that, but I think I'd like them to just be C<sleep> variants. Obvious possibilities:

    sleep 60;                  # sleep 60 seconds
    sleep( block => $tid );    # sleep until given thread is complete
    sleep( watch => \$var );   # sleep until given var changes value
    sleep( 60, block => $tid,
           watch => [\$var1, \$var2, \$var3] );   # wake on any of five tests
    $tid.sleep(...);           # sleep the given thread, instead of this one

MikeL
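The sleep-until-something-changes variant can be emulated today with an event wait that takes a timeout: the waiter wakes as soon as the watched condition fires, or after the timeout, whichever comes first. A Python sketch (the "watched variable" is modeled as a threading.Event; names are mine):

```python
import threading
import time

progress_changed = threading.Event()

def worker():
    time.sleep(0.05)
    progress_changed.set()   # "the watched variable changed"

t = threading.Thread(target=worker)
t.start()

start = time.monotonic()
# sleep(60, watch => ...): wake early when the event fires, 60s at the most.
woke_early = progress_changed.wait(timeout=60)
t.join()

assert woke_early                        # the change woke us, not the timeout
assert time.monotonic() - start < 60     # nowhere near the 60s worst case
```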
Re: Threads and Progress Monitors
On Thursday, May 29, 2003, at 10:47 AM, Dave Whipp wrote: OK, we've beaten the producer/consumer thread/coro model to death. Here's a different use of threads: how simple can we make this in P6:

Hey, good example. Hmm... Well, for starters I think it wouldn't be a big deal to associate a "progress" attribute with each thread object. It should be that thread's responsibility to fill it out, if it wants to -- so you shouldn't ever have to pass \$percent_done as an argument, it should be a basic attribute of every thread instance. That might encourage people to add progress calculations to their threads after-the-fact, without changing the basic interface of what they wrote. I'll also claim that I would still prefer the auto-parallel, auto-lazy-blocking behavior on the thread results we've mused about previously. So coming from the semantics end, I'd love to see it written like this:

    # Declaring a threaded calculation
    sub slow_func_impl is threaded {
        while (...stuff...) {
            ... do stuff ...
            &_.thread.progress += 10.0;   # or however you want to guesstimate[*] this
        }
        return $result;
    }

    # If you don't care about getting the actual thread object, just the result,
    # call it this way:
    {
        ...
        my $result = slow_func_impl(...);
        ...
        return $result;
    }

    # But if you want to get the thread object, so you can monitor its progress,
    # call it this way:
    {
        ...
        my $tid = thread &slow_func_impl(...);
        while $tid.active {
            status_monitor($tid.progress);
            sleep 60;
        }
        return $tid.result;
    }

To my eye, that looks pretty darn slick. MikeL

[*] Huh. Imagine my surprise to find out that my spellcheck considers "guesstimate" to be a real word. And I always thought that was just a spasmostical pseudolexomangloid.
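A thread object carrying its own progress attribute, polled by a monitor loop like the one sketched above, is straightforward to mock up in Python (names like ProgressThread are mine; the sleeps are shortened so the example runs quickly):

```python
import threading
import time

class ProgressThread(threading.Thread):
    """A thread that owns 'progress' and 'result' attributes, so callers
    never have to pass a \\$percent_done-style argument in."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.progress = 0.0
        self.result = None

    def run(self):
        for _ in range(10):
            time.sleep(0.001)        # ... do stuff ...
            self.progress += 10.0    # the thread fills in its own progress
        self.result = "done"

tid = ProgressThread()
tid.start()
while tid.is_alive():                # while $tid.active { ... }
    time.sleep(0.005)                # status_monitor($tid.progress) would go here
tid.join()

assert tid.progress == 100.0
assert tid.result == "done"
```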
Re: Cothreads [Philosophy]
On Wednesday, May 28, 2003, at 02:56 PM, Austin Hastings wrote: (s/coroutine/thread/g for the same rough arguments, e.g. "why should the caller care if what they're doing invokes parallelization, so long as it does the right thing?") Global variables. Threads __never__ do the right thing.

Heh. That's for sure. I am enamored with the possibility of finding some sub-like syntax for threads where variables are shared *solely* based on their scope, because that is simply The Way It Should Work. If you're in a thread, and refer to a var outside of the threaded block, it's shared; if you refer to a lexical var within the thread, it's not shared. Much like your April example, or John M's shared-vs-private idea. So that if:

    sub process_event (Event $e) is threaded {   # (A) an always-parallelized subroutine
        my $z;
        ...
    }

    our $x;
    loop {
        our $y;
        my $current_event = get_event() or next;
        process_event($current_event);   # (B) creates a 'process_event' thread for each event
    }

$x and $y are shared between all active threads invoked by line (B), the threads can't see the lexical $current_event at all, and the lexical $e and $z are private to each individual C<process_event> thread. Bada-bing, bada-boom, can't get much more intuitive than that.

OTOH, "threads" have proven historically easiest to manage when little if any data is shared. OTOOH, threads that "share" everything but their private lexical data would be faster/easier to create & run, because they don't have to do mass copying of program state. OTOOOH, they'd still need automatically generated locking when they _were_ accessing those shared vars. OTOOOOH, there's nothing wrong with that -- that's what threads are supposed to do, and the vast majority of speed-oriented threads don't *refer* to much shared data, and big "event loop" threads do because, well, they have to. MikeL
Re: Cothreads [Philosophy]
On Wednesday, May 28, 2003, at 01:01 PM, Austin Hastings wrote: Exampling:

    sub traverse(Hash $tree) {
        return unless $tree;
        traverse $tree{left} if $tree{left};
        yield $tree{node};
        traverse $tree{right} if $tree{right};
    }

    my %hash is Tree;

    my &cotrav := coro &traverse(%hash);
    print $_ for <&cotrav>;

    my &thtrav := thread &traverse(%hash);
    print $_ for <&thtrav>;

Hmm. I think that having _anything_ on the caller side that has to change based on whether the called thing is a subroutine vs. a coroutine probably defeats one of the most central purposes of coroutines -- that nifty subroutine-like abstraction that makes it "just work". Consider, under Damian's latest model:

    for <foo()> {...}

It doesn't matter whether foo() is a closure or function returning a list, lazy list, or iterator, or is a coroutine returning its .next value. Which is excellent, and, I'd argue, the whole point; I'm not sure that we can have any coroutine syntax that _doesn't_ do that, can we? But, as Luke pointed out, some of the other syntax required to make that work just isn't particularly friendly:

    coro pre_traverse(%data) {
        yield %data{ value };
        yield $_ for <&_.clone(%data{ left })>;
        yield $_ for <&_.clone(%data{ right })>;
    }

If I work backwards, the syntax I'd _want_ for something like that would be much like Luke proposed:

    sub pre_traverse(%data) is coroutine {
        yield %data{ value };
        pre_traverse( %data{ left } );
        pre_traverse( %data{ right } );
    }

... where the internal pre_traverses are yielding the _original_ pre_traverse. Whoa, though, that doesn't really work, because you'd have to implicitly do the clone, which screws up the normal iterator case! And I don't immediately know how to have a syntax do the right thing in _both_ cases. So, if I have to choose between the two, I think I'd rather iteration be easy than recursion be easy.
If lines like

    yield $_ for <&_.clone(%data{ left })>;

are too scary, we might be able to make a keyword that does that, like:

    sub pre_traverse(%data) is coroutine {
        yield %data{ value };
        delegate pre_traverse( %data{ left } );
        delegate pre_traverse( %data{ right } );
    }

Maybe. But in truth, that seems no more intuitive than the first. (s/coroutine/thread/g for the same rough arguments, e.g. "why should the caller care if what they're doing invokes parallelization, so long as it does the right thing?") MikeL
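For what it's worth, the recursion-vs-clone problem agonized over in this thread is exactly what Python's generators later solved with `yield from`: the delegating form hands control to a fresh instance of the recursive generator, so the "clone" is implicit and the iterator case still works. A sketch (Python, not any proposed Perl 6 syntax):

```python
def pre_traverse(data):
    """Recursive pre-order traversal as a generator; 'yield from' plays the
    role of the hypothetical 'delegate' keyword above."""
    if data is None:
        return
    yield data["value"]
    yield from pre_traverse(data["left"])    # delegate to a fresh instance
    yield from pre_traverse(data["right"])

tree = {"value": 1,
        "left":  {"value": 2, "left": None, "right": None},
        "right": {"value": 3, "left": None, "right": None}}

assert list(pre_traverse(tree)) == [1, 2, 3]
```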
Re: Cothreads [Philosophy]
On Tuesday, May 27, 2003, at 07:32 PM, Jonathan Scott Duff wrote: On Tue, May 27, 2003 at 02:05:57PM -0700, Michael Lazzaro wrote: If we could think about "threads" not in terms of forkyness, but simply in terms of coroutines that can be called in parallel, it should be possible to create an implementation of "threading" that had to do a whole heck-of-a-lot less duplication of state, etc.

See, this is where I start to feel all Cozeny and wonder what the heck we're doing even thinking about how it's implemented. What I want to know is how it looks from the user perspective.

Sorry, yes, I'm not talking at all about implementation. I'm just talking about syntax/semantics of invoking them. Underneath, threads and coroutines may well be implemented by giant, man-eating woodchucks battling to the death on a rainy day under a popsicle-stick bridge, for all I care. :-)

If, in order to understand threads, I have to first understand coroutines, I think that's a loss because it throws away (or at least morphs into an unrecognizable form) all of our collective CS knowledge of what "threading" usually means. In other words, I think the idea of fork-like behaviour is important to threads.

[Philosophy] Here's the possible defect in my brain that started my train of thought. I hadn't used coroutines in a long while, so when the topic came up I had to do a little reviewing to make sure I was on the same page. Reviewing done, I started thinking about how coroutines were a different concept from continuations, which were different from threads, and we already had closures, and at what point was the aggregation of all those things going to cause potential Perl6 converts to look at our hypothetical table-o-contents and say "WTF -- exactly how much 'Perl' am I gonna have to learn to be considered a 'good' Perl programmer?"
Because _all_ those things are useful concepts, but in spite of the fact that they address very similar problem spaces -- allowing "program state" to be saved and resumed in less-than-strictly-linear ways -- they historically approach it in rather different ways. Whether you're sucking in a concept from a well-worn but sortof-crazy-aunt language like Lisp, or from a possibly-created-as-the-result-of-a-wager language like Icon, each concept largely reached its current fruition independent of the others. No currently-popular(?) language uses _all_ of them at the same time, or to be more precise, you don't see many programmers using _all_ those concepts simultaneously within a given application (and my gut tells me I'd not be keen to debug it if I saw it.) OK. So I get to thinking about coroutines. Coroutines are one of those things that Freak People Out, but once you get to use them, they're quite easy to understand, and damn useful. They're like blocking threads; you call them like a subroutine, they (eventually) return a result to you, and you go on your merry way. They are used _exactly_ as subroutines are used... the only difference is that, like a thread, you can _resume_ the subroutine from the point you last yield()ed, and continue on your way within the subroutine as if that yield() never happened. Again, like a suspended and resumed thread. True threads, on the other hand, are one of those things that everyone *thinks* they know, but which have so many issues with scheduling, blocking, data sharing and hiding, and general wiggyness that it's a lot harder to implement a well-engineered group of threads than most people are willing to admit. Much of that is intrinsic, by definition, to _any_ attempt at parallelization. But how much? Threads are a low-level interface to parallelization. 
But realistically, people use threads for a *very* narrow set of real-world concepts:

(Concept 1) Creating event loops, listeners, etc., in which an event spawns an "action", which runs in parallel to anything else which is running and is independent of all those other things -- so whether it succeeds, fails, or just never ends, it doesn't affect the rest of the program. We'll call this the "subprocess" model.

(Concept 2) Splitting multiple "lengthy" tasks to run in parallel, to allow calculations on one task to proceed even if another task is in a blocked (waiting) state, as frequently happens in I/O, or to allow multiple calculations to be run on multiple processors. We'll call this the "parallelization" model.

Current popular thread semantics, I would argue, are entirely designed around (Concept 1), and are designed to be very similar to actual _process_ forking. Which at first glance is great, because if you understand process fork()ing you understand threading, but there are two problems with that. First, threads aren't processes, so the analogy breaks down.
Re: Cothreads
On Tuesday, May 27, 2003, at 01:49 PM, Jonathan Scott Duff wrote: I think there's some timing missing (or maybe it's just me). Executing a Code junction implies that I have all of the routines I wish to execute in parallel available at the same time. This is often not the case. Or if adding a Code block to a junction is how you parallelize them at differing times, then I think the syntax would be horrid. Besides *I* don't want to have to keep track of the junction, I just want my threads to execute.

I suppose you could make C<parallel> or whatever return a "thread group" object, which, if necessary, you could use to add additional blocks:

    my $tgroup = parallel( &foo | &bar | &baz );
    ...
    $tgroup.add( &fuz, &fup, &waz );   # three more

though I metaphysically like the idea of executing a junction of Code in parallel, and returning a junction of results. But that still keeps the idea of routine-like, as opposed to fork-like, threads. And no matter what, we'd need to have a "global" thread group, so if your intent was to make a globally-parallel thread, you'd do it on a builtin var called $*THREADS or something. MikeL
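A "thread group" object that accepts additional blocks after creation maps naturally onto a list of futures submitted to a shared pool. A Python sketch (tgroup and the lambda standing in for &fuz are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor, wait

def foo(): return "foo"
def bar(): return "bar"
def baz(): return "baz"

pool = ThreadPoolExecutor()

# my $tgroup = parallel( &foo | &bar | &baz );
tgroup = [pool.submit(f) for f in (foo, bar, baz)]

# $tgroup.add( &fuz );  -- added at a later, differing time
tgroup.append(pool.submit(lambda: "fuz"))

wait(tgroup)   # block until the whole group has finished
assert sorted(f.result() for f in tgroup) == ["bar", "baz", "foo", "fuz"]
pool.shutdown()
```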
Re: Cothreads
On Tuesday, May 27, 2003, at 01:16 PM, Austin Hastings wrote: I like and agree with some of what you've been saying. I too think that there's a case of "an x is just a y with ..." underlying the whole coro/thread/parallel thing. That's why I'm in favor of deconstructing the threading thing -- a lower thread overhead means more people can spawn more threads for lower cost. (snip) So with that in mind, see my enormous proposal from April 15th. I think that coroutine behavior could be coded with the stuff I proposed, maybe with a few helper items added in. Yes, I just re-read it. Of what you wrote, the one thing I'd like to think extra hard about is whether we really _need_ the fork()-like behavior of threads, at all. No, seriously. Stop laughing! If we could think about "threads" not in terms of forkyness, but simply in terms of coroutines that can be called in parallel, it should be possible to create an implementation of "threading" that had to do a whole heck-of-a-lot less duplication of state, etc. Things "outside" the scope of the thread group would automatically be shared(?), things "inside" the thread group would not be shared unless explicitly marked as such. Which, if I read it right, is what you proposed too, but with a slightly different syntax. That _might_ make threads a heck of a lot faster upon creation/startup, and a lot less bulky in general. MikeL
Re: Cothreads
On Tuesday, May 27, 2003, at 12:26 PM, Luke Palmer wrote: We could also have things like: sub { ... } closure { ... } coroutine { ... } thread { ... } I think you've finally gone crazy. :-) All four of these things are closures.

Well, yes, I've been crazy for a while now. But seriously, all four of those are closures -- but I'm theorizing that the last two constructs create something "more" -- a closure bound to an encapsulating something-else.

OTOH, the difference between a thread and a coroutine is mostly internal, not external.

Again, I beg to differ. But, these are the kinds of misunderstandings that are keeping a good coroutine proposal from getting in. Here's how I think of things:

    coroutines - Used for easy iteration over complex data structures,
        pipelines, and communication of the results of certain algorithms.
    threads - Used for user interfaces, speed (on SMP systems), very
        occasionally pipelines, and headaches.

I may have missed things in the threads section, because I haven't done very much threaded programming. To be honest, my favorite use so far has been using them to (painstakingly) emulate coroutines :-)

AHA! I think what I am proposing as a "thread" is perhaps not the all-encompassing "thread" as implemented by many other languages, but merely "an encapsulated method of parallelization". We are perhaps used to thinking of threads as intra-process versions of fork(), which I would argue is a damn annoying way to do it -- far too low-level. All a thread really has to be is a block of code that: (a) executes in parallel with "sibling" blocks of code, and (b) is independent of exceptions thrown by "sibling" blocks of code. The conventional thread interface is sortof lame. What I'm very fuzzily envisioning, while hoping that my dumb-guy analysis inspires a "eureka!" moment in somebody more experienced in these implementations than I am, is "nestable" threads and subthreads, the way coroutines can be "nestable". So it's not like doing a fork().
It's like calling a subroutine and getting a result. Now, in some cases (like a top-level event loop) that subroutine will never return, which is just as true of normal subroutines. If you call one routine, piece o' cake, it's not a thread, and it doesn't have to do anything fancy. If you call a _junction_ of routines, however, _then_ it knows it has to do the extra fluff to make them parallel, which it then automatically does. So don't execute a junction of Code blocks in parallel unless you intend to do that! So rather than having fork()y threads, perhaps we can use Code junctions to represent parallelization, and call threads _as if they were simply coroutines_. (?)

But I'm pretty sure these two concepts are things that we don't want to unify, even though they both have to do with "state".

I like your musings on "state", however, and making them more explicit might enable us to come up with some very cool ideas. If we define a thread as a coroutine that runs in parallel, the syntax might converge:

    sub foo() is cothread { ... yield() ... return() }

    # start foo() as a coroutine, (blocks until explicitly yields):
    my $results = foo(...);

    # start foo() as a parallel thread (nonblocking):
    my $results_junction = parallel( &foo(...), &bar(...), &baz(...) )

In this situation, C<parallel> would indicate that you couldn't continue on until all three threads had suspended or exited -- so in order to have truly "top-level" parallel threads, you'd have to set them up so that your entire program was encapsulated within them. (Otherwise you'd just get temporary parallelization, which is just as desirable.) (You could declare a scheduler as an optional named parameter of C<parallel>.)

So to what extent is it OK to "hide" the complexity of a coroutine, or a thread, in order to have the caller side interface as clean and brief as possible?
(A question over which I remember having a vigorous but unanswerable debate with someone -- specifically over C++ operator overloading, which suffers from the problem in spades.)

There is very little complexity to a coroutine. It's a difficult *concept*, in the ways it has traditionally been explained.

Somewhat unrelatedly, I have a mini-rant about encapsulating coroutines inside the sub call interface. Why do so many people consider this a good thing? I don't go around coding all of my functions with ten C<state> variables, and I consider it a good practice. My subs tend to do the same thing each time they're called... which is precisely how the concept works. There are things that are meant to keep state, and these things are called objects! Why don't we use their interface to manage our state? I say this because Damian's coroutine proposal could be greatly simplified (IMHO making it clearer and easier) if calling the sub didn't imply starting an implicit coroutine the first time. I might write something that...
Re: Cothreads [was Re: Coroutines]
On Monday, May 26, 2003, at 06:51 PM, John Macdonald wrote: This is an interesting idea. I'd add forked processes to the list (which includes the magic opens that fork a child process on the end of a pipeline instead of opening a file).

I thought about that quite a bit, but fork() is a process-level thing, whereas even threads are more internally controllable/implementable, so I thought that would be too controversial. People already *know* how to fork processes in Perl, whereas thread syntax is newer and more malleable.

There is a danger in hiding the distinction between them too much. They have quite different performance overheads.

Some thinking out loud: a thread is significantly more expensive than a coroutine. Why? Because threads must be executable in parallel, so you're duping quite a bit of data. (You don't dup nearly as much stuff for a coroutine, _unless_ of course you're cloning an already-active coroutine.) So a conventional thread is like cloning a "very-top-level" coroutine. Now, if we were to say:

    sub foo(...) is coroutine { ... yield() ... return() }

We would expect

    foo(...args...);

to give us back that coroutine, from the last yield point. In the above, C<yield> yields out of the coroutine, and C<return> yields-without-saving-state such that the next foo() invocation will start from the beginning of the routine. Similarly, then, I would expect:

    sub foo(...) is threaded { ... yield() ... return() }

    foo(...args...)

to start &foo as a new thread. C<yield> would temporarily suspend the thread, and C<return> would end the thread. (Note that you could use &_.yield to yield the current Code object, so you can have nested yields w/out confusion -- see C<&_>, from A6.) These are using nearly identical syntax, but there is still a clear semantic difference between them -- the trait C<is threaded> is sufficient for that. We could also have things like:

    sub { ... }
    closure { ... }
    coroutine { ... }
    thread { ... }

If that's preferable syntax.
As long as they're similar, and share similar suspend/resume capabilities. OTOH, the difference between a thread and a coroutine is mostly internal, not external. If we define a thread as a coroutine that runs in parallel, the syntax might converge: sub foo() is cothread { ... yield() ... return() } # start foo() as a coroutine, (blocks until explicitly yields): my $results = foo(...); # start foo() as a parallel thread (nonblocking): my $results_junction = parallel( &foo(...), &bar(...), &baz(...) ) In this situation, C<parallel> would indicate that you couldn't continue on until all three threads had suspended or exited -- so in order to have truly "top-level" parallel threads, you'd have to set them up so that your entire program was encapsulated within them. (Otherwise you'd just get temporary parallelization, which is just as desirable.) (You could declare a scheduler as an optional named parameter of C<parallel>.) So to what extent is it OK to "hide" the complexity of a coroutine, or a thread, in order to have the caller-side interface as clean and brief as possible? (A question over which I remember having a vigorous but unanswerable debate with someone -- specifically over C++ operator overloading, which suffers from the problem in spades.) I'm of mixed opinion. I like the notion of merging them, because I think they are largely the same concept, implemented in different ways. I _VERY MUCH_ like the idea of any arbitrary Code block being parallelizable with the addition of a single word. Few languages do a decent job of parallelization, and current thread syntax is often overly burdensome. Importantly, I hope that it _might_ be possible, in the multiprocessor-laden future, to automagically parallelize some coroutines without going full-fledged into multiple threads, which are far too expensive to produce any net benefit for anything but the largest tasks. 
(The trick is in the dataflow analysis -- making sure there are no side effects on either path that will conflict, which is bloody difficult, if not impossible.) Such that given: my $result = foo(...); ... more stuff ... print $result; foo() recognizes that it can be run in parallel with the main flow, which is only blocked at the C<print> statement if C<$result> isn't completed yet. Until such time as dataflow analysis can accomplish that, however, it may take a keyword: my $result = parallel foo(...); ... print $result; or my $result |||= foo(...); ... print $result; MikeL
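The two calling styles being proposed here map loosely onto constructs available today. A rough Python analogy (all names invented; generators stand in for the blocking coroutine call, and a future stands in for the C<parallel> call that blocks only when the result is actually needed):

```python
import concurrent.futures

# "Coroutine style": a generator suspends at each yield, and the
# caller blocks only while the body runs up to the next yield point.
def foo():
    yield 1
    yield 2

co = foo()
first = next(co)        # runs foo up to its first yield

# "Thread style": start the computation in parallel, and block only
# at the point the result is consumed -- the future behaves like the
# proposed "my $result = parallel foo(...)".
def big_calc():
    return 42

with concurrent.futures.ThreadPoolExecutor() as pool:
    result = pool.submit(big_calc)    # nonblocking start
    # ... more stuff runs here, in parallel with big_calc ...
    value = result.result()           # blocks here if not yet done
```

The design point the analogy preserves: the same body can be driven either way, and the caller pays the synchronization cost only at the moment the value is demanded.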
Re: Cothreads [was Re: Coroutines]
On Monday, May 26, 2003, at 06:10 PM, Dave Whipp wrote: So, in summary, it's good to have a clean abstraction for all the HCCCT things. But I think it is a mistake to push them too close. Each of the HCCCT things might be implemented as facades over the underlying orthogonal concepts of data management and execution management (hmm, we mustn't forget IO and other resource managers). Yeah, that. What I STRONGLY SUGGEST WE AVOID is a situation where _some_ of those interfaces are object-based, using or similar, and others of those are trait or keyword based, such as or , and others are implicit and invisible (closures).[*] I think that's where we _might_ be headed, and we should turn that burro 'round. MikeL [*] I would even be ~somewhat~ willing to debate whether Perl5's ability to implicitly create closures, in absence of any marker to specify that you _meant_ to do that, might be a slight misfeature. In that I've seen people accidentally create closures -- a damn hard bug to find, sometimes -- when they would have been better served by Perl throwing var-not-found errors, and requiring a specific keyword to create an actual closure. Somewhat willing. Maybe. Sortof.
Re: Cothreads [was Re: Coroutines]
On Monday, May 26, 2003, at 06:10 PM, Dave Whipp wrote: Michael Lazzaro wrote: What I'm getting at is that all these concepts are much more related than they might at first seem, and that the differences between them are largely of "scope". If we have some form of coroutines, _and_ can run them in parallel, all the other constructs fall out of them fairly naturally. Even threads, for some pretty decent definition of "threads". [snip] OK, Mike says "a single entity", and then "these concepts are much more related than...". Perhaps I'm being pedantic, but relationships generally imply multiple entities. But perhaps he meant that things are so related that they become attributes of a single entity. I'd like to disagree: they may belong to a single domain, but to stick them in a single class would result in a bloated blob with a fat interface. Sorry; I meant that the _current_ concepts are strongly related, but they _could_ be a single entity. Please forgive the somewhat spastic parts of the proposal -- like I said, it was less than an hour of thought, and I certainly don't mean it to be a be-all-end-all. (In particular I'm aware I'm badly conflating the thread vs. scheduling issues.) Most obvious is that the "Priority" attribute is a property of the relation between the execution context and its CPU resource manager (i.e. its scheduler). In fact, I'd like to suggest the radical proposal that every shared lock should be able to have an associated priority for each blocked thread: and to enable this we'd want each lock to have an associated arbiter (the simplest of which would not have priorities). Certainly, a separate and overridable thread-scheduler seems a necessity. I'd be annoyed if you couldn't swap out the scheduler. What I really want, most of all, is a syntax for coroutines and a syntax for threads that is Damn Similar; similar enough so that one can be taught as merely a more encompassing extension of the other. 
Whether they should be absolutely identical is _very_ debatable, but I think it's worth some thought. More in a sec... MikeL
Re: == vs. eq
One thing we should clear up is that we already *have* a generic comparator, C<~~>, depending on what you mean by "generic". It can be made to compare things however you like, according to whatever standard of similarness you decide you want to enforce, and can even compare objects of disparate types (if you tell it how.) The way I personally have been envisioning this working is: $a == $b; # compares numerifications of $a and $b $a eq $b; # compares stringifications of $a and $b $a ~~ $b; # tests equality of $a and $b $a =:= $b; # tests identity of $a and $b Of these, I would expect that ==, eq, and =:= would almost never be overloaded, because they have very specific meanings.[*] You _could_ choose to override == and eq for a particular custom class, and use those for comparing equality of objects. But since some people will tend to want to override ==, and some will want to override eq, it's not clear that the Perl6 community will converge on using only one or the other, which might make things quite confusing if you're using a library that has standardized on the opposite convention from your own. ~~, on the other hand, is meant to be overloaded to a possibly-excruciating extent, and I've been assuming that it will be the thing that classes most often overload when they want to test "equality" of two arbitrary objects, without resorting to serializing them via num/str. (Using num/str comparisons to test for object equality obviously only works if num/stringifying _completely_ serializes the object -- which very often is _not_ what you want the num/stringification of an object to do, by default.) The proposed =:=, however, merely tests that two objects are identical -- that is, _that they are bound to the same underlying thing_. It's possible for two objects to be equal, ~~wise, without actually being identical, identity-wise. I don't see ever wanting to overload =:=, and it's unclear if it should even be allowed. 
Note also that == and eq comparisons are naturally accompanied by greater-than, less-than variants. But ~~ and =:= don't really have those variants, because they wouldn't make much sense. Having said all that, it should be noted that I'm completely making this stuff up. MikeL [*] By 'overloading', I mean multimethod variants. The comparison operators are almost certainly accomplished via multimethods, $a and $b being the two invocants.
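As a concreteness check on that four-way split, here is the same idea transliterated into Python (class and method names invented; C<smartmatch> stands in for the overloadable ~~, and Python's C<is> for the non-overloadable =:=):

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __int__(self):                  # "numerification", for ==
        return self.x
    def __str__(self):                  # "stringification", for eq
        return f"({self.x},{self.y})"
    def smartmatch(self, other):        # overloadable equality, like ~~
        return (self.x, self.y) == (other.x, other.y)

a, b = Point(1, 2), Point(1, 2)

num_eq   = int(a) == int(b)    # "==" : compares numeric views
str_eq   = str(a) == str(b)    # "eq" : compares string views
smart_eq = a.smartmatch(b)     # "~~" : structural equality
ident    = a is b              # "=:=": identity -- False, two objects
```

The point the example makes: all three equality tests succeed while the identity test fails, which is exactly the equal-but-not-identical distinction the message describes.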
Re: == vs. eq
Luke Palmer wrote: As much as I don't want to refute my own operator, I agree with you here. I don't know what the "official" (this week) policy is, but I think it's a bad idea for references to auto-dereference. The other way around is fine, though (arrays auto-referencizing). I'm pretty darn sure they autodereference... we last talked about this when we were trying to determine how an arrayref would behave when interpolated into a string literal (answer: just like the original array would). Here's the relevant message: On Fri, Dec 06, 2002, Larry Wall wrote: On Fri, Dec 06, 2002, Dan Sugalski wrote: : If an aggregate and a reference to an aggregate are going to behave : the same, which is what Larry's indicated in the past, then : stringifying a reference should be the same as stringifying its : referent. This is a bit of an oversimplification. $foo and @foo do not always behave the same, even if $foo and @foo refer to the same array object. In particular, $foo doesn't behave like @foo in a list context. But it's probably fair to say that $foo and @foo always behave identically in a scalar context. So I *really* don't think comparing the equality of references will be a good idea, in P6. :-) John Williams wrote: You're right, but personally, I have come to trust eq more than == when comparing things-which-might-not-be-numbers, such as references. \@a eq \@a; # true, for the reason I think # (the string-representations of the refs are equal) I'm pretty sure that breaks too, for the same reason. It puts both sides in string context, which causes both sides to return the string representation of the underlying array, _not_ the string representation of the references themselves. MikeL
Re: == vs. eq
On Tuesday, April 1, 2003, at 10:35 AM, John Williams wrote: On Tue, 1 Apr 2003, Michael Lazzaro wrote: So I would imagine it _is_ possible to test that two values "have the same identity", but I would imagine it is -not- possible to actually get what that identity "is". There's no .id method, per se, unless you create one yourself. What about the \ (reference) operator? If you take two references to an object, they should compare the same, right? In theory, I would think so. But in P6 practice, that might not be as useful as it sounds: my @a; my @b := @a; # bind @b same as @a \@a == \@b; # true, but not for the reason you think! @a =:= @b; # true, are bound to the same array Note if we are truly strict about C<==> always meaning "compare numerically", I imagine that the line: \@a == \@b; would in fact be identical to _this_ line: @a.length == @b.length; # or whatever it's called or even just: @a == @b; ...which is probably not at all what you meant when you tried to compare \@a == \@b. Likewise, if you attempt to store the numerified reference of something, in hopes of using it later as an identifier: my num $id = \@a; You would be in for a world of hurt. That line would actually set $id to the number of elements in @a -- at least, I hope it would: my $x = @a; # stores a reference to @a my $x = \@a; # same thing my int $x = @a; # stores length of @a my int $x = \@a; # same thing So I don't think comparing references would do what you wanted, for arbitrary (non-scalar) objects. Or rather, I don't think it'll be easy to get at a "numeric representation" of a reference ... a true 'identity' test might be a better approach.[*] (?) MikeL [*] (Also, any identity test that relies on numerification or other transformation of the comparands is doing a lot of unnecessary work.)
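The pitfall being described -- a "numeric view" of an aggregate is its length, so it can't serve as an id, and identity needs its own test -- renders cleanly in Python terms (variable names invented):

```python
a = [1, 2, 3]
b = a            # like "my @b := @a": bound to the same array
c = [1, 2, 3]    # equal contents, but a different array

# The numeric view of an array is its length, useless as an id:
same_len = len(a) == len(c)   # True, yet a and c are distinct arrays

# A true identity test compares the binding itself, like =:=
bound_same = a is b           # True  -- one underlying array
merely_eq  = a is c           # False -- merely equal contents
```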
Re: == vs. eq
On Tuesday, April 1, 2003, at 02:22 AM, Luke Palmer wrote: To outline the problem again, even disregarding user-defined objects: Generic containers need a way to compare nums to nums and strings to strings and only get true when they actually are equal. The kind that the user overloads with his own user-defined type to say what it means to be equal. No magic. I would suggest that probably ~~ would be the thing most objects overload to determine equality. Since == means numeric, and eq means string, that pretty much leaves ~~ as our beast-to-work-with. Which is fine, because I can certainly see two things being equal numerically without being "Equal" like-wise. And I can certainly see two things as being similar (like-wise), without being identical (identity-wise). For example, two arrays may contain the same number of elements, but the actual elements in each may be totally different. One possibility: my @a1 = (1,2,3); my @a2 = ('a','b','c'); @a1 == @a2; # true -- same number of elements @a1 eq @a2; # false -- stringifications aren't the same @a1 ~~ @a2; # false -- elements contained don't ~~ match @a1 =:= @a2; # false -- aren't bound to the same exact array So I like your idea a lot, personally. MikeL
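The four-line example above transliterates directly into Python, with len/str/==/is standing in for the proposed ==/eq/~~/=:= (a rough analogy, not the proposed semantics themselves):

```python
a1 = [1, 2, 3]
a2 = ['a', 'b', 'c']

by_count  = len(a1) == len(a2)   # "==" : true -- same number of elements
by_string = str(a1) == str(a2)   # "eq" : false -- stringifications differ
by_elems  = a1 == a2             # "~~" : false -- elements don't match
by_ident  = a1 is a2             # "=:=": false -- not the same array
```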
Re: == vs. eq
On Tuesday, April 1, 2003, at 06:59 AM, Jonathan Scott Duff wrote: On Tue, Apr 01, 2003 at 03:22:33AM -0700, Luke Palmer wrote: ($a =:= $b; # looks a little better) I like =:= as identity operator if we want one. If not, as long as .id returns something that compares properly with both == and eq, I'm happy. Agreed, =:= is nice looking. As I said before, I would strongly doubt that there will be an .id method _at all_ on any builtin types/classes -- because unless we used memory location as the id, it would imply that a separate id had to be calculated and stored with each object, which would be expensive, and if we _did_ use mem location as the id, getting and storing the id would be largely useless, since it could change unpredictably. So I would imagine it _is_ possible to test that two values "have the same identity", but I would imagine it is -not- possible to actually get what that identity "is". There's no .id method, per se, unless you create one yourself. As to whether we want an identity op, the only purpose would be when building your own classes, so that string, numeric, smartmatch, and true identity comparisons could be overloaded separately. We'd almost certainly have "identical" mean "points to the same object in memory", as opposed to the fuzzier matching of the other variants. Which sounds like a very good idea to me. We've talked about this before, and let it drop. We should decide... well, we should encourage the deciders to decide. Myself, I strongly vote for -no- .id function, but for an identity-test operator separate from ==, eq, and ~~. Called C<=:=>. MikeL
Re: Conditional C<return>s?
On Monday, March 31, 2003, at 11:21 AM, Smylers wrote: Michael Lazzaro writes: Forgive me; a very minor style & efficiency question... what would the canonical way to do this be, given what we know of Perl6? # the hapless, inefficient version: return &result_of_some_big_long_calculation(...args...) if &result_of_some_big_long_calculation(...args...); The obvious answers are these: my bool $x = &result_of_some_big_long_calculation(...args...); return $x if $x; That does something different, in that it has coerced the result into a C<bool>[*0]. So after the first line C<$x> can only be 0 or 1[*1]. And given that one of those states is false, the code becomes equivalent to: However if you permit the function to return more than two different values (of which more than one are true) then it becomes a more interesting question. My apologies, you are correct -- $x should _not_ be typed. It's the more interesting question I'm asking. On Monday, March 31, 2003, at 12:30 PM, Paul wrote: I started to suggest this myself, then realized that you might not want it to return at all if the value is false. Yes, exactly: sub foo(...args...) { # We first attempt to get our return value the easy way. # If successful (the resulting value is defined and true), # just return that value. my $x = &baz(...args...); return $x if $x; # Still here? OK, then... we've got a lot more work # to do before we can return a reasonable value ... } I'm looking for a Perl6 way to say that oft-repeated, oft-chained two-line snippet up there without declaring the temporary variable. Using C or C, maybe? MikeL
Re: Conditional C<return>s?
On Monday, March 31, 2003, at 11:18 AM, Matthijs van Duin wrote: On Mon, Mar 31, 2003 at 11:04:35AM -0800, Michael Lazzaro wrote: my bool $x = &result_of_some_big_long_calculation(...args...); return $x if $x; Is there a way that doesn't require the named variable? $_ and return given big_calculation(); or: given big_calculation() { return when true; } Don't those return C<undef>, as opposed to the value of C<$_>? I.e., wouldn't it be: $_ and return $_ given big_calculation(); -or- given big_calculation() { return $_ when true; } MikeL
Conditional C<return>s?
Forgive me; a very minor style & efficiency question... what would the canonical way to do this be, given what we know of Perl6? # the hapless, inefficient version: ... return &result_of_some_big_long_calculation(...args...) if &result_of_some_big_long_calculation(...args...); ... The obvious answers are these: my bool $x = &result_of_some_big_long_calculation(...args...); return $x if $x; -or the identical- my bool $x; return $x if $x = &result_of_some_big_long_calculation(...args...); Is there a way that doesn't require the named variable? Perhaps using C? Just a thought experiment on my part, reduced from some code examples that do this sort of thing in a duplicative, switch-like fashion... my bool $x; return $x if $x = &calc_try_1(...); return $x if $x = &calc_try_2(...); return $x if $x = &calc_try_3(...); ... MikeL
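For what it's worth, Python eventually grew precisely this construct (the := assignment expression, PEP 572), which lets the test-and-return chain above be written without a separately declared temporary. A sketch with invented stand-ins for the calc functions:

```python
def calc_try_1(n):
    return 0          # false: keep trying the next calculation

def calc_try_2(n):
    return n * 10     # the first true result wins

def calc_try_3(n):
    return -1         # last resort

def first_true(n):
    # "return $x if $x = &calc_try_N(...)" with no declared temp:
    if x := calc_try_1(n):
        return x
    if x := calc_try_2(n):
        return x
    if x := calc_try_3(n):
        return x
```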
Re: This week's Perl 6 Summary
On Monday, March 31, 2003, at 10:15 AM, Jonathan Scott Duff wrote: On Mon, Mar 31, 2003 at 10:09:43AM -0800, Michael Lazzaro wrote: I'm still hoping rather desperately for an if-uninitialized op in general, even if only for hashes, because the difference between "present but undefined" and "not present" is rather crucial for some common algorithms. Can you give some examples? Sure, edited slightly from my last mail... I suppose my own most common example is a hash representing a cache... it's entirely possible for C<undef> to be calculated and cached, but that doesn't mean the cache entry is invalid and should be recalculated every time -- a nonexistent key would mean the cache entry is invalid. { $cache.{ $key } = &foo($key) # (1) if not exists $cache.{ $key }; $cache.{ $key }; } In my own experiences, code similar to the above is awfully common. An assign-if-not-present form (at least for hashes, but in a magical fairy world, for arrays/params/whatever, too) such as: $cache.{ $key } ///= &foo($key); # (2) would be a lot cleaner, and maybe a little faster, since it's testing C<$cache.{ $key }> once instead of -- what, 2.5 or 3 times, I guess, depending on how you count it? There are other examples -- working with external data sources, primarily -- but they pretty much always boil down to the same general concept. I basically want the shortest, *fastest* possible way to say the caching code given above. MikeL
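The distinction being argued for -- "present but undefined" is a valid cache hit, and only "not present" triggers recomputation -- is easy to demonstrate in Python terms (helper names invented):

```python
calls = []
cache = {}

def expensive(key):
    calls.append(key)   # track how often we really compute
    return None         # an undef-ish result is still a valid entry

def lookup(key):
    # the "///=" idea: test *existence* of the key, not definedness
    if key not in cache:
        cache[key] = expensive(key)
    return cache[key]

lookup('a')
lookup('a')   # second call is a cache hit, despite the stored None
```

Note that dict.setdefault would not be a faithful substitute here: its default argument is evaluated eagerly, so the expensive call would run on every lookup regardless of the hit.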
Re: This week's Perl 6 Summary
On Monday, March 31, 2003, at 07:39 AM, Piers Cawley wrote: Argument initializations Michael Lazzaro summarized the various different and proposed assignment operators available in Perl 6, including a proposed "::=" for 'only assign to uninitialized variables'. Michael wondered how these could be used in method signatures and proposed some changes to the signature system as set out in Apocalypse 6. People were dubious about this, with Damian saying "I don't think that allowing 20 different types of assignment in the parameter list of a subroutine actually helps at all." I'm not sure Michael is convinced yet. Nah, I'm convinced that nobody likes the putting-more-initializers-in-the-sig idea, so it can die. And I'm convinced that we can't use ::= for the purpose of if-uninitialized, because it's not consistent with the other meaning. I'm still hoping rather desperately for an if-uninitialized op in general, even if only for hashes, because the difference between "present but undefined" and "not present" is rather crucial for some common algorithms. But I have no idea what to propose calling it, which is a bit of a pickle. :-/ $h{ k } = 'blah'; # always $h{ k } //= 'blah'; # if not defined $h{ k } ///= 'blah'; # if not exists ??? MikeL
Re: A6: argument initializations via //=, ||=, ::=
On Tuesday, March 25, 2003, at 12:59 PM, Smylers wrote: Michael Lazzaro writes: Larry Wall wrote: We don't have a word for "START" right now. It's somewhat equivalent to state $foo //= 0 unless $foo gets undefined, I suppose. Assuming we have a static-like scope called C<state>, one can definitely see the use of having an assignment variant for "not yet initialized" or "doesn't yet exist", the proposed spelling of which was C<::=>. I'm unconvinced by the need for a cryptic way to distinguish undefined from uninitialized. Or, at least, if we want to make such a distinction here it should be because we come up with good examples where the distinction is useful, doing something that couldn't easily be achieved in some other way -- and that that usefulness is thought to outweigh the additional complexity of having another assignment operator in the language and having to distinguish it when teaching or learning Perl. I suppose my own most common example is a hash representing a cache... it's entirely possible for C<undef> to be calculated and cached, but that doesn't mean the cache entry is invalid and should be recalculated every time -- a nonexistent key would mean the cache entry is invalid. { $cache.{ $key } = $cache.load($key) if not exists $cache.{ $key }; $cache.{ $key }; } In my own experiences, code similar to the above is awfully common. An assign-if-not-present form (at least for hashes, but in a magical fairy world, for arrays/params/whatever, too) such as: $cache.{ $key } __= $cache.load($key); would be a lot cleaner, and maybe a little faster, since it's testing C<$cache.{ $key }> once instead of -- what, 2.5 or 3 times, I guess, depending on how you count it? Your mileage may vary, of course -- I must confess I need the undefined vs. noexists distinction all the time, but it's quite possible I code weirdly compared to other people. :-/ MikeL
Re: A6: argument initializations via //=, ||=, ::=
On Tuesday, March 25, 2003, at 06:17 PM, Damian Conway wrote: Likewise, I read sub foo($x //= 1) {...} as saying the value stored in $x is a constant, but if the caller passed an undefined value (or didn't pass anything at all), we're going to instead pretend they passed us a (still-constant) 1. I'm not sure why that violates any rules.(?) //= means "assign to lvalue unless lvalue is defined". So I read that as: $x is a constant variable. Bind the corresponding argument to $x and then assign 1 to it if its value is undef. But the container bound to $x is bound as a constant, so it *can't* be assigned to. OK, I buy that, I think. But does that mean that fixing edge cases in your arguments doesn't work at all, i.e. you can't even do this? sub foo(?$x) { $x //= 1; # if they passed an undef, init it to 1 ... } or that you have to say it like this? sub foo(?$x) { my $x = $x // 1; # hide $x param using a non-const C<my>? ... } MikeL
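The body-level defaulting being asked about here is the everyday idiom in Python, where the signature leaves the parameter alone and the body patches up the undefined case (a sketch; names invented):

```python
def foo(x=None):
    # "$x //= 1" moved into the body: an explicit None (undef)
    # gets the default, but a defined-but-false value does not --
    # which is the //= behavior, as opposed to ||=.
    x = 1 if x is None else x
    return x

# foo() and foo(None) both yield the default;
# foo(0) keeps the defined-but-false 0.
```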
Re: P6ML?
Robin Berjon wrote: Including... The data binding folks have tried to address the problem using XML Schema, and the result is, hmmm, "unpleasant", to use something polite. The SOAP and WSDL people have been at it, and I won't even describe the result because I couldn't possibly be polite about it. Imho a grammar-based approach would likely be too low-level. I'm currently betting on something that would mix XBind[1] and Regular Fragmentations[2]. The first one defines simple mappings as described above, the second tells you how to parse data in XML documents that has structure not expressed in XML (e.g. 2003-03-26) so that it is seen in a structured way, without the need for typing. One very cool thing that could be done in Perl 6 would be to take an XBind+RegFrag document and generate a grammar derived from the P6 XML grammar that would 1) be specific to the vocabulary (and thus hopefully faster than a generic XML grammar, though I don't have /too/ much hope) and 2) directly produce the object representation you want and return it in the parse object. Indeed. This is the primary problem space. Nobody has done this well. If we could provide a toolset for doing this, we would Really Have Something. My initial query about the ambiguously-named "P6ML" was not based so much on a notion that such an effort couldn't be done in Perl5, and more on the notion that it may be far _more_ possible to do this, quickly & credibly, using P6 typing/OO and the new regex engine. As I said, I've done quite a bit of prototyping, and the P5 solutions can be very, very tedious. (P5 and P6 may be mostly alike, but it's the parts that aren't "mostly" that have driven the very need for P6 -- and just so happen to be the very parts that make this problem so awkward in P5.) And in case I haven't mentioned it this week, you Parrot folks are my heroes. [0]http://use.perl.org/~gnat/journal/11081 [1]http://www.prescod.net/xml/xbind/ [2]http://www.simonstl.com/projects/fragment/ Thanks for those... 
I was aware of the first two links, but I had completely missed the Frag one -- I plead ignorance on that. You are correct, it looks quite promising. MikeL
Re: A6: argument initializations via //=, ||=, ::=
On Tuesday, March 25, 2003, at 02:19 PM, Damian Conway wrote: And I don't think that allowing 20 different types of assignment in the parameter list of a subroutine actually helps at all. Especially since the vast majority of parameters in Perl 6 will be constant. Twenty types of _initialization_. :-D Seriously, tho, I'm not sure I understand the constantness part. sub foo($x = 1) {...} # A6 syntax I read the above as saying $x is indeed constant, but if it's not explicitly placed by the caller, we're going to pretend the caller passed us a 1. Likewise, I read sub foo($x //= 1) {...} as saying the value stored in $x is a constant, but if the caller passed an undefined value (or didn't pass anything at all), we're going to instead pretend they passed us a (still-constant) 1. I'm not sure why that violates any rules.(?) As a marginal bonus, perhaps an assertion-style sub foo($x //= 1) {...} optimizes to be faster, runtime, than sub foo($x) { $x //= 1; ... } when passing a constant in $x, e.g. C or C, since it can optimize out the assertion at compile-time? MikeL
Re: A6: argument initializations via //=, ||=, ::=
On Tuesday, March 25, 2003, at 03:35 PM, Mark Biggar wrote: sub myprint(+$file is IO:File is rw ::= IO:STDOUT, *@list) {...} open f ">/a/d/v/f/r"; myprint file => f, "Hello World!\n"; # goes to f myprint "Different World!\n"; # goes to IO:STDOUT As a side note... that sig will not do the behavior you've described. You instead want this: sub myprint(*@list, +$file is IO:File is rw ::= IO:STDOUT) {...} The named parameter +$file has to go behind the positional *@list in the signature, but still goes _before_ *@list in the actual call. (Insert image of a five-year-old me jumping up and down on a chair, pointing my finger, saying "see! see! I told you people would do that!") MikeL
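Python enforces this exact ordering: a keyword-only parameter must be declared after the slurpy list, yet callers may still pass it by name before the list. A sketch of the same myprint idea (names and the StringIO target invented for illustration):

```python
import io
import sys

def myprint(*rest, file=sys.stdout):
    # the named 'file' parameter is declared *after* the slurpy
    # *rest, as the language requires...
    print(*rest, sep='', end='', file=file)

buf = io.StringIO()
myprint("Hello World!\n", file=buf)   # named arg after the list...
myprint(file=buf, *["Goodbye!\n"])    # ...or before it, caller's choice
```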
Re: A6: argument initializations via //=, ||=, ::=
On Tuesday, March 25, 2003, at 11:08 AM, Jonathan Scott Duff wrote: On Tue, Mar 25, 2003 at 10:42:39AM -0800, Michael Lazzaro wrote: But it is certainly possible to extend the initialization capabilities to be more robust: sub foo($x = 'blah') {...} # wrong: use one of the below sub foo($x ::= 'blah') {...} # same as C<$x is default('blah')> sub foo($x //= 'blah') {...} # sets $x whenever $x is undefined sub foo($x ||= 'blah') {...} # sets $x whenever $x is false While this looks pretty in email, it makes me wonder what the :: operator does outside of regular expressions and how that operator interacts with ??:: Well, := is the binding operator, and ::= is the "compile-time" binding operator. For some definition of "compile-time"(?)... I'm not sure whether it can be reused for this "init-time" purpose or not. So ::= does what := does, but does it sooner. :-) It has been proposed on several occasions that $foo ??= 'baz' :: 'zap'; be equiv to $foo = ($foo ?? 'baz' :: 'zap'); But I don't think ??= and ::= conflict in any (technical) way. And don't forget these other argument initializations: sub foo($x &&= 'blah') {...} # sets $x whenever $x is true sub foo($x += 1) {...} # add 1 to whatever $x given sub foo($x -= 1) {...} # subtract 1 from whatever $x given sub foo($x *= 2) {...} # multiply by 2 whatever $x given sub foo($x /= 2) {...} # divide by 2 whatever $x given sub foo($x ~= "foo") {...} # append "foo" to whatever $x given True, most of those are not useful. But the ones that potentially are (::=, //=, ||=), should we support them? It would seem that all three of those notions are quite often used, even in quick-n-dirty code. I'm arguing for these three defaulting options in particular, but my more encompassing argument is the wrapper-like notion that signatures should be able to specify assertions, invariants, and initializations that may then be "inherited" by all implementations of that function/method. 
Placing such extensive capabilities in the sig is especially useful if we can 'typecast' subs, as others have pointed out. class mysub is sub(...big long signature...) returns int {...} sub foo is mysub { # actual implementation goes here } sub bar is mysub { # same signature, different function } If ...big long signature... contains initializations and assertions that are required to be shared by all sub implementations, then it makes the sig a quite powerful thing. Sure, a complex sig could be pretty big. But it's one big thing, as opposed to repeating the same assertions in N implementations. MikeL
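The "one shared contract, many implementations" idea sketches naturally as a decorator in Python terms (decorator and function names invented; the assertion stands in for the shared signature's checks):

```python
def shared_sig(fn):
    # the assertions every implementation must satisfy, stated once
    def wrapper(x):
        assert isinstance(x, int) and x >= 0, "x must be a non-negative int"
        return fn(x)
    return wrapper

@shared_sig
def foo(x):          # actual implementation goes here
    return x + 1

@shared_sig
def bar(x):          # same contract, different function
    return x * 2
```

As in the proposal, the contract lives in one place rather than being repeated in N implementations.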
Re: P6ML?
On Tuesday, March 25, 2003, at 11:02 AM, Robin Berjon wrote: Michael Lazzaro wrote: So, is anyone working on a P6ML, and/or is there any discussion/agreement of what it would entail? Imho P6ML is a bad idea, if it means what I think it means (creating a parser for quasi-MLs). People will laugh at our folly, and rightly so for trying to be able My own musing was not something that would accept bad XML, but something more geared as a P6-based replacement for the steaming hunk of crap known as XSL. An XML-based derivative that performs XML transformations, allowing/using embedded P6 regexs, closures, etc., and able to more easily translate XML <==> P6 data. Something like that might significantly help P6 adoption rates.[*] While we're stuck with XML, I'm not willing to say we in Perl-land should be stuck with the currently craptacular XML transformation methods being adopted by other languages. :-P Anyway, it's a future library issue more than a language development one, but I'd be interested in hearing if any such plans were already underway. MikeL [*] For example, one of the Very First Things I'll be doing with Perl6 is, of course, creating a P6-specific companion to ASP/JSP/PHP, but one that's substantially more OO in nature... all of those *Ps have pretty poor capabilities, and do not allow sufficiently flexible OO-based templatizations, in my experience. And while P5's Mason is impressive, one can imagine a more firmly P6, OO-based solution that would have a *lot* of additional speed/capability. (I have a longtime P5 prototype that we use here, but limitations of the P5 implementation makes it annoyingly slow during template compilation & init.)
P6ML?
So, is anyone working on a P6ML, and/or is there any discussion/agreement of what it would entail? MikeL
A6: argument initializations via //=, ||=, ::=
Getting back to A6, a few thoughts. From the 'Re: is static?' thread: On Wednesday, March 19, 2003, at 08:30 AM, Larry Wall wrote: Well, people *will* write state $foo = 0; The question is what that should mean, and which major set of people we want to give the minor surprise to, and how much effort we want to expend in training to avoid the surprise in the first place. There's something to be said for keeping = as assignment outside of sigs. But it's not clear whether that = is part of a sig... Most use will be to init with a compile-time constant, so I suppose we could train people to just say: state $foo ::= 0; We don't have a word for "START" right now. It's somewhat equivalent to state $foo //= 0 unless $foo gets undefined, I suppose. That's where that particular thread ended -- from there, it diverged into a subthread about C<state> behavior within nested subs/closures. Ignoring the closure parts, and getting back to the more generic idea of C<state>: Assuming we have a static-like scope called C<state>, one can definitely see the use of having an assignment variant for "not yet initialized" or "doesn't yet exist", the proposed spelling of which was C<::=>. That gives us: state $baz = 0; state $baz ::= 'blah'; # if doesn't "exist" or not "initialized" state $baz //= 'blah'; # if $baz is undefined state $baz ||= 'blah'; # if $baz is false state $baz &&= 'blah'; # if $baz is true state $baz ??= 'blah' :: 'blarp'; # if $baz is true|false Of course, for C<state>-scoped vars, C<::=> is the only one that makes much sense, and C<::=> doesn't make much sense for lexical vars. (One could argue that C<=> should just mimic C<::=> for C<state>-scoped, but I'm not going to.) --- OK, so tying that back in to A6... people have suggested extending //= and/or ::= into signatures. The more I think about it, the better I like this approach, and the more justified it seems. Right now we have: sub foo($x = 0) {...} # same as C<$x is default(0)> to set the $x param to 0 _if it was not specified by the caller_. 
But it is certainly possible to extend the initialization capabilities to be more robust:

    sub foo($x = 'blah')   {...}   # wrong: use one of the below
    sub foo($x ::= 'blah') {...}   # same as C<$x is default('blah')>
    sub foo($x //= 'blah') {...}   # sets $x whenever $x is undefined
    sub foo($x ||= 'blah') {...}   # sets $x whenever $x is false

The utility of this is that it gives the signature significantly more control over the initialization of individual parameters, regardless of the actual function/method "implementation". Maybe sigs _should_ be able to make certain assertions about their arguments, and even adjustments to them, before they hit the 'real' implementation.

There has been some debate about the power that sigs should be given. Specifically, I'm thinking of the old "how do you make an assertion upon an argument" debate of a few months ago, and the terror of ten-line-long function sigs with assertions attached to the various parameters. But it's largely a false debate, I would argue: if you want all possible implementations of a given function/method to share the same precise parameter assertions, they _should_ be specified in one place, not in twenty. (Subs are objects, have inheritance, etc... more on this later.) But TMTOWTDI, if that's not your style.

Anyway, I can definitely see merit in allowing those particular constructs for the sake of significantly smarter sigs. I can see _practical_ uses for C<::=>, C<//=>, and C<||=> -- enough so that they probably should be differentiated, and that I would even propose the C<=> spelling, in sigs, be dropped as ambiguous/meaningless. ?

MikeL
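For readers who want to play with the three defaulting behaviors outside Perl 6, here is a minimal Python sketch of the distinction (the function names and sentinel are mine, purely illustrative): ::= fires only when the argument is missing entirely, //= when it is undefined, and ||= when it is false.

```python
# Illustrative analogues of the three proposed sig defaults; names are mine.
MISSING = object()  # sentinel meaning "caller passed nothing at all"

def default_if_missing(x=MISSING, d='blah'):   # like  $x ::= 'blah'
    return d if x is MISSING else x

def default_if_undef(x=None, d='blah'):        # like  $x //= 'blah'
    return d if x is None else x

def default_if_false(x=None, d='blah'):        # like  $x ||= 'blah'
    return d if not x else x
```

Passing 0 survives the first two but is replaced by the third, which is exactly the distinction the separate spellings are meant to preserve.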
Re: A6: Named vs. Variadic Parameters
On Wednesday, March 19, 2003, at 09:58 AM, Larry Wall wrote:

    : sub foo($x, *@a, +$k) {...}    # (2) OK

    Fine, you can set @a using positional notation, like push(), in
    addition to the notations available to (1). But if you set "k =>", it
    has to be before the list, unless you pass the list explicitly as a
    "*@" named parameter. With the exception of *@a at the front as in
    (2), non-positional parameters don't pay any attention to their order
    of declaration.

It's the "with the exception of *@a at the front" part that worries me quite a bit. I'd be a lot happier with having the rule be "non-positional parameters must come after positional parameters, but before any variadic elements". I think newbies are unquestionably going to try to put the parameters in the same order as they expect to see the eventual arguments, and be durn confused when it doesn't work -- I know I would. Especially because:

    sub foo($x, +$k, *%h) {...}   # (3)
    sub foo($x, *%h, +$k) {...}   # (4)

_are_ synonymous, but

    sub foo($x, +$k, *@a) {...}   # (1)
    sub foo($x, *@a, +$k) {...}   # (2)

are quite definitely not.

Dunno. I'm just one datapoint, but I strongly see the difference between (1) and (2) as being a *huge* newbie trap. And it doesn't seem like you ever gain anything by specifying (1) -- I don't know why you would ever purposefully _want_ to do that, instead of (2). So I would still strongly urge that (1) and (2) be synonyms.

MikeL
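Python's parameter rules happen to make a decent sandbox for this gotcha: a keyword-only parameter declared after *args (the spirit of (2)) stays out of the positional stream, while one declared before it (roughly (1)) silently soaks up the first slurped value. A hedged sketch, names mine:

```python
# Keyword-only 'k' after the slurpy list -- the (2) ordering:
def foo_named_after(x, *rest, k=0):
    return (x, rest, k)

# 'k' declared before the slurpy list -- roughly the (1) newbie trap:
def foo_named_before(x, k=0, *rest):
    return (x, rest, k)
```

Calling foo_named_after(1, 2, 3) gives rest=(2, 3) and k=0, but foo_named_before(1, 2, 3) silently binds k=2 and rest=(3,) -- the "$how == 1" surprise in miniature.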
Re: "XML is Too Hard for Programmers" - Tim Bray
On Tuesday, March 18, 2003, at 09:55 AM, Austin Hastings wrote:

    To me, this says that there's no real commitment to "doing XML". What
    there is seems to be a recognition that XML format is regular and
    comprehensible to others, so writing "XML-like" files becomes popular.

Yep. Which makes things even worse. And this is pretty important stuff. We do a *lot* of XML parsing here (Cognitivity, that is) and even more "XML-like" parsing. And even with Perl, it's a royal pain.

There are P5 XML modules out there which tie into C-based XML libraries... those are quite fast, but fail badly if the XML isn't 100% well-formed, and are largely not extensible for "XML-like" situations. You'd have to rip one up and rewrite it, in C, for every iteration of "-like", which we cannot credibly do. A perl5-native parser can be rigged up fairly easily, but it's *numbingly* slow compared to the C version. I mean, 20-50 times slower, by my guess. The speed issue when importing XML-like data (which we do *very frequently*) is a constant sticking point for us and our clients. Damian's Parse::RecDescent has been a godsend, implementation-wise -- but it of course suffers the same nasty speed issues.

This is a big, big issue, and one that P6 needs to address well, because this is how many businesses will judge it. What I'm hoping, obviously, is that the new P6 regexes -- which will be *perfect* for writing and maintaining our umpteen quite-similar parsing rulesets -- will be fast enough to at least be in the same order of magnitude as a middling C solution. They don't have to be as fast as C, obviously, but they can't be 20x worse.

Why does this matter so much? Because it's a barn door. Even though it's so much easier to write XML-like parsers in Perl than, well, anything else, the speed issue will at some point dictate moving to a non-Perl parsing solution.
At which point, the issue becomes how much of the rest of the related system to move into that other solution as well, since it is much cheaper to maintain expertise in one toolset than two. So within a company, it can lead to greater use of Perl -- or abandonment of Perl -- depending on success in this one key area. (I have seen this in action at a number of companies.)

It is therefore critically important that P6 allow easy, fast parsing of XML-like things, not necessarily just XML proper, because that's the way the business winds have been blowing. And it needs to support it out-of-the-box. Seriously, it's that important.

MikeL
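As a rough illustration of what "parsing XML-like things" means in practice, here is a tiny, deliberately tolerant Python tokenizer (the regex and names are my own sketch, not any module's API); unlike a strict XML parser, it happily chews on not-quite-well-formed input:

```python
import re

# One permissive rule: a start/end tag, or a run of text between tags.
TOKEN = re.compile(r"<(/?)(\w+)[^>]*>|([^<]+)")

def tokens(text):
    """Tokenize 'XML-like' input without demanding well-formedness."""
    out = []
    for close, name, data in TOKEN.findall(text):
        if name:
            out.append(("end" if close else "start", name))
        elif data.strip():
            out.append(("text", data.strip()))
    return out
```

An unclosed tag like "<a>hi" still tokenizes instead of dying, which is the tolerance a strict C-based parser typically refuses to offer.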
Re: is static?
Damian wrote:

    Hence, I would argue, one ought to simply mark it with a trait:

FWIW, I personally think this is _absolutely_ the right approach. Using a trait is a very visible, very obvious way to say what's going on, and probably easier to remember than adding another keyword to the [my|our|temp|let] group. While I, too, immediately understood what 'has' meant, I can't help but feel many people won't get it. As others have pointed out, the problem with 'static' is not only that (a) it has too many C++ meanings, but (b) the word itself implies 'constant', not 'persistent'. I would really, really like for us to not use that already-abused word.

    is retained
    is preserved
    is kept

These three, I think, show the most promise. Or the linguistically dubious "is once", maybe. The others like "is saved/stored/restored" might be taken for serialization-style persistence.

David Landgren wrote:

    I expected to see 'is persistent' as a possible name. Or does that
    denote serialisation too much?

I think so... I thought about that too, but I think "persistent" is becoming synonymous with "serialized & stored" these days.

MikeL
Re: survey page? [OT, was Re: is static?]
On Tuesday, March 18, 2003, at 06:49 AM, Paul wrote:

    Merely for the one small thing I might possibly contribute... Would it
    be useful to have a convenient place to do polls? I suspect there
    already is one somewhere, but I'm unaware of it. I don't want to
    undermine the authority of the core planning team, but thought they
    might like to have a simple way to survey for things that are more
    preference than major issue.

The (now very outdated!) POOC pages at http://cog.cognitivity.com/perl6/ have polls attached to each recipe. If nothing else, let me know and I can easily add a few. They contain the caveat that you must register to vote, as a way to prevent ballot stuffing.

The POOC polls were an experiment. They tentatively demonstrated that (a) while hundreds of people visited those pages, pretty much NONE of them voted, and (b) we all like bitching a heck of a lot more than we like deciding. ;-) So, dunno. We might try a few, but I'm not sure the results would be very useful.

The design team has proven repeatedly that they have a terrific handle on the various issues, and there have been quite a few things that, if left to the prevailing popular opinion, would have led to distinctly the WRONG decision being made. The most productive (though not necessarily painless!) approach I've personally witnessed is when the design team muses about ideas, the list argues back and forth for a while, then the design team comes down with an Edict From On High that takes those issues into account. If people are *really* convinced it's wrong, the argument continues for a while, but it usually gets shut down when most of the list is satisfied that all the arguments have been heard.

As much as people hated it, I think the P6 Operators thread was *quite* beneficial. It led to the saving of ^ xor, and the >>hyper<< syntax, and quite a few other improvements, and got things pinned down squarely. I wouldn't mind seeing more of that level of disciplined debate, but it's difficult to pull off.
MikeL
Re: A6: Complex Parameter Types
On Monday, March 17, 2003, at 10:35 AM, Luke Palmer wrote:

    I've been thinking of it like this:

        class int isa intlike;    # and isa value or whatever
        class Int isa intlike;    # and isa Object or whatever
        class num isa numlike;    # and isa value or whatever
        class Num isa numlike;    # and isa Object or whatever
        ...
        class Scalar isa intlike, numlike, ...;  # and isa Object or whatever

    The *like classes are placeholder (interface) classes that capture the
    ability to *treat as* (as opposed to *really is*). And (importantly
    IMO) the *like classes capture the aspects of foo and Foo that are the
    same, leaving the foo and Foo classes to capture the differences.

    This is an interesting concept. We can have Intlike and Numlike
    abstract classes that Int and Num concretify :). Then anything can
    derive from Intlike and Numlike and be substituted where they are
    expected everywhere.

Not quite sure I understand -- how does this interact with the goal of flexibility between things which take an C<int> vs. an C<Int>, for example? Ideally...

    sub foo(int $n) {...}    # accepts int only (?)
    sub foo(Int $n) {...}    # accepts int, Int (?)
    sub foo(num $n) {...}    # accepts int, num (?)
    sub foo(Num $n) {...}    # accepts int, Int, num, Num (?)

(Assuming a significant difference between int/Int and num/Num is that the latter are allowed to be undef, thus making the above distinctions more relevant.)

You certainly don't want to have to type things as

    sub foo(intlike $n) {...}

just to get the above flexibility, for example. ?

MikeL
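The *like-interface idea translates directly into any language with class hierarchies. A hedged Python sketch (all names mine): both the low-level flavor and the boxed flavor implement the same "treatable as an integer" interface, so a routine written against the interface accepts either, without callers ever spelling out "intlike" themselves:

```python
class Intlike:
    """Interface class: 'can be treated as an integer'."""

class NativeInt(Intlike):
    """Stand-in for low-level 'int' (never undef)."""

class BoxedInt(Intlike):
    """Stand-in for boxed 'Int' (may hold undef)."""

def foo(n):
    # A sig written against the interface accepts both concrete flavors.
    return isinstance(n, Intlike)
```

Here foo accepts NativeInt and BoxedInt alike and rejects everything else, which is the substitutability Luke describes; whether users should ever have to name the interface in a sig is exactly the open question.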
A6: Named vs. Variadic Parameters
A simple question, I hope... From A6, "Calling Subroutines", comes the following:

    multi push(@array, +$how, *@list) {...}

    push(@a, how => 'rapidly', 1,2,3);   # OK
    push(@a, 1,2,3);                     # WRONG, $how == 1!

    Oops! What you really wanted to say was:

    multi push(@array, *@list, +$how) {...}

    push(@a, how => 'rapidly', 1,2,3);   # OK
    push(@a, 1,2,3);                     # OK

Note the gotcha part... if you want to use both named arguments and a variadic list, you must declare the parameters in the signature in a different order than they must appear when actually calling the sub. If you put the signatured params and the actual arguments in the _same_ order, it will break. The reason is that, in the first example, the slurpy array has been placed in the named-only zone, _not_ the positional zone.

Clearly, it's going to be a newbie problem, and I guess I'm not understanding why we can't enforce What They Really Meant. When calling a sub that has both named params and a slurpy list, the slurpy list should always come last. If a sub has both a slurpy hash and a slurpy list, the slurpy list should still always come last. You simply can't credibly have anything after the slurpy list, or it'll be slurped. So args/params must ALWAYS come in this exact order, if they are to be useful:

    sub foo(
        $x,    # required positional
        ?$y,   # optional positional
        +$k,   # optional named
        *%h,   # optional slurpy hash
        *$s,   # optional slurpy scalar
        *@a,   # optional slurpy array
    ) {...}

I guess what I'm not understanding is why you would _EVER_ want the slurpy C<*@a> to be in the named-only zone, and presuming you never would, why we can't syntactically / semantically fix the above gotcha so that the params always appear in the "calling" order?

MikeL
Re: [SUMMARY] A6: Type Inference
On Friday, March 14, 2003, at 12:21 PM, Dave Whipp wrote:

    Michael Lazzaro wrote:

        3) If an "untyped" var is used for a typed parameter, a simple
        dataflow analysis is used to determine whether the compiler can
        guarantee that, at that point, an "untyped" var will _always_
        contain values of a known, specific type. If so, the type is
        inferred (silently or with a warning, according to pragma?)
        Otherwise, it is a compile-time error.

    I was suggesting something slightly more subtle:

    * If the dataflow says it's definitely wrong, then it's an error
    * If the dataflow says it's definitely OK, it is OK -- no warning, nor
      error, nor run-time check
    * Otherwise (silently or with warning), defer to run-time check

    The coercion aspect makes it slightly more complex, but not
    significantly so.

Sorry -- I agree 100% with that. As a language feature, it has problems, as Luke & Angel both pointed out. As a mere optimization of case #2, it's fine, and there aren't really any language implications. In tight loops, etc., it would probably speed up the runtime quite a bit, in fact.

The optional warning, if it falls through to the runtime check, is important, IMO, because even lazy one-off scripts sometimes need to be fast. :-/ It would still mean that, in 'non-strict' mode, you might get different "deferring to runtime typechecking" warnings on different versions of P6, but I don't see that as a big issue, if it's just an optional warning.

MikeL
Re: [SUMMARY] A6: Type Inference
On Friday, March 14, 2003, at 11:06 AM, Michael Lazzaro wrote:

    AFAICT, these are the *only* possible solutions to the problem. At
    last count, Larry was leaning towards #2. Damian was countering with
    #1. Some Lowly Grubs were suggesting #3. Am I missing anything?

Whoops! That needs correcting. Larry sayeth:

    The real question is whether this particular stricture is part of the
    default "use strict" that classes and modules assume. There are decent
    arguments on both sides of that one, but just to mollify Damian I'm
    inclined to come down on the strict side for that.

E.g., "use strict" invokes #1, otherwise it's #2. Sorry for the repost, but if something says "SUMMARY" it'd be best for it not to be obsolete after the first 5 min...

MikeL
[SUMMARY] A6: Type Inference (was Re: A6: Strict signature checking)
OK, divide & conquer. We seem to be spasming about this and trying to talk about N things at once, so here's an issue summary. We're talking about at least two separate cases: (1) "inferring" type where none has been specified, and (2) "coercing" a typed value into another type. Let's take these separately, first one first.

[ISSUE: Type Inference]

Consider the following example:

    class MyClass {...}
    sub foo(MyClass $c) {...}

    my MyClass $a = MyClass.new;
    my $b = $a;

    foo($a);   # OK
    foo($b);   # ERROR!

The issue here is that the variable C<$b> has not been typed -- or more accurately, it has been given a default type of C<Scalar>. So $b can store a MyClass, but it has not been _guaranteed_ to contain a MyClass, violating the typed signature of C<foo>. This is potentially significant because, for quick-and-dirty scripts, some programmers would prefer not to explicitly type every single variable they declare, but they would still like to be able to use modules that have been written to be type-aware.

NOTE that this issue only comes up when passing "untyped" vars. If a type _has_ been explicitly given, it's not type "inference", it's type "coercion", which should be considered separately.

[POSSIBLE APPROACHES]

1) If an "untyped" var is used for a typed parameter, it's a compile-time error. Broadly speaking, if you use type-aware subs, you must use type awareness everywhere. The advantage of this approach is that it assures there will be no hidden runtime costs; if you want to take advantage of the *huge* speed increases of using strictly typed vars, you can just do it. The disadvantage is that you'd pretty much have to use types *everywhere* in your program, or *nowhere*, because the edge between them always introduces compile-time errors. This is especially troublesome when using library modules, for example.

2) If an "untyped" var is used for a typed parameter, invoke runtime type checking, either with or without a warning (according to a pragma?)
The advantage of this approach is that it will silently work; the disadvantage is that it could introduce *very* expensive runtime penalties if "accidentally" invoked, essentially nullifying the speed gains of typing.

3) If an "untyped" var is used for a typed parameter, a simple dataflow analysis is used to determine whether the compiler can guarantee that, at that point, an "untyped" var will _always_ contain values of a known, specific type. If so, the type is inferred (silently or with a warning, according to pragma?) Otherwise, it is a compile-time error. The advantage is that it will silently work, and will present no runtime speed penalties. The disadvantage is that different implementations of Perl might have different levels of dataflow analysis, causing one-off code that was acceptable under a "smarter" analysis to be invalid if moved to a "dumber" analysis -- meaning either this has to be acceptable behavior of an implementation, or we need to specify, as part of the language spec, the precise smartness of the analysis.

---

AFAICT, these are the *only* possible solutions to the problem. At last count, Larry was leaning towards #2. Damian was countering with #1. Some Lowly Grubs were suggesting #3. Am I missing anything? I think we can decide _this_ issue independently of coercion issues, yes?

MikeL
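Option #3 can be made concrete with a toy dataflow pass over straight-line code (entirely my own sketch, nothing like a real compiler): constants establish a known type, copies propagate it, and anything else leaves the variable's type unguaranteed.

```python
def infer(stmts):
    """Toy type inference over straight-line assignments.
    stmts: list of (target, kind, operand); kind is 'const' or 'copy'.
    Returns {var: type-name or None}; None means 'no guarantee'."""
    env = {}
    for target, kind, operand in stmts:
        if kind == "const":
            env[target] = type(operand).__name__   # known at compile time
        else:  # 'copy': inherit the source's (possibly unknown) type
            env[target] = env.get(operand)
    return env

env = infer([
    ("a", "const", "blah"),     # my str-ish $a = 'blah'
    ("b", "copy", "a"),         # my $b = $a      -> provably a str
    ("c", "copy", "mystery"),   # copied from an unknown var -> no guarantee
])
```

Under approach #3, passing $b to a str-typed sub would be silently accepted, while passing $c would be the compile-time error.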
Re: a thought on multiple properties
On Thursday, March 13, 2003, at 01:23 PM, Dave Whipp wrote:

    Michael Lazzaro wrote:

        Defining a Class for this is also overkill.

        Ye.. well, no. Why?

    class Foo is Bar;   # normal inheritance
    class Baz is Bar;   # the thing that we are over-killing

    Foo.isa("Baz") == FALSE;

    A lightweight, typedef-like mechanism behaves differently:

    class Foo is Bar;
    typedef Baz is Bar;

    Foo.isa("Baz") == TRUE;

Ah, I get it. But why would you want that -- treating Foo and Baz as synonymous? Shouldn't you always be using Baz instead of Foo, if you really mean Baz and not Foo, and vice versa? Because later on, if you changed it such that:

    class Foo is Bar;
    typedef Baz is Bar is blarpy;

    Foo.isa("Baz") == FALSE;   # BOOM!

...which would break anything that relied on the symmetry.

Mind you, I'm not really against the idea, I'm just devil's advocating -- trying to think whether we really need the feature, or whether we just _think_ we need it because we're all used to it from C, when in fact P6 will provide better ways of doing it. (?)

MikeL
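The distinction Dave is drawing maps cleanly onto subclassing versus aliasing in any class-based language. A small Python sketch (names taken from the example; the alias name is mine):

```python
class Bar:
    pass

class Foo(Bar):       # 'class Foo is Bar' -- ordinary inheritance
    pass

class Baz(Bar):       # 'class Baz is Bar' -- a distinct sibling class
    pass

BazTypedef = Bar      # 'typedef Baz is Bar' -- just another name for Bar

# Foo.isa("Baz") is FALSE for the sibling class...
sibling = isinstance(Foo(), Baz)          # False
# ...but TRUE for the typedef, since BazTypedef *is* Bar:
aliased = isinstance(Foo(), BazTypedef)   # True
```

And the "BOOM" follows immediately: the moment the typedef becomes a real subclass (to add a trait like 'blarpy'), the isinstance result flips, breaking anything that relied on the alias behavior.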
Re: a thought on multiple properties
On Thursday, March 13, 2003, at 01:05 PM, Austin Hastings wrote:

    More to the point:

        type sigfunc is interrupt is reentrant;

        sub sig_ign()  is sigfunc {...}
        sub sig_kill() is sigfunc {...}
        sub sig_intr() is sigfunc {...}

This is WAGging based on A6, but I guess I see things like this as being implemented by making subs that inherit from subs:

    class sigfunc is sub (...default signature...) is interrupt is reentrant;

    sub sig_ign(...alternate signature...) is sigfunc {...}
    sub sig_kill is sigfunc {...}
    sub sig_intr is sigfunc {...}

    sigfunc sig_foo {...}   # could you also say it like this, I wonder?

Since C<sub> is itself a class, you can subclass it. And since A6 indicates that the signature, traits, and even implementing body are just properties of a given C<sub> "object", you should be able to override them individually if you want, for example, an alternate signature. At least, I'm hoping something like that works -- there's a lot of guessing there.

    type null but defined but false;
    ...
    return undef but null;

Hmm... I'm not entirely sure how that works for runtime properties... but what about

    class null is void but defined but false;

    return undef but null;

Would something like that be OK? Essentially using 'void' as a marker that you're defining a (heh) classless class? I'd really like to avoid making a separate keyword for combining traits; I'd love for it to just use the normal class inheritance mechanisms.

    class CatTable is Hash of Array of Array of Hash of Array of Cat;

    my %pet is CatTable;
    sub feed (%cats is CatTable);

MikeL
ISSUE: How is C spelled? (was Re: A6: Signature zones and such)
On Wednesday, March 12, 2003, at 04:07 PM, Piers Cawley wrote:

    Michael Lazzaro <[EMAIL PROTECTED]> writes:

        Can we get a final answer, for the (documented) record?

            @list is variadic
            @list is slurpy
            @list is greedy
            @list is slurpificatious
            @list is slurptacular
            @list is bloated

    @list is greedy

Anyone? C<variadic> or C<greedy> are both good with me too. As cute as C<slurpy> is, I'm not sure it's as obvious. And us American types will get it confused with the 7-11 spelling, C<Slurpee>, probably. :-)

POMTC, 0.0005%. Don't care what it is, as long as it is.

MikeL
Re: a thought on multiple properties
On Thursday, March 13, 2003, at 12:04 PM, Mark Biggar wrote:

    What we do need is some way of bundling a bunch of traits together
    under a simple name.

Yes, yes, yes.

    Defining a Class for this is also overkill.

Ye.. well, no. Why?

    So instead of saying:

        my %pet is Hash of Array of Array of Hash of Array of Cat;
        sub feed (%cats is Hash of Array of Array of Hash of Array of Cat)
            {...}

    You could say

        trait cat_table is Hash of Array of Array of Hash of Array of Cat;

        my cat_table %pet;
        sub feed (cat_table %cats) {...}

I think classes are not necessarily the heavyweights some people might expect them to be... I think of them more as types, actually. Basically, if you replaced the word 'trait' with 'class', I think the current plan is that you can do exactly what you're suggesting:

    class CatTable is Hash of Array of Array of Hash of Array of Cat;

    my %pet is CatTable;
    sub feed (%cats is CatTable);

(Note I fixed the last lines to use the right syntax... before, you were actually saying that %pet was a Hash of CatTables...)

MikeL
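For comparison, typed languages today do offer exactly this "bundle a big compound type under one short name" facility without a full class. A Python sketch of the CatTable example using a type alias (Cat and the nested shape come from the thread; feed's body is my own illustration):

```python
from typing import Dict, List

class Cat:
    pass

# 'Hash of Array of Array of Hash of Array of Cat', aliased to one name:
CatTable = Dict[str, List[List[Dict[str, List[Cat]]]]]

def feed(cats: CatTable) -> int:
    # Illustrative body: count top-level litters, just to use the alias.
    return sum(len(litters) for litters in cats.values())
```

The alias costs nothing at runtime -- it is purely a name for the long type, which is the typedef-ish lightness Mark is asking for.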
Re: Operators and context
On Wednesday, March 12, 2003, at 05:03 PM, Deborah Ariel Pickett wrote:

    Sort of a rehash on an old topic, but there's new stuff now with A6.
    Mike Lazzaro had been making a list of all the operators that Perl6
    has. The latest version I could find was Take 6 (at
    http://archive.develooper.com/[EMAIL PROTECTED]/msg12130.html).
    So, my questions:

    1. Is there a more recent version of this list?

Nope, I think that version is still good. There will be the addition of the piping ops <== and ==>, and I'm not sure if (a) the Unicodeness of >>op<< was ever decided upon, and (b) whether there's still an >>op<< vs. <<op>> issue. Note also that some of the things may or may not be "real" operators, but those should all be clearly marked in that version.

    2. Perhaps this list ought to be expanded to specify how the operators
    relate to context (e.g., C<+> applies numeric context to LHS and
    RHS). I'm happy to give this a go, but I'd prefer the most recent
    operator list first.

It'll be huge, but needs to be done, that's for sure. I'd certainly be happy if you gave it a shot!

    3. Speaking of context, what's the complete tree of contexts now? Am I
    missing anything from this? Are ArrayRef and HashRef et al still
    needed, or are we going away from the %{...} notation for
    dereferencing a hash?

A6 implies that there will be a knowable context for every type, so your type-context tree probably looks nearly identical to the P6 type hierarchy. But what that full type tree is, I dunno. :-) We left it hanging...

-- Is a C a C, or is a C a C?
-- Is an C<Int> a C<Num>, or are they both subclasses of C<Scalar>?
-- Are things like C actually called C or similar?
-- Is there a difference between C<lazy> and C<eager> context?
-- etc.

We also talked on-list about whether 'void' could be considered a type, but nothing came out of it. So this is definitely an area that needs some serious help...
    Type context
     |
     +--- Scalar
     |     +--- Bool
     |     +--- Num
     |     |     +--- Int
     |     +--- Str
     |     +--- Ref
     |           +--- HashRef
     |           +--- ArrayRef
     |           +--- CodeRef
     |           +--- ScalarRef
     |
     +--- List
     |     +--- (lazy and eager?)
     |
     +--- Void

(Is there a Pair context? PairRef? My guess is no.)

(There's also Object, Code, and all the others in A6, approx. page 8.)

MikeL
A6: Type Coercions (was Re: A6: Strict signature checking)
I think the issue of type coercion (forcing one type to another) should be decided separately from the issue of "implicit" types (recognizing when an untyped variable can be KNOWN, at a given point, to hold a specific type, even if it isn't explicitly stated).

As far as true coercion goes: for the sake of example, let's assume we have a class, C<MyClass>, that doesn't inherit from str, but that can be converted to a str if necessary.

    sub foo(str $s);

    my str $a = 'blah';
    my MyClass $myclass = MyClass.new;

    foo($a);        # OK
    foo($myclass);  # COMPILE ERROR; WRONG TYPE

That last line is a type coercion... we know it's one type, but it needs to be another. The previous implication is that you'd do it like this:

    foo($myclass.str);   # OK

I'm not keen on that syntax, personally, because it means you're cluttering up the method namespace of MyClass with every possible type/class you want to be able to convert to, which I have to think would cause problems in larger libraries. I at one point suggested:

    class MyClass {
        to str {...}
        to num {...}
        # etc
    }

    foo($myclass to str);

as a way to separate the 'coercions' namespace from the normal 'methods' namespace. But regardless, the A6 question being asked is whether the C<sub> or C<multi> can know when it's allowed to implicitly coerce arguments from one type to another, such that

    foo($myclass);

can work without having to first define 'foo' variants to explicitly handle each and every possible type. Possibilities:

(1) If MyClass isa str, or MyClass implements str, problem solved -- it will work on anything that a str works on, right? (But maybe you don't want your class to actually _implement_ str, because we don't ordinarily want it to be considered a str... we actually just want a one-directional way to _convert_ instances to strs.)
(2) Maybe we can mark a given C<multi> as being specifically willing to coerce an argument if necessary, probably with a trait:

    multi foo (str $s)      {...}  # Normal syntax
    multi foo ($s to str)   {...}  # OK, should attempt to COERCE $s to a str
    multi foo (str from $s) {...}  # alternate spelling???

... thereby identifying that we want to convert it if the compiler and/or runtime can figure out how. Would something like that achieve the desired effect?

My main points being: untyped-to-typed coercion is different from true type coercion, and the 'coerce to' and 'implements' relationships might be worthy of syntactic separation.

POMTC: 70%

MikeL
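The "willing to coerce" idea can be mimicked in Python by checking for a declared one-directional conversion at the call boundary (the to_str name and protocol are my invention, purely to make the shape concrete): coercion happens only when the argument isn't already the right type AND has explicitly opted in.

```python
class MyClass:
    """Doesn't inherit from str, but declares a one-way conversion."""
    def to_str(self):          # analogue of a 'to str {...}' coercion block
        return "blah"

def foo(s):
    # Coerce only if the argument isn't a str AND declares a conversion;
    # otherwise it's a type error, never a silent implicit cast.
    if not isinstance(s, str):
        conv = getattr(s, "to_str", None)
        if conv is None:
            raise TypeError("no declared coercion to str")
        s = conv()
    return s.upper()
```

Note the one-directional part: MyClass never becomes a str subtype (isinstance stays False), so nothing else starts treating it as a str -- only this opt-in boundary converts it.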
Re: A6: Strict signature checking
On Wednesday, March 12, 2003, at 11:07 AM, Damian Conway wrote:

    Austin Hastings wrote:

        In this case, I rather like the idea of being able to say

            sub foo(@a is Array of Int) {...}

            my @input = read_a_bunch_o_data();
            foo(@input);

        Where the compiler will automatically "wrap" the @input array in a
        make-it-an-int converter. This, to me, is DWIM.

    But to many others it's not DWIS ("Do What I Said"). To them, types
    are about compile-time checking of constraints on the run-time
    compatibility of data. So they would argue that declaring C<foo> like
    that implies that any argument passed to C<foo> ought to guarantee
    that it already contains Ints, rather than specifying a (possibly
    unsuccessful) run-time coercion to ensure that condition. And many
    would argue that implicit coercions on typed parameters are one of the
    major *problems* with C++.

Agreed. It should do compile-time verification, not runtime. That said, I still think there *might* be something to be said for compile-time 'hints' for explicitly _untyped_ values. Take the example:

    sub foo(str $s) {...}

    my str $a = 'blah';   # type 'str'
    my $b = 'blah';       # untyped, but set to a constant 'str'
    my $c = $a;           # untyped, but set to a known typed 'str'
    my $d = $a ~ $b;      # untyped, but known to be of type 'str'

    foo($a);   # OK
    foo($b);   # COMPILE ERROR
    foo($c);   # COMPILE ERROR
    foo($d);   # COMPILE ERROR

With strict typing, the last three lines are errors. But it is known that $b, $c, and $d were all set to known C<str> values and have not since been altered. Since the compiler *could* infer the type of these variables without it explicitly being stated, we *might* choose to catch those cases and make them non-errors. (You can actually track the 'type' hints quite a ways, if the operations you're doing produce typed values.)

That would allow the use of quickie untyped temporary variables -- for example, if you have a little ten-line script that uses type-aware CP6AN subs, but yer a lazy slob. Again, POMTC (percentage of me that cares): 50%. Not a showstopper.
Just an idea. MikeL
Re: A6: Signature zones and such
On Wednesday, March 12, 2003, at 11:14 AM, Damian Conway wrote:

    Larry wrote:

        : I agree. As long as it's not C!

        Of course not. We're trying to encourage the use of line noise,
        and discourage the use of the long variants, so the long one would
        have to be C.

    Riiight! Thank-you, General Haig. Of course, C (my own preference for
    the name of this trait) would probably have much the same effect. ;-)

Can we get a final answer, for the (documented) record?

    @list is variadic
    @list is slurpy
    @list is greedy
    @list is slurpificatious
    @list is slurptacular
    @list is bloated

:-)

MikeL
Re: A6: multi promotion
On Tuesday, March 11, 2003, at 12:39 PM, Austin Hastings wrote:

    You want C<multi> to tell the compiler to build in multiple dispatch.
    Any invocation of C<foo> after C<multi foo> is going to be a penny
    dropped into the great Pachinko game of multimethod-dispatchery. By
    default, if no winning multi appears, the call falls out the bottom
    and winds up invoking the original sub().

OK, hmm. What Damian is saying is that, tentatively, it's the reverse... it calls the sub if there's a sub, then the multi if there's a multi. So overriding a sub with a multi wouldn't work, but it would *silently* not work, because you could just never get to the multi version (well, not without a bit of introspection).

I agree that the issue of overriding an inherited/preexisting C<sub> -- like one from a CPAN module -- with a set of C<multi> implementations is a useful capability; it would allow you to extend predefined routines to handle different arguments without getting into OO. But I sure worry that it makes accidental redefinition of subs invisible in many cases. Dunno. I could argue that one both ways. Maybe it has to wait until A12.

    sub foo($a) {...}

    ... lots of code inbetween ...

    multi foo(int $a) {...}   # (???)
    multi foo(str $a) {...}

MikeL
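Austin's "Pachinko" model -- try the multis, fall through to the original sub -- is essentially how Python's functools.singledispatch behaves, which makes it a convenient sandbox for the semantics under discussion:

```python
from functools import singledispatch

@singledispatch
def foo(a):            # the original sub(): the fall-through case
    return "generic"

@foo.register
def _(a: int):         # a 'multi foo(int $a)' analogue
    return "int"

@foo.register
def _(a: str):         # a 'multi foo(str $a)' analogue
    return "str"
```

Here foo(1) hits the int variant, foo("x") the str variant, and foo(3.14) falls out the bottom to the generic version -- Austin's model, rather than the sub-shadows-multi behavior Damian tentatively describes.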
Re: A6: Signature zones and such
On Tuesday, March 11, 2003, at 05:18 PM, Damian Conway wrote:

    Various folks have suggested that the default assignment syntax:

        sub foo(?$bar is copy = 1) {...}

    be considered merely a shorthand for something like:

        sub foo(?$bar is copy is default(1)) {...}

    I don't know... maybe I'm worrying too much. But then, that's part of
    my job. ;-)

Are you concerned about having an C<is default> spelling at all, or just about optional placements of C<=>? I agree that C<is default(1)> vs. C<= 1 is constant> is a potential point of confusion, and probably not worth the grief. I don't know that that means we couldn't have an C<is default> spelling, though.

And C<is variadic> (or something easier to spell) for the * case. If we have C<is optional> and C<is default>, I think it would be appropriate to have names for the other linenoise as well.

(Percentage of me that really cares: 20%. But I suspect others might.)

MikeL
Re: A6: multi promotion
On Tuesday, March 11, 2003, at 11:19 AM, Austin Hastings wrote:

    But you can't wrap multi-ness, as far as I can tell.

    [A6]
    And it happens that the multimethod dispatch is smart enough to find
    the ordinary single-invocant sysread method, even though it may not
    have been explicitly declared a multimethod. Multimethod dispatch
    happens to map directly onto ordinary method dispatch when there's
    only one invocant. At least, that's how it works this week...
    [/A6]

    To me, this suggests that multithods can be bolted on top of unithods.
    And presumably, likewise multisubs can be bolted atop Plain Old Subs
    (POSs).

I *think* that's just talking about the internal dispatch mechanism -- e.g. the dispatcher treats one-invocant subs and multis the same way, and can map between them if necessary. I don't think it's talking about the semantic/syntactic rules for declaring them, just the guts of what happens when you eventually call one. So I still surmise it's a semantic error to override a C<sub> with a C<multi>. (It doesn't have to be, internally, but otherwise I'm not sure why you'd want a separate keyword C<multi> at all.)

    sub foo($a,$b,$c) {...}

    ... lots of code inbetween ...

    multi foo(int $a) {...}   # ERROR - can't multi an existing sub
    multi foo(str $a) {...}

In your example, where you have two separate modules that both have a C<foo>:

    module CPANthing;

    sub foo($x) {...}

    module MyThing;

    sub foo($x) {...}       # ERROR - redefined: multi won't, but "my" will fix
    sub foo(int $x) {...}   # ERROR - must use multi
    multi foo(int $x) {...}        # OK
    multi foo($x: @a) {...}        # OK
    multi foo($x: ?$y = 0) {...}   # WARNING: compile-time or run-time?

I would put the errors in different places, because there's nothing wrong with having an identically named function in two different namespaces, right?
So I would rewrite that:

    module CPANthing;
    sub foo($x) {...}

    module MyThing;
    sub foo($x) {...}            # OK - we're in a different namespace now
    sub foo(int $x) {...}        # ERROR - must use multi
    multi foo(int $x) {...}      # ERROR - can't multi an existing sub
    multi foo($x: @a) {...}
    multi foo($x: ?$y = 0) {...}

Or am I completely spacing on something?

MikeL
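(An illustrative aside: the Perl 6 sketched in this thread was never finalized, but the namespace point translates directly to Python, which is used here as a stand-in. The module names mirror the hypothetical CPANthing/MyThing example; nothing below is real Perl 6 semantics.)

```python
# Two same-named 'foo' functions in two different namespaces never
# collide -- the point MikeL makes about CPANthing vs. MyThing.
import types

cpan_thing = types.ModuleType("CPANthing")
my_thing = types.ModuleType("MyThing")

cpan_thing.foo = lambda x: f"CPANthing::foo({x})"
my_thing.foo = lambda x: f"MyThing::foo({x})"   # OK - different namespace

print(cpan_thing.foo(1))   # CPANthing::foo(1)
print(my_thing.foo(1))     # MyThing::foo(1)
```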
Re: A6: multi promotion
On Tuesday, March 11, 2003, at 06:42 AM, Richard Proctor wrote:

> If one has a simple sub such as factorial:
>
>     sub factorial(int $a) {...}
>
> then one subsequently declares the multi form of factorial to pick up
> the non-integer form:
>
>     multi factorial(num $a) {...}
>
> Does this promote the original declaration of factorial to a multi?
> If not, what happens?

I would *strongly* suspect that it would fail, saying "can't redeclare 'factorial'" or something. The idea behind C<multi> is that if you're giving multiple possible signatures to a function, you have to do so *explicitly*. Otherwise, you might just be accidentally overriding some previous sub when you didn't really mean to -- and you'd _really_ like to know when that was happening.

    sub foo($a,$b,$c) {...}

    ... lots of code inbetween ...

    multi foo(int $a) {...}
    multi foo(str $a) {...}

You might have forgotten, when declaring the two C<multi>s, that -- oops -- you had already used that subroutine name for something completely different! So you'd want it to tell you if you were redefining the C<foo> sub.

So I'm betting it's an error. You have to go back and make the first one a C<multi>, if that's what you really meant to do.

(Note that this means you can't give alternate signatures to functions that you've pulled in from a CPAN-style library, unless the library has given you permission to do so by making the functions C<multi> in the first place. Probably a good idea, on balance. You can do similar things with wrapper functions.)

MikeL
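(A minimal sketch of the "multis must be explicit" rule in Python, since the Perl 6 here was only a design. The registry, the `declare`/`call` helpers, and the error message are all invented for illustration; the point is only that adding a second signature to a plain sub refuses loudly, while declared multis accumulate variants.)

```python
# name -> {"multi": bool, "variants": {arg_type: fn}}
registry = {}

def declare(name, arg_type, fn, multi=False):
    entry = registry.get(name)
    if entry is None:
        registry[name] = {"multi": multi, "variants": {arg_type: fn}}
    elif not entry["multi"] or not multi:
        # redefining a plain sub, or mixing sub and multi: an error,
        # mirroring "can't redeclare 'factorial'"
        raise TypeError(f"can't redeclare '{name}' (use multi)")
    else:
        entry["variants"][arg_type] = fn

def call(name, arg):
    # dispatch on the runtime type of the single argument
    return registry[name]["variants"][type(arg)](arg)

declare("factorial", int,
        lambda n: 1 if n < 2 else n * call("factorial", n - 1),
        multi=True)
declare("factorial", float,
        lambda x: f"gamma({x + 1})",   # stand-in for the non-integer form
        multi=True)

print(call("factorial", 5))     # 120
print(call("factorial", 0.5))   # gamma(1.5)
```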
Re: A6: Signature zones and such
On Tuesday, March 11, 2003, at 08:41 AM, Brent Dax wrote:

> Almost makes you wish for those backwards declarations from C that
> computer scientists always gripe about, eh? :^)
>
> Well, what about this?
>
>     multi substr(Str $str,
>                  $from = $CALLER::_ is optional,
>                  $len = Inf is optional,
>                  $new is optional)

Well, if we have alternate spellings of all the markers:

    $arg is default(1)    # same as $arg = 1
    $arg is optional      # same as ?$arg
    $arg is named         # same as +$arg

    @list is variadic     # same as *@list
    @list is slurpy       # (possible alternate spelling)
    @list is greedy       # (possible alternate spelling)

This would lend a little more credence to the notion that you can put the C<=> in whatever order you want. Well, maybe. OK, maybe not. But if you have an C<is default> spelling, you wouldn't care so much...

Wimpy, readable code:

    multi substr ( Str $str,
                   $from is default($CALLER::_) is optional,
                   $len  is default(Inf) is optional,
                   $new  is optional ) {...}

    # is same as

    multi substr ( Str $str,
                   $from is optional is default($CALLER::_),
                   $len  is optional is default(Inf),
                   $new  is optional ) {...}

Manly, expert code:

    multi substr ( Str $str, ?$from = $CALLER::_, ?$len = Inf, ?$new ) {...}

MikeL
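(For comparison, the optional/default markers above map loosely onto Python's default parameters -- a rough stand-in, not the Perl semantics. `$CALLER::_` has no Python analogue, so the start defaults to 0 here; `math.inf` stands in for `Inf` meaning "to the end". All names below are invented for illustration.)

```python
import math

def substr(s, start=0, length=math.inf, new=None):
    """Extract a substring; if 'new' is given, return s with that
    region replaced -- loosely mirroring the 4-arg substr signature."""
    end = len(s) if length is math.inf else start + int(length)
    if new is not None:
        return s[:start] + new + s[end:]   # replacement form
    return s[start:end]

print(substr("hello world", 6))              # world
print(substr("hello world", 0, 5))           # hello
print(substr("hello world", 0, 5, "howdy"))  # howdy world
```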
Re: A6: Complex Parameter Types
Larry wrote:

: > multi foo (@a is Array of int) {...}
: >
: > my int @a = baz();   # is Array of int
: > my @b = baz();       # is Array of Scalar
: >
: > foo(@a);   # @a is typed correctly, so OK
: > foo(@b);   # @b is not explicitly typed as C<Array of int>; OK or FAIL?
:
: I dunno.  I can argue that it should coerce that.

Damian wrote:

> Hmm. I think that IF there's a significant speed issue there at all,
> it should just fail.

(IMO, it's more important to be able to optimize for speed.) But the issue is of course that when people make temporary arrays, I imagine they'll seldom be explicitly typed, because we all like to be lazy slobs:

    sub baz returns Array of int {...}

    my @temp = baz();

Since this is such a common case, it might be possible to leave a lazy 'hint' after the assignment such that @temp is treated as an C<Array of int>, because we know for a fact that's how it was (last) assigned. The next operation on @temp would clear the hint:

    push @temp, $i;   # well, now we don't know *what* type @temp returns

(I hate to niggle at all over such a stupid little edge case, but I worry it's going to be terribly common.)

MikeL

P.S. I'm so happy I'm getting my multimethod-friendly Array of Array of Cats! Herding Cats is so much easier when they're placed in explicitly typed multidimensional buckets!
A6: Complex Parameter Types
In A6, it is confirmed that you can have "complex" types such as:

    my %pet is Hash of Array of Array of Hash of Array of Cat;

It is also confirmed that you can indeed use such types in sub signatures, e.g.:

    sub foo (@a is Array of int) {...}

Confirmations/Questions:

1) Complex types for sub parameters: The above would imply that a sub can tell the difference between an C<Array of int> vs. an C<Array of str>, thank goodness. That also implies that you can use arbitrarily complex types, and still get the same type checking:

    sub foo ( %pet is Hash of Array of Array of Hash of Array of Cat ) {...}

Yes/No?

2) Multimethod dispatch: The text would seem to _IMPLY_ that you can perform multimethod dispatch based on complex types, but it isn't actually stated anywhere AFAICT. e.g.

    multi foo (@a is Array of str) {...}
    multi foo (@a is Array of int) {...}

... is it legal, and DWYM?

3) The edge point between explicitly typed and explicitly non-typed variables: If you pass an "untyped" array (or list?) to an explicitly typed array parameter, is the "untyped" array considered a unique case, or will it fail?

    multi foo (@a is Array of int) {...}

    my int @a = baz();   # is Array of int
    my @b = baz();       # is Array of Scalar

    foo(@a);   # @a is typed correctly, so OK
    foo(@b);   # @b is not explicitly typed as C<Array of int>; OK or FAIL?

MikeL
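(Question (2) -- dispatch on the element type of a container -- can be sketched in Python as a toy. Python has no native multimethods, so this inspects elements at call time; it is an illustration of the question, not a claim about how Perl 6 dispatch was actually specified.)

```python
def foo_int(xs):
    return sum(xs)            # stands in for: multi foo (@a is Array of int)

def foo_str(xs):
    return ",".join(xs)       # stands in for: multi foo (@a is Array of str)

def foo(xs):
    # pick a variant by inspecting the element types of the list
    if all(isinstance(x, int) for x in xs):
        return foo_int(xs)
    if all(isinstance(x, str) for x in xs):
        return foo_str(xs)
    raise TypeError("no matching variant for element types")

print(foo([1, 2, 3]))         # 6
print(foo(["a", "b", "c"]))   # a,b,c
```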
A6: Pipes
Since no one else has said it yet -- this Apoc looks *great*. The sig stuff is very, very nice. (The wrapper stuff has interesting possibilities, too, especially with OO.)

Question on pipes: I like very much the concept of relating them only to the variadic list; that was the piece we were all missing in the P6L discussions. After reading that appendix, I'm still a bit murky on the final decisions as to which of these edge cases will be allowed:

    my @out <== (1,2,3);
    my @out <== (my @in = foo());
    my @out <== foo();

    (1,2,3) ==> my @out;
    (my @in = foo()) ==> my @out;
    foo() ==> my @out;

Are these all valid, or do some of them have to be errors? I got lost in the appendix explanation of what was confirmed, and what was wishful-only...

FWIW, stylistically, I'd personally vote for:

    @in ==> map {...}
        ==> sort {...}
        ==> map {...}
        ==> @out;

as better than:

    @in ==> map {...} ==> sort {...} ==> map {...} ==> @out;

... if we care about such details. :-)

MikeL
Re: Arrays, lists, referencing
On Saturday, February 15, 2003, at 08:47 AM, David Storrs wrote:

> I can see five possible courses here:
>
> 1) We decide that my suggestion is a bad one and do nothing with it.
> That's fine; I am not wedded to it, I just thought it was an
> interesting idea that I wanted to raise.
>
> 2) (4, 1, 2) + 7 returns (9). This is C comma behavior, and I always
> found it incredibly non-intuitive. I'd really like to get away from
> this, even if it means that this expression is a fatal error "Can't
> add scalar to list".
>
> 3) (4, 1, 2) + 7 returns (10), by adding 7 to the length of the list.
> This makes lists look even more like arrays, and doesn't really add
> any new power to the language.
>
> 4) (4, 1, 2) + 7 returns (11, 8, 9). This is a convenient shorthand
> for the vector syntax, IMO.
>
> 5) (4, 1, 2) + 7 returns (14). That is, the list is collapsed into a
> datatype matching the "RHS" by iteratively applying the operator in
> question to the list elements, and then the item on the RHS of the
> operator is applied. I'm not sure this is useful; I'm just exploring
> options.

IMHO the only reasonable possibilities are (2) or (3)... the others are much rarer in practice, and too prone to accidental-invocation-with-baffling-results. Agreed, however, that (2) is icky.

My worry has been that removing C-comma behavior would break common constructs, but I haven't been able to find any that would really break (except obfuscated ones that would be better written in some other fashion anyway.) Statements like:

    foo() or (warn("blah"), next);

work either way, because they don't rely on getting the "scalar value" of the list. So, IMO, the only reasonable answer is (3)... that a list in numeric context returns the length.
Thus we have consistency between lists and arrays:

    (1,2,3) + 4       # --> (1,2,3).length + 4 --> 7   (list)
    [1,2,3] + 4       # --> [1,2,3].length + 4 --> 7   (array ref)

    my @a = (1,2,3);
    @a + 4            # --> @a.length + 4 --> 7        (array var)
    *@a + 4           # --> (*@a).length + 4 --> 7     (list)
    (@a,@a) + 4       # --> 3 + 3 + 4 --> 10           (list)

Alternatively, we could say that using a list in numeric context is a syntax error. This is fine w/ me as well, but pointedly *doesn't* match the array behavior... and would mean the second-to-last line would also be a syntax error. I think the consistency of behavior probably means (3) wins.

MikeL
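(Option (3) -- "numeric context returns the length" -- has a direct Python echo in `len()`; restating the consistency table above as runnable assertions, purely as an analogy:)

```python
# "numeric context" rendered as len(): every row of the table gives
# the same answer whether the value is a literal, a variable, or a
# flattened combination.
a = [1, 2, 3]

assert len((1, 2, 3)) + 4 == 7     # literal list
assert len([1, 2, 3]) + 4 == 7     # array-ref analogue
assert len(a) + 4 == 7             # array variable
assert len([*a, *a]) + 4 == 10     # flattened (@a,@a): 3 + 3 + 4
print("all rows consistent")
```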
Re: Arrays, lists, referencing
On Wednesday, February 12, 2003, at 05:50 PM, Deborah Ariel Pickett wrote:

> All right, I'm prepared to buy that. Now how would it extend to
> hashes?
>
>     A %hash in list context returns a list of its pairs (NOTE4)
>     A %hash in scalar context returns a reference to itself (NOTE1)
>     A %hash in numeric (scalar) context returns (?)
>     A %hash in string (scalar) context returns (?)
>
>     A $hashref in list context returns a hashref (NOTE2)
>     A $hashref in scalar context returns a hashref
>     A $hashref in numeric (scalar) context returns (?)
>     A $hashref in string (scalar) context returns (?)
>
> (NOTE4): Or is it a flattened list of key-values?

As far as NOTE4, I don't think they've decided yet. At least, I can't seem to find confirmation of it. One possibility for the (?) part is of course that numeric context returns the number of keys, and that string context returns a pretty-printed list of key => value pairs. That would seem the most obvious answer, but it might not be the right one. We have to get that verified too.

> And how would it extend to the finer-grained contexts we're getting
> in Perl6 (integer numeric scalar context, hashref context, ...)? Our
> complete list of contexts now is quite a hierarchy.

Yeah, I'm waiting eagerly for A6 to talk about that, but it shouldn't be too hard. Certainly, context seems to be a simple tree. So 'want' can return true for multiple things...

    my int $i = foo();

    # now inside foo:
    want scalar;   # true
    want numeric;  # true
    want int;      # true
    want str;      # false

I've been told with great consistency that there's not really going to be complete, typedef-style contexts... e.g., you'll be able to tell "hash" context, but you won't be able to tell the difference between "hash of ints" and "hash of strs" context. Which is a shame, IMO, since multimethods could benefit from as much info as possible, but we'll see what they come up with.

> An @array in nonbinding list context returns a list of its elements.
> An @array in binding list context returns the symbol table reference
> for itself.
> An @array in nonbinding scalar context returns a reference to itself.
> An @array in binding scalar context returns the symbol table
> reference for itself.
>
> Would that fly? If so, I'd expect the new generic want() operator to
> be able to detect it.

Huh, I never really thought of it that way, but I suppose it would have to be something like that. So you can overload the binding operator C<:=>...

MikeL
Re: Arrays, lists, referencing (was Re: Arrays vs. Lists)
On Tuesday, February 11, 2003, at 04:56 PM, Deborah Ariel Pickett wrote:

> But is it OK for a list to be silently promoted to an array when used
> as an array? So that all of the following would work, and not just
> 50% of them?
>
>     (1..10).map {...}
>     [1..10].map {...}
>
> And somehow related to all this . . .

I think some of this is in A2, but not all of it. Here are some of the answers from my own notes. These behaviors have all been confirmed on-list by the design team:

    An @array in list context returns a list of its elements
    An @array in scalar context returns a reference to itself (NOTE1)
    An @array in numeric (scalar) context returns the number of elements
    An @array in string (scalar) context returns a join of its elements

    An $arrayref in list context returns an arrayref (NOTE2)
    An $arrayref in scalar context returns an arrayref
    An $arrayref in numeric (scalar) context returns ??? (NOTE3)
    An $arrayref in string (scalar) context returns ???

Note that that's pretty consistent with how it works now.

(NOTE1): This is the big change. It's what allows us to treat arrays as objects, and call methods on them like @array.length. I don't think anyone will argue that's not a good thing.

(NOTE2): Note that this is a non-change. If we changed it so that an arrayref flattened itself in array context, you could never have complex data structures, because [[1,2],[3,4]] would always be the same as [1,2,3,4].

(NOTE3): I have not been able to find explicitly confirmed behaviors for these two. It has been implied that they return $arrayref.length and $arrayref.string (or whatever those methods are called). Maybe.

--- List Flattening ---

The confusing behavior is, of course, that the list (@a,@b,@c) is seen as being treated differently in different syntactic contexts. In the case of:

    sub foo(@p1,@p2,@p3);

    &foo(@a,@b,@c);

the arrays @a, @b, and @c are NOT flattened, but are passed as @p1, @p2, and @p3. Likewise, in:

    my(@d,@e,@f) := (@a,@b,@c);

the same is true. But in ALL other circumstances, like:

    my(@d,@e,@f) = (@a,@b,@c);

an array in list context simply returns its elements, such that @d = (@a,@b,@c), @e = (), @f = (). So what's the deal?

My own two-sentence explanation for why this is, is that in the first two examples -- the sub call and C<:=> -- you're binding one variable to another, NOT dealing with the array-ness of those variables at all. E.g. @a := @b makes @a refer to the same array object as @b refers to, whereas @a = @b simply says to copy all elements _contained within_ @b into @a.

So it's not that arrays are behaving differently in different situations, because they're NOT... the same rules always apply. It's just that sub-call binding and C<:=> are specific, binding-style operations... they do the same thing for scalar variables, too.

There, how convincing did that sound?

MikeL
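(The binding-vs-copying distinction has a familiar Python echo, offered here only as an analogy: plain name assignment rebinds a name to the same container, the way C<:=> is described above, while slice assignment copies elements between containers, the way list assignment does.)

```python
a = [1, 2, 3]
b = a          # "binding": b and a now name the same container
b.append(4)
print(a)       # [1, 2, 3, 4] -- a sees the change, like @b := @a

c = [0, 0]
c[:] = a       # "assignment": copy a's elements into c's own container
c.append(5)
print(a)       # [1, 2, 3, 4] -- unchanged, like @c = @a
print(c)       # [1, 2, 3, 4, 5]
```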
Re: Arrays vs. Lists [x-adr]
On Tuesday, February 11, 2003, at 10:56 AM, Garrett Goebel wrote:

> > What about this?
> >
> >     \@array
>
> hmm. As per Apoc2, Lists, RFC 175... arrays and hashes return a
> reference to themselves in scalar context... I'm not sure what
> context '\' puts them in. I'd guess \@array is a reference to an
> array reference.

I understand the logic, but:

    my $r = @a;    # ref to @a
    my $r = \@a;   # ref to ref to @a ???

    my @array = (\@a,\@b,\@c);   # array of three arrayrefs

Boy howdy, I think that would freak people. But making '\' put them in list context would of course be far worse:

    @array = (\@a);   # means @a = ( \@a[0], \@a[1], ... ) ???

So I think '\' just puts things in C<ref> context, which solves the problem and always does The Right Thing, I think. So the context rules for arrays are:

    - in scalar numeric context, returns num of elements
    - in scalar string context, returns join of elements
    - in scalar ref context, returns a ref
    - in generic scalar context, returns a ref

IMO.

MikeL
Re: Arrays vs. Lists
On Monday, February 10, 2003, at 06:26 PM, Joseph F. Ryan wrote:

> Deborah Ariel Pickett wrote:
>
> > (Just going off on a tangent: Is it true that an array slice such
> > as @array[4..8] is syntactically equivalent to this list
> >     (@array[4], @array[5], @array[6], @array[7], @array[8])
> > ? Are array slices always lists in Perl6?)
>
> I think so, unless its possible to do crazy things like reference
> part of an array. Maybe @array[4..8] is a list, and \@array[4..8]
> acts like an array. Or maybe \@array[4..8] is actually
> ( \@array[4], \@array[5], \@array[6], \@array[7], \@array[8] ),
> like it is in perl 5. If it keeps that behavior, then @array[4..8]
> is always a list.

What is the utility of the perl5 behavior:

    \($a,$b,$c)   # meaning (\$a, \$b, \$c)

Do people really do that? I must say, given that it looks *so obviously* like it instead means [$a,$b,$c], I wonder if attempting to take a reference to a list should be a compile-time error. Note that this is still OK:

    \($a)   # same as \$a

because as previously discussed, it's the commas making the list, not the parens. But \($a,$b,$c) seems like a bug waiting to happen. I don't use it. Can someone give an example of an actual, proper, use?

> What joy I'll have explaining that one to my students . . . Groan.

Yeah. I feel your pain. :-|

MikeL
Re: Arrays vs. Lists
On Monday, February 10, 2003, at 05:56 PM, Luke Palmer wrote:

> Indeed, this supports the distinction, which I will reiterate:
>
>     - Arrays are variables.
>     - Lists are values.

My hesitation about the 'arrays are variables' part is that Damian corrected me on a similar thing when I was writing about scalars. A variable is more like "a name of a container for a value", e.g. there's three parts to it:

    - the name (what it's called in the namespace)
    - the container (a specific container implementation)
    - the value (what's inside it)

So I don't know that arrays are variables, so much as arrays are containers, if we want to get pedantic about it (which I don't, but... documentation... sigh). Just to clarify... in P6, is this an array reference, or a list reference?

    [1,2,3]

What about this?

    \@array

I'd say both of them are array references, but there's no variable associated with the first one -- it's just an anonymous container. So I'd rewrite the definition to:

    - Lists are an ordered collection of scalar values
    - Arrays are containers that store lists

(Coupled with Uri's explanations, of course... it's the 'container' part that allows read/write, as opposed to simply read.) Yes/no?

> Arrays are things that know about lists. They know how to get a
> particular element out of a list. They know how to flatten
> themselves, interpolating themselves into the surrounding list. They
> know how to map, grep, sort, splice themselves. They know how to
> turn themselves into a scalar. Lists don't know how to do these
> things.

But is it OK for a list to be silently promoted to an array when used as an array? So that all of the following would work, and not just 50% of them?

    (1..10).map {...}
    [1..10].map {...}

    (@a,@b,@c).pop
    [@a,@b,@c].pop

MikeL
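(The "lists are values, arrays are containers" framing has a loose Python echo, used here purely as an analogy: a tuple is an immutable ordered value, a list is a mutable container, and "promotion" is an explicit, cheap conversion.)

```python
value = (1, 2, 3)          # list-ish: just an ordered value
container = list(value)    # array-ish: a container that can pop, sort, splice

container.pop()            # containers allow read/write
print(container)           # [1, 2]
print(value)               # (1, 2, 3) -- the value itself is untouched
```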
Re: A4 aliasing syntax (and a note on statement modification)
On Saturday, February 8, 2003, at 02:53 AM, Luke Palmer wrote:

> If you're talking about your own C<grep> example, actually, this
> would match it better:
>
>     grep $x <- @list { $x eq 3 }
>
> But if you're talking about A4's:
>
>     grep @list -> $x { $x eq 3 }
>
> Which is very close to (one of) the currently valid:
>
>     grep @list: -> $x { $x eq 3 }
>
> (In Perl 6 there will be many ways to do C<grep>s and C<map>s
> syntactically.) My guess is that Larry wanted $x to appear before the
> block it will be used in, and that C<grep>'s swapping of block and
> list (when compared to C<for>) makes doing so ugly (IMO).
>
> There has been some inconclusive discussion about unifying those two
> syntaxes. As you can see from my example above, you can get pretty
> close, so nobody seems to be complaining anymore.

I'm personally still hoping for a unification there... see my note from Jan 23, "Re: Why C<for> needs work" for some ideas. Maybe someone will think of a way to do it, at some point. We stopped that discussion because Dan S. begged for mercy, at least until A6 comes out.

MikeL
Re: Arrays vs. Lists
On Friday, February 7, 2003, at 04:24 PM, Uri Guttman wrote:

> ML> \(1,2,3)
> ML> returns an array reference...
>
> in perl5 it returns a list of refs ( \1, \2, \3 ). i dunno the perl6
> semantics. it could be the same as [ 1, 2, 3 ] which means it is not a

Sorry, I was misremembering a thread. I remember (vaguely) now... can't do what I suggested because it's something like \($x) should never be a list ref, which means we would have to treat parens differently depending on how many things are inside them, etc., which pointedly won't work.

If someone remembers when that damn thread happened, or better still remembers the outcome (if any), drop me a pointer?

MikeL
Re: Arrays vs. Lists
On Friday, February 7, 2003, at 03:38 PM, Uri Guttman wrote:

> but you can't derive the rules about allowing push/pop/splice/slice
> from that pair of definitions.

Is there any syntactic reason why both of the following cannot be allowed?

    (1,2,3).pop
    [1,2,3].pop

I don't know that one is any more/less useful than the other, and it would seem a list could be silently promoted to an array where it is used as an array. For example:

    \(1,2,3)

returns an array reference...

MikeL
Re: Arrays vs. Lists
On Friday, February 7, 2003, at 02:07 PM, Uri Guttman wrote:

> the whole notion is that lists are always temporary and arrays can be
> as permanent as you want (an array ref going quickly out of scope is
> very temporary). lists can't live beyond the current expression but
> arrays can.

Along those lines, the closest I've been able to come so far to a usable two-sentence definition is:

    -- A list is an ordered set of scalar values.
    -- An array is an object that stores a list.

But I'm not sure that holds water.

MikeL
Arrays vs. Lists
I'm trying, and failing, to accurately and definitively answer the question "what's the difference between an array and a list in Perl6?" If someone can come up with a simple but accurate definition, it would be helpful. MikeL
Re: summarizing the obvious
On Friday, January 31, 2003, at 09:40 AM, Garrett Goebel wrote:

> Or for the extremely thick:
>
>     GOOD: Separate syntax for indexed vs. named lookups
>     BAD:  Same syntax with >= 2 contextual meanings

Seriously, everyone read Damian's "Seven Deadly Sins" thing, if ya haven't read/heard it already. It's quite short, and quite good at pointing out Things That Suck about programming languages.

    http://www.csse.monash.edu.au/~damian/papers/PDF/SevenDeadlySins.pdf

> I'm disappointed that The Perl Foundation (TPF) has been so quiet and
> unresponsive on support for our core language designers and
> architects. I dropped a note to all the TPF contacts over a week ago,
> and have yet to receive a reply. It is a sad state of affairs when a
> language as prevalent as Perl and with such a strong sense of
> community can be so disorganized and lacking when it comes to
> financial sustenance.

It's worse than that, IMO. Think of all the businesses that benefit from Perl... it's bloody *everywhere*. It's the, what, fourth most popular language, after C/C++/Java, according to that monster-job-search-based statistic on slashdot. And yet the entire population of the Perl-using planet can't support 4 or 5 full-time designers/developers?

Have we just not been effective in getting the word out, or are businesses truly that cheap? Does Perl need to be made into a commercially supported product, w/ venture capital, à la other recent open source pkgs, in order to get funded?

MikeL
What's core? (was Re: Arrays: is computed)
On Thursday, January 30, 2003, at 10:44 PM, Leopold Toetsch wrote:

> Michael Lazzaro wrote:
>
> > [EMAIL PROTECTED] wrote:
> >
> > > Shouldn't access to 'is computed' arrays be read-only?
> >
> > In general, I would hope that 90% of them would be, but it's been
> > stated that it won't be a requirement.
>
> If you want such 'is computed' thingy, then tie it or wrap it in your
> own - IMHO. Everyone seems to need different things, so the simplest
> and by far the safest way is to make this explicit in your code.

Yep. Sigh. This is all quite frustrating, because everyone has different ideas on what's valuable enough to be a builtin, but it's the design team that finalizes that, and the information is sort of scattered piecemeal through the A's/E's/p6l. If you look around enough, there's a pile of builtin stuff that's obliquely referred to, but never quite spelled out. And even the simplest ones -- what "default" means, for example -- aren't necessarily obvious at first glance.

All I want is a list of what the already-known builtins *are*, or what other things people *demand* should be there, and the gist of what they do. Even if I have to change it later. Even if they're not in core, but they're things that people are going to immediately build because they're so obvious. Something to make me feel that the precise behaviors have really been pinned down -- and can be reliably extrapolated from -- preferably *before* people spend a lot of time coding them.

Lord, this is just *arrays*. Not even *hashes*, yet... So, is it obvious that I'm a little discouraged lately? Don't suppose anyone can come up with some numbingly inspirational words to cheer us (well, me) up...

MikeL
Re: Arrays: is computed
[EMAIL PROTECTED] wrote:

> Shouldn't access to 'is computed' arrays be read-only?

In general, I would hope that 90% of them would be, but it's been stated that it won't be a requirement.

But hey -- note that, for starters, this would mean that you could easily use an array for caching things... you could give a big hairy calculation as the C<is computed> sub, and immediately store the result in the indexed location -- thus avoiding triggering the computation the next time. Pretty slick.

Assuming, again, your C<is computed> sub didn't return undef. If it might, you still need a separate flag to mean "real undef" vs. "haven't-gotten-around-to-it-yet undef". Which I agree sucks. Hmm... real vs. fake undef... a difference between null-PMC and PMC-null, autofill with null-PMC, but assigning undef writes PMC-null... is that enough to make it work w/out speed penalty? Dan/Leopold?

But making C<undef @a[n]> and C<@a[n] = undef> do very different things, that's still scary. Powerful, but scary. People really, really want that, huh?

On Thursday, January 30, 2003, at 02:30 PM, Nicholas Clark wrote:

> I think there is a lot of scope for creating tied objects for all of
> this complex behaviour. Everyone has different ideas about what would
> be useful, and they aren't all compatible. Eliminating the speed hit
> from perl 5 tie and perl 5 overloading is one big reason why parrot
> should be nicer to work with than any language built on the perl 5
> internals.

Agreed. Very, very agreed. :-) DAMN, I want to start using this NOW.

MikeL
Re: Arrays: Default Values
On Thursday, January 30, 2003, at 12:49 PM, Austin Hastings wrote:

>     undef @a[5];    # undefining the element sets it to the default
>     @a[5];          # 2
>
>     @a[5] = undef;  # same as above
>     @a[5];          # 2
>
> undef!! @a is an array of Int (not int) and can store undef, so no
> error occurs when you make the assignment. But now I, the programmer,
> am saying that of my own volition I want an undef in there, not a 2.
> If I wanted @a[5] to take on the default value, I'd say so:
> C<@a[5] = @a.default;>

This is operating on Damian's premise that the presence of an undefined value is what causes the default value to be raised. So we don't have two types of undef (undef but undef, etc.)... a cell is either defined, or it's not. The presence of C<undef> is what triggers the default value, regardless of how the undef got there. Thus, as Damian said, there is no way to place an undefined value in an array with a default:

> I'm not compelled by the counter-argument that this makes it
> impossible to store an C<undef> in an array with a default. Because
> the whole point of an array having a default is to prevent those
> nasty out-of-range C<undef>s from popping up in the first place.

I'll document the behavior as Damian has specified it, since he's the ranking design team member here. If a design team member overrules, we'll change it to match.

MikeL
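(Damian's rule -- reading any undefined cell yields the default, so a stored undef is indistinguishable from an undefined cell -- can be modeled as a toy Python class. The class and its behavior are invented for illustration, with None standing in for undef.)

```python
class DefaultArray:
    """Toy model: None cells and out-of-range reads yield the default."""

    def __init__(self, default, items=()):
        self.default = default
        self.items = list(items)

    def __getitem__(self, i):
        v = self.items[i] if i < len(self.items) else None
        return self.default if v is None else v

    def __setitem__(self, i, v):
        # grow with undefined cells as needed, then store
        self.items.extend([None] * (i + 1 - len(self.items)))
        self.items[i] = v

a = DefaultArray(2, [7, 8, 9, 10, 11, 12])
a[5] = None        # "@a[5] = undef"
print(a[5])        # 2 -- the default comes back; the undef can't be kept
print(a[99])       # 2 -- out-of-range reads also give the default
```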
Arrays: is computed
For C<is computed> arrays, things get more complicated. Since there are no true 'holes' in a primitive-typed array, the correct behavior there would seem to be to autofill the array using the computed values. For example, an empty array:

    my int @a is computed { $^index ** 2 }

    @a[2];   # 4  (doesn't exist, is computed)
    @a[3];   # 9  (doesn't exist, is computed)
    @a[4];   # 16 (doesn't exist, is computed)

Now setting an element:

    @a[4] = 0;   # (setting an element autofills previous elements)

    # @a now contains (0,1,4,9,0)

    @a[2];   # 4
    @a[3];   # 9
    @a[4];   # 0
    @a[5];   # 25 (still doesn't exist, is computed)

    @a[1000] = 0;   # (calls the computed sub 1000 times, hope ya meant it)

Again, note the dubious behavior of doing a C<shift> or other manipulation on any C<is computed> array. The autofilled portion would shift, but the computed portion would not:

    my int @a is computed { $^index ** 2 }

    # at first, @a is entirely computed, (0,1,4,9,16,25,...)

    @a[4] = 0;   # @a now contains (0,1,4,9,0)

                 # now: (real)  + (computed)
    shift @a;    # (1,4,9,0)    + (16,25,...)
    shift @a;    # (4,9,0)      + (9,16,25,...)
    shift @a;    # (9,0)        + (4,9,16,25,...)
    shift @a;    # (0)          + (1,4,9,16,25,...)
    shift @a;    # ()           + (0,1,4,9,16,25,...)

Not saying that's wrong. Just very, very wacky. And yes, it's fixable if every array has an "offset" number that's always updated to mark how far the array has been shifted/unshifted from its starting point. But I'm not suggesting that. Really.

MikeL
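(The autofill and shift behaviors described above can be modeled as a small Python class -- a toy built to exhibit exactly the wackiness in the shift table, not a description of how Perl 6/Parrot would actually implement it. All names are invented for illustration.)

```python
class ComputedArray:
    """Toy 'is computed' array: reads past the stored prefix call the
    compute block; writing index n autofills 0..n with computed values.
    Shifting consumes stored values while the computed tail keeps
    indexing from the array's (new) start -- hence the weird shift."""

    def __init__(self, compute):
        self.compute = compute
        self.stored = []            # explicitly-assigned prefix ("real")

    def __getitem__(self, i):
        if i < len(self.stored):
            return self.stored[i]
        return self.compute(i)      # computed tail

    def __setitem__(self, i, v):
        while len(self.stored) <= i:               # autofill first
            self.stored.append(self.compute(len(self.stored)))
        self.stored[i] = v

    def shift(self):
        return self.stored.pop(0) if self.stored else self.compute(0)

a = ComputedArray(lambda i: i ** 2)
print(a[3])        # 9 (doesn't exist, is computed)
a[4] = 0           # autofills 0,1,4,9 then stores 0
print(a.stored)    # [0, 1, 4, 9, 0]
print(a.shift())   # 0
print(a[4])        # 16 -- stored is now (1,4,9,0); index 4 hits the tail
```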
Re: Arrays: Default Values
On Thursday, January 30, 2003, at 10:56 AM, Austin Hastings wrote:

> There is no reason why primitive-typed arrays can't have a default.
> It is the confusion of "default" with "undef" that is causing this
> problem.

Yes, I misspoke. You can have a default, which it will use for autofill & out-of-range values. You just can't use undef to do it.

MikeL