On Fri, 25 Oct 2002, Michael Lazzaro wrote:
: What's the Official Perl difference between a named unary op and a
: one-arg universal method?

I didn't give the other half of the answer. A method is a term, not an operator. It's the . in front of it that's the operator... It's just that, in indirect-object syntax, the colon on

    length $object:

is optional, so it looks a lot like a unary operator. But I think the precedence is probably LISTOP, not UNIOP, at least if we stick with the Perl 5 approach of any listop grabbing all the available args to the right. I don't see a good way to keep the precedence of length and friends at UNIOP--at least, not without having universal subroutines that just pass their single arguments off to the object as a real method. The question is whether there's any way to keep Perl 5's

    print length $a, "\n";

so it still parses as expected. It seems a bit silly to have to declare a universal sub like

    sub *length ($x) { $x.length() }

On the other hand, we would like to make methods obey the same argument parsing rules as subs. Which means that the above behavior could be implied for untyped objects via

    class Object {
        method length ($x) {...}
    }

without having to declare a universal sub. But that depends on our assumption that any method call's syntax can be determined by looking at the type of its left side. That has ramifications if the declared type of the left side is a base class and we really want to call a method in a derived class that exceeds the contract of the base class.

We can probably defer some of these decisions till run time, such as whether to interpret an @foo argument in scalar or list context. But changing the precedence of length from a LISTOP to a UNIOP can't be deferred that way. Which is why we either need the parser to know the uniop declaration of

    sub *length ($x) { $x.length() }

or we have to make

    print length $a, "\n";

illegal, and require people to say one of:

    print length($a), "\n";
    print (length $a), "\n";
    print $a.length, "\n";

if there's a following list.
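
To make the precedence point concrete, here is a rough sketch in Python (not Perl, and not how any real Perl parser is written). It assumes a toy grammar in which tokens are pre-split and only `print` and `length` exist; the names `parse_print`, `parse_list`, `parse_item`, and the `uniops` parameter are all hypothetical. It only shows how the same token stream yields two different trees depending on whether the parser has been told that `length` is a uniop.

```python
# Toy model only: tokens are pre-split and "print" is always a listop.
FUNCTIONS = {"length"}  # names that take arguments in this sketch

def parse_print(tokens, uniops):
    """Parse `print <args>`; names in `uniops` take exactly one argument."""
    assert tokens[0] == "print"
    args, _ = parse_list(tokens, 1, uniops)
    return ("print", args)

def parse_list(tokens, pos, uniops):
    """Parse comma-separated items until the tokens run out."""
    items = []
    while pos < len(tokens):
        item, pos = parse_item(tokens, pos, uniops)
        items.append(item)
        if pos < len(tokens) and tokens[pos] == ",":
            pos += 1  # skip the comma
    return items, pos

def parse_item(tokens, pos, uniops):
    tok = tokens[pos]
    if tok in FUNCTIONS:
        if tok in uniops:
            # UNIOP: take one argument and stop before the comma.
            arg, pos = parse_item(tokens, pos + 1, uniops)
            return (tok, [arg]), pos
        # LISTOP: grab everything available to the right.
        args, pos = parse_list(tokens, pos + 1, uniops)
        return (tok, args), pos
    return tok, pos + 1

tokens = ["print", "length", "$a", ",", '"\\n"']
as_uniop = parse_print(tokens, uniops={"length"})
as_listop = parse_print(tokens, uniops=set())
```

With `length` declared as a uniop, `print` receives two arguments; without the declaration, `length` swallows the newline string as well. The decision is made while parsing, so it cannot wait until run time, and the choice stays as stated: either the parser learns that length is a uniop, or the bare comma form is disallowed.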
The latter approach seems quite a bit cleaner, in that it doesn't require either the parser or the programmer to maintain special knowledge about a unary function called "length".

I think we also need to fix this:

    print (length $a), "\n";

The problem with Perl 5's rule, "If it looks like a function, it *is* a function", is that the above doesn't actually look like a function to most people. I'm thinking we need a rule that says you can't put a space before a dereferencing (...), just as you can't with {...} or [...]. If you want to, then, as with {...} or [...] you have to use .(...) instead. That is,

    print .(length $a), "\n";

means

    print(length $a), "\n";

but

    print (length $a), "\n";

means

    print( (length $a), "\n" );

If we ever allow a syntax like C++'s foo<bar> for who knows what purpose, then it would have to follow the same rules, since it would otherwise be ambiguous with a < operator. So maybe we should start telling people not to say things like $a<$b when they mean $a < $b.

One could argue that this rule should be followed for all bracketing syntax, including Unicode. That would be consistent, at least. The real name of subscripts is then always with the dot:

    operator:.[]    # subscript []
    operator:.{}    # subscript {}
    operator:.()    # subscript () aka function args
    operator:.<>    # subscript <> (reserved)
    ...
    operator:[]     # array composer
    operator:{}     # hash or closure
    operator:()     # regular parens
    operator:<>     # an op that screws up <, <<, <=, and <=> :-)
    ...

That's assuming that matched brackets are always recognized and assumed to have an expression in the middle.

Actually, it's not clear that operator:<> would mess up binary < and friends. It looks as if those four are really:

    term:[]         # array composer
    term:{}         # hash or closure
    term:()         # regular parens
    term:<>         # the input symbol AKA call the iterator
    ...

So we note that we can actually get away with having all of:

    operator:.<>
    operator:<
    term:<>

without ambiguity (assuming a consistent space rule).
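
The three-way split above can be sketched as a small classifier, again in Python and purely as an illustration. It assumes the lexer already knows whether it is expecting a term or an operator (the hypothetical `expect_term` flag), and it applies the proposed space rule to a `<` at a given position. The function name `classify_angle` is made up for this sketch; nothing here reflects an actual Perl 6 lexer.

```python
def classify_angle(src, pos, expect_term):
    """Classify the '<' at src[pos] under the proposed space rule.

    expect_term: True if the lexer is waiting for a term (start of an
    expression, or just after an infix operator); False if it has just
    finished a term and is waiting for an operator.
    """
    assert src[pos] == "<"
    if expect_term:
        return "term:<>"          # e.g. the input symbol
    before = src[pos - 1] if pos > 0 else " "
    if not before.isspace():
        # No space before '<' (this also covers the explicit ".<" form):
        # treat it as a subscript on the preceding term.
        return "operator:.<>"
    return "operator:<"           # space before '<': binary less-than

results = [
    classify_angle("<STDIN>", 0, expect_term=True),
    classify_angle("$a < $b", 3, expect_term=False),
    classify_angle("$a<$b", 2, expect_term=False),
    classify_angle("$a .<b>", 4, expect_term=False),
]
```

Under these assumptions `$a<$b` comes out as a subscript rather than a comparison, which is the reason for telling people to write `$a < $b`.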
However, if we ever had operator:{ we couldn't do the trick of assuming an implicit operator before a block in

    if $a eq $b {...}

But now note how we could have all three of

    $a++        # operator:.++
    $a ++ $b    # operator:++
    ++$b        # term:++

by applying the rule to non-bracketing characters as well. Basically, operator:.op vs operator:op allows us to distinguish postfix ops from binary ops, if we want. That might be cool.

But we have a problem if we want to specify a binary operator that begins with dot. So it probably has to be:

    postfix:++
    infix:++
    prefix:++

or some such. That still leaves us with a problem if they define:

    postfix:!   # factorial
    infix:!     # xor superposition
    prefix:!    # logical negation

The problem is now that you can't say

    $x .!       # still factorial!

if you want to put space before the postfix !, because we commandeered the dot for bitops. Hmm. Maybe that was a mistake. Something to ponder.

I expect we can't just rely on bracket properties in Unicode for understanding how to parse operators though, or we can't write things like:

    term:''     # single quoted string.

Either we need a placeholder, or something that says to treat matched chars as bracket chars. The placeholder is more general, particularly if it specifies the grammar rule to parse the inside:

    term:'<singletext>'
    term:(<expr>)

Note also that we do have to distinguish term from prefix, since they leave the lexer in a different state of expectation afterwards.

So we end up not with:

    postfix:<>
    infix:<
    term:<>

but rather

    postfix:<lt><mumble><gt>
    infix:<lt>
    term:<lt><expr><gt>

Now you can finally have your infix:<ws> operator. :-)

It will, of course, totally break the lexer. Your choice, though.

Larry