Re: [swift-evolution] A path forward on rationalizing unicode identifiers and operators

Ethan Tira-Thompson via swift-evolution Tue, 03 Oct 2017 17:49:03 -0700


> On Oct 2, 2017, at 10:07 PM, Chris Lattner via swift-evolution 
> <swift-evolution@swift.org> wrote:
> 
> On Oct 2, 2017, at 9:12 PM, David Sweeris via swift-evolution 
> <swift-evolution@swift.org> wrote:
>>> Keep in mind that Swift already goes far above and beyond in terms of 
>>> operators
>> Yep, that's a large part of why I'm such a Swift fan :-D
> 
> Fortunately, no one is seriously proposing a major curtailing of the 
> capabilities here, we’re just trying to rationalize the operator set, which 
> is a bit of a mess at present.
> 
>>> in that: (a) it allows overloading of almost all standard operators; (b) it 
>>> permits the definition of effectively an infinite number of custom 
>>> operators using characters found in standard operators; (c) it permits the 
>>> definition of custom precedences for custom operators; and (d) it 
>>> additionally permits the use of a wide number of Unicode characters for 
>>> custom operators. Most systems programming languages don't even allow (a), 
>>> let alone (b) or (c). Even dramatically curtailing (d) leaves Swift with an 
>>> unusually expansive support for custom operators.
> 
>> Yes, but many of those custom operators won't have a clear meaning because 
>> operators are rarely limited to pre-existing symbols like "++++++++" (which 
>> doesn't mean anything at all AFAIK), so operators that are widely known 
>> within some field probably won't be widely known to the general public, 
>> which, IIUC, seems to be your standard for inclusion(?). Please let me know 
>> if that's not your position... I hate being misunderstood probably more than 
>> the next person, and I wouldn't want to be guilty of that myself.
> 
> The approach to operator handling in Swift is very intentional.  IMO, it is 
> well known that:
> 
> 1) Operators can make code significantly easier to understand by reducing 
> noise from complex expressions: writing x.matmul(y) is insane 
> <https://www.python.org/dev/peps/pep-0465/> if you’re doing a lot of matrix 
> multiplies.
> 2) Operators can be completely opaque to someone who doesn’t know them, and 
> sometimes named functions are more clear.
> 3) Named functions can also sometimes be completely opaque if you don't know 
> them, e.g. "let x = cholesky(y)"
> 4) Languages with fixed operator sets that also allow overloading (e.g. C++) 
> end up with those operators being abused.
> 5) Some code can only be written and maintained by domain experts, and those 
> experts often know the operators.


Well said!

I think comments about poorly chosen operator symbols (e.g. invisible or 
visually similar ones) are a bit of a red herring.  From a malicious angle, an 
attacker would rather overload a standard operator than introduce an exotic 
one, which would draw more attention and has no pre-existing usage.  From a 
maintenance angle, choosing a poor operator symbol is akin to choosing a poorly 
named identifier.  That’s really for users to figure out themselves; we 
shouldn’t try to legislate the equivalent of “no single letter variables”.

> Swift’s approach is basically to say to users: “ok we allow overloaded 
> operators, but at least if you encounter some operation that you don’t know… 
> you know that you don’t know it”.  If you encounter “if ¬x {” or “a ∩ b” in 
> some source code, at least you can command click, jump to the definition and 
> read what it does: you aren’t misled into thinking that the expression is 
> some familiar thing, but find out later it was overloaded to do something 
> crazy (bitshifts for i/o?  really??? :).

Exactly!  If someone has already decided they want an operator for something, 
better to let them choose a new symbol than to force them to overload one of 
the standard ones because we’ve restricted the set.  I think most of the bad 
reputation of custom operators comes from the surprising results of developers 
being forced to shoehorn the “standard” operators into new roles, which 
confuses readers who think they know what an operator is doing.  That is, the 
danger lies less in the operator itself than in the overloading.
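
As a rough illustration of that difference in today's Swift, here is a minimal 
sketch.  The Log type and the ⇐ operator are made up for this example, and the 
precedence group is picked arbitrarily:

```swift
// Hypothetical type, used only for illustration.
struct Log {
    var lines: [String] = []
}

// Option 1: overload a standard operator.  `<<` normally means "shift left",
// so a reader may believe they already know what the expression does.
func << (log: Log, line: String) -> Log {
    return Log(lines: log.lines + [line])
}

// Option 2: declare a new operator.  A reader who does not recognize `⇐`
// at least knows to jump to its definition.
infix operator ⇐ : AdditionPrecedence  // precedence chosen arbitrarily

func ⇐ (log: Log, line: String) -> Log {
    return Log(lines: log.lines + [line])
}

let viaShift = Log() << "started"
let viaArrow = Log() ⇐ "started"
```

Both versions do the same work; only the second one signals to the reader that 
something unfamiliar is going on.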

> Set algebra is an illustrative example, because it is both used by people who 
> are experts and people who are not.  As far as policies go, I think it makes 
> sense for Swift libraries to define operator-like things as named functions 
> (e.g. “intersection”) and also define operators (“∩”) which can optionally be 
> used in source bases that want them for convenience.  The compiler and 
> language cannot know whether a code base is written and maintained by experts 
> who know the symbols and who value their clarity (over the difficulty typing 
> and recognizing them), and this approach allows maintainers of the codebase 
> to pick their own policies.
> 
> I do think that Ethan’s suggestion upthread is interesting, which suggests 
> considering something like:
>    import matrixlib (operators: [ᵀ,·,⊗])
> 
> Three concerns I see:
>  - Requiring them today would be a source incompatibility with Swift 4

To clarify, I’m only suggesting the qualifier be required for “non-standard” 
operators, so the source incompatibility would be on par with whatever the 
unicode cleanup causes by similarly reclassifying characters already in use.

In that vein, this suggestion would dovetail well with such a reclassification 
effort: it would give an easy upgrade path for existing code that wants to 
continue using a particular character, and it would allow a fairly conservative 
set of “standard” operators to be whitelisted without sacrificing end-user 
expressibility, which simplifies the scope of the classification effort.

“Standard” operators could include sections of the mathematical plane even 
though they aren’t necessarily used by the standard library, if there is desire 
to reserve such characters exclusively for operators and never identifiers.  

>  - Multiple modules can define operators, unclear whether this refers to the 
> operator decl or implementations of operators.

Hmm, how are conflicting operator declarations handled today? (e.g. different 
precedence, associativity for the same fixity?)

My thinking is to import all declarations of that operator from a specified 
module (and so if the declaration isn’t imported, then its implementations are 
hidden too).  You would have to specifically import the operator from each 
module that provides it.  If the user imports conflicting declarations, the 
result is just the same as today.

And by “all declarations of that operator” I mean that if we have a matrix 
library that defines ᵀ for combinations of matrix, vector, sparse matrix, etc., 
then the single “import matrixlib (operator: ᵀ)” statement makes all of those 
available, since we should expect the module to give a consistent 
interpretation of that operator.  So in technical terms this imports all 
declarations regardless of fixity; I’m not sure it’s worth getting more 
granular about importing just prefix but not infix.
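
For reference, a minimal sketch of one symbol carrying two fixities in today's 
Swift.  The choice of √ and both function bodies are assumptions made purely 
for illustration; under the idea above, a single operator import would expose 
both declarations:

```swift
import Foundation

// The same symbol declared with two fixities (hypothetical example).
prefix operator √
infix operator √ : MultiplicationPrecedence  // precedence chosen arbitrarily

// Prefix form: square root.
prefix func √ (x: Double) -> Double {
    return x.squareRoot()
}

// Infix form: n-th root, written `n √ x`.
func √ (n: Double, x: Double) -> Double {
    return pow(x, 1.0 / n)
}

let squareRoot = √16.0
let cubeRoot = 3 √ 27.0  // approximately 3, subject to floating-point rounding
```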

Conversely, if the operator isn’t imported, then it’s as if those declarations 
were all internal to the module, and avoids any conflicts.

So if module A declares an operator ¬ and another module B uses that character 
as an identifier, then the client resolves this at import.  Either import ¬ 
from A and lose access to the identifier in B, or ignore the operator from A 
but retain access to the identifier in B.  (Hopefully rational symbol choices 
would make this a rare situation on par with other global namespace collisions, 
and good modules should provide less exotic interface fallbacks as well.)
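
A minimal sketch of such a fallback, using the set-intersection example from 
earlier in the thread.  The named method is the existing 
Set.intersection(_:); the ∩ operator is an assumed convenience layered on top, 
with an arbitrarily chosen precedence:

```swift
// Assumed convenience operator; the precedence group is an arbitrary choice.
infix operator ∩ : MultiplicationPrecedence

extension Set {
    // The operator simply forwards to the named method, so clients that
    // never opt into the symbol lose no functionality.
    static func ∩ (lhs: Set, rhs: Set) -> Set {
        return lhs.intersection(rhs)
    }
}

let a: Set = [1, 2, 3]
let b: Set = [2, 3, 4]

let named = a.intersection(b)  // plain-name spelling
let symbolic = a ∩ b           // operator spelling, same result
```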


>  - Imports are per-module, not per-source-file, so this couldn’t be used to 
> “user-partition” the identifier and operator space.  It could be a way to 
> make it clear that the user is opting into these explicitly.

Ahh nuts I actually thought imports were per-source-file! 🤦🏻‍♂️

So I guess an intra-module dependency for building the identifier/operator set 
is still too much of a performance hit?  Isn’t parsing already collecting all 
the imports from across the current module?

Well regardless, I’d be willing to live with repeating a per-file import 
statement for operator specification.  A little quirky that the operator 
attribute only has a file-level scope, but clearly I don’t mind respecifying 
imports in each file anyway (I kind of feel this is good form so you can move 
source files around and the dependencies come along.)

Alternatively, we could make a new per-file import specific for operators, 
orthogonal to module imports, although using similar syntax:
        import operator ᵀ
        import operator ·
        import operator ⊗

I thought about just using “import ᵀ”, but I don’t want to risk confusion with 
a module name.  It might be nice to pass a collection, but since we don’t do 
that with module imports, let’s not start now.

These would be applied similarly to the previous proposal, but toggling 
operator visibility globally rather than per module.  This basically just 
controls the operator character set and nothing more.  So the implementation 
should be really simple: all imported operator declarations are already loaded 
as normal, but the compiler can only make the connection if the character was 
listed as an operator in the current file.  Initially I wanted an operator 
declaration in the current file to also update the character set so you don’t 
need both, but I see an argument for always requiring the import (for 
non-standard operators), just to surface guidance that an import will be needed 
to access that operator from elsewhere.
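
To make the rule concrete, here is a toy model of it in ordinary Swift.  This 
is not how the compiler is structured; SourceFile, standardOperators, and 
isOperator are hypothetical names, and the “standard” set shown is just a 
placeholder:

```swift
// Toy model only: each source file carries the operator characters it has
// opted into via the proposed per-file "import operator" statements.
struct SourceFile {
    var importedOperators: Set<Character>
}

// Placeholder for whatever conservative "standard" set gets whitelisted.
let standardOperators: Set<Character> = ["+", "-", "*", "/"]

// A character is treated as an operator in a file only if it is standard
// or the file explicitly imported it.
func isOperator(_ character: Character, in file: SourceFile) -> Bool {
    return standardOperators.contains(character)
        || file.importedOperators.contains(character)
}

let matrixCode = SourceFile(importedOperators: ["⊗"])
let plainCode = SourceFile(importedOperators: [])

let visibleInMatrixCode = isOperator("⊗", in: matrixCode)
let visibleInPlainCode = isOperator("⊗", in: plainCode)
let plusInPlainCode = isOperator("+", in: plainCode)
```

In this model the operator declarations themselves are untouched; only the 
per-file character classification changes.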

Does that help?  I liked having per-module control for conflict resolution and 
also auditing where operators come from, but (naively) this seems like a really 
simple implementation and if there is demand we could still add a syntax for 
module-specific filtering later.

-Ethan
