Re: [swift-evolution] [Pitch] Synthesized static enum property to iterate over cases

Matthew Johnson via swift-evolution Sat, 09 Sep 2017 14:56:58 -0700



> On Sep 9, 2017, at 2:07 PM, Tony Allevato <tony.allev...@gmail.com> wrote:
> 
> 
> 
>> On Fri, Sep 8, 2017 at 5:14 PM Xiaodi Wu <xiaodi...@gmail.com> wrote:
>>> On Fri, Sep 8, 2017 at 4:08 PM, Matthew Johnson via swift-evolution 
>>> <swift-evolution@swift.org> wrote:
>> 
>>> 
>>>> On Sep 8, 2017, at 12:05 PM, Tony Allevato <tony.allev...@gmail.com> wrote:
>>>> 
>>>> 
>>>> 
>>>> On Fri, Sep 8, 2017 at 9:44 AM Matthew Johnson <matt...@anandabits.com> 
>>>> wrote:
>>>>>> On Sep 8, 2017, at 11:32 AM, Tony Allevato <tony.allev...@gmail.com> 
>>>>>> wrote:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Fri, Sep 8, 2017 at 8:35 AM Matthew Johnson <matt...@anandabits.com> 
>>>>>> wrote:
>>>>>>>> On Sep 8, 2017, at 9:53 AM, Tony Allevato via swift-evolution 
>>>>>>>> <swift-evolution@swift.org> wrote:
>>>>>>>> 
>>>>>>>> Thanks for bringing this up, Logan! It's something I've been thinking 
>>>>>>>> about a lot lately after a conversation with some colleagues outside 
>>>>>>>> of this community. Some of my thoughts:
>>>>>>>> 
>>>>>>>> AFAIK, there are two major use cases here: (1) you need the whole 
>>>>>>>> collection of cases, like in your example, and (2) you just need the 
>>>>>>>> number of cases. The latter seems to occur somewhat commonly when 
>>>>>>>> people want to use an enum to define the sections of, say, a 
>>>>>>>> UITableView. They just return the count from numberOfSections(in:) and 
>>>>>>>> then switch over the cases in their cell-providing methods.
>>>>>>>> 
>>>>>>>> Because of #2, it would be nice to avoid instantiating the collection 
>>>>>>>> eagerly. (Also because of examples like Jonathan's, where the enum is 
>>>>>>>> large.) If all the user is ever really doing is iterating over them, 
>>>>>>>> there's no need to keep the entire collection in memory. This leads us 
>>>>>>>> to look at Sequence; we could use something like AnySequence to keep 
>>>>>>>> the current case as our state and a transition function to advance to 
>>>>>>>> the next one. If a user needs to instantiate the full array from that 
>>>>>>>> sequence they can do so, but they have to do it explicitly.
>>>>>>>> 
>>>>>>>> The catch is that Sequence only provides `underestimatedCount`, rather 
>>>>>>>> than `count`. Calling the former would be an awkward API (why is it 
>>>>>>>> underestimated? we know how many cases there are). I suppose we could 
>>>>>>>> create a concrete wrapper for Sequence (PrecountedSequence?) that 
>>>>>>>> provides a `count` property to make that cleaner, and then have 
>>>>>>>> `underestimatedCount` return the same thing if users passed this thing 
>>>>>>>> into a generic operation constrained over Sequence. (The standard 
>>>>>>>> library already has support wrappers like EnumeratedSequence, so maybe 
>>>>>>>> this is appropriate.)
>>>>>>>> 
>>>>>>>> Another question that would need to be answered is, how should the 
>>>>>>>> cases be ordered? Declaration order seems obvious and straightforward, 
>>>>>>>> but if you have a raw-value enum (say, integers), you could have the 
>>>>>>>> declaration order and the numeric order differ. Maybe that's not a 
>>>>>>>> problem. Tying the iteration order to declaration order also means 
>>>>>>>> that the behavior of a program could change simply by reördering the 
>>>>>>>> cases. Maybe that's not a big problem either, but it's something to 
>>>>>>>> call out.
>>>>>>>> 
>>>>>>>> If I were designing this, I'd start with the following approach. 
>>>>>>>> First, add a new protocol to the standard library:
>>>>>>>> 
>>>>>>>> ```
>>>>>>>> public protocol ValueEnumerable {
>>>>>>>>   associatedtype AllValuesSequence: Sequence
>>>>>>>>     where AllValuesSequence.Iterator.Element == Self
>>>>>>>> 
>>>>>>>>   static var allValues: AllValuesSequence { get }
>>>>>>>> }
>>>>>>>> ```
>>>>>>>> 
>>>>>>>> Then, for enums that declare conformance to that protocol, synthesize 
>>>>>>>> the body of `allValues` to return an appropriate sequence. If we 
>>>>>>>> imagine a model like AnySequence, then the "state" can be the current 
>>>>>>>> case, and the transition function can be a switch/case that returns it 
>>>>>>>> and advances to the next one (finally returning nil).
>>>>>>>> 
>>>>>>>> There's an opportunity for optimization that may or may not be worth 
>>>>>>>> it: if the enum is RawRepresentable with RawValue == Int, AND all the 
>>>>>>>> raw values are in a contiguous range, AND declaration order is 
>>>>>>>> numerical order (assuming we kept that constraint), then the 
>>>>>>>> synthesized state machine can just be a simple integer incrementation 
>>>>>>>> and call to `init?(rawValue:)`. When all the cases have been 
>>>>>>>> generated, that will return nil on its own.
>>>>>>>> 
>>>>>>>> So that covers enums without associated values. What about those with 
>>>>>>>> associated values? I would argue that the "number of cases" isn't 
>>>>>>>> something that's very useful here—if we consider that enum cases are 
>>>>>>>> really factory functions for concrete values of the type, then we 
>>>>>>>> shouldn't think about "what are all the cases of this enum" but "what 
>>>>>>>> are all the values of this type". (For enums without associated 
>>>>>>>> values, those are synonymous.)
>>>>>>>> 
>>>>>>>> An enum with associated values can potentially have an infinite number 
>>>>>>>> of values. Here's one:
>>>>>>>> 
>>>>>>>> ```
>>>>>>>> enum BinaryTree {
>>>>>>>>   case subtree(left: BinaryTree, right: BinaryTree)
>>>>>>>>   case leaf
>>>>>>>>   case empty
>>>>>>>> }
>>>>>>>> ```
>>>>>>>> 
>>>>>>>> Even without introducing an Element type in the leaf nodes, there are 
>>>>>>>> a countably infinite number of binary trees. So first off, we wouldn't 
>>>>>>>> be able to generate a meaningful `count` property for that. Since 
>>>>>>>> they're countably infinite, we *could* theoretically lazily generate a 
>>>>>>>> sequence of them! It would be a true statement to say "an enum with 
>>>>>>>> associated values can have all of its values enumerated if all of its 
>>>>>>>> associated values are also ValueEnumerable". But I don't think that's 
>>>>>>>> something we could have the compiler synthesize generally: the logic 
>>>>>>>> to tie the sequences together would be quite complex in the absence of 
>>>>>>>> a construct like coroutines/yield, and what's worse, the compiler 
>>>>>>>> would have to do some deeper analysis to avoid infinite recursion. For 
>>>>>>>> example, if it used the naïve approach of generating the elements in 
>>>>>>>> declaration order, it would keep drilling down into the `subtree` case 
>>>>>>>> above over and over; it really needs to hit the base cases first, and 
>>>>>>>> requiring the user to order the cases in a certain way for it to just 
>>>>>>>> work at all is a non-starter.
>>>>>>>> 
>>>>>>>> So, enums with associated values are probably left unsynthesized. But 
>>>>>>>> the interesting thing about having this be a standard protocol is that 
>>>>>>>> there would be nothing stopping a user from conforming to it and 
>>>>>>>> implementing it manually, not only for enums but for other types as 
>>>>>>>> well. The potential may exist for some interesting algorithms by doing 
>>>>>>>> that, but I haven't thought that far ahead.
>>>>>>>> 
>>>>>>>> There are probably some things I'm missing here, but I'd love to hear 
>>>>>>>> other people's thoughts on it.
>>>>>>> 
>>>>>>> There are some things I really like about this approach, but it doesn’t 
>>>>>>> quite align with a lot of the usage I have seen for the manually declared 
>>>>>>> `allValues` pattern.  
>>>>>>> 
>>>>>>> One of the most common ways I have seen `allValues` used is as a 
>>>>>>> representation of static sections or rows backing table or collection 
>>>>>>> views.  Code written like this will take the section or item index 
>>>>>>> provided by a data source or delegate method and index into an 
>>>>>>> `allValues` array to access the corresponding value.  These methods 
>>>>>>> usually access one or more members of the value or pass it along to 
>>>>>>> something else (often a cell) which does so.  
>>>>>>> 
>>>>>>> If we introduce synthesis that doesn’t support this use case, I think a 
>>>>>>> lot of people will be frustrated, so my opinion is that we need to support 
>>>>>>> it.  This means users need a way to request synthesis of a `Collection` 
>>>>>>> with an `Int` index.  Obviously doing this solves the `count` problem.  
>>>>>>> The collection would not need to be eager.  It could be implemented to 
>>>>>>> produce values on demand rather than storing them.  
>>>>>> 
>>>>>> Great points! I was only considering the table view/section case where 
>>>>>> the enum had raw values 0..<count, but I do imagine it's possible that 
>>>>>> someone could just define `enum Section { case header, content, footer 
>>>>>> }` and then want to turn an IndexPath value into the appropriate Section.
>>>>>> 
>>>>>> On the other hand, though, isn't that what raw value enums are for? If 
>>>>>> the user needs to do what you're saying—map specific integers to enum 
>>>>>> values—shouldn't they do so by giving those cases raw values and calling 
>>>>>> init?(rawValue:), not by indexing into a collection? Especially since 
>>>>>> they can already do that today, and the only thing they're missing is 
>>>>>> being able to retrieve the count, which a "PrecountedSequence" mentioned 
>>>>>> above, or something like it, could also provide.
>>>>> 
>>>>> First, I’m making observations about what people are doing, not what they 
>>>>> could do.  
>>>>> 
>>>>> Second, the raw value may not correspond to 0-based indices.  It might 
>>>>> not even be an Int.  There is no reason to couple this common use case of 
>>>>> `allValues` to `Int` raw values with 0-based indices.
>>>> 
>>>> Do we know of any examples where a user is both (1) defining an enum with 
>>>> integer raw values that are noncontiguous or non-zero-based and (2) needing 
>>>> declaration-ordinal-based indexing into those cases for other reasons, 
>>>> like a table/collection view? I can't think of why someone would do that, 
>>>> but I'm happy to consider something that I'm missing.
>>> 
>>> I don’t off-hand, but I don’t think the lack of an example is a good 
>>> motivation for a solution that doesn’t directly address the most commonly 
>>> known use case for this feature.
>>> 
>>>>  
>>>>> 
>>>>> Third, `init?(rawValue:)` is a failable initializer and would require a 
>>>>> force unwrap.  If the raw values *are* 0-based integers this is similar 
>>>>> to the collection bounds check that would be necessary, but it moves it 
>>>>> into user code.  People don’t like writing force unwraps.
>>>> 
>>>> Yeah, this is a really good point that I wasn't fully considering. If 
>>>> other invariants in the application hold—such as table view cell functions 
>>>> never receiving a section index outside 0..<count—then unwrapping it just 
>>>> forces users to address a situation that will never actually occur unless 
>>>> UIKit is fundamentally broken.
>>> 
>>> Right, but the most crucial point is that it forces the *user* to address 
>>> this. They are not required to today.  It is handled by the bounds check in 
>>> Array.  This might sound like splitting hairs but I think there are a lot 
>>> of people who wouldn't view it that way.
>>> 
>>>> 
>>>>  
>>>>> 
>>>>>> 
>>>>>> My main concern with providing a Collection with Int indices is that, at 
>>>>>> some fundamental/theoretical level, it feels like it only makes sense 
>>>>>> for enums with contiguous numeric raw values. For other kinds of enums, 
>>>>>> including those where the enum is just a "bag of things" without raw 
>>>>>> values, it feels artificial.
>>>>> 
>>>>> Sure, that’s why I proposed a couple of options for addressing both use 
>>>>> cases.  I think both have merit.  I also think we need to recognize that 
>>>>> most people are asking for a replacement for manually writing a static 
>>>>> array and won’t be satisfied unless we provide a solution where the 
>>>>> synthesized property behaves similarly.
>>>> 
>>>> Agreed—I just wanted to point out the distinction because an important 
>>>> part of fleshing this out will be to partition the various "classes" of 
>>>> enums into those that would receive an indexable Collection vs. those that 
>>>> would receive just a Sequence.
>>> 
>>> I agree that it’s an important distinction.  To be honest, I’m not sure 
>>> there is a good way to solve both usages without introducing more 
>>> complexity than would be acceptable for something like this.  It might be a 
>>> problem better solved by macros or some other metaprogramming feature.  It 
>>> would be unfortunate to have to wait until we have those to solve this.  
>>> However, I don’t think it's an important enough problem to deserve a 
>>> solution with a lot of knobs and associated complexity.
>> 
>> Just wanted to chime in to say that I too very much like the opt-in nature 
>> of this `ValueEnumerable` design proposed by Tony (modulo some bikeshedding 
>> about the name). Seems like Tony has thought about this very deeply and 
>> considered multiple angles. However, I do agree with Matthew and others 
>> that, at the end of the day, a custom sequence/collection solving multiple 
>> usages that we don't even know are in high demand seems to be 
>> overcomplicating things. Synthesized Equatable, Hashable, and Codable give 
>> people a simple answer to simple problems. Here, people just want an array 
>> of all cases. Give them an array of all cases. When it's not possible (i.e., 
>> in the case of cases with associated values), don't do it.
> 
> I want to push a little harder on the array part. To the people saying that 
> it should just be an array of values, is it *specifically* an array that you 
> want, or would any Int-indexable Collection do?

I think the latter.  I haven't seen any use cases where the difference would 
really matter. I imagine some people would complain about having to convert 
explicitly to an Array when they use it in contexts that specify a concrete 
type, but I don't think this would be common enough to influence the design.
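
To make the distinction concrete, here is a minimal sketch. The `Section` enum, 
the `allSections` stand-in (a lazy map over a range, chosen only because it is 
an Int-indexed Collection that is not an Array), and both function names are 
assumptions for illustration, not part of any proposal:

```swift
enum Section: Int {
    case header, content, footer
}

// Hypothetical stand-in for a synthesized, non-Array collection of cases.
// Its Index is Int because the underlying Range<Int> is Int-indexed.
let allSections = (0..<3).lazy.map { Section(rawValue: $0)! }

// Generic over any Int-indexed collection of sections (illustrative name).
func name<C: Collection>(ofSectionAt index: Int, in sections: C) -> String
    where C.Index == Int, C.Element == Section {
    return "\(sections[index])"
}

// An API that names the concrete type needs an explicit conversion.
func sectionCount(_ sections: [Section]) -> Int {
    return sections.count
}

let contentName = name(ofSectionAt: 1, in: allSections)
let count = sectionCount(Array(allSections))
```

Generic code constrained to an Int-indexed Collection works unchanged; only an 
API that spells out `[Section]` needs the explicit `Array(...)` conversion.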

> 
> I think this is an important distinction because (1) if it's synthesized via 
> a standard library protocol, that protocol should be defined in general 
> terms, and (2) a user wanting to iterate over information that is already 
> known at compile-time should not have to suffer a slow runtime heap 
> allocation in order to do so. Even if the array is allocated once and cached, 
> and even if it's small in most cases, why duplicate information you already 
> have?

Agree.

> 
> If the direction we want to go is to allow Int-indexing, then I believe a 
> custom Collection subtype would be just as usable for most people's needs, it 
> would afford the compiler optimization opportunities that a simple array 
> cannot, and if a user does need specifically an Array for some reason, they 
> can easily initialize one at the call site.

Agree.
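
For illustration, a rough sketch of what such a collection could look like if 
written by hand. The `Suit` enum with `Int` raw values and the `AllSuits` type 
are hypothetical; nothing here is a claim about what the compiler would 
actually synthesize:

```swift
enum Suit: Int {
    case spades, hearts, diamonds, clubs
}

// Hypothetical shape of a synthesized collection; the name `AllSuits` and
// its layout are assumptions. It has no stored properties, so it does not
// need a heap-allocated buffer.
struct AllSuits: RandomAccessCollection {
    var startIndex: Int { return 0 }
    var endIndex: Int { return 4 }  // the case count, known at compile time

    subscript(position: Int) -> Suit {
        // The bounds check lives here rather than in user code.
        precondition(position >= startIndex && position < endIndex,
                     "Index out of range")
        return Suit(rawValue: position)!
    }
}

let suits = AllSuits()
let third = suits[2]         // values are produced on demand
let asArray = Array(suits)   // explicit Array only where one is needed
```

The force unwrap still exists, but it sits inside the collection, so call 
sites index with a plain `Int` just as they would with a static array.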

>  
>>>>>>  
>>>>>>> Of course there might be some cases where a manual implementation is 
>>>>>>> necessary but implementing `Collection` is not desirable for one reason 
>>>>>>> or another.  One way to solve both of these use cases would be to have 
>>>>>>> a protocol hierarchy but that seems like it might be excessively 
>>>>>>> complex for a feature like this.  Another way might be to take 
>>>>>>> advantage of the fact that in the use case mentioned above people are 
>>>>>>> usually working with the concrete type.  We could allow the compiler to 
>>>>>>> synthesize an implementation that *exceeds* the requirement of the 
>>>>>>> protocol such that the synthesized `AllValuesSequence` is actually a 
>>>>>>> `Collection where Index == Int`.  I’m not sure which option is better.
>>>>>>> 
>>>>>>> I would also like to discuss enums with associated values.  It would 
>>>>>>> certainly be reasonable to disallow synthesis for these types in an 
>>>>>>> initial implementation.  I don’t know of any use cases off the top of 
>>>>>>> my head (although I expect some good ones do exist).  That said, I 
>>>>>>> don’t think synthesis would be prohibitive for enums with associated 
>>>>>>> values so long as the types of all associated values conform to 
>>>>>>> `ValueEnumerable`.  We should probably support synthesis for these 
>>>>>>> types eventually, possibly in the initial implementation if there are 
>>>>>>> no significant implementation barriers.
>>>>>> 
>>>>>> I mentioned some of those barriers above. One issue is that synthesizing 
>>>>>> the code to lazily (i.e., reëntrantly) generate a sequence whose 
>>>>>> elements are the Cartesian products of other sequences is non-trivial. 
>>>>>> (Coroutines/yield would make this a piece of cake.)
>>>>> 
>>>>> The good news is that we might be in luck on this front in the Swift 5 
>>>>> timeframe.  :)
>>>> 
>>>> Fingers crossed! I'm not a concurrency expert by any means, so the most 
>>>> exciting part of those new proposals to me is the side-effect that we 
>>>> might get something like C# enumerators :)
>>>> 
>>>>  
>>>>> 
>>>>>> 
>>>>>> The other is the issue with recursive enums, like the BinaryTree 
>>>>>> example, where the compiler has to know to synthesize them in a 
>>>>>> particular order or else it will recurse indefinitely before even 
>>>>>> producing its first value. However, this could be addressed by simply 
>>>>>> forbidding automatic synthesis of enums that have an indirect case, 
>>>>>> which is probably a reasonable limitation.
>>>>> 
>>>>> Yeah, that seems like a reasonable limitation.
>>>>> 
>>>>>> 
>>>>>>  
>>>>>>> 
>>>>>>> That’s my two cents.
>>>>>>> 
>>>>>>> - Matthew
>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On Fri, Sep 8, 2017 at 3:40 AM Jonathan Hull via swift-evolution 
>>>>>>>>> <swift-evolution@swift.org> wrote:
>>>>>>>>> +1000
>>>>>>>>> 
>>>>>>>>> I once made a country code enum, and creating that array was simple, 
>>>>>>>>> but took forever, and was prone to mistakes.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Jon
>>>>>>>>> 
>>>>>>>>> > On Sep 8, 2017, at 2:56 AM, Logan Shire via swift-evolution 
>>>>>>>>> > <swift-evolution@swift.org> wrote:
>>>>>>>>> >
>>>>>>>>> > Googling ‘swift iterate over enum cases’ yields many results of 
>>>>>>>>> > various levels of hackery.
>>>>>>>>> > Obviously it’s trivial to write a computed property that returns an 
>>>>>>>>> > enum’s cases as an
>>>>>>>>> > array, but maintaining that is prone to error. If you add another 
>>>>>>>>> > case, you need to make sure
>>>>>>>>> > you update the array property. For enums without associated values,
>>>>>>>>> > I propose adding a synthesized static var, ‘cases’, to the enum’s 
>>>>>>>>> > type. E.g.
>>>>>>>>> >
>>>>>>>>> > enum Suit: String {
>>>>>>>>> >    case spades = "♠"
>>>>>>>>> >    case hearts = "♥"
>>>>>>>>> >    case diamonds = "♦"
>>>>>>>>> >    case clubs = "♣"
>>>>>>>>> > }
>>>>>>>>> >
>>>>>>>>> > let values = (1...13).map { value -> String in
>>>>>>>>> >    switch value {
>>>>>>>>> >    case 1: return "A"
>>>>>>>>> >    case 11: return "J"
>>>>>>>>> >    case 12: return "Q"
>>>>>>>>> >    case 13: return "K"
>>>>>>>>> >    default: return String(value)
>>>>>>>>> >    }
>>>>>>>>> > }
>>>>>>>>> >
>>>>>>>>> > let cards = values.flatMap { value in
>>>>>>>>> >    Suit.cases.map { "\($0.rawValue)\(value)" }
>>>>>>>>> > }
>>>>>>>>> >
>>>>>>>>> > Yields ["♠A", "♥A", …, "♣K"]
>>>>>>>>> > Thoughts?
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> > Thanks!
>>>>>>>>> > - Logan Shire
>>>>>>>>> > _______________________________________________
>>>>>>>>> > swift-evolution mailing list
>>>>>>>>> > swift-evolution@swift.org
>>>>>>>>> > https://lists.swift.org/mailman/listinfo/swift-evolution
>>>>>>>>> 
