Re: [rust-dev] RFC: Conventions for well-behaved iterators
On Thu, Feb 13, 2014 at 10:05 AM, Simon Sapin simon.sa...@exyr.org wrote: Proposal: 0. An iterator is said to be well-behaved if, after its .next() method has returned None once, any subsequent call also returns None. I agree with the spirit of your proposal. But I would change that first clause above to read: An iterator is said to be well-behaved when its .next() method always returns None if the iterator logically has no elements to iterate over. And all iterators should, by convention, be well-behaved. Otherwise it's impossible to pinpoint what exactly is the bug in the following code: struct Digits { n: int } impl Iteratorint for Digits { fn next(mut self) - Optionint { self.n += 1; if self.n == 10 { None } else { Some(self.n) } } } fn main() { let mut itr = Digits { n: -1 }; for i in itr { // for-loop consumes all items in itr println!({}, i); } let sum = itr.fold(0, |a, b| a + b); // Infinite loop println!({}, sum); } Given the current std::iter::Iterator specification [1], the implementation of the .next() method of Digits is valid. Also, the fold method of Iterator trait should return the initial state (the first argument) when fold is called on an empty iterator, but the call get's stuck on an infinite loop instead. [1]: The Iterator protocol states that an iterator yields a (potentially-empty, potentially-infinite) sequence of values, and returns None to signal that it's finished. The Iterator protocol does not define behavior after None is returned. A concrete Iterator implementation may choose to behave however it wishes, either by returning None infinitely, or by doing something else. ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] RFC: Conventions for well-behaved iterators
On 04 Mar 2014, at 15:59, Simon Sapin simon.sa...@exyr.org wrote: On 04/03/2014 13:23, Tommi wrote: On Thu, Feb 13, 2014 at 10:05 AM, Simon Sapin simon.sa...@exyr.org mailto:simon.sa...@exyr.org wrote: Proposal: 0. An iterator is said to be well-behaved if, after its .next() method has returned None once, any subsequent call also returns None. I agree with the spirit of your proposal. But I would change that first clause above to read: /An iterator is said to be well-behaved when its .next() method always returns None if the iterator logically has no elements to iterate over./ No elements to iterate over is already what None is supposed to mean, so this doesn’t add anything. What the wording of my definition does is it relaxes the constraints your definition specified. This relaxation of the rule is needed because there might be a need for the use case of having a once empty iterator become non-empty again. And all iterators should, by convention, be well-behaved. Otherwise it's impossible to pinpoint what exactly is the bug in the following code: struct Digits { n: int } impl Iteratorint for Digits { fn next(mut self) - Optionint { self.n += 1; if self.n == 10 { None } else { Some(self.n) } } } I don’t see what’s unclear here. This is the canonical example of an ill-behaved iterator. This code is buggy, `self.n` should only be incremented when returning Some(). I agree that this should be an ill-behaved iterator, but my point is that according to the specification [1], this is _not_ an ill-behaved iterator, it's a completely valid one. Given that self.n is less than 10, it correctly returns some elements as Some(...) and then it returns None. After it has returned None, according to specification [1], it is allowed to return whatever it wants. Therefore I'd be inclined to say that the bug in my example is in calling fold on an iterator whose all elements have been exhausted, but that seems just silly and fragile. [1]: The Iterator protocol states that an iterator yields a (potentially-empty, potentially-infinite) sequence of values, and returns None to signal that it's finished. The Iterator protocol does not define behavior after None is returned. A concrete Iterator implementation may choose to behave however it wishes, either by returning None infinitely, or by doing something else. ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] RFC: Conventions for well-behaved iterators
On 04/03/2014 14:42, Tommi wrote: I agree that this should be an ill-behaved iterator, but my point is that according to the specification [1], this is_not_ an ill-behaved iterator, it's a completely valid one. Given that self.n is less than 10, it correctly returns some elements as Some(...) and then it returns None. After it has returned None, according to specification [1], it is allowed to return whatever it wants. Yes, it is valid per the *current* definition of Iterator in the Rust documentation. The point of this thread is to change this definition. Therefore I'd be inclined to say that the bug in my example is in calling fold on an iterator whose all elements have been exhausted, but that seems just silly and fragile. Yes it’s fragile. I’m proposing the change to avoid such fragility. [1]: The Iterator protocol states that an iterator yields a (potentially-empty, potentially-infinite) sequence of values, and returns None to signal that it's finished. The Iterator protocol does not define behavior after None is returned. A concrete Iterator implementation may choose to behave however it wishes, either by returning None infinitely, or by doing something else. -- Simon Sapin ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] RFC: Conventions for well-behaved iterators
On Mar 4, 2014, at 5:23 AM, Tommi rusty.ga...@icloud.com wrote: I agree with the spirit of your proposal. But I would change that first clause above to read: An iterator is said to be well-behaved when its .next() method always returns None if the iterator logically has no elements to iterate over. And all iterators should, by convention, be well-behaved. Otherwise it's impossible to pinpoint what exactly is the bug in the following code: struct Digits { n: int } impl Iteratorint for Digits { fn next(mut self) - Optionint { self.n += 1; if self.n == 10 { None } else { Some(self.n) } } } fn main() { let mut itr = Digits { n: -1 }; for i in itr { // for-loop consumes all items in itr println!({}, i); } let sum = itr.fold(0, |a, b| a + b); // Infinite loop println!({}, sum); } Given the current std::iter::Iterator specification [1], the implementation of the .next() method of Digits is valid. Also, the fold method of Iterator trait should return the initial state (the first argument) when fold is called on an empty iterator, but the call get's stuck on an infinite loop instead. The bug is pretty obvious. You're using an iterator after it's been exhausted. This means you're relying on behavior specific to that implementation of the iterator. One reason why the iterator protocol allows this is precisely to allow things like an iterator that yields multiple distinct sequences. And that's what your Digits iterator is. It yields two sequences, the first is the finite sequence [1, 10), and the second is the infinite sequence [11, ...]. So the fold() runs forever because the second sequence is infinite. If Digits only ever yielded a single infinite sequence [0, ...] then your fold() would still run forever. Alternatively, if your Digits was implemented to return multiple finite sequences, your fold() would work. For example impl Iteratorint for Digits { fn next(mut self) - Optionint { self.n += 1; if self.n % 10 == 0 { None } else { Some(self.n) } } } This variant yields successive 9-element sequences, skipping every value divisible by 10. If you do need to touch an iterator after it's been exhausted, and you want the post-exhaustion behavior to be always returning None, that's what .fuse() is for. fn main() { let mut itr = Digits { n: -1 }.fuse(); for i in itr { println!({}, i); } let sum = itr.fold(0, |a, b| a + b); println!({}, sum); } But of course this still has a bug, which is that the fold is now guaranteed to return 0, because you exhausted the iterator already. -Kevin smime.p7s Description: S/MIME cryptographic signature ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] RFC: Conventions for well-behaved iterators
On Fri, Feb 14, 2014 at 12:35 AM, Daniel Micay danielmi...@gmail.comwrote: On Thu, Feb 13, 2014 at 6:33 PM, Erick Tryzelaar erick.tryzel...@gmail.com wrote: On Thursday, February 13, 2014, Gábor Lehel glaebho...@gmail.com wrote: This is not strictly true. If instead of fn next(mut self) - OptionA; we had something like fn next(self) - Option(Self, A); then access to exhausted iterators would be ruled out at the type level. (But it's more cumbersome to work with and is currently incompatible with trait objects.) This is an appealing option. If it is really this simple to close this undefined behavior, I think we should consider it. Are there any other downsides? Does it optimize down to the same code as our current iterators? It's certainly not as convenient and would only work if all iterators were marked as `NoPod`. Even if it were `Pod` (i.e. copyable), the state of the old copy would be left unchanged by the call, so I don't think this is a problem. You could also recover the behavior of the existing `Fuse` adapter (call it any number of times, exhaustion checked at runtime) by wrapping it in an `Option` like so: fn next_fusedT, Iter: IteratorT(opt_iter: mut OptionIter) - OptionT { opt_iter.take().and_then(|iter| { iter.next().map(|(next_iter, result)| { *opt_iter = Some(next_iter); result }) }) } Dunno about performance. Lots of copies/moves with this scheme, so it seems possible that it might be slower. ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] RFC: Conventions for well-behaved iterators
Taking `self` would make it much harder to thread iterators statefully through other function calls, and also make using Iterator trait objects harder/impossible, since `next` requires `~self` until 10672 is fixed, which is unacceptable, and mentions `Self` in the return value, making it uncallable through something with the type erased. Huon On 14/02/14 21:46, Gábor Lehel wrote: On Fri, Feb 14, 2014 at 12:35 AM, Daniel Micay danielmi...@gmail.com mailto:danielmi...@gmail.com wrote: On Thu, Feb 13, 2014 at 6:33 PM, Erick Tryzelaar erick.tryzel...@gmail.com mailto:erick.tryzel...@gmail.com wrote: On Thursday, February 13, 2014, Gábor Lehel glaebho...@gmail.com mailto:glaebho...@gmail.com wrote: This is not strictly true. If instead of fn next(mut self) - OptionA; we had something like fn next(self) - Option(Self, A); then access to exhausted iterators would be ruled out at the type level. (But it's more cumbersome to work with and is currently incompatible with trait objects.) This is an appealing option. If it is really this simple to close this undefined behavior, I think we should consider it. Are there any other downsides? Does it optimize down to the same code as our current iterators? It's certainly not as convenient and would only work if all iterators were marked as `NoPod`. Even if it were `Pod` (i.e. copyable), the state of the old copy would be left unchanged by the call, so I don't think this is a problem. You could also recover the behavior of the existing `Fuse` adapter (call it any number of times, exhaustion checked at runtime) by wrapping it in an `Option` like so: fn next_fusedT, Iter: IteratorT(opt_iter: mut OptionIter) - OptionT { opt_iter.take().and_then(|iter| { iter.next().map(|(next_iter, result)| { *opt_iter = Some(next_iter); result }) }) } Dunno about performance. Lots of copies/moves with this scheme, so it seems possible that it might be slower. ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] RFC: Conventions for well-behaved iterators
As you said, the type system can not enforce this rule, that's why the documentation have no choice but to say the behavior is undefined. If the code you write relies on None being returned forever, then you should use the Fuse iterator adaptor, that wraps an existing iterator and enforces this behavior: http://static.rust-lang.org/doc/master/std/iter/trait.Iterator.html#method.fuse http://static.rust-lang.org/doc/master/guide-container.html#iterator-adaptors - Message d'origine - De : Simon Sapin Envoyés : 13.02.14 16:05 À : rust-dev@mozilla.org Objet : [rust-dev] RFC: Conventions for well-behaved iterators Hi, The Rust documentation currently makes iterators behavior undefined after .next() has returned None once. http://static.rust-lang.org/doc/master/std/iter/trait.Iterator.html The Iterator protocol does not define behavior after None is returned. A concrete Iterator implementation may choose to behave however it wishes, either by returning None infinitely, or by doing something else. http://static.rust-lang.org/doc/master/guide-container.html In general, you cannot rely on the behavior of the next() method after it has returned None. Some iterators may return None forever. Others may behave differently. This is unfortunate. Code that accepts any iterator as input and does with it anything more complicated than a single 'for' loop will have to be defensive in order to not fall into undefined behavior. The type system can not enforce anything about this, but I’d like that we consider having conventions about well-behaved iterators. --- Proposal: 0. An iterator is said to be well-behaved if, after its .next() method has returned None once, any subsequent call also returns None. 1. Iterators *should* be well-behaved. 2. Iterators in libstd and other libraries distributed with rustc *must* be well-behaved. (I.e. not being well-behaved is a bug.) 3. When accepting an iterator as input, it’s ok to assume it’s well-behaved. 4. For iterator adaptors in particular, 3. means that 1. and 2. only apply for well-behaved input. (So that, eg. std::iter::Map can stay as straightforward as it is, and does not need to be coded defensively.) --- Does the general idea sound like something y’all want? I’m not overly attached to the details. -- Simon Sapin ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] RFC: Conventions for well-behaved iterators
On 13/02/2014 16:09, Olivier Renaud wrote: As you said, the type system can not enforce this rule, that's why the documentation have no choice but to say the behavior is undefined. Just because we can not make them automatically-enforced rules doesn’t mean we can’t have conventions. I’m suggesting that we adopt a convention that iterators *should* be well-behaved. If the code you write relies on None being returned forever, then you should use the Fuse iterator adaptor, that wraps an existing iterator and enforces this behavior: http://static.rust-lang.org/doc/master/std/iter/trait.Iterator.html#method.fuse http://static.rust-lang.org/doc/master/guide-container.html#iterator-adaptors -- Simon Sapin ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] RFC: Conventions for well-behaved iterators
For reference, this topic was discussed last August as well: https://mail.mozilla.org/pipermail/rust-dev/2013-August/005113.html On Thu, Feb 13, 2014 at 10:05 AM, Simon Sapin simon.sa...@exyr.org wrote: Hi, The Rust documentation currently makes iterators behavior undefined after .next() has returned None once. http://static.rust-lang.org/doc/master/std/iter/trait.Iterator.html The Iterator protocol does not define behavior after None is returned. A concrete Iterator implementation may choose to behave however it wishes, either by returning None infinitely, or by doing something else. http://static.rust-lang.org/doc/master/guide-container.html In general, you cannot rely on the behavior of the next() method after it has returned None. Some iterators may return None forever. Others may behave differently. This is unfortunate. Code that accepts any iterator as input and does with it anything more complicated than a single 'for' loop will have to be defensive in order to not fall into undefined behavior. The type system can not enforce anything about this, but I'd like that we consider having conventions about well-behaved iterators. --- Proposal: 0. An iterator is said to be well-behaved if, after its .next() method has returned None once, any subsequent call also returns None. 1. Iterators *should* be well-behaved. 2. Iterators in libstd and other libraries distributed with rustc *must* be well-behaved. (I.e. not being well-behaved is a bug.) 3. When accepting an iterator as input, it's ok to assume it's well-behaved. 4. For iterator adaptors in particular, 3. means that 1. and 2. only apply for well-behaved input. (So that, eg. std::iter::Map can stay as straightforward as it is, and does not need to be coded defensively.) --- Does the general idea sound like something y'all want? I'm not overly attached to the details. -- Simon Sapin ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] RFC: Conventions for well-behaved iterators
On Thu, Feb 13, 2014 at 5:09 PM, Olivier Renaud o.ren...@gmx.fr wrote: As you said, the type system can not enforce this rule, that's why the documentation have no choice but to say the behavior is undefined. This is not strictly true. If instead of fn next(mut self) - OptionA; we had something like fn next(self) - Option(Self, A); then access to exhausted iterators would be ruled out at the type level. (But it's more cumbersome to work with and is currently incompatible with trait objects.) If the code you write relies on None being returned forever, then you should use the Fuse iterator adaptor, that wraps an existing iterator and enforces this behavior: http://static.rust-lang.org/doc/master/std/iter/trait.Iterator.html#method.fuse http://static.rust-lang.org/doc/master/guide-container.html#iterator-adaptors - Message d'origine - De : Simon Sapin Envoyés : 13.02.14 16:05 À : rust-dev@mozilla.org Objet : [rust-dev] RFC: Conventions for well-behaved iterators Hi, The Rust documentation currently makes iterators behavior undefined after .next() has returned None once. http://static.rust-lang.org/doc/master/std/iter/trait.Iterator.html The Iterator protocol does not define behavior after None is returned. A concrete Iterator implementation may choose to behave however it wishes, either by returning None infinitely, or by doing something else. http://static.rust-lang.org/doc/master/guide-container.html In general, you cannot rely on the behavior of the next() method after it has returned None. Some iterators may return None forever. Others may behave differently. This is unfortunate. Code that accepts any iterator as input and does with it anything more complicated than a single 'for' loop will have to be defensive in order to not fall into undefined behavior. The type system can not enforce anything about this, but I'd like that we consider having conventions about well-behaved iterators. --- Proposal: 0. An iterator is said to be well-behaved if, after its .next() method has returned None once, any subsequent call also returns None. 1. Iterators *should* be well-behaved. 2. Iterators in libstd and other libraries distributed with rustc *must* be well-behaved. (I.e. not being well-behaved is a bug.) 3. When accepting an iterator as input, it's ok to assume it's well-behaved. 4. For iterator adaptors in particular, 3. means that 1. and 2. only apply for well-behaved input. (So that, eg. std::iter::Map can stay as straightforward as it is, and does not need to be coded defensively.) --- Does the general idea sound like something y'all want? I'm not overly attached to the details. -- Simon Sapin ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] RFC: Conventions for well-behaved iterators
On Thu, Feb 13, 2014 at 2:13 PM, Gábor Lehel glaebho...@gmail.com wrote: (But it's more cumbersome to work with and is currently incompatible with trait objects.) Iterators are already mostly incompatible with trait objects since all the adaptors take by-value self. ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] RFC: Conventions for well-behaved iterators
On Feb 13, 2014, at 11:56 AM, Daniel Micay danielmi...@gmail.com wrote: On Thu, Feb 13, 2014 at 10:05 AM, Simon Sapin simon.sa...@exyr.org wrote: Hi, The Rust documentation currently makes iterators behavior undefined after .next() has returned None once. http://static.rust-lang.org/doc/master/std/iter/trait.Iterator.html The Iterator protocol does not define behavior after None is returned. A concrete Iterator implementation may choose to behave however it wishes, either by returning None infinitely, or by doing something else. http://static.rust-lang.org/doc/master/guide-container.html In general, you cannot rely on the behavior of the next() method after it has returned None. Some iterators may return None forever. Others may behave differently. This is unfortunate. Code that accepts any iterator as input and does with it anything more complicated than a single 'for' loop will have to be defensive in order to not fall into undefined behavior. The type system can not enforce anything about this, but I’d like that we consider having conventions about well-behaved iterators. --- Proposal: 0. An iterator is said to be well-behaved if, after its .next() method has returned None once, any subsequent call also returns None. 1. Iterators *should* be well-behaved. 2. Iterators in libstd and other libraries distributed with rustc *must* be well-behaved. (I.e. not being well-behaved is a bug.) 3. When accepting an iterator as input, it’s ok to assume it’s well-behaved. 4. For iterator adaptors in particular, 3. means that 1. and 2. only apply for well-behaved input. (So that, eg. std::iter::Map can stay as straightforward as it is, and does not need to be coded defensively.) --- Does the general idea sound like something y’all want? I’m not overly attached to the details. -- Simon Sapin Enforcing this invariant makes many adaptors more complex. For example, the `filter` adaptor would need to maintain a boolean flag and branch on it. I'm fine with the current solution of a `fuse` adaptor because it moves all of the responsibility to a single location, and user-defined adaptors don't need to get this right. This was the main reasoning behind the current logic. The vast majority of users of iterators don't care about next() behavior after the iterator has returned None, so there was no need to make the iterator adaptors track extra state in the general case. Any client who does need it can just call `.fuse()` to get a Fuse adaptor that adds the necessary checks. -Kevin smime.p7s Description: S/MIME cryptographic signature ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] RFC: Conventions for well-behaved iterators
On 13/02/2014 19:56, Daniel Micay wrote: Enforcing this invariant makes many adaptors more complex. For example, the `filter` adaptor would need to maintain a boolean flag and branch on it. I'm fine with the current solution of a `fuse` adaptor because it moves all of the responsibility to a single location, and user-defined adaptors don't need to get this right. My entire email was about specifically *not* enforcing anything, only convention. Filter is in the exact same category as Map: it’s only expected to be well-behaved when its input is. -- Simon Sapin ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] RFC: Conventions for well-behaved iterators
On Thu, Feb 13, 2014 at 6:33 PM, Erick Tryzelaar erick.tryzel...@gmail.com wrote: On Thursday, February 13, 2014, Gábor Lehel glaebho...@gmail.com wrote: This is not strictly true. If instead of fn next(mut self) - OptionA; we had something like fn next(self) - Option(Self, A); then access to exhausted iterators would be ruled out at the type level. (But it's more cumbersome to work with and is currently incompatible with trait objects.) This is an appealing option. If it is really this simple to close this undefined behavior, I think we should consider it. Are there any other downsides? Does it optimize down to the same code as our current iterators? It's certainly not as convenient and would only work if all iterators were marked as `NoPod`. ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev