Re: [julia-users] Re: iterator construct seems incorrect?
Glad it makes sense. Maybe we could explain that better. On Jul 23, 2014, at 3:20 PM, vava...@uwaterloo.ca wrote: Dear Julia users, Thank you for all your thoughtful replies on this matter. Indeed, you are all correct, and I misunderstood the construct. (The point that confused me was that in the call (i,state)=next(I,state), I mistakenly assumed that the returned state should correspond with returned data item i, when in fact the proper approach is for the data item i to correspond to the old state and to lag one position behind the returned state.) I no longer think there is a problem with the start/done/next paradigm in Julia. --- Steve Vavasis On Wednesday, July 23, 2014 12:29:12 AM UTC+3, vav...@uwaterloo.ca wrote: Dear Julia users, As I have mentioned in earlier posts, I am working on a 2-3 tree implementation of a sort-order dict, that is, a dict in which the (key, value) pairs can be retrieved in the sort-order of the keys. According to section 2.1.7 of the manual, for i = I; body ; end translates to: state = start(I) while !done(I, state) (i,state) = next(I,state) body end The more obvious way to implement the loop would be to put the body BEFORE the 'next' statement, and at first I thought that maybe the manual has a typo. But then I checked the file dict.jl, and I found indeed the code: done(t::ObjectIdDict, i) = is(next(t,i),()) So this means that every loop iteration over an ObjectIdDict requires two calls to 'next', one as part of the call to 'done', and a second one to actually advance the loop. I also looked at the code for 'done' for Dict in dict.jl, and it appears to be buggy(??). It does not check whether everything after the current i is deleted(??) For a 2-3 tree, the 'next' operation is logarithmic time, so requiring it twice per loop iteration is undesirable. Can anyone shed light on why the Julia loop construction is implemented this way, so that two separate calls to 'next' are necessary? I suppose that it is possible with additional code complexity to cache the result of the first call to 'next' inside the state, but what exactly is the rationale for the current design? Thanks, Steve Vavasis
Re: [julia-users] Re: iterator construct seems incorrect?
Thanks for clarifying. While I know exactly what you're talking about, and I wasn't around when iterators were designed, I've always assumed that the current behavior follows from two principles: (1) `done` must evaluate to a boolean, and (2) the `item` should not leak out of the scope block defined by the `while` loop. Last I thought through it carefully, the current design seemed inevitable. Best, --Tim On Tuesday, July 22, 2014 02:59:33 PM vava...@uwaterloo.ca wrote: Tim, The fact that 'next' is called before the loop body rather than after means that the 'done' predicate must provide an answer to the question: if 'next' is called on the current state, then would the result be the end of the data structure? The obvious way to answer that question is to actually invoke 'next' to see what happens. For a balanced tree, there is no simpler way that I know of to answer the question other than calling 'next'. It is possible to avoid the performance hit with additional code complexity by caching the result of 'next', but still, the question remains, why are start/done/next implemented in this way instead of the more obvious way (for example, in the C implementation of for(..; .. ; ..) the 'next' operation is invoked at the end of the body). -- Steve On Wednesday, July 23, 2014 12:29:12 AM UTC+3, vav...@uwaterloo.ca wrote: Dear Julia users, As I have mentioned in earlier posts, I am working on a 2-3 tree implementation of a sort-order dict, that is, a dict in which the (key, value) pairs can be retrieved in the sort-order of the keys. According to section 2.1.7 of the manual, for i = I; body ; end translates to: state = start(I) while !done(I, state) (i,state) = next(I,state) body end The more obvious way to implement the loop would be to put the body BEFORE the 'next' statement, and at first I thought that maybe the manual has a typo. But then I checked the file dict.jl, and I found indeed the code: done(t::ObjectIdDict, i) = is(next(t,i),()) So this means that every loop iteration over an ObjectIdDict requires two calls to 'next', one as part of the call to 'done', and a second one to actually advance the loop. I also looked at the code for 'done' for Dict in dict.jl, and it appears to be buggy(??). It does not check whether everything after the current i is deleted(??) For a 2-3 tree, the 'next' operation is logarithmic time, so requiring it twice per loop iteration is undesirable. Can anyone shed light on why the Julia loop construction is implemented this way, so that two separate calls to 'next' are necessary? I suppose that it is possible with additional code complexity to cache the result of the first call to 'next' inside the state, but what exactly is the rationale for the current design? Thanks, Steve Vavasis
Re: [julia-users] Re: iterator construct seems incorrect?
Two separate calls to next certainly aren't in general necessary, and I haven't hit any issues implementing the iterator interface before. In fact, it felt quite natural in the end. I suspect that no matter which way you do the iteration interface, you'll be able to find an example where you'd prefer it to be the other way - it'd be nice to formalize that. On Tuesday, July 22, 2014 7:34:18 PM UTC-4, Tim Holy wrote: Thanks for clarifying. While I know exactly what you're talking about, and I wasn't around when iterators were designed, I've always assumed that the current behavior follows from two principles: (1) `done` must evaluate to a boolean, and (2) the `item` should not leak out of the scope block defined by the `while` loop. Last I thought through it carefully, the current design seemed inevitable. Best, --Tim On Tuesday, July 22, 2014 02:59:33 PM vav...@uwaterloo.ca javascript: wrote: Tim, The fact that 'next' is called before the loop body rather than after means that the 'done' predicate must provide an answer to the question: if 'next' is called on the current state, then would the result be the end of the data structure? The obvious way to answer that question is to actually invoke 'next' to see what happens. For a balanced tree, there is no simpler way that I know of to answer the question other than calling 'next'. It is possible to avoid the performance hit with additional code complexity by caching the result of 'next', but still, the question remains, why are start/done/next implemented in this way instead of the more obvious way (for example, in the C implementation of for(..; .. ; ..) the 'next' operation is invoked at the end of the body). -- Steve On Wednesday, July 23, 2014 12:29:12 AM UTC+3, vav...@uwaterloo.ca wrote: Dear Julia users, As I have mentioned in earlier posts, I am working on a 2-3 tree implementation of a sort-order dict, that is, a dict in which the (key, value) pairs can be retrieved in the sort-order of the keys. According to section 2.1.7 of the manual, for i = I; body ; end translates to: state = start(I) while !done(I, state) (i,state) = next(I,state) body end The more obvious way to implement the loop would be to put the body BEFORE the 'next' statement, and at first I thought that maybe the manual has a typo. But then I checked the file dict.jl, and I found indeed the code: done(t::ObjectIdDict, i) = is(next(t,i),()) So this means that every loop iteration over an ObjectIdDict requires two calls to 'next', one as part of the call to 'done', and a second one to actually advance the loop. I also looked at the code for 'done' for Dict in dict.jl, and it appears to be buggy(??). It does not check whether everything after the current i is deleted(??) For a 2-3 tree, the 'next' operation is logarithmic time, so requiring it twice per loop iteration is undesirable. Can anyone shed light on why the Julia loop construction is implemented this way, so that two separate calls to 'next' are necessary? I suppose that it is possible with additional code complexity to cache the result of the first call to 'next' inside the state, but what exactly is the rationale for the current design? Thanks, Steve Vavasis
Re: [julia-users] Re: iterator construct seems incorrect?
The other point to make is that if you find it more natural to do your incrementing inside done, there's no rule saying that next has to do any actual work. You can cache the item somewhere and just have next fetch it. --Tim On Tuesday, July 22, 2014 06:13:25 PM Iain Dunning wrote: Two separate calls to next certainly aren't in general necessary, and I haven't hit any issues implementing the iterator interface before. In fact, it felt quite natural in the end. I suspect that no matter which way you do the iteration interface, you'll be able to find an example where you'd prefer it to be the other way - it'd be nice to formalize that. On Tuesday, July 22, 2014 7:34:18 PM UTC-4, Tim Holy wrote: Thanks for clarifying. While I know exactly what you're talking about, and I wasn't around when iterators were designed, I've always assumed that the current behavior follows from two principles: (1) `done` must evaluate to a boolean, and (2) the `item` should not leak out of the scope block defined by the `while` loop. Last I thought through it carefully, the current design seemed inevitable. Best, --Tim On Tuesday, July 22, 2014 02:59:33 PM vav...@uwaterloo.ca javascript: wrote: Tim, The fact that 'next' is called before the loop body rather than after means that the 'done' predicate must provide an answer to the question: if 'next' is called on the current state, then would the result be the end of the data structure? The obvious way to answer that question is to actually invoke 'next' to see what happens. For a balanced tree, there is no simpler way that I know of to answer the question other than calling 'next'. It is possible to avoid the performance hit with additional code complexity by caching the result of 'next', but still, the question remains, why are start/done/next implemented in this way instead of the more obvious way (for example, in the C implementation of for(..; .. ; ..) the 'next' operation is invoked at the end of the body). -- Steve On Wednesday, July 23, 2014 12:29:12 AM UTC+3, vav...@uwaterloo.ca wrote: Dear Julia users, As I have mentioned in earlier posts, I am working on a 2-3 tree implementation of a sort-order dict, that is, a dict in which the (key, value) pairs can be retrieved in the sort-order of the keys. According to section 2.1.7 of the manual, for i = I; body ; end translates to: state = start(I) while !done(I, state) (i,state) = next(I,state) body end The more obvious way to implement the loop would be to put the body BEFORE the 'next' statement, and at first I thought that maybe the manual has a typo. But then I checked the file dict.jl, and I found indeed the code: done(t::ObjectIdDict, i) = is(next(t,i),()) So this means that every loop iteration over an ObjectIdDict requires two calls to 'next', one as part of the call to 'done', and a second one to actually advance the loop. I also looked at the code for 'done' for Dict in dict.jl, and it appears to be buggy(??). It does not check whether everything after the current i is deleted(??) For a 2-3 tree, the 'next' operation is logarithmic time, so requiring it twice per loop iteration is undesirable. Can anyone shed light on why the Julia loop construction is implemented this way, so that two separate calls to 'next' are necessary? I suppose that it is possible with additional code complexity to cache the result of the first call to 'next' inside the state, but what exactly is the rationale for the current design? Thanks, Steve Vavasis
Re: [julia-users] Re: iterator construct seems incorrect?
it can however to better for done() to do no work, but to have next actually compute the next+1 state: start(x) = next(x) next(x,i) = (i, next(x)) done(x,i) = (i===sentinel) this makes next somewhat more expensive to call repeatedly, but has the benefit that the user can hold onto a reference to any arbitrary node (until they mutate the underlying object) to perform custom iteration sequences (where sentinel could either be a special value, or a boolean that is part of the state tuple) On Tue, Jul 22, 2014 at 9:42 PM, Tim Holy tim.h...@gmail.com wrote: The other point to make is that if you find it more natural to do your incrementing inside done, there's no rule saying that next has to do any actual work. You can cache the item somewhere and just have next fetch it. --Tim On Tuesday, July 22, 2014 06:13:25 PM Iain Dunning wrote: Two separate calls to next certainly aren't in general necessary, and I haven't hit any issues implementing the iterator interface before. In fact, it felt quite natural in the end. I suspect that no matter which way you do the iteration interface, you'll be able to find an example where you'd prefer it to be the other way - it'd be nice to formalize that. On Tuesday, July 22, 2014 7:34:18 PM UTC-4, Tim Holy wrote: Thanks for clarifying. While I know exactly what you're talking about, and I wasn't around when iterators were designed, I've always assumed that the current behavior follows from two principles: (1) `done` must evaluate to a boolean, and (2) the `item` should not leak out of the scope block defined by the `while` loop. Last I thought through it carefully, the current design seemed inevitable. Best, --Tim On Tuesday, July 22, 2014 02:59:33 PM vav...@uwaterloo.ca javascript: wrote: Tim, The fact that 'next' is called before the loop body rather than after means that the 'done' predicate must provide an answer to the question: if 'next' is called on the current state, then would the result be the end of the data structure? The obvious way to answer that question is to actually invoke 'next' to see what happens. For a balanced tree, there is no simpler way that I know of to answer the question other than calling 'next'. It is possible to avoid the performance hit with additional code complexity by caching the result of 'next', but still, the question remains, why are start/done/next implemented in this way instead of the more obvious way (for example, in the C implementation of for(..; .. ; ..) the 'next' operation is invoked at the end of the body). -- Steve On Wednesday, July 23, 2014 12:29:12 AM UTC+3, vav...@uwaterloo.ca wrote: Dear Julia users, As I have mentioned in earlier posts, I am working on a 2-3 tree implementation of a sort-order dict, that is, a dict in which the (key, value) pairs can be retrieved in the sort-order of the keys. According to section 2.1.7 of the manual, for i = I; body ; end translates to: state = start(I) while !done(I, state) (i,state) = next(I,state) body end The more obvious way to implement the loop would be to put the body BEFORE the 'next' statement, and at first I thought that maybe the manual has a typo. But then I checked the file dict.jl, and I found indeed the code: done(t::ObjectIdDict, i) = is(next(t,i),()) So this means that every loop iteration over an ObjectIdDict requires two calls to 'next', one as part of the call to 'done', and a second one to actually advance the loop. I also looked at the code for 'done' for Dict in dict.jl, and it appears to be buggy(??). It does not check whether everything after the current i is deleted(??) For a 2-3 tree, the 'next' operation is logarithmic time, so requiring it twice per loop iteration is undesirable. Can anyone shed light on why the Julia loop construction is implemented this way, so that two separate calls to 'next' are necessary? I suppose that it is possible with additional code complexity to cache the result of the first call to 'next' inside the state, but what exactly is the rationale for the current design? Thanks, Steve Vavasis