Re: [julia-users] Re: iterator construct seems incorrect?

2014-07-23 Thread Stefan Karpinski
Glad it makes sense. Maybe we could explain that better.

 On Jul 23, 2014, at 3:20 PM, vava...@uwaterloo.ca wrote:
 
 Dear Julia users,
 
 Thank you for all your thoughtful replies on this matter.  Indeed, you are 
 all correct, and I misunderstood the construct.  (The point that confused me 
 was that in the call (i,state)=next(I,state), I mistakenly assumed that the 
 returned state should correspond with returned data item i, when in fact 
 the proper approach is for the data item i to correspond to the old state 
 and to lag one position behind the returned state.)  I no longer think 
 there is a problem with the start/done/next paradigm in Julia.
 
 --- Steve Vavasis
 
 
 
 On Wednesday, July 23, 2014 12:29:12 AM UTC+3, vav...@uwaterloo.ca wrote:
 Dear Julia users,
 
 As I have mentioned in earlier posts, I am working on a 2-3 tree 
 implementation of a sort-order dict, that is, a dict in which the (key, 
 value) pairs can be retrieved in the sort-order of the keys.  According to 
 section 2.1.7 of the manual, for i = I; body ; end translates to:
 
   state = start(I)
   while !done(I, state)
   (i,state) = next(I,state)
   body 
   end
 
 The more obvious way to implement the loop would be to put the body BEFORE 
 the 'next' statement, and at first I thought that maybe the manual has a 
 typo.  But then I checked the file dict.jl, and I found indeed the code:
 
 done(t::ObjectIdDict, i) = is(next(t,i),())
 
 So this means that every loop iteration over an ObjectIdDict requires two 
 calls to 'next', one as part of the call to 'done', and a second one to 
 actually advance the loop.
 
 I also looked at the code for 'done' for Dict in dict.jl, and it appears to 
 be buggy(??).  It does not check whether everything after the current i is 
 deleted(??)
 
 For a 2-3 tree, the 'next' operation is logarithmic time, so requiring it 
 twice per loop iteration is undesirable.
 
 Can anyone shed light on why the Julia loop construction is implemented this 
 way, so that two separate calls to 'next' are necessary?  I suppose that it 
 is possible with additional code complexity to cache the result of the first 
 call to 'next' inside the state, but what exactly is the rationale for the 
 current design?
 
 Thanks,
 Steve Vavasis
 


Re: [julia-users] Re: iterator construct seems incorrect?

2014-07-22 Thread Tim Holy
Thanks for clarifying. While I know exactly what you're talking about, and I 
wasn't around when iterators were designed, I've always assumed that the 
current behavior follows from two principles: (1) `done` must evaluate to a 
boolean, and (2) the `item` should not leak out of the scope block defined by 
the `while` loop. Last I thought through it carefully, the current design 
seemed inevitable.

Best,
--Tim

On Tuesday, July 22, 2014 02:59:33 PM vava...@uwaterloo.ca wrote:
 Tim,
 
 The fact that 'next' is called before the loop body rather than after means
 that the 'done' predicate must provide an answer to the question: if
 'next' is called on the current state, then would the result be the end of
 the data structure?  The obvious way to answer that question is to
 actually invoke 'next' to see what happens. For a balanced tree, there is
 no simpler way that I know of to answer the question other than calling
 'next'.  It is possible to avoid the performance hit with additional code
 complexity by caching the result of 'next', but still, the question
 remains, why are start/done/next implemented in this way instead of the
 more obvious way (for example, in the C implementation of for(..; .. ; ..)
 the 'next' operation is invoked at the end of the body).
 
 -- Steve
 
 On Wednesday, July 23, 2014 12:29:12 AM UTC+3, vav...@uwaterloo.ca wrote:
  Dear Julia users,
  
  As I have mentioned in earlier posts, I am working on a 2-3 tree
  implementation of a sort-order dict, that is, a dict in which the (key,
  value) pairs can be retrieved in the sort-order of the keys.  According to
  
  section 2.1.7 of the manual, for i = I; body ; end translates to:
state = start(I)
while !done(I, state)

(i,state) = next(I,state)
body

end
  
  The more obvious way to implement the loop would be to put the body BEFORE
  the 'next' statement, and at first I thought that maybe the manual has a
  typo.  But then I checked the file dict.jl, and I found indeed the code:
  
  done(t::ObjectIdDict, i) = is(next(t,i),())
  
  So this means that every loop iteration over an ObjectIdDict requires two
  calls to 'next', one as part of the call to 'done', and a second one to
  actually advance the loop.
  
  I also looked at the code for 'done' for Dict in dict.jl, and it appears
  to be buggy(??).  It does not check whether everything after the current i
  is deleted(??)
  
  For a 2-3 tree, the 'next' operation is logarithmic time, so requiring it
  twice per loop iteration is undesirable.
  
  Can anyone shed light on why the Julia loop construction is implemented
  this way, so that two separate calls to 'next' are necessary?  I suppose
  that it is possible with additional code complexity to cache the result of
  the first call to 'next' inside the state, but what exactly is the
  rationale for the current design?
  
  Thanks,
  Steve Vavasis



Re: [julia-users] Re: iterator construct seems incorrect?

2014-07-22 Thread Iain Dunning
Two separate calls to next certainly aren't in general necessary, and I 
haven't hit any issues implementing the iterator interface before. In fact, 
it felt quite natural in the end.

I suspect that no matter which way you do the iteration interface, you'll 
be able to find an example where you'd prefer it to be the other way - it'd 
be nice to formalize that.

On Tuesday, July 22, 2014 7:34:18 PM UTC-4, Tim Holy wrote:

 Thanks for clarifying. While I know exactly what you're talking about, and 
 I 
 wasn't around when iterators were designed, I've always assumed that the 
 current behavior follows from two principles: (1) `done` must evaluate to 
 a 
 boolean, and (2) the `item` should not leak out of the scope block defined 
 by 
 the `while` loop. Last I thought through it carefully, the current design 
 seemed inevitable. 

 Best, 
 --Tim 

 On Tuesday, July 22, 2014 02:59:33 PM vav...@uwaterloo.ca javascript: 
 wrote: 
  Tim, 
  
  The fact that 'next' is called before the loop body rather than after 
 means 
  that the 'done' predicate must provide an answer to the question: if 
  'next' is called on the current state, then would the result be the end 
 of 
  the data structure?  The obvious way to answer that question is to 
  actually invoke 'next' to see what happens. For a balanced tree, there 
 is 
  no simpler way that I know of to answer the question other than calling 
  'next'.  It is possible to avoid the performance hit with additional 
 code 
  complexity by caching the result of 'next', but still, the question 
  remains, why are start/done/next implemented in this way instead of the 
  more obvious way (for example, in the C implementation of for(..; .. ; 
 ..) 
  the 'next' operation is invoked at the end of the body). 
  
  -- Steve 
  
  On Wednesday, July 23, 2014 12:29:12 AM UTC+3, vav...@uwaterloo.ca 
 wrote: 
   Dear Julia users, 
   
   As I have mentioned in earlier posts, I am working on a 2-3 tree 
   implementation of a sort-order dict, that is, a dict in which the 
 (key, 
   value) pairs can be retrieved in the sort-order of the keys. 
  According to 
   
   section 2.1.7 of the manual, for i = I; body ; end translates to: 
 state = start(I) 
 while !done(I, state) 
 
 (i,state) = next(I,state) 
 body 
 
 end 
   
   The more obvious way to implement the loop would be to put the body 
 BEFORE 
   the 'next' statement, and at first I thought that maybe the manual has 
 a 
   typo.  But then I checked the file dict.jl, and I found indeed the 
 code: 
   
   done(t::ObjectIdDict, i) = is(next(t,i),()) 
   
   So this means that every loop iteration over an ObjectIdDict requires 
 two 
   calls to 'next', one as part of the call to 'done', and a second one 
 to 
   actually advance the loop. 
   
   I also looked at the code for 'done' for Dict in dict.jl, and it 
 appears 
   to be buggy(??).  It does not check whether everything after the 
 current i 
   is deleted(??) 
   
   For a 2-3 tree, the 'next' operation is logarithmic time, so requiring 
 it 
   twice per loop iteration is undesirable. 
   
   Can anyone shed light on why the Julia loop construction is 
 implemented 
   this way, so that two separate calls to 'next' are necessary?  I 
 suppose 
   that it is possible with additional code complexity to cache the 
 result of 
   the first call to 'next' inside the state, but what exactly is the 
   rationale for the current design? 
   
   Thanks, 
   Steve Vavasis 



Re: [julia-users] Re: iterator construct seems incorrect?

2014-07-22 Thread Tim Holy
The other point to make is that if you find it more natural to do your 
incrementing inside done, there's no rule saying that next has to do any 
actual work. You can cache the item somewhere and just have next fetch it.

--Tim

On Tuesday, July 22, 2014 06:13:25 PM Iain Dunning wrote:
 Two separate calls to next certainly aren't in general necessary, and I
 haven't hit any issues implementing the iterator interface before. In fact,
 it felt quite natural in the end.
 
 I suspect that no matter which way you do the iteration interface, you'll
 be able to find an example where you'd prefer it to be the other way - it'd
 be nice to formalize that.
 
 On Tuesday, July 22, 2014 7:34:18 PM UTC-4, Tim Holy wrote:
  Thanks for clarifying. While I know exactly what you're talking about, and
  I
  wasn't around when iterators were designed, I've always assumed that the
  current behavior follows from two principles: (1) `done` must evaluate to
  a
  boolean, and (2) the `item` should not leak out of the scope block defined
  by
  the `while` loop. Last I thought through it carefully, the current design
  seemed inevitable.
  
  Best,
  --Tim
  
  On Tuesday, July 22, 2014 02:59:33 PM vav...@uwaterloo.ca javascript:
  
  wrote:
   Tim,
   
   The fact that 'next' is called before the loop body rather than after
  
  means
  
   that the 'done' predicate must provide an answer to the question: if
   'next' is called on the current state, then would the result be the end
  
  of
  
   the data structure?  The obvious way to answer that question is to
   actually invoke 'next' to see what happens. For a balanced tree, there
  
  is
  
   no simpler way that I know of to answer the question other than calling
   'next'.  It is possible to avoid the performance hit with additional
  
  code
  
   complexity by caching the result of 'next', but still, the question
   remains, why are start/done/next implemented in this way instead of the
   more obvious way (for example, in the C implementation of for(..; .. ;
  
  ..)
  
   the 'next' operation is invoked at the end of the body).
   
   -- Steve
   
   On Wednesday, July 23, 2014 12:29:12 AM UTC+3, vav...@uwaterloo.ca
  
  wrote:
Dear Julia users,

As I have mentioned in earlier posts, I am working on a 2-3 tree
implementation of a sort-order dict, that is, a dict in which the
  
  (key,
  
value) pairs can be retrieved in the sort-order of the keys.
   
   According to
   
section 2.1.7 of the manual, for i = I; body ; end translates to:
  state = start(I)
  while !done(I, state)
  
  (i,state) = next(I,state)
  body
  
  end

The more obvious way to implement the loop would be to put the body
  
  BEFORE
  
the 'next' statement, and at first I thought that maybe the manual has
  
  a
  
typo.  But then I checked the file dict.jl, and I found indeed the
  
  code:
done(t::ObjectIdDict, i) = is(next(t,i),())

So this means that every loop iteration over an ObjectIdDict requires
  
  two
  
calls to 'next', one as part of the call to 'done', and a second one
  
  to
  
actually advance the loop.

I also looked at the code for 'done' for Dict in dict.jl, and it
  
  appears
  
to be buggy(??).  It does not check whether everything after the
  
  current i
  
is deleted(??)

For a 2-3 tree, the 'next' operation is logarithmic time, so requiring
  
  it
  
twice per loop iteration is undesirable.

Can anyone shed light on why the Julia loop construction is
  
  implemented
  
this way, so that two separate calls to 'next' are necessary?  I
  
  suppose
  
that it is possible with additional code complexity to cache the
  
  result of
  
the first call to 'next' inside the state, but what exactly is the
rationale for the current design?

Thanks,
Steve Vavasis



Re: [julia-users] Re: iterator construct seems incorrect?

2014-07-22 Thread Jameson Nash
it can however to better for done() to do no work, but to have next
actually compute the next+1 state:

start(x) = next(x)
next(x,i) = (i, next(x))
done(x,i) = (i===sentinel)

this makes next somewhat more expensive to call repeatedly, but has the
benefit that the user can hold onto a reference to any arbitrary node
(until they mutate the underlying object) to perform custom iteration
sequences

(where sentinel could either be a special value, or a boolean that is part
of the state tuple)


On Tue, Jul 22, 2014 at 9:42 PM, Tim Holy tim.h...@gmail.com wrote:

 The other point to make is that if you find it more natural to do your
 incrementing inside done, there's no rule saying that next has to do any
 actual work. You can cache the item somewhere and just have next fetch it.

 --Tim

 On Tuesday, July 22, 2014 06:13:25 PM Iain Dunning wrote:
  Two separate calls to next certainly aren't in general necessary, and I
  haven't hit any issues implementing the iterator interface before. In
 fact,
  it felt quite natural in the end.
 
  I suspect that no matter which way you do the iteration interface, you'll
  be able to find an example where you'd prefer it to be the other way -
 it'd
  be nice to formalize that.
 
  On Tuesday, July 22, 2014 7:34:18 PM UTC-4, Tim Holy wrote:
   Thanks for clarifying. While I know exactly what you're talking about,
 and
   I
   wasn't around when iterators were designed, I've always assumed that
 the
   current behavior follows from two principles: (1) `done` must evaluate
 to
   a
   boolean, and (2) the `item` should not leak out of the scope block
 defined
   by
   the `while` loop. Last I thought through it carefully, the current
 design
   seemed inevitable.
  
   Best,
   --Tim
  
   On Tuesday, July 22, 2014 02:59:33 PM vav...@uwaterloo.ca
 javascript:
  
   wrote:
Tim,
   
The fact that 'next' is called before the loop body rather than after
  
   means
  
that the 'done' predicate must provide an answer to the question: if
'next' is called on the current state, then would the result be the
 end
  
   of
  
the data structure?  The obvious way to answer that question is to
actually invoke 'next' to see what happens. For a balanced tree,
 there
  
   is
  
no simpler way that I know of to answer the question other than
 calling
'next'.  It is possible to avoid the performance hit with additional
  
   code
  
complexity by caching the result of 'next', but still, the question
remains, why are start/done/next implemented in this way instead of
 the
more obvious way (for example, in the C implementation of for(..; ..
 ;
  
   ..)
  
the 'next' operation is invoked at the end of the body).
   
-- Steve
   
On Wednesday, July 23, 2014 12:29:12 AM UTC+3, vav...@uwaterloo.ca
  
   wrote:
 Dear Julia users,

 As I have mentioned in earlier posts, I am working on a 2-3 tree
 implementation of a sort-order dict, that is, a dict in which the
  
   (key,
  
 value) pairs can be retrieved in the sort-order of the keys.
  
According to
  
 section 2.1.7 of the manual, for i = I; body ; end translates
 to:
   state = start(I)
   while !done(I, state)

   (i,state) = next(I,state)
   body

   end

 The more obvious way to implement the loop would be to put the body
  
   BEFORE
  
 the 'next' statement, and at first I thought that maybe the manual
 has
  
   a
  
 typo.  But then I checked the file dict.jl, and I found indeed the
  
   code:
 done(t::ObjectIdDict, i) = is(next(t,i),())

 So this means that every loop iteration over an ObjectIdDict
 requires
  
   two
  
 calls to 'next', one as part of the call to 'done', and a second
 one
  
   to
  
 actually advance the loop.

 I also looked at the code for 'done' for Dict in dict.jl, and it
  
   appears
  
 to be buggy(??).  It does not check whether everything after the
  
   current i
  
 is deleted(??)

 For a 2-3 tree, the 'next' operation is logarithmic time, so
 requiring
  
   it
  
 twice per loop iteration is undesirable.

 Can anyone shed light on why the Julia loop construction is
  
   implemented
  
 this way, so that two separate calls to 'next' are necessary?  I
  
   suppose
  
 that it is possible with additional code complexity to cache the
  
   result of
  
 the first call to 'next' inside the state, but what exactly is the
 rationale for the current design?

 Thanks,
 Steve Vavasis