RFC 255 (v2) Fix iteration of nested hashes
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Fix iteration of nested hashes

=head1 VERSION

  Maintainer: Damian Conway <[EMAIL PROTECTED]>
  Date: 18 Sep 2000
  Last Modified: 19 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 255
  Version: 2
  Status: Developing

=head1 ABSTRACT

This RFC proposes that the internal cursor iterated by the C<each> function
be stored in the pad of the block containing the C<each>, rather than
being stored within the hash being iterated.

=head1 DESCRIPTION

Currently, nesting two C<each> iterations on the same hash leads to
unexpected behaviour, because both C<each>s advance the same internal
cursor within the hash. For example:

    %desc = ( blue  => "moon",
              green => "egg",
              red   => "Baron" );

    while ( my ($key1,$value1) = each %desc ) {
        while ( my ($key2,$value2) = each %desc ) {
            print "$value2 is not $key1\n"
                unless $key1 eq $key2;
        }
    }
    print "(finished)\n";

It is proposed that each C<each> maintain its own cursor (stored in the
pad of the block containing it) so that the above example DWIMs.

=head1 MIGRATION ISSUES

Minimal. No-one nests iterators now because it doesn't work.

Usages such as:

    $x = each %hash;
    $y = each %hash;
    @z = each %hash;

would change their behaviour, but could be translated if p52p6 defined:

    sub p5_each(\%) { each %{$_[0]} }

and globally replaced each Perl 5 C<each> by C<p5_each>.

There would not (necessarily) be any effect on the use of FIRSTKEY and
NEXTKEY in tied hashes, since the compiler could still determine which
should be called. However, tied hashes that use an internal cursor might
behave differently, if nested.

=head1 IMPLEMENTATION

Store the cursor in the pad of the block in which the C<each> is defined,
rather than within hash.

=head1 REFERENCES

RFC 136: (Implementation of hash iterators) suggests separate iterators
for C<each> and C<keys>/C<values>.
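As a minimal sketch of the Perl 5 behaviour the RFC describes (not part of the RFC itself): two separate C<each> calls on one hash advance a single cursor stored in the hash, so the second call picks up where the first stopped. The variable names below are illustrative only.

```perl
use strict;
use warnings;

my %desc = ( blue => "moon", green => "egg", red => "Baron" );

# Both calls share the one cursor kept inside %desc, so the second
# call returns the entry after the one the first call returned.
my $first  = each %desc;
my $second = each %desc;
print "first=$first second=$second\n";

# Only one entry is left for a subsequent loop over the same hash.
my $remaining = 0;
while ( my ($k, $v) = each %desc ) {
    $remaining++;
}
print "remaining=$remaining\n";
```

Because the inner loop of the nested example drains and thereby restarts that same cursor, the outer loop never sees a consistent position of its own.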
Re: RFC 255 (v2) Fix iteration of nested hashes
>This RFC proposes that the internal cursor iterated by the C<each> function
>be stored in the pad of the block containing the C<each>, rather than
>being stored within the hash being iterated.

Then how do you specify which iterator is to be reset when you wish
to do that? Currently, you do this by specifying the hash. If the
iterator is no longer affiliated with the hash, but the opcode node,
then what are you going to do?

--tom
Re: RFC 255 (v2) Fix iteration of nested hashes
   > >This RFC proposes that the internal cursor iterated by the C<each>
   > >function be stored in the pad of the block containing the C<each>,
   > >rather than being stored within the hash being iterated.
   >
   > Then how do you specify which iterator is to be reset when you wish
   > to do that? Currently, you do this by specifying the hash.

   > If the iterator is no longer affiliated with the hash, but the opcode node,

Just to note: in version 2 of the RFC, it's associated with the pad of
the block in which the C<each> appears.

   > then what are you going to do?

The short answer is that there is no "manual" reset of iterators.

Damian
Re: RFC 255 (v2) Fix iteration of nested hashes
On Wed, Sep 20, 2000 at 07:06:21AM +1100, Damian Conway wrote:
>> >This RFC proposes that the internal cursor iterated by the C<each>
>> >function be stored in the pad of the block containing the C<each>,
>> >rather than being stored within the hash being iterated.
>>
>> Then how do you specify which iterator is to be reset when you wish
>> to do that? Currently, you do this by specifying the hash.
>
>> If the iterator is no longer affiliated with the hash, but the opcode node,
>
> Just to note: in version 2 of the RFC, it's associated with the pad of
> the block in which the C<each> appears.

So you are suggesting that the first iteration of the loop reset the
iterator. This is because currently

    %hash = ( aa => 1, bb => 2);
    while(my($k,$v) = each %hash) {
      print $k,"\n"
    }

is no different to

    foreach (1 .. keys %hash) {
      while(my($k,$v) = each %hash) {
        print $k,"\n";
        last if $_ == 1;
      }
    }

Now if the iterator is associated with the scope, how do I reset it so
that the iterator always starts at the beginning each time, and not
only when it is called after having reached the end on the previous
time?

>> then what are you going to do?
>
> The short answer is that there is no "manual" reset of iterators.

Then you lose the ability to exit and reenter a scope, continuing where
you left off.

Graham.
Re: RFC 255 (v2) Fix iteration of nested hashes
>Just to note: in version 2 of the RFC, it's associated with the pad of
>the block in which the C<each> appears.

>   > then what are you going to do?

>The short answer is that there is no "manual" reset of iterators.

I am concerned about that.

    sub fn(\%) {
        my $href = shift;
        while (my($k,$v) = each %$href) {
            return if something's funny;
        }
    }

Now, imagine you call

    fn(%foo);
    fn(%bar);

and there's a premature exit. Isn't the second fn() going to not only be
at the wrong spot, but still worse, at the wrong hash?

Or do you plan for all block exits to clear all their iterators?
What happens then in this code:

    for my $hr (\(%foo, %bar, %glarch)) {
        push @first_keys, scalar each %$hr;
    }

There's no block exit there.

--tom
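For comparison, here is a small sketch of how the premature-exit case behaves under Perl 5's existing per-hash cursor, and how naming the hash resets it. The subs `first_seen` and `first_seen_fresh` are hypothetical helpers invented for this illustration; the early `return` stands in for the "something's funny" exit.

```perl
use strict;
use warnings;

# Hypothetical helper: leaves the loop on the first entry it sees.
sub first_seen {
    my ($href) = @_;
    while ( my ($k, $v) = each %$href ) {
        return $k;    # premature exit: the cursor stays parked inside the hash
    }
    return;
}

# Same helper, but it resets the hash's cursor before iterating.
sub first_seen_fresh {
    my ($href) = @_;
    keys %$href;      # keys in void context resets the cursor of this hash
    while ( my ($k) = each %$href ) {
        return $k;
    }
    return;
}

my %foo = ( a => 1, b => 2, c => 3 );

my $call1 = first_seen( \%foo );
my $call2 = first_seen( \%foo );    # resumes after $call1 instead of restarting

my $fresh1 = first_seen_fresh( \%foo );
my $fresh2 = first_seen_fresh( \%foo );    # always starts from the beginning

print "call1=$call1 call2=$call2 fresh1=$fresh1 fresh2=$fresh2\n";
```

The reset works only because the cursor belongs to the hash that is named; with a cursor held in the pad there is nothing comparable to name, which is the gap being pointed out here.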
Re: RFC 255 (v2) Fix iteration of nested hashes
> "DC" == Damian Conway <[EMAIL PROTECTED]> writes:

  >> >This RFC proposes that the internal cursor iterated by the C<each>
  >> >function be stored in the pad of the block containing the C<each>,
  >> >rather than being stored within the hash being iterated.
  >>
  >> Then how do you specify which iterator is to be reset when you wish
  >> to do that? Currently, you do this by specifying the hash.

  >> If the iterator is no longer affiliated with the hash, but the
  >> opcode node,

  DC> Just to note: in version 2 of the RFC, it's associated with the pad of
  DC> the block in which the C<each> appears.

  DC> The short answer is that there is no "manual" reset of iterators.

wouldn't exiting the block do a reset when it gets exited/reentered?
this shouldn't happen for closures but for other blocks.

also having a separate iterator variable which can be part of a closure
would solve the manual reset problem.

uri

--
Uri Guttman - [EMAIL PROTECTED] -- http://www.sysarch.com
SYStems ARCHitecture, Software Engineering, Perl, Internet, UNIX Consulting
The Perl Books Page --- http://www.sysarch.com/cgi-bin/perl_books
The Best Search Engine on the Net -- http://www.northernlight.com
Re: RFC 255 (v2) Fix iteration of nested hashes
Thanks to everyone for their valuable feedback on this RFC.

Clearly the proposed solution is not adequate, perhaps because it
does not address the central issue that iterators really ought to
be stateful objects, rather than statefree functions.

I don't have time to rework the proposal from scratch, so the
alternatives are:

        1. I retract the RFC, and we live with the problem
        2. I freeze the RFC, having added an Unresolved Problems section
        3. Someone else takes it over and fixes it properly

Preferences? Suggestions? Volunteers?

Damian
Re: RFC 255 (v2) Fix iteration of nested hashes
On Tue, 19 Sep 2000, Tom Christiansen wrote:

> >This RFC proposes that the internal cursor iterated by the C<each> function
> >be stored in the pad of the block containing the C<each>, rather than
> >being stored within the hash being iterated.
>
> Then how do you specify which iterator is to be reset when you wish
> to do that? Currently, you do this by specifying the hash. If the

Suppose we change each to be:

        each HASH
        each ITERATOR

and create a new keyword,

        iterator HASH

which creates a new iterator for the specified hash. This iterator can
then be eached, just like the hash, and reset using

        reset ITERATOR

Usage would then look like this (stealing Damian's code):

        %desc = ( blue  => "moon",
                  green => "egg",
                  red   => "Baron" );

        $i1 = iterator %desc;
        $i2 = iterator %desc;

        while ( my ($key1,$value1) = each $i1) {
            while ( my ($key2,$value2) = each $i2 ) {
                print "$value2 is not $key1\n"
                    unless $key1 eq $key2;
            }
        }
        print "(finished)\n";

This runs into problems if you currently have an iterator extant and you
modify the hash to which it points. Immediate suggestions on how to
handle this would be:

        1) Do what the docs currently do; tell people "don't do that"
        2) Have the iterator auto-reset when the hash is modified
                (probably bad)
        3) Make the hash unmodifiable while there is an iterator extant
                (probably bad)
        4) Make powerful magic in some way that isn't coming to mind

Dave
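The idea above can be roughly approximated in today's Perl 5 with closures. This is only a sketch under stated assumptions: `make_iterator` is a hypothetical stand-in for the proposed `iterator HASH` keyword, and the `next` and `reset` entries stand in for `each ITERATOR` and `reset ITERATOR`; none of these names exist in Perl. It sidesteps the modification problem by snapshotting the keys when the iterator is created, which is one design choice among several.

```perl
use strict;
use warnings;

# Hypothetical stand-in for "iterator HASH". Each iterator owns its own
# position, so two iterators over the same hash do not interfere.
sub make_iterator {
    my ($hash) = @_;
    my @keys = keys %$hash;    # snapshot: later key changes are not seen
    my $pos  = 0;
    return {
        # Stand-in for "each ITERATOR": next (key, value) pair, or an
        # empty list at the end, after which the iterator starts over.
        next => sub {
            if ( $pos >= @keys ) { $pos = 0; return }
            my $k = $keys[ $pos++ ];
            return ( $k, $hash->{$k} );
        },
        # Stand-in for "reset ITERATOR".
        reset => sub { $pos = 0; return },
    };
}

my %desc = ( blue => "moon", green => "egg", red => "Baron" );
my $i1 = make_iterator( \%desc );
my $i2 = make_iterator( \%desc );

my @lines;
while ( my ( $key1, $value1 ) = $i1->{next}->() ) {
    while ( my ( $key2, $value2 ) = $i2->{next}->() ) {
        push @lines, "$value2 is not $key1" unless $key1 eq $key2;
    }
}
print "$_\n" for @lines;
print "(finished)\n";
```

With three keys the nested loops visit every ordered pair of distinct keys, so six lines are printed before "(finished)". A key deleted from the hash after the snapshot would still be returned, with an undefined value, which corresponds to option 1 in the list above.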
Re: RFC 255 (v2) Fix iteration of nested hashes
In message <[EMAIL PROTECTED]>
          Dave Storrs <[EMAIL PROTECTED]> wrote:

> This runs into problems if you currently have an iterator extant and you
> modify the hash to which it points. Immediate suggestions on how to
> handle this would be:
>
>       1) Do what the docs currently do; tell people "don't do that"
>       2) Have the iterator auto-reset when the hash is modified
>               (probably bad)
>       3) Make the hash unmodifiable while there is an iterator extant
>               (probably bad)
>       4) Make powerful magic in some way that isn't coming to mind

See the "Freezing state for keys and values efficiently" section
of RFC 136 for some powerful magic that could achieve this...

Tom

--
Tom Hughes ([EMAIL PROTECTED])
http://www.compton.nu/
...Reading is thinking with someone else's head instead of one's own.
Re: RFC 255 (v2) Fix iteration of nested hashes
Dear all,

Since no-one has put their hand up to take this RFC over, I am now
intending to retract it. I simply don't have the time to try and
find a solution to the many (valid) problems that have been pointed out.

I would heartily encourage anyone who wants to take on this monster
to steal whatever they feel is worthwhile from this now-defunct proposal.

Damian
Re: RFC 255 (v2) Fix iteration of nested hashes
On Mon, 25 Sep 2000 17:18:56 +1100 (EST), Damian Conway wrote:

>Since no-one has put their hand up to take this RFC over, I am now
>intending to retract it. I simply don't have the time to try and
>find a solution to the many (valid) problems that have been pointed out.
>
>I would heartily encourage anyone who wants to take on this monster
>to steal whatever they feel is worthwhile from this now-defunct proposal.

I am not proposing to take this over. Frankly, I don't care enough,
because I don't ever use "each".

But I had written a reply, and I'm not sure whether I ever sent it. In
it, I proposed to give each a lexical scope, possibly optional. That
way, even if you do recursion in a function that uses each, you'd get a
*different* iterator for every time you come across it.

Would that solve your problem? I think it could.

As for the "possibly optional" lexical scoping: the next syntax is a bit
ugly, but it shows some potential:

    while(my($key, $value) = my each %hash) {
                             ^^
        ...
    }

--
        Bart.