El jueves, 27 de febrero de 2014 18:45:35 UTC-6, Kevin Squire escribió: > > So, no one has really explained why this is happening. > > Sets are built on top of hash tables. The hash table itself is just an > array, many entries of which are not being used. > > As with all iteration in Julia, for iteration over a Set, start, next, > and done functions are defined (see > http://docs.julialang.org/en/latest/stdlib/base/#iteration). > > start(s::Set) iterates over the hash table and returns the index of the > first used entry (no values are returned yet from the Set). > > next(s::Set, i) returns (s.hash_table[i], i_next), where s.hash_table[i] is > the value for the current iteration, and i_next is the location of the > *following* entry in the hash table. > > done(s::Set, i) returns true if we're done iterating (which will > specifically be true when i > length(s.hash_table)) > > (Most of this is actually defined in Dict, upon which Set is defined, but > the essentials are correct and easier to understand above, IMHO.) > > The problem with modifying the table is two-fold: > > 1. next(s::Set, i) has already returned the location of the next entry > in the hash table; if you delete that entry, the next location is actually > invalid > 2. adding values to the hash table may trigger a resize (removing does > not) > > Problem 1. could be fixed by having next() check the next value before > returning it, which would allow removing elements from the Set. Any > change would likely make iteration over sets slightly less efficient. Do > we want to support this? > > Adding elements to the Set will never be a good idea. > For now, I'll take back my recommendation about PersistentSets in the > FunctionalCollections.jl package, as they seem to have other issues, and > I'm not sure now if they'll actually work. >
OK, this seems to be a much deeper issue than I thought. I am coming from C++, where this seems (?) to be a solved problem, in the sense that the standard specifies that iterators are not affected by deleting other elements. (Though now I am doubting this.) But this depends crucially on the underlying implementation apparently. A solution, I guess, is to copy the Set into an array and iterate over the array, checking membership of each element. But this seems to be very expensive. David. > > Kevin > > > > On Thu, Feb 27, 2014 at 8:02 AM, David P. Sanders > <dpsa...@gmail.com<javascript:> > > wrote: > >> >> >> El jueves, 27 de febrero de 2014 09:54:58 UTC-6, David P. Sanders >> escribió: >> >>> I need to iterate over a Set whose elements will be deleted in the >>> process. >>> However, some deleted elements are included in the iteration, e.g. the >>> element 4 in the code below. >>> >> >> According to the following, it is enough to check if the element is `in` >> the Set. >> It seems wrong that this should be necessary, though! >> >> >> julia> for i in s >> println("i=$i s=$s $(i in s)") >> delete!(s, i-1) >> println("now s=$s\n") >> end >> i=5 s=Set{Int64}({5,4,6,3}) true >> now s=Set{Int64}({5,6,3}) >> >> i=4 s=Set{Int64}({5,6,3}) false >> now s=Set{Int64}({5,6}) >> >> i=6 s=Set{Int64}({5,6}) true >> now s=Set{Int64}({6}) >> >> >> >>> >>> Is this a bug, or am I misunderstanding? >>> >>> Thanks. >>> >>> julia> s = Set(5, 3, 4, 6) >>> Set{Int64}({5,4,6,3}) >>> >>> julia> for i in s >>> println("i=$i s=$s") >>> delete!(s, i-1) >>> println("now s=$s\n") >>> end >>> i=5 s=Set{Int64}({5,4,6,3}) >>> now s=Set{Int64}({5,6,3}) >>> >>> i=4 s=Set{Int64}({5,6,3}) >>> now s=Set{Int64}({5,6}) >>> >>> i=6 s=Set{Int64}({5,6}) >>> now s=Set{Int64}({6}) >>> >>> >>> >>> >