On Sun, 07 Mar 2010 22:07:14 -0500, Steven Schveighoffer <schvei...@yahoo.com> wrote:
On Sun, 07 Mar 2010 12:43:09 -0500, Robert Jacques <sandf...@jhu.edu> wrote:
On Sun, 07 Mar 2010 08:23:03 -0500, Steven Schveighoffer <schvei...@yahoo.com> wrote:
[snip]
Please define for me an O(1) slice or index operation for a linked-list.

One for which you have references to the slice end points. I think this will work, and I was planning on providing it in the upcoming dcollections port. The only thing you cannot guarantee is that the order is correct.

The container would have to do an O(N) search to verify the ranges are actually part of the collection. And using two ranges as iterators to create a third range feels very wrong and very bug-prone: see all the issues raised during Andrei's iterators vs. ranges presentations. Similarly, it feels wrong for something to define slicing and not indexing.

By the way, having ranges detect if they reach their end nodes or not is fairly easy to do.

You are correct on that point: you could throw an exception as long as the end point is part of the range structure. If you just use a current node plus a length, then you cannot do that. But soft ops are not necessary to create this.
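To make that concrete, here is a rough sketch; the node and range names are hypothetical, not dcollections' actual types. The slice is just two node references, so building it is O(1), and popFront can throw if it walks off the list without ever meeting its stored end node:

import std.exception : enforce;

struct Node(T)
{
    T value;
    Node!T* next;
}

/// Forward range over [first, last): O(1) to build from two node references.
struct ListSlice(T)
{
    private Node!T* current;
    private Node!T* last;   // exclusive end; null means "to the end of the list"

    @property bool empty() const { return current is last; }

    @property ref T front()
    {
        assert(!empty);
        return current.value;
    }

    void popFront()
    {
        assert(!empty);
        current = current.next;
        // If the stored end node was removed from the list (or never belonged
        // to it), we run off the end of the list instead of reaching it.
        enforce(!(current is null && last !is null),
                "slice end point is no longer reachable");
    }

    auto save() { return this; }
}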

Soft ops are necessary to document in code whether that invalidation is expected to happen.


I still fail to see the difference between "soft" operations and non-soft. What does soft guarantee? Give me a concrete definition; an example would help too.

There are a couple of possible definitions for soft operations:
1) The memory safety of a collection's ranges is guaranteed.
2) The topology viewed by a range isn't logically changed, i.e. the range will continue to perform the same logical function if the topology it's operating on is updated.
3) The topology viewed by a range isn't actually changed, and all elements selected at range creation will be viewed.
4) Like 3, but with all values being viewed.

For example, modifying an array in any way doesn't change 1, 2 or 3 for any of its slices. For a linked list defining a forward range, mutation, insertion and removal can be done under 1 & 2.
The same can be said about doubly linked lists and bidirectional ranges.
For other containers, such as a sorted tree, mutation can break 2 and 3, though insertion and deletion don't break 2. Although the ranges will see many values, they may not see all the values currently in the collection, nor all the values that were in the collection when the iterator was generated. So code that relies on such properties would be logically invalid.

I'd probably define hard ops as being at level 1) and soft ops as being at level 2). 4) is really only possible with immutable containers.

Hard ops definitely qualify as #1, since we are in a GC'd language.

I don't really understand 2, "the range will continue to perform the same logical function." What does that mean? Define that function. I would define it as #3, so obviously you think it's something else.

#3 would be most useful for soft operations if it could be guaranteed. I don't think it can.

I was thinking of "iterate from the start to the end", for example. One might better describe this concept as: the topology of the container relative to the range doesn't change. Things before it stay before, things after it stay after, and, in the case of bidirectional ranges, things in the middle stay in the middle.

Wouldn't re-hashing necessitate re-allocation? (Thus the range would see a stale view.)

God no. If my hash collision solution is linked-list based (which it is in dcollections), why should I reallocate all those nodes? I just rearrange them in a new bucket array.
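Roughly like this; a sketch with hypothetical names rather than the actual dcollections code. The nodes are only relinked into a larger bucket array, never reallocated, so ranges that hold node references keep pointing at live nodes:

import std.algorithm : max;

struct HashNode(K, V)
{
    K key;
    V value;
    size_t hash;              // cached hash of the key
    HashNode!(K, V)* next;
}

/// Grow the bucket array and redistribute the existing chains.
/// No node is allocated or copied; nodes are only relinked.
void rehash(K, V)(ref HashNode!(K, V)*[] buckets)
{
    auto newBuckets = new HashNode!(K, V)*[max(4, buckets.length * 2)];

    foreach (head; buckets)
    {
        while (head !is null)
        {
            auto node = head;
            head = head.next;                      // detach from the old chain
            auto idx = node.hash % newBuckets.length;
            node.next = newBuckets[idx];           // push onto the new chain
            newBuckets[idx] = node;
        }
    }
    buckets = newBuckets;
}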

Sorry, I was assuming that if you were going to implement a hash collection you wouldn't be using a linked-list approach, since that's what D's associative arrays already do. There are some really good reasons not to use a list-based hash in D due to GC false-pointer issues, but basically none for re-implementing (poorly?) D's built-in data structure.

Hm... my hash outperforms the built-in AAs by a wide margin. But this is not technically because my implementation is better; it's because AAs use "dumb" allocation methods. I don't know about false pointers; the hash nodes in my implementation only contain pointers, so I'm not sure there is any possibility for false ones.

The GC isn't precise, so if you have a non-pointer type in a structure with a pointer or in a class, you'll get false pointers. (i.e. the hash value at each node)
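For example, in a hypothetical node layout like this (just to illustrate where the false pointer comes from, not your actual code):

// What an imprecise (conservative) GC sees when it scans a node word by word.
struct HashNode
{
    HashNode* next;  // a real pointer: the GC must follow this
    size_t hash;     // a plain integer, but the GC scans this word too; if its
                     // bit pattern happens to land inside a heap block, that
                     // block is kept alive (a "false pointer")
    int value;       // any other non-pointer payload has the same problem
}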

The difference is that algorithms can document in their template constraints that they need a container with 'soft' properties.

What is the advantage? Why would an algorithm require soft functions? What is an example of such an algorithm?

Something that uses toUpperCase or toLowerCase, for example.

I guess I won't get a real example. I'm not sure it matters. When Andrei starts implementing the soft methods, either they will be a huge win or obviously useless. If I were a betting man, I'd bet on the latter, but I'm not really good at betting, and Andrei's ideas are usually good :)

Though this was the first thing that popped into my head, it is a fairly real example. Consider that you are doing some processing involving a container and you call a library function, like toUpperCase, that will perform some mutation. For some containers, this is completely reasonable. But for others, the container topology is going to change massively, invalidating all your current ranges. And you'd really like a guarantee that both the current implementation and the one 6 months from now won't mess up your ranges and cause a bunch of exceptions to be thrown.
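To make that guarantee concrete, this is roughly what I have in mind; hasSoftUpdate, softUpdate, and the slicing operator on the container are hypothetical names used for illustration, not an existing dcollections or Phobos API:

import std.uni : toUpper;

/// Hypothetical trait: true if C exposes a softUpdate(range, value) that
/// promises not to invalidate other ranges over the same container.
enum hasSoftUpdate(C) = __traits(hasMember, C, "softUpdate");

/// Upper-cases every string in the container. The constraint documents, in
/// code, that we rely on mutation that keeps our ranges valid; passing a
/// container whose update re-sorts or re-hashes simply won't compile.
void upperCaseAll(C)(ref C c)
    if (hasSoftUpdate!C)
{
    for (auto r = c[]; !r.empty; r.popFront())
        c.softUpdate(r, r.front.toUpper());
}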

I wasn't thinking of multi-threaded containers. I was trying to point out that version IDs have failed in lock-free containers, where things are happening on the order of a single atomic op or a context switch. Given the time a range could go unused in standard code, versioning won't work.

Are you trying to say that if you don't use your range for exactly 2^32 mutations, it could mistakenly think the range is still valid? That's a valid, but very, very weak point.

Umm, no. That is a valid point that happens in production code with disastrous effects. Worse, it doesn't happen often and there's no good unit test for it. I, for one, never want to debug a program that only glitches after days of intensive use in a live environment with real customer data. Integer overflow bugs like this are actually one of the few bugs that have ever killed anyone.

I think you are overestimating the life of ranges on a container. They will not survive that many mutations, and the worst that happens is the range continues on fully allocated memory (i.e. your #1 above).

A year ago I would have agreed with you, but then I saw a couple of articles about how this unlikely event does start to occur repeatably in certain circumstances. This made a bunch of news, since it threw a massive spanner into the works for lock-free containers, which relied on it never happening.

You can also force the container to invalidate itself once the first wrap occurs. This at least will be a hard error.
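For concreteness, a sketch of what that could look like; all the names here are made up, it's not dcollections code. The container carries a mutation counter, ranges snapshot it, and the first wrap poisons the container so the failure mode is a hard error rather than a silently "valid" stale range:

import std.exception : enforce;

struct VersionedList(T)
{
    private T[] items;          // stand-in for the real node storage
    private uint version_;      // bumped on every mutation
    private bool poisoned;      // set once version_ wraps past uint.max

    void insert(T value)
    {
        items ~= value;
        bumpVersion();
    }

    private void bumpVersion()
    {
        if (version_ == uint.max)
            poisoned = true;    // hard error from here on: every range is stale
        ++version_;
    }

    struct Range
    {
        private VersionedList!T* owner;
        private size_t index;
        private uint expected;  // owner.version_ at range creation

        @property bool empty()  { check(); return index >= owner.items.length; }
        @property ref T front() { check(); return owner.items[index]; }
        void popFront()         { check(); ++index; }

        private void check()
        {
            // If version_ wrapped all the way back to `expected` between two
            // uses of this range, the comparison alone would pass even though
            // the range is stale; `poisoned` turns the first wrap into a hard
            // error instead.
            enforce(!owner.poisoned && owner.version_ == expected,
                    "range invalidated by container mutation");
        }
    }

    Range opSlice() { return Range(&this, 0, version_); }
}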

So every container will have a finite number of operations and that's it?

No, it's not. Version tags + integer overflow = bug. Doug Lea knew about the problem but thought it would never happen in real code. And Bill Gates thought no one would need more than 640K of RAM. They both have been proven wrong.

Overflow != bug. Wrapping completely around to the same value == bug, but that is so unlikely that it's worth the risk.

Statistics 101: do a test enough times and even the highly improbable will happen.

In statistics we generally ignore the outliers. I'm normally on your side in these types of cases, but we are talking about a statistical impossibility: nobody will leave a range untouched for exactly 2^32 mutations; it simply won't happen. Your computer will probably be obsolete before it does.

No, this is a statistical implausibility. One that has already happened to some people. ulongs, on the other hand...

BTW, I'm not advocating adding mutation counters, I don't plan on adding them. It's just another alternative to "soft" mutations.

I understand.
