shuffle

btc Mon, 23 Sep 2013 17:34:44 -0700

Nicolas Cellier wrote:

The reason why I don't like the explicit "copy", is because this is the
default behaviour.
In place mutation is an (evil) exception.
I'm personnally fine with current conventions:
- sort, shuffle, reverse, replace: are imperative, thus you are telling
collection, sort yourself !
- sorted clearly sounds as a sorted copy for me.
If ever a change was to be made, I would much prefer this to happen on
opposite naming, sort -> inPlaceSort, or sortYourself :)
It's like telling the programmer, hey - a stateful mutation is really what
you want, is it?

Conventions should encourage good programming practice. Although it makes perfect sense on hearing it, I was explicitly not aware of the convention between 'sort' and 'sorted'. For me, the small difference in letters used between the two doesn't provide a wide enough distinction. Also, regarding not mutating state, I guess I knew that in the back of my head, but it has not generally been an explicit concern of mine while programming. Convention should encourage good programming. So I like the suggestion of inPlaceSort & sortYourself. As well as being more intention revealing, the less desirable methods are made less attractive by using a longer name. Perhaps another naming convention could be mutatedSort, or mutSort as a hint that the programmer should consider using these in conjunction with mutexes, or at own risk.

cheers -ben


2013/9/23 Stéphane Ducasse <[email protected]>

+1
Now I understand the point of max with the name sort/sorted because this
is not explicit to me too :)
I laways have to think.

On Sep 23, 2013, at 6:28 PM, Nicolas Cellier <
[email protected]> wrote:

100% agree.
Do it right > do it fast.
We must not turn usage of the library into something fragile for the sake
of speed.
We already make the code itself fragile more often than not in term of
complexity (harder to understand/test/change).

Especially, introduction of shared mutable states (global, class vars,
singleton or any other form) should ring an alarm in reviewers head
(This is some very old Squeak code in this case, so Squeakers are to
blame, but we're all in same bath).



2013/9/23 Henrik Johansen <[email protected]>

On Sep 23, 2013, at 3:33 , Max Leske <[email protected]> wrote:

If a user is going to modify the same object concurrently, he/she

takes care of mutual exclusion.

Agreed. BUT: the Random object used by these methods is the same one

that is used by #atRandom for instance, hence the race condition. There is
no way anyone can safely use these methods without the mutex, single
threaded or not. Calls to methods using that same Random object can be all
over the place and also in the base system.


It seems to me an existing Random instance is used in this case mostly*
for performance.
One could argue that since the Random in this case is used for a bulk
operation, for which the object creation cost is largely amortized for
collection sizes > 20, it's acceptable to instead use Random new by
default, which wouldn't suffer from the same race conditions.
While still slower than a mutex-protected version for single-threaded
code, it would also scale correctly if the users (and vm) are actually
multi-threaded.

[#(1 2  3 4 5 6 7 8 9 10 11 12) shuffle] bench '208,000 per second.'
 '222,000 per second.' '223,000 per second.'
[#(1 2 3 4 5 6 7 8 9 10 11 12) shuffleWithMutex] bench '188,000 per
second.'  '186,000 per second.''184,000 per second.'
[#(1 2 3 4 5 6 7 8 9 10 11 12) shuffleNewRandom] bench '167,000 per
second.' '166,000 per second.'  '167,000 per second.'

Cheers,
Henry

* Low seed entropy is another issue, but if purely random shuffling is a
critical requirement, one shouldn't use the default Random generator
anyways...

Re: [Pharo-dev] 11635: Race condition in SequenceableCollection>>shuffle

Reply via email to