bump...
> Thanks for the feedback guys. > > Based on the discussion I propose a different set of changes: > > 1. > shuffle > ^ self shuffleBy: Random new > > 2. > shuffled > ^ self copy shuffle > > 3. remove #shuffledBy: (if you're specific enough to use a custom Random > instance you can also create a copy of the collection yourself if you want to) > > > I leave the matter of renaming the methods for further discussion. Personally > I'm for more intention revealing selectors; I quite like #shuffle and > #shuffleInPlace. If we can agree on a different naming scheme we should then > apply it to other methods aswell of course (e.g. #sort, #sortInPlace). > > Also open for discussion is the use of Collection>>randomForPicking and > Collection>>mutexForPicking in other methods (such as #atRandom). I think it > shouldn't be too big a problem to make those methods use individual Random > instances and to remove the two class variables from Collection. > > Cheers, > Max > > > On 24.09.2013, at 02:33, [email protected] wrote: > >> Nicolas Cellier wrote: >>> >>> The reason why I don't like the explicit "copy", is because this is the >>> default behaviour. >>> In place mutation is an (evil) exception. >>> I'm personnally fine with current conventions: >>> - sort, shuffle, reverse, replace: are imperative, thus you are telling >>> collection, sort yourself ! >>> - sorted clearly sounds as a sorted copy for me. >>> If ever a change was to be made, I would much prefer this to happen on >>> opposite naming, sort -> inPlaceSort, or sortYourself :) >>> It's like telling the programmer, hey - a stateful mutation is really what >>> you want, is it? >>> >> >> Conventions should encourage good programming practice. Although it makes >> perfect sense on hearing it, I was explicitly not aware of the convention >> between 'sort' and 'sorted'. For me, the small difference in letters used >> between the two doesn't provide a wide enough distinction. Also, regarding >> not mutating state, I guess I knew that in the back of my head, but it has >> not generally been an explicit concern of mine while programming. >> Convention should encourage good programming. So I like the suggestion of >> inPlaceSort & sortYourself. As well as being more intention revealing, the >> less desirable methods are made less attractive by using a longer name. >> Perhaps another naming convention could be mutatedSort, or mutSort as a hint >> that the programmer should consider using these in conjunction with mutexes, >> or at own risk. >> >> cheers -ben >> >> >> >>> 2013/9/23 Stéphane Ducasse <[email protected]> >>> >>> >>>> +1 >>>> Now I understand the point of max with the name sort/sorted because this >>>> is not explicit to me too :) >>>> I laways have to think. >>>> >>>> On Sep 23, 2013, at 6:28 PM, Nicolas Cellier < >>>> [email protected]> wrote: >>>> >>>> 100% agree. >>>> Do it right > do it fast. >>>> We must not turn usage of the library into something fragile for the sake >>>> of speed. >>>> We already make the code itself fragile more often than not in term of >>>> complexity (harder to understand/test/change). >>>> >>>> Especially, introduction of shared mutable states (global, class vars, >>>> singleton or any other form) should ring an alarm in reviewers head >>>> (This is some very old Squeak code in this case, so Squeakers are to >>>> blame, but we're all in same bath). >>>> >>>> >>>> >>>> 2013/9/23 Henrik Johansen <[email protected]> >>>> >>>> >>>>> On Sep 23, 2013, at 3:33 , Max Leske <[email protected]> wrote: >>>>> >>>>> >>>>>>> If a user is going to modify the same object concurrently, he/she >>>>>>> >>>>> takes care of mutual exclusion. >>>>> >>>>>> Agreed. BUT: the Random object used by these methods is the same one >>>>>> >>>>> that is used by #atRandom for instance, hence the race condition. There is >>>>> no way anyone can safely use these methods without the mutex, single >>>>> threaded or not. Calls to methods using that same Random object can be all >>>>> over the place and also in the base system. >>>>> >>>>> >>>>> It seems to me an existing Random instance is used in this case mostly* >>>>> for performance. >>>>> One could argue that since the Random in this case is used for a bulk >>>>> operation, for which the object creation cost is largely amortized for >>>>> collection sizes > 20, it's acceptable to instead use Random new by >>>>> default, which wouldn't suffer from the same race conditions. >>>>> While still slower than a mutex-protected version for single-threaded >>>>> code, it would also scale correctly if the users (and vm) are actually >>>>> multi-threaded. >>>>> >>>>> [#(1 2 3 4 5 6 7 8 9 10 11 12) shuffle] bench '208,000 per second.' >>>>> '222,000 per second.' '223,000 per second.' >>>>> [#(1 2 3 4 5 6 7 8 9 10 11 12) shuffleWithMutex] bench '188,000 per >>>>> second.' '186,000 per second.''184,000 per second.' >>>>> [#(1 2 3 4 5 6 7 8 9 10 11 12) shuffleNewRandom] bench '167,000 per >>>>> second.' '166,000 per second.' '167,000 per second.' >>>>> >>>>> Cheers, >>>>> Henry >>>>> >>>>> * Low seed entropy is another issue, but if purely random shuffling is a >>>>> critical requirement, one shouldn't use the default Random generator >>>>> anyways... >>>>> >>>>> >>>>> >>>>> >>>> >>> >>> >> >
