Re: [Newbies] Re: Another extension proposal -> subsets
really interesting. but it reveals slower but it was because of Set operations. To be consistent with other method, I used an array. subsets6 | subsetsSize subsets workArray | workArray := self asArray. subsetsSize := 2 raisedTo: self size. subsets := Array new: subsetsSize. 1 to: subsetsSize do: [:sift | subsets at: sift put: (workArray selectIntegerMask: sift into: (Set new: sift highBit) )]. ^subsets Putting asSet as the end is very expensive when the collection is big results: set := (1 to: 15) asSet. [ set subsets "first sift" ] timeToRun ." 2484" [ set subsets4 "combinationsSize:do:"] timeToRun. " 1598" [ set subsets5 "Zulk"] timeToRun." 12493""I think he uses Set for the subset" [ set subsets6 "optimized sift] timeToRun. " 1436" "Winner :) " Thats was instructive, especially the way you deal with bit. Nice also how you removed the reverse(s). Cédrick ___ Beginners mailing list Beginners@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/beginners
[Newbies] Re: Another extension proposal -> subsets
cdrick gmail.com> writes: > > I like bit sift solution too for it's simplicity. > > me too even if we could argue this is not self explaining...That's what Andres Valoud said here (thanks Marcin): http://blogten.blogspot.com/2005/09/very-nice-methods.html > > which is the exact transposition of #combinations:atATimeDo: ... Nicolas ___ Beginners mailing list Beginners@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/beginners
[Newbies] Re: Another extension proposal -> subsets
nicolas cellier ifrance.com> writes: > > He, press ALT-v to get versions of #combinations:atATimeDo: and thanks tk! > > I like bit sift solution too for it's simplicity. > The problem is that it will iterate p times for creating subset of size p. > #combinations:atATimeDo: does not. It is building subsets in parallel. > The required copy does an iteration though, but in a primitive! > Hence the difference... > > Nicolas > What i wrote is true only if you collect: Arrays. It is not true if you use #asSet instead of #copy. #asSet is not primitive. The cost of bit sift can be greatly reduced: factor the reverse operations: Set>>subsets | subsetsSize subsets workArray | workArray := self asArray. subsetsSize := 2 raisedTo: self size. subsets := Set new: subsetsSize. 1 to: subsetsSize do: [:sift | subsets add: (workArray maskedBy: (sift printStringBase: 2) reverse) asSet]. ^subsets ArrayedCollection>>maskedBy: aBitString | maxSize result | maxSize := self size min: aBitString size. result := (self species new: maxSize) writeStream. 1 to: maxSize do: [:index | (aBitString at: index) = $1 ifTrue: [result nextPut: (self at: index)]]. ^result contents While at it, anyway, you do not need to reverse at all... Ordering does not matter. This eliminates one loop. It still costs 3 loops per partition: (1 to: maxSize) and asSet plus printStringBase:... Thus 2P+Q loops for a subset of size P out of N with P<=Q<=N You can eliminate the printString and send directly #bitAt: to integers. See http://bugs.squeak.org/view.php?id=6985 and load as pre-requisite: Installer mantis ensureFix: 6985. Set>>subsets | subsetsSize subsets workArray | workArray := self asArray. subsetsSize := 2 raisedTo: self size. subsets := Set new: subsetsSize. 1 to: subsetsSize do: [:sift | subsets add: (workArray maskedByIntegerBits: sift) asSet]. ^subsets ArrayedCollection>>maskedByIntegerBits: anInteger | maxSize result | maxSize := self size min: anInteger highBit. result := (self species new: maxSize) writeStream. 1 to: maxSize do: [:index | (anInteger bitAt: index) = 1 ifTrue: [result nextPut: (self at: index)]]. ^result contents You still have P+Q loops, P<= Q<=N You can now eliminate the asSet loop using collect:into: / select:into: (I think we can call this Klaus trick). Set>>subsets | subsetsSize subsets workArray | workArray := self asArray. subsetsSize := 2 raisedTo: self size. subsets := Set new: subsetsSize. 1 to: subsetsSize do: [:sift | subsets add: (workArray selectIntegerMask: sift into: (Set new: sift highBit)]. ^subsets ArrayedCollection>>selectIntegerMask: anInteger into: aCollection 1 to: (self size min: anInteger highBit) do: [:index | (anInteger bitAt: index) = 1 ifTrue: [aCollection add: (self at: index)]]. ^aCollection Only costs Q loops per partition... Maybe you can now approach #combinations:atAtTimeDo: I let you try. Nicolas ___ Beginners mailing list Beginners@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/beginners
Re: [Newbies] Re: Another extension proposal -> subsets
> > > > He, press ALT-v to get versions of #combinations:atATimeDo: and thanks tk! :) > > > I like bit sift solution too for it's simplicity. me too even if we could argue this is not self explaining... That's what Andres Valoud said here (thanks Marcin): http://blogten.blogspot.com/2005/09/very-nice-methods.html > > The problem is that it will iterate p times for creating each subset of > size p. > #combinations:atATimeDo: does not. It is building subsets in parallel. uhm, ok... the diferrence is an order or 2 more or less... > > The required copy does an iteration though, but in a primitive! > Hence the difference... > > Nicolas > Thanks all, this was fun and interesting ;)... Cédrick ___ Beginners mailing list Beginners@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/beginners
[Newbies] Re: Another extension proposal -> subsets
cdrick gmail.com> writes: > > Yes.. Nico's solution is lot faster... > > set := #(1 2 3 4 5) asSet. > [ set subsets "mine"] timeToRun ." 1" > [ set subsets2 "Zulk first" ] timeToRun. " 8" > [ set subsets4 "Nicolas"] timeToRun. " 1" > [ set subsets5 "Zulk"] timeToRun. " 2" > > set := (1 to: 10) asSet. > [ set subsets ] timeToRun . " 46" > [ set subsets4 ] timeToRun. " 18" > [ set subsets5 ] timeToRun." 114" > > set := (1 to: 15) asSet. > [ set subsets ] timeToRun ." 2484" > [ set subsets4 ] timeToRun. " 1598" > [ set subsets5 ] timeToRun." 12493" > > Nicolas won ;) So maybe we can add this one and rename the > combinations:atATime: ? > > Cédrick > He, press ALT-v to get versions of #combinations:atATimeDo: and thanks tk! I like bit sift solution too for it's simplicity. The problem is that it will iterate p times for creating each subset of size p. #combinations:atATimeDo: does not. It is building subsets in parallel. The required copy does an iteration though, but in a primitive! Hence the difference... Nicolas ___ Beginners mailing list Beginners@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/beginners
[Newbies] Re: Another extension proposal -> subsets
Zulq Alam zulq.net> writes: > > Nicolas Cellier wrote: > > > Zulq, > > the algorithm you are proposing is very simple but has major problems: > > > > 1) it is not efficient for large size n: it will do (n factorial) loops when > > only (2 raisedTo: n) are necessary > > It's better than N! because it will not loop over a set already > processed. For instance, for a set of 5 elements it will try 81 sets but > only process 32 of these. Not 120 in either case (5 factorial). > Apologies for this case of blindness! I forgot the includes: test was cutting branches. > > > > 2) each loop will result in a costly hash lookup. > > Your hashBlock involve LargeInteger of size (B raisedTo: p) where > > It doesn't need to. That was just a very rough attempt at producing a > hash that didn't evaluate to only 16 values. It should be possible to > create one that produces SmallIntegers but with a higher cardinality. > Yes, modular arithmetic would be a way to go. Beware, "small" LargeIntegers intermediate might still spoil the game... Crafting a good hash is an art (see work from Andres Valloud). > > 3) the algorithm must store all partitions even if we don't want to collect but > > just to iterate on partitions. > > Yes. > > > > > No offense, but you'd better not bother opening a mantis issue for this time. > > > > Agreed. I was just curious about why the naive algorithm was so slow and > then as a seperate question how one gets such changes in. > > Z. > Right, you did a good job exercizing your (and our) curiousity. This is very enlightening. Beginners should also learn using MessageTally spyOn: [], to analyze were the real costs go. Cheers ___ Beginners mailing list Beginners@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/beginners
[Newbies] Re: Another extension proposal -> subsets
Nicolas Cellier wrote: Zulq, the algorithm you are proposing is very simple but has major problems: 1) it is not efficient for large size n: it will do (n factorial) loops when only (2 raisedTo: n) are necessary It's better than N! because it will not loop over a set already processed. For instance, for a set of 5 elements it will try 81 sets but only process 32 of these. Not 120 in either case (5 factorial). 2) each loop will result in a costly hash lookup. Your hashBlock involve LargeInteger of size (B raisedTo: p) where It doesn't need to. That was just a very rough attempt at producing a hash that didn't evaluate to only 16 values. It should be possible to create one that produces SmallIntegers but with a higher cardinality. 3) the algorithm must store all partitions even if we don't want to collect but just to iterate on partitions. Yes. No offense, but you'd better not bother opening a mantis issue for this time. Agreed. I was just curious about why the naive algorithm was so slow and then as a seperate question how one gets such changes in. Z. ___ Beginners mailing list Beginners@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/beginners
Re: [Newbies] Re: Another extension proposal -> subsets
Yes.. Nico's solution is lot faster... set := #(1 2 3 4 5) asSet. [ set subsets "mine"] timeToRun ." 1" [ set subsets2 "Zulk first" ] timeToRun. " 8" [ set subsets4 "Nicolas"] timeToRun. " 1" [ set subsets5 "Zulk"] timeToRun. " 2" set := (1 to: 10) asSet. [ set subsets ] timeToRun . " 46" [ set subsets4 ] timeToRun. " 18" [ set subsets5 ] timeToRun." 114" set := (1 to: 15) asSet. [ set subsets ] timeToRun ." 2484" [ set subsets4 ] timeToRun. " 1598" [ set subsets5 ] timeToRun." 12493" Nicolas won ;) So maybe we can add this one and rename the combinations:atATime: ? Cédrick 2008/7/25 Nicolas Cellier <[EMAIL PROTECTED]>: > Klaus D. Witzel cobss.com> writes: > >> >> On Thu, 24 Jul 2008 20:09:53 +0200, Zulq Alam wrote: >> >> This is very good for small n :) >> >> > combinations >> >| combinations | >> >combinations := (PluggableSet new: (2 raisedTo: self size)) >> > hashBlock: >> >[:aSet | >> >aSet >> > inject: aSet size >> > into: [:hash :each | hash * each hash + each]]; >> > yourself. >> >self combinationsInto: combinations. >> >^ combinations >> > >> > combinationsInto: aSet >> >(aSet includes: self) ifTrue: [^ self]. >> >aSet add: self. >> >self do: >> > [:each | >> > (self copyWithout: each) combinationsInto: aSet]. >> > >> > Even now, I'm still not clear on how one takes a change like this and >> > drives it into a squeak release - Tests, Mantis, then what? >> >> Put it on Mantis as enhancement into the collection category. The rest >> follows automatically ;) I suggest to put something like "fast powerset >> method for collections" into the summary line. >> >> > Z. >> > > Zulq, > the algorithm you are proposing is very simple but has major problems: > > 1) it is not efficient for large size n: it will do (n factorial) loops when > only (2 raisedTo: n) are necessary >(2 raisedTo: 10) -> 1024 >(10 factorial) -> 3628800 > > 2) each loop will result in a costly hash lookup. > Your hashBlock involve LargeInteger of size (B raisedTo: p) where > - B is the size of an elementary hash (presumably 4 bytes), > - p the size of the partition (ranging from 0 to n) > The cost of a single hash of size p involves p multiplications of growing > size. > Each multiplication of k digits has a cost k^2 > (this is the naive algorithm currently programmed in LargeInteger plugin). > Thus a single hash costs (B^2+(2*B)^2+...+(p*B)^2) > Combine this with the (n factorial) loops and you will understand the problem. > > 3) the algorithm must store all partitions even if we don't want to collect > but > just to iterate on partitions. > > No offense, but you'd better not bother opening a mantis issue for this time. > > Cheers > > Nicolas > > > > > > > ___ > Beginners mailing list > Beginners@lists.squeakfoundation.org > http://lists.squeakfoundation.org/mailman/listinfo/beginners > ___ Beginners mailing list Beginners@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/beginners
[Newbies] Re: Another extension proposal -> subsets
Klaus D. Witzel cobss.com> writes: > > On Thu, 24 Jul 2008 20:09:53 +0200, Zulq Alam wrote: > > This is very good for small n :) > > > combinations > >| combinations | > >combinations := (PluggableSet new: (2 raisedTo: self size)) > > hashBlock: > >[:aSet | > >aSet > > inject: aSet size > > into: [:hash :each | hash * each hash + each]]; > > yourself. > >self combinationsInto: combinations. > >^ combinations > > > > combinationsInto: aSet > >(aSet includes: self) ifTrue: [^ self]. > >aSet add: self. > >self do: > > [:each | > > (self copyWithout: each) combinationsInto: aSet]. > > > > Even now, I'm still not clear on how one takes a change like this and > > drives it into a squeak release - Tests, Mantis, then what? > > Put it on Mantis as enhancement into the collection category. The rest > follows automatically ;) I suggest to put something like "fast powerset > method for collections" into the summary line. > > > Z. > Zulq, the algorithm you are proposing is very simple but has major problems: 1) it is not efficient for large size n: it will do (n factorial) loops when only (2 raisedTo: n) are necessary (2 raisedTo: 10) -> 1024 (10 factorial) -> 3628800 2) each loop will result in a costly hash lookup. Your hashBlock involve LargeInteger of size (B raisedTo: p) where - B is the size of an elementary hash (presumably 4 bytes), - p the size of the partition (ranging from 0 to n) The cost of a single hash of size p involves p multiplications of growing size. Each multiplication of k digits has a cost k^2 (this is the naive algorithm currently programmed in LargeInteger plugin). Thus a single hash costs (B^2+(2*B)^2+...+(p*B)^2) Combine this with the (n factorial) loops and you will understand the problem. 3) the algorithm must store all partitions even if we don't want to collect but just to iterate on partitions. No offense, but you'd better not bother opening a mantis issue for this time. Cheers Nicolas ___ Beginners mailing list Beginners@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/beginners
[Newbies] Re: Another extension proposal -> subsets
On Thu, 24 Jul 2008 20:09:53 +0200, Zulq Alam wrote: Klaus D. Witzel wrote: Squeak looks to be already fast in this case, your routine is almost optimal :) Implementing it in Set so that Set's internal can be of benefit won't bring these figures down much -- what remains is (aSet includes: another), times #copyWithout: loop, and that seems to be unavoidable. It's significantly faster with a more appropriate hashing function: {2->0 . 3->0 . 4->1 . 5->1 . 6->3 . 7->9 . 8->22 . 9->59 . 10->138 . 11->305 . 12->686 . 13->1640 . 14->4366 . 15->10550 . 16->28104 . 17->93373 . 18->303425} This is very good for small n :) combinations | combinations | combinations := (PluggableSet new: (2 raisedTo: self size)) hashBlock: [:aSet | aSet inject: aSet size into: [:hash :each | hash * each hash + each]]; yourself. self combinationsInto: combinations. ^ combinations combinationsInto: aSet (aSet includes: self) ifTrue: [^ self]. aSet add: self. self do: [:each | (self copyWithout: each) combinationsInto: aSet]. I think the problem is that most of the sets contain similar data so the hashes (calculated with bitXor) tend towards to a small set of values. This means there are a lot of collisions and each #add: and #includes: is very expensive. How about naming Collection>>#asPowerset ^ self asSet powersetInto: (Set new: (2 raisedTo: self size)) with Set>>#powersetInto: and putting that into the next release. Even now, I'm still not clear on how one takes a change like this and drives it into a squeak release - Tests, Mantis, then what? Put it on Mantis as enhancement into the collection category. The rest follows automatically ;) I suggest to put something like "fast powerset method for collections" into the summary line. Z. ___ Beginners mailing list Beginners@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/beginners
[Newbies] Re: Another extension proposal -> subsets
Klaus D. Witzel wrote: Squeak looks to be already fast in this case, your routine is almost optimal :) Implementing it in Set so that Set's internal can be of benefit won't bring these figures down much -- what remains is (aSet includes: another), times #copyWithout: loop, and that seems to be unavoidable. It's significantly faster with a more appropriate hashing function: {2->0 . 3->0 . 4->1 . 5->1 . 6->3 . 7->9 . 8->22 . 9->59 . 10->138 . 11->305 . 12->686 . 13->1640 . 14->4366 . 15->10550 . 16->28104 . 17->93373 . 18->303425} combinations | combinations | combinations := (PluggableSet new: (2 raisedTo: self size)) hashBlock: [:aSet | aSet inject: aSet size into: [:hash :each | hash * each hash + each]]; yourself. self combinationsInto: combinations. ^ combinations combinationsInto: aSet (aSet includes: self) ifTrue: [^ self]. aSet add: self. self do: [:each | (self copyWithout: each) combinationsInto: aSet]. I think the problem is that most of the sets contain similar data so the hashes (calculated with bitXor) tend towards to a small set of values. This means there are a lot of collisions and each #add: and #includes: is very expensive. How about naming Collection>>#asPowerset ^ self asSet powersetInto: (Set new: (2 raisedTo: self size)) with Set>>#powersetInto: and putting that into the next release. Even now, I'm still not clear on how one takes a change like this and drives it into a squeak release - Tests, Mantis, then what? Z. ___ Beginners mailing list Beginners@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/beginners
Re: [Newbies] Re: Another extension proposal -> subsets
> > Didn't check #combinations:* implementation, does it avoid #includes: ? I think so. It calls #combinationsAt:in: after: do: wich is recursive. > >> I'd still prefer #combinationsSize:do: instead of >> #combinations:atATimeDo: > > +1 but it looks alt-w friendly already (like #detectSum: looks ;) > Funny to see it was probably the original name. See the comment of #combinationsAt:in: after: (last line) "Choose k of N items and put in aCollection. jj-1 already chosen. Indexes of items are in numerical order, to avoid the same combo being used twice. In this slot, we are allowed to use items in self indexed by nn+1 to self size. nn is the index used for position jj-1." "(1 to: 6) combinationsSize: 3 do: [:each | Transcript cr; show: each printString]" Cédrick >>> >>> How about naming >>> >>> Collection>>#asPowerset >>> ^ self asSet powersetInto: (Set new: (2 raisedTo: self size)) >>> >>> with Set>>#powersetInto: and putting that into the next release. >>> >>> /Klaus >>> > > > ___ > Beginners mailing list > Beginners@lists.squeakfoundation.org > http://lists.squeakfoundation.org/mailman/listinfo/beginners > ___ Beginners mailing list Beginners@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/beginners
[Newbies] Re: Another extension proposal -> subsets
On Thu, 24 Jul 2008 18:27:48 +0200, Cedrick wrote: or withNicolas suggestion: Set>>asPowerset | subset | subset := (OrderedCollection new: (2 raisedTo: self size)) add: Set new; yourself. 1 to: self size do: [:subSize | "don't copy anymore as there is the Set conversion" self asArray combinations: subSize atATimeDo: [:subArray | subset add: subArray asSet]]. ^ subset asSet Ah, you francophones always want to see Blaise Pascal's triangle at work ;) Didn't check #combinations:* implementation, does it avoid #includes: ? I'd still prefer #combinationsSize:do: instead of #combinations:atATimeDo: +1 but it looks alt-w friendly already (like #detectSum: looks ;) How about naming Collection>>#asPowerset ^ self asSet powersetInto: (Set new: (2 raisedTo: self size)) with Set>>#powersetInto: and putting that into the next release. /Klaus ___ Beginners mailing list Beginners@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/beginners
Re: [Newbies] Re: Another extension proposal -> subsets
or withNicolas suggestion: Set>>asPowerset | subset | subset := (OrderedCollection new: (2 raisedTo: self size)) add: Set new; yourself. 1 to: self size do: [:subSize | "don't copy anymore as there is the Set conversion" self asArray combinations: subSize atATimeDo: [:subArray | subset add: subArray asSet]]. ^ subset asSet I'd still prefer #combinationsSize:do: instead of#combinations:atATimeDo: > > How about naming > > Collection>>#asPowerset >^ self asSet powersetInto: (Set new: (2 raisedTo: self size)) > > with Set>>#powersetInto: and putting that into the next release. > > /Klaus > ___ Beginners mailing list Beginners@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/beginners
[Newbies] Re: Another extension proposal -> subsets
On Thu, 24 Jul 2008 15:13:22 +0200, Zulq Alam wrote: Actually, we don't need #permutationsDo: as we're only looking for combinations. Permutations? (n factorial)? v.s. (2 raisedTo: n)? With a few optimisations: combinations ^ self combinationsInto: (Set new: (2 raisedTo: self size)) combinationsInto: aSet (aSet includes: self) ifTrue: [^ self]. "optimisation" aSet add: self. self do: [:each | (self copyWithout: each) combinationsInto: aSet]. ^ aSet Yeah, that's a good one :) [(1 to: 8) asSet combinations] timeToRun "215" [(1 to: 9) asSet combinations] timeToRun "918" [(1 to: 10) asSet combinations] timeToRun "3989" [(1 to: 11) asSet combinations] timeToRun "16349" [(1 to: 12) asSet combinations] timeToRun "68780" So a little better, but I was expecting much more. I think that the amount of work for (n + 1) is about (timeToRun of: n) * 3; should be observable better with larger n's ;) Squeak looks to be already fast in this case, your routine is almost optimal :) Implementing it in Set so that Set's internal can be of benefit won't bring these figures down much -- what remains is (aSet includes: another), times #copyWithout: loop, and that seems to be unavoidable. How about naming Collection>>#asPowerset ^ self asSet powersetInto: (Set new: (2 raisedTo: self size)) with Set>>#powersetInto: and putting that into the next release. /Klaus What's worrying is this: (1 to: 10) asSet combinations size "1024" ((1 to: 10) asSet combinations collect: [:e | e hash]) asSet size "16" So, 1024 distinct Sets have only 16 distinct hashes between them? That seems pretty bad. It would probably be possible to get a little bit more out of the routine with a better (more appropriate) hash function. Even so, I don't think it will ever be as fast as your method but will happily be proved wrong! :) Z. cdrick wrote: Set>>subsets | subsets | subsets := Set with: self. self asArray permutationsDo: [:e | subsets add: e asSet]. self do: [:e | subsets addAll: (self copyWithout: e) subsets]. ^ subsets nice too and frustrating how you got it quick :) I tried a recursive method too first but found the byte ressemblance so... My only consolation is that the recursive solution (subsets2) is slower and hardly work for Set with more than 10 elements. set := #(1 2 3 4 5) asSet. [ set subsets ] timeToRun ." 1" [ set subsets2 ] timeToRun. " 8" set := #(1 2 3 4 5 6 7) asSet. [ set subsets ] timeToRun ." 5" [ set subsets2 ] timeToRun. " 233" set := #(1 2 3 4 5 6 7 8 ) asSet. [ set subsets ] timeToRun . " 11" [ set subsets2 ] timeToRun. " 1683" set := (1 to: 10) asSet. [ set subsets ] timeToRun . " 46" set := (1 to: 15) asSet. [ set subsets ] timeToRun ." 2484" set := (1 to: 20) asSet. [ set subsets ] timeToRun . "559953" "but here the result has (2 raisedTo: 20) 1 048 576 Sets :)" set := (1 to: 50) asSet. [ set subsets ] timeToRun ."I got a "space is low wow" :) I have to go, That was fun :) See you Cédrick ___ Beginners mailing list Beginners@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/beginners ___ Beginners mailing list Beginners@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/beginners
[Newbies] Re: Another extension proposal -> subsets
Actually, we don't need #permutationsDo: as we're only looking for combinations. With a few optimisations: combinations ^ self combinationsInto: (Set new: (2 raisedTo: self size)) combinationsInto: aSet (aSet includes: self) ifTrue: [^ self]. "optimisation" aSet add: self. self do: [:each | (self copyWithout: each) combinationsInto: aSet]. ^ aSet [(1 to: 8) asSet combinations] timeToRun "215" [(1 to: 9) asSet combinations] timeToRun "918" [(1 to: 10) asSet combinations] timeToRun "3989" [(1 to: 11) asSet combinations] timeToRun "16349" [(1 to: 12) asSet combinations] timeToRun "68780" So a little better, but I was expecting much more. What's worrying is this: (1 to: 10) asSet combinations size "1024" ((1 to: 10) asSet combinations collect: [:e | e hash]) asSet size "16" So, 1024 distinct Sets have only 16 distinct hashes between them? That seems pretty bad. It would probably be possible to get a little bit more out of the routine with a better (more appropriate) hash function. Even so, I don't think it will ever be as fast as your method but will happily be proved wrong! :) Z. cdrick wrote: Set>>subsets | subsets | subsets := Set with: self. self asArray permutationsDo: [:e | subsets add: e asSet]. self do: [:e | subsets addAll: (self copyWithout: e) subsets]. ^ subsets nice too and frustrating how you got it quick :) I tried a recursive method too first but found the byte ressemblance so... My only consolation is that the recursive solution (subsets2) is slower and hardly work for Set with more than 10 elements. set := #(1 2 3 4 5) asSet. [ set subsets ] timeToRun ." 1" [ set subsets2 ] timeToRun. " 8" set := #(1 2 3 4 5 6 7) asSet. [ set subsets ] timeToRun ." 5" [ set subsets2 ] timeToRun. " 233" set := #(1 2 3 4 5 6 7 8 ) asSet. [ set subsets ] timeToRun . " 11" [ set subsets2 ] timeToRun. " 1683" set := (1 to: 10) asSet. [ set subsets ] timeToRun . " 46" set := (1 to: 15) asSet. [ set subsets ] timeToRun ." 2484" set := (1 to: 20) asSet. [ set subsets ] timeToRun . "559953" "but here the result has (2 raisedTo: 20) 1 048 576 Sets :)" set := (1 to: 50) asSet. [ set subsets ] timeToRun ."I got a "space is low wow" :) I have to go, That was fun :) See you Cédrick ___ Beginners mailing list Beginners@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/beginners ___ Beginners mailing list Beginners@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/beginners
Re: [Newbies] Re: Another extension proposal -> subsets
Nicolas, your initials fits you ;) nice and twice quicker as the bit-sift one. > You have one messsage #combinations:atATimeDo: to enumerate partitions, but > in SequenceableCollection, not in Set: > > | collec subset | > collec := 1 to: 4. > subset := OrderedCollection new: (2 raisedTo: collec size). > subset add: #(). > 1 to: collec size do: [:subSize | > collec combinations: subSize atATimeDo: [:subArray | subset add: subArray > copy]]. > subset > > Mind the copy, because the iteration recycle same subArray for economy. > I'dnevertheless prefer a name like: #combinationsSize:do: Cédrick ___ Beginners mailing list Beginners@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/beginners
[Newbies] Re: Another extension proposal -> subsets
You have one messsage #combinations:atATimeDo: to enumerate partitions, but in SequenceableCollection, not in Set: | collec subset | collec := 1 to: 4. subset := OrderedCollection new: (2 raisedTo: collec size). subset add: #(). 1 to: collec size do: [:subSize | collec combinations: subSize atATimeDo: [:subArray | subset add: subArray copy]]. subset Mind the copy, because the iteration recycle same subArray for economy. cdrick a écrit : Set>>subsets | subsets | subsets := Set with: self. self asArray permutationsDo: [:e | subsets add: e asSet]. self do: [:e | subsets addAll: (self copyWithout: e) subsets]. ^ subsets nice too and frustrating how you got it quick :) I tried a recursive method too first but found the byte ressemblance so... My only consolation is that the recursive solution (subsets2) is slower and hardly work for Set with more than 10 elements. set := #(1 2 3 4 5) asSet. [ set subsets ] timeToRun ." 1" [ set subsets2 ] timeToRun. " 8" set := #(1 2 3 4 5 6 7) asSet. [ set subsets ] timeToRun ." 5" [ set subsets2 ] timeToRun. " 233" set := #(1 2 3 4 5 6 7 8 ) asSet. [ set subsets ] timeToRun . " 11" [ set subsets2 ] timeToRun. " 1683" set := (1 to: 10) asSet. [ set subsets ] timeToRun . " 46" set := (1 to: 15) asSet. [ set subsets ] timeToRun ." 2484" set := (1 to: 20) asSet. [ set subsets ] timeToRun . "559953" "but here the result has (2 raisedTo: 20) 1 048 576 Sets :)" set := (1 to: 50) asSet. [ set subsets ] timeToRun ."I got a "space is low wow" :) I have to go, That was fun :) See you Cédrick ___ Beginners mailing list Beginners@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/beginners ___ Beginners mailing list Beginners@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/beginners
Re: [Newbies] Re: Another extension proposal -> subsets
> > Set>>subsets > | subsets | > subsets := Set with: self. > self asArray permutationsDo: [:e | subsets add: e asSet]. > self do: [:e | subsets addAll: (self copyWithout: e) subsets]. > ^ subsets nice too and frustrating how you got it quick :) I tried a recursive method too first but found the byte ressemblance so... My only consolation is that the recursive solution (subsets2) is slower and hardly work for Set with more than 10 elements. set := #(1 2 3 4 5) asSet. [ set subsets ] timeToRun ." 1" [ set subsets2 ] timeToRun. " 8" set := #(1 2 3 4 5 6 7) asSet. [ set subsets ] timeToRun ." 5" [ set subsets2 ] timeToRun. " 233" set := #(1 2 3 4 5 6 7 8 ) asSet. [ set subsets ] timeToRun . " 11" [ set subsets2 ] timeToRun. " 1683" set := (1 to: 10) asSet. [ set subsets ] timeToRun . " 46" set := (1 to: 15) asSet. [ set subsets ] timeToRun ." 2484" set := (1 to: 20) asSet. [ set subsets ] timeToRun . "559953" "but here the result has (2 raisedTo: 20) 1 048 576 Sets :)" set := (1 to: 50) asSet. [ set subsets ] timeToRun ."I got a "space is low wow" :) I have to go, That was fun :) See you Cédrick ___ Beginners mailing list Beginners@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/beginners
[Newbies] Re: Another extension proposal -> subsets
Couldn't find anything obvious but remembered #permutationsDo: which may help. Not sure if these does what you want, but maybe it can be adapted? Set>>subsets | subsets | subsets := Set with: self. self asArray permutationsDo: [:e | subsets add: e asSet]. self do: [:e | subsets addAll: (self copyWithout: e) subsets]. ^ subsets #(1 2) asSet subsets. "a Set(a Set(1 2) a Set() a Set(2) a Set(1))" #(1 2 3) asSet subsets. "a Set(a Set(1 2) a Set() a Set(2) a Set(1 2 3) a Set(2 3) a Set(3) a Set(1 3) a Set(1))" Z. cdrick wrote: When we don't find method, we reinvent the wheel. Ok, so this is what happened with my quest on rounding float. So here is another example. I needed a method subsets ... to have all subsets of a Set... #(1 2 ) asSet subsets " -> a Set(a Set(1 2) a Set(1) a Set(2) a Set()) " Is there already something ? I couldn't find so here is what I did ? probably hackish (because of the binary mask) so feel free to comment ;) Set>>subsets | subsetsSize subsets workArray | workArray := self asArray. subsetsSize := 2 raisedTo: self size. subsets := OrderedCollection new. 1 to: subsetsSize do: [:ea | subsets add: ((workArray masquedBy: (ea printStringBase: 2)) asSet)]. "masque par une conversion binaire" ^subsets asSet "could be an array of sets" ArrayedCollection>>masquedBy: aBitString | result entry bitString | entry := self reverse. bitString := aBitString reverse. result := OrderedCollection new. 1 to: (self size) do: [:ea | ((bitString at: ea ifAbsent: []) = $1) ifTrue: [result add: (entry at: ea)]]. ^result reverse Cédrick ___ Beginners mailing list Beginners@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/beginners ___ Beginners mailing list Beginners@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/beginners