Re: [Newbies] Re: Another extension proposal -> subsets

2008-07-25 Thread cdrick
really interesting.
but it reveals slower but it was because of Set operations. To be consistent
with other method, I used an array.

subsets6
  | subsetsSize subsets workArray |
  workArray := self asArray.
  subsetsSize := 2 raisedTo: self size.
  subsets := Array new: subsetsSize.
  1 to: subsetsSize do: [:sift |
 subsets at: sift put:
   (workArray selectIntegerMask: sift into: (Set new: sift highBit) )].
  ^subsets

Putting asSet as the end is very expensive when the collection is big


results:

set := (1 to: 15) asSet.
[ set subsets "first sift" ] timeToRun ." 2484"
[ set subsets4  "combinationsSize:do:"] timeToRun. " 1598"
[ set subsets5  "Zulk"] timeToRun." 12493""I think he uses Set for the
subset"
[ set subsets6  "optimized sift] timeToRun. " 1436"   "Winner :) "

Thats was instructive, especially the way you deal with bit. Nice also how
you removed the reverse(s).

Cédrick
___
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners


[Newbies] Re: Another extension proposal -> subsets

2008-07-25 Thread nicolas cellier
cdrick  gmail.com> writes:

> 
> I like bit sift solution too for it's simplicity.
> 
> me too even if we could argue this is not self explaining...That's what Andres
Valoud said here (thanks Marcin):
http://blogten.blogspot.com/2005/09/very-nice-methods.html
>  
> 

which is the exact transposition of #combinations:atATimeDo: ...

Nicolas




___
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners


[Newbies] Re: Another extension proposal -> subsets

2008-07-25 Thread nicolas cellier
nicolas cellier  ifrance.com> writes:
> 
> He, press ALT-v to get versions of #combinations:atATimeDo: and thanks tk!
> 
> I like bit sift solution too for it's simplicity.
> The problem is that it will iterate p times for creating subset of size p.
> #combinations:atATimeDo: does not. It is building subsets in parallel.
> The required copy does an iteration though, but in a primitive!
> Hence the difference...
> 
> Nicolas
> 

What i wrote is true only if you collect: Arrays.
It is not true if you use #asSet instead of #copy. #asSet is not primitive.

The cost of bit sift can be greatly reduced: factor the reverse operations:

Set>>subsets
   | subsetsSize subsets workArray |
   workArray := self asArray.
   subsetsSize := 2 raisedTo: self size.
   subsets := Set new: subsetsSize.
   1 to: subsetsSize do: [:sift |
  subsets add:
(workArray maskedBy: (sift printStringBase: 2) reverse) asSet].
   ^subsets

ArrayedCollection>>maskedBy: aBitString
   | maxSize result |
   maxSize := self size min: aBitString size.
   result := (self species new: maxSize) writeStream.
   1 to: maxSize do: [:index |
  (aBitString  at: index) = $1 ifTrue: [result nextPut: (self at: index)]].
   ^result contents

While at it, anyway, you do not need to reverse at all...
Ordering does not matter.
This eliminates one loop.
It still costs 3 loops per partition:
 (1 to: maxSize) and asSet plus printStringBase:...
Thus 2P+Q loops for a subset of size P out of N with P<=Q<=N

You can eliminate the printString and send directly #bitAt: to integers.
See http://bugs.squeak.org/view.php?id=6985 and load as pre-requisite:
Installer mantis ensureFix: 6985.

Set>>subsets
   | subsetsSize subsets workArray |
   workArray := self asArray.
   subsetsSize := 2 raisedTo: self size.
   subsets := Set new: subsetsSize.
   1 to: subsetsSize do: [:sift |
  subsets add:
(workArray maskedByIntegerBits: sift) asSet].
   ^subsets

ArrayedCollection>>maskedByIntegerBits: anInteger
   | maxSize result |
   maxSize := self size min: anInteger highBit.
   result := (self species new: maxSize) writeStream.
   1 to: maxSize do: [:index |
  (anInteger bitAt: index) = 1 ifTrue: [result nextPut: (self at: index)]].
   ^result contents

You still have P+Q loops, P<= Q<=N
You can now eliminate the asSet loop using collect:into: / select:into:
(I think we can call this Klaus trick).

Set>>subsets
   | subsetsSize subsets workArray |
   workArray := self asArray.
   subsetsSize := 2 raisedTo: self size.
   subsets := Set new: subsetsSize.
   1 to: subsetsSize do: [:sift |
  subsets add:
(workArray selectIntegerMask: sift into: (Set new: sift highBit)].
   ^subsets

ArrayedCollection>>selectIntegerMask: anInteger into: aCollection
   1 to: (self size min: anInteger highBit) do: [:index |
  (anInteger bitAt: index) = 1 ifTrue: [aCollection add: (self at: index)]].
   ^aCollection

Only costs Q loops per partition...
Maybe you can now approach #combinations:atAtTimeDo:
I let you try.

Nicolas

___
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners


Re: [Newbies] Re: Another extension proposal -> subsets

2008-07-25 Thread cdrick
>
>
>
> He, press ALT-v to get versions of #combinations:atATimeDo: and thanks tk!


:)


>
>
> I like bit sift solution too for it's simplicity.


me too even if we could argue this is not self explaining...
That's what Andres Valoud said here (thanks Marcin):
http://blogten.blogspot.com/2005/09/very-nice-methods.html


>
> The problem is that it will iterate p times for creating each subset of
> size p.
> #combinations:atATimeDo: does not. It is building subsets in parallel.


uhm, ok...
the diferrence is an order or 2 more or less...


>
> The required copy does an iteration though, but in a primitive!
> Hence the difference...
>
> Nicolas
>

Thanks all, this was fun and interesting ;)...

Cédrick
___
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners


[Newbies] Re: Another extension proposal -> subsets

2008-07-25 Thread nicolas cellier
cdrick  gmail.com> writes:

> 
> Yes.. Nico's solution is lot faster...
> 
> set := #(1 2 3 4 5) asSet.
> [ set subsets  "mine"] timeToRun ." 1"
> [ set subsets2  "Zulk first"  ] timeToRun. " 8"
> [ set subsets4  "Nicolas"] timeToRun. " 1"
> [ set subsets5  "Zulk"] timeToRun. " 2"
> 
> set := (1 to: 10) asSet.
> [ set subsets  ] timeToRun . " 46"
> [ set subsets4  ] timeToRun. " 18"
> [ set subsets5  ] timeToRun." 114"
> 
> set := (1 to: 15) asSet.
> [ set subsets  ] timeToRun ." 2484"
> [ set subsets4  ] timeToRun. " 1598"
> [ set subsets5  ] timeToRun." 12493"
> 
> Nicolas won ;) So maybe we can add this one and rename the
> combinations:atATime: ?
> 
> Cédrick
> 

He, press ALT-v to get versions of #combinations:atATimeDo: and thanks tk!

I like bit sift solution too for it's simplicity.
The problem is that it will iterate p times for creating each subset of size p.
#combinations:atATimeDo: does not. It is building subsets in parallel.
The required copy does an iteration though, but in a primitive!
Hence the difference...

Nicolas

___
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners


[Newbies] Re: Another extension proposal -> subsets

2008-07-25 Thread nicolas cellier
Zulq Alam  zulq.net> writes:

> 
> Nicolas Cellier wrote:
> 
> > Zulq,
> > the algorithm you are proposing is very simple but has major problems:
> > 
> > 1) it is not efficient for large size n: it will do (n factorial) loops when
> > only (2 raisedTo: n) are necessary
> 
> It's better than N! because it will not loop over a set already 
> processed. For instance, for a set of 5 elements it will try 81 sets but 
> only process 32 of these. Not 120 in either case (5 factorial).
> 

Apologies for this case of blindness!
I forgot the includes: test was cutting branches.

> > 
> > 2) each loop will result in a costly hash lookup.
> > Your hashBlock involve LargeInteger of size (B raisedTo: p) where
> 
> It doesn't need to. That was just a very rough attempt at producing a 
> hash that didn't evaluate to only 16 values. It should be possible to 
> create one that produces SmallIntegers but with a higher cardinality.
> 

Yes, modular arithmetic would be a way to go.
Beware, "small" LargeIntegers intermediate might still spoil the game...
Crafting a good hash is an art (see work from Andres Valloud).

> > 3) the algorithm must store all partitions even if we don't want to collect
 but
> > just to iterate on partitions.
> 
> Yes.
> 
> > 
> > No offense, but you'd better not bother opening a mantis issue for this
 time.
> > 
> 
> Agreed. I was just curious about why the naive algorithm was so slow and 
> then as a seperate question how one gets such changes in.
> 
> Z.
> 

Right, you did a good job exercizing your (and our) curiousity.
This is very enlightening.
Beginners should also learn using MessageTally spyOn: [],
to analyze were the real costs go.

Cheers

___
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners


[Newbies] Re: Another extension proposal -> subsets

2008-07-25 Thread Zulq Alam

Nicolas Cellier wrote:


Zulq,
the algorithm you are proposing is very simple but has major problems:

1) it is not efficient for large size n: it will do (n factorial) loops when
only (2 raisedTo: n) are necessary


It's better than N! because it will not loop over a set already 
processed. For instance, for a set of 5 elements it will try 81 sets but 
only process 32 of these. Not 120 in either case (5 factorial).




2) each loop will result in a costly hash lookup.
Your hashBlock involve LargeInteger of size (B raisedTo: p) where


It doesn't need to. That was just a very rough attempt at producing a 
hash that didn't evaluate to only 16 values. It should be possible to 
create one that produces SmallIntegers but with a higher cardinality.



3) the algorithm must store all partitions even if we don't want to collect but
just to iterate on partitions.


Yes.



No offense, but you'd better not bother opening a mantis issue for this time.



Agreed. I was just curious about why the naive algorithm was so slow and 
then as a seperate question how one gets such changes in.


Z.
___
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners


Re: [Newbies] Re: Another extension proposal -> subsets

2008-07-25 Thread cdrick
Yes.. Nico's solution is lot faster...

set := #(1 2 3 4 5) asSet.
[ set subsets  "mine"] timeToRun ." 1"
[ set subsets2  "Zulk first"  ] timeToRun. " 8"
[ set subsets4  "Nicolas"] timeToRun. " 1"
[ set subsets5  "Zulk"] timeToRun. " 2"

set := (1 to: 10) asSet.
[ set subsets  ] timeToRun . " 46"
[ set subsets4  ] timeToRun. " 18"
[ set subsets5  ] timeToRun." 114"

set := (1 to: 15) asSet.
[ set subsets  ] timeToRun ." 2484"
[ set subsets4  ] timeToRun. " 1598"
[ set subsets5  ] timeToRun." 12493"

Nicolas won ;) So maybe we can add this one and rename the
combinations:atATime: ?

Cédrick



2008/7/25 Nicolas Cellier <[EMAIL PROTECTED]>:
> Klaus D. Witzel  cobss.com> writes:
>
>>
>> On Thu, 24 Jul 2008 20:09:53 +0200, Zulq Alam wrote:
>>
>> This is very good for small n :)
>>
>> > combinations
>> >| combinations |
>> >combinations := (PluggableSet new: (2 raisedTo: self size))
>> >  hashBlock:
>> >[:aSet |
>> >aSet
>> >  inject: aSet size
>> >  into: [:hash :each | hash * each hash + each]];
>> >  yourself.
>> >self combinationsInto: combinations.
>> >^ combinations
>> >
>> > combinationsInto: aSet
>> >(aSet includes: self) ifTrue: [^ self].
>> >aSet add: self.
>> >self do:
>> >  [:each |
>> >  (self copyWithout: each) combinationsInto: aSet].
>> >
>> > Even now, I'm still not clear on how one takes a change like this and
>> > drives it into a squeak release - Tests, Mantis, then what?
>>
>> Put it on Mantis as enhancement into the collection category. The rest
>> follows automatically ;) I suggest to put something like "fast powerset
>> method for collections" into the summary line.
>>
>> > Z.
>>
>
> Zulq,
> the algorithm you are proposing is very simple but has major problems:
>
> 1) it is not efficient for large size n: it will do (n factorial) loops when
> only (2 raisedTo: n) are necessary
>(2 raisedTo: 10) -> 1024
>(10 factorial) -> 3628800
>
> 2) each loop will result in a costly hash lookup.
> Your hashBlock involve LargeInteger of size (B raisedTo: p) where
> - B is the size of an elementary hash (presumably 4 bytes),
> - p the size of the partition (ranging from 0 to n)
> The cost of a single hash of size p involves p multiplications of growing 
> size.
> Each multiplication of k digits has a cost k^2
> (this is the naive algorithm currently programmed in LargeInteger plugin).
> Thus a single hash costs (B^2+(2*B)^2+...+(p*B)^2)
> Combine this with the (n factorial) loops and you will understand the problem.
>
> 3) the algorithm must store all partitions even if we don't want to collect 
> but
> just to iterate on partitions.
>
> No offense, but you'd better not bother opening a mantis issue for this time.
>
> Cheers
>
> Nicolas
>
>
>
>
>
>
> ___
> Beginners mailing list
> Beginners@lists.squeakfoundation.org
> http://lists.squeakfoundation.org/mailman/listinfo/beginners
>
___
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners


[Newbies] Re: Another extension proposal -> subsets

2008-07-25 Thread Nicolas Cellier
Klaus D. Witzel  cobss.com> writes:

> 
> On Thu, 24 Jul 2008 20:09:53 +0200, Zulq Alam wrote:
> 
> This is very good for small n :)
> 
> > combinations
> >| combinations |
> >combinations := (PluggableSet new: (2 raisedTo: self size))
> >  hashBlock:
> >[:aSet |
> >aSet
> >  inject: aSet size
> >  into: [:hash :each | hash * each hash + each]];
> >  yourself.
> >self combinationsInto: combinations.
> >^ combinations
> >
> > combinationsInto: aSet
> >(aSet includes: self) ifTrue: [^ self].
> >aSet add: self.
> >self do:
> >  [:each |
> >  (self copyWithout: each) combinationsInto: aSet].  
> >
> > Even now, I'm still not clear on how one takes a change like this and  
> > drives it into a squeak release - Tests, Mantis, then what?
> 
> Put it on Mantis as enhancement into the collection category. The rest  
> follows automatically ;) I suggest to put something like "fast powerset  
> method for collections" into the summary line.
> 
> > Z.
> 

Zulq,
the algorithm you are proposing is very simple but has major problems:

1) it is not efficient for large size n: it will do (n factorial) loops when
only (2 raisedTo: n) are necessary
(2 raisedTo: 10) -> 1024
(10 factorial) -> 3628800

2) each loop will result in a costly hash lookup.
Your hashBlock involve LargeInteger of size (B raisedTo: p) where
- B is the size of an elementary hash (presumably 4 bytes),
- p the size of the partition (ranging from 0 to n)
The cost of a single hash of size p involves p multiplications of growing size.
Each multiplication of k digits has a cost k^2
(this is the naive algorithm currently programmed in LargeInteger plugin).
Thus a single hash costs (B^2+(2*B)^2+...+(p*B)^2)
Combine this with the (n factorial) loops and you will understand the problem.

3) the algorithm must store all partitions even if we don't want to collect but
just to iterate on partitions.

No offense, but you'd better not bother opening a mantis issue for this time.

Cheers

Nicolas






___
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners


[Newbies] Re: Another extension proposal -> subsets

2008-07-24 Thread Klaus D. Witzel

On Thu, 24 Jul 2008 20:09:53 +0200, Zulq Alam wrote:


Klaus D. Witzel wrote:

Squeak looks to be already fast in this case, your routine is almost  
optimal :) Implementing it in Set so that Set's internal can be of  
benefit won't bring these figures down much -- what remains is (aSet  
includes: another), times #copyWithout: loop, and that seems to be  
unavoidable.


It's significantly faster with a more appropriate hashing function:

{2->0 . 3->0 . 4->1 . 5->1 . 6->3 . 7->9 . 8->22 . 9->59 . 10->138 .  
11->305 . 12->686 . 13->1640 . 14->4366 . 15->10550 . 16->28104 .  
17->93373 . 18->303425}


This is very good for small n :)


combinations
   | combinations |
   combinations := (PluggableSet new: (2 raisedTo: self size))
 hashBlock:
   [:aSet |
   aSet
 inject: aSet size
 into: [:hash :each | hash * each hash + each]];
 yourself.
   self combinationsInto: combinations.
   ^ combinations

combinationsInto: aSet
   (aSet includes: self) ifTrue: [^ self].
   aSet add: self.
   self do:
 [:each |
 (self copyWithout: each) combinationsInto: aSet].  

I think the problem is that most of the sets contain similar data so the  
hashes (calculated with bitXor) tend towards to a small set of values.  
This means there are a lot of collisions and each #add: and #includes:  
is very expensive.



 How about naming
 Collection>>#asPowerset
^ self asSet powersetInto: (Set new: (2 raisedTo: self size))
 with Set>>#powersetInto: and putting that into the next release.



Even now, I'm still not clear on how one takes a change like this and  
drives it into a squeak release - Tests, Mantis, then what?


Put it on Mantis as enhancement into the collection category. The rest  
follows automatically ;) I suggest to put something like "fast powerset  
method for collections" into the summary line.



Z.


___
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners


[Newbies] Re: Another extension proposal -> subsets

2008-07-24 Thread Zulq Alam

Klaus D. Witzel wrote:

Squeak looks to be already fast in this case, your routine is almost 
optimal :) Implementing it in Set so that Set's internal can be of 
benefit won't bring these figures down much -- what remains is (aSet 
includes: another), times #copyWithout: loop, and that seems to be 
unavoidable.


It's significantly faster with a more appropriate hashing function:

{2->0 . 3->0 . 4->1 . 5->1 . 6->3 . 7->9 . 8->22 . 9->59 . 10->138 . 
11->305 . 12->686 . 13->1640 . 14->4366 . 15->10550 . 16->28104 . 
17->93373 . 18->303425}


combinations
  | combinations |
  combinations := (PluggableSet new: (2 raisedTo: self size))
hashBlock:
  [:aSet |
  aSet
inject: aSet size
into: [:hash :each | hash * each hash + each]];
yourself.
  self combinationsInto: combinations.
  ^ combinations

combinationsInto: aSet
  (aSet includes: self) ifTrue: [^ self].
  aSet add: self.
  self do:
[:each |
(self copyWithout: each) combinationsInto: aSet].   

I think the problem is that most of the sets contain similar data so the 
hashes (calculated with bitXor) tend towards to a small set of values. 
This means there are a lot of collisions and each #add: and #includes: 
is very expensive.




How about naming

Collection>>#asPowerset
^ self asSet powersetInto: (Set new: (2 raisedTo: self size))

with Set>>#powersetInto: and putting that into the next release.



Even now, I'm still not clear on how one takes a change like this and 
drives it into a squeak release - Tests, Mantis, then what?


Z.

___
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners


Re: [Newbies] Re: Another extension proposal -> subsets

2008-07-24 Thread cdrick
>
> Didn't check #combinations:* implementation, does it avoid #includes: ?

I think so. It calls #combinationsAt:in: after: do:  wich is recursive.

>
>> I'd still prefer #combinationsSize:do:   instead of
>>  #combinations:atATimeDo:
>
> +1 but it looks alt-w friendly already (like #detectSum: looks ;)
>

Funny to see it was probably the original name. See the comment of
#combinationsAt:in: after: (last line)

"Choose k of N items and put in aCollection.  jj-1 already chosen.
Indexes of items are in numerical order, to avoid the same combo being
used twice.  In this slot, we are allowed to use items in self indexed
by nn+1 to self size.  nn is the index used for position jj-1."
"(1 to: 6) combinationsSize: 3 do: [:each | Transcript cr; show: each
printString]"


Cédrick
>>>
>>> How about naming
>>>
>>> Collection>>#asPowerset
>>>   ^ self asSet powersetInto: (Set new: (2 raisedTo: self size))
>>>
>>> with Set>>#powersetInto: and putting that into the next release.
>>>
>>> /Klaus
>>>
>
>
> ___
> Beginners mailing list
> Beginners@lists.squeakfoundation.org
> http://lists.squeakfoundation.org/mailman/listinfo/beginners
>
___
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners


[Newbies] Re: Another extension proposal -> subsets

2008-07-24 Thread Klaus D. Witzel

On Thu, 24 Jul 2008 18:27:48 +0200, Cedrick wrote:


or withNicolas suggestion:

Set>>asPowerset
| subset |
subset := (OrderedCollection new: (2 raisedTo: self size))
add: Set new;
yourself.
1 to: self size do: [:subSize | "don't copy anymore as there is the
Set conversion"
self asArray combinations: subSize atATimeDo: [:subArray | subset
add: subArray asSet]].
^ subset asSet


Ah, you francophones always want to see Blaise Pascal's triangle at work ;)

Didn't check #combinations:* implementation, does it avoid #includes: ?

I'd still prefer #combinationsSize:do:   instead of 
#combinations:atATimeDo:


+1 but it looks alt-w friendly already (like #detectSum: looks ;)



How about naming

Collection>>#asPowerset
   ^ self asSet powersetInto: (Set new: (2 raisedTo: self size))

with Set>>#powersetInto: and putting that into the next release.

/Klaus




___
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners


Re: [Newbies] Re: Another extension proposal -> subsets

2008-07-24 Thread cdrick
or withNicolas suggestion:

Set>>asPowerset
| subset |
subset := (OrderedCollection new: (2 raisedTo: self size))
add: Set new;
yourself.
1 to: self size do: [:subSize | "don't copy anymore as there is the
Set conversion"
self asArray combinations: subSize atATimeDo: [:subArray | subset
add: subArray asSet]].
^ subset asSet

I'd still prefer #combinationsSize:do:   instead of#combinations:atATimeDo:

>
> How about naming
>
> Collection>>#asPowerset
>^ self asSet powersetInto: (Set new: (2 raisedTo: self size))
>
> with Set>>#powersetInto: and putting that into the next release.
>
> /Klaus
>
___
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners


[Newbies] Re: Another extension proposal -> subsets

2008-07-24 Thread Klaus D. Witzel

On Thu, 24 Jul 2008 15:13:22 +0200, Zulq Alam wrote:

Actually, we don't need #permutationsDo: as we're only looking for  
combinations.


Permutations? (n factorial)? v.s. (2 raisedTo: n)?


With a few optimisations:

combinations
   ^ self combinationsInto: (Set new: (2 raisedTo: self size))

combinationsInto: aSet
   (aSet includes: self) ifTrue: [^ self]. "optimisation"
   aSet add: self.
   self do:
 [:each |
 (self copyWithout: each) combinationsInto: aSet].  
   ^ aSet


Yeah, that's a good one :)


[(1 to: 8) asSet combinations] timeToRun "215"
[(1 to: 9) asSet combinations] timeToRun "918"
[(1 to: 10) asSet combinations] timeToRun "3989"
[(1 to: 11) asSet combinations] timeToRun "16349" 
[(1 to: 12) asSet combinations] timeToRun  "68780"

So a little better, but I was expecting much more.


I think that the amount of work for (n + 1) is about (timeToRun of: n) *  
3; should be observable better with larger n's ;)


Squeak looks to be already fast in this case, your routine is almost  
optimal :) Implementing it in Set so that Set's internal can be of benefit  
won't bring these figures down much -- what remains is (aSet includes:  
another), times #copyWithout: loop, and that seems to be unavoidable.


How about naming

Collection>>#asPowerset
^ self asSet powersetInto: (Set new: (2 raisedTo: self size))

with Set>>#powersetInto: and putting that into the next release.

/Klaus


What's worrying is this:

(1 to: 10) asSet combinations size  "1024"
((1 to: 10) asSet combinations collect: [:e | e hash]) asSet size "16"

So, 1024 distinct Sets have only 16 distinct hashes between them? That  
seems pretty bad. It would probably be possible to get a little bit more  
out of the routine with a better (more appropriate) hash function. Even  
so, I don't think it will ever be as fast as your method but will  
happily be proved wrong! :)


Z.

cdrick wrote:

Set>>subsets
 | subsets |
 subsets := Set with: self.
 self asArray permutationsDo: [:e | subsets add: e asSet].
 self do: [:e | subsets addAll: (self copyWithout: e) subsets].
 ^ subsets

 nice too and frustrating how you got it quick :)
 I tried a recursive method too first but found the byte ressemblance  
so...

My only consolation is that the recursive solution (subsets2) is
slower and hardly work for Set with more than 10 elements.
 set := #(1 2 3 4 5) asSet.
[ set subsets  ] timeToRun ." 1"
[ set subsets2  ] timeToRun. " 8"
 set := #(1 2 3 4 5 6 7) asSet.
[ set subsets  ] timeToRun ." 5"
[ set subsets2  ] timeToRun. " 233"
 set := #(1 2 3 4 5 6 7 8 ) asSet.
[ set subsets  ] timeToRun . " 11"
[ set subsets2  ] timeToRun. " 1683"
 set := (1 to: 10) asSet.
[ set subsets  ] timeToRun . " 46"
 set := (1 to: 15) asSet.
[ set subsets  ] timeToRun ." 2484"
 set := (1 to: 20) asSet.
[ set subsets  ] timeToRun .  "559953"  "but here the result has (2
raisedTo: 20)  1 048 576 Sets :)"
 set := (1 to: 50) asSet.
[ set subsets  ] timeToRun ."I got a "space is low   wow" :)
 I have to go,
That was fun :)
 See you
 Cédrick
   


 ___
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners



___
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners


[Newbies] Re: Another extension proposal -> subsets

2008-07-24 Thread Zulq Alam
Actually, we don't need #permutationsDo: as we're only looking for 
combinations. With a few optimisations:


combinations
  ^ self combinationsInto: (Set new: (2 raisedTo: self size))

combinationsInto: aSet
  (aSet includes: self) ifTrue: [^ self]. "optimisation"
  aSet add: self.
  self do:
[:each |
(self copyWithout: each) combinationsInto: aSet].   
  ^ aSet

[(1 to: 8) asSet combinations] timeToRun "215"
[(1 to: 9) asSet combinations] timeToRun "918"
[(1 to: 10) asSet combinations] timeToRun "3989"
[(1 to: 11) asSet combinations] timeToRun "16349" 
[(1 to: 12) asSet combinations] timeToRun  "68780"

So a little better, but I was expecting much more. What's worrying is this:

(1 to: 10) asSet combinations size  "1024"
((1 to: 10) asSet combinations collect: [:e | e hash]) asSet size "16"

So, 1024 distinct Sets have only 16 distinct hashes between them? That 
seems pretty bad. It would probably be possible to get a little bit more 
out of the routine with a better (more appropriate) hash function. Even 
so, I don't think it will ever be as fast as your method but will 
happily be proved wrong! :)


Z.

cdrick wrote:

Set>>subsets
 | subsets |
 subsets := Set with: self.
 self asArray permutationsDo: [:e | subsets add: e asSet].
 self do: [:e | subsets addAll: (self copyWithout: e) subsets].
 ^ subsets


nice too and frustrating how you got it quick :)

I tried a recursive method too first but found the byte ressemblance so...
My only consolation is that the recursive solution (subsets2) is
slower and hardly work for Set with more than 10 elements.

set := #(1 2 3 4 5) asSet.
[ set subsets  ] timeToRun ." 1"
[ set subsets2  ] timeToRun. " 8"

set := #(1 2 3 4 5 6 7) asSet.
[ set subsets  ] timeToRun ." 5"
[ set subsets2  ] timeToRun. " 233"

set := #(1 2 3 4 5 6 7 8 ) asSet.
[ set subsets  ] timeToRun . " 11"
[ set subsets2  ] timeToRun. " 1683"

set := (1 to: 10) asSet.
[ set subsets  ] timeToRun . " 46"

set := (1 to: 15) asSet.
[ set subsets  ] timeToRun ." 2484"

set := (1 to: 20) asSet.
[ set subsets  ] timeToRun .  "559953"  "but here the result has (2
raisedTo: 20)  1 048 576 Sets :)"

set := (1 to: 50) asSet.
[ set subsets  ] timeToRun ."I got a "space is low   wow" :)

I have to go,
That was fun :)

See you

Cédrick




___
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners


___
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners


Re: [Newbies] Re: Another extension proposal -> subsets

2008-07-24 Thread cdrick
Nicolas, your initials fits you ;)

nice and twice quicker as the bit-sift one.

> You have one messsage #combinations:atATimeDo: to enumerate partitions, but
> in SequenceableCollection, not in Set:
>
> | collec subset |
> collec := 1 to: 4.
> subset := OrderedCollection new: (2 raisedTo: collec size).
> subset add: #().
> 1 to: collec size do: [:subSize |
>  collec combinations: subSize atATimeDo: [:subArray | subset add: subArray
> copy]].
> subset
>
> Mind the copy, because the iteration recycle same subArray for economy.
>

I'dnevertheless prefer a name like:  #combinationsSize:do:


Cédrick
___
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners


[Newbies] Re: Another extension proposal -> subsets

2008-07-23 Thread nicolas cellier
You have one messsage #combinations:atATimeDo: to enumerate partitions, 
but in SequenceableCollection, not in Set:


| collec subset |
collec := 1 to: 4.
subset := OrderedCollection new: (2 raisedTo: collec size).
subset add: #().
1 to: collec size do: [:subSize |
 collec combinations: subSize atATimeDo: [:subArray | subset add: 
subArray copy]].

subset

Mind the copy, because the iteration recycle same subArray for economy.

cdrick a écrit :

Set>>subsets
 | subsets |
 subsets := Set with: self.
 self asArray permutationsDo: [:e | subsets add: e asSet].
 self do: [:e | subsets addAll: (self copyWithout: e) subsets].
 ^ subsets


nice too and frustrating how you got it quick :)

I tried a recursive method too first but found the byte ressemblance so...
My only consolation is that the recursive solution (subsets2) is
slower and hardly work for Set with more than 10 elements.

set := #(1 2 3 4 5) asSet.
[ set subsets  ] timeToRun ." 1"
[ set subsets2  ] timeToRun. " 8"

set := #(1 2 3 4 5 6 7) asSet.
[ set subsets  ] timeToRun ." 5"
[ set subsets2  ] timeToRun. " 233"

set := #(1 2 3 4 5 6 7 8 ) asSet.
[ set subsets  ] timeToRun . " 11"
[ set subsets2  ] timeToRun. " 1683"

set := (1 to: 10) asSet.
[ set subsets  ] timeToRun . " 46"

set := (1 to: 15) asSet.
[ set subsets  ] timeToRun ." 2484"

set := (1 to: 20) asSet.
[ set subsets  ] timeToRun .  "559953"  "but here the result has (2
raisedTo: 20)  1 048 576 Sets :)"

set := (1 to: 50) asSet.
[ set subsets  ] timeToRun ."I got a "space is low   wow" :)

I have to go,
That was fun :)

See you

Cédrick




___
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners


___
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners


Re: [Newbies] Re: Another extension proposal -> subsets

2008-07-23 Thread cdrick
>
> Set>>subsets
>  | subsets |
>  subsets := Set with: self.
>  self asArray permutationsDo: [:e | subsets add: e asSet].
>  self do: [:e | subsets addAll: (self copyWithout: e) subsets].
>  ^ subsets

nice too and frustrating how you got it quick :)

I tried a recursive method too first but found the byte ressemblance so...
My only consolation is that the recursive solution (subsets2) is
slower and hardly work for Set with more than 10 elements.

set := #(1 2 3 4 5) asSet.
[ set subsets  ] timeToRun ." 1"
[ set subsets2  ] timeToRun. " 8"

set := #(1 2 3 4 5 6 7) asSet.
[ set subsets  ] timeToRun ." 5"
[ set subsets2  ] timeToRun. " 233"

set := #(1 2 3 4 5 6 7 8 ) asSet.
[ set subsets  ] timeToRun . " 11"
[ set subsets2  ] timeToRun. " 1683"

set := (1 to: 10) asSet.
[ set subsets  ] timeToRun . " 46"

set := (1 to: 15) asSet.
[ set subsets  ] timeToRun ." 2484"

set := (1 to: 20) asSet.
[ set subsets  ] timeToRun .  "559953"  "but here the result has (2
raisedTo: 20)  1 048 576 Sets :)"

set := (1 to: 50) asSet.
[ set subsets  ] timeToRun ."I got a "space is low   wow" :)

I have to go,
That was fun :)

See you

Cédrick
___
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners


[Newbies] Re: Another extension proposal -> subsets

2008-07-23 Thread Zulq Alam
Couldn't find anything obvious but remembered #permutationsDo: which may 
help. Not sure if these does what you want, but maybe it can be adapted?


Set>>subsets
  | subsets |
  subsets := Set with: self.
  self asArray permutationsDo: [:e | subsets add: e asSet].
  self do: [:e | subsets addAll: (self copyWithout: e) subsets].
  ^ subsets

#(1 2) asSet subsets.
"a Set(a Set(1 2) a Set() a Set(2) a Set(1))"

#(1 2 3) asSet subsets.
"a Set(a Set(1 2) a Set() a Set(2) a Set(1 2 3) a Set(2 3) a Set(3) a 
Set(1 3) a Set(1))"


Z.

cdrick wrote:

When we don't find method, we reinvent the wheel. Ok, so this is what
happened with my quest on rounding float. So here is another example.
I needed a method subsets ... to have all subsets of a Set...

#(1 2 ) asSet subsets " -> a Set(a Set(1 2) a Set(1) a Set(2) a Set()) "

Is there already something ? I couldn't find so here is what I did ?
probably hackish (because of the binary mask) so feel free to comment
;)

Set>>subsets
| subsetsSize subsets workArray |
workArray := self asArray.
subsetsSize := 2 raisedTo: self size.
subsets := OrderedCollection new.
1 to: subsetsSize do: [:ea |
subsets add: ((workArray masquedBy: (ea printStringBase: 2))
asSet)]. "masque par une conversion binaire"
^subsets asSet  "could be an array of sets"

ArrayedCollection>>masquedBy: aBitString
| result entry bitString |
entry := self reverse.
bitString := aBitString reverse.
result := OrderedCollection new.
1 to: (self size) do: [:ea |
 ((bitString  at: ea ifAbsent: []) = $1) ifTrue: [result add: 
(entry
at: ea)]].
^result reverse

Cédrick




___
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners


___
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners