Hi,

I'm stuck on getting decent performance for the following problem.
Say we have a list of all 5-card hands from a deck of 52 cards:
   load '~addons/stats/base/combinatorial.ijs'
   $allHands=. 5 comb 52
2598960 5

We also have a list of all 4-card combinations:
   $HCs=. 4 comb 52
270725 4

I want to find all hands possible using any two cards from a given row of *HCs
*but excluding those with either of the other two cards from the same row.

For example, for this randomly-chosen row of *HCs*, which hands include the
first two but exclude the second two?
   (]{~[:?#) HCs
15 22 33 44
   (#,+/) include=. 2=+/"1 allHands e."1 ] *15 22*
2598960 19600
   (#,+/) exclude=. 0=+/"1 allHands e."1 ] *33 44*
2598960 480200

So there are 17,296 hands we want:
   +/include*.exclude
17296

Check five at random to verify that these look correct:
   ]rcix=. (]{~[:5&?#) I. include*.exclude
1002358 1165504 2176134 1960355 56685
   rcix{allHands
 4 15 22 37 39
 5 15 22 34 45
15 18 20 22 39
12 15 22 27 39
 0  4  6 15 22

These look good because all include (15,22) but exclude any of  (33,44).

However, this is only one possibility for this single row of *HCs* but we
have six variations for this row based on choosing all possible pairs of
indexes into the row of *HCs* to include,
   |:inclIx=. 2 comb 4
0 0 0 1 1 2
1 2 3 2 3 3

And six corresponding pairs to exclude (indexes into the row of *HCs*):
   |:exclIx=. (i.4)-."1 ] 2 comb 4
2 1 1 0 0 0
3 3 2 3 2 1

In the above example, we did the one of the six corresponding to *inclIx *of
(0,1) and *exclIx *of (2,3).  Is there a way to do all six sets for each
row of *HCs* compared to each row of *allHands*?

We might start by reducing the sizes of our arrays by making them character
instead of integer:
   allHands=. allHands{a. [ HCs=. HCs{a.

However, we quickly see that we run out of memory for the expression we
would like to write:
   include=. 2=+/"1 allHands e."1 / inclIx{"1 / HCs
|out of memory
|   include=.2=+/"1 allHands     e."1/inclIx{"1/HCs

A little arithmetic shows us the extent of the problem:
   (],*/) #&>(,/inclIx{"1/HCs);<allHands
1624350 2598960 4221620676000
   10^.1624350*2598960
12.6255

So we have a 4-trillion-element intermediate result which is a bit much.

Our immediate thought might be to do this in pieces by selecting successive
portions of the right-hand part (*HCs*) of this expression.  We would like
to work on integral pieces for neatness, so we look at the prime factors of
the size of *HCs* and select the product of a couple of them to be the
"chunk size":
   q:#HCs
5 5 7 7 13 17
   ixs=. (ctr*chsz)+i.chsz [ ctr=. 0 [ chsz=. */5 5

[The expression for the next chunk would be
   ixs=. (ctr*chsz)+i.chsz [ ctr=. >:ctr
and so on.]

Timing how long one piece takes and scoring the results by simply adding up
their indexes into *allHands*, we get this:
   6!:2 'sel=. (2=+/"1 allHands e."1 / inclIx{"1 / ixs{HCs) *. 0=+/"1
allHands e."1 / exclIx{"1 / ixs{HCs'
11.5105
   scores=. 0$~#HCs    NB. Initialize scores vector
   6!:2 'theseScores=. +/&>I.&.><"1 |:+./"1 ] 0 2 1|:sel'
0.687056
   6!:2 'scores=. (theseScores) ixs}scores'
6.4e_6
   +/11.5105 0.687056 6.4e_6      NB. Total time per chunk
12.1976
   12.1976*(#ixs)%~#HCs           NB. Total estimated time
132088
   0 60 60#:12.1976*(#ixs)%~#HCs  NB. Estimated time: h m s
36 41 27.8104

So we should be able to do it this way in about 36 hours.  Can someone
think of a faster method to accomplish this?  The memory requirements seem
modest enough that I should be able to run five to ten processes
simultaneously on this to reduce the overall time but is there a way to
speed up the basic selection?

Thanks,

Devon


-- 

Devon McCormick, CFA

Quantitative Consultant
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Reply via email to