The combn solution offered by Bill is great.  It struck me that what you are 
doing, in fact, is generating the null distribution of the two-sample Wilcoxon 
test where the first group has size m and the second group has size n.  In 
general, the length of the array has size choose(n+m-1,m) which gets very big 
very fast.  For example,

> choose(20+20-1,20)
[1] 68923264410

If, on the off chance that you are interested in *summing* the unordered 
combinations across columns, there is a very slick way to do this that takes a 
tiny fraction of the time and memory that generating huge arrays entails.  If 
not, obviously, you already have your solution.

Just in case, here is the code to generate the distribution of sums.  It is 
based on an algorithm due to Harding (1984).

f <- function(m,n) {

umax <- (m*n+1)

ifelse (umax%%2==0, umaxp <- (umax/2)-1, umaxp <- (umax-1)/2)
#umaxp is analagous to “M” from Harding (1984)

p <- min((m+n),umaxp)
q <- min(m,umaxp)

dis <- c(1,numeric(umaxp))

if ((n+1)>umaxp) {

for (i in 1:q) {                        #steps for denominator of generating 
function
for (j in i:umaxp) {
        dis[j+1] <- (dis[j+1] + dis[(j-i)+1])
        }
        }

} else {

for (i in (n+1):(p)) {          #steps for numerator of generating function
for (j in (umaxp):i) {
        dis[j+1] <- (dis[j+1] - dis[(j-i)+1])
        }
        }

for (i in 1:q) {                        #steps for denominator of generating 
function
for (j in i:umaxp) {
        dis[j+1] <- (dis[j+1] + dis[(j-i)+1])
        }
        }

}

ldis <- length(dis)
ifelse(umax%%2==0,dis <- c(dis,dis[ldis:1]),dis <- c(dis,dis[(ldis-1):1]))
dispr <- dis/choose((n+m),n)
ws <- sum(1:m):sum((n+1):(n+m))
lws <- length(ws)
mat3 <- cbind(ws,dis,dispr,numeric(lws),numeric(lws))
mat3[,4] <- cumsum(mat3[,3])
mat3[,5] <- cumsum(mat3[,3][lws:1])[lws:1]
colnames(mat3) <- c("W","Freq","Probability","Sum up","Sum down")

print(mat3)
}

> system.time(f(20,20))
   user  system elapsed 
   0.11    0.00    0.11 

Bryan




That's brilliant - thanks.

On 17 Sep 2009, at 23:36, William Dunlap wrote:

> There is a 1-1 correspondance between your n-sets
> consisting of m possible element types (0 through m-1
> in your example) and the number of n-subsets of a (n+m-1)-set.
> E.g., your example had m=3 and n=3 and subtracting
> 1:3 from each column of combn(3+3-1,3) gives your result:
>
>> t(combn(3+3-1, 3)-(1:3))
>      [,1] [,2] [,3]
> [1,]    0    0    0
> [2,]    0    0    1
> [3,]    0    0    2
> [4,]    0    1    1
> [5,]    0    1    2
> [6,]    0    2    2
> [7,]    1    1    1
> [8,]    1    1    2
> [9,]    1    2    2
> [10,]    2    2    2
>
> Bill Dunlap
> TIBCO Software Inc - Spotfire Division
> wdunlap tibco.com
>
>> -----Original Message-----
>> From: r-help-boun...@r-project.org
>> [mailto:r-help-boun...@r-project.org] On Behalf Of Dan Halligan
>> Sent: Thursday, September 17, 2009 1:31 PM
>> To: r-help@r-project.org
>> Subject: [R] generating unordered combinations
>>
>> Hi,
>>
>> I am trying to generate all unordered combinations of a set of
>> numbers / characters, and I can only find a (very) clumsy way of  
>> doing
>> this using expand.grid.  For example, all unordered combinations of
>> the numbers 0, 1, 2 are:
>> 0, 0, 0
>> 0, 0, 1
>> 0, 0, 2
>> 0, 1, 1
>> 0, 1, 2
>> 0, 2, 2
>> 1, 1, 1
>> 1, 1, 2
>> 1, 2, 2
>> 2, 2, 2
>>
>> (I have not included, for example, 1, 0, 0, since it is equivalent to
>> 0, 0, 1).
>>
>> I have found a way to generate this data.frame using expand.grid as
>> follows:
>>
>> g <- expand.grid(c(0,1,2), c(0,1,2), c(0,1,2))
>> for(i in 1:nrow(g)) {
>>      g[i,] <- sort(as.character(g[i,]))
>> }
>> o <- order(g$Var1, g$Var2, g$Var3)
>> unique(g[o,]).
>>
>> This is obviously quite clumsy and hard to generalise to a greater
>> number of characters, so I'm keen to find any other solutions.  Can
>> anyone suggest a better (more general, quicker) method?
>>
>> Cheers
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>


-------------
Bryan Keller, Doctoral Student/Project Assistant
Educational Psychology - Quantitative Methods
The University of Wisconsin - Madison

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to