Re: [R] Constrained vector permutation

2010-02-01 Thread Andrew Rominger
Chuck,

Thanks for the reference.  MCMC with swapping was my original goal and I
think I'll go with that in the long run--although using sample() has worked
out for now.  I was initially concerned that checking all choose(n,2)
possible swaps would slow down the process, but in my case for choose(30,2)
= 435 this seems not unreasonable.  I wonder if for larger vectors an
alternative would be desired.

Thanks to everyone for your help--
Andy


On Fri, Jan 29, 2010 at 8:23 PM, Charles C. Berry cbe...@tajo.ucsd.eduwrote:

 On Fri, 29 Jan 2010, Andrew Rominger wrote:

  Being reasonably sure that all valid permutations are equally probable is
 important to me.  I've played around with search algorithms in permuting
 contingency tables and find that possible solutions decrease rapidly once
 one starts assigning values, particularly if small values are assigned
 first, so it would seem all solutions are not equally probable (not only
 that but one frequently encounters dead ends where there are values left
 to assign and no allowable place to put them).  As such I think I'd opt to
 use sample()... several times if needed.

 To clarify, yes, I only need one valid permutation, the idea is I'll
 generate 1000s of ordered vectors, and then for each one generate one
 valid
 permutation.

 Thanks very much for the help and insights--
 Andy


 Andy,

 If you have some sense of importance sampling and/or MCMC you might look at

 Zaman and Simberloff (2002, Environmental and Ecological Statistics 9,
 405--421).

 which concerns sampling a binary matrix with fixed margins - not quite your
 problem, but akin to it in being a combinatorial nightmare without an
 obvious direct solution of workable size for real problems.

 They define a neighborhood for each allowable matrix s.t. swapping a pair
 of 1's at ij and kl with a pair of 0's at il and kj  (which doesn't violate
 the margin constraints) leads to a member of the neighborhood. IIRC, the
 size of the neighborhood and the sizes of the neighborhoods of the members
 of its neighborhood determine the probabilities of staying put or moving to
 get the next element of the MCMC chain and which member of the neighborhood
 to choose.

 I suppose something like that (i.e. defining neighborhoods of allowable
 permutations, measuring their size, and using this to guide sampling or
 develop importance weights) might apply in your case. Maybe something like
 this: start with an ordering of your n-vector that conforms to the
 constraints, look at all the choose(n,2) pairs of elements and check which
 of them could be exchanged to yield another conforming ordering; the
 allowable swaps define the neighborhood, and their number is its size.

 So, the idea is to develop an MCMC sampler. Run it for each ordered vector
 to get past the burn in, then take one value from it.

 HTH,

 Chuck



 On Thu, Jan 28, 2010 at 3:04 PM, Thomas Lumley tlum...@u.washington.edu
 wrote:

  On Thu, 28 Jan 2010, Jason Smith wrote:

  It wouldn't be guaranteed to produce any usable permutation, but it
 seems

 like it would be much faster and so could be repeated until an
 acceptable
 vector is found.  What do you think?

 Thanks--
 Andy


  I think I am not understanding what your ultimate goal is so I'm not
 sure I can give you appropriate advice.  Are you looking for a single
 valid permutation or all of them?

 Since that constraint sets a ceiling on each subsequent value, it
 seems like you could solve this problem more easily and quickly by
 using a search strategy instead of random sampling or generating all
 permutations then testing.  The constraint will help prune the search
 space so you only generate valid permutations.  Once you are examining
 a particular element you can determine which of the additional
 elements would be valid, so only consider those.


 It's easy to generate valid permutations this way.  It does not appear
 straightforward to ensure that all valid permutations are sampled with
 equal
 probability, which I thought was part of the specification of the
 problem.

 -thomas


 Thomas Lumley   Assoc. Professor, Biostatistics
 tlum...@u.washington.eduUniversity of Washington, Seattle


[[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


 Charles C. Berry(858) 534-2098
Dept of Family/Preventive
 Medicine
 E mailto:cbe...@tajo.ucsd.edu   UC San Diego
 http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE 

Re: [R] Constrained vector permutation

2010-01-29 Thread Andrew Rominger
Being reasonably sure that all valid permutations are equally probable is
important to me.  I've played around with search algorithms in permuting
contingency tables and find that possible solutions decrease rapidly once
one starts assigning values, particularly if small values are assigned
first, so it would seem all solutions are not equally probable (not only
that but one frequently encounters dead ends where there are values left
to assign and no allowable place to put them).  As such I think I'd opt to
use sample()... several times if needed.

To clarify, yes, I only need one valid permutation, the idea is I'll
generate 1000s of ordered vectors, and then for each one generate one valid
permutation.

Thanks very much for the help and insights--
Andy


On Thu, Jan 28, 2010 at 3:04 PM, Thomas Lumley tlum...@u.washington.eduwrote:

 On Thu, 28 Jan 2010, Jason Smith wrote:

  It wouldn't be guaranteed to produce any usable permutation, but it seems
 like it would be much faster and so could be repeated until an acceptable
 vector is found.  What do you think?

 Thanks--
 Andy


 I think I am not understanding what your ultimate goal is so I'm not
 sure I can give you appropriate advice.  Are you looking for a single
 valid permutation or all of them?

 Since that constraint sets a ceiling on each subsequent value, it
 seems like you could solve this problem more easily and quickly by
 using a search strategy instead of random sampling or generating all
 permutations then testing.  The constraint will help prune the search
 space so you only generate valid permutations.  Once you are examining
 a particular element you can determine which of the additional
 elements would be valid, so only consider those.


 It's easy to generate valid permutations this way.  It does not appear
 straightforward to ensure that all valid permutations are sampled with equal
 probability, which I thought was part of the specification of the problem.

  -thomas


 Thomas Lumley   Assoc. Professor, Biostatistics
 tlum...@u.washington.eduUniversity of Washington, Seattle


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Constrained vector permutation

2010-01-29 Thread Charles C. Berry

On Fri, 29 Jan 2010, Andrew Rominger wrote:


Being reasonably sure that all valid permutations are equally probable is
important to me.  I've played around with search algorithms in permuting
contingency tables and find that possible solutions decrease rapidly once
one starts assigning values, particularly if small values are assigned
first, so it would seem all solutions are not equally probable (not only
that but one frequently encounters dead ends where there are values left
to assign and no allowable place to put them).  As such I think I'd opt to
use sample()... several times if needed.

To clarify, yes, I only need one valid permutation, the idea is I'll
generate 1000s of ordered vectors, and then for each one generate one valid
permutation.

Thanks very much for the help and insights--
Andy


Andy,

If you have some sense of importance sampling and/or MCMC you might look 
at


Zaman and Simberloff (2002, Environmental and Ecological Statistics 9,
405--421).

which concerns sampling a binary matrix with fixed margins - not quite 
your problem, but akin to it in being a combinatorial nightmare without 
an obvious direct solution of workable size for real problems.


They define a neighborhood for each allowable matrix s.t. swapping a pair 
of 1's at ij and kl with a pair of 0's at il and kj  (which doesn't 
violate the margin constraints) leads to a member of the neighborhood. 
IIRC, the size of the neighborhood and the sizes of the neighborhoods of 
the members of its neighborhood determine the probabilities of staying put 
or moving to get the next element of the MCMC chain and which member of 
the neighborhood to choose.


I suppose something like that (i.e. defining neighborhoods of allowable 
permutations, measuring their size, and using this to guide sampling or 
develop importance weights) might apply in your case. Maybe something like 
this: start with an ordering of your n-vector that conforms to the 
constraints, look at all the choose(n,2) pairs of elements and check which 
of them could be exchanged to yield another conforming ordering; the 
allowable swaps define the neighborhood, and their number is its size.


So, the idea is to develop an MCMC sampler. Run it for each ordered vector 
to get past the burn in, then take one value from it.


HTH,

Chuck




On Thu, Jan 28, 2010 at 3:04 PM, Thomas Lumley tlum...@u.washington.eduwrote:


On Thu, 28 Jan 2010, Jason Smith wrote:

 It wouldn't be guaranteed to produce any usable permutation, but it seems

like it would be much faster and so could be repeated until an acceptable
vector is found.  What do you think?

Thanks--
Andy



I think I am not understanding what your ultimate goal is so I'm not
sure I can give you appropriate advice.  Are you looking for a single
valid permutation or all of them?

Since that constraint sets a ceiling on each subsequent value, it
seems like you could solve this problem more easily and quickly by
using a search strategy instead of random sampling or generating all
permutations then testing.  The constraint will help prune the search
space so you only generate valid permutations.  Once you are examining
a particular element you can determine which of the additional
elements would be valid, so only consider those.



It's easy to generate valid permutations this way.  It does not appear
straightforward to ensure that all valid permutations are sampled with equal
probability, which I thought was part of the specification of the problem.

 -thomas


Thomas Lumley   Assoc. Professor, Biostatistics
tlum...@u.washington.eduUniversity of Washington, Seattle



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Constrained vector permutation

2010-01-28 Thread Jason Smith
Andrew Rominger ajrominger at gmail.com writes:

 I'm trying to permute a vector of positive integers  0 with the constraint

Hi Andy

I'm not sure if you are explicitly wanting to use a sampling approach, but the 
gtools library has a permutations function (found by ??permutation then ?
gtools::combinations).

Hope this helps,
Jason Smith

Here is the script I used:


# Constraint
# f(n_i) = 2 * f(n_(i-1))
#
# Given a start value and the number of elements
# recursively generate a vector representing the 
# maximum values each index is allowed
#
f - function(value, num_elements) {
#cat(paste(f(,value,,,num_elements,)\n))
if (num_elements  1) {
value;
} else {
z - c(value,f(2*value, num_elements-1))
}
}

# Generate base vector
v - 2:6

# Calculate constraint vector
v.constraints - f(v[1],length(v)-1)

# Generate permutations using gtools functions
library(gtools) 
v.permutations - permutations(length(v), length(v), v)

# Check each permutation
results - apply(v.permutations,1, function(x) all(x = v.constraints))

#
# Display Results
#
print(Original Vector)
print(v)
print(Constraint Vector)
print(v.constraints)
print(Does Vector meet Constraints)
print(cbind(v.permutations,results))

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Constrained vector permutation

2010-01-28 Thread Jason Smith
I just realized I read through your email too quickly and my script does
not actually address the constraint on each permutation, sorry about that.

You should be able to use the permutations function to generate the vector 
permutations however.

Jason

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Constrained vector permutation

2010-01-28 Thread Andrew Rominger
Hi Jason,

Thanks for you suggestions, I think that's pretty close to what I'd need.
The only glitch is that I'd be working with a vector of ~30 elements, so
permutations(...) would take quite a long time.  I only need one permutation
per vector (the whole routine will be within a loop that generates
pseudo-random vectors that could potentially conform to the constraints).

In light of that, do you think I'd be better off doing something like:
v.permutations - replicate(1,sample(v,length(v),rep=FALSE))   # instead
of permutations()
results - apply(v.permutations,2,function(x){all(x =
f(x[1],length(x)-1))})   # function f(...) would be like your f

It wouldn't be guaranteed to produce any usable permutation, but it seems
like it would be much faster and so could be repeated until an acceptable
vector is found.  What do you think?

Thanks--
Andy


On Thu, Jan 28, 2010 at 6:15 AM, Jason Smith devja...@gmail.com wrote:

 I just realized I read through your email too quickly and my script does
 not actually address the constraint on each permutation, sorry about that.

 You should be able to use the permutations function to generate the vector
 permutations however.

 Jason

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Constrained vector permutation

2010-01-28 Thread Jason Smith
 It wouldn't be guaranteed to produce any usable permutation, but it seems
 like it would be much faster and so could be repeated until an acceptable
 vector is found.  What do you think?

 Thanks--
 Andy


I think I am not understanding what your ultimate goal is so I'm not
sure I can give you appropriate advice.  Are you looking for a single
valid permutation or all of them?

Since that constraint sets a ceiling on each subsequent value, it
seems like you could solve this problem more easily and quickly by
using a search strategy instead of random sampling or generating all
permutations then testing.  The constraint will help prune the search
space so you only generate valid permutations.  Once you are examining
a particular element you can determine which of the additional
elements would be valid, so only consider those.

--jason

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Constrained vector permutation

2010-01-28 Thread Thomas Lumley

On Thu, 28 Jan 2010, Jason Smith wrote:


It wouldn't be guaranteed to produce any usable permutation, but it seems
like it would be much faster and so could be repeated until an acceptable
vector is found.  What do you think?

Thanks--
Andy



I think I am not understanding what your ultimate goal is so I'm not
sure I can give you appropriate advice.  Are you looking for a single
valid permutation or all of them?

Since that constraint sets a ceiling on each subsequent value, it
seems like you could solve this problem more easily and quickly by
using a search strategy instead of random sampling or generating all
permutations then testing.  The constraint will help prune the search
space so you only generate valid permutations.  Once you are examining
a particular element you can determine which of the additional
elements would be valid, so only consider those.


It's easy to generate valid permutations this way.  It does not appear 
straightforward to ensure that all valid permutations are sampled with equal 
probability, which I thought was part of the specification of the problem.

  -thomas


Thomas Lumley   Assoc. Professor, Biostatistics
tlum...@u.washington.eduUniversity of Washington, Seattle
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.