from:"Chris Oldmeadow"

[R] simulate binary markov chain

2008-12-16 Thread Chris Oldmeadow

Hi all, I was hoping somebody may know of a function for simulating a 
large binary sequence (length >10 million) using a (1st order) markov 
model with known (2x2) transition matrix. It needs to be reasonably 
fast. I have tried the following;


mc<-function(sq,P){
 s<-c()
 x<-row.names(P)
 n<-length(sq)
 p1<-sum(sq)/n
 s[1] <- rbinom(1,1,p1);
 for ( i in 2:n){
s[i] <- rbinom( 1, 1, P[s[i-1]+1] )
 }
 return(s)
}


P<-c(0.63,0.27)
x<-rbinom(500,1,0.5)
new<-mc(x,P)

thanks in advance!
Chris

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] sliding window over a large vector

2008-12-15 Thread Chris Oldmeadow


Hi all,

I have a very large binary vector, I wish to calculate the number of 
1's  over sliding windows.


this is my very slow function

slide<-function(seq,window){
  n<-length(seq)-window
  tot<-c()
  tot[1]<-sum(seq[1:window])   
  for (i in 2:n) {

 tot[i]<- tot[i-1]-seq[i-1]+seq[i]
  }
  return(tot)
}

this works well for for reasonably sized vectors. Does anybody know a 
way for large vectors ( length=12 million), im trying to avoid using C.


Thanks,
Chris

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] sample consecutive integers efficiently

2008-08-28 Thread Chris Oldmeadow




Charles C. Berry wrote:

On Thu, 28 Aug 2008, Chris Oldmeadow wrote:


Hi all,

I have some rough code to sample consecutive integers with length 
according to a vector of lengths


#sample space (representing positions)
pos<-c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20)

#sample lengths
lengths<-c(2,3,2)

From these two vectors I need a vector of sampled positions.


the sampling is without replacement, making things tough as the 
sampled integers need to be consecutive. Im hoping somebody knows a 
faster way of doing it than I have. ATM its way to slow on large 
vectors.



samplePos<-function(l){
   start.pos<-sample(pos,1)
   end.pos<-start.pos+l-1
   posies<-start.pos:end.pos
   posies
}

s.start<-c()


newPos<-function(a){
   rp<-samplePos(a)
   #test sampled range is consecutive, if not resample
   if (length(rp) != rp[a]+1 -rp[1]){rp<-samplePos(a)}
   pos<-setdiff(pos,rp)
   rp[1]
}

newps<-c()
newps<-unlist(lapply(lengths,newPos))

I think the bottleneck may be on the setdiff() function - the sample 
space is quite large so I dont think there would be too many rejections.



The bottleneck is in the formulation of the sampling scheme.

This is a simple combinatorics problem. There are 3360 possible values 
( prod(16:14) ) for the start positions of the three elements, and you 
can form a bijection between 1:3360 and the individual samples. If the 
number of possible sample is small enough, it would be most efficient 
to sample from the corresponding integer vector and then translate it 
to the corresponding sample. For larger values where the number of 
possible samples become a challenge for 32-bit integer arithmetic, I 
expect this approach would be preferred:


Permute length ( pos ) - sum ( lengths ) + length( lengths ) distinct 
(consecutively labelled) elements:


elz <- sample( length ( pos ) - sum ( lengths ) + length( lengths ) )


Take the lengths of the original objects to be

z.lens <- rep( 1, length( elz ) )
z.lens[ seq(along = lengths ) ] <- lengths

(i.e. objects longer than 1 appear first)

Determine the start positions of the objects as if they were laid down 
consecutively according to the permutation:


start <- head( cumsum( c(0, z.lens[ elz ]) ) + 1 , -1 )

Find the start positions of just those with lengths greater than 1

gt.1 <- match( seq(along=lengths) , elz )

Report the start positions

start[ gt.1 ]

---

If length( pos ) is large, you can rewrite the above to simply sample 
the positions (in the ordering) of the objects with lengths greater 
than 1. You will have to revise the calculation of start and gt.1 in 
that case.


HTH,

Chuck


 >



Many thanks,
Chris

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.



Charles C. Berry(858) 534-2098
Dept of Family/Preventive 
Medicine

E mailto:[EMAIL PROTECTED]UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 
92093-0901






Thanks! that worked perfectly.

Chris

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] sample consecutive integers efficiently

2008-08-28 Thread Chris Oldmeadow


Hi all,

I have some rough code to sample consecutive integers with length 
according to a vector of lengths


#sample space (representing positions)
pos<-c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20)

#sample lengths
lengths<-c(2,3,2)

From these two vectors I need a vector of sampled positions.


the sampling is without replacement, making things tough as the sampled 
integers need to be consecutive. Im hoping somebody knows a faster way 
of doing it than I have. ATM its way to slow on large vectors.



samplePos<-function(l){
   start.pos<-sample(pos,1)
   end.pos<-start.pos+l-1
   posies<-start.pos:end.pos
   posies
}

s.start<-c()


newPos<-function(a){
   rp<-samplePos(a)
   #test sampled range is consecutive, if not resample
   if (length(rp) != rp[a]+1 -rp[1]){rp<-samplePos(a)}
   pos<-setdiff(pos,rp)
   rp[1]
}

newps<-c()
newps<-unlist(lapply(lengths,newPos))

I think the bottleneck may be on the setdiff() function - the sample 
space is quite large so I dont think there would be too many rejections.




Many thanks,
Chris

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] sequence with start and stop positions

2008-08-25 Thread Chris Oldmeadow


Hi,

I have a vector of start positions, and another vector of stop positions,

eg start<-c(1,20,50)
stop<-c(7,25,53)

Is there a quick way to create a sequence from these vectors?

new<-c(1,2,3,4,5,6,7,20,21,22,23,24,25,50,51,52,53)

the way Im doing it at the moment is

pos<-seq(start[1],stop[1])

for (i in 2:length(start)){
 new<-seq(start[i],stop[i])
 pos<-c(pos,new)
}

This works on small data, but its very inefficient on large vectors, and 
is taking forever!


Anybody no a better way?

many thanks,
Chris

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] simulate binary markov chain

[R] sliding window over a large vector

Re: [R] sample consecutive integers efficiently

[R] sample consecutive integers efficiently

[R] sequence with start and stop positions

5 matches

Site Navigation

Mail list logo

Footer information