Re: [R] OT UNIX grep question
On 10/08/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>
> Quoting Chris Wallace <[EMAIL PROTECTED]>:
>
> > grep -w dog /usr/share/dict/words
>
> Well, for the record, it does not work with my settings.
> Maybe *Mr Turner* can give you a lesson as well. Sorry, I'm just in the
> mood for a joke ...
>
> Romain
>
> $ grep -w dog /usr/share/dict/words
> bird-dog
> bull-dog
> cat-and-dog
> dog
> dog-banner

Ah - the original example didn't include hyphenated words (and nor does my /usr/share/dict/words). To match whole lines, try grep -x. Does that do it?

C.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] OT UNIX grep question
On 10/08/06, Rolf Turner <[EMAIL PROTECTED]> wrote:
>
> [EMAIL PROTECTED] wrote:
>
> grep '^dog$' /usr/share/dict/words
>

or (simpler, in my view)

grep -w dog /usr/share/dict/words

Chris.
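Since this is R-help, here is a rough R analogue of the two grep invocations discussed in this thread. It is only a sketch: the word list is made up for illustration (the hyphenated entries are the ones quoted in the thread, and "dogma" is added as a non-match), and perl = TRUE is used so that \b is interpreted as a word boundary.

```r
# Small made-up word list standing in for /usr/share/dict/words
words <- c("bird-dog", "bull-dog", "cat-and-dog", "dog", "dog-banner", "dogma")

# Whole-word match, roughly what grep -w does: a hyphen counts as a word
# boundary, so the hyphenated compounds match as well.
whole_word <- grep("\\bdog\\b", words, value = TRUE, perl = TRUE)

# Whole-line match, roughly what grep -x or grep '^dog$' does.
whole_line <- grep("^dog$", words, value = TRUE)

whole_word
whole_line
```

The anchored pattern keeps only the exact entry, while the word-boundary pattern also keeps the hyphenated compounds, which is the difference reported in the thread.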
Re: [R] Spacing and margins in plot
"Earl F. Glynn" <[EMAIL PROTECTED]> writes: > AFAIK, the only way to get the axis label "closer" to the axis is to > suppress the actual axis labels and use the mtext command to display > alternative text where you want it. For example, look at the blue text in > Figure 2B (at the above link) that is between the axis label and the axis. > This blue text is at line=2, when the axis labels are at line=3. how about plot(..., xlab="") title(xlab="label text", line=2) ? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] converting stata's by syntax to R
Chris Wallace <[EMAIL PROTECTED]> writes:

> I am struggling with migrating some stata code to R

Thanks to all who replied. It was very helpful to see a combination of more direct Stata->R translations and more R-ish code. which.max() solves my problem this time, but learning about split(), unsplit() and duplicated() should make such problems fewer in the long run.

C.
[R] converting stata's by syntax to R
I am struggling with migrating some Stata code to R. I have a data frame containing, sometimes, repeat observations (rows) of the same family. I want to keep only one observation per family, selecting that observation according to some other variable. An example data frame is:

# construct example data
fam <- c(1,2,3,3,4,4,4)
wt <- c(1,1,0.6,0.4,0.4,0.4,0.2)
keep <- c(1,1,1,0,1,0,0)
dat <- as.data.frame(cbind(fam,wt,keep))
dat

I want to keep the observation for which wt is a maximum and, where this doesn't identify a unique observation, to keep just one anyway, not caring which. Those observations are indicated above by keep==1. (Note, keep <- c(1,1,1,0,0,1,0) would be fine too, but not c(1,1,1,0,0,0,1)).

The Stata code I would use is

bys fam (wt): keep if _n==_N

This is my (long-winded) attempt in R:

# first keep those rows where wt=max_fam(wt)
maxwt <- by(dat,dat$fam,function(x) max(x[,2]))
maxwt <- sapply(maxwt,"[[",1)
maxwt.dat <- data.frame("maxwt"=maxwt,"fam"=as.integer(names(maxwt)))
dat <- merge(dat,maxwt.dat)
dat <- dat[dat$wt==dat$maxwt,]
dat

Now I am stuck - I want to keep either row with fam==4, and have tried playing around with combinations of sample and apply or by, but with no success. I can only find an inefficient for-loop solution:

# identify those rows with >1 observation
more <- by(dat,dat$fam,function(x) dim(x)[1])
more <- sapply(more,"[[",1)
more.dat <- data.frame("more"=more,"fam"=as.integer(names(more)))
dat <- merge(dat,more.dat)

# sample from those for whom more>1
result<-dat[dat$more==1,]
for(f in unique(dat$fam[dat$more>1])) {
  rows <- rownames(dat[dat$fam==f,])
  result <- rbind(result,dat[sample(rows,1),])
}
result

I am sure that for something so simple in Stata to be so complicated in R must indicate ignorance of R on my part, but searches of the help files and RSiteSearch haven't led to any better solution. Any suggestions would be most helpful!

Thanks,
C.
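For reference, a sketch of how the which.max(), split() and duplicated() functions mentioned in the follow-up could be combined on the example data. This is one possible reading of those hints, not necessarily the code posted by the respondents. Note that both variants break ties by taking the first tied row rather than a random one, which the question allows.

```r
# Example data from the post.
dat <- data.frame(fam  = c(1, 2, 3, 3, 4, 4, 4),
                  wt   = c(1, 1, 0.6, 0.4, 0.4, 0.4, 0.2),
                  keep = c(1, 1, 1, 0, 1, 0, 0))

# Variant 1: split() gives one data frame per family; which.max() returns
# the index of the first maximum, so exactly one row per family survives.
pieces <- lapply(split(dat, dat$fam), function(d) d[which.max(d$wt), ])
result <- do.call(rbind, pieces)

# Variant 2: sort by family and decreasing weight, then drop every row
# whose family has already been seen.
sorted  <- dat[order(dat$fam, -dat$wt), ]
result2 <- sorted[!duplicated(sorted$fam), ]

result
result2
```

Variant 2 is the closer analogue of the Stata idiom (sort within group, keep one end of each group), while variant 1 avoids sorting altogether.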
[R] simulate dependent probabilities
I need to simulate from a random process and am not sure how to go about it. The process is the probability of an event occurring between a pair of points on a line. (This probability is between 0 and 0.5.) I have estimates of these probabilities for a series of points, their standard errors and the correlation matrix (which is AR(1)).

E.g. (for 4 points)

estimated prob (q): 0.1163 0.1280 0.0698
standard error:     0.0320 0.0288 0.0259
asymptotic correlation matrix:
 1.
-0.0880  1.
 0.     -0.0739  1.

The vector q is used in a further analysis, treated as known. I would like to simulate alternative vectors q, which could be used in the further analysis in order to generate some empirical confidence interval. But I don't know where to start with such a simulation. (In practice, q has about 50 elements.)

Although I know how to use Cholesky decomposition to simulate dependent variables from a MVN distribution, I am stuck on two counts here:
- the distribution for q
- how to incorporate the dependence into the simulation.

I would appreciate any suggestions.

Chris.
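One possible starting point, offered only as a sketch under strong assumptions: treat the estimates as approximately multivariate normal with mean q and covariance D R D (D the diagonal matrix of standard errors, R the correlation matrix), draw via the Cholesky factor, and discard draws with any element outside (0, 0.5). The normal approximation and the rejection step are assumptions made here, not something stated in the post; the truncated entries "1." and "0." in the correlation matrix are read as 1 and 0; and simulate_q is a hypothetical helper name.

```r
set.seed(1)  # only so that this illustration is reproducible

q    <- c(0.1163, 0.1280, 0.0698)   # estimates from the post
se   <- c(0.0320, 0.0288, 0.0259)   # their standard errors
corr <- matrix(c( 1,      -0.0880,  0,
                 -0.0880,  1,      -0.0739,
                  0,      -0.0739,  1), nrow = 3, byrow = TRUE)

Sigma <- diag(se) %*% corr %*% diag(se)  # covariance matrix D R D
U     <- chol(Sigma)                     # upper triangular, t(U) %*% U == Sigma

# Hypothetical helper: draw n vectors from N(q, Sigma), rejecting any draw
# with an element outside the open interval (0, 0.5).
simulate_q <- function(n) {
  out <- matrix(numeric(0), nrow = 0, ncol = length(q))
  while (nrow(out) < n) {
    z    <- matrix(rnorm(n * length(q)), nrow = n)  # independent N(0, 1)
    cand <- sweep(z %*% U, 2, q, "+")               # rows ~ N(q, Sigma)
    ok   <- apply(cand > 0 & cand < 0.5, 1, all)
    out  <- rbind(out, cand[ok, , drop = FALSE])
  }
  out[seq_len(n), , drop = FALSE]
}

sims <- simulate_q(1000)
round(cor(sims), 3)   # should be close to corr
```

Rejection slightly distorts the distribution near the boundaries, and with about 50 elements the rejection rate will be higher. An alternative under the same assumptions would be to simulate on a transformed scale, such as the logit of 2q, which would need the standard errors converted to that scale.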