Re: [R] OT UNIX grep question
On 10/08/06, Rolf Turner [EMAIL PROTECTED] wrote:

[EMAIL PROTECTED] wrote:

    grep '^dog$' /usr/share/dict/words

or (simpler, in my view)

    grep -w dog /usr/share/dict/words

Chris.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] OT UNIX grep question
On 10/08/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

Quoting Chris Wallace [EMAIL PROTECTED]:

    grep -w dog /usr/share/dict/words

Well, for the record, it does not work with my settings. Maybe *Mr Turner* can give you a lesson as well. Sorry, I'm just in the mood for a joke ...

Romain

    $ grep -w dog /usr/share/dict/words
    bird-dog
    bull-dog
    cat-and-dog
    dog
    dog-banner

Ah - the original example didn't include hyphenated words (and nor does my /usr/share/dict/words). To match whole lines try grep -x. Does that do it?

C.
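Since this is R-help, the difference between word matching (grep -w) and whole-line matching (grep -x, or the anchored pattern '^dog$') can also be reproduced inside R. The following is only a sketch: the vector `words` is a made-up sample built from the output quoted above, with "dogma" added for illustration.

```r
# Hypothetical sample standing in for /usr/share/dict/words
words <- c("bird-dog", "bull-dog", "cat-and-dog", "dog", "dog-banner", "dogma")

# Word-boundary match, analogous to grep -w:
# a hyphen is not a word character, so the hyphenated entries match too
word_hits <- words[grepl("\\bdog\\b", words, perl = TRUE)]

# Whole-line match, analogous to grep -x or grep '^dog$'
line_hits <- words[grepl("^dog$", words)]

print(word_hits)
print(line_hits)
```

Only the anchored pattern restricts the result to the single entry "dog"; the word-boundary pattern excludes "dogma" but keeps every hyphenated entry.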
Re: [R] Spacing and margins in plot
Earl F. Glynn [EMAIL PROTECTED] writes:

AFAIK, the only way to get the axis label closer to the axis is to suppress the actual axis labels and use the mtext command to display alternative text where you want it. For example, look at the blue text in Figure 2B (at the above link) that is between the axis label and the axis. This blue text is at line=2, when the axis labels are at line=3.

How about

    plot(..., xlab="")
    title(xlab="label text", line=2)

?
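A self-contained version of that suggestion might look like the sketch below. The data `x` and `y` are invented purely so that the calls can be run; only plot() and title() from base graphics are used.

```r
# Made-up data, just to have something to draw
x <- 1:10
y <- x^2

# Suppress the default x-axis label ...
plot(x, y, xlab = "")

# ... and draw it one margin line closer to the axis
# (the default position is line 3, see par("mgp"))
title(xlab = "label text", line = 2)
```

The default label line comes from the first element of par("mgp"), so changing that parameter is another way to move both axis labels at once.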
Re: [R] converting stata's by syntax to R
Chris Wallace [EMAIL PROTECTED] writes:

I am struggling with migrating some stata code to R

Thanks to all who replied. It was very helpful to see a combination of more direct stata->R translations and more R-ish code. which.max() solves my problem this time, but learning about split(), unsplit() and duplicated() should make such problems fewer in the long run.

C.
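The replies themselves are not reproduced in this archive, so the following is only a guess at how the functions named above could be applied to the example data from the original post. The names `res1`, `res2`, `pick` and `ord` are illustrative.

```r
# Example data from the original post
dat <- data.frame(fam = c(1, 2, 3, 3, 4, 4, 4),
                  wt  = c(1, 1, 0.6, 0.4, 0.4, 0.4, 0.2))

# Approach 1: split by family and keep the row with the largest wt.
# which.max() returns the first maximum, so ties are broken arbitrarily
# but deterministically.
pick <- lapply(split(dat, dat$fam), function(d) d[which.max(d$wt), ])
res1 <- do.call(rbind, pick)

# Approach 2: sort by family and decreasing wt, then drop every row
# whose family has already been seen.
ord  <- dat[order(dat$fam, -dat$wt), ]
res2 <- ord[!duplicated(ord$fam), ]

print(res1)
print(res2)
```

Both approaches return one row per family with the maximal wt, which is what the Stata line `bys fam (wt): keep if _n==_N` produces.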
[R] converting stata's by syntax to R
I am struggling with migrating some stata code to R. I have a data frame containing, sometimes, repeat observations (rows) of the same family. I want to keep only one observation per family, selecting that observation according to some other variable. An example data frame is:

    # construct example data
    fam <- c(1,2,3,3,4,4,4)
    wt <- c(1,1,0.6,0.4,0.4,0.4,0.2)
    keep <- c(1,1,1,0,1,0,0)
    dat <- as.data.frame(cbind(fam,wt,keep))
    dat

I want to keep the observation for which wt is a maximum, and where this doesn't identify a unique observation, to keep just one anyway, not caring which. Those observations are indicated above by keep==1. (Note, keep <- c(1,1,1,0,0,1,0) would be fine too, but not c(1,1,1,0,0,0,1)).

The stata code I would use is

    bys fam (wt): keep if _n==_N

This is my (long-winded) attempt in R:

    # first keep those rows where wt=max_fam(wt)
    maxwt <- by(dat,dat$fam,function(x) max(x[,2]))
    maxwt <- sapply(maxwt,"[[",1)
    maxwt.dat <- data.frame(maxwt=maxwt,fam=as.integer(names(maxwt)))
    dat <- merge(dat,maxwt.dat)
    dat <- dat[dat$wt==dat$maxwt,]
    dat

Now I am stuck - I want to keep either row with fam==4, and have tried playing around with combinations of sample and apply or by, but with no success. I can only find an inefficient for-loop solution:

    # identify those rows with >1 observation
    more <- by(dat,dat$fam,function(x) dim(x)[1])
    more <- sapply(more,"[[",1)
    more.dat <- data.frame(more=more,fam=as.integer(names(more)))
    dat <- merge(dat,more.dat)
    # sample from those for whom more>1
    result <- dat[dat$more==1,]
    for(f in unique(dat$fam[dat$more>1])) {
      rows <- rownames(dat[dat$fam==f,])
      result <- rbind(result,dat[sample(rows,1),])
    }
    result

I am sure that for something so simple in stata to be so complicated in R must indicate ignorance of R on my part, but searches of help files and RSiteSearch haven't led to any better solution. Any suggestions would be most helpful!

Thanks, C.
[R] simulate dependent probabilities
I need to simulate from a random process and am not sure how to go about it. The process is the probability of an event occurring between a pair of points on a line. (This probability is between 0 and 0.5). I have estimates of these probabilities for a series of points, their standard errors and the correlation matrix (which is AR(1)). Eg (for 4 points)

    estimated prob (q):  0.1163  0.1280  0.0698
    standard error:      0.0320  0.0288  0.0259

    asymptotic correlation matrix:
         1.
        -0.0880   1.
         0.      -0.0739   1.

The vector q is used in a further analysis, treated as known. I would like to simulate alternative vectors q, which could be used in the further analysis in order to generate some empirical confidence interval. But I don't know where to start with such simulation. (In practice, q has about 50 elements).

Although I know how to use cholesky decomposition to simulate dependent variables from a MVN distribution, I am stuck on two counts here:

- the distribution for q
- how to incorporate the dependence into the simulation.

I would appreciate any suggestions.

Chris.
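One possible starting point, offered only as a sketch and not as an answer from the list, is to move to a scale where normality is less implausible. The assumptions here are: eta = logit(2q) is treated as multivariate normal, standard errors are carried over by the delta method, the correlation matrix is reused unchanged on the new scale, and the "0." entry of the matrix is taken to be exactly zero. All object names are illustrative.

```r
set.seed(1)

q  <- c(0.1163, 0.1280, 0.0698)   # estimates from the post
se <- c(0.0320, 0.0288, 0.0259)   # standard errors from the post

# Correlation matrix from the post; the "0." entry is ASSUMED to be 0
R <- matrix(c( 1,      -0.0880,  0,
              -0.0880,  1,      -0.0739,
               0,      -0.0739,  1), nrow = 3, byrow = TRUE)

# Map (0, 0.5) onto the real line: eta = logit(2q)
eta <- qlogis(2 * q)

# Delta method: d eta / d q = 1 / (q * (1 - 2q))
se_eta <- se / (q * (1 - 2 * q))

# Covariance on the transformed scale (ASSUMES the same correlations)
Sigma <- diag(se_eta) %*% R %*% diag(se_eta)

# Cholesky factor U with t(U) %*% U == Sigma
U <- chol(Sigma)

n_sim   <- 1000
z       <- matrix(rnorm(n_sim * length(q)), nrow = n_sim)
eta_sim <- sweep(z %*% U, 2, eta, "+")   # rows ~ N(eta, Sigma)

# Back-transform; every simulated q lies strictly between 0 and 0.5
q_sim <- plogis(eta_sim) / 2

print(colMeans(q_sim))
```

Each row of `q_sim` is one simulated vector q that could be passed to the further analysis. Whether normality on the logit(2q) scale is adequate would need to be checked against how the estimates were obtained.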