Re: [R] Likelihood ratio test between glm and glmer fits
2008/7/16 Dimitris Rizopoulos [EMAIL PROTECTED]: well, for computing the p-value you need to use pchisq() and dchisq() (check ?dchisq for more info). For model fits with a logLik method you can directly use the following simple function: lrt - function (obj1, obj2) { L0 - logLik(obj1) L1 - logLik(obj2) L01 - as.vector(- 2 * (L0 - L1)) df - attr(L1, df) - attr(L0, df) list(L01 = L01, df = df, p-value = pchisq(L01, df, lower.tail = FALSE)) } library(lme4) gm0 - glm(cbind(incidence, size - incidence) ~ period, family = binomial, data = cbpp) gm1 - glmer(cbind(incidence, size - incidence) ~ period + (1 | herd), family = binomial, data = cbpp) lrt(gm0, gm1) Yes, that seems quite natural, but then try to compare with the deviance: logLik(gm0) logLik(gm1) (d0 - deviance(gm0)) (d1 - deviance(gm1)) (LR - d0 - d1) pchisq(LR, 1, lower = FALSE) Obviously the deviance in glm is *not* twice the negative log-likelihood as it is in glmer. The question remains which of these two quantities is appropriate for comparison. I am not sure exactly how the deviance and/or log-likelihood are calculated in glmer, but my feeling is that one should trust the deviance rather than the log-likelihoods for these purposes. This is supported by the following comparison: Ad an arbitrary random effect with a close-to-zero variance and note the deviance: tmp - rep(1:4, each = nrow(cbpp)/4) gm2 - glmer(cbind(incidence, size - incidence) ~ period + (1 | tmp), family = binomial, data = cbpp) (d2 - deviance(gm2)) This deviance is very close to that obtained from the glm model. I have included the mixed-models mailing list in the hope that someone could explain how the deviance is computed in glmer and why deviances, but not likelihoods are comparable to glm-fits. Best Rune However, there are some issues regarding this likelihood ratio test. 1) The null hypothesis is on the boundary of the parameter space, i.e., you test whether the variance for the random effect is zero. For this case the assumed chi-squared distribution for the LRT may *not* be totally appropriate and may produce conservative p-values. There is some theory regarding this issue, which has shown that the reference distribution for the LRT in this case is a mixture of a chi-squared(df = 0) and chi-squared(df = 1). Another option is to use simulation-based approach where you can approximate the reference distribution of the LRT under the null using simulation. You may check below for an illustration of this procedure (not-tested): X - model.matrix(gm0) coefs - coef(gm0) pr - plogis(c(X %*% coefs)) n - length(pr) new.dat - cbpp Tobs - lrt(gm0, gm1)$L01 B - 200 out.T - numeric(B) for (b in 1:B) { y - rbinom(n, cbpp$size, pr) new.dat$incidence - y fit0 - glm(formula(gm0), family = binomial, data = new.dat) fit1 - glmer(formula(gm1), family = binomial, data = new.dat) out.T[b] - lrt(fit0, fit1)$L01 } # estimate p-value (sum(out.T = Tobs) + 1) / (B + 1) 2) For the glmer fit you have to note that you work with an approximation to the log-likelihood (obtained using numerical integration) and not the actual log-likelihood. I hope it helps. Best, Dimitris Dimitris Rizopoulos Biostatistical Centre School of Public Health Catholic University of Leuven Address: Kapucijnenvoer 35, Leuven, Belgium Tel: +32/(0)16/336899 Fax: +32/(0)16/337015 Web: http://med.kuleuven.be/biostat/ http://www.student.kuleuven.be/~m0390867/dimitris.htm Quoting COREY SPARKS [EMAIL PROTECTED]: Dear list, I am fitting a logistic multi-level regression model and need to test the difference between the ordinary logistic regression from a glm() fit and the mixed effects fit from glmer(), basically I want to do a likelihood ratio test between the two fits. The data are like this: My outcome is a (1,0) for health status, I have several (1,0) dummy variables RURAL, SMOKE, DRINK, EMPLOYED, highereduc, INDIG, male, divorced, SINGLE, chronic, vigor_d and moderat_d and AGE is continuous (20 to 100). My higher level is called munid and has 581 levels. The data have 45243 observations. Here are my program statements: #GLM fit ph.fit.2-glm(poorhealth~RURAL+SMOKE+DRINK+EMPLOYED+highereduc+INDIG+AGE+male+divorced+SINGLE+chronic+vigor_d+moderat_d,family=binomial(), data=mx.merge) #GLMER fit ph.fit.3-glmer(poorhealth~RURAL+SMOKE+DRINK+EMPLOYED+INSURANCE+highereduc+INDIG+AGE+male+divorced+SINGLE+chronic+vigor_d+moderat_d+(1|munid),family=binomial(), data=mx.merge) I cannot find a method in R that will do the LR test between a glm and a glmer fit, so I try to do it using the liklihoods from both models #form the likelihood ratio test between the glm and glmer fits x2--2*(logLik(ph.fit.2)-logLik(ph.fit.3)) ML 79.60454 attr(,nobs) n 45243 attr(,nall) n 45243 attr(,df) [1] 14 attr(,REML) [1] FALSE attr(,class)
[R] Histogram with two colors depending on condition
Dear List, Say, we generate data like this- dat-rnorm(1000,1,2) hist(dat) How do i make the histogram, say, red (col = 2) before X = dat = 0, and rest say, green (col = 3) beyond X = dat = 0 in R? The resulting histogram could be like this http://ehsan.karim.googlepages.com/histogram.JPG (edited) Thanks in advance. Ehsan http://ehsan.karim.googlepages.com/diaryofastatistician __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Histogram with two colors depending on condition
Here is something that is close: x - rnorm(1) y - hist(x, plot=FALSE) plot(y, col=ifelse(y$mid0,'red','green')) On Thu, Jul 17, 2008 at 5:13 AM, Mohammad Ehsanul Karim [EMAIL PROTECTED] wrote: Dear List, Say, we generate data like this- dat-rnorm(1000,1,2) hist(dat) How do i make the histogram, say, red (col = 2) before X = dat = 0, and rest say, green (col = 3) beyond X = dat = 0 in R? The resulting histogram could be like this http://ehsan.karim.googlepages.com/histogram.JPG (edited) Thanks in advance. Ehsan http://ehsan.karim.googlepages.com/diaryofastatistician __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help with bivariate density plot question
Hi Daniela, Spencer (? Graves) is not at home. Seriously, this is a list that many people read and use. If you wish to elicit a response, then you would be wise to give a better statement of what your difficulty is. The function you enquire about is well documented with an example, see ## library(GenKern) ## load the package ?KernSur ## get help on the function You don't need to do anything special to get adaptive bandwidths, it's all done for you (by the authors of the package). Just replace the x and y values in the example with your values, and perhaps deal with any NAs in your data set. Should one moralize?: Well, it is generally true that you want help from others... HTH, Mark. Daniela Carollo wrote: Hi Spencer, I have seen your name on the web site, and perhaps you can help me with my R problem. I'm trying to use KernSur to put in evidence a substructure in a bidimensional plot. My problem is that, in order to get the density in the low density areas (in which the substructure is located) I should use different bandwidths. How I can do that? Also, I think that the best choice for my case is to use the function akerdmul which perform the multivariate adaptive kernel density distribution. Are you familiar with this function? Any help would be really appreciated. Thank you very much! Regards, Daniela __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/help-with-bivariate-density-plot-question-tp18495958p18504638.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to compute loglikelihood of Lognormal distribution
Hi, I am trying to learn lognormal mixture models with EM. I was wondering how does one compute the log likelihood. The current implementation I have is as follows, which perform really bad in learning the mixture models. __BEGIN__ # compute probably density of lognormal. dens - function(lambda, theta, k){ temp-NULL meanl=theta[1:k] sdl=theta[(k+1):(2*k)] for(j in 1:k){ # each being lognormal distribution temp=cbind(temp,dlnorm(x,meanlog=meanl[j],sdlog=sdl[j])) } temp=t(lambda*t(temp)) temp } old.obs.ll - sum(log(apply(dens(lambda, theta, k),1,sum))) # this is prior likelihood lognorm.ll - function(theta, z,lambda, k) - sum(z*log(dens(lambda,theta,k))) __END__ It is based on a slight modification of our earlier Gamma version, which works really well. The full code of Gamma version can be found here: http://dpaste.com/65353/plain/ -G.V. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Histogram with two colors depending on condition
On Thu, Jul 17, 2008 at 5:13 PM, Mohammad Ehsanul Karim [EMAIL PROTECTED] wrote: Dear List, Say, we generate data like this- dat-rnorm(1000,1,2) hist(dat) library(ggplot) qplot(dat, geom=histogram, colour = factor(dat 0)) Hadley -- http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] calculate differences - strange outcome
Dear List, I ran into some trouble by calculating differences. For me it is important that differences are either 0 or not. So I don't understand the outcome of this calculation 865.56-(782.86+0+63.85+18.85+0) [1] -1.136868e-13 I run R version 2.71 on WinXP I could solve my problem by using round() but I would like to know the reason. Maybe someone can help me? Thanx __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] model.tables standard errors
Not sure how to interpret standard errors produced by model.tables(object, type=effects, se=T) For example, simplest possible case is single stratum anova with two equally replicated treatments (n observations each). The effects displayed are plus and minus half the difference between the two treatment means, and the SE is given as sqrt(RMS/n), where RMS = residual mean square. But the usual calculation would give the SE as this divided by sqrt(2). Can't see this question raised previously in the archives, so perhaps I am missing something basic here. I.White Ashworth Laboratories, West Mains Road Edinburgh EH9 3JT E-mail: [EMAIL PROTECTED] -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] calculate differences - strange outcome
On Thu, 17 Jul 2008 11:47:42 +0200 Kunzler, Andreas [EMAIL PROTECTED] wrote: I ran into some trouble by calculating differences. For me it is important that differences are either 0 or not. If that is the case, restrict yourself to calculating with integers. :) So I don't understand the outcome of this calculation 865.56-(782.86+0+63.85+18.85+0) [1] -1.136868e-13 I run R version 2.71 on WinXP I could solve my problem by using round() but I would like to know the reason. Maybe someone can help me? FAQ 7.31 and the reference cited therein should shed light on your problem. Best wishes, Berwin === Full address = Berwin A TurlachTel.: +65 6516 4416 (secr) Dept of Statistics and Applied Probability+65 6516 6650 (self) Faculty of Science FAX : +65 6872 3919 National University of Singapore 6 Science Drive 2, Blk S16, Level 7 e-mail: [EMAIL PROTECTED] Singapore 117546http://www.stat.nus.edu.sg/~statba __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Odp: calculate differences - strange outcome
Hi [EMAIL PROTECTED] napsal dne 17.07.2008 11:47:42: Dear List, I ran into some trouble by calculating differences. For me it is important that differences are either 0 or not. So you shall not use computers. They do not work with infinite precision. See Faq 7.31 Regards Petr So I don't understand the outcome of this calculation 865.56-(782.86+0+63.85+18.85+0) [1] -1.136868e-13 I run R version 2.71 on WinXP I could solve my problem by using round() but I would like to know the reason. Maybe someone can help me? Thanx __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] calculate differences - strange outcome
Hi Andreas, It's because you are dealing with binary or floating point calculations, not just a few apples and oranges, or an abacus (which, by the way, is an excellent calculating device, and still widely used in some [sophisticated] parts of the world). http://en.wikipedia.org/wiki/Floating_point HTH, Marks. Kunzler, Andreas wrote: Dear List, I ran into some trouble by calculating differences. For me it is important that differences are either 0 or not. So I don't understand the outcome of this calculation 865.56-(782.86+0+63.85+18.85+0) [1] -1.136868e-13 I run R version 2.71 on WinXP I could solve my problem by using round() but I would like to know the reason. Maybe someone can help me? Thanx __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/calculate-differences---strange-outcome-tp18505010p18505498.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] NAMESPACE vs internal.Rd
G'day Christophe, On Wed, 16 Jul 2008 18:22:49 +0200 [EMAIL PROTECTED] wrote: Thanks for your answer. My pleasure. I guess writing a regular expression that says export everything that does not start with a dot but do not export foo and bar would be not trivial to write (at least not for me). The NAMESPACE created by package.skeleton contain a single line : exportPattern(^[[:alpha:]]+) I guess that it is what you just say... Not really. :) The regular expression ^[[:alpha:]]+ matches, as far as I know, all objects that have one or more alphabetic character at the beginning. The Writing R Extensions manual suggests the directive exportPattern(^[^\\.]) exports all variables that do not start with a period. As far as I can tell, both these instructions have more or less the same effect (assuming that there are no objects with non-standard names in your package; and I am not enough of an R language lawyer to know what would happen in such a case). I was commenting on your classification as: - some fonction will be accessible (regular function) - some function will be hidden (function starting with .) - some function will be forbiden (function not in namespace) Say, you have accessible functions called fubar, rabuf, main and something.incredible.useful, some hidden functions whose name start with a . and forbidden functions (i.e. not exported) with names foo and bar. (Though, as I commented earlier, it is practically impossible to implement forbidden functions, users can always access them if they want using :::.) Both of the directives above would export fubar, rabuf, main, something.incredible.useful, foo and bar. So my challenge was to come up with a regular expression for exportPattern that says export everything that does not start with a dot but do not export foo and bar, i.e. a regular expression that would only export the accessible functions. HTC. Cheers, Berwin __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] calculate differences - strange outcome
Kunzler, Andreas wrote: Dear List, I ran into some trouble by calculating differences. For me it is important that differences are either 0 or not. So I don't understand the outcome of this calculation 865.56-(782.86+0+63.85+18.85+0) [1] -1.136868e-13 I run R version 2.71 on WinXP I could solve my problem by using round() but I would like to know the reason. Maybe someone can help me? This is FAQ 7.31. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Likelihood ratio test between glm and glmer fits
On Thu, Jul 17, 2008 at 2:50 AM, Rune Haubo [EMAIL PROTECTED] wrote: 2008/7/16 Dimitris Rizopoulos [EMAIL PROTECTED]: well, for computing the p-value you need to use pchisq() and dchisq() (check ?dchisq for more info). For model fits with a logLik method you can directly use the following simple function: lrt - function (obj1, obj2) { L0 - logLik(obj1) L1 - logLik(obj2) L01 - as.vector(- 2 * (L0 - L1)) df - attr(L1, df) - attr(L0, df) list(L01 = L01, df = df, p-value = pchisq(L01, df, lower.tail = FALSE)) } library(lme4) gm0 - glm(cbind(incidence, size - incidence) ~ period, family = binomial, data = cbpp) gm1 - glmer(cbind(incidence, size - incidence) ~ period + (1 | herd), family = binomial, data = cbpp) lrt(gm0, gm1) Yes, that seems quite natural, but then try to compare with the deviance: logLik(gm0) logLik(gm1) (d0 - deviance(gm0)) (d1 - deviance(gm1)) (LR - d0 - d1) pchisq(LR, 1, lower = FALSE) Obviously the deviance in glm is *not* twice the negative log-likelihood as it is in glmer. The question remains which of these two quantities is appropriate for comparison. I am not sure exactly how the deviance and/or log-likelihood are calculated in glmer, but my feeling is that one should trust the deviance rather than the log-likelihoods for these purposes. This is supported by the following comparison: Ad an arbitrary random effect with a close-to-zero variance and note the deviance: tmp - rep(1:4, each = nrow(cbpp)/4) gm2 - glmer(cbind(incidence, size - incidence) ~ period + (1 | tmp), family = binomial, data = cbpp) (d2 - deviance(gm2)) This deviance is very close to that obtained from the glm model. I have included the mixed-models mailing list in the hope that someone could explain how the deviance is computed in glmer and why deviances, but not likelihoods are comparable to glm-fits. In that example I think the problem may be that I have not yet written the code to adjust the deviance of the glmer fit for the null deviance. However, there are some issues regarding this likelihood ratio test. 1) The null hypothesis is on the boundary of the parameter space, i.e., you test whether the variance for the random effect is zero. For this case the assumed chi-squared distribution for the LRT may *not* be totally appropriate and may produce conservative p-values. There is some theory regarding this issue, which has shown that the reference distribution for the LRT in this case is a mixture of a chi-squared(df = 0) and chi-squared(df = 1). Another option is to use simulation-based approach where you can approximate the reference distribution of the LRT under the null using simulation. You may check below for an illustration of this procedure (not-tested): X - model.matrix(gm0) coefs - coef(gm0) pr - plogis(c(X %*% coefs)) n - length(pr) new.dat - cbpp Tobs - lrt(gm0, gm1)$L01 B - 200 out.T - numeric(B) for (b in 1:B) { y - rbinom(n, cbpp$size, pr) new.dat$incidence - y fit0 - glm(formula(gm0), family = binomial, data = new.dat) fit1 - glmer(formula(gm1), family = binomial, data = new.dat) out.T[b] - lrt(fit0, fit1)$L01 } # estimate p-value (sum(out.T = Tobs) + 1) / (B + 1) 2) For the glmer fit you have to note that you work with an approximation to the log-likelihood (obtained using numerical integration) and not the actual log-likelihood. I hope it helps. Best, Dimitris Dimitris Rizopoulos Biostatistical Centre School of Public Health Catholic University of Leuven Address: Kapucijnenvoer 35, Leuven, Belgium Tel: +32/(0)16/336899 Fax: +32/(0)16/337015 Web: http://med.kuleuven.be/biostat/ http://www.student.kuleuven.be/~m0390867/dimitris.htm Quoting COREY SPARKS [EMAIL PROTECTED]: Dear list, I am fitting a logistic multi-level regression model and need to test the difference between the ordinary logistic regression from a glm() fit and the mixed effects fit from glmer(), basically I want to do a likelihood ratio test between the two fits. The data are like this: My outcome is a (1,0) for health status, I have several (1,0) dummy variables RURAL, SMOKE, DRINK, EMPLOYED, highereduc, INDIG, male, divorced, SINGLE, chronic, vigor_d and moderat_d and AGE is continuous (20 to 100). My higher level is called munid and has 581 levels. The data have 45243 observations. Here are my program statements: #GLM fit ph.fit.2-glm(poorhealth~RURAL+SMOKE+DRINK+EMPLOYED+highereduc+INDIG+AGE+male+divorced+SINGLE+chronic+vigor_d+moderat_d,family=binomial(), data=mx.merge) #GLMER fit ph.fit.3-glmer(poorhealth~RURAL+SMOKE+DRINK+EMPLOYED+INSURANCE+highereduc+INDIG+AGE+male+divorced+SINGLE+chronic+vigor_d+moderat_d+(1|munid),family=binomial(), data=mx.merge) I cannot find a method in R that will do the LR test between a glm and a glmer fit, so I try to do it using the liklihoods from both models #form the
Re: [R] Labelling curves on graphs
Barry Rowlingson wrote: Why use color when you can use black and label the curves where they are most separated? This solves problems with color blindness, xeroxing, and faxing. Where should I put the labels in this example: set.seed(123) z=matrix(runif(6*50),50,6) matplot(z,type='l',col=1,lty=1) A label is only going to indicate which curve is which at a point. If the curves approach or cross anywhere you're going to get confused. Line styles and colours persist along the length of the line. For nice smooth well-behaved curves then perhaps labelling is okay - contour lines rarely cross, and even when they do they must have the same height value anyway. I'm not sure once you get to six lines on a graph you can find six distinctive line styles without using colour, so it might be better to switch graphic completely. Compare the above plot with image(z). If the plots don't cross, one solution is to extend the plot to the right and just add labels there: set.seed(123) z=matrix(runif(6*50),50,6) m=apply(z,1,cumsum) ms=apply(m,1,sort) matplot(ms,type='l',xlim=c(0,55),lty=1,col=1) text(50,ms[50,],c(Foo,Bar,Baz,Qux,Quux,Quuux),adj=-0.1) I can't see a neater way of putting labels actually on the lines in this case - they'll be all over the place to squeeze them in the gaps. Barry Barry - you're right; when two lines cross more than 2 or 3 times the suggested approach doesn't work well. When it does work, not having the user have to look at a legend is a benefit. Depending on the number of lines, using thick gray scale and thin black lines can be very effective. Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] fastICA
Hi everyone It looks like repeated runs of fastICA produce quite significantly different mixing matrices (not only in terms of sign and row order). I'm not a specialist, so would appreciate any advice on whether this should really be the case: res3 = fastICA(af[,2:20],4,alg.typ=parallel,fun=logcosh,alpha=1,method=C,row.norm=TRUE) colstandard res4 = fastICA(af[,2:20],4,alg.typ=parallel,fun=logcosh,alpha=1,method=C,row.norm=TRUE) colstandard res3$A [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9][,10] [,11] [,12] [1,] 0.3492656 0.47129703 0.11709867 -0.3866225 -0.3180731 -0.4991164 0.6215495 0.8923775 0.5595437 -0.074976310 -0.02226736 0.2540303 [2,] 0.3374660 0.39962068 0.24130395 -0.4555433 -0.2731303 -0.4717878 -0.1361462 -0.1812964 -0.4349231 0.001598010 0.05622615 0.3771093 [3,] 0.1577214 0.06431248 0.04530352 0.2426606 0.2546012 0.3607750 0.3066278 0.2459179 -0.1632415 0.079017073 -0.01756727 -0.6831387 [4,] 0.3024079 0.42547482 0.42272925 0.4219971 0.1849393 0.3102353 -0.2709699 -0.2480479 -0.2029552 0.186288059 -0.03772351 0.2284498 [,13] [,14] [,15] [,16][,17] [,18] [,19] [1,] -0.3560465 -0.8023760 -0.6901401 -0.09377930 0.039225519 0.01019511 -0.07118729 [2,] -0.1333961 0.3828090 0.4846588 -0.03412919 -0.009067316 -0.13459846 -0.01677050 [3,] -1.0164242 -0.2774528 0.2472543 0.05641915 0.112881877 -0.10800022 0.09233623 [4,] -0.1606417 -0.6265621 -0.6120113 -0.08991269 -0.052944720 -0.11545406 -0.06530203 res4$A [,1][,2][,3][,4][,5][,6] [,7][,8] [,9] [,10] [,11] [,12] [1,] 0.37425998 0.47693372 0.43825728 0.06634739 0.01113021 0.00141282 -0.4044253 -0.47496861 -0.5553201 0.14804389 0.01623842 0.3564346 [2,] -0.14173880 -0.04252148 -0.03162973 -0.24141814 -0.25736466 -0.36473238 -0.3000759 -0.23306483 0.1691999 -0.07594278 0.01626331 0.6948413 [3,] -0.05005946 -0.01742712 0.11216921 0.64260942 0.34415960 0.58450365 -0.1146743 -0.08045181 0.1442732 0.13216174 -0.06578838 -0.1372067 [4,] -0.43535000 -0.58086526 -0.21688752 0.34399283 0.29998124 0.47268328 -0.5527654 -0.81536674 -0.4607192 0.03768184 0.02307793 -0.3113437 [,13] [,14] [,15] [,16] [,17] [,18] [,19] [1,] -0.14663100 -0.01915221 0.04564954 -0.06653969 -0.04960206 -0.17660055 -0.04146818 [2,] 1.00383413 0.24516021 -0.27622870 -0.06074335 -0.11331305 0.10496803 -0.09549671 [3,] -0.01167087 -0.67320085 -0.73464239 -0.03193329 -0.03025987 0.01552101 -0.02808728 [4,] 0.41733259 0.86385119 0.72718191 0.10995336 -0.03082935 0.02770219 0.08069126 Many thanks Mikhail -- Mikhail Spivakov PhD Postdoctoral Fellow European Bioinformatics Institute European Molecular Biology Laboratory Hinxton Cambridgeshire CB10 1SD UK tel +44 1223 492 260 fax +44 1223 494 468 spivakov ebi ac uk -- View this message in context: http://www.nabble.com/fastICA-tp18506527p18506527.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] barchart with bars attached to y=0-line
Dear Deepayan and other users, thanks a lot, that was exactly what i intended to get. However, there are some aspects of the plot that can be improved. It would be nice to have the same wisth of the bars in all panels. I saw the bow.width-argument, but as I understood it is no help to get a fixed absolute width of bars in all panels. Is there a solution? Secondly I would like to get horizontal grid lines at the positions of the y-axis tick marks(from -2 to 6, increment |2|). I tried the following: PLOT-barchart(data=df,Ratio~reorder(Compound,as.numeric(Class))|Class,origin=0,scales=list(x=list(relation=free), y=list(tck=1))) But this does not work. When using: P-barchart(data=df, Ratio~reorder(Compound,as.numeric(Class))|Class, origin=0, scales=list(x=free,y=list(alternating=3)),ylab=log2 ratio (winter/summer), panel=function(...){panel.barchart(...); grid.grill(v=unit(c(-2,-1,0,1,2,3,4,5,6),native),default.units=native)}) i get horizontal lines, but not in correspondence with the y-scaling and get some vertical lines as well. Where is my mistake? Am i refering to the wrong coordinate system (tried with npc, but also did not work)? Thanks a lot for your help, Henning Original-Nachricht Datum: Wed, 16 Jul 2008 09:48:35 -0700 Von: Deepayan Sarkar [EMAIL PROTECTED] An: Henning Wildhagen [EMAIL PROTECTED] CC: r-help@r-project.org Betreff: Re: [R] barchart with bars attached to y=0-line On 7/16/08, Henning Wildhagen [EMAIL PROTECTED] wrote: Dear R users, i am using the following code to produce barcharts with lattice: Compound-c(Glutamine, Arginine, Glutamate, Glycine, Serine, Glucose, Fructose, Raffinose, Glycerol, Galacglycerol, Threitol, Galactinol, Galactitol) Class-c(aminos,aminos,aminos,aminos,aminos,sugars,sugars,sugars,glycerols,glycerols,sugar alcohols,sugar alcohols,sugar alcohols) set.seed(5) Ratio-rnorm(13, 0.5, 3) df-data.frame(Compound, Class,Ratio) library(lattice) P-barchart(data=df,Ratio~Compound|Class) However, I would like to have the bars attached to an imaginary y=0-line so that they go up if Ratio0 and down if Ratio0. I saw some older entries in the mailing list supposing to use panel.barchart. However, I did not fully understand the usage of this function. Also, it was annouced to add an option to barchart to make it easier to get this type of plot. Has anyone an idea what the easiest solution might be? barchart(data=df,Ratio~Compound|Class, origin = 0) A second problem is concerning the display of the Compound-objects in accordance with the conditioning variable Class. In other words: Is it possible to use the whole space of each panel for only those Compound-objects which belong to the Class displayed in that particular panel? Of course, for this purpose the panels have to be detached to allow appropiate labling of the x-axis. barchart(data=df,Ratio~Compound|Class, origin = 0, scales = list(x = free)) But this would work better if the levels of Compound are adjacent within Class; e.g. barchart(data=df,Ratio~reorder(Compound, as.numeric(Class))|Class, origin = 0, scales = list(x = free)) -Deepayan -- [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] spliting a string
Hi String-130.5 Df-Strsplit(.,:,String) Then Df get But I want Df should contains Df 130 5 If any body knows how to do it.tel me Thanks K.Ravichandra [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] barchart with bars attached to y=0-line
## add horizontal grid: barchart(data=df,Ratio~reorder(Compound, as.numeric(Class))|Class, origin = 0, scales = list(x = free), panel=function(...) { panel.grid(h=-1, v=0); panel.barchart(...)}) ## make bars all the same width: library(latticeExtra) update(trellis.last.object(), layout=c(5,1)) resizePanels() On Thu, Jul 17, 2008 at 10:03 PM, Henning Wildhagen [EMAIL PROTECTED] wrote: Dear Deepayan and other users, thanks a lot, that was exactly what i intended to get. However, there are some aspects of the plot that can be improved. It would be nice to have the same wisth of the bars in all panels. I saw the bow.width-argument, but as I understood it is no help to get a fixed absolute width of bars in all panels. Is there a solution? Secondly I would like to get horizontal grid lines at the positions of the y-axis tick marks(from -2 to 6, increment |2|). I tried the following: PLOT-barchart(data=df,Ratio~reorder(Compound,as.numeric(Class))|Class,origin=0,scales=list(x=list(relation=free), y=list(tck=1))) But this does not work. When using: P-barchart(data=df, Ratio~reorder(Compound,as.numeric(Class))|Class, origin=0, scales=list(x=free,y=list(alternating=3)),ylab=log2 ratio (winter/summer), panel=function(...){panel.barchart(...); grid.grill(v=unit(c(-2,-1,0,1,2,3,4,5,6),native),default.units=native)}) i get horizontal lines, but not in correspondence with the y-scaling and get some vertical lines as well. Where is my mistake? Am i refering to the wrong coordinate system (tried with npc, but also did not work)? Thanks a lot for your help, Henning Original-Nachricht Datum: Wed, 16 Jul 2008 09:48:35 -0700 Von: Deepayan Sarkar [EMAIL PROTECTED] An: Henning Wildhagen [EMAIL PROTECTED] CC: r-help@r-project.org Betreff: Re: [R] barchart with bars attached to y=0-line On 7/16/08, Henning Wildhagen [EMAIL PROTECTED] wrote: Dear R users, i am using the following code to produce barcharts with lattice: Compound-c(Glutamine, Arginine, Glutamate, Glycine, Serine, Glucose, Fructose, Raffinose, Glycerol, Galacglycerol, Threitol, Galactinol, Galactitol) Class-c(aminos,aminos,aminos,aminos,aminos,sugars,sugars,sugars,glycerols,glycerols,sugar alcohols,sugar alcohols,sugar alcohols) set.seed(5) Ratio-rnorm(13, 0.5, 3) df-data.frame(Compound, Class,Ratio) library(lattice) P-barchart(data=df,Ratio~Compound|Class) However, I would like to have the bars attached to an imaginary y=0-line so that they go up if Ratio0 and down if Ratio0. I saw some older entries in the mailing list supposing to use panel.barchart. However, I did not fully understand the usage of this function. Also, it was annouced to add an option to barchart to make it easier to get this type of plot. Has anyone an idea what the easiest solution might be? barchart(data=df,Ratio~Compound|Class, origin = 0) A second problem is concerning the display of the Compound-objects in accordance with the conditioning variable Class. In other words: Is it possible to use the whole space of each panel for only those Compound-objects which belong to the Class displayed in that particular panel? Of course, for this purpose the panels have to be detached to allow appropiate labling of the x-axis. barchart(data=df,Ratio~Compound|Class, origin = 0, scales = list(x = free)) But this would work better if the levels of Compound are adjacent within Class; e.g. barchart(data=df,Ratio~reorder(Compound, as.numeric(Class))|Class, origin = 0, scales = list(x = free)) -Deepayan -- [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Felix Andrews / 安福立 PhD candidate Integrated Catchment Assessment and Management Centre The Fenner School of Environment and Society The Australian National University (Building 48A), ACT 0200 Beijing Bag, Locked Bag 40, Kingston ACT 2604 http://www.neurofractal.org/felix/ 3358 543D AAC6 22C2 D336 80D9 360B 72DD 3E4C F5D8 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] spliting a string
On Thu, 2008-07-17 at 17:56 +0530, Kurapati, Ravichandra (Ravichandra) wrote: Hi String-130.5 Df-Strsplit(.,:,String) Then Df get But I want Df should contains Df 130 5 Hi Ravichandra, Try: strsplit(String,[.]) The period (full stop) has to be marked as literal. Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] float and double precision with C code
Hi all, There is a mistake for wich I need your ligths : /// I have a small C code : SEXP testData() { SEXP result; void * rans; float * my_data; int my_data_length; my_data_length = 2; my_data = new float[my_data_length]; my_data[0] = 29.958334; my_data[1] = 29.875; PROTECT(result = allocVector(REALSXP, my_data_length)); rans = (void *)REAL(result); for(int i=0; i my_data_length; i++) ((double *)rans)[i] = (double)(my_data[i]); Rprintf(C value #1 : %f\n, ((double *)rans)[0]); Rprintf(C value #2 : %f\n, ((double *)rans)[1]); delete(my_data); UNPROTECT(1); return(result); } /// And the R corresponding function : testData - function() { result - .Call(testData, PACKAGE=my_package) print(paste(R value #1 :, result[1])) print(paste(R value #2 :, result[2])) return(result) } /// And in R console, after compilation : my_result - testData() C value #1 : 29.958334 C value #2 : 29.875000 [1] R value #1 : 29.9583339691162 [1] R value #2 : 29.875 my_result[1] == 29.958334 [1] FALSE ??? How can I do to conserve my values ? Regards, ___ e http://mail.yahoo.fr __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] how to split the string
Hi Str-130.1,2.2,3.1,... I want the o/p like Dataframe1 row 130 2 3 . . Dataframe2 Col 1 2 1 . . How can I get desired o/p. Thanks K.Ravichandra [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] float and double precision with C code
FAQ 7.31 Also read What Every Computer Scientist Should Know About Floating-Point Arithmetic, ACM Computing Surveys, 23/1, 5–48, also available via http://docs.sun.com/source/806-3568/ncg_goldberg.html. On Thu, Jul 17, 2008 at 8:47 AM, JS Ubei [EMAIL PROTECTED] wrote: Hi all, There is a mistake for wich I need your ligths : /// I have a small C code : SEXP testData() { SEXP result; void * rans; float * my_data; int my_data_length; my_data_length = 2; my_data = new float[my_data_length]; my_data[0] = 29.958334; my_data[1] = 29.875; PROTECT(result = allocVector(REALSXP, my_data_length)); rans = (void *)REAL(result); for(int i=0; i my_data_length; i++) ((double *)rans)[i] = (double)(my_data[i]); Rprintf(C value #1 : %f\n, ((double *)rans)[0]); Rprintf(C value #2 : %f\n, ((double *)rans)[1]); delete(my_data); UNPROTECT(1); return(result); } /// And the R corresponding function : testData - function() { result - .Call(testData, PACKAGE=my_package) print(paste(R value #1 :, result[1])) print(paste(R value #2 :, result[2])) return(result) } /// And in R console, after compilation : my_result - testData() C value #1 : 29.958334 C value #2 : 29.875000 [1] R value #1 : 29.9583339691162 [1] R value #2 : 29.875 my_result[1] == 29.958334 [1] FALSE ??? How can I do to conserve my values ? Regards, ___ e http://mail.yahoo.fr __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] spliting a string
haven't even peeked into the docs, have you? try: strsplit(String, ., fixed=TRUE) strsplit(String, [.]) vQ Kurapati, Ravichandra (Ravichandra) wrote: Hi String-130.5 Df-Strsplit(.,:,String) Then Df get But I want Df should contains Df 130 5 If any body knows how to do it.tel me Thanks K.Ravichandra [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problems with lm()
Dear Hsin-Ya, The problem here is that subject is a factor with 14 levels, and sequence again is a factor with 2 levels; so the subject:sequence interaction already uses up another (13*1)=13 d.f.; given that there are only 28 replicates, it is not surprising that there are no residual degrees of freedom left! What do you intend to test with your model? Maybe I can be of help to solve the problem Best wishes Christoph [EMAIL PROTECTED] schrieb: Dear Dr. Christoph: Thanks for your reply. I am sure that I use the same data to run GLM with SPSS, but SPSS seems work!! I also try your suggestion. I change the sequence data. a-c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13,14,14) b-c(1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2) c-c(2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2) d-c(2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1) d-c(1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1) e-c(1739,1633,1481,1837,1780,2073,1374,1629,1555,1385,1756,1522,1566,1643,1939, 1615,1475,1759,1388,1483,1127,1682,1542,1247,1235,1605,1598,1718 ) KK-data.frame(subject=as.factor(a), drug=as.factor(b), period=as.factor(c), sequence=as.factor(d), Max=e) lm3- lm(Max ~ subject*sequence + period + drug+sequence , data=KK) print(lm3) anova(lm3) However, it can not work!! So, where is the problems? Do I misunderstand what you mean? Best regards, Hsin-Ya Dr. Christoph Scherber wrote: Dear Hsin-Ya Lee, The problem seems to be that every subject always only received one sequence: sequence subject 1 2 1 0 2 2 2 0 3 0 2 4 2 0 5 0 2 6 2 0 7 0 2 8 2 0 9 0 2 10 2 0 11 0 2 12 2 0 13 0 2 14 2 0 You should try to use a design where each subject receives each of the two sequences. Otherwise obviously the interactions will be not available. Best wishes Christoph leeznar schrieb: Dear R-users: I am a new R-user and I have a question about lm function. Here is my data. a-c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13,14,14) b-c(1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2) c-c(2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2) d-c(2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1) e-c(1739,1633,1481,1837,1780,2073,1374,1629,1555,1385,1756,1522,1566,1643,1939,1615,1475,1759,1388,1483,1127,1682,1542,1247,1235,1605,1598,1718 ) Data-data.frame(subject=as.factor(a), drug=as.factor(b), period=as.factor(c), sequence=as.factor(d), Max=e) lm3- lm(Max ~subject*sequence + sequence + period + drug, data=Data) print(lm3) anova(lm3) When I use lm to fit the data, there are some problems in subject*sequence. I have use GLM in SPSS to fit the same data, and it seems there is no problem. I don't know where my problem is. How can I get the same result with SPSS? How can I do? Best regards, Hsin-Ya Lee __ [[elided Yahoo spam]] Content-Type: application/msword; name=Result_SPSS.doc Content-Transfer-Encoding: base64 Content-Description: 3367377201-Result_SPSS.doc Content-Disposition: attachment; filename=Result_SPSS.doc __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Quoted from: http://www.nabble.com/Problems-with-lm%28%29-tp1700p18000246.html . __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Re : float and double precision with C code
thank you for your quick answer, I'm far of the digits capacity and my values are not the result of a computation. I'm developping a R package to acces a specific data source. And I need precision a few better. How can I do ? When I try this In R console, this is correct and what I need : my_value - 29.958334 my_value == 29.958334 [1] TRUE But I need to do the first operation (my_value - 29.958334) in C Regards, - Message d'origine De : jim holtman [EMAIL PROTECTED] À : JS Ubei [EMAIL PROTECTED] Cc : r-help@r-project.org Envoyé le : Jeudi, 17 Juillet 2008, 14h56mn 01s Objet : Re: [R] float and double precision with C code FAQ 7.31 Also read What Every Computer Scientist Should Know About Floating-Point Arithmetic, ACM Computing Surveys, 23/1, 5–48, also available via http://docs.sun.com/source/806-3568/ncg_goldberg.html. On Thu, Jul 17, 2008 at 8:47 AM, JS Ubei [EMAIL PROTECTED] wrote: Hi all, There is a mistake for wich I need your ligths : /// I have a small C code : SEXP testData() { SEXP result; void * rans; float * my_data; int my_data_length; my_data_length = 2; my_data = new float[my_data_length]; my_data[0] = 29.958334; my_data[1] = 29.875; PROTECT(result = allocVector(REALSXP, my_data_length)); rans = (void *)REAL(result); for(int i=0; i my_data_length; i++) ((double *)rans)[i] = (double)(my_data[i]); Rprintf(C value #1 : %f\n, ((double *)rans)[0]); Rprintf(C value #2 : %f\n, ((double *)rans)[1]); delete(my_data); UNPROTECT(1); return(result); } /// And the R corresponding function : testData - function() { result - .Call(testData, PACKAGE=my_package) print(paste(R value #1 :, result[1])) print(paste(R value #2 :, result[2])) return(result) } /// And in R console, after compilation : my_result - testData() C value #1 : 29.958334 C value #2 : 29.875000 [1] R value #1 : 29.9583339691162 [1] R value #2 : 29.875 my_result[1] == 29.958334 [1] FALSE ??? How can I do to conserve my values ? Regards, ___ e http://mail.yahoo.fr __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? _ Envo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lm() question
Dear Steven, Your xlim() function is what changes the behavior of the x axis values; If you remove the xlim() statement from your function, you get a correct x axis: ## YC=82:91 Age=rev(seq(2,11,1)) Num=c(2,0,8,21,49,18,79,28,273,175) box44=data.frame(YC,Age,Num) mod1=lm(log(Num+1)~YC, data=box44) plot(log(Num+1)~YC, data=box44, pch=19, xlab=Year Class, ylab=Loge Number at age, ylim=c(0,6)) abline(lm(log(Num+1)~YC), col=blue, lwd=2) ## But what kind of graph do you want to arrive at in the end? Best wishes Christoph Ranney, Steven schrieb: I have data that looks like YC Age Num 82 11 2 83 10 0 84 9 8 85 8 21 86 7 49 87 6 18 88 5 79 89 4 28 90 3 273 91 2 175 with a program mod1=lm(log(Num+1)~YC, data=box44) plot(log(Num+1)~YC, data=box44, pch=19, xlab=Year Class, ylab=Loge Number at age, ylim=c(0,6), xlim=c(91,82)) abline(lm(log(Num+1)~YC), col=blue, lwd=2) summary(mod1) I need to regress log(Num+1) against YC, but in descending (91 - 82) order so as to get a negative slope and positive intercept. I can plot the values such that the regression line appears correct (the xlim statement), but I can't figure out how to force the regression from YC 91 through 82. A call to the rev() function in front of YC does not produce the results I'm looking for. More than likely, I'm overlooking something. Thanks for any help you can provide. SR Steven H. Ranney Graduate Research Assistant (Ph.D) USGS Montana Cooperative Fishery Research Unit Montana State University PO Box 173460 Bozeman, MT 59717-3460 phone: (406) 994-6643 fax: (406) 994-7479 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. . -- Dr. rer.nat. Christoph Scherber University of Goettingen DNPW, Agroecology Waldweg 26 D-37073 Goettingen Germany phone +49 (0)551 39 8807 fax +49 (0)551 39 8806 Homepage http://www.gwdg.de/~cscherb1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Re : Re : float and double precision with C code
ok, sorry, my mistake was the C printf. Thank you for your good answer Regards - Message d'origine De : JS Ubei [EMAIL PROTECTED] À : jim holtman [EMAIL PROTECTED] Cc : r-help@r-project.org Envoyé le : Jeudi, 17 Juillet 2008, 15h25mn 07s Objet : [R] Re : float and double precision with C code thank you for your quick answer, I'm far of the digits capacity and my values are not the result of a computation. I'm developping a R package to acces a specific data source. And I need precision a few better. How can I do ? When I try this In R console, this is correct and what I need : my_value - 29.958334 my_value == 29.958334 [1] TRUE But I need to do the first operation (my_value - 29.958334) in C Regards, - Message d'origine De : jim holtman [EMAIL PROTECTED] À : JS Ubei [EMAIL PROTECTED] Cc : r-help@r-project.org Envoyé le : Jeudi, 17 Juillet 2008, 14h56mn 01s Objet : Re: [R] float and double precision with C code FAQ 7.31 Also read What Every Computer Scientist Should Know About Floating-Point Arithmetic, ACM Computing Surveys, 23/1, 5–48, also available via http://docs.sun.com/source/806-3568/ncg_goldberg.html. On Thu, Jul 17, 2008 at 8:47 AM, JS Ubei [EMAIL PROTECTED] wrote: Hi all, There is a mistake for wich I need your ligths : /// I have a small C code : SEXP testData() { SEXP result; void * rans; float * my_data; int my_data_length; my_data_length = 2; my_data = new float[my_data_length]; my_data[0] = 29.958334; my_data[1] = 29.875; PROTECT(result = allocVector(REALSXP, my_data_length)); rans = (void *)REAL(result); for(int i=0; i my_data_length; i++) ((double *)rans)[i] = (double)(my_data[i]); Rprintf(C value #1 : %f\n, ((double *)rans)[0]); Rprintf(C value #2 : %f\n, ((double *)rans)[1]); delete(my_data); UNPROTECT(1); return(result); } /// And the R corresponding function : testData - function() { result - .Call(testData, PACKAGE=my_package) print(paste(R value #1 :, result[1])) print(paste(R value #2 :, result[2])) return(result) } /// And in R console, after compilation : my_result - testData() C value #1 : 29.958334 C value #2 : 29.875000 [1] R value #1 : 29.9583339691162 [1] R value #2 : 29.875 my_result[1] == 29.958334 [1] FALSE ??? How can I do to conserve my values ? Regards, ___ e http://mail.yahoo.fr __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? _ Envo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. _ ne boite mail plus intelligente http://mail.yahoo.fr __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Comparing differences in AUC from 2 different models
Hi, I would like to compare differences in AUC from 2 different models, glm and gam for predicting presence / absence. I know that in theory the model with a higher AUC is better, but what I am interested in is if statistically the increase in AUC from the glm model to the gam model is significant. I also read quite extensive discussions on the list about ROC and AUC but I still didn't find my answer. To calculate the AUC and plot the ROC I used the package PresenceAbsence. The help file for auc() says: The standard errors from auc are only valid for comparing an individual model to random assignment (i.e. AUC=.5). To compare two models to each other it is necessary to account for correlation due to the fact that they use the same test set. If you are interested in pair wise model comparisons see the Splus ROC library from Mayo clinic. auc is a much simpler function than what is available from the Splus ROC library from Mayo clinic. I did download this library but I don't have access to S-PLUS and even if supposedly the code is very similar between S-PLUS and R I still don't quite understand what is going on because I am a little bit confused what some parameters represent …. For example markers and status, although I think status represent my original data (all coded 0 and 1) and markers might be the probabilities obtained from my 2 models. The confusion may also steam from the fact that I don't have a medical or biological training and maybe markers and status do have a special meaning for these 2 disciplines. I will really appreciate if you can help in finding a way to compare differences in AUC. Thanks, Monica _ _WL_Refresh_messenger_video_072008 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Colours in R
Hi list, I will help an person that will use some graphics of R in internet. But this pearson want to specify the colours. This person want me to create an pallete of colours like that: Name of color - Code - Colour White - 0xFF - color white (like an box with this color) I know that is possible with R, but i don't know how. Thanks for the advance. Atenciosamente, Leandro Lins Marino Centro de Avaliação Fundação CESGRANRIO Rua Santa Alexandrina, 1011 - 2º andar Rio de Janeiro, RJ - CEP: 20261-903 ( (21) 2103-9600 R.:236 ( (21) 8777-7907 * [EMAIL PROTECTED] Aquele que suporta o peso da sociedade é precisamente aquele que obtém as menores vantagens. (SMITH, Adam) P Antes de imprimir pense em sua responsabilidade e compromisso com o MEIO AMBIENTE __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] errors using step function
Dear R users, I have had a problem using the step function for the selection of the variables from a gamma model with log link function. CODE: step.glm.gam1.infroma - step(glm.gam1.infroma, direction = backward, k=2) ERROR MESSAGE: Error in glm.fit(x[, jj, drop = FALSE], y, wt, offset = object$offset, : inner loop 1; cannot correct step size In addition: Warning message: step size truncated due to divergence Could someone help me??? Thanks Silvia Narduzzi Dipartimento di Epidemiologia ASL RM E Via di S. Costanza, 53 00198 Roma Tel.: +39 06 83060461 Mail: [EMAIL PROTECTED] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] RES: Colours in R
I know that! But i will select about 10 colurs and show him But he needs in that format! thanks@ -Mensagem original- De: Duncan Murdoch [mailto:[EMAIL PROTECTED] Enviada em: quinta-feira, 17 de julho de 2008 11:32 Para: Leandro Marino Cc: [EMAIL PROTECTED] Org Assunto: Re: [R] Colours in R On 7/17/2008 10:25 AM, Leandro Marino wrote: Hi list, I will help an person that will use some graphics of R in internet. But this pearson want to specify the colours. This person want me to create an pallete of colours like that: Name of color - Code - Colour White - 0xFF - color white (like an box with this color) I know that is possible with R, but i don't know how. The colors() function returns all of the names. The ?col2rgb has examples that produce text tables like what you want, but doing it graphically will be hard: there are 150 different colours to show. There's some code in the ?rainbow examples that displays a smaller number of colours. Duncan Murdoch Thanks for the advance. Atenciosamente, Leandro Lins Marino Centro de Avaliação Fundação CESGRANRIO Rua Santa Alexandrina, 1011 - 2º andar Rio de Janeiro, RJ - CEP: 20261-903 ( (21) 2103-9600 R.:236 ( (21) 8777-7907 * [EMAIL PROTECTED] Aquele que suporta o peso da sociedade é precisamente aquele que obtém as menores vantagens. (SMITH, Adam) P Antes de imprimir pense em sua responsabilidade e compromisso com o MEIO AMBIENTE __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] histogram plot default
Dear All: I am trying to plot a series of data using histogram plot. But I want to change the default of setting for histogram plot. For example, I want to set frequency plot starting with zero. Many Thanks Xin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Graph
Hi, I need to draw a barplot with the value numbers on the bars. In particular I would draw a barplot(beside=T) with on the bars the value numbers and in the middle of the bars a parcentage. -- | 12.34% | 150 | -- - | 7.01% | 78 | - Marco __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with mpi.close.Rslaves()
I have found a solution that appears to work. Instead of calling mpi.close.Rslaves() to shut down the slaves, I use mpi.bcast.cmd(q(no)) and rely on the scheduler/mpirun to shut down anything else. However, because I really don't know what I am doing, I would appreciate it if anyone who sees something wrong with this would let me know. Mark Lyman, Statistician ATK Launch Systems [EMAIL PROTECTED] (435) 863-2863 From: Lyman, Mark Sent: Wednesday, July 16, 2008 10:06 AM To: '[EMAIL PROTECTED]' Cc: 'r-help@r-project.org'; Palmer, Michael Subject: Problem with mpi.close.Rslaves() I am running R 2.7.0 on a Suse 9.1 linux cluster with a job scheduler dispatching jobs and openmpi-1.0.1. I have tried running one of the examples at http://ace.acadiau.ca/math/ACMMaC/Rmpi/examples.html in Rmpi and they seem to be working, except mpi.close.Rslaves() hangs. The slaves are closed, but the master doesn't finish its script. Below is the example script and the call to R. The job is being run on a single 4 processor machine. Any suggestions? Also is Rmpi using rexec to communicate? Can it use ssh if it doesn't already? mpirun -np 4 -machinefile /var/spool/PBS/aux/90361.head /apps/R/R270/bin/R CMD BATCH --save Rmpi_test4.R # Initialize MPI library(Rmpi) # Function the slaves will call to perform a validation on the # fold equal to their slave number. # Assumes: thedata,fold,foldNumber,p foldslave - function() { # Note the use of the tag for sent messages: # 1=ready_for_task, 2=done_task, 3=exiting # Note the use of the tag for received messages: # 1=task, 2=done_tasks junk - 0 done - 0 while (done != 1) { # Signal being ready to receive a new task mpi.send.Robj(junk,0,1) # Receive a task task - mpi.recv.Robj(mpi.any.source(),mpi.any.tag()) task_info - mpi.get.sourcetag() tag - task_info[2] if (tag == 1) { foldNumber - task$foldNumber rss - double(p) for (i in 1:p) { # produce a linear model on the first i variables on # training data templm - lm(y~.,data=thedata[fold!=foldNumber,1:(i+1)]) # produce predicted data from test data yhat - predict(templm,newdata=thedata[fold==foldNumber,1:(i+1)]) # get rss of yhat-y localrssresult - sum((yhat-thedata[fold==foldNumber,1])^2) rss[i] - localrssresult } # Send a results message back to the master results - list(result=rss,foldNumber=foldNumber) mpi.send.Robj(results,0,2) } else if (tag == 2) { done - 1 } # We'll just ignore any unknown messages } mpi.send.Robj(junk,0,3) } # We're in the parent. # first make some data n - 1000 # number of obs p - 30 # number of variables # Create data as a set of n samples of p independent variables, # make a random beta with higher weights in the front. # Generate y's as y = beta*x + random x - matrix(rnorm(n*p),n,p) beta - c(rnorm(p/2,0,5),rnorm(p/2,0,.25)) y - x %*% beta + rnorm(n,0,20) thedata - data.frame(y=y,x=x) fold - rep(1:10,length=n) fold - sample(fold) summary(lm(y~x)) # Now, send the data to the slaves mpi.bcast.Robj2slave(thedata) mpi.bcast.Robj2slave(fold) mpi.bcast.Robj2slave(p) # Send the function to the slaves mpi.bcast.Robj2slave(foldslave) # Call the function in all the slaves to get them ready to # undertake tasks mpi.bcast.cmd(foldslave()) # Create task list tasks - vector('list') for (i in 1:10) { tasks[[i]] - list(foldNumber=i) } # Create data structure to store the results rssresult = matrix(0,p,10) junk - 0 closed_slaves - 0 n_slaves - mpi.comm.size()-1 while (closed_slaves n_slaves) { # Receive a message from a slave message - mpi.recv.Robj(mpi.any.source(),mpi.any.tag()) message_info - mpi.get.sourcetag() slave_id - message_info[1] tag - message_info[2] if (tag == 1) { # slave is ready for a task. Give it the next task, or tell it tasks # are done if there are none. if (length(tasks) 0) { # Send a task, and then remove it from the task list mpi.send.Robj(tasks[[1]], slave_id, 1); tasks[[1]] - NULL } else { mpi.send.Robj(junk, slave_id, 2) } } else if (tag == 2) { # The message contains results. Do something with the results. # Store them in the data structure foldNumber - message$foldNumber rssresult[,foldNumber] - message$result } else if (tag == 3) { # A slave has closed down.
Re: [R] Colours in R
On 7/17/2008 10:25 AM, Leandro Marino wrote: Hi list, I will help an person that will use some graphics of R in internet. But this pearson want to specify the colours. This person want me to create an pallete of colours like that: Name of color - Code - Colour White - 0xFF - color white (like an box with this color) I know that is possible with R, but i don't know how. The colors() function returns all of the names. The ?col2rgb has examples that produce text tables like what you want, but doing it graphically will be hard: there are 150 different colours to show. There's some code in the ?rainbow examples that displays a smaller number of colours. Duncan Murdoch Thanks for the advance. Atenciosamente, Leandro Lins Marino Centro de Avaliação Fundação CESGRANRIO Rua Santa Alexandrina, 1011 - 2º andar Rio de Janeiro, RJ - CEP: 20261-903 ( (21) 2103-9600 R.:236 ( (21) 8777-7907 * [EMAIL PROTECTED] Aquele que suporta o peso da sociedade é precisamente aquele que obtém as menores vantagens. (SMITH, Adam) P Antes de imprimir pense em sua responsabilidade e compromisso com o MEIO AMBIENTE __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Odp: Colours in R
Hi Maybe there is some neat function for it but apply(data.frame(t(col2rgb(colours()[1:10]))),1,function(x) paste(as.character.hexmode(x),collapse=)) gives you hex values for colours specified. Then you can use colours()[1:10] to get names (or you can invent your own) and you can print real colours in a plot by using filled rectangle. Regards Petr [EMAIL PROTECTED] napsal dne 17.07.2008 16:25:39: Hi list, I will help an person that will use some graphics of R in internet. But this pearson want to specify the colours. This person want me to create an pallete of colours like that: Name of color - Code - Colour White - 0xFF - color white (like an box with this color) I know that is possible with R, but i don't know how. Thanks for the advance. Atenciosamente, Leandro Lins Marino Centro de Avaliação Fundação CESGRANRIO Rua Santa Alexandrina, 1011 - 2º andar Rio de Janeiro, RJ - CEP: 20261-903 ( (21) 2103-9600 R.:236 ( (21) 8777-7907 * [EMAIL PROTECTED] Aquele que suporta o peso da sociedade é precisamente aquele que obtém as menores vantagens. (SMITH, Adam) P Antes de imprimir pense em sua responsabilidade e compromisso com o MEIO AMBIENTE __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Odp: Graph
Hi [EMAIL PROTECTED] napsal dne 17.07.2008 16:14:33: Hi, I need to draw a barplot with the value numbers on the bars. In particular I would draw a barplot(beside=T) with on the bars the value numbers and in the middle of the bars a parcentage. -- | 12.34% | 150 | -- - | 7.01% | 78 | - See ?barplot especially Value section bbb-barplot(1:10) bbb text(bbb, 1:10, letters[1:10]) text(bbb, (1:10)/2, LETTERS[1:10]) Regards Petr Marco __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Histogram with two colors depending on condition
Here are a couple of ways: dat-rnorm(1000,1,2) hist(dat, col='red') tmp - par('usr') clip(0,tmp[2],tmp[3],tmp[4]) hist(dat, col='green', add=TRUE) # or library(TeachingDemos) hist(dat, col='red') clipplot( hist(dat, col='green', add=TRUE), c(0, par('usr')[2]) ) Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] (801) 408-8111 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mohammad Ehsanul Karim Sent: Thursday, July 17, 2008 3:13 AM To: r-help@r-project.org Subject: [R] Histogram with two colors depending on condition Dear List, Say, we generate data like this- dat-rnorm(1000,1,2) hist(dat) How do i make the histogram, say, red (col = 2) before X = dat = 0, and rest say, green (col = 3) beyond X = dat = 0 in R? The resulting histogram could be like this http://ehsan.karim.googlepages.com/histogram.JPG (edited) Thanks in advance. Ehsan http://ehsan.karim.googlepages.com/diaryofastatistician __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] enscript states file for R scripts?
Dirk Eddelbuettel wrote: On 15 July 2008 at 17:23, hadley wickham wrote: | An alternative to enscript is highlight, | http://www.andre-simon.de/doku/highlight/en/highlight.html, which does | come with R highlighting built in. Another alternative is GNU a2ps which has definitions for R source, documentation and transcripts. As it says the bottom of the page at http://cran.r-project.org/other-software.html (with links) GNU a2ps is a fairly versatile text-to-anything processor, useful for typsetting source code from a wide variety of programming languages. s.ssh, rd.ssh and st.ssh are a2ps style sheets for S code, Rd documentation format, and S transscripts, respectively. (These will be included in the next a2ps release.) and indeed, these files have been part of a2ps' upstream releases for a long time. Dirk Unfortunately, a2ps doesn't support Unicode (and afaics u2ps does not do indentation...). -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Sampling distribution (PDF CDF) of correlation
Hi all, I'm looking for an analytic method to obtain the PDF CDF of the sampling distribution of a given correlation (rho) at a given sample size (N). I've attached code describing a monte carlo method of achieving this, and while it is relatively fast, an analytic solution would obviously be optimal. get.cors - function(i, x, y, N){ end=i*N .Internal(cor(x[(end-N+1):end] ,y[(end-N+1):end] ,TRUE ,FALSE )) } get.r.dist - function(N, rho, it){ Sigma=matrix(c(1,rho,rho,1),2,2) eS = eigen(Sigma, symmetric = TRUE, EISPACK = TRUE) ev = eS$values fact = eS$vectors %*% diag(sqrt(pmax(ev, 0)), 2) Z = rnorm(2 * N * it) dim(Z) = c(2, N * it) Z = t(fact %*% Z) x = Z[, 1] y = Z[, 2] r = sapply(1:it ,get.cors,x, y, N) return(r) } #Run 1e3 monte carlo iterations, where each obtains the correlation # of 10 pairs of observations from a bivariate normal distribution with # a true correlation of .5. Returns 1e3 values for the observed correlation mc.rs = get.r.dist( N=10 , rho=.5 , it=1e3 ) #plot the PDF CDF par(mfrow=c(1,2)) hist(mc.rs,prob=T,xlab='Observed correlation') probs = seq(0,1,.01) plot(quantile(mc.rs,probs=probs),probs,type='l',xlab='Observed correlation',ylab='Cumulative probability') -- Mike Lawrence Graduate Student, Department of Psychology, Dalhousie University www.memetic.ca The road to wisdom? Well, it's plain and simple to express: Err and err and err again, but less and less and less. - Piet Hein __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Colours in R
Leandro Marino [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] But this pearson want to specify the colours. This person want me to create an pallete of colours like that: Name of color - Code - Colour White - 0xFF - color white (like an box with this color) This page shows an R color chart and how to make one: http://research.stowers-institute.org/efg/R/Color/Chart/index.htm In particular, pp. 2-8 of this PDF shows the color name, hex value, and separate decimal RGB components: http://research.stowers-institute.org/efg/R/Color/Chart/ColorChart.pdf efg Earl F Glynn Bioinformatics Stowers Institute for Medical Research __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] help with data layout
Hello list I have been given some Excel sheets with data laid like this: Col1Col2 A 3 2 3 B 4 5 4 C 1 4 3 I was hoping to import this into R as a csv and then get the mean and SD for each letter in column 1. Could someone give me some guidance on best to approach this? Thanks Iain [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] keeping seperate row.names
I have microarray data with gene names in the first column, gene id in the second and the expression data in the remaining columns. When trying use read.table I get the error, . . . more columns than column names. Is there any way to keep both columns of names without having to discard one. or the other? The raw data is read from a text file and is in the form mitogen-activated protein kinase 3 1000_at 946 1928.8 1504.9 722.5 873.9 836.9 1294.3 631.1 606 1126.6 841.2 833.6 689.6 1256.9 685.8 755.3 974.8 where, mitogen-activated protein kinase 3 is the first column of this (tab delimited) text file and 1000_at is the second column. I want to keep both columns as labels. Is there a way to do that? I've used: test - read.table(path, header=T, sep=\t, row.names=1) and get the error, more columns than column names. Many Thanks, Patrick Van Andel Institute Grand Rapids, MI -- View this message in context: http://www.nabble.com/keeping-seperate-row.names-tp18512130p18512130.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Newbie's question about lm
Hello, I would like to fit data with the following formula : y=V*(1+alpha*(x-25)) where y and x are my data, V is a constant and alpha is the slope I'm looking for. How to translate this into R-language ? At the moment, I only know : lm(y ~ x) Sorry for such a basic question. I thought I could find the solution in a post but I have to confess that, up to know, I'm not able to understand the posts I read concerning lm (the level of the questions are too high for me and my english quite poor as well). Have a nice evening, Ptit Bleu. -- View this message in context: http://www.nabble.com/Newbie%27s-question-about-lm-tp18511262p18511262.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sampling distribution (PDF CDF) of correlation
See Chapter 22 in N.L. Johnson, S. Kotz, and N. Balakrishnan, Continuous Univariate Distributions, Volume 2, Second Edition, 1995. Maria On Thu, Jul 17, 2008 at 11:29 AM, Mike Lawrence [EMAIL PROTECTED] wrote: Hi all, I'm looking for an analytic method to obtain the PDF CDF of the sampling distribution of a given correlation (rho) at a given sample size (N). I've attached code describing a monte carlo method of achieving this, and while it is relatively fast, an analytic solution would obviously be optimal. get.cors - function(i, x, y, N){ end=i*N .Internal(cor(x[(end-N+1):end] ,y[(end-N+1):end] ,TRUE ,FALSE )) } get.r.dist - function(N, rho, it){ Sigma=matrix(c(1,rho,rho,1),2,2) eS = eigen(Sigma, symmetric = TRUE, EISPACK = TRUE) ev = eS$values fact = eS$vectors %*% diag(sqrt(pmax(ev, 0)), 2) Z = rnorm(2 * N * it) dim(Z) = c(2, N * it) Z = t(fact %*% Z) x = Z[, 1] y = Z[, 2] r = sapply(1:it ,get.cors,x, y, N) return(r) } #Run 1e3 monte carlo iterations, where each obtains the correlation # of 10 pairs of observations from a bivariate normal distribution with # a true correlation of .5. Returns 1e3 values for the observed correlation mc.rs = get.r.dist( N=10 , rho=.5 , it=1e3 ) #plot the PDF CDF par(mfrow=c(1,2)) hist(mc.rs,prob=T,xlab='Observed correlation') probs = seq(0,1,.01) plot(quantile(mc.rs,probs=probs),probs,type='l',xlab='Observed correlation',ylab='Cumulative probability') -- Mike Lawrence Graduate Student, Department of Psychology, Dalhousie University www.memetic.ca The road to wisdom? Well, it's plain and simple to express: Err and err and err again, but less and less and less. - Piet Hein __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Can mvtnorm calculate a sequence of probabilities?
Hi all, I know pnorm() is able to calculate a sequence of probabilities by providing a vector of means, and a single number for location and standard deviation. For example, pnorm(0.5,c(0.5,2),1) will produce [1] 0.500 0.0668072. However, I wonder if its multivariate counterpart mvtnorm can do the same thing? Namely, if I provide a sequence of (vector-)means and a single vector for boundaries, and a single correlation matrix, can I get a sequence of probabilities? Many thanks! Jianing -- Life is a Markov Chain, your future will be independent with yesterday. Life is also a super-martingale, once you lose it, you are not expected to get it back. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help with data layout
Iain Gallagher wrote: Hello list I have been given some Excel sheets with data laid like this: Col1Col2 A 3 2 3 B 4 5 4 C 1 4 3 I was hoping to import this into R as a csv and then get the mean and SD for each letter in column 1. Could someone give me some guidance on best to approach this? Sure. Reading in Excel sheets can be done at least a few ways, see the R Data Import/Export manual on CRAN. The only way I have done it is to save the Excel sheet as a CSV file, and then use read.csv in R to get a data.frame. One note here is that sometimes the Excel sheet has 'missing' cells where someone has inserted blanks. These may get written out to the CSV file, you'll have to check. For example, I've seen an Excel sheet with something like 10 rows of data that outputs about 100 to the CSV file, mostly all missing. Anyway, once you have the data.frame, I'd use na.locf from the zoo package to 'fill' in the missing Col1 values, and then use an R function such as ave, tapply, aggregate, or by to do whatever you'd like. Thanks Iain [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help with data layout
Hi, hope this will help: txt - Col1,Col2 A,3 , 2 ,3 B,4 , 5 , 4 C,1 , 4 , 3 ## read data dat - read.csv(textConnection(txt),na.string=) ## fill in empty cells with correct category dat$Col1[] - Reduce(function(x,y) c(x,ifelse(is.na(y),tail(x,1),y)),dat$Col1) ## calculate mean and standard deviation mat - t(sapply(split(dat$Col2,f=dat$Col1),function(X) c(mean=mean(X),sd=sd(X ## look at results (stored in a matrix) print(mat) meansd A 2.67 0.5773503 B 4.33 0.5773503 C 2.67 1.5275252 - Original Message From: Iain Gallagher [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Thursday, July 17, 2008 8:50:42 AM Subject: [R] help with data layout Hello list I have been given some Excel sheets with data laid like this: Col1Col2 A 3 2 3 B 4 5 4 C 1 4 3 I was hoping to import this into R as a csv and then get the mean and SD for each letter in column 1. Could someone give me some guidance on best to approach this? Thanks Iain [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] negative P-values with shapiro.test
MM == Martin Maechler [EMAIL PROTECTED] on Wed, 16 Jul 2008 18:02:47 +0200 writes: MC == Mark Cowley [EMAIL PROTECTED] on Wed, 16 Jul 2008 15:32:30 +1000 writes: MC Dear list, MC I am analysing a set of quantitative proteomics data MC from 16 patients which has a large numbers of missing MC data, thus some proteins are only detected once, upto a MC maximum of 16. I want to test each protein for MC normality by the Shapiro Wilk test (function MC shapiro.test in package stats), which can only be MC applied to data with at least 3 measurements, which is MC fine. In the case where I have only 3 observations, and MC two of those observations are identical, then the MC shapiro.test produces negative P-values, which should MC never happen. This occurs for all of the situations MC that I have tried for 3 values, where 2 are the same. MM Yes. Since all such tests are location- and scale-invariant, you MM can reproduce it with MM shapiro.test(c(0,0,1)) MM The irony is that the original papers by Roydon and the R help MM page all assert that the P-value for n = 3 is exact ! MM OTOH, the paper [Roydon (1982), Appl.Stat 31, p115-124] MM clearly states that MM X(1) X(2) X(3) ... X(n) MM i.e., does not allow ties (two equal values). MM If the exact formula in the paper were evaluated exactly MM (instead with a rounded value of about 6 digits), MM the exact P-value would be exactly 0. I have now slightly increased the precision in some of the calculations involved, and also make sure that P-value = 0 for n == 3. This is now in both R-patched and R-devel. Thank you, Marc, for the report! But really, back to your data analysis question : I cannot imagine that the P-value of a Shapiro-Wilks test with n=3 (non-NA) observations is a good tool to help you draw valid conclusions about your data We have had several threads (on R-help) about the (non)sense of normality testing... MM Now that would count as a bug in the paper I think. The bug is not in Roydon's math-stat paper, but arguably in the Fortran code that was published in the accompanying paper... MM More about this tomorrow or so. Martin Maechler, ETH Zurich MC Reproducible code below: MC # these are the data points that raised the problem shapiro.test(c(-0.644, 0.0566, 0.0566)) MC Shapiro-Wilk normality test MC data: c(-0.644, 0.0566, 0.0566) MC W = 0.75, p-value 2.2e-16 shapiro.test(c(-0.644, 0.0566, 0.0566))$p.value MC [1] -7.69e-07 MC # note the verbose output shows a small, but positive P-value, but MC when you extract that P using $p.value, it becomes negative MC # various other tests shapiro.test(c(1,1,2))$p.value MC [1] -8.35e-07 shapiro.test(c(-1,-1,2))$p.value MC [1] -1.03e-06 MC cheers, MC Mark sessionInfo() MC R version 2.6.1 (2007-11-26) MC i386-apple-darwin8.10.1 MC locale: MC en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8 MC attached base packages: MC [1] tcltk graphics grDevices datasets utils stats MC methods base MC other attached packages: MC [1] qvalue_1.12.0Cairo_1.3-5 RSvgDevice_0.6.3 MC SparseM_0.74 pwbc_0.1 MC [6] mjcdev_0.1 tigrmev_0.1 slfa_0.1 MC sage_0.1 qtlreaper_0.1 MC [11] pajek_0.1mjcstats_0.1 mjcspot_0.1 MC mjcgraphics_0.1 mjcaffy_0.1 MC [16] haselst_0.1 geomi_0.1geo_0.1 MC genomics_0.1 cor_0.1 MC [21] bootstrap_0.1blat_0.1 bitops_1.0-4 MC mjcbase_0.1 gdata_2.3.1 MC [26] gtools_2.4.0 MC - MC Mark Cowley, BSc (Bioinformatics)(Hons) MC Peter Wills Bioinformatics Centre MC Garvan Institute of Medical Research, Sydney, Australia MC __ MC R-help@r-project.org mailing list MC https://stat.ethz.ch/mailman/listinfo/r-help MC PLEASE do read the posting guide http://www.R-project.org/posting-guide.html MC and provide commented, minimal, self-contained, reproducible code. MM __ MM R-help@r-project.org mailing list MM https://stat.ethz.ch/mailman/listinfo/r-help MM PLEASE do read the posting guide http://www.R-project.org/posting-guide.html MM and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Hiding information about functions in newly developed packages
on 07/17/2008 10:57 AM Tudor Bodea wrote: Dear UseRs: I intend to write a package to handle the basic operations a revenue management analyst has to deal with on a regular basis (e.g., demand untruncation, capacity allocation, pricing decisions, etc.). Following the directions posted by Peter Rossi (Making R Packages under Windows: A Tutorial, January 2006) I was able to build an interim package that currently consists of one simple function (e.g., fun). After the package is loaded and the function is invoked without any parameters (e.g., fun) the body of the function is displayed on the screen underneath the function call. Since the user does not need to know the internal structure of the function, I am wondering if there is a way to hide this information. I have looked into some of the material that Paul Murrell and John Chambers among others posted but I have not really understood how to incorporate this information into the package building process. If you already experienced this issue and have some useful suggestions, I would really appreciate your taking the time to share them with me. For this project, I am using R2.7.1 on a Windows XP machine. Thank you. Tudor Do you want to have the function hidden in a 'namespace' or to actually attempt to keep the user from seeing your R source code at all? The latter is impossible in R. For the former, see that Writing R Extensions manual: http://cran.r-project.org/doc/manuals/R-exts.html#Package-name-spaces and the following article by Luke in R News: Luke Tierney. Name space management for R. R News, 3(1):2-6, June 2003 You could 'export' a wrapper function called fun(), but not export the main working internal function called fun.internal(). fun - function(...) fun.internal(...) However, the body of the wrapper function would still be seen and if the user knows how to 'get' to functions within namespaces, they will be able see the full body of the internal function. HTH, Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help with data layout
Does this do at least the means for you: x - read.csv(textConnection(Col1 , Col2 + A, 3 + , 2 + , 3 + B, 4 + , 5 + , 4 + C , 1 + , 4 + , 3), strip.white=TRUE) x Col1 Col2 1A3 2 2 3 3 4B4 5 5 6 4 7C1 8 4 9 3 # replace blanks with NAs in first column is.na(x$Col1) - x$Col1 == '' require(zoo) x$Col1 - na.locf(x$Col1) x Col1 Col2 1A3 2A2 3A3 4B4 5B5 6B4 7C1 8C4 9C3 aggregate(x$Col2, list(x$Col1), FUN=mean) Group.1x 1 A 2.67 2 B 4.33 3 C 2.67 On Thu, Jul 17, 2008 at 11:50 AM, Iain Gallagher [EMAIL PROTECTED] wrote: Hello list I have been given some Excel sheets with data laid like this: Col1Col2 A 3 2 3 B 4 5 4 C 1 4 3 I was hoping to import this into R as a csv and then get the mean and SD for each letter in column 1. Could someone give me some guidance on best to approach this? Thanks Iain [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Hiding information about functions in newly developed packages
Tudor Bodea wrote: Dear UseRs: I intend to write a package to handle the basic operations a revenue management analyst has to deal with on a regular basis (e.g., demand untruncation, capacity allocation, pricing decisions, etc.). Following the directions posted by Peter Rossi (Making R Packages under Windows: A Tutorial, January 2006) I was able to build an interim package that currently consists of one simple function (e.g., fun). After the package is loaded and the function is invoked without any parameters (e.g., fun) the body of the function is displayed on the screen underneath the function call. Since the user does not need to know the internal structure of the function, I am wondering if there is a way to hide this information. I have looked into some of the material that Paul Murrell and John Chambers among others posted but I have not really understood how to incorporate this information into the package building process. If you already experienced this issue and have some useful suggestions, I would really appreciate your taking the time to share them with me. For this project, I am using R2.7.1 on a Windows XP machine. A simple way is print.invisible - function(x) cat(Printing disabled\n) class(ls) - invisible ls Printing disabled Of course, unclass(ls) still shows the definition, but I assume that the purpose was never to hide the progamming, just to avoid confusing users. Also of course, the same thing applies to all other functions in the system And, btw, fun is not invoking the function without parameters, that would be fun(). Typing the name is a request to see the function object. -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Hiding information about functions in newly developed packages
You seems to be confused here. A function is invoked -- i.e. called -- via the expression fun(...) ## *Note the parentheses.* The function body is *not* printed. The expression fun ## note: no parentheses is not a call to the function. Rather, it is by default a call to the **print** method for the object, fun; when fun is a function, this prints the body of the function. One could change this by changing the S3 or S4 class of the object and defining the print or show method, respectively, to be something other than the standard print the body of the function. I don't think this is necessarily a good idea, though. Incidentally, if the function is invoked without arguments ** that it needs and for which defaults have not been provide** then you'll get an error. Details on this can be found in ?print, ?print.default ?show and the relevant documentation on S3 and S4 methods (e.g. ?UseMethod ,?setClass, ?setMethod and various R docs). Perhaps before you write your package you would do well to do some more studying of the R language. Cheers, Bert Gunter Genentech -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tudor Bodea Sent: Thursday, July 17, 2008 8:58 AM To: [EMAIL PROTECTED] Cc: Tudor Bodea Subject: [R] Hiding information about functions in newly developed packages Dear UseRs: I intend to write a package to handle the basic operations a revenue management analyst has to deal with on a regular basis (e.g., demand untruncation, capacity allocation, pricing decisions, etc.). Following the directions posted by Peter Rossi (Making R Packages under Windows: A Tutorial, January 2006) I was able to build an interim package that currently consists of one simple function (e.g., fun). After the package is loaded and the function is invoked without any parameters (e.g., fun) the body of the function is displayed on the screen underneath the function call. Since the user does not need to know the internal structure of the function, I am wondering if there is a way to hide this information. I have looked into some of the material that Paul Murrell and John Chambers among others posted but I have not really understood how to incorporate this information into the package building process. If you already experienced this issue and have some useful suggestions, I would really appreciate your taking the time to share them with me. For this project, I am using R2.7.1 on a Windows XP machine. Thank you. Tudor -- Tudor Dan Bodea Georgia Institute of Technology School of Civil and Environmental Engineering Web: http://www.prism.gatech.edu/~gtg757i __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Labelling curves on graphs
On 17-Jul-08 02:32:41, Frank E Harrell Jr wrote: Barry Rowlingson wrote: 2008/7/16 Ted Harding [EMAIL PROTECTED]: Hi Folks, I'd be grateful for good suggestions about the following. I'm plotting a family of (X,Y) curves (for different levels of another variable, Z): say 6 curves in all but could be more or less -- it's a rather variables situation. I'd like to label each curve with the value of Z that it corresponds to. But people out there must have faced it, and I'd be grateful for their own feedback from the coal-face! Isn't this why we have colour vision? Just use 6 different colours and a legend. Or use linestyles and colours. Trying to do neat labelling along general lines like contour does is going to fail in edge cases such as if two lines are very close together. Contour can be fairly sure the lines will have sensible spacing unless it's mapping cliffs. I looked at the underlying C code for contour a while back with a view to using the algorithms in another package, but I gave up pretty quickly. The generated contour lines and labels are quite beautiful, but the code gave me a headache. Barry Barry, Why use color when you can use black and label the curves where they are most separated? This solves problems with color blindness, xeroxing, and faxing. Frank [Also taking later postings into account] Barry has a fair point, as far as it goes, but Frank's response is an indicator of how far it can go. Colour is fine on a monitor screen, provided there are only a few lines. With up to 4 colours (say black, red, blue, green) you can get good distinguishability between colours. More than that, and they can be difficult to distinguish. Also, I find I get cross-reference overload with as few as 4 colours, and looking up the key gets distracting; much easier to let your eye run along the curve to find its label). In any case, in the case I have in hand, this is eventually getting printed in black and white, so the colour solution would not be much use! (Trying to distinguish lines printed in subtly different shades of grey is a fruitless exercise). The issue of curves which intersect each other is indeed tricky; if colour is not practicable, then several labels per curve is presumably the way to go. Indeed, the case I'm looking at tends to have curves broken into disjoint segments (a consequence of NA results for some ranges of the variables) so the same issue can arise. This has been an interesting and helpful discussion, and I'm grateful to all participants. Ted. E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 17-Jul-08 Time: 18:34:34 -- XFMail -- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] REvolution computing
Hi, everybody. Sorry to bring up the subject again but I have visited the revolution computing web page, but its not clear when or what will be the new release be. Does anybody have information about that?. Does anybody know if the version that will be released will be validated?. I have been reading the previous mails about validation and I work in a Pharma company and validation is a big concern. Thank you Mariana _ Descargá GRATIS el poder del nuevo Internet Explorer 7. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Display variables when running a script
Hi, I know this must be a stupid question, and sorry in advance for being such a noob. But, is there way to get R to display only certain variables when running a script. I know if you want to see the value of a variable when using the interface, you just type it in and hit enter, but is there a command that I can use in a script file that will show the value at a certain point? Thanks, -Nina __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Comparing differences in AUC from 2 different models
Monica Pisica wrote: Hi, I would like to compare differences in AUC from 2 different models, glm and gam for predicting presence / absence. I know that in theory the model with a higher AUC is better, but what I am interested in is if statistically the increase in AUC from the glm model to the gam model is significant. I also read quite extensive discussions on the list about ROC and AUC but I still didn't find my answer. To calculate the AUC and plot the ROC I used the package PresenceAbsence. The help file for auc() says: The standard errors from auc are only valid for comparing an individual model to random assignment (i.e. AUC=.5). To compare two models to each other it is necessary to account for correlation due to the fact that they use the same test set. If you are interested in pair wise model comparisons see the Splus ROC library from Mayo clinic. auc is a much simpler function than what is available from the Splus ROC library from Mayo clinic. I did download this library but I don't have access to S-PLUS and even if supposedly the code is very similar between S-PLUS and R I still don't quite understand what is going on because I am a little bit confused what some parameters represent …. For example markers and status, although I think status represent my original data (all coded 0 and 1) and markers might be the probabilities obtained from my 2 models. The confusion may also steam from the fact that I don't have a medical or biological training and maybe markers and status do have a special meaning for these 2 disciplines. I will really appreciate if you can help in finding a way to compare differences in AUC. Thanks, Monica Comparison of ROC areas does not have sufficient power to detect important differences of two models. See the following, for which I have R/S+ code. Likelihood ratio tests for nested models are even more powerful. -Frank @Article{pen08eva, author = {Pencina, Michael J. and {D'Agostino Sr}, Ralph B. and {D'Agostino Jr}, Ralph B. and Vasan, Ramachandran S.}, title = {Evaluating the added predictive ability of a new marker: {From} area under the {ROC} curve to reclassification and beyond}, journal = Stat in Med, year = 2008, volume = 27, pages ={157-172}, annote = {discrimination;model performance;AUC;C-index;risk prediction;biomarker;small differences in ROC area can still be very meaningful;example of insignificant test for difference in ROC areas with very significant results from new method;Yates' discrimination slope;reclassification table;limiting version of this based on whether and amount by which probabilities rise for events and lower for non-events when compare new model to old;comparing two models} } -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Display variables when running a script
?print ?cat On Thu, Jul 17, 2008 at 1:47 PM, [EMAIL PROTECTED] wrote: Hi, I know this must be a stupid question, and sorry in advance for being such a noob. But, is there way to get R to display only certain variables when running a script. I know if you want to see the value of a variable when using the interface, you just type it in and hit enter, but is there a command that I can use in a script file that will show the value at a certain point? Thanks, -Nina __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Display variables when running a script
Depending on your motivation here, you may want to search for 'debugging in R' or something to that effect to look at the various options available. I like the debug package available on CRAN. Also see ?browser and ?trace if you want to debug a function. [EMAIL PROTECTED] wrote: Hi, I know this must be a stupid question, and sorry in advance for being such a noob. But, is there way to get R to display only certain variables when running a script. I know if you want to see the value of a variable when using the interface, you just type it in and hit enter, but is there a command that I can use in a script file that will show the value at a certain point? Thanks, -Nina __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] REvolution computing
on 07/17/2008 12:40 PM Mariana Carla Gambarotta wrote: Hi, everybody. Sorry to bring up the subject again but I have visited the revolution computing web page, but its not clear when or what will be the new release be. Does anybody have information about that?. Does anybody know if the version that will be released will be validated?. I have been reading the previous mails about validation and I work in a Pharma company and validation is a big concern. Thank you Mariana Have you contacted them directly? http://www.revolution-computing.com/revolution/web/static/contactus They will certainly be the only authoritative source to respond to your queries. In the mean time, for R as it is made available by the R Foundation, you may wish to review the following document: http://www.r-project.org/doc/R-FDA.pdf HTH, Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Display variables when running a script
?print ?cat HTH Stephan [EMAIL PROTECTED] schrieb: Hi, I know this must be a stupid question, and sorry in advance for being such a noob. But, is there way to get R to display only certain variables when running a script. I know if you want to see the value of a variable when using the interface, you just type it in and hit enter, but is there a command that I can use in a script file that will show the value at a certain point? Thanks, -Nina __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] ubuntu and rgl package vs suse ?? static plots
Hello all, I've switched over my OS to ubuntu from Suse 10.3. In suse I was able to load up the 'rgl' library in R and then move my plots around after I had plotted them. Which I assumed was part of the openGL. However, since switching to ubuntu I am unable to 'move' my plot around, they are just static plots. :( I have done some things differently such as install R from the .tar.gz instead of rpm. The rgl package doens't complain when I install it. Is there something I'm overlooking? Same laptop same graphics card? cheers, Paul __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ubuntu and rgl package vs suse ?? static plots
On 7/17/2008 2:54 PM, H. Paul Benton wrote: Hello all, I've switched over my OS to ubuntu from Suse 10.3. In suse I was able to load up the 'rgl' library in R and then move my plots around after I had plotted them. Which I assumed was part of the openGL. However, since switching to ubuntu I am unable to 'move' my plot around, they are just static plots. :( I have done some things differently such as install R from the .tar.gz instead of rpm. The rgl package doens't complain when I install it. Is there something I'm overlooking? Same laptop same graphics card? This is a bug in Ubuntu/compiz. This message http://finzi.psych.upenn.edu/R/Rhelp02a/archive/130549.html describes a workaround; there are probably other messages with more detail if you search the Ubuntu forums. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ubuntu and rgl package vs suse ?? static plots
This is really odd. Because in Suse I could have compiz on and still have rotating rgl plots. I did have kde in suse and now gnome in ubuntu but I wouldn't have thought that would make the difference. It must be a ubuntu thing. Very odd. Paul Duncan Murdoch wrote: On 7/17/2008 2:54 PM, H. Paul Benton wrote: Hello all, I've switched over my OS to ubuntu from Suse 10.3. In suse I was able to load up the 'rgl' library in R and then move my plots around after I had plotted them. Which I assumed was part of the openGL. However, since switching to ubuntu I am unable to 'move' my plot around, they are just static plots. :( I have done some things differently such as install R from the .tar.gz instead of rpm. The rgl package doens't complain when I install it. Is there something I'm overlooking? Same laptop same graphics card? This is a bug in Ubuntu/compiz. This message http://finzi.psych.upenn.edu/R/Rhelp02a/archive/130549.html describes a workaround; there are probably other messages with more detail if you search the Ubuntu forums. Duncan Murdoch -- Research Programmer Technician The Scripps Research Institute Mass Spectrometry Core Facility o The / o Scripps \ o Research / o Institute __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sampling distribution (PDF CDF) of correlation
On Thu, 17 Jul 2008, Mike Lawrence wrote: Hi all, I'm looking for an analytic method to obtain the PDF CDF of the sampling distribution of a given correlation (rho) at a given sample size (N). See Fisher (1915) Biometrika. It is non-central t up to a monotone transformation. Finding the value of the non-centrality parameter and that transformation is left as an exercise for the reader. Chuck I've attached code describing a monte carlo method of achieving this, and while it is relatively fast, an analytic solution would obviously be optimal. get.cors - function(i, x, y, N){ end=i*N .Internal(cor(x[(end-N+1):end] ,y[(end-N+1):end] ,TRUE ,FALSE )) } get.r.dist - function(N, rho, it){ Sigma=matrix(c(1,rho,rho,1),2,2) eS = eigen(Sigma, symmetric = TRUE, EISPACK = TRUE) ev = eS$values fact = eS$vectors %*% diag(sqrt(pmax(ev, 0)), 2) Z = rnorm(2 * N * it) dim(Z) = c(2, N * it) Z = t(fact %*% Z) x = Z[, 1] y = Z[, 2] r = sapply(1:it ,get.cors,x, y, N) return(r) } # Run 1e3 monte carlo iterations, where each obtains the correlation # of 10 pairs of observations from a bivariate normal distribution with # a true correlation of .5. Returns 1e3 values for the observed correlation mc.rs = get.r.dist( N=10 , rho=.5 , it=1e3 ) #plot the PDF CDF par(mfrow=c(1,2)) hist(mc.rs,prob=T,xlab='Observed correlation') probs = seq(0,1,.01) plot(quantile(mc.rs,probs=probs),probs,type='l',xlab='Observed correlation',ylab='Cumulative probability') -- Mike Lawrence Graduate Student, Department of Psychology, Dalhousie University www.memetic.ca The road to wisdom? Well, it's plain and simple to express: Err and err and err again, but less and less and less. - Piet Hein __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry(858) 534-2098 Dept of Family/Preventive Medicine E mailto:[EMAIL PROTECTED] UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Matching Up Values
Dear all, I have two files, both of similar formats. In column 1 are Latitude values (real numbers, e.g. -179.25), column 2 has Longitude values (also real numbers) and in one of the files, column 3 has Population Density values (integers); there is no column 3 in the other file. However, the main difference between these two files is that one has fewer rows than the other. So what I'm looking to do is, 'pad out' the shorter file, by adding in the rows with those that are 'missing' from the longer file (ie. if a particular coordinate isn't present in the shorter file but is in the 'longer/master' file), and having 'zero' as its Population Density value (column C). This should result in the shorter file becoming the same length as the initially longer file, and with each file having the same coordinate values (latitude and longitude on each line). How would I do this in R? Thanks for any help offered, Steve _ The John Lewis Clearance - save up to 50% with FREE delivery __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] nested calls, variable scope
Below is an example of a problem I encounter repeatedly when I write functions. A call works at the command line, but it does not work inside a function, even when I have made sure that all required variables are available within the function. The only way I know to solve it is to make the required variable global, which of course is dangerous. What is the elegant or appropriate way to solve this? # The call to ns works at the command line: junk-ns(DAT$Age, knots=74) # The call to lmer works at the command line, with the call to ns nested within it: junk2-lmer(formula=Finished ~ Sex01 + ns(DAT$Age, knots= 74 ) + MinCC + PzC + (1 | NAME ) + (1 | Year), data=myDATonly, family=binomial) But now I want to do this within a function such as the following: myfn function(myknot=74) { library(lme4) library(splines) cat(myknot=, myknot, \n) myMER-lmer(formula=Finished ~ Sex01 + ns(DAT$Age, knots= myknot) + MinCC + PzC + (1 | NAME ) + (1 | Year), data=myDATonly, family=binomial) } myfn(74) myknot= 74 Error in ns(DAT$Age, knots = myknot) : object myknot not found # The bad way to do it: revise myfn to make myknot a global variable: myfn function(myknot=74) { library(lme4) library(splines) cat(myknot=, myknot, \n) myknot-myknot myMER-lmer(formula=Finished ~ Sex01 + ns(DAT$Age, knots= myknot ) + MinCC + PzC + (1 | NAME ) + (1 | Year), data=myDATonly, family=binomial) } z-myfn(74) myknot= 74 # Runs fine but puts a global variable into .GlobalEnv. sessionInfo() R version 2.6.2 (2008-02-08) i386-pc-mingw32 locale: LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252 attached base packages: [1] splines stats graphics grDevices datasets utils methods base other attached packages: [1] lme4_0.999375-13 Matrix_0.999375-9 lattice_0.17-6 loaded via a namespace (and not attached): [1] boot_1.2-32 grid_2.6.2 Thanks for any insight. Jacob A. Wegelin [EMAIL PROTECTED] Assistant Professor Department of Biostatistics Virginia Commonwealth University 730 East Broad Street Room 3006 P. O. Box 980032 Richmond VA 23298-0032 U.S.A. http://www.people.vcu.edu/~jwegelin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help with data layout - Thanks
Thanks for all the excellent replies. Iain [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Matching Up Values
On 18/07/2008, at 8:42 AM, Steve Murray wrote: snip So what I'm looking to do is, 'pad out' the shorter file, by adding in the rows with those that are 'missing' from the longer file (ie. if a particular coordinate isn't present in the shorter file but is in the 'longer/master' file), and having 'zero' as its Population Density value (column C). Aaarrghh!!! Zero is not the same as a missing value. Surely to gumdrops you should say ``having NA as its Population Density value''. Elsewise you are lying about your data. cheers, Rolf Turner ## Attention:\ This e-mail message is privileged and confid...{{dropped:9}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Newbie's question about lm
Hi Ptit, I would like to fit data with the following formula : y=V*(1+alpha*(x-25)) where y and x are my data, V is a constant and alpha is the slope I'm looking for. Priorities first: lm() or ordinary least-square regression is a basically a method for finding the best-fitting straight line through a set of points (assuming that x is measured without error). So lm() determines the slope; it's usually what you are trying to estimate. Perhaps you are after abline: ?abline You can feed this coefficients (i.e. your intercept and slope), and then plot over an existing scatterplot. ## plot(x,y) abline(a=intercept, b=slope) ## or: abline(coef=c(intercept, slope)) HTH, Mark. Ptit_Bleu wrote: Hello, I would like to fit data with the following formula : y=V*(1+alpha*(x-25)) where y and x are my data, V is a constant and alpha is the slope I'm looking for. How to translate this into R-language ? At the moment, I only know : lm(y ~ x) Sorry for such a basic question. I thought I could find the solution in a post but I have to confess that, up to know, I'm not able to understand the posts I read concerning lm (the level of the questions are too high for me and my english quite poor as well). Have a nice evening, Ptit Bleu. -- View this message in context: http://www.nabble.com/Newbie%27s-question-about-lm-tp18511262p18517239.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] combining lists of pairs
Hi everybody, This has been causing me some trouble: I want to combine several lists of pairs into a single?vector but keep the pairs together. The lists are of the form: (1)?x1 x2 x3... N? (2) y1 y2 y3..? N I would like to keep it this way, simply appending one list to another, e.g. (1) x1i x2i... x1j x2j... N (2) etc. A?method like c(...) does not achieve this. It creates a single list, disregarding the pairs. Is there a more efficient way of doing it? Thanks! Oliver Marshall [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] REvolution computing
Mi Mariana: on 07/17/2008 12:40 PM Mariana Carla Gambarotta wrote: Hi, everybody. Sorry to bring up the subject again but I have visited the revolution computing web page, but its not clear when or what will be the new release be. Does anybody have information about that?. Does anybody know if the version that will be released will be validated?. I have been reading the previous mails about validation and I work in a Pharma company and validation is a big concern. Thank you Mariana Have you contacted them directly? http://www.revolution-computing.com/revolution/web/static/contactus They will certainly be the only authoritative source to respond to your queries. We are about to release our first product within the next week. This is a parallel computing environment called NetworkSpaces with a few parallel computing packages bundled in along with the NetworkSpaces server. It works with your current R installation, or with our upcoming product called RPro. Given your query above, I think you are likely more concerned with our following software release which hopefully will occur prior to the Joint Statistical Meetings (this is the current plan). This will be a precompiled version of R for Mac, Linux, and Windows (if you need other platforms, please do ask us) built with the Math Kernel Library from Intel. We currently have the builds completed and are in the testing and QA phase of both the R/MKL combination and the binary installer. With respect to validation, we are also compiling the internal documentation information one would need for Installation Qualification and are prepared to work with Pharma on making the Operational Qualification and Performance Qualification process as easy as possible. In the mean time, for R as it is made available by the R Foundation, you may wish to review the following document: http://www.r-project.org/doc/R-FDA.pdf I would highly recommend reading this document. From our current discussions with Pharma, it contains a wealth of relevant information to aid you in getting R validated internally within your organization. Thanks and please do not hesitate to contact me with any questions or queries you might have. I may not have the answers you need, but can put you in contact with the people who do. Thanks!! Dave H -- David Henderson, Ph.D. Director of Community REvolution Computing 1100 Dexter Avenue North, Suite 250 206-577-4778 x3203 [EMAIL PROTECTED] http://www.revolution-computing.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem with TLC/TK on Ubuntu
Dear all, I have installed R on Linux/Ubuntu 8.04. When I try to load the tcltk package, I get the response: library(tcltk) Error in firstlib(which.lib.loc, package) : Tcl/Tk support is not available on this system Error in library(tcltk) : . First.lib failed for 'tcltk' In order to solve this problem, I try to: 1. To download the tcl and the tk packages (version: 8.5) from http://www.tcl.tk and to install them (the -dev packages also), but the error hold over. 2. To install the previous releases (version: 8.4) for these packages. The error hold over. 3. To set the environment variables 'TCL_LIBRARY' and 'TK_LIBRARY', in this mode: TCL_LIBRARY=/usr/local/lib/tcl8.5 * (or 8.4)* TK_LIBRARY=/usr/lib/tk8.5 *(or 8.4)* but the error hold over. 4.Therefore, after this modifications, i uninstalled R and I installed again it. I try to do this both without setting environment variables and setting it. The error hold over. I look for a solution. Thank you Davide Massidda *-- QPLab - Quantitative Psychology laboratory Department of General Psychology Via Venezia 8 -35131 Padova, Italia* [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] size and coding error in plot ?
Hello R users: I was trying to draw a boxplot?using 2 variables (rain and wind). Both of them has length 25056. So dummy is a matrix with dimension 2 x 25056. First I tried to draw a full boxplot using the following code?and gave an error boxplot(dummy$Rain~dummy$Wind) Error: cannot allocate vector of size 2.8 Gb In addition: Warning messages: 1: In rep.int(boxwex, n) : ? Reached total allocation of 1535Mb: see help(memory.size) 2: In rep.int(boxwex, n) : ? Reached total allocation of 1535Mb: see help(memory.size) 3: In rep.int(boxwex, n) : ? Reached total allocation of 1535Mb: see help(memory.size) 4: In rep.int(boxwex, n) : ? Reached total allocation of 1535Mb: see help(memory.size) and?thereafter I tried to draw boxplots at regular intervals (that means ) and that didnt work either boxplot(dummy$Rain~cut(dummy$Wind,(1:25)*1000), data=dummy) Error in plot.window(xlim = xlim, ylim = ylim, log = log, yaxs = pars$yaxs) : ? need finite 'ylim' values In addition: Warning messages: 1: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL' 2: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL' 3: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL' 4: In min(x) : no non-missing arguments to min; returning Inf 5: In max(x) : no non-missing arguments to max; returning -Inf Also I used ?boxplot(dummy$Rain~cut(dummy$Wind, breaks=seq(from=1,to=25000,by=1000)), data=dummy) and that showing only sigle boxplot at far left corner. What would be the right ting to do ? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] size and coding error in plot ?
Sorry, I forgot to mention that I also tried??? memory.size(T)? and then tried to use those commands. Nothing got changed.. Can you please tell me where is the problem in?my code and how to fix it ? -Original Message- From: [EMAIL PROTECTED] To: r-help@r-project.org Sent: Thu, 17 Jul 2008 9:44 pm Subject: size and coding error in plot ? Hello R users: I was trying to draw a boxplot?using 2 variables (rain and wind). Both of them has length 25056. So dummy is a matrix with dimension 2 x 25056. First I tried to draw a full boxplot using the following code?and gave an error boxplot(dummy$Rain~dummy$Wind) Error: cannot allocate vector of size 2.8 Gb In addition: Warning messages: 1: In rep.int(boxwex, n) : ? Reached total allocation of 1535Mb: see help(memory.size) 2: In rep.int(boxwex, n) : ? Reached total allocation of 1535Mb: see help(memory.size) 3: In rep.int(boxwex, n) : ? Reached total allocation of 1535Mb: see help(memory.size) 4: In rep.int(boxwex, n) : ? Reached total allocation of 1535Mb: see help(memory.size) and?thereafter I tried to draw boxplots at regular intervals (that means ) and that didnt work either boxplot(dummy$Rain~cut(dummy$Wind,(1:25)*1000), data=dummy) Error in plot.window(xlim = xlim, ylim = ylim, log = log, yaxs = pars$yaxs) : ? need finite 'ylim' values In addition: Warning messages: 1: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL' 2: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL' 3: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL' 4: In min(x) : no non-missing arguments to min; returning Inf 5: In max(x) : no non-missing arguments to max; returning -Inf Also I used ?boxplot(dummy$Rain~cut(dummy$Wind, breaks=seq(from=1,to=25000,by=1000)), data=dummy) and that showing only sigle boxplot at far left corner. What would be the right ting to do ? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Function to create variables with prefix
What you should be doing is to return a value from the function and you can then assign this to an object of your choice: sizeattributes - c(length, width, height) blue - myfunction(bluefrenchwidgets, sizeattributes, norm_) red - myfunction(redcanadianwidgets, sizeattributes, norm_) I don't think you really what to create objects in your global space. You can with 'assign' but it is usually better to return a list that has elements of your subsets. This is much easier to manage since you can alway use 'names' on the list to find out exactly what it contains. HTH On Thu, Jul 17, 2008 at 5:46 PM, Moira Burke [EMAIL PROTECTED] wrote: Thanks to everyone who replied. I really appreciate your help, and have been trying variations on the solutions you've offered. Unfortunately, none do quite what I'm hoping to do. Uwe's suggestion seems to be the closest (I think), but the new variables created within the function didn't exist outside of the scope of the function. (If there's something basic in R that I'm just missing, please let me know!) Here's a rephrasing of what I'm trying to do: Let's say I have a data.frame called mywidgets. It has these variables: mywidgets$length mywidgets$width mywidgets$height mywidgets$color mywidgets$country And I have some subsets of mywidgets, let's say: bluefrenchwidgets and redcanadianwidgets. And I'm planning to make lots more subsets in the future. So I want to create a function that performs some operation (such as scaling) within a subset easily. The function would be called sort of like this: sizeattributes - c(length, width, height) myfunction(bluefrenchwidgets, sizeattributes, norm_) myfunction(redcanadianwidgets, sizeattributes, norm_) And then a bunch of scaled variables would be created within those subsets, e.g.: bluefrenchwidgets$norm_length redcanadianwidgets$norm_width etc. Can someone suggest either a function that will do this, or another way to approach the problem? Thanks, Moira On Tue, Jul 15, 2008 at 3:06 AM, Uwe Ligges [EMAIL PROTECTED] wrote: Example: myfunction - function(mydata, myvariables, prefix=norm_){ newnames - paste(prefix, myvariables, sep=) mydata[newnames] - scale(mydata[myvariables]) mydata } mydata - data.frame(a=1:2, b=3:2) myfunction(mydata, c(a, b)) Moira Burke wrote: Hi. I'm a new R user and think what I'm trying to do is pretty basic, but haven't been able to find the answer in the mailing list archives. How do I write a function that creates new variables in a given data.frame with a prefix attached to the variable names? I do a lot of repetitive logging and scaling of variables, and would like a way to consistently generate and name versions of the variables. The function would take a data.frame, an array of variables, and a prefix string. It performs a transformation on the variables (e.g. logs and scales), and creates new variables with the same names as the inputs, but with the prefix_string prepended. So, a sample call would look like: myfunction(mydata, myvariables, norm_) And if myvariables contained variables named var1, var2, etc., the function would generate: mydata$norm_var1 mydata$norm_var2 Thanks, Moira [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] size and coding error in plot ?
On Fri, Jul 18, 2008 at 9:44 AM, [EMAIL PROTECTED] wrote: Hello R users: I was trying to draw a boxplot?using 2 variables (rain and wind). Both of them has length 25056. So dummy is a matrix with dimension 2 x 25056. First I tried to draw a full boxplot using the following code?and gave an error boxplot(dummy$Rain~dummy$Wind) Error: cannot allocate vector of size 2.8 Gb In addition: Warning messages: 1: In rep.int(boxwex, n) : ? Reached total allocation of 1535Mb: see help(memory.size) 2: In rep.int(boxwex, n) : ? Reached total allocation of 1535Mb: see help(memory.size) 3: In rep.int(boxwex, n) : ? Reached total allocation of 1535Mb: see help(memory.size) 4: In rep.int(boxwex, n) : ? Reached total allocation of 1535Mb: see help(memory.size) and?thereafter I tried to draw boxplots at regular intervals (that means ) and that didnt work either boxplot(dummy$Rain~cut(dummy$Wind,(1:25)*1000), data=dummy) Error in plot.window(xlim = xlim, ylim = ylim, log = log, yaxs = pars$yaxs) : ? need finite 'ylim' values In addition: Warning messages: 1: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL' 2: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL' 3: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL' 4: In min(x) : no non-missing arguments to min; returning Inf 5: In max(x) : no non-missing arguments to max; returning -Inf Also I used ?boxplot(dummy$Rain~cut(dummy$Wind, breaks=seq(from=1,to=25000,by=1000)), data=dummy) and that showing only sigle boxplot at far left corner. What would be the right ting to do ? It's hard to check without your data, but the following should work in ggplot2: install.packages(ggplot2) library(ggplot2) # draws a single boxplot qplot(Rain, Wind, data=dummy, geom=boxplot) # draws multiple boxplots qplot(Rain, Wind, data=dummy, geom=boxplot, group = cut(Wind, (1:25)*1000)) Another approach would be to use quantile regression: qplot(Rain, Wind, data=dummy, geom=c(point, quantile)) qplot(Rain, Wind, data=dummy, geom=c(point, quantile), formula = y ~ ns(x, 10)) but getting the parametric form (formula) correct may be tricky. Hadley -- http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] size and coding error in plot ?
on 07/17/2008 08:44 PM [EMAIL PROTECTED] wrote: Hello R users: I was trying to draw a boxplot?using 2 variables (rain and wind). Both of them has length 25056. So dummy is a matrix with dimension 2 x 25056. First I tried to draw a full boxplot using the following code?and gave an error boxplot(dummy$Rain~dummy$Wind) Error: cannot allocate vector of size 2.8 Gb In addition: Warning messages: 1: In rep.int(boxwex, n) : ? Reached total allocation of 1535Mb: see help(memory.size) 2: In rep.int(boxwex, n) : ? Reached total allocation of 1535Mb: see help(memory.size) 3: In rep.int(boxwex, n) : ? Reached total allocation of 1535Mb: see help(memory.size) 4: In rep.int(boxwex, n) : ? Reached total allocation of 1535Mb: see help(memory.size) and?thereafter I tried to draw boxplots at regular intervals (that means ) and that didnt work either boxplot(dummy$Rain~cut(dummy$Wind,(1:25)*1000), data=dummy) Error in plot.window(xlim = xlim, ylim = ylim, log = log, yaxs = pars$yaxs) : ? need finite 'ylim' values In addition: Warning messages: 1: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL' 2: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL' 3: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL' 4: In min(x) : no non-missing arguments to min; returning Inf 5: In max(x) : no non-missing arguments to max; returning -Inf Also I used ?boxplot(dummy$Rain~cut(dummy$Wind, breaks=seq(from=1,to=25000,by=1000)), data=dummy) and that showing only sigle boxplot at far left corner. What would be the right ting to do ? If you are simply looking to create 2 side by side boxplots, one for each variable, Rain and Wind, and your data is in a data frame called 'dummy', then you want to use: boxplot(dummy) The use of '$' in your code above indicates that 'dummy' is a data frame, not a matrix. Right now, you are trying to draw a potentially large number of boxplots, one for each unique value of Wind, which is why the process is running out of memory. If that is not what you are trying to do, then you'll need to provide more information. HTH, Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] matrix multiplication question
Hello, I am a newcomer to R and therefore apologize for posting such a basic question. I am trying to multiply 2 matrices t(X1)%*%X1, where t(X1) is: 1 2 3 4 5 8 12 13 20 24 26 27 31 33 34 36 37 40 41 42 45 46 47 48 49 ones 1 1 1 1 1 1 1 1 1 11 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Shadow 0 1 1 1 1 1 1 1 0 11 1 1 0 1 0 1 0 1 1 0 1 0 1 0 However, when I try to perform this operation, I obtain: Error in t(X1) %*% X1 : requires numeric matrix/vector arguments Can someone please tell me what I seem to be doing wrong? thanks. Sincerely, Murali Kuchibhotla Murali Kuchibhotla Department of Economics Iowa State University Office:75,Heady Phone:515-294-5452 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] matrix multiplication question
On 17/07/2008 9:47 PM, Murali K wrote: Hello, I am a newcomer to R and therefore apologize for posting such a basic question. I am trying to multiply 2 matrices t(X1)%*%X1, where t(X1) is: It's hard to say for sure, but it looks as though X1 really isn't a numeric matrix. The ones and Shadow entries on the left look like character or factor data. To find out what X1 is, you can run str(X1). Duncan Murdoch 1 2 3 4 5 8 12 13 20 24 26 27 31 33 34 36 37 40 41 42 45 46 47 48 49 ones 1 1 1 1 1 1 1 1 1 11 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Shadow 0 1 1 1 1 1 1 1 0 11 1 1 0 1 0 1 0 1 1 0 1 0 1 0 However, when I try to perform this operation, I obtain: Error in t(X1) %*% X1 : requires numeric matrix/vector arguments Can someone please tell me what I seem to be doing wrong? thanks. Sincerely, Murali Kuchibhotla Murali Kuchibhotla Department of Economics Iowa State University Office:75,Heady Phone:515-294-5452 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.