Re: [R] iplots problem
You have to use Sun's Java on your PC, not Microsoft's. I just loaded it on mine, restarted the computer, and it worked. Hope this helps!

mister_bluesman wrote:
> Hi
>
> How did you 'load' Sun's Java in R?
>
> Many thanks
>
> ryestone wrote:
>> Did you load Sun's Java? Try that and also try rebooting your machine.
>> This worked for me.
>>
>> mister_bluesman wrote:
>>> Hi. I try to load iplots using the following commands:
>>> library(rJava)
>>> library(iplots)
>>>
>>> but then I get the following error:
>>>
>>> Error in .jinit(cp, parameters = "-Xmx512m", silent = TRUE) :
>>>   Cannot create Java Virtual Machine
>>> Error in library(iplots) : .First.lib failed for 'iplots'
>>>
>>> What do I have to do to correct this?
>>>
>>> I have jdk1.6 and jre1.6 installed on my Windows machine.
>>> Thanks

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
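A minimal sketch of the fix being described (the JRE path below is an assumption; point it at wherever Sun's Java is actually installed):

```r
# Point R at Sun's JRE before rJava starts the JVM (path is hypothetical).
Sys.setenv(JAVA_HOME = "C:/Program Files/Java/jre1.6.0")
library(rJava)
# "Cannot create Java Virtual Machine" can also mean the 512 MB heap request
# failed; initialising the JVM with a smaller heap before library(iplots)
# may help on memory-constrained machines.
.jinit(parameters = "-Xmx256m")
library(iplots)
```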
Re: [R] Estimation of Dispersion parameter in GLM for Gamma Dist.
This is discussed in the book that the MASS package supports, and/or in its online material (depending on the edition).

On Fri, 25 May 2007, fredrik odegaard wrote:
> Hi All,
> could someone shed some light on what the difference is between the
> estimated dispersion parameter that is supplied with the glm() function
> and the one that the 'gamma.dispersion()' function in the MASS
> library gives? And is there consensus on which estimated value to use?
>
> It seems that the dispersion parameter that comes with the summary
> command for a GLM with a Gamma dist. is close to (but not exactly):
> Pearson Chi-Sq. / d.f.

Sometimes close to, but by no means always. Again, discussed in MASS.

> While the dispersion parameter from the MASS library
> ('gamma.dispersion()') is close to the approximation given in
> McCullagh & Nelder (p. 291):
> (Res.Dev./n) * (6 + Res.Dev./n) / (6 + 2*Res.Dev./n)
>
> (Since it is only an approximation, it seems reasonable that they are
> not exactly alike.)
>
> Many thanks,
> Fredrik

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK
Tel: +44 1865 272861 (self), +44 1865 272866 (PA), Fax: +44 1865 272595
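A small sketch of the comparison being discussed, on simulated data where the true dispersion is known (1/shape = 0.2), so both estimators can be seen side by side:

```r
# Compare the two dispersion estimators on a simulated Gamma GLM.
library(MASS)
set.seed(1)
x <- runif(200)
mu <- exp(1 + 2 * x)                     # true mean on the log link
y <- rgamma(200, shape = 5, rate = 5 / mu)  # true dispersion = 1/5
fit <- glm(y ~ x, family = Gamma(link = "log"))

d1 <- summary(fit)$dispersion  # moment estimator based on Pearson residuals
d2 <- gamma.dispersion(fit)    # 1 / maximum-likelihood estimate of the shape
c(moment = d1, ml = d2)        # similar, but not identical
```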
Re: [R] Problem with rpart
You only have 43 cases. After one split, the groups are too small to split again with the default settings. See ?rpart.control.

On Fri, 25 May 2007, Silvia Lomascolo wrote:
> I work on Windows, R version 2.4.1. I'm very new with R!
>
> I am trying to build a classification tree using rpart but, although the
> matrix has 108 variables, the program builds a tree with only one split
> using one variable! I know it is probable that only one variable is
> informative, but I think it's unlikely. I was wondering if someone can help
> me identify if I'm doing something wrong, because I can't see it, nor could I
> find it in the help or in this forum.
>
> I want to see whether I can predict disperser type (5 categories) of a
> species given the volatile compounds that the fruits emit (108 volatiles).
> I am writing:
>
>> dispvol.x <- read.table('C:\\Documents and Settings\\silvia\\...\\volatile_disperser_matrix.txt', header=T)
>> dispvol.df <- as.data.frame(dispvol.x)
>> attach(dispvol.df)  # I think I need to do this so the variables are
>>                     # identified when I write the regression equation
>> dispvol.ctree <- rpart(disperser ~ P3.70 + P4.29 + P5.05 + ... + P30.99 + P32.25 + TotArea,
>>                        data = dispvol.df, method = 'class')
>
> and I get the following output:
>
> n = 28
>
> node), split, n, loss, yval, (yprob)
>       * denotes terminal node
>
> 1) root 28 15 non (0.036 0.32 0.071 0.11 0.46)
>   2) P10.01>=1.185 10 4 bat (0.1 0.6 0.2 0 0.1) *
>   3) P10.01< 1.185 18 6 non (0 0.17 0 0.17 0.67) *
>
> There is nothing special about P10.01 that I can see in my data, and I don't
> know why it chooses that variable and stops there!
> My matrix looks something like this (except with a lot more variables):
>
> disperser  P3.70  P4.29   P6.45   P6.55  P10.01  P10.15  P10.18  TotArea
> ban         0.00   0.00    1.34    0.00    1.49    0.00    0.00     2.83
> non         0.00   0.00    0.00  152.80    0.00   14.31    0.00   167.11
> bat         0.00   0.00    0.00  131.56    0.65    0.00    0.00   132.21
> bat         0.00   0.00    5.05    0.00   13.01    6.85    0.00    24.90
> non         0.00   0.00   72.65  103.26    4.10    0.00    0.00   180.02
> non         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
> bat         1.23   0.00    0.48    0.89    0.25    0.00    0.00     2.85
> bat         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
> non         0.00   0.00    0.00    0.00    1.06    0.00    0.00     1.06
> bat         0.00   0.00    0.00    0.00   28.69    0.00   21.33    50.02
> mix         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
> non         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
> non         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
> non         0.00   0.00    0.00    0.00    1.15    0.00    0.00     1.15
> non         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
> non         0.00   0.82    0.00    1.65    0.00    0.00    0.00     2.47
> bat         0.00   0.00  133.24    0.00    3.13    0.00    0.00   136.37
> bir         0.00   0.00   11.08    3.16    1.79    2.09    0.48    18.61
> non         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
> mix         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
> bat         0.00   0.00    0.00    0.00    1.31    0.00    0.00     1.31
> non         0.00   0.00    0.00    0.00    0.00    0.00    1.23     1.23
> bat         0.00   0.00    1.81    0.00    2.84    0.00    0.00     4.65
> non         0.00   0.00    1.18    0.00    0.73    0.00    0.00     1.91
> bir         0.00   0.00    0.00    0.00    1.40    0.00    0.00     1.40
> bat         0.00   0.00    8.16    1.50    1.22    0.00    0.00    10.88
> mix         0.00   0.55    0.00    0.00    0.00    0.00    0.00     0.55
> non         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
>
> Thanks! Silvia.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK
Tel: +44 1865 272861 (self), +44 1865 272866 (PA), Fax: +44 1865 272595
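The advice to look at ?rpart.control can be sketched as follows (simulated toy data, not Silvia's matrix; the defaults being relaxed are minsplit = 20 and cp = 0.01):

```r
# Relax rpart's default stopping rules so small groups can still be split.
library(rpart)
set.seed(1)
d <- data.frame(P10.01 = runif(28), P6.55 = runif(28))
d$disperser <- factor(ifelse(d$P10.01 > 0.5, "bat", "non"))
fit <- rpart(disperser ~ ., data = d, method = "class",
             control = rpart.control(minsplit = 5, minbucket = 2, cp = 0.001))
printcp(fit)  # caution: with 28 cases a deep tree overfits; prune via the cp table
```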
Re: [R] How to get the "Naive SE" of coefficients from the zelig output
Dear Abdus,

Can you try:

summary(il6w.out)$table[, "(Naive SE)"]

If you want to know more detail about where that's coming from, run the following command at your R prompt and have a look at the code it prints:

> survival:::summary.survreg

You might also consider subscribing to the Zelig mailing list (http://lists.gking.harvard.edu/?info=zelig) for Zelig-related questions.

By the way, in zelig() there is no need to use

formula = il6.data$il6 ~ il6.data$apache

You can just use:

formula = il6 ~ apache

Best,
Ferdi

On 5/25/07, Abdus Sattar <[EMAIL PROTECTED]> wrote:
> Dear R-user:
>
> After fitting the Tobit model using zelig, if I use the following
> command then I can get the regression coefficients:
>
> beta <- coefficients(il6.out)
> > beta
> (Intercept)      apache
>      4.7826      0.9655
>
> How may I extract the "Naive SE" from the following output, please?
>
> > summary(il6w.out)
> Call:
> zelig(formula = il6.data$il6 ~ il6.data$apache, model = "tobit",
>     data = il6.data, robust = TRUE, cluster = "il6.data$subject",
>     weights = il6.data$w)
>                  Value Std. Err (Naive SE)     z         p
> (Intercept)      4.572  0.12421    0.27946  36.8 1.44e-296
> il6.data$apache  0.983  0.00189    0.00494 519.4  0.00e+00
> Log(scale)       2.731  0.00660    0.00477 414.0  0.00e+00
>
> Scale = 15.3
> Gaussian distribution
> Loglik(model) = -97576   Loglik(intercept only) = -108964
> Chisq = 22777 on 1 degrees of freedom, p = 0
> (Loglikelihood assumes independent observations)
> Number of Newton-Raphson Iterations: 6
> n = 5820 (1180 observations deleted due to missingness)
>
> I would appreciate any help you could provide. Thank you.
>
> Sattar
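A hedged illustration of where that column comes from: Zelig's tobit model wraps survival::survreg, and when robust standard errors are requested the summary table carries a "(Naive SE)" column. Sketched here with survreg directly on the bundled ovarian data, not the poster's il6 data:

```r
# Extract the "(Naive SE)" column from a robust survreg summary table.
library(survival)
fit <- survreg(Surv(futime, fustat) ~ age, data = ovarian, robust = TRUE)
s <- summary(fit)
colnames(s$table)            # includes "(Naive SE)" when robust = TRUE
s$table[, "(Naive SE)"]      # the naive standard errors, by coefficient
```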
Re: [R] Interactive plots?
You could load it into GGobi using the rggobi package and then use the identification mode, as well as all the other interactive features...

http://www.ggobi.org

Michael

On 5/25/07, mister_bluesman <[EMAIL PROTECTED]> wrote:
> Hi there.
>
> I have a matrix that provides place names and the distances between them:
>
>          Chelt  Exeter  London  Birm
> Chelt        0     118      96    50
> Exeter     118       0     118   163
> London      96     118       0   118
> Birm        50     163     118     0
>
> After performing multidimensional scaling I get the following points
> plotted as follows:
>
> http://www.nabble.com/file/p10810700/demo.jpeg
>
> I would like to know how, if I hover over a point, I can get a little box
> telling me which place the point refers to. Does anyone know?
>
> Many thanks.
Re: [R] How to get the "Naive SE" of coefficients from the zelig output
Dear R-user:

After fitting the Tobit model using zelig, if I use the following command then I can get the regression coefficients:

beta <- coefficients(il6.out)
> beta
(Intercept)      apache
     4.7826      0.9655

How may I extract the "Naive SE" from the following output, please?

> summary(il6w.out)
Call:
zelig(formula = il6.data$il6 ~ il6.data$apache, model = "tobit",
    data = il6.data, robust = TRUE, cluster = "il6.data$subject",
    weights = il6.data$w)
                 Value Std. Err (Naive SE)     z         p
(Intercept)      4.572  0.12421    0.27946  36.8 1.44e-296
il6.data$apache  0.983  0.00189    0.00494 519.4  0.00e+00
Log(scale)       2.731  0.00660    0.00477 414.0  0.00e+00

Scale = 15.3
Gaussian distribution
Loglik(model) = -97576   Loglik(intercept only) = -108964
Chisq = 22777 on 1 degrees of freedom, p = 0
(Loglikelihood assumes independent observations)
Number of Newton-Raphson Iterations: 6
n = 5820 (1180 observations deleted due to missingness)

I would appreciate any help you could provide. Thank you.

Sattar
Re: [R] 3D plots with data.frame
On 25/05/2007 8:22 PM, [EMAIL PROTECTED] wrote:
> You could try the function 'plot3d', in package 'rgl':
>
> library(rgl)
> ?plot3d
> x <- data.frame(a=rnorm(100), b=rnorm(100), c=rnorm(100))
> plot3d(x$a, x$b, x$c)

Or more simply, plot3d(x) (which plots the first three columns).

Duncan Murdoch

> Jose
>
> Quoting "H. Paul Benton" <[EMAIL PROTECTED]>:
>> Dear all,
>>
>> Thank you for any help. I have a data.frame and would like to plot
>> it in 3D. I have tried wireframe() and cloud(), and I got:
>>
>>> scatterplot3d(xs)
>> Error: could not find function "scatterplot3d"
>>
>>> wireframe(xs)
>> Error in wireframe(xs) : no applicable method for "wireframe"
>>
>>> persp(x=x, y=y, z=xs)
>> Error in persp.default(x = x, y = y, z = xs) :
>>   (list) object cannot be coerced to 'double'
>>> class(xs)
>> [1] "data.frame"
>>
>> where x and y were a sequence from my min -> max by 50 of xs[,1] and xs[,2].
>>
>> My data is/looks like:
>>
>>> dim(xs)
>> [1] 400 4
>>> xs[1:5,]
>>         x       y  Z1  Z2
>> 1 27172.4 19062.4   0 128
>> 2 27000.9 19077.8   0   0
>> 3 27016.8 19077.5   0   0
>> 4 27029.5 19077.3   0   0
>> 5 27045.4 19077.0   0   0
>>
>> Cheers,
>>
>> Paul
>>
>> --
>> Research Technician, Mass Spectrometry
>> The Scripps Research Institute
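Duncan's shortcut can be sketched as below (rgl.useNULL is a real rgl option used here only so the example also runs on a headless machine):

```r
# plot3d() accepts a data frame directly, plotting its first three columns.
options(rgl.useNULL = TRUE)  # render off-screen; omit for an interactive window
library(rgl)
x <- data.frame(a = rnorm(100), b = rnorm(100), c = rnorm(100))
plot3d(x)  # equivalent to plot3d(x$a, x$b, x$c)
```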
Re: [R] 3D plots with data.frame
You could try the function 'plot3d', in package 'rgl':

library(rgl)
?plot3d
x <- data.frame(a=rnorm(100), b=rnorm(100), c=rnorm(100))
plot3d(x$a, x$b, x$c)

Jose

Quoting "H. Paul Benton" <[EMAIL PROTECTED]>:
> Dear all,
>
> Thank you for any help. I have a data.frame and would like to plot
> it in 3D. I have tried wireframe() and cloud(), and I got:
>
>> scatterplot3d(xs)
> Error: could not find function "scatterplot3d"
>
>> wireframe(xs)
> Error in wireframe(xs) : no applicable method for "wireframe"
>
>> persp(x=x, y=y, z=xs)
> Error in persp.default(x = x, y = y, z = xs) :
>   (list) object cannot be coerced to 'double'
>> class(xs)
> [1] "data.frame"
>
> where x and y were a sequence from my min -> max by 50 of xs[,1] and xs[,2].
>
> My data is/looks like:
>
>> dim(xs)
> [1] 400 4
>> xs[1:5,]
>         x       y  Z1  Z2
> 1 27172.4 19062.4   0 128
> 2 27000.9 19077.8   0   0
> 3 27016.8 19077.5   0   0
> 4 27029.5 19077.3   0   0
> 5 27045.4 19077.0   0   0
>
> Cheers,
>
> Paul
>
> --
> Research Technician, Mass Spectrometry
> The Scripps Research Institute

--
Dr. Jose I. de las Heras
Email: [EMAIL PROTECTED]
The Wellcome Trust Centre for Cell Biology
Institute for Cell & Molecular Biology
Swann Building, Mayfield Road
University of Edinburgh
Edinburgh EH9 3JR, UK
Phone: +44 (0)131 6513374, Fax: +44 (0)131 6507360
Re: [R] normality tests [Broadcast]
[EMAIL PROTECTED] wrote:
> Following up on Frank's thought, why is it that parametric tests are so
> much more popular than their non-parametric counterparts? As
> non-parametric tests require fewer assumptions, why aren't they the
> default? The relative efficiency of the Wilcoxon test compared to the
> t-test is 0.955, and yet I still see t-tests in the medical literature all
> the time. Granted, the Wilcoxon still requires the assumption of symmetry
> (I'm curious as to why the Wilcoxon is often used when asymmetry is
> suspected, since the Wilcoxon assumes symmetry), but that's less stringent
> than requiring normally distributed data. In a similar vein, one usually
> sees the mean and standard deviation reported as summary statistics for a
> continuous variable - these are not very informative unless you assume the
> variable is normally distributed. However, clinicians often insist that I
> include these figures in reports.
>
> Cody Hamilton, PhD
> Edwards Lifesciences

Well said Cody. Just want to add that the Wilcoxon does not assume symmetry if you are interested in testing for stochastic ordering and not just for a mean.

Frank

Frank E Harrell Jr wrote on 05/25/2007 02:42 PM, "Re: [R] normality tests [Broadcast]":

> Lucke, Joseph F wrote:
>> Most standard tests, such as t-tests and ANOVA, are fairly resistant to
>> non-normality for significance testing. It's the sample means that have
>> to be normal, not the data. The CLT kicks in fairly quickly. Testing
>> for normality prior to choosing a test statistic is generally not a good
>> idea.
>
> I beg to differ, Joseph.
> I have had many datasets in which the CLT was
> of no use whatsoever, i.e., where bootstrap confidence limits were
> asymmetric because the data were so skewed, and where symmetric
> normality-based confidence intervals had bad coverage in both tails
> (though correct on average). I see this the opposite way:
> nonparametric tests work fine if normality holds.
>
> Note that the CLT helps with type I error but not so much with type II
> error.
>
> Frank
>
>> -----Original Message-----
>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Liaw, Andy
>> Sent: Friday, May 25, 2007 12:04 PM
>> To: [EMAIL PROTECTED]; Frank E Harrell Jr
>> Cc: r-help
>> Subject: Re: [R] normality tests [Broadcast]
>>
>> From: [EMAIL PROTECTED]
>>> On 25/05/07, Frank E Harrell Jr <[EMAIL PROTECTED]> wrote:
>>>> [EMAIL PROTECTED] wrote:
>>>>> Hi all,
>>>>>
>>>>> apologies for seeking advice on a general stats question. I've run
>>>>> normality tests using 8 different methods:
>>>>> - Lilliefors
>>>>> - Shapiro-Wilk
>>>>> - Robust Jarque Bera
>>>>> - Jarque Bera
>>>>> - Anderson-Darling
>>>>> - Pearson chi-square
>>>>> - Cramer-von Mises
>>>>> - Shapiro-Francia
>>>>>
>>>>> All show that the null hypothesis that the data come from a normal
>>>>> distro cannot be rejected. Great. However, I don't think it looks nice
>>>>> to report the values of 8 different tests in a report. One note is
>>>>> that my sample size is really tiny (less than 20 independent cases).
>>>>> Without wanting to start a flame war, is there any advice on which
>>>>> one/ones would be more appropriate and should be reported (along with
>>>>> a Q-Q plot). Thank you.
>>>>>
>>>>> Regards,
>>>>
>>>> Wow - I have so many concerns with that approach that it's hard to know
>>>> where to begin. But first of all, why care about normality? Why not
>>>> use distribution-free methods?
>>>>
>>>> You should examine the power of the tests for n=20. You'll probably
>>>> find it's not good enough to reach a reliable conclusion.
>>>
>>> And wouldn't it be even worse if I used non-parametric tests?
>>
>> I believe what Frank meant was that it's probably better to use a
>> distribution-free procedure to do the real test of interest (if there is
>> one) instead of testing for normality, and then use a test that assumes
>> normality.
Re: [R] normality tests [Broadcast]
Following up on Frank's thought, why is it that parametric tests are so much more popular than their non-parametric counterparts? As non-parametric tests require fewer assumptions, why aren't they the default? The relative efficiency of the Wilcoxon test compared to the t-test is 0.955, and yet I still see t-tests in the medical literature all the time. Granted, the Wilcoxon still requires the assumption of symmetry (I'm curious as to why the Wilcoxon is often used when asymmetry is suspected, since the Wilcoxon assumes symmetry), but that's less stringent than requiring normally distributed data. In a similar vein, one usually sees the mean and standard deviation reported as summary statistics for a continuous variable - these are not very informative unless you assume the variable is normally distributed. However, clinicians often insist that I include these figures in reports.

Cody Hamilton, PhD
Edwards Lifesciences

Frank E Harrell Jr wrote on 05/25/2007 02:42 PM, "Re: [R] normality tests [Broadcast]":

> Lucke, Joseph F wrote:
>> Most standard tests, such as t-tests and ANOVA, are fairly resistant to
>> non-normality for significance testing. It's the sample means that have
>> to be normal, not the data. The CLT kicks in fairly quickly. Testing
>> for normality prior to choosing a test statistic is generally not a good
>> idea.
>
> I beg to differ, Joseph. I have had many datasets in which the CLT was
> of no use whatsoever, i.e., where bootstrap confidence limits were
> asymmetric because the data were so skewed, and where symmetric
> normality-based confidence intervals had bad coverage in both tails
> (though correct on average). I see this the opposite way:
> nonparametric tests work fine if normality holds.
>
> Note that the CLT helps with type I error but not so much with type II
> error.
> Frank
>
>> -----Original Message-----
>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Liaw, Andy
>> Sent: Friday, May 25, 2007 12:04 PM
>> To: [EMAIL PROTECTED]; Frank E Harrell Jr
>> Cc: r-help
>> Subject: Re: [R] normality tests [Broadcast]
>>
>> From: [EMAIL PROTECTED]
>>> On 25/05/07, Frank E Harrell Jr <[EMAIL PROTECTED]> wrote:
>>>> [EMAIL PROTECTED] wrote:
>>>>> Hi all,
>>>>>
>>>>> apologies for seeking advice on a general stats question. I've run
>>>>> normality tests using 8 different methods:
>>>>> - Lilliefors
>>>>> - Shapiro-Wilk
>>>>> - Robust Jarque Bera
>>>>> - Jarque Bera
>>>>> - Anderson-Darling
>>>>> - Pearson chi-square
>>>>> - Cramer-von Mises
>>>>> - Shapiro-Francia
>>>>>
>>>>> All show that the null hypothesis that the data come from a normal
>>>>> distro cannot be rejected. Great. However, I don't think it looks nice
>>>>> to report the values of 8 different tests in a report. One note is
>>>>> that my sample size is really tiny (less than 20 independent cases).
>>>>> Without wanting to start a flame war, is there any advice on which
>>>>> one/ones would be more appropriate and should be reported (along with
>>>>> a Q-Q plot). Thank you.
>>>>>
>>>>> Regards,
>>>>
>>>> Wow - I have so many concerns with that approach that it's hard to know
>>>> where to begin. But first of all, why care about normality? Why not
>>>> use distribution-free methods?
>>>>
>>>> You should examine the power of the tests for n=20. You'll probably
>>>> find it's not good enough to reach a reliable conclusion.
>>>
>>> And wouldn't it be even worse if I used non-parametric tests?
>>
>> I believe what Frank meant was that it's probably better to use a
>> distribution-free procedure to do the real test of interest (if there is
>> one) instead of testing for normality, and then use a test that assumes
>> normality.
>>
>> I guess the question is, what exactly do you want to do with the outcome
>> of the normality tests? If those are going to be used as basis for
>> deciding which test(s) to do next, then I concur with Frank's
>> reservation.
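The relative-efficiency point above (0.955 for the Wilcoxon versus the t-test) is easy to check by simulation; a minimal sketch, with a hypothetical sample size and effect size:

```r
# Power of t-test vs Wilcoxon on normal data: the Wilcoxon gives up very little.
set.seed(42)
n.sim <- 500
p.t <- p.w <- numeric(n.sim)
for (i in seq_len(n.sim)) {
  x <- rnorm(30)               # group 1
  y <- rnorm(30, mean = 0.8)   # group 2, shifted by an assumed effect size
  p.t[i] <- t.test(x, y)$p.value
  p.w[i] <- wilcox.test(x, y)$p.value
}
power.t <- mean(p.t < 0.05)
power.w <- mean(p.w < 0.05)
c(t = power.t, wilcoxon = power.w)  # nearly identical power
```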
[R] Estimation of Dispersion parameter in GLM for Gamma Dist.
Hi All,

Could someone shed some light on what the difference is between the estimated dispersion parameter that is supplied with the glm() function and the one that the 'gamma.dispersion()' function in the MASS library gives? And is there consensus on which estimated value to use?

It seems that the dispersion parameter that comes with the summary command for a GLM with a Gamma dist. is close to (but not exactly):

Pearson Chi-Sq. / d.f.

While the dispersion parameter from the MASS library ('gamma.dispersion()') is close to the approximation given in McCullagh & Nelder (p. 291):

(Res.Dev./n) * (6 + Res.Dev./n) / (6 + 2*Res.Dev./n)

(Since it is only an approximation, it seems reasonable that they are not exactly alike.)

Many thanks,
Fredrik
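The approximation quoted above can be computed directly from a fitted model; a sketch (the helper name `mn_dispersion` is made up for illustration):

```r
# McCullagh & Nelder (p. 291) moment approximation to the Gamma dispersion:
# Dbar * (6 + Dbar) / (6 + 2 * Dbar), where Dbar = residual deviance / n.
mn_dispersion <- function(fit) {
  dbar <- deviance(fit) / length(residuals(fit))
  dbar * (6 + dbar) / (6 + 2 * dbar)
}

set.seed(1)
y <- rgamma(300, shape = 4, rate = 4)            # true dispersion = 1/4
fit <- glm(y ~ 1, family = Gamma(link = "log"))
mn_dispersion(fit)                               # close to 0.25
```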
Re: [R] normality tests [Broadcast]
You can also try validating your regression model via the bootstrap (the validate() function in the Design library is very helpful). To my mind that would be much more reassuring than normality tests performed on twenty residuals. By the way, be careful with the correlation test - it's only good at detecting linear relationships between two variables (i.e. not helpful for detecting non-linear relationships).

Regards,
-Cody

Cody Hamilton, PhD
Edwards Lifesciences

[EMAIL PROTECTED] wrote on 05/25/2007 11:23 AM, "Re: [R] normality tests [Broadcast]":

> Thank you all for your replies, they have been most useful... well
> in my case I have chosen to do some parametric tests (more precisely
> correlation and linear regressions among some variables)... so it
> would be nice if I had an extra bit of support for my decisions... If I
> understood well from all your replies... I shouldn't pay so much
> attention to the normality tests, so it wouldn't matter which one/ones
> I use to report... but rather focus on issues such as the power of the
> test...
>
> Thanks again.
>
> On 25/05/07, Lucke, Joseph F <[EMAIL PROTECTED]> wrote:
>> Most standard tests, such as t-tests and ANOVA, are fairly resistant to
>> non-normality for significance testing. It's the sample means that have
>> to be normal, not the data. The CLT kicks in fairly quickly. Testing
>> for normality prior to choosing a test statistic is generally not a good
>> idea.
>>
>> -----Original Message-----
>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Liaw, Andy
>> Sent: Friday, May 25, 2007 12:04 PM
>> To: [EMAIL PROTECTED]; Frank E Harrell Jr
>> Cc: r-help
>> Subject: Re: [R] normality tests [Broadcast]
>>
>> From: [EMAIL PROTECTED]
>>> On 25/05/07, Frank E Harrell Jr <[EMAIL PROTECTED]> wrote:
>>>> [EMAIL PROTECTED] wrote:
>>>>> Hi all,
>>>>>
>>>>> apologies for seeking advice on a general stats question.
>>>>> I've run normality tests using 8 different methods:
>>>>> - Lilliefors
>>>>> - Shapiro-Wilk
>>>>> - Robust Jarque Bera
>>>>> - Jarque Bera
>>>>> - Anderson-Darling
>>>>> - Pearson chi-square
>>>>> - Cramer-von Mises
>>>>> - Shapiro-Francia
>>>>>
>>>>> All show that the null hypothesis that the data come from a normal
>>>>> distro cannot be rejected. Great. However, I don't think it looks nice
>>>>> to report the values of 8 different tests in a report. One note is
>>>>> that my sample size is really tiny (less than 20 independent cases).
>>>>> Without wanting to start a flame war, is there any advice on which
>>>>> one/ones would be more appropriate and should be reported (along with
>>>>> a Q-Q plot). Thank you.
>>>>>
>>>>> Regards,
>>>>
>>>> Wow - I have so many concerns with that approach that it's hard to know
>>>> where to begin. But first of all, why care about normality? Why not
>>>> use distribution-free methods?
>>>>
>>>> You should examine the power of the tests for n=20. You'll probably
>>>> find it's not good enough to reach a reliable conclusion.
>>>
>>> And wouldn't it be even worse if I used non-parametric tests?
>>
>> I believe what Frank meant was that it's probably better to use a
>> distribution-free procedure to do the real test of interest (if there is
>> one) instead of testing for normality, and then use a test that assumes
>> normality.
>>
>> I guess the question is, what exactly do you want to do with the outcome
>> of the normality tests? If those are going to be used as basis for
>> deciding which test(s) to do next, then I concur with Frank's
>> reservation.
>>
>> Generally speaking, I do not find goodness-of-fit tests for distributions
>> very useful, mostly for the reason that failure to reject the null is no
>> evidence in favor of the null. It's difficult for me to imagine why
>> "there's insufficient evidence to show that the data did not come from a
>> normal distribution" would be interesting.
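Cody's bootstrap-validation suggestion can be sketched as follows. The Design package has since become rms, and the example below uses rms with simulated data; validate() refits the model on bootstrap resamples and reports optimism-corrected indexes:

```r
# Bootstrap validation of a linear model with rms::validate().
library(rms)
set.seed(1)
d <- data.frame(x1 = rnorm(60), x2 = rnorm(60))
d$y <- 1 + 2 * d$x1 + rnorm(60)
fit <- ols(y ~ x1 + x2, data = d, x = TRUE, y = TRUE)  # keep x, y for resampling
v <- validate(fit, B = 100)  # rows such as R-square, with an optimism column
v
```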
Re: [R] Interactive plots?
The package RSVGTipsDevice allows you to do just that: you create a plot in an SVG file that can be viewed in a browser like Firefox, and the points (or shapes) in that plot can have pop-up tooltips.

-- Tony Plate

mister_bluesman wrote:
> Hi there.
>
> I have a matrix that provides place names and the distances between them:
>
>          Chelt  Exeter  London  Birm
> Chelt        0     118      96    50
> Exeter     118       0     118   163
> London      96     118       0   118
> Birm        50     163     118     0
>
> After performing multidimensional scaling I get the following points
> plotted as follows:
>
> http://www.nabble.com/file/p10810700/demo.jpeg
>
> I would like to know how, if I hover over a point, I can get a little box
> telling me which place the point refers to. Does anyone know?
>
> Many thanks.
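A hedged sketch of the suggestion, assuming RSVGTipsDevice's devSVGTips() / setSVGShapeToolTip() interface; it writes an SVG in which hovering a point shows the place name:

```r
# MDS of the distance matrix, rendered to SVG with per-point tooltips.
library(RSVGTipsDevice)
m <- matrix(c(0, 118, 96, 50, 118, 0, 118, 163,
              96, 118, 0, 118, 50, 163, 118, 0), 4, 4)
dimnames(m) <- list(c("Chelt", "Exeter", "London", "Birm"),
                    c("Chelt", "Exeter", "London", "Birm"))
xy <- cmdscale(as.dist(m), k = 2)
devSVGTips("mds.svg", toolTipMode = 1)
plot(xy, type = "n", xlab = "", ylab = "", main = "MDS of distances")
for (i in seq_len(nrow(xy))) {
  setSVGShapeToolTip(title = rownames(xy)[i])  # tooltip for the next shape drawn
  points(xy[i, 1], xy[i, 2], pch = 19)
}
dev.off()
```

Open mds.svg in a browser that supports SVG tooltips to see the hover labels.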
Re: [R] normality tests [Broadcast]
The normality of the residuals is important in the inference procedures for the classical linear regression model, and normality is very important in correlation analysis (second moment)...

Washington S. Silva

> Thank you all for your replies, they have been most useful... well
> in my case I have chosen to do some parametric tests (more precisely
> correlation and linear regressions among some variables)... so it
> would be nice if I had an extra bit of support for my decisions... If I
> understood well from all your replies... I shouldn't pay so much
> attention to the normality tests, so it wouldn't matter which one/ones
> I use to report... but rather focus on issues such as the power of the
> test...
>
> Thanks again.
>
> On 25/05/07, Lucke, Joseph F <[EMAIL PROTECTED]> wrote:
>> Most standard tests, such as t-tests and ANOVA, are fairly resistant to
>> non-normality for significance testing. It's the sample means that have
>> to be normal, not the data. The CLT kicks in fairly quickly. Testing
>> for normality prior to choosing a test statistic is generally not a good
>> idea.
>>
>> -----Original Message-----
>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Liaw, Andy
>> Sent: Friday, May 25, 2007 12:04 PM
>> To: [EMAIL PROTECTED]; Frank E Harrell Jr
>> Cc: r-help
>> Subject: Re: [R] normality tests [Broadcast]
>>
>> From: [EMAIL PROTECTED]
>>> On 25/05/07, Frank E Harrell Jr <[EMAIL PROTECTED]> wrote:
>>>> [EMAIL PROTECTED] wrote:
>>>>> Hi all,
>>>>>
>>>>> apologies for seeking advice on a general stats question.
>>>>> I've run normality tests using 8 different methods:
>>>>> - Lilliefors
>>>>> - Shapiro-Wilk
>>>>> - Robust Jarque Bera
>>>>> - Jarque Bera
>>>>> - Anderson-Darling
>>>>> - Pearson chi-square
>>>>> - Cramer-von Mises
>>>>> - Shapiro-Francia
>>>>>
>>>>> All show that the null hypothesis that the data come from a normal
>>>>> distro cannot be rejected. Great. However, I don't think it looks nice
>>>>> to report the values of 8 different tests in a report. One note is
>>>>> that my sample size is really tiny (less than 20 independent cases).
>>>>> Without wanting to start a flame war, is there any advice on which
>>>>> one/ones would be more appropriate and should be reported (along with
>>>>> a Q-Q plot). Thank you.
>>>>>
>>>>> Regards,
>>>>
>>>> Wow - I have so many concerns with that approach that it's hard to know
>>>> where to begin. But first of all, why care about normality? Why not
>>>> use distribution-free methods?
>>>>
>>>> You should examine the power of the tests for n=20. You'll probably
>>>> find it's not good enough to reach a reliable conclusion.
>>>
>>> And wouldn't it be even worse if I used non-parametric tests?
>>
>> I believe what Frank meant was that it's probably better to use a
>> distribution-free procedure to do the real test of interest (if there is
>> one) instead of testing for normality, and then use a test that assumes
>> normality.
>>
>> I guess the question is, what exactly do you want to do with the outcome
>> of the normality tests? If those are going to be used as basis for
>> deciding which test(s) to do next, then I concur with Frank's
>> reservation.
>>
>> Generally speaking, I do not find goodness-of-fit tests for distributions
>> very useful, mostly for the reason that failure to reject the null is no
>> evidence in favor of the null.
It's difficult for me to imagine why > > "there's insufficient evidence to show that the data did not come from a > > normal distribution" would be interesting. > > > > Andy > > > > > > > > > > > > Frank > > > > > > > > > > > > -- > > > > Frank E Harrell Jr Professor and Chair School > > > of Medicine > > > > Department of Biostatistics > > > Vanderbilt University > > > > > > > > > > > > > -- > > > yianni > > > > > > __ > > > R-help@stat.math.ethz.ch mailing list > > > https://stat.ethz.ch/mailman/listinfo/r-help > > > PLEASE do read the posting guide > > > http://www.R-project.org/posting-guide.html > > > and provide commented, minimal, self-contained, reproducible code. > > > > > > > > > > > > > > > > > -- > > Notice: This e-mail message, together with any > > attachments,...{{dropped}} > > > > __ > > R-help@stat.math.ethz.ch mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > > -- > yianni > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.eth
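Frank's suggestion to examine the power of normality tests at n = 20 is easy to check by simulation. The sketch below is mine, not from the thread; it uses base R only, and the t distribution with 3 df is an arbitrary choice of non-normal alternative:

```r
## Estimate the power of the Shapiro-Wilk test at n = 20 against a
## clearly non-normal alternative (t with 3 df, i.e. heavy tails).
set.seed(1)
nsim  <- 2000
pvals <- replicate(nsim, shapiro.test(rt(20, df = 3))$p.value)
power <- mean(pvals < 0.05)
power   # typically well below 0.5: the test rarely detects non-normality
```

If the estimated power is low, a non-rejection at n = 20 carries little information, which is exactly the point being made above.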
[R] Interactive plots?
Hi there. I have a matrix that provides place names and the distances between them:

       Chelt Exeter London Birm
Chelt      0    118     96   50
Exeter   118      0    118  163
London    96    118      0  118
Birm      50    163    118    0

After performing multidimensional scaling I get the following points plotted as follows http://www.nabble.com/file/p10810700/demo.jpeg I would like to know how, if I hover over a point, I can get a little box telling me which place the point refers to. Does anyone know? Many thanks. -- View this message in context: http://www.nabble.com/Interactive-plots--tf3818454.html#a10810700 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
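Base R has no true hover tooltip, but identify() will label a point you click on; a sketch under that substitution, using the distance matrix from the message above (with the Exeter row read as 118, 0, 118, 163):

```r
## Classical MDS of the posted distance matrix, with click-to-label points.
places <- c("Chelt", "Exeter", "London", "Birm")
d <- matrix(c(  0, 118,  96,  50,
              118,   0, 118, 163,
               96, 118,   0, 118,
               50, 163, 118,   0),
            nrow = 4, dimnames = list(places, places))
xy <- cmdscale(as.dist(d), k = 2)       # 2-D configuration
plot(xy, xlab = "Dim 1", ylab = "Dim 2")
## In an interactive session, clicking near a point prints its name:
if (interactive()) identify(xy[, 1], xy[, 2], labels = places)
```

For genuine hover or linked interaction, the iplots package discussed elsewhere in this digest is one option.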
Re: [R] normality tests [Broadcast]
[EMAIL PROTECTED] wrote: > Thank you all for your replies they have been most useful... well > in my case I have chosen to do some parametric tests (more precisely > correlation and linear regressions among some variables)... so it > would be nice if I had an extra bit of support on my decisions... If I > understood well from all your replies... I shouldn't pay so much > attention to the normality tests, so it wouldn't matter which one/ones > I use to report... but rather focus on issues such as the power of the > test... If doing regression, I assume your normality tests were on residuals rather than raw data. Frank > > Thanks again. > > On 25/05/07, Lucke, Joseph F <[EMAIL PROTECTED]> wrote: >> Most standard tests, such as t-tests and ANOVA, are fairly resistant to >> non-normality for significance testing. It's the sample means that have >> to be normal, not the data. The CLT kicks in fairly quickly. Testing >> for normality prior to choosing a test statistic is generally not a good >> idea. >> >> -Original Message- >> From: [EMAIL PROTECTED] >> [mailto:[EMAIL PROTECTED] On Behalf Of Liaw, Andy >> Sent: Friday, May 25, 2007 12:04 PM >> To: [EMAIL PROTECTED]; Frank E Harrell Jr >> Cc: r-help >> Subject: Re: [R] normality tests [Broadcast] >> >> From: [EMAIL PROTECTED] >> > >> > On 25/05/07, Frank E Harrell Jr <[EMAIL PROTECTED]> wrote: >> > > [EMAIL PROTECTED] wrote: >> > > > Hi all, >> > > > >> > > > apologies for seeking advice on a general stats question. I've run >> >> > > > normality tests using 8 different methods: >> > > > - Lilliefors >> > > > - Shapiro-Wilk >> > > > - Robust Jarque Bera >> > > > - Jarque Bera >> > > > - Anderson-Darling >> > > > - Pearson chi-square >> > > > - Cramer-von Mises >> > > > - Shapiro-Francia >> > > > >> > > > All show that the null hypothesis that the data come from a normal >> >> > > > distro cannot be rejected. Great. However, I don't think >> > it looks nice >> > > > to report the values of 8 different tests on a report. 
One note is >> >> > > > that my sample size is really tiny (less than 20 >> > independent cases). >> > > > Without wanting to start a flame war, is there any >> > advice on which >> > > > one/ones would be more appropriate and should be reported >> > (along with >> > > > a Q-Q plot). Thank you. >> > > > >> > > > Regards, >> > > > >> > > >> > > Wow - I have so many concerns with that approach that it's >> > hard to know >> > > where to begin. But first of all, why care about >> > normality? Why not >> > > use distribution-free methods? >> > > >> > > You should examine the power of the tests for n=20. You'll probably >> >> > > find it's not good enough to reach a reliable conclusion. >> > >> > And wouldn't it be even worse if I used non-parametric tests? >> >> I believe what Frank meant was that it's probably better to use a >> distribution-free procedure to do the real test of interest (if there is >> one) instead of testing for normality, and then use a test that assumes >> normality. >> >> I guess the question is, what exactly do you want to do with the outcome >> of the normality tests? If those are going to be used as a basis for >> deciding which test(s) to do next, then I concur with Frank's >> reservation. >> >> Generally speaking, I do not find goodness-of-fit for distributions very >> useful, mostly for the reason that failure to reject the null is no >> evidence in favor of the null. It's difficult for me to imagine why >> "there's insufficient evidence to show that the data did not come from a >> normal distribution" would be interesting. 
>> >> Andy >> >> >> > > >> > > Frank >> > > >> > > >> > > -- >> > > Frank E Harrell Jr Professor and Chair School >> > of Medicine >> > > Department of Biostatistics >> > Vanderbilt University >> > > >> > >> > >> > -- >> > yianni >> > >> > __ >> > R-help@stat.math.ethz.ch mailing list >> > https://stat.ethz.ch/mailman/listinfo/r-help >> > PLEASE do read the posting guide >> > http://www.R-project.org/posting-guide.html >> > and provide commented, minimal, self-contained, reproducible code. >> > >> > >> > >> >> >> >> -- >> Notice: This e-mail message, together with any >> attachments,...{{dropped}} >> >> __ >> R-help@stat.math.ethz.ch mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-projec
Re: [R] normality tests [Broadcast]
Lucke, Joseph F wrote: > Most standard tests, such as t-tests and ANOVA, are fairly resistant to > non-normality for significance testing. It's the sample means that have > to be normal, not the data. The CLT kicks in fairly quickly. Testing > for normality prior to choosing a test statistic is generally not a good > idea. I beg to differ, Joseph. I have had many datasets in which the CLT was of no use whatsoever, i.e., where bootstrap confidence limits were asymmetric because the data were so skewed, and where symmetric normality-based confidence intervals had bad coverage in both tails (though correct on the average). I see this the opposite way: nonparametric tests work fine if normality holds. Note that the CLT helps with type I error but not so much with type II error. Frank > > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of Liaw, Andy > Sent: Friday, May 25, 2007 12:04 PM > To: [EMAIL PROTECTED]; Frank E Harrell Jr > Cc: r-help > Subject: Re: [R] normality tests [Broadcast] > > From: [EMAIL PROTECTED] >> On 25/05/07, Frank E Harrell Jr <[EMAIL PROTECTED]> wrote: >>> [EMAIL PROTECTED] wrote: Hi all, apologies for seeking advice on a general stats question. I've run > normality tests using 8 different methods: - Lilliefors - Shapiro-Wilk - Robust Jarque Bera - Jarque Bera - Anderson-Darling - Pearson chi-square - Cramer-von Mises - Shapiro-Francia All show that the null hypothesis that the data come from a normal > distro cannot be rejected. Great. However, I don't think >> it looks nice to report the values of 8 different tests on a report. One note is > that my sample size is really tiny (less than 20 >> independent cases). Without wanting to start a flame war, is there any >> advice on which one/ones would be more appropriate and should be reported >> (along with a Q-Q plot). Thank you. Regards, >>> Wow - I have so many concerns with that approach that it's >> hard to know >>> where to begin. 
But first of all, why care about >> normality? Why not >>> use distribution-free methods? >>> >>> You should examine the power of the tests for n=20. You'll probably > >>> find it's not good enough to reach a reliable conclusion. >> And wouldn't it be even worse if I used non-parametric tests? > > I believe what Frank meant was that it's probably better to use a > distribution-free procedure to do the real test of interest (if there is > one) instead of testing for normality, and then use a test that assumes > normality. > > I guess the question is, what exactly do you want to do with the outcome > of the normality tests? If those are going to be used as basis for > deciding which test(s) to do next, then I concur with Frank's > reservation. > > Generally speaking, I do not find goodness-of-fit for distributions very > useful, mostly for the reason that failure to reject the null is no > evidence in favor of the null. It's difficult for me to imagine why > "there's insufficient evidence to show that the data did not come from a > normal distribution" would be interesting. > > Andy > > >>> Frank >>> >>> >>> -- >>> Frank E Harrell Jr Professor and Chair School >> of Medicine >>> Department of Biostatistics >> Vanderbilt University >> >> -- >> yianni >> >> __ >> R-help@stat.math.ethz.ch mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> >> >> > > > > -- > Notice: This e-mail message, together with any > attachments,...{{dropped}} > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
> -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
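Frank's point about asymmetric bootstrap limits for skewed data can be illustrated with a few lines of base R. The lognormal example and the sample size are my own choices, not from the thread:

```r
## Percentile-bootstrap CI for the mean of skewed (lognormal) data,
## compared with the symmetric t-interval.
set.seed(42)
x <- rlnorm(30, meanlog = 0, sdlog = 1.5)           # strongly right-skewed
boot.means <- replicate(5000, mean(sample(x, replace = TRUE)))
ci.boot <- quantile(boot.means, c(0.025, 0.975))    # asymmetric around mean(x)
ci.t    <- mean(x) + c(-1, 1) * qt(0.975, 29) * sd(x) / sqrt(30)
## The upper bootstrap tail is longer than the lower one:
c(lower = mean(x) - ci.boot[1], upper = ci.boot[2] - mean(x))
```

A symmetric t-interval cannot reproduce this asymmetry, which is why its coverage can be bad in each tail even when the overall coverage looks acceptable.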
[R] 3D plots with data.frame
Dear all, Thank you for any help. I have a data.frame and would like to plot it in 3D. I have tried wireframe() and cloud(), and I got:

> scatterplot3d(xs)
Error: could not find function "scatterplot3d"
> wireframe(xs)
Error in wireframe(xs) : no applicable method for "wireframe"
> persp(x=x, y=y, z=xs)
Error in persp.default(x = x, y = y, z = xs) :
  (list) object cannot be coerced to 'double'
> class(xs)
[1] "data.frame"

where x and y were sequences from my min -> max by 50 of xs[,1] and xs[,2]. My data is/looks like:

> dim(xs)
[1] 400 4
> xs[1:5,]
        x       y Z1  Z2
1 27172.4 19062.4  0 128
2 27000.9 19077.8  0   0
3 27016.8 19077.5  0   0
4 27029.5 19077.3  0   0
5 27045.4 19077.0  0   0

Cheers, Paul -- Research Technician Mass Spectrometry o The / o Scripps \ o Research / o Institute __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
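persp() needs z as a numeric matrix over a regular x/y grid, not a data.frame, and the "could not find function" error suggests the scatterplot3d package was never installed/attached. Since dim(xs) is 400 x 4, a 20 x 20 grid seems likely, though that is a guess; the sketch below uses made-up data of the same shape:

```r
## persp() wants z as a matrix over a regular grid, not a data.frame.
## Fake data in the poster's shape: 400 rows on a 20 x 20 (x, y) grid.
grid <- expand.grid(x = seq(27000, 27200, length.out = 20),
                    y = seq(19000, 19100, length.out = 20))
xs <- data.frame(grid, Z1 = rnorm(400)^2, Z2 = rnorm(400)^2)

xu <- sort(unique(xs$x)); yu <- sort(unique(xs$y))
z  <- matrix(NA_real_, length(xu), length(yu))
z[cbind(match(xs$x, xu), match(xs$y, yu))] <- xs$Z1   # fill grid cells
persp(xu, yu, z, theta = 30, phi = 30)
## For scatterplot3d() the package must be installed and attached first:
## install.packages("scatterplot3d"); library(scatterplot3d)
## scatterplot3d(xs$x, xs$y, xs$Z1)
```

The same reshaped matrix (or the long-format data.frame itself) would also satisfy lattice's wireframe(formula) interface, e.g. wireframe(Z1 ~ x * y, data = xs).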
[R] Problem with rpart
I work on Windows, R version 2.4.1. I'm very new to R! I am trying to build a classification tree using rpart, but although the matrix has 108 variables, the program builds a tree with only one split using one variable! I know it is possible that only one variable is informative, but I think it's unlikely. I was wondering if someone can help me identify whether I'm doing something wrong, because I can't see it, nor could I find it in the help or in this forum. I want to see whether I can predict the disperser type (5 categories) of a species given the volatile compounds that its fruits emit (108 volatiles). I am writing:

> dispvol.x <- read.table('C:\\Documents and Settings\\silvia\\...\\volatile_disperser_matrix.txt', header=T)
> dispvol.df <- as.data.frame(dispvol.x)
> attach(dispvol.df)  # I think I need to do this so the variables are identified when I write the regression equation
> dispvol.ctree <- rpart(disperser ~ P3.70 + P4.29 + P5.05 + ... + P30.99 + P32.25 + TotArea,
+     data = dispvol.df, method = 'class')

and I get the following output:

n= 28
node), split, n, loss, yval, (yprob)
      * denotes terminal node
1) root 28 15 non (0.036 0.32 0.071 0.11 0.46)
  2) P10.01>=1.185 10 4 bat (0.1 0.6 0.2 0 0.1) *
  3) P10.01< 1.185 18 6 non (0 0.17 0 0.17 0.67) *

There is nothing special about P10.01 that I can see in my data, and I don't know why it chooses that variable and stops there! 
My matrix looks something like this (except with a lot more variables):

disperser P3.70 P4.29  P6.45  P6.55 P10.01 P10.15 P10.18 TotArea
ban        0.00  0.00   1.34   0.00   1.49   0.00   0.00    2.83
non        0.00  0.00   0.00 152.80   0.00  14.31   0.00  167.11
bat        0.00  0.00   0.00 131.56   0.65   0.00   0.00  132.21
bat        0.00  0.00   5.05   0.00  13.01   6.85   0.00   24.90
non        0.00  0.00  72.65 103.26   4.10   0.00   0.00  180.02
non        0.00  0.00   0.00   0.00   0.00   0.00   0.00    0.00
bat        1.23  0.00   0.48   0.89   0.25   0.00   0.00    2.85
bat        0.00  0.00   0.00   0.00   0.00   0.00   0.00    0.00
non        0.00  0.00   0.00   0.00   1.06   0.00   0.00    1.06
bat        0.00  0.00   0.00   0.00  28.69   0.00  21.33   50.02
mix        0.00  0.00   0.00   0.00   0.00   0.00   0.00    0.00
non        0.00  0.00   0.00   0.00   0.00   0.00   0.00    0.00
non        0.00  0.00   0.00   0.00   0.00   0.00   0.00    0.00
non        0.00  0.00   0.00   0.00   1.15   0.00   0.00    1.15
non        0.00  0.00   0.00   0.00   0.00   0.00   0.00    0.00
non        0.00  0.82   0.00   1.65   0.00   0.00   0.00    2.47
bat        0.00  0.00 133.24   0.00   3.13   0.00   0.00  136.37
bir        0.00  0.00  11.08   3.16   1.79   2.09   0.48   18.61
non        0.00  0.00   0.00   0.00   0.00   0.00   0.00    0.00
mix        0.00  0.00   0.00   0.00   0.00   0.00   0.00    0.00
bat        0.00  0.00   0.00   0.00   1.31   0.00   0.00    1.31
non        0.00  0.00   0.00   0.00   0.00   0.00   1.23    1.23
bat        0.00  0.00   1.81   0.00   2.84   0.00   0.00    4.65
non        0.00  0.00   1.18   0.00   0.73   0.00   0.00    1.91
bir        0.00  0.00   0.00   0.00   1.40   0.00   0.00    1.40
bat        0.00  0.00   8.16   1.50   1.22   0.00   0.00   10.88
mix        0.00  0.55   0.00   0.00   0.00   0.00   0.00    0.55
non        0.00  0.00   0.00   0.00   0.00   0.00   0.00    0.00

Thanks! Silvia. -- View this message in context: http://www.nabble.com/Problem-with-rpart-tf3818436.html#a10810625 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
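One likely explanation, though only a guess without the data, is rpart's defaults: minsplit = 20 means a node with fewer than 20 cases is never split, so with n = 28 at most one split is possible, and cp = 0.01 discards weak splits. A sketch on a 28-row stand-in dataset (a subset of iris, since the original data are not available):

```r
## With n = 28, rpart's defaults (minsplit = 20, cp = 0.01) allow little
## growth: after one split both children have < 20 cases.
library(rpart)
d28 <- iris[c(1:10, 51:60, 101:108), ]   # 28 cases, 3-level outcome
fit.default <- rpart(Species ~ ., data = d28, method = "class")
fit.loose   <- rpart(Species ~ ., data = d28, method = "class",
                     control = rpart.control(minsplit = 5, cp = 0.001))
nrow(fit.default$frame)   # few nodes: the single-split behavior seen above
nrow(fit.loose$frame)     # more nodes once minsplit/cp are relaxed
```

Relaxing the controls will grow a bigger tree, but with 28 cases and 108 predictors any deeper tree is likely to be badly overfit, so cross-validate (see plotcp) before trusting it.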
Re: [R] iplots problem
Hi. How did you 'load' Sun's Java in R? Many thanks ryestone wrote: > > Did you load Sun's java? Try that and also try rebooting your machine. > This worked for me. > > > mister_bluesman wrote: >> >> Hi. I try to load iplots using the following commands >> >>> library(rJava) >>> library(iplots) >> >> but then I get the following error: >> >> Error in .jinit(cp, parameters = "-Xmx512m", silent = TRUE) : >> Cannot create Java Virtual Machine >> Error in library(iplots) : .First.lib failed for 'iplots' >> >> What do I have to do to correct this? >> >> I have jdk1.6 and jre1.6 installed on my windows machine >> Thanks >> > > -- View this message in context: http://www.nabble.com/iplots-problem-tf3815516.html#a10810602 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] covariance question which has nothing to do with R
(Writing this while my other program is running.) The reference I mentioned previously addresses exactly this: Snijders and Bosker's Multilevel Analysis book, pages 31 and 33, sections 3.6.2 and 3.6.3, discusses it. When you say that the Xs are correlated, you would need to say according to which structure they are correlated:

(1,X1,Y1)
(1,X2,Y2)
(1,X3,Y3)
.
.
.
(1,X55,Y55)
(2,X56,Y56)
(2,X57,Y57)
.
.
.
(2,...

To pick some real-world examples, one row represents a person, or a stock, and the first column indicates the organization or country that person/stock belongs to. The Xs are then correlated within the organization/country, and you will have two covariances: a within-country and a between-country covariance of stocks. This can be implemented manually in R, giving method-of-moments estimates, or the gls function can be used, giving ML or REML estimates. I am not a post doc, just a pre master :-) Toby Leeds, Mark (IED) wrote: > This is a covariance calculation question so nothing to do with R but > maybe someone could help me anyway. > > Suppose I have two random variables X and Y whose means are both known > to be zero and I want to get an estimate of their covariance. > > I have n sample pairs > > (X1,Y1) > (X2,Y2) > . > . > . > . > . > (Xn,Yn) > > so that the covariance estimate is clearly 1/n * (sum from i = 1 to n > of X_i*Y_i). > > But suppose that it is known that the X_i are positively correlated with > each other and that the Y_i are independent > of each other. > > Then, does this change the formula for the covariance estimate at all? > Intuitively, I would think that, if the X_i's are positively > correlated, then something should change, because there is less info > there than if they were independent, but I'm not sure what should change > and I couldn't find it in a book. > > I can assume that the correlation between the X_i's is rho if this makes > things easier? Thanks. > > References are appreciated also. 
> > > This is not an offer (or solicitation of an offer) to buy/se...{{dropped}} > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
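Toby's gls suggestion can be sketched on simulated data; the compound-symmetry structure corCompSymm is my choice of "correlated within organization/country", not something specified in the thread:

```r
## GLS regression allowing within-group correlation, as suggested above.
library(nlme)
set.seed(7)
g <- factor(rep(1:10, each = 6))   # 10 organizations/countries, 6 rows each
u <- rnorm(10)[g]                  # shared group effect => within-group correlation
x <- rnorm(60) + u
y <- 1 + 2 * x + u + rnorm(60)
fit <- gls(y ~ x, correlation = corCompSymm(form = ~ 1 | g))
coef(fit$modelStruct$corStruct, unconstrained = FALSE)  # estimated within-group rho
```

REML is the default for gls; passing method = "ML" switches to maximum likelihood.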
[R] Scale mixture of normals
Dear Friends, Is there an R package which implements regression models with error distributions following a scale mixture of normals? Thanks Anup __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] email the silly fuckers instead.
What? I'm not sure about anyone else but I have absolutely no idea what you're going on about. foxy-3 wrote: > > To all financial investors! This stock will take off! Tuesday 29 > May the rally begins! > > Company: Talktech Telemedia > Ticker: OYQ / OYQ.F / OYQ.DE > WKN: 278104 > ISIN: US8742781045 > > Price: 0.54 > 5-day forecast: 1.75 > > Realized price gain of 450% in 5 days! OYQ will take off like a rocket! > TUESDAY 29 MAY THE RALLY BEGINS! > > But thanks for drawing my attention to it anyway. These universities > attract talented people who develop new ideas which can be commercially > developed in the surrounding community, creating high-paying jobs. A new > breakout from the main PKK tube has been advancing down the pali in the > past three weeks, more than a kilometer west of the Campout tube. > This article was written by scientists at the U. We are also encouraging > licence fee payers to respond to the consultation. > Its well regarded as being generally independant news reviewer, and has > clashed with various governments over the years on the indepenance and > nature of its reporting. > shtml Send This Article to a Friend: Your Name: Your E-Mail: Please fill > in all fields. > including renewable energy research and development. to be fully > involved in the digital revolution that is sweeping the world. > regardless of their economic circumstances. The second principle is that > innovation, entrepreneurship and risk-taking need to be encouraged, > nurtured and rewarded. > Frank Pemberton from Cornell Cooperative Extension on "Home Lawn Care > and Garden Tips. It is located at Kamokuna, about midway between the two > older entries. Whether students study STEM subjects or other fields of > study we need to increase the incentives for parents to save for their > children's college education. 
com Nome Search Powered by :: Free RSS > news Add RSS news to your web site Garden news vertical portal can now > be syndicated quickly and easily using our new Really Simple > Syndication feeds. including renewable energy research and development. > We all want a higher standard of living for ourselves and our children. > Aceasta este vestea rea. Cross off those plans to tool to the shore in a > convertible. > They think it is too hard to change. In fact, spectacular eruptions like > that of Mount Pinatubo are demonstrated to contribute to global cooling > through the injection of solar energy reflecting ash and other small > particles. It is important for us to have a concrete, shared > understanding of where we want to go. > Educator Mary Jean LeTendre said it best when she observed, "America's > future walks through the doors of our schools each day. " would have > been answered differently. > Vintage apron show and open house at May Creek Home :: Web Directory :: > Garden News :: Free RSS news :: Free Newsletter :: Tell a Friend > Clientfinder. Bush vows to work with allies for tougher response to > Iran's . Aceasta este vestea rea. > In fact, spectacular eruptions like that of Mount Pinatubo are > demonstrated to contribute to global cooling through the injection of > solar energy reflecting ash and other small particles. Garden tool sets > Home :: Web Directory :: Garden News :: Free RSS news :: Free Newsletter > :: Tell a Friend Clientfinder. We must seize the moment. will support > three new endowed professorships at UH in STEM disciplines. Recipient > Name: Recipient E-Mail: Personal Message: Discuss This Story: > HawaiiThreads. To achieve this vision, we have to change our economy > from one based on land development, to one fueled by innovation and the > new ideas generated by our universities and a highly-trained workforce. > And, to do this while preserving those aspects of life that make our > island home so special. 
roCampania de optimizare a site-ului > EHConstruct. > com Nome Search Powered by :: Free RSS news Add RSS news to your web > site Garden news vertical portal can now be syndicated quickly and > easily using our new Really Simple Syndication feeds. > The state's High Technology Development Corporation would become the > Center's master lessee. > and by their ability to work and communicate effectively with others > from around the world. These skills are known collectively as STEM. > Keep up with the latest Advogato features and changes by checking the > Advogato status blog. > Simply put, success means producing a constantly rising standard of > living for all Hawai'i's people while using fewer natural resources, > including land. > " "I want to lead us down the path of innovation, because it is the path > of hope and opportunity," she said. > Sapling balm after storm fury Home :: Web Directory :: Garden News :: > Free RSS news :: Free Newsletter :: Tell a Friend Clientfinder. > Senator Daniel Inouye about this proposal and a follow-up conversation > with his senior staff about how the Senator can help us achieve our > vision for a new economic futur
[R] email the silly fuckers instead.
To all financial investors! This stock will take off! Tuesday 29 May the rally begins! Company: Talktech Telemedia Ticker: OYQ / OYQ.F / OYQ.DE WKN: 278104 ISIN: US8742781045 Price: 0.54 5-day forecast: 1.75 Realized price gain of 450% in 5 days! OYQ will take off like a rocket! TUESDAY 29 MAY THE RALLY BEGINS! But thanks for drawing my attention to it anyway. These universities attract talented people who develop new ideas which can be commercially developed in the surrounding community, creating high-paying jobs. A new breakout from the main PKK tube has been advancing down the pali in the past three weeks, more than a kilometer west of the Campout tube. This article was written by scientists at the U. We are also encouraging licence fee payers to respond to the consultation. Its well regarded as being generally independant news reviewer, and has clashed with various governments over the years on the indepenance and nature of its reporting. shtml Send This Article to a Friend: Your Name: Your E-Mail: Please fill in all fields. including renewable energy research and development. to be fully involved in the digital revolution that is sweeping the world. regardless of their economic circumstances. The second principle is that innovation, entrepreneurship and risk-taking need to be encouraged, nurtured and rewarded. Frank Pemberton from Cornell Cooperative Extension on "Home Lawn Care and Garden Tips. It is located at Kamokuna, about midway between the two older entries. Whether students study STEM subjects or other fields of study we need to increase the incentives for parents to save for their children's college education. com Nome Search Powered by :: Free RSS news Add RSS news to your web site Garden news vertical portal can now be syndicated quickly and easily using our new Really Simple Syndication feeds. including renewable energy research and development. We all want a higher standard of living for ourselves and our children. 
Aceasta este vestea rea. Cross off those plans to tool to the shore in a convertible. They think it is too hard to change. In fact, spectacular eruptions like that of Mount Pinatubo are demonstrated to contribute to global cooling through the injection of solar energy reflecting ash and other small particles. It is important for us to have a concrete, shared understanding of where we want to go. Educator Mary Jean LeTendre said it best when she observed, "America's future walks through the doors of our schools each day. " would have been answered differently. Vintage apron show and open house at May Creek Home :: Web Directory :: Garden News :: Free RSS news :: Free Newsletter :: Tell a Friend Clientfinder. Bush vows to work with allies for tougher response to Iran's . Aceasta este vestea rea. In fact, spectacular eruptions like that of Mount Pinatubo are demonstrated to contribute to global cooling through the injection of solar energy reflecting ash and other small particles. Garden tool sets Home :: Web Directory :: Garden News :: Free RSS news :: Free Newsletter :: Tell a Friend Clientfinder. We must seize the moment. will support three new endowed professorships at UH in STEM disciplines. Recipient Name: Recipient E-Mail: Personal Message: Discuss This Story: HawaiiThreads. To achieve this vision, we have to change our economy from one based on land development, to one fueled by innovation and the new ideas generated by our universities and a highly-trained workforce. And, to do this while preserving those aspects of life that make our island home so special. roCampania de optimizare a site-ului EHConstruct. com Nome Search Powered by :: Free RSS news Add RSS news to your web site Garden news vertical portal can now be syndicated quickly and easily using our new Really Simple Syndication feeds. The state's High Technology Development Corporation would become the Center's master lessee. 
and by their ability to work and communicate effectively with others from around the world. These skills are known collectively as STEM. Keep up with the latest Advogato features and changes by checking the Advogato status blog. Simply put, success means producing a constantly rising standard of living for all Hawai'i's people while using fewer natural resources, including land. " "I want to lead us down the path of innovation, because it is the path of hope and opportunity," she said. Sapling balm after storm fury Home :: Web Directory :: Garden News :: Free RSS news :: Free Newsletter :: Tell a Friend Clientfinder. Senator Daniel Inouye about this proposal and a follow-up conversation with his senior staff about how the Senator can help us achieve our vision for a new economic future for West O'ahu. Educator Mary Jean LeTendre said it best when she observed, "America's future walks through the doors of our schools each day. Please welcome students from McKinley High School and their teacher Osa Tui and FIRST Waialua High School students and their teacher Glenn Lee
Re: [R] File path expansion
On 5/25/2007 1:09 PM, Prof Brian Ripley wrote: > On Fri, 25 May 2007, Martin Maechler wrote: > >> >>> path.expand("~") >> [1] "/home/maechler" > > Yes, but beware that may not do what you want on Windows in R <= 2.5.0, > since someone changed the definition of 'home' but not path.expand. A more basic problem is that the definition of "~" in Windows is very ambiguous. Is it my Cygwin home directory, where "cd ~" would take me while in Cygwin? Is it my Windows CSIDL_PERSONAL folder, usually %HOMEDRIVE%/%HOMEPATH%/My Documents? Is it the parent of that folder, %HOMEDRIVE%/%HOMEPATH%? "~" is a shell concept that makes sense in Unix-like shells, but not in Windows. Duncan Murdoch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
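For reference, the pieces involved can be inspected directly; what each returns is platform-dependent, which is exactly the ambiguity described above:

```r
## "~" expansion in R: path.expand() substitutes R's idea of the home
## directory, which differs across platforms (and changed on Windows
## around R 2.5.0, as noted above).
p <- path.expand("~/data/file.csv")
p                                      # home-relative path with "~" replaced
Sys.getenv("HOME")                     # what R treats as home on this system
normalizePath("~", mustWork = FALSE)   # absolute form, platform-specific
```

On Windows the three can disagree with each other and with what a Cygwin shell would expand, so code meant to be portable should avoid relying on "~".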
Re: [R] In which package is the errbar command located?
help.search("errbar") shows the one installed on my system (and might on yours), but RSiteSearch("errbar") shows that more than one package contains a function by that name. On Fri, 25 May 2007, Judith Flores wrote: > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > Charles C. Berry (858) 534-2098 Dept of Family/Preventive Medicine E mailto:[EMAIL PROTECTED] UC San Diego http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0901 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
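The two search routes can be sketched as follows; RSiteSearch() needs a network connection and a browser, so it is only shown commented out, and the Hmisc reference in the last comment is one possibility among the several packages that define errbar:

```r
## Finding which package defines a function:
res <- help.search("errbar")   # searches packages installed locally
class(res)                     # "hsearch" object; print(res) lists the hits
## RSiteSearch("errbar")       # searches CRAN packages (opens a browser)
## Once the package is known (e.g. Hmisc), load it and call the function:
## library(Hmisc)
## errbar(1:3, c(1, 2, 3), yplus = c(1.5, 2.5, 3.5), yminus = c(0.5, 1.5, 2.5))
```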
Re: [R] Calculation of ratio distribution properties
Mike, Attached is an R function to do this, along with an example that will reproduce the MathCad plot shown in your attached paper. I haven't checked it thoroughly, but it seems to reproduce the MathCad example well. Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mike Lawrence Sent: Friday, May 25, 2007 1:55 PM To: Lucke, Joseph F Cc: Rhelp Subject: Re: [R] Calculation of ratio distribution properties According to the paper I cited, there is controversy over the sufficiency of Hinkley's solution, hence their proposed more complete solution. On 25-May-07, at 2:45 PM, Lucke, Joseph F wrote: > The exact ratio is given in > > On the Ratio of Two Correlated Normal Random Variables, D. V. > Hinkley, Biometrika, Vol. 56, No. 3. (Dec., 1969), pp. 635-639. > -- Mike Lawrence Graduate Student, Department of Psychology, Dalhousie University Website: http://myweb.dal.ca/mc973993 Public calendar: http://icalx.com/public/informavore/Public "The road to wisdom? Well, it's plain and simple to express: Err and err and err again, but less and less and less." - Piet Hein __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lme with corAR1 errors - can't find AR coefficient in output
Millo, On 5/24/07, Millo Giovanni <[EMAIL PROTECTED]> wrote: > Dear List, > > I am using the output of a ML estimation on a random effects model with > first-order autocorrelation to make a further conditional test. My model > is much like this (which reproduces the method on the famous Grunfeld > data, for the econometricians out there it is Table 5.2 in Baltagi): > > library(Ecdat) > library(nlme) > data(Grunfeld) > mymod<-lme(inv~value+capital,data=Grunfeld,random=~1|firm,correlation=corAR1(0,~year|firm)) > > Embarrassing as it may be, I can find the autoregressive parameter > ('Phi', if I get it right) in the printout of summary(mymod) but I am > utterly unable to locate the corresponding element in the lme or > summary.lme objects. > > Any help appreciated. This must be something stupid I'm overlooking, > either in str(mymod) or in the help files, but it's a huge problem for > me. > > Try coef(mymod$modelStruct$corStruct, unconstrained = FALSE) Stephen -- Rochester, Minn. USA __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
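For completeness, a self-contained sketch of the extraction (assuming the Ecdat and nlme packages are installed; the fitted correlation structure lives in the lme object's modelStruct component):

```r
library(Ecdat)   # for the Grunfeld data
library(nlme)
data(Grunfeld)
mymod <- lme(inv ~ value + capital, data = Grunfeld,
             random = ~ 1 | firm,
             correlation = corAR1(0, ~ year | firm))
## unconstrained = FALSE returns the natural-scale parameter, i.e. the
## 'Phi' that summary(mymod) prints
phi <- coef(mymod$modelStruct$corStruct, unconstrained = FALSE)
phi
```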
Re: [R] iplots problem
Did you load Sun's java? Try that and also try rebooting your machine. This worked for me. mister_bluesman wrote: > > Hi. I try to load iplots using the following commands > >> library(rJava) >> library(iplots) > > but then I get the following error: > > Error in .jinit(cp, parameters = "-Xmx512m", silent = TRUE) : > Cannot create Java Virtual Machine > Error in library(iplots) : .First.lib failed for 'iplots' > > What do I have to do to correct this? > > I have jdk1.6 and jre1.6 installed on my windows machine > Thanks > -- View this message in context: http://www.nabble.com/iplots-problem-tf3815516.html#a10808131 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] /tmp/ gets filled up fast
Dear useRs, I'm running some pretty big R scripts using a PBS that calls upon the RMySQL library and it's filling up the /tmp/ directory with Rtmp* files. I thought the problem might have come from scripts crashing and therefore not getting around to executing the dbDisconnect() function, but I just ran a script that exited correctly and it still left a giant file in /tmp/ without cleaning up after itself. I'm running all of my scripts using the --vanilla tag (saving the data I need either into a separate .RData file or into the MySQL database). Is there some other tag I should be using or some command I can put at the end of my script to remove the Rtmp file when it's done? Thank you, -Alessandro -- Forwarded message -- From: Yaroslav Halchenko Date: May 24, 2007 7:42 PM Subject: Re: mysql> Access denied for user To: Alessandro Gagliardi Alessandro We need to resolve the issue somehow better way... /tmp/ gets filled up fast with your R tasks... *$> du -scm RtmpvKEqwr/ 804 RtmpvKEqwr/ 804 total [EMAIL PROTECTED]:/tmp $> ls -ld RtmpvKEqwr/ 0 drwx-- 2 eklypse eklypse 80 2007-05-24 15:18 RtmpvKEqwr// [EMAIL PROTECTED]:/tmp $> df . Filesystem 1K-blocks Used Available Use% Mounted on /dev/sda6 1052184 1052184 0 100% /tmp I had to remove it... check if there is a way to use some other directory may be? or properly clean up?? On Wed, 23 May 2007, Alessandro Gagliardi wrote: > Well, this is a real problem then. Because I'm generating tables that > R cannot get into MySQL because they are too big and it times out > before it's done. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
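Two workarounds worth trying, as a sketch: R honours the TMPDIR environment variable at startup, and each session's Rtmp* directory is reported by tempdir(). (The /scratch path below is hypothetical.)

```r
## 1. Point temporary files at a roomier filesystem by setting TMPDIR
##    *before* launching R, e.g. in the PBS script:
##      export TMPDIR=/scratch/$USER   # hypothetical path
## 2. Or clean up explicitly as the last step of a batch script:
tempdir()                            # the session's Rtmp* directory
unlink(tempdir(), recursive = TRUE)  # remove it before the script exits
```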
Re: [R] normality tests [Broadcast]
Thank you all for your replies, they have been more than useful... well in my case I have chosen to do some parametric tests (more precisely correlation and linear regressions among some variables)... so it would be nice if I had an extra bit of support on my decisions... If I understood well from all your replies... I shouldn't pay so much attention to the normality tests, so it wouldn't matter which one/ones I use to report... but rather focus on issues such as the power of the test... Thanks again. On 25/05/07, Lucke, Joseph F <[EMAIL PROTECTED]> wrote: > Most standard tests, such as t-tests and ANOVA, are fairly resistant to > non-normality for significance testing. It's the sample means that have > to be normal, not the data. The CLT kicks in fairly quickly. Testing > for normality prior to choosing a test statistic is generally not a good > idea. > > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of Liaw, Andy > Sent: Friday, May 25, 2007 12:04 PM > To: [EMAIL PROTECTED]; Frank E Harrell Jr > Cc: r-help > Subject: Re: [R] normality tests [Broadcast] > > From: [EMAIL PROTECTED] > > > > On 25/05/07, Frank E Harrell Jr <[EMAIL PROTECTED]> wrote: > > > [EMAIL PROTECTED] wrote: > > > > Hi all, > > > > > > > > apologies for seeking advice on a general stats question. I ve run > > > > > normality tests using 8 different methods: > > > > - Lilliefors > > > > - Shapiro-Wilk > > > > - Robust Jarque Bera > > > > - Jarque Bera > > > > - Anderson-Darling > > > > - Pearson chi-square > > > > - Cramer-von Mises > > > > - Shapiro-Francia > > > > > > > > All show that the null hypothesis that the data come from a normal > > > > > distro cannot be rejected. Great. However, I don't think > > it looks nice > > > > to report the values of 8 different tests on a report. One note is > > > > > that my sample size is really tiny (less than 20 > > independent cases).
> > > > Without wanting to start a flame war, are there any > > advices of which > > > > one/ones would be more appropriate and should be reported > > (along with > > > > a Q-Q plot). Thank you. > > > > > > > > Regards, > > > > > > > > > > Wow - I have so many concerns with that approach that it's > > hard to know > > > where to begin. But first of all, why care about > > normality? Why not > > > use distribution-free methods? > > > > > > You should examine the power of the tests for n=20. You'll probably > > > > find it's not good enough to reach a reliable conclusion. > > > > And wouldn't it be even worse if I used non-parametric tests? > > I believe what Frank meant was that it's probably better to use a > distribution-free procedure to do the real test of interest (if there is > one) instead of testing for normality, and then use a test that assumes > normality. > > I guess the question is, what exactly do you want to do with the outcome > of the normality tests? If those are going to be used as basis for > deciding which test(s) to do next, then I concur with Frank's > reservation. > > Generally speaking, I do not find goodness-of-fit for distributions very > useful, mostly for the reason that failure to reject the null is no > evidence in favor of the null. It's difficult for me to imagine why > "there's insufficient evidence to show that the data did not come from a > normal distribution" would be interesting. > > Andy > > > > > > > > Frank > > > > > > > > > -- > > > Frank E Harrell Jr Professor and Chair School > > of Medicine > > > Department of Biostatistics > > Vanderbilt University > > > > > > > > > -- > > yianni > > > > __ > > R-help@stat.math.ethz.ch mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. 
> > > > > > > > > > -- > Notice: This e-mail message, together with any > attachments,...{{dropped}} > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- yianni __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
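Frank's suggestion upthread to examine the power of the tests at n = 20 can be made concrete with a small simulation; a rough sketch using Shapiro-Wilk against a clearly skewed (exponential) alternative:

```r
## estimate the power of shapiro.test() at n = 20, alpha = 0.05,
## when the data are actually exponential rather than normal
set.seed(1)
pow <- mean(replicate(2000, shapiro.test(rexp(20))$p.value < 0.05))
pow   # well short of 1, so a non-rejection at n = 20 is weak evidence
```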
[R] iPlots package
I am having trouble connecting two points in iplot. In the normal plot command I would use segments(). I know there is a function ilines() but can you just enter coordinates of 2 points? -- View this message in context: http://www.nabble.com/iPlots-package-tf3817683.html#a10808180 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] In which package is the errbar command located?
Hmisc --- Judith Flores <[EMAIL PROTECTED]> wrote: __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
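A minimal usage sketch, assuming Hmisc is installed; errbar(x, y, yplus, yminus) draws points with error bars (the data below are made up):

```r
library(Hmisc)
x  <- 1:4
y  <- c(2, 4, 3, 5)
se <- c(0.4, 0.3, 0.5, 0.2)   # made-up standard errors
errbar(x, y, yplus = y + se, yminus = y - se)
```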
Re: [R] Calculation of ratio distribution properties
According to the paper I cited, there is controversy over the sufficiency of Hinkley's solution, hence their proposed more complete solution. On 25-May-07, at 2:45 PM, Lucke, Joseph F wrote: > The exact ratio is given in > > On the Ratio of Two Correlated Normal Random Variables, D. V. > Hinkley, Biometrika, Vol. 56, No. 3. (Dec., 1969), pp. 635-639. > -- Mike Lawrence Graduate Student, Department of Psychology, Dalhousie University Website: http://myweb.dal.ca/mc973993 Public calendar: http://icalx.com/public/informavore/Public "The road to wisdom? Well, it's plain and simple to express: Err and err and err again, but less and less and less." - Piet Hein __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] xyplot: different scales across rows, same scales within rows
On 5/25/07, Marta Rufino <[EMAIL PROTECTED]> wrote: > Dear list members, > > > I would like to set up a multiple panel in xyplots, with the same scale > for all colunms in each row, but different accross rows. > relation="free" would set up all x or y scales free... which is not what > I want :-( > > Is this possible? It's possible, but requires some abuse of the Trellis design, which doesn't really allow for such use. See https://stat.ethz.ch/pipermail/r-help/2004-October/059396.html -Deepayan __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] xyplot: different scales across rows, same scales within rows
xlim= can take a list: # CO2 is built into R library(lattice) xlim <- rep(list(c(0, 1000), c(0, 2000)), each = 2) xyplot(uptake ~ conc | Type * Treatment, data = CO2, scales = list(relation = "free"), xlim = xlim) On 5/25/07, Marta Rufino <[EMAIL PROTECTED]> wrote: > Dear list members, > > > I would like to set up a multiple panel in xyplots, with the same scale > for all colunms in each row, but different accross rows. > relation="free" would set up all x or y scales free... which is not what > I want :-( > > Is this possible? > > > Thank you in advance, > Best wishes, > Marta > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] normality tests [Broadcast]
Most standard tests, such as t-tests and ANOVA, are fairly resistant to non-normality for significance testing. It's the sample means that have to be normal, not the data. The CLT kicks in fairly quickly. Testing for normality prior to choosing a test statistic is generally not a good idea. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Liaw, Andy Sent: Friday, May 25, 2007 12:04 PM To: [EMAIL PROTECTED]; Frank E Harrell Jr Cc: r-help Subject: Re: [R] normality tests [Broadcast] From: [EMAIL PROTECTED] > > On 25/05/07, Frank E Harrell Jr <[EMAIL PROTECTED]> wrote: > > [EMAIL PROTECTED] wrote: > > > Hi all, > > > > > > apologies for seeking advice on a general stats question. I ve run > > > normality tests using 8 different methods: > > > - Lilliefors > > > - Shapiro-Wilk > > > - Robust Jarque Bera > > > - Jarque Bera > > > - Anderson-Darling > > > - Pearson chi-square > > > - Cramer-von Mises > > > - Shapiro-Francia > > > > > > All show that the null hypothesis that the data come from a normal > > > distro cannot be rejected. Great. However, I don't think > it looks nice > > > to report the values of 8 different tests on a report. One note is > > > that my sample size is really tiny (less than 20 > independent cases). > > > Without wanting to start a flame war, are there any > advices of which > > > one/ones would be more appropriate and should be reported > (along with > > > a Q-Q plot). Thank you. > > > > > > Regards, > > > > > > > Wow - I have so many concerns with that approach that it's > hard to know > > where to begin. But first of all, why care about > normality? Why not > > use distribution-free methods? > > > > You should examine the power of the tests for n=20. You'll probably > > find it's not good enough to reach a reliable conclusion. > > And wouldn't it be even worse if I used non-parametric tests?
I believe what Frank meant was that it's probably better to use a distribution-free procedure to do the real test of interest (if there is one) instead of testing for normality, and then use a test that assumes normality. I guess the question is, what exactly do you want to do with the outcome of the normality tests? If those are going to be used as basis for deciding which test(s) to do next, then I concur with Frank's reservation. Generally speaking, I do not find goodness-of-fit for distributions very useful, mostly for the reason that failure to reject the null is no evidence in favor of the null. It's difficult for me to imagine why "there's insufficient evidence to show that the data did not come from a normal distribution" would be interesting. Andy > > > > Frank > > > > > > -- > > Frank E Harrell Jr Professor and Chair School > of Medicine > > Department of Biostatistics > Vanderbilt University > > > > > -- > yianni > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > > -- Notice: This e-mail message, together with any attachments,...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] In which package is the errbar command located?
__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Calculation of ratio distribution properties
MathCad code failed to attach last time. Here it is. On 25-May-07, at 2:24 PM, Mike Lawrence wrote: I came across this reference: http://www.informaworld.com/smpp/content?content=10.1080/03610920600683689 The authors sent me code (attached with permission) in MathCad to perform the calculations in which I'm interested. However, I do not have MathCad nor experience with its syntax, so I thought I'd send the code to the list to see if anyone with more experience with R and MathCad would be interested in making this code into a function or package of some sort. Mike Begin forwarded message: From: "Noyan Turkkan" <[EMAIL PROTECTED]> Date: May 25, 2007 11:09:02 AM ADT To: "Mike Lawrence" <[EMAIL PROTECTED]> Subject: Re: R code for 'Density of the Ratio of Two Normal Random Variables'? Hi Mike I do not know if anyone coded my approach in R. However if you have access to MathCad, I am including a MathCad file (also in PDF) which computes the density of the ratio of 2 dependent normal variables, its mean & variance. If you do not have access to MathCad, you will see that all the computations can be easily programmed in R, as I replaced the Hypergeometric function by the Erf function. I am not very familiar with R but the erf function may be programmed as : (erf <- function(x) 2*pnorm(x *sqrt(2)) - 1). Good luck. Noyan Turkkan, ing. Professeur titulaire & directeur / Professor & Head Dépt. de génie civil / Civil Eng. Dept. Faculté d'ingénierie / Faculty of Engineering Université de Moncton Moncton, N.B., Canada, E1A 3E9 >>> Mike Lawrence <[EMAIL PROTECTED]> 05/25/07 9:20 am >>> Hi Dr. Turkkan, I am working on a problem that necessitates the estimation of the mean and variance of the density of two dependent normal random variables and in my search for methods to achieve such estimation I came across your paper 'Density of the Ratio of Two Normal Random Variables' (2006).
I'm not a statistics or math expert by any means, but I am quite familiar with the R programming language; do you happen to know whether anyone has coded your approach for R yet? Cheers, Mike -- Mike Lawrence Graduate Student, Department of Psychology, Dalhousie University Website: http://myweb.dal.ca/mc973993 Public calendar: http://icalx.com/public/informavore/Public "The road to wisdom? Well, it's plain and simple to express: Err and err and err again, but less and less and less." - Piet Hein -- Mike Lawrence Graduate Student, Department of Psychology, Dalhousie University Website: http://myweb.dal.ca/mc973993 Public calendar: http://icalx.com/public/informavore/Public "The road to wisdom? Well, it's plain and simple to express: Err and err and err again, but less and less and less." - Piet Hein __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student, Department of Psychology, Dalhousie University Website: http://myweb.dal.ca/mc973993 Public calendar: http://icalx.com/public/informavore/Public "The road to wisdom? Well, it's plain and simple to express: Err and err and err again, but less and less and less." - Piet Hein __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
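Turkkan's erf() one-liner from the forwarded mail, spelled out and sanity-checked; erf relates to the normal CDF via erf(x) = 2*pnorm(x*sqrt(2)) - 1:

```r
## error function via the normal CDF, as suggested in the forwarded mail
erf <- function(x) 2 * pnorm(x * sqrt(2)) - 1
erf(0)    # 0
erf(Inf)  # 1
erf(1)    # ~0.8427, the textbook value
```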
Re: [R] File path expansion
On Fri, 25 May 2007, Martin Maechler wrote: > >> path.expand("~") > [1] "/home/maechler" Yes, but beware that may not do what you want on Windows in R <= 2.5.0, since someone changed the definition of 'home' but not path.expand. > >> "RobMcG" == McGehee, Robert <[EMAIL PROTECTED]> >> on Fri, 25 May 2007 11:44:27 -0400 writes: > >RobMcG> R-Help, >RobMcG> I discovered a "mis-feature" is ghostscript, which is used by the > bitmap >RobMcG> function. It seems that specifying file names in the form > "~/abc.png" >RobMcG> rather than "/home/directory/abc.png" causes my GS to crash when I > open >RobMcG> the bitmap device on my Linux box. > >RobMcG> The easiest solution would seem to be to intercept any file names > in the >RobMcG> form "~/abc.png" and replace the "~" with the user's home > directory. I'm >RobMcG> sure I could come up with something involving regular expressions > and >RobMcG> system calls to do this in Linux, but even that might not be system >RobMcG> independent. So, I wanted to see if anyone knew of a native R > solution >RobMcG> of converting "~" to its full path expansion. > >RobMcG> Thanks, >RobMcG> Robert > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] normality tests [Broadcast]
From: [EMAIL PROTECTED] > > On 25/05/07, Frank E Harrell Jr <[EMAIL PROTECTED]> wrote: > > [EMAIL PROTECTED] wrote: > > > Hi all, > > > > > > apologies for seeking advice on a general stats question. I ve run > > > normality tests using 8 different methods: > > > - Lilliefors > > > - Shapiro-Wilk > > > - Robust Jarque Bera > > > - Jarque Bera > > > - Anderson-Darling > > > - Pearson chi-square > > > - Cramer-von Mises > > > - Shapiro-Francia > > > > > > All show that the null hypothesis that the data come from a normal > > > distro cannot be rejected. Great. However, I don't think > it looks nice > > > to report the values of 8 different tests on a report. One note is > > > that my sample size is really tiny (less than 20 > independent cases). > > > Without wanting to start a flame war, are there any > advices of which > > > one/ones would be more appropriate and should be reported > (along with > > > a Q-Q plot). Thank you. > > > > > > Regards, > > > > > > > Wow - I have so many concerns with that approach that it's > hard to know > > where to begin. But first of all, why care about > normality? Why not > > use distribution-free methods? > > > > You should examine the power of the tests for n=20. You'll probably > > find it's not good enough to reach a reliable conclusion. > > And wouldn't it be even worse if I used non-parametric tests? I believe what Frank meant was that it's probably better to use a distribution-free procedure to do the real test of interest (if there is one) instead of testing for normality, and then use a test that assumes normality. I guess the question is, what exactly do you want to do with the outcome of the normality tests? If those are going to be used as basis for deciding which test(s) to do next, then I concur with Frank's reservation. Generally speaking, I do not find goodness-of-fit for distributions very useful, mostly for the reason that failure to reject the null is no evidence in favor of the null. 
It's difficult for me to imagine why "there's insufficient evidence to show that the data did not come from a normal distribution" would be interesting. Andy > > > > Frank > > > > > > -- > > Frank E Harrell Jr Professor and Chair School > of Medicine > > Department of Biostatistics > Vanderbilt University > > > > > -- > yianni > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > > -- Notice: This e-mail message, together with any attachments,...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] xyplot: different scales across rows, same scales within rows
Dear list members, I would like to set up a multiple panel in xyplots, with the same scale for all columns in each row, but different across rows. relation="free" would set up all x or y scales free... which is not what I want :-( Is this possible? Thank you in advance, Best wishes, Marta __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] windows to unix
Martin Maechler wrote: >> "Erin" == Erin Hodgess <[EMAIL PROTECTED]> >> on Fri, 25 May 2007 06:10:10 -0500 writes: > > Erin> Dear R People: > Erin> Is there any way to take a Windows version of R, compiled from > source, > Erin> compress it, and put it on a Unix-like environment, please? > Just 'zip' the corresponding directory and copy the zip > file to your "unix like" environment. > > You can take a Windows-compiled R to Unix, but you can't make it work The big unasked question is 'What is this unix-like environment?'. Linux isn't Unix, so maybe you mean that, in which case you'll not make your Windows compiled R run. Not without 'Wine' or some other layer of obfuscation. Barry __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] normality tests
On 25/05/07, Frank E Harrell Jr <[EMAIL PROTECTED]> wrote: > [EMAIL PROTECTED] wrote: > > Hi all, > > > > apologies for seeking advice on a general stats question. I ve run > > normality tests using 8 different methods: > > - Lilliefors > > - Shapiro-Wilk > > - Robust Jarque Bera > > - Jarque Bera > > - Anderson-Darling > > - Pearson chi-square > > - Cramer-von Mises > > - Shapiro-Francia > > > > All show that the null hypothesis that the data come from a normal > > distro cannot be rejected. Great. However, I don't think it looks nice > > to report the values of 8 different tests on a report. One note is > > that my sample size is really tiny (less than 20 independent cases). > > Without wanting to start a flame war, are there any advices of which > > one/ones would be more appropriate and should be reported (along with > > a Q-Q plot). Thank you. > > > > Regards, > > > > Wow - I have so many concerns with that approach that it's hard to know > where to begin. But first of all, why care about normality? Why not > use distribution-free methods? > > You should examine the power of the tests for n=20. You'll probably > find it's not good enough to reach a reliable conclusion. And wouldn't it be even worse if I used non-parametric tests? > > Frank > > > -- > Frank E Harrell Jr Professor and Chair School of Medicine > Department of Biostatistics Vanderbilt University > -- yianni __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] File path expansion
> path.expand("~") [1] "/home/maechler" > "RobMcG" == McGehee, Robert <[EMAIL PROTECTED]> > on Fri, 25 May 2007 11:44:27 -0400 writes: RobMcG> R-Help, RobMcG> I discovered a "mis-feature" is ghostscript, which is used by the bitmap RobMcG> function. It seems that specifying file names in the form "~/abc.png" RobMcG> rather than "/home/directory/abc.png" causes my GS to crash when I open RobMcG> the bitmap device on my Linux box. RobMcG> The easiest solution would seem to be to intercept any file names in the RobMcG> form "~/abc.png" and replace the "~" with the user's home directory. I'm RobMcG> sure I could come up with something involving regular expressions and RobMcG> system calls to do this in Linux, but even that might not be system RobMcG> independent. So, I wanted to see if anyone knew of a native R solution RobMcG> of converting "~" to its full path expansion. RobMcG> Thanks, RobMcG> Robert __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] normality tests
[EMAIL PROTECTED] wrote: > Hi all, > > apologies for seeking advice on a general stats question. I ve run > normality tests using 8 different methods: > - Lilliefors > - Shapiro-Wilk > - Robust Jarque Bera > - Jarque Bera > - Anderson-Darling > - Pearson chi-square > - Cramer-von Mises > - Shapiro-Francia > > All show that the null hypothesis that the data come from a normal > distro cannot be rejected. Great. However, I don't think it looks nice > to report the values of 8 different tests on a report. One note is > that my sample size is really tiny (less than 20 independent cases). > Without wanting to start a flame war, are there any advices of which > one/ones would be more appropriate and should be reported (along with > a Q-Q plot). Thank you. > > Regards, > Wow - I have so many concerns with that approach that it's hard to know where to begin. But first of all, why care about normality? Why not use distribution-free methods? You should examine the power of the tests for n=20. You'll probably find it's not good enough to reach a reliable conclusion. Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] windows to unix
> "Erin" == Erin Hodgess <[EMAIL PROTECTED]> > on Fri, 25 May 2007 06:10:10 -0500 writes: Erin> Dear R People: Erin> Is there any way to take a Windows version of R, compiled from source, Erin> compress it, and put it on a Unix-like environment, please? Since nobody has answered yet, let a die-hard non-windows user try : Just 'zip' the corresponding directory and copy the zip file to your "unix like" environment. I assume the only things this does not contain would be the - registry entries (which used to be optional anyway; I'm not sure if that's still true) - desktop links to R - startup menu links to R but the last two can easily be recreated after people copy the zip file and unpack it in "their windows environment" -- which I assume is the purpose of the whole procedure.. {Please reply to R-help; not me, I am *the* windows-non-expert ..} Martin Erin> thanks in advance, Erin> Sincerely, Erin> Erin Hodgess Erin> Associate Professor Erin> Department of Computer and Mathematical Sciences Erin> University of Houston - Downtown Erin> mailto: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] trouble with snow and Rmpi
Dear Erin, What operating system are you trying this on? Windows? In Linux you definitely don't need MPICH2 but, rather, LAM/MPI. Best, R. On 5/25/07, Erin Hodgess <[EMAIL PROTECTED]> wrote: > Dear R People: > > I am having some trouble with the snow package. > > It requires MPICH2 and Rmpi. > > Rmpi is fine. However, I downloaded the MPICH2 package, and installed. > > There is no mpicc, mpirun, etc. > > Does anyone have any suggestions, please? > > Thanks in advance! > > Sincerely, > Erin Hodgess > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: [EMAIL PROTECTED] > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Ramon Diaz-Uriarte Statistical Computing Team Structural Biology and Biocomputing Programme Spanish National Cancer Centre (CNIO) http://ligarto.org/rdiaz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
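For reference, once a working MPI installation (LAM/MPI on Linux) and the Rmpi package are in place, the snow side looks roughly like this; a sketch, not tied to any particular scheduler:

```r
library(snow)
## type = "MPI" makes snow launch its workers through Rmpi
cl <- makeCluster(2, type = "MPI")
clusterCall(cl, function() Sys.info()[["nodename"]])  # one entry per worker
stopCluster(cl)
```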
Re: [R] File path expansion
Try ?path.expand On 5/25/07, McGehee, Robert <[EMAIL PROTECTED]> wrote: > R-Help, > I discovered a "mis-feature" in ghostscript, which is used by the bitmap > function. It seems that specifying file names in the form "~/abc.png" > rather than "/home/directory/abc.png" causes my GS to crash when I open > the bitmap device on my Linux box. > > The easiest solution would seem to be to intercept any file names in the > form "~/abc.png" and replace the "~" with the user's home directory. I'm > sure I could come up with something involving regular expressions and > system calls to do this in Linux, but even that might not be system > independent. So, I wanted to see if anyone knew of a native R solution > of converting "~" to its full path expansion. > > Thanks, > Robert > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
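[Editor's note: the suggested base-R solution can be demonstrated directly; the file name "~/abc.png" is just the poster's illustration.]

```r
## path.expand() is the native R way to turn a leading "~" into the
## user's home directory; the file name here is only an illustration.
p <- path.expand("~/abc.png")
print(p)   # e.g. "/home/robert/abc.png" on a typical Linux box

## Names without a "~" pass through unchanged.
stopifnot(identical(path.expand("abc.png"), "abc.png"))
```

Passing the expanded name to bitmap() sidesteps the ghostscript problem without any regular expressions or system calls.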
[R] File path expansion
R-Help, I discovered a "mis-feature" in ghostscript, which is used by the bitmap function. It seems that specifying file names in the form "~/abc.png" rather than "/home/directory/abc.png" causes my GS to crash when I open the bitmap device on my Linux box. The easiest solution would seem to be to intercept any file names in the form "~/abc.png" and replace the "~" with the user's home directory. I'm sure I could come up with something involving regular expressions and system calls to do this in Linux, but even that might not be system independent. So, I wanted to see if anyone knew of a native R solution of converting "~" to its full path expansion. Thanks, Robert __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Read in 250K snp chips
bhiggs wrote: > I'm having trouble getting summaries out of the 250K snp chips in R. I'm > using the oligo package and when I attempt to create the necessary SnpQSet > object (to get genotype calls and intensities) using snprma, I encounter > memory issues. > > Anyone have an alternative package or workaround for these large snp chips? Oligo is a Bioconductor package, so you should probably direct questions on the Bioconductor-help listserve rather than R-help. Best, Jim -- James W. MacDonald, M.S. Biostatistician Affymetrix and cDNA Microarray Core University of Michigan Cancer Center 1500 E. Medical Center Drive 7410 CCGC Ann Arbor MI 48109 734-647-5623 ** Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Cochran-Armitage
Hi, I have to calculate a Cochran-Armitage test on some data. I don't know much about this test, so I have a few questions on the independence_test function. I have read the help and searched the archives but this doesn't help me...
- There are two ways: either with a formula (independence_test(tumor ~ dose, data = lungtumor, teststat = "quad")) or with a table (independence_test(table(lungtumor$dose, lungtumor$tumor), teststat = "quad")). Why this difference? The two ways do not give the same results... Why?
- In the teststat option, what is the difference between "quad", "max" and "scalar"? What is generally used?
- Why choose an approximate or asymptotic distribution if an exact test can be calculated?
For a classical Cochran-Armitage test (nothing special was specified so I guess I have to do the most classical Cochran-Armitage test), which options should I use? Thanks for your help. Delphine Fontaine __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
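[Editor's note: for the "most classical" Cochran-Armitage trend test, base R's stats::prop.trend.test is a useful cross-check against coin::independence_test. The event counts, group sizes, and dose scores below are made-up illustration data, not the poster's lungtumor data.]

```r
## Classical Cochran-Armitage trend test via base R's prop.trend.test().
## The counts and scores below are made-up illustration data.
events <- c(3, 7, 13)     # tumours observed at each dose level
totals <- c(50, 50, 50)   # animals per dose level
scores <- c(0, 1, 2)      # dose scores (default would be seq_along(events))

ct <- prop.trend.test(events, totals, score = scores)
print(ct$p.value)

## The test statistic is chi-squared with 1 degree of freedom,
## matching the usual Cochran-Armitage formulation.
stopifnot(ct$parameter == 1)
```

With equally spaced scores this should agree closely with the asymptotic coin::independence_test result for the same table.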
[R] Read in 250K snp chips
I'm having trouble getting summaries out of the 250K snp chips in R. I'm using the oligo package and when I attempt to create the necessary SnpQSet object (to get genotype calls and intensities) using snprma, I encounter memory issues. Anyone have an alternative package or workaround for these large snp chips? -- View this message in context: http://www.nabble.com/Read-in-250K-snp-chips-tf3816761.html#a10805124 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Help with complex lme model fit
Hi R helpers, I'm trying to fit a rather complex model to some simulated data using lme and am not getting the correct results. It seems there might be some identifiability issues that could possibly be dealt with by specifying starting parameters - but I can't see how to do this. I'm comparing results from R to those obtained when using GenStat... The raw data are available on the web at http://cmbeale.freehostia.com/OutData.txt and can be read directly into R using:

gpdat <- read.table("http://cmbeale.freehostia.com/OutData.txt", header = TRUE)
gpdat$X7 <- as.factor(gpdat$X7)
gpdat$X4 <- as.factor(gpdat$X4)
rand_mat <- as.matrix(gpdat[,11:26])
gpdat <- groupedData(Y1 ~ X1 + X2 + X3 + X4 + X5 + m_sum | .g, data = gpdat)

the model fitted using:

library(Matrix)
library(nlme)
m_sum <- rowSums(gpdat[,11:27])
mod1 <- lme(fixed = Y1 ~ X1 + X2 + X3 + X4 + X5 + m_sum, random = pdBlocked(list(pdIdent(~1), pdIdent(~ X6 - 1), pdIdent(~ X7 - 1), pdIdent(~ rand_mat - 1))), data = gpdat)

Which should recover the variance components:

var_label var_est
rand_mat_scalar 0.00021983
X6_scalar 0.62314002
X7_scalar 0.03853604

as recovered by GenStat and used to generate the dataset. Instead I get:

X6 0.6231819
X7 0.05221481
rand_mat 1.377596e-11

However, if I change or drop either of X5 or X6, I then get much closer estimates to what is expected. For example:

mod2 <- lme(fixed = Y1 ~ X1 + X2 + X3 + X4 + X5 + m_sum, random = pdBlocked(list(pdIdent(~1), pdIdent(~ X6 - 1), pdIdent(~ as.numeric(X7) - 1), pdIdent(~ rand_mat - 1))), data = gpdat)

returns variance components:

X6 0.6137986
X7 Not meaningful
rand_mat 0.0006119088

which is much closer to those used to generate the dataset for the parameters that are now meaningful, and has appropriate random effect estimates for the rand_mat columns (the variable of most interest here). This suggests to me that there is some identifiability issue that might be helped by giving different starting values. Is this possible?
Or does anyone have any other suggestions? Thanks, Colin sessionInfo: R version 2.5.0 (2007-04-23) i386-pc-mingw32 locale: LC_COLLATE=English_United Kingdom.1252; LC_CTYPE=English_United Kingdom.1252; LC_MONETARY=English_United Kingdom.1252; LC_NUMERIC=C; LC_TIME=English_United Kingdom.1252 attached base packages: [1] "stats" "graphics" "grDevices" "datasets" "tcltk" "utils" "methods" "base" other attached packages: nlme Matrix lattice svSocket svIO R2HTML svMisc svIDE "3.1-80" "0.9975-11" "0.15-5" "0.9-5" "0.9-5" "1.58" "0.9-5" "0.9-5" Dr. Colin Beale Spatial Ecologist The Macaulay Institute Craigiebuckler Aberdeen AB15 8QH UK Tel: 01224 498245 ext. 2427 Fax: 01224 311556 Email: [EMAIL PROTECTED] -- Please note that the views expressed in this e-mail are those of the sender and do not necessarily represent the views of the Macaulay Institute. This email and any attachments are confidential and are intended solely for the use of the recipient(s) to whom they are addressed. If you are not the intended recipient, you should not read, copy, disclose or rely on any information contained in this e-mail, and we would ask you to contact the sender immediately and delete the email from your system. Thank you. Macaulay Institute and Associated Companies, Macaulay Drive, Craigiebuckler, Aberdeen, AB15 8QH. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Make check failure for R-2.4.1
> Adam> Here is the results: > >>> xB <- c(2000,1e6,1e50,Inf) >>> for(df in c(0.1, 1, 10)) > Adam> +for(ncp in c(0, 1, 10, 100)) > Adam> +print(pchisq(xB, df=df, ncp=ncp), digits == 15) > Adam> Error in print.default(pchisq(xB, df = df, ncp = ncp), digits == 15) > : > Adam> object "digits" not found > > well, that's a typo - I think - you should have been able to fix > (I said "something like" ...). > Just do replace the '==' by '=' Sorry my R is very limited... Here is the output with an '=' instead > xB <- c(2000,1e6,1e50,Inf) > for(df in c(0.1, 1, 10)) +for(ncp in c(0, 1, 10, 100)) +print(pchisq(xB, df=df, ncp=ncp), digits = 15) [1] 1 1 1 1 [1] 1 1 1 1 [1] 1 1 1 1 [1] 1 1 1 1 [1] 1 1 1 1 [1] 1 1 1 1 [1] 1 1 1 1 [1] 1 1 1 1 [1] 1 1 1 1 [1] 1 1 1 1 [1] 1 1 1 1 [1] 1 1 1 1 Thanks again Adam __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Make check failure for R-2.4.1
> "Adam" == Adam Witney <[EMAIL PROTECTED]> > on Fri, 25 May 2007 14:48:18 +0100 writes: >>> ##-- non central Chi^2 : >>> xB <- c(2000,1e6,1e50,Inf) >>> for(df in c(0.1, 1, 10)) >> + for(ncp in c(0, 1, 10, 100)) stopifnot(pchisq(xB, df=df, ncp=ncp) ==1) >> Error: pchisq(xB, df = df, ncp = ncp) == 1 is not all TRUE >> Execution halted >> >> Ok, thanks; >> so, if we want to learn more, we need >> the output of something like >> >> xB <- c(2000,1e6,1e50,Inf) >> for(df in c(0.1, 1, 10)) >> for(ncp in c(0, 1, 10, 100)) >> print(pchisq(xB, df=df, ncp=ncp), digits == 15) Adam> Here is the results: >> xB <- c(2000,1e6,1e50,Inf) >> for(df in c(0.1, 1, 10)) Adam> +for(ncp in c(0, 1, 10, 100)) Adam> +print(pchisq(xB, df=df, ncp=ncp), digits == 15) Adam> Error in print.default(pchisq(xB, df = df, ncp = ncp), digits == 15) : Adam> object "digits" not found well, that's a typo - I think - you should have been able to fix (I said "something like" ...). Just do replace the '==' by '=' Martin Adam> Thanks again... Adam> adam __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with numerical integration and optimization with BFGS
Ravi, Thanks a lot for your detailed suggestions. I will certainly look at the links that you have sent and the package "mnormt". For the moment, I have managed to analytically integrate the expression using "pnorm" along the lines suggested by Prof. Ripley yesterday. For instance, my first integral becomes the following: f1 <- function(w1,w0) { a <- 1/(2*(1-rho2^2)*sigep^2) b <- (rho2*(w1-w0+delta))/((1-rho2^2)*sigep*sigeta) c <- ((w1-w0+delta)^2)/(2*(1-rho2^2)*sigeta^2) d <- muep k <- 2*pi*sigep*sigeta*(sqrt(1-rho2^2)) b1 <- ((-2*a*d - b)^2)/(4*a) - a*d^2 - b*d - c b21 <- sqrt(a)*(w1-rho1*w0) + (-2*a*d - b)/(2*sqrt(a)) b22 <- sqrt(a)*(-w1+rho1*w0) + (-2*a*d - b)/(2*sqrt(a)) b31 <- 2*pnorm(b21*sqrt(2)) - 1 # ERROR FUNCTION b32 <- 2*pnorm(b22*sqrt(2)) - 1 # ERROR FUNCTION b33 <- as.numeric(w1-rho1*w0>=0)*(b31-b32) return(sqrt(pi)*(1/(2*k*sqrt(a)))*exp(b1)*b33) } for (i in 2:n) { out1 <- f1(y[i],y[i-1]) } I have worked out similar expressions for the other two integrals also. Deepankar On Fri, 2007-05-25 at 09:56 -0400, Ravi Varadhan wrote: > Deepankar, > > If the problem seems to be in the evaluation of numerical quadrature part, > you might want to try quadrature methods that are better suited to > integrands with strong peaks. The traditional Gaussian quadrature methods, > even their adaptive versions such as Gauss-Kronrod, are not best suited for > integrating because they do not explicitly account for the "peakedness" of > the integrand, and hence can be inefficient and inaccurate. See the article > below: > http://citeseer.ist.psu.edu/cache/papers/cs/18996/http:zSzzSzwww.sci.wsu.edu > zSzmathzSzfacultyzSzgenzzSzpaperszSzmvn.pdf/genz92numerical.pdf > > Alan Genz has worked on this problem a lot and has a number of computational > tools available. I used some of them when I was working on computing Bayes > factors for binomial regression models with different link functions. 
If > you are interested, check the following: > > http://www.math.wsu.edu/faculty/genz/software/software.html. > > For your immediate needs, there is an R package called "mnormt" that has a > function for computing integrals under a multivariate normal (and > multivariate t) densities, which is actually based on Genz's Fortran > routines. You could try that. > > Ravi. > > > > --- > > Ravi Varadhan, Ph.D. > > Assistant Professor, The Center on Aging and Health > > Division of Geriatric Medicine and Gerontology > > Johns Hopkins University > > Ph: (410) 502-2619 > > Fax: (410) 614-9625 > > Email: [EMAIL PROTECTED] > > Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html > > > > > > > > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of Deepankar Basu > Sent: Friday, May 25, 2007 12:02 AM > To: Prof Brian Ripley > Cc: r-help@stat.math.ethz.ch > Subject: Re: [R] Problem with numerical integration and optimization with > BFGS > > Prof. Ripley, > > The code that I provided with my question of course does not contain > code for the derivatives; but I am supplying analytical derivatives in > my full program. I did not include that code with my question because > that would have added about 200 more lines of code without adding any > new information relevant for my question. The problem that I had pointed > to occurs whether I provide analytical derivatives or not to the > optimization routine. And the problem was that when I use the "BFGS" > method in optim, I get an error message saying that the integrals are > probably divergent; I know, on the other hand, that the integrals are > convergent. The same problem does not arise when I instead use the > Nelder-Mead method in optim. > > Your suggestion that the expression can be analytically integrated > (which will involve "pnorm") might be correct though I do not see how to > do that. 
The integrands are the bivariate normal density functions with > one variable replaced by known quantities while I integrate over the > second. > > For instance, the first integral is as follows: the integrand is the > bivariate normal density function (with general covariance matrix) where > the second variable has been replaced by > y[i] - rho1*y[i-1] + delta > and I integrate over the first variable; the range of integration is > lower=-y[i]+rho1*y[i-1] > upper=y[i]-rho1*y[i-1] > > The other two integrals are very similar. It would be of great help if > you could point out how to integrate the expressions analytically using > "pnorm". > > Thanks. > Deepankar > > > On Fri, 2007-05-25 at 04:22 +0100, Prof Brian Ripley wrote: > > You are trying to use a derivative-based optimization method without > > supplying derivatives. Th
Re: [R] Problem with numerical integration and optimization with BFGS
Deepankar, If the problem seems to be in the evaluation of numerical quadrature part, you might want to try quadrature methods that are better suited to integrands with strong peaks. The traditional Gaussian quadrature methods, even their adaptive versions such as Gauss-Kronrod, are not best suited for integrating because they do not explicitly account for the "peakedness" of the integrand, and hence can be inefficient and inaccurate. See the article below: http://citeseer.ist.psu.edu/cache/papers/cs/18996/http:zSzzSzwww.sci.wsu.edu zSzmathzSzfacultyzSzgenzzSzpaperszSzmvn.pdf/genz92numerical.pdf Alan Genz has worked on this problem a lot and has a number of computational tools available. I used some of them when I was working on computing Bayes factors for binomial regression models with different link functions. If you are interested, check the following: http://www.math.wsu.edu/faculty/genz/software/software.html. For your immediate needs, there is an R package called "mnormt" that has a function for computing integrals under a multivariate normal (and multivariate t) densities, which is actually based on Genz's Fortran routines. You could try that. Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Deepankar Basu Sent: Friday, May 25, 2007 12:02 AM To: Prof Brian Ripley Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Problem with numerical integration and optimization with BFGS Prof. Ripley, The code that I provided with my question of course does not contain code for the derivatives; but I am supplying analytical derivatives in my full program. 
I did not include that code with my question because that would have added about 200 more lines of code without adding any new information relevant for my question. The problem that I had pointed to occurs whether I provide analytical derivatives or not to the optimization routine. And the problem was that when I use the "BFGS" method in optim, I get an error message saying that the integrals are probably divergent; I know, on the other hand, that the integrals are convergent. The same problem does not arise when I instead use the Nelder-Mead method in optim. Your suggestion that the expression can be analytically integrated (which will involve "pnorm") might be correct though I do not see how to do that. The integrands are the bivariate normal density functions with one variable replaced by known quantities while I integrate over the second. For instance, the first integral is as follows: the integrand is the bivariate normal density function (with general covariance matrix) where the second variable has been replaced by y[i] - rho1*y[i-1] + delta and I integrate over the first variable; the range of integration is lower=-y[i]+rho1*y[i-1] upper=y[i]-rho1*y[i-1] The other two integrals are very similar. It would be of great help if you could point out how to integrate the expressions analytically using "pnorm". Thanks. Deepankar On Fri, 2007-05-25 at 04:22 +0100, Prof Brian Ripley wrote: > You are trying to use a derivative-based optimization method without > supplying derivatives. This will use numerical approoximations to the > derivatives, and your objective function will not be suitable as it is > internally using adaptive numerical quadrature and hence is probably not > close enough to a differentiable function (it may well have steps). > > I believe you can integrate analytically (the answer will involve pnorm), > and that you can also find analytical derivatives. 
> > Using (each of) numerical optimization and integration is a craft, and it > seems you need to know more about it. The references on ?optim are too > advanced I guess, so you could start with Chapter 16 of MASS and its > references. > > On Thu, 24 May 2007, Deepankar Basu wrote: > > > Hi R users, > > > > I have a couple of questions about some problems that I am facing with > > regard to numerical integration and optimization of likelihood > > functions. Let me provide a little background information: I am trying > > to do maximum likelihood estimation of an econometric model that I have > > developed recently. I estimate the parameters of the model using the > > monthly US unemployment rate series obtained from the Federal Reserve > > Bank of St. Louis. (The data is freely available from their web-based > > database called FRED-II). > > > > For my model, the likelihood function for each observation is the sum of > > three integrals. The integrand in each
Re: [R] Make check failure for R-2.4.1
>> ##-- non central Chi^2 : >> xB <- c(2000,1e6,1e50,Inf) >> for(df in c(0.1, 1, 10)) > + for(ncp in c(0, 1, 10, 100)) stopifnot(pchisq(xB, df=df, ncp=ncp) ==1) > Error: pchisq(xB, df = df, ncp = ncp) == 1 is not all TRUE > Execution halted > > Ok, thanks; > so, if we want to learn more, we need > the output of something like > > xB <- c(2000,1e6,1e50,Inf) > for(df in c(0.1, 1, 10)) >for(ncp in c(0, 1, 10, 100)) >print(pchisq(xB, df=df, ncp=ncp), digits == 15) Here are the results: > xB <- c(2000,1e6,1e50,Inf) > for(df in c(0.1, 1, 10)) +for(ncp in c(0, 1, 10, 100)) +print(pchisq(xB, df=df, ncp=ncp), digits == 15) Error in print.default(pchisq(xB, df = df, ncp = ncp), digits == 15) : object "digits" not found Thanks again... adam __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] normality tests
Hi all, apologies for seeking advice on a general stats question. I ve run normality tests using 8 different methods: - Lilliefors - Shapiro-Wilk - Robust Jarque Bera - Jarque Bera - Anderson-Darling - Pearson chi-square - Cramer-von Mises - Shapiro-Francia All show that the null hypothesis that the data come from a normal distro cannot be rejected. Great. However, I don't think it looks nice to report the values of 8 different tests on a report. One note is that my sample size is really tiny (less than 20 independent cases). Without wanting to start a flame war, are there any advices of which one/ones would be more appropriate and should be reported (along with a Q-Q plot). Thank you. Regards, -- yianni __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Competing Risks Analysis
I am working on a competing risks problem, specifically an analysis of cause-specific mortality. I am familiar with the cmprsk package and have used it before to create cumulative incidence plots. I also came across an old (1998) s-news post from Dr. Terry Therneau describing a way to use coxph to model competing risks. I am re-producing the post at the bottom of this message. I would like to know if this approach is still reasonable or are there other ways to go now. I did an RSiteSearch with the term "competing risks" and found some interesting articles but nothing as specific as the post below. - S-news Article Begins - Competing risks It's actually quite easy. Assume a data set with n subjects and 4 types of competing events. Then create a data set with 4n observations First n obs: the data set you would create for an analysis of "time to event type 1", where all other event types are censored. An extra variable "etype" is =1. Second n obs: the data set you would create for "time to event type 2", with etype=2 . . . Then fit <- coxph(Surv(time,status) ~ + strata(etype), 1. Wei, Lin, and Weissfeld apply this to data sets where the competing risks are not necessarily exclusive, i.e., time to progression and time to death for cancer patients. JASA 1989, 1065-1073. If a given subject can have more than one "event", then you need to use the sandwich estimate of variance, obtained by adding ".. + cluster(id).." to the model statement above, where "id" is variable unique to each subject. (The method of fitting found in WLW, namely to do individual fits and then glue the results together, is not necessary). 2. If a given subject can have at most one event, then it is not clear that the sandwich estimate of variance is necessary. See Lunn and McNeil, Biometrics (year?) for an example. 3. The covariates can be coded any way you like. 
WLW put in all of the strata * covariate interactions for instance (the "x" coef is different for each event type), but I never seem to have a big enough sample to justify doing this. Lunn and McNeil use a certain coding of the treatment effect, so that the betas are a contrast of interest to them; I've used similar things but never that particular one. 4. "etype" doesn't have to be 1,2,3,... of course; etype= 'paper', 'scissors', 'stone', 'pc' would work as well. Terry M. Therneau, Ph.D. - S-news Article Ends - -- Kevin E. Thorpe Biostatistician/Trialist, Knowledge Translation Program Assistant Professor, Department of Public Health Sciences Faculty of Medicine, University of Toronto email: [EMAIL PROTECTED] Tel: 416.864.5776 Fax: 416.864.6057 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Make check failure for R-2.4.1
> "Adam" == Adam Witney <[EMAIL PROTECTED]> > on Fri, 25 May 2007 09:38:29 +0100 writes: Adam> Thanks for your replies Details inline below: Adam> On 24/5/07 17:12, "Martin Maechler" <[EMAIL PROTECTED]> wrote: >>> "UweL" == Uwe Ligges <[EMAIL PROTECTED]> >>> on Thu, 24 May 2007 17:34:16 +0200 writes: >> UweL> Some of these test are expected from time to time, since they are >> using UweL> random numbers. Just re-run. >> >> eehm, "some of these", yes, but not the ones Adam mentioned, >> d-p-q-r-tests.R. >> >> Adam, if you want more info you should report to us the *end* >> (last dozen of lines) of >> your d-p-q-r-tests.Rout[.fail] file. Adam> Ok, here they are... [1] TRUE TRUE TRUE TRUE > > ##-- non central Chi^2 : > xB <- c(2000,1e6,1e50,Inf) > for(df in c(0.1, 1, 10)) + for(ncp in c(0, 1, 10, 100)) stopifnot(pchisq(xB, df=df, ncp=ncp) ==1) Error: pchisq(xB, df = df, ncp = ncp) == 1 is not all TRUE Execution halted Ok, thanks; so, if we want to learn more, we need the output of something like xB <- c(2000,1e6,1e50,Inf) for(df in c(0.1, 1, 10)) for(ncp in c(0, 1, 10, 100)) print(pchisq(xB, df=df, ncp=ncp), digits == 15) UweL> BTW: We do have R-2.5.0 these days. >> >> Indeed! >> >> And gcc 2.95.4 is also very old. >> Maybe you've recovered an old compiler / math-library bug from >> that antique compiler suite ? Adam> Yes, maybe I should start think about upgrading this box! yes, at least "start" ... ;-) Adam> Thanks again Adam> adam __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R-About PLSR
On Fri, 2007-05-25 at 17:25 +0530, Nitish Kumar Mishra wrote: > hi R help group, > I have installed PLS package in R and use it for princomp & prcomp > commands for calculating PCA using its example file(USArrests example). > But How I can use PLS for Partial least square, R square, mvrCv one more > think how i can import external file in R. When I use plsr, R2, RMSEP it > show error could not find function plsr, RMSEP etc. > How I can calculate PLS, R2, RMSEP, PCR, MVR using pls package in R. > Thanking you Did you load the package with: library(pls) Before you tried to use the functions you mention? HTH G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R-About PLSR
hi R help group, I have installed PLS package in R and use it for princomp & prcomp commands for calculating PCA using its example file(USArrests example). But How I can use PLS for Partial least square, R square, mvrCv one more think how i can import external file in R. When I use plsr, R2, RMSEP it show error could not find function plsr, RMSEP etc. How I can calculate PLS, R2, RMSEP, PCR, MVR using pls package in R. Thanking you -- Nitish Kumar Mishra Junior Research Fellow BIC, IMTECH, Chandigarh, India E-Mail Address: [EMAIL PROTECTED] [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] lmer and scale parameter in glmm model
Hi all, I am trying to fit a glmm with a binomial distribution and I would like to verify that the scale parameter is close to 1... The lmer function gives the following result: Estimated scale (compare to 1 ) 0.766783 But I would like to know how this estimate (0.766783) is computed, and whether it is possible to reproduce it from the different results given by the function lmer. Thanks, Olivier. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] iplots problem
Hi. I try to load iplots using the following commands > library(rJava) > library(iplots) but then I get the following error: Error in .jinit(cp, parameters = "-Xmx512m", silent = TRUE) : Cannot create Java Virtual Machine Error in library(iplots) : .First.lib failed for 'iplots' What do I have to do to correct this? Thanks -- View this message in context: http://www.nabble.com/iplots-problem-tf3815516.html#a10801096 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] windows to unix
Dear R People: Is there any way to take a Windows version of R, compiled from source, compress it, and put it on a Unix-like environment, please? thanks in advance, Sincerely, Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] trouble with snow and Rmpi
Dear R People: I am having some trouble with the snow package. It requires MPICH2 and Rmpi. Rmpi is fine. However, I downloaded the MPICH2 package, and installed. There is no mpicc, mpirun, etc. Does anyone have any suggestions, please? Thanks in advance! Sincerely, Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] in unix opening data object created under win
--- [EMAIL PROTECTED] wrote: > On Unix R's version is 2.3.1 and on PC its 2.4.1. > > I dont have the rights to install newer version of R > on Unix. > > I tried different upload methods. No one worked. > > On Unix it looks as follows (dots to hide my > userid): > > > > load("/afs/ir/users/../project/ps/data/dtaa") > > head(dtaa) > hospid mfpi1 mfpi2 mfpi3 mfpi4 mfpi5 > mfpi6 mfpi7 mfpi8 > NA9 0.1428571 1 0.5 0.2857143 0.50 > 0.0 0.333 0 > 4041 9 0.1428571 0 0.0 0.2857143 0.25 > 0.2 0.000 0 > mfpi9 > NA 0.333 > 4041 1.000 > > > > The data comes through but its screwed up. > > Thanks for your help. > > Toby Hi Toby, Except that the rest of the data is not showing up why do you say the data is screwed up? I don't know what your data should look like but the two rows above look "okay". If you are having a compatability problem with the two version of R, something you might want to try is downloading 2.4.1 or 2.5.0 and installing it on a USB stick. You can run R quite nicely from a USB and it might indicate if it is R or a corrupt file that is the problem. > > Liaw, Andy wrote: > > What are the versions of R on the two platform? > Is the version on Unix > > at least as new as the one on Windows? > > > > Andy > > > > From: [EMAIL PROTECTED] > > > >>Hi All > >> > >>I am saving a dataframe in my MS-Win R with > save(). > >>Then I copy it onto my personal AFS space. > >>Then I start R and run it with emacs and load() > the data. > >>It loads only 2 lines: head() shows only two lines > nrow() als > >>say it has only 2 > >>lines, I get error message, when trying to use > this data > >>object, saying that > >>some row numbers are missing. > >>If anyone had similar situation, I appreciate > letting me know. > >> > >>Best Toby __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to mimic plot=F for truehist?
By defining your own function. You can print a function's body by typing its name at the R prompt and pressing Enter. Copy and paste the body into a plain-text file (R source code), modify it as you like, for example by adding the desired argument and the code that processes it, then source() that file and use your customized function. Johannes Graumann-2 wrote: > > Dear Rologists, > > In order to combine plots I need to get access to some of the "par"s specific > to my plot prior to replotting it with modified parameters. I have not found > any option like "plot=F" associated with truehist and would like to know > whether someone can point out how to overcome this problem. > > -- View this message in context: http://www.nabble.com/how-to-mimic-plot%3DF-for-truehist--tf3815196.html#a10800310 Sent from the R help mailing list archive at Nabble.com.
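[A minimal sketch of two routes, neither confirmed by the thread: base hist() already supports plot = FALSE and returns the breaks/counts, and in recent R a null pdf device lets you read the "par"s truehist() actually used without showing a plot:]

```r
library(MASS)

## Option 1: hist(..., plot = FALSE) returns an object with breaks,
## counts and density, without drawing anything.
x <- rnorm(100)
h <- hist(x, plot = FALSE)
str(h)

## Option 2: draw truehist() on a null pdf device (recent R versions),
## then read the user coordinate ranges it actually set.
pdf(NULL)
truehist(x)
usr <- par("usr")   # c(xmin, xmax, ymin, ymax) used by the plot
dev.off()
usr
```

Otherwise, copying and editing the body of truehist() as suggested above remains the general-purpose route.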
Re: [R] L1-SVM
> Is there a package in R to find L1-SVM. I did search and found svmpath > but not sure if it is what I need. Try kernlab __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
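[For what it's worth, a hedged sketch of fitting an SVM with kernlab's ksvm(); whether its formulation matches the poster's intended "L1-SVM" (L1 hinge loss versus L1-regularized SVM) is for the poster to judge:]

```r
library(kernlab)

data(iris)
## Standard soft-margin C-classification SVM with a linear kernel;
## this uses the usual L1 (hinge) loss on the slack variables.
fit <- ksvm(Species ~ ., data = iris, kernel = "vanilladot", C = 1)
fit
```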
[R] testing difference (or similarities) between two distance matrices (not independent)
Hi, I'm looking to test whether two distance matrices are statistically different from each other. These two matrices have been computed on the same set of data (several population samples): 1. using a particular genetic distance 2. weighting that genetic distance with an extra factor (you can think of this as one set computed before applying a treatment and the second one after applying that particular treatment ... a similar kind of situation). Both matrices are obviously not independent of each other, so the Mantel test and other correlation tests do not apply here. I thought of testing the order of values between these matrices (if distances are ordered the same way in both matrices, the matrices are very similar and the additional factor has essentially no effect on the calculation). Is there any package or function in R that allows one to do that and to test it statistically (with permutations or another approach)? I checked the mailing lists but did not find anything on the problem I'm trying to solve. Thanks for your help and insights on this problem Stéphane Buhler
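[One way to make the proposed order-based comparison concrete — a sketch, not an existing packaged test, and arguably still a rank-flavoured Mantel-style permutation, which the poster may or may not accept: correlate the ranks of the two lower triangles, then permute the sample labels of one matrix to build a reference distribution. The data below are random stand-ins.]

```r
set.seed(1)
n  <- 10
d1 <- dist(matrix(rnorm(n * 4), n))   # stand-in for the genetic distances
d2 <- dist(matrix(rnorm(n * 4), n))   # stand-in for the weighted distances

## Observed agreement of the two distance orderings
obs <- cor(as.vector(d1), as.vector(d2), method = "spearman")

## Reference distribution: permute the sample labels of the second matrix
perm <- replicate(999, {
  p <- sample(n)
  m <- as.matrix(d2)[p, p]
  cor(as.vector(d1), as.vector(as.dist(m)), method = "spearman")
})
pval <- (sum(perm >= obs) + 1) / (length(perm) + 1)
pval
```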
[R] how to mimic plot=F for truehist?
Dear Rologists, In order to combine plots I need to get access to some of the "par"s specific to my plot prior to replotting it with modified parameters. I have not found any option like "plot=F" associated with truehist and would like to know whether someone can point out how to overcome this problem. Thanks, Joh
Re: [R] Speeding up resampling of rows from a large matrix
That's beautiful. For the full 120 x 65,000 matrix your approach took 85 seconds. A truly remarkable improvement over my 80 minutes! Thank you! __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Question concerning "pastecs" package
Dear Philippe Thanks a lot for your information - it is extremely useful Rainer Philippe Grosjean wrote: > Hello, > > I already answered privately to your question. No, there is no > translation of pastecs.pdf. > [rest of the explanation snipped; it is quoted in full in Philippe Grosjean's message below] Rainer M. Krug wrote: >> Hi >> >> I just installed the pastecs package and I am wondering: is there an >> english (or german) translation of the file pastecs.pdf? If not, is >> there an explanation somewhere of the object of type 'turnpoints' as a >> result of turnpoints(), especially the "info" field? >> >> Thanks, >> >> Rainer >> -- NEW EMAIL ADDRESS AND ADDRESS: [EMAIL PROTECTED] [EMAIL PROTECTED] WILL BE DISCONTINUED END OF MARCH Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation Biology (UCT) Leslie Hill Institute for Plant Conservation University of Cape Town Rondebosch 7701 South Africa Fax: +27 - (0)86 516 2782 Fax: +27 - (0)21 650 2440 (w) Cell: +27 - (0)83 9479 042 Skype: RMkrug email: [EMAIL PROTECTED] [EMAIL PROTECTED]
[R] Off topic: S.E. for cross validation
Hi, I'm performing (blocked) 10-fold cross-validation of a several time series forecasting methods, measuring their mean squared error (MSE). I know that the MSE_cv is the average over the 10 MSEs. Is there a way to calculate the standard error as well? The usual SD/sqrt(n) formula probably doesn't apply here as the 10 observations aren't independent. Thanks, Gad -- Gad Abraham Department of Mathematics and Statistics The University of Melbourne Parkville 3010, Victoria, Australia email: [EMAIL PROTECTED] web: http://www.ms.unimelb.edu.au/~gabraham __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
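[For concreteness, the naive computation the poster is questioning — with the caveat, as the poster notes, that dependence between blocked folds makes this SE approximate at best. The per-fold MSE values below are made up for illustration:]

```r
## Ten per-fold mean squared errors (illustrative values, not real data)
mse <- c(1.2, 0.9, 1.5, 1.1, 1.0, 1.3, 0.8, 1.4, 1.2, 1.1)

mse_cv   <- mean(mse)                        # the cross-validated MSE
se_naive <- sd(mse) / sqrt(length(mse))      # SD/sqrt(n): assumes independent folds
c(mse_cv = mse_cv, se = se_naive)
```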
Re: [R] Question concerning "pastecs" package
Hello, I already answered privately to your question. No, there is no translation of pastecs.pdf. The English documentation is accessible, as usual, by: ?turnpoints Regarding your specific question, 'info' is the quantity of information I associated with the turning points: I = -log2 P(t) where P is the probability to observe a turning point at time t under the null hypothesis that the time series is purely random, and thus, the distribution of turning points follows a normal distribution with: E(p) = 2/3*(n-2) var(p) = (16*n - 29)/90 with p, the number of observed turning points and n the number of observations. Ibanez (1982, in French, sorry... not my fault!) demonstrated that P(t) is: P(t) = 2*(1/n(t-1)! * (n-1)!) As you can easily imagine, from this point on, it is straightforward to construct a test to determine if the series is random (regarding the distribution of the turning points), more or less monotonic (more or less turning points than expected), See also the ref cited in the online help (Kendall 1976). References: --- Ibanez, F., 1982. Sur une nouvelle application de la théorie de l'information à la description des séries chronologiques planctoniques. J. Exp. Mar. Biol. Ecol., 4:619-632 Kendall, M.G., 1976. Time-series, 2nd ed. Charles Griffin & Co, London Best, Philippe Grosjean ..<°}))>< ) ) ) ) ) ( ( ( ( (Prof. Philippe Grosjean ) ) ) ) ) ( ( ( ( (Numerical Ecology of Aquatic Systems ) ) ) ) ) Mons-Hainaut University, Belgium ( ( ( ( ( .. Rainer M. Krug wrote: > Hi > > I just installed the pastecs package and I am wondering: is there an > english (or german) translation of the file pastecs.pdf? If not, is > there an explanation somewhere of the object of type 'turnpoints' as a > result of turnpoints(), especially the "info" field? 
> > Thanks, > > Rainer > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
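[A sketch of the turning-point randomness test implied by the moments quoted above — my own illustrative implementation of the stated formulas, not the pastecs code:]

```r
## Under H0 (iid series):  E(p) = 2/3 * (n - 2),  var(p) = (16*n - 29)/90
turnpoint_test <- function(x) {
  n  <- length(x)
  dx <- diff(x)
  ## a turning point occurs where consecutive differences change sign
  ## (assumes no tied consecutive values)
  p  <- sum(dx[-length(dx)] * dx[-1] < 0)
  z  <- (p - 2/3 * (n - 2)) / sqrt((16 * n - 29) / 90)
  c(turnpoints = p, z = z, p.value = 2 * pnorm(-abs(z)))
}

set.seed(42)
turnpoint_test(rnorm(200))   # for a random series the p-value is typically large
```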
Re: [R] Why might X11() not be found?
On Fri, 25-May-2007 at 08:25AM +1200, Patrick Connolly wrote: |> > sessionInfo() |> R version 2.5.0 (2007-04-23) |> x86_64-unknown-linux-gnu |> |> locale: |> |> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C |> |> attached base packages: |> [1] "utils""stats""graphics" "methods" "base" Well, in fact it was very simple. There's no "package:grDevices" in there. Now, why that didn't happen before, I'm yet to work out. Thanks for the suggestions. best -- ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. ___Patrick Connolly {~._.~} Great minds discuss ideas _( Y )_Middle minds discuss events (:_~*~_:)Small minds discuss people (_)-(_) . Anon ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
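[For anyone hitting the same error, the fix implied above is simply to attach the missing package; grDevices is normally attached at startup, so its absence suggests a stripped-down startup profile (e.g. R_DEFAULT_PACKAGES):]

```r
library(grDevices)  # provides X11(), pdf(), and the other devices
## X11()            # should now be found (needs a running X server)
```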
Re: [R] Make check failure for R-2.4.1
Thanks for your replies. Details inline below: On 24/5/07 17:12, "Martin Maechler" <[EMAIL PROTECTED]> wrote: >> "UweL" == Uwe Ligges <[EMAIL PROTECTED]> >> on Thu, 24 May 2007 17:34:16 +0200 writes: > > UweL> Some of these test failures are expected from time to time, since they are > using > UweL> random numbers. Just re-run. > > eehm, "some of these", yes, but not the ones Adam mentioned, > d-p-q-r-tests.R. > > Adam, if you want more info you should report to us the *end* > (last dozen of lines) of > your d-p-q-r-tests.Rout[.fail] file. Ok, here they are... [1] TRUE TRUE TRUE TRUE > > ##-- non central Chi^2 : > xB <- c(2000,1e6,1e50,Inf) > for(df in c(0.1, 1, 10)) + for(ncp in c(0, 1, 10, 100)) stopifnot(pchisq(xB, df=df, ncp=ncp) == 1) Error: pchisq(xB, df = df, ncp = ncp) == 1 is not all TRUE Execution halted > UweL> BTW: We do have R-2.5.0 these days. > > Indeed! > > And gcc 2.95.4 is also very old. > Maybe you've recovered an old compiler / math-library bug from > that antique compiler suite ? Yes, maybe I should start thinking about upgrading this box! Thanks again adam
Re: [R] Speeding up resampling of rows from a large matrix
Here is a possibility. The only catch is that if a pair of rows is selected twice you will get the results in a block, not scattered at random throughout the columns of G. I can't see that as a problem. ### --- start code excerpt --- nSNPs <- 1000 H <- matrix(sample(0:1, 120*nSNPs , replace=T), nrow=120) # G <- matrix(0, nrow=3, ncol=nSNPs) # Keep in mind that the real H is 120 x 65000 ij <- as.matrix(subset(expand.grid(i = 1:120, j = 1:120), i < j)) nResamples <- 3000 sel <- sample(1:nrow(ij), nResamples, rep = TRUE) repf <- table(sel) # replication factors ij <- ij[as.numeric(names(repf)), ] # distinct choice made G <- matrix(0, nrow = 3, ncol = nrow(ij)) # for now for(j in 1:ncol(G)) G[,j] <- rowSums(outer(0:2, colSums(H[ij[j, ], ]), "==")) G <- G[, rep(1:ncol(G), repf)] # bulk up the result # _ # _ # _ # _pair <- replicate(nResamples, sample(1:120, 2)) # _ # _gen <- function(x){g <- sum(x); c(g==0, g==1, g==2)} # _ # _for (i in 1:nResamples){ # _G <- G + apply(H[pair[,i],], 2, gen) # _} ### --- end of code excerpt --- I did a timing on my machine which is a middle-of-the range windows monstrosity... 
> system.time({ + + nSNPs <- 1000 + H <- matrix(sample(0:1, 120*nSNPs , replace=T), nrow=120) + + # G <- matrix(0, nrow=3, ncol=nSNPs) + + # Keep in mind that the real H is 120 x 65000 + + ij <- as.matrix(subset(expand.grid(i = 1:120, j = 1:120), i < j)) + + nResamples <- 3000 + sel <- sample(1:nrow(ij), nResamples, rep = TRUE) + repf <- table(sel) # replication factors + ij <- ij[as.numeric(names(repf)), ] # distinct choice made + + G <- matrix(0, nrow = 3, ncol = nrow(ij)) # for now + + for(j in 1:ncol(G)) + G[,j] <- rowSums(outer(0:2, colSums(H[ij[j, ], ]), "==")) + + G <- G[, rep(1:ncol(G), repf)] # bulk up the result + + # _ + # _ + # _ + # _pair <- replicate(nResamples, sample(1:120, 2)) + # _ + # _gen <- function(x){g <- sum(x); c(g==0, g==1, g==2)} + # _ + # _for (i in 1:nResamples){ + # _G <- G + apply(H[pair[,i],], 2, gen) + # _} + # _#-- - + # _ + }) user system elapsed 0.97 0.00 0.99 Less than a second. Somewhat of an improvement on the 80 minutes, I reckon. This will increase, of course, when you step the size of the H matrix up from 1000 to 65000 columns. Bill Venables CSIRO Laboratories PO Box 120, Cleveland, 4163 AUSTRALIA Office Phone (email preferred): +61 7 3826 7251 Fax (if absolutely necessary): +61 7 3826 7304 Mobile: (I don't have one!) Home Phone: +61 7 3286 7700 mailto:[EMAIL PROTECTED] http://www.cmis.csiro.au/bill.venables/ -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Juan Pablo Lewinger Sent: Friday, 25 May 2007 4:04 PM To: r-help@stat.math.ethz.ch Subject: [R] Speeding up resampling of rows from a large matrix I'm trying to: Resample with replacement pairs of distinct rows from a 120 x 65,000 matrix H of 0's and 1's. For each resampled pair sum the resulting 2 x 65,000 matrix by column: 0 1 0 1 ... + 0 0 1 1 ... ___ = 0 1 1 2 ... For each column accumulate the number of 0's, 1's and 2's over the resamples to obtain a 3 x 65,000 matrix G.
For those interested in the background, H is a matrix of haplotypes, each pair of haplotypes forms a genotype, and each column corresponds to a SNP. I'm using resampling to compute the null distribution of the maximum over correlated SNPs of a simple statistic. The code: #--- nSNPs <- 1000 H <- matrix(sample(0:1, 120*nSNPs , replace=T), nrow=120) G <- matrix(0, nrow=3, ncol=nSNPs) # Keep in mind that the real H is 120 x 65000 nResamples <- 3000 pair <- replicate(nResamples, sample(1:120, 2)) gen <- function(x){g <- sum(x); c(g==0, g==1, g==2)} for (i in 1:nResamples){ G <- G + apply(H[pair[,i],], 2, gen) } #--- The problem is that the loop takes about 80 mins to complete and I need to repeat the whole thing 10,000 times, which would then take over a year and a half! Is there a way to speed this up so that the full 10,000 iterations take a reasonable amount of time (say a week)? My machine has an Intel Xeon 3.40GHz CPU with 1GB of RAM > sessionInfo() R version 2.5.0 (2007-04-23) i386-pc-mingw32 I would greatly appreciate any help. Juan Pablo Lewinger Department of Preventive Medicine Keck School of Medicine University of Southern California __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mail
Re: [R] Running R in Bash and R GUI
There are two things that occur. Firstly, I normally have to unset no_proxy: %> unset no_proxy; R Secondly, if for some reason http_proxy isn't being seen in R, you can use the Sys.putenv() function within R to manipulate the environment -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED] Sent: 24 May 2007 16:05 To: R-help@stat.math.ethz.ch Subject: [R] Running R in Bash and R GUI I have been trying to get the R and package update functions in the GUI version of R to work on my Mac. Initially I got error messages that suggested I needed to set up the http_proxy for GUI R to use, but how can this be done? I eventually got to the point of writing a .bash_profile file in the Bash terminal and setting the proxy addresses there. I can now use my Bash terminal, invoke R, and run the update / install commands and they work! The problem that still remains is that in the R console of the GUI R, the http_proxy is not seen and thus I cannot connect to CRAN or any other mirror using the GUI functions in the pull-down menus. I get > update.packages () Warning: unable to access index for repository http://cran.uk.r- project.org/bin/macosx/universal/contrib/2.5 > Basically it still seems unable to access port 80. Is there a way of solving this so that I can use both terminals rather than just everything through Bash? Thanks Steve Hodgkinson University of Brighton __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
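[A sketch of the in-R workaround mentioned above. The proxy URL is a placeholder, not a real address; note also that Sys.putenv(), current at the time of this thread, was later renamed Sys.setenv():]

```r
## Set the proxy for the current R session before contacting CRAN.
## "http://proxy.example.ac.uk:8080" is a hypothetical placeholder.
Sys.setenv(http_proxy = "http://proxy.example.ac.uk:8080")
## (older R versions: Sys.putenv(http_proxy = "..."))

Sys.getenv("http_proxy")   # confirm it is visible to the session
## then retry, e.g.:
## update.packages()
```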