Re: [R] different results form summarization by loop and sum or rowMeans function
On Thu, 11 Sep 2008, Markus Schmidberger wrote:

> Hi,
>
> I found different results calculating the row means with the function
> rowMeans() and with a simple for-loop. The differences are very low.
> But after this

Indeed, but the C code (rowMeans) is likely to be more accurate, as it uses an extended-precision accumulator.

> calculation I will start some optimization algorithms (BFGS or CG), and
> there I get huge differences (from the small changes in the beginning or
> start values; I changed nothing else in the code).
> How can I avoid these differences between sum-loops and sum-functions?

You cannot. What you can do is work on making what you do with these inputs numerically stable: unless you do so, your end results will have very little value. (For example, are you finding different local minima, in which case you need to decide how to treat that possibility?)

I suggest reading an introductory book on Numerical Analysis, or

Monahan, J. F. (2001) Numerical Methods of Statistics. Cambridge: Cambridge University Press. Chapter 2.

or

Press, W. H., Teukolsky, S. A., Vetterling, W. T. and Flannery, B. P. (2007) Numerical Recipes: The Art of Scientific Computing. Third Edition. Cambridge: Cambridge University Press. Section 1.1 (I think).

> Attached is a small test code using data from Bioconductor.
>
> Best
> Markus
>
> library(affy)
> data(affybatch.example)
> mat <- exprs(affybatch.example)[1:100, 1:3]
> mat <- exp(1) * mat
> mat <- asinh(mat)
>
> rowM1 <- rowMeans(mat)
>
> t <- rep(0, 100)  # vector of zeros
> for (i in 1:100) {
>   for (j in 1:3)
>     t[i] <- t[i] + mat[i, j]
> }
> rowM2 <- t / 3
>
> m1 <- mat - rowM1
> m2 <- mat - rowM2
>
> print(m1 - m2)
>
> sessionInfo()
> R version 2.7.1 (2008-06-23)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=German_Germany.1252;LC_CTYPE=German_Germany.1252;LC_MONETARY=German_Germany.1252;LC_NUMERIC=C;LC_TIME=German_Germany.1252
>
> attached base packages:
> [1] tools stats graphics grDevices utils datasets methods
> [8] base
>
> other attached packages:
> [1] affy_1.18.2 preprocessCore_1.2.0 affyio_1.8.0
> [4] Biobase_2.0.1
>
> --
> Dipl.-Tech. Math. Markus Schmidberger
>
> Ludwig-Maximilians-Universität München
> IBE - Institut für medizinische Informationsverarbeitung,
> Biometrie und Epidemiologie
> Marchioninistr. 15, D-81377 Muenchen
> URL: http://ibe.web.med.uni-muenchen.de
> Mail: Markus.Schmidberger [at] ibe.med.uni-muenchen.de
> Tel: +49 (089) 7095 - 4599

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
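The accumulator effect described in the reply above can be reproduced without Bioconductor. What follows is only a minimal sketch, assuming a synthetic matrix of random values in place of the affy data; the names rel_diff and the seed are illustrative and do not come from the thread. It compares rowMeans() with a plain double-precision loop and expresses the largest relative discrepancy in units of machine epsilon.

```r
# Synthetic stand-in for the expression matrix (assumption: random values,
# not the affybatch.example data used in the original post).
set.seed(1)
mat <- asinh(exp(1) * matrix(runif(300, 50, 5000), nrow = 100, ncol = 3))

# Row means from the built-in C implementation.
rowM1 <- rowMeans(mat)

# Row means accumulated in ordinary double precision, as in the posted loop.
rowM2 <- numeric(nrow(mat))
for (i in seq_len(nrow(mat))) {
  for (j in seq_len(ncol(mat))) {
    rowM2[i] <- rowM2[i] + mat[i, j]
  }
}
rowM2 <- rowM2 / ncol(mat)

# Largest relative discrepancy, expressed in units of machine epsilon.
rel_diff <- max(abs(rowM1 - rowM2) / abs(rowM1))
rel_diff / .Machine$double.eps

# Compare with a tolerance rather than with exact equality.
isTRUE(all.equal(rowM1, rowM2))
```

On a platform where the extended-precision type is no wider than a double, the two results may agree exactly, so the ratio printed above can legitimately be zero.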
Re: [R] different results form summarization by loop and sum or rowMeans function
How low is "very low"? This is probably answered by FAQ 7.31.

On Thu, Sep 11, 2008 at 9:49 AM, Markus Schmidberger <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I found different results calculating the row means with the function
> rowMeans() and with a simple for-loop. The differences are very low. But after
> this calculation I will start some optimization algorithms (BFGS or CG), and
> there I get huge differences (from the small changes in the beginning or
> start values; I changed nothing else in the code).
> How can I avoid these differences between sum-loops and sum-functions?
>
> Attached is a small test code using data from Bioconductor.
>
> Best
> Markus
>
> library(affy)
> data(affybatch.example)
> mat <- exprs(affybatch.example)[1:100, 1:3]
> mat <- exp(1) * mat
> mat <- asinh(mat)
>
> rowM1 <- rowMeans(mat)
>
> t <- rep(0, 100)  # vector of zeros
> for (i in 1:100) {
>   for (j in 1:3)
>     t[i] <- t[i] + mat[i, j]
> }
> rowM2 <- t / 3
>
> m1 <- mat - rowM1
> m2 <- mat - rowM2
>
> print(m1 - m2)
>
> sessionInfo()
> R version 2.7.1 (2008-06-23)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=German_Germany.1252;LC_CTYPE=German_Germany.1252;LC_MONETARY=German_Germany.1252;LC_NUMERIC=C;LC_TIME=German_Germany.1252
>
> attached base packages:
> [1] tools stats graphics grDevices utils datasets methods
> [8] base
>
> other attached packages:
> [1] affy_1.18.2 preprocessCore_1.2.0 affyio_1.8.0
> [4] Biobase_2.0.1

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
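For readers who have not looked it up, FAQ 7.31 concerns why two numbers that should be equal on paper may not compare equal in floating-point arithmetic. A minimal sketch of that behaviour, assuming standard IEEE double precision (the variable name a is illustrative):

```r
a <- sqrt(2)

a * a == 2                   # exact comparison: FALSE in IEEE double arithmetic
a * a - 2                    # tiny but non-zero remainder
isTRUE(all.equal(a * a, 2))  # comparison with a tolerance: TRUE
```

The same reasoning applies to the row means in the question: differences of this size are expected, and comparisons are usually better made with all.equal() than with ==.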
[R] different results form summarization by loop and sum or rowMeans function
Hi,

I found different results calculating the row means with the function rowMeans() and with a simple for-loop. The differences are very low. But after this calculation I will start some optimization algorithms (BFGS or CG), and there I get huge differences (from the small changes in the beginning or start values; I changed nothing else in the code).
How can I avoid these differences between sum-loops and sum-functions?

Attached is a small test code using data from Bioconductor.

Best
Markus

library(affy)
data(affybatch.example)
mat <- exprs(affybatch.example)[1:100, 1:3]
mat <- exp(1) * mat
mat <- asinh(mat)

rowM1 <- rowMeans(mat)

t <- rep(0, 100)  # vector of zeros
for (i in 1:100) {
  for (j in 1:3)
    t[i] <- t[i] + mat[i, j]
}
rowM2 <- t / 3

m1 <- mat - rowM1
m2 <- mat - rowM2

print(m1 - m2)

sessionInfo()
R version 2.7.1 (2008-06-23)
i386-pc-mingw32

locale:
LC_COLLATE=German_Germany.1252;LC_CTYPE=German_Germany.1252;LC_MONETARY=German_Germany.1252;LC_NUMERIC=C;LC_TIME=German_Germany.1252

attached base packages:
[1] tools stats graphics grDevices utils datasets methods
[8] base

other attached packages:
[1] affy_1.18.2 preprocessCore_1.2.0 affyio_1.8.0
[4] Biobase_2.0.1

--
Dipl.-Tech. Math. Markus Schmidberger

Ludwig-Maximilians-Universität München
IBE - Institut für medizinische Informationsverarbeitung,
Biometrie und Epidemiologie
Marchioninistr. 15, D-81377 Muenchen
URL: http://ibe.web.med.uni-muenchen.de
Mail: Markus.Schmidberger [at] ibe.med.uni-muenchen.de
Tel: +49 (089) 7095 - 4599
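One way to probe the sensitivity to start values described in this message is to run the optimizer twice, once from the original start vector and once from a copy perturbed at about the size of a rounding error, and then compare the two solutions. The sketch below is only an illustration under stated assumptions: the objective fn is a hypothetical quadratic chosen because it is well conditioned, and the start vector and perturbation size are arbitrary; none of these come from the original analysis.

```r
# Hypothetical objective used only for illustration: a simple quadratic bowl
# with its minimum at c(1, 2).
fn <- function(p) sum((p - c(1, 2))^2)

# Run optim() from a start vector and from a copy perturbed at roughly the
# size of the rounding differences discussed in this thread.
start <- c(-1, 1)
fit_a <- optim(start, fn, method = "BFGS")
fit_b <- optim(start * (1 + 1e-15), fn, method = "BFGS")

# For a well-conditioned problem the two solutions should agree closely;
# a large gap would point to instability or to several local minima.
par_gap <- max(abs(fit_a$par - fit_b$par))
par_gap
```

If the same check on the real objective gives a large gap, the problem is more likely in the conditioning of the objective or in multiple local minima than in the choice between rowMeans() and a loop.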