Re: [R] different results from summarization by loop and sum or rowMeans function

2008-09-11 Thread Prof Brian Ripley

On Thu, 11 Sep 2008, Markus Schmidberger wrote:


> Hi,
>
> I found different results calculating the rowMeans by the function rowMeans()
> and a simple for-loop. The differences are very low. But after this


Indeed, but the C code (rowMeans) is likely to be more accurate as it uses 
an extended-precision accumulator.
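
A minimal sketch of the effect, assuming a synthetic matrix in place of the affy data (the runif() inputs and the names row_fn and row_loop are illustrative and not from the original post). Whether any difference shows up at all depends on the platform: where long double is no wider than double, the two results may agree exactly.

```r
set.seed(1)
# Synthetic stand-in for the transformed expression matrix (assumption)
mat <- asinh(exp(1) * matrix(runif(300, min = 50, max = 5000), nrow = 100, ncol = 3))

# Built-in C implementation
row_fn <- rowMeans(mat)

# Explicit loop: every partial sum is rounded to double precision
row_loop <- numeric(nrow(mat))
for (i in seq_len(nrow(mat))) {
  acc <- 0
  for (j in seq_len(ncol(mat))) acc <- acc + mat[i, j]
  row_loop[i] <- acc / ncol(mat)
}

# Largest discrepancy, in absolute terms and relative to machine epsilon
max_abs_diff <- max(abs(row_fn - row_loop))
rel_to_eps <- max_abs_diff / (.Machine$double.eps * max(abs(row_fn)))
print(max_abs_diff)
print(rel_to_eps)
```

If the discrepancy is a small multiple of machine epsilon, it is ordinary rounding error rather than a bug in either computation.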


> calculation I will start some optimization algorithms (BFGS or CG) and there
> I get huge differences (from the small changes in the beginning or start
> values, I changed nothing else in the code).
>
> How can I avoid these differences between sum-loops and sum-functions?


You cannot. What you can do is work on making what you do with these 
inputs numerically stable: unless you do so your end results will have 
very little value.  (For example, are you finding different local minima, 
in which case you need to decide how to treat that possibility?)
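
One way to act on this is to check how far the optimizer's answer moves when the start values are perturbed by roughly the size of the rounding differences. The sketch below assumes a hypothetical objective() in place of the real one and an arbitrary perturbation size; only optim() and its documented arguments are actual R API.

```r
# Hypothetical smooth objective with a single minimum at (1, 2) (assumption)
objective <- function(p) sum((p - c(1, 2))^2)

start <- c(0, 0)
set.seed(42)

# Refit from start values perturbed on the scale of rounding error
fits <- lapply(1:5, function(k) {
  optim(start + rnorm(2, sd = 1e-12), objective, method = "BFGS")
})

pars <- t(sapply(fits, function(f) f$par))   # one row per run
par_spread <- apply(pars, 2, function(x) diff(range(x)))
value_spread <- diff(range(sapply(fits, function(f) f$value)))
converged <- all(sapply(fits, function(f) f$convergence) == 0)

print(par_spread)
print(value_spread)
print(converged)
```

A spread that is large relative to the accuracy you need suggests an ill-conditioned problem or several local minima; for a well-behaved objective like this one it should be negligible.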


I suggest reading an introductory book on Numerical Analysis, or

Monahan, J. F. (2001) Numerical Methods of Statistics. Cambridge:
Cambridge University Press. Chapter 2.

or

Press, W. H., Teukolsky, S. A., Vetterling, W. T. and Flannery, B. P.
(2007) Numerical Recipes: The Art of Scientific Computing. Third
Edition. Cambridge: Cambridge University Press. Section 1.1 (I think).



> Attached is a small test script using data from Bioconductor.
>
> Best
> Markus
>
>
> library(affy)
> data(affybatch.example)
> mat <- exprs(affybatch.example)[1:100, 1:3]
> mat <- exp(1) * mat
> mat <- asinh(mat)
>
> # Row means via the built-in function
> rowM1 <- rowMeans(mat)
>
> # Row means via an explicit loop
> t <- rep(0, 100)  # vector of zeros
> for (i in 1:100) {
>   for (j in 1:3)
>     t[i] <- t[i] + mat[i, j]
> }
> rowM2 <- t / 3
>
> m1 <- mat - rowM1
> m2 <- mat - rowM2
>
> # Element-wise discrepancy between the two centred matrices
> print(m1 - m2)
>
> sessionInfo()
> R version 2.7.1 (2008-06-23)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=German_Germany.1252;LC_CTYPE=German_Germany.1252;LC_MONETARY=German_Germany.1252;LC_NUMERIC=C;LC_TIME=German_Germany.1252
>
> attached base packages:
> [1] tools     stats     graphics  grDevices utils     datasets  methods
> [8] base
>
> other attached packages:
> [1] affy_1.18.2          preprocessCore_1.2.0 affyio_1.8.0
> [4] Biobase_2.0.1
>
> --
> Dipl.-Tech. Math. Markus Schmidberger
>
> Ludwig-Maximilians-Universität München
> IBE - Institut für medizinische Informationsverarbeitung,
> Biometrie und Epidemiologie
> Marchioninistr. 15, D-81377 Muenchen
> URL: http://ibe.web.med.uni-muenchen.de
> Mail: Markus.Schmidberger [at] ibe.med.uni-muenchen.de
>
> Tel: +49 (089) 7095 - 4599

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: [R] different results from summarization by loop and sum or rowMeans function

2008-09-11 Thread jim holtman

How low is "very low"?  This is probably answered by R FAQ 7.31 ("Why
doesn't R think these numbers are equal?").
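
For readers without the FAQ to hand, a short sketch of the point it makes, using the FAQ's own sqrt(2) example (the variable names are illustrative): exact comparison of doubles with == is fragile, while all.equal() compares up to a tolerance.

```r
a <- sqrt(2)

# Exact comparison fails because sqrt(2) is not exactly representable
exact_equal <- (a * a == 2)

# Comparison up to a numerical tolerance (default about 1.5e-8)
near_equal <- isTRUE(all.equal(a * a, 2))

print(a * a - 2)     # the rounding error itself
print(exact_equal)   # FALSE
print(near_equal)    # TRUE
```

The same idea applies to the two row-mean vectors in the original post: compare them with all.equal() rather than expecting bit-for-bit identical results.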

On Thu, Sep 11, 2008 at 9:49 AM, Markus Schmidberger
<[EMAIL PROTECTED]> wrote:
> Hi,
>
> I found different results calculating the rowMeans by the function
> rowMeans() and a simple for-loop. The differences are very low. But after
> this calculation I will start some optimization algorithms (BFGS or CG) and
> there I get huge differences (from the small changes in the beginning or
> start values, I changed nothing else in the code).
> How can I avoid these differences between sum-loops and sum-functions?
>
> Attached is a small test script using data from Bioconductor.
>
> Best
> Markus
>
>
> library(affy)
> data(affybatch.example)
> mat <- exprs(affybatch.example)[1:100, 1:3]
> mat <- exp(1) * mat
> mat <- asinh(mat)
>
> # Row means via the built-in function
> rowM1 <- rowMeans(mat)
>
> # Row means via an explicit loop
> t <- rep(0, 100)  # vector of zeros
> for (i in 1:100) {
>   for (j in 1:3)
>     t[i] <- t[i] + mat[i, j]
> }
> rowM2 <- t / 3
>
> m1 <- mat - rowM1
> m2 <- mat - rowM2
>
> # Element-wise discrepancy between the two centred matrices
> print(m1 - m2)
>
> sessionInfo()
> R version 2.7.1 (2008-06-23)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=German_Germany.1252;LC_CTYPE=German_Germany.1252;LC_MONETARY=German_Germany.1252;LC_NUMERIC=C;LC_TIME=German_Germany.1252
>
> attached base packages:
> [1] tools     stats     graphics  grDevices utils     datasets  methods
> [8] base
>
> other attached packages:
> [1] affy_1.18.2          preprocessCore_1.2.0 affyio_1.8.0
> [4] Biobase_2.0.1



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



[R] different results from summarization by loop and sum or rowMeans function

2008-09-11 Thread Markus Schmidberger

Hi,

I found different results calculating the rowMeans by the function 
rowMeans() and a simple for-loop. The differences are very low. But 
after this calculation I will start some optimization algorithms (BFGS 
or CG) and there I get huge differences (from the small changes in the 
beginning or start values, I changed nothing else in the code).

How can I avoid these differences between sum-loops and sum-functions?

Attached is a small test script using data from Bioconductor.

Best
Markus


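library(affy)
data(affybatch.example)
mat <- exprs(affybatch.example)[1:100, 1:3]
mat <- exp(1) * mat
mat <- asinh(mat)

# Row means via the built-in function
rowM1 <- rowMeans(mat)

# Row means via an explicit loop
t <- rep(0, 100)  # vector of zeros
for (i in 1:100) {
  for (j in 1:3)
    t[i] <- t[i] + mat[i, j]
}
rowM2 <- t / 3

m1 <- mat - rowM1
m2 <- mat - rowM2

# Element-wise discrepancy between the two centred matrices
print(m1 - m2)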

sessionInfo()
R version 2.7.1 (2008-06-23)
i386-pc-mingw32

locale:
LC_COLLATE=German_Germany.1252;LC_CTYPE=German_Germany.1252;LC_MONETARY=German_Germany.1252;LC_NUMERIC=C;LC_TIME=German_Germany.1252

attached base packages:
[1] tools stats graphics  grDevices utils datasets  methods 
[8] base


other attached packages:
[1] affy_1.18.2  preprocessCore_1.2.0 affyio_1.8.0   
[4] Biobase_2.0.1  


--
Dipl.-Tech. Math. Markus Schmidberger

Ludwig-Maximilians-Universität München
IBE - Institut für medizinische Informationsverarbeitung,
Biometrie und Epidemiologie
Marchioninistr. 15, D-81377 Muenchen
URL: http://ibe.web.med.uni-muenchen.de 
Mail: Markus.Schmidberger [at] ibe.med.uni-muenchen.de

Tel: +49 (089) 7095 - 4599
