[R] How many random numbers needed?

2008-05-10 Thread Birgit Lemcke

Hello R-People!

I am running R 2.7.0 on a PowerBook (Tiger). (I am still a beginner in R
and statistics.)


Perhaps this is another stupid question of mine, but I was wondering
how I know how many random numbers (set.seed) are needed when running
randomForest (library randomForest) on a large dataset.


Thanks in advance

Birgit



Birgit Lemcke
Institut für Systematische Botanik
Zollikerstrasse 107
CH-8008 Zürich
Switzerland
Ph: +41 (0)44 634 8351
[EMAIL PROTECTED]

175 Years of UZH
"marvel. experience. understand. Hands-on natural science."
MNF anniversary event for young and old.
19 April 2008, 10:00 to 02:00
Campus Irchel, Winterthurerstrasse 190, 8057 Zürich
Further information: http://www.175jahre.uzh.ch/naturwissenschaft

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How many random numbers needed?

2008-05-10 Thread Gavin Simpson
On Sat, 2008-05-10 at 13:21 +0200, Birgit Lemcke wrote:
> Perhaps this is another stupid question of me, but I was wondering
> how I know the needed random (set.seed) numbers, when running
> randomForest (library randomForest) on a large dataset.

The seed is just a starting point for the RNG. You can draw as many
numbers as you like once the RNG has been seeded.
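
To make this concrete, here is a minimal sketch using base R's runif() (an editorial illustration, not part of the original message; the variable names are only illustrative). It suggests that one call to set.seed() is enough for any number of draws, and that re-seeding with the same value replays the same stream:

```r
set.seed(1)            # seed the RNG once
a <- runif(5)          # first five draws
b <- runif(1e5)        # keep drawing; no new seed is required

set.seed(1)            # return to the same starting point
a2 <- runif(5)         # replays the first five draws

identical(a, a2)       # same seed, same numbers
length(b)              # the seed places no limit on how many numbers are drawn
```

The seed therefore says nothing about how many random numbers a function such as randomForest will consume; it only fixes where the sequence begins.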

The ability to set the seed allows repeated runs of functions like
randomForest to provide the same results for each run. This is a basic
requirement of reproducible research.

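library(randomForest)
set.seed(1)                                      # seed the RNG
mod1 <- randomForest(Species ~ ., data = iris)
mod2 <- randomForest(Species ~ ., data = iris)   # RNG state has moved on since mod1
set.seed(1)                                      # reset to the same starting point
mod3 <- randomForest(Species ~ ., data = iris)

all.equal(mod1, mod2)   # compare mod1 with the run that was not re-seeded
all.equal(mod1, mod3)   # compare the two runs started from the same seed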

You can pass any integer (within reason - up to the limits of an integer
in R) to the set.seed function; the point is to supply the same number
as the seed each time if you want to make sure your results are reproducible.
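
As a rough illustration of the integer limit mentioned above (an editorial sketch, not from the original post; the variable names are assumptions), .Machine$integer.max gives the largest integer R can represent, and it appears to be accepted as a seed like any other value:

```r
big <- .Machine$integer.max   # largest representable integer in R

set.seed(big)                 # a very large seed is accepted
x1 <- runif(3)
set.seed(big)                 # same seed again
x2 <- runif(3)
set.seed(1)                   # a different, small seed
y <- runif(3)

identical(x1, x2)             # same seed reproduces the same draws
identical(x1, y)              # a different seed starts a different stream
```

The particular value chosen matters less than recording it, so that the same seed can be supplied again later.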

HTH

G

 
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [R] How many random numbers needed?

2008-05-10 Thread Birgit Lemcke

Thank you Gavin.

I knew about the purpose of set.seed for reproducibility, but I did not
realise that it is only the starting point.


Is it possible that very small or very big random numbers cause any  
kind of bias?


B.


On 10.05.2008 at 14:50, Gavin Simpson wrote:

> The seed is just a starting point for the RNG. You can draw as many
> numbers as you like once the RNG has been seeded.
