[R] what are the limits on R array sizes?

2006-02-08 Thread Mike Miller
I have some computers with a massive amount of memory, and I have some 
jobs that could use very large matrices.  Can R handle matrices 
larger than 2GB?  If I were to create a matrix of 1,000,000 x 1,000, it 
would use about 8GB.  Can R work with an array of that size if I have 
compiled R on an IA64 Linux system with 15GB of RAM?
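
As a rough check of the arithmetic in the question, the sketch below computes the storage needed for a 1,000,000 x 1,000 matrix of doubles without allocating it. It assumes 8 bytes per element, and it compares the element count with .Machine$integer.max, which was the per-vector element limit in R releases before 3.0.0. Whether a particular build can actually allocate the matrix still depends on the platform and on available memory.

```r
# Size of a 1,000,000 x 1,000 double matrix, computed without allocating it
n.row <- 1e6
n.col <- 1e3
n.elements <- n.row * n.col            # number of cells
bytes <- n.elements * 8                # assumes 8 bytes per double
gigabytes <- bytes / 1e9               # decimal GB

# Element count versus the largest R integer (2^31 - 1)
fits.index <- n.elements <= .Machine$integer.max

cat("elements:", format(n.elements, scientific = FALSE), "\n")
cat("size in GB:", gigabytes, "\n")
cat("element count within integer range:", fits.index, "\n")
```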

Mike

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] open source and R

2005-11-13 Thread Mike Miller
On Sun, 13 Nov 2005, Roger Bivand wrote:

 On Sun, 13 Nov 2005, Robert wrote:

 If I do not know C or FORTRAN, how can I fully understand the package 
 or possibly improve it?

 By learning enough to see whether that makes a difference for your 
 purposes. Life is hard, but that's what makes life interesting ...


None of us fully understands what we are doing with computer software. 
If you understand R code, that's great, but then there is the R 
interpreter -- do you understand how it works?  That interpreter was 
written in another language that was then compiled by a compiler which was 
written by someone else for some other purpose -- do you understand the 
compiler?  Then it all gets processed by some very complex hardware that 
practically none of us *fully* understands.  We have to accept that we 
can't have a complete grasp of what R is doing, but we can still read the 
R docs and test R in many ways.

When functions are written in R, they may be easier for you to read, but 
they may run much slower than code written in C, C++ or FORTRAN.  I don't 
think it is wise to forgo the speed improvement so that people who don't 
know FORTRAN can enjoy contributing to R development.  The contribution of 
FORTRAN libraries to R functionality and efficiency is probably much greater 
than the contributions would be from any group of people who could code in 
R but couldn't code in C or FORTRAN.

That said, I appreciate the sentiment and I think we should prefer 
straight R code for many functions, but some things just run too slowly 
when written that way.
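
To make the speed point concrete, here is a small sketch that sums the same vector twice: once with an explicit loop in interpreted R code and once with the built-in sum(), which is implemented in C. The function name loop.sum and the vector length are illustrative choices, and timings depend on the machine and R version, so none are quoted.

```r
set.seed(1)
x <- rnorm(1e6)

# Sum computed with an explicit loop in interpreted R code
loop.sum <- function(v) {
  total <- 0
  for (value in v) total <- total + value
  total
}

# Time both versions; the results should agree up to rounding
time.loop <- system.time(s1 <- loop.sum(x))
time.builtin <- system.time(s2 <- sum(x))   # sum() is implemented in C

print(time.loop)
print(time.builtin)
all.equal(s1, s2)
```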

Mike



Re: [R] problems with for: warnings and segfault

2005-11-11 Thread Mike Miller
On Fri, 11 Nov 2005, Ronaldo Reis-Jr. wrote:

 Segmentation fault

 This is a R bug or an error in my for function?


All seg faults are bugs.

Mike



Re: [R] How to find statistics like that.

2005-11-10 Thread Mike Miller
On Thu, 10 Nov 2005, Ruben Roa wrote:

 A statistic is any real-valued or vector-valued function whose
 domain includes the sample space of a random sample. The
 p-value is a real-valued function and its domain includes the
 sample space of a random sample. The p-value has a sampling
 distribution. The code below, found with Google (sampling distribution
 of the p-value R command) shows the sampling
 distribution of the p-value for a t-test of a mean when the null hypothesis
 is true.
 Ruben

 n <- 18
 mu <- 40
 pop.var <- 100
 n.draw <- 200
 alpha <- 0.05
 draws <- matrix(rnorm(n.draw * n, mu, sqrt(pop.var)), n)
 get.p.value <- function(x) t.test(x, mu = mu)$p.value
 pvalues <- apply(draws, 2, get.p.value)
 hist(pvalues)
 sum(pvalues <= alpha)
 [1] 6


The sampling distribution of a p-value when the null hypothesis is true 
can be given more simply by this R code:

runif(n.draw)

That holds for any valid test, not just a t test, that produces p-values 
distributed continuously on [0,1].  Discrete distributions can't quite do 
that without special tweaking.
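
One way to check the uniformity claim empirically is to simulate many t-tests under a true null and look at how often the p-value falls at or below a threshold. The sketch below reuses the setup from the quoted code with more draws (n.draw = 2000 and the seed are arbitrary choices made here); under the null, the rejection fraction should be close to alpha and the quartiles close to 0.25, 0.5 and 0.75.

```r
set.seed(1)
n <- 18
mu <- 40
pop.sd <- 10
n.draw <- 2000
alpha <- 0.05

# Each column is one sample drawn with the null hypothesis true
draws <- matrix(rnorm(n.draw * n, mu, pop.sd), n)
pvalues <- apply(draws, 2, function(x) t.test(x, mu = mu)$p.value)

# Fraction of p-values at or below alpha; should be near alpha
reject.rate <- mean(pvalues <= alpha)
print(reject.rate)

# Sample quartiles, for comparison with those of Uniform(0,1)
print(quantile(pvalues, c(0.25, 0.5, 0.75)))
```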

Mike



Re: [R] Is there no LU-Decomposition in R?

2005-11-09 Thread Mike Miller
On Wed, 9 Nov 2005, Christian Hinz wrote:

 I need the LU decomposition for my conversion of the DIRK/SDIRK (singly 
 diagonally implicit Runge-Kutta methods) algorithm.


I think you need the Matrix package:

http://cran.r-project.org/src/contrib/Descriptions/Matrix.html
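
A minimal sketch of an LU factorization with the Matrix package follows. It assumes the package's lu() and expand() functions behave as described in its documentation, returning factors L, U and a permutation P with A = P %*% L %*% U; the 4 x 4 random matrix is illustrative data only.

```r
library(Matrix)   # a recommended package shipped with standard R installations

set.seed(1)
A <- matrix(rnorm(16), 4, 4)      # small example matrix (illustrative data)

lu.A <- lu(Matrix(A))             # LU factorization with partial pivoting
parts <- expand(lu.A)             # list with components L, U and P

# The factors should reproduce A as P %*% L %*% U
A.rebuilt <- as(parts$P %*% parts$L %*% parts$U, "matrix")
all.equal(A.rebuilt, A, check.attributes = FALSE)
```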

Mike



Re: [R] writing R shell scripts?

2005-11-09 Thread Mike Miller
On Wed, 9 Nov 2005, Henrik Bengtsson wrote:

 Put everything in curly brackets as my above example shows.

Ah ha! This leads to the latest version of my little one-liner (called 
doR):

#!/bin/sh
echo "output <- { $1 }; write.table(file=stdout(), .Last.value, row.names=FALSE, col.names=FALSE); q()" | /usr/local/bin/R --slave --no-save


The use of output <- suppresses the unwanted intermediate computations 
and the curly brackets make it so that .Last.value does what we want. 
So I can now do a series of semi-colon separated commands in my one-liner 
and pump out the result of only the last step.  Like so:

# doR 'dims <- c(100,5) ; A <- matrix(rnorm(100*5)*4,dims); chol(cov(A))'
3.94658490894107 0.317366981840069 1.06924803070620 0.340120807287771 -0.497063892955837
0 3.8326282049957 -0.675108240711913 -0.0199506782900556 -0.306406513059080
0 0 3.56097699699067 -0.52407121204737 0.107635052917745
0 0 0 3.88866072385884 -0.365208065096497
0 0 0 0 3.81017455630087

That isn't a great example, but you get the idea.  I use this kind of 
thing pretty often.
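
The same idea can be sketched directly in R, outside the shell wrapper: a braced block evaluates several statements, only its last value is kept, and write.table() emits that value without row or column labels. The variable names follow the example above; passing nrow and ncol separately is a choice made here for clarity.

```r
# A braced block returns the value of its last expression only
output <- {
  dims <- c(100, 5)
  A <- matrix(rnorm(prod(dims)) * 4, nrow = dims[1], ncol = dims[2])
  chol(cov(A))
}

# Write the bare values to stdout, without row or column labels
write.table(output, file = stdout(), row.names = FALSE, col.names = FALSE)
```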

Mike

-- 
Michael B. Miller, Ph.D.
Assistant Professor
Division of Epidemiology and Community Health
and Institute of Human Genetics
University of Minnesota
http://taxa.epi.umn.edu/~mbmiller/



[R] R-and-octave (was Element-by-element multiplication operator?)

2005-11-09 Thread Mike Miller
On Wed, 9 Nov 2005, Gabor Grothendieck wrote:

 See:

 http://cran.r-project.org/doc/contrib/R-and-octave-2.txt


That is a really beautiful page.  If the author is willing to update it to 
include some of the new Octave functionality -- specifically N-dimensional 
arrays -- I'm sure I can help out.  If there is new R functionality to 
include, I probably wouldn't know about it (I'm still learning R) but 
maybe someone else on the list would know.

Mike



Re: [R] How to find statistics like that.

2005-11-09 Thread Mike Miller
On Wed, 9 Nov 2005, Gao Fay wrote:

 Hi there,

 Suppose mu is constant, and error is normally distributed with mean 0 and 
 fixed variance s. I need to find a statistic for the model
 Y_i = mu + beta1*I1_i + beta2*I2_i + beta3*I1_i*I2_i + error, where I_i is 1 
 if Y_i is from group A, and 0 if Y_i is from group B.

 It is large when  beta1=beta2=0
 It is small when beta1 and/or beta2 is not equal to 0

 How can I find it by R? Thank you very much for your time.


That's a funny question.  Usually we want a statistic that is small when 
beta1=beta2=0 and large otherwise.

Why not compute the usual F statistic for the null beta1=beta2=0 and then 
use 1/F as your statistic?
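
A sketch of that suggestion, under assumptions made here: the two indicators are simulated 0/1 variables, the data are generated with all betas equal to zero, and the null beta1=beta2=0 is tested by comparing the full model with a reduced model that keeps only the intercept and the interaction term. The names full, reduced and inv.F are illustrative.

```r
set.seed(42)
n <- 200
I1 <- rbinom(n, 1, 0.5)          # simulated group indicators (0/1)
I2 <- rbinom(n, 1, 0.5)
y <- 10 + rnorm(n)               # simulated with beta1 = beta2 = beta3 = 0

full <- lm(y ~ I1 + I2 + I1:I2)  # mu, beta1, beta2 and beta3
reduced <- lm(y ~ I1:I2)         # mu and beta3 only, i.e. beta1 = beta2 = 0

cmp <- anova(reduced, full)      # F test of the two dropped coefficients
F.stat <- cmp$F[2]
inv.F <- 1 / F.stat              # large when the data favor beta1 = beta2 = 0

print(cmp)
print(inv.F)
```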

Mike



Re: [R] writing R shell scripts?

2005-11-08 Thread Mike Miller
Many thanks for the suggestions.  Here is a related question:

When I do things like this...

echo "matrix(rnorm(25*2),c(25,2))" | R --slave --no-save

...I get the desired result except that I would like to suppress the 
[,1] row and column labels so that only the values go to stdout.  What 
is the trick to making that work?

By the way, I find it useful to have a script in my path that does this:

#!/bin/sh
echo "$1" | /usr/local/bin/R --slave --no-save

Suppose that script was called doR, then one could do things like this 
from the Linux/UNIX command line:

# doR 'sqrt(35.6)'
[1] 5.966574

# doR 'runif(1)'
[1] 0.8881654

Which I find to be handy for quick arithmetic and even for much more 
sophisticated things.  I'd like to get rid of the [1] though!
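
For reference, the "[1]" prefix comes from print(), which R applies to values shown at top level. A short sketch of two ways to emit bare values: cat() for vectors and write.table() for matrices. The matrix m is illustrative data.

```r
x <- sqrt(35.6)

print(x)        # auto-printing style, includes the "[1]" index prefix
cat(x, "\n")    # bare value followed by a newline

# For a matrix, write.table() can drop both row and column labels
m <- matrix(1:6, nrow = 2)
write.table(m, file = stdout(), row.names = FALSE, col.names = FALSE)
```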

Mike



Re: [R] writing R shell scripts?

2005-11-08 Thread Mike Miller
On Wed, 9 Nov 2005, Henrik Bengtsson wrote:

 What you really want to do might be solved by write.table(), e.g.

 x <- matrix(rnorm(25*2),c(25,2));
 write.table(file=stdout(), x, row.names=FALSE, col.names=FALSE);

Thanks.  That does what I want.

There is one remaining problem for my echo method.  While write.table 
seems to do the right thing for me, R seems to add an extra newline to the 
output when it closes.  You can see that this produces exactly one newline 
and nothing else:

# echo '' | R --slave --no-save | wc -c
   1

Is there any way to stop R from sending an extra newline?  It's funny 
because normal running of R doesn't seem to terminate by sending a newline 
to stdout.  Oops -- I just figured it out.  If I send quit(), there is 
no extra newline!  Examples that send no extra newline:

echo 'quit()' | R --slave --no-save

echo "x <- matrix(rnorm(25*2),c(25,2)); write.table(file=stdout(), x, row.names=FALSE, col.names=FALSE); quit()" | R --slave --no-save

I suppose we can live with that as it is.  Is this an intentional feature?


 A note of concern: When writing batch scripts like this, be explicit and 
 use the print() statement.  A counter example to compare

 echo "1; 2" | R --slave --no-save

 and

 echo "print(1); print(2)" | R --slave --no-save

I guess you are saying that sometimes R will fail if I don't use print(). 
Can you give an example of how it can fail?


 By the way, I find it useful to have a script in my path that does 
 this:
 
 #!/bin/sh
 echo "$1" | /usr/local/bin/R --slave --no-save
 
 Suppose that script was called doR, then one could do things like 
 this from the Linux/UNIX command line:
 
 # doR 'sqrt(35.6)'
 [1] 5.966574
 
 # doR 'runif(1)'
 [1] 0.8881654
 
 Which I find to be handy for quick arithmetic and even for much more 
 sophisticated things.  I'd like to get rid of the [1] though!

 If you want to be lazy and not use, say, doR 'cat(runif(1),"\n")' above, 
 maybe a simple Unix sed in your shell script can fix that?!


It looks like I can rewrite the script like this:

#!/bin/sh
echo "$1 ; write.table(file=stdout(), out, row.names=FALSE, col.names=FALSE); quit()" | /usr/local/bin/R --slave --no-save

Then I have to always include something like this...

out <- some_operation

...as part of my command.  Example:

# doR 'A <- matrix(rnorm(100*5),c(100,5)); out <- chol(cov(A))'
1.08824564637869 0.00749462665482204 -0.109577665309141 0.123824503621501 0.0420504647142321
0 0.969304154505745 0.0689085053799411 0.143273894584171 -0.0204348333174425
0 0 0.995383836907855 0.0860782051613422 0.056980680914183
0 0 0 0.94180592438191 0.0534651651371964
0 0 0 0 0.907266109886987

Now we've got it!!

The output above is nice and compact, and it doesn't have an extra 
newline, but if you want it to look nice on the screen, my friend Stephen 
Montgomery-Smith (Math, U Missouri) made me this nice little perl script 
for aligning numbers neatly and easily (but it doesn't work if there are 
letters in there, e.g., NA or 1.3e-6):

http://taxa.epi.umn.edu/misc/numalign

# ./doR 'A <- matrix(rnorm(100*5),c(100,5)); out <- chol(cov(A))' | numalign
0.903339952680364 -0.088773840144205 -0.223677935069773  -0.0736286093726908  0.0457396703130186
0                  1.08096548052082   0.0800540640587432 -0.0457840266135511 -0.0311210293661459
0                  0                  0.93834307353671    0.0665017259723313 -0.0825698771035788
0                  0                  0                   1.03303581434252    0.118372967026342
0                  0                  0                   0                   0.972768611955302

Thanks very much for all of the help!!!

Mike



Re: [R] writing R shell scripts?

2005-11-08 Thread Mike Miller
On Wed, 9 Nov 2005, Henrik Bengtsson wrote:

 A note of concern: When writing batch scripts like this, be explicit 
 and use the print() statement.  A counter example to compare
 
 echo "1; 2" | R --slave --no-save
 
 and
 
 echo "print(1); print(2)" | R --slave --no-save
 
 
 I guess you are saying that sometimes R will fail if I don't use 
 print(). Can you give an example of how it can fail?

This may have been a misunderstanding because it looks like my R and your 
R are not functioning in the same ways.  More below...


 What I was really trying to say is that if you are running the R terminal, 
 that is, you are sending commands via the R prompt, R will take the *last* 
 value and call print() on it.  This is why you get the result when you 
 type

 1+1
 [1] 2

 Without this feature you would have had to type
 print(1+1)
 [1] 2

 to get any results.  Note that it is only the last value calculated that will 
 be output this way, cf.
 1+1; 2+2
 [1] 4


My version of R works differently:

# echo "1+1; 2+2" | R --slave --no-save
[1] 2
[1] 4

It does the same thing from the interactive prompt.  This holds in both of 
these versions of R on Red Hat Linux:

R 1.8.1 (2003-11-21).
R 2.2.0 (2005-10-06).


If I assign the results of earlier computations to variables, then they 
are not printed:

# echo "x <- 1+1; 2+2" | R --slave --no-save
[1] 4
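
The difference seen in these two runs follows from R's visibility rule: at top level, a value is printed only when it is marked visible, and an assignment returns its value invisibly. A small sketch using withVisible(), a base R function that reports this flag:

```r
# A plain expression yields a visible value, so top-level R prints it
vis.expr <- withVisible(1 + 1)
print(vis.expr$visible)

# An assignment yields the same value, but marked invisible
vis.assign <- withVisible(x <- 1 + 1)
print(vis.assign$visible)

# The value itself is unchanged in both cases
print(identical(vis.expr$value, vis.assign$value))
```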


 Or go get the last value calculated by .Last.value, see

 echo "$1; out <- .Last.value; write.table(file=stdout(), out, 
 row.names=FALSE, col.names=FALSE); quit()" | /usr/local/bin/R --slave 
 --no-save

Beautiful.  I didn't know about .Last.value, but now that I do, I think we 
can shorten that script to this...

echo "$1; write.table(file=stdout(), .Last.value, row.names=FALSE, col.names=FALSE); quit()" | /usr/local/bin/R --slave --no-save

...because we no longer need the out variable.  It seems like one 
problem I'm having is that R returns results of every computation, and not 
just the last one, unless I assign the result to a variable.  Example 
using the doR one-line script above:

# doR 'chol(cov(matrix(rnorm(100*5),c(100,5))))'
         [,1]       [,2]        [,3]          [,4]        [,5]
[1,] 1.021414 0.09806281 0.003275454  0.0009819654  0.05031847
[2,] 0.000000 1.10031274 0.002696835 -0.0990352880  0.17356877
[3,] 0.000000 0.00000000 0.822075977 -0.0353553332 -0.04559222
[4,] 0.000000 0.00000000 0.000000000  0.9367890692 -0.01513027
[5,] 0.000000 0.00000000 0.000000000  0.0000000000  0.97588119
1.02141394873274 0.0980628119885006 0.00327545419626209 0.000981965434760053 0.050318470112499
0 1.10031274450895 0.00269683530006245 -0.0990352879929318 0.173568771318532
0 0 0.82207597738982 -0.0353553332133034 -0.0455922206141078
0 0 0 0.936789069194909 -0.0151302741201435
0 0 0 0 0.975881188029811

# doR 'x <- chol(cov(matrix(rnorm(100*5),c(100,5))))'
1.09005225946311 0.183719241993361 -0.211250918511775 -0.014827266647 -0.097633753471306
0 0.990599902490968 0.0546812452445389 -0.0255188599622241 0.0502929718369168
0 0 0.982263267444303 -0.0587151164554906 -0.046018923176493
0 0 0 1.00433563628640 0.222340686806836
0 0 0 0 0.976420329786668

I suppose I can live with that.  Is my R really working differently from 
the R other people are using?


 You may also want to create your own method for returning/outputting 
 data. An ideal way for doing this is to create a so-called generic 
 function, say, returnOutput(), and then special functions for each class 
 of object that you might get returned, e.g. returnOutput.matrix(), 
 returnOutput.list(), etc. Don't forget returnOutput.default().  If you 
 do not understand what I'm talking about here, please read up on 
 S3/UseMethod in R documentation.  It's all in there.  Then you'll also 
 get a much deeper understanding of how print() (and R) works.
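
A minimal sketch of the S3 approach described in the quoted advice. The generic name returnOutput() and the method names come from that suggestion; the method bodies are assumptions made here for illustration, not code from the thread.

```r
# Generic: dispatches on the class of its first argument
returnOutput <- function(x, ...) UseMethod("returnOutput")

# Fallback for anything without a more specific method
returnOutput.default <- function(x, ...) {
  cat(paste(format(x), collapse = " "), "\n", sep = "")
}

# Matrices: bare values, one row per line, no labels
returnOutput.matrix <- function(x, ...) {
  write.table(x, file = stdout(), row.names = FALSE, col.names = FALSE)
}

# Lists: emit each element with whichever method fits it
returnOutput.list <- function(x, ...) {
  for (element in x) returnOutput(element)
  invisible(NULL)
}

returnOutput(3.5)
returnOutput(matrix(1:4, 2))
returnOutput(list(1:3, matrix(1:4, 2)))
```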

Thanks yet again for another great tip!

Mike



[R] writing R shell scripts?

2005-11-07 Thread Mike Miller
I'm new to the list.  I've used R and S-PLUS a bit for about 15 years but 
am now working to make R my main program for all numerical and statistical 
computing.  I also use Octave for this kind of work and I recommend it (it 
is also under the GPL).  Here's my question:  In Octave I can write shell 
scripts in the Linux/UNIX environment that begin with a line like this...

#!/usr/local/bin/octave -q

...and the remaining lines are octave commands.  Is it possible to do this 
sort of thing in R using something like this?:

#!/usr/lib/R/bin/R.bin

Well, that isn't quite it because I tried it and it didn't work!

Any advice greatly appreciated.  Thanks in advance.
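
For readers on later R releases: versions from 2.5.0 onward include the Rscript front end, which can be named on the first line of a script in the same way as octave above. The sketch below is one possible layout, not something from this thread; the helper name run_expr is hypothetical, and the script simply evaluates each command-line argument as R code and writes the last value without labels.

```r
#!/usr/bin/env Rscript
# Assumes Rscript is on the PATH; make the file executable with chmod +x

# Hypothetical helper: evaluate a string of R code and write the last value
run_expr <- function(code) {
  value <- eval(parse(text = code))
  write.table(value, file = stdout(), row.names = FALSE, col.names = FALSE)
}

# Treat every trailing command-line argument as R code to evaluate
invisible(lapply(commandArgs(trailingOnly = TRUE), run_expr))
```

If the file were saved as doR, it could then be called as doR 'sqrt(35.6)' from the shell.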

Mike
