Re: [R] install.packages saying the car package is not in therepositories

2006-03-08 Thread Dirk Eddelbuettel

On 9 March 2006 at 02:23, Jeremy Morris wrote:
| I am running version 2.1.0 of R on a Debian Linux machine.

In which case 

$ apt-get install r-cran-car

is your friend.  

2.1.0 is ancient, so you could take advantage of the backport of the current
R the Debian stable which Chris Steigies provides -- and that is on every
CRAN mirror as detailed in the R FAQ.

Hth, Dirk

-- 
Hell, there are no rules here - we're trying to accomplish something. 
  -- Thomas A. Edison

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] install.packages saying the car package is not in therepositories

2006-03-08 Thread Jeremy Morris
I am running version 2.1.0 of R on a Debian Linux machine.

Jeremy

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] variable '%s' was fitted with class... in predict.nls()

2006-03-08 Thread Jeff D. Hamann
I've tried to predict the values from a new data.frame using the
nls.predict   function and keep getting the error message:

Error in if (sum(wrong) == 1) stop(gettextf("variable '%s' was fitted with
class \"%s\" but class \"%s\" was supplied",  :
missing value where TRUE/FALSE needed

I first thought that it was becuase there may have been something goofy in
the formats of the objects/data.frames I was passing to the predict(), but
when I examined the code (online) I found the following:

I've trid to decipher the code, but most people are much better R
programmers that I, and I'm not sure what's going on here. DOes this mean
I have to use the same data frame I used to fit the model, in order to fit
"new" data?

.checkMFClasses <- function(cl, m, ordNotOK = FALSE)
{
new <- sapply(m, .MFclass)
if(length(new) == 0) return()
old <- cl[names(new)]
if(!ordNotOK) {
old[old == "ordered"] <- "factor"
new[new == "ordered"] <- "factor"
}
## ordered is OK as a substitute for factor, but not v.v.
new[new == "ordered" && old == "factor"] <- "factor"
if(!identical(old, new)) {
wrong <- old != new
if(sum(wrong) == 1)
stop(gettextf(
"variable '%s' was fitted with class \"%s\" but class \"%s\" was
supplied",
  names(old)[wrong], old[wrong], new[wrong]),
 call. = FALSE, domain = NA)
else
stop(gettextf(
"variables %s were specified with different classes from the fit",
 paste(sQuote(names(old)[wrong]), collapse=", ")),
 call. = FALSE, domain = NA)
}
}

The docs for predict.nls state:

 newdata: A named list or data frame in which to look for variables
  with which to predict.  If 'newdata' is missing the fitted
  values at the original data points are returned.

I understand this to mean that the newdata data.frame simply need
*include* the variables that are in the model (extra's variables fine?)

When I used,

predict( fits.by.species.30[[1]], dbh=other.202$dbh, se.fit=TRUE )

I got results alright, but I would like to be able to simply pass a new
data.frame that contains the independent variables without having to build
a list of names. Is that possible?

Thanks,
Jeff.

-- 
Jeff D. Hamann
Forest Informatics, Inc.
PO Box 1421
Corvallis, Oregon 97339-1421
phone 541-754-1428
jeff.hamann[at]forestinformatics.com
www.forestinformatics.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] how to use the rpart function?

2006-03-08 Thread Michael
I see! So you mean I have to collect error counts myself manually...

By the way, what parameters do I normally change to improve the default
rpart performance?

Thanks a lot!

On 3/8/06, Carlos Ortega <[EMAIL PROTECTED]> wrote:
>
> Hello Michael,
>
> In some of the examples in the rpart function you will find that
> comparison between the actual and the predicted values, although it is for
> the "Classification" mode, not for the regression.
>
> Regards,
> Carlos.
>
>
> On 3/7/06, Michael <[EMAIL PROTECTED]> wrote:
>
> > Hi all,
> >
> >
> >
> > What parameter do I normally change in the rpart function? How do I set
> > the
> > "cp" option?
> >
> >
> >
> > Is there a way to read off error rate directly from the "rpart" function
> > for
> > training data; I imagine for testing data I have to apply a "predict",
> > but
> > for training data I guess the error count would be somewhere existing
> > once
> > the "rpart" function is finished. Looks like it is related to
> > expressions
> > such as "expected loss=0.8362365" when using "summary" function.
> >
> > Now I have to do this manually, and when it came to compare the correct
> > vs.
> > wrong and count the errors, it was always very tedious...
> >
> >
> >
> > Thanks a lot!
> >
> >
> >
> > M.
> >
> >[[alternative HTML version deleted]]
> >
> > __
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide!
> > http://www.R-project.org/posting-guide.html
> >
>
>

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] how to use the randomForest and rpart function?

2006-03-08 Thread Michael
Hi Andy,

Does the randomForest have a Cross Validation built-in to decide what is the
best number of trees or I have to find the best number manually by myself?

Thanks a lot!

Michael.

On 3/7/06, Liaw, Andy <[EMAIL PROTECTED]> wrote:
>
> Yes, I do know.  That's why I pointed you to the reference linked from the
> help page.
>
> BTW, there's also an R News article describing the initial version of the
> package.  Have you perused that?
>
> Andy
>
>  -Original Message-
> *From:* Michael [mailto:[EMAIL PROTECTED]
> *Sent:* Tuesday, March 07, 2006 9:27 PM
> *To:* Liaw, Andy
> *Cc:* R-help@stat.math.ethz.ch
> *Subject:* Re: [R] how to use the randomForest and rpart function?
>
> It did not have a legend showing on which color is for class1, which color
> is for class2, etc...
>
> I've read the R-help page.
>
> It lists a lot of options, but it did not say which ones are the key
> parameters that people use most for improving performance...
>
> Do you know?
>
> On 3/7/06, Liaw, Andy <[EMAIL PROTECTED]> wrote:
> >
> > As ?plot.randomForest says, it plots error rates.  In addition to
> > overall
> > error rates, it also plots error rates for each class.
> >
> > As to the options in randomForest, read about the options in the help
> > page
> > and the reference linked from the help page.
> >
> > Andy
> >
> > From: Michael
> > >
> > > When I plot the randomForest object, it shows a graph with 3
> > > lines, green, red and black, what's the meaning of these three lines?
> > >
> > > On 3/7/06, Michael < [EMAIL PROTECTED]> wrote:
> > > >
> > > > Hi all,
> > > >
> > > > I am trying to play around with the randomForest function for
> > > > classification. I know its performance is great.
> > > >
> > > > I am currently using the default options.
> > > >
> > > > It has many options.
> > > >
> > > > How do I further tweak the options so that I can make its
> > > performance
> > > > even better?
> > > >
> > > > What are the options that are mostly used?
> > > >
> > > > Thanks a lot!
> > > >
> > > > M
> > > >
> > >
> > >   [[alternative HTML version deleted]]
> > >
> > > __
> > > R-help@stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide!
> > > http://www.R-project.org/posting-guide.html
> > >
> > >
> >
> >
> > --
> >
> > Notice:  This e-mail message, together with any attachments, contains
> > information of Merck & Co., Inc. (One Merck Drive, Whitehouse Station, New
> > Jersey, USA 08889), and/or its affiliates (which may be known outside the
> > United States as Merck Frosst, Merck Sharp & Dohme or MSD and in Japan, as
> > Banyu) that may be confidential, proprietary copyrighted and/or legally
> > privileged. It is intended solely for the use of the individual or entity
> > named on this message.  If you are not the intended recipient, and have
> > received this message in error, please notify us immediately by reply e-mail
> > and then delete it from your system.
> >
> > --
> >
>
>
> --
> Notice: This e-mail message, together with any attachments...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] how to use the randomForest and rpart function?

2006-03-08 Thread Michael
Thanks a lot Joe!

I will take a further look at the article...

On 3/8/06, Joseph Retzer <[EMAIL PROTECTED]> wrote:
>
> Hi Michael,
>   I've looked into this a bit and the only parameter that seems to be
> suggested (by Brieman and a Salford Systems white paper) as one which may
> have an impact on the RF model is that which sets the number of potential
> split variables (mtry) for each tree split. The default for categorical
> response is root(total number of attributes) and (total number of
> attributes)/3 for regression. Take a look at the tuneRF function in
> randomForest which takes the default and searches above and below to see if
> the OOB error rate can be improved by changing mtry. Based on my very
> limited experimentation with the program, the default value seems to
> be tough to improve on.
> Best of luck & take care,
> Joe Retzer
>
> *Michael <[EMAIL PROTECTED]>* wrote:
>
> It did not have a legend showing on which color is for class1, which color
> is for class2, etc...
>
> I've read the R-help page.
>
> It lists a lot of options, but it did not say which ones are the key
> parameters that people use most for improving performance...
>
> Do you know?
>
> On 3/7/06, Liaw, Andy wrote:
> >
> > As ?plot.randomForest says, it plots error rates. In addition to overall
> > error rates, it also plots error rates for each class.
> >
> > As to the options in randomForest, read about the options in the help
> page
> > and the reference linked from the help page.
> >
> > Andy
> >
> > From: Michael
> > >
> > > When I plot the randomForest object, it shows a graph with 3
> > > lines, green, red and black, what's the meaning of these three lines?
> > >
> > > On 3/7/06, Michael wrote:
> > > >
> > > > Hi all,
> > > >
> > > > I am trying to play around with the randomForest function for
> > > > classification. I know its performance is great.
> > > >
> > > > I am currently using the default options.
> > > >
> > > > It has many options.
> > > >
> > > > How do I further tweak the options so that I can make its
> > > performance
> > > > even better?
> > > >
> > > > What are the options that are mostly used?
> > > >
> > > > Thanks a lot!
> > > >
> > > > M
> > > >
> > >
> > > [[alternative HTML version deleted]]
> > >
> > > __
> > > R-help@stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide!
> > > http://www.R-project.org/posting-guide.html
> > >
> > >
> >
> >
> >
> >
> --
> > Notice: This e-mail message, together with any attachment...{{dropped}}
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
>
>
>

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] how to use the randomForest and rpart function?

2006-03-08 Thread Michael
Wow, I didn't know that. That's great! He the man!

On 3/8/06, Carlos Ortega <[EMAIL PROTECTED]> wrote:
>
> Hello Michael,
>
> Just a few words about you phrase "Do you know?"...
>
> Andy Liaw, is the creator and maintainer of the randomForest package.
> He ported the original library of Briemman to R.
>
> Regards,
> Carlos.
>
>
> On 3/8/06, Michael <[EMAIL PROTECTED]> wrote:
>
> > It did not have a legend showing on which color is for class1, which
> > color
> > is for class2, etc...
> >
> > I've read the R-help page.
> >
> > It lists a lot of options, but it did not say which ones are the key
> > parameters that people use most for improving performance...
> >
> > Do you know?
> >
> > On 3/7/06, Liaw, Andy < [EMAIL PROTECTED]> wrote:
> > >
> > > As ?plot.randomForest says, it plots error rates.  In addition to
> > overall
> > > error rates, it also plots error rates for each class.
> > >
> > > As to the options in randomForest, read about the options in the help
> > page
> > > and the reference linked from the help page.
> > >
> > > Andy
> > >
> > > From: Michael
> > > >
> > > > When I plot the randomForest object, it shows a graph with 3
> > > > lines, green, red and black, what's the meaning of these three
> > lines?
> > > >
> > > > On 3/7/06, Michael <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > I am trying to play around with the randomForest function for
> > > > > classification. I know its performance is great.
> > > > >
> > > > > I am currently using the default options.
> > > > >
> > > > > It has many options.
> > > > >
> > > > > How do I further tweak the options so that I can make its
> > > > performance
> > > > > even better?
> > > > >
> > > > > What are the options that are mostly used?
> > > > >
> > > > > Thanks a lot!
> > > > >
> > > > > M
> > > > >
> > > >
> > > >   [[alternative HTML version deleted]]
> > > >
> > > > __
> > > > R-help@stat.math.ethz.ch mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > PLEASE do read the posting guide!
> > > > http://www.R-project.org/posting-guide.html
> > > >
> > > >
> > >
> > >
> > >
> > >
> > --
> >
> > > Notice:  This e-mail message, together with any
> > attachment...{{dropped}}
> >
> > __
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! 
> > http://www.R-project.org/posting-guide.html
> >
> >
>
>

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] When calling external C-function repeatedly I get different results; can't figure out why..

2006-03-08 Thread Søren Højsgaard
Dear all,
I need to calculate tr(xyxy) fast for matrices x and y. Inspired by the 
R-source code, I've created the following functions (I am *new* to writing 
external C-functions, so feel free to laugh at my code - or, perhaps, suggest 
changes):
 
#include 
#include  /* for dgemm */

static void matprod(double *x, int nrx, int ncx,
  double *y, int nry, int ncy, double *z)
{
char *transa = "N", *transb = "N";
double one = 1.0, zero = 0.0;
F77_CALL(dgemm)(transa, transb, &nrx, &ncy, &ncx, &one,
  x, &nrx, y, &nry, &zero, z, &nrx);
}

SEXP trProd2(SEXP x, SEXP y)
{
  int nrx, ncx, nry, ncy, mode, i;
  SEXP xdims, ydims, ans, ans2, tr;
  xdims = getAttrib(x, R_DimSymbol);
  ydims = getAttrib(y, R_DimSymbol);
  mode = REALSXP;
  nrx = INTEGER(xdims)[0];
  ncx = INTEGER(xdims)[1];
  nry = INTEGER(ydims)[0];
  ncy = INTEGER(ydims)[1];
  PROTECT(ans  = allocMatrix(mode, nrx, ncy));
  PROTECT(ans2 = allocMatrix(mode, nrx, ncy));
  PROTECT(tr   = allocVector(mode, 1));
  matprod(REAL(x), nrx, ncx, REAL(y), nry, ncy, REAL(ans)); 
  
  matprod(REAL(ans), nrx, ncy, REAL(ans), nrx, ncy, REAL(ans2)); 
  
  for (i=0; i< nrx; i++){
REAL(tr)[0] = REAL(tr)[0] + REAL(ans2)[i*(nrx+1)];
  }
  UNPROTECT(3);
  return(tr);
}

In R I do:
 
trProd2 <- function(x, y) {
.Call("trProd2", x, y)
}
x <- y <- matrix(1:4,ncol=2)
storage.mode(x) <- storage.mode(y) <- "double"

for (i in 1:10)
  print(trProd2(x, y))
[1] 835
[1] 833
[1] 833
[1] 862
[1] 834
[1] 835
[1] 834
[1] 836
[1] 862
[1] 833

The correct answer is 833. Can anyone give me a hint to what is wrong? I am on 
a windows xp platform.
Thanks in advance
Søren

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] 'less' for R?

2006-03-08 Thread ronggui
As Prof Brian Ripley has pointed out,That changes the object if you
make a change in the editor, so definitely
NOT to be recommended.

And you should use invisible(edit(dat.frame)) instead.

2006/3/9, Christos Hatzis <[EMAIL PROTECTED]>:
> Or
> fix(data.frame)
> to open it in the spreadsheet-like data editor.
>
> -Christos Hatzis
>
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Duncan Murdoch
> Sent: Wednesday, March 08, 2006 12:26 PM
> To: [EMAIL PROTECTED]
> Cc: r-help
> Subject: Re: [R] 'less' for R?
>
> On 3/8/2006 12:03 PM, Federico Calboli wrote:
> > Hi All,
> >
> > is there an equivalent of the Unix command 'less' (or 'more'), so I
> > can look at what's inside a data.frame or a matrix without having it
> > printed out on console?
> >
> > I am using R on Debian Linux and Mac OS 10.4.5
>
> I think you want page().  head() and tail() are also useful, as equivalents
> to the head and tail commands.
>
> Duncan Murdoch
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>


--
黄荣贵
Deparment of Sociology
Fudan University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Adding polygons to a barplot

2006-03-08 Thread Jamieson Cobleigh
Thanks!  That worked.

Jamie

On 3/8/06, Marc Schwartz (via MN) <[EMAIL PROTECTED]> wrote:
> On Wed, 2006-03-08 at 14:02 -0500, Jamieson Cobleigh wrote:
> > I have a barplot I have created using barplot2 and I have been able to
> > add points and lines (using the points and lines methods,
> > respectively).  I now need to add some polygons (triangles in
> > particular), that I want to be shaded to match bars in the plot.  I
> > can get the coordinates of the corners of the triangles, but don't
> > know how to draw the triangles.  I know there is the grid.polygon
> > method, but I don't know how to get it to draw on my plot.  Any help
> > would be appreciated.
> >
> > Thanks!
> >
> > Jamie
>
> There is a polygon() function in base R graphics. It supports a
> 'density' argument in the same fashion as barplot()/barplot2().
>
> See ?polygon
>
> help.search("polygon") would have gotten you there.
>
> HTH,
>
> Marc Schwartz
>
>
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] install.packages saying the car package is not in therepositories

2006-03-08 Thread John Fox
Dear Jeremy,

The car package (version 1.1-0) is certainly on CRAN. I checked a couple of
mirrors, and it's there as well. (I assume that .Library contains the
library location to which you want to install?) What OS and version of R are
you using?

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Jeremy Morris
> Sent: Wednesday, March 08, 2006 5:22 PM
> To: r-help@stat.math.ethz.ch
> Subject: [R] install.packages saying the car package is not 
> in therepositories
> 
> I am attempting to install the car package using the command :
> 
> > install.packages('car',.Library)
> 
> After I have chosen a mirror, I get the following message :
> 
> Warning message:
> no package 'car' at the repositories in: 
> download.packages(pkgs, destdir = tmpd, available = available,
> 
> I used the CRAN.packages() function to see what was going on, 
> and the car package is not listed.  I have also tried several 
> mirrors, all with the same error message.  I have tried to 
> install other packages, this has been successful.
> 
> Any help would be great.
> 
> Jeremy
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Mixed GLM methodology and execution question

2006-03-08 Thread Ben Ridenhour
Hi all,
I have a question regarding how to properly analyze a data set and then how to 
perform the analysis in R.

First,
I have data that I would like to analyze using a mixed GLM (I think this is the 
most appropriate method, but I am unsure).  In a mixed model (y = 
X*beta+Z*gamma+epsilon), I would like to structure the variance matrices of 
gamma, G, and the error, R, to take advantage of all my information. The 
structure of the data is like this:

Response Variable: 
 y = continuous response variable 
 


Predictor Variables: 
 x1 =  nominal treatment 
 x2 =  nominal group 
 


Random Variables: 
 z = nominal subgroup of x2, i.e. z is nested within x2 
 


Other variables(?; I'm not sure what exactly these are) 
 z1 = first continuous property of z 
 z2 = second continuous property of z 
 z3 =  third continuous property of z 
 

Presumably all the traits z1-z3 could potentially affect y, though I'm 
primarily interested in the model y=x1+x2+x1*x2. My wish is to put z in as a 
random variable and z1-z3 in the error matrix R.

A small data sample  would be like

x1x2 z  z1  z2 z3 y
L1   A1   S1   1.23   4.59   -1.02   100.45
L2   A1   S1   1.23   4.59   -1.02   113.09
L1   A1   S2   1.50   3.76-0.06  119.21
L2   A1   S2   1.50   3.76-0.06  150.44
L1   A2   S3   1.09   4.01-1.50  109.18
L2   A2   S3   1.09   4.01-1.50  170.23
 L1   A2   S4   1.01   3.70-0.78  109.26
 L2   A2   S4   1.01   3.70-0.78  99.44


What is correct way to put together my model/matrices for this situation?  How 
do accomplish such a task in R?

Thanks,
Ben



[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Debugging a program written in the R language

2006-03-08 Thread Gabor Grothendieck
On 3/8/06, Thomas L Jones <[EMAIL PROTECTED]> wrote:
> >From Tom:
>
> The subject is debugging a program written in the R language,under
> Windows. (Sorry, but I do not know either the Apple OS or *nix.) A
> computer program will usually not work on the first try, if only
> because of the risk of typos. Instead, it must be debugged. Roughly,
> here is the sequence:
>
> (1) One codes a program using the R language,  and stores it on the
> hard drive, using some particular editor.

Editors with various levels of integration with R can
be found at these links:

http://www.sciviews.org/_rgui/projects/Editors.html
http://ess.r-project.org/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] How to plot the xaxis label at 45 degree angle?

2006-03-08 Thread Marc Schwartz (via MN)
On Wed, 2006-03-08 at 16:53 -0500, Lisa Wang wrote:
> Hello there,
> 
> I would like to plot a graph with the x axis's label displayed at a 45
> angle to the x axis instead of horizontal to it as the label is very
> long. What should I do?
> 
> Thank you for your help in advance


See R FAQ 7.27 How can I create rotated axis labels?

http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-create-rotated-axis-labels_003f


Using:

 > RSiteSearch("rotated axis labels")

would be helpful as well.

HTH,

Marc Schwartz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] install.packages saying the car package is not in the repositories

2006-03-08 Thread Jeremy Morris
I am attempting to install the car package using the command :

> install.packages('car',.Library)

After I have chosen a mirror, I get the following message :

Warning message:
no package 'car' at the repositories in: download.packages(pkgs,
destdir = tmpd, available = available,

I used the CRAN.packages() function to see what was going on, and the
car package is not listed.  I have also tried several mirrors, all
with the same error message.  I have tried to install other packages,
this has been successful.

Any help would be great.

Jeremy

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] POSIX time zone codes

2006-03-08 Thread Don MacQueen
I would have suggested "US/Central", by analogy with "US/Pacific" 
which I use on my unix-like Mac OS X 10.3.9 system. I don't know what 
might be "common knowledge" in UNIX circles, but here's a bit of 
information from my system.

It has a directory named /usr/share/zoneinfo.

zoneinfo[166]% pwd
/usr/share/zoneinfo

zoneinfo[167]% ls | grep DT
CST6CDT
EST5EDT
MST7MDT
PST8PDT

There are files named GMT, UTC, GB, and a number of others.

There are numerous subdirectories, including for example, "Asia", 
"Australia", "Europe", "US".

zoneinfo[168]% cd US
US[169]% ls
./  AleutianEast-IndianaIndiana-Starke  Pacific
../ Arizona Eastern MichiganPacific-New
Alaska  Central Hawaii  MountainSamoa

This suggests to me that valid values for the tz argument are 
relative paths to files /usr/share/zoneinfo, but I don't know that 
for a fact.

-Don

At 9:22 AM -0500 3/7/06, Jason Horn wrote:
>Thank you again Gabor, that did the trick.  Any thoughts on where I 
>go go for a reference for these time codes?  Where did you get 
>"CDT6CST" from?  Or is this just one of those things that is "common 
>knowledge" in UNIX circles.
>
>To the R developers:  I recommend a sentence be added to the manual 
>for as.POSIXxx such as "vales for tz are system dependent, examples 
>for common systems are."
>
>
>
>
>Thanks!
>
>
>On Mar 7, 2006, at 9:00 AM, Gabor Grothendieck wrote:
>
>>  Only "" and "GMT" are really guaranteed to work on all systems
>>  since the time zones are system dependent but try: "CDT6CST"
>>  and see if that works on your system.
>>
>>
>>  On 3/7/06, Jason Horn <[EMAIL PROTECTED]> wrote:
>>>  Whoops,
>>>
>>>  [EDIT]
>>>
>>>  as.POSIX(x, tz="UTC")  ... works, gives UTC times
>>>  as.POSIX(x, tz="EST")  ... works, gives EST times
>>>
>>>  as.POSIX(x, tz="CST")  ... does NOT work, gives UTC times
>>>
>>>  [/EDIT]
>>>
>>>  On Mar 7, 2006, at 8:05 AM, Jason Horn wrote:
>>>
  as.POSIX(x, tz="UTC")  ... works, gives UTC times
  as.POSIX(x, tz="UTC")  ... works, gives EST times

  as.POSIX(x, tz="CST")  ... does NOT work, gives UTC times
>>>
>>>
>>> [[alternative HTML version deleted]]
>>>
>>>  __
>>>  R-help@stat.math.ethz.ch mailing list
>>>  https://stat.ethz.ch/mailman/listinfo/r-help
>>>  PLEASE do read the posting guide! http://www.R-project.org/posting-
>>>  guide.html
>>>
>>
>
>__
>R-help@stat.math.ethz.ch mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


-- 
--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] bug in map('world') ?

2006-03-08 Thread Gregory Snow
There is a function recenter.Map in the TeachingDemos package that can
be used with the maptools packages (rather than the maps package) to
move polygons around for better looking maps.  The original idea was to
put the Alaskan islands on the left of the map rather than having some
of them on the far right, but it also works to creat a pacific centric
map or other things.  It is still pretty basic as a function, if anyone
wants to improve it (it could probably use a lot of improvement) they
are more than welcome.

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Ray Brownrigg
> Sent: Wednesday, March 08, 2006 12:40 PM
> To: [EMAIL PROTECTED]; r-help@stat.math.ethz.ch
> Subject: Re: [R] bug in map('world') ?
> 
> I seem to recall I came across something similar recently 
> (well, relatively recently, perhaps a couple of years ago).  
> The problem is related to how the code handles polygons that 
> are split across the map boundaries.  As I recall, the fix 
> was to modify the polygons that straddle the date line.  I'll 
> have to delve into this again, but don't hold your breath, sorry.
> 
> Ray Brownrigg
> 
> > From: Joerg van den Hoff <[EMAIL PROTECTED]>
> > 
> > hi,
> > 
> > did'nt see anything in the archive:
> > 
> > map('world',pro='rectangular',para=0)
> > 
> > yields a strange artifact (horizontal bar) extending over the whole 
> > map at a certain latitude range (approx 65 deg. north), whereas
> > 
> > map('world',pro='rectangular',para=180)
> > 
> > (which should be the same) does not show the artifact.
> > 
> > the artifact shows up in other projections as well, e.g.
> > 
> > map('world',pro='bonne',para=45)
> > 
> > 
> > which seems to show that the problem might be in the data 
> base of the 
> > map (i.e. polygon definition)??
> > 
> > 
> > any ideas?
> > 
> > regards,
> > 
> > joerg
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] How to plot the xaxis label at 45 degree angle?

2006-03-08 Thread Lisa Wang
Hello there,

I would like to plot a graph with the x axis's label displayed at a 45
angle to the x axis instead of horizontal to it as the label is very
long. What should I do?

Thank you for your help in advance

Lisa Wang 
Princess Margaret Hospital
Toronto, Ca

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Adding polygons to a barplot

2006-03-08 Thread Marc Schwartz (via MN)
On Wed, 2006-03-08 at 14:02 -0500, Jamieson Cobleigh wrote:
> I have a barplot I have created using barplot2 and I have been able to
> add points and lines (using the points and lines methods,
> respectively).  I now need to add some polygons (triangles in
> particular), that I want to be shaded to match bars in the plot.  I
> can get the coordinates of the corners of the triangles, but don't
> know how to draw the triangles.  I know there is the grid.polygon
> method, but I don't know how to get it to draw on my plot.  Any help
> would be appreciated.
> 
> Thanks!
> 
> Jamie

There is a polygon() function in base R graphics. It supports a
'density' argument in the same fashion as barplot()/barplot2().

See ?polygon

help.search("polygon") would have gotten you there.

HTH,

Marc Schwartz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Debugging a program written in the R language

2006-03-08 Thread Ferdinand Alimadhi
The simplest way:

1. Write the program using your favorite text editor and save it i.e. 
c:/try.R
2. Open R and write source("c:/try.R")

See "source" help (?source) for more information

I would recommend that, even you are a windows user, you use the emacs 
package for statistic (ESS). There are also other "IDE"s available.
In this respect I actually do not see any difference between R and other 
programming languages

HTH



Thomas L Jones wrote:

>>From Tom:
>
>The subject is debugging a program written in the R language,under 
>Windows. (Sorry, but I do not know either the Apple OS or *nix.) A 
>computer program will usually not work on the first try, if only 
>because of the risk of typos. Instead, it must be debugged. Roughly, 
>here is the sequence:
>
>(1) One codes a program using the R language,  and stores it on the 
>hard drive, using some particular editor.
>
>(2) The program is fed to the R software, together with test data, 
>etc.
>
>(3) A test computation is run and bugs are spotted.
>
>(4) The program is corrected, using an editor, and the revised version 
>is stored on the hard drive.
>
>(5) The sequence goes back to step (2) and is repeated until the 
>program hopefully works.
>
>Unfortunately, the documentation doesn't really explain how to do all 
>of this, or if it is explained in the documentation, I can't find it.
>
>Reading between the lines a bit, I infer that you are supposed to be 
>able to use something called a History file, then sort of work 
>backward in the code and make corrections. I never got it to work for 
>me. Also, it is unclear how you would handle code entered six weeks or 
>six months ago.
>
>That is the bad news; the good news is that some kind soul told me 
>about a key trick; prepare the program in Windows text format (.txt) 
>and copy it and paste it into the console. The program will now run 
>from a user-defined "wrapper" or "driver" function.
>
>I am aware that there is an editor called Emacs which you can use if 
>you are a member of the *nix community, which I am not.
>
>Question: How are you -supposed- to debug a program which you have 
>written in the R language?
>
>Tom
>Thomas L. Jones, Ph.D., Computer Science
>
>__
>R-help@stat.math.ethz.ch mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>
>  
>


-- 
Ferdinand Alimadhi
Programmer / Analyst
Harvard University
The Institute for Quantitative Social Science
(617) 496-0187
[EMAIL PROTECTED]
www.iq.harvard.edu

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Debugging a program written in the R language

2006-03-08 Thread roger bos
You could always use notepad, but there are better solutions.  There are
many text editors which will send the commands to R for you and return the
results and also offer syntax highlighting.  I like Tinn-R.  Xemacs is
probably the best, but its hard to learn (IMHO) and I have not taken the
time to do so.  There is also a host of GUIs that people are developing for
R.  Take a look at:
http://www.sciviews.org/_rgui/

I suggest R commander.
library(Rcmdr)
You will need to download it from CRAN.

HTH,

Roger

On 3/8/06, Thomas L Jones <[EMAIL PROTECTED]> wrote:
>
> >From Tom:
>
> The subject is debugging a program written in the R language,under
> Windows. (Sorry, but I do not know either the Apple OS or *nix.) A
> computer program will usually not work on the first try, if only
> because of the risk of typos. Instead, it must be debugged. Roughly,
> here is the sequence:
>
> (1) One codes a program using the R language,  and stores it on the
> hard drive, using some particular editor.
>
> (2) The program is fed to the R software, together with test data,
> etc.
>
> (3) A test computation is run and bugs are spotted.
>
> (4) The program is corrected, using an editor, and the revised version
> is stored on the hard drive.
>
> (5) The sequence goes back to step (2) and is repeated until the
> program hopefully works.
>
> Unfortunately, the documentation doesn't really explain how to do all
> of this, or if it is explained in the documentation, I can't find it.
>
> Reading between the lines a bit, I infer that you are supposed to be
> able to use something called a History file, then sort of work
> backward in the code and make corrections. I never got it to work for
> me. Also, it is unclear how you would handle code entered six weeks or
> six months ago.
>
> That is the bad news; the good news is that some kind soul told me
> about a key trick; prepare the program in Windows text format (.txt)
> and copy it and paste it into the console. The program will now run
> from a user-defined "wrapper" or "driver" function.
>
> I am aware that there is an editor called Emacs which you can use if
> you are a member of the *nix community, which I am not.
>
> Question: How are you -supposed- to debug a program which you have
> written in the R language?
>
> Tom
> Thomas L. Jones, Ph.D., Computer Science
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
>

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Debugging a program written in the R language

2006-03-08 Thread Michael H. Prager
 From the R prompt, try typing

?source

A good editor, specialized for working with R source on Windows, is Tinn-R.



on 3/8/2006 4:12 PM Thomas L Jones said the following:
> >From Tom:
>
> The subject is debugging a program written in the R language,under 
> Windows. (Sorry, but I do not know either the Apple OS or *nix.) A 
> computer program will usually not work on the first try, if only 
> because of the risk of typos. Instead, it must be debugged. Roughly, 
> here is the sequence:
>
> (1) One codes a program using the R language,  and stores it on the 
> hard drive, using some particular editor.
>
> (2) The program is fed to the R software, together with test data, 
> etc.
>
> (3) A test computation is run and bugs are spotted.
>
> (4) The program is corrected, using an editor, and the revised version 
> is stored on the hard drive.
>
> (5) The sequence goes back to step (2) and is repeated until the 
> program hopefully works.
>
> Unfortunately, the documentation doesn't really explain how to do all 
> of this, or if it is explained in the documentation, I can't find it.
>
> Reading between the lines a bit, I infer that you are supposed to be 
> able to use something called a History file, then sort of work 
> backward in the code and make corrections. I never got it to work for 
> me. Also, it is unclear how you would handle code entered six weeks or 
> six months ago.
>
> That is the bad news; the good news is that some kind soul told me 
> about a key trick; prepare the program in Windows text format (.txt) 
> and copy it and paste it into the console. The program will now run 
> from a user-defined "wrapper" or "driver" function.
>
> I am aware that there is an editor called Emacs which you can use if 
> you are a member of the *nix community, which I am not.
>
> Question: How are you -supposed- to debug a program which you have 
> written in the R language?
>
> Tom
> Thomas L. Jones, Ph.D., Computer Science
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>   

-- 

Michael Prager, Ph.D.
Population Dynamics Team, NMFS SE Fisheries Science Center
NOAA Center for Coastal Fisheries and Habitat Research
Beaufort, North Carolina  28516
http://shrimp.ccfhrb.noaa.gov/~mprager/
Opinions expressed are personal, not official.  No
government endorsement of any product is made or implied.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Debugging a program written in the R language

2006-03-08 Thread Liaw, Andy
Firstly, some corrections:

- It's best to write code as functions in R, just like most sensible
languages.  You can then debug the functions.

- (X)Emacs has versions for Windows.  Prof. John Fox has kindly made
available some instructions on how Xemacs/ESS can be set up on Windows.

Now, Roger Peng has kindly made available a note on debugging R code:
http://www.biostat.jhsph.edu/~rpeng/docs/R-debug-tools.pdf

Also, for compiled code linked to R, Duncan Murdoch's notes will be helpful:
http://www.stats.uwo.ca/faculty/murdoch/software/debuggingR/

Finally, there's a `debug' package on CRAN with a R News article describing
it.

Andy

From: Thomas L Jones
> 
> >From Tom:
> 
> The subject is debugging a program written in the R language,under 
> Windows. (Sorry, but I do not know either the Apple OS or *nix.) A 
> computer program will usually not work on the first try, if only 
> because of the risk of typos. Instead, it must be debugged. Roughly, 
> here is the sequence:
> 
> (1) One codes a program using the R language,  and stores it on the 
> hard drive, using some particular editor.
> 
> (2) The program is fed to the R software, together with test data, 
> etc.
> 
> (3) A test computation is run and bugs are spotted.
> 
> (4) The program is corrected, using an editor, and the 
> revised version 
> is stored on the hard drive.
> 
> (5) The sequence goes back to step (2) and is repeated until the 
> program hopefully works.
> 
> Unfortunately, the documentation doesn't really explain how to do all 
> of this, or if it is explained in the documentation, I can't find it.
> 
> Reading between the lines a bit, I infer that you are supposed to be 
> able to use something called a History file, then sort of work 
> backward in the code and make corrections. I never got it to work for 
> me. Also, it is unclear how you would handle code entered six 
> weeks or 
> six months ago.
> 
> That is the bad news; the good news is that some kind soul told me 
> about a key trick; prepare the program in Windows text format (.txt) 
> and copy it and paste it into the console. The program will now run 
> from a user-defined "wrapper" or "driver" function.
> 
> I am aware that there is an editor called Emacs which you can use if 
> you are a member of the *nix community, which I am not.
> 
> Question: How are you -supposed- to debug a program which you have 
> written in the R language?
> 
> Tom
> Thomas L. Jones, Ph.D., Computer Science
> 
> __
> R-help@stat.math.ethz.ch mailing list 
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> 
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] [Q] BIC as a goodness-of-fit stat

2006-03-08 Thread Vincent Zoonekynd
The following article gives some motivations for the BIC and
some rules to interpret a difference of BICs (if I recall correctly,
over 10: very strong difference, between 6 and 10: strong
difference, between 2 and 6: some difference):

  A.E. Raftery, Bayesian model selection in social research
  http://www.stat.washington.edu/tech.reports/bic.ps

Regards,

-- Vincent

On 06/03/06, Young-Jin Lee <[EMAIL PROTECTED]> wrote:
> Dear R-List
>
> I have a question about how to interpret BIC as a goodness-of-fit statistic.
> I was trying to use "EMclust" and other "mclust" library and found that BIC
> was used as a goodness-of-fit statistic.
> Although I know that smaller BIC indicates a better fit, it is not clear to
> me how good a fit is by reading a BIC number. Is there a standard way of
> interpreting a BIC value?
>
> Thanks in advance.
>
> [[alternative HTML version deleted]]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Debugging a program written in the R language

2006-03-08 Thread Thomas L Jones
>From Tom:

The subject is debugging a program written in the R language,under 
Windows. (Sorry, but I do not know either the Apple OS or *nix.) A 
computer program will usually not work on the first try, if only 
because of the risk of typos. Instead, it must be debugged. Roughly, 
here is the sequence:

(1) One codes a program using the R language,  and stores it on the 
hard drive, using some particular editor.

(2) The program is fed to the R software, together with test data, 
etc.

(3) A test computation is run and bugs are spotted.

(4) The program is corrected, using an editor, and the revised version 
is stored on the hard drive.

(5) The sequence goes back to step (2) and is repeated until the 
program hopefully works.

Unfortunately, the documentation doesn't really explain how to do all 
of this, or if it is explained in the documentation, I can't find it.

Reading between the lines a bit, I infer that you are supposed to be 
able to use something called a History file, then sort of work 
backward in the code and make corrections. I never got it to work for 
me. Also, it is unclear how you would handle code entered six weeks or 
six months ago.

That is the bad news; the good news is that some kind soul told me 
about a key trick; prepare the program in Windows text format (.txt) 
and copy it and paste it into the console. The program will now run 
from a user-defined "wrapper" or "driver" function.

I am aware that there is an editor called Emacs which you can use if 
you are a member of the *nix community, which I am not.

Question: How are you -supposed- to debug a program which you have 
written in the R language?

Tom
Thomas L. Jones, Ph.D., Computer Science

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Accessing functions in a library

2006-03-08 Thread Seth Falcon
"Michael Conklin" <[EMAIL PROTECTED]> writes:

> I am trying to write a modified function to plot an rpart object.  By
> using getS3method I can see the plot and text code that I want to
> modify.  Since I don't want to modify the package, I create a new
> function to plot the rpart object.  The problem is that the original
> function calls many rpart specific functions that are only visible
> inside the rpart namespace.  Therefore, when I call my modified function
> it does not find the rpart specific functions and gives me an error.
> How can I make a new function that is a small modification of an
> existing package function and let the modified function access the
> hidden package functions?

You can subvert namespace protection using ":::".  So if there is a
hidden function g in package foo, use foo:::g().

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] RES: survival

2006-03-08 Thread Paulo Brando
Dear Thomas,

The head of my dataset

> head(wsuv)
  parcel  sp time censo treatment
species
1 S8 Poecilanthe effusa ( Hub. ) Ducke. 1   1   1  1
2 S8 Poecilanthe effusa ( Hub. ) Ducke. 1   1   1  1
3 S8 Poecilanthe effusa ( Hub. ) Ducke. 1   1   1  1
4 S8 Poecilanthe effusa ( Hub. ) Ducke. 1   1   1  1
5 S8 Poecilanthe effusa ( Hub. ) Ducke. 1   1   1  1
6 S8 Poecilanthe effusa ( Hub. ) Ducke. 1   1   1  1
...
144361

>  summary(model.fit) # just one species from one treatment shown below

Call: survfit(formula = Surv(time, censo) ~ treatment + species, data =
wsuv)

treatment=0, species=1 
 time n.risk n.event survival std.err lower 95% CI upper 95% CI
1  15440 3860.975 0.001260.9730.977
2  15054 3360.953 0.001700.9500.957
3  14668 3020.934 0.002000.9300.938
4  14282 2960.914 0.002260.9100.919
5  13896 2810.896 0.002470.8910.901
6  13510 2640.878 0.002640.8730.883
7  13124 2510.861 0.002800.8560.867
8  12738 2320.846 0.002930.8400.852
9  12352 2160.831 0.003050.8250.837
   10  11966 2060.817 0.003150.8110.823
   11  11580 1900.803 0.003250.7970.810
   12  11194 1790.790 0.003330.7840.797
   13  10808 1670.778 0.003410.7720.785
   14  10422 1670.766 0.003490.7590.773
   15  10036 1450.755 0.003560.7480.762
   16   9650 1420.744 0.003630.7370.751
   17   9264 1350.733 0.003690.7260.740
   18   8878 1220.723 0.003750.7150.730
   19   8492  990.714 0.003800.7070.722
   20   8106  840.707 0.003850.6990.714
   21   7720  680.701 0.003890.6930.708
   22   7334  660.694 0.003930.6870.702
   23   6948  510.689 0.003970.6810.697
   24   6562  400.685 0.004000.6770.693
   25   6176  380.681 0.004030.6730.689
   26   5790  370.676 0.004070.6690.684
   27   5404  330.672 0.004110.6640.680
   28   5018  310.668 0.004150.6600.676
   29   4632  260.664 0.004190.6560.673
   30   4246  220.661 0.004230.6530.669
   31   3860  150.658 0.004270.6500.667
   32   3474  140.656 0.004310.6470.664
   33   3088  140.653 0.004360.6440.661
   34   2702  130.650 0.004430.6410.658
   35   2316  120.646 0.004510.6380.655
   36   1930  110.643 0.004620.6340.652
   37   1544  120.638 0.004800.6280.647
   38   1158  100.632 0.005070.6220.642
   39772   90.625 0.005570.6140.636
   40386   80.612 0.007090.5980.626

I don't get why with 8 leaves remaining (out of 384), the survival is
about 0.6???


Call: survfit(formula = Surv(time, censo) ~ 1, data = wsuv)

  n  events  median 0.95LCL 0.95UCL 
 144361   58830  40  39  40

 
> survfit(Surv(timee,ind)~sp2,data=wsuv)
Call: survfit(formula = Surv(timee, ind) ~ sp2, data = wsuv)

  n events median 0.95LCL 0.95UCL
sp2=1 32226  10856Inf Inf Inf
sp2=2 23370   9824 38  37  39
sp2=3 31201  13275 40  39  41
sp2=4 28044  10401 41  40  41
sp2=5 29520  14474 31  30  31


> survfit(Surv(timee,ind)~parcel2,data=wsuv)
Call: survfit(formula = Surv(timee, ind) ~ parcel2, data = wsuv)

  n events median 0.95LCL 0.95UCL
parcel2=0 68183  28116 38  38  38
parcel2=1 76178  30714 41  41  41


> survfit(Surv(timee,ind)~interaction(parcel2,sp2),data=wsuv)
Call: survfit(formula = Surv(timee, ind) ~ interaction(parcel2, sp2), 
data = wsuv)

  n events median 0.95LCL 0.95UCL
interaction(parcel2, sp2)=0.1 15826   5070Inf Inf Inf
interaction(parcel2, sp2)=1.1 16400   5786Inf Inf Inf
interaction(parcel2, sp2)=0.2  9430   3935 38  37  39
interaction(parcel2, sp2)=1.2 13940   5889 38  37  39
interaction(parcel2, sp2)=0.3 14678   6021 40  39  41
interaction(parcel2, sp2)=1.3 16523   7254 39  37  41
interaction(parcel2, sp2)=0.4 14473   5758 38  37  39
interaction(parcel2, sp2)=1.4 13571   4643Inf Inf Inf
interaction(parcel2, sp2)=0.5 13776   7332 2

[R] Accessing functions in a library

2006-03-08 Thread Michael Conklin
I am trying to write a modified function to plot an rpart object.  By
using getS3method I can see the plot and text code that I want to
modify.  Since I don't want to modify the package, I create a new
function to plot the rpart object.  The problem is that the original
function calls many rpart specific functions that are only visible
inside the rpart namespace.  Therefore, when I call my modified function
it does not find the rpart specific functions and gives me an error.
How can I make a new function that is a small modification of an
existing package function and let the modified function access the
hidden package functions?

 

Michael Conklin

Chief Methodolgist - Advanced Analytics

 

MarketTools, Inc.

6465 Wayzata Blvd. Suite 170

Minneapolis, MN 55426

Tel: 952.417.4719 | Mobile:612.201.8978

[EMAIL PROTECTED] 


 

MarketTools(r)http://www.markettools.com
 

 

This e-mail and any attachments may contain privileged, confidential or
proprietary information. If you are not the intended recipient, be aware
that any review, copying, or distribution of this e-mail or any
attachment is strictly prohibited. If you have received this e-mail in
error, please return it to the sender immediately, and permanently
delete the original and any copies from your system. Thank you for your
cooperation.

 


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] memory limits in Windows

2006-03-08 Thread johnston
Hello,

I apologize, since I know that variations of this question come up often, but I
have searched both the archives and the rw-FAQ, and haven't had any luck solving
my problem.  I am trying to run some generalized additive mixed models (using
the mgcv library) with some large datasets, and get error messages regarding
memory allocation.  An example dataset has around 45,000 points contributed by
about 2200 individuals (so about 20 observations per individual).  Errors within
individuals are modeled as AR(1).  Currently, I can run the models on a random
subset of about 25% of the data in a reasonable amount of time, so I think that
memory is the only major issue here. 

I have used the "--max-mem-size" command line option, and set it to the maximum
allowable, which is 3Gb (any larger and I get a message that it is too large and
is ignored when I open R).  I also run the R command memory.limit(4095), again
the maximum allowable without receiving a "don't be silly" message.  

I am running this on a brand new computer with Windows XP-64 OS and 4GB RAM (it
was bought primarily to be able to handle these models!)  Do I have any options
to increase memory?  Any advice would be hugely appreciated.

Thanks so much for any input.

Karissa Johnston

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] bug in map('world') ?

2006-03-08 Thread Ray Brownrigg
I seem to recall I came across something similar recently (well,
relatively recently, perhaps a couple of years ago).  The problem is
related to how the code handles polygons that are split across the map
boundaries.  As I recall, the fix was to modify the polygons that
straddle the date line.  I'll have to delve into this again, but don't
hold your breath, sorry.

Ray Brownrigg

> From: Joerg van den Hoff <[EMAIL PROTECTED]>
> 
> hi,
> 
> did'nt see anything in the archive:
> 
> map('world',pro='rectangular',para=0)
> 
> yields a strange artifact (horizontal bar) extending over the whole map 
> at a certain latitude range (approx 65 deg. north), whereas
> 
> map('world',pro='rectangular',para=180)
> 
> (which should be the same) does not show the artifact.
> 
> the artifact shows up in other projections as well, e.g.
> 
> map('world',pro='bonne',para=45)
> 
> 
> which seems to show that the problem might be in the data base of the 
> map (i.e. polygon definition)??
> 
> 
> any ideas?
> 
> regards,
> 
> joerg

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Adding polygons to a barplot

2006-03-08 Thread Jamieson Cobleigh
I have a barplot I have created using barplot2 and I have been able to
add points and lines (using the points and lines methods,
respectively).  I now need to add some polygons (triangles in
particular), that I want to be shaded to match bars in the plot.  I
can get the coordinates of the corners of the triangles, but don't
know how to draw the triangles.  I know there is the grid.polygon
method, but I don't know how to get it to draw on my plot.  Any help
would be appreciated.

Thanks!

Jamie

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] function gdist, dist and vegdist in mvpart

2006-03-08 Thread Gavin Simpson
On Wed, 2006-03-08 at 19:12 +0100, Sina Muster wrote:
> Dear R community,
> 
> What I am missing? Where is the problem?

That dist returns an object of class "dist", even if you use upper =
TRUE. ?dist says:

   upper: logical value indicating whether the upper triangle of the
  distance matrix should be printed by 'print.dist'.

Beware confusing the printed representation of the object with what the
object actually looks like.

The reason gdist works, is that is does not do what its help page says:

Value:

 Should be interchangeable with 'dist' and returns a distance
 object of the same type.

Which the following sequence illustrates does not happen (using the
spider dataset from mvpart):

> spider.dist <- gdist(spider[1:12,])
> class(spider.dist)
[1] "dist"
> spider.dist <- gdist(spider[1:12,], full = TRUE)
> class(spider.dist)
[1] "matrix"

You might consider contacting the maintainer and submit a bug report to
him about this...

So even though you thought the two (dist and gdist) were the same, they
are not.

To solve this problem then, a simple modification to your mvpart call is
required, as shown below:

> spider.dist2 <- dist(spider[, 1:12], method="minkowski", p=1, diag=T,
upper=T)
> class(spider.dist2)
[1] "dist"
> mvpart(spider.dist2~herbs+reft+moss+sand+twigs+water,spider)
Error in model.frame(formula, rownames, variables, varnames, extras,
extranames,  :
variable lengths differ
> mvpart(as.matrix(spider.dist2)~herbs+reft+moss+sand+twigs
+water,spider)

The key is the as.matrix() wrapped around the lhs of the formula.

It would have been useful if you'd included an reproducible example
using a built in data set - like I do above. Readers of the list wont
have your "ba12", so extra work is required to investigate the problem.

> 
> Thanks for helping.
> Sina

You're welcome, hope this helps.

Gav

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson [T] +44 (0)20 7679 5522
ENSIS Research Fellow [F] +44 (0)20 7679 7565
ENSIS Ltd. & ECRC [E] gavin.simpsonATNOSPAMucl.ac.uk
UCL Department of Geography   [W] http://www.ucl.ac.uk/~ucfagls/cv/
26 Bedford Way[W] http://www.ucl.ac.uk/~ucfagls/
London.  WC1H 0AP.
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Want to fit random intercept in logistic regression (testing lmer and glmmML)

2006-03-08 Thread Paul Johnson
Greetings.  Here is sample code, with some comments. It shows how I
can simulate data and estimate glm with binomial family when there is
no individual level random error, but when I add random error into the
linear predictor, I have a difficult time getting reasonable estimates
of the model parameters or the variance component.

There are no clusters here, just individual level responses, so
perhaps I am misunderstanding the translation from my simple mixed
model to the syntax of glmmML and lmer.  I get roughly the same
(inaccurate) fixed effect parameter estimates from glmmML and lmer,
but the information they give on the variance components is quite
different.

Thanks in advance.

Now I paste in the example code

### Paul Johnson <[EMAIL PROTECTED]>
### 2006-03-08

N <- 1000
A <- -1
B <- 0.3


x <- 1 + 10 * rnorm(N)
eta <- A + B * x

pi <- exp(eta)/(1+exp(eta))

myunif <- runif(N)

y <- ifelse(myunif < pi, 1, 0)

plot(x,y, main=bquote( eta[i] == .(A) +   .(B) * x[i] ))

text ( 0.5*max(x), 0.5, expression( Prob( y[i] == 1) == frac( 1 , 1 +
exp(-eta[i] 


myglm1 <- glm ( y ~ x, family=binomial(link="logit") )
summary(myglm1)

## Just for fun
myglm2 <- glm(y~x, family=quasibinomial)
summary(myglm2)



### Mixed model: random intercept with large variance

eta <- A + B * x + 5 * rnorm(N)
pi <- exp(eta)/(1+exp(eta))
myunif <- runif(N)
y <- ifelse(myunif < pi, 1, 0)

plot(x,y, main=bquote( eta[i] == .(A) +   .(B) * x[i] + u[i]))

text ( 0.5*max(x), 0.5, expression( Prob( y[i] == 1) == frac( 1 , 1 +
exp(-eta[i] 

### Parameter estimates go to hell, as expected
myglm3 <- glm ( y ~ x, family=binomial(link="logit") )
summary(myglm3)

### Why doesn't quasibinomial show more evidence of the random intercept?
myglm4 <- glm(y~x, family=quasibinomial)
summary(myglm4)



# Can I estimate with lmer?


library(lme4)

### With no repeated observations, what does lmer want?
### Clusters of size 1 ?
### In lme, I'd need random= ~ 1

idnumber <- 1: length(y)

mylmer1 <- lmer( y ~ x + ( 1 | idnumber ), family=binomial, method="Laplace" )
summary(mylmer1)


### Glmm wants clusters, and I don't have any clusters with more than
1 observation
###

library(glmmML)


myglmm1 <- glmmML(y~x, family=binomial, cluster = idnumber )

summary(myglmm1)

--
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] function gdist, dist and vegdist in mvpart

2006-03-08 Thread Sina Muster
Dear R community,

I am analyzing plant communities with the function mvpart, using a 
dissimilarit matrix as input. The matrix is calculated with the funtion 
gdist.

fit <- mvpart(gdist (ba12[,18:29], meth="maximum", full=TRUE, 
sq=F) ~ beers + slope_dem + elev_dem+ plc_dem + 
pr_curv+ 
+curv+max_depth+doc_rocks+ abandon+land_use+ca_old,
data=ba12, xv="p")

This works fine. Now I would like to use other dissimilarity measures as 
can be found in the function dist (STATS) or vegdist (VEGAN).

De'Ath notes that gdist should be interchangeable with dist - but I 
receive following error message:

 > fit21 <- mvpart(dist (ba12[,18:29], meth="minkowski", diag=T, 
upper=T p=2) ~
beers + slope_dem + elev_dem+ plc_dem + pr_curv+
 
curv+max_depth+doc_rocks+ abandon+land_use+ca_old,
data=ba12, xv="p")
Error in model.frame(formula, rownames, variables, varnames, extras, 
extranames,  :
Variable lengths differ


When computing the matrices directly...

ba12.dist1 <- gdist (ba12[,18:29], method="maximum", full=T, sq=F)
ba12.dist2 <- vegdist (ba12[,18:29], method="jaccard", diag=T, upper=T)
ba12.dist3 <- dist (ba12[,18:29], method="minkowski", p=1, diag=T, upper=T)

and looking at them, there appears to be no difference though only 
ba12.dist1 works with mvpart. With the other two I get the same error 
message as above.

What I am missing? Where is the problem?

Thanks for helping.
Sina

University of Potsdam, Germany
Department of Geoecology

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] How to do it without for loops?

2006-03-08 Thread Thomas Lumley

On Wed, 8 Mar 2006, Gabor Grothendieck wrote:


Actually one problem with both of the solutions is determining
that they are correct.  From that viewpoint, although not as
elegant or fun, a more straightforward
translation of the original question might actually be preferable


Yes. Although for the application where I encounter this (sandwich 
variance estimators), the proof that the two versions are the same is of 
independent interest, since it shows the relationship between the cluster 
jackknife, delta-betas, and the sandwich estimator.


-thomas




(Thomas also already mentioned the computation in f):

f <- function(x) crossprod(crossprod(x[[3]], as.matrix(x[,1:2])))
mm <-by(dat, dat$id, f)
mm[[1]] + mm[[2]] + mm[[3]]

or the last line could be replaced with these two if you don't know
that there are necessarily three components:

result <- mm[[1]]
result[] <- do.call("mapply", c(sum, mm))

One thing that this bought out for me is that R does not have
a parallel to pmax for sum.  If one had psum of if "+" allowed
more than 2 arguments one could have done this instead of the
two lines involving result above:

 do.call("psum", mm)  # if psum analog to pmax were available
 do.call("+", mm)  # if + allowed > 2 arguments


On 3/8/06, Thomas Lumley <[EMAIL PROTECTED]> wrote:

On Wed, 8 Mar 2006, Gabor Grothendieck wrote:


Thomas' solution is better but thought this might be of interest
anyways since it can be written closer to mathematical notation.
That is, the required expression can be written in the
following equivalent way for a suitable matrix A:

X' diag(u) A' A diag(u) X


Um. n x n matrix? O(n^2) storage? O(n^3) execution time? Yes, it's fine
when n=20, or even 200,  but still...

   -thomas




where diag(u) is a diagonal matrix with u along the diagonal
as in the R diag function, spaces refer to matrix multiplication
and ' means transpose.

Thus we have:

A <- outer(unique(dat$id), dat$id, "==")
crossprod(A %*% diag(dat$u) %*% as.matrix(dat[1:2]))


On 3/8/06, Thomas Lumley <[EMAIL PROTECTED]> wrote:

On Wed, 8 Mar 2006, ronggui wrote:


Thank you for all .

One more question.How can I calculate these efficiently?

set.seed(100)
dat<-data.frame(x1=rnorm(20),x2=rnorm(20),u=rnorm(20),id=round(2*runif(20)))
# In this example,id's elements are  0,1,2.
y<-list()
for (i in 0:2){
X<-as.matrix(subset(dat,id==i,c("x1","x2")))
u<-as.matrix(subset(dat,id==i,c("u")))
y[[i+1]]<-t(X)%*%u%*%t(u)%*%X
}
y[[1]]+y[[2]]+y[[3]]



People have already told you about crossprod, so crossprod(crossprod(X,u))
would seem an obvious improvement over the matrix multiplications.

There is a better solution, though.

Xu<-dat[,c("x1","x2")]*dat[,"u"]
crossprod( rowsum(Xu, dat$id))

   -thomas



the above code is not elegant.And my second solution to this problem
is using by to get a list.

matlis<-by(dat, dat$id, function(x){
a<-as.matrix(x[,c("x1","x2")])
b<-as.matrix(x[, "u"])
t(a) %*% b  %*% t(b) %*% a
})

S <- matrix(unlist(matlis), 4, length(matlis))
S1 <- matrix(rowSums(S), 2, 2)

The code works ,but I want to ask if there is any other more better
ways to do it? It seems that this kind of computation is quite common.





2006/2/28, Gabor Grothendieck <[EMAIL PROTECTED]>:

Try:

crossprod(x)

or

t(x) %*% x

On 2/28/06, ronggui <[EMAIL PROTECTED]> wrote:

This is the code:

x<-matrix(rnorm(20),5)
y<-list()
for (i in seq(nrow(x))) y[[i]]<-t(x[i,,drop=F])%*%x[i,,drop=F]
y[[1]]+y[[2]]+y[[3]]+y[[4]]+y[[5]]

How can I do it without using for loops?
Thank you in advance!
--
ronggui
Deparment of Sociology
Fudan University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html






--
»ÆÈÙ¹ó
Deparment of Sociology
Fudan University




Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]University of Washington, Seattle

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html




__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html



Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]University of Washington, Seattle



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html



Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do

Re: [R] 'less' for R?

2006-03-08 Thread Romain Francois
Le 08.03.2006 18:03, Federico Calboli a écrit :
> Hi All,
>
> is there an equivalent of the Unix command 'less' (or 'more'), so I can
> look at what's inside a data.frame or a matrix without having it printed
> out on console?
>
> I am using R on Debian Linux and Mac OS 10.4.5
>
> Cheers,
>
> F
>   
Hi,

A cheap way is to use the unix command :
> less <- function(a){
>write.table(a, quote=F, file="/tmp/dataframe.txt")
>system("less /tmp/dataframe.txt")
>}
> less(iris)
Romain

-- 
visit the R Graph Gallery : http://addictedtor.free.fr/graphiques
Discover the R Movies Gallery : http://addictedtor.free.fr/movies
+---+
| Romain FRANCOIS - http://francoisromain.free.fr   |
| Doctorant INRIA Futurs / EDF  |
+---+

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] 'less' for R?

2006-03-08 Thread Christos Hatzis
Or 
fix(data.frame) 
to open it in the spreadsheet-like data editor.

-Christos Hatzis 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Duncan Murdoch
Sent: Wednesday, March 08, 2006 12:26 PM
To: [EMAIL PROTECTED]
Cc: r-help
Subject: Re: [R] 'less' for R?

On 3/8/2006 12:03 PM, Federico Calboli wrote:
> Hi All,
> 
> is there an equivalent of the Unix command 'less' (or 'more'), so I 
> can look at what's inside a data.frame or a matrix without having it 
> printed out on console?
> 
> I am using R on Debian Linux and Mac OS 10.4.5

I think you want page().  head() and tail() are also useful, as equivalents
to the head and tail commands.

Duncan Murdoch

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] How to do it without for loops?

2006-03-08 Thread Gabor Grothendieck
Yes, its horribly inefficient but as I stated
the reason I mentioned it was because it could be translated
more easily from mathematical notation and I did mention I
preferred your solution.

Actually one problem with both of the solutions is determining
that they are correct.  From that viewpoint, although not as
elegant or fun, a more straightforward
translation of the original question might actually be preferable
(Thomas also already mentioned the computation in f):

f <- function(x) crossprod(crossprod(x[[3]], as.matrix(x[,1:2])))
mm <-by(dat, dat$id, f)
mm[[1]] + mm[[2]] + mm[[3]]

or the last line could be replaced with these two if you don't know
that there are necessarily three components:

result <- mm[[1]]
result[] <- do.call("mapply", c(sum, mm))

One thing that this bought out for me is that R does not have
a parallel to pmax for sum.  If one had psum of if "+" allowed
more than 2 arguments one could have done this instead of the
two lines involving result above:

  do.call("psum", mm)  # if psum analog to pmax were available
  do.call("+", mm)  # if + allowed > 2 arguments


On 3/8/06, Thomas Lumley <[EMAIL PROTECTED]> wrote:
> On Wed, 8 Mar 2006, Gabor Grothendieck wrote:
>
> > Thomas' solution is better but thought this might be of interest
> > anyways since it can be written closer to mathematical notation.
> > That is, the required expression can be written in the
> > following equivalent way for a suitable matrix A:
> >
> > X' diag(u) A' A diag(u) X
>
> Um. n x n matrix? O(n^2) storage? O(n^3) execution time? Yes, it's fine
> when n=20, or even 200,  but still...
>
>-thomas
>
>
>
> > where diag(u) is a diagonal matrix with u along the diagonal
> > as in the R diag function, spaces refer to matrix multiplication
> > and ' means transpose.
> >
> > Thus we have:
> >
> > A <- outer(unique(dat$id), dat$id, "==")
> > crossprod(A %*% diag(dat$u) %*% as.matrix(dat[1:2]))
> >
> >
> > On 3/8/06, Thomas Lumley <[EMAIL PROTECTED]> wrote:
> >> On Wed, 8 Mar 2006, ronggui wrote:
> >>
> >>> Thank you for all .
> >>>
> >>> One more question.How can I calculate these efficiently?
> >>>
> >>> set.seed(100)
> >>> dat<-data.frame(x1=rnorm(20),x2=rnorm(20),u=rnorm(20),id=round(2*runif(20)))
> >>> # In this example,id's elements are  0,1,2.
> >>> y<-list()
> >>> for (i in 0:2){
> >>> X<-as.matrix(subset(dat,id==i,c("x1","x2")))
> >>> u<-as.matrix(subset(dat,id==i,c("u")))
> >>> y[[i+1]]<-t(X)%*%u%*%t(u)%*%X
> >>> }
> >>> y[[1]]+y[[2]]+y[[3]]
> >>>
> >>
> >> People have already told you about crossprod, so crossprod(crossprod(X,u))
> >> would seem an obvious improvement over the matrix multiplications.
> >>
> >> There is a better solution, though.
> >>
> >> Xu<-dat[,c("x1","x2")]*dat[,"u"]
> >> crossprod( rowsum(Xu, dat$id))
> >>
> >>-thomas
> >>
> >>
> >>> the above code is not elegant.And my second solution to this problem
> >>> is using by to get a list.
> >>>
> >>> matlis<-by(dat, dat$id, function(x){
> >>> a<-as.matrix(x[,c("x1","x2")])
> >>> b<-as.matrix(x[, "u"])
> >>> t(a) %*% b  %*% t(b) %*% a
> >>> })
> >>>
> >>> S <- matrix(unlist(matlis), 4, length(matlis))
> >>> S1 <- matrix(rowSums(S), 2, 2)
> >>>
> >>> The code works ,but I want to ask if there is any other more better
> >>> ways to do it? It seems that this kind of computation is quite common.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> 2006/2/28, Gabor Grothendieck <[EMAIL PROTECTED]>:
>  Try:
> 
>  crossprod(x)
> 
>  or
> 
>  t(x) %*% x
> 
>  On 2/28/06, ronggui <[EMAIL PROTECTED]> wrote:
> > This is the code:
> >
> > x<-matrix(rnorm(20),5)
> > y<-list()
> > for (i in seq(nrow(x))) y[[i]]<-t(x[i,,drop=F])%*%x[i,,drop=F]
> > y[[1]]+y[[2]]+y[[3]]+y[[4]]+y[[5]]
> >
> > How can I do it without using for loops?
> > Thank you in advance!
> > --
> > ronggui
> > Deparment of Sociology
> > Fudan University
> >
> > __
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! 
> > http://www.R-project.org/posting-guide.html
> >
> 
> >>>
> >>>
> >>> --
> >>> »ÆÈÙ¹ó
> >>> Deparment of Sociology
> >>> Fudan University
> >>>
> >>>
> >>
> >> Thomas Lumley   Assoc. Professor, Biostatistics
> >> [EMAIL PROTECTED]University of Washington, Seattle
> >>
> >> __
> >> R-help@stat.math.ethz.ch mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide! 
> >> http://www.R-project.org/posting-guide.html
> >>
> >>
> >
> > __
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! 
> > http://www.R-project.org/posting-guide.html
> >
>
> Thomas Lumley   Assoc. Professor, Biostatistics
> [EMAIL 

Re: [R] 'less' for R?

2006-03-08 Thread Duncan Murdoch
On 3/8/2006 12:03 PM, Federico Calboli wrote:
> Hi All,
> 
> is there an equivalent of the Unix command 'less' (or 'more'), so I can
> look at what's inside a data.frame or a matrix without having it printed
> out on console?
> 
> I am using R on Debian Linux and Mac OS 10.4.5

I think you want page().  head() and tail() are also useful, as 
equivalents to the head and tail commands.

Duncan Murdoch

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] 'less' for R?

2006-03-08 Thread Federico Calboli
Hi All,

is there an equivalent of the Unix command 'less' (or 'more'), so I can
look at what's inside a data.frame or a matrix without having it printed
out on console?

I am using R on Debian Linux and Mac OS 10.4.5

Cheers,

F

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] How to do it without for loops?

2006-03-08 Thread Thomas Lumley

On Wed, 8 Mar 2006, Gabor Grothendieck wrote:


Thomas' solution is better but thought this might be of interest
anyways since it can be written closer to mathematical notation.
That is, the required expression can be written in the
following equivalent way for a suitable matrix A:

X' diag(u) A' A diag(u) X


Um. n x n matrix? O(n^2) storage? O(n^3) execution time? Yes, it's fine 
when n=20, or even 200,  but still...


-thomas




where diag(u) is a diagonal matrix with u along the diagonal
as in the R diag function, spaces refer to matrix multiplication
and ' means transpose.

Thus we have:

A <- outer(unique(dat$id), dat$id, "==")
crossprod(A %*% diag(dat$u) %*% as.matrix(dat[1:2]))


On 3/8/06, Thomas Lumley <[EMAIL PROTECTED]> wrote:

On Wed, 8 Mar 2006, ronggui wrote:


Thank you for all .

One more question.How can I calculate these efficiently?

set.seed(100)
dat<-data.frame(x1=rnorm(20),x2=rnorm(20),u=rnorm(20),id=round(2*runif(20)))
# In this example,id's elements are  0,1,2.
y<-list()
for (i in 0:2){
X<-as.matrix(subset(dat,id==i,c("x1","x2")))
u<-as.matrix(subset(dat,id==i,c("u")))
y[[i+1]]<-t(X)%*%u%*%t(u)%*%X
}
y[[1]]+y[[2]]+y[[3]]



People have already told you about crossprod, so crossprod(crossprod(X,u))
would seem an obvious improvement over the matrix multiplications.

There is a better solution, though.

Xu<-dat[,c("x1","x2")]*dat[,"u"]
crossprod( rowsum(Xu, dat$id))

   -thomas



the above code is not elegant.And my second solution to this problem
is using by to get a list.

matlis<-by(dat, dat$id, function(x){
a<-as.matrix(x[,c("x1","x2")])
b<-as.matrix(x[, "u"])
t(a) %*% b  %*% t(b) %*% a
})

S <- matrix(unlist(matlis), 4, length(matlis))
S1 <- matrix(rowSums(S), 2, 2)

The code works ,but I want to ask if there is any other more better
ways to do it? It seems that this kind of computation is quite common.





2006/2/28, Gabor Grothendieck <[EMAIL PROTECTED]>:

Try:

crossprod(x)

or

t(x) %*% x

On 2/28/06, ronggui <[EMAIL PROTECTED]> wrote:

This is the code:

x<-matrix(rnorm(20),5)
y<-list()
for (i in seq(nrow(x))) y[[i]]<-t(x[i,,drop=F])%*%x[i,,drop=F]
y[[1]]+y[[2]]+y[[3]]+y[[4]]+y[[5]]

How can I do it without using for loops?
Thank you in advance!
--
ronggui
Deparment of Sociology
Fudan University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html






--
»ÆÈÙ¹ó
Deparment of Sociology
Fudan University




Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]University of Washington, Seattle

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html




__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html



Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] How to do it without for loops?

2006-03-08 Thread Gabor Grothendieck
Thomas' solution is better but thought this might be of interest
anyways since it can be written closer to mathematical notation.
That is, the required expression can be written in the
following equivalent way for a suitable matrix A:

X' diag(u) A' A diag(u) X

where diag(u) is a diagonal matrix with u along the diagonal
as in the R diag function, spaces refer to matrix multiplication
and ' means transpose.

Thus we have:

A <- outer(unique(dat$id), dat$id, "==")
crossprod(A %*% diag(dat$u) %*% as.matrix(dat[1:2]))


On 3/8/06, Thomas Lumley <[EMAIL PROTECTED]> wrote:
> On Wed, 8 Mar 2006, ronggui wrote:
>
> > Thank you for all .
> >
> > One more question.How can I calculate these efficiently?
> >
> > set.seed(100)
> > dat<-data.frame(x1=rnorm(20),x2=rnorm(20),u=rnorm(20),id=round(2*runif(20)))
> > # In this example,id's elements are  0,1,2.
> > y<-list()
> > for (i in 0:2){
> > X<-as.matrix(subset(dat,id==i,c("x1","x2")))
> > u<-as.matrix(subset(dat,id==i,c("u")))
> > y[[i+1]]<-t(X)%*%u%*%t(u)%*%X
> > }
> > y[[1]]+y[[2]]+y[[3]]
> >
>
> People have already told you about crossprod, so crossprod(crossprod(X,u))
> would seem an obvious improvement over the matrix multiplications.
>
> There is a better solution, though.
>
> Xu<-dat[,c("x1","x2")]*dat[,"u"]
> crossprod( rowsum(Xu, dat$id))
>
>-thomas
>
>
> > the above code is not elegant.And my second solution to this problem
> > is using by to get a list.
> >
> > matlis<-by(dat, dat$id, function(x){
> > a<-as.matrix(x[,c("x1","x2")])
> > b<-as.matrix(x[, "u"])
> > t(a) %*% b  %*% t(b) %*% a
> > })
> >
> > S <- matrix(unlist(matlis), 4, length(matlis))
> > S1 <- matrix(rowSums(S), 2, 2)
> >
> > The code works ,but I want to ask if there is any other more better
> > ways to do it? It seems that this kind of computation is quite common.
> >
> >
> >
> >
> >
> > 2006/2/28, Gabor Grothendieck <[EMAIL PROTECTED]>:
> >> Try:
> >>
> >> crossprod(x)
> >>
> >> or
> >>
> >> t(x) %*% x
> >>
> >> On 2/28/06, ronggui <[EMAIL PROTECTED]> wrote:
> >>> This is the code:
> >>>
> >>> x<-matrix(rnorm(20),5)
> >>> y<-list()
> >>> for (i in seq(nrow(x))) y[[i]]<-t(x[i,,drop=F])%*%x[i,,drop=F]
> >>> y[[1]]+y[[2]]+y[[3]]+y[[4]]+y[[5]]
> >>>
> >>> How can I do it without using for loops?
> >>> Thank you in advance!
> >>> --
> >>> ronggui
> >>> Deparment of Sociology
> >>> Fudan University
> >>>
> >>> __
> >>> R-help@stat.math.ethz.ch mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide! 
> >>> http://www.R-project.org/posting-guide.html
> >>>
> >>
> >
> >
> > --
> > »ÆÈÙ¹ó
> > Deparment of Sociology
> > Fudan University
> >
> >
>
> Thomas Lumley   Assoc. Professor, Biostatistics
> [EMAIL PROTECTED]University of Washington, Seattle
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] how to draw x-axis at randomly position

2006-03-08 Thread Liaw, Andy
Try something like:

plot(1:5, xaxt="n")
axis(side=1, pos=3)

When all else fails, read the help pages (again)...

Andy

From: Jing Yang
> 
> Dear R-user,
> 
> Does any of you know how to draw x-axis at randomly position? 
> It looks I could only draw x-axis at the bottom and top of 
> the plot area (by using function axis). 
> However, sometimes I need to draw x-axis at randomly 
> position, for example, in the plot of y~x, I'd like to let 
> the x-axis go cross y=5.
> 
> Best,Jing
> 
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] how to draw x-axis at randomly position

2006-03-08 Thread Jing Yang
Dear R-user,

Does any of you know how to draw x-axis at randomly position?
It looks I could only draw x-axis at the bottom and top of the plot area (by 
using function axis). 
However, sometimes I need to draw x-axis at randomly position, for example, in 
the plot of y~x, I'd like to let the x-axis go cross y=5.

Best,Jing

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Degrees of freedom using Box.test()

2006-03-08 Thread Patrick Burns
In 3 decades I suspect that the definition of "large"
has changed.  Now with faster computers and R
we don't need to rely on the guesses of Ljung and
Box.  All it will take is someone (not me) to do
some experiments.

Pat

Nestor Arguea wrote:

>It's Ljung and Box (1978) saying that for large number of observations a chi 
>squared with lags-p-q, should provide a good approximation "for most 
>practical purposes" (p. 298 of reference above).
>
>Nestor
>On Wednesday 08 March 2006 4:00 am, Patrick Burns wrote:
>  
>
>>You are saying that the penalty on the degrees of freedom
>>should be the same whether the model was fit with 100
>>observations or 1 million observations.  You are also saying
>>that some tests should have negative degrees of freedom.
>>So I don't think your proposal is the right answer, though
>>presumably there should be some penalty.
>>
>>There is a working paper on the Burns Statistics website
>>about robustness in Ljung-Box tests, but this issue is not one
>>that is covered.
>>
>>
>>Patrick Burns
>>[EMAIL PROTECTED]
>>+44 (0)20 8525 0696
>>http://www.burns-stat.com
>>(home of S Poetry and "A Guide for the Unwilling S User")
>>
>>Nestor Arguea wrote:
>>
>>
>>>After an RSiteSeach("Box.test") I found some discussion regarding the
>>>degrees of freedom in the computation of the Ljung-Box test using
>>>Box.test(), but did not find any posting about the proper degrees of
>>>freedom.
>>>
>>>Box.test() uses "lag=number" as the degrees of freedom.  However, I
>>>believe the correct degrees of freedom should be "number-p-q" where p and
>>>q are the number of estimated parameters (for instance, in a Box-Jenkins
>>>family of models). This, according to the main source in documentation of
>>>Box.test:
>>>
>>>G. M. Ljung and G. E. P. Box, On a measure of Lack of Fit in Time Series
>>>Models, Biometrika, Vol. 65, No. 2 (August, 1978), pp. 297-303.
>>>
>>>One can still compute the correct p-value with
>>>
>>>  
>>>
1-pchisq(value,correctdf)


>>>Nestor
>>>(R 2.2.1 on Linux, Suse 9.3)
>>>  
>>>
>>__
>>R-help@stat.math.ethz.ch mailing list
>>https://stat.ethz.ch/mailman/listinfo/r-help
>>PLEASE do read the posting guide!
>>http://www.R-project.org/posting-guide.html
>>
>>
>
>  
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Unsupervised RandomForest

2006-03-08 Thread Liaw, Andy
To be able to do that, at least a couple of matrices of size 6804 x 6804
have to be created, and 1GB RAM is unlikely to be sufficient.

Andy

From: Rich Jacques
> 
> Dear all,
> 
> I am trying to calculate the proximity matrix for a data set 
> with 16 variables and 6804 observations using random forests. 
>  I have a Pentium 4, 3.00GHz processor with 1 GB of RAM.  
> When I use the command
> 
> randomForest(data.scale,proximity=T)
> 
> I get the warning message
> 
> Error: cannot allocate vector of size 361675 kb
> 
> Is this because I have reached the limit of what my computer 
> is capable of? Or is there a more efficient way to call randomForest?
> 
> Thanks in advance,
> 
> Rich Jacques
> 
> -- 
> Rich Jacques   
> Department of Probability & Statistics 
> University of Sheffield
> Hicks Building   
> Sheffield
> S3 7RH
> 
> __
> R-help@stat.math.ethz.ch mailing list 
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> 
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Trellis stacked bar legend

2006-03-08 Thread Deepayan Sarkar
On 3/8/06, Dieter Menne <[EMAIL PROTECTED]> wrote:
> Dear R-Listers,
> (well... called Depayan)
>
> The standard example
>
> barchart(yield ~ variety | site, data = barley,
> groups = year, layout = c(1,6), stack = TRUE,
> auto.key = list(points = FALSE, rectangles = TRUE, space = "right"),
> ylab = "Barley Yield (bushels/acre)",
> scales = list(x = list(rot = 45)))
>
> shows the problem: legend colors are inverted compared to stack colors.
> This is still acceptable with 2 colors, but with 3 or more, people
> ask me why it's partly down-under. I use to mumble that one
> author is in New Zealand, the other either in India or in Wisconsin.

Blame René Descartes for making the positive y-axis point up. Writing
systems are pretty much universally top to bottom, as far as I know.

> I know the hard way to correct this, but auto.key is nice because
> it's print-time settings dynamic. Any easy workaround?

None I can think of, except circumventing Descartes with an additional

ylim = c(130, -5)

I was thinking of adding an option to panel.barchart, but a more
general solution would be options to flip the order in draw.key. I'll
see if that's doable.

Deepayan

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Survival Plots by Strata

2006-03-08 Thread Bret Collier
Thomas,

Thank you for the response.  I will post what I come up with.

Bret

>>> Thomas Lumley <[EMAIL PROTECTED]> 03/08/06 9:16 AM >>>

I don't see any easy way to do this.  I think you may have to do the 
plotting yourself, based on the code in plot.survfit.

-thomas


On Wed, 8 Mar 2006, Bret Collier wrote:

> All,
> I am struggling to create a survival plot using LTRC data for each
year
> of a 10 year period.
>
> I have a set of individuals (birds) where 'entry' is the day of the
> year (1-365) they are released (let out of pens) into the wild (2
year
> data snip below).  'Entry'  (e.g., day of year the first bird is
> released for each year) is highly variable, ranging from 48 to >250.
> When I create a plot using the below code I would like to remove the
> 'solid' line which represent S_hat=1 out to the LC point for each
year
> and instead have the survival curves formatted in more of a
'hanging'
> style, where the LC day (e.g., min(Entry for each year)) is where
each
> curve begins at S_hat=1 for each year (and does not extend back to
the
> y-axis)?  I could not find anything on this in the archives or MASS
or
> Survival Analysis using S?  Anyone have a suggestion on where to
look?
>
> TIA, Bret
>
>
> #Code snip for R email.
> apc.coxfit1<-coxph(Surv(Entry, Exit, Fate)~Sex + Agerelease +
> Dayrelease + strata(Year), data=mydat)
> coxfit.apc<-survfit(apc.coxfit1)
> coxfit.apc
> plot(survfit(apc.coxfit1), conf.int=F, log=T, lty=c(1:2),
col=c(1:2),
> xlim=c(205, 800)) #not run--first entry for this example is day 205
for
> 1996, 259 for 1997
>
>> mydat
> IDYearDayrelease
> AgereleaseSurvivorshipEntry   ExitFateSex
>
16240   1996205 95  164 205 369 1   0
>
16319   1996205 88  140 205 345 1   0
>
16378   1996248 108 100 248 348 1   0
>
20383   1996241 98  204 241 445 1   0
>
16324   1996219 90  227 219 446 1   0
>
16327   1996219 90  497 219 716 1   0
>
20373   1996241 114 413 241 654 1   0
>
20374   1996241 111 211 241 452 1   0
>
16241   1996205 95  234 205 439 1   1
>
16321   1996219 90  118 219 337 1   1
>
16323   1996219 90  180 219 399 1   1
>
20375   1996241 103 268 241 509 1   1
>
20384   1996241 98  299 241 540 1   1
>
20390   1996241 93  204 241 445 1   1
>
20393   1996241 88  208 241 449 1   1
>
16313   1996248 122 512 248 760 0   1
>
20378   1996241 103 236 241 477 0   1
>
20381   1996241 101 329 241 570 0   1
>
16328   1996219 90  224 219 443 0   1
>
16827   1997259 127 52  259 311 1   0
>
16828   1997259 127 216 259 475 1   0
>
16831   1997303 171 19  303 322 1   0
>
20466   1997289 149 31  289 320 1   0
>
20469   1997289 149 199 289 488 1   0
>
20483   1997289 134 18  289 307 1   0
>
16807   1997259 137 223 259 482 1   0
>
16809   1997259 137 1   259 260 1   0
>
16819   1997259 131 237 259 496 1   0
>
16829   1997303 171 7   303 310 1   0
>
20440   1997289 161 7   289 296 1   0
>
20470   1997289 148 257 289 546 1   0
>
20478   1997289 143 12  289 301 1   0
>
16817   1997259 130 85  259 344 1   0
>
20454   1997289 154 4   289 293 1   1
>
20459   1997289 153 335 289 624 1   1
>
20460   1997289 153 118 289 407 1   1
>
20465   1997289 150 31  289 320 1   1
>
20473   1997289 147 65  289 354 1   1
>
20484   1997289 133 58  289 347 1   1
>
20489   1997289 130 56  289 345 1   1
>
16808   1997259 137 137 259 396 0   1
>
16810   1997303 181 64  303 367 0   1
>
16816   1997259 130 1   259 260 0   1
>
20471   1997289 147 334 289 623 0   1
>
16826   1997259 127 20  259 279 0   1
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help 
> PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html 
>

Thomas Lumley   Assoc

Re: [R] Degrees of freedom using Box.test()

2006-03-08 Thread Nestor Arguea
It's Ljung and Box (1978) saying that for large number of observations a chi 
squared with lags-p-q, should provide a good approximation "for most 
practical purposes" (p. 298 of reference above).

Nestor
On Wednesday 08 March 2006 4:00 am, Patrick Burns wrote:
> You are saying that the penalty on the degrees of freedom
> should be the same whether the model was fit with 100
> observations or 1 million observations.  You are also saying
> that some tests should have negative degrees of freedom.
> So I don't think your proposal is the right answer, though
> presumably there should be some penalty.
>
> There is a working paper on the Burns Statistics website
> about robustness in Ljung-Box tests, but this issue is not one
> that is covered.
>
>
> Patrick Burns
> [EMAIL PROTECTED]
> +44 (0)20 8525 0696
> http://www.burns-stat.com
> (home of S Poetry and "A Guide for the Unwilling S User")
>
> Nestor Arguea wrote:
> >After an RSiteSeach("Box.test") I found some discussion regarding the
> > degrees of freedom in the computation of the Ljung-Box test using
> > Box.test(), but did not find any posting about the proper degrees of
> > freedom.
> >
> >Box.test() uses "lag=number" as the degrees of freedom.  However, I
> > believe the correct degrees of freedom should be "number-p-q" where p and
> > q are the number of estimated parameters (for instance, in a Box-Jenkins
> > family of models). This, according to the main source in documentation of
> > Box.test:
> >
> >G. M. Ljung and G. E. P. Box, On a measure of Lack of Fit in Time Series
> >Models, Biometrika, Vol. 65, No. 2 (August, 1978), pp. 297-303.
> >
> >One can still compute the correct p-value with
> >
> >>1-pchisq(value,correctdf)
> >
> >Nestor
> >(R 2.2.1 on Linux, Suse 9.3)
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html

-- 
Nestor M. Arguea, Chair
Department of Marketing and Economics
University of West Florida
11000 University Parkway
Pensacola, FL 32514
Phone: (850)474-3071
Fax: (850)474-3069

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Survival Plots by Strata

2006-03-08 Thread Thomas Lumley

I don't see any easy way to do this.  I think you may have to do the 
plotting yourself, based on the code in plot.survfit.

-thomas


On Wed, 8 Mar 2006, Bret Collier wrote:

> All,
> I am struggling to create a survival plot using LTRC data for each year
> of a 10 year period.
>
> I have a set of individuals (birds) where 'entry' is the day of the
> year (1-365) they are released (let out of pens) into the wild (2 year
> data snip below).  'Entry'  (e.g., day of year the first bird is
> released for each year) is highly variable, ranging from 48 to >250.
> When I create a plot using the below code I would like to remove the
> 'solid' line which represent S_hat=1 out to the LC point for each year
> and instead have the survival curves formatted in more of a 'hanging'
> style, where the LC day (e.g., min(Entry for each year)) is where each
> curve begins at S_hat=1 for each year (and does not extend back to the
> y-axis)?  I could not find anything on this in the archives or MASS or
> Survival Analysis using S?  Anyone have a suggestion on where to look?
>
> TIA, Bret
>
>
> #Code snip for R email.
> apc.coxfit1<-coxph(Surv(Entry, Exit, Fate)~Sex + Agerelease +
> Dayrelease + strata(Year), data=mydat)
> coxfit.apc<-survfit(apc.coxfit1)
> coxfit.apc
> plot(survfit(apc.coxfit1), conf.int=F, log=T, lty=c(1:2), col=c(1:2),
> xlim=c(205, 800)) #not run--first entry for this example is day 205 for
> 1996, 259 for 1997
>
>> mydat
> IDYearDayrelease
> AgereleaseSurvivorshipEntry   ExitFateSex
> 16240 1996205 95  164 205 369 1   0
> 16319 1996205 88  140 205 345 1   0
> 16378 1996248 108 100 248 348 1   0
> 20383 1996241 98  204 241 445 1   0
> 16324 1996219 90  227 219 446 1   0
> 16327 1996219 90  497 219 716 1   0
> 20373 1996241 114 413 241 654 1   0
> 20374 1996241 111 211 241 452 1   0
> 16241 1996205 95  234 205 439 1   1
> 16321 1996219 90  118 219 337 1   1
> 16323 1996219 90  180 219 399 1   1
> 20375 1996241 103 268 241 509 1   1
> 20384 1996241 98  299 241 540 1   1
> 20390 1996241 93  204 241 445 1   1
> 20393 1996241 88  208 241 449 1   1
> 16313 1996248 122 512 248 760 0   1
> 20378 1996241 103 236 241 477 0   1
> 20381 1996241 101 329 241 570 0   1
> 16328 1996219 90  224 219 443 0   1
> 16827 1997259 127 52  259 311 1   0
> 16828 1997259 127 216 259 475 1   0
> 16831 1997303 171 19  303 322 1   0
> 20466 1997289 149 31  289 320 1   0
> 20469 1997289 149 199 289 488 1   0
> 20483 1997289 134 18  289 307 1   0
> 16807 1997259 137 223 259 482 1   0
> 16809 1997259 137 1   259 260 1   0
> 16819 1997259 131 237 259 496 1   0
> 16829 1997303 171 7   303 310 1   0
> 20440 1997289 161 7   289 296 1   0
> 20470 1997289 148 257 289 546 1   0
> 20478 1997289 143 12  289 301 1   0
> 16817 1997259 130 85  259 344 1   0
> 20454 1997289 154 4   289 293 1   1
> 20459 1997289 153 335 289 624 1   1
> 20460 1997289 153 118 289 407 1   1
> 20465 1997289 150 31  289 320 1   1
> 20473 1997289 147 65  289 354 1   1
> 20484 1997289 133 58  289 347 1   1
> 20489 1997289 130 56  289 345 1   1
> 16808 1997259 137 137 259 396 0   1
> 16810 1997303 181 64  303 367 0   1
> 16816 1997259 130 1   259 260 0   1
> 20471 1997289 147 334 289 623 0   1
> 16826 1997259 127 20  259 279 0   1
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>

Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do rea

Re: [R] power and sample size for a GLM with Poisson response variable

2006-03-08 Thread Wassell, James T., Ph.D.
Craig, Thanks for your follow-up note on using the asypow package.  My
problem was not only constructing the "constraints" vector but, for my
particular situation (Poisson regression, two groups, sample sizes of
(1081,3180), I get very different results using asypow package compared
to my other (home grown) approaches.  

library(asypow)
pois.mean<-c(0.0065,0.0003)
info.pois <- info.poisson.kgroup(pois.mean, group.size=c(1081,3180))
constraints <- matrix(c(2,1,2), ncol = 3, byrow=T)
poisson.object <- asypow.noncent(pois.mean, info.pois,constraints)
power.pois <- asypow.power(poisson.object, c(1081,3180), 0.05)
print(power.pois)

[1] 0.2438309

asy.pwr()   #  the function is shown below.

$power
[1] 0.96

sim.pwr()   #  the function is shown below.
4261000 Poisson random variates simulated 
$power
[1] 0.567

I tend to think the true power is greater than 0.567 but less than 0.96
(not as small as 0.2438309).  

Maybe I am still not using the asypow functions correctly?
Suggestions/comments welcome.  


-Original Message-
From: Craig A. Faulhaber [mailto:[EMAIL PROTECTED] 
Sent: Saturday, March 04, 2006 12:04 PM
To: Wassell, James T., Ph.D.
Subject: Re: [R] power and sample size for a GLM with Poisson response
variable

Hi James,

Thanks again for your help.  With the assistance of a statistician
colleauge of mine, I figured out the "constraints" vector.  It is a
3-column vector describing the null hypothesis.  For example, let's say
you have 3 populations and your null hypothesis is that there is no
difference between the 3.  The first row of the matrix would be "3 1 2" 
indicating that you have 3 populations and that populations 1 and 2 are
equal.  The second row would be "3 2 3" indicating that you have 3
populations and that populations 2 and 3 are equal.  If you had only 2
populations, there would be only one row ("2 1 2", indicating 2
populations with 1 and 2 equal).  I hope this helps.

Craig

Wassell, James T., Ph.D. wrote:

> Craig,   I found the package asypow difficult to use and it did not 
> yield results in the ballpark of other approaches.   (could not figure

> out the "constraints" vector).  
>
> I wrote some simple functions, one asy.pwr uses the non-central 
> chi-square distn.
>
> asy.pwr<-function(counts=c(7,1),py=c(1081,3180))
> { # a two group Poisson regression power computation #  py is 
> person-years or person-time however measured
> group<-gl(2,1)
> ncp<-summary(glm(counts~group+offset(log(py)),family=poisson))$null.de
> viance
>
> q.tile<-qchisq(.95,1)  #  actually just the X2 critical value of 
> 3.841459 list(power=round(1-pchisq(q.tile,df=1,ncp),2))}
>
> The second function, sim.pwr,  estimates power using simulated Poisson

> random variates.  The most time consuming step is the call to rpois.

> (Maybe someone knows a more efficient way to accomplish this?).  The 
> "for" loop is rather quick in comparison.   I hope you may find this 
> helpful, or if you have solved your problem some other way, please 
> pass along your approach.Note, that for this problem,  very small 
> values of lambda, the two approaches give  much different power 
> estimates  (96% vs. 55% or so).   My problem may be better addressed 
> as binomial logistic regression, maybe then the simulation and the 
> asymptotic estimates my agree better.   
>
> sim.pwr<-function(means=c(0.0065,0.0003),ptime=c(1081,3180),nsim=1000)
> { # a two group poisson regression power computation # based 
> simulating lots of Poisson r.v.'s # input rates followed by a vector 
> of the corresponding person times # the most time consuming part is 
> the r.v. generation.
> # power is determined by counting the how often p-values are <= 0.05
> group<-as.factor(rep(c(1,2),ptime))
> rej<-vector(length=nsim)
> y<-rpois(ptime[1]*nsim,means[1])
> y<-c(y,rpois(ptime[2]*nsim,means[2]))
> y<-matrix(y,nrow=nsim)
> cat(sum(ptime)*nsim,"Poisson random variates simulated","\n") for(i in

> 1:nsim){res<-glm(y[i,]~group,family=poisson())
> rej[i]<-summary(res)$coeff[2,4]<=0.05}
> list(power=sum(rej)/nsim) }
>
>
>
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] How to do it without for loops?

2006-03-08 Thread Thomas Lumley

On Wed, 8 Mar 2006, ronggui wrote:


Thank you for all .

One more question.How can I calculate these efficiently?

set.seed(100)
dat<-data.frame(x1=rnorm(20),x2=rnorm(20),u=rnorm(20),id=round(2*runif(20)))
# In this example,id's elements are  0,1,2.
y<-list()
for (i in 0:2){
X<-as.matrix(subset(dat,id==i,c("x1","x2")))
u<-as.matrix(subset(dat,id==i,c("u")))
y[[i+1]]<-t(X)%*%u%*%t(u)%*%X
}
y[[1]]+y[[2]]+y[[3]]



People have already told you about crossprod, so crossprod(crossprod(X,u)) 
would seem an obvious improvement over the matrix multiplications.


There is a better solution, though.

Xu<-dat[,c("x1","x2")]*dat[,"u"]
crossprod( rowsum(Xu, dat$id))

-thomas



the above code is not elegant.And my second solution to this problem
is using by to get a list.

matlis<-by(dat, dat$id, function(x){
a<-as.matrix(x[,c("x1","x2")])
b<-as.matrix(x[, "u"])
t(a) %*% b  %*% t(b) %*% a
})

S <- matrix(unlist(matlis), 4, length(matlis))
S1 <- matrix(rowSums(S), 2, 2)

The code works ,but I want to ask if there is any other more better
ways to do it? It seems that this kind of computation is quite common.





2006/2/28, Gabor Grothendieck <[EMAIL PROTECTED]>:

Try:

crossprod(x)

or

t(x) %*% x

On 2/28/06, ronggui <[EMAIL PROTECTED]> wrote:

This is the code:

x<-matrix(rnorm(20),5)
y<-list()
for (i in seq(nrow(x))) y[[i]]<-t(x[i,,drop=F])%*%x[i,,drop=F]
y[[1]]+y[[2]]+y[[3]]+y[[4]]+y[[5]]

How can I do it without using for loops?
Thank you in advance!
--
ronggui
Deparment of Sociology
Fudan University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html






--
??
Deparment of Sociology
Fudan University




Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] fitting a distribution using glm

2006-03-08 Thread Thomas Lumley
On Wed, 8 Mar 2006, Vumani Dlamini wrote:

> it is easy to fit a distribution using fitdistr
>
> poisdata <- rpois(n = 100, lambda = 2)
> poismle <- fitdistr(poisdata, "Poisson")
> poismle
>
> but i would like to know whether its possible to get an identical result
> using glm.

Yes. Easy.

  glm(poisdata ~ 1, family = poisson)

>
> poistab <- data.frame(table(poisdata))
> colnames(poistab) <- c("width","freq");
> poistab[,"width"] <- as.numeric(poistab[,"width"])
> glm(freq ~ 1-width, family = poisson, data = poistab)
>
> but the results are always different.

Well, they would be.  I don't know what this is doing, but it certainly 
isn't fitting a poisson distribution to poisdata.

-thomas

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] data import problem

2006-03-08 Thread Thomas Lumley
On Wed, 8 Mar 2006, [EMAIL PROTECTED] wrote:
>
> I thought I coudl simply use something line this:
>
> con <- file("test2.txt");
> do {
>e <- read.table(con, nlines = 1);
>if ( length(e) == 2 ) {
>   d <- read.table(con, nrows = e[1,2]);
>   #process data frame d
>}
> } while (length(e) == 2);
>
> The problem is that read.table closes the connection object, I assumed 
> that it would not close the connection, and instead contines where it 
> last stopped.

I think the problem is just that you didn't open the connection before 
passing it to read.table.

?file says "By default the connection is not opened"
and
?read.table says
   Alternatively, 'file' can be a 'connection', which will be
   opened if necessary, and if so closed at the end of the
   function call.



-thomas

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] fitting a distribution using glm

2006-03-08 Thread Peter Dalgaard
"Vumani Dlamini" <[EMAIL PROTECTED]> writes:

> it is easy to fit a distribution using fitdistr
> 
> poisdata <- rpois(n = 100, lambda = 2)
> poismle <- fitdistr(poisdata, "Poisson")
> poismle
> 
> but i would like to know whether its possible to get an identical result 
> using glm. I use
> 
> poistab <- data.frame(table(poisdata))
> colnames(poistab) <- c("width","freq");
> poistab[,"width"] <- as.numeric(poistab[,"width"])
> glm(freq ~ 1-width, family = poisson, data = poistab)
> 
> but the results are always different.


Well, the freq values in that setup are not independent Poisson
variates. For a start, they sum to constant!

Try instead

exp(coef(glm(poisdata~1,family=poisson)))


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Multiple logistic regression

2006-03-08 Thread Frank E Harrell Jr
Stephanie Delalieux wrote:
> Dear R-users,
> 
> Is there a function in R that classifies data in more than 2 groups using
> logistic regression/classification? I want to compare the c-indices of
> earlier research (lrm, binary response variables) with new c-indices
> obtained from 'multiple' (more response variables) logistic regression.
> 
> Best regards,
> Stephanie Delalieux
> 
> Department Biosystems
> M³-BIORES
> Group of Geomatics Engineering
> [EMAIL PROTECTED]

There are many functions available but the answer depends on whether the 
response variable has a natural ordering.  Whether it does or not, 
though, C-indexes may decrease from the binary case because predicting 
more categories is a more difficult task.

Frank Harrell

> 
> 
> Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
> 


-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Unsupervised RandomForest

2006-03-08 Thread Rich Jacques
Dear all,

I am trying to calculate the proximity matrix for a data set with 16 variables
and 6804 observations using random forests.  I have a Pentium 4, 3.00GHz
processor with 1 GB of RAM.  When I use the command

randomForest(data.scale,proximity=T)

I get the warning message

Error: cannot allocate vector of size 361675 kb

Is this because I have reached the limit of what my computer is capable of? Or
is there a more efficient way to call randomForest?

Thanks in advance,

Rich Jacques

-- 
Rich Jacques   
Department of Probability & Statistics 
University of Sheffield
Hicks Building   
Sheffield
S3 7RH

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Multiple logistic regression

2006-03-08 Thread ronggui
Do you mean multinomial logistic model?
If it is,the  multinom function in nnet package and multinomial
function in VGAM(http://www.stat.auckland.ac.nz/~yee) package can do
it.

But I have no idea about the c-index.

2006/3/8, Stephanie Delalieux <[EMAIL PROTECTED]>:
> Dear R-users,
>
> Is there a function in R that classifies data in more than 2 groups using
> logistic regression/classification? I want to compare the c-indices of
> earlier research (lrm, binary response variables) with new c-indices
> obtained from 'multiple' (more response variables) logistic regression.
>
> Best regards,
> Stephanie Delalieux
>
> Department Biosystems
> M³-BIORES
> Group of Geomatics Engineering
> [EMAIL PROTECTED]
>
>
> Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>


--
黄荣贵
Deparment of Sociology
Fudan University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] Trellis stacked bar legend

2006-03-08 Thread Dieter Menne
Dear R-Listers,
(well... called Depayan)

The standard example 

barchart(yield ~ variety | site, data = barley,
groups = year, layout = c(1,6), stack = TRUE,
auto.key = list(points = FALSE, rectangles = TRUE, space = "right"),
ylab = "Barley Yield (bushels/acre)",
scales = list(x = list(rot = 45)))

shows the problem: legend colors are inverted compared to stack colors. 
This is still acceptable with 2 colors, but with 3 or more, people 
ask me why it's partly down-under. I use to mumble that one
author is in New Zealand, the other either in India or in Wisconsin.

I know the hard way to correct this, but auto.key is nice because
it's print-time settings dynamic. Any easy workaround?

Dieter

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] glm automation

2006-03-08 Thread ronggui
I just pointed out the mistake about using index.

Andy's post let me know the trouble of using formula like that.It's
valuable information for me:) Thank you!

在 06-3-8,Liaw, Andy<[EMAIL PROTECTED]> 写道:
> From: ronggui
> >
> > 2006/3/8, A Mani <[EMAIL PROTECTED]>:
> > > Hello,
> > > I have two problems in automating multiple glm(s)
> > operations.
> > > The data file is tab delimited file with headers and two
> > columns. like
> > >
> > > "ABC"  "EFG"
> > > 1  2
> > > 2  3
> > > 3  4
> > > dat <- read.table("FILENAME", header=TRUE, sep="\t",
> > na.strings="NA",
> > > dec=".", strip.white=TRUE) dataf <- read.table("FILENAME",
> > > header=FALSE, sep="\t", na.strings="NA", dec=".", strip.white=TRUE)
> > > norm1 <- glm(dataf[1,1] ~ dataf[1,2], family= normal(log), data=dat)
> > > norm2 <- glm(dataf[1,1] ~ dataf[1,2], family=
> > normal(identity), data=dat)
> > > and so on.
> > It should be
> >  norm1 <- glm(dataf[,1] ~ dataf[,2], family= normal(log),
> > data=dat)  norm2 <- glm(dataf[,1] ~ dataf[,2], family=
> > normal(identity), data=dat)
> >
> > you should read the document of "[" about how to use index.
>
> I wish people would just give up on (ab)using model formula like that.  IMHO
> it's really asking for trouble down the road.  For example:
>
> > n <- 5
> > d1 <- data.frame(x=1:n, y=rnorm(n))
> > d2 <- data.frame(u=n:1, v=rnorm(n))
> > d3 <- data.frame(y=rnorm(n), x=n:1)
> > f1 <- lm(d1[,2] ~ d1[,1], data=d1)
> > f2 <- lm(y ~ x, data=d1)
> > predict(f1)
>12345
> -1.767697694 -1.326691900 -0.885686106 -0.444680312 -0.003674518
> > predict(f2)
>12345
> -1.767697694 -1.326691900 -0.885686106 -0.444680312 -0.003674518
> > predict(f1, d2)
>12345
> -1.767697694 -1.326691900 -0.885686106 -0.444680312 -0.003674518
>
> Notice anything odd above?
>
> > predict(f2, d3)
>12345
> -0.003674518 -0.444680312 -0.885686106 -1.326691900 -1.767697694
>
> Now that's more like it...
>
> Andy
>
>
>
> > > But glm does not work on the data unless I write ABC and
> > EFG there...
> > > I want to automate the script for multiple files.
> > >
> > > The other problem is to write the plot(GLM) to a file without
> > > displaying it at stdout.
> > >
> > > Thanks,
> > >
> > >
> > > A. Mani
> > > Member, Cal. Math. Soc
> > >
> > > [[alternative HTML version deleted]]
> > >
> > > __
> > > R-help@stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide!
> > > http://www.R-project.org/posting-guide.html
> > >
> >
> >
> > --
> > 黄荣贵
> > Deparment of Sociology
> > Fudan University
> >
> >
>
>
> --
> Notice:  This e-mail message, together with any attachment...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] How to do it without for loops?

2006-03-08 Thread ronggui
Thank you for all .

One more question.How can I calculate these efficiently?

set.seed(100)
dat<-data.frame(x1=rnorm(20),x2=rnorm(20),u=rnorm(20),id=round(2*runif(20)))
# In this example,id's elements are  0,1,2.
y<-list()
for (i in 0:2){
X<-as.matrix(subset(dat,id==i,c("x1","x2")))
u<-as.matrix(subset(dat,id==i,c("u")))
y[[i+1]]<-t(X)%*%u%*%t(u)%*%X
}
y[[1]]+y[[2]]+y[[3]]

the above code is not elegant.And my second solution to this problem
is using by to get a list.

matlis<-by(dat, dat$id, function(x){
a<-as.matrix(x[,c("x1","x2")])
b<-as.matrix(x[, "u"])
t(a) %*% b  %*% t(b) %*% a
})

S <- matrix(unlist(matlis), 4, length(matlis))
S1 <- matrix(rowSums(S), 2, 2)

The code works ,but I want to ask if there is any other more better
ways to do it? It seems that this kind of computation is quite common.





2006/2/28, Gabor Grothendieck <[EMAIL PROTECTED]>:
> Try:
>
> crossprod(x)
>
> or
>
> t(x) %*% x
>
> On 2/28/06, ronggui <[EMAIL PROTECTED]> wrote:
> > This is the code:
> >
> > x<-matrix(rnorm(20),5)
> > y<-list()
> > for (i in seq(nrow(x))) y[[i]]<-t(x[i,,drop=F])%*%x[i,,drop=F]
> > y[[1]]+y[[2]]+y[[3]]+y[[4]]+y[[5]]
> >
> > How can I do it without using for loops?
> > Thank you in advance!
> > --
> > ronggui
> > Deparment of Sociology
> > Fudan University
> >
> > __
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! 
> > http://www.R-project.org/posting-guide.html
> >
>


--
黄荣贵
Deparment of Sociology
Fudan University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] data import problem

2006-03-08 Thread Gabor Grothendieck
Check out:

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/52957.html

for a similar problem.

On 3/8/06, [EMAIL PROTECTED]
<[EMAIL PROTECTED]> wrote:
> Dear All,
>
> I'm trying to read a text data file that contains several records separated 
> by a blank line. Each record starts with a row that contains it's ID and the 
> number of rows for the records (two columns), then the data table itself, e.g.
>
> 123 5
> 89.17911.1024
> 90.57351.1024
> 92.56661.1024
> 95.07251.1024
> 101.20701.1024
>
> 321 3
> 60.16011.1024
> 64.80231.1024
> 70.05932.1502
>
> ...
>
> I thought I coudl simply use something line this:
>
> con <- file("test2.txt");
> do {
>e <- read.table(con, nlines = 1);
>if ( length(e) == 2 ) {
>d <- read.table(con, nrows = e[1,2]);
>#process data frame d
>}
> } while (length(e) == 2);
>
> The problem is that read.table closes the connection object, I assumed that 
> it would not close the connection, and instead contines where it last stopped.
>
> Since the data is nearly a simple table I though read.table could work rather 
> than using scan directly. Any suggestions to read this file efficently are 
> welcome (the file can contain several thousand record and each record can 
> contain several thousand rows).
>
>thanks a lot for your help,
>+kind regards,
>
>Arne
>
>
>[[alternative HTML version deleted]]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] fitting a distribution using glm

2006-03-08 Thread Vumani Dlamini
it is easy to fit a distribution using fitdistr

poisdata <- rpois(n = 100, lambda = 2)
poismle <- fitdistr(poisdata, "Poisson")
poismle

but i would like to know whether its possible to get an identical result 
using glm. I use

poistab <- data.frame(table(poisdata))
colnames(poistab) <- c("width","freq");
poistab[,"width"] <- as.numeric(poistab[,"width"])
glm(freq ~ 1-width, family = poisson, data = poistab)

but the results are always different.
cheers, vumani

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Read.table

2006-03-08 Thread Sean Davis



On 3/8/06 8:31 AM, "Liaw, Andy" <[EMAIL PROTECTED]> wrote:

> From: Uwe Ligges
>> 
>> Matias Mayor Fernandez wrote:
>> 
>>> Hi,
>>> 
>>>  
>>> 
>>> I have some column vector in txt or xls and I need to load
>> into R as 
>>> numeric vector.
>>> 
>>>  
>>> 
>>> I use the read.table  (X=read.table(123.txt") command but
>> the program 
>>> say that "X is not a numeric vector"

See here if you want to think of your data as a single-column table:

http://cran.r-project.org/doc/manuals/R-data.html#Spreadsheet_002dlike-data

Sean

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] info() function?

2006-03-08 Thread Gabor Grothendieck
Another possibility is eapply where I have used naCount
from Henrik's solution:

prop <- function(x)
 list(class = data.class(x), dim = dim(x), size = object.size(x), NAs
= naCount(x))
do.call("rbind", eapply(.GlobalEnv, prop))


On 3/8/06, Henrik Bengtsson <[EMAIL PROTECTED]> wrote:
> >library(R.oo)
> >ll()
> member data.class dimension object.size
> 1 anumeric  10004028
> 2author  character 1 112
> 3   expnumeric 1  36
> 4  last.warning   list 2 488
> 5object   function  NULL 864
> 6   row  character 1  72
> 7 USArrests data.frame   c(50,4)4076
> 8  VADeaths matrixc(5,4) 824
> 9 value  character 1  72
>
> with NA counts:
> >naCount <- function(x, ...) ifelse(is.vector(x), sum(is.na(x)), NA)
> >properties <- c("data.class", "dimension", "object.size", "naCount")
> ll(properties=properties)
> member data.class dimension object.size
> 1 anumeric  10004028
> 2author  character 1 112
> 3   expnumeric 1  36
> 4  last.warning   list 2 488
> 5   naCount   function  NULL 864
> 6object   function  NULL 864
> 7properties  character 4 212
> 8   row  character 1  72
> 9 USArrests data.frame   c(50,4)4076
> 10 VADeaths matrixc(5,4) 824
> 11value  character 1  72
>
> FYI: In next version of R.oo, there will probably be some kind of
> option to set the default 'properties' argument so that this must not
> be given explicitly by default.
>
> Cheers
>
> Henrik
>
> On 3/8/06, Robert Lundqvist <[EMAIL PROTECTED]> wrote:
> > I would like to have some function for getting an overview of the
> > variables in a worksheet. Class, dimesions, length, number of missing
> > values,... Guess it wouldn't be that hard to set up such a function, but I
> > guess there are others who have made it already. Or is it already a
> > standard feature in the base package? Any suggestions?
> >
> > Robert
> >
> > __
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! 
> > http://www.R-project.org/posting-guide.html
> >
> >
>
>
> --
> Henrik Bengtsson
> Mobile: +46 708 909208 (+1h UTC)
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Read.table

2006-03-08 Thread Liaw, Andy
From: Uwe Ligges
> 
> Matias Mayor Fernandez wrote:
> 
> > Hi,
> > 
> >  
> > 
> > I have some column vector in txt or xls and I need to load 
> into R as 
> > numeric vector.
> > 
> >  
> > 
> > I use the read.table  (X=read.table(123.txt") command but 
> the program 
> > say that "X is not a numeric vector"
> 
> No, I think you got:
> 
>Error: syntax error in "(X=read.table(123.txt"
> 
> or you have used another call without the syntax error in it.
> 
> In any case, please be more specific what you did. You might 
> also want 
> to copy the first few lines of file 123.txt in your mail.

Besides, if Matias only has one column of data and want that read in as a
vector, scan() is really more appropriate.

Andy

 
> Uwe Ligges
> 
> 
> >  
> > 
> >  
> > 
> > Where is the problem?
> > 
> >  
> > 
> > Matías
> > 
> >  
> > 
> > University of Oviedo,
> > 
> >  
> > 
> > Spain
> > 
> > 
> > [[alternative HTML version deleted]]
> > 
> > 
> > 
> > 
> --
> > --
> > 
> > __
> > R-help@stat.math.ethz.ch mailing list 
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! 
> > http://www.R-project.org/posting-guide.html
> 
> __
> R-help@stat.math.ethz.ch mailing list 
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> 
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] combining results and exporting problems

2006-03-08 Thread Dario Greco
hello everybody, 

this is my first post to the list.
i am facing two foolish but tedious problems:

1) i have a data frame like this:
  var1  var2   var3  var4   var5   var6   var7   var8   var9
1   4.12  1314  867  552  276 302 105   13   420
2   8.99  2712 2197 1492  597 271 217   27  1031
3   9.64  2263 1720 1064  588 136 385   23  1064
4   2.53  1328  876  558  226 212 212   13   465
5   3.60  2260 1808 1130  565 181 203   23   972
6   8.88  2274 1819  864  682 182 2270  1137
7   4.92  1110  744  477  233 189 144   11   244
8  12.11  2895 2085 1419  608 550 232   29  1216
9   2.37  1438  963  604  316 273 144   14   489
10 17.30  1332  932  546  333 253 146   13   493
11  2.65  1285  977  604  308 154 103   13   501
12  4.35  2069 1221  828  290 621 186   21   600
13  2.37  2015 1471  967  403 302 1810   806
14  2.09  1707 1280  802  410 171 188   17   563
15  0.70  2027 1318  811  365 405 223   20   527
16  7.77  2264 1471  702  634 589 1810   860
17 15.18  1578 1073  600  347 395  95   16   631
18 24.46  1991 1573 1035  478 139 239   20   776
19 14.55   1489  933  422 467 244   89   489
20  4.00  1818 1455  909  418 109 218   18   745

i need to compute the linear model lm() pairwise between the columns.
i need also to compute the anova().
i would like to get into one object all the results, instead of running many 
lm() and anova() separately for each couple of columns.

2) i would like to export into a text file the results from aov() and 
TukeyHSD(). 

is there any suggestion?
thankyou very much for your kind help.

sincerely
d


-- 

Dario Greco

Institute of Biotechnology - University of Helsinki
Building Cultivator II
P.O.Box 56  Viikinkaari 4
FIN-00014   Finland

Office: +358 9 191 58951
Fax: +358 9 191 58952
Mobile: +358 44 023 5780

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Multiple logistic regression

2006-03-08 Thread Stephanie Delalieux
Dear R-users,

Is there a function in R that classifies data in more than 2 groups using
logistic regression/classification? I want to compare the c-indices of
earlier research (lrm, binary response variables) with new c-indices
obtained from 'multiple' (more response variables) logistic regression.

Best regards,
Stephanie Delalieux

Department Biosystems
M³-BIORES
Group of Geomatics Engineering
[EMAIL PROTECTED]


Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Survival Plots by Strata

2006-03-08 Thread Bret Collier
All,
I am struggling to create a survival plot using LTRC data for each year
of a 10 year period.

I have a set of individuals (birds) where 'entry' is the day of the
year (1-365) they are released (let out of pens) into the wild (2 year
data snip below).  'Entry'  (e.g., day of year the first bird is
released for each year) is highly variable, ranging from 48 to >250. 
When I create a plot using the below code I would like to remove the
'solid' line which represent S_hat=1 out to the LC point for each year
and instead have the survival curves formatted in more of a 'hanging'
style, where the LC day (e.g., min(Entry for each year)) is where each
curve begins at S_hat=1 for each year (and does not extend back to the
y-axis)?  I could not find anything on this in the archives or MASS or
Survival Analysis using S?  Anyone have a suggestion on where to look?

TIA, Bret


#Code snip for R email.
apc.coxfit1<-coxph(Surv(Entry, Exit, Fate)~Sex + Agerelease +
Dayrelease + strata(Year), data=mydat)
coxfit.apc<-survfit(apc.coxfit1)
coxfit.apc
plot(survfit(apc.coxfit1), conf.int=F, log=T, lty=c(1:2), col=c(1:2),
xlim=c(205, 800)) #not run--first entry for this example is day 205 for
1996, 259 for 1997

>mydat
ID  YearDayrelease
Agerelease  SurvivorshipEntry   ExitFateSex
16240   1996205 95  164 205 369 1   0
16319   1996205 88  140 205 345 1   0
16378   1996248 108 100 248 348 1   0
20383   1996241 98  204 241 445 1   0
16324   1996219 90  227 219 446 1   0
16327   1996219 90  497 219 716 1   0
20373   1996241 114 413 241 654 1   0
20374   1996241 111 211 241 452 1   0
16241   1996205 95  234 205 439 1   1
16321   1996219 90  118 219 337 1   1
16323   1996219 90  180 219 399 1   1
20375   1996241 103 268 241 509 1   1
20384   1996241 98  299 241 540 1   1
20390   1996241 93  204 241 445 1   1
20393   1996241 88  208 241 449 1   1
16313   1996248 122 512 248 760 0   1
20378   1996241 103 236 241 477 0   1
20381   1996241 101 329 241 570 0   1
16328   1996219 90  224 219 443 0   1
16827   1997259 127 52  259 311 1   0
16828   1997259 127 216 259 475 1   0
16831   1997303 171 19  303 322 1   0
20466   1997289 149 31  289 320 1   0
20469   1997289 149 199 289 488 1   0
20483   1997289 134 18  289 307 1   0
16807   1997259 137 223 259 482 1   0
16809   1997259 137 1   259 260 1   0
16819   1997259 131 237 259 496 1   0
16829   1997303 171 7   303 310 1   0
20440   1997289 161 7   289 296 1   0
20470   1997289 148 257 289 546 1   0
20478   1997289 143 12  289 301 1   0
16817   1997259 130 85  259 344 1   0
20454   1997289 154 4   289 293 1   1
20459   1997289 153 335 289 624 1   1
20460   1997289 153 118 289 407 1   1
20465   1997289 150 31  289 320 1   1
20473   1997289 147 65  289 354 1   1
20484   1997289 133 58  289 347 1   1
20489   1997289 130 56  289 345 1   1
16808   1997259 137 137 259 396 0   1
16810   1997303 181 64  303 367 0   1
16816   1997259 130 1   259 260 0   1
20471   1997289 147 334 289 623 0   1
16826   1997259 127 20  259 279 0   1

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] data import problem

2006-03-08 Thread Philipp Pagel
On Wed, Mar 08, 2006 at 12:49:29PM +0100, [EMAIL PROTECTED] wrote:
> Well, the data is generated by a perl script, and I could just
> configure the perl script so that there is one file per data table,
> but I though I'd probably must more efficent to have all records in a
> single file rather than reading a thousands of small files ... .

I guess I would make it a singe file and put the IDs in their own
column:

 ID x y
123   89.17911.1024
123   90.57351.1024
123   92.56661.1024
123   95.07251.1024
123  101.20701.1024
321   60.16011.1024
321   64.80231.1024
321   70.05932.1502
...

cu
Philipp

-- 
Dr. Philipp PagelTel.  +49-8161-71 2131
Dept. of Genome Oriented Bioinformatics  Fax.  +49-8161-71 2186
Technical University of Munich
Science Center Weihenstephan
85350 Freising, Germany

 and

Institute for Bioinformatics / MIPS  Tel.  +49-89-3187 3675
GSF - National Research Center   Fax.  +49-89-3187 3585
  for Environment and Health
Ingolstädter Landstrasse 1
85764 Neuherberg, Germany
http://mips.gsf.de/staff/pagel

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] malloc: vm_allocate(size=381886464) failed (error code=3)

2006-03-08 Thread Sean Davis



On 3/8/06 6:58 AM, "Mahdi Osman" <[EMAIL PROTECTED]> wrote:

> Hi all,
> 
> I am having memory allocation problem with my R 2.2.1 for Mac OS. The
> following is the error message that I get. I do not get this message if I
> break down the large dataset in to sub datasets. I think breaking up the
> dataset is not a sustainable solution in the long run. The data that I am
> analysing is essentially big, and it would be reasonable to do the analyis
> on the whole dataset without even considering to partition it. So I was
> really wondering if you could give me a clue about how to handle this
> problem.
> 
> model6 <-lm(logintens~ factor (slide) + factor(ind) + factor(dye) +
> factor(id)*factor(l6) + factor(rep)-1, data=sample_data2)
> 
> *** malloc: vm_allocate(size=381886464) failed (error code=3)
> *** malloc[387]: error: Can't allocate region
> *** malloc: vm_allocate(size=381886464) failed (error code=3)
> *** malloc[387]: error: Can't allocate region
> *** malloc: vm_allocate(size=381886464) failed (error code=3)
> *** malloc[387]: error: Can't allocate region
> *** malloc: vm_allocate(size=381886464) failed (error code=3)
> *** malloc[387]: error: Can't allocate region
> *** malloc: vm_allocate(size=381886464) failed (error code=3)
> *** malloc[387]: error: Can't allocate region

Mahdi,

How large are the 'id' and 'l6' factors--you are looking for all interaction
terms?  

As an aside, these look like microarray data.  If they are, is there a
reason that you couldn't use one of the bioconductor packages to do the
analyses (look at limma and maanova, in particular)?  Also, is there a
reason not to do a gene-by-gene analysis?

Sean

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] package installation on Mac OS X 10.3.9

2006-03-08 Thread David Ruau
Hi,
What is the R version you use?

I am also working with OS X 10.3.9 and I got this error once with a  
older release of R.
I found that when I used the 'superuser' it worked.
either:
$ sudo r
and do the install as you did or:
$ sudo R CMD INSTALL /path/to/your/package.tar.gz

Best,
David

On Mar 8, 2006, at 9:58, Patrick Giraudoux wrote:

> Dear listers,
>
> I am tryin to install a package in a student training room.
> Unsuccessfull! With this message:
>
>> install.packages("pgirmess")
> trying URL `http://cran.r-project.org/src/contrib/PACKAGES'
> Content type `text/plain; charset=iso-8859-1' length 77400 bytes
> opened URL
> ==
> downloaded 75Kb
>
> trying URL  
> `http://cran.r-project.org/src/contrib/pgirmess_1.2.5.tar.gz'
> Content type `application/x-tar' length 49962 bytes
> opened URL
> ==
> downloaded 48Kb
>
> ERROR: failed to lock directory
> '/Library/Frameworks/R.framework/Versions/2.0.1/Resources/library' for
> modifying
> Try removing
> '/Library/Frameworks/R.framework/Versions/2.0.1/Resources/library/ 
> 00LOCK'
>
> Delete downloaded files (y/N)? N
>
> Can anybody advise a bit more clearly about the origin of this failure
> and anticipate a way to work it around?
>
> Cheers,
>
> Patrick
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!  
> http://www.R-project.org/posting-guide.html
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] malloc: vm_allocate(size=381886464) failed (error code=3)

2006-03-08 Thread Mahdi Osman
Hi all,

I am having memory allocation problem with my R 2.2.1 for Mac OS. The
following is the error message that I get. I do not get this message if I
break down the large dataset in to sub datasets. I think breaking up the
dataset is not a sustainable solution in the long run. The data that I am
analysing is essentially big, and it would be reasonable to do the analyis
on the whole dataset without even considering to partition it. So I was
really wondering if you could give me a clue about how to handle this
problem.

model6 <-lm(logintens~ factor (slide) + factor(ind) + factor(dye) + 
factor(id)*factor(l6) + factor(rep)-1, data=sample_data2)

*** malloc: vm_allocate(size=381886464) failed (error code=3)
*** malloc[387]: error: Can't allocate region
*** malloc: vm_allocate(size=381886464) failed (error code=3)
*** malloc[387]: error: Can't allocate region
*** malloc: vm_allocate(size=381886464) failed (error code=3)
*** malloc[387]: error: Can't allocate region
*** malloc: vm_allocate(size=381886464) failed (error code=3)
*** malloc[387]: error: Can't allocate region
*** malloc: vm_allocate(size=381886464) failed (error code=3)
*** malloc[387]: error: Can't allocate region


Thanks for your help in advance

Regards

Mahdi

-- 
---
Mahdi Osman (PhD)
E-mail: [EMAIL PROTECTED]
---

Echte DSL-Flatrate dauerhaft für 0,- Euro*!

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] data import problem

2006-03-08 Thread Sean Davis



On 3/8/06 6:43 AM, "Philipp Pagel" <[EMAIL PROTECTED]> wrote:

> On Wed, Mar 08, 2006 at 12:32:28PM +0100, [EMAIL PROTECTED]
> wrote:
>> I'm trying to read a text data file that contains several records
>> separated by a blank line. Each record starts with a row that contains
>> it's ID and the number of rows for the records (two columns), then the
>> data table itself, e.g.
>> 
>> 123 5
>> 89.17911.1024
>> 90.57351.1024
>> 92.56661.1024
>> 95.07251.1024
>> 101.20701.1024
>> 
>> 321 3
>> 60.16011.1024
>> 64.80231.1024
>> 70.05932.1502
> 
> That sound like a job for awk. I think it will be much easier to
> transform the data into a flat table using awk, python or perl an then
> just read the table with R.

If you want to use R, you can use a simple combination of
readLines(con,n=1), strsplit on tabs, and simple if statements (to find
blank lines and start new records) to do parse these types of files.  I
thought this would be slow, but I do it in one of my own packages and find
that it is pretty fast.

Sean

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] data import problem

2006-03-08 Thread Carlos Ortega
Hello,

Well, you can also automate the process of reading multiple files and
generate a big and unique file, but that choce is again more complex than
running an awk or perl script.

In awk it would be as simpler as this:

gawk ' { if(NF>1) print $0 } ' bigfile.in > bigfile.out

Regards,
Carlos Ortega.


On 3/8/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]>
wrote:
>
> Well, the data is generated by a perl script, and I could just configure
> the perl script so that there is one file per data table, but I though I'd
> probably must more efficent to have all records in a single file rather than
> reading a thousands of small files ... .
>
>kind regards,
>
>Arne
>
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf Of Philipp Pagel
> Sent: Wednesday, March 08, 2006 12:44
> To: r-help@stat.math.ethz.ch
> Subject: Re: [R] data import problem
>
>
> On Wed, Mar 08, 2006 at 12:32:28PM +0100, [EMAIL PROTECTED]:
> > I'm trying to read a text data file that contains several records
> > separated by a blank line. Each record starts with a row that contains
> > it's ID and the number of rows for the records (two columns), then the
> > data table itself, e.g.
> >
> > 123 5
> > 89.17911.1024
> > 90.57351.1024
> > 92.56661.1024
> > 95.07251.1024
> > 101.20701.1024
> >
> > 321 3
> > 60.16011.1024
> > 64.80231.1024
> > 70.05932.1502
>
> That sound like a job for awk. I think it will be much easier to
> transform the data into a flat table using awk, python or perl an then
> just read the table with R.
>
> cu
>Philipp
>
> --
> Dr. Philipp PagelTel.  +49-8161-71 2131
> Dept. of Genome Oriented Bioinformatics  Fax.  +49-8161-71 2186
> Technical University of Munich
> Science Center Weihenstephan
> 85350 Freising, Germany
>
> and
>
> Institute for Bioinformatics / MIPS  Tel.  +49-89-3187 3675
> GSF - National Research Center   Fax.  +49-89-3187 3585
>  for Environment and Health
> Ingolstädter Landstrasse 1
> 85764 Neuherberg, Germany
> http://mips.gsf.de/staff/pagel
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
>

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] data import problem

2006-03-08 Thread Arne.Muller
Well, the data is generated by a perl script, and I could just configure the 
perl script so that there is one file per data table, but I though I'd probably 
must more efficent to have all records in a single file rather than reading a 
thousands of small files ... .

kind regards,

Arne

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of Philipp Pagel
Sent: Wednesday, March 08, 2006 12:44
To: r-help@stat.math.ethz.ch
Subject: Re: [R] data import problem


On Wed, Mar 08, 2006 at 12:32:28PM +0100, [EMAIL PROTECTED] wrote:
> I'm trying to read a text data file that contains several records
> separated by a blank line. Each record starts with a row that contains
> it's ID and the number of rows for the records (two columns), then the
> data table itself, e.g. 
> 
> 123 5
> 89.17911.1024
> 90.57351.1024
> 92.56661.1024
> 95.07251.1024
> 101.20701.1024
> 
> 321 3
> 60.16011.1024
> 64.80231.1024
> 70.05932.1502

That sound like a job for awk. I think it will be much easier to
transform the data into a flat table using awk, python or perl an then
just read the table with R. 

cu
Philipp

-- 
Dr. Philipp PagelTel.  +49-8161-71 2131
Dept. of Genome Oriented Bioinformatics  Fax.  +49-8161-71 2186
Technical University of Munich
Science Center Weihenstephan
85350 Freising, Germany

 and

Institute for Bioinformatics / MIPS  Tel.  +49-89-3187 3675
GSF - National Research Center   Fax.  +49-89-3187 3585
  for Environment and Health
Ingolstädter Landstrasse 1
85764 Neuherberg, Germany
http://mips.gsf.de/staff/pagel

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] data import problem

2006-03-08 Thread Philipp Pagel
On Wed, Mar 08, 2006 at 12:32:28PM +0100, [EMAIL PROTECTED] wrote:
> I'm trying to read a text data file that contains several records
> separated by a blank line. Each record starts with a row that contains
> it's ID and the number of rows for the records (two columns), then the
> data table itself, e.g. 
> 
> 123 5
> 89.17911.1024
> 90.57351.1024
> 92.56661.1024
> 95.07251.1024
> 101.20701.1024
> 
> 321 3
> 60.16011.1024
> 64.80231.1024
> 70.05932.1502

That sound like a job for awk. I think it will be much easier to
transform the data into a flat table using awk, python or perl an then
just read the table with R. 

cu
Philipp

-- 
Dr. Philipp PagelTel.  +49-8161-71 2131
Dept. of Genome Oriented Bioinformatics  Fax.  +49-8161-71 2186
Technical University of Munich
Science Center Weihenstephan
85350 Freising, Germany

 and

Institute for Bioinformatics / MIPS  Tel.  +49-89-3187 3675
GSF - National Research Center   Fax.  +49-89-3187 3585
  for Environment and Health
Ingolstädter Landstrasse 1
85764 Neuherberg, Germany
http://mips.gsf.de/staff/pagel

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] data import problem

2006-03-08 Thread Arne.Muller
Dear All,

I'm trying to read a text data file that contains several records separated by 
a blank line. Each record starts with a row that contains it's ID and the 
number of rows for the records (two columns), then the data table itself, e.g. 

123 5
89.17911.1024
90.57351.1024
92.56661.1024
95.07251.1024
101.20701.1024

321 3
60.16011.1024
64.80231.1024
70.05932.1502

...

I thought I coudl simply use something line this:

con <- file("test2.txt");
do {
e <- read.table(con, nlines = 1);
if ( length(e) == 2 ) {
d <- read.table(con, nrows = e[1,2]);
#process data frame d
}
} while (length(e) == 2);

The problem is that read.table closes the connection object, I assumed that it 
would not close the connection, and instead contines where it last stopped.

Since the data is nearly a simple table I though read.table could work rather 
than using scan directly. Any suggestions to read this file efficently are 
welcome (the file can contain several thousand record and each record can 
contain several thousand rows).

thanks a lot for your help,
+kind regards,

Arne


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Drop1 and weights

2006-03-08 Thread Yan Wong
On 2 Mar 2006, at 17:56, Prof Brian Ripley wrote:

> On Thu, 2 Mar 2006, Yan Wong wrote:
>
>>
>> Indeed, I realised this, but assumed that the problem could be
>> understood even without my dataset. I'll test my data on the patched
>> version when it becomes available, and re-post then.

I've just tried R-patched, and it now works as expected. Thanks for  
the quick fix. As you point out, the glm and lm versions differ in  
AIC reported, but only by an additive constant, and so the tests are  
not affected.

Yan



 > weighted.lm <- lm(trSex ~ (river+length +depth)^2-length:depth,  
dno2, weights=males+females)
 > d1.lm <- drop1(weighted.lm, test="F");
 > weighted.glm <- glm(trSex ~ (river+length +depth)^2-length:depth,  
dno2, weights=males+females, family="gaussian")
 > d1.glm <- drop1(weighted.glm, test="F");
 >
 > #differ by a constant
 > d1.lm$AIC - d1.glm$AIC
[1] 628.1381 628.1381 628.1381
 >
 > Anova(weighted.lm)
Anova Table (Type II tests)

Response: trSex
  Sum Sq  Df F value  Pr(>F)
river 1.471   1  3.3775 0.06843 .
length1.002   1  2.2999 0.13187
depth 2.974   1  6.8295 0.01005 *
river:length  0.075   1  0.1733 0.67790
river:depth   2.020   1  4.6397 0.03313 *
Residuals55.303 127
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 > d1.lm
Single term deletions

Model:
trSex ~ (river + length + depth)^2 - length:depth
  Df Sum of Sq  RSS  AIC F value   Pr(F)
  55.303 -104.710
river:length  1 0.075   55.379 -106.529  0.1733 0.67790
river:depth   1 2.020   57.324 -101.938  4.6397 0.03313 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 > d1.glm
Single term deletions

Model:
trSex ~ (river + length + depth)^2 - length:depth
  Df Deviance AIC F value   Pr(F)
 55.30 -732.85
river:length  155.38 -734.67  0.1733 0.67790
river:depth   157.32 -730.08  4.6397 0.03313 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] bug in map('world') ?

2006-03-08 Thread Joerg van den Hoff
hi,

did'nt see anything in the archive:

map('world',pro='rectangular',para=0)

yields a strange artifact (horizontal bar) extending over the whole map 
at a certain latitude range (approx 65 deg. north), whereas

map('world',pro='rectangular',para=180)

(which should be the same) does not show the artifact.

the artifact shows up in other projections as well, e.g.

map('world',pro='bonne',para=45)


which seems to show that the problem might be in the data base of the 
map (i.e. polygon definition)??


any ideas?

regards,

joerg

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] predicted values in mgcv gam

2006-03-08 Thread Simon Wood
Hi Denis,

Your first plot is of f(x) against x where \sum_i f(x_i) = 0 (x_i are
observed x's).

Your second plot is of \exp(\alpha + f(x)) against x where
\sum_i f(x_i)=0, and \alpha is an intercept parameter.

So the zero line on the first plot, corresponds to the \exp(\alpha) line
on the second plot (which is not the same as the mean of the response
data).

If you replace
> abline(h=mean.y,lty=5,col=grey(0.35))
by
> abline(h=coef(gam2)[1],lty=5,col=grey(0.35))
then everything should work.

best,
Simon

On Sun, 5 Mar 2006, Denis Chabot wrote:

> Hi,
>
> In fitting GAMs to assess environmental preferences, I use the part
> of the fit where the lower confidence interval is above zero as my
> criterion for positive association between the environmental variable
> and species abundance. However I like to plot this on the original
> scale of species abundance. To do so I extract the fit and SE using
> predict.gam.
>
> Lately I compared more carefully the plots I obtain in this way and
> those obtained with plot.gam and noticed differences which I do not
> understand.
>
> To avoid sending a large dataset I took an example from gam Help to
> illustrate this.
>
> Was I wrong to believe that the fit and its confidence band should
> behave the same way on both scales?
>
> Thanks in advance,
>
> Denis Chabot
> ###
> library(mgcv)
> set.seed(0)
> n<-400
> sig<-2
> x0 <- runif(n, 0, 1)
> x1 <- runif(n, 0, 1)
> x2 <- runif(n, 0, 1)
> x3 <- runif(n, 0, 1)
> f0 <- function(x) 2 * sin(pi * x)
> f1 <- function(x) exp(2 * x)
> f2 <- function(x) 0.2*x^11*(10*(1-x))^6+10*(10*x)^3*(1-x)^10
> f <- f0(x0) + f1(x1) + f2(x2)
> g<-exp(f/4)
> y<-rpois(rep(1,n),g)
> mean.y <- mean(y)
>
> gam2 <- gam(y~ s(x2), poisson)
>
> # to plot on the response scale
> val.for.pred <- data.frame(x2=seq(min(x2), max(x2), length.out=100))
> pred.2.resp <- predict.gam(gam2, val.for.pred ,type="response",
> se.fit=TRUE)
> lower.band <- pred.2.resp$fit - 2*pred.2.resp$se.fit
> upper.band <- pred.2.resp$fit + 2*pred.2.resp$se.fit
> pred.2.resp <- data.frame(val.for.pred, pred.2.resp, lower.band,
> upper.band)
>
> # same thing on term scale
> pred.2.term <- predict.gam(gam2, val.for.pred ,type="terms",
> se.fit=TRUE)
> lower.band <- pred.2.term$fit - 2*pred.2.term$se.fit
> upper.band <- pred.2.term$fit + 2*pred.2.term$se.fit
> pred.2.term <- data.frame(val.for.pred, pred.2.term, lower.band,
> upper.band)
>
> # it is easier to compare two plots instead of looking at these two
> data.frames
>
> plot(gam2, residuals=T, pch=1, cex=0.7)
> abline(h=0)
>
> plot(y~x2, col=grey(0.5))
> lines(fit~x2, col="blue", data=pred.2.resp)
> lines(lower.band~x2, col="red", lty=2, data=pred.2.resp)
> lines(upper.band~x2, col="red", lty=2, data=pred.2.resp)
> abline(h=mean.y,lty=5,col=grey(0.35))
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Read.table

2006-03-08 Thread Uwe Ligges
Matias Mayor Fernandez wrote:

> Hi,
> 
>  
> 
> I have some column vector in txt or xls and I need to load into R as numeric
> vector.
> 
>  
> 
> I use the read.table  (X=read.table(123.txt”) command but the program say
> that “X is not a numeric vector”

No, I think you got:

   Error: syntax error in "(X=read.table(123.txt"

or you have used another call without the syntax error in it.

In any case, please be more specific what you did. You might also want 
to copy the first few lines of file 123.txt in your mail.

Uwe Ligges


>  
> 
>  
> 
> Where is the problem?
> 
>  
> 
> Matías
> 
>  
> 
> University of Oviedo,
> 
>  
> 
> Spain
> 
> 
>   [[alternative HTML version deleted]]
> 
> 
> 
> 
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Read.table

2006-03-08 Thread Matias Mayor Fernandez
Hi,

 

I have some column vector in txt or xls and I need to load into R as numeric
vector.

 

I use the read.table  (X=read.table(123.txt”) command but the program say
that “X is not a numeric vector”

 

 

Where is the problem?

 

Matías

 

University of Oviedo,

 

Spain


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Degrees of freedom using Box.test()

2006-03-08 Thread Patrick Burns
You are saying that the penalty on the degrees of freedom
should be the same whether the model was fit with 100
observations or 1 million observations.  You are also saying
that some tests should have negative degrees of freedom.
So I don't think your proposal is the right answer, though
presumably there should be some penalty.

There is a working paper on the Burns Statistics website
about robustness in Ljung-Box tests, but this issue is not one
that is covered.


Patrick Burns
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")

Nestor Arguea wrote:

>After an RSiteSeach("Box.test") I found some discussion regarding the degrees 
>of freedom in the computation of the Ljung-Box test using Box.test(), but did 
>not find any posting about the proper degrees of freedom.  
>
>Box.test() uses "lag=number" as the degrees of freedom.  However, I believe  
>the correct degrees of freedom should be "number-p-q" where p and q are the 
>number of estimated parameters (for instance, in a Box-Jenkins family of 
>models). This, according to the main source in documentation of Box.test:
>
>G. M. Ljung and G. E. P. Box, On a measure of Lack of Fit in Time Series 
>Models, Biometrika, Vol. 65, No. 2 (August, 1978), pp. 297-303.
>
>One can still compute the correct p-value with
>
>  
>
>>1-pchisq(value,correctdf)
>>
>>
>
>
>Nestor
>(R 2.2.1 on Linux, Suse 9.3)
>
>  
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] removing of memory - optim()?

2006-03-08 Thread Marcel Prokopczuk
Dear all,

I have the following problem: I am using Windows XP as OS and have a R
program which uses optim() in a loop. Although I overwrite every variable in
the loop the memory R is using is increasing until my system breaks down
because the swap file is getting just to big. I suspect that optim() is
creating some variables which are not deleted automatically. So I tried to
do the following: every 10th loop or so, I save the variables I want to
keep, delete the memory with rm(list=ls(all=TRUE)), and load back the data I
saved before. But this is also not working. rm(list=ls(all=TRUE)) does not
delete everything (I can see in the Task Manager of Windows that the memory
occupied by the Rgui process stays huge).

Does anybody had similar problems and/or know how to solve it. Thanks in
advance for your help.

Marcel

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] package installation on Mac OS X 10.3.9

2006-03-08 Thread Patrick Giraudoux
Dear listers,

I am tryin to install a package in a student training room. 
Unsuccessfull! With this message:

 > install.packages("pgirmess")
trying URL `http://cran.r-project.org/src/contrib/PACKAGES'
Content type `text/plain; charset=iso-8859-1' length 77400 bytes
opened URL
==
downloaded 75Kb

trying URL `http://cran.r-project.org/src/contrib/pgirmess_1.2.5.tar.gz'
Content type `application/x-tar' length 49962 bytes
opened URL
==
downloaded 48Kb

ERROR: failed to lock directory 
'/Library/Frameworks/R.framework/Versions/2.0.1/Resources/library' for 
modifying
Try removing 
'/Library/Frameworks/R.framework/Versions/2.0.1/Resources/library/00LOCK'

Delete downloaded files (y/N)? N

Can anybody advise a bit more clearly about the origin of this failure 
and anticipate a way to work it around?

Cheers,

Patrick

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] info() function?

2006-03-08 Thread Henrik Bengtsson
>library(R.oo)
>ll()
 member data.class dimension object.size
1 anumeric  10004028
2author  character 1 112
3   expnumeric 1  36
4  last.warning   list 2 488
5object   function  NULL 864
6   row  character 1  72
7 USArrests data.frame   c(50,4)4076
8  VADeaths matrixc(5,4) 824
9 value  character 1  72

with NA counts:
>naCount <- function(x, ...) ifelse(is.vector(x), sum(is.na(x)), NA)
>properties <- c("data.class", "dimension", "object.size", "naCount")
ll(properties=properties)
 member data.class dimension object.size
1 anumeric  10004028
2author  character 1 112
3   expnumeric 1  36
4  last.warning   list 2 488
5   naCount   function  NULL 864
6object   function  NULL 864
7properties  character 4 212
8   row  character 1  72
9 USArrests data.frame   c(50,4)4076
10 VADeaths matrixc(5,4) 824
11value  character 1  72

FYI: In next version of R.oo, there will probably be some kind of
option to set the default 'properties' argument so that this must not
be given explicitly by default.

Cheers

Henrik

On 3/8/06, Robert Lundqvist <[EMAIL PROTECTED]> wrote:
> I would like to have some function for getting an overview of the
> variables in a worksheet. Class, dimesions, length, number of missing
> values,... Guess it wouldn't be that hard to set up such a function, but I
> guess there are others who have made it already. Or is it already a
> standard feature in the base package? Any suggestions?
>
> Robert
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>
>


--
Henrik Bengtsson
Mobile: +46 708 909208 (+1h UTC)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] info() function?

2006-03-08 Thread Robert Lundqvist
I would like to have some function for getting an overview of the
variables in a worksheet. Class, dimesions, length, number of missing
values,... Guess it wouldn't be that hard to set up such a function, but I
guess there are others who have made it already. Or is it already a
standard feature in the base package? Any suggestions?

Robert

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html