[R] info() function?
I would like to have some function for getting an overview of the variables in a worksheet. Class, dimesions, length, number of missing values,... Guess it wouldn't be that hard to set up such a function, but I guess there are others who have made it already. Or is it already a standard feature in the base package? Any suggestions? Robert __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] info() function?
library(R.oo) ll() member data.class dimension object.size 1 anumeric 10004028 2author character 1 112 3 expnumeric 1 36 4 last.warning list 2 488 5object function NULL 864 6 row character 1 72 7 USArrests data.frame c(50,4)4076 8 VADeaths matrixc(5,4) 824 9 value character 1 72 with NA counts: naCount - function(x, ...) ifelse(is.vector(x), sum(is.na(x)), NA) properties - c(data.class, dimension, object.size, naCount) ll(properties=properties) member data.class dimension object.size 1 anumeric 10004028 2author character 1 112 3 expnumeric 1 36 4 last.warning list 2 488 5 naCount function NULL 864 6object function NULL 864 7properties character 4 212 8 row character 1 72 9 USArrests data.frame c(50,4)4076 10 VADeaths matrixc(5,4) 824 11value character 1 72 FYI: In next version of R.oo, there will probably be some kind of option to set the default 'properties' argument so that this must not be given explicitly by default. Cheers Henrik On 3/8/06, Robert Lundqvist [EMAIL PROTECTED] wrote: I would like to have some function for getting an overview of the variables in a worksheet. Class, dimesions, length, number of missing values,... Guess it wouldn't be that hard to set up such a function, but I guess there are others who have made it already. Or is it already a standard feature in the base package? Any suggestions? Robert __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Henrik Bengtsson Mobile: +46 708 909208 (+1h UTC) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] package installation on Mac OS X 10.3.9
Dear listers, I am tryin to install a package in a student training room. Unsuccessfull! With this message: install.packages(pgirmess) trying URL `http://cran.r-project.org/src/contrib/PACKAGES' Content type `text/plain; charset=iso-8859-1' length 77400 bytes opened URL == downloaded 75Kb trying URL `http://cran.r-project.org/src/contrib/pgirmess_1.2.5.tar.gz' Content type `application/x-tar' length 49962 bytes opened URL == downloaded 48Kb ERROR: failed to lock directory '/Library/Frameworks/R.framework/Versions/2.0.1/Resources/library' for modifying Try removing '/Library/Frameworks/R.framework/Versions/2.0.1/Resources/library/00LOCK' Delete downloaded files (y/N)? N Can anybody advise a bit more clearly about the origin of this failure and anticipate a way to work it around? Cheers, Patrick __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] removing of memory - optim()?
Dear all, I have the following problem: I am using Windows XP as OS and have a R program which uses optim() in a loop. Although I overwrite every variable in the loop the memory R is using is increasing until my system breaks down because the swap file is getting just to big. I suspect that optim() is creating some variables which are not deleted automatically. So I tried to do the following: every 10th loop or so, I save the variables I want to keep, delete the memory with rm(list=ls(all=TRUE)), and load back the data I saved before. But this is also not working. rm(list=ls(all=TRUE)) does not delete everything (I can see in the Task Manager of Windows that the memory occupied by the Rgui process stays huge). Does anybody had similar problems and/or know how to solve it. Thanks in advance for your help. Marcel __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Degrees of freedom using Box.test()
You are saying that the penalty on the degrees of freedom should be the same whether the model was fit with 100 observations or 1 million observations. You are also saying that some tests should have negative degrees of freedom. So I don't think your proposal is the right answer, though presumably there should be some penalty. There is a working paper on the Burns Statistics website about robustness in Ljung-Box tests, but this issue is not one that is covered. Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and A Guide for the Unwilling S User) Nestor Arguea wrote: After an RSiteSeach(Box.test) I found some discussion regarding the degrees of freedom in the computation of the Ljung-Box test using Box.test(), but did not find any posting about the proper degrees of freedom. Box.test() uses lag=number as the degrees of freedom. However, I believe the correct degrees of freedom should be number-p-q where p and q are the number of estimated parameters (for instance, in a Box-Jenkins family of models). This, according to the main source in documentation of Box.test: G. M. Ljung and G. E. P. Box, On a measure of Lack of Fit in Time Series Models, Biometrika, Vol. 65, No. 2 (August, 1978), pp. 297-303. One can still compute the correct p-value with 1-pchisq(value,correctdf) Nestor (R 2.2.1 on Linux, Suse 9.3) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Read.table
Hi, I have some column vector in txt or xls and I need to load into R as numeric vector. I use the read.table (X=read.table(123.txt) command but the program say that X is not a numeric vector Where is the problem? Matías University of Oviedo, Spain [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Read.table
Matias Mayor Fernandez wrote: Hi, I have some column vector in txt or xls and I need to load into R as numeric vector. I use the read.table (X=read.table(123.txt”) command but the program say that “X is not a numeric vector” No, I think you got: Error: syntax error in (X=read.table(123.txt or you have used another call without the syntax error in it. In any case, please be more specific what you did. You might also want to copy the first few lines of file 123.txt in your mail. Uwe Ligges Where is the problem? Matías University of Oviedo, Spain [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] predicted values in mgcv gam
Hi Denis, Your first plot is of f(x) against x where \sum_i f(x_i) = 0 (x_i are observed x's). Your second plot is of \exp(\alpha + f(x)) against x where \sum_i f(x_i)=0, and \alpha is an intercept parameter. So the zero line on the first plot, corresponds to the \exp(\alpha) line on the second plot (which is not the same as the mean of the response data). If you replace abline(h=mean.y,lty=5,col=grey(0.35)) by abline(h=coef(gam2)[1],lty=5,col=grey(0.35)) then everything should work. best, Simon On Sun, 5 Mar 2006, Denis Chabot wrote: Hi, In fitting GAMs to assess environmental preferences, I use the part of the fit where the lower confidence interval is above zero as my criterion for positive association between the environmental variable and species abundance. However I like to plot this on the original scale of species abundance. To do so I extract the fit and SE using predict.gam. Lately I compared more carefully the plots I obtain in this way and those obtained with plot.gam and noticed differences which I do not understand. To avoid sending a large dataset I took an example from gam Help to illustrate this. Was I wrong to believe that the fit and its confidence band should behave the same way on both scales? Thanks in advance, Denis Chabot ### library(mgcv) set.seed(0) n-400 sig-2 x0 - runif(n, 0, 1) x1 - runif(n, 0, 1) x2 - runif(n, 0, 1) x3 - runif(n, 0, 1) f0 - function(x) 2 * sin(pi * x) f1 - function(x) exp(2 * x) f2 - function(x) 0.2*x^11*(10*(1-x))^6+10*(10*x)^3*(1-x)^10 f - f0(x0) + f1(x1) + f2(x2) g-exp(f/4) y-rpois(rep(1,n),g) mean.y - mean(y) gam2 - gam(y~ s(x2), poisson) # to plot on the response scale val.for.pred - data.frame(x2=seq(min(x2), max(x2), length.out=100)) pred.2.resp - predict.gam(gam2, val.for.pred ,type=response, se.fit=TRUE) lower.band - pred.2.resp$fit - 2*pred.2.resp$se.fit upper.band - pred.2.resp$fit + 2*pred.2.resp$se.fit pred.2.resp - data.frame(val.for.pred, pred.2.resp, lower.band, upper.band) # same thing on term scale pred.2.term - predict.gam(gam2, val.for.pred ,type=terms, se.fit=TRUE) lower.band - pred.2.term$fit - 2*pred.2.term$se.fit upper.band - pred.2.term$fit + 2*pred.2.term$se.fit pred.2.term - data.frame(val.for.pred, pred.2.term, lower.band, upper.band) # it is easier to compare two plots instead of looking at these two data.frames plot(gam2, residuals=T, pch=1, cex=0.7) abline(h=0) plot(y~x2, col=grey(0.5)) lines(fit~x2, col=blue, data=pred.2.resp) lines(lower.band~x2, col=red, lty=2, data=pred.2.resp) lines(upper.band~x2, col=red, lty=2, data=pred.2.resp) abline(h=mean.y,lty=5,col=grey(0.35)) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] bug in map('world') ?
hi, did'nt see anything in the archive: map('world',pro='rectangular',para=0) yields a strange artifact (horizontal bar) extending over the whole map at a certain latitude range (approx 65 deg. north), whereas map('world',pro='rectangular',para=180) (which should be the same) does not show the artifact. the artifact shows up in other projections as well, e.g. map('world',pro='bonne',para=45) which seems to show that the problem might be in the data base of the map (i.e. polygon definition)?? any ideas? regards, joerg __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Drop1 and weights
On 2 Mar 2006, at 17:56, Prof Brian Ripley wrote: On Thu, 2 Mar 2006, Yan Wong wrote: Indeed, I realised this, but assumed that the problem could be understood even without my dataset. I'll test my data on the patched version when it becomes available, and re-post then. I've just tried R-patched, and it now works as expected. Thanks for the quick fix. As you point out, the glm and lm versions differ in AIC reported, but only by an additive constant, and so the tests are not affected. Yan weighted.lm - lm(trSex ~ (river+length +depth)^2-length:depth, dno2, weights=males+females) d1.lm - drop1(weighted.lm, test=F); weighted.glm - glm(trSex ~ (river+length +depth)^2-length:depth, dno2, weights=males+females, family=gaussian) d1.glm - drop1(weighted.glm, test=F); #differ by a constant d1.lm$AIC - d1.glm$AIC [1] 628.1381 628.1381 628.1381 Anova(weighted.lm) Anova Table (Type II tests) Response: trSex Sum Sq Df F value Pr(F) river 1.471 1 3.3775 0.06843 . length1.002 1 2.2999 0.13187 depth 2.974 1 6.8295 0.01005 * river:length 0.075 1 0.1733 0.67790 river:depth 2.020 1 4.6397 0.03313 * Residuals55.303 127 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 d1.lm Single term deletions Model: trSex ~ (river + length + depth)^2 - length:depth Df Sum of Sq RSS AIC F value Pr(F) none 55.303 -104.710 river:length 1 0.075 55.379 -106.529 0.1733 0.67790 river:depth 1 2.020 57.324 -101.938 4.6397 0.03313 * --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 d1.glm Single term deletions Model: trSex ~ (river + length + depth)^2 - length:depth Df Deviance AIC F value Pr(F) none 55.30 -732.85 river:length 155.38 -734.67 0.1733 0.67790 river:depth 157.32 -730.08 4.6397 0.03313 * --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] data import problem
Dear All, I'm trying to read a text data file that contains several records separated by a blank line. Each record starts with a row that contains it's ID and the number of rows for the records (two columns), then the data table itself, e.g. 123 5 89.17911.1024 90.57351.1024 92.56661.1024 95.07251.1024 101.20701.1024 321 3 60.16011.1024 64.80231.1024 70.05932.1502 ... I thought I coudl simply use something line this: con - file(test2.txt); do { e - read.table(con, nlines = 1); if ( length(e) == 2 ) { d - read.table(con, nrows = e[1,2]); #process data frame d } } while (length(e) == 2); The problem is that read.table closes the connection object, I assumed that it would not close the connection, and instead contines where it last stopped. Since the data is nearly a simple table I though read.table could work rather than using scan directly. Any suggestions to read this file efficently are welcome (the file can contain several thousand record and each record can contain several thousand rows). thanks a lot for your help, +kind regards, Arne [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] data import problem
On Wed, Mar 08, 2006 at 12:32:28PM +0100, [EMAIL PROTECTED] wrote: I'm trying to read a text data file that contains several records separated by a blank line. Each record starts with a row that contains it's ID and the number of rows for the records (two columns), then the data table itself, e.g. 123 5 89.17911.1024 90.57351.1024 92.56661.1024 95.07251.1024 101.20701.1024 321 3 60.16011.1024 64.80231.1024 70.05932.1502 That sound like a job for awk. I think it will be much easier to transform the data into a flat table using awk, python or perl an then just read the table with R. cu Philipp -- Dr. Philipp PagelTel. +49-8161-71 2131 Dept. of Genome Oriented Bioinformatics Fax. +49-8161-71 2186 Technical University of Munich Science Center Weihenstephan 85350 Freising, Germany and Institute for Bioinformatics / MIPS Tel. +49-89-3187 3675 GSF - National Research Center Fax. +49-89-3187 3585 for Environment and Health Ingolstädter Landstrasse 1 85764 Neuherberg, Germany http://mips.gsf.de/staff/pagel __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] data import problem
Well, the data is generated by a perl script, and I could just configure the perl script so that there is one file per data table, but I though I'd probably must more efficent to have all records in a single file rather than reading a thousands of small files ... . kind regards, Arne -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Philipp Pagel Sent: Wednesday, March 08, 2006 12:44 To: r-help@stat.math.ethz.ch Subject: Re: [R] data import problem On Wed, Mar 08, 2006 at 12:32:28PM +0100, [EMAIL PROTECTED] wrote: I'm trying to read a text data file that contains several records separated by a blank line. Each record starts with a row that contains it's ID and the number of rows for the records (two columns), then the data table itself, e.g. 123 5 89.17911.1024 90.57351.1024 92.56661.1024 95.07251.1024 101.20701.1024 321 3 60.16011.1024 64.80231.1024 70.05932.1502 That sound like a job for awk. I think it will be much easier to transform the data into a flat table using awk, python or perl an then just read the table with R. cu Philipp -- Dr. Philipp PagelTel. +49-8161-71 2131 Dept. of Genome Oriented Bioinformatics Fax. +49-8161-71 2186 Technical University of Munich Science Center Weihenstephan 85350 Freising, Germany and Institute for Bioinformatics / MIPS Tel. +49-89-3187 3675 GSF - National Research Center Fax. +49-89-3187 3585 for Environment and Health Ingolstädter Landstrasse 1 85764 Neuherberg, Germany http://mips.gsf.de/staff/pagel __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] data import problem
Hello, Well, you can also automate the process of reading multiple files and generate a big and unique file, but that choce is again more complex than running an awk or perl script. In awk it would be as simpler as this: gawk ' { if(NF1) print $0 } ' bigfile.in bigfile.out Regards, Carlos Ortega. On 3/8/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Well, the data is generated by a perl script, and I could just configure the perl script so that there is one file per data table, but I though I'd probably must more efficent to have all records in a single file rather than reading a thousands of small files ... . kind regards, Arne -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Philipp Pagel Sent: Wednesday, March 08, 2006 12:44 To: r-help@stat.math.ethz.ch Subject: Re: [R] data import problem On Wed, Mar 08, 2006 at 12:32:28PM +0100, [EMAIL PROTECTED]: I'm trying to read a text data file that contains several records separated by a blank line. Each record starts with a row that contains it's ID and the number of rows for the records (two columns), then the data table itself, e.g. 123 5 89.17911.1024 90.57351.1024 92.56661.1024 95.07251.1024 101.20701.1024 321 3 60.16011.1024 64.80231.1024 70.05932.1502 That sound like a job for awk. I think it will be much easier to transform the data into a flat table using awk, python or perl an then just read the table with R. cu Philipp -- Dr. Philipp PagelTel. +49-8161-71 2131 Dept. of Genome Oriented Bioinformatics Fax. +49-8161-71 2186 Technical University of Munich Science Center Weihenstephan 85350 Freising, Germany and Institute for Bioinformatics / MIPS Tel. +49-89-3187 3675 GSF - National Research Center Fax. +49-89-3187 3585 for Environment and Health Ingolstädter Landstrasse 1 85764 Neuherberg, Germany http://mips.gsf.de/staff/pagel __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] data import problem
On 3/8/06 6:43 AM, Philipp Pagel [EMAIL PROTECTED] wrote: On Wed, Mar 08, 2006 at 12:32:28PM +0100, [EMAIL PROTECTED] wrote: I'm trying to read a text data file that contains several records separated by a blank line. Each record starts with a row that contains it's ID and the number of rows for the records (two columns), then the data table itself, e.g. 123 5 89.17911.1024 90.57351.1024 92.56661.1024 95.07251.1024 101.20701.1024 321 3 60.16011.1024 64.80231.1024 70.05932.1502 That sound like a job for awk. I think it will be much easier to transform the data into a flat table using awk, python or perl an then just read the table with R. If you want to use R, you can use a simple combination of readLines(con,n=1), strsplit on tabs, and simple if statements (to find blank lines and start new records) to do parse these types of files. I thought this would be slow, but I do it in one of my own packages and find that it is pretty fast. Sean __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] malloc: vm_allocate(size=381886464) failed (error code=3)
Hi all, I am having memory allocation problem with my R 2.2.1 for Mac OS. The following is the error message that I get. I do not get this message if I break down the large dataset in to sub datasets. I think breaking up the dataset is not a sustainable solution in the long run. The data that I am analysing is essentially big, and it would be reasonable to do the analyis on the whole dataset without even considering to partition it. So I was really wondering if you could give me a clue about how to handle this problem. model6 -lm(logintens~ factor (slide) + factor(ind) + factor(dye) + factor(id)*factor(l6) + factor(rep)-1, data=sample_data2) *** malloc: vm_allocate(size=381886464) failed (error code=3) *** malloc[387]: error: Can't allocate region *** malloc: vm_allocate(size=381886464) failed (error code=3) *** malloc[387]: error: Can't allocate region *** malloc: vm_allocate(size=381886464) failed (error code=3) *** malloc[387]: error: Can't allocate region *** malloc: vm_allocate(size=381886464) failed (error code=3) *** malloc[387]: error: Can't allocate region *** malloc: vm_allocate(size=381886464) failed (error code=3) *** malloc[387]: error: Can't allocate region Thanks for your help in advance Regards Mahdi -- --- Mahdi Osman (PhD) E-mail: [EMAIL PROTECTED] --- Echte DSL-Flatrate dauerhaft für 0,- Euro*! __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] package installation on Mac OS X 10.3.9
Hi, What is the R version you use? I am also working with OS X 10.3.9 and I got this error once with a older release of R. I found that when I used the 'superuser' it worked. either: $ sudo r and do the install as you did or: $ sudo R CMD INSTALL /path/to/your/package.tar.gz Best, David On Mar 8, 2006, at 9:58, Patrick Giraudoux wrote: Dear listers, I am tryin to install a package in a student training room. Unsuccessfull! With this message: install.packages(pgirmess) trying URL `http://cran.r-project.org/src/contrib/PACKAGES' Content type `text/plain; charset=iso-8859-1' length 77400 bytes opened URL == downloaded 75Kb trying URL `http://cran.r-project.org/src/contrib/pgirmess_1.2.5.tar.gz' Content type `application/x-tar' length 49962 bytes opened URL == downloaded 48Kb ERROR: failed to lock directory '/Library/Frameworks/R.framework/Versions/2.0.1/Resources/library' for modifying Try removing '/Library/Frameworks/R.framework/Versions/2.0.1/Resources/library/ 00LOCK' Delete downloaded files (y/N)? N Can anybody advise a bit more clearly about the origin of this failure and anticipate a way to work it around? Cheers, Patrick __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] malloc: vm_allocate(size=381886464) failed (error code=3)
On 3/8/06 6:58 AM, Mahdi Osman [EMAIL PROTECTED] wrote: Hi all, I am having memory allocation problem with my R 2.2.1 for Mac OS. The following is the error message that I get. I do not get this message if I break down the large dataset in to sub datasets. I think breaking up the dataset is not a sustainable solution in the long run. The data that I am analysing is essentially big, and it would be reasonable to do the analyis on the whole dataset without even considering to partition it. So I was really wondering if you could give me a clue about how to handle this problem. model6 -lm(logintens~ factor (slide) + factor(ind) + factor(dye) + factor(id)*factor(l6) + factor(rep)-1, data=sample_data2) *** malloc: vm_allocate(size=381886464) failed (error code=3) *** malloc[387]: error: Can't allocate region *** malloc: vm_allocate(size=381886464) failed (error code=3) *** malloc[387]: error: Can't allocate region *** malloc: vm_allocate(size=381886464) failed (error code=3) *** malloc[387]: error: Can't allocate region *** malloc: vm_allocate(size=381886464) failed (error code=3) *** malloc[387]: error: Can't allocate region *** malloc: vm_allocate(size=381886464) failed (error code=3) *** malloc[387]: error: Can't allocate region Mahdi, How large are the 'id' and 'l6' factors--you are looking for all interaction terms? As an aside, these look like microarray data. If they are, is there a reason that you couldn't use one of the bioconductor packages to do the analyses (look at limma and maanova, in particular)? Also, is there a reason not to do a gene-by-gene analysis? Sean __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] data import problem
On Wed, Mar 08, 2006 at 12:49:29PM +0100, [EMAIL PROTECTED] wrote: Well, the data is generated by a perl script, and I could just configure the perl script so that there is one file per data table, but I though I'd probably must more efficent to have all records in a single file rather than reading a thousands of small files ... . I guess I would make it a singe file and put the IDs in their own column: ID x y 123 89.17911.1024 123 90.57351.1024 123 92.56661.1024 123 95.07251.1024 123 101.20701.1024 321 60.16011.1024 321 64.80231.1024 321 70.05932.1502 ... cu Philipp -- Dr. Philipp PagelTel. +49-8161-71 2131 Dept. of Genome Oriented Bioinformatics Fax. +49-8161-71 2186 Technical University of Munich Science Center Weihenstephan 85350 Freising, Germany and Institute for Bioinformatics / MIPS Tel. +49-89-3187 3675 GSF - National Research Center Fax. +49-89-3187 3585 for Environment and Health Ingolstädter Landstrasse 1 85764 Neuherberg, Germany http://mips.gsf.de/staff/pagel __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Survival Plots by Strata
All, I am struggling to create a survival plot using LTRC data for each year of a 10 year period. I have a set of individuals (birds) where 'entry' is the day of the year (1-365) they are released (let out of pens) into the wild (2 year data snip below). 'Entry' (e.g., day of year the first bird is released for each year) is highly variable, ranging from 48 to 250. When I create a plot using the below code I would like to remove the 'solid' line which represent S_hat=1 out to the LC point for each year and instead have the survival curves formatted in more of a 'hanging' style, where the LC day (e.g., min(Entry for each year)) is where each curve begins at S_hat=1 for each year (and does not extend back to the y-axis)? I could not find anything on this in the archives or MASS or Survival Analysis using S? Anyone have a suggestion on where to look? TIA, Bret #Code snip for R email. apc.coxfit1-coxph(Surv(Entry, Exit, Fate)~Sex + Agerelease + Dayrelease + strata(Year), data=mydat) coxfit.apc-survfit(apc.coxfit1) coxfit.apc plot(survfit(apc.coxfit1), conf.int=F, log=T, lty=c(1:2), col=c(1:2), xlim=c(205, 800)) #not run--first entry for this example is day 205 for 1996, 259 for 1997 mydat ID YearDayrelease Agerelease SurvivorshipEntry ExitFateSex 16240 1996205 95 164 205 369 1 0 16319 1996205 88 140 205 345 1 0 16378 1996248 108 100 248 348 1 0 20383 1996241 98 204 241 445 1 0 16324 1996219 90 227 219 446 1 0 16327 1996219 90 497 219 716 1 0 20373 1996241 114 413 241 654 1 0 20374 1996241 111 211 241 452 1 0 16241 1996205 95 234 205 439 1 1 16321 1996219 90 118 219 337 1 1 16323 1996219 90 180 219 399 1 1 20375 1996241 103 268 241 509 1 1 20384 1996241 98 299 241 540 1 1 20390 1996241 93 204 241 445 1 1 20393 1996241 88 208 241 449 1 1 16313 1996248 122 512 248 760 0 1 20378 1996241 103 236 241 477 0 1 20381 1996241 101 329 241 570 0 1 16328 1996219 90 224 219 443 0 1 16827 1997259 127 52 259 311 1 0 16828 1997259 127 216 259 475 1 0 16831 1997303 171 19 303 322 1 0 20466 1997289 149 31 289 320 1 0 20469 1997289 149 199 289 488 1 0 20483 1997289 134 18 289 307 1 0 16807 1997259 137 223 259 482 1 0 16809 1997259 137 1 259 260 1 0 16819 1997259 131 237 259 496 1 0 16829 1997303 171 7 303 310 1 0 20440 1997289 161 7 289 296 1 0 20470 1997289 148 257 289 546 1 0 20478 1997289 143 12 289 301 1 0 16817 1997259 130 85 259 344 1 0 20454 1997289 154 4 289 293 1 1 20459 1997289 153 335 289 624 1 1 20460 1997289 153 118 289 407 1 1 20465 1997289 150 31 289 320 1 1 20473 1997289 147 65 289 354 1 1 20484 1997289 133 58 289 347 1 1 20489 1997289 130 56 289 345 1 1 16808 1997259 137 137 259 396 0 1 16810 1997303 181 64 303 367 0 1 16816 1997259 130 1 259 260 0 1 20471 1997289 147 334 289 623 0 1 16826 1997259 127 20 259 279 0 1 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Multiple logistic regression
Dear R-users, Is there a function in R that classifies data in more than 2 groups using logistic regression/classification? I want to compare the c-indices of earlier research (lrm, binary response variables) with new c-indices obtained from 'multiple' (more response variables) logistic regression. Best regards, Stephanie Delalieux Department Biosystems M³-BIORES Group of Geomatics Engineering [EMAIL PROTECTED] Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] combining results and exporting problems
hello everybody, this is my first post to the list. i am facing two foolish but tedious problems: 1) i have a data frame like this: var1 var2 var3 var4 var5 var6 var7 var8 var9 1 4.12 1314 867 552 276 302 105 13 420 2 8.99 2712 2197 1492 597 271 217 27 1031 3 9.64 2263 1720 1064 588 136 385 23 1064 4 2.53 1328 876 558 226 212 212 13 465 5 3.60 2260 1808 1130 565 181 203 23 972 6 8.88 2274 1819 864 682 182 2270 1137 7 4.92 1110 744 477 233 189 144 11 244 8 12.11 2895 2085 1419 608 550 232 29 1216 9 2.37 1438 963 604 316 273 144 14 489 10 17.30 1332 932 546 333 253 146 13 493 11 2.65 1285 977 604 308 154 103 13 501 12 4.35 2069 1221 828 290 621 186 21 600 13 2.37 2015 1471 967 403 302 1810 806 14 2.09 1707 1280 802 410 171 188 17 563 15 0.70 2027 1318 811 365 405 223 20 527 16 7.77 2264 1471 702 634 589 1810 860 17 15.18 1578 1073 600 347 395 95 16 631 18 24.46 1991 1573 1035 478 139 239 20 776 19 14.55 1489 933 422 467 244 89 489 20 4.00 1818 1455 909 418 109 218 18 745 i need to compute the linear model lm() pairwise between the columns. i need also to compute the anova(). i would like to get into one object all the results, instead of running many lm() and anova() separately for each couple of columns. 2) i would like to export into a text file the results from aov() and TukeyHSD(). is there any suggestion? thankyou very much for your kind help. sincerely d -- Dario Greco Institute of Biotechnology - University of Helsinki Building Cultivator II P.O.Box 56 Viikinkaari 4 FIN-00014 Finland Office: +358 9 191 58951 Fax: +358 9 191 58952 Mobile: +358 44 023 5780 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Read.table
From: Uwe Ligges Matias Mayor Fernandez wrote: Hi, I have some column vector in txt or xls and I need to load into R as numeric vector. I use the read.table (X=read.table(123.txt) command but the program say that X is not a numeric vector No, I think you got: Error: syntax error in (X=read.table(123.txt or you have used another call without the syntax error in it. In any case, please be more specific what you did. You might also want to copy the first few lines of file 123.txt in your mail. Besides, if Matias only has one column of data and want that read in as a vector, scan() is really more appropriate. Andy Uwe Ligges Where is the problem? Matías University of Oviedo, Spain [[alternative HTML version deleted]] -- -- __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] info() function?
Another possibility is eapply where I have used naCount from Henrik's solution: prop - function(x) list(class = data.class(x), dim = dim(x), size = object.size(x), NAs = naCount(x)) do.call(rbind, eapply(.GlobalEnv, prop)) On 3/8/06, Henrik Bengtsson [EMAIL PROTECTED] wrote: library(R.oo) ll() member data.class dimension object.size 1 anumeric 10004028 2author character 1 112 3 expnumeric 1 36 4 last.warning list 2 488 5object function NULL 864 6 row character 1 72 7 USArrests data.frame c(50,4)4076 8 VADeaths matrixc(5,4) 824 9 value character 1 72 with NA counts: naCount - function(x, ...) ifelse(is.vector(x), sum(is.na(x)), NA) properties - c(data.class, dimension, object.size, naCount) ll(properties=properties) member data.class dimension object.size 1 anumeric 10004028 2author character 1 112 3 expnumeric 1 36 4 last.warning list 2 488 5 naCount function NULL 864 6object function NULL 864 7properties character 4 212 8 row character 1 72 9 USArrests data.frame c(50,4)4076 10 VADeaths matrixc(5,4) 824 11value character 1 72 FYI: In next version of R.oo, there will probably be some kind of option to set the default 'properties' argument so that this must not be given explicitly by default. Cheers Henrik On 3/8/06, Robert Lundqvist [EMAIL PROTECTED] wrote: I would like to have some function for getting an overview of the variables in a worksheet. Class, dimesions, length, number of missing values,... Guess it wouldn't be that hard to set up such a function, but I guess there are others who have made it already. Or is it already a standard feature in the base package? Any suggestions? Robert __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Henrik Bengtsson Mobile: +46 708 909208 (+1h UTC) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Read.table
On 3/8/06 8:31 AM, Liaw, Andy [EMAIL PROTECTED] wrote: From: Uwe Ligges Matias Mayor Fernandez wrote: Hi, I have some column vector in txt or xls and I need to load into R as numeric vector. I use the read.table (X=read.table(123.txt) command but the program say that X is not a numeric vector See here if you want to think of your data as a single-column table: http://cran.r-project.org/doc/manuals/R-data.html#Spreadsheet_002dlike-data Sean __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] fitting a distribution using glm
it is easy to fit a distribution using fitdistr poisdata - rpois(n = 100, lambda = 2) poismle - fitdistr(poisdata, Poisson) poismle but i would like to know whether its possible to get an identical result using glm. I use poistab - data.frame(table(poisdata)) colnames(poistab) - c(width,freq); poistab[,width] - as.numeric(poistab[,width]) glm(freq ~ 1-width, family = poisson, data = poistab) but the results are always different. cheers, vumani __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] data import problem
Check out: http://finzi.psych.upenn.edu/R/Rhelp02a/archive/52957.html for a similar problem. On 3/8/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Dear All, I'm trying to read a text data file that contains several records separated by a blank line. Each record starts with a row that contains it's ID and the number of rows for the records (two columns), then the data table itself, e.g. 123 5 89.17911.1024 90.57351.1024 92.56661.1024 95.07251.1024 101.20701.1024 321 3 60.16011.1024 64.80231.1024 70.05932.1502 ... I thought I coudl simply use something line this: con - file(test2.txt); do { e - read.table(con, nlines = 1); if ( length(e) == 2 ) { d - read.table(con, nrows = e[1,2]); #process data frame d } } while (length(e) == 2); The problem is that read.table closes the connection object, I assumed that it would not close the connection, and instead contines where it last stopped. Since the data is nearly a simple table I though read.table could work rather than using scan directly. Any suggestions to read this file efficently are welcome (the file can contain several thousand record and each record can contain several thousand rows). thanks a lot for your help, +kind regards, Arne [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] How to do it without for loops?
Thank you for all . One more question.How can I calculate these efficiently? set.seed(100) dat-data.frame(x1=rnorm(20),x2=rnorm(20),u=rnorm(20),id=round(2*runif(20))) # In this example,id's elements are 0,1,2. y-list() for (i in 0:2){ X-as.matrix(subset(dat,id==i,c(x1,x2))) u-as.matrix(subset(dat,id==i,c(u))) y[[i+1]]-t(X)%*%u%*%t(u)%*%X } y[[1]]+y[[2]]+y[[3]] the above code is not elegant.And my second solution to this problem is using by to get a list. matlis-by(dat, dat$id, function(x){ a-as.matrix(x[,c(x1,x2)]) b-as.matrix(x[, u]) t(a) %*% b %*% t(b) %*% a }) S - matrix(unlist(matlis), 4, length(matlis)) S1 - matrix(rowSums(S), 2, 2) The code works ,but I want to ask if there is any other more better ways to do it? It seems that this kind of computation is quite common. 2006/2/28, Gabor Grothendieck [EMAIL PROTECTED]: Try: crossprod(x) or t(x) %*% x On 2/28/06, ronggui [EMAIL PROTECTED] wrote: This is the code: x-matrix(rnorm(20),5) y-list() for (i in seq(nrow(x))) y[[i]]-t(x[i,,drop=F])%*%x[i,,drop=F] y[[1]]+y[[2]]+y[[3]]+y[[4]]+y[[5]] How can I do it without using for loops? Thank you in advance! -- ronggui Deparment of Sociology Fudan University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- 黄荣贵 Deparment of Sociology Fudan University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] glm automation
I just pointed out the mistake about using index. Andy's post let me know the trouble of using formula like that.It's valuable information for me:) Thank you! 在 06-3-8,Liaw, Andy[EMAIL PROTECTED] 写道: From: ronggui 2006/3/8, A Mani [EMAIL PROTECTED]: Hello, I have two problems in automating multiple glm(s) operations. The data file is tab delimited file with headers and two columns. like ABC EFG 1 2 2 3 3 4 dat - read.table(FILENAME, header=TRUE, sep=\t, na.strings=NA, dec=., strip.white=TRUE) dataf - read.table(FILENAME, header=FALSE, sep=\t, na.strings=NA, dec=., strip.white=TRUE) norm1 - glm(dataf[1,1] ~ dataf[1,2], family= normal(log), data=dat) norm2 - glm(dataf[1,1] ~ dataf[1,2], family= normal(identity), data=dat) and so on. It should be norm1 - glm(dataf[,1] ~ dataf[,2], family= normal(log), data=dat) norm2 - glm(dataf[,1] ~ dataf[,2], family= normal(identity), data=dat) you should read the document of [ about how to use index. I wish people would just give up on (ab)using model formula like that. IMHO it's really asking for trouble down the road. For example: n - 5 d1 - data.frame(x=1:n, y=rnorm(n)) d2 - data.frame(u=n:1, v=rnorm(n)) d3 - data.frame(y=rnorm(n), x=n:1) f1 - lm(d1[,2] ~ d1[,1], data=d1) f2 - lm(y ~ x, data=d1) predict(f1) 12345 -1.767697694 -1.326691900 -0.885686106 -0.444680312 -0.003674518 predict(f2) 12345 -1.767697694 -1.326691900 -0.885686106 -0.444680312 -0.003674518 predict(f1, d2) 12345 -1.767697694 -1.326691900 -0.885686106 -0.444680312 -0.003674518 Notice anything odd above? predict(f2, d3) 12345 -0.003674518 -0.444680312 -0.885686106 -1.326691900 -1.767697694 Now that's more like it... Andy But glm does not work on the data unless I write ABC and EFG there... I want to automate the script for multiple files. The other problem is to write the plot(GLM) to a file without displaying it at stdout. Thanks, A. Mani Member, Cal. Math. Soc [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- 黄荣贵 Deparment of Sociology Fudan University -- Notice: This e-mail message, together with any attachment...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Trellis stacked bar legend
Dear R-Listers, (well... called Depayan) The standard example barchart(yield ~ variety | site, data = barley, groups = year, layout = c(1,6), stack = TRUE, auto.key = list(points = FALSE, rectangles = TRUE, space = right), ylab = Barley Yield (bushels/acre), scales = list(x = list(rot = 45))) shows the problem: legend colors are inverted compared to stack colors. This is still acceptable with 2 colors, but with 3 or more, people ask me why it's partly down-under. I use to mumble that one author is in New Zealand, the other either in India or in Wisconsin. I know the hard way to correct this, but auto.key is nice because it's print-time settings dynamic. Any easy workaround? Dieter __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Multiple logistic regression
Do you mean multinomial logistic model? If it is,the multinom function in nnet package and multinomial function in VGAM(http://www.stat.auckland.ac.nz/~yee) package can do it. But I have no idea about the c-index. 2006/3/8, Stephanie Delalieux [EMAIL PROTECTED]: Dear R-users, Is there a function in R that classifies data in more than 2 groups using logistic regression/classification? I want to compare the c-indices of earlier research (lrm, binary response variables) with new c-indices obtained from 'multiple' (more response variables) logistic regression. Best regards, Stephanie Delalieux Department Biosystems M³-BIORES Group of Geomatics Engineering [EMAIL PROTECTED] Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- 黄荣贵 Deparment of Sociology Fudan University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Unsupervised RandomForest
Dear all, I am trying to calculate the proximity matrix for a data set with 16 variables and 6804 observations using random forests. I have a Pentium 4, 3.00GHz processor with 1 GB of RAM. When I use the command randomForest(data.scale,proximity=T) I get the warning message Error: cannot allocate vector of size 361675 kb Is this because I have reached the limit of what my computer is capable of? Or is there a more efficient way to call randomForest? Thanks in advance, Rich Jacques -- Rich Jacques Department of Probability Statistics University of Sheffield Hicks Building Sheffield S3 7RH __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Multiple logistic regression
Stephanie Delalieux wrote: Dear R-users, Is there a function in R that classifies data in more than 2 groups using logistic regression/classification? I want to compare the c-indices of earlier research (lrm, binary response variables) with new c-indices obtained from 'multiple' (more response variables) logistic regression. Best regards, Stephanie Delalieux Department Biosystems M³-BIORES Group of Geomatics Engineering [EMAIL PROTECTED] There are many functions available but the answer depends on whether the response variable has a natural ordering. Whether it does or not, though, C-indexes may decrease from the binary case because predicting more categories is a more difficult task. Frank Harrell Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] fitting a distribution using glm
Vumani Dlamini [EMAIL PROTECTED] writes: it is easy to fit a distribution using fitdistr poisdata - rpois(n = 100, lambda = 2) poismle - fitdistr(poisdata, Poisson) poismle but i would like to know whether its possible to get an identical result using glm. I use poistab - data.frame(table(poisdata)) colnames(poistab) - c(width,freq); poistab[,width] - as.numeric(poistab[,width]) glm(freq ~ 1-width, family = poisson, data = poistab) but the results are always different. Well, the freq values in that setup are not independent Poisson variates. For a start, they sum to constant! Try instead exp(coef(glm(poisdata~1,family=poisson))) -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] data import problem
On Wed, 8 Mar 2006, [EMAIL PROTECTED] wrote: I thought I coudl simply use something line this: con - file(test2.txt); do { e - read.table(con, nlines = 1); if ( length(e) == 2 ) { d - read.table(con, nrows = e[1,2]); #process data frame d } } while (length(e) == 2); The problem is that read.table closes the connection object, I assumed that it would not close the connection, and instead contines where it last stopped. I think the problem is just that you didn't open the connection before passing it to read.table. ?file says By default the connection is not opened and ?read.table says Alternatively, 'file' can be a 'connection', which will be opened if necessary, and if so closed at the end of the function call. -thomas __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] fitting a distribution using glm
On Wed, 8 Mar 2006, Vumani Dlamini wrote: it is easy to fit a distribution using fitdistr poisdata - rpois(n = 100, lambda = 2) poismle - fitdistr(poisdata, Poisson) poismle but i would like to know whether its possible to get an identical result using glm. Yes. Easy. glm(poisdata ~ 1, family = poisson) poistab - data.frame(table(poisdata)) colnames(poistab) - c(width,freq); poistab[,width] - as.numeric(poistab[,width]) glm(freq ~ 1-width, family = poisson, data = poistab) but the results are always different. Well, they would be. I don't know what this is doing, but it certainly isn't fitting a poisson distribution to poisdata. -thomas __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] How to do it without for loops?
On Wed, 8 Mar 2006, ronggui wrote: Thank you for all . One more question.How can I calculate these efficiently? set.seed(100) dat-data.frame(x1=rnorm(20),x2=rnorm(20),u=rnorm(20),id=round(2*runif(20))) # In this example,id's elements are 0,1,2. y-list() for (i in 0:2){ X-as.matrix(subset(dat,id==i,c(x1,x2))) u-as.matrix(subset(dat,id==i,c(u))) y[[i+1]]-t(X)%*%u%*%t(u)%*%X } y[[1]]+y[[2]]+y[[3]] People have already told you about crossprod, so crossprod(crossprod(X,u)) would seem an obvious improvement over the matrix multiplications. There is a better solution, though. Xu-dat[,c(x1,x2)]*dat[,u] crossprod( rowsum(Xu, dat$id)) -thomas the above code is not elegant.And my second solution to this problem is using by to get a list. matlis-by(dat, dat$id, function(x){ a-as.matrix(x[,c(x1,x2)]) b-as.matrix(x[, u]) t(a) %*% b %*% t(b) %*% a }) S - matrix(unlist(matlis), 4, length(matlis)) S1 - matrix(rowSums(S), 2, 2) The code works ,but I want to ask if there is any other more better ways to do it? It seems that this kind of computation is quite common. 2006/2/28, Gabor Grothendieck [EMAIL PROTECTED]: Try: crossprod(x) or t(x) %*% x On 2/28/06, ronggui [EMAIL PROTECTED] wrote: This is the code: x-matrix(rnorm(20),5) y-list() for (i in seq(nrow(x))) y[[i]]-t(x[i,,drop=F])%*%x[i,,drop=F] y[[1]]+y[[2]]+y[[3]]+y[[4]]+y[[5]] How can I do it without using for loops? Thank you in advance! -- ronggui Deparment of Sociology Fudan University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- ?? Deparment of Sociology Fudan University Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] power and sample size for a GLM with Poisson response variable
Craig, Thanks for your follow-up note on using the asypow package. My problem was not only constructing the constraints vector but, for my particular situation (Poisson regression, two groups, sample sizes of (1081,3180), I get very different results using asypow package compared to my other (home grown) approaches. library(asypow) pois.mean-c(0.0065,0.0003) info.pois - info.poisson.kgroup(pois.mean, group.size=c(1081,3180)) constraints - matrix(c(2,1,2), ncol = 3, byrow=T) poisson.object - asypow.noncent(pois.mean, info.pois,constraints) power.pois - asypow.power(poisson.object, c(1081,3180), 0.05) print(power.pois) [1] 0.2438309 asy.pwr() # the function is shown below. $power [1] 0.96 sim.pwr() # the function is shown below. 4261000 Poisson random variates simulated $power [1] 0.567 I tend to think the true power is greater than 0.567 but less than 0.96 (not as small as 0.2438309). Maybe I am still not using the asypow functions correctly? Suggestions/comments welcome. -Original Message- From: Craig A. Faulhaber [mailto:[EMAIL PROTECTED] Sent: Saturday, March 04, 2006 12:04 PM To: Wassell, James T., Ph.D. Subject: Re: [R] power and sample size for a GLM with Poisson response variable Hi James, Thanks again for your help. With the assistance of a statistician colleauge of mine, I figured out the constraints vector. It is a 3-column vector describing the null hypothesis. For example, let's say you have 3 populations and your null hypothesis is that there is no difference between the 3. The first row of the matrix would be 3 1 2 indicating that you have 3 populations and that populations 1 and 2 are equal. The second row would be 3 2 3 indicating that you have 3 populations and that populations 2 and 3 are equal. If you had only 2 populations, there would be only one row (2 1 2, indicating 2 populations with 1 and 2 equal). I hope this helps. Craig Wassell, James T., Ph.D. wrote: Craig, I found the package asypow difficult to use and it did not yield results in the ballpark of other approaches. (could not figure out the constraints vector). I wrote some simple functions, one asy.pwr uses the non-central chi-square distn. asy.pwr-function(counts=c(7,1),py=c(1081,3180)) { # a two group Poisson regression power computation # py is person-years or person-time however measured group-gl(2,1) ncp-summary(glm(counts~group+offset(log(py)),family=poisson))$null.de viance q.tile-qchisq(.95,1) # actually just the X2 critical value of 3.841459 list(power=round(1-pchisq(q.tile,df=1,ncp),2))} The second function, sim.pwr, estimates power using simulated Poisson random variates. The most time consuming step is the call to rpois. (Maybe someone knows a more efficient way to accomplish this?). The for loop is rather quick in comparison. I hope you may find this helpful, or if you have solved your problem some other way, please pass along your approach.Note, that for this problem, very small values of lambda, the two approaches give much different power estimates (96% vs. 55% or so). My problem may be better addressed as binomial logistic regression, maybe then the simulation and the asymptotic estimates my agree better. sim.pwr-function(means=c(0.0065,0.0003),ptime=c(1081,3180),nsim=1000) { # a two group poisson regression power computation # based simulating lots of Poisson r.v.'s # input rates followed by a vector of the corresponding person times # the most time consuming part is the r.v. generation. # power is determined by counting the how often p-values are = 0.05 group-as.factor(rep(c(1,2),ptime)) rej-vector(length=nsim) y-rpois(ptime[1]*nsim,means[1]) y-c(y,rpois(ptime[2]*nsim,means[2])) y-matrix(y,nrow=nsim) cat(sum(ptime)*nsim,Poisson random variates simulated,\n) for(i in 1:nsim){res-glm(y[i,]~group,family=poisson()) rej[i]-summary(res)$coeff[2,4]=0.05} list(power=sum(rej)/nsim) } __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Survival Plots by Strata
I don't see any easy way to do this. I think you may have to do the plotting yourself, based on the code in plot.survfit. -thomas On Wed, 8 Mar 2006, Bret Collier wrote: All, I am struggling to create a survival plot using LTRC data for each year of a 10 year period. I have a set of individuals (birds) where 'entry' is the day of the year (1-365) they are released (let out of pens) into the wild (2 year data snip below). 'Entry' (e.g., day of year the first bird is released for each year) is highly variable, ranging from 48 to 250. When I create a plot using the below code I would like to remove the 'solid' line which represent S_hat=1 out to the LC point for each year and instead have the survival curves formatted in more of a 'hanging' style, where the LC day (e.g., min(Entry for each year)) is where each curve begins at S_hat=1 for each year (and does not extend back to the y-axis)? I could not find anything on this in the archives or MASS or Survival Analysis using S? Anyone have a suggestion on where to look? TIA, Bret #Code snip for R email. apc.coxfit1-coxph(Surv(Entry, Exit, Fate)~Sex + Agerelease + Dayrelease + strata(Year), data=mydat) coxfit.apc-survfit(apc.coxfit1) coxfit.apc plot(survfit(apc.coxfit1), conf.int=F, log=T, lty=c(1:2), col=c(1:2), xlim=c(205, 800)) #not run--first entry for this example is day 205 for 1996, 259 for 1997 mydat IDYearDayrelease AgereleaseSurvivorshipEntry ExitFateSex 16240 1996205 95 164 205 369 1 0 16319 1996205 88 140 205 345 1 0 16378 1996248 108 100 248 348 1 0 20383 1996241 98 204 241 445 1 0 16324 1996219 90 227 219 446 1 0 16327 1996219 90 497 219 716 1 0 20373 1996241 114 413 241 654 1 0 20374 1996241 111 211 241 452 1 0 16241 1996205 95 234 205 439 1 1 16321 1996219 90 118 219 337 1 1 16323 1996219 90 180 219 399 1 1 20375 1996241 103 268 241 509 1 1 20384 1996241 98 299 241 540 1 1 20390 1996241 93 204 241 445 1 1 20393 1996241 88 208 241 449 1 1 16313 1996248 122 512 248 760 0 1 20378 1996241 103 236 241 477 0 1 20381 1996241 101 329 241 570 0 1 16328 1996219 90 224 219 443 0 1 16827 1997259 127 52 259 311 1 0 16828 1997259 127 216 259 475 1 0 16831 1997303 171 19 303 322 1 0 20466 1997289 149 31 289 320 1 0 20469 1997289 149 199 289 488 1 0 20483 1997289 134 18 289 307 1 0 16807 1997259 137 223 259 482 1 0 16809 1997259 137 1 259 260 1 0 16819 1997259 131 237 259 496 1 0 16829 1997303 171 7 303 310 1 0 20440 1997289 161 7 289 296 1 0 20470 1997289 148 257 289 546 1 0 20478 1997289 143 12 289 301 1 0 16817 1997259 130 85 259 344 1 0 20454 1997289 154 4 289 293 1 1 20459 1997289 153 335 289 624 1 1 20460 1997289 153 118 289 407 1 1 20465 1997289 150 31 289 320 1 1 20473 1997289 147 65 289 354 1 1 20484 1997289 133 58 289 347 1 1 20489 1997289 130 56 289 345 1 1 16808 1997259 137 137 259 396 0 1 16810 1997303 181 64 303 367 0 1 16816 1997259 130 1 259 260 0 1 20471 1997289 147 334 289 623 0 1 16826 1997259 127 20 259 279 0 1 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Degrees of freedom using Box.test()
It's Ljung and Box (1978) saying that for large number of observations a chi squared with lags-p-q, should provide a good approximation for most practical purposes (p. 298 of reference above). Nestor On Wednesday 08 March 2006 4:00 am, Patrick Burns wrote: You are saying that the penalty on the degrees of freedom should be the same whether the model was fit with 100 observations or 1 million observations. You are also saying that some tests should have negative degrees of freedom. So I don't think your proposal is the right answer, though presumably there should be some penalty. There is a working paper on the Burns Statistics website about robustness in Ljung-Box tests, but this issue is not one that is covered. Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and A Guide for the Unwilling S User) Nestor Arguea wrote: After an RSiteSeach(Box.test) I found some discussion regarding the degrees of freedom in the computation of the Ljung-Box test using Box.test(), but did not find any posting about the proper degrees of freedom. Box.test() uses lag=number as the degrees of freedom. However, I believe the correct degrees of freedom should be number-p-q where p and q are the number of estimated parameters (for instance, in a Box-Jenkins family of models). This, according to the main source in documentation of Box.test: G. M. Ljung and G. E. P. Box, On a measure of Lack of Fit in Time Series Models, Biometrika, Vol. 65, No. 2 (August, 1978), pp. 297-303. One can still compute the correct p-value with 1-pchisq(value,correctdf) Nestor (R 2.2.1 on Linux, Suse 9.3) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Nestor M. Arguea, Chair Department of Marketing and Economics University of West Florida 11000 University Parkway Pensacola, FL 32514 Phone: (850)474-3071 Fax: (850)474-3069 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Survival Plots by Strata
Thomas, Thank you for the response. I will post what I come up with. Bret Thomas Lumley [EMAIL PROTECTED] 03/08/06 9:16 AM I don't see any easy way to do this. I think you may have to do the plotting yourself, based on the code in plot.survfit. -thomas On Wed, 8 Mar 2006, Bret Collier wrote: All, I am struggling to create a survival plot using LTRC data for each year of a 10 year period. I have a set of individuals (birds) where 'entry' is the day of the year (1-365) they are released (let out of pens) into the wild (2 year data snip below). 'Entry' (e.g., day of year the first bird is released for each year) is highly variable, ranging from 48 to 250. When I create a plot using the below code I would like to remove the 'solid' line which represent S_hat=1 out to the LC point for each year and instead have the survival curves formatted in more of a 'hanging' style, where the LC day (e.g., min(Entry for each year)) is where each curve begins at S_hat=1 for each year (and does not extend back to the y-axis)? I could not find anything on this in the archives or MASS or Survival Analysis using S? Anyone have a suggestion on where to look? TIA, Bret #Code snip for R email. apc.coxfit1-coxph(Surv(Entry, Exit, Fate)~Sex + Agerelease + Dayrelease + strata(Year), data=mydat) coxfit.apc-survfit(apc.coxfit1) coxfit.apc plot(survfit(apc.coxfit1), conf.int=F, log=T, lty=c(1:2), col=c(1:2), xlim=c(205, 800)) #not run--first entry for this example is day 205 for 1996, 259 for 1997 mydat IDYearDayrelease AgereleaseSurvivorshipEntry ExitFateSex 16240 1996205 95 164 205 369 1 0 16319 1996205 88 140 205 345 1 0 16378 1996248 108 100 248 348 1 0 20383 1996241 98 204 241 445 1 0 16324 1996219 90 227 219 446 1 0 16327 1996219 90 497 219 716 1 0 20373 1996241 114 413 241 654 1 0 20374 1996241 111 211 241 452 1 0 16241 1996205 95 234 205 439 1 1 16321 1996219 90 118 219 337 1 1 16323 1996219 90 180 219 399 1 1 20375 1996241 103 268 241 509 1 1 20384 1996241 98 299 241 540 1 1 20390 1996241 93 204 241 445 1 1 20393 1996241 88 208 241 449 1 1 16313 1996248 122 512 248 760 0 1 20378 1996241 103 236 241 477 0 1 20381 1996241 101 329 241 570 0 1 16328 1996219 90 224 219 443 0 1 16827 1997259 127 52 259 311 1 0 16828 1997259 127 216 259 475 1 0 16831 1997303 171 19 303 322 1 0 20466 1997289 149 31 289 320 1 0 20469 1997289 149 199 289 488 1 0 20483 1997289 134 18 289 307 1 0 16807 1997259 137 223 259 482 1 0 16809 1997259 137 1 259 260 1 0 16819 1997259 131 237 259 496 1 0 16829 1997303 171 7 303 310 1 0 20440 1997289 161 7 289 296 1 0 20470 1997289 148 257 289 546 1 0 20478 1997289 143 12 289 301 1 0 16817 1997259 130 85 259 344 1 0 20454 1997289 154 4 289 293 1 1 20459 1997289 153 335 289 624 1 1 20460 1997289 153 118 289 407 1 1 20465 1997289 150 31 289 320 1 1 20473 1997289 147 65 289 354 1 1 20484 1997289 133 58 289 347 1 1 20489 1997289 130 56 289 345 1 1 16808 1997259 137 137 259 396 0 1 16810 1997303 181 64 303 367 0 1 16816 1997259 130 1 259 260 0 1 20471 1997289 147 334 289 623 0 1 16826 1997259 127 20 259 279 0 1 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle
Re: [R] Trellis stacked bar legend
On 3/8/06, Dieter Menne [EMAIL PROTECTED] wrote: Dear R-Listers, (well... called Depayan) The standard example barchart(yield ~ variety | site, data = barley, groups = year, layout = c(1,6), stack = TRUE, auto.key = list(points = FALSE, rectangles = TRUE, space = right), ylab = Barley Yield (bushels/acre), scales = list(x = list(rot = 45))) shows the problem: legend colors are inverted compared to stack colors. This is still acceptable with 2 colors, but with 3 or more, people ask me why it's partly down-under. I use to mumble that one author is in New Zealand, the other either in India or in Wisconsin. Blame René Descartes for making the positive y-axis point up. Writing systems are pretty much universally top to bottom, as far as I know. I know the hard way to correct this, but auto.key is nice because it's print-time settings dynamic. Any easy workaround? None I can think of, except circumventing Descartes with an additional ylim = c(130, -5) I was thinking of adding an option to panel.barchart, but a more general solution would be options to flip the order in draw.key. I'll see if that's doable. Deepayan __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Unsupervised RandomForest
To be able to do that, at least a couple of matrices of size 6804 x 6804 have to be created, and 1GB RAM is unlikely to be sufficient. Andy From: Rich Jacques Dear all, I am trying to calculate the proximity matrix for a data set with 16 variables and 6804 observations using random forests. I have a Pentium 4, 3.00GHz processor with 1 GB of RAM. When I use the command randomForest(data.scale,proximity=T) I get the warning message Error: cannot allocate vector of size 361675 kb Is this because I have reached the limit of what my computer is capable of? Or is there a more efficient way to call randomForest? Thanks in advance, Rich Jacques -- Rich Jacques Department of Probability Statistics University of Sheffield Hicks Building Sheffield S3 7RH __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Degrees of freedom using Box.test()
In 3 decades I suspect that the definition of large has changed. Now with faster computers and R we don't need to rely on the guesses of Ljung and Box. All it will take is someone (not me) to do some experiments. Pat Nestor Arguea wrote: It's Ljung and Box (1978) saying that for large number of observations a chi squared with lags-p-q, should provide a good approximation for most practical purposes (p. 298 of reference above). Nestor On Wednesday 08 March 2006 4:00 am, Patrick Burns wrote: You are saying that the penalty on the degrees of freedom should be the same whether the model was fit with 100 observations or 1 million observations. You are also saying that some tests should have negative degrees of freedom. So I don't think your proposal is the right answer, though presumably there should be some penalty. There is a working paper on the Burns Statistics website about robustness in Ljung-Box tests, but this issue is not one that is covered. Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and A Guide for the Unwilling S User) Nestor Arguea wrote: After an RSiteSeach(Box.test) I found some discussion regarding the degrees of freedom in the computation of the Ljung-Box test using Box.test(), but did not find any posting about the proper degrees of freedom. Box.test() uses lag=number as the degrees of freedom. However, I believe the correct degrees of freedom should be number-p-q where p and q are the number of estimated parameters (for instance, in a Box-Jenkins family of models). This, according to the main source in documentation of Box.test: G. M. Ljung and G. E. P. Box, On a measure of Lack of Fit in Time Series Models, Biometrika, Vol. 65, No. 2 (August, 1978), pp. 297-303. One can still compute the correct p-value with 1-pchisq(value,correctdf) Nestor (R 2.2.1 on Linux, Suse 9.3) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] how to draw x-axis at randomly position
Dear R-user, Does any of you know how to draw x-axis at randomly position? It looks I could only draw x-axis at the bottom and top of the plot area (by using function axis). However, sometimes I need to draw x-axis at randomly position, for example, in the plot of y~x, I'd like to let the x-axis go cross y=5. Best,Jing __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] how to draw x-axis at randomly position
Try something like: plot(1:5, xaxt=n) axis(side=1, pos=3) When all else fails, read the help pages (again)... Andy From: Jing Yang Dear R-user, Does any of you know how to draw x-axis at randomly position? It looks I could only draw x-axis at the bottom and top of the plot area (by using function axis). However, sometimes I need to draw x-axis at randomly position, for example, in the plot of y~x, I'd like to let the x-axis go cross y=5. Best,Jing __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] How to do it without for loops?
Thomas' solution is better but thought this might be of interest anyways since it can be written closer to mathematical notation. That is, the required expression can be written in the following equivalent way for a suitable matrix A: X' diag(u) A' A diag(u) X where diag(u) is a diagonal matrix with u along the diagonal as in the R diag function, spaces refer to matrix multiplication and ' means transpose. Thus we have: A - outer(unique(dat$id), dat$id, ==) crossprod(A %*% diag(dat$u) %*% as.matrix(dat[1:2])) On 3/8/06, Thomas Lumley [EMAIL PROTECTED] wrote: On Wed, 8 Mar 2006, ronggui wrote: Thank you for all . One more question.How can I calculate these efficiently? set.seed(100) dat-data.frame(x1=rnorm(20),x2=rnorm(20),u=rnorm(20),id=round(2*runif(20))) # In this example,id's elements are 0,1,2. y-list() for (i in 0:2){ X-as.matrix(subset(dat,id==i,c(x1,x2))) u-as.matrix(subset(dat,id==i,c(u))) y[[i+1]]-t(X)%*%u%*%t(u)%*%X } y[[1]]+y[[2]]+y[[3]] People have already told you about crossprod, so crossprod(crossprod(X,u)) would seem an obvious improvement over the matrix multiplications. There is a better solution, though. Xu-dat[,c(x1,x2)]*dat[,u] crossprod( rowsum(Xu, dat$id)) -thomas the above code is not elegant.And my second solution to this problem is using by to get a list. matlis-by(dat, dat$id, function(x){ a-as.matrix(x[,c(x1,x2)]) b-as.matrix(x[, u]) t(a) %*% b %*% t(b) %*% a }) S - matrix(unlist(matlis), 4, length(matlis)) S1 - matrix(rowSums(S), 2, 2) The code works ,but I want to ask if there is any other more better ways to do it? It seems that this kind of computation is quite common. 2006/2/28, Gabor Grothendieck [EMAIL PROTECTED]: Try: crossprod(x) or t(x) %*% x On 2/28/06, ronggui [EMAIL PROTECTED] wrote: This is the code: x-matrix(rnorm(20),5) y-list() for (i in seq(nrow(x))) y[[i]]-t(x[i,,drop=F])%*%x[i,,drop=F] y[[1]]+y[[2]]+y[[3]]+y[[4]]+y[[5]] How can I do it without using for loops? Thank you in advance! -- ronggui Deparment of Sociology Fudan University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- »ÆÈÙ¹ó Deparment of Sociology Fudan University Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED]University of Washington, Seattle __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] How to do it without for loops?
On Wed, 8 Mar 2006, Gabor Grothendieck wrote: Thomas' solution is better but thought this might be of interest anyways since it can be written closer to mathematical notation. That is, the required expression can be written in the following equivalent way for a suitable matrix A: X' diag(u) A' A diag(u) X Um. n x n matrix? O(n^2) storage? O(n^3) execution time? Yes, it's fine when n=20, or even 200, but still... -thomas where diag(u) is a diagonal matrix with u along the diagonal as in the R diag function, spaces refer to matrix multiplication and ' means transpose. Thus we have: A - outer(unique(dat$id), dat$id, ==) crossprod(A %*% diag(dat$u) %*% as.matrix(dat[1:2])) On 3/8/06, Thomas Lumley [EMAIL PROTECTED] wrote: On Wed, 8 Mar 2006, ronggui wrote: Thank you for all . One more question.How can I calculate these efficiently? set.seed(100) dat-data.frame(x1=rnorm(20),x2=rnorm(20),u=rnorm(20),id=round(2*runif(20))) # In this example,id's elements are 0,1,2. y-list() for (i in 0:2){ X-as.matrix(subset(dat,id==i,c(x1,x2))) u-as.matrix(subset(dat,id==i,c(u))) y[[i+1]]-t(X)%*%u%*%t(u)%*%X } y[[1]]+y[[2]]+y[[3]] People have already told you about crossprod, so crossprod(crossprod(X,u)) would seem an obvious improvement over the matrix multiplications. There is a better solution, though. Xu-dat[,c(x1,x2)]*dat[,u] crossprod( rowsum(Xu, dat$id)) -thomas the above code is not elegant.And my second solution to this problem is using by to get a list. matlis-by(dat, dat$id, function(x){ a-as.matrix(x[,c(x1,x2)]) b-as.matrix(x[, u]) t(a) %*% b %*% t(b) %*% a }) S - matrix(unlist(matlis), 4, length(matlis)) S1 - matrix(rowSums(S), 2, 2) The code works ,but I want to ask if there is any other more better ways to do it? It seems that this kind of computation is quite common. 2006/2/28, Gabor Grothendieck [EMAIL PROTECTED]: Try: crossprod(x) or t(x) %*% x On 2/28/06, ronggui [EMAIL PROTECTED] wrote: This is the code: x-matrix(rnorm(20),5) y-list() for (i in seq(nrow(x))) y[[i]]-t(x[i,,drop=F])%*%x[i,,drop=F] y[[1]]+y[[2]]+y[[3]]+y[[4]]+y[[5]] How can I do it without using for loops? Thank you in advance! -- ronggui Deparment of Sociology Fudan University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- »ÆÈÙ¹ó Deparment of Sociology Fudan University Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED]University of Washington, Seattle __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] 'less' for R?
Hi All, is there an equivalent of the Unix command 'less' (or 'more'), so I can look at what's inside a data.frame or a matrix without having it printed out on console? I am using R on Debian Linux and Mac OS 10.4.5 Cheers, F -- Federico C. F. Calboli Department of Epidemiology and Public Health Imperial College, St Mary's Campus Norfolk Place, London W2 1PG Tel +44 (0)20 7594 1602 Fax (+44) 020 7594 3193 f.calboli [.a.t] imperial.ac.uk f.calboli [.a.t] gmail.com __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] 'less' for R?
On 3/8/2006 12:03 PM, Federico Calboli wrote: Hi All, is there an equivalent of the Unix command 'less' (or 'more'), so I can look at what's inside a data.frame or a matrix without having it printed out on console? I am using R on Debian Linux and Mac OS 10.4.5 I think you want page(). head() and tail() are also useful, as equivalents to the head and tail commands. Duncan Murdoch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] How to do it without for loops?
Yes, its horribly inefficient but as I stated the reason I mentioned it was because it could be translated more easily from mathematical notation and I did mention I preferred your solution. Actually one problem with both of the solutions is determining that they are correct. From that viewpoint, although not as elegant or fun, a more straightforward translation of the original question might actually be preferable (Thomas also already mentioned the computation in f): f - function(x) crossprod(crossprod(x[[3]], as.matrix(x[,1:2]))) mm -by(dat, dat$id, f) mm[[1]] + mm[[2]] + mm[[3]] or the last line could be replaced with these two if you don't know that there are necessarily three components: result - mm[[1]] result[] - do.call(mapply, c(sum, mm)) One thing that this bought out for me is that R does not have a parallel to pmax for sum. If one had psum of if + allowed more than 2 arguments one could have done this instead of the two lines involving result above: do.call(psum, mm) # if psum analog to pmax were available do.call(+, mm) # if + allowed 2 arguments On 3/8/06, Thomas Lumley [EMAIL PROTECTED] wrote: On Wed, 8 Mar 2006, Gabor Grothendieck wrote: Thomas' solution is better but thought this might be of interest anyways since it can be written closer to mathematical notation. That is, the required expression can be written in the following equivalent way for a suitable matrix A: X' diag(u) A' A diag(u) X Um. n x n matrix? O(n^2) storage? O(n^3) execution time? Yes, it's fine when n=20, or even 200, but still... -thomas where diag(u) is a diagonal matrix with u along the diagonal as in the R diag function, spaces refer to matrix multiplication and ' means transpose. Thus we have: A - outer(unique(dat$id), dat$id, ==) crossprod(A %*% diag(dat$u) %*% as.matrix(dat[1:2])) On 3/8/06, Thomas Lumley [EMAIL PROTECTED] wrote: On Wed, 8 Mar 2006, ronggui wrote: Thank you for all . One more question.How can I calculate these efficiently? set.seed(100) dat-data.frame(x1=rnorm(20),x2=rnorm(20),u=rnorm(20),id=round(2*runif(20))) # In this example,id's elements are 0,1,2. y-list() for (i in 0:2){ X-as.matrix(subset(dat,id==i,c(x1,x2))) u-as.matrix(subset(dat,id==i,c(u))) y[[i+1]]-t(X)%*%u%*%t(u)%*%X } y[[1]]+y[[2]]+y[[3]] People have already told you about crossprod, so crossprod(crossprod(X,u)) would seem an obvious improvement over the matrix multiplications. There is a better solution, though. Xu-dat[,c(x1,x2)]*dat[,u] crossprod( rowsum(Xu, dat$id)) -thomas the above code is not elegant.And my second solution to this problem is using by to get a list. matlis-by(dat, dat$id, function(x){ a-as.matrix(x[,c(x1,x2)]) b-as.matrix(x[, u]) t(a) %*% b %*% t(b) %*% a }) S - matrix(unlist(matlis), 4, length(matlis)) S1 - matrix(rowSums(S), 2, 2) The code works ,but I want to ask if there is any other more better ways to do it? It seems that this kind of computation is quite common. 2006/2/28, Gabor Grothendieck [EMAIL PROTECTED]: Try: crossprod(x) or t(x) %*% x On 2/28/06, ronggui [EMAIL PROTECTED] wrote: This is the code: x-matrix(rnorm(20),5) y-list() for (i in seq(nrow(x))) y[[i]]-t(x[i,,drop=F])%*%x[i,,drop=F] y[[1]]+y[[2]]+y[[3]]+y[[4]]+y[[5]] How can I do it without using for loops? Thank you in advance! -- ronggui Deparment of Sociology Fudan University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- »ÆÈÙ¹ó Deparment of Sociology Fudan University Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED]University of Washington, Seattle __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED]University of Washington, Seattle __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] 'less' for R?
Or fix(data.frame) to open it in the spreadsheet-like data editor. -Christos Hatzis -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Duncan Murdoch Sent: Wednesday, March 08, 2006 12:26 PM To: [EMAIL PROTECTED] Cc: r-help Subject: Re: [R] 'less' for R? On 3/8/2006 12:03 PM, Federico Calboli wrote: Hi All, is there an equivalent of the Unix command 'less' (or 'more'), so I can look at what's inside a data.frame or a matrix without having it printed out on console? I am using R on Debian Linux and Mac OS 10.4.5 I think you want page(). head() and tail() are also useful, as equivalents to the head and tail commands. Duncan Murdoch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] 'less' for R?
Le 08.03.2006 18:03, Federico Calboli a écrit : Hi All, is there an equivalent of the Unix command 'less' (or 'more'), so I can look at what's inside a data.frame or a matrix without having it printed out on console? I am using R on Debian Linux and Mac OS 10.4.5 Cheers, F Hi, A cheap way is to use the unix command : less - function(a){ write.table(a, quote=F, file=/tmp/dataframe.txt) system(less /tmp/dataframe.txt) } less(iris) Romain -- visit the R Graph Gallery : http://addictedtor.free.fr/graphiques Discover the R Movies Gallery : http://addictedtor.free.fr/movies +---+ | Romain FRANCOIS - http://francoisromain.free.fr | | Doctorant INRIA Futurs / EDF | +---+ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] How to do it without for loops?
On Wed, 8 Mar 2006, Gabor Grothendieck wrote: Actually one problem with both of the solutions is determining that they are correct. From that viewpoint, although not as elegant or fun, a more straightforward translation of the original question might actually be preferable Yes. Although for the application where I encounter this (sandwich variance estimators), the proof that the two versions are the same is of independent interest, since it shows the relationship between the cluster jackknife, delta-betas, and the sandwich estimator. -thomas (Thomas also already mentioned the computation in f): f - function(x) crossprod(crossprod(x[[3]], as.matrix(x[,1:2]))) mm -by(dat, dat$id, f) mm[[1]] + mm[[2]] + mm[[3]] or the last line could be replaced with these two if you don't know that there are necessarily three components: result - mm[[1]] result[] - do.call(mapply, c(sum, mm)) One thing that this bought out for me is that R does not have a parallel to pmax for sum. If one had psum of if + allowed more than 2 arguments one could have done this instead of the two lines involving result above: do.call(psum, mm) # if psum analog to pmax were available do.call(+, mm) # if + allowed 2 arguments On 3/8/06, Thomas Lumley [EMAIL PROTECTED] wrote: On Wed, 8 Mar 2006, Gabor Grothendieck wrote: Thomas' solution is better but thought this might be of interest anyways since it can be written closer to mathematical notation. That is, the required expression can be written in the following equivalent way for a suitable matrix A: X' diag(u) A' A diag(u) X Um. n x n matrix? O(n^2) storage? O(n^3) execution time? Yes, it's fine when n=20, or even 200, but still... -thomas where diag(u) is a diagonal matrix with u along the diagonal as in the R diag function, spaces refer to matrix multiplication and ' means transpose. Thus we have: A - outer(unique(dat$id), dat$id, ==) crossprod(A %*% diag(dat$u) %*% as.matrix(dat[1:2])) On 3/8/06, Thomas Lumley [EMAIL PROTECTED] wrote: On Wed, 8 Mar 2006, ronggui wrote: Thank you for all . One more question.How can I calculate these efficiently? set.seed(100) dat-data.frame(x1=rnorm(20),x2=rnorm(20),u=rnorm(20),id=round(2*runif(20))) # In this example,id's elements are 0,1,2. y-list() for (i in 0:2){ X-as.matrix(subset(dat,id==i,c(x1,x2))) u-as.matrix(subset(dat,id==i,c(u))) y[[i+1]]-t(X)%*%u%*%t(u)%*%X } y[[1]]+y[[2]]+y[[3]] People have already told you about crossprod, so crossprod(crossprod(X,u)) would seem an obvious improvement over the matrix multiplications. There is a better solution, though. Xu-dat[,c(x1,x2)]*dat[,u] crossprod( rowsum(Xu, dat$id)) -thomas the above code is not elegant.And my second solution to this problem is using by to get a list. matlis-by(dat, dat$id, function(x){ a-as.matrix(x[,c(x1,x2)]) b-as.matrix(x[, u]) t(a) %*% b %*% t(b) %*% a }) S - matrix(unlist(matlis), 4, length(matlis)) S1 - matrix(rowSums(S), 2, 2) The code works ,but I want to ask if there is any other more better ways to do it? It seems that this kind of computation is quite common. 2006/2/28, Gabor Grothendieck [EMAIL PROTECTED]: Try: crossprod(x) or t(x) %*% x On 2/28/06, ronggui [EMAIL PROTECTED] wrote: This is the code: x-matrix(rnorm(20),5) y-list() for (i in seq(nrow(x))) y[[i]]-t(x[i,,drop=F])%*%x[i,,drop=F] y[[1]]+y[[2]]+y[[3]]+y[[4]]+y[[5]] How can I do it without using for loops? Thank you in advance! -- ronggui Deparment of Sociology Fudan University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- »ÆÈÙ¹ó Deparment of Sociology Fudan University Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED]University of Washington, Seattle __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED]University of Washington, Seattle __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide!
[R] function gdist, dist and vegdist in mvpart
Dear R community, I am analyzing plant communities with the function mvpart, using a dissimilarit matrix as input. The matrix is calculated with the funtion gdist. fit - mvpart(gdist (ba12[,18:29], meth=maximum, full=TRUE, sq=F) ~ beers + slope_dem + elev_dem+ plc_dem + pr_curv+ +curv+max_depth+doc_rocks+ abandon+land_use+ca_old, data=ba12, xv=p) This works fine. Now I would like to use other dissimilarity measures as can be found in the function dist (STATS) or vegdist (VEGAN). De'Ath notes that gdist should be interchangeable with dist - but I receive following error message: fit21 - mvpart(dist (ba12[,18:29], meth=minkowski, diag=T, upper=T p=2) ~ beers + slope_dem + elev_dem+ plc_dem + pr_curv+ curv+max_depth+doc_rocks+ abandon+land_use+ca_old, data=ba12, xv=p) Error in model.frame(formula, rownames, variables, varnames, extras, extranames, : Variable lengths differ When computing the matrices directly... ba12.dist1 - gdist (ba12[,18:29], method=maximum, full=T, sq=F) ba12.dist2 - vegdist (ba12[,18:29], method=jaccard, diag=T, upper=T) ba12.dist3 - dist (ba12[,18:29], method=minkowski, p=1, diag=T, upper=T) and looking at them, there appears to be no difference though only ba12.dist1 works with mvpart. With the other two I get the same error message as above. What I am missing? Where is the problem? Thanks for helping. Sina University of Potsdam, Germany Department of Geoecology __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Want to fit random intercept in logistic regression (testing lmer and glmmML)
Greetings. Here is sample code, with some comments. It shows how I can simulate data and estimate glm with binomial family when there is no individual level random error, but when I add random error into the linear predictor, I have a difficult time getting reasonable estimates of the model parameters or the variance component. There are no clusters here, just individual level responses, so perhaps I am misunderstanding the translation from my simple mixed model to the syntax of glmmML and lmer. I get roughly the same (inaccurate) fixed effect parameter estimates from glmmML and lmer, but the information they give on the variance components is quite different. Thanks in advance. Now I paste in the example code ### Paul Johnson [EMAIL PROTECTED] ### 2006-03-08 N - 1000 A - -1 B - 0.3 x - 1 + 10 * rnorm(N) eta - A + B * x pi - exp(eta)/(1+exp(eta)) myunif - runif(N) y - ifelse(myunif pi, 1, 0) plot(x,y, main=bquote( eta[i] == .(A) + .(B) * x[i] )) text ( 0.5*max(x), 0.5, expression( Prob( y[i] == 1) == frac( 1 , 1 + exp(-eta[i] myglm1 - glm ( y ~ x, family=binomial(link=logit) ) summary(myglm1) ## Just for fun myglm2 - glm(y~x, family=quasibinomial) summary(myglm2) ### Mixed model: random intercept with large variance eta - A + B * x + 5 * rnorm(N) pi - exp(eta)/(1+exp(eta)) myunif - runif(N) y - ifelse(myunif pi, 1, 0) plot(x,y, main=bquote( eta[i] == .(A) + .(B) * x[i] + u[i])) text ( 0.5*max(x), 0.5, expression( Prob( y[i] == 1) == frac( 1 , 1 + exp(-eta[i] ### Parameter estimates go to hell, as expected myglm3 - glm ( y ~ x, family=binomial(link=logit) ) summary(myglm3) ### Why doesn't quasibinomial show more evidence of the random intercept? myglm4 - glm(y~x, family=quasibinomial) summary(myglm4) # Can I estimate with lmer? library(lme4) ### With no repeated observations, what does lmer want? ### Clusters of size 1 ? ### In lme, I'd need random= ~ 1 idnumber - 1: length(y) mylmer1 - lmer( y ~ x + ( 1 | idnumber ), family=binomial, method=Laplace ) summary(mylmer1) ### Glmm wants clusters, and I don't have any clusters with more than 1 observation ### library(glmmML) myglmm1 - glmmML(y~x, family=binomial, cluster = idnumber ) summary(myglmm1) -- Paul E. Johnson Professor, Political Science 1541 Lilac Lane, Room 504 University of Kansas __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] function gdist, dist and vegdist in mvpart
On Wed, 2006-03-08 at 19:12 +0100, Sina Muster wrote: Dear R community, What I am missing? Where is the problem? That dist returns an object of class dist, even if you use upper = TRUE. ?dist says: upper: logical value indicating whether the upper triangle of the distance matrix should be printed by 'print.dist'. Beware confusing the printed representation of the object with what the object actually looks like. The reason gdist works, is that is does not do what its help page says: Value: Should be interchangeable with 'dist' and returns a distance object of the same type. Which the following sequence illustrates does not happen (using the spider dataset from mvpart): spider.dist - gdist(spider[1:12,]) class(spider.dist) [1] dist spider.dist - gdist(spider[1:12,], full = TRUE) class(spider.dist) [1] matrix You might consider contacting the maintainer and submit a bug report to him about this... So even though you thought the two (dist and gdist) were the same, they are not. To solve this problem then, a simple modification to your mvpart call is required, as shown below: spider.dist2 - dist(spider[, 1:12], method=minkowski, p=1, diag=T, upper=T) class(spider.dist2) [1] dist mvpart(spider.dist2~herbs+reft+moss+sand+twigs+water,spider) Error in model.frame(formula, rownames, variables, varnames, extras, extranames, : variable lengths differ mvpart(as.matrix(spider.dist2)~herbs+reft+moss+sand+twigs +water,spider) The key is the as.matrix() wrapped around the lhs of the formula. It would have been useful if you'd included an reproducible example using a built in data set - like I do above. Readers of the list wont have your ba12, so extra work is required to investigate the problem. Thanks for helping. Sina You're welcome, hope this helps. Gav -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Gavin Simpson [T] +44 (0)20 7679 5522 ENSIS Research Fellow [F] +44 (0)20 7679 7565 ENSIS Ltd. ECRC [E] gavin.simpsonATNOSPAMucl.ac.uk UCL Department of Geography [W] http://www.ucl.ac.uk/~ucfagls/cv/ 26 Bedford Way[W] http://www.ucl.ac.uk/~ucfagls/ London. WC1H 0AP. %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Adding polygons to a barplot
I have a barplot I have created using barplot2 and I have been able to add points and lines (using the points and lines methods, respectively). I now need to add some polygons (triangles in particular), that I want to be shaded to match bars in the plot. I can get the coordinates of the corners of the triangles, but don't know how to draw the triangles. I know there is the grid.polygon method, but I don't know how to get it to draw on my plot. Any help would be appreciated. Thanks! Jamie __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] bug in map('world') ?
I seem to recall I came across something similar recently (well, relatively recently, perhaps a couple of years ago). The problem is related to how the code handles polygons that are split across the map boundaries. As I recall, the fix was to modify the polygons that straddle the date line. I'll have to delve into this again, but don't hold your breath, sorry. Ray Brownrigg From: Joerg van den Hoff [EMAIL PROTECTED] hi, did'nt see anything in the archive: map('world',pro='rectangular',para=0) yields a strange artifact (horizontal bar) extending over the whole map at a certain latitude range (approx 65 deg. north), whereas map('world',pro='rectangular',para=180) (which should be the same) does not show the artifact. the artifact shows up in other projections as well, e.g. map('world',pro='bonne',para=45) which seems to show that the problem might be in the data base of the map (i.e. polygon definition)?? any ideas? regards, joerg __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] memory limits in Windows
Hello, I apologize, since I know that variations of this question come up often, but I have searched both the archives and the rw-FAQ, and haven't had any luck solving my problem. I am trying to run some generalized additive mixed models (using the mgcv library) with some large datasets, and get error messages regarding memory allocation. An example dataset has around 45,000 points contributed by about 2200 individuals (so about 20 observations per individual). Errors within individuals are modeled as AR(1). Currently, I can run the models on a random subset of about 25% of the data in a reasonable amount of time, so I think that memory is the only major issue here. I have used the --max-mem-size command line option, and set it to the maximum allowable, which is 3Gb (any larger and I get a message that it is too large and is ignored when I open R). I also run the R command memory.limit(4095), again the maximum allowable without receiving a don't be silly message. I am running this on a brand new computer with Windows XP-64 OS and 4GB RAM (it was bought primarily to be able to handle these models!) Do I have any options to increase memory? Any advice would be hugely appreciated. Thanks so much for any input. Karissa Johnston __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Accessing functions in a library
I am trying to write a modified function to plot an rpart object. By using getS3method I can see the plot and text code that I want to modify. Since I don't want to modify the package, I create a new function to plot the rpart object. The problem is that the original function calls many rpart specific functions that are only visible inside the rpart namespace. Therefore, when I call my modified function it does not find the rpart specific functions and gives me an error. How can I make a new function that is a small modification of an existing package function and let the modified function access the hidden package functions? Michael Conklin Chief Methodolgist - Advanced Analytics MarketTools, Inc. 6465 Wayzata Blvd. Suite 170 Minneapolis, MN 55426 Tel: 952.417.4719 | Mobile:612.201.8978 [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] MarketTools(r)http://www.markettools.com http://www.markettools.com/ This e-mail and any attachments may contain privileged, confidential or proprietary information. If you are not the intended recipient, be aware that any review, copying, or distribution of this e-mail or any attachment is strictly prohibited. If you have received this e-mail in error, please return it to the sender immediately, and permanently delete the original and any copies from your system. Thank you for your cooperation. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] RES: survival
Dear Thomas, The head of my dataset head(wsuv) parcel sp time censo treatment species 1 S8 Poecilanthe effusa ( Hub. ) Ducke. 1 1 1 1 2 S8 Poecilanthe effusa ( Hub. ) Ducke. 1 1 1 1 3 S8 Poecilanthe effusa ( Hub. ) Ducke. 1 1 1 1 4 S8 Poecilanthe effusa ( Hub. ) Ducke. 1 1 1 1 5 S8 Poecilanthe effusa ( Hub. ) Ducke. 1 1 1 1 6 S8 Poecilanthe effusa ( Hub. ) Ducke. 1 1 1 1 ... 144361 summary(model.fit) # just one species from one treatment shown below Call: survfit(formula = Surv(time, censo) ~ treatment + species, data = wsuv) treatment=0, species=1 time n.risk n.event survival std.err lower 95% CI upper 95% CI 1 15440 3860.975 0.001260.9730.977 2 15054 3360.953 0.001700.9500.957 3 14668 3020.934 0.002000.9300.938 4 14282 2960.914 0.002260.9100.919 5 13896 2810.896 0.002470.8910.901 6 13510 2640.878 0.002640.8730.883 7 13124 2510.861 0.002800.8560.867 8 12738 2320.846 0.002930.8400.852 9 12352 2160.831 0.003050.8250.837 10 11966 2060.817 0.003150.8110.823 11 11580 1900.803 0.003250.7970.810 12 11194 1790.790 0.003330.7840.797 13 10808 1670.778 0.003410.7720.785 14 10422 1670.766 0.003490.7590.773 15 10036 1450.755 0.003560.7480.762 16 9650 1420.744 0.003630.7370.751 17 9264 1350.733 0.003690.7260.740 18 8878 1220.723 0.003750.7150.730 19 8492 990.714 0.003800.7070.722 20 8106 840.707 0.003850.6990.714 21 7720 680.701 0.003890.6930.708 22 7334 660.694 0.003930.6870.702 23 6948 510.689 0.003970.6810.697 24 6562 400.685 0.004000.6770.693 25 6176 380.681 0.004030.6730.689 26 5790 370.676 0.004070.6690.684 27 5404 330.672 0.004110.6640.680 28 5018 310.668 0.004150.6600.676 29 4632 260.664 0.004190.6560.673 30 4246 220.661 0.004230.6530.669 31 3860 150.658 0.004270.6500.667 32 3474 140.656 0.004310.6470.664 33 3088 140.653 0.004360.6440.661 34 2702 130.650 0.004430.6410.658 35 2316 120.646 0.004510.6380.655 36 1930 110.643 0.004620.6340.652 37 1544 120.638 0.004800.6280.647 38 1158 100.632 0.005070.6220.642 39772 90.625 0.005570.6140.636 40386 80.612 0.007090.5980.626 I don't get why with 8 leaves remaining (out of 384), the survival is about 0.6??? Call: survfit(formula = Surv(time, censo) ~ 1, data = wsuv) n events median 0.95LCL 0.95UCL 144361 58830 40 39 40 survfit(Surv(timee,ind)~sp2,data=wsuv) Call: survfit(formula = Surv(timee, ind) ~ sp2, data = wsuv) n events median 0.95LCL 0.95UCL sp2=1 32226 10856Inf Inf Inf sp2=2 23370 9824 38 37 39 sp2=3 31201 13275 40 39 41 sp2=4 28044 10401 41 40 41 sp2=5 29520 14474 31 30 31 survfit(Surv(timee,ind)~parcel2,data=wsuv) Call: survfit(formula = Surv(timee, ind) ~ parcel2, data = wsuv) n events median 0.95LCL 0.95UCL parcel2=0 68183 28116 38 38 38 parcel2=1 76178 30714 41 41 41 survfit(Surv(timee,ind)~interaction(parcel2,sp2),data=wsuv) Call: survfit(formula = Surv(timee, ind) ~ interaction(parcel2, sp2), data = wsuv) n events median 0.95LCL 0.95UCL interaction(parcel2, sp2)=0.1 15826 5070Inf Inf Inf interaction(parcel2, sp2)=1.1 16400 5786Inf Inf Inf interaction(parcel2, sp2)=0.2 9430 3935 38 37 39 interaction(parcel2, sp2)=1.2 13940 5889 38 37 39 interaction(parcel2, sp2)=0.3 14678 6021 40 39 41 interaction(parcel2, sp2)=1.3 16523 7254 39 37 41 interaction(parcel2, sp2)=0.4 14473 5758 38 37 39 interaction(parcel2, sp2)=1.4 13571 4643Inf Inf Inf interaction(parcel2, sp2)=0.5 13776 7332 29
Re: [R] Accessing functions in a library
Michael Conklin [EMAIL PROTECTED] writes: I am trying to write a modified function to plot an rpart object. By using getS3method I can see the plot and text code that I want to modify. Since I don't want to modify the package, I create a new function to plot the rpart object. The problem is that the original function calls many rpart specific functions that are only visible inside the rpart namespace. Therefore, when I call my modified function it does not find the rpart specific functions and gives me an error. How can I make a new function that is a small modification of an existing package function and let the modified function access the hidden package functions? You can subvert namespace protection using :::. So if there is a hidden function g in package foo, use foo:::g(). __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Debugging a program written in the R language
From Tom: The subject is debugging a program written in the R language,under Windows. (Sorry, but I do not know either the Apple OS or *nix.) A computer program will usually not work on the first try, if only because of the risk of typos. Instead, it must be debugged. Roughly, here is the sequence: (1) One codes a program using the R language, and stores it on the hard drive, using some particular editor. (2) The program is fed to the R software, together with test data, etc. (3) A test computation is run and bugs are spotted. (4) The program is corrected, using an editor, and the revised version is stored on the hard drive. (5) The sequence goes back to step (2) and is repeated until the program hopefully works. Unfortunately, the documentation doesn't really explain how to do all of this, or if it is explained in the documentation, I can't find it. Reading between the lines a bit, I infer that you are supposed to be able to use something called a History file, then sort of work backward in the code and make corrections. I never got it to work for me. Also, it is unclear how you would handle code entered six weeks or six months ago. That is the bad news; the good news is that some kind soul told me about a key trick; prepare the program in Windows text format (.txt) and copy it and paste it into the console. The program will now run from a user-defined wrapper or driver function. I am aware that there is an editor called Emacs which you can use if you are a member of the *nix community, which I am not. Question: How are you -supposed- to debug a program which you have written in the R language? Tom Thomas L. Jones, Ph.D., Computer Science __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] [Q] BIC as a goodness-of-fit stat
The following article gives some motivations for the BIC and some rules to interpret a difference of BICs (if I recall correctly, over 10: very strong difference, between 6 and 10: strong difference, between 2 and 6: some difference): A.E. Raftery, Bayesian model selection in social research http://www.stat.washington.edu/tech.reports/bic.ps Regards, -- Vincent On 06/03/06, Young-Jin Lee [EMAIL PROTECTED] wrote: Dear R-List I have a question about how to interpret BIC as a goodness-of-fit statistic. I was trying to use EMclust and other mclust library and found that BIC was used as a goodness-of-fit statistic. Although I know that smaller BIC indicates a better fit, it is not clear to me how good a fit is by reading a BIC number. Is there a standard way of interpreting a BIC value? Thanks in advance. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Debugging a program written in the R language
Firstly, some corrections: - It's best to write code as functions in R, just like most sensible languages. You can then debug the functions. - (X)Emacs has versions for Windows. Prof. John Fox has kindly made available some instructions on how Xemacs/ESS can be set up on Windows. Now, Roger Peng has kindly made available a note on debugging R code: http://www.biostat.jhsph.edu/~rpeng/docs/R-debug-tools.pdf Also, for compiled code linked to R, Duncan Murdoch's notes will be helpful: http://www.stats.uwo.ca/faculty/murdoch/software/debuggingR/ Finally, there's a `debug' package on CRAN with a R News article describing it. Andy From: Thomas L Jones From Tom: The subject is debugging a program written in the R language,under Windows. (Sorry, but I do not know either the Apple OS or *nix.) A computer program will usually not work on the first try, if only because of the risk of typos. Instead, it must be debugged. Roughly, here is the sequence: (1) One codes a program using the R language, and stores it on the hard drive, using some particular editor. (2) The program is fed to the R software, together with test data, etc. (3) A test computation is run and bugs are spotted. (4) The program is corrected, using an editor, and the revised version is stored on the hard drive. (5) The sequence goes back to step (2) and is repeated until the program hopefully works. Unfortunately, the documentation doesn't really explain how to do all of this, or if it is explained in the documentation, I can't find it. Reading between the lines a bit, I infer that you are supposed to be able to use something called a History file, then sort of work backward in the code and make corrections. I never got it to work for me. Also, it is unclear how you would handle code entered six weeks or six months ago. That is the bad news; the good news is that some kind soul told me about a key trick; prepare the program in Windows text format (.txt) and copy it and paste it into the console. The program will now run from a user-defined wrapper or driver function. I am aware that there is an editor called Emacs which you can use if you are a member of the *nix community, which I am not. Question: How are you -supposed- to debug a program which you have written in the R language? Tom Thomas L. Jones, Ph.D., Computer Science __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Debugging a program written in the R language
From the R prompt, try typing ?source A good editor, specialized for working with R source on Windows, is Tinn-R. on 3/8/2006 4:12 PM Thomas L Jones said the following: From Tom: The subject is debugging a program written in the R language,under Windows. (Sorry, but I do not know either the Apple OS or *nix.) A computer program will usually not work on the first try, if only because of the risk of typos. Instead, it must be debugged. Roughly, here is the sequence: (1) One codes a program using the R language, and stores it on the hard drive, using some particular editor. (2) The program is fed to the R software, together with test data, etc. (3) A test computation is run and bugs are spotted. (4) The program is corrected, using an editor, and the revised version is stored on the hard drive. (5) The sequence goes back to step (2) and is repeated until the program hopefully works. Unfortunately, the documentation doesn't really explain how to do all of this, or if it is explained in the documentation, I can't find it. Reading between the lines a bit, I infer that you are supposed to be able to use something called a History file, then sort of work backward in the code and make corrections. I never got it to work for me. Also, it is unclear how you would handle code entered six weeks or six months ago. That is the bad news; the good news is that some kind soul told me about a key trick; prepare the program in Windows text format (.txt) and copy it and paste it into the console. The program will now run from a user-defined wrapper or driver function. I am aware that there is an editor called Emacs which you can use if you are a member of the *nix community, which I am not. Question: How are you -supposed- to debug a program which you have written in the R language? Tom Thomas L. Jones, Ph.D., Computer Science __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Michael Prager, Ph.D. Population Dynamics Team, NMFS SE Fisheries Science Center NOAA Center for Coastal Fisheries and Habitat Research Beaufort, North Carolina 28516 http://shrimp.ccfhrb.noaa.gov/~mprager/ Opinions expressed are personal, not official. No government endorsement of any product is made or implied. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Debugging a program written in the R language
You could always use notepad, but there are better solutions. There are many text editors which will send the commands to R for you and return the results and also offer syntax highlighting. I like Tinn-R. Xemacs is probably the best, but its hard to learn (IMHO) and I have not taken the time to do so. There is also a host of GUIs that people are developing for R. Take a look at: http://www.sciviews.org/_rgui/ I suggest R commander. library(Rcmdr) You will need to download it from CRAN. HTH, Roger On 3/8/06, Thomas L Jones [EMAIL PROTECTED] wrote: From Tom: The subject is debugging a program written in the R language,under Windows. (Sorry, but I do not know either the Apple OS or *nix.) A computer program will usually not work on the first try, if only because of the risk of typos. Instead, it must be debugged. Roughly, here is the sequence: (1) One codes a program using the R language, and stores it on the hard drive, using some particular editor. (2) The program is fed to the R software, together with test data, etc. (3) A test computation is run and bugs are spotted. (4) The program is corrected, using an editor, and the revised version is stored on the hard drive. (5) The sequence goes back to step (2) and is repeated until the program hopefully works. Unfortunately, the documentation doesn't really explain how to do all of this, or if it is explained in the documentation, I can't find it. Reading between the lines a bit, I infer that you are supposed to be able to use something called a History file, then sort of work backward in the code and make corrections. I never got it to work for me. Also, it is unclear how you would handle code entered six weeks or six months ago. That is the bad news; the good news is that some kind soul told me about a key trick; prepare the program in Windows text format (.txt) and copy it and paste it into the console. The program will now run from a user-defined wrapper or driver function. I am aware that there is an editor called Emacs which you can use if you are a member of the *nix community, which I am not. Question: How are you -supposed- to debug a program which you have written in the R language? Tom Thomas L. Jones, Ph.D., Computer Science __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Debugging a program written in the R language
The simplest way: 1. Write the program using your favorite text editor and save it i.e. c:/try.R 2. Open R and write source(c:/try.R) See source help (?source) for more information I would recommend that, even you are a windows user, you use the emacs package for statistic (ESS). There are also other IDEs available. In this respect I actually do not see any difference between R and other programming languages HTH Thomas L Jones wrote: From Tom: The subject is debugging a program written in the R language,under Windows. (Sorry, but I do not know either the Apple OS or *nix.) A computer program will usually not work on the first try, if only because of the risk of typos. Instead, it must be debugged. Roughly, here is the sequence: (1) One codes a program using the R language, and stores it on the hard drive, using some particular editor. (2) The program is fed to the R software, together with test data, etc. (3) A test computation is run and bugs are spotted. (4) The program is corrected, using an editor, and the revised version is stored on the hard drive. (5) The sequence goes back to step (2) and is repeated until the program hopefully works. Unfortunately, the documentation doesn't really explain how to do all of this, or if it is explained in the documentation, I can't find it. Reading between the lines a bit, I infer that you are supposed to be able to use something called a History file, then sort of work backward in the code and make corrections. I never got it to work for me. Also, it is unclear how you would handle code entered six weeks or six months ago. That is the bad news; the good news is that some kind soul told me about a key trick; prepare the program in Windows text format (.txt) and copy it and paste it into the console. The program will now run from a user-defined wrapper or driver function. I am aware that there is an editor called Emacs which you can use if you are a member of the *nix community, which I am not. Question: How are you -supposed- to debug a program which you have written in the R language? Tom Thomas L. Jones, Ph.D., Computer Science __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Ferdinand Alimadhi Programmer / Analyst Harvard University The Institute for Quantitative Social Science (617) 496-0187 [EMAIL PROTECTED] www.iq.harvard.edu __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Adding polygons to a barplot
On Wed, 2006-03-08 at 14:02 -0500, Jamieson Cobleigh wrote: I have a barplot I have created using barplot2 and I have been able to add points and lines (using the points and lines methods, respectively). I now need to add some polygons (triangles in particular), that I want to be shaded to match bars in the plot. I can get the coordinates of the corners of the triangles, but don't know how to draw the triangles. I know there is the grid.polygon method, but I don't know how to get it to draw on my plot. Any help would be appreciated. Thanks! Jamie There is a polygon() function in base R graphics. It supports a 'density' argument in the same fashion as barplot()/barplot2(). See ?polygon help.search(polygon) would have gotten you there. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] How to plot the xaxis label at 45 degree angle?
Hello there, I would like to plot a graph with the x axis's label displayed at a 45 angle to the x axis instead of horizontal to it as the label is very long. What should I do? Thank you for your help in advance Lisa Wang Princess Margaret Hospital Toronto, Ca __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] bug in map('world') ?
There is a function recenter.Map in the TeachingDemos package that can be used with the maptools packages (rather than the maps package) to move polygons around for better looking maps. The original idea was to put the Alaskan islands on the left of the map rather than having some of them on the far right, but it also works to creat a pacific centric map or other things. It is still pretty basic as a function, if anyone wants to improve it (it could probably use a lot of improvement) they are more than welcome. Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] (801) 408-8111 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ray Brownrigg Sent: Wednesday, March 08, 2006 12:40 PM To: [EMAIL PROTECTED]; r-help@stat.math.ethz.ch Subject: Re: [R] bug in map('world') ? I seem to recall I came across something similar recently (well, relatively recently, perhaps a couple of years ago). The problem is related to how the code handles polygons that are split across the map boundaries. As I recall, the fix was to modify the polygons that straddle the date line. I'll have to delve into this again, but don't hold your breath, sorry. Ray Brownrigg From: Joerg van den Hoff [EMAIL PROTECTED] hi, did'nt see anything in the archive: map('world',pro='rectangular',para=0) yields a strange artifact (horizontal bar) extending over the whole map at a certain latitude range (approx 65 deg. north), whereas map('world',pro='rectangular',para=180) (which should be the same) does not show the artifact. the artifact shows up in other projections as well, e.g. map('world',pro='bonne',para=45) which seems to show that the problem might be in the data base of the map (i.e. polygon definition)?? any ideas? regards, joerg __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] POSIX time zone codes
I would have suggested US/Central, by analogy with US/Pacific which I use on my unix-like Mac OS X 10.3.9 system. I don't know what might be common knowledge in UNIX circles, but here's a bit of information from my system. It has a directory named /usr/share/zoneinfo. zoneinfo[166]% pwd /usr/share/zoneinfo zoneinfo[167]% ls | grep DT CST6CDT EST5EDT MST7MDT PST8PDT There are files named GMT, UTC, GB, and a number of others. There are numerous subdirectories, including for example, Asia, Australia, Europe, US. zoneinfo[168]% cd US US[169]% ls ./ AleutianEast-IndianaIndiana-Starke Pacific ../ Arizona Eastern MichiganPacific-New Alaska Central Hawaii MountainSamoa This suggests to me that valid values for the tz argument are relative paths to files /usr/share/zoneinfo, but I don't know that for a fact. -Don At 9:22 AM -0500 3/7/06, Jason Horn wrote: Thank you again Gabor, that did the trick. Any thoughts on where I go go for a reference for these time codes? Where did you get CDT6CST from? Or is this just one of those things that is common knowledge in UNIX circles. To the R developers: I recommend a sentence be added to the manual for as.POSIXxx such as vales for tz are system dependent, examples for common systems are. Thanks! On Mar 7, 2006, at 9:00 AM, Gabor Grothendieck wrote: Only and GMT are really guaranteed to work on all systems since the time zones are system dependent but try: CDT6CST and see if that works on your system. On 3/7/06, Jason Horn [EMAIL PROTECTED] wrote: Whoops, [EDIT] as.POSIX(x, tz=UTC) ... works, gives UTC times as.POSIX(x, tz=EST) ... works, gives EST times as.POSIX(x, tz=CST) ... does NOT work, gives UTC times [/EDIT] On Mar 7, 2006, at 8:05 AM, Jason Horn wrote: as.POSIX(x, tz=UTC) ... works, gives UTC times as.POSIX(x, tz=UTC) ... works, gives EST times as.POSIX(x, tz=CST) ... does NOT work, gives UTC times [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting- guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- -- Don MacQueen Environmental Protection Department Lawrence Livermore National Laboratory Livermore, CA, USA __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] install.packages saying the car package is not in the repositories
I am attempting to install the car package using the command : install.packages('car',.Library) After I have chosen a mirror, I get the following message : Warning message: no package 'car' at the repositories in: download.packages(pkgs, destdir = tmpd, available = available, I used the CRAN.packages() function to see what was going on, and the car package is not listed. I have also tried several mirrors, all with the same error message. I have tried to install other packages, this has been successful. Any help would be great. Jeremy __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] How to plot the xaxis label at 45 degree angle?
On Wed, 2006-03-08 at 16:53 -0500, Lisa Wang wrote: Hello there, I would like to plot a graph with the x axis's label displayed at a 45 angle to the x axis instead of horizontal to it as the label is very long. What should I do? Thank you for your help in advance See R FAQ 7.27 How can I create rotated axis labels? http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-create-rotated-axis-labels_003f Using: RSiteSearch(rotated axis labels) would be helpful as well. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Debugging a program written in the R language
On 3/8/06, Thomas L Jones [EMAIL PROTECTED] wrote: From Tom: The subject is debugging a program written in the R language,under Windows. (Sorry, but I do not know either the Apple OS or *nix.) A computer program will usually not work on the first try, if only because of the risk of typos. Instead, it must be debugged. Roughly, here is the sequence: (1) One codes a program using the R language, and stores it on the hard drive, using some particular editor. Editors with various levels of integration with R can be found at these links: http://www.sciviews.org/_rgui/projects/Editors.html http://ess.r-project.org/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Mixed GLM methodology and execution question
Hi all, I have a question regarding how to properly analyze a data set and then how to perform the analysis in R. First, I have data that I would like to analyze using a mixed GLM (I think this is the most appropriate method, but I am unsure). In a mixed model (y = X*beta+Z*gamma+epsilon), I would like to structure the variance matrices of gamma, G, and the error, R, to take advantage of all my information. The structure of the data is like this: Response Variable: y = continuous response variable Predictor Variables: x1 = nominal treatment x2 = nominal group Random Variables: z = nominal subgroup of x2, i.e. z is nested within x2 Other variables(?; I'm not sure what exactly these are) z1 = first continuous property of z z2 = second continuous property of z z3 = third continuous property of z Presumably all the traits z1-z3 could potentially affect y, though I'm primarily interested in the model y=x1+x2+x1*x2. My wish is to put z in as a random variable and z1-z3 in the error matrix R. A small data sample would be like x1x2 z z1 z2 z3 y L1 A1 S1 1.23 4.59 -1.02 100.45 L2 A1 S1 1.23 4.59 -1.02 113.09 L1 A1 S2 1.50 3.76-0.06 119.21 L2 A1 S2 1.50 3.76-0.06 150.44 L1 A2 S3 1.09 4.01-1.50 109.18 L2 A2 S3 1.09 4.01-1.50 170.23 L1 A2 S4 1.01 3.70-0.78 109.26 L2 A2 S4 1.01 3.70-0.78 99.44 What is correct way to put together my model/matrices for this situation? How do accomplish such a task in R? Thanks, Ben [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] install.packages saying the car package is not in therepositories
Dear Jeremy, The car package (version 1.1-0) is certainly on CRAN. I checked a couple of mirrors, and it's there as well. (I assume that .Library contains the library location to which you want to install?) What OS and version of R are you using? Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jeremy Morris Sent: Wednesday, March 08, 2006 5:22 PM To: r-help@stat.math.ethz.ch Subject: [R] install.packages saying the car package is not in therepositories I am attempting to install the car package using the command : install.packages('car',.Library) After I have chosen a mirror, I get the following message : Warning message: no package 'car' at the repositories in: download.packages(pkgs, destdir = tmpd, available = available, I used the CRAN.packages() function to see what was going on, and the car package is not listed. I have also tried several mirrors, all with the same error message. I have tried to install other packages, this has been successful. Any help would be great. Jeremy __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Adding polygons to a barplot
Thanks! That worked. Jamie On 3/8/06, Marc Schwartz (via MN) [EMAIL PROTECTED] wrote: On Wed, 2006-03-08 at 14:02 -0500, Jamieson Cobleigh wrote: I have a barplot I have created using barplot2 and I have been able to add points and lines (using the points and lines methods, respectively). I now need to add some polygons (triangles in particular), that I want to be shaded to match bars in the plot. I can get the coordinates of the corners of the triangles, but don't know how to draw the triangles. I know there is the grid.polygon method, but I don't know how to get it to draw on my plot. Any help would be appreciated. Thanks! Jamie There is a polygon() function in base R graphics. It supports a 'density' argument in the same fashion as barplot()/barplot2(). See ?polygon help.search(polygon) would have gotten you there. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] 'less' for R?
As Prof Brian Ripley has pointed out,That changes the object if you make a change in the editor, so definitely NOT to be recommended. And you should use invisible(edit(dat.frame)) instead. 2006/3/9, Christos Hatzis [EMAIL PROTECTED]: Or fix(data.frame) to open it in the spreadsheet-like data editor. -Christos Hatzis -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Duncan Murdoch Sent: Wednesday, March 08, 2006 12:26 PM To: [EMAIL PROTECTED] Cc: r-help Subject: Re: [R] 'less' for R? On 3/8/2006 12:03 PM, Federico Calboli wrote: Hi All, is there an equivalent of the Unix command 'less' (or 'more'), so I can look at what's inside a data.frame or a matrix without having it printed out on console? I am using R on Debian Linux and Mac OS 10.4.5 I think you want page(). head() and tail() are also useful, as equivalents to the head and tail commands. Duncan Murdoch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- 黄荣贵 Deparment of Sociology Fudan University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] When calling external C-function repeatedly I get different results; can't figure out why..
Dear all, I need to calculate tr(xyxy) fast for matrices x and y. Inspired by the R-source code, I've created the following functions (I am *new* to writing external C-functions, so feel free to laugh at my code - or, perhaps, suggest changes): #include Rinternals.h #include R_ext/Applic.h /* for dgemm */ static void matprod(double *x, int nrx, int ncx, double *y, int nry, int ncy, double *z) { char *transa = N, *transb = N; double one = 1.0, zero = 0.0; F77_CALL(dgemm)(transa, transb, nrx, ncy, ncx, one, x, nrx, y, nry, zero, z, nrx); } SEXP trProd2(SEXP x, SEXP y) { int nrx, ncx, nry, ncy, mode, i; SEXP xdims, ydims, ans, ans2, tr; xdims = getAttrib(x, R_DimSymbol); ydims = getAttrib(y, R_DimSymbol); mode = REALSXP; nrx = INTEGER(xdims)[0]; ncx = INTEGER(xdims)[1]; nry = INTEGER(ydims)[0]; ncy = INTEGER(ydims)[1]; PROTECT(ans = allocMatrix(mode, nrx, ncy)); PROTECT(ans2 = allocMatrix(mode, nrx, ncy)); PROTECT(tr = allocVector(mode, 1)); matprod(REAL(x), nrx, ncx, REAL(y), nry, ncy, REAL(ans)); matprod(REAL(ans), nrx, ncy, REAL(ans), nrx, ncy, REAL(ans2)); for (i=0; i nrx; i++){ REAL(tr)[0] = REAL(tr)[0] + REAL(ans2)[i*(nrx+1)]; } UNPROTECT(3); return(tr); } In R I do: trProd2 - function(x, y) { .Call(trProd2, x, y) } x - y - matrix(1:4,ncol=2) storage.mode(x) - storage.mode(y) - double for (i in 1:10) print(trProd2(x, y)) [1] 835 [1] 833 [1] 833 [1] 862 [1] 834 [1] 835 [1] 834 [1] 836 [1] 862 [1] 833 The correct answer is 833. Can anyone give me a hint to what is wrong? I am on a windows xp platform. Thanks in advance Søren __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] how to use the randomForest and rpart function?
Wow, I didn't know that. That's great! He the man! On 3/8/06, Carlos Ortega [EMAIL PROTECTED] wrote: Hello Michael, Just a few words about you phrase Do you know?... Andy Liaw, is the creator and maintainer of the randomForest package. He ported the original library of Briemman to R. Regards, Carlos. On 3/8/06, Michael [EMAIL PROTECTED] wrote: It did not have a legend showing on which color is for class1, which color is for class2, etc... I've read the R-help page. It lists a lot of options, but it did not say which ones are the key parameters that people use most for improving performance... Do you know? On 3/7/06, Liaw, Andy [EMAIL PROTECTED] wrote: As ?plot.randomForest says, it plots error rates. In addition to overall error rates, it also plots error rates for each class. As to the options in randomForest, read about the options in the help page and the reference linked from the help page. Andy From: Michael When I plot the randomForest object, it shows a graph with 3 lines, green, red and black, what's the meaning of these three lines? On 3/7/06, Michael [EMAIL PROTECTED] wrote: Hi all, I am trying to play around with the randomForest function for classification. I know its performance is great. I am currently using the default options. It has many options. How do I further tweak the options so that I can make its performance even better? What are the options that are mostly used? Thanks a lot! M [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Notice: This e-mail message, together with any attachment...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] how to use the randomForest and rpart function?
Thanks a lot Joe! I will take a further look at the article... On 3/8/06, Joseph Retzer [EMAIL PROTECTED] wrote: Hi Michael, I've looked into this a bit and the only parameter that seems to be suggested (by Brieman and a Salford Systems white paper) as one which may have an impact on the RF model is that which sets the number of potential split variables (mtry) for each tree split. The default for categorical response is root(total number of attributes) and (total number of attributes)/3 for regression. Take a look at the tuneRF function in randomForest which takes the default and searches above and below to see if the OOB error rate can be improved by changing mtry. Based on my very limited experimentation with the program, the default value seems to be tough to improve on. Best of luck take care, Joe Retzer *Michael [EMAIL PROTECTED]* wrote: It did not have a legend showing on which color is for class1, which color is for class2, etc... I've read the R-help page. It lists a lot of options, but it did not say which ones are the key parameters that people use most for improving performance... Do you know? On 3/7/06, Liaw, Andy wrote: As ?plot.randomForest says, it plots error rates. In addition to overall error rates, it also plots error rates for each class. As to the options in randomForest, read about the options in the help page and the reference linked from the help page. Andy From: Michael When I plot the randomForest object, it shows a graph with 3 lines, green, red and black, what's the meaning of these three lines? On 3/7/06, Michael wrote: Hi all, I am trying to play around with the randomForest function for classification. I know its performance is great. I am currently using the default options. It has many options. How do I further tweak the options so that I can make its performance even better? What are the options that are mostly used? Thanks a lot! M [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Notice: This e-mail message, together with any attachment...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] how to use the randomForest and rpart function?
Hi Andy, Does the randomForest have a Cross Validation built-in to decide what is the best number of trees or I have to find the best number manually by myself? Thanks a lot! Michael. On 3/7/06, Liaw, Andy [EMAIL PROTECTED] wrote: Yes, I do know. That's why I pointed you to the reference linked from the help page. BTW, there's also an R News article describing the initial version of the package. Have you perused that? Andy -Original Message- *From:* Michael [mailto:[EMAIL PROTECTED] *Sent:* Tuesday, March 07, 2006 9:27 PM *To:* Liaw, Andy *Cc:* R-help@stat.math.ethz.ch *Subject:* Re: [R] how to use the randomForest and rpart function? It did not have a legend showing on which color is for class1, which color is for class2, etc... I've read the R-help page. It lists a lot of options, but it did not say which ones are the key parameters that people use most for improving performance... Do you know? On 3/7/06, Liaw, Andy [EMAIL PROTECTED] wrote: As ?plot.randomForest says, it plots error rates. In addition to overall error rates, it also plots error rates for each class. As to the options in randomForest, read about the options in the help page and the reference linked from the help page. Andy From: Michael When I plot the randomForest object, it shows a graph with 3 lines, green, red and black, what's the meaning of these three lines? On 3/7/06, Michael [EMAIL PROTECTED] wrote: Hi all, I am trying to play around with the randomForest function for classification. I know its performance is great. I am currently using the default options. It has many options. How do I further tweak the options so that I can make its performance even better? What are the options that are mostly used? Thanks a lot! M [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Notice: This e-mail message, together with any attachments, contains information of Merck Co., Inc. (One Merck Drive, Whitehouse Station, New Jersey, USA 08889), and/or its affiliates (which may be known outside the United States as Merck Frosst, Merck Sharp Dohme or MSD and in Japan, as Banyu) that may be confidential, proprietary copyrighted and/or legally privileged. It is intended solely for the use of the individual or entity named on this message. If you are not the intended recipient, and have received this message in error, please notify us immediately by reply e-mail and then delete it from your system. -- -- Notice: This e-mail message, together with any attachments...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] how to use the rpart function?
I see! So you mean I have to collect error counts myself manually... By the way, what parameters do I normally change to improve the default rpart performance? Thanks a lot! On 3/8/06, Carlos Ortega [EMAIL PROTECTED] wrote: Hello Michael, In some of the examples in the rpart function you will find that comparison between the actual and the predicted values, although it is for the Classification mode, not for the regression. Regards, Carlos. On 3/7/06, Michael [EMAIL PROTECTED] wrote: Hi all, What parameter do I normally change in the rpart function? How do I set the cp option? Is there a way to read off error rate directly from the rpart function for training data; I imagine for testing data I have to apply a predict, but for training data I guess the error count would be somewhere existing once the rpart function is finished. Looks like it is related to expressions such as expected loss=0.8362365 when using summary function. Now I have to do this manually, and when it came to compare the correct vs. wrong and count the errors, it was always very tedious... Thanks a lot! M. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] variable '%s' was fitted with class... in predict.nls()
I've tried to predict the values from a new data.frame using the nls.predict function and keep getting the error message: Error in if (sum(wrong) == 1) stop(gettextf(variable '%s' was fitted with class \%s\ but class \%s\ was supplied, : missing value where TRUE/FALSE needed I first thought that it was becuase there may have been something goofy in the formats of the objects/data.frames I was passing to the predict(), but when I examined the code (online) I found the following: I've trid to decipher the code, but most people are much better R programmers that I, and I'm not sure what's going on here. DOes this mean I have to use the same data frame I used to fit the model, in order to fit new data? .checkMFClasses - function(cl, m, ordNotOK = FALSE) { new - sapply(m, .MFclass) if(length(new) == 0) return() old - cl[names(new)] if(!ordNotOK) { old[old == ordered] - factor new[new == ordered] - factor } ## ordered is OK as a substitute for factor, but not v.v. new[new == ordered old == factor] - factor if(!identical(old, new)) { wrong - old != new if(sum(wrong) == 1) stop(gettextf( variable '%s' was fitted with class \%s\ but class \%s\ was supplied, names(old)[wrong], old[wrong], new[wrong]), call. = FALSE, domain = NA) else stop(gettextf( variables %s were specified with different classes from the fit, paste(sQuote(names(old)[wrong]), collapse=, )), call. = FALSE, domain = NA) } } The docs for predict.nls state: newdata: A named list or data frame in which to look for variables with which to predict. If 'newdata' is missing the fitted values at the original data points are returned. I understand this to mean that the newdata data.frame simply need *include* the variables that are in the model (extra's variables fine?) When I used, predict( fits.by.species.30[[1]], dbh=other.202$dbh, se.fit=TRUE ) I got results alright, but I would like to be able to simply pass a new data.frame that contains the independent variables without having to build a list of names. Is that possible? Thanks, Jeff. -- Jeff D. Hamann Forest Informatics, Inc. PO Box 1421 Corvallis, Oregon 97339-1421 phone 541-754-1428 jeff.hamann[at]forestinformatics.com www.forestinformatics.com __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] install.packages saying the car package is not in therepositories
I am running version 2.1.0 of R on a Debian Linux machine. Jeremy __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] install.packages saying the car package is not in therepositories
On 9 March 2006 at 02:23, Jeremy Morris wrote: | I am running version 2.1.0 of R on a Debian Linux machine. In which case $ apt-get install r-cran-car is your friend. 2.1.0 is ancient, so you could take advantage of the backport of the current R the Debian stable which Chris Steigies provides -- and that is on every CRAN mirror as detailed in the R FAQ. Hth, Dirk -- Hell, there are no rules here - we're trying to accomplish something. -- Thomas A. Edison __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html