RE: [R] 3 little questions
On Sun, 2004-02-01 at 20:22, Gabor Grothendieck wrote: See ?cor.test I'll stand to be corrected here, but I do not believe that cor.test() will work for Kendall's W, since it can only handle two variables. Kendall's W is designed for = 3 variables as a generalization of rho/tau and Friedman's test. There are some suggestions I have seen that estimations of Kendall's W can be done by using the average of multiple pairwise Kendall's tau across the variables, with two formula variations depending upon whether there is an even or odd number of variables AND in the case of no ties. I have also seen a suggestion that the R^2 from a one-way ANOVA yields an approximation of Kendall's W. In fact, the SAS macro I reference below uses this approach. However, each of the aforementioned approaches can vary from W and so carries various caveats under certain conditions. I have not seen anything searching on r-help or CRAN for Kendall's W and I have not coded it myself. However, I did find one reference on the s-news list at: http://www.biostat.wustl.edu/archives/html/s-news/2001-03/msg00197.html and I also located a SAS Macro called %MAGREE at: http://ewe3.sas.com/techsup/download/stat/magree.html If you are familiar with SAS, that might be helpful as well. Another reference which has various formulas and worked examples for W is: Nonparametric Measures of Association Jean Dickson Gibbons Sage, 1993 With respect to gamma, I have worked through a methodology to calculate this and some other measures in R, that I have planned to add to the CrossTable() function in the gregmisc package. Unfortunately, I have not yet completed the coding for several of the measures and the associated ASE's and p values due to lack of time. I have done gamma however and the code is below, starting with the functions to compute concordant and discordant pairs. These approaches (using a cross-tabulation and matrix partitioning) can save a fair amount of time, if the number of unique pairs in the data is substantially less than the total number of pairs. # Calculate CONcordant Pairs in a table # cycle through x[r, c] and multiply by # sum(x elements below and to the right of x[r, c]) # x = table concordant - function(x) { # get sum(matrix values r AND c) # for each matrix[r, c] mat.lr - function(r, c) { lr - x[(r.x r) (c.x c)] sum(lr) } # get row and column index for each # matrix element r.x - row(x) c.x - col(x) # return the sum of each matrix[r, c] * sums # using mapply to sequence thru each matrix[r, c] sum(x * mapply(mat.lr, r = r.x, c = c.x)) } # Calculate DIScordant Pairs in a table # cycle through x[r, c] and multiply by # sum(x elements below and to the left of x[r, c]) # x = table discordant - function(x) { # get sum(matrix values r AND c) # for each matrix[r, c] mat.ll - function(r, c) { ll - x[(r.x r) (c.x c)] sum(ll) } # get row and column index for each # matrix element r.x - row(x) c.x - col(x) # return the sum of each matrix[r, c] * sums # using mapply to sequence thru each matrix[r, c] sum(x * mapply(mat.ll, r = r.x, c = c.x)) } # Calculate Goodman-Kruskal gamma # x = table calc.gamma - function(x) { c - concordant(x) d - discordant(x) gamma - (c - d) / (c + d) gamma } Here is an example of use. Keep in mind that x is the cross tabulation of two vectors of measures, in this example, yielding a 3 x 3 table: x - matrix(c(70, 10, 27, 85, 134, 60, 15, 41, 100), ncol = 3) x [,1] [,2] [,3] [1,] 70 85 15 [2,] 10 134 41 [3,] 27 60 100 calc.gamma(x) [1] 0.57045 If you have any questions on the above, let me know. Hope this helps. Marc Schwartz On Sun, 2004-02-01 at 18:53, Siegfried.Macho wrote: Dear R-helpers, 3 questions: 1. Is there a package that contains a routine for computing Kendall's W (coefficient of concordance), with and without ties ? 2. Is there a package that contains a routine for computing Goodman' s Gamma. 3. I there a simple method for computing the number of ties as well as their lengths within a vector fo ranks, e.g. r1 - rank(c(1, 3, 2, 3, 3, 2, 4)) gives: [1] 1.0 5.0 2.5 5.0 5.0 2.5 7.0 which contains 2 ties with length 2 and 3. __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] CART: rapart vs bagging
From: Qin Liu Hi, Is here anyone knows the difference between rapart and bagging when grow a CART tree? rpart produces one classification or regression tree. Bagging is a general technique for combining classifiers or regression models, usually trees, from bootstrap samples. The bagging() function in the `ipred' package just loop over call to rpart(). Andy Thanks Qin __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Notice: This e-mail message, together with any attachments,...{{dropped}} __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Stepwise regression and PLS
Dear all, I am a newcomer to R. I intend to using R to do stepwise regression and PLS with a data set (a 55x20 matrix, with one dependent and 19 independent variable). Based on the same data set, I have done the same work using SPSS and SAS. However, there is much difference between the results obtained by R and SPSS or SAS. In the case of stepwise, SPSS gave out a model with 4 independent variable, but with step(), R gave out a model with 10 and much higher R2. Furthermore, regsubsets() also indicate the 10 variable is one of the best regression subset. How to explain this difference? And in the case of my data set, how many variables that enter the model would be reasonable? In the case of PLS, the results of mvr function of pls.pcr package is also different with that of SAS. Although the number of optimum latent variables is same, the difference between R2 is much large. Why? Any comment and suggestion is very appreciated. Thanks in advance! Best wishes, Jinsong Zhao = (Mr.) Jinsong Zhao Ph.D. Candidate School of the Environment Nanjing University No.22 Hankou Road, Najing 210093 P.R. China E-mail: [EMAIL PROTECTED] _ 60 http://cn.rd.yahoo.com/mail_cn/tag/?http://cn.mail.yahoo.com __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] ATTENZIONE: [La vostra email: MAIL TRANSACTION FAILED]
posta.powerhouse.it ha trovato ciò che segue nella email da lei inviata: Worm/MyDoom.A2 virus La email non può essere inoltrata. Il suo computer può essere infettato da un virus. Mail-Info: --8-- Message-Id: [EMAIL PROTECTED] From: [EMAIL PROTECTED] To: [EMAIL PROTECTED] Date: Mon, 2 Feb 2004 09:16:49 +0100 Subject: MAIL TRANSACTION FAILED --8-- __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] MATLAB to R
[EMAIL PROTECTED] writes: In MATLAB, I can write: for J=1:M Y(J+1)=Y(J)+ h * feval(f,T(J),Y(J)); ... In R, I can write above as: for (J in 2:M) { y = y + h * f(t,y) ... } Are you sure this gives the same result? If Y and T in Matlab are vectors, I believe for (J in 1:M) { y[J+1] - y[J] + h * f(tt[J], y[J]) ... } is what you want. (Don't use `t' as a variable; t() is the function to transpose a matrix.) for J=1:M k1 = feval(f,T(J),Y(J)); k2 = feval(f,T(J+1),Y(J)+ h * k1 I assume you mean k1(J) = ... and k2(J) = ... How do I write k2 in R? k1 = f(t,y) k2 = ? ## If f can take vector arguments: k1 - f(tt[-M],y) k2 - f(tt[-1], y+h*k1) ## Otherwise: for (J in 1:M) { k1[J] - f(tt[J], y[J]) k2[J] - f(tt[J+1], y[J] + h*k1[J]) } -- Hth, Bjørn-Helge Mevik __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] PLS discriminant analysis
Hi everyone, we would like to perform a PLS discriminant analysis with R. Does anyone knows if at least a PLS regression package is available? Thanks by advance, Xavière Panhard University Hospital Bichat - Claude Bernard Paris [[alternative HTML version deleted]] __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] how to keep functions while remove all other commands
As an alternative to Petr's approach of using separate directories, one can also write a function (which I find quite useful in its own right) to separate ls() output by mode, e.g. lsd - function(splitby=mode, pos=1, ...) { lsout - ls(pos=pos, ...) tapply(lsout, sapply(lsout, function(x) {splitby(get(x))}), invisible) } Given that, you can use expressions such as rm(list=lsd()$numeric) rm(list=lsd(splitby=class, all.names=T)$integer) or to remove functions, rm(list=lsd()$function) # nb reserved word function needs quotes this latter also removes `lsd` itself, so do also follow the advice re keeping a copy of all your functions in a text file somewhere! -Original Message- On 24 Jan 2004 at 21:47, Yong Wang wrote: Dear all: a quick question: I am used to apply rm(list=()) regularly to remove all old codes in preventing them creeping in current analysis.however, with that application, functions I wrote are also removed. please let me know how to keep the thing you want while remove those you don't. Simon Fear Senior Statistician Syne qua non Ltd Tel: +44 (0) 1379 69 Fax: +44 (0) 1379 65 email: [EMAIL PROTECTED] web: http://www.synequanon.com Number of attachments included with this message: 0 This message (and any associated files) is confidential and\...{{dropped}} __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] ordering in dotplot
Dear R-friends, the dataset I am using (data.it) is organized as follows partnerstpbtpreg hk0.641s ger0.271d tur0.271s rom0.241s-f por0.241s spa0.231s gre0.221d-f aus0.171d uk0.161s be0.161d arg0.151s usa0.131d-f fra0.131s neth0.051s-f pol0.051d bra0.041s ko0.0061s un-0.0092d-f svi-0.0222s-f cin-0.0402 d ru-0.0742d-f mex-0.0772s ............ and plotting it using dotplot, I specified the script as: library(lattice) attach(data.ita) dotplot(reg~stp | partner, data=data.ita, groups=btp, xlab=list(Data - it,cex=1.5),col=c(black,red),aspect=0.3,as.table=TRUE,xlim=c(-1,1)) detach(data.ita) In the resulting plot the variable reg is ordered alphabetically. Instead, I would like the variable to be plotted in the following order: s, s-f, d, d-f. How can I do it? Many thanks Luca __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] ordering in dotplot
Luca De Benedictis wrote: the dataset I am using (data.it) is organized as follows ............ and plotting it using dotplot, I specified the script as: library(lattice) attach(data.ita) dotplot(reg~stp | partner, data=data.ita, groups=btp, xlab=list(Data - it,cex=1.5),col=c(black,red),aspect=0.3,as.table=TRUE,xlim=c(-1,1)) detach(data.ita) In the resulting plot the variable reg is ordered alphabetically. Instead, I would like the variable to be plotted in the following order: s, s-f, d, d-f. Make that an ordered factor as follows: data.it$reg - ordered(data.it$reg, levels=c(s, s-f, d, d-f)) hope this helps, Chuck Cleland -- Chuck Cleland, Ph.D. NDRI, Inc. 71 West 23rd Street, 8th floor New York, NY 10010 tel: (212) 845-4495 (Tu, Th) tel: (732) 452-1424 (M, W, F) fax: (917) 438-0894 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] array of variable length vectors
Hi, I'd like to store N vectors of different lengths, and to be able to access them with an index, and eventually free the memory for one of them without modifying the indexes to the others. In C this would be a vector of N pointers that point to memory cells independently allocated. For example int *pv[3]; pv[0] = (int *) malloc(13 * sizeof(int)); pv[1] = (int *) malloc(7 * sizeof(int)); pv[2] = (int *) malloc(110 * sizeof(int)); free(pv[1]) ... What is the best data type (or class) in R to do such a thing? Thank you! Giampiero _ Giampiero Salvi, M.Sc. www.speech.kth.se/~giampi Speech, Music and Hearing Tel: +46-8-790 75 62 Royal Institute of Technology Fax: +46-8-790 78 54 Drottning Kristinasv. 31, SE-100 44, Stockholm, Sweden __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] array of variable length vectors
Giampiero Salvi wrote: Hi, I'd like to store N vectors of different lengths, and to be able to access them with an index, and eventually free the memory for one of them without modifying the indexes to the others. int *pv[3]; pv[0] = (int *) malloc(13 * sizeof(int)); pv[1] = (int *) malloc(7 * sizeof(int)); pv[2] = (int *) malloc(110 * sizeof(int)); free(pv[1]) ... What is the best data type (or class) in R to do such a thing? A list, with vector elements (index starts at 1 in R): pv = list() pv[[1]] = real(13) pv[[2]] = real(7) pv[[3]] = real(110) then the equivalent of freeing the memory and keeping the indexing would be: pv[[2]] = real(0) and NOT pv[[2]] = NULL (which deletes element 2) *BUT* I dont know if R will really free() the memory at that point. You may need to force the garbage collection with gc() Baz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Stepwise Regression and PLS
Frank Harrell wrote I think you missed the point. None of the variable selection procedures will provide results that have a fair probability of replicating in another sample. FH And Jinsong Zhao answered Do you mean different procedures will provide different results? Maybe I don't understand your email correctly. Now, I just hope I could get a reasonable linear model using stepwise method in R, but I don't know how to deal with collinear problem. The problem is not with R, SAS, or SPSS, but with your desire to produce a reasonable linear model using stepwise. Stepwise does not, in general, produce reasonable linear models, nor does it produce models that are generally replicable. This issue has been discussed here in the past, but there have been more extensive discussions on SAS-L, or in numerous statistics books, including Dr. Harrell's excellent one. HTH Peter L. Flom, PhD Assistant Director, Statistics and Data Analysis Core Center for Drug Use and HIV Research National Development and Research Institutes 71 W. 23rd St www.peterflom.com New York, NY 10010 (212) 845-4485 (voice) (917) 438-0894 (fax) __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] array of variable length vectors
On Mon, 2 Feb 2004, Giampiero Salvi wrote: Hi, I'd like to store N vectors of different lengths, and to be able to access them with an index, and eventually free the memory for one of them without modifying the indexes to the others. In C this would be a vector of N pointers that point to memory cells independently allocated. For example int *pv[3]; pv[0] = (int *) malloc(13 * sizeof(int)); pv[1] = (int *) malloc(7 * sizeof(int)); pv[2] = (int *) malloc(110 * sizeof(int)); free(pv[1]) ... What is the best data type (or class) in R to do such a thing? Sounds like an R list. However, in R you cannot free memory, but what you can do (carefully) is to change the list element to NULL and then memory will be salvaged at a future garbage collection. z - vector(list, 3) z[[1]] - integer(13) z[[2]] - integer(7) z[[3]] - integer(110) then z[1] - list(NULL) # and not z[[1]] - NULL will potentially release the memory allocated for the first element. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Stepwise Regression and PLS
On Sun, 1 Feb 2004 20:03:36 -0800 (PST) Jinsong Zhao [EMAIL PROTECTED] wrote: --- Frank E Harrell Jr [EMAIL PROTECTED] wrote: For the case of stepwise regression, I have found that the subsets I got using regsubsets() are collinear. However, the variables in SPSS's result are not collinear. I wonder what I should do to get a same or better linear model. I think you missed the point. None of the variable selection procedures will provide results that have a fair probability of replicating in another sample. FH --- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University Do you mean different procedures will provide different results? Maybe I don't understand your email correctly. Now, I just hope I could get a reasonable linear model using stepwise method in R, but I don't know how to deal with collinear problem. = (Mr.) Jinsong Zhao No, I mean the SAME procedure will provide different results. Use the bootstrap, or use simulation to repeatedly sample from the same population and the same true regression model. You will see dramatically different final models selected by same algorithm. The algorithm is inherently unstable unless perhaps you have a sample an order of magnitude larger than the one you have. See http://www.pitt.edu/~wpilib/statfaq/regrfaq.html) which contains some good references. --- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] PLS discriminant analysis
You could have at least look on CRAN, where you would have found pls.pcr, which does PLS and PC regression. As to using PLS for discrimination, you might want to look at the gpls package in the BioConductor suite. Andy From: xavière panhard Hi everyone, we would like to perform a PLS discriminant analysis with R. Does anyone knows if at least a PLS regression package is available? Thanks by advance, Xavière Panhard University Hospital Bichat - Claude Bernard Paris -- Notice: This e-mail message, together with any attachments,...{{dropped}} __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] array of variable length vectors
Giampiero Salvi wrote: Hi, I'd like to store N vectors of different lengths, and to be able to access them with an index, and eventually free the memory for one of them without modifying the indexes to the others. In C this would be a vector of N pointers that point to memory cells independently allocated. For example int *pv[3]; pv[0] = (int *) malloc(13 * sizeof(int)); pv[1] = (int *) malloc(7 * sizeof(int)); pv[2] = (int *) malloc(110 * sizeof(int)); free(pv[1]) ... What is the best data type (or class) in R to do such a thing? See ?list Uwe Ligges __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] for loops?
Hello R people! How can one use a for loop (or something similar) in R? As I type in each line, I get syntax errors... I'm just confused how much to type in at each prompt. Thanks for your help, cathy ~~~ Catherine M. Stein Research Assistant, Tuberculosis Research Unit Doctoral Candidate in Genetic Epidemiology Department of Epidemiology and Biostatistics Case Western Reserve University office: (216)368-0875 or (216)778-1378 e-mail: [EMAIL PROTECTED], or [EMAIL PROTECTED] http://darwin.cwru.edu/~kasia EPBI Student Resources Page: http://hal.epbi.cwru.edu/stures/ __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] for loops?
On Mon, 2004-02-02 at 13:42, Catherine Stein wrote: Hello R people! How can one use a for loop (or something similar) in R? As I type in each line, I get syntax errors... I'm just confused how much to type in at each prompt. Thanks for your help, cathy Hello. I believe you want something like: for(i in 1:n){ ...some lines here... } If you have only one line within the for loop you can use it without the brackets. If the environment is still open (as within the for loop) you will not get a new prompt but a + prompt to continue as if you were writing on the same line. Only when you close the brackets the prompt will return to after executing the for loop. Hope this will help. Pedro -- --- Pedro Pereira Rodrigues http://www.dcc.fc.up.pt/~prodrigues/ Phone: +351 226078830 - Ext: 121 Snail: Department of Computer Science Faculty of Sciences - University of Porto Rua do Campo Alegre, 823 4150-180 Porto - Portugal __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] for loops?
Catherine Stein wrote: How can one use a for loop (or something similar) in R? As I type in each line, I get syntax errors... I'm just confused how much to type in at each prompt. Have you read help(for) (you need to quote 'for' here to avoid a syntax error!)? If you'd shown us exactly what you'd typed we could probably help better. Suppose you want to loop from 1 to 10 and print it. You can do the following, where '' is the R prompt (dont type it): for(i in 1:10)print(i) - ie all on one line for(i in 1:10){print(i)} - with curly brackets for(i in 1:10) + print(i) - where '+' is the continuation prompt (dont type it) - R gives you this when it realises you havent written a complete expression yet. for(i in 1:10) { + print(i) + } - curly brackets enclose as many expressions as you like inside the loop. for(i in 1:10) + { print(i) + } - curly brackets anywhere. R works it out. You can even do, and I didn't think this would work... for(i in + 1:10) + {print(i)} So in short, I cant get it to give a syntax error :) What are you doing? What version of R, and what platform? Baz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] for loops?
Hi On 2 Feb 2004 at 8:42, Catherine Stein wrote: Hello R people! How can one use a for loop (or something similar) in R? As I type in each line, I get syntax errors... I'm just confused how much to type Use curly braces for (i in ...) { your long commands in several lines } An Introduction to R in doc directory is your best friend. Cheers Petr in at each prompt. Thanks for your help, cathy ~~ ~ Catherine M. Stein Research Assistant, Tuberculosis Research Unit Doctoral Candidate in Genetic Epidemiology Department of Epidemiology and Biostatistics Case Western Reserve University office: (216)368-0875 or (216)778-1378 e-mail: [EMAIL PROTECTED], or [EMAIL PROTECTED] http://darwin.cwru.edu/~kasia EPBI Student Resources Page: http://hal.epbi.cwru.edu/stures/ ~~ ~~ __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Petr Pikal [EMAIL PROTECTED] __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] axes in boxplots
Hi I am struggling with controlling the axes of boxplots. I need to produce two horizontal boxplots with the same x-axis for comparison purposes. But when I generate a plot(c(-12,8), c(-6,1), type=n) first, then the following boxplot always overwrites it! I went into the code of boxplot.default, but even that without success. Thanks for any hint! David __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] filled contour + points
Hello I have a small problem with filled contour plots. I'd like to plot point on top of that using points(). Trouble is, the x axis of the contour plot is modified to make room for the legend but points() is not aware of that. It could be easily tackled by using a linear transformation of x in points(), but does anyone know exactly *what* transformation? Kind regards Ivailo Partchev __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] axes in boxplots
David Andel wrote: Hi I am struggling with controlling the axes of boxplots. I need to produce two horizontal boxplots with the same x-axis for comparison purposes. But when I generate a plot(c(-12,8), c(-6,1), type=n) first, then the following boxplot always overwrites it! I went into the code of boxplot.default, but even that without success. Thanks for any hint! boxplot(dat1, dat2, horizontal=TRUE) plots 2 parallel boxplots of dat1 and dat2 respectively. Uwe Ligges __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Stepwise regression and PLS
On Sun, 1 Feb 2004, [gb2312] Jinsong Zhao wrote: In the case of stepwise, SPSS gave out a model with 4 independent variable, but with step(), R gave out a model with 10 and much higher R2. Furthermore, regsubsets() also indicate the 10 variable is one of the best regression subset. How to explain this difference? And in the case of my data set, how many variables that enter the model would be reasonable? Most likely because step() uses AIC and SPSS uses a p-value criterion, so the models are `best' in different ways. regsubsets() gives best models of each size, so it doesn't address the 4 vs 10 issue. This isn't what regsubsets() is intended for. If you want a single model for prediction, you need a method based on an honest estimate of prediction error and if you want a single model to explain relationships you need to think about relationships. While people seem to want to use it for finding a single model, the purpose of regsubsets() is to give you many models, precisely as a way around the problem of instability everyone else has pointed out. Given a large number of models you can see what features are common to them, or you can do a crude but reasonably effective approximation to model averaging. -thomas Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] filled contour + points
Rather than transform the points, you can use the `plot.axes' argument to filled.contour() and call points() from there. -roger Ivailo Partchev wrote: Hello I have a small problem with filled contour plots. I'd like to plot point on top of that using points(). Trouble is, the x axis of the contour plot is modified to make room for the legend but points() is not aware of that. It could be easily tackled by using a linear transformation of x in points(), but does anyone know exactly *what* transformation? Kind regards Ivailo Partchev __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] problem building R on HPUX 11.23
When building the X11 modules under HPUX 11.23 I get the following errors ld: Unsatisfied symbol Rf_isNull in file dataentry.lo ld: Unsatisfied symbol Rf_length in file dataentry.lo ld: Unsatisfied symbol Rf_warningcall in file devX11.lo ld: Unsatisfied symbol UNIMPLEMENTED in file dataentry.lo ld: Unsatisfied symbol R_alloc in file devX11.lo ld: Unsatisfied symbol R_GlobalEnv in file dataentry.lo ld: Unsatisfied symbol R_setX11Routines in file devX11.lo ld: Unsatisfied symbol Rf_devNumber in file devX11.lo ld: Unsatisfied symbol Rf_elt in file devX11.lo ld: Unsatisfied symbol R_NaInt in file dataentry.lo I assume I need to link in an R library built earlier; which library would I need? Please reply directly, as I am not on the list. Thanks Chuck Fisher [EMAIL PROTECTED] __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Robust nonlinear regression - sin(x)/x?
Dear all Since I did not receive any answer to my general question (?), let me ask a concrete question: How can I fit the simple function y = a*sin(x)/b*x? This is the code that I tried, but nls gives an error: x - seq(1,10,0.1) y - sin(x)/x plot(x,y) z - jitter(y,amount=0.1) plot(x,z) df - as.data.frame(cbind(x,z)) nf - nls(z ~ a*sin(x)/b*x, data=df, start=list(a=0.8,b=0.9), trace = TRUE) I have followed the Puromycin sample which works fine: Pur.wt - nls(rate ~ (Vm * conc)/(K + conc), data = Treated, start = list(Vm = 200, K = 0.1), trace = TRUE) Do I make some mistake or is it not possible to fit sin(x)/x? Thank you in advance Best regards Christian _._._._._._._._._._._._._._._._ C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a V.i.e.n.n.a A.u.s.t.r.i.a _._._._._._._._._._._._._._._._ cstrato wrote: Dear R experts This is a general question: Does R have functions for nonlinear robust regression, analogous to e.g. LTS? Searching google I have found 1, an abstract to generalize LTS for nonlinear regression models, see: http://smealsearch.psu.edu/1509.html 2, an AD-model builder, see: http://otter-rsch.com/admodel/cc1.html but no mention of R/S Thank you in advance Best regards Christian _._._._._._._._._._._._._._._._ C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a V.i.e.n.n.a A.u.s.t.r.i.a _._._._._._._._._._._._._._._._ __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Unknown account
The account is not in use. If you don't know the correct address, please send our mail to [EMAIL PROTECTED] / With best regards, Rederi Ab Eckero __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Robust nonlinear regression - sin(x)/x?
You didn't tell us what the error message nls gave, but it should be obvious: You can estimate either a or b in this model (or their ratio), but you can't estimate both. See any good book on nonlinear regression, e.g., Bates and Watts (1988) Nonlinear Regression and Its Applications (Wiley). With that simplification, the equation is linear. However, I see another problem: If x is 0, sin(x)/x is NaN. For a situation like this, I typically write a function to first compute sin(x)/x, then test for x being 0 or very small, and replace the value in such cases with a more accurate Taylor series or asymptotic expansion. hope this helps. spencer graves cstrato wrote: Dear all Since I did not receive any answer to my general question (?), let me ask a concrete question: How can I fit the simple function y = a*sin(x)/b*x? This is the code that I tried, but nls gives an error: x - seq(1,10,0.1) y - sin(x)/x plot(x,y) z - jitter(y,amount=0.1) plot(x,z) df - as.data.frame(cbind(x,z)) nf - nls(z ~ a*sin(x)/b*x, data=df, start=list(a=0.8,b=0.9), trace = TRUE) I have followed the Puromycin sample which works fine: Pur.wt - nls(rate ~ (Vm * conc)/(K + conc), data = Treated, start = list(Vm = 200, K = 0.1), trace = TRUE) Do I make some mistake or is it not possible to fit sin(x)/x? Thank you in advance Best regards Christian _._._._._._._._._._._._._._._._ C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a V.i.e.n.n.a A.u.s.t.r.i.a _._._._._._._._._._._._._._._._ cstrato wrote: Dear R experts This is a general question: Does R have functions for nonlinear robust regression, analogous to e.g. LTS? Searching google I have found 1, an abstract to generalize LTS for nonlinear regression models, see: http://smealsearch.psu.edu/1509.html 2, an AD-model builder, see: http://otter-rsch.com/admodel/cc1.html but no mention of R/S Thank you in advance Best regards Christian _._._._._._._._._._._._._._._._ C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a V.i.e.n.n.a A.u.s.t.r.i.a _._._._._._._._._._._._._._._._ __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Robust nonlinear regression - sin(x)/x?
You reall have only one parameter in your model, c = a/b. You can't identify both a and b from your model, therefore, you should fit the linear model: lm(z ~ c* sin(x)/x) Ravi. - Original Message - From: cstrato [EMAIL PROTECTED] Date: Monday, February 2, 2004 2:28 pm Subject: [R] Robust nonlinear regression - sin(x)/x? Dear all Since I did not receive any answer to my general question (?), let me ask a concrete question: How can I fit the simple function y = a*sin(x)/b*x? This is the code that I tried, but nls gives an error: x - seq(1,10,0.1) y - sin(x)/x plot(x,y) z - jitter(y,amount=0.1) plot(x,z) df - as.data.frame(cbind(x,z)) nf - nls(z ~ a*sin(x)/b*x, data=df, start=list(a=0.8,b=0.9), trace = TRUE) I have followed the Puromycin sample which works fine: Pur.wt - nls(rate ~ (Vm * conc)/(K + conc), data = Treated, start = list(Vm = 200, K = 0.1), trace = TRUE) Do I make some mistake or is it not possible to fit sin(x)/x? Thank you in advance Best regards Christian _._._._._._._._._._._._._._._._ C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a V.i.e.n.n.a A.u.s.t.r.i.a _._._._._._._._._._._._._._._._ cstrato wrote: Dear R experts This is a general question: Does R have functions for nonlinear robust regression, analogous to e.g. LTS? Searching google I have found 1, an abstract to generalize LTS for nonlinear regression models, see: http://smealsearch.psu.edu/1509.html 2, an AD-model builder, see: http://otter-rsch.com/admodel/cc1.html but no mention of R/S Thank you in advance Best regards Christian _._._._._._._._._._._._._._._._ C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a V.i.e.n.n.a A.u.s.t.r.i.a _._._._._._._._._._._._._._._._ __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting- guide.html __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Robust nonlinear regression - sin(x)/x?
cstrato [EMAIL PROTECTED] writes: Dear all Since I did not receive any answer to my general question (?), let me ask a concrete question: How can I fit the simple function y = a*sin(x)/b*x? This is the code that I tried, but nls gives an error: x - seq(1,10,0.1) y - sin(x)/x plot(x,y) z - jitter(y,amount=0.1) plot(x,z) df - as.data.frame(cbind(x,z)) nf - nls(z ~ a*sin(x)/b*x, data=df, start=list(a=0.8,b=0.9), trace = TRUE) I have followed the Puromycin sample which works fine: Pur.wt - nls(rate ~ (Vm * conc)/(K + conc), data = Treated, start = list(Vm = 200, K = 0.1), trace = TRUE) Do I make some mistake or is it not possible to fit sin(x)/x? The expression only depends on a/b so you cannot estimate both. Besides, you need to check up on operator precedence and associativity: What you wrote is equivalent to a*sin(x)*x/b. -- O__ Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] memory problem for R --Summary
Thank you very much for the replies you have sent me regarding the memory problem. The following is the summary (I tried to read all the messages through. I apologized if I overlooked your message) Cheers, Yun-Fang Backgrounds: a. Data: 1million rows with 73 numeric attributes b. Environment: R 1.7.1 on FreeBSD 4.3 with 2GB memory and double CPU Pentium III/Pentium III Xeon/Celeron with data seg size (kbytes) =1572864 limit Suggested Solutions: z. use SAS since SAS is not trying to read all the data into RAM. a. random sampling from the large data set i.e. 10% of 1 million rows (the option singular.ok=TRUE can be used in lm for singular matrice.) b. use kalman filter with migration variance =0. ( see the dse package for details) c. add the following configuration: options(object.size=1e8) Results: still OOM d. if data is all numeric, add colClasses=numeric in read.table() Results: read.table read in the data successfully but I failed to access the dataset after the loading (even dataset[1:10,] didn't work) - Original Message - From: Liaw, Andy [EMAIL PROTECTED] To: 'Yun-Fang Juan' [EMAIL PROTECTED]; Prof Brian Ripley [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Sent: Friday, January 30, 2004 11:44 AM Subject: RE: [R] memory problem for R You still have not read the posting guide, have you? See more below. From: Yun-Fang Juan [...] I tried 10% sample and it turned out the matrix became singular after I did that. Ther reason is some of the attributes only have zero values most of the time. The data i am using is web log data and after some transformation, they are all numeric. Can we specify some parameters in read.table so that the program will treat all the vars as numeric (with this context, hopefully that will reduce the memory consumption) ? and you clearly have not read my (private) reply, either, in which I told you *exactly* how to do that, via the colClasses argument to read.table(). Please take the help given to you seriously. If you want attention, you have to pay attention. Andy thanks a lot, Yun-Fang -- Notice: This e-mail message, together with any attachments, contains information of Merck Co., Inc. (One Merck Drive, Whitehouse Station, New Jersey, USA 08889), and/or its affiliates (which may be known outside the United States as Merck Frosst, Merck Sharp Dohme or MSD and in Japan, as Banyu) that may be confidential, proprietary copyrighted and/or legally privileged. It is intended solely for the use of the individual or entity named on this message. If you are not the intended recipient, and have received this message in error, please notify us immediately by reply e-mail and then delete it from your system. -- __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] mvrnorm problem
I am trying to simulate draws from a multivariate normal using mvrnorm, and am getting the following error message: Error in mu + eS$vectors %*% diag(sqrt(pmax(ev, 0)), p) %*% t(X) : non-conformable arrays I do not understand why I am getting this message, since the vector of means I am giving to the function is 13 by 1 and the variance matrix I am giving to the function is 13 by 13...these matrices are therefore conformable, right? Is it possible that this is actually a problem with the positive-definiteness of the variance matrix? The matrix I am attempting to use has eigenvalues that are close to zero (but positive). Here's a transcript of the commands I've tried: mvrnorm(n = 1000,B,V) Error in mu + eS$vectors %*% diag(sqrt(pmax(ev, 0)), p) %*% t(X) : non-conformable arrays mvrnorm(n = 1000,t(B),V) Error in mu + eS$vectors %*% diag(sqrt(pmax(ev, 0)), p) %*% t(X) : non-conformable arrays V [,1] [,2] [,3] [,4] [,5] [1,] 2.632324e+05 -1.286920e-01 -9.461830e-02 6.27173e-02 -2.625664e+00 [2,] -1.286920e-01 1.164268e+00 -7.496500e-03 4.95040e-03 -1.557751e-01 [3,] -9.461830e-02 -7.496500e-03 5.761000e-03 -6.66370e-03 6.366440e-02 [4,] 6.271730e-02 4.950400e-03 -6.663700e-03 8.23150e-03 -6.252150e-02 [5,] -2.625664e+00 -1.557751e-01 6.366440e-02 -6.25215e-02 1.753953e+02 [6,] 1.017806e+01 8.797900e-02 -1.028916e+01 9.75542e+00 -8.614497e+03 [7,] -1.286920e-01 1.164268e+00 -7.496500e-03 4.95040e-03 -1.557751e-01 [8,] 3.93e-07 2.06e-09 1.15e-10 -1.39000e-10 5.55e-09 [9,] -1.95e-07 3.68e-09 -2.05e-11 8.86000e-12 -6.11e-09 [10,] 4.53e-07 4.61e-10 9.25e-12 -1.5e-11 1.31e-08 [11,] 1.17e-06 -6.71e-09 6.74e-11 -1.68000e-10 -3.40e-09 [12,] -9.60e-07 3.97e-09 -9.52e-11 2.12000e-10 -3.43e-10 [13,] -4.97e-07 -2.07e-10 -3.10e-11 5.69000e-11 -2.27e-09 [,6] [,7] [,8] [,9][,10] [1,] 1.017806e+01 -1.286920e-01 3.93000e-07 -1.95000e-07 4.53000e-07 [2,] 8.797900e-02 1.164268e+00 2.06000e-09 3.68000e-09 4.61000e-10 [3,] -1.028916e+01 -7.496500e-03 1.15000e-10 -2.05000e-11 9.25000e-12 [4,] 9.755420e+00 4.950400e-03 -1.39000e-10 8.86000e-12 -1.5e-11 [5,] -8.614497e+03 -1.557751e-01 5.55000e-09 -6.11000e-09 1.31000e-08 [6,] 4.980406e+05 8.797900e-02 -4.67000e-07 3.5e-07 -7.68000e-07 [7,] 8.797900e-02 1.164268e+00 2.06000e-09 3.68000e-09 4.61000e-10 [8,] -4.67e-07 2.06e-09 2.15049e-02 9.00950e-03 9.58560e-03 [9,] 3.50e-07 3.68e-09 9.00950e-03 1.58849e-02 1.65230e-03 [10,] -7.68e-07 4.61e-10 9.58560e-03 1.65230e-03 1.53024e-02 [11,] 5.61e-07 -6.71e-09 4.26370e-03 -9.08500e-04 6.03300e-03 [12,] -2.72e-07 3.97e-09 -3.98220e-03 4.55100e-04 -5.55290e-03 [13,] 9.53e-08 -2.07e-10 -1.38413e-02 -9.42850e-03 -1.12002e-02 [,11][,12][,13] [1,] 1.17000e-06 -9.6e-07 -4.97000e-07 [2,] -6.71000e-09 3.97000e-09 -2.07000e-10 [3,] 6.74000e-11 -9.52000e-11 -3.1e-11 [4,] -1.68000e-10 2.12000e-10 5.69000e-11 [5,] -3.4e-09 -3.43000e-10 -2.27000e-09 [6,] 5.61000e-07 -2.72000e-07 9.53000e-08 [7,] -6.71000e-09 3.97000e-09 -2.07000e-10 [8,] 4.26370e-03 -3.98220e-03 -1.38413e-02 [9,] -9.08500e-04 4.55100e-04 -9.42850e-03 [10,] 6.03300e-03 -5.55290e-03 -1.12002e-02 [11,] 9.29045e-02 -8.00488e-02 -2.21646e-02 [12,] -8.00488e-02 7.19209e-02 1.82813e-02 [13,] -2.21646e-02 1.82813e-02 1.76818e-02 B [,1] [,2] [,3] [,4] [,5] [,6] [,7] [1,] -22.02731 0.4982429 2.651566 -2.356899 31.65277 -3302.719 0.4982429 [,8] [,9] [,10][,11] [,12] [,13] [1,] 0.3675237 -0.2562356 0.4549983 4.901309 -4.464062 -0.0836918 t(B) [,1] [1,] -22.0273100 [2,] 0.4982429 [3,] 2.6515660 [4,]-2.3568990 [5,]31.6527700 [6,] -3302.719 [7,] 0.4982429 [8,] 0.3675237 [9,]-0.2562356 [10,] 0.4549983 [11,] 4.9013090 [12,]-4.4640620 [13,]-0.0836918 Thanks very much, Stu Jordan Princeton University [[alternative HTML version deleted]] __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Robust nonlinear regression - sin(x)/x?
A small correction to my previous email: You actually specify the following call to lm: y - sin(x)/x lm(z ~ y - 1) to make sure that the intercept is not estimated. Ravi. - Original Message - From: Ravi Varadhan [EMAIL PROTECTED] Date: Monday, February 2, 2004 2:46 pm Subject: Re: [R] Robust nonlinear regression - sin(x)/x? You reall have only one parameter in your model, c = a/b. You can't identify both a and b from your model, therefore, you should fit the linear model: lm(z ~ c* sin(x)/x) Ravi. - Original Message - From: cstrato [EMAIL PROTECTED] Date: Monday, February 2, 2004 2:28 pm Subject: [R] Robust nonlinear regression - sin(x)/x? Dear all Since I did not receive any answer to my general question (?), let me ask a concrete question: How can I fit the simple function y = a*sin(x)/b*x? This is the code that I tried, but nls gives an error: x - seq(1,10,0.1) y - sin(x)/x plot(x,y) z - jitter(y,amount=0.1) plot(x,z) df - as.data.frame(cbind(x,z)) nf - nls(z ~ a*sin(x)/b*x, data=df, start=list(a=0.8,b=0.9), trace = TRUE) I have followed the Puromycin sample which works fine: Pur.wt - nls(rate ~ (Vm * conc)/(K + conc), data = Treated, start = list(Vm = 200, K = 0.1), trace = TRUE) Do I make some mistake or is it not possible to fit sin(x)/x? Thank you in advance Best regards Christian _._._._._._._._._._._._._._._._ C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a V.i.e.n.n.a A.u.s.t.r.i.a _._._._._._._._._._._._._._._._ cstrato wrote: Dear R experts This is a general question: Does R have functions for nonlinear robust regression, analogous to e.g. LTS? Searching google I have found 1, an abstract to generalize LTS for nonlinear regression models, see: http://smealsearch.psu.edu/1509.html 2, an AD-model builder, see: http://otter- rsch.com/admodel/cc1.html but no mention of R/S Thank you in advance Best regards Christian _._._._._._._._._._._._._._._._ C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a V.i.e.n.n.a A.u.s.t.r.i.a _._._._._._._._._._._._._._._._ __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R- project.org/posting- guide.html __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting- guide.html __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] ordering and plotting question
Hi, I am trying to plot several rows out of a list of thousands. I have 40 columns, and about 16,000 rows with the following Df structure. ID X01 X02 X03..X40 AI456 45 64 23... AI943 14 3 45 .. AI278 78 12 68.. BW768 -2 -7 34.. ... My question is, I have a list of 100 IDs generated elsewhere (Df-Ofinterest), I would like to plot the 100 IDs from that data frame over the 40 columns (40 points per ID, with each column being 1 x value). But I cant figure out how to retrieve the values from the main Df and plot them. Ive tried looking at order, rank etc, but these seem to apply to numbers, and not text strings. thanks Simon. __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Robust nonlinear regression - sin(x)/x?
Dear all Thank you very much this time for the fast response and your many comments, and sorry for the stupid mistake. 1, The following gives the following error: nf - nls(z ~ a*sin(x)/(b*x), data=df, start=list(a=0.8,b=0.9), trace = TRUE) Error in nlsModel(formula, mf, start) : singular gradient matrix at initial parameter estimates 2, However, as Peter Dalgaard mentioned, the following gives the correct result: nf - nls(z ~ c*sin(x)/x, data=df, start=list(c=0.5), trace = TRUE) 2.113783 : 0.5 0.3187204 : 1.022591 3, Now to the question of robust nonlinear fitting: Let me introduce some outliers: z1 - z z1[c(6,12,13,34,36,42,67,69,72,76)] - c(0.8,0.9,0.8,-0.5,-0.4,-0.6,0.5,0.6,0.8,0.7) plot(x,z1) df1 - as.data.frame(cbind(x,z1)) Now, the fit gives: nf1 - nls(z1 ~ c*sin(x)/x, data=df1, start=list(c=0.5), trace = TRUE) 4.814774 : 0.5 2.072135 : 1.145962 The true result should be c=1.0 fitting w/o outliers gives c=1.023 fitting with outliers gives c=1.146 Can this fit considered to be robust? Best regards Christian Peter Dalgaard wrote: cstrato [EMAIL PROTECTED] writes: Dear all Since I did not receive any answer to my general question (?), let me ask a concrete question: How can I fit the simple function y = a*sin(x)/b*x? This is the code that I tried, but nls gives an error: x - seq(1,10,0.1) y - sin(x)/x plot(x,y) z - jitter(y,amount=0.1) plot(x,z) df - as.data.frame(cbind(x,z)) nf - nls(z ~ a*sin(x)/b*x, data=df, start=list(a=0.8,b=0.9), trace = TRUE) I have followed the Puromycin sample which works fine: Pur.wt - nls(rate ~ (Vm * conc)/(K + conc), data = Treated, start = list(Vm = 200, K = 0.1), trace = TRUE) Do I make some mistake or is it not possible to fit sin(x)/x? The expression only depends on a/b so you cannot estimate both. Besides, you need to check up on operator precedence and associativity: What you wrote is equivalent to a*sin(x)*x/b. __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Nearest Neighbor Algorithm in R -- again.
Several of the methods I use for analyzing large data sets, such as WinGamma: determining the level of noise in data Relief-F: estimating the influence of variables depend on finding the k nearest neighbors of a point in a data frame or matrix efficiently. (For large data sets it is not feasible to compute the 'dist' matrix anyway.) Seeing the proposed solution to [R] distance between two matrices last month for finding _one_ nearest neighbor I came up with a solution 'nearest(A, n, k)' as appended. Still, this is very clumsy and slow if you have to find the 3 nearest neighbors for 1000 points in a data frame with 10 entries at least -- about 2 secs per data point on my computer or half an hour for an application from real life. Are there better/faster ways to perform this task using R functions? Even better, is there a free implementation of kd-trees that I could utilize (the one I found did not conform to the ANSI C standard)? Someome pointed to 'spdep' of the R-Sig-Geo project, but 'knearneigh' only accepts 2D data points. Hans Werner Borchers ABB Corporate Research, Germany require (class) nearest - function (X, n, k=3) ## Find k nearest neighbors of X[n, ] in the data frame ## or matrix X, utilizing function knn1 k-times. { N - nrow(X) # inds contains the indices of nearest neighbors inds - c(n); i - 0 while (i k) { # use knn1 for one index... j - as.integer(knn1(X [-inds, ], X[n, ], 1:(N-length(inds # ...and change to true index of neighbor inds - c(inds, setdiff(1:N, inds)[j]) i - i+1 } # return nearest neighbor indices (without n, of course) return(inds[-1]) } __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] problems when compiling C code
Hello, I would like to call C code from R. My C code is divided in two files. In the file testpermut.c, I have the following lines: #include adesub.h In my working folder, I have the files: - adesub.c which contains general functions - adesub.h with the header of functions contained in adesub.c - testpermut.c which call some functions defined in adesub.c When I try to create my dll (Work on Windows XP, R-1.8.1), I obtain error message: $ Rcmd shlib testpermut.c making adesub.d from adesub.c making testpermut.d from testpermut.c gcc -Ic:/Rdev/R-1.8.1/src/include -Wall -O2 -c testpermut.c -o testpermut.o ar cr testpermut.a *.o ranlib testpermut.a gcc --shared -s -o testpermut.dll testpermut.def testpermut.a -Lc:/Rdev/R-1.8 .1/src/gnuwin32 -lg2c -lR testpermut.a(testpermut.o.b)(.text+0x35):testpermut.c: undefined reference to `taballoc' testpermut.a(testpermut.o.b)(.text+0x49):testpermut.c: undefined reference to `taballoc' testpermut.a(testpermut.o.b)(.text+0x62):testpermut.c: undefined reference to `taballoc' testpermut.a(testpermut.o.b)(.text+0x14c):testpermut.c: undefined reference to `freetab' testpermut.a(testpermut.o.b)(.text+0x156):testpermut.c: undefined reference to `freetab' testpermut.a(testpermut.o.b)(.text+0x160):testpermut.c: undefined reference to `freetab' testpermut.a(testpermut.o.b)(.text+0x192):testpermut.c: undefined reference to `taballoc' testpermut.a(testpermut.o.b)(.text+0x22b):testpermut.c: undefined reference to `taballoc' testpermut.a(testpermut.o.b)(.text+0x23f):testpermut.c: undefined reference to `vecalloc' testpermut.a(testpermut.o.b)(.text+0x24b):testpermut.c: undefined reference to `vecalloc' testpermut.a(testpermut.o.b)(.text+0x339):testpermut.c: undefined reference to `freevec' testpermut.a(testpermut.o.b)(.text+0x343):testpermut.c: undefined reference to `freevec' testpermut.a(testpermut.o.b)(.text+0x34d):testpermut.c: undefined reference to `freetab' testpermut.a(testpermut.o.b)(.text+0x412):testpermut.c: undefined reference to `taballoc' make: *** [testpermut.dll] Error 1 $ The functions taballoc, freetab, vecalloc and freevec are defined in adesub files. So it seems that gcc does not make the links between my files. If I include the problematic functions in testpermut.c, gcc works perfectly and my dll is created. Perhaps someone could explain me what is my problem although it is not an R problem but probably a misuse of gcc ?. Thanks in advance. Stéphane DRAY -- Département des Sciences Biologiques Université de Montréal, C.P. 6128, succursale centre-ville Montréal, Québec H3C 3J7, Canada Tel : 514 343 6111 poste 1233 E-mail : [EMAIL PROTECTED] -- Web http://www.steph280.freesurf.fr/ __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Robust nonlinear regression - sin(x)/x?
Dear Ravi Sorry, I forgot to mention that you have also indicated that I have only one parameter. Fitting using lm gives: c=1.023 w/o and c=1.146 with outliers. Maybe sin(x)/x was a bad example, how about trying to fit a polynomial of degree n? Best regards Christian Ravi Varadhan wrote: A small correction to my previous email: You actually specify the following call to lm: y - sin(x)/x lm(z ~ y - 1) to make sure that the intercept is not estimated. Ravi. - Original Message - From: Ravi Varadhan [EMAIL PROTECTED] Date: Monday, February 2, 2004 2:46 pm Subject: Re: [R] Robust nonlinear regression - sin(x)/x? You reall have only one parameter in your model, c = a/b. You can't identify both a and b from your model, therefore, you should fit the linear model: lm(z ~ c* sin(x)/x) Ravi. - Original Message - From: cstrato [EMAIL PROTECTED] Date: Monday, February 2, 2004 2:28 pm Subject: [R] Robust nonlinear regression - sin(x)/x? Dear all Since I did not receive any answer to my general question (?), let me ask a concrete question: How can I fit the simple function y = a*sin(x)/b*x? This is the code that I tried, but nls gives an error: x - seq(1,10,0.1) y - sin(x)/x plot(x,y) z - jitter(y,amount=0.1) plot(x,z) df - as.data.frame(cbind(x,z)) nf - nls(z ~ a*sin(x)/b*x, data=df, start=list(a=0.8,b=0.9), trace = TRUE) I have followed the Puromycin sample which works fine: Pur.wt - nls(rate ~ (Vm * conc)/(K + conc), data = Treated, start = list(Vm = 200, K = 0.1), trace = TRUE) Do I make some mistake or is it not possible to fit sin(x)/x? Thank you in advance Best regards Christian _._._._._._._._._._._._._._._._ C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a V.i.e.n.n.a A.u.s.t.r.i.a _._._._._._._._._._._._._._._._ cstrato wrote: Dear R experts This is a general question: Does R have functions for nonlinear robust regression, analogous to e.g. LTS? Searching google I have found 1, an abstract to generalize LTS for nonlinear regression models, see: http://smealsearch.psu.edu/1509.html 2, an AD-model builder, see: http://otter- rsch.com/admodel/cc1.html but no mention of R/S Thank you in advance Best regards Christian _._._._._._._._._._._._._._._._ C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a V.i.e.n.n.a A.u.s.t.r.i.a _._._._._._._._._._._._._._._._ __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R- project.org/posting- guide.html __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting- guide.html __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] mvrnorm problem
Stuart V Jordan [EMAIL PROTECTED] writes: mvrnorm(n = 1000,B,V) Error in mu + eS$vectors %*% diag(sqrt(pmax(ev, 0)), p) %*% t(X) : non-conformable arrays mvrnorm(n = 1000,t(B),V) Error in mu + eS$vectors %*% diag(sqrt(pmax(ev, 0)), p) %*% t(X) : non-conformable arrays You might, for at least two good reasons, have said that this is from library(MASS). The point is that mvrnorm(n=10,matrix(c(1,1),1,2),diag(2)) Error in mu + eS$vectors %*% diag(sqrt(pmax(ev, 0)), p) %*% t(X) : non-conformable arrays mvrnorm(n=10,matrix(c(1,1),2,1),diag(2)) Error in mu + eS$vectors %*% diag(sqrt(pmax(ev, 0)), p) %*% t(X) : non-conformable arrays mvrnorm(n=10,c(1,1),diag(2)) [,1] [,2] [1,] 0.5005327 1.1919216 [2,] 2.8273925 2.7004788 [3,] 2.6493970 1.1304274 and the docs quite clearly say that mu wants to be a vector, not a matrix. Curiously enough, this works with rmvnorm from the mvtnorm package by Genz, Bretz, and Hothorn, the difference being that this version adds in the means with a sweep() operation, whereas mvrnorm just adds mu (to the transpose of the ultimate result) and relies on recycling rules. I.e. the point is that x - matrix(1:2,1,2) M - matrix(1:4,2) x+M Error in x + M : non-conformable arrays t(x)+M Error in t(x) + M : non-conformable arrays c(x)+M [,1] [,2] [1,]24 [2,]46 sweep(M,1,x,+) [,1] [,2] [1,]24 [2,]46 -- O__ Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] sorting by date
Hello, I have set up a data.frame and one of the columns contains a date of the form (with slashes as separators): mm/dd/ I would like to use formulas on other columns in the data.frame organized by date, for example: tapply(var1, sort(date), mean) However, when I try sort(date) it sorts based on the first two entries in the date field: 9/1/20019/1/20029/1/20039/2/2001 ... 5.6 7.5 6.4 7.0 ... Instead of: 9/1/20019/2/20019/3/20019/4/2001 ... 5.6 6.1 7.2 6.8 ... I would greatly appreciate any help in sorting chronologically. Do I need to create separate columns for month, day, and year, and then use order() and then stipulate the hierarchy for which to sort the output? Or, is there some other more efficient way? Thanks, Jeff __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Robust nonlinear regression - better example
Dear all Here is a hopefully better example with regards to nonlinear robust fitting: # fitting a polynomial: x - seq(-10,10,0.2) y - 10*x + 4*x*x - 2*x*x*x plot(x,y) z - jitter(y,amount=300) plot(x,z) df - as.data.frame(cbind(x,z)) nf - nls(z ~ a*x + b*x*x + c*x*x*x, data=df, + start=list(a=4,b=2,c=1), trace = TRUE) 127697531 : 4 2 1 2974480 : 10.972123 3.793426 -1.942278 # introducing outliers before fitting the polynomial: z1 - z z1[c(16,22,23,34,36,42,67,69,72,76)] - + c(2000,1900,2000,1900,1600,1600,500,-2000,-1700,-1800) plot(x,z1) df1 - as.data.frame(cbind(x,z1)) nf1 - nls(z1 ~ a*x + b*x*x + c*x*x*x, data=df1, + start=list(a=4,b=2,c=1), trace = TRUE) 159359174 : 4 2 1 24098548 : -59.053288 4.169518 -1.072027 # plotting the results: y1 - 10.97*x + 3.79*x*x - 1.94*x*x*x y2 - -59.05*x + 4.17*x*x - 1.07*x*x*x oldpar - par(pty=s,mfrow=c(2,2),mar=c(5,5,4,1)) plot(x,y) plot(x,z1) plot(x,y1) plot(x,y2) par(oldpar) In my opinion this fit could hardly be considered to be robust. Are there functions in R which can do robust fitting? (Sorrowly, at the moment I could not test the package nlrq mentioned by Roger Koenker) Best regards Christian _._._._._._._._._._._._._._._._ C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a V.i.e.n.n.a A.u.s.t.r.i.a _._._._._._._._._._._._._._._._ __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Nearest Neighbor Algorithm in R -- again.
Why not modify the C code of knn? (Not knn1) On Mon, 2 Feb 2004, Hans W Borchers wrote: Several of the methods I use for analyzing large data sets, such as WinGamma: determining the level of noise in data Relief-F: estimating the influence of variables depend on finding the k nearest neighbors of a point in a data frame or matrix efficiently. (For large data sets it is not feasible to compute the 'dist' matrix anyway.) Seeing the proposed solution to [R] distance between two matrices last month for finding _one_ nearest neighbor I came up with a solution 'nearest(A, n, k)' as appended. Still, this is very clumsy and slow if you have to find the 3 nearest neighbors for 1000 points in a data frame with 10 entries at least -- about 2 secs per data point on my computer or half an hour for an application from real life. Are there better/faster ways to perform this task using R functions? Even better, is there a free implementation of kd-trees that I could utilize (the one I found did not conform to the ANSI C standard)? Someome pointed to 'spdep' of the R-Sig-Geo project, but 'knearneigh' only accepts 2D data points. Hans Werner Borchers ABB Corporate Research, Germany require (class) nearest - function (X, n, k=3) ## Find k nearest neighbors of X[n, ] in the data frame ## or matrix X, utilizing function knn1 k-times. { N - nrow(X) # inds contains the indices of nearest neighbors inds - c(n); i - 0 while (i k) { # use knn1 for one index... j - as.integer(knn1(X [-inds, ], X[n, ], 1:(N-length(inds # ...and change to true index of neighbor inds - c(inds, setdiff(1:N, inds)[j]) i - i+1 } # return nearest neighbor indices (without n, of course) return(inds[-1]) } __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] ordering and plotting question
On Mon, 2004-02-02 at 14:13, Simon Melov wrote: Hi, I am trying to plot several rows out of a list of thousands. I have 40 columns, and about 16,000 rows with the following Df structure. ID X01 X02 X03..X40 AI456 45 64 23... AI943 14 3 45 .. AI278 78 12 68.. BW768 -2 -7 34.. ... My question is, I have a list of 100 IDs generated elsewhere (Df-Ofinterest), I would like to plot the 100 IDs from that data frame over the 40 columns (40 points per ID, with each column being 1 x value). But I cant figure out how to retrieve the values from the main Df and plot them. Ive tried looking at order, rank etc, but these seem to apply to numbers, and not text strings. thanks Simon. I am not entirely sure what type of plot you want, however the following will enable you to subset the main dataframe, based upon matching ID's: # Create an example vector of the ID's you want df.index - c(AI456, AI278, BW768) df.index [1] AI456 AI278 BW768 # Create a df with the subset of data you have above df - data.frame(ID = c(AI456, AI943, AI278, BW768), X01 = c(45, 14, 78, -2), X02 = c(64, 3, 12, -7), X03 = c(23, 45, 68, 34)) df ID X01 X02 X03 1 AI456 45 64 23 2 AI943 14 3 45 3 AI278 78 12 68 4 BW768 -2 -7 34 # Use which() and %in% to get a vector containing the # row indices in df that match the three entries in df.index found - which(df$ID %in% df.index) found [1] 1 3 4 # Now subset df to include only those rows that match MySubset - df[found, ] MySubset ID X01 X02 X03 1 AI456 45 64 23 3 AI278 78 12 68 4 BW768 -2 -7 34 MySubset now contains only those rows that match based upon your IDs in the index vector. If you know how to generate the plot you want from here, have at it. If not, let me know what you wish to do and I can help further. See ?which and ?%in% for more information. Also, see ?subset for additional ways to subset dataframes using more complex logicals. HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] sorting by date
Convert to POSIXct and sort. Note that tapply will coerce to a factor, so you need to create a factor with the levels sorted as you want them: just sorting date will not help. Something like udate - unique(date) lev - udate[sort.list(as.POSIXct(strptime(udate, %m/%d/%Y)))] date - factor(date, levels=lev) On Mon, 2 Feb 2004, Jeff Jorgensen wrote: I have set up a data.frame and one of the columns contains a date of the form (with slashes as separators): mm/dd/ I would like to use formulas on other columns in the data.frame organized by date, for example: tapply(var1, sort(date), mean) However, when I try sort(date) it sorts based on the first two entries in the date field: 9/1/2001 9/1/20029/1/20039/2/2001 ... 5.6 7.5 6.4 7.0 ... Instead of: 9/1/2001 9/2/20019/3/20019/4/2001 ... 5.6 6.1 7.2 6.8 ... I would greatly appreciate any help in sorting chronologically. Do I need to create separate columns for month, day, and year, and then use order() and then stipulate the hierarchy for which to sort the output? Or, is there some other more efficient way? -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] sorting by date
Jeff Jorgensen [EMAIL PROTECTED] writes: Hello, I have set up a data.frame and one of the columns contains a date of the form (with slashes as separators): mm/dd/ I would like to use formulas on other columns in the data.frame organized by date, for example: tapply(var1, sort(date), mean) I don't think that does what I think you think it does! However, when I try sort(date) it sorts based on the first two entries in the date field: 9/1/2001 9/1/20029/1/20039/2/2001 ... 5.6 7.5 6.4 7.0 ... Instead of: 9/1/2001 9/2/20019/3/20019/4/2001 ... 5.6 6.1 7.2 6.8 ... I would greatly appreciate any help in sorting chronologically. Do I need to create separate columns for month, day, and year, and then use order() and then stipulate the hierarchy for which to sort the output? Or, is there some other more efficient way? You now know why the ISO standard has -mm-dd ... It's a bit awkward, but I think you need something like pdate - as.POSIXct(strptime(date,%m/%d/%Y)) tapply(var1, pdate, mean) -- O__ Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Robust nonlinear regression - better example
1. The question of linear vs. nonlinear means linear in the parameters to be estimated. All the examples you have given so far are linear in the parameters to be estimated. The fact that they are nonlinear in x is immaterial. 2. With this hint and the posting guide http://www.R-project.org/posting-guide.html;, you may find more information. A search there exposed much discussion of robust regression and even robust nonlinear regression, if you actually still need that. In addition, I found useful information on robust regression in Venables and Ripley (2002) Modern Applied Statistics with S, 4th ed. (Springer). hope this helps. spencer graves cstrato wrote: Dear all Here is a hopefully better example with regards to nonlinear robust fitting: # fitting a polynomial: x - seq(-10,10,0.2) y - 10*x + 4*x*x - 2*x*x*x plot(x,y) z - jitter(y,amount=300) plot(x,z) df - as.data.frame(cbind(x,z)) nf - nls(z ~ a*x + b*x*x + c*x*x*x, data=df, + start=list(a=4,b=2,c=1), trace = TRUE) 127697531 : 4 2 1 2974480 : 10.972123 3.793426 -1.942278 # introducing outliers before fitting the polynomial: z1 - z z1[c(16,22,23,34,36,42,67,69,72,76)] - + c(2000,1900,2000,1900,1600,1600,500,-2000,-1700,-1800) plot(x,z1) df1 - as.data.frame(cbind(x,z1)) nf1 - nls(z1 ~ a*x + b*x*x + c*x*x*x, data=df1, + start=list(a=4,b=2,c=1), trace = TRUE) 159359174 : 4 2 1 24098548 : -59.053288 4.169518 -1.072027 # plotting the results: y1 - 10.97*x + 3.79*x*x - 1.94*x*x*x y2 - -59.05*x + 4.17*x*x - 1.07*x*x*x oldpar - par(pty=s,mfrow=c(2,2),mar=c(5,5,4,1)) plot(x,y) plot(x,z1) plot(x,y1) plot(x,y2) par(oldpar) In my opinion this fit could hardly be considered to be robust. Are there functions in R which can do robust fitting? (Sorrowly, at the moment I could not test the package nlrq mentioned by Roger Koenker) Best regards Christian _._._._._._._._._._._._._._._._ C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a V.i.e.n.n.a A.u.s.t.r.i.a _._._._._._._._._._._._._._._._ __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] problems when compiling C code
On Mon, 2 Feb 2004, Stephane DRAY wrote: Hello, I would like to call C code from R. My C code is divided in two files. In the file testpermut.c, I have the following lines: #include adesub.h In my working folder, I have the files: - adesub.c which contains general functions - adesub.h with the header of functions contained in adesub.c - testpermut.c which call some functions defined in adesub.c When I try to create my dll (Work on Windows XP, R-1.8.1), I obtain error message: $ Rcmd shlib testpermut.c You only included one of the files. You need Rcmd SHLIB testpermut.c adesub.c I believe. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] problems when compiling C code
Hello, Do you have S Programming by Venables and Ripley, Springer, 2000? There is an excellent discussion and examples there on compiling multifile codes. I think your problem is in the order of the compilation of the multiple files. Best, /oal __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Re: packages
Hi, I am trying to make my own package of R functions, datasets and help files, which were originally in S and have been converted. As a Unix user, I am trying to make a package that installs on Windows and I am having some trouble. I have a zip file that seems to unzip fine, but I have to use the source command to make functions accessible. e.g. source(C:/Program Files/R/rw1081/library/slab/R/slab.q) Is there a command that make happen? or is the .q extention causing trouble? It turns out that the data() command works fine, but help files are not accessible either. Thanks for your help, Barb Bailey __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] glm.poisson.disp versus glm.nb
Dear list, This is a question about overdispersion and the ML estimates of the parameters returned by the glm.poisson.disp (L. Scrucca) and glm.nb (Venables and Ripley) functions. Both appear to assume a negative binomial distribution for the response variable. Paul and Banerjee (1998) developed C(alpha) tests for interaction and main effects, in an unbalanced two-way layout of counts involving two fixed factors, when data are Poisson distributed, and when data are extra dispersed. In R I coded their C(alpha) test statistic (called TNBI) for interaction for the case where the counts are extra-dispersed, as well as their test for extra-dispersion (called T_a). Using the Quine data set (Quine, 1975) the authors collapse the orginal 4x2x2x2 study design into a 2x4 table and obtained estimates of TNBI=10.36 and T_a=90.81. Using the dispersion estimate from glm.poisson.disp and the estimates for mu I got exactly the same results for TNBI and T_a. This made me happy. Now I thought to try the ML estimates from glm.nb to see if the results would be the same but I am having difficulty relating the dispersion phi from glm.poisson.disp to theta estimated by glm.nb. According to the R help for glm.poisson.disp Var(y_i) = mu_i(1+mu_i*phi) . The help for glm.nb lead me to a book by VR (1994) which indicates that Var(y)=mu+mu^2/theta. From this I gathered that phi=1/theta but the estimates do not relate to each other in this way unless one is in error. In a document by L.P. Ammann he says a negative binomial model can be specified with mean mu and dispersion phi by taking theta=mu/(phi-1). I had a problem implementing this because in my mind mu is a vector whereas phi and theta are scalars. Consequently, I would like to know how to get phi from theta so that I can compare the glm.poisson.disp and glm.nb methods for estimating dispersion. Regards, Alex Alex Hanke Department of Fisheries and Oceans St. Andrews Biological Station 531 Brandy Cove Road St. Andrews, NB Canada E5B 2L9 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] how to label plots?
Hi With main=Title I can write centered above the plot. Is there also a way to write into the left (or right) upper corner? I'd like to label my plots by (a), (b), ... Thanks, David __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Re: packages
On Mon, Feb 02, 2004 at 05:35:12PM -0600, Barbara Bailey wrote: I am trying to make my own package of R functions, datasets and help files, which were originally in S and have been converted. As a Unix user, I am trying to make a package that installs on Windows and I am having some trouble. AFAIK that is not guaranteed to work. Generally speaking, for any given system, binary R packages (as opposed to source packages) are built on that architecture. There are exceptions, most notably the cross-builds for Windows i386 that can be generated, given a suitable environment, on a Linux system. This is documented in a few places; google should find it. For the normal case, your best bet is to read some more of the fine 'Writing R Extensions' manual, and to possibly study some of the examples provided by the over 300 source packages available on CRAN. Hth, Dirk -- The relationship between the computed price and reality is as yet unknown. -- From the pac(8) manual page __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] output from multcomp and lm
Dear R-users (B (BI analysed the same data set by two different ways; (Banalysis of covariance by using lm and anova functions (Band multiple comparison by using simtest function in (Bthe multcomp library. (B (BThe output from the analysis of covariance is; (B (B y-lm(D~Cond+Q1,data=x) (B anova(y) (BAnalysis of Variance Table (B (BResponse: D (B Df Sum Sq Mean Sq F valuePr(F) (BCond2 1017.8 508.9 4.7548 0.0135041 * (BQ1 1 1652.7 1652.7 15.4417 0.0002969 *** (BResiduals 44 4709.2 107.0 (B--- (BSignif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 (B (Bwhere Cond is a factor with three levels (A,B,C) (Band Q1 is a covariate. (B (B (B (BNow, simtest showed the following output (B (B o5-summary(simtest(D~Cond+Q1,conf.level=0.95,data=x,type="Tukey")) (B o5 (B (B Simultaneous tests: Tukey contrasts (B (BCall: (Bsimtest.formula(formula = D ~ Cond + Q1, data = x, conf.level = 0.95, (Btype = "Tukey") (B (B Tukey contrasts for factor Cond, covariable: Q1 (B (BContrast matrix: (B CondA CondB CondC (BCondB-CondA 0-1 1 0 0 (BCondC-CondA 0-1 0 1 0 (BCondC-CondB 0 0-1 1 0 (B (B (BAbsolute Error Tolerance: 0.001 (B (BCoefficients: (BEstimate t value Std.Err. p raw p Bonf p adj (BCondB-CondA5.555 -1.4613.802 0.151 0.453 0.319 (BCondC-CondB -5.248 -1.3653.661 0.179 0.453 0.319 (BCondC-CondA0.306 -0.0843.844 0.934 0.934 0.934 (B (BThe results from two analyses seem so different that I am (Bwondering why. I do understand that multiple comparison may (Bnot show any significant difference even when the overall analysis (Bof (co)variance shows the statistical significance of a factor. (B (BHowever, in my analysis, overall analysis showed statistical significance of (B1.4% level and mutiple comparison showed significance of 32% level (BCould this happen? and why? Please enlighten me. (B (BSincerely (B (BHiroto Miyoshi $B;09%90?M(B (B[EMAIL PROTECTED] (B (B__ (B[EMAIL PROTECTED] mailing list (Bhttps://www.stat.math.ethz.ch/mailman/listinfo/r-help (BPLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] how to label plots?
On Mon, 2004-02-02 at 19:28, Marc Schwartz wrote: You can play with the outer margin settings for more space if you wish and the 'adj' argument in mtext() manipulates the positioning of the text at the x,y coordinate of line, at. Oops...That last line should read: text at the x,y coordinate of at, line. Sorry. Marc __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] how to label plots?
On Mon, 2004-02-02 at 17:25, David Andel wrote: Hi With main=Title I can write centered above the plot. Is there also a way to write into the left (or right) upper corner? I'd like to label my plots by (a), (b), ... Thanks, David # Create a basic plot plot(1:5) # Now set the outer plot margin to 1 line at the top # and 0 for bottom, left and right # See ?par for more information par(oma = c(0, 0, 1, 0)) # Now use mtext() to place text in the outer margin # The outer margin coordinates go from 0 to 1 # See ?mtext for more information #For the upper left hand corner, use the following: mtext((a), side = 3, line = 0, at = 0, outer = TRUE, adj = 0) #For the upper right hand corner, use the following: mtext((b), side = 3, line = 0, at = 1, outer = TRUE, adj = 1) You can play with the outer margin settings for more space if you wish and the 'adj' argument in mtext() manipulates the positioning of the text at the x,y coordinate of line, at. HTH, Marc Schwartz __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] running R from PHP
I would like to construct a PHP script that runs R to generate a graphics file. Running R itself is no problem. However, it seems impossible to instantiate one of the graphics devices to create output. For example, the normal bitmap devices (e.g., jpeg, png, etc.) are derived from X11, which requires a display. This seems true, even if no output is ever directed to a real display. For some reason, the postscript device seems to suffer from similar problems. Is there a trick to creating a graphics device in the absence of an actual display in order to create an image in a file? Thanks for your help. Cheers, Brook __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] running R from PHP
@biology.nmsu.edu [EMAIL PROTECTED] wrote: I would like to construct a PHP script that runs R to generate a graphics file. Running R itself is no problem. However, it seems impossible to instantiate one of the graphics devices to create output. For example, the normal bitmap devices (e.g., jpeg, png, etc.) are derived from X11, which requires a display. This seems true, even if no output is ever directed to a real display. For some reason, the postscript device seems to suffer from similar problems. Is there a trick to creating a graphics device in the absence of an actual display in order to create an image in a file? If you need a bitmap graphic file, I would suggest the use of ImageMagick: cunegonde:~/tmp ls foo cunegonde:~/tmp cat foo pdf(file=g.pdf) plot(1:5) dev.off() cunegonde:~/tmp R --no-save foo/dev/null convert g.pdf g.png cunegonde:~/tmp ls -g -rw---1 glaziou 3374 2004-02-03 11:58 g.pdf -rw---1 glaziou 4115 2004-02-03 11:58 g.png -rw---1 glaziou80 2004-02-03 11:58 foo This works from a unix console without X running (the postcript device works similarly on my machine). R can easily be fed this way with a file and parameters passed from a php script. -- Philippe Glaziou, MD Epidemiologist Institut Pasteur du Cambodge __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Nearest Neighbor Algorithm in R -- again.
/Why not modify the C code of knn? (Not knn1)/ Because I thought it would not be a good idea to apply 'knn' a 1000 times or more. Instead I was looking for a procedure that returns a structure making it easy to find nearest neighbors for any point one wants. This way I used 'dist' for smaller data sets. The next step could be to extent nearest neighbors to data frames with factors generalizing, for example, the 'daisy' function. But you might be right that looking at the 'knn' implementation will be a faster road for the moment. Hans Werner Borchers ABB Corporate Research, Germany __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html