[R] Odp: Superposing mean line to xyplot
> Hi Dear R-users,
>
> I'm using the lattice package and the function xyplot for the first time, so please excuse my inexperience. I'm facing quite a simple problem but I'm having trouble solving it. I've read tons of old mails in the archives and looked at some slides from Deepayan Sarkar, but I still cannot get the point.
>
> This is the context. I've got data on 9 microRNAs; each miRNA has been measured on three different arrays, and on each array I have 4 replicates for each miRNA, which sums to a total of 108 measurements. I suspect that measurements on the first array are systematically lower than the others, so I wanted to draw a line plot where each panel corresponds to a miRNA and each line corresponds to one of the four replicates (that is, the first replicate of miRNA A on array 1 must be connected to the first replicate of miRNA A on array 2, and so on), so that each panel has 4 series of three points connected by a line/segment. I've done this easily with lattice:
>
>     array = rep(c("A","B","C"), each = 36)                      # array replicate
>     spot = rep(1:4, 27)                                         # miRNA replicate on each array
>     miRNA = rep(rep(paste("miRNA", 1:9, sep="."), each=4), 3)   # miRNA label
>     exprs = rnorm(mean=2.8, n = 108)                            # intensity
>     data = data.frame(miRNA, array, spot, exprs)
>     xyplot(exprs ~ array|miRNA, data=data, type="b", groups=spot, xlab="Array", ylab="Intensity", col="black", lty=2:5, scales = list(y = list(relation = "free")))
>
> Now I want to superpose on each panel another series of three points connected by a line, where each point represents the mean of the four replicates of the miRNA on each array: a sort of mean line.
> I've tried using the following, but it's not working as expected:
>
>     xyplot(exprs ~ array|miRNA, data=array, type="b", groups=spot, xlab="Array", ylab="Intensity", col="black", lty=2:5, scales = list(y = list(relation = "free")),
>            panel = function(x,y,groups,subscripts){
>                panel.xyplot(x,y,groups=groups,subscripts=subscripts)
>                panel.superpose(x,y,panel.groups=panel.average,groups=groups,subscripts=subscripts)
>            })
>
> This is maybe a silly question and possibly there's a trivial way to do it, but I cannot figure it out.

With some help I made the function addLine:

    # based on Gabor Grothendieck's code suggestion
    # adds straight lines to panels in lattice plots
    addLine <- function(a=NULL, b=NULL, v = NULL, h = NULL, ..., once=F) {
        tcL <- trellis.currentLayout()
        k <- 0
        for(i in 1:nrow(tcL))
            for(j in 1:ncol(tcL))
                if (tcL[i,j] > 0) {
                    k <- k+1
                    trellis.focus("panel", j, i, highlight = FALSE)
                    if (once) panel.abline(a=a[k], b=b[k], v=v[k], h=h[k], ...) else panel.abline(a=a, b=b, v=v, h=h, ...)
                    trellis.unfocus()
                }
    }

    addLine(h=tapply(data$exprs, miRNA, mean), once=T)

Regards
Petr

> Thanx for any help.
> niccolò

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
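addLine() as posted draws one horizontal line per panel (the overall mean of each miRNA). For a mean line that follows the three arrays, a panel function along the following lines may be closer to what was asked. This is only a sketch and not code from the thread: the data are simulated as in the question, `dat` is a made-up name, the red colour is arbitrary, and it assumes that lattice's panel.average(), given all points of a panel, joins the per-array means.

```r
library(lattice)

set.seed(1)
# Simulated data with the same layout as in the question (values are random).
dat <- data.frame(
  miRNA = rep(rep(paste("miRNA", 1:9, sep = "."), each = 4), 3),
  array = factor(rep(c("A", "B", "C"), each = 36)),
  spot  = rep(1:4, 27),
  exprs = rnorm(108, mean = 2.8)
)

p <- xyplot(exprs ~ array | miRNA, data = dat, groups = spot,
            xlab = "Array", ylab = "Intensity",
            scales = list(y = list(relation = "free")),
            panel = function(x, y, groups, subscripts, ...) {
              # one dashed line per replicate spot
              panel.superpose(x, y, groups = groups, subscripts = subscripts,
                              type = "b", col = "black", lty = 2:5)
              # mean of the four replicates at each array, drawn on top
              panel.average(x, y, fun = mean, horizontal = FALSE,
                            col = "red", lwd = 2)
            })
print(p)
```

The panel function ignores the grouping when calling panel.average(), so the average at each array is taken over all four spots of that panel.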
[R] extra digits added to data
I am having a problem with extra digits being added to my data, which I think is a result of how I am converting my data.frame to xts. I see the same issue in R v2.13.1 and RStudio version 0.94.106.

I am loading historical foreign exchange data from csv files or from a SQL Server database. In both cases there are no extra digits, and the original data looks like the following:

            Date   Open   High    Low  Close
    1 2001-01-03 1.5021 1.5094 1.4883 1.4898
    2 2001-01-04 1.4897 1.5037 1.4882 1.5020
    3 2001-01-05 1.5020 1.5074 1.4952 1.5016
    4 2001-01-08 1.5035 1.5104 1.4931 1.4964
    5 2001-01-09 1.4964 1.4978 1.4873 1.4887
    6 2001-01-10 1.4887 1.4943 1.4856 1.4866

So for 2001-01-03 the Open value is 1.5021, with only 4 digits after the decimal point (.5021). I then do the following in R to convert the 'british pound' data above from data.frame to xts:

    require(quantmod)
    rownames(gbp) <- gbp$Date
    head(gbp)
                 Open   High    Low  Close
    2001-01-03 1.5021 1.5094 1.4883 1.4898
    2001-01-04 1.4897 1.5037 1.4882 1.5020
    2001-01-05 1.5020 1.5074 1.4952 1.5016
    2001-01-08 1.5035 1.5104 1.4931 1.4964
    2001-01-09 1.4964 1.4978 1.4873 1.4887
    2001-01-10 1.4887 1.4943 1.4856 1.4866
    gbp <- as.xts(gbp[,2:5])
    class(gbp)
    [1] "xts" "zoo"

The data at this point looks OK until you look closer or output the data to Excel, at which point you see the following for the 'Open' on 2001-01-03: 1.5020084473

It is not just the above 'Open' or the first value: all the data points contain the extra digits, which I think are the original date data and/or row numbers being tacked on.

My problem is the extra digits being added, or whatever I am doing wrong in R to cause the extra digits to be added. I need 1.5021 to be 1.5021 and not 1.5020084473.

Thanks for the help.
Re: [R] SLOW split() function
As another followup, given that you are doing numerous regression models and (I presume) working with finance/stock data that is strictly numeric (no need for special contrast coding, etc.), you can substantially reduce the time spent estimating the coefficients. A simple way is to use lm.fit directly instead of lm. For lm.fit, you pass the y and x (design) matrices directly. This skips a good deal of overhead.

Here is one naive way; I imagine more speedups could be gained by incorporating the intercept (a vector of 1s) into d instead of cbind()ing it. The catch is that lm.fit requires matrices, not data tables, so what you gain may be lost in having to do an extra conversion. In any case, here are the times on my system for the two options (note I used N = 1000 * 100 because I am presently on a glorified netbook).

    print(system.time(all.2b <- lapply(si, function(.indx) { coef(lm(y ~ x, data=d[.indx,])) })))
       user  system elapsed
      69.00    0.00   69.56

    print(system.time(all.2c <- lapply(si, function(.indx) { coef(lm.fit(y = d[.indx, y], x = cbind(1, d[.indx, x]))) })))
       user  system elapsed
      37.83    0.03   38.36

The column names for the coefficients will not be the same as from lm, but the estimates should be identical. While this is not recommended in typical usage, in an application like regressions on rolling time windows, where you know the data are not changing, I think it makes sense to bypass the clever "determine your data and best methods to use" step and go straight to passing the design matrix. Since you do not need residuals, variances, etc., it may be possible to speed this up even more, perhaps bypassing dqrls altogether.

Cheers,
Josh

On Mon, Oct 10, 2011 at 9:56 PM, ivo welch ivo.we...@gmail.com wrote:
> thank you, everyone. this was very helpful to my specific task and understanding. for the benefit of future googlers, I thought I would post some experiments and results here.
> ultimately, I need to do a by() on an irregular matrix, and I now know how to speed up by() on a single-core, and then again on a multi-core machine.
>
>     library(data.table)
>     N <- 1000*1000
>     d <- data.table(data.frame( key= as.integer(runif(N, min=1, max=N/10)), x=rnorm(N), y=rnorm(N) ))  # irregular
>     setkey(d, key); gc()  ## sort and force a garbage collection
>
>     cat("N=", N, ". Size of d=", object.size(d)/1024/1024, "MB\n")
>
>     cat("\nStandard by() Function:\n")
>     print(system.time( all.1 <- by( d, d$key, function(d) coef(lm(y ~ x, data=d)))))
>
>     cat("\n\nPreSplit Function [aka Jim H]\n\t(a) Splitting Operation:\n")
>     print(system.time(si <- split(seq(nrow(d)), d$key)))
>     cat("\n\t(b) Regressions:\n")
>     print(system.time(all.2 <- lapply(si, function(.indx) { coef(lm(d$y[.indx] ~ d$x[.indx])) })))
>     print(system.time(all.2b <- lapply(si, function(.indx) { coef(lm(y ~ x, data=d[.indx,])) })))
>
>     cat("\n\nNaive Split Data Frame\n\t(a) Splitting Operation:\n")
>     print(system.time(ds <- split(d, d$key)))
>     cat("\n\t(b) Regressions:\n")
>     print(system.time(all.3a <- lapply(ds, function(ds) { coef(lm(ds$y ~ ds$x)) })))
>     print(system.time(all.3b <- lapply(ds, function(ds) { coef(lm(y ~ x, data=ds)) })))
>
> the first and the last ways (all.1 and all.3) are naive ways of doing this, and take about 400-500 seconds on a Mac Air, core i5. Jim's suggestion (all.2) cuts this roughly in half by speeding up the split to take almost no time.
>
> and now,
>
>     library(multicore)
>     print(system.time(all.4 <- mclapply(si, function(.indx) { coef(lm(y ~ x, data=d[.indx,])) })))
>
> on my dual-core (quad-thread) i5, all four pseudo cores become busy, and the time roughly halves again from 230 seconds to 120 seconds.
>
> maybe the by() function should use Jim's approach, and multicore should provide mcby(). of course, knowing how to do this myself fast now by hand, this is not so important for me. but it may help some other novices. thanks again everybody.
> regards,
> /iaw
> Ivo Welch (ivo.we...@gmail.com)
>
> On Mon, Oct 10, 2011 at 9:31 PM, William Dunlap wdun...@tibco.com wrote:
>> The following avoids the overhead of data.frame methods (and assumes the data.frame doesn't include matrices or other data.frames) and relies on split(vector,factor) quickly splitting a vector into a list of vectors. For a 10^6 row by 10 column data.frame split into 10^5 groups this took 14.1 seconds while split took 658.7 s. Both returned the same thing.
>>
>> Perhaps something based on this idea would help your parallelized by().
>>
>>     mysplit.data.frame <- function (x, f, drop = FALSE, ...)
>>     {
>>         f <- as.factor(f)
>>         tmp <- lapply(x, function(xi) split(xi, f, drop = drop, ...))
>>         rn <- split(rownames(x), f, drop = drop, ...)
>>         tmp <- unlist(unname(tmp), recursive = FALSE)
>>         tmp <- split(tmp, factor(names(tmp), levels = unique(names(tmp))))
>>         tmp <- lapply(setNames(seq_along(tmp), names(tmp)), function(i) {
>>             t <- tmp[[i]]
>>             names(t) <- names(x)
>>             attr(t, "row.names") <- rn[[i]]
>>             class(t) <- "data.frame"
>>             t
>>         })
>>         tmp
>>     }
>>
>> Bill
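The two ideas in this thread (split the row indices once, then call lm.fit on each group) can be tried together on a small example. This is a minimal sketch, assuming a plain data.frame rather than the data.table used above; the sizes, the seed and the names `coef_lm` and `coef_fit` are made up for illustration, and no timings are claimed.

```r
set.seed(1)
n <- 1000
d <- data.frame(key = sample(1:20, n, replace = TRUE),
                x = rnorm(n), y = rnorm(n))

# Split the row indices once instead of splitting the data frame itself.
si <- split(seq_len(nrow(d)), d$key)

# Per-group regression via the formula interface.
coef_lm <- lapply(si, function(i) coef(lm(y ~ x, data = d[i, ])))

# Same regressions via lm.fit, passing the design matrix (intercept column + x) directly.
coef_fit <- lapply(si, function(i) coef(lm.fit(x = cbind(1, d$x[i]), y = d$y[i])))

# The estimates agree; only the coefficient names differ.
all.equal(unname(unlist(coef_lm)), unname(unlist(coef_fit)))
```

Wrapping either lapply() call in system.time() on your own data shows whether the saving is worth the loss of the formula interface.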
[R] Labels in ICLUST
Dear all,

I can't get the labels argument in ICLUST to accept a character vector.

    library(psych)
    test.data <- Harman74.cor$cov
    ic.out <- ICLUST(test.data, nclusters = 4, labels = letters[1:ncol(test.data)])
    ## Error in !labels : invalid argument type
    ic.out <- ICLUST(test.data, nclusters = 4, labels = 1:ncol(test.data))
    ## OK

Any ideas?
[R] binding all elements of list (character vectors) to a matrix as rows
dear r-users,

i have got a problem which i am trying to solve. i have got the following commands:

    Mymatrix <- matrix(1:9,ncol=3)
    Z <- list(V1=c("a","",""),V2=c("b","",""),V3=c("c","",""),V4=c("d","",""))
    Mymatrix <- rbind(Mymatrix,Z[[1]],Z[[2]],Z[[3]],Z[[4]])

now this is working, but i would like to substitute Z[[1]],Z[[2]],Z[[3]],Z[[4]] for a command with which i could also use another list with a different number of elements, e.g. 5 or 6 elements.

does anyone know the solution to this problem?

thank you very much in advance!

marion
Re: [R] binding all elements of list (character vectors) to a matrix as rows
Marion,

try

    rbind(Mymatrix, do.call(rbind, Z))

Hth -- Gerrit

On Tue, 11 Oct 2011, Marion Wenty wrote:
> [...] i would like to substitute Z[[1]],Z[[2]],Z[[3]],Z[[4]] for a command with which i could also use another list with a different number of elements, e.g. 5 or 6 elements. [...]
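To see that the do.call() form does not depend on the length of the list, here is a small check with a five-element list. It is a sketch only: the fifth element V5 and the name `out` are added for illustration and are not part of the original question.

```r
Mymatrix <- matrix(1:9, ncol = 3)
Z <- list(V1 = c("a", "", ""), V2 = c("b", "", ""), V3 = c("c", "", ""),
          V4 = c("d", "", ""), V5 = c("e", "", ""))

# do.call() passes every element of Z to rbind() as a separate argument,
# so the same line works for 4, 5 or 6 elements.
out <- rbind(Mymatrix, do.call(rbind, Z))
dim(out)
```

Because Z holds character vectors, the combined matrix is of type character, just as with the explicit rbind(Mymatrix, Z[[1]], ...) call.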
[R] Package/Function for Blending Images
Hi there,

Does someone know a package/function for blending two pictures or for adding transparency?

Thanks in advance,
KC

-
Kay Cichini
Postgraduate student
Institute of Botany
Univ. of Innsbruck
[R] warning with cut2 function
Dear r user,

please find attached a sample of the dataset I am using to create a cross table and eventually plot a histogram from the output. I am using the cut2 function to create bins, about 7 of them, with this code after reading the data:

    cluster <- cut2(cross_val$value, g=7)

I get the warning:

    Warning message:
    In min(xx[xx > upper]) : no non-missing arguments to min; returning Inf

Additionally, the bins become 6 instead of 7 when passed through the CrossTable function:

    cross1 <- CrossTable(cross_val$factor, cluster, prop.chisq=FALSE, prop.r=FALSE, prop.t=FALSE)

Please assist me to get my 7 bins. How can I plot the output of the cross table as a histogram of factor rate vs bins?

Any help will be highly appreciated.

Kind regards,
Taby

An idea not coupled with action will never get any bigger than the brain cell it occupied. Arnold Glasgow
Attempt something large enough that failure is guaranteed…unless God steps in!
Re: [R] binding all elements of list (character vectors) to a matrix as rows
On Oct 11, 2011, at 2:47 AM, Marion Wenty wrote:

> dear r-users, i have got a problem which i am trying to solve: i have got the following commands:
>
>     Mymatrix <- matrix(1:9,ncol=3)
>     Z <- list(V1=c("a","",""),V2=c("b","",""),V3=c("c","",""),V4=c("d","",""))

    rbind(Mymatrix, t(as.data.frame(Z)))

The next method could be used if you had more lists:

    do.call(rbind, list(Mymatrix, t(as.data.frame(Z))))

>     Mymatrix <- rbind(Mymatrix,Z[[1]],Z[[2]],Z[[3]],Z[[4]])
>
> now this is working, but i would like to substitute Z[[1]],Z[[2]],Z[[3]],Z[[4]] for a command with which i could also use another list with a different number of elements, e.g. 5 or 6 elements.

--
David Winsemius
[R] Vegan: Anova.CCA accessing original data using option by=margin
Hello,

I am attempting to use the anova.cca function with the by="margin" option. The process works fine using the by="terms" option, and I note in the Vegan manual that Jari suggests that an error may occur if the anova does not have access to the data on the original constraints. This is the error that I get:

    Error in dimnames(x) <- dn : length of 'dimnames' [2] not equal to array extent

My question is: does anyone know if this error relates to what Jari is referring to (or is it a different problem), and if it does, how do I link the anova to the original constraints?

Many thanks for any help provided.

Regards
Steve
[R] How to add a double quote to a string
Hi,

I want to add a double quote to a string, e.g.

Expected output = DROP TABLE IF EXISTS "abc"

My code:

    tab=c("abc")
    query = paste("DROP TABLE IF EXISTS ",tab,sep="")

Please help me to solve this problem.
Re: [R] How to add a double quote to a string
On 11-Oct-11 09:23, arunkumar wrote:
> Hi I want to add a double quote to a string, e.g.
> Expected output = DROP TABLE IF EXISTS "abc"
> My code
>     tab=c("abc")
>     query = paste("DROP TABLE IF EXISTS ",tab,sep="")

    query = paste("DROP TABLE IF EXISTS \"",tab,"\"", sep="")

or

    query = paste('DROP TABLE IF EXISTS "',tab,'"', sep="")

and tab="abc": no need for c().

Ciao!
mario

> Please help me to solve this problem

--
Ing. Mario Valle
Data Analysis and Visualization Group            | http://www.cscs.ch/~mvalle
Swiss National Supercomputing Centre (CSCS)      | Tel:  +41 (91) 610.82.60
v. Cantonale Galleria 2, 6928 Manno, Switzerland | Fax:  +41 (91) 610.82.82
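A quick way to check that both forms give the same string is shown below. The sprintf() variant is an addition of mine, not part of Mario's reply, and the names `q1`, `q2` and `q3` are only for illustration.

```r
tab <- "abc"

# Escaped double quotes inside a double-quoted string
q1 <- paste("DROP TABLE IF EXISTS \"", tab, "\"", sep = "")
# Double quotes inside a single-quoted string need no escaping
q2 <- paste('DROP TABLE IF EXISTS "', tab, '"', sep = "")
# sprintf() keeps the template in one piece
q3 <- sprintf('DROP TABLE IF EXISTS "%s"', tab)

identical(q1, q2)  # TRUE
identical(q1, q3)  # TRUE
cat(q1, "\n")      # DROP TABLE IF EXISTS "abc"
```

Note that print(q1) shows the quotes with backslashes because print() displays the escaped form; cat() writes the actual characters that would be sent to the database.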
Re: [R] Type of Graph to use
On 10/10/2011 09:49 PM, Jurgens de Bruin wrote:
> Hi,
>
> Please advise on what type of graph can be used to display the following data set. I have the following:
>
>     Name  Class
>     a     Class1
>     a     Class4
>     b     Class2
>     b     Class1
>     d     Class3
>     d     Class5
>     e     Class4
>     e     Class2
>
> So each entry in Name can belong to more than one class. I want to represent the data so as to see where overlaps occur, that is, which names are in the same Class and also which names are unique to a Class. I thought a Venn diagram would work, but this can only present numerical values for each Class; I would like each name to be presented by a dot or *.

Hi Jurgens,

Have a look at the intersectDiagram function in the plotrix package. This only plots the number of cases in each intersection, but it would be possible to plot dots or asterisks or even the lower case letters as long as there are not too many cases.

Jim
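Before choosing a diagram, the overlaps can also be inspected numerically in base R. The following is a rough sketch using the eight rows from the question; it does not use plotrix, and the object names `memberships`, `by_class`, `incidence` and `shared` are made up for illustration.

```r
memberships <- data.frame(
  Name  = c("a", "a", "b", "b", "d", "d", "e", "e"),
  Class = c("Class1", "Class4", "Class2", "Class1",
            "Class3", "Class5", "Class4", "Class2"),
  stringsAsFactors = FALSE
)

# Which names fall into each class
by_class <- split(memberships$Name, memberships$Class)

# Name-by-class incidence matrix (1 = name belongs to class)
incidence <- unclass(table(memberships$Name, memberships$Class))

# Number of classes shared by each pair of names
shared <- tcrossprod(incidence)

by_class$Class1   # "a" "b"
shared["a", "d"]  # 0: a and d have no class in common
```

The `incidence` matrix could then be drawn with dots or letters in place of counts, which is roughly the display asked for in the question.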
[R] Perform 20 x one-way anova in 1 go
Hi Guys,

I have about 20 continuous predictors and I want to do a one-way ANOVA to check the significance of each variable against the dependent variable. Apart from running the ANOVA 20 times, is there a faster way?

Thanks,
Joshua
Re: [R] How to test if two C statistics are significantly different?
Hi Yujie,

there is still a lot of work in progress, I think. As http://faculty.washington.edu/heagerty/Software/SurvROC/RisksetROC/risksetROCdiscuss.pdf states: "[...] for inference and variance estimation, we now suggest bootstrapping [...]".

Recently I caught a glimpse of roc.test from the pROC package; they implemented, amongst others, a bootstrap algorithm. Maybe this is a start for your own work?

Hth.

On 10.10.2011 21:35, Yujie Wang wrote:
> Hey all,
>
> In order to test if a marker is a risk factor, I built two models (using the Cox proportional hazards model). One model included this marker, and the other did not. Then I used the R package risksetROC to test how much predictive value the marker added to the model. I get two C statistics by feeding the linear predictors of the two models into this package. The question is: how to test if two C statistics are significantly different?
>
> Your help will be greatly appreciated!
>
> Yujie

--
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf
Martinistr. 52
20246 Hamburg
T ++49/40/7410-58243
F ++49/40/7410-57790

--
Mandatory disclosures under the Act on Electronic Commercial Registers, Registers of Cooperatives and the Company Register (EHUG): Universitätsklinikum Hamburg-Eppendorf; public-law corporation; place of jurisdiction: Hamburg. Board members: Prof. Dr. Guido Sauter (deputy chairman), Dr. Alexander Kirstein, Joachim Prölß, Prof. Dr. Dr. Uwe Koch-Gromus
Re: [R] extra digits added to data
R FAQ 7.31 ("Why doesn't R think these numbers are equal?").

On Oct 11, 2011, at 1:07, Mark Harrison harrisonma...@gmail.com wrote:
> I am having a problem with extra digits being added to my data which I think is a result of how I am converting my data.frame data to xts. [...] I need 1.5021 to be 1.5021 and not 1.5020084473. Thanks for the help.
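For readers who have not met FAQ 7.31: most decimal fractions cannot be stored exactly as binary doubles, so the printed value and the stored value can differ in the far digits. A minimal sketch of the effect, and of formatting to a fixed number of decimals before export, follows; the variable names are only for illustration. Note that representation error of this kind shows up around the 16th or 17th significant digit, so a gap as large as 1.5021 versus 1.5020084473 may well have another cause, such as the values actually stored in the source.

```r
x <- 0.1 + 0.2
x == 0.3                    # FALSE: the two doubles differ in the last bits
isTRUE(all.equal(x, 0.3))   # TRUE: equal up to a small tolerance

open <- 1.5021
print(open)                 # default printing uses 7 significant digits
print(open, digits = 17)    # shows the double actually stored

# Fix the number of decimals explicitly when writing values out
open_txt <- formatC(open, digits = 4, format = "f")
open_txt                    # "1.5021"
```

Comparing print(value, digits = 17) right after reading the data and again after the conversion to xts would show at which step the extra digits first appear.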
[R] need help on read.spss
Hi,

I have a question about one of the parameters of read.spss() from the 'foreign' package. Here is the usage:

    read.spss(file, use.value.labels = TRUE, to.data.frame = FALSE,
              max.value.labels = Inf, trim.factor.names = FALSE,
              trim_values = TRUE, reencode = NA, use.missings = to.data.frame)

When I pass to.data.frame = FALSE, it gives me the missing values from the SPSS file (that I try to read using read.spss()). But when I pass to.data.frame = TRUE, it does not give me the missing values, and I need to get them. According to the read.spss() documentation:

    to.data.frame: return a data frame?

I am curious to know whether passing to.data.frame = TRUE is going to cause some issue or affect something. I didn't understand the read.spss() documentation correctly. Please explain.

Thanks in advance
--
SG
[R] Is it possible to generate an ExpressionSet object that contain duplicate row names?
Dear all,

I am facing a problem that comes up when I try to create an ExpressionSet object from an expression matrix with duplicate row names:

    try(myExpressionSet <- new("ExpressionSet", exprs = myexprsunique, phenoData = myphenoData, annotation = myannotation, check.names=FALSE))
    Error in data.frame(numeric(n), row.names = nms) : duplicate row.names: blu

I was wondering if there is a way to create this ExpressionSet object even though duplicate row names exist in the expression matrix?

Many thanks in advance.

Kind regards,
Núria
Re: [R] Fwd: WHO Anthro growth curve macros and R
On Tue, Oct 11, 2011 at 1:21 AM, David Winsemius dwinsem...@comcast.net wrote:

> On Oct 10, 2011, at 4:48 PM, Gustaf Rydevik wrote:
>
>> Hi all,
>>
>> some years ago, I sent a question to the mailing list regarding the WHO anthro macros. Since I've now received three mails asking how I solved it, I thought I'd cc R-help in for future reference.
>>
>> Attaching a zip file with the relevant code parts that I used, which I'm not sure gets through (if anyone has recommendations on how to manage such files for the list, I'd be grateful).
>>
>> What I ended up doing was importing the data in SPSS format, and adapting the Splus function igrowup.standard slightly. igrowup.standard2.R is the adapted function, while the ssc files are original splus functions.
>>
>> Let me know if anyone gets problems in figuring out how to use the files.
>
> The only files that reach the readership are .pdf and .txt files. I do not know how carefully these get inspected, so it is possible that a zip file named something.txt might make it through.
>
>> best regards,
>> Gustaf
>
> David Winsemius, MD
> West Hartford, CT

Hi all again,

I noticed (and suspected) that, as David said, zip files do not get through. Here's a Google Docs link for the Anthro example.zip file that won't change in the foreseeable future:

https://docs.google.com/viewer?a=vpid=explorerchrome=truesrcid=0B77NeAmIHMaQMjJkZTQ0OTQtNTRkYy00ZWMzLThhNTUtMzg1ZDY5MjljOGQxhl=en_US

(if the link is problematic due to its length, try http://tinyurl.com/625vod6 instead)

The most interesting files are igrowup.standard2.R (which is a modified version of igrowup.standard) and anthro-example.R.

Hope this comes in useful for someone in the future!

Regards,
Gustaf

--
Gustaf Rydevik, M.Sci.
tel: +44(0)704 253 760 42
address: St John's hill 18/5 EH8 9UQ Edinburgh, UK
skype: gustaf_rydevik
Re: [R] need help on read.spss
Hi,

if you specify to.data.frame=T, then use.missings is implicitly set to T as well, which causes different results for (user-defined) missing values.

cheers.

On 11.10.2011 12:07, Smart Guy wrote:
> [...] when I pass 'to.data.frame = FALSE' it gives me missing values from SPSS file (that I try to read using read.spss()). But when I pass 'to.data.frame = TRUE' then its not giving me missing values. And need to get missing values. [...]

--
Eik Vettorazzi
Department of Medical Biometry and Epidemiology
University Medical Center Hamburg-Eppendorf
Martinistr. 52
20246 Hamburg
T ++49/40/7410-58243
F ++49/40/7410-57790
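Following the usage line quoted in the question (use.missings = to.data.frame), one way to get a data frame and still keep the user-defined missing codes is to set use.missings explicitly. This is a sketch only: the wrapper name `read_sav_keep_codes` and the file name "survey.sav" are hypothetical, and the result has not been checked against a real SPSS file.

```r
library(foreign)

# Hypothetical wrapper: return a data frame, but do not convert
# user-defined missing values to NA.
read_sav_keep_codes <- function(path) {
  read.spss(path, to.data.frame = TRUE, use.missings = FALSE)
}

sav_file <- "survey.sav"  # hypothetical file name
if (file.exists(sav_file)) {
  dat <- read_sav_keep_codes(sav_file)
  str(dat)
}
```

With use.missings left at its default, it follows to.data.frame, which matches the difference described in the question.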
Re: [R] pmml for random forest rules
Hi Patrick,

Thanks for the detailed report. See comments below.

On 11 October 2011 05:57, Patrick McCann patmmcc...@gmail.com wrote:
> [...]
> I am having some trouble using R 2.13.1 for generating a pmml object of class c('randomForest.formula', 'randomForest')
> [...]
> Random Forest (and randomSurvivalForest) - randomForest (Breiman and Cutler. R port by A. Liaw and M. Wiener, 2009) and randomSurvivalForest (Ishwaran and Kogalur, 2009): PMML export of a randomSurvivalForest "rsf" object. This function gives the user the ability to export PMML containing the geometry of a forest.
> [...]
>     Error in UseMethod("pmml") : no applicable method for 'pmml' applied to an object of class "c('randomForest.formula', 'randomForest')"

Sorry for the ambiguity there. The paper tries to say that pmml supports PMML export of a randomSurvivalForest "rsf" object. It mentions randomForest but does not say it can export randomForest. There is some experimental code for pmml.randomForest but it has not yet been completed.

> Also, if I run these lines of code
>
>     data(Adult)
>     ## Mine association rules.
>     rules <- apriori(Adult, parameter = list(supp = 0.5, conf = 0.9, target = "rules"))
>     pmml(rules)
>
> I get this error:
>
>     > pmml(rules)
>     Error in function (classes, fdef, mtable) : unable to find an inherited method for function "size", for signature "itemMatrix"
>     [...]
>     standardGeneric("size"), environment)
>     3: size(is.unique)
>     2: pmml.rules(rules)
>     1: pmml(rules)

That's odd. Not quite sure yet what is causing that. On my system it works just fine:

    library(pmml)
    library(arules)
    data(Adult)
    rules <- apriori(Adult, parameter = list(supp = 0.5, conf = 0.9, target = "rules"))
    pmml(rules)
    <PMML version="3.2" ...
     <Header copyright="Copyright (c) 2011 gjw"...
      <Extension name="user" value="gjw" extender="Rattle/PMML"/>
      <Application name="Rattle/PMML" version="1.2.27"/>
      <Timestamp>2011-10-11 21:50:40</Timestamp>
     </Header>
    [...]
My system: rattleInfo() Rattle: version 2.6.11 cran 2.6.11 R: version 2.13.2 (2011-09-30) (Revision 57111) Sysname: Linux Release: 2.6.38-12-generic Version: #51-Ubuntu SMP Wed Sep 28 14:27:32 UTC 2011 [...] pmml: version 1.2.27 [...] arules: version 1.0-6 I'm using R 2.13.2 - could that be an issue - you have 2.13.1? Regards, Graham __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Import/convert PMML to R model
Not possible (at least with the pmml package) at this time. There is some experimental code for reading PMML (and converting it into standalone executable C code), but importing into an R object needs quite a bit of work to re-create the kmeans object before it would be worth releasing.

Regards,
Graham

On 1 June 2011 18:40, Raji raji.sanka...@gmail.com wrote:
> Hi R-helpers,
> Can you please let me know if it is possible to import a PMML in R? If yes, can you give me the command to do the same? If not, can you tell me the reason why?
> Many thanks,
> Raji
> --
> View this message in context: http://r.789695.n4.nabble.com/Import-convert-PMML-to-R-model-tp3332772p3565260.html
> Sent from the R help mailing list archive at Nabble.com.
Re: [R] Is it possible to generate an ExpressionSet object that contain duplicate row names?
nqueralt at clinic.ub.es writes:
> I am facing the problem that comes up when an ExpressionSet object is intended to be created parsing a matrix expression data with duplicate row names:

You might have more luck with this question on the BioConductor mailing list ...
Re: [R] variable scope for deltavar function from emdbook
adad adad at gmx.at writes:
> Working example:
>
>     library(emdbook)
>     fn <- function() {
>       browser()
>       y <- 2
>       print(deltavar(y*b2, meanval=c(b2=3), Sigma=1))
>     }
>     x <- 2
>     print(deltavar(x*b1, meanval=c(b1=3), Sigma=1))
>     y <- 3
>     fn()
>
> Running this returns 4 for the first function call, which is fine. For the call of deltavar in fn(), I get 9, i.e. the function uses y <- 3 instead of the local y <- 2. If y <- is commented out, deltavar returns an error. So why is the function not using the local variable, and how do I make it use it?

The real problem is that I (the author) don't understand scoping in R, and how to manipulate it, as well as I'd like to. I will work on this (any tips from the R-helpers appreciated). In the meantime, you could try out one of the other available delta-method calculators, such as the one in the msm package (library(sos); findFn("{delta method}")).

More text to try to make gmane happy

Ben Bolker
Re: [R] Vegan: Anova.CCA accessing original data using option by=margin
On Mon, 2011-10-10 at 23:51 -0700, Steve Pawson wrote:
> Hello,
> I am attempting to use the anova.cca function with the by = "margin" option. The process works fine using the by = "terms" option, and I note in the vegan manual that Jari suggests that an error may occur if the anova does not have access to the data on the original constraints. This is the error that I get:
>
>     Error in dimnames(x) <- dn : length of 'dimnames' [2] not equal to array extent
>
> My question is, does anyone know if this error relates to what Jari is referring to (or is it a different problem), and if it is, how do I link the anova to the original constraints?

It is almost impossible to answer that without a lot more information. For starters, what does traceback() say when run immediately *after* you get the error?

G

> Many thanks for any help provided.
> Regards
> Steve
> --
> View this message in context: http://r.789695.n4.nabble.com/Vegan-Anova-CCA-accessing-original-data-using-option-by-margin-tp3893005p3893005.html
> Sent from the R help mailing list archive at Nabble.com.

--
Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
[R] An issue regarding to gradient
The following code gives me a curve plot:

    cutoff <- seq(1, 7, 0.25)
    Sensitivity <- 1 - pnorm(cutoff, 5, 0.8)
    Specificity <- pnorm(cutoff, 3, 1.2)
    plot(1 - Specificity, Sensitivity, main = "ROC curve", type = "o")

How do I get the gradient at a particular point on that curve? Are there any packages or functions that allow me to do that?

Thank you

--
View this message in context: http://r.789695.n4.nabble.com/An-issue-regarding-to-gradient-tp3893401p3893401.html
Sent from the R help mailing list archive at Nabble.com.
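One possible approach, sketched under the assumption that "gradient" means the slope of Sensitivity with respect to 1 - Specificity: because both coordinates are known functions of the cutoff, the slope is the ratio of their derivatives, and it can be cross-checked with a central difference. The helper names `sens`, `fpr`, `roc_slope` and `roc_slope_num` are made up for this illustration.

```r
# Both ROC coordinates as functions of the cutoff k.
sens <- function(k) 1 - pnorm(k, 5, 0.8)   # Sensitivity
fpr  <- function(k) 1 - pnorm(k, 3, 1.2)   # 1 - Specificity

# Analytic slope: (d sens / dk) / (d fpr / dk). The two minus signs cancel,
# leaving a ratio of normal densities.
roc_slope <- function(k) dnorm(k, 5, 0.8) / dnorm(k, 3, 1.2)

# Numerical cross-check with a central difference in the cutoff.
roc_slope_num <- function(k, h = 1e-5) {
  (sens(k + h) - sens(k - h)) / (fpr(k + h) - fpr(k - h))
}

slope_at_4     <- roc_slope(4)
slope_num_at_4 <- roc_slope_num(4)

# Slopes between neighbouring points of the plotted grid, if only the
# sampled curve is available.
cutoff <- seq(1, 7, 0.25)
grid_slopes <- diff(sens(cutoff)) / diff(fpr(cutoff))
```

The grid-based version is coarser, and it becomes unreliable where the curve flattens and the differences in 1 - Specificity approach zero.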
[R] Parallel processing for R loop
I have an R script that consists of a for loop that repeats a process for many different files. I want to run this in parallel on a machine with multiple cores. Is there a package for that?

Thanks
--
Sandeep R Patil
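A minimal sketch of one option, assuming an R version that ships the `parallel` package (R 2.14.0 or later; older versions relied on contributed packages such as multicore or snow). The function `process_file` and the file names are hypothetical stand-ins for the real per-file work.

```r
library(parallel)

# Hypothetical per-file task; a real script would read and process `path`.
process_file <- function(path) {
  nchar(basename(path))
}

# Hypothetical file names, for illustration only.
files <- c("a.csv", "bb.csv", "ccc.csv")

# mclapply() forks worker processes, which is not available on Windows,
# so fall back to a single core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Drop-in replacement for lapply(files, process_file).
results <- mclapply(files, process_file, mc.cores = n_cores)
```

The design choice is to rewrite the body of the for loop as a function of one file, so that each iteration is independent and can be handed to a worker.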
[R] How to run Rcmdr with OS 10.4?
I've installed the Rcmdr package and it doesn't run. Here is the error message (translated from French):

    R version 2.9.2 (2009-08-24)
    [R.app GUI 1.29 (5464) powerpc-apple-darwin8.11.1]
    [Workspace restored from /Users/jfc/Documents/TravauxFR/.RData]
    Loading required package: tcltk
    Loading Tcl/Tk... done
    Loading required package: car
    Error in structure(.External("dotTclObjv", objv, PACKAGE = "tcltk"), class = "tclObj") :
      [tcl] invalid command name "font".
    In addition: Warning message:
    In fun(...) : couldn't connect to display ":0"
    Error : .onAttach failed in 'attachNamespace'
    Error: package/namespace load failed for 'Rcmdr'

I've tried another version, 2.10.2, with Rcmdr and its dependencies, and it returns the same warnings! I feel that something is missing on my computer. X11 works and I've installed Tcl/Tk 8.5.5-x11. What else should I do?
[R] help to ... import the data from Excel
Hi everyone,

I have a problem importing data from Excel into R. I have done the following:

    1. install.packages("xlsReadWrite")
    2. library(xlsReadWrite)
    3. z <- read.xls(ReadXls, LTS, colNames=FALSE, sheet, type, form, rowNames=FALSE)

and I got this result:

    Error in read.xls(ReadXls, LTS, colNames = FALSE, rowNames = FALSE) : object 'LTS' not found

I also tried

    data(LTS, package = "xlsReadWrite")

and got:

    Warning message:
    In data(LTS, package = "xlsReadWrite") : data set 'LTS' not found

How do I get LTS into the list of objects? Note: LTS is the name of my data file in Excel.

I also tried another way:

    mydata <- read.table("C:\Users\user\Desktop\LTS.xls")

but it is not working. How can I do it?

My regards

--
View this message in context: http://r.789695.n4.nabble.com/help-to-import-the-data-from-Excel-tp3893382p3893382.html
Sent from the R help mailing list archive at Nabble.com.
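One common workaround, sketched here under the assumption that the sheet can first be saved from Excel as a CSV file: read.table() cannot parse the binary .xls format, and backslashes in a Windows path have to be doubled or replaced by forward slashes inside an R string. The data and paths below are hypothetical.

```r
# Stand-in for a sheet saved from Excel as CSV (hypothetical data).
csv_path <- file.path(tempdir(), "LTS.csv")
writeLines(c("id,value", "1,10.5", "2,20.25"), csv_path)

# The file name is passed as a quoted string, not as a bare name like LTS.
lts <- read.csv(csv_path, header = TRUE)
print(lts)

# On Windows, either of these spellings refers to the same (hypothetical) file:
win_path_a <- "C:/Users/user/Desktop/LTS.csv"
win_path_b <- "C:\\Users\\user\\Desktop\\LTS.csv"
```

After this, `lts` is an ordinary data frame and shows up in `ls()`.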
Re: [R] How to map current Europe?
Hi, see here: http://r.789695.n4.nabble.com/Create-a-map-td3689877.html#a3893581

--
View this message in context: http://r.789695.n4.nabble.com/How-to-map-current-Europe-tp3715709p3893588.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Create a map
Hi,

I've just figured out how to plot (more) up-to-date political borders in R in an easy way. You can get the outline file Borders_MWDB3 from the NASA Panoply site: http://www.giss.nasa.gov/tools/panoply/overlays/

Then read it into R, get the indices where the line jumps across the -180/180 E meridian, remove the first point of the part on the other side, and finally plot the borders as a line:

    bord <- read.table("Borders_MWDB3.cno", sep=",", na.strings="", fill=T)
    around <- which(abs(diff(bord[,1])) > 180) + 1
    bord[around,] <- NA
    plot(bord, type="l", xlab="degrees east", ylab="degrees north")

Hope this works.

--
View this message in context: http://r.789695.n4.nabble.com/Create-a-map-tp3689877p3893581.html
Sent from the R help mailing list archive at Nabble.com.
[R] Mean or mode imputation fro missing values
Dear R experts,

I have a large database made up of mixed data types (numeric, character, factor, ordinal factor) with missing values, and I am looking for a package that would help me impute the missing values using either the mean if numeric or the mode if character/factor. I could maybe use replacement like this:

    df$var[is.na(df$var)] <- mean(df$var, na.rm = TRUE)

and go through all the many different variables of the dataset using mean or mode for each, but I was wondering if there is a faster way, or if a package exists to automate this (by using the mode if the variable is a factor or character, or the mean if it is numeric)? I have tried the package dprep because I wanted to use the function ce.mimp, but unfortunately it is not available anymore.

Thank you for your help,
-francy
Re: [R] map question
Maybe this helps: http://r.789695.n4.nabble.com/Create-a-map-td3689877.html#a3893581

--
View this message in context: http://r.789695.n4.nabble.com/map-question-tp795873p3893593.html
Sent from the R help mailing list archive at Nabble.com.
[R] rpanel
Dear all,

I am struggling to align textentry fields in a Tcl/Tk widget. In the example below, I'd like to have the boxes aligned.

    library(rpanel)
    panel <- rp.control(title="title", size=c(100,100))
    rp.textentry(panel, var=a, labels="Variable A", initval=1, pos=list(row=0,column=0))
    rp.textentry(panel, var=b, labels="Var. B", initval=1, pos=list(row=1,column=0))

Thanks for your help
Pascal

--
Pascal A. Niklaus
Institute of Evolutionary Biology and Environmental Studies
University of Zurich
Winterthurerstrasse 190
CH-8057 Zurich / Switzerland
Re: [R] How to test if two C statistics are significantly different?
?Hmisc::rcorrp.cens

-Alan

-----Original Message-----
From: Eik Vettorazzi [mailto:e.vettora...@uke.de]
Sent: Tue 10/11/2011 2:25 AM
To: Yujie Wang
Cc: r-help@r-project.org
Subject: Re: [R] How to test if two C statistics are significantly different?

Hi Yujie,

there is still a lot of work in progress, I think. As http://faculty.washington.edu/heagerty/Software/SurvROC/RisksetROC/risksetROCdiscuss.pdf states: "[...] for inference and variance estimation, we now suggest bootstrapping [...]". Recently I caught a glimpse of roc.test from the pROC package; they implemented, amongst others, a bootstrap algorithm. Maybe this is a start for your own work?

Hth.

On 10.10.2011 21:35, Yujie Wang wrote:
> Hey all,
> In order to test whether a marker is a risk factor, I built two models (using the Cox proportional hazards model). One model includes this marker, and the other does not. Then I use the R package risksetROC to test how much predictive value the marker adds to the model. I get two C statistics by passing the linear predictors of the two models to this package. The question is: how do I test whether two C statistics are significantly different?
> Your help will be greatly appreciated!
> Yujie

--
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf
Martinistr. 52, 20246 Hamburg
T ++49/40/7410-58243 F ++49/40/7410-57790
Re: [R] How to test if two C statistics are significantly different?
Thanks for mentioning rcorrp.cens, which is much more powerful than testing for differences in C. Likelihood ratio tests would be even more powerful. Ordinary differences in the C index yield a test with power that is too low.

Frank

alanm (Alan Mitchell) wrote:
> ?Hmisc::rcorrp.cens
> -Alan
>
> -----Original Message-----
> From: Eik Vettorazzi [mailto:E.Vettorazzi@]
> Sent: Tue 10/11/2011 2:25 AM
> To: Yujie Wang
> Cc: r-help@
> Subject: Re: [R] How to test if two C statistics are significantly different?
> [...]

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: http://r.789695.n4.nabble.com/How-to-test-if-two-C-statistics-are-significantly-different-tp3891857p3894430.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Mean or mode imputation fro missing values
In your case, it may not be sensible to simply fill in missing values with the mean or mode, as multiple imputation has become the norm these days. For your specific question, na.roughfix in the randomForest package would do the work.

Weidong Gu

On Tue, Oct 11, 2011 at 8:11 AM, francesca casalino francy.casal...@gmail.com wrote:
> Dear R experts,
> I have a large database made up of mixed data types (numeric, character, factor, ordinal factor) with missing values, and I am looking for a package that would help me impute the missing values using either the mean if numerical or the mode if character/factor.
> [...]
[R] Problem executing function
Hello All,

I have a series of steps that need to be run many times, so I put them all into a function. There is no problem creating the function, but when I call it, the steps are not executed, or only the first step is executed. What could be the reason?

Sample function and the result:

    fun <- function () {
      # Package load into R;
      a <- c(library(RODBC), library(e1071));
      b <- read.csv("Path of the csv file", header=TRUE, sep=",", quote="");
      c <- b[,1];
      d <- b[,2];
      e <- b[,3];
      rm(b);
      # Establishing ODBC connection;
      conn <- odbcConnect(c, uid=d, pwd=e);
    }

    fun()
    Warning messages:
    1: package 'RODBC' was built under R version 2.13.1
    2: package 'e1071' was built under R version 2.13.1

The subsequent csv fetch and ODBC connection are not executed. Why is the function not executed fully? Even if I create a separate function for the csv fetch, it is not executed. But if I simply type

    b <- read.csv("Path of the csv file", header=TRUE, sep=",", quote="");

directly at the command prompt, it works. Why is that? I am not able to figure out the mistake. Any help will be very useful. I have been struggling with this for quite some time now.

Thanks
Divya

--
View this message in context: http://r.789695.n4.nabble.com/Problem-executing-function-tp3894359p3894359.html
Sent from the R help mailing list archive at Nabble.com.
[R] controling text in facets (ggplot2)
Hi R-helpers!

Here is my problem: I have a graph with 3 different facets, each with a different regression line. My goal is to show, separately in each facet, the equation that describes that facet's line. So far I have managed to add a line and the same equation to all my facets, but unfortunately that's not what I want. Is there a way to do that? Any suggestion would be welcome!

Thanks for your help!
Thomas

--
View this message in context: http://r.789695.n4.nabble.com/controling-text-in-facets-ggplot2-tp3894148p3894148.html
Sent from the R help mailing list archive at Nabble.com.
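One way this is often done, sketched with made-up data: build a small data frame with one row per facet that holds the label text and the faceting variable, then pass it to `geom_text()` through its `data` argument. The data, the column names (`grp`, `x`, `y`, `label`) and the label positions are assumptions for illustration, and the plotting part only runs if ggplot2 is installed.

```r
# Hypothetical data: three groups, each with its own linear trend.
set.seed(1)
dat <- data.frame(
  grp = rep(c("A", "B", "C"), each = 20),
  x   = rep(1:20, times = 3),
  stringsAsFactors = FALSE
)
dat$y <- c(1, 2, 3)[as.integer(factor(dat$grp))] * dat$x + rnorm(nrow(dat))

# One row per facet: the fitted equation as text, plus where to draw it.
labels <- do.call(rbind, lapply(split(dat, dat$grp), function(d) {
  cf <- coef(lm(y ~ x, data = d))
  data.frame(
    grp   = d$grp[1],
    x     = min(d$x),
    y     = max(d$y),
    label = sprintf("y = %.2f + %.2f x", cf[1], cf[2]),
    stringsAsFactors = FALSE
  )
}))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(dat, aes(x, y)) +
    geom_point() +
    geom_smooth(method = "lm", se = FALSE) +
    # `labels` carries the faceting variable, so each row is drawn only
    # in its own panel.
    geom_text(data = labels, aes(x = x, y = y, label = label),
              hjust = 0, vjust = 1) +
    facet_wrap(~ grp)
  # print(p) would draw the plot.
}
```

The key point is that the label layer gets its own data frame; a single constant label added to the main plot is repeated in every facet.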
Re: [R] correlation matrix
Thank you all for your suggestions.

Sharad

--
View this message in context: http://r.789695.n4.nabble.com/correlation-matrix-tp3891085p3894329.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Problem executing function
It sounds like what you were hoping for does happen inside the function's environment, but is not returned to the global environment. The proper fix is to put the values in a list and return() them as the function output.

Michael

On Oct 11, 2011, at 9:21 AM, Divyam divyamural...@gmail.com wrote:
> Hello All,
> I have a series of steps that needs to be run many times. Hence I put them all into a function. There is no problem in function creation, but when I call the function, the steps are not getting executed or only the first step gets executed. What possibly could be the reason?
> [...]
> --
> View this message in context: http://r.789695.n4.nabble.com/Problem-executing-function-tp3894359p3894359.html
> Sent from the R help mailing list archive at Nabble.com.
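A minimal sketch of the suggestion to return a list. The function name `load_inputs`, the file contents and the column names are hypothetical. Note also that when the last expression of a function is an assignment, its value is returned invisibly, so nothing is printed even though the code ran.

```r
load_inputs <- function(csv_path) {
  b <- read.csv(csv_path, header = TRUE, stringsAsFactors = FALSE)
  # Objects created here are local to the function; hand them back explicitly.
  return(list(dsn = b[, 1], uid = b[, 2], pwd = b[, 3]))
}

# Hypothetical credentials file, written to a temporary directory.
csv_path <- file.path(tempdir(), "conn.csv")
writeLines(c("dsn,uid,pwd", "mydb,alice,secret"), csv_path)

inputs <- load_inputs(csv_path)
str(inputs)

# The connection itself would then be opened by the caller, for example:
# conn <- RODBC::odbcConnect(inputs$dsn, uid = inputs$uid, pwd = inputs$pwd)
```

Loading packages with library() is usually done once at the top of the script rather than inside the function.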
[R] filtering rows
Hi everyone,

I've got two data sets as below. My question now is: how can I use Dataset2 as a filter for Dataset1? My goal is just to keep the rows of Dataset1 where the first column (Date) matches the dates in Dataset2. I would appreciate any solutions to this issue.

Many thanks!
S.B.

Dataset1:

      Date  A  B  C  D
    1 1977 10 11 12 13
    2 1978 14 15 16 17
    3 1979 18 19 20 21
    4 1980 22 23 24 25
    5 1981 26 27 28 29

Dataset2:

      Date
    1 1977
    2 1978
    3 1979
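A minimal sketch of one way to do this in base R, with the two data sets from the question typed in by hand: `%in%` gives a logical vector marking the rows of Dataset1 whose Date occurs in Dataset2.

```r
Dataset1 <- data.frame(
  Date = 1977:1981,
  A = c(10, 14, 18, 22, 26),
  B = c(11, 15, 19, 23, 27),
  C = c(12, 16, 20, 24, 28),
  D = c(13, 17, 21, 25, 29)
)
Dataset2 <- data.frame(Date = 1977:1979)

# Keep the rows of Dataset1 whose Date appears in Dataset2.
filtered <- Dataset1[Dataset1$Date %in% Dataset2$Date, ]
print(filtered)

# An inner join selects the same rows.
joined <- merge(Dataset1, Dataset2, by = "Date")
```

With `%in%` the original row order and row names of Dataset1 are kept, whereas `merge()` sorts by the key.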
Re: [R] Text Mining with Facebook Reviews (XML and FQL)
Hi Kenneth

First off, you probably don't need to use xmlParseDoc(), but rather xmlParse(). (Both are fine, but xmlParseDoc() allows you to control many of the options in the libxml2 parser, which you don't need here.)

xmlParse() has some capabilities to fetch the content of URLs. However, it cannot deal with HTTPS requests, which this call to Facebook is. The approach is to

  i) make the request
  ii) parse the resulting string via xmlParse(txt, asText = TRUE)

As for i), there are several ways to do this, but the RCurl package allows you to do it entirely within R and gives you more control over the request than you would ever want.

    library(RCurl)
    txt = getForm('https://api.facebook.com/method/fql.query', query = QUERY)
    mydata.xml = xmlParse(txt, asText = TRUE)

However, you are most likely going to have to log in / get a token before you make this request. And then, if you are using RCurl, you will want to use the same curl object with the token or cookies, etc.

D.

On 10/10/11 3:52 PM, Kenneth Zhang wrote:
> Hello,
> I am trying to use the XML package to download Facebook reviews in the following way:
>
>     require(XML)
>     mydata.vectors <- character(0)
>     Qword <- URLencode('#IBM')
>     QUERY <- paste('SELECT review_id, message, rating from review where message LIKE %',Qword,'%',sep='')
>     Facebook_url = paste('https://api.facebook.com/method/fql.query?query= ',QUERY,sep='')
>     mydata.xml <- xmlParseDoc(Facebook_url, asText=F)
>     mydata.vector <- xpathSApply(mydata.xml, '//s:entry/s:title', xmlValue, namespaces =c('s'='http://www.w3.org/2005/Atom'))
>
> The mydata.xml is NULL, therefore no further step can be executed. I am not so familiar with XML or FQL. Any suggestion will be appreciated. Thank you!
> Best regards,
> Kenneth
Re: [R] Mean or mode imputation fro missing values
Yes, thank you Gu... I am just trying to do this as a rough step and will try other, more appropriate imputation methods later. I am just learning R, and was trying to write the for loop and if-statement by hand, but something is going wrong... This is what I have so far:

    # fake array:
    age <- c(5,8,10,12,NA)
    a <- factor(c("aa", "bb", NA, "cc", "cc"))
    b <- c("banana", "apple", "pear", "grape", NA)
    df_test <- data.frame(age=age, a=a, b=b)
    df_test$b <- as.character(df_test$b)

    for (var in 1:ncol(df_test)) {
      if (class(df_test$var)=="numeric") {
        df_test$var[is.na(df_test$var)] <- mean(df_test$var, na.rm = TRUE)
      } else if (class(df_test$var)=="character") {
        Mode(df_test$var[is.na(df_test$var)], na.rm = TRUE)
      }
    }

where 'Mode' is the function:

    function (x, na.rm) {
      xtab <- table(x)
      xmode <- names(which(xtab == max(xtab)))
      if (length(xmode) > 1) xmode <- ">1 mode"
      return(xmode)
    }

It seems as if it is just ignoring the statements, without giving any error... Does anybody have any idea what is going on?

Thank you very much for all the great help!
-f

2011/10/11 Weidong Gu anopheles...@gmail.com:
> In your case, it may not be sensible to simply fill missing values by mean or mode as multiple imputation becomes the norm this day. For your specific question, na.roughfix in randomForest package would do the work.
> Weidong Gu
>
> On Tue, Oct 11, 2011 at 8:11 AM, francesca casalino francy.casal...@gmail.com wrote:
>> Dear R experts,
>> [...]
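A sketch of how the loop above could be made to work, using the same toy data. In the original, `df_test$var` looks for a column literally named "var" and returns NULL, so neither branch of the if-statement is ever taken, and the result of `Mode()` is never assigned to anything. The helper names `stat_mode` and `impute_simple` are made up for this illustration.

```r
# Most frequent non-missing value; ties are broken by taking the first
# entry in table() order.
stat_mode <- function(x) {
  tab <- table(x)
  names(tab)[which.max(tab)]
}

impute_simple <- function(df) {
  for (nm in names(df)) {
    col <- df[[nm]]   # use [[ ]] when the column name is stored in a variable
    miss <- is.na(col)
    if (!any(miss)) next
    if (is.numeric(col)) {
      col[miss] <- mean(col, na.rm = TRUE)
    } else {
      # Works for character and factor columns alike, because the mode
      # is always one of the existing values.
      col[miss] <- stat_mode(col)
    }
    df[[nm]] <- col
  }
  df
}

df_test <- data.frame(
  age = c(5, 8, 10, 12, NA),
  a   = factor(c("aa", "bb", NA, "cc", "cc")),
  b   = c("banana", "apple", "pear", "grape", NA),
  stringsAsFactors = FALSE
)

imputed <- impute_simple(df_test)
print(imputed)
```

In column `b` every value occurs once, so the "mode" is simply the first value in sorted order; a real analysis would need a deliberate rule for ties.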
Re: [R] extra digits added to data
Thanks for the quick response. I have read the FAQ. If I want to keep the values in R the same as when they were input, should I be converting the data to a different type, i.e. not numeric?

On Oct 11, 2011, at 4:46 AM, Jim Holtman jholt...@gmail.com wrote:

> FAQ 7.31
>
> On Oct 11, 2011, at 1:07, Mark Harrison harrisonma...@gmail.com wrote:
>
>> I am having a problem with extra digits being added to my data, which I think is a result of how I am converting my data.frame data to xts. I see the same issue in R v2.13.1 and RStudio version 0.94.106.
>>
>> I am loading historical foreign exchange data in via csv files or from a SQL Server database. In both cases there are no extra digits and the original data looks like the following:
>>
>>         Date   Open   High    Low  Close
>> 1 2001-01-03 1.5021 1.5094 1.4883 1.4898
>> 2 2001-01-04 1.4897 1.5037 1.4882 1.5020
>> 3 2001-01-05 1.5020 1.5074 1.4952 1.5016
>> 4 2001-01-08 1.5035 1.5104 1.4931 1.4964
>> 5 2001-01-09 1.4964 1.4978 1.4873 1.4887
>> 6 2001-01-10 1.4887 1.4943 1.4856 1.4866
>>
>> So for 2001-01-03 the Open value is 1.5021 with only 4 digits after the decimal place, i.e. .5021.
>>
>> I then proceed to do the following in R to convert the 'british pound' data above from data.frame to xts:
>>
>>     require(quantmod)
>>     rownames(gbp) <- gbp$Date
>>     head(gbp)
>>                  Open   High    Low  Close
>>     2001-01-03 1.5021 1.5094 1.4883 1.4898
>>     2001-01-04 1.4897 1.5037 1.4882 1.5020
>>     2001-01-05 1.5020 1.5074 1.4952 1.5016
>>     2001-01-08 1.5035 1.5104 1.4931 1.4964
>>     2001-01-09 1.4964 1.4978 1.4873 1.4887
>>     2001-01-10 1.4887 1.4943 1.4856 1.4866
>>     gbp <- as.xts(gbp[,2:5])
>>     class(gbp)
>>     [1] "xts" "zoo"
>>
>> The data at this point looks OK until you look closer or output the data to Excel, at which point you see the following for the 'Open' value on 2001-01-03: 1.5020084473
>>
>> It is not just the above 'Open' or the first value: all the data points contain the extra digits, which I think are the original date data and/or row numbers being tacked on.
>>
>> My problem is the extra digits being added, or whatever I am doing wrong in R to cause the extra digits to be added. I need 1.5021 to be 1.5021 and not 1.5020084473. Thanks for the help.
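For later readers: FAQ 7.31 is about how R stores non-integer numbers as binary doubles, so most decimal fractions are held only approximately and the difference shows up once more digits are printed. The sketch below uses an illustrative value (not the poster's actual data) and only base R; it does not diagnose the specific 1.5020084473 reported above, it just shows how to inspect what is stored and how to control what is displayed or exported.

```r
# Illustrative value only; not read from the poster's csv or database.
x <- 1.5021

# Default printing uses 7 significant digits.
print(x)

# Asking for 17 significant digits exposes the nearest binary double.
print(x, digits = 17)

# For display or export, round or format to the precision you need.
x_rounded <- round(x, 4)
x_text    <- formatC(x, format = "f", digits = 4)   # "1.5021"

# Compare doubles with a tolerance rather than with ==.
exact_equal <- (0.1 + 0.2 == 0.3)                   # FALSE
near_equal  <- isTRUE(all.equal(0.1 + 0.2, 0.3))    # TRUE
```

Keeping the column numeric and formatting only at output time is usually simpler than converting the data to character, since arithmetic still works on the stored values.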
[R] Background Colors
Hi R-Help -

If I make a plot:

    numYears = 500
    plot(x = c(1,numYears), y = c(200,300), xlab = "Time", ylab = "Vegetation Class", xlim = c(100,600), ylim = c(200,300), type = "n")

Is there a way to make different parts of the background for the plot different colors? For example, I'd like to have the background color col = (250,250,0,50) for y = c(200,204), and col = (250,125,0,50) for y = c(210,212).

Any suggestions? Thanks in advance for the help,

Gabe

--
Gabriel I. Yospin
Institute of Ecology and Evolution
Bridgham Lab
University of Oregon
Eugene, OR 97403-5289
Ph: 541 346 1549
Fax: 541 346 2364
Re: [R] SLOW split() function
thanks, josh. in my posting example, I did not need anything except coefficients. (when this is the case, I usually do not even use lm.fit, but I eliminate all missing obs first and then use solve() with crossprod(y,cbind(1,x)) and crossprod(cbind(1,x)). this is pretty fast.) alas, I will need to figure out how to get coef standard errors faster in this case. summary.lm() is really slow.

regards, /iaw

Ivo Welch (ivo.we...@gmail.com)
http://www.ivo-welch.info/
J. Fred Weston Professor of Finance
Anderson School at UCLA, C519

On Mon, Oct 10, 2011 at 11:30 PM, Joshua Wiley jwiley.ps...@gmail.com wrote:

> As another followup, given that you are doing numerous regression models and (I presume) working with finance/stock data that is strictly numeric (no need for special contrast coding, etc.), you can substantially reduce the time spent estimating the coefficients. A simple way is to use lm.fit directly instead of lm. For lm.fit, you pass the y and x (design) matrices directly. This skips a good deal of overhead.
>
> Here is one naive way; I imagine more speedups could be gained by incorporating the intercept (1 vector) into d instead of cbind()ing it. The catch is that lm.fit requires matrices, not data tables, so what you gain may be lost in having to do an extra conversion. In any case, here are the times on my system for the two options (note I used N = 1000 * 100 because I am presently on a glorified netbook).
>
>     print(system.time(all.2b <- lapply(si, function(.indx) { coef(lm(y ~ + x, data=d[.indx,])) })))
>        user  system elapsed
>       69.00    0.00   69.56
>
>     print(system.time(all.2c <- lapply(si, function(.indx) { coef(lm.fit(y = d[.indx, y], x = cbind(1, d[.indx, x]))) })))
>        user  system elapsed
>       37.83    0.03   38.36
>
> The column names for the coefficients will not be the same as from lm, but the estimates should be identical. While this is not recommended in typical usage, in an application like regressions on rolling time windows, etc., where you know the data are not changing, I think it makes sense to bypass the clever "determine your data and best methods to use" step and go straight to passing the design matrix. Since you do not need residuals, variances, etc., it may be possible to speed this up even more, perhaps bypassing dqrls altogether.
>
> Cheers, Josh
>
> On Mon, Oct 10, 2011 at 9:56 PM, ivo welch ivo.we...@gmail.com wrote:
>
>> thank you, everyone. this was very helpful to my specific task and understanding. for the benefit of future googlers, I thought I would post some experiments and results here. ultimately, I need to do a by() on an irregular matrix, and I now know how to speed up by() on a single-core, and then again on a multi-core machine.
>>
>>     library(data.table)
>>     N <- 1000*1000
>>     d <- data.table(data.frame( key= as.integer(runif(N, min=1, max=N/10)), x=rnorm(N), y=rnorm(N) ))  # irregular
>>     setkey(d, key); gc()  ## sort and force a garbage collection
>>     cat("N=", N, ". Size of d=", object.size(d)/1024/1024, "MB\n")
>>
>>     cat("\nStandard by() Function:\n")
>>     print(system.time( all.1 <- by( d, d$key, function(d) coef(lm(y ~ x, data=d)) ) ))
>>
>>     cat("\n\nPreSplit Function [aka Jim H]\n\t(a) Splitting Operation:\n")
>>     print(system.time(si <- split(seq(nrow(d)), d$key)))
>>     cat("\n\t(b) Regressions:\n")
>>     print(system.time(all.2 <- lapply(si, function(.indx) { coef(lm(d$y[.indx] ~ d$x[.indx])) })))
>>     print(system.time(all.2b <- lapply(si, function(.indx) { coef(lm(y ~ x, data=d[.indx,])) })))
>>
>>     cat("\n\nNaive Split Data Frame\n\t(a) Splitting Operation:\n")
>>     print(system.time(ds <- split(d, d$key)))
>>     cat("\n\t(b) Regressions:\n")
>>     print(system.time(all.3a <- lapply(ds, function(ds) { coef(lm(ds$y ~ ds$x)) })))
>>     print(system.time(all.3b <- lapply(ds, function(ds) { coef(lm(y ~ x, data=ds)) })))
>>
>> the first and the last ways (all.1 and all.3) are naive ways of doing this, and take about 400-500 seconds on a Mac Air, core i5. Jim's suggestion (all.2) cuts this roughly in half by speeding up the split to take almost no time.
>>
>> and now,
>>
>>     library(multicore)
>>     print(system.time(all.4 <- mclapply(si, function(.indx) { coef(lm(y ~ x, data=d[.indx,])) })))
>>
>> on my dual-core (quad-thread) i5, all four pseudo cores become busy, and the time roughly halves again from 230 seconds to 120 seconds.
>>
>> maybe the by() function should use Jim's approach, and multicore should provide mcby(). of course, knowing how to do this myself fast now by hand, this is not so important for me. but it may help some other novices. thanks again everybody.
>>
>> regards, /iaw
>>
>> Ivo Welch (ivo.we...@gmail.com)
>>
>> On Mon, Oct 10, 2011 at 9:31 PM, William Dunlap wdun...@tibco.com wrote:
>>
>>> The following avoids the overhead of data.frame methods (and assumes the data.frame doesn't include matrices or other data.frames) and relies on split(vector,factor) quickly splitting a vector into a list of vectors. For a 10^6 row by 10 column data.frame split in 10^5 groups this took 14.1 seconds while split took 658.7 s. Both
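For later readers, here is a minimal sketch of the three ways of getting regression coefficients touched on in this thread (lm(), lm.fit(), and solving the normal equations with crossprod()), plus one way to get coefficient standard errors without calling summary.lm(). It uses small simulated data and only base R; the variable names and the toy data are assumptions for illustration, not the posters' own code, and the normal-equations route is numerically less robust than the QR decomposition that lm() uses.

```r
set.seed(1)                       # arbitrary seed so the sketch is repeatable
n <- 200
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)         # toy data, not the data from the thread

X <- cbind(1, x)                  # design matrix with an explicit intercept column

# 1. Formula interface: most convenient, most overhead.
b_lm <- coef(lm(y ~ x))

# 2. lm.fit(): same QR-based fit, without formula/model-frame handling.
b_fit <- coef(lm.fit(x = X, y = y))

# 3. Normal equations: solve (X'X) b = X'y directly.
XtX  <- crossprod(X)
b_ne <- drop(solve(XtX, crossprod(X, y)))

# Standard errors by hand: sqrt(diag(sigma^2 * (X'X)^-1)),
# with sigma^2 estimated from the residual sum of squares.
res    <- y - X %*% b_ne
sigma2 <- sum(res^2) / (n - ncol(X))
se_ne  <- sqrt(diag(sigma2 * solve(XtX)))
```

In a per-group setting such as the lapply(si, ...) calls above, the same few lines could be applied to each index subset; whether that is actually faster than lm.fit() on a given machine would need to be timed.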
[R] apply for each value
Hello,

There has to be a more R'ish way to do this. I have two matrices: one has the values I want, but I want to NA some of them. The other matrix has binary values that tell me if I want to NA the values in the other matrix. I produce a third matrix based on this. I've also tried apply() passing in c(1,2) for rows and columns, with no success yet.

Example (this works, but I'm looking for a better/faster solution):

    a = matrix(1:6,2,3)
    colnames(a) = c('a','b','c')
    b = matrix(c(1,0,1,0,0,1),2,3)
    colnames(b) = colnames(a)
    c = matrix(0,nrow(a),ncol(a))
    for(cl in 1:ncol(a)){
      for(rw in 1:nrow(a)){
        c[rw,cl] = ifelse(b[rw,cl]==1,a[rw,cl],NA)
      }
    }

    a
         a b c
    [1,] 1 3 5
    [2,] 2 4 6
    b
         a b c
    [1,] 1 1 0
    [2,] 0 0 1
    c
         [,1] [,2] [,3]
    [1,]    1    3   NA
    [2,]   NA   NA    6

Thanks!
Ben
Re: [R] apply for each value
Hi,

On Tue, Oct 11, 2011 at 12:08 PM, Ben qant ccqu...@gmail.com wrote:
> Hello,
>
> There has to be a more R'ish way to do this. I have two matrices: one has the values I want, but I want to NA some of them. The other matrix has binary values that tell me if I want to NA the values in the other matrix. I produce a third matrix based on this. I've also tried apply() passing in c(1,2) for rows and columns, with no success yet.
>
> Example (this works, but I'm looking for a better/faster solution):
>
>     a = matrix(1:6,2,3)
>     colnames(a) = c('a','b','c')
>     b = matrix(c(1,0,1,0,0,1),2,3)
>     colnames(b) = colnames(a)
>     c = matrix(0,nrow(a),ncol(a))
>     for(cl in 1:ncol(a)){
>       for(rw in 1:nrow(a)){
>         c[rw,cl] = ifelse(b[rw,cl]==1,a[rw,cl],NA)
>       }
>     }

You're making it far too complicated. No need for loops or apply() or anything like that.

    c <- a
    c[b == 0] <- NA
    c
          a  b  c
    [1,]  1  3 NA
    [2,] NA NA  6

And thanks for the reproducible small example.

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org
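For later readers, a short sketch of the logical-indexing approach from the reply above next to two equivalent base R spellings. The result names m1, m2 and m3 are assumptions chosen for illustration (they also avoid masking the base function c()); everything else comes from the example in the thread.

```r
a <- matrix(1:6, 2, 3, dimnames = list(NULL, c("a", "b", "c")))
b <- matrix(c(1, 0, 1, 0, 0, 1), 2, 3, dimnames = dimnames(a))

# Logical indexing: copy a, then set the cells where b is 0 to NA.
m1 <- a
m1[b == 0] <- NA

# The replacement form of is.na() expresses the same thing.
m2 <- a
is.na(m2) <- b == 0

# ifelse() is vectorised over whole matrices, so the double loop
# in the question is not needed for this version either.
m3 <- ifelse(b == 1, a, NA)
```

All three operate on the whole matrix at once, which is where the speed-up over the element-by-element loop comes from.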
Re: [R] question about string to boor?
Thanks guys, that's a great help.

Nellie
[R] get(a[1]) : object 'a[1]' not found
In the help for get(), the following example is given:

    a <- 1:4
    assign("a[1]", 2)
    a[1] == 2          #FALSE
    get("a[1]") == 2   #TRUE

However, executing that last line for me gives

    Error in get("a[1]") : object 'a[1]' not found
[R] New package announcement: R2STATS, a GUI for fitting GLM and GLMM
Dear R-users,

I wanted to inform you that a new package called R2STATS is available, as a graphical front-end for the glm() and glmer() functions.

The GUI is based on the RGtk2 and gWidgets packages by Michael Lawrence and John Verzani, and so requires that the GTK+ library be installed first on your system. This is done automagically when installing the RGtk2 package (or the script mentioned below). It also uses the RGtk2Extras package by Tom Taverner to provide editable grids for data frames.

This GUI is intended to provide an easy way to fit and compare GLM and GLMM models. The GLMM part is based on Douglas Bates' lme4 package and the glmer() function. Automatic plots are also drawn for every model, and you can switch from one plot to the other by just clicking on the model name. I found this feature quite useful when teaching: it helps students to get an immediate understanding of differences between models.

Note that this GUI is left (deliberately) simple and is not intended to provide a full-featured GUI (please consider using Rcmdr instead for a far more advanced GUI). But it tries to do well the one and only thing it was designed to do: fitting and comparing models. Note that most standard statistical tests may well be presented as a simple comparison between GLMs, and this is the way I go with my students here. This allows an integrated presentation for almost all common (and simple) situations in social sciences.

More information is available on my webpage: http://yvonnick.noel.free.fr/r2stats [in French for the moment, although the package is in English].

Installing the package is done from a temporary repository:

    install.packages("R2STATS", repos="http://yvonnick.noel.free.fr/cran", dep=TRUE)

if you already have a recent version of GTK+ and RGtk2 installed, or by:

    source("http://yvonnick.noel.free.fr/r2stats/installwin.R")

for an automatic script that downloads and installs everything.

I will submit it to CRAN as soon as I have fixed some minor issues with R-devel (but the package works flawlessly with the current R-2.13.2).

Any comment is welcome. Also, if you are willing to contribute a translation into your language, please let me know.

Best,

Yvonnick Noel, PhD.
University of Brittany
Rennes, France
Re: [R] get(a[1]) : object 'a[1]' not found
Hi,

On Tue, Oct 11, 2011 at 12:31 PM, Timothy Bates timothy.c.ba...@gmail.com wrote:
> In the help for get(), the following example is given:
>
>     a <- 1:4
>     assign("a[1]", 2)
>     a[1] == 2          #FALSE
>     get("a[1]") == 2   #TRUE
>
> However, executing that last line for me gives
>
>     Error in get("a[1]") : object 'a[1]' not found

That's actually in the help for assign(). But anyway, help files are checked before distribution, so something is likely odd about your session.

Is this in an empty session? OS, version, etc.? sessionInfo() at the very least. What does ls() look like? Did you get any other warnings or errors?

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org
[R] plots of correlation matrices
Hi,

I want to do a visualisation of a matrix plot made up of several plots of correlation matrices (using corrplot()). My data is in csv format. Here's an example:

    id,category,attribute1,attribute2,attribute3,attribute4
    661,SCHS,43.2,0,56.5,1
    12202,SCHS,161.7,5.7,155,16
    1182,SCHS,21.4,0,29,0
    1356,SSS, 8.8182,0.1818,10.6667,0.6667
    1864,SCHS,443.7273,9.9091,537,46
    12360,SOA,6.6364,0,10,0
    3382,SOA,7.1667,0,26,0.5
    1033,SOA,63.9231,1.5385,91.5,11.5
    14742,SSS,4.3846,0,8,0
    12760,SSS,425.0714,1.7857,297.5,3.5

I can get rid of the id, but I need the 'category' as a way of distinguishing the various correlation matrices. I can do a plot of the correlation matrix using the corrplot() function in the corrplot package (ignoring the id and category). But what I need is a matrix of the plots of each correlation matrix based on the category, i.e. I have three categories in the data, hence I will need three plots of the correlation matrix in one diagram (because the correlation matrix only makes sense if they are distinguished by category).

Any help?

Regards
Gawesh
Re: [R] get(a[1]) : object 'a[1]' not found
On 11/10/2011 12:31 PM, Timothy Bates wrote:
> In the help for get(), the following example is given:
>
>     a <- 1:4
>     assign("a[1]", 2)
>     a[1] == 2          #FALSE
>     get("a[1]") == 2   #TRUE
>
> However, executing that last line for me gives
>
>     Error in get("a[1]") : object 'a[1]' not found

What did the second line say? It's the line that created the `a[1]` object.

Duncan Murdoch
Re: [R] An issue regarding to gradient
On Oct 11, 2011, at 6:02 AM, luke1022 wrote:

> The following code will get me a curve plot:
>
>     cutoff <- seq(1,7,0.25)
>     Sensitivity <- 1 - pnorm(cutoff, 5, 0.8)
>     Specificity <- pnorm(cutoff, 3, 1.2)
>     plot(1-Specificity, Sensitivity, main = "ROC curve", type = "o")
>
> How do I get a gradient of a particular point on that curve?

First you need to define what you mean by gradient at a point when the gradient is discontinuous at each point. Is this a numerical example and you want to take the means of the slopes on either side (rather like the definition of the Dirac function at x=0)? In which case these are the slopes _between_, not _at_, the points:

    diff(Sensitivity)/diff(1-Specificity)

Or is this a homework problem and you are being asked to use the knowledge that those (Sensitivity, Specificity) points came from particular pnorm functions?

--
David Winsemius, MD
West Hartford, CT
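For later readers, a sketch of both readings of "gradient" raised in the reply above: the slopes of the line segments between consecutive plotted points, and the slope of the smooth curve if the two normal distributions in the poster's code are taken as the model. The function name roc_slope and its argument k are assumptions for illustration. With x(k) = 1 - pnorm(k, 3, 1.2) and y(k) = 1 - pnorm(k, 5, 0.8), the chain rule gives dy/dx = (dy/dk) / (dx/dk) = dnorm(k, 5, 0.8) / dnorm(k, 3, 1.2).

```r
cutoff      <- seq(1, 7, 0.25)
Sensitivity <- 1 - pnorm(cutoff, 5, 0.8)
Specificity <- pnorm(cutoff, 3, 1.2)

# Slopes of the segments between consecutive points on the plot
# (rise in Sensitivity over run in 1 - Specificity).
seg_slope <- diff(Sensitivity) / diff(1 - Specificity)

# Slope of the underlying smooth curve at cutoff k, assuming the
# normal model used to generate the points.
roc_slope <- function(k) dnorm(k, 5, 0.8) / dnorm(k, 3, 1.2)

# Compare the two around the segment from cutoff 4 to 4.25.
mid_slope <- roc_slope(4.125)
seg_13    <- seg_slope[13]
```

The segment slope and the analytic slope at the segment midpoint should agree closely wherever the cutoff grid is fine relative to the curvature.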
Re: [R] filtering rows
On Oct 11, 2011, at 11:18 AM, Samir Benzerfa wrote:

> Hi everyone,
>
> I've got two data sets as below. My question now is: how can I use Dataset2 as a filter for Dataset1? My goal is just to keep the rows of Dataset1 where the first column (Date) matches the Dates in Dataset2.

Perhaps:

    merge(Dataset1, Dataset2)

Or:

    Dataset1[ Dataset1$Date %in% Dataset2$Date , ]

--
David.

> I would appreciate any solutions to this issue. Many thanks!
>
> S.B.
>
> Dataset1:
>
>   Date  A  B  C  D
> 1 1977 10 11 12 13
> 2 1978 14 15 16 17
> 3 1979 18 19 20 21
> 4 1980 22 23 24 25
> 5 1981 26 27 28 29
>
> Dataset2:
>
>   Date
> 1 1977
> 2 1978
> 3 1979

David Winsemius, MD
West Hartford, CT
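For later readers, a runnable version of the two suggestions above, built on the small data sets from the question. The result names kept_in and kept_merge are assumptions for illustration. Note that the %in% filter has to test Dataset1's dates against Dataset2's dates.

```r
Dataset1 <- data.frame(
  Date = 1977:1981,
  A = c(10, 14, 18, 22, 26),
  B = c(11, 15, 19, 23, 27),
  C = c(12, 16, 20, 24, 28),
  D = c(13, 17, 21, 25, 29)
)
Dataset2 <- data.frame(Date = 1977:1979)

# Keep rows of Dataset1 whose Date appears anywhere in Dataset2.
kept_in <- Dataset1[Dataset1$Date %in% Dataset2$Date, ]

# merge() joins on the shared column name (Date) and keeps only matches.
kept_merge <- merge(Dataset1, Dataset2)
```

The two differ when Dataset2 contains repeated dates: merge() then repeats the matching rows of Dataset1, while the %in% filter returns each row of Dataset1 at most once.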
[R] Help to write to a file
Dear all:

I am having some problems using the function sink(). Basically I am doing a loop over two files which contain unit-root variables. Then, in a loop, I extract every i-th element of both files to create an object called z. If z meets some requirements, then I perform a unit root test (ADF test), otherwise not. As this process is repeated several times, for each i I want to get the summary of the ADF test in a common file. For that I use the function sink(). My code runs fine, but I do not get anything written to the text file where my results are supposed to be saved. The code is below:

    setwd("C:\\Users\\Sergio René\\Dropbox\\R")
    library(urca)
    P1 <- read.csv("2R_EQ_P_R1_500.csv")
    P2 <- read.csv("2R_EQ_P_R2_500.csv")
    d <- (1:1000)
    sink("ADF_results_b_1.txt")
    for (i in seq(d)) {
      z.1 <- P1[i]*-1 - P2[i]*-1
      if (all(z.1 >= 0)) {r=1} else {if (all(z.1 <= 0)) {r=1} else {r=2}}
      if (r==1) {ADF <- ur.df(ts(z.1), lags=1, type='drift')}
      if (r==1) {summary(ADF)}
    }
    sink()

Any suggestion of what I might be doing wrong?

Best regards,

Sergio René
[R] restricted cubic spline within survfit.cph in the package rms
Hello, does anyone have an example of how to use the restricted cubic spline function rcs within survfit.cph, if cph (Cox Proportional Hazard Regression) was fitted with restricted cubic splines (which I got to work)? Thank you.
[R] suggestions for ANOVA which includes the year as a factor
Dear R Foundation,

I am a post-doc researcher at the University of Pisa, Italy. I apologize for my English, and I have to tell you in advance that I am very much a beginner with R. I have used R for fitting dose-response curves (drc package) and for ordinary ANOVA (one, two or three factors), including post-hoc mean comparison (I used the LSD test).

Now I have to process some simple data on tomato yield. I have just three different treatments (weed control) and three different years of experiment. My questions are:

How can I include the factor year in the ANOVA? Do you think a mixed model could be suitable? If so, which texts, papers, references, manuals, etc. could you suggest?

How can I compare means in a mixed model (for example with the LSD test)?

Thank you very much and sorry for bothering you.

Sincerely,
Marco

Marco Fontanelli
Sezione Meccanica Agraria e Meccanizzazione Agricola
Dipartimento di Agronomia e Gestione dell'Agro-Ecosistema
Facoltà di Agraria
Università di Pisa
tel: 050 2218922
cell: 338 8832323
mail: mfontane...@agr.unipi.it
Re: [R] Background Colors
Hi,

Yes, one way to do that is by using function polygon().

Regards,
Carlos Ortega
www.qualityexcellence.es

2011/10/11 Gabriel Yospin yosp...@gmail.com
> Is there a way to make different parts of the background for the plot different colors? For example, I'd like to have the background color col = (250,250,0,50) for y = c(200,204), and col = (250,125,0,50) for y = c(210,212).
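For later readers, a minimal sketch of the polygon() suggestion above, using base graphics only. It assumes the poster's colour tuples are red, green, blue and alpha values on a 0-255 scale, which is an interpretation rather than something stated in the thread; the names band1, band2 and usr are likewise just illustrative.

```r
numYears <- 500
plot(x = c(1, numYears), y = c(200, 300),
     xlab = "Time", ylab = "Vegetation Class",
     xlim = c(100, 600), ylim = c(200, 300), type = "n")

# Translucent colours: red, green, blue, alpha on a 0-255 scale (assumed).
band1 <- rgb(250, 250, 0, 50, maxColorValue = 255)
band2 <- rgb(250, 125, 0, 50, maxColorValue = 255)

# par("usr") gives the plot region as c(x1, x2, y1, y2), so the bands
# can span the full width of the plot.
usr <- par("usr")

# polygon() takes the corner coordinates in order around the shape.
polygon(x = usr[c(1, 2, 2, 1)], y = c(200, 200, 204, 204),
        col = band1, border = NA)

# rect() is a shorthand for axis-aligned rectangles.
rect(xleft = usr[1], ybottom = 210, xright = usr[2], ytop = 212,
     col = band2, border = NA)
```

Drawing the bands right after the empty plot (type = "n") and before any points() or lines() calls keeps the data on top of the coloured background.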
Re: [R] Help to write to a file
Hi,

Inside a loop, you must explicitly wrap your summary() command, and anything else from which you expect output, in a print() command.

Sarah

2011/10/11 Sergio René Araujo Enciso araujo.enc...@gmail.com:
> My code runs fine, but I do not get anything written on the text file where my results are supposed to be saved.

--
Sarah Goslee
http://www.functionaldiversity.org
Re: [R] Help to write to a file
Untested, but does adding a print() around summary() get it done?

Michael

On Oct 11, 2011, at 1:03 PM, Sergio René Araujo Enciso araujo.enc...@gmail.com wrote:
> My code runs fine, but I do not get anything written on the text file where my results are supposed to be saved.
Re: [R] Help to write to a file
On Oct 11, 2011, at 1:03 PM, Sergio René Araujo Enciso wrote:

> My code runs fine, but I do not get anything written on the text file where my results are supposed to be saved. The code is below
>
>     sink("ADF_results_b_1.txt")
>     for (i in seq(d)) {
>       z.1 <- P1[i]*-1 - P2[i]*-1
>       if (all(z.1 >= 0)) {r=1} else {if (all(z.1 <= 0)) {r=1} else {r=2}}
>       if (r==1) {ADF <- ur.df(ts(z.1), lags=1, type='drift')}
>       if (r==1) {summary(ADF)}

You may need to print() that summary object inside the for loop.

    summary(a)
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
       1.00    1.75    2.50    2.50    3.25    4.00

    sink("test.txt"); for(i in 1) summary(a); sink()             # No test.txt file created
    sink("test2.txt"); for(i in 1) print( summary(a) ); sink()   # The expected file created

This relates to the FAQ about similar puzzling behavior when plotting lattice, grid or ggplot objects.

>     }
>     sink()
>
> Any suggestion of what I might be doing wrong?
>
> Best regards,
>
> Sergio René

David Winsemius, MD
West Hartford, CT
Re: [R] Help to write to a file
On 11/10/2011 1:03 PM, Sergio René Araujo Enciso wrote:
> My code runs fine, but I do not get anything written on the text file where my results are supposed to be saved.
>
>     if (r==1) {summary(ADF)}
>     }
>     sink()
>
> Any suggestion of what I might be doing wrong?

You aren't printing anything. In a loop, you need to call print() explicitly; only the last value of an expression auto-prints.

Duncan Murdoch
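For later readers, a small self-contained sketch of the point made in the replies above: inside a for loop a bare summary() call computes its value and discards it, so nothing reaches the sink() file unless print() is called. It uses a temporary file and toy numbers instead of the poster's csv files and urca objects; out_file and written are illustrative names.

```r
out_file <- tempfile(fileext = ".txt")   # stand-in for the results file

sink(out_file)                           # divert console output to the file
for (i in 1:3) {
  summary(c(i, i + 1))                   # computed, then silently discarded
  print(summary(c(i, i + 10)))           # explicitly printed, so it is written
}
sink()                                   # restore normal console output

written <- readLines(out_file)
```

Only the three printed summaries end up in the file. An alternative that avoids sink() altogether would be to collect the text with capture.output() and write it with writeLines().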
Re: [R] get(a[1]) : object 'a[1]' not found
So, I cleared out the workspace, and now it's working. It must have been an obscure workspace conflict. Thanks for the quick, helpful replies.

    a <- 1:4
    assign("a[1]", 2)
    a[1] == 2
    [1] FALSE
    get("a[1]") == 2
    [1] TRUE

On 11 Oct 2011, at 5:45 PM, Duncan Murdoch wrote:
> What did the second line say? It's the line that created the `a[1]` object.
>
> Duncan Murdoch
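For later readers, a short sketch of what the help-page example in this thread is demonstrating, using base R only: assign() with a string creates a variable whose name is literally a[1], which is a different object from the first element of the vector a. The variable names has_odd_name and via_backticks are assumptions for illustration.

```r
a <- 1:4

# Creates a new variable literally named "a[1]"; the vector a is untouched.
assign("a[1]", 2)

first_is_two <- (a[1] == 2)          # FALSE: a[1] is still 1
odd_is_two   <- (get("a[1]") == 2)   # TRUE: the oddly named variable holds 2

# The oddly named object shows up in the workspace and can also be
# reached with backticks.
has_odd_name  <- exists("a[1]")
via_backticks <- `a[1]`

# To change the first element of the vector itself, use ordinary
# replacement indexing instead of assign().
a2 <- a
a2[1] <- 2L
```

If the assign() line is skipped or fails, get("a[1]") has nothing to find, which produces the "object 'a[1]' not found" error quoted above.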
Re: [R] Help to write to a file
OK, I see my mistake. I just did as you suggested and it works. Thanks for the answers, everyone.

Best,

Sergio René
[R] stop()
Suppose I have a function, such as the toy example below:

    myFun <- function(x, max.iter = 5) {
      for(i in 1:10){
        result <- x + i
        iter <- i
        if(iter == max.iter) stop('Max reached')
      }
      result
    }

I can of course do this:

    myFun(10, max.iter = 11)

However, if I reach the maximum number of iterations before my algorithm has finished (in my real application there are EM steps for a mixed model), I actually want the function to return the value of result up to that point. Currently, using stop(), I would get

    myFun(10, max.iter = 4)
    Error in myFun(10, max.iter = 4) : Max reached

But in this toy case the function should return the value of result up to iteration 4. Not sure how I can adjust this.

Thanks,
Harold
Re: [R] stop()
You could use return(), e.g.,

    myFun <- function (x, max.iter = 5) {
      for (i in 1:10) {
        result <- x + i
        iter <- i
        if (iter == max.iter) {
          return(result)
        }
      }
      result
    }

    myFun(10, max.iter = 4)

I hope it helps.

Best,
Dimitris

On 10/11/2011 7:31 PM, Doran, Harold wrote:
> However, if I reach the maximum number of iterations before my algorithm has finished (in my real application there are EM steps for a mixed model), I actually want the function to return the value of result up to that point.

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/
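For later readers, a variant of the return() suggestion above that also tells the caller the iteration limit was hit. This is only a sketch under the toy setup from the question: using warning() together with break is an assumption about what might be wanted, not something proposed in the thread.

```r
myFun <- function(x, max.iter = 5) {
  result <- x
  for (i in 1:10) {
    result <- x + i
    if (i == max.iter) {
      # Signal the early exit, but keep the value computed so far.
      warning("Max iterations reached; returning the current value")
      break
    }
  }
  result
}

res_full <- myFun(10, max.iter = 11)                    # loop runs all 10 iterations
res_cut  <- suppressWarnings(myFun(10, max.iter = 4))   # stops after iteration 4
```

Unlike stop(), warning() does not abort the function, so the partial result is still returned; a caller who prefers silence can wrap the call in suppressWarnings() as shown.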
Re: [R] stop()
Thanks, Dimitris. Very helpful on something I *should* know by now.

> -----Original Message-----
> From: Dimitris Rizopoulos [mailto:d.rizopou...@erasmusmc.nl]
> Sent: Tuesday, October 11, 2011 1:43 PM
> To: Doran, Harold
> Cc: r-help@r-project.org
> Subject: Re: [R] stop()
>
> You could use return(), e.g., [...]
Re: [R] help to ... import the data from Excel
Sarah_R_edu wrote on 10/11/2011 04:57:08 AM:
> Hi everyone,
> I have a problem importing data from Excel into R. I have done the following:
>
>     1. install.packages("xlsReadWrite")
>     2. library(xlsReadWrite)
>     3. z <- read.xls(ReadXls, LTS, colNames=FALSE, sheet, type, form, rowNames=FALSE)
>
> and I got this result:
>
>     Error in read.xls(ReadXls, LTS, colNames = FALSE, rowNames = FALSE) :
>       object 'LTS' not found
>
> I also tried
>
>     data(LTS, package = "xlsReadWrite")
>
> and got:
>
>     Warning message:
>     In data(LTS, package = "xlsReadWrite") : data set 'LTS' not found
>
> How do I get LTS into the list of objects?
> Note: LTS is the name of my data file in Excel.
> I tried another way, as follows:
>
>     mydata <- read.table("C:\Users\user\Desktop\LTS.xls")
>
> but it is not working. How can I do it?
> My regards

Try this:

    z <- read.xls(file="C:\\Users\\user\\Desktop\\LTS.xls", colNames=FALSE, rowNames=FALSE)

Jean
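A side note on the path in the reply above: inside an R string a single backslash starts an escape sequence, so "C:\Users\..." fails to parse (the "\U" is read as the start of a Unicode escape). The sketch below only illustrates the two usual spellings of the same Windows path; the path is the one from the post, and the variable names are made up for the illustration. It does not read the file, and read.table() would not be expected to parse a binary .xls file in any case.

```r
# Two equivalent ways to write the Windows path from the post.
p_escaped <- "C:\\Users\\user\\Desktop\\LTS.xls"  # each "\\" is one literal backslash
p_forward <- "C:/Users/user/Desktop/LTS.xls"      # forward slashes also work on Windows

# Convert backslashes to forward slashes; fixed = TRUE avoids regex escaping.
p_converted <- gsub("\\", "/", p_escaped, fixed = TRUE)

same_path <- identical(p_converted, p_forward)
file_name <- basename(p_forward)

# Checking that the file exists first gives a clearer message than a failed import.
if (!file.exists(p_forward)) message("File not found: ", p_forward)
```

Either spelling can be passed as the file argument of whatever import function is used.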
Re: [R] SLOW split() function
I do not know if stripping down functions is generally recommended, but it is not too difficult to do if you know that you can make assumptions. Here is an example (I also found a fast way to convert the data table to a matrix, again if some assumptions can be made). Using the stripped-down function, you can get coefficients and standard errors in less time than you can get just coefficients using default lm. It is hugely less flexible.

Cheers,
Josh

    library(data.table)

    ## stripped down lm and summary.lm (for standard errors)
    minimal.lm <- function(y, x) {
      dims <- dim(x)
      x <- unlist(x, FALSE, FALSE)
      dim(x) <- dims
      obj <- lm.fit(x = x, y = y)
      resvar <- sum(obj$residuals^2)/obj$df.residual
      p <- obj$rank
      ## internal entry point used by chol2inv(), as called in R at the time of posting
      R <- .Call("La_chol2inv", x = obj$qr$qr[1L:p, 1L:p, drop = FALSE],
                 size = p, PACKAGE = "base")
      m <- min(dim(R))
      d <- c(R)[1L + 0L:(m - 1L) * (dim(R)[1L] + 1L)]
      se <- sqrt(d * resvar)
      cbind(coef = obj$coefficients, se)
    }

    N <- 1000*100
    d <- data.table(data.frame(
      key = as.integer(runif(N, min=1, max=N/10)),
      x = rnorm(N),
      y = rnorm(N)
    ))  # irregular

    ## add intercept column
    d$int <- 1L
    setkey(d, key); gc()  ## sort and force a garbage collection

    cat("N=", N, ". Size of d=", object.size(d)/1024/1024, "MB\n")
    print(system.time(si <- split(seq(nrow(d)), d$key)))

    cat("\n\t(b) Regressions:\n")
    ## using lm
    print(system.time(all.2b <- lapply(si, function(.indx) {
      coef(lm(y ~ x, data=d[.indx,])) })))
    ## using minimal.lm---faster and gives standard errors
    print(system.time(all.2c <- lapply(si, function(.indx) {
      minimal.lm(y = d[.indx, y], x = d[.indx, list(int, x)]) })))

Timings on my system:

    > print(system.time(all.2b <- lapply(si, function(.indx) {
    +   coef(lm(y ~ x, data=d[.indx,])) })))
       user  system elapsed
      67.87    0.01   68.46
    > print(system.time(all.2c <- lapply(si, function(.indx) {
    +   minimal.lm(y = d[.indx, y], x = d[.indx, list(int, x)]) })))
       user  system elapsed
      47.72    0.00   48.00

On Tue, Oct 11, 2011 at 8:56 AM, ivo welch ivo.we...@gmail.com wrote:
> thanks, josh. in my posting example, I did not need anything except coefficients. (when this is the case, I usually do not even use lm.fit, but I eliminate all missing obs first and then use solve crossprod(y,cbind(1,x)) crossprod(cbind(1,x)).) this is pretty fast.)
>
> alas, I will need to figure how to get coef standard errors faster in this case. summary.lm() is really slow.
>
> regards, /iaw
> Ivo Welch (ivo.we...@gmail.com) http://www.ivo-welch.info/
> J. Fred Weston Professor of Finance
> Anderson School at UCLA, C519
>
> On Mon, Oct 10, 2011 at 11:30 PM, Joshua Wiley jwiley.ps...@gmail.com wrote:
>> As another followup, given that you are doing numerous regression models and (I presume) working with finance/stock data that is strictly numeric (no need for special contrast coding, etc.), you can substantially reduce the time spent estimating the coefficients. A simple way is to use lm.fit directly instead of lm. For lm.fit, you pass the y and x (design) matrices directly. This skips a good deal of overhead. Here is one naive way; I imagine more speedups could be gained by incorporating the intercept (1 vector) into d instead of cbind()ing it. The catch is that lm.fit requires matrices, not data tables, so what you gain may be lost in having to do an extra conversion. In any case, here are the times on my system for the two options (note I used N = 1000 * 100 because I am presently on a glorified netbook).
>>
>>     > print(system.time(all.2b <- lapply(si, function(.indx) {
>>     +   coef(lm(y ~ x, data=d[.indx,])) })))
>>        user  system elapsed
>>       69.00    0.00   69.56
>>     > print(system.time(all.2c <- lapply(si, function(.indx) {
>>     +   coef(lm.fit(y = d[.indx, y], x = cbind(1, d[.indx, x]))) })))
>>        user  system elapsed
>>       37.83    0.03   38.36
>>
>> The column names for the coefficients will not be the same as from lm, but the estimates should be identical. While this is not recommended in typical usage, in an application like regressions on rolling time windows, etc., where you know the data are not changing, I think it makes sense to bypass the clever "determine your data and best methods to use" step and go straight to passing the design matrix. Since you do not need residuals, variances, etc., it may be possible to speed this up even more, perhaps bypassing dqrls altogether.
>>
>> Cheers,
>> Josh
>>
>> On Mon, Oct 10, 2011 at 9:56 PM, ivo welch ivo.we...@gmail.com wrote:
>>> thank you, everyone. this was very helpful to my specific task and understanding. for the benefit of future googlers, I thought I would post some experiments and results here. ultimately, I need to do a by() on an irregular matrix, and I now know how to speed up by() on a single-core, and then again on a multi-core machine.
>>>
>>>     library(data.table)
>>>     N <- 1000*1000
>>>     d <- data.table(data.frame(
>>>       key = as.integer(runif(N, min=1, max=N/10)),
>>>       x = rnorm(N),
>>>       y = rnorm(N)
>>>     )) #
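The split-then-lm.fit pattern discussed in this thread can be tried on a small scale with base R only. The following is a minimal sketch, not the code from the thread: it uses a plain data.frame instead of data.table, a much smaller N, and made-up names (coefs, first_ref) so that it runs quickly anywhere.

```r
# Small-scale sketch of per-group regressions via split() + lm.fit().
set.seed(42)
N <- 1000
d <- data.frame(key = sample.int(20, N, replace = TRUE),
                x = rnorm(N),
                y = rnorm(N))

# Row indices for each key, as in the thread.
si <- split(seq_len(nrow(d)), d$key)

# One (intercept, slope) pair per group; lm.fit() needs an explicit design matrix.
coefs <- t(sapply(si, function(idx) {
  X <- cbind(int = 1, x = d$x[idx])
  lm.fit(x = X, y = d$y[idx])$coefficients
}))

# Reference fit for the first group using the full lm() machinery.
first_ref <- coef(lm(y ~ x, data = d[si[[1]], ]))
```

The estimates agree with lm(); only the overhead of formula parsing and model-frame construction is skipped, which is where the speedup reported above would come from.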
Re: [R] Background Colors
Carlos Ortega wrote on 10/11/2011 11:30:46 AM:
> Hi,
> Yes, one way to do that is by using the function polygon().
> Regards,
> Carlos Ortega
> www.qualityexcellence.es
>
> 2011/10/11 Gabriel Yospin yosp...@gmail.com
>> Hi R-Help -
>> If I make a plot:
>>
>>     numYears = 500
>>     plot(x = c(1,numYears), y = c(200,300), xlab = "Time",
>>          ylab = "Vegetation Class", xlim = c(100,600), ylim = c(200,300), type="n")
>>
>> Is there a way to make different parts of the background for the plot different colors? For example, I'd like to have the background color col = (250,250,0,50) for y = c(200,204), and col = (250,125,0,50) for y = c(210,212). Any suggestions?
>> Thanks in advance for the help,
>> Gabe
>> --
>> Gabriel I. Yospin
>> Institute of Ecology and Evolution
>> Bridgham Lab
>> University of Oregon
>> Eugene, OR 97403-5289
>> Ph: 541 346 1549
>> Fax: 541 346 2364

For example:

    plot(1, 1, xlab="Time", ylab="Vegetation Class",
         xlim=c(100, 600), ylim=c(200, 300), type="n")
    xrange <- par("usr")[1:2]
    polygon(c(xrange, rev(xrange)), c(200, 200, 204, 204),
            col=rgb(250, 250, 0, 50, maxColorValue=255), border=NA)
    polygon(c(xrange, rev(xrange)), c(210, 210, 212, 212),
            col=rgb(250, 125, 0, 50, maxColorValue=255), border=NA)
    points(10*(20:30), 10*(20:30))

Jean
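Since the bands in this example are axis-aligned rectangles, base graphics' rect() is an alternative to polygon() that some may find easier to read. This is only a sketch of the same idea; the names usr, band1 and band2 are introduced here for illustration, and the colours and y-ranges are the ones from the question.

```r
# Same plot as above, with the coloured bands drawn by rect() instead of polygon().
plot(1, 1, xlab = "Time", ylab = "Vegetation Class",
     xlim = c(100, 600), ylim = c(200, 300), type = "n")

usr <- par("usr")  # c(x_left, x_right, y_bottom, y_top) of the plot region

# Semi-transparent colours; the fourth value is alpha on a 0-255 scale.
band1 <- rgb(250, 250, 0, 50, maxColorValue = 255)
band2 <- rgb(250, 125, 0, 50, maxColorValue = 255)

# rect(xleft, ybottom, xright, ytop): span the full x-range of the plot region.
rect(usr[1], 200, usr[2], 204, col = band1, border = NA)
rect(usr[1], 210, usr[2], 212, col = band2, border = NA)

# Draw the data last so the bands sit behind the points.
points(10 * (20:30), 10 * (20:30))
```

Drawing the bands before the data is the important part in either version; anything plotted earlier would be tinted by the semi-transparent fill.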
[R] singular gradient error in nls
I am trying to fit a nonlinear regression to infiltration data in order to determine saturated hydraulic conductivity and matric pressure. The original equation can be found in Bagarello et al. 2004 SSSAJ (Green-Ampt equation for falling head including gravity). I am also VERY new to R and to nonlinear regressions. I have searched the posts, but am still unable to determine why my data come up with the singular gradient error.

Here are the data:

    time <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19)  #time in minutes
    cumul <- c(2, 5, 7, 9.5, 11, 13, 14, 15, 16, 18.5, 21, 23, 24.5, 26.5, 28,
               29.5, 31, 31.5, 32.5)  #cumulative infiltration in cm per min
    df <- data.frame(time, cumul)
    df$cumul.m <- df$cumul/100/60  #convert to meters per second
    df$time.s <- df$time*60  #convert to seconds
    b2 <- 1-(0.196/(0.06/0.01131))  #relationship between soil moisture and the size of the ring infiltrometer (6 cm radius by 113.1 cm2 cross sectional area)
    theta <- 0.196  #difference in residual soil water and field capacity

Here is the formula:

    #Where a = K_fs and b = psi_f
    nlsfit <- nls(time.s ~ (theta/a*b2)*((cumul.m/theta)-(((0.16-b)/b2)*log(1+((cumul.m*b2)/(theta*(0.16-b)))))),
                  data = df, start = list(a=1, b=0.5), trace=TRUE)

I am likely over-parameterizing, but I must admit that I am not entirely sure what that means. Any help offered would be greatly appreciated. I am sorry if I sound naive, but I am an ecologist, not a hydrologist.

Kate
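One way to probe a "singular gradient" message is to look at the gradient matrix of the model at the starting values: if its columns are almost proportional, the data cannot tell the parameters apart there. The sketch below does that for the model as posted, with the closing parentheses of the formula balanced as an assumption. It is a diagnostic, not a fix, and model() and jacobian_at() are hypothetical helper names introduced here.

```r
# Data and constants as given in the post.
cumul <- c(2, 5, 7, 9.5, 11, 13, 14, 15, 16, 18.5, 21, 23, 24.5, 26.5, 28,
           29.5, 31, 31.5, 32.5)
cumul.m <- cumul / 100 / 60
theta <- 0.196
b2 <- 1 - (0.196 / (0.06 / 0.01131))

# Right-hand side of the posted formula as a function of the two parameters.
model <- function(a, b) {
  (theta / a * b2) * ((cumul.m / theta) -
    (((0.16 - b) / b2) * log(1 + ((cumul.m * b2) / (theta * (0.16 - b))))))
}

# Central finite-difference gradient with respect to a and b.
jacobian_at <- function(a, b, h = 1e-6) {
  cbind(a = (model(a + h, b) - model(a - h, b)) / (2 * h),
        b = (model(a, b + h) - model(a, b - h)) / (2 * h))
}

J <- jacobian_at(a = 1, b = 0.5)

# Correlation between the two gradient columns; values near +/-1 mean
# the parameters are nearly indistinguishable at these starting values.
col_cor <- cor(J[, "a"], J[, "b"])

# Fitted values at the starting point, for comparison with time.s (60 to 1140).
pred_range <- range(model(1, 0.5))
```

With these inputs the two columns come out almost perfectly correlated, and the predictions at the starting values are many orders of magnitude away from the observed times. Both are plausible contributors to the error, which suggests checking the units and choosing starting values closer to physically reasonable ones before refitting.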
[R] How to calculate percentage variation in a zero-inflated negative binomial regression model
I am a novice in R, using R 2.13.1 on Windows. I wish to calculate the percentage of variation in a zero-inflated negative binomial regression model that is explained by the two predictors in my model. My response variable was the number of dung-piles per km, and the predictor of excess zeros was distance to major road (km).

Thanks in advance.
Boafo
[R] Creating the mean using algebra matrix
Dear all,

I wanted to create the mean using matrix algebra, so I tried this:

    meanAnimals <- new3 %*% factorial

(this calculates the matrix multiplication of new3 and factorial). But I get the following error message:

    Error in new3 %*% factorial : non-conformable arguments

These are my matrices:

    > new3
               [,1]   [,2]
     [1,]     1.350    8.1
     [2,]   465.000  423.0
     [3,]    36.330  119.5
     [4,]    27.660  115.0
     [5,]     1.040    5.5
     [6,] 11700.000   50.0
     [7,]  2547.000 4603.0
     [8,]   187.100  419.0
     [9,]   521.000  655.0
    [10,]    10.000  115.0
    [11,]     3.300   25.6
    [12,]   529.000  680.0
    [13,]   207.000  406.0
    [14,]    62.000 1320.0
    [15,]  6654.000 5712.0
    [16,]  9400.000   70.0
    [17,]     6.800  179.0
    [18,]    35.000   56.0
    [19,]     0.120    1.0
    [20,]     0.023    0.4
    [21,]     2.500   12.1
    [22,]    55.500  175.0
    [23,]   100.000  157.0
    [24,]    52.160  440.0
    [25,]     0.280    1.9
    [26,] 87000.000  154.5
    [27,]     0.122    3.0
    [28,]   192.000  180.0

    > factorial
         [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17]
    [1,]    1    1    1    1    1    1    1    1    1     1     1     1     1     1     1     1     1
         [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26] [,27] [,28]
    [1,]     1     1     1     1     1     1     1     1     1     1     1

Can anyone help me out with this?

Cheers,
maria

--
View this message in context: http://r.789695.n4.nabble.com/Creating-the-mean-using-algebra-matrix-tp3895378p3895378.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] stop()
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Dimitris Rizopoulos
> Sent: Tuesday, October 11, 2011 10:43 AM
> To: Doran, Harold
> Cc: r-help@r-project.org
> Subject: Re: [R] stop()
>
> You could use return(), e.g.,
>
>     myFun <- function (x, max.iter = 5) {
>       for (i in 1:10) {
>         result <- x + i
>         iter <- i
>         if (iter == max.iter) {
>           return(result)
>         }
>       }
>       result
>     }
>
>     myFun(10, max.iter = 4)
>
> I hope it helps.
> Best, Dimitris

Or, just use break:

    myFun <- function (x, max.iter = 5) {
      for (i in 1:10) {
        result <- x + i
        iter <- i
        if (iter == max.iter) break
      }
      result
    }

Hope this is helpful,
Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204

> On 10/11/2011 7:31 PM, Doran, Harold wrote:
>> [...] I actually want the function to return the value of result up to that point. [...]
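For an iterative fit such as the EM steps mentioned in the question, it can help to return the iteration count and a convergence flag along with the partial result, so the caller can tell an early exit from a normal finish. A minimal sketch built on the toy function from this thread; the list element names (result, iterations, converged) are illustrative choices, not anything from the original posts.

```r
# Toy function from the thread, returning the partial result plus diagnostics.
myFun <- function(x, max.iter = 5) {
  converged <- TRUE
  for (i in 1:10) {
    result <- x + i
    if (i == max.iter) {
      converged <- FALSE  # stopped because the iteration cap was hit
      break
    }
  }
  # After the loop, i holds the last iteration that ran.
  list(result = result, iterations = i, converged = converged)
}

early <- myFun(10, max.iter = 4)   # cap reached at iteration 4
full  <- myFun(10, max.iter = 11)  # cap never reached, loop runs to 10
```

The caller can then branch on early$converged instead of catching an error.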
Re: [R] SLOW split() function
On Wed, Oct 12, 2011 at 4:56 AM, ivo welch ivo.we...@gmail.com wrote:
> thanks, josh. in my posting example, I did not need anything except coefficients. (when this is the case, I usually do not even use lm.fit, but I eliminate all missing obs first and then use solve crossprod(y,cbind(1,x)) crossprod(cbind(1,x)).) this is pretty fast.)

solve(cbind(1,x), y) should be even faster, and more numerically stable, [and less likely to make certain people want to cast you into the outer darkness, where there is SAS and gnashing of teeth]

> alas, I will need to figure how to get coef standard errors faster in this case. summary.lm() is really slow.

The code from summary.lm that actually computes the standard errors is fairly efficient; you could extract that.

-thomas

--
Thomas Lumley
Professor of Biostatistics
University of Auckland
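A sketch of what extracting the standard-error computation might look like, using only documented base R functions: lm.fit() for the fit, chol2inv() on the R factor of its QR decomposition for (X'X)^-1, and qr.solve() as a QR-based least-squares solver for rectangular design matrices. The function name fast_coef_se and the simulated data are assumptions made for the illustration, and the sketch assumes a full-rank design matrix with no missing values.

```r
# Coefficients and standard errors without the overhead of lm() / summary.lm().
fast_coef_se <- function(y, X) {
  fit <- lm.fit(x = X, y = y)
  p <- fit$rank
  # Residual variance: residual sum of squares over residual degrees of freedom.
  resvar <- sum(fit$residuals^2) / fit$df.residual
  # (X'X)^{-1} recovered from the upper-triangular R factor of the QR decomposition.
  XtX_inv <- chol2inv(fit$qr$qr[seq_len(p), seq_len(p), drop = FALSE])
  se <- sqrt(diag(XtX_inv) * resvar)
  cbind(coef = fit$coefficients, se = se)
}

set.seed(1)
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50)
X <- cbind(int = 1, x = x)

est <- fast_coef_se(y, X)

# Least-squares coefficients only, via QR, for a rectangular design matrix.
b <- qr.solve(X, y)

# Reference values from the standard route.
ref <- coef(summary(lm(y ~ x)))
```

On this example the estimates and standard errors match those from summary(lm()).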
Re: [R] Creating the mean using algebra matrix
On Oct 11, 2011, at 1:45 PM, flokke wrote:
> Dear all, I wanted to create the mean using a algebra matrix. so I tried this one:
>
>     meanAnimals <- new3 %*% factorial
>
> (Calculates the matrix multiplication of the new3 * factorial). But I get the following error message:
>
>     Error in new3 %*% factorial : non-conformable arguments

You probably want to transpose `factorial`. I don't understand how the result would be particularly interesting, however.

-- David.

> These are my matrices: [...]
> Can anyone help me out of this? Cheers, maria

David Winsemius, MD
West Hartford, CT
Re: [R] Creating the mean using algebra matrix
To do the matrix multiplication m %*% n, the number of columns of m must equal the number of rows of n.

Sent from my iPhone

On 11 Oct 2011, at 06:45 PM, flokke ingaschw...@gmail.com wrote:
> Dear all, I wanted to create the mean using a algebra matrix. so I tried this one:
>
>     meanAnimals <- new3 %*% factorial
>
> [...] But I get the following error message:
>
>     Error in new3 %*% factorial : non-conformable arguments
>
> These are my matrices: [...]
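To make the dimension rule concrete: with a 28 x 2 data matrix and a 1 x 28 row of ones, the product only conforms when the row of ones comes first (1 x 28 times 28 x 2 gives 1 x 2). The sketch below uses a small 3 x 2 stand-in for new3 and a hypothetical name, ones, in place of the poster's factorial (which also masks the base function of that name).

```r
# Column means by matrix multiplication, on a small stand-in for new3.
new3 <- matrix(c(1, 2, 3,
                 4, 5, 6), nrow = 3)             # 3 x 2
ones <- matrix(1, nrow = 1, ncol = nrow(new3))   # 1 x 3 row of ones

# (1 x 3) %*% (3 x 2) conforms and gives the column sums as a 1 x 2 matrix.
col_means <- (ones %*% new3) / nrow(new3)

# The reversed order does not conform: (3 x 2) %*% (1 x 3).
reversed <- try(new3 %*% ones, silent = TRUE)

# Transposing the ones alone does not help either: (3 x 2) %*% (3 x 1).
transposed <- try(new3 %*% t(ones), silent = TRUE)
```

In everyday use colMeans(new3) gives the same numbers directly.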
[R] plot methods for summary of rms objects
The integration of plot methods for various outputs is a greatly appreciated aspect of the rms package. I particularly like to use

    plot(summary(model))

for my own purposes, but for publication or presentation I need to modify details like variable names, or the number of significant digits used in the figure annotations. Is there a simple way to modify the plot inputs arising from summary, or is it necessary to hack the summary object?

Thanks
Re: [R] singular gradient error in nls
Katie:

I would say that this is not an R question, so I would suggest that either
a) you ask it on a statistics help website like stats.stackexchange.com, or
b) you consult with someone locally who knows about nonlinear regression (possibly a statistician, but not necessarily so).

-- Bert

On Tue, Oct 11, 2011 at 11:34 AM, Katie Tully katherinetu...@gmail.com wrote:
> I am trying to fit a nonlinear regression to infiltration data in order to determine saturated hydraulic conductivity and matric pressure. [...] I have searched the posts, but am still unable to determine why my data come up with the singular gradient error. [...]
> Kate
[R] high and lowest with names
Hello,

I'm looking to get the values, row names and column names of the largest and smallest values in a matrix. Example (except it does not include the names):

    > x <- swiss$Education[1:25]
    > dat = matrix(x,5,5)
    > colnames(dat) = c('a','b','c','d','c')
    > rownames(dat) = c('z','y','x','w','v')
    > dat
       a  b  c  d  c
    z 12  7  6  2 10
    y  9  7 12  8  3
    x  5  8  7 28 12
    w  7  7 12 20  6
    v 15 13  5  9  1

    #top 10
    > sort(dat,partial=n-9:n)[(n-9):n]
     [1]  9 10 12 12 12 12 13 15 20 28

    # bottom 10
    > sort(dat,partial=1:10)[1:10]
     [1] 1 2 3 5 5 6 6 7 7 7

...except I need the rownames and colnames to go along for the ride with the values. Because of this, I am guessing the return value will need to be a list, since all of the values have different row and col names (which is fine).

Regards,
Ben
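One possible approach, sketched here with base R only: order() gives the linear positions of the largest or smallest entries, and arrayInd() converts those positions into row and column numbers that can be used to look up the dimnames. The helper name ranked() is made up for this sketch, the matrix is typed in from the printout above rather than taken from swiss, and the last column is renamed "e" (an assumption) so that the column names are unique.

```r
# Matrix values copied from the printout in the post (filled column by column).
dat <- matrix(c(12,  9,  5,  7, 15,
                 7,  7,  8,  7, 13,
                 6, 12,  7, 12,  5,
                 2,  8, 28, 20,  9,
                10,  3, 12,  6,  1),
              nrow = 5, ncol = 5,
              dimnames = list(c("z", "y", "x", "w", "v"),
                              c("a", "b", "c", "d", "e")))

# Return the n largest (or smallest) entries together with their row/column names.
ranked <- function(m, n = 10, decreasing = TRUE) {
  idx <- order(m, decreasing = decreasing)[seq_len(n)]  # linear indices into m
  pos <- arrayInd(idx, dim(m))                          # matching (row, col) pairs
  data.frame(row = rownames(m)[pos[, 1]],
             col = colnames(m)[pos[, 2]],
             value = m[idx])
}

top10    <- ranked(dat, 10, decreasing = TRUE)
bottom10 <- ranked(dat, 10, decreasing = FALSE)
```

A data frame keeps each value next to its row and column name, which may be more convenient than a list; ties at the cut-off (for example the two 9s) are broken by position.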
[R] replicate data.frame n times
Hi,

is there a way to replicate a data.frame like you can replicate the entries of a vector (with the rep() function)? I want to do this:

    x <- data.frame(x, x)

(where x is a data.frame), but n times. And it should be as CPU- and memory-efficient as possible, since n is pretty big in my case.

Thanks for any suggestions!
Re: [R] restricted cubic spline within survfit.cph in the package rms
It may be best to either write to the package maintainer (me, as you did) or post to the group, but not both.

Frank

Stan Maydan-2 wrote:
> Hello, does anyone have an example on how to use restricted cubic splines function rcs within survfit.cph, if cph (Cox Proportional Hazard Regression) was done with restricted cubic splines (which I made to work)? Thank you.

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: http://r.789695.n4.nabble.com/restricted-cubic-spline-within-survfit-cph-in-the-package-rms-tp3895252p3895797.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] matrix multiplication
Your question was answered by Timothy in your previous thread:
http://r.789695.n4.nabble.com/Re-Creating-the-mean-using-algebra-matrix-td3895689.html

flokke wrote:
> Dear all,
> Sorry to bother you with such a stupid question, but I just cannot find the solution to my problem. I'd like to use matrix multiplication for meanA and factorial3. I use the command meanA %*% factorial3. But everything I get is:
>
>     Error in factorial3 %*% A : non-conformable arguments
>
> I know that the number of columns of the first matrix has to be the same as the number of rows of the second to be able to use matrix multiplication, but that is the case here. I also tried it with two columns for factorial3 and that didn't work either. Can someone help me out with this?
>
> These are my matrices:
>
>     > meanA
>          [,1] [,2]
>     [1,] 3.67 4.67
>     > factorial3
>          [,1]
>     [1,]    1
>
> Thank you so much!
> Cheers, maria

--
View this message in context: http://r.789695.n4.nabble.com/matrix-multiplication-tp3895833p3895860.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] plots of correlation matrices
Hi,

One way to do that is this (avoiding the use of a for loop):

    l.txt <- "id category attribute1 attribute2 attribute3 attribute4
    661 SCHS 43.2 0 56.5 1
    12202 SCHS 161.7 5.7 155 16
    1182 SCHS 21.4 0 29 0
    1356 SSS 8.8182 0.1818 10.6667 0.6667
    1864 SCHS 443.7273 9.9091 537 46
    12360 SOA 6.6364 0 10 0
    3382 SOA 7.1667 0 26 0.5
    1033 SOA 63.9231 1.5385 91.5 11.5
    14742 SSS 4.3846 0 8 0
    12760 SSS 425.0714 1.7857 297.5 3.5"

    dat.df <- read.table(textConnection(l.txt), header=T, as.is = TRUE)
    closeAllConnections()
    dat.lt <- by(dat.df[,3:6], dat.df$category, cor)
    lapply(dat.lt, corrplot)

Regards,
Carlos Ortega
www.qualityexcellence.es

2011/10/11 gj gaw...@gmail.com
> Hi,
> I want to do a visualisation of a matrix plot made up of several plots of correlation matrices (using corrplot()). My data is in csv format. Here's an example:
>
>     id,category,attribute1,attribute2,attribute3,attribute4
>     661,SCHS,43.2,0,56.5,1
>     12202,SCHS,161.7,5.7,155,16
>     1182,SCHS,21.4,0,29,0
>     1356,SSS, 8.8182,0.1818,10.6667,0.6667
>     1864,SCHS,443.7273,9.9091,537,46
>     12360,SOA,6.6364,0,10,0
>     3382,SOA,7.1667,0,26,0.5
>     1033,SOA,63.9231,1.5385,91.5,11.5
>     14742,SSS,4.3846,0,8,0
>     12760,SSS,425.0714,1.7857,297.5,3.5
>
> I can get rid of the id. But I need the 'category' as a way of distinguishing the various correlation matrices. I can do a plot of the correlation matrix using the corrplot() function in the corrplot package (ignoring the id and category). But what I need is a matrix of the plots of each correlation matrix based on the category, i.e. I have three categories in the data, hence I will need three plots of the correlation matrix in one diagram (because the correlation matrix only makes sense if they are distinguished by category).
> Any help?
> Regards
> Gawesh
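To get the three correlation plots into a single figure, one option is to set up a plotting grid with par(mfrow = ...) before drawing. A minimal sketch with the example data: it calls corrplot::corrplot() only if that package happens to be installed and otherwise falls back to base image(), so it runs either way. The names cors and op, and the one-row layout, are choices made for the illustration.

```r
# Example data from the post.
dat.df <- read.table(text = "id category attribute1 attribute2 attribute3 attribute4
661 SCHS 43.2 0 56.5 1
12202 SCHS 161.7 5.7 155 16
1182 SCHS 21.4 0 29 0
1356 SSS 8.8182 0.1818 10.6667 0.6667
1864 SCHS 443.7273 9.9091 537 46
12360 SOA 6.6364 0 10 0
3382 SOA 7.1667 0 26 0.5
1033 SOA 63.9231 1.5385 91.5 11.5
14742 SSS 4.3846 0 8 0
12760 SSS 425.0714 1.7857 297.5 3.5", header = TRUE)

# One correlation matrix per category.
cors <- lapply(split(dat.df[, 3:6], dat.df$category), cor)

# One row of panels, one panel per category; remember the old settings.
op <- par(mfrow = c(1, length(cors)))
for (nm in names(cors)) {
  if (requireNamespace("corrplot", quietly = TRUE)) {
    corrplot::corrplot(cors[[nm]], title = nm, mar = c(0, 0, 2, 0))
  } else {
    # Base-graphics fallback when corrplot is not installed.
    image(cors[[nm]], main = nm, zlim = c(-1, 1))
  }
}
par(op)  # restore the previous layout
```

With only three or four rows per category, the correlations themselves are of course very noisy; the sketch is about the layout only.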
Re: [R] stop()
Replace stop() with break to see if that does what you want (you may also want to include cat() or warning() to indicate the early stopping).

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Doran, Harold
> Sent: Tuesday, October 11, 2011 11:32 AM
> To: r-help@r-project.org
> Subject: [R] stop()
>
> [...] I actually want the function to return the value of result up to that point. [...]
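A minimal sketch of the break-plus-warning idea on the toy function from this thread. The warning text is made up; everything else follows the original example.

```r
# Leave the loop early, but tell the caller that the iteration cap was hit.
myFun <- function(x, max.iter = 5) {
  for (i in 1:10) {
    result <- x + i
    if (i == max.iter) {
      warning("Max iterations reached; returning the result so far")
      break
    }
  }
  result
}

# The value is still returned; the warning is reported alongside it.
res <- suppressWarnings(myFun(10, max.iter = 4))

# A caller can also intercept the warning explicitly.
flag <- tryCatch(myFun(10, max.iter = 4), warning = function(w) "warned")
```

Unlike stop(), warning() does not abort the function, so the partial result still reaches the caller.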
Re: [R] replicate data.frame n times
Replicate the row indices?

    x[rep(seq_len(nrow(x)), k), ]

-- Bert

On Tue, Oct 11, 2011 at 12:55 PM, Martin Batholdy <batho...@googlemail.com> wrote:

> Hi,
>
> is there a way to replicate a data.frame like you can replicate the
> entries of a vector (with the repeat-function)?
>
> I want to do this:
>
>     x <- data.frame(x, x)
>
> (where x is a data.frame), but n times.
> And it should be as cpu / memory efficient as possible, since n is pretty
> big in my case.
>
> thanks for any suggestions!

--
"Men by nature long to get on to the ultimate truths, and will often be impatient with elementary studies or fight shy of them. If it were possible to reach the ultimate truths without the elementary studies usually prefixed to them, these would not be preparatory studies but superfluous diversions."
-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics
467-7374
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
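A small sketch of the index-replication idea above, on a made-up two-row data frame (the names x, k, rows_rep and cols_rep are only for illustration). Bert's expression repeats rows; since the question's data.frame(x, x) binds copies side by side, the analogous column-wise form is shown as well, as an assumption about what was wanted.

```r
# Illustrative data: any data frame would do.
x <- data.frame(a = 1:2, b = c("u", "v"))
k <- 3

# Repeat the rows k times by repeating the row indices.
rows_rep <- x[rep(seq_len(nrow(x)), k), ]

# Repeat the columns k times by repeating the column indices;
# duplicated column names are made unique automatically (a, b, a.1, b.1, ...).
cols_rep <- x[rep(seq_len(ncol(x)), k)]

dim(rows_rep)  # 6 rows, 2 columns
dim(cols_rep)  # 2 rows, 6 columns
```

Both forms are a single indexing operation, so they avoid growing the object in a loop.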
Re: [R] plots of correlation matrices
> Hi,
>
> One way to do that is this (avoiding the use of a for loop):
>
>     l.txt <- "id category attribute1 attribute2 attribute3 attribute4
>     661 SCHS 43.2 0 56.5 1
>     12202 SCHS 161.7 5.7 155 16
>     1182 SCHS 21.4 0 29 0
>     1356 SSS 8.8182 0.1818 10.6667 0.6667
>     1864 SCHS 443.7273 9.9091 537 46
>     12360 SOA 6.6364 0 10 0
>     3382 SOA 7.1667 0 26 0.5
>     1033 SOA 63.9231 1.5385 91.5 11.5
>     14742 SSS 4.3846 0 8 0
>     12760 SSS 425.0714 1.7857 297.5 3.5"
>
>     dat.df <- read.table(textConnection(l.txt), header=T, as.is = TRUE)
>     closeAllConnections()
>
>     dat.lt <- by(dat.df[,3:6], dat.df$category, cor)

I guess Gawesh is looking for ?layout or ?par:

    par(mfrow=c(2,2))
    lapply(dat.lt, corrplot)

>     lapply(dat.lt, corrplot)
>
> Regards,
> Carlos Ortega
> www.qualityexcellence.es
>
> 2011/10/11 gj <gaw...@gmail.com>
>
>> Hi,
>>
>> I want to do a visualisation of a matrix plot made up of several plots
>> of correlation matrices (using corrplot()). My data is in csv format.
>> Here's an example:
>>
>> id,category,attribute1,attribute2,attribute3,attribute4
>> 661,SCHS,43.2,0,56.5,1
>> 12202,SCHS,161.7,5.7,155,16
>> 1182,SCHS,21.4,0,29,0
>> 1356,SSS, 8.8182,0.1818,10.6667,0.6667
>> 1864,SCHS,443.7273,9.9091,537,46
>> 12360,SOA,6.6364,0,10,0
>> 3382,SOA,7.1667,0,26,0.5
>> 1033,SOA,63.9231,1.5385,91.5,11.5
>> 14742,SSS,4.3846,0,8,0
>> 12760,SSS,425.0714,1.7857,297.5,3.5
>>
>> I can get rid of the id. But I need the 'category' as a way of
>> distinguishing the various correlation matrices.
>>
>> I can do a plot of the correlation matrix using the corrplot() function
>> in the corrplot package (ignoring the id and category).
>>
>> But what I need is a matrix of the plots of each correlation matrix
>> based on the category, i.e. I have three categories in the data, hence I
>> will need three plots of the correlation matrix in one diagram (because
>> the correlation matrices only make sense if they are distinguished by
>> category).
>>
>> Any help?
>>
>> Regards
>> Gawesh
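Putting the two suggestions in this thread together, a hedged sketch: compute one correlation matrix per category, then draw them side by side with par(mfrow = ...). It uses split() plus lapply() instead of by() only to get a plain named list; the plotting part is skipped when the corrplot package is not installed, and the title and mar arguments are just one way to label the panels.

```r
# Data from the original question, read from inline CSV text.
dat.df <- read.csv(text = "id,category,attribute1,attribute2,attribute3,attribute4
661,SCHS,43.2,0,56.5,1
12202,SCHS,161.7,5.7,155,16
1182,SCHS,21.4,0,29,0
1356,SSS,8.8182,0.1818,10.6667,0.6667
1864,SCHS,443.7273,9.9091,537,46
12360,SOA,6.6364,0,10,0
3382,SOA,7.1667,0,26,0.5
1033,SOA,63.9231,1.5385,91.5,11.5
14742,SSS,4.3846,0,8,0
12760,SSS,425.0714,1.7857,297.5,3.5")

# One correlation matrix per category, as a named list.
attr_cols <- grep("^attribute", names(dat.df), value = TRUE)
dat.lt <- lapply(split(dat.df[attr_cols], dat.df$category), cor)

# Draw the matrices in a single row, one panel per category.
if (requireNamespace("corrplot", quietly = TRUE)) {
  op <- par(mfrow = c(1, length(dat.lt)))
  for (nm in names(dat.lt)) {
    corrplot::corrplot(dat.lt[[nm]], title = nm, mar = c(0, 0, 2, 0))
  }
  par(op)  # restore the previous layout
}
```

With only three or four rows per category the correlations themselves are very unstable, so this is a layout demonstration rather than an analysis.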
Re: [R] Problem with twitteR package
Check the version of libcurl you have installed. If you have an older version, some of the options may not be present.

On Sun, Oct 9, 2011 at 10:39 AM, Steven Oliver <s1oli...@ucsd.edu> wrote:

> Hey Guys,
>
> I just started fooling around with the twitteR package in order to get a
> record of all tweets from a single public account. When I run
> userTimeline, I get the default 20 most recent tweets just fine. However,
> when I specify an arbitrary number of tweets (as described in the
> documentation from June 14th, 2011), I get the following warning:
>
>     > bjaTweets <- userTimeline("BeijingAir", n=50)
>     Warning message:
>     In mapCurlOptNames(names(.els), asNames = TRUE) :
>       Unrecognized CURL options: n
>
> Does anyone familiar with the twitteR package know what is going on with
> options? Alternatively, are there any other simple means for getting this
> sort of data?
>
> Steve
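One way to check the libcurl version from within R, as a hedged sketch: libcurlVersion() is part of base R in current releases (it did not exist when this thread was written), and RCurl::curlVersion() reports the library RCurl itself was built against. The second call is skipped if RCurl is not installed.

```r
# libcurl version that R itself is linked against (empty string if none).
v <- libcurlVersion()
print(as.character(v))

# libcurl version as seen by the RCurl package, if it is available.
if (requireNamespace("RCurl", quietly = TRUE)) {
  print(RCurl::curlVersion()$version)
}
```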
Re: [R] plot methods for summary of rms objects
On Oct 11, 2011, at 3:20 PM, Rob James wrote:

> The integration of plot methods for various outputs from rms packages is
> a greatly appreciated aspect of the rms package. I particularly like to
> use:
>
>     plot(summary(model))
>
> for my own purposes, but... for publication/presentation I need to modify
> details like variable names, or the number of significant digits used in
> the figure annotations. Is there a simple way to modify the plot inputs
> arising from summary, or is it necessary to hack the summary object?

If you type:

    methods(summary)

... you should see why it might be very difficult to answer your question in its current state of vagueness.

I just ran the example in help(summary.rms) and it appears that it used base graphics and that if such output is your target, you would need to either hack the code or hack the pdf file. Much of the graphical output from rms functions has been ported to lattice graphics, but apparently not the version for summary.rms objects.

If you have the data and can redo the analysis, read on. The comparison levels used by summary.rms are set with datadist, and that is probably what you should be spending some time understanding. There are other possibilities than just using a datadist(data_object) call.

    ?datadist

--
David Winsemius, MD
West Hartford, CT
Re: [R] controling text in facets (ggplot2)
In the absence of a reproducible example, a general question induces a general response. I'd suggest creating a small data frame that contains the x and y coordinates, a third variable consisting of expressions representing each fitted model, and an indicator of the group (facet) to which each expression is to be applied. Use this data frame as the data argument of geom_text, and set x, y and label (the variable containing the expressions) as the aesthetics of the geom.

If that doesn't work, provide a reproducible example and you'll undoubtedly get a more accurate answer. You're also more likely to get a higher response rate if you post on the ggplot2 group: http://had.co.nz/ggplot2/ (see the Mailing List paragraph near the top of the page for subscription information).

Dennis

On Tue, Oct 11, 2011 at 5:45 AM, Thomthom <rime.tho...@gmail.com> wrote:

> Hi R-helpers!
>
> Here is my problem: I have a graph with 3 different facets where there
> are 3 different regression lines. My goal is to show separately in each
> facet the equation that describes its line. So far, I managed to add a
> line and the same equation to all my facets, but unfortunately that's not
> what I want.
>
> Is there a way to do that? Any suggestion would be gladly welcome!
>
> Thanks for your help!
>
> Thomas
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/controling-text-in-facets-ggplot2-tp3894148p3894148.html
> Sent from the R help mailing list archive at Nabble.com.
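The recipe above could look roughly like this. It is only a sketch on the built-in mtcars data (not the poster's data), with cyl as the facet variable; the labels data frame and its column names are made up for illustration. Each facet gets its own equation because the label data frame carries the facet variable, and parse = TRUE makes geom_text interpret the strings as plotmath expressions. The plot is skipped if ggplot2 is not installed.

```r
# One regression per facet group; keep the intercept and slope.
fits <- lapply(split(mtcars, mtcars$cyl),
               function(d) coef(lm(mpg ~ wt, data = d)))

# One row per facet: where to put the text and what it should say.
labels <- data.frame(
  cyl = as.numeric(names(fits)),          # must match the facet variable
  x   = max(mtcars$wt),
  y   = max(mtcars$mpg),
  lab = vapply(fits,
               function(b) sprintf("y == %.2f %+.2f * x", b[1], b[2]),
               character(1))
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(mtcars, aes(wt, mpg)) +
    geom_point() +
    geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
    geom_text(data = labels, aes(x = x, y = y, label = lab),
              parse = TRUE, hjust = 1, vjust = 1, inherit.aes = FALSE) +
    facet_wrap(~ cyl)
  print(p)
}
```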
Re: [R] high and lowest with names
Hi,

With this code you can find the row and column names for the largest value, applied to your example:

    r.m.tmp <- apply(dat, 1, max)
    r.max <- names(r.m.tmp)[r.m.tmp == max(r.m.tmp)]

    c.m.tmp <- apply(dat, 2, max)
    c.max <- names(c.m.tmp)[c.m.tmp == max(c.m.tmp)]

It's immediate how to get the same for the smallest value and to build a function that calculates everything and returns a list.

Regards,
Carlos Ortega
www.qualityexcellence.es

2011/10/11 Ben qant <ccqu...@gmail.com>

> Hello,
>
> I'm looking to get the values, row names and column names of the largest
> and smallest values in a matrix.
>
> Example (except it does not include the names):
>
>     > x <- swiss$Education[1:25]
>     > dat = matrix(x,5,5)
>     > colnames(dat) = c('a','b','c','d','c')
>     > rownames(dat) = c('z','y','x','w','v')
>     > dat
>        a  b  c  d  c
>     z 12  7  6  2 10
>     y  9  7 12  8  3
>     x  5  8  7 28 12
>     w  7  7 12 20  6
>     v 15 13  5  9  1
>
>     > n <- length(dat)
>     > # top 10
>     > sort(dat, partial = (n-9):n)[(n-9):n]
>      [1]  9 10 12 12 12 12 13 15 20 28
>     > # bottom 10
>     > sort(dat, partial = 1:10)[1:10]
>      [1] 1 2 3 5 5 6 6 7 7 7
>
> ...except I need the rownames and colnames to go along for the ride with
> the values... because of this, I am guessing the return value will need
> to be a list since all of the values have different row and col names
> (which is fine).
>
> Regards,
>
> Ben
Re: [R] high and lowest with names
But it's simpler and probably faster to use R's built-in capabilities.

    ?which   ## note the arr.ind argument!

As an example:

    test <- matrix(rnorm(24), nr = 4)
    which(test == max(test), arr.ind = TRUE)
         row col
    [1,]   2   6

So this gives the row and column indices of the max, from which row and column names can easily be obtained from the dimnames attribute of the matrix.

Note: This assumes that the object in question is a matrix, NOT a data frame, for which it would be slightly more complicated.

-- Bert

On Tue, Oct 11, 2011 at 3:06 PM, Carlos Ortega <c...@qualityexcellence.es> wrote:

> Hi,
>
> With this code you can find the row and column names for the largest
> value, applied to your example:
>
>     r.m.tmp <- apply(dat, 1, max)
>     r.max <- names(r.m.tmp)[r.m.tmp == max(r.m.tmp)]
>
>     c.m.tmp <- apply(dat, 2, max)
>     c.max <- names(c.m.tmp)[c.m.tmp == max(c.m.tmp)]
>
> It's immediate how to get the same for the smallest value and to build a
> function that calculates everything and returns a list.
>
> Regards,
> Carlos Ortega
> www.qualityexcellence.es
>
> 2011/10/11 Ben qant <ccqu...@gmail.com>
>
>> Hello,
>>
>> I'm looking to get the values, row names and column names of the largest
>> and smallest values in a matrix.
>>
>> Example (except it does not include the names):
>>
>>     > x <- swiss$Education[1:25]
>>     > dat = matrix(x,5,5)
>>     > colnames(dat) = c('a','b','c','d','c')
>>     > rownames(dat) = c('z','y','x','w','v')
>>     > dat
>>        a  b  c  d  c
>>     z 12  7  6  2 10
>>     y  9  7 12  8  3
>>     x  5  8  7 28 12
>>     w  7  7 12 20  6
>>     v 15 13  5  9  1
>>
>>     > n <- length(dat)
>>     > # top 10
>>     > sort(dat, partial = (n-9):n)[(n-9):n]
>>      [1]  9 10 12 12 12 12 13 15 20 28
>>     > # bottom 10
>>     > sort(dat, partial = 1:10)[1:10]
>>      [1] 1 2 3 5 5 6 6 7 7 7
>>
>> ...except I need the rownames and colnames to go along for the ride with
>> the values... because of this, I am guessing the return value will need
>> to be a list since all of the values have different row and col names
>> (which is fine).
>>
>> Regards,
>>
>> Ben
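To make the last step of Bert's suggestion concrete, here is a short sketch that looks up the row and column names from the indices returned by which(..., arr.ind = TRUE). It rebuilds the matrix from the question; the object names idx and hi are only illustrative.

```r
# Matrix from the original question (note the duplicated column name 'c').
dat <- matrix(swiss$Education[1:25], 5, 5,
              dimnames = list(c("z", "y", "x", "w", "v"),
                              c("a", "b", "c", "d", "c")))

# Row/column positions of the maximum (one row per tie).
idx <- which(dat == max(dat), arr.ind = TRUE)

# Attach the names by indexing into the dimnames.
hi <- data.frame(value = dat[idx],
                 row   = rownames(dat)[idx[, "row"]],
                 col   = colnames(dat)[idx[, "col"]])
hi
```

For this matrix the maximum is 28, in row "x" and column "d". Replacing max() with min() gives the same lookup for the smallest value.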
Re: [R] Creating the mean using algebra matrix
On 12/10/11 08:31, Timothy Bates wrote:

> To do matrix multiplication: m x n, the Rows and columns of m must be
> equal to the columns and rows of n, respectively.

No. The number of columns of m must equal the number of rows of n, that's all. The number of *rows* of m and the number of *columns* of n can be anything you like.

cheers,

Rolf Turner
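A small illustration of the conformability rule, tied to the thread's subject (computing a mean with matrix algebra). The vector x and the helper names are made up for the example.

```r
x    <- c(2, 4, 6, 8)
n    <- length(x)
ones <- matrix(1, nrow = n, ncol = 1)   # n x 1 column of ones

# (1 x n) %*% (n x 1): columns of the left factor match rows of the right.
m <- (t(ones) %*% x) / n                # 1 x 1 matrix holding the mean

# (n x 1) %*% (n x 1) is not conformable: 1 column on the left, n rows on the right.
bad <- tryCatch(ones %*% ones, error = function(e) e)

drop(m)                  # same value as mean(x)
inherits(bad, "error")   # TRUE
```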
Re: [R] round() and negative digits
On 11/10/11 08:17, Michael Friendly wrote:

> On 10/9/2011 6:18 AM, Prof Brian Ripley wrote:
>> Sometimes it is better not to document things than try to give precise
>> details which may get changed *and* there will be useRs who misread (and
>> maybe even file bug reports on their misreadings). The source is the
>> ultimate documentation.
>
> I can't agree with this less. The source does the computation. The
> documentation says how to use it and what it should do. Corner cases can
> be trapped in code or mentioned in Notes. But the source is only useful
> if you can easily find it and then can understand what it is doing,
> particularly for a .Primitive like round(). The source is only the
> documentation of last resort.

I agree. It seems to me that saying that the source is the ultimate documentation is rather like (in pure mathematics) saying that all maths follows from the Zermelo-Fraenkel axioms plus the Axiom of Choice, so those axioms are all that we need to tell anyone.

cheers,

Rolf Turner
Re: [R] round() and negative digits
On 11-10-11 7:14 PM, Rolf Turner wrote:

> On 11/10/11 08:17, Michael Friendly wrote:
>> On 10/9/2011 6:18 AM, Prof Brian Ripley wrote:
>>> Sometimes it is better not to document things than try to give precise
>>> details which may get changed *and* there will be useRs who misread
>>> (and maybe even file bug reports on their misreadings). The source is
>>> the ultimate documentation.
>>
>> I can't agree with this less. The source does the computation. The
>> documentation says how to use it and what it should do. Corner cases can
>> be trapped in code or mentioned in Notes. But the source is only useful
>> if you can easily find it and then can understand what it is doing,
>> particularly for a .Primitive like round(). The source is only the
>> documentation of last resort.
>
> I agree. It seems to me that saying that the source is the ultimate
> documentation is rather like (in pure mathematics) saying that all maths
> follows from the Zermelo-Fraenkel axioms plus the Axiom of Choice, so
> those axioms are all that we need to tell anyone.

R is an open source project. That means we expect people to look at the source, to answer some of their own questions, to suggest improvements, to point out errors. If you don't look at it, you aren't holding up your side of the bargain.

Duncan Murdoch
Re: [R] high and lowest with names
which.max is even faster:

    dims <- c(1000, 1000)
    tt <- array(rnorm(prod(dims)), dims)
    # which
    system.time( replicate(100, which(tt == max(tt), arr.ind = TRUE)) )
    # which.max (& arrayInd)
    system.time( replicate(100, arrayInd(which.max(tt), dims)) )

Best,
Denes

> But it's simpler and probably faster to use R's built-in capabilities.
>
>     ?which   ## note the arr.ind argument!
>
> As an example:
>
>     test <- matrix(rnorm(24), nr = 4)
>     which(test == max(test), arr.ind = TRUE)
>          row col
>     [1,]   2   6
>
> So this gives the row and column indices of the max, from which row and
> column names can easily be obtained from the dimnames attribute of the
> matrix.
>
> Note: This assumes that the object in question is a matrix, NOT a data
> frame, for which it would be slightly more complicated.
>
> -- Bert
>
> On Tue, Oct 11, 2011 at 3:06 PM, Carlos Ortega
> <c...@qualityexcellence.es> wrote:
>
>> Hi,
>>
>> With this code you can find the row and column names for the largest
>> value, applied to your example:
>>
>>     r.m.tmp <- apply(dat, 1, max)
>>     r.max <- names(r.m.tmp)[r.m.tmp == max(r.m.tmp)]
>>
>>     c.m.tmp <- apply(dat, 2, max)
>>     c.max <- names(c.m.tmp)[c.m.tmp == max(c.m.tmp)]
>>
>> It's immediate how to get the same for the smallest value and to build a
>> function that calculates everything and returns a list.
>>
>> Regards,
>> Carlos Ortega
>> www.qualityexcellence.es
>>
>> 2011/10/11 Ben qant <ccqu...@gmail.com>
>>
>>> Hello,
>>>
>>> I'm looking to get the values, row names and column names of the
>>> largest and smallest values in a matrix.
>>>
>>> Example (except it does not include the names):
>>>
>>>     > x <- swiss$Education[1:25]
>>>     > dat = matrix(x,5,5)
>>>     > colnames(dat) = c('a','b','c','d','c')
>>>     > rownames(dat) = c('z','y','x','w','v')
>>>     > dat
>>>        a  b  c  d  c
>>>     z 12  7  6  2 10
>>>     y  9  7 12  8  3
>>>     x  5  8  7 28 12
>>>     w  7  7 12 20  6
>>>     v 15 13  5  9  1
>>>
>>>     > n <- length(dat)
>>>     > # top 10
>>>     > sort(dat, partial = (n-9):n)[(n-9):n]
>>>      [1]  9 10 12 12 12 12 13 15 20 28
>>>     > # bottom 10
>>>     > sort(dat, partial = 1:10)[1:10]
>>>      [1] 1 2 3 5 5 6 6 7 7 7
>>>
>>> ...except I need the rownames and colnames to go along for the ride
>>> with the values... because of this, I am guessing the return value will
>>> need to be a list since all of the values have different row and col
>>> names (which is fine).
>>>
>>> Regards,
>>>
>>> Ben
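The original question asked for the top and bottom ten values together with their names, not only the single extreme. A hedged sketch along the lines of the arrayInd() idea above: order() gives the linear positions of the sorted values and arrayInd() converts them to row/column indices. The function name top_n_named is hypothetical, and a data frame is returned rather than a list purely for readability.

```r
# Matrix from the original question.
dat <- matrix(swiss$Education[1:25], 5, 5,
              dimnames = list(c("z", "y", "x", "w", "v"),
                              c("a", "b", "c", "d", "c")))

# Values of the n largest (or smallest) cells with their row and column names.
top_n_named <- function(m, n = 10, decreasing = TRUE) {
  ord <- order(m, decreasing = decreasing)[seq_len(n)]  # linear positions
  idx <- arrayInd(ord, dim(m))                          # -> (row, col) pairs
  data.frame(value = m[ord],
             row   = rownames(m)[idx[, 1]],
             col   = colnames(m)[idx[, 2]])
}

top <- top_n_named(dat)                      # ten largest, largest first
bot <- top_n_named(dat, decreasing = FALSE)  # ten smallest, smallest first
```

Ties are kept in the order that order() returns them, so which of several equal values makes the cut at the boundary is arbitrary.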
[R] Nonlinear regression aborting due to error
Colleagues,

I am fitting an Emax model using nls. The code is:

    START <- list(EMAX=INITEMAX, EFFECT=INITEFFECT, C50=INITC50)
    CONTROL <- list(maxiter=1000, warnOnly=T)
    #FORMULA <- as.formula(YVAR ~ EMAX - EFFECT * XVAR^GAMMA / (XVAR^GAMMA + C50^GAMMA))
    ## alternate version of formula
    FORMULA <- as.formula(YVAR ~ EMAX - EFFECT / (1 + (C50/XVAR)^GAMMA))
    FIT <- nls(FORMULA, start=START, control=CONTROL, trace=T)

If GAMMA equals 10-80, nls converges successfully and the fit tracks the fit from a smoother (Supersmoother). However, if I attempt to estimate GAMMA using:

    START <- list(EMAX=INITEMAX, EFFECT=INITEFFECT, C50=INITC50, GAMMA=INITGAMMA)

GAMMA increases rapidly to 500 and nls terminates with:

    Error in chol2inv(object$m$Rmat()) :
      element (4, 4) is zero, so the inverse cannot be computed
    In addition: Warning message:
    In nls(FORMULA, start = START, control = CONTROL, trace = T) :
      singular gradient

I also tried fixing GAMMA to 1000 and I get a similar error message:

    Error in chol2inv(object$m$Rmat()) :
      element (2, 2) is zero, so the inverse cannot be computed
    In addition: Warning message:
    In nls(FORMULA, start = START, control = CONTROL, trace = T) :
      singular gradient

The data do not suggest a very large value for GAMMA so I am surprised that the estimate is increasing so rapidly. I attempted to use the port algorithm with an upper bound on GAMMA but the upper bound is reached rapidly, suggesting that the data support a large value for GAMMA.

A subset of the data (with added noise) is shown below. A GAMMA value of 1280 triggers the error with this subset.

    XVAR <- c(26, 31.3, 20.9, 24.8, 22.9, 4.79, 19.6, 18, 19.6, 9.69, 21.7,
              26.6, 27.8, 9.12, 10.5, 20.1, 16.7, 14.1, 10.2, 19.2, 24.7,
              34.6, 26.6, 25.1, 5.98, 13.4, 15.7, 9.59, 7.39, 21.5, 15.7,
              12.4, 19.2, 17.8, 19.7, 27.1, 25.6, 36.4, 22.9, 8.68, 27, 25.9,
              33.3, 24.2, 21.4, 31, 19.1, 18.7, 23.5, 19.4, 10.3, 12.8, 13.9,
              18.5, 21, 15.2, 18.9, 9.12, 16.9, 12.9, 29.5, 15.5, 7.34, 8.97,
              8.04, 23.7, 16.3, 37.6, 35.2, 13.7, 28.1, 29.5, 15.1, 26, 6.52)

    YVAR <- c(-34.2, -84.2, -71.1, -91.9, -104.1, -23.2, -27.2, -13.4,
              -143.2, 24.7, -72.1, -38, 25.2, -8, -34.1, -15.1, -112.6,
              -93.5, -130.9, -127.8, -118.7, -53.5, -29.8, 98, 0, -37.6,
              -99.4, 57.9, 0.2, -62.2, -27.3, 8.3, -51.6, -111.6, -25.6,
              -51.7, -106.4, -85.1, -63.1, -60.8, -27.7, -20.7, 22.9, -49.4,
              -85.7, -90.9, -107, -20.6, -36.3, -40.2, 39.8, -55, -54.5,
              -103.9, -53.1, -2.3, -72.3, -65.6, -57.8, -64.4, -129.1, 10.4,
              -9.9, -29.6, -40.8, 52, -94, 8.8, -98.8, 28, -16.3, -99.2,
              -48.5, -111.9, -15.4)

I suspect that I am making a conceptual error in the use of nls. Any help would be appreciated. If a different function to fit nonlinear regression would work better, please direct me.

Dennis

Dennis Fisher MD
P < (The "P Less Than" Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com
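One way to see how much information noisy data carry about GAMMA, without relying on nls converging, is to profile the residual sum of squares over a grid. This is offered only as a diagnostic sketch, not as a fix for the error above. For fixed C50 and GAMMA the model YVAR = EMAX - EFFECT * h, with h = 1 / (1 + (C50/XVAR)^GAMMA), is linear in EMAX and EFFECT, so lm() can fit those two directly. The data here are simulated with assumed true values (EMAX 10, EFFECT 80, C50 15, GAMMA 3, noise SD 40); the function name rss_for and the grid limits are illustrative.

```r
set.seed(1)
x <- runif(75, 5, 38)
y <- 10 - 80 / (1 + (15 / x)^3) + rnorm(75, sd = 40)   # simulated, not the posted data

# Residual sum of squares for fixed C50 and GAMMA; EMAX and EFFECT come from lm().
rss_for <- function(c50, gamma, x, y) {
  h <- 1 / (1 + (c50 / x)^gamma)   # sigmoid term; tends to a step as gamma grows
  deviance(lm(y ~ h))              # intercept = EMAX, slope = -EFFECT
}

grid <- expand.grid(C50 = seq(6, 36, by = 2),
                    GAMMA = c(1, 2, 5, 10, 50, 200, 1000))
grid$RSS <- mapply(rss_for, grid$C50, grid$GAMMA,
                   MoreArgs = list(x = x, y = y))

# Best achievable RSS for each GAMMA: a flat profile means GAMMA is poorly determined.
best_by_gamma <- aggregate(RSS ~ GAMMA, data = grid, FUN = min)
best_by_gamma
```

If the minimum RSS barely changes across GAMMA, the optimizer has little to stop it from drifting to very large values, where the curve becomes a step function and the gradient with respect to GAMMA vanishes. That would be consistent with the singular-gradient message, though whether it explains the posted data would need checking on the full data set.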