Re: [R] fitting mixed models to censored data?
Hi Bill, Thanks for your reply. The first place I looked was in the survival package since it can obviously handle censored data. However, I don't have any particular desire to restrict myself to standard survival models just because I have some censoring. Frailties appear to fit in nicely with the types of models typically used with survival data, but that's not the only kind of model I'd like to look at. Thanks, Doug On Mon, 23 Apr 2007, Pikounis, Bill [CNTUS] wrote: > Doug, > In perhaps similar situations where there are clusters of measurements > due to repeated time or space on an individual subject or experimental > unit, I have used the survreg() function from the survival library. > > You can specify left, right, and/or interval censoring within a data set > through Surv(), and so I have used left censoring for the LOD > observations. I was just focused on marginal or population-averaged > estimation, so the use of cluster() in the argument for survreg() and > the robust option in survreg() to get sandwich error estimates was > sufficient for me. Depending on your needs to evaluate random effects, > frailty() in the survival package -- which can be used with survreg() or > coxph() --- is another alternative to explore, I believe. > > Hope that helps, > Bill > Nonclinical Statistics, Centocor R & D > >> -Original Message- >> From: [EMAIL PROTECTED] >> [mailto:[EMAIL PROTECTED] Behalf Of Douglas Grove >> Sent: Monday, April 23, 2007 2:29 PM >> To: Bert Gunter >> Cc: r-help@stat.math.ethz.ch >> Subject: Re: [R] fitting mixed models to censored data? >> >> >> Hi Bert, >> >> Yes, I am always wary when one software offers something that >> other do not. >> >> The censoring I'm faced with (at present) isn't as complicated >> as with much 'survival' data. I'm trying to analyze assay data >> and have a lower limit of detection (LLD) to contend with. 
>> Once the level of the analyte gets low enough it can't be >> accurately quantitated, hence all that is reported is that >> the level is less than some value (the LLD). >> >> So I'm not worried about all the complex assumptions that go along >> with censoring in clinical trials, etc. >> >> Thanks, >> Doug >> >> >> On Mon, 23 Apr 2007, Bert Gunter wrote: >> >>> Douglas: >>> >>> AFAIK, this is subject area of active current research. >> Diggle, Heagerty, >>> Liang, and Zeger , 2002, (ANALYSIS OF LONGITUDINAL DATA) >> say on p.316: "An >>> emerging consensus is that analysis of data with >> potentially informative >>> dropouts necessarily involves assumptions which are >> difficult, or even >>> impossible, to check from the observed data." This was ca >> 1994, I believe, >>> so I don't know whether this view is still held among >> experts (which I am >>> not). But if it is, you may do well to be careful of >> whatever SAS does even >>> if you do have to go running off to it. >>> >>> Cheers, >>> >>> Bert Gunter >>> Genentech Nonclinical Statistics >>> >>> >>> -Original Message- >>> From: [EMAIL PROTECTED] >>> [mailto:[EMAIL PROTECTED] On Behalf Of Douglas Grove >>> Sent: Monday, April 23, 2007 10:58 AM >>> To: r-help@stat.math.ethz.ch >>> Subject: [R] fitting mixed models to censored data? >>> >>> Hi, >>> >>> I'm trying to figure out if there are any packages allowing >>> one to fit mixed models (or non-linear mixed models) to data >>> that includes censoring. >>> >>> I've done some searching already on CRAN and through the mailing >>> list archives, but haven't discovered anything. Since I may well >>> have done a poor job searching I thought I'd ask here prior to >>> giving up. >>> >>> I understand that SAS's proc nlmixed can accomodate censoring >>> (though proc mixed apparently can't), so if I can't find >>> something available in R, I'll have to break down and use >>> that. Please, save me from having to use SAS! 
>>> >>> Thanks much, >>> Doug >>> >>> __ >>> R-help@stat.math.ethz.ch mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >>> >> >> __ >> R-help@stat.math.ethz.ch mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
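Bill's survreg() suggestion above can be sketched roughly as follows. This is only an illustration: the data frame `assay`, its column names, and the LOD value are all made up, and the distribution choice is arbitrary.

```r
library(survival)

## Hypothetical assay data: 'conc' is the reported value, floored at the
## lower limit of detection (LLD); 'subject' clusters repeated
## measurements on the same experimental unit.
lld <- 0.1
assay <- data.frame(
  conc    = c(0.5, 1.2, 0.1, 2.3, 0.1, 0.8),
  subject = factor(c(1, 1, 2, 2, 3, 3))
)

## In Surv(..., type = "left"), event = 1 means observed and event = 0
## means left-censored (all we know is conc <= LLD).
detected <- as.numeric(assay$conc > lld)

## cluster() plus robust = TRUE gives sandwich standard errors for the
## marginal (population-averaged) fit Bill describes.
fit <- survreg(Surv(conc, detected, type = "left") ~ 1 + cluster(subject),
               data = assay, dist = "lognormal", robust = TRUE)
summary(fit)
```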
Re: [R] fitting mixed models to censored data?
Hi Bert, Yes, I am always wary when one piece of software offers something that others do not. The censoring I'm faced with (at present) isn't as complicated as with much 'survival' data. I'm trying to analyze assay data and have a lower limit of detection (LLD) to contend with. Once the level of the analyte gets low enough it can't be accurately quantitated, hence all that is reported is that the level is less than some value (the LLD). So I'm not worried about all the complex assumptions that go along with censoring in clinical trials, etc. Thanks, Doug On Mon, 23 Apr 2007, Bert Gunter wrote: > Douglas: > > AFAIK, this is subject area of active current research. Diggle, Heagerty, > Liang, and Zeger , 2002, (ANALYSIS OF LONGITUDINAL DATA) say on p.316: "An > emerging consensus is that analysis of data with potentially informative > dropouts necessarily involves assumptions which are difficult, or even > impossible, to check from the observed data." This was ca 1994, I believe, > so I don't know whether this view is still held among experts (which I am > not). But if it is, you may do well to be careful of whatever SAS does even > if you do have to go running off to it. > > Cheers, > > Bert Gunter > Genentech Nonclinical Statistics > > > -----Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of Douglas Grove > Sent: Monday, April 23, 2007 10:58 AM > To: r-help@stat.math.ethz.ch > Subject: [R] fitting mixed models to censored data? > > Hi, > > I'm trying to figure out if there are any packages allowing > one to fit mixed models (or non-linear mixed models) to data > that includes censoring. > > I've done some searching already on CRAN and through the mailing > list archives, but haven't discovered anything. Since I may well > have done a poor job searching I thought I'd ask here prior to > giving up. 
> > I understand that SAS's proc nlmixed can accomodate censoring > (though proc mixed apparently can't), so if I can't find > something available in R, I'll have to break down and use > that. Please, save me from having to use SAS! > > Thanks much, > Doug > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] fitting mixed models to censored data?
Hi, I'm trying to figure out if there are any packages allowing one to fit mixed models (or non-linear mixed models) to data that includes censoring. I've done some searching already on CRAN and through the mailing list archives, but haven't discovered anything. Since I may well have done a poor job searching I thought I'd ask here prior to giving up. I understand that SAS's proc nlmixed can accommodate censoring (though proc mixed apparently can't), so if I can't find something available in R, I'll have to break down and use that. Please, save me from having to use SAS! Thanks much, Doug __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] cannot turn some columns in a data frame into factors
You need to create a new object and assign it to 'df' so you'd do something like this: df <- sapply(factors, function (name) { pos <- match(name,df.names) factor(df[[pos]]) }) Doug On Thu, 11 May 2006, Sam Steingold wrote: > > * jim holtman <[EMAIL PROTECTED]> [2006-05-11 12:27:39 -0400]: > > > > try '<<-' as the assignment to make it global. > > > > df[[pos]] <<- factor(df[[pos]]) > > nothing changed -- I observe the exact same behaviour: > > Month ( 1 ): TRUE > factors: FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE > > > > On 5/11/06, Sam Steingold <[EMAIL PROTECTED]> wrote: > >> > >> Hi, > >> I have a data frame df and a list of names of columns that I want to > >> turn into factors: > >> > >> df.names <- attr(df,"names") > >> sapply(factors, function (name) { > >>pos <- match(name,df.names) > >>if (is.na(pos)) stop(paste(name,": no such column\n")) > >>df[[pos]] <- factor(df[[pos]]) > >>cat(name,"(",pos,"):",is.factor(df[[pos]]),"\n") > >> }) > >> cat("factors:",sapply(df,is.factor),"\n") > >> > >> the output is: > >> > >> > >> Month ( 1 ): TRUE > >> factors: FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE > >> > >> > >> i.e., there is a column named "Month" (the 1st column), and it is indeed > >> turned into a factor inside sapply(), but after that it is numerical > >> again! > >> > >> what am I doing wrong? > > -- > Sam Steingold (http://www.podval.org/~sds) on Fedora Core release 5 (Bordeaux) > http://pmw.org.il http://ffii.org http://memri.org http://palestinefacts.org > http://truepeace.org http://mideasttruth.com http://dhimmi.com > If you're being passed on the right, you're in the wrong lane. > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
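A caveat on the reply above: sapply() over the column names returns a new object (a matrix, in fact) rather than modifying `df`, and it would also drop the columns not listed in `factors`. A minimal sketch of the usual idiom, with made-up column names:

```r
df <- data.frame(Month = c(1, 2, 1), Value = c(10, 20, 30))
factors <- c("Month")

## Replace the selected columns with their factor versions in place,
## keeping every other column of the data frame unchanged.
df[factors] <- lapply(df[factors], factor)

is.factor(df$Month)  # TRUE
```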
Re: [R] is there a formatted output in R?
You really need to learn how to do some searching, as you seem to be constantly asking questions you can answer yourself: help.search("sprintf") On Fri, 10 Mar 2006, Michael wrote: > something like "sprintf" in C? > > so I can do: > > print(sprintf("the correct result is %3.4f\n", myresult)); > > --- > > Also, I am desperately looking for a "clear console screen" function in > R... > > thanks a lot! > > [[alternative HTML version deleted]] > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
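For completeness, R's sprintf() works much like its C counterpart; cat() prints the formatted string without quotes or an element index (the value below is just an example):

```r
myresult <- 2.71828
## %3.4f: at least 3 characters wide, 4 digits after the decimal point.
cat(sprintf("the correct result is %3.4f\n", myresult))
## prints: the correct result is 2.7183
```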
Re: [R] can I do this with read.table??
I did read the help page, very carefully. The colClasses argument can be used if I want to stop and look through every data set to see which column I need to protect. But that's what I said I don't want to do. As for 'as.is', I wish it did what you suggest, but it doesn't. If one reads carefully, as.is protects a character vector from conversion to a *factor*, but not from conversion to numeric/logical. Doug On Sun, 26 Feb 2006, Kjetil Brinchmann Halvorsen wrote: > Douglas Grove wrote: > > Hi, > > > > I'm trying to figure out if there's an automated way to get > > read.table to read in my data and *not* convert the character > > columns into anything, just leave them alone. What I'm referring > > ?Did you read the help page? > What about argument as.is=TRUE? > See also argument colClasses > > Kjetil > > > to as 'character columns' are columns in the data that are quoted. > > For columns of alphabetic strings (that aren't TRUE or FALSE) I can > > suppress conversion to factor with as.is=TRUE, but what I'd like to > > stop is the conversion of quoted numbers of the form "01","02",..., into > > numeric form. > > > > By an 'automated way', I mean one that does not involve me having > > to know which columns in the data are the ones I want kept as > > they are. > > > > This doesn't seem like an unreasonable thing to want to do. > > After all, say I've got the data.frame: > > > > A <- data.frame(a=1:3, b=I(c("01","02","03"))) > > > > I can export this to a text file with the simple command > > > > write.table(A, "A.txt", sep="\t", row.names=FALSE, quote=TRUE) > > > > but I cannot find an equally simple mechanism for reading this > > data back in from A.txt that allows me to reconstruct my > > data.frame 'A'. Is this an unreasonable thing to expect? > > > > Thanks, > > Doug > > > > __ > > R-help@stat.math.ethz.ch mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide! 
> > http://www.R-project.org/posting-guide.html > > > > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
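One automated workaround (a sketch, not the only option): read every column as character by recycling colClasses = "character", then convert back only the columns you actually want numeric. This avoids inspecting each data set for quoted-number columns:

```r
A <- data.frame(a = 1:3, b = I(c("01", "02", "03")))
tf <- tempfile(fileext = ".txt")
write.table(A, tf, sep = "\t", row.names = FALSE, quote = TRUE)

## A single colClasses value is recycled across all columns, so nothing
## is converted to numeric/logical on the way in.
B <- read.table(tf, header = TRUE, sep = "\t", colClasses = "character")
B$b            # "01" "02" "03" -- leading zeros preserved
B$a <- as.numeric(B$a)
```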
[R] can I do this with read.table??
Hi, I'm trying to figure out if there's an automated way to get read.table to read in my data and *not* convert the character columns into anything, just leave them alone. What I'm referring to as 'character columns' are columns in the data that are quoted. For columns of alphabetic strings (that aren't TRUE or FALSE) I can suppress conversion to factor with as.is=TRUE, but what I'd like to stop is the conversion of quoted numbers of the form "01","02",..., into numeric form. By an 'automated way', I mean one that does not involve me having to know which columns in the data are the ones I want kept as they are. This doesn't seem like an unreasonable thing to want to do. After all, say I've got the data.frame: A <- data.frame(a=1:3, b=I(c("01","02","03"))) I can export this to a text file with the simple command write.table(A, "A.txt", sep="\t", row.names=FALSE, quote=TRUE) but I cannot find an equally simple mechanism for reading this data back in from A.txt that allows me to reconstruct my data.frame 'A'. Is this an unreasonable thing to expect? Thanks, Doug __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] " 'x' must be numeric"
It's much more helpful if you show the actual command you used. Presumably you have a data frame 'd' and you've done hist(d), and 'hist' has complained because d is not numeric, d is a data frame that *contains* a numeric vector. You need to give hist() that numeric vector, which you can do in many ways, including: d$V1, d[,"V1"] and d[,1] Doug On Fri, 20 Jan 2006, Naiara S. Pinto wrote: > Hello all, > > I am importing data from a txt file and try to get a histogram, I get the > message: "Error in hist: 'x' must be numeric". > When I use mode R returns "List". > However when I use srt I get: > `data.frame': 456 obs. of 1 variable: > $ V1: num 0.6344 0.4516 0.0968 0.7634 0.7957 ... > My file consists of one column only (no headers) and I can't figure out > why I am getting this error message. Why does this happen? > > Thanks! > > Naiara. > > > Naiara S. Pinto > Ecology, Evolution and Behavior > 1 University Station A6700 > Austin, TX, 78712 > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
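A minimal sketch of the fix, using made-up values shaped like the str() output in the question:

```r
## A one-column data frame, as read.table() on a headerless file
## would produce it.
d <- data.frame(V1 = c(0.6344, 0.4516, 0.0968, 0.7634, 0.7957))

## hist(d) fails because d is a data frame, not a numeric vector;
## extract the column first (plot = FALSE just suppresses the graphic).
h <- hist(d$V1, plot = FALSE)
```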
Re: [R] Selecting data frame components by name - do you know a shorter way?
So you want to create a subset of a data frame? with components "name1" "name2" "name3" ... dframe[, c("name1","name2","name3",...)] will do that Doug On Fri, 20 Jan 2006, Michael Reinecke wrote: > Hi! I suspect there must be an easy way to access components of a data frame > by name, i.e. the input should look like "name1 name2 name3 ..." and the > output be a data frame of those components with the corresponding names. I > ´ve been trying for hours, but only found the long way to do it (which is not > feasible, since I have lots of components to select): > > > > dframe[names(dframe)=="name1" | dframe=="name2" | dframe=="name3"] > > > > Do you know a shortcut? > > > > Michael > > > [[alternative HTML version deleted]] > > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
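A quick sketch with invented column names:

```r
dframe <- data.frame(name1 = 1:3, name2 = 4:6, name3 = 7:9, other = 0)

## Column subsetting by a character vector of names keeps the columns
## in the order requested.
sub <- dframe[, c("name1", "name2", "name3")]
names(sub)  # "name1" "name2" "name3"
```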
Re: [R] discovery (was: data.frame to character)w
Help pages are useful; you should try them, e.g. ?pi or ?LETTERS > How can one discover or list all available built-in objects? > On Jun 10, 2005, at 7:23 AM, Muhammad Subianto wrote: > >> L3 <- LETTERS[1:3] > >> L10 <- LETTERS[1:10] > LETTERS is apparently a built-in character vector. ls() and objects > () only lists the ones I've created. Is there a function that lists > all available built-in objects? > For example, "pi" is another built-in, but "e" is not. A means to > list them would be nice. > Regards, > - Robert __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
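Concretely, in current versions of R there are two direct ways to list the built-ins the question asks about:

```r
## builtins() returns the names of the objects in the base environment,
## which include constants such as pi and LETTERS.
"pi" %in% builtins()       # TRUE
"LETTERS" %in% builtins()  # TRUE
"e" %in% builtins()        # FALSE -- there is no built-in 'e'

## An equivalent listing, restricted to the attached base package:
head(ls("package:base"))
```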
Re: [R] problem with dir() in R-2.1.0?
The new version of R has begun enforcing rules on regular expressions. Your pattern is not a valid regular expression, hence it no longer works. The meaning of '*' is with respect to a preceding character, hence it is ill-defined without one. On Mon, 25 Apr 2005, Ye, Bin wrote: > Hi, > > I always use dir(pattern="*.RData") in all the earlier version of R (1.8, > 1.9, 2.0.1). > > Error messege is as below: > Error in list.files(path, pattern, all.files, full.names, recursive) : > invalid 'pattern' regular expression > > Does anyone have an idea what's going on? How should I define the pattern I > need in R-2.1.0? > > Thanks! > > > Bin > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
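Concretely: escape the dot and anchor at the end, or let glob2rx() translate the wildcard for you. This is standard regular-expression syntax, not anything version-specific:

```r
## "*.RData" is a shell glob, not a regular expression.  In a regex,
## '*' quantifies the preceding element, '.' matches any character,
## and '$' anchors the match at the end of the string.
dir(pattern = "\\.RData$")

## glob2rx() converts a wildcard/glob into the equivalent regex:
glob2rx("*.RData")
```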
Re: [R] How about a mascot for R?
When I think of New Zealand I think "Rabbit" :) How 'bout something like the Monty Python rabbit from "the Holy Grail" ("nasty pointy teeth...", "look at the bones!") Doug __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] inverse function of order()
An alternate method that saves having to use order() again is r[o] <- r Doug On Mon, 2004-10-04 at 15:21, Wolfram Fischer wrote: > I have: > > d <- sample(10:100, 9) > o <- order(d) > r <- d[o] > > How I can get d (in the original order), knowing only r and o? > > Thanks - Wolfram > > __ > [EMAIL PROTECTED] mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html > __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
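Spelled out: the right-hand side of a subassignment is evaluated before anything is overwritten, which is what makes the one-liner work.

```r
set.seed(42)
d <- sample(10:100, 9)
o <- order(d)
r <- d[o]        # the sorted values

## r[o] <- r puts each sorted value back at its original position,
## recovering d without a second call to order().
r[o] <- r
identical(r, d)  # TRUE
```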
Re: [R] alternate rank method
I agree. These are obvious extensions to the options provided now by rank. I didn't suggest this as I am not a contributor and don't feel comfortable asking others to do more work :) Thanks, Doug On Tue, 29 Jun 2004, Martin Maechler wrote: > >>>>> "Torsten" == Torsten Hothorn <[EMAIL PROTECTED]> > >>>>> on Mon, 28 Jun 2004 10:59:26 +0200 (CEST) writes: > > Torsten> On Fri, 25 Jun 2004, Douglas Grove wrote: > > >> I should have specified an additional constraint: > >> > >> I'm going to need to use this repeatedly on large vectors > >> (length 10^6), so something efficient is needed. > >> > > Torsten> give function `irank' in package `exactRankTests' a > Torsten> try. > > As an answer to Torsten (who got it already orally) and Gabor's > original tricky suggestions: > > I strongly believe this should happen in the same C code on > which R's base rank() function works and already implements the > *averaging* of ties. > Doing the analog of changing "average(..)" to min(..) or max(..) > shouldn't be hard and certainly will be more efficient than the > "workarounds" posted here. > > Patches welcome... > since otherwise I'm not sure I'll get there in time. > > Martin > __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] ties in runif() output
On Sat, 26 Jun 2004, Prof Brian Ripley wrote: > On Fri, 25 Jun 2004, Douglas Grove wrote: > > > I get ties in output from runif() when I generate as few as 10^5 > > variates and get quite a lot when I generate 10^6. Is this > > expected?? > > It should have been. > > > I haven't seen any duplication with rnorm(10^6), but > > see varying amounts of duplication using rexp(), rbeta() and > > rgamma(). I would have thought that there'd be enough precision > > that one wouldn't get ties until generating samples larger than this.. > > Did you do the calculations? Please do so. There are about 2e9 possible > values of the standard generators. I know little about the limitations of random number generation and didn't realize that only 2e9 values were obtainable. I could have done the math myself had I known Thanks very much for your help, Doug > > qbirthday(classes=2e9) > [1] 52655 > > Statisticians ought to know about the birthday problem! > > (rnorm is different because the default generator uses two uniforms, > deliberately to increase the precision.) > > > > set.seed(222) > > > sum(duplicated(runif(10^5))) > > [1] 4 > > That's unusually high, BTW. > > > > sum(duplicated(runif(10^6))) > > [1] 140 > > -- > Brian D. Ripley, [EMAIL PROTECTED] > Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ > University of Oxford, Tel: +44 1865 272861 (self) > 1 South Parks Road, +44 1865 272866 (PA) > Oxford OX1 3TG, UKFax: +44 1865 272595 > > __ > [EMAIL PROTECTED] mailing list > https://www.stat.math.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html > __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] ties in runif() output
I get ties in output from runif() when I generate as few as 10^5 variates and get quite a lot when I generate 10^6. Is this expected?? I haven't seen any duplication with rnorm(10^6), but see varying amounts of duplication using rexp(), rbeta() and rgamma(). I would have thought that there'd be enough precision that one wouldn't get ties until generating samples larger than this.. > set.seed(222) > sum(duplicated(runif(10^5))) [1] 4 > sum(duplicated(runif(10^6))) [1] 140 platform i686-pc-linux-gnu arch i686 os linux-gnu system i686, linux-gnu status Patched major 1 minor 9.0 year 2004 month 04 day 13 language R Thanks, Doug Grove __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
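The arithmetic behind Prof. Ripley's birthday-problem reply elsewhere in this thread: with roughly 2e9 distinct values from the standard generators, collisions among 10^5 or 10^6 draws are fully expected.

```r
## Expected number of colliding pairs among n draws from N equally
## likely values is approximately choose(n, 2) / N.
N <- 2e9
n <- 1e6
n * (n - 1) / 2 / N   # about 250 expected collisions

## Sample size at which a tie becomes more likely than not
## (52655 in the thread above):
qbirthday(prob = 0.5, classes = 2e9)
```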
Re: [R] alternate rank method
I should have specified an additional constraint: I'm going to need to use this repeatedly on large vectors (length 10^6), so something efficient is needed. On Fri, 25 Jun 2004, Sundar Dorai-Raj wrote: > Douglas Grove wrote: > > > Hi, > > > > I'm wondering if anyone can point me to a function that will > > allow me to do a ranking that treats ties differently than > > rank() provides for? > > > > I'd like a method that will assign to the elements of each > > tie group the largest rank. > > > > An example: > > > > For the vector 'v', I'd like the method to return 'rv' > > > > v: 1 2 3 3 3 4 5 5 6 7 > > rv: 1 2 5 5 5 6 8 8 9 10 > > > > > > Thanks, > > Doug Grove > > > > How about > > rv <- rowSums(outer(v, v, ">=")) > > Adapted from Prof. Ripley's reply in the following thread: > > http://finzi.psych.upenn.edu/R/Rhelp02/archive/31993.html > > HTH, > > --sundar > > __ > [EMAIL PROTECTED] mailing list > https://www.stat.math.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html > __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] alternate rank method
Hi, I'm wondering if anyone can point me to a function that will allow me to do a ranking that treats ties differently than rank() provides for? I'd like a method that will assign to the elements of each tie group the largest rank. An example: For the vector 'v', I'd like the method to return 'rv' v: 1 2 3 3 3 4 5 5 6 7 rv: 1 2 5 5 5 6 8 8 9 10 Thanks, Doug Grove __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
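In current versions of R, rank() provides exactly this via its ties.method argument, so no workaround is needed:

```r
v <- c(1, 2, 3, 3, 3, 4, 5, 5, 6, 7)

## Each tie group receives the largest rank in the group.
rank(v, ties.method = "max")
# 1 2 5 5 5 6 8 8 9 10

## The outer() one-liner from the follow-up gives the same answer but
## builds an n-by-n matrix, so it does not scale to length 10^6.
rowSums(outer(v, v, ">="))
```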
Re: [R] predict function
You can't use this anymore. The function predict() has a method for loess objects, but there is no longer an available function called "predict.loess". So just replace "predict.loess" with "predict". On Fri, 13 Feb 2004, Thomas Jagoe wrote: > I am using R to do a loess normalisation procedure. > In 1.5.1 I used the following commands to normalise the variable "logratio", > over a 2d surface (defined by coordinates x and y): > > > array <- read.table("121203B_QCnew.txt", header=T, sep="\t") > > array$logs555<-log(array$s555)/log(2) > > array$logs647<-log(array$s647)/log(2) > > array$logratio<-array$logs555-array$logs647 > > array$logav<-(array$logs555+array$logs647)/2 > > library(modreg) > > loess2d<-loess(logratio~x+y,data=array) > > array$logratio2DLoeNorm <-array$logratio - predict.loess(loess2d, array) > > However in 1.8.1 all goes well until the last step when I get an error: > > Error: couldn't find function "predict.loess" > > Can anyone help ? > > Thomas > > __ > [EMAIL PROTECTED] mailing list > https://www.stat.math.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html > __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
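A minimal sketch of the fix, using the built-in cars data rather than the poster's array data (in current R, loess() lives in the stats package, so no library(modreg) call is needed):

```r
## Fit a loess model and predict via the generic; predict() dispatches
## to the loess method automatically.
cars.lo <- loess(dist ~ speed, data = cars)
fitted_vals <- predict(cars.lo, cars)
length(fitted_vals) == nrow(cars)  # TRUE
```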
Re: [R] Windows Memory Issues
On Sat, 6 Dec 2003, Prof Brian Ripley wrote: > I think you misunderstand how R uses memory. gc() does not free up all > the memory used for the objects it frees, and repeated calls will free > more. Don't speculate about how memory management works: do your > homework! Are you saying that consecutive calls to gc() will free more memory than a single call, or am I misunderstanding? Reading ?gc and ?Memory I don't see anything about this mentioned. Where should I be looking to find more comprehensive info on R's memory management?? I'm not writing any packages, just would like to have a better handle on efficiently using memory as it is usually the limiting factor with R. FYI, I'm running R1.8.1 and RedHat9 on a P4 with 2GB of RAM in case there is any platform specific info that may be applicable. Thanks, Doug Grove Statistical Research Associate Fred Hutchinson Cancer Research Center > In any case, you are using an outdated version of R, and your first > course of action should be to compile up R-devel and try that, as there > has been improvements to memory management under Windows. You could also > try compiling using the native malloc (and that *is* described in the > INSTALL file) as that has different compromises. > > > On Sat, 6 Dec 2003, Richard Pugh wrote: > > > Hi all, > > > > I am currently building an application based on R 1.7.1 (+ compiled > > C/C++ code + MySql + VB). I am building this application to work on 2 > > different platforms (Windows XP Professional (500mb memory) and Windows > > NT 4.0 with service pack 6 (1gb memory)). This is a very memory > > intensive application performing sophisticated operations on "large" > > matrices (typically 5000x1500 matrices). > > > > I have run into some issues regarding the way R handles its memory, > > especially on NT. In particular, R does not seem able to recollect some > > of the memory used following the creation and manipulation of large data > > objects. 
For example, I have a function which receives a (large) > > numeric matrix, matches against more data (maybe imported from MySql) > > and returns a large list structure for further analysis. A typical call > > may look like this . > > > > > myInputData <- matrix(sample(1:100, 750, T), nrow=5000) > > > myPortfolio <- createPortfolio(myInputData) > > > > It seems I can only repeat this code process 2/3 times before I have to > > restart R (to get the memory back). I use the same object names > > (myInputData and myPortfolio) each time, so I am not create more large > > objects .. > > > > I think the problems I have are illustrated with the following example > > from a small R session . > > > > > # Memory usage for Rui process = 19,800 > > > testData <- matrix(rnorm(1000), 1000) # Create big matrix > > > # Memory usage for Rgui process = 254,550k > > > rm(testData) > > > # Memory usage for Rgui process = 254,550k > > > gc() > > used (Mb) gc trigger (Mb) > > Ncells 369277 9.9 667722 17.9 > > Vcells 87650 0.7 24286664 185.3 > > > # Memory usage for Rgui process = 20,200k > > > > In the above code, R cannot recollect all memory used, so the memory > > usage increases from 19.8k to 20.2. However, the following example is > > more typical of the environments I use . > > > > > # Memory 128,100k > > > myTestData <- matrix(rnorm(1000), 1000) > > > # Memory 357,272k > > > rm(myTestData) > > > # Memory 357,272k > > > gc() > > used (Mb) gc trigger (Mb) > > Ncells 478197 12.8 818163 21.9 > > Vcells 9309525 71.1 31670210 241.7 > > > # Memory 279,152k > > > > Here, the memory usage increases from 128.1k to 279.1k > > > > Could anyone point out what I could do to rectify this (if anything), or > > generally what strategy I could take to improve this? > > > > Many thanks, > > Rich. 
> > > > Mango Solutions > > Tel : (01628) 418134 > > Mob : (07967) 808091 > > > > > > [[alternative HTML version deleted]] > > > > __ > > [EMAIL PROTECTED] mailing list > > https://www.stat.math.ethz.ch/mailman/listinfo/r-help > > > > > > -- > Brian D. Ripley, [EMAIL PROTECTED] > Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ > University of Oxford, Tel: +44 1865 272861 (self) > 1 South Parks Road, +44 1865 272866 (PA) > Oxford OX1 3TG, UKFax: +44 1865 272595 > > __ > [EMAIL PROTECTED] mailing list > https://www.stat.math.ethz.ch/mailman/listinfo/r-help > __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
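For reference, gc() returns the current usage table, so before/after snapshots can be compared programmatically; a small sketch (the exact numbers are platform- and session-dependent):

```r
x <- matrix(rnorm(1e6), nrow = 1000)

## gc() returns a matrix with rows "Ncells"/"Vcells"; the "used"
## column gives current allocation.
before <- gc()["Vcells", "used"]
rm(x)
after <- gc()["Vcells", "used"]
after < before   # TRUE: the million vector cells for 'x' were reclaimed
```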
Re: [R] A suggestion regarding multiple replies
On Fri, 14 Nov 2003 [EMAIL PROTECTED] wrote: > I was wondering if it is time to adopt a strategy a-la Splus help whereby > people reply to the author and the author summarizes all the replies? That might be a bit extreme, but it would be nice if people didn't reply to the list (only to the authors) for very basic questions. Most of us already know how to e.g. find the position of the largest element in a vector. Doug __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] Kmeans again
> I'm sorry to insist but I still think there is something wrong with the function > kmeans. For instance, let's try the same small example: > > > dados<-matrix(c(-1,0,2,2.5,7,9,0,3,0,6,1,4),6,2) > > I will choose observations 3 and 4 for initial centers and just one iteration. The > results are > > > A<-kmeans(dados,dados[c(3,4),],1) > > A > $cluster > [1] 1 1 1 1 2 2 > $centers >[,1] [,2] > 1 0.875 2.75 > 2 8.000 2.50 > $withinss > [1] 38.9375 6.5000 > $size > [1] 4 2 > > If I do it by hand, after one iteration, the results are > > $cluster > [1] 1 2 1 2 1 2 > > So I think that something is wrong with the function kmeans; probably the initial > centers given > by the user are not being taken into account. Andy Liaw already gave an example where he specified two different starting values and Kmeans gave different results after 1 iteration, so clearly your hypothesis is incorrect. Either your calculations are wrong or you are calculating the wrong formulae. It is very doubtful that anything is wrong with Kmeans. Doug Grove __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
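One likely source of the discrepancy, offered as a guess rather than something established in this thread: kmeans() defaults to the Hartigan-Wong algorithm, whose first "iteration" is not the textbook Lloyd update that a hand calculation typically follows. Current versions of R let you request Lloyd's algorithm explicitly:

```r
dados <- matrix(c(-1, 0, 2, 2.5, 7, 9, 0, 3, 0, 6, 1, 4), 6, 2)

## Start from observations 3 and 4 and do a single Lloyd iteration
## (expect a "did not converge" warning with iter.max = 1).
A <- kmeans(dados, centers = dados[c(3, 4), ], iter.max = 1,
            algorithm = "Lloyd")
A$cluster
A$centers
```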
[R] removing leading/trailing blanks
Hi,

What's the best way of dropping leading or trailing blanks from a character string? The only thing I can think of is using sub() to replace the blanks with null strings, but I don't know if there is a better way. (I also don't know how to represent a trailing blank in a regular expression.)

Thanks,
Doug Grove
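The substitution idea itself can be sketched in Python, since the pattern carries over between regex flavors: anchor leading blanks with `^`, trailing blanks with `$`, and combine the two with alternation. (In R, the analogous call would presumably be a gsub() with the same pattern; stated here as an assumption, not tested.)

```python
import re

def strip_blanks(s):
    # "^ +" matches a run of leading blanks, " +$" a run of trailing
    # blanks; the alternation "|" removes whichever occurs.
    return re.sub(r"^ +| +$", "", s)

print(strip_blanks("  some text   "))  # -> "some text"
print(strip_blanks("untouched"))       # -> "untouched"
```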
Re: [R] dataframe subsetting behaviour
> Douglas Grove <[EMAIL PROTECTED]> writes:
>
> > Hi,
> >
> > I'm trying to understand a behaviour that I have encountered and
> > can't fathom. Here's some code I will use to illustrate the
> > behaviour:
> >
> > # start with some data frame "a" having some named columns
> > a <- data.frame(a=rep(1,3), c=rep(2,3), d=rep(3,3), e=rep(4,3))
> >
> > # create a subset of the original data frame, but include a
> > # name "b" that is not present in my original data frame
> > b <- a[, c("a","b","c")]
> >
> > ## Up until now no errors are issued, but the following commands
> > ## will give the errors shown:
> >
> > b[1,]  ## "Error in x[[j]] : subscript out of bounds"
> > b[1,2] ## "Error in "names<-.default"(*tmp*, value = cols) :
> >        ##  names attribute must be the same length as the vector"
> >
> > Can anyone explain to me the meaning of these error messages in terms
> > of what R is actually doing? These error messages had me baffled, and
> > it took me hours to track down that the source of the error was an
> > incorrect column name in my data frame subsetting.
>
> Looks like a (semi-)bug. Indexing outside of the data frame creates a
> "column" which is really the single value NULL, e.g.
>
> > dput(a[,4:5])
> structure(list(e = c(4, 4, 4), "NA" = NULL), .Names = c("e",
> NA), row.names = c("1", "2", "3"), class = "data.frame")
>
> This will print, because the format.data.frame called inside
> print.data.frame will recycle the NULL and give you
>
> > a[,4:5]
>   e   NA
> 1 4 NULL
> 2 4 NULL
> 3 4 NULL
>
> However, it confuses the h*ck out of "[.data.frame":
>
> > (a[,4:5])[2]
> Error in "[.data.frame"((a[, 4:5]), 2) : undefined columns selected
> > (a[,4:5])[,2]
> NULL
> > (a[,4:5])[,1]
> [1] 4 4 4
>
> and also the examples you found. However, the main issue is that you
> have managed to construct a corrupt data frame. So indexing outside
> the data frame should probably either give an error or return a column
> of NAs.
Yes, it would be nice if trying to index outside the data frame generated an error; that is what happens in S-PLUS (at least in the version I have access to: 6.0 Release 1 for Linux 2.2.12).

> --
>    O__  ---- Peter Dalgaard             Blegdamsvej 3
>   c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
>  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
> ~~~~~~~~~~ - ([EMAIL PROTECTED])                     FAX: (+45) 35327907
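The failure mode Peter describes, where selection silently produces a NULL "column" and the damage only surfaces on later indexing, can be mimicked in a few lines of plain Python. This is an analogy only, modeling a data frame as a dict of equal-length columns; none of it is R's actual implementation:

```python
# A "data frame" as a dict of equal-length columns (analogy only).
columns = {"a": [1, 1, 1], "c": [2, 2, 2], "d": [3, 3, 3], "e": [4, 4, 4]}

# Selecting by name, including the nonexistent "b": no error is raised;
# the bad name just yields None, the analogue of the NULL column.
subset = {name: columns.get(name) for name in ("a", "b", "c")}
print(subset["b"])  # -> None

# The corruption only surfaces later, when a row is extracted.
try:
    row = [col[0] for col in subset.values()]
except TypeError as err:
    print("row indexing fails:", err)  # None is not subscriptable
```

As in R, the cleaner behaviors would be to raise at selection time or to substitute a well-formed column of missing values.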
[R] dataframe subsetting behaviour
Hi,

I'm trying to understand a behaviour that I have encountered and can't fathom. Here's some code I will use to illustrate the behaviour:

# start with some data frame "a" having some named columns
a <- data.frame(a=rep(1,3), c=rep(2,3), d=rep(3,3), e=rep(4,3))

# create a subset of the original data frame, but include a
# name "b" that is not present in my original data frame
b <- a[, c("a","b","c")]

## Up until now no errors are issued, but the following commands
## will give the errors shown:

b[1,]  ## "Error in x[[j]] : subscript out of bounds"
b[1,2] ## "Error in "names<-.default"(*tmp*, value = cols) :
       ##  names attribute must be the same length as the vector"

Can anyone explain to me the meaning of these error messages in terms of what R is actually doing? They had me baffled, and it took me hours to track down that the source of the error was an incorrect column name in my data frame subsetting.

Thanks,
Doug Grove
Statistical Research Associate
Fred Hutchinson Cancer Research Center
Seattle, WA