[R] rpostgresql - write table to specific tablespace?
Hello everybody,

is there a way to write a data.frame as a table to a PostgreSQL database in a specific tablespace? As far as I can see, the normal dbWriteTable() function does not provide this functionality. Can I do it using some form of dbSendQuery()? Unfortunately, I did not find any information on this issue on the web. Please correct me if I missed something.

All the best
Julian

R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
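One possible route, sketched here under assumptions rather than as a tested RPostgreSQL recipe: PostgreSQL itself offers `ALTER TABLE ... SET TABLESPACE ...` (and a `default_tablespace` session setting), so the table can be written with dbWriteTable() first and moved afterwards with dbSendQuery(). The helper names, the table name `mytable` and the tablespace name `fastspace` are hypothetical; `con` is assumed to be an open connection.

```r
# Build the SQL that moves an existing table into a given tablespace.
# Identifiers are validated (lower-case, unquoted) instead of being quoted,
# to keep the sketch simple.
alter_tablespace_sql <- function(table, tablespace) {
  ok <- grepl("^[a-z_][a-z0-9_]*$", c(table, tablespace))
  if (!all(ok)) stop("table and tablespace must be plain lower-case SQL identifiers")
  sprintf("ALTER TABLE %s SET TABLESPACE %s", table, tablespace)
}

# Hypothetical wrapper: write the data.frame, then move the new table.
# 'con' is assumed to be a DBI connection, e.g. one opened via RPostgreSQL.
write_table_to_tablespace <- function(con, table, df, tablespace) {
  DBI::dbWriteTable(con, table, df, row.names = FALSE)
  res <- DBI::dbSendQuery(con, alter_tablespace_sql(table, tablespace))
  DBI::dbClearResult(res)
  invisible(TRUE)
}

# Only the SQL builder is exercised here; the wrapper needs a live database.
alter_tablespace_sql("mytable", "fastspace")
```

Moving a table rewrites its data, so for large tables it may be preferable to send `SET default_tablespace = fastspace` on the connection before calling dbWriteTable(), so that the table is created in the right place from the start.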
Re: [R] issues with calling predict.coxph.penal (survival) inside a function
Hello, and thanks for the answer.

1) I found a work-around, and in the end it is easier than I thought. The only thing you have to do is keep the same variable name for the new values. So if predict(coxph.penal.fit, newdata[subset,]) does not work inside a function, the following works:

    pred_function <- function(coxph_model, newdata) {
      # things to do before
      newdata <- newdata[the_subset_i_want, ]
      predict(coxph_model, newdata)
    }

I attach a working example. I am not sure, but maybe this is even what is written in the help for NextMethod ;)

    NextMethod works by creating a special call frame for the next method. If no new arguments are supplied, the arguments will be the same in number, order and name as those to the current method but their values will be promises to evaluate their name in the current method and environment. Any named arguments matched to ... are handled specially: they either replace existing arguments of the same name or are appended to the argument list. They are passed on as the promise that was supplied as an argument to the current environment. (S does this differently!) *If they have been evaluated in the current (or a previous environment) they remain evaluated.* (This is a complex area, and subject to change: see the draft R Language Definition.)
    (Help for NextMethod)

2) Terry, I am not sure about the work-around you provided in your mail. I want to subset newdata, not the model. Additionally, when trying the example you provided, I received different results. An example is attached.
Thanks and all the best
Julian

    #---
    library(survival)

    test1 <- data.frame(time=c(4,3,1,1,2,2,3),
                        status=c(1,1,1,0,1,1,0),
                        x=c(0,2,1,1,1,0,0),
                        sex=c(0,0,0,0,1,1,1))
    # Fit a stratified model
    fit1 <- coxph(Surv(time, status) ~ x + strata(sex), test1)
    summary(fit1)
    # Fit a stratified model with a spline
    fit2 <- coxph(Surv(time, status) ~ pspline(x, df=2) + strata(sex), test1)
    summary(fit2)

    ## work-around
    predicting_function_which_works <- function(model, newdata) {
      subs <- vector(mode='logical', length=nrow(newdata))
      subs[3:length(subs)] <- TRUE   # try with the first values set to FALSE
      newdata_alt <- newdata
      newdata <- newdata[subs, ]     # reuse the name 'newdata' for the subset
      ret <- vector(mode='numeric', length=nrow(newdata_alt))
      ret[!subs] <- NA
      ret[subs] <- predict(model, newdata)
      return(ret)
    }
    predicting_function_which_works(fit1, test1)   # works
    predicting_function_which_works(fit2, test1)   # works
    predicting_function_which_works(fit2,          # works
        data.frame(time=c(4,3,1,1,2),
                   status=c(1,1,1,0,1),
                   x=c(0,2,1,1,2),
                   sex=c(1,1,0,0,1)))

    ## How I understood Terry's work-around.
    ## Provides different results and doesn't consider the subset.
    predicting_function_2 <- function(model, newdata) {
      subs <- vector(mode='logical', length=nrow(newdata))
      subs[2:length(subs)] <- TRUE
      newX <- model.matrix(model)
      newY <- model$y
      newfit <- coxph(newY ~ newX, iter=0, init=coef(model))
      newfit$var <- model$var
      #print(model)
      #print(newfit)
      #predict(newfit)
      #ret=predict(newfit)
      print("comparison")
      print(paste("model, original prediction:", paste(predict(model), collapse=",")))
      print(paste("newfit, original prediction:", paste(predict(newfit), collapse=",")))
      ret <- predict(newfit, newdata[subs, ])
      return(ret)
    }
    predicting_function_2(fit1, test1)
    predicting_function_2(fit2, test1)

-----Original Message-----
From: Terry Therneau [mailto:thern...@mayo.edu]
Sent: Thursday, 14 November 2013 16:31
To: r-help@r-project.org; julian.bo...@elitepartner.de
Subject: Re: issues with calling predict.coxph.penal (survival) inside a function

Thanks for the reproducible example.
I can confirm that it fails on my machine using survival 2-37.5, the next soon-to-be-released version. The issue is with NextMethod, and my assumption that the called routine inherited everything from the parent, including the environment chain. A simple test this AM showed me that the assumption is false. It might have been true for Splus. Working this out may take some time (every other one of my wrestling matches with predict inside a function has) and there is a reasonable chance that it won't make this already overdue release.

In the meantime, here is a workaround that I have sometimes used in other situations. Inside your function do the following: fit a new coxph model with fixed coefficients, and do prediction on that.

    myfun <- function(oldfit, subset) {
      newX <- model.matrix(oldfit)[subset, ]
      newY <- oldfit$y[subset]
      newfit <- coxph(newY ~ newX, iter=0, init=coef(oldfit))
      newfit$var <- oldfit$var
      predict(newfit)
    }

If the subset is all of a particular strata, as you indicated, then all of the predictions will be correct. If not, then those that make use of the baseline hazard (type="expect") will be incorrect but all others are ok.

Terry Therneau

On 11/14/2013 05:00
[R] issues with calling predict.coxph.penal (survival) inside a function - subset-vector not found. Because of NextMethod?
Hello everyone,

I have an issue with calling predict.coxph.penal inside a function.

Regarding the context: my original problem is that I wrote a function that uses predict.coxph and survfit(model) to predict a lot of survival curves, using only the baseline curves for the strata (as delivered by survfit(model)) and then adapting them with the predicted risk scores. Because there are cases where my new data has strata which didn't exist in the original model, I exclude them using a Boolean vector inside the function. I end up with a call like this:

    predict(coxph_model, newdata[subscript_vector, ])

This works fine for a coxph model, but when I fit a model with a spline (class coxph.penal), I get an error:

    Error in `[.data.frame`(newdata, subscript_vector, ) :
      object 'subscript_vector' not found

I suppose this is because of NextMethod, but I am not sure how to work around it. I also read a little about all those matching and frame issues, but I must confess I do not really understand them. I attach a reproducible example. Any help or suggestions for work-arounds will be appreciated.
Thanks
Julian

    > version
    platform       x86_64-w64-mingw32
    arch           x86_64
    os             mingw32
    system         x86_64, mingw32
    status
    major          3
    minor          0.1
    year           2013
    month          05
    day            16
    svn rev        62743
    language       R
    version.string R version 3.0.1 (2013-05-16)
    nickname       Good Sport

    ## TEST-DATA
    library(survival)

    # Create the simplest test data set
    test1 <- data.frame(time=c(4,3,1,1,2,2,3),
                        status=c(1,1,1,0,1,1,0),
                        x=c(0,2,1,1,1,0,0),
                        sex=c(0,0,0,0,1,1,1))
    # Fit a stratified model
    fit1 <- coxph(Surv(time, status) ~ x + strata(sex), test1)
    summary(fit1)
    # Fit a stratified model with a spline
    fit2 <- coxph(Surv(time, status) ~ pspline(x, df=2) + strata(sex), test1)
    summary(fit2)

    # function to predict within
    predicting_function <- function(model, newdata) {
      subs <- vector(mode='logical', length=nrow(newdata))
      subs[1:length(subs)] <- TRUE
      ret <- predict(model, newdata=newdata[subs, ])
      return(ret)
    }
    predicting_function(fit1, test1)   # works
    predicting_function(fit2, test1)   # doesn't work:
    # Error in `[.data.frame`(newdata, subs, ) : object 'subs' not found
    # probably because of NextMethod

    # traceback()
    #12: `[.data.frame`(newdata, subs, )
    #11: newdata[subs, ]
    #10: is.data.frame(data)
    #9: model.frame.default(data = newdata[subs, ], formula = ~pspline(x,
    #       df = 2) + strata(sex), na.action = function (object, ...)
    #   object)
    #8: model.frame(data = newdata[subs, ], formula = ~pspline(x, df = 2) +
    #       strata(sex), na.action = function (object, ...)
    #   object)
    #7: eval(expr, envir, enclos)
    #6: eval(tcall, parent.frame())
    #5: predict.coxph(model, newdata = newdata[subs, ])
    #4: NextMethod("predict", object, ...)
    #3: predict.coxph.penal(model, newdata = newdata[subs, ])
    #2: predict(model, newdata = newdata[subs, ]) at #5
    #1: predicting_function(fit2, test1)
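The traceback shows the expression newdata[subs, ] being re-evaluated in a frame that knows `newdata` but not `subs`. The following base-R sketch reproduces that kind of failure without the survival package. It is a simplified stand-in, not the actual survival code: `predict_like`, `broken_wrapper` and `working_wrapper` are hypothetical names, and the re-evaluation is done with a plain eval() instead of NextMethod.

```r
# Stand-in for a predict method that re-evaluates the *expression* passed as
# 'newdata' inside its own frame (hypothetical; not the survival source code).
predict_like <- function(newdata) {
  expr <- substitute(newdata)          # e.g. newdata[subs, , drop = FALSE]
  eval(expr, envir = environment())    # 'newdata' exists here, 'subs' does not
}

# Fails: the expression handed over mentions the caller's local variable 'subs'.
broken_wrapper <- function(newdata) {
  subs <- c(FALSE, rep(TRUE, nrow(newdata) - 1))
  predict_like(newdata[subs, , drop = FALSE])
}

# Works: the subset is stored under the name 'newdata' first, so the
# expression handed over is just the symbol 'newdata'.
working_wrapper <- function(newdata) {
  subs <- c(FALSE, rep(TRUE, nrow(newdata) - 1))
  newdata <- newdata[subs, , drop = FALSE]
  predict_like(newdata)
}

df <- data.frame(x = 1:4)
msg <- tryCatch({ broken_wrapper(df); "no error" },
                error = function(e) conditionMessage(e))
print(msg)
print(nrow(working_wrapper(df)))
```

The point of the second wrapper is only that the argument expression no longer contains any name that is local to the calling function.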
[R] Bug in Survival - predict.coxph with collapse? (related to question censor=FALSE and id options in survfit.coxph)
Hello everyone, Hello Terry,

Trying to find workarounds for the bug described in http://r.789695.n4.nabble.com/censor-FALSE-and-id-options-in-survfit-coxph-td4670320.html, I found another issue which might or might not be related and which I think is a bug, unless I got the usage of the collapse argument in predict.coxph totally wrong.

When predict.coxph is called with a collapse vector, it fails with an error saying that the collapse vector has the wrong length. Using the example data given in the thread linked above:

    > predict = predict(mod, newdata=datnew, collapse=datnew$id)
    Error in predict.coxph(mod, newdata = datnew, collapse = datnew$id) :
      Collapse vector is the wrong length

All the best
Julian

    ### CODE
    # create data
    set.seed(20130625)
    n <- 100                 # sample size
    x <- rbinom(n, 1, 0.5)   # covariate
    z <- rep(0, n)           # start time
    y <- rexp(n, exp(x))     # event time
    e <- y < 2               # censor at 2
    y <- pmin(y, 2)          # observation time
    dat <- data.frame(x,z,y,e)
    rm(x,z,y,e)

    # fit cox model with start/stop format
    library(survival)
    mod <- coxph(Surv(z, y, e)~x, data=dat)
    summary(mod)
    plot(survfit(mod))

    # create prediction dataset with 3 individuals with
    # x = 0 on (0,2)
    # x = 1 on (0,2)
    # x = 0 on (0,1) and x = 1 on (1,2)
    datnew <- data.frame(x=c(0,1,0,1), z=c(0,0,0,1), y=c(2,2,1,2), e=rep(0,4), id=c(1,2,3,3))
    datnew

    ## Prediction ##
    predict(mod, newdata=datnew)                           ## works
    predict(mod, newdata=datnew, collapse=datnew$id)       ## error
    predict(mod, newdata=dat[1:5,], collapse=c(1,2,3,4))   ## error
Re: [R] prediction survival curves for coxph-models; how to extract the right strata per individual
First of all, I would like to plot the survival curves. After that, the main use will be to calculate conditional probabilities: given that an individual has already survived x days, what is the chance that it survives until, e.g., day 100?

I've already found a solution, though. Given the unsubscripted survfit object, the following functions select the indices that belong to each individual's stratum. Usage is e.g.:

    stratas = extract_strata(coxph.3, data)
    coxph.3.surfvit = survfit(coxph.3, newdata=data)
    strata_subscripts = extract_strata_subscripts(coxph.3.surfvit)
    plot(0, 0, ylim=c(0,1), xlim=c(-10,360))
    for (i in 1:100)
      lines(coxph.3.surfvit$time[strata_subscripts[,stratas[i]]],
            coxph.3.surfvit$surv[strata_subscripts[,stratas[i]], i],
            col = rainbow(100)[i])

Of course, this is not very efficient, but it works. An optimisation could be to work with basehaz and predict instead of fitting a survfit object. But here are the functions I use (not well documented, and not nicely coded).

All the best
Julian

    extract_strata <- function(object, data) {
      ## returns the stratum (according to object) for every element of data
      if (inherits(object, "coxph")) {
        terms_form = terms(object$formula, specials="strata")
      } else if (inherits(object, "formula")) {
        terms_form = terms(object, specials="strata")
      } else stop("object must be of class coxph or formula")
      if (length(attr(terms_form, which="specials")$strata) == 0) {
        warning("No strata found")
        return(NULL)
      } else if (length(attr(terms_form, which="specials")$strata) > 1) {
        warning("More than one call to strata found, returning NULL")
        return(NULL)
      } else {
        # parse the strata call, evaluate it with 'data' as environment, return the result
        return(eval(parse(text=rownames(attr(terms_form, "factors"))[attr(terms_form, which="specials")$strata]),
                    envir=data))
      }
    }

    extract_strata_subscripts <- function(survfit_object) {
      if (!inherits(survfit_object, "survfit.cox"))
        stop("survfit_object must be of class \"survfit.cox\"")
      require(survival)
      if (is.null(survfit_object$strata)) {
        warning("No Strata found")
        return(TRUE)
      }
      stratanames = names(survfit_object$strata)
      nstrata = length(stratanames)
      ntimes = length(survfit_object$time)
      # first row, last row and number of rows of each stratum
      strataborders = matrix(ncol=3, nrow=nstrata,
                             dimnames=list(strata=stratanames, borders=c("min","max","length")))
      strataborders[1,] = c(1, nrow(survfit_object[1]$surv), nrow(survfit_object[1]$surv))
      for (x in seq_len(nstrata)[-1]) {   # strata 2..nstrata; empty if there is only one
        strataborders[x,1] = strataborders[x-1,2] + 1
        strataborders[x,2] = strataborders[x-1,2] + nrow(survfit_object[x]$surv)
        strataborders[x,3] = nrow(survfit_object[x]$surv)
      }
      # logical matrix: rows = time points, columns = strata
      ret_matrix = matrix(data=F, ncol=nstrata, nrow=ntimes)
      colnames(ret_matrix) = stratanames
      attr(ret_matrix, which="strataborders") = strataborders
      for (x in stratanames)
        ret_matrix[((1:ntimes) >= strataborders[x,1]) & ((1:ntimes) <= strataborders[x,2]), x] = T
      return(ret_matrix)
    }

-----Original Message-----
From: Terry Therneau [mailto:thern...@mayo.edu]
Sent: Friday, 26 July 2013 16:12
To: r-help@r-project.org; julian.bo...@elitepartner.de
Subject: Re: [R] prediction survival curves for coxph-models; how to extract the right strata per individual

It would help me to give advice if I knew what you wanted to do with the new curves. Plot, print, extract? A more direct solution to your question will appear in the next release of the code, btw.

Terry T.

On 07/25/2013 05:00 AM, r-help-requ...@r-project.org wrote:

My problem is: I have a coxph.model with several strata and other covariables. Now I want to fit the estimated survival-curves for new data, using survfit.coxph. But this returns a prediction for each stratum per individual. So if I have 15 new individuals and 10 strata, I have 150 fitted survival curves (or if I don't use the subscripts I have 15 predictions with the curves for all strata pasted together). Is there any possibility to get only the survival curves for the stratum the new individual belongs to?
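The conditional probability mentioned at the start of this message (the chance of surviving until day t given survival up to day s) is S(t) / S(s). A minimal base-R sketch, assuming the curve of one individual is available as two vectors like the `time` and `surv` components of a survfit object; the function names and the numbers are hypothetical illustrations:

```r
# Value of a right-continuous step survival curve at time t.
# 'times' must be sorted increasingly; S(t) = 1 before the first time point.
surv_at <- function(times, surv, t) {
  idx <- findInterval(t, times)   # number of time points <= t
  c(1, surv)[idx + 1]
}

# P(survive beyond t | survived beyond s) = S(t) / S(s), for t >= s.
cond_surv <- function(times, surv, s, t) {
  stopifnot(t >= s)
  surv_at(times, surv, t) / surv_at(times, surv, s)
}

# Made-up curve: S drops to 0.9, 0.6 and 0.3 at days 10, 50 and 90.
times <- c(10, 50, 90)
surv  <- c(0.9, 0.6, 0.3)
cond_surv(times, surv, s = 20, t = 100)   # 0.3 / 0.9
```

With the helper functions from the message, `times` and `surv` would be the stratum-specific slices selected by strata_subscripts for one individual.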
[R] prediction survival curves for coxph-models; how to extract the right strata per individual
Hello everyone,

it somehow seems like a strange question, but I can't find the answer. I think my question is the same one stephenb wanted to ask in January, but it seems he didn't get an answer (see http://r.789695.n4.nabble.com/obtainl-survival-curves-for-single-strata-td4657177.html).

My problem is: I have a coxph model with several strata and other covariables. Now I want to fit the estimated survival curves for new data, using survfit.coxph. But this returns a prediction for each stratum per individual. So if I have 15 new individuals and 10 strata, I have 150 fitted survival curves (or, if I don't use the subscripts, I have 15 predictions with the curves for all strata pasted together).

Is there any possibility to get only the survival curves for the stratum the new individual belongs to? I think the newstrata argument of survfit.coxph should do this, but when I try to use the argument I get the message

    Warning message:
    In survfit.coxph(coxph.3, newdata = activisale_join[1:15, ], na.action = na.pass, :
      newstrata argument under construction, value ignored

(the documentation for survfit.coxph is quoted below).

So has anyone found a workaround? Or does anyone know a fast solution to automatically use the right subscripts (besides the one supplied by Stephen, which seems to take a little too much time)?

All the best
Julian

PS: I use survival in version 2.37-4.

    newstrata
        if the original coxph model had strata, should the predictions be for all strata, or only for those in newdata? The default for this is TRUE if either the id or individual argument is present, as these require strata for the resulting curves to make sense. Otherwise the default is FALSE, which means to ignore any strata variable in the newdata data set, and produce predicted survivals for the entire set of strata in the original model. In this case some components of the output will be matrices with one column for each row of newdata.
Re: [R] error in predict.gam used with bam
Solved it!! ;)

The problem was that the test data contained factor levels that the training data didn't. So when predicting for these new factor levels with a model that had never seen them, the error occurred. When the rows with factor levels not used in fitting are excluded, or when care is taken that all levels are present in the data used for model fitting, everything is fine.

All the best
Julian

-----Original Message-----
From: Simon Wood [mailto:s.w...@bath.ac.uk]
Sent: Tuesday, 9 July 2013 09:07
To: julian.bo...@elitepartner.de
Cc: r-help@r-project.org
Subject: Re: [R] error in predict.gam used with bam

Hi Julian,

Any chance you could send me (offline) a short version of your data, which reproduces the problem? I can't reproduce it in a quick attempt (but it is quite puzzling, given that bam calls predict.gam internally in pretty much the same way that you are doing here).

btw (and nothing to do with the error) given that you are using R 3.0.1 it's a good idea to upgrade to mgcv_1.7-23 or above, for the following reason (taken from the mgcv changeLog):

    1.7-23
    ------
    *** Fix of severe bug introduced with R 2.15.2 LAPACK change. The shipped version of dsyevr can fail to produce orthogonal eigenvectors when uplo='U' (upper triangle of symmetric matrix used), as opposed to 'L'. This led to a substantial number of gam smoothing parameter estimation convergence failures, as the key stabilizing re-parameterization was substantially degraded. The issue did not affect gaussian additive models with GCV model selection. Other models could fail to converge any further as soon as any smoothing parameter became `large', as happens when a smooth is estimated as a straight line. check.gam reported the lack of full convergence, but the issue could also generate complete fit failures. Picked up late as full test suite had only been run on R 2.15.1 with an external LAPACK.

best,
Simon

On 08/07/13 10:02, julian.bo...@elitepartner.de wrote:

Hello everyone.
I am doing a logistic gam (package mgcv) on a pretty large data frame (130,000 cases with 100 variables). Because of that, the gam is fitted on a random subset of 1. Now when I want to predict the values for the rest of the data, I get the following error:

    > gam.basis_alleakti.1.pr=predict(gam.basis_alleakti.1,
    +   newdata=activisale_join[gam.basis_alleakti.1.complete_cases,all.vars(gam.basis_alleakti.1.formula)],type="response")
    Error in predict.gam(gam.basis_alleakti.1, newdata = activisale_join[gam.basis_alleakti.1.complete_cases, :
      number of items to replace is not a multiple of replacement length

The following is the code:

    # formula with some factors and a lot of variables to be fitted
    gam.basis_alleakti.1.formula = as.formula(
      paste("verlängerung ~",
            paste(names(activisale_join)[c(2:10)], collapse="+"),   ## factors
            "+",   # joins the factor terms and the smooth terms
            paste("s(", names(activisale_join)[c(17,19:29,31:42,44)], ")", collapse="+"))   # numeric variables, all count data
    )
    # complete cases
    gam.basis_alleakti.1.complete_cases = complete.cases(activisale_join[,all.vars(gam.basis_alleakti.1.formula)])
    # model fitting works on the random subset
    gam.basis_alleakti.1 = bam(gam.basis_alleakti.1.formula, data = activisale_join[subset.1, ], family = binomial)
    # error, no idea why
    gam.basis_alleakti.1.pr = predict(gam.basis_alleakti.1,
      newdata=activisale_join[gam.basis_alleakti.1.complete_cases, ], type="response")

The prediction on the same subset (subset.1) works. It could be that this error is somewhat similar to the one described as a side question in http://r.789695.n4.nabble.com/gamm-tensor-product-and-interaction-td4526188.html, where Simon answered the following:

    Here is the error message I obtain:
    vis.gam(gm1$gam,plot.type="contour",n.grid=200,color="heat",zlim=c(0,4))
    Error in predict.gam(x, newdata = newd, se.fit = TRUE, type = type) :
      number of items to replace is not a multiple of replacement length

    - hmm, possibly a bug. I'll look into it.
    best,
    Simon

All the best
Julian

PS:

    > version
    platform       x86_64-w64-mingw32
    arch           x86_64
    os             mingw32
    system         x86_64, mingw32
    status
    major          3
    minor          0.1
    year           2013
    month          05
    day            16
    svn rev        62743
    language       R
    version.string R version 3.0.1 (2013-05-16)
    nickname       Good Sport

    package mgcv version 1.7-22

--
Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
+44 (0)1225 386603 http://people.bath.ac.uk/sw283
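The fix reported at the top of this message (excluding rows whose factor levels were never seen during fitting) can be sketched in base R as follows. This is an illustration under assumptions, not code from the thread: the helper name `rows_with_known_levels` and the toy data are hypothetical.

```r
# Flag the rows of 'new' whose factor values all occur in 'train'.
# Rows with NA in one of the factor columns are flagged FALSE as well.
rows_with_known_levels <- function(train, new, factor_cols) {
  ok <- rep(TRUE, nrow(new))
  for (col in factor_cols) {
    seen <- levels(droplevels(factor(train[[col]])))   # levels actually used in training
    ok <- ok & as.character(new[[col]]) %in% seen
  }
  ok
}

# Toy data: level "c" occurs only in the new data.
train <- data.frame(g = factor(c("a", "b", "a")))
new   <- data.frame(g = factor(c("a", "c", "b")))
keep  <- rows_with_known_levels(train, new, "g")
print(keep)
# A call such as predict(fit, newdata = new[keep, , drop = FALSE]) would then
# only be given factor levels that the fitted model knows about.
```

Applied to the data in this thread, `train` would correspond to activisale_join[subset.1, ] and `new` to the complete cases used for prediction.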
[R] error in predict.gam used with bam
Hello everyone.

I am doing a logistic gam (package mgcv) on a pretty large data frame (130,000 cases with 100 variables). Because of that, the gam is fitted on a random subset of 1. Now when I want to predict the values for the rest of the data, I get the following error:

    > gam.basis_alleakti.1.pr=predict(gam.basis_alleakti.1,
    +   newdata=activisale_join[gam.basis_alleakti.1.complete_cases,all.vars(gam.basis_alleakti.1.formula)],type="response")
    Error in predict.gam(gam.basis_alleakti.1, newdata = activisale_join[gam.basis_alleakti.1.complete_cases, :
      number of items to replace is not a multiple of replacement length

The following is the code:

    # formula with some factors and a lot of variables to be fitted
    gam.basis_alleakti.1.formula = as.formula(
      paste("verlängerung ~",
            paste(names(activisale_join)[c(2:10)], collapse="+"),   ## factors
            "+",   # joins the factor terms and the smooth terms
            paste("s(", names(activisale_join)[c(17,19:29,31:42,44)], ")", collapse="+"))   # numeric variables, all count data
    )
    # complete cases
    gam.basis_alleakti.1.complete_cases = complete.cases(activisale_join[,all.vars(gam.basis_alleakti.1.formula)])
    # model fitting works on the random subset
    gam.basis_alleakti.1 = bam(gam.basis_alleakti.1.formula, data = activisale_join[subset.1, ], family = binomial)
    # error, no idea why
    gam.basis_alleakti.1.pr = predict(gam.basis_alleakti.1,
      newdata=activisale_join[gam.basis_alleakti.1.complete_cases, ], type="response")

The prediction on the same subset (subset.1) works. It could be that this error is somewhat similar to the one described as a side question in http://r.789695.n4.nabble.com/gamm-tensor-product-and-interaction-td4526188.html, where Simon answered the following:

    Here is the error message I obtain:
    vis.gam(gm1$gam,plot.type="contour",n.grid=200,color="heat",zlim=c(0,4))
    Error in predict.gam(x, newdata = newd, se.fit = TRUE, type = type) :
      number of items to replace is not a multiple of replacement length

    - hmm, possibly a bug. I'll look into it.
    best,
    Simon

All the best
Julian

PS:

    > version
    platform       x86_64-w64-mingw32
    arch           x86_64
    os             mingw32
    system         x86_64, mingw32
    status
    major          3
    minor          0.1
    year           2013
    month          05
    day            16
    svn rev        62743
    language       R
    version.string R version 3.0.1 (2013-05-16)
    nickname       Good Sport

    package mgcv version 1.7-22