[R] [R-pkgs] segmented 2.1-0 is released
dear R users, I am pleased to announce that segmented 2.1-0 is now available on CRAN. segmented focuses on estimation of breakpoints/changepoints of segmented, i.e. piecewise linear, relationships in (generalized) linear models. Starting with version 2.0-0, it is also possible to model stepmented, i.e. piecewise constant, effects. In the last release both models may be fitted via a formula interface, such as segreg(y ~ seg(x1, npsi=2) + seg(x2) + z) stepreg(y ~ seg(x1, npsi=2) + seg(x2) +seg(x3, npsi=3) + z, family=poisson) There is virtually no limit in the number of covariates and corresponding number of changepoints to be estimated. thank you, kind regards, Vito -- = Vito M.R. Muggeo, PhD Professor of Statistics Dip.to Sc Econom, Az e Statistiche Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240; fax: 091 485726 http://www.unipa.it/persone/docenti/m/vito.muggeo Assoc Editor: Statist Modelling, Statist Meth Appl past chair, Statistical Modelling Society coordinator, PhD Program in Econ, Businss, Statist ___ R-packages mailing list r-packa...@r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] losing variable attributes when subsetting model.frame
dear Duncan, Thank you very much for your quick reply. Very useful and also very elegant, I would add. Thanks again, kind regards, Vito Il 08/05/2024 14:06, Duncan Murdoch ha scritto: I would guess that subsetting uses [] indexing on each column, with a logical argument. If that is true, then one solution (which is less direct, but maybe someone would call it elegant) is to put a class on the result of `f()`, and define a `[` method for that class that preserves the attributes you want to preserve. In your example: f <- function(x) { attr(x, "vi") <- length(x) class(x) <- c("withAttributes", class(x)) x } `[.withAttributes` <- function(x, i) { subset <- NextMethod() attr(subset, "vi") <- attr(x, "vi") subset } x<- 1:5 z<-runif(5) y<-rnorm(5) mf <- model.frame(y~f(z), subset=x>=3) attr(mf[,2],"vi") #it works #> [1] 5 -- = Vito M.R. Muggeo, PhD Professor of Statistics Dip.to Sc Econom, Az e Statistiche Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240; fax: 091 485726 http://www.unipa.it/persone/docenti/m/vito.muggeo Assoc Editor: Statist Modelling, Statist Meth Appl past chair, Statistical Modelling Society coordinator, PhD Program in Econ, Businss, Statist __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] losing variable attributes when subsetting model.frame
dear all, I have a simple function f() which, when included in model.frame() via the formula, returns the variable itself with some attributes. However when I specify the subset argument, the attributes get lost, apparently. I would like to extract the attributes also when specifying the subset argument. Of course, I can build the whole dataframe without subsetting, taking the attributes and then build again the dataframe with 'subset', but I am wondering if a more direct (and elegant) solution exists. Any suggestion? Thank you very much, best, Vito #= Here a simple example.. f<- function(x){ attr(x,"vi")<-length(x) x } x<- 1:5 z<-runif(5) y<-rnorm(5) mf<- model.frame(y~f(z)) attr(mf[,2],"vi") #it works mf <- model.frame(y~f(z), subset=x>=3) attr(mf[,2],"vi") #it does not work -- = Vito M.R. Muggeo, PhD Professor of Statistics Dip.to Sc Econom, Az e Statistiche Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240; fax: 091 485726 http://www.unipa.it/persone/docenti/m/vito.muggeo Assoc Editor: Statist Modelling past chair, Statistical Modelling Society coordinator, PhD Program in Econ, Businss, Statist __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] OT: IWSM 2013
dear all, apologizes for this off topic. I would like to inform you that registration and paper submission for the 28th International Workshop on Statistical Modelling (IWSM) to be held in Palermo (Italy) 8-12 July 2013 is now open at http://iwsm2013.unipa.it Register at http://iwsm2013.unipa.it/?cmd=registration and then submit your abstract. Deadlines for Abstract submission is February 4, 2013 and for Early Registration is April 20, 2013. **Invited Speakers** 1)Ciprian Crainiceanu Johns Hopkins University, USA 2)Torsten Hothorn Ludwig-Maximilians-Universität Munchen, DEU 3)Stefano M. Iacus Università di Milano, ITA 4)Geoff McLachlan University of Queensland, AUS 5)Hein Putter Leiden University Medical Cente, NLD **Short Course** (sunday 7 july 2013) J. Fox, ' An Introduction to Structural Equation Modelling with the sem Package for R'. best wishes, Vito Muggeo, on behalf of the IWSM2013 Scientific Committee -- == Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726 http://dssm.unipa.it/vmuggeo 28th IWSM International Workshop on Statistical Modelling July 8-12, 2013, Palermo http://iwsm2013.unipa.it __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Interpretation of davies.test() in segmented package
dear Greg, It does not, however give me Pr(|t|) for the break point coefficients. Of course a test H_0: breakpoint=0 is meaningless.. I need to answer the question H:0 Beta0=Beta with a certainty metric, sorry, who is your Beta0? davies.test() tests the hypothesis H0: leftSlope=rightSlope which implies diffSlope=0 and then the breakpoint does not exist. K in davies.test() means the number of evaluation points used to compute the approximate p-value.. Please contact me off list if you need more details (given detailed questions) vito Il 16/11/2012 20.57, Greg Cohn ha scritto: My data: I have raw data points that form a logit style curve as if they were a time series. Which is to say they form 3 distinct lines with 3 distinct slopes in backwards z pattern. A certain class of my data looks essentially flat to the eye with marginal oscillation. What is important to me is the x value at which the state change is occurring, in other words, the break point Use of segmented(): Segmented does a very good job of capturing the breakpoints and fitting three distinct slopes, i.e. linear models. It does not, however give me Pr(|t|) for the break point coefficients. I need to answer the question H:0 Beta0=Beta with a certainty metric, i.e. probability statistic. This is especially important for my, flat looking data class. davies.test() question: davies.test() only excepts lm() or glm() objects as input. If I run segmented to find 1 breakpoint instead of 2, I get a totally bogus answer. Without knowing the breakpoints, how is this test able to assess the proper breakpiont? It appears to only give 1 best breakpoint, which is not consistent with the breakpoints found by segmented(). Also, is K the number of breakpoints or the number of iterations that it evaluates the breakpoint? Thanks in advance. lmfit-glm(TotRad_KW~HRRPUA_kWm2,data=d1) davies.test(lmfit,seg.Z=~HRRPUA_kWm2,k=1000,alternative=less, beta0=0,dispersion=NULL) Davies' test for a change in the slope data: Model = gaussian , link = identity formula = TotRad_KW ~ HRRPUA_kWm2 segmented variable = HRRPUA_kWm2 `Best' at = 561.205, n.points = 1000, p-value 2.2e-16 alternative hypothesis: less segments - segmented(lmfit, seg.Z=~HRRPUA_kWm2,psi=c(475,550)) summary(segments) ***Regression Model with Segmented Relationship(s)*** Call: segmented.glm(obj = lmfit, seg.Z = ~HRRPUA_kWm2, psi = c(475, 550)) Estimated Break-Point(s): Est. St.Err psi1.HRRP 430.2 4.087 psi2.HRRP 484.6 3.077 t value for the gap-variable(s) V: 0 0 Meaningful coefficients of the linear terms: Estimate Std. Error t value Pr(|t|) (Intercept) -38.6993 274.7666 -0.141 0.8891 HRRPUA_kWm2 1.4297 0.7472 1.914 0.0668 . U1.HRRP 42.2884 4.7696 8.866 NA U2.HRRP -40.5897 4.7123 -8.614 NA --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1 (Dispersion parameter for gaussian family taken to be 6934.706) Null deviance: 70776718 on 31 degrees of freedom Residual deviance: 180302 on 26 degrees of freedom AIC: 377.19 Convergence attained in 2 iterations with relative change -1.662839e-14 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] fit a threshold function with nls
Véronique, in addition to Bert's comments, I would like to bring to your attention that there are several packages that perform threshold/breakpoint/changepoint estimation in R, including cumSeg, segmented, strucchange, and bcp for a Bayesian approach Moreover some packages, such as cghFLasso, perfom smoothing with a L1 penalty that can yield a step-function fit. I hope this helps you, vito Il 15/10/2012 22.21, Bert Gunter ha scritto: Véronique: I've cc'ed this to a true expert (Ravi Varadhan) who is one of those who can give you a definitive response, but I **believe** the problem is that threshhold type function fits have objective functions whose derivatives are discontinuous,and hence gradient -based methods can run into the sorts of problems that you see. **If** this is so, then you might do better using an explicit non-gradient optimizer = rss minimizer via one of the optim() suite of functions (maybe even the default Nelder-Mead); but this is definitely where the counsel of an expert would be valuable. Also check out the CRAN Optimization Task View for advice on optimization options. Cheers, Bert On Mon, Oct 15, 2012 at 9:43 AM, Véronique Boucher Lalonde veronique.boucher.lalo...@gmail.com wrote: I am trying to model a dependent variable as a threshold function of my independent variable. What I mean is that I want to fit different intercepts to y following 2 breakpoints, but with fixed slopes. I am trying to do this with using ifelse statements in the nls function. Perhaps, this is not an appropriate approach. I have created a very simple example to illustrate what I am trying to do. #Creating my independent variable x - seq(0, 1000) #Creating my dependent variable, where all values below threshold #1 are 0, all values above threshold #2 are 0 and all values within the two thresholds are 1 y - ifelse(x 300, 0, ifelse(x700, 0, 1)) #Adding noise to my dependent variable y - y + rnorm(length(y), 0, 0.01) #I am having trouble clearly explaining the model I want to fit but perhaps the plot is self explanatory plot(x, y) #Now I am trying to adjust a nonlinear model to fit the two breakpoints, with known slopes between the breakpoints (here, respectively 0, 1 and 0) threshold.model - nls(y~ifelse(xb1, 0, ifelse(xb2, 0, 1)), start=list(b1=300, b2=700), trace=T) I have played around with this function quite a bit and systematically get an error saying: singular gradient matrix at initial parameter estimates I admit that I don't fully understand what a singular gradient matrix implies. But it seems to me that both my model and starting values are sensible given the data, and that the break points are in fact estimable. Could someone point to what I am doing wrong? More generally, I am interested in knowing (1) whether I can use such ifelse statements in the function nls (1) whether I can even use nls for this model (3) whether I can model this with a function that would allow me to assume that the errors are binomial, rather than normally distributed (ultimately I want to use such a model on binomial data) I am using R version 2.15.1 on 64-bit Windows 7 Any guidance would be greatly appreciated. Veronique [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Comparing crossing survival curves
hi isabel, You have to decide if focus is on the survival curves or hazards.. Crossing hazards do not imply crossing survival curves If you are dealing with crossing hazards, and you are interested in testing for an effect of a covariate (presumably with a crossing hazard effect), then a standard Cox model framework based on the martingale representation (start, stop, event) (rather than (time, event)) will suffice Finally, if you are interested in estimating the crossing point you could have a look to the package flexCrossHaz (currently in the Archive.. Let me know if you are interested in becoming maintainer..:-) ) http://cran.r-project.org/src/contrib/Archive/flexCrossHaz/ best, vito Il 05/07/2012 12.37, Isabel Borges ha scritto: Hi I want to compare the survival curves in two groups. Because the hazards are not proportional (the curves cross) the log rank test or Cox proportional hazard test are not suitable. How should such curves be compared? Comands are welcome Thanks in advance __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Piecewise Lasso Regression
dear lucas, yes you are right, segmented does not handle 'lars' objects. Out of curisity, are you interested in selecting the number of breakpoints or in selecting additional covariates with linear parameters? vito Il 06/06/2012 0.01, Lucas Santana dos Santos ha scritto: Hi All, I am trying to fit a piecewise lasso regression, but package Segmented does not work with Lars objects. Does any know of any package or implementation of piecewise lasso regression? Thanks, Lucas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Piecewise Lasso Regression
dear Lucas, If you are interested in selecting the number of breakpoints here a possible remedy: 1. Fit a segmented model with a large number of breakpoints via the arguments psi=NA and stop.if.error=FALSE in seg.control() (see the example below) 2. extract the model matrix relevant to the variables of the breakpoints 3. Use lars with the extracted model matrix Notice that there are at least two issues that I did not mention: a) The criterion to be used to select the variables (i.e the breakpoints) b) When you remove variables (i.e. breakpoints) you are assuming that the ML estimate of the remaining breakpoints is the same (and this is not the case here, as point estimates of the breakpoints depend on the number of breakpoints themselves..) vito ###Here a simple example n=100 xx-1:n/n psi0-seq(.2,.8,by=.2) X-sapply(c(0,psi0), function(x)pmax(xx-x,0)) b-c(-.6,.7,.4,-1,.5) mu-drop(X%*%b) yy-mu+rnorm(n)*.01 plot(xx,mu);lines(xx, mu) library(segmented) o-lm(yy~xx) os-segmented(o,seg.Z=~xx,psi=NA,control=seg.control(stop.if.error=FALSE,K=30, n.boot=0)) plot(os, add=T, col=2) #have a look to results #extract the new model matrix new.X-model.matrix(os)[,os$nameUV$U] #finally use lars on y~new.X Hope this helps you, vito Il 06/06/2012 15.48, Lucas Santana dos Santos ha scritto: Hi Vito, I am more interested in selecting the number of breakpoints. My data has some structure and I believe that fitting a piecewise regression would be of great benefit. Thanks, Lucas On Jun 6, 2012, at 4:54 AM, Vito Muggeo (UniPa) wrote: dear lucas, yes you are right, segmented does not handle 'lars' objects. Out of curisity, are you interested in selecting the number of breakpoints or in selecting additional covariates with linear parameters? vito Il 06/06/2012 0.01, Lucas Santana dos Santos ha scritto: Hi All, I am trying to fit a piecewise lasso regression, but package Segmented does not work with Lars objects. Does any know of any package or implementation of piecewise lasso regression? Thanks, Lucas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726 http://dssm.unipa.it/vmuggeo -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] getting the name of the working .Rdata file
dear all, I do not if it is a nonsense question.. Is it possible in the R session to get the name of the current .Rdata file that I ran? I mean: suppose I double click the file myfile.Rdata. ls() returns the names of the objects in the current workspace (that is saved in myfile.Rdata). In the current R session, I would like to obtain myfile.Rdata. Is it possible? Thanks in advance, vito -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Finding multiple breakpoints - 'segmented' ?
dear Peter, Currently segmented handles multiple breakpoints for several variables (the limit discussed in the msg 2006 has been fixed..). However you are looking for a somewhat complicated model where the breakpoint of the relationship 's.size' and 'R.AUC' depends on another covariate 'bedekking'. Namely it is an interaction in the breakpoint parameter (rather then in a linear parameter). Currently segmented is not able to deal with these interactions, and splitting up the data into two or three groups according the values of bedekking could be a feasible alternative. Hope this helps you, vito Il 01/06/2012 11.35, Peter Hoitinga ha scritto: Hello, I'm attempting to find multiple breakpoints in an association of my response variable (R.AUC) with two explanatory variables ('s.size' and 'bedekking'). The association between 's.size' and 'R.AUC' shows a plateau, but the value when this plateau is reached is differs for different values of 'bedekking'. Initially I thought these different values could be assessed by the 'segmented' function in the package of the same name, but this only seems to give one value of a breakpoint for each explanatory variable. A similar issue was treated in the following message from 2006: https://stat.ethz.ch/pipermail/r-help/2006-January/086446.html Does anybody know whether there are any packages that do have a function for this? Or might it be a possibility to split up the data per value of 'bedekking' and then do the segmented function? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Breakpoint in logistic GLM with 'segmented' package - error: replacement length zero
dear Peter, Your code appears correct, so it is difficult to reply without the data.. If you are interested in further details, please contact me off-list vito Il 25/05/2012 15.34, Peter Hoitinga ha scritto: Hello all, I've been having trouble with assessing a breakpoint in a logistic GLM with two explanatory variables. For this analysis I've been using the 'segmented' package version 0.2-9.1. But I keep getting an error and I don't see where I would be going awry. The situation is the following: Two explanatory variables: bedekking - a variable with possible values between 0 and 1 - mine runs from 0.05 to 0.5, increasing with steps of 0.05 s.size - a count variable - increases from 3 to 25 with steps of 1, and from 25 to 60 with steps of 5 Each combination of s.size and bedekking has 100 repeats so the resulting dataframe 'dat.al2' consists of 3 observations of 3 variables. Because the response variable has values between 0 and 1, I used a logistic GLM: gmodel- glm(R.AUC ~ bedekking + s.size, data=dat.al2, family = quasibinomial(link=logit)) R.AUC increases with increasing s.size and decreasing bedekking, looking at the graph shows that the association reaches a plateau at a s.size of 10 and a bedekking of 0.45. So these are the values I use in 'psi' argument in the 'segmented' function psi.mod- list(0.45, 10) names(psi.mod)- c(bedekking, s.size) Then I attempt to run the 'segmented' function: seg.gm- segmented(obj = gmodel, seg.Z= ~bedekking + s.size, psi = psi.mod) When I run this, after half a minute I get the following error: Error in ifelse(is.list(o0), o0$dev.no.gap, 10^12) : replacement has length zero Does anybody know what might be causing this error, and could somebody point out where I might go wrong? Thanks in advance, Peter __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error with psi value for 'segmented' package for R
dear Szymon, what do you mean it does not work for others.. that fit within similar range? Each dataset has its own features and breakpoint estimation is not as simple as estimation of linear models even if your data fit within similar range. I will contact you out of the list for details, best, vito Il 08/05/2012 16.44, Szymon Biskup ha scritto: Hi everyone, while trying to use 'segmented' (R i386 2.15.0 for Windows 32bit OS) to determine the breakpoint I got stuck with an error message and I can't find solution. It is connected with psi value, and the error says: Error in seg.glm.fit(y, XREG, Z, PSI, weights, offs, opz) : (Some) estimated psi out of its range This is the code I am using: library(segmented) curva-read.table(lamintr1.txt, header=T) attach(curva) fit.glm-glm(gpp~temp, weight=NULL, family=gaussian) plot(temp,gpp,xlab=expression(temp), ylab=gpp,pch=15,cex=0.8,xlim=c(0,50), ylim=c(0,40)) o1-glm(gpp ~ temp, weight=NULL, family=gaussian) os1-segmented(o1, seg.Z=~temp, psi=15, control=seg.control(n.boot=0, display=T, it.max=5)) plot(os1, add=TRUE, res=TRUE, se=FALSE, show.gap=TRUE, linkinv = FALSE, res.col=1, rev.sgn=FALSE, const=0) summary(os1) And the most surprising fact is that it works for some of my data, eg: tempgpp 5 5.08050857592085 10 9.50809597873546 15 21.0206415558052 20 21.5340216521042 25 22.8455243983385 30 17.6106786978697 but not for the others, that fit within similar range (in what case I tired to change the psi value but it didn't help), eg: tempgpp 5 10.1494724447878 10 9.64730588470101 15 19.3439579009423 20 20.6756229089911 25 13.7902544619339 30 21.9355758560751 or tempgpp 5 8.64380785577685 10 9.47992535226006 15 16.7556554476544 20 14.5189937476639 25 20.6874556832793 30 17.5509059595314 I saw post with similar questons but none of them had the answer I am looking for. Would there be anyone that could help me with this? Thanks a lot for your time and help. Best regards, Szymon __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error with the 'segmented' package for R
dear Szymon, it is a bug (in the new version), thanks. It depends on the flat underlying relationship you are trying to estimate with a small sample.. I will correct it as soon as possible. Meanwhile you can use o1-glm(gpp ~ temp) os1-segmented(o1, seg.Z=~temp, psi=15, control=seg.control(n.boot=0, display=T, it.max=2)) best, vito -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] variable dispersion in glm models
there are at least two alternatives 1) package dglm for Double generalized linear models or probably better 2)package gamlss for Generalized Additive Models for Location Scale and Shape. Here you can use alternative, sometimes more appropriate, families and also you can include additive (nonparametric) terms in the linear predictors Hope this help, you vito Il 26/04/2012 10.00, Chris Rh ha scritto: Hello, I am currently working with the betareg package, which allows the fitting of a variable dispersion beta regression model (Simas et al. 2010, Computational Statistics Data Analysis). I was wondering whether there is any package in R that allows me to fit variable dispersion parameters in the standard logistic regression model, that is to make the dispersion parameter contingent upon some covariates. I know that glm() from the base package allows the fitting of a common dispersion parameter via selection of the quasibinomial family, but glm() seems not to be able to individualize the dispersion parameter. Every hint is welcomed! Chris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] linear-by-linear association model in R?
dear Christofer, Try the following d-expand.grid(a=1:3,b=1:4) d$freq-rpois(12,5) o-glm(freq~factor(a)+factor(b)+I(a*b), family=poisson, data=d) vito Il 02/04/2012 9.34, Christofer Bogaso ha scritto: Dear all, can somebody give me some pointer how I can fit a linear-by-linear association model (i.e. loglinear model for the ordinal variables) in R? A brief description can be found here 'https://onlinecourses.science.psu.edu/stat504/node/141'. Thanks for your help __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] glmnet() vs. lars()
dear all, It appears that glmnet(), when selecting the covariates entering the model, skips from K covariates, say, to K+2 or K+3. Thus 2 or 3 variables are added at the same time and it is not possible to obtain a ranking of the covariates according to their importance in the model. On the other hand lars() adds the covariates one at a time. My question is: is it possible to obtain a similar output of lars (in terms of order of the variables entering the model) using glmnet()? many thanks, vito #Example (from ?glmnet) set.seed(123) x=matrix(rnorm(100*20),100,20) y=rnorm(100) fit1=glmnet(x,y) fit1$df #no. of covariates entering the model at different lambdas #Thus in the first model no covariate is included and in the second #one 2 covariates (V8 and V20) are included at the same time. Because #two variables are included at the same time I do not know which #variable (among the selected ones) is more important. #Everything is fine with lars o-lars(x,y) o$df #the covariates enter one at a time.. V8 is better than V20 -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] looking for an ecological dataset
dear all, apologizes for this off-topic question. I am looking for a ecological dataset (n100, say) including measurements of one or more growth variable and age. Could anyone to suggest the R package/URL where I can find it? many thanks, vito -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] breakpoints and nonlinear regression
dear Julian, Il 18/01/2012 14.36, crimsonengineer87 ha scritto: Thanks for the comments. Yes, I also had segmented and then I went away from that. I can't remember. I've tried using it but I get some sort of strange error. Here's some code ... it is difficult for me to help you without knowing which error you obtain.. If you refer to maximum number of iterations, it is a warning (not error). See the discussion in the paper on Rnews (that Achim suggested). The following code is expected to work pavlu.glm- lm(Na ~ yield, data=pavludata) pavlu.seg- segmented(pavlu.glm, seg.Z=~yield, psi=1000) with(pavludata, plot(yield, Na)) plot(pavlu.seg, add=TRUE) See in ?segmented and ?plot.segmented for additional examples and contact me off list if you have additional questions best, vito pavlu.glm- glm(Na ~ yield, data=pavludata, family=gaussian) pavlu.seg- segmented(pavlu.glm, seg.Z=~yield, psi=1000, control=seg.control(display=FALSE)) plot.series- function() { plot(pavlu.seg) plot(pavlu.seg, add=TRUE, linkinv=TRUE, lwd=2, col=2:3, lty=c(1,3)) lines(pavlu.seg, col=2, pch=19, bottom=FALSE, lwd=2) } jpeg(pavlu-cuttingsystem-segmented.jpg, width = 1000, height = 700, units = px) plot.series() ## Turn off device driver (to flush output to JPG) dev.off() 1. I don't think I'm doing my plotting right. I'm just not sure how that works with segmented. 2. My error is something about an error in do.call(lines) and that the maximum number of iterations has been reached. Am I missing something with glm or lm? Thanks again. -- View this message in context: http://r.789695.n4.nabble.com/breakpoints-and-nonlinear-regression-tp4303629p4306657.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with segmented
dear Phil, I am not able to read the error message.. did you forget it? However: does x exist in the workspace? The following lines work: myreg2 = lm(y ~ x, data=xy) mysegmented = segmented(myreg2, seg.Z=~x, psi=c(245000)) myreg2 = lm(xy$y ~ xy$x) x-xy$x mysegmented = segmented(myreg2, seg.Z=~x, psi=c(245000)) The following line does *not* work (as specified in ?segmented, argument seg.Z) myreg2 = lm(xy$y ~ xy$x) mysegmented = segmented(myreg2, seg.Z=~xy$x, psi=c(245000)) #error Hope to have been clear, vito Il 10/01/2012 17.17, Filoche ha scritto: Hi everyone. I'm trying to use the segmented function with the following data: For instance, I use segmented package as follow: myreg2 = lm(xy$y ~ xy$x) mysegmented = segmented(myreg2, seg.Z=~x, psi=c(245000), control = seg.control(display=FALSE)) Which get me to the following error : As a break point, a starting guess of 245000 seems fair. Anyone has an idea why I'm getting such error? Regards, Phil -- View this message in context: http://r.789695.n4.nabble.com/Problem-with-segmented-tp4282398p4282398.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] mgcv gam predict problem
dear Philip, I am not able to solve your problem, however the error message you get does not depends on mgcv::gam, therefore gam(,..outer.ok=TRUE) or predict.gam(,outer.ok=TRUE) do not make sense. The error message comes from the function splines::splineDesign which is called when the option bs=ps is used. I think the error depends on the fact that you want to predict a value outside the observed range of the covariate. When using P-splines the predictions outside the range follow a given polynomial.. hope this helps vito Il 28/03/2011 7.10, Philip Gautier ha scritto: Hello I'm using function gam from package mgcv to fit splines. When I try to make a prediction slightly beyond the original 'x' range, I get this error: A = runif(50,1,149) B = sqrt(A) + rnorm(50) range(A) [1] 3.289136 145.342961 fit1 = gam(B ~ s(A, bs=ps), outer.ok=TRUE) predict(fit1, newdata=data.frame(A=149.9), outer.ok=TRUE) Error in splineDesign(knots, x, ord, derivs, outer.ok = outer.ok) : the 'x' data must be in the range 3.14708 to 145.485 unless you set 'outer.ok = TRUE' I've inserted the argument 'outer.ok=TRUE' as you can see, but it hasn't helped. How can I obtain this prediction? Thanks, Philip Gautier Dept. of Mathematics and Statistics American University, Washington, DC __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot not generic
dear Nick, getAnywhere(plot.glmnet) Note the message you get when you type methods(plot) ... Non-visible functions are asterisked Il 28/01/2011 14.26, Nick Sabbe ha scritto: Hello list. I was trying to see some of the code for plot.glmnet in package glmnet (this function name is in the documentation). After loading the library, I tried the obvious typing in the name, but I received a message telling me it could not be found. So I fiddled around a little, and noticed that R does not recognize 'plot' as a generic function, and as such, showMethods does not work. This seems to conflict with the documentation for plot. So 2 questions: . How can I find the code of plot.glmnet . Why is plot not seen as generic? Thx. Nick Sabbe -- ping: nick.sa...@ugent.be link:http://biomath.ugent.be/ http://biomath.ugent.be wink: A1.056, Coupure Links 653, 9000 Gent ring: 09/264.59.36 -- Do Not Disapprove [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Convert a matrix's columns to list
hi feng, a possible solution is b1-apply(a,2,list) and possibly lapply(b1,unlist) if you want exactly the output equal to list(a[, 1], a[, 2]) best, vito Il 18/01/2011 13.53, Feng Li ha scritto: Dear R, Is there an efficient way to make a list that each element is from the corresponding column of a matrix. For example, if I have a matrix a a- matrix(1:10, 5, 2) a [,1] [,2] [1,]16 [2,]27 [3,]38 [4,]49 [5,]5 10 I would like to have a list b like this b- list(a[, 1], a[, 2]) b [[1]] [1] 1 2 3 4 5 [[2]] [1] 6 7 8 9 10 Thanks in advance! Feng -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Choosing statistical test - Fisher's Exact Test?
It appears that you have a 2x2 table coming from paired binary data.. If this is the case the McNemar test is appropriate. See ?mcnemar.test or even better the package exact2x2, function mcnemar.exact() for an exact approach, vito Il 18/01/2011 14.40, debz ha scritto: Hi I was wondering whether anyone can help me with this problemit's been driving me nuts, I've been trying to figure it out for months and months without success!! Basically I have a group of participants who attended 2 experimental sessions a few months apart. I took measures of the way they approach two tasks at Time 1 and the same two tasks at Time 2. I have categorical data (a frequency table for each task) consisting of the goal profiles that participants adopted at Time 1 and at Time 2. I would like to test whether there is a significant difference between the goal profiles that participants adopted at Time 1 and Time 2. Is Fisher's exact test the test to use in this case? I would really really appreciate any help!! Thanks Debbie -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] confidence interval for logistic joinpoint regression from package ljr
dear M., I do not know how to get the SE for the joinpoint (or breakpoint) from your ljr fit. However you can find useful the segmented package which works for any GLM (including the logistic one) and it returns (approximate) StErr (and Conf Int) also for the joinpoint (breakpoint in the segmented package). For some examples, see the R news paper http://cran.r-project.org/doc/Rnews/Rnews_2008-1.pdf Also, as regards to the APC, you could find interesting the following note http://onlinelibrary.wiley.com/doi/10.1002/sim.3850/abstract hope this helps you, vito Il 30/11/2010 23.54, moleps ha scritto: I´m trying to run a logistic joinpoint regression utilising the ljr package. I´ve been using the forward selection technique to get the number of knots for the analysis, but I´m uncertain as to my results and the interpretation. The documentation is rather brief ( in the package and the stats in medicine article is quite technical) and without any good examples. At the moment I´m thinking 1)find the number of knots both using forward and backward techniques and see if they are close 2)present the annual percent change (APC) for each of the intervals, ie my present data (1950-2010 in 5 year intervals) is giving me Variables Coef b0 Intercept -131.20404630 g0 t0.06146463 g1 max(t-tau1,0) -0.51582466 g2 max(t-tau2,0)0.43429615 Joinpoints: 1 tau1= 1990.5 2 tau2= 1995.5 APC 1950-1990=exp(0.06)=1.06--6% 1990-1995=exp(0.06-0.51)=exp(-0.45)=0.63-- -37% 1995-2010=exp(0.06-0.51+0.43)---2% 3) Preferably a confidence interval for the APC should be given. However, this I havent figured out yet. //M __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] One silly question about tapply output
dear Vincy, Firstly, a suggestion: to increase the probability of getting help, you should provide reproducible code (people can do copy-and-paste of your code and to modify the code to obtain the response.. ) However a possible solution (not tested, of course..) could be simply a-tapply(rate, rating, mean) d-data.frame(rating=names(a),mean=a) best, vito Il 27/10/2010 12.39, Vincy Pyne ha scritto: Dear R helpers I have a data which gives Month-wise and Rating-wise Rates. So the input file is something like month rating rate JanuaryAAA 9.04 February AAA 9.07 .. .. Decemeber AAA8.97 January BBB 11.15 February BBB11.13 January CCC17.13 . December CCC 17.56 and so on. My objective is to calculate Rating-wise mean rate, for which I have used rating_mean = tapply(rate, rating, mean) and I am getting following output tapply(rate, rating, mean) AAA BBB CCC 9.1104 11.136163717.1606779 which is correct when compared with an excel output. However, I wish to have my output something like a data.frame (so that I should be able to save this output as csv file with respective headings and should be able to carry out further analysis) Rating Mean AAA9.1104 BBB 11.1361637 CCC 17.1606779 Please guide as how should I achieve my output like this. Thanking in advance. Regards Vincy [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] building lme call via call()
dear all, I would like to get the lme call without fitting the relevant model. library(nlme) data(Orthodont) fm1 - lme(distance ~ age, random=list(Subject=~age),data = Orthodont) To get fm1$call without fitting the model I use call(): my.cc-call(lme.formula, fixed= distance ~ age, random = list(Subject = ~age)) However the two calls are not the same (apart from the data argument I am not interested in), as call() *does* evaluate the arguments: my.cc$random $Subject ~age fm1$call$random list(Subject = ~age) How is it possible to get the right call (similar to the one from fm1$call) by means of call()? thanks, vito -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Interactions in GAMMs
Hi Karen, I think you should decide what you mean for interaction. s(x:y) is meaningless If you want to fit a surface you should use s(x,y). If you want to fit a varying coefficient model (interaction between a linear and a smooth term) you should use the argument by in s(). The help files of the mgcv package by S. Wood are quite clear and they also include a lot of examples. Hope this helps you vito Karen Moore ha scritto: Hi, I've an issue adding an interaction to a GAMM: My model was of form: gamm1 - gamm(TOTSR ~ fROT + s(PH) + s(LOI) + s(ASP) + s(SQRT_ELEV) + CANCOV + s(SQRT_TOTCWD) + s(WELLF) + s(WELLN) + s(OLDWDLD) + s(DISTWOOD) + s(Annprec) + s(OLDWDLD:DISTWOOD) + (1|fSITE), family = poisson, data = BIOFOR2) with interaction of s(OLDWDLD:DISTWOOD). However I got this error message below but don't know what it means? (my data is composed of info for 150 plots) #I Warning messages: #2: In OLDWDLD:DISTWOOD : # numerical expression has 150 elements: only the first used #3: In OLDWDLD:DISTWOOD : # numerical expression has 150 elements: only the first used Can anyone offer advice on how to include this interaction in GAMM model? Thanks Karen Moore [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Peak Over Threshold values
dear Tonja, By plotting your data plot(df) it seems to me that you are looking for a piecewise linear relationships. If this is the case, have a look to the package segmented. You have to specify or not the number and the starting values for the breakpoints library(segmented) olm-lm(walevel~day) #specify number and starting values for the breakpoints.. oseg-segmented(olm, seg.Z=~day, psi=c(20,50,80)) plot(oseg,add=TRUE,col=2) oseg$psi #or you can avoid to specify starting values for psi oseg-segmented(olm, seg.Z=~day, psi=NA,control=seg.control(stop.if.error=FALSE,K=30)) plot(oseg,add=TRUE,col=2) oseg$psi best, vito Tonja Krueger ha scritto: Dear List I hope you can help me: I’ve got a dataframe (df) within which I am looking for Peak Over Threshold values as well as the length of the events. An event starts when walevel equals 5.8 and it should end when walevel equals the lower threshold value (5.35). I tried “clusters (…)” from “evd package”, and varied r (see example) but it did not work for all events (again see example). walevel - c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, 5.86, 5.91, 5.91, 5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, 6.11, 6.14, 6.12, 6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, 5.72, 5.70, 5.65, 5.59, 5.51, 5.43, 5.33, 5.30, 5.25, 5.22, 5.21, 5.19, 5.19, 5.19, 5.13, 5.15, 5.07, 5.04, 5.02, 4.99, 5.05, 5.07, 5.12, 5.17, 5.22, 5.32, 5.29, 5.33, 5.33, 5.36, 5.37, 5.42, 5.43, 5.48, 5.55, 5.57, 5.66, 5.68, 5.72, 5.77, 5.79, 5.81, 5.80, 5.80, 5.77, 5.72, 5.70, 5.66, 5.62, 5.62, 5.61, 5.59, 5.57, 5.51, 5.47, 5.41, 5.39, 5.40, 5.40, 5.42, 5.45, 5.47, 5.50, 5.50, 5.49, 5.43, 5.39, 5.33, 5.26) day - c(1:100) df - data.frame(day,walevel) library(evd) clusters(df$walevel, u = 5.80, r = 1, ulow = 5.35, cmax = T, plot = T) clusters(df$walevel, u = 5.80, r = 50, ulow = 5.35, cmax = T, plot = T) What have I done wrong? Tonja __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] only actual variable names in all.names()
dear all, When I use all.vars(), I am interest in extracting only the variable names.. Here a simple example all.vars(as.formula(y~poly(x,k)+z)) returns [1] y x k z and I would like to obtain y x z Where is the trick? many thanks vito -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] partial matching with grep()
dear all, This is a probably a silly question. If I type grep(x,c(a.x ,b.x,a.xx),value=TRUE) [1] a.x b.x a.xx Instead, I would like to obtain only a.x b.x How is it possible to get this result with grep()? many thanks for your attention, best, vito -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Estimation in a changepoint regression with R
There are at least two R packages dealing with changepoint estimation, segmented and strucchange. Two possible relevant papers are available: 1)Journal of Statistical Software for strucchange (2002, Vol.7, Issue2) 2)Rnews for segmented (2008, 8/1: 20-25) Hope this helps you vito FMH ha scritto: Dear All, I'm trying to do the estimation in a changepoint regression problem via R, but never found any suitable function which might help me to do this. Could someone give me a hand on this matter? Thank you. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fitting a linear model with a break point
dear Dan, As far as I know, the strucchange package can be helpful for you.. On the other hand, if your regression function is continuous at the unknown break points to be estimated, you could try the segmented package. Hope this helps you, vito Daniel Brewer ha scritto: Hello, I would like to test some data to see whether it has the shape of a step function (i.e. y1 up until x_th and then y2 where x_th is the threshold). The threshold x_th is unknown and the x values can only take discrete values (0,1,2,3,4). An example would be: data- data.frame(x=1:20,y=c(rnorm(10),rnorm(10,10))) I was thinking along the lines of fitting some sort of piiecewise linear model which has the gradient constrained to zero trying out all possible different threshold and taking the one with the least residuals. I am not sure how to implement this in R. Anyone got any ideas? Also is there a way of including the threshold in the actual model, so that could be estimated too? Thanks Dan -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Zinb for Non-interger data
I think that the (impressive) gamlss package (see http://www.gamlss.com) may be helpful. If I remember correctly, in gamlss you can fit model with zero-inflated continuous distributions hope this helps you, vito Alain Zuur ha scritto: JPS2009 wrote: Sorry bit of a Newbie question, and I promise I have searched the forum already, but I'm getting a bit desperate! I have over-dispersed, zero inflated data, with variance greater than the mean, suggesting Zero-Inflated Negative Binomial - which I attempted in R with the pscl package suggested on http://www.ats.ucla.edu/stat/R/dae/zinbreg.htm However my data is non-integer with some pesky decimals (i.e. 33.12) and zinb / pscl doesn't like that - not surprising as zinb is for count data, normally whole integers etc. Does anyone know of a different zinb package that will allow non-integers or and equivalent test/ model to zinb for non-integer data? Or should I try something else like a quasi-Poisson GLM? Apologies for the Newbie question! Any help much appreciated! Thanks! Is it really non-integer...or is it a density (in which case you could use NB + offset)? The quasi-Poisson will not help you with the zero inflation. I'm afraid you will have to do some hard programming by combining the 0-1 binomial part with a continuous distribution on the second part of the data..and I guess the easiest is to do this in MCMC. Perhaps the Gamma distribution can be used? You would have to adjust all likelihood equations as Gamma doesn't allow for zeros. But perhaps another continuous distribution is more appropriate...depends on your data. Alain Zuur -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Linear Regression Problem
dear Alex, I think your problem with a large number of predictors and a relatively small number of subjects may be faced via some regularization approach (ridge or lasso regression..) hope this helps you, vito Alex Roy ha scritto: Dear All, I have a matrix say, X ( 100 X 40,000) and a vector say, y (100 X 1) . I want to perform linear regression. I have scaled X matrix by using scale () to get mean zero and s.d 1 . But still I get very high values of regression coefficients. If I scale X matrix, then the regression coefficients will bahave as a correlation coefficient and they should not be more than 1. Am I right? I do not whats going wrong. Thanks for your help. Alex *Code:* UniBeta - sapply(1:dim(X)[2], function(k) + summary(lm(y~X[,k]))$coefficients[2,1]) pval - sapply(1:dim(X)[2], function(l) + summary(lm(y~X[,l]))$coefficients[2,4]) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] zero cells in one variable in logistic regression
dear anna, if you are not interested in point estimate and SE of the parameter of the aforementioned categorical variable, I believe the conventional glm(..,family=binomial) is correct. In particular, the returned deviance is reliable and also it is the relevant likelihood ratio test.. hope this helps, vito David Winsemius ha scritto: On Jul 13, 2009, at 5:37 AM, anna.bucharova wrote: Dear all. I am sort of beginner with R. I do logistic regression with binomial response variable and several continuous and categorical variables. In one categorical variable, zero cell occures (2x2 table looks like 7 - 0 23 - 25 This leads to overestimating of odds ratio and inflated confidence interval for odds for given variable. The variable is significant in univariate test. I do not necessarilly need odd ratio, but I need the explained deviance by this variable and I really want to keep this variable in the model. It probably matters for explained deviance. How to treat this problem? Thanks for help, Anna Bucharova -- You might consider glmrob in package:robustbase. See http://www.jstatsoft.org/v10/i04 David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] gradually switching regression
Hi, Although BaconWatts (1971) assume a transition, the aim is to estimate the breakpoint where the linear relationship changes. The start- and end-point of the transition phase are not parameters to estimate ; it is a trick to estimate the model. As a possible alternative, you could have a look to package segmented where pure piecewise straight-lines are fitted.. Hope this helps you.. vito Benjamin Volland ha scritto: Hello, I'm trying to find an algorithm to estimate a switching regression model based on the 1990 Economics Letters paper by Ohtani/Kakimoto/Abe or the earlier version from 1985 (Ohtani/Katayama, Economic Studies Quarterly; assuming as a transition path a polynomial of order 1). I found an idea for using nls here: http://www.biostat.wustl.edu/archives/html/s-news/2000-04/msg00223.html. Unfortunately it's based on the 1971 BaconWatts model, which does not allow obtaining explicitly both start- and end-point of the transition phase. If anyone stumbled across such a function or has suggestions on how to do it, I would greatly appreciate hearing about them. Thanks so much Ben Volland __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Linear Models: Explanatory variables with uncertainties
Probably you are looking for EIV (errors-in-variables) or ME (measurement errors) models. simex is a possible package which needs to know the error variance.. Also RSiteSearch() may be helpful.. hope this helps, vito kpal ha scritto: One of the assumptions, on which the (General) Linear Modelling is based is that the response variable is measured with some uncertainties (or weighted), but the explanatory variables are fixed. Is it possible to extend the model by assigning the weights to the explanatory variables as well? Is there a package for doing such a model fit? Thanks __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Discover significant change in sorted vector
dear Hans, As pointed out by Petr Pikal, segmented allows to estimate breakpoints of (continuous) piecewise relationships. That is, the mean lines are assumed to be connected and segmented tries to join them at the estimated breakpoints. The estimated breakpoints may be any value in the range of the segmented covariate (not necessarly an integer). Computation times are substantially indipendent of the number of breakpoints to be estimated. strucchange thinks different; it allows disconnected segments and the estimated breakpoints have to be the observed time indices. Hope this helps you, vito Petr PIKAL ha scritto: Hi One possibility is to use segmented e.g a - c(2,3,3,5,6,8,8,9,15, 25, 34,36,36,38,41,43,44,44,46) ix - seq_along(a) plot(ix,a) library(segmented) fit-lm(a~ix) fit.s-segmented(fit, ~ix, list(ix=c(5,10))) fit.s Call: segmented.lm(obj = fit, seg.Z = ~ix, psi = list(ix = c(5, 10))) Meaningful coefficients of the linear terms: (Intercept) ix U1.ix U2.ix 0.6785714 1.0714286 8.9285714 -8.450 Estimated Break-Point(s) psi1.ix psi2.ix : 8.476 10.880 Regards Petr r-help-boun...@r-project.org napsal dne 22.04.2009 15:45:55: Gabor, initially this looked like the perfect solution, exactly what I need. Unfortunately it is too expensive/costly. I have vectors of length 800 and more, my machine needs 5 minutes (I aborted) to compute the breakpoints. Required is computation time 1 sec. :) Any other suggestions? Maybe there is another approach not that perfect as from the strucchange package, but still sufficient? Best Henning Am 22.04.2009 um 14:55 schrieb Gabor Grothendieck: Try this: a - c(2,3,3,5,6,8,8,9,15, 25, 34,36,36,38,41,43,44,44,46); ix - seq_along(a) library(strucchange) bp - breakpoints(a ~ ix, h = 4) bp Optimal 3-segment partition: Call: breakpoints.formula(formula = a ~ ix, h = 4) Breakpoints at observation number: 7 11 Corresponding to breakdates: 0.3684211 0.5789474 plot(a ~ ix) lines(ix, fitted(bp)) On Wed, Apr 22, 2009 at 7:27 AM, Hans-Henning Gabriel hanshenning.gabr...@gmail.com wrote: Hi, suppose I have a simple sorted vector like this: a - c(2,3,3,5,6,8,8,9,15, 25, 34,36,36,38,41,43,44,44,46); Is there a function in R, I can use to discover that from index 8 to index 11 the values are changing significantly? The function should return a value pointing to one of the indices 8, 9, 10 or 11. Any of them would be fine. The difficulty is that there may be no big gap. I mean, indices 8 and 11 are somehow connected by indices 9 and 10. So, it's not an option to just search for biggest difference between the values. Perfect would be a function that is able to discover multiple changes if it is present in the data. Thanks!! Henning __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Piecewise
Hi, In addition to useful Ben's suggestion, you have a, possibly simpler, alternative. If you are willing to assume to know the power of you piecewise polynomial (beta parameter according to code of Ben) you can use the package segmented. Using the data generated by Ben in his previous email, you get olm-lm(y~0+I(x^(-.5))) #note that 0 as the true model has not interc os1-segmented(olm,seg.Z=~x,psi=40) Results are comparable with the ones returned by the nls() (even as regard the St.Err of the breakpoint), although segmented returns a somewhat smaller residual sum of squares :-) ***segmented() vs. nls(): some possibles differences*** 1)In segmented, you are assuming to know the power of polynomial. In practice you can try several different values such as {-1,-.5,1,2,3}, say, and to assess the fit.. However the St.Err from the other estimates might be underestimated.. 2)In segmented, you need to specify starting value only for the breakpoint. 3)In segmented, you can easy generalize the model: multiple breakpoints and/or additional linear covariates and/or non-continuous response and/or non-identity link functions (e.g. gamma with log-link)... Hope this helps you, vito Ben Bolker ha scritto: Joe Waddell joecwaddell at gmail.com writes: Hi, I am a biologist (relatively new to R) analyzing data which we predict to fit a power function. I was wondering if anyone knew a way to model piecewise functions in R, where across a range of values (0-x) the data is modeled as a power function, and across another range (x-inf) it is a linear function. This would be predicted by one of our hypotheses, and we would like to find the AICs and weights for a piecewise function as described, compared with a power function across the entire range. I have looked into the polySpline function, however it appears to use only polynomials, instead of the nls models I have been using. Thanks in advance for any help you might be able to offer. You should be able to do it in nls as follows ... There are some potentially subtle issues if you get into the details of likelihood profile confidence intervals for the breakpoint (since the likelihood profile is flat between/discontinuous across the locations of data points), but hopefully that won't bite you in your applications. ## function to generate piecewise power-law/linear data f - function(x,brk,alpha,beta,b,sd) { mu - ifelse(xbrk,alpha*x^beta,(alpha*brk^beta)-b*(x-brk)) rnorm(length(x),mean=mu,sd=sd) } ## generate data set.seed(1001) x - runif(1000,max=100) y - f(x,brk=50,alpha=100,beta=-0.5,b=1,sd=5) ## take a quick look, vs. theoretical curve plot(y~x) curve(f(x,50,100,-0.5,1,0),col=2,add=TRUE,n=1000,lwd=2) ## fit, using the I() command to do the piecewise part dat - data.frame(x,y) fit1 - nls(y~I(xbrk)*alpha*x^beta+I(xbrk)*((alpha*brk^beta)-b*(x-brk)), start=list(brk=60,alpha=110,beta=-0.75,b=2),data=dat) ## plot the fit xvec - seq(0,100,length=200) lines(xvec,predict(fit1,newdata=data.frame(x=xvec)),col=4,lwd=2) ## testing ... with(as.list(coef(fit1)), lines(xvec,f(xvec,brk,alpha,beta,b,sd=0),col=5,lwd=2)) Ben Bolker __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] harmonic function fiting? how to do
dear Yogesh It appears that your model based on parametric terms is too inflexible.. A better alternative to parametric harmonic terms is a spline-based approach, may be cyclic splines.. Have a look to the mgcv package.. vito Yogesh Tiwari ha scritto: Dear R Users, I have a CO2 time series. I want to fit this series seasonal cycle and trend with fourth harmonic function, and then compute residuals. I am doing something like: file-read.csv(co2data.csv) names(file) attach(file) fit-lm(co2~1+time+I(time^2)+sin(2*pi*time)+cos(2*pi*time)+sin(4*pi*time)+cos(4*pi*time)+ sin(6*pi*time)+cos(6*pi*time)+sin(8*pi*time)+cos(8*pi*time),data=file) fit$residuals # variable 'co2' is in ppmv and variable 'time' is in the form of decimal time. The problem is: when I plot above residuals vs. time, it still shows some seasonal cycle with time. SO, I doubt that I am doing something wrong. Kindly help, how to fit correctly, a fourth harmonic function on CO2 which is varying with variable 'time'. Great thanks, Regards, Yogesh -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R equivalent of SAS Cochran-Mantel-Haenszel tests?
Dear Michael, It sounds as a linear-by-linear loglinear model (and its variants) which uses scores for one or more variables in the table.. (see Agresti, 1990, Categorical Data Analysis. I do remember the pages and I have not the book here..) If this is the case, you can use standard call to glm(.., family=poisson) with score variables in the linear predictor. For instance for a two-way table with ordered variables the linear-by-linear model is, glm(freq~factor(x)+factor(y)+I(score.x*score.y), family=poisson) The CMH test, probably, is the score test of the parameter of I(score.x*score.y).. best, vito Michael Friendly ha scritto: In SAS, for a two-way (or 3-way, stratified) table, the CMH option in SAS PROC FREQ gives 3 tests that take ordinality of the factors into account, for both variables, just the column variable or neither. Is there an equivalent in R? The mantelhaen.test in stats gives something quite different (a test of conditional independence for *nominal* factors in a 3-way table). e.g. I'd like to reproduce: *-- CMH tests; proc freq data=sexfun order=data; weight count; tables husband * wife / cmh chisq nocol norow; run; The FREQ Procedure Summary Statistics for Husband by Wife Cochran-Mantel-Haenszel Statistics (Based on Table Scores) StatisticAlternative HypothesisDF Value Prob 1Nonzero Correlation1 10.01420.0016 2Row Mean Scores Differ 3 12.56810.0057 3General Association9 16.76890.0525 -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] holidays effect
Gabor Grothendieck ha scritto: One possibility if you don't have to have days is to reduce it to a weekly or monthly series. Alternatively you can put a dummy variable (1=holiday and zero otherwise) in the regression model for your response. For instance, you could use the xreg argument of the arima() function. This allows to avoid aggregation of your data which, in general, is not recommended.. best, vito On Wed, Feb 4, 2009 at 8:46 AM, elisia elisabetta.fab...@guest.telecomitalia.it wrote: how can I eliminate the influence of the festivities in a time series with daily data?I tried to remove them and replace their value with a value of interpolation using na.approx (). There is an alternative method? -- View this message in context: http://www.nabble.com/holidays-effect-tp21830785p21830785.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] OFF topic testing for positive coeffs
Dear all, This is off-topic, however I hope someone can give me useful suggestion.. Given the regression model y = b0 + b1*x + e I am interested in testing for positive coeffs, namely H0: b00 AND b10 H1: b0,b1 unconstrained It is simple to estimate the model under H0 and H1 (there are several suggestions on the Rlist about estimation but nothing about testing..) perform a likelihood ratio test by comparing the logLik under the constrained and the unconstrained models, however I do not know how many degrees of freedom.. Model under H0 uses two df, however it reasonable to believe that the real dimension is =2.. Is there anyone which can give me any advices or suggest me references? Many thanks, vito -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] OT: (quasi-?) separation in a logistic GLM
dear Gavin, I do not know whether such comment may be still useful.. Why are you unsure about quasi-separation? I think that it is quite evident in the plot plot(analogs ~ Dij, data = dat) Also it may be useful to see the plot of the monotone (profile) deviance (or the log-lik) for the coef of Dij, xval-seq(-20,0,l=50) ll-vector(length=50) for(i in 1:length(xval)){ mod - glm(analogs ~ offset(xval[i]*Dij), data = dat, family = binomial) ll[i]-mod$dev } plot(xval, ll) Hope this helps you, vito Gavin Simpson ha scritto: Dear List, Apologies for this off-topic post but it is R-related in the sense that I am trying to understand what R is telling me with the data to hand. ROC curves have recently been used to determine a dissimilarity threshold for identifying whether two samples are from the same type or not. Given the bashing that ROC curves get whenever anyone asks about them on this list (and having implemented the ROC methodology in my analogue package) I wanted to try directly modelling the probability that two sites are analogues for one another for given dissimilarity using glm(). The data I have then are a logical vector ('analogs') indicating whether the two sites come from the same vegetation and a vector of the dissimilarity between the two sites ('Dij'). These are in a csv file currently in my university web space. Each 'row' in this file corresponds to single comparison between 2 sites. When I analyse these data using glm() I get the familiar fitted probabilities numerically 0 or 1 occurred warning. The data do not look linearly separable when plotted (code for which is below). I have read Venables and Ripley's discussion of this in MASS4 and other sources that discuss this warning and R (Faraway's Extending the Linear Model with R and John Fox's new Applied Regression, Generalized Linear Models, and Related Methods, 2nd Ed) as well as some of the literature on Firth's bias reduction method. But I am still somewhat unsure what (quasi-)separation is and if this is the reason for the warnings in this case. My question then is, is this a separation issue with my data, or is it quasi-separation that I have read a bit about whilst researching this problem? Or is this something completely different? Code to reproduce my problem with the actual data is given below. I'd appreciate any comments or thoughts on this. Begin code snippet ## note data file is ~93Kb in size dat - read.csv(url(http://www.homepages.ucl.ac.uk/~ucfagls/dat.csv;)) head(dat) ## fit model --- produces warning mod - glm(analogs ~ Dij, data = dat, family = binomial) ## plot the data plot(analogs ~ Dij, data = dat) fit.mod - fitted(mod) ord - with(dat, order(Dij)) with(dat, lines(Dij[ord], fit.mod[ord], col = red, lwd = 2)) End code snippet ## Thanks in advance Gavin -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] NA's in segmented
Dear Tyler, Yes the problem is with NA.. There are two solutions: 1) You can use lm() + segmented (you fit a gaussian model, so why do you use glm()?) 2)If you want to use glm()+ segmented(), use na.omit() to pass your dataframe to the data argument of glm, glm(.., data=na.omit()) Also, if you want to constrain the right slope to be zero use the minus variable (see the relevant recent paper on Rnews) hope this helps you vito ### Example x-1:50/50 y- -2*x+2*pmax(x-.6,0)+rnorm(50)*.1 x[20:22]-NA d-data.frame(xx=x,yy=y) rm(x,y) #Use lm() - It works: o-lm(yy~xx, data=d, na.action=na.omit) os-segmented(o,seg.Z=~xx,psi=.5) #Use glm() - It works: o-glm(yy~xx, data=na.omit(d)) os-segmented(o,seg.Z=~xx,psi=.5) #constrain the right slope to zero d$xxx- -d$x o-glm(yy~1, data=na.omit(d)) os1-segmented(o,seg.Z=~xxx,psi=-.5) with(d,plot(xx,yy) plot(os, add=TRUE) plot(os1, add=TRUE, col=2, rev.sgn=TRUE) T.D.Rudolph ha scritto: I am trying to fit a very simple broken stick model using the package segmented but I have hit a roadblock. str(data) 'data.frame': 18 obs. of 2 variables: $ Bin : num 0.25 0.75 1.25 1.75 2.25 2.75 3.25 3.75 4.25 4.75 ... $ LnFREQ: num 5.06 4.23 3.50 3.47 2.83 ... I fit the lm easily: fit.lm-lm(LnFREQ~Bin, data=id07) But I keep getting an error message: fit.seg-segmented(fit.glm, seg.Z=~Bin, psi=6) Error in cbind(XREG, U, V) : number of rows of matrices must match (see arg 2) I think the problem is due to NA's in the Bin data, but there doesn't seem to be an na.action for segmented (). What is the best way to get around this problem? I would like to keep all Bin values in the output for continuity. Data below Tyler data BinLnFREQ 1 0.25 5.0562458 2 0.75 4.2341065 3 1.25 3.4965076 4 1.75 3.4657359 5 2.25 2.8332133 6 2.75 2.9444390 7 3.25 2.4849066 8 3.75 1.3862944 9 4.25 1.7917595 10 4.75 1.3862944 11 5.25 0.6931472 12 5.75 1.0986123 13 6.25 0.000 14 6.75 0.000 15 7.25NA 16 7.75 0.000 17 8.25 0.000 18 8.75NA summary(fit.glm) Call: glm(formula = LnFREQ ~ Bin, data = id07, na.action = na.omit) Deviance Residuals: Min1QMedian3Q Max -0.74139 -0.22999 0.01065 0.21245 0.72684 Coefficients: Estimate Std. Error t value Pr(|t|) (Intercept) 4.506460.21088 21.37 4.37e-12 *** Bin -0.634340.04467 -14.20 1.05e-09 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 (Dispersion parameter for gaussian family taken to be 0.1844898) Null deviance: 39.7785 on 15 degrees of freedom Residual deviance: 2.5829 on 14 degrees of freedom (2 observations deleted due to missingness) AIC: 22.227 Number of Fisher Scoring iterations: 2 -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] difference between MASS::polr() and Design::lrm()
Dear all, It appears that MASS::polr() and Design::lrm() return the same point estimates but different st.errs when fitting proportional odds models, grade-c(4,4,2,4,3,2,3,1,3,3,2,2,3,3,2,4,2,4,5,2,1,4,1,2,5,3,4,2,2,1) score-c(525,533,545,582,581,576,572,609,559,543,576,525,574,582,574,471,595, 557,557,584,599,517,649,584,463,591,488,563,553,549) library(MASS) library(Design) summary(polr(factor(grade)~score)) lrm(factor(grade)~score) It seems to me that results from lrm() are right.. Am I wrong? Thanks, vito -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] error in using by + median
dear all, Could anyone explain me the behaviour of median() within by()? (I am running R.2.7.0) thanks, vito H-cbind(rep(0:1,l=20),matrix(rnorm(20*2),20,2)) by(H[,-1],H[,1],mean) INDICES: 0 V1 V2 -0.2101069 0.2954377 - INDICES: 1 V1 V2 -0.23682173 -0.01225147 by(H[,-1],H[,1],median) Error in median.default(data[x, , drop = FALSE], ...) : need numeric data -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Comparing switchpoints from segmented
Dear Rob, Rob Knell ha scritto: Hello everyone Not strictly an R question but close... hopefully someone will be able to help. I wish to compare the switchpoints in two switchpoint regressions. The switchpoints were estimated using the segmented library It's a package. running in R, and I have standard errors for the estimates. I initially thought I could just bootstrap confidence intervals for the difference between the switchpoints, but I have been having trouble with getting this to work because for about 25% of the bootstrap samples the algorithm in segmented fails to converge. So I had another think, and I thought that maybe I could just do a t-test: knowing the estimated switchpoints and their standard errors I can easily calculate the SE of the difference, so I can calculate a t-value using that. My question is whether there is anything wrong with doing it this way, and if not, how many degrees of freedom should I use? I would guess at df=n1-5+n2-5 5 df lost for each sample because two slopes, two intercepts and one switchpoint have been estimated, but I'm not sure: I'm but a humble biologist and not very good at this sort of thing. The SE() of the breakpoints are reliable only for large samples and/and with clear-cut relationships. Having said that, I think that you can compare them by using the approximate t-like studentized statistic, but results have to taken with care.. The relevant sampling distribution is unknown (it depends on the location of the breakpoint, (standardized) difference-in-slope, sample size and, to some extend, distribution of the `segmented variable') . Therefore quantiles of a t distributions are not appropriate in theory, but in practice someone uses them..BTW the model parameters are 4 (intercept+two slopes+ breakpoint, the regression lines are joined at the breakpoint). Hope this helps you, best, vito Any help gratefully received Thanks Rob Knell School of Biological and Chemical Sciences Queen Mary, University of London 'Phone +44 (0)20 7882 7720 Skype Rob Knell Research: http://webspace.qmul.ac.uk/rknell/ The truth is that they have no clue why the beetles had horns, it's the researchers who have sex on the brain and everything has to have a sexual explanation. And this is reasearch?! Correspondent known as FairOpinion on Neo-Con American website discussing my research. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] strange behaviour of is.factor()
Dear all, It appears that the function is.factor() returns different results when used inside the apply() function: that is, is.factor() fails to recognize a factor.. Where is the trick? many thanks, vito df1-data.frame(y=1:10,x=rnorm(10),g=factor(c(rep(A,6),rep(B,4 is.factor(df1[,1]) [1] FALSE is.factor(df1[,2]) [1] FALSE is.factor(df1[,3]) [1] TRUE is.factor(df1$g) [1] TRUE apply(df1,2,is.factor) y x g FALSE FALSE FALSE R.version _ platform i386-pc-mingw32 arch i386 os mingw32 system i386, mingw32 status major 2 minor 6.1 year 2007 month 11 day26 svn rev43537 language R version.string R version 2.6.1 (2007-11-26) -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Segmented regression
Dear Brendan, I am not sure to understand your code.. It seems to me that your are interested in fitting a one-breakpoint segmented relationship in each level of your grouping variable If this is the case, the correct code is below. In order to fit a segmented relationship in each group you have to define the relevant variable before fitting, and to constrain the last slope to be zero you have to consider the `minus' variable..(I discuss these points in the submitted Rnews article..If you are interested, let me know off list). If my code does not fix your problem, please let me know, Best, vito ##--define the group-specific segmented variable: X-model.matrix(~0+factor(group),data=df)*df$tt df$tt.KV-X[,1] #KV df$tt.KW-X[,2] #KW df$tt.WC-X[,3] #WC ##-fit the unconstrained model olm-lm(y~group+tt.KV+tt.KW+tt.WC,data=df) os-segmented(olm,seg.Z=~tt.KV+tt.KW+tt.WC,psi=list(tt.KV=350, tt.KW=500, tt.WC=350)) #have a look to results: with(df,plot(tt,y)) with(subset(df,group==RKW),points(tt,y,col=2)) with(subset(df,group==RKV),points(tt,y,col=3)) points(df$tt[df$group==RWC],fitted(os)[df$group==RWC],pch=20) points(df$tt[df$group==RKW],fitted(os)[df$group==RKW],pch=20,col=2) points(df$tt[df$group==RKV],fitted(os)[df$group==RKV],pch=20,col=3) #constrain the last slope in group KW tt.KW.minus- -df$tt.KW olm1-lm(y~group+tt.KV+tt.WC,data=df) os1-segmented(olm1,seg.Z=~tt.KV+tt.KW.minus+tt.WC,psi=list(tt.KV=350, tt.KW.minus=-500, tt.WC=350)) #check..:-) slope(os1) with(df,plot(tt,y)) with(subset(df,group==RKW),points(tt,y,col=2)) with(subset(df,group==RKV),points(tt,y,col=3)) points(df$tt[df$group==RWC],fitted(os1)[df$group==RWC],pch=20) points(df$tt[df$group==RKW],fitted(os1)[df$group==RKW],pch=20,col=2) points(df$tt[df$group==RKV],fitted(os1)[df$group==RKV],pch=20,col=3) Power, Brendan D (Toowoomba) ha scritto: Hello all, I have 3 time series (tt) that I've fitted segmented regression models to, with 3 breakpoints that are common to all, using code below (requires segmented package). However I wish to specifiy a zero coefficient, a priori, for the last segment of the KW series (green) only. Is this possible to do with segmented? If not, could someone point in a direction? The final goal is to compare breakpoint sets for differences from those derived from other data. Thanks in advance, Brendan. library(segmented) df-data.frame(y=c(0.12,0.12,0.11,0.19,0.27,0.28,0.35,0.38,0.46,0.51,0.5 8,0.59,0.60,0.57,0.64,0.68,0.72,0.73,0.78,0.84,0.85,0.83,0.86,0.88,0.88, 0.95,0.95,0.93,0.92,0.97,0.86,1.00,0.85,0.97,0.90,1.02,0.95,0.54,0.53,0. 50,0.60,0.70,0.74,0.78,0.82,0.88,0.83,1.00,0.85,0.96,0.84,0.86,0.82,0.86 ,0.84,0.84,0.84,0.77,0.69,0.61,0.67,0.73,0.65,0.55,0.58,0.56,0.60,0.50,0 .50,0.42,0.43,0.44,0.42,0.40,0.51,0.60,0.63,0.71,0.74,0.82,0.82,0.85,0.8 9,0.91,0.87,0.91,0.93,0.95,0.95,0.97,1.00,0.96,0.90,0.86,0.91,0.94,0.96, 0.88,0.88,0.88,0.92,0.82,0.85), tt=c(141.6,141.6,141.6,183.2,212.8,227.0,242.4,271.5,297.4,312.3,331.4,3 42.4,346.3,356.6,371.6,408.8,408.8,419.5,434.4,464.5,492.6,521.7,550.5,5 50.3,565.4,588.0,602.9,623.7,639.6,647.9,672.6,680.6,709.7,709.7,750.2,7 50.2,750.2,141.6,141.6,141.6,183.2,212.8,227.0,242.4,271.5,297.4,312.3,3 31.4,342.4,346.3,356.6,371.6,408.8,408.8,419.5,434.4,464.5,492.6,521.7,5 50.5,550.3,565.4,588.0,602.9,623.7,639.6,647.9,672.6,680.6,709.7,709.7,1 41.6,141.6,141.6,183.2,212.8,227.0,242.4,271.5,297.4,312.3,331.4,342.4,3 46.3,356.6,371.6,408.8,408.8,419.5,434.4,464.5,492.6,521.7,550.5,550.3,5 65.4,588.0,602.9,623.7,639.6,647.9,672.6,709.7), group=c(rep(RKW,37),rep(RWC,34),rep(RKV,32))) init.bp - c(297.4,639.6,680.6) lm.1 - lm(y~tt+group,data=df) seg.1 - segmented(lm.1, seg.Z=~tt, psi=list(tt=init.bp)) version _ platform i386-pc-mingw32 arch i386 os mingw32 system i386, mingw32 status major 2 minor 6.0 year 2007 month 10 day03 svn rev43063 language R version.string R version 2.6.0 (2007-10-03) DISCLAIMER**...{{dropped:15}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612
[R] differences in using source() or console
Dear all, Is there *any* reason explaining what I describe below? I have the following line myfun(x) If I type them directly in R (or copy/past), it works.. However if I type in R 2.6.1 source(code.R) ##code.R includes the above line Error in inherits(x, data.frame) : object d not found namely myfun() does not work correctly. In particular the non-working line inside myfun() is update(eval(x$call$obj), data=d) #d is created in myfun() My question is: why the problem occurs just via source()ing??? many thanks, vito Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] using names with functions..
Dear all, I have the following (rather) strange problem.. For some reasons, I finally work with a variable whose name includes an R function, a.log(z), say. And that is a problem when I call it in a formula, for instance: myname-a.log(z) dd-data.frame(a.log(z)=1:10,y=rnorm(10)) o-lm(y~1,data=dd) fo-as.formula(paste(.~.+,paste(myname, collapse = +))) fo . ~ . + a.log(z) update(o,formula=fo) Error in eval(expr, envir, enclos) : could not find function a.log How can fit the model? namely how can I use a.log(z) in the example above? Many thanks, vito -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to extract the original data of a glm object
See and *read* the help file ?glm the object returned by glm() includes the `data' component hence: aa-glm(..) aa$data or also eval(aa$call$data) leffgh ha scritto: my function is glm(a~log(b)+c+d+e,family=binomial,data=f)-aa I want to extract the original data set of aa. How to do it ? You may suggest the model.frame() function. In fact ,i have tried it. model.frame returns a data frame of containing a,log(b) NOT b,c,d,e I want to extract a data frame containing a,b,c,d,e,which is exactly the same as f How can I achieve this result? I want to do this because I need to extract the formular and act on another data set ,whose predict variables are the same as those of f, but the response variable is different . -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R segmented package
Dear Maura, For package-specific questions, it is probably advisable to contact privately the author.. BTW: However I cannot understand if a break-point per independent variable (regressor) is handled when there are many independent variables in the model, or if segmented can handle many breakpoints for the same independent variable (time in my case) ... ? Yes, both the statements are true. Virtually segmented is able to handle GLMs with any number of (linear) exaplanatory variables + any number of `segmented' variables, and multiple breakpoints are allowed for *each* `segmented' variable. See the relevant help files. All the best, vito PS: I am sending you some further documentation via off-list best, vito Maura E Monville ha scritto: Most of the data sets I'm dealing with exhibit a time trend. We would like to get rid of the time trend. The plot shows in some cases a monotonic increase of the dependent variable with time. This is the easiest case. In some other cases the plot shows a time trend where the dependent variable changes slope 4-5 times along the observations measurement period. I've attempted a segmented regression and estimated the break-point empirically. This is a tedious error-prone process. I just found out that R includes a package, called segmented, which can estimate the break-point discriminating between two regression models. I have browsed through the on-line documentation. It is stated multiple break-points are supported by segmented. However I cannot understand if a break-point per independent variable (regressor) is handled when there are many independent variables in the model, or if segmented can handle many breakpoints for the same independent variable (time in my case) ... ? I would greatly appreciate some explanations and suggestions on the use of segmented. Thank you very much in advance. Regards, -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Splines
As reported in ?spline, you should use splinefun() instead. ff-splinefun(x,y) ff(x0) where x0=5.25 is your case. best, vito stat stat ha scritto: I want to fit a cubic spline of x on y. where : x [1] 467 468 460 460 450 432 419 420 423 423 y [1] 1 2 3 4 5 6 7 8 9 10 using the syntax spline(y, x) I got following output : $x [1] 1.00 1.310345 1.620690 1.931034 2.241379 2.551724 2.862069 [8] 3.172414 3.482759 3.793103 4.103448 4.413793 4.724138 5.034483 [15] 5.344828 5.655172 5.965517 6.275862 6.586207 6.896552 7.206897 [22] 7.517241 7.827586 8.137931 8.448276 8.758621 9.068966 9.379310 [29] 9.689655 10.00 $y [1] 467. 469.5381 469.8643 468.4865 465.9444 463.0284 460.6479 459.6560 [9] 459.8737 460.2296 459.6313 457.4921 454.0094 449.4482 444.1040 438.3613 [17] 432.6204 427.2892 422.8183 419.6733 418.2646 418.3633 419.3283 420.5202 [25] 421.5768 422.4555 423.1312 423.5419 423.5480 423. Now I want to get what is the value of y at x = 5.25. Can anyone tell me how to find that? thanks in advance - Meet people who discuss and share your passions. Join them now. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.