[R] [R-pkgs] segmented 2.1-0 is released

2024-05-16 Thread Vito Muggeo via R-packages

dear R users,
I am pleased to announce that segmented 2.1-0 is now available on CRAN.

segmented focuses on estimation of breakpoints/changepoints of 
segmented, i.e. piecewise linear, relationships in (generalized) linear 
models. Starting with version 2.0-0, it is also possible to model 
stepmented, i.e. piecewise constant, effects.


In the last release both models may be fitted via a formula interface, 
such as


segreg(y ~ seg(x1, npsi=2) + seg(x2) + z)

stepreg(y ~ seg(x1, npsi=2) + seg(x2) +seg(x3, npsi=3) + z, family=poisson)

There is virtually no limit in the number of covariates and 
corresponding number of changepoints to be estimated.


thank you,
kind regards,
Vito


--
=
Vito M.R. Muggeo, PhD
Professor of Statistics
Dip.to Sc Econom, Az e Statistiche
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240; fax: 091 485726
http://www.unipa.it/persone/docenti/m/vito.muggeo
Assoc Editor: Statist Modelling, Statist Meth Appl
past chair, Statistical Modelling Society
coordinator, PhD Program in Econ, Businss, Statist

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] losing variable attributes when subsetting model.frame

2024-05-08 Thread Vito Muggeo via R-help

dear Duncan,
Thank you very much for your quick reply. Very useful and also very 
elegant, I would add.


Thanks again,
kind regards,

Vito


Il 08/05/2024 14:06, Duncan Murdoch ha scritto:
I would guess that subsetting uses [] indexing on each column, with a 
logical argument.  If that is true, then one solution (which is less 
direct, but maybe someone would call it elegant) is to put a class on 
the result of `f()`, and define a `[` method for that class that 
preserves the attributes you want to preserve.


In your example:

f <- function(x) {
   attr(x, "vi") <- length(x)
   class(x) <- c("withAttributes", class(x))
   x
}

`[.withAttributes` <- function(x, i) {
   subset <- NextMethod()
   attr(subset, "vi") <- attr(x, "vi")
   subset
}


x<- 1:5
z<-runif(5)
y<-rnorm(5)

mf <- model.frame(y~f(z), subset=x>=3)
attr(mf[,2],"vi") #it works
#> [1] 5


--
=
Vito M.R. Muggeo, PhD
Professor of Statistics
Dip.to Sc Econom, Az e Statistiche
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240; fax: 091 485726
http://www.unipa.it/persone/docenti/m/vito.muggeo
Assoc Editor: Statist Modelling, Statist Meth Appl
past chair, Statistical Modelling Society
coordinator, PhD Program in Econ, Businss, Statist

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] losing variable attributes when subsetting model.frame

2024-05-08 Thread Vito Muggeo via R-help

dear all,
I have a simple function f() which, when included in model.frame() via 
the formula, returns the variable itself with some attributes.
However when I specify the subset argument, the attributes get lost, 
apparently.


I would like to extract the attributes also when specifying the subset 
argument. Of course, I can build the whole dataframe without subsetting, 
taking the attributes and then build again the dataframe with 'subset', 
but I am wondering if a more direct (and elegant) solution exists.


Any suggestion?
Thank you very much,
best,
Vito


#=
Here a simple example..

f<- function(x){
attr(x,"vi")<-length(x)
x
}

x<- 1:5
z<-runif(5)
y<-rnorm(5)

mf<- model.frame(y~f(z))
attr(mf[,2],"vi") #it works


mf <- model.frame(y~f(z), subset=x>=3)
attr(mf[,2],"vi") #it does not work




--
=
Vito M.R. Muggeo, PhD
Professor of Statistics
Dip.to Sc Econom, Az e Statistiche
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240; fax: 091 485726
http://www.unipa.it/persone/docenti/m/vito.muggeo
Assoc Editor: Statist Modelling
past chair, Statistical Modelling Society
coordinator, PhD Program in Econ, Businss, Statist

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] OT: IWSM 2013

2013-01-18 Thread Vito Muggeo

dear all,
apologizes for this off topic.

I would like to inform you that registration and paper submission for
the 28th International Workshop on Statistical Modelling (IWSM)
to be held in Palermo (Italy) 8-12 July 2013 is now open at

http://iwsm2013.unipa.it

Register at http://iwsm2013.unipa.it/?cmd=registration and then submit
your abstract. Deadlines for Abstract submission is February 4, 2013
and for Early Registration is April 20, 2013.

**Invited Speakers**
1)Ciprian Crainiceanu   Johns Hopkins University, USA
2)Torsten Hothorn   Ludwig-Maximilians-Universität Munchen, DEU
3)Stefano M. Iacus  Università di Milano, ITA
4)Geoff McLachlan   University of Queensland, AUS
5)Hein Putter   Leiden University Medical Cente, NLD


**Short Course** (sunday 7 july 2013)
J. Fox, ' An Introduction to Structural Equation Modelling with the sem 
Package for R'.




best wishes,

Vito Muggeo,
on behalf of the IWSM2013 Scientific Committee



--
==
Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726
http://dssm.unipa.it/vmuggeo

28th IWSM
International Workshop on Statistical Modelling
July 8-12, 2013, Palermo
http://iwsm2013.unipa.it

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Interpretation of davies.test() in segmented package

2012-11-19 Thread Vito Muggeo

dear Greg,

 It does not, however give me
 Pr(|t|) for the break point coefficients.
Of course a test H_0: breakpoint=0 is meaningless..

 I need to answer the question
 H:0 Beta0=Beta with a certainty metric,
sorry, who is your Beta0?

davies.test() tests the hypothesis H0: leftSlope=rightSlope which 
implies diffSlope=0 and then the breakpoint does not exist. K in 
davies.test() means the number of evaluation points used to compute the 
approximate p-value..


Please contact me off list if you need more details (given detailed 
questions)


vito



Il 16/11/2012 20.57, Greg Cohn ha scritto:

My data:
I have raw data points that form a logit style curve as if they were a time
series. Which is to say they form 3 distinct lines with 3 distinct slopes
in backwards z pattern.  A certain class of my data looks essentially flat
to the eye with marginal oscillation. What is important to me is the x
value at which the state change is occurring, in other words, the break
point

Use of segmented():
Segmented does a very good job of capturing the breakpoints and fitting
three distinct slopes, i.e. linear models. It does not, however give me
Pr(|t|) for the break point coefficients. I need to answer the question
H:0 Beta0=Beta with a certainty metric, i.e. probability statistic. This is
especially important for my, flat looking data class.

davies.test() question:
davies.test() only excepts lm() or glm() objects as input. If I run
segmented to find 1 breakpoint instead of 2, I get a totally bogus answer.
Without knowing the breakpoints, how is this test able to assess the proper
breakpiont? It appears to only give 1 best breakpoint, which is not
consistent with the breakpoints found by segmented(). Also, is K the number
of breakpoints or the number of iterations that it evaluates the breakpoint?


Thanks in advance.



lmfit-glm(TotRad_KW~HRRPUA_kWm2,data=d1)
davies.test(lmfit,seg.Z=~HRRPUA_kWm2,k=1000,alternative=less,
beta0=0,dispersion=NULL)

Davies' test for a change in the slope

data:  Model =  gaussian , link = identity
formula = TotRad_KW ~ HRRPUA_kWm2
segmented variable = HRRPUA_kWm2
`Best' at = 561.205, n.points = 1000, p-value  2.2e-16
alternative hypothesis: less




segments - segmented(lmfit, seg.Z=~HRRPUA_kWm2,psi=c(475,550))
summary(segments)

***Regression Model with Segmented Relationship(s)***

Call:
segmented.glm(obj = lmfit, seg.Z = ~HRRPUA_kWm2, psi = c(475,
 550))

Estimated Break-Point(s):
 Est. St.Err
psi1.HRRP 430.2  4.087
psi2.HRRP 484.6  3.077

t value for the gap-variable(s) V:  0 0

Meaningful coefficients of the linear terms:
 Estimate Std. Error t value Pr(|t|)
(Intercept) -38.6993   274.7666  -0.141   0.8891
HRRPUA_kWm2   1.4297 0.7472   1.914   0.0668 .
U1.HRRP  42.2884 4.7696   8.866   NA
U2.HRRP -40.5897 4.7123  -8.614   NA
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for gaussian family taken to be 6934.706)

Null deviance: 70776718  on 31  degrees of freedom
Residual deviance:   180302  on 26  degrees of freedom
AIC: 377.19

Convergence attained in 2 iterations with relative change -1.662839e-14




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] fit a threshold function with nls

2012-10-16 Thread Vito Muggeo

Véronique,
in addition to Bert's comments, I would like to bring to your attention 
that there are several packages that perform 
threshold/breakpoint/changepoint estimation in R, including


cumSeg, segmented, strucchange, and bcp for a Bayesian approach

Moreover some packages, such as cghFLasso, perfom smoothing with a L1 
penalty that can yield a step-function fit.


I hope this helps you,

vito


Il 15/10/2012 22.21, Bert Gunter ha scritto:

Véronique:

I've cc'ed this to a true expert (Ravi Varadhan) who is one of those
who can give you a definitive response, but I **believe** the problem
is that threshhold type function fits have  objective functions whose
derivatives are discontinuous,and hence gradient -based methods can
run into the sorts of problems that you see.

**If** this is so, then you might do better using an explicit
non-gradient optimizer = rss minimizer via one of the optim() suite of
functions (maybe even the default Nelder-Mead); but this is definitely
where the counsel of an expert would be valuable. Also check out the
CRAN Optimization Task View for advice on optimization options.

Cheers,
Bert

On Mon, Oct 15, 2012 at 9:43 AM, Véronique Boucher Lalonde
veronique.boucher.lalo...@gmail.com wrote:

I am trying to model a dependent variable as a threshold function of
my independent variable.
What I mean is that I want to fit different intercepts to y following 2
breakpoints, but with fixed slopes.
I am trying to do this with using ifelse statements in the nls function.
Perhaps, this is not an appropriate approach.

I have created a very simple example to illustrate what I am trying to do.

#Creating my independent variable
x - seq(0, 1000)
#Creating my dependent variable, where all values below threshold #1 are 0,
all values above threshold #2 are 0 and all values within the two
thresholds are 1
y - ifelse(x  300, 0, ifelse(x700, 0, 1))
#Adding noise to my dependent variable
y - y + rnorm(length(y), 0, 0.01)
#I am having trouble clearly explaining the model I want to fit but perhaps
the plot is self explanatory
plot(x, y)
#Now I am trying to adjust a nonlinear model to fit the two breakpoints,
with known slopes between the breakpoints (here, respectively 0, 1 and 0)
threshold.model - nls(y~ifelse(xb1, 0, ifelse(xb2, 0, 1)),
start=list(b1=300, b2=700), trace=T)

I have played around with this function quite a bit and systematically get
an error saying: singular gradient matrix at initial parameter estimates
I admit that I don't fully understand what a singular gradient matrix
implies.
But it seems to me that both my model and starting values are sensible
given the data, and that the break points are in fact estimable.
Could someone point to what I am doing wrong?

More generally, I am interested in knowing
(1) whether I can use such ifelse statements in the function nls
(1) whether I can even use nls for this model
(3) whether I can model this with a function that would allow me to assume
that the errors are binomial, rather than normally distributed
(ultimately I want to use such a model on binomial data)

I am using R version 2.15.1 on 64-bit Windows 7

Any guidance would be greatly appreciated.
Veronique

 [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Comparing crossing survival curves

2012-07-05 Thread Vito Muggeo (UniPa)

hi isabel,
You have to decide if focus is on the survival curves or hazards.. 
Crossing hazards do not imply crossing survival curves


If you are dealing with crossing hazards, and you are interested in 
testing for an effect of a covariate (presumably with a crossing hazard 
effect), then a standard Cox model framework based on the martingale 
representation (start, stop, event) (rather than (time, event)) will 
suffice


Finally, if you are interested in estimating the crossing point you 
could have a look to the package flexCrossHaz (currently in the 
Archive.. Let me know if you are interested in becoming maintainer..:-) 
)


http://cran.r-project.org/src/contrib/Archive/flexCrossHaz/

best,
vito


Il 05/07/2012 12.37, Isabel Borges ha scritto:


 Hi
I want to compare the survival curves in two groups. Because the hazards are
not proportional (the curves cross) the log rank test or Cox proportional
hazard test are not suitable. How should such curves be compared? Comands
are welcome
Thanks in advance
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Piecewise Lasso Regression

2012-06-06 Thread Vito Muggeo (UniPa)

dear lucas,
yes you are right, segmented does not handle 'lars' objects.

Out of curisity, are you interested in selecting the number of 
breakpoints or in selecting additional covariates with linear parameters?


vito


Il 06/06/2012 0.01, Lucas Santana dos Santos ha scritto:

Hi All,


I am trying to fit a piecewise lasso regression, but package Segmented does not 
work with Lars objects.
Does any know of any package or implementation of piecewise lasso regression?

Thanks,

Lucas
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Piecewise Lasso Regression

2012-06-06 Thread Vito Muggeo (UniPa)

dear Lucas,
If you are interested in selecting the number of breakpoints here a 
possible remedy:


1. Fit a segmented model with a large number of breakpoints via the 
arguments psi=NA and stop.if.error=FALSE in seg.control() (see the 
example below)


2. extract the model matrix relevant to the variables of the breakpoints

3. Use lars with the extracted model matrix

Notice that there are at least two issues that I did not mention:
a) The criterion to be used to select the variables (i.e the breakpoints)
b) When you remove variables (i.e. breakpoints) you are assuming that 
the ML estimate of the remaining breakpoints is the same (and this is 
not the case here, as point estimates of the breakpoints depend on the 
number of breakpoints themselves..)



vito

###Here a simple example
 n=100
 xx-1:n/n
 psi0-seq(.2,.8,by=.2)
 X-sapply(c(0,psi0), function(x)pmax(xx-x,0))
 b-c(-.6,.7,.4,-1,.5)
 mu-drop(X%*%b)
 yy-mu+rnorm(n)*.01
 plot(xx,mu);lines(xx, mu)



library(segmented)
o-lm(yy~xx)
os-segmented(o,seg.Z=~xx,psi=NA,control=seg.control(stop.if.error=FALSE,K=30, 
n.boot=0))

plot(os, add=T, col=2) #have a look to results

#extract the new model matrix
new.X-model.matrix(os)[,os$nameUV$U]

#finally use lars on y~new.X

Hope this helps you,
vito

Il 06/06/2012 15.48, Lucas Santana dos Santos ha scritto:

Hi Vito,

I am more interested in selecting the number of breakpoints. My data has some 
structure and I believe that fitting a piecewise regression would be of great 
benefit.

Thanks,

Lucas

On Jun 6, 2012, at 4:54 AM, Vito Muggeo (UniPa) wrote:


dear lucas,
yes you are right, segmented does not handle 'lars' objects.

Out of curisity, are you interested in selecting the number of breakpoints or 
in selecting additional covariates with linear parameters?

vito


Il 06/06/2012 0.01, Lucas Santana dos Santos ha scritto:

Hi All,


I am trying to fit a piecewise lasso regression, but package Segmented does not 
work with Lars objects.
Does any know of any package or implementation of piecewise lasso regression?

Thanks,

Lucas
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726
http://dssm.unipa.it/vmuggeo






--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] getting the name of the working .Rdata file

2012-06-01 Thread Vito Muggeo (UniPa)

dear all,
I do not if it is a nonsense question..

Is it possible in the R session to get the name of the current .Rdata 
file that I ran?


I mean: suppose I double click the file myfile.Rdata. ls() returns the 
names of the objects in the current workspace (that is saved in 
myfile.Rdata). In the current R session, I would like to obtain 
myfile.Rdata. Is it possible?


Thanks in advance,
vito



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Finding multiple breakpoints - 'segmented' ?

2012-06-01 Thread Vito Muggeo (UniPa)

dear Peter,
Currently segmented handles multiple breakpoints for several variables 
(the limit discussed in the msg 2006 has been fixed..).


However you are looking for a somewhat complicated model where the 
breakpoint of the relationship 's.size' and 'R.AUC' depends on another 
covariate 'bedekking'. Namely it is an interaction in the breakpoint 
parameter (rather then in a linear parameter).


Currently segmented is not able to deal with these interactions, and 
splitting up the data into two or three groups according the values of 
bedekking could be a feasible alternative.


Hope this helps you,

vito




Il 01/06/2012 11.35, Peter Hoitinga ha scritto:

Hello,

I'm attempting to find multiple breakpoints in an association of my
response variable (R.AUC) with two explanatory variables ('s.size' and
'bedekking'). The association between 's.size' and 'R.AUC' shows a
plateau, but the value when this plateau is reached is differs for
different values of 'bedekking'.

Initially I thought these different values could be assessed by the
'segmented' function in the package of the same name, but this only
seems to give one value of a breakpoint for each explanatory variable.
A similar issue was treated in the following message from 2006:
https://stat.ethz.ch/pipermail/r-help/2006-January/086446.html

Does anybody know whether there are any packages that do have a
function for this? Or might it be a possibility to split up the data
per value of 'bedekking' and then do the segmented function?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Breakpoint in logistic GLM with 'segmented' package - error: replacement length zero

2012-05-25 Thread Vito Muggeo (UniPa)

dear Peter,
Your code appears correct, so it is difficult to reply without the data..

If you are interested in further details, please contact me off-list

vito


Il 25/05/2012 15.34, Peter Hoitinga ha scritto:

Hello all,

I've been having trouble with assessing a breakpoint in a logistic GLM
with two explanatory variables. For this analysis I've been using the
'segmented' package version 0.2-9.1. But I keep getting an error and I
don't see where I would be going awry. The situation is the following:

Two explanatory variables:
bedekking - a variable with possible values between 0 and 1 - mine
runs from 0.05 to 0.5, increasing with steps of 0.05
s.size - a count variable - increases from 3 to 25 with steps of 1,
and from 25 to 60 with steps of 5

Each combination of s.size and bedekking has 100 repeats so the
resulting dataframe 'dat.al2' consists of 3 observations of 3
variables.

Because the response variable has values between 0 and 1, I used a logistic GLM:


gmodel- glm(R.AUC ~ bedekking + s.size, data=dat.al2, family = 
quasibinomial(link=logit))


R.AUC increases with increasing s.size and decreasing bedekking,
looking at the graph shows that the association reaches a plateau at a
s.size of 10 and a bedekking of 0.45. So these are the values I use in
'psi' argument in the 'segmented' function


psi.mod- list(0.45, 10)
names(psi.mod)- c(bedekking, s.size)


Then I attempt to run the 'segmented' function:


seg.gm- segmented(obj = gmodel, seg.Z= ~bedekking + s.size, psi = psi.mod)


When I run this, after half a minute I get the following error:

Error in ifelse(is.list(o0), o0$dev.no.gap, 10^12) :
   replacement has length zero

Does anybody know what might be causing this error, and could somebody
point out where I might go wrong?

Thanks in advance,

Peter

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error with psi value for 'segmented' package for R

2012-05-08 Thread Vito Muggeo (UniPa)

dear Szymon,

what do you mean

it does not work for others.. that fit within similar range?

Each dataset has its own features and breakpoint estimation is not as 
simple as estimation of linear models even if your data fit within 
similar range.


I will contact you out of the list for details,

best,
vito




Il 08/05/2012 16.44, Szymon Biskup ha scritto:

Hi everyone,

while trying to use 'segmented' (R i386 2.15.0 for Windows 32bit OS) to 
determine the breakpoint I got stuck with an error message and I can't find 
solution. It is connected with psi value, and the error says:



Error in seg.glm.fit(y, XREG, Z, PSI, weights, offs, opz) :
   (Some) estimated psi out of its range



This is the code I am using:

library(segmented)
curva-read.table(lamintr1.txt, header=T)
attach(curva)
fit.glm-glm(gpp~temp, weight=NULL, family=gaussian)
plot(temp,gpp,xlab=expression(temp),
ylab=gpp,pch=15,cex=0.8,xlim=c(0,50), ylim=c(0,40))
o1-glm(gpp ~ temp, weight=NULL, family=gaussian)
os1-segmented(o1, seg.Z=~temp, psi=15, control=seg.control(n.boot=0,
display=T, it.max=5))
plot(os1, add=TRUE, res=TRUE, se=FALSE, show.gap=TRUE, linkinv = FALSE,
res.col=1, rev.sgn=FALSE, const=0)
summary(os1)


And the most surprising fact is that it works for some of my data, eg:

tempgpp
5   5.08050857592085
10  9.50809597873546
15  21.0206415558052
20  21.5340216521042
25  22.8455243983385
30  17.6106786978697


but not for the others, that fit within similar range (in what case I tired to 
change the psi value but it didn't help), eg:

tempgpp
5   10.1494724447878
10  9.64730588470101
15  19.3439579009423
20  20.6756229089911
25  13.7902544619339
30  21.9355758560751


or

tempgpp
5   8.64380785577685
10  9.47992535226006
15  16.7556554476544
20  14.5189937476639
25  20.6874556832793
30  17.5509059595314


I saw post with similar questons but none of them had the answer I am looking 
for.
Would there be anyone that could help me with this?


Thanks a lot for your time and help.


Best regards,

Szymon

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error with the 'segmented' package for R

2012-05-04 Thread Vito Muggeo (UniPa)

dear Szymon,
it is a bug (in the new version), thanks. It depends on the flat 
underlying relationship you are trying to estimate with a small sample..


I will correct it as soon as possible. Meanwhile you can use

o1-glm(gpp ~ temp)
os1-segmented(o1, seg.Z=~temp, psi=15, control=seg.control(n.boot=0, 
display=T, it.max=2))


best,
vito

--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] variable dispersion in glm models

2012-04-26 Thread Vito Muggeo (UniPa)

there are at least two alternatives

1) package dglm for Double generalized linear models

or probably better

2)package gamlss for Generalized Additive Models for Location Scale and 
Shape. Here you can use alternative, sometimes more appropriate, 
families and also you can include additive (nonparametric) terms in the 
linear predictors


Hope this help,
you

vito

Il 26/04/2012 10.00, Chris Rh ha scritto:

Hello,
I am currently working with the betareg package, which allows the fitting of a 
variable dispersion beta regression model (Simas et al. 2010, Computational 
Statistics  Data Analysis). I was wondering whether there is any package in R 
that allows me to fit variable dispersion parameters in the standard logistic 
regression model, that is to make the dispersion parameter contingent upon some 
covariates. I know that glm() from the base package allows the fitting of a common 
dispersion parameter via selection of the quasibinomial family, but glm() seems not 
to be able to individualize the dispersion parameter. Every hint is welcomed!
Chris

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] linear-by-linear association model in R?

2012-04-02 Thread Vito Muggeo (UniPa)

dear Christofer,

Try the following

d-expand.grid(a=1:3,b=1:4)
d$freq-rpois(12,5)
o-glm(freq~factor(a)+factor(b)+I(a*b), family=poisson, data=d)

vito


Il 02/04/2012 9.34, Christofer Bogaso ha scritto:

Dear all, can somebody give me some pointer how I can fit a
linear-by-linear association model (i.e. loglinear model for the
ordinal variables) in R? A brief description can be found here
'https://onlinecourses.science.psu.edu/stat504/node/141'.

Thanks for your help

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] glmnet() vs. lars()

2012-03-21 Thread Vito Muggeo (UniPa)

dear all,

It appears that glmnet(), when selecting the covariates entering the 
model, skips from K covariates, say, to K+2 or K+3. Thus 2 or 3 
variables are added at the same time and it is not possible to obtain 
a ranking of the covariates according to their importance in the model. 
On the other hand lars() adds the covariates one at a time.
My question is: is it possible to obtain a similar output of lars (in 
terms of order of the variables entering the model) using glmnet()?


many thanks,
vito


#Example (from ?glmnet)

set.seed(123)
x=matrix(rnorm(100*20),100,20)
y=rnorm(100)
fit1=glmnet(x,y)
fit1$df #no. of covariates entering the model at different lambdas

#Thus in the first model no covariate is included and in the second 
#one 2 covariates (V8 and V20) are included at the same time. Because 
#two variables are included at the same time I do not know which 
#variable (among the selected ones) is more important.

#Everything is fine with lars

o-lars(x,y)
o$df #the covariates enter one at a time.. V8 is better than V20


--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] looking for an ecological dataset

2012-01-19 Thread Vito Muggeo (UniPa)

dear all,
apologizes for this off-topic question.

I am looking for a ecological dataset (n100, say) including 
measurements of one or more growth variable and age.


Could anyone to suggest the R package/URL where I can find it?

many thanks,

vito


--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] breakpoints and nonlinear regression

2012-01-19 Thread Vito Muggeo (UniPa)

dear Julian,


Il 18/01/2012 14.36, crimsonengineer87 ha scritto:

Thanks for the comments. Yes, I also had segmented and then I went away from
that. I can't remember. I've tried using it but I get some sort of strange
error. Here's some code ...


it is difficult for me to help you without knowing which error you 
obtain.. If you refer to maximum number of iterations, it is a warning 
(not error). See the discussion in the paper on Rnews (that Achim 
suggested). The following code is expected to work


pavlu.glm- lm(Na ~ yield, data=pavludata)
pavlu.seg- segmented(pavlu.glm, seg.Z=~yield, psi=1000)
with(pavludata, plot(yield, Na))
plot(pavlu.seg, add=TRUE)

See in ?segmented and ?plot.segmented for additional examples and 
contact me off list if you have additional questions


best,
vito






pavlu.glm- glm(Na ~ yield, data=pavludata, family=gaussian)
pavlu.seg- segmented(pavlu.glm, seg.Z=~yield, psi=1000,
control=seg.control(display=FALSE))

plot.series- function()
{
plot(pavlu.seg)
plot(pavlu.seg, add=TRUE, linkinv=TRUE, lwd=2, col=2:3, lty=c(1,3))
lines(pavlu.seg, col=2, pch=19, bottom=FALSE, lwd=2)
}

jpeg(pavlu-cuttingsystem-segmented.jpg, width = 1000, height = 700, units
= px)
plot.series()

## Turn off device driver (to flush output to JPG)
dev.off()

1. I don't think I'm doing my plotting right. I'm just not sure how that
works with segmented.
2. My error is something about an error in do.call(lines) and that the
maximum number of iterations has been reached. Am I missing something with
glm or lm?

Thanks again.

--
View this message in context: 
http://r.789695.n4.nabble.com/breakpoints-and-nonlinear-regression-tp4303629p4306657.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with segmented

2012-01-11 Thread Vito Muggeo (UniPa)

dear Phil,
I am not able to read the error message.. did you forget it?

However: does x exist in the workspace?

The following lines work:

myreg2 = lm(y ~ x, data=xy)
mysegmented = segmented(myreg2, seg.Z=~x, psi=c(245000))

myreg2 = lm(xy$y ~ xy$x)
x-xy$x
mysegmented = segmented(myreg2, seg.Z=~x, psi=c(245000))


The following line does *not* work (as specified in ?segmented, argument 
seg.Z)


myreg2 = lm(xy$y ~ xy$x)
mysegmented = segmented(myreg2, seg.Z=~xy$x, psi=c(245000)) #error


Hope to have been clear,
vito



Il 10/01/2012 17.17, Filoche ha scritto:

Hi everyone.

I'm trying to use the segmented function with the following data:


For instance, I use segmented package as follow:

myreg2 = lm(xy$y ~ xy$x)
mysegmented = segmented(myreg2, seg.Z=~x, psi=c(245000), control =
seg.control(display=FALSE))

Which get me to the following error :


As a break point, a starting guess of 245000 seems fair.

Anyone has an idea why I'm getting such error?

Regards,
Phil

--
View this message in context: 
http://r.789695.n4.nabble.com/Problem-with-segmented-tp4282398p4282398.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mgcv gam predict problem

2011-03-28 Thread Vito Muggeo (UniPa)

dear Philip,
I am not able to solve your problem, however the error message you get 
does not depends on mgcv::gam, therefore gam(,..outer.ok=TRUE) or 
predict.gam(,outer.ok=TRUE) do not make sense.


The error message comes from the function splines::splineDesign which is 
called when the option bs=ps is used.


I think the error depends on the fact that you want to predict a value 
outside the observed range of the covariate. When using P-splines the 
predictions outside the range follow a given polynomial..


hope this helps
vito




Il 28/03/2011 7.10, Philip Gautier ha scritto:

Hello

I'm using function gam from package mgcv to fit splines.  When I try
to make a prediction slightly beyond the original 'x' range, I get
this error:


A = runif(50,1,149)
B = sqrt(A) + rnorm(50)
range(A)

[1]   3.289136 145.342961



fit1 = gam(B ~ s(A, bs=ps), outer.ok=TRUE)
predict(fit1, newdata=data.frame(A=149.9), outer.ok=TRUE)

Error in splineDesign(knots, x, ord, derivs, outer.ok = outer.ok) :
   the 'x' data must be in the range 3.14708 to 145.485 unless you set
'outer.ok = TRUE'




I've inserted the argument 'outer.ok=TRUE' as you can see, but it
hasn't helped.  How can I obtain this prediction?

Thanks,
Philip Gautier
Dept. of Mathematics and Statistics
American University, Washington, DC

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot not generic

2011-01-28 Thread Vito Muggeo (UniPa)

dear Nick,

getAnywhere(plot.glmnet)


Note the message you get when you type

methods(plot)
...
   Non-visible functions are asterisked




Il 28/01/2011 14.26, Nick Sabbe ha scritto:

Hello list.



I was trying to see some of the code for plot.glmnet in package glmnet (this
function name is in the documentation).

After loading the library, I tried the obvious typing in the name, but I
received a message telling me it could not be found.



So I fiddled around a little, and noticed that R does not recognize 'plot'
as a generic function, and as such, showMethods does not work.

This seems to conflict with the documentation for plot.



So 2 questions:

. How can I find the code of plot.glmnet

. Why is plot not seen as generic?



Thx.



Nick Sabbe

--

ping: nick.sa...@ugent.be

link:http://biomath.ugent.be/  http://biomath.ugent.be

wink: A1.056, Coupure Links 653, 9000 Gent

ring: 09/264.59.36



-- Do Not Disapprove




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Convert a matrix's columns to list

2011-01-18 Thread Vito Muggeo (UniPa)

hi feng,
a possible solution is

b1-apply(a,2,list)

and possibly

lapply(b1,unlist)

if you want exactly the output equal to list(a[, 1], a[, 2])

best,
vito

Il 18/01/2011 13.53, Feng Li ha scritto:

Dear R,

Is there an efficient way to make a list that each element is from the
corresponding column of a matrix. For example, if I have a matrix a


a- matrix(1:10, 5, 2)
a

  [,1] [,2]
[1,]16
[2,]27
[3,]38
[4,]49
[5,]5   10

I would like to have a list b like this


b- list(a[, 1], a[, 2])
b

[[1]]
[1] 1 2 3 4 5

[[2]]
[1]  6  7  8  9 10


Thanks in advance!


Feng



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Choosing statistical test - Fisher's Exact Test?

2011-01-18 Thread Vito Muggeo (UniPa)

It appears that you have a 2x2 table coming from paired binary data..

If this is the case the McNemar test is appropriate.
See
?mcnemar.test

or even better the package exact2x2, function mcnemar.exact() for an 
exact approach,


vito




Il 18/01/2011 14.40, debz ha scritto:


Hi I was wondering whether anyone can help me with this problemit's been
driving me nuts, I've been trying to figure it out for months and months
without success!! Basically I have a group of participants who attended 2
experimental sessions a few months apart. I took measures of the way they
approach two tasks at Time 1 and the same two tasks at Time 2. I have
categorical data (a frequency table for each task) consisting of the goal
profiles that participants adopted at Time 1 and at Time 2. I would like to
test whether there is a significant difference between the goal profiles
that participants adopted at Time 1 and Time 2. Is Fisher's exact test the
test to use in this case?

I would really really appreciate any help!!

Thanks


Debbie


--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] confidence interval for logistic joinpoint regression from package ljr

2010-12-01 Thread Vito Muggeo (UniPa)

dear M.,
I do not know how to get the SE for the joinpoint (or breakpoint) from 
your ljr fit. However you can find useful the segmented package which 
works for any GLM (including the logistic one) and it returns 
(approximate) StErr (and Conf Int) also for the joinpoint (breakpoint in 
the segmented package). For some examples, see the R news paper

http://cran.r-project.org/doc/Rnews/Rnews_2008-1.pdf

Also, as regards to the APC, you could find interesting the following note

http://onlinelibrary.wiley.com/doi/10.1002/sim.3850/abstract

hope this helps you,
vito


Il 30/11/2010 23.54, moleps ha scritto:

I´m trying to run a logistic joinpoint regression utilising the ljr package. 
I´ve been using the forward selection technique to get the number of knots for 
the analysis, but I´m uncertain as to my results and the interpretation. The 
documentation is rather brief ( in the package and the stats in medicine 
article is quite technical) and without any good examples. At the moment I´m 
thinking
1)find the number of knots both using forward and backward techniques and see 
if they are close
2)present the annual percent change (APC) for each of the intervals, ie my 
present data (1950-2010 in 5 year intervals) is giving me

Variables  Coef
b0 Intercept -131.20404630
g0 t0.06146463
g1 max(t-tau1,0)   -0.51582466
g2 max(t-tau2,0)0.43429615

Joinpoints:

1 tau1= 1990.5
2 tau2= 1995.5

APC 1950-1990=exp(0.06)=1.06--6%
1990-1995=exp(0.06-0.51)=exp(-0.45)=0.63--  -37%
1995-2010=exp(0.06-0.51+0.43)---2%

3) Preferably a confidence interval for the APC should be given. However, this 
I havent figured out yet.

//M

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] One silly question about tapply output

2010-10-27 Thread Vito Muggeo (UniPa)

dear Vincy,
Firstly, a suggestion: to increase the probability of getting help, you 
should provide reproducible code (people can do copy-and-paste of your 
code and to modify the code to obtain the response.. )


However a possible solution (not tested, of course..) could be simply

a-tapply(rate, rating, mean)
d-data.frame(rating=names(a),mean=a)


best,
vito


Il 27/10/2010 12.39, Vincy Pyne ha scritto:

Dear R helpers

I have a data which gives Month-wise and Rating-wise Rates. So the input file 
is something like

month   rating   rate
JanuaryAAA 9.04
February  AAA 9.07
..
..
Decemeber AAA8.97
January   BBB   11.15
February BBB11.13



January  CCC17.13
.

December   CCC   17.56

and so on.

My objective is to calculate Rating-wise mean rate, for which I have used

rating_mean = tapply(rate, rating, mean)

and I am getting following output


tapply(rate, rating, mean)

   AAA   BBB  CCC
9.1104   11.136163717.1606779

which is correct when compared with an excel output.

However, I wish to have my output something like a data.frame (so that I should 
be able to save this output as csv file with respective headings and should be 
able to carry out further analysis)

Rating Mean
AAA9.1104
BBB   11.1361637
CCC   17.1606779


Please guide as how should I achieve my output like this.

Thanking in advance.

Regards

Vincy






[[alternative HTML version deleted]]




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] building lme call via call()

2010-10-25 Thread Vito Muggeo (UniPa)

dear all,
I would like to get the lme call without fitting the relevant model.

library(nlme)
data(Orthodont)
fm1 - lme(distance ~ age, random=list(Subject=~age),data = Orthodont)

To get fm1$call without fitting the model I use call():

my.cc-call(lme.formula, fixed= distance ~ age, random = list(Subject 
= ~age))


However the two calls are not the same (apart from the data argument I 
am not interested in), as call() *does* evaluate the arguments:


 my.cc$random
$Subject
~age

 fm1$call$random
list(Subject = ~age)

How is it possible to get the right call (similar to the one from 
fm1$call) by means of call()?


thanks,
vito



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Interactions in GAMMs

2010-07-21 Thread Vito Muggeo (UniPa)

Hi Karen,
I think you should decide what you mean for interaction. s(x:y) is 
meaningless


If you want to fit a surface you should use s(x,y).

If you want to fit a varying coefficient model (interaction between a 
linear and a smooth term) you should use the argument by in s().


The help files of the mgcv package by S. Wood  are quite clear and they 
also include a lot of examples.


Hope this helps you

vito



Karen Moore ha scritto:

Hi,

I've an issue adding an interaction to a GAMM:

My model was of form:

gamm1 - gamm(TOTSR ~ fROT + s(PH) + s(LOI) + s(ASP) + s(SQRT_ELEV) + CANCOV
+ s(SQRT_TOTCWD) + s(WELLF) + s(WELLN) + s(OLDWDLD) + s(DISTWOOD) +
s(Annprec) + s(OLDWDLD:DISTWOOD) + (1|fSITE),  family = poisson, data =
BIOFOR2)

with interaction of s(OLDWDLD:DISTWOOD).

However I got this error message below but don't know what it means? (my
data is composed of info for 150 plots)

#I Warning messages:
#2: In OLDWDLD:DISTWOOD :
 # numerical expression has 150 elements: only the first used
#3: In OLDWDLD:DISTWOOD :
#  numerical expression has 150 elements: only the first used

Can anyone offer advice on how to include this interaction in GAMM model?
Thanks

Karen Moore

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Peak Over Threshold values

2010-05-26 Thread Vito Muggeo (UniPa)

dear Tonja,
By plotting your data

plot(df)

it seems to me that you are looking for a piecewise linear 
relationships. If this is the case, have a look to the package 
segmented. You have to specify or not the number and the starting values 
for the breakpoints


library(segmented)
olm-lm(walevel~day)

#specify number and starting values for the breakpoints..
oseg-segmented(olm, seg.Z=~day, psi=c(20,50,80))
plot(oseg,add=TRUE,col=2)
oseg$psi

#or you can avoid to specify starting values for psi
oseg-segmented(olm, seg.Z=~day, 
psi=NA,control=seg.control(stop.if.error=FALSE,K=30))


plot(oseg,add=TRUE,col=2)
oseg$psi


best,
vito


Tonja Krueger ha scritto:

   Dear List

   I hope you can help me: I’ve got a dataframe (df) within which I am looking
   for Peak Over Threshold values as well as the length of the events. An event
   starts when walevel equals 5.8 and it should end when walevel equals the
   lower threshold value (5.35).

   I tried “clusters (…)” from “evd package”, and varied r (see example) but it
   did not work for all events (again see example).

   walevel - c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, 5.86, 5.91, 5.91,
   5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, 6.11, 6.14, 6.12,
   6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, 5.72, 5.70, 5.65,
   5.59, 5.51, 5.43, 5.33, 5.30, 5.25, 5.22, 5.21, 5.19, 5.19, 5.19, 5.13,
   5.15, 5.07, 5.04, 5.02, 4.99, 5.05, 5.07, 5.12, 5.17, 5.22, 5.32, 5.29,
   5.33, 5.33, 5.36, 5.37, 5.42, 5.43, 5.48, 5.55, 5.57, 5.66, 5.68, 5.72,
   5.77, 5.79, 5.81, 5.80, 5.80, 5.77, 5.72, 5.70, 5.66, 5.62, 5.62, 5.61,
   5.59, 5.57, 5.51, 5.47, 5.41, 5.39, 5.40, 5.40, 5.42, 5.45, 5.47, 5.50,
   5.50, 5.49, 5.43, 5.39, 5.33, 5.26)

   day - c(1:100)

   df - data.frame(day,walevel)

   library(evd)
   clusters(df$walevel, u = 5.80, r = 1, ulow = 5.35, cmax = T, plot = T)
   clusters(df$walevel, u = 5.80, r = 50, ulow = 5.35, cmax = T, plot = T)

   What have I done wrong?

   Tonja
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] only actual variable names in all.names()

2010-03-04 Thread Vito Muggeo (UniPa)

dear all,
When I use all.vars(), I am interest in extracting only the variable names..
Here a simple example

all.vars(as.formula(y~poly(x,k)+z))

returns
[1] y x k z

and I would like to obtain
y x z

Where is the trick?

many thanks
vito


--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] partial matching with grep()

2009-11-02 Thread Vito Muggeo (UniPa)

dear all,
This is a probably a silly question.
If I type
 grep(x,c(a.x ,b.x,a.xx),value=TRUE)
[1] a.x  b.x  a.xx

Instead, I would like to obtain only
a.x  b.x
How is it possible to get this result with grep()?

many thanks for your attention,
best,
vito

--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Estimation in a changepoint regression with R

2009-10-15 Thread Vito Muggeo (UniPa)
There are at least two R packages dealing with changepoint estimation, 
segmented and strucchange.

Two possible relevant papers are available:
1)Journal of Statistical Software for strucchange (2002, Vol.7, Issue2) 
2)Rnews for segmented (2008, 8/1: 20-25)


Hope this helps you
vito

FMH ha scritto:

Dear All,

I'm trying to do the estimation in a changepoint regression problem via R, but never found any suitable function which might help me to do this. 


Could someone give me a hand on this matter?

Thank you.





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fitting a linear model with a break point

2009-09-08 Thread Vito Muggeo (UniPa)

dear Dan,
As far as I know, the strucchange package can be helpful for you..

On the other hand, if your regression function is continuous at the 
unknown break points to be estimated, you could try the segmented package.


Hope this helps you,
vito


Daniel Brewer ha scritto:

Hello,

I would like to test some data to see whether it has the shape of a step
function (i.e. y1 up until x_th and then y2 where x_th is the
threshold).  The threshold x_th is unknown and the x values can only
take discrete values (0,1,2,3,4).

An example would be:
data- data.frame(x=1:20,y=c(rnorm(10),rnorm(10,10)))


I was thinking along the lines of fitting some sort of piiecewise linear
model which has the gradient constrained to zero trying out all possible
different threshold and taking the one with the least residuals.  I am
not sure how to implement this in R.  Anyone got any ideas?

Also is there a way of including the threshold in the actual model, so
that could be estimated too?

Thanks

Dan



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Zinb for Non-interger data

2009-07-21 Thread Vito Muggeo (UniPa)
I think that the (impressive) gamlss package (see 
http://www.gamlss.com) may be helpful.


If I remember correctly, in gamlss you can fit model with zero-inflated 
continuous distributions


hope this helps you,
vito


Alain Zuur ha scritto:


JPS2009 wrote:

Sorry bit of a Newbie question, and I promise I have searched the forum
already, but I'm getting a bit desperate!

I have over-dispersed, zero inflated data, with variance greater than the
mean, suggesting Zero-Inflated Negative Binomial - which I attempted in R
with the pscl package suggested on
http://www.ats.ucla.edu/stat/R/dae/zinbreg.htm

However my data is non-integer with some pesky decimals (i.e. 33.12) and
zinb / pscl doesn't like that - not surprising as zinb is for count data,
normally whole integers etc.

Does anyone know of a different zinb package that will allow non-integers
or and equivalent test/ model to zinb for non-integer data? Or should I
try something else like a quasi-Poisson GLM?


Apologies for the Newbie question! Any help much appreciated!
Thanks!



Is it really non-integer...or is it a density (in which case you could use
NB + offset)?


The quasi-Poisson will not help you with the zero inflation.
I'm afraid you will have to do some hard programming by combining the 0-1
binomial part with a continuous distribution on the second part of the
data..and I guess the easiest is to do this in MCMC. Perhaps the Gamma
distribution can be used? You would have to adjust all likelihood equations
as Gamma doesn't allow for zeros. But perhaps another continuous
distribution is more appropriate...depends on your data.


Alain Zuur




--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Linear Regression Problem

2009-07-14 Thread Vito Muggeo (UniPa)

dear Alex,
I think your problem with a large number of predictors and a relatively 
small number of subjects may be faced via some regularization approach 
(ridge or lasso regression..)


hope this helps you,
vito

Alex Roy ha scritto:

Dear All,
 I have a matrix  say, X ( 100 X 40,000) and a vector say, y
(100 X 1) . I want to perform linear regression. I have scaled  X matrix by
using scale () to get mean zero and s.d 1  . But still I get very high
values of regression coefficients.  If I scale X matrix, then the regression
coefficients will bahave as a correlation coefficient and they should not be
more than 1. Am I right? I do not whats going wrong.
Thanks for your help.
Alex


*Code:*

UniBeta - sapply(1:dim(X)[2], function(k)
+ summary(lm(y~X[,k]))$coefficients[2,1])

pval - sapply(1:dim(X)[2], function(l)
+ summary(lm(y~X[,l]))$coefficients[2,4])

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] zero cells in one variable in logistic regression

2009-07-13 Thread Vito Muggeo (UniPa)

dear anna,
if you are not interested in point estimate and SE of the parameter of 
the aforementioned categorical variable, I believe the conventional 
glm(..,family=binomial) is correct. In particular, the returned deviance 
is reliable and also it is the relevant likelihood ratio test..


hope this helps,
vito


David Winsemius ha scritto:


On Jul 13, 2009, at 5:37 AM, anna.bucharova wrote:



Dear all.
I am sort of beginner with R. I do logistic regression with binomial
response variable and several continuous and categorical variables. In 
one

categorical variable, zero cell occures (2x2 table looks like
7 - 0
23 - 25
This leads to overestimating of odds ratio and inflated confidence 
interval
for odds for given variable. The variable is significant in univariate 
test.
I do not necessarilly need odd ratio, but I need the explained 
deviance by

this variable and I really want to keep this variable in the model. It
probably matters for explained deviance. How to treat this problem?
Thanks for help, Anna Bucharova
--


You might consider glmrob in package:robustbase. See 
http://www.jstatsoft.org/v10/i04


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] gradually switching regression

2009-06-24 Thread Vito Muggeo (UniPa)

Hi,
Although BaconWatts (1971) assume a transition, the aim is to estimate 
the breakpoint where the linear relationship changes. The start- and 
end-point of the transition phase are not parameters to estimate ; it is 
a trick to estimate the model.


As a possible alternative, you could have a look to package segmented 
where pure piecewise straight-lines are fitted..


Hope this helps you..
vito

Benjamin Volland ha scritto:

Hello,
I'm trying to find an algorithm to estimate a switching regression model 
based on the 1990 Economics Letters paper by Ohtani/Kakimoto/Abe or the 
earlier version from 1985 (Ohtani/Katayama, Economic Studies Quarterly; 
assuming as a transition path a polynomial of order 1).


I found an idea for using nls here: 
http://www.biostat.wustl.edu/archives/html/s-news/2000-04/msg00223.html. 
Unfortunately it's based on the 1971 BaconWatts model, which does not 
allow obtaining explicitly both start- and end-point of the transition 
phase. If anyone stumbled across such a function or has suggestions on 
how to do it, I would greatly appreciate hearing about them.


Thanks so much
Ben Volland

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Linear Models: Explanatory variables with uncertainties

2009-06-15 Thread vito muggeo
Probably you are looking for EIV (errors-in-variables) or ME 
(measurement errors) models.


simex is a possible package which needs to know the error variance..
Also RSiteSearch() may be helpful..

hope this helps,
vito


kpal ha scritto:

One of the assumptions, on which the (General) Linear Modelling is
based is that the response variable is measured with some
uncertainties (or weighted), but the explanatory variables are fixed.
Is it possible to extend the model by assigning the weights to the
explanatory variables as well? Is there a package for doing such a
model fit?

Thanks

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Discover significant change in sorted vector

2009-04-23 Thread vito muggeo

dear Hans,
As pointed out by Petr Pikal, segmented allows to estimate breakpoints 
of (continuous) piecewise relationships. That is, the mean lines are 
assumed to be connected and segmented tries to join them at the 
estimated breakpoints. The estimated breakpoints may be any value in the 
range of the segmented covariate (not necessarly an integer). 
Computation times are substantially indipendent of the number of 
breakpoints to be estimated.


strucchange thinks different; it allows disconnected segments and the 
estimated breakpoints have to be the observed time indices.


Hope this helps you,
vito

Petr PIKAL ha scritto:

Hi

One possibility is to use segmented

e.g


a - c(2,3,3,5,6,8,8,9,15, 25, 34,36,36,38,41,43,44,44,46)
ix - seq_along(a)
plot(ix,a)
library(segmented)
fit-lm(a~ix)
fit.s-segmented(fit, ~ix, list(ix=c(5,10)))
fit.s

Call: segmented.lm(obj = fit, seg.Z = ~ix, psi = list(ix = c(5, 10)))

Meaningful coefficients of the linear terms:
(Intercept)  ix   U1.ix   U2.ix 
  0.6785714   1.0714286   8.9285714  -8.450 


Estimated Break-Point(s) psi1.ix psi2.ix :  8.476 10.880

Regards
Petr


r-help-boun...@r-project.org napsal dne 22.04.2009 15:45:55:

Gabor, initially this looked like the perfect solution, exactly what I 
need.


Unfortunately it is too expensive/costly. I have vectors of length 800 
and more, my machine needs  5 minutes (I aborted) to compute the 
breakpoints. Required is computation time  1 sec. :)


Any other suggestions? Maybe there is another approach not that 
perfect as from the strucchange package, but still sufficient?


Best
Henning


Am 22.04.2009 um 14:55 schrieb Gabor Grothendieck:


Try this:


a - c(2,3,3,5,6,8,8,9,15, 25, 34,36,36,38,41,43,44,44,46);
ix - seq_along(a)
library(strucchange)
bp - breakpoints(a ~ ix, h = 4)
bp

Optimal 3-segment partition:

Call:
breakpoints.formula(formula = a ~ ix, h = 4)

Breakpoints at observation number:
7 11

Corresponding to breakdates:
0.3684211 0.5789474

plot(a ~ ix)
lines(ix, fitted(bp))


On Wed, Apr 22, 2009 at 7:27 AM, Hans-Henning Gabriel
hanshenning.gabr...@gmail.com wrote:

Hi,

suppose I have a simple sorted vector like this:

a - c(2,3,3,5,6,8,8,9,15, 25, 34,36,36,38,41,43,44,44,46);

Is there a function in R, I can use to discover that from index 8 
to index

11 the values are changing significantly?
The function should return a value pointing to one of the indices 
8, 9, 10

or 11. Any of them would be fine.
The difficulty is that there may be no big gap. I mean, indices 8 
and 11 are
somehow connected by indices 9 and 10. So, it's not an option to 
just

search for biggest difference between the values.

Perfect would be a function that is able to discover multiple 
changes if it

is present in the data.

Thanks!!
Henning

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 

http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 

http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Piecewise

2009-03-26 Thread vito muggeo

Hi,
In addition to useful Ben's suggestion, you have a, possibly simpler, 
alternative.


If you are willing to assume to know the power of you piecewise 
polynomial (beta parameter according to code of Ben) you can use the 
package segmented. Using the data generated by Ben in his previous 
email, you get


olm-lm(y~0+I(x^(-.5))) #note that 0 as the true model has not interc
os1-segmented(olm,seg.Z=~x,psi=40)

Results are comparable with the ones returned by the nls() (even as 
regard the St.Err of the breakpoint), although segmented returns a 
somewhat smaller residual sum of squares :-)


***segmented() vs. nls(): some possibles differences***

1)In segmented, you are assuming to know the power of polynomial. In 
practice you can try several different values such as {-1,-.5,1,2,3}, 
say, and to assess the fit.. However the St.Err from the other estimates 
might be underestimated..


2)In segmented, you need to specify starting value only for the breakpoint.

3)In segmented, you can easy generalize the model: multiple breakpoints 
and/or additional linear covariates and/or non-continuous response 
and/or non-identity link functions (e.g. gamma with log-link)...



Hope this helps you,

vito



Ben Bolker ha scritto:

Joe Waddell joecwaddell at gmail.com writes:


Hi,
I am a biologist (relatively new to R) analyzing data which we predict 
to fit a power function.  I was wondering if anyone knew a way to model 
piecewise functions in R, where across a range of values (0-x) the data 
is modeled as a power function, and across another range (x-inf) it is a 
linear function.  This would be predicted by one of our hypotheses, and 
we would like to find the AICs and weights for a piecewise function as 
described, compared with a power function across the entire range.


I have looked into the polySpline function, however it appears to use 
only polynomials, instead of the nls models I have been using.  Thanks 
in advance for any help you might be able to offer.


 You should be able to do it in nls as follows ...
There are some potentially subtle issues if you get into
the details of likelihood profile confidence intervals
for the breakpoint (since the likelihood profile is flat 
between/discontinuous across the locations of data points),

but hopefully that won't bite you in your applications.

## function to generate piecewise power-law/linear data
f - function(x,brk,alpha,beta,b,sd) {
  mu - ifelse(xbrk,alpha*x^beta,(alpha*brk^beta)-b*(x-brk))
  rnorm(length(x),mean=mu,sd=sd)
}

## generate data
set.seed(1001)
x - runif(1000,max=100)
y - f(x,brk=50,alpha=100,beta=-0.5,b=1,sd=5)

## take a quick look, vs. theoretical curve
plot(y~x)
curve(f(x,50,100,-0.5,1,0),col=2,add=TRUE,n=1000,lwd=2)

## fit, using the I() command to do the piecewise part
dat - data.frame(x,y)
fit1 - nls(y~I(xbrk)*alpha*x^beta+I(xbrk)*((alpha*brk^beta)-b*(x-brk)),
start=list(brk=60,alpha=110,beta=-0.75,b=2),data=dat)

## plot the fit
xvec - seq(0,100,length=200)
lines(xvec,predict(fit1,newdata=data.frame(x=xvec)),col=4,lwd=2)
## testing ...
with(as.list(coef(fit1)),
 lines(xvec,f(xvec,brk,alpha,beta,b,sd=0),col=5,lwd=2))


  Ben Bolker

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] harmonic function fiting? how to do

2009-02-10 Thread vito muggeo

dear Yogesh
It appears that your model based on parametric terms is too inflexible..

A better alternative to parametric harmonic terms is a spline-based 
approach, may be cyclic splines.. Have a look to the mgcv package..


vito



Yogesh Tiwari ha scritto:

Dear R Users,
I have a CO2 time series. I want to fit this series seasonal cycle and trend
with fourth harmonic function,
and then compute residuals.

I am doing something like:

file-read.csv(co2data.csv)
names(file)
attach(file)
fit-lm(co2~1+time+I(time^2)+sin(2*pi*time)+cos(2*pi*time)+sin(4*pi*time)+cos(4*pi*time)+
sin(6*pi*time)+cos(6*pi*time)+sin(8*pi*time)+cos(8*pi*time),data=file)

fit$residuals

# variable 'co2' is in ppmv and variable 'time' is in the form of decimal
time.

The problem is: when I plot above residuals vs. time, it still shows some
seasonal cycle with time.
SO, I doubt that I am doing something wrong.

Kindly help, how to fit correctly, a fourth harmonic function on CO2 which
is varying with variable 'time'.

Great thanks,

Regards,
Yogesh



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R equivalent of SAS Cochran-Mantel-Haenszel tests?

2009-02-09 Thread vito muggeo

Dear Michael,
It sounds as a linear-by-linear loglinear model (and its variants) which 
uses scores for one or more variables in the table.. (see Agresti, 1990, 
Categorical Data Analysis. I do remember the pages and I have not the 
book here..)


If this is the case, you can use standard call to glm(.., 
family=poisson) with score variables in the linear predictor. For 
instance for a two-way table with ordered variables the linear-by-linear 
model is,


glm(freq~factor(x)+factor(y)+I(score.x*score.y), family=poisson)

The CMH test, probably, is the score test of the parameter of 
I(score.x*score.y)..


best,
vito

Michael Friendly ha scritto:
In SAS, for a two-way (or 3-way, stratified) table, the CMH option in 
SAS PROC FREQ gives
3 tests that take ordinality of the factors into account, for both 
variables, just the column variable

or neither.   Is there an equivalent in R?
The mantelhaen.test in stats gives something quite different (a test of 
conditional independence for

*nominal* factors in a 3-way table).

e.g. I'd like to reproduce:
*-- CMH tests;
proc freq data=sexfun order=data;
 weight count;
 tables husband * wife / cmh chisq nocol norow;
 run;

 The FREQ Procedure
Summary Statistics for Husband by Wife
 Cochran-Mantel-Haenszel Statistics (Based on Table Scores)

   StatisticAlternative HypothesisDF   Value  Prob

   1Nonzero Correlation1 10.01420.0016
   2Row Mean Scores Differ 3 12.56810.0057
   3General Association9 16.76890.0525



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] holidays effect

2009-02-04 Thread vito muggeo



Gabor Grothendieck ha scritto:

One possibility if you don't have to have days is to reduce it to a
weekly or monthly
series.
Alternatively you can put a dummy variable (1=holiday  and zero 
otherwise) in the regression model for your response. For instance, you 
could use the xreg argument of the arima() function.


This allows to avoid aggregation of your data which, in general, is not 
recommended..


best,
vito



On Wed, Feb 4, 2009 at 8:46 AM, elisia
elisabetta.fab...@guest.telecomitalia.it wrote:

how can I eliminate the influence of the festivities in a time series with
daily data?I tried to remove them and replace their value with a value of
interpolation using na.approx (). There is an alternative method?
--
View this message in context: 
http://www.nabble.com/holidays-effect-tp21830785p21830785.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] OFF topic testing for positive coeffs

2008-12-17 Thread vito muggeo

Dear all,
This is off-topic,
however I hope someone can give me useful suggestion..

Given the regression model
y = b0 + b1*x + e
I am interested in testing for positive coeffs, namely
H0: b00 AND b10
H1: b0,b1 unconstrained

It is simple to estimate the model under H0 and H1 (there are several 
suggestions on the Rlist about estimation but nothing about testing..) 
perform a likelihood ratio test by comparing the logLik under the 
constrained and the unconstrained models, however I do not know how many 
degrees of freedom..


Model under H0 uses two df, however it reasonable to believe that the 
real dimension is =2..


Is there anyone which can give me any advices or suggest me references?

Many thanks,

vito


--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] OT: (quasi-?) separation in a logistic GLM

2008-12-16 Thread vito muggeo

dear Gavin,
I do not know whether such comment may be still useful..

Why are you unsure about quasi-separation?
I think that it is quite evident in the plot

plot(analogs ~ Dij, data = dat)

Also it may be useful to see the plot of the monotone (profile) deviance 
(or the log-lik) for the coef of Dij,


xval-seq(-20,0,l=50)
ll-vector(length=50)
for(i in 1:length(xval)){
mod - glm(analogs ~ offset(xval[i]*Dij), data = dat, family = binomial)
ll[i]-mod$dev
}

plot(xval, ll)

Hope this helps you,

vito

Gavin Simpson ha scritto:

Dear List,

Apologies for this off-topic post but it is R-related in the sense that
I am trying to understand what R is telling me with the data to hand.

ROC curves have recently been used to determine a dissimilarity
threshold for identifying whether two samples are from the same type
or not. Given the bashing that ROC curves get whenever anyone asks about
them on this list (and having implemented the ROC methodology in my
analogue package) I wanted to try directly modelling the probability
that two sites are analogues for one another for given dissimilarity
using glm().

The data I have then are a logical vector ('analogs') indicating whether
the two sites come from the same vegetation and a vector of the
dissimilarity between the two sites ('Dij'). These are in a csv file
currently in my university web space. Each 'row' in this file
corresponds to single comparison between 2 sites.

When I analyse these data using glm() I get the familiar fitted
probabilities numerically 0 or 1 occurred warning. The data do not look
linearly separable when plotted (code for which is below). I have read
Venables and Ripley's discussion of this in MASS4 and other sources that
discuss this warning and R (Faraway's Extending the Linear Model with R
and John Fox's new Applied Regression, Generalized Linear Models, and
Related Methods, 2nd Ed) as well as some of the literature on Firth's
bias reduction method. But I am still somewhat unsure what
(quasi-)separation is and if this is the reason for the warnings in this
case.

My question then is, is this a separation issue with my data, or is it
quasi-separation that I have read a bit about whilst researching this
problem? Or is this something completely different?

Code to reproduce my problem with the actual data is given below. I'd
appreciate any comments or thoughts on this.

 Begin code snippet 

## note data file is ~93Kb in size
dat - read.csv(url(http://www.homepages.ucl.ac.uk/~ucfagls/dat.csv;))
head(dat)
## fit model --- produces warning
mod - glm(analogs ~ Dij, data = dat, family = binomial)
## plot the data
plot(analogs ~ Dij, data = dat)
fit.mod - fitted(mod)
ord - with(dat, order(Dij))
with(dat, lines(Dij[ord], fit.mod[ord], col = red, lwd = 2))

 End code snippet ##

Thanks in advance

Gavin


--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] NA's in segmented

2008-10-06 Thread vito muggeo

Dear Tyler,
Yes the problem is with NA..
There are two solutions:

1) You can use lm() + segmented (you fit a gaussian model, so why do you 
use glm()?)


2)If you want to use glm()+ segmented(), use na.omit() to pass your 
dataframe to the data argument of glm, glm(.., data=na.omit())



Also, if you want to constrain the right slope to be zero use the minus 
variable (see the relevant recent paper on Rnews)


hope this helps you

vito

###
Example

x-1:50/50
y- -2*x+2*pmax(x-.6,0)+rnorm(50)*.1

x[20:22]-NA
d-data.frame(xx=x,yy=y)
rm(x,y)

#Use lm() - It works:
o-lm(yy~xx, data=d, na.action=na.omit)
os-segmented(o,seg.Z=~xx,psi=.5)

#Use glm() - It works:
o-glm(yy~xx, data=na.omit(d))
os-segmented(o,seg.Z=~xx,psi=.5)

#constrain the right slope to zero
d$xxx- -d$x
o-glm(yy~1, data=na.omit(d))
os1-segmented(o,seg.Z=~xxx,psi=-.5)

with(d,plot(xx,yy)
plot(os, add=TRUE)
plot(os1, add=TRUE, col=2, rev.sgn=TRUE)



T.D.Rudolph ha scritto:

I am trying to fit a very simple broken stick model using the package
segmented but I have hit a roadblock. 


str(data)

'data.frame':   18 obs. of  2 variables:
 $ Bin   : num  0.25 0.75 1.25 1.75 2.25 2.75 3.25 3.75 4.25 4.75 ...
 $ LnFREQ: num  5.06 4.23 3.50 3.47 2.83 ...
 
I fit the lm easily:

fit.lm-lm(LnFREQ~Bin, data=id07)


But I keep getting an error message:

fit.seg-segmented(fit.glm, seg.Z=~Bin, psi=6)
Error in cbind(XREG, U, V) : 
  number of rows of matrices must match (see arg 2)
 
I think the problem is due to NA's in the Bin data, but there doesn't seem

to be an na.action for segmented ().  What is the best way to get around
this problem?  I would like to keep all Bin values in the output for
continuity.  Data below
 
Tyler



data

BinLnFREQ
1  0.25 5.0562458
2  0.75 4.2341065
3  1.25 3.4965076
4  1.75 3.4657359
5  2.25 2.8332133
6  2.75 2.9444390
7  3.25 2.4849066
8  3.75 1.3862944
9  4.25 1.7917595
10 4.75 1.3862944
11 5.25 0.6931472
12 5.75 1.0986123
13 6.25 0.000
14 6.75 0.000
15 7.25NA
16 7.75 0.000
17 8.25 0.000
18 8.75NA
 

summary(fit.glm)

Call:
glm(formula = LnFREQ ~ Bin, data = id07, na.action = na.omit)
Deviance Residuals: 
 Min1QMedian3Q   Max  
-0.74139  -0.22999   0.01065   0.21245   0.72684  
Coefficients:
Estimate Std. Error t value Pr(|t|)
(Intercept)  4.506460.21088   21.37 4.37e-12 ***

Bin -0.634340.04467  -14.20 1.05e-09 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
(Dispersion parameter for gaussian family taken to be 0.1844898)

Null deviance: 39.7785  on 15  degrees of freedom
Residual deviance:  2.5829  on 14  degrees of freedom
  (2 observations deleted due to missingness)
AIC: 22.227
Number of Fisher Scoring iterations: 2


--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] difference between MASS::polr() and Design::lrm()

2008-06-30 Thread vito muggeo

Dear all,
It appears that MASS::polr() and Design::lrm() return the same point 
estimates but different st.errs when fitting proportional odds models,


grade-c(4,4,2,4,3,2,3,1,3,3,2,2,3,3,2,4,2,4,5,2,1,4,1,2,5,3,4,2,2,1)
score-c(525,533,545,582,581,576,572,609,559,543,576,525,574,582,574,471,595,
  557,557,584,599,517,649,584,463,591,488,563,553,549)

library(MASS)
library(Design)

summary(polr(factor(grade)~score))
lrm(factor(grade)~score)

It seems to me that results from lrm() are right..

Am I wrong?
Thanks,
vito

--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] error in using by + median

2008-05-02 Thread vito muggeo

dear all,
Could anyone explain me the behaviour of median() within by()?
(I am running R.2.7.0)

thanks,
vito

 H-cbind(rep(0:1,l=20),matrix(rnorm(20*2),20,2))
 by(H[,-1],H[,1],mean)
INDICES: 0
V1 V2
-0.2101069  0.2954377
- 


INDICES: 1
 V1  V2
-0.23682173 -0.01225147
 by(H[,-1],H[,1],median)
Error in median.default(data[x, , drop = FALSE], ...) : need numeric data



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Comparing switchpoints from segmented

2008-03-14 Thread vito muggeo
Dear Rob,

Rob Knell ha scritto:
 Hello everyone
 Not strictly an R question but close... hopefully someone will be able  
 to help. I wish to compare the switchpoints in two switchpoint  
 regressions. The switchpoints were estimated using the segmented  
 library 
It's a package.

running in R, and I have standard errors for the estimates. I
 initially thought I could just bootstrap confidence intervals for the  
 difference between the switchpoints, but I have been having trouble  
 with getting this to work because for about 25% of the bootstrap  
 samples the algorithm in segmented fails to converge. So I had another  
 think, and I thought that maybe I could just do a t-test: knowing the  
 estimated switchpoints and their
 standard errors I can easily calculate the SE of the difference, so I  
 can calculate a t-value using that. My question is whether there is
 anything wrong with doing it this way, and if not, how many degrees of  
 freedom should I use? I would guess at df=n1-5+n2-5 5 df lost for each  
 sample because two slopes, two intercepts and one switchpoint have  
 been estimated, but I'm not sure: I'm but a humble biologist and not  
 very good at this sort of thing.

The SE() of the breakpoints are reliable only for large samples and/and 
with clear-cut relationships. Having said that, I think that you can 
compare them by using the approximate t-like studentized statistic, but 
results have to taken with care.. The relevant sampling distribution is 
unknown (it depends on the location of the breakpoint, (standardized) 
difference-in-slope, sample size and, to some extend, distribution of 
the `segmented variable') . Therefore quantiles of a t distributions are 
not appropriate in theory, but in practice someone uses them..BTW the 
model parameters are 4 (intercept+two slopes+ breakpoint, the regression 
lines are joined at the breakpoint).
Hope this helps you,
best,

vito


 
 Any help gratefully received
 
 Thanks
 
 Rob Knell
 
 
 
 
 School of Biological and Chemical Sciences
 Queen Mary, University of London
 
 'Phone +44 (0)20 7882 7720
 Skype Rob Knell
 
 Research: http://webspace.qmul.ac.uk/rknell/
 
 The truth is that they have no clue why the beetles had horns, it's  
 the researchers who have sex on the brain and everything has to have a  
 sexual explanation. And this is reasearch?! Correspondent known as  
 FairOpinion on Neo-Con American website discussing my research.
 
 
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] strange behaviour of is.factor()

2008-01-16 Thread vito muggeo
Dear all,
It appears that the function is.factor() returns different results when 
used inside the apply() function: that is, is.factor() fails to 
recognize a factor..
Where is the trick?

many thanks,
vito

  df1-data.frame(y=1:10,x=rnorm(10),g=factor(c(rep(A,6),rep(B,4
  is.factor(df1[,1])
[1] FALSE
  is.factor(df1[,2])
[1] FALSE
  is.factor(df1[,3])
[1] TRUE
  is.factor(df1$g)
[1] TRUE
  apply(df1,2,is.factor)
 y x g
FALSE FALSE FALSE
 
  R.version
_
platform   i386-pc-mingw32
arch   i386
os mingw32
system i386, mingw32
status
major  2
minor  6.1
year   2007
month  11
day26
svn rev43537
language   R
version.string R version 2.6.1 (2007-11-26)
 

-- 

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Segmented regression

2007-12-06 Thread vito muggeo
Dear Brendan,
I am not sure to understand your code..

It seems to me that your are interested in fitting a one-breakpoint 
segmented relationship in each level of your grouping variable

If this is the case, the correct code is below.
In order to fit a segmented relationship in each group you have to 
define the relevant variable before fitting, and to constrain the last 
slope to be zero you have to consider the `minus' variable..(I discuss 
these points in the submitted Rnews article..If you are interested, let 
me know off list).

If my code does not fix your problem, please let me know,

Best,
vito

##--define the group-specific segmented variable:
X-model.matrix(~0+factor(group),data=df)*df$tt
df$tt.KV-X[,1] #KV
df$tt.KW-X[,2]   #KW
df$tt.WC-X[,3]   #WC

##-fit the unconstrained model
olm-lm(y~group+tt.KV+tt.KW+tt.WC,data=df)
os-segmented(olm,seg.Z=~tt.KV+tt.KW+tt.WC,psi=list(tt.KV=350, 
tt.KW=500, tt.WC=350))
#have a look to results:
with(df,plot(tt,y))
with(subset(df,group==RKW),points(tt,y,col=2))
with(subset(df,group==RKV),points(tt,y,col=3))
points(df$tt[df$group==RWC],fitted(os)[df$group==RWC],pch=20)
points(df$tt[df$group==RKW],fitted(os)[df$group==RKW],pch=20,col=2)
points(df$tt[df$group==RKV],fitted(os)[df$group==RKV],pch=20,col=3)


#constrain the last slope in group KW
tt.KW.minus- -df$tt.KW
olm1-lm(y~group+tt.KV+tt.WC,data=df)
os1-segmented(olm1,seg.Z=~tt.KV+tt.KW.minus+tt.WC,psi=list(tt.KV=350, 
tt.KW.minus=-500, tt.WC=350))
#check..:-)
slope(os1)

with(df,plot(tt,y))
with(subset(df,group==RKW),points(tt,y,col=2))
with(subset(df,group==RKV),points(tt,y,col=3))
points(df$tt[df$group==RWC],fitted(os1)[df$group==RWC],pch=20)
points(df$tt[df$group==RKW],fitted(os1)[df$group==RKW],pch=20,col=2)
points(df$tt[df$group==RKV],fitted(os1)[df$group==RKV],pch=20,col=3)






Power, Brendan D (Toowoomba) ha scritto:
 Hello all,
 
 I have 3 time series (tt) that I've fitted segmented regression models
 to, with 3 breakpoints that are common to all, using code below
 (requires segmented package). However I wish to specifiy a zero
 coefficient, a priori, for the last segment of the KW series (green)
 only. Is this possible to do with segmented? If not, could someone point
 in a direction?
 
 The final goal is to compare breakpoint sets for differences from those
 derived from other data.
 
 Thanks in advance,
 
 Brendan.
 
 
 library(segmented)
 df-data.frame(y=c(0.12,0.12,0.11,0.19,0.27,0.28,0.35,0.38,0.46,0.51,0.5
 8,0.59,0.60,0.57,0.64,0.68,0.72,0.73,0.78,0.84,0.85,0.83,0.86,0.88,0.88,
 0.95,0.95,0.93,0.92,0.97,0.86,1.00,0.85,0.97,0.90,1.02,0.95,0.54,0.53,0.
 50,0.60,0.70,0.74,0.78,0.82,0.88,0.83,1.00,0.85,0.96,0.84,0.86,0.82,0.86
 ,0.84,0.84,0.84,0.77,0.69,0.61,0.67,0.73,0.65,0.55,0.58,0.56,0.60,0.50,0
 .50,0.42,0.43,0.44,0.42,0.40,0.51,0.60,0.63,0.71,0.74,0.82,0.82,0.85,0.8
 9,0.91,0.87,0.91,0.93,0.95,0.95,0.97,1.00,0.96,0.90,0.86,0.91,0.94,0.96,
 0.88,0.88,0.88,0.92,0.82,0.85),
  
 tt=c(141.6,141.6,141.6,183.2,212.8,227.0,242.4,271.5,297.4,312.3,331.4,3
 42.4,346.3,356.6,371.6,408.8,408.8,419.5,434.4,464.5,492.6,521.7,550.5,5
 50.3,565.4,588.0,602.9,623.7,639.6,647.9,672.6,680.6,709.7,709.7,750.2,7
 50.2,750.2,141.6,141.6,141.6,183.2,212.8,227.0,242.4,271.5,297.4,312.3,3
 31.4,342.4,346.3,356.6,371.6,408.8,408.8,419.5,434.4,464.5,492.6,521.7,5
 50.5,550.3,565.4,588.0,602.9,623.7,639.6,647.9,672.6,680.6,709.7,709.7,1
 41.6,141.6,141.6,183.2,212.8,227.0,242.4,271.5,297.4,312.3,331.4,342.4,3
 46.3,356.6,371.6,408.8,408.8,419.5,434.4,464.5,492.6,521.7,550.5,550.3,5
 65.4,588.0,602.9,623.7,639.6,647.9,672.6,709.7),
group=c(rep(RKW,37),rep(RWC,34),rep(RKV,32)))
 init.bp - c(297.4,639.6,680.6)
 lm.1 - lm(y~tt+group,data=df)
 seg.1 - segmented(lm.1, seg.Z=~tt, psi=list(tt=init.bp))
 
  version
_   
 platform   i386-pc-mingw32 
 arch   i386
 os mingw32 
 system i386, mingw32   
 status 
 major  2   
 minor  6.0 
 year   2007
 month  10  
 day03  
 svn rev43063   
 language   R   
 version.string R version 2.6.0 (2007-10-03)
 
 
 
 DISCLAIMER**...{{dropped:15}}
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612


[R] differences in using source() or console

2007-12-06 Thread vito muggeo
Dear all,
Is there *any* reason explaining what I describe below?
I have the following line

myfun(x)

If I type them directly in R (or copy/past), it works..

However if I type in R 2.6.1

  source(code.R) ##code.R includes the above line
Error in inherits(x, data.frame) : object d not found

namely myfun() does not work correctly.
In particular the non-working line inside myfun() is

update(eval(x$call$obj), data=d) #d is created in myfun()

My question is: why the problem occurs just via source()ing???

many thanks,
vito



Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] using names with functions..

2007-11-28 Thread vito muggeo
Dear all,
I have the following (rather) strange problem..
For some reasons, I finally work with a variable whose name includes an 
R function, a.log(z), say. And that is a problem when I call it in a 
formula, for instance:

  myname-a.log(z)
  dd-data.frame(a.log(z)=1:10,y=rnorm(10))
  o-lm(y~1,data=dd)
  fo-as.formula(paste(.~.+,paste(myname, collapse = +)))
  fo
. ~ . + a.log(z)
  update(o,formula=fo)
Error in eval(expr, envir, enclos) : could not find function a.log
 

How can fit the model? namely how can I use a.log(z) in the example above?

Many thanks,
vito


-- 

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to extract the original data of a glm object

2007-11-12 Thread vito muggeo
See and *read* the help file ?glm
the object returned by glm() includes the `data' component

hence:

aa-glm(..)
aa$data

or also eval(aa$call$data)



leffgh ha scritto:
   my function is 
  glm(a~log(b)+c+d+e,family=binomial,data=f)-aa
  
   
   I want to extract the original data set of aa. How to do it ?
   
 You may suggest the model.frame() function. In fact ,i have tried it.
 model.frame returns a data frame of containing a,log(b) NOT b,c,d,e 
 I want to extract a data frame containing a,b,c,d,e,which is exactly the
 same as f
 How can I achieve this result?
 
 
 I want to do this because I need to extract the formular and act on another
 data set ,whose predict variables are the same as those of f, but the
 response variable is different .

-- 

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R segmented package

2007-10-30 Thread vito muggeo
Dear Maura,
For package-specific questions, it is probably advisable to contact 
privately the author..

BTW:
  However I cannot understand if a break-point per independent variable
  (regressor) is handled when there are many independent variables in the
  model, or if segmented can handle many breakpoints for the same
  independent variable (time in my case) ... ?

Yes, both the statements are true. Virtually segmented is able to handle 
GLMs with any number of (linear) exaplanatory variables + any number of 
`segmented' variables, and multiple breakpoints are allowed for *each* 
`segmented' variable. See the relevant help files.

All the best,

vito


PS: I am sending you some further documentation via off-list

best,
vito


Maura E Monville ha scritto:
 Most of the data sets I'm dealing with exhibit a time trend.
 We would like to get rid of the time trend.
 The plot shows in some cases a monotonic increase of the dependent variable
 with time. This is the easiest case.
 In some other cases the plot shows a time trend where the dependent variable
 changes slope 4-5 times along the observations measurement period.
 
 I've attempted a segmented regression and estimated the break-point
 empirically. This is a tedious error-prone process.
 I just found out that R includes a package, called segmented,  which can
 estimate the break-point discriminating between two regression models.
 I have browsed through the on-line documentation. It is stated multiple
 break-points are supported by segmented.
 However I cannot understand if a break-point per independent variable
 (regressor) is handled when there are many independent variables in the
 model, or if segmented can handle many breakpoints for the same
 independent variable (time in my case) ... ?
 
 I would greatly appreciate some explanations and suggestions on the use of
 segmented.
 
 Thank you very much in advance.
 
 Regards,
 

-- 

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Splines

2007-10-05 Thread vito muggeo
As reported in ?spline, you should use splinefun() instead.

ff-splinefun(x,y)
ff(x0)

where x0=5.25 is your case.

best,

vito


stat stat ha scritto:
 I want to fit a cubic spline of x on y. where :

   x
  [1] 467 468 460 460 450 432 419 420 423 423
 y
  [1]  1  2  3  4  5  6  7  8  9 10

   using the syntax

   spline(y, x)

   I got following output :

   $x
  [1]  1.00  1.310345  1.620690  1.931034  2.241379  2.551724  2.862069
  [8]  3.172414  3.482759  3.793103  4.103448  4.413793  4.724138  5.034483
 [15]  5.344828  5.655172  5.965517  6.275862  6.586207  6.896552  7.206897
 [22]  7.517241  7.827586  8.137931  8.448276  8.758621  9.068966  9.379310
 [29]  9.689655 10.00
   $y
  [1] 467. 469.5381 469.8643 468.4865 465.9444 463.0284 460.6479 459.6560
  [9] 459.8737 460.2296 459.6313 457.4921 454.0094 449.4482 444.1040 438.3613
 [17] 432.6204 427.2892 422.8183 419.6733 418.2646 418.3633 419.3283 420.5202
 [25] 421.5768 422.4555 423.1312 423.5419 423.5480 423.
 
   Now I want to get what is the value of y at x = 5.25. Can anyone tell me 
 how to find that?

 
 
 thanks in advance

 -
  Meet people who discuss and share your passions.  Join them now.
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.