I thought I would 'add' some meat to the problem I sent. This is all I know:

(1) f = a*X1 + (1-a)*X2
(2) I know n values of f and of X1, which happen to be probabilities.
(3) I know nothing about X2 except that it also lies in (0, 1).
(4) X1 is the probability under the null (Fisher's exact test) and X2 is under the alternative, but I have no idea what the alternative to Fisher's exact test would be.
(5) I need to estimate a (which is supposed to be a proportion).
(6) I was thinking about imputing values from (0, 1), drawn from a beta distribution, as the values of X2 (a rough sketch follows the list).
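To make (6) concrete, here is roughly what I have in mind; the Beta(2, 2) hyperparameters, the simulated f and X1, and the number of imputation repeats are placeholders rather than things I actually know:

## Draw stand-ins for the unknown X2 from an assumed Beta(2, 2), then estimate
## 'a' by least squares constrained to [0, 1]; repeat the imputation and average.
set.seed(1)
n  <- 100
X1 <- runif(n)                                # known null probabilities (simulated here)
X2_true <- rbeta(n, 2, 2)                     # unknown in practice
a_true  <- 0.3
f  <- a_true * X1 + (1 - a_true) * X2_true    # the n observed mixture values

est_a <- replicate(200, {
  X2_imp <- rbeta(n, 2, 2)                    # one imputation of X2
  sse <- function(a) sum((f - a * X1 - (1 - a) * X2_imp)^2)
  optimize(sse, interval = c(0, 1))$minimum   # constrained least-squares estimate of a
})
mean(est_a)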
Any help is greatly appreciated.

Jim...

On Sun, Oct 31, 2010 at 1:44 PM, David Winsemius <dwinsem...@comcast.net> wrote:

> On Oct 31, 2010, at 12:54 PM, Spencer Graves wrote:
>
>> Have you tried the 'sos' package?
>
> I have, and I am taking this opportunity to load it with my .Rprofile to
> make it more accessible. It works very well. Very clean display. I have also
> constructed a variant of RSiteSearch that I find more useful than the
> current defaults.
>
> rhelpSearch <- function(string,
>                         restrict = c("Rhelp10", "Rhelp08", "Rhelp02", "functions"),
>                         matchesPerPage = 100, ...) {
>     RSiteSearch(string = string,
>                 restrict = restrict,
>                 matchesPerPage = matchesPerPage, ...)}
>
>> install.packages('sos')   # if not already installed
>> library(sos)
>> cr <- ???'constrained regression'   # found 149 matches
>> summary(cr)   # in 69 packages
>> cr   # opens a table in a browser listing all 149 matches with links to the help pages
>>
>> However, I agree with Ravi Varadhan: I'd want to understand the physical
>> mechanism generating the data. If each is, for example, a proportion, then
>> I'd want to use logistic regression, possibly after some approximate
>> logistic transformation of X1 and X2 that prevents logit(X) from going to
>> +/-Inf. This is a different model, but it achieves the need to avoid
>> predictions of Y going outside the range (0, 1).
>
> No argument. I defer to both of your greater experience in such problems
> and your interest in educating us less knowledgeable users. I also need to
> amend my suggested strategy in situations where a linear model _might_ be
> appropriate, since I think the inclusion of the surrogate variable in the
> solve.QP setup is very probably creating confusion. After reconsideration I
> think one should keep the two approaches separate. These are two approaches
> to the non-intercept versions of the model that yield the same estimate (but
> only because the constraints do not get invoked):
>
> > lm(medv ~ I(age - lstat) + offset(lstat) - 1, data = Boston)
>
> Call:
> lm(formula = medv ~ I(age - lstat) + offset(lstat) - 1, data = Boston)
>
> Coefficients:
> I(age - lstat)
>         0.1163
>
> > library(MASS)   ## to access the Boston data
> > designmat <- model.matrix(medv ~ age + lstat - 1, data = Boston)
> > Dmat <- crossprod(designmat, designmat); dvec <- crossprod(designmat, Boston$medv)
> > Amat <- cbind(1, diag(NROW(Dmat))); bvec <- c(1, rep(0, NROW(Dmat))); meq <- 1
> > library(quadprog)
> > res <- solve.QP(Dmat, dvec, Amat, bvec, meq)
> > zapsmall(res$solution)   # zapsmall not really needed in this instance
> [1] 0.1163065 0.8836935
>
> --
> David.
>
>> Spencer
>>
>> On 10/31/2010 9:01 AM, David Winsemius wrote:
>>
>>> On Oct 31, 2010, at 2:44 AM, Jim Silverton wrote:
>>>
>>>> Hello everyone,
>>>> I have 3 variables Y, X1 and X2. Each variable lies between 0 and 1.
>>>> I want to do a constrained regression such that a > 0 and (1 - a) > 0
>>>>
>>>> for the model:
>>>>
>>>> Y = a*X1 + (1-a)*X2
>>>
>>> It would not accomplish the constraint that a > 0, but you could
>>> accomplish the other constraint within an lm fit:
>>>
>>> X3 <- X1 - X2
>>> lm(Y ~ X3 + offset(X2))
>>>
>>> Since the fitted model is Y ~ 1 + beta1*(X1 - X2) + 1*X2, i.e.
>>> Y ~ intercept + beta1*X1 + (1 - beta1)*X2,
>>>
>>> ... so beta1 is a.
>>>
>>> In the case beta1 < 0, I suppose a would be assigned 0. This might be
>>> accomplished within an iterative calculation framework by a large
>>> penalization for negative values.
>>>
>>> In a reply (1) to a question by Carlos Alzola in 2008 on R-help, Berwin
>>> Turlach offered a solution to a similar problem (sum(coef) == 1 AND coef
>>> non-negative). Modifying his code to incorporate the above strategy (and
>>> choosing two variables for which the parameter values might be inside the
>>> constraint boundaries) we get:
>>>
>>> library(MASS)   ## to access the Boston data
>>> designmat <- model.matrix(medv ~ I(age - lstat) + offset(lstat), data = Boston)
>>> Dmat <- crossprod(designmat, designmat); dvec <- crossprod(designmat, Boston$medv)
>>> Amat <- cbind(1, diag(NROW(Dmat)))
>>> bvec <- c(1, rep(0, NROW(Dmat)))
>>> meq <- 1
>>> library(quadprog)
>>> res <- solve.QP(Dmat, dvec, Amat, bvec, meq)
>>>
>>> > zapsmall(res$solution)
>>> [1] 0.686547 0.313453
>>>
>>> Turlach specifically advised against any interpretation of this
>>> particular result, which was only constructed to demonstrate the
>>> mathematical mechanics.
>>>
>>>> I tried the help on constrained regression in R, but I concede that it
>>>> was not helpful.
>>>
>>> I must not have that package installed, because I got nothing that
>>> appeared to be useful with ??"constrained regression".
>>>
>>> David Winsemius, MD
>>> West Hartford, CT
>>>
>>> 1) http://finzi.psych.upenn.edu/Rhelp10/2008-March/155990.html
>
> David Winsemius, MD
> West Hartford, CT

--
Thanks,
Jim.
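A rough sketch of the logistic-regression route Spencer suggests above, run on simulated placeholder data; the squeeze constant (1e-3) and the quasibinomial family are assumptions, not part of his message:

## Push values away from 0 and 1 so logit(X) stays finite, then fit a GLM
## whose fitted values for Y are automatically kept inside (0, 1).
set.seed(2)
n  <- 100
X1 <- runif(n); X2 <- runif(n)
Y  <- plogis(0.4 * qlogis(X1) + 0.6 * qlogis(X2))   # placeholder response in (0, 1)

squeeze <- function(p, eps = 1e-3) pmin(pmax(p, eps), 1 - eps)
fit <- glm(Y ~ qlogis(squeeze(X1)) + qlogis(squeeze(X2)),
           family = quasibinomial(link = "logit"))
summary(fit)$coefficients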