[R] Constrained regression

2008-03-02 Thread Carlos Alzola
Dear list members,

I am trying to get information on how to fit a linear regression with
constrained parameters. Specifically, I have 8 predictors , their
coeffiecients should all be non-negative and add up to 1. I understand it is
a quadratic programming problem but I have no experience in the subject. I
searched the archives but the results were inconclusive. 

Could someone provide suggestions and references to the literature, please?

Thank you very much.

Carlos

Carlos Alzola
[EMAIL PROTECTED]
(703) 242-6747 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Constrained regression

2008-03-05 Thread Carlos Alzola
I would like to acknowledge the answers I received from Tom Filloon, Mike
Cheung and Berwyn Turlach.
Berwyn's response was exactly what I needed. Use solve.QP from the quadprog
package in R. S-Plus has the equivalent function solveQP in the NuOpt
module.

Berwyn's response is below

G'day Carlos,

On Mon, Mar 3, 2008 at 11:52 AM
Carlos Alzola <[EMAIL PROTECTED]> wrote:

>  I am trying to get information on how to fit a linear regression with 
> constrained parameters. Specifically, I have 8 predictors , their 
> coeffiecients should all be non-negative and add up to 1. I understand 
> it is a quadratic programming problem but I have no experience in the 
> subject. I searched the archives but the results were inconclusive.
>
>  Could someone provide suggestions and references to the literature, 
> please?

A suggestion:

> library(MASS)   ## to access the Boston data
> designmat <- model.matrix(medv~., data=Boston) Dmat <- 
> crossprod(designmat, designmat) dvec <- crossprod(designmat, 
> Boston$medv) Amat <- cbind(1, diag(NROW(Dmat))) bvec <- c(1, 
> rep(0,NROW(Dmat)) meq <- 1
> library(quadprog)
> res <- solve.QP(Dmat, dvec, Amat, bvec, meq)

The solution seems to contain values that are, for all practical purposes,
actually zero:

> res$solution
 [1]  4.535581e-16  2.661931e-18  1.016929e-01 -1.850699e-17  [5]
1.458219e-16 -3.892418e-15  8.544939e-01  0.00e+00  [9]  2.410742e-16
2.905722e-17 -5.700600e-20 -4.227261e-17 [13]  4.381328e-02 -3.723065e-18

So perhaps better:

> zapsmall(res$solution)
 [1] 0.000 0.000 0.1016929 0.000 0.000 0.000  [7]
0.8544939 0.000 0.000 0.000 0.000 0.000 [13] 0.0438133
0.000

So the estimates seem to follow the constraints.

And the unconstrained solution is:

> res$unconstrainted.solution
 [1]  3.645949e+01 -1.080114e-01  4.642046e-02  2.055863e-02  [5]
2.686734e+00 -1.776661e+01  3.809865e+00  6.922246e-04  [9] -1.475567e+00
3.060495e-01 -1.233459e-02 -9.527472e-01 [13]  9.311683e-03 -5.247584e-01

which seems to coincide with what lm() thinks it should be:

> coef(lm(medv~., Boston))
  (Intercept)  crimzn indus  chas 
 3.645949e+01 -1.080114e-01  4.642046e-02  2.055863e-02  2.686734e+00 
  noxrm   age   dis   rad 
-1.776661e+01  3.809865e+00  6.922246e-04 -1.475567e+00  3.060495e-01 
  tax   ptratio black lstat 
-1.233459e-02 -9.527472e-01  9.311683e-03 -5.247584e-01 

So there seem to be no numeric problems.  Otherwise we could have done
something else (e.g calculate the QR factorization of the design matrix, say
X, and give the R factor to solve.QP, instead of calculating X'X and giving
that one to solve.QP).

If the intercept is not supposed to be included in the set of constrained
estimates, then something like the following can be done:

> Amat[1,] <- 0
> res <- solve.QP(Dmat, dvec, Amat, bvec, meq)
> zapsmall(res$solution)
 [1] 6.073972 0.00 0.109124 0.00 0.00 0.00 0.863421  [8]
0.00 0.00 0.00 0.00 0.00 0.027455 0.00

Of course, since after the first command in that last block the second
column of Amat contains only zeros
> Amat[,2]
 [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0
we might as well have removed it (and the corresponding entry in bvec)
> Amat <- Amat[, -2]
> bvec <- bvec[-2]
before calling solve.QP().

Note, the Boston data set was only used to illustrate how to fit such
models, I do not want to imply that these models are sensible for these
data. :-)

Hope this helps.

Cheers,

Berwin


Carlos Alzola
[EMAIL PROTECTED]
(703) 242-6747 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Constrained regression

2008-03-03 Thread Mike Cheung
Dear Carlos,

One approach is to use structural equation modeling (SEM). Some SEM
packages, such as LISREL, Mplus and Mx, allow inequality and nonlinear
constraints. Phantom variables (Rindskopf, 1984) may be used to impose
inequality constraints. Your model is basically:
y = b0 + b1*b1*x1 + b2*b2*x2 +...+ bp*bp*xp + e
1 = b1*b1 + b2*b2 +...+ bp*bp

Alternatively, you can set some condition bounds on the parameter
estimates. Then you only have to impose the second constraint.

Rindskopf, D. (1984). Using phantom and imaginary latent variables to
parameterize constraints in linear structural models. Psychometrika,
49, 37-47.

Regards,
Mike
-- 
-
 Mike W.L. Cheung   Phone: (65) 6516-3702
 Department of Psychology   Fax:   (65) 6773-1843
 National University of Singapore
 http://courses.nus.edu.sg/course/psycwlm/internet/
-

On Mon, Mar 3, 2008 at 11:52 AM, Carlos Alzola <[EMAIL PROTECTED]> wrote:
> Dear list members,
>
>  I am trying to get information on how to fit a linear regression with
>  constrained parameters. Specifically, I have 8 predictors , their
>  coeffiecients should all be non-negative and add up to 1. I understand it is
>  a quadratic programming problem but I have no experience in the subject. I
>  searched the archives but the results were inconclusive.
>
>  Could someone provide suggestions and references to the literature, please?
>
>  Thank you very much.
>
>  Carlos
>
>  Carlos Alzola
>  [EMAIL PROTECTED]
>  (703) 242-6747
>
>
> [[alternative HTML version deleted]]
>
>  __
>  R-help@r-project.org mailing list
>  https://stat.ethz.ch/mailman/listinfo/r-help
>  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>  and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Constrained regression

2008-03-03 Thread Berwin A Turlach
G'day Carlos,

On Mon, Mar 3, 2008 at 11:52 AM 
Carlos Alzola <[EMAIL PROTECTED]> wrote:

>  I am trying to get information on how to fit a linear regression
> with constrained parameters. Specifically, I have 8 predictors ,
> their coeffiecients should all be non-negative and add up to 1. I
> understand it is a quadratic programming problem but I have no
> experience in the subject. I searched the archives but the results
> were inconclusive.
>
>  Could someone provide suggestions and references to the
> literature, please?

A suggestion:

> library(MASS)   ## to access the Boston data
> designmat <- model.matrix(medv~., data=Boston)
> Dmat <- crossprod(designmat, designmat)
> dvec <- crossprod(designmat, Boston$medv)
> Amat <- cbind(1, diag(NROW(Dmat)))
> bvec <- c(1, rep(0,NROW(Dmat))
> meq <- 1
> library(quadprog)
> res <- solve.QP(Dmat, dvec, Amat, bvec, meq)

The solution seems to contain values that are, for all practical
purposes, actually zero:

> res$solution
 [1]  4.535581e-16  2.661931e-18  1.016929e-01 -1.850699e-17
 [5]  1.458219e-16 -3.892418e-15  8.544939e-01  0.00e+00
 [9]  2.410742e-16  2.905722e-17 -5.700600e-20 -4.227261e-17
[13]  4.381328e-02 -3.723065e-18

So perhaps better:

> zapsmall(res$solution)
 [1] 0.000 0.000 0.1016929 0.000 0.000 0.000
 [7] 0.8544939 0.000 0.000 0.000 0.000 0.000
[13] 0.0438133 0.000

So the estimates seem to follow the constraints.

And the unconstrained solution is:

> res$unconstrainted.solution
 [1]  3.645949e+01 -1.080114e-01  4.642046e-02  2.055863e-02
 [5]  2.686734e+00 -1.776661e+01  3.809865e+00  6.922246e-04
 [9] -1.475567e+00  3.060495e-01 -1.233459e-02 -9.527472e-01
[13]  9.311683e-03 -5.247584e-01

which seems to coincide with what lm() thinks it should be:

> coef(lm(medv~., Boston))
  (Intercept)  crimzn indus  chas 
 3.645949e+01 -1.080114e-01  4.642046e-02  2.055863e-02  2.686734e+00 
  noxrm   age   dis   rad 
-1.776661e+01  3.809865e+00  6.922246e-04 -1.475567e+00  3.060495e-01 
  tax   ptratio black lstat 
-1.233459e-02 -9.527472e-01  9.311683e-03 -5.247584e-01 

So there seem to be no numeric problems.  Otherwise we could have done
something else (e.g calculate the QR factorization of the design
matrix, say X, and give the R factor to solve.QP, instead of
calculating X'X and giving that one to solve.QP).

If the intercept is not supposed to be included in the set of
constrained estimates, then something like the following can be done:

> Amat[1,] <- 0
> res <- solve.QP(Dmat, dvec, Amat, bvec, meq)
> zapsmall(res$solution)
 [1] 6.073972 0.00 0.109124 0.00 0.00 0.00 0.863421
 [8] 0.00 0.00 0.00 0.00 0.00 0.027455 0.00

Of course, since after the first command in that last block the second
column of Amat contains only zeros
> Amat[,2]
 [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0
we might as well have removed it (and the corresponding entry in bvec)
> Amat <- Amat[, -2]
> bvec <- bvec[-2]
before calling solve.QP().

Note, the Boston data set was only used to illustrate how to fit such 
models, I do not want to imply that these models are sensible for these
data. :-)

Hope this helps.

Cheers,

Berwin

=== Full address =
Berwin A TurlachTel.: +65 6516 4416 (secr)
Dept of Statistics and Applied Probability+65 6516 6650 (self)
Faculty of Science  FAX : +65 6872 3919   
National University of Singapore 
6 Science Drive 2, Blk S16, Level 7  e-mail: [EMAIL PROTECTED]
Singapore 117546http://www.stat.nus.edu.sg/~statba

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.