I thought I would 'add' some meat to the problem I sent. This is all I know:

(1) f = a*X1 + (1-a)*X2
(2) I know n values of f and of X1, which happen to be probabilities.
(3) I know nothing about X2 except that it also lies in (0, 1).
(4) X1 is the probability under the null (Fisher's exact test) and X2 is under the alternative, but I have no idea what the alternative to Fisher's exact test would be.
(5) I need to estimate a (which is supposed to be a proportion).
(6) I was thinking about imputing values from (0, 1), drawn from a beta distribution, as the values of X2 (a rough sketch follows the list).
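To make (6) concrete, here is roughly what I have in mind; the Beta(2, 2) hyperparameters, the simulated f and X1, and the number of imputation repeats are placeholders rather than things I actually know:

## Draw stand-ins for the unknown X2 from an assumed Beta(2, 2), then estimate
## 'a' by least squares constrained to [0, 1]; repeat the imputation and average.
set.seed(1)
n  <- 100
X1 <- runif(n)                                # known null probabilities (simulated here)
X2_true <- rbeta(n, 2, 2)                     # unknown in practice
a_true  <- 0.3
f  <- a_true * X1 + (1 - a_true) * X2_true    # the n observed mixture values

est_a <- replicate(200, {
  X2_imp <- rbeta(n, 2, 2)                    # one imputation of X2
  sse <- function(a) sum((f - a * X1 - (1 - a) * X2_imp)^2)
  optimize(sse, interval = c(0, 1))$minimum   # constrained least-squares estimate of a
})
mean(est_a)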
Any help is greatly appreciated.

Jim...

On Sun, Oct 31, 2010 at 1:44 PM, David Winsemius <dwinsem...@comcast.net> wrote:

> On Oct 31, 2010, at 12:54 PM, Spencer Graves wrote:
>
>> Have you tried the 'sos' package?
>
> I have, and I am taking this opportunity to load it with my .Rprofile to
> make it more accessible. It works very well. Very clean display. I have also
> constructed a variant of RSiteSearch that I find more useful than the
> current defaults.
>
> rhelpSearch <- function(string,
>                         restrict = c("Rhelp10", "Rhelp08", "Rhelp02", "functions"),
>                         matchesPerPage = 100, ...) {
>     RSiteSearch(string = string,
>                 restrict = restrict,
>                 matchesPerPage = matchesPerPage, ...)}
>
>> install.packages('sos')   # if not already installed
>> library(sos)
>> cr <- ???'constrained regression'   # found 149 matches
>> summary(cr)   # in 69 packages
>> cr   # opens a table in a browser listing all 149 matches with links to the help pages
>>
>> However, I agree with Ravi Varadhan: I'd want to understand the physical
>> mechanism generating the data. If each is, for example, a proportion, then
>> I'd want to use logistic regression, possibly after some approximate
>> logistic transformation of X1 and X2 that prevents logit(X) from going to
>> +/-Inf. This is a different model, but it achieves the need to avoid
>> predictions of Y going outside the range (0, 1).
>
> No argument. I defer to both of your greater experience in such problems
> and your interest in educating us less knowledgeable users. I also need to
> amend my suggested strategy in situations where a linear model _might_ be
> appropriate, since I think the inclusion of the surrogate variable in the
> solve.QP setup is very probably creating confusion. After reconsideration I
> think one should keep the two approaches separate. These are two approaches
> to the non-intercept versions of the model that yield the same estimate (but
> only because the constraints do not get invoked):
>
> > lm(medv ~ I(age - lstat) + offset(lstat) - 1, data = Boston)
>
> Call:
> lm(formula = medv ~ I(age - lstat) + offset(lstat) - 1, data = Boston)
>
> Coefficients:
> I(age - lstat)
>         0.1163
>
> > library(MASS)   ## to access the Boston data
> > designmat <- model.matrix(medv ~ age + lstat - 1, data = Boston)
> > Dmat <- crossprod(designmat, designmat); dvec <- crossprod(designmat, Boston$medv)
> > Amat <- cbind(1, diag(NROW(Dmat))); bvec <- c(1, rep(0, NROW(Dmat))); meq <- 1
> > library(quadprog)
> > res <- solve.QP(Dmat, dvec, Amat, bvec, meq)
> > zapsmall(res$solution)   # zapsmall not really needed in this instance
> [1] 0.1163065 0.8836935
>
> --
> David.
>
>> Spencer
>>
>> On 10/31/2010 9:01 AM, David Winsemius wrote:
>>
>>> On Oct 31, 2010, at 2:44 AM, Jim Silverton wrote:
>>>
>>>> Hello everyone,
>>>> I have 3 variables Y, X1 and X2. Each variable lies between 0 and 1.
>>>> I want to do a constrained regression such that a > 0 and (1 - a) > 0
>>>>
>>>> for the model:
>>>>
>>>> Y = a*X1 + (1-a)*X2
>>>
>>> It would not accomplish the constraint that a > 0, but you could
>>> accomplish the other constraint within an lm fit:
>>>
>>> X3 <- X1 - X2
>>> lm(Y ~ X3 + offset(X2))
>>>
>>> Since the fitted model is Y ~ 1 + beta1*(X1 - X2) + 1*X2, i.e.
>>> Y ~ intercept + beta1*X1 + (1 - beta1)*X2,
>>>
>>> ... so beta1 is a.
>>>
>>> In the case beta1 < 0, I suppose a would be assigned 0. This might be
>>> accomplished within an iterative calculation framework by a large
>>> penalization for negative values.
>>>
>>> In a reply (1) to a question by Carlos Alzola in 2008 on R-help, Berwin
>>> Turlach offered a solution to a similar problem (sum(coef) == 1 AND coef
>>> non-negative). Modifying his code to incorporate the above strategy (and
>>> choosing two variables for which the parameter values might be inside the
>>> constraint boundaries) we get:
>>>
>>> library(MASS)   ## to access the Boston data
>>> designmat <- model.matrix(medv ~ I(age - lstat) + offset(lstat), data = Boston)
>>> Dmat <- crossprod(designmat, designmat); dvec <- crossprod(designmat, Boston$medv)
>>> Amat <- cbind(1, diag(NROW(Dmat)))
>>> bvec <- c(1, rep(0, NROW(Dmat)))
>>> meq <- 1
>>> library(quadprog)
>>> res <- solve.QP(Dmat, dvec, Amat, bvec, meq)
>>>
>>> > zapsmall(res$solution)
>>> [1] 0.686547 0.313453
>>>
>>> Turlach specifically advised against any interpretation of this
>>> particular result, which was only constructed to demonstrate the
>>> mathematical mechanics.
>>>
>>>> I tried the help on constrained regression in R, but I concede that it
>>>> was not helpful.
>>>
>>> I must not have that package installed, because I got nothing that
>>> appeared to be useful with ??"constrained regression".
>>>
>>> David Winsemius, MD
>>> West Hartford, CT
>>>
>>> 1) http://finzi.psych.upenn.edu/Rhelp10/2008-March/155990.html
>
> David Winsemius, MD
> West Hartford, CT

--
Thanks,
Jim.
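A rough sketch of the logistic-regression route Spencer suggests above, run on simulated placeholder data; the squeeze constant (1e-3) and the quasibinomial family are assumptions, not part of his message:

## Push values away from 0 and 1 so logit(X) stays finite, then fit a GLM
## whose fitted values for Y are automatically kept inside (0, 1).
set.seed(2)
n  <- 100
X1 <- runif(n); X2 <- runif(n)
Y  <- plogis(0.4 * qlogis(X1) + 0.6 * qlogis(X2))   # placeholder response in (0, 1)

squeeze <- function(p, eps = 1e-3) pmin(pmax(p, eps), 1 - eps)
fit <- glm(Y ~ qlogis(squeeze(X1)) + qlogis(squeeze(X2)),
           family = quasibinomial(link = "logit"))
summary(fit)$coefficients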