In view of your further explanation, Robin, the best I can offer is the following.
[1] Theoretical frame.
*IF* variables (X1,X2,X3) are distributed according to a mixture of two multivariate normal distributions, i.e. as two groups, each with a multivariate normal distribution, *AND* the members of one group are labelled "Y=0" and the members of the other group are labelled "Y=1", *THEN* for a unit chosen at random from the two groups (pooled), the probability that Y=1 conditional on (X1=x1,X2=x2,X3=x3) follows a logistic regression. This regression will be linear in (x1,x2,x3) if the two multivariate normals have the same covariance matrix; it will be quadratic if the two covariance matrices differ. The coefficients of the regression are algebraic expressions in the parameters of the two multivariate normals, together with the proportions p1 and p2 of the two groups. This result is a straightforward algebraic consequence of applying Bayes's Theorem.

[2] Practical application.
If you can identify the data on (X1,X2,X3) as a mixture of two multivariate normal distributions whose parameters (two mean vectors, one or two covariance matrices, the proportions in the two groups) you can estimate, *AND IF* you are justified in assuming that the *unobserved* response variable Y takes the value 0 for one group and 1 for the other, *THEN* you can apply logistic regression to the results (but you will not learn anything by doing so that was not already available from the estimated parameters and the algebraic expressions for the logistic coefficients found in [1] above).

[3] Caveat.
Being able to identify and estimate the two multivariate normals as in [2], by some mixture-identification method, does *NOT* of itself justify the assumption in [2] that the unobserved response variable Y takes values 0 and 1 according to group membership, *UNLESS* that is precisely what you mean by "Y" (i.e. the index of membership in one or other of the two multivariate normals).
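As a quick illustration of [1] (my own sketch, not part of the argument above), the following R snippet simulates the equal-covariance (hence linear) case in one dimension and checks that glm() recovers the coefficients implied by Bayes's Theorem; the particular means, variance and mixing proportion are arbitrary choices:

```r
## Equal-variance (linear) case of [1], in one dimension.
## Group 0 ~ N(mu0, s^2), Group 1 ~ N(mu1, s^2), P(Y=1) = p1.
## Bayes's Theorem gives log-odds(Y=1 | x) = b0 + b1*x with:
##   b1 = (mu1 - mu0)/s^2
##   b0 = log(p1/(1-p1)) + (mu0^2 - mu1^2)/(2*s^2)
set.seed(1)
n   <- 50000
p1  <- 0.4; mu0 <- 0; mu1 <- 2; s <- 1
y   <- rbinom(n, 1, p1)                             # group membership
x   <- rnorm(n, mean = ifelse(y == 1, mu1, mu0), sd = s)
b1  <- (mu1 - mu0)/s^2                              # theoretical slope
b0  <- log(p1/(1 - p1)) + (mu0^2 - mu1^2)/(2*s^2)   # theoretical intercept
fit <- glm(y ~ x, family = binomial)
coef(fit)   # should be close to c(b0, b1)
```

Fitting the logistic regression here tells you nothing that the mixture parameters (mu0, mu1, s, p1) did not already determine, which is the point of [2].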
If the meaning of the variable "Y" is different, then success with a mixture algorithm may tell you nothing about what the values of Y are likely to be.

[4] Comment.
Many algorithms for identifying mixtures are based on the EM algorithm. Your additional "prior information" about how the coefficients are distributed could be incorporated into the EM algorithm, but I can't think of an R function which would explicitly enable this (though the MCMC methods associated with BRugs -- the R interface to OpenBUGS -- may allow you to set this up). Probably others can offer more help on this aspect of the matter.

I think it is necessary to be absolutely clear about what your model represents!

Hoping this helps,
Ted.

On 10-Jan-11 20:08:09, Robin Aly wrote:
> Dear Ted,
>
> Sorry for being unclear. Let me try again.
>
> I indeed have no knowledge about the value of the response
> variable for any object.
> Instead, I have a data frame of explanatory variables for
> each object. For example,
>
>          x1       x2        x3
> 1  4.409974 2.348745 1.9845313
> 2  3.809249 2.281260 1.9170466
> 3  4.229544 2.610347 0.9127431
> 4  4.259644 1.866025 1.5982859
> 5  4.001306 2.225069 1.2551570
> ...
>
> and I want to fit a regression model of the form
> y ~ x1 + x2 + x3.
>
> From prior information I know that all coefficients are
> approximately Gaussian distributed around one, and the same
> for the intercept around -10. Now I think there must be a
> package which estimates the coefficients more precisely by
> fitting the logistic regression function to the data without
> knowledge of the response variable (similar to fitting
> Gaussians in a mixture model where the class labels are
> unknown).
>
> I looked at the flexmix package, but this seems to "only"
> find dependencies in the data assuming the presence of some
> training data.
> I also found some evidence in Magder (1997, see below) that
> such an algorithm exists, but from the documented math
> I can't apply the method to my problem.
>
> Thanks in advance,
> Best Regards,
> Robin
>
> Magder, L. S. & Hughes, J. P. (1997). Logistic Regression When the
> Outcome Is Measured with Uncertainty. American Journal of
> Epidemiology, 146, 195-203.
>
> On 01/04/2011 12:36 AM, (Ted Harding) wrote:
>> On 03-Jan-11 14:02:21, Robin Aly wrote:
>>> Hi all,
>>> is there any package which can do an EM algorithm fitting of
>>> logistic regression coefficients given only the explanatory
>>> variables? I tried to realize this using the Design package,
>>> but I didn't find a way.
>>>
>>> Thanks a lot & Kind regards,
>>> Robin Aly
>>
>> As written, this is a strange question! You imply that you
>> do not have data on the response (0/1) variable at all,
>> only on the explanatory variables. In that case there is
>> no possible estimate, because that would require data on
>> at least some of the values of the response variable.
>>
>> I think you should explain more clearly and explicitly what
>> the information is that you have for all the variables.
>>
>> Ted.
>>
>> --------------------------------------------------------------------
>> E-Mail: (Ted Harding) <ted.hard...@wlandres.net>
>> Fax-to-email: +44 (0)870 094 0861
>> Date: 03-Jan-11  Time: 23:36:56
>> ------------------------------ XFMail ------------------------------
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.hard...@wlandres.net>
Fax-to-email: +44 (0)870 094 0861
Date: 10-Jan-11  Time: 23:52:18
------------------------------ XFMail ------------------------------