On Wed, 17 May 2000, mbattagl wrote in part:
> The regression analysis is also somewhat confusing. Regression analysis
> is based on the fact that the Y (dependent variable) is random and the X
> (independent variable) is fixed with no error.
Not so much "on the fact that ..." as "on assigning all random and
measurement error to the measurement of Y". The alleged "fact" is not
always a fact...
> For my case, both X and Y are random and have some measurement error.
> Is it correct to use simple linear regression for this analysis or is
> there another type of analysis to obtain predictions?
It is not really INcorrect, but there exist alternatives that may be more
appropriate, depending on how large the measurement error in X is. If that
measurement error is small compared to the distance between adjacent
values in X, use regression analysis without qualms. (In most designed
experiments, the nominal values of X are deliberately chosen to be fairly
widely spaced, partly so that one may assume that the measurement error
in X, if not zero, is at any rate negligible in context. There is then
no particular advantage to be had in using analytical methods that work
with a random-errors-in-X model.)
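
To make the "small versus large measurement error" distinction concrete,
here is a minimal simulation sketch in Python (NumPy assumed). The true
slope of 2, the ten nominal X values spaced 1 unit apart, and the two
error standard deviations are all invented for illustration; none of them
come from the original question.

```python
import numpy as np

def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    xc = x - x.mean()
    return float(np.dot(xc, y - y.mean()) / np.dot(xc, xc))

def simulate_slope(x_error_sd, n_per_level=200, seed=0):
    """Fit OLS to data whose X is observed with the given error SD."""
    rng = np.random.default_rng(seed)
    # Hypothetical design: nominal X values 0, 1, ..., 9 (spacing = 1).
    x_true = np.repeat(np.arange(10.0), n_per_level)
    # Hypothetical model: true slope 2, random error in Y with SD 1.
    y = 2.0 * x_true + rng.normal(0.0, 1.0, x_true.size)
    # What is actually recorded: X plus measurement error.
    x_obs = x_true + rng.normal(0.0, x_error_sd, x_true.size)
    return ols_slope(x_obs, y)

slope_small = simulate_slope(0.05)  # error SD much smaller than the spacing
slope_large = simulate_slope(2.0)   # error SD larger than the spacing
print("slope with small error in X:", round(slope_small, 3))
print("slope with large error in X:", round(slope_large, 3))
```

With the small error the fitted slope should land close to 2; with the
large error it is pulled noticeably toward zero, which is the usual
symptom of ignoring a non-negligible error in X.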
If measurement error is large (compared to adjacent distances), two
approaches are possible:
(1) Divide the data (visualized on a scatterplot) into vertical slices
(that is, segments of nearly-constant X). Replace the observed X values
with a single nominal value for each slice, possibly (but not
necessarily) the center value for the slice. This will introduce some
random error into the X values, but the resulting standard error (of the
combination of random & measurement error) for the mean of the n_j
cases (in slice j) may now be small compared to the difference between
adjacent nominal values of X. (This depends of course on how many cases
there are in a slice.) If you have a LOT of data, it may even be
sensible to discard slices that are sparsely populated. If this
procedure works, you're back in Plan A above (so to speak) and ordinary
regression is appropriate.
(2) Otherwise, an errors-in-variables regression may be called for, of
the kind that simultaneously deals with uncertainty in Y and uncertainty
in X. All such approaches suffer from a common problem: one must decide
(or let the program decide!) how to weight the deviations in Y and the
deviations in X. For problems where Y and X are in the same units, it
may be reasonable to weight the two deviations equally in generating sums
of squares (or the equivalent of SS). But if Y and X are in different
units, the solution one obtains depends on the units in which one chose
to measure the variables, and what counts as "equal weighting" is VERY
poorly defined. (Consider Y in pounds-mass and X in inches; now think
of pounds & feet; now think of kilograms & cm; ... See? A
least-squares solution that counts deviations in both Y and X cannot be
invariant with respect to changes in scale [i.e., changes in unit of
measurement] -- unless, of course, the same change in scale is imposed
on both variables. This disadvantage
alone may be enough to drive one to ordinary regression [or to drink, or
both] as one contemplates explaining a set of results to a client.)
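
The unit problem can be seen numerically. The sketch below (Python with
NumPy assumed) uses orthogonal regression, i.e. equal weighting of the
squared deviations in Y and X, as one example of an errors-in-variables
fit; it is not the only such method. The height/weight data are
simulated and purely hypothetical. Each slope is converted back to
pounds per inch so the three unit systems can be compared directly.

```python
import numpy as np

def ols_slope(x, y):
    """Ordinary least-squares slope of y on x (all error assigned to y)."""
    xc = x - x.mean()
    return float(np.dot(xc, y - y.mean()) / np.dot(xc, xc))

def orthogonal_slope(x, y):
    """Slope minimizing squared perpendicular distances to the line,
    i.e. deviations in x and y weighted equally. Assumes cov(x, y) != 0."""
    sxx = np.var(x)
    syy = np.var(y)
    sxy = np.mean((x - x.mean()) * (y - y.mean()))
    return float((syy - sxx + np.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2))
                 / (2 * sxy))

# Hypothetical data: height in inches (measured with error), weight in pounds.
rng = np.random.default_rng(1)
height = rng.uniform(60.0, 76.0, 200)
x_in = height + rng.normal(0.0, 1.0, height.size)
y_lb = 5.0 * height - 180.0 + rng.normal(0.0, 15.0, height.size)

# (y conversion factor, x conversion factor) relative to pounds and inches.
systems = {
    "lb & in": (1.0, 1.0),
    "lb & ft": (1.0, 1.0 / 12.0),
    "kg & cm": (0.45359237, 2.54),
}

results = {}
for name, (y_factor, x_factor) in systems.items():
    x = x_in * x_factor
    y = y_lb * y_factor
    back = x_factor / y_factor  # converts a slope back to lb per inch
    results[name] = (ols_slope(x, y) * back, orthogonal_slope(x, y) * back)

for name, (ols, orth) in results.items():
    print(f"{name}: OLS {ols:.3f} lb/in, orthogonal {orth:.3f} lb/in")
```

The ordinary regression slope comes out the same in every unit system
once it is expressed in pounds per inch, whereas the equally-weighted
slope changes with the units chosen.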
NOTE, by the way: standardizing both variables (to, say, zero
mean and unit variance) may seem like a way out of this impasse; but all
THAT does is to specify a certain change of scale in each variable, and
to specify it in a way that may not be reproducible in subsequently
observed samples, depending as it does on the sample means and variances
in the present sample.
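
Returning to approach (1): the mechanics of the vertical-slice idea can
be sketched as follows (Python with NumPy assumed). The function name
slice_and_regress, the equal-width slices, the use of slice centers as
nominal values, and the min_count cutoff are my own illustrative
choices, and the data are simulated. The sketch shows only the
bookkeeping; it does not by itself establish that slicing has made the
error in X negligible for a given data set.

```python
import numpy as np

def slice_and_regress(x, y, n_slices=8, min_count=5):
    """Replace each observed x by the center of its vertical slice,
    drop sparsely populated slices, then fit ordinary regression."""
    edges = np.linspace(x.min(), x.max(), n_slices + 1)
    # Using only the interior edges yields slice indices 0 .. n_slices-1.
    idx = np.digitize(x, edges[1:-1])
    centers = 0.5 * (edges[:-1] + edges[1:])
    counts = np.bincount(idx, minlength=n_slices)
    keep = counts[idx] >= min_count      # discard cases in sparse slices
    x_nominal = centers[idx][keep]       # one nominal X value per slice
    slope, intercept = np.polyfit(x_nominal, y[keep], 1)
    return float(slope), float(intercept), counts

# Hypothetical data: true slope 2, error in both X and Y.
rng = np.random.default_rng(2)
x_true = rng.uniform(0.0, 100.0, 500)
x_obs = x_true + rng.normal(0.0, 5.0, x_true.size)
y = 2.0 * x_true + rng.normal(0.0, 10.0, x_true.size)

slope, intercept, counts = slice_and_regress(x_obs, y)
print("cases per slice:", counts)
print("slope:", round(slope, 3), "intercept:", round(intercept, 3))
```

The per-slice counts are worth inspecting before trusting the fit, since
(as noted above) everything depends on how many cases fall in each
slice.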
I hope this has not been too confusing; but you did ask!
-- DFB.
------------------------------------------------------------------------
Donald F. Burrill [EMAIL PROTECTED]
348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED]
MSC #29, Plymouth, NH 03264 603-535-2597
184 Nashua Road, Bedford, NH 03110 603-471-7128