You might like my reply.
Even with a low Alpha, the sum of the items is still a better measure
of the trait (locus of control) than any individual item. This assumes
that the items have 'face validity' as measures of locus of control.
It also assumes that whatever they do have in common is mainly the
locus-of-control trait (rather than, say, some other thing like social
desirability).
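To put a rough number on this, here is a minimal sketch using the Spearman-Brown formula. It assumes the four items are roughly parallel (similar variances and inter-item correlations), which the original post does not establish, and the function names are only illustrative. Under that assumption, an Alpha of 0.30 for the four-item sum implies a single-item reliability of only about 0.10, so the sum is still the better measure.

```python
def alpha_to_single_item(alpha: float, k: int) -> float:
    """Invert Spearman-Brown: single-item reliability implied by the
    reliability (alpha) of a sum of k parallel items."""
    return alpha / (k - (k - 1) * alpha)


def spearman_brown(r_item: float, k: int) -> float:
    """Reliability of a sum of k parallel items, each with reliability r_item."""
    return k * r_item / (1 + (k - 1) * r_item)


alpha_4 = 0.30                            # Alpha reported in the quoted post
r1 = alpha_to_single_item(alpha_4, 4)     # implied reliability of one item
print(round(r1, 3))                       # about 0.097
print(round(spearman_brown(r1, 4), 3))    # recovers the 0.30 we started from
```

If the items are far from parallel, these figures are only approximate, but the direction of the comparison should hold.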
If so, then the only real 'problem' is that your summed score is a
relatively weak measure of locus of control. (That is, it has limited
validity, i.e., a limited correlation of the score with the true
trait, because reliability constrains the level of validity.) But that
means any statistical analysis you perform is *conservative*. That is,
by using a weak measure of locus of control, you are 'stacking the
deck' against finding a significant relationship between locus of
control and the other variables in your study. Now, if you have
obtained statistically significant results with a weak measure of locus
of control, then your results are still significant! In fact, one
could argue that they are even stronger since you have obtained
significant results with the deck stacked against finding them.
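The 'stacked deck' can be illustrated with the classical attenuation formula. This is only a sketch: it treats Alpha as the reliability of the summed score (strictly, Alpha is usually a lower bound), it assumes the other variable is measured without error unless rel_y is supplied, and the true correlation of 0.40 is a made-up value for illustration, not a result from anyone's data.

```python
import math


def max_validity(reliability: float) -> float:
    """Classical test theory: the correlation between an observed score
    and the true trait is the square root of the reliability."""
    return math.sqrt(reliability)


def attenuated_r(true_r: float, rel_x: float, rel_y: float = 1.0) -> float:
    """Expected observed correlation, given the true correlation and the
    reliabilities of the two measures."""
    return true_r * math.sqrt(rel_x * rel_y)


def disattenuated_r(observed_r: float, rel_x: float, rel_y: float = 1.0) -> float:
    """Spearman's correction: estimate of the true correlation from the
    observed one."""
    return observed_r / math.sqrt(rel_x * rel_y)


alpha = 0.30            # Alpha reported in the quoted post
true_r = 0.40           # hypothetical true correlation, for illustration only

print(round(max_validity(alpha), 2))           # about 0.55
observed = attenuated_r(true_r, alpha)
print(round(observed, 2))                      # about 0.22
print(round(disattenuated_r(observed, alpha), 2))   # back to 0.40
```

In other words, with a reliability of 0.30 a true correlation is expected to shrink to roughly 55% of its size in the observed data, which is why a result that is significant anyway is a conservative one.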
This principle is frequently overlooked. Ultimately, a scale only has
to be as reliable as you need to find statistically significant results
when comparing the scale with another construct.
So, to summarize, if you have obtained significant results with your
summed score, you can go back to your critics with confidence and point
out that you have done so with a conservative analysis, and that had
you used a more 'reliable' scale, your results would only be stronger.
Naturally this assumes you have obtained positive results. If you have
obtained negative results (lack of correlation between the scale and
some other variable(s)), then clearly this logic does not apply.
One other thing to mention: one could set up the problem as a
LISREL-type model, in which the four items are multiple indicators of a
common trait (locus of control). Interestingly, in a
multiple-indicator type model, people rarely bring up the issue of the
reliability of the common trait and how it is influenced by the number
of indicators, although, logically, one would think it would apply more
or less in the same way as adding the items to create an aggregate
score. This isn't to suggest that you do a LISREL analysis--it's
merely to point out a logical inconsistency in how people regard
multiple indicators.
John Uebersax
[EMAIL PROTECTED]
In article <AC09DC4F4DFCD211A83C00805FE6138D3691B9@NHQJPK1EX2>,
[EMAIL PROTECTED] (Magill, Brett) wrote:
> Just wanted people's thoughts on the following:
>
> I am a graduate student in sociology studying individuals' perceptions
> of control (locus of control) using existing data. The data set
> includes four items to measure this construct, which were taken from a
> larger scale of more than twenty, the larger scale reaching an
> acceptable level of reliability (I do not know the exact level, but it
> is a widely researched and used instrument) in previous research. The
> four items that were included were selected as the best measures of
> the construct based on empirical evidence (item-total correlations,
> factor analysis).
>
> In my own research, I used these items and decided to sum responses
> across these four Likert-type items. However, the Alpha reliability is
> very low, 0.30 (items were reverse scored as necessary and coding was
> double-checked). I defended the decision to sum the items, despite the
> low Alpha, based on the fact that they were selected from a larger set
> of items which are internally consistent. In presenting my findings, I
> was heavily criticized for this decision.