Hello folks,

I have recently finished a pilot study of a survey and am working to
complete the statistical analysis of the results in R. My PhD is technically
in computer science (not statistics), although I teach basic stats and have
a "decent" working knowledge of the area. With that said, my expertise in
psychometric theory and factor analysis is weaker, so I thought I would
send an email here and try to solicit some advice on the proper technique
for my analysis.

First, in the survey, I have a series of "concepts" and word choices
regarding those concepts (e.g., how well does concept A relate to words A1
through AN), which each participant rates on a scale from 1 to 10. For each
question, I've gathered responses from a substantial number of participants.

Now, what I'm most interested in is determining whether there were
differences, for each answer in each question, between groups A and B. The
total difference between A and B summed across all questions and answers in
the survey isn't very meaningful. Similarly, the relationships between
questions are not meaningful at all, nor is the rate of change (if any)
between questions. In other words, there are probably correlations between
questions, as there are with many surveys, but they aren't of interest here.

It seems like there would be a few ways to tackle this. Since I'm only
interested in the answers to each question individually, I was thinking I
could run a simple ANOVA for each question with appropriate post-hoc tests
(a rough sketch of what I mean is below), but I'm not sure. First, there are
quite a few questions (about 12), and I'm a little worried about inflating
my family-wise error rate. Now, I could lower my alpha, but ...

Second, I know that in some branches of survey analysis, factor analysis and
a series of measures of internal consistency are used to evaluate the survey
itself. Since the relationships between questions don't have any substantive
meaning here, I'm not sure whether that sort of analysis is the right way to
go. For example, if a particular metric (Cronbach's alpha) said the survey
was or wasn't consistent, I don't know what that would even mean in this
case.
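
In case it does turn out to matter, I gather the usual route is something
like the following (using the psych package; 'items' would be a
respondent-by-item matrix of the 1-10 ratings for one concept's word list;
names and data are made up here):

library(psych)
items <- matrix(sample(1:10, 200, replace = TRUE), ncol = 5,
                dimnames = list(NULL, paste0("A", 1:5)))
alpha(items)             # Cronbach's alpha for the item set
fa(items, nfactors = 1)  # exploratory factor analysis, one factor as a guess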

As for the data itself, it looks pretty good. Skew and kurtosis values look
fine, and the data appear reasonably normally distributed. Participants did
not discuss their answers with one another, so I don't expect correlated
errors from that source. In graphing and going through the data, I don't see
anything that pops out as unusual.
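
For what it's worth, this is roughly how I've been checking distributional
shape (using the toy 'dat' from the sketch above; describeBy() is from the
psych package):

library(psych)
describeBy(dat$rating, group = dat$question)  # skew and kurtosis per question
qqnorm(dat$rating); qqline(dat$rating)        # quick overall visual check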

A couple questions:

1. Should I even be concerned about running measures of survey consistency
(Cronbach's alpha or some kind of factor-analysis-based measure) if I'm
not particularly interested in the relationships between questions?

2. Should I run something more complex, like a MANOVA, in this case, to try
to weed out any correlated errors between the questions? Would a Wilks'
Lambda score even hold any meaning in a case like this, where the
correlations between the questions are quite incidental anyway? (A rough
sketch of what I mean follows below.)
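
For question 2, here is roughly the MANOVA I was picturing, with the
questions reshaped to one column each per respondent (names and data made
up again):

## toy wide-format data: one row per respondent, one column per question
set.seed(2)
wide <- data.frame(group = factor(rep(c("A", "B"), each = 20)),
                   matrix(sample(1:10, 40 * 12, replace = TRUE), ncol = 12,
                          dimnames = list(NULL, paste0("Q", 1:12))))

fit <- manova(as.matrix(wide[, -1]) ~ group, data = wide)
summary(fit, test = "Wilks")  # Wilks' Lambda for the group effect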


Or maybe I'm barking up the wrong tree completely and I should be doing a
thorough analysis of internal consistency measures, in case they tell me
something I'm not appreciating. Any hints out there from the R community,
perhaps from folks who do more survey analysis than I do?

Andreas Stefik, Ph.D.
Department of Computer Science
Southern Illinois University Edwardsville
