[R] Using R for a slightly tricky survey analysis

2009-09-11 Thread Andreas Stefik
Hello folks,

I have recently finished a pilot study of a survey and am working to
complete the statistical analysis of the results in R. My PhD is technically
in computer science (not statistics), although I teach basic stats and have
a decent working knowledge of the area. That said, my expertise in
psychometric theory and factor analysis is weaker, so I thought I would
send an email here to solicit some advice on the proper technique
for my analysis.

First, in the survey, I have a series of concepts and word choices
regarding those concepts (e.g., how well does concept A relate to words A1
through AN), which each participant rates on a scale from 1 to 10. For each
question, I've gathered a significant amount of data with various answers to
the questions.

Now, what I'm most interested in is determining whether there were
differences, for each answer in each question, between groups A and B. The
total difference between A and B, summed across all questions and answers in
the survey, isn't very meaningful. Similarly, the relationships between
questions are not meaningful at all, nor is the rate of change (if any)
between questions. In other words, there are probably correlations between
questions, as there are with many surveys, but they aren't of interest here.

It seems like there would be a few ways to tackle this. Since I'm only
interested in the relationship between a list of answers to each question
individually, I was thinking I could run a simple ANOVA for each question
with appropriate post-hoc tests, but I'm not sure. First, there are quite a
few questions (about 12), and I'm a little worried about inflating my
family-wise error. Now, I could lower my alpha, but ...
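
As a rough illustration of the per-question approach, here is a minimal
sketch that runs one one-way ANOVA per question and then adjusts the
resulting p-values for multiple testing. Everything about the data is an
assumption made for the example: the ratings are simulated, the group
sizes and the column names Q1..Q12 are hypothetical, and Holm's method is
just one reasonable choice among those offered by p.adjust().

```r
# Hypothetical data: 20 participants per group, 12 questions rated 1-10.
set.seed(1)
n_per_group <- 20
n_questions <- 12
group <- factor(rep(c("A", "B"), each = n_per_group))
ratings <- as.data.frame(
  matrix(sample(1:10, 2 * n_per_group * n_questions, replace = TRUE),
         ncol = n_questions)
)
names(ratings) <- paste0("Q", seq_len(n_questions))

# One one-way ANOVA per question; keep the p-value of the group effect.
raw_p <- sapply(ratings, function(y) {
  fit <- aov(y ~ group)
  summary(fit)[[1]][["Pr(>F)"]][1]
})

# Holm's step-down adjustment controls the family-wise error rate
# across the 12 tests without changing the nominal alpha.
adj_p <- p.adjust(raw_p, method = "holm")
round(cbind(raw = raw_p, holm = adj_p), 3)
```

Comparing the adjusted p-values against the usual alpha is equivalent to
lowering alpha per test, but Holm's procedure is never less powerful than
a plain Bonferroni correction.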

Second, I know that some branches of survey analysis use factor analysis
and a series of complicated measures for determining the consistency of the
survey itself. Since the relationships between questions don't have any
significant meaning, I'm not sure whether that sort of analysis is the right
way to go here. For example, if a particular metric (Cronbach's alpha)
said the survey was consistent or not, I don't know what that would even
mean in this case.
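
For reference, Cronbach's alpha can be computed directly from the item
variances and the variance of the summed score, which makes it easier to
see what it measures: how strongly the items covary, as they would if
they all tapped one underlying construct. The sketch below uses base R
only. The function name cronbach_alpha and the simulated items are
hypothetical, not part of any package.

```r
# Cronbach's alpha from its standard formula:
#   alpha = k / (k - 1) * (1 - sum(item variances) / variance of total score)
# `items` has one row per participant and one column per item.
cronbach_alpha <- function(items) {
  items <- as.matrix(items)
  k <- ncol(items)
  item_var <- apply(items, 2, var)
  total_var <- var(rowSums(items))
  (k / (k - 1)) * (1 - sum(item_var) / total_var)
}

# Hypothetical example: five items that share one latent trait plus noise.
set.seed(2)
trait <- rnorm(50)
items <- sapply(1:5, function(i) trait + rnorm(50, sd = 0.5))
cronbach_alpha(items)
```

If the questions are not meant to add up to a single scale, a low alpha
would not by itself indicate a problem with the survey.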

As for the data itself, it looks pretty good. Skewness and kurtosis values
look fine, and the data appear reasonably normally distributed. Participants
did not discuss the survey with one another, so there should be no correlated
error from that source. In graphing and going through the data, I don't see
anything that pops out as unusual.
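
The kind of distribution check described above might look like the
following in base R. This is only a sketch: the helper names skewness and
excess_kurtosis are hypothetical (base R has no built-in functions for
them), the ratings are simulated, and the simple moment-based estimators
are one of several conventions in use.

```r
# Moment-based sample skewness and excess kurtosis (base R only).
skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}
excess_kurtosis <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2 - 3
}

# Hypothetical ratings for one question, clipped to the 1-10 scale.
set.seed(4)
x <- pmin(pmax(round(rnorm(60, mean = 5.5, sd = 1.8)), 1), 10)
c(skewness = skewness(x), excess_kurtosis = excess_kurtosis(x))

# Shapiro-Wilk test as a complementary formal check of normality.
shapiro.test(x)
```

Values near zero for both statistics are consistent with approximate
normality, although ratings on a bounded 1 to 10 scale can never be
exactly normal.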

A couple of questions:

1. Should I even be concerned about running measures for survey consistency
(Cronbach's alpha or some kind of factor-analysis-related measure) if I'm
not particularly interested in the relationship between questions?

2. Should I run something more complex, like a MANOVA, in this case, to try
and weed out any correlated errors between the questions? Would a Wilks'
Lambda score even hold any meaning in a case like this, where the
correlations between the questions are quite coincidental anyway?
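
To make question 2 concrete, here is a minimal sketch of a one-way MANOVA
with Wilks' lambda using manova() from the stats package. The data are
simulated and the three response columns Q1..Q3 are hypothetical
stand-ins for the survey questions.

```r
# Hypothetical data: two groups of 20, three question scores per participant.
set.seed(3)
group <- factor(rep(c("A", "B"), each = 20))
responses <- matrix(rnorm(40 * 3, mean = 5, sd = 2), ncol = 3,
                    dimnames = list(NULL, c("Q1", "Q2", "Q3")))

# The multivariate test asks whether the vector of question means
# differs between groups, taking the correlations among questions into account.
fit <- manova(responses ~ group)
wilks <- summary(fit, test = "Wilks")
wilks

# Follow-up univariate ANOVAs, one per question.
summary.aov(fit)
```

Wilks' lambda answers a joint question (do the groups differ on any
combination of the questions?) and does not say which question differs,
so per-question tests are still needed afterwards.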


Or maybe I'm barking up the wrong tree completely and I should be doing a
thorough analysis of internal consistency measures, as that tells me
something I'm not quite realizing. Any hints out there from the R community,
perhaps from folks that do more survey analysis than I do?

Andreas Stefik, Ph.D.
Department of Computer Science
Southern Illinois University Edwardsville


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] teaching R

2009-08-27 Thread Andreas Stefik
Along the same lines, are there any editors with good code completion
(IntelliSense-style) capabilities for R? I'll be teaching R to undergraduates
this semester, and I imagine having code completion would be helpful.

Andreas Stefik, Ph.D.
Department of Computer Science
Southern Illinois University Edwardsville

On Thu, Aug 27, 2009 at 2:51 PM, Erich Neuwirth erich.neuwi...@univie.ac.at wrote:

 And if your students are used to working with Excel (on Windows) and will
 have data in Excel, consider RExcel (more info at rcom.univie.ac.at),
 which among other things gives you the R Commander menu
 as an Excel menu.

 Disclaimer: I am the author of RExcel.


 David L Carlson wrote:
  I'd suggest looking at Rcmdr by John Fox
  (http://socserv.mcmaster.ca/jfox/Misc/Rcmdr/). I use it to introduce
  anthropology students to R for statistical analyses. It is a graphical
  user interface that lets students quickly begin using R to run statistical
  analyses. It includes a command window so you can access functions that
  are not included in the menu structure. Think of it as training wheels (and
  more) for beginners.
 
 
 
  --
  David L Carlson
  Associate Professor of Anthropology
  Texas A&M University
  College Station, TX 77843-4352

 --
 Erich Neuwirth, University of Vienna
 Faculty of Computer Science
 Computer Supported Didactics Working Group
 Visit our SunSITE at http://sunsite.univie.ac.at
 Phone: +43-1-4277-39464 Fax: +43-1-4277-39459
