[R] Using R for a slightly tricky survey analysis
Hello folks,

I have recently finished a pilot study of a survey and am working to complete the statistical analysis of the results in R. My PhD is technically in computer science (not statistics), although I teach basic stats and have a decent working knowledge of the area. With that said, my expertise in psychometric theory and factor analysis is weaker, so I thought I would send an email here and try to solicit some advice on the proper technique for my analysis.

First, in the survey, I have a series of concepts and word choices regarding those concepts (e.g., how well does concept A relate to words A1 through AN), which each participant rates on a scale from 1 to 10. For each question, I've gathered a substantial amount of data with various answers to the questions.

Now, what I'm most interested in is determining whether there were differences, for each answer in each question, between group A and B. The total difference between A and B summed across all questions and answers in the survey isn't very meaningful. Similarly, the relationships between questions are not meaningful at all, nor is the rate of change (if any) between questions. In other words, there are probably correlations between questions, as there are with many surveys, but they aren't of interest here.

It seems like there would be a few ways to tackle this. Since I'm only interested in the relationship between a list of answers to each question individually, I was thinking I could run a simple ANOVA for each question with appropriate post-hoc tests, but I'm not sure. First, there are quite a few questions (about 12), and I'm a little worried about inflating my family-wise error. Now, I could lower my alpha, but ...

Second, I know in some branches of survey analysis, they use factor analysis and a series of complicated measures for determining the consistency of the survey itself.
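One way to make the "one ANOVA per question, but watch the family-wise error" idea concrete is sketched below. This is only a minimal illustration on simulated data, not the actual survey: the group sizes, the column names Q1 to Q12, and the choice of Holm's adjustment via p.adjust() are all assumptions made for the example.

```r
# Simulated stand-in for the survey (hypothetical data, not the real study):
# 2 groups of 20 participants, 12 questions rated on a 1-10 scale.
set.seed(1)
n_per_group <- 20
n_questions <- 12
group <- factor(rep(c("A", "B"), each = n_per_group))

ratings <- as.data.frame(
  replicate(n_questions,
            pmin(10, pmax(1, round(rnorm(2 * n_per_group, mean = 5.5, sd = 2)))))
)
names(ratings) <- paste0("Q", seq_len(n_questions))

# One one-way ANOVA per question; keep the p-value of the group effect.
raw_p <- sapply(ratings, function(y) {
  fit <- aov(y ~ group)
  summary(fit)[[1]][["Pr(>F)"]][1]
})

# Holm's step-down adjustment controls the family-wise error rate across
# the 12 tests without requiring the tests to be independent.
holm_p <- p.adjust(raw_p, method = "holm")

results <- data.frame(question = names(ratings),
                      raw_p = raw_p,
                      holm_p = holm_p)
print(results, row.names = FALSE)
```

Adjusting the p-values and comparing them to the usual alpha is equivalent to lowering alpha per test. Holm's method is never less powerful than a plain Bonferroni correction, which is why it is used in this sketch.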
Since the relationships between questions don't have any significant meaning, I'm not sure if that sort of analysis is the right way to go here or not. For example, if a particular metric (Cronbach's alpha) said the survey was consistent or not, I don't know what that would even mean in this case.

As for the data itself, it looks pretty good. Skewness and kurtosis values look fine, and the data appear reasonably normally distributed. There was no discussion between participants, so no correlated error from that. In graphing and going through the data, I don't see anything that pops out as unusual.

A couple of questions:

1. Should I even be concerned about running measures for survey consistency (Cronbach's alpha or some kind of factor-analysis-related measures) if I'm not particularly interested in the relationship between questions?

2. Should I run something more complex, like a MANOVA, in this case, to try and weed out any correlated errors between the questions? Would a Wilks' Lambda score even hold any meaning in a case like this, where the correlations between the questions are quite coincidental anyway?

Or maybe I'm barking up the wrong tree completely and I should be doing a thorough analysis of internal consistency measures, as that tells me something I'm not quite realizing.

Any hints out there from the R community, perhaps from folks that do more survey analysis than I do?

Andreas Stefik, Ph.D.
Department of Computer Science
Southern Illinois University Edwardsville

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
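For readers who want to see what the two quantities asked about in the post above look like in R, here is a rough sketch on simulated data. Everything in it is an assumption for illustration: three continuous, correlated responses stand in for the questions, the helper cronbach_alpha() is hypothetical (written from the textbook definition rather than taken from a package), and nothing here says which analysis is appropriate for the actual survey.

```r
# Simulated, hypothetical data: 2 groups of 30, three correlated responses.
set.seed(2)
n_per_group <- 30
group <- factor(rep(c("A", "B"), each = n_per_group))

# A shared component makes the three "questions" correlate with each other.
common <- rnorm(2 * n_per_group)
Y <- cbind(Q1 = 5 + common + rnorm(2 * n_per_group),
           Q2 = 5 + common + rnorm(2 * n_per_group),
           Q3 = 5 + common + rnorm(2 * n_per_group))

# MANOVA: a single test of whether the vector of question means differs
# between the groups, taking the correlation among questions into account.
fit <- manova(Y ~ group)
wilks <- summary(fit, test = "Wilks")
print(wilks)

# Cronbach's alpha from its definition:
# alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
cronbach_alpha <- function(items) {
  k <- ncol(items)
  item_var <- apply(items, 2, var)
  total_var <- var(rowSums(items))
  (k / (k - 1)) * (1 - sum(item_var) / total_var)
}
alpha_hat <- cronbach_alpha(Y)
print(alpha_hat)
```

Wilks' Lambda answers an omnibus question (do the groups differ on any linear combination of the questions), while alpha describes how well the items hang together as a single scale. Neither replaces the per-question comparisons if those are the quantities of interest.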
Re: [R] teaching R
On the same note, are there any editors that have good code completion (IntelliSense) capabilities for R? I'll be teaching R to undergraduates this semester and I imagine having code completion would be helpful.

Andreas Stefik, Ph.D.
Department of Computer Science
Southern Illinois University Edwardsville

On Thu, Aug 27, 2009 at 2:51 PM, Erich Neuwirth erich.neuwi...@univie.ac.at wrote:

And if your students are used to working with Excel (on Windows) and will have data in Excel, consider RExcel (more info at rcom.univie.ac.at), which among other things gives you the R Commander menu as an Excel menu. Disclaimer: I am the author of RExcel.

David L Carlson wrote:

I'd suggest looking at Rcmdr by John Fox (http://socserv.mcmaster.ca/jfox/Misc/Rcmdr/). I use it to introduce anthropology students to R for statistical analyses. It is a graphical user interface that lets students quickly begin using R to run statistical analyses. It includes a command window so you can access functions that are not included in the menu structure. Think of it as training wheels (and more) for beginners.

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

--
Erich Neuwirth, University of Vienna
Faculty of Computer Science
Computer Supported Didactics Working Group
Visit our SunSITE at http://sunsite.univie.ac.at
Phone: +43-1-4277-39464 Fax: +43-1-4277-39459