Re: They look different; are they really?
Gus Gassmann wrote:
> Stan Brown wrote:
> > Another instructor and I gave the same exam to our sections of a
> > course. Here's a summary of the results:
> > Section A: n=20, mean=56.1, median=52.5, standard dev=20.1
> > Section B: n=23, mean=73.0, median=70.0, standard dev=21.6
> > Now, they certainly _look_ different. (If it's of any value I can
> > post the 20+23 raw data.) If I treat them as samples of two
> > populations -- which I'm not at all sure is valid -- I can compute
> > 90% confidence intervals as follows:
> > Class A: 48.3 < mu < 63.8
> > Class B: 65.4 < mu < 80.9
> > As I say, I have major qualms about whether this computation means
> > anything. So let me pose my question: given the two sets of results
> > shown earlier, _is_ there a valid statistical method to say whether
> > one class really is learning the subject better than the other, and
> > by how much?
> Before you jump out of a window, you should ask yourself if there is
> any reason to suspect that the samples should be homogeneous (assuming
> equal learning). Remember that the students are often self-selected
> into the sections, and the reasons for selecting one section over the
> other may well be correlated with learning styles and/or scholastic
> achievements.

Speaking as someone who does a lot of psychometrics: is there any
reason to believe you have a reliable test? Reliable in the technical
psychometric sense, that is. That is the first and most important
question. (We will ignore the question of validity. :)

Are you and your associate using the same test? You say so, but is
there any chance of minor modifications, even in the instructions?
Sorry to be so picky, but it can be important. Are you sure that you
and the other instructor are teaching the same things (especially as to
what will be on the exam)? Yes, students do form exam strategies.

--
John Kane
The Rideau Lakes, Ontario, Canada
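As an aside on the statistical question itself: below is a minimal
sketch of Welch's two-sample t-test computed directly from the summary
statistics quoted above. It needs only the means, standard deviations,
and sample sizes (not the raw 20+23 scores), and it does not assume
equal variances between the sections. It is one reasonable choice of
test, not the only one, and it says nothing about the self-selection
problems raised above; scipy is assumed to be available.

    # Welch's two-sample t-test from the summary statistics in the thread.
    # ttest_ind_from_stats works from means, SDs, and sample sizes alone.
    from scipy.stats import ttest_ind_from_stats

    t, p = ttest_ind_from_stats(mean1=56.1, std1=20.1, nobs1=20,  # Section A
                                mean2=73.0, std2=21.6, nobs2=23,  # Section B
                                equal_var=False)  # Welch: variances may differ
    print(f"t = {t:.2f}, two-sided p = {p:.4f}")

With these numbers the test agrees with the non-overlapping 90%
intervals: the difference in means is unlikely to be chance alone,
though that by itself does not say *why* the sections differ.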
Re: They look different; are they really?
Stan Brown wrote:
> Jill Binker [EMAIL PROTECTED] wrote in sci.stat.edu:
> > Even assuming the test yields a good measure of how well the
> > students know the material (which should be investigated, rather
> > than assumed), it isn't telling you whether students have learned
> > more from the class itself, unless you assume all students started
> > from the same place.
> Good point! I was unconsciously making that very assumption, and I
> thank you for reminding me that it _is_ an assumption.

I made that assumption in my earlier post too. Stupid! Albeit
understandable in the context of my old uni. It just shows one cannot
take anything for granted.

> I had already decided to lead off with an assessment test the first
> day of class next time, for the students' benefit.

Err, see below. Should anyone do this to me, he/she might be in
trouble.

> (If they should be in a more or less advanced class, the sooner they
> know it the better for them.) But as you point out, that will benefit
> me too. The other instructor has developed a pre-assessment test over
> the past couple of years, and has offered to let me use it too, so
> we'll be able to establish comparable baselines.

Can I suggest that this may or may not be a good idea? I once did some
data analysis on a test for chemistry students. The unfortunate finding
was that the Chemistry Profs who had constructed the test did not
understand what the best predictors of success were. (Not published, as
far as I know.)

If you want a good test, you need a good psychometrician. His/her stats
skills are probably indifferent (as mine are), but what we do know is
how to measure people (en masse, that is). And given the right people,
we can analyze what a student (or worker) must do; it is often
different from the ideal. Job analysis is important even for students.

Give a call to the local Psych Dept. They always have a few grad
students wanting money and, hopefully, a usable database. Ask for an
Industrial/Organizational (I/O) grad. A home-grown test without norms,
reliability and validity stats, etc.? I can see lawyers (and myself, if
called as a witness, although I really don't have the qualifications)
just salivating.

> > As I gather is common in this field, the problem isn't statistics
> > per se, but framing questions that can be answered by the kind of
> > data you can get.

Err, see above for the problem. :)

--
John Kane
The Rideau Lakes, Ontario, Canada
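For readers wondering what "reliable in the technical psychometric
sense" looks like in practice, here is a minimal sketch of one common
internal-consistency estimate, Cronbach's alpha. No item-level data
appear anywhere in this thread, so the score matrix below is entirely
made up for illustration, and the function name is mine; numpy is
assumed.

    # Cronbach's alpha: a standard internal-consistency reliability
    # estimate, computed from an item-score matrix (one row per student,
    # one column per exam item). The data here are purely illustrative.
    import numpy as np

    def cronbach_alpha(items: np.ndarray) -> float:
        """items: shape (n_students, n_items), each cell an item score."""
        k = items.shape[1]                         # number of items
        item_vars = items.var(axis=0, ddof=1)      # variance of each item
        total_var = items.sum(axis=1).var(ddof=1)  # variance of total scores
        return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

    # Hypothetical 5-student x 4-item matrix, invented for this example:
    scores = np.array([[3, 4, 3, 4],
                       [2, 2, 3, 2],
                       [4, 4, 5, 4],
                       [1, 2, 1, 2],
                       [3, 3, 4, 3]])
    print(f"alpha = {cronbach_alpha(scores):.2f}")

Values near 1 indicate the items hang together; a home-grown exam with
low alpha would be shaky ground for comparing two sections at all.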
Re: They look different; are they really?
Jon Miller wrote:
> Stan Brown wrote:
> > You assume that it was my section that performed worse! (That's
> > true, but I carefully avoided saying so.) Section A (mine) meets at
> > 8 am, Section B at 2 pm. Not only does the time of day quite
> > possibly have an effect, but since most people prefer not to have
> > 8 am classes, we can infer that it's likely many of the students in
> > Section A waited until relatively late to register, which in turn
> > suggests they were less highly motivated for the class.

I am not sure this is true. It is an empirical hypothesis, not to be
accepted as gospel.

> > The dean has suggested the same self-selection hypothesis you
> > mention. Another possible explanation, which I was unaware of when
> > I posted, is that the instructor for section B held a review
> > session for the half hour just before the exam.

Well, there goes the hypothesis.

> Which immediately leads also to the question of how much of the class
> was teaching to the exam and how much was teaching the subject
> matter.

Never been in an Ontario Gr 13 class? Most of the year was teaching to
the exam, not the subject matter.

> However, I'm willing to suggest (without any evidence about _this
> specific case_) that you gave the students too much freedom.

I did not think that slavery was the purpose of education.

> You assumed that they were adults, and didn't set up your lessons to
> force them to learn. I am amazed by the number of students who think
> the purpose of school is to avoid learning anything.
> > So no, I'm not jumping out of any windows. (I did hand out a lot of
> > referrals to the tutoring center.) Mostly I was curious about
> > whether the apparent difference was a real one (as Jerry Dallal has
> > confirmed it is). But as you suggest, we may have two different
> > populations here.
> This is a huge difference in test scores. But you know your students.
> Do their test scores adequately reflect their knowledge? (This is
> probably a better question to ask than whether the test scores are
> significantly different.)

This, within reason, is very true. Test scores are useful, but don't
always believe them.

> Now, looking at your individual students, can you explain why they do
> or do not know the material? My guess is that some are unmotivated
> (can we still say lazy?), some have inadequate background, some
> have . . .
> I have always made it clear to my students that the grading scale is
> a guide and a guarantee for them: if they get 90%, they get an A. But
> I reserve the right to lower the scale so that, in theory at least,
> if I believe a 30% student is really an A student, then 30% becomes
> an A. After all, isn't that what professional judgment means: not
> slavishly following an arithmetic rule?

No, that is dishonest. If the student does not show his/her capability,
then he/she does not get the mark. Anything else is fraud.

--
John Kane
The Rideau Lakes, Ontario, Canada
Re: They look different; are they really?
I sometimes teach two sections of the same class, and I always like to
compare the stats for them to see if my instruction is the same. My
college has lecture and computer-mediated developmental studies. One
semester, when I taught a lecture class and a computer-mediated class,
I was curious whether the stats would match up (since the computer does
most of the instruction, not me). The stats didn't match at the end of
the semester. It was the same course and material, but different
delivery methods. I only have the data for those courses I taught, but
my college is collecting the data for all classes over a few years so a
comparison can be made.

For the same two sections (I teach) that are both lecture, the stats
are closer, but never the same. Student learning is definitely
multivariate. :-) And I have learned to read the students to see how to
approach the teaching. Some classes laugh at my jokes; others don't. If
the jokes aren't getting responses by the second week, then I throw the
rest of them out the window for that class. This semester I have two
stat classes that are totally different. One class is really easy-going
and the other is serious. So I change my teaching style to match, and
the stats for the first test were really close. I'm looking forward to
the stats on the second test. I find it isn't always how students learn
but how flexible our teaching style is. :-)

SR Chandler
Mathematics Faculty
TCC - Moss Campus
[EMAIL PROTECTED]
http://onlinelearning.tcc.vccs.edu/faculty/tcchans/
--
Mathematics is the alphabet with which God has written the universe.
-- Galileo Galilei (1564-1642)
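One way to put "closer, but never the same" on a common scale from
semester to semester is a standardized effect size such as Cohen's d,
sketched below using the summary statistics posted earlier in this
thread. The function name is mine and the pooled-SD formula is the
textbook version; this is offered only as illustration, not as a
complete analysis.

    # Cohen's d: the gap between two section means, expressed in
    # pooled-standard-deviation units. Scale-free, so comparable
    # across different exams and semesters.
    import math

    def cohens_d(m1, s1, n1, m2, s2, n2):
        # Pooled standard deviation across the two sections
        s_pooled = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2)
                             / (n1 + n2 - 2))
        return (m2 - m1) / s_pooled

    # Section A: mean 56.1, SD 20.1, n 20; Section B: mean 73.0, SD 21.6, n 23
    print(f"d = {cohens_d(56.1, 20.1, 20, 73.0, 21.6, 23):.2f}")  # ~0.81

By the usual rough conventions, d around 0.8 is a large difference,
which matches the intuition in the thread that SOMEthing differs
between the two sections.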
Re: They look different; are they really?
Stan Brown wrote:
> You assume that it was my section that performed worse! (That's true,
> but I carefully avoided saying so.) Section A (mine) meets at 8 am,
> Section B at 2 pm. Not only does the time of day quite possibly have
> an effect, but since most people prefer not to have 8 am classes, we
> can infer that it's likely many of the students in Section A waited
> until relatively late to register, which in turn suggests they were
> less highly motivated for the class.
> The dean has suggested the same self-selection hypothesis you
> mention. Another possible explanation, which I was unaware of when I
> posted, is that the instructor for section B held a review session
> for the half hour just before the exam.

Which immediately leads also to the question of how much of the class
was teaching to the exam and how much was teaching the subject matter.

However, I'm willing to suggest (without any evidence about _this
specific case_) that you gave the students too much freedom. You
assumed that they were adults, and didn't set up your lessons to force
them to learn. I am amazed by the number of students who think the
purpose of school is to avoid learning anything.

> So no, I'm not jumping out of any windows. (I did hand out a lot of
> referrals to the tutoring center.) Mostly I was curious about whether
> the apparent difference was a real one (as Jerry Dallal has confirmed
> it is). But as you suggest, we may have two different populations
> here.

This is a huge difference in test scores. But you know your students.
Do their test scores adequately reflect their knowledge? (This is
probably a better question to ask than whether the test scores are
significantly different.)

Now, looking at your individual students, can you explain why they do
or do not know the material? My guess is that some are unmotivated (can
we still say lazy?), some have inadequate background, some have . . .

I have always made it clear to my students that the grading scale is a
guide and a guarantee for them: if they get 90%, they get an A. But I
reserve the right to lower the scale so that, in theory at least, if I
believe a 30% student is really an A student, then 30% becomes an A.
After all, isn't that what professional judgment means: not slavishly
following an arithmetic rule?

Jon Miller
Re: They look different; are they really?
Jill Binker [EMAIL PROTECTED] wrote in sci.stat.edu:
> Even assuming the test yields a good measure of how well the students
> know the material (which should be investigated, rather than
> assumed), it isn't telling you whether students have learned more
> from the class itself, unless you assume all students started from
> the same place.

Good point! I was unconsciously making that very assumption, and I
thank you for reminding me that it _is_ an assumption.

I had already decided to lead off with an assessment test the first day
of class next time, for the students' benefit. (If they should be in a
more or less advanced class, the sooner they know it the better for
them.) But as you point out, that will benefit me too. The other
instructor has developed a pre-assessment test over the past couple of
years, and has offered to let me use it too, so we'll be able to
establish comparable baselines.

> As I gather is common in this field, the problem isn't statistics per
> se, but framing questions that can be answered by the kind of data
> you can get.

Yes, I agree. It's easy to crank the numbers; the hard part is deciding
what hypothesis to test, which test to apply, and how to interpret the
results. That's where I'm particularly grateful for everyone's
feedback.

--
Stan Brown, Oak Road Systems, Cortland County, New York, USA
http://oakroadsystems.com
My reply address is correct as is. The courtesy of providing a correct
reply address is more important to me than time spent deleting spam.
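Once those pre-assessment baselines exist, one simple way to use them
(a sketch, not a recommendation) is to compare gain scores, post minus
pre, across the two sections rather than raw exam scores. All numbers
below are hypothetical, and ANCOVA with the pre-test as a covariate is
a common, often preferable alternative to plain gains; scipy and numpy
are assumed.

    # Compare learning gains (post - pre) across two sections with a
    # Welch t-test. All scores here are invented for illustration.
    import numpy as np
    from scipy.stats import ttest_ind

    pre_a  = np.array([45, 50, 38, 60, 55])   # hypothetical pre-test, A
    post_a = np.array([55, 62, 45, 70, 66])   # hypothetical final, A
    pre_b  = np.array([48, 52, 40, 63, 58])   # hypothetical pre-test, B
    post_b = np.array([68, 75, 60, 80, 77])   # hypothetical final, B

    gain_a = post_a - pre_a
    gain_b = post_b - pre_b
    t, p = ttest_ind(gain_a, gain_b, equal_var=False)  # Welch on the gains
    print(f"mean gains: A={gain_a.mean():.1f}, B={gain_b.mean():.1f}, "
          f"p={p:.3f}")

This addresses Jill's objection directly: two sections can end in
different places simply because they started in different places, and
gains (or a pre-test covariate) separate the two explanations.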
Re: They look different; are they really?
Gus Gassmann [EMAIL PROTECTED] wrote in sci.stat.edu:
> Stan Brown wrote:
> > Another instructor and I gave the same exam to our sections of a
> > course. Here's a summary of the results:
> > Section A: n=20, mean=56.1, median=52.5, standard dev=20.1
> > Section B: n=23, mean=73.0, median=70.0, standard dev=21.6
> > So let me pose my question: given the two sets of results shown
> > earlier, _is_ there a valid statistical method to say whether one
> > class really is learning the subject better than the other, and by
> > how much?
> Before you jump out of a window, you should ask yourself if there is
> any reason to suspect that the samples should be homogeneous
> (assuming equal learning). Remember that the students are often
> self-selected into the sections, and the reasons for selecting one
> section over the other may well be correlated with learning styles
> and/or scholastic achievements.

You assume that it was my section that performed worse! (That's true,
but I carefully avoided saying so.) Section A (mine) meets at 8 am,
Section B at 2 pm. Not only does the time of day quite possibly have an
effect, but since most people prefer not to have 8 am classes, we can
infer that it's likely many of the students in Section A waited until
relatively late to register, which in turn suggests they were less
highly motivated for the class.

The dean has suggested the same self-selection hypothesis you mention.
Another possible explanation, which I was unaware of when I posted, is
that the instructor for section B held a review session for the half
hour just before the exam.

So no, I'm not jumping out of any windows. (I did hand out a lot of
referrals to the tutoring center.) Mostly I was curious about whether
the apparent difference was a real one (as Jerry Dallal has confirmed
it is). But as you suggest, we may have two different populations here.

--
Stan Brown, Oak Road Systems, Cortland County, New York, USA
http://oakroadsystems.com
My reply address is correct as is. The courtesy of providing a correct
reply address is more important to me than time spent deleting spam.
Re: They look different; are they really?
Stan Brown wrote:
> I had already decided to lead off with an assessment test the first
> day of class next time, for the students' benefit. (If they should be
> in a more or less advanced class, the sooner they know it the better
> for them.) But as you point out, that will benefit me too. The other
> instructor has developed a pre-assessment test over the past couple
> of years, and has offered to let me use it too, so we'll be able to
> establish comparable baselines.

The two classes are in the same subject, aren't they? How come one
group is treated differently (given a pre-assessment test) from the
other?

Alan
--
Alan McLean ([EMAIL PROTECTED])
Department of Econometrics and Business Statistics
Monash University, Caulfield Campus, Melbourne
Tel: +61 03 9903 2102    Fax: +61 03 9903 2007
Re: They look different; are they really?
Stan Brown wrote:
> Another instructor and I gave the same exam to our sections of a
> course. Here's a summary of the results:
> Section A: n=20, mean=56.1, median=52.5, standard dev=20.1
> Section B: n=23, mean=73.0, median=70.0, standard dev=21.6
> Now, they certainly _look_ different. (If it's of any value I can
> post the 20+23 raw data.) If I treat them as samples of two
> populations -- which I'm not at all sure is valid -- I can compute
> 90% confidence intervals as follows:
> Class A: 48.3 < mu < 63.8
> Class B: 65.4 < mu < 80.9
> As I say, I have major qualms about whether this computation means
> anything. So let me pose my question: given the two sets of results
> shown earlier, _is_ there a valid statistical method to say whether
> one class really is learning the subject better than the other, and
> by how much?

Before you jump out of a window, you should ask yourself if there is
any reason to suspect that the samples should be homogeneous (assuming
equal learning). Remember that the students are often self-selected
into the sections, and the reasons for selecting one section over the
other may well be correlated with learning styles and/or scholastic
achievements.

---
gus gassmann ([EMAIL PROTECTED])
When in doubt, travel.
Remove NOSPAM in the reply-to address
Re: They look different; are they really?
were these two different sections at the same class time? that is ...
10AM on mwf? if not ... then there can be all kinds of reasons why
means would be this different ... notwithstanding one or two real
deviant scores in either section ... could also be different quality in
the instruction ... all kinds of things

of course, if you opted for 95 or 99% cis, the intervals would be wider
and the apparent separation would shrink ... what is the purpose of
doing this in the first place? do not the mean differences really
suggest that there is SOMEthing different about the two groups ... ?
or ... at least something different in the overall operation of the
course in these two sections?

At 02:33 PM 10/1/01 -0300, Gus Gassmann wrote:
> Stan Brown wrote:
> > Another instructor and I gave the same exam to our sections of a
> > course. Here's a summary of the results:
> > Section A: n=20, mean=56.1, median=52.5, standard dev=20.1
> > Section B: n=23, mean=73.0, median=70.0, standard dev=21.6
> > Now, they certainly _look_ different. (If it's of any value I can
> > post the 20+23 raw data.) If I treat them as samples of two
> > populations -- which I'm not at all sure is valid -- I can compute
> > 90% confidence intervals as follows:
> > Class A: 48.3 < mu < 63.8
> > Class B: 65.4 < mu < 80.9

==
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
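To make the overlap point concrete, here is a small sketch (assuming
scipy is available) that recomputes each section's one-sample t
interval for the mean at 90%, 95%, and 99%, using the summary
statistics quoted above. It shows how the intervals widen, and the gap
between them closes, as the confidence level rises.

    # Confidence intervals for each section mean at several levels,
    # from the thread's summary statistics (mean, SD, n).
    import math
    from scipy import stats

    sections = {"A": (56.1, 20.1, 20), "B": (73.0, 21.6, 23)}
    for level in (0.90, 0.95, 0.99):
        for name, (mean, sd, n) in sections.items():
            # t critical value times the standard error of the mean
            half = stats.t.ppf(1 - (1 - level) / 2, df=n - 1) \
                   * sd / math.sqrt(n)
            print(f"{level:.0%} CI, section {name}: "
                  f"({mean - half:.1f}, {mean + half:.1f})")

At 90% the intervals just fail to overlap (63.8 vs 65.4, as posted
earlier); at 95% and 99% they overlap, which is why the choice of
level matters less than asking what question the comparison is
supposed to answer.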
Re: They look different; are they really?
Be careful of the move from data to conclusion! You say:

> whether one class really is learning the subject better than the
> other, and by how much?

Even assuming the test yields a good measure of how well the students
know the material (which should be investigated, rather than assumed),
it isn't telling you whether students have learned more from the class
itself, unless you assume all students started from the same place.

As I gather is common in this field, the problem isn't statistics per
se, but framing questions that can be answered by the kind of data you
can get.

Stan Brown wrote:
> Another instructor and I gave the same exam to our sections of a
> course. Here's a summary of the results:
> Section A: n=20, mean=56.1, median=52.5, standard dev=20.1
> Section B: n=23, mean=73.0, median=70.0, standard dev=21.6
> Now, they certainly _look_ different. (If it's of any value I can
> post the 20+23 raw data.) If I treat them as samples of two
> populations -- which I'm not at all sure is valid -- I can compute
> 90% confidence intervals as follows:
> Class A: 48.3 < mu < 63.8
> Class B: 65.4 < mu < 80.9
> As I say, I have major qualms about whether this computation means
> anything. So let me pose my question: given the two sets of results
> shown earlier, _is_ there a valid statistical method to say whether
> one class really is learning the subject better than the other, and
> by how much?

Jill Binker
Fathom Dynamic Statistics Software
KCP Technologies, an affiliate of Key Curriculum Press
1150 65th St
Emeryville, CA 94608
1-800-995-MATH (6284)
[EMAIL PROTECTED]
http://www.keypress.com
http://www.keycollege.com