intra-disciplinary barriers or have no interest in "Measuring Teaching Performance," please hit "delete" now. And if you respond to this long (33 kB) post please don't hit the reply button unless you prune the original message normally contained in your reply down to a few lines, otherwise you may inflict this entire post yet again on suffering list subscribers.
ABSTRACT: Assuming that "teaching performance" is gauged by *student learning* and not *teacher behavior*, I discuss five INDIRECT and therefore problematic measures of teaching performance: (1) Reformed Teaching Observation Protocol (RTOP), (2) Student Evaluations Of Teaching (SET's), (3) Course Exams or Final Grades, (4) National Survey Of Student Engagement (NSSE), and (5) Student Assessment Of Learning Gains (SALG). These are contrasted with a DIRECT measure of teaching performance (i.e., student learning) pioneered by physics education researchers: pre/post testing with valid and consistently reliable diagnostic tests based on thorough qualitative and quantitative research by disciplinary experts.
In a previous post "Re: Measuring Teaching Performance" [Hake (2005a)] I wrote bracketed by lines "HHHHHHH. . . ":
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
In a PsychTeacher post of 6 May 2005 titled "Re: Measuring Teaching Performance," Jesse Owen wrote:
"I was wondering what, if any, evaluations have you used to measure what professors are doing in class? Please note I am not interested in measures that only examine student satisfaction or preference (although these items on a measure would be helpful). I have looked in the literature and have found some useful starting places; however, I figured that this list would . . .[be]. . . a great resource to help generate some ideas and some insights about the practicality of these measures."
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
To which Joe Bellina responded: "Isn't this what RTOP . . . [Reformed Teaching Observation Protocol] . . . is all about?"
IMHO, Bellina's statement is somewhat ambiguous - if Bellina's "this" refers to Owen's:
A. "I was wondering what, if any, evaluations have you used to measure what professors are DOING in class?" then YES, that's what RTOP is about.
But if Bellina's "this" refers to Owen's:
B. subject title "Re: MEASURING TEACHING PERFORMANCE," then NO, that's NOT what RTOP is about, IF one judges "teaching performance" by DIRECT gauges of student learning such as pre/post testing with valid and consistently reliable diagnostic tests based on thorough qualitative and quantitative research by disciplinary experts [see, e.g., Halloun & Hestenes (1985a,b), Beichner (1994), Thornton & Sokoloff (1998), Maloney et al. (2001)] and not by INDIRECT gauges.
The lessons from the physics education reform effort [Hake (2002b)] most relevant to the rigorous pre/post testing of student learning are:
**Lesson #3. High-quality standardized tests of the cognitive and affective impact of courses are essential for gauging the relative effectiveness of traditional and non-traditional educational methods.**
Such effort would probably require fulfillment of Lesson #4:
**Lesson #4. Education research and development (R&D) by disciplinary experts (DE's), of the same quality and nature as traditional Science/Engineering R&D, is needed to develop potentially effective educational methods within each discipline. But the DE's should take advantage of the insights of (a) DE's doing education R&D in other disciplines, (b) cognitive scientists, (c) faculty and graduates of education schools, and (d) classroom teachers.**
Despite the alleged "new paradigm" that emphasizes student learning rather than professorial teaching in academia [Barr & Tagg (1995)], the above lessons have been generally ignored by academia with its pre/post paranoia [Hake (2000, 2001, 2004c)] and its infatuation with indirect measures of student learning (see below). Even the NRC's undergraduate science education committees [e.g., Fox & Hackerman (2003), McCray et al. (2003)] have dismissed the lessons from the physics education reform effort. However, the NRC appears to have finally awakened with the publication of Donovan & Pellegrino (2003).
The above pre/post testing recommendation appears to conflict with the opinion of psychologist David Berliner (2005), who writes: ". . . measurement of [teachers'] success in promoting learning through 'pay for performance' or 'value-added' assessments is so filled with psychometric problems that no current system is acceptable for assessing this dimension of teacher quality." But the evident success of physics education researchers in the use of pre/post testing to gauge student learning and thereby improve introductory courses [most notably at Harvard (Crouch & Mazur, 2001) and MIT (Dori & Belcher, 2004)] seems to contradict Berliner's claim, suggesting that psychometric problems may not be as serious as many believe [Hake (2004d), Scriven (2004)].
Five relatively popular INDIRECT AND THEREFORE PROBLEMATIC gauges of student learning are:
1. REFORMED TEACHING OBSERVATION PROTOCOL (RTOP) [see e.g., MacIsaac & Falconer (2002), ACEPT (2002), Lawson (2003), MacIsaac (2003)]. In a PhysLrnR post titled "RTOP Was: Measuring Content Knowledge" [Hake (2004a)] I wrote [bracketed by lines "HHHHHHH. . . ."]:
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
In his PhysLrnR post of 16 Mar 2004 titled "RTOP Was: Measuring Content Knowledge," Dan MacIsaac (2004) wrote:
"RTOP is NOT intended to indirectly measure conceptual learning; it is intended to explicitly characterize specific classroom behaviours . . [British variant of "behaviors"]. . . . which promote large conceptual gains in student learning. The fact that RTOP does correlate strongly with student conceptual gains shows RTOP has validity, but does not eliminate a need for independently measuring
student learning."
THREE POINTS:
111111111111111111111111111111111111111111111
1. For what does RTOP have validity? A more cautious wording might be "The fact that RTOP has been shown in a few cases to correlate strongly with student conceptual gains SUGGESTS that RTOP *might* be of general value in gauging teaching effectiveness."
222222222222222222222222222222222222222222222
2. I agree with Dan that those who use RTOP should also attempt to independently measure gains in student understanding. But I fear that, just as with the current misuse of student evaluations of teaching (SET's), administrators (and even teachers themselves) may take bare RTOP scores as evidence of teaching effectiveness.
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
That RTOP scores correlate strongly with student conceptual gains may have been shown for certain types of non-traditional instruction [MacIsaac & Falconer (2002), Lawson (2003), Adamson et al. (2003)], but courses with relatively high . . .[average NORMALIZED gains - Hake (1998a)] . . . <g>'s and relatively low RTOP scores may well exist. IF that's the case then the teaching methods that garner high RTOP scores ["modeling" . . .[see <http://modeling.asu.edu/R&E/Research.html>]. . . . is evidently one of them] may not be the only methods that are effective in promoting conceptual understanding.
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
2. STUDENT EVALUATIONS OF TEACHING (SET's). In "Re: Problems with Student Evaluations: Is Assessment the Remedy?"[Hake (2002a)], I wrote [see that article for references other than Hake (2002b, 2004c, 2005b) and Scriven (2004)]:
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Although there are many SET researchers (see, e.g., Abrami et al. 1990; Aleamoni 1987; d'Apollonia & Cohen 1997; Cohen 1981; Cashin 1995; Marsh & Roche 1997; Marsh & Dunkin 1992) who claim that SET's are valid indicators of the . . .[cognitive (as distinguished from the affective) impact of a course]. . . (for a review see Hake 2000), their conclusions are almost always based on measuring student learning or "achievement" by course grades or exams and NOT by pre/post testing . . .[even despite the Lordly Cronbachian objections of some education/psychology specialists - see Hake (2001); more recent references are Hake (2004d, 2005b) and Scriven (2004)]. . . with valid and reliable instruments such as the "Force Concept Inventory" of Hestenes et al. (1992) and Halloun et al. (1995) [see, e.g., Hake (2002b)].
With regard to the problem of using course performance as a measure of student achievement or learning, Peter Cohen's (1981) oft-quoted meta-analysis of 41 studies on 68 separate multisection courses purportedly showed that: "the average correlation between an overall instructor rating and student achievement was +0.43; the average correlation between an overall course rating and student achievement was +0.47 . . . the results . . . provide strong support for the validity of student ratings as measures of teaching effectiveness." It was reviewed and reanalyzed by Feldman (1989), who pointed out that "McKeachie (1987) has recently reminded educational researchers and practitioners that the achievement tests assessing student learning in the sorts of studies reviewed here . . .[e.g., those by Cohen (1981, 1986, 1987)]. . . typically measure lower-level educational objectives such as memory of facts and definitions rather than higher-level outcomes such as critical thinking and problem solving . . .[he might have added conceptual understanding]. . . that are usually taken as important in higher education."
Striking back at SET skeptics, Peter Cohen (1990) opined: "Negative attitudes toward student ratings are especially resistant to change, and it seems that faculty and administrators support their belief in student-rating myths with personal and anecdotal evidence, which (for them) outweighs empirically based research evidence."
However, as far as I know, neither Cohen nor any other SET champion has countered the fatal objection of McKeachie **that the evidence for the validity of SET's as gauges of the cognitive impact of courses rests for the most part on measures of students' lower-level thinking as exhibited in course grades or exams**. Furthermore, rampant grade inflation [Merrow (2003), Johnson (2003)] seriously undermines the use of final grades as measures of learning.
At least in physics it is well known [see, e.g., Hake (2002b)] that students in TRADITIONAL mechanics courses can achieve A's through rote memorization and algorithmic problem solving, while achieving NORMALIZED gains in conceptual understanding of only about 0.2 (i.e., pre-to-post gains that are only about 0.2 of the maximum possible gain). For a discussion of higher-level learning outcomes - "procedural," "schematic," and "strategic" - see Shavelson & Huang (SH) (2003). In Chart 1, SH list higher-level learning objectives (not usually measured by course grades or exams and certainly not by student evaluations of teaching) such as "procedural" . . .[see Arons (1983)]. . ., "schematic," and "strategic" knowledge within knowledge domains. Such knowledge can and has been measured by tests constructed by disciplinary experts in education research - for listings see NCSU (2005) and FLAG (2005).
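For readers unfamiliar with the normalized-gain metric of Hake (1998a), the arithmetic is simple: <g> = (%post - %pre)/(100 - %pre), the actual class-average gain divided by the maximum possible gain. A minimal sketch in Python, with purely illustrative pre/post scores (not data from any actual course):

```python
def normalized_gain(pre_pct: float, post_pct: float) -> float:
    """Average normalized gain <g> of Hake (1998a):
    <g> = (%post - %pre) / (100 - %pre),
    i.e., actual class-average gain over maximum possible gain."""
    if pre_pct >= 100:
        raise ValueError("pre-test average must be below 100%")
    return (post_pct - pre_pct) / (100.0 - pre_pct)

# Illustrative traditional course: pre 40%, post 52% -> 12/60
print(round(normalized_gain(40.0, 52.0), 2))  # 0.2

# Illustrative interactive-engagement course: pre 40%, post 70% -> 30/60
print(round(normalized_gain(40.0, 70.0), 2))  # 0.5
```

Note that because <g> divides by the room left for improvement, it lets one compare courses whose classes start from different pre-test levels.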
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
3. COURSE EXAMS OR FINAL GRADES [for arguments against the use of course exams or final grades to gauge the cognitive impact of courses see "2" above].
4. NATIONAL SURVEY OF STUDENT ENGAGEMENT (NSSE) [Kuh (2002, 2003, <http://www.indiana.edu/~nsse/>); NCPI (2002)]. Kuh (2003) wrote: "Although NSSE DOES NOT DIRECTLY ASSESS LEARNING OUTCOMES, the results from the survey point to areas where colleges are performing well in enhancing learning, as well as to aspects of the undergraduate experience that could be improved." (My CAPS.)
In Hake (2003b) I wrote:
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
The move to evaluate undergraduate education by means of student feedback surveys such as. . .[NSSE]. . . appears to ignore:
(a) Lessons #3 & 4 . . . [of lessons of the physics education reform effort (Hake 2002b) - see above] . . . as well as
(b) DIRECT measurement in an increasing number of disciplines (Hake 2002c, 2003c) of higher-level learning outcomes . . .[and the correlation of those outcomes with the degree of interactive engagement (Hake 1998a,b) as corroborated by many other physics education research groups as listed in Hake (2002b,c)].
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
5. STUDENT ASSESSMENT OF LEARNING GAINS (SALG) [Seymour et al. (2000), FLAG (2005)]
According to Seymour et al. (2005) [bracketed by lines "SSSSSS. . ."]:
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
The SALG instrument can spotlight those elements in the course that best support student learning and those that need improvement. This instrument is a powerful tool, can be easily individualized, provides instant statistical analysis of the results, and facilitates formative evaluation throughout a course. Instructors feel that typical classroom evaluations offer poor feedback, and this
dissatisfaction is heightened when these instruments are used for promotion decisions. We've found that questions about how well instructors performed their teaching role and about "the class overall" yield inconclusive results. We believe all of these shortcomings are addressed with the SALG. . . . . . SALG is a web-based instrument consisting of statements about the degree of "gain" (on a five-point scale) which students perceive they've made in specific aspects of the class. Instructors can add, delete, or edit questions. The instrument is administered on-line, and typically takes 10-15 minutes. A summary of results is instantly available in both statistical and graphical form. . . . . There is substantial research which concludes that administering classroom instruments based on student perceptions of the efficacy of particular teaching methods can be both valid and reliable [Hinton (1993)].
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
I have not seen the article by Hinton, but because valid and consistently reliable tests of student learning are so rare, I am somewhat skeptical that Hinton has conclusively shown that "administering classroom instruments based on student perceptions of the efficacy of particular teaching methods CAN be both valid and reliable." What independent gauge of student learning could he have used in his research?
In regard to students' self assessment of their learning gains, Mike Zeilik (2003) wrote in response to Hake (2003c):
"I have read a fascinating book [Druckman & Bjork (1994)] related to this issue [the unreliability of students' self assessment of their learning] . . . After a review of the literature, they state bluntly:
'In short, learners do not necessarily know what is best for them' (p. 72)."
Nevertheless, I think that SALG is probably far superior to the standard SET methods that now dominate university assessment of faculty.
Richard Hake, Emeritus Professor of Physics, Indiana University 24245 Hatteras Street, Woodland Hills, CA 91367 <[EMAIL PROTECTED]> <http://www.physics.indiana.edu/~hake> <http://www.physics.indiana.edu/~sdi>
REFERENCES
ACEPT. 2002. "Final Report," online at <http://acept.la.asu.edu/final_report/>: The report summarizes the activities and findings of the first five years (1995-2000) of the ACEPT project. I thank Beth Kubitskey for this reference.
Adamson, A.E., D. Banks, M. Burtch, F. Cox III, E. Judson, J.B. Turley, R. Benford, & A.E. Lawson. 2003. "Reformed Undergraduate Instruction and Its Subsequent Impact on Secondary School Teaching Practice and Student Achievement," Journal of Research in Science Teaching 40(10): 939-958; abstract online at
<http://www3.interscience.wiley.com/cgi-bin/abstract/106567443/ABSTRACT>. Non-subscribers may purchase the complete article for $25. The abstract reads: "The Arizona Collaborative for Excellence in the Preparation of Teachers (ACEPT) Program is one of several reform efforts supported by the National Science Foundation. The primary ACEPT reform mechanism has been month-long summer workshops in which university and community college science and mathematics faculty learn about instructional reforms and then attempt to apply them in
their courses. Previous ACEPT evaluation efforts suggest that, when implemented, the reforms boost undergraduate student achievement. The initial purpose of the present study was to discover whether enrollment of preservice teachers in one or more of these reformed undergraduate courses is linked to the way they teach after they graduate and become in-service teachers. Assuming that a link is
found, a second purpose was to discover whether the presumed positive effect is in turn linked to their students' achievement. In short, the answer appears to be yes, at least among the biology teachers and students surveyed. Compared with controls, the biology teachers who had enrolled in one or more ACEPT reformed course during their teacher preparation program demonstrated significantly higher scores on the measure of reformed instruction and their students demonstrated significantly higher achievement in terms of scientific reasoning, nature of science, and biology concepts. These results support the hypothesis that teachers teach as they have been taught. Furthermore, it appears that instructional reform in teacher preparation programs including both methods and major's courses can improve secondary school student achievement."
Arons, A.B. 1983. "Achieving Wider Scientific Literacy," Daedalus, Spring. Reprinted in Arons (1997) as Chapter 12, "Achieving Wider Scientific Literacy." Arons wrote (see the article for the References, my CAPS): "Researchers in cognitive development describe two principal classes of knowledge: figurative (or declarative) and operative (or procedural) [Anderson (1980); Lawson (1982)]. DECLARATIVE KNOWLEDGE CONSISTS OF KNOWING 'FACTS'; for example, that the moon shines by reflected sunlight, that the earth and planets revolve around the sun . . . . OPERATIVE (OR PROCEDURAL) KNOWLEDGE, ON THE OTHER HAND, INVOLVES UNDERSTANDING THE SOURCE OF SUCH DECLARATIVE KNOWLEDGE (How do we know the moon shines by reflected sunlight? Why do we believe the earth and planets revolve around the sun when appearances suggest that everything revolves around the earth? . . . .) and the capacity to use, apply, transform, or recognize the relevance of the declarative knowledge to new or unfamiliar situations.
Arons, A.B. 1997. "Teaching Introductory Physics." Wiley.
Barr, R.B. & J. Tagg. 1995. "From Teaching to Learning: A New Paradigm for Undergraduate Education," Change 27(6); 13-25, November/December. Reprinted in D. Dezure, "Learning from Change: Landmarks in Teaching and Learning in Higher Education from 'Change Magazine' 1969-1999." American Association for Higher Education. Barr & Tagg write: "A paradigm shift is occurring in American higher
education. Under the traditional, dominant 'Instruction Paradigm,' colleges are institutions that exist to provide instruction. Subtly but profoundly, however, a 'Learning Paradigm' is taking hold, whereby colleges are institutions that exist to produce learning. This shift is both needed and wanted, and it changes everything. The writers discuss the mission and purposes of the Learning Paradigm."
Beichner, R.J. 1994. "Testing student interpretation of kinematics graphs," Am. J. Phys. 62(8): 750-762. An abstract is online at <http://www2.ncsu.edu/ncsu/pams/physics/Physics_Ed/publications.html>.
Berliner, D. 2005. "The Near Impossibility Of Testing For Teacher Quality," Journal of Teacher Education 56(3): 205-213. I thank Berliner for calling my attention to this article.
Crouch, C.H. & E. Mazur. 2001. "Peer Instruction: Ten years of experience and results," Am. J. Phys. 69: 970-977; online at
<http://mazur-www.harvard.edu/library.php>, search "All Education Areas" for author "Crouch" (without the quotes).
Donovan, M.S. & J. Pellegrino, eds. 2003. Learning and Instruction: A SERP Research Agenda, National Academies Press; online at
<http://books.nap.edu/catalog/10858.html>.
Dori, Y.J. & J. Belcher. 2004. "How Does Technology-Enabled Active Learning Affect Undergraduate Students' Understanding of Electromagnetism Concepts?" To appear in The Journal of the Learning Sciences 14(2); online at
<http://web.mit.edu/jbelcher/www/TEALref/TEAL_Dori&Belcher_JLS_10_01_2004.pdf> (1 MB).
Druckman, D. & R.A. Bjork, eds. 1994. "Learning, Remembering, and Believing; Enhancing Human Performance." Nat. Acad. Press; online at
<http://www.nap.edu/catalog/2303.html>.
FLAG. 2005. "Field-tested Learning Assessment Guide," online at
<http://www.flaguide.org/>: ". . . offers broadly applicable, self-contained modular classroom assessment techniques (CAT's) and discipline-specific tools for STEM [Science, Technology, Engineering, and Mathematics] instructors interested in new approaches to evaluating student learning, attitudes and performance. Each has been developed, tested, and refined in real colleges and universities classrooms." Assessment tools for physics and astronomy (and other disciplines) are at <http://www.flaguide.org/tools/tools.php>.
Fox, M.A., & N. Hackerman, eds. 2003. "Evaluating and Improving Undergraduate Teaching in Science, Technology, Engineering, and Mathematics," National Research Council, Committee on Undergraduate Science Education, National Academy Press; online at <http://www.nap.edu/catalog/10024.html>.
Hake, R.R. 1998a. "Interactive-engagement vs traditional methods: A
six-thousand-student survey of mechanics test data for introductory physics courses," Am. J. Phys. 66: 64-74; online as ref. 24 at
<http://www.physics.indiana.edu/~hake>, or simply click on <http://www.physics.indiana.edu/~sdi/ajpv3i.pdf> (84 kB).
Hake, R.R. 1998b. "Interactive-engagement methods in introductory mechanics courses," online as ref. 25 at <http://www.physics.indiana.edu/~hake>, or simply click on <http://www.physics.indiana.edu/~sdi/IEM-2b.pdf> (108 kB) - a crucial companion paper to Hake (1998a).
Hake, R.R. 2000. "Re: Is Pre-post Testing a Waste of Time?" AERA-D post of 21 Oct 2000 16:29:48-0700; online at
<http://lists.asu.edu/cgi-bin/wa?A2=ind0010&L=aera-d&P=R4769>.
Hake, R.R. 2001. "Pre/Post Paranoia (was 'Re: effect size')"," AERA-D/PhysLrnR post of 17 May 2001 16:01:56-0700; online at
<http://lists.asu.edu/cgi-bin/wa?A2=ind0105&L=aera-d&P=R19884>.
Hake, R.R. 2002a. "Re: Problems with Student Evaluations: Is Assessment the Remedy?" online at <http://listserv.nd.edu/cgi-bin/wa?A2=ind0204&L=pod&P=R14535>. Post of 25 Apr 2002 16:54:24-0700 to AERA-D, ASSESS, EvalTalk, Phys-L, PhysLrnR, POD, & STLHE-L. Slightly edited and improved on 16 November 2002 as ref. 18 at <http://www.physics.indiana.edu/~hake> or download directly at <http://www.physics.indiana.edu/~hake/AssessTheRem1.pdf> (72 kB). Also online in HTML at <http://www.stu.ca/~hunt/hake.htm> as one of the many resources in Russ Hunt's annotated bibliography of articles and books on student evaluation of teaching <http://www.stu.ca/~hunt/evalbib.htm>.
Hake, R.R. 2002b. "Lessons from the physics education reform effort," Ecology and Society 5(2): 28; online at
<http://www.ecologyandsociety.org/vol5/iss2/art28/>. Ecology and Society (formerly Conservation Ecology) is a free "peer-reviewed journal of integrative science and fundamental policy research" with about 11,000 subscribers in about 108 countries.
Hake, R.R. 2002c. "Assessment of Physics Teaching Methods," Proceedings of the UNESCO-ASPEN Workshop on Active Learning in Physics, Univ. of Peradeniya, Sri Lanka, 2-4 Dec. 2002; also online as ref. 29 at <http://www.physics.indiana.edu/~hake/> or download directly by clicking on <http://www.physics.indiana.edu/~hake/Hake-SriLanka-Assessb.pdf> (84 KB).
Hake, R.R. 2003a. "Meta-analyses of <g> Values," post of 5 Sep 2003 17:14:26-0700 to ASSESS, EvalTalk, PhysLrnR, and POD; online at
<http://listserv.nd.edu/cgi-bin/wa?A2=ind0309&L=pod&P=R2439>.
Hake, R.R. 2003b. "Re: Thin-Slice Judgments, End-of-Course Evaluations, Grades, and Student Learning, " online at <http://listserv.nd.edu/cgi-bin/wa?A2=ind0303&L=pod&P=R21469&I=-3>. Post of 31 Mar 2003 12:47:55-0800 to ASSESS, EvalTalk, PhysLrnR, POD, & STLHE-L.
Hake, R.R. 2003c. "Beyond Dead Reckoning to Improve Educational Quality," post of 20 Mar 2003 15:11:26-0800 to Biopi-L, ASSESS, Chemed-L, EvalTalk, PhysLrnR, & POD; online at <http://listserv.nd.edu/cgi-bin/wa?A2=ind0303&L=pod&F=&S=&P=11714>.
Hake, R.R. 2004a. "RTOP - was: Measuring Content Knowledge," PhysLrnR post of 16 Mar 2004 16:43:09-0800; online at <http://listserv.boisestate.edu/cgi-bin/wa?A2=ind0403&L=physlrnr&P=R4217&I=-3&X=3DF6785158CF1FE959&[EMAIL PROTECTED]>. The encyclopedic URL indicates that PhysLrnR is one of the few "closed" discussion lists for which one must subscribe to access its archives. However, it takes only a few minutes to subscribe by following the simple directions at
<http://listserv.boisestate.edu/archives/physlrnr.html> / "Join or leave the list (or change settings)" where "/" means "click on." If you're busy, then subscribe using the "NOMAIL" option under "Miscellaneous." Then, as a subscriber, you may access the archives and/or post messages at any time, while receiving NO MAIL from the list!
Hake, R.R. 2004b. "Re: Measuring Content Knowledge", online at
<http://lists.asu.edu/cgi-bin/wa?A2=ind0403&L=aera-d&T=0&O=D&P=5436>.
Post of 14 Mar 2004 16:29:47-0800 to ASSESS, Biopi-L, Chemed-L, EvalTalk, Physhare, Phys-L, PhysLrnR, POD, and STLHE-L; later sent to AERA-D with a few corrections where it appears at
<http://lists.asu.edu/cgi-bin/wa?A2=ind0403&L=aera-d&P=R3625&I=-3>.
Hake, R.R. 2004c. "Re: Measuring Content Knowledge", online at <http://listserv.nd.edu/cgi-bin/wa?A2=ind0403&L=pod&O=A&P=17167>. Post of 15 Mar 2004 14:29:59-0800 to ASSESS, EvalTalk, Phys-L, PhysLrnR, & POD.
Hake, R.R. 2004d. "Re: pre-post testing in assessment," online at <http://listserv.nd.edu/cgi-bin/wa?A2=ind0408&L=pod&P=R9135&I=-3>. Post of 19 Aug 2004 13:56:07-0700 to AERA-D, AERA-J, EDSTAT-L, EVALTALK, PhysLrnR, and POD.
Hake, R.R. 2005a. "Re: Measuring Teaching Performance" online at <http://listserv.nd.edu/cgi-bin/wa?A2=ind0505&L=pod&O=D&P=8800>. Post of 11 May 2005 18:30:15-0700 to ASSESS, EvalTalk, PhysLrnR, POD, TIPS, and TeachingEdPsych.
Hake, R.R. 2005b. "Carnegie Scholar Backs Pre/post Testing," online at
<http://listserv.nd.edu/cgi-bin/wa?A2=ind0503&L=pod&P=R17631&I=-3>. Post of 24 Mar 2005 01:23:45-0800 to AERA-C, AERA-D, AERA-J, AERA-K, ASSESS, EdStat-L, EvalTalk, PhysLrnR, POD, & STLHE-L
Halloun, I. & D. Hestenes. 1985a. "The initial knowledge state of college physics students." Am. J. Phys. 53:1043-1055; online at
<http://modeling.asu.edu/R&E/Research.html>. Contains the "Mechanics Diagnostic" test, precursor to the "Force Concept Inventory."
Halloun, I. & D. Hestenes. 1985b. "Common sense concepts about motion." Am. J. Phys. 53:1056-1065; online at <http://modeling.asu.edu/R&E/Research.html>.
Hinton, H. 1993. "Reliability and validity of student evaluations: Testing models versus survey research models," PS: Political Science and Politics, September: 562-569.
Johnson, V.E. 2003. "Grade Inflation: A Crisis in College Education." Springer-Verlag.
Kuh, G.D. 2002. "Assessing What Really Matters to Student Learning: Inside the National Survey of Student Engagement," Change 33(3): 10-17, 66.
Kuh, G.D. 2003. "What We're Learning About Engagement From NSSE," Change, March/April, pp. 13-23.
Lawson, A.E. 2003. "Using the RTOP to Evaluate Reformed Science and Mathematics Instruction," in McCray et al. (2003): 89-100. In my view it's ironic that McCray et al. (2003) provide a complete "commissioned paper" by Lawson on the RTOP, an INDIRECT measure of student learning at best, while totally neglecting the landmark DIRECT measure of student learning by Halloun & Hestenes (1985a,b) and the pre/post testing movement that it initiated.
MacIsaac, D.L. & K.A. Falconer. 2002. "Reforming physics education via RTOP." Phys. Teach. 40(8), 479-485; online as a 116 kB pdf at
<http://physicsed.buffalostate.edu/pubs/TPT/TPTNov02RTOP/>; also online to Physics Teacher subscribers as a 115kB pdf at
<http://scitation.aip.org/dbt/dbt.jsp?KEY=PHTEAH&Volume=40&Issue=8>. Describes physics-specific RTOP use.
MacIsaac, D.L. 2004. "RTOP - Was: Measuring Content Knowledge," PhysLrnR post of 16 Mar 2004 11:49:44-0500; online at
<http://listserv.boisestate.edu/cgi-bin/wa?A2=ind0403&L=physlrnr&O=D&X=033060702551103201&[EMAIL PROTECTED]&P=5445>. Regarding the encyclopedic URL see Hake (2004a).
MacIsaac, D.L. 2005. Reformed Teaching Observation Protocol (RTOP) website at <http://physicsed.buffalostate.edu/AZTEC/RTOP/RTOP_full/index.htm>. Includes the instrument itself, sample videos, and relevant references.
Maloney, D., T.L. O'Kuma, C.J. Hieggelke, & A. Van Heuvelen. 2001. "Surveying students' conceptual knowledge of electricity and magnetism," Physics Education Research Supplement to
Am. J. Phys 69(7): S12-S23.
McCray, R.A., R.L. DeHaan, J.A. Schuck, eds. 2003. "Improving Undergraduate Instruction in Science, Technology, Engineering, and Mathematics: Report of a Workshop," Committee on Undergraduate STEM Instruction, National Research Council, National Academy Press; online at <http://www.nap.edu/catalog/10711.html>. Physicists/astronomers attending the workshop were Paula Heron, Priscilla Laws, John Layman, Ramon Lopez, Richard McCray, Lillian McDermott, Carl Wieman, Jack Wilson, and (believe it or not) even the FLAG (2005) waving Mike Zeilik.
Merrow, J. 2003. "Easy grading makes 'deep learning' more important," USA Today Editorial, 4 February; online at
<http://www.usatoday.com/news/opinion/editorials/2003-02-04-merrow_x.htm>: "Duke University Professor Valen Johnson . . .[2003]. . . studied 42,000 grade reports and discovered easier grades in the 'soft' sciences such as cultural anthropology, sociology, psychology, and communications. The hardest A's were in the natural sciences, such as physics, and in advanced math courses. The easiest department was music, with a mean grade of 3.69; the toughest was math, with a mean of 2.91."
NCPI. 2002. National Center for Postsecondary Improvement, "Beyond Dead Reckoning: Research Priorities for Redirecting American Higher Education." The 25-page report is online as a 4.7 K pdf at
<http://www.stanford.edu/group/ncpi/>.
NCSU. 2005. "Assessment Instrument Information Page," Physics Education R & D Group, North Carolina State University; online at
<http://www.ncsu.edu/per/TestInfo.html>.
Scriven, M. 2004. "Re: pre- post testing in assessment," AERA-D post of 15 Sep 2004 19:27:14-0400; online at
<http://lists.asu.edu/cgi-bin/wa?A2=ind0409&L=aera-d&T=0&F=&S=&P=1952>.
Seymour, E., D. Wiese, A. Hunter, & S. Daffinrud. 2000. "Creating a Better Mousetrap: On-line Student Assessment of Their Learning Gains." Paper presented at the National Meetings of the American Chemical Society Symposium, San Francisco; online as a pdf <http://www.aacu-edu.org/issues/sciencehealth/Mousetrap.pdf> (116 KB).
Seymour, E., D. Wiese, A. Hunter, & S. Daffinrud. 2005. "Student Assessment of Learning Gains(SALG)"; online at <http://www.flaguide.org/extra/download/cat/salg/salg.txt>.
Shavelson, R.J. & L. Huang. 2003. "Responding Responsibly To the Frenzy to Assess Learning in Higher Education," Change Magazine, January/February; online at <http://www.aahe.org/change/> / "Selected Change Articles," where "/" means "click on."
Thornton, R.K. & D.R. Sokoloff. 1998. "Assessing student learning of Newton's Laws: The force and motion conceptual evaluation and the evaluation of active learning laboratory and lecture curricula," Am. J. Phys. 66(4): 338-352.
Zeilik, M. 2003. "Re: Beyond Dead Reckoning to Improve Educational Quality," PhysLrnR post of 29 Mar 2003 18:57:11-0700; online at
<http://listserv.boisestate.edu/cgi-bin/wa?A2=ind0303&L=physlrnr&P=R9612&I=-3&X=21F372467D444946B7&[EMAIL PROTECTED]>
