Re: [Edu-sig] Python Programming: Procedural Online Test
Damon,

Thank you for your thoughtful response. In terms of the Python tests, I as well would hope that all my students (13- to 15-year-olds) could answer questions based on the content shared - kind of in the spirit of the Computing for All/Core Knowledge approach (No-Child-Left-Behind-ish? - not playing "gotcha", but "here is the information we expect you to know; do you know it? can you apply it?"), along with opportunities for students to display and be recognized for comprehension and ability above and beyond what was expressly expected within the standard curriculum - as you indicated with the phrase "training program" in the first paragraph of your response.

As for the assessment of the distributed-ability issues (primarily expressed in your second paragraph), I will definitely leave that to the education psychologists and to whatever is actually being measured - perhaps something beyond the curriculum.

Thanks again,
Scott

damon bryant said:

>> Could it be argued that the goal be for all students to score 100% on the
>> desired content?
>
> I would argue that it should be one of the goals in designing and
> implementing a training program. The test could have a different purpose.
> What we all have experienced in teaching students is that ability is
> distributed; more than likely that distribution is normal for whatever
> reason, and the variation of scores within the distribution can be tight
> (e.g., SAT quantitative scores at Rice) or loose (e.g., SAT quantitative
> scores at a junior college, assuming that the SAT is a requirement).
>
> Psychological tests and measures can give us an indication of where students
> stand in a distribution (norm-referenced testing) or where each student's
> achievement level is relative to some absolute performance criterion
> (criterion-referenced testing) before, during, or after training. In other
> words, it depends on the purpose of testing, which is determined before the
> test is designed and is a major evaluation point of its validity or accuracy
> in doing what it purports to do.
>
> Damon

S c o t t  J. D u r k i n
Computer Science
Preston Junior High
[EMAIL PROTECTED]
http://staffweb.psdschools.org/sdurkin
Computer Room, Lab N205
970.419.7358
2005-2006
scott.james.durkin
Re: [Edu-sig] Python Programming: Procedural Online Test
> Could it be argued that the goal be for all students to score 100% on the
> desired content?

I would argue that it should be one of the goals in designing and implementing a training program. The test could have a different purpose. What we all have experienced in teaching students is that ability is distributed; more than likely that distribution is normal for whatever reason, and the variation of scores within the distribution can be tight (e.g., SAT quantitative scores at Rice) or loose (e.g., SAT quantitative scores at a junior college, assuming that the SAT is a requirement).

Psychological tests and measures can give us an indication of where students stand in a distribution (norm-referenced testing) or where each student's achievement level is relative to some absolute performance criterion (criterion-referenced testing) before, during, or after training. In other words, it depends on the purpose of testing, which is determined before the test is designed and is a major evaluation point of its validity or accuracy in doing what it purports to do.

Damon
Re: [Edu-sig] Python Programming: Procedural Online Test
[ Scott Durkin ]:
> Could it be argued that the goal be for all students to score 100%
> on the desired content?

That is precisely my goal when I prepare exams. No success so far ;o)

[ Damon Bryant ]:
> No, students are not receiving a hard A or an easy A. I make no
> classifications such as those you propose. My point is that
> questions are placed on the same scale as the ability being
> measured (called a theta scale). Grades may be mapped to the scale,
> though, but a hard A or easy A will not be assigned under the
> conditions described.
>
> Because all questions in the item bank have been linked, two
> students can take the same computer adaptive test but have no items
> in common between the two administrations. However, scores are on
> the same scale.

Thank you for taking the trouble to explain it further.

Abração,
Senra

Rodrigo Senra
__
rsenra @ acm.org
http://rodrigo.senra.nom.br
Re: [Edu-sig] Python Programming: Procedural Online Test
Total does make more sense. I've made the change to "total". Thanks, Scott!

From: Scott David Daniels <[EMAIL PROTECTED]>
To: edu-sig@python.org
Subject: Re: [Edu-sig] Python Programming: Procedural Online Test
Date: Tue, 06 Dec 2005 13:23:52 -0800

damon bryant wrote:
...
> I have corrected the issue with the use of 'sum' (now sum1) and the

I'd suggest "total" would be a better replacement than sum1.

--Scott David Daniels
[EMAIL PROTECTED]
Re: [Edu-sig] Python Programming: Procedural Online Test
damon bryant wrote:
...
> I have corrected the issue with the use of 'sum' (now 'sum1') and the

I'd suggest "total" would be a better replacement than sum1.

--Scott David Daniels
[EMAIL PROTECTED]
Re: [Edu-sig] Python Programming: Procedural Online Test
Thanks, Wesley!

If the item bank were larger, you would not have received easier questions at the end; you would have gotten more difficult questions. The bank for the demo is quite small, so you exhausted all of the difficult ones first, because your ability initially mapped on to the difficult portion of the scale. The algorithm is quite efficient in determining where you are on the scale after about 3 - 5 questions. In a test with a larger bank, you would have received more difficult questions as long as you kept getting them right. The test would finally terminate after 20 questions had been administered. The alpha of the test, a psychometric term for reliability, is estimated to be .92 or higher with this number of items in a well-designed computer adaptive test.

I have corrected the issue with the use of 'sum' (now sum1) and the syntax error with 'True:' (now True); that was a good catch!

On a different note, I thought that designing this trial version of the system in Python would increase the time needed to serve the questions to the client. I guess that using numarray and multithreading to do the heavy lifting on the back end has made it fast enough for operational use. What do you think?

From: w chun <[EMAIL PROTECTED]>
To: damon bryant <[EMAIL PROTECTED]>
CC: edu-sig@python.org
Subject: Re: [Edu-sig] Python Programming: Procedural Online Test
Date: Mon, 5 Dec 2005 23:46:32 -0800

> The problems seemed to get much easier in the last 5 or so (very basic
> syntax questions). The one about "James"=="james" returning -1 is no longer
> true on some Pythons (as now we have boolean True).

the tests were well done... i enjoyed taking them. like kirby, i also found the Boolean issue.

in the procedural test, i found a syntax error... i think the question with the [None] * 5... (well, [None, None, None, None, None] actually), where you're setting "x[b[i] = True:" ... that colon shouldn't be there.

there was/were also question(s) which used sum as a variable name. that is a built-in function that is hidden if used. interestingly enough, your syntax checker actually highlighted it too. :-)

cheers,
-- wesley
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"Core Python Programming", Prentice Hall, (c)2006,2001
http://corepython.com
wesley.j.chun :: wescpy-at-gmail.com
cyberweb.consulting : silicon valley, ca
http://cyberwebconsulting.com
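[Since the 'sum' fix comes up several times in this thread, a quick illustration of why shadowing the built-in matters, and why a name like "total" (as suggested elsewhere in the thread) sidesteps it entirely. The variable values here are illustrative, not taken from the actual test question:

    sum([1, 2, 3])         # 6 -- the built-in works as expected

    sum = 0                # rebinding the name hides the built-in function
    for n in [1, 2, 3]:
        sum = sum + n      # accumulates fine, but...
    # sum([4, 5, 6])       # ...would now raise TypeError: 'int' object is not callable

    total = 0              # a neutral name leaves the built-in untouched
    for n in [1, 2, 3]:
        total = total + n
]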
Re: [Edu-sig] Python Programming: Procedural Online Test
> The problems seemed to get much easier in the last 5 or so (very basic
> syntax questions). The one about "James"=="james" returning -1 is no longer
> true on some Pythons (as now we have boolean True).

the tests were well done... i enjoyed taking them. like kirby, i also found the Boolean issue.

in the procedural test, i found a syntax error... i think the question with the [None] * 5... (well, [None, None, None, None, None] actually), where you're setting "x[b[i] = True:" ... that colon shouldn't be there.

there was/were also question(s) which used sum as a variable name. that is a built-in function that is hidden if used. interestingly enough, your syntax checker actually highlighted it too. :-)

cheers,
-- wesley
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"Core Python Programming", Prentice Hall, (c)2006,2001
http://corepython.com
wesley.j.chun :: wescpy-at-gmail.com
cyberweb.consulting : silicon valley, ca
http://cyberwebconsulting.com
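[For readers trying to picture the bug wesley describes: only the fragment "x[b[i] = True:" appears in the thread, so the code below is a guess at what the corrected question might have looked like; the index list b is invented for illustration:

    x = [None] * 5          # [None, None, None, None, None]
    b = [0, 2, 4]
    for i in range(len(b)):
        x[b[i]] = True      # closing bracket, plain assignment, no trailing colon
    # x is now [True, None, True, None, True]
]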
Re: [Edu-sig] Python Programming: Procedural Online Test
damon bryant wrote:
> Hi Rodrigo!
>
>> If I understood correctly, the proposal is to give a "hard"-A for some
>> and an "easy"-A for others, so everybody has A's (A == 'good score').
>> Is that it?
>
> No, students are not receiving a hard A or an easy A. I make no
> classifications such as those you propose. My point is that questions are
> placed on the same scale as the ability being measured (called a theta
> scale). Grades may be mapped to the scale, though, but a hard A or easy A
> will not be assigned under the conditions described.
>
> Because all questions in the item bank have been linked, two students can
> take the same computer adaptive test but have no items in common between
> the two administrations. However, scores are on the same scale. Research
> has shown that even low ability students, despite their performance, prefer
> computer adaptive tests over static fixed-length tests. It has also been
> shown to lower test anxiety while serving the same purpose as fixed-length
> linear tests, in that educators are able to extract the same level of
> information about student achievement or aptitude without banging a
> student's head up against questions that he/she may have a very low
> probability of getting correct. The high ability students, instead of being
> bored, receive questions on the higher end of the theta scale that are
> appropriately matched to their ability, to challenge them.
>
>> That sounds like sweeping the dirt under the carpet. Students will know.
>> We have to prepare them to tackle failure as well as success.
>
> The item is appropriately matched for Examinee B because s/he has
> approximately a 50% chance of getting this one right - not a very high
> chance or a very low chance of getting it correct, but an equi-probable
> opportunity of either a success or a failure.

Two comments:

(1) You may find that targeting a higher probability of a correct response gives a better subjective experience, without significantly increasing the test length required to be confident of the score.

(2) You should track each question's history vs. the final score for the test-taker. This practice can help validate your scoring, as well as help you weed out mis-scored questions.

--Scott David Daniels
[EMAIL PROTECTED]
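[One way to act on suggestion (2) is to store each examinee's 0/1 outcome per item alongside the final score, then compute a point-biserial correlation per item; items that correlate weakly or negatively with the total are candidates for weeding out. A minimal sketch under those assumptions; none of this is from damon's actual system:

    import math

    def point_biserial(item_correct, final_scores):
        """Correlation between a 0/1 item outcome and examinees' final scores."""
        n = len(final_scores)
        p = sum(item_correct) / float(n)       # proportion who got the item right
        mean_all = sum(final_scores) / float(n)
        var_all = sum((s - mean_all) ** 2 for s in final_scores) / float(n)
        if p in (0.0, 1.0) or var_all == 0.0:
            return 0.0                         # no variation, nothing to correlate
        right = [s for s, c in zip(final_scores, item_correct) if c]
        mean_right = sum(right) / float(len(right))
        return (mean_right - mean_all) / math.sqrt(var_all) * math.sqrt(p / (1.0 - p))

    # five examinees: did they get this item right, and their final scores
    r = point_biserial([1, 1, 0, 1, 0], [90, 85, 40, 75, 55])   # strongly positive
]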
Re: [Edu-sig] Python Programming: Procedural Online Test
Could it be argued that the goal be for all students to score 100% on the desired content?

Rodrigo Senra said:
>
> On 5 Dec 2005, at 7:50 AM, damon bryant wrote:
>
>> One of the main reasons I decided to use an Item Response Theory (IRT)
>> framework was that the testing platform, once fully operational, will not
>> give students questions that are either too easy or too difficult for them,
>> thus reducing anxiety and boredom for low and high ability students,
>> respectively. In other words, high ability students will be challenged with
>> more difficult questions and low ability students will receive questions
>> that are challenging but matched to their ability.
>
> So far so good...
>
>> Each score is on the same scale, although some students will not
>> receive the same questions. This is the beautiful thing!
>
> I'd like to respectfully disagree. I'm afraid that would cause more
> harm than good. One side of student evaluation is to give feedback *for*
> the students. That is a relative measure: his/her performance against
> his/her peers.
>
> If I understood correctly, the proposal is to give a "hard"-A for some
> and an "easy"-A for others, so everybody has A's (A == 'good score').
> Is that it? That sounds like sweeping the dirt under the carpet.
> Students will know. We have to prepare them to tackle failure as well
> as success.
>
> I do not mean such efforts are not worthy, quite the reverse. But I
> strongly disagree with an adaptive scale. There should be a single scale
> for the whole spectrum of tests. If some students excel, their results
> must show this, and if some students perform poorly, that should not be
> hidden from them. Give them a goal and the means to pursue their goal.
>
> If I got your proposal all wrong, I apologize ;o)
>
> best regards,
> Senra
>
> Rodrigo Senra
> __
> rsenra @ acm.org
> http://rodrigo.senra.nom.br

S c o t t  J. D u r k i n
Computer Science
Preston Junior High
[EMAIL PROTECTED]
http://staffweb.psdschools.org/sdurkin
Computer Room, Lab N205
970.419.7358
2005-2006
scott.james.durkin
Re: [Edu-sig] Python Programming: Procedural Online Test
Hi Rodrigo!

> If I understood correctly, the proposal is to give a "hard"-A for some
> and an "easy"-A for others, so everybody has A's (A == 'good score').
> Is that it?

No, students are not receiving a hard A or an easy A. I make no classifications such as those you propose. My point is that questions are placed on the same scale as the ability being measured (called a theta scale). Grades may be mapped to the scale, though, but a hard A or easy A will not be assigned under the conditions described.

Because all questions in the item bank have been linked, two students can take the same computer adaptive test but have no items in common between the two administrations. However, scores are on the same scale. Research has shown that even low ability students, despite their performance, prefer computer adaptive tests over static fixed-length tests. It has also been shown to lower test anxiety while serving the same purpose as fixed-length linear tests, in that educators are able to extract the same level of information about student achievement or aptitude without banging a student's head up against questions that he/she may have a very low probability of getting correct. The high ability students, instead of being bored, receive questions on the higher end of the theta scale that are appropriately matched to their ability, to challenge them.

> That sounds like sweeping the dirt under the carpet. Students will know.
> We have to prepare them to tackle failure as well as success.

In fact, computer adaptive tests are designed to administer items to a person of a SPECIFIC ability that will yield a 50/50 chance of correctly responding. For example, there are two examinees: Examinee A has a true theta of -1.5, and Examinee B has a true theta of 1.5. The theta scale has a typical range of -3 to 3. There is a question that has been mapped to the theta scale, and it has a difficulty value of 1.5; how we estimate this is beyond our discussion, but it is relatively easy to do with Python. The item is appropriately matched for Examinee B because s/he has approximately a 50% chance of getting this one right - not a very high chance or a very low chance of getting it correct, but an equi-probable opportunity of either a success or a failure.

According to sampling theory, with multiple administrations of this item to a population of persons with a theta of 1.5, there will be an approximately equal number of successes and failures on this item, because the odds of getting it correct vs. incorrect are equal. However, with multiple administrations of this same item to a population of examinees with a theta of -1.5, which is substantially lower than 1.5, there will be exceedingly more failures than successes. Adaptive test algorithms seek to maximize information about examinees by estimating their ability and searching for questions in the item bank that match their ability levels, thus providing a 50/50 chance of getting it right.

This is very different from administering a test where the professor seeks an average score of 50%, because low ability students will get the vast majority of questions wrong, which could potentially increase anxiety, decrease self-efficacy, and lower the chance of acquiring information in subsequent teaching sessions (Bandura, self-regulation). Adaptive testing is able to mitigate the psychological influences of testing on examinees by seeking to provide equal opportunities for both high and low ability students to experience success and failure to the same degree, by giving them items that are appropriately matched to their skill level. This is the aspect of adaptive testing that is attractive to me. It may not solve the problem, but it is a way of using technology to move in the right direction. I hope this is a better explanation than what I provided earlier.

>From: Rodrigo Senra <[EMAIL PROTECTED]>
>To: edu-sig@python.org
>Subject: Re: [Edu-sig] Python Programming: Procedural Online Test
>Date: Mon, 5 Dec 2005 19:53:00 -0200
>
>On 5 Dec 2005, at 7:50 AM, damon bryant wrote:
>
>> One of the main reasons I decided to use an Item Response Theory (IRT)
>> framework was that the testing platform, once fully operational, will not
>> give students questions that are either too easy or too difficult for them,
>> thus reducing anxiety and boredom for low and high ability students,
>> respectively. In other words, high ability students will be challenged with
>> more difficult questions and low ability students will receive questions
>> that are challenging but matched to their ability.
>
>So far so good...
>
>> Each score is on the same scale, although some students will not
>> receive the same questions. This is the beautiful thing!
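[To put damon's Examinee A / Examinee B arithmetic in code: the logistic item response function below is the standard IRT form; the discrimination value a = 1.0 is an assumption for illustration, and the operational model behind the demo may differ:

    import math

    def p_correct(theta, b, a=1.0):
        """2-parameter logistic IRT model: probability of a correct response.
        theta = examinee ability, b = item difficulty, a = item discrimination."""
        return 1.0 / (1.0 + math.exp(-a * (theta - b)))

    p_correct(1.5, 1.5)     # Examinee B on the b = 1.5 item: exactly 0.50
    p_correct(-1.5, 1.5)    # Examinee A on the same item: about 0.05
]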
Re: [Edu-sig] Python Programming: Procedural Online Test
On 5 Dec 2005, at 7:50 AM, damon bryant wrote:

> One of the main reasons I decided to use an Item Response Theory (IRT)
> framework was that the testing platform, once fully operational, will not
> give students questions that are either too easy or too difficult for them,
> thus reducing anxiety and boredom for low and high ability students,
> respectively. In other words, high ability students will be challenged with
> more difficult questions and low ability students will receive questions
> that are challenging but matched to their ability.

So far so good...

> Each score is on the same scale, although some students will not
> receive the same questions. This is the beautiful thing!

I'd like to respectfully disagree. I'm afraid that would cause more harm than good. One side of student evaluation is to give feedback *for* the students. That is a relative measure: his/her performance against his/her peers.

If I understood correctly, the proposal is to give a "hard"-A for some and an "easy"-A for others, so everybody has A's (A == 'good score'). Is that it? That sounds like sweeping the dirt under the carpet. Students will know. We have to prepare them to tackle failure as well as success.

I do not mean such efforts are not worthy, quite the reverse. But I strongly disagree with an adaptive scale. There should be a single scale for the whole spectrum of tests. If some students excel, their results must show this, and if some students perform poorly, that should not be hidden from them. Give them a goal and the means to pursue their goal.

If I got your proposal all wrong, I apologize ;o)

best regards,
Senra

Rodrigo Senra
__
rsenra @ acm.org
http://rodrigo.senra.nom.br
Re: [Edu-sig] Python Programming: Procedural Online Test
One of the main reasons I decided to use an Item Response Theory (IRT) framework was that the testing platform, once fully operational, will not give students questions that are either too easy or too difficult for them, thus reducing anxiety and boredom for low and high ability students, respectively. In other words, high ability students will be challenged with more difficult questions, and low ability students will receive questions that are challenging but matched to their ability. Each score is on the same scale, although some students will not receive the same questions. This is the beautiful thing! That is the concept of adaptive or tailored testing being implemented in the Python Programming: Procedural Online Test (http://www.adaptiveassessmentservices.com).

After reading the comment on 50% being optimal for measurement theory, I have to say that about 90 years ago that was the best practice, in order to maximize item/test variance, which maximized the distribution of scores. This is primarily a World War I and II convention in developing selection tests, i.e., Alpha and Beta, used to place conscripts in appropriate combat roles. Those two tests are the predecessors of the SAT administered by the Educational Testing Service, which is the organization where most of the war psychologists who developed Alpha and Beta went after WW II. Because of their influence in selecting recruits, who after the war received money to go to college in the form of the GI Bill, these measurement specialists (psychometricians) did the same thing for ETS with the SAT, screening the same cohort for placement in colleges and universities around America. These psychologists had a strong influence on what constituted good practice in standardized testing. Accordingly, the practice of targeting 50% became well entrenched.

Later, IRT came on the scene in the early 1950s as an alternative to classical test theory, and it has some great theoretical and practical advantages over the previous approach of selecting items with a difficulty (p-value) of .50. The computing technology to implement the theory was not available then; it wasn't until the advent of the PC in the late 70s and early 80s that psychometricians like me were motivated to begin implementing IRT - once again, the armed services were at the forefront of the development in the late 70s. It will take another decade or so to break the hold that Classical Test Theory has on measurement, and expect students' test anxiety to remain high in the interim. But as more and more people begin to realize the benefits of IRT, especially computer adaptive testing, over CTT, it will no longer be an issue of what guidance should be used to administer and score tests.

>From: Chuck Allison <[EMAIL PROTECTED]>
>Reply-To: Chuck Allison <[EMAIL PROTECTED]>
>To: Laura Creighton <[EMAIL PROTECTED]>
>CC: edu-sig@python.org, Scott David Daniels <[EMAIL PROTECTED]>
>Subject: Re: [Edu-sig] Python Programming: Procedural Online Test
>Date: Mon, 5 Dec 2005 00:52:50 -0700
>
>Hello Laura,
>
>That's better than the Abstract Algebra class I took as an
>undergraduate. The highest score on Test 1 was 19%. I got 6%! I retook
>the class from another teacher and topped the class. Liked the subject
>so much I took the second semester just for fun. Testing and teaching
>strategies make a tremendous difference.
Re: [Edu-sig] Python Programming: Procedural Online Test
Hello Laura,

That's better than the Abstract Algebra class I took as an undergraduate. The highest score on Test 1 was 19%. I got 6%! I retook the class from another teacher and topped the class. Liked the subject so much I took the second semester just for fun. Testing and teaching strategies make a tremendous difference.

Sunday, December 4, 2005, 11:50:22 PM, you wrote:

LC> In a message of Sun, 04 Dec 2005 11:32:27 PST, Scott David Daniels writes:
>> I wrote:
>>
>>> ... keeping people at 80% correct is a great rule-of-thumb goal ...
>>
>> To elaborate on the statement above a bit, we did drill-and-practice
>> teaching (and had students loving it). The value of the 80% is for
>> maximal learning. Something like 50% is the best for measurement theory
>> (but discourages the student drastically). In graduate school I had
>> one instructor who tried to target his tests to get 50% as the average
>> mark. It was incredibly discouraging for most of the students (I
>> eventually came to be OK with it, but it took half the course).
LC>
LC> 'Discouraging' misses the mark. The University of Toronto has professors
LC> who like to test to 50% as well. And it causes suicides among undergraduates
LC> who are first exposed to this, unless there is adequate preparation. This
LC> is incredibly _dangerous_ stuff.
LC>
LC> Laura

--
Best regards,
Chuck
Re: [Edu-sig] Python Programming: Procedural Online Test
In a message of Sun, 04 Dec 2005 11:32:27 PST, Scott David Daniels writes:
> I wrote:
>
>> ... keeping people at 80% correct is a great rule-of-thumb goal ...
>
> To elaborate on the statement above a bit, we did drill-and-practice
> teaching (and had students loving it). The value of the 80% is for
> maximal learning. Something like 50% is the best for measurement theory
> (but discourages the student drastically). In graduate school I had
> one instructor who tried to target his tests to get 50% as the average
> mark. It was incredibly discouraging for most of the students (I
> eventually came to be OK with it, but it took half the course).

'Discouraging' misses the mark. The University of Toronto has professors who like to test to 50% as well. And it causes suicides among undergraduates who are first exposed to this, unless there is adequate preparation. This is incredibly _dangerous_ stuff.

Laura
Re: [Edu-sig] Python Programming: Procedural Online Test
I wrote:
>> ... keeping people at 80% correct is a great rule-of-thumb goal ...

To elaborate on the statement above a bit, we did drill-and-practice teaching (and had students loving it). The value of the 80% is for maximal learning. Something like 50% is the best for measurement theory (but discourages the student drastically). In graduate school I had one instructor who tried to target his tests to get 50% as the average mark. It was incredibly discouraging for most of the students (I eventually came to be OK with it, but it took half the course).

The hardest part to create is the courseware (including questions); the second-hardest effort is scoring the questions (rating the difficulty in all applicable strands). The software to deliver the questions was, in many senses, a less labor-intensive task (especially when amortized over a number of courses). I think we came up with at least a ten-to-one ratio (may have been higher, but definitely not lower) in effort compared to a new prep for a course by an instructor.

I am (and was) a programming, rather than an education, guy. I do not know the education theory behind our research well, but I know how a lot of the code worked (and know where some of our papers went). We kept an exponentially decaying model of the student's ability in each "strand" and used that to help the estimate of his score in the coming question "cloud." A simplified version of the same approach would be to have strand-specific questions, randomly pick a strand, and pick the "best" question for that student in that strand. Or, you could bias the choices between strands to give more balanced progress (increasing the probability of work where the student is weakest).

--Scott David Daniels
[EMAIL PROTECTED]
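[A sketch of the strand bookkeeping Scott describes: an exponentially decaying, recency-weighted mastery estimate per strand, with the next question biased toward the weakest strand. The decay constant, starting value, and strand names are assumptions, not the original IMSSS parameters:

    class Strand(object):
        """Recency-weighted estimate of mastery in one concept strand."""
        def __init__(self, decay=0.8):
            self.decay = decay
            self.score = 0.5                  # start mid-scale, uninformed

        def record(self, correct):
            # exponential decay: recent answers count more than old ones
            hit = 1.0 if correct else 0.0
            self.score = self.decay * self.score + (1.0 - self.decay) * hit

    strands = {'loops': Strand(), 'functions': Strand(), 'data types': Strand()}
    strands['loops'].record(True)
    strands['data types'].record(False)

    # bias the next question toward the strand where the student is weakest
    weakest = min(strands, key=lambda name: strands[name].score)   # 'data types'
]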
Re: [Edu-sig] Python Programming: Procedural Online Test
Scott:

I will attempt to incorporate your suggestion of keeping track of performance; I'll need to create some attributes on the examinee objects that will hold past test scores created within the system.

I am, however, approaching the scoring differently. Although I do report percentage correct, I'm using Item Response Theory to (1) score each question, (2) estimate ability using a Bayesian algorithm based on maximum likelihood, (3) estimate the error in the estimate of ability, and (4) select the most appropriate question to administer next. This is very similar to what is done at the Educational Testing Service in Princeton with the computer adaptive versions of the SAT and the GRE. I don't know the language used to develop their platform, but this one for the demo is developed in Python, using the numarray and multithreading modules to widen the bottlenecks and speed the delivery of test questions served in HTML format to the client's page.

Thanks for your comments! By the way, I am looking for teachers, preferably middle and high school, who would be willing to trial the system. I have another site where they will have the ability to enroll students, monitor testing status, and view scores for all students. Do you know of any?

>From: Scott David Daniels <[EMAIL PROTECTED]>
>To: edu-sig@python.org
>Subject: Re: [Edu-sig] Python Programming: Procedural Online Test
>Date: Sat, 03 Dec 2005 12:03:06 -0800
>
>damon bryant wrote:
>> As you got more items correct you got harder questions. In contrast,
>> if you initially got questions incorrect, you would have received
>> easier questions.
>
>In the 70s there was research on such systems (keeping people at 80%
>correct is a great rule-of-thumb goal). See the work done at Stanford's
>Institute for Mathematical Studies in the Social Sciences. At IMSSS
>we did lots of this kind of stuff. We generally broke the skills into
>strands (separate concepts), and kept track of the student's performance
>in each strand separately (try it; it helps). BIP (Basic Instructional
>Program) was an ONR (Office of Naval Research) sponsored system that
>tried to teach "programming in Basic." The BIP model (and often the
>"standard" IMSSS model) was to score every task in each strand, and find
>the "best" for the student based on his current position.
>For arithmetic, we actually generated problems based on the different
>desired strand properties; nobody was clever enough to generate software
>problems; we simply consulted our DB. We taught how to do proofs in
>Logic and Set Theory using some of these techniques.
>Names to look for on papers in the 70s-80s include Patrick Suppes (head
>of one side of IMSSS), Richard Atkinson (head of the other side),
>Barbara Searle, Avron Barr, and Marian Beard. These are not the only
>people who worked there, but a number I recall that should help you
>find the research publications (try Google Scholar).
>
>A follow-on for some of this work is:
>  http://www-epgy.stanford.edu/
>
>I worked there "back in the day" and was quite proud to be a part of
>some of that work.
>
>--Scott David Daniels
>[EMAIL PROTECTED]
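[For the curious, here is roughly what steps (2) and (3) above can look like: a maximum a posteriori (MAP) estimate of theta under a standard-normal prior, found by brute-force grid search. This is a stand-in sketch, not damon's actual algorithm, and the Rasch-style response function is an assumption:

    import math

    def p_correct(theta, b):
        """Rasch-style probability of a correct answer to an item of difficulty b."""
        return 1.0 / (1.0 + math.exp(-(theta - b)))

    def map_theta(responses):
        """MAP ability estimate from (difficulty, correct) pairs,
        assuming a standard-normal prior and a theta grid over [-3, 3]."""
        best_theta, best_log_post = 0.0, -1e308
        for step in range(601):                      # -3.00, -2.99, ..., 3.00
            theta = -3.0 + step * 0.01
            log_post = -0.5 * theta * theta          # log N(0,1) prior, up to a constant
            for b, correct in responses:
                p = p_correct(theta, b)
                log_post += math.log(p if correct else 1.0 - p)
            if log_post > best_log_post:
                best_theta, best_log_post = theta, log_post
        return best_theta

    map_theta([(0.0, True), (1.0, True), (2.0, False)])   # about 0.7
]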
Re: [Edu-sig] Python Programming: Procedural Online Test
damon bryant wrote:
> As you got more items correct you got harder questions. In contrast,
> if you initially got questions incorrect, you would have received
> easier questions.

In the 70s there was research on such systems (keeping people at 80% correct is a great rule-of-thumb goal). See the work done at Stanford's Institute for Mathematical Studies in the Social Sciences. At IMSSS we did lots of this kind of stuff. We generally broke the skills into strands (separate concepts), and kept track of the student's performance in each strand separately (try it; it helps). BIP (Basic Instructional Program) was an ONR (Office of Naval Research) sponsored system that tried to teach "programming in Basic." The BIP model (and often the "standard" IMSSS model) was to score every task in each strand, and find the "best" for the student based on his current position.

For arithmetic, we actually generated problems based on the different desired strand properties; nobody was clever enough to generate software problems; we simply consulted our DB. We taught how to do proofs in Logic and Set Theory using some of these techniques.

Names to look for on papers in the 70s-80s include Patrick Suppes (head of one side of IMSSS), Richard Atkinson (head of the other side), Barbara Searle, Avron Barr, and Marian Beard. These are not the only people who worked there, but a number I recall that should help you find the research publications (try Google Scholar).

A follow-on for some of this work is:
  http://www-epgy.stanford.edu/

I worked there "back in the day" and was quite proud to be a part of some of that work.

--Scott David Daniels
[EMAIL PROTECTED]
Re: [Edu-sig] Python Programming: Procedural Online Test
Kirby:

Thank you for your feedback! You completed the Declarative measure. I am also interested in your feedback on the Procedural test, in which Python application or procedural questions are administered. Questions on this part are coded and displayed as they appear in IDLE, with highlighted keywords and indentation for nested code. I think the longest code problem is about 20 lines. I appreciate your comment on the ability to specify font type/size, because I'm currently working to accommodate persons with disabilities and others who may have difficulty viewing the text.

Linda Grandel and I have an experimental site that we use for research and educational purposes; we are going to trial some Python questions for a class of her colleagues but are having some difficulty translating the testing in a short period. This is a long-term project. We have the goal of developing a worldwide database of Python test norms in an effort to track progress on the spread and proficiency of the language in different countries. Although it is a great idea, it is too large for a dissertation research project. If you are interested in trialing it with your class, perhaps we can collaborate.

You did notice that towards the end, questions got easier for you. The test algorithm is adaptive, but the question bank from which the items are pulled is not that large. In other words, the test presented items that were most appropriate for you when you began the test. As you got more items correct, you got harder questions. In contrast, if you initially got questions incorrect, you would have received easier questions. Because the bank is so small (I do have plans of expanding it when I get some more time on my hands), you exhausted the bank of difficult questions and began to receive easier items. The opposite would have happened to an examinee of very low ability.

My goal is to administer a computer adaptive Python test where examinees will only receive questions that are most appropriate for them. In other words, different examinees will be tested according to their ability. This goes back to Binet's idea of tailored testing, where the psychologist administering the intelligence test would give items to an examinee based on previous responses. In the present case, it's done by computer using an artificially intelligent algorithm based on my dissertation. By expanding the question bank, I'll be able to reach that goal.

>From: "Kirby Urner" <[EMAIL PROTECTED]>
>To: "'damon bryant'" <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
>CC: edu-sig@python.org
>Subject: RE: [Edu-sig] Python Programming: Procedural Online Test
>Date: Sat, 3 Dec 2005 07:44:32 -0800
>
>> I tweaked it now where all other browsers and OS combinations can access
>> the computer adaptive tests. Performance may be unpredictable though.
>>
>> Damon
>
>OK, thanks. Worked with no problems.
>
>As an administrator, I'd be curious to get the actual text of missed
>problems (maybe via URL), not just a raw percentage (I got 90%, i.e. 2 wrong
>-- probably the one about getting the current working directory, not sure
>which other).
>
>The problems seemed to get much easier in the last 5 or so (very basic
>syntax questions). The one about "James"=="james" returning -1 is no longer
>true on some Pythons (as now we have boolean True).
>
>The font used to pose the questions was a little distracting. I vastly
>prefer fixed width fonts when programming. I know that's a personal
>preference (some actually like variable pitch -- blech). Perhaps as a
>future enhancement, you could let the user customize the font?
>
>Anyway, a useful service. I could see teachers like me wanting to use this
>with our classes.
>
>Thank you for giving me this opportunity.
>
>Kirby
Re: [Edu-sig] Python Programming: Procedural Online Test
> I tweaked it now where all other browsers and OS combinations can access
> the computer adaptive tests. Performance may be unpredictable though.
>
> Damon

OK, thanks. Worked with no problems.

As an administrator, I'd be curious to get the actual text of missed problems (maybe via URL), not just a raw percentage (I got 90%, i.e. 2 wrong -- probably the one about getting the current working directory, not sure which other).

The problems seemed to get much easier in the last 5 or so (very basic syntax questions). The one about "James"=="james" returning -1 is no longer true on some Pythons (as now we have boolean True).

The font used to pose the questions was a little distracting. I vastly prefer fixed width fonts when programming. I know that's a personal preference (some actually like variable pitch -- blech). Perhaps as a future enhancement, you could let the user customize the font?

Anyway, a useful service. I could see teachers like me wanting to use this with our classes.

Thank you for giving me this opportunity.

Kirby
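[A side note on the "James"=="james" item: == itself never returns -1; it returned 0/1 before Python 2.3 and False/True after. The -1 presumably comes from cmp()-style three-way comparison, though that is a guess, since the actual question text isn't in the thread (cmp() exists only in Python 2):

    "James" == "james"       # False since Python 2.3 (0 in older Pythons), never -1
    cmp("James", "james")    # -1 in Python 2: 'J' (ASCII 74) sorts before 'j' (106)
]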
Re: [Edu-sig] Python Programming: Procedural Online Test
I tweaked it now where all other browsers and OS combinations can access the computer adaptive tests. Performance may be unpredictable though.

Damon

>From: "Kirby Urner" <[EMAIL PROTECTED]>
>To: "'Vern Ceder'" <[EMAIL PROTECTED]>, "'damon bryant'" <[EMAIL PROTECTED]>
>CC: edu-sig@python.org
>Subject: RE: [Edu-sig] Python Programming: Procedural Online Test
>Date: Fri, 2 Dec 2005 19:50:36 -0800
>
>Similar comment. I'm on Windows but don't want to be tested by a service
>that won't let me use FireFox. I have tests too.
>
>Kirby
>
>> -----Original Message-----
>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
>> Behalf Of Vern Ceder
>> Sent: Friday, December 02, 2005 2:38 PM
>> To: damon bryant
>> Cc: edu-sig@python.org
>> Subject: Re: [Edu-sig] Python Programming: Procedural Online Test
>>
>> In my opinion, you would get more responses if the testing system
>> accepted a browser/OS combination other than IE/Windows.
>>
>> Cheers,
>> Vern Ceder (using Firefox and Ubuntu Linux)
Re: [Edu-sig] Python Programming: Procedural Online Test
Similar comment. I'm on Windows but don't want to be tested by a service that won't let me use FireFox. I have tests too.

Kirby

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
> Behalf Of Vern Ceder
> Sent: Friday, December 02, 2005 2:38 PM
> To: damon bryant
> Cc: edu-sig@python.org
> Subject: Re: [Edu-sig] Python Programming: Procedural Online Test
>
> In my opinion, you would get more responses if the testing system
> accepted a browser/OS combination other than IE/Windows.
>
> Cheers,
> Vern Ceder (using Firefox and Ubuntu Linux)
Re: [Edu-sig] Python Programming: Procedural Online Test
In my opinion, you would get more responses if the testing system accepted a browser/OS combination other than IE/Windows.

Cheers,
Vern Ceder (using Firefox and Ubuntu Linux)

damon bryant wrote:
> Hey folks!
>
> Lindel Grandel and I have been working on some Python questions for
> potential use in high schools, college, and employment. If you are
> interested in taking one of the online tests go to
> http://www.adaptiveassessmentservices.com and self-register to take one of
> two Python tests: one is Declarative (knowledge of built-in data types and
> functions) and the other is Procedural (application of loops, import,
> functions,...,etc.). I would like to know what you think. Thanks in advance!
>
> Damon

--
This time for sure! -Bullwinkle J. Moose
-
Vern Ceder, Director of Technology
Canterbury School, 3210 Smith Road, Ft Wayne, IN 46804
[EMAIL PROTECTED]; 260-436-0746; FAX: 260-436-5137
[Edu-sig] Python Programming: Procedural Online Test
Hey folks!

Lindel Grandel and I have been working on some Python questions for potential use in high schools, college, and employment. If you are interested in taking one of the online tests, go to http://www.adaptiveassessmentservices.com and self-register to take one of two Python tests: one is Declarative (knowledge of built-in data types and functions) and the other is Procedural (application of loops, import, functions,...,etc.). I would like to know what you think. Thanks in advance!

Damon