Re: [Edu-sig] Python Programming: Procedural Online Test

2005-12-06 Thread Scott Durkin
Damon,

Thank you for your thoughtful response.  In terms of the Python tests, I
as well would hope that all my students (13- to 15-year-olds) could answer
questions based on the content shared - kind of in the spirit of the
Computing for All/Core Knowledge (NoChildLeftBehind-ish? - not playing
"gotcha", but here is the information we expect you to know, do you know
it? can you apply it?) approach (along with opportunities for students to
display and be recognized for comprehension and ability above and beyond
what was expressly expected within the realm of the standard curriculum) -
as you indicated with the phrase "training program" in the first paragraph
of your response.  As far as the assessment of the distributed-ability
issues (primarily expressed in your second paragraph), I will definitely
leave that to the educational psychologists and to whatever is being
measured - perhaps that which lies beyond the curriculum.

Thanks again,

Scott

damon bryant said:
>>Could it be argued that the goal be for all students to score 100% on the
>>desired content?
>>
>
> I would argue that it should be one of the goals in designing and
> implementing a training program. The test could have a different purpose.
> What we all have experienced in teaching students is that ability is
> distributed; more than likely that distribution is normal for whatever
> reason, and the variation of scores within the distribution can be tight
> (e.g., SAT quantitative scores at Rice) or loose (e.g., SAT quantitative
> scores at a junior college, assuming that the SAT is a requirement).
>
> Psychological tests and measures can give us an indication of where
> students
> stand in a distribution (norm-referenced testing) or where each student's
> achievement level is relative to some absolute performance criterion
> (criterion-referenced testing) before, during, or after training. In other
> words, it depends on the purpose of testing, which is determined before it
> is designed and is a major evaluation point of its validity or accuracy in
> doing what it purports to do.
>
> Damon
>
>
>



   S  c  o  t  t   J.   D  u  r  k  i  n

   Computer Science -- Preston Junior High
   [EMAIL PROTECTED]    http://staffweb.psdschools.org/sdurkin

   [ASCII-art signature: computer lab drawing]
   Computer Lab, Room N205

   970.419.7358    2005-2006

   scott.james.durkin


___
Edu-sig mailing list
Edu-sig@python.org
http://mail.python.org/mailman/listinfo/edu-sig


Re: [Edu-sig] Python Programming: Procedural Online Test

2005-12-06 Thread damon bryant
>Could it be argued that the goal be for all students to score 100% on the
>desired content?
>

I would argue that it should be one of the goals in designing and 
implementing a training program. The test could have a different purpose. 
What we all have experienced in teaching students is that ability is 
distributed; more than likely that distribution is normal for whatever 
reason, and the variation of scores within the distribution can be tight 
(e.g., SAT quantitative scores at Rice) or loose (e.g., SAT quantitative 
scores at a junior college, assuming that the SAT is a requirement).

Psychological tests and measures can give us an indication of where students 
stand in a distribution (norm-referenced testing) or where each student's 
achievement level is relative to some absolute performance criterion 
(criterion-referenced testing) before, during, or after training. In other 
words, it depends on the purpose of testing, which is determined before it 
is designed and is a major evaluation point of its validity or accuracy in 
doing what it purports to do.
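
(A small illustration of the two reference points; the cohort scores and the
70% cut score below are made up for the example.)

cohort = [55, 62, 68, 71, 74, 78, 81, 85, 90, 94]   # class raw scores, in percent
student = 74

# Norm-referenced: where does this student stand within the distribution?
percentile = 100.0 * sum(s < student for s in cohort) / len(cohort)

# Criterion-referenced: does the student clear an absolute performance criterion?
cut_score = 70
meets_criterion = student >= cut_score

print("percentile rank: %.0f, meets criterion: %s" % (percentile, meets_criterion))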

Damon


___
Edu-sig mailing list
Edu-sig@python.org
http://mail.python.org/mailman/listinfo/edu-sig


Re: [Edu-sig] Python Programming: Procedural Online Test

2005-12-06 Thread Rodrigo Senra

[ Scott Durkin ]:

> Could it be argued that the goal be for all students to score 100%  
> on the
> desired content?

That is precisely my goal when I prepare exams. No success so far ;o)


[ Damon Bryant ]:

> No, students are not receiving a hard A or an easy A. I make no  
> classifications such as those you propose. My point is that  
> questions are placed on the same scale as the ability being  
> measured (called a theta scale). Grades may be mapped to the scale  
> though, but a hard A or easy A will not be assigned under  
> aforementioned conditions described.
>
> Because all questions in the item bank have been linked, two  
> students can take the same computer adaptive test but have no items  
> in common between the two administrations. However, scores are on  
> the same scale.

Thank you for taking the trouble to explain it further.

Abração,
Senra


Rodrigo Senra
__
rsenra @ acm.org
http://rodrigo.senra.nom.br




___
Edu-sig mailing list
Edu-sig@python.org
http://mail.python.org/mailman/listinfo/edu-sig


Re: [Edu-sig] Python Programming: Procedural Online Test

2005-12-06 Thread damon bryant

Total does make more sense. I've made the change to "total". Thanks, Scott!



From: Scott David Daniels <[EMAIL PROTECTED]>
To: edu-sig@python.org
Subject: Re: [Edu-sig] Python Programming: Procedural Online Test
Date: Tue, 06 Dec 2005 13:23:52 -0800

damon bryant wrote:
...
> I have corrected the issue with the use of 'sum' (now ‘sum1’) and the
I'd suggest "total" would be a better replacement than sum1.

--Scott David Daniels
[EMAIL PROTECTED]

___
Edu-sig mailing list
Edu-sig@python.org
http://mail.python.org/mailman/listinfo/edu-sig



___
Edu-sig mailing list
Edu-sig@python.org
http://mail.python.org/mailman/listinfo/edu-sig


Re: [Edu-sig] Python Programming: Procedural Online Test

2005-12-06 Thread Scott David Daniels
damon bryant wrote:
...
> I have corrected the issue with the use of 'sum' (now ‘sum1’) and the 
I'd suggest "total" would be a better replacement than sum1.

--Scott David Daniels
[EMAIL PROTECTED]

___
Edu-sig mailing list
Edu-sig@python.org
http://mail.python.org/mailman/listinfo/edu-sig


Re: [Edu-sig] Python Programming: Procedural Online Test

2005-12-06 Thread damon bryant

Thanks, Wesley!

If the item bank were larger, you would not have received easier questions 
at the end. You would have gotten more difficult questions. The bank for the 
demo is quite small, so you exhausted all of the difficult ones first 
because your ability initially mapped on to the difficult portion of the 
scale. The algorithm is quite efficient in determining where you are on the 
scale after about 3 - 5 questions.  In a test with a larger bank, you would 
have received more difficult questions as long as you kept getting them 
right. The test would finally terminate after 20 questions have been
administered. The alpha of the test, a psychometric term for reliability, is
estimated to be .92 or higher with this number of items in a well-designed
computer adaptive test.
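
(A toy sketch in Python of the select-and-stop logic described above; the
one-parameter Rasch model, the fixed-step ability update, the item bank, and
the simulated examinee are illustrative assumptions, not the demo's actual
algorithm.)

import math
import random

def p_correct(theta, difficulty):
    # Rasch (one-parameter logistic) chance that an examinee at `theta`
    # answers an item of the given difficulty correctly.
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def adaptive_test(true_theta, bank, max_items=20, step=0.6):
    # Administer at most `max_items` items, always choosing the unused item
    # whose difficulty is closest to the current ability estimate.
    estimate, remaining = 0.0, list(bank)
    for _ in range(max_items):
        if not remaining:            # a small demo bank can run dry, as described above
            break
        item = min(remaining, key=lambda b: abs(b - estimate))
        remaining.remove(item)
        correct = random.random() < p_correct(true_theta, item)
        estimate += step if correct else -step   # crude update; a real CAT re-estimates theta properly
    return estimate

random.seed(0)
bank = [i / 4.0 for i in range(-12, 13)]         # item difficulties spanning -3.0 to +3.0
print(round(adaptive_test(true_theta=1.5, bank=bank), 2))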


I have corrected the issue with the use of 'sum' (now 'sum1') and the syntax
error with 'True:' (now 'True'); that was a good catch! On a different note,
I thought that designing this trial version of the system in Python would
increase the time needed to serve the questions to the client. I guess that
using numarray and multithreading to do the heavy lifting on the back end has
made it 'fast enough' for operational use. What do you think?
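
(The actual test items are not reproduced in this thread; the snippet below is
a hypothetical before/after along the lines Wesley and Scott describe, with
guessed variable names.)

values = [3, 1, 4, 1, 5]

total = 0                 # was: sum = 0, which shadows the built-in sum()
for v in values:
    total += v

flags = [None] * 5        # i.e. [None, None, None, None, None]
for i in range(len(flags)):
    flags[i] = True       # was: flags[i] = True:  -- the stray colon is a syntax error

print(total, flags)       # 14 [True, True, True, True, True]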





From: w chun <[EMAIL PROTECTED]>
To: damon bryant <[EMAIL PROTECTED]>
CC: edu-sig@python.org
Subject: Re: [Edu-sig] Python Programming: Procedural Online Test
Date: Mon, 5 Dec 2005 23:46:32 -0800

> The problems seemed to get much easier in the last 5 or so (very basic
> syntax questions).  The one about "James"=="james" returning -1 is no longer
> true on some Pythons (as now we have boolean True).


the tests were well done... i enjoyed taking them.  like kirby, i also
found the Boolean issue.  in the procedural test, i found a syntax
error... i think the question with the [None] * 5... (well, [None,
None, None, None, None] actually), where you're setting "x[b[i] =
True:" ... that colon shouldn't be there.  there was/were also
question(s) which used sum as a variable name.  that is a built-in
function that is hidden if used.  interestingly enough, your syntax
checker actually highlighted it too.  :-)

cheers,
-- wesley
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"Core Python Programming", Prentice Hall, (c)2006,2001
http://corepython.com

wesley.j.chun :: wescpy-at-gmail.com
cyberweb.consulting : silicon valley, ca
http://cyberwebconsulting.com



___
Edu-sig mailing list
Edu-sig@python.org
http://mail.python.org/mailman/listinfo/edu-sig


Re: [Edu-sig] Python Programming: Procedural Online Test

2005-12-06 Thread Laura Creighton
>[...] thus providing a 50/50 chance of getting it right.
>
>This is very different from administering a test where the professor seeks
>to have an average score of 50%, because low ability students will get the
>vast majority of questions wrong, which could potentially increase anxiety,
>decrease self-efficacy, and lower the chance of acquiring information in
>subsequent teaching sessions (Bandura, self-regulation). Adaptive testing is
>able to mitigate the psychological influences of testing on examinees by
>seeking to provide equal opportunities for both high and low ability
>students to experience success and failure to the same degree by getting
>items that are appropriately matched to their skill level. This is the
>aspect of adaptive testing that is attractive to me. It may not solve the
>problem, but it is a way of using technology to move in the right direction.
>I hope this is a better explanation than what I provided earlier.
>
>
>
>>From: Rodrigo Senra <[EMAIL PROTECTED]>
>>To: edu-sig@python.org
>>Subject: Re: [Edu-sig] Python Programming: Procedural Online Test
>>Date: Mon, 5 Dec 2005 19:53:00 -0200
>>
>>
>>On 5Dec 2005, at 7:50 AM, damon bryant wrote:
>>
>> > One of the main reasons I decided to use an Item Response Theory (IRT
>)
>> > framework was that the testing platform, once fully operational,
>> > will not
>> > give students questions that are either too easy or too difficult
>> > for them,
>> > thus reducing anxiety and boredom for low and high ability students,
>> > respectively. In other words, high ability students will be
>> > challenged with
>> > more difficult questions and low ability students will receive
>> > questions
>> > that are challenging but matched to their ability.
>>
>>So far so good...
>>
>> > Each score is on the same scale, although some students will not
>> > receive the same questions. This is the beautiful thing!
>>
>>I'd like to respectfully disagree. I'm afraid that would cause more
>>harm than good.
>>One side of student evaluation is to give feedback *for* the
>>students. That is a
>>relative measure, his/her performance against his/her peers.
>>
>>If I understood correctly the proposal is to give a "hard"-A for some
>>and an "easy"-A
>>for others, so everybody has A's (A=='good score'). Is that it?
>>That sounds like
>>sweeping the dirt under the carpet. Students will know. We have to
>>prepare them to
>>tackle failure as well as success.
>>
>>I do not mean such efforts are not worthy, quite the reverse. But I
>>strongly disagree
>>with an adaptive scale. There should be a single scale for the whole
>>spectrum of tests.
>>If some students excel their results must show this, as well as if
>>some students perform
>>poorly that should not be hidden from them. Give them a goal and the
>>means to pursue
>>their goal.
>>
>>If I got your proposal all wrong, I apologize ;o)
>>
>>best regards,
>>Senra
>>
>>
>>Rodrigo Senra
>>__
>>rsenra @ acm.org
>>http://rodrigo.senra.nom.br
>>
>>
>>
>>
>>___
>>Edu-sig mailing list
>>Edu-sig@python.org
>>http://mail.python.org/mailman/listinfo/edu-sig
>
>
>___
>Edu-sig mailing list
>Edu-sig@python.org
>http://mail.python.org/mailman/listinfo/edu-sig
___
Edu-sig mailing list
Edu-sig@python.org
http://mail.python.org/mailman/listinfo/edu-sig


Re: [Edu-sig] Python Programming: Procedural Online Test

2005-12-05 Thread w chun
> The problems seemed to get much easier in the last 5 or so (very basic
> syntax questions).  The one about "James"=="james" returning -1 is no longer
> true on some Pythons (as now we have boolean True).


the tests were well done... i enjoyed taking them.  like kirby, i also
found the Boolean issue.  in the procedural test, i found a syntax
error... i think the question with the [None] * 5... (well, [None,
None, None, None, None] actually), where you're setting "x[b[i] =
True:" ... that colon shouldn't be there.  there was/were also
question(s) which used sum as a variable name.  that is a built-in
function that is hidden if used.  interestingly enough, your syntax
checked actually highlighted it too.  :-)

cheers,
-- wesley
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"Core Python Programming", Prentice Hall, (c)2006,2001
http://corepython.com

wesley.j.chun :: wescpy-at-gmail.com
cyberweb.consulting : silicon valley, ca
http://cyberwebconsulting.com
___
Edu-sig mailing list
Edu-sig@python.org
http://mail.python.org/mailman/listinfo/edu-sig


Re: [Edu-sig] Python Programming: Procedural Online Test

2005-12-05 Thread Scott David Daniels
damon bryant wrote:
> Hi Rodrigo!
> 
>> If I understood correctly the proposal is to give a "hard"-A for some
>> and an "easy"-A
>> for others, so everybody has A's (A=='good score'). Is that it?
> 
> No, students are not receiving a hard A or an easy A. I make no 
> classifications such as those you propose. My point is that questions are 
> placed on the same scale as the ability being measured (called a theta 
> scale). Grades may be mapped to the scale though, but a hard A or easy A 
> will not be assigned under the aforementioned conditions.
> 
> Because all questions in the item bank have been linked, two students can 
> take the same computer adaptive test but have no items in common between the 
> two administrations. However, scores are on the same scale. Research has 
> shown that even low ability students, despite their performance, prefer 
> computer adaptive tests over static fixed-length tests. It has also been 
> shown to lower test anxiety while serving the same purpose as fixed-length 
> linear tests in that educators are able to extract the same level of 
> information about student achievement or aptitude without banging a 
> student's head up against questions that he/she may have a very low 
> probability of getting correct. The high ability students, instead of being 
> bored, are receiving questions on the higher end of the theta scale that are 
> appropriately matched to their ability to challenge them.
> 
>> That sounds like
>> sweeping the dirt under the carpet. Students will know. We have to
>> prepare them to
>> tackle failure as well as success.
> 
> The item is appropriately matched for Examinee B because s/he has
> approximately a 50% chance of getting this one right - not a very high
> chance or a very low chance of getting it correct but an equiprobable
> opportunity of either a success or a failure.

Two comments:
   (1) You may find that targeting a higher probability of a correct answer
   gives a better subjective experience without significantly increasing
   the length of the test required to be confident of the score.

   (2) You should track each question's history vs. the final score for
   the test-taker.  This practice can help validate your scoring,
   as well as help you in weeding out mis-scored questions.

--Scott David Daniels
[EMAIL PROTECTED]

___
Edu-sig mailing list
Edu-sig@python.org
http://mail.python.org/mailman/listinfo/edu-sig


Re: [Edu-sig] Python Programming: Procedural Online Test

2005-12-05 Thread Scott Durkin
Could it be argued that the goal be for all students to score 100% on the
desired content?


Rodrigo Senra said:
>
> On 5Dec 2005, at 7:50 AM, damon bryant wrote:
>
>> One of the main reasons I decided to use an Item Response Theory (IRT)
>> framework was that the testing platform, once fully operational,
>> will not
>> give students questions that are either too easy or too difficult
>> for them,
>> thus reducing anxiety and boredom for low and high ability students,
>> respectively. In other words, high ability students will be
>> challenged with
>> more difficult questions and low ability students will receive
>> questions
>> that are challenging but matched to their ability.
>
> So far so good...
>
>> Each score is on the same scale, although some students will not
>> receive the same questions. This is the beautiful thing!
>
> I'd like to respectfully disagree. I'm afraid that would cause more
> harm than good.
> One side of student evaluation is to give feedback *for* the
> students. That is a
> relative measure, his/her performance against his/her peers.
>
> If I understood correctly the proposal is to give a "hard"-A for some
> and an "easy"-A
> for others, so everybody has A's (A=='good score'). Is that it?
> That sounds like
> sweeping the dirt under the carpet. Students will know. We have to
> prepare them to
> tackle failure as well as success.
>
> I do not mean such efforts are not worthy, quite the reverse. But I
> strongly disagree
> with an adaptive scale. There should be a single scale for the whole
> spectrum of tests.
> If some students excel their results must show this, as well as if
> some students perform
> poorly that should not be hidden from them. Give them a goal and the
> means to pursue
> their goal.
>
> If I got your proposal all wrong, I apologize ;o)
>
> best regards,
> Senra
>
>
> Rodrigo Senra
> __
> rsenra @ acm.org
> http://rodrigo.senra.nom.br
>
>
>
>
> ___
> Edu-sig mailing list
> Edu-sig@python.org
> http://mail.python.org/mailman/listinfo/edu-sig
>



   S  c  o  t  t   J.   D  u  r  k  i  n

   Computer Science -- Preston Junior High
   [EMAIL PROTECTED]    http://staffweb.psdschools.org/sdurkin

   [ASCII-art signature: computer lab drawing]
   Computer Lab, Room N205

   970.419.7358    2005-2006

   scott.james.durkin


___
Edu-sig mailing list
Edu-sig@python.org
http://mail.python.org/mailman/listinfo/edu-sig


Re: [Edu-sig] Python Programming: Procedural Online Test

2005-12-05 Thread damon bryant

Hi Rodrigo!

>If I understood correctly the proposal is to give a "hard"-A for some
>and an "easy"-A
>for others, so everybody has A's (A=='good score'). Is that it?

No, students are not receiving a hard A or an easy A. I make no 
classifications such as those you propose. My point is that questions are 
placed on the same scale as the ability being measured (called a theta 
scale). Grades may be mapped to the scale though, but a hard A or easy A 
will not be assigned under the aforementioned conditions.

Because all questions in the item bank have been linked, two students can 
take the same computer adaptive test but have no items in common between the 
two administrations. However, scores are on the same scale. Research has 
shown that even low ability students, despite their performance, prefer 
computer adaptive tests over static fixed-length tests. It has also been 
shown to lower test anxiety while serving the same purpose as fixed-length 
linear tests in that educators are able to extract the same level of 
information about student achievement or aptitude without banging a 
student's head up against questions that he/she may have a very low 
probability of getting correct. The high ability students, instead of being 
bored, are receiving questions on the higher end of the theta scale that are 
appropriately matched to their ability to challenge them.

>That sounds like
>sweeping the dirt under the carpet. Students will know. We have to
>prepare them to
>tackle failure as well as success.

In fact computer adaptive tests are designed to administer items to a person 
of a SPECIFIC ability that will yield a 50/50 chance of correctly 
responding. For example, there are two examinees: Examinee A has a true 
theta of -1.5, and Examinee B has a true theta of 1.5. The theta scale has a 
typical range of -3 to 3. There is a question that has been mapped to the 
theta scale and it has a difficulty value of 1.5; how we estimate this is
beyond our discussion, but it is relatively easy to do with Python. The item is
appropriately matched for Examinee B because s/he has approximately a 50%
chance of getting this one right - not a very high chance or a very low
chance of getting it correct but an equiprobable opportunity of either a
success or a failure.

According to sampling theory, with multiple administrations of this item to 
a population of persons with a theta of 1.5, there will be an approximately 
equal number of successes and failures on this item, because the odds of 
getting it correct vs. incorrect are equal. However, with multiple 
administrations of this same item to a population of examinees with a theta 
of -1.5, which is substantially lower than 1.5, there will be exceedingly 
more failures than successes. Adaptive test algorithms seek to maximize 
information about examinees by estimating their ability and searching for 
questions in the item bank that match their ability levels, thus providing a 
50/50 chance of getting it right.
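
(A minimal numeric sketch of that 50/50 matching point, assuming the
one-parameter Rasch logistic model; the model choice is an illustrative
assumption, not a statement about the demo's actual item parameters.)

import math

def p_correct(theta, difficulty):
    # Rasch model: probability of a correct response at ability `theta`.
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

item_difficulty = 1.5
for theta in (-1.5, 1.5):
    print("theta=%+.1f  P(correct)=%.2f" % (theta, p_correct(theta, item_difficulty)))
# theta=-1.5  P(correct)=0.05   (Examinee A: far more failures than successes)
# theta=+1.5  P(correct)=0.50   (Examinee B: the even-odds match described above)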

This is very different from administering a test where the professor seeks 
to have an average score of 50%, because low ability students will get the 
vast majority of questions wrong, which could potentially increase anxiety, 
decrease self-efficacy, and lower the chance of acquiring information in 
subsequent teaching sessions (Bandura, self regulation). Adaptive testing is 
able to mitigate the psychological influences of testing on examinees by 
seeking to provide equal opportunities for both high and low ability 
students to experience success and failure to the same degree by getting 
items that are appropriately matched to their skill level. This is the 
aspect of adaptive testing that is attractive to me. It may not solve the 
problem, but it is a way of using technology to move in the right direction. 
I hope this is a better explanation than what I provided earlier.



>From: Rodrigo Senra <[EMAIL PROTECTED]>
>To: edu-sig@python.org
>Subject: Re: [Edu-sig] Python Programming: Procedural Online Test
>Date: Mon, 5 Dec 2005 19:53:00 -0200
>
>
>On 5Dec 2005, at 7:50 AM, damon bryant wrote:
>
> > One of the main reasons I decided to use an Item Response Theory (IRT)
> > framework was that the testing platform, once fully operational,
> > will not
> > give students questions that are either too easy or too difficult
> > for them,
> > thus reducing anxiety and boredom for low and high ability students,
> > respectively. In other words, high ability students will be
> > challenged with
> > more difficult questions and low ability students will receive
> > questions
> > that are challenging but matched to their ability.
>
>So far so good...
>
> > Each score is on the same scale, although some students will not
> > receive the same questions. This is the beautiful thing!
>
>I

Re: [Edu-sig] Python Programming: Procedural Online Test

2005-12-05 Thread Rodrigo Senra

On 5Dec 2005, at 7:50 AM, damon bryant wrote:

> One of the main reasons I decided to use an Item Response Theory (IRT)
> framework was that the testing platform, once fully operational,  
> will not
> give students questions that are either too easy or too difficult  
> for them,
> thus reducing anxiety and boredom for low and high ability students,
> respectively. In other words, high ability students will be  
> challenged with
> more difficult questions and low ability students will receive  
> questions
> that are challenging but matched to their ability.

So far so good...

> Each score is on the same scale, although some students will not
> receive the same questions. This is the beautiful thing!

I'd like to respectfully disagree. I'm afraid that would cause more  
harm than good.
One side of student evaluation is to give feedback *for* the  
students. That is a
relative measure, his/her performance against his/her peers.

If I understood correctly the proposal is to give a "hard"-A for some  
and an "easy"-A
for others, so everybody has A's (A=='good score'). Is that it?  
That sounds like
sweeping the dirt under the carpet. Students will know. We have to  
prepare them to
tackle failure as well as success.

I do not mean such efforts are not worthy, quite the reverse. But I  
strongly disagree
with an adaptive scale. There should be a single scale for the whole  
spectrum of tests.
If some students excel their results must show this, as well as if  
some students perform
poorly that should not be hidden from them. Give them a goal and the  
means to pursue
their goal.

If I got your proposal all wrong, I apologize ;o)

best regards,
Senra


Rodrigo Senra
__
rsenra @ acm.org
http://rodrigo.senra.nom.br




___
Edu-sig mailing list
Edu-sig@python.org
http://mail.python.org/mailman/listinfo/edu-sig


Re: [Edu-sig] Python Programming: Procedural Online Test

2005-12-05 Thread damon bryant
One of the main reasons I decided to use an Item Response Theory (IRT) 
framework was that the testing platform, once fully operational, will not 
give students questions that are either too easy or too difficult for them, 
thus reducing anxiety and boredom for low and high ability students, 
respectively. In other words, high ability students will be challenged with 
more difficult questions and low ability students will receive questions 
that are challenging but matched to their ability. Each score is on the same 
scale, although some students will not receive the same questions. This is 
the beautiful thing! That is the concept of adaptive or tailored testing 
being implemented in the Python Programming: Procedural Online Test 
(http://www.adaptiveassessmentservices.com).

After reading the comment on 50% being optimal for measurement theory, I
have to say that about 90 years ago that was the best practice, used to
maximize item/test variance, which maximized the spread of the score
distribution. This is primarily a World War I and II convention in developing
selection tests, i.e., Alpha and Beta, used to place conscripts in appropriate
combat roles. Those two tests are the predecessors of the SAT administered by
the Educational Testing Service, which is the organization where most of the
war psychologists who developed Alpha and Beta went after World War II.
Because of their influence in selecting recruits who then received money
after the war to go to college in the form of the GI Bill, these measurement
specialists (psychometricians) did the same thing for ETS with the SAT,
screening the same cohort for placement in colleges and universities around
America. These psychologists had a strong influence on what constituted good
practice in standardized testing. Accordingly, the practice of using 50%
became well entrenched.

Later, IRT came on the scene in the early 1950s as an alternative to
classical test theory, and it has some great theoretical and practical
advantages over the previous approach of selecting items that have a
variance of .50. The computing technology was not available then to
implement the theory; it wasn't until the advent of the PC in the late 70s
and early 80s that psychometricians like me were motivated to begin
implementing IRT - once again, the armed services were at the forefront of
that development in the late 70s. It will take another decade or so to
break the hold that Classical Test Theory has on measurement, and expect
students' test anxiety to remain high in the interim. But as more and more
people begin to realize the benefits of IRT, especially computer adaptive
testing, over CTT, it will no longer be an issue of what guidance should be
used to administer and score tests.



>From: Chuck Allison <[EMAIL PROTECTED]>
>Reply-To: Chuck Allison <[EMAIL PROTECTED]>
>To: Laura Creighton <[EMAIL PROTECTED]>
>CC: edu-sig@python.org, Scott David Daniels <[EMAIL PROTECTED]>
>Subject: Re: [Edu-sig] Python Programming: Procedural Online Test
>Date: Mon, 5 Dec 2005 00:52:50 -0700
>
>Hello Laura,
>
>That's better than the Abstract Algebra class I took as an
>undergraduate. The highest score on Test 1 was 19%. I got 6%! I retook
>the class from another teacher and topped the class. Liked the subject
>so much I took the second semester just for fun. Testing and teaching
>strategies make a tremendous difference.
>
>Sunday, December 4, 2005, 11:50:22 PM, you wrote:
>
>LC> In a message of Sun, 04 Dec 2005 11:32:27 PST, Scott David Daniels 
>writes:
> >>I wrote:
> >> >> ... keeping people at 80% correct is great rule-of-thumb goal ...
> >>
> >>To elaborate on the statement above a bit, we did drill-and practice
> >>teaching (and had students loving it).  The value of the 80% is for
> >>maximal learning.  Something like 50% is the best for measurement theory
> >>(but discourages the student drastically).  In graduate school I had
> >>one instructor who tried to target his tests to get 50% as the average
> >>mark.  It was incredibly discouraging for most of the students (I
> >>eventually came to be OK with it, but it took half the course).
>
>LC> 
>
>LC> 'Discouraging' misses the mark.  The University of Toronto has 
>professors
>LC> who like to test to 50% as well.  And it causes suicides among 
>undergraduates
>LC> who are first exposed to this, unless there is adequate preparation.  
>This
>LC> is incredibly _dangerous_ stuff.
>
>LC> Laura
>
> >>--Scott David Daniels
> >>[EMAIL PROTECTED]
> >>
> >>___
> >>Edu-sig mailing list
> >>Edu-sig@python.org
> >>http://mail.python.org/mailman/listinfo/edu-sig

Re: [Edu-sig] Python Programming: Procedural Online Test

2005-12-04 Thread Chuck Allison
Hello Laura,

That's better than the Abstract Algebra class I took as an
undergraduate. The highest score on Test 1 was 19%. I got 6%! I retook
the class from another teacher and topped the class. Liked the subject
so much I took the second semester just for fun. Testing and teaching
strategies make a tremendous difference.

Sunday, December 4, 2005, 11:50:22 PM, you wrote:

LC> In a message of Sun, 04 Dec 2005 11:32:27 PST, Scott David Daniels writes:
>>I wrote:
>> >> ... keeping people at 80% correct is great rule-of-thumb goal ...
>>
>>To elaborate on the statement above a bit, we did drill-and practice
>>teaching (and had students loving it).  The value of the 80% is for
>>maximal learning.  Something like 50% is the best for measurement theory
>>(but discourages the student drastically).  In graduate school I had
>>one instructor who tried to target his tests to get 50% as the average
>>mark.  It was incredibly discouraging for most of the students (I
>>eventually came to be OK with it, but it took half the course).

LC> 

LC> 'Discouraging' misses the mark.  The University of Toronto has professors
LC> who like to test to 50% as well.  And it causes suicides among 
undergraduates
LC> who are first exposed to this, unless there is adequate preparation.  This
LC> is incredibly _dangerous_ stuff.

LC> Laura 

>>--Scott David Daniels
>>[EMAIL PROTECTED]
>>
>>___
>>Edu-sig mailing list
>>Edu-sig@python.org
>>http://mail.python.org/mailman/listinfo/edu-sig
LC> ___
LC> Edu-sig mailing list
LC> Edu-sig@python.org
LC> http://mail.python.org/mailman/listinfo/edu-sig




-- 
Best regards,
 Chuck


___
Edu-sig mailing list
Edu-sig@python.org
http://mail.python.org/mailman/listinfo/edu-sig


Re: [Edu-sig] Python Programming: Procedural Online Test

2005-12-04 Thread Laura Creighton
In a message of Sun, 04 Dec 2005 11:32:27 PST, Scott David Daniels writes:
>I wrote:
> >> ... keeping people at 80% correct is great rule-of-thumb goal ...
>
>To elaborate on the statement above a bit, we did drill-and practice
>teaching (and had students loving it).  The value of the 80% is for
>maximal learning.  Something like 50% is the best for measurement theory
>(but discourages the student drastically).  In graduate school I had
>one instructor who tried to target his tests to get 50% as the average
>mark.  It was incredibly discouraging for most of the students (I
>eventually came to be OK with it, but it took half the course).



'Discouraging' misses the mark.  The University of Toronto has professors
who like to test to 50% as well.  And it causes suicides among undergraduates
who are first exposed to this, unless there is adequate preparation.  This
is incredibly _dangerous_ stuff.

Laura 

>--Scott David Daniels
>[EMAIL PROTECTED]
>
>___
>Edu-sig mailing list
>Edu-sig@python.org
>http://mail.python.org/mailman/listinfo/edu-sig
___
Edu-sig mailing list
Edu-sig@python.org
http://mail.python.org/mailman/listinfo/edu-sig


Re: [Edu-sig] Python Programming: Procedural Online Test

2005-12-04 Thread Scott David Daniels
I wrote:
 >> ... keeping people at 80% correct is great rule-of-thumb goal ...

To elaborate on the statement above a bit, we did drill-and-practice
teaching (and had students loving it).  The value of the 80% is for
maximal learning.  Something like 50% is the best for measurement theory
(but discourages the student drastically).  In graduate school I had
one instructor who tried to target his tests to get 50% as the average
mark.  It was incredibly discouraging for most of the students (I
eventually came to be OK with it, but it took half the course).

The hardest part to create is the courseware (including questions), the
second-hardest effort is scoring the questions (rating the difficulty in
all applicable strands).  The software to deliver the questions was, in
many senses, a less labor-intensive task (especially when amortized over
a number of courses).  I think we came up with at least a ten-to-one
ratio (may have been higher, but definitely not lower) in effort
compared to the new prep for a course by an instructor.

I am (and was) a programming, rather than an education, guy.  I do not
know the education theory behind our research well, but I know how a
lot of the code worked (and know where some of our papers went).  We
kept an exponentially decaying model of the student's ability in each
"strand" and used that to help the estimate of his score in the coming
question "cloud."  A simplified version of the same approach would be
to have strand-specific questions, randomly pick a strand, and pick the
"best" question for that student in that strand.  Or, you could bias the
choices between strands to give more balanced progress (increasing the
probability of work where the student is weakest).
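
(A rough sketch of that per-strand bookkeeping; the decay constant, strand
names, and inverse-ability weighting are assumptions for illustration, not
the IMSSS code.)

import random

class StrandTracker:
    # Keep an exponentially decaying estimate of ability per strand and
    # bias the next question toward the strands where the student is weakest.
    def __init__(self, strands, decay=0.8):
        self.decay = decay
        self.ability = {s: 0.5 for s in strands}   # start each strand at 50%

    def record(self, strand, correct):
        old = self.ability[strand]
        self.ability[strand] = self.decay * old + (1 - self.decay) * (1.0 if correct else 0.0)

    def next_strand(self):
        # Weight strands inversely to estimated ability, so weaker strands get more work.
        weights = [1.0 - a + 0.05 for a in self.ability.values()]
        return random.choices(list(self.ability), weights=weights)[0]

tracker = StrandTracker(["loops", "functions", "strings"])
tracker.record("loops", False)
tracker.record("strings", True)
print(tracker.next_strand(), tracker.ability)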


--Scott David Daniels
[EMAIL PROTECTED]

___
Edu-sig mailing list
Edu-sig@python.org
http://mail.python.org/mailman/listinfo/edu-sig


Re: [Edu-sig] Python Programming: Procedural Online Test

2005-12-04 Thread damon bryant

Scott:

I will attempt to incorporate your suggestion of keeping track of 
performance; I'll need to create some attributes on the examinee objects 
that will hold past test scores created within the system.

I am, however, approaching the scoring differently. Although I do report 
percentage correct, I'm using Item Response Theory to (1) score each 
question, (2) estimate ability using a Bayesian algorithm based on maximum 
likelihood, (3) estimate the error in the estimate of ability, and (4) 
select the most appropriate question to administer next. This is very 
similar to what is done at the Educational Testing Service in Princeton with 
the computer adaptive versions of the SAT and the GRE. I don't know the 
language used to develop their platform, but this one for the demo is 
developed in Python using numarray and multithreading modules to widen the 
bottlenecks and speed the delivery of test questions served in html format 
to the client's page.
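
(A compact sketch of the flavor of ability estimation described above: a
grid-search maximum-likelihood estimate under a one-parameter model, with an
information-based standard error. Damon's actual Bayesian algorithm, item
parameters, and numarray implementation are not shown in the thread.)

import math

def p_correct(theta, difficulty):
    # One-parameter (Rasch) probability of a correct response.
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def estimate_theta(responses, step=0.01):
    # Grid-search maximum-likelihood estimate of ability from
    # (item difficulty, answered correctly) pairs on a -3..+3 theta scale.
    grid = [-3.0 + i * step for i in range(int(6.0 / step) + 1)]

    def log_lik(theta):
        total = 0.0
        for b, correct in responses:
            p = p_correct(theta, b)
            total += math.log(p if correct else 1.0 - p)
        return total

    theta_hat = max(grid, key=log_lik)
    # Test information under this model is the sum of p*(1-p) over the items
    # administered; its inverse square root approximates the standard error.
    info = sum(p_correct(theta_hat, b) * (1 - p_correct(theta_hat, b)) for b, _ in responses)
    se = 1.0 / math.sqrt(info) if info else float("inf")
    return theta_hat, se

answers = [(-1.0, True), (0.0, True), (0.5, True), (1.0, False), (1.5, False)]
theta, se = estimate_theta(answers)
print("theta = %.2f +/- %.2f" % (theta, se))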

Thanks for your comments!

By the way, I am looking for teachers, preferably middle and high school, 
who would be willing to trial the system. I have another site where they 
will have the ability to enroll students, monitor testing status, and view 
scores for all students. Do you know of any?




>From: Scott David Daniels <[EMAIL PROTECTED]>
>To: edu-sig@python.org
>Subject: Re: [Edu-sig] Python Programming: Procedural Online Test
>Date: Sat, 03 Dec 2005 12:03:06 -0800
>
>damon bryant wrote:
> > As you got more items correct
> > you got harder questions. In contrast, if you initially got questions
> > incorrect, you would have received easier questions
>In the 70s there was research on such systems (keeping people at 80%
>correct is a great rule-of-thumb goal).  See stuff done at Stanford's
>Institute for Mathematical Studies in the Social Sciences.  At IMSSS
>we did lots of this kind of stuff.  We generally broke the skills into
>strands (separate concepts), and kept track of the student's performance
>in each strand separately (try it; it helps).  BIP (Basic Instructional
>Program) was an ONR (Office of Naval Research) sponsored system, that
>tried to teach "programming in Basic."  The BIP model (and often the
>"standard" IMSSS model) was to score every task in each strand, and find
>the "best" for the student based on his current position.
>For arithmetic, we actually generated problems based on the different
>desired strand properties; nobody was clever enough to generate software
>problems; we simply consulted our DB.  We taught how to do proofs in
>Logic and Set Theory using some of these techniques.
>Names to look for on papers in the 70s-80s include Patrick Suppes (head
>of one side of IMSSS), Richard Atkinson (head of the other side),
>Barbara Searle, Avron Barr, and Marian Beard.  These are not the only
>people who worked there, but a number I recall that should help you to
>find the research publications (try Google Scholar).
>
>
>A follow-on for some of this work is:
>  http://www-epgy.stanford.edu/
>
>I worked there "back in the day" and was quite proud to be a part of
>some of that work.
>
>--Scott David Daniels
>[EMAIL PROTECTED]
>
>___
>Edu-sig mailing list
>Edu-sig@python.org
>http://mail.python.org/mailman/listinfo/edu-sig


___
Edu-sig mailing list
Edu-sig@python.org
http://mail.python.org/mailman/listinfo/edu-sig


Re: [Edu-sig] Python Programming: Procedural Online Test

2005-12-03 Thread Scott David Daniels
damon bryant wrote:
> As you got more items correct 
> you got harder questions. In contrast, if you initially got questions 
> incorrect, you would have received easier questions
In the 70s there was research on such systems (keeping people at 80%
correct is a great rule-of-thumb goal).  See stuff done at Stanford's
Institute for Mathematical Studies in the Social Sciences.  At IMSSS
we did lots of this kind of stuff.  We generally broke the skills into
strands (separate concepts), and kept track of the student's performance
in each strand separately (try it; it helps).  BIP (Basic Instructional
Program) was an ONR (Office of Naval Research) sponsored system, that
tried to teach "programming in Basic."  The BIP model (and often the
"standard" IMSSS model) was to score every task in each strand, and find
the "best" for the student based on his current position.
For arithmetic, we actually generated problems based on the different
desired strand properties; nobody was clever enough to generate software
problems; we simply consulted our DB.  We taught how to do proofs in
Logic and Set Theory using some of these techniques.
Names to look for on papers in the 70s-80s include Patrick Suppes (head
of one side of IMSSS), Richard Atkinson (head of the other side),
Barbara Searle, Avron Barr, and Marian Beard.  These are not the only
people who worked there, but a number I recall that should help you to
find the research publications (try Google Scholar).


A follow-on for some of this work is:
 http://www-epgy.stanford.edu/

I worked there "back in the day" and was quite proud to be a part of
some of that work.

--Scott David Daniels
[EMAIL PROTECTED]

___
Edu-sig mailing list
Edu-sig@python.org
http://mail.python.org/mailman/listinfo/edu-sig


Re: [Edu-sig] Python Programming: Procedural Online Test

2005-12-03 Thread damon bryant
Kirby:

Thank you for your feedback!

You completed the Declarative measure.  I am also interested in your 
feedback on the Procedural test in which Python application or procedural 
questions are administered. Questions on this part are coded and displayed 
as they appear in the IDLE with highlighted key words and indentation for 
nested code. I think the longest code problem is about 20 lines.

I appreciate your comment on the ability to specify font type/size because 
I'm currently working to accommodate persons with disabilities and others 
who may have difficulty viewing the text.

Linda Grandel and I have an experimental site that we use for research and 
educational purposes; we are going to trial some Python questions for a 
class of her colleagues but are having some difficulty translating the 
testing in a short period. This is a long-term project.

We have the goal of developing a worldwide database of Python test norms in 
an effort to track progress on the spread and proficiency of the language in 
different countries. Although it is a great idea, it is too large for a 
dissertation research project. If you are interested in trialing it with 
your class, perhaps we can collaborate.

You did notice that towards the end, questions got easier for you. The test 
algorithm is adaptive but the question bank from which the items are pulled 
is not that large. In other words, the test presented items that were most 
appropriate for you when you began the test. As you got more items correct 
you got harder questions. In contrast, if you initially got questions 
incorrect, you would have received easier questions. Because the bank is so 
small (I do have plans of expanding it when I get some more time on my 
hands), you exhausted the bank of difficult questions and began to receive 
easier items. The opposite would have happened to an examinee of very low 
ability.

My goal is to administer a computer adaptive Python test where examinees 
will only receive questions that are most appropriate for them. In other 
words, different examinees will be tested according to their ability. This 
goes back to Binet's idea of tailored testing where the psychologist 
administering the intelligence test would give items to an examinee based on 
previous responses. In the present case, it's done by computer using an 
artificially intelligent algorithm based on my dissertation. By expanding 
the question bank, I'll be able to reach that goal.


>From: "Kirby Urner" <[EMAIL PROTECTED]>
>To: "'damon bryant'" <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
>CC: edu-sig@python.org
>Subject: RE: [Edu-sig] Python Programming: Procedural Online Test
>Date: Sat, 3 Dec 2005 07:44:32 -0800
>
> > I tweaked it now where all other browsers and OS combinations can access
> > the computer adaptive tests. Performance may be unpredictable though.
> >
> > Damon
>
>OK, thanks.  Worked with no problems.
>
>As an administrator, I'd be curious to get the actual text of missed
>problems (maybe via URL), not just a raw percentage (I got 90% i.e. 2 wrong
>-- probably the one about getting current working directory, not sure which
>other).
>
>The problems seemed to get much easier in the last 5 or so (very basic
>syntax questions).  The one about "James"=="james" returning -1 is no 
>longer
>true on some Pythons (as now we have boolean True).
>
>The font used to pose the questions was a little distracting.  I vastly
>prefer fixed width fonts when programming.  I know that's a personal
>preference (some actually like variable pitch -- blech).  Perhaps as a
>future enhancement, you could let the user customize the font?
>
>Anyway, a useful service.  I could see teachers like me wanting to use this
>with our classes.
>
>Thank you for giving me this opportunity.
>
>Kirby
>
>
>--
>No virus found in this outgoing message.
>Checked by AVG Free Edition.
>Version: 7.1.362 / Virus Database: 267.13.11/191 - Release Date: 12/2/2005
>
>


___
Edu-sig mailing list
Edu-sig@python.org
http://mail.python.org/mailman/listinfo/edu-sig


Re: [Edu-sig] Python Programming: Procedural Online Test

2005-12-03 Thread Kirby Urner
> I tweaked it now where all other browsers and OS combinations can access
> the computer adaptive tests. Performance may be unpredictable though.
> 
> Damon

OK, thanks.  Worked with no problems.

As an administrator, I'd be curious to get the actual text of missed
problems (maybe via URL), not just a raw percentage (I got 90% i.e. 2 wrong
-- probably the one about getting current working directory, not sure which
other).

The problems seemed to get much easier in the last 5 or so (very basic
syntax questions).  The one about "James"=="james" returning -1 is no longer
true on some Pythons (as now we have boolean True).

The font used to pose the questions was a little distracting.  I vastly
prefer fixed width fonts when programming.  I know that's a personal
preference (some actually like variable pitch -- blech).  Perhaps as a
future enhancement, you could let the user customize the font?

Anyway, a useful service.  I could see teachers like me wanting to use this
with our classes.

Thank you for giving me this opportunity.

Kirby


-- 
No virus found in this outgoing message.
Checked by AVG Free Edition.
Version: 7.1.362 / Virus Database: 267.13.11/191 - Release Date: 12/2/2005
 

___
Edu-sig mailing list
Edu-sig@python.org
http://mail.python.org/mailman/listinfo/edu-sig


Re: [Edu-sig] Python Programming: Procedural Online Test

2005-12-02 Thread damon bryant
I tweaked it now where all other browsers and OS combinations can access the 
computer adaptive tests. Performance may be unpredictable though.

Damon

>From: "Kirby Urner" <[EMAIL PROTECTED]>
>To: "'Vern Ceder'" <[EMAIL PROTECTED]>,  "'damon bryant'" 
><[EMAIL PROTECTED]>
>CC: edu-sig@python.org
>Subject: RE: [Edu-sig] Python Programming: Procedural Online Test
>Date: Fri, 2 Dec 2005 19:50:36 -0800
>
>Similar comment.  I'm on Windows but don't want to be tested by a service
>that won't let me use FireFox.  I have tests too.
>
>Kirby
>
>
> > -Original Message-
> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
> > Behalf Of Vern Ceder
> > Sent: Friday, December 02, 2005 2:38 PM
> > To: damon bryant
> > Cc: edu-sig@python.org
> > Subject: Re: [Edu-sig] Python Programming: Procedural Online Test
> >
> > In my opinion, you would get more responses if the testing system
> > accepted a browser/OS combination other than IE/Windows
> >
> > Cheers,
> > Vern Ceder (using Firefox and Ubuntu Linux)
>
>


___
Edu-sig mailing list
Edu-sig@python.org
http://mail.python.org/mailman/listinfo/edu-sig


Re: [Edu-sig] Python Programming: Procedural Online Test

2005-12-02 Thread Kirby Urner
Similar comment.  I'm on Windows but don't want to be tested by a service
that won't let me use FireFox.  I have tests too.

Kirby


> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
> Behalf Of Vern Ceder
> Sent: Friday, December 02, 2005 2:38 PM
> To: damon bryant
> Cc: edu-sig@python.org
> Subject: Re: [Edu-sig] Python Programming: Procedural Online Test
> 
> In my opinion, you would get more responses if the testing system
> accepted a browser/OS combination other than IE/Windows
> 
> Cheers,
> Vern Ceder (using Firefox and Ubuntu Linux)


___
Edu-sig mailing list
Edu-sig@python.org
http://mail.python.org/mailman/listinfo/edu-sig


Re: [Edu-sig] Python Programming: Procedural Online Test

2005-12-02 Thread Vern Ceder
In my opinion, you would get more responses if the testing system 
accepted a browser/OS combination other than IE/Windows

Cheers,
Vern Ceder (using Firefox and Ubuntu Linux)

damon bryant wrote:
> Hey folks!
> 
> Lindel Grandel and I have been working on some Python questions for 
> potential use in high schools, college, and employment. If you are 
> interested in taking one of the online tests go to
> http://www.adaptiveassessmentservices.com and self-register to take one of 
> two Python tests: one is Declarative (knowledge of built-in data types and 
> functions) and the other is Procedural (application of loops, import, 
> functions,...,etc.). I would like to know what you think. Thanks in advance!
> 
> Damon
> 
> 
> ___
> Edu-sig mailing list
> Edu-sig@python.org
> http://mail.python.org/mailman/listinfo/edu-sig

-- 
This time for sure!
-Bullwinkle J. Moose
-
Vern Ceder, Director of Technology
Canterbury School, 3210 Smith Road, Ft Wayne, IN 46804
[EMAIL PROTECTED]; 260-436-0746; FAX: 260-436-5137
___
Edu-sig mailing list
Edu-sig@python.org
http://mail.python.org/mailman/listinfo/edu-sig


[Edu-sig] Python Programming: Procedural Online Test

2005-12-02 Thread damon bryant
Hey folks!

Lindel Grandel and I have been working on some Python questions for 
potential use in high schools, college, and employment. If you are 
interested in taking one of the online tests go to
http://www.adaptiveassessmentservices.com and self-register to take one of 
two Python tests: one is Declarative (knowledge of built-in data types and 
functions) and the other is Procedural (application of loops, import, 
functions,...,etc.). I would like to know what you think. Thanks in advance!

Damon


___
Edu-sig mailing list
Edu-sig@python.org
http://mail.python.org/mailman/listinfo/edu-sig