Re: [R] Proportion test in three-choices experiment

2005-07-18 Thread NOEL Yvonnick

Rafael,

When testing binomial hypotheses with both repeated measures and 
inter-group factors, you should make your model for the intra-subject 
part of the data explicit. You can't do chi-square comparisons on 
count data that mix independent and dependent measures.


But you can define a Bernoulli logistic model at the level of the 
individual response, with a proper subject factor, a stimulus-type 
factor, and possibly an item factor (nested within the stimulus-type 
category). This may be viewed as a Rasch model of measurement.


Within this model, the coefficient estimates on the subject factor are 
measures of individual ability that can afterwards be entered into an 
ANOVA.
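
A minimal sketch of such a model in R, with fabricated data and invented
variable names (correct, subject, stimtype, cond -- none of these names
come from the thread), might look like this:

  ## one row per single response; all columns fabricated for illustration
  d <- expand.grid(subject  = factor(1:10),
                   stimtype = factor(c("A", "B", "C")),
                   cond     = factor(1:2),
                   rep      = 1:20)
  d$correct <- rbinom(nrow(d), 1, 0.6)

  ## Bernoulli (logistic) model with explicit subject and stimulus-type factors
  fit <- glm(correct ~ subject + stimtype + cond, family = binomial, data = d)

  ## subject coefficients (relative to the first subject) as ability estimates
  ## that could be carried into a second-stage ANOVA
  ability <- coef(fit)[grep("^subject", names(coef(fit)))]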


In some cases, you can also model the intra-subject part of the data 
over time, assuming for instance that subjects become more efficient as 
the experiment goes on.
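
Continuing the sketch above, the temporal variant just adds a trial-order
covariate (again a fabricated column):

  d$trial <- ave(seq_len(nrow(d)), d$subject, FUN = seq_along)  # hypothetical trial order
  fit.t <- glm(correct ~ subject + stimtype + cond + trial,
               family = binomial, data = d)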


HTH,

Yvonnick.
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Proportion test in three-choices experiment

2005-07-17 Thread Rafael Laboissiere
Jon and Spencer,

First of all, thanks for your insightful comments on my questions.  I am
quite impressed by the level of support one finds on the r-help mailing
list.  In particular, as Spencer pointed out in another post, I did not
do my homework, and you have been overly kind in discussing the issue.  I
promised Spencer privately that I will read the Posting Guide before
going further in the thread, but I would just like to give a short
answer:

* Jonathan Baron <[EMAIL PROTECTED]> [2005-07-17 15:05]:

> You still aren't saying whether you are doing this for each subject or for
> the entire data set summed over subjects. If the latter, are you
> worried about subject variance? Do you think it possible that some
> subjects might show better performance in condition 2?  Would you be
> happy if you tested a single subject and got the result?  If subject
> variance is an issue, then you need to test "across subjects." One way
> to do that is to compute some performance measure for each subject and
> each condition and then do a matched-pairs t test across subjects.

Yes, I intend to do the test across subjects and subject variance is
indeed an issue in my case.

Thanks for your further suggestions, I will look at them carefully.

-- 
Rafael



Re: [R] Proportion test in three-choices experiment

2005-07-17 Thread Spencer Graves
Hi, Rafael:

  At this point, it might help if you work through the posting guide 
(http://www.R-project.org/posting-guide.html), especially the part about 
constructing a toy example with real numbers, trying some of the R 
facilities discussed, and explaining why you aren't sure they will solve 
your problem.  You may answer your own question in the course of working 
through that guide, and if you don't, the exercise could make it easier 
for someone else to suggest something you actually find useful.

  spencer graves

Jonathan Baron wrote:

> On 07/17/05 20:12, Rafael Laboissiere wrote:
> 
> 
>>Thanks for your reply, Jonathan.  Thanks also to Spencer, who suggested
>>using the BTm function.  I realize that my description of both the
>>experiment and the involved issue was not clear.  Let me try again:
>>
>>My subjects do a recognition task where I present stimuli belonging to
>>three different classes (let us say A, B, and C).  There are many of
>>them.  Subjects are asked to recognize each stimulus as belonging to one
>>of the three classes (forced-choice design).  This is done under two
>>different conditions (say conditions 1 and 2).  I end up with matrices of
>>counts like this (in R notation):
>>
>># under condition 1
>>c1 <- t (matrix (c (c1AA, c1AB, c1AC,
>>c1BA, c1BB, c1BC,
>>  c1CA, c1CB, c1CC), nc = 3))
>># under condition 2
>>c2 <- t (matrix (c (c2AA, c2AB, c2AC,
>>c2BA, c2BB, c2BC,
>>  c2CA, c2CB, c2CC), nc = 3))
>>
>>where "cijk" is the number of times the subject gave answer k when
>>presented with a stimulus of class j, under condition i.
>>
>>The issue is to test whether subjects perform better (in the sense of a
>>higher recognition score) in condition 1 compared with condition 2.  My
>>first idea was to test the global recognition rate, which could be
>>computed as:
>>
>># under condition 1
>>r1 <- sum (diag (c1)) / sum (c1)
>># under condition 2
>>r2 <- sum (diag (c2)) / sum (c2)
>>
>>The null hypothesis is that r1 is not different from r2. I guess that I
>>could test it with the chisq.test function, like this:
>>
>>p1 <- sum (diag (c1))
>>q1 <- sum (c1) - p1
>>p2 <- sum (diag (c2))
>>q2 <- sum (c2) - p2
>>chisq.test (matrix (c(p1, q1, p2, q2), nc = 2))
>>
>>What do you think?
>>
>>I also thought about testing the triples like [c1AA, c1AB, c1AC] against
>>[c2AA, c2AB, c2AC], hence my original question.
> 
> 
> You still aren't saying whether you are doing this for each
> subject or for the entire data set summed over subjects.  If the
> latter, are you worried about subject variance?  Do you think it
> possible that some subjects might show better performance in
> condition 2?  Would you be happy if you tested a single subject
> and got the result?  If subject variance is an issue, then you
> need to test "across subjects."  One way to do that is to
> compute some performance measure for each subject and each
> condition and then do a matched-pairs t test across subjects.
> 
> The method you suggest requires several assumptions, and I don't
> know if these are reasonable.  The problem is in using a sum of
> the diagonal (p1) and off-diagonal entries (q1) in the table.
> This may work if you have no reason to think that c2 is better,
> ever.  In that case, all you need is a measure that varies
> monotonically with the true measure, whatever it is.  You need
> also to assume that c1 and c2 do not differ in response biases,
> and that it could not be the case that one of the diagonal cells
> is better in c1 and another is better in c2.
> 
> I have not studied these issues much since my PhD thesis (1970!), 
> but then the usual approach was to develop a sensible model of
> the task and then use some parameter of the model as the
> measure.  Perhaps this is over-kill for what you are doing, but I 
> don't know.  For example, one model says that the subject either
> knows the answer or guesses, and the guesses are distributed
> across the three categories according to biases that are specific 
> to the condition, but knowing the answer is independent of the
> category.  (You can test the assumptions of this model.)  Another 
> model (popular in 1970) is Luce's choice theory, which is similar 
> to the first but uses multiplication.  If I remember correctly
> (which I probably don't) you would do exactly what you propose but
> after taking the logs of the frequencies.
> 
> It is possible to get different, even opposite, results using
> logs than you would get with your proposal.  Likewise, it is
> possible to get opposite results if you ignore response bias, and 
> if the conditions differ in response bias.
> 
> The suggestion I made based on the idea of inter-rater agreement
> implies a rough-and-ready model similar to the first.  It does
> take response bias into account.
> 
> Jon

-- 
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA

[EMAIL PROTECTED]

Re: [R] Proportion test in three-choices experiment

2005-07-17 Thread Jonathan Baron
On 07/17/05 20:12, Rafael Laboissiere wrote:

> Thanks for your reply, Jonathan.  Thanks also to Spencer, who suggested
> using the BTm function.  I realize that my description of both the
> experiment and the involved issue was not clear.  Let me try again:
> 
> My subjects do a recognition task where I present stimuli belonging to
> three different classes (let us say A, B, and C).  There are many of
> them.  Subjects are asked to recognize each stimulus as belonging to one
> of the three classes (forced-choice design).  This is done under two
> different conditions (say conditions 1 and 2).  I end up with matrices of
> counts like this (in R notation):
> 
> # under condition 1
> c1 <- t (matrix (c (c1AA, c1AB, c1AC,
> c1BA, c1BB, c1BC,
>   c1CA, c1CB, c1CC), nc = 3))
> # under condition 2
> c2 <- t (matrix (c (c2AA, c2AB, c2AC,
> c2BA, c2BB, c2BC,
>   c2CA, c2CB, c2CC), nc = 3))
> 
> where "cijk" is the number of times the subject gave answer k when
> presented with a stimulus of class j, under condition i.
> 
> The issue is to test whether subjects perform better (in the sense of a
> higher recognition score) in condition 1 compared with condition 2.  My
> first idea was to test the global recognition rate, which could be
> computed as:
> 
> # under condition 1
> r1 <- sum (diag (c1)) / sum (c1)
> # under condition 2
> r2 <- sum (diag (c2)) / sum (c2)
> 
> The null hypothesis is that r1 is not different from r2. I guess that I
> could test it with the chisq.test function, like this:
> 
> p1 <- sum (diag (c1))
> q1 <- sum (c1) - p1
> p2 <- sum (diag (c2))
> q2 <- sum (c2) - p2
> chisq.test (matrix (c(p1, q1, p2, q2), nc = 2))
> 
> What do you think?
> 
> I also thought about testing the triples like [c1AA, c1AB, c1AC] against
> [c2AA, c2AB, c2AC], hence my original question.

You still aren't saying whether you are doing this for each
subject or for the entire data set summed over subjects.  If the
latter, are you worried about subject variance?  Do you think it
possible that some subjects might show better performance in
condition 2?  Would you be happy if you tested a single subject
and got the result?  If subject variance is an issue, then you
need to test "across subjects."  One way to do that is to
compute some performance measure for each subject and each
condition and then do a matched-pairs t test across subjects.
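
In R, that across-subjects test might look like this (made-up per-subject
accuracies, purely for illustration):

  ## proportion correct per subject under each condition, same subject
  ## order in both vectors
  acc1 <- c(0.72, 0.65, 0.80, 0.58, 0.77)
  acc2 <- c(0.66, 0.60, 0.74, 0.61, 0.70)
  t.test(acc1, acc2, paired = TRUE)   # matched-pairs t test across subjects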

The method you suggest requires several assumptions, and I don't
know if these are reasonable.  The problem is in using a sum of
the diagonal (p1) and off-diagonal entries (q1) in the table.
This may work if you have no reason to think that c2 is better,
ever.  In that case, all you need is a measure that varies
monotonically with the true measure, whatever it is.  You need
also to assume that c1 and c2 do not differ in response biases,
and that it could not be the case that one of the diagonal cells
is better in c1 and another is better in c2.

I have not studied these issues much since my PhD thesis (1970!), 
but then the usual approach was to develop a sensible model of
the task and then use some parameter of the model as the
measure.  Perhaps this is over-kill for what you are doing, but I 
don't know.  For example, one model says that the subject either
knows the answer or guesses, and the guesses are distributed
across the three categories according to biases that are specific 
to the condition, but knowing the answer is independent of the
category.  (You can test the assumptions of this model.)  Another 
model (popular in 1970) is Luce's choice theory, which is similar 
to the first but uses multiplication.  If I remember correctly
(which I probably don't) you would do exactly what you propose but
after taking the logs of the frequencies.
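
As a rough sketch of how the first (know-or-guess) model could be fit to a
single 3x3 confusion matrix by maximum likelihood -- the function and
parameter names here are invented, and this is only one way to set it up:

  ## with probability k the subject knows the answer; otherwise the
  ## response is a guess drawn with biases b[1..3]
  know.guess <- function(counts) {
    negll <- function(par) {
      k <- plogis(par[1])                        # knowing probability in (0,1)
      b <- exp(c(0, par[2:3])); b <- b / sum(b)  # guessing biases, sum to 1
      P <- (1 - k) * matrix(b, 3, 3, byrow = TRUE) + k * diag(3)
      -sum(counts * log(P))                      # multinomial log-likelihood
    }
    fit <- optim(c(0, 0, 0), negll)
    list(k = plogis(fit$par[1]), logLik = -fit$value)
  }
  ## toy counts: rows = stimulus class, columns = response
  cnt <- matrix(c(40, 5, 5,  6, 38, 6,  4, 7, 39), 3, 3, byrow = TRUE)
  know.guess(cnt)$k

Fitting it separately to the condition-1 and condition-2 matrices would give
one k estimate per condition to compare.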

It is possible to get different, even opposite, results using
logs than you would get with your proposal.  Likewise, it is
possible to get opposite results if you ignore response bias, and 
if the conditions differ in response bias.

The suggestion I made based on the idea of inter-rater agreement
implies a rough-and-ready model similar to the first.  It does
take response bias into account.

Jon
-- 
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron



Re: [R] Proportion test in three-choices experiment

2005-07-17 Thread Rafael Laboissiere
* Jonathan Baron <[EMAIL PROTECTED]> [2005-07-16 11:49]:

> I suspect that there are more direct ways to do this test, but it 
> is unclear to me just what the issue is.  For example, if there
> are many subjects and very few stimuli for each, you might want
> to get some sort of measure of ability for each subject (many
> possibilities here), then test the measure across subjects with a
> t test.  The measure must be chosen so that you can specify a
> null hypothesis.  It must be directional.
> 
> If you have a few subjects and many trials per subject, then you
> could do a significance test for each subject.
> You want a directional test, because you have a specific
> hypothesis, namely, that the correct answer will occur more often 
> than predicted from the marginal frequencies in the 3x3 table.
> (I assume it is a 3x3 table with stimuli as rows and responses as
> columns, and you want to show that the diagonal cells are higher
> than predicted.) One possibility is kappa, which is in the vcd
> package, and also in psy and concord, in somewhat different
> forms.

Thanks for your reply, Jonathan.  Thanks also to Spencer, who suggested
using the BTm function.  I realize that my description of both the
experiment and the involved issue was not clear.  Let me try again:

My subjects do a recognition task where I present stimuli belonging to
three different classes (let us say A, B, and C).  There are many of
them.  Subjects are asked to recognize each stimulus as belonging to one
of the three classes (forced-choice design).  This is done under two
different conditions (say conditions 1 and 2).  I end up with matrices of
counts like this (in R notation):

# under condition 1
c1 <- t (matrix (c (c1AA, c1AB, c1AC, 
c1BA, c1BB, c1BC, 
c1CA, c1CB, c1CC), nc = 3))
# under condition 2
c2 <- t (matrix (c (c2AA, c2AB, c2AC, 
c2BA, c2BB, c2BC, 
c2CA, c2CB, c2CC), nc = 3))

where "cijk" is the number of times the subject gave answer k when
presented with a stimulus of class j, under condition i.

The issue is to test whether subjects perform better (in the sense of a
higher recognition score) in condition 1 compared with condition 2.  My
first idea was to test the global recognition rate, which could be
computed as:

# under condition 1
r1 <- sum (diag (c1)) / sum (c1)
# under condition 2
r2 <- sum (diag (c2)) / sum (c2)

The null hypothesis is that r1 is not different from r2. I guess that I
could test it with the chisq.test function, like this:

p1 <- sum (diag (c1))
q1 <- sum (c1) - p1
p2 <- sum (diag (c2))
q2 <- sum (c2) - p2
chisq.test (matrix (c(p1, q1, p2, q2), nc = 2))

What do you think?

I also thought about testing the triples like [c1AA, c1AB, c1AC] against
[c2AA, c2AB, c2AC], hence my original question.
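
For what it's worth, here is a toy version of both checks with invented
counts (the numbers are made up only so the snippets run):

  ## fabricated 3x3 confusion matrices: rows = stimulus class, cols = response
  c1 <- matrix(c(40,  5,  5,   6, 38,  6,   4,  7, 39), nrow = 3, byrow = TRUE)
  c2 <- matrix(c(30, 10, 10,  12, 28, 10,   9, 11, 30), nrow = 3, byrow = TRUE)

  ## global recognition rate, condition 1 vs condition 2
  p1 <- sum(diag(c1)); q1 <- sum(c1) - p1
  p2 <- sum(diag(c2)); q2 <- sum(c2) - p2
  chisq.test(matrix(c(p1, q1, p2, q2), ncol = 2))
  ## equivalently: prop.test(c(p1, p2), c(sum(c1), sum(c2)))

  ## response distribution for class-A stimuli, condition 1 vs condition 2
  chisq.test(rbind(c1[1, ], c2[1, ]))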

-- 
Rafael



Re: [R] Proportion test in three-choices experiment

2005-07-16 Thread Jonathan Baron
I suspect that there are more direct ways to do this test, but it 
is unclear to me just what the issue is.  For example, if there
are many subjects and very few stimuli for each, you might want
to get some sort of measure of ability for each subject (many
possibilities here), then test the measure across subjects with a
t test.  The measure must be chosen so that you can specify a
null hypothesis.  It must be directional.

If you have a few subjects and many trials per subject, then you
could do a significance test for each subject.
You want a directional test, because you have a specific
hypothesis, namely, that the correct answer will occur more often 
than predicted from the marginal frequencies in the 3x3 table.
(I assume it is a 3x3 table with stimuli as rows and responses as
columns, and you want to show that the diagonal cells are higher
than predicted.) One possibility is kappa, which is in the vcd
package, and also in psy and concord, in somewhat different
forms.
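
As a concrete illustration of the kappa suggestion (invented counts, and
only one of the several kappa implementations mentioned -- the call below
follows vcd's Kappa interface as I remember it):

  library(vcd)
  ## made-up 3x3 confusion matrix: rows = stimulus class, cols = response
  tab <- matrix(c(40, 5, 5,  6, 38, 6,  4, 7, 39), nrow = 3, byrow = TRUE)
  Kappa(tab)            # unweighted (and weighted) kappa with standard errors
  confint(Kappa(tab))   # approximate confidence intervals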

Usually in this sort of experiment, though, there isn't much of
an issue about whether subjects are transmitting information at
all.  Rather the issue is testing alternative models of what they 
are doing.

Jon

On 07/16/05 06:33, Spencer Graves wrote:
> Have you considered "BTm" in library(BradleyTerry)?  Consider the
> following example:
> 
>  > cond1 <- data.frame(winner=rep(LETTERS[1:3], e=2),
> +   loser=c("B","C","A","C","A","B"),
> +   Freq=1:6)
>  > cond2 <- data.frame(winner=rep(LETTERS[1:3], e=2),
> +   loser=c("B","C","A","C","A","B"),
> +   Freq=6:1)
>  > fit1 <- BTm(cond1~..)
>  > fit2 <- BTm(cond2~..)
>  > fit12 <- BTm(rbind(cond1, cond2)~..)
>  > Dev12 <- (fit1$deviance+fit2$deviance
> +   -fit12$deviance)
>  > pchisq(Dev12, 2, lower=FALSE)
> [1] 0.8660497
> 
> This says the difference between the two data sets, cond1 and cond2,
> is not statistically significant.
> 
> Do you present each subject with only one pair?  If yes, then this
> model is appropriate.  If no, then the multiple judgments by the same
> subject are not statistically independent, as assumed by this model.
> However, if you don't get statistical significance via this kind of
> computation, it's unlikely that a better model would give you
> statistical significance.  If you get a p value of, say, 0.04, then the
> difference is probably NOT statistically significant.
> 
> The p value you get here would be an upper bound.  You could get a
> lower bound by using only one of the three pairs presented to each
> subject selected at random.  If that p value were statistically
> significant, then I think it is safe to say that your two sets of
> conditions are significantly different.  For any value in between, it
> would depend on how independent the three choices by the same subject are.
> You might, for example, delete one of the three pairs at random and use
> the result of that comparison.
> 
> There are doubtless better techniques, but I'm not familiar with
> them.  Perhaps someone else will reply to my reply.
> 
> spencer graves
> 
> Rafael Laboissiere wrote:
> 
> > Hi,
> >
> > I wish to analyze with R the results of a perception experiment in which
> > subjects had to recognize each stimulus among three choices (this was a
> > forced-choice design).  The experiment runs under two different
> > conditions and the data is like the following:
> >
> >N1 : count of trials in condition 1
> >p11, p12, p13: proportions of choices 1, 2, and 3 in condition 1
> >
> >N2 : count of trials in condition 2
> >p21, p22, p23: proportions of choices 1, 2, and 3 in condition 2
> >
> > How can I test whether the triple (p11,p12,p13) is different from the
> > triple (p21,p22,p23)?  Clearly, prop.test does not help me here, because
> > it relates to two-choice tests.
> >
> > I apologize if the answer is trivial, but I am relatively new to R and
> > could not find any pointers in the FAQ or in the mailing list archives.
> >
> > Thanks in advance for any help,
> >
> 
> --
> Spencer Graves, PhD
> Senior Development Engineer
> PDF Solutions, Inc.
> 333 West San Carlos Street Suite 700
> San Jose, CA 95110, USA
> 
> [EMAIL PROTECTED]
> www.pdf.com 
> Tel:  408-938-4420
> Fax: 408-280-7915
> 

-- 
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron



Re: [R] Proportion test in three-choices experiment

2005-07-16 Thread Spencer Graves
  Have you considered "BTm" in library(BradleyTerry)?  Consider the 
following example:

 > cond1 <- data.frame(winner=rep(LETTERS[1:3], e=2),
+   loser=c("B","C","A","C","A","B"),
+   Freq=1:6)
 > cond2 <- data.frame(winner=rep(LETTERS[1:3], e=2),
+   loser=c("B","C","A","C","A","B"),
+   Freq=6:1)
 > fit1 <- BTm(cond1~..)
 > fit2 <- BTm(cond2~..)
 > fit12 <- BTm(rbind(cond1, cond2)~..)
 > Dev12 <- (fit1$deviance+fit2$deviance
+   -fit12$deviance)
 > pchisq(Dev12, 2, lower=FALSE)
[1] 0.8660497

  This says the difference between the two data sets, cond1 and cond2, 
is not statistically significant.

  Do you present each subject with only one pair?  If yes, then this 
model is appropriate.  If no, then the multiple judgments by the same 
subject are not statistically independent, as assumed by this model. 
However, if you don't get statistical significance via this kind of 
computation, it's unlikely that a better model would give you 
statistical significance.  If you get a p value of, say, 0.04, then the 
difference is probably NOT statistically significant.

  The p value you get here would be an upper bound.  You could get a 
lower bound by using only one of the three pairs presented to each 
subject selected at random.  If that p value were statistically 
significant, then I think it is safe to say that your two sets of 
conditions are significantly different.  For any value in between, it 
would depend on how independent the three choices by the same subject are. 
You might, for example, delete one of the three pairs at random and use 
the result of that comparison.

  There are doubtless better techniques, but I'm not familiar with 
them.  Perhaps someone else will reply to my reply.

  spencer graves

Rafael Laboissiere wrote:

> Hi,
> 
> I wish to analyze with R the results of a perception experiment in which
> subjects had to recognize each stimulus among three choices (this was a
> forced-choice design).  The experiment runs under two different
> conditions and the data is like the following:
> 
>N1 : count of trials in condition 1
>p11, p12, p13: proportions of choices 1, 2, and 3 in condition 1
>
>N2 : count of trials in condition 2
>p21, p22, p23: proportions of choices 1, 2, and 3 in condition 2
>
> How can I test whether the triple (p11,p12,p13) is different from the
> triple (p21,p22,p23)?  Clearly, prop.test does not help me here, because
> it relates to two-choice tests.
> 
> I apologize if the answer is trivial, but I am relatively new to R and
> could not find any pointers in the FAQ or in the mailing list archives.
> 
> Thanks in advance for any help,
> 

-- 
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA

[EMAIL PROTECTED]
www.pdf.com 
Tel:  408-938-4420
Fax: 408-280-7915
