[R] testing independence of categorical variables

2007-11-22 Thread Shoaaib Mehmood
Hi,

Is there a way of calculating or measuring dependence between two
categorical variables? I tried using the chi-square test to test for
independence, but I got an error saying that the lengths of the two
vectors don't match. Suppose X and Y are two factors. X has 5 levels
and Y has 7 levels. This is what I tried:

> temp <- chisq.test(x,y)

but I got the error "the lengths of the two vectors don't match". Any
help will be appreciated.
-- 
Regards,
Rana Shoaaib Mehmood

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] testing independence of categorical variables

2007-11-22 Thread Moshe Olshansky
Hi,

When testing whether random variables X and Y are
independent, the usual assumption is that you have n
pairs of outcomes, (X1,Y1), (X2,Y2), ..., (Xn,Yn),
and you are basically checking whether the value of X
affects the value of Y.
If you have 7 observations of X and 5 separate
observations of Y (which have nothing to do with the
observations of X), you cannot test for independence.
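
To make the pairing concrete, here is a minimal sketch in R. The data are simulated, and the names n, x, y and tab are made up for illustration; the only point is that the contingency table is built by counting pairs (Xi, Yi), so both vectors need one entry per observation, however many levels each factor has.

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
n <- 200                           # hypothetical number of paired observations

# One value of X and one value of Y per observation (5 and 7 levels)
x <- factor(sample(letters[1:5], n, replace = TRUE), levels = letters[1:5])
y <- factor(sample(LETTERS[1:7], n, replace = TRUE), levels = LETTERS[1:7])

stopifnot(length(x) == length(y))  # pairing requires equal lengths

# Cross-tabulate the pairs: a 5 x 7 table of counts
tab <- table(x, y)
dim(tab)

# Test of independence on the table; small expected counts can trigger a
# warning that the chi-squared approximation may be incorrect
res <- suppressWarnings(chisq.test(tab))
res$parameter                      # degrees of freedom, (5 - 1) * (7 - 1)
```

The number of levels (5 versus 7) only sets the shape of the table; it is the number of observations that has to match.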

Regards,

Moshe.



Re: [R] testing independence of categorical variables

2007-11-22 Thread David Winsemius
"Shoaaib Mehmood" <[EMAIL PROTECTED]> wrote in 
news:[EMAIL PROTECTED]:

> hi,
> 
> is there a way of calculating of measuring dependence between two
> categorical variables. i tried using the chi square test to test for
> independence but i got error saying that the lengths of the two
> vectors don't match. Suppose X and Y are two factors. X has 5 levels
> and Y has 7 levels. This is what i tried doing
> 
>>temp<-chisq.test(x,y)
> 
> but got error "the lengths of the two vectors don't match". any help
> will be appreciated

If you posted the table, it might be more clear why the error was being 
thrown. In the example shown you have mixed "x" and "X". They would be 
different in R.

chisq.test should not be having a problem with unequal row and column 
lengths.

# simulate a 5 x 7 table
> TT <- r2dtable(1, 5*c(1,8,5,8,4), 5*c(3,3,3,3,4,4,6))
> TT
[[1]]
     [,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,]    0    1    1    0    2    1    0
[2,]    3    3    6    6    2    8   12
[3,]    1    2    3    3    9    2    5
[4,]    8    3    3    3    6    7   10
[5,]    3    6    2    3    1    2    3

# general test for association
# (TT is a list of length 1; the table is its first element)
> chisq.test(TT[[1]])

        Pearson's Chi-squared test

data:  TT[[1]]
X-squared = 33.5942, df = 24, p-value = 0.09214

Warning message:
In chisq.test(TT[[1]]) :
  Chi-squared approximation may be incorrect

-- 
David Winsemius



Re: [R] testing independence of categorical variables

2007-11-25 Thread Shoaaib Mehmood
I can't find help for xtab. Which package contains this function?

On Nov 24, 2007 12:16 PM, G Ilhamto <[EMAIL PROTECTED]> wrote:
> hi shohaib,
> have you tried xtab instead of chisq.test?
>
> Ilham
-- 
Regards,
Rana Shoaaib Mehmood
(+92) 333 550 4531



Re: [R] testing independence of categorical variables

2007-11-26 Thread Bernardo Rangel Tura

On Thu, 2007-11-22 at 16:16 +0500, Shoaaib Mehmood wrote:
> hi,
> 
> is there a way of calculating of measuring dependence between two
> categorical variables. i tried using the chi square test to test for
> independence but i got error saying that the lengths of the two
> vectors don't match. Suppose X and Y are two factors. X has 5 levels
> and Y has 7 levels. This is what i tried doing
> 
> >temp<-chisq.test(x,y)
> 
> but got error "the lengths of the two vectors don't match". any help
> will be appreciated


Hi Shoaaib,

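Try using chisq.test(table(x,y)).

Note that chisq.test(x,y) cross-tabulates x and y itself only when they
are vectors or factors of the same length. A goodness-of-fit test is what
R performs when x is supplied on its own as a vector of counts.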
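
A quick way to see how chisq.test treats its arguments is to try the call forms side by side. This is only a sketch on simulated data (the vectors x and y and the seed are made up for illustration), based on my reading of the chisq.test help page.

```r
set.seed(42)                       # arbitrary seed, for reproducibility only
x <- factor(sample(c("a", "b", "c"), 60, replace = TRUE))
y <- factor(sample(c("u", "v"), 60, replace = TRUE))

# Two factors of equal length: chisq.test() cross-tabulates them itself
r1 <- suppressWarnings(chisq.test(x, y))

# Passing the contingency table explicitly gives the same test
r2 <- suppressWarnings(chisq.test(table(x, y)))
all.equal(unname(r1$statistic), unname(r2$statistic))

# A single vector of counts and no y: goodness-of-fit test against
# equal probabilities
r3 <- chisq.test(table(x))
r3$method
```

The error in the original post comes from the first form being called with x and y of different lengths, which no table can be built from.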


-- 
Bernardo Rangel Tura, M.D,Ph.D
National Institute of Cardiology
Brazil



Re: [R] testing independence of categorical variables

2007-11-26 Thread John Kane
xtab is in the prettyR package.

--- Shoaaib Mehmood <[EMAIL PROTECTED]> wrote:

> i cant find help for xtab. Which package contains
> this function


Re: [R] testing independence of categorical variables

2007-12-06 Thread Ramin Shamshiri

The chi-square test does not require the two categorical variables to have
the same number of levels, nor does it limit the number of levels.

The chi-square procedure is as follows:

$$\chi^2 = \sum_{\text{all cells}} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

where O_{ij} is the observed count in cell (i, j) and the expected count is

$$E_{ij} = n \cdot \frac{R_i}{n} \cdot \frac{C_j}{n} = \frac{R_i C_j}{n}$$

with R_i the i-th row total, C_j the j-th column total, and n the grand
total.

Degrees of freedom: df = (r - 1)(c - 1), where r and c are the numbers of
rows and columns.

Computed this way, the test should not give you any errors. I usually use
SAS for calculations like this. Below is sample code I wrote to test
whether US state and blood type are independent. You can modify it for
your data, and it should give you no error.

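/* One record per (blood type, state) cell, with its observed count */
data bloodtype;
  input bloodtype $ state $ count @@;
  datalines;
A FL 122 B FL 117
AB FL 19 O FL 244
A IA 1781 B IA 351
AB IA 289 O IA 3301
A MO 353 B MO 269
AB MO 60 O MO 713
;
run;

/* Chi-square test of independence; WEIGHT treats count as the cell frequency */
proc freq data=bloodtype;
  tables bloodtype*state
    / cellchi2 chisq expected norow nocol nopercent;
  weight count;
run;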
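
Since this is the R list, here is a rough R sketch of the same calculation, done by hand from the definitions of the expected count and the statistic, and then checked against chisq.test. The counts are copied from the datalines above; the object names (tab, expected, stat, df, p, res) are only for illustration.

```r
# Counts from the SAS datalines: rows = blood type, columns = state
tab <- matrix(c(122, 1781, 353,    # A
                117,  351, 269,    # B
                 19,  289,  60,    # AB
                244, 3301, 713),   # O
              nrow = 4, byrow = TRUE,
              dimnames = list(bloodtype = c("A", "B", "AB", "O"),
                              state = c("FL", "IA", "MO")))

n <- sum(tab)

# Expected count in cell (i, j): row total * column total / n
expected <- outer(rowSums(tab), colSums(tab)) / n

# Pearson statistic: sum over all cells of (observed - expected)^2 / expected
stat <- sum((tab - expected)^2 / expected)
df   <- (nrow(tab) - 1) * (ncol(tab) - 1)
p    <- pchisq(stat, df, lower.tail = FALSE)

# The built-in test should agree with the hand computation
res <- chisq.test(tab)
all.equal(stat, unname(res$statistic))
```

Note that the table here is 4 x 3, so the two variables have different numbers of levels and the test still runs.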


Best
Ramin
Gainesville





Re: [R] testing independence of categorical variables

2007-12-06 Thread Petr PIKAL
Hi

Well, R does exactly what it says. From the help page:

"Otherwise, x and y must be vectors or factors of the same length"

I do not know SAS but I presume that

> tables bloodtype*state

gives you something like

tab <- table(bloodtype, state)

and

chisq.test(tab)

should give you the expected result. You can also call
chisq.test(bloodtype, state) directly. But what you cannot do is test
vectors of unequal **lengths**, and that is what he did. I believe that
you cannot do it in SAS either.
 
> x <- sample(letters[1:3], 10, replace=TRUE)
> x
 [1] "c" "a" "c" "c" "a" "c" "a" "c" "a" "a"
> y <- sample(1:5, 20, replace=TRUE)
> y
 [1] 2 5 1 1 2 5 2 3 1 5 5 5 1 5 5 3 2 2 5 1
> chisq.test(x, y)
Error in chisq.test(x, y) : 'x' and 'y' must have the same length
> x <- sample(letters[1:3], 20, replace=TRUE)
> chisq.test(x, y)

        Pearson's Chi-squared test

data:  x and y
X-squared = 4.7937, df = 6, p-value = 0.5705

Warning message:
In chisq.test(x, y) : Chi-squared approximation may be incorrect
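
When the data arrive as one count per cell, as in the SAS blood type example, table() on the raw columns would count rows rather than sum the counts. A minimal sketch of one way around that, assuming a hypothetical data frame called blood with columns bloodtype, state and count, is to let xtabs() sum the counts, which roughly plays the role of the SAS WEIGHT statement:

```r
# Hypothetical data frame holding the counts from the SAS datalines
blood <- data.frame(
  bloodtype = rep(c("A", "B", "AB", "O"), times = 3),
  state     = rep(c("FL", "IA", "MO"), each = 4),
  count     = c(122, 117, 19, 244,      # FL
                1781, 351, 289, 3301,   # IA
                353, 269, 60, 713)      # MO
)

# With a left-hand side, xtabs() sums 'count' within each cell
tab <- xtabs(count ~ bloodtype + state, data = blood)
tab

# Test of independence on the resulting 4 x 3 table
res <- chisq.test(tab)
res
```

With one row per individual instead of one row per cell, table(bloodtype, state) or chisq.test(bloodtype, state) would do the same job.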

Regards
Petr

