Re: [R] How to run prop.test on 3-level factors?

2021-11-16 Thread Jim Lemon
Hi Luigi,
Maybe multinomial regression?

https://www.r-bloggers.com/2020/05/multinomial-logistic-regression-with-r/
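
A minimal sketch of what that could look like, assuming the `nnet` package is installed and that `MR` and `FCN` are used as predictors of `Class`. The simulated data frame below is a hypothetical stand-in for the poster's data, and the value ranges are made up for illustration:

```r
library(nnet)  # assumption: nnet is available (it is a recommended package)

set.seed(1)
# Hypothetical data with the same column names as in the question
df <- data.frame(MR = runif(500, 0, 0.2), FCN = runif(500, 2, 50))
df$Class <- factor(sample(c("negative", "positive", "uncertain"),
                          500, replace = TRUE))

# Multinomial logistic regression; "negative" (first level) is the reference
fit <- multinom(Class ~ MR + FCN, data = df, trace = FALSE)

# Fitted probability of each of the three classes, one row per observation
probs <- predict(fit, type = "probs")
head(probs)
```

This models the class probabilities as a function of the predictors rather than giving a single proportion per level, so it answers a somewhat different question than `prop.test`.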

Jim

On Tue, Nov 16, 2021 at 7:33 PM Luigi Marongiu wrote:
>
> Hello,
> I have a large database with a column containing a factor:
> ```
> > str(df)
> 'data.frame': 500 obs. of  4 variables:
> $ MR   : num  0.000809 0.001236 0.001663 0.002089 0.002516 ...
> $ FCN  : num  2 2 2 2 2 2 2 2 2 2 ...
> $ Class: Factor w/ 3 levels "negative","positive",..: 1 1 1 1 1 1 1 1 1 1 ...
> $ Set  : int  1 1 1 1 1 1 1 1 1 1 ...
> - attr(*, "out.attrs")=List of 2
> ..$ dim : Named int [1:2] 1000 1000
> .. ..- attr(*, "names")= chr [1:2] "X1" "X2"
> ..$ dimnames:List of 2
> .. ..$ X1: chr [1:1000] "X1=0.0008094667" "X1=0.0012360955" "X1=0.0016627243" "X1=0.0020893531" ...
> .. ..$ X2: chr [1:1000] "X2= 2.00" "X2= 2.048048" "X2= 2.096096" "X2= 2.144144" ...
> ```
> I would like to run prop.test on df$Class, but:
> ```
> > prop.test(x=point$Class, n=length(point$Class),
> + conf.level=.95, correct=FALSE)
> Error in prop.test(x = point$Class, n = length(point$Class), conf.level = 0.95,  :
> 'x' and 'n' must have the same length
> ```
> The documentation says that `x` is "a vector of counts of successes, a
> one-dimensional table with two entries, or a two-dimensional table (or
> matrix) with 2 columns, giving the counts of successes and failures,
> respectively", so I passed point$Class as `x` and length(point$Class)
> as the total number of trials.
> There are three levels:
> ```
> > unique(df$Class)
> [1] negative  positive  uncertain
> Levels: negative positive uncertain
> ```
> I tried to remove the levels to check if the levels were interfering
> with the test:
> ```
> > df$Class = levels(droplevels(df$Class))
> Error in `$<-.data.frame`(`*tmp*`, Class, value = c("negative", "positive",  :
> replacement has 3 rows, data has 500
> ```
> What would be the syntax for this test? The idea is to get the most
> common value for each unique(df$Set), with prop.test also providing
> the 95% CI for the estimate.
> Thanks
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



[R] How to run prop.test on 3-level factors?

2021-11-16 Thread Luigi Marongiu
Hello,
I have a large database with a column containing a factor:
```
> str(df)
'data.frame': 500 obs. of  4 variables:
$ MR   : num  0.000809 0.001236 0.001663 0.002089 0.002516 ...
$ FCN  : num  2 2 2 2 2 2 2 2 2 2 ...
$ Class: Factor w/ 3 levels "negative","positive",..: 1 1 1 1 1 1 1 1 1 1 ...
$ Set  : int  1 1 1 1 1 1 1 1 1 1 ...
- attr(*, "out.attrs")=List of 2
..$ dim : Named int [1:2] 1000 1000
.. ..- attr(*, "names")= chr [1:2] "X1" "X2"
..$ dimnames:List of 2
.. ..$ X1: chr [1:1000] "X1=0.0008094667" "X1=0.0012360955" "X1=0.0016627243" "X1=0.0020893531" ...
.. ..$ X2: chr [1:1000] "X2= 2.00" "X2= 2.048048" "X2= 2.096096" "X2= 2.144144" ...
```
I would like to run prop.test on df$Class, but:
```
> prop.test(x=point$Class, n=length(point$Class),
+ conf.level=.95, correct=FALSE)
Error in prop.test(x = point$Class, n = length(point$Class), conf.level = 0.95,  :
'x' and 'n' must have the same length
```
The documentation says that `x` is "a vector of counts of successes, a
one-dimensional table with two entries, or a two-dimensional table (or
matrix) with 2 columns, giving the counts of successes and failures,
respectively", so I passed point$Class as `x` and length(point$Class)
as the total number of trials.
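
For reference, the error arises because `prop.test` expects counts, while a factor column holds one label per observation, so `x` has 500 elements and `n` has one. A minimal sketch of one possible workaround, assuming each level is treated in turn as "success" against all other levels (the simulated `Class` vector and its probabilities are hypothetical):

```r
# Hypothetical stand-in for the Class column: a 3-level factor of 500 labels
set.seed(1)
Class <- factor(sample(c("negative", "positive", "uncertain"), 500,
                       replace = TRUE, prob = c(0.6, 0.3, 0.1)))

n <- length(Class)  # total number of observations

# For each level, count its occurrences and run a one-sample prop.test
ci <- t(sapply(levels(Class), function(lv) {
  x <- sum(Class == lv)  # count of this level, i.e. the "successes"
  res <- prop.test(x = x, n = n, conf.level = 0.95, correct = FALSE)
  c(estimate = unname(res$estimate),
    lower = res$conf.int[1],
    upper = res$conf.int[2])
}))
print(ci)
```

Each row gives the observed proportion of one level with its own 95% interval. These are separate one-vs-rest intervals, not simultaneous intervals for the three proportions.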
There are three levels:
```
> unique(df$Class)
[1] negative  positive  uncertain
Levels: negative positive uncertain
```
I tried to remove the levels to check if the levels were interfering
with the test:
```
> df$Class = levels(droplevels(df$Class))
Error in `$<-.data.frame`(`*tmp*`, Class, value = c("negative", "positive",  :
replacement has 3 rows, data has 500
```
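The assignment fails because `levels()` returns only the distinct labels (three strings), not one value per row. A short sketch of the difference, using a small hypothetical data frame with the same column names:

```r
# Hypothetical small data frame mirroring the Set and Class columns
df <- data.frame(Set = rep(1:2, each = 3),
                 Class = factor(c("negative", "negative", "positive",
                                  "positive", "uncertain", "positive")))

lev <- levels(droplevels(df$Class))  # the level labels only: length 3
class_chr <- as.character(df$Class)  # one string per row: length nrow(df)

# droplevels() on its own keeps the factor and removes only unused levels
df$Class <- droplevels(df$Class)
```

Neither conversion makes the column usable as `x` in `prop.test`, which still needs counts rather than labels.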
What would be the syntax for this test? The idea is to get the most
common value for each unique(df$Set), with prop.test also providing
the 95% CI for the estimate.
Thanks
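
One way the per-Set summary described above might be sketched, assuming the most common class in each Set is treated as "success" and everything else as "failure". The simulated data frame, the `modal_ci` name and the output column names are hypothetical:

```r
set.seed(42)
# Hypothetical data: 5 sets of 100 observations each
df <- data.frame(
  Set = rep(1:5, each = 100),
  Class = factor(sample(c("negative", "positive", "uncertain"),
                        500, replace = TRUE))
)

modal_ci <- do.call(rbind, lapply(split(df, df$Set), function(d) {
  counts <- table(d$Class)
  top <- names(counts)[which.max(counts)]  # ties resolve to the first level
  res <- prop.test(x = max(counts), n = nrow(d),
                   conf.level = 0.95, correct = FALSE)
  data.frame(Set = d$Set[1], Class = top, n = nrow(d),
             proportion = unname(res$estimate),
             lower = res$conf.int[1], upper = res$conf.int[2])
}))
print(modal_ci)
```

Because the level is chosen after looking at the counts, the interval for the most common class may be somewhat optimistic; it is a rough summary rather than a formal test.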
