On Mon, 16 Feb 2015, Rodica Coderie via R-help wrote:
Hello,
I've created a ctree model called fit2 using 15 input variables for a factor
response variable, response (YES/NO).
When I run the following:
table(predict(fit2), training_data$response)
I get the following result:
       NO  YES
NO  48694  480
YES     0    0
It appears that the NO responses are predicted with 100% accuracy and
the YES responses are predicted with 0% accuracy.
Why is this happening? Is it because of my data, or is it something in
the ctree algorithm?
Your data has fewer than 1% YES observations (480 of 49174), and I would
guess that the tree cannot separate these in such a way that majority voting
within any terminal node gives a YES prediction. You might consider a
different cutoff (other than 50%) or downsampling the NO observations.
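
Both suggestions could be sketched roughly as follows. This is only an
illustration under several assumptions: it uses the partykit implementation
of ctree (where predict(..., type = "prob") returns a matrix of class
probabilities), a synthetic data set named toy in place of the real
training data, and two hypothetical helpers, predict_with_cutoff() and
downsample_no(), that are not part of any package. The cutoff of 0.05 and
the 1:1 ratio are arbitrary choices for demonstration.

```r
# Hypothetical helper: predict "YES" whenever the estimated probability of YES
# reaches `cutoff`, instead of relying on the majority vote (cutoff = 0.5).
predict_with_cutoff <- function(prob_yes, cutoff = 0.5) {
  factor(ifelse(prob_yes >= cutoff, "YES", "NO"), levels = c("NO", "YES"))
}

# Hypothetical helper: keep every YES row and a random sample of NO rows,
# so that there are about `ratio` NO rows per YES row.
downsample_no <- function(data, response = "response", ratio = 1) {
  yes_idx <- which(data[[response]] == "YES")
  no_idx  <- which(data[[response]] == "NO")
  n_keep  <- min(length(no_idx), ceiling(ratio * length(yes_idx)))
  keep_no <- no_idx[sample.int(length(no_idx), n_keep)]
  data[c(yes_idx, keep_no), , drop = FALSE]
}

# Synthetic stand-in for the real data (an assumption): a rare YES class
# whose probability increases with a single predictor x.
set.seed(1)
n <- 5000
x <- rnorm(n)
toy <- data.frame(
  x = x,
  response = factor(ifelse(runif(n) < plogis(-5 + 1.5 * x), "YES", "NO"),
                    levels = c("NO", "YES"))
)

# The model-fitting part only runs if partykit is installed.
if (requireNamespace("partykit", quietly = TRUE)) {
  # Option 1: fit on all data, then classify with a cutoff below 50%.
  fit <- partykit::ctree(response ~ x, data = toy)
  prob_yes <- predict(fit, type = "prob")[, "YES"]
  print(table(predicted = predict_with_cutoff(prob_yes, cutoff = 0.05),
              observed  = toy$response))

  # Option 2: fit on a downsampled training set, then predict on all data.
  balanced <- downsample_no(toy, ratio = 1)
  fit_bal  <- partykit::ctree(response ~ x, data = balanced)
  print(table(predicted = predict(fit_bal, newdata = toy),
              observed  = toy$response))
}
```

Either approach trades some accuracy on the NO class for the ability to
flag YES cases at all, and probabilities from a tree fitted on downsampled
data no longer reflect the original class proportions.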
Thanks!
Rodica
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.