Hi, I am trying to model credit risk data using decision trees. Because defaulters are far fewer than non-defaulters (defaulters are around 10% of the data), we have a class imbalance problem. Consequently, the confusion matrix shows that the number of misclassified non-defaulters is large. Classifying a defaulter as a non-defaulter is the more expensive error. How does one pass this information (a penalty matrix) to the rpart function?
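To make the question concrete, here is a minimal sketch of how such a penalty might be supplied, assuming the `loss` component of rpart's `parms` argument is the intended mechanism. The data frame `credit`, its columns, and the 5:1 cost ratio are all invented for illustration and are not taken from the actual data.

```r
library(rpart)

# Hypothetical data: 'credit' and its columns are made up for illustration.
set.seed(1)
n <- 1000
credit <- data.frame(
  income     = rnorm(n, mean = 50, sd = 15),
  debt_ratio = runif(n)
)

# Simulate a minority of defaulters, more likely with high debt and low income.
p <- plogis(-4 + 3 * credit$debt_ratio - 0.02 * (credit$income - 50))
credit$status <- factor(ifelse(runif(n) < p, "default", "good"),
                        levels = c("good", "default"))

# Loss matrix: rows = true class, columns = predicted class,
# both in the order of the factor levels ("good", "default").
# The diagonal must be zero. Calling a true defaulter "good" costs 5,
# the opposite error costs 1 (an assumed ratio, not an estimate).
loss <- matrix(c(0, 1,
                 5, 0),
               nrow = 2, byrow = TRUE,
               dimnames = list(true = levels(credit$status),
                               predicted = levels(credit$status)))

# Fit a classification tree that uses the loss matrix when splitting
# and when labelling nodes.
fit <- rpart(status ~ income + debt_ratio, data = credit,
             method = "class", parms = list(loss = loss))

# Confusion matrix on the training data.
pred <- predict(fit, credit, type = "class")
table(true = credit$status, predicted = pred)
```

The ordering of rows and columns follows `levels(credit$status)`, so it is worth checking the factor levels before building the matrix. The `prior` component of `parms` may be an alternative way to reweight the classes.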
Thanks and regards,
Dr S Muralidharan
Chief Scientist, Tata Consultancy Services