I used the SMOTE (Synthetic Minority Oversampling Technique) algorithm
in R for class balancing. My data set has 13000 rows, of which 7%
belong to the minority class. After applying SMOTE, the minority class
makes up 42% of the sample and the sample has 12655 rows. I now need
to fit a logistic regression on this data, so I need to divide the
sample for cross-validation and testing. I tried two approaches:

a.) Train on the sample obtained after SMOTE and test on the
original sample of 13000 rows.

b.) Divide the sample obtained after SMOTE into train and test sets,
and do the fitting and testing on this data set only.

With the first approach my results might be skewed, so which approach
should I take, and why?
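
To make the two approaches concrete, here is a minimal sketch in base
R. It is not my actual code: the data are simulated, the predictor
names x1 and x2 and the 0.5 cutoff are placeholders, and smote_sketch()
and balance_to() are hypothetical stand-ins for a real SMOTE
implementation. The stand-in only oversamples the minority class, so
the row counts will not match the 12655 mentioned above.

```r
set.seed(1)

# Simulated stand-in for the real data: 13000 rows, roughly 7% minority (y = 1).
n  <- 13000
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, 1, plogis(-3.5 + 1.2 * x1 - 0.8 * x2))
orig <- data.frame(x1, x2, y)

# Hypothetical helper: bare-bones SMOTE-style oversampling for numeric
# predictors. Each synthetic row lies on the segment between a minority row
# and one of its k nearest minority neighbours.
smote_sketch <- function(df, target, n_new, k = 5) {
  X <- as.matrix(df[df[[target]] == 1, setdiff(names(df), target)])
  rownames(X) <- NULL
  d <- as.matrix(dist(X))   # pairwise distances among minority rows
  diag(d) <- Inf            # a row is not its own neighbour
  base <- sample(nrow(X), n_new, replace = TRUE)
  nb <- vapply(base, function(i) {
    nn <- order(d[i, ])[seq_len(k)]
    nn[sample.int(k, 1)]
  }, integer(1))
  gap <- runif(n_new)       # one interpolation weight per synthetic row
  Xb <- X[base, , drop = FALSE]
  synth <- as.data.frame(Xb + gap * (X[nb, , drop = FALSE] - Xb))
  synth[[target]] <- 1
  rbind(df, synth)
}

# Hypothetical helper: add synthetic rows until the minority share is `share`.
balance_to <- function(df, share = 0.42) {
  n_min <- sum(df$y == 1)
  n_new <- round((share * nrow(df) - n_min) / (1 - share))
  smote_sketch(df, "y", n_new)
}

# Approach a: SMOTE on all rows, fit, then test on the original rows.
# Note that the original rows are also part of the training sample here.
bal   <- balance_to(orig)
fit_a <- glm(y ~ x1 + x2, data = bal, family = binomial)
acc_a <- mean((predict(fit_a, orig, type = "response") > 0.5) == orig$y)

# Approach b: split the SMOTE sample itself into train (70%) and test (30%).
idx   <- sample(nrow(bal), round(0.7 * nrow(bal)))
fit_b <- glm(y ~ x1 + x2, data = bal[idx, ], family = binomial)
acc_b <- mean((predict(fit_b, bal[-idx, ], type = "response") > 0.5) ==
                bal$y[-idx])

cat("minority share after balancing:", mean(bal$y == 1), "\n")
cat("accuracy, approach a:", acc_a, "\n")
cat("accuracy, approach b:", acc_b, "\n")
```

The two accuracies are measured on test sets with different class
ratios (about 7% versus about 42% minority), so they are not directly
comparable.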
-- 
Vijay Goel
*+91-7501378852*

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
