Simon Tao created SPARK-37285:
---------------------------------

             Summary: Add Weight of Evidence and Information value to ml.feature
                 Key: SPARK-37285
                 URL: https://issues.apache.org/jira/browse/SPARK-37285
             Project: Spark
          Issue Type: New Feature
          Components: ML
    Affects Versions: 3.2.0
            Reporter: Simon Tao

The weight of evidence (WOE) and information value (IV) provide a useful framework for exploratory analysis and variable screening for binary classifiers. They help in several ways:

1. They help check the linear relationship between a feature and the dependent variable used in the model.
2. WOE is a good variable transformation method for both continuous and categorical features.
3. WOE is preferable to one-hot encoding because this transformation does not increase the complexity of the model.
4. They can detect both linear and non-linear relationships.
5. They are useful in feature selection.
6. IV is a good measure of the predictive power of a feature, and it also helps point out suspicious features.
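
To make the two quantities concrete, here is a minimal sketch in plain Python. It is not the proposed ml.feature API. It assumes the commonly used definitions, WOE = ln(share of non-events / share of events) per category and IV = sum over categories of (share of non-events - share of events) * WOE; some references flip the ratio, which changes the sign of WOE but not IV. The function name `woe_iv` and the `smoothing` parameter are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def woe_iv(categories, labels, smoothing=0.5):
    """Return ({category: woe}, iv) for a binary label (1 = event, 0 = non-event).

    `smoothing` is an assumed additive correction that keeps the log finite
    when a category has no events or no non-events.
    """
    events = Counter()
    non_events = Counter()
    for cat, y in zip(categories, labels):
        if y == 1:
            events[cat] += 1
        else:
            non_events[cat] += 1

    cats = sorted(set(events) | set(non_events))
    total_events = sum(events.values()) + smoothing * len(cats)
    total_non_events = sum(non_events.values()) + smoothing * len(cats)

    woe = {}
    iv = 0.0
    for cat in cats:
        # Share of all events / non-events that fall into this category.
        p_event = (events[cat] + smoothing) / total_events
        p_non_event = (non_events[cat] + smoothing) / total_non_events
        woe[cat] = math.log(p_non_event / p_event)
        # Each term is non-negative, so IV is non-negative overall.
        iv += (p_non_event - p_event) * woe[cat]
    return woe, iv


if __name__ == "__main__":
    cats = ["a"] * 4 + ["b"] * 4
    labels = [1, 1, 1, 0, 1, 0, 0, 0]
    woe, iv = woe_iv(cats, labels, smoothing=0.0)
    print(woe, iv)
```

Replacing each category with its WOE value yields a single numeric column, which is the sense in which the transformation avoids the extra columns of one-hot encoding. Continuous features would need to be binned first; the binning strategy is left out of this sketch.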