Wojciech Jurczyk created SPARK-13030:
----------------------------------------

             Summary: Change OneHotEncoder to Estimator
                 Key: SPARK-13030
                 URL: https://issues.apache.org/jira/browse/SPARK-13030
             Project: Spark
          Issue Type: Improvement
          Components: ML
    Affects Versions: 1.6.0
            Reporter: Wojciech Jurczyk


OneHotEncoder should be an Estimator, just like in scikit-learn (http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html).

In its current form, it cannot be used when the number of categories differs between the training dataset and the test dataset.
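
The mismatch described above can be illustrated with a toy sketch in plain Python. This is not Spark's API: `stateless_encode`, `OneHotEstimator`, and `OneHotModel` are hypothetical names, and the sketch assumes the problem arises because a stateless encoder derives the output vector width from whatever data it is given, whereas an Estimator would learn the width once during `fit` and reuse it.

```python
from typing import List, Sequence


def one_hot(index: int, size: int) -> List[int]:
    """Return a 0/1 vector of length `size`; all zeros if `index` is out of range."""
    return [1 if i == index else 0 for i in range(size)]


def stateless_encode(indices: Sequence[int]) -> List[List[int]]:
    """Transformer-style (assumed behaviour): width is inferred from the input itself."""
    size = max(indices) + 1
    return [one_hot(i, size) for i in indices]


class OneHotModel:
    """Hypothetical fitted model that remembers the width learned during fit."""

    def __init__(self, size: int) -> None:
        self.size = size

    def transform(self, indices: Sequence[int]) -> List[List[int]]:
        return [one_hot(i, self.size) for i in indices]


class OneHotEstimator:
    """Hypothetical Estimator: fit() learns the number of categories once."""

    def fit(self, indices: Sequence[int]) -> OneHotModel:
        return OneHotModel(max(indices) + 1)


train = [0, 1, 2, 1]  # three categories seen in training
test = [0, 1]         # only two categories appear in the test set

# Stateless encoding: train and test vectors end up with different widths.
train_width_stateless = len(stateless_encode(train)[0])
test_width_stateless = len(stateless_encode(test)[0])

# Estimator-style encoding: the width is fixed by the training data.
model = OneHotEstimator().fit(train)
train_width_fitted = len(model.transform(train)[0])
test_width_fitted = len(model.transform(test)[0])

print(train_width_stateless, test_width_stateless)
print(train_width_fitted, test_width_fitted)
```

With the fitted model, both datasets are encoded to the same width, so a downstream model trained on the training vectors can consume the test vectors unchanged.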