[ https://issues.apache.org/jira/browse/SPARK-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen updated SPARK-11259:
------------------------------
    Target Version/s:   (was: 1.6.1)
            Priority: Minor  (was: Major)
          Issue Type: Improvement  (was: Bug)

> Params.validateParams() should be called automatically
> ------------------------------------------------------
>
>                 Key: SPARK-11259
>                 URL: https://issues.apache.org/jira/browse/SPARK-11259
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>            Reporter: Yanbo Liang
>            Assignee: Yanbo Liang
>            Priority: Minor
>
> Params.validateParams() is currently not called automatically. For example,
> the following code snippet does not throw an exception, which is unexpected
> because min is set greater than max.
> {code}
> import org.apache.spark.ml.Pipeline
> import org.apache.spark.ml.feature.MinMaxScaler
> import org.apache.spark.mllib.linalg.Vectors
>
> val df = sqlContext.createDataFrame(
>   Seq(
>     (1, Vectors.dense(0.0, 1.0, 4.0), 1.0),
>     (2, Vectors.dense(1.0, 0.0, 4.0), 2.0),
>     (3, Vectors.dense(1.0, 0.0, 5.0), 3.0),
>     (4, Vectors.dense(0.0, 0.0, 5.0), 4.0))
> ).toDF("id", "features", "label")
> // Invalid configuration: min (10) is greater than max (0).
> val scaler = new MinMaxScaler()
>   .setInputCol("features")
>   .setOutputCol("features_scaled")
>   .setMin(10)
>   .setMax(0)
> val pipeline = new Pipeline().setStages(Array(scaler))
> // Expected to fail parameter validation, but no exception is thrown.
> pipeline.fit(df)
> {code}
> validateParams() should be called by
> PipelineStage (Pipeline/Estimator/Transformer) automatically, so I propose to
> put it in transformSchema().
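
The proposal above (running validateParams() from transformSchema()) can be sketched without Spark as follows. This is only an illustration of the pattern, not Spark's implementation: SketchStage, SketchMinMaxScaler, and deriveSchema are hypothetical names, and a schema is modeled as a plain list of column names.

```scala
// Hypothetical stand-ins for illustration only; these are NOT Spark's classes.
trait SketchStage {
  /** Checks parameter interactions; throws IllegalArgumentException if invalid. */
  def validateParams(): Unit = ()

  /** Stage-specific schema logic, supplied by each concrete stage. */
  protected def deriveSchema(schema: Seq[String]): Seq[String]

  /** Validates parameters first, so no caller can skip the check. */
  final def transformSchema(schema: Seq[String]): Seq[String] = {
    validateParams()
    deriveSchema(schema)
  }
}

// Assumed rule for this sketch: min must be strictly less than max.
class SketchMinMaxScaler(min: Double, max: Double, outputCol: String)
    extends SketchStage {

  override def validateParams(): Unit =
    require(min < max, s"min ($min) must be less than max ($max)")

  // Appends the output column name to the incoming column list.
  override protected def deriveSchema(schema: Seq[String]): Seq[String] =
    schema :+ outputCol
}
```

With this shape, a stage configured like the one in the report (min = 10, max = 0) would fail as soon as its schema is checked, before any data is processed, because require throws an IllegalArgumentException.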