[ https://issues.apache.org/jira/browse/SPARK-15092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nick Pentreath resolved SPARK-15092.
------------------------------------
       Resolution: Fixed
    Fix Version/s: 2.0.0

Issue resolved by pull request 12919
[https://github.com/apache/spark/pull/12919]

> toDebugString missing from ML DecisionTreeClassifier
> ----------------------------------------------------
>
>                 Key: SPARK-15092
>                 URL: https://issues.apache.org/jira/browse/SPARK-15092
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>    Affects Versions: 1.6.0
>         Environment: HDP 2.3.4, Red Hat 6.7
>            Reporter: Ivan SPM
>            Assignee: holdenk
>            Priority: Minor
>              Labels: features
>             Fix For: 2.0.0
>
>
> The attribute toDebugString is missing from the DecisionTreeClassifier and
> DecisionTreeClassifierModel from ML. The attribute exists on the MLLib
> DecisionTree model.
> There's no way to check or print the model tree structure from the ML.
> The basic code for it is this:
> from pyspark.ml import Pipeline
> from pyspark.ml.feature import VectorAssembler, StringIndexer
> from pyspark.ml.classification import DecisionTreeClassifier
> from pyspark.ml.evaluation import MulticlassClassificationEvaluator
> cl = DecisionTreeClassifier(labelCol='target_idx', featuresCol='features')
> pipe = Pipeline(stages=[target_index, assembler, cl])
> model = pipe.fit(df_train)
> # Prediction and model evaluation
> predictions = model.transform(df_test)
> mc_evaluator = MulticlassClassificationEvaluator(
>     labelCol="target_idx", predictionCol="prediction", metricName="precision" )
> accuracy = mc_evaluator.evaluate(predictions)
> print("Test Error = {}".format(1.0 - accuracy))
> now it would be great to be able to do what is being done on the MLLib model:
> print model.toDebugString(),  # it already has newline
> DecisionTreeModel classifier of depth 1 with 3 nodes
>   If (feature 0 <= 0.0)
>    Predict: 0.0
>   Else (feature 0 > 0.0)
>    Predict: 1.0
> but there's no toDebugString attribute on either the pipeline model or the
> DecisionTreeClassifier model:
> cl.toDebugString()
> Attribute Error
> https://spark.apache.org/docs/1.6.0/api/python/_modules/pyspark/mllib/tree.html
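
The report asks for the tree description from either the fitted pipeline or the tree model itself. A minimal sketch of such a lookup follows. It assumes that a fitted pipeline exposes its stages as `.stages` and that the tree stage carries a `toDebugString` member. This message does not say whether the 2.0.0 fix exposes it as a method or as a property, so the helper accepts both. The helper name `tree_debug_string` and the stand-in objects are hypothetical, used only so the sketch runs without a Spark session.

```python
from types import SimpleNamespace


def tree_debug_string(fitted):
    """Return the tree description from a fitted model or fitted pipeline.

    Assumption: a pipeline exposes `.stages`, and a tree model exposes
    `toDebugString` either as a method or as a plain attribute/property.
    """
    # A bare model is treated as a one-stage pipeline.
    stages = getattr(fitted, "stages", None) or [fitted]
    for stage in stages:
        debug = getattr(stage, "toDebugString", None)
        if debug is None:
            continue  # e.g. an indexer or assembler stage
        return debug() if callable(debug) else debug
    raise AttributeError("no stage exposes toDebugString")


# Hypothetical stand-ins for a fitted pipeline; not real Spark objects.
indexer = SimpleNamespace()
tree = SimpleNamespace(
    toDebugString="DecisionTreeModel classifier of depth 1 with 3 nodes\n"
)
pipeline_model = SimpleNamespace(stages=[indexer, tree])

print(tree_debug_string(pipeline_model), end="")
```

With a real fitted pipeline, the same call would be `tree_debug_string(model)`, where `model` is the result of `pipe.fit(df_train)` in the report.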