[ 
https://issues.apache.org/jira/browse/SPARK-16851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhengruifeng updated SPARK-16851:
---------------------------------
    Description: 
{code}
import org.apache.spark.ml.classification.RandomForestClassifier
val path = 
"./spark-2.0.0-bin-hadoop2.7/data/mllib/sample_multiclass_classification_data.txt"
val data = spark.read.format("libsvm").load(path)
val rf = new RandomForestClassifier()
val model = rf.fit(data)

model.numClasses
res48: Int = 3

model.setThresholds(Array(0.5,0.1))
res49: org.apache.spark.ml.classification.RandomForestClassificationModel = 
RandomForestClassificationModel (uid=rfc_b39da354ac8b) with 20 trees


model.transform(data)
java.lang.IllegalArgumentException: requirement failed: 
RandomForestClassificationModel.transform() called with non-matching numClasses 
and thresholds.length. numClasses=3, but thresholds has length 2
  at scala.Predef$.require(Predef.scala:224)
  at 
org.apache.spark.ml.classification.ProbabilisticClassificationModel.transform(ProbabilisticClassifier.scala:101)
  ... 58 elided
{code}

Although a model set with thresholds of the wrong length will fail at prediction 
time, it may be nicer to raise the exception earlier, in {setThresholds}.

  was:
{code}
val path = 
"./spark-2.0.0-bin-hadoop2.7/data/mllib/sample_multiclass_classification_data.txt"
val data = spark.read.format("libsvm").load(path)
val rf = new RandomForestClassifier()
val model = rf.fit(data)

model.numClasses
res48: Int = 3

model.setThresholds(Array(0.5,0.1))
res49: org.apache.spark.ml.classification.RandomForestClassificationModel = 
RandomForestClassificationModel (uid=rfc_b39da354ac8b) with 20 trees
{{code}

model.transform(data)
java.lang.IllegalArgumentException: requirement failed: 
RandomForestClassificationModel.transform() called with non-matching numClasses 
and thresholds.length. numClasses=3, but thresholds has length 2
  at scala.Predef$.require(Predef.scala:224)
  at 
org.apache.spark.ml.classification.ProbabilisticClassificationModel.transform(ProbabilisticClassifier.scala:101)
  ... 58 elided
{{code}}

Although a model set with thresholds of the wrong length will fail at prediction 
time, it may be nicer to raise the exception earlier, in {setThresholds}.


> Incorrect threshold length in 'setThresholds()' should raise an Exception
> -------------------------------------------------------------------------
>
>                 Key: SPARK-16851
>                 URL: https://issues.apache.org/jira/browse/SPARK-16851
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>            Reporter: zhengruifeng
>            Priority: Trivial
>
> {code}
> import org.apache.spark.ml.classification.RandomForestClassifier
> val path = 
> "./spark-2.0.0-bin-hadoop2.7/data/mllib/sample_multiclass_classification_data.txt"
> val data = spark.read.format("libsvm").load(path)
> val rf = new RandomForestClassifier()
> val model = rf.fit(data)
> model.numClasses
> res48: Int = 3
> model.setThresholds(Array(0.5,0.1))
> res49: org.apache.spark.ml.classification.RandomForestClassificationModel = 
> RandomForestClassificationModel (uid=rfc_b39da354ac8b) with 20 trees
> model.transform(data)
> java.lang.IllegalArgumentException: requirement failed: 
> RandomForestClassificationModel.transform() called with non-matching 
> numClasses and thresholds.length. numClasses=3, but thresholds has length 2
>   at scala.Predef$.require(Predef.scala:224)
>   at 
> org.apache.spark.ml.classification.ProbabilisticClassificationModel.transform(ProbabilisticClassifier.scala:101)
>   ... 58 elided
> {code}
> Although a model set with thresholds of the wrong length will fail at prediction 
> time, it may be nicer to raise the exception earlier, in {setThresholds}.
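
The early check requested above could look roughly like the sketch below. It is plain Scala with no Spark dependency, and SketchModel, getThresholds and EarlyCheckDemo are hypothetical names made up for illustration; this is not Spark's actual implementation. The sketch assumes that the number of classes is already known when setThresholds is called, so that a length mismatch can be rejected by the setter instead of by transform().

```scala
// Hypothetical stand-in for a fitted classification model (not a Spark class).
final class SketchModel(val numClasses: Int) {
  private var thresholds: Option[Array[Double]] = None

  // Validate the length when the value is set, so the mistake surfaces here
  // rather than later at prediction time.
  def setThresholds(value: Array[Double]): this.type = {
    require(value.length == numClasses,
      "setThresholds() called with non-matching numClasses and thresholds.length. " +
        s"numClasses=$numClasses, but thresholds has length ${value.length}")
    thresholds = Some(value.clone())
    this
  }

  // Returns a defensive copy of the stored thresholds, if any were set.
  def getThresholds: Option[Array[Double]] = thresholds.map(_.clone())
}

object EarlyCheckDemo {
  def main(args: Array[String]): Unit = {
    val model = new SketchModel(numClasses = 3)

    // Accepted: the array length matches numClasses.
    model.setThresholds(Array(0.5, 0.1, 0.4))

    // Rejected: require throws IllegalArgumentException for a length of 2.
    val rejected =
      try { model.setThresholds(Array(0.5, 0.1)); false }
      catch { case _: IllegalArgumentException => true }

    println(s"short array rejected at set time: $rejected")
  }
}
```

With a check of this kind, the call with Array(0.5, 0.1) from the report would fail immediately instead of returning the model and failing later in transform().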



