[GitHub] spark pull request: [SPARK-14862][ML] Updated Classifiers to not r...

sethah Tue, 26 Apr 2016 08:49:01 -0700

Github user sethah commented on the pull request:

    https://github.com/apache/spark/pull/12663#issuecomment-214790032
  
    Out of curiosity, if or when 
[SPARK-7126](https://issues.apache.org/jira/browse/SPARK-7126) is implemented, 
do we plan to remove this behavior? 
    
    Regarding small datasets and cross validation, it is a bit concerning that 
the model could be trained with an incorrect number of classes. Because this 
would happen silently, it could cause some confusion. However, I think it is 
reasonable to expect end users to realize that some splits of their data could 
be missing some label values, and unless the number of classes is specified 
explicitly, the algorithm has no way to know it.
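
    To make the concern concrete, here is a minimal sketch in plain Python (not 
Spark code). It assumes the class count is inferred from the training data as 
the largest label index plus one. The helper names `infer_num_classes` and 
`k_fold_training_labels` are hypothetical, and the fold assignment is a simple 
round-robin chosen only for illustration.

```python
# Hypothetical illustration, not Spark's implementation.
# Assumption: the class count is inferred as max(label) + 1.
def infer_num_classes(labels):
    return int(max(labels)) + 1


def k_fold_training_labels(labels, k):
    """Yield the training labels for each of k round-robin folds."""
    folds = [labels[i::k] for i in range(k)]
    for held_out in range(k):
        yield [y for i, fold in enumerate(folds) if i != held_out for y in fold]


# Small dataset in which class 2 appears only once.
labels = [0, 0, 0, 1, 1, 1, 2]

full = infer_num_classes(labels)
per_fold = [infer_num_classes(train) for train in k_fold_training_labels(labels, 3)]

print(full)      # class count seen on the full dataset
print(per_fold)  # class count seen by each training split
```

When the only example of class 2 lands in the held-out fold, that training 
split infers one class fewer than the full dataset, and nothing signals the 
mismatch.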



