[jira] [Commented] (SPARK-9720) spark.ml Identifiable types should have UID in toString methods

2015-09-12 Thread Bertrand Dechoux (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14742161#comment-14742161
 ] 

Bertrand Dechoux commented on SPARK-9720:
-

The pull request can be merged.

> spark.ml Identifiable types should have UID in toString methods
> ---
>
> Key: SPARK-9720
> URL: https://issues.apache.org/jira/browse/SPARK-9720
> Project: Spark
>  Issue Type: Improvement
>  Components: ML
>Reporter: Joseph K. Bradley
>Assignee: Bertrand Dechoux
>Priority: Minor
>  Labels: starter
>
> It would be nice to include the UID (instance name) in toString methods.  
> That's the default behavior for Identifiable, but some types override the 
> default toString and do not include the UID.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-9720) spark.ml Identifiable types should have UID in toString methods

2015-08-10 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14680520#comment-14680520
 ] 

Joseph K. Bradley commented on SPARK-9720:
--

Oh sorry!  I shouldn't have said print.

 spark.ml Identifiable types should have UID in toString methods
 ---

 Key: SPARK-9720
 URL: https://issues.apache.org/jira/browse/SPARK-9720
 Project: Spark
  Issue Type: Improvement
  Components: ML
Reporter: Joseph K. Bradley
Priority: Minor
  Labels: starter

 It would be nice to include the UID (instance name) in toString methods.  
 That's the default behavior for Identifiable, but some types override the 
 default toString and do not include the UID.






[jira] [Commented] (SPARK-9720) spark.ml Identifiable types should have UID in toString methods

2015-08-09 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14679228#comment-14679228
 ] 

Apache Spark commented on SPARK-9720:
-

User 'BertrandDechoux' has created a pull request for this issue:
https://github.com/apache/spark/pull/8062

 spark.ml Identifiable types should have UID in toString methods
 ---

 Key: SPARK-9720
 URL: https://issues.apache.org/jira/browse/SPARK-9720
 Project: Spark
  Issue Type: Improvement
  Components: ML
Reporter: Joseph K. Bradley
Priority: Minor
  Labels: starter

 It would be nice to print the UID (instance name) in toString methods.  
 That's the default behavior for Identifiable, but some types override the 
 default toString and do not print the UID.






[jira] [Commented] (SPARK-9720) spark.ml Identifiable types should have UID in toString methods

2015-08-09 Thread Sean Cho (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14679481#comment-14679481
 ] 

Sean Cho commented on SPARK-9720:
-

I was fooled by the word "print". I thought it meant that the toString method 
should print the uid. I suppose it was meant to return the uid. ;)

 spark.ml Identifiable types should have UID in toString methods
 ---

 Key: SPARK-9720
 URL: https://issues.apache.org/jira/browse/SPARK-9720
 Project: Spark
  Issue Type: Improvement
  Components: ML
Reporter: Joseph K. Bradley
Priority: Minor
  Labels: starter

 It would be nice to print the UID (instance name) in toString methods.  
 That's the default behavior for Identifiable, but some types override the 
 default toString and do not print the UID.






[jira] [Commented] (SPARK-9720) spark.ml Identifiable types should have UID in toString methods

2015-08-07 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662514#comment-14662514
 ] 

Joseph K. Bradley commented on SPARK-9720:
--

I like the proposal, but I don't think we should break APIs...which 
unfortunately means we will need to stick with encouragement instead of 
enforcement.  Would you mind sending a PR to update those classes with issues?

 spark.ml Identifiable types should have UID in toString methods
 ---

 Key: SPARK-9720
 URL: https://issues.apache.org/jira/browse/SPARK-9720
 Project: Spark
  Issue Type: Improvement
  Components: ML
Reporter: Joseph K. Bradley
Priority: Minor
  Labels: starter

 It would be nice to print the UID (instance name) in toString methods.  
 That's the default behavior for Identifiable, but some types override the 
 default toString and do not print the UID.






[jira] [Commented] (SPARK-9720) spark.ml Identifiable types should have UID in toString methods

2015-08-07 Thread Bertrand Dechoux (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662248#comment-14662248
 ] 

Bertrand Dechoux commented on SPARK-9720:
-

I might be misunderstanding, but isn't that already the case on the master branch?

https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/util/Identifiable.scala
trait Identifiable {
  // Unique ID of the instance; the default toString simply returns it.
  val uid: String
  override def toString: String = uid
}

And many Identifiables have a default constructor using 
Identifiable.randomUID(keyword) for uid.
https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/evaluation/BinaryClassificationEvaluator.scala

Do you have specific counter examples?

 spark.ml Identifiable types should have UID in toString methods
 ---

 Key: SPARK-9720
 URL: https://issues.apache.org/jira/browse/SPARK-9720
 Project: Spark
  Issue Type: Improvement
  Components: ML
Reporter: Joseph K. Bradley
Priority: Minor
  Labels: starter

 It would be nice to print the UID (instance name) in toString methods.






[jira] [Commented] (SPARK-9720) spark.ml Identifiable types should have UID in toString methods

2015-08-07 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662253#comment-14662253
 ] 

Joseph K. Bradley commented on SPARK-9720:
--

Good point about the default toString.  The main problem is from a few classes 
overriding toString.  I'll state that in the description.
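
As a rough illustration of that problem, here is a minimal standalone sketch. The Identifiable trait below is only a stand-in for the spark.ml one, and the two model classes, their fields, and their toString formats are hypothetical, not taken from Spark:

```scala
// Stand-in for spark.ml's Identifiable, reduced to its default toString.
trait Identifiable {
  val uid: String
  override def toString: String = uid
}

// Hypothetical model that overrides toString and drops the UID,
// which is the pattern this issue is about.
class TreeModelWithoutUid(override val uid: String, val numNodes: Int)
    extends Identifiable {
  override def toString: String = s"TreeModel with $numNodes nodes"
}

// Hypothetical variant that keeps its custom summary but includes the UID.
class TreeModelWithUid(override val uid: String, val numNodes: Int)
    extends Identifiable {
  override def toString: String = s"TreeModel (uid=$uid) with $numNodes nodes"
}
```

With the first class, two instances with the same node count produce identical strings; with the second, the UID keeps them distinguishable.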

 spark.ml Identifiable types should have UID in toString methods
 ---

 Key: SPARK-9720
 URL: https://issues.apache.org/jira/browse/SPARK-9720
 Project: Spark
  Issue Type: Improvement
  Components: ML
Reporter: Joseph K. Bradley
Priority: Minor
  Labels: starter

 It would be nice to print the UID (instance name) in toString methods.






[jira] [Commented] (SPARK-9720) spark.ml Identifiable types should have UID in toString methods

2015-08-07 Thread Bertrand Dechoux (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662333#comment-14662333
 ] 

Bertrand Dechoux commented on SPARK-9720:
-

I could take care of it.

Here is the list of classes that override toString (only in spark.ml):
* DecisionTreeClassificationModel
* DecisionTreeRegressionModel
* GBTClassificationModel
* GBTRegressionModel
* NaiveBayesModel
* RFormula
* RFormulaModel
* RandomForestClassificationModel
* RandomForestRegressionModel

The question is whether we want to enforce that Identifiable types are 
identifiable by their toString.
It does make sense. The follow-up question is whether we can introduce a 
potentially API-breaking change in order to do it.

If the answer is yes, the easy way would be to make Identifiable.toString 
final and compose it with an overridable suffix that is empty by default:

private[spark] trait Identifiable {

  /**
   * An immutable unique ID for the object and its derivatives.
   */
  val uid: String
  
  def toStringSuffix: String = ""

  override final def toString: String = uid + toStringSuffix
}
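
For illustration only, a subclass under this proposal might look like the sketch below. The trait is repeated without private[spark] so that it compiles on its own, and ExampleTreeModel and ExampleEvaluator are hypothetical names, not Spark classes:

```scala
// The proposed trait, repeated without private[spark] so it compiles standalone.
trait Identifiable {
  val uid: String

  // Hook for subclasses; empty by default.
  def toStringSuffix: String = ""

  // Final, so every subclass is guaranteed to start its toString with the UID.
  override final def toString: String = uid + toStringSuffix
}

// Hypothetical model that adds detail through the suffix hook.
class ExampleTreeModel(override val uid: String, val numNodes: Int)
    extends Identifiable {
  override def toStringSuffix: String = s" with $numNodes nodes"
}

// Hypothetical type that keeps the default empty suffix.
class ExampleEvaluator(override val uid: String) extends Identifiable
```

Because toString is final here, a subclass that tried to override it directly would no longer compile, which is the API break in question.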

Is there a committer that could validate this proposal?

 spark.ml Identifiable types should have UID in toString methods
 ---

 Key: SPARK-9720
 URL: https://issues.apache.org/jira/browse/SPARK-9720
 Project: Spark
  Issue Type: Improvement
  Components: ML
Reporter: Joseph K. Bradley
Priority: Minor
  Labels: starter

 It would be nice to print the UID (instance name) in toString methods.  
 That's the default behavior for Identifiable, but some types override the 
 default toString and do not print the UID.






[jira] [Commented] (SPARK-9720) spark.ml Identifiable types should have UID in toString methods

2015-08-06 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661218#comment-14661218
 ] 

Joseph K. Bradley commented on SPARK-9720:
--

Say you have a big Pipeline, run it, and get a failure saying some model type 
MyModel failed at point X.  You may have multiple instances of MyModel in the 
Pipeline, and you will have no idea which of those instances caused the 
failure.  It'd be nice to know which one, and the UID provides that.
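
A minimal standalone sketch of that scenario follows. It does not use Spark: the Identifiable trait and randomUID helper only mimic the spark.ml ones, and MyModel, UidDemo, and failureMessage are hypothetical names chosen for the example:

```scala
import java.util.UUID

// Stand-in for spark.ml's Identifiable; not the real trait.
trait Identifiable {
  val uid: String
  override def toString: String = uid
}

object Identifiable {
  // Mimics the idea of a readable prefix plus a random suffix.
  def randomUID(prefix: String): String =
    prefix + "_" + UUID.randomUUID().toString.takeRight(12)
}

// Hypothetical model type; several instances may sit in one pipeline.
class MyModel(override val uid: String) extends Identifiable {
  def this() = this(Identifiable.randomUID("myModel"))
}

object UidDemo {
  // Builds the kind of message a failing pipeline stage might report.
  def failureMessage(stage: Identifiable): String =
    s"Stage $stage failed"

  def main(args: Array[String]): Unit = {
    val first = new MyModel()
    val second = new MyModel()
    // Same type, but the UID in each message shows which instance failed.
    println(failureMessage(first))
    println(failureMessage(second))
  }
}
```

The two printed messages share the "myModel_" prefix but end in different random suffixes, so the failing instance can be matched back to its place in the pipeline.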

 spark.ml Identifiable types should have UID in toString methods
 ---

 Key: SPARK-9720
 URL: https://issues.apache.org/jira/browse/SPARK-9720
 Project: Spark
  Issue Type: Improvement
  Components: ML
Reporter: Joseph K. Bradley
Priority: Minor
  Labels: starter

 It would be nice to print the UID (instance name) in toString methods.






[jira] [Commented] (SPARK-9720) spark.ml Identifiable types should have UID in toString methods

2015-08-06 Thread Sean Cho (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661195#comment-14661195
 ] 

Sean Cho commented on SPARK-9720:
-

Can I ask why this is required and if it is a good idea?

 spark.ml Identifiable types should have UID in toString methods
 ---

 Key: SPARK-9720
 URL: https://issues.apache.org/jira/browse/SPARK-9720
 Project: Spark
  Issue Type: Improvement
  Components: ML
Reporter: Joseph K. Bradley
Priority: Minor
  Labels: starter

 It would be nice to print the UID (instance name) in toString methods.


