[GitHub] spark issue #16194: [SPARK-18767][ML] Unify Models' toString methods
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/16194 @sethah +1, I prefer not to do any update. I think this isuue will be resolved by the way after some refactor on SharedParam. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16194: [SPARK-18767][ML] Unify Models' toString methods
Github user sethah commented on the issue: https://github.com/apache/spark/pull/16194 I lean towards doing nothing, unless we can find a solution that is both generic AND lists all/only relevant information. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16194: [SPARK-18767][ML] Unify Models' toString methods
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16194 To move towards a resolution, I'd either implement one generic `toString` that enumerates params, or not do anything at this stage. Up to you @zhengruifeng --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16194: [SPARK-18767][ML] Unify Models' toString methods
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/16194 @sethah No. Just for the convenience in repl. It's somewhat confusing when the model's desc is just the uid of its trainer when I work in repl. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16194: [SPARK-18767][ML] Unify Models' toString methods
Github user sethah commented on the issue: https://github.com/apache/spark/pull/16194 One problem I see is that some models have relevant information stored in class members instead of params. LogisticRegression for example would probably list the number of classes and the number of features, neither of which are params. Also, as @zhengruifeng mentioned, there are a lot of not-useful params that we would end up listing. However, I do not like the idea of implementing something different for every single estimator/model either. I guess I'm not entirely convinced the added benefit is worth the cost. What is the motivation for this, other than that it is convenient in the repl? Are there users asking for it? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16194: [SPARK-18767][ML] Unify Models' toString methods
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16194 It still may be better than manually implementing so many toString methods. The worst case is that this carries some extra output. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16194: [SPARK-18767][ML] Unify Models' toString methods
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/16194 Good point. This make me think of the usage of `instr.logParams(params: _*)` in training instrumentation [https://github.com/apache/spark/pull/15671]. IMIO, params are copied from its trainer, and some are not significant for the model. For `KMeansModel`, I think only param `k` make sense, others are just the params to create the model. I am not sure whether we should print all shared params of model. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16194: [SPARK-18767][ML] Unify Models' toString methods
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16194 Oh that's a good point, you can enumerate the params generically? that makes a lot more sense. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16194: [SPARK-18767][ML] Unify Models' toString methods
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/16194 Sorry but I would still strongly prefer to override the toString() in Model or Params. The default implementation can provide more information about the params. And only in a few cases you need to override the default implementation. More specifically, ``` override def toString = s"${this.logName}(uid=$uid)" + this.params.map(p => p.name + ": " + $(p)).mkString(" {\n ", "\n ", "\n}") // or use this.extractParamMap().toString() ``` would give a result: ``` org.apache.spark.ml.clustering.NewModel(uid=1122) { featuresCol: features initMode: k-means|| initSteps: 2 k: 2 maxIter: 20 predictionCol: prediction seed: 1387313927 tol: 1.0E-4 } ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16194: [SPARK-18767][ML] Unify Models' toString methods
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16194 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16194: [SPARK-18767][ML] Unify Models' toString methods
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16194 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70057/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16194: [SPARK-18767][ML] Unify Models' toString methods
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16194 **[Test build #70057 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70057/consoleFull)** for PR 16194 at commit [`3b7ddfe`](https://github.com/apache/spark/commit/3b7ddfe02a00e7722b6ec41aef5ea1e69f738561). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16194: [SPARK-18767][ML] Unify Models' toString methods
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16194 **[Test build #70057 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70057/consoleFull)** for PR 16194 at commit [`3b7ddfe`](https://github.com/apache/spark/commit/3b7ddfe02a00e7722b6ec41aef5ea1e69f738561). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16194: [SPARK-18767][ML] Unify Models' toString methods
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/16194 @srowen Ok, I will update the version. @hhbyyh as Owen said, if we add a `toString` in class `Model`, we still need to override it in most algs if we want to output some specific info: for example, current `DecisionTreeClassifier` will output the depth and the number of nodes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16194: [SPARK-18767][ML] Unify Models' toString methods
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16194 Yes, it would be nicer to just have a default like `ClassName(uid=...)` at least. That however would not include the class-specific details like number of layers. I am neutral on whether that's so useful, but OK. At least, this should not say `@Since("2.1.0")` since it can only go in for 2.2.0 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16194: [SPARK-18767][ML] Unify Models' toString methods
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/16194 I haven't really checked about it, but is it possible to just provide the default `toString` implementation in some upstream abstract class, like Model? And we can do the similar thing for Estimator. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16194: [SPARK-18767][ML] Unify Models' toString methods
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16194 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16194: [SPARK-18767][ML] Unify Models' toString methods
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16194 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69789/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16194: [SPARK-18767][ML] Unify Models' toString methods
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16194 **[Test build #69789 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69789/consoleFull)** for PR 16194 at commit [`efdc165`](https://github.com/apache/spark/commit/efdc165a481bdb56ba61b702895be227aeddb05e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16194: [SPARK-18767][ML] Unify Models' toString methods
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16194 **[Test build #69789 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69789/consoleFull)** for PR 16194 at commit [`efdc165`](https://github.com/apache/spark/commit/efdc165a481bdb56ba61b702895be227aeddb05e). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org