[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user keypointt commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-213159510 Thank you for your review :) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12432
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-213158352 Nope, this LGTM. Thank you for the PR! Merging with master.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user keypointt commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-213098554 Hi @jkbradley, I just fixed the points you mentioned. Is there anything else I should do?
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user keypointt commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-212625966 Hi @jkbradley, I just fixed the points you mentioned. Is there anything else I should do?
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-212624337 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56410/ Test PASSed.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-212624334 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-212624080
**[Test build #56410 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56410/consoleFull)** for PR 12432 at commit [`5ef6f70`](https://github.com/apache/spark/commit/5ef6f70ef16b742962184d729da3623fea1d703b).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-212608802 **[Test build #56410 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56410/consoleFull)** for PR 12432 at commit [`5ef6f70`](https://github.com/apache/spark/commit/5ef6f70ef16b742962184d729da3623fea1d703b).
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-212606502 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-212606493
**[Test build #56409 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56409/consoleFull)** for PR 12432 at commit [`9c95790`](https://github.com/apache/spark/commit/9c95790a4cbef9a7bc5c55e9da9c8b095b9c6e44).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-212606503 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56409/ Test FAILed.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-212606188 **[Test build #56409 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56409/consoleFull)** for PR 12432 at commit [`9c95790`](https://github.com/apache/spark/commit/9c95790a4cbef9a7bc5c55e9da9c8b095b9c6e44).
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-212591688 I just had a few small comments.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/12432#discussion_r60481691
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala ---
@@ -238,7 +246,9 @@ class KMeans private (
 /**
 * Implementation of K-Means algorithm.
 */
- private def runAlgorithm(data: RDD[VectorWithNorm]): KMeansModel = {
+ private def runAlgorithm(
+data: RDD[VectorWithNorm],
--- End diff --
Indent 2 more spaces (this and the next line).
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/12432#discussion_r60481662
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala ---
@@ -206,12 +208,18 @@ class KMeans private (
 this
 }
+ def run(data: RDD[Vector]): KMeansModel = {
--- End diff --
This is the public method, so it needs to have the documentation and Since tag. The private version does not.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/12432#discussion_r60481707
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala ---
@@ -274,6 +284,10 @@ class KMeans private (
 val iterationStartTime = System.nanoTime()
+if (!instr.isEmpty) {
--- End diff --
simpler: ```instr.map(_.logNumFeatures(...))```
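The suggestion above relies on `Option` running a function only when a value is present. Below is a minimal sketch of the two forms, assuming a hypothetical `Instr` class in place of Spark's `Instrumentation`; only the method name `logNumFeatures` comes from the thread. The sketch uses `foreach` rather than the `map` mentioned in the comment because the result is discarded; both skip the call when the option is `None`.

```scala
object OptionalInstrumentationSketch {
  // Hypothetical stand-in for Spark's Instrumentation: it just records messages.
  final class Instr {
    private var logged = Vector.empty[String]
    def logNumFeatures(n: Int): Unit = logged = logged :+ s"numFeatures=$n"
    def messages: Vector[String] = logged
  }

  // Explicit emptiness check, in the style of the patch under review.
  def logVerbose(instr: Option[Instr], numFeatures: Int): Unit = {
    if (!instr.isEmpty) {
      instr.get.logNumFeatures(numFeatures)
    }
  }

  // Equivalent behavior: the function body runs only for Some(...).
  def logConcise(instr: Option[Instr], numFeatures: Int): Unit =
    instr.foreach(_.logNumFeatures(numFeatures))
}
```

Calling either method with `None` is a no-op, so callers that have no instrumentation object do not need a separate code path.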
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-212239669 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56316/ Test PASSed.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-212239667 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-212239615
**[Test build #56316 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56316/consoleFull)** for PR 12432 at commit [`61cb1de`](https://github.com/apache/spark/commit/61cb1decf6fdd03066709d04f88a077cf5d22c21).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-212226074 **[Test build #56316 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56316/consoleFull)** for PR 12432 at commit [`61cb1de`](https://github.com/apache/spark/commit/61cb1decf6fdd03066709d04f88a077cf5d22c21).
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user keypointt commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-212217086 Hi @jkbradley, I removed null and now use Option. Could you please have a look and check whether it is OK now? Thanks a lot.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-212205981 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-212205982 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56302/ Test PASSed.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-212205896
**[Test build #56302 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56302/consoleFull)** for PR 12432 at commit [`d427bb3`](https://github.com/apache/spark/commit/d427bb3ad6ad20d9ba41226d585d6192d4f59029).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-212190707 **[Test build #56302 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56302/consoleFull)** for PR 12432 at commit [`d427bb3`](https://github.com/apache/spark/commit/d427bb3ad6ad20d9ba41226d585d6192d4f59029).
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-212156488 @keypointt Thanks for the updates. I'll check back later!
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/12432#discussion_r60324319
--- Diff: project/MimaExcludes.scala ---
@@ -626,6 +626,9 @@ object MimaExcludes {
 // [SPARK-13048][ML][MLLIB] keepLastCheckpoint option for LDA EM optimizer
 ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.mllib.clustering.DistributedLDAModel.this")
 ) ++ Seq(
+// [SPARK-14569][ML] Log instrumentation in KMeans
+ ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.mllib.clustering.KMeans.run")
--- End diff --
This should not be needed after the changes I suggested above.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/12432#discussion_r60324292
--- Diff: mllib/src/main/scala/org/apache/spark/ml/util/Instrumentation.scala ---
@@ -39,7 +39,7 @@ import org.apache.spark.sql.Dataset
 * @param dataset the training dataset
 * @tparam E the type of the estimator
 */
-private[ml] class Instrumentation[E <: Estimator[_]] private (
+class Instrumentation[E <: Estimator[_]] private (
--- End diff --
Change to ```private[spark]``` rather than making it public
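A qualified modifier such as `private[spark]` widens visibility to an enclosing scope without making the class public. The sketch below imitates the package layout with nested objects so that it runs standalone; the names `sparklike`, `describe`, and `estimatorName` are hypothetical, and real Spark code uses packages rather than objects.

```scala
object sparklike {
  object ml {
    // Visible anywhere inside `sparklike`, hidden from code outside it.
    // With `private[ml]` instead, the reference from `mllib` below would not compile.
    private[sparklike] final class Instrumentation(val estimatorName: String)
  }

  object mllib {
    // Outside `ml` but inside `sparklike`, so the class is accessible here.
    def describe(): String = new ml.Instrumentation("KMeans").estimatorName
  }
}
```

This mirrors the situation in the thread: code under `mllib` needs a class that was previously restricted to `ml`, and widening to the common parent avoids exposing it to users.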
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/12432#discussion_r60324297
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala ---
@@ -21,6 +21,8 @@ import scala.collection.mutable.ArrayBuffer
 import org.apache.spark.annotation.Since
 import org.apache.spark.internal.Logging
+import org.apache.spark.ml.clustering
--- End diff --
Rename this to make it clear it is the new API:
```
import org.apache.spark.ml.clustering.{KMeans => NewKMeans}
```
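Renaming on import is the usual way to tell apart two classes that share a simple name. The following sketch shows the same mechanism using only the standard library, with `Map` standing in for the two `KMeans` classes; `countWords` is a hypothetical helper and not part of the patch.

```scala
object ImportRenameSketch {
  // Both the immutable and the mutable collections define `Map`;
  // the alias makes it obvious which one each call site uses.
  import scala.collection.mutable.{Map => MutableMap}

  def countWords(words: Seq[String]): Map[String, Int] = {
    val counts = MutableMap.empty[String, Int]
    words.foreach(w => counts.update(w, counts.getOrElse(w, 0) + 1))
    counts.toMap // unqualified `Map` still means the default immutable one
  }
}
```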
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/12432#discussion_r60324300
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala ---
@@ -209,9 +211,10 @@ class KMeans private (
 /**
 * Train a K-means model on the given set of points; `data` should be cached for high
 * performance, because this is an iterative algorithm.
+ * `instr` is used to log instrumentation parameters.
--- End diff --
do not include this in public API docs
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/12432#discussion_r60324316
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala ---
@@ -287,6 +291,10 @@ class KMeans private (
 val bcActiveCenters = sc.broadcast(activeCenters)
+ if (instr != null) {
+instr.logNumFeatures(bcActiveCenters.value(0)(0).vector.size)
--- End diff --
This is being logged on every iteration, but it should only be logged once. Move before the while loop, and set it using "centers".
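The feature count does not change between iterations, so the logging call can be hoisted out of the loop. Here is a rough sketch of that shape, assuming a hypothetical `Instr` recorder and a toy loop in place of the real K-means iteration; `centers` is modeled as plain arrays rather than Spark's internal types.

```scala
object LogOnceSketch {
  // Hypothetical recorder standing in for Instrumentation.
  final class Instr {
    var logged: Vector[Int] = Vector.empty
    def logNumFeatures(n: Int): Unit = logged = logged :+ n
  }

  def run(centers: Array[Array[Double]], maxIterations: Int, instr: Option[Instr]): Int = {
    // Log once, before iterating, using the dimension of the first center (if any).
    for (i <- instr; first <- centers.headOption) i.logNumFeatures(first.length)

    var iteration = 0
    while (iteration < maxIterations) {
      // Per-iteration work would go here; nothing is logged inside the loop.
      iteration += 1
    }
    iteration
  }
}
```

With five iterations the recorder ends up holding a single entry, which is the behavior the review comment asks for.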
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/12432#discussion_r60324296
--- Diff: mllib/src/main/scala/org/apache/spark/ml/util/Instrumentation.scala ---
@@ -95,7 +95,7 @@ private[ml] class Instrumentation[E <: Estimator[_]] private (
 /**
 * Some common methods for logging information about a training session.
 */
-private[ml] object Instrumentation {
+object Instrumentation {
--- End diff --
same here: ```private[spark]```
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/12432#discussion_r60324313
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala ---
@@ -238,7 +241,8 @@ class KMeans private (
 /**
 * Implementation of K-Means algorithm.
 */
- private def runAlgorithm(data: RDD[VectorWithNorm]): KMeansModel = {
+ private def runAlgorithm(data: RDD[VectorWithNorm],
--- End diff --
Please follow the Spark style guide: [https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide]. Here, for multi-line method headers, put 1 arg per line, and put the initial arg on the line below the method name. Check out surrounding code for examples.
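As an illustration of the layout described above, here is a small sketch with one parameter per line and the first parameter on the line below the method name. The method `summarize` and its parameters are hypothetical; the four-space parameter indent is my reading of the Spark style guide rather than something stated in this thread.

```scala
object MethodHeaderStyleSketch {
  // Only the shape of the header matters; the body is a placeholder.
  def summarize(
      data: Seq[Double],
      label: Option[String]): String = {
    val name = label.getOrElse("unnamed")
    s"$name: ${data.size} points"
  }
}
```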
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/12432#discussion_r60324307
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala ---
@@ -209,9 +211,10 @@ class KMeans private (
 /**
 * Train a K-means model on the given set of points; `data` should be cached for high
 * performance, because this is an iterative algorithm.
+ * `instr` is used to log instrumentation parameters.
 */
 @Since("0.8.0")
- def run(data: RDD[Vector]): KMeansModel = {
+ def run(data: RDD[Vector], instr: Instrumentation[clustering.KMeans] = null): KMeansModel = {
--- End diff --
Default arguments are not Java friendly. You'll need to do this:
```
def run(data: RDD[Vector]): KMeansModel = {
  run(data, None)
}
private[spark] def run(data: RDD[Vector], instr: Option[Instrumentation[clustering.KMeans]]): KMeansModel = ...
```
That way, we will not change the public API. Note: I'd also use Option instead of null.
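The overload pattern from the comment above can be tried without Spark. This is a minimal sketch under stated assumptions: `Trainer`, `Instr`, and the mean computation are hypothetical stand-ins, and the second overload is left without the `private[spark]` qualifier so the example compiles on its own.

```scala
object OverloadSketch {
  // Hypothetical recorder standing in for Instrumentation.
  final class Instr {
    var logged: Vector[String] = Vector.empty
    def log(msg: String): Unit = logged = logged :+ msg
  }

  final class Trainer {
    // Public entry point: the signature stays as it was, so existing
    // Scala and Java callers are unaffected.
    def run(data: Seq[Double]): Double = run(data, None)

    // Internal overload carrying the optional instrumentation.
    // In Spark this one would be restricted with `private[spark]`.
    def run(data: Seq[Double], instr: Option[Instr]): Double = {
      instr.foreach(_.log(s"numPoints=${data.size}"))
      if (data.isEmpty) 0.0 else data.sum / data.size
    }
  }
}
```

A Java caller sees two ordinary methods here, whereas a Scala default argument would force it to pass every parameter explicitly.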
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user keypointt commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-211627178 Hi @thunterdb, I made some changes and I'm not sure if this is the right way to do it. Would you mind having a look at it? Thanks a lot.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-211621515 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-211621519 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56135/ Test PASSed.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-211621243 **[Test build #56135 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56135/consoleFull)** for PR 12432 at commit [`248c8b0`](https://github.com/apache/spark/commit/248c8b00eeebad14080e0076df476b360d953b0e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-211573897 **[Test build #56135 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56135/consoleFull)** for PR 12432 at commit [`248c8b0`](https://github.com/apache/spark/commit/248c8b00eeebad14080e0076df476b360d953b0e).
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-211561004 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56115/ Test FAILed.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-211560999 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-211560381 **[Test build #56115 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56115/consoleFull)** for PR 12432 at commit [`e8acece`](https://github.com/apache/spark/commit/e8acecefee60958c8521bc82dc455061d32e1fe7). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-211522644 **[Test build #56115 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56115/consoleFull)** for PR 12432 at commit [`e8acece`](https://github.com/apache/spark/commit/e8acecefee60958c8521bc82dc455061d32e1fe7).
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-211508789 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56106/ Test FAILed.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-211508743 **[Test build #56106 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56106/consoleFull)** for PR 12432 at commit [`c8e35e2`](https://github.com/apache/spark/commit/c8e35e208a90f5bceb3ee84fce4af3ef887cade9). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-211508786 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-211504969 **[Test build #56106 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56106/consoleFull)** for PR 12432 at commit [`c8e35e2`](https://github.com/apache/spark/commit/c8e35e208a90f5bceb3ee84fce4af3ef887cade9).
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-211496350 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56102/ Test FAILed.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-211496320 **[Test build #56102 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56102/consoleFull)** for PR 12432 at commit [`cc746e5`](https://github.com/apache/spark/commit/cc746e589dce9cce671b40d8086ce997d1afdd9d). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-211496346 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-211494813 **[Test build #56102 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56102/consoleFull)** for PR 12432 at commit [`cc746e5`](https://github.com/apache/spark/commit/cc746e589dce9cce671b40d8086ce997d1afdd9d).
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user keypointt commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12432#discussion_r59957731

    --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala ---
    @@ -264,6 +264,9 @@ class KMeans @Since("1.5.0") (
       override def fit(dataset: Dataset[_]): KMeansModel = {
         val rdd = dataset.select(col($(featuresCol))).rdd.map { case Row(point: Vector) => point }

    +    val instr = Instrumentation.create(this, rdd)
    +    instr.logParams(featuresCol, predictionCol, k, initMode, initSteps, maxIter, seed, tol)
    +
         val algo = new MLlibKMeans()
    --- End diff --

Thanks Timothy. I'm new to Spark, so sorry if these questions are naive. I just want to confirm that I understand correctly.

1. To create a new method `algo.run(rdd, instr)`, I find that I also need to create another method `runAlgorithm(zippedData, instr)` that takes `instr` as a parameter (https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala#L241), since the dimension we want is only available inside `runAlgorithm` (https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala#L295).

2. The class `Instrumentation` is private to the ml package, so it cannot be accessed from the mllib package. Do I have to make it public by removing `private[ml]`? (https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/util/Instrumentation.scala#L42)
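On the second question, one possible alternative to making the class fully public is to widen the access qualifier to the common parent package. The sketch below illustrates how a package-qualified modifier behaves; it is an assumption-laden toy, not the change made in this PR. Nested objects named `spark`, `ml`, and `mllib` stand in for the real packages, and `Instr`, `runAlgorithm`, and `demo` are hypothetical stand-ins.

```scala
object spark {
  object ml {
    // Widened from private[ml] to private[spark]: visible anywhere inside
    // `spark`, but still hidden from code outside it.
    private[spark] final class Instr {
      private var lines = List.empty[String]
      def log(msg: String): Unit = { lines = lines :+ msg }
      def logged: List[String] = lines
    }
  }

  object mllib {
    // Compiles only because Instr is visible throughout `spark`;
    // with private[ml] this reference would be rejected.
    private[spark] def runAlgorithm(dim: Int, instr: ml.Instr): Unit =
      instr.log(s"numFeatures=$dim")
  }

  // Public helper so the effect can be observed from outside `spark`.
  def demo(): List[String] = {
    val instr = new ml.Instr
    mllib.runAlgorithm(3, instr)
    instr.logged
  }
}
```

Code outside `spark` can call `spark.demo()` but cannot construct `spark.ml.Instr` directly, which is the property a `private[spark]` qualifier is meant to give.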
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-210675584 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/55969/ Test PASSed.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-210675580 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-210675495 **[Test build #55969 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55969/consoleFull)** for PR 12432 at commit [`f9592e2`](https://github.com/apache/spark/commit/f9592e2588f0a5b987f6b822a06bbe2c94a3b4e6). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user thunterdb commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12432#discussion_r59950837

    --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala ---
    @@ -264,6 +264,9 @@ class KMeans @Since("1.5.0") (
       override def fit(dataset: Dataset[_]): KMeansModel = {
         val rdd = dataset.select(col($(featuresCol))).rdd.map { case Row(point: Vector) => point }

    +    val instr = Instrumentation.create(this, rdd)
    +    instr.logParams(featuresCol, predictionCol, k, initMode, initSteps, maxIter, seed, tol)
    +
         val algo = new MLlibKMeans()
    --- End diff --

One statistic that is usually very useful to get is the dimension of the vectors (`numFeatures`). One way to get it is to pass the instrumentation instance to `algo.run(rdd)` below, and mark this new method as `private[spark]`.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user thunterdb commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-210671834 @keypointt thanks! I have one comment.
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12432#issuecomment-210668012 **[Test build #55969 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55969/consoleFull)** for PR 12432 at commit [`f9592e2`](https://github.com/apache/spark/commit/f9592e2588f0a5b987f6b822a06bbe2c94a3b4e6).
[GitHub] spark pull request: [SPARK-14569][ML] Log instrumentation in KMean...
GitHub user keypointt opened a pull request:

    https://github.com/apache/spark/pull/12432

    [SPARK-14569][ML] Log instrumentation in KMeans

    ## What changes were proposed in this pull request?

    https://issues.apache.org/jira/browse/SPARK-14569

    Log instrumentation in KMeans:

    - featuresCol
    - predictionCol
    - k
    - initMode
    - initSteps
    - maxIter
    - seed
    - tol
    - summary

    ## How was this patch tested?

    Manually tested on a local machine, by running org.apache.spark.examples.ml.KMeansExample and checking its output.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/keypointt/spark SPARK-14569

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/12432.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #12432

commit f9592e2588f0a5b987f6b822a06bbe2c94a3b4e6
Author: Xin Ren
Date:   2016-04-15T21:48:13Z

    [SPARK-14569] Log instrumentation in KMeans