zhengruifeng commented on a change in pull request #27035: [SPARK-30351][ML][PySpark] BisectingKMeans support instance weighting
URL: https://github.com/apache/spark/pull/27035#discussion_r361830795
##########
File path: mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala
##########
@@ -521,8 +521,10 @@ object KMeans {
 /**
  * A vector with its norm for fast distance computation.
  */
-private[clustering] class VectorWithNorm(val vector: Vector, val norm: Double)
-  extends Serializable {
+private[clustering] class VectorWithNorm(

Review comment:
 I am neutral on adding a weight to `VectorWithNorm`. If it is added, what about also using it in KMeans?
 For example:
 `val zippedData: RDD[(VectorWithNorm, Double)]` => `val zippedData: RDD[VectorWithNorm]`
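
To make the suggestion in the review comment concrete, here is a minimal, self-contained sketch of the two shapes. It is not the PR's code: `Array[Double]` and `Seq` stand in for Spark's `Vector` and `RDD` so that it runs without Spark, and the `weight` field with a default of `1.0`, the `WeightSketch` object, and its helper names are all assumptions made for illustration.

```scala
// Simplified stand-in for Spark's private[clustering] VectorWithNorm.
// Array[Double] replaces org.apache.spark.mllib.linalg.Vector so the sketch
// runs without Spark; the `weight` field and its default are assumptions.
class VectorWithNorm(
    val vector: Array[Double],
    val norm: Double,
    val weight: Double = 1.0) extends Serializable

// Hypothetical helpers; Seq stands in for RDD.
object WeightSketch {
  // Euclidean (L2) norm of a dense vector.
  def l2Norm(v: Array[Double]): Double = math.sqrt(v.map(x => x * x).sum)

  // Before: the weight travels next to the vector as a tuple element.
  def zipBefore(data: Seq[(Array[Double], Double)]): Seq[(VectorWithNorm, Double)] =
    data.map { case (v, w) => (new VectorWithNorm(v, l2Norm(v)), w) }

  // After: the weight is carried by VectorWithNorm itself.
  def zipAfter(data: Seq[(Array[Double], Double)]): Seq[VectorWithNorm] =
    data.map { case (v, w) => new VectorWithNorm(v, l2Norm(v), w) }

  // Total instance weight, read from either representation.
  def totalWeightBefore(zipped: Seq[(VectorWithNorm, Double)]): Double =
    zipped.map(_._2).sum

  def totalWeightAfter(zipped: Seq[VectorWithNorm]): Double =
    zipped.map(_.weight).sum
}
```

With the weight stored on the class, downstream code reads `point.weight` instead of destructuring a tuple, and unweighted callers can rely on the default.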