You could fit the k-means pipeline, extract the cluster centers from the
fitted model, build a Transformer from them, then create a new PipelineModel
containing all the original stages plus the new Transformer. Would that work?
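
To make the idea concrete, here is a rough sketch of the logic such a
Transformer might apply per row. It is plain Python rather than Spark code;
the function names, the sample centers, and the threshold are made up for
illustration. In Spark the centers would come from the fitted KMeansModel
(clusterCenters), and the function body would sit inside the custom
Transformer.

```python
import math

def nearest_center_distance(point, centers):
    """Return (index, Euclidean distance) of the center closest to point."""
    best_idx, best_dist = -1, math.inf
    for i, center in enumerate(centers):
        d = math.dist(point, center)
        if d < best_dist:
            best_idx, best_dist = i, d
    return best_idx, best_dist

def is_anomaly(point, centers, threshold):
    """Flag a point whose distance to its nearest center exceeds threshold.

    The threshold is a hypothetical tuning knob, not something the
    k-means model provides.
    """
    _, dist = nearest_center_distance(point, centers)
    return dist > threshold

# Hypothetical centers standing in for the ones a fitted model would expose.
centers = [(0.0, 0.0), (10.0, 10.0)]
print(nearest_center_distance((3.0, 4.0), centers))
print(is_anomaly((3.0, 4.0), centers, threshold=6.0))
print(is_anomaly((5.0, 5.0), centers, threshold=6.0))
```

Distance to the nearest center is only one possible anomaly score; whether
it suits your data is a separate question.
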
It's not out of the question to expose a new parameter in KMeansModel that
lets you also
First, some background:
* We want to use the k-means model for anomaly detection on a
multi-dimensional dataset. The current k-means implementation in
Spark is designed for clustering, not specifically for anomaly
detection. Once a model is trained and pipeline is