Re: Equivalent Function in ml for computeCost()

2021-11-29 Thread Sean Owen
I knew I was forgetting something, right. Feel free to make an update for the docs. On Mon, Nov 29, 2021, 4:49 PM Artemis User wrote: > Thanks Sean! After a little bit of digging through the source code, it seems > that the computeCost method has been replaced by the trainingCost method in >

Re: Equivalent Function in ml for computeCost()

2021-11-29 Thread Artemis User
Thanks Sean! After a little bit of digging through the source code, it seems that the computeCost method has been replaced by the trainingCost method in the KMeansSummary class. This is the hidden comment in the source code for the trainingCost method (somehow it wasn't propagated to the online

Re: Equivalent Function in ml for computeCost()

2021-11-29 Thread Sean Owen
I don't believe there is, directly, though there is ClusteringMetrics to evaluate clusterings in .ml. I'm kinda confused that it doesn't expose sum of squared distances though; it computes silhouette only? You can compute it directly, pretty easily, in any event, either by just writing up a few
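
For reference, the quantity being discussed (WCSS, the sum of squared distances from each point to its nearest cluster center) is short to write out by hand. The following is only a minimal plain-Python sketch of that definition, not Spark code and not the approach from the message above; the function names `squared_distance` and `wcss` and the sample data are made up for illustration.

```python
from typing import Sequence

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def wcss(points: Sequence[Point], centers: Sequence[Point]) -> float:
    """Sum, over all points, of the squared distance to the nearest center."""
    return sum(min(squared_distance(p, c) for c in centers) for p in points)


# Hypothetical data: two tight groups and one center per group.
points = [(0.0, 0.0), (1.0, 1.0), (9.0, 8.0), (8.0, 9.0)]
centers = [(0.5, 0.5), (8.5, 8.5)]

cost = wcss(points, centers)
print(cost)  # 2.0 (each point contributes 0.25 + 0.25 = 0.5)
```

On a real cluster the same sum would presumably be expressed over a DataFrame or RDD rather than Python lists; the arithmetic is the same.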

Equivalent Function in ml for computeCost()

2021-11-29 Thread Artemis User
The RDD-based org.apache.spark.mllib.clustering.KMeansModel class defines a method called computeCost that is used to calculate the WCSS error of K-Means clusters (https://spark.apache.org/docs/latest/api/scala/org/apache/spark/mllib/clustering/KMeansModel.html). Is there an equivalent method

Spark jobs with Apache Ranger Plugin for Authorization with Hive tables

2021-11-29 Thread Srihari Kusumanchi
Hi Team, We are running Spark jobs in a Kubernetes cluster and want to enable authorization for them with Ranger policies defined on Hive tables. As far as I can see, there are no direct plugins available to connect to Ranger from Spark. Can someone please help me with how I can