Hi Anant,

Thank you for reviewing and helping us out. The initial code is at the following link: https://github.com/codeAshu/Outlier-Detection-with-AVF-Spark/blob/master/OutlierWithAVFModel.scala
The input file for the code should be in CSV format; we have provided a dataset at the same link. The code is working fine, but we are currently facing the following style issues:

At lines 62 and 79 we have redundant functions and variables (count_dataPoint, count_trimmedData) for assigning line numbers within the function trimScores(). At lines 144 and 149, if we do not use two separate functions to increment the line numbers, we get erroneous results. Is there an alternative way of handling that? We think it is because of Scala closures: a local variable that is not part of an RDD does not get updated in subsequent PairRDDFunctions.

Regards,
Ashutosh & Kaushik

--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/MLlib-Contributing-Algorithm-for-Outlier-Detection-tp8880p8990.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
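For context, the behaviour described above is expected in Spark: a local var mutated inside a transformation is captured in the task closure and serialized to each executor, so every task increments its own copy and the driver-side variable is never updated. A minimal sketch of the usual alternative for numbering lines, using RDD.zipWithIndex (the file path and object name here are hypothetical, not taken from the linked code):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object LineNumberSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("line-number-sketch").setMaster("local[*]"))

    val lines = sc.textFile("data.csv") // hypothetical input path

    // Anti-pattern: `counter` is captured by the closure and shipped to
    // executors; each task mutates its own deserialized copy, so the
    // resulting "line numbers" are unreliable and the driver's counter
    // stays at its initial value.
    var counter = 0L
    val unreliable = lines.map { line => counter += 1; (counter, line) }

    // Alternative: zipWithIndex assigns a stable 0-based index to every
    // element, removing the need for separate increment functions.
    val numbered = lines.zipWithIndex().map { case (line, idx) => (idx + 1, line) }

    numbered.take(5).foreach(println)
    sc.stop()
  }
}
```

Note that zipWithIndex triggers a job to compute per-partition counts when the RDD has more than one partition, so it is best called once and the result reused rather than re-indexing in each function.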