[ https://issues.apache.org/jira/browse/SPARK-22867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16301498#comment-16301498 ]
Sean Owen commented on SPARK-22867:
-----------------------------------

The problem is that this goes for a hundred things. I don't think MLlib aspires to be like scikit-learn, especially because you can easily add a third-party package to any app. It's just the basics. Given that most JIRAs like this have been rejected, I doubt this would be different. At the least, you'd need to argue that this is widely used (e.g. papers, other implementations).

> Add Isolation Forest algorithm to MLlib
> ---------------------------------------
>
>                 Key: SPARK-22867
>                 URL: https://issues.apache.org/jira/browse/SPARK-22867
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>    Affects Versions: 2.2.1
>            Reporter: Fangzhou Yang
>
> Isolation Forest (iForest) is an effective model that focuses on anomaly
> isolation.
> iForest uses a tree structure to model data; an iTree isolates anomalies
> closer to the root of the tree than normal points.
> An anomaly score is calculated by the iForest model to measure the
> abnormality of the data instances. The lower, the more abnormal.
> More details about iForest can be found in the following papers:
> Isolation Forest [1] (https://dl.acm.org/citation.cfm?id=1511387)
> and Isolation-Based Anomaly Detection [2]
> (https://dl.acm.org/citation.cfm?id=2133363).
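
For readers unfamiliar with the idea in the quoted description, the claim that an iTree isolates anomalies closer to the root can be illustrated with a rough sketch. This is not Spark or MLlib code and not the reporter's implementation. The function names, the default parameters, and the toy data are all assumptions made for illustration. The sketch also leaves out the depth normalization and scoring formula from the papers and only compares average isolation depth.

```python
import random


def isolation_depth(point, data, rng, depth=0, max_depth=50):
    """Depth at which `point` ends up alone under random axis-aligned splits.

    Simplified on purpose: no path-length adjustment for unsplit nodes.
    """
    if len(data) <= 1 or depth >= max_depth:
        return depth
    dim = rng.randrange(len(point))          # pick a random feature
    lo = min(row[dim] for row in data)
    hi = max(row[dim] for row in data)
    if lo == hi:                             # nothing left to split on
        return depth
    split = rng.uniform(lo, hi)              # pick a random split value
    # Follow only the branch that contains `point`.
    if point[dim] < split:
        subset = [row for row in data if row[dim] < split]
    else:
        subset = [row for row in data if row[dim] >= split]
    return isolation_depth(point, subset, rng, depth + 1, max_depth)


def mean_isolation_depth(point, data, n_trees=200, sample_size=64, seed=0):
    """Average isolation depth of `point` over many random subsamples."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_trees):
        sample = rng.sample(data, min(sample_size, len(data)))
        total += isolation_depth(point, sample, rng)
    return total / n_trees


# Toy data (hypothetical): one Gaussian cluster plus a single far-away point.
demo_rng = random.Random(1)
cluster = [(demo_rng.gauss(0, 1), demo_rng.gauss(0, 1)) for _ in range(200)]
outlier = (10.0, 10.0)
points = cluster + [outlier]

print("outlier mean depth:", mean_isolation_depth(outlier, points))
print("inlier mean depth: ", mean_isolation_depth((0.0, 0.0), points))
```

On this toy data the far-away point should be isolated after fewer splits on average than a point in the middle of the cluster, which is the property the description relies on.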