Dear community, I would like to know if someone can help clarify how to predict anomaly scores on new data sets using the "solitude" package. A simple model can be trained using:
library(solitude)

# Train the model:
iris_train <- iris[1:100, ]
model <- isolation_forest(iris_train[, 1:4], seed = 100, num.trees = 100, importance = "none")

# The anomaly scores of a new test data set can be calculated by:
iris_test <- iris[100:150, ]
predicted_anomalies <- predict(model, iris_test[, 1:4], type = "anomaly_score")

# The challenge is how to predict the anomaly scores for a data set with fewer
# observations than the training data set.
# Example: using a subset of just 11 observations instead of the 51 observations
# above results in smaller anomaly scores:
iris_test <- iris[100:110, ]
predicted_anomalies <- predict(model, iris_test[, 1:4], type = "anomaly_score")

Does anyone know how to predict "normalised (with respect to sample size)" anomaly scores using the solitude package for R?

Thanks in advance!
Johan

--
Johan Lassen
"In the cities people live in time - in the mountains people live in space" (Buddhist monk).
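To make the sample-size effect concrete, here is a minimal base-R sketch of the standard isolation forest score, s = 2^(-mean depth / c(n)). It does not call solitude, and the helper names `c_factor` and `anomaly_score` are hypothetical. Whether solitude normalises by the number of rows passed to predict() is an assumption, not something confirmed here. The sketch only shows that, for the same mean path length, a smaller n gives a smaller score, which would match the behaviour described above.

```r
# Normalising constant c(n): average path length of an unsuccessful
# binary-search-tree lookup over n points (standard isolation forest formula).
c_factor <- function(n) {
  if (n <= 1) return(0)
  if (n == 2) return(1)
  harmonic <- log(n - 1) + 0.5772156649  # approximation of the harmonic number H(n - 1)
  2 * harmonic - 2 * (n - 1) / n
}

# Anomaly score for a given mean path length, normalised by sample size n.
anomaly_score <- function(mean_depth, n) {
  2^(-mean_depth / c_factor(n))
}

# Same (made-up) mean path length, three different normalising sample sizes.
mean_depth <- 5
score_n100 <- anomaly_score(mean_depth, n = 100)  # training-set size
score_n51  <- anomaly_score(mean_depth, n = 51)   # first test set
score_n11  <- anomaly_score(mean_depth, n = 11)   # smaller test set

print(c(n100 = score_n100, n51 = score_n51, n11 = score_n11))
```

If this is indeed the cause, normalising with the training sample size rather than the size of the new data would give scores that are comparable across test sets of different sizes.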