regaleo605 opened a new pull request, #1879: URL: https://github.com/apache/systemds/pull/1879
This patch enables SystemDS to impute missing values using the k-nearest-neighbors (KNN) algorithm. We calculate the similarity, or distance, between all pairs of records using the Euclidean distance. The first method brute-forces the imputation using the dist() method. However, this can lead to an expensive computation. Therefore, we propose two other methods. The second method splits off the records with missing values (hopefully few) and computes the distances between them and the full set of records (potentially large). The third method is similar to the second. However, it creates a subset of the records and computes the distances between that subset and the records with missing values (which may be many).
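To illustrate the idea behind the second method, here is a minimal Python sketch. It is not the DML code in this patch. It computes Euclidean distances only from the records that have missing values to the complete records, rather than building the full pairwise distance matrix. The function name `knn_impute`, the use of `None` for missing values, measuring distance over the observed columns only, restricting neighbors to complete records, and filling with the neighbor mean are all assumptions made for the example.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of the k nearest complete rows.

    Hypothetical helper for illustration; not part of SystemDS.
    """
    # Candidate neighbors: rows without any missing value (assumption).
    complete = [r for r in rows if None not in r]
    if not complete:
        raise ValueError("need at least one complete row to impute from")

    result = []
    for row in rows:
        if None not in row:
            result.append(list(row))
            continue

        # Columns that are present in this incomplete row.
        observed = [j for j, v in enumerate(row) if v is not None]

        def distance(candidate):
            # Euclidean distance restricted to the observed columns.
            return math.sqrt(sum((row[j] - candidate[j]) ** 2 for j in observed))

        # Only distances from this incomplete row to the complete rows
        # are computed, never the full all-pairs matrix.
        neighbors = sorted(complete, key=distance)[:k]

        filled = [
            v if v is not None else sum(n[j] for n in neighbors) / len(neighbors)
            for j, v in enumerate(row)
        ]
        result.append(filled)
    return result


records = [[1.0, 2.0], [2.0, 4.0], [10.0, 20.0], [1.5, None]]
imputed = knn_impute(records, k=2)
```

With m incomplete records out of n, this computes on the order of m * n distances instead of n * n, which is where the saving comes from when m is small. A sketch of the third method would additionally sample the candidate rows before the distance step.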
