[ https://issues.apache.org/jira/browse/MADLIB-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Orhan Kislal closed MADLIB-1370.
--------------------------------

> Knn - add zero check and output distance array
> ----------------------------------------------
>
>                 Key: MADLIB-1370
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1370
>             Project: Apache MADlib
>          Issue Type: Improvement
>          Components: k-NN
>            Reporter: Frank McQuillan
>            Assignee: Orhan Kislal
>            Priority: Minor
>             Fix For: v1.17
>
>
> In unsupervised mode of knn
> http://madlib.apache.org/docs/latest/group__grp__knn.html
> when `point_source` and `test_source` are the same data set, nearest
> neighbors does not reliably return the 0-distance point as a nearest
> neighbor.
> Could there be a small negative-value issue here, where a distance that is
> effectively 0 shows up as negative epsilon?
> Also, please assess whether we can add a vector of distances to the output table:
> {code}
> Output Format
> The output of the KNN module is a table with the following columns:
> id                    INTEGER. The ids of test data points.
> test_column_name      DOUBLE PRECISION[]. The test data points.
> prediction            INTEGER. Label in case of classification, average value in case of regression.
> k_nearest_neighbours  INTEGER[]. List of nearest neighbors, sorted closest to furthest from the corresponding test point.
> distance              DOUBLE PRECISION[]. Distance sorted in the same order as the 'k_nearest_neighbours' array.
> {code}
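
To make the two requests concrete, here is a minimal Python sketch of what a zero check and a distance output array could look like. It is not MADlib's implementation: the function names `safe_sqrt` and `knn_with_distances`, the dict-of-lists data layout, and the choice of Euclidean distance are all assumptions made for illustration. The sketch also assumes that the squared distance can come out slightly negative from floating-point rounding, which is the hypothesis the issue raises and not a confirmed cause.

```python
import math


def safe_sqrt(squared_distance):
    """Zero check (hypothetical helper): clamp a squared distance that is
    slightly negative to 0.0 before taking the square root."""
    return math.sqrt(max(squared_distance, 0.0))


def knn_with_distances(points, query, k):
    """Return (neighbor_ids, distances) for the k points nearest to query.

    points: dict mapping id -> list of floats (assumed layout).
    Both returned lists are sorted closest to furthest, so distances[i]
    belongs to neighbor_ids[i].
    """
    scored = []
    for point_id, vector in points.items():
        squared = sum((a - b) ** 2 for a, b in zip(vector, query))
        scored.append((safe_sqrt(squared), point_id))
    scored.sort()  # ascending by distance, ties broken by id
    nearest = scored[:k]
    neighbor_ids = [point_id for _, point_id in nearest]
    distances = [distance for distance, _ in nearest]
    return neighbor_ids, distances


points = {1: [0.0, 0.0], 2: [3.0, 4.0], 3: [1.0, 0.0]}
neighbor_ids, distances = knn_with_distances(points, points[1], k=2)
```

When the query is itself one of the points, as in the last line, the point at distance 0 sorts first, and the second list carries the per-neighbor distances in the same order as the proposed `distance` column.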