Github user fmcquillan99 commented on the issue:

https://github.com/apache/madlib/pull/315

(1) expression for test data array:

```
DROP TABLE IF EXISTS knn_result_classification;
SELECT * FROM madlib.knn(
                'knn_train_data',      -- Table of training data
                'data',                -- Col name of training data
                'id',                  -- Col name of id in train data
                'label',               -- Training labels
                'knn_test_data',       -- Table of test data
                '3 || ARRAY[4]',       -- Col name of test data
                'id',                  -- Col name of id in test data
                'knn_result_classification',  -- Output table
                 3,                    -- Number of nearest neighbors
                 True,                 -- True to list nearest-neighbors by id
                 'madlib.squared_dist_norm2' -- Distance function
                );
SELECT * from knn_result_classification ORDER BY id;
```

produces

```
 id | 3 || ARRAY[4] | prediction | k_nearest_neighbours
----+---------------+------------+----------------------
  1 | {3,4}         |          1 | {3,4,5}
  2 | {3,4}         |          1 | {3,4,5}
  3 | {3,4}         |          1 | {3,4,5}
  4 | {3,4}         |          1 | {4,3,5}
  5 | {3,4}         |          1 | {3,4,5}
  6 | {3,4}         |          1 | {4,3,5}
(6 rows)
```

(2) another expression for test data array:

```
DROP TABLE IF EXISTS knn_result_classification;
SELECT * FROM madlib.knn(
                'knn_train_data',      -- Table of training data
                'data',                -- Col name of training data
                'id',                  -- Col name of id in train data
                'label',               -- Training labels
                'knn_test_data',       -- Table of test data
                'array[3.]::int[] || array[4]',  -- Col name of test data
                'id',                  -- Col name of id in test data
                'knn_result_classification',  -- Output table
                 3,                    -- Number of nearest neighbors
                 True,                 -- True to list nearest-neighbors by id
                 'madlib.squared_dist_norm2' -- Distance function
                );
SELECT * from knn_result_classification ORDER BY id;
```

produces

```
 id | array[3.]::int[] || array[4] | prediction | k_nearest_neighbours
----+------------------------------+------------+----------------------
  1 | {3,4}                        |          1 | {3,4,5}
  2 | {3,4}                        |          1 | {3,4,5}
  3 | {3,4}                        |          1 | {4,3,5}
  4 | {3,4}                        |          1 | {3,4,5}
  5 | {3,4}                        |          1 | {4,3,5}
  6 | {3,4}                        |          1 | {4,3,5}
(6 rows)
```

so this bit seems to work
---