Github user fmcquillan99 commented on the issue:

    https://github.com/apache/madlib/pull/315
  
    (1)
    expression for test data array:
    ```
    DROP TABLE IF EXISTS knn_result_classification;
    
    SELECT * FROM madlib.knn(
                    'knn_train_data',      -- Table of training data
                    'data',                -- Col name of training data
                    'id',                  -- Col name of id in train data
                    'label',               -- Training labels
                    'knn_test_data',       -- Table of test data
                    '3 || ARRAY[4]',       -- Test data: array expression instead of a col name
                    'id',                  -- Col name of id in test data
                    'knn_result_classification',  -- Output table
                     3,                    -- Number of nearest neighbors
                     True,                 -- True to list nearest-neighbors by id
                     'madlib.squared_dist_norm2' -- Distance function
                    );
    
    SELECT * from knn_result_classification ORDER BY id;
    ```
    produces
    ```
     id | 3 || ARRAY[4] | prediction | k_nearest_neighbours 
    ----+---------------+------------+----------------------
      1 | {3,4}         |          1 | {3,4,5}
      2 | {3,4}         |          1 | {3,4,5}
      3 | {3,4}         |          1 | {3,4,5}
      4 | {3,4}         |          1 | {4,3,5}
      5 | {3,4}         |          1 | {3,4,5}
      6 | {3,4}         |          1 | {4,3,5}
    (6 rows)
    ```
    
    
    (2)
    another expression for test data array:
    ```
    DROP TABLE IF EXISTS knn_result_classification;
    
    SELECT * FROM madlib.knn(
                    'knn_train_data',      -- Table of training data
                    'data',                -- Col name of training data
                    'id',                  -- Col name of id in train data
                    'label',               -- Training labels
                    'knn_test_data',       -- Table of test data
                    'array[3.]::int[] || array[4]',  -- Test data: array expression instead of a col name
                    'id',                  -- Col name of id in test data
                    'knn_result_classification',  -- Output table
                     3,                    -- Number of nearest neighbors
                     True,                 -- True to list nearest-neighbors by id
                     'madlib.squared_dist_norm2' -- Distance function
                    );
    
    SELECT * from knn_result_classification ORDER BY id;
    ```
    produces
    ```
     id | array[3.]::int[] || array[4] | prediction | k_nearest_neighbours 
    ----+------------------------------+------------+----------------------
      1 | {3,4}                        |          1 | {3,4,5}
      2 | {3,4}                        |          1 | {3,4,5}
      3 | {3,4}                        |          1 | {4,3,5}
      4 | {3,4}                        |          1 | {3,4,5}
      5 | {3,4}                        |          1 | {4,3,5}
      6 | {3,4}                        |          1 | {4,3,5}
    (6 rows)
    ```
    So passing an array expression instead of a column name for the test data seems to work.
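
    As a quick sanity check outside of knn, the two expressions can also be evaluated on their own in psql. This is only a sketch: the aliases `expr_1` and `expr_2` are made up for illustration, and both columns should come back as {3,4}, the same value shown in the result tables above.
    ```
    -- Evaluate both test data expressions directly, without calling madlib.knn.
    SELECT 3 || ARRAY[4]                AS expr_1,  -- integer prepended to an integer array
           array[3.]::int[] || array[4] AS expr_2;  -- numeric array cast to int[], then concatenated
    ```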


