GitHub user fmcquillan99 commented on the issue:

    https://github.com/apache/madlib/pull/315
  
    I'm not sure what this is doing:
    ```
    %%sql
    DROP TABLE IF EXISTS knn_result_classification;
    
    SELECT * FROM madlib.knn(
                    'knn_train_data',      -- Table of training data
                    'array[99.]::int[] || array[99]',                -- Col name of training data
                    'id',                  -- Col name of id in train data
                    'label',               -- Training labels
                    'knn_test_data',       -- Table of test data
                    'data',                -- Col name of test data
                    'id',                  -- Col name of id in test data
                    'knn_result_classification',  -- Output table
                     1,                    -- Number of nearest neighbors
                     True,                 -- True to list nearest-neighbors by id
                     'madlib.squared_dist_norm2' -- Distance function
                    );
    
    SELECT * from knn_result_classification ORDER BY id;
    ``` 
    produces
    ```
     id |  data   | prediction | k_nearest_neighbours 
    ----+---------+------------+----------------------
      1 | {2,1}   |          0 | {8}
      2 | {2,6}   |          0 | {8}
      3 | {15,40} |          0 | {8}
      4 | {12,1}  |          0 | {8}
      5 | {2,90}  |          1 | {1}
      6 | {50,45} |          1 | {1}
    (6 rows)
    ```
    
    I get the same result if I do:
    ```
    DROP TABLE IF EXISTS knn_result_classification;
    
    SELECT * FROM madlib.knn(
                    'knn_train_data',      -- Table of training data
                    'array[0.]::int[] || array[0]',                -- Col name of training data
                    'id',                  -- Col name of id in train data
                    'label',               -- Training labels
                    'knn_test_data',       -- Table of test data
                    'data',                -- Col name of test data
                    'id',                  -- Col name of id in test data
                    'knn_result_classification',  -- Output table
                     1,                    -- Number of nearest neighbors
                     True,                 -- True to list nearest-neighbors by id
                     'madlib.squared_dist_norm2' -- Distance function
                    );
    
    SELECT * from knn_result_classification ORDER BY id;
    ```
    gives
    ```
     id |  data   | prediction | k_nearest_neighbours 
    ----+---------+------------+----------------------
      1 | {2,1}   |          0 | {8}
      2 | {2,6}   |          0 | {8}
      3 | {15,40} |          0 | {8}
      4 | {12,1}  |          0 | {8}
      5 | {2,90}  |          1 | {1}
      6 | {50,45} |          1 | {1}
    (6 rows)
    ```
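
    As a sanity check, the two expressions passed in the training-data
    argument above can be evaluated on their own. This is only a sketch: it
    assumes standard PostgreSQL array semantics, and the column aliases
    `point_99` and `point_0` are made up for illustration.
    ```
    -- Evaluate the two argument expressions standalone, outside madlib.knn.
    -- array[99.] is numeric[]; the cast turns it into int[], and || then
    -- concatenates it with the second one-element array.
    SELECT array[99.]::int[] || array[99] AS point_99,   -- expected: {99,99}
           array[0.]::int[]  || array[0]  AS point_0;    -- expected: {0,0}
    ```
    Neither expression references a column of `knn_train_data`, so if knn
    evaluates it once per training row, every row would get the same constant
    point. That might be why both runs agree, but it is a guess and not
    something I have confirmed in the code.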
    


