GitHub user njayaram2 commented on a diff in the pull request:

    https://github.com/apache/madlib/pull/225#discussion_r163653414
  
    --- Diff: src/ports/postgres/modules/knn/knn.py_in ---
    @@ -212,22 +244,27 @@ def knn(schema_madlib, point_source, point_column_name, point_id,
                 WHERE {y_temp_table}.r <= {k_val}
                 """.format(**locals()))
     
    -        plpy.execute(
    -            """
    +        plpy.execute("""
                 CREATE TABLE {output_table} AS
    -                SELECT {test_id_temp} AS id, {test_column_name}
    +                {view_def}
    +                SELECT knn_temp.{test_id_temp} AS id ,
    +                    knn_test.data
                         {pred_out}
    --- End diff --
    
    This `pred_out` doesn't look right for classification with weighted averaging. Without weighted averaging, the predicted class is simply the mode of the neighbors' labels. With weighted averaging, the prediction must be the class that has the highest weighted sum, not the highest weighted sum itself.
    Any change here should also handle the multi-class scenario.
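
    To illustrate the distinction, here is a minimal plain-Python sketch, not the module's actual SQL. It assumes inverse-distance weights; the function name `weighted_knn_class` and the `eps` guard against zero distances are hypothetical and exist only for this example. The point is that the function returns the label with the largest summed weight, for any number of classes, rather than the sum itself.

```python
from collections import defaultdict


def weighted_knn_class(neighbors, eps=1e-9):
    """Return the class label with the highest total weight.

    neighbors: iterable of (label, distance) pairs for the k nearest points.
    eps: hypothetical guard so a zero distance does not divide by zero.
    Assumption: each neighbor's weight is 1 / distance.
    """
    weight_sums = defaultdict(float)
    for label, distance in neighbors:
        weight_sums[label] += 1.0 / (distance + eps)
    # Return the label (the arg max), not the winning weighted sum.
    return max(weight_sums, key=weight_sums.get)


# Three classes: "a" is the mode, but "b" is much closer and wins the vote.
neighbors = [("a", 1.0), ("a", 1.0), ("b", 0.25), ("c", 2.0)]
prediction = weighted_knn_class(neighbors)
```

    In this example an unweighted mode would pick "a" (two votes), while the weighted vote picks "b" because its single neighbor is much closer.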

