Github user njayaram2 commented on a diff in the pull request:

    https://github.com/apache/madlib/pull/225#discussion_r163653414

    --- Diff: src/ports/postgres/modules/knn/knn.py_in ---
    @@ -212,22 +244,27 @@ def knn(schema_madlib, point_source, point_column_name, point_id,
                         WHERE {y_temp_table}.r <= {k_val}
                 """.format(**locals()))

    -        plpy.execute(
    -            """
    +        plpy.execute("""
                 CREATE TABLE {output_table} AS
    -                SELECT {test_id_temp} AS id, {test_column_name}
    +                {view_def}
    +                SELECT knn_temp.{test_id_temp} AS id ,
    +                    knn_test.data
                         {pred_out}
    --- End diff --

    This `pred_out` doesn't seem right for classification with weighted averaging. Without weighted averaging, we just return the mode as the predicted class. With weighted averaging, the prediction must be the class that has the highest weighted sum, not the highest weighted sum itself. We should also take the multi-class scenario into account while changing this.
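
    To make the expected behavior concrete, here is a minimal Python sketch of the two voting rules, outside of SQL. It is not the code from this PR: the function names `unweighted_vote` and `weighted_vote`, the inverse-distance weight `1 / distance`, and the sample neighbors are all assumptions chosen for illustration. The weighted version keeps one running sum per class, so it handles more than two classes, and it returns the winning class label with the sum as a secondary value.

```python
from collections import Counter, defaultdict


def unweighted_vote(neighbors):
    """Return the most frequent label among (label, distance) pairs (the mode)."""
    counts = Counter(label for label, _ in neighbors)
    return counts.most_common(1)[0][0]


def weighted_vote(neighbors, eps=1e-12):
    """Return (label, weighted_sum) for the class with the highest weighted sum.

    Assumption: each neighbor contributes a weight of 1 / distance.
    eps only guards against division by zero for an exact match.
    """
    totals = defaultdict(float)
    for label, distance in neighbors:
        totals[label] += 1.0 / (distance + eps)
    # Pick the class whose accumulated weight is largest, not the weight itself.
    best_label = max(totals, key=totals.get)
    return best_label, totals[best_label]


# Hypothetical k = 5 neighbors of one test point, with three classes.
neighbors = [("a", 2.0), ("a", 2.0), ("a", 4.0), ("b", 0.5), ("c", 1.0)]

mode_label = unweighted_vote(neighbors)            # "a" has the most votes
weighted_label, weighted_sum = weighted_vote(neighbors)  # "b" is closest overall
```

    In this sample the mode is `a` (three of five neighbors), but the single close neighbor of class `b` outweighs them, so the weighted prediction is `b`. The output column should carry that label, not the sum of about 2.0.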
---