[systemds] branch master updated: [SYSTEMDS-2869][DOC] Built-in functions KNN and KNNBF (#1255)

janardhan Wed, 05 May 2021 13:37:40 -0700

This is an automated email from the ASF dual-hosted git repository.

janardhan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemds.git



The following commit(s) were added to refs/heads/master by this push:
     new 28da55f  [SYSTEMDS-2869][DOC] Built-in functions KNN and KNNBF (#1255)
28da55f is described below

commit 28da55fea4c8c1c45ba215ccb4fc29a7d4014025
Author: j143 <[email protected]>
AuthorDate: Thu May 6 02:07:25 2021 +0530

    [SYSTEMDS-2869][DOC] Built-in functions KNN and KNNBF (#1255)
---
 docs/site/builtins-reference.md | 50 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/docs/site/builtins-reference.md b/docs/site/builtins-reference.md
index c46de37..0f37d87 100644
--- a/docs/site/builtins-reference.md
+++ b/docs/site/builtins-reference.md
@@ -43,6 +43,7 @@ limitations under the License.
     * [`imputeByFD`-Function](#imputeByFD-function)
     * [`intersect`-Function](#intersect-function)
     * [`KMeans`-Function](#KMeans-function)
+    * [`KNN`-function](#KNN-function)
     * [`lm`-Function](#lm-function)
     * [`lmDS`-Function](#lmds-function)
     * [`lmCG`-Function](#lmcg-function)
@@ -677,6 +678,55 @@ X = rand (rows = 3972, cols = 972)
 kmeans(X = X, k = 20, runs = 10, max_iter = 5000, eps = 0.000001, is_verbose = FALSE, avg_sample_size_per_centroid = 50, seed = -1)
 ```
 
+## `KNN`-Function
+
+The `knn()` function implements the KNN (K Nearest Neighbors) algorithm.
+
+### Usage
+
+```r
+[NNR, PR, FI] = knn(Train, Test, CL, k_value)
+```
+
+### Arguments
+
+| Name       | Type            | Default    | Description |
+| :--------- | :-------------- | :--------- | :---------- |
+| Train      | Matrix          | required   | The input matrix as features |
+| Test       | Matrix          | required   | The input matrix of test (query) rows |
+| CL         | Matrix          | Optional   | The input matrix as target |
+| CL_T       | Integer         | `0`        | The target type of matrix CL whether columns in CL are continuous ( =1 ) or categorical ( =2 ) or not specified ( =0 ) |
+| trans_continuous | Boolean | `FALSE` | Whether to transform continuous features to [-1,1] |
+| k_value     |  int |     `5`  |  k value for KNN, ignore if select_k enable |
+| select_k    | Boolean | `FALSE` | Use k selection algorithm to estimate k ( TRUE means yes ) |
+| k_min       | int   | `1`|   Min k value(  available if select_k = 1 ) |
+| k_max       | int   | `100` | Max k value(  available if select_k = 1 ) |
+| select_feature | Boolean | `FALSE` | Use feature selection algorithm to select feature ( TRUE means yes ) |
+| feature_max | int   | `10` | Max feature selection |
+| interval    | int   | `1000` | Interval value for K selecting (  available if select_k = 1 ) |
+| feature_importance | Boolean | `FALSE` | Use feature importance algorithm to estimate each feature ( TRUE means yes ) |
+| predict_con_tg | int | `0`   | Continuous  target predict function: mean(=0) or median(=1) |
+| START_SELECTED | Matrix | Optional | feature selection initial value |
+
+### Returns
+
+| Type   | Description |
+| :----- | :---------- |
+| Matrix |  NNR |
+| Matrix |  PR |
+| Matrix | Feature importance value |
+
+### Example
+
+```r
+X = rand(rows = 100, cols = 20)
+T = rand(rows= 3, cols = 20) # query rows, and columns
+CL = matrix(seq(1,100), 100, 1)
+k = 3
+[NNR, PR, FI] = knn(Train=X, Test=T, CL=CL, k_value=k, predict_con_tg=1)
+```
+
+
 ## `lm`-Function
 
 The `lm`-function solves linear regression using either the **direct solve method** or the **conjugate gradient algorithm**

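For readers unfamiliar with the algorithm that the new `KNN` section documents, the following is a minimal, language-neutral sketch in Python of brute-force k-nearest-neighbour prediction for a continuous target. It is not the SystemDS implementation: the function name `knn_predict`, the `use_median` flag, and the choice of Euclidean distance are assumptions made for illustration only. The flag loosely mirrors the documented `predict_con_tg` option (mean or median of the neighbours' targets).

```python
import math
from statistics import mean, median


def knn_predict(train, targets, query, k=5, use_median=False):
    """Brute-force k-nearest-neighbour prediction for a continuous target.

    Hypothetical helper for illustration; not part of SystemDS.
    """
    # Euclidean distance from the query row to every training row (assumed metric).
    dists = [(math.dist(row, query), i) for i, row in enumerate(train)]
    # Indices of the k closest training rows (ties broken by row index).
    nearest = [i for _, i in sorted(dists)[:k]]
    values = [targets[i] for i in nearest]
    # Aggregate the neighbours' targets with the mean or the median.
    prediction = median(values) if use_median else mean(values)
    return nearest, prediction


train = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
targets = [1.0, 2.0, 6.0, 100.0]
nearest, pred = knn_predict(train, targets, [0.1, 0.1], k=3, use_median=True)
```

With `k=3` the distant row `[5.0, 5.0]` is excluded, so its outlying target does not influence the prediction; choosing the median rather than the mean further reduces sensitivity to a single extreme neighbour.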