This is an automated email from the ASF dual-hosted git repository.

janardhan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemds.git


The following commit(s) were added to refs/heads/master by this push:
     new 08f6e11  [MINOR] Arrange Builtin functions in the alphabetical order
08f6e11 is described below

commit 08f6e114b9c756e657c9ef3f2d33bcb116789841
Author: Janardhan Pulivarthi <[email protected]>
AuthorDate: Sun May 30 10:20:28 2021 +0530

    [MINOR] Arrange Builtin functions in the alphabetical order
    
    * Alphabetic sorting of the functions
    * Use consistent spaces after header throughout the doc
    
    Closes #1291.
---
 docs/site/builtins-reference.md | 867 +++++++++++++++++++++-------------------
 1 file changed, 455 insertions(+), 412 deletions(-)

diff --git a/docs/site/builtins-reference.md b/docs/site/builtins-reference.md
index 0f8772b..1cbf660 100644
--- a/docs/site/builtins-reference.md
+++ b/docs/site/builtins-reference.md
@@ -28,6 +28,7 @@ limitations under the License.
     * [`tensor`-Function](#tensor-function)
   * [DML-Bodied Built-In functions](#dml-bodied-built-in-functions)
     * [`confusionMatrix`-Function](#confusionmatrix-function)
+    * [`correctTypos`-Function](#correcttypos-function)
     * [`cspline`-Function](#cspline-function)
     * [`csplineCG`-Function](#csplineCG-function)
     * [`csplineDS`-Function](#csplineDS-function)
@@ -40,6 +41,8 @@ limitations under the License.
     * [`ema`-Function](#ema-function)
     * [`gaussianClassifier`-Function](#gaussianClassifier-function)
     * [`glm`-Function](#glm-function)
+    * [`gmm`-Function](#gmm-function)
+    * [`gnmf`-Function](#gnmf-function)
     * [`gridSearch`-Function](#gridSearch-function)
     * [`hyperband`-Function](#hyperband-function)
     * [`img_brightness`-Function](#img_brightness-function)
@@ -50,31 +53,29 @@ limitations under the License.
     * [`KMeans`-Function](#KMeans-function)
     * [`KNN`-function](#KNN-function)
     * [`lm`-Function](#lm-function)
-    * [`lmDS`-Function](#lmds-function)
     * [`lmCG`-Function](#lmcg-function)
+    * [`lmDS`-Function](#lmds-function)
     * [`lmPredict`-Function](#lmPredict-function)
+    * [`mdedup`-Function](#mdedup-function)
     * [`mice`-Function](#mice-function)
+    * [`msvm`-Function](#msvm-function)
     * [`multiLogReg`-Function](#multiLogReg-function)
+    * [`naiveBayes`-Function](#naiveBayes-function)
+    * [`naiveBayesPredict`-Function](#naiveBayesPredict-function)
+    * [`normalize`-Function](#normalize-function)
+    * [`outlier`-Function](#outlier-function)
     * [`pnmf`-Function](#pnmf-function)
     * [`scale`-Function](#scale-function)
     * [`sherlock`-Function](#sherlock-function)
     * [`sherlockPredict`-Function](#sherlockPredict-function)
     * [`sigmoid`-Function](#sigmoid-function)
+    * [`slicefinder`-Function](#slicefinder-function)
     * [`smote`-Function](#smote-function)
     * [`steplm`-Function](#steplm-function)
-    * [`slicefinder`-Function](#slicefinder-function)
-    * [`normalize`-Function](#normalize-function)
-    * [`gnmf`-Function](#gnmf-function)
-    * [`mdedup`-Function](#mdedup-function)
-    * [`msvm`-Function](#msvm-function)
-    * [`naiveBayes`-Function](#naiveBayes-function)
-    * [`naiveBayesPredict`-Function](#naiveBayesPredict-function)
-    * [`outlier`-Function](#outlier-function)
     * [`tomekLink`-Function](#tomekLink-function)
     * [`toOneHot`-Function](#toOneHot-function)
     * [`winsorize`-Function](#winsorize-function)
-    * [`gmm`-Function](#gmm-function)
-    * [`correctTypos`-Function](#correcttypos-function)
+
 
 # Introduction
 
@@ -152,10 +153,12 @@ print(toString(D))
 
 Note that reshape construction is not yet supported for **SPARK** execution.
 
+
 # DML-Bodied Built-In Functions
 
 **DML-bodied built-in functions** are written as DML-Scripts and executed as 
such when called.
 
+
 ## `confusionMatrix`-Function
 
 The `confusionMatrix`-function accepts a vector of predictions and a one-hot-encoded matrix, then it computes the max value
@@ -191,6 +194,41 @@ y = toOneHot(X, numClasses)
 [ConfusionSum, ConfusionAvg] = confusionMatrix(P=z, Y=y)
 ```
 
+
+## `correctTypos`-Function
+
+The `correctTypos`-function tries to correct typos in a given frame. The algorithm operates on the assumption that most strings are correct, and simply swaps strings that do not occur often with similar strings that occur more often. If `correct` is set to FALSE, it only prints suggested corrections without affecting the frame.
+
+### Usage
+
+```r
+correctTypos(strings, frequency_threshold, distance_threshold, decapitalize, 
correct, is_verbose)
+```
+
+### Arguments
+
+| Name                | Type    | Default | Description |
+| :------------------ | :------ | ------- | :---------- |
+| strings             | String  | ---     | The nx1 input frame of corrupted strings |
+| frequency_threshold | Double  | 0.05    | Strings that occur above this relative frequency level will not be corrected |
+| distance_threshold  | Int     | 2       | Max editing distance at which strings are considered similar |
+| decapitalize        | Boolean | TRUE    | Decapitalize all strings before correction |
+| correct             | Boolean | TRUE    | Correct strings or only report potential errors |
+| is_verbose          | Boolean | FALSE   | Print debug information |
+
+### Returns
+
+| Type   | Description |
+| :----- | :---------- |
+| String | Corrected nx1 output frame |
+
+### Example
+
+```r
+A = read("file1", data_type="frame", rows=2000, cols=1, format="binary")
+A_corrected = correctTypos(A, 0.02, 3, FALSE, TRUE)
+```
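+
+To only report potential corrections without modifying the frame, `correct` can be set to FALSE (a sketch with the same assumed input, positional arguments as in the table above):
+
+```r
+correctTypos(A, 0.02, 3, FALSE, FALSE)
+```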
+
+
 ## `cspline`-Function
 
 This `cspline`-function solves cubic spline interpolation. The function uses a natural spline with $$ q_1''(x_0) = q_n''(x_n) = 0 $$.
@@ -199,6 +237,7 @@ By default, it calculates via `csplineDS`-function.
 Algorithm reference: 
https://en.wikipedia.org/wiki/Spline_interpolation#Algorithm_to_find_the_interpolating_cubic_spline
 
 ### Usage
+
 ```r
 [result, K] = cspline(X, Y, inp_x, tol, maxi)
 ```
@@ -233,11 +272,13 @@ max_iter = num_rec
 [result, K] = cspline(X=X, Y=Y, inp_x=inp_x, tol=tolerance, maxi=max_iter)
 ```
 
+
 ## `csplineCG`-Function
 
 This `csplineCG`-function solves cubic spline interpolation with the conjugate gradient method. Usage is the same as for the `cspline`-function.
 
 ### Usage
+
 ```r
 [result, K] = csplineCG(X, Y, inp_x, tol, maxi)
 ```
@@ -271,11 +312,13 @@ max_iter = num_rec
 [result, K] = csplineCG(X=X, Y=Y, inp_x=inp_x, tol=tolerance, maxi=max_iter)
 ```
 
+
 ## `csplineDS`-Function
 
 This `csplineDS`-function solves cubic spline interpolation with a direct solver method.
 
 ### Usage
+
 ```r
 [result, K] = csplineDS(X, Y, inp_x)
 ```
@@ -344,6 +387,7 @@ y = X %*% rand(rows = ncol(X), cols = 1)
 [predict, beta] = cvlm(X = X, y = y, k = 4)
 ```
 
+
 ## `DBSCAN`-Function
 
 The dbscan() implements the DBSCAN clustering algorithm using Euclidean distance.
@@ -375,6 +419,7 @@ X = rand(rows=1780, cols=180, min=1, max=20)
 dbscan(X = X, eps = 2.5, minPts = 360)
 ```
 
+
 ## `decisionTree`-Function
 
 The `decisionTree()` implements the classification tree with both scale and 
categorical
@@ -445,6 +490,7 @@ discoverFD(X, Mask, threshold)
 | :----- | :---------- |
 | Double | matrix of functional dependencies |
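+
+### Example
+
+A minimal sketch, assuming `Mask` is a 1 x ncol(X) column indicator (shapes and threshold are illustrative):
+
+```r
+X = round(rand(rows = 100, cols = 5, min = 1, max = 10))
+Mask = matrix(1, rows = 1, cols = 5)
+FD = discoverFD(X = X, Mask = Mask, threshold = 0.9)
+```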
 
+
 ## `dist`-Function
 
 The `dist`-function is used to compute Euclidean distances between N d-dimensional points.
@@ -475,7 +521,6 @@ Y = dist(X)
 ```
 
 
-
 ## `dmv`-Function
 
 The `dmv`-function is used to find disguised missing values utilising 
syntactical pattern recognition.
@@ -509,6 +554,7 @@ Z = dmv(X=A, threshold=0.9)
 Z = dmv(X=A, threshold=0.9, replace="NaN")
 ```
 
+
 ## `gaussianClassifier`-Function
 
 The `gaussianClassifier`-function computes prior probabilities, means, 
determinants, and inverse
@@ -552,6 +598,7 @@ y = X %*% rand(rows = ncol(X), cols = 1)
 [prior, means, covs, det] = gaussianClassifier(D=X, C=y, varSmoothing=1e-9)
 ```
 
+
 ## `glm`-Function
 
 The `glm`-function is a flexible generalization of ordinary linear regression that allows for response variables that have
@@ -587,18 +634,103 @@ glm(X,Y)
 | Matrix[Double] | Matrix whose size depends on icpt ( icpt=0: ncol(X) x 1;  
icpt=1: (ncol(X) + 1) x 1;  icpt=2: (ncol(X) + 1) x 2) |
 
 ### Example
+
 ```r
 X = rand (rows = 5, cols = 5 )
 y = X %*% rand(rows = ncol(X), cols = 1)
 beta = glm(X=X,Y=y)
 ```
 
+
+## `gmm`-Function
+
+The `gmm`-function implements a builtin Gaussian Mixture Model with four different types of covariance matrices, i.e., VVV, EEE, VVI, and VII, and two initialization methods, namely "kmeans" and "random".
+
+### Usage
+
+```r
+gmm(X=X, n_components = 3,  model = "VVV",  init_params = "random", iter = 
100, reg_covar = 0.000001, tol = 0.0001, verbose=TRUE)
+```
+
+
+### Arguments
+
+| Name         | Type    | Default  | Description |
+| :----------- | :------ | -------- | :---------- |
+| X            | Double  | ---      | Matrix X of feature vectors. |
+| n_components | Integer | 3        | Number of components in the Gaussian mixture model. |
+| model        | String  | "VVV"    | "VVV": unequal variance (full), each component has its own general covariance matrix<br><br>"EEE": equal variance (tied), all components share the same general covariance matrix<br><br>"VVI": spherical, unequal volume (diag), each component has its own diagonal covariance matrix<br><br>"VII": spherical, equal volume (spherical), each component has its own single variance |
+| init_params  | String  | "kmeans" | Initialize weights with "kmeans" or "random". |
+| iter         | Integer | 100      | Number of iterations. |
+| reg_covar    | Double  | 1e-6     | Regularization parameter for the covariance matrix. |
+| tol          | Double  | 0.000001 | Tolerance value for convergence. |
+| verbose      | Boolean | FALSE    | Set to TRUE to print intermediate results. |
+
+
+### Returns
+
+| Name    | Type           | Default  | Description |
+| :------ | :------------- | -------- | :---------- |
+| weight  | Double         | ---      | A matrix whose [i,k]th entry is the probability that observation i in the test data belongs to the kth class. |
+| labels  | Double         | ---      | Prediction matrix. |
+| df      | Integer        | ---      | Number of estimated parameters. |
+| bic     | Double         | ---      | Bayesian information criterion for best iteration. |
+
+### Example
+
+```r
+X = read($1)
+[weight, labels, df, bic] = gmm(X=X, n_components = 3, model = "VVV", init_params = "random", iter = 100, reg_covar = 0.000001, tol = 0.0001, verbose=TRUE)
+```
+
+
+## `gnmf`-Function
+
+The `gnmf`-function performs Gaussian Non-Negative Matrix Factorization.
+In this, a matrix X is factorized into two matrices W and H, such that all 
three matrices have no negative elements.
+This non-negativity makes the resulting matrices easier to inspect.
+
+### Usage
+
+```r
+gnmf(X, rnk, eps = 10^-8, maxi = 10)
+```
+
+### Arguments
+
+| Name    | Type           | Default  | Description |
+| :------ | :------------- | -------- | :---------- |
+| X       | Matrix[Double] | required | Matrix of feature vectors. |
+| rnk     | Integer        | required | Number of components into which matrix 
X is to be factored. |
+| eps     | Double         | `10^-8`  | Tolerance |
+| maxi    | Integer        | `10`     | Maximum number of conjugate gradient 
iterations. |
+
+
+### Returns
+
+| Type           | Description |
+| :------------- | :---------- |
+| Matrix[Double] | List of pattern matrices, one for each repetition. |
+| Matrix[Double] | List of amplitude matrices, one for each repetition. |
+
+### Example
+
+```r
+X = rand(rows = 50, cols = 10)
+[W, H] = gnmf(X = X, rnk = 2, eps = 10^-8, maxi = 10)
+```
+
+
 ## `gridSearch`-Function
 
 The `gridSearch`-function is used to find the optimal hyper-parameters of a model, which results in the most _accurate_ predictions. This function takes `train` and `eval` functions by name.
 
 ### Usage
+
 ```r
 gridSearch(X, y, train, predict, params, paramValues, verbose)
 ```
@@ -632,6 +764,7 @@ paramRanges = list(10^seq(0,-4), 10^seq(-5,-9), 10^seq(1,3))
 [B, opt]= gridSearch(X=X, y=y, train="lm", predict="lmPredict", params=params, 
paramValues=paramRanges, verbose = TRUE)
 ```
 
+
 ## `hyperband`-Function
 
 The `hyperband`-function is used for hyperparameter optimization and is based on multi-armed bandits and early elimination.
@@ -644,6 +777,7 @@ Notes:
 * `hyperband` can only optimize continuous hyperparameters
 
 ### Usage
+
 ```r
 hyperband(X_train, y_train, X_val, y_val, params, paramRanges, R, eta, verbose)
 ```
@@ -684,6 +818,7 @@ paramRanges = matrix("0 20", rows=1, cols=2);
     X_val=X_val, y_val=y_val, params=params, paramRanges=paramRanges);
 ```
 
+
 ## `img_brightness`-Function
 
 The `img_brightness`-function is an image data augmentation function.
@@ -716,6 +851,7 @@ A = rand(rows = 3, cols = 3, min = 0, max = 255)
 B = img_brightness(img_in = A, value = 128, channel_max = 255)
 ```
 
+
 ## `img_crop`-Function
 
 The `img_crop`-function is an image data augmentation function.
@@ -750,6 +886,7 @@ A = rand(rows = 3, cols = 3, min = 0, max = 255)
 B = img_crop(img_in = A, w = 20, h = 10, x_offset = 0, y_offset = 0)
 ```
 
+
 ## `img_mirror`-Function
 
 The `img_mirror`-function is an image data augmentation function.
@@ -781,6 +918,7 @@ A = rand(rows = 3, cols = 3, min = 0, max = 255)
 B = img_mirror(img_in = A, horizontal_axis = TRUE)
 ```
 
+
 ## `imputeByFD`-Function
 
 The `imputeByFD`-function imputes missing values from observed values (if they exist)
@@ -850,6 +988,7 @@ X = read("fileA", data_type="frame")
 ema(X = X, search_iterations = 1, mode = "triple", freq = 4, alpha = 0.1, beta = 0.1, gamma = 0.1)
 ```
 
+
 ## `KMeans`-Function
 
 The kmeans() implements the KMeans clustering algorithm.
@@ -887,6 +1026,7 @@ X = rand (rows = 3972, cols = 972)
 kmeans(X = X, k = 20, runs = 10, max_iter = 5000, eps = 0.000001, is_verbose = 
FALSE, avg_sample_size_per_centroid = 50, seed = -1)
 ```
 
+
 ## `KNN`-Function
 
 The knn() implements the KNN (K Nearest Neighbor) algorithm.
@@ -943,6 +1083,7 @@ depending on the input size of the matrices (See 
[`lmDS`-function](#lmds-functio
 [`lmCG`-function](#lmcg-function) respectively).
 
 ### Usage
+
 ```r
 lm(X, y, icpt = 0, reg = 1e-7, tol = 1e-7, maxi = 0, verbose = TRUE)
 ```
@@ -984,11 +1125,13 @@ y = X %*% rand(rows = ncol(X), cols = 1)
 lm(X = X, y = y)
 ```
 
+
 ## `intersect`-Function
 
 The `intersect`-function implements set intersection for numeric data.
 
 ### Usage
+
 ```r
 intersect(X, Y)
 ```
@@ -1007,14 +1150,14 @@ intersect(X, Y)
 | Double | intersection matrix, set of intersecting items |
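+
+### Example
+
+A minimal sketch with two overlapping 1-column sets (values are illustrative):
+
+```r
+X = matrix("1 2 3 4", rows = 4, cols = 1)
+Y = matrix("3 4 5 6", rows = 4, cols = 1)
+Z = intersect(X = X, Y = Y)
+```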
 
 
-## `lmDS`-Function
+## `lmCG`-Function
 
-The `lmDS`-function solves linear regression by directly solving the *linear 
system*.
+The `lmCG`-function solves linear regression using the *conjugate gradient 
algorithm*.
 
 ### Usage
 
 ```r
-lmDS(X, y, icpt = 0, reg = 1e-7, verbose = TRUE)
+lmCG(X, y, icpt = 0, reg = 1e-7, tol = 1e-7, maxi = 0, verbose = TRUE)
 ```
 
 ### Arguments
@@ -1025,6 +1168,8 @@ lmDS(X, y, icpt = 0, reg = 1e-7, verbose = TRUE)
 | y       | Matrix[Double] | required | 1-column matrix of response values. |
 | icpt    | Integer        | `0`      | Intercept presence, shifting and 
rescaling the columns of X ([Details](#icpt-argument))|
 | reg     | Double         | `1e-7`   | Regularization constant (lambda) for L2-regularization. Set to nonzero for highly dependent/sparse/numerous features|
+| tol     | Double         | `1e-7`   | Tolerance (epsilon); conjugate 
gradient procedure terminates early if L2 norm of the beta-residual is less 
than tolerance * its initial norm|
+| maxi    | Integer        | `0`      | Maximum number of conjugate gradient 
iterations. 0 = no maximum |
 | verbose | Boolean        | `TRUE`   | If `TRUE`, print messages are activated |
 
 ### Returns
@@ -1038,17 +1183,18 @@ lmDS(X, y, icpt = 0, reg = 1e-7, verbose = TRUE)
 ```r
 X = rand (rows = 50, cols = 10)
 y = X %*% rand(rows = ncol(X), cols = 1)
-lmDS(X = X, y = y)
+lmCG(X = X, y = y, maxi = 10)
 ```
 
-## `lmCG`-Function
 
-The `lmCG`-function solves linear regression using the *conjugate gradient 
algorithm*.
+## `lmDS`-Function
+
+The `lmDS`-function solves linear regression by directly solving the *linear 
system*.
 
 ### Usage
 
 ```r
-lmCG(X, y, icpt = 0, reg = 1e-7, tol = 1e-7, maxi = 0, verbose = TRUE)
+lmDS(X, y, icpt = 0, reg = 1e-7, verbose = TRUE)
 ```
 
 ### Arguments
@@ -1059,8 +1205,6 @@ lmCG(X, y, icpt = 0, reg = 1e-7, tol = 1e-7, maxi = 0, 
verbose = TRUE)
 | y       | Matrix[Double] | required | 1-column matrix of response values. |
 | icpt    | Integer        | `0`      | Intercept presence, shifting and 
rescaling the columns of X ([Details](#icpt-argument))|
 | reg     | Double         | `1e-7`   | Regularization constant (lambda) for L2-regularization. Set to nonzero for highly dependent/sparse/numerous features|
-| tol     | Double         | `1e-7`   | Tolerance (epsilon); conjugate 
gradient procedure terminates early if L2 norm of the beta-residual is less 
than tolerance * its initial norm|
-| maxi    | Integer        | `0`      | Maximum number of conjugate gradient 
iterations. 0 = no maximum |
 | verbose | Boolean        | `TRUE`   | If `TRUE`, print messages are activated |
 
 ### Returns
@@ -1074,9 +1218,10 @@ lmCG(X, y, icpt = 0, reg = 1e-7, tol = 1e-7, maxi = 0, 
verbose = TRUE)
 ```r
 X = rand (rows = 50, cols = 10)
 y = X %*% rand(rows = ncol(X), cols = 1)
-lmCG(X = X, y = y, maxi = 10)
+lmDS(X = X, y = y)
 ```
 
+
 ## `lmPredict`-Function
 
 The `lmPredict`-function predicts the class of a feature vector.
@@ -1097,7 +1242,6 @@ lmPredict(X=X, B=w, ytest= Y)
 | icpt    | Integer        | 0        | Intercept presence, shifting and 
rescaling of X ([Details](#icpt-argument))|
 | verbose | Boolean        | FALSE    | Print various statistics for 
evaluating accuracy. |
 
-
 ### Returns
 
 | Type           | Description |
@@ -1113,6 +1257,47 @@ w = lm(X = X, y = y)
 yp = lmPredict(X = X, B = w, ytest=matrix(0,1,1))
 ```
 
+
+## `mdedup`-Function
+
+The `mdedup`-function implements a builtin for deduplication using matching dependencies (e.g., Street 0.95, City 0.90 -> ZIP 1.0) based on the Jaccard distance.
+
+### Usage
+
+```r
+mdedup(X, LHSfeatures, LHSthreshold, RHSfeatures, RHSthreshold, verbose)
+```
+
+### Arguments
+
+| Name          | Type             | Default    | Description |
+| :------       | :-------------   | --------   | :---------- |
+| X             | Frame            | ---        | Input Frame X |
+| LHSfeatures   | Matrix[Integer]  | ---        | A matrix 1xd with numbers of 
columns for MDs |
+| LHSthreshold  | Matrix[Double]   | ---        | A matrix 1xd with threshold 
values in interval [0, 1] for MDs |
+| RHSfeatures   | Matrix[Integer]  | ---        | A matrix 1xd with numbers of 
columns for MDs |
+| RHSthreshold  | Matrix[Double]   | ---        | A matrix 1xd with threshold 
values in interval [0, 1] for MDs |
+| verbose       | Boolean          | False      | Set to true to print 
duplicates.|
+
+### Returns
+
+| Type            | Default  | Description |
+| :-------------- | -------- | :---------- |
+| Matrix[Integer] | ---      | Matrix of duplicates (rows). |
+
+### Example
+
+```r
+X = as.frame(rand(rows = 50, cols = 10))
+LHSfeatures = matrix("1 3 19", 1, 3)
+LHSthreshold = matrix("0.85 0.85 0.85", 1, 3)
+RHSfeatures = matrix("30", 1, 1)
+RHSthreshold = matrix("1.0", 1, 1)
+duplicates = mdedup(X, LHSfeatures, LHSthreshold, RHSfeatures, RHSthreshold, 
verbose = FALSE)
+```
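+
+Read together with the arguments above, this sketch encodes the matching dependency: if columns 1, 3, and 19 match with a similarity of at least 0.85, then column 30 must match exactly (threshold 1.0).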
+
+
 ## `mice`-Function
 
 The `mice`-function implements Multiple Imputation using Chained Equations 
(MICE) for nominal data.
@@ -1138,7 +1323,6 @@ mice(F, cMask, iter, complete, verbose)
 | :------------- | :---------- |
 | Matrix[Double]  | imputed dataset. |
 
-
 ### Example
 
 ```r
@@ -1147,34 +1331,74 @@ cMask = round(rand(rows=1,cols=ncol(F),min=0,max=1))
 dataset = mice(F, cMask, iter = 3, verbose = FALSE)
 ```
 
-## `multiLogReg`-Function
 
-The `multiLogReg`-function solves Multinomial Logistic Regression using Trust 
Region method.
-(See: Trust Region Newton Method for Logistic Regression, Lin, Weng and 
Keerthi, JMLR 9 (2008) 627-650)
+## `msvm`-Function
+
+The `msvm`-function implements a builtin multiclass SVM with squared slack variables. It learns one-against-the-rest binary-class classifiers by making a function call to `l2SVM`.
 
 ### Usage
 
 ```r
-multiLogReg(X, Y, icpt, reg, tol, maxi, maxii, verbose)
+msvm(X, Y, intercept, epsilon, lambda, maxIterations, verbose)
 ```
 
 ### Arguments
 
-| Name  | Type   | Default | Description |
-| :---- | :----- | ------- | :---------- |
-| X     | Double | --      | The matrix of feature vectors |
-| Y     | Double | --      | The matrix with category labels |
-| icpt  | Int    | `0`     | Intercept presence, shifting and rescaling X 
columns: 0 = no intercept, no shifting, no rescaling; 1 = add intercept, but 
neither shift nor rescale X; 2 = add intercept, shift & rescale X columns to 
mean = 0, variance = 1 |
-| reg   | Double | `0`     | regularization parameter (lambda = 1/C); 
intercept is not regularized |
-| tol   | Double | `1e-6`  | tolerance ("epsilon") |
-| maxi  | Int    | `100`   | max. number of outer newton interations |
-| maxii | Int    | `0`     | max. number of inner (conjugate gradient) 
iterations |
-
-### Returns
-
-| Type   | Description |
-| :----- | :---------- |
-| Double | Regression betas as output for prediction |
+| Name          | Type    | Default | Description |
+| :------------ | :------ | ------- | :---------- |
+| X             | Double  | ---     | Matrix X of feature vectors. |
+| Y             | Double  | ---     | Matrix Y of class labels. |
+| intercept     | Boolean | FALSE   | No intercept (if set to TRUE, a constant bias column is added to X). |
+| num_classes   | Integer | 10      | Number of classes. |
+| epsilon       | Double  | 0.001   | Procedure terminates early if the reduction in objective function value is less than epsilon (tolerance) times the initial objective function value. |
+| lambda        | Double  | 1.0     | Regularization parameter (lambda) for L2 regularization. |
+| maxIterations | Integer | 100     | Maximum number of conjugate gradient iterations. |
+| verbose       | Boolean | FALSE   | Set to TRUE to print while training. |
+
+### Returns
+
+| Name    | Type           | Default  | Description |
+| :------ | :------------- | -------- | :---------- |
+| model   | Double         | ---      | Model matrix. |
+
+### Example
+
+```r
+X = rand(rows = 50, cols = 10)
+y = round(X %*% rand(rows=ncol(X), cols=1))
+model = msvm(X = X, Y = y, intercept = FALSE, epsilon = 0.005, lambda = 1.0, 
maxIterations = 100, verbose = FALSE)
+```
+
+
+## `multiLogReg`-Function
+
+The `multiLogReg`-function solves Multinomial Logistic Regression using Trust 
Region method.
+(See: Trust Region Newton Method for Logistic Regression, Lin, Weng and 
Keerthi, JMLR 9 (2008) 627-650)
+
+### Usage
+
+```r
+multiLogReg(X, Y, icpt, reg, tol, maxi, maxii, verbose)
+```
+
+### Arguments
+
+| Name  | Type   | Default | Description |
+| :---- | :----- | ------- | :---------- |
+| X     | Double | --      | The matrix of feature vectors |
+| Y     | Double | --      | The matrix with category labels |
+| icpt  | Int    | `0`     | Intercept presence, shifting and rescaling X 
columns: 0 = no intercept, no shifting, no rescaling; 1 = add intercept, but 
neither shift nor rescale X; 2 = add intercept, shift & rescale X columns to 
mean = 0, variance = 1 |
+| reg   | Double | `0`     | regularization parameter (lambda = 1/C); 
intercept is not regularized |
+| tol   | Double | `1e-6`  | tolerance ("epsilon") |
+| maxi  | Int    | `100`   | max. number of outer Newton iterations |
+| maxii | Int    | `0`     | max. number of inner (conjugate gradient) 
iterations |
+
+### Returns
+
+| Type   | Description |
+| :----- | :---------- |
+| Double | Regression betas as output for prediction |
 
 ### Example
 
@@ -1185,6 +1409,138 @@ betas = multiLogReg(X = X, Y = Y, icpt = 2,  tol = 
0.000001, reg = 1.0, maxi = 1
 ```
 
 
+## `naiveBayes`-Function
+
+The `naiveBayes`-function computes the class conditional probabilities and 
class priors.
+
+### Usage
+
+```r
+naiveBayes(D, C, laplace, verbose)
+```
+
+### Arguments
+
+| Name            | Type           | Default  | Description |
+| :------         | :------------- | -------- | :---------- |
+| D               | Matrix[Double] | required | One-dimensional column matrix with N rows. |
+| C               | Matrix[Double] | required | One-dimensional column matrix with N rows. |
+| laplace         | Double         | `1`      | Laplace smoothing value. |
+| verbose         | Boolean        | `TRUE`   | If `TRUE`, print additional information. |
+
+### Returns
+
+| Type           | Description |
+| :------------- | :---------- |
+| Matrix[Double] | Class priors, one-dimensional column matrix with N rows. |
+| Matrix[Double] | Class conditional probabilities, one-dimensional column matrix with N rows. |
+
+### Example
+
+```r
+D = rand(rows = 10, cols = 1, min = 0, max = 10)
+C = round(rand(rows = 10, cols = 1, min = 1, max = 10))
+[prior, classConditionals] = naiveBayes(D, C, laplace = 1, verbose = TRUE)
+```
+
+
+## `naiveBayesPredict`-Function
+
+The `naiveBayesPredict`-function predicts scores with a naive Bayes model.
+
+### Usage
+
+```r
+naiveBayesPredict(X=X, P=P, C=C)
+```
+
+### Arguments
+
+| Name    | Type           | Default  | Description |
+| :------ | :------------- | -------- | :---------- |
+| X       | Matrix[Double] | required | Matrix of test data with N rows. |
+| P       | Matrix[Double] | required | Class priors, one-dimensional column matrix with N rows. |
+| C       | Matrix[Double] | required | Class conditional probabilities, matrix with N rows. |
+
+### Returns
+
+| Type           | Description |
+| :------------- | :---------- |
+| Matrix[Double] | A matrix containing the top-K item-ids with highest 
predicted ratings. |
+| Matrix[Double] | A matrix containing predicted ratings. |
+
+### Example
+
+```r
+[YRaw, Y] = naiveBayesPredict(X=data, P=model_prior, C=model_conditionals)
+```
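+
+The inputs above are placeholders; a self-contained sketch chaining training and prediction (shapes and class count are illustrative):
+
+```r
+D = rand(rows = 10, cols = 4)
+C = round(rand(rows = 10, cols = 1, min = 1, max = 3))
+[prior, conditionals] = naiveBayes(D, C, laplace = 1, verbose = FALSE)
+[YRaw, Y] = naiveBayesPredict(X = D, P = prior, C = conditionals)
+```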
+
+
+## `normalize`-Function
+
+The `normalize`-function normalises the values of a matrix by changing the 
dataset to use a common scale.
+This is done while preserving differences in the ranges of values.
+The output is a matrix of values in range [0,1].
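+
+Conceptually this is min-max rescaling; a sketch of an equivalent computation (assuming per-column application, which is an assumption here rather than confirmed from the builtin source):
+
+```r
+# illustrative sketch only, not the builtin's actual implementation
+Y = (X - colMins(X)) / (colMaxs(X) - colMins(X))
+```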
+
+### Usage
+
+```r
+normalize(X); 
+```
+
+### Arguments
+
+| Name    | Type           | Default  | Description |
+| :------ | :------------- | -------- | :---------- |
+| X       | Matrix[Double] | required | Matrix of feature vectors. |
+
+### Returns
+
+| Type           | Description |
+| :------------- | :---------- |
+| Matrix[Double] | Matrix of normalized values. |
+
+### Example
+
+```r
+X = rand(rows = 50, cols = 10)
+y = normalize(X = X)
+```
+
+
+## `outlier`-Function
+
+This `outlier`-function takes a matrix dataset as input and determines which point(s) have the largest difference from the mean.
+
+### Usage
+
+```r
+outlier(X, opposite)
+```
+
+### Arguments
+
+| Name     | Type           | Default  | Description |
+| :------- | :------------- | -------- | :---------- |
+| X        | Matrix[Double] | required | Matrix of recoded dataset for outlier evaluation |
+| opposite | Boolean        | required | TRUE (1) to evaluate the outlier from the upper quartile range; FALSE (0) to evaluate it from the lower quartile range |
+
+### Returns
+
+| Type           | Description |
+| :------------- | :---------- |
+| Matrix[Double] | matrix indicating outlier values |
+
+### Example
+
+```r
+X = rand (rows = 50, cols = 10)
+outlier(X = X, opposite = TRUE)
+```
+
+
 ## `pnmf`-Function
 
 The `pnmf`-function implements Poisson Non-negative Matrix Factorization 
(PNMF). Matrix `X` is factorized into
@@ -1207,7 +1563,6 @@ pnmf(X, rnk, eps = 10^-8, maxi = 10, verbose = TRUE)
 | maxi    | Integer        | `10`     | Maximum number of conjugate gradient 
iterations. |
 | verbose | Boolean        | TRUE     | If TRUE, 'iter' and 'obj' are printed.|
 
-
 ### Returns
 
 | Type           | Description |
@@ -1222,6 +1577,7 @@ X = rand(rows = 50, cols = 10)
 [W, H] = pnmf(X = X, rnk = 2, eps = 10^-8, maxi = 10, verbose = TRUE)
 ```
 
+
 ## `scale`-Function
 
 The scale function is a generic function whose default method centers or scales the columns of a numeric matrix.
@@ -1255,12 +1611,14 @@ scale=TRUE;
 Y= scale(X,center,scale)
 ```
 
+
 ## `sherlock`-Function
 
 Implements training phase of Sherlock: A Deep Learning Approach to Semantic 
Data Type Detection
 
 [Hulsebos, Madelon, et al. "Sherlock: A deep learning approach to semantic 
data type detection."
 Proceedings of the 25th ACM SIGKDD International Conference on Knowledge 
Discovery & Data Mining., 2019]
+
 ### Usage
 
 ```r
@@ -1317,6 +1675,7 @@ Implements prediction and evaluation phase of Sherlock: A 
Deep Learning Approach
 
 [Hulsebos, Madelon, et al. "Sherlock: A deep learning approach to semantic 
data type detection."
 Proceedings of the 25th ACM SIGKDD International Conference on Knowledge 
Discovery & Data Mining., 2019]
+
 ### Usage
 
 ```r
@@ -1324,6 +1683,7 @@ sherlockPredict(X, cW1, cb1, cW2, cb2, cW3, cb3, wW1, 
wb1, wW2, wb2, wW3, wb3,
                    pW1, pb1, pW2, pb2, pW3, pb3, sW1, sb1, sW2, sb2, sW3, sb3,
                    fW1, fb1, fW2, fb2, fW3, fb3)
 ```
+
 ### Arguments
 
 | Name    | Type           | Default  | Description |
@@ -1375,6 +1735,7 @@ fW3,  fb3)
 [loss, accuracy] = sherlockPredict::eval(probs, processed_val_labels)
 ```
 
+
 ## `sigmoid`-Function
 
 The Sigmoid function is a type of activation function, also defined as a squashing function, which limits the output
@@ -1405,6 +1766,45 @@ sigmoid(X)
 X = rand (rows = 20, cols = 10)
 Y = sigmoid(X)
 ```
+
+
+## `slicefinder`-Function
+
+The `slicefinder`-function returns the top-k worst-performing subsets according to a model calculation.
+
+### Usage
+
+```r
+slicefinder(X, W, y, k, paq, S);
+```
+
+### Arguments
+
+| Name    | Type           | Default  | Description |
+| :------ | :------------- | -------- | :---------- |
+| X       | Matrix[Double] | required | Recoded dataset into Matrix |
+| W       | Matrix[Double] | required | Trained model |
+| y       | Matrix[Double] | required | 1-column matrix of response values. |
+| k       | Integer        | 1        | Number of subsets required |
+| paq     | Integer        | 1        | Number of values wanted for each column; if paq = 1 it is off |
+| S       | Integer        | 2        | Number of subsets to combine (currently only 1 and 2 are supported) |
+
+### Returns
+
+| Type           | Description |
+| :------------- | :---------- |
+| Matrix[Double] | Matrix containing the information of top-K slices (relative error, standard error, value0, value1, col_number(sort), rows, cols, range_row, range_cols, value00, value01, col_number2(sort), rows2, cols2, range_row2, range_cols2) |
+
+### Example
+
+```r
+X = rand (rows = 50, cols = 10)
+y = X %*% rand(rows = ncol(X), cols = 1)
+w = lm(X = X, y = y)
+ress = slicefinder(X = X, W = w, y = y, k = 5, paq = 1, S = 2);
+```
+
+
 ## `smote`-Function
 
 The `smote`-function (Synthetic Minority Oversampling Technique) implements a classical technique for handling class imbalance.
@@ -1442,6 +1842,8 @@ smote(X, s, k, verbose);
 X = rand (rows = 50, cols = 10)
 B = smote(X = X, s=200, k=3, verbose=TRUE);
 ```
+
+
 ## `steplm`-Function
 
 The `steplm`-function (stepwise linear regression) implements a classical 
forward feature selection method.
@@ -1493,223 +1895,19 @@ y = X %*% rand(rows = ncol(X), cols = 1)
 [C, S] = steplm(X = X, y = y, icpt = 1);
 ```
 
-## `slicefinder`-Function
 
-The `slicefinder`-function returns top-k worst performing subsets according to 
a model calculation.
+## `tomekLink`-Function
+
+The `tomekLink`-function performs undersampling by removing Tomek's links for imbalanced
+multiclass problems. A Tomek link is a pair of samples from different classes that are each
+other's nearest neighbors.
+
+Reference:
+"Two Modifications of CNN," in IEEE Transactions on Systems, Man, and 
Cybernetics, vol. SMC-6, no. 11, pp. 769-772, Nov. 1976, doi: 
10.1109/TSMC.1976.4309452.
 
 ### Usage
 
 ```r
-slicefinder(X,W, y, k, paq, S);
-```
-
-### Arguments
-
-| Name    | Type           | Default  | Description |
-| :------ | :------------- | -------- | :---------- |
-| X       | Matrix[Double] | required | Recoded dataset into Matrix |
-| W       | Matrix[Double] | required | Trained model |
-| y       | Matrix[Double] | required | 1-column matrix of response values. |
-| k       | Integer        | 1        | Number of subsets required |
-| paq     | Integer        | 1        | amount of values wanted for each col, 
if paq = 1 then its off |
-| S       | Integer        | 2        | amount of subsets to combine (for now 
supported only 1 and 2) |
-
-### Returns
-
-| Type           | Description |
-| :------------- | :---------- |
-| Matrix[Double] | Matrix containing the information of top_K slices (relative 
error, standart error, value0, value1, col_number(sort), rows, 
cols,range_row,range_cols, value00, value01,col_number2(sort), rows2, 
cols2,range_row2,range_cols2) |
-
-### Usage
-
-```r
-X = rand (rows = 50, cols = 10)
-y = X %*% rand(rows = ncol(X), cols = 1)
-w = lm(X = X, y = y)
-ress = slicefinder(X = X,W = w, Y = y,  k = 5, paq = 1, S = 2);
-```
-
-## `normalize`-Function
-
-The `normalize`-function normalises the values of a matrix by changing the 
dataset to use a common scale.
-This is done while preserving differences in the ranges of values.
-The output is a matrix of values in range [0,1].
-
-### Usage
-
-```r
-normalize(X); 
-```
-
-### Arguments
-
-| Name    | Type           | Default  | Description |
-| :------ | :------------- | -------- | :---------- |
-| X       | Matrix[Double] | required | Matrix of feature vectors. |
-
-
-### Returns
-
-| Type           | Description |
-| :------------- | :---------- |
-| Matrix[Double] | 1-column matrix of normalized values. |
-
-
-
-### Example
-
-```r
-X = rand(rows = 50, cols = 10)
-y = X %*% rand(rows = ncol(X), cols = 1)
-y = normalize(X = X)
-```
-
-## `gnmf`-Function
-
-The `gnmf`-function does Gaussian Non-Negative Matrix Factorization.
-In this, a matrix X is factorized into two matrices W and H, such that all 
three matrices have no negative elements.
-This non-negativity makes the resulting matrices easier to inspect.
-
-### Usage
-
-```r
-gnmf(X, rnk, eps = 10^-8, maxi = 10)
-```
-
-### Arguments
-
-| Name    | Type           | Default  | Description |
-| :------ | :------------- | -------- | :---------- |
-| X       | Matrix[Double] | required | Matrix of feature vectors. |
-| rnk     | Integer        | required | Number of components into which matrix 
X is to be factored. |
-| eps     | Double         | `10^-8`  | Tolerance |
-| maxi    | Integer        | `10`     | Maximum number of conjugate gradient 
iterations. |
-
-
-### Returns
-
-| Type           | Description |
-| :------------- | :---------- |
-| Matrix[Double] | List of pattern matrices, one for each repetition. |
-| Matrix[Double] | List of amplitude matrices, one for each repetition. |
-
-### Example
-
-```r
-X = rand(rows = 50, cols = 10)
-W = rand(rows = nrow(X), cols = 2, min = -0.05, max = 0.05);
-H = rand(rows = 2, cols = ncol(X), min = -0.05, max = 0.05);
-gnmf(X = X, rnk = 2, eps = 10^-8, maxi = 10)
-```
-
-## `naiveBayes`-Function
-
-The `naiveBayes`-function computes the class conditional probabilities and 
class priors.
-
-### Usage
-
-```r
-naiveBayes(D, C, laplace, verbose)
-```
-
-### Arguments
-
-| Name            | Type           | Default  | Description |
-| :------         | :------------- | -------- | :---------- |
-| D               | Matrix[Double] | required | One dimensional column matrix 
with N rows. |
-| C               | Matrix[Double] | required | One dimensional column matrix 
with N rows. |
-| Laplace         | Double         | `1`      | Any Double value. |
-| Verbose         | Boolean        | `TRUE`   | Boolean value. |
-
-### Returns
-
-| Type           | Description |
-| :------------- | :---------- |
-| Matrix[Double] | Class priors, One dimensional column matrix with N rows. |
-| Matrix[Double] | Class conditional probabilites, One dimensional column 
matrix with N rows. |
-
-### Example
-
-```r
-D=rand(rows=10,cols=1,min=10)
-C=rand(rows=10,cols=1,min=10)
-[prior, classConditionals] = naiveBayes(D, C, laplace = 1, verbose = TRUE)
-```
-
-## `naiveBaysePredict`-Function
-
-The `naiveBaysePredict`-function predicts the scoring with a naive Bayes model.
-
-### Usage
-
-```r
-naiveBaysePredict(X=X, P=P, C=C)
-```
-
-### Arguments
-
-| Name    | Type           | Default  | Description |
-| :------ | :------------- | -------- | :---------- |
-| X       | Matrix[Double] | required | Matrix of test data with N rows. |
-| P       | Matrix[Double] | required | Class priors, One dimensional column 
matrix with N rows. |
-| C       | Matrix[Double] | required | Class conditional probabilities, 
matrix with N rows. |
-
-### Returns
-
-| Type           | Description |
-| :------------- | :---------- |
-| Matrix[Double] | A matrix containing the top-K item-ids with highest 
predicted ratings. |
-| Matrix[Double] | A matrix containing predicted ratings. |
-
-### Example
-
-```r
-[YRaw, Y] = naiveBaysePredict(X=data, P=model_prior, C=model_conditionals)
-```
-
-## `outlier`-Function
-
-This `outlier`-function takes a matrix data set as input from where it 
determines which point(s)
-have the largest difference from mean.
-
-### Usage
-
-```r
-outlier(X, opposite)
-```
-
-### Arguments
-
-| Name     | Type           | Default  | Description |
-| :------- | :------------- | -------- | :---------- |
-| X        | Matrix[Double] | required | Matrix of Recoded dataset for outlier 
evaluation |
-| opposite | Boolean        | required | (1)TRUE for evaluating outlier from 
upper quartile range, (0)FALSE for evaluating outlier from lower quartile range 
|
-
-### Returns
-
-| Type           | Description |
-| :------------- | :---------- |
-| Matrix[Double] | matrix indicating outlier values |
-
-### Example
-
-```r
-X = rand (rows = 50, cols = 10)
-outlier(X=X, opposite=1)
-```
-
-## `tomekLink`-Function
-
-The `tomekLink`-function performs undersampling by removing Tomek's links for 
imbalanced
-multiclass problems
-
-Reference:
-"Two Modifications of CNN," in IEEE Transactions on Systems, Man, and 
Cybernetics, vol. SMC-6, no. 11, pp. 769-772, Nov. 1976, doi: 
10.1109/TSMC.1976.4309452.
-
-### Usage
-
-```r
-[X_under, y_under, drop_idx] = tomeklink(X, y)
+[X_under, y_under, drop_idx] = tomeklink(X, y)
 ```
 
 ### Arguments
@@ -1735,6 +1933,7 @@ y = round(rand(rows = nrow(X), cols = 1, min = 0, max = 
1))
 [X_under, y_under, drop_idx] = tomeklink(X, y)
 ```
 
+
 ## `toOneHot`-Function
 
 The `toOneHot`-function encodes an unordered categorical vector into multiple binarized vectors.
@@ -1766,88 +1965,6 @@ X = round(rand(rows = 10, cols = 10, min = 1, max = 
numClasses))
 y = toOneHot(X,numClasses)
 ```
 
-## `mdedup`-Function
-
-The `mdedup`-function implements builtin for deduplication using matching 
dependencies
-(e.g. Street 0.95, City 0.90 -> ZIP 1.0) by Jaccard distance.
-
-### Usage
-
-```r
-mdedup(X, Y, intercept, epsilon, lamda, maxIterations, verbose)
-```
-
-
-### Arguments
-
-| Name          | Type             | Default    | Description |
-| :------       | :-------------   | --------   | :---------- |
-| X             | Frame            | ---        | Input Frame X |
-| LHSfeatures   | Matrix[Integer]  | ---        | A matrix 1xd with numbers of 
columns for MDs |
-| LHSthreshold  | Matrix[Double]   | ---        | A matrix 1xd with threshold 
values in interval [0, 1] for MDs |
-| RHSfeatures   | Matrix[Integer]  | ---        | A matrix 1xd with numbers of 
columns for MDs |
-| RHSthreshold  | Matrix[Double]   | ---        | A matrix 1xd with threshold 
values in interval [0, 1] for MDs |
-| verbose       | Boolean          | False      | Set to true to print 
duplicates.|
-
-
-### Returns
-
-| Type            | Default  | Description |
-| :-------------- | -------- | :---------- |
-| Matrix[Integer] | ---      | Matrix of duplicates (rows). |
-
-
-### Example
-
-```r
-X = as.frame(rand(rows = 50, cols = 10))
-LHSfeatures = matrix("1 3 19", 1, 2)
-LHSthreshold = matrix("0.85 0.85", 1, 2)
-RHSfeatures = matrix("30", 1, 1)
-RHSthreshold = matrix("1.0", 1, 1)
-duplicates = mdedup(X, LHSfeatures, LHSthreshold, RHSfeatures, RHSthreshold, 
verbose = FALSE)
-```
-
-## `msvm`-Function
-
-The `msvm`-function implements builtin multiclass SVM with squared slack 
variables
-It learns one-against-the-rest binary-class classifiers by making a function 
call to l2SVM
-
-### Usage
-
-```r
-msvm(X, Y, intercept, epsilon, lamda, maxIterations, verbose)
-```
-
-
-### Arguments
-
-| Name          | Type             | Default    | Description |
-| :------       | :-------------   | --------   | :---------- |
-| X             | Double           | ---        | Matrix X of feature vectors.|
-| Y             | Double           | ---        | Matrix Y of class labels. |
-| intercept     | Boolean          | False      | No Intercept ( If set to 
TRUE then a constant bias column is added to X)|
-| num_classes   | Integer          | 10         | Number of classes.|
-| epsilon       | Double           | 0.001      | Procedure terminates early 
if the reduction in objective function value is less than epsilon (tolerance) 
times the initial objective function value.|
-| lamda         | Double           | 1.0        | Regularization parameter 
(lambda) for L2 regularization|
-| maxIterations | Integer          | 100        | Maximum number of conjugate 
gradient iterations|
-| verbose       | Boolean          | False      | Set to true to print while 
training.|
-
-
-### Returns
-
-| Name    | Type           | Default  | Description |
-| :------ | :------------- | -------- | :---------- |
-| model   | Double         | ---      | Model matrix. |
-
-
-### Example
-
-```r
-X = rand(rows = 50, cols = 10)
-y = round(X %*% rand(rows=ncol(X), cols=1))
-model = msvm(X = X, Y = y, intercept = FALSE, epsilon = 0.005, lambda = 1.0, 
maxIterations = 100, verbose = FALSE)
-```
 
 ## `winsorize`-Function
 
@@ -1880,77 +1997,3 @@ X = rand(rows=10, cols=10,min = 1, max=9)
 Y = winsorize(X=X)
 ```
 
-## `gmm`-Function
-
-The `gmm`-function implements builtin Gaussian Mixture Model with four 
different types of
-covariance matrices i.e., VVV, EEE, VVI, VII and two initialization methods 
namely "kmeans" and "random".
-
-### Usage
-
-```r
-gmm(X=X, n_components = 3,  model = "VVV",  init_params = "random", iter = 
100, reg_covar = 0.000001, tol = 0.0001, verbose=TRUE)
-```
-
-
-### Arguments
-
-| Name          | Type             | Default    | Description |
-| :------       | :-------------   | --------   | :---------- |
-| X             | Double           | ---        | Matrix X of feature vectors.|
-| n_components             | Integer           | 3        | Number of 
n_components in the Gaussian mixture model |
-| model     | String          | "VVV"| "VVV": unequal variance (full),each 
component has its own general covariance matrix<br><br>"EEE": equal variance 
(tied), all components share the same general covariance matrix<br><br>"VVI": 
spherical, unequal volume (diag), each component has its own diagonal 
covariance matrix<br><br>"VII": spherical, equal volume (spherical), each 
component has its own single variance |
-| init_param   | String          | "kmeans"         | initialize weights with 
"kmeans" or "random"|
-| iterations       | Integer           | 100    |  Number of iterations|
-| reg_covar         | Double           | 1e-6        | regularization 
parameter for covariance matrix|
-| tol | Double          | 0.000001        |tolerance value for convergence |
-| verbose       | Boolean          | False      | Set to true to print 
intermediate results.|
-
-
-### Returns
-
-| Name    | Type           | Default  | Description |
-| :------ | :------------- | -------- | :---------- |
-| weight    |      Double  | ---      |A matrix whose [i,k]th entry is the 
probability that observation i in the test data belongs to the kth class|
-|labels   |       Double  | ---     | Prediction matrix|
-|df |              Integer  |---  |    Number of estimated parameters|
-| bic |             Double  | ---  |    Bayesian information criterion for 
best iteration|
-
-### Example
-
-```r
-X = read($1)
-[labels, df, bic] = gmm(X=X, n_components = 3,  model = "VVV",  init_params = 
"random", iter = 100, reg_covar = 0.000001, tol = 0.0001, verbose=TRUE)
-```
-
-## `correctTypos`-Function
-
-The `correctTypos` - function tries to correct typos in a given frame. This 
algorithm operates on the assumption that most strings are correct and simply 
swaps strings that do not occur often with similar strings that occur more 
often. If correct is set to FALSE only prints suggested corrections without 
effecting the frame.
-
-### Usage
-
-```r
-correctTypos(strings, frequency_threshold, distance_threshold, decapitalize, 
correct, is_verbose)
-```
-
-### Arguments
-
-| NAME    | TYPE           | DEFAULT  | Description |
-| :------ | :------------- | -------- | :---------- |
-| strings | String  |   ---  |    The nx1 input frame of corrupted strings |
-| frequency_threshold   |            Double   | 0.05 |    Strings that occur 
above this relative frequency level will not be corrected |
-| distance_threshold         |       Int   |    2   |     Max editing distance 
at which strings are considered similar |
-| decapitalize            |          Boolean  | TRUE  |   Decapitalize all 
strings before correction |
-| correct              |             Boolean  | TRUE |    Correct strings or 
only report potential errors |
-| is_verbose                    |    Boolean |  FALSE |   Print debug 
information |
-
-### Returns
-
-|   TYPE     |   Description|
-|  :------------- |  :---------- |
-|      String   |  Corrected nx1 output frame |
-
-### Example
-```r
-A = read(“file1”, data_type=”frame”, rows=2000, cols=1, format=”binary”)
-A_corrected = correctTypos(A, 0.02, 3, FALSE, TRUE)
-```
