This is an automated email from the ASF dual-hosted git repository.

fmcquillan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/madlib.git
The following commit(s) were added to refs/heads/master by this push:
     new fe1c1f5  clarify input row weights vs network weights in user docs for MLP
fe1c1f5 is described below

commit fe1c1f5915cc7c5c0dfa7422e3b6a7713402524f
Author: Frank McQuillan <fmcquil...@pivotal.io>
AuthorDate: Mon Mar 8 15:35:08 2021 -0800

    clarify input row weights vs network weights in user docs for MLP
---
 src/ports/postgres/modules/convex/mlp.sql_in | 26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/src/ports/postgres/modules/convex/mlp.sql_in b/src/ports/postgres/modules/convex/mlp.sql_in
index d6ce7ce..d98f8c4 100644
--- a/src/ports/postgres/modules/convex/mlp.sql_in
+++ b/src/ports/postgres/modules/convex/mlp.sql_in
@@ -152,19 +152,20 @@ mlp_classification(
 <DT>weights (optional)</DT>
 <DD>TEXT, default: 1.
-    Weights for input rows. Column name which specifies the weight for each input row.
-    This weight will be incorporated into the update during stochastic gradient
-    descent (SGD), but will not be used for loss calculations. If not specified,
-    weight for each row will default to 1 (equal weights). Column should be a
-    numeric type.
+    Column name for giving different weights to different rows during training.
+    E.g., a weight of two for a specific row is equivalent to duplicating that row.
+    This weight is incorporated into the update during stochastic gradient
+    descent (SGD), but is not used for loss calculations. If not specified,
+    weight for each row will default to 1 (equal weights). Column should be a
+    numeric type.
     @note
-    The 'weights' parameter is not currently for mini-batching.
+    The 'weights' parameter cannot be used with mini-batching of the source dataset.
 </DD>

 <DT>warm_start (optional)</DT>
 <DD>BOOLEAN, default: FALSE.
-    Initalize weights with the coefficients from the last call of the training
-    function. If set to true, weights will be initialized from the output_table
+    Initialize neural network weights with the coefficients from the last call of the training
+    function. If set to true, neural network weights will be initialized from the output_table
     generated by the previous run. Note that all parameters other than
     optimizer_params and verbose must remain constant between calls when
     warm_start is used.
@@ -173,7 +174,7 @@ mlp_classification(
     The warm start feature works based on the name of the output_table. When
     using warm start, do not drop the output table or the output table summary
     before calling the training function, since these are needed to obtain the
-    weights from the previous run.
+    neural network weights from the previous run.
     If you are not using warm start, the output table and the output
     table summary must be dropped in the usual way before calling the training
     function.
@@ -294,7 +295,8 @@ A summary table named \<output_table\>_summary is also created, which has the fo
     </tr>
     <tr>
         <th>weights</th>
-        <td>The weight column used during training.</td>
+        <td>The weight column used during training for giving different
+        weights to different rows.</td>
     </tr>
     <tr>
         <th>grouping_col</th>
@@ -421,7 +423,7 @@ a factor of gamma. Valid for learning rate policy = 'step'.

 <DT>n_tries</dt>
 <DD>Default: 1. Number of times to retrain the network with randomly initialized
-weights.
+neural network weights.
 </DD>

 <DT>lambda</dt>
@@ -954,7 +956,7 @@ num_iterations | 450
 </pre>
 Notice that the loss is lower compared to the previous example, despite having
 the same values for every other parameter. This is because the algorithm
-learned three different models starting with a different set of initial weights
+learned three different models starting with a different set of initial weights
 for the coefficients, and chose the best model among them as the initial
 weights for the coefficients when run with warm start.
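For context, a minimal sketch of how the per-row weights column described in this change might be passed to the training function. This assumes the standard MADlib `mlp_classification` argument order (source table, output table, independent/dependent columns, hidden layer sizes, optimizer params, activation, weights); the table `iris_data` and columns `attributes`, `class`, and `row_weight` are hypothetical placeholders, not part of the commit.

```sql
-- Sketch only: per-row weights via the 'weights' argument.
-- Note: per the docs above, 'weights' cannot be combined with mini-batching.
DROP TABLE IF EXISTS mlp_model, mlp_model_summary, mlp_model_standardization;

SELECT madlib.mlp_classification(
    'iris_data',      -- source table (hypothetical)
    'mlp_model',      -- output table
    'attributes',     -- independent variables (numeric array column)
    'class',          -- dependent variable
    ARRAY[5],         -- one hidden layer of 5 units
    'learning_rate_init=0.003,
     n_iterations=500,
     n_tries=3',      -- retrain 3 times with random initial network weights
    'tanh',           -- activation
    'row_weight'      -- column of input-row weights (row weight 2 ~ duplicating the row)
);
```

A row with `row_weight = 2` then contributes to the SGD updates as if it appeared twice in the source table, while the reported loss ignores the weights, matching the clarified documentation above.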