[
https://issues.apache.org/jira/browse/SYSTEMML-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
LI Guobao updated SYSTEMML-2299:
Description:
The objective of the “paramserv” built-in function is to update an initial or
existing model according to a given configuration. An initial function
signature would be:
{code:java}
model'=paramserv(model=paramsList, features=X, labels=Y, val_features=X_val,
val_labels=Y_val, upd="fun1", agg="fun2", mode="LOCAL", utype="BSP",
freq="BATCH", epochs=100, batchsize=64, k=7, scheme="disjoint_contiguous",
hyperparams=params, checkpointing="NONE"){code}
The function takes the model (a struct-like data structure consisting of the
weights, the biases and the hyperparameters), the training features and
labels, the validation features and labels, the batch update function (i.e.,
the gradient calculation function), the update strategy (e.g. sync, async,
hogwild!, stale-synchronous), the update frequency (e.g. epoch or mini-batch),
the gradient aggregation function, the number of epochs, the batch size, the
degree of parallelism, the data partition scheme, a list of additional
hyperparameters, and the checkpointing strategy. It returns a trained model in
struct format.
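As a rough illustration of how these pieces could fit together, the following
Python sketch runs a BSP-style loop over disjoint contiguous partitions. It is
not the SystemML implementation: paramserv_bsp, gradients and aggregate are
hypothetical names, the model is a plain dict for y = w*x + b, and the workers
run sequentially here instead of in parallel.

```python
from typing import Callable, Dict, List, Tuple

Model = Dict[str, float]            # hypothetical stand-in for the struct-like model
Batch = List[Tuple[float, float]]   # (feature, label) pairs
Fn = Callable[[Model, object, Dict[str, float]], Model]


def gradients(model: Model, batch: Batch, hyperparams: Dict[str, float]) -> Model:
    """Hypothetical 'upd' function: mean-squared-error gradients for y = w*x + b."""
    n = len(batch)
    gw = sum(2 * (model["w"] * x + model["b"] - y) * x for x, y in batch) / n
    gb = sum(2 * (model["w"] * x + model["b"] - y) for x, y in batch) / n
    return {"w": gw, "b": gb}


def aggregate(model: Model, grads: Model, hyperparams: Dict[str, float]) -> Model:
    """Hypothetical 'agg' function: apply one plain SGD step to the model."""
    lr = hyperparams["lr"]
    return {name: value - lr * grads[name] for name, value in model.items()}


def paramserv_bsp(model: Model, data: Batch, upd: Fn, agg: Fn, epochs: int,
                  batchsize: int, k: int, hyperparams: Dict[str, float]) -> Model:
    # Assumed "disjoint_contiguous" scheme: k contiguous slices of the data.
    size = -(-len(data) // k)  # ceiling division
    parts = [data[i * size:(i + 1) * size] for i in range(k)]
    parts = [p for p in parts if p]
    for _ in range(epochs):
        n_batches = max(-(-len(p) // batchsize) for p in parts)
        for b in range(n_batches):
            # Every worker computes gradients against the same model version.
            worker_grads = []
            for part in parts:
                batch = part[b * batchsize:(b + 1) * batchsize]
                if batch:
                    worker_grads.append(upd(model, batch, hyperparams))
            # Barrier reached: the server folds in all gradients of this step.
            for grads in worker_grads:
                model = agg(model, grads, hyperparams)
    return model


# Usage: fit y = 2x + 1 with two simulated workers.
data = [(x / 10, 2 * x / 10 + 1) for x in range(8)]
trained = paramserv_bsp({"w": 0.0, "b": 0.0}, data, gradients, aggregate,
                        epochs=500, batchsize=2, k=2, hyperparams={"lr": 0.1})
```

Passing upd and agg as callables mirrors the proposed signature, where both
are given by name; how the real function resolves those names is left open
here.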
*Inputs*:
* model : a list consisting of the weight and bias matrices
* features : training features matrix
* labels : training label matrix
* val_features [optional]: validation features matrix
* val_labels [optional]: validation label matrix
* upd : the name of the gradient calculation function
* agg : the name of the gradient aggregation function
* mode (options: LOCAL, REMOTE_SPARK): the execution backend where
the parameter server is executed
* utype (options: BSP, ASP, SSP): the update type, i.e., bulk
synchronous, asynchronous, or stale-synchronous parallel
* freq [optional] (default: BATCH) (options: EPOCH, BATCH) : the
frequency of updates
* epochs : the number of epochs
* batchsize [optional] (default: 64): the batch size; ignored if the
update frequency is "EPOCH"
* k [optional] (default: number of vcores, otherwise vcores / 2 if
using openblas): the degree of parallelism
* scheme [optional] (default: disjoint_contiguous) (options:
disjoint_contiguous, disjoint_round_robin, disjoint_random, overlap_reshuffle):
the scheme of data partition, i.e., how the data is distributed across workers
* hyperparams [optional]: a list consisting of the additional
hyperparameters, e.g., learning rate, momentum
* checkpointing [optional] (default: NONE) (options: NONE, EPOCH,
EPOCH10) : the checkpointing strategy; a checkpoint can be written every epoch
or every 10 epochs
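The scheme options above are only named, not defined, so the following Python
sketch shows one plausible reading of how row indices could be assigned to k
workers. The semantics are assumptions, in particular that overlap_reshuffle
gives every worker the full data in its own shuffled order; partition_rows and
its seed parameter are hypothetical.

```python
import random
from typing import List


def partition_rows(n_rows: int, k: int, scheme: str, seed: int = 0) -> List[List[int]]:
    """Return, for each of the k workers, the row indices it would receive."""
    rows = list(range(n_rows))
    rng = random.Random(seed)
    size = -(-n_rows // k)  # ceiling division: rows per worker
    if scheme == "disjoint_contiguous":
        # Worker i gets the i-th contiguous block of rows.
        return [rows[i * size:(i + 1) * size] for i in range(k)]
    if scheme == "disjoint_round_robin":
        # Row r goes to worker r % k.
        return [rows[i::k] for i in range(k)]
    if scheme == "disjoint_random":
        # Shuffle once, then split into contiguous blocks.
        rng.shuffle(rows)
        return [rows[i * size:(i + 1) * size] for i in range(k)]
    if scheme == "overlap_reshuffle":
        # Assumed meaning: every worker sees all rows, each in its own order.
        return [rng.sample(rows, n_rows) for _ in range(k)]
    raise ValueError("unknown scheme: " + scheme)


contiguous = partition_rows(7, 3, "disjoint_contiguous")
round_robin = partition_rows(7, 3, "disjoint_round_robin")
```

With the three disjoint schemes each row lands on exactly one worker; under
the assumed overlap_reshuffle reading each worker holds a full copy.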
*Output*:
* model' : a list consisting of the updated weight and bias matrices
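The freq and checkpointing inputs listed above could be interpreted as in the
following Python sketch. Both helpers are hypothetical, epochs are assumed to
be counted from 1, and "update" is assumed to mean one push of gradients from a
worker to the server.

```python
def updates_per_epoch(n_rows: int, batchsize: int, freq: str) -> int:
    """Number of updates one worker sends per epoch over n_rows local rows."""
    if freq == "BATCH":
        return -(-n_rows // batchsize)  # one update per mini-batch (ceiling)
    if freq == "EPOCH":
        return 1                        # batchsize is ignored in this mode
    raise ValueError("unknown freq: " + freq)


def should_checkpoint(epoch: int, checkpointing: str) -> bool:
    """Decide whether a checkpoint is written after the given 1-based epoch."""
    if checkpointing == "NONE":
        return False
    if checkpointing == "EPOCH":
        return True
    if checkpointing == "EPOCH10":
        return epoch % 10 == 0
    raise ValueError("unknown checkpointing strategy: " + checkpointing)


# Usage: epochs after which a checkpoint would be written in a 30-epoch run.
checkpoint_epochs = [e for e in range(1, 31) if should_checkpoint(e, "EPOCH10")]
```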
was:
The objective of the “paramserv” built-in function is to update an initial or
existing model according to a given configuration. An initial function
signature would be:
{code:java}
model'=paramserv(model=paramsList, features=X, labels=Y, val_features=X_val,
val_labels=Y_val, upd="fun1", agg="fun2", mode="LOCAL", utype="BSP",
freq="BATCH", epochs=100, batchsize=64, k=7, scheme="disjoint_contiguous",
hyperparams=params, checkpointing="NONE"){code}
The function takes the model (a struct-like data structure consisting of the
weights, the biases and the hyperparameters), the training features and
labels, the validation features and labels, the batch update function (i.e.,
the gradient calculation function), the update strategy (e.g. sync, async,
hogwild!, stale-synchronous), the update frequency (e.g. epoch or mini-batch),
the gradient aggregation function, the number of epochs, the batch size, the
degree of parallelism, the data partition scheme, a list of additional
hyperparameters, and the checkpointing strategy. It returns a trained model in
struct format.
*Inputs*:
* model : a list consisting of the weight and bias matrices
* features : training features matrix
* labels : training label matrix
* val_features : validation features matrix
* val_labels : validation label matrix
* upd : the name of the gradient calculation function
* agg : the name of the gradient aggregation function
* mode (options: LOCAL, REMOTE_SPARK): the execution backend where
the parameter server is executed
* utype (options: BSP, ASP, SSP): the update type, i.e., bulk
synchronous, asynchronous, or stale-synchronous parallel
* freq [optional] (default: BATCH) (options: EPOCH, BATCH) : the
frequency of updates
* epochs : the number of epochs
* batchsize [optional] (default: 64): the batch size; ignored if the
update frequency is "EPOCH"
* k [optional] (default: number of vcores, otherwise vcores / 2 if
using openblas): the degree of parallelism
* scheme [optional] (default: disjoint_contiguous) (options:
disjoint_contiguous, disjoint_round_robin, disjoint_random, overlap_reshuffle):
the scheme of data partition, i.e., how the data is distributed across workers
* hyperparams [optional]: a list consisting