[jira] [Updated] (SYSTEMML-2299) API design of the paramserv function

2018-05-16 Thread LI Guobao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LI Guobao updated SYSTEMML-2299:

Description: 
The objective of the “paramserv” built-in function is to update an initial or 
existing model with a given configuration. An initial function signature would be: 
{code:java}
model'=paramserv(model, features=X, labels=Y, val_features=X_val, 
val_labels=Y_val, upd="fun1", agg="fun2", mode="LOCAL", utype="BSP", 
freq="BATCH", epochs=100, batchsize=64, k=7, scheme="disjoint_contiguous", 
hyperparams=params, checkpointing="NONE"){code}
We are interested in providing the model (which will be a struct-like data 
structure consisting of the weights, the biases and the hyperparameters), the 
training features and labels, the validation features and labels, the batch 
update function (i.e., the gradient calculation function), the update strategy 
(e.g., sync, async, hogwild!, stale-synchronous), the update frequency (e.g., 
per epoch or per mini-batch), the gradient aggregation function, the number of 
epochs, the batch size, the degree of parallelism, the data partition scheme, 
a list of additional hyperparameters, as well as the checkpointing strategy. 
The function returns the trained model in the same struct format; a 
hypothetical usage sketch is given after the input and output lists below.

*Inputs*:
 * model : a list consisting of the weight and bias matrices
 * features : training features matrix
 * labels : training label matrix
 * val_features : validation features matrix
 * val_labels : validation label matrix
 * upd : the name of the gradient calculation function
 * agg : the name of the gradient aggregation function
 * mode (options: LOCAL, REMOTE_SPARK): the execution backend where the 
parameter server is executed
 * utype (options: BSP, ASP, SSP): the model update strategy
 * freq (options: EPOCH, BATCH): the frequency of updates
 * epochs : the number of epochs
 * batchsize [optional] (default value: 64): the batch size; if the update 
frequency is "EPOCH", this argument is ignored
 * k : the degree of parallelism
 * scheme  (options: disjoint_contiguous, disjoint_round_robin, 
disjoint_random, overlap_reshuffle): the scheme of data partition, i.e., how 
the data is distributed across workers
 * hyperparams [optional]: a list of additional hyperparameters, e.g., 
learning rate, momentum
 * checkpointing [optional] (options: NONE (default), EPOCH, EPOCH10): the 
checkpoint strategy; a checkpoint can be taken after each epoch or every 10 
epochs

*Output*:
 * model' : a list consisting of the updated weight and bias matrices
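
For illustration, the following is a hypothetical DML-style sketch of how such 
a call might look. The signatures and bodies of the gradient function (fun1) 
and the aggregation function (fun2), the use of list(...) to assemble the 
model and hyperparameters, and the synthetic data are assumptions made for 
this sketch only, not part of the design above.
{code:java}
# Hypothetical sketch; argument shapes and function signatures are assumptions.

# Gradient calculation function: computes gradients for one batch (or epoch).
fun1 = function(list[unknown] model, list[unknown] hyperparams,
                matrix[double] features, matrix[double] labels)
    return (list[unknown] gradients) {
  # ... forward/backward pass using the weights and biases in 'model' ...
  gradients = model  # placeholder only
}

# Gradient aggregation function: applies aggregated gradients to the model.
fun2 = function(list[unknown] model, list[unknown] gradients,
                list[unknown] hyperparams)
    return (list[unknown] model_result) {
  # ... e.g., an SGD step: W = W - lr * dW for each weight/bias pair ...
  model_result = model  # placeholder only
}

# Synthetic data and a two-layer model, assembled as lists
X = rand(rows=1000, cols=784);    Y = rand(rows=1000, cols=10)
X_val = rand(rows=200, cols=784); Y_val = rand(rows=200, cols=10)
W1 = rand(rows=784, cols=200);    b1 = matrix(0, rows=1, cols=200)
W2 = rand(rows=200, cols=10);     b2 = matrix(0, rows=1, cols=10)
model = list(W1, b1, W2, b2)
params = list(lr=0.01, mu=0.9)

trained = paramserv(model=model, features=X, labels=Y, val_features=X_val,
  val_labels=Y_val, upd="fun1", agg="fun2", mode="LOCAL", utype="BSP",
  freq="BATCH", epochs=100, batchsize=64, k=7, scheme="disjoint_contiguous",
  hyperparams=params, checkpointing="NONE")
{code}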


[jira] [Commented] (SYSTEMML-2319) IPA integration

2018-05-16 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477836#comment-16477836
 ] 

Matthias Boehm commented on SYSTEMML-2319:
--

The list of hops in the statement block is not the list of all hops but just 
the list of root nodes (i.e., outputs) of the DAG. All other hops are reachable 
by traversing the DAG from these root nodes. Since it is a DAG, this traversal 
should generally use memoization to remember already-processed operators, which 
is important to avoid redundant processing when nodes are reachable over 
multiple paths.
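
As a self-contained illustration of such a memoized traversal (using a 
hypothetical Node class rather than SystemML's actual Hop API), a sketch could 
look as follows:
{code:java}
import java.util.*;

// Hypothetical DAG node standing in for a hop; not SystemML's Hop class.
class Node {
    final String name;
    final List<Node> inputs;
    Node(String name, Node... inputs) {
        this.name = name;
        this.inputs = Arrays.asList(inputs);
    }
}

public class DagTraversal {
    // Collect all operators reachable from the given root node, using a
    // visited set (memoization) so shared sub-DAGs are processed only once.
    static void collect(Node node, Set<Node> visited, List<Node> out) {
        if (!visited.add(node))      // already processed via another path
            return;
        for (Node in : node.inputs)
            collect(in, visited, out);
        out.add(node);               // post-order: inputs before the operator
    }

    public static void main(String[] args) {
        Node read   = new Node("read");
        Node shared = new Node("shared", read);
        Node out1   = new Node("out1", shared);
        Node out2   = new Node("out2", shared); // 'shared' reachable twice

        // The roots play the role of the statement block's hops list.
        List<Node> roots = Arrays.asList(out1, out2);
        Set<Node> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        List<Node> all = new ArrayList<>();
        for (Node r : roots)
            collect(r, visited, all);
        // Prints 4; a traversal without memoization would process
        // 'shared' and 'read' twice.
        System.out.println(all.size() + " operators");
    }
}
{code}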

> IPA integration
> ---
>
> Key: SYSTEMML-2319
> URL: https://issues.apache.org/jira/browse/SYSTEMML-2319
> Project: SystemML
>  Issue Type: Sub-task
>Reporter: LI Guobao
>Assignee: LI Guobao
>Priority: Major
>
> It aims to extend the IPA to avoid removing the referenced functions due to 
> the fact that the paramserv function is a second-order function.





[jira] [Commented] (SYSTEMML-2319) IPA integration

2018-05-16 Thread LI Guobao (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477408#comment-16477408
 ] 

LI Guobao commented on SYSTEMML-2319:
-

[~mboehm7] I have a question about the hops. In the StatementBlock, why does 
the hops list not contain the ParamBuiltinOp? It seems to consist of only 
UnaryOp, BinaryOp, etc.
