[jira] [Created] (SYSTEMML-2486) Performance features for sparsity estimators

2018-08-05 Thread Matthias Boehm (JIRA)
Matthias Boehm created SYSTEMML-2486:


 Summary: Performance features for sparsity estimators
 Key: SYSTEMML-2486
 URL: https://issues.apache.org/jira/browse/SYSTEMML-2486
 Project: SystemML
  Issue Type: Sub-task
Reporter: Matthias Boehm


This includes features such as:
* Multi-threaded sketch construction
* Multi-threaded estimation
* Reduced memory footprint and special cases
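
As background, sketch-based estimators refine the classic metadata-based
estimate. A minimal DML sketch of that baseline average-case estimate for the
sparsity of a matrix product (illustrative only, not part of this issue; A and
B are assumed inputs):
{code:java}
# Hedged sketch: average-case sparsity estimate for C = A %*% B,
# assuming uniformly distributed non-zeros (all names are assumptions)
sa = sum(A != 0) / (nrow(A) * ncol(A));  # sparsity (nnz ratio) of A
sb = sum(B != 0) / (nrow(B) * ncol(B));  # sparsity (nnz ratio) of B
# a cell of C is non-zero if at least one of the ncol(A) products
# along the common dimension is non-zero
sc = 1 - (1 - sa * sb) ^ ncol(A);
print("estimated output sparsity: " + sc);
{code}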



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SYSTEMML-2458) Add experiment on spark paramserv

2018-08-05 Thread Matthias Boehm (JIRA)


[ 
https://issues.apache.org/jira/browse/SYSTEMML-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16569601#comment-16569601
 ] 

Matthias Boehm commented on SYSTEMML-2458:
--

Thanks - the adagrad results are in the repo; adam and sgd are currently 
running. One observation is that ASP-batch is much slower than BSP-batch. Some 
slowdown is understandable because for BSP-batch we simply accrue the gradients 
and perform a single aggregated update for all workers, whereas ASP-batch 
applies one update per worker, but the effect should not be that pronounced. A 
sketch of the difference follows below.
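
For intuition, a hedged DML-style sketch of the two strategies for a single
batch (illustrative only; the model matrix W, the worker-gradient list G, the
learning rate lr, and the worker count k are assumed names, not taken from the
experiments):
{code:java}
# BSP-batch: accrue the gradients of all k workers, then perform a
# single aggregated model update per batch
acc = matrix(0, rows=nrow(W), cols=ncol(W));
for (i in 1:k) {
  acc = acc + as.matrix(G[i]);  # accrue per-worker gradients
}
W = W - lr * acc;               # one update for all workers

# ASP-batch: every worker gradient triggers its own serialized model
# update, i.e., k updates per batch instead of one
for (i in 1:k) {
  W = W - lr * as.matrix(G[i]);
}
{code}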

> Add experiment on spark paramserv
> -
>
> Key: SYSTEMML-2458
> URL: https://issues.apache.org/jira/browse/SYSTEMML-2458
> Project: SystemML
>  Issue Type: Sub-task
>Reporter: LI Guobao
>Assignee: LI Guobao
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (SYSTEMML-2299) API design of the paramserv function

2018-08-05 Thread LI Guobao (JIRA)


 [ 
https://issues.apache.org/jira/browse/SYSTEMML-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LI Guobao updated SYSTEMML-2299:

Description: 
The objective of the “paramserv” built-in function is to update an initial or 
existing model with the given configuration. An initial function signature 
would be: 
{code:java}
model'=paramserv(model=paramsList, features=X, labels=Y, val_features=X_val, 
val_labels=Y_val, upd="fun1", agg="fun2", mode="LOCAL", utype="BSP", 
freq="BATCH", epochs=100, batchsize=64, k=7, scheme="disjoint_contiguous", 
hyperparams=params, checkpointing="NONE"){code}
We are interested in providing the model (which will be a struct-like data 
structure consisting of the weights, the biases and the hyperparameters), the 
training features and labels, the validation features and labels, the batch 
update function (i.e., the gradient calculation function), the update strategy 
(e.g., sync, async, hogwild!, stale-synchronous), the update frequency (e.g., 
per epoch or per mini-batch), the gradient aggregation function, the number of 
epochs, the batch size, the degree of parallelism, the data partitioning 
scheme, a list of additional hyperparameters, as well as the checkpointing 
strategy. The function returns the trained model in the same struct format. An 
end-to-end usage sketch follows after the parameter list below.

*Inputs*:
 * model : a list consisting of the weight and bias matrices
 * features : the matrix of training features
 * labels : the matrix of training labels
 * val_features [optional]: the matrix of validation features
 * val_labels [optional]: the matrix of validation labels
 * upd : the name of the gradient calculation function
 * agg : the name of the gradient aggregation function
 * mode (options: LOCAL, REMOTE_SPARK): the execution backend on which the 
parameter server is executed
 * utype (options: BSP, ASP, SSP): the update strategy
 * freq [optional] (default: BATCH) (options: EPOCH, BATCH): the frequency of 
updates
 * epochs : the number of epochs
 * batchsize [optional] (default: 64): the batch size; if the update frequency 
is "EPOCH", this argument is ignored
 * k [optional] (default: the number of vcores, or vcores / 2 if using 
OpenBLAS): the degree of parallelism
 * scheme [optional] (default: disjoint_contiguous) (options: 
disjoint_contiguous, disjoint_round_robin, disjoint_random, overlap_reshuffle): 
the data partitioning scheme, i.e., how the data is distributed across workers
 * hyperparams [optional]: a list of additional hyperparameters, e.g., learning 
rate, momentum
 * checkpointing [optional] (default: NONE) (options: NONE, EPOCH, EPOCH10): 
the checkpointing strategy, i.e., checkpoint after every epoch or after every 
10 epochs

*Output*:
 * model' : a list consisting of the updated weight and bias matrices
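
For illustration, a hedged end-to-end sketch of the intended usage for a simple
linear model (the function bodies and the names gradCalc, aggGrads, X, and Y
are assumptions for this sketch, not part of the design):
{code:java}
# Hypothetical batch update function: squared-loss gradient of a
# linear model with a single weight matrix W
gradCalc = function(list[unknown] model, list[unknown] hyperparams,
                    matrix[double] features, matrix[double] labels)
    return (list[unknown] gradients) {
  W = as.matrix(model["W"]);
  dW = t(features) %*% (features %*% W - labels);
  gradients = list(W=dW);
}

# Hypothetical gradient aggregation function: a plain SGD step
aggGrads = function(list[unknown] model, list[unknown] gradients,
                    list[unknown] hyperparams)
    return (list[unknown] model_result) {
  lr = as.scalar(hyperparams["lr"]);
  W = as.matrix(model["W"]) - lr * as.matrix(gradients["W"]);
  model_result = list(W=W);
}

X = rand(rows=1000, cols=10);  # toy training data (assumed)
Y = rand(rows=1000, cols=1);
model = list(W=matrix(0, rows=ncol(X), cols=1));
model2 = paramserv(model=model, features=X, labels=Y,
  upd="gradCalc", agg="aggGrads", mode="LOCAL", utype="BSP",
  freq="BATCH", epochs=100, batchsize=64, k=7,
  scheme="disjoint_contiguous", hyperparams=list(lr=0.0001));
{code}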

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (SYSTEMML-2458) Add experiment on spark paramserv

2018-08-05 Thread LI Guobao (JIRA)


[ 
https://issues.apache.org/jira/browse/SYSTEMML-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16569424#comment-16569424
 ] 

LI Guobao commented on SYSTEMML-2458:
-

[~mboehm7], yes, I added the baseline experiment w/o paramserv and fixed the 
location of the SystemML-config.xml file. Additionally, I've double-checked the 
native BLAS configuration for the remote workers and verified that it is 
correctly transferred and set on the remote workers.

> Add experiment on spark paramserv
> -
>
> Key: SYSTEMML-2458
> URL: https://issues.apache.org/jira/browse/SYSTEMML-2458
> Project: SystemML
>  Issue Type: Sub-task
>Reporter: LI Guobao
>Assignee: LI Guobao
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)