[jira] [Commented] (SUBMARINE-548) [Umbrella] Predefined Experiment

2020-07-30 Thread Wangda Tan (Jira)


[ 
https://issues.apache.org/jira/browse/SUBMARINE-548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168293#comment-17168293
 ] 

Wangda Tan commented on SUBMARINE-548:
--

[~jotjohnting], thanks for working on this, I just reviewed 
[https://github.com/apache/submarine/pull/351]

I think we missed part of the design:

The design doc:
[https://github.com/apache/submarine/blob/master/docs/design/experiment-implementation.md#predefined-experiment-template-api-to-run-experiment]
 defines the spec for submitting a pre-defined template, which is sufficient
for submission from CLI/REST/UI. However, it is not enough to
*register/define* a pre-defined template.

The differences between registering and submitting a pre-defined template are:
 * *Registering* an experiment template requires information about how Submarine
can run the experiment. For example, it needs to include the resources required
for workers, the environment (docker image, conda kernel), the command-line
options for workers/ps, etc.
 * In contrast, *submitting* an experiment template only requires filling in the
required/optional parameters.

So to register a pre-defined template, we need to include *not only* the
ExperimentTemplate but also a description of how Submarine can run it.

*So the predefined template registration should include the following:*

*1) A template of the Experiment YAML.* For example, take an experiment
from our doc:
[https://github.com/apache/submarine/blob/master/docs/userdocs/k8s/run-tensorflow-experiment.md]
{code:java}
meta:
  name: "tf-mnist-yaml"
  namespace: "default"
  framework: "TensorFlow"
  cmd: "python /var/tf_mnist/mnist_with_summaries.py --log_dir=/train/log --learning_rate=0.01 --batch_size=150"
  envVars:
    ENV_1: "ENV1"
environment:
  image: "gcr.io/kubeflow-ci/tf-mnist-with-summaries:1.0"
spec:
  Ps:
    replicas: 1
    resources: "cpu=1,memory=1024M"
  Worker:
    replicas: 1
    resources: "cpu=1,memory=1024M"
{code}
We can create a template of the YAML (with placeholders) using syntax like:
{code:java}
meta:
  name: {{name}}
  namespace: "default"
  framework: "TensorFlow"
  cmd: "python /var/tf_mnist/mnist_with_summaries.py --input {{input}} --log_dir=/train/log --learning_rate={{training.learning_rate}} --batch_size={{training.batch_size}}"
  envVars:
    ENV_1: "ENV1"
environment:
  image: "gcr.io/kubeflow-ci/tf-mnist-with-summaries:1.0"
spec:
  Ps:
    replicas: 1
    resources: "cpu=1,memory=1024M"
  Worker:
    replicas: 1
    resources: "cpu=1,memory=1024M"
{code}
The above template defines 4 variables (placeholders):
 * name
 * input
 * training.learning_rate
 * training.batch_size

(The above YAML placeholder syntax is based on [https://stackoverflow.com/a/41620747])
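
As a rough illustration of how the placeholders could be handled, here is a minimal Python sketch that finds the {{...}} placeholders in a template string and substitutes values for them. The helper names (find_placeholders, render) and the regex are assumptions made for this example; they are not part of Submarine.

```python
import re
from typing import Dict, List

# Matches {{ dotted.name }} placeholders, e.g. {{training.learning_rate}}.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")


def find_placeholders(template: str) -> List[str]:
    """Return placeholder names in order of first appearance, without duplicates."""
    seen: Dict[str, None] = {}
    for match in PLACEHOLDER.finditer(template):
        seen.setdefault(match.group(1), None)
    return list(seen)


def render(template: str, values: Dict[str, object]) -> str:
    """Replace every placeholder with its value; fail if any value is missing."""
    missing = [name for name in find_placeholders(template) if name not in values]
    if missing:
        raise KeyError(f"missing values for placeholders: {missing}")
    return PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), template)


# The cmd line from the template above, used here as a plain string.
cmd_template = (
    "python /var/tf_mnist/mnist_with_summaries.py --input {{input}} "
    "--log_dir=/train/log --learning_rate={{training.learning_rate}} "
    "--batch_size={{training.batch_size}}"
)
```

Calling render(cmd_template, {...}) with a value for each placeholder returns the final command line. Failing on a missing value, rather than leaving the placeholder in place, is a design choice of this sketch.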

*2) A list of parameters (similar to ExperimentTemplate)*

*So I think we need the following objects:*

*a. RegisterExperimentTemplateSpec*
{code:java}
{
   template_name: Name of the template
   experiment_spec: The experiment spec, with placeholders
   parameters:
      List of parameter definitions
} {code}
*b. SubmissionExperimentTemplateSpec*
{code:java}
{
   experiment_name: Name of the running experiment
   template_name: Name of the template
   parameters:
      List of parameters (with values)
} {code}
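
To make the two objects concrete, here is one possible way to model them, sketched as Python dataclasses. The field names follow the outlines above; the ParameterDefinition class and its required/description fields are assumptions for illustration only, not an agreed API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ParameterDefinition:
    """One entry in a registered template's parameter list (hypothetical shape)."""
    name: str
    required: bool = True
    description: str = ""


@dataclass
class RegisterExperimentTemplateSpec:
    """What is stored when a template is registered."""
    template_name: str
    experiment_spec: str  # experiment YAML text that still contains {{placeholders}}
    parameters: List[ParameterDefinition] = field(default_factory=list)


@dataclass
class SubmissionExperimentTemplateSpec:
    """What a user sends to run an experiment from a registered template."""
    experiment_name: str
    template_name: str
    parameters: Dict[str, Any] = field(default_factory=dict)  # parameter name -> value


# Example instances with made-up values.
register_spec = RegisterExperimentTemplateSpec(
    template_name="mnist_template",
    experiment_spec='meta:\n  name: "{{name}}"\n',
    parameters=[ParameterDefinition(name="name", description="Experiment name")],
)
submission = SubmissionExperimentTemplateSpec(
    experiment_name="mnist_example",
    template_name=register_spec.template_name,
    parameters={"name": "mnist_example"},
)
```

The point of the split is that only the registration object carries the experiment spec (resources, environment, command line), while the submission object carries just names and values.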
Does this make sense? cc: [~pingsutw], [~ztang] for suggestions.

> [Umbrella] Predefined Experiment
> 
>
> Key: SUBMARINE-548
> URL: https://issues.apache.org/jira/browse/SUBMARINE-548
> Project: Apache Submarine
>  Issue Type: New Feature
>  Components: experiment template
>Reporter: JohnTing
>Assignee: JohnTing
>Priority: Major
> Fix For: 0.5.0
>
>
> Predefined-experiment features
>  * [API] Define Experiment API for pre-defined template
>  * [SDK] Add Python SDK to support pre-defined experiment
>  * [UI] Allow Run pre-defined experiment
>  * [API] Define Swagger API for pre-defined template submission
>  * [API] Define Swagger API for pre-defined template registration/delete, etc.
>  * [Server] Support submitting a pre-defined template and translating it to an actual job
> [https://github.com/apache/submarine/blob/master/docs/design/experiment-implementation.md#support-predefined-experiment-templates]
> [https://cwiki.apache.org/confluence/display/SUBMARINE/Roadmap]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: dev-unsubscr...@submarine.apache.org
For additional commands, e-mail: dev-h...@submarine.apache.org



[jira] [Commented] (SUBMARINE-548) [Umbrella] Predefined Experiment

2020-08-02 Thread Kevin Su (Jira)


[ 
https://issues.apache.org/jira/browse/SUBMARINE-548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17169515#comment-17169515
 ] 

Kevin Su commented on SUBMARINE-548:


[~wangda], a few questions to make sure I understand this correctly.
 # If I want to use a predefined template to submit an experiment, we would
register the *ExperimentTemplateSpec* first.
*ExperimentTemplateSpec* would look like the following:

{
   template_name: mnist_template
   experiment_spec:
     meta:
       name: {{name}}
       namespace: "default"
       framework: "TensorFlow"
       cmd: "python /var/tf_mnist/mnist_with_summaries.py --input {{input.train_data}} --log_dir=/train/log --learning_rate={{training.learning_rate}} --batch_size={{training.batch_size}}"
       envVars:
         ENV_1: "ENV1"
     environment:
       image: "gcr.io/kubeflow-ci/tf-mnist-with-summaries:1.0"
     spec:
       Ps:
         replicas: 1
         resources: "cpu=1,memory=1024M"
       Worker:
         replicas: 1
         resources: "cpu=1,memory=1024M"
   parameters:
     - name: input.train_data
       required: true
       description: >
         Train data is expected in SVM format, and can be stored in HDFS/S3
     - name: training.learning_rate
       required: true
       description: >
         Learning rate for mnist model, default is 0.001
     - name: training.batch_size
       required: true
       description: >
         Integer or `None`. Number of samples per gradient update. If
         unspecified, `batch_size` will default to 32
}
Should we also add *author* and *description* fields to
*ExperimentTemplateSpec*, as mentioned in
[https://github.com/apache/submarine/blob/master/docs/design/experiment-implementation.md#predefined-experiment-template-api-to-run-experiment]?

 

 2. After registering, we would submit a list of parameters to run an
experiment, like below:
{
   experiment_name: mnist_example
   template_name: mnist_template
   parameters:
     input.train_data: "hdfs://foo/bar"
     training.learning_rate: 0.01
     training.batch_size: 64
}
 

IIUC, this is a great proposal: users could submit an experiment with just a
list of parameters, without needing to worry about system resources or the
environment.
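
One thing the server would probably need at submission time is a check of the submitted parameters against the registered definitions. A minimal Python sketch of such a check follows; validate_submission and the dict layout are hypothetical and only mirror the name/required fields used in the example above.

```python
from typing import Any, Dict, List


def validate_submission(
    definitions: List[Dict[str, Any]], submitted: Dict[str, Any]
) -> List[str]:
    """Return a list of problems; an empty list means the submission is acceptable."""
    known = {d["name"] for d in definitions}
    problems = []
    # Every parameter marked required must have a submitted value.
    for d in definitions:
        if d.get("required", False) and d["name"] not in submitted:
            problems.append(f"missing required parameter: {d['name']}")
    # Reject names the template does not declare (likely typos).
    for name in submitted:
        if name not in known:
            problems.append(f"unknown parameter: {name}")
    return problems


# Parameter definitions mirroring the mnist_template example.
definitions = [
    {"name": "input.train_data", "required": True},
    {"name": "training.learning_rate", "required": True},
    {"name": "training.batch_size", "required": True},
]
```

Returning a list of problems instead of raising on the first one lets the CLI/REST/UI report everything wrong with a submission at once.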




[jira] [Commented] (SUBMARINE-548) [Umbrella] Predefined Experiment

2020-08-03 Thread Wangda Tan (Jira)


[ 
https://issues.apache.org/jira/browse/SUBMARINE-548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17170536#comment-17170536
 ] 

Wangda Tan commented on SUBMARINE-548:
--

[~pingsutw],  

Something similar to that. Please feel free to add whatever other parameters
are required (such as author).
