[ 
https://issues.apache.org/jira/browse/SUBMARINE-507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117911#comment-17117911
 ] 

Manikandan R commented on SUBMARINE-507:
----------------------------------------

Writing down my thoughts on storing environments. Please share your views.
 
1. The following tables can be created in the Submarine Metastore:
 
a) Table Name: environments
 
Columns:
 
environment_id int primary key
name varchar(255) unique not null
description string
location string
docker_id int references docker_images(docker_id)
kernel_id int references kernel(kernel_id)
created_date timestamp
last_updated_date timestamp

The "location" column captures the HDFS path of the environment file.
 
b) Table Name: docker_images
 
docker_id int primary key
name varchar(255) unique not null
description string
created_date timestamp
last_updated_date timestamp
 
c) Table Name: kernel
 
kernel_id int primary key
name varchar(255) unique not null
description string
repository string
repository_type enum('private', 'public')
created_date timestamp
last_updated_date timestamp
 
Having separate tables for docker_images and kernel gives us a lot of
flexibility when operating environments:
- A Docker image or kernel can be created once and used by many
environments.
- We avoid creating the same images and kernel/conda again and again in the
registry and repository respectively.
(If required, we can clean up these two tables if they grow very big and
become a bottleneck, but that is very unlikely.)
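
To make the table layout above concrete, here is a minimal sketch of the three tables. It is only an illustration under assumptions: an in-memory SQLite database stands in for the metastore, "string" columns are mapped to TEXT, the enum is modelled as a CHECK constraint, and the sample names (tf-base, py37-conda, env_a, env_b) are made up.

```python
import sqlite3

# Assumption: SQLite stands in for the metastore database. "string" columns
# become TEXT and enum('private', 'public') becomes a CHECK constraint.
SCHEMA = """
CREATE TABLE docker_images (
    docker_id INTEGER PRIMARY KEY,
    name VARCHAR(255) UNIQUE NOT NULL,
    description TEXT,
    created_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    last_updated_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE kernel (
    kernel_id INTEGER PRIMARY KEY,
    name VARCHAR(255) UNIQUE NOT NULL,
    description TEXT,
    repository TEXT,
    repository_type TEXT CHECK (repository_type IN ('private', 'public')),
    created_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    last_updated_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE environments (
    environment_id INTEGER PRIMARY KEY,
    name VARCHAR(255) UNIQUE NOT NULL,
    description TEXT,
    location TEXT,
    docker_id INTEGER REFERENCES docker_images(docker_id),
    kernel_id INTEGER REFERENCES kernel(kernel_id),
    created_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    last_updated_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
"""


def open_metastore():
    """Create an in-memory database holding the three proposed tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the references
    conn.executescript(SCHEMA)
    return conn


conn = open_metastore()

# One Docker image and one kernel are registered once (hypothetical names)...
docker_id = conn.execute(
    "INSERT INTO docker_images (name) VALUES (?)", ("tf-base",)
).lastrowid
kernel_id = conn.execute(
    "INSERT INTO kernel (name, repository, repository_type) VALUES (?, ?, ?)",
    ("py37-conda", "conda-forge", "public"),
).lastrowid

# ...and then shared by several environments.
for env_name in ("env_a", "env_b"):
    conn.execute(
        "INSERT INTO environments (name, location, docker_id, kernel_id) "
        "VALUES (?, ?, ?, ?)",
        (env_name,
         "hdfs://mycluster/submarine/environments/" + env_name + ".txt",
         docker_id, kernel_id),
    )

shared = conn.execute(
    "SELECT COUNT(*) FROM environments WHERE docker_id = ?", (docker_id,)
).fetchone()[0]
print(shared)
```

Both environments point at the same docker_images and kernel rows, which is the reuse the separate tables are meant to allow.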
 
2. How to store environment file?
 
Create a directory hdfs://mycluster/submarine/environments/ if it doesn't
exist and use the environment name as the file name. For example:
 
hdfs://mycluster/submarine/environments/my_env.txt
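
The naming rule above could be sketched as a small helper. This is an illustration only: the function name environment_file_path, the ".txt" suffix taken from the example, and the rejection of names containing "/" are assumptions, not part of the proposal.

```python
import posixpath

# Root directory from the proposal above.
ENV_ROOT = "hdfs://mycluster/submarine/environments"


def environment_file_path(env_name, root=ENV_ROOT):
    """Return the HDFS location of the file for the given environment name.

    Hypothetical helper: names that are empty or contain a path separator
    are rejected so the file always lands directly under the root directory.
    """
    if not env_name or "/" in env_name or env_name in (".", ".."):
        raise ValueError("invalid environment name: %r" % (env_name,))
    return posixpath.join(root, env_name + ".txt")


print(environment_file_path("my_env"))
```

The returned value is what would be written to the "location" column of the environments table.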
 
3. How to store docker_images?
 
We could set up our own registry as part of starting up the server. Please
refer to [https://www.docker.com/blog/how-to-use-your-own-registry/] for
details. There are several options for this storage, as documented in
[https://docs.docker.com/registry/configuration/#storage]. For the first cut,
we can begin with the file system and iterate in later releases based on
need.
 
4. How to store kernel/conda?
 
There are 2 types: 1. Private 2. Public.

For a private repo, we will need to set up local repos, which can then be used.

> Submarine Environment Management
> --------------------------------
>
>                 Key: SUBMARINE-507
>                 URL: https://issues.apache.org/jira/browse/SUBMARINE-507
>             Project: Apache Submarine
>          Issue Type: New Feature
>            Reporter: Manikandan R
>            Assignee: Manikandan R
>            Priority: Major
>              Labels: pull-request-available
>
> Scope of this JIRA is to support environment management. It includes the 
> following:
> 1. Create Environment
> 2. Update Environment
> 3. Delete Environment
> 4. List Environments
> In addition, this JIRA should also ensure that environments are
> persisted, like experiments, so that they can be used later.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

