Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-23 Thread Karthik Manamcheri via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218374
---




addons/models/4000-MachineLearning/4010-ml_model.json
Line 328 (original), 328 (patched)


Why do we need this relationship? We can already trace the relationship 
through lineage?

Model Project -> Train Process -> Model Build

Same goes for the deployment. I don't think we need the relationship 
definition. Do we?

Model Build -> Deploy Process -> Model Deployment
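
For illustration only, the lineage argument above can be sketched in Python. The in-memory process list is an assumption standing in for the Atlas lineage graph, and the entity names are the labels used in this comment rather than the type names in 4010-ml_model.json:

```python
from collections import deque

# Hypothetical process entities: each one links input entities to outputs,
# in the way a lineage process connects its inputs to its outputs.
PROCESSES = [
    {"name": "Train Process", "inputs": ["Model Project"], "outputs": ["Model Build"]},
    {"name": "Deploy Process", "inputs": ["Model Build"], "outputs": ["Model Deployment"]},
]


def downstream(entity, processes):
    """Return every entity reachable from `entity` by following process outputs."""
    seen = []
    queue = deque([entity])
    while queue:
        current = queue.popleft()
        for process in processes:
            if current in process["inputs"]:
                for out in process["outputs"]:
                    if out not in seen:
                        seen.append(out)
                        queue.append(out)
    return seen


print(downstream("Model Project", PROCESSES))
```

Under these assumptions the project reaches both the build and the deployment without any direct relationship definition.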


- Karthik Manamcheri


On Oct. 23, 2019, 9:22 p.m., Na Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71619/
> ---
> 
> (Updated Oct. 23, 2019, 9:22 p.m.)
> 
> 
> Review request for atlas, Austin Nobis, Ashutosh Mestry, Karthik Manamcheri, 
> Sridhar K, Madhan Neethiraj, and Sarath Subramanian.
> 
> 
> Bugs: atlas-3464
> https://issues.apache.org/jira/browse/atlas-3464
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> Define entities used for Machine Learning Governance
> 
> 
> Diffs
> -
> 
>   addons/models/4000-MachineLearning/4010-ml_model.json PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71619/diff/11/
> 
> 
> Testing
> ---
> 
> Verified that it is a valid JSON file.
> 
> 
> Thanks,
> 
> Na Li
> 
>



Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-23 Thread Na Li via Review Board


> On Oct. 23, 2019, 10:20 p.m., Karthik Manamcheri wrote:
> > addons/models/4000-MachineLearning/4010-ml_model.json
> > Line 328 (original), 328 (patched)
> > 
> >
> > Why do we need this relationship? We can already trace the relationship 
> > through lineage?
> > 
> > Model Project -> Train Process -> Model Build
> > 
> > Same goes for the deployment. I don't think we need the relationship 
> > definition. Do we?
> > 
> > Model Build -> Deploy Process -> Model Deployment

That is what the Atlas team preferred. It does no harm, and it allows the 
relationship graph to be shown.


- Na


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218374
---





Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-23 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218372
---


Ship it!




I deployed these models on an Atlas server and created entities.

Sample JSON:

**ml_project**
```
{
  "referredEntities": null,
  "entity": {
    "typeName": "ml_project",
    "attributes": {
      "qualifiedName": "face_recognition@cluster-8",
      "name": "face_recognition",
      "owner": "systest",
      "metadata": {
        "iteration": "start_1",
        "generation": "first"
      }
    },
    "guid": "-2"
  }
}
```
---
**ml_model_build**
```
{
  "referredEntities": null,
  "entity": {
    "typeName": "ml_model_build",
    "attributes": {
      "qualifiedName": "face_recognition.build1@cluster-8",
      "name": "face_recognition.b1",
      "version": 1,
      "owner": "systest",
      "metadata": {
        "iteration": "0.1",
        "details": "uploading training data"
      },
      "defaultCpuMillicores": 4,
      "defaultMemoryMb": 6
    },
    "guid": "-2",
    "relationshipAttributes": {
      "project": {
        "guid": "3cdeddf8-6a71-4041-a1d2-80ead1a38c63",
        "type": "ml_project"
      }
    }
  }
}
```
---
**ml_model_deployment**
```
{
  "referredEntities": null,
  "entity": {
    "typeName": "ml_model_deployment",
    "attributes": {
      "qualifiedName": "face_recognition.b1.d1@cluster-8",
      "name": "face_recognition.b1.d1",
      "createTime": 1571725864000,
      "deployedTime": 1571725864001,
      "version": 1,
      "owner": "systest",
      "modelEndpointURL": "http://localhost:80111/face_recoginition/d1",
      "metadata": {
        "iteration": "D0.1",
        "details": "deployment to test"
      },
      "status": "deployed",
      "replicas": 3,
      "cpuMillicores": 2
    },
    "guid": "-2",
    "relationshipAttributes": {
      "build": {
        "guid": "539aee60-4c0e-4f3b-9982-562a198d99ac",
        "type": "ml_build"
      }
    }
  }
}
```
---
**ml_project_create_process**

```
{
  "referredEntities": null,
  "entity": {
    "typeName": "ml_project_create_process",
    "attributes": {
      "qualifiedName": "face_recognition.b1.d1@cluster-8:training:1571725864000",
      "name": "face_recognition_training_1",
      "userName": "eng_1",
      "inputs": [
        {
          "guid": "e7d578c9-5978-43d9-aacd-3635"
        },
        {
          "guid": "e7d578c9-5978-43d9-aacd-2205"
        }
      ],
      "outputs": [
        {
          "guid": "e7d578c9-5978-43d9-aacd-2209"
        }
      ]
    },
    "guid": "-2"
  }
}
```
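
A minimal Python sketch of how payloads shaped like these samples could be assembled. The helper name `build_entity` is hypothetical, and the `name@cluster` pattern for `qualifiedName` is an assumption read off the `ml_project` sample, not an Atlas API or a rule from the model file:

```python
import json


def build_entity(type_name, name, cluster, **attributes):
    """Assemble an entity payload shaped like the samples (hypothetical helper)."""
    # Assumption: qualifiedName follows the "<name>@<cluster>" pattern.
    attrs = {"qualifiedName": f"{name}@{cluster}", "name": name}
    attrs.update(attributes)
    return {
        "referredEntities": None,
        "entity": {"typeName": type_name, "attributes": attrs, "guid": "-2"},
    }


payload = build_entity("ml_project", "face_recognition", "cluster-8", owner="systest")
print(json.dumps(payload, indent=2))
```

Serializing with `json.dumps` turns Python's `None` into the `null` seen in the samples.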

- Ashutosh Mestry





Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-23 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218371
---


Ship it!




Ship It!

- Ashutosh Mestry





Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-23 Thread Na Li via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/
---

(Updated Oct. 23, 2019, 9:22 p.m.)


Review request for atlas, Austin Nobis, Ashutosh Mestry, Karthik Manamcheri, 
Sridhar K, Madhan Neethiraj, and Sarath Subramanian.


Bugs: atlas-3464
https://issues.apache.org/jira/browse/atlas-3464


Repository: atlas


Description
---

Define entities used for Machine Learning Governance


Diffs (updated)
-

  addons/models/4000-MachineLearning/4010-ml_model.json PRE-CREATION 


Diff: https://reviews.apache.org/r/71619/diff/11/

Changes: https://reviews.apache.org/r/71619/diff/10-11/


Testing
---

Verified that it is a valid JSON file.


Thanks,

Na Li



Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-23 Thread Na Li via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/
---

(Updated Oct. 23, 2019, 6:29 p.m.)


Review request for atlas, Austin Nobis, Ashutosh Mestry, Karthik Manamcheri, 
Sridhar K, Madhan Neethiraj, and Sarath Subramanian.


Bugs: atlas-3464
https://issues.apache.org/jira/browse/atlas-3464


Repository: atlas


Description
---

Define entities used for Machine Learning Governance


Diffs (updated)
-

  addons/models/4000-MachineLearning/4010-ml_model.json PRE-CREATION 


Diff: https://reviews.apache.org/r/71619/diff/10/

Changes: https://reviews.apache.org/r/71619/diff/9-10/


Testing
---

Verified that it is a valid JSON file.


Thanks,

Na Li



Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-22 Thread Na Li via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/
---

(Updated Oct. 22, 2019, 9:58 p.m.)


Review request for atlas, Austin Nobis, Ashutosh Mestry, Karthik Manamcheri, 
Sridhar K, Madhan Neethiraj, and Sarath Subramanian.


Bugs: atlas-3464
https://issues.apache.org/jira/browse/atlas-3464


Repository: atlas


Description
---

Define entities used for Machine Learning Governance


Diffs (updated)
-

  addons/models/-Area0/0012-base_model.json PRE-CREATION 
  addons/models/4000-MachineLearning/4010-ml_model.json PRE-CREATION 


Diff: https://reviews.apache.org/r/71619/diff/9/

Changes: https://reviews.apache.org/r/71619/diff/8-9/


Testing
---

Verified that it is a valid JSON file.


Thanks,

Na Li



Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-22 Thread Na Li via Review Board


> On Oct. 22, 2019, 9:08 p.m., Sridhar K wrote:
> > addons/models/-Area0/0010-base_model.json
> > Lines 203 (patched)
> > 
> >
> > Instead of modifying addons/models/-Area0/0010-base_model.json, can 
> > you add this change in addons/models/-Area0/0011-base_model.json
> 
> Sridhar K wrote:
> Sorry, instead of modifying 
> addons/models/-Area0/0010-base_model.json, can you add this change in 
> addons/models/-Area0/0012-base_model.json?
> 
> Sridhar K wrote:
> We want to add new model enhancements as new files with increasing 
> numbers in their names.

It is done.


- Na


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218339
---





Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-22 Thread Sridhar K


> On Oct. 22, 2019, 6:31 p.m., Sridhar K wrote:
> > addons/models/4000-MachineLearning/4010-ml_model.json
> > Lines 123 (patched)
> > 
> >
> > I am not the right person for this comment. But this name looks 
> > confusing. Is it a tag or an image URL?
> 
> Na Li wrote:
> Anand is very experienced in ML, and he said this name is commonly used 
> for that purpose. It specifies where to get the container image of a model.
> 
> Anand Patil wrote:
> "Image tag" is a standard Docker term: 
> https://docs.docker.com/engine/reference/commandline/tag/#tag-an-image-for-a-private-repository
>  It does include a sort of URL. I agree that the term is confusing, though.

Thanks for explaining.


> On Oct. 22, 2019, 6:31 p.m., Sridhar K wrote:
> > addons/models/4000-MachineLearning/4010-ml_model.json
> > Lines 269 (patched)
> > 
> >
> > I am wondering if we should have a first class entity called "User" 
> > defined in base model and you have ml_user extend from it. what do you say?
> 
> Na Li wrote:
> What is the parent of the entity "User"?
> 
> Na Li wrote:
> Made the changes:
> 1) Define "AtlasUser" in base, derived from "Asset". It has a "userName" 
> attribute.
> 2) Define "ml_user", derived from "AtlasUser" and "DataSet". It only has a 
> "metaData" attribute.
> 
> Anand Patil wrote:
> If we are OK to inherit from Asset, should the other types (project, 
> build, deployment) inherit from Asset too? None of them are really datasets. 
> +Sridhar thoughts?

I like this idea as well. I am assuming that they will have name, description, 
and owner like Asset has.


- Sridhar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218332
---


On Oct. 22, 2019, 8:04 p.m., Na Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71619/
> ---
> 
> (Updated Oct. 22, 2019, 8:04 p.m.)
> 
> 
> Review request for atlas, Austin Nobis, Ashutosh Mestry, Karthik Manamcheri, 
> Sridhar K, Madhan Neethiraj, and Sarath Subramanian.
> 
> 
> Bugs: atlas-3464
> https://issues.apache.org/jira/browse/atlas-3464
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> Define entities used for Machine Learning Governance
> 
> 
> Diffs
> -
> 
>   addons/models/-Area0/0010-base_model.json 2f5fdaf 
>   addons/models/4000-MachineLearning/4010-ml_model.json PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71619/diff/8/
> 
> 
> Testing
> ---
> 
> Verified that it is a valid JSON file.
> 
> 
> Thanks,
> 
> Na Li
> 
>



Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-22 Thread Sridhar K


> On Oct. 22, 2019, 9:08 p.m., Sridhar K wrote:
> > addons/models/-Area0/0010-base_model.json
> > Lines 203 (patched)
> > 
> >
> > Instead of modifying addons/models/-Area0/0010-base_model.json, can 
> > you add this change in addons/models/-Area0/0011-base_model.json
> 
> Sridhar K wrote:
> Sorry, instead of modifying 
> addons/models/-Area0/0010-base_model.json, can you add this change in 
> addons/models/-Area0/0012-base_model.json?

We want to add new model enhancements as new files with increasing numbers 
in their names.


- Sridhar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218339
---





Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-22 Thread Sridhar K


> On Oct. 22, 2019, 9:08 p.m., Sridhar K wrote:
> > addons/models/-Area0/0010-base_model.json
> > Lines 203 (patched)
> > 
> >
> > Instead of modifying addons/models/-Area0/0010-base_model.json, can 
> > you add this change in addons/models/-Area0/0011-base_model.json

Sorry, instead of modifying addons/models/-Area0/0010-base_model.json, 
can you add this change in addons/models/-Area0/0012-base_model.json?


- Sridhar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218339
---





Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-22 Thread Sridhar K

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218339
---




addons/models/-Area0/0010-base_model.json
Lines 203 (patched)


Instead of modifying addons/models/-Area0/0010-base_model.json, can you 
add this change in addons/models/-Area0/0011-base_model.json


- Sridhar K





Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-22 Thread Na Li via Review Board


> On Oct. 22, 2019, 6:31 p.m., Sridhar K wrote:
> > addons/models/4000-MachineLearning/4010-ml_model.json
> > Lines 269 (patched)
> > 
> >
> > I am wondering if we should have a first class entity called "User" 
> > defined in base model and you have ml_user extend from it. what do you say?
> 
> Na Li wrote:
> What is the parent of the entity "User"?

Made the changes:
1) Define "AtlasUser" in base, derived from "Asset". It has a "userName" attribute.
2) Define "ml_user", derived from "AtlasUser" and "DataSet". It only has a 
"metaData" attribute.
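
To illustrate the inheritance described above, here is a rough Python sketch. The dictionaries are simplified, hypothetical stand-ins for typedefs: only the super type names and the "userName" and "metaData" attributes come from this comment, while the Asset attribute list and the DataSet entry are placeholders assumed for the example:

```python
# Simplified, hypothetical typedefs (not the contents of the model files).
TYPEDEFS = {
    "Asset": {"superTypes": [], "attributes": ["name", "description", "owner"]},
    "DataSet": {"superTypes": ["Asset"], "attributes": []},  # assumption
    "AtlasUser": {"superTypes": ["Asset"], "attributes": ["userName"]},
    "ml_user": {"superTypes": ["AtlasUser", "DataSet"], "attributes": ["metaData"]},
}


def all_attributes(type_name, typedefs):
    """Collect attributes of a type and its super types, parents first, no duplicates."""
    collected = []
    for parent in typedefs[type_name]["superTypes"]:
        for attr in all_attributes(parent, typedefs):
            if attr not in collected:
                collected.append(attr)
    for attr in typedefs[type_name]["attributes"]:
        if attr not in collected:
            collected.append(attr)
    return collected


print(all_attributes("ml_user", TYPEDEFS))
```

Because both parents lead back to "Asset" in this sketch, the helper de-duplicates the shared attributes so each appears once.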


- Na


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218332
---





Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-22 Thread Na Li via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/
---

(Updated Oct. 22, 2019, 8:04 p.m.)


Review request for atlas, Austin Nobis, Ashutosh Mestry, Karthik Manamcheri, 
Sridhar K, Madhan Neethiraj, and Sarath Subramanian.


Bugs: atlas-3464
https://issues.apache.org/jira/browse/atlas-3464


Repository: atlas


Description
---

Define entities used for Machine Learning Governance


Diffs (updated)
-

  addons/models/-Area0/0010-base_model.json 2f5fdaf 
  addons/models/4000-MachineLearning/4010-ml_model.json PRE-CREATION 


Diff: https://reviews.apache.org/r/71619/diff/8/

Changes: https://reviews.apache.org/r/71619/diff/7-8/


Testing
---

Verified that it is a valid JSON file.


Thanks,

Na Li



Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-22 Thread Anand Patil via Review Board


> On Oct. 22, 2019, 6:31 p.m., Sridhar K wrote:
> > addons/models/4000-MachineLearning/4010-ml_model.json
> > Lines 123 (patched)
> > 
> >
> > I am not the right person for this comment. But this name looks 
> > confusing. Is it a tag or an image URL?
> 
> Na Li wrote:
> Anand is very experienced in ML, and he said this name is commonly used 
> for that purpose. It specifies where to get the container image of a model.

"Image tag" is a standard Docker term: 
https://docs.docker.com/engine/reference/commandline/tag/#tag-an-image-for-a-private-repository
 It does include a sort of URL. I agree that the term is confusing, though.
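
As a rough illustration of why the value looks like a URL, here is a small Python sketch that splits an image reference into its name and tag. The function name is hypothetical, digest references (`@sha256:...`) are not handled, and falling back to `latest` mirrors Docker's default when no tag is given:

```python
def split_image_reference(reference):
    """Split "registry:port/repo/name:tag" into (name, tag); hypothetical helper."""
    last_slash = reference.rfind("/")
    last_colon = reference.rfind(":")
    # A colon only marks a tag if it comes after the last path separator;
    # an earlier colon belongs to the registry host's port.
    if last_colon > last_slash:
        return reference[:last_colon], reference[last_colon + 1:]
    return reference, "latest"


print(split_image_reference("myregistryhost:5000/fedora/httpd:version1.0"))
```

The registry host and port stay part of the name, which is the URL-like portion of the value.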


- Anand


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218332
---


On Oct. 22, 2019, 6:17 p.m., Na Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71619/
> ---
> 
> (Updated Oct. 22, 2019, 6:17 p.m.)
> 
> 
> Review request for atlas, Austin Nobis, Ashutosh Mestry, Karthik Manamcheri, 
> Sridhar K, Madhan Neethiraj, and Sarath Subramanian.
> 
> 
> Bugs: atlas-3464
> https://issues.apache.org/jira/browse/atlas-3464
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> Define entities used for Machine Learning Governance
> 
> 
> Diffs
> -
> 
>   addons/models/4000-MachineLearning/4010-ml_model.json PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71619/diff/7/
> 
> 
> Testing
> ---
> 
> Verified that it is a valid JSON file.
> 
> 
> Thanks,
> 
> Na Li
> 
>



Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-22 Thread Sridhar K

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218332
---




addons/models/4000-MachineLearning/4010-ml_model.json
Lines 123 (patched)


I am not the right person for this comment. But this name looks 
confusing. Is it a tag or an image URL?



addons/models/4000-MachineLearning/4010-ml_model.json
Lines 269 (patched)


I am wondering if we should have a first class entity called "User" defined 
in base model and you have ml_user extend from it. what do you say?


- Sridhar K





Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-22 Thread Na Li via Review Board


> On Oct. 22, 2019, 5:43 p.m., Sridhar K wrote:
> > addons/models/4000-MachineLearning/4010-ml_model.json
> > Lines 46 (patched)
> > 
> >
> > Is it different from name?
> 
> Sridhar K wrote:
> We have a name attribute already. Can we use it instead?

Good point. Removed it.


- Na


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218328
---


On Oct. 22, 2019, 3:42 p.m., Na Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71619/
> ---
> 
> (Updated Oct. 22, 2019, 3:42 p.m.)
> 
> 
> Review request for atlas, Austin Nobis, Ashutosh Mestry, Karthik Manamcheri, 
> Sridhar K, Madhan Neethiraj, and Sarath Subramanian.
> 
> 
> Bugs: atlas-3464
> https://issues.apache.org/jira/browse/atlas-3464
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> Define entities used for Machine Learning Governance
> 
> 
> Diffs
> -
> 
>   addons/models/4000-MachineLearning/4010-ml_model.json PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71619/diff/6/
> 
> 
> Testing
> ---
> 
> Verified that it is a valid JSON file.
> 
> 
> Thanks,
> 
> Na Li
> 
>



Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-22 Thread Sridhar K

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218328
---



Minor comments, all related to attribute names. Please address them in an 
updated review request.


addons/models/4000-MachineLearning/4010-ml_model.json
Lines 46 (patched)


Is it different from name?



addons/models/4000-MachineLearning/4010-ml_model.json
Lines 63 (patched)


Can we change it to createTime for consistency with other entities?



addons/models/4000-MachineLearning/4010-ml_model.json
Lines 71 (patched)


Can we please change it to modifiedTime for consistency with other entities?



addons/models/4000-MachineLearning/4010-ml_model.json
Lines 179 (patched)


Same comment as above. In general, we want to have consistent names for 
similar attributes even if they are from different entities.

So this comment applies to all createdAt and updatedAt attributes.



addons/models/4000-MachineLearning/4010-ml_model.json
Lines 206 (patched)


Let us call this deployedTime.

If you have time-based attributes, give them names ending with "*Time" for 
consistency.
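
A convention like this can be checked mechanically. The following Python sketch is only an illustration: it assumes the model file uses an `entityDefs`/`attributeDefs` layout and that time-based attributes are declared with the `date` type. The sample typedef fragment and the function name are made up for the example:

```python
def misnamed_time_attributes(model):
    """Return (entity, attribute) pairs typed "date" whose name lacks the "Time" suffix."""
    problems = []
    for entity in model.get("entityDefs", []):
        for attr in entity.get("attributeDefs", []):
            if attr.get("typeName") == "date" and not attr["name"].endswith("Time"):
                problems.append((entity["name"], attr["name"]))
    return problems


# Made-up typedef fragment for illustration only.
sample_model = {
    "entityDefs": [
        {
            "name": "ml_model_deployment",
            "attributeDefs": [
                {"name": "createTime", "typeName": "date"},
                {"name": "deployedAt", "typeName": "date"},
                {"name": "replicas", "typeName": "int"},
            ],
        }
    ]
}

print(misnamed_time_attributes(sample_model))
```

In this sample only the hypothetical `deployedAt` attribute is reported, since `createTime` already follows the suffix rule and `replicas` is not a date.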


- Sridhar K





Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-22 Thread Na Li via Review Board


> On Oct. 22, 2019, 3:27 p.m., Karthik Manamcheri wrote:
> > addons/models/4000-MachineLearning/4010-ml_model.json
> > Lines 293 (patched)
> > 
> >
> > Should this also be an enum? What types are we envisioning?

Removed this attribute as it is not clear how useful it is. Users can specify 
the type in the metadata attribute.


- Na


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218322
---





Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-22 Thread Karthik Manamcheri via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218322
---




addons/models/4000-MachineLearning/4010-ml_model.json
Lines 293 (patched)


Should this also be an enum? What types are we envisioning?


- Karthik Manamcheri


On Oct. 21, 2019, 6:51 p.m., Na Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71619/
> ---
> 
> (Updated Oct. 21, 2019, 6:51 p.m.)
> 
> 
> Review request for atlas, Austin Nobis, Ashutosh Mestry, Karthik Manamcheri, 
> Sridhar K, Madhan Neethiraj, and Sarath Subramanian.
> 
> 
> Bugs: atlas-3464
> https://issues.apache.org/jira/browse/atlas-3464
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> Define entities used for Machine Learning Governance
> 
> 
> Diffs
> -
> 
>   addons/models/4000-MachineLearning/4010-ml_model.json PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71619/diff/5/
> 
> 
> Testing
> ---
> 
> Verified that it is a valid JSON file.
> 
> 
> Thanks,
> 
> Na Li
> 
>



Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-21 Thread Na Li via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/
---

(Updated Oct. 21, 2019, 6:51 p.m.)


Review request for atlas, Austin Nobis, Ashutosh Mestry, Karthik Manamcheri, 
Sridhar K, Madhan Neethiraj, and Sarath Subramanian.


Bugs: atlas-3464
https://issues.apache.org/jira/browse/atlas-3464


Repository: atlas


Description
---

Define entities used for Machine Learning Governance


Diffs (updated)
-

  addons/models/4000-MachineLearning/4010-ml_model.json PRE-CREATION 


Diff: https://reviews.apache.org/r/71619/diff/5/

Changes: https://reviews.apache.org/r/71619/diff/4-5/


Testing
---

Verified that it is a valid JSON file.


Thanks,

Na Li



Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-21 Thread Na Li via Review Board


> On Oct. 21, 2019, 6:11 p.m., Sarath Subramanian wrote:
> > Move the ml model to a new directory in addons/models => 
> > 4000-MachineLearning
> > 
> > -ml_model.json => 4010-ml_model.json

Done. Thanks!


- Na


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218305
---





Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-21 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218305
---



Move the ml model to a new directory in addons/models => 4000-MachineLearning

-ml_model.json => 4010-ml_model.json

- Sarath Subramanian


On Oct. 20, 2019, 9:13 p.m., Na Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71619/
> ---
> 
> (Updated Oct. 20, 2019, 9:13 p.m.)
> 
> 
> Review request for atlas, Austin Nobis, Ashutosh Mestry, Karthik Manamcheri, 
> Sridhar K, Madhan Neethiraj, and Sarath Subramanian.
> 
> 
> Bugs: atlas-3464
> https://issues.apache.org/jira/browse/atlas-3464
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> Define entities used for Machine Learning Governance
> 
> 
> Diffs
> -
> 
>   addons/models/1000-Hadoop/-ml_model.json PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71619/diff/4/
> 
> 
> Testing
> ---
> 
> verified it is valid json file
> 
> 
> Thanks,
> 
> Na Li
> 
>



Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-21 Thread Anand Patil via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218304
---


Ship it!




Ship It!

- Anand Patil


On Oct. 21, 2019, 4:13 a.m., Na Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71619/
> ---
> 
> (Updated Oct. 21, 2019, 4:13 a.m.)
> 
> 
> Review request for atlas, Austin Nobis, Ashutosh Mestry, Karthik Manamcheri, 
> Sridhar K, Madhan Neethiraj, and Sarath Subramanian.
> 
> 
> Bugs: atlas-3464
> https://issues.apache.org/jira/browse/atlas-3464
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> Define entities used for Machine Learning Governance
> 
> 
> Diffs
> -
> 
>   addons/models/1000-Hadoop/-ml_model.json PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71619/diff/4/
> 
> 
> Testing
> ---
> 
> verified it is valid json file
> 
> 
> Thanks,
> 
> Na Li
> 
>



Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-20 Thread Na Li via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/
---

(Updated Oct. 21, 2019, 4:13 a.m.)


Review request for atlas, Austin Nobis, Ashutosh Mestry, Karthik Manamcheri, 
Sridhar K, Madhan Neethiraj, and Sarath Subramanian.


Bugs: atlas-3464
https://issues.apache.org/jira/browse/atlas-3464


Repository: atlas


Description
---

Define entities used for Machine Learning Governance


Diffs (updated)
-

  addons/models/1000-Hadoop/-ml_model.json PRE-CREATION 


Diff: https://reviews.apache.org/r/71619/diff/4/

Changes: https://reviews.apache.org/r/71619/diff/3-4/


Testing
---

verified it is valid json file


Thanks,

Na Li



Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-20 Thread Na Li via Review Board


> On Oct. 18, 2019, 11:07 p.m., Anand Patil wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 27 (patched)
> > 
> >
> > Suggest aligning with the CML model deployment statuses for now: 
> > https://github.infra.cloudera.com/Sense/cloudera-sense/blob/master/services/proto/common.proto#L55
> 
> Na Li wrote:
> do you mean to change 
> 
> ml_model_deployment_status from
> 
>   "elementDefs": [
>     {
>       "value": "unknown",
>       "ordinal": 0
>     },
>     {
>       "value": "initializing",
>       "ordinal": 1
>     },
>     {
>       "value": "deployed",
>       "ordinal": 2
>     },
>     {
>       "value": "stopped",
>       "ordinal": 3
>     }
>   ]
> 
> to
> 
>   "elementDefs": [
>     {
>       "value": "deploying",
>       "ordinal": 0
>     },
>     {
>       "value": "deployed",
>       "ordinal": 1
>     },
>     {
>       "value": "stopping",
>       "ordinal": 2
>     },
>     {
>       "value": "stopped",
>       "ordinal": 3
>     }
>   ]
> 
> ?
> 
> Anand Patil wrote:
> Yeah, but keep "unknown" too. What do you think?

agree.
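
A minimal sketch of the enum this exchange converges on: the four CML-aligned statuses plus "unknown". The ordering, and therefore the ordinals, is an assumption; the thread does not settle where "unknown" goes. The helper name build_element_defs is hypothetical and is not part of Atlas.

```python
import json

# Status values agreed in the thread: the CML-aligned set plus "unknown".
# ASSUMPTION: the order (and hence the ordinals) is illustrative only.
STATUSES = ["unknown", "deploying", "deployed", "stopping", "stopped"]


def build_element_defs(values):
    """Return Atlas-style elementDefs with consecutive ordinals starting at 0."""
    return [{"value": v, "ordinal": i} for i, v in enumerate(values)]


enum_def = {
    "name": "ml_model_deployment_status",
    "elementDefs": build_element_defs(STATUSES),
}

print(json.dumps(enum_def, indent=2))
```

Generating the ordinals from the list position keeps them unique and gap-free if a value is added or reordered later.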


- Na


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218292
---


On Oct. 18, 2019, 2 a.m., Na Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71619/
> ---
> 
> (Updated Oct. 18, 2019, 2 a.m.)
> 
> 
> Review request for atlas, Austin Nobis, Ashutosh Mestry, Karthik Manamcheri, 
> Sridhar K, Madhan Neethiraj, and Sarath Subramanian.
> 
> 
> Bugs: atlas-3464
> https://issues.apache.org/jira/browse/atlas-3464
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> Define entities used for Machine Learning Governance
> 
> 
> Diffs
> -
> 
>   addons/models/1000-Hadoop/-ml_model.json PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71619/diff/3/
> 
> 
> Testing
> ---
> 
> verified it is valid json file
> 
> 
> Thanks,
> 
> Na Li
> 
>



Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-18 Thread Na Li via Review Board


> On Oct. 18, 2019, 11:07 p.m., Anand Patil wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 27 (patched)
> > 
> >
> > Suggest aligning with the CML model deployment statuses for now: 
> > https://github.infra.cloudera.com/Sense/cloudera-sense/blob/master/services/proto/common.proto#L55

do you mean to change 

ml_model_deployment_status from

  "elementDefs": [
    {
      "value": "unknown",
      "ordinal": 0
    },
    {
      "value": "initializing",
      "ordinal": 1
    },
    {
      "value": "deployed",
      "ordinal": 2
    },
    {
      "value": "stopped",
      "ordinal": 3
    }
  ]

to

  "elementDefs": [
    {
      "value": "deploying",
      "ordinal": 0
    },
    {
      "value": "deployed",
      "ordinal": 1
    },
    {
      "value": "stopping",
      "ordinal": 2
    },
    {
      "value": "stopped",
      "ordinal": 3
    }
  ]

?


- Na


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218292
---





Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-18 Thread Anand Patil via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218292
---




addons/models/1000-Hadoop/-ml_model.json
Lines 27 (patched)


Suggest aligning with the CML model deployment statuses for now: 
https://github.infra.cloudera.com/Sense/cloudera-sense/blob/master/services/proto/common.proto#L55


- Anand Patil





Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-17 Thread Na Li via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/
---

(Updated Oct. 18, 2019, 2 a.m.)


Review request for atlas, Austin Nobis, Ashutosh Mestry, Karthik Manamcheri, 
Sridhar K, Madhan Neethiraj, and Sarath Subramanian.


Bugs: atlas-3464
https://issues.apache.org/jira/browse/atlas-3464


Repository: atlas


Description
---

Define entities used for Machine Learning Governance


Diffs (updated)
-

  addons/models/1000-Hadoop/-ml_model.json PRE-CREATION 


Diff: https://reviews.apache.org/r/71619/diff/3/

Changes: https://reviews.apache.org/r/71619/diff/2-3/


Testing
---

verified it is valid json file


Thanks,

Na Li



Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-17 Thread Na Li via Review Board


> On Oct. 16, 2019, 4:28 p.m., Anand Patil wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 336 (patched)
> > 
> >
> > Like model builds, recommend adding an attribute for number of GPU's.

done


> On Oct. 16, 2019, 4:28 p.m., Anand Patil wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 339 (patched)
> > 
> >
> > Would this be things like pod ID's? If we want to expose this kind of 
> > low-level, debugging information then let's make it complete enough to 
> > support most debugging use cases. 
> > 
> > For example, in order to take action based on the pod ID's, a cluster 
> > admin would need to know the k8s API server API, the kubeconfig, and the 
> > namespace in which the model is running.
> > 
> > I think we could leave this field out for now, add an unstructured 
> > 'metadata' field and tackle the debugging flows later.

I removed it


> On Oct. 16, 2019, 4:28 p.m., Anand Patil wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 346 (patched)
> > 
> >
> > Should we add number of replicas?

done


> On Oct. 16, 2019, 4:28 p.m., Anand Patil wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 361 (patched)
> > 
> >
> > Should this be optional?

it is removed


> On Oct. 16, 2019, 4:28 p.m., Anand Patil wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 373 (patched)
> > 
> >
> > Is this for things like admin/full user/viewer only? If so, why make it 
> > non-optional?

Changed it to optional. I am thinking the type would be admin/user, mlgov/service, etc.


> On Oct. 16, 2019, 4:28 p.m., Anand Patil wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 384-385 (patched)
> > 
> >
> > There currently is no such process in CML, projects are always created 
> > by user clicks in the UI.
> > 
> > Oh, is this just here to define lineage relationships between users and 
> > projects?

Yes. The user is associated with a process, and the process creates a project. 
A user can have a lot of metadata, and we don't want to duplicate that info on 
each project.


> On Oct. 16, 2019, 4:28 p.m., Anand Patil wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 415 (patched)
> > 
> >
> > Same question about whether uniqueId should really be optional.

removed.


> On Oct. 16, 2019, 4:28 p.m., Anand Patil wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 442 (patched)
> > 
> >
> > The name above is just ml_project_create_process.

This is the relationship between ml_user and ml_project_create_process.


> On Oct. 16, 2019, 4:28 p.m., Anand Patil wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 480 (patched)
> > 
> >
> > What's the reason for the ml_user prefix here and on 
> > ml_user_model_deploy_process?

It is a relationship that involves two types.


- Na


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218231
---


On Oct. 16, 2019, 12:30 a.m., Na Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71619/
> ---
> 
> (Updated Oct. 16, 2019, 12:30 a.m.)
> 
> 
> Review request for atlas, Austin Nobis, Ashutosh Mestry, Karthik Manamcheri, 
> Sridhar K, Madhan Neethiraj, and Sarath Subramanian.
> 
> 
> Bugs: atlas-3464
> https://issues.apache.org/jira/browse/atlas-3464
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> Define entities used for Machine Learning Governance
> 
> 
> Diffs
> -
> 
>   addons/models/1000-Hadoop/-ml_model.json PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71619/diff/1/
> 
> 
> Testing
> ---
> 
> verified it is valid json file
> 
> 
> Thanks,
> 
> Na Li
> 
>



Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-17 Thread Na Li via Review Board


> On Oct. 16, 2019, 4:45 p.m., Karthik Manamcheri wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 286 (patched)
> > 
> >
> > What is the purpose of "updatedAt"?

I removed it. If a user wants it, they can add it to the "metadata" attribute.


> On Oct. 16, 2019, 4:45 p.m., Karthik Manamcheri wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 349 (patched)
> > 
> >
> > How is a ML User different from a regular user?

It is a user involved in ML actions. There is no base type we can reuse.

__AtlasUserProfile is used internally, for a totally different purpose from ours.


> On Oct. 16, 2019, 4:45 p.m., Karthik Manamcheri wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 447 (patched)
> > 
> >
> > I question the need for a ml_user at this point. I don't completely 
> > understand how or why a ml_user is different from any other user in the 
> > system?
> > 
> > Should this just be "user"?

I prefix each type with "ml_" to avoid type-name conflicts. Another integration 
may later use "user" for a totally different purpose. We do need to have a type 
for users.


- Na


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218235
---





Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-17 Thread Na Li via Review Board


> On Oct. 16, 2019, 7:36 p.m., Na Li wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 51 (patched)
> > 
> >
> > Those fields are optional. So if some projects cannot provide those 
> > fields, that is OK. The attributes I put here are used in several projects.
> > 
> > If we put several "key:value" pairs in a single string, we risk losing 
> > some info. 
> > 
> > I think the big drawback of putting several key:value pairs in a single 
> > json string is that:some can be lost by accident. For example the 
> > customAttributes = "{githubRepoURL:https://host1/project1 }". Then later 
> > on, an update comes in with customAttributes = "{mlFramework:tensorFlow}" 
> > because that application has no idea someone is tracking githubRepoURL, or 
> > did not set githubRepoURL because only mlFramework has changed. Then the 
> > previous info on where the source code is got lost by accident. It is hard 
> > to debug. Besides, each time someone wants to use the info, has to parse 
> > the string.
> > 
> > The counter argument can be how about putting these attributes here. If 
> > they are not popular, we just don't use them.
> > 
> > The following info is from Ashutosh. 
> > Having an attribute that contains json blob is not searchable at Atlas 
> > in general. 
> > 
> > Other benefit of adding an attribute to the model is that entities 
> > created will get validated for the type
> > because of validation, we can be certain about the data present in that 
> > field
> > 
> > A hacky approach to add json blob and make it searchable is to set them 
> > in customAttributes, which is system property. We cannot see what keys are 
> > used in customAttributes at model files. it is set by whoever creates the 
> > Atlas entity.
> 
> Anand Patil wrote:
> Instead of a string type containing json, what about a 
> map? I see that type used in some other attributes.

That is a really good idea, and it avoids the risk of losing info!

I tested Atlas's behavior: Atlas can update part of an attribute's value if 
its type is map<>. I modified an HMS hook test in HiveMetastoreHookIT. 
The hive_table.parameters attribute's type is map<string,string>. When I added 
another parameter to a table that already had two, all three parameters 
were stored and retrieved successfully.

I will reduce the attributes and add a general-purpose attribute of type 
map, so users can experiment and we can learn which attributes are 
universally useful.
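
The difference between the two approaches can be sketched with a toy model. This is not Atlas code: update_json_blob and update_map are hypothetical helpers that only mimic the two behaviors discussed above, namely whole-value replacement of a JSON string versus the per-key update reported for map-typed attributes in the HiveMetastoreHookIT experiment.

```python
import json


def update_json_blob(stored, incoming):
    """Whole-value replacement: the incoming string overwrites the stored one."""
    return incoming


def update_map(stored, incoming):
    """Per-key merge: keys absent from the update are left untouched."""
    merged = dict(stored)
    merged.update(incoming)
    return merged


# Case 1: several key:value pairs packed into one JSON string.
blob = json.dumps({"githubRepoURL": "https://host1/project1"})
blob = update_json_blob(blob, json.dumps({"mlFramework": "tensorFlow"}))
blob_keys = sorted(json.loads(blob))

# Case 2: the same two updates applied to a map-typed attribute.
meta = update_map({}, {"githubRepoURL": "https://host1/project1"})
meta = update_map(meta, {"mlFramework": "tensorFlow"})
map_keys = sorted(meta)

print(blob_keys)  # githubRepoURL has been lost
print(map_keys)   # both keys survive
```

In the first case the second writer silently drops githubRepoURL because it never knew the key existed; in the second case each writer only touches the keys it sends.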


- Na


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218237
---





Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-17 Thread Na Li via Review Board


> On Oct. 16, 2019, 4:28 p.m., Anand Patil wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 165 (patched)
> > 
> >
> > Same comment about metadata: gitCommitId, hyperParameters, 
> > outcomeTypeDescription feel prematurely standardized. I feel that a 
> > map metadata field would be better suited to where we are 
> > now.

I will remove those attributes and add a general-purpose attribute "metadata".


> On Oct. 16, 2019, 4:28 p.m., Anand Patil wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 175 (patched)
> > 
> >
> > It's slightly awkward to have to stringify all hyperparameter values. 
> > Does Atlas have option types?

it is removed


> On Oct. 16, 2019, 4:28 p.m., Anand Patil wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 197 (patched)
> > 
> >
> > If status is non-optional, it should probably be an enum or Altus 
> > equivalent.
> > 
> > Also, I'm not sure this field is required for model builds as opposed 
> > to model deployments. Builds' status is very simple - building, built or 
> > errored I suppose. We don't necessarily need to represent building or 
> > errored builds in Atlas.

It is removed.


> On Oct. 16, 2019, 4:28 p.m., Anand Patil wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 220 (patched)
> > 
> >
> > I'm not sure defaultReplicas belongs on the model build. The correct 
> > number of replicas depends on usage, it's not really a property of the 
> > model itself.

It is removed.


> On Oct. 16, 2019, 4:28 p.m., Anand Patil wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 228 (patched)
> > 
> >
> > "imageTag" would be more standard terminology. The tag typically 
> > includes the entire image URL.
> > 
> > Consider also adding the image hash, as tags/url's are not necessarily 
> > unique.

fixed


> On Oct. 16, 2019, 4:28 p.m., Anand Patil wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 268 (patched)
> > 
> >
> > We don't need a uniqueID for model deployments?

"qualifiedName" already contains the unique ID, so there is no need for this.


> On Oct. 16, 2019, 4:28 p.m., Anand Patil wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 306 (patched)
> > 
> >
> > I don't think we should expose serviceIP. It may not be distinct from 
> > modelEndpointURL in non-Kubernetes serving environments and when it is, we 
> > may not want to encourage use of it. For example, when a model is being 
> > called from outside the k8s cluster where it is running, the service IP is 
> > not usable.

it is removed.


> On Oct. 16, 2019, 4:28 p.m., Anand Patil wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 319 (patched)
> > 
> >
> > If this is required I think it should be an enum or equivalent in Atlas.

changed to enum


> On Oct. 16, 2019, 4:28 p.m., Anand Patil wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 323 (patched)
> > 
> >
> > Use the same units for CPU and memory here as in the defaults for model 
> > build. I would go with millicores and bytes.

I changed the CPU unit to millicores, but kept the memory unit in MB. Is a 
unit of bytes too small?
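
A quick arithmetic check may help with the open question about bytes. It is only a sketch: it assumes binary megabytes (2^20 bytes), which the thread does not specify, and the 2048 MB figure is an illustrative value rather than a limit taken from the model.

```python
# ASSUMPTION: "MB" is taken as 2**20 bytes; the thread does not say which.
BYTES_PER_MB = 1024 * 1024

INT32_MAX = 2**31 - 1
INT64_MAX = 2**63 - 1


def mb_to_bytes(mb):
    """Convert a memory size in megabytes to bytes."""
    return mb * BYTES_PER_MB


# Illustrative value: a deployment with 2048 MB of memory.
two_gb_in_bytes = mb_to_bytes(2048)

print(two_gb_in_bytes)
print(two_gb_in_bytes > INT32_MAX)   # does not fit a 32-bit signed integer
print(two_gb_in_bytes <= INT64_MAX)  # fits comfortably in 64 bits
```

So bytes are workable as a unit, provided the attribute is stored as a 64-bit integer rather than a 32-bit one.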


- Na


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218231
---



Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-17 Thread Na Li via Review Board


> On Oct. 16, 2019, 4:28 p.m., Anand Patil wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 10 (patched)
> > 
> >
> > Why make project a subtype of dataset?

The Atlas server has a lot of code that builds lineage based on the Process 
and DataSet types.

1) Process only handles types derived from `DataSet`:

{
  "name": "Process",
  "superTypes": [
    "Asset"
  ],
  "serviceType": "atlas_core",
  "typeVersion": "1.1",
  "attributeDefs": [
    {
      "name": "inputs",
      "typeName": "array<DataSet>",
      "cardinality": "SET",
      "isIndexable": false,
      "isOptional": true,
      "isUnique": false
    },
    {
      "name": "outputs",
      "typeName": "array<DataSet>",
      "cardinality": "SET",
      "isIndexable": false,
      "isOptional": true,
      "isUnique": false
    }
  ]
}

2) Lineage is only built between the Process type and the DataSet type. 
https://github.com/apache/atlas/blob/master/repository/src/main/java/org/apache/atlas/discovery/EntityLineageService.java#L117
3) If we don't want to re-implement lineage in Atlas, it is better to 
derive action types from Process and data types from DataSet.
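
The way lineage falls out of Process inputs and outputs can be sketched with a toy traversal. This is not how EntityLineageService is implemented; it only illustrates the idea. The type name ml_project_create_process comes from this review, while hypothetical_train_process, the instance names, and the downstream helper are made up for the example.

```python
from collections import deque

# Toy process entities. Each one links data entities via inputs/outputs,
# mirroring the Process attributes quoted above.
# ASSUMPTION: instance names and the second process type are hypothetical.
processes = [
    {"name": "create_project_1", "type": "ml_project_create_process",
     "inputs": ["user_alice"], "outputs": ["project_1"]},
    {"name": "train_1", "type": "hypothetical_train_process",
     "inputs": ["project_1"], "outputs": ["build_1"]},
]


def downstream(start, processes):
    """Breadth-first walk: entity -> process (via inputs) -> entity (via outputs)."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for process in processes:
            if node in process["inputs"]:
                for out in process["outputs"]:
                    if out not in seen:
                        seen.add(out)
                        order.append(out)
                        queue.append(out)
    return order


print(downstream("user_alice", processes))
```

Because every hop goes through a process, an entity that is neither an input nor an output of some Process simply never appears in the walk, which is the practical reason for deriving the ML types from DataSet and Process.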


> On Oct. 16, 2019, 4:28 p.m., Anand Patil wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 51 (patched)
> > 
> >
> > businessUseCase, modelFramework, modelAlgorithms, githubRepoURL, 
> > notebookURL and resourceURL feel prematurely opinionated to me. Some 
> > projects will not use all these fields, and for some projects fields other 
> > than these will probably be more important.
> > 
> > Would it be possible in Atlas to move these to a single "metadata" 
> > attribute whose value is key-value pairs? We can then move to stronger 
> > typing as common patterns emerge.

Replaced those attributes with:

{
  "name": "metadata",
  "description": "Contains key-value pairs that provide project metadata",
  "typeName": "map",
  "cardinality": "SINGLE",
  "isIndexable": false,
  "isOptional": true,
  "isUnique": false
}


- Na


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218231
---





Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-17 Thread Na Li via Review Board


> On Oct. 16, 2019, 4:28 p.m., Anand Patil wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 18 (patched)
> > 
> >
> > Can this really be optional?

I will drop this attribute, and use "qualifiedName", which must be unique.


- Na


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218231
---





Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-17 Thread Na Li via Review Board


> On Oct. 16, 2019, 7:36 p.m., Na Li wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 10 (patched)
> > 
> >
> > "DataSet" in Atlas has a different meaning from the one in ML.  
> > 
> > DataSet: This type extends Referenceable and Asset. Conceptually, it 
> > can be used to represent a type that stores data. In Atlas, hive tables, 
> > Sqoop RDBMS tables etc are all types that extend from DataSet. Types that 
> > extend DataSet can be expected to have a Schema in the sense that they 
> > would have an attribute that defines attributes of that dataset. For e.g. 
> > the columns attribute in a hive_table. Also entities of types that extend 
> > DataSet participate in data transformation and this transformation can be 
> > captured by Atlas via lineage (or provenance) graphs. 
> > https://atlas.apache.org/0.8.1/TypeSystem.html
> 
> Anand Patil wrote:
> Hmm, it still feels like an odd fit to me, eg model builds and users 
> don't really have a schema. What would you think about instead making our 
> types inherit from Referenceable and Asset?

The type hierarchy is as follows:

1) Asset derives from Referenceable
2) DataSet derives from Asset
3) Infrastructure derives from Asset
4) Process derives from Asset
5) ProcessExecution derives from Asset

It seems to me that
a) DataSet is used for storing data, 
b) Infrastructure is used as a container, 
c) Process and ProcessExecution represent any data transformation operation or 
action

Deriving "model builds" or "users" directly from Asset would violate this 
convention. I searched the integration code of other modules, and no type is 
derived directly from Referenceable or Asset.

{
  "name": "Referenceable",
  "superTypes": [],
  "serviceType": "atlas_core",
  "typeVersion": "1.0",
  "attributeDefs": [
    {
      "name": "qualifiedName",
      "typeName": "string",
      "cardinality": "SINGLE",
      "isIndexable": true,
      "isOptional": false,
      "isUnique": true
    }
  ]
},
{
  "name": "Asset",
  "superTypes": [
    "Referenceable"
  ],
  "serviceType": "atlas_core",
  "typeVersion": "1.1",
  "attributeDefs": [
    {
      "name": "name",
      "typeName": "string",
      "cardinality": "SINGLE",
      "isIndexable": true,
      "isOptional": false,
      "isUnique": false,
      "indexType": "STRING"
    },
{
  "name": "DataSet",
  "superTypes": [
    "Asset"
  ],
  "serviceType": "atlas_core",
  "typeVersion": "1.1",
  "attributeDefs": []
},
{
  "name": "Infrastructure",
  "description": "Infrastructure can be IT infrastructure, which contains hosts and servers. Infrastructure might not be IT orientated, such as 'Car' for IoT applications.",
  "superTypes": [
    "Asset"
  ],
  "serviceType": "atlas_core",
  "typeVersion": "1.1",
  "attributeDefs": []
},
{
  "name": "Process",
  "superTypes": [
    "Asset"
  ],
  "serviceType": "atlas_core",
  "typeVersion": "1.1",
  "attributeDefs": [
    {
      "name": "inputs",
      "typeName": "array<DataSet>",
      "cardinality": "SET",
      "isIndexable": false,
      "isOptional": true,
      "isUnique": false
    },
    {
      "name": "outputs",
      "typeName": "array<DataSet>",
      "cardinality": "SET",
      "isIndexable": false,
      "isOptional": true,
      "isUnique": false
    }
  ]
},
{
  "name": "ProcessExecution",
  "superTypes": [
    "Asset"
  ],
  "serviceType": "atlas_core",
  "typeVersion": "1.0",
  "attributeDefs": []
}
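
The hierarchy above can be checked mechanically with a small sketch. The superTypes of the base types are copied from the excerpt; the placement of ml_project under DataSet follows the proposal under review, and placing ml_project_create_process under Process is an assumption based on the reasoning in this thread. The helpers ancestors and is_a are hypothetical names, not Atlas APIs.

```python
# superTypes as listed in the base-model excerpt quoted in this thread.
SUPER_TYPES = {
    "Referenceable": [],
    "Asset": ["Referenceable"],
    "DataSet": ["Asset"],
    "Infrastructure": ["Asset"],
    "Process": ["Asset"],
    "ProcessExecution": ["Asset"],
    # ASSUMPTION: placement of the ML types follows the proposal under review.
    "ml_project": ["DataSet"],
    "ml_project_create_process": ["Process"],
}


def ancestors(type_name, super_types=SUPER_TYPES):
    """Return all transitive supertypes, nearest first."""
    result = []
    pending = list(super_types[type_name])
    while pending:
        parent = pending.pop(0)
        if parent not in result:
            result.append(parent)
            pending.extend(super_types[parent])
    return result


def is_a(type_name, candidate):
    """True if type_name is candidate or derives from it."""
    return candidate == type_name or candidate in ancestors(type_name)


print(ancestors("ml_project"))
print(is_a("ml_project", "DataSet"))
print(is_a("ml_project_create_process", "DataSet"))
```

A type derived directly from Asset would still be Referenceable, but is_a(..., "DataSet") and is_a(..., "Process") would both be false for it, so it would fall outside the Process/DataSet lineage convention described above.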


> On Oct. 16, 2019, 7:36 p.m., Na Li wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 18 (patched)
> > 
> >
> > I don't want this to be optional. However, right now, some data source 
> > does not provide this info.
> 
> Anand Patil wrote:
> Could you give an example?

Every Referenceable has an attribute "qualifiedName", which has to be unique. I 
am thinking we should get rid of the attribute "uniqueId", which duplicates 
the purpose of "qualifiedName". The attribute "name" can contain a 
user-friendly name, and need not be unique.


- Na


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218237
---



Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-16 Thread Anand Patil via Review Board


> On Oct. 16, 2019, 7:36 p.m., Na Li wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 10 (patched)
> > 
> >
> > "DataSet" in Atlas has a different meaning from the one in ML.  
> > 
> > DataSet: This type extends Referenceable and Asset. Conceptually, it 
> > can be used to represent a type that stores data. In Atlas, hive tables, 
> > Sqoop RDBMS tables etc are all types that extend from DataSet. Types that 
> > extend DataSet can be expected to have a Schema in the sense that they 
> > would have an attribute that defines attributes of that dataset. For e.g. 
> > the columns attribute in a hive_table. Also entities of types that extend 
> > DataSet participate in data transformation and this transformation can be 
> > captured by Atlas via lineage (or provenance) graphs. 
> > https://atlas.apache.org/0.8.1/TypeSystem.html

Hmm, it still feels like an odd fit to me, eg model builds and users don't 
really have a schema. What would you think about instead making our types 
inherit from Referenceable and Asset?


> On Oct. 16, 2019, 7:36 p.m., Na Li wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 18 (patched)
> > 
> >
> > I don't want this to be optional. However, right now, some data source 
> > does not provide this info.

Could you give an example?


> On Oct. 16, 2019, 7:36 p.m., Na Li wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 43 (patched)
> > 
> >
> > One use case I can think of: when a user cannot access a project, it 
> > would be useful to know whether the project is private, in which case the 
> > user needs to be a collaborator. If the project is public, then something 
> > else must have gone wrong.
> > 
> > For some businesses, it may be desirable for the project to be private.

OK.


> On Oct. 16, 2019, 7:36 p.m., Na Li wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 51 (patched)
> > 
> >
> > Those fields are optional, so it is OK if some projects cannot provide 
> > them. The attributes I put here are used in several projects.
> > 
> > If we put several "key:value" pairs in a single string, we risk losing 
> > some info. 
> > 
> > I think the big drawback of putting several key:value pairs in a single 
> > JSON string is that some can be lost by accident. For example, suppose 
> > customAttributes = "{githubRepoURL:https://host1/project1 }". Later on, an 
> > update comes in with customAttributes = "{mlFramework:tensorFlow}", either 
> > because that application has no idea someone is tracking githubRepoURL, or 
> > because it did not set githubRepoURL since only mlFramework has changed. 
> > The previous info on where the source code lives is then lost by accident, 
> > which is hard to debug. Besides, anyone who wants to use the info has to 
> > parse the string each time.
> > 
> > The counterargument is: why not put these attributes here? If they are 
> > not popular, we just don't use them.
> > 
> > The following info is from Ashutosh. 
> > An attribute that contains a JSON blob is generally not searchable in 
> > Atlas. 
> > 
> > Another benefit of adding an attribute to the model is that entities 
> > created will get validated against the type. Because of that validation, 
> > we can be certain about the data present in that field.
> > 
> > A hacky approach to add a JSON blob and make it searchable is to set it 
> > in customAttributes, which is a system property. However, we cannot see in 
> > the model files which keys are used in customAttributes; they are set by 
> > whoever creates the Atlas entity.

Instead of a string type containing json, what about a map? I 
see that type used in some other attributes.


- Anand


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218237
---


On Oct. 16, 2019, 12:30 a.m., Na Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71619/
> ---
> 
> (Updated Oct. 16, 2019, 12:30 a.m.)
> 
> 
> Review request for atlas, Austin Nobis, Ashutosh Mestry, Karthik Manamcheri, 
> Sridhar K, Madhan Neethiraj, and Sarath Subramanian.
> 
> 
> Bugs: atlas-3464
> https://issues.apache.org/jira/browse/atlas-3464
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> Define entities used for Machine Learning Governance
> 
> 
> Diffs
> -
> 
>   addons/models/1000-Hadoop/-ml_model.json 

Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-16 Thread Anand Patil via Review Board


> On Oct. 16, 2019, 8:10 p.m., Na Li wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 175 (patched)
> > 
> >
> > Do you want to define types that can be used for hyperparameters?
> > 
> > Can you give me some examples for which we could define general-purpose 
> > types to hold hyperparameters?

Floats would be most common, but ints, bools, categoricals and arrays are all 
possible. @mlw at cloudera could give you some more color.
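
To make the trade-off concrete, here is a small sketch of what stringifying 
mixed-type hyperparameters involves if the attribute only holds strings. It 
assumes each value is JSON-encoded on write and decoded on read; the helper 
names and the sample hyperparameters are hypothetical and are not part of 
the patch or of Atlas.

```python
import json

def encode_hyperparameters(params):
    """JSON-encode each value so it fits a string-to-string attribute."""
    return {name: json.dumps(value) for name, value in params.items()}

def decode_hyperparameters(encoded):
    """Recover the original Python types from the JSON-encoded strings."""
    return {name: json.loads(text) for name, text in encoded.items()}

# Hypothetical values covering floats, ints, bools, categoricals and arrays.
params = {
    "learning_rate": 0.01,
    "epochs": 20,
    "shuffle": True,
    "optimizer": "adam",
    "layers": [64, 32],
}

encoded = encode_hyperparameters(params)
decoded = decode_hyperparameters(encoded)
```

Every reader has to know about the encoding convention to get the types 
back, which is the awkwardness raised in the review.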


> On Oct. 16, 2019, 8:10 p.m., Na Li wrote:
> > addons/models/1000-Hadoop/-ml_model.json
> > Lines 197 (patched)
> > 
> >
> > The status can be "deprecated", for example. We need to look at this 
> > from the ML operations point of view, not from how the model build is 
> > created.
> > 
> > If an operations engineer finds an issue with a model build, that build 
> > will be marked as "deprecated"; other model builds may be in the "dev" 
> > stage, and some are approved for "production". We need this info to help 
> > decide which model build to deploy.
> > 
> > For different deployment situations, the status can take different 
> > values. Making it an enum may be too restrictive.

Hmmm. If status can be any user-chosen string, as opposed to an enum, then we 
aren't going to be able to do much with it in UI or tooling. For example, some 
users may choose 'deprecated' and some may choose 'stale'. In that case we 
could leave it to a flexible metadata attribute and keep the type definition 
small at this stage. We will definitely want to add attributes as we start to 
get feedback from users.
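
A minimal sketch of the tooling concern, assuming the three status values 
mentioned in this thread ("dev", "production", "deprecated") were the agreed 
set. The class and function names are hypothetical. With a fixed set, a tool 
can make decisions on the value; a free-form string such as 'stale' cannot 
be interpreted.

```python
from enum import Enum

class BuildStatus(Enum):
    # Values taken from the examples discussed in the review thread.
    DEV = "dev"
    PRODUCTION = "production"
    DEPRECATED = "deprecated"

def parse_status(raw):
    """Return the matching BuildStatus, or None for an unknown string."""
    try:
        return BuildStatus(raw.strip().lower())
    except ValueError:
        return None

def deployable(raw):
    """A tool can only act on statuses it recognizes."""
    return parse_status(raw) is BuildStatus.PRODUCTION
```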


- Anand


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218238
---


On Oct. 16, 2019, 12:30 a.m., Na Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71619/
> ---
> 
> (Updated Oct. 16, 2019, 12:30 a.m.)
> 
> 
> Review request for atlas, Austin Nobis, Ashutosh Mestry, Karthik Manamcheri, 
> Sridhar K, Madhan Neethiraj, and Sarath Subramanian.
> 
> 
> Bugs: atlas-3464
> https://issues.apache.org/jira/browse/atlas-3464
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> Define entities used for Machine Learning Governance
> 
> 
> Diffs
> -
> 
>   addons/models/1000-Hadoop/-ml_model.json PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71619/diff/1/
> 
> 
> Testing
> ---
> 
> verified it is valid json file
> 
> 
> Thanks,
> 
> Na Li
> 
>



Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-16 Thread Na Li via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218238
---




addons/models/1000-Hadoop/-ml_model.json
Lines 135 (patched)


same answer as above



addons/models/1000-Hadoop/-ml_model.json
Lines 175 (patched)


Do you want to define types that can be used for hyperparameters?

Can you give me some examples for which we could define general-purpose types 
to hold hyperparameters?



addons/models/1000-Hadoop/-ml_model.json
Lines 197 (patched)


The status can be "deprecated", for example. We need to look at this from 
the ML operations point of view, not from how the model build is created.

If an operations engineer finds an issue with a model build, that build will 
be marked as "deprecated"; other model builds may be in the "dev" stage, and 
some are approved for "production". We need this info to help decide which 
model build to deploy.

For different deployment situations, the status can take different values. 
Making it an enum may be too restrictive.



addons/models/1000-Hadoop/-ml_model.json
Lines 204 (patched)


Good point. I will add the recommended number of GPUs. A value of 0 means 
the model does not need a GPU.



addons/models/1000-Hadoop/-ml_model.json
Lines 220 (patched)


OK. I will remove it.



addons/models/1000-Hadoop/-ml_model.json
Lines 228 (patched)


will add this attribute. Thanks!



addons/models/1000-Hadoop/-ml_model.json
Lines 248 (patched)


I am thinking about this. I will add an example response to help callers 
process the response.



addons/models/1000-Hadoop/-ml_model.json
Lines 268 (patched)


We should. I intended to do that, but missed this one. Thanks!


- Na Li





Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-16 Thread Na Li via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218237
---




addons/models/1000-Hadoop/-ml_model.json
Lines 10 (patched)


"DataSet" in Atlas has a different meaning from the one in ML.  

DataSet: This type extends Referenceable and Asset. Conceptually, it can be 
used to represent a type that stores data. In Atlas, hive tables, Sqoop RDBMS 
tables etc. are all types that extend from DataSet. Types that extend DataSet 
can be expected to have a Schema in the sense that they would have an attribute 
that defines attributes of that dataset, for example the columns attribute in a 
hive_table. Also, entities of types that extend DataSet participate in data 
transformation, and this transformation can be captured by Atlas via lineage (or 
provenance) graphs. https://atlas.apache.org/0.8.1/TypeSystem.html



addons/models/1000-Hadoop/-ml_model.json
Lines 18 (patched)


I don't want this to be optional. However, right now, some data sources do 
not provide this info.



addons/models/1000-Hadoop/-ml_model.json
Lines 43 (patched)


One use case I can think of: when a user cannot access a project, it would 
be useful to know whether the project is private, in which case the user needs 
to be a collaborator. If the project is public, then something else must have 
gone wrong.

For some businesses, it may be desirable for the project to be private.



addons/models/1000-Hadoop/-ml_model.json
Lines 51 (patched)


Those fields are optional, so it is OK if some projects cannot provide them. 
The attributes I put here are used in several projects.

If we put several "key:value" pairs in a single string, we risk losing some 
info. 

I think the big drawback of putting several key:value pairs in a single 
JSON string is that some can be lost by accident. For example, suppose 
customAttributes = "{githubRepoURL:https://host1/project1 }". Later on, an 
update comes in with customAttributes = "{mlFramework:tensorFlow}", either 
because that application has no idea someone is tracking githubRepoURL, or 
because it did not set githubRepoURL since only mlFramework has changed. The 
previous info on where the source code lives is then lost by accident, which 
is hard to debug. Besides, anyone who wants to use the info has to parse the 
string each time.
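
The accidental-loss scenario can be reproduced with a short sketch. It models 
entities as plain Python dicts rather than calling any Atlas API, and the 
helper names are hypothetical; it only assumes that a write to the single 
string attribute replaces the whole string, while a write to separate 
attributes touches only the attributes named in the update.

```python
import json

def overwrite_blob(entity, new_blob):
    """Replace the whole JSON string, as a client unaware of other keys would."""
    updated = dict(entity)
    updated["customAttributes"] = new_blob
    return updated

def update_typed(entity, **changes):
    """Update only the named attributes; everything else is left untouched."""
    updated = dict(entity)
    updated.update(changes)
    return updated

# Single JSON-string attribute: the second writer only knows about mlFramework.
blob_entity = {
    "customAttributes": json.dumps({"githubRepoURL": "https://host1/project1"})
}
blob_entity = overwrite_blob(blob_entity, json.dumps({"mlFramework": "tensorFlow"}))
blob_keys = sorted(json.loads(blob_entity["customAttributes"]))

# Separate attributes: the same two writes.
typed_entity = {"githubRepoURL": "https://host1/project1"}
typed_entity = update_typed(typed_entity, mlFramework="tensorFlow")
typed_keys = sorted(typed_entity)

print(blob_keys)   # githubRepoURL is gone
print(typed_keys)  # both attributes survive
```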

The counterargument is: why not put these attributes here? If they are not 
popular, we just don't use them.

The following info is from Ashutosh. 
An attribute that contains a JSON blob is generally not searchable in Atlas. 

Another benefit of adding an attribute to the model is that entities created 
will get validated against the type. Because of that validation, we can be 
certain about the data present in that field.

A hacky approach to add a JSON blob and make it searchable is to set it in 
customAttributes, which is a system property. However, we cannot see in the 
model files which keys are used in customAttributes; they are set by whoever 
creates the Atlas entity.


- Na Li





Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-16 Thread Karthik Manamcheri via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218235
---




addons/models/1000-Hadoop/-ml_model.json
Lines 286 (patched)


What is the purpose of "updatedAt"?



addons/models/1000-Hadoop/-ml_model.json
Lines 349 (patched)


How is an ML user different from a regular user?



addons/models/1000-Hadoop/-ml_model.json
Lines 447 (patched)


I question the need for an ml_user at this point. I don't completely 
understand how or why an ml_user is different from any other user in the system.

Should this just be "user"?


- Karthik Manamcheri





Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-16 Thread Anand Patil via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/#review218231
---




addons/models/1000-Hadoop/-ml_model.json
Lines 10 (patched)


Why make project a subtype of dataset?



addons/models/1000-Hadoop/-ml_model.json
Lines 18 (patched)


Can this really be optional?



addons/models/1000-Hadoop/-ml_model.json
Lines 43 (patched)


Do we have a use case for project visibility, or is it redundant with 
permissions? In other words, if user X has read access to project Y, do we care 
whether the reason is that the user is a collaborator to a private project or 
that the project is public?



addons/models/1000-Hadoop/-ml_model.json
Lines 51 (patched)


businessUseCase, modelFramework, modelAlgorithms, githubRepoURL, 
notebookURL and resourceURL feel prematurely opinionated to me. Some projects 
will not use all these fields, and for some projects fields other than these 
will probably be more important.

Would it be possible in Atlas to move these to a single "metadata" 
attribute whose value is key-value pairs? We can then move to stronger typing 
as common patterns emerge.



addons/models/1000-Hadoop/-ml_model.json
Lines 135 (patched)


Same question about why dataset is the supertype.



addons/models/1000-Hadoop/-ml_model.json
Lines 165 (patched)


Same comment about metadata: gitCommitId, hyperParameters, 
outcomeTypeDescription feel prematurely standardized. I feel that a 
map metadata field would be better suited to where we are now.



addons/models/1000-Hadoop/-ml_model.json
Lines 175 (patched)


It's slightly awkward to have to stringify all hyperparameter values. Does 
Atlas have option types?



addons/models/1000-Hadoop/-ml_model.json
Lines 197 (patched)


If status is non-optional, it should probably be an enum or Atlas 
equivalent.

Also, I'm not sure this field is required for model builds as opposed to 
model deployments. Builds' status is very simple - building, built or errored I 
suppose. We don't necessarily need to represent building or errored builds in 
Atlas.



addons/models/1000-Hadoop/-ml_model.json
Lines 204 (patched)


In addition to defaultCpuCores and defaultMemoryMb, we should probably have 
a field indicating whether the model utilizes GPUs.



addons/models/1000-Hadoop/-ml_model.json
Lines 220 (patched)


I'm not sure defaultReplicas belongs on the model build. The correct number 
of replicas depends on usage; it's not really a property of the model itself.



addons/models/1000-Hadoop/-ml_model.json
Lines 228 (patched)


"imageTag" would be more standard terminology. The tag typically includes 
the entire image URL.

Consider also adding the image hash, as tags/URLs are not necessarily 
unique.



addons/models/1000-Hadoop/-ml_model.json
Lines 248 (patched)


Should we also have exampleResponse?



addons/models/1000-Hadoop/-ml_model.json
Lines 264 (patched)


Same question about supertype



addons/models/1000-Hadoop/-ml_model.json
Lines 268 (patched)


We don't need a uniqueID for model deployments?



addons/models/1000-Hadoop/-ml_model.json
Lines 306 (patched)


I don't think we should expose serviceIP. It may not be distinct from 
modelEndpointURL in non-Kubernetes serving environments, and when it is, we may 
not want to encourage use of it. For example, when a model is being called from 
outside the k8s cluster where it is running, the service IP is not usable.



addons/models/1000-Hadoop/-ml_model.json
Lines 319 (patched)


If this is required I think it should be an enum or equivalent in Atlas.



addons/models/1000-Hadoop/-ml_model.json
Lines 323 (patched)


Use the same units for CPU and memory here as in the defaults for model 
build. I would go with millicores and bytes.
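
For illustration, the conversion from the current defaults (cores, MB) to the 
suggested units could look like the sketch below. The function names are 
hypothetical, and whether "Mb" in defaultMemoryMb means 1024 * 1024 bytes or 
1000 * 1000 bytes is an assumption the patch would need to state; the sketch 
defaults to the binary interpretation.

```python
def cores_to_millicores(cores):
    """Convert a (possibly fractional) CPU core count to millicores."""
    return int(round(cores * 1000))

def mb_to_bytes(mb, binary=True):
    """Convert megabytes to bytes.

    binary=True treats 1 MB as 1024 * 1024 bytes (an assumption);
    binary=False treats it as 1000 * 1000 bytes.
    """
    factor = 1024 * 1024 if binary else 1000 * 1000
    return int(mb * factor)
```

Storing integers in the smallest unit avoids fractional values such as 0.5 
cores and keeps build defaults and deployment values directly comparable.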



addons/models/1000-Hadoop/-ml_model.json
Lines 336 (patched)



Re: Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-15 Thread Na Li via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/
---

(Updated Oct. 15, 2019, 9:47 p.m.)


Review request for atlas, Ashutosh Mestry, Sridhar K, Madhan Neethiraj, and 
Sarath Subramanian.


Bugs: atlas-3464
https://issues.apache.org/jira/browse/atlas-3464


Repository: atlas


Description
---

Define entities used for Machine Learning Governance


Diffs
-

  addons/models/1000-Hadoop/-ml_model.json PRE-CREATION 


Diff: https://reviews.apache.org/r/71619/diff/1/


Testing
---

verified it is valid json file


Thanks,

Na Li



Review Request 71619: ATLAS-3464: Define Entities stored in Atlas for ML Governance

2019-10-15 Thread Na Li via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71619/
---

Review request for atlas, Ashutosh Mestry, Sridhar K, Madhan Neethiraj, and 
Sarath Subramanian.


Repository: atlas


Description
---

Define entities used for Machine Learning Governance


Diffs
-

  addons/models/1000-Hadoop/-ml_model.json PRE-CREATION 


Diff: https://reviews.apache.org/r/71619/diff/1/


Testing
---

verified it is valid json file


Thanks,

Na Li