Re: Review Request 71025: Import Service: Support Concurrent Ingest

2020-01-23 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/
---

(Updated Jan. 23, 2020, 5:30 p.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include:
- Rebased onto the latest version.
- Added a new method that commits based on the batch size.
- Refactoring.


Bugs: ATLAS-3320
https://issues.apache.org/jira/browse/ATLAS-3320


Repository: atlas


Description
---

**Approach**
- Use the existing producer-consumer (PC) framework.
- Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
- Add support for configuring the number of workers and the batch size within _AtlasImportRequest_.

_AtlasImportRequest_
```
{
  "options": {
    "numWorkers": 8,
    "batchSize": 25
  }
}
```
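
The role of the two options can be illustrated with a minimal producer-consumer sketch. This is not the Atlas implementation: `run_import`, the queue, and the `commits` list are hypothetical stand-ins, and the "commit" here only records how many items a worker flushed. It assumes each worker commits after every `batch_size` items and once more for any remainder.

```python
import queue
import threading

def run_import(entities, num_workers=8, batch_size=25):
    """Hypothetical sketch: consume entities with num_workers threads,
    recording one 'commit' per batch_size items handled by a worker."""
    work = queue.Queue()
    commits = []                      # one entry per commit: items in that commit
    lock = threading.Lock()

    def consume():
        pending = []
        while True:
            item = work.get()
            if item is None:          # sentinel: no more work for this worker
                break
            pending.append(item)      # stand-in for creating/updating the entity
            if len(pending) >= batch_size:
                with lock:
                    commits.append(len(pending))   # stand-in for a graph commit
                pending = []
        if pending:                   # flush the final partial batch
            with lock:
                commits.append(len(pending))

    workers = [threading.Thread(target=consume) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for e in entities:                # producer: enqueue every entity
        work.put(e)
    for _ in workers:                 # one sentinel per worker
        work.put(None)
    for w in workers:
        w.join()
    return commits

commits = run_import(range(1000), num_workers=8, batch_size=25)
print(sum(commits), "entities committed in", len(commits), "commits")
```

The number of commits varies from run to run because the threads race for queue items; only the total number of committed entities is fixed.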

**CURL**
```
curl -v -X POST -u admin:admin \
  -H "Content-Type: multipart/form-data" \
  -H "Cache-Control: no-cache" \
  -F request=@./import-options.json \
  -F data=@./Default-3-pre.zip \
  http://localhost:21000/api/atlas/admin/import
```
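
The `import-options.json` file passed as the `request` part is not shown in this review. Assuming it holds the _AtlasImportRequest_ JSON from above, it could be generated with a small helper like the sketch below. `write_import_options` is a hypothetical name, and the check that both values are at least 1 is an assumption, not a documented Atlas rule.

```python
import json
import os
import tempfile

def write_import_options(path, num_workers, batch_size):
    """Hypothetical helper: write the import request options to a JSON file."""
    if num_workers < 1 or batch_size < 1:
        # Assumed sanity check; the review does not state the valid ranges.
        raise ValueError("numWorkers and batchSize must be at least 1")
    request = {"options": {"numWorkers": num_workers, "batchSize": batch_size}}
    with open(path, "w") as f:
        json.dump(request, f, indent=2)
    return request

# Written to a temporary directory here; the curl command expects ./import-options.json.
options_path = os.path.join(tempfile.mkdtemp(), "import-options.json")
written = write_import_options(options_path, num_workers=8, batch_size=25)
print(json.dumps(written))
```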


Diffs (updated)
---

  
  graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 4acb371f1
  intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 0b3ede93f
  intg/src/main/java/org/apache/atlas/pc/WorkItemConsumer.java 9ba4bf4e3
  intg/src/main/java/org/apache/atlas/pc/WorkItemManager.java a7ba67cb0
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java bbe0dc5ba
  repository/src/main/java/org/apache/atlas/repository/impexp/AuditsWriter.java 55990f780
  repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java 928c70dba
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java 25284e92f
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 54c32c5e8


Diff: https://reviews.apache.org/r/71025/diff/4/

Changes: https://reviews.apache.org/r/71025/diff/3-4/


Testing
---

**Unit tests**
Existing tests.

**Functional tests**
- Verified import for pre-1.0 and post-1.0 exported ZIP files.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1292

**Volume tests**
- Measured import performance with large data sets.

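+----------+--------+--------+------------------------+
| File     | Before | After  | Configuration          |
+----------+--------+--------+------------------------+
| smalldb  | 6 min  | 2 min  | Shards: 4, Threads: 8  |
| (2.2 MB) |        |        |                        |
+----------+--------+--------+------------------------+
| largedb  | 3 hr   | 10 min | Shards: 4, Threads: 16 |
| (40 MB)  |        |        |                        |
+----------+--------+--------+------------------------+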
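
For reference, the speedup implied by the table can be worked out directly. The sketch below only restates the reported times in minutes; `speedup` is an illustrative helper, not part of the patch.

```python
def speedup(before_min, after_min):
    """Ratio of import time before the change to import time after it."""
    return before_min / after_min

results = {
    "smalldb": speedup(6, 2),          # 6 min -> 2 min
    "largedb": speedup(3 * 60, 10),    # 3 hr -> 10 min
}
for name, factor in results.items():
    print(f"{name}: {factor:.0f}x faster")
# prints:
# smalldb: 3x faster
# largedb: 18x faster
```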


Thanks,

Ashutosh Mestry



[jira] [Commented] (ATLAS-3594) Invalid Swagger Specifications in Atlas Swagger.json

2020-01-23 Thread Siddharth Singh (Jira)


[ https://issues.apache.org/jira/browse/ATLAS-3594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17022076#comment-17022076 ]

Siddharth Singh commented on ATLAS-3594:


Errors corrected:
 * Invalid default values in schemas
 * Unnecessary escape \/ in paths
 * HTML escape characters in descriptions
 * Conflicting operation IDs

[^apacheAtlas.json]
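
One of these checks, finding conflicting operation IDs, can be sketched in a few lines. Swagger 2.0 requires each `operationId` to be unique across all operations in the spec. The sample spec and its operation IDs below are made up for illustration and are not taken from the Atlas swagger.json, and `conflicting_operation_ids` is a hypothetical helper.

```python
from collections import defaultdict

# Keys of a Swagger 2.0 path item that denote operations.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch"}

def conflicting_operation_ids(spec):
    """Return {operationId: [operations using it]} for IDs used more than once."""
    seen = defaultdict(list)
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method in HTTP_METHODS and "operationId" in op:
                seen[op["operationId"]].append(f"{method.upper()} {path}")
    return {op_id: uses for op_id, uses in seen.items() if len(uses) > 1}

# Hypothetical spec fragment with a deliberate clash.
sample_spec = {
    "swagger": "2.0",
    "paths": {
        "/v2/glossary": {
            "get": {"operationId": "getGlossaries"},
            "post": {"operationId": "createGlossary"},
        },
        "/v2/search/saved": {
            "get": {"operationId": "getGlossaries"},
        },
    },
}
print(conflicting_operation_ids(sample_spec))
```

To run it against a real file, load the spec with `json.load` and pass the resulting dict to the function.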

> Invalid Swagger Specifications in Atlas Swagger.json
> 
>
> Key: ATLAS-3594
> URL: https://issues.apache.org/jira/browse/ATLAS-3594
> Project: Atlas
> Issue Type: Bug
> Components: atlas-core
> Affects Versions: 2.0.0
> Environment: Ubuntu
> Reporter: Siddharth Singh
> Priority: Major
> Labels: rest, swagger
> Attachments: apacheAtlas.json
>
>
> The Swagger specifications provided at 
> [https://atlas.apache.org/api/v2/ui/swagger.json] are invalid, so a client 
> cannot be generated from them.
> The schema is invalid.
>  
> Steps to reproduce:
>  * Download the Apache Atlas Swagger specs
>  * Open any Swagger code-generation utility and generate a client from the specs
>  
> Sample output from go-swagger on parsing the file:
> {code:java}
> swagger generate client -f api/apacheAtlas.json >> output.txt
> 2020/01/22 16:54:01 validating spec api/apacheAtlas.json
> The swagger spec at "api/apacheAtlas.json" is invalid against swagger specification 2.0. see errors :
> - "paths./v2/entity/uniqueAttribute/type/{typeName}/classifications.post.parameters" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/entity/uniqueAttribute/type/{typeName}/classifications.post.parameters.in in body should be one of [header]
> - "paths./v2/entity/uniqueAttribute/type/{typeName}/classifications.put.parameters" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/entity/uniqueAttribute/type/{typeName}/classifications.put.parameters.in in body should be one of [header]
> - "paths./v2/glossary.get.responses.200" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/glossary.get.responses.200.example in body is a forbidden property
> - "paths./v2/glossary.post.responses.200" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/glossary.post.responses.200.example in body is a forbidden property
> - "paths./v2/glossary.post.responses.400" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/glossary.post.responses.400.example in body is a forbidden property
> - "paths./v2/glossary.post.responses.409" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/glossary.post.responses.409.example in body is a forbidden property
> - "paths./v2/search/saved.post.responses.201" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/search/saved.post.responses.201.example in body is a forbidden property
> - "paths./v2/search/saved.put.responses.204" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/search/saved.put.responses.204.example in body is a forbidden property
> - "paths./v2/search/saved.get.responses.200" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/search/saved.get.responses.200.example in body is a forbidden property
> - "paths./v2/types/entitydef/guid/{guid}.get.parameters" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/types/entitydef/guid/{guid}.get.parameters.in in body should be one of [header]
> - "paths./v2/types/entitydef/guid/{guid}.get.responses.200" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/types/entitydef/guid/{guid}.get.responses.200.example in body is a forbidden property
> - "paths./v2/types/entitydef/guid/{guid}.get.responses.404" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/types/entitydef/guid/{guid}.get.responses.404.example in body is a forbidden property
> - "paths./v2/entity/guid/{guid}/classifications.post.parameters" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/entity/guid/{guid}/classifications.post.parameters.in in body should be one of [header]
> - "paths./v2/entity/guid/{guid}/classifications.put.parameters" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/entity/guid/{guid}/classifications.put.parameters.in in body should be one of [header]
> - "paths./v2/entity/guid/{guid}/classifications.get.parameters" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/entity/guid/{guid}/classifications.get.parameters.in in body should be one of [header]
> - "paths./v2/entity/guid/{guid}/classifications.get

[jira] [Updated] (ATLAS-3594) Invalid Swagger Specifications in Atlas Swagger.json

2020-01-23 Thread Siddharth Singh (Jira)


 [ https://issues.apache.org/jira/browse/ATLAS-3594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Singh updated ATLAS-3594:
---
Attachment: apacheAtlas.json

> Invalid Swagger Specifications in Atlas Swagger.json
> 
>
> Key: ATLAS-3594
> URL: https://issues.apache.org/jira/browse/ATLAS-3594
> Project: Atlas
> Issue Type: Bug
> Components: atlas-core
> Affects Versions: 2.0.0
> Environment: Ubuntu
> Reporter: Siddharth Singh
> Priority: Major
> Labels: rest, swagger
> Attachments: apacheAtlas.json, apacheAtlas.json