Re: Review Request 71025: Import Service: Support Concurrent Ingest
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/
---

(Updated Jan. 23, 2020, 5:30 p.m.)

Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.

Changes
---

Updates include:
- Rebased with latest version.
- Added new method that uses batch size to commit.
- Refactoring.

Bugs: ATLAS-3320
    https://issues.apache.org/jira/browse/ATLAS-3320

Repository: atlas

Description
---

**Approach**
- Use existing producer-consumer (PC) framework.
- Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
- Add support for configuring number of workers and batch size within _AtlasImportRequest_.

_AtlasImportRequest_
```
{
    "options": {
        "numWorkers": 8,
        "batchSize": 25
    }
}
```

**CURL**
```
curl -v -X POST -u admin:admin \
    -H "Content-Type: multipart/form-data" \
    -H "Cache-Control: no-cache" \
    -F request=@./import-options.json \
    -F data=@./Default-3-pre.zip \
    http://localhost:21000/api/atlas/admin/import
```

Diffs (updated)
---

  graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 4acb371f1
  intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 0b3ede93f
  intg/src/main/java/org/apache/atlas/pc/WorkItemConsumer.java 9ba4bf4e3
  intg/src/main/java/org/apache/atlas/pc/WorkItemManager.java a7ba67cb0
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java bbe0dc5ba
  repository/src/main/java/org/apache/atlas/repository/impexp/AuditsWriter.java 55990f780
  repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java 928c70dba
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java 25284e92f
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 54c32c5e8

Diff: https://reviews.apache.org/r/71025/diff/4/

Changes: https://reviews.apache.org/r/71025/diff/3-4/

Testing
---

**Unit tests**
Existing tests.

**Functional tests**
- Verified import for pre-1.0 and post-1.0 exported ZIP files.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1292

**Volume tests**
- Measure performance with large data.

+-----------+--------+---------+------------------------+
| File      | Before | After   | Configuration          |
+-----------+--------+---------+------------------------+
| smalldb   | 6 min  | 2 min   | Shards: 4, Threads: 8  |
| (2.2 MB)  |        |         |                        |
+-----------+--------+---------+------------------------+
| largedb   | 3 hrs  | 10 mins | Shards: 4, Threads: 16 |
| (40 MB)   |        |         |                        |
+-----------+--------+---------+------------------------+

Thanks,

Ashutosh Mestry
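
For readers unfamiliar with the pattern, the approach described in this review (a fixed pool of workers consuming queued items, each committing once a batch fills up) can be illustrated with a small, language-neutral sketch. This is not the Atlas implementation: `run_import`, `commit`, and the in-memory `commits` list are hypothetical stand-ins, and only the `num_workers` and `batch_size` parameters mirror the `numWorkers` and `batchSize` options shown in the request.

```python
import queue
import threading

def run_import(entities, num_workers=8, batch_size=25):
    """Fan entities out to worker threads; each worker commits per full batch.

    Returns the size of every commit, in the order the commits happened.
    Hypothetical sketch: a real importer would write to the graph store.
    """
    work = queue.Queue()
    commits = []                 # stand-in for the transaction log
    lock = threading.Lock()
    sentinel = object()          # tells a worker that no more items will come

    def commit(batch):
        # Stand-in for a graph transaction commit.
        with lock:
            commits.append(len(batch))

    def worker():
        batch = []
        while True:
            item = work.get()
            if item is sentinel:
                break
            batch.append(item)
            if len(batch) >= batch_size:
                commit(batch)
                batch = []
        if batch:                # flush the final, partially filled batch
            commit(batch)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for entity in entities:      # producer side
        work.put(entity)
    for _ in threads:            # one sentinel per worker
        work.put(sentinel)
    for t in threads:
        t.join()
    return commits

if __name__ == "__main__":
    sizes = run_import(range(1000), num_workers=8, batch_size=25)
    print(len(sizes), "commits covering", sum(sizes), "entities")
```

Committing per batch rather than per entity keeps transactions short while amortizing commit overhead, which is presumably the motivation for exposing the batch size as an option.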
[jira] [Commented] (ATLAS-3594) Invalid Swagger Specifications in Atlas Swagger.json
    [ https://issues.apache.org/jira/browse/ATLAS-3594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17022076#comment-17022076 ]

Siddharth Singh commented on ATLAS-3594:

Errors corrected:
 * Invalid default values in schemas
 * Unnecessary escape \/ in paths
 * HTML escape characters in descriptions
 * Conflicting operation IDs

[^apacheAtlas.json]

> Invalid Swagger Specifications in Atlas Swagger.json
>
>
>                 Key: ATLAS-3594
>                 URL: https://issues.apache.org/jira/browse/ATLAS-3594
>             Project: Atlas
>          Issue Type: Bug
>          Components: atlas-core
>    Affects Versions: 2.0.0
>         Environment: Ubuntu
>            Reporter: Siddharth Singh
>            Priority: Major
>              Labels: rest, swagger
>         Attachments: apacheAtlas.json
>
>
> The Swagger specification provided at [https://atlas.apache.org/api/v2/ui/swagger.json] is invalid, and a client cannot be generated from it.
> The schema appears to be invalid.
>
> Steps to reproduce:
>  * Download the Apache Atlas Swagger specs.
>  * Open any Swagger code-generation utility and generate a client from the specs.
>
> Sample output from go-swagger on parsing the file:
> {code:java}
> swagger generate client -f api/apacheAtlas.json >> output.txt
> 2020/01/22 16:54:01 validating spec api/apacheAtlas.json
> The swagger spec at "api/apacheAtlas.json" is invalid against swagger specification 2.0. see errors :
> - "paths./v2/entity/uniqueAttribute/type/{typeName}/classifications.post.parameters" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/entity/uniqueAttribute/type/{typeName}/classifications.post.parameters.in in body should be one of [header]
> - "paths./v2/entity/uniqueAttribute/type/{typeName}/classifications.put.parameters" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/entity/uniqueAttribute/type/{typeName}/classifications.put.parameters.in in body should be one of [header]
> - "paths./v2/glossary.get.responses.200" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/glossary.get.responses.200.example in body is a forbidden property
> - "paths./v2/glossary.post.responses.200" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/glossary.post.responses.200.example in body is a forbidden property
> - "paths./v2/glossary.post.responses.400" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/glossary.post.responses.400.example in body is a forbidden property
> - "paths./v2/glossary.post.responses.409" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/glossary.post.responses.409.example in body is a forbidden property
> - "paths./v2/search/saved.post.responses.201" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/search/saved.post.responses.201.example in body is a forbidden property
> - "paths./v2/search/saved.put.responses.204" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/search/saved.put.responses.204.example in body is a forbidden property
> - "paths./v2/search/saved.get.responses.200" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/search/saved.get.responses.200.example in body is a forbidden property
> - "paths./v2/types/entitydef/guid/{guid}.get.parameters" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/types/entitydef/guid/{guid}.get.parameters.in in body should be one of [header]
> - "paths./v2/types/entitydef/guid/{guid}.get.responses.200" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/types/entitydef/guid/{guid}.get.responses.200.example in body is a forbidden property
> - "paths./v2/types/entitydef/guid/{guid}.get.responses.404" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/types/entitydef/guid/{guid}.get.responses.404.example in body is a forbidden property
> - "paths./v2/entity/guid/{guid}/classifications.post.parameters" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/entity/guid/{guid}/classifications.post.parameters.in in body should be one of [header]
> - "paths./v2/entity/guid/{guid}/classifications.put.parameters" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/entity/guid/{guid}/classifications.put.parameters.in in body should be one of [header]
> - "paths./v2/entity/guid/{guid}/classifications.get.parameters" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/entity/guid/{guid}/classifications.get.parameters.in in body should be one of [header]
> - "paths./v2/entity/guid/{guid}/classifications.get
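
Two of the error classes mentioned in this message, conflicting operation IDs and the forbidden `example` property on response objects, can be detected with a few lines of script before running a code generator. The following is a minimal sketch rather than a full Swagger 2.0 validator: `find_spec_problems` is a hypothetical helper, and the sample spec with its `getGlossaries` operation ID is made up for illustration. It relies only on two rules of the Swagger 2.0 specification: `operationId` values must be unique across operations, and a response object carries `examples`, not `example`.

```python
from collections import Counter

HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch"}

def find_spec_problems(spec):
    """Return a list of human-readable problems found in a Swagger 2.0 dict."""
    problems = []
    op_ids = Counter()
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            # Skip path-level keys such as "parameters" or "$ref".
            if method not in HTTP_METHODS or not isinstance(op, dict):
                continue
            if "operationId" in op:
                op_ids[op["operationId"]] += 1
            for status, resp in op.get("responses", {}).items():
                if isinstance(resp, dict) and "example" in resp:
                    problems.append(
                        f"paths.{path}.{method}.responses.{status}.example is not allowed"
                    )
    for op_id, count in sorted(op_ids.items()):
        if count > 1:
            problems.append(f"operationId {op_id!r} is used {count} times")
    return problems

# Made-up sample that reproduces both problems.
SAMPLE_SPEC = {
    "swagger": "2.0",
    "paths": {
        "/v2/glossary": {
            "get": {
                "operationId": "getGlossaries",
                "responses": {"200": {"description": "ok", "example": {}}},
            },
            "post": {
                "operationId": "getGlossaries",
                "responses": {"200": {"description": "ok"}},
            },
        }
    },
}

if __name__ == "__main__":
    for problem in find_spec_problems(SAMPLE_SPEC):
        print(problem)
```

To check a downloaded file, load it with `json.load` and pass the resulting dict to `find_spec_problems`.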
[jira] [Updated] (ATLAS-3594) Invalid Swagger Specifications in Atlas Swagger.json
    [ https://issues.apache.org/jira/browse/ATLAS-3594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Singh updated ATLAS-3594:
---
    Attachment: apacheAtlas.json

> Invalid Swagger Specifications in Atlas Swagger.json
>
>
>                 Key: ATLAS-3594
>                 URL: https://issues.apache.org/jira/browse/ATLAS-3594
>             Project: Atlas
>          Issue Type: Bug
>          Components: atlas-core
>    Affects Versions: 2.0.0
>         Environment: Ubuntu
>            Reporter: Siddharth Singh
>            Priority: Major
>              Labels: rest, swagger
>         Attachments: apacheAtlas.json, apacheAtlas.json
>
>
> Swagger specification provided in
> [https://atlas.apache.org/api/v2/ui/swagger.json] are invalid and one cannot
> generate a client from them.
> There happens to be invalid schema.
>
> Steps to reproduce:
>  * Download Apache Atlas Swagger Specs
>  * Open any Swagger Code generating Utility and Generate a Client for the same
>
> Sample output from go-swagger on parsing the file
> {code:java}
> swagger generate client -f api/apacheAtlas.json >> output.txt
> 2020/01/22 16:54:01 validating spec api/apacheAtlas.json
> The swagger spec at "api/apacheAtlas.json" is invalid against swagger specification 2.0. see errors :
> - "paths./v2/entity/uniqueAttribute/type/{typeName}/classifications.post.parameters" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/entity/uniqueAttribute/type/{typeName}/classifications.post.parameters.in in body should be one of [header]
> - "paths./v2/entity/uniqueAttribute/type/{typeName}/classifications.put.parameters" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/entity/uniqueAttribute/type/{typeName}/classifications.put.parameters.in in body should be one of [header]
> - "paths./v2/glossary.get.responses.200" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/glossary.get.responses.200.example in body is a forbidden property
> - "paths./v2/glossary.post.responses.200" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/glossary.post.responses.200.example in body is a forbidden property
> - "paths./v2/glossary.post.responses.400" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/glossary.post.responses.400.example in body is a forbidden property
> - "paths./v2/glossary.post.responses.409" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/glossary.post.responses.409.example in body is a forbidden property
> - "paths./v2/search/saved.post.responses.201" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/search/saved.post.responses.201.example in body is a forbidden property
> - "paths./v2/search/saved.put.responses.204" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/search/saved.put.responses.204.example in body is a forbidden property
> - "paths./v2/search/saved.get.responses.200" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/search/saved.get.responses.200.example in body is a forbidden property
> - "paths./v2/types/entitydef/guid/{guid}.get.parameters" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/types/entitydef/guid/{guid}.get.parameters.in in body should be one of [header]
> - "paths./v2/types/entitydef/guid/{guid}.get.responses.200" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/types/entitydef/guid/{guid}.get.responses.200.example in body is a forbidden property
> - "paths./v2/types/entitydef/guid/{guid}.get.responses.404" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/types/entitydef/guid/{guid}.get.responses.404.example in body is a forbidden property
> - "paths./v2/entity/guid/{guid}/classifications.post.parameters" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/entity/guid/{guid}/classifications.post.parameters.in in body should be one of [header]
> - "paths./v2/entity/guid/{guid}/classifications.put.parameters" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/entity/guid/{guid}/classifications.put.parameters.in in body should be one of [header]
> - "paths./v2/entity/guid/{guid}/classifications.get.parameters" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/entity/guid/{guid}/classifications.get.parameters.in in body should be one of [header]
> - "paths./v2/entity/guid/{guid}/classifications.get.responses.200" must validate one and only one schema (oneOf). Found none valid
> - paths./v2/entity/guid/{guid}/classifications.get.responses.200.example in body is a forbidden pro