Toshiki Fukasawa created ATLAS-4769:
---------------------------------------
Summary: Duplicate Relationships
Key: ATLAS-4769
URL: https://issues.apache.org/jira/browse/ATLAS-4769
Project: Atlas
Issue Type: Bug
Components: atlas-core
Affects Versions: 2.3.0, 2.2.0
Environment: Apache Atlas Python client: 0.0.11
Operating System: Rocky Linux 8.5
Reporter: Toshiki Fukasawa
Attachments: simple-test.py
When entities with the same qualifiedName are registered multiple times with
the same relationship, unexpected behavior occurs. Specifically, even though
only one entity is registered per qualifiedName, the relationshipAttributes
section contains multiple instances of the same entity. The inconsistency
arises when different entities are registered between the repeated
registrations of the same entity.
Expected Behavior:
The Relationships section should accurately reflect the number of entities
registered with the specified qualifiedName and relationship. Duplicate
registrations should not produce multiple records of the same entity in the
relationshipAttributes section.
Steps to Reproduce:
# Register the "dataA" entity with the qualified name "dataA_q" as a
relationship of the Process entity.
# Register the "dataB" entity with the qualified name "dataB_q" as a
relationship of the Process entity.
# Register the "dataC" entity with the qualified name "dataC_q" as a
relationship of the Process entity.
# Register the "dataB" entity again, with the same qualifiedName, as a
relationship of the Process entity.
# Register the "dataC" entity again, with the same qualifiedName, as a
relationship of the Process entity.
# Observe the Relationships section.
In version 2.3.0, duplicate relationships also occur when the registration
order is A->B->A->B.
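As a rough illustration of the registration order in the steps above, the following sketch builds the request bodies without sending them. It is not the attached script. It assumes that each step updates the Process entity so that its "inputs" relationship attribute points at a single DataSet, referenced by qualifiedName, and that the DataSet entities already exist. The helper names, the Process qualifiedName "process_q" and the Process name are hypothetical.

```python
import json

# Hypothetical helper: reference an existing DataSet by its qualifiedName.
def dataset_ref(qualified_name):
    return {
        "typeName": "DataSet",
        "uniqueAttributes": {"qualifiedName": qualified_name},
    }

# Hypothetical helper: body for one registration of the Process entity
# with a single input. "process_q" is an assumed qualifiedName.
def process_payload(input_qualified_name, process_qualified_name="process_q"):
    return {
        "entity": {
            "typeName": "Process",
            "attributes": {
                "qualifiedName": process_qualified_name,
                "name": "process",
            },
            "relationshipAttributes": {
                "inputs": [dataset_ref(input_qualified_name)],
            },
        }
    }

# Order from the steps to reproduce: A, B, C, then B and C again.
REGISTRATION_ORDER = ["dataA_q", "dataB_q", "dataC_q", "dataB_q", "dataC_q"]

payloads = [process_payload(q) for q in REGISTRATION_ORDER]

if __name__ == "__main__":
    # Print the bodies only; sending them to an Atlas server is left out.
    for body in payloads:
        print(json.dumps(body))
```

Each body would be submitted in sequence to the entity create-or-update endpoint of a running Atlas instance. The sketch leaves that step out, so it runs without a server.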
Reproducible program:
The attached program, simple-test.py, demonstrates the issue.
Running it produces:
{noformat}
Recorded DataSet Entities
{"typeName": "DataSet", "attributes": {"qualifiedName": "dataA_q", "name":
"dataA"}, "guid": "471398f8-679b-4015-a60f-28bd3b4e315f", "status": "ACTIVE",
"displayText": "dataA", "classificationNames": [], "classifications": [],
"meaningNames": [], "meanings": null, "isIncomplete": false, "labels": []}
{"typeName": "DataSet", "attributes": {"qualifiedName": "dataB_q", "name":
"dataB"}, "guid": "bb6e6d93-38d8-4ae5-ac1d-3e5be880401e", "status": "ACTIVE",
"displayText": "dataB", "classificationNames": [], "classifications": [],
"meaningNames": [], "meanings": null, "isIncomplete": false, "labels": []}
{"typeName": "DataSet", "attributes": {"qualifiedName": "dataC_q", "name":
"dataC"}, "guid": "1026e5c9-acab-42a4-a4fe-ea6f08e4c81f", "status": "ACTIVE",
"displayText": "dataC", "classificationNames": [], "classifications": [],
"meaningNames": [], "meanings": null, "isIncomplete": false, "labels": []}
Recorded relationshipAttributes of the Process
{'guid': '471398f8-679b-4015-a60f-28bd3b4e315f', 'typeName': 'DataSet',
'entityStatus': 'ACTIVE', 'displayText': 'dataA', 'relationshipType':
'dataset_process_inputs', 'relationshipGuid':
'4e270e39-f592-473f-af5a-4976f7252428', 'relationshipStatus': 'DELETED',
'relationshipAttributes': {'typeName': 'dataset_process_inputs'}}
{'guid': 'bb6e6d93-38d8-4ae5-ac1d-3e5be880401e', 'typeName': 'DataSet',
'entityStatus': 'ACTIVE', 'displayText': 'dataB', 'relationshipType':
'dataset_process_inputs', 'relationshipGuid':
'bbe2c759-fc53-42f9-8d7a-6c2966ba25e1', 'relationshipStatus': 'DELETED',
'relationshipAttributes': {'typeName': 'dataset_process_inputs'}}
{'guid': 'bb6e6d93-38d8-4ae5-ac1d-3e5be880401e', 'typeName': 'DataSet',
'entityStatus': 'ACTIVE', 'displayText': 'dataB', 'relationshipType':
'dataset_process_inputs', 'relationshipGuid':
'b71cb1f3-7cc5-4b53-a35f-659d73b97c15', 'relationshipStatus': 'DELETED',
'relationshipAttributes': {'typeName': 'dataset_process_inputs'}}
{'guid': '1026e5c9-acab-42a4-a4fe-ea6f08e4c81f', 'typeName': 'DataSet',
'entityStatus': 'ACTIVE', 'displayText': 'dataC', 'relationshipType':
'dataset_process_inputs', 'relationshipGuid':
'cae464a4-50bc-4acb-b5e2-73bb9b2b9f75', 'relationshipStatus': 'DELETED',
'relationshipAttributes': {'typeName': 'dataset_process_inputs'}}
{'guid': '1026e5c9-acab-42a4-a4fe-ea6f08e4c81f', 'typeName': 'DataSet',
'entityStatus': 'ACTIVE', 'displayText': 'dataC', 'relationshipType':
'dataset_process_inputs', 'relationshipGuid':
'5323ffcb-b996-4437-83fb-810d304fb45b', 'relationshipStatus': 'ACTIVE',
'relationshipAttributes': {'typeName': 'dataset_process_inputs'}}
{noformat}
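The duplicates in the output above can be checked mechanically. The sketch below is an illustration, not part of simple-test.py. It groups the relationship records by entity guid and relationshipType and reports every group that has more than one relationshipGuid. The function name and the optional status filter are hypothetical. The records are copied from the output above, reduced to the fields used here.

```python
from collections import defaultdict

# Hypothetical helper: map (entity guid, relationshipType) to the list of
# relationshipGuids, keeping only groups with more than one relationship.
def find_duplicate_relationships(relationship_attrs, status=None):
    groups = defaultdict(list)
    for rel in relationship_attrs:
        # Optionally restrict the check to one relationshipStatus.
        if status is not None and rel.get("relationshipStatus") != status:
            continue
        key = (rel["guid"], rel["relationshipType"])
        groups[key].append(rel["relationshipGuid"])
    return {key: guids for key, guids in groups.items() if len(guids) > 1}

# Records taken from the reported output (subset of fields).
records = [
    {"guid": "471398f8-679b-4015-a60f-28bd3b4e315f", "displayText": "dataA",
     "relationshipType": "dataset_process_inputs",
     "relationshipGuid": "4e270e39-f592-473f-af5a-4976f7252428",
     "relationshipStatus": "DELETED"},
    {"guid": "bb6e6d93-38d8-4ae5-ac1d-3e5be880401e", "displayText": "dataB",
     "relationshipType": "dataset_process_inputs",
     "relationshipGuid": "bbe2c759-fc53-42f9-8d7a-6c2966ba25e1",
     "relationshipStatus": "DELETED"},
    {"guid": "bb6e6d93-38d8-4ae5-ac1d-3e5be880401e", "displayText": "dataB",
     "relationshipType": "dataset_process_inputs",
     "relationshipGuid": "b71cb1f3-7cc5-4b53-a35f-659d73b97c15",
     "relationshipStatus": "DELETED"},
    {"guid": "1026e5c9-acab-42a4-a4fe-ea6f08e4c81f", "displayText": "dataC",
     "relationshipType": "dataset_process_inputs",
     "relationshipGuid": "cae464a4-50bc-4acb-b5e2-73bb9b2b9f75",
     "relationshipStatus": "DELETED"},
    {"guid": "1026e5c9-acab-42a4-a4fe-ea6f08e4c81f", "displayText": "dataC",
     "relationshipType": "dataset_process_inputs",
     "relationshipGuid": "5323ffcb-b996-4437-83fb-810d304fb45b",
     "relationshipStatus": "ACTIVE"},
]

dups = find_duplicate_relationships(records)

if __name__ == "__main__":
    for (guid, rel_type), rel_guids in dups.items():
        print(guid, rel_type, len(rel_guids))
```

On these records the check reports two relationships each for dataB and dataC. Four of the five records have relationshipStatus DELETED, so calling the function with status="ACTIVE" returns no duplicate groups for this output.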
Thank you for your attention to this matter.