Toshiki Fukasawa created ATLAS-4769:
---------------------------------------

             Summary: Duplicate Relationships
                 Key: ATLAS-4769
                 URL: https://issues.apache.org/jira/browse/ATLAS-4769
             Project: Atlas
          Issue Type: Bug
          Components:  atlas-core
    Affects Versions: 2.3.0, 2.2.0
         Environment: Apache Atlas Python client: 0.0.11
Operating System: Rocky Linux 8.5
            Reporter: Toshiki Fukasawa
         Attachments: simple-test.py

When entities with the same qualifiedName are registered with the same 
relationship multiple times, the relationship is duplicated. Specifically, even 
though only one entity is registered for a given qualifiedName, the 
relationshipAttributes section lists that entity multiple times. The duplication 
occurs when different entities are registered in between the repeated 
registrations of the same entity.

Expected Behavior:
The Relationships section should accurately reflect the number of entities 
registered with the specified qualifiedName and relationship. Duplicate 
registrations should not result in multiple instances of the same entity being 
recorded in the relationshipAttributes section.

Steps to Reproduce:
 # Register the "dataA" entity with the qualified name "dataA_q" as a 
relationship of the Process entity.
 # Register the "dataB" entity with the qualified name "dataB_q" as a 
relationship of the Process entity.
 # Register the "dataC" entity with the qualified name "dataC_q" as a 
relationship of the Process entity.
 # Register the "dataB" entity again, with the same qualifiedName, as a 
relationship of the Process entity.
 # Register the "dataC" entity again, with the same qualifiedName, as a 
relationship of the Process entity.
 # Observe the Relationships section.

In version 2.3.0, duplicate relationships occur even when the registration 
order is A->B->A->B.
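
The registration order in the steps above can be sketched as a sequence of 
entity payloads. This is only an illustration and is not the attached 
simple-test.py. It assumes that each step re-registers a single Process entity 
whose "inputs" relationship points at one DataSet, referenced by qualifiedName. 
The Process name and qualifiedName ("process", "process_q") and the helper 
build_process_payload are hypothetical placeholders, and no request is sent.

```python
# Hypothetical sketch; this is not the attached simple-test.py.
# Assumption: each step re-registers one Process entity whose "inputs"
# relationship points at a DataSet identified by its qualifiedName.

REGISTRATION_ORDER = ["dataA_q", "dataB_q", "dataC_q", "dataB_q", "dataC_q"]


def build_process_payload(dataset_qualified_name):
    """Build an entity payload for a Process with a single DataSet input."""
    return {
        "entity": {
            "typeName": "Process",
            "attributes": {
                # Placeholder values; the report does not name the Process.
                "qualifiedName": "process_q",
                "name": "process",
            },
            "relationshipAttributes": {
                "inputs": [
                    {
                        "typeName": "DataSet",
                        "uniqueAttributes": {
                            "qualifiedName": dataset_qualified_name
                        },
                    }
                ]
            },
        }
    }


# One payload per step, in the order given in "Steps to Reproduce".
payloads = [build_process_payload(name) for name in REGISTRATION_ORDER]
```

Steps 4 and 5 produce payloads identical to steps 2 and 3, which is why a 
single relationship per DataSet would be the expected outcome.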

Reproducible program:
I have created a program that demonstrates the issue:
simple-test.py

Running this will result in:
{noformat}
Recorded DataSet Entities
{"typeName": "DataSet", "attributes": {"qualifiedName": "dataA_q", "name": 
"dataA"}, "guid": "471398f8-679b-4015-a60f-28bd3b4e315f", "status": "ACTIVE", 
"displayText": "dataA", "classificationNames": [], "classifications": [], 
"meaningNames": [], "meanings": null, "isIncomplete": false, "labels": []}
{"typeName": "DataSet", "attributes": {"qualifiedName": "dataB_q", "name": 
"dataB"}, "guid": "bb6e6d93-38d8-4ae5-ac1d-3e5be880401e", "status": "ACTIVE", 
"displayText": "dataB", "classificationNames": [], "classifications": [], 
"meaningNames": [], "meanings": null, "isIncomplete": false, "labels": []}
{"typeName": "DataSet", "attributes": {"qualifiedName": "dataC_q", "name": 
"dataC"}, "guid": "1026e5c9-acab-42a4-a4fe-ea6f08e4c81f", "status": "ACTIVE", 
"displayText": "dataC", "classificationNames": [], "classifications": [], 
"meaningNames": [], "meanings": null, "isIncomplete": false, "labels": []}

Recorded relationshipAttributes of the Process
{'guid': '471398f8-679b-4015-a60f-28bd3b4e315f', 'typeName': 'DataSet', 
'entityStatus': 'ACTIVE', 'displayText': 'dataA', 'relationshipType': 
'dataset_process_inputs', 'relationshipGuid': 
'4e270e39-f592-473f-af5a-4976f7252428', 'relationshipStatus': 'DELETED', 
'relationshipAttributes': {'typeName': 'dataset_process_inputs'}}
{'guid': 'bb6e6d93-38d8-4ae5-ac1d-3e5be880401e', 'typeName': 'DataSet', 
'entityStatus': 'ACTIVE', 'displayText': 'dataB', 'relationshipType': 
'dataset_process_inputs', 'relationshipGuid': 
'bbe2c759-fc53-42f9-8d7a-6c2966ba25e1', 'relationshipStatus': 'DELETED', 
'relationshipAttributes': {'typeName': 'dataset_process_inputs'}}
{'guid': 'bb6e6d93-38d8-4ae5-ac1d-3e5be880401e', 'typeName': 'DataSet', 
'entityStatus': 'ACTIVE', 'displayText': 'dataB', 'relationshipType': 
'dataset_process_inputs', 'relationshipGuid': 
'b71cb1f3-7cc5-4b53-a35f-659d73b97c15', 'relationshipStatus': 'DELETED', 
'relationshipAttributes': {'typeName': 'dataset_process_inputs'}}
{'guid': '1026e5c9-acab-42a4-a4fe-ea6f08e4c81f', 'typeName': 'DataSet', 
'entityStatus': 'ACTIVE', 'displayText': 'dataC', 'relationshipType': 
'dataset_process_inputs', 'relationshipGuid': 
'cae464a4-50bc-4acb-b5e2-73bb9b2b9f75', 'relationshipStatus': 'DELETED', 
'relationshipAttributes': {'typeName': 'dataset_process_inputs'}}
{'guid': '1026e5c9-acab-42a4-a4fe-ea6f08e4c81f', 'typeName': 'DataSet', 
'entityStatus': 'ACTIVE', 'displayText': 'dataC', 'relationshipType': 
'dataset_process_inputs', 'relationshipGuid': 
'5323ffcb-b996-4437-83fb-810d304fb45b', 'relationshipStatus': 'ACTIVE', 
'relationshipAttributes': {'typeName': 'dataset_process_inputs'}}
{noformat}
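
One way to spot the duplication in output like the above is to group the 
relationship entries by the guid of the related entity. The following is a 
minimal sketch; the helper find_duplicate_relationships is hypothetical and is 
not part of the Atlas client, and the entries are abbreviated copies of the 
output above (guid, displayText, and relationshipStatus only).

```python
from collections import defaultdict


def find_duplicate_relationships(related):
    """Return {entity guid: [relationshipStatus, ...]} for every entity
    that appears more than once in the given relationship entries."""
    statuses_by_guid = defaultdict(list)
    for item in related:
        statuses_by_guid[item["guid"]].append(item["relationshipStatus"])
    return {
        guid: statuses
        for guid, statuses in statuses_by_guid.items()
        if len(statuses) > 1
    }


# Abbreviated from the recorded relationshipAttributes of the Process.
recorded_inputs = [
    {"guid": "471398f8-679b-4015-a60f-28bd3b4e315f",
     "displayText": "dataA", "relationshipStatus": "DELETED"},
    {"guid": "bb6e6d93-38d8-4ae5-ac1d-3e5be880401e",
     "displayText": "dataB", "relationshipStatus": "DELETED"},
    {"guid": "bb6e6d93-38d8-4ae5-ac1d-3e5be880401e",
     "displayText": "dataB", "relationshipStatus": "DELETED"},
    {"guid": "1026e5c9-acab-42a4-a4fe-ea6f08e4c81f",
     "displayText": "dataC", "relationshipStatus": "DELETED"},
    {"guid": "1026e5c9-acab-42a4-a4fe-ea6f08e4c81f",
     "displayText": "dataC", "relationshipStatus": "ACTIVE"},
]

duplicates = find_duplicate_relationships(recorded_inputs)
for guid, statuses in duplicates.items():
    print(guid, statuses)
```

For the output above, this reports dataB (two DELETED relationships) and dataC 
(one DELETED and one ACTIVE relationship), while dataA appears only once.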
Thank you for your attention to this matter.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
