[jira] [Commented] (ATLAS-122) Support for Deletion of Entities
[ https://issues.apache.org/jira/browse/ATLAS-122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15082600#comment-15082600 ]

Shwetha G S commented on ATLAS-122:

[~dkantor], can you add the patch to ATLAS-370 and also upload it to Review Board? I will take a look. Thanks.

> Support for Deletion of Entities
>
> Key: ATLAS-122
> URL: https://issues.apache.org/jira/browse/ATLAS-122
> Project: Atlas
> Issue Type: New Feature
> Reporter: Suma Shivaprasad
> Assignee: David Kantor
> Attachments: ATLAS-370-proposed.patch, changes.tar

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (ATLAS-122) Support for Deletion of Entities
[ https://issues.apache.org/jira/browse/ATLAS-122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15081707#comment-15081707 ]

David Kantor commented on ATLAS-122:

[~suma.shivaprasad] On ATLAS-106, [~shwethags] suggested creating a Review Board request for the review of bigger patches. Should I create a Review Board request for the proposed entity deletion changes? Please advise, as I'd like to address any comments and submit an official patch, so that I can move on to the next sub-task, ATLAS-372, for exposing the entity delete operation in the REST API.
[jira] [Commented] (ATLAS-122) Support for Deletion of Entities
[ https://issues.apache.org/jira/browse/ATLAS-122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15064484#comment-15064484 ]

David Kantor commented on ATLAS-122:

[~suma.shivaprasad] I have merged my changes with the latest source from the trunk, and attached updated versions of the tar file and patch mentioned in my previous comment. Kindly review and comment on these proposed changes. I would like to address any comments and submit an official patch, so that I can move on to the next sub-task, ATLAS-372, for exposing the entity delete operation in the REST API. Thanks in advance for your help with this.
[jira] [Commented] (ATLAS-122) Support for Deletion of Entities
[ https://issues.apache.org/jira/browse/ATLAS-122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059404#comment-15059404 ]

David Kantor commented on ATLAS-122:

I had already started working on sub-task ATLAS-370 last week and have a working implementation which I'm in the process of testing. Deletes are cascaded through composite references. In doing so, I have addressed issues in the existing update code, which neglected to delete any structs or traits owned by the composite entities. In addition, deletes are cascaded down through n levels of composition; the existing code stopped at the first level. I have attached a tar file with the changes I have applied so far, and a patch file which shows the diffs.

I have initially exposed delete through MetadataRepository as a GUID-based operation, i.e.

List deleteEntities(String... guids)

The idea here is to avoid requiring client applications to retrieve the full entities they wish to delete. We could certainly add another entry point for an entity-based operation.

With regard to the hive model, it looks like HIVE_TABLE.partitionKeys is already defined as a composite reference in HiveDataModelGenerator.createTableClass(). Please let me know about other hive model changes that are needed.

> Support for Deletion of Entities
>
> Key: ATLAS-122
> URL: https://issues.apache.org/jira/browse/ATLAS-122
> Project: Atlas
> Issue Type: New Feature
> Reporter: Suma Shivaprasad
> Assignee: David Kantor
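The GUID-based cascade described in this comment could look roughly like the following. This is only a minimal in-memory sketch, not the attached patch: the class `CascadeDeleteSketch`, its `addEntity` and `contains` helpers, the map-based storage, and the `List<String>` return type are all assumptions made for illustration. Only the `deleteEntities(String... guids)` shape comes from the comment.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory stand-in; this is NOT the Atlas MetadataRepository. */
public class CascadeDeleteSketch {
    // Known entity GUIDs (assumed storage, for illustration only).
    private final Set<String> entities = new LinkedHashSet<>();
    // GUID -> GUIDs it owns through composite references (entities, structs, traits).
    private final Map<String, List<String>> compositeChildren = new HashMap<>();

    /** Registers an entity together with the GUIDs it owns by composition. */
    public void addEntity(String guid, String... ownedGuids) {
        entities.add(guid);
        entities.addAll(Arrays.asList(ownedGuids));
        compositeChildren.put(guid, Arrays.asList(ownedGuids));
    }

    /**
     * Deletes the given GUIDs and everything reachable from them through
     * composite references, at any depth. Returns the deleted GUIDs.
     */
    public List<String> deleteEntities(String... guids) {
        List<String> deleted = new ArrayList<>();
        Deque<String> pending = new ArrayDeque<>(Arrays.asList(guids));
        while (!pending.isEmpty()) {
            String guid = pending.pop();
            if (!entities.remove(guid)) {
                continue; // unknown GUID, or already deleted in this call
            }
            deleted.add(guid);
            List<String> owned = compositeChildren.remove(guid);
            if (owned != null) {
                pending.addAll(owned); // keep cascading below the first level
            }
        }
        return deleted;
    }

    public boolean contains(String guid) {
        return entities.contains(guid);
    }
}
```

A caller passes only GUIDs and never loads the full entity. Anything not reachable through a composite reference, such as a database that a table merely refers to, is left in place.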
[jira] [Commented] (ATLAS-122) Support for Deletion of Entities
[ https://issues.apache.org/jira/browse/ATLAS-122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15057900#comment-15057900 ]

Suma Shivaprasad commented on ATLAS-122:

[~dkantor] Can you please publish the design/changes that you propose before starting work on this JIRA? There are a couple of issues that need to be addressed with respect to cascading deletes in the current hive model. For example, table -> partitions is a composite relationship but is currently not modelled that way. We will need to address that as well as part of this work.
[jira] [Commented] (ATLAS-122) Support for Deletion of Entities
[ https://issues.apache.org/jira/browse/ATLAS-122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15033093#comment-15033093 ]

Suma Shivaprasad commented on ATLAS-122:

[~dkantor] I am not working on it currently. I have assigned it to you. Thanks for picking it up.
[jira] [Commented] (ATLAS-122) Support for Deletion of Entities
[ https://issues.apache.org/jira/browse/ATLAS-122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15031996#comment-15031996 ]

David Kantor commented on ATLAS-122:

[~suma.shivaprasad] Are you actively working on this feature? If not, I would like to work on this. Please let me know. Thanks...

> Support for Deletion of Entities
>
> Key: ATLAS-122
> URL: https://issues.apache.org/jira/browse/ATLAS-122
> Project: Atlas
> Issue Type: New Feature
> Reporter: Suma Shivaprasad
> Assignee: Suma Shivaprasad
[jira] [Commented] (ATLAS-122) Support for Deletion of Entities
[ https://issues.apache.org/jira/browse/ATLAS-122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716110#comment-14716110 ]

Suma Shivaprasad commented on ATLAS-122:

Regarding orphans: if the types are not defined correctly to indicate "composition", then it could lead to orphans.
[jira] [Commented] (ATLAS-122) Support for Deletion of Entities
[ https://issues.apache.org/jira/browse/ATLAS-122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716109#comment-14716109 ]

Suma Shivaprasad commented on ATLAS-122:

Yes, relations could be assumed to be associations unless explicitly specified as "composition". Properties are maintained as part of the vertex, and hence they should get removed along with it. The tags/traits or a struct are currently completely owned by the entity (which is another thing that needs to be fixed: entities could share common trait vertices), and hence will be removed as part of the delete. These relationship properties will mostly affect class-class relationships.
[jira] [Commented] (ATLAS-122) Support for Deletion of Entities
[ https://issues.apache.org/jira/browse/ATLAS-122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712954#comment-14712954 ]

sandeep samudrala commented on ATLAS-122:

[~suma.shivaprasad]: Are you saying that every relationship needs to be either a composition or an association, and that the relationship between a falcon process and a feed is more of an association? In that case, columns and properties belonging to a table would be deleted with it, but the same would not happen in the case of a falcon process. When a process is deleted, I assume all its properties and tags are also removed. Is that what you were mentioning? Just wondering if this can end up with orphans in the graph db.
[jira] [Commented] (ATLAS-122) Support for Deletion of Entities
[ https://issues.apache.org/jira/browse/ATLAS-122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712938#comment-14712938 ]

Suma Shivaprasad commented on ATLAS-122:

Makes sense. We still need to differentiate the case where there is a parent-child relationship, like hivetable->columns or hivetable->partitions, from the case where there is only an association, e.g. falcon process - feed. In the case of hivetable, deleting a hive table would lead to deleting all the columns and partitions. However, if we delete a falcon process, feeds will exist independently.

To achieve this, we could model the relationships/edges to have a relationshipType of "composition" or "association". "Composition" would indicate parent-child, and hence this would help to figure out which ones need to be deleted. In the above case, hiveTable->hiveColumn can have a composedOf relationship, and hiveTable->hiveDatabase would be "partOf", etc. We can take care of this while defining the types. Does this sound okay?
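One way to picture the relationshipType proposal above is an edge that carries its kind, so that a delete can tell owned children apart from merely associated entities. The sketch below is illustrative only: `RelationshipTypeSketch`, `Edge`, and `deletedWith` are hypothetical names, and the edge list is an assumed in-memory representation rather than anything in the Atlas type system.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of typed edges; not part of the Atlas type system. */
public class RelationshipTypeSketch {
    /** The two kinds of relationship proposed in the comment. */
    public enum RelationshipType { COMPOSITION, ASSOCIATION }

    /** A directed edge between two entities, labelled with its kind. */
    public static final class Edge {
        final String from;
        final String to;
        final RelationshipType type;

        public Edge(String from, String to, RelationshipType type) {
            this.from = from;
            this.to = to;
            this.type = type;
        }
    }

    /**
     * Returns the direct targets that would be deleted together with the
     * owner: only those reached through COMPOSITION edges.
     */
    public static List<String> deletedWith(String owner, List<Edge> edges) {
        List<String> result = new ArrayList<>();
        for (Edge edge : edges) {
            if (edge.from.equals(owner) && edge.type == RelationshipType.COMPOSITION) {
                result.add(edge.to);
            }
        }
        return result;
    }
}
```

With edges hiveTable->hiveColumn and hiveTable->hivePartition marked as COMPOSITION and falconProcess->feed marked as ASSOCIATION, deleting the table selects its columns and partitions, while deleting the process selects nothing.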
[jira] [Commented] (ATLAS-122) Support for Deletion of Entities
[ https://issues.apache.org/jira/browse/ATLAS-122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711602#comment-14711602 ]

Venkatesh Seetharam commented on ATLAS-122:

Good questions. Cascading deletes should not be the default; a delete should fail if there are dependencies. Let's support this in this iteration and then add a flag to support cascading deletes at a later time, when there is an ask. Makes sense?
[jira] [Commented] (ATLAS-122) Support for Deletion of Entities
[ https://issues.apache.org/jira/browse/ATLAS-122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711368#comment-14711368 ]

Suma Shivaprasad commented on ATLAS-122:

Deletion of entities raises some interesting scenarios, such as:

1. If a hive_database is requested to be deleted, should we support deletion in the case where there are still tables in the model referring to it? Or should we require the user to delete the tables first and then delete the database? To generalize: if an entity has incoming edges, then we should throw an error saying that other entities depend on it and hence it cannot be deleted. If we don't throw an error, it leads to challenges such as: should we delete the database recursively along with the tables that refer to it? To what level/depth of nesting should we go? What if there are other entities, such as a process, referring to the tables (for example, hive_process)? Should we delete that process as well? We might lose history/version info if we delete it.

2. If an entity has outgoing edges (for example, hive_tables has outgoing edges to a list of columns), can we generalize that these referred entities will also be deleted if they have no incoming edges other than from the entity being deleted? However, this fails when there are outgoing lineage relationship edges pointing to other tables. For example, a hive_process has outgoing edges to its input and output tables. So when a delete is requested for a "hive_process/query", deleting the tables that it refers to doesn't make much sense, even if there are no references to those tables from other processes.

[~svenkat] Thoughts?
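The rule generalized in scenario 1 (refuse to delete an entity that still has incoming edges) can be sketched as a simple guard. Everything here is assumed for illustration: `IncomingEdgeGuardSketch`, `addReference`, `delete`, and `hasDependents` are hypothetical names, and the reverse index is a plain in-memory map rather than the graph store Atlas uses.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical guard that rejects deletes while other entities still refer to the target. */
public class IncomingEdgeGuardSketch {
    // target -> sources that refer to it (assumed reverse index of edges)
    private final Map<String, List<String>> incoming = new HashMap<>();

    /** Records an edge from source to target, e.g. a table referring to its database. */
    public void addReference(String source, String target) {
        incoming.computeIfAbsent(target, key -> new ArrayList<>()).add(source);
    }

    public boolean hasDependents(String entity) {
        return !incoming.getOrDefault(entity, Collections.<String>emptyList()).isEmpty();
    }

    /** Deletes the entity, or fails if any other entity still has an edge to it. */
    public void delete(String entity) {
        if (hasDependents(entity)) {
            throw new IllegalStateException(
                "Cannot delete " + entity + ": referenced by " + incoming.get(entity));
        }
        // The deleted entity no longer refers to anything, so drop its outgoing edges.
        for (List<String> sources : incoming.values()) {
            sources.removeIf(entity::equals);
        }
        incoming.remove(entity);
    }
}
```

Under this rule a database can be deleted only after the tables referring to it are gone. The sketch does not address scenario 2, where outgoing composition and lineage edges need different treatment.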