[jira] [Commented] (ATLAS-1661) import hive script to handle updates like rename/delete
[ https://issues.apache.org/jira/browse/ATLAS-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15925647#comment-15925647 ]

Madhan Neethiraj commented on ATLAS-1661:
-----------------------------------------

[~ssainath] - import-hive is generally meant for one-time use, to update Atlas with the information from the Hive metastore. It is not expected to be run multiple times; even if it is run multiple times, the tool wouldn't be able to handle renames, as this information might not be available in the Hive metastore.

> import hive script to handle updates like rename/delete
> -------------------------------------------------------
>
>                 Key: ATLAS-1661
>                 URL: https://issues.apache.org/jira/browse/ATLAS-1661
>             Project: Atlas
>          Issue Type: Improvement
>          Components: atlas-intg
>            Reporter: Sharmadha Sainath
>            Priority: Minor
>
> 1. Disabled the hive hook.
> 2. Created table table1.
> 3. Ran the import-hive.sh script; Atlas ingested table1.
> 4. Altered table table1, renaming it to table1_new.
> 5. Ran the import-hive.sh script; Atlas created a new table, table1_new.
> table1 wasn't updated with the new name.
> This is the expected behavior with the import-hive script as opposed to the
> hive hook, as the hive hook is synchronous and import-hive is not.
> But for a customer, running import-hive.sh multiple times while doing many
> Hive operations may result in inconsistencies when applying Ranger policies
> to the table, and in many other scenarios, since it is not documented that
> the import-hive script should be run only once.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (ATLAS-1661) import hive script to handle updates like rename/delete
[ https://issues.apache.org/jira/browse/ATLAS-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15924417#comment-15924417 ]

Ayub Khan commented on ATLAS-1661:
----------------------------------

[~madhan.neethiraj] Is this an expectation from the customer? Could you please confirm?
[jira] [Commented] (ATLAS-1661) import hive script to handle updates like rename/delete
[ https://issues.apache.org/jira/browse/ATLAS-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15924301#comment-15924301 ]

Sharmadha Sainath commented on ATLAS-1661:
------------------------------------------

[~ayubkhan]

>> I believe the intent of import-hive.sh tool is not to track all the metadata
>> changes but to take a snapshot of the metadata at that point of time.

I completely agree with you. That is the intent. Currently, the import-hive script looks up the table by its qualified name and, if it is already present, updates that entity. In the case described in this issue, it doesn't find the table table1_new, so it creates a new table. It would be a good-to-have feature if the import-hive script had a mechanism to learn the history of table1_new and update accordingly.

>> Are you suggesting to have the hiveHook capability built into import-hive.sh
>> tool also?

Yes, because that would be the expectation from the customer. The only difference the customer would notice is that the hive hook updates Atlas as and when a query is fired, whereas the import-hive script does a bulk update when it is run.
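The lookup-by-qualified-name behavior described in the comment above can be illustrated with a toy model. This is only a sketch, not the import-hive.sh implementation: the dict-based store, the `import_snapshot` and `qualified_name` helpers, and the `db.table@cluster` key with a `primary` cluster name are all assumptions made for illustration.

```python
# Toy model only: this is NOT the import-hive.sh implementation.
# The dict-based store and both helper functions are hypothetical.

def qualified_name(db, table, cluster="primary"):
    """Build an illustrative unique key for a table."""
    return f"{db}.{table}@{cluster}"


def import_snapshot(atlas, metastore_tables, db="default"):
    """Upsert every table currently in the metastore, keyed by qualified name.

    The importer sees only the current snapshot, so it cannot tell a
    renamed table from a brand-new one.
    """
    for table in metastore_tables:
        key = qualified_name(db, table)
        if key in atlas:
            atlas[key]["imports"] += 1                  # found: update in place
        else:
            atlas[key] = {"name": table, "imports": 1}  # not found: create
    return atlas


atlas = {}                               # empty store, hive hook disabled
import_snapshot(atlas, ["table1"])       # step 3: table1 is ingested
import_snapshot(atlas, ["table1_new"])   # step 5: second run, after the rename
print(sorted(atlas))
# ['default.table1@primary', 'default.table1_new@primary']
```

In this model the second run leaves the old `table1` entry untouched and adds a separate `table1_new` entry, which matches the behavior reported in the issue description.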
[jira] [Commented] (ATLAS-1661) import hive script to handle updates like rename/delete
[ https://issues.apache.org/jira/browse/ATLAS-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15924240#comment-15924240 ]

Ayub Khan commented on ATLAS-1661:
----------------------------------

Import-hive.sh is a standalone (not daemon) tool that takes a metadata snapshot when triggered and posts it to Atlas; this is the primary targeted use case of the tool. The import tool does not have details of previous changes to the same table, so you won't see ALTER TABLE updates reflected by the import command.

HiveHook (runs as a daemon) is configured to track metadata changes to Hive entities and posts all the changes/updates to Atlas. This use case is handled by default, as hiveHook is enabled when Atlas is deployed.

I believe the intent of the import-hive.sh tool is not to track all the metadata changes but to take a snapshot of the metadata at that point in time.

[~ssainath] Are you suggesting to have the hiveHook capability built into the import-hive.sh tool also?

[~madhan.neethiraj] [~suryakoneru]
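To make the contrast with a snapshot import concrete, here is a minimal sketch of change-event handling. It is not the Atlas hive hook: the event dictionaries and the `apply_event` helper are hypothetical, and the only point is that an event naming both the old and the new table lets the existing entity be updated rather than duplicated.

```python
# Toy model only: this is NOT the Atlas hive hook.
# The event shapes and apply_event are hypothetical.

def apply_event(entities, event):
    """Apply one change event to a dict of entities keyed by table name."""
    kind = event["type"]
    if kind == "create":
        entities[event["table"]] = {"name": event["table"]}
    elif kind == "rename":
        # The event carries both sides of the rename, so the existing
        # entity is moved to the new key instead of being duplicated.
        entity = entities.pop(event["old"])
        entity["name"] = event["new"]
        entities[event["new"]] = entity
    elif kind == "drop":
        entities.pop(event["table"], None)
    return entities


hooked = {}
for ev in [
    {"type": "create", "table": "table1"},
    {"type": "rename", "old": "table1", "new": "table1_new"},
]:
    apply_event(hooked, ev)
print(list(hooked))
# ['table1_new']
```

A snapshot taken after the rename contains only `table1_new`, with nothing linking it to `table1`, which is why a tool that reads the current state alone cannot reproduce this result.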