[ https://issues.apache.org/jira/browse/ATLAS-626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Suma Shivaprasad updated ATLAS-626:
-----------------------------------
    Attachment:     (was: ATLAS-626.patch)

> Hive temporary table metadata is captured in atlas.
> ---------------------------------------------------
>
>                 Key: ATLAS-626
>                 URL: https://issues.apache.org/jira/browse/ATLAS-626
>             Project: Atlas
>          Issue Type: Bug
>    Affects Versions: 0.7-incubating
>            Reporter: Ayub Khan
>            Assignee: Suma Shivaprasad
>             Fix For: 0.7-incubating
>
>         Attachments: ATLAS-626.patch
>
>
> As part of HIVE-7090, Hive supports session-level temporary tables and
> manages their life cycle.
> These temporary tables are used to run additional queries and are cleaned
> up at the end of the session.
> Inserting data into a table creates such a temporary table, and its metadata
> is synced to Atlas. Once the session expires, the table is cleaned up in Hive
> but still exists in Atlas.
> 1. What is the use case for storing the metadata of this temporary table? Are
> temporary tables important?
> 2. Impact: The number of metadata objects might keep growing if insert
> operations are frequent in production.
> Lineage snapshot link:
> https://monosnap.com/file/DtnPZA85Ug0Q27arTOqY1FnhbDQdXU
> {noformat}
> 0: jdbc:hive2://localhost:10000/default> show tables;
> +-----------+--+
> | tab_name  |
> +-----------+--+
> | abc12312  |
> | h3        |
> | h5        |
> +-----------+--+
> 3 rows selected (0.231 seconds)
> 0: jdbc:hive2://localhost:10000/default> insert into table default.h5 values ( "efg1", "abc1", 1231, 123121);
> INFO : Number of reduce tasks is set to 0 since there's no reduce operator
> INFO : number of splits:1
> INFO : Submitting tokens for job: job_local737234864_0006
> INFO : The url to track the job: http://localhost:8080/
> INFO : Job running in-process (local Hadoop)
> INFO : 2016-04-04 15:21:20,381 Stage-1 map = 100%, reduce = 0%
> INFO : Ended Job = job_local737234864_0006
> INFO : Stage-4 is selected by condition resolver.
> INFO : Stage-3 is filtered out by condition resolver.
> INFO : Stage-5 is filtered out by condition resolver.
> INFO : Moving data to: hdfs://localhost:9000/user/hive/warehouse/h5/.hive-staging_hive_2016-04-04_15-21-17_057_941878903735098303-9/-ext-10000 from hdfs://localhost:9000/user/hive/warehouse/h5/.hive-staging_hive_2016-04-04_15-21-17_057_941878903735098303-9/-ext-10002
> INFO : Loading data to table default.h5 from hdfs://localhost:9000/user/hive/warehouse/h5/.hive-staging_hive_2016-04-04_15-21-17_057_941878903735098303-9/-ext-10000
> INFO : Table default.h5 stats: [numFiles=5, numRows=5, totalSize=106, rawDataSize=101]
> No rows affected (3.72 seconds)
> 0: jdbc:hive2://localhost:10000/default> show tables;
> +------------------------+--+
> |        tab_name        |
> +------------------------+--+
> | abc12312               |
> | h3                     |
> | h5                     |
> | values__tmp__table__1  |
> +------------------------+--+
> 4 rows selected (0.196 seconds)
> 0: jdbc:hive2://localhost:10000/default> describe extended values__tmp__table__1;
> +-----------------------------+------------+----------+--+
> |          col_name           | data_type  | comment  |
> +-----------------------------+------------+----------+--+
> | tmp_values_col1             | string     |          |
> | tmp_values_col2             | string     |          |
> | tmp_values_col3             | string     |          |
> | tmp_values_col4             | string     |          |
> |                             | NULL       | NULL     |
> | Detailed Table Information  | Table(tableName:values__tmp__table__1, dbName:default, owner:apathan, createTime:1459763477, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:tmp_values_col1, type:string, comment:), FieldSchema(name:tmp_values_col2, type:string, comment:), FieldSchema(name:tmp_values_col3, type:string, comment:), FieldSchema(name:tmp_values_col4, type:string, comment:)], location:hdfs://localhost:9000/tmp/hive/apathan/d461c5c3-931a-4aa3-9124-997b75f10c11/_tmp_space.db/Values__Tmp__Table__1, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE, privileges:PrincipalPrivilegeSet(userPrivileges:{}, groupPrivileges:null, rolePrivileges:null), temporary:true) |          |
> +-----------------------------+------------+----------+--+
> 6 rows selected (0.174 seconds)
> 0: jdbc:hive2://localhost:10000/default>
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
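
The `describe extended` output in the issue shows that the generated table carries `temporary:true`. One possible way to keep such tables out of the metadata store is to drop them before registration. The following is only a minimal sketch of that idea and is not taken from the attached patch: `TableEvent` and `tables_to_register` are hypothetical names, and the `temporary` field is assumed to mirror the flag shown in the session above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableEvent:
    """Hypothetical, simplified stand-in for a table seen by a metadata hook."""
    db_name: str
    table_name: str
    temporary: bool  # assumed to mirror the `temporary:` flag in the Hive output


def tables_to_register(events):
    """Return only the tables that should be synced, dropping temporary ones."""
    return [event for event in events if not event.temporary]


# The two tables involved in the insert from the session above.
events = [
    TableEvent("default", "h5", temporary=False),
    TableEvent("default", "values__tmp__table__1", temporary=True),
]

registered = tables_to_register(events)
print([event.table_name for event in registered])
```

Filtering on the flag rather than on the `values__tmp__table__` name prefix would also cover temporary tables that users create explicitly, though whether those should be tracked is the open question raised in the issue.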