[ https://issues.apache.org/jira/browse/ATLAS-626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ayub Khan updated ATLAS-626:
----------------------------
    Description: 
As part of HIVE-7090, Hive supports session-level temporary tables and manages 
their life cycle.

These temporary tables are used to run additional queries and are cleaned up 
at the end of the session.

Inserting data into a table creates such a temporary table, whose metadata is 
synced to Atlas. Once the session expires, the table is cleaned up in Hive, 
but it still exists in Atlas.


1. What is the use case for storing the metadata of this temporary table? Are 
temporary tables important?
2. Impact: The number of metadata objects in Atlas might grow steadily if 
insert operations are frequent in production.
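
One possible mitigation, sketched here purely as an illustration, is to skip 
tables whose names match the pattern seen in this report before their metadata 
is sent to Atlas. The pattern values__tmp__table__<N> is inferred only from the 
session output in this issue, and the helper names is_values_tmp_table and 
tables_to_sync are hypothetical; this is not how the Atlas Hive hook is 
implemented.

```python
import re

# Assumption: Hive names the implicit table behind INSERT ... VALUES
# "values__tmp__table__<N>", as observed in the session output of this issue.
# Matching is case-insensitive because the HDFS location in the same output
# spells the name "Values__Tmp__Table__1".
TMP_VALUES_TABLE = re.compile(r"values__tmp__table__\d+", re.IGNORECASE)


def is_values_tmp_table(table_name: str) -> bool:
    """Return True if the name looks like a Hive-generated VALUES temp table."""
    return TMP_VALUES_TABLE.fullmatch(table_name) is not None


def tables_to_sync(table_names):
    """Hypothetical filter: keep only the tables whose metadata should be synced."""
    return [name for name in table_names if not is_values_tmp_table(name)]


if __name__ == "__main__":
    # Table names taken from the second "show tables" output in this issue.
    print(tables_to_sync(["abc12312", "h3", "h5", "values__tmp__table__1"]))
```

Name matching is only a heuristic: a table created explicitly with CREATE 
TEMPORARY TABLE can have any name and would not be caught by it.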

Lineage snapshot link: https://monosnap.com/file/DtnPZA85Ug0Q27arTOqY1FnhbDQdXU

{noformat}
0: jdbc:hive2://localhost:10000/default> show tables;
+-----------+--+
| tab_name  |
+-----------+--+
| abc12312  |
| h3        |
| h5        |
+-----------+--+
3 rows selected (0.231 seconds)


0: jdbc:hive2://localhost:10000/default> insert into table default.h5 values ( "efg1", "abc1", 1231, 123121);
INFO  : Number of reduce tasks is set to 0 since there's no reduce operator
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_local737234864_0006
INFO  : The url to track the job: http://localhost:8080/
INFO  : Job running in-process (local Hadoop)
INFO  : 2016-04-04 15:21:20,381 Stage-1 map = 100%,  reduce = 0%
INFO  : Ended Job = job_local737234864_0006
INFO  : Stage-4 is selected by condition resolver.
INFO  : Stage-3 is filtered out by condition resolver.
INFO  : Stage-5 is filtered out by condition resolver.
INFO  : Moving data to: hdfs://localhost:9000/user/hive/warehouse/h5/.hive-staging_hive_2016-04-04_15-21-17_057_941878903735098303-9/-ext-10000 from hdfs://localhost:9000/user/hive/warehouse/h5/.hive-staging_hive_2016-04-04_15-21-17_057_941878903735098303-9/-ext-10002
INFO  : Loading data to table default.h5 from hdfs://localhost:9000/user/hive/warehouse/h5/.hive-staging_hive_2016-04-04_15-21-17_057_941878903735098303-9/-ext-10000
INFO  : Table default.h5 stats: [numFiles=5, numRows=5, totalSize=106, rawDataSize=101]
No rows affected (3.72 seconds)


0: jdbc:hive2://localhost:10000/default> show tables;
+------------------------+--+
|        tab_name        |
+------------------------+--+
| abc12312               |
| h3                     |
| h5                     |
| values__tmp__table__1  |
+------------------------+--+
4 rows selected (0.196 seconds)


0: jdbc:hive2://localhost:10000/default> describe extended values__tmp__table__1;
+-----------------------------+------------+----------+--+
|          col_name           | data_type  | comment  |
+-----------------------------+------------+----------+--+
| tmp_values_col1             | string     |          |
| tmp_values_col2             | string     |          |
| tmp_values_col3             | string     |          |
| tmp_values_col4             | string     |          |
|                             | NULL       | NULL     |
| Detailed Table Information  | Table(tableName:values__tmp__table__1, dbName:default, owner:apathan, createTime:1459763477, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:tmp_values_col1, type:string, comment:), FieldSchema(name:tmp_values_col2, type:string, comment:), FieldSchema(name:tmp_values_col3, type:string, comment:), FieldSchema(name:tmp_values_col4, type:string, comment:)], location:hdfs://localhost:9000/tmp/hive/apathan/d461c5c3-931a-4aa3-9124-997b75f10c11/_tmp_space.db/Values__Tmp__Table__1, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE, privileges:PrincipalPrivilegeSet(userPrivileges:{}, groupPrivileges:null, rolePrivileges:null), temporary:true)  |          |
+-----------------------------+------------+----------+--+
6 rows selected (0.174 seconds)
0: jdbc:hive2://localhost:10000/default>

{noformat}
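
The describe extended output marks the table with temporary:true, which 
suggests a more direct check than matching on the table name. The sketch below 
is an illustration under assumptions: it parses the textual "Detailed Table 
Information" dump as it appears in this issue, and the function name 
is_temporary is hypothetical. Real hook code would more likely read the flag 
from the Hive table object than from text.

```python
import re


def is_temporary(detailed_info: str) -> bool:
    """Return True if a `describe extended` table dump reports temporary:true.

    Assumption: the flag is printed as "temporary:true" or "temporary:false",
    as in the output quoted in this issue. A dump without the flag is treated
    as describing a non-temporary table.
    """
    flags = re.findall(r"\btemporary:(true|false)\b", detailed_info)
    # The table-level flag is the last field of the dump, so use the last match.
    return bool(flags) and flags[-1] == "true"


if __name__ == "__main__":
    # Abbreviated from the "Detailed Table Information" row in this issue.
    sample = (
        "Table(tableName:values__tmp__table__1, dbName:default, "
        "tableType:MANAGED_TABLE, temporary:true)"
    )
    print(is_temporary(sample))
```

Unlike a name pattern, a check on this flag would also cover tables created 
explicitly with CREATE TEMPORARY TABLE, provided the flag is set for them as 
well.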

  was:
As part of HIVE-7090, Hive supports session-level temporary tables and manages 
their life cycle.

These temporary tables are used to run additional queries and are cleaned up 
at the end of the session.

Inserting data into a table creates such a temporary table, whose metadata is 
synced to Atlas. Once the session expires, the table is cleaned up in Hive, 
but it still exists in Atlas.


1. What is the use case for storing the metadata of this temporary table? Are 
temporary tables important?
2. Impact: The number of metadata objects in Atlas might grow steadily if 
insert operations are frequent in production.
3. Insert queries fired across databases are shown in lineage, i.e., the 
temporary table appears in lineage only when it is created in a different 
database than the target database. Is this expected?

Lineage snapshot link: https://monosnap.com/file/DtnPZA85Ug0Q27arTOqY1FnhbDQdXU

{noformat}
0: jdbc:hive2://localhost:10000/default> show tables;
+-----------+--+
| tab_name  |
+-----------+--+
| abc12312  |
| h3        |
| h5        |
+-----------+--+
3 rows selected (0.231 seconds)


0: jdbc:hive2://localhost:10000/default> insert into table default.h5 values ( "efg1", "abc1", 1231, 123121);
INFO  : Number of reduce tasks is set to 0 since there's no reduce operator
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_local737234864_0006
INFO  : The url to track the job: http://localhost:8080/
INFO  : Job running in-process (local Hadoop)
INFO  : 2016-04-04 15:21:20,381 Stage-1 map = 100%,  reduce = 0%
INFO  : Ended Job = job_local737234864_0006
INFO  : Stage-4 is selected by condition resolver.
INFO  : Stage-3 is filtered out by condition resolver.
INFO  : Stage-5 is filtered out by condition resolver.
INFO  : Moving data to: hdfs://localhost:9000/user/hive/warehouse/h5/.hive-staging_hive_2016-04-04_15-21-17_057_941878903735098303-9/-ext-10000 from hdfs://localhost:9000/user/hive/warehouse/h5/.hive-staging_hive_2016-04-04_15-21-17_057_941878903735098303-9/-ext-10002
INFO  : Loading data to table default.h5 from hdfs://localhost:9000/user/hive/warehouse/h5/.hive-staging_hive_2016-04-04_15-21-17_057_941878903735098303-9/-ext-10000
INFO  : Table default.h5 stats: [numFiles=5, numRows=5, totalSize=106, rawDataSize=101]
No rows affected (3.72 seconds)


0: jdbc:hive2://localhost:10000/default> show tables;
+------------------------+--+
|        tab_name        |
+------------------------+--+
| abc12312               |
| h3                     |
| h5                     |
| values__tmp__table__1  |
+------------------------+--+
4 rows selected (0.196 seconds)


0: jdbc:hive2://localhost:10000/default> describe extended values__tmp__table__1;
+-----------------------------+------------+----------+--+
|          col_name           | data_type  | comment  |
+-----------------------------+------------+----------+--+
| tmp_values_col1             | string     |          |
| tmp_values_col2             | string     |          |
| tmp_values_col3             | string     |          |
| tmp_values_col4             | string     |          |
|                             | NULL       | NULL     |
| Detailed Table Information  | Table(tableName:values__tmp__table__1, dbName:default, owner:apathan, createTime:1459763477, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:tmp_values_col1, type:string, comment:), FieldSchema(name:tmp_values_col2, type:string, comment:), FieldSchema(name:tmp_values_col3, type:string, comment:), FieldSchema(name:tmp_values_col4, type:string, comment:)], location:hdfs://localhost:9000/tmp/hive/apathan/d461c5c3-931a-4aa3-9124-997b75f10c11/_tmp_space.db/Values__Tmp__Table__1, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE, privileges:PrincipalPrivilegeSet(userPrivileges:{}, groupPrivileges:null, rolePrivileges:null), temporary:true)  |          |
+-----------------------------+------------+----------+--+
6 rows selected (0.174 seconds)
0: jdbc:hive2://localhost:10000/default>

{noformat}


> Hive temporary table metadata is captured in atlas.
> ---------------------------------------------------
>
>                 Key: ATLAS-626
>                 URL: https://issues.apache.org/jira/browse/ATLAS-626
>             Project: Atlas
>          Issue Type: Bug
>    Affects Versions: trunk
>            Reporter: Ayub Khan
>             Fix For: trunk
>
>
> As part of HIVE-7090, Hive supports session-level temporary tables and manages 
> their life cycle.
> These temporary tables are used to run additional queries and are cleaned up 
> at the end of the session.
> Inserting data into a table creates such a temporary table, whose metadata is 
> synced to Atlas. Once the session expires, the table is cleaned up in Hive, 
> but it still exists in Atlas.
> 1. What is the use case for storing the metadata of this temporary table? Are 
> temporary tables important?
> 2. Impact: The number of metadata objects in Atlas might grow steadily if 
> insert operations are frequent in production.
> Lineage snapshot link: https://monosnap.com/file/DtnPZA85Ug0Q27arTOqY1FnhbDQdXU
> {noformat}
> 0: jdbc:hive2://localhost:10000/default> show tables;
> +-----------+--+
> | tab_name  |
> +-----------+--+
> | abc12312  |
> | h3        |
> | h5        |
> +-----------+--+
> 3 rows selected (0.231 seconds)
> 0: jdbc:hive2://localhost:10000/default> insert into table default.h5 values ( "efg1", "abc1", 1231, 123121);
> INFO  : Number of reduce tasks is set to 0 since there's no reduce operator
> INFO  : number of splits:1
> INFO  : Submitting tokens for job: job_local737234864_0006
> INFO  : The url to track the job: http://localhost:8080/
> INFO  : Job running in-process (local Hadoop)
> INFO  : 2016-04-04 15:21:20,381 Stage-1 map = 100%,  reduce = 0%
> INFO  : Ended Job = job_local737234864_0006
> INFO  : Stage-4 is selected by condition resolver.
> INFO  : Stage-3 is filtered out by condition resolver.
> INFO  : Stage-5 is filtered out by condition resolver.
> INFO  : Moving data to: 
> hdfs://localhost:9000/user/hive/warehouse/h5/.hive-staging_hive_2016-04-04_15-21-17_057_941878903735098303-9/-ext-10000
>  from 
> hdfs://localhost:9000/user/hive/warehouse/h5/.hive-staging_hive_2016-04-04_15-21-17_057_941878903735098303-9/-ext-10002
> INFO  : Loading data to table default.h5 from 
> hdfs://localhost:9000/user/hive/warehouse/h5/.hive-staging_hive_2016-04-04_15-21-17_057_941878903735098303-9/-ext-10000
> INFO  : Table default.h5 stats: [numFiles=5, numRows=5, totalSize=106, 
> rawDataSize=101]
> No rows affected (3.72 seconds)
> 0: jdbc:hive2://localhost:10000/default> show tables;
> +------------------------+--+
> |        tab_name        |
> +------------------------+--+
> | abc12312               |
> | h3                     |
> | h5                     |
> | values__tmp__table__1  |
> +------------------------+--+
> 4 rows selected (0.196 seconds)
> 0: jdbc:hive2://localhost:10000/default> describe extended 
> values__tmp__table__1;
> +-----------------------------+------------+----------+--+
> |          col_name           | data_type  | comment  |
> +-----------------------------+------------+----------+--+
> | tmp_values_col1             | string     |          |
> | tmp_values_col2             | string     |          |
> | tmp_values_col3             | string     |          |
> | tmp_values_col4             | string     |          |
> |                             | NULL       | NULL     |
> | Detailed Table Information  | Table(tableName:values__tmp__table__1, 
> dbName:default, owner:apathan, createTime:1459763477, lastAccessTime:0, 
> retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:tmp_values_col1, 
> type:string, comment:), FieldSchema(name:tmp_values_col2, type:string, 
> comment:), FieldSchema(name:tmp_values_col3, type:string, comment:), 
> FieldSchema(name:tmp_values_col4, type:string, comment:)], 
> location:hdfs://localhost:9000/tmp/hive/apathan/d461c5c3-931a-4aa3-9124-997b75f10c11/_tmp_space.db/Values__Tmp__Table__1,
>  inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
> outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
> compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
> serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
> parameters:{serialization.format=1}), bucketCols:[], sortCols:[], 
> parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], 
> skewedColValueLocationMaps:{}), storedAsSubDirectories:false), 
> partitionKeys:[], parameters:{}, viewOriginalText:null, 
> viewExpandedText:null, tableType:MANAGED_TABLE, 
> privileges:PrincipalPrivilegeSet(userPrivileges:{}, groupPrivileges:null, 
> rolePrivileges:null), temporary:true)  |          |
> +-----------------------------+------------+----------+--+
> 6 rows selected (0.174 seconds)
> 0: jdbc:hive2://localhost:10000/default>
> {noformat}
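
The "describe extended" output above reports temporary:true for values__tmp__table__1, so one possible direction is to skip such tables before their metadata is synced. The following is only a minimal sketch of that idea, not how the Atlas Hive hook works: the helper names is_temporary_table and tables_to_sync are hypothetical, the input is assumed to be the "Detailed Table Information" text as printed by Hive, and the h5 string in the usage lines is made up for illustration.

```python
import re

# Hypothetical helpers for illustration; these names are not part of Atlas or Hive.
# Assumption: the metadata text contains a "temporary:true" or "temporary:false"
# flag, as in the "Detailed Table Information" row shown above.
_TEMPORARY_FLAG = re.compile(r"\btemporary:(true|false)\b")


def is_temporary_table(detailed_table_info):
    """Return True when the table description text reports temporary:true."""
    match = _TEMPORARY_FLAG.search(detailed_table_info)
    return match is not None and match.group(1) == "true"


def tables_to_sync(tables):
    """Given {table_name: description_text}, return the sorted names of
    tables that are not marked as temporary."""
    return sorted(
        name for name, info in tables.items() if not is_temporary_table(info)
    )


if __name__ == "__main__":
    # Illustrative input; only the values__tmp__table__1 flag mirrors the log above.
    tables = {
        "h5": "Table(tableName:h5, dbName:default, temporary:false)",
        "values__tmp__table__1": (
            "Table(tableName:values__tmp__table__1, dbName:default, "
            "rolePrivileges:null), temporary:true)"
        ),
    }
    print(tables_to_sync(tables))
```

Checking the flag rather than the values__tmp__table__ name prefix is a design choice in this sketch: it would also cover temporary tables that users create explicitly under other names.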



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
