[ 
https://issues.apache.org/jira/browse/HIVE-28972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-28972:
----------------------------------
    Labels: pull-request-available  (was: )

> HMS performance degradation post HIVE-28909 for alter query
> -----------------------------------------------------------
>
>                 Key: HIVE-28972
>                 URL: https://issues.apache.org/jira/browse/HIVE-28972
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Raghav Aggarwal
>            Assignee: Zhihua Deng
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: Beeline_Output_with_HIVE-28909.png, 
> Beeline_Output_without_HIVE-28909.png, HMS_Heap_Graph.png, 
> VisualVM_Graphs.png, create_tbl.sql, 
> hivemetastore_with_HIVE-28909.log.tar.gz, 
> hivemetastore_without_HIVE-28909.log, hiveserver2_with_HIVE-28909.log, 
> hiveserver2_without_HIVE-28909.log
>
>
> ENV: Hive master branch (26e6880c9053717c17e5e6416451750e832d46c9) + Java 8 + 
> DataNucleus 5.x, HMS memory: 8 GB
> Description: I have a table with 800 columns and 5000 partitions. I ran
> {code:java}
> alter table test_tbl add columns (col801 string) cascade; {code}
> Without HIVE-28909 the statement took 117.42 sec; with HIVE-28909 it takes 
> *986.065 sec*.
>  
> *Steps to reproduce:*
>  # Create partitioned table with 800 columns (attaching the create table sql)
>  # Create 5000 or so empty partitions in HDFS (I used 
> [https://github.com/Aggarwal-Raghav/Concurrent-Partition-Gen] )
>  # Run msck repair table test_tbl to load the partitions into HMS
>  # Run alter table test_tbl add columns (col801 string) cascade; 
> Attaching all the captured info, i.e. HMS logs, HS2 logs, and Beeline 
> screenshots, with and without HIVE-28909.
>  
> The following log message is emitted *4006601 times*:
>  
> {code:java}
> 2025-05-27T23:08:02,690  INFO [Metastore-Handler-Pool: Thread-64] 
> DataNucleus.Persistence: Object 
> "org.apache.hadoop.hive.metastore.model.MColumnDescriptor@68097e71" has a 
> collection "org.apache.hadoop.hive.metastore.model.MColumnDescriptor.fields" 
> yet element "org.apache.hadoop.hive.metastore.model.MColumn@43adb9db" doesnt 
> have the owner set. Managing the relation and setting the owner.
> 2025-05-27T23:08:02,690  INFO [Metastore-Handler-Pool: Thread-64] 
> DataNucleus.Persistence: Object 
> "org.apache.hadoop.hive.metastore.model.MColumnDescriptor@68097e71" has a 
> collection "org.apache.hadoop.hive.metastore.model.MColumnDescriptor.fields" 
> yet element "org.apache.hadoop.hive.metastore.model.MColumn@29276cc5" doesnt 
> have the owner set. Managing the relation and setting the owner.
> 2025-05-27T23:08:02,690  INFO [Metastore-Handler-Pool: Thread-64] 
> DataNucleus.Persistence: Object 
> "org.apache.hadoop.hive.metastore.model.MColumnDescriptor@68097e71" has a 
> collection "org.apache.hadoop.hive.metastore.model.MColumnDescriptor.fields" 
> yet element "org.apache.hadoop.hive.metastore.model.MColumn@109992db" doesnt 
> have the owner set. Managing the relation and setting the owner.
> 2025-05-27T23:08:02,691  INFO [Metastore-Handler-Pool: Thread-64] 
> DataNucleus.Persistence: Object 
> "org.apache.hadoop.hive.metastore.model.MColumnDescriptor@68097e71" has a 
> collection "org.apache.hadoop.hive.metastore.model.MColumnDescriptor.fields" 
> yet element "org.apache.hadoop.hive.metastore.model.MColumn@69748f31" doesnt 
> have the owner set. Managing the relation and setting the owner.
> 2025-05-27T23:08:02,691  INFO [Metastore-Handler-Pool: Thread-64] 
> DataNucleus.Persistence: Object 
> "org.apache.hadoop.hive.metastore.model.MColumnDescriptor@68097e71" has a 
> collection "org.apache.hadoop.hive.metastore.model.MColumnDescriptor.fields" 
> yet element "org.apache.hadoop.hive.metastore.model.MColumn@6ae74875" doesnt 
> have the owner set. Managing the relation and setting the owner. {code}
>  
>  
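
The occurrence count quoted above can be checked against an HMS log with a small script. The following is a minimal sketch, not part of the issue: the function name count_owner_messages and the regular expression are illustrative assumptions, and it assumes each message sits on a single line in the log file (the line breaks in the quoted excerpt come from mail wrapping). As a rough sanity check, 5000 partitions x 801 columns = 4,005,000, which is close to the reported 4006601; this may suggest about one message per column per partition, but the issue does not confirm that.

```python
import re
from collections import Counter

# Pattern for the DataNucleus message quoted in the issue.
# "doesnt" is spelled exactly as it appears in the log output.
OWNER_MSG = re.compile(
    r'Object\s+"(?P<owner>[\w.$]+@[0-9a-f]+)"\s+has\s+a\s+collection\s+'
    r'"(?P<field>[\w.$]+)"\s+yet\s+element\s+"(?P<element>[\w.$]+@[0-9a-f]+)"\s+'
    r'doesnt\s+have\s+the\s+owner\s+set'
)


def count_owner_messages(lines):
    """Count 'owner not set' messages in an iterable of log lines.

    Returns (total, per_owner), where per_owner maps each owning object
    (e.g. an MColumnDescriptor instance) to its number of messages.
    Lines are processed one at a time, so a large log file can be
    streamed without loading it into memory.
    """
    per_owner = Counter()
    for line in lines:
        match = OWNER_MSG.search(line)
        if match:
            per_owner[match.group("owner")] += 1
    return sum(per_owner.values()), per_owner


# Example usage (the file name is taken from the attachment list and is
# only a placeholder for wherever the log is stored locally):
#
#   with open("hivemetastore_with_HIVE-28909.log", errors="replace") as f:
#       total, per_owner = count_owner_messages(f)
#   print(total, per_owner.most_common(5))
```

Grouping by owner shows whether the messages concentrate on a few MColumnDescriptor instances or are spread across many, which may help when comparing the logs with and without HIVE-28909.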



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
