Copilot commented on code in PR #6198:
URL: https://github.com/apache/hive/pull/6198#discussion_r2904525436


##########
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java:
##########
@@ -1961,6 +1971,12 @@ public void testColumnStatistics() throws Throwable {
           Lists.newArrayList(partitions.get(0), partitions.get(1), 
partitions.get(2)), Lists.newArrayList(colName), ENGINE);
      assertEquals(1, stats2.size());
      assertEquals(2, stats2.get(partitions.get(2)).size());
+     // test if all columns are deleted from parameter COLUMN_STATS_ACCURATE
+     Partition partition_0 = client.getPartition(dbName, tblName, 
partitions.get(0));
+     Map<String, String> partitionParams = partition_0.getParameters();
+     String partition_column_stats_accurate = 
partitionParams.get("COLUMN_STATS_ACCURATE");
+     assertTrue("parameter COLUMN_STATS_ACCURATE is not accurate in " + 
partitions.get(0),partition_column_stats_accurate == null ||
+             (!table_column_stats_accurate.contains(colName[0]) && 
!table_column_stats_accurate.contains(colName[1])));

Review Comment:
   The assertion on lines 1978–1979 checks `table_column_stats_accurate` (the 
table-level variable from the outer scope) instead of 
`partition_column_stats_accurate` (the partition-level variable defined on line 
1977). This means the test is not actually verifying that the partition's 
`COLUMN_STATS_ACCURATE` parameter was correctly updated — it's re-checking the 
table-level value. Both conditions should reference 
`partition_column_stats_accurate`.
   ```suggestion
        assertTrue("parameter COLUMN_STATS_ACCURATE is not accurate in " + 
partitions.get(0), partition_column_stats_accurate == null ||
                (!partition_column_stats_accurate.contains(colName[0]) && 
!partition_column_stats_accurate.contains(colName[1])));
   ```



##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java:
##########
@@ -10138,6 +10170,18 @@ private boolean 
deleteTableColumnStatisticsViaJdo(String catName, String dbName,
       } else {
         throw new NoSuchObjectException("Column stats doesn't exist for db=" + 
dbName + " table="
             + tableName + " col=" + String.join(", ", colNames));
+      }
+        // get the persistent object MTable
+      MTable mTable = getMTable(catName, dbName, tableName);
+      if (mTable != null) {
+        Map<String, String> tableParams = mTable.getParameters();
+        if (tableParams != null) {
+          if (colNames == null || colNames.isEmpty()) {
+            StatsSetupConst.clearColumnStatsState(tableParams);
+          } else {
+            StatsSetupConst.removeColumnStatsState(tableParams, colNames);
+          }
+        }
       }

Review Comment:
   This block is inside the `else` branch of the `if (result > 0)` / `else` 
conditional that throws `NoSuchObjectException`. Because the closing brace of 
the `else` (line 10173 in the diff) is placed before this block, the 
`COLUMN_STATS_ACCURATE` update only executes when the deletion returned zero 
rows (i.e., when the column stats did not exist), not on success. The block 
should be inside the successful branch so that it only runs after stats are 
actually deleted. Verify and correct the placement of the braces to ensure this 
update runs after a successful deletion.



##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java:
##########
@@ -2760,6 +2763,165 @@ public List<Void> run(List<String> input) throws 
Exception {
     return true;
   }
 
+  /**
+      a helper function which will get the current COLUMN_STATS_ACCURATE 
parameter on table level
+      and update the COLUMN_STATS_ACCURATE parameter with the new value on 
table level using directSql
+   */
+  public long updateColumnStatsAccurateForTable(Table table, List<String> 
droppedCols) throws MetaException {
+    Map<String, String> params = table.getParameters();
+    // get the current COLUMN_STATS_ACCURATE
+    String currentValue = params.get(StatsSetupConst.COLUMN_STATS_ACCURATE);
+    if (currentValue == null) {
+      return 0;
+    }
+    try {
+      // if the dropping columns is empty, that means we delete all the columns
+      if (droppedCols == null || droppedCols.isEmpty()) {
+        StatsSetupConst.clearColumnStatsState(params);
+      } else {
+        StatsSetupConst.removeColumnStatsState(params, droppedCols);
+      }
+
+      String updatedValue = params.get(StatsSetupConst.COLUMN_STATS_ACCURATE);
+      // if the COL_STATS_ACCURATE has changed, then update it using directSql
+      if (currentValue.equals(updatedValue)) {
+        return 0;
+      }
+      return updateTableParam(table, StatsSetupConst.COLUMN_STATS_ACCURATE, 
currentValue, updatedValue);
+    } catch (Exception e) {
+      throw new MetaException("Failed to parse/update COLUMN_STATS_ACCURATE: " 
+ e.getMessage());
+    }
+  }
+
+
+
+  public boolean updateColumnStatsAccurateForPartitions(String catName, String 
dbName, Table table,
+                                                     List<String> partNames, 
List<String> colNames) throws MetaException {
+    if (partNames == null || partNames.isEmpty()) {
+      return true;
+    }
+
+    ObjectMapper mapper = new ObjectMapper();
+
+    // If colNames is empty, then all the column stats of all columns should 
be deleted fetch all table column names
+    List<String> effectiveColNames;
+    if (colNames == null || colNames.isEmpty()) {
+      if (table.getSd().getCols() == null) {
+        effectiveColNames = new ArrayList<>();
+      } else {
+        effectiveColNames = table.getSd().getCols().stream()
+                .map(f -> f.getName().toLowerCase())
+                .collect(Collectors.toList());
+      }
+    } else {
+      effectiveColNames = 
colNames.stream().map(String::toLowerCase).collect(Collectors.toList());
+    }
+
+    try {
+      Batchable.runBatched(batchSize, partNames, new Batchable<String, Void>() 
{
+        @Override
+        public List<Void> run(List<String> input) throws Exception {
+          // 1. Construct SQL filter for partition names
+          String sqlFilter = PARTITIONS + ".\"PART_NAME\" in (" + 
makeParams(input.size()) + ")";
+
+          // 2. Fetch PART_IDs of the partitions which are need to be changed
+          List<Long> partitionIds = getPartitionIdsViaSqlFilter(
+                  catName, dbName, table.getTableName(), sqlFilter, input, 
Collections.emptyList(), -1);
+
+          if (partitionIds.isEmpty()) return null;
+
+          // 3. Get current COLUMN_STATS_ACCURATE values
+          Map<Long, String> partStatsAccurateMap = 
getColumnStatsAccurateByPartitionIds(partitionIds);
+
+          // 4. Iterate each partition to update COLUMN_STATS_ACCURATE
+          for (Long partId : partitionIds) {
+            String currentValue = partStatsAccurateMap.get(partId);
+            if (currentValue == null) continue;
+
+            try {
+              Map<String, Object> statsMap = mapper.readValue(
+                      currentValue, new TypeReference<Map<String, Object>>() 
{});
+              Object columnStatsObj = statsMap.get("COLUMN_STATS");
+
+              boolean changed = false;
+              if (columnStatsObj instanceof Map) {
+                Map<String, String> columnStats = (Map<String, String>) 
columnStatsObj;
+                for (String col : effectiveColNames) {
+                  if (columnStats.remove(col) != null) {
+                    changed = true;
+                  }
+                }
+
+                if (columnStats.isEmpty()) {
+                  statsMap.remove("COLUMN_STATS");
+                  changed = true;
+                }
+              }
+
+              if (!statsMap.containsKey("COLUMN_STATS")) {
+                if (statsMap.remove("BASIC_STATS") != null) {

Review Comment:
   The string literal `"BASIC_STATS"` is used as a magic string. The existing 
`StatsSetupConst` class already defines constants for stats-related keys (e.g., 
`StatsSetupConst.BASIC_STATS`). This literal should be replaced with the 
appropriate constant to be consistent with the rest of the codebase and avoid 
potential mismatches.
   ```suggestion
                   if (statsMap.remove(StatsSetupConst.BASIC_STATS) != null) {
   ```



##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java:
##########
@@ -2760,6 +2763,165 @@ public List<Void> run(List<String> input) throws 
Exception {
     return true;
   }
 
+  /**
+      a helper function which will get the current COLUMN_STATS_ACCURATE 
parameter on table level
+      and update the COLUMN_STATS_ACCURATE parameter with the new value on 
table level using directSql

Review Comment:
   The Javadoc comment for `updateColumnStatsAccurateForTable` is written in 
lowercase and does not follow standard Javadoc format. It also doesn't document 
the parameters (`table`, `droppedCols`) or the return value. Consider using a 
proper Javadoc block with `@param` and `@return` tags.
   ```suggestion
      * Updates the {@code COLUMN_STATS_ACCURATE} parameter for the given table 
using direct SQL.
      *
      * <p>The current value of {@code COLUMN_STATS_ACCURATE} is read from the 
table parameters and
      * adjusted to reflect the columns that have been dropped.</p>
      *
      * @param table the table whose {@code COLUMN_STATS_ACCURATE} parameter 
should be updated
      * @param droppedCols list of column names that have been dropped; if 
{@code null} or empty,
      *                    all column statistics are cleared
      * @return the number of table parameter rows that were updated in the 
metastore, or {@code 0}
      *         if no update was necessary
      * @throws MetaException if the {@code COLUMN_STATS_ACCURATE} parameter 
cannot be parsed or updated
   ```



##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java:
##########
@@ -2760,6 +2763,165 @@ public List<Void> run(List<String> input) throws 
Exception {
     return true;
   }
 
+  /**
+      a helper function which will get the current COLUMN_STATS_ACCURATE 
parameter on table level
+      and update the COLUMN_STATS_ACCURATE parameter with the new value on 
table level using directSql
+   */
+  public long updateColumnStatsAccurateForTable(Table table, List<String> 
droppedCols) throws MetaException {
+    Map<String, String> params = table.getParameters();
+    // get the current COLUMN_STATS_ACCURATE
+    String currentValue = params.get(StatsSetupConst.COLUMN_STATS_ACCURATE);
+    if (currentValue == null) {
+      return 0;
+    }
+    try {
+      // if the dropping columns is empty, that means we delete all the columns
+      if (droppedCols == null || droppedCols.isEmpty()) {
+        StatsSetupConst.clearColumnStatsState(params);
+      } else {
+        StatsSetupConst.removeColumnStatsState(params, droppedCols);
+      }
+
+      String updatedValue = params.get(StatsSetupConst.COLUMN_STATS_ACCURATE);
+      // if the COL_STATS_ACCURATE has changed, then update it using directSql
+      if (currentValue.equals(updatedValue)) {
+        return 0;
+      }
+      return updateTableParam(table, StatsSetupConst.COLUMN_STATS_ACCURATE, 
currentValue, updatedValue);
+    } catch (Exception e) {
+      throw new MetaException("Failed to parse/update COLUMN_STATS_ACCURATE: " 
+ e.getMessage());
+    }
+  }
+
+
+
+  public boolean updateColumnStatsAccurateForPartitions(String catName, String 
dbName, Table table,
+                                                     List<String> partNames, 
List<String> colNames) throws MetaException {
+    if (partNames == null || partNames.isEmpty()) {
+      return true;
+    }
+
+    ObjectMapper mapper = new ObjectMapper();
+
+    // If colNames is empty, then all the column stats of all columns should 
be deleted fetch all table column names
+    List<String> effectiveColNames;
+    if (colNames == null || colNames.isEmpty()) {
+      if (table.getSd().getCols() == null) {
+        effectiveColNames = new ArrayList<>();
+      } else {
+        effectiveColNames = table.getSd().getCols().stream()
+                .map(f -> f.getName().toLowerCase())
+                .collect(Collectors.toList());
+      }
+    } else {
+      effectiveColNames = 
colNames.stream().map(String::toLowerCase).collect(Collectors.toList());
+    }
+
+    try {
+      Batchable.runBatched(batchSize, partNames, new Batchable<String, Void>() 
{
+        @Override
+        public List<Void> run(List<String> input) throws Exception {
+          // 1. Construct SQL filter for partition names
+          String sqlFilter = PARTITIONS + ".\"PART_NAME\" in (" + 
makeParams(input.size()) + ")";
+
+          // 2. Fetch PART_IDs of the partitions which are need to be changed
+          List<Long> partitionIds = getPartitionIdsViaSqlFilter(
+                  catName, dbName, table.getTableName(), sqlFilter, input, 
Collections.emptyList(), -1);
+
+          if (partitionIds.isEmpty()) return null;
+
+          // 3. Get current COLUMN_STATS_ACCURATE values
+          Map<Long, String> partStatsAccurateMap = 
getColumnStatsAccurateByPartitionIds(partitionIds);
+
+          // 4. Iterate each partition to update COLUMN_STATS_ACCURATE
+          for (Long partId : partitionIds) {
+            String currentValue = partStatsAccurateMap.get(partId);
+            if (currentValue == null) continue;
+
+            try {
+              Map<String, Object> statsMap = mapper.readValue(
+                      currentValue, new TypeReference<Map<String, Object>>() 
{});
+              Object columnStatsObj = statsMap.get("COLUMN_STATS");

Review Comment:
   Similar to `"BASIC_STATS"`, the string literal `"COLUMN_STATS"` should be 
replaced with `StatsSetupConst.COLUMN_STATS` (or the equivalent constant) for 
consistency with the rest of the codebase.



##########
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStoreUpdateUsingEvents.java:
##########
@@ -829,7 +829,16 @@ private String getValidWriteIds(String dbName, String 
tblName) throws Throwable
   private void validateTablePara(String dbName, String tblName) throws 
Throwable {
     Table tblRead = rawStore.getTable(DEFAULT_CATALOG_NAME, dbName, tblName);
     Table tblRead1 = sharedCache.getTableFromCache(DEFAULT_CATALOG_NAME, 
dbName, tblName);
-    Assert.assertEquals(tblRead.getParameters(), tblRead1.getParameters());
+    // Prepare both the expected and actual table parameters
+    Map<String, String> expected = new HashMap<>(tblRead.getParameters());
+    Map<String, String> actual = new HashMap<>(tblRead1.getParameters());
+
+    // Remove the COLUMN_STATS_ACCURATE entry from both maps, because it is 
now completely removed
+    expected.remove("COLUMN_STATS_ACCURATE");
+    actual.remove("COLUMN_STATS_ACCURATE");

Review Comment:
   The string literal `"COLUMN_STATS_ACCURATE"` should be replaced with 
`StatsSetupConst.COLUMN_STATS_ACCURATE` to be consistent with the rest of the 
codebase and avoid hard-coded strings.
   ```suggestion
       expected.remove(StatsSetupConst.COLUMN_STATS_ACCURATE);
       actual.remove(StatsSetupConst.COLUMN_STATS_ACCURATE);
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to