soumyakanti3578 commented on code in PR #6198:
URL: https://github.com/apache/hive/pull/6198#discussion_r2695994946


##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java:
##########
@@ -3304,6 +3307,192 @@ public List<Void> run(List<String> input) throws Exception {
     return true;
   }
 
+  // a helper function which will firstly get the current COLUMN_STATS_ACCURATE parameter on table level
+  // secondly convert the JSON String into map, and update the information in it, and convert it back to JSON
+  // thirdly update the COLUMN_STATS_ACCURATE parameter with the new value on table level using directSql
+  public long updateColumnStatsAccurateForTable(Table table, List<String> droppedCols) throws MetaException {
+    String currentValue = table.getParameters().get("COLUMN_STATS_ACCURATE");
+    if (currentValue == null) return 0;
+
+    try {
+      ObjectMapper mapper = new ObjectMapper();
+
+      // Deserialize the JSON into a map
+      Map<String, Object> statsMap = mapper.readValue(currentValue, new TypeReference<Map<String, Object>>() {});
+
+      // Get the COLUMN_STATS object if it exists
+      Object columnStatsObj = statsMap.get("COLUMN_STATS");
+
+      if (columnStatsObj instanceof Map) {

Review Comment:
   nit: I think you can use:
   ```
   if (columnStatsObj instanceof Map columnStats) {
   ```
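
   A minimal sketch of what that could look like, assuming the build targets Java 16 or later (pattern matching for `instanceof`). The class and helper names are hypothetical, and `Map<?, ?>` is used here instead of the raw `Map` only to avoid a raw-type warning:

   ```java
   import java.util.Map;

   class PatternMatchSketch {
     // Hypothetical helper: removes one column entry from the nested
     // COLUMN_STATS map, if that entry exists and is a map.
     static boolean removeColumn(Map<String, Object> statsMap, String col) {
       Object columnStatsObj = statsMap.get("COLUMN_STATS");
       // The pattern variable replaces the separate cast on the next line of the PR.
       if (columnStatsObj instanceof Map<?, ?> columnStats) {
         return columnStats.remove(col) != null;
       }
       return false;
     }
   }
   ```

   With this form the explicit `(Map<String, String>)` cast and its unchecked warning go away, since `remove` only needs an `Object` key.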



##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java:
##########
@@ -3304,6 +3307,192 @@ public List<Void> run(List<String> input) throws Exception {
     return true;
   }
 
+  // a helper function which will firstly get the current COLUMN_STATS_ACCURATE parameter on table level
+  // secondly convert the JSON String into map, and update the information in it, and convert it back to JSON
+  // thirdly update the COLUMN_STATS_ACCURATE parameter with the new value on table level using directSql
+  public long updateColumnStatsAccurateForTable(Table table, List<String> droppedCols) throws MetaException {
+    String currentValue = table.getParameters().get("COLUMN_STATS_ACCURATE");
+    if (currentValue == null) return 0;
+
+    try {
+      ObjectMapper mapper = new ObjectMapper();
+
+      // Deserialize the JSON into a map
+      Map<String, Object> statsMap = mapper.readValue(currentValue, new TypeReference<Map<String, Object>>() {});
+
+      // Get the COLUMN_STATS object if it exists
+      Object columnStatsObj = statsMap.get("COLUMN_STATS");
+
+      if (columnStatsObj instanceof Map) {
+        Map<String, String> columnStats = (Map<String, String>) columnStatsObj;
+
+        boolean removeAll = (droppedCols == null || droppedCols.isEmpty());
+
+        if (removeAll) {
+          // Remove entire column stats
+          statsMap.remove("COLUMN_STATS");
+        } else {
+          // Remove only the dropped columns
+          for (String col : droppedCols) {
+            if (col != null) {
+              columnStats.remove(col.toLowerCase());
+            }

Review Comment:
   Can `col` be null here, given that it comes from a list of column names? If it can 
never be null, we should probably remove `if (col != null)`.
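
   Whether the guard is needed depends on how callers build `droppedCols`, which this hunk does not show. As a small illustration (the class and helper names are hypothetical), a list created with `Arrays.asList` or `ArrayList` may contain null, and the unguarded call then fails, whereas `List.of` rejects null elements up front:

   ```java
   import java.util.ArrayList;
   import java.util.List;

   class NullColumnSketch {
     // Hypothetical stand-in for the loop under review, without the null guard.
     static List<String> lowerCaseAll(List<String> cols) {
       List<String> out = new ArrayList<>();
       for (String col : cols) {
         // Throws NullPointerException if the list contains a null element.
         out.add(col.toLowerCase());
       }
       return out;
     }
   }
   ```

   If the callers can guarantee non-null names, dropping the check keeps the loop simpler; otherwise the guard, or an explicit validation at the entry point, is still doing real work.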



##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java:
##########
@@ -3304,6 +3307,192 @@ public List<Void> run(List<String> input) throws Exception {
     return true;
   }
 
+  // a helper function which will firstly get the current COLUMN_STATS_ACCURATE parameter on table level
+  // secondly convert the JSON String into map, and update the information in it, and convert it back to JSON
+  // thirdly update the COLUMN_STATS_ACCURATE parameter with the new value on table level using directSql
+  public long updateColumnStatsAccurateForTable(Table table, List<String> droppedCols) throws MetaException {
+    String currentValue = table.getParameters().get("COLUMN_STATS_ACCURATE");
+    if (currentValue == null) return 0;

Review Comment:
   It's good practice to always add braces:
   
   ```
   if (currentValue == null) {
     return 0;
   }
   ```



##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java:
##########
@@ -3304,6 +3307,192 @@ public List<Void> run(List<String> input) throws Exception {
     return true;
   }
 
+  // a helper function which will firstly get the current COLUMN_STATS_ACCURATE parameter on table level
+  // secondly convert the JSON String into map, and update the information in it, and convert it back to JSON
+  // thirdly update the COLUMN_STATS_ACCURATE parameter with the new value on table level using directSql

Review Comment:
   Please use a Javadoc comment (`/** .. */`) for the method description.
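
   For example, something along these lines; a sketch only. The wording is taken from the existing `//` comments, while the stub class, the simplified `Map` parameter (standing in for `Table`), and the method body are placeholders rather than the real Hive code:

   ```java
   import java.util.List;
   import java.util.Map;

   class JavadocSketch {
     /**
      * Updates the table-level {@code COLUMN_STATS_ACCURATE} parameter after column stats are dropped.
      * <ol>
      *   <li>Reads the current {@code COLUMN_STATS_ACCURATE} value from the table parameters.</li>
      *   <li>Parses the JSON string into a map, removes the affected entries, and serializes it back to JSON.</li>
      *   <li>Writes the new value back at table level using direct SQL.</li>
      * </ol>
      *
      * @param tableParams table parameters (placeholder for the {@code Table} argument in the PR)
      * @param droppedCols columns whose entries are removed; if null or empty, the whole
      *                    {@code COLUMN_STATS} entry is removed
      * @return 0 if the parameter is not set; otherwise the result of the direct SQL update
      */
     static long updateColumnStatsAccurateForTable(Map<String, String> tableParams, List<String> droppedCols) {
       // The real method also declares `throws MetaException`, which would get an @throws tag.
       String currentValue = tableParams.get("COLUMN_STATS_ACCURATE");
       if (currentValue == null) {
         return 0;
       }
       throw new UnsupportedOperationException("JSON rewrite and direct SQL update omitted in this sketch");
     }
   }
   ```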



##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java:
##########
@@ -3304,6 +3307,192 @@ public List<Void> run(List<String> input) throws Exception {
     return true;
   }
 
+  // a helper function which will firstly get the current COLUMN_STATS_ACCURATE parameter on table level
+  // secondly convert the JSON String into map, and update the information in it, and convert it back to JSON
+  // thirdly update the COLUMN_STATS_ACCURATE parameter with the new value on table level using directSql
+  public long updateColumnStatsAccurateForTable(Table table, List<String> droppedCols) throws MetaException {
+    String currentValue = table.getParameters().get("COLUMN_STATS_ACCURATE");
+    if (currentValue == null) return 0;
+
+    try {
+      ObjectMapper mapper = new ObjectMapper();
+
+      // Deserialize the JSON into a map
+      Map<String, Object> statsMap = mapper.readValue(currentValue, new TypeReference<Map<String, Object>>() {});
+
+      // Get the COLUMN_STATS object if it exists
+      Object columnStatsObj = statsMap.get("COLUMN_STATS");
+
+      if (columnStatsObj instanceof Map) {
+        Map<String, String> columnStats = (Map<String, String>) columnStatsObj;
+
+        boolean removeAll = (droppedCols == null || droppedCols.isEmpty());
+
+        if (removeAll) {
+          // Remove entire column stats
+          statsMap.remove("COLUMN_STATS");
+        } else {
+          // Remove only the dropped columns
+          for (String col : droppedCols) {
+            if (col != null) {
+              columnStats.remove(col.toLowerCase());
+            }
+          }
+          if (columnStats.isEmpty()) {
+            statsMap.remove("COLUMN_STATS");
+          }
+        }
+      }
+
+      // Serialize the map into a new JSON string
+      String updatedValue = mapper.writeValueAsString(statsMap);
+
+      // Update the COLUMN_STATS_ACCURATE parameter
+      return updateTableParam(table, "COLUMN_STATS_ACCURATE", currentValue, updatedValue);
+    } catch (Exception e) {
+      throw new MetaException("Failed to parse/update COLUMN_STATS_ACCURATE: " + e.getMessage());
+    }
+  }
+
+
+
+  public boolean updateColumnStatsAccurateForPartitions(String catName, String dbName, Table table,
+                                                     List<String> partNames, List<String> colNames) throws MetaException {
+    if (partNames == null || partNames.isEmpty()) {
+      return true;
+    }
+
+    ObjectMapper mapper = new ObjectMapper();
+
+    // If colNames is empty, then all the column stats of all columns should be deleted fetch all table column names
+    List<String> effectiveColNames;
+    if (colNames == null || colNames.isEmpty()) {
+      if (table.getSd().getCols() == null) {
+        effectiveColNames = new ArrayList<>();
+      } else {
+        effectiveColNames = table.getSd().getCols().stream()
+                .map(f -> f.getName().toLowerCase())
+                .collect(Collectors.toList());
+      }
+    } else {
+      effectiveColNames = colNames.stream().map(String::toLowerCase).collect(Collectors.toList());
+    }
+    List<String> finalColNames = effectiveColNames;
+
+    try {
+      Batchable.runBatched(batchSize, partNames, new Batchable<String, Void>() {
+        @Override
+        public List<Void> run(List<String> input) throws Exception {
+          // 1. Construct SQL filter for partition names
+          String sqlFilter = PARTITIONS + ".\"PART_NAME\" in (" + makeParams(input.size()) + ")";
+
+          // 2. Fetch PART_IDs of the partitions which are need to be changed
+          List<Long> partitionIds = getPartitionIdsViaSqlFilter(
+                  catName, dbName, table.getTableName(), sqlFilter, input, Collections.emptyList(), -1);
+
+          if (partitionIds.isEmpty()) return null;
+
+          // 3. Get current COLUMN_STATS_ACCURATE values
+          Map<Long, String> partStatsAccurateMap = getColumnStatsAccurateByPartitionIds(partitionIds);
+
+          // 4. Iterate each partition to update COLUMN_STATS_ACCURATE
+          for (Long partId : partitionIds) {
+            String currentValue = partStatsAccurateMap.get(partId);
+            if (currentValue == null) continue;
+
+            try {
+              Map<String, Object> statsMap = mapper.readValue(
+                      currentValue, new TypeReference<Map<String, Object>>() {});
+              Object columnStatsObj = statsMap.get("COLUMN_STATS");
+
+              boolean changed = false;
+              if (columnStatsObj instanceof Map) {
+                Map<String, String> columnStats = (Map<String, String>) columnStatsObj;
+                for (String col : finalColNames) {
+                  if (columnStats.remove(col) != null) {
+                    changed = true;
+                  }
+                }
+
+                if (columnStats.isEmpty()) {
+                  statsMap.remove("COLUMN_STATS");
+                  changed = true;
+                }
+              }
+
+              if (!statsMap.containsKey("COLUMN_STATS")) {
+                if (statsMap.remove("BASIC_STATS") != null) {
+                  changed = true;
+                }
+              }
+
+              if (changed) {
+                String updatedValue = mapper.writeValueAsString(statsMap);
+                updatePartitionParam(partId,
+                        StatsSetupConst.COLUMN_STATS_ACCURATE, currentValue, updatedValue);
+              }
+
+            } catch (Exception e) {
+              throw new MetaException("Failed to update COLUMN_STATS_ACCURATE for PART_ID " + partId + ": " + e.getMessage());
+            }
+          }
+
+          return null;
+        }
+      });
+
+      return true; // All succeeded
+    } catch (Exception e) {
+      LOG.warn("Failed to update COLUMN_STATS_ACCURATE for some partitions", e);
+      return false; // Failed batch
+    }
+  }
+
+
+  private Map<Long, String> getColumnStatsAccurateByPartitionIds(List<Long> partIds) throws MetaException {
+    if (partIds == null || partIds.isEmpty()) {
+      return Collections.emptyMap();
+    }
+
+    StringBuilder queryText = new StringBuilder();
+    queryText.append("SELECT \"PART_ID\", \"PARAM_VALUE\" FROM ")
+            .append(PARTITION_PARAMS)
+            .append(" WHERE \"PARAM_KEY\" = ? AND \"PART_ID\" IN (")
+            .append(makeParams(partIds.size()))
+            .append(")");
+
+    // Create params: first COLUMN_STATS_ACCURATE, then all partIds
+    Object[] params = new Object[1 + partIds.size()];
+    params[0] = StatsSetupConst.COLUMN_STATS_ACCURATE;
+    for (int i = 0; i < partIds.size(); i++) {
+      params[i + 1] = partIds.get(i);
+    }
+
+    try (QueryWrapper query = new QueryWrapper(pm.newQuery("javax.jdo.query.SQL", queryText.toString()))) {
+      @SuppressWarnings("unchecked")
+      List<Object> sqlResult = executeWithArray(query.getInnerQuery(), params, queryText.toString());
+
+      Map<Long, String> result = new HashMap<>();
+      for (Object row : sqlResult) {
+        Object[] fields = (Object[]) row;
+        Long partId = MetastoreDirectSqlUtils.extractSqlLong(fields[0]);
+        String value = fields[1] == null ? null : fields[1].toString();
+        result.put(partId, value);
+      }
+
+      return result;
+    }
+  }
+
+
+
+
+
+
+
+
+
+

Review Comment:
   Please remove these extra blank lines.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

