cshuo commented on code in PR #19045:
URL: https://github.com/apache/hudi/pull/19045#discussion_r3492391550


##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java:
##########


Review Comment:
   The table-service path, like `BaseHoodieCompactionPlanGenerator`, still asks 
the normal file-system view for slices under the logical partition name, so 
compaction can skip bucketed MDT log files. Cleaning has the same issue through 
its full-partition path discovery.
   



##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieAppendHandle.java:
##########
@@ -274,6 +275,37 @@ private void init(HoodieRecord record) {
     doInit = false;
   }
 
+  /**
+   * Returns true when this append handle is writing to a layout sub-path of 
an MDT partition
+   * (e.g. {@code record_index/0004} under sub-directory bucketing). In that 
case, the
+   * {@code .hoodie_partition_metadata} marker must NOT be created at the 
bucket level; it is
+   * written once at the logical partition root by the MDT initialization path.
+   *
+   * <p>Heuristic: an MDT partition path of the form {@code 
<known-mdt-partition>/<NNNN>} where
+   * {@code NNNN} is the standard 4-digit bucket name produced by the 
sub-directory bucketing
+   * layout. Third-party layouts using a different sub-path naming scheme can 
ship their own
+   * append-handle integration; the OSS-shipped layouts use this convention.
+   */
+  private boolean isMDTLayoutSubPath(String physicalPartitionPath) {
+    if (!hoodieTable.isMetadataTable() || physicalPartitionPath == null) {
+      return false;
+    }
+    int slash = physicalPartitionPath.lastIndexOf('/');
+    if (slash <= 0 || slash >= physicalPartitionPath.length() - 1) {
+      return false;
+    }
+    String last = physicalPartitionPath.substring(slash + 1);
+    if (last.length() != 4) {

Review Comment:
   This skip check assumes bucket directory names are exactly four digits. 
`SubDirBucketedMDTLayout` formats with `%04d`, but bucket index `10000+` 
becomes five digits, for e.g., default value for 
`GLOBAL_RECORD_LEVEL_INDEX_MAX_FILE_GROUP_COUNT_PROP` is 10000.



##########
hudi-common/src/main/java/org/apache/hudi/metadata/FlatMDTLayout.java:
##########
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.metadata;
+
+import org.apache.hudi.common.util.Option;
+
+import java.util.Collections;
+import java.util.List;
+
+/**
+ * Default MDT layout: every file group lives directly under its metadata
+ * partition directory, and a single {@code .hoodie_partition_metadata} marker
+ * lives at the partition root. This is the layout that existed before the
+ * layout SPI was introduced and remains the default for all tables that have
+ * not explicitly opted into a different layout.
+ */
+public final class FlatMDTLayout implements HoodieMetadataTableLayout {
+
+  public static final String LAYOUT_ID = "flat";
+
+  @Override
+  public String getLayoutId() {
+    return LAYOUT_ID;
+  }
+
+  @Override
+  public String getFileGroupRelativePath(LayoutContext ctx) {
+    return ctx.getPartitionType().getPartitionPath();

Review Comment:
   `LayoutContext` doesn't carry `relativePartitionPath`, so 
`FlatMDTLayout.getFileGroupRelativePath` / `getFileId` fall back to 
`ctx.getPartitionType().getPartitionPath()`. For `SECONDARY_INDEX` / 
`EXPRESSION_INDEX` that returns the static *prefix* (`"secondary_index_"` / 
`"expr_index_"`), not the real partition name (e.g. `secondary_index_idx0`).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to