boneanxs commented on code in PR #8452: URL: https://github.com/apache/hudi/pull/8452#discussion_r1191878586
########## hudi-common/src/main/java/org/apache/hudi/expression/Expression.java: ########## @@ -40,14 +51,19 @@ public enum Operator { } } - private final List<Expression> children; + List<Expression> getChildren(); Review Comment: I change this to Interface, which doesn't allowed specify modifier. ########## hudi-common/src/main/java/org/apache/hudi/metadata/FileSystemBackedTableMetadata.java: ########## @@ -84,6 +96,19 @@ public List<String> getAllPartitionPaths() throws IOException { return getPartitionPathWithPathPrefixes(Collections.singletonList("")); } + @Override + public List<String> getPartitionPathByExpression(List<String> relativePathPrefixes, Review Comment: Sure, will add tests soon ########## hudi-common/src/main/java/org/apache/hudi/metadata/FileSystemBackedTableMetadata.java: ########## @@ -58,13 +62,21 @@ public class FileSystemBackedTableMetadata implements HoodieTableMetadata { private final SerializableConfiguration hadoopConf; private final String datasetBasePath; private final boolean assumeDatePartitioning; + private final boolean hiveStylePartitioningEnabled; + private final boolean urlEncodePartitioningEnabled; public FileSystemBackedTableMetadata(HoodieEngineContext engineContext, SerializableConfiguration conf, String datasetBasePath, boolean assumeDatePartitioning) { this.engineContext = engineContext; this.hadoopConf = conf; this.datasetBasePath = datasetBasePath; this.assumeDatePartitioning = assumeDatePartitioning; + HoodieTableMetaClient metaClient = HoodieTableMetaClient.builder() Review Comment: `FileSystemBackedTableMetadata` directly implement interface `HoodieTableMetadata`, while `metaClient` is initialized in Class `BaseTableMetadata`. Maybe I need to move metaClient out instead of local variable, in case others may need it also. ########## hudi-common/src/main/java/org/apache/hudi/metadata/FileSystemBackedTableMetadata.java: ########## @@ -95,11 +120,38 @@ public List<String> getPartitionPathWithPathPrefixes(List<String> relativePathPr }).collect(Collectors.toList()); } + private int getRelativePathPartitionLevel(Types.RecordType partitionFields, String relativePathPrefix) { + if (StringUtils.isNullOrEmpty(relativePathPrefix) || partitionFields == null || partitionFields.fields().size() == 1) { + return 0; + } + + int level = 0; + for (int i = 1; i < relativePathPrefix.length() - 1; i++) { Review Comment: partitionFields have all partition columns, while relativePathPrefix only contains partial partitions. For ex. partitionFields could be <region, date, hour>, while relativePathPrefix is `/region=US`, we should return 1 to indicate the start partition index. ########## hudi-common/src/main/java/org/apache/hudi/metadata/FileSystemBackedTableMetadata.java: ########## @@ -95,11 +120,38 @@ public List<String> getPartitionPathWithPathPrefixes(List<String> relativePathPr }).collect(Collectors.toList()); } + private int getRelativePathPartitionLevel(Types.RecordType partitionFields, String relativePathPrefix) { + if (StringUtils.isNullOrEmpty(relativePathPrefix) || partitionFields == null || partitionFields.fields().size() == 1) { + return 0; + } + + int level = 0; + for (int i = 1; i < relativePathPrefix.length() - 1; i++) { Review Comment: By the way, is it possible we have more than 1 partition columns, while `hoodie.datasource.write.partitionpath.urlencode` is disabled. For ex. partitionFields is <region, date, hour> and some partition values are "/US/2023/05/12/10"(which means region=US, date=2023/05/12, hour=10), now we have only 3 partitions, but we could get 4 partition levels. We can handle only one partition column while its values contains `/`(like date=2023/05/12), but looks it's difficult to identify which value corresponds to which column if we have many columns. Do you have any suggestions? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org