github-actions[bot] commented on code in PR #64941:
URL: https://github.com/apache/doris/pull/64941#discussion_r3492817796
##########
fe/fe-filesystem/fe-filesystem-s3/src/main/java/org/apache/doris/filesystem/s3/S3FileSystem.java:
##########
@@ -48,19 +49,19 @@ public Optional<S3FileSystemProperties> properties() {
}
@Override
- protected String globListPrefix(String globPattern) {
+ protected String globListListingPrefix(String globPattern) {
if (isDirectoryBucketEndpoint()) {
return slashTerminatedNonGlobPrefix(globPattern);
}
- return super.globListPrefix(globPattern);
+ return super.globListListingPrefix(globPattern);
}
@Override
- protected List<String> globListPrefixes(String globPattern, String listPrefix) {
+ protected List<String> globListObjectPrefixes(String globPattern, String listingPrefix) {
Review Comment:
The directory-bucket path still inherits key-based resume from
`S3CompatibleFileSystem`: after this override returns `List.of(listingPrefix)`,
the shared paginator calls `listObjectsWithOptions(... startAfter(startAfter)
...)`, and `S3ObjStorage` sends that as `ListObjectsV2Request.startAfter`.
`S3SourceOffsetProvider` will pass a non-empty `currentOffset.endFile` on the
next batch, but AWS documents directory buckets as not supporting `StartAfter`
and not returning objects in lexicographical order
(https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html). So a
batched/resumed directory-bucket glob can fail the list request or skip data
even though the prefix is now slash-terminated. Please keep directory buckets
off the key-cursor path, for example by rejecting limited/resumed glob listing
for directory buckets or by using a directory-bucket-specific
continuation-token cursor with coverage for the second page.
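
To make the two suggestions concrete, here is a minimal sketch of what they might look like. Every name in it (`DirectoryBucketListing`, `Page`, `PageLister`, `rejectKeyCursor`, `listAll`) is hypothetical and is not an existing Doris or AWS SDK API; `PageLister` only stands in for a `ListObjectsV2` call that takes a continuation token instead of `startAfter`.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: all names here are hypothetical, not Doris or AWS SDK APIs. */
final class DirectoryBucketListing {

    /** One page of keys plus the opaque token for the next page (null when done). */
    static final class Page {
        final List<String> keys;
        final String nextToken;

        Page(List<String> keys, String nextToken) {
            this.keys = keys;
            this.nextToken = nextToken;
        }
    }

    /** Stand-in for a list call that accepts only a continuation token. */
    interface PageLister {
        Page list(String prefix, String continuationToken);
    }

    /** Option 1: refuse key-based resume when the target is a directory bucket. */
    static void rejectKeyCursor(boolean directoryBucket, String startAfter) {
        if (directoryBucket && startAfter != null && !startAfter.isEmpty()) {
            throw new UnsupportedOperationException(
                    "directory buckets cannot resume a listing from key: " + startAfter);
        }
    }

    /** Option 2: walk all pages by token, without assuming any key ordering. */
    static List<String> listAll(PageLister lister, String prefix) {
        List<String> keys = new ArrayList<>();
        String token = null;
        do {
            Page page = lister.list(prefix, token);
            keys.addAll(page.keys);
            token = page.nextToken;
        } while (token != null);
        return keys;
    }
}
```

A test for the second option would presumably use a fake lister that returns at least two pages with keys out of lexicographical order, so the second page is exercised without relying on `startAfter`.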
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]