[
https://issues.apache.org/jira/browse/HDDS-12589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Arafat Khan updated HDDS-12589:
-------------------------------
Description:
*Problem:*
In FSO buckets, files with the same name uploaded into different directories
were merged into a single key record. Recon’s container key mapping used only
the volume, bucket, and file name as the unique identifier, so the directory
path was ignored.
*Reproducing the Issue:*
The issue can be reproduced by creating a nested directory structure and
uploading two files (testfile1 and testfile2) at different directory depths.
For example, run the following commands:
{code:bash}
ozone fs -mkdir -p ofs://om/volume1/fso-bucket/dir1/dir2/dir3
ozone fs -put -f testfile1 ofs://om/volume1/fso-bucket/dir1/
ozone fs -put -f testfile2 ofs://om/volume1/fso-bucket/dir1/
ozone fs -put -f testfile1 ofs://om/volume1/fso-bucket/dir1/dir2/
ozone fs -put -f testfile2 ofs://om/volume1/fso-bucket/dir1/dir2/
ozone fs -put -f testfile1 ofs://om/volume1/fso-bucket/dir1/dir2/dir3/
ozone fs -put -f testfile2 ofs://om/volume1/fso-bucket/dir1/dir2/dir3/
{code}
In this scenario, the same two file names ({{testfile1}} and {{testfile2}})
are created at three directory levels ({{dir1}}, {{dir1/dir2}}, and
{{dir1/dir2/dir3}}), giving six distinct files.
*Root Cause:*
The Recon container key mapping computed a unique key from only the volume,
bucket, and file name. For FSO buckets, the directory structure is encoded in
the raw key prefix (using negative object IDs), but this information was
omitted from the computed key. As a result, files with identical names in
different directories were incorrectly merged.
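The collision can be illustrated with a small stand-alone sketch. It is not Recon code: the function {{name_only_key}}, the key format, and the dictionary standing in for the mapping are hypothetical, chosen only to mirror the description above.

```python
# Illustrative sketch only; not Recon's actual code or key format.

def name_only_key(volume: str, bucket: str, path: str) -> str:
    """Build a key from volume, bucket, and file name, dropping the directories."""
    file_name = path.rsplit("/", 1)[-1]
    return f"/{volume}/{bucket}/{file_name}"

# The six files created by the reproduction commands.
uploaded = [
    "dir1/testfile1",
    "dir1/testfile2",
    "dir1/dir2/testfile1",
    "dir1/dir2/testfile2",
    "dir1/dir2/dir3/testfile1",
    "dir1/dir2/dir3/testfile2",
]

mapping = {}
for path in uploaded:
    # A later file overwrites an earlier one that produced the same key.
    mapping[name_only_key("volume1", "fso-bucket", path)] = path

print(len(uploaded), "files uploaded,", len(mapping), "records kept")
```

Because the directory part never reaches the key, the six uploads collapse into two records, one per file name.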
*Fix:*
The fix updates the container key mapping logic to use the raw key prefix from
the container key table as the unique identifier. Because the raw key prefix
carries the directory structure (as object IDs representing the volume, bucket,
and directories), keys with the same file name in different directories (as in
the above scenario) are now recognized as distinct records by Recon.
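A minimal sketch of why a prefix that embeds directory object IDs keeps such keys apart. Everything here is an assumption made for illustration: the object IDs are invented placeholders, the {{/volumeId/bucketId/parentId/fileName}} layout is an assumed shape for the raw key prefix, and {{prefix_key}} is a hypothetical helper, not a Recon API.

```python
# Illustrative sketch only; IDs and key layout are assumptions, not real values.

VOLUME_ID = -100  # hypothetical object ID standing in for volume1
BUCKET_ID = -200  # hypothetical object ID standing in for fso-bucket

# Hypothetical object IDs for the three parent directories.
DIR_IDS = {
    "dir1": -301,
    "dir1/dir2": -302,
    "dir1/dir2/dir3": -303,
}

def prefix_key(parent_dir: str, file_name: str) -> str:
    """Build a key that embeds the parent directory's object ID."""
    return f"/{VOLUME_ID}/{BUCKET_ID}/{DIR_IDS[parent_dir]}/{file_name}"

# One key per (directory, file name) pair from the reproduction scenario.
keys = {
    prefix_key(parent, name)
    for parent in DIR_IDS
    for name in ("testfile1", "testfile2")
}

print(len(keys), "distinct keys")
```

With the parent directory's ID in the key, the same file name under {{dir1}} and {{dir1/dir2}} yields two different keys, so all six files remain separate records.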
was:
When multiple *FSO keys having duplicate names* are stored within the same
volume and bucket but organized into different directories, our
container-to-key mapping shows an incomplete list of keys, omitting some
expected entries.
> Fix Incorrect FSO Key Listing for Container-to-Key Mapping
> ----------------------------------------------------------
>
> Key: HDDS-12589
> URL: https://issues.apache.org/jira/browse/HDDS-12589
> Project: Apache Ozone
> Issue Type: Bug
> Reporter: Arafat Khan
> Assignee: Arafat Khan
> Priority: Major
> Labels: pull-request-available
>