[jira] [Updated] (HDDS-12589) Fix Incorrect FSO Key Listing for Container-to-Key Mapping

Arafat Khan (Jira) Fri, 14 Mar 2025 07:44:03 -0700


     [ 
https://issues.apache.org/jira/browse/HDDS-12589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Arafat Khan updated HDDS-12589:
-------------------------------
    Description: 
*Problem:*
When using FSO buckets, files with the same name uploaded into different 
directories were being merged into a single key record. This was because 
Recon’s container key mapping used only the volume, bucket, and file name as 
the unique identifier, which ignored the full directory path information.

*Reproducing the Issue:*
The issue can be reproduced by creating a nested directory structure and 
uploading the same two files (testfile1 and testfile2) at each of three 
directory depths. For example, run the following commands:
{code:bash}
# Create the nested directories, then upload both local files at each level.
ozone fs -mkdir -p ofs://om/volume1/fso-bucket/dir1/dir2/dir3
ozone fs -put -f testfile1 ofs://om/volume1/fso-bucket/dir1/
ozone fs -put -f testfile2 ofs://om/volume1/fso-bucket/dir1/
ozone fs -put -f testfile1 ofs://om/volume1/fso-bucket/dir1/dir2/
ozone fs -put -f testfile2 ofs://om/volume1/fso-bucket/dir1/dir2/
ozone fs -put -f testfile1 ofs://om/volume1/fso-bucket/dir1/dir2/dir3/
ozone fs -put -f testfile2 ofs://om/volume1/fso-bucket/dir1/dir2/dir3/
{code}
In this scenario, the two file names ({{testfile1}} and {{testfile2}}) each 
exist in three different directories ({{dir1}}, {{dir1/dir2}}, and 
{{dir1/dir2/dir3}}).

*Root Cause:*
The root cause was that the Recon container key mapping computed a unique key 
based only on the volume, bucket, and file name. For FSO buckets, the directory 
structure is encoded as part of the raw key prefix (using negative object IDs), 
but this information was being omitted from the computed key. As a result, 
files with identical names from different directories were being incorrectly 
merged.
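
The collision can be illustrated with a minimal Python sketch. This is not Recon's code: the tuple key, the `old_identifier` helper, and the `records` dictionary are hypothetical stand-ins for the old identifier, used only to show why the six uploads from the scenario above collapse into two records.

```python
# The six uploads from the reproduction scenario:
# (volume, bucket, parent directory, file name).
uploads = [
    ("volume1", "fso-bucket", "dir1", "testfile1"),
    ("volume1", "fso-bucket", "dir1", "testfile2"),
    ("volume1", "fso-bucket", "dir1/dir2", "testfile1"),
    ("volume1", "fso-bucket", "dir1/dir2", "testfile2"),
    ("volume1", "fso-bucket", "dir1/dir2/dir3", "testfile1"),
    ("volume1", "fso-bucket", "dir1/dir2/dir3", "testfile2"),
]


def old_identifier(volume, bucket, directory, file_name):
    """Hypothetical model of the old key: the directory is ignored."""
    return (volume, bucket, file_name)


# Later uploads overwrite earlier ones that share the same identifier.
records = {}
for volume, bucket, directory, file_name in uploads:
    records[old_identifier(volume, bucket, directory, file_name)] = directory

print(len(uploads), "uploads ->", len(records), "records")
```

Because the directory never reaches the identifier, each later upload of a same-named file replaces the earlier entry.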

*Fix:*
The fix updates the container key mapping logic to use the raw key prefix from 
the container key table as the unique identifier. Because the raw key prefix 
includes the complete directory structure (the object IDs of the volume, 
bucket, and parent directories), keys with the same file name in different 
directories (as in the above scenario) are recognized as distinct records by 
Recon.
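
A minimal Python sketch of the idea, under stated assumptions: the numeric object IDs and the `/volumeId/bucketId/parentId/fileName` layout are illustrative placeholders rather than values read from a real cluster, and `raw_key_prefix` is a hypothetical helper, not a Recon function.

```python
# Illustrative placeholder object IDs (assumptions); real IDs are assigned
# by Ozone and will differ.
VOLUME_ID = -1001
BUCKET_ID = -1002
DIR_IDS = {"dir1": 2001, "dir1/dir2": 2002, "dir1/dir2/dir3": 2003}


def raw_key_prefix(parent_dir, file_name):
    """Hypothetical helper: build an identifier that includes the parent ID."""
    return f"/{VOLUME_ID}/{BUCKET_ID}/{DIR_IDS[parent_dir]}/{file_name}"


# Index the six files from the scenario by the prefix-based identifier.
records = {}
for parent_dir in DIR_IDS:
    for file_name in ("testfile1", "testfile2"):
        records[raw_key_prefix(parent_dir, file_name)] = f"{parent_dir}/{file_name}"

print(len(records), "distinct records")
```

Since each parent directory contributes its own ID to the identifier, the same file name under a different directory yields a different key and no entry is overwritten.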

  was:
When multiple *FSO keys having duplicate names* are stored within the same 
volume and bucket but organized into different directories, our 
container-to-key mapping shows an incomplete list of keys, omitting some 
expected entries.

 


> Fix Incorrect FSO Key Listing for Container-to-Key Mapping
> ----------------------------------------------------------
>
>                 Key: HDDS-12589
>                 URL: https://issues.apache.org/jira/browse/HDDS-12589
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Arafat Khan
>            Assignee: Arafat Khan
>            Priority: Major
>              Labels: pull-request-available
>
> *Problem:*
> When using FSO buckets, files with the same name uploaded into different 
> directories were being merged into a single key record. This was because 
> Recon’s container key mapping used only the volume, bucket, and file name as 
> the unique identifier, which ignored the full directory path information.
> *Reproducing the Issue:*
> The issue can be reproduced by creating a nested directory structure and 
> uploading the same two files (testfile1 and testfile2) at each of three 
> directory depths. For example, run the following commands:
> {code:bash}
> # Create the nested directories, then upload both local files at each level.
> ozone fs -mkdir -p ofs://om/volume1/fso-bucket/dir1/dir2/dir3
> ozone fs -put -f testfile1 ofs://om/volume1/fso-bucket/dir1/
> ozone fs -put -f testfile2 ofs://om/volume1/fso-bucket/dir1/
> ozone fs -put -f testfile1 ofs://om/volume1/fso-bucket/dir1/dir2/
> ozone fs -put -f testfile2 ofs://om/volume1/fso-bucket/dir1/dir2/
> ozone fs -put -f testfile1 ofs://om/volume1/fso-bucket/dir1/dir2/dir3/
> ozone fs -put -f testfile2 ofs://om/volume1/fso-bucket/dir1/dir2/dir3/
> {code}
> In this scenario, the two file names ({{testfile1}} and {{testfile2}}) each 
> exist in three different directories ({{dir1}}, {{dir1/dir2}}, and 
> {{dir1/dir2/dir3}}).
> *Root Cause:*
> The root cause was that the Recon container key mapping computed a unique key 
> based only on the volume, bucket, and file name. For FSO buckets, the 
> directory structure is encoded as part of the raw key prefix (using negative 
> object IDs), but this information was being omitted from the computed key. As 
> a result, files with identical names from different directories were being 
> incorrectly merged.
> *Fix:*
> The fix updates the container key mapping logic to use the raw key prefix 
> from the container key table as the unique identifier. Because the raw key 
> prefix includes the complete directory structure (the object IDs of the 
> volume, bucket, and parent directories), keys with the same file name in 
> different directories (as in the above scenario) are recognized as distinct 
> records by Recon.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

