[ 
https://issues.apache.org/jira/browse/HDFS-16984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ke Han updated HDFS-16984:
--------------------------
    Description: 
h1. Symptoms

The access timestamp of a directory is lost after upgrading an HDFS cluster 
from 2.10.2 to 3.3.6.
h1. Reproduce

Start a four-node HDFS cluster running version 2.10.2.

Execute the following commands. (The client runs on the NameNode; we have 
minimized the command sequence needed to reproduce the issue.)
{code:java}
bin/hdfs dfs -mkdir /GUBIkxOc
bin/hdfs dfs -put -f -p -d /tmp/upfuzz/hdfs/GUBIkxOc/bQfxf /GUBIkxOc/
bin/hdfs dfs -mkdir /GUBIkxOc/sKbTRjvS{code}
Perform a read in the old version:
{code:java}
bin/hdfs dfs -ls -t -r -u /GUBIkxOc/

Found 2 items
drwxr-xr-x   - root  supergroup          0 1970-01-01 00:00 /GUBIkxOc/sKbTRjvS
drwxr-xr-x   - 20001 998                 0 2023-04-17 16:15 /GUBIkxOc/bQfxf{code}
Then perform a full-stop upgrade of the entire cluster to 3.3.6, following the 
upgrade procedure on the website: (1) enter safe mode, (2) run rolling upgrade 
prepare, (3) exit safe mode. Once all nodes have started up in the new version, 
perform the same read:
{code:java}
Found 2 items
drwxr-xr-x   - 20001 998                 0 1970-01-01 00:00 /GUBIkxOc/bQfxf
drwxr-xr-x   - root  supergroup          0 1970-01-01 00:00 /GUBIkxOc/sKbTRjvS 
{code}
The access timestamp of directory /GUBIkxOc/bQfxf is lost: it changes from 
2023-04-17 16:15 to 1970-01-01 00:00.

Note: the rolling upgrade prepare step must happen after the commands above 
have been executed.
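
For convenience, the three preparation steps listed above could be scripted 
roughly as follows. This is only a sketch: the HDFS_BIN and DRY_RUN variables 
and the function names are illustrative assumptions and not part of the 
original reproduction; the dfsadmin subcommands are the standard ones for 
safe mode and rolling upgrade.

```shell
#!/bin/sh
# Sketch only: HDFS_BIN, DRY_RUN, run and prepare_upgrade are hypothetical
# names introduced for illustration.
HDFS_BIN="${HDFS_BIN:-bin/hdfs}"

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Steps (1)-(3) of the procedure, to be executed on the old (2.10.2) cluster
# after the reproduction commands have been run.
prepare_upgrade() {
  # (1) enter safe mode
  run "$HDFS_BIN" dfsadmin -safemode enter || return 1
  # (2) rolling upgrade prepare (creates an FSImage for rollback)
  run "$HDFS_BIN" dfsadmin -rollingUpgrade prepare || return 1
  # (3) exit safe mode
  run "$HDFS_BIN" dfsadmin -safemode leave
}
```

Calling prepare_upgrade with DRY_RUN=1 only prints the three commands, which 
is a cheap way to check the order before touching a real cluster.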

The required file, +/tmp/upfuzz/hdfs/GUBIkxOc/bQfxf+, is attached.
h1. Root Cause

When the FSImage is created, the access time of a directory is not persisted.

If users upgrade without creating an FSImage, this bug does not occur, because 
the access time is recorded in the edit log. However, once an FSImage is 
created, all edit log entries that precede it are invalidated. When the new 
version starts up, it reconstructs the in-memory file system from the FSImage 
alone and ignores those edit logs.

We should make sure the access time of a directory is persisted as well, just 
as it is for files. I have submitted a PR with a fix.


> Directory timestamp lost during the upgrade process
> ---------------------------------------------------
>
>                 Key: HDFS-16984
>                 URL: https://issues.apache.org/jira/browse/HDFS-16984
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 2.10.2, 3.3.6
>            Reporter: Ke Han
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: GUBIkxOc.tar.gz


