[ 
https://issues.apache.org/jira/browse/HDFS-14764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16913126#comment-16913126
 ] 

Xun REN commented on HDFS-14764:
--------------------------------

{code:java}
[myuser@myserver ~]$ hdfs dfs -mkdir /tmp/test1
[myuser@myserver ~]$ sudo su - hdfs
$ bash
bash-4.2$ hdfs dfsadmin -allowSnapshot /tmp/test1
Allowing snaphot on /tmp/test1 succeeded
bash-4.2$ hdfs dfsadmin -setQuota 10 /tmp/test1
bash-4.2$ exit
$
[myuser@myserver ~]$ hdfs dfs -touchz /tmp/test1/a
[myuser@myserver ~]$ hdfs dfs -touchz /tmp/test1/b
[myuser@myserver ~]$ hdfs dfs -touchz /tmp/test1/c
[myuser@myserver ~]$ hdfs dfs -touchz /tmp/test1/d
[myuser@myserver ~]$ hdfs dfs -createSnapshot /tmp/test1 s1
Created snapshot /tmp/test1/.snapshot/s1
[myuser@myserver ~]$ hdfs dfs -count -q -v /tmp/test1
QUOTA  REM_QUOTA  SPACE_QUOTA  REM_SPACE_QUOTA  DIR_COUNT  FILE_COUNT  CONTENT_SIZE  PATHNAME
   10          5         none              inf          1           4             0  /tmp/test1
[myuser@myserver ~]$ hdfs dfs -rm /tmp/test1/b
19/08/13 12:04:12 INFO fs.TrashPolicyDefault: Moved: 'hdfs://HDFS/tmp/test1/b' to trash at: hdfs://HDFS/user/myuser/.Trash/Current/tmp/test1/b
[myuser@myserver ~]$ hdfs dfs -count -q -v /tmp/test1
QUOTA  REM_QUOTA  SPACE_QUOTA  REM_SPACE_QUOTA  DIR_COUNT  FILE_COUNT  CONTENT_SIZE  PATHNAME
   10          6         none              inf          1           3             0  /tmp/test1
[myuser@myserver ~]$ hdfs dfs -rm /tmp/test1/c
19/08/13 12:05:24 INFO fs.TrashPolicyDefault: Moved: 'hdfs://HDFS/tmp/test1/c' to trash at: hdfs://HDFS/user/myuser/.Trash/Current/tmp/test1/c
[myuser@myserver ~]$ hdfs dfs -createSnapshot /tmp/test1 s2
Created snapshot /tmp/test1/.snapshot/s2
[myuser@myserver ~]$ hdfs dfs -ls /tmp/test1/
Found 2 items
-rw-r--r-- 1 myuser hdfs 0 2019-08-13 12:02 /tmp/test1/a
-rw-r--r-- 1 myuser hdfs 0 2019-08-13 12:02 /tmp/test1/d
[myuser@myserver ~]$ hdfs dfs -touchz /tmp/test1/e
[myuser@myserver ~]$ hdfs dfs -touchz /tmp/test1/f
[myuser@myserver ~]$ hdfs dfs -ls /tmp/test1/
Found 4 items
-rw-r--r-- 1 myuser hdfs 0 2019-08-13 12:02 /tmp/test1/a
-rw-r--r-- 1 myuser hdfs 0 2019-08-13 12:02 /tmp/test1/d
-rw-r--r-- 1 myuser hdfs 0 2019-08-13 12:05 /tmp/test1/e
-rw-r--r-- 1 myuser hdfs 0 2019-08-13 12:06 /tmp/test1/f
[myuser@myserver ~]$ hdfs dfs -count -q -v /tmp/test1
QUOTA  REM_QUOTA  SPACE_QUOTA  REM_SPACE_QUOTA  DIR_COUNT  FILE_COUNT  CONTENT_SIZE  PATHNAME
   10          5         none              inf          1           4             0  /tmp/test1
[myuser@myserver ~]$ hdfs dfs -touchz /tmp/test1/g
[myuser@myserver ~]$ hdfs dfs -touchz /tmp/test1/h
[myuser@myserver ~]$ hdfs dfs -touchz /tmp/test1/i
[myuser@myserver ~]$ hdfs dfs -touchz /tmp/test1/j
touchz: The NameSpace quota (directories and files) of directory /tmp/test1 is exceeded: quota=10 file count=11
[myuser@myserver ~]$ hdfs dfs -count -q -v /tmp/test1
QUOTA  REM_QUOTA  SPACE_QUOTA  REM_SPACE_QUOTA  DIR_COUNT  FILE_COUNT  CONTENT_SIZE  PATHNAME
   10          2         none              inf          1           7             0  /tmp/test1{code}
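
The arithmetic behind the transcript above can be replayed with a small model. This is only a sketch, not HDFS code: the class `SnapshotDir` and its methods are hypothetical names. It assumes, based on the counts and the error message in the transcript, that the namespace quota is charged for the directory itself, its live files, and any files deleted from the directory but still referenced by a snapshot, while `-count` reports only the directory and its live files.

```python
class QuotaExceeded(Exception):
    pass


class SnapshotDir:
    """Toy model of a snapshottable directory with a namespace quota.

    Hypothetical illustration only; file names are assumed never to be reused.
    """

    def __init__(self, quota):
        self.quota = quota
        self.live = set()      # files currently visible in the directory
        self.snapshots = {}    # snapshot name -> files present at snapshot time

    def snapshot_only(self):
        # Files no longer in the directory but still referenced by a snapshot.
        held = set()
        for files in self.snapshots.values():
            held |= files
        return held - self.live

    def charged(self):
        # Assumption: the write-time quota check counts the directory itself,
        # the live files, and the snapshot-only files.
        return 1 + len(self.live) + len(self.snapshot_only())

    def reported(self):
        # What "-count" appears to show: DIR_COUNT + FILE_COUNT.
        return 1 + len(self.live)

    def touchz(self, name):
        if self.charged() + 1 > self.quota:
            raise QuotaExceeded(
                f"quota={self.quota} file count={self.charged() + 1}")
        self.live.add(name)

    def rm(self, name):
        self.live.discard(name)

    def create_snapshot(self, name):
        self.snapshots[name] = frozenset(self.live)


def replay():
    """Replay the steps of the transcript and return the final state."""
    d = SnapshotDir(quota=10)
    for f in "abcd":
        d.touchz(f)
    d.create_snapshot("s1")
    d.rm("b")
    d.rm("c")
    d.create_snapshot("s2")
    for f in "efghi":
        d.touchz(f)
    try:
        d.touchz("j")
        error = None
    except QuotaExceeded as exc:
        error = str(exc)
    return d, error


if __name__ == "__main__":
    d, error = replay()
    print("reported remaining quota:", d.quota - d.reported())  # 2
    print("actual remaining quota:  ", d.quota - d.charged())   # 0
    print("touchz j:", error)  # quota=10 file count=11
```

Under these assumptions the model reproduces the transcript: the report shows 2 entries of remaining quota while the two snapshot-only files (b and c) have already used them up.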

> HDFS count doesn't include snapshot files correctly
> ---------------------------------------------------
>
>                 Key: HDFS-14764
>                 URL: https://issues.apache.org/jira/browse/HDFS-14764
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Xun REN
>            Priority: Major
>         Attachments: hdfs_count_withsnapshot.txt
>
>
> Hi,
>  
> When a quota is set on a path that contains snapshots, the status shown by 
> the command "hdfs dfs -count -v -q /my_path" doesn't match the real quota 
> usage.
> -count only counts the current contents of the path. It does not count files 
> that have been deleted from the path but are still referenced by snapshots.
> If a job keeps writing files into that path, it will eventually report an 
> error like 
> {code:java}
> The NameSpace quota (directories and files) of directory /my_path is 
> exceeded{code}
> The count command, however, still shows remaining quota.
> This is because the quota check performed when writing files into a 
> directory also counts the snapshot files, whereas the count command does not.
>  
> The idea here is to modify the report of "hdfs dfs -count" so that it also 
> includes the files in snapshots. Ideally, an additional column would show 
> the total number of files in the current directory plus the files deleted 
> from the current directory but still referenced by snapshots.
>  
> You can find the steps to reproduce the issue in the attached text file.
>  
> Thanks.
>  
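
To make the proposal in the description concrete, here is a rough sketch of what a report row with the extra column might look like. It does not reflect any existing `hdfs dfs -count` option: the function `count_row` and the column name `FILE_COUNT_WITH_SNAPSHOTS` are hypothetical, and the space-quota columns are left out for brevity.

```python
def count_row(quota, dir_count, file_count, snapshot_only_files, path):
    """Format a count report with a hypothetical extra column.

    FILE_COUNT_WITH_SNAPSHOTS = live files + files deleted from the directory
    but still referenced by a snapshot (column name is an assumption).
    """
    rem_quota = quota - dir_count - file_count  # as reported today
    total_files = file_count + snapshot_only_files
    header = ["QUOTA", "REM_QUOTA", "DIR_COUNT", "FILE_COUNT",
              "FILE_COUNT_WITH_SNAPSHOTS", "PATHNAME"]
    values = [quota, rem_quota, dir_count, file_count, total_files, path]
    # Right-align each value under its column header.
    widths = [max(len(h), len(str(v))) for h, v in zip(header, values)]

    def fmt(cells):
        return "  ".join(str(c).rjust(w) for c, w in zip(cells, widths))

    return fmt(header) + "\n" + fmt(values)


if __name__ == "__main__":
    # Final state of the reproduction: 7 live files, 2 snapshot-only files.
    print(count_row(10, 1, 7, 2, "/tmp/test1"))
```

With the numbers from the reproduction, the extra column would read 9, which together with the directory itself accounts for the full quota of 10.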


