Hi Bart,
Is this on 2.8.2? Do you happen to know how many servers are needed to
trigger the problem?
thanks,
-Phil
On 06/17/2010 04:08 PM, Bart Taylor wrote:
Hey guys,
We have had some problems in the past on 2.6 with file creations leaving
bad files that we cannot delete. Most utilities like ls and rm return
"No such file or directory", and pvfs utilities like viewdist, pvfs2-ls,
and pvfs2-rm return various errors. We have resorted to looking up the
parent handle, the fsid, and filename and using pvfs2-remove-object to
delete the entry. But we weren't ever able to intentionally recreate the
problem.
Recently, while testing 2.8, I have been able to reliably trigger a
similar scenario where a file creation fails and leaves a garbage entry
that cannot be deleted in any of the normal ways, requiring the
pvfs2-remove-object approach to clean up. The file and various outputs
for this case:
[r...@client dir]# ls -l 2010.06.10.28050
total 0
?--------- ? ? ? ? ? File17027
[r...@client dir]# rm 2010.06.10.28050/File17027
rm: cannot lstat `2010.06.10.28050/File17027': No such file or directory
[r...@client dir]# rm -rf 2010.06.10.28050
rm: cannot remove directory `2010.06.10.28050': Directory not empty
[r...@client dir]# pvfs2-rm 2010.06.10.28050/File17027
Error: An error occurred while removing 2010.06.10.28050/File17027
PVFS_sys_remove: No such file or directory (error class: 0)
[r...@client dir]# pvfs2-stat 2010.06.10.28050/File17027
PVFS_sys_lookup: No such file or directory (error class: 0)
Error stating [2010.06.10.28050/File17027]
[r...@client dir]# pvfs2-viewdist -f 2010.06.10.28050/File17027
PVFS_sys_lookup: No such file or directory (error class: 0)
Could not open 2010.06.10.28050/File17027
[r...@client dir]# ls -l 2010.06.10.28050
total 0
?--------- ? ? ? ? ? File17027
I have included a test script that will spawn off a number of processes,
open a bunch of files, write to each of them, then close them. You can
tweak the options as you want, but using 5 processes and 50,000 files
will usually create at least one of these files. Here is an example
command:
$> ulimit -n 1000000 && ./open-file-limit --num-files=50000 --sleep-time=1 --num-processes=5 --directory=/mnt/pvfs2/ --file-size=1
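The attached script itself is not reproduced in this message. As a rough sketch of the workload described above (several processes that each open many files, write to each one, then close them), something like the following Python could be used. This is not the open-file-limit script: the function names, the per-process directory layout, the meaning of file_size (bytes) and the point at which the sleep happens are all assumptions made for illustration. It assumes a POSIX system where the "fork" start method is available.

```python
import multiprocessing
import os
import tempfile
import time


def worker(directory, num_files, file_size, sleep_time):
    """Open num_files files, write to each one, then close them all.

    All files are held open at the same time, which is why a high
    `ulimit -n` would be needed for large values of num_files.
    """
    handles = []
    try:
        for i in range(num_files):
            path = os.path.join(directory, f"File{i}")
            handles.append(open(path, "wb"))
        for handle in handles:
            handle.write(b"x" * file_size)  # assumption: file_size is in bytes
        time.sleep(sleep_time)  # assumption: pause before closing
    finally:
        for handle in handles:
            handle.close()


def run_workload(base_dir, num_processes, num_files, file_size=1, sleep_time=0.0):
    """Run one worker per process, each in its own directory under base_dir.

    Returns the list of directories used and the workers' exit codes.
    """
    ctx = multiprocessing.get_context("fork")  # POSIX only
    dirs, procs = [], []
    for n in range(num_processes):
        # Hypothetical layout: one fresh directory per worker process.
        directory = tempfile.mkdtemp(prefix=f"proc{n}.", dir=base_dir)
        dirs.append(directory)
        proc = ctx.Process(
            target=worker, args=(directory, num_files, file_size, sleep_time)
        )
        proc.start()
        procs.append(proc)
    for proc in procs:
        proc.join()
    return dirs, [proc.exitcode for proc in procs]
```

Calling run_workload("/mnt/pvfs2", 5, 50000, sleep_time=1) would roughly mirror the options in the command above, under the same assumptions.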
You may have to do a long listing on any left-over directories to find
the file(s).
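Instead of reading long listings by eye, the check can be scripted. The sketch below flags names that the directory listing returns but that lstat reports as "No such file or directory", which is the symptom shown in the ls and rm output above. The function name and the injectable lstat parameter are illustrative assumptions, and this only detects such entries; it does not remove them.

```python
import errno
import os


def find_unstatable_entries(directory, lstat=os.lstat):
    """Return names listed in `directory` for which lstat fails with ENOENT."""
    bad = []
    for name in sorted(os.listdir(directory)):
        try:
            lstat(os.path.join(directory, name))
        except OSError as exc:
            if exc.errno == errno.ENOENT:
                # Listed by readdir but not found by lstat: a candidate bad entry.
                bad.append(name)
            else:
                raise
    return bad
```

Running it over each left-over directory, for example find_unstatable_entries("2010.06.10.28050"), would be expected to report File17027 in the case shown earlier.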
I will give any help I can to help recreate the bad file or find the
cause.
Until then, is there a better (simpler) way to remove these entries,
maybe some sort of utility that doesn't require doing manual handle
lookups before getting the file removed? It would ease some support pain
if it were simpler to fix.
Thanks for your help,
Bart.
_______________________________________________
Pvfs2-developers mailing list
Pvfs2-developers@beowulf-underground.org
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers