Re: [GSoC 2013] Defragmentation for FFS

2013-08-18 Thread Manuel Wiesinger
Status update: I am still working on the analysis part, mixed with features for allocation. What has been done: .) fixed (almost all) problems with indirect blocks. There is still one that is worth posting on the mailing list; if I do not find anything in the archives, I will post it in hal

Max. number of subdirectories dump

2013-08-18 Thread Manuel Wiesinger
Hello, I am working on a defrag tool for UFS2/FFSv2 as a Google Summer of Code project. The size of a directory offset is of type int32_t (see src/sys/ufs/ufs/dir.h), which is a signed integer, so the maximum size can be (2^31)-1. When testing, the maximum number of subdirectories was 32767,

Re: Max. number of subdirectories dump

2013-08-18 Thread Joerg Sonnenberger
On Sun, Aug 18, 2013 at 03:08:21PM +0200, Manuel Wiesinger wrote: > My question is: > Is the number of subdirectories really limited by (2^15)-1? Yes. A subdirectory creates a hard link to the parent directory's inode, and the nlink counter is 16 bits wide. Whether or not that is interpreted as signed in all

Re: Max. number of subdirectories dump

2013-08-18 Thread Martin Husemann
On Sun, Aug 18, 2013 at 03:08:21PM +0200, Manuel Wiesinger wrote: > When testing, the maximum number of subdirectories was 32767, which is > (2^15)-1, when trying to add a 32767th directory, I got the error > message: "Too many links". This one is simple: a subdirectory has a ".." entry, which i

Re: Max. number of subdirectories dump

2013-08-18 Thread Johnny Billquist
On 2013-08-18 15:08, Manuel Wiesinger wrote: Hello, I am working on a defrag tool for UFS2/FFSv2 as a Google Summer of Code project. The size of a directory offset is of type int32_t (see src/sys/ufs/ufs/dir.h), which is a signed integer, so the maximum size can be (2^31)-1. When testing, the ma

Re: Max. number of subdirectories dump

2013-08-18 Thread Manuel Wiesinger
On 08/18/13 15:59, Johnny Billquist wrote: > As Joerg said, the link count is the limitation here. Thanks for the clarification. I could have thought of this myself. > Not sure I understand the question. Are you suggesting that you don't > need to scan through all the contents of a directory to

Re: Max. number of subdirectories dump

2013-08-18 Thread Manuel Wiesinger
On 08/18/13 15:58, Martin Husemann wrote: This is assuming there are no regular files in the directory. If there are (first, before the subdirectories), will it still work? Correct, I should have mentioned that I am testing a special case. It still finds all entries but not those in indirect

Re: Max. number of subdirectories dump

2013-08-18 Thread Johnny Billquist
On 2013-08-18 15:58, Martin Husemann wrote: On Sun, Aug 18, 2013 at 03:08:21PM +0200, Manuel Wiesinger wrote: When testing, the maximum number of subdirectories was 32767, which is (2^15)-1, when trying to add a 32767th directory, I got the error message: "Too many links". This one is simple:

Re: Max. number of subdirectories dump

2013-08-18 Thread Johnny Billquist
On 2013-08-18 17:33, Manuel Wiesinger wrote: On 08/18/13 15:59, Johnny Billquist wrote: > Not sure I understand the question. Are you suggesting that you don't > need to scan through all the contents of a directory to find the > subdirectories? No, I'm in a step where I just search for subdir

Re: Max. number of subdirectories dump

2013-08-18 Thread Joerg Sonnenberger
On Sun, Aug 18, 2013 at 05:36:47PM +0200, Johnny Billquist wrote: > There is nothing in the directory entry that even tells if the entry > is a directory or just a plain file, unless I remember wrong. And > they are not sorted so that all directories come first... There is a type tag for each dir

Re: Max. number of subdirectories dump

2013-08-18 Thread Manuel Wiesinger
On 08/18/13 17:38, Joerg Sonnenberger wrote: > There is a type tag for each directory entry. Correct. Namely u_int8_t d_type; in sys/ufs/ufs/dir.h
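To illustrate the point about the type tag, here is a minimal sketch in C of how a tool could count subdirectories in one directory block by looking only at d_type, without reading any inode. It is not taken from dump or from the defrag tool. The names ex_direct_hdr, ex_put_entry, ex_count_subdirs, EX_DT_DIR and EX_DT_REG are made up for this example; the header layout is assumed to match the leading fields of struct direct in sys/ufs/ufs/dir.h, the constants are assumed to equal DT_DIR and DT_REG, and the block is assumed to be in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed to equal DT_DIR and DT_REG from <sys/dirent.h>. */
#define EX_DT_DIR 4
#define EX_DT_REG 8

/* Hypothetical header, assumed to mirror the fixed part of struct direct. */
struct ex_direct_hdr {
    uint32_t d_fileno; /* inode number, 0 marks an unused slot */
    uint16_t d_reclen; /* total length of this record in bytes */
    uint8_t  d_type;   /* file type tag */
    uint8_t  d_namlen; /* length of the name that follows the header */
};

/* Helper for building test data: append one entry at offset off and
 * return the offset just past it. The caller must provide enough room. */
size_t ex_put_entry(uint8_t *block, size_t off, uint32_t ino,
                    uint8_t type, const char *name)
{
    struct ex_direct_hdr h;
    size_t namlen = strlen(name);

    assert(block != NULL && namlen >= 1 && namlen <= 255);
    h.d_fileno = ino;
    h.d_type = type;
    h.d_namlen = (uint8_t)namlen;
    /* Name plus its NUL terminator, padded to a multiple of 4 bytes. */
    h.d_reclen = (uint16_t)(sizeof h + ((namlen + 1 + 3) & ~(size_t)3));
    memcpy(block + off, &h, sizeof h);
    memcpy(block + off + sizeof h, name, namlen + 1);
    return off + h.d_reclen;
}

/* Walk the records in block[0..len) and count the entries tagged as
 * directories, skipping "." and "..". Stops at the first bad record. */
size_t ex_count_subdirs(const uint8_t *block, size_t len)
{
    size_t off = 0, count = 0;

    assert(block != NULL);
    while (off + sizeof(struct ex_direct_hdr) <= len) {
        struct ex_direct_hdr h;
        const char *name;

        memcpy(&h, block + off, sizeof h);
        if (h.d_reclen < sizeof h || off + h.d_reclen > len ||
            sizeof h + h.d_namlen > h.d_reclen)
            break; /* record does not fit, treat as corrupt */
        name = (const char *)(block + off + sizeof h);
        if (h.d_fileno != 0 && h.d_type == EX_DT_DIR) {
            int dot = (h.d_namlen == 1 && name[0] == '.');
            int dotdot = (h.d_namlen == 2 && name[0] == '.' && name[1] == '.');
            if (!dot && !dotdot)
                count++;
        }
        off += h.d_reclen;
    }
    return count;
}
```

Since entries are not sorted by type, a function like ex_count_subdirs would have to be called on every block of the directory, including the ones reached through indirect blocks, to find all subdirectories.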

Re: Max. number of subdirectories dump

2013-08-18 Thread Manuel Wiesinger
On 08/18/13 16:18, Johnny Billquist wrote: It might also fail if the names of the subdirectories are rather long, I would suspect... If I got you correctly, yes it might fail. Unless I iterate over every directory block there is.
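The effect of name length on directory size can be estimated with a little arithmetic. The sketch below is only an illustration under stated assumptions: an 8-byte entry header followed by the name and its NUL terminator padded to 4 bytes, and entries that never cross a 512-byte directory chunk. The names ex_recsize, ex_dir_bytes, EX_HDR_SIZE and EX_DIRBLKSIZ are invented for this example and are not identifiers from the NetBSD tree.

```c
#include <assert.h>
#include <stddef.h>

#define EX_DIRBLKSIZ 512u /* assumed size of one directory chunk */
#define EX_HDR_SIZE  8u   /* assumed size of the fixed entry header */

/* Bytes used by one entry whose name is namlen characters long. */
size_t ex_recsize(size_t namlen)
{
    assert(namlen >= 1 && namlen <= 255);
    return EX_HDR_SIZE + ((namlen + 1 + 3) & ~(size_t)3);
}

/* Minimum bytes a directory needs for nentries entries that all have
 * names of length namlen, packed as tightly as the chunks allow. */
size_t ex_dir_bytes(size_t nentries, size_t namlen)
{
    size_t per_chunk = EX_DIRBLKSIZ / ex_recsize(namlen);
    size_t chunks = (nentries + per_chunk - 1) / per_chunk; /* round up */

    return chunks * EX_DIRBLKSIZ;
}
```

Under these assumptions, 32767 entries with one-character names need 781 chunks (399872 bytes), while the same number of entries with 255-character names need one chunk each (16776704 bytes). How much of that lies behind indirect blocks depends on the file system block size, which this sketch does not model.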

Re: Max. number of subdirectories dump

2013-08-18 Thread Johnny Billquist
On 2013-08-18 17:38, Joerg Sonnenberger wrote: On Sun, Aug 18, 2013 at 05:36:47PM +0200, Johnny Billquist wrote: There is nothing in the directory entry that even tells if the entry is a directory or just a plain file, unless I remember wrong. And they are not sorted so that all directories come

Re: Max. number of subdirectories dump

2013-08-18 Thread Johnny Billquist
On 2013-08-18 17:50, Johnny Billquist wrote: On 2013-08-18 17:38, Joerg Sonnenberger wrote: On Sun, Aug 18, 2013 at 05:36:47PM +0200, Johnny Billquist wrote: There is nothing in the directory entry that even tells if the entry is a directory or just a plain file, unless I remember wrong. And th

Re: Max. number of subdirectories dump

2013-08-18 Thread Joerg Sonnenberger
On Sun, Aug 18, 2013 at 06:04:55PM +0200, Johnny Billquist wrote: > It's an obvious optimization to keep type already in the directory > itself. But is there any other reason why it was added there? It > obviously means you have the same information in two places, with > all the obvious risks of ha

Re: Max. number of subdirectories dump

2013-08-18 Thread Mouse
> [W]hen trying to add a 32767th directory, I got the error message: > "Too many links". As others have said, this is because link counts are 16 bits, and someone at some point didn't want to deal with making them unsigned. > When my tool reads only the single indirect blocks, it gets all 32767 >

Re: Max. number of subdirectories dump

2013-08-18 Thread Manuel Wiesinger
On 08/18/13 18:24, Mouse wrote: Your subdirectory names are comparatively short. I've been testing the special case where a directory contains only directories. The names are short for that reason, to get a real maximum number of subdirectories. [W]hy does dump iterate the indirect blocks, when looking fo

re: Max. number of subdirectories dump

2013-08-18 Thread matthew green
> It's an obvious optimization to keep type already in the directory > itself. But is there any other reason why it was added there? It > obviously means you have the same information in two places, with all struct dirent is not stored on the disk, but created during e.g. a readdir() system call v

Re: Max. number of subdirectories dump

2013-08-18 Thread Mouse
>> It's an obvious optimization to keep type already in the directory >> itself. But is there any other reason why it was added there? It >> obviously means you have the same information in two places, [...] > struct dirent is not stored on the disk, but created during eg > readdir() system call

Re: Max. number of subdirectories dump

2013-08-18 Thread David Holland
On Sun, Aug 18, 2013 at 06:04:55PM +0200, Johnny Billquist wrote: > Looking at 2.11BSD, it looks like this: > > struct direct { > [snip] > > In NetBSD (fairly current): > > struct dirent { careful, you want struct direct, not struct dirent: struct direct { u_int32_t d_fileno;

Re: Max. number of subdirectories dump

2013-08-18 Thread David Holland
On Sun, Aug 18, 2013 at 12:24:12PM -0400, Mouse wrote: > A directory may contain entries other than subdirectories. Since there > is no enforced ordering of entries in a directory, the whole directory > must be read to find all the subdirectories (unless 32767 subdirs are > found first, I supp