Re: [RFC][PATCH] Implement SEEK_HOLE/SEEK_DATA

2007-11-28 Thread Nicholas Miell

On Thu, 2007-11-29 at 00:48 +0100, Jörn Engel wrote:
> On Wed, 28 November 2007 16:39:59 -0700, Andreas Dilger wrote:
> > On Nov 28, 2007  14:56 -0800, Nicholas Miell wrote:
> > > 
> > > type: one of EXTENT_TYPE_HOLE, EXTENT_TYPE_DATA, EXTENT_TYPE_EXTENTS,
> > > EXTENT_TYPE_COMPRESSED, EXTENT_TYPE_UNCOMPRESSED etc.
> > 
> > This is what FIEMAP is supposed to do.  We wrote a spec and implemented
> > a prototype for ext4, but haven't had time to make it generic to move
> > the large part of the code into the VFS.  If someone wanted to take that
> > up, it would be much appreciated.
> > 
> > See "[RFC] add FIEMAP ioctl to efficiently map file allocation" in
> > linux-fsdevel for details on this interface.
> 
> I didn't follow the discussion much, since it didn't appear to suit
> logfs too well.  In a nutshell, logfs is purely block-based, prepends
> every block with a header, may compress blocks and packs them as tightly
> as possible (byte alignment).
> 
> Maybe the "MAP" part fooled me into believing FIEMAP would also expose
> physical location of extents on the medium.  But reading the proposal
> again, I am unsure about that part.  If physical locations are exposed,
> SEEK_HOLE/SEEK_DATA is significantly more elegant for logfs.  If not,
> FIEMAP could be useful.
> 
> Jörn
> 

I'd have to reread the original proposal, but I remember FIEMAP as being
a generalized way of getting information about a file's extents. I think
the original proposal only dealt with mapping file offsets to physical
extents, but IIRC the interface was flexible enough to implement a
"where are the holes" request.

Regardless, SEEK_HOLE/SEEK_DATA being a better suited interface for the
needs of logfs doesn't make it the best interface for that need.

-- 
Nicholas Miell <[EMAIL PROTECTED]>

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH] Implement SEEK_HOLE/SEEK_DATA

2007-11-28 Thread Nicholas Miell

On Wed, 2007-11-28 at 18:33 -0500, Josef Bacik wrote:
> On Wed, Nov 28, 2007 at 02:56:54PM -0800, Nicholas Miell wrote:
> > 
> > On Wed, 2007-11-28 at 15:02 -0500, Josef Bacik wrote:
> > > Hello,
> > > 
> > > > This is my first pass at implementing SEEK_HOLE/SEEK_DATA.  This has been in
> > > > solaris for about a year now, and is described here
> > > 
> > > http://docs.sun.com/app/docs/doc/819-2241/lseek-2?l=en&a=view&q=SEEK_HOLE
> > > http://blogs.sun.com/roller/page/bonwick?entry=seek_hole_and_seek_data
> > > 
> > > > I've added a file operation to allow filesystems to override the default
> > > > seek_hole_data function, which just loops through bmap looking for either a hole
> > > > or data.  I've tested this and it seems to work well.  I ran my testcase on a
> > > > solaris box to make sure I got consistent results (I just ran my test script on
> > > > the solaris box, I haven't looked at any of their code in case that's a concern).
> > > All comments welcome.  Thank you,
> > > 
> > > Josef
> > 
> > I stand by my belief that SEEK_HOLE/SEEK_DATA is a lousy interface.
> > 
> > It abuses the seek operation to become a query operation, it requires a
> > > total number of system calls proportional to the number of holes+data and
> > it isn't general enough for other similar uses (e.g. total number of
> > contiguous extents, compressed extents, offline extents, extents
> > currently shared with other inodes, extents embedded in the inode
> > (tails), etc.)
> > 
> > Something like the following would be much better:
> > 
> > int getfilextents(int fd, off_t offset, int type, size_t *length, struct
> > extent *extents)
> > 
> > with
> > 
> > int fd: open file
> > 
> > offset: offset in file to start reporting extents
> > 
> > type: one of EXTENT_TYPE_HOLE, EXTENT_TYPE_DATA, EXTENT_TYPE_EXTENTS,
> > EXTENT_TYPE_COMPRESSED, EXTENT_TYPE_UNCOMPRESSED etc.
> > 
> > length: in/out parameter, on entry contains length of extents array, on
> > exit contains number of valid entries in the extents array or total
> > number of extents remaining in the file, whichever is larger
> > 
> > extents: array of struct extent { off_t offset; off_t length }, only
> > updated if non-NULL
> > 
> > > Making the type parameter a bitmask and adding a type member to struct
> > > extent could be useful, too, so that multiple types of extents could be
> > > reported at once. (But you end up with weird cases
> > > like data extents overlapping with compressed extents.)
> > 
> > Actually, now that I've searched my mailbox, Andreas Dilger's FIEMAP
> > proposal is pretty much what I suggest here and is certainly superior to
> > Sun's SEEK_HOLE/SEEK_DATA.
> >
> 
> Agreed, however in speaking with hch and others the consensus was FIEMAP was good,
> however there was no reason why SEEK_HOLE/SEEK_DATA shouldn't also be
> implemented, and then at some point down the road when a generic FIEMAP is in
> place either change the SEEK_HOLE/SEEK_DATA implementation to try to use FIEMAP
> by default and then fall back on bmap if it has to, or some other such
> operation.  I'm cool with passing on this implementation in preference for
> FIEMAP, but given the discussion I had earlier this week with some of the other
> fs people the general thought was go ahead and do this for now.
> 

Well, there's no demand specifically for SEEK_HOLE/SEEK_DATA[1] and the
interface is ugly, so killing it before it spreads beyond Solaris seems
like a good idea to me. OTOH, if you implement SEEK_HOLE/SEEK_DATA,
nobody is going to bother using the good interface if SEEK_HOLE/
SEEK_DATA is the only portable interface.



[1] The only user appears to be Joerg Schilling.
http://www.google.com/codesearch?hl=en&q=+%5CWSEEK_(DATA%7CHOLE)&sa=N&as_case=y

-- 
Nicholas Miell <[EMAIL PROTECTED]>



Re: [RFC][PATCH] Implement SEEK_HOLE/SEEK_DATA

2007-11-28 Thread Nicholas Miell

On Wed, 2007-11-28 at 15:02 -0500, Josef Bacik wrote:
> Hello,
> 
> This is my first pass at implementing SEEK_HOLE/SEEK_DATA.  This has been in
> solaris for about a year now, and is described here
> 
> http://docs.sun.com/app/docs/doc/819-2241/lseek-2?l=en&a=view&q=SEEK_HOLE
> http://blogs.sun.com/roller/page/bonwick?entry=seek_hole_and_seek_data
> 
> I've added a file operation to allow filesystems to override the default
> seek_hole_data function, which just loops through bmap looking for either a hole
> or data.  I've tested this and it seems to work well.  I ran my testcase on a
> solaris box to make sure I got consistent results (I just ran my test script on
> the solaris box, I haven't looked at any of their code in case that's a concern).
> All comments welcome.  Thank you,
> 
> Josef

I stand by my belief that SEEK_HOLE/SEEK_DATA is a lousy interface.

It abuses the seek operation to become a query operation, it requires a
total number of system calls proportional to the number of holes+data and
it isn't general enough for other similar uses (e.g. total number of
contiguous extents, compressed extents, offline extents, extents
currently shared with other inodes, extents embedded in the inode
(tails), etc.)
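
To make the system-call cost concrete, here is a rough sketch of what a caller has to do with the lseek() interface. It assumes a libc that exposes SEEK_DATA and SEEK_HOLE with the semantics Sun documents; count_data_runs() and count_data_runs_path() are made-up names for illustration only.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE           /* glibc hides SEEK_DATA / SEEK_HOLE otherwise */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: count the data runs in fd by alternating
 * lseek(SEEK_DATA) and lseek(SEEK_HOLE). Returns the number of runs,
 * or -1 on an unexpected error. The file offset may be moved.
 */
long count_data_runs(int fd)
{
    long runs = 0;
    off_t pos = 0;

    for (;;) {
        off_t data = lseek(fd, pos, SEEK_DATA);   /* first call per run */
        if (data == (off_t)-1)
            return errno == ENXIO ? runs : -1;    /* ENXIO: no data at or after pos */

        off_t hole = lseek(fd, data, SEEK_HOLE);  /* second call per run */
        if (hole == (off_t)-1)
            return -1;

        assert(hole > data);                      /* a data run is never empty */
        runs++;
        pos = hole;                               /* resume after this run */
    }
}

/* Convenience wrapper (also hypothetical): same query by path name. */
long count_data_runs_path(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    long runs = count_data_runs(fd);
    close(fd);
    return runs;
}
```

Every data run costs two lseek() calls, plus one final call to discover that nothing is left, and the caller still learns nothing beyond "hole" or "data".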

Something like the following would be much better:

int getfilextents(int fd, off_t offset, int type, size_t *length, struct
extent *extents)

with

int fd: open file

offset: offset in file to start reporting extents

type: one of EXTENT_TYPE_HOLE, EXTENT_TYPE_DATA, EXTENT_TYPE_EXTENTS,
EXTENT_TYPE_COMPRESSED, EXTENT_TYPE_UNCOMPRESSED etc.

length: in/out parameter, on entry contains length of extents array, on
exit contains number of valid entries in the extents array or total
number of extents remaining in the file, whichever is larger

extents: array of struct extent { off_t offset; off_t length }, only
updated if non-NULL
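
The in/out behaviour of the length parameter is easier to see in code. The following is only a sketch under my own assumptions: getfilextents_sketch() is a stand-in that walks an in-memory extent list (the map argument plays the role of the filesystem), and the fd and type parameters are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

struct extent { off_t offset; off_t length; };

/*
 * Hypothetical stand-in for the proposed call. 'map' is the file's extent
 * list, sorted by offset. Extents ending at or before 'offset' are skipped.
 */
int getfilextents_sketch(const struct extent *map, size_t nmap, off_t offset,
                         size_t *length, struct extent *extents)
{
    assert(length != NULL);

    size_t cap = extents ? *length : 0;  /* NULL array: count only */
    size_t remaining = 0;

    for (size_t i = 0; i < nmap; i++) {
        if (map[i].offset + map[i].length <= offset)
            continue;                    /* extent lies entirely before offset */
        if (remaining < cap)
            extents[remaining] = map[i]; /* fill only while there is room */
        remaining++;
    }

    /*
     * The total remaining is never smaller than the number of entries
     * filled in, so it is always the "whichever is larger" value.
     */
    *length = remaining;
    return 0;
}
```

A caller can pass a NULL array first to learn how many entries to allocate, then call again with a buffer of that size. If the returned length exceeds the capacity passed in, the array was too small and only the first entries are valid.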

Making the type parameter a bitmask and adding a type member to struct
extent could be useful, too, so that multiple types of extents could be
reported at once. (But you end up with weird cases
like data extents overlapping with compressed extents.)

Actually, now that I've searched my mailbox, Andreas Dilger's FIEMAP
proposal is pretty much what I suggest here and is certainly superior to
Sun's SEEK_HOLE/SEEK_DATA.

-- 
Nicholas Miell <[EMAIL PROTECTED]>



Re: RFC: Case-insensitive support for XFS

2007-10-07 Thread Nicholas Miell
On Mon, 2007-10-08 at 15:07 +1000, Barry Naujok wrote:
> On Sat, 06 Oct 2007 04:52:18 +1000, Nicholas Miell <[EMAIL PROTECTED]> wrote:
> 
> > On Fri, 2007-10-05 at 16:44 +0100, Christoph Hellwig wrote:
> >> [Adding -fsdevel because some of the things touched here might be of
> >>  broader interest and Urban because his name is on nls_utf8.c]
> >>
> >> On Fri, Oct 05, 2007 at 11:57:54AM +1000, Barry Naujok wrote:
> >> >
> >> > On its own, linux only provides case conversion for old-style
> >> > character sets - 8 bit sequences only. A lot of distros are
> >> > now defaulting to UTF-8 and Linux NLS stuff does not support
> >> > case conversion for any unicode sets.
> >>
> >> The lack of case tables in nls_utf8.c definitely seems odd to me.
> >> Urban, is there a reason for that?  The only thing that comes to
> >> mind is that these tables might be quite large.
> >>
> >
> > Case conversion in Unicode is locale dependent. The legacy 8-bit
> > character encodings don't code for enough characters to run into the
> > ambiguities, so they can get away with fixed case conversion tables.
> > Unicode can't.
> 
> Based on http://www.unicode.org/reports/tr21/tr21-5.html and
> http://www.unicode.org/Public/UNIDATA/CaseFolding.txt
> 
> Doing case comparison using that table should cater for most
> circumstances except a few exceptions. It should be enough
> to satisfy a locale independent case-insensitive filesystem
> (ie. the C + F case folding option).
> 
> Is normalization required after case-folding? What I read
> implies it is not necessary for this purpose (and would
> slow things down and bloat the code more).
> 
> Now I suppose, it's just a question of a fixed table in the
> kernel driver (HFS+ style), or data stored in a special
> inode on-disk (NTFS style, shared refcounted in memory
> when the same). With the on-disk option, the table can be generated
> from mkfs.xfs.

You also have to decide whether to screw over people who speak Turkic
languages and expect an 'I' to 'ı' mapping or everybody else, who expects
an 'I' to 'i' mapping.

Although, if you're content in ignoring the kernel's native NLS case
mapping tables (which expect a locale-independent 1-to-1 mapping), you
could just uppercase everything and map both 'i' and 'ı' to 'I'.

Then you have to decide whether things like 'ê' map to 'E' or 'Ê', which
is also locale dependent.
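
To spell out the 'I' problem in code, here is a minimal sketch covering only the dotted/dotless i code points. The function and enum names are mine, not anything in the kernel NLS code, and a real table would of course cover far more than these characters.

```c
#include <assert.h>
#include <stdint.h>

enum case_rules { CASE_DEFAULT, CASE_TURKIC };

/*
 * Lowercase capital 'I' (U+0049) only; every other code point is
 * returned unchanged. The result depends on the locale rules chosen.
 */
uint32_t lower_capital_i(uint32_t cp, enum case_rules rules)
{
    assert(cp <= 0x10FFFF);                 /* must be a Unicode code point */
    if (cp == 0x0049)
        return rules == CASE_TURKIC ? 0x0131  /* dotless i */
                                    : 0x0069; /* plain i */
    return cp;
}

/*
 * The uppercasing alternative: fold both 'i' (U+0069) and dotless i
 * (U+0131) to 'I', so neither group of users gets a surprising miss.
 */
uint32_t upper_merge_i(uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    if (cp == 0x0069 || cp == 0x0131)
        return 0x0049;
    return cp;
}
```

The first function cannot be expressed as one fixed table, which is the locale problem. The second can, but it is a many-to-one mapping rather than the 1-to-1 mapping the existing NLS tables expect.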

-- 
Nicholas Miell <[EMAIL PROTECTED]>



Re: RFC: Case-insensitive support for XFS

2007-10-05 Thread Nicholas Miell
On Fri, 2007-10-05 at 16:44 +0100, Christoph Hellwig wrote:
> [Adding -fsdevel because some of the things touched here might be of
>  broader interest and Urban because his name is on nls_utf8.c]
> 
> On Fri, Oct 05, 2007 at 11:57:54AM +1000, Barry Naujok wrote:
> > 
> > On its own, linux only provides case conversion for old-style
> > character sets - 8 bit sequences only. A lot of distros are
> > now defaulting to UTF-8 and Linux NLS stuff does not support
> > case conversion for any unicode sets.
> 
> The lack of case tables in nls_utf8.c definitely seems odd to me.
> Urban, is there a reason for that?  The only thing that comes to
> mind is that these tables might be quite large.
> 

Case conversion in Unicode is locale dependent. The legacy 8-bit
character encodings don't code for enough characters to run into the
ambiguities, so they can get away with fixed case conversion tables.
Unicode can't.

I'd point you to the Unicode technical report which explains how to do
it, but unicode.org seems to be offline right now.

> > NTFS in Linux also implements its own dcache and NTFS also
> 
>   ^^^ dentry operations?
> 
> > stores its unicode case table on disk. This allows the filesystem
> > to migrate to newer forms of Unicode at the time of formatting
> > the filesystem. Eg. Windows Vista now supports Unicode 5.0
> > while older version would support an earlier version of
> > Unicode. Linux's version of NTFS case table is implemented
> > in fs/ntfs/upcase.c defined as default_upcase.
> 
> Because ntfs uses 16bit wide chars it prefers to use its own tables.
> I'm not sure it's that good an idea.

Well, Windows uses those on-disk tables, so the Linux driver has to
also. I don't see how that's a bad idea or any way to not do it and
remain compatible.

-- 
Nicholas Miell <[EMAIL PROTECTED]>



Re: [RFC] FUSE: mnotify (was: [RFC] VFS: mnotify)

2007-08-12 Thread Nicholas Miell
On Sun, 2007-08-12 at 13:24 +0200, Jan Engelhardt wrote:
> On Aug 12 2007 06:32, Al Boldi wrote:
> >Al Boldi wrote:
> >> Jakob Oestergaard wrote:
> >> > Why on earth would you cripple the kernel defaults for ext3 (which is a
> >> > fine FS for boot/root filesystems), when the *fundamental* problem you
> >> > really want to solve lies much deeper in the implementation of the
> >> > filesystem?  Noatime doesn't solve the problem, it just makes it "less
> >> > horrible".
> >>
> >> inotify could easily solve the atime problem, but it's got the drawback of
> >> forcing the user to register each and every file/dir of interest, which
> >> isn't really reasonable on TB-filesystems.
> 
> What inotify needs is some kind of SUBDIR flag on a watch so that one does not
> run out of fds, then the TB issue becomes a bit lighter I think.
> 

There's no risk of running out of fds; inotify only requires one. You
still have to register every directory you're interested in, though, but
that's a limitation caused by the Unix VFS philosophy and the resulting
filesystem design it inspired rather than of inotify itself.
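
As a rough illustration of the "one fd, many watches" point, the sketch below walks a tree with nftw() and registers every directory on a single inotify descriptor. It assumes Linux with glibc; watch_tree() and the two static variables are names I made up, and the event mask is just an example.

```c
#define _XOPEN_SOURCE 700       /* for nftw() and FTW_PHYS */
#include <assert.h>
#include <ftw.h>
#include <sys/inotify.h>
#include <sys/stat.h>
#include <unistd.h>

static int watch_fd = -1;       /* the single inotify descriptor */
static int watch_count;         /* directories registered so far */

/* nftw() callback: add one watch per directory, reusing watch_fd. */
static int add_dir_watch(const char *path, const struct stat *sb,
                         int type, struct FTW *ftw)
{
    (void)sb;
    (void)ftw;
    if (type == FTW_D &&
        inotify_add_watch(watch_fd, path,
                          IN_CREATE | IN_DELETE | IN_MODIFY) >= 0)
        watch_count++;
    return 0;                   /* keep walking even if one watch fails */
}

/*
 * Watch every directory under root. Returns the number of directories
 * registered (or -1 on error) and stores the inotify fd in *fd_out.
 * Not thread-safe: nftw() offers no user pointer, hence the statics.
 */
int watch_tree(const char *root, int *fd_out)
{
    assert(root != NULL && fd_out != NULL);

    watch_fd = inotify_init1(IN_CLOEXEC);
    if (watch_fd < 0)
        return -1;

    watch_count = 0;
    if (nftw(root, add_dir_watch, 16, FTW_PHYS) != 0) {
        close(watch_fd);
        return -1;
    }
    *fd_out = watch_fd;
    return watch_count;
}
```

Only one file descriptor is consumed however large the tree is, but the walk itself still touches every directory, which is the cost that hurts on TB-sized filesystems.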

Come up with a filesystem where given an inode you can find every
directory that has links to that inode with very little effort, convince
everybody to switch from ext3 to this new filesystem, and then maybe
inotify could start doing recursive subtree watches. Otherwise, it's
just not feasible.

-- 
Nicholas Miell <[EMAIL PROTECTED]>



Re: *at syscalls for xattrs?

2007-07-15 Thread Nicholas Miell
On Sun, 2007-07-15 at 21:53 +0100, Al Viro wrote:
> On Sun, Jul 15, 2007 at 09:46:27PM +0200, Jan Engelhardt wrote:
> > Hi,
> > 
> > 
> > recently, the family of *at() syscalls and functions (openat, fstatat, 
> > etc.) have been added to Linux and Glibc, respectively.
> > In short: I am missing xattr at functions :)
> 
> No.  They are not fscking forks.  They are almost as revolting, but
> not quite on the same level.

I suspect he was asking for 

int getxattrat(int fd, const char *path, const char *name, void *value, 
size_t size, int flags)
int setxattrat(int fd, const char *path, const char *name, void *value,
size_t size, int xattrflags, int atflags)

rather than the ability to access xattrs as files.

> > BTW, why is fstatat called fstatat and not statat? (Same goes for 
> > futimesat.) It does not take a file descriptor for the file argument. 
> > Otherwise we'd also need fopenat/funlinkat, etc. Any reasons?
> 
> Ulrich having an odd taste?

Solaris compatibility.

-- 
Nicholas Miell <[EMAIL PROTECTED]>



Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-04-30 Thread Nicholas Miell
On Tue, 2007-05-01 at 14:22 +1000, David Chinner wrote:
> On Mon, Apr 30, 2007 at 04:44:01PM -0600, Andreas Dilger wrote:
> > On Apr 19, 2007  11:54 +1000, David Chinner wrote:
> > > > > struct fiemap {
> > > > >         __u64 fm_start;         /* logical start offset of mapping (in/out) */
> > > > >         __u64 fm_len;           /* logical length of mapping (in/out) */
> > > > >         __u32 fm_flags;         /* FIEMAP_FLAG_* flags for request (in/out) */
> > > > >         __u32 fm_extent_count;  /* number of extents in fm_extents (in/out) */
> > > > >         __u64 fm_unused;
> > > > >         struct fiemap_extent fm_extents[0];
> > > > > }
> > > > > 
> > > > > /* flags for the fiemap request */
> > > > > #define FIEMAP_FLAG_SYNC        0x0001  /* flush delalloc data to disk */
> > > > > #define FIEMAP_FLAG_HSM_READ    0x0002  /* retrieve data from HSM */
> > > > > #define FIEMAP_FLAG_INCOMPAT    0xff00  /* must understand these flags */
> > > 
> > > No flags in the INCOMPAT range - shouldn't it be 0x3 at this point?
> > 
> > This is actually for future use.  Any flags that are added into this range
> > must be understood by both sides or it should be considered an error.  Flags
> > outside the FIEMAP_FLAG_INCOMPAT do not necessarily need to be supported.
> > If it turns out that 8 bits is too small a range for INCOMPAT flags, then
> > > we can make 0x0100 an incompat flag that means e.g. 0x00ff are also
> > > incompat flags.
> 
> Ah, ok. So it's not really a set of "compatibility" flags,
> it's more a "compulsory" set. Under those terms, I don't really
> see why this is necessary - either the filesystem will understand
> the flags or it will return EINVAL or ignore them...
> 
> > I'm assuming that all flags that will be in the original FIEMAP proposal
> > will be understood by the implementations.  Most filesystems can safely
> > ignore FLAG_HSM_READ, for example, since they don't support HSM, and for
> > that matter FLAG_SYNC is probably moot for most filesystems also because
> > they do block allocation at preprw time.
> 
> Exactly my point - so why do we really need to encode a compulsory set
> of flags in the API? 
> 

Because flags have meaning, independent of whether or not the filesystem
understands them. And if the filesystem chooses to ignore critically
important flags (instead of returning EINVAL), bad things may happen.

So, either the filesystem will understand the flag
or iff the unknown flag is in the incompat set, it will return EINVAL
or else the unknown flag will be safely ignored.

-- 
Nicholas Miell <[EMAIL PROTECTED]>



Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-04-13 Thread Nicholas Miell
On Fri, 2007-04-13 at 12:38 +0100, Anton Altaparmakov wrote:
> > One additional feature from the XFS getbmapx interface we should
> > provide is the ability to query layout of xattrs.  While other
> > filesystems might not have the exact xattr fork XFS has it fits
> > nicely into the interface.  Especially when we have Anton's suggested
> > flag for inline data.
> 
> Would it not be better to allow people to get a file descriptor on  
> the xattr fork and then just run the normal FIEMAP ioctl on that file  
> descriptor?
> 
> I.e. "openat(base file descriptor, O_STREAM, streamname)" or O_XATTR
> or whatever...  An alternative API would be to provide a
> "getxattrfd()/fgetxattrfd()" call or similar that would instead of returning the
> value of an xattr return an fd to it.  Then you do not need to modify
> openat() at all...  Interface doesn't bother me, just some ideas...
> 
> And for XFS you would define a magic streamname or xattrname (or  
> whatever you want to call it) of say  
> "com.sgi.filesystem.xfs.xattrstream" (or .xattrfork) or something and  
> then XFS would intercept that and know what to do with it...
> 
> Such an interface could then be used by NTFS named streams and other  
> file systems providing such things...
> 
> (Yes I know I will now totally get flamed about named streams not  
> being wanted in Linux and crap like that but that is exactly what you  
> are asking for except you want to special case a particular stream  
> using a flag instead of calling it for what it really is and once you  
> start doing that you might as well allow full named streams...)
> 
> You can just see named streams as an alternative, non-atomic API to
> xattrs if you like, i.e. you can either use the atomic xattr API
> provided in Linux already or you can get a file descriptor to an
> xattr and then use the normal system calls to access it
> non-atomically thus you can use the FIEMAP ioctl also.  (-:
> 
> FWIW this two-API approach to xattrs/named streams is the direction
> OSX is heading towards also so it is not without precedent and
> Windows has had both APIs for many years.  And Solaris has the
> "openat(O_XATTR)" interface so that is not without precedent either.

Except that xattrs in Linux aren't streams, and providing a stream-like
interface to them would be a weird abuse of the xattr concept.

In essence, Linux xattrs are named extensions to struct stat, with
getxattr() being in the same category as stat() and setxattr() being in
the same category as chmod()/chown()/utime()/etc.

The system namespace exists to provide a better interface than ioctl()
to weird FS-specific features (DOS attribute bits, HFS+ creator/type,
ext2/3/reiserfs/etc. immutable/append-only/secure-delete/etc. attributes
and so on). The uptake of this feature isn't as high as I'd like, but
that's what it's there for.

The security namespace is there for all the neat LSM modules that need
to attach metadata to files in order to function.

Finally, the user namespace exists to allow users to attach small bits
of information to their own files, since the API was already there and
hey!, metadata is useful.
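
Since the namespace is just the prefix of the attribute name, sorting a name into one of the three namespaces above is trivial. The sketch below is illustrative only: the enum and xattr_namespace() are made-up names, and any prefix not discussed here simply comes back as unknown.

```c
#include <assert.h>
#include <string.h>

enum xattr_ns {
    XATTR_NS_UNKNOWN,
    XATTR_NS_SYSTEM,     /* FS-specific features */
    XATTR_NS_SECURITY,   /* LSM metadata */
    XATTR_NS_USER        /* user-supplied metadata */
};

/* Classify an attribute name such as "user.mime_type" by its prefix. */
enum xattr_ns xattr_namespace(const char *name)
{
    assert(name != NULL);

    if (strncmp(name, "system.", 7) == 0)
        return XATTR_NS_SYSTEM;
    if (strncmp(name, "security.", 9) == 0)
        return XATTR_NS_SECURITY;
    if (strncmp(name, "user.", 5) == 0)
        return XATTR_NS_USER;
    return XATTR_NS_UNKNOWN;
}
```

Nothing in that scheme leaves room for an attribute that is also a seekable byte stream, which is why streams need a namespace of their own.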

Now, Solaris came along and totally confused the issue by using the same
name for a completely different feature, but that isn't any real reason
to mess up the existing Linux xattr concept just to graft named streams
support into the kernel.

(Not that I'm opposed to named streams in Linux, you just have to
realize that xattrs aren't named streams, can't live in the same
namespace as named streams, and certainly don't serve the same purpose
as named streams.)

-- 
Nicholas Miell <[EMAIL PROTECTED]>



Re: [RFC] add FIEMAP ioctl to efficiently map file allocation

2007-04-12 Thread Nicholas Miell
On Thu, 2007-04-12 at 05:05 -0600, Andreas Dilger wrote:
> I'm interested in getting input for implementing an ioctl to efficiently
> map file extents & holes (FIEMAP) instead of looping over FIBMAP a billion
> times.  We already have customers with single files in the 10TB range and
> we additionally need to get the mapping over the network so it needs to
> be efficient in terms of how data is passed, and how easily it can be
> extracted from the filesystem.
> 
> I had come up with a plan independently and was also steered toward
> XFS_IOC_GETBMAP* ioctls which are in fact very similar to my original
> plan, though I think the XFS structs used there are a bit bloated.
> 
> There was also recent discussion about SEEK_HOLE and SEEK_DATA as
> implemented by Sun, but even if we could skip the holes we still might
> need to do millions of FIBMAPs to see how large files are allocated
> on disk.  Conversely, having filesystems implement an efficient FIBMAP
> ioctl (or ->fiemap() method) could in turn be leveraged for SEEK_HOLE
> and SEEK_DATA instead of looping over ->bmap() inside the kernel
> as I saw in one patch.
> 

I certainly hope not. SEEK_HOLE/SEEK_DATA is a poor interface and
doesn't deserve to spread.

OTOH, this is nicely done.

> 
> struct fibmap_extent {
>   __u64 fe_start; /* starting offset in bytes */
>   __u64 fe_len;   /* length in bytes */
> }
> 
> struct fibmap {
>   struct fibmap_extent fm_start;  /* offset, length of desired mapping */
>   __u32 fm_extent_count;  /* number of extents in array */
>   __u32 fm_flags; /* flags (similar to XFS_IOC_GETBMAP) */
>   __u64 unused;
>   struct fibmap_extent fm_extents[0];
> }
> 
> #define FIEMAP_LEN_MASK   0xff0000
> #define FIEMAP_LEN_HOLE   0x01
> #define FIEMAP_LEN_UNWRITTEN  0x02
> 


-- 
Nicholas Miell <[EMAIL PROTECTED]>



Re: [RFC][PATCH] Secure Deletion and Trash-Bin Support for Ext4

2006-12-06 Thread Nicholas Miell
On Thu, 2006-12-07 at 13:49 +1100, David Chinner wrote:
> On Wed, Dec 06, 2006 at 09:35:30PM -0500, Josef Sipek wrote:
> > On Thu, Dec 07, 2006 at 12:44:27PM +1100, David Chinner wrote:
> > > Maybe we should be using EAs for this sort of thing instead of flags
> > > on the inode? If we keep adding inode flags for generic features
> > > then we are going to force more than just XFS into inode format
> > > changes eventually
> > 
> > Aren't EAs slow? Maybe not on XFS but on other filesystems...
> 
> Only when they don't fit in the inode itself and extra
> disk seeks are needed to retrieve them.
> 
> Cheers,
> 
> Dave.

Also keep in mind that the EA doesn't actually have to have a physical
representation on disk (or, rather, it does, but it doesn't need to be
the same representation used by EAs in the user namespace).

This means that if one of those slow EA filesystems still has room for
flags in the inode, it can synthesize the EA on demand.
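
Something along these lines, as a sketch only: struct demo_inode, the DEMO_FL_SECRM bit, and the attribute name "system.demo.secure_delete" are all invented for the example and do not correspond to any real filesystem.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

#define DEMO_FL_SECRM 0x1u          /* hypothetical secure-delete inode bit */

struct demo_inode {
    uint32_t flags;                 /* on-disk flag word, no EA storage */
};

/*
 * Hypothetical getxattr handler: the EA has no on-disk representation
 * of its own; its one-byte value ('0' or '1') is built from the inode
 * flag on each call. A size of 0 only queries the value length.
 */
ssize_t demo_getxattr(const struct demo_inode *inode, const char *name,
                      void *value, size_t size)
{
    assert(inode != NULL && name != NULL);
    assert(value != NULL || size == 0);

    if (strcmp(name, "system.demo.secure_delete") != 0)
        return -ENODATA;            /* no such attribute */

    if (size == 0)
        return 1;                   /* length query */

    *(char *)value = (inode->flags & DEMO_FL_SECRM) ? '1' : '0';
    return 1;
}
```

The setxattr side would do the reverse and flip the bit, so the flag costs no extra disk seeks on the filesystems where real EAs are slow.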

This is even preferable to ioctls for the interface to new filesystem
metadata -- if a backup or archive program knows how to store EAs, it
will be able to save and restore any new exotic metadata without any
extra effort.

-- 
Nicholas Miell <[EMAIL PROTECTED]>
