Re: [PATCH 00/37] Permit filesystem local caching

2008-02-26 Thread Daniel Phillips
On Tuesday 26 February 2008 06:33, David Howells wrote:
> > Suppose one were to take a mundane approach to the persistent cache
> > problem instead of layering filesystems.  What you would do then is
> > change NFS's ->write_page and variants to fiddle the persistent
> > cache
> 
> It is a requirement laid down by the Linux NFS fs maintainers that the writes
> to the cache be asynchronous, even if the writes to NFS aren't.

As it happens, I will be hanging out for the next few days with said
NFS maintainers, it would help to be as informed as possible about
your patch set.

> Note further that NFS's write_page() != writing to the cache.  Writing to the
> cache is typically done by NFS's readpages().

Yes, of course.  But also by ->write_page no?

> > Which I could eventually find out by reading all the patches but asking you
> > is so much more fun :-)
> 
> And a waste of my time.  I've provided documentation in the main FS-Cache
> patch, both as text files and in comments in header files that answer your
> questions.  Please read them first.

37 Patches, none of which has "Documentation" in the subject line, and
you did not provide a diffstat in patch 0 for the patch set as a whole.
If I had known it was there of course I would have read it.  It is great
to see this level of documentation.  But I do not think it is fair to
blame your (one) reader for missing it.

See the smiley above?  The _real_ reason I am asking you is that I do
not think anybody understands your patch set, in spite of your
considerable efforts to address that.  Discussion in public, right or
wrong, is the only way to fix that.  It is counterproductive to drive
readers away from the discussion for fear that they may miss some point
obvious to the original author, or perhaps already discussed earlier on
lkml, and get flamed for it.

Obviously, the patch set is not going to be perfect when it goes in and
it would be a silly abuse of the open source process to require that,
but the parts where it touches the rest of the system have to be really
well understood, and it is clear from the level of participation in the
thread that they are not.

One bit that already came out of this, which you have alluded to
several times yourself but somehow seem to keep glossing over, is that
you need a ->direct_bio file operations method.  So does loopback mount.
It might be worth putting some effort into seeing how ->direct_IO can
be refactored to make that happen.  You can get it in separately on the
basis of helping loopback, and it will make your patches nicer.

Daniel
-
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/37] Permit filesystem local caching

2008-02-26 Thread David Howells
Daniel Phillips <[EMAIL PROTECTED]> wrote:

> I need to respond to this in pieces... first the bit that is bugging
> me:
> 
> > >   * two new page flags
> > 
> > I need to keep track of two bits of per-cached-page information:
> > 
> >  (1) This page is known by the cache, and that the cache must be informed if
> >  the page is going to go away.
> 
> I still do not understand the life cycle of this bit.  What does the
> cache do when it learns the page has gone away?

That's up to the cache.  CacheFS, for example, unpins some resources when all
the pages managed by a pointer block are taken away from it.  The cache may
also reserve a block on disk to back this page, and that reservation may then
be discarded by the netfs uncaching the page.

The cache may also speculatively take copies of the page if the machine is
idle.

Documentation/filesystems/caching/netfs-api.txt describes the caching API as a
process, including the presentation of netfs pages to the cache and their
uncaching.

> How is it informed?

[Documentation/filesystems/caching/netfs-api.txt]
==
PAGE UNCACHING
==

To uncache a page, this function should be called:

void fscache_uncache_page(struct fscache_cookie *cookie,
  struct page *page);

This function permits the cache to release any in-memory representation it
might be holding for this netfs page.  This function must be called once for
each page on which the read or write page functions above have been called to
make sure the cache's in-memory tracking information gets torn down.

Note that pages can't be explicitly deleted from the data file.  The whole
data file must be retired (see the relinquish cookie function below).

Furthermore, note that this does not cancel the asynchronous read or write
operation started by the read/alloc and write functions.
[/]

> Who owns the page cache in which such a page lives, the nfs client?
> Filesystem that hosts the page?  A third page cache owned by the
> cache itself?  (See my basic confusion about how many page cache
> levels you have, below.)

[Documentation/filesystems/caching/fscache.txt]
 (7) Data I/O is done direct to and from the netfs's pages.  The netfs
 indicates that page A is at index B of the data-file represented by cookie
 C, and that it should be read or written.  The cache backend may or may
 not start I/O on that page, but if it does, a netfs callback will be
 invoked to indicate completion.  The I/O may be either synchronous or
 asynchronous.
[/]

I should perhaps make the documentation more explicit: the pages passed to the
routines defined in include/linux/fscache.h are netfs pages, normally belonging
the pagecache of the appropriate netfs inode.  This is, however, mentioned in
the function banner comments in fscache.h.

> Suppose one were to take a mundane approach to the persistent cache
> problem instead of layering filesystems.  What you would do then is
> change NFS's ->write_page and variants to fiddle the persistent
> cache

It is a requirement laid down by the Linux NFS fs maintainers that the writes
to the cache be asynchronous, even if the writes to NFS aren't.

Note further that NFS's write_page() != writing to the cache.  Writing to the
cache is typically done by NFS's readpages().

Besides, at the moment, caching is suppressed for any NFS file opened for
writing due to coherency issues.  This is something to be revisited later.

> as well as the network, instead of just the network as now.

Not as now.  See above.

> This fiddling could even consist of ->write calls to another
> filesystem, though working directly with the bio interface would
> yield the fastest, and therefore to my mind, best result.

You can't necessarily access the BIO interface, and even if you can, the cache
is still a filesystem.

Essentially, what cachefiles does is to do what you say: to perform ->write
calls on another filesystem.

FS-Cache also protects the netfs against (a) there being no cache, (b) the
cache suffering a fatal I/O error and (c) the cache being removed; and protects
the cache against (d) the netfs uncaching pages that the cache is using and (e)
conflicting operations from the netfs, some of which may be queued for
asynchronous processing.

FS-Cache also groups asynchronous netfs store requests together, which
hopefully, one day, I'll be able to pass on to the backing fs.

> In any case, you find out how to write the page to backing store by
> asking the filesystem, which in the naive approach would be nfs
> augmented with caching library calls.

NFS and AFS and CIFS and ISOFS, but yes, that's what fscache is, if you like, a
caching library.

> The filesystem keeps its own metadata around to know how to map the page to
> disk.  So again naively, this metadata could tell the nfs client that the
> page is not mapped to disk at all.

The netfs should _not_ know about the metadata of a backing fs.  Firstly, there
are many different potent

Re: [PATCH 00/37] Permit filesystem local caching

2008-02-26 Thread Daniel Phillips
I need to respond to this in pieces... first the bit that is bugging
me:

> >   * two new page flags
> 
> I need to keep track of two bits of per-cached-page information:
> 
>  (1) This page is known by the cache, and that the cache must be informed if
>  the page is going to go away.

I still do not understand the life cycle of this bit.  What does the
cache do when it learns the page has gone away?  How is it informed?
Who owns the page cache in which such a page lives, the nfs client?
Filesystem that hosts the page?  A third page cache owned by the
cache itself?  (See my basic confusion about how many page cache
levels you have, below.)

Suppose one were to take a mundane approach to the persistent cache
problem instead of layering filesystems.  What you would do then is
change NFS's ->write_page and variants to fiddle the persistent
cache as well as the network, instead of just the network as now.
This fiddling could even consist of ->write calls to another
filesystem, though working directly with the bio interface would
yield the fastest, and therefore to my mind, best result.

In any case, you find out how to write the page to backing store by
asking the filesystem, which in the naive approach would be nfs
augmented with caching library calls.  The filesystem keeps its own
metadata around to know how to map the page to disk.  So again
naively, this metadata could tell the nfs client that the page is
not mapped to disk at all.  So I do not see what your per-page bit
is for, obviously because I do not fully understand your caching
scheme.  Which I could eventually find out by reading all the
patches but asking you is so much more fun :-)

By the way, how many levels of page caching for the same data are
there, is it:

  1) nfs client
  2) cache layer's own page cache
  3) filesystem hosting the cache

or just:

  1) nfs client page cache
  2) filesystem hosting the cache

I think it is the second, but that is already double caching, which
has got to hurt.

Regards,

Daniel
-
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/37] Permit filesystem local caching

2008-02-25 Thread David Howells
Daniel Phillips <[EMAIL PROTECTED]> wrote:

> On Monday 25 February 2008 15:19, David Howells wrote:
> > So I guess there's a problem in cachefiles's efficiency - possibly due
> > to the fact that it tries to be fully asynchronous.
> 
> OK, not just my imagination, and it makes me feel better about the patch 
> set because efficiency bugs are fixable while fundamental limitations 
> are not.

One can hope:-)

> How much of a hurry are you in to merge this feature?  You have bits 
> like this:

I'd like to get it upstream sooner rather than later.  As it's not upstream,
but it's prerequisite patches touch a lot of code, I have to spend time
regularly making my patches work again.  Merge windows are completely not fun.

> "Add a function to install a monitor on the page lock waitqueue for a 
> particular page, thus allowing the page being unlocked to be detected.
> This is used by CacheFiles to detect read completion on a page in the 
> backing filesystem so that it can then copy the data to the waiting 
> netfs page."
> 
> We already have that hook, it is called bio_endio.

Except that isn't accessible.  CacheFiles currently has no access to the
notification from the blockdev to the backing fs, if indeed there is one.  All
we can do it trap the backing fs page becoming available.

> My strong intuition is that your whole mechanism should sit directly on the
> block device, no matter how attractive it seems to be able to piggyback on
> the namespace and layout management code of existing filesystems.

There's a place for both.

Consider a laptop with a small disk, possibly subdivided between Linux and
Windows.  Linux then subdivides its bit further to get a swap space.  What you
then propose is to break off yet another chunk to provide the cache.  You
can't then use this other chunk for anything else, even if it's, say, 1% used
by the cache.

The way CacheFiles works is that you tell it that it can use up to a certain
percentage of the otherwise free disk space on an otherwise existing
filesystem.  In the laptop case, you may just have a single big partition.  The
cache will fill up as much of it can, and as the other contents of the
partition consume space, the cache will be culled to make room.

On the other hand, a system like my desktop, where I can slap in extra disks
with mound of extra disk space, it might very well make sense to commit block
devices to caching, as this can be used to gain performance.

I have another cache backend (CacheFS) which takes the form of a filesystem,
thus allowing you to mount a blockdev as a cache.  It's much faster than Ext3
at storing and retrieving files... at first.  The problem is that I've mucked
up the free space retrieval such that performance degrades by 20x over time for
files of any size.

Basically any cache on a raw blockdev _is_ a filesystem, just one in which
you're randomly allowed to discard data to make life easier.

> I see your current effort as the moral equivalent of FUSE: you are able to
> demonstrate certain desirable behavioral properties, but you are unable to
> reach full theoretical efficiency because there are layers and layers of
> interface gunk interposed between the netfs user and the cache device.

The interface gunk is meant to be as thin as possible, but there are
constraints (see the documentation in the main FS-Cache patch for more
details):

 (1) It's a requirement that it only be tied to, say, AFS.  We might have
 several netfs's that want caching: AFS, CIFS, ISOFS (okay, that last isn't
 really a netfs, but it might still want caching).

 (2) I want to be able to change the backing cache.  Under some circumstances I
 might want to use an existing filesystem, under others I might want to
 commit a blockdev.  I've even been asked about using battery-backed RAM -
 which has different design constraints.

 (3) The constraint has been imposed by the NFS team that the cache be
 completely asynchronous.  I haven't quite met this: readpages() will wait
 until the cache knows whether or not the pages are available on the
 principle that read operations done through the cache can be considered
 synchronous.  This is an attempt to reduce the context switchage involved.

Unfortunately, the asynchronicity requirement has caused the middle layer to
bloat.  Fortunately, the backing cache needn't bloat as it can use the middle
layer's bloat.

> That said, I also see you have put a huge amount of work into this over 
> the years, it is nicely broken out, you are responsive and easy to work 
> with, all arguments for an early merge.  Against that, you invade core 
> kernel for reasons that are not necessarily justified:
>
>   * two new page flags

I need to keep track of two bits of per-cached-page information:

 (1) This page is known by the cache, and that the cache must be informed if
 the page is going to go away.

 (2) This page is being written to disk by the cache, and that it cannot be
 released un

Re: [PATCH 00/37] Permit filesystem local caching

2008-02-25 Thread Daniel Phillips
On Monday 25 February 2008 15:19, David Howells wrote:
> So I guess there's a problem in cachefiles's efficiency - possibly due
> to the fact that it tries to be fully asynchronous.

OK, not just my imagination, and it makes me feel better about the patch 
set because efficiency bugs are fixable while fundamental limitations 
are not.

How much of a hurry are you in to merge this feature?  You have bits 
like this:

"Add a function to install a monitor on the page lock waitqueue for a 
particular page, thus allowing the page being unlocked to be detected.
This is used by CacheFiles to detect read completion on a page in the 
backing filesystem so that it can then copy the data to the waiting 
netfs page."

We already have that hook, it is called bio_endio.  My strong intuition 
is that your whole mechanism should sit directly on the block device, 
no matter how attractive it seems to be able to piggyback on the 
namespace and layout management code of existing filesystems.  I see 
your current effort as the moral equivalent of FUSE: you are able to 
demonstrate certain desirable behavioral properties, but you are unable 
to reach full theoretical efficiency because there are layers and 
layers of interface gunk interposed between the netfs user and the 
cache device.

That said, I also see you have put a huge amount of work into this over 
the years, it is nicely broken out, you are responsive and easy to work 
with, all arguments for an early merge.  Against that, you invade core 
kernel for reasons that are not necessarily justified:

  * two new page flags
  * a new fileops method
  * many changes to LSM including new object class and new hooks
  * separate fs*id from task struct
  * new page-private destructor hook
  * probably other bits I missed

Would it be correct to say that some of these changes are to support 
disconnected operation?  If so, you really have two patch sets:

  1) Persistent netfs cache
  2) Disconnected netfs operation

You have some short snappers that look generally useful:

  * add_wait_queue_tail (cool)
  * write to a file without a struct file (includes ->mapping cleanup,
probably good)
  * export fsync_super

Why not hunt around for existing in-kernel users that would benefit so 
these can be submitted as standalone patches, shortening the remaining 
patch set and partially overcoming objections due to core kernel 
changes?

One thing I don't see is users coming on to lkml and saying "please 
merge this, it works great for me".  Since you probably have such 
users, why not give them a poke? 

Your cachefilesd is going to need anti-deadlock medicine like ddsnap 
has.  Since you don't seem at all worried about that right now, I 
suspect you have not hammered this code really heavily, correct?  
Without preventative measures, any memory-using daemon sitting in the 
block IO path will deadlock if you hit it hard enough.

A couple of years ago you explained the purpose of the new page flags to 
me and there is no way I can find that email again.  Could you explain 
it again please?  Meanwhile I am doing my duty and reading your OLS 
slides etc.

Regards,

Daniel
-
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/37] Permit filesystem local caching

2008-02-25 Thread David Howells

Daniel Phillips <[EMAIL PROTECTED]> wrote:

> This factor of four (even worse on XFS, not quite as bad on Ext3) is
> worth ruminating upon.  Is all of the difference explained by avoiding
> seeks on the server, which has the files in memory?

Here are some more stats for you to consider:

 (1) Copy the data across the network to a fresh Ext3 fs on the same partition
 I was using for the cache:

[EMAIL PROTECTED] ~]# time cp -a /warthog/aaa /var/fscache
real0m39.052s
user0m0.368s
sys 0m15.229s

 (2) Reboot and read back the files just written into Ext3 on the local disk:

[EMAIL PROTECTED] ~]# time tar cf - /var/fscache/aaa >/dev/zero
real0m40.574s
user0m0.164s
sys 0m3.512s

 (3) Run through the cache population process, and then run a tar directly on
 cachefiles's cache directly after a reboot:

[EMAIL PROTECTED] ~]# time tar cf - /var/fscache/cache >/dev/zero
real4m53.104s
user0m0.192s
sys 0m4.240s

So I guess there's a problem in cachefiles's efficiency - possibly due to the
fact that it tries to be fully asynchronous.

In case (1) this is very similar to the time for a read through a completely
cold cache (37.497s).

In case (2) this is comparable to cachefiles with a cache warmed prior to a
reboot (1m54.350s); in this case, however, cachefiles is doing some extra work:

 (a) It's doing a lookup on the server for each file, in addition to the
 lookups on the disk.  However, just doing a tar from plain NFS, the
 command completes in 22.330s.

 (b) It's reading an xattr per object for cache coherency management.

 (c) As the cache knows nothing of directories, files, etc., it lays its
 directory subtree out in a way that suits it.  File lookup keys are
 turned into filenames.  This may result in a less efficient arrangement
 in the cache than the original data, especially as directories may become
 very large, so Ext3 may be doing some extra work.

In case (3), this perhaps suggests that cachefiles's directory layout may be
part of the problem.  Running the following:

ls -ldSr `find . -type d`

in /var/fscache/cache shows that the directories are either 4096 bytes in size
(158 instances) or 12288 bytes in size (105 instances), for a total of 263
directories.  There are 19255 files.

Running that ls command in /warthog/aaa shows 1185 directories, all but three
of them 4096 bytes in size; two are 12288 bytes and one is 20480 bytes in size
(include/linux/ unsurprisingly).  There are 19258 files, three of which are
hardlinks to other files in the tree.

> This could be easily tested by running a test against a server that is the
> same as the client, and does not have the files in memory.  If local access
> is still slower than network then there is a real issue with cache
> efficiency.

My server is also my desktop machine.  The only way to guarantee that the
memory is scrubbed is to reboot it:-(  I'll look at setting up one of my other
machines as an NFS server.

David
-
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/37] Permit filesystem local caching

2008-02-22 Thread David Howells
Daniel Phillips <[EMAIL PROTECTED]> wrote:

> I am eventually going to suggest cutting the backing filesystem entirely out
> of the picture,

You still need a database to manage the cache.  A filesystem such as Ext3
makes a very handy database for four reasons:

 (1) It exists and works.

 (2) It has a well defined interface within the kernel.

 (3) I can place my cache on, say, my root partition on my laptop.  I don't
 have to dedicate a partition to the cache.

 (4) Userspace cache management tools (such as cachefilesd) have an already
 existing interface to use: rmdir, unlink, open, getdents, etc..

I do have a cache-on-blockdev thing, but it's basically a wandering tree
filesystem inside.  It is, or was, much faster than ext3 on a clean cache, but
it degrades horribly over time because my free space reclamation sucks - it
gradually randomises the block allocation sequence over time.

So, what would you suggest instead of a backing filesystem?

> I really do not like idea of force fitting this cache into a generic
> vfs model.  Sun was collectively smoking some serious crack when they
> cooked that one up.  But there is also the ageless principle "isness is
> more important than niceness".

What do you mean?  I'm not doing it like Sun.  The cache is a side path from
the netfs.  It should be transparent to the user, the VFS and the server.

The only place it might not be transparent is that you might to have to
instruct the netfs mount to use the cache.  I'd prefer to do it some other way
than passing parameters to mount, though, as (1) this causes fun with NIS
distributed automounter maps, and (2) people are asking for a finer grain of
control than per-mountpoint.  Unfortunately, I can't seem to find a way to do
it that's acceptable to Al.

> Which would require a change to NFS, not an option because you hope to
> work with standard servers?  Of course with years to think about this,
> the required protocol changes were put into v4.  Not.

I don't think there's much I can do about NFS.  It requires the filesystem
from which the NFS server is dealing to have inode uniquifiers, which are then
incorporated into the file handle.  I don't think the NFS protocol itself
needs to change to support this.

> Have you completely exhausted optimization ideas for the file handle
> lookup?

No, but there aren't many.  CacheFiles doesn't actually do very much, and it's
hard to reduce that not very much.  The most obvious thing is to prepopulate
the dcache, but that's at the expense of memory usage.

Actually, if I cache the name => FH mapping I used last time, I can make a
start on looking up in the cache whilst simultaneously accessing the server.
If what's on the server has changed, I can ditch the speculative cache lookup
I was making and start a new cache lookup.

However, storing directory entries has penalties of its own, though it'll be
necesary if we want to do disconnected operation.

> > Where "lookup table" == "dcache".  That would be good yes.  cachefilesd
> > prescans all the files in the cache, which ought to do just that, but it
> > doesn't seem to be very effective.  I'm not sure why.
> 
> RCU?  Anyway, it is something to be tracked down and put right.

cachefilesd runs in userspace.  It's possible it isn't doing enough to preload
all the metadata.

> What I tried to say.  So still... got any ideas?  That extra synchronous
> network round trip is a killer.  Can it be made streaming/async to keep
> throughput healthy?

That's a per-netfs thing.  With the test rig I've got, it's going to the
on-disk cache that's the killer.  Going over the network is much faster.

See the results I posted.  For the tarball load, and using Ext3 to back the
cache:

Cold NFS cache, no disk cache:  0m22.734s
Warm on-disk cache, cold pagecaches:1m54.350s

The problem is reading using tar is a worst case workload for this.  Everything
it does is pretty much completely synchronous.

One thing that might help is if things like tar and find can be made to use
fadvise() on directories to hint to the filesystem (NFS, AFS, whatever) that
it's going to access every file in those directories.

Certainly AFS could make use of that: the directory is read as a file, and the
netfs then parses the file to get a list of vnode IDs that that directory
points to.  It could then do bulk status fetch operations to instantiate the
inodes 50 at a time.

I don't know whether NFS could use it.  Someone like Trond or SteveD or Chuck
would have to answer that.

David
-
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/37] Permit filesystem local caching

2008-02-22 Thread Daniel Phillips
On Friday 22 February 2008 04:48, David Howells wrote:
> > But looking up the object in the cache should be nearly free - much less
> > than a microsecond per block.
> 
> The problem is that you have to do a database lookup of some sort, possibly
> involving several synchronous disk operations.

Right, so the obvious optimization strategy for this corner of it is to
decimate the synchronous disk ops for the average case, for which there
are a variety of options, one of which you already suggested.

> CacheFiles does a disk lookup by taking the key given to it by NFS, turning it
> into a set of file or directory names, and doing a short pathwalk to the 
> target
> cache file.  Throwing in extra indices won't necessarily help.  What matters 
> is
> how quick the backing filesystem is at doing lookups.  As it turns out, Ext3 
> is
> a fair bit better then BTRFS when the disk cache is cold.

All understood.  I am eventually going to suggest cutting the backing
filesystem entirely out of the picture, with a view to improving both
efficiency and transparency, hopefully with a code size reduction as
well.  But you are up and running with the filesystem approach, enough
to tackle the basic algorithm questions, which is worth a lot.

I really do not like idea of force fitting this cache into a generic
vfs model.  Sun was collectively smoking some serious crack when they
cooked that one up.  But there is also the ageless principle "isness is
more important than niceness".

> > > The metadata problem is quite a tricky one since it increases with the
> > > number of files you're dealing with.  As things stand in my patches, when
> > > NFS, for example, wants to access a new inode, it first has to go to the
> > > server to lookup the NFS file handle, and only then can it go to the cache
> > > to find out if there's a matching object in the case.
> > 
> > So without the persistent cache it can omit the LOOKUP and just send the
> > filehandle as part of the READ?
> 
> What 'it'?  Note that the get the filehandle, you have to do a LOOKUP op.  
> With
> the cache, we could actually cache the results of lookups that we've done,
> however, we don't know that the results are still valid without going to the
> server:-/

What I was trying to say.  It => the cache logic.

> AFS has a way around that - it versions its vnode (inode) IDs.

Which would require a change to NFS, not an option because you hope to
work with standard servers?  Of course with years to think about this,
the required protocol changes were put into v4.  Not.

/me hopes for an NFS hack to show up and explain the thinking there

Actually, there are many situations where changing both the client (you
must do that anyway) and the server is logistically practical.  In fact
that is true for all actual use cases I know of for this cache model.
So elaborating the protocol is not an option to reject out of hand.  A
hack along those lines could (should?) be provided as an opportunistic
option.

Have you completely exhausted optimization ideas for the file handle
lookup?

> > > The reason my client going to my server is so quick is that the server has
> > > the dcache and the pagecache preloaded, so that across-network lookup
> > > operations are really, really quick, as compared to the synchronous
> > > slogging of the local disk to find the cache object.
> > 
> > Doesn't that just mean you have to preload the lookup table for the
> > persistent cache so you can determine whether you are caching the data
> > for a filehandle without going to disk?
> 
> Where "lookup table" == "dcache".  That would be good yes.  cachefilesd
> prescans all the files in the cache, which ought to do just that, but it
> doesn't seem to be very effective.  I'm not sure why.

RCU?  Anyway, it is something to be tracked down and put right.

> > Your big can-t-get-there-from-here is the round trip to the server to
> > determine whether you should read from the local cache.  Got any ideas?
> 
> I'm not sure what you mean.  Your statement should probably read "... to
> determine _what_ you should read from the local cache".

What I tried to say.  So still... got any ideas?  That extra synchronous
network round trip is a killer.  Can it be made streaming/async to keep
throughput healthy?

> > And where is the Trond-meister in all of this?
> 
> Keeping quiet as far as I can tell.

/me does the Trond summoning dance

Daniel
-
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/37] Permit filesystem local caching

2008-02-22 Thread David Howells
Chris Mason <[EMAIL PROTECTED]> wrote:

> Thanks for trying this, of course I'll ask you to try again with the latest 
> v0.13 code, it has a number of optimizations especially for CPU usage.

Here you go.  The numbers are very similar.

David

=
FEW BIG FILES TEST ON BTRFS v0.13
=

Completely cold caches:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real0m2.202s
user0m0.000s
sys 0m1.716s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real0m4.212s
user0m0.000s
sys 0m0.896s

Warm BTRFS pagecache, cold NFS pagecache:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real0m0.197s
user0m0.000s
sys 0m0.192s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real0m0.376s
user0m0.000s
sys 0m0.372s

Warm on-disk cache, cold pagecaches:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real0m1.543s
user0m0.004s
sys 0m1.448s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real0m3.111s
user0m0.000s
sys 0m2.856s


==
MANY SMALL/MEDIUM FILE READING TEST ON BTRFS v0.13
==

Completely cold caches:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real0m31.575s
user0m0.176s
sys 0m6.316s

Warm BTRFS pagecache, cold NFS pagecache:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real0m16.081s
user0m0.164s
sys 0m5.528s

Warm on-disk cache, cold pagecaches:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real2m15.245s
user0m0.064s
sys 0m2.808s

-
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 00/37] Permit filesystem local caching

2008-02-22 Thread Rick Macklem
> Well, the AFS paper that was referenced earlier was written around the
> time of 10bt and 100bt.  Local disk caching worked well then.  There
> should also be some papers at CITI about disk caching over slower
> connections, and disconnected operation (which should still be
> applicable today).  There are still winners from local disk caching, but
> their numbers have been reduced.  Server load reduction should be a win.
> I'm not sure if it's worth it from a security/manageability standpoint,
> but I haven't looked that closely at David's code.

One area that you might want to look at is WAN performance. When RPC RTT
goes up, ordinary NFS performance goes down. This tends to get overlooked
by the machine room folks. (There are several tools out there that can
introduce delay in an IP packet stream and emulate WAN RTTs.)

Just a thought, rick
-
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/37] Permit filesystem local caching

2008-02-22 Thread David Howells
David Howells <[EMAIL PROTECTED]> wrote:

> > > Have you got before/after benchmark results?
> > 
> > See attached.
> 
> Attached here are results using BTRFS (patched so that it'll work at all)
> rather than Ext3 on the client on the partition backing the cache.

And here are XFS results.

Tuning XFS makes a *really* big difference for the lots of small/medium files
being tarred case.  However, in general BTRFS is much better.

David
---


=
FEW BIG FILES TEST ON XFS
=

Completely cold caches:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real0m2.286s
user0m0.000s
sys 0m1.828s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real0m4.228s
user0m0.000s
sys 0m1.360s

Warm NFS pagecache:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real0m0.058s
user0m0.000s
sys 0m0.060s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real0m0.122s
user0m0.000s
sys 0m0.120s

Warm XFS pagecache, cold NFS pagecache:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real0m0.181s
user0m0.000s
sys 0m0.180s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real0m1.034s
user0m0.000s
sys 0m0.404s

Warm on-disk cache, cold pagecaches:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real0m1.540s
user0m0.000s
sys 0m0.256s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real0m3.003s
user0m0.000s
sys 0m0.532s


==
MANY SMALL/MEDIUM FILE READING TEST ON XFS
==

Completely cold caches:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real4m56.827s
user0m0.180s
sys 0m6.668s

Warm NFS pagecache:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real0m15.084s
user0m0.212s
sys 0m5.008s

Warm XFS pagecache, cold NFS pagecache:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real0m13.547s
user0m0.220s
sys 0m5.652s

Warm on-disk cache, cold pagecaches:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real4m36.316s
user0m0.148s
sys 0m4.440s


===
MANY SMALL/MEDIUM FILE READING TEST ON AN OPTIMISED XFS
===

mkfs.xfs -d agcount=4 -l size=128m,version=2 /dev/sda6


Completely cold caches:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real3m44.033s
user0m0.248s
sys 0m6.632s

Warm on-disk cache, cold pagecaches:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real3m8.582s
user0m0.108s
sys 0m3.420s
-
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/37] Permit filesystem local caching

2008-02-22 Thread David Howells
Chris Mason <[EMAIL PROTECTED]> wrote:

> > The interesting case is where the disk cache is warm, but the pagecache is
> > cold (ie: just after a reboot after filling the caches).  Here, for the two
> > big files case, BTRFS appears quite a bit better than Ext3, showing a 21%
> > reduction in time for the smaller case and a 13% reduction for the larger
> > case.
> 
> I'm afraid I don't have a good handle on the filesystem operations that
> result from this workload.  Are we reading from the FS to fill the NFS page
> cache?

I'm not sure what you're asking.

When the cache is cold, we determine that we can't read from the cache very
quickly.  We then read data from the server and, in the background, create the
metadata in the cache and store the data to it (by copying netfs pages to
backingfs pages).

When the cache is warm, we read the data from the cache, copying the data from
the backingfs pages to the netfs pages.  We use bmap() to ascertain that there
is data to be read, otherwise we detect a hole and fallback to reading from
the server.

Looking up cache object involves a sequence of lookup() ops and getxattr() ops
on the backingfs.  Should an object not exist, we defer creation of that
object to a background thread and do lookups(), mkdirs() and setxattrs() and a
create() to manufacture the object.

We read data from an object by calling readpages() on the backingfs to bring
the data into the pagecache.  We monitor the PG_lock bits to find out when
each page is read or has completed with an error.

Writing pages to the cache is done completely in the background.
PG_fscache_write is set on a page when it is handed to fscache to storage,
then at some point a background thread wakes up and calls write_one_page() in
the backingfs to write that page to the cache file.  At the moment, this
copies the data into a backingfs page which is then marked PG_dirty, and the
VM writes it out in the usual way.

> > More surprising is that BTRFS performed significantly worse (15% increase
> > in time) in the case where the cache on disk was fully populated and then
> > the machine had been rebooted to clear the pagecaches.
> 
> Which FS operations are included here?  Finding all the files or just an 
> unmount?  Btrfs defrags metadata in the background, and unmount has to wait 
> for that defrag to finish.

BTRFS might not be doing any writing at all here - apart from local atimes
(used by cache culling), that is.

What it does have to do is lots of lookups, reads and getxattrs, all of which
are synchronous.

David
-
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/37] Permit filesystem local caching

2008-02-22 Thread Chris Mason
On Thursday 21 February 2008, David Howells wrote:
> David Howells <[EMAIL PROTECTED]> wrote:
> > > Have you got before/after benchmark results?
> >
> > See attached.
>
> Attached here are results using BTRFS (patched so that it'll work at all)
> rather than Ext3 on the client on the partition backing the cache.

Thanks for trying this, of course I'll ask you to try again with the latest 
v0.13 code, it has a number of optimizations especially for CPU usage.

>
> Note that I didn't bother redoing the tests that didn't involve a cache as
> the choice of filesystem backing the cache should have no bearing on the
> result.
>
> Generally, completely cold caches shouldn't show much variation as all the
> writing can be done completely asynchronously, provided the client doesn't
> fill its RAM.
>
> The interesting case is where the disk cache is warm, but the pagecache is
> cold (ie: just after a reboot after filling the caches).  Here, for the two
> big files case, BTRFS appears quite a bit better than Ext3, showing a 21%
> reduction in time for the smaller case and a 13% reduction for the larger
> case.

I'm afraid I don't have a good handle on the filesystem operations that result 
from this workload.  Are we reading from the FS to fill the NFS page cache?

>
> For the many small/medium files case, BTRFS performed significantly better
> (15% reduction in time) in the case where the caches were completely cold.
> I'm not sure why, though - perhaps because it doesn't execute a
> write_begin() stage during the write_one_page() call and thus doesn't go
> allocating disk blocks to back the data, but instead allocates them later.

If your write_one_page call does parts of btrfs_file_write, you'll get delayed 
allocation for anything bigger than 8k by default.  <= 8k will get packed 
into the btree leaves.

>
> More surprising is that BTRFS performed significantly worse (15% increase
> in time) in the case where the cache on disk was fully populated and then
> the machine had been rebooted to clear the pagecaches.

Which FS operations are included here?  Finding all the files or just an 
unmount?  Btrfs defrags metadata in the background, and unmount has to wait 
for that defrag to finish.

Thanks again,
Chris
-
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/37] Permit filesystem local caching

2008-02-22 Thread David Howells
Daniel Phillips <[EMAIL PROTECTED]> wrote:

> > The way the client works is like this:
> 
> Thanks for the excellent ascii art, that cleared up the confusion right
> away.

You know what they say about pictures... :-)

> > What are you trying to do exactly?  Are you actually playing with it, or
> > just looking at the numbers I've produced?
> 
> Trying to see if you are offering enough of a win to justify testing it,
> and if that works out, then going shopping for a bin of rotten vegetables
> to throw at your design, which I hope you will perceive as useful.

One thing that you have to remember: my test setup is pretty much the
worst-case for being appropriate for showing the need for caching to improve
performance.  There's a single client and a single server, they've got GigE
networking between them that has very little other load, and the server has
sufficient memory to hold the entire test data set.

> From the numbers you have posted I think you are missing some basic
> efficiencies that could take this design from the sorta-ok zone to wow!

Not really, it's just that this lashup could be considered designed to show
local caching in the worst light.

> But looking up the object in the cache should be nearly free - much less
> than a microsecond per block.

The problem is that you have to do a database lookup of some sort, possibly
involving several synchronous disk operations.

CacheFiles does a disk lookup by taking the key given to it by NFS, turning it
into a set of file or directory names, and doing a short pathwalk to the target
cache file.  Throwing in extra indices won't necessarily help.  What matters is
how quick the backing filesystem is at doing lookups.  As it turns out, Ext3 is
a fair bit better then BTRFS when the disk cache is cold.

> > The metadata problem is quite a tricky one since it increases with the
> > number of files you're dealing with.  As things stand in my patches, when
> > NFS, for example, wants to access a new inode, it first has to go to the
> > server to lookup the NFS file handle, and only then can it go to the cache
> > to find out if there's a matching object in the case.
> 
> So without the persistent cache it can omit the LOOKUP and just send the
> filehandle as part of the READ?

What 'it'?  Note that the get the filehandle, you have to do a LOOKUP op.  With
the cache, we could actually cache the results of lookups that we've done,
however, we don't know that the results are still valid without going to the
server:-/

AFS has a way around that - it versions its vnode (inode) IDs.

> > The reason my client going to my server is so quick is that the server has
> > the dcache and the pagecache preloaded, so that across-network lookup
> > operations are really, really quick, as compared to the synchronous
> > slogging of the local disk to find the cache object.
> 
> Doesn't that just mean you have to preload the lookup table for the
> persistent cache so you can determine whether you are caching the data
> for a filehandle without going to disk?

Where "lookup table" == "dcache".  That would be good yes.  cachefilesd
prescans all the files in the cache, which ought to do just that, but it
doesn't seem to be very effective.  I'm not sure why.

> > I can probably improve this a little by pre-loading the subindex
> > directories (hash tables) that I use to reduce the directory size in the
> > cache, but I don't know by how much.
> 
> Ah I should have read ahead.  I think the correct answer is "a lot".

Quite possibly.  It'll allow me to dispense with at least one fs lookup call
per cache object request call.

> Your big can-t-get-there-from-here is the round trip to the server to
> determine whether you should read from the local cache.  Got any ideas?

I'm not sure what you mean.  Your statement should probably read "... to
determine _what_ you should read from the local cache".

> And where is the Trond-meister in all of this?

Keeping quiet as far as I can tell.

David
-
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/37] Permit filesystem local caching

2008-02-21 Thread Daniel Phillips
On Thursday 21 February 2008 16:07, David Howells wrote:
> The way the client works is like this:

Thanks for the excellent ascii art, that cleared up the confusion right
away.

> What are you trying to do exactly?  Are you actually playing with it, or just
> looking at the numbers I've produced?

Trying to see if you are offering enough of a win to justify testing it,
and if that works out, then going shopping for a bin of rotten vegetables
to throw at your design, which I hope you will perceive as useful.

In short I am looking for a reason to throw engineering effort at it.
>From the numbers you have posted I think you are missing some basic
efficiencies that could take this design from the sorta-ok zone to wow!

I think you may already be in the wow zone for taking load off a server
and I know of applications where an NFS server gets hammered so badly
that having the client suck a little in the unloaded case is a price
worth paying.  But the whole idea would be much more attractive if the
regressions were smaller.

> > Who is supposed to win big?  Is this mainly about reducing the load on 
> > the server, or is the client supposed to win even with a lightly loaded 
> > server?
> 
> These are difficult questions to answer.  The obvious answer to both is "it
> depends", and the real answer to both is "it's a compromise".
> 
> Inserting a cache adds overhead: you have to look in the cache to see if your
> objects are mirrored there, and then you have to look in the cache to see if
> the data you want is stored there; and then you might have to go to the server
> anyway and then schedule a copy to be stored in the cache.

But looking up the object in the cache should be nearly free - much less
than a microsecond per block.  If not then there are design issues.  I
suspect that you are doing yourself a disservice by going all the way
through the vfs to do this cache lookup, but this needs to be proved.

> The characteristics of this type of cache depend on a number of things: the
> filesystem backing it being the most obvious variable, but also how fragmented
> it is and the properties of the disk drive or drives it is on.

Double caching and vm unawareness of that has to hurt.

> The metadata problem is quite a tricky one since it increases with the number
> of files you're dealing with.  As things stand in my patches, when NFS, for
> example, wants to access a new inode, it first has to go to the server to
> lookup the NFS file handle, and only then can it go to the cache to find out 
> if
> there's a matching object in the case.

So without the persistent cache it can omit the LOOKUP and just send the
filehandle as part of the READ?

> Worse, the cache must then perform 
> several synchronous disk bound metadata operations before it can be possible 
> to
> read from the cache.  Worse still, this means that a read on the network file
> cannot proceed until (a) we've been to the server *plus* (b) we've been to the
> disk.
> 
> The reason my client going to my server is so quick is that the server has 
> the 
> dcache and the pagecache preloaded, so that across-network lookup operations
> are really, really quick, as compared to the synchronous slogging of the local
> disk to find the cache object.

Doesn't that just mean you have to preload the lookup table for the
persistent cache so you can determine whether you are caching the data
for a filehandle without going to disk?

> I can probably improve this a little by pre-loading the subindex directories
> (hash tables) that I use to reduce the directory size in the cache, but I 
> don't
> know by how much.

Ah I should have read ahead.  I think the correct answer is "a lot".
Your big can-t-get-there-from-here is the round trip to the server to
determine whether you should read from the local cache.  Got any ideas?

And where is the Trond-meister in all of this?

Regards,

Daniel
-
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/37] Permit filesystem local caching

2008-02-21 Thread David Howells
Daniel Phillips <[EMAIL PROTECTED]> wrote:

> When you say Ext3 cache vs NFS cache is the first on the server and the 
> second on the client?

The filesystem on the server is pretty much irrelevant as long as (a) it
doesn't change, and (b) all the data is in memory on the server anyway.

The way the client works is like this:

+-+
| |   
|   NFS   |--+
| |  |
+-+  |   +--+ 
 |   |  | 
+-+  +-->|  | 
| |  |  |
|   AFS   |->| FS-Cache |
| |  |  |--+
+-+  +-->|  |  |
 |   |  |  |   +--+   +--+
+-+  |   +--+  |   |  |   |  |
| |  | +-->|  CacheFiles  |-->|  Ext3|
|  ISOFS  |--+ |  /var/cache  |   |  /dev/sda6   |
| |+--+   +--+
+-+


 (1) NFS, say, asks FS-Cache to store/retrieve data for it;

 (2) FS-Cache asks the cache backend, in this case CacheFiles to honour the
 operation;

 (3) CacheFiles 'opens' a file in a mounted filesystem, say Ext3, and does read
 and write operations of a sort on it;

 (4) Ext3 decides how the cache data is laid out on disk - CacheFiles just
 attempts to use one sparse file per netfs inode.

> I am trying to spot the numbers that show the sweet spot for this 
> optimization, without much success so far.

What are you trying to do exactly?  Are you actually playing with it, or just
looking at the numbers I've produced?

> Who is supposed to win big?  Is this mainly about reducing the load on 
> the server, or is the client supposed to win even with a lightly loaded 
> server?

These are difficult questions to answer.  The obvious answer to both is "it
depends", and the real answer to both is "it's a compromise".

Inserting a cache adds overhead: you have to look in the cache to see if your
objects are mirrored there, and then you have to look in the cache to see if
the data you want is stored there; and then you might have to go to the server
anyway and then schedule a copy to be stored in the cache.

The characteristics of this type of cache depend on a number of things: the
filesystem backing it being the most obvious variable, but also how fragmented
it is and the properties of the disk drive or drives it is on.

Whether it's worth having a cache depend on the characteristics of the network
versus the characteristics of the cache.  Latency of the cache vs latency of
the network, for example.  Network loading is another: having a cache on each
of several clients sharing a server can reduce network traffic by avoiding the
read requests to the server.  NFS has a characteristic that it keeps spamming
the server with file status requests, so even if you take the read requests out
of the load, an NFS client still generates quite a lot of network traffic to
the server - but the reduction is still useful.

The metadata problem is quite a tricky one since it increases with the number
of files you're dealing with.  As things stand in my patches, when NFS, for
example, wants to access a new inode, it first has to go to the server to
lookup the NFS file handle, and only then can it go to the cache to find out if
there's a matching object in the case.  Worse, the cache must then perform
several synchronous disk bound metadata operations before it can be possible to
read from the cache.  Worse still, this means that a read on the network file
cannot proceed until (a) we've been to the server *plus* (b) we've been to the
disk.

The reason my client going to my server is so quick is that the server has the
dcache and the pagecache preloaded, so that across-network lookup operations
are really, really quick, as compared to the synchronous slogging of the local
disk to find the cache object.

I can probably improve this a little by pre-loading the subindex directories
(hash tables) that I use to reduce the directory size in the cache, but I don't
know by how much.


Anyway, to answer your questions:

 (1) It may help with heavily loaded networks with lots of read-only traffic.

 (2) It may help with slow connections (like doing NFS between the UK and
 Australia).

 (3) It could be used to do offline/disconnected operation.

David
-
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/37] Permit filesystem local caching

2008-02-21 Thread David Howells
David Howells <[EMAIL PROTECTED]> wrote:

> > Have you got before/after benchmark results?
> 
> See attached.

Attached here are results using BTRFS (patched so that it'll work at all)
rather than Ext3 on the client on the partition backing the cache.

Note that I didn't bother redoing the tests that didn't involve a cache as the
choice of filesystem backing the cache should have no bearing on the result.

Generally, completely cold caches shouldn't show much variation as all the
writing can be done completely asynchronously, provided the client doesn't
fill its RAM.

The interesting case is where the disk cache is warm, but the pagecache is
cold (ie: just after a reboot after filling the caches).  Here, for the two
big files case, BTRFS appears quite a bit better than Ext3, showing a 21%
reduction in time for the smaller case and a 13% reduction for the larger
case.

For the many small/medium files case, BTRFS performed significantly better
(15% reduction in time) in the case where the caches were completely cold.
I'm not sure why, though - perhaps because it doesn't execute a write_begin()
stage during the write_one_page() call and thus doesn't go allocating disk
blocks to back the data, but instead allocates them later.

More surprising is that BTRFS performed significantly worse (15% increase in
time) in the case where the cache on disk was fully populated and then the
machine had been rebooted to clear the pagecaches.

It's important to note that I've only run each test once apiece, so the
numbers should be taken with a modicum of salt (bad statistics and all that).

David
---
===
FEW BIG FILES TEST ON BTRFS
===

Completely cold caches:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real0m2.124s
user0m0.000s
sys 0m1.260s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real0m4.538s
user0m0.000s
sys 0m2.624s

Warm NFS pagecache:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real0m0.061s
user0m0.000s
sys 0m0.064s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real0m0.118s
user0m0.000s
sys 0m0.116s

Warm BTRFS pagecache, cold NFS pagecache:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real0m0.189s
user0m0.000s
sys 0m0.188s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real0m0.369s
user0m0.000s
sys 0m0.368s

Warm on-disk cache, cold pagecaches:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real0m1.540s
user0m0.000s
sys 0m1.440s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real0m3.132s
user0m0.000s
sys 0m1.724s



MANY SMALL/MEDIUM FILE READING TEST ON BTRFS


Completely cold caches:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real0m31.838s
user0m0.192s
sys 0m6.076s

Warm NFS pagecache:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real0m14.841s
user0m0.148s
sys 0m4.988s

Warm BTRFS pagecache, cold NFS pagecache:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real0m16.773s
user0m0.148s
sys 0m5.512s

Warm on-disk cache, cold pagecaches:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real2m12.527s
user0m0.080s
sys 0m2.908s

-
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 00/37] Permit filesystem local caching

2008-02-21 Thread Muntz, Daniel
Well, the AFS paper that was referenced earlier was written around the
time of 10bt and 100bt.  Local disk caching worked well then.  There
should also be some papers at CITI about disk caching over slower
connections, and disconnected operation (which should still be
applicable today).  There are still winners from local disk caching, but
their numbers have been reduced.  Server load reduction should be a win.
I'm not sure if it's worth it from a security/manageability standpoint,
but I haven't looked that closely at David's code.

  -Dan

-Original Message-
From: Daniel Phillips [mailto:[EMAIL PROTECTED] 
Sent: Thursday, February 21, 2008 2:44 PM
To: David Howells
Cc: Myklebust, Trond; [EMAIL PROTECTED]; [EMAIL PROTECTED];
[EMAIL PROTECTED]; linux-security-module@vger.kernel.org;
[EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: [PATCH 00/37] Permit filesystem local caching

Hi David,

I am trying to spot the numbers that show the sweet spot for this
optimization, without much success so far.

Who is supposed to win big?  Is this mainly about reducing the load on
the server, or is the client supposed to win even with a lightly loaded
server?

When you say Ext3 cache vs NFS cache is the first on the server and the
second on the client?

Regards,

Daniel
___
NFSv4 mailing list
[EMAIL PROTECTED]
http://linux-nfs.org/cgi-bin/mailman/listinfo/nfsv4
-
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/37] Permit filesystem local caching

2008-02-21 Thread Daniel Phillips
Hi David,

I am trying to spot the numbers that show the sweet spot for this 
optimization, without much success so far.

Who is supposed to win big?  Is this mainly about reducing the load on 
the server, or is the client supposed to win even with a lightly loaded 
server?

When you say Ext3 cache vs NFS cache is the first on the server and the 
second on the client?

Regards,

Daniel
-
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/37] Permit filesystem local caching

2008-02-21 Thread Kevin Coffman
On Thu, Feb 21, 2008 at 9:55 AM, David Howells <[EMAIL PROTECTED]> wrote:
> Daniel Phillips <[EMAIL PROTECTED]> wrote:
>
>
> > Have you got before/after benchmark results?
>
>  See attached.
>
>  These show a couple of things:
>
>   (1) Dealing with lots of metadata slows things down a lot.  Note the result 
> of
>  looking and reading lots of small files with tar (the last result).  The
>  NFS client has to both consult the NFS server *and* the cache.  Not only
>  that, but any asynchronicity the cache may like to do is rendered
>  ineffective by the fact tar wants to do a read on a file pretty much
>  directly after opening it.
>
>   (2) Getting metadata from the local disk fs is slower than pulling it across
>  an unshared gigabit ethernet from a server that already has it in memory.

Hi David,

Your results remind me of this in case you're interested...

http://www.citi.umich.edu/techreports/reports/citi-tr-92-3.pdf
-
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/37] Permit filesystem local caching

2008-02-21 Thread David Howells
Daniel Phillips <[EMAIL PROTECTED]> wrote:

> Have you got before/after benchmark results?

See attached.

These show a couple of things:

 (1) Dealing with lots of metadata slows things down a lot.  Note the result of
 looking and reading lots of small files with tar (the last result).  The
 NFS client has to both consult the NFS server *and* the cache.  Not only
 that, but any asynchronicity the cache may like to do is rendered
 ineffective by the fact tar wants to do a read on a file pretty much
 directly after opening it.

 (2) Getting metadata from the local disk fs is slower than pulling it across
 an unshared gigabit ethernet from a server that already has it in memory.

These points don't mean that fscache is no use, just that you have to consider
carefully whether it's of use to *you* given your particular situation, and
that depends on various factors.

Note that currently FS-Caching is disabled for individual NFS files opened for
writing as there's no way to handle the coherency problems thereby introduced.

David
---

  ===
  FS-CACHE FOR NFS BENCHMARKS
  ===

 (*) The NFS client has a 1.86GHz Core2 Duo CPU and 1GB of RAM.

 (*) The NFS client has a Seagate ST380211AS 80GB 7200rpm SATA disk on an
 interface running in AHCI mode.  The chipset is an Intel G965.

 (*) A partition of approx 4.5GB is committed to caching, and is formatted as
 Ext3 with a blocksize of 4096 and directory indices.

 (*) The NFS client is using SELinux.

 (*) The NFS server is running an in-kernel NFSd, and has a 2.66GHz Core2 Duo
 CPU and 6GB of RAM.  The chipset is an Intel P965.

 (*) The NFS client is connected to the NFS server by Gigabit Ethernet.

 (*) The NFS mount is made with defaults for all options not relating to the
 cache:

warthog:/warthog /warthog nfs
rw,vers=3,rsize=1048576,wsize=1048576,hard,proto=tcp,timeo=600,
retrans=2,sec=sys,fsc,addr=w.x.y.z 0 0


==
FEW BIG FILES TEST
==

Where:

 (*) The NFS server has two files:

[EMAIL PROTECTED] ~]# ls -l /warthog/bigfile
-rw-rw-r-- 1 4043 4043 104857600 2006-11-30 09:39 /warthog/bigfile
[EMAIL PROTECTED] ~]# ls -l /warthog/biggerfile 
-rw-rw-r-- 1 4043 4041 209715200 2006-03-21 13:56 /warthog/biggerfile

 Both of which are in memory on the server in all cases.


No patches, cold NFS cache:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real0m1.909s
user0m0.000s
sys 0m0.520s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real0m3.750s
user0m0.000s
sys 0m0.904s

CONFIG_FSCACHE=n, cold NFS cache:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real0m2.003s
user0m0.000s
sys 0m0.124s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real0m4.100s
user0m0.004s
sys 0m0.488s

Cold NFS cache, no disk cache:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real0m2.084s
user0m0.000s
sys 0m0.136s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real0m4.020s
user0m0.000s
sys 0m0.720s

Completely cold caches:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real0m2.412s
user0m0.000s
sys 0m0.892s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real0m4.449s
user0m0.000s
sys 0m2.300s

Warm NFS pagecache:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real0m0.067s
user0m0.000s
sys 0m0.064s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real0m0.133s
user0m0.000s
sys 0m0.136s

Warm Ext3 pagecache, cold NFS pagecache:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real0m0.173s
user0m0.000s
sys 0m0.172s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real0m0.316s
user0m0.000s
sys 0m0.316s

Warm on-disk cache, cold pagecaches:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real0m1.955s
user0m0.000s
sys 0m0.244s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real0m3.596s
user0m0.000s
sys 0m0.460s


===
MANY SMALL/MEDIUM FILE READING TEST
===

Where:

 (*) The NFS server has an old kernel tree:

[EMAIL PROTECTED] ~]# du -s /warthog/aaa
347340  /warthog/aaa
 

Re: [PATCH 00/37] Permit filesystem local caching

2008-02-21 Thread David Howells
Daniel Phillips <[EMAIL PROTECTED]> wrote:

> > These patches add local caching for network filesystems such as NFS.
> 
> Have you got before/after benchmark results?

I need to get a new hard drive for my test machine before I can go and get
some more up to date benchmark results.  It does seem, however, that the I/O
error handling capabilities of FS-Cache work properly:-)

David
-
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/37] Permit filesystem local caching

2008-02-20 Thread Daniel Phillips
Hi David,

On Wednesday 20 February 2008 08:05, David Howells wrote:
> These patches add local caching for network filesystems such as NFS.

Have you got before/after benchmark results?

Regards,

Daniel
-
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/37] Permit filesystem local caching

2008-02-20 Thread James Carter
On Wed, 2008-02-20 at 13:58 -0600, Serge E. Hallyn wrote:
> 
> Seems *really* weird that every time you send this, patch 6 doesn't seem
> to reach me in any of my mailboxes...  (did get it from the url
> you listed)

That's because patch #6 is 169K.  You also don't get patch #20 which is
140K and patch #14 which is 237K.  

For the SELinux list, messages longer than 100K need to be approved.
Since David can't break his patches into smaller sizes and since he
provides a link to get the whole thing anyway, I don't bother approving
the large patches.  

The LSM list appears to have the same policy.  There are no messages
larger than 100K.

-- 
James Carter <[EMAIL PROTECTED]>
National Security Agency

-
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/37] Permit filesystem local caching

2008-02-20 Thread David Howells
Serge E. Hallyn <[EMAIL PROTECTED]> wrote:

> Seems *really* weird that every time you send this, patch 6 doesn't seem
> to reach me in any of my mailboxes...  (did get it from the url
> you listed)

It's the largest of the patches, so that's not entirely surprising.  Hence why
I included the URL to the tarball also.

> I'm sorry if I miss where you explicitly state this, but is it safe to
> assume, as perusing the patches suggests, that
> 
>   1. tsk->sec never changes other than in task_alloc_security()?  

Correct.

>   2. tsk->act_as is only ever dereferenced from (a) current->

That ought to be correct.

>  except (b) in do_coredump?

Actually, do_coredump() only deals with current->act_as.

> (thereby carefully avoiding locking issues)

That's the idea.

> I'd still like to see some performance numbers.  Not to object to
> these patches, just to make sure there's no need to try and optimize
> more of the dereferences away when they're not needed.

I hope that the performance impact is minimal.  The kernel should spend very
little time looking at the security data.  I'll try and get some though.

> Oh, manually copied from patch 6, I see you have in the task_security
> struct definition:
> 
>   kernel_cap_tcap_bset;   /* ? */
> 
> That comment can be filled in with 'capability bounding set' (for this
> task and all its future descendents).

Thanks.

David
-
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/37] Permit filesystem local caching

2008-02-20 Thread Serge E. Hallyn
Quoting David Howells ([EMAIL PROTECTED]):
> 
> 
> These patches add local caching for network filesystems such as NFS.
> 
> The patches can roughly be broken down into a number of sets:
> 
>   (*) 01-keys-inc-payload.diff
>   (*) 02-keys-search-keyring.diff
>   (*) 03-keys-callout-blob.diff
> 
>   Three patches to the keyring code made to help the CIFS people.
>   Included because of patches 05-08.
> 
>   (*) 04-keys-get-label.diff
> 
>   A patch to allow the security label of a key to be retrieved.
>   Included because of patches 05-08.
> 
>   (*) 05-security-current-fsugid.diff
>   (*) 06-security-separate-task-bits.diff

Seems *really* weird that every time you send this, patch 6 doesn't seem
to reach me in any of my mailboxes...  (did get it from the url
you listed)

I'm sorry if I miss where you explicitly state this, but is it safe to
assume, as perusing the patches suggests, that

1. tsk->sec never changes other than in task_alloc_security()?  

2. tsk->act_as is only ever dereferenced from (a) current->
   except (b) in do_coredump?

(thereby carefully avoiding locking issues)

I'd still like to see some performance numbers.  Not to object to
these patches, just to make sure there's no need to try and optimize
more of the dereferences away when they're not needed.

Oh, manually copied from patch 6, I see you have in the task_security
struct definition:

kernel_cap_tcap_bset;   /* ? */

That comment can be filled in with 'capability bounding set' (for this
task and all its future descendents).

thanks,
-serge

>   (*) 07-security-subjective.diff
>   (*) 08-security-kernel_service-class.diff
>   (*) 09-security-kernel-service.diff
>   (*) 10-security-nfsd.diff
> 
>   Patches to permit the subjective security of a task to be overridden.
>   All the security details in task_struct are decanted into a new struct
>   that task_struct then has two pointers two: one that defines the
>   objective security of that task (how other tasks may affect it) and one
>   that defines the subjective security (how it may affect other objects).
> 
>   Note that I have dropped the idea of struct cred for the moment.  With
>   the amount of stuff that was excluded from it, it wasn't actually any
>   use to me.  However, it can be added later.
> 
>   Required for cachefiles.
> 
>   (*) 11-release-page.diff
>   (*) 12-fscache-page-flags.diff
>   (*) 13-add_wait_queue_tail.diff
>   (*) 14-fscache.diff
> 
>   Patches to provide a local caching facility for network filesystems.
> 
>   (*) 15-cachefiles-ia64.diff
>   (*) 16-cachefiles-ext3-f_mapping.diff
>   (*) 17-cachefiles-write.diff
>   (*) 18-cachefiles-monitor.diff
>   (*) 19-cachefiles-export.diff
>   (*) 20-cachefiles.diff
> 
>   Patches to provide a local cache in a directory of an already mounted
>   filesystem.
> 
>   (*) 21-nfs-comment.diff
>   (*) 22-nfs-fscache-option.diff
>   (*) 23-nfs-fscache-kconfig.diff
>   (*) 24-nfs-fscache-top-index.diff
>   (*) 25-nfs-fscache-server-obj.diff
>   (*) 26-nfs-fscache-super-obj.diff
>   (*) 27-nfs-fscache-inode-obj.diff
>   (*) 28-nfs-fscache-use-inode.diff
>   (*) 29-nfs-fscache-invalidate-pages.diff
>   (*) 30-nfs-fscache-iostats.diff
>   (*) 31-nfs-fscache-page-management.diff
>   (*) 32-nfs-fscache-read-context.diff
>   (*) 33-nfs-fscache-read-fallback.diff
>   (*) 34-nfs-fscache-read-from-cache.diff
>   (*) 35-nfs-fscache-store-to-cache.diff
>   (*) 36-nfs-fscache-mount.diff
>   (*) 37-nfs-fscache-display.diff
> 
>   Patches to provide NFS with local caching.
> 
>   A couple of questions on the NFS iostat changes: (1) Should I update the
>   iostat version number; (2) is it permitted to have conditional iostats?
> 
> 
> I've brought the patchset up to date with respect to the 2.6.25-rc1 merge
> window, in particular altering Smack to handle the split in objective and
> subjective security in the task_struct.
> 
> --
> A tarball of the patches is available at:
> 
>   
> http://people.redhat.com/~dhowells/fscache/patches/nfs+fscache-30.tar.bz2
> 
> 
> To use this version of CacheFiles, the cachefilesd-0.9 is also required.  It
> is available as an SRPM:
> 
>   http://people.redhat.com/~dhowells/fscache/cachefilesd-0.9-1.fc7.src.rpm
> 
> Or as individual bits:
> 
>   http://people.redhat.com/~dhowells/fscache/cachefilesd-0.9.tar.bz2
>   http://people.redhat.com/~dhowells/fscache/cachefilesd.fc
>   http://people.redhat.com/~dhowells/fscache/cachefilesd.if
>   http://people.redhat.com/~dhowells/fscache/cachefilesd.te
>   http://people.redhat.com/~dhowells/fscache/cachefilesd.spec
> 
> The .fc, .if and .te files are for manipulating SELinux.
> 
> David
> -
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
-
To unsubscribe from this list: send t

[PATCH 00/37] Permit filesystem local caching

2008-02-20 Thread David Howells


These patches add local caching for network filesystems such as NFS.

The patches can roughly be broken down into a number of sets:

  (*) 01-keys-inc-payload.diff
  (*) 02-keys-search-keyring.diff
  (*) 03-keys-callout-blob.diff

  Three patches to the keyring code made to help the CIFS people.
  Included because of patches 05-08.

  (*) 04-keys-get-label.diff

  A patch to allow the security label of a key to be retrieved.
  Included because of patches 05-08.

  (*) 05-security-current-fsugid.diff
  (*) 06-security-separate-task-bits.diff
  (*) 07-security-subjective.diff
  (*) 08-security-kernel_service-class.diff
  (*) 09-security-kernel-service.diff
  (*) 10-security-nfsd.diff

  Patches to permit the subjective security of a task to be overridden.
  All the security details in task_struct are decanted into a new struct
  that task_struct then has two pointers two: one that defines the
  objective security of that task (how other tasks may affect it) and one
  that defines the subjective security (how it may affect other objects).

  Note that I have dropped the idea of struct cred for the moment.  With
  the amount of stuff that was excluded from it, it wasn't actually any
  use to me.  However, it can be added later.

  Required for cachefiles.

  (*) 11-release-page.diff
  (*) 12-fscache-page-flags.diff
  (*) 13-add_wait_queue_tail.diff
  (*) 14-fscache.diff

  Patches to provide a local caching facility for network filesystems.

  (*) 15-cachefiles-ia64.diff
  (*) 16-cachefiles-ext3-f_mapping.diff
  (*) 17-cachefiles-write.diff
  (*) 18-cachefiles-monitor.diff
  (*) 19-cachefiles-export.diff
  (*) 20-cachefiles.diff

  Patches to provide a local cache in a directory of an already mounted
  filesystem.

  (*) 21-nfs-comment.diff
  (*) 22-nfs-fscache-option.diff
  (*) 23-nfs-fscache-kconfig.diff
  (*) 24-nfs-fscache-top-index.diff
  (*) 25-nfs-fscache-server-obj.diff
  (*) 26-nfs-fscache-super-obj.diff
  (*) 27-nfs-fscache-inode-obj.diff
  (*) 28-nfs-fscache-use-inode.diff
  (*) 29-nfs-fscache-invalidate-pages.diff
  (*) 30-nfs-fscache-iostats.diff
  (*) 31-nfs-fscache-page-management.diff
  (*) 32-nfs-fscache-read-context.diff
  (*) 33-nfs-fscache-read-fallback.diff
  (*) 34-nfs-fscache-read-from-cache.diff
  (*) 35-nfs-fscache-store-to-cache.diff
  (*) 36-nfs-fscache-mount.diff
  (*) 37-nfs-fscache-display.diff

  Patches to provide NFS with local caching.

  A couple of questions on the NFS iostat changes: (1) Should I update the
  iostat version number; (2) is it permitted to have conditional iostats?


I've brought the patchset up to date with respect to the 2.6.25-rc1 merge
window, in particular altering Smack to handle the split in objective and
subjective security in the task_struct.

--
A tarball of the patches is available at:


http://people.redhat.com/~dhowells/fscache/patches/nfs+fscache-30.tar.bz2


To use this version of CacheFiles, the cachefilesd-0.9 is also required.  It
is available as an SRPM:

http://people.redhat.com/~dhowells/fscache/cachefilesd-0.9-1.fc7.src.rpm

Or as individual bits:

http://people.redhat.com/~dhowells/fscache/cachefilesd-0.9.tar.bz2
http://people.redhat.com/~dhowells/fscache/cachefilesd.fc
http://people.redhat.com/~dhowells/fscache/cachefilesd.if
http://people.redhat.com/~dhowells/fscache/cachefilesd.te
http://people.redhat.com/~dhowells/fscache/cachefilesd.spec

The .fc, .if and .te files are for manipulating SELinux.

David
-
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 00/37] Permit filesystem local caching

2008-02-08 Thread David Howells


These patches add local caching for network filesystems such as NFS.

The patches can roughly be broken down into a number of sets:

  (*) 01-keys-inc-payload.diff
  (*) 02-keys-search-keyring.diff
  (*) 03-keys-callout-blob.diff

  Three patches to the keyring code made to help the CIFS people.
  Included because of patches 05-08.

  (*) 04-keys-get-label.diff

  A patch to allow the security label of a key to be retrieved.
  Included because of patches 05-08.

  (*) 05-security-current-fsugid.diff
  (*) 06-security-separate-task-bits.diff
  (*) 07-security-subjective.diff
  (*) 08-security-kernel_service-class.diff
  (*) 09-security-kernel-service.diff
  (*) 10-security-nfsd.diff

  Patches to permit the subjective security of a task to be overridden.
  All the security details in task_struct are decanted into a new struct
  that task_struct then has two pointers two: one that defines the
  objective security of that task (how other tasks may affect it) and one
  that defines the subjective security (how it may affect other objects).

  Note that I have dropped the idea of struct cred for the moment.  With
  the amount of stuff that was excluded from it, it wasn't actually any
  use to me.  However, it can be added later.

  Required for cachefiles.

  (*) 11-release-page.diff
  (*) 12-fscache-page-flags.diff
  (*) 13-add_wait_queue_tail.diff
  (*) 14-fscache.diff

  Patches to provide a local caching facility for network filesystems.

  (*) 15-cachefiles-ia64.diff
  (*) 16-cachefiles-ext3-f_mapping.diff
  (*) 17-cachefiles-write.diff
  (*) 18-cachefiles-monitor.diff
  (*) 19-cachefiles-export.diff
  (*) 20-cachefiles.diff

  Patches to provide a local cache in a directory of an already mounted
  filesystem.

  (*) 21-nfs-comment.diff
  (*) 22-nfs-fscache-option.diff
  (*) 23-nfs-fscache-kconfig.diff
  (*) 24-nfs-fscache-top-index.diff
  (*) 25-nfs-fscache-server-obj.diff
  (*) 26-nfs-fscache-super-obj.diff
  (*) 27-nfs-fscache-inode-obj.diff
  (*) 28-nfs-fscache-use-inode.diff
  (*) 29-nfs-fscache-invalidate-pages.diff
  (*) 30-nfs-fscache-iostats.diff
  (*) 31-nfs-fscache-page-management.diff
  (*) 32-nfs-fscache-read-context.diff
  (*) 33-nfs-fscache-read-fallback.diff
  (*) 34-nfs-fscache-read-from-cache.diff
  (*) 35-nfs-fscache-store-to-cache.diff
  (*) 36-nfs-fscache-mount.diff
  (*) 37-nfs-fscache-display.diff

  Patches to provide NFS with local caching.

  A couple of questions on the NFS iostat changes: (1) Should I update the
  iostat version number; (2) is it permitted to have conditional iostats?


I've massively split up the NFS patches as requested by Trond Myklebust and
Chuck Lever.  I've also brought the patches up to date with the patch window
turbulence.

--
A tarball of the patches is available at:


http://people.redhat.com/~dhowells/fscache/patches/nfs+fscache-29.tar.bz2


To use this version of CacheFiles, the cachefilesd-0.9 is also required.  It
is available as an SRPM:

http://people.redhat.com/~dhowells/fscache/cachefilesd-0.9-1.fc7.src.rpm

Or as individual bits:

http://people.redhat.com/~dhowells/fscache/cachefilesd-0.9.tar.bz2
http://people.redhat.com/~dhowells/fscache/cachefilesd.fc
http://people.redhat.com/~dhowells/fscache/cachefilesd.if
http://people.redhat.com/~dhowells/fscache/cachefilesd.te
http://people.redhat.com/~dhowells/fscache/cachefilesd.spec

The .fc, .if and .te files are for manipulating SELinux.

David
-
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html