Re: [PATCH 01/14] GFS: headers

2005-09-01 Thread Jörn Engel
On Thu, 1 September 2005 22:59:48 +0800, David Teigland wrote: > > We offered to removed this when I explained it before. It sounds like it > would give you some comfort so I'll just go ahead and do it barring any > pleas otherwise. Please do. Just have one test machine with an endianness diffe

Re: GFS, what's remaining

2005-09-01 Thread Lars Marowsky-Bree
On 2005-09-01T16:28:30, Alan Cox <[EMAIL PROTECTED]> wrote: > Competition will decide if OCFS or GFS is better, or indeed if someone > comes along with another contender that is better still. And competition > will probably get the answer right. Competition will come up with the

Re: GFS, what's remaining

2005-09-01 Thread Alan Cox
> That's GFS. The submission is about a GFS2 that's on-disk incompatible > to GFS. Just like say reiserfs3 and reiserfs4 or ext and ext2 or ext2 and ext3 then. I think the main point still stands - we have always taken multiple file systems on board and we have benefitted enormo

Re: [PATCH 01/14] GFS: headers

2005-09-01 Thread David Teigland
On Thu, Sep 01, 2005 at 04:19:34PM +0200, Arjan van de Ven wrote: > > +/* Endian functions */ > > e again why?? > Why is this a compiletime hack? > Either you care about either-endian on disk, at which point it has to be > a runtime thing, or you make the on disk layout fixed endian, at whic

[PATCH 03/13] GFS: directories

2005-09-01 Thread David Teigland
Code that handles directory operations. Signed-off-by: Ken Preslan <[EMAIL PROTECTED]> Signed-off-by: David Teigland <[EMAIL PROTECTED]> --- fs/gfs2/dir.c | 2158 ++ fs/gfs2/dir.h | 51 + 2 files changed, 2209 insertions(+) --- a/fs/gfs

Re: [PATCH 01/14] GFS: headers

2005-09-01 Thread viro
On Thu, Sep 01, 2005 at 04:19:34PM +0200, Arjan van de Ven wrote: > > +/* Endian functions */ > > e again why?? > Why is this a compiletime hack? > Either you care about either-endian on disk, at which point it has to be > a runtime thing, or you make the on disk layout fixed endian, at which

Re: GFS, what's remaining

2005-09-01 Thread Alan Cox
On Iau, 2005-09-01 at 03:59 -0700, Andrew Morton wrote: > - Why the kernel needs two clustered fileystems So delete reiserfs4, FAT, VFAT, ext2, and all the other "junk". > - Why GFS is better than OCFS2, or has functionality which OCFS2 cannot > possibly gain (or vice ver

[PATCH 04/13] GFS: allocation

2005-09-01 Thread David Teigland
Code that manages block allocation. Signed-off-by: Ken Preslan <[EMAIL PROTECTED]> Signed-off-by: David Teigland <[EMAIL PROTECTED]> --- fs/gfs2/bits.c | 179 +++ fs/gfs2/bits.h | 28 + fs/gfs2/rgrp.c | 1374 + fs/gfs2/rgrp.h | 62

Re: GFS, what's remaining

2005-09-01 Thread Christoph Hellwig
On Thu, Sep 01, 2005 at 03:49:18PM +0100, Alan Cox wrote: > > - Why GFS is better than OCFS2, or has functionality which OCFS2 cannot > > possibly gain (or vice versa) > > > > - Relative merits of the two offerings > > You missed the important one - people act

Re: [PATCH 01/14] GFS: headers

2005-09-01 Thread Arjan van de Ven
> +#ifndef TRUE > +#define TRUE 1 > +#endif > + > +#ifndef FALSE > +#define FALSE 0 > +#endif eh why can't you just use the regular kernel conventions > + > +#define NO_CREATE 0 > +#define CREATE 1 > + > +#define NO_WAIT 0 > +#define WAIT 1 > + > +#define NO_FORCE 0 > +#define FORCE 1 these de

[PATCH 01/14] GFS: headers

2005-09-01 Thread David Teigland
Central header files that are widely used. Signed-off-by: Ken Preslan <[EMAIL PROTECTED]> Signed-off-by: David Teigland <[EMAIL PROTECTED]> --- fs/gfs2/gfs2.h | 77 +++ fs/gfs2/incore.h| 691 +++ include/linux/gfs2_ioctl.h | 30 + include/l

[PATCH 05/13] GFS: ea and acl

2005-09-01 Thread David Teigland
Code that handles extended attributes and ACL's. Signed-off-by: Ken Preslan <[EMAIL PROTECTED]> Signed-off-by: David Teigland <[EMAIL PROTECTED]> --- fs/gfs2/acl.c | 313 ++ fs/gfs2/acl.h | 37 + fs/gfs2/eaops.c | 179 ++ fs/gfs2/eaops.h | 30 + fs/gfs2/eattr.c | 1621

[PATCH 08/13] GFS: mount and tuning options

2005-09-01 Thread David Teigland
There are a variety of mount options, tunable parameters, internal statistics, and methods of online file system manipulation. Signed-off-by: Ken Preslan <[EMAIL PROTECTED]> Signed-off-by: David Teigland <[EMAIL PROTECTED]> --- fs/gfs2/ioctl.c | 1485 +++

[PATCH 10/13] GFS: build and documentation

2005-09-01 Thread David Teigland
Add gfs to the build system and gfs2.txt to Documentation. Signed-off-by: Ken Preslan <[EMAIL PROTECTED]> Signed-off-by: David Teigland <[EMAIL PROTECTED]> --- Documentation/filesystems/gfs2.txt | 194 + fs/Kconfig |

[PATCH 07/13] GFS: quotas

2005-09-01 Thread David Teigland
Code that deals with quotas. Signed-off-by: Ken Preslan <[EMAIL PROTECTED]> Signed-off-by: David Teigland <[EMAIL PROTECTED]> --- fs/gfs2/lvb.c | 61 ++ fs/gfs2/lvb.h | 28 + fs/gfs2/quota.c | 1209 fs/gfs2/quota.h | 34 + 4 fil

[PATCH 06/13] GFS: logging and recovery

2005-09-01 Thread David Teigland
A per-node on-disk log is used for recovery. Signed-off-by: Ken Preslan <[EMAIL PROTECTED]> Signed-off-by: David Teigland <[EMAIL PROTECTED]> --- fs/gfs2/log.c | 670 + fs/gfs2/log.h | 68 + fs/gfs2/recovery.c | 561 +

[PATCH 11/13] GFS: lock_harness module

2005-09-01 Thread David Teigland
The lock_harness module allows a gfs file system to connect to a given lock module. Signed-off-by: Ken Preslan <[EMAIL PROTECTED]> Signed-off-by: David Teigland <[EMAIL PROTECTED]> --- fs/gfs2/locking/harness/Makefile |3 fs/gfs2/locking/harness/lm_interf

[PATCH 13/13] GFS: lock_dlm module

2005-09-01 Thread David Teigland
m_ls *ls = lp->ls; + + spin_lock(&ls->async_lock); + list_add_tail(&lp->delay_list, &ls->delayed); + spin_unlock(&ls->async_lock); +} + +/* convert gfs lock-state to dlm lock-mode */ + +static int16_t make_mode(int16_t lmstate) +{ + switch (lmstate) { +

[PATCH 12/13] GFS: lock_nolock module

2005-09-01 Thread David Teigland
The lock_nolock module does no inter-node locking and allows gfs to be used as a local file system. Signed-off-by: Ken Preslan <[EMAIL PROTECTED]> Signed-off-by: David Teigland <[EMAIL PROTECTED]> --- fs/gfs2/locking/nolock/Makefile |3 fs/gfs2/locking/nolock/mai

Re: GFS, what's remaining

2005-09-01 Thread Pekka Enberg
On 9/1/05, David Teigland <[EMAIL PROTECTED]> wrote: > - Adapt the vfs so gfs (and other cfs's) don't need to walk vma lists. > [cf. ops_file.c:walk_vm(), gfs works fine as is, but some don't like it.] It works fine only if you don't care about playing well

Re: GFS, what's remaining

2005-09-01 Thread Arjan van de Ven
On Thu, 2005-09-01 at 18:46 +0800, David Teigland wrote: > Hi, this is the latest set of gfs patches, it includes some minor munging > since the previous set. Andrew, could this be added to -mm? there's not > much in the way of pending changes. > > http://redhat.com/~teiglan

Re: GFS, what's remaining

2005-09-01 Thread Andrew Morton
David Teigland <[EMAIL PROTECTED]> wrote: > > Hi, this is the latest set of gfs patches, it includes some minor munging > since the previous set. Andrew, could this be added to -mm? Dumb question: why? Maybe I was asleep, but I don't recall seeing much discussion or expo

Re: GFS, what's remaining

2005-09-01 Thread Arjan van de Ven
On Thu, 2005-09-01 at 18:46 +0800, David Teigland wrote: > Hi, this is the latest set of gfs patches, it includes some minor munging > since the previous set. Andrew, could this be added to -mm? there's not > much in the way of pending changes. can you post them here instead so th

GFS, what's remaining

2005-09-01 Thread David Teigland
Hi, this is the latest set of gfs patches, it includes some minor munging since the previous set. Andrew, could this be added to -mm? there's not much in the way of pending changes. http://redhat.com/~teigland/gfs2/20050901/gfs2-full.patch http://redhat.com/~teigland/gfs2/20050901/broke

Re: GFS

2005-08-11 Thread Pekka Enberg
On Thu, 2005-08-11 at 09:33 -0700, Zach Brown wrote: > I don't think this patch is the way to go at all. It imposes an > allocation and vma walking overhead for the vast majority of IOs that > aren't interested. It doesn't look like it will get a consistent > ordering when multiple file systems a

Re: GFS

2005-08-11 Thread Zach Brown
> That doesn't matter. Please don't put in any effort for lustre special > cases - they are unwilling to cooperate and they'll get what they deserve. Sure, we can add that extra functional layer in another pass. I thought I'd still bring it up, though, as OCFS2 is slated to care at some point i

Re: GFS

2005-08-11 Thread Christoph Hellwig
On Thu, Aug 11, 2005 at 09:33:41AM -0700, Zach Brown wrote: > ordering when multiple file systems are concerned. It doesn't record > the ranges of the mappings involved so Lustre can't properly use its > range locks. That doesn't matter. Please don't put in any effort for lustre special cases -

Re: GFS

2005-08-11 Thread Zach Brown
> What I meant was that, if a filesystem requires vma walks, we need to do > it VFS level with something like the following patch. I don't think this patch is the way to go at all. It imposes an allocation and vma walking overhead for the vast majority of IOs that aren't interested. It doesn't

Re: GFS - updated patches

2005-08-11 Thread Pekka Enberg
Hi, On 8/11/05, David Teigland <[EMAIL PROTECTED]> wrote: > The large majority, and I think all that people care about. If we ignored > something that someone thinks is important, a reminder would be useful. The only remaining issue for me is the vma walk. Thanks, David!

Re: [Linux-cluster] GFS - updated patches

2005-08-11 Thread Pekka Enberg
On 8/11/05, Michael <[EMAIL PROTECTED]> wrote: > Hi, Dave, > > I quickly applied gfs2 and dlm patches in kernel 2.6.12.2, it passed > compiling but has some warning log, see attachment. maybe helpful to > you. kzalloc is not in Linus' tree yet. Try with 2.6.13-rc5-mm1.

Re: [Linux-cluster] GFS - updated patches

2005-08-11 Thread Michael
Hi, Dave, I quickly applied gfs2 and dlm patches in kernel 2.6.12.2, it passed compiling but has some warning log, see attachment. maybe helpful to you. Thanks, Michael On 8/11/05, David Teigland <[EMAIL PROTECTED]> wrote: > Thanks for all the review and comments. This is a new set of patches

Re: GFS - updated patches

2005-08-11 Thread David Teigland
On Thu, Aug 11, 2005 at 10:50:32AM +0200, Arjan van de Ven wrote: > > > > Thanks for all the review and comments. This is a new set of > > > > patches that incorporates the suggestions we've received. > > > > > > all of them or only a subset? > > with them I meant the suggestions not the patche

Re: GFS - updated patches

2005-08-11 Thread Arjan van de Ven
On Thu, 2005-08-11 at 16:50 +0800, David Teigland wrote: > On Thu, Aug 11, 2005 at 10:32:38AM +0200, Arjan van de Ven wrote: > > On Thu, 2005-08-11 at 16:17 +0800, David Teigland wrote: > > > Thanks for all the review and comments. This is a new set of patches that > > > incorporates the suggestio

Re: GFS - updated patches

2005-08-11 Thread Michael
yes, after apply dlm.patch, I saw it! although I don't know what's "-mm". Thanks, Michael On 8/11/05, David Teigland <[EMAIL PROTECTED]> wrote: > On Thu, Aug 11, 2005 at 04:21:04PM +0800, Michael wrote: > > I have the same question as I asked before, how ca

Re: GFS - updated patches

2005-08-11 Thread David Teigland
On Thu, Aug 11, 2005 at 10:32:38AM +0200, Arjan van de Ven wrote: > On Thu, 2005-08-11 at 16:17 +0800, David Teigland wrote: > > Thanks for all the review and comments. This is a new set of patches that > > incorporates the suggestions we've received. > > all of them or only a subset? All patche

Re: GFS - updated patches

2005-08-11 Thread David Teigland
On Thu, Aug 11, 2005 at 04:21:04PM +0800, Michael wrote: > I have the same question as I asked before, how can I see GFS in "make > menuconfig", after I patch gfs2-full.patch into a 2.6.12.2 kernel? You need to select the dlm under drivers. It's in -mm, or apply http:/

Re: GFS - updated patches

2005-08-11 Thread Arjan van de Ven
On Thu, 2005-08-11 at 16:17 +0800, David Teigland wrote: > Thanks for all the review and comments. This is a new set of patches that > incorporates the suggestions we've received. all of them or only a subset? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body

Re: [Linux-cluster] GFS - updated patches

2005-08-11 Thread Michael
I have the same question as I asked before, how can I see GFS in "make menuconfig", after I patch gfs2-full.patch into a 2.6.12.2 kernel? Michael On 8/11/05, David Teigland <[EMAIL PROTECTED]> wrote: > Thanks for all the review and comments. This is a new set of patches that

GFS - updated patches

2005-08-11 Thread David Teigland
Thanks for all the review and comments. This is a new set of patches that incorporates the suggestions we've received. http://redhat.com/~teigland/gfs2/20050811/gfs2-full.patch http://redhat.com/~teigland/gfs2/20050811/broken-out/ Dave

Re: GFS

2005-08-11 Thread Pekka J Enberg
Hi Mark, On Thu, 11 Aug 2005, Pekka J Enberg wrote: > Reading and writing from other filesystems to a GFS2 mmap'd file > does not walk the vmas. Therefore, data consistency guarantees > are different: What I meant was that, if a filesystem requires vma walks, we need to do it VFS level with some

Re: [PATCH 00/14] GFS

2005-08-10 Thread Arjan van de Ven
On Thu, 2005-08-11 at 14:06 +0800, David Teigland wrote: > On Tue, Aug 02, 2005 at 09:45:24AM +0200, Arjan van de Ven wrote: > > > * + if (create) > > + down_write(&ip->i_rw_mutex); > > + else > > + down_read(&ip->i_rw_mutex); > > > > why do you use a rwsem and not a regular

Re: [PATCH 00/14] GFS

2005-08-10 Thread David Teigland
On Tue, Aug 02, 2005 at 09:45:24AM +0200, Arjan van de Ven wrote: > * + if (create) > + down_write(&ip->i_rw_mutex); > + else > + down_read(&ip->i_rw_mutex); > > why do you use a rwsem and not a regular semaphore? You are aware that > rwsems are far more expensive th

Re: GFS

2005-08-10 Thread Pekka J Enberg
Hi, On Wed, Aug 10, 2005 at 11:18:48PM +0300, Pekka J Enberg wrote: > You, however, don't maintain the same level of data consistency when reads > and writes are from other filesystems as they use ->nopage. Mark Fasheh writes: I'm not sure what you mean here... Reading and writing from oth

Re: GFS

2005-08-10 Thread Mark Fasheh
On Wed, Aug 10, 2005 at 11:18:48PM +0300, Pekka J Enberg wrote: > Aah, I see GFS2 does that too so no deadlocks here. Thanks. Yep, no problem :) > You, however, don't maintain the same level of data consistency when reads > and writes are from other filesystems as they use ->nopage. I'm not sure w

Re: GFS

2005-08-10 Thread Pekka J Enberg
Mark Fasheh writes: Hmm, well today in OCFS2 if you're not coming from read or write, the lock is held only for the duration of ->nopage so I don't think we could get into any deadlocks for that usage. Aah, I see GFS2 does that too so no deadlocks here. Thanks. You, however, don't maintain the

Re: GFS

2005-08-10 Thread Mark Fasheh
On Wed, Aug 10, 2005 at 07:57:43PM +0300, Pekka J Enberg wrote: > Surely avoiding them is preferred but how do you do that when you have to > mmap'd regions where userspace does memcpy()? The kernel won't much saying > in it until ->nopage. We cannot grab all the required locks in proper order >

Re: GFS

2005-08-10 Thread Pekka J Enberg
Hi Mark, Mark Fasheh writes: This may sound naive, but so far OCFS2 has avoided the nead for deadlock detection... I'd hate to have to add it now -- better to try avoiding them in the first place. Surely avoiding them is preferred but how do you do that when you have to mmap'd regions where

Re: GFS

2005-08-10 Thread Mark Fasheh
On Wed, Aug 10, 2005 at 10:31:04AM +0300, Pekka J Enberg wrote: > It seems to me that the distributed locks must be acquired in ->nopage > anyway to solve the problem with memcpy() between two mmap'd regions. One > possible solution would be for the lock manager to detect deadlocks and > break s

Re: [Linux-cluster] Re: [PATCH 00/14] GFS

2005-08-10 Thread Kyle Moffett
-kernel AFS client could use it in similar fashion (It has no method to adjust hierarchy, because it's still read-only). GFS could use it for their Context Dependent Symlinks. Since it would pass the type in as well, it would be possible to use it for different kinds of links on t

Re: [Linux-cluster] Re: [PATCH 00/14] GFS

2005-08-10 Thread AJ Lewis
On Wed, Aug 10, 2005 at 12:11:10PM +0100, Christoph Hellwig wrote: > On Wed, Aug 10, 2005 at 01:09:17PM +0200, Lars Marowsky-Bree wrote: > > So for every directoy hiearchy on a shared filesystem, each user needs > > to have the complete list of bindmounts needed, and automatically resync > > that a

Re: [PATCH 00/14] GFS

2005-08-10 Thread Christoph Hellwig
On Wed, Aug 10, 2005 at 01:09:17PM +0200, Lars Marowsky-Bree wrote: > On 2005-08-10T12:05:11, Christoph Hellwig <[EMAIL PROTECTED]> wrote: > > > > What would a syntax look like which in your opinion does not remove > > > totally valid symlink targets for magic mushroom bullshit? Prefix with > > >

Re: [PATCH 00/14] GFS

2005-08-10 Thread Lars Marowsky-Bree
On 2005-08-10T12:05:11, Christoph Hellwig <[EMAIL PROTECTED]> wrote: > > What would a syntax look like which in your opinion does not remove > > totally valid symlink targets for magic mushroom bullshit? Prefix with > > // (which, according to POSIX, allows for implementation-defined > > behaviour

Re: [PATCH 00/14] GFS

2005-08-10 Thread Christoph Hellwig
On Wed, Aug 10, 2005 at 01:02:59PM +0200, Lars Marowsky-Bree wrote: > What would a syntax look like which in your opinion does not remove > totally valid symlink targets for magic mushroom bullshit? Prefix with > // (which, according to POSIX, allows for implementation-defined > behaviour)? Somethi

Re: [PATCH 00/14] GFS

2005-08-10 Thread Lars Marowsky-Bree
On 2005-08-10T11:54:50, Christoph Hellwig <[EMAIL PROTECTED]> wrote: > It works now. Unlike context link which steal totally valid symlink > targets for magic mushroom bullshit. Right, that is a valid concern. Avoiding context dependent symlinks entirely certainly is one possible path around thi

Re: [PATCH 00/14] GFS

2005-08-10 Thread Christoph Hellwig
On Wed, Aug 10, 2005 at 12:34:24PM +0200, Lars Marowsky-Bree wrote: > On 2005-08-10T11:32:56, Christoph Hellwig <[EMAIL PROTECTED]> wrote: > > > > Would a generic implementation of that higher up in the VFS be more > > > acceptable? > > No. Use mount --bind > > That's a working and less complex

Re: [PATCH 00/14] GFS

2005-08-10 Thread Lars Marowsky-Bree
On 2005-08-10T11:32:56, Christoph Hellwig <[EMAIL PROTECTED]> wrote: > > Would a generic implementation of that higher up in the VFS be more > > acceptable? > No. Use mount --bind That's a working and less complex alternative for upto how many places at once? That works for non-root users how...

Re: [PATCH 00/14] GFS

2005-08-10 Thread Christoph Hellwig
On Wed, Aug 10, 2005 at 12:30:41PM +0200, Lars Marowsky-Bree wrote: > On 2005-08-10T08:03:09, Christoph Hellwig <[EMAIL PROTECTED]> wrote: > > > > Kindly lose the "Context Dependent Pathname" crap. > > Same for ocfs2. > > Would a generic implementation of that higher up in the VFS be more > accep

Re: [PATCH 00/14] GFS

2005-08-10 Thread Lars Marowsky-Bree
On 2005-08-10T08:03:09, Christoph Hellwig <[EMAIL PROTECTED]> wrote: > > Kindly lose the "Context Dependent Pathname" crap. > Same for ocfs2. Would a generic implementation of that higher up in the VFS be more acceptable? It's not like context-dependent symlinks are an arbitary feature, but rath

Re: GFS

2005-08-10 Thread Christoph Hellwig
On Wed, Aug 10, 2005 at 10:40:37AM +0300, Pekka J Enberg wrote: > Hi David, > > >+ return -EINVAL; > >+ if (!access_ok(VERIFY_WRITE, buf, size)) > >+ return -EFAULT; > >+ > >+ if (!(file->f_flags & O_LARGEFILE)) { > >+ if (*offset >= 0x7FFFull) > >+

Re: GFS

2005-08-10 Thread Pekka J Enberg
Hi David, + return -EINVAL; + if (!access_ok(VERIFY_WRITE, buf, size)) + return -EFAULT; + + if (!(file->f_flags & O_LARGEFILE)) { + if (*offset >= 0x7FFFull) + return -EFBIG; + if (*offset + size > 0x7FFFull) +

Re: GFS

2005-08-10 Thread Pekka J Enberg
On Tue, Aug 09, 2005 at 05:49:43PM +0300, Pekka Enberg wrote: > In addition, the vma walk will become an unmaintainable mess as soon as > someone introduces another mmap() capable fs that needs similar locking. Christoph Hellwig writes: We already have OCFS2 in -mm that does similar things. I

Re: GFS

2005-08-10 Thread Christoph Hellwig
On Tue, Aug 09, 2005 at 05:49:43PM +0300, Pekka Enberg wrote: > On Mon, 2005-08-08 at 11:32 -0700, Zach Brown wrote: > > > Sorry if this is an obvious question but what prevents another thread > > > from doing mmap() before we do the second walk and messing up num_gh? > > > > Nothing, I suspect.

Re: [PATCH 00/14] GFS

2005-08-10 Thread Christoph Hellwig
On Tue, Aug 09, 2005 at 04:20:45PM +0100, Al Viro wrote: > On Tue, Aug 02, 2005 at 03:18:28PM +0800, David Teigland wrote: > > Hi, GFS (Global File System) is a cluster file system that we'd like to > > see added to the kernel. The 14 patches total about 900K so I won't

Re: GFS

2005-08-09 Thread Pekka J Enberg
David Teigland writes: if (!dumping) down_read(&mm->mmap_sem); > > + > > + for (vma = find_vma(mm, start); vma; vma = vma->vm_next) { > > + if (end <= vma->vm_start) > > + break; > > +

Re: GFS

2005-08-09 Thread David Teigland
On Mon, Aug 08, 2005 at 05:14:45PM +0300, Pekka J Enberg wrote: if (!dumping) down_read(&mm->mmap_sem); > >+ > >+ for (vma = find_vma(mm, start); vma; vma = vma->vm_next) { > >+ if (end <= vma->vm_start) > >+

Re: GFS

2005-08-09 Thread Pekka J Enberg
Zach Brown writes: But couldn't we use make_pages_present() to figure which locks we need, sort them, and then grab them? Doh, obviously we can't as nopage() needs to bring the page in. Sorry about that. I also thought of another failure case for the vma walk. When a thread uses userspace m

Re: GFS

2005-08-09 Thread Pekka J Enberg
Hi Zach, Zach Brown writes: I'll try, briefly. Thanks for the excellent explanation. Zach Brown writes: And that's the problem. Because they're acquired in ->nopage they can be acquired during a fault that is servicing the 'buf' argument to an outer file->{read,write} operation which has

Re: GFS

2005-08-09 Thread Zach Brown
d a lock for the target file. Acquiring multiple locks introduces the risk of ABBA deadlocks. It's trivial to construct examples of mmap(), read(), and write() on 2 nodes with 2 files that deadlock. So clustered file systems in Linux (GFS, Lustre, OCFS2, (GPFS?)) all walk vmas in their file
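
The ABBA risk described in this excerpt is usually avoided by having every node take its locks in one agreed global order. A minimal userspace sketch of that idea follows, assuming each lock can be named by a 64-bit identifier (for example an inode number). The names cmp_lock_id and order_lock_ids are hypothetical and are not taken from the GFS2, OCFS2, or Lustre code; the excerpt does not show how those file systems order the locks they collect during the vma walk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical comparator: orders lock identifiers numerically. */
static int cmp_lock_id(const void *a, const void *b)
{
	uint64_t x = *(const uint64_t *)a;
	uint64_t y = *(const uint64_t *)b;

	/* Avoids the overflow a plain subtraction could cause. */
	return (x > y) - (x < y);
}

/*
 * Hypothetical helper: sort the set of locks an operation needs.
 * If every node acquires locks in this sorted order, two nodes can
 * never hold A while waiting for B and B while waiting for A.
 */
static void order_lock_ids(uint64_t *ids, size_t n)
{
	assert(ids != NULL || n == 0);
	if (n > 1)
		qsort(ids, n, sizeof(*ids), cmp_lock_id);
}
```

A caller would collect the identifiers of every lock the read or write may need, pass them to order_lock_ids(), and then acquire them front to back. This only helps when the full lock set is known before the first lock is taken, which is the reason given in the thread for walking the vmas up front.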

Re: [PATCH 00/14] GFS

2005-08-09 Thread Al Viro
On Tue, Aug 02, 2005 at 03:18:28PM +0800, David Teigland wrote: > Hi, GFS (Global File System) is a cluster file system that we'd like to > see added to the kernel. The 14 patches total about 900K so I won't send > them to the list unless that's requested. Comments and s

Re: GFS

2005-08-09 Thread Pekka Enberg
On Mon, 2005-08-08 at 11:32 -0700, Zach Brown wrote: > > Sorry if this is an obvious question but what prevents another thread > > from doing mmap() before we do the second walk and messing up num_gh? > > Nothing, I suspect. OCFS2 has a problem like this, too. It wants a way > for a file system

Re: GFS

2005-08-09 Thread Pekka J Enberg
Hi David, Here are some more comments. Pekka +/** +*** +** +** Copyright (C) Sistina Software, Inc. 1997-2003 All r

Re: GFS

2005-08-08 Thread Zach Brown
Pekka J Enberg wrote: > Sorry if this is an obvious question but what prevents another thread > from doing mmap() before we do the second walk and messing up num_gh? Nothing, I suspect. OCFS2 has a problem like this, too. It wants a way for a file system to serialize mmap/munmap/mremap during f

Re: GFS

2005-08-08 Thread Pekka J Enberg
David Teigland writes: +static ssize_t walk_vm_hard(struct file *file, char *buf, size_t size, + loff_t *offset, do_rw_t operation) +{ + struct gfs2_holder *ghs; + unsigned int num_gh = 0; + ssize_t count; + + { Can we please get rid of the extra braces e

Re: GFS

2005-08-08 Thread David Teigland
On Mon, Aug 08, 2005 at 01:57:55PM +0300, Pekka J Enberg wrote: > David Teigland writes: > >> but why can't you return NULL here on failure like you do for > >> find_lock_page()? > > > >because create is set > > Yes, but looking at (some of the) top-level callers, there's no real reason > why c

Re: GFS

2005-08-08 Thread Pekka J Enberg
David Teigland writes: > but why can't you return NULL here on failure like you do for > find_lock_page()? because create is set Yes, but looking at (some of the) top-level callers, there's no real reason why create must not fail. Am I missing something here? > gfs2-02.patch:+ RETRY_MALL

Re: GFS

2005-08-08 Thread David Teigland
On Mon, Aug 08, 2005 at 01:18:45PM +0300, Pekka J Enberg wrote: > gfs2-02.patch:+ RETRY_MALLOC(ip = kmem_cache_alloc(gfs2_inode_cachep, > -> GFP_NOFAIL. Already gone, inode_create() can return an error. if (create) { RETRY_MALLOC(page = grab_cache_page(aspace->i_mapping, index),

Re: GFS

2005-08-08 Thread Pekka J Enberg
David Teigland writes: > Is there a reason why you cannot use or ? See gfs2_hash_more() and comment; we hash discontiguous regions. jhash() takes an initial value. Isn't that sufficient? Pekka

Re: [PATCH 00/14] GFS

2005-08-08 Thread Jörn Engel
bility. gcc is getting smarter > about stack use nowadays, and {}'s shouldn't be needed to help it, it > tracks liveness of variables already. Plus, you don't have to guess about stack usage. Run "make checkstack" or, better yet, run the objdump of fs/gfs/built-in.o t

Re: GFS

2005-08-08 Thread Pekka J Enberg
David Teigland writes: > > +#define RETRY_MALLOC(do_this, until_this) \ > > +for (;;) { \ > > + { do_this; } \ > > + if (until_this) \ > > + break; \ > > + if (time_after_eq(jiffies, gfs2_malloc_warning + 5 * HZ)) { \ > > + printk("GFS2: out of memory: %s, %u\n

Re: [PATCH 00/14] GFS

2005-08-08 Thread Arjan van de Ven
On Mon, 2005-08-08 at 17:57 +0800, David Teigland wrote: > > > > Please drop the extra braces. > > Here and elsewhere we try to keep unused stuff off the stack. Are you > suggesting that we're being overly cautious, or do you just dislike the > way it looks? nice theory. In practice gcc 3.x sti

Re: GFS

2005-08-08 Thread Pekka J Enberg
David Teigland writes: > > +static int ea_set_i(struct gfs2_inode *ip, struct gfs2_ea_request *er, > > + struct gfs2_ea_location *el) > > +{ > > + { > > + struct ea_set es; > > + int error; > > + > > + memset(&es, 0, sizeof(struct ea_set));

Re: [PATCH 00/14] GFS

2005-08-08 Thread David Teigland
On Wed, Aug 03, 2005 at 09:44:06AM +0300, Pekka Enberg wrote: > > +uint32_t gfs2_hash(const void *data, unsigned int len) > > +{ > > + uint32_t h = 0x811C9DC5; > > + h = hash_more_internal(data, len, h); > > + return h; > > +} > > Is there a reason why you cannot use or ? See gfs2_h

Re: [PATCH 00/14] GFS

2005-08-07 Thread David Teigland
On Fri, Aug 05, 2005 at 03:14:15PM +0800, David Teigland wrote: > On Tue, Aug 02, 2005 at 09:45:24AM +0200, Arjan van de Ven wrote: > > * +++ b/fs/gfs2/fixed_div64.h 2005-08-01 14:13:08.009808200 +0800 > > e why? > > I'm not sure, actually, apart from the comments: > > do_div: /* For i

Re: [PATCH 00/14] GFS

2005-08-07 Thread Alan Cox
enced linux developer from outside the GFS team. Are you volunteering ?

Re: [PATCH 00/14] GFS

2005-08-05 Thread David Teigland
On Fri, Aug 05, 2005 at 12:07:50PM +0200, Jörn Engel wrote: > On Fri, 5 August 2005 17:44:52 +0800, David Teigland wrote: > > Do we go a step beyond this and use say the crc32() function from > > linux/crc32.h? Is this _function_ as standard and unchanging as the table > > of crcs? In my tests it

Re: [PATCH 00/14] GFS

2005-08-05 Thread Jörn Engel
On Fri, 5 August 2005 17:44:52 +0800, David Teigland wrote: > > linux/lib/crc32table.h : crc32table_le[] is the same as our crc_32_tab[]. > This looks like a standard that's not going to change, as you've said, so > including crc32table.h and getting rid of our own table would work fine. > > Do w
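
For readers comparing the table with the function: a little-endian (reflected) CRC-32 table is normally generated from the polynomial 0xEDB88320, so a bit-at-a-time loop over that polynomial gives the same result as the table lookup. The sketch below is an assumption-laden userspace illustration, not the kernel's implementation and not GFS2's gfs2_disk_hash(); the name crc32_le_sketch is hypothetical, and the seed and final inversion that GFS2 applies are not visible in this excerpt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical bitwise CRC-32 over the reflected polynomial 0xEDB88320.
 * The caller supplies the starting value and applies any final
 * inversion, so the function itself adds no implicit XOR.
 */
static uint32_t crc32_le_sketch(uint32_t crc, const uint8_t *p, size_t len)
{
	assert(p != NULL || len == 0);
	while (len--) {
		crc ^= *p++;
		/* Process one input byte, least significant bit first. */
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ ((crc & 1) ? 0xEDB88320u : 0u);
	}
	return crc;
}
```

With a starting value of 0xFFFFFFFF and a final XOR with 0xFFFFFFFF, this yields the widely published CRC-32 check value 0xCBF43926 for the ASCII string "123456789", which is one way to confirm that a table or function follows the standard definition.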

Re: [PATCH 00/14] GFS

2005-08-05 Thread David Teigland
do you duplicate this? The kernel has a perfectly good set of > > > generic crc32 tables/functions just fine > > > > The gfs2_disk_hash() function and the crc table on which it's based are a > > part of gfs2_ondisk.h: the ondisk metadata specification. This is a bit

Re: [PATCH 00/14] GFS

2005-08-05 Thread Arjan van de Ven
On Fri, 2005-08-05 at 10:28 +0200, Jan Engelhardt wrote: > >The gfs2_disk_hash() function and the crc table on which it's based are a > >part of gfs2_ondisk.h: the ondisk metadata specification. This is a bit > >unusual since gfs uses a hash table on-disk for its direc

Re: [PATCH 00/14] GFS

2005-08-05 Thread Jan Engelhardt
>The gfs2_disk_hash() function and the crc table on which it's based are a >part of gfs2_ondisk.h: the ondisk metadata specification. This is a bit >unusual since gfs uses a hash table on-disk for its directory structure. >This header, including the hash function/table, must be

Re: [PATCH 00/14] GFS

2005-08-05 Thread Arjan van de Ven
bles/functions just fine > > The gfs2_disk_hash() function and the crc table on which it's based are a > part of gfs2_ondisk.h: the ondisk metadata specification. This is a bit > unusual since gfs uses a hash table on-disk for its directory structure. > This header, including the h

Re: [Linux-cluster] Re: [PATCH 00/14] GFS

2005-08-05 Thread Mike Christie
Mike Christie wrote: > David Teigland wrote: > >>On Tue, Aug 02, 2005 at 09:45:24AM +0200, Arjan van de Ven wrote: >> >> >>>* Why are you using bufferheads extensively in a new filesystem? >> >> >>bh's are used for metadata, the log, and journaled data which need to be >>written at the block granu

Re: [Linux-cluster] Re: [PATCH 00/14] GFS

2005-08-05 Thread Mike Christie
David Teigland wrote: > On Tue, Aug 02, 2005 at 09:45:24AM +0200, Arjan van de Ven wrote: > >>* Why are you using bufferheads extensively in a new filesystem? > > > bh's are used for metadata, the log, and journaled data which need to be > written at the block granularity, not page. > In a scs

Re: [PATCH 00/14] GFS

2005-08-05 Thread David Teigland
's based are a part of gfs2_ondisk.h: the ondisk metadata specification. This is a bit unusual since gfs uses a hash table on-disk for its directory structure. This header, including the hash function/table, must be included by user space programs like fsck that want to decipher a fs, and any

Re: [PATCH 00/14] GFS

2005-08-03 Thread Mark Fasheh
On Wed, Aug 03, 2005 at 12:37:44PM +0200, Lars Marowsky-Bree wrote: > On 2005-08-03T11:56:18, David Teigland <[EMAIL PROTECTED]> wrote: > > > > * Why use your own journalling layer and not say ... jbd ? > > Here's an analysis of three approaches to cluster-fs journaling and their > > pros/cons (in

Re: [PATCH 00/14] GFS

2005-08-03 Thread Andi Kleen
David Teigland <[EMAIL PROTECTED]> writes: > Hi, GFS (Global File System) is a cluster file system that we'd like to > see added to the kernel. The 14 patches total about 900K so I won't send > them to the list unless that's requested. Comments and suggestions ar

Re: [PATCH 00/14] GFS

2005-08-03 Thread Lars Marowsky-Bree
On 2005-08-03T11:56:18, David Teigland <[EMAIL PROTECTED]> wrote: > > * Why use your own journalling layer and not say ... jbd ? > Here's an analysis of three approaches to cluster-fs journaling and their > pros/cons (including using jbd): http://tinyurl.com/7sbqq Very instructive read, thanks f

Re: [PATCH 00/14] GFS

2005-08-03 Thread David Teigland
On Wed, Aug 03, 2005 at 11:17:09AM +0200, Arjan van de Ven wrote: > On Wed, 2005-08-03 at 11:56 +0800, David Teigland wrote: > > The point is you can define GFS2_ENDIAN_BIG to compile gfs to be BE > > on-disk instead of LE which is another useful way to verify endian > > c

Re: [PATCH 00/14] GFS

2005-08-03 Thread Arjan van de Ven
On Wed, 2005-08-03 at 11:56 +0800, David Teigland wrote: > The point is you can define GFS2_ENDIAN_BIG to compile gfs to be BE > on-disk instead of LE which is another useful way to verify endian > correctness. that sounds wrong to be a compile option. If you really want to deal with
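
The fixed-endian alternative raised in this thread can be illustrated in userspace C as below. This is only a sketch of the general technique, assuming a little-endian on-disk format; put_le32 and get_le32 are hypothetical names, not GFS2 functions. In kernel code the usual tools for this are cpu_to_le32() and le32_to_cpu().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: store a 32-bit value in little-endian byte
 * order. The result is the same on big- and little-endian hosts
 * because it is built from shifts, not from the in-memory layout.
 */
static void put_le32(uint8_t *buf, uint32_t v)
{
	assert(buf != NULL);
	buf[0] = (uint8_t)(v & 0xff);
	buf[1] = (uint8_t)((v >> 8) & 0xff);
	buf[2] = (uint8_t)((v >> 16) & 0xff);
	buf[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Hypothetical helper: read back a little-endian 32-bit value. */
static uint32_t get_le32(const uint8_t *buf)
{
	assert(buf != NULL);
	return (uint32_t)buf[0] |
	       ((uint32_t)buf[1] << 8) |
	       ((uint32_t)buf[2] << 16) |
	       ((uint32_t)buf[3] << 24);
}
```

Because the byte order is decided by the format and not by a build-time define, a file system image written on one architecture reads back identically on another, which is the property the reviewers are asking for.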

Re: [PATCH 00/14] GFS

2005-08-03 Thread Arjan van de Ven
> I don't know anything about GFS, but expecting a filesystem author to > use a journaling layer he does not want to is a bit arrogant. good that I didn't expect that then. I think it's fair enough to ask people if they can use it. If the answer is "No because it do

Re: [PATCH 00/14] GFS

2005-08-02 Thread Pekka Enberg
Hi David, Some more comments below. Pekka On 8/2/05, David Teigland <[EMAIL PROTECTED]> wrote: > +/** > + * inode_create - create a struct gfs2_inode > + * @i_gl: The glock covering the inode > + * @inum: The inode number > + * @io_gl: the iopen glock to acquire/h
