> > +int set_page_dirty_mapping(struct page *page);
> >
> >
> >
> This aspect of the design seems intrusive to me. I didn't see a strong
> reason to introduce new versions of many of the routines just to handle
> these semantics. What motivated
> > Take this example:
> >
> > fd = open()
> > addr = mmap(.., fd)
> > write(fd, ...)
> > close(fd)
> > sleep(100)
> > msync(addr,...)
> > munmap(addr)
> >
> > The file times will be updated in write(), but with your patch, the
> > bit in the mapping will also be set.
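The sequence in the example above can be sketched as a small user-space C function. This is only an illustration under stated assumptions: the function name, the temporary file created with mkstemp(), the 4096-byte size, and the store through the mapping are not in the original message, and sleep(100) is left out so the sketch finishes immediately.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: open, mmap, write, close, then msync and munmap.
 * path_template must end in "XXXXXX" (mkstemp() requirement).
 * Returns 0 if every step succeeded, -1 otherwise.
 */
int mmap_then_msync(char *path_template)
{
    char buf[4096];
    int rc = -1;

    int fd = mkstemp(path_template);            /* fd = open() */
    if (fd < 0)
        return -1;

    void *addr = mmap(NULL, sizeof(buf), PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);       /* addr = mmap(.., fd) */
    if (addr == MAP_FAILED) {
        close(fd);
        unlink(path_template);
        return -1;
    }

    memset(buf, 'a', sizeof(buf));
    ssize_t n = write(fd, buf, sizeof(buf));    /* write(fd, ...) */
    close(fd);                                  /* close(fd) */

    if (n == (ssize_t)sizeof(buf)) {
        /* Assumption: the application also stores through the mapping. */
        ((char *)addr)[0] = 'b';
        rc = msync(addr, sizeof(buf), MS_SYNC); /* msync(addr, ...) */
    }

    munmap(addr, sizeof(buf));                  /* munmap(addr) */
    unlink(path_template);
    return rc;
}
```

Calling it with a writable template such as "/tmp/example_XXXXXX" walks through the same steps; which of them updates the file times is exactly what the thread is debating.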
>
> > __fput() will be called when there are no more references to 'file',
> > then it will update the time if the flag is set. This applies to
> > regular files as well as devices.
> >
> >
>
> I suspect that you will find that, for a block device, the wrong inode
> gets updated. That's where t
> > > This still does not address the situation where a file is 'permanently'
> > > mmap'd, does it?
> >
> > So? If application doesn't do msync, then the file times won't be
> > updated. That's allowed by the standard, and so portable applications
> > will have to call msync.
>
> It is allowed
> Miklos Szeredi wrote:
> >>>> This still does not address the situation where a file is 'permanently'
> >>>> mmap'd, does it?
> >>>>
> >>> So? If application doesn't do msync, then the file times won
The first part of this series (1-7) contains miscellaneous patches,
some of which are needed for fuse writable mmap to work correctly.
Some of these are resends of patches already in -mm, with minor
updates.
The rest of the series adds shared writable mapping support to fuse,
with some write perf
From: Miklos Szeredi <[EMAIL PROTECTED]>
Changes:
o dput already checks dentry == NULL, so remove check
from prune_one_dentry()
The time shrink_dcache_parent() takes grows quadratically with the
depth of the tree under 'parent'. This starts to get noticeable at
about 10,0
From: Miklos Szeredi <[EMAIL PROTECTED]>
Changes:
o moved check from __fput() to remove_vma(), which is more logical
o changed set_page_dirty() to set_page_dirty_mapping in hugetlb.c
o cleaned up #ifdef CONFIG_BLOCK mess
This patch makes writing to shared memory mappings update st_cti
From: Miklos Szeredi <[EMAIL PROTECTED]>
Changes:
o fix theoretical NULL pointer dereference in __mpage_writepage
o merge Andrew Morton's cleanups
Clean up code duplication between mpage_writepages() and
generic_writepages().
The new generic function, write_cache_pages() takes
From: Miklos Szeredi <[EMAIL PROTECTED]>
This deadlock is similar to the one in balance_dirty_pages, but
instead of waiting in balance_dirty_pages after submitting a write
request, it happens during a memory allocation for filesystem B before
submitting a write request.
It is easy to rep
From: Miklos Szeredi <[EMAIL PROTECTED]>
Set the read and write congestion state if the request queue is close
to blocking, and clear it when it's not.
This prevents unnecessary blocking in readahead and writeback.
Signed-off-by: Miklos Szeredi <[EMAIL PROTECTED]>
---
Ind
From: Miklos Szeredi <[EMAIL PROTECTED]>
The function do_lo_send_aops() should call
balance_dirty_pages_ratelimited() after each page similarly to
generic_file_buffered_write().
Without this, writing the loop device directly (not through a
filesystem) is very slow, and also slows the
From: Miklos Szeredi <[EMAIL PROTECTED]>
Needed by fuse writepage.
Signed-off-by: Miklos Szeredi <[EMAIL PROTECTED]>
---
Index: linux/include/linux/rwsem.h
===
--- linux.orig/include/linux/rwsem.h 2007-02-27 14:40:
From: Miklos Szeredi <[EMAIL PROTECTED]>
This patch adds a new helper function fuse_write_fill() which makes it
possible to send WRITE requests asynchronously.
A new flag for WRITE requests is also added which indicates that this
is a write from the page cache, and not a "normal"
From: Miklos Szeredi <[EMAIL PROTECTED]>
Each WRITE request must carry a valid file descriptor. When a page is
written back from a memory mapping, the file through which the page
was dirtied is not available, so a new mechanism is needed to find a
suitable file in ->writepage(s).
From: Miklos Szeredi <[EMAIL PROTECTED]>
Add a per-filesystem limit for the number of dirty pages. If half the
limit is reached, background writeback is started. If the limit is
reached, then start some writeback and wait until the number goes
below the limit again.
The dirty li
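The two thresholds described in this message can be sketched as a small decision function. This is a sketch only: the enum, the function name, and the use of ">=" at each boundary are assumptions, and the patch itself may count and compare differently.

```c
/* Hypothetical names; only the two thresholds come from the description. */
enum dirty_action {
    DIRTY_NONE,        /* below half the limit: nothing to do */
    DIRTY_BACKGROUND,  /* half the limit reached: start background writeback */
    DIRTY_THROTTLE     /* limit reached: start writeback and wait */
};

enum dirty_action dirty_limit_action(unsigned long nr_dirty,
                                     unsigned long limit)
{
    if (nr_dirty >= limit)
        return DIRTY_THROTTLE;
    if (nr_dirty >= limit / 2)
        return DIRTY_BACKGROUND;
    return DIRTY_NONE;
}
```

With a limit of 1000 pages, 500 dirty pages would select background writeback and 1000 would make the writer wait.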
From: Miklos Szeredi <[EMAIL PROTECTED]>
Up to now, file writes were split into page size WRITE requests. This
is inefficient, since there are two context switches per request.
So allow bigger writes, but still do it synchronously. Asynchronous
writeback would be even better, but i
From: Miklos Szeredi <[EMAIL PROTECTED]>
Make per-filesystem statistics about dirty and under-writeback pages
available through the fuse control filesystem.
Signed-off-by: Miklos Szeredi <[EMAIL PROTECTED]>
---
Index: linux/fs/fu
From: Miklos Szeredi <[EMAIL PROTECTED]>
Change fuse_file_mmap() to allow shared writable mappings. Change the
->set_page_dirty address space operation to __set_page_dirty_nobuffers.
In fuse_fsync() sync the inode's dirty data.
It is important, that after all writable file are c
From: Miklos Szeredi <[EMAIL PROTECTED]>
Implement the ->writepage address space operation. Be careful not to
block if the wbc->nonblocking flag is set.
Acquire the read-write truncation semaphore for read when allocating
the request. Use the _non_owner variants, since the semap
From: Miklos Szeredi <[EMAIL PROTECTED]>
Implement the ->writepages address space operation. This is very
similar to fuse_writepage(), but batches multiple pages into a single
request.
It reuses the fuse_fill_data structure currently used by
fuse_readpages().
Signed-off-by: Miklo
From: Miklos Szeredi <[EMAIL PROTECTED]>
Other than truncate, there are two cases when fuse tries to get rid
of cached pages:
a) in open, if the KEEP_CACHE flag is not set
b) in getattr, if the file size changed spontaneously
Until now invalidate_mapping_pages() was used, which didn't
From: Miklos Szeredi <[EMAIL PROTECTED]>
Create a function sync_sb() and export it to modules. This is the
generic interface for writing back dirty data from a single
superblock.
Signed-off-by: Miklos Szeredi <[EMAIL PROTECTED]>
---
Index: linux/fs/fs
From: Miklos Szeredi <[EMAIL PROTECTED]>
Make lifetime of 'struct fuse_file' independent from 'struct file' by
adding a reference counter and destructor.
This will enable asynchronous page writeback, where it cannot be
guaranteed, that the file is not released whil
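The general technique named here, a reference counter plus a destructor, can be sketched in user-space C as follows. All names are hypothetical and this is not the fuse code: a plain int stands in for whatever atomic counter the kernel patch uses, so the sketch is not thread-safe.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an object that must outlive its first owner. */
struct counted_file {
    int count;                                 /* number of holders */
    void (*release)(struct counted_file *ff);  /* destructor hook, may be NULL */
};

struct counted_file *counted_file_alloc(void (*release)(struct counted_file *))
{
    struct counted_file *ff = malloc(sizeof(*ff));
    if (!ff)
        return NULL;
    ff->count = 1;          /* the creator holds the first reference */
    ff->release = release;
    return ff;
}

struct counted_file *counted_file_get(struct counted_file *ff)
{
    ff->count++;
    return ff;
}

/* Returns 1 if this call dropped the last reference and freed the object. */
int counted_file_put(struct counted_file *ff)
{
    if (--ff->count > 0)
        return 0;
    if (ff->release)
        ff->release(ff);
    free(ff);
    return 1;
}
```

An asynchronous writer would take its own reference with counted_file_get() before starting and drop it with counted_file_put() when done, so the object survives even if the original owner releases it first.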
From: Miklos Szeredi <[EMAIL PROTECTED]>
Add a new semaphore to prevent asynchronous page writeback during the
TRUNCATE request.
Using i_alloc_sem would almost work, but it has to be released before
invalidating the truncated pages, so it's easier to define a separate
one.
Signed-off
From: Miklos Szeredi <[EMAIL PROTECTED]>
There's a slight problem with filesystem type representation in fuse
based filesystems.
From the kernel's view, there are just two filesystem types: fuse and
fuseblk. From the user's view there are lots of different filesystem
From: Miklos Szeredi <[EMAIL PROTECTED]>
Use wake_up_all instead of wake_up in put_reserved_req(), otherwise it
is possible that the right task is not woken up.
Also create a separate reserved_req_waitq in addition to the
blocked_waitq, since they fulfill totally separate functions.
Sign
From: Miklos Szeredi <[EMAIL PROTECTED]>
This deadlock happens when dirty pages from one filesystem are
written back through another filesystem. It is easiest to demonstrate
with fuse, although it could affect loopback mounts as well (see
following patches).
Let's call the filesystems A(
> These changes still have the undesirable property that although the
> modified pages may be flushed to stable storage, the metadata on
> the file will not be updated until the application takes positive
> action. This is permissible given the current wording in the
> specifications, but it would
> While these entry points do not actually modify the file itself,
> as was pointed out, they are handy points at which the kernel gains
> control and could actually notice that the contents of the file are
> no longer the same as they were, ie. modified.
>
> From the operating system viewpoint,
> >> While these entry points do not actually modify the file itself,
> >> as was pointed out, they are handy points at which the kernel gains
> >> control and could actually notice that the contents of the file are
> >> no longer the same as they were, ie. modified.
> >>
> >> From the operating s
> What happens if the application overwrites what it had written some
> time later? Nothing. The page is already read-write, the pte dirty,
> so even though the file was clearly modified, there's absolutely no
> way in which this can be used to force an update to the timestamp.
Which, I realize
> >> What happens if the application overwrites what it had written some
> >> time later? Nothing. The page is already read-write, the pte dirty,
> >> so even though the file was clearly modified, there's absolutely no
> >> way in which this can be used to force an update to the timestamp.
> >>
> > This deadlock happens when dirty pages from one filesystem are
> > written back through another filesystem. It is easiest to demonstrate
> > with fuse, although it could affect loopback mounts as well (see
> > following patches).
> >
> > Let's call the filesystems A(bove) and B(elow). Process Pr
> > From: Miklos Szeredi <[EMAIL PROTECTED]>
> >
> > This deadlock is similar to the one in balance_dirty_pages, but
> > instead of waiting in balance_dirty_pages after submitting a write
> > request, it happens during a memory allocation for filesystem B b
> > > > This deadlock happens when dirty pages from one filesystem are
> > > > written back through another filesystem. It is easiest to demonstrate
> > > > with fuse, although it could affect loopback mounts as well (see
> > > > following patches).
> > > >
> > > > Let's call the filesystems A(bove)
ted number of threads
+ no progress is made.
Thanks,
Miklos
From: Miklos Szeredi <[EMAIL PROTECTED]>
This deadlock happens when dirty pages from one filesystem are
written back through another filesystem. It is easiest to demonstrate
with fuse, although it could affect loopback mounts as w
Hi Jeff,
I'm having problems using 2.6.20 UML. It's been a long time since I
last tried, so I don't know which version this started with.
It boots fine, then usually just after logging in and before starting
the shell (but sometimes after the shell started) it gets into some
loop. Looking at the strace show
> No, it doesn't. This is a strace on the host, I take it?
Yes.
> Can you get backtraces from the processes?
Here's one:
#0 0xe410 in __kernel_vsyscall ()
#1 0xb7f0fbc3 in write () from /lib/tls/i686/cmov/libc.so.6
#2 0x08066f52 in file_io (fd=10, buf=0x8a0fc8b, len=1,
io_proc=0x805
From: Miklos Szeredi <[EMAIL PROTECTED]>
The time shrink_dcache_parent() takes grows quadratically with the
depth of the tree under 'parent'. This starts to get noticeable at
about 10,000.
These kinds of depths don't occur normally, and filesystems which
invoke shrin
> > "The file system mounted on /tmp/z in the example contains 2^50
> > directories". heh.
> >
> > I do wonder how realistic this problem is in real life.
>
> That's a fair concern, although I was trying this as part
> of evaluating how much someone could hose a system
> if we let them mount arb
> > Unfortunately this patch doesn't completely solve this problem, since
> > the system will still be hosed due to all memory being used up by
> > dentries. And I bet the OOM killer won't find the real target (du)
> > but will kill anything before that.
> >
> > So the second part of the problem i
There's a slight problem with filesystem type representation in fuse
based filesystems.
From the kernel's view, there are just two filesystem types: fuse and
fuseblk. From the user's view there are lots of different filesystem
types. The user is not even much concerned if the filesystem is fuse
> > There's a slight problem with filesystem type representation in fuse
> > based filesystems.
> >
> > From the kernel's view, there are just two filesystem types: fuse and
> > fuseblk. From the user's view there are lots of different filesystem
> > types. The user is not even much concerned i
> >-static struct file_system_type **find_filesystem(const char *name)
> >+static struct file_system_type **find_filesystem(const char *name, unsigned
> >len)
> > {
> > struct file_system_type **p;
> > for (p=&file_systems; *p; p=&(*p)->next)
> >-if (strcmp((*p)->name,name) ==
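The quoted hunk adds a length argument to find_filesystem(), but the replacement comparison is cut off above. One plausible way to match a registered name against only the first len bytes of a longer string is sketched below; the helper name and the exact test are assumptions, not the patch's code.

```c
#include <string.h>

/*
 * Hypothetical helper: returns 1 if the first len bytes of name are
 * exactly fs_name, 0 otherwise. Checking strlen(fs_name) == len keeps a
 * registered name from matching a mere prefix of itself.
 */
int fs_name_matches(const char *fs_name, const char *name, size_t len)
{
    return strlen(fs_name) == len && strncmp(fs_name, name, len) == 0;
}
```

For example, with a made-up string "fuse.example" and len 4, the registered name "fuse" would match while "fuseblk" would not.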
> > Strangely enough after continuing in gdb, UML is back to normal, and I
> > can't make it hang any more. It must be something timing related.
>
> Can you see if the patch below fixes it?
Yay! Got my nice fast UML back instead of ugly slow QEmu ;)
Seems to work perfectly now.
Thanks,
Miklos
From: Miklos Szeredi <[EMAIL PROTECTED]>
Clean up massive code duplication between mpage_writepages() and
generic_writepages().
The new generic function, write_cache_pages() takes a function pointer
argument, which will be called for each page to be written.
Maybe cifs_writepages() too c
> >Maybe cifs_writepages() too can use this infrastructure, but I'm not
> >touching that with a ten-foot pole.
> >
> >
> The cifs case ought to be one of the simpler ones, pseudo-code is pretty
> easy, the hard part is all of the stuff unrelated to cifs:
> Ideally if there were generic functions
I was testing the new fuse shared writable mmap support, and finding
that bash-shared-mapping deadlocks (which isn't so strange ;). What
is more strange is that this is not an OOM situation at all, with
plenty of free and cached pages.
A little more investigation shows that a similar deadlock hap
> > I was testing the new fuse shared writable mmap support, and finding
> > that bash-shared-mapping deadlocks (which isn't so strange ;). What
> > is more strange is that this is not an OOM situation at all, with
> > plenty of free and cached pages.
> >
> > A little more investigation shows tha
> Andrew Morton wrote:
> > On Sun, 18 Feb 2007 19:28:18 +0100 Miklos Szeredi <[EMAIL PROTECTED]> wrote:
> >
> >> I was testing the new fuse shared writable mmap support, and finding
> >> that bash-shared-mapping deadlocks (which isn't so strange ;). W
> > > > I was testing the new fuse shared writable mmap support, and finding
> > > > that bash-shared-mapping deadlocks (which isn't so strange ;). What
> > > > is more strange is that this is not an OOM situation at all, with
> > > > plenty of free and cached pages.
> > > >
> > > > A little more
> > > If so, writes to B will decrease the dirty memory threshold.
> >
> > Yes, but not by enough. Say A dirties 1100 pages, limit is 1000.
> > Some pages queued for writeback (doesn't matter how much). B writes
> > back 1, 1099 dirty remain in A, zero in B. balance_dirty_pages() for
> > B do
> --- a/fs/fs-writeback.c~a
> +++ a/fs/fs-writeback.c
> @@ -356,7 +356,7 @@ int generic_sync_sb_inodes(struct super_
> continue; /* Skip a congested blockdev */
> }
>
> - if (wbc->bdi && bdi != wbc->bdi) {
> + if (wbc->bdi
> > > > If so, writes to B will decrease the dirty memory threshold.
> > >
> > > Yes, but not by enough. Say A dirties 1100 pages, limit is 1000.
> > > Some pages queued for writeback (doesn't matter how much). B writes
> > > back 1, 1099 dirty remain in A, zero in B. balance_dirty_pages() fo
> > > > > If so, writes to B will decrease the dirty memory threshold.
> > > >
> > > > Yes, but not by enough. Say A dirties 1100 pages, limit is 1000.
> > > > Some pages queued for writeback (doesn't matter how much). B writes
> > > > back 1, 1099 dirty remain in A, zero in B. balance_dirty_
> > > In general, writepage is supposed to do work without blocking on
> > > expensive locks that will get pdflush and dirty reclaim stuck in this
> > > fashion. You'll probably have to take the same approach reiserfs does
> > > in data=journal mode, which is leaving the page dirty if fuse_get_req
How about this?
Solves the FUSE deadlock, but not the throttle_vm_writeout() one.
I'll try to tackle that one as well.
If the per-bdi dirty counter goes below 16, balance_dirty_pages()
returns.
Does the constant need to be tunable? If it's too large, then the global
threshold is more easily exceed
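The proposed rule can be sketched as a pair of predicates: a check that looks only at the global count, and one that first lets a writer through when its own device has fewer than 16 dirty pages. The function and macro names are hypothetical, and the real balance_dirty_pages() does considerably more than this.

```c
#define BDI_DIRTY_MIN 16UL  /* the constant mentioned in the message */

/* Global-only check: blocks any writer while the total is over the limit. */
int over_global_limit(unsigned long global_dirty, unsigned long limit)
{
    return global_dirty > limit;
}

/*
 * Sketch of the proposal: return early (no throttling) when this
 * device's own dirty count is below the constant.
 */
int must_throttle(unsigned long global_dirty, unsigned long limit,
                  unsigned long bdi_dirty)
{
    if (bdi_dirty < BDI_DIRTY_MIN)
        return 0;
    return over_global_limit(global_dirty, limit);
}
```

Using the numbers quoted elsewhere in this thread (1099 pages dirty in A, a limit of 1000, zero dirty in B), the global-only check would keep B waiting, while must_throttle(1099, 1000, 0) lets B proceed.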
> Solves the FUSE deadlock, but not the throttle_vm_writeout() one.
> I'll try to tackle that one as well.
>
> If the per-bdi dirty counter goes below 16, balance_dirty_pages()
> returns.
>
> Does the constant need to be tunable? If it's too large, then the global
> threshold is more easily exceede
> > How about this?
> >
> > Solves the FUSE deadlock, but not the throttle_vm_writeout() one.
> > I'll try to tackle that one as well.
> >
> > If the per-bdi dirty counter goes below 16, balance_dirty_pages()
> > returns.
> >
> > Does the constant need to be tunable? If it's too large, then the gl
> > > > > In general, writepage is supposed to do work without blocking on
> > > > > expensive locks that will get pdflush and dirty reclaim stuck in this
> > > > > fashion. You'll probably have to take the same approach reiserfs does
> > > > > in data=journal mode, which is leaving the page dirty
> On Tue, 2007-02-20 at 02:31 -0500, Hank Leininger wrote:
> > Is there anything provided by the kernel that would let you see the
> > current offset of an existing filehandle?
> >
> > Sometimes when processing a very large file (grepping a log, bzip2'ing
> > or gpg'ing a file, or whatever), I'd r
Mertens.
Signed-off-by: Miklos Szeredi <[EMAIL PROTECTED]>
---
Index: linux/fs/fuse/control.c
===
--- linux.orig/fs/fuse/control.c 2007-01-29 20:40:50.0 +0100
+++ linux/fs/fuse/control.c 2007-01-29 20:40:52.000
> I've come across this problem: how can a userspace program (such as for
> example "cp -a") tell that two files form a hardlink? Comparing inode
> number will break on filesystems that can have more than 2^32 files (NFS3,
> OCFS, SpadFS; kernel developers already implemented iget5_locked for th
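For reference, the conventional user-space check that this message is questioning compares the device and inode numbers returned by lstat(). The sketch below uses a hypothetical function name, and its answer is only as reliable as the filesystem's st_ino values, which is the very concern raised above.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>

/*
 * Hypothetical helper: returns 1 if both paths report the same
 * (st_dev, st_ino) pair, 0 if they differ, -1 if either lstat() fails.
 */
int same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    if (lstat(a, &sa) != 0 || lstat(b, &sb) != 0)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

A tool that preserves hardlinks would typically run such a comparison only on files whose st_nlink is greater than 1.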
> > > > High probability is all you have. Cosmic radiation hitting your
> > > > computer will more likely cause problems than colliding 64bit inode
> > > > numbers ;)
> > >
> > > Some of us have machines designed to cope with cosmic rays, and would be
> > > unimpressed with a decrease in reliabil
> > Well, sort of. Samefile without keeping fds open doesn't have any
> > protection against the tree changing underneath between first
> > registering a file and later opening it. The inode number is more
>
> You only need to keep one-file-per-hardlink-group open during final
> verification, ch
> And does it matter? If you rename a file, tar might skip it regardless of
> hardlink detection (if readdir races with rename, you can read none of the
> names of file, one or both --- all these are possible).
>
> If you have "dir1/a" hardlinked to "dir1/b" and while tar runs you delete
> both
> > And does it matter? If you rename a file, tar might skip it regardless of
> > hardlink detection (if readdir races with rename, you can read none of the
> > names of file, one or both --- all these are possible).
> >
> > If you have "dir1/a" hardlinked to "dir1/b" and while tar runs you delet
> >> No one guarantees you sane result of tar or cp -a while changing the tree.
> >> I don't see how is_samefile() could make it worse.
> >
> > There are several cases where changing the tree doesn't affect the
> > correctness of the tar or cp -a result. In some of these cases using
> > samefile()
> > There's really no point trying to push for such an inferior interface
> > when the problems which samefile is trying to address are purely
> > theoretical.
>
> Oh yes, there is. st_ino is powerful, *but impossible to implement*
> on many filesystems.
You mean POSIX compliance is impossible?
> > You mean POSIX compliance is impossible? So what? It is possible to
> > implement an approximation that is _at least_ as good as samefile().
> > One really dumb way is to set st_ino to the 'struct inode' pointer for
> > example. That will sure as hell fit into 64bits and will give a
> > uniq
> > > If this solves the problem on your box then i'll do a proper fix and
> > > introduce a cpu_relax_memory_change(*addr) type of API to around
> > > monitor/mwait. This patch boots fine on my T60 - but i never saw
> > > your problem.
> >
> > Yes, the patch does make the pauses go away. In f
away
> > > perfectly good packets with AF_UNIX sockets in them.
> > >
> > > The problems arise when a socket goes from installed to in-flight or
> > > vice versa during garbage collection. Since gc is done with a
> > > spinlock held, this only shows u
> From: Miklos Szeredi <[EMAIL PROTECTED]>
> Date: Mon, 18 Jun 2007 09:49:32 +0200
>
> > Ping Dave,
> >
> > Since there doesn't seem to be any new ideas forthcoming, can we
> > please decide on either one of my two submitted patches?
>
> Yo
> To test this theory, could you try the patch below, does this fix your
> hangs too?
Not tried yet, but obviously it does, since it's a superset of the
previous fix.  I could try without the smp_mb(), but see below.
> This change causes the memory access of the "easy" spin-loop portion
> to be
> > > This change causes the memory access of the "easy" spin-loop portion
> > > to be more aggressive: after the REP; NOP we'd not do the 'easy-loop'
> > > with a simple CMPB, but we'd re-attempt the atomic op.
> >
> > It looks as if this is going to overflow the lock counter, no?
>
> hm, wh
> > And is anyone working on a better patch?
>
> I have no idea.
>
> > Those patches aren't "bad" in the correctness sense. So IMO any one
> > of them is better, than having that bug in there.
>
> You're adding a very serious performance regression, which is
> about as bad as the bug itself.
N
> > > > And is anyone working on a better patch?
> > >
> > > I have no idea.
> > >
> > > > Those patches aren't "bad" in the correctness sense. So IMO any one
> > > > of them is better, than having that bug in there.
> > >
> > > You're adding a very serious performance regression, which is
> >
> > > Secondarily, this bug has been around for years and nobody noticed.
> > > The world will not explode if this bug takes a few more days or
> > > even a week to work out. Let's do it right instead of ramming
> > > arbitrary turds into the kernel.
> >
> > Fine, but just wishing a bug to get fi
> * Ingo Molnar <[EMAIL PROTECTED]> wrote:
>
> > how about the patch below? Boot-tested on 32-bit. As a side-effect
> > this change also removes the 255 CPUs limit from the 32-bit kernel.
>
> boot-tested on 64-bit too now.
Strange, I can't even get past the compile stage ;)
CC kernel/sp
> * Miklos Szeredi <[EMAIL PROTECTED]> 2007-06-18 11:44
> > Garbage collection only ever happens, if the app is sending AF_UNIX
> > sockets over AF_UNIX sockets. Which is a rather rare case. And which
> > is basically why this bug went unnoticed for so long.
>
> * Thomas Graf <[EMAIL PROTECTED]> 2007-06-18 12:32
> > * Miklos Szeredi <[EMAIL PROTECTED]> 2007-06-18 11:44
> > > Garbage collection only ever happens, if the app is sending AF_UNIX
> > > sockets over AF_UNIX sockets. Which is a rather rare case. And whi
> > but it's not as if it's really going to affect performance
> > in real cases.
>
> Since these circumstances are creatable by any user, we have
> to consider the cases caused by malicious entities.
OK. But then the whole gc thing is already broken, since a user can
DoS socket creation/destruc
> > I'm all for fixing this gc mess that we have now. But please don't
> > expect me to be the one who's doing it.
>
> Don't worry, I only expect you to make the situation worse :-)
That's real nice. Looks like finding and fixing bugs is not
appreciated in the networking subsystem :-/
Miklos
> > > I'm all for fixing this gc mess that we have now. But please don't
> > > expect me to be the one who's doing it.
> >
> > Don't worry, I only expect you to make the situation worse :-)
>
> In any event, I'll try to find some time to look more at your patch.
>
> But just like you don't want
> > No, correctness always trumps performance. Lost packets on an AF_UNIX
> > socket are _unacceptable_, and this is definitely not a theoretical
> > problem.
>
> If it's so unacceptable why has nobody noticed until now - it's a bug
> clearly, it needs fixing clearly, but unless you can produce som
> > > * Ingo Molnar <[EMAIL PROTECTED]> wrote:
> > >
> > > > how about the patch below? Boot-tested on 32-bit. As a side-effect
> > > > this change also removes the 255 CPUs limit from the 32-bit kernel.
> > >
> > > boot-tested on 64-bit too now.
> >
> > Strange, I can't even get past the compi
> P.S. maybe a posix filesystem interface manual would be good?
Maybe you are looking for this:
http://www.opengroup.org/onlinepubs/009695399/
Miklos
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at
> Hmm? Untested, I know. Maybe I overlooked something. But even the
> generated assembly code looks fine (much better than it looked before!)
Boots and runs fine. Fixes the freezes as well, which is not such a
big surprise, since basically any change in that function seems to do
that ;)
Miklos
> > Right now it is actually impossible to conclusively determine a
> > filesystem-relative path in the presence of bind (and possibly move)
> > mounts. This is highly desirable to be able to do in contexts that
> > involve non-Linux (or not-the-current-instance-of-Linux) accesses to the
> > files
t or
> >> > vice versa during garbage collection. Since gc is done with a
> >> > spinlock held, this only shows up on SMP.
> >> >
> >> > Signed-off-by: Miklos Szeredi <[EMAIL PROTECTED]>
> >>
> >> I'm going to hold of
> > the freezes that Miklos was seeing were hardirq contexts blocking in
> > task_rq_lock() - that is done with interrupts disabled. (Miklos i
> > think also tried !NOHZ kernels and older kernels, with a similar
> > result.)
> >
> > plus on the ptrace side, the wait_task_inactive() code had mos
> > Right. But the devil is in the details, and (as you correctly point
> > out later) to implement this, the whole locking scheme needs to be
> > overhauled. Problems:
> >
> > - Using the queue lock to make the dequeue and the fd detach atomic
> >   wrt the GC is difficult, if not impossible:
ice versa during garbage collection. Since gc is done with a
> > spinlock held, this only shows up on SMP.
> >
> > Signed-off-by: Miklos Szeredi <[EMAIL PROTECTED]>
>
> I'm going to hold off on this one for now.
>
> Holding all of the read locks kind of
> fs/fuse/inode.c:658:3: error: Initializer entry defined twice
> fs/fuse/inode.c:661:3: also defined here
Duh, that's a stupid conflict. I wonder why I don't get this compile
error...
> Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]>
Acked-by: Miklos Szeredi <
From: Miklos Szeredi <[EMAIL PROTECTED]>
Thanks to Alexey Dobriyan for spotting the other one.
Signed-off-by: Miklos Szeredi <[EMAIL PROTECTED]>
---
Index: linux/fs/fuse/inode.c
===
--- linux.orig/fs/fuse/inode.c 2007
I've got some more info about this bug. It is gathered with
nmi_watchdog=2 and a modified nmi_watchdog_tick(), which instead of
calling die_nmi() just prints a line and calls show_registers().
This makes the machine actually survive the NMI tracing. The attached
traces are gathered over about an
Chuck, Ingo, thanks for the responses.
> > The pattern that emerges is that on CPU0 we have an interrupt, which
> > is trying to acquire the rq lock, but can't.
> >
> > On CPU1 we have strace which is doing wait_task_inactive(), which sort
> > of spins acquiring and releasing the rq lock. I've
> > One task doing ptrace() can basically do whatever it wants with the
> > task being traced. This is not an exact analogy to what fuse does,
> > but close.
>
> Well, IMO userland tasks should not have power to grab VFS mutexes for
> indefinite amount of time. ("fused is allowed to deadlock ker
> We can just wait for all fuse requests to be serviced before
> proceeding further with freeze, right?
Right. Nice way to slow down or stop the suspend with an unprivileged
process. Avoiding that sort of DoS is one of the design goals of
fuse.
Look at it this way: the task of the freezer is to