Re: [OpenAFS-devel] Linux readpage handler

Simon Wilkinson Thu, 26 May 2011 06:40:04 -0700

On 25 May 2011, at 22:52, Andrew Deason wrote:

Firstly, I haven't looked specifically at the versions you are running - your 
Linux kernel is sufficiently ancient that it isn't in the kernel git repo, and 
I don't have my linux-prehistory tree on my laptop. So what follows is how 
things work in recent kernels. There have been significant changes here since 
2.6.9.


> So, my question here is what is supposed to happen? Is
> current->journal_info supposed to have the journal transaction of the
> current process (in which case I assume the readpage handler is not
> allowed to start write transactions, but I can't find this warned
> against anywhere), or is something supposed to reset the current task's
> journal_info or otherwise somehow guard against this?

The way that jbd is currently implemented, a thread cannot have two journals 
open at the same time - you can't call journal_start() on a different fs when 
you already have a journal started. If you are on the _same_ fs, then you can 
get away with this, as you just get a reference to the current handle, rather 
than an error.
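
To make the one-handle-per-thread rule concrete, here is a toy userspace model 
of it. This is only a sketch: every toy_* name is made up for illustration and 
is not the jbd API, and a single static slot stands in for 
current->journal_info. As far as I recall, the real journal_start() asserts 
when the journals don't match; the toy returns NULL instead so that the 
behaviour can be checked from a test.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: all toy_* names are hypothetical, not the jbd API. */
struct toy_journal {
    int id;                       /* one journal per mounted filesystem */
};

struct toy_handle {
    struct toy_journal *journal;  /* journal this handle belongs to */
    int refcount;                 /* nesting depth on the same journal */
};

/* Stands in for current->journal_info: one slot per thread. */
static struct toy_handle *toy_current_handle = NULL;
static struct toy_handle toy_handle_storage;

struct toy_handle *toy_journal_start(struct toy_journal *j)
{
    struct toy_handle *h = toy_current_handle;

    if (h != NULL) {
        if (h->journal != j)
            return NULL;          /* different fs: refused (kernel asserts) */
        h->refcount++;            /* same fs: reuse the open handle */
        return h;
    }

    /* No handle open on this thread yet: create one and record it. */
    toy_handle_storage.journal = j;
    toy_handle_storage.refcount = 1;
    toy_current_handle = &toy_handle_storage;
    return toy_current_handle;
}

void toy_journal_stop(struct toy_handle *h)
{
    assert(h != NULL && h->refcount > 0);
    if (--h->refcount == 0)
        toy_current_handle = NULL;  /* last reference: thread is journal-free */
}
```

Starting twice on the same toy journal hands back the same handle with a 
refcount of 2; starting on a second journal while the first is open is refused 
until both references have been stopped.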

When we call write_begin on ext3 we start a journal, which isn't completed 
until write_end is called. So, if we page fault whilst we are copying between 
userspace and kernel, we will re-enter the journal, and hit the assert you are 
seeing. However, the kernel should prevent this page fault from ever occurring, 
as it can cause deadlocks (the page fault may result in memory pressure which 
causes pages to be flushed, but you're already in a filesystem, and you then 
deadlock). So, write() ensures that all user pages required for the copy are in 
memory before calling write_begin, and then disables page faults for the 
duration of the copy.
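
The ordering described above can be sketched as a small userspace simulation. 
Again, this is a toy under stated assumptions: the toy_* names and the 
"resident" flag are invented for illustration, and the real code path in the 
kernel is considerably more involved (it retries short copies, for one). The 
point is only the sequence: fault the user pages in, open the journal, copy 
with page faults disabled, close the journal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: all toy_* names are hypothetical, not kernel interfaces. */
struct toy_user_buf {
    const char *data;
    size_t len;
    bool resident;              /* is the user page currently in memory? */
};

struct toy_state {
    bool journal_open;          /* between write_begin and write_end */
    int pagefaults_disabled;    /* nesting count, like a preempt counter */
    int faults_inside_journal;  /* the situation we want never to happen */
};

/* Bring the user page into memory, noting if this happens mid-journal. */
static void toy_fault_in(struct toy_state *s, struct toy_user_buf *u)
{
    if (!u->resident) {
        if (s->journal_open)
            s->faults_inside_journal++;
        u->resident = true;
    }
}

/* Copy that must not fault: gives up (returns 0) if the page is missing. */
static size_t toy_copy_atomic(struct toy_state *s, char *dst,
                              struct toy_user_buf *u)
{
    assert(s->pagefaults_disabled > 0);
    if (!u->resident)
        return 0;
    memcpy(dst, u->data, u->len);
    return u->len;
}

size_t toy_write(struct toy_state *s, char *dst, struct toy_user_buf *u)
{
    size_t copied;

    toy_fault_in(s, u);          /* 1. fault pages in before write_begin */
    s->journal_open = true;      /* 2. write_begin: journal started */
    s->pagefaults_disabled++;    /* 3. no page faults during the copy */
    copied = toy_copy_atomic(s, dst, u);
    s->pagefaults_disabled--;
    s->journal_open = false;     /* 4. write_end: journal completed */
    return copied;
}
```

With this ordering the fault is always taken before the journal is open, so 
the toy's faults_inside_journal counter stays at zero. If the page were evicted 
again between steps 1 and 3, the atomic copy would come up short rather than 
fault inside the journal.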

I suspect that the reason why you can't reproduce this on your test system, but 
are seeing it in the wild, is that 2.6.9 has some, but not all, of this logic. 
When testing, the page faults occur before write_begin (prepare_write, on 
something that old) is called, but on the "real" system, memory pressure is 
causing a race whereby a page that has been swapped in is swapped out again 
before it can be used.

Hope that's of some use!

Cheers,

Simon.

