Bug#394392: msync() in recent kernels fails LSB
* Jeff Licquia [EMAIL PROTECTED] [2006-10-20 19:17]:
> From a recent run of the LSB 3.1 tests:
>
> 10|852 /tset/LSB.os/mfiles/msync_P/T.msync_P 22:58:49|TC Start, scenario ref 858-0
>
> FSG internal testing showed that Fedora Core 5's 2.6.18 kernel does not fail in the same way. I believe I've traced it to a backported change from 2.6.19 development. The specific commit touching msync() is 204ec841fbea3e5138168edbc3a76d46747cc987 in git; it relies on several commits immediately preceding it. I've built Linus's tree on amd64, and it passes the test. I have not, however, built a 2.6.18 kernel with this patch and tested it, though it's the only patch in the Fedora kernel which touches the msync() code.

So it seems that the patches needed for msync() conformance, which we applied from 2.6.19 to our 2.6.18, cause filesystem corruption; see the current discussion of this on lkml. As I understand it, plain 2.6.18 does not conform to LSB 3.1, and the fixes it needs are associated with filesystem corruption. While Andrew, Linus and co are currently trying to come up with a patch, I think it might be better for us to simply back out these patches.

What does it take to get an exception for this LSB test? Surely the reasons cited above (it fails with 2.6.18, a fairly current kernel, and the patches to fix it are associated with fs corruption) are pretty good arguments for an exception...
--
Martin Michlmayr
http://www.cyrius.com/

--
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]
Bug#394392: msync() in recent kernels fails LSB
On Tue, 2006-12-19 at 17:08 +0100, Martin Michlmayr wrote:
> So it seems that the patches needed for msync() conformance, which we applied from 2.6.19 to our 2.6.18, cause filesystem corruption; see the current discussion of this on lkml. As I understand it, plain 2.6.18 does not conform to LSB 3.1, and the fixes it needs are associated with filesystem corruption. While Andrew, Linus and co are currently trying to come up with a patch, I think it might be better for us to simply back out these patches.
>
> What does it take to get an exception for this LSB test? Surely the reasons cited above (it fails with 2.6.18, a fairly current kernel, and the patches to fix it are associated with fs corruption) are pretty good arguments for an exception...

I brought this up at our weekly conference call, which generated quite a lot of discussion. The argument against issuing a waiver is that this isn't strictly required; a distro could fix the problem by downgrading the kernel to 2.6.16.

I've also forwarded your message to Ian Murdock, who is the current chair of the LSB Steering Committee.

The process for getting an exception is as follows:

- Release a product with a problem.
- Run the tests, and fail in some way.
- Request a waiver from the LSB Specification Authority. There's a link for doing so from the certification site.

Of course, the problem is that we have to make a decision now, so we also have an unofficial process of discussing known issues. That process has already been started.
Bug#394392: msync() in recent kernels fails LSB
On Tue, Dec 19, 2006 at 05:08:24PM +0100, Martin Michlmayr wrote:
> So it seems that the patches needed for msync() conformance, which we applied from 2.6.19 to our 2.6.18, cause filesystem corruption; see the current discussion of this on lkml. As I understand it, plain 2.6.18 does not conform to LSB 3.1, and the fixes it needs are associated with filesystem corruption. While Andrew, Linus and co are currently trying to come up with a patch, I think it might be better for us to simply back out these patches.
>
> What does it take to get an exception for this LSB test? Surely the reasons cited above (it fails with 2.6.18, a fairly current kernel, and the patches to fix it are associated with fs corruption) are pretty good arguments for an exception...
> --
> Martin Michlmayr
> http://www.cyrius.com/

why not wait for the fix and backport it?!

--
maks
Bug#394392: msync() in recent kernels fails LSB
* maximilian attems [EMAIL PROTECTED] [2006-12-19 20:30]:
> why not wait for the fix and backport it?!

Well, have you seen the discussion on lkml, in which people are basically groping in the dark? I hope there'll be a clean fix in a few days, but...
--
Martin Michlmayr
http://www.cyrius.com/
Bug#394392: msync() in recent kernels fails LSB
On Tue, Dec 19, 2006 at 11:25:22PM +0100, Martin Michlmayr wrote:
> * maximilian attems [EMAIL PROTECTED] [2006-12-19 20:30]:
> > why not wait for the fix and backport it?!
>
> Well, have you seen the discussion on lkml, in which people are basically groping in the dark? I hope there'll be a clean fix in a few days, but...
> --
> Martin Michlmayr
> http://www.cyrius.com/

yes, there are 2 working hacks around, so the final fix should come up. but we shouldn't mould our hands too quickly with the first shot.

--
maks

christmas with dj dsl and mieze medusa - http://www.cabaretrenz.org/programm+M5c858bac7ae.html
Bug#394392: msync() in recent kernels fails LSB
On Tue, Dec 19, 2006 at 05:08:24PM +0100, Martin Michlmayr wrote:
> > 10|852 /tset/LSB.os/mfiles/msync_P/T.msync_P 22:58:49|TC Start, scenario ref 858-0
>
> > FSG internal testing showed that Fedora Core 5's 2.6.18 kernel does not fail in the same way. I believe I've traced it to a backported change from 2.6.19 development. The specific commit touching msync() is 204ec841fbea3e5138168edbc3a76d46747cc987 in git; it relies on several commits immediately preceding it. I've built Linus's tree on amd64, and it passes the test. I have not, however, built a 2.6.18 kernel with this patch and tested it, though it's the only patch in the Fedora kernel which touches the msync() code.
>
> So it seems that the patches needed for msync() conformance, which we applied from 2.6.19 to our 2.6.18, cause filesystem corruption; see the current discussion of this on lkml. As I understand it, plain 2.6.18 does not conform to LSB 3.1, and the fixes it needs are associated with filesystem corruption. While Andrew, Linus and co are currently trying to come up with a patch, I think it might be better for us to simply back out these patches.
>
> What does it take to get an exception for this LSB test? Surely the reasons cited above (it fails with 2.6.18, a fairly current kernel, and the patches to fix it are associated with fs corruption) are pretty good arguments for an exception...

Reverting this is an ABI change which may be installer-affecting (I don't know if it is, but unlike most of the other pressing ABI changes, this one would apply to the kernels used by the installer). If we think the fix may be available soon, I think we're better off pushing forward rather than reverting.

If the decision *is* made to revert, release-wise it's best if the kernel team can bundle up any final ABI changes they want to make for etch at the same time, so that we can get it done with and get d-i RC2 out.

Thanks,
--
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                   to set it on, and I can move the world.
[EMAIL PROTECTED]                                   http://www.debian.org/
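For reference, the check at the heart of this thread (T.msync_P, as described in the original bug report: map three pages read-write, unmap the middle one, then msync() the first two) can be sketched roughly as follows. This is a minimal sketch and not the LSB test source. The helper name msync_rejects_hole() is hypothetical, and it uses a small temporary file where the report mentions a large file.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from the LSB suite.
 * Maps three pages of a temporary file MAP_SHARED, unmaps the middle
 * page, then calls msync() on the first two pages with the given flags
 * (MS_SYNC or MS_ASYNC).
 *
 * Returns  1 if msync() failed with ENOMEM (the conforming result),
 *          0 if it succeeded or failed with another errno,
 *         -1 if the setup itself could not be completed.
 */
int msync_rejects_hole(int flags)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    char path[] = "/tmp/msync_hole_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                       /* file disappears once closed */

    if (ftruncate(fd, 3 * page) != 0) {
        close(fd);
        return -1;
    }

    char *base = mmap(NULL, 3 * page, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    if (base == MAP_FAILED) {
        close(fd);
        return -1;
    }
    base[0] = 'x';                      /* dirty the first page */

    /* Punch a hole: the middle page is no longer mapped. */
    if (munmap(base + page, page) != 0) {
        munmap(base, 3 * page);
        close(fd);
        return -1;
    }

    /* The range covers one mapped page and one unmapped page. */
    errno = 0;
    int rc = msync(base, 2 * page, flags);
    int result = (rc == -1 && errno == ENOMEM);

    munmap(base, page);
    munmap(base + 2 * page, page);
    close(fd);
    return result;
}
```

Calling msync_rejects_hole(MS_SYNC) and msync_rejects_hole(MS_ASYNC) corresponds to the two modes the report says the test exercises. A kernel showing the behaviour reported in this bug would make at least one of the two calls return 0.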
Bug#394392: msync() in recent kernels fails LSB
Package: linux-image-2.6.17-2-686
Version: 2.6.17-9
Severity: important

From a recent run of the LSB 3.1 tests:

10|852 /tset/LSB.os/mfiles/msync_P/T.msync_P 22:58:49|TC Start, scenario ref 858-0
15|852 3.6-lite 9|TCM Start
400|852 7 1 22:59:13|IC Start
200|852 7 22:59:13|TP Start
520|852 7 8662 1 1|msync() did not return -1, returned 0
220|852 7 1 22:59:13|FAIL
410|852 7 1 22:59:13|IC End
80|852 0 22:59:15|TC End, scenario ref 858-0

The test mmap()'s three pages from a large file read-write, munmap()'s the middle page, and then tries to msync() the first two pages, both in synchronous and asynchronous modes. Both attempts should fail, because one of the pages in the range is not mapped. Starting with kernel 2.6.17, at least one of the msync() calls succeeded.

I've confirmed the failure happens in 2.6.18 i386 kernels, and on powerpc and amd64 with 2.6.17 kernels. I've been able to trace the bug to commit 707c21c848deeb0200ba3f07e4ba90e6dc419c2f in git.

FSG internal testing showed that Fedora Core 5's 2.6.18 kernel does not fail in the same way. I believe I've traced it to a backported change from 2.6.19 development. The specific commit touching msync() is 204ec841fbea3e5138168edbc3a76d46747cc987 in git; it relies on several commits immediately preceding it. I've built Linus's tree on amd64, and it passes the test. I have not, however, built a 2.6.18 kernel with this patch and tested it, though it's the only patch in the Fedora kernel which touches the msync() code.

The patch from the Fedora kernel is attached. It is fairly high-impact, though; if a less invasive patch is needed, please let me know.

Marked important because LSB 3.1 compatibility has been identified as a release goal.

Date: Wed, 19 Jul 2006 00:03:33 +0200
From: Peter Zijlstra [EMAIL PROTECTED]
Subject: Re: [RHEL5][PATCH 1/8] mm: tracking shared dirty pages

Respin against current Rawhide kernel. The other patches apply with a little offset/fuzz but end up rightly.
It even compiles :-)

Don, is this enough, or would you like me to repost the whole series (minus 8/8) fuzzless?

---
From: Peter Zijlstra [EMAIL PROTECTED]

Tracking of dirty pages in shared writeable mmap()s.

The idea is simple: write protect clean shared writeable pages, catch the write-fault, make writeable and set dirty. On page write-back clean all the PTE dirty bits and write protect them once again.

The implementation is a tad harder, mainly because the default backing_dev_info capabilities were too loosely maintained. Hence it is not enough to test the backing_dev_info for cap_account_dirty.

The current heuristic is as follows, a VMA is eligible when:
 - it's shared writeable
    (vm_flags & (VM_WRITE|VM_SHARED)) == (VM_WRITE|VM_SHARED)
 - it is not a 'special' mapping
    (vm_flags & (VM_PFNMAP|VM_INSERTPAGE)) == 0
 - the backing_dev_info is cap_account_dirty
    mapping_cap_account_dirty(vma->vm_file->f_mapping)
 - f_op->mmap() didn't change the default page protection

Pages from remap_pfn_range() are explicitly excluded because their COW semantics are already horrid enough (see vm_normal_page() in do_wp_page()) and because they don't have a backing store anyway.

mprotect() is taught about the new behaviour as well. However it fudges the last condition.

Cleaning the pages on write-back is done with page_mkclean(), a new rmap call. It cleans and wrprotects all PTEs of dirty accountable pages.

Finally, in fs/buffer.c:try_to_free_buffers(); remove clear_page_dirty() from under ->private_lock. This seems to be safe, since ->private_lock is used to serialize access to the buffers, not the page itself. This is needed because clear_page_dirty() will call into page_mkclean() and would thereby violate locking order.
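The eligibility heuristic listed above can be restated as a small predicate. What follows is a userspace sketch under stated assumptions, not the kernel code from the patch: struct vma_model, its two boolean fields and the function name vma_wants_dirty_tracking() are hypothetical stand-ins, and the flag values are illustrative placeholders for the definitions in include/linux/mm.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; the real ones are defined by the kernel. */
#define VM_WRITE      0x00000002UL
#define VM_SHARED     0x00000008UL
#define VM_PFNMAP     0x00000400UL
#define VM_INSERTPAGE 0x02000000UL

/* Hypothetical model of the VMA properties the heuristic looks at. */
struct vma_model {
    unsigned long vm_flags;
    /* stands in for mapping_cap_account_dirty(vma->vm_file->f_mapping) */
    bool backing_dev_accounts_dirty;
    /* true if f_op->mmap() changed the default page protection */
    bool mmap_changed_page_prot;
};

/* Returns true when all four conditions from the description hold. */
bool vma_wants_dirty_tracking(const struct vma_model *vma)
{
    /* 1. shared writeable: both bits must be set */
    if ((vma->vm_flags & (VM_WRITE | VM_SHARED)) != (VM_WRITE | VM_SHARED))
        return false;

    /* 2. not a 'special' mapping */
    if ((vma->vm_flags & (VM_PFNMAP | VM_INSERTPAGE)) != 0)
        return false;

    /* 3. the backing device accounts dirty pages */
    if (!vma->backing_dev_accounts_dirty)
        return false;

    /* 4. the driver left the default page protection alone */
    if (vma->mmap_changed_page_prot)
        return false;

    return true;
}
```

Note that the first test compares against the full mask rather than checking for a non-zero result, so a private writeable mapping (VM_WRITE without VM_SHARED) is rejected.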
Signed-off-by: Peter Zijlstra [EMAIL PROTECTED]
Cc: Hugh Dickins [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 fs/buffer.c          |    2 -
 include/linux/mm.h   |   34 ++
 include/linux/rmap.h |    8 ++
 mm/memory.c          |   29 ++
 mm/mmap.c            |   10 +++
 mm/mprotect.c        |   21 ++--
 mm/page-writeback.c  |   17 ++---
 mm/rmap.c            |   65 +++
 8 files changed, 156 insertions(+), 30 deletions(-)

Index: latest/fs/buffer.c
===
--- latest.orig/fs/buffer.c
+++ latest/fs/buffer.c
@@ -2984,6 +2984,7 @@ int try_to_free_buffers(struct page *pag
 
 	spin_lock(&mapping->private_lock);
 	ret = drop_buffers(page, &buffers_to_free);
+	spin_unlock(&mapping->private_lock);
 	if (ret) {
 		/*
 		 * If the filesystem writes its buffers by hand (eg ext3)
@@ -2995,7 +2996,6 @@ int try_to_free_buffers(struct page *pag
 		 */
 		clear_page_dirty(page);
 	}
-	spin_unlock(&mapping->private_lock);
 out:
 	if (buffers_to_free) {
 		struct buffer_head *bh = buffers_to_free;
Index: latest/include/linux/mm.h