Re: bad rss-counter message in 3.14rc5
On Thu, 20 Mar 2014, Sasha Levin wrote:
> On 03/20/2014 09:51 AM, Dave Jones wrote:
> > On Wed, Mar 19, 2014 at 10:00:29PM -0700, Hugh Dickins wrote:
> > > > This might be collateral damage from the swapops thing, I guess we won't know until
> > > > that gets fixed, but I thought I'd mention that we might still have a problem here.
> > >
> > > Yes, those Bad rss-counters could well be collateral damage from the
> > > swapops BUG. To which I believe I now have the answer: again untested,
> > > but please give this a try...
> >
> > This survived an overnight run. No swapops bug, and no bad RSS. Good job:)
>
> Same here, swapops bug is gone!

That was welcome news, thanks guys.

I notice it has not (yet) magically appeared in Linus's public tree
like the rss one did: so to be on the safe side, I'll just repost it
now, with your Reported-and-tested-bys, otherwise unchanged.

Hugh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: bad rss-counter message in 3.14rc5
On 03/20/2014 09:51 AM, Dave Jones wrote:
> On Wed, Mar 19, 2014 at 10:00:29PM -0700, Hugh Dickins wrote:
> > > This might be collateral damage from the swapops thing, I guess we won't know until
> > > that gets fixed, but I thought I'd mention that we might still have a problem here.
> >
> > Yes, those Bad rss-counters could well be collateral damage from the
> > swapops BUG. To which I believe I now have the answer: again untested,
> > but please give this a try...
>
> This survived an overnight run. No swapops bug, and no bad RSS. Good job:)

Same here, swapops bug is gone!

Thanks,
Sasha
Re: bad rss-counter message in 3.14rc5
On Wed, Mar 19, 2014 at 10:00:29PM -0700, Hugh Dickins wrote:
> > This might be collateral damage from the swapops thing, I guess we won't know until
> > that gets fixed, but I thought I'd mention that we might still have a problem here.
>
> Yes, those Bad rss-counters could well be collateral damage from the
> swapops BUG. To which I believe I now have the answer: again untested,
> but please give this a try...

This survived an overnight run. No swapops bug, and no bad RSS. Good job :)

> (It's worth saying, by the way, that these bugs are not a consequence
> of recent changes at all, they've been there for ages; but trinity has
> just got better at taunting remap_file_pages and the rest of mm...)

Indeed. I hope to lift the covers on more stuff like this (and hopefully
get it done in a more reproducible manner). A lot of the stuff trinity
is doing with VM syscalls is still very naive.

	Dave
Re: bad rss-counter message in 3.14rc5
On Wed, 19 Mar 2014, Dave Jones wrote:
> On Tue, Mar 18, 2014 at 07:19:09PM -0700, Hugh Dickins wrote:
> > Another positive on the rss counters, great, thanks Dave.
> > That encourages me to think again on the swapops BUG, but no promises.
>
> So while I slept I ran a test kernel with that swapops BUG replaced with a printk.
> I'm not sure of the validity of this, given the state of the kernel afterwards
> is somewhat suspect, but I did see in the logs this morning..
>
> [18728.075153] migration_entry_to_page BUG hit
> [18728.200705] BUG: Bad rss-counter state mm:880241b3f500 idx:0 val:1 (Not tainted)
> [18728.200706] BUG: Bad rss-counter state mm:880241b3f500 idx:1 val:-1 (Not tainted)
>
> This might be collateral damage from the swapops thing, I guess we won't know until
> that gets fixed, but I thought I'd mention that we might still have a problem here.

Yes, those Bad rss-counters could well be collateral damage from the
swapops BUG. To which I believe I now have the answer: again untested,
but please give this a try...

(It's worth saying, by the way, that these bugs are not a consequence
of recent changes at all, they've been there for ages; but trinity has
just got better at taunting remap_file_pages and the rest of mm...)

[PATCH] mm: fix swapops.h:131 bug if remap_file_pages raced migration

Add remove_linear_migration_ptes_from_nonlinear(), to fix an interesting
little include/linux/swapops.h:131 BUG_ON(!PageLocked) found by trinity:
indicating that remove_migration_ptes() failed to find one of the
migration entries that was temporarily inserted.

The problem comes from remap_file_pages()'s switch from vma_interval_tree
(good for inserting the migration entry) to i_mmap_nonlinear list (no good
for locating it again); but can only be a problem if the remap_file_pages()
range does not cover the whole of the vma (zap_pte() clears the range).

remove_migration_ptes() needs a file_nonlinear method to go down the
i_mmap_nonlinear list, applying linear location to look for migration
entries in those vmas too, just in case there was this race.

The file_nonlinear method does need rmap_walk_control.arg to do this;
but it never needed vma passed in - vma comes from its own iteration.

Signed-off-by: Hugh Dickins <hu...@google.com>
---

 include/linux/rmap.h |    3 +--
 mm/migrate.c         |   32 ++++++++++++++++++++++++++++++++
 mm/rmap.c            |    5 +++--
 3 files changed, 36 insertions(+), 4 deletions(-)

--- 3.14-rc7/include/linux/rmap.h	2014-02-02 18:49:07.429302104 -0800
+++ linux/include/linux/rmap.h	2014-03-19 20:12:27.056451541 -0700
@@ -250,8 +250,7 @@ struct rmap_walk_control {
 	int (*rmap_one)(struct page *page, struct vm_area_struct *vma,
 					unsigned long addr, void *arg);
 	int (*done)(struct page *page);
-	int (*file_nonlinear)(struct page *, struct address_space *,
-					struct vm_area_struct *vma);
+	int (*file_nonlinear)(struct page *, struct address_space *, void *arg);
 	struct anon_vma *(*anon_lock)(struct page *page);
 	bool (*invalid_vma)(struct vm_area_struct *vma, void *arg);
 };
--- 3.14-rc7/mm/migrate.c	2014-03-16 19:24:19.635512576 -0700
+++ linux/mm/migrate.c	2014-03-19 21:06:02.704527965 -0700
@@ -178,6 +178,37 @@ out:
 }

 /*
+ * Congratulations to trinity for discovering this bug.
+ * mm/fremap.c's remap_file_pages() accepts any range within a single vma to
+ * convert that vma to VM_NONLINEAR; and generic_file_remap_pages() will then
+ * replace the specified range by file ptes throughout (maybe populated after).
+ * If page migration finds a page within that range, while it's still located
+ * by vma_interval_tree rather than lost to i_mmap_nonlinear list, no problem:
+ * zap_pte() clears the temporary migration entry before mmap_sem is dropped.
+ * But if the migrating page is in a part of the vma outside the range to be
+ * remapped, then it will not be cleared, and remove_migration_ptes() needs to
+ * deal with it.  Fortunately, this part of the vma is of course still linear,
+ * so we just need to use linear location on the nonlinear list.
+ */
+static int remove_linear_migration_ptes_from_nonlinear(struct page *page,
+				struct address_space *mapping, void *arg)
+{
+	struct vm_area_struct *vma;
+	/* hugetlbfs does not support remap_pages, so no huge pgoff worries */
+	pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
+	unsigned long addr;
+
+	list_for_each_entry(vma,
+		&mapping->i_mmap_nonlinear, shared.nonlinear) {
+
+		addr = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
+		if (addr >= vma->vm_start && addr < vma->vm_end)
+			remove_migration_pte(page, vma, addr, arg);
+	}
+	return SWAP_AGAIN;
+}
+
+/*
  * Get rid of all migration entries and replace them by
  * references to the indicated page.
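For anyone following the address arithmetic in the patch, here is a minimal user-space sketch of the "linear location" check. It is not kernel code: `sketch_vma`, `sketch_linear_addr()` and the 4 KiB `SKETCH_PAGE_SHIFT` are made-up stand-ins for illustration. Only the formula `vm_start + ((pgoff - vm_pgoff) << PAGE_SHIFT)` and the `vm_start <= addr < vm_end` range test are taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch only: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical stand-in for the few vm_area_struct fields the patch uses. */
struct sketch_vma {
	uint64_t vm_start;	/* first address covered by the mapping */
	uint64_t vm_end;	/* first address past the mapping */
	uint64_t vm_pgoff;	/* file offset of vm_start, in pages */
};

/*
 * Compute where file page `pgoff` would sit in `vma` if the vma were still
 * laid out linearly. Returns true and stores the address in *addr when the
 * result falls inside the vma, false otherwise.
 */
static bool sketch_linear_addr(const struct sketch_vma *vma, uint64_t pgoff,
			       uint64_t *addr)
{
	/*
	 * Unsigned wraparound is deliberate: a pgoff below vm_pgoff yields an
	 * address outside [vm_start, vm_end), which the range test rejects.
	 */
	uint64_t a = vma->vm_start +
		     ((pgoff - vma->vm_pgoff) << SKETCH_PAGE_SHIFT);

	if (a < vma->vm_start || a >= vma->vm_end)
		return false;
	*addr = a;
	return true;
}
```

With a four-page vma at 0x10000 whose first page maps file page 5, file page 6 lands at 0x11000, while file pages 4 and 9 fall outside the vma and are skipped. That skipping is what lets the loop in the patch walk every vma on the nonlinear list and only touch those where the page could still be linearly mapped.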
Re: bad rss-counter message in 3.14rc5
On Tue, Mar 18, 2014 at 07:19:09PM -0700, Hugh Dickins wrote:
> Another positive on the rss counters, great, thanks Dave.
> That encourages me to think again on the swapops BUG, but no promises.

So while I slept I ran a test kernel with that swapops BUG replaced with a printk.
I'm not sure of the validity of this, given the state of the kernel afterwards
is somewhat suspect, but I did see in the logs this morning..

[18728.075153] migration_entry_to_page BUG hit
[18728.200705] BUG: Bad rss-counter state mm:880241b3f500 idx:0 val:1 (Not tainted)
[18728.200706] BUG: Bad rss-counter state mm:880241b3f500 idx:1 val:-1 (Not tainted)

This might be collateral damage from the swapops thing, I guess we won't know until
that gets fixed, but I thought I'd mention that we might still have a problem here.

	Dave
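The two rss-counter lines in that log pair up: one counter is left at +1 and another at -1. Below is a rough user-space model of how such a pair of reports could arise. It assumes, as far as I understand the 3.14 sources, that the per-mm counters are checked for zero when the mm is finally freed, and that idx 0 and idx 1 are the file and anon page counters. Every name here (`sketch_mm`, `sketch_check_mm()`, the `SKETCH_*` indices) is hypothetical, and the mismatch it simulates is only one possible illustration, not the diagnosed cause in this thread.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the per-mm counter indices (assumed order). */
enum sketch_counter {
	SKETCH_FILEPAGES,	/* idx 0: file-backed pages */
	SKETCH_ANONPAGES,	/* idx 1: anonymous pages */
	SKETCH_SWAPENTS,	/* idx 2: swap entries */
	SKETCH_NR_COUNTERS
};

struct sketch_mm {
	long rss[SKETCH_NR_COUNTERS];
};

/* Report every counter that is not back at zero; return how many were bad. */
static int sketch_check_mm(const struct sketch_mm *mm)
{
	int bad = 0;

	for (int i = 0; i < SKETCH_NR_COUNTERS; i++) {
		if (mm->rss[i] != 0) {
			printf("Bad rss-counter state idx:%d val:%ld\n",
			       i, mm->rss[i]);
			bad++;
		}
	}
	return bad;
}

/*
 * Model one mismatch: a page is counted as a file page when it is mapped
 * but as an anon page when it is unmapped. The totals cancel, yet two
 * individual counters end up non-zero.
 */
static int sketch_mismatched_accounting(void)
{
	struct sketch_mm mm = { { 0 } };

	mm.rss[SKETCH_FILEPAGES]++;	/* map: accounted as file */
	mm.rss[SKETCH_ANONPAGES]--;	/* unmap: accounted as anon */
	return sketch_check_mm(&mm);
}
```

In this model the check reports idx:0 val:1 and idx:1 val:-1, the same shape as the log above. Whether the kernel got there the same way is exactly what could not be judged while the swapops BUG was still outstanding.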
Re: bad rss-counter message in 3.14rc5
On Tue, Mar 18, 2014 at 05:38:38PM -0700, Hugh Dickins wrote:
>
> (Cyrill, entirely unrelated, but in preparing this patch I noticed
> your soft_dirty work in install_file_pte(): which looked good at
> first, until I realized that it's propagating the soft_dirty of a
> pte it's about to zap completely, to the unrelated entry it's about
> to insert in its place. Which seems very odd to me.)
>

Thanks a lot Hugh for pointing this out! I'll revisit all file-softdirty cases.

(btw, I've grabbed Dave's config to run trinity, to help with testing and
to try to figure out what causes it, but I haven't yet found a hardware node
to run it on; hopefully I'll get a spare machine for testing in a couple of days).
Re: bad rss-counter message in 3.14rc5
On Tue 18-03-14 19:37:01, Hugh Dickins wrote:
> On Tue, 18 Mar 2014, Linus Torvalds wrote:
> > On Tue, Mar 18, 2014 at 7:06 PM, Hugh Dickins wrote:
> > >
> > > I'd love that, if we can get away with it now: depends very
> > > much on whether we then turn out to break userspace or not.
> >
> > Right. I suspect we can, though, but it's one of those "we can try it
> > and see". Remind me early in the 3.15 merge window, and we can just
> > turn the "force" case into an error case and see if anybody hollers.
>
> Super, I'll do that, thanks.
>
> For 3.15, and probably 3.16 too, we should keep in place whatever
> partial accommodations we have for the case (such as allowing for
> anon and swap in fremap's zap_pte), in case we do need to revert;
> but clean those away later on. (Not many, I think: it was mainly
> a guilty secret that VM accounting didn't really hold together.)

Different drivers actually use the 'force' argument of get_user_pages()
a lot on userspace provided buffers (AFAIU because they want to tell the
kernel HW is going to write to that memory so they want to prepare for it).
It is hard to imagine someone will use this for MAP_SHARED pages (or what
that would be supposed to achieve) but sometimes userspace is surprisingly
inventive... Just something to be aware of...

								Honza
--
Jan Kara
SUSE Labs, CR
Re: bad rss-counter message in 3.14rc5
On Tue, Mar 18, 2014 at 7:37 PM, Hugh Dickins wrote:
>
> For 3.15, and probably 3.16 too, we should keep in place whatever
> partial accommodations we have for the case (such as allowing for
> anon and swap in fremap's zap_pte), in case we do need to revert;
> but clean those away later on. (Not many, I think: it was mainly
> a guilty secret that VM accounting didn't really hold together.)

Absolutely. See if it works to just stop doing that special COW, and
then later on, if we have decided "nobody even noticed", we can remove
the hacks we have to support the fact that shared mappings sometimes
have anon pages in them.

> :) That fits with what I heard of HP-UX mmap,
> but I never had the pleasure of dealing with it.

They had purely virtually indexed caches, making coherency
"interesting". Together with a VM based on some really old BSD VM code
that everybody else had thrown out, and that didn't allow you to unmap
things partially etc. So HPUX mmap really didn't work, not even for
non-shared mmap's.

I think they fixed the interfaces in HP-UX 11. But not being coherent
meant that the shared mappings tended to still have trouble. nntp
largely died, but was replaced with the cyrus imapd that played
similar games.

At least our mmap was always coherent. Even in MAP_PRIVATE, and with
regards to both write() system calls and other mmap PROT_WRITE users.

Except when we had bugs. Shared mmap really isn't very simple to get right.

             Linus
Re: bad rss-counter message in 3.14rc5
On 03/18/2014 10:12 PM, Hugh Dickins wrote:
> On Tue, 18 Mar 2014, Sasha Levin wrote:
> > On 03/18/2014 08:38 PM, Hugh Dickins wrote:
> > > On Tue, 11 Mar 2014, Dave Jones wrote:
> > > > On Tue, Mar 11, 2014 at 09:36:03PM +0400, Cyrill Gorcunov wrote:
> > > > > On Tue, Mar 11, 2014 at 01:10:45PM -0400, Dave Jones wrote:
> > > > > > > Dave, iirc trinity can write log file pointing which exactly syscall sequence
> > > > > > > was passed, right? Share it too please.
> > > > > >
> > > > > > Hm, I may have been mistaken, and the damage was done by a previous run.
> > > > > > I went from being able to reproduce it almost instantly to now not being able
> > > > > > to reproduce it at all. Will keep trying.
> > > > >
> > > > > Sasha already gave a link to the syscalls sequence, so no rush.
> > > >
> > > > It'd be nice to get a more concise reproducer, his list had a little of
> > > > everything in there.
> > >
> > > I've so far failed to find any explanation for your swapops.h BUG;
> > > but believe I have identified one cause for "Bad rss-counter"s.
> > >
> > > My hunch is that the swapops.h BUG is "nearby", but I just cannot
> > > fit it together (the swapops.h BUG comes when rmap cannot find
> > > all the migration entries it inserted earlier: it's a very useful
> > > BUG for validating rmap).
> > >
> > > Untested patch below: I can't quite say Reported-by, because it may
> > > not even be one that you and Sasha have been seeing; but I'm hopeful,
> > > remap_file_pages is in the list.
> > >
> > > Please give this a try, preferably on 3.14-rc or earlier: I've never
> > > seen "Bad rss-counter"s there myself (trinity uses remap_file_pages
> > > a lot more than most of us); but have seen them on mmotm/next, so
> > > some other trigger is coming up there, I'll worry about that once
> > > it reaches 3.15-rc.
> >
> > The patch fixed the "Bad rss-counter" errors I've been seeing both in
> > 3.14-rc7 and -next.
>
> Great, thanks a lot, Sasha.
>
> I was afraid that you'd hit those swapops BUGs, which seemed perhaps
> to be paired with these; but glad to hear a positive.
>
> Let's see how Dave fares.
>
> (I've not forgotten shmem fallocate, by the way, but those probably
> aren't as high on my agenda as you'd like.)

I do hit the swapops issue a lot, I didn't think that your patch was
supposed to fix that so I didn't mention it.

Thanks for keeping shmem in mind, I've removed shmem from testing for now
but I agree, it's not one of the more important issues to be taken care of.

Thanks,
Sasha
Re: bad rss-counter message in 3.14rc5
On Tue, 18 Mar 2014, Linus Torvalds wrote:
> On Tue, Mar 18, 2014 at 7:06 PM, Hugh Dickins wrote:
> >
> > I'd love that, if we can get away with it now: depends very
> > much on whether we then turn out to break userspace or not.
>
> Right. I suspect we can, though, but it's one of those "we can try it
> and see". Remind me early in the 3.15 merge window, and we can just
> turn the "force" case into an error case and see if anybody hollers.

Super, I'll do that, thanks.

For 3.15, and probably 3.16 too, we should keep in place whatever
partial accommodations we have for the case (such as allowing for
anon and swap in fremap's zap_pte), in case we do need to revert;
but clean those away later on. (Not many, I think: it was mainly
a guilty secret that VM accounting didn't really hold together.)

> > If I remember correctly, it's been that way since early days,
> > in case ptrace were used to put a breakpoint into a MAP_SHARED
> > mapping of an executable: to prevent that modification from
> > reaching the file, if the file happened to be opened O_RDWR.
> > Usually it's not open for writing, and mapped MAP_PRIVATE anyway.
>
> Yes, it's been that way since the very beginning, I think it goes back
> pretty much as far as MAP_SHARED does.
>
> We used to play lots of games wrt MAP_SHARED - in fact I think we used
> to silently turn a MAP_SHARED RO mapping into MAP_PRIVATE because for
> the longest time there was no "true" writable MAP_SHARED at all, but
> we did have a coherent MAP_PRIVATE and something like the indexer for
> nntpd wanted a read-only shared mapping of the nntp spool or something
> like that. I forget the details, it's a _loong_ time ago.
>
> So the whole "force turns a MAP_SHARED page into MAP_PRIVATE" all used
> to make a lot more sense in that kind of situation, when MAP_SHARED vs
> MAP_PRIVATE was much less of a black-and-white thing.
>
> I really suspect nobody cares wrt ptrace, especially since presumably
> other systems haven't had those kinds of games (although who knows -
> HP-UX in particular had some of the shittiest mmap() implementations
> on the planet - it made even the original Linux mmap hacks look like a
> thing of pure beauty in comparison).

:) That fits with what I heard of HP-UX mmap,
but I never had the pleasure of dealing with it.

Hugh
Re: bad rss-counter message in 3.14rc5
On Tue, Mar 18, 2014 at 7:06 PM, Hugh Dickins wrote:
>
> I'd love that, if we can get away with it now: depends very
> much on whether we then turn out to break userspace or not.

Right. I suspect we can, though, but it's one of those "we can try it
and see". Remind me early in the 3.15 merge window, and we can just
turn the "force" case into an error case and see if anybody hollers.

> If I remember correctly, it's been that way since early days,
> in case ptrace were used to put a breakpoint into a MAP_SHARED
> mapping of an executable: to prevent that modification from
> reaching the file, if the file happened to be opened O_RDWR.
> Usually it's not open for writing, and mapped MAP_PRIVATE anyway.

Yes, it's been that way since the very beginning, I think it goes back
pretty much as far as MAP_SHARED does.

We used to play lots of games wrt MAP_SHARED - in fact I think we used
to silently turn a MAP_SHARED RO mapping into MAP_PRIVATE because for
the longest time there was no "true" writable MAP_SHARED at all, but
we did have a coherent MAP_PRIVATE and something like the indexer for
nntpd wanted a read-only shared mapping of the nntp spool or something
like that. I forget the details, it's a _loong_ time ago.

So the whole "force turns a MAP_SHARED page into MAP_PRIVATE" all used
to make a lot more sense in that kind of situation, when MAP_SHARED vs
MAP_PRIVATE was much less of a black-and-white thing.

I really suspect nobody cares wrt ptrace, especially since presumably
other systems haven't had those kinds of games (although who knows -
HP-UX in particular had some of the shittiest mmap() implementations
on the planet - it made even the original Linux mmap hacks look like a
thing of pure beauty in comparison).

              Linus
Re: bad rss-counter message in 3.14rc5
On Tue, 18 Mar 2014, Dave Jones wrote:
> On Tue, Mar 18, 2014 at 10:06:02PM -0400, Dave Jones wrote:
> > On Tue, Mar 18, 2014 at 09:32:36PM -0400, Sasha Levin wrote:
> >
> > > > Untested patch below: I can't quite say Reported-by, because it may
> > > > not even be one that you and Sasha have been seeing; but I'm hopeful,
> > > > remap_file_pages is in the list.
> > > >
> > > > Please give this a try, preferably on 3.14-rc or earlier: I've never
> > > > seen "Bad rss-counter"s there myself (trinity uses remap_file_pages
> > > > a lot more than most of us); but have seen them on mmotm/next, so
> > > > some other trigger is coming up there, I'll worry about that once
> > > > it reaches 3.15-rc.
> > >
> > > The patch fixed the "Bad rss-counter" errors I've been seeing both in
> > > 3.14-rc7 and -next.
> >
> > It's looking good here too so far. I'll leave it running overnight to be sure.
>
> Of course, that isn't going to happen. Immediately after posting this, I hit the
> swapops bug. Patch does seem to have cured the bad rss counters though.

Another positive on the rss counters, great, thanks Dave.
That encourages me to think again on the swapops BUG, but no promises.

Hugh
Re: bad rss-counter message in 3.14rc5
On Tue, 18 Mar 2014, Sasha Levin wrote:
> On 03/18/2014 08:38 PM, Hugh Dickins wrote:
> > On Tue, 11 Mar 2014, Dave Jones wrote:
> > > On Tue, Mar 11, 2014 at 09:36:03PM +0400, Cyrill Gorcunov wrote:
> > > > On Tue, Mar 11, 2014 at 01:10:45PM -0400, Dave Jones wrote:
> > > > > >
> > > > > > Dave, iirc trinity can write log file pointing which exactly syscall sequence
> > > > > > was passed, right? Share it too please.
> > > > >
> > > > > Hm, I may have been mistaken, and the damage was done by a previous run.
> > > > > I went from being able to reproduce it almost instantly to now not being able
> > > > > to reproduce it at all. Will keep trying.
> > > >
> > > > Sasha already gave a link to the syscalls sequence, so no rush.
> > >
> > > It'd be nice to get a more concise reproducer, his list had a little of
> > > everything in there.
> >
> > I've so far failed to find any explanation for your swapops.h BUG;
> > but believe I have identified one cause for "Bad rss-counter"s.
> >
> > My hunch is that the swapops.h BUG is "nearby", but I just cannot
> > fit it together (the swapops.h BUG comes when rmap cannot find
> > all the migration entries it inserted earlier: it's a very useful
> > BUG for validating rmap).
> >
> > Untested patch below: I can't quite say Reported-by, because it may
> > not even be one that you and Sasha have been seeing; but I'm hopeful,
> > remap_file_pages is in the list.
> >
> > Please give this a try, preferably on 3.14-rc or earlier: I've never
> > seen "Bad rss-counter"s there myself (trinity uses remap_file_pages
> > a lot more than most of us); but have seen them on mmotm/next, so
> > some other trigger is coming up there, I'll worry about that once
> > it reaches 3.15-rc.
>
> The patch fixed the "Bad rss-counter" errors I've been seeing both in
> 3.14-rc7 and -next.

Great, thanks a lot, Sasha. I was afraid that you'd hit those swapops
BUGs, which seemed perhaps to be paired with these; but glad to hear
a positive. Let's see how Dave fares.

(I've not forgotten shmem fallocate, by the way, but those probably
aren't as high on my agenda as you'd like.)

Hugh
Re: bad rss-counter message in 3.14rc5
On Tue, Mar 18, 2014 at 10:06:02PM -0400, Dave Jones wrote:
> On Tue, Mar 18, 2014 at 09:32:36PM -0400, Sasha Levin wrote:
>
> > > Untested patch below: I can't quite say Reported-by, because it may
> > > not even be one that you and Sasha have been seeing; but I'm hopeful,
> > > remap_file_pages is in the list.
> > >
> > > Please give this a try, preferably on 3.14-rc or earlier: I've never
> > > seen "Bad rss-counter"s there myself (trinity uses remap_file_pages
> > > a lot more than most of us); but have seen them on mmotm/next, so
> > > some other trigger is coming up there, I'll worry about that once
> > > it reaches 3.15-rc.
> >
> > The patch fixed the "Bad rss-counter" errors I've been seeing both in
> > 3.14-rc7 and -next.
>
> It's looking good here too so far. I'll leave it running overnight to be sure.

Of course, that isn't going to happen. Immediately after posting this, I hit the
swapops bug. Patch does seem to have cured the bad rss counters though.

	Dave
Re: bad rss-counter message in 3.14rc5
On Tue, 18 Mar 2014, Linus Torvalds wrote:
> On Tue, Mar 18, 2014 at 5:38 PM, Hugh Dickins wrote:
> >
> > And yes, it is possible (though very unusual) to find an anon page or
> > swap entry in a VM_SHARED nonlinear mapping: coming from that horrid
> > get_user_pages(write, force) case which COWs even in a shared mapping.
>
> Hmm. Maybe we could just disallow that forced case.
>
> It *used* to be a trivial "we can just do a COW", but that was back
> when the VM was much simpler and we had no rmap's etc. So "that horrid
> case" used to be a simple hack that wasn't painful. But I suspect we
> could very easily just fail it instead of forcing a COW, if that would
> make it simpler for the VM code.

I'd love that, if we can get away with it now: depends very
much on whether we then turn out to break userspace or not.

If I remember correctly, it's been that way since early days,
in case ptrace were used to put a breakpoint into a MAP_SHARED
mapping of an executable: to prevent that modification from
reaching the file, if the file happened to be opened O_RDWR.
Usually it's not open for writing, and mapped MAP_PRIVATE anyway.

That is still something worth protecting against, I presume; but
I'd much rather do it by failing the awkward case, than by perverting
the VM to break its own rules.

If I'm not mistaken, Konstantin (who happens to be already on this
Cc list) had a patch (that I hated) to complicate things, to fix up
some of the inconsistencies arising from this very odd and overlooked
corner-case. I think he'd prefer this simplification to his patch too.

I'll look into it further, but not in haste.

Hugh
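The difference between "force a COW" and "fail the awkward case" can be sketched as a small userspace decision table. This is only an illustration of the two policies as they are described in this exchange: the names toy_gup_result, toy_gup_write_old and toy_gup_write_proposed are hypothetical, none of it is kernel API, and it deliberately ignores the finer permission checks the real get_user_pages() path makes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome labels for a write through get_user_pages(); not kernel API. */
enum toy_gup_result { TOY_WRITE_IN_PLACE, TOY_COW_PRIVATE_COPY, TOY_FAIL };

/*
 * Behaviour as described in the thread: a forced write to a mapping that
 * is not writable gets a private COW copy, even if the mapping is shared.
 */
static enum toy_gup_result toy_gup_write_old(bool shared, bool writable, bool force)
{
	if (writable)
		return shared ? TOY_WRITE_IN_PLACE : TOY_COW_PRIVATE_COPY;
	if (!force)
		return TOY_FAIL;
	return TOY_COW_PRIVATE_COPY;	/* even when shared: the "horrid" case */
}

/* Proposed simplification: fail the forced write when the mapping is shared. */
static enum toy_gup_result toy_gup_write_proposed(bool shared, bool writable, bool force)
{
	if (!writable && force && shared)
		return TOY_FAIL;
	return toy_gup_write_old(shared, writable, force);
}

/* Small self-check of the cases discussed above. */
static void toy_selfcheck(void)
{
	/* ptrace-style forced write into a read-only shared mapping */
	assert(toy_gup_write_old(true, false, true) == TOY_COW_PRIVATE_COPY);
	assert(toy_gup_write_proposed(true, false, true) == TOY_FAIL);
	/* the usual MAP_PRIVATE executable mapping is unaffected */
	assert(toy_gup_write_proposed(false, false, true) == TOY_COW_PRIVATE_COPY);
}
```

Only the shared, non-writable, forced row changes between the two policies, which is why the question in the thread is whether any userspace actually relies on that row.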
Re: bad rss-counter message in 3.14rc5
On Tue, Mar 18, 2014 at 09:32:36PM -0400, Sasha Levin wrote:

> > Untested patch below: I can't quite say Reported-by, because it may
> > not even be one that you and Sasha have been seeing; but I'm hopeful,
> > remap_file_pages is in the list.
> >
> > Please give this a try, preferably on 3.14-rc or earlier: I've never
> > seen "Bad rss-counter"s there myself (trinity uses remap_file_pages
> > a lot more than most of us); but have seen them on mmotm/next, so
> > some other trigger is coming up there, I'll worry about that once
> > it reaches 3.15-rc.
>
> The patch fixed the "Bad rss-counter" errors I've been seeing both in
> 3.14-rc7 and -next.

It's looking good here too so far. I'll leave it running overnight to be sure.

	Dave
Re: bad rss-counter message in 3.14rc5
On 03/18/2014 08:38 PM, Hugh Dickins wrote:
> On Tue, 11 Mar 2014, Dave Jones wrote:
> > On Tue, Mar 11, 2014 at 09:36:03PM +0400, Cyrill Gorcunov wrote:
> > > On Tue, Mar 11, 2014 at 01:10:45PM -0400, Dave Jones wrote:
> > > > >
> > > > > Dave, iirc trinity can write log file pointing which exactly syscall sequence
> > > > > was passed, right? Share it too please.
> > > >
> > > > Hm, I may have been mistaken, and the damage was done by a previous run.
> > > > I went from being able to reproduce it almost instantly to now not being able
> > > > to reproduce it at all. Will keep trying.
> > >
> > > Sasha already gave a link to the syscalls sequence, so no rush.
> >
> > It'd be nice to get a more concise reproducer, his list had a little of
> > everything in there.
>
> I've so far failed to find any explanation for your swapops.h BUG;
> but believe I have identified one cause for "Bad rss-counter"s.
>
> My hunch is that the swapops.h BUG is "nearby", but I just cannot
> fit it together (the swapops.h BUG comes when rmap cannot find
> all the migration entries it inserted earlier: it's a very useful
> BUG for validating rmap).
>
> Untested patch below: I can't quite say Reported-by, because it may
> not even be one that you and Sasha have been seeing; but I'm hopeful,
> remap_file_pages is in the list.
>
> Please give this a try, preferably on 3.14-rc or earlier: I've never
> seen "Bad rss-counter"s there myself (trinity uses remap_file_pages
> a lot more than most of us); but have seen them on mmotm/next, so
> some other trigger is coming up there, I'll worry about that once
> it reaches 3.15-rc.

The patch fixed the "Bad rss-counter" errors I've been seeing both in
3.14-rc7 and -next.

Thanks,
Sasha
Re: bad rss-counter message in 3.14rc5
On Tue, Mar 18, 2014 at 5:38 PM, Hugh Dickins wrote:
>
> And yes, it is possible (though very unusual) to find an anon page or
> swap entry in a VM_SHARED nonlinear mapping: coming from that horrid
> get_user_pages(write, force) case which COWs even in a shared mapping.

Hmm. Maybe we could just disallow that forced case.

It *used* to be a trivial "we can just do a COW", but that was back
when the VM was much simpler and we had no rmap's etc. So "that horrid
case" used to be a simple hack that wasn't painful. But I suspect we
could very easily just fail it instead of forcing a COW, if that would
make it simpler for the VM code.

            Linus
Re: bad rss-counter message in 3.14rc5
On Tue, 11 Mar 2014, Dave Jones wrote:
> On Tue, Mar 11, 2014 at 09:36:03PM +0400, Cyrill Gorcunov wrote:
> > On Tue, Mar 11, 2014 at 01:10:45PM -0400, Dave Jones wrote:
> > > >
> > > > Dave, iirc trinity can write log file pointing which exactly syscall sequence
> > > > was passed, right? Share it too please.
> > >
> > > Hm, I may have been mistaken, and the damage was done by a previous run.
> > > I went from being able to reproduce it almost instantly to now not being able
> > > to reproduce it at all. Will keep trying.
> >
> > Sasha already gave a link to the syscalls sequence, so no rush.
>
> It'd be nice to get a more concise reproducer, his list had a little of
> everything in there.

I've so far failed to find any explanation for your swapops.h BUG;
but believe I have identified one cause for "Bad rss-counter"s.

My hunch is that the swapops.h BUG is "nearby", but I just cannot
fit it together (the swapops.h BUG comes when rmap cannot find
all the migration entries it inserted earlier: it's a very useful
BUG for validating rmap).

Untested patch below: I can't quite say Reported-by, because it may
not even be one that you and Sasha have been seeing; but I'm hopeful,
remap_file_pages is in the list.

Please give this a try, preferably on 3.14-rc or earlier: I've never
seen "Bad rss-counter"s there myself (trinity uses remap_file_pages
a lot more than most of us); but have seen them on mmotm/next, so
some other trigger is coming up there, I'll worry about that once
it reaches 3.15-rc.

(Cyrill, entirely unrelated, but in preparing this patch I noticed
your soft_dirty work in install_file_pte(): which looked good at
first, until I realized that it's propagating the soft_dirty of a
pte it's about to zap completely, to the unrelated entry it's about
to insert in its place. Which seems very odd to me.)

[PATCH] mm: fix bad rss-counter if remap_file_pages raced migration

Fix some "Bad rss-counter state" reports on exit, arising from the
interaction between page migration and remap_file_pages(): zap_pte()
must count a migration entry when zapping it.

And yes, it is possible (though very unusual) to find an anon page or
swap entry in a VM_SHARED nonlinear mapping: coming from that horrid
get_user_pages(write, force) case which COWs even in a shared mapping.

Signed-off-by: Hugh Dickins
---

 mm/fremap.c | 28 ++--
 1 file changed, 22 insertions(+), 6 deletions(-)

--- 3.14-rc7/mm/fremap.c	2014-01-19 18:40:07.0 -0800
+++ linux/mm/fremap.c	2014-03-18 16:32:39.288612346 -0700
@@ -23,28 +23,44 @@
 
 #include "internal.h"
 
+static int mm_counter(struct page *page)
+{
+	return PageAnon(page) ? MM_ANONPAGES : MM_FILEPAGES;
+}
+
 static void zap_pte(struct mm_struct *mm, struct vm_area_struct *vma,
 			unsigned long addr, pte_t *ptep)
 {
 	pte_t pte = *ptep;
+	struct page *page;
+	swp_entry_t entry;
 
 	if (pte_present(pte)) {
-		struct page *page;
-
 		flush_cache_page(vma, addr, pte_pfn(pte));
 		pte = ptep_clear_flush(vma, addr, ptep);
 		page = vm_normal_page(vma, addr, pte);
 		if (page) {
 			if (pte_dirty(pte))
 				set_page_dirty(page);
+			update_hiwater_rss(mm);
+			dec_mm_counter(mm, mm_counter(page));
 			page_remove_rmap(page);
 			page_cache_release(page);
+		}
+	} else {	/* zap_pte() is not called when pte_none() */
+		if (!pte_file(pte)) {
 			update_hiwater_rss(mm);
-			dec_mm_counter(mm, MM_FILEPAGES);
+			entry = pte_to_swp_entry(pte);
+			if (non_swap_entry(entry)) {
+				if (is_migration_entry(entry)) {
+					page = migration_entry_to_page(entry);
+					dec_mm_counter(mm, mm_counter(page));
+				}
+			} else {
+				free_swap_and_cache(entry);
+				dec_mm_counter(mm, MM_SWAPENTS);
+			}
 		}
-	} else {
-		if (!pte_file(pte))
-			free_swap_and_cache(pte_to_swp_entry(pte));
 		pte_clear_not_present_full(mm, addr, ptep, 0);
 	}
 }
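To see why an uncounted migration entry shows up as a "Bad rss-counter state" at exit, here is a minimal userspace model of the accounting that the patch above changes. It is a sketch under stated assumptions, not kernel code: the toy_* names, the toy_pte enum and the printed message format are all hypothetical, and the "old" path only mirrors what the removed lines of the diff did (count present ptes as file pages, count nothing else).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace model of per-mm RSS accounting; not kernel code. */
enum { MM_FILEPAGES, MM_ANONPAGES, MM_SWAPENTS, NR_MM_COUNTERS };

/* What a toy page-table slot can hold. */
enum toy_pte {
	TOY_NONE,
	TOY_PRESENT_FILE,
	TOY_PRESENT_ANON,
	TOY_SWAP,
	TOY_MIGRATION_FILE,	/* file page temporarily under migration */
	TOY_MIGRATION_ANON,	/* anon page temporarily under migration */
};

struct toy_mm {
	long counter[NR_MM_COUNTERS];
};

/* Which counter a populated slot is charged to. */
static int toy_counter_of(enum toy_pte pte)
{
	switch (pte) {
	case TOY_PRESENT_ANON:
	case TOY_MIGRATION_ANON:
		return MM_ANONPAGES;
	case TOY_SWAP:
		return MM_SWAPENTS;
	default:
		return MM_FILEPAGES;
	}
}

/* Pre-patch shape: only present ptes are uncounted, always as file pages. */
static void toy_zap_old(struct toy_mm *mm, enum toy_pte pte)
{
	assert(pte != TOY_NONE);	/* zap is not called on empty slots */
	if (pte == TOY_PRESENT_FILE || pte == TOY_PRESENT_ANON)
		mm->counter[MM_FILEPAGES]--;
}

/* Post-patch shape: every populated slot uncounts the counter it was charged to. */
static void toy_zap_fixed(struct toy_mm *mm, enum toy_pte pte)
{
	assert(pte != TOY_NONE);
	mm->counter[toy_counter_of(pte)]--;
}

/* Exit-time check: report each nonzero counter and return how many there were. */
static int toy_check_mm(const struct toy_mm *mm)
{
	int i, bad = 0;

	for (i = 0; i < NR_MM_COUNTERS; i++) {
		if (mm->counter[i]) {
			printf("Bad rss-counter state idx:%d val:%ld\n",
			       i, mm->counter[i]);
			bad++;
		}
	}
	return bad;
}

/* Map one file page, let migration replace its pte, then zap the slot. */
static int toy_run(int use_fixed_zap)
{
	struct toy_mm mm = { { 0 } };
	enum toy_pte pte = TOY_PRESENT_FILE;

	mm.counter[toy_counter_of(pte)]++;	/* fault-in charges the page */
	pte = TOY_MIGRATION_FILE;		/* migration keeps the charge */
	if (use_fixed_zap)
		toy_zap_fixed(&mm, pte);
	else
		toy_zap_old(&mm, pte);
	return toy_check_mm(&mm);
}
```

In this model toy_run(0) leaves the file-page counter stuck at 1, because the old zap ignores the migration entry, while toy_run(1) brings every counter back to zero.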
Re: bad rss-counter message in 3.14rc5
On Tue, 18 Mar 2014, Linus Torvalds wrote:
> On Tue, Mar 18, 2014 at 7:06 PM, Hugh Dickins hu...@google.com wrote:
> >
> > I'd love that, if we can get away with it now: depends very
> > much on whether we then turn out to break userspace or not.
>
> Right. I suspect we can, though, but it's one of those "we can try it
> and see". Remind me early in the 3.15 merge window, and we can just
> turn the "force" case into an error case and see if anybody hollers.

Super, I'll do that, thanks.

For 3.15, and probably 3.16 too, we should keep in place whatever
partial accommodations we have for the case (such as allowing for
anon and swap in fremap's zap_pte), in case we do need to revert;
but clean those away later on. (Not many, I think: it was mainly
a guilty secret that VM accounting didn't really hold together.)

> > If I remember correctly, it's been that way since early days,
> > in case ptrace were used to put a breakpoint into a MAP_SHARED
> > mapping of an executable: to prevent that modification from
> > reaching the file, if the file happened to be opened O_RDWR.
> > Usually it's not open for writing, and mapped MAP_PRIVATE anyway.
>
> Yes, it's been that way since the very beginning, I think it goes back
> pretty much as far as MAP_SHARED does.
>
> We used to play lots of games wrt MAP_SHARED - in fact I think we used
> to silently turn a MAP_SHARED RO mapping into MAP_PRIVATE because for
> the longest time there was no "true" writable MAP_SHARED at all, but
> we did have a coherent MAP_PRIVATE and something like the indexer for
> nntpd wanted a read-only shared mapping of the nntp spool or something
> like that. I forget the details, it's a _loong_ time ago.
>
> So the whole "force turns a MAP_SHARED page into MAP_PRIVATE" all used
> to make a lot more sense in that kind of situation, when MAP_SHARED vs
> MAP_PRIVATE was much less of a black-and-white thing.
>
> I really suspect nobody cares wrt ptrace, especially since presumably
> other systems haven't had those kinds of games (although who knows -
> HP-UX in particular had some of the shittiest mmap() implementations
> on the planet - it made even the original Linux mmap hacks look like a
> thing of pure beauty in comparison).

:) That fits with what I heard of HP-UX mmap,
but I never had the pleasure of dealing with it.

Hugh
Re: bad rss-counter message in 3.14rc5
On 03/18/2014 10:12 PM, Hugh Dickins wrote:
> On Tue, 18 Mar 2014, Sasha Levin wrote:
> > On 03/18/2014 08:38 PM, Hugh Dickins wrote:
> > > On Tue, 11 Mar 2014, Dave Jones wrote:
> > > > On Tue, Mar 11, 2014 at 09:36:03PM +0400, Cyrill Gorcunov wrote:
> > > > > On Tue, Mar 11, 2014 at 01:10:45PM -0400, Dave Jones wrote:
> > > > > > > Dave, iirc trinity can write log file pointing which exactly syscall sequence was passed, right? Share it too please.
> > > > > >
> > > > > > Hm, I may have been mistaken, and the damage was done by a previous run. I went from being able to reproduce it almost instantly to now not being able to reproduce it at all. Will keep trying.
> > > > >
> > > > > Sasha already gave a link to the syscalls sequence, so no rush.
> > > >
> > > > It'd be nice to get a more concise reproducer, his list had a little of everything in there.
> > >
> > > I've so far failed to find any explanation for your swapops.h BUG; but believe I have identified one cause for Bad rss-counters. My hunch is that the swapops.h BUG is nearby, but I just cannot fit it together (the swapops.h BUG comes when rmap cannot find all the migration entries it inserted earlier: it's a very useful BUG for validating rmap).
> > >
> > > Untested patch below: I can't quite say Reported-by, because it may not even be one that you and Sasha have been seeing; but I'm hopeful, remap_file_pages is in the list.
> > >
> > > Please give this a try, preferably on 3.14-rc or earlier: I've never seen Bad rss-counters there myself (trinity uses remap_file_pages a lot more than most of us); but have seen them on mmotm/next, so some other trigger is coming up there, I'll worry about that once it reaches 3.15-rc.
> >
> > The patch fixed the Bad rss-counter errors I've been seeing both in 3.14-rc7 and -next.
>
> Great, thanks a lot, Sasha. I was afraid that you'd hit those swapops BUGs, which seemed perhaps to be paired with these; but glad to hear a positive. Let's see how Dave fares.
>
> (I've not forgotten shmem fallocate, by the way, but those probably aren't as high on my agenda as you'd like.)

I do hit the swapops issue a lot, I didn't think that your patch was supposed to fix that so I didn't mention it.

Thanks for keeping shmem in mind, I've removed shmem from testing for now but I agree, it's not one of the more important issues to be taken care of.

Thanks,
Sasha
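Hugh's one-line description of the swapops.h BUG quoted above ("rmap cannot find all the migration entries it inserted earlier") can be pictured with a toy model. This is a Python sketch, not kernel code and not the actual bug: every class and function name is invented for illustration, and the `reachable` argument is an assumption standing in for "what the reverse-map walk manages to find".

```python
class Page:
    def __init__(self, name):
        self.name = name
        self.locked = False


class MigrationEntry:
    """Placeholder left in a pte while the page it referred to is being moved."""

    def __init__(self, page):
        self.page = page


def migration_entry_to_page(entry):
    # Models the sanity check: a live migration entry implies a locked page.
    if not entry.page.locked:
        raise RuntimeError("BUG: migration entry for unlocked page " + entry.page.name)
    return entry.page


def migrate(ptes, old, new, reachable):
    """Move `old` to `new`; `reachable` is the set of addresses the
    restoring walk can find (an assumption of this toy)."""
    old.locked = new.locked = True
    for addr in list(ptes):
        if ptes[addr] is old:
            ptes[addr] = MigrationEntry(old)    # pass 1: insert entries
    for addr in reachable:
        if isinstance(ptes.get(addr), MigrationEntry):
            ptes[addr] = new                    # pass 2: restore what is found
    old.locked = new.locked = False


def zap(ptes):
    """Tear down every pte, as at munmap or exit."""
    for addr in list(ptes):
        val = ptes.pop(addr)
        if isinstance(val, MigrationEntry):
            migration_entry_to_page(val)        # trips on a leftover entry
```

If the second pass misses even one address, the stale entry sits there harmlessly until something tears the mapping down, and only then does the check fire.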
Re: bad rss-counter message in 3.14rc5
On Tue, Mar 18, 2014 at 7:37 PM, Hugh Dickins hu...@google.com wrote:
> For 3.15, and probably 3.16 too, we should keep in place whatever partial accommodations we have for the case (such as allowing for anon and swap in fremap's zap_pte), in case we do need to revert; but clean those away later on. (Not many, I think: it was mainly a guilty secret that VM accounting didn't really hold together.)

Absolutely. See if it works to just stop doing that special COW, and then later on, if we have decided nobody even noticed, we can remove the hacks we have to support the fact that shared mappings sometimes have anon pages in them.

> :) That fits with what I heard of HP-UX mmap, but I never had the pleasure of dealing with it.

They had purely virtually indexed caches, making coherency interesting. Together with a VM based on some really old BSD VM code that everybody else had thrown out, and that didn't allow you to unmap things partially etc. So HPUX mmap really didn't work, not even for non-shared mmap's. I think they fixed the interfaces in HP-UX 11. But not being coherent meant that the shared mappings tended to still have trouble.

nntp largely died, but was replaced with the cyrus imapd that played similar games.

At least our mmap was always coherent. Even in MAP_PRIVATE, and with regards to both write() system calls and other mmap PROT_WRITE users.

Except when we had bugs. Shared mmap really isn't very simple to get right.

Linus
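The coherency Linus describes can be observed from userspace. The sketch below (Python, Linux behaviour assumed; `coherence_demo` is a made-up name) maps the same scratch file twice, once MAP_SHARED read-write and once MAP_PRIVATE read-only, then changes the file first with a write-style system call and then with a store through the shared mapping. Both mappings see both changes, because the private mapping has never been written and so has not been copied.

```python
import mmap
import os
import tempfile


def coherence_demo():
    """Return what a shared and a never-written private mapping observe
    after the file changes underneath them (Linux behaviour assumed)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"old!" + b"\0" * (mmap.PAGESIZE - 4))
        shared = mmap.mmap(fd, mmap.PAGESIZE, flags=mmap.MAP_SHARED,
                           prot=mmap.PROT_READ | mmap.PROT_WRITE)
        private = mmap.mmap(fd, mmap.PAGESIZE, flags=mmap.MAP_PRIVATE,
                            prot=mmap.PROT_READ)
        try:
            os.pwrite(fd, b"new!", 0)       # file write after both mappings exist
            after_write = (shared[0:4], private[0:4])
            shared[0:4] = b"mmap"           # store through the shared mapping
            after_store = private[0:4]      # private page was never COWed
        finally:
            shared.close()
            private.close()
        return after_write, after_store
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(coherence_demo())
```

Note that POSIX leaves the MAP_PRIVATE half of this unspecified; the sketch only shows what Linux does.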
Re: bad rss-counter message in 3.14rc5
On Tue, Mar 11, 2014 at 01:39:17PM -0400, Dave Jones wrote:
> > Sasha already gave a link to the syscalls sequence, so no rush.
>
> It'd be nice to get a more concise reproducer, his list had a little of everything in there.

Dave, could you please send me your config privately so I can try to reproduce the issue locally? Maybe it will shed some light on the problem.

Cyrill
Re: bad rss-counter message in 3.14rc5
On Tue, Mar 11, 2014 at 09:36:03PM +0400, Cyrill Gorcunov wrote:
> On Tue, Mar 11, 2014 at 01:10:45PM -0400, Dave Jones wrote:
> > > Dave, iirc trinity can write log file pointing which exactly syscall sequence was passed, right? Share it too please.
> >
> > Hm, I may have been mistaken, and the damage was done by a previous run. I went from being able to reproduce it almost instantly to now not being able to reproduce it at all. Will keep trying.
>
> Sasha already gave a link to the syscalls sequence, so no rush.

It'd be nice to get a more concise reproducer, his list had a little of everything in there.

Dave
Re: bad rss-counter message in 3.14rc5
On Tue, Mar 11, 2014 at 01:10:45PM -0400, Dave Jones wrote:
> > Dave, iirc trinity can write log file pointing which exactly syscall sequence was passed, right? Share it too please.
>
> Hm, I may have been mistaken, and the damage was done by a previous run. I went from being able to reproduce it almost instantly to now not being able to reproduce it at all. Will keep trying.

Sasha already gave a link to the syscalls sequence, so no rush.
Re: bad rss-counter message in 3.14rc5
On Tue, Mar 11, 2014 at 06:37:50PM +0400, Cyrill Gorcunov wrote:
> > > After reading some more, I suppose the idea I had is wrong, investigating. Will ping if I find something.
> >
> > I can rule it out anyway, I can reproduce this by telling trinity to do nothing other than mmap()'s. I'll try and narrow down the exact parameters.
>
> Dave, iirc trinity can write log file pointing which exactly syscall sequence was passed, right? Share it too please.

Hm, I may have been mistaken, and the damage was done by a previous run. I went from being able to reproduce it almost instantly to now not being able to reproduce it at all. Will keep trying.

Dave
Re: bad rss-counter message in 3.14rc5
On 03/11/2014 10:37 AM, Cyrill Gorcunov wrote:
> On Tue, Mar 11, 2014 at 10:28:17AM -0400, Dave Jones wrote:
> > On Tue, Mar 11, 2014 at 05:41:58PM +0400, Cyrill Gorcunov wrote:
> > > On Tue, Mar 11, 2014 at 09:23:05AM -0400, Sasha Levin wrote:
> > > > > > Ok, with move_pages excluded it still oopses.
> > > > >
> > > > > Dave, is it possible to somehow figure out was someone reading pagemap file at moment of the bug triggering?
> > > >
> > > > We can sprinkle printk()s wherever might be useful, might not be 100% accurate but should be close enough to confirm/deny the theory.
> > >
> > > After reading some more, I suppose the idea I had is wrong, investigating. Will ping if I find something.
> >
> > I can rule it out anyway, I can reproduce this by telling trinity to do nothing other than mmap()'s. I'll try and narrow down the exact parameters.
>
> Dave, iirc trinity can write log file pointing which exactly syscall sequence was passed, right? Share it too please.

I've sent one of those last time I reported this issue: https://lkml.org/lkml/2014/1/22/625

Thanks,
Sasha
Re: bad rss-counter message in 3.14rc5
On Tue, Mar 11, 2014 at 10:28:17AM -0400, Dave Jones wrote:
> On Tue, Mar 11, 2014 at 05:41:58PM +0400, Cyrill Gorcunov wrote:
> > On Tue, Mar 11, 2014 at 09:23:05AM -0400, Sasha Levin wrote:
> > > > > Ok, with move_pages excluded it still oopses.
> > > >
> > > > Dave, is it possible to somehow figure out was someone reading pagemap file at moment of the bug triggering?
> > >
> > > We can sprinkle printk()s wherever might be useful, might not be 100% accurate but should be close enough to confirm/deny the theory.
> >
> > After reading some more, I suppose the idea I had is wrong, investigating. Will ping if I find something.
>
> I can rule it out anyway, I can reproduce this by telling trinity to do nothing other than mmap()'s. I'll try and narrow down the exact parameters.

Dave, iirc trinity can write log file pointing which exactly syscall sequence was passed, right? Share it too please.
Re: bad rss-counter message in 3.14rc5
On Tue, Mar 11, 2014 at 05:41:58PM +0400, Cyrill Gorcunov wrote:
> On Tue, Mar 11, 2014 at 09:23:05AM -0400, Sasha Levin wrote:
> > > > Ok, with move_pages excluded it still oopses.
> > >
> > > Dave, is it possible to somehow figure out was someone reading pagemap file at moment of the bug triggering?
> >
> > We can sprinkle printk()s wherever might be useful, might not be 100% accurate but should be close enough to confirm/deny the theory.
>
> After reading some more, I suppose the idea I had is wrong, investigating. Will ping if I find something.

I can rule it out anyway, I can reproduce this by telling trinity to do nothing other than mmap()'s. I'll try and narrow down the exact parameters.

Dave
Re: bad rss-counter message in 3.14rc5
On Tue, Mar 11, 2014 at 09:23:05AM -0400, Sasha Levin wrote:
> > > Ok, with move_pages excluded it still oopses.
> >
> > Dave, is it possible to somehow figure out was someone reading pagemap file at moment of the bug triggering?
>
> We can sprinkle printk()s wherever might be useful, might not be 100% accurate but should be close enough to confirm/deny the theory.

After reading some more, I suppose the idea I had is wrong, investigating. Will ping if I find something.
Re: bad rss-counter message in 3.14rc5
On 03/11/2014 09:20 AM, Cyrill Gorcunov wrote:
> On Tue, Mar 11, 2014 at 01:30:17AM -0400, Dave Jones wrote:
> > > > > I don't see any holes in regular migration. Do you know if this is reproducible with CONFIG_NUMA_BALANCING=n or CONFIG_NUMA=n?
> > > >
> > > > CONFIG_NUMA_BALANCING was n already btw, so I'll do a NUMA=n run.
> > >
> > > There probably isn't much point unless trinity is using sys_move_pages(). Is it? If so it would be interesting to disable trinity's move_pages calls and see if it still fails.
> >
> > Ok, with move_pages excluded it still oopses.
>
> Dave, is it possible to somehow figure out was someone reading pagemap file at moment of the bug triggering?

We can sprinkle printk()s wherever might be useful, might not be 100% accurate but should be close enough to confirm/deny the theory.

Thanks,
Sasha
Re: bad rss-counter message in 3.14rc5
On Tue, Mar 11, 2014 at 01:30:17AM -0400, Dave Jones wrote:
> > > > I don't see any holes in regular migration. Do you know if this is reproducible with CONFIG_NUMA_BALANCING=n or CONFIG_NUMA=n?
> > >
> > > CONFIG_NUMA_BALANCING was n already btw, so I'll do a NUMA=n run.
> >
> > There probably isn't much point unless trinity is using sys_move_pages(). Is it? If so it would be interesting to disable trinity's move_pages calls and see if it still fails.
>
> Ok, with move_pages excluded it still oopses.

Dave, is it possible to somehow figure out was someone reading pagemap file at moment of the bug triggering?
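As background to the pagemap question, this is roughly what "reading the pagemap file" looks like from userspace. It is a hedged Python sketch, not something proposed in the thread: the function names are invented, and the bit positions (63 present, 62 swapped, 55 soft-dirty) are taken from the kernel's pagemap documentation. The file holds one 64-bit entry per virtual page, so the entry for an address sits at offset (addr / page size) * 8.

```python
import ctypes
import mmap
import struct


def pagemap_entry(addr):
    """Return the raw 64-bit pagemap entry for virtual address `addr`,
    or None if /proc/self/pagemap cannot be read here."""
    try:
        with open("/proc/self/pagemap", "rb", buffering=0) as f:
            f.seek((addr // mmap.PAGESIZE) * 8)   # one 8-byte entry per page
            raw = f.read(8)
    except OSError:
        return None
    if raw is None or len(raw) != 8:
        return None
    return struct.unpack("=Q", raw)[0]


def touched_page_flags():
    """Fault in one anonymous page and decode a few of its pagemap bits."""
    buf = mmap.mmap(-1, mmap.PAGESIZE,
                    flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    try:
        buf[0] = 1                                 # touch the page
        view = ctypes.c_char.from_buffer(buf)
        addr = ctypes.addressof(view)
        del view                                   # drop the export so close() works
        entry = pagemap_entry(addr)
    finally:
        buf.close()
    if entry is None:
        return None
    return {
        "present": bool((entry >> 63) & 1),
        "swapped": bool((entry >> 62) & 1),
        "soft_dirty": bool((entry >> 55) & 1),
    }


if __name__ == "__main__":
    print(touched_page_flags())
```

A reader like this walks page tables from another context than the fault path, which is the kind of concurrent access the question above was probing for.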
Re: bad rss-counter message in 3.14rc5
On 03/11/2014 01:30 AM, Dave Jones wrote:
> On Mon, Mar 10, 2014 at 10:01:58PM -0700, Andrew Morton wrote:
> > On Tue, 11 Mar 2014 00:51:09 -0400 Dave Jones wrote:
> > > On Mon, Mar 10, 2014 at 09:46:12PM -0700, Andrew Morton wrote:
> > > > On Mon, 10 Mar 2014 20:13:40 -0700 Andrew Morton wrote:
> > > > > > Anyone ? I'm hitting this trace on an almost daily basis, which is a pain while trying to reproduce a different bug..
> > > > >
> > > > > Damn, I thought we'd fixed that but it seems not. Cc's added.
> > > > >
> > > > > Guys, what stops the migration target page from coming unlocked in parallel with zap_pte_range()'s call to migration_entry_to_page()?
> > > >
> > > > page_table_lock, sort-of. At least, transitions of is_migration_entry() and page_locked() happen under ptl.
> > > >
> > > > I don't see any holes in regular migration. Do you know if this is reproducible with CONFIG_NUMA_BALANCING=n or CONFIG_NUMA=n?
> > >
> > > CONFIG_NUMA_BALANCING was n already btw, so I'll do a NUMA=n run.
> >
> > There probably isn't much point unless trinity is using sys_move_pages(). Is it? If so it would be interesting to disable trinity's move_pages calls and see if it still fails.
>
> Ok, with move_pages excluded it still oopses.

FWIW, yes - I still see both of these issues happening. It's easy to ignore the bad rss-counter, and I've commented out the BUG at swapops.h so that I could keep on testing.

There are quite a few issues within mm/ right now, I think there are more than 5 different BUG()s hittable using trinity at this point without a fix.

Thanks,
Sasha
Re: bad rss-counter message in 3.14rc5
On Mon, Mar 10, 2014 at 10:01:58PM -0700, Andrew Morton wrote:
> On Tue, 11 Mar 2014 00:51:09 -0400 Dave Jones wrote:
> > On Mon, Mar 10, 2014 at 09:46:12PM -0700, Andrew Morton wrote:
> > > On Mon, 10 Mar 2014 20:13:40 -0700 Andrew Morton wrote:
> > > > > Anyone ? I'm hitting this trace on an almost daily basis, which is a pain while trying to reproduce a different bug..
> > > >
> > > > Damn, I thought we'd fixed that but it seems not. Cc's added.
> > > >
> > > > Guys, what stops the migration target page from coming unlocked in parallel with zap_pte_range()'s call to migration_entry_to_page()?
> > >
> > > page_table_lock, sort-of. At least, transitions of is_migration_entry() and page_locked() happen under ptl.
> > >
> > > I don't see any holes in regular migration. Do you know if this is reproducible with CONFIG_NUMA_BALANCING=n or CONFIG_NUMA=n?
> >
> > CONFIG_NUMA_BALANCING was n already btw, so I'll do a NUMA=n run.
>
> There probably isn't much point unless trinity is using sys_move_pages(). Is it? If so it would be interesting to disable trinity's move_pages calls and see if it still fails.

Ok, with move_pages excluded it still oopses.

Dave
Re: bad rss-counter message in 3.14rc5
On Mon, Mar 10, 2014 at 10:01:58PM -0700, Andrew Morton wrote:
> On Tue, 11 Mar 2014 00:51:09 -0400 Dave Jones wrote:
> > On Mon, Mar 10, 2014 at 09:46:12PM -0700, Andrew Morton wrote:
> > > On Mon, 10 Mar 2014 20:13:40 -0700 Andrew Morton wrote:
> > > > > Anyone ? I'm hitting this trace on an almost daily basis, which is a pain while trying to reproduce a different bug..
> > > >
> > > > Damn, I thought we'd fixed that but it seems not. Cc's added.
> > > >
> > > > Guys, what stops the migration target page from coming unlocked in parallel with zap_pte_range()'s call to migration_entry_to_page()?
> > >
> > > page_table_lock, sort-of. At least, transitions of is_migration_entry() and page_locked() happen under ptl.
> > >
> > > I don't see any holes in regular migration. Do you know if this is reproducible with CONFIG_NUMA_BALANCING=n or CONFIG_NUMA=n?
> >
> > CONFIG_NUMA_BALANCING was n already btw, so I'll do a NUMA=n run.
>
> There probably isn't much point unless trinity is using sys_move_pages(). Is it?

Trinity will do every syscall an arch has. In the test case I have so far, I've narrowed it down to the vm group of syscalls (so running with '-g vm' will do anything that I deemed 'vm'. Including.. sys_move_pages)

I'll try to narrow it down further tomorrow.

> If so it would be interesting to disable trinity's move_pages calls and see if it still fails.

Ok, I'll try that first.

> Grasping at straws here, trying to reduce the amount of code to look at :(

*nod*, it's not helped by the fact that the trace happens at process exit time which could be considerably later after the syscall that buggers everything up has happened.

Dave
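Dave's last point, that the complaint only appears at process exit and not at the syscall that caused it, is easy to picture with a toy model of the rss counters. This is a Python sketch, not kernel code: `ToyMM`, `zap` and the `forget_accounting` flag are invented for illustration, the "bug" is hypothetical rather than the one being chased in this thread, and the real message also prints the mm pointer. The counter order is the 3.14 one as I recall it (file pages, anon pages, swap entries).

```python
COUNTER_NAMES = ("MM_FILEPAGES", "MM_ANONPAGES", "MM_SWAPENTS")


class ToyMM:
    """Toy stand-in for an address space: one counter per kind of page."""

    def __init__(self):
        self.rss = [0] * len(COUNTER_NAMES)
        self.ptes = {}                       # addr -> counter index

    def map_page(self, addr, idx):
        self.ptes[addr] = idx
        self.rss[idx] += 1

    def zap(self, addr, forget_accounting=False):
        idx = self.ptes.pop(addr)
        if not forget_accounting:            # the hypothetical bug skips this
            self.rss[idx] -= 1


def check_mm(mm):
    """Run at 'exit': every counter should be back to zero by now."""
    return ["BUG: Bad rss-counter state idx:%d val:%d" % (i, v)
            for i, v in enumerate(mm.rss) if v != 0]


if __name__ == "__main__":
    mm = ToyMM()
    mm.map_page(0x1000, 1)
    mm.map_page(0x2000, 0)
    mm.zap(0x1000, forget_accounting=True)   # damage is done here, silently
    mm.zap(0x2000)                           # plenty of correct work follows
    print(check_mm(mm))                      # only noticed at the very end
```

Nothing in the model ties the final message back to the faulty zap, which is why a shorter reproducer matters so much.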
Re: bad rss-counter message in 3.14rc5
On Tue, 11 Mar 2014 00:51:09 -0400 Dave Jones wrote:
> On Mon, Mar 10, 2014 at 09:46:12PM -0700, Andrew Morton wrote:
> > On Mon, 10 Mar 2014 20:13:40 -0700 Andrew Morton wrote:
> > > > Anyone ? I'm hitting this trace on an almost daily basis, which is a pain while trying to reproduce a different bug..
> > >
> > > Damn, I thought we'd fixed that but it seems not. Cc's added.
> > >
> > > Guys, what stops the migration target page from coming unlocked in parallel with zap_pte_range()'s call to migration_entry_to_page()?
> >
> > page_table_lock, sort-of. At least, transitions of is_migration_entry() and page_locked() happen under ptl.
> >
> > I don't see any holes in regular migration. Do you know if this is reproducible with CONFIG_NUMA_BALANCING=n or CONFIG_NUMA=n?
>
> CONFIG_NUMA_BALANCING was n already btw, so I'll do a NUMA=n run.

There probably isn't much point unless trinity is using sys_move_pages(). Is it? If so it would be interesting to disable trinity's move_pages calls and see if it still fails.

Grasping at straws here, trying to reduce the amount of code to look at :(
Re: bad rss-counter message in 3.14rc5
On Mon, Mar 10, 2014 at 09:46:12PM -0700, Andrew Morton wrote:
> On Mon, 10 Mar 2014 20:13:40 -0700 Andrew Morton wrote:
> > > Anyone ? I'm hitting this trace on an almost daily basis, which is a pain while trying to reproduce a different bug..
> >
> > Damn, I thought we'd fixed that but it seems not. Cc's added.
> >
> > Guys, what stops the migration target page from coming unlocked in parallel with zap_pte_range()'s call to migration_entry_to_page()?
>
> page_table_lock, sort-of. At least, transitions of is_migration_entry() and page_locked() happen under ptl.
>
> I don't see any holes in regular migration. Do you know if this is reproducible with CONFIG_NUMA_BALANCING=n or CONFIG_NUMA=n?

CONFIG_NUMA_BALANCING was n already btw, so I'll do a NUMA=n run.

Dave
Re: bad rss-counter message in 3.14rc5
On Mon, Mar 10, 2014 at 09:46:12PM -0700, Andrew Morton wrote:
> On Mon, 10 Mar 2014 20:13:40 -0700 Andrew Morton wrote:
>
> > > Anyone ? I'm hitting this trace on an almost daily basis, which is a pain
> > > while trying to reproduce a different bug..
> >
> > Damn, I thought we'd fixed that but it seems not. Cc's added.
> >
> > Guys, what stops the migration target page from coming unlocked in
> > parallel with zap_pte_range()'s call to migration_entry_to_page()?
>
> page_table_lock, sort-of. At least, transitions of is_migration_entry()
> and page_locked() happen under ptl.
>
> I don't see any holes in regular migration. Do you know if this is
> reproducible with CONFIG_NUMA_BALANCING=n or CONFIG_NUMA=n?

I'll give it an overnight run and let you know tomorrow.

Dave
Re: bad rss-counter message in 3.14rc5
On Mon, 10 Mar 2014 20:13:40 -0700 Andrew Morton wrote:

> > Anyone ? I'm hitting this trace on an almost daily basis, which is a pain
> > while trying to reproduce a different bug..
>
> Damn, I thought we'd fixed that but it seems not. Cc's added.
>
> Guys, what stops the migration target page from coming unlocked in
> parallel with zap_pte_range()'s call to migration_entry_to_page()?

page_table_lock, sort-of. At least, transitions of is_migration_entry()
and page_locked() happen under ptl.

I don't see any holes in regular migration. Do you know if this is
reproducible with CONFIG_NUMA_BALANCING=n or CONFIG_NUMA=n?
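The locking argument in the message above can be pictured with a small single-threaded userspace model. This is only a sketch under stated assumptions, not kernel code: every `model_` name is hypothetical, the ptl is reduced to a flag, and the ordering shown (the page is locked before the migration entry is installed under ptl, and unlocked only after the entry is removed under ptl) is an assumption of the sketch rather than something established in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one pte and the page it refers to. */
enum model_pte { MODEL_PTE_PRESENT, MODEL_PTE_MIGRATION, MODEL_PTE_NONE };

struct model_mapping {
	bool ptl_held;		/* stands in for the page table lock */
	bool page_locked;	/* stands in for PageLocked(page) */
	enum model_pte pte;
};

static void model_ptl_lock(struct model_mapping *m)
{
	assert(!m->ptl_held);
	m->ptl_held = true;
}

static void model_ptl_unlock(struct model_mapping *m)
{
	assert(m->ptl_held);
	m->ptl_held = false;
}

/* Migration side (assumed ordering): lock the page, then switch the pte under ptl. */
static void model_migration_start(struct model_mapping *m)
{
	m->page_locked = true;
	model_ptl_lock(m);
	m->pte = MODEL_PTE_MIGRATION;
	model_ptl_unlock(m);
}

/* Migration side (assumed ordering): restore the pte under ptl, then unlock the page. */
static void model_migration_finish(struct model_mapping *m)
{
	model_ptl_lock(m);
	m->pte = MODEL_PTE_PRESENT;
	model_ptl_unlock(m);
	m->page_locked = false;
}

/*
 * Zap side: clears the pte under ptl. Returns false in the one state
 * the BUG_ON() complains about: a migration entry whose page is unlocked.
 */
static bool model_zap(struct model_mapping *m)
{
	bool ok;

	model_ptl_lock(m);
	ok = (m->pte != MODEL_PTE_MIGRATION) || m->page_locked;
	m->pte = MODEL_PTE_NONE;
	model_ptl_unlock(m);
	return ok;
}
```

If both sides follow that ordering, `model_zap()` never reports a failure; it only does so when handed a migration entry that was left behind with the page already unlocked, which is the state the reported BUG corresponds to.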
Re: bad rss-counter message in 3.14rc5
On Mon, 10 Mar 2014 22:49:06 -0400 Dave Jones wrote:

> ...
>
> > > 124 static inline struct page *migration_entry_to_page(swp_entry_t entry)
> > > 125 {
> > > 126         struct page *p = pfn_to_page(swp_offset(entry));
> > > 127         /*
> > > 128          * Any use of migration entries may only occur while the
> > > 129          * corresponding page is locked
> > > 130          */
> > > 131         BUG_ON(!PageLocked(p));
> > > 132         return p;
> > > 133 }
> >
> > I hit this again, this time a full trace made it over the serial console.
> > This time there was no bad rss-counter message though.
> >
> > kernel BUG at include/linux/swapops.h:131!
> > invalid opcode: [#1] PREEMPT SMP DEBUG_PAGEALLOC
> > Modules linked in: snd_seq_dummy fuse hidp tun bnep rfcomm llc2 af_key ipt_ULOG can_raw nfnetlink scsi_transport_iscsi nfc caif_socket caif af_802154 phonet af_rxrpc can_bcm can pppoe pppox ppp_generic slhc irda crc_ccitt rds rose x25 atm netrom appletalk ipx p8023 psnap p8022 llc ax25 cfg80211 xfs libcrc32c coretemp hwmon x86_pkg_temp_thermal kvm_intel kvm crct10dif_pclmul crc32c_intel ghash_clmulni_intel snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic microcode pcspkr serio_raw btusb bluetooth 6lowpan_iphc rfkill usb_debug shpchp snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm e1000e ptp snd_timer snd pps_core soundcore
> > CPU: 2 PID: 10002 Comm: trinity-c36 Not tainted 3.14.0-rc5+ #131
> > task: 880108966750 ti: 88018911a000 task.ti: 88018911a000
> > RIP: 0010:[] [] migration_entry_to_page.part.47+0x4/0x6
> > RSP: :88018911bae8 EFLAGS: 00010246
> > RAX: ea00048a8980 RBX: 8801a08ae020 RCX:
> > RDX: RSI: RDI: 3c122a26
> > RBP: 88018911bae8 R08: R09:
> > R10: R11: fffe R12: 24544c3c
> > R13: 88018911bc18 R14: 40c0 R15: 40a04000
> > FS: () GS:88024d08() knlGS:
> > CS: 0010 DS: ES: CR0: 80050033
> > CR2: 0001 CR3: 0001e3c27000 CR4: 001407e0
> > DR0: 00ab5000 DR1: 01008000 DR2: 0223
> > DR3: DR6: fffe0ff0 DR7: 0600
> > Stack:
> >  88018911bbc8 9d17ec1e 40d65fff 40d65fff
> >  8801e3c27000 40d66000 88011c72e008 40d66000
> >  40d65fff 0001 40d66000 88018911bb98
> > Call Trace:
> >  [] unmap_single_vma+0x89e/0x8a0
> >  [] unmap_vmas+0x49/0x90
> >  [] exit_mmap+0xe5/0x1a0
> >  [] mmput+0x73/0x110
> >  [] do_exit+0x2a2/0xb50
> >  [] ? __sigqueue_free.part.11+0x33/0x40
> >  [] ? __dequeue_signal+0x13c/0x220
> >  [] do_group_exit+0x4c/0xc0
> >  [] get_signal_to_deliver+0x2d1/0x6d0
> >  [] do_signal+0x57/0x9d0
> >  [] ? __acct_update_integrals+0x8e/0x120
> >  [] ? preempt_count_sub+0x6b/0xf0
> >  [] ? _raw_spin_unlock+0x31/0x50
> >  [] ? vtime_account_user+0x91/0xa0
> >  [] ? context_tracking_user_exit+0x9b/0x100
> >  [] do_notify_resume+0x71/0xc0
> >  [] retint_signal+0x46/0x90
> > Code: df 48 c1 ff 06 49 01 fc 4c 89 e7 e8 79 ff ff ff 85 c0 74 0c 4c 89 e0 48 c1 e0 06 48 29 d8 eb 02 31 c0 5b 41 5c 5d c3 55 48 89 e5 <0f> 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 31 f6 48 89 e5 e8
>
> Anyone ? I'm hitting this trace on an almost daily basis, which is a pain
> while trying to reproduce a different bug..

Damn, I thought we'd fixed that but it seems not. Cc's added.

Guys, what stops the migration target page from coming unlocked in
parallel with zap_pte_range()'s call to migration_entry_to_page()?
Re: bad rss-counter message in 3.14rc5
On Thu, Mar 06, 2014 at 07:22:10PM -0500, Dave Jones wrote:
> On Wed, Mar 05, 2014 at 12:57:25PM -0500, Dave Jones wrote:
> > On Wed, Mar 05, 2014 at 12:45:03PM -0500, Dave Jones wrote:
> > > I just saw this on my box that's been running trinity..
> > >
> > > [48825.517189] BUG: Bad rss-counter state mm:880177921d40 idx:0 val:1 (Not tainted)
> > >
> > > There's nothing else, no trace, nothing. Any ideas where to begin with this?
> >
> > ah, on the serial console there was also this truncated warning..
> >
> > [48825.517189] BUG: Bad rss-counter state mm:880177921d40 idx:0 val:1 (Not tainted)
> > [48924.133273] [ cut here ]
> > [48924.133391] kernel BUG at include/linux/swapops.h:131!
> >
> > Dave
> >
> > 124 static inline struct page *migration_entry_to_page(swp_entry_t entry)
> > 125 {
> > 126         struct page *p = pfn_to_page(swp_offset(entry));
> > 127         /*
> > 128          * Any use of migration entries may only occur while the
> > 129          * corresponding page is locked
> > 130          */
> > 131         BUG_ON(!PageLocked(p));
> > 132         return p;
> > 133 }
>
> I hit this again, this time a full trace made it over the serial console.
> This time there was no bad rss-counter message though.
>
> kernel BUG at include/linux/swapops.h:131!
> invalid opcode: [#1] PREEMPT SMP DEBUG_PAGEALLOC
> Modules linked in: snd_seq_dummy fuse hidp tun bnep rfcomm llc2 af_key ipt_ULOG can_raw nfnetlink scsi_transport_iscsi nfc caif_socket caif af_802154 phonet af_rxrpc can_bcm can pppoe pppox ppp_generic slhc irda crc_ccitt rds rose x25 atm netrom appletalk ipx p8023 psnap p8022 llc ax25 cfg80211 xfs libcrc32c coretemp hwmon x86_pkg_temp_thermal kvm_intel kvm crct10dif_pclmul crc32c_intel ghash_clmulni_intel snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic microcode pcspkr serio_raw btusb bluetooth 6lowpan_iphc rfkill usb_debug shpchp snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm e1000e ptp snd_timer snd pps_core soundcore
> CPU: 2 PID: 10002 Comm: trinity-c36 Not tainted 3.14.0-rc5+ #131
> task: 880108966750 ti: 88018911a000 task.ti: 88018911a000
> RIP: 0010:[] [] migration_entry_to_page.part.47+0x4/0x6
> RSP: :88018911bae8 EFLAGS: 00010246
> RAX: ea00048a8980 RBX: 8801a08ae020 RCX:
> RDX: RSI: RDI: 3c122a26
> RBP: 88018911bae8 R08: R09:
> R10: R11: fffe R12: 24544c3c
> R13: 88018911bc18 R14: 40c0 R15: 40a04000
> FS: () GS:88024d08() knlGS:
> CS: 0010 DS: ES: CR0: 80050033
> CR2: 0001 CR3: 0001e3c27000 CR4: 001407e0
> DR0: 00ab5000 DR1: 01008000 DR2: 0223
> DR3: DR6: fffe0ff0 DR7: 0600
> Stack:
>  88018911bbc8 9d17ec1e 40d65fff 40d65fff
>  8801e3c27000 40d66000 88011c72e008 40d66000
>  40d65fff 0001 40d66000 88018911bb98
> Call Trace:
>  [] unmap_single_vma+0x89e/0x8a0
>  [] unmap_vmas+0x49/0x90
>  [] exit_mmap+0xe5/0x1a0
>  [] mmput+0x73/0x110
>  [] do_exit+0x2a2/0xb50
>  [] ? __sigqueue_free.part.11+0x33/0x40
>  [] ? __dequeue_signal+0x13c/0x220
>  [] do_group_exit+0x4c/0xc0
>  [] get_signal_to_deliver+0x2d1/0x6d0
>  [] do_signal+0x57/0x9d0
>  [] ? __acct_update_integrals+0x8e/0x120
>  [] ? preempt_count_sub+0x6b/0xf0
>  [] ? _raw_spin_unlock+0x31/0x50
>  [] ? vtime_account_user+0x91/0xa0
>  [] ? context_tracking_user_exit+0x9b/0x100
>  [] do_notify_resume+0x71/0xc0
>  [] retint_signal+0x46/0x90
> Code: df 48 c1 ff 06 49 01 fc 4c 89 e7 e8 79 ff ff ff 85 c0 74 0c 4c 89 e0 48 c1 e0 06 48 29 d8 eb 02 31 c0 5b 41 5c 5d c3 55 48 89 e5 <0f> 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 31 f6 48 89 e5 e8

Anyone ? I'm hitting this trace on an almost daily basis, which is a pain
while trying to reproduce a different bug..

Dave
Re: bad rss-counter message in 3.14rc5
On Mon, Mar 10, 2014 at 10:01:58PM -0700, Andrew Morton wrote:
> On Tue, 11 Mar 2014 00:51:09 -0400 Dave Jones da...@redhat.com wrote:
>
> > On Mon, Mar 10, 2014 at 09:46:12PM -0700, Andrew Morton wrote:
> > > On Mon, 10 Mar 2014 20:13:40 -0700 Andrew Morton a...@linux-foundation.org wrote:
> > >
> > > > > Anyone ? I'm hitting this trace on an almost daily basis, which is a pain
> > > > > while trying to reproduce a different bug..
> > > >
> > > > Damn, I thought we'd fixed that but it seems not. Cc's added.
> > > >
> > > > Guys, what stops the migration target page from coming unlocked in
> > > > parallel with zap_pte_range()'s call to migration_entry_to_page()?
> > >
> > > page_table_lock, sort-of. At least, transitions of is_migration_entry()
> > > and page_locked() happen under ptl.
> > >
> > > I don't see any holes in regular migration. Do you know if this is
> > > reproducible with CONFIG_NUMA_BALANCING=n or CONFIG_NUMA=n?
> >
> > CONFIG_NUMA_BALANCING was n already btw, so I'll do a NUMA=n run.
>
> There probably isn't much point unless trinity is using
> sys_move_pages(). Is it?

Trinity will do every syscall an arch has. In the test case I have so far,
I've narrowed it down to the vm group of syscalls (so running with '-g vm'
will do anything that I deemed 'vm'. Including.. sys_move_pages)
I'll try to narrow it down further tomorrow.

> If so it would be interesting to disable
> trinity's move_pages calls and see if it still fails.

Ok, I'll try that first.

> Grasping at straws here, trying to reduce the amount of code to look at :(

*nod*, it's not helped by the fact that the trace happens at process exit time
which could be considerably later after the syscall that buggers everything up
has happened.

Dave
Re: bad rss-counter message in 3.14rc5
On Mon, Mar 10, 2014 at 10:01:58PM -0700, Andrew Morton wrote:
> On Tue, 11 Mar 2014 00:51:09 -0400 Dave Jones da...@redhat.com wrote:
>
> > On Mon, Mar 10, 2014 at 09:46:12PM -0700, Andrew Morton wrote:
> > > On Mon, 10 Mar 2014 20:13:40 -0700 Andrew Morton a...@linux-foundation.org wrote:
> > >
> > > > > Anyone ? I'm hitting this trace on an almost daily basis, which is a pain
> > > > > while trying to reproduce a different bug..
> > > >
> > > > Damn, I thought we'd fixed that but it seems not. Cc's added.
> > > >
> > > > Guys, what stops the migration target page from coming unlocked in
> > > > parallel with zap_pte_range()'s call to migration_entry_to_page()?
> > >
> > > page_table_lock, sort-of. At least, transitions of is_migration_entry()
> > > and page_locked() happen under ptl.
> > >
> > > I don't see any holes in regular migration. Do you know if this is
> > > reproducible with CONFIG_NUMA_BALANCING=n or CONFIG_NUMA=n?
> >
> > CONFIG_NUMA_BALANCING was n already btw, so I'll do a NUMA=n run.
>
> There probably isn't much point unless trinity is using
> sys_move_pages(). Is it? If so it would be interesting to disable
> trinity's move_pages calls and see if it still fails.

Ok, with move_pages excluded it still oopses.

Dave
Re: bad rss-counter message in 3.14rc5
On Wed, Mar 05, 2014 at 12:57:25PM -0500, Dave Jones wrote:
> On Wed, Mar 05, 2014 at 12:45:03PM -0500, Dave Jones wrote:
> > I just saw this on my box that's been running trinity..
> >
> > [48825.517189] BUG: Bad rss-counter state mm:880177921d40 idx:0 val:1 (Not tainted)
> >
> > There's nothing else, no trace, nothing. Any ideas where to begin with this?
>
> ah, on the serial console there was also this truncated warning..
>
> [48825.517189] BUG: Bad rss-counter state mm:880177921d40 idx:0 val:1 (Not tainted)
> [48924.133273] [ cut here ]
> [48924.133391] kernel BUG at include/linux/swapops.h:131!
>
> Dave
>
> 124 static inline struct page *migration_entry_to_page(swp_entry_t entry)
> 125 {
> 126         struct page *p = pfn_to_page(swp_offset(entry));
> 127         /*
> 128          * Any use of migration entries may only occur while the
> 129          * corresponding page is locked
> 130          */
> 131         BUG_ON(!PageLocked(p));
> 132         return p;
> 133 }

I hit this again, this time a full trace made it over the serial console.
This time there was no bad rss-counter message though.

kernel BUG at include/linux/swapops.h:131!
invalid opcode: [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: snd_seq_dummy fuse hidp tun bnep rfcomm llc2 af_key ipt_ULOG can_raw nfnetlink scsi_transport_iscsi nfc caif_socket caif af_802154 phonet af_rxrpc can_bcm can pppoe pppox ppp_generic slhc irda crc_ccitt rds rose x25 atm netrom appletalk ipx p8023 psnap p8022 llc ax25 cfg80211 xfs libcrc32c coretemp hwmon x86_pkg_temp_thermal kvm_intel kvm crct10dif_pclmul crc32c_intel ghash_clmulni_intel snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic microcode pcspkr serio_raw btusb bluetooth 6lowpan_iphc rfkill usb_debug shpchp snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm e1000e ptp snd_timer snd pps_core soundcore
CPU: 2 PID: 10002 Comm: trinity-c36 Not tainted 3.14.0-rc5+ #131
task: 880108966750 ti: 88018911a000 task.ti: 88018911a000
RIP: 0010:[] [] migration_entry_to_page.part.47+0x4/0x6
RSP: :88018911bae8 EFLAGS: 00010246
RAX: ea00048a8980 RBX: 8801a08ae020 RCX:
RDX: RSI: RDI: 3c122a26
RBP: 88018911bae8 R08: R09:
R10: R11: fffe R12: 24544c3c
R13: 88018911bc18 R14: 40c0 R15: 40a04000
FS: () GS:88024d08() knlGS:
CS: 0010 DS: ES: CR0: 80050033
CR2: 0001 CR3: 0001e3c27000 CR4: 001407e0
DR0: 00ab5000 DR1: 01008000 DR2: 0223
DR3: DR6: fffe0ff0 DR7: 0600
Stack:
 88018911bbc8 9d17ec1e 40d65fff 40d65fff
 8801e3c27000 40d66000 88011c72e008 40d66000
 40d65fff 0001 40d66000 88018911bb98
Call Trace:
 [] unmap_single_vma+0x89e/0x8a0
 [] unmap_vmas+0x49/0x90
 [] exit_mmap+0xe5/0x1a0
 [] mmput+0x73/0x110
 [] do_exit+0x2a2/0xb50
 [] ? __sigqueue_free.part.11+0x33/0x40
 [] ? __dequeue_signal+0x13c/0x220
 [] do_group_exit+0x4c/0xc0
 [] get_signal_to_deliver+0x2d1/0x6d0
 [] do_signal+0x57/0x9d0
 [] ? __acct_update_integrals+0x8e/0x120
 [] ? preempt_count_sub+0x6b/0xf0
 [] ? _raw_spin_unlock+0x31/0x50
 [] ? vtime_account_user+0x91/0xa0
 [] ? context_tracking_user_exit+0x9b/0x100
 [] do_notify_resume+0x71/0xc0
 [] retint_signal+0x46/0x90
Code: df 48 c1 ff 06 49 01 fc 4c 89 e7 e8 79 ff ff ff 85 c0 74 0c 4c 89 e0 48 c1 e0 06 48 29 d8 eb 02 31 c0 5b 41 5c 5d c3 55 48 89 e5 <0f> 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 31 f6 48 89 e5 e8
Re: bad rss-counter message in 3.14rc5
On Wed, Mar 05, 2014 at 12:45:03PM -0500, Dave Jones wrote:
> I just saw this on my box that's been running trinity..
>
> [48825.517189] BUG: Bad rss-counter state mm:880177921d40 idx:0 val:1 (Not tainted)
>
> There's nothing else, no trace, nothing. Any ideas where to begin with this?

ah, on the serial console there was also this truncated warning..

[48825.517189] BUG: Bad rss-counter state mm:880177921d40 idx:0 val:1 (Not tainted)
[48924.133273] [ cut here ]
[48924.133391] kernel BUG at include/linux/swapops.h:131!

Dave

124 static inline struct page *migration_entry_to_page(swp_entry_t entry)
125 {
126         struct page *p = pfn_to_page(swp_offset(entry));
127         /*
128          * Any use of migration entries may only occur while the
129          * corresponding page is locked
130          */
131         BUG_ON(!PageLocked(p));
132         return p;
133 }
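For readers who want to poke at the check in the listing above outside the kernel, here is a rough userspace sketch of the same shape. Everything prefixed `model_` is hypothetical, `struct page` and `swp_entry_t` are stand-ins rather than the kernel's definitions, and the 58-bit split between type and offset is an arbitrary assumption made only so the example has something to decode. Where the kernel would hit `BUG_ON(!PageLocked(p))`, the model returns NULL instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in types for illustration only; not the kernel's definitions. */
struct page { bool locked; };
typedef struct { uint64_t val; } swp_entry_t;

#define MODEL_NR_PAGES   8	/* size of the toy "mem_map" */
#define MODEL_TYPE_SHIFT 58	/* assumed type/offset split */

static struct page model_mem_map[MODEL_NR_PAGES];

/* Pack a type and an offset (here: a toy pfn) into one entry. */
static swp_entry_t model_swp_entry(unsigned int type, uint64_t offset)
{
	swp_entry_t entry = { ((uint64_t)type << MODEL_TYPE_SHIFT) | offset };
	return entry;
}

/* Extract the offset bits again. */
static uint64_t model_swp_offset(swp_entry_t entry)
{
	return entry.val & ((UINT64_C(1) << MODEL_TYPE_SHIFT) - 1);
}

/*
 * Decode the entry to a page. Returns NULL for an out-of-range pfn, or
 * for an unlocked page, which is the case the kernel's BUG_ON() traps.
 */
static struct page *model_migration_entry_to_page(swp_entry_t entry)
{
	uint64_t pfn = model_swp_offset(entry);
	struct page *p;

	if (pfn >= MODEL_NR_PAGES)
		return NULL;
	p = &model_mem_map[pfn];
	return p->locked ? p : NULL;
}
```

The point of the model is only that the entry itself carries no lock state: whether the lookup is legal depends entirely on the state of the page it decodes to at the moment of the call.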
bad rss-counter message in 3.14rc5
I just saw this on my box that's been running trinity..

[48825.517189] BUG: Bad rss-counter state mm:880177921d40 idx:0 val:1 (Not tainted)

There's nothing else, no trace, nothing. Any ideas where to begin with this?

Dave