Re: [RFC/PATCH 0/5] Contiguous Memory Allocator and get_user_pages()
Hello,

On 3/6/2013 9:47 AM, Minchan Kim wrote:
> Hello,
>
> On Tue, Mar 5, 2013 at 3:57 PM, Marek Szyprowski wrote:
>> Hello,
>>
>> The Contiguous Memory Allocator is very sensitive to migration failures
>> of individual pages. A single page that permanently fails to migrate
>> can break large contiguous allocations and cause a multimedia device
>> driver to fail.
>>
>> One of the known issues with migrating CMA pages is migrating anonymous
>> user pages on which someone else has called get_user_pages(). That call
>> takes a reference to the given user pages to let the kernel operate
>> directly on the page content. It is usually used to prevent the page
>> contents from being swapped out and to do direct DMA to/from userspace.
>>
>> Solving this issue requires preventing pages placed in CMA regions from
>> being locked for a long time. Our idea is to migrate the anonymous page
>> content before locking the page in get_user_pages(). This cannot be
>> done automatically, as the get_user_pages() interface is used very
>> often for various operations that usually last only a short time (for
>> example the exec syscall). We have added a new flag indicating that the
>> given get_user_pages() call will grab pages for a long time, so it is
>> suitable to use the migration workaround in such cases.
>>
>> The proposed extension is used by V4L2/VideoBuf2
>> (drivers/media/v4l2-core/videobuf2-dma-contig.c), but that is not the
>> only place that might benefit from it; any driver that uses DMA to
>> userspace with get_user_pages() could. This one is provided to
>> demonstrate the use case.
>>
>> I would like to hear some comments on the presented approach. What do
>> you think about it? Is there a chance to get such a workaround merged
>> into mainline at some point?
>
> I discussed a similar patch from the memory-hotplug guys with Mel.
> Look at http://marc.info/?l=linux-mm&m=136014458829566&w=2
>
> The concern is that we end up forcing the use of FOLL_DURABLE/GUP_NM on
> all drivers and subsystems to make sure CMA/memory-hotplug works
> well.
>
> You mentioned that a driver which grabs a page for a long time should
> use the FOLL_DURABLE flag, but "for a long time" is very ambiguous. For
> example, there is a driver:
>
> get_user_pages()
> some operation.
> put_pages
>
> Can you make sure that "some operation" is always really fast?

Well, in our case (judging from the logs) we observed 2 usage patterns
for get_user_pages() calls. One group was lots of short-time locks, whose
call stacks originated in various kernel places; the second group was
device drivers which used get_user_pages() to create a buffer for the
DMA. Such buffers were used for the whole lifetime of the session to
the given device, which was equivalent to infinity from the migration/CMA
point of view. This was however based on the specific use case on our
target system, that's why I wanted to start the discussion and find
some generic approach.

> For example, what if it depends on another event which is normally very
> fast but quite slow once a week, or it tries to do dynamic memory
> allocation while memory pressure is severe?
>
> To work 100% well, in the end we need to change all GUP users to use
> GUP_NM or your FOLL_DURABLE, whatever, but the concern Mel pointed out
> is that it could cause a lowmem exhaustion problem.

This way we sooner or later end up without any movable pages at all.
I assume that keeping some temporary references on movable/cma pages
must be allowed, because otherwise we limit the functionality too much.

> At the moment, there are other problems with migration which are not
> related to your patch, e.g. zcache, zram, zswap. Their pages can't
> be migrated out, so I think Mel's suggestion below, or some generic
> infrastructure that can move pinned pages, is the more proper way to go.

zcache/zram/zswap (zsmalloc based code) can also be extended to support
migration. It requires a significant amount of work, but it is really
doable.

> "To guarantee CMA can migrate pages pinned by drivers I think you need
> migrate-related callbacks to unpin, barrier the driver until migration
> completes and repin."

Right, this might improve the migration reliability. Is there any work
being done in this direction?

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC/PATCH 0/5] Contiguous Memory Allocator and get_user_pages()
Hello,

On Tue, Mar 5, 2013 at 3:57 PM, Marek Szyprowski wrote:
> Hello,
>
> The Contiguous Memory Allocator is very sensitive to migration failures
> of individual pages. A single page that permanently fails to migrate
> can break large contiguous allocations and cause a multimedia device
> driver to fail.
>
> One of the known issues with migrating CMA pages is migrating anonymous
> user pages on which someone else has called get_user_pages(). That call
> takes a reference to the given user pages to let the kernel operate
> directly on the page content. It is usually used to prevent the page
> contents from being swapped out and to do direct DMA to/from userspace.
>
> Solving this issue requires preventing pages placed in CMA regions from
> being locked for a long time. Our idea is to migrate the anonymous page
> content before locking the page in get_user_pages(). This cannot be
> done automatically, as the get_user_pages() interface is used very
> often for various operations that usually last only a short time (for
> example the exec syscall). We have added a new flag indicating that the
> given get_user_pages() call will grab pages for a long time, so it is
> suitable to use the migration workaround in such cases.
>
> The proposed extension is used by V4L2/VideoBuf2
> (drivers/media/v4l2-core/videobuf2-dma-contig.c), but that is not the
> only place that might benefit from it; any driver that uses DMA to
> userspace with get_user_pages() could. This one is provided to
> demonstrate the use case.
>
> I would like to hear some comments on the presented approach. What do
> you think about it? Is there a chance to get such a workaround merged
> into mainline at some point?

I discussed a similar patch from the memory-hotplug guys with Mel.
Look at http://marc.info/?l=linux-mm&m=136014458829566&w=2

The concern is that we end up forcing the use of FOLL_DURABLE/GUP_NM on
all drivers and subsystems to make sure CMA/memory-hotplug works
well.

You mentioned that a driver which grabs a page for a long time should
use the FOLL_DURABLE flag, but "for a long time" is very ambiguous. For
example, there is a driver:

get_user_pages()
some operation.
put_pages

Can you make sure that "some operation" is always really fast?
For example, what if it depends on another event which is normally very
fast but quite slow once a week, or it tries to do dynamic memory
allocation while memory pressure is severe?

To work 100% well, in the end we need to change all GUP users to use
GUP_NM or your FOLL_DURABLE, whatever, but the concern Mel pointed out
is that it could cause a lowmem exhaustion problem.

At the moment, there are other problems with migration which are not
related to your patch, e.g. zcache, zram, zswap. Their pages can't
be migrated out, so I think Mel's suggestion below, or some generic
infrastructure that can move pinned pages, is the more proper way to go.

"To guarantee CMA can migrate pages pinned by drivers I think you need
migrate-related callbacks to unpin, barrier the driver until migration
completes and repin."

Thanks.

-- 
Kind regards,
Minchan Kim
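The unpin / barrier / repin suggestion quoted from Mel above can be
sketched as a small userspace model. This is only an illustration of the
control flow under stated assumptions: struct toy_page, struct
toy_pin_ops, drv_pin() and toy_migrate() are hypothetical names invented
for the sketch, not existing kernel interfaces, and real migration would
also have to deal with locking and concurrent pinners.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_page;

/* Hypothetical callback set a pinning driver would register. */
struct toy_pin_ops {
    void (*unpin)(struct toy_page *page, void *owner);
    void (*repin)(struct toy_page *page, void *owner);
};

struct toy_page {
    int frame;                      /* stand-in for the physical location */
    int pin_count;                  /* references held by pinners */
    const struct toy_pin_ops *ops;  /* NULL if the pinner is anonymous */
    void *owner;
};

struct toy_driver {
    int dma_frame;     /* frame the device is currently programmed with */
    bool dma_running;
};

static void drv_unpin(struct toy_page *page, void *owner)
{
    struct toy_driver *drv = owner;

    assert(page->pin_count > 0);
    drv->dma_running = false;  /* barrier: no DMA while the page moves */
    page->pin_count--;
}

static void drv_repin(struct toy_page *page, void *owner)
{
    struct toy_driver *drv = owner;

    page->pin_count++;
    drv->dma_frame = page->frame;  /* point the device at the new frame */
    drv->dma_running = true;
}

static const struct toy_pin_ops drv_ops = { drv_unpin, drv_repin };

/* Long-term pin that also tells the page whom to call back. */
static void drv_pin(struct toy_driver *drv, struct toy_page *page)
{
    page->pin_count++;
    page->ops = &drv_ops;
    page->owner = drv;
    drv->dma_frame = page->frame;
    drv->dma_running = true;
}

/* Move the page if every pin can be dropped via the callbacks. */
static bool toy_migrate(struct toy_page *page, int new_frame)
{
    bool notified = page->pin_count > 0 && page->ops != NULL;
    bool moved = false;

    if (notified)
        page->ops->unpin(page, page->owner);
    if (page->pin_count == 0) {
        page->frame = new_frame;
        moved = true;
    }
    if (notified)
        page->ops->repin(page, page->owner);
    return moved;
}
```

In this model a page pinned through drv_pin() can be moved, because the
driver is asked to step aside and is re-pointed at the new frame, while
a page pinned without callbacks still makes toy_migrate() fail.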
Re: [RFC/PATCH 0/5] Contiguous Memory Allocator and get_user_pages()
On Wed, Mar 06, 2013 at 11:48:36AM +0100, Marek Szyprowski wrote:
> Hello,
>
> On 3/6/2013 9:47 AM, Minchan Kim wrote:
> > Hello,
> >
> > On Tue, Mar 5, 2013 at 3:57 PM, Marek Szyprowski
> > <m.szyprow...@samsung.com> wrote:
> >> Hello,
> >>
> >> The Contiguous Memory Allocator is very sensitive to migration failures
> >> of individual pages. A single page that permanently fails to migrate
> >> can break large contiguous allocations and cause a multimedia device
> >> driver to fail.
> >>
> >> One of the known issues with migrating CMA pages is migrating anonymous
> >> user pages on which someone else has called get_user_pages(). That call
> >> takes a reference to the given user pages to let the kernel operate
> >> directly on the page content. It is usually used to prevent the page
> >> contents from being swapped out and to do direct DMA to/from userspace.
> >>
> >> Solving this issue requires preventing pages placed in CMA regions from
> >> being locked for a long time. Our idea is to migrate the anonymous page
> >> content before locking the page in get_user_pages(). This cannot be
> >> done automatically, as the get_user_pages() interface is used very
> >> often for various operations that usually last only a short time (for
> >> example the exec syscall). We have added a new flag indicating that the
> >> given get_user_pages() call will grab pages for a long time, so it is
> >> suitable to use the migration workaround in such cases.
> >>
> >> The proposed extension is used by V4L2/VideoBuf2
> >> (drivers/media/v4l2-core/videobuf2-dma-contig.c), but that is not the
> >> only place that might benefit from it; any driver that uses DMA to
> >> userspace with get_user_pages() could. This one is provided to
> >> demonstrate the use case.
> >>
> >> I would like to hear some comments on the presented approach. What do
> >> you think about it? Is there a chance to get such a workaround merged
> >> into mainline at some point?
> >
> > I discussed a similar patch from the memory-hotplug guys with Mel.
> > Look at http://marc.info/?l=linux-mm&m=136014458829566&w=2
> >
> > The concern is that we end up forcing the use of FOLL_DURABLE/GUP_NM on
> > all drivers and subsystems to make sure CMA/memory-hotplug works
> > well.
> >
> > You mentioned that a driver which grabs a page for a long time should
> > use the FOLL_DURABLE flag, but "for a long time" is very ambiguous. For
> > example, there is a driver:
> >
> > get_user_pages()
> > some operation.
> > put_pages
> >
> > Can you make sure that "some operation" is always really fast?
>
> Well, in our case (judging from the logs) we observed 2 usage patterns
> for get_user_pages() calls. One group was lots of short-time locks, whose
> call stacks originated in various kernel places; the second group was
> device drivers which used get_user_pages() to create a buffer for the
> DMA. Such buffers were used for the whole lifetime of the session to
> the given device, which was equivalent to infinity from the migration/CMA
> point of view. This was however based on the specific use case on our
> target system, that's why I wanted to start the discussion and find
> some generic approach.
>
> > For example, what if it depends on another event which is normally very
> > fast but quite slow once a week, or it tries to do dynamic memory
> > allocation while memory pressure is severe?
> >
> > To work 100% well, in the end we need to change all GUP users to use
> > GUP_NM or your FOLL_DURABLE, whatever, but the concern Mel pointed out
> > is that it could cause a lowmem exhaustion problem.
>
> This way we sooner or later end up without any movable pages at all.
> I assume that keeping some temporary references on movable/cma pages
> must be allowed, because otherwise we limit the functionality too much.
>
> > At the moment, there are other problems with migration which are not
> > related to your patch, e.g. zcache, zram, zswap. Their pages can't
> > be migrated out, so I think Mel's suggestion below, or some generic
> > infrastructure that can move pinned pages, is the more proper way to go.
>
> zcache/zram/zswap (zsmalloc based code) can also be extended to support
> migration. It requires a significant amount of work, but it is really
> doable.
>
> > "To guarantee CMA can migrate pages pinned by drivers I think you need
> > migrate-related callbacks to unpin, barrier the driver until migration
> > completes and repin."
>
> Right, this might improve the migration reliability. Is there any work
> being done in this direction?

See my other mail about how we (ab)use mmu_notifiers in an experimental
drm/i915 patch. I have no idea whether that's the right approach though.
But I'd certainly welcome a generic approach here which works for all
page migration users. And I guess some callback based approach is better
to handle low memory situations, since at least for drm/i915 userptr
backed buffer objects we might want to slurp in the entire available
memory. Or as much as we can get hold of at least. So moving pages to a
safe area before pinning them might not be feasible.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
Re: [RFC/PATCH 0/5] Contiguous Memory Allocator and get_user_pages()
2013/03/05 15:57, Marek Szyprowski wrote:
> Hello,
>
> The Contiguous Memory Allocator is very sensitive to migration failures
> of individual pages. A single page that permanently fails to migrate
> can break large contiguous allocations and cause a multimedia device
> driver to fail.
>
> One of the known issues with migrating CMA pages is migrating anonymous
> user pages on which someone else has called get_user_pages(). That call
> takes a reference to the given user pages to let the kernel operate
> directly on the page content. It is usually used to prevent the page
> contents from being swapped out and to do direct DMA to/from userspace.
>
> Solving this issue requires preventing pages placed in CMA regions from
> being locked for a long time. Our idea is to migrate the anonymous page
> content before locking the page in get_user_pages(). This cannot be
> done automatically, as the get_user_pages() interface is used very
> often for various operations that usually last only a short time (for
> example the exec syscall). We have added a new flag indicating that the
> given get_user_pages() call will grab pages for a long time, so it is
> suitable to use the migration workaround in such cases.
>
> The proposed extension is used by V4L2/VideoBuf2
> (drivers/media/v4l2-core/videobuf2-dma-contig.c), but that is not the
> only place that might benefit from it; any driver that uses DMA to
> userspace with get_user_pages() could. This one is provided to
> demonstrate the use case.
>
> I would like to hear some comments on the presented approach. What do
> you think about it? Is there a chance to get such a workaround merged
> into mainline at some point?

I'm interested in your idea since it seems to solve my issue:
https://lkml.org/lkml/2012/11/29/69

So I want to apply your idea to memory hotplug.

Thanks,
Yasuaki Ishimatsu

>
> Best regards
> Marek Szyprowski
> Samsung Poland R&D Center
>
>
> Patch summary:
>
> Marek Szyprowski (5):
>   mm: introduce migrate_replace_page() for migrating page to the given
>     target
>   mm: get_user_pages: use static inline
>   mm: get_user_pages: use NON-MOVABLE pages when FOLL_DURABLE flag is
>     set
>   mm: get_user_pages: migrate out CMA pages when FOLL_DURABLE flag is
>     set
>   media: vb2: use FOLL_DURABLE and __get_user_pages() to avoid CMA
>     migration issues
>
>  drivers/media/v4l2-core/videobuf2-dma-contig.c |   8 +-
>  include/linux/highmem.h                        |  12 ++-
>  include/linux/migrate.h                        |   5 +
>  include/linux/mm.h                             |  76 -
>  mm/internal.h                                  |  12 +++
>  mm/memory.c                                    | 136 +++-
>  mm/migrate.c                                   |  59 ++
>  7 files changed, 225 insertions(+), 83 deletions(-)
>
Re: [RFC/PATCH 0/5] Contiguous Memory Allocator and get_user_pages()
On Tue, Mar 5, 2013 at 7:57 AM, Marek Szyprowski wrote:
> Hello,
>
> The Contiguous Memory Allocator is very sensitive to migration failures
> of individual pages. A single page that permanently fails to migrate
> can break large contiguous allocations and cause a multimedia device
> driver to fail.
>
> One of the known issues with migrating CMA pages is migrating anonymous
> user pages on which someone else has called get_user_pages(). That call
> takes a reference to the given user pages to let the kernel operate
> directly on the page content. It is usually used to prevent the page
> contents from being swapped out and to do direct DMA to/from userspace.
>
> Solving this issue requires preventing pages placed in CMA regions from
> being locked for a long time. Our idea is to migrate the anonymous page
> content before locking the page in get_user_pages(). This cannot be
> done automatically, as the get_user_pages() interface is used very
> often for various operations that usually last only a short time (for
> example the exec syscall). We have added a new flag indicating that the
> given get_user_pages() call will grab pages for a long time, so it is
> suitable to use the migration workaround in such cases.
>
> The proposed extension is used by V4L2/VideoBuf2
> (drivers/media/v4l2-core/videobuf2-dma-contig.c), but that is not the
> only place that might benefit from it; any driver that uses DMA to
> userspace with get_user_pages() could. This one is provided to
> demonstrate the use case.
>
> I would like to hear some comments on the presented approach. What do
> you think about it? Is there a chance to get such a workaround merged
> into mainline at some point?

Imo a neat trick to make CMA work together with long-term gup'ed
userspace memory in buffer objects, but it doesn't really address the
bigger issue that such userspace pinning kills all the nice features
page migration allows. E.g. if your iommu supports huge pages and you
need those to hit some performance targets, but not for correctness
since you can fall back to normal pages.

For the userptr support we're playing around with in drm/i915 we've
opted to fix this with the mmu_notifier. That allows us to evict buffers
and unbind the mappings when the vm wants to move a page. There's still
the issue that we can't unbind it right away, but the usual retry loop
for referenced pages in the migration code should handle that like any
other short-lived locked pages for I/O. I see two issues with that
approach though:

- Needs buffer eviction support. Not really a problem for drm/i915, a
  bit of a challenge for v4l ;-)

- The mmu notifiers aren't really designed to keep track of a lot of
  tiny ranges in different mms. At least the simplistic approach
  currently used in the i915 patches to register a new mmu_notifier for
  each buffer object sucks performance wise.

For performance reasons we want to also use get_user_pages_fast, so I
don't think mixing that together with the "please migrate out of CMA"
trick here is a good thing.

Current drm/i915 wip patch is at:
https://patchwork.kernel.org/patch/1748601/

Just my 2 cents on this entire issue.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
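The notifier-plus-retry behaviour described in the message above can be
modelled in a few lines of userspace C. This is a rough sketch under
stated assumptions, not the drm/i915 patch and not the kernel's
mmu_notifier API: struct toy_buffer, toy_invalidate_range(),
toy_device_tick() and toy_migrate_with_retry() are made-up names, and
"ticks" stand in for outstanding device work that must drain before the
buffer can be unbound.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy buffer object backed by pinned user pages. */
struct toy_buffer {
    bool bound;            /* pages are mapped for the device and pinned */
    int busy_ticks;        /* device work left before unbind can finish */
    bool evict_requested;  /* set by the notifier callback */
};

/* Stand-in for a notifier callback: the VM wants to move a page. */
static void toy_invalidate_range(struct toy_buffer *bo)
{
    bo->evict_requested = true;
}

/* One unit of device progress; unbind once pending work has drained. */
static void toy_device_tick(struct toy_buffer *bo)
{
    if (bo->busy_ticks > 0)
        bo->busy_ticks--;
    if (bo->evict_requested && bo->busy_ticks == 0)
        bo->bound = false;
}

/*
 * Retry loop: notify, check, wait a little, try again.
 * Returns the attempt on which the pages were found unpinned,
 * or -1 if they were still pinned when each of max_tries was checked.
 */
static int toy_migrate_with_retry(struct toy_buffer *bo, int max_tries)
{
    assert(max_tries > 0);
    for (int attempt = 1; attempt <= max_tries; attempt++) {
        toy_invalidate_range(bo);
        if (!bo->bound)
            return attempt;
        toy_device_tick(bo);  /* models the delay between retries */
    }
    return -1;
}
```

A buffer with a little outstanding work is released after a couple of
retries, while one that stays busy longer than the retry budget makes
the migration attempt give up, which is the case the thread worries
about for CMA.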
Re: [RFC/PATCH 0/5] Contiguous Memory Allocator and get_user_pages()
On Tuesday 05 March 2013, Marek Szyprowski wrote:
> On 3/5/2013 9:50 AM, Arnd Bergmann wrote:
> > On Tuesday 05 March 2013, Marek Szyprowski wrote:
>
> The problem is that the opposite approach is imho easier.

I can understand that, yes ;-)

> get_user_pages() is used in quite a lot of places (I was quite
> surprised when I added some debug output to it and saw the logs) and
> it seems to be easier to identify the places where references are
> kept for a significant amount of time. Usually such places are in
> device drivers. In our case only videobuf2 and some closed-source
> driver were causing the real migration problems, so I decided to
> leave the default approach unchanged.
>
> If we use this workaround for every get_user_pages() call we will
> sooner or later end up with most of the anonymous pages migrated to
> non-movable pageblocks, which makes the whole CMA approach a bit
> pointless.

But you said that most users are in device drivers, and I would expect
drivers not to touch that many pages.

We already have two interfaces: the generic get_user_pages() and the
"fast" version get_user_pages_fast(), which has a number of
restrictions. We could add another such restriction to
get_user_pages_fast(): it must not hold the page reference count for an
extended time, because it will not migrate pages out.

I would assume that most of the in-kernel users of get_user_pages() that
are called a lot either already use get_user_pages_fast(), or can be
easily converted to it.

	Arnd
Re: [RFC/PATCH 0/5] Contiguous Memory Allocator and get_user_pages()
Hello,

On 3/5/2013 9:50 AM, Arnd Bergmann wrote:
> On Tuesday 05 March 2013, Marek Szyprowski wrote:
> > Solving this issue requires preventing pages placed in CMA regions
> > from being locked for a long time. Our idea is to migrate the
> > anonymous page content before locking the page in get_user_pages().
> > This cannot be done automatically, as the get_user_pages() interface
> > is used very often for various operations which usually last for a
> > short period of time (for example the exec syscall). We have added a
> > new flag indicating that the given get_user_pages() call will grab
> > pages for a long time, so it is suitable to use the migration
> > workaround in such cases.
>
> Can you explain the tradeoff here? I would have expected that the
> default should be to migrate pages out, and annotate the instances
> that we know are performance critical and short-lived. That would at
> least appear more reliable to me.

The problem is that the opposite approach is imho easier.
get_user_pages() is used in quite a lot of places (I was quite surprised
when I added some debug output to it and saw the logs) and it seems to
be easier to identify the places where references are kept for a
significant amount of time. Usually such places are in device drivers.
In our case only videobuf2 and some closed-source driver were causing
the real migration problems, so I decided to leave the default approach
unchanged.

If we use this workaround for every get_user_pages() call we will sooner
or later end up with most of the anonymous pages migrated to non-movable
pageblocks, which makes the whole CMA approach a bit pointless.

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center
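The tradeoff argued in this exchange can be made concrete with a toy count. This is a sketch only, with made-up numbers (100 get_user_pages() callers, 2 of them long-lived) and hypothetical names throughout; it is not kernel code. "Opt-in" models the RFC, where only callers that pass the new flag migrate pages out of CMA. "Opt-out" models the alternative from the question above, where every caller migrates unless it is annotated as short-lived.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_CALLS      100  /* made-up number of get_user_pages() callers */
#define TOY_LONG_LIVED 2    /* made-up number of long-lived callers */

enum toy_policy {
    TOY_OPT_IN,   /* migrate only when the long-lived flag is passed */
    TOY_OPT_OUT   /* migrate unless annotated as short-lived */
};

struct toy_result {
    size_t migrated;  /* pages moved out of CMA before pinning */
    size_t stuck;     /* long-lived pins left sitting in CMA */
};

/*
 * annotate_all says whether callers carry the annotation their policy
 * relies on: the long-lived flag for opt-in, the short-lived mark for
 * opt-out. Each caller pins one distinct page that starts out in CMA.
 */
struct toy_result toy_run(enum toy_policy policy, bool annotate_all)
{
    struct toy_result r = {0, 0};

    for (size_t i = 0; i < TOY_CALLS; i++) {
        bool long_lived = (i < TOY_LONG_LIVED);
        bool migrate;

        if (policy == TOY_OPT_IN)
            migrate = long_lived && annotate_all;
        else
            migrate = long_lived || !annotate_all;

        if (migrate)
            r.migrated++;
        else if (long_lived)
            r.stuck++;
    }
    assert(r.migrated + r.stuck <= TOY_CALLS);
    return r;
}
```

With complete annotations both policies migrate only the 2 long-lived pages. Without them, opt-in leaves 2 long-lived pins stuck in CMA (the reliability concern), while opt-out migrates all 100 pages to non-movable pageblocks (the concern that CMA becomes pointless).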
Re: [RFC/PATCH 0/5] Contiguous Memory Allocator and get_user_pages()
On Tuesday 05 March 2013, Marek Szyprowski wrote:
> Solving this issue requires preventing pages placed in CMA regions
> from being locked for a long time. Our idea is to migrate the
> anonymous page content before locking the page in get_user_pages().
> This cannot be done automatically, as the get_user_pages() interface
> is used very often for various operations which usually last for a
> short period of time (for example the exec syscall). We have added a
> new flag indicating that the given get_user_pages() call will grab
> pages for a long time, so it is suitable to use the migration
> workaround in such cases.

Can you explain the tradeoff here? I would have expected that the
default should be to migrate pages out, and annotate the instances that
we know are performance critical and short-lived. That would at least
appear more reliable to me.

	Arnd
Re: [RFC/PATCH 0/5] Contiguous Memory Allocator and get_user_pages()
2013/03/05 15:57, Marek Szyprowski wrote:
> Hello,
>
> Contiguous Memory Allocator is very sensitive to migration failures
> of individual pages. A single page that causes a permanent migration
> failure can break large contiguous allocations and cause the failure
> of a multimedia device driver.
>
> One of the known issues with migration of CMA pages is the problem of
> migrating anonymous user pages on which someone else has called
> get_user_pages(). This takes a reference to the given user pages to
> let the kernel operate directly on the page content. This is usually
> used to prevent the page contents from being swapped out and to do
> direct DMA to/from userspace.
>
> Solving this issue requires preventing pages placed in CMA regions
> from being locked for a long time. Our idea is to migrate the
> anonymous page content before locking the page in get_user_pages().
> This cannot be done automatically, as the get_user_pages() interface
> is used very often for various operations which usually last for a
> short period of time (for example the exec syscall). We have added a
> new flag indicating that the given get_user_pages() call will grab
> pages for a long time, so it is suitable to use the migration
> workaround in such cases.
>
> The proposed extension is used by V4L2/VideoBuf2
> (drivers/media/v4l2-core/videobuf2-dma-contig.c), but that is not the
> only place which might benefit from it; any driver which does DMA to
> userspace with get_user_pages() could. This one is provided to
> demonstrate the use case.
>
> I would like to hear some comments on the presented approach. What do
> you think about it? Is there a chance to get such a workaround merged
> into mainline at some point?

I'm interested in your idea, since it seems to solve my issue:
https://lkml.org/lkml/2012/11/29/69

So I want to apply your idea to memory hotplug.

Thanks,
Yasuaki Ishimatsu

> Best regards
> Marek Szyprowski
> Samsung Poland R&D Center
>
> Patch summary:
>
> Marek Szyprowski (5):
>   mm: introduce migrate_replace_page() for migrating page to the given
>     target
>   mm: get_user_pages: use static inline
>   mm: get_user_pages: use NON-MOVABLE pages when FOLL_DURABLE flag is set
>   mm: get_user_pages: migrate out CMA pages when FOLL_DURABLE flag is set
>   media: vb2: use FOLL_DURABLE and __get_user_pages() to avoid CMA
>     migration issues
>
>  drivers/media/v4l2-core/videobuf2-dma-contig.c |    8 +-
>  include/linux/highmem.h                        |   12 ++-
>  include/linux/migrate.h                        |    5 +
>  include/linux/mm.h                             |   76 -
>  mm/internal.h                                  |   12 +++
>  mm/memory.c                                    |  136 +++-
>  mm/migrate.c                                   |   59 ++
>  7 files changed, 225 insertions(+), 83 deletions(-)
[RFC/PATCH 0/5] Contiguous Memory Allocator and get_user_pages()
Hello,

Contiguous Memory Allocator is very sensitive to migration failures
of individual pages. A single page that causes a permanent migration
failure can break large contiguous allocations and cause the failure
of a multimedia device driver.

One of the known issues with migration of CMA pages is the problem of
migrating anonymous user pages on which someone else has called
get_user_pages(). This takes a reference to the given user pages to
let the kernel operate directly on the page content. This is usually
used to prevent the page contents from being swapped out and to do
direct DMA to/from userspace.

Solving this issue requires preventing pages placed in CMA regions
from being locked for a long time. Our idea is to migrate the
anonymous page content before locking the page in get_user_pages().
This cannot be done automatically, as the get_user_pages() interface
is used very often for various operations which usually last for a
short period of time (for example the exec syscall). We have added a
new flag indicating that the given get_user_pages() call will grab
pages for a long time, so it is suitable to use the migration
workaround in such cases.

The proposed extension is used by V4L2/VideoBuf2
(drivers/media/v4l2-core/videobuf2-dma-contig.c), but that is not the
only place which might benefit from it; any driver which does DMA to
userspace with get_user_pages() could. This one is provided to
demonstrate the use case.

I would like to hear some comments on the presented approach. What do
you think about it? Is there a chance to get such a workaround merged
into mainline at some point?

Best regards
Marek Szyprowski
Samsung Poland R&D Center

Patch summary:

Marek Szyprowski (5):
  mm: introduce migrate_replace_page() for migrating page to the given
    target
  mm: get_user_pages: use static inline
  mm: get_user_pages: use NON-MOVABLE pages when FOLL_DURABLE flag is set
  mm: get_user_pages: migrate out CMA pages when FOLL_DURABLE flag is set
  media: vb2: use FOLL_DURABLE and __get_user_pages() to avoid CMA
    migration issues

 drivers/media/v4l2-core/videobuf2-dma-contig.c |    8 +-
 include/linux/highmem.h                        |   12 ++-
 include/linux/migrate.h                        |    5 +
 include/linux/mm.h                             |   76 -
 mm/internal.h                                  |   12 +++
 mm/memory.c                                    |  136 +++-
 mm/migrate.c                                   |   59 ++
 7 files changed, 225 insertions(+), 83 deletions(-)

--
1.7.9.5
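The idea in the cover letter (migrate the page out of CMA before taking a long-term reference) can be sketched as a user-space toy model. This is an illustration under stated assumptions, not the patch itself: FOLL_DURABLE is the flag name from the patch summary, but its value here is arbitrary, and toy_page, toy_get_user_page, toy_cma_range_reclaimable and toy_scenario are hypothetical names. Moving a page to another pageblock is modelled by simply relabelling it, which stands in for copying its content to a newly allocated non-movable page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag name from the patch series; the value is arbitrary in this toy. */
#define FOLL_DURABLE 0x1u

enum toy_block { TOY_BLOCK_CMA, TOY_BLOCK_UNMOVABLE };

struct toy_page {
    enum toy_block block;  /* kind of pageblock the page lives in */
    int refcount;          /* extra references pin the page in place */
};

/*
 * Toy get_user_pages() for one page: with FOLL_DURABLE, a page that
 * sits in a CMA pageblock is moved out before the reference is taken.
 */
void toy_get_user_page(struct toy_page *page, unsigned int flags)
{
    assert(page->refcount >= 0);
    if ((flags & FOLL_DURABLE) && page->block == TOY_BLOCK_CMA)
        page->block = TOY_BLOCK_UNMOVABLE;  /* stands in for migration */
    page->refcount++;
}

/* A CMA range can be reclaimed only if no page still in it is pinned. */
bool toy_cma_range_reclaimable(const struct toy_page *pages, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (pages[i].block == TOY_BLOCK_CMA && pages[i].refcount > 0)
            return false;
    return true;
}

/* Pin one page of an 8-page CMA range, then try to reclaim the range. */
bool toy_scenario(unsigned int flags)
{
    struct toy_page pages[8];

    for (size_t i = 0; i < 8; i++) {
        pages[i].block = TOY_BLOCK_CMA;
        pages[i].refcount = 0;
    }
    toy_get_user_page(&pages[3], flags);
    return toy_cma_range_reclaimable(pages, 8);
}
```

In this model a single long-lived pin taken without the flag is enough to block the whole range, while the same pin taken with FOLL_DURABLE leaves the range reclaimable.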