Re: [RFC/PATCH 0/5] Contiguous Memory Allocator and get_user_pages()

2013-03-06 Thread Daniel Vetter
On Wed, Mar 06, 2013 at 11:48:36AM +0100, Marek Szyprowski wrote:
> Hello,
> 
> On 3/6/2013 9:47 AM, Minchan Kim wrote:
> >Hello,
> >
> >On Tue, Mar 5, 2013 at 3:57 PM, Marek Szyprowski
> > wrote:
> >> Hello,
> >>
> >> Contiguous Memory Allocator is very sensitive to migration failures
> >> of individual pages. A single page that causes a permanent migration
> >> failure can break large contiguous allocations and cause the failure of
> >> a multimedia device driver.
> >>
> >> One of the known issues with migration of CMA pages is the problem of
> >> migrating anonymous user pages on which somebody has called
> >> get_user_pages(). This takes a reference to the given user pages to let
> >> the kernel operate directly on the page content. This is usually used
> >> for preventing the page contents from being swapped out and for doing
> >> direct DMA to/from userspace.
> >>
> >> Solving this issue requires preventing long-term locking of pages that
> >> are placed in CMA regions. Our idea is to migrate the anonymous page
> >> content before locking the page in get_user_pages(). This cannot be done
> >> automatically, as the get_user_pages() interface is used very often for
> >> various operations that usually last for a short period of time (for
> >> example, the exec syscall). We have added a new flag indicating that the
> >> given get_user_pages() call will grab pages for a long time, thus it is
> >> suitable to use the migration workaround in such cases.
> >>
> >> The proposed extension is used by V4L2/VideoBuf2
> >> (drivers/media/v4l2-core/videobuf2-dma-contig.c), but that is not the
> >> only place which might benefit from it; any driver which uses DMA to
> >> userspace with get_user_pages() could. This one is provided to
> >> demonstrate the use case.
> >>
> >> I would like to hear some comments on the presented approach. What do
> >> you think about it? Is there a chance to get such a workaround merged
> >> into mainline at some point?
> >>
> >
> >I discussed a similar patch from the memory-hotplug guys with Mel.
> >Look at http://marc.info/?l=linux-mm&m=136014458829566&w=2
> >
> >The concern is that we end up forcing all drivers and subsystems to
> >use FOLL_DURABLE/GUP_NM to make sure CMA/memory-hotplug works
> >well.
> >
> >You mentioned that a driver which grabs a page for a long time should
> >use the FOLL_DURABLE flag, but "for a long time" is very ambiguous. For
> >example, suppose there is a driver that does:
> >
> >get_user_pages()
> >some operation.
> >put_pages
> >
> >Can you make sure that "some operation" is always really fast?
> 
> Well, in our case (judging from the logs) we observed two usage patterns
> for get_user_pages() calls. One group was lots of short-term locks, whose
> call stacks originated in various kernel places; the second group was
> device drivers which used get_user_pages() to create a buffer for
> DMA. Such buffers were used for the whole lifetime of the session with
> the given device, which was equivalent to infinity from the migration/CMA
> point of view. This was, however, based on the specific use case on our
> target system; that's why I wanted to start the discussion and find
> some generic approach.
> 
> 
> >For example, what if it depends on some other event which is normally
> >very fast but quite slow once a week, or it tries to do dynamic memory
> >allocation while memory pressure is severe?
> >
> >To work well 100% of the time, in the end we would need to change all
> >GUP users to use GUP_NM or your FOLL_DURABLE, but the concern Mel
> >pointed out is that it could cause a lowmem exhaustion problem.
> 
> This way we sooner or later end up without any movable pages at all.
> I assume that keeping some temporary references on movable/cma pages
> must be allowed, because otherwise we limit the functionality too much.
> 
> >At the moment, there are other problems with migration which are not
> >related to your patch, e.g. zcache, zram, zswap. Their pages cannot be
> >migrated, so I think Mel's suggestion below, or some generic
> >infrastructure that can move pinned pages, is the more proper way to go.
> 
> zcache/zram/zswap (zsmalloc-based code) can also be extended to support
> migration. It requires a significant amount of work, but it is really
> doable.
> 
> >"To guarantee CMA can migrate pages pinned by drivers I think you need
> >migrate-related callbacks to unpin, barrier the driver until migration
> >completes and repin."
> 
> Right, this might improve the migration reliability. Is there any work
> being done in this direction?

See my other mail about how we (ab)use mmu_notifiers in an experimental
drm/i915 patch. I have no idea whether that's the right approach though.
But I'd certainly welcome a generic approach here which works for all
page migration users. And I guess some callback based approach is better
to handle low memory situations, since at least for drm/i915 userptr
backed buffer objects we might want to slurp in the entire available
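memory. Or as much as we can get hold of at least. So moving pages to a
safe area before pinning them might not be feasible.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch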

Re: [RFC/PATCH 0/5] Contiguous Memory Allocator and get_user_pages()

2013-03-06 Thread Marek Szyprowski

Hello,

On 3/6/2013 9:47 AM, Minchan Kim wrote:

> Hello,
>
> On Tue, Mar 5, 2013 at 3:57 PM, Marek Szyprowski
> wrote:
>> Hello,
>>
>> Contiguous Memory Allocator is very sensitive to migration failures
>> of individual pages. A single page that causes a permanent migration
>> failure can break large contiguous allocations and cause the failure of
>> a multimedia device driver.
>>
>> One of the known issues with migration of CMA pages is the problem of
>> migrating anonymous user pages on which somebody has called
>> get_user_pages(). This takes a reference to the given user pages to let
>> the kernel operate directly on the page content. This is usually used
>> for preventing the page contents from being swapped out and for doing
>> direct DMA to/from userspace.
>>
>> Solving this issue requires preventing long-term locking of pages that
>> are placed in CMA regions. Our idea is to migrate the anonymous page
>> content before locking the page in get_user_pages(). This cannot be done
>> automatically, as the get_user_pages() interface is used very often for
>> various operations that usually last for a short period of time (for
>> example, the exec syscall). We have added a new flag indicating that the
>> given get_user_pages() call will grab pages for a long time, thus it is
>> suitable to use the migration workaround in such cases.
>>
>> The proposed extension is used by V4L2/VideoBuf2
>> (drivers/media/v4l2-core/videobuf2-dma-contig.c), but that is not the
>> only place which might benefit from it; any driver which uses DMA to
>> userspace with get_user_pages() could. This one is provided to
>> demonstrate the use case.
>>
>> I would like to hear some comments on the presented approach. What do
>> you think about it? Is there a chance to get such a workaround merged
>> into mainline at some point?
>>
>
> I discussed a similar patch from the memory-hotplug guys with Mel.
> Look at http://marc.info/?l=linux-mm&m=136014458829566&w=2
>
> The concern is that we end up forcing all drivers and subsystems to
> use FOLL_DURABLE/GUP_NM to make sure CMA/memory-hotplug works
> well.
>
> You mentioned that a driver which grabs a page for a long time should
> use the FOLL_DURABLE flag, but "for a long time" is very ambiguous. For
> example, suppose there is a driver that does:
>
> get_user_pages()
> some operation.
> put_pages
>
> Can you make sure that "some operation" is always really fast?

Well, in our case (judging from the logs) we observed two usage patterns
for get_user_pages() calls. One group was lots of short-term locks, whose
call stacks originated in various kernel places; the second group was
device drivers which used get_user_pages() to create a buffer for
DMA. Such buffers were used for the whole lifetime of the session with
the given device, which was equivalent to infinity from the migration/CMA
point of view. This was, however, based on the specific use case on our
target system; that's why I wanted to start the discussion and find
some generic approach.

> For example, what if it depends on some other event which is normally
> very fast but quite slow once a week, or it tries to do dynamic memory
> allocation while memory pressure is severe?
>
> To work well 100% of the time, in the end we would need to change all
> GUP users to use GUP_NM or your FOLL_DURABLE, but the concern Mel
> pointed out is that it could cause a lowmem exhaustion problem.

This way we sooner or later end up without any movable pages at all.
I assume that keeping some temporary references on movable/CMA pages
must be allowed, because otherwise we limit the functionality too much.

> At the moment, there are other problems with migration which are not
> related to your patch, e.g. zcache, zram, zswap. Their pages cannot be
> migrated, so I think Mel's suggestion below, or some generic
> infrastructure that can move pinned pages, is the more proper way to go.

zcache/zram/zswap (zsmalloc-based code) can also be extended to support
migration. It requires a significant amount of work, but it is really
doable.

> "To guarantee CMA can migrate pages pinned by drivers I think you need
> migrate-related callbacks to unpin, barrier the driver until migration
> completes and repin."

Right, this might improve the migration reliability. Is there any work
being done in this direction?

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC/PATCH 0/5] Contiguous Memory Allocator and get_user_pages()

2013-03-06 Thread Minchan Kim
Hello,

On Tue, Mar 5, 2013 at 3:57 PM, Marek Szyprowski
 wrote:
> Hello,
>
> Contiguous Memory Allocator is very sensitive to migration failures
> of individual pages. A single page that causes a permanent migration
> failure can break large contiguous allocations and cause the failure of
> a multimedia device driver.
>
> One of the known issues with migration of CMA pages is the problem of
> migrating anonymous user pages on which somebody has called
> get_user_pages(). This takes a reference to the given user pages to let
> the kernel operate directly on the page content. This is usually used
> for preventing the page contents from being swapped out and for doing
> direct DMA to/from userspace.
>
> Solving this issue requires preventing long-term locking of pages that
> are placed in CMA regions. Our idea is to migrate the anonymous page
> content before locking the page in get_user_pages(). This cannot be done
> automatically, as the get_user_pages() interface is used very often for
> various operations that usually last for a short period of time (for
> example, the exec syscall). We have added a new flag indicating that the
> given get_user_pages() call will grab pages for a long time, thus it is
> suitable to use the migration workaround in such cases.
>
> The proposed extension is used by V4L2/VideoBuf2
> (drivers/media/v4l2-core/videobuf2-dma-contig.c), but that is not the
> only place which might benefit from it; any driver which uses DMA to
> userspace with get_user_pages() could. This one is provided to
> demonstrate the use case.
>
> I would like to hear some comments on the presented approach. What do
> you think about it? Is there a chance to get such a workaround merged
> into mainline at some point?
>

I discussed a similar patch from the memory-hotplug guys with Mel.
Look at http://marc.info/?l=linux-mm&m=136014458829566&w=2

The concern is that we end up forcing all drivers and subsystems to
use FOLL_DURABLE/GUP_NM to make sure CMA/memory-hotplug works
well.

You mentioned that a driver which grabs a page for a long time should
use the FOLL_DURABLE flag, but "for a long time" is very ambiguous. For
example, suppose there is a driver that does:

get_user_pages()
some operation.
put_pages

Can you make sure that "some operation" is always really fast?
For example, what if it depends on some other event which is normally
very fast but quite slow once a week, or it tries to do dynamic memory
allocation while memory pressure is severe?
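
The difference between a flagged and an unflagged pin can be made concrete
with a small user-space C model. This is only a sketch under stated
assumptions: struct page_model, gup_model(), put_page_model(),
cma_alloc_model(), GUP_DURABLE and the frame numbers are all hypothetical
stand-ins invented for illustration, not kernel APIs. GUP_DURABLE merely
plays the role of the proposed FOLL_DURABLE flag.

```c
#include <assert.h>
#include <stdbool.h>

#define CMA_START   0    /* frames [CMA_START, CMA_END) model the CMA region */
#define CMA_END     8
#define GUP_DURABLE 0x1  /* stand-in for the proposed FOLL_DURABLE flag */

/* Hypothetical model of a page: where it lives and how many pins it has. */
struct page_model {
    int frame;
    int pin_count;
};

static bool in_cma(const struct page_model *p)
{
    return p->frame >= CMA_START && p->frame < CMA_END;
}

/* Model of get_user_pages(): with GUP_DURABLE, an unpinned CMA page is
 * first moved to safe_frame (outside CMA), then the reference is taken. */
static void gup_model(struct page_model *p, unsigned flags, int safe_frame)
{
    assert(safe_frame >= CMA_END);
    if ((flags & GUP_DURABLE) && in_cma(p) && p->pin_count == 0)
        p->frame = safe_frame;
    p->pin_count++;
}

/* Model of dropping the reference again. */
static void put_page_model(struct page_model *p)
{
    assert(p->pin_count > 0);
    p->pin_count--;
}

/* A contiguous CMA allocation fails while any page inside the region
 * is still pinned, because that page cannot be migrated away. */
static bool cma_alloc_model(const struct page_model *pages, int n)
{
    for (int i = 0; i < n; i++)
        if (in_cma(&pages[i]) && pages[i].pin_count > 0)
            return false;
    return true;
}
```

In this model a caller that passes GUP_DURABLE never blocks
cma_alloc_model(), while a caller that omits the flag blocks it for exactly
as long as "some operation" takes, which is the ambiguity raised above.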

To work well 100% of the time, in the end we would need to change all
GUP users to use GUP_NM or your FOLL_DURABLE, but the concern Mel
pointed out is that it could cause a lowmem exhaustion problem.

At the moment, there are other problems with migration which are not
related to your patch, e.g. zcache, zram, zswap. Their pages cannot be
migrated, so I think Mel's suggestion below, or some generic
infrastructure that can move pinned pages, is the more proper way to go.

"To guarantee CMA can migrate pages pinned by drivers I think you need
migrate-related callbacks to unpin, barrier the driver until migration
completes and repin."
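
The unpin / barrier / repin sequence in the quoted suggestion can be
sketched as a small user-space C model. Everything here (struct page_model,
struct pin_owner, owner_unpin(), owner_repin(), migrate_pinned_page()) is a
hypothetical name chosen for illustration, not an existing kernel
interface, and a real implementation would also have to copy the data and
update the page tables.

```c
#include <assert.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct page_model {
    int frame;       /* physical frame the data currently lives in */
    int pin_count;   /* references held on the page */
};

struct pin_owner {
    struct page_model *page;
    int dma_active;    /* 1 while the device may touch the page */
    int mapped_frame;  /* frame currently programmed into the device */
};

/* Driver callback: quiesce DMA ("barrier the driver") and drop the pin. */
static void owner_unpin(struct pin_owner *o)
{
    o->dma_active = 0;
    o->page->pin_count--;
}

/* Driver callback: take the pin again and point the device at the
 * frame the page occupies now. */
static void owner_repin(struct pin_owner *o)
{
    o->page->pin_count++;
    o->mapped_frame = o->page->frame;
    o->dma_active = 1;
}

/* Returns 0 if the page was moved, -1 if another reference kept it in
 * place. The owner is repinned in both cases. */
static int migrate_pinned_page(struct page_model *p, struct pin_owner *o,
                               int new_frame)
{
    int ret = 0;

    assert(o->page == p);
    owner_unpin(o);
    if (p->pin_count == 0)
        p->frame = new_frame;   /* the copy and remap would happen here */
    else
        ret = -1;
    owner_repin(o);
    return ret;
}
```

The sketch also shows the limit of the idea: the move only succeeds when
every holder of a reference takes part in the callbacks, so a single
uncooperative pin still makes migrate_pinned_page() return -1.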

Thanks.

-- 
Kind regards,
Minchan Kim


Re: [RFC/PATCH 0/5] Contiguous Memory Allocator and get_user_pages()

2013-03-05 Thread Yasuaki Ishimatsu
2013/03/05 15:57, Marek Szyprowski wrote:
> Hello,
> 
> Contiguous Memory Allocator is very sensitive to migration failures
> of individual pages. A single page that causes a permanent migration
> failure can break large contiguous allocations and cause the failure of
> a multimedia device driver.
>
> One of the known issues with migration of CMA pages is the problem of
> migrating anonymous user pages on which somebody has called
> get_user_pages(). This takes a reference to the given user pages to let
> the kernel operate directly on the page content. This is usually used
> for preventing the page contents from being swapped out and for doing
> direct DMA to/from userspace.
>
> Solving this issue requires preventing long-term locking of pages that
> are placed in CMA regions. Our idea is to migrate the anonymous page
> content before locking the page in get_user_pages(). This cannot be done
> automatically, as the get_user_pages() interface is used very often for
> various operations that usually last for a short period of time (for
> example, the exec syscall). We have added a new flag indicating that the
> given get_user_pages() call will grab pages for a long time, thus it is
> suitable to use the migration workaround in such cases.
>
> The proposed extension is used by V4L2/VideoBuf2
> (drivers/media/v4l2-core/videobuf2-dma-contig.c), but that is not the
> only place which might benefit from it; any driver which uses DMA to
> userspace with get_user_pages() could. This one is provided to
> demonstrate the use case.
>
> I would like to hear some comments on the presented approach. What do
> you think about it? Is there a chance to get such a workaround merged
> into mainline at some point?

I'm interested in your idea since it seems that the idea solves my issue:
https://lkml.org/lkml/2012/11/29/69

So I want to apply your idea to memory hotplug.

Thanks,
Yasuaki Ishimatsu

> 
> Best regards
> Marek Szyprowski
> Samsung Poland R&D Center
> 
> 
> Patch summary:
> 
> Marek Szyprowski (5):
>   mm: introduce migrate_replace_page() for migrating page to the given
>     target
>   mm: get_user_pages: use static inline
>   mm: get_user_pages: use NON-MOVABLE pages when FOLL_DURABLE flag is
>     set
>   mm: get_user_pages: migrate out CMA pages when FOLL_DURABLE flag is
>     set
>   media: vb2: use FOLL_DURABLE and __get_user_pages() to avoid CMA
>     migration issues
> 
>  drivers/media/v4l2-core/videobuf2-dma-contig.c |    8 +-
>  include/linux/highmem.h                        |   12 ++-
>  include/linux/migrate.h                        |    5 +
>  include/linux/mm.h                             |   76 -
>  mm/internal.h                                  |   12 +++
>  mm/memory.c                                    |  136 +++-
>  mm/migrate.c                                   |   59 ++
>  7 files changed, 225 insertions(+), 83 deletions(-)
> 




Re: [RFC/PATCH 0/5] Contiguous Memory Allocator and get_user_pages()

2013-03-05 Thread Daniel Vetter
On Tue, Mar 5, 2013 at 7:57 AM, Marek Szyprowski
 wrote:
> Hello,
>
> Contiguous Memory Allocator is very sensitive to migration failures
> of individual pages. A single page that causes a permanent migration
> failure can break large contiguous allocations and cause the failure of
> a multimedia device driver.
>
> One of the known issues with migration of CMA pages is the problem of
> migrating anonymous user pages on which somebody has called
> get_user_pages(). This takes a reference to the given user pages to let
> the kernel operate directly on the page content. This is usually used
> for preventing the page contents from being swapped out and for doing
> direct DMA to/from userspace.
>
> Solving this issue requires preventing long-term locking of pages that
> are placed in CMA regions. Our idea is to migrate the anonymous page
> content before locking the page in get_user_pages(). This cannot be done
> automatically, as the get_user_pages() interface is used very often for
> various operations that usually last for a short period of time (for
> example, the exec syscall). We have added a new flag indicating that the
> given get_user_pages() call will grab pages for a long time, thus it is
> suitable to use the migration workaround in such cases.
>
> The proposed extension is used by V4L2/VideoBuf2
> (drivers/media/v4l2-core/videobuf2-dma-contig.c), but that is not the
> only place which might benefit from it; any driver which uses DMA to
> userspace with get_user_pages() could. This one is provided to
> demonstrate the use case.
>
> I would like to hear some comments on the presented approach. What do
> you think about it? Is there a chance to get such a workaround merged
> into mainline at some point?

Imo neat trick to make CMA work together with long-term gup'ed
userspace memory in buffer objects, but doesn't really address the
bigger issue that such userspace pinning kills all the nice features
page migration allows. E.g. if your iommu supports huge pages and you
need those to hit some performance targets, but not correctness since
you can fall back to normal pages.

For the userptr support we're playing around with in drm/i915 we've
opted to fix this with the mmu_notifier. That allows us to evict
buffers and unbind the mappings when the vm wants to move a page.
There's still the issue that we can't unbind it right away, but the
usual retry loop for referenced pages in the migration code should
handle that like any other short-lived locked pages for I/O. I see two
issues with that approach though:
- Needs buffer eviction support. Not really a problem for drm/i915, a
bit of a challenge for v4l ;-)
- The mmu notifiers aren't really designed to keep track of a lot of
tiny ranges in different mms. At least the simplistic approach
currently used in the i915 patches to register a new mmu_notifier for
each buffer object sucks performance wise.
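
To make the notifier-based flow described above a bit more concrete, here
is a small userspace toy model. It is not i915 code and does not use the
real mmu_notifier API; every toy_* name is made up for this sketch. It only
shows the shape of the idea: the VM asks the buffer object to let go of a
page, the object may not be able to unbind right away, and the migration
side simply retries as it would for any short-lived reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 8

/* Extra references held on each page (0 means the page can be moved). */
static int toy_refcount[TOY_NPAGES];

/* Hypothetical buffer object that pins a range of user pages. */
struct toy_bo {
    size_t first, count;   /* pinned page range [first, first + count) */
    bool bound;            /* true while the pages are pinned for the device */
    int busy_ticks;        /* > 0 while the device is still using the buffer */
};

/* Pin every page of the range, as a long-term gup user would. */
static void toy_bo_bind(struct toy_bo *bo)
{
    assert(bo->first + bo->count <= TOY_NPAGES);
    for (size_t i = 0; i < bo->count; i++)
        toy_refcount[bo->first + i]++;
    bo->bound = true;
}

/* Notifier-style callback: the VM wants to move page idx. */
static void toy_bo_invalidate(struct toy_bo *bo, size_t idx)
{
    if (!bo->bound || idx < bo->first || idx >= bo->first + bo->count)
        return;                 /* not our page, nothing to do */
    if (bo->busy_ticks > 0) {   /* cannot unbind right away */
        bo->busy_ticks--;
        return;
    }
    for (size_t i = 0; i < bo->count; i++)
        toy_refcount[bo->first + i]--;   /* evict: drop all pins */
    bo->bound = false;
}

/* Migration side: notify, then retry while the page is still referenced. */
static bool toy_migrate(struct toy_bo *bo, size_t idx, int max_tries)
{
    for (int attempt = 0; attempt < max_tries; attempt++) {
        toy_bo_invalidate(bo, idx);
        if (toy_refcount[idx] == 0)
            return true;        /* page is free to move */
    }
    return false;               /* still pinned, caller may retry later */
}
```

In this model a buffer that is busy for two ticks makes the first two
migration attempts fail, and a later attempt succeeds once the object has
dropped its pins. The model has a single buffer object; the per-object
registration cost mentioned in the second bullet is not represented.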

For performance reasons we want to also use get_user_pages_fast, so I
don't think mixing that together with the "please migrate out of CMA"
trick here is a good thing.

Current drm/i915 wip patch is at: https://patchwork.kernel.org/patch/1748601/

Just my 2 cents on this entire issue.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC/PATCH 0/5] Contiguous Memory Allocator and get_user_pages()

2013-03-05 Thread Arnd Bergmann
On Tuesday 05 March 2013, Marek Szyprowski wrote:
> On 3/5/2013 9:50 AM, Arnd Bergmann wrote:
> > On Tuesday 05 March 2013, Marek Szyprowski wrote:
> 
> The problem is that the opposite approach is imho easier.

I can understand that, yes ;-)

> get_user_pages()
> is used in quite a lot of places (I was quite surprised when I added some
> debug to it and saw the logs) and it seems to be easier to identify places
> where references are kept for a significant amount of time. Usually such
> places are in the device drivers. In our case only videobuf2 and some
> closed-source driver were causing the real migration problems, so I decided
> to leave the default approach unchanged.
>
> If we use this workaround for every get_user_pages() call we will sooner or
> later end up with most of the anonymous pages migrated to non-movable
> pageblocks, which makes the whole CMA approach a bit pointless.

But you said that most users are in device drivers, and I would expect drivers
not to touch that many pages.

We already have two interfaces: the generic get_user_pages and the "fast"
version "get_user_pages_fast" that has a number of restrictions. We could
add another such restriction to get_user_pages_fast(), which is that it
must not hold the page reference count for an extended time because it will
not migrate pages out.

I would assume that most of the in-kernel users of get_user_pages() that
are called a lot either already use get_user_pages_fast, or can be easily
converted to it.

Arnd


Re: [RFC/PATCH 0/5] Contiguous Memory Allocator and get_user_pages()

2013-03-05 Thread Marek Szyprowski

Hello,

On 3/5/2013 9:50 AM, Arnd Bergmann wrote:
> On Tuesday 05 March 2013, Marek Szyprowski wrote:
> > Solving this issue requires that pages placed in CMA regions are not
> > locked for a long time. Our idea is to migrate the anonymous page
> > content before locking the page in get_user_pages(). This cannot be
> > done automatically, as the get_user_pages() interface is used very
> > often for various operations which usually last for a short period of
> > time (for example, the exec syscall). We have added a new flag
> > indicating that the given get_user_pages() call will grab pages for a
> > long time, so it is suitable to use the migration workaround in such
> > cases.
>
> Can you explain the tradeoff here? I would have expected that the default
> should be to migrate pages out, and annotate the instances that we know
> are performance critical and short-lived. That would at least appear
> more reliable to me.


The problem is that the opposite approach is imho easier. get_user_pages()
is used in quite a lot of places (I was quite surprised when I added some
debug to it and saw the logs) and it seems to be easier to identify places
where references are kept for a significant amount of time. Usually such
places are in the device drivers. In our case only videobuf2 and some
closed-source driver were causing the real migration problems, so I decided
to leave the default approach unchanged.

If we use this workaround for every get_user_pages() call we will sooner or
later end up with most of the anonymous pages migrated to non-movable
pageblocks, which makes the whole CMA approach a bit pointless.
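
The argument above can be put into numbers with a deliberately crude
counting model. This is only a sketch: the policy names and the function
are hypothetical, and it assumes each anonymous page in a CMA region is
pinned exactly once, with every durable_every-th pin being a long-term one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policies for when a pin migrates the page out of CMA. */
enum toy_policy {
    TOY_MIGRATE_ALWAYS,       /* every get_user_pages()-style call migrates */
    TOY_MIGRATE_IF_DURABLE    /* only calls flagged as long-term migrate */
};

/*
 * Count how many of npages anonymous pages are still in the CMA region
 * after each one has been pinned once. Every durable_every-th pin is a
 * long-term one; durable_every == 0 means there are no long-term pins.
 */
static size_t toy_pages_left_in_cma(size_t npages, size_t durable_every,
                                    enum toy_policy policy)
{
    size_t left = 0;

    for (size_t i = 0; i < npages; i++) {
        bool durable = durable_every != 0 && (i % durable_every) == 0;
        bool migrated = (policy == TOY_MIGRATE_ALWAYS) || durable;

        if (!migrated)
            left++;
    }
    return left;
}
```

With 100 pages and one long-term pin in ten, the flag-based policy leaves
90 pages in the CMA region, while migrating on every call leaves none,
which is the "a bit pointless" outcome described above.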

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center




Re: [RFC/PATCH 0/5] Contiguous Memory Allocator and get_user_pages()

2013-03-05 Thread Arnd Bergmann
On Tuesday 05 March 2013, Marek Szyprowski wrote:
> Solving this issue requires that pages placed in CMA regions are not
> locked for a long time. Our idea is to migrate the anonymous page
> content before locking the page in get_user_pages(). This cannot be
> done automatically, as the get_user_pages() interface is used very
> often for various operations which usually last for a short period of
> time (for example, the exec syscall). We have added a new flag
> indicating that the given get_user_pages() call will grab pages for a
> long time, so it is suitable to use the migration workaround in such
> cases.

Can you explain the tradeoff here? I would have expected that the default
should be to migrate pages out, and annotate the instances that we know
are performance critical and short-lived. That would at least appear
more reliable to me.

Arnd


Re: [RFC/PATCH 0/5] Contiguous Memory Allocator and get_user_pages()

2013-03-05 Thread Yasuaki Ishimatsu
2013/03/05 15:57, Marek Szyprowski wrote:
> Hello,
>
> The Contiguous Memory Allocator is very sensitive to migration failures
> of individual pages. A single page which causes a permanent migration
> failure can break large contiguous allocations and cause the failure of
> a multimedia device driver.
>
> One of the known issues with migration of CMA pages is the problem of
> migrating anonymous user pages on which someone else has called
> get_user_pages(). This takes a reference to the given user pages to let
> the kernel operate directly on the page content. It is usually used to
> prevent the page contents from being swapped out and to do direct DMA
> to/from userspace.
>
> Solving this issue requires that pages placed in CMA regions are not
> locked for a long time. Our idea is to migrate the anonymous page
> content before locking the page in get_user_pages(). This cannot be
> done automatically, as the get_user_pages() interface is used very
> often for various operations which usually last for a short period of
> time (for example, the exec syscall). We have added a new flag
> indicating that the given get_user_pages() call will grab pages for a
> long time, so it is suitable to use the migration workaround in such
> cases.
>
> The proposed extension is used by V4L2/VideoBuf2
> (drivers/media/v4l2-core/videobuf2-dma-contig.c), but that is not the
> only place which might benefit from it: any driver which uses DMA to
> userspace with get_user_pages() could. This one is provided to
> demonstrate the use case.
>
> I would like to hear some comments on the presented approach. What do
> you think about it? Is there a chance to get such a workaround merged
> into mainline at some point?

I'm interested in your idea since it seems that the idea solves my issue:
https://lkml.org/lkml/2012/11/29/69

So I want to apply your idea to a memory hot plug.

Thanks,
Yasuaki Ishimatsu

 
> Best regards
> Marek Szyprowski
> Samsung Poland R&D Center
>
>
> Patch summary:
>
> Marek Szyprowski (5):
>   mm: introduce migrate_replace_page() for migrating page to the given
>     target
>   mm: get_user_pages: use static inline
>   mm: get_user_pages: use NON-MOVABLE pages when FOLL_DURABLE flag is
>     set
>   mm: get_user_pages: migrate out CMA pages when FOLL_DURABLE flag is
>     set
>   media: vb2: use FOLL_DURABLE and __get_user_pages() to avoid CMA
>     migration issues
>
>  drivers/media/v4l2-core/videobuf2-dma-contig.c |    8 +-
>  include/linux/highmem.h                        |   12 ++-
>  include/linux/migrate.h                        |    5 +
>  include/linux/mm.h                             |   76 -
>  mm/internal.h                                  |   12 +++
>  mm/memory.c                                    |  136 +++-
>  mm/migrate.c                                   |   59 ++
>  7 files changed, 225 insertions(+), 83 deletions(-)
 




[RFC/PATCH 0/5] Contiguous Memory Allocator and get_user_pages()

2013-03-04 Thread Marek Szyprowski
Hello,

The Contiguous Memory Allocator is very sensitive to migration failures
of individual pages. A single page which causes a permanent migration
failure can break large contiguous allocations and cause the failure of
a multimedia device driver.

One of the known issues with migration of CMA pages is the problem of
migrating anonymous user pages on which someone else has called
get_user_pages(). This takes a reference to the given user pages to let
the kernel operate directly on the page content. It is usually used to
prevent the page contents from being swapped out and to do direct DMA
to/from userspace.

Solving this issue requires that pages placed in CMA regions are not
locked for a long time. Our idea is to migrate the anonymous page
content before locking the page in get_user_pages(). This cannot be
done automatically, as the get_user_pages() interface is used very
often for various operations which usually last for a short period of
time (for example, the exec syscall). We have added a new flag
indicating that the given get_user_pages() call will grab pages for a
long time, so it is suitable to use the migration workaround in such
cases.
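
As a rough illustration of the idea in the paragraph above, here is a
userspace toy model of "migrate first, then pin, but only for long-term
callers". It is not the patch and not kernel code: struct toy_page,
struct toy_pte, toy_pin() and TOY_FOLL_DURABLE are invented for this
sketch (only the FOLL_DURABLE flag name comes from the patch summary), and
real migration has to deal with locking, mappings and failure cases that
are ignored here.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE    16
#define TOY_FOLL_DURABLE 0x1u   /* stand-in for the proposed FOLL_DURABLE */

/* Hypothetical page descriptor: where it lives and who holds it. */
struct toy_page {
    bool in_cma;                /* page sits in a CMA region */
    int refcount;               /* references taken by pin calls */
    char data[TOY_PAGE_SIZE];   /* page content */
};

/* Hypothetical "page table entry": which page backs the user address. */
struct toy_pte {
    struct toy_page *page;
};

/*
 * Pin the page behind pte. For a long-term (durable) pin of a CMA page,
 * first copy the content to the spare non-CMA page and remap, so the
 * long-lived reference never lands on a page CMA may need to move.
 * Short-lived pins leave the page where it is.
 */
static struct toy_page *toy_pin(struct toy_pte *pte, struct toy_page *spare,
                                unsigned int flags)
{
    struct toy_page *page = pte->page;

    if ((flags & TOY_FOLL_DURABLE) && page->in_cma) {
        assert(!spare->in_cma);     /* target must be outside CMA */
        memcpy(spare->data, page->data, TOY_PAGE_SIZE);
        pte->page = spare;          /* user mapping now points at the copy */
        page = spare;
    }
    page->refcount++;
    return page;
}
```

A call without the flag pins the CMA page in place, as a short exec-style
user would; a call with the flag ends up holding the non-CMA copy and
leaves the original CMA page unreferenced.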

The proposed extension is used by V4L2/VideoBuf2
(drivers/media/v4l2-core/videobuf2-dma-contig.c), but that is not the
only place which might benefit from it: any driver which uses DMA to
userspace with get_user_pages() could. This one is provided to
demonstrate the use case.

I would like to hear some comments on the presented approach. What do
you think about it? Is there a chance to get such a workaround merged
into mainline at some point?

Best regards
Marek Szyprowski
Samsung Poland R&D Center


Patch summary:

Marek Szyprowski (5):
  mm: introduce migrate_replace_page() for migrating page to the given
    target
  mm: get_user_pages: use static inline
  mm: get_user_pages: use NON-MOVABLE pages when FOLL_DURABLE flag is
    set
  mm: get_user_pages: migrate out CMA pages when FOLL_DURABLE flag is
    set
  media: vb2: use FOLL_DURABLE and __get_user_pages() to avoid CMA
    migration issues

 drivers/media/v4l2-core/videobuf2-dma-contig.c |    8 +-
 include/linux/highmem.h                        |   12 ++-
 include/linux/migrate.h                        |    5 +
 include/linux/mm.h                             |   76 -
 mm/internal.h                                  |   12 +++
 mm/memory.c                                    |  136 +++-
 mm/migrate.c                                   |   59 ++
 7 files changed, 225 insertions(+), 83 deletions(-)

-- 
1.7.9.5


