[PATCH 3/6] drm: add unified vma offset manager
On Mon, Jul 01, 2013 at 09:54:17PM +0200, David Herrmann wrote: > On Mon, Jul 1, 2013 at 9:42 PM, Daniel Vetter wrote: > > On Mon, Jul 01, 2013 at 08:33:00PM +0200, David Herrmann wrote: > >> +/** drm_vma_offset_manager_destroy() > >> + * > >> + * Destroy an object manager which was previously created via > >> + * drm_vma_offset_manager_init(). The caller must remove all allocated > >> nodes > >> + * before destroying the manager. Otherwise, drm_mm will refuse to free > >> the > >> + * requested resources. > >> + * > >> + * The manager must not be accessed after this function is called. > >> + */ > >> +void drm_vma_offset_manager_destroy(struct drm_vma_offset_manager *mgr) > >> +{ > >> + BUG_ON(!drm_mm_clean(&mgr->vm_addr_space_mm)); > > > > Please convert this to a WARN_ON - drm_mm has code to cobble over buggy > > drivers, and killing the driver with a BUG_ON if we could keep on going > > with just a WARN_ON is a real pain for development. > > Ok. Actually I think the check in the takedown function should be good enough already. I've just dumped a patch onto dri-devel which will convert that to a WARN, so I think adding a second one here is redundant. > > >> + > >> + /* take the lock to protect against buggy drivers */ > >> + write_lock(&mgr->vm_lock); > >> + drm_mm_takedown(&mgr->vm_addr_space_mm); > >> + write_unlock(&mgr->vm_lock); > >> +} > >> +EXPORT_SYMBOL(drm_vma_offset_manager_destroy); > >> + > >> +/** drm_vma_offset_lookup() > >> + * > >> + * Find a node given a start address and object size. This returns the > >> _best_ > >> + * match for the given node. That is, @start may point somewhere into a > >> valid > >> + * region and the given node will be returned, as long as the node spans > >> the > >> + * whole requested area (given the size in number of pages as @pages). > >> + * > >> + * Returns NULL if no suitable node can be found. Otherwise, the best > >> match > >> + * is returned. It's the caller's responsibility to make sure the node > >> doesn't > >> + * get destroyed before the caller can access it. > >> + */ > >> +struct drm_vma_offset_node *drm_vma_offset_lookup(struct > >> drm_vma_offset_manager *mgr, > >> + unsigned long start, > >> + unsigned long pages) > >> +{ > >> + struct drm_vma_offset_node *node, *best; > >> + struct rb_node *iter; > >> + unsigned long offset; > >> + > >> + read_lock(&mgr->vm_lock); > >> + > >> + iter = mgr->vm_addr_space_rb.rb_node; > >> + best = NULL; > >> + > >> + while (likely(iter)) { > >> + node = rb_entry(iter, struct drm_vma_offset_node, vm_rb); > >> + offset = node->vm_node.start; > >> + if (start >= offset) { > >> + iter = iter->rb_right; > >> + best = node; > >> + if (start == offset) > >> + break; > >> + } else { > >> + iter = iter->rb_left; > >> + } > >> + } > >> + > >> + /* verify that the node spans the requested area */ > >> + if (likely(best)) { > >> + offset = best->vm_node.start + best->vm_pages; > >> + if (offset > start + pages) > >> + best = NULL; > >> + } > >> + > >> + read_unlock(&mgr->vm_lock); > >> + > >> + return best; > >> +} > >> +EXPORT_SYMBOL(drm_vma_offset_lookup); > >> + > >> +/* internal helper to link @node into the rb-tree */ > >> +static void _drm_vma_offset_add_rb(struct drm_vma_offset_manager *mgr, > >> +struct drm_vma_offset_node *node) > >> +{ > >> + struct rb_node **iter = &mgr->vm_addr_space_rb.rb_node; > >> + struct rb_node *parent = NULL; > >> + struct drm_vma_offset_node *iter_node; > >> + > >> + while (likely(*iter)) { > >> + parent = *iter; > >> + iter_node = rb_entry(*iter, struct drm_vma_offset_node, > >> vm_rb); > >> + > >> + if (node->vm_node.start < iter_node->vm_node.start) > >> + iter = &(*iter)->rb_left; > >> + else if (node->vm_node.start > iter_node->vm_node.start) > >> + iter = &(*iter)->rb_right; > >> + else > >> + BUG(); > >> + } > >> + > >> + rb_link_node(&node->vm_rb, parent, iter); > >> + rb_insert_color(&node->vm_rb, &mgr->vm_addr_space_rb); > >> +} > >> + > >> +/** drm_vma_offset_add() > >> + * > >> + * Add a node to the offset-manager. If the node was already added, this > >> does > >> + * nothing and return 0. @pages is the size of the object given in number > >> of > >> + * pages. > >> + * After this call succeeds, you can access the offset of the node until > >> it > >> + * is removed again. > >> + * > >> + * If this call fails, it is safe to retry the operation or call > >> + * drm_vma_offset_remove(), anyway. However, no clean
[PATCH 3/6] drm: add unified vma offset manager
Hi On Mon, Jul 1, 2013 at 9:42 PM, Daniel Vetter wrote: > On Mon, Jul 01, 2013 at 08:33:00PM +0200, David Herrmann wrote: >> If we want to map GPU memory into user-space, we need to linearize the >> addresses to not confuse mm-core. Currently, GEM and TTM both implement >> their own offset-managers to assign a pgoff to each object for user-space >> CPU access. GEM uses a hash-table, TTM uses an rbtree. >> >> This patch provides a unified implementation that can be used to replace >> both. TTM allows partial mmaps with a given offset, so we cannot use >> hashtables as the start address may not be known at mmap time. Hence, we >> use the rbtree-implementation of TTM. >> >> We could easily update drm_mm to use an rbtree instead of a linked list >> for it's object list and thus drop the rbtree from the vma-manager. >> However, this would slow down drm_mm object allocation for all other >> use-cases (rbtree insertion) and add another 4-8 bytes to each mm node. >> Hence, use the separate tree but allow for later migration. >> >> This is a rewrite of the 2012-proposal by David Airlie >> >> Signed-off-by: David Herrmann > > This seems to sprinkle a lot of likely/unlikely. Imo those are only > justified in two cases: > - When we have real performance data proving that they're worth it. > - Sometimes they're also useful as self-documenting code for the truly > unlikely. But early do-nothing returns and error handling are > established code patterns, so imo don't qualify. > > I think none of the likely/unlikely below qualify, so imo better to remove > them. They are copied mostly from ttm_bo.c. However, I agree that they are probably unneeded. Furthermore, these aren't code-paths I'd expect to be performance-critical. I will dig into the git-history of TTM to see whether they have been in-place since the beginning or whether there is some real reason for them. > Second high-level review request: Can you please convert the very nice > documentation you already added into proper kerneldoc and link it up at an > appropriate section in the drm DocBook? Sure. > A few more comments below. I'll postpone reviewing the gem/ttm conversion > patches until this is settled a bit. > >> --- >> drivers/gpu/drm/Makefile | 2 +- >> drivers/gpu/drm/drm_vma_manager.c | 224 >> ++ >> include/drm/drm_vma_manager.h | 107 ++ >> 3 files changed, 332 insertions(+), 1 deletion(-) >> create mode 100644 drivers/gpu/drm/drm_vma_manager.c >> create mode 100644 include/drm/drm_vma_manager.h >> >> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile >> index 1c9f2439..f8851cc 100644 >> --- a/drivers/gpu/drm/Makefile >> +++ b/drivers/gpu/drm/Makefile >> @@ -12,7 +12,7 @@ drm-y := drm_auth.o drm_buffer.o drm_bufs.o >> drm_cache.o \ >> drm_platform.o drm_sysfs.o drm_hashtab.o drm_mm.o \ >> drm_crtc.o drm_modes.o drm_edid.o \ >> drm_info.o drm_debugfs.o drm_encoder_slave.o \ >> - drm_trace_points.o drm_global.o drm_prime.o >> + drm_trace_points.o drm_global.o drm_prime.o drm_vma_manager.o >> >> drm-$(CONFIG_COMPAT) += drm_ioc32.o >> drm-$(CONFIG_DRM_GEM_CMA_HELPER) += drm_gem_cma_helper.o >> diff --git a/drivers/gpu/drm/drm_vma_manager.c >> b/drivers/gpu/drm/drm_vma_manager.c >> new file mode 100644 >> index 000..c28639f >> --- /dev/null >> +++ b/drivers/gpu/drm/drm_vma_manager.c >> @@ -0,0 +1,224 @@ >> +/* >> + * Copyright (c) 2012 David Airlie >> + * Copyright (c) 2013 David Herrmann >> + * >> + * Permission is hereby granted, free of charge, to any person obtaining a >> + * copy of this software and associated documentation files (the >> "Software"), >> + * to deal in the Software without restriction, including without limitation >> + * the rights to use, copy, modify, merge, publish, distribute, sublicense, >> + * and/or sell copies of the Software, and to permit persons to whom the >> + * Software is furnished to do so, subject to the following conditions: >> + * >> + * The above copyright notice and this permission notice shall be included >> in >> + * all copies or substantial portions of the Software. >> + * >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS >> OR >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL >> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR >> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, >> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR >> + * OTHER DEALINGS IN THE SOFTWARE. >> + */ >> + >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> + >> +/** @file drm_vma_manager.c >> + * >> + * The vma-manager is responsible to map arbitrary driver-depend
[PATCH 3/6] drm: add unified vma offset manager
On Mon, Jul 01, 2013 at 08:33:00PM +0200, David Herrmann wrote: > If we want to map GPU memory into user-space, we need to linearize the > addresses to not confuse mm-core. Currently, GEM and TTM both implement > their own offset-managers to assign a pgoff to each object for user-space > CPU access. GEM uses a hash-table, TTM uses an rbtree. > > This patch provides a unified implementation that can be used to replace > both. TTM allows partial mmaps with a given offset, so we cannot use > hashtables as the start address may not be known at mmap time. Hence, we > use the rbtree-implementation of TTM. > > We could easily update drm_mm to use an rbtree instead of a linked list > for it's object list and thus drop the rbtree from the vma-manager. > However, this would slow down drm_mm object allocation for all other > use-cases (rbtree insertion) and add another 4-8 bytes to each mm node. > Hence, use the separate tree but allow for later migration. > > This is a rewrite of the 2012-proposal by David Airlie > > Signed-off-by: David Herrmann This seems to sprinkle a lot of likely/unlikely. Imo those are only justified in two cases: - When we have real performance data proving that they're worth it. - Sometimes they're also useful as self-documenting code for the truly unlikely. But early do-nothing returns and error handling are established code patterns, so imo don't qualify. I think none of the likely/unlikely below qualify, so imo better to remove them. Second high-level review request: Can you please convert the very nice documentation you already added into proper kerneldoc and link it up at an appropriate section in the drm DocBook? A few more comments below. I'll postpone reviewing the gem/ttm conversion patches until this is settled a bit. > --- > drivers/gpu/drm/Makefile | 2 +- > drivers/gpu/drm/drm_vma_manager.c | 224 > ++ > include/drm/drm_vma_manager.h | 107 ++ > 3 files changed, 332 insertions(+), 1 deletion(-) > create mode 100644 drivers/gpu/drm/drm_vma_manager.c > create mode 100644 include/drm/drm_vma_manager.h > > diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile > index 1c9f2439..f8851cc 100644 > --- a/drivers/gpu/drm/Makefile > +++ b/drivers/gpu/drm/Makefile > @@ -12,7 +12,7 @@ drm-y := drm_auth.o drm_buffer.o drm_bufs.o > drm_cache.o \ > drm_platform.o drm_sysfs.o drm_hashtab.o drm_mm.o \ > drm_crtc.o drm_modes.o drm_edid.o \ > drm_info.o drm_debugfs.o drm_encoder_slave.o \ > - drm_trace_points.o drm_global.o drm_prime.o > + drm_trace_points.o drm_global.o drm_prime.o drm_vma_manager.o > > drm-$(CONFIG_COMPAT) += drm_ioc32.o > drm-$(CONFIG_DRM_GEM_CMA_HELPER) += drm_gem_cma_helper.o > diff --git a/drivers/gpu/drm/drm_vma_manager.c > b/drivers/gpu/drm/drm_vma_manager.c > new file mode 100644 > index 000..c28639f > --- /dev/null > +++ b/drivers/gpu/drm/drm_vma_manager.c > @@ -0,0 +1,224 @@ > +/* > + * Copyright (c) 2012 David Airlie > + * Copyright (c) 2013 David Herrmann > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice shall be included in > + * all copies or substantial portions of the Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR > + * OTHER DEALINGS IN THE SOFTWARE. > + */ > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +/** @file drm_vma_manager.c > + * > + * The vma-manager is responsible to map arbitrary driver-dependent memory > + * regions into the linear user address-space. It provides offsets to the > + * caller which can then be used on the address_space of the drm-device. It > + * takes care to not overlap regions, size them appropriately and to not > + * confuse mm-core by inconsistent fake vm_pgoff fields. > + * > + * We use drm_mm as backend to manage object allocations. But it is highly > + * optimized for alloc/free calls, not lookups. Hence, we use an rb-tree to > + * speed up offset lookups.
[PATCH 3/6] drm: add unified vma offset manager
If we want to map GPU memory into user-space, we need to linearize the addresses to not confuse mm-core. Currently, GEM and TTM both implement their own offset-managers to assign a pgoff to each object for user-space CPU access. GEM uses a hash-table, TTM uses an rbtree. This patch provides a unified implementation that can be used to replace both. TTM allows partial mmaps with a given offset, so we cannot use hashtables as the start address may not be known at mmap time. Hence, we use the rbtree-implementation of TTM. We could easily update drm_mm to use an rbtree instead of a linked list for it's object list and thus drop the rbtree from the vma-manager. However, this would slow down drm_mm object allocation for all other use-cases (rbtree insertion) and add another 4-8 bytes to each mm node. Hence, use the separate tree but allow for later migration. This is a rewrite of the 2012-proposal by David Airlie Signed-off-by: David Herrmann --- drivers/gpu/drm/Makefile | 2 +- drivers/gpu/drm/drm_vma_manager.c | 224 ++ include/drm/drm_vma_manager.h | 107 ++ 3 files changed, 332 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/drm_vma_manager.c create mode 100644 include/drm/drm_vma_manager.h diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile index 1c9f2439..f8851cc 100644 --- a/drivers/gpu/drm/Makefile +++ b/drivers/gpu/drm/Makefile @@ -12,7 +12,7 @@ drm-y :=drm_auth.o drm_buffer.o drm_bufs.o drm_cache.o \ drm_platform.o drm_sysfs.o drm_hashtab.o drm_mm.o \ drm_crtc.o drm_modes.o drm_edid.o \ drm_info.o drm_debugfs.o drm_encoder_slave.o \ - drm_trace_points.o drm_global.o drm_prime.o + drm_trace_points.o drm_global.o drm_prime.o drm_vma_manager.o drm-$(CONFIG_COMPAT) += drm_ioc32.o drm-$(CONFIG_DRM_GEM_CMA_HELPER) += drm_gem_cma_helper.o diff --git a/drivers/gpu/drm/drm_vma_manager.c b/drivers/gpu/drm/drm_vma_manager.c new file mode 100644 index 000..c28639f --- /dev/null +++ b/drivers/gpu/drm/drm_vma_manager.c @@ -0,0 +1,224 @@ +/* + * Copyright (c) 2012 David Airlie + * Copyright (c) 2013 David Herrmann + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/** @file drm_vma_manager.c + * + * The vma-manager is responsible to map arbitrary driver-dependent memory + * regions into the linear user address-space. It provides offsets to the + * caller which can then be used on the address_space of the drm-device. It + * takes care to not overlap regions, size them appropriately and to not + * confuse mm-core by inconsistent fake vm_pgoff fields. + * + * We use drm_mm as backend to manage object allocations. But it is highly + * optimized for alloc/free calls, not lookups. Hence, we use an rb-tree to + * speed up offset lookups. + */ + +/** drm_vma_offset_manager_init() + * + * Initialize a new offset-manager. The offset and area size available for the + * manager are given as @page_offset and @size. Both are interpreted as + * page-numbers, not bytes. + * + * Adding/removing nodes from the manager is locked internally and protected + * against concurrent access. However, node allocation and destruction is left + * for the caller. While calling into the vma-manager, a given node must + * always be guaranteed to be referenced. + */ +void drm_vma_offset_manager_init(struct drm_vma_offset_manager *mgr, +unsigned long page_offset, unsigned long size) +{ + rwlock_init(&mgr->vm_lock); + mgr->vm_addr_space_rb = RB_ROOT; + drm_mm_init(&mgr->vm_addr_space_mm, page_offset, size); +} +EXPORT_SYMBOL(drm_vma_offset_manager_init); + +/** drm_vma_offset_manager_destroy() + * + * Destroy an object manager which was previously created via + * drm
Re: [PATCH 3/6] drm: add unified vma offset manager
On Mon, Jul 01, 2013 at 09:54:17PM +0200, David Herrmann wrote: > On Mon, Jul 1, 2013 at 9:42 PM, Daniel Vetter wrote: > > On Mon, Jul 01, 2013 at 08:33:00PM +0200, David Herrmann wrote: > >> +/** drm_vma_offset_manager_destroy() > >> + * > >> + * Destroy an object manager which was previously created via > >> + * drm_vma_offset_manager_init(). The caller must remove all allocated > >> nodes > >> + * before destroying the manager. Otherwise, drm_mm will refuse to free > >> the > >> + * requested resources. > >> + * > >> + * The manager must not be accessed after this function is called. > >> + */ > >> +void drm_vma_offset_manager_destroy(struct drm_vma_offset_manager *mgr) > >> +{ > >> + BUG_ON(!drm_mm_clean(&mgr->vm_addr_space_mm)); > > > > Please convert this to a WARN_ON - drm_mm has code to cobble over buggy > > drivers, and killing the driver with a BUG_ON if we could keep on going > > with just a WARN_ON is a real pain for development. > > Ok. Actually I think the check in the takedown function should be good enough already. I've just dumped a patch onto dri-devel which will convert that to a WARN, so I think adding a second one here is redundant. > > >> + > >> + /* take the lock to protect against buggy drivers */ > >> + write_lock(&mgr->vm_lock); > >> + drm_mm_takedown(&mgr->vm_addr_space_mm); > >> + write_unlock(&mgr->vm_lock); > >> +} > >> +EXPORT_SYMBOL(drm_vma_offset_manager_destroy); > >> + > >> +/** drm_vma_offset_lookup() > >> + * > >> + * Find a node given a start address and object size. This returns the > >> _best_ > >> + * match for the given node. That is, @start may point somewhere into a > >> valid > >> + * region and the given node will be returned, as long as the node spans > >> the > >> + * whole requested area (given the size in number of pages as @pages). > >> + * > >> + * Returns NULL if no suitable node can be found. Otherwise, the best > >> match > >> + * is returned. It's the caller's responsibility to make sure the node > >> doesn't > >> + * get destroyed before the caller can access it. > >> + */ > >> +struct drm_vma_offset_node *drm_vma_offset_lookup(struct > >> drm_vma_offset_manager *mgr, > >> + unsigned long start, > >> + unsigned long pages) > >> +{ > >> + struct drm_vma_offset_node *node, *best; > >> + struct rb_node *iter; > >> + unsigned long offset; > >> + > >> + read_lock(&mgr->vm_lock); > >> + > >> + iter = mgr->vm_addr_space_rb.rb_node; > >> + best = NULL; > >> + > >> + while (likely(iter)) { > >> + node = rb_entry(iter, struct drm_vma_offset_node, vm_rb); > >> + offset = node->vm_node.start; > >> + if (start >= offset) { > >> + iter = iter->rb_right; > >> + best = node; > >> + if (start == offset) > >> + break; > >> + } else { > >> + iter = iter->rb_left; > >> + } > >> + } > >> + > >> + /* verify that the node spans the requested area */ > >> + if (likely(best)) { > >> + offset = best->vm_node.start + best->vm_pages; > >> + if (offset > start + pages) > >> + best = NULL; > >> + } > >> + > >> + read_unlock(&mgr->vm_lock); > >> + > >> + return best; > >> +} > >> +EXPORT_SYMBOL(drm_vma_offset_lookup); > >> + > >> +/* internal helper to link @node into the rb-tree */ > >> +static void _drm_vma_offset_add_rb(struct drm_vma_offset_manager *mgr, > >> +struct drm_vma_offset_node *node) > >> +{ > >> + struct rb_node **iter = &mgr->vm_addr_space_rb.rb_node; > >> + struct rb_node *parent = NULL; > >> + struct drm_vma_offset_node *iter_node; > >> + > >> + while (likely(*iter)) { > >> + parent = *iter; > >> + iter_node = rb_entry(*iter, struct drm_vma_offset_node, > >> vm_rb); > >> + > >> + if (node->vm_node.start < iter_node->vm_node.start) > >> + iter = &(*iter)->rb_left; > >> + else if (node->vm_node.start > iter_node->vm_node.start) > >> + iter = &(*iter)->rb_right; > >> + else > >> + BUG(); > >> + } > >> + > >> + rb_link_node(&node->vm_rb, parent, iter); > >> + rb_insert_color(&node->vm_rb, &mgr->vm_addr_space_rb); > >> +} > >> + > >> +/** drm_vma_offset_add() > >> + * > >> + * Add a node to the offset-manager. If the node was already added, this > >> does > >> + * nothing and return 0. @pages is the size of the object given in number > >> of > >> + * pages. > >> + * After this call succeeds, you can access the offset of the node until > >> it > >> + * is removed again. > >> + * > >> + * If this call fails, it is safe to retry the operation or call > >> + * drm_vma_offset_remove(), anyway. However, no clean
Re: [PATCH 3/6] drm: add unified vma offset manager
Hi On Mon, Jul 1, 2013 at 9:42 PM, Daniel Vetter wrote: > On Mon, Jul 01, 2013 at 08:33:00PM +0200, David Herrmann wrote: >> If we want to map GPU memory into user-space, we need to linearize the >> addresses to not confuse mm-core. Currently, GEM and TTM both implement >> their own offset-managers to assign a pgoff to each object for user-space >> CPU access. GEM uses a hash-table, TTM uses an rbtree. >> >> This patch provides a unified implementation that can be used to replace >> both. TTM allows partial mmaps with a given offset, so we cannot use >> hashtables as the start address may not be known at mmap time. Hence, we >> use the rbtree-implementation of TTM. >> >> We could easily update drm_mm to use an rbtree instead of a linked list >> for it's object list and thus drop the rbtree from the vma-manager. >> However, this would slow down drm_mm object allocation for all other >> use-cases (rbtree insertion) and add another 4-8 bytes to each mm node. >> Hence, use the separate tree but allow for later migration. >> >> This is a rewrite of the 2012-proposal by David Airlie >> >> Signed-off-by: David Herrmann > > This seems to sprinkle a lot of likely/unlikely. Imo those are only > justified in two cases: > - When we have real performance data proving that they're worth it. > - Sometimes they're also useful as self-documenting code for the truly > unlikely. But early do-nothing returns and error handling are > established code patterns, so imo don't qualify. > > I think none of the likely/unlikely below qualify, so imo better to remove > them. They are copied mostly from ttm_bo.c. However, I agree that they are probably unneeded. Furthermore, these aren't code-paths I'd expect to be performance-critical. I will dig into the git-history of TTM to see whether they have been in-place since the beginning or whether there is some real reason for them. > Second high-level review request: Can you please convert the very nice > documentation you already added into proper kerneldoc and link it up at an > appropriate section in the drm DocBook? Sure. > A few more comments below. I'll postpone reviewing the gem/ttm conversion > patches until this is settled a bit. > >> --- >> drivers/gpu/drm/Makefile | 2 +- >> drivers/gpu/drm/drm_vma_manager.c | 224 >> ++ >> include/drm/drm_vma_manager.h | 107 ++ >> 3 files changed, 332 insertions(+), 1 deletion(-) >> create mode 100644 drivers/gpu/drm/drm_vma_manager.c >> create mode 100644 include/drm/drm_vma_manager.h >> >> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile >> index 1c9f2439..f8851cc 100644 >> --- a/drivers/gpu/drm/Makefile >> +++ b/drivers/gpu/drm/Makefile >> @@ -12,7 +12,7 @@ drm-y := drm_auth.o drm_buffer.o drm_bufs.o >> drm_cache.o \ >> drm_platform.o drm_sysfs.o drm_hashtab.o drm_mm.o \ >> drm_crtc.o drm_modes.o drm_edid.o \ >> drm_info.o drm_debugfs.o drm_encoder_slave.o \ >> - drm_trace_points.o drm_global.o drm_prime.o >> + drm_trace_points.o drm_global.o drm_prime.o drm_vma_manager.o >> >> drm-$(CONFIG_COMPAT) += drm_ioc32.o >> drm-$(CONFIG_DRM_GEM_CMA_HELPER) += drm_gem_cma_helper.o >> diff --git a/drivers/gpu/drm/drm_vma_manager.c >> b/drivers/gpu/drm/drm_vma_manager.c >> new file mode 100644 >> index 000..c28639f >> --- /dev/null >> +++ b/drivers/gpu/drm/drm_vma_manager.c >> @@ -0,0 +1,224 @@ >> +/* >> + * Copyright (c) 2012 David Airlie >> + * Copyright (c) 2013 David Herrmann >> + * >> + * Permission is hereby granted, free of charge, to any person obtaining a >> + * copy of this software and associated documentation files (the >> "Software"), >> + * to deal in the Software without restriction, including without limitation >> + * the rights to use, copy, modify, merge, publish, distribute, sublicense, >> + * and/or sell copies of the Software, and to permit persons to whom the >> + * Software is furnished to do so, subject to the following conditions: >> + * >> + * The above copyright notice and this permission notice shall be included >> in >> + * all copies or substantial portions of the Software. >> + * >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS >> OR >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL >> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR >> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, >> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR >> + * OTHER DEALINGS IN THE SOFTWARE. >> + */ >> + >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> + >> +/** @file drm_vma_manager.c >> + * >> + * The vma-manager is responsible to map arbitrary driver-depend
Re: [PATCH 3/6] drm: add unified vma offset manager
On Mon, Jul 01, 2013 at 08:33:00PM +0200, David Herrmann wrote: > If we want to map GPU memory into user-space, we need to linearize the > addresses to not confuse mm-core. Currently, GEM and TTM both implement > their own offset-managers to assign a pgoff to each object for user-space > CPU access. GEM uses a hash-table, TTM uses an rbtree. > > This patch provides a unified implementation that can be used to replace > both. TTM allows partial mmaps with a given offset, so we cannot use > hashtables as the start address may not be known at mmap time. Hence, we > use the rbtree-implementation of TTM. > > We could easily update drm_mm to use an rbtree instead of a linked list > for it's object list and thus drop the rbtree from the vma-manager. > However, this would slow down drm_mm object allocation for all other > use-cases (rbtree insertion) and add another 4-8 bytes to each mm node. > Hence, use the separate tree but allow for later migration. > > This is a rewrite of the 2012-proposal by David Airlie > > Signed-off-by: David Herrmann This seems to sprinkle a lot of likely/unlikely. Imo those are only justified in two cases: - When we have real performance data proving that they're worth it. - Sometimes they're also useful as self-documenting code for the truly unlikely. But early do-nothing returns and error handling are established code patterns, so imo don't qualify. I think none of the likely/unlikely below qualify, so imo better to remove them. Second high-level review request: Can you please convert the very nice documentation you already added into proper kerneldoc and link it up at an appropriate section in the drm DocBook? A few more comments below. I'll postpone reviewing the gem/ttm conversion patches until this is settled a bit. > --- > drivers/gpu/drm/Makefile | 2 +- > drivers/gpu/drm/drm_vma_manager.c | 224 > ++ > include/drm/drm_vma_manager.h | 107 ++ > 3 files changed, 332 insertions(+), 1 deletion(-) > create mode 100644 drivers/gpu/drm/drm_vma_manager.c > create mode 100644 include/drm/drm_vma_manager.h > > diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile > index 1c9f2439..f8851cc 100644 > --- a/drivers/gpu/drm/Makefile > +++ b/drivers/gpu/drm/Makefile > @@ -12,7 +12,7 @@ drm-y := drm_auth.o drm_buffer.o drm_bufs.o > drm_cache.o \ > drm_platform.o drm_sysfs.o drm_hashtab.o drm_mm.o \ > drm_crtc.o drm_modes.o drm_edid.o \ > drm_info.o drm_debugfs.o drm_encoder_slave.o \ > - drm_trace_points.o drm_global.o drm_prime.o > + drm_trace_points.o drm_global.o drm_prime.o drm_vma_manager.o > > drm-$(CONFIG_COMPAT) += drm_ioc32.o > drm-$(CONFIG_DRM_GEM_CMA_HELPER) += drm_gem_cma_helper.o > diff --git a/drivers/gpu/drm/drm_vma_manager.c > b/drivers/gpu/drm/drm_vma_manager.c > new file mode 100644 > index 000..c28639f > --- /dev/null > +++ b/drivers/gpu/drm/drm_vma_manager.c > @@ -0,0 +1,224 @@ > +/* > + * Copyright (c) 2012 David Airlie > + * Copyright (c) 2013 David Herrmann > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice shall be included in > + * all copies or substantial portions of the Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR > + * OTHER DEALINGS IN THE SOFTWARE. > + */ > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +/** @file drm_vma_manager.c > + * > + * The vma-manager is responsible to map arbitrary driver-dependent memory > + * regions into the linear user address-space. It provides offsets to the > + * caller which can then be used on the address_space of the drm-device. It > + * takes care to not overlap regions, size them appropriately and to not > + * confuse mm-core by inconsistent fake vm_pgoff fields. > + * > + * We use drm_mm as backend to manage object allocations. But it is highly > + * optimized for alloc/free calls, not lookups. Hence, we use an rb-tree to > + * speed up offset lookups.
[PATCH 3/6] drm: add unified vma offset manager
If we want to map GPU memory into user-space, we need to linearize the addresses to not confuse mm-core. Currently, GEM and TTM both implement their own offset-managers to assign a pgoff to each object for user-space CPU access. GEM uses a hash-table, TTM uses an rbtree. This patch provides a unified implementation that can be used to replace both. TTM allows partial mmaps with a given offset, so we cannot use hashtables as the start address may not be known at mmap time. Hence, we use the rbtree-implementation of TTM. We could easily update drm_mm to use an rbtree instead of a linked list for it's object list and thus drop the rbtree from the vma-manager. However, this would slow down drm_mm object allocation for all other use-cases (rbtree insertion) and add another 4-8 bytes to each mm node. Hence, use the separate tree but allow for later migration. This is a rewrite of the 2012-proposal by David Airlie Signed-off-by: David Herrmann --- drivers/gpu/drm/Makefile | 2 +- drivers/gpu/drm/drm_vma_manager.c | 224 ++ include/drm/drm_vma_manager.h | 107 ++ 3 files changed, 332 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/drm_vma_manager.c create mode 100644 include/drm/drm_vma_manager.h diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile index 1c9f2439..f8851cc 100644 --- a/drivers/gpu/drm/Makefile +++ b/drivers/gpu/drm/Makefile @@ -12,7 +12,7 @@ drm-y :=drm_auth.o drm_buffer.o drm_bufs.o drm_cache.o \ drm_platform.o drm_sysfs.o drm_hashtab.o drm_mm.o \ drm_crtc.o drm_modes.o drm_edid.o \ drm_info.o drm_debugfs.o drm_encoder_slave.o \ - drm_trace_points.o drm_global.o drm_prime.o + drm_trace_points.o drm_global.o drm_prime.o drm_vma_manager.o drm-$(CONFIG_COMPAT) += drm_ioc32.o drm-$(CONFIG_DRM_GEM_CMA_HELPER) += drm_gem_cma_helper.o diff --git a/drivers/gpu/drm/drm_vma_manager.c b/drivers/gpu/drm/drm_vma_manager.c new file mode 100644 index 000..c28639f --- /dev/null +++ b/drivers/gpu/drm/drm_vma_manager.c @@ -0,0 +1,224 @@ +/* + * Copyright (c) 2012 David Airlie + * Copyright (c) 2013 David Herrmann + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/** @file drm_vma_manager.c + * + * The vma-manager is responsible to map arbitrary driver-dependent memory + * regions into the linear user address-space. It provides offsets to the + * caller which can then be used on the address_space of the drm-device. It + * takes care to not overlap regions, size them appropriately and to not + * confuse mm-core by inconsistent fake vm_pgoff fields. + * + * We use drm_mm as backend to manage object allocations. But it is highly + * optimized for alloc/free calls, not lookups. Hence, we use an rb-tree to + * speed up offset lookups. + */ + +/** drm_vma_offset_manager_init() + * + * Initialize a new offset-manager. The offset and area size available for the + * manager are given as @page_offset and @size. Both are interpreted as + * page-numbers, not bytes. + * + * Adding/removing nodes from the manager is locked internally and protected + * against concurrent access. However, node allocation and destruction is left + * for the caller. While calling into the vma-manager, a given node must + * always be guaranteed to be referenced. + */ +void drm_vma_offset_manager_init(struct drm_vma_offset_manager *mgr, +unsigned long page_offset, unsigned long size) +{ + rwlock_init(&mgr->vm_lock); + mgr->vm_addr_space_rb = RB_ROOT; + drm_mm_init(&mgr->vm_addr_space_mm, page_offset, size); +} +EXPORT_SYMBOL(drm_vma_offset_manager_init); + +/** drm_vma_offset_manager_destroy() + * + * Destroy an object manager which was previously created via + * dr