Re: [linux-next:master] [mm/slab] 7bd230a266: WARNING:at_mm/util.c:#kvmalloc_node_noprof
this looks like an i915 bug On Wed, May 15, 2024 at 10:41:19AM +0800, kernel test robot wrote: > > > Hello, > > as we understand, this commit is not the root-cause of this WARNING. the > WARNING > just shows in another way by commit changes. > > 53ed0af496422959 7bd230a26648ac68ab3731ebbc4 > --- >fail:runs %reproductionfail:runs >| | | > 6:6 -83%:6 dmesg.RIP:kvmalloc_node >:6 33% 6:6 dmesg.RIP:kvmalloc_node_noprof > 6:6 -83%:6 > dmesg.WARNING:at_mm/util.c:#kvmalloc_node >:6 33% 6:6 > dmesg.WARNING:at_mm/util.c:#kvmalloc_node_noprof > > > but we failed to bisect "dmesg.WARNING:at_mm/util.c:#kvmalloc_node". > > we still made this report FYI what we observed in our tests, not sure if it > could give somebody some hints to find the real problem then judge if a fix > is needed. > > below is full report. > > > > kernel test robot noticed "WARNING:at_mm/util.c:#kvmalloc_node_noprof" on: > > commit: 7bd230a26648ac68ab3731ebbc449090f0ac6a37 ("mm/slab: enable slab > allocation tagging for kmalloc and friends") > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master > > [test failed on linux-next/master 6ba6c795dc73c22ce2c86006f17c4aa802db2a60] > > in testcase: igt > version: igt-x86_64-86712f2ef-1_20240511 > with following parameters: > > group: gem_exec_reloc > > > > compiler: gcc-13 > test machine: 20 threads 1 sockets (Commet Lake) with 16G memory > > (please refer to attached dmesg/kmsg for entire log/backtrace) > > > > If you fix the issue in a separate patch/commit (i.e. not just a new version > of > the same patch/commit), kindly add following tags > | Reported-by: kernel test robot > | Closes: > https://lore.kernel.org/oe-lkp/202405151008.6ddd1aaf-oliver.s...@intel.com > > > [ 940.101700][ T5353] [ cut here ] > [ 940.107107][ T5353] WARNING: CPU: 1 PID: 5353 at mm/util.c:649 > kvmalloc_node_noprof (mm/util.c:649 (discriminator 1)) > [ 940.116178][ T5353] Modules linked in: netconsole btrfs blake2b_generic > xor zstd_compress intel_rapl_msr intel_rapl_common intel_uncore_frequency > intel_uncore_frequency_common raid6_pq libcrc32c x86_pkg_temp_thermal > intel_powerclamp coretemp kvm_intel sd_mod t10_pi crc64_rocksoft_generic > crc64_rocksoft ipmi_devintf crc64 sg ipmi_msghandler kvm crct10dif_pclmul > crc32_pclmul crc32c_intel ghash_clmulni_intel i915 sha512_ssse3 sdhci_pci > drm_buddy cqhci ahci rapl intel_gtt drm_display_helper sdhci libahci mei_me > ttm intel_cstate i2c_designware_platform ppdev intel_uncore > intel_wmi_thunderbolt wmi_bmof libata mei i2c_designware_core idma64 > drm_kms_helper mmc_core i2c_i801 i2c_smbus intel_pch_thermal parport_pc video > parport pinctrl_cannonlake wmi acpi_pad acpi_tad serio_raw binfmt_misc drm > fuse loop dm_mod ip_tables > [ 940.188041][ T5353] CPU: 1 PID: 5353 Comm: gem_exec_reloc Not tainted > 6.9.0-rc4-00085-g7bd230a26648 #1 > [ 940.197459][ T5353] RIP: 0010:kvmalloc_node_noprof (mm/util.c:649 > (discriminator 1)) > [ 940.203412][ T5353] Code: 04 a3 0d 00 48 83 c4 18 48 83 c4 08 5b 5d 41 5c > 41 5d 41 5e c3 cc cc cc cc 49 be 00 00 00 00 00 20 00 00 eb 9f 80 e7 20 75 de > <0f> 0b eb da 48 c7 c7 10 ec af 84 e8 0e a6 18 00 e9 3f ff ff ff 48 > All code > >0: 04 a3 add$0xa3,%al >2: 0d 00 48 83 c4 or $0xc4834800,%eax >7: 18 48 83sbb%cl,-0x7d(%rax) >a: c4 (bad) >b: 08 5b 5dor %bl,0x5d(%rbx) >e: 41 5c pop%r12 > 10: 41 5d pop%r13 > 12: 41 5e pop%r14 > 14: c3 retq > 15: cc int3 > 16: cc int3 > 17: cc int3 > 18: cc int3 > 19: 49 be 00 00 00 00 00movabs $0x2000,%r14 > 20: 20 00 00 > 23: eb 9f jmp0xffc4 > 25: 80 e7 20and$0x20,%bh > 28: 75 de jne0x8 > 2a:*0f 0b ud2 <-- trapping instruction > 2c: eb da jmp0x8 > 2e: 48 c7 c7 10 ec af 84mov$0x84afec10,%rdi > 35: e8 0e a6 18 00 callq 0x18a648 > 3a: e9 3f ff ff ff jmpq 0xff7e > 3f: 48 rex.W > > Code starting with the faulting instruction > === >0: 0f 0b ud2 >2: eb da jmp0xffde >4: 48 c7 c7 10 ec af 84mov$0x84afec10,%rdi >b: e8 0e a6 18 00 callq 0x18a61e > 10: e9 3f ff ff ff jmpq
Re: [FYI][PATCH] tracing/treewide: Remove second parameter of __assign_str()
On Fri, Feb 23, 2024 at 01:46:53PM -0500, Steven Rostedt wrote: > On Fri, 23 Feb 2024 10:30:45 -0800 > Jeff Johnson wrote: > > > On 2/23/2024 9:56 AM, Steven Rostedt wrote: > > > From: "Steven Rostedt (Google)" > > > > > > [ > > >This is a treewide change. I will likely re-create this patch again in > > >the second week of the merge window of v6.9 and submit it then. Hoping > > >to keep the conflicts that it will cause to a minimum. > > > ] > > > > > > With the rework of how the __string() handles dynamic strings where it > > > saves off the source string in field in the helper structure[1], the > > > assignment of that value to the trace event field is stored in the helper > > > value and does not need to be passed in again. > > > > Just curious if this could be done piecemeal by first changing the > > macros to be variadic macros which allows you to ignore the extra > > argument. The callers could then be modified in their separate trees. > > And then once all the callers have be merged, the macros could be > > changed to no longer be variadic. > > I weighed doing that, but I think ripping off the band-aid is a better > approach. One thing I found is that leaving unused parameters in the macros > can cause bugs itself. I found one case doing my clean up, where an unused > parameter in one of the macros was bogus, and when I made it a used > parameter, it broke the build. > > I think for tree-wide changes, the preferred approach is to do one big > patch at once. And since this only affects TRACE_EVENT() macros, it > hopefully would not be too much of a burden (although out of tree users may > suffer from this, but do we care?) Agreed on doing it all at once, it'll be way less spam for people to deal with. Tangentially related though, what would make me really happy is if we could create the string with in the TP__fast_assign() section. I have to have a bunch of annoying wrappers right now because the string length has to be known when we invoke the tracepoint.
Re: [PATCH 16/22] bcachefs: mark bch2_target_to_text_sb() static
On Wed, Nov 08, 2023 at 01:58:37PM +0100, Arnd Bergmann wrote: > From: Arnd Bergmann > > bch2_target_to_text_sb() is only called in the file it is defined in, > and it has no extern prototype: > > fs/bcachefs/disk_groups.c:583:6: error: no previous prototype for > 'bch2_target_to_text_sb' [-Werror=missing-prototypes] > > Mark it static to avoid the warning and have the code better optimized. > > Fixes: bf0d9e89de2e ("bcachefs: Split apart bch2_target_to_text(), > bch2_target_to_text_sb()") > Signed-off-by: Arnd Bergmann This is already fixed in my tree.
Re: [RFC v3 11/19] kunit: add Python libraries for handing KUnit config and kernel
On Thu, Dec 06, 2018 at 12:32:47PM +, Kieran Bingham wrote: > Oh - although, yes - there are some good concepts there - but I'm a bit > weary of how easy it would be to 'run' the said test against multiple > kernel version libraries... there would be a lot of possible ABI > conflicts perhaps. > > My main initial idea for a libumlinux is to provide infrastructure such > as our linked-lists and other kernel formatting so that we can take > kernel code directly to userspace for test and debug (assuming that > there are no hardware dependencies or things that we can't mock out) I think this would be a really wonderful to make happen, and could potentially be much wore widely useful than for just running tests, by making it easier to share code between both kernel and userspace. For bcachefs I've got a shim layer that lets me build almost everything in fs/bcachefs and use it as a library in the userspace bcachefs-tools - e.g. for fsck and migrate. Mine was a quick and dirty hack, but even so it's been _extremely_ useful and a major success - I think if this became something more official a lot of uses would be found for it. I'm not sure if you've actually started on this (haven't seen most of the thread yet), but if any of the bcachefs-tools shim code is useful feel free to steal it - I've got dirt-simple, minimum viable shims for the kthread api, workqueus, timers, the block layer, and assorted other stuff: https://evilpiepirate.org/git/bcachefs-tools.git/ Going forward, one issue is going to be that a libumllinux is going to want to shim some interfaces, and for other things it'll just want to pull in the kernel implementation - e.g. rhashtables. It might be nice if we could refactor things a bit so that things like rhashtables could be built as a standalone library, as is. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 10/10] idr: Rework idr_preload()
The old idr_preload() used percpu buffers - since the bitmap/radix/whatever tree only grew by fixed sized nodes, it only had to ensure there was a node available in the percpu buffer and disable preemption. This conveniently meant that you didn't have to pass the idr you were going to allocate from to it. With the new ida implementation, that doesn't work anymore - the new ida code grows its bitmap tree by reallocating the entire thing in power of two increments. Doh. So we need a slightly different trick. Note that if all allocations from an idr start by calling idr_prealloc() and disabling premption, at most nr_cpus() allocations can happen before someone calls idr_prealloc() again. So, we just change idr_prealloc() to resize the ida bitmap tree if there's less than num_possible_cpus() ids available - conveniently, we already track the number of allocated ids, and the total number of ids we can have allocated is just nr_leaf_nodes * BITS_PER_LONG. Easy. This does require a fairly trivial interface change - we now have to pass the idr we're going to allocate from (and the starting id we're going to pass to idr_allocate_range()) to idr_prealloc(), so this patch updates all the callers. Signed-off-by: Kent Overstreet Cc: Andrew Morton Cc: Tejun Heo Cc: Stefan Richter Cc: David Airlie Cc: Roland Dreier Cc: Sean Hefty Cc: Hal Rosenstock Cc: Steve Wise Cc: Hoang-Nam Nguyen Cc: Christoph Raisch Cc: Mike Marciniszyn Cc: Doug Gilbert Cc: "James E.J. Bottomley" Cc: Christine Caulfield Cc: David Teigland Cc: Trond Myklebust Cc: John McCutchan Cc: Robert Love Cc: Eric Paris Cc: Dave Airlie Cc: Thomas Hellstrom Cc: Brian Paul Cc: Maarten Lankhorst Cc: Dmitry Torokhov Cc: Erez Shitrit Cc: Al Viro Cc: Haggai Eran Cc: Jack Morgenstein Cc: Wolfram Sang Cc: Greg Kroah-Hartman Cc: Davidlohr Bueso Cc: Rik van Riel Cc: Michel Lespinasse Cc: linux1394-devel at lists.sourceforge.net Cc: linux-kernel at vger.kernel.org Cc: dri-devel at lists.freedesktop.org Cc: linux-rdma at vger.kernel.org Cc: linux-scsi at vger.kernel.org Cc: cluster-devel at redhat.com Cc: linux-nfs at vger.kernel.org --- drivers/firewire/core-cdev.c | 2 +- drivers/gpu/drm/drm_gem.c | 4 +- drivers/gpu/drm/vmwgfx/vmwgfx_resource.c | 2 +- drivers/infiniband/core/cm.c | 8 +--- drivers/infiniband/core/sa_query.c | 2 +- drivers/infiniband/core/uverbs_cmd.c | 2 +- drivers/infiniband/hw/cxgb3/iwch.h | 2 +- drivers/infiniband/hw/cxgb4/iw_cxgb4.h | 2 +- drivers/infiniband/hw/ehca/ehca_cq.c | 2 +- drivers/infiniband/hw/ehca/ehca_qp.c | 2 +- drivers/infiniband/hw/ipath/ipath_driver.c | 2 +- drivers/infiniband/hw/mlx4/cm.c| 2 +- drivers/infiniband/hw/qib/qib_init.c | 2 +- drivers/scsi/sg.c | 2 +- fs/dlm/lock.c | 2 +- fs/dlm/recover.c | 2 +- fs/nfs/nfs4client.c| 2 +- fs/notify/inotify/inotify_user.c | 2 +- include/linux/idr.h| 37 + ipc/util.c | 4 +- lib/idr.c | 66 ++ 21 files changed, 91 insertions(+), 60 deletions(-) diff --git a/drivers/firewire/core-cdev.c b/drivers/firewire/core-cdev.c index ba78d08..08d31da 100644 --- a/drivers/firewire/core-cdev.c +++ b/drivers/firewire/core-cdev.c @@ -491,7 +491,7 @@ static int add_client_resource(struct client *client, int ret; if (preload) - idr_preload(gfp_mask); + idr_preload(>resource_idr, 0, gfp_mask); spin_lock_irqsave(>lock, flags); if (client->in_shutdown) diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c index d12ea60..c8ed531 100644 --- a/drivers/gpu/drm/drm_gem.c +++ b/drivers/gpu/drm/drm_gem.c @@ -269,7 +269,7 @@ drm_gem_handle_create(struct drm_file *file_priv, * Get the user-visible handle using idr. Preload and perform * allocation under our spinlock. */ - idr_preload(GFP_KERNEL); + idr_preload(_priv->object_idr, 1, GFP_KERNEL); spin_lock(_priv->table_lock); ret = idr_alloc_range(_priv->object_idr, obj, 1, 0, GFP_NOWAIT); @@ -445,7 +445,7 @@ drm_gem_flink_ioctl(struct drm_device *dev, void *data, if (obj == NULL) return -ENOENT; - idr_preload(GFP_KERNEL); + idr_preload(>object_name_idr, 1, GFP_KERNEL); spin_lock(>object_name_lock); if (!obj->name) { ret = idr_alloc_range(>object_name_idr, diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c index 4838238..1078b51 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c @@ -177,7 +177,7 @@ int vmw_re
[PATCH 09/10] idr: Remove unneeded idr locking, idr_preload() usage
Our new idr implementation does its own locking, instead of forcing it onto the callers like the old implementation. Many of the existing idr users need locking for more than just idr_alloc()/idr_remove()/idr_find() - they're taking refcounts and such under their locks and we can't touch those. But a significant number of users had locks that protected nothing more than the idr data structures itself - those we can get rid of. Note that we have to be careful when removing locks; in some places, locks appear to only be protecting idr_alloc()/idr_remove() calls but they're also used by other code that needs to ensure the idr isn't modified while it's doing something else - so ideally we want to delete the lock that protected the idr, or else we have to carefully audit all the other places it's used. There's also a fair number of places where things were being done under the idr lock unnecessarily; drivers/dca/dca-sysfs.c is a good example. dca->id is set to the id idr_alloc() returns under the lock - but there's no idr_find() calls under the lock, and dca->id isn't touched in the idr_remove() paths. So the lock can be safely deleted. The really nice thing about deleting this unnecessary locking is that it lets us trivially delete a lot of now unnecessary idr_preload() - with idr doing its own locking, we can pass GFP_KERNEL to idr_alloc() just fine. Signed-off-by: Kent Overstreet Cc: Andrew Morton Cc: Tejun Heo Cc: David Airlie Cc: Tom Tucker Cc: Steve Wise Cc: Roland Dreier Cc: Sean Hefty Cc: Hal Rosenstock Cc: Alasdair Kergon Cc: dm-devel at redhat.com Cc: Samuel Ortiz Cc: Alex Dubov Cc: Arnd Bergmann Cc: Greg Kroah-Hartman Cc: Chris Ball Cc: "James E.J. Bottomley" Cc: Willem Riede Cc: "Kai M?kisara" Cc: "Nicholas A. Bellinger" Cc: Li Zefan Cc: Vlad Yasevich Cc: Neil Horman Cc: "David S. Miller" Cc: Dave Airlie Cc: Alon Levy Cc: Guennadi Liakhovetski Cc: Christoph Hellwig Cc: linux-kernel at vger.kernel.org Cc: dri-devel at lists.freedesktop.org Cc: linux-rdma at vger.kernel.org Cc: linux-mmc at vger.kernel.org Cc: linux-scsi at vger.kernel.org Cc: osst-users at lists.sourceforge.net Cc: target-devel at vger.kernel.org Cc: containers at lists.linux-foundation.org Cc: cgroups at vger.kernel.org Cc: linux-sctp at vger.kernel.org Cc: netdev at vger.kernel.org --- drivers/dca/dca-sysfs.c | 18 +++--- drivers/gpu/drm/qxl/qxl_cmd.c | 4 +--- drivers/gpu/drm/qxl/qxl_drv.h | 1 - drivers/gpu/drm/qxl/qxl_kms.c | 1 - drivers/gpu/drm/qxl/qxl_release.c | 20 +--- drivers/infiniband/hw/amso1100/c2.h | 1 - drivers/infiniband/hw/amso1100/c2_qp.c| 20 ++-- drivers/md/dm.c | 24 drivers/memstick/core/memstick.c | 17 +++-- drivers/mfd/rtsx_pcr.c| 13 +++-- drivers/misc/c2port/core.c| 11 +-- drivers/misc/tifm_core.c | 15 +++ drivers/mmc/core/host.c | 13 ++--- drivers/scsi/ch.c | 13 + drivers/scsi/st.c | 13 + drivers/target/iscsi/iscsi_target.c | 17 - drivers/target/iscsi/iscsi_target.h | 1 - drivers/target/iscsi/iscsi_target_login.c | 12 ++-- include/linux/cgroup.h| 1 - include/net/sctp/sctp.h | 1 - kernel/cgroup.c | 9 + kernel/workqueue.c| 15 ++- net/9p/util.c | 15 +-- net/sctp/associola.c | 14 ++ net/sctp/protocol.c | 1 - net/sctp/socket.c | 2 -- 26 files changed, 41 insertions(+), 231 deletions(-) diff --git a/drivers/dca/dca-sysfs.c b/drivers/dca/dca-sysfs.c index effda66..6be5fbd 100644 --- a/drivers/dca/dca-sysfs.c +++ b/drivers/dca/dca-sysfs.c @@ -31,7 +31,6 @@ static struct class *dca_class; static struct idr dca_idr; -static spinlock_t dca_idr_lock; int dca_sysfs_add_req(struct dca_provider *dca, struct device *dev, int slot) { @@ -55,23 +54,15 @@ int dca_sysfs_add_provider(struct dca_provider *dca, struct device *dev) struct device *cd; int ret; - idr_preload(GFP_KERNEL); - spin_lock(_idr_lock); - - ret = idr_alloc(_idr, dca, GFP_NOWAIT); - if (ret >= 0) - dca->id = ret; - - spin_unlock(_idr_lock); - idr_preload_end(); + ret = idr_alloc(_idr, dca, GFP_KERNEL); if (ret < 0) return ret; + dca->id = ret; + cd = device_create(dca_class, dev, MKDEV(0, 0), NULL, "dca%d", dca->id); if (IS_ERR(cd)) { - spin_lock(_idr_lock);
[PATCH 09/10] idr: Remove unneeded idr locking, idr_preload() usage
Our new idr implementation does its own locking, instead of forcing it onto the callers like the old implementation. Many of the existing idr users need locking for more than just idr_alloc()/idr_remove()/idr_find() - they're taking refcounts and such under their locks and we can't touch those. But a significant number of users had locks that protected nothing more than the idr data structures itself - those we can get rid of. Note that we have to be careful when removing locks; in some places, locks appear to only be protecting idr_alloc()/idr_remove() calls but they're also used by other code that needs to ensure the idr isn't modified while it's doing something else - so ideally we want to delete the lock that protected the idr, or else we have to carefully audit all the other places it's used. There's also a fair number of places where things were being done under the idr lock unnecessarily; drivers/dca/dca-sysfs.c is a good example. dca-id is set to the id idr_alloc() returns under the lock - but there's no idr_find() calls under the lock, and dca-id isn't touched in the idr_remove() paths. So the lock can be safely deleted. The really nice thing about deleting this unnecessary locking is that it lets us trivially delete a lot of now unnecessary idr_preload() - with idr doing its own locking, we can pass GFP_KERNEL to idr_alloc() just fine. Signed-off-by: Kent Overstreet k...@daterainc.com Cc: Andrew Morton a...@linux-foundation.org Cc: Tejun Heo t...@kernel.org Cc: David Airlie airl...@linux.ie Cc: Tom Tucker t...@opengridcomputing.com Cc: Steve Wise sw...@opengridcomputing.com Cc: Roland Dreier rol...@kernel.org Cc: Sean Hefty sean.he...@intel.com Cc: Hal Rosenstock hal.rosenst...@gmail.com Cc: Alasdair Kergon a...@redhat.com Cc: dm-de...@redhat.com Cc: Samuel Ortiz sa...@linux.intel.com Cc: Alex Dubov oa...@yahoo.com Cc: Arnd Bergmann a...@arndb.de Cc: Greg Kroah-Hartman gre...@linuxfoundation.org Cc: Chris Ball c...@laptop.org Cc: James E.J. Bottomley jbottom...@parallels.com Cc: Willem Riede o...@riede.org Cc: Kai Mäkisara kai.makis...@kolumbus.fi Cc: Nicholas A. Bellinger n...@linux-iscsi.org Cc: Li Zefan lize...@huawei.com Cc: Vlad Yasevich vyasev...@gmail.com Cc: Neil Horman nhor...@tuxdriver.com Cc: David S. Miller da...@davemloft.net Cc: Dave Airlie airl...@redhat.com Cc: Alon Levy al...@redhat.com Cc: Guennadi Liakhovetski g.liakhovet...@gmx.de Cc: Christoph Hellwig h...@lst.de Cc: linux-ker...@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linux-r...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-s...@vger.kernel.org Cc: osst-us...@lists.sourceforge.net Cc: target-de...@vger.kernel.org Cc: contain...@lists.linux-foundation.org Cc: cgro...@vger.kernel.org Cc: linux-s...@vger.kernel.org Cc: net...@vger.kernel.org --- drivers/dca/dca-sysfs.c | 18 +++--- drivers/gpu/drm/qxl/qxl_cmd.c | 4 +--- drivers/gpu/drm/qxl/qxl_drv.h | 1 - drivers/gpu/drm/qxl/qxl_kms.c | 1 - drivers/gpu/drm/qxl/qxl_release.c | 20 +--- drivers/infiniband/hw/amso1100/c2.h | 1 - drivers/infiniband/hw/amso1100/c2_qp.c| 20 ++-- drivers/md/dm.c | 24 drivers/memstick/core/memstick.c | 17 +++-- drivers/mfd/rtsx_pcr.c| 13 +++-- drivers/misc/c2port/core.c| 11 +-- drivers/misc/tifm_core.c | 15 +++ drivers/mmc/core/host.c | 13 ++--- drivers/scsi/ch.c | 13 + drivers/scsi/st.c | 13 + drivers/target/iscsi/iscsi_target.c | 17 - drivers/target/iscsi/iscsi_target.h | 1 - drivers/target/iscsi/iscsi_target_login.c | 12 ++-- include/linux/cgroup.h| 1 - include/net/sctp/sctp.h | 1 - kernel/cgroup.c | 9 + kernel/workqueue.c| 15 ++- net/9p/util.c | 15 +-- net/sctp/associola.c | 14 ++ net/sctp/protocol.c | 1 - net/sctp/socket.c | 2 -- 26 files changed, 41 insertions(+), 231 deletions(-) diff --git a/drivers/dca/dca-sysfs.c b/drivers/dca/dca-sysfs.c index effda66..6be5fbd 100644 --- a/drivers/dca/dca-sysfs.c +++ b/drivers/dca/dca-sysfs.c @@ -31,7 +31,6 @@ static struct class *dca_class; static struct idr dca_idr; -static spinlock_t dca_idr_lock; int dca_sysfs_add_req(struct dca_provider *dca, struct device *dev, int slot) { @@ -55,23 +54,15 @@ int dca_sysfs_add_provider(struct dca_provider *dca, struct device *dev) struct device *cd; int ret; - idr_preload(GFP_KERNEL); - spin_lock(dca_idr_lock); - - ret = idr_alloc(dca_idr
[PATCH 10/10] idr: Rework idr_preload()
The old idr_preload() used percpu buffers - since the bitmap/radix/whatever tree only grew by fixed sized nodes, it only had to ensure there was a node available in the percpu buffer and disable preemption. This conveniently meant that you didn't have to pass the idr you were going to allocate from to it. With the new ida implementation, that doesn't work anymore - the new ida code grows its bitmap tree by reallocating the entire thing in power of two increments. Doh. So we need a slightly different trick. Note that if all allocations from an idr start by calling idr_prealloc() and disabling premption, at most nr_cpus() allocations can happen before someone calls idr_prealloc() again. So, we just change idr_prealloc() to resize the ida bitmap tree if there's less than num_possible_cpus() ids available - conveniently, we already track the number of allocated ids, and the total number of ids we can have allocated is just nr_leaf_nodes * BITS_PER_LONG. Easy. This does require a fairly trivial interface change - we now have to pass the idr we're going to allocate from (and the starting id we're going to pass to idr_allocate_range()) to idr_prealloc(), so this patch updates all the callers. Signed-off-by: Kent Overstreet k...@daterainc.com Cc: Andrew Morton a...@linux-foundation.org Cc: Tejun Heo t...@kernel.org Cc: Stefan Richter stef...@s5r6.in-berlin.de Cc: David Airlie airl...@linux.ie Cc: Roland Dreier rol...@kernel.org Cc: Sean Hefty sean.he...@intel.com Cc: Hal Rosenstock hal.rosenst...@gmail.com Cc: Steve Wise sw...@chelsio.com Cc: Hoang-Nam Nguyen hngu...@de.ibm.com Cc: Christoph Raisch rai...@de.ibm.com Cc: Mike Marciniszyn infinip...@intel.com Cc: Doug Gilbert dgilb...@interlog.com Cc: James E.J. Bottomley jbottom...@parallels.com Cc: Christine Caulfield ccaul...@redhat.com Cc: David Teigland teigl...@redhat.com Cc: Trond Myklebust trond.mykleb...@netapp.com Cc: John McCutchan j...@johnmccutchan.com Cc: Robert Love rl...@rlove.org Cc: Eric Paris epa...@parisplace.org Cc: Dave Airlie airl...@redhat.com Cc: Thomas Hellstrom thellst...@vmware.com Cc: Brian Paul bri...@vmware.com Cc: Maarten Lankhorst maarten.lankho...@canonical.com Cc: Dmitry Torokhov d...@vmware.com Cc: Erez Shitrit ere...@mellanox.co.il Cc: Al Viro v...@zeniv.linux.org.uk Cc: Haggai Eran hagg...@mellanox.com Cc: Jack Morgenstein ja...@dev.mellanox.co.il Cc: Wolfram Sang wolf...@the-dreams.de Cc: Greg Kroah-Hartman gre...@linuxfoundation.org Cc: Davidlohr Bueso davidlohr.bu...@hp.com Cc: Rik van Riel r...@redhat.com Cc: Michel Lespinasse wal...@google.com Cc: linux1394-de...@lists.sourceforge.net Cc: linux-ker...@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linux-r...@vger.kernel.org Cc: linux-s...@vger.kernel.org Cc: cluster-de...@redhat.com Cc: linux-...@vger.kernel.org --- drivers/firewire/core-cdev.c | 2 +- drivers/gpu/drm/drm_gem.c | 4 +- drivers/gpu/drm/vmwgfx/vmwgfx_resource.c | 2 +- drivers/infiniband/core/cm.c | 8 +--- drivers/infiniband/core/sa_query.c | 2 +- drivers/infiniband/core/uverbs_cmd.c | 2 +- drivers/infiniband/hw/cxgb3/iwch.h | 2 +- drivers/infiniband/hw/cxgb4/iw_cxgb4.h | 2 +- drivers/infiniband/hw/ehca/ehca_cq.c | 2 +- drivers/infiniband/hw/ehca/ehca_qp.c | 2 +- drivers/infiniband/hw/ipath/ipath_driver.c | 2 +- drivers/infiniband/hw/mlx4/cm.c| 2 +- drivers/infiniband/hw/qib/qib_init.c | 2 +- drivers/scsi/sg.c | 2 +- fs/dlm/lock.c | 2 +- fs/dlm/recover.c | 2 +- fs/nfs/nfs4client.c| 2 +- fs/notify/inotify/inotify_user.c | 2 +- include/linux/idr.h| 37 + ipc/util.c | 4 +- lib/idr.c | 66 ++ 21 files changed, 91 insertions(+), 60 deletions(-) diff --git a/drivers/firewire/core-cdev.c b/drivers/firewire/core-cdev.c index ba78d08..08d31da 100644 --- a/drivers/firewire/core-cdev.c +++ b/drivers/firewire/core-cdev.c @@ -491,7 +491,7 @@ static int add_client_resource(struct client *client, int ret; if (preload) - idr_preload(gfp_mask); + idr_preload(client-resource_idr, 0, gfp_mask); spin_lock_irqsave(client-lock, flags); if (client-in_shutdown) diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c index d12ea60..c8ed531 100644 --- a/drivers/gpu/drm/drm_gem.c +++ b/drivers/gpu/drm/drm_gem.c @@ -269,7 +269,7 @@ drm_gem_handle_create(struct drm_file *file_priv, * Get the user-visible handle using idr. Preload and perform * allocation under our spinlock. */ - idr_preload(GFP_KERNEL); + idr_preload(file_priv-object_idr, 1, GFP_KERNEL); spin_lock(file_priv-table_lock); ret = idr_alloc_range
[PATCH 09/10] idr: Remove unneeded idr locking, idr_preload() usage
From: Kent Overstreet koverstr...@google.com Our new idr implementation does its own locking, instead of forcing it onto the callers like the old implementation. Many of the existing idr users need locking for more than just idr_alloc()/idr_remove()/idr_find() - they're taking refcounts and such under their locks and we can't touch those. But a significant number of users had locks that protected nothing more than the idr data structures itself - those we can get rid of. Note that we have to be careful when removing locks; in some places, locks appear to only be protecting idr_alloc()/idr_remove() calls but they're also used by other code that needs to ensure the idr isn't modified while it's doing something else - so ideally we want to delete the lock that protected the idr, or else we have to carefully audit all the other places it's used. There's also a fair number of places where things were being done under the idr lock unnecessarily; drivers/dca/dca-sysfs.c is a good example. dca-id is set to the id idr_alloc() returns under the lock - but there's no idr_find() calls under the lock, and dca-id isn't touched in the idr_remove() paths. So the lock can be safely deleted. The really nice thing about deleting this unnecessary locking is that it lets us trivially delete a lot of now unnecessary idr_preload() - with idr doing its own locking, we can pass GFP_KERNEL to idr_alloc() just fine. Signed-off-by: Kent Overstreet koverstr...@google.com Cc: Andrew Morton a...@linux-foundation.org Cc: Tejun Heo t...@kernel.org Cc: David Airlie airl...@linux.ie Cc: Tom Tucker t...@opengridcomputing.com Cc: Steve Wise sw...@opengridcomputing.com Cc: Roland Dreier rol...@kernel.org Cc: Sean Hefty sean.he...@intel.com Cc: Hal Rosenstock hal.rosenst...@gmail.com Cc: Alasdair Kergon a...@redhat.com Cc: dm-de...@redhat.com Cc: Samuel Ortiz sa...@linux.intel.com Cc: Alex Dubov oa...@yahoo.com Cc: Arnd Bergmann a...@arndb.de Cc: Greg Kroah-Hartman gre...@linuxfoundation.org Cc: Chris Ball c...@laptop.org Cc: James E.J. Bottomley jbottom...@parallels.com Cc: Willem Riede o...@riede.org Cc: Kai Mäkisara kai.makis...@kolumbus.fi Cc: Nicholas A. Bellinger n...@linux-iscsi.org Cc: Li Zefan lize...@huawei.com Cc: Vlad Yasevich vyasev...@gmail.com Cc: Neil Horman nhor...@tuxdriver.com Cc: David S. Miller da...@davemloft.net Cc: Dave Airlie airl...@redhat.com Cc: Alon Levy al...@redhat.com Cc: Guennadi Liakhovetski g.liakhovet...@gmx.de Cc: Christoph Hellwig h...@lst.de Cc: linux-ker...@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linux-r...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-s...@vger.kernel.org Cc: osst-us...@lists.sourceforge.net Cc: target-de...@vger.kernel.org Cc: contain...@lists.linux-foundation.org Cc: cgro...@vger.kernel.org Cc: linux-s...@vger.kernel.org Cc: net...@vger.kernel.org Signed-off-by: Kent Overstreet k...@daterainc.com --- drivers/dca/dca-sysfs.c | 18 +++--- drivers/gpu/drm/qxl/qxl_cmd.c | 4 +--- drivers/gpu/drm/qxl/qxl_drv.h | 1 - drivers/gpu/drm/qxl/qxl_kms.c | 1 - drivers/gpu/drm/qxl/qxl_release.c | 19 +-- drivers/infiniband/hw/amso1100/c2.h | 1 - drivers/infiniband/hw/amso1100/c2_qp.c| 20 ++-- drivers/md/dm.c | 22 -- drivers/memstick/core/memstick.c | 17 +++-- drivers/mfd/rtsx_pcr.c| 13 +++-- drivers/misc/c2port/core.c| 11 +-- drivers/misc/tifm_core.c | 15 +++ drivers/mmc/core/host.c | 13 ++--- drivers/scsi/ch.c | 14 ++ drivers/scsi/st.c | 13 + drivers/target/iscsi/iscsi_target.c | 17 - drivers/target/iscsi/iscsi_target.h | 1 - drivers/target/iscsi/iscsi_target_login.c | 12 ++-- include/linux/cgroup.h| 1 - include/net/sctp/sctp.h | 1 - kernel/cgroup.c | 9 + kernel/workqueue.c| 15 ++- net/9p/util.c | 15 +-- net/sctp/associola.c | 14 ++ net/sctp/protocol.c | 1 - net/sctp/socket.c | 2 -- 26 files changed, 42 insertions(+), 228 deletions(-) diff --git a/drivers/dca/dca-sysfs.c b/drivers/dca/dca-sysfs.c index effda66..6be5fbd 100644 --- a/drivers/dca/dca-sysfs.c +++ b/drivers/dca/dca-sysfs.c @@ -31,7 +31,6 @@ static struct class *dca_class; static struct idr dca_idr; -static spinlock_t dca_idr_lock; int dca_sysfs_add_req(struct dca_provider *dca, struct device *dev, int slot) { @@ -55,23 +54,15 @@ int dca_sysfs_add_provider(struct dca_provider *dca, struct device *dev) struct device *cd; int ret
[PATCH 10/10] idr: Rework idr_preload()
The old idr_preload() used percpu buffers - since the bitmap/radix/whatever tree only grew by fixed sized nodes, it only had to ensure there was a node available in the percpu buffer and disable preemption. This conveniently meant that you didn't have to pass the idr you were going to allocate from to it. With the new ida implementation, that doesn't work anymore - the new ida code grows its bitmap tree by reallocating the entire thing in power of two increments. Doh. So we need a slightly different trick. Note that if all allocations from an idr start by calling idr_prealloc() and disabling premption, at most nr_cpus() allocations can happen before someone calls idr_prealloc() again. So, we just change idr_prealloc() to resize the ida bitmap tree if there's less than num_possible_cpus() ids available - conveniently, we already track the number of allocated ids, and the total number of ids we can have allocated is just nr_leaf_nodes * BITS_PER_LONG. Easy. This does require a fairly trivial interface change - we now have to pass the idr we're going to allocate from (and the starting id we're going to pass to idr_allocate_range()) to idr_prealloc(), so this patch updates all the callers. Signed-off-by: Kent Overstreet koverstr...@google.com Cc: Andrew Morton a...@linux-foundation.org Cc: Tejun Heo t...@kernel.org Cc: Stefan Richter stef...@s5r6.in-berlin.de Cc: David Airlie airl...@linux.ie Cc: Roland Dreier rol...@kernel.org Cc: Sean Hefty sean.he...@intel.com Cc: Hal Rosenstock hal.rosenst...@gmail.com Cc: Steve Wise sw...@chelsio.com Cc: Hoang-Nam Nguyen hngu...@de.ibm.com Cc: Christoph Raisch rai...@de.ibm.com Cc: Mike Marciniszyn infinip...@intel.com Cc: Doug Gilbert dgilb...@interlog.com Cc: James E.J. Bottomley jbottom...@parallels.com Cc: Christine Caulfield ccaul...@redhat.com Cc: David Teigland teigl...@redhat.com Cc: Trond Myklebust trond.mykleb...@netapp.com Cc: John McCutchan j...@johnmccutchan.com Cc: Robert Love rl...@rlove.org Cc: Eric Paris epa...@parisplace.org Cc: Dave Airlie airl...@redhat.com Cc: Thomas Hellstrom thellst...@vmware.com Cc: Brian Paul bri...@vmware.com Cc: Maarten Lankhorst maarten.lankho...@canonical.com Cc: Dmitry Torokhov d...@vmware.com Cc: Erez Shitrit ere...@mellanox.co.il Cc: Al Viro v...@zeniv.linux.org.uk Cc: Haggai Eran hagg...@mellanox.com Cc: Jack Morgenstein ja...@dev.mellanox.co.il Cc: Wolfram Sang wolf...@the-dreams.de Cc: Greg Kroah-Hartman gre...@linuxfoundation.org Cc: Davidlohr Bueso davidlohr.bu...@hp.com Cc: Rik van Riel r...@redhat.com Cc: Michel Lespinasse wal...@google.com Cc: linux1394-de...@lists.sourceforge.net Cc: linux-ker...@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linux-r...@vger.kernel.org Cc: linux-s...@vger.kernel.org Cc: cluster-de...@redhat.com Cc: linux-...@vger.kernel.org Signed-off-by: Kent Overstreet k...@daterainc.com --- drivers/firewire/core-cdev.c | 2 +- drivers/gpu/drm/drm_gem.c | 4 +- drivers/gpu/drm/vmwgfx/vmwgfx_resource.c | 2 +- drivers/infiniband/core/cm.c | 7 +--- drivers/infiniband/core/sa_query.c | 2 +- drivers/infiniband/core/uverbs_cmd.c | 2 +- drivers/infiniband/hw/cxgb3/iwch.h | 2 +- drivers/infiniband/hw/cxgb4/iw_cxgb4.h | 2 +- drivers/infiniband/hw/ehca/ehca_cq.c | 2 +- drivers/infiniband/hw/ehca/ehca_qp.c | 2 +- drivers/infiniband/hw/ipath/ipath_driver.c | 2 +- drivers/infiniband/hw/mlx4/cm.c| 2 +- drivers/infiniband/hw/qib/qib_init.c | 2 +- drivers/scsi/sg.c | 2 +- fs/dlm/lock.c | 2 +- fs/dlm/recover.c | 2 +- fs/nfs/nfs4client.c| 2 +- fs/notify/inotify/inotify_user.c | 2 +- include/linux/idr.h| 37 + ipc/util.c | 4 +- lib/idr.c | 66 ++ 21 files changed, 91 insertions(+), 59 deletions(-) diff --git a/drivers/firewire/core-cdev.c b/drivers/firewire/core-cdev.c index 436debf..3c70fbc 100644 --- a/drivers/firewire/core-cdev.c +++ b/drivers/firewire/core-cdev.c @@ -490,7 +490,7 @@ static int add_client_resource(struct client *client, int ret; if (preload) - idr_preload(gfp_mask); + idr_preload(client-resource_idr, 0, gfp_mask); spin_lock_irqsave(client-lock, flags); if (client-in_shutdown) diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c index 1c897b9..cf50894 100644 --- a/drivers/gpu/drm/drm_gem.c +++ b/drivers/gpu/drm/drm_gem.c @@ -273,7 +273,7 @@ drm_gem_handle_create(struct drm_file *file_priv, * Get the user-visible handle using idr. Preload and perform * allocation under our spinlock. */ - idr_preload(GFP_KERNEL); + idr_preload(file_priv-object_idr, 1, GFP_KERNEL); spin_lock
[PATCH 09/10] idr: Remove unneeded idr locking, idr_preload() usage
From: Kent Overstreet <koverstr...@google.com> Our new idr implementation does its own locking, instead of forcing it onto the callers like the old implementation. Many of the existing idr users need locking for more than just idr_alloc()/idr_remove()/idr_find() - they're taking refcounts and such under their locks and we can't touch those. But a significant number of users had locks that protected nothing more than the idr data structures itself - those we can get rid of. Note that we have to be careful when removing locks; in some places, locks appear to only be protecting idr_alloc()/idr_remove() calls but they're also used by other code that needs to ensure the idr isn't modified while it's doing something else - so ideally we want to delete the lock that protected the idr, or else we have to carefully audit all the other places it's used. There's also a fair number of places where things were being done under the idr lock unnecessarily; drivers/dca/dca-sysfs.c is a good example. dca->id is set to the id idr_alloc() returns under the lock - but there's no idr_find() calls under the lock, and dca->id isn't touched in the idr_remove() paths. So the lock can be safely deleted. The really nice thing about deleting this unnecessary locking is that it lets us trivially delete a lot of now unnecessary idr_preload() - with idr doing its own locking, we can pass GFP_KERNEL to idr_alloc() just fine. Signed-off-by: Kent Overstreet Cc: Andrew Morton Cc: Tejun Heo Cc: David Airlie Cc: Tom Tucker Cc: Steve Wise Cc: Roland Dreier Cc: Sean Hefty Cc: Hal Rosenstock Cc: Alasdair Kergon Cc: dm-devel at redhat.com Cc: Samuel Ortiz Cc: Alex Dubov Cc: Arnd Bergmann Cc: Greg Kroah-Hartman Cc: Chris Ball Cc: "James E.J. Bottomley" Cc: Willem Riede Cc: "Kai M?kisara" Cc: "Nicholas A. Bellinger" Cc: Li Zefan Cc: Vlad Yasevich Cc: Neil Horman Cc: "David S. Miller" Cc: Dave Airlie Cc: Alon Levy Cc: Guennadi Liakhovetski Cc: Christoph Hellwig Cc: linux-kernel at vger.kernel.org Cc: dri-devel at lists.freedesktop.org Cc: linux-rdma at vger.kernel.org Cc: linux-mmc at vger.kernel.org Cc: linux-scsi at vger.kernel.org Cc: osst-users at lists.sourceforge.net Cc: target-devel at vger.kernel.org Cc: containers at lists.linux-foundation.org Cc: cgroups at vger.kernel.org Cc: linux-sctp at vger.kernel.org Cc: netdev at vger.kernel.org Signed-off-by: Kent Overstreet --- drivers/dca/dca-sysfs.c | 18 +++--- drivers/gpu/drm/qxl/qxl_cmd.c | 4 +--- drivers/gpu/drm/qxl/qxl_drv.h | 1 - drivers/gpu/drm/qxl/qxl_kms.c | 1 - drivers/gpu/drm/qxl/qxl_release.c | 19 +-- drivers/infiniband/hw/amso1100/c2.h | 1 - drivers/infiniband/hw/amso1100/c2_qp.c| 20 ++-- drivers/md/dm.c | 22 -- drivers/memstick/core/memstick.c | 17 +++-- drivers/mfd/rtsx_pcr.c| 13 +++-- drivers/misc/c2port/core.c| 11 +-- drivers/misc/tifm_core.c | 15 +++ drivers/mmc/core/host.c | 13 ++--- drivers/scsi/ch.c | 14 ++ drivers/scsi/st.c | 13 + drivers/target/iscsi/iscsi_target.c | 17 - drivers/target/iscsi/iscsi_target.h | 1 - drivers/target/iscsi/iscsi_target_login.c | 12 ++-- include/linux/cgroup.h| 1 - include/net/sctp/sctp.h | 1 - kernel/cgroup.c | 9 + kernel/workqueue.c| 15 ++- net/9p/util.c | 15 +-- net/sctp/associola.c | 14 ++ net/sctp/protocol.c | 1 - net/sctp/socket.c | 2 -- 26 files changed, 42 insertions(+), 228 deletions(-) diff --git a/drivers/dca/dca-sysfs.c b/drivers/dca/dca-sysfs.c index effda66..6be5fbd 100644 --- a/drivers/dca/dca-sysfs.c +++ b/drivers/dca/dca-sysfs.c @@ -31,7 +31,6 @@ static struct class *dca_class; static struct idr dca_idr; -static spinlock_t dca_idr_lock; int dca_sysfs_add_req(struct dca_provider *dca, struct device *dev, int slot) { @@ -55,23 +54,15 @@ int dca_sysfs_add_provider(struct dca_provider *dca, struct device *dev) struct device *cd; int ret; - idr_preload(GFP_KERNEL); - spin_lock(_idr_lock); - - ret = idr_alloc(_idr, dca, GFP_NOWAIT); - if (ret >= 0) - dca->id = ret; - - spin_unlock(_idr_lock); - idr_preload_end(); + ret = idr_alloc(_idr, dca, GFP_KERNEL); if (ret < 0) return ret; + dca->id = ret; + cd = device_create(dca_class, dev, MKDEV(0, 0), NULL, "dca%d"
[PATCH 10/10] idr: Rework idr_preload()
On Wed, Jun 19, 2013 at 01:18:32AM -0700, Tejun Heo wrote: > Hello, Kent. > > On Tue, Jun 18, 2013 at 05:02:30PM -0700, Kent Overstreet wrote: > > With the new ida implementation, that doesn't work anymore - the new ida > > code grows its bitmap tree by reallocating the entire thing in power of > > two increments. Doh. > > So, I'm not sure about the single contiguous array implementation. It > introduces some nasty characteristics such as possibly too large > contiguous allocation, bursty CPU usages, and loss of the ability to > handle ID allocations clustered in high areas - sure, the old ida > wasn't great on that part either but this can be much worse. > > In addition, I'm not sure what this single contiguous allocation buys > us. Let's say each leaf node size is 4k. Even with in-array tree > implementation, it should be able to serve 2k IDs per-page, which > would be enough for most use cases and with a bit of caching it can > easily achieve both the performance benefits of array implementation > and the flexibility of allocating each leaf separately. Even without > caching, the tree depth would normally be much lower than the current > implementation and the performance should be better. If you're > bothered by having to implement another radix tree for it, it should > be possible to just implement the leaf node logic and use the existing > radix tree for internal nodes, right? With respect to performance, strongly disagree. Leaving pointers out of the interior nodes cuts our space requirements by a factor of _64_ - that's huge, and with data structures the dominating factors w.r.t. performance tend to be the amount of memory you touch and the amount of pointer chasing. The lack of pointer chasing also means I could add prefetching to the tree walking code (child nodes are always contiguous in memory) like I did in bcache's binary search trees - I didn't realize DLM was using this for so many ids so I'll probably add that. That means for quite large bitmaps, _all_ the interior nodes will fit onto a couple cachelines which are all contiguous in memory. That's _huge_. Even for 1 million ids - that's 128 kilobytes for all the leaf nodes, which means all the interior nodes take up 2 kilobytes - which again is contiguous so we can prefetch far in advance as we walk down the tree. (If I was optimizing for huge bitmaps I would've picked a bigger splay factor and the interior nodes would take up even less memory, but I've been assuming the common case for the bitmap size is less than a page in which case BITS_PER_LONG should be about right). Also, idr_find() was slower than radix_tree_lookup() before, as tested for some aio patches - decoupling the id allocation from the radix tree gives the id allocator more freedom in its data structures (couldn't realloc before without breaking RCU lookup). I'm highly skeptical the bursty CPU usage is going to be a real issue in practice, as that can happen anywhere we do allocation - the realloc is going to be trivial compared to what can happen then. What's important is just the amortized cpu overhead, and in that respect doing the realloc is definitely a performance win. Besides all that, the ida/idr reimplementations deleted > 300 lines of code from idr.[ch]. If you try to add caching to the existing ida code, it's only going to get more complicated - and trying to maintain the behaviour where we always allocate the smallest available id is going to be fun there (the cache has to be kept in sorted order every time you free an id). Sparse id allocations/allocations clustered in the high areas isn't a concern - part of the reason I separated out ida_alloc() from ida_alloc_range() was to make sure none of the existing code was doing anything dumb with the starting id. The only thing that would've been a problem there was idr_alloc_cyclic() (and the code that was open coding ida_alloc_cyclic() as a performance hack), but the new ida_alloc_cyclic() handles that - handles it better than the old version, too. The only real concern I've come across is the fact that this implementation currently can't allocate up to INT_MAX ids with the whole allocation being contiguous - for all the uses I'd looked at I didn't think this was going to be an issue, but turns out it probably will be for DLM. So I've got a bit more work to do there. (I'm still not going to go with anything resembling a radix tree though - instead of having a flat array, I'll have a an array of pointers to fixed size array segments once the entire tree exceeds a certain size). > > So we need a slightly different trick. Note that if all allocations from > > an idr start by calling idr_prealloc() and disabling premption, at > > most nr_cpus() allocations can happen before someone calls > > idr_prealloc() again. > > But we allow mixing preloaded and normal allocations and users
Re: [PATCH 10/10] idr: Rework idr_preload()
On Wed, Jun 19, 2013 at 01:18:32AM -0700, Tejun Heo wrote: Hello, Kent. On Tue, Jun 18, 2013 at 05:02:30PM -0700, Kent Overstreet wrote: With the new ida implementation, that doesn't work anymore - the new ida code grows its bitmap tree by reallocating the entire thing in power of two increments. Doh. So, I'm not sure about the single contiguous array implementation. It introduces some nasty characteristics such as possibly too large contiguous allocation, bursty CPU usages, and loss of the ability to handle ID allocations clustered in high areas - sure, the old ida wasn't great on that part either but this can be much worse. In addition, I'm not sure what this single contiguous allocation buys us. Let's say each leaf node size is 4k. Even with in-array tree implementation, it should be able to serve 2k IDs per-page, which would be enough for most use cases and with a bit of caching it can easily achieve both the performance benefits of array implementation and the flexibility of allocating each leaf separately. Even without caching, the tree depth would normally be much lower than the current implementation and the performance should be better. If you're bothered by having to implement another radix tree for it, it should be possible to just implement the leaf node logic and use the existing radix tree for internal nodes, right? With respect to performance, strongly disagree. Leaving pointers out of the interior nodes cuts our space requirements by a factor of _64_ - that's huge, and with data structures the dominating factors w.r.t. performance tend to be the amount of memory you touch and the amount of pointer chasing. The lack of pointer chasing also means I could add prefetching to the tree walking code (child nodes are always contiguous in memory) like I did in bcache's binary search trees - I didn't realize DLM was using this for so many ids so I'll probably add that. That means for quite large bitmaps, _all_ the interior nodes will fit onto a couple cachelines which are all contiguous in memory. That's _huge_. Even for 1 million ids - that's 128 kilobytes for all the leaf nodes, which means all the interior nodes take up 2 kilobytes - which again is contiguous so we can prefetch far in advance as we walk down the tree. (If I was optimizing for huge bitmaps I would've picked a bigger splay factor and the interior nodes would take up even less memory, but I've been assuming the common case for the bitmap size is less than a page in which case BITS_PER_LONG should be about right). Also, idr_find() was slower than radix_tree_lookup() before, as tested for some aio patches - decoupling the id allocation from the radix tree gives the id allocator more freedom in its data structures (couldn't realloc before without breaking RCU lookup). I'm highly skeptical the bursty CPU usage is going to be a real issue in practice, as that can happen anywhere we do allocation - the realloc is going to be trivial compared to what can happen then. What's important is just the amortized cpu overhead, and in that respect doing the realloc is definitely a performance win. Besides all that, the ida/idr reimplementations deleted 300 lines of code from idr.[ch]. If you try to add caching to the existing ida code, it's only going to get more complicated - and trying to maintain the behaviour where we always allocate the smallest available id is going to be fun there (the cache has to be kept in sorted order every time you free an id). Sparse id allocations/allocations clustered in the high areas isn't a concern - part of the reason I separated out ida_alloc() from ida_alloc_range() was to make sure none of the existing code was doing anything dumb with the starting id. The only thing that would've been a problem there was idr_alloc_cyclic() (and the code that was open coding ida_alloc_cyclic() as a performance hack), but the new ida_alloc_cyclic() handles that - handles it better than the old version, too. The only real concern I've come across is the fact that this implementation currently can't allocate up to INT_MAX ids with the whole allocation being contiguous - for all the uses I'd looked at I didn't think this was going to be an issue, but turns out it probably will be for DLM. So I've got a bit more work to do there. (I'm still not going to go with anything resembling a radix tree though - instead of having a flat array, I'll have a an array of pointers to fixed size array segments once the entire tree exceeds a certain size). So we need a slightly different trick. Note that if all allocations from an idr start by calling idr_prealloc() and disabling premption, at most nr_cpus() allocations can happen before someone calls idr_prealloc() again. But we allow mixing preloaded and normal allocations and users are allowed to allocate as many IDs they want after preloading. It should guarantee that the first allocation after preloading follows the stronger
[PATCH 10/10] idr: Rework idr_preload()
The old idr_preload() used percpu buffers - since the bitmap/radix/whatever tree only grew by fixed sized nodes, it only had to ensure there was a node available in the percpu buffer and disable preemption. This conveniently meant that you didn't have to pass the idr you were going to allocate from to it. With the new ida implementation, that doesn't work anymore - the new ida code grows its bitmap tree by reallocating the entire thing in power of two increments. Doh. So we need a slightly different trick. Note that if all allocations from an idr start by calling idr_prealloc() and disabling premption, at most nr_cpus() allocations can happen before someone calls idr_prealloc() again. So, we just change idr_prealloc() to resize the ida bitmap tree if there's less than num_possible_cpus() ids available - conveniently, we already track the number of allocated ids, and the total number of ids we can have allocated is just nr_leaf_nodes * BITS_PER_LONG. Easy. This does require a fairly trivial interface change - we now have to pass the idr we're going to allocate from (and the starting id we're going to pass to idr_allocate_range()) to idr_prealloc(), so this patch updates all the callers. Signed-off-by: Kent Overstreet Cc: Andrew Morton Cc: Tejun Heo Cc: Stefan Richter Cc: David Airlie Cc: Roland Dreier Cc: Sean Hefty Cc: Hal Rosenstock Cc: Steve Wise Cc: Hoang-Nam Nguyen Cc: Christoph Raisch Cc: Mike Marciniszyn Cc: Doug Gilbert Cc: "James E.J. Bottomley" Cc: Christine Caulfield Cc: David Teigland Cc: Trond Myklebust Cc: John McCutchan Cc: Robert Love Cc: Eric Paris Cc: Dave Airlie Cc: Thomas Hellstrom Cc: Brian Paul Cc: Maarten Lankhorst Cc: Dmitry Torokhov Cc: Erez Shitrit Cc: Al Viro Cc: Haggai Eran Cc: Jack Morgenstein Cc: Wolfram Sang Cc: Greg Kroah-Hartman Cc: Davidlohr Bueso Cc: Rik van Riel Cc: Michel Lespinasse Cc: linux1394-devel at lists.sourceforge.net Cc: linux-kernel at vger.kernel.org Cc: dri-devel at lists.freedesktop.org Cc: linux-rdma at vger.kernel.org Cc: linux-scsi at vger.kernel.org Cc: cluster-devel at redhat.com Cc: linux-nfs at vger.kernel.org --- drivers/firewire/core-cdev.c | 2 +- drivers/gpu/drm/drm_gem.c | 4 +- drivers/gpu/drm/vmwgfx/vmwgfx_resource.c | 2 +- drivers/infiniband/core/cm.c | 7 +--- drivers/infiniband/core/sa_query.c | 2 +- drivers/infiniband/core/uverbs_cmd.c | 2 +- drivers/infiniband/hw/cxgb3/iwch.h | 2 +- drivers/infiniband/hw/cxgb4/iw_cxgb4.h | 2 +- drivers/infiniband/hw/ehca/ehca_cq.c | 2 +- drivers/infiniband/hw/ehca/ehca_qp.c | 2 +- drivers/infiniband/hw/ipath/ipath_driver.c | 2 +- drivers/infiniband/hw/mlx4/cm.c| 2 +- drivers/infiniband/hw/qib/qib_init.c | 2 +- drivers/scsi/sg.c | 2 +- fs/dlm/lock.c | 2 +- fs/dlm/recover.c | 2 +- fs/nfs/nfs4client.c| 2 +- fs/notify/inotify/inotify_user.c | 2 +- include/linux/idr.h| 37 + ipc/util.c | 4 +- lib/idr.c | 66 ++ 21 files changed, 91 insertions(+), 59 deletions(-) diff --git a/drivers/firewire/core-cdev.c b/drivers/firewire/core-cdev.c index 436debf..3c70fbc 100644 --- a/drivers/firewire/core-cdev.c +++ b/drivers/firewire/core-cdev.c @@ -490,7 +490,7 @@ static int add_client_resource(struct client *client, int ret; if (preload) - idr_preload(gfp_mask); + idr_preload(>resource_idr, 0, gfp_mask); spin_lock_irqsave(>lock, flags); if (client->in_shutdown) diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c index 1c897b9..cf50894 100644 --- a/drivers/gpu/drm/drm_gem.c +++ b/drivers/gpu/drm/drm_gem.c @@ -273,7 +273,7 @@ drm_gem_handle_create(struct drm_file *file_priv, * Get the user-visible handle using idr. Preload and perform * allocation under our spinlock. */ - idr_preload(GFP_KERNEL); + idr_preload(_priv->object_idr, 1, GFP_KERNEL); spin_lock(_priv->table_lock); ret = idr_alloc_range(_priv->object_idr, obj, 1, 0, GFP_NOWAIT); @@ -449,7 +449,7 @@ drm_gem_flink_ioctl(struct drm_device *dev, void *data, if (obj == NULL) return -ENOENT; - idr_preload(GFP_KERNEL); + idr_preload(>object_name_idr, 1, GFP_KERNEL); spin_lock(>object_name_lock); if (!obj->name) { ret = idr_alloc_range(>object_name_idr, obj, 1, 0, GFP_NOWAIT); diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c index ccbaed1..9f00706 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c @@ -177,7 +1
[PATCH 09/10] idr: Remove unneeded idr locking, idr_preload() usage
Our new idr implementation does its own locking, instead of forcing it onto the callers like the old implementation. Many of the existing idr users need locking for more than just idr_alloc()/idr_remove()/idr_find() - they're taking refcounts and such under their locks and we can't touch those. But a significant number of users had locks that protected nothing more than the idr data structures itself - those we can get rid of. Note that we have to be careful when removing locks; in some places, locks appear to only be protecting idr_alloc()/idr_remove() calls but they're also used by other code that needs to ensure the idr isn't modified while it's doing something else - so ideally we want to delete the lock that protected the idr, or else we have to carefully audit all the other places it's used. There's also a fair number of places where things were being done under the idr lock unnecessarily; drivers/dca/dca-sysfs.c is a good example. dca->id is set to the id idr_alloc() returns under the lock - but there's no idr_find() calls under the lock, and dca->id isn't touched in the idr_remove() paths. So the lock can be safely deleted. The really nice thing about deleting this unnecessary locking is that it lets us trivially delete a lot of now unnecessary idr_preload() - with idr doing its own locking, we can pass GFP_KERNEL to idr_alloc() just fine. Signed-off-by: Kent Overstreet Cc: Andrew Morton Cc: Tejun Heo Cc: David Airlie Cc: Tom Tucker Cc: Steve Wise Cc: Roland Dreier Cc: Sean Hefty Cc: Hal Rosenstock Cc: Alasdair Kergon Cc: dm-devel at redhat.com Cc: Samuel Ortiz Cc: Alex Dubov Cc: Arnd Bergmann Cc: Greg Kroah-Hartman Cc: Chris Ball Cc: "James E.J. Bottomley" Cc: Willem Riede Cc: "Kai M?kisara" Cc: "Nicholas A. Bellinger" Cc: Li Zefan Cc: Vlad Yasevich Cc: Neil Horman Cc: "David S. Miller" Cc: Dave Airlie Cc: Alon Levy Cc: Guennadi Liakhovetski Cc: Christoph Hellwig Cc: linux-kernel at vger.kernel.org Cc: dri-devel at lists.freedesktop.org Cc: linux-rdma at vger.kernel.org Cc: linux-mmc at vger.kernel.org Cc: linux-scsi at vger.kernel.org Cc: osst-users at lists.sourceforge.net Cc: target-devel at vger.kernel.org Cc: containers at lists.linux-foundation.org Cc: cgroups at vger.kernel.org Cc: linux-sctp at vger.kernel.org Cc: netdev at vger.kernel.org --- drivers/dca/dca-sysfs.c | 18 +++--- drivers/gpu/drm/qxl/qxl_cmd.c | 4 +--- drivers/gpu/drm/qxl/qxl_drv.h | 1 - drivers/gpu/drm/qxl/qxl_kms.c | 1 - drivers/gpu/drm/qxl/qxl_release.c | 19 +-- drivers/infiniband/hw/amso1100/c2.h | 1 - drivers/infiniband/hw/amso1100/c2_qp.c| 20 ++-- drivers/md/dm.c | 22 -- drivers/memstick/core/memstick.c | 17 +++-- drivers/mfd/rtsx_pcr.c| 13 +++-- drivers/misc/c2port/core.c| 11 +-- drivers/misc/tifm_core.c | 15 +++ drivers/mmc/core/host.c | 13 ++--- drivers/scsi/ch.c | 14 ++ drivers/scsi/st.c | 13 + drivers/target/iscsi/iscsi_target.c | 17 - drivers/target/iscsi/iscsi_target.h | 1 - drivers/target/iscsi/iscsi_target_login.c | 12 ++-- include/linux/cgroup.h| 1 - include/net/sctp/sctp.h | 1 - kernel/cgroup.c | 9 + kernel/workqueue.c| 15 ++- net/9p/util.c | 15 +-- net/sctp/associola.c | 14 ++ net/sctp/protocol.c | 1 - net/sctp/socket.c | 2 -- 26 files changed, 42 insertions(+), 228 deletions(-) diff --git a/drivers/dca/dca-sysfs.c b/drivers/dca/dca-sysfs.c index effda66..6be5fbd 100644 --- a/drivers/dca/dca-sysfs.c +++ b/drivers/dca/dca-sysfs.c @@ -31,7 +31,6 @@ static struct class *dca_class; static struct idr dca_idr; -static spinlock_t dca_idr_lock; int dca_sysfs_add_req(struct dca_provider *dca, struct device *dev, int slot) { @@ -55,23 +54,15 @@ int dca_sysfs_add_provider(struct dca_provider *dca, struct device *dev) struct device *cd; int ret; - idr_preload(GFP_KERNEL); - spin_lock(_idr_lock); - - ret = idr_alloc(_idr, dca, GFP_NOWAIT); - if (ret >= 0) - dca->id = ret; - - spin_unlock(_idr_lock); - idr_preload_end(); + ret = idr_alloc(_idr, dca, GFP_KERNEL); if (ret < 0) return ret; + dca->id = ret; + cd = device_create(dca_class, dev, MKDEV(0, 0), NULL, "dca%d", dca->id); if (IS_ERR(cd)) { - spin_lock(_idr_lock);
[PATCH 09/10] idr: Remove unneeded idr locking, idr_preload() usage
Our new idr implementation does its own locking, instead of forcing it onto the callers like the old implementation. Many of the existing idr users need locking for more than just idr_alloc()/idr_remove()/idr_find() - they're taking refcounts and such under their locks and we can't touch those. But a significant number of users had locks that protected nothing more than the idr data structures itself - those we can get rid of. Note that we have to be careful when removing locks; in some places, locks appear to only be protecting idr_alloc()/idr_remove() calls but they're also used by other code that needs to ensure the idr isn't modified while it's doing something else - so ideally we want to delete the lock that protected the idr, or else we have to carefully audit all the other places it's used. There's also a fair number of places where things were being done under the idr lock unnecessarily; drivers/dca/dca-sysfs.c is a good example. dca-id is set to the id idr_alloc() returns under the lock - but there's no idr_find() calls under the lock, and dca-id isn't touched in the idr_remove() paths. So the lock can be safely deleted. The really nice thing about deleting this unnecessary locking is that it lets us trivially delete a lot of now unnecessary idr_preload() - with idr doing its own locking, we can pass GFP_KERNEL to idr_alloc() just fine. Signed-off-by: Kent Overstreet koverstr...@google.com Cc: Andrew Morton a...@linux-foundation.org Cc: Tejun Heo t...@kernel.org Cc: David Airlie airl...@linux.ie Cc: Tom Tucker t...@opengridcomputing.com Cc: Steve Wise sw...@opengridcomputing.com Cc: Roland Dreier rol...@kernel.org Cc: Sean Hefty sean.he...@intel.com Cc: Hal Rosenstock hal.rosenst...@gmail.com Cc: Alasdair Kergon a...@redhat.com Cc: dm-de...@redhat.com Cc: Samuel Ortiz sa...@linux.intel.com Cc: Alex Dubov oa...@yahoo.com Cc: Arnd Bergmann a...@arndb.de Cc: Greg Kroah-Hartman gre...@linuxfoundation.org Cc: Chris Ball c...@laptop.org Cc: James E.J. Bottomley jbottom...@parallels.com Cc: Willem Riede o...@riede.org Cc: Kai Mäkisara kai.makis...@kolumbus.fi Cc: Nicholas A. Bellinger n...@linux-iscsi.org Cc: Li Zefan lize...@huawei.com Cc: Vlad Yasevich vyasev...@gmail.com Cc: Neil Horman nhor...@tuxdriver.com Cc: David S. Miller da...@davemloft.net Cc: Dave Airlie airl...@redhat.com Cc: Alon Levy al...@redhat.com Cc: Guennadi Liakhovetski g.liakhovet...@gmx.de Cc: Christoph Hellwig h...@lst.de Cc: linux-ker...@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linux-r...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-s...@vger.kernel.org Cc: osst-us...@lists.sourceforge.net Cc: target-de...@vger.kernel.org Cc: contain...@lists.linux-foundation.org Cc: cgro...@vger.kernel.org Cc: linux-s...@vger.kernel.org Cc: net...@vger.kernel.org --- drivers/dca/dca-sysfs.c | 18 +++--- drivers/gpu/drm/qxl/qxl_cmd.c | 4 +--- drivers/gpu/drm/qxl/qxl_drv.h | 1 - drivers/gpu/drm/qxl/qxl_kms.c | 1 - drivers/gpu/drm/qxl/qxl_release.c | 19 +-- drivers/infiniband/hw/amso1100/c2.h | 1 - drivers/infiniband/hw/amso1100/c2_qp.c| 20 ++-- drivers/md/dm.c | 22 -- drivers/memstick/core/memstick.c | 17 +++-- drivers/mfd/rtsx_pcr.c| 13 +++-- drivers/misc/c2port/core.c| 11 +-- drivers/misc/tifm_core.c | 15 +++ drivers/mmc/core/host.c | 13 ++--- drivers/scsi/ch.c | 14 ++ drivers/scsi/st.c | 13 + drivers/target/iscsi/iscsi_target.c | 17 - drivers/target/iscsi/iscsi_target.h | 1 - drivers/target/iscsi/iscsi_target_login.c | 12 ++-- include/linux/cgroup.h| 1 - include/net/sctp/sctp.h | 1 - kernel/cgroup.c | 9 + kernel/workqueue.c| 15 ++- net/9p/util.c | 15 +-- net/sctp/associola.c | 14 ++ net/sctp/protocol.c | 1 - net/sctp/socket.c | 2 -- 26 files changed, 42 insertions(+), 228 deletions(-) diff --git a/drivers/dca/dca-sysfs.c b/drivers/dca/dca-sysfs.c index effda66..6be5fbd 100644 --- a/drivers/dca/dca-sysfs.c +++ b/drivers/dca/dca-sysfs.c @@ -31,7 +31,6 @@ static struct class *dca_class; static struct idr dca_idr; -static spinlock_t dca_idr_lock; int dca_sysfs_add_req(struct dca_provider *dca, struct device *dev, int slot) { @@ -55,23 +54,15 @@ int dca_sysfs_add_provider(struct dca_provider *dca, struct device *dev) struct device *cd; int ret; - idr_preload(GFP_KERNEL); - spin_lock(dca_idr_lock); - - ret = idr_alloc
[PATCH 10/10] idr: Rework idr_preload()
The old idr_preload() used percpu buffers - since the bitmap/radix/whatever tree only grew by fixed sized nodes, it only had to ensure there was a node available in the percpu buffer and disable preemption. This conveniently meant that you didn't have to pass the idr you were going to allocate from to it. With the new ida implementation, that doesn't work anymore - the new ida code grows its bitmap tree by reallocating the entire thing in power of two increments. Doh. So we need a slightly different trick. Note that if all allocations from an idr start by calling idr_prealloc() and disabling premption, at most nr_cpus() allocations can happen before someone calls idr_prealloc() again. So, we just change idr_prealloc() to resize the ida bitmap tree if there's less than num_possible_cpus() ids available - conveniently, we already track the number of allocated ids, and the total number of ids we can have allocated is just nr_leaf_nodes * BITS_PER_LONG. Easy. This does require a fairly trivial interface change - we now have to pass the idr we're going to allocate from (and the starting id we're going to pass to idr_allocate_range()) to idr_prealloc(), so this patch updates all the callers. Signed-off-by: Kent Overstreet koverstr...@google.com Cc: Andrew Morton a...@linux-foundation.org Cc: Tejun Heo t...@kernel.org Cc: Stefan Richter stef...@s5r6.in-berlin.de Cc: David Airlie airl...@linux.ie Cc: Roland Dreier rol...@kernel.org Cc: Sean Hefty sean.he...@intel.com Cc: Hal Rosenstock hal.rosenst...@gmail.com Cc: Steve Wise sw...@chelsio.com Cc: Hoang-Nam Nguyen hngu...@de.ibm.com Cc: Christoph Raisch rai...@de.ibm.com Cc: Mike Marciniszyn infinip...@intel.com Cc: Doug Gilbert dgilb...@interlog.com Cc: James E.J. Bottomley jbottom...@parallels.com Cc: Christine Caulfield ccaul...@redhat.com Cc: David Teigland teigl...@redhat.com Cc: Trond Myklebust trond.mykleb...@netapp.com Cc: John McCutchan j...@johnmccutchan.com Cc: Robert Love rl...@rlove.org Cc: Eric Paris epa...@parisplace.org Cc: Dave Airlie airl...@redhat.com Cc: Thomas Hellstrom thellst...@vmware.com Cc: Brian Paul bri...@vmware.com Cc: Maarten Lankhorst maarten.lankho...@canonical.com Cc: Dmitry Torokhov d...@vmware.com Cc: Erez Shitrit ere...@mellanox.co.il Cc: Al Viro v...@zeniv.linux.org.uk Cc: Haggai Eran hagg...@mellanox.com Cc: Jack Morgenstein ja...@dev.mellanox.co.il Cc: Wolfram Sang wolf...@the-dreams.de Cc: Greg Kroah-Hartman gre...@linuxfoundation.org Cc: Davidlohr Bueso davidlohr.bu...@hp.com Cc: Rik van Riel r...@redhat.com Cc: Michel Lespinasse wal...@google.com Cc: linux1394-de...@lists.sourceforge.net Cc: linux-ker...@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linux-r...@vger.kernel.org Cc: linux-s...@vger.kernel.org Cc: cluster-de...@redhat.com Cc: linux-...@vger.kernel.org --- drivers/firewire/core-cdev.c | 2 +- drivers/gpu/drm/drm_gem.c | 4 +- drivers/gpu/drm/vmwgfx/vmwgfx_resource.c | 2 +- drivers/infiniband/core/cm.c | 7 +--- drivers/infiniband/core/sa_query.c | 2 +- drivers/infiniband/core/uverbs_cmd.c | 2 +- drivers/infiniband/hw/cxgb3/iwch.h | 2 +- drivers/infiniband/hw/cxgb4/iw_cxgb4.h | 2 +- drivers/infiniband/hw/ehca/ehca_cq.c | 2 +- drivers/infiniband/hw/ehca/ehca_qp.c | 2 +- drivers/infiniband/hw/ipath/ipath_driver.c | 2 +- drivers/infiniband/hw/mlx4/cm.c| 2 +- drivers/infiniband/hw/qib/qib_init.c | 2 +- drivers/scsi/sg.c | 2 +- fs/dlm/lock.c | 2 +- fs/dlm/recover.c | 2 +- fs/nfs/nfs4client.c| 2 +- fs/notify/inotify/inotify_user.c | 2 +- include/linux/idr.h| 37 + ipc/util.c | 4 +- lib/idr.c | 66 ++ 21 files changed, 91 insertions(+), 59 deletions(-) diff --git a/drivers/firewire/core-cdev.c b/drivers/firewire/core-cdev.c index 436debf..3c70fbc 100644 --- a/drivers/firewire/core-cdev.c +++ b/drivers/firewire/core-cdev.c @@ -490,7 +490,7 @@ static int add_client_resource(struct client *client, int ret; if (preload) - idr_preload(gfp_mask); + idr_preload(client-resource_idr, 0, gfp_mask); spin_lock_irqsave(client-lock, flags); if (client-in_shutdown) diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c index 1c897b9..cf50894 100644 --- a/drivers/gpu/drm/drm_gem.c +++ b/drivers/gpu/drm/drm_gem.c @@ -273,7 +273,7 @@ drm_gem_handle_create(struct drm_file *file_priv, * Get the user-visible handle using idr. Preload and perform * allocation under our spinlock. */ - idr_preload(GFP_KERNEL); + idr_preload(file_priv-object_idr, 1, GFP_KERNEL); spin_lock(file_priv-table_lock); ret = idr_alloc_range