Re: [PATCH] Percpu tag allocator

2013-06-18 Thread Kent Overstreet
On Thu, Jun 13, 2013 at 12:23:39PM -0700, Andrew Morton wrote: > On Thu, 13 Jun 2013 12:15:07 -0700 Tejun Heo wrote: > > Oh, I'm sure the current id[r|a] can be improved upon a lot but I'm > > very skeptical one can reach the level of scalability necessary for, > > say, pci-e attached extremely hi

Re: [PATCH] Percpu tag allocator

2013-06-13 Thread Tejun Heo
On Thu, Jun 13, 2013 at 04:13:42PM -0700, Tejun Heo wrote: > On Thu, Jun 13, 2013 at 03:35:51PM -0700, Andrew Morton wrote: > > Well that's news. My awe at the dreadfulness of Kent's changelog > > continues to grow. Please identify these duplicated allocators > > (function names will suffice). T

Re: [PATCH] Percpu tag allocator

2013-06-13 Thread Tejun Heo
Hello, On Thu, Jun 13, 2013 at 03:35:51PM -0700, Andrew Morton wrote: > Well that's news. My awe at the dreadfulness of Kent's changelog > continues to grow. Please identify these duplicated allocators > (function names will suffice). That way, others can actually review > this code :( Oh yeah

Re: [PATCH] Percpu tag allocator

2013-06-13 Thread Andrew Morton
On Thu, 13 Jun 2013 15:30:28 -0700 Tejun Heo wrote: > Hello, Andrew. > > On Thu, Jun 13, 2013 at 03:10:40PM -0700, Andrew Morton wrote: > > Look, we're falling into the age-old trap of trying to justify existing > > code just because it exists. > > > > Stop. Take a step back and pretend that t

Re: [PATCH] Percpu tag allocator

2013-06-13 Thread Tejun Heo
Hello, Andrew. On Thu, Jun 13, 2013 at 03:10:40PM -0700, Andrew Morton wrote: > Look, we're falling into the age-old trap of trying to justify existing > code just because it exists. > > Stop. Take a step back and pretend that the percpu tag allocator patch > never existed. Now, define the prob

Re: [PATCH] Percpu tag allocator

2013-06-13 Thread Andrew Morton
On Thu, 13 Jun 2013 12:35:14 -0700 Tejun Heo wrote: > > > Maybe we can layer things so that we have percpu layer on top of > > > id[r|a] and, say, mapping id to point is still done by idr, or the > > > percpu tag allocator uses ida for tag chunk allocations, but it's > > > still gonna be somethin

Re: [PATCH] Percpu tag allocator

2013-06-13 Thread Kent Overstreet
On Thu, Jun 13, 2013 at 02:50:09PM -0700, Tejun Heo wrote: > (cc'ing Rafael and Oleg) > > On Thu, Jun 13, 2013 at 02:14:25PM -0700, Kent Overstreet wrote: > > Yeah, I think you're definitely right. (I only started reading up on the > > freezer stuff yesterday, though). > > > > Do you know offhand

Re: [PATCH] Percpu tag allocator

2013-06-13 Thread Kent Overstreet
On Thu, Jun 13, 2013 at 11:53:18AM -0700, Tejun Heo wrote: > Hello, Andrew, Kent. > > (cc'ing NFS folks for id[r|a] discussion) > > On Wed, Jun 12, 2013 at 08:03:11PM -0700, Andrew Morton wrote: > > They all sound like pretty crappy reasons ;) If the idr/ida interface > > is nasty then it can be

Re: [PATCH] Percpu tag allocator

2013-06-13 Thread Tejun Heo
(cc'ing Rafael and Oleg) On Thu, Jun 13, 2013 at 02:14:25PM -0700, Kent Overstreet wrote: > Yeah, I think you're definitely right. (I only started reading up on the > freezer stuff yesterday, though). > > Do you know offhand what existing (i.e. slab) allocators do? Whatever > they do should make

Re: [PATCH] Percpu tag allocator

2013-06-13 Thread Kent Overstreet
On Thu, Jun 13, 2013 at 12:21:52PM -0700, Tejun Heo wrote: > Hello, > > On Thu, Jun 13, 2013 at 12:13:24PM -0700, Andrew Morton wrote: > > As I understand it, if a task is stuck in this loop at freeze time, the > > whole freeze attempt will fail. But it's been a long time since I > > thought abou

Re: [PATCH] Percpu tag allocator

2013-06-13 Thread Tejun Heo
Hello, On Thu, Jun 13, 2013 at 12:23:39PM -0700, Andrew Morton wrote: > > The lowest number guarantee makes them different. Maybe tag > > allocation can be layered on top as a caching layer, I don't know, but > > at any rate we need at least two different operation modes. > > Why? Tag allocatio

Re: [PATCH] Percpu tag allocator

2013-06-13 Thread Andrew Morton
On Thu, 13 Jun 2013 12:15:07 -0700 Tejun Heo wrote: > Hello, Andrew. > > On Thu, Jun 13, 2013 at 12:04:39PM -0700, Andrew Morton wrote: > > > The thing is that id[r|a] guarantee that the lowest available slot is > > > allocated > > > > That isn't the case for ida_get_new_above() - the caller ge

Re: [PATCH] Percpu tag allocator

2013-06-13 Thread Tejun Heo
Hello, On Thu, Jun 13, 2013 at 12:13:24PM -0700, Andrew Morton wrote: > As I understand it, if a task is stuck in this loop at freeze time, the > whole freeze attempt will fail. But it's been a long time since I > thought about or worked on this stuff. It depends on who's calling the allocator b

Re: [PATCH] Percpu tag allocator

2013-06-13 Thread Tejun Heo
Hello, Andrew. On Thu, Jun 13, 2013 at 12:04:39PM -0700, Andrew Morton wrote: > > The thing is that id[r|a] guarantee that the lowest available slot is > > allocated > > That isn't the case for ida_get_new_above() - the caller gets to > control the starting index. Hmmm? get_new_above() is the s

Re: [PATCH] Percpu tag allocator

2013-06-13 Thread J. Bruce Fields
On Thu, Jun 13, 2013 at 11:53:18AM -0700, Tejun Heo wrote: > The thing is that id[r|a] guarantee that the lowest available slot is > allocated and this is important because it's used to name things which > are visible to userland - things like block device minor number, > device indicies and so on.

Re: [PATCH] Percpu tag allocator

2013-06-13 Thread Andrew Morton
On Thu, 13 Jun 2013 12:06:10 -0700 Tejun Heo wrote: > Hello, Andrew, Kent. > > On Wed, Jun 12, 2013 at 04:38:54PM -0700, Andrew Morton wrote: > ... > > > +unsigned percpu_tag_alloc(struct percpu_tag_pool *pool, gfp_t gfp) > > > +{ > > > + DEFINE_WAIT(wait); > > > + struct percpu_tag_cpu_freelist

Re: [PATCH] Percpu tag allocator

2013-06-13 Thread Jeff Layton
On Thu, 13 Jun 2013 11:53:18 -0700 Tejun Heo wrote: > Hello, Andrew, Kent. > > (cc'ing NFS folks for id[r|a] discussion) > > On Wed, Jun 12, 2013 at 08:03:11PM -0700, Andrew Morton wrote: > > They all sound like pretty crappy reasons ;) If the idr/ida interface > > is nasty then it can be wrapp

Re: [PATCH] Percpu tag allocator

2013-06-13 Thread Tejun Heo
Hello, Andrew, Kent. On Wed, Jun 12, 2013 at 04:38:54PM -0700, Andrew Morton wrote: ... > > +unsigned percpu_tag_alloc(struct percpu_tag_pool *pool, gfp_t gfp) > > +{ > > + DEFINE_WAIT(wait); > > + struct percpu_tag_cpu_freelist *tags; > > + unsigned long flags; > > + unsigned tag, this_cp

Re: [PATCH] Percpu tag allocator

2013-06-13 Thread Andrew Morton
On Thu, 13 Jun 2013 11:53:18 -0700 Tejun Heo wrote: > Hello, Andrew, Kent. > > (cc'ing NFS folks for id[r|a] discussion) > > On Wed, Jun 12, 2013 at 08:03:11PM -0700, Andrew Morton wrote: > > They all sound like pretty crappy reasons ;) If the idr/ida interface > > is nasty then it can be wrapp

Re: [PATCH] Percpu tag allocator

2013-06-13 Thread Tejun Heo
Hello, Andrew, Kent. (cc'ing NFS folks for id[r|a] discussion) On Wed, Jun 12, 2013 at 08:03:11PM -0700, Andrew Morton wrote: > They all sound like pretty crappy reasons ;) If the idr/ida interface > is nasty then it can be wrapped to provide the same interface as the > percpu tag allocator. > >

Re: [PATCH] Percpu tag allocator

2013-06-12 Thread Andrew Morton
On Wed, 12 Jun 2013 20:54:32 -0700 Kent Overstreet wrote: > On Wed, Jun 12, 2013 at 08:03:11PM -0700, Andrew Morton wrote: > > On Wed, 12 Jun 2013 19:05:36 -0700 Kent Overstreet > > wrote: > > > > > ... > > > > > > > Why can't we use ida_get_new_above()? > > > > > > > >If it is "speed" t

Re: [PATCH] Percpu tag allocator

2013-06-12 Thread Kent Overstreet
On Wed, Jun 12, 2013 at 08:03:11PM -0700, Andrew Morton wrote: > On Wed, 12 Jun 2013 19:05:36 -0700 Kent Overstreet > wrote: > > > ... > > > > > Why can't we use ida_get_new_above()? > > > > > >If it is "speed" then how bad is the problem and what efforts have > > >been made to address

Re: [PATCH] Percpu tag allocator

2013-06-12 Thread Andrew Morton
On Wed, 12 Jun 2013 19:05:36 -0700 Kent Overstreet wrote: > ... > > > Why can't we use ida_get_new_above()? > > > >If it is "speed" then how bad is the problem and what efforts have > >been made to address them within the idr code? (A per-cpu magazine > >frontend to ida_get_new_abo

Re: [PATCH] Percpu tag allocator

2013-06-12 Thread Kent Overstreet
On Wed, Jun 12, 2013 at 04:38:54PM -0700, Andrew Morton wrote: > On Tue, 11 Jun 2013 21:03:24 -0700 Kent Overstreet > wrote: > > > Allocates integers out of a predefined range > > 0..n, I assume. Not n..m. Yes (though n..m would be trivial if anyone wanted it)... and percpu_tag_pool_init() i

Re: [PATCH] Percpu tag allocator

2013-06-12 Thread Andrew Morton
On Tue, 11 Jun 2013 21:03:24 -0700 Kent Overstreet wrote: > Allocates integers out of a predefined range 0..n, I assume. Not n..m. > - for use by e.g. a driver > to allocate tags for communicating with the device. > > v2: Tejun pointed out that the original strategy completely breaks if > n

Re: [PATCH] Percpu tag allocator

2013-06-12 Thread Oleg Nesterov
On 06/12, Kent Overstreet wrote: > > So we'd need at least an atomic counter, but a bitmap isn't really any > more trouble and it lets us skip most of the percpu lists that are empty Yes, yes, I understand. > - which should make a real difference in scalability to huge nr_cpus. But this is not o

Re: [PATCH] Percpu tag allocator

2013-06-12 Thread Kent Overstreet
e = nr_free; > > + return true; > > + } else { > > + remote->nr_free = 0; > > + } > > Both branches clear remote->nr_free. Yeah, but clearing it has to happen after the memcpy() and the smp_mb(). I couldn&#x

Re: [PATCH] Percpu tag allocator

2013-06-12 Thread Oleg Nesterov
On 06/11, Kent Overstreet wrote: > > + * This is done by keeping track of which cpus have tags on their percpu > + * freelists in a bitmap, and then on allocation failure if too many cpus > have > + * tags on their freelists - i.e. if cpus_have_tags * TAG_CPU_SIZE (64) > > + * nr_tags / 2 - then w

[PATCH] Percpu tag allocator

2013-06-11 Thread Kent Overstreet
Allocates integers out of a predefined range - for use by e.g. a driver to allocate tags for communicating with the device. v2: Tejun pointed out that the original strategy completely breaks if nr_cpus is large enough. I think (hope) I realized that at one point, but completely forgot in the month