Re: [PATCH 2/4] bdi: Add bdi->id

2019-08-03 Thread Tejun Heo
Hey, Matthew. On Sat, Aug 03, 2019 at 08:39:08AM -0700, Matthew Wilcox wrote: > On Sat, Aug 03, 2019 at 07:01:53AM -0700, Tejun Heo wrote: > > There currently is no way to universally identify and lookup a bdi > > without holding a reference and pointer to it. This patch

[PATCH block 2/2] writeback, cgroup: inode_switch_wbs() shouldn't give up on wb_switch_rwsem trylock fail

2019-08-02 Thread Tejun Heo
. Let's use wb_switch_rwsem only for synchronizing the actual switching and sync(2) and use isw_nr_in_flight instead for limiting the maximum number of scheduled switches. The limit is set to 1024 which should be more than enough while still avoiding extreme situations. Signed-off-by: Tejun Heo
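The in-flight limit described above can be sketched in userspace C as a capped atomic counter. This is only an illustration under stated assumptions: `switch_try_begin()`, `switch_end()` and `MAX_SWITCHES_IN_FLIGHT` are hypothetical names, C11 atomics stand in for the kernel's atomic helpers, and the only detail taken from the message is the 1024 cap.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical cap, mirroring the 1024 limit quoted in the message. */
#define MAX_SWITCHES_IN_FLIGHT 1024

/* Number of switches scheduled but not yet finished. */
static atomic_int nr_in_flight;

/* Reserve a slot; returns false once the cap has been reached. */
static bool switch_try_begin(void)
{
	int cur = atomic_load(&nr_in_flight);

	while (cur < MAX_SWITCHES_IN_FLIGHT) {
		/* On failure, cur is refreshed with the current counter value. */
		if (atomic_compare_exchange_weak(&nr_in_flight, &cur, cur + 1))
			return true;
	}
	return false;
}

/* Release the slot taken by a successful switch_try_begin(). */
static void switch_end(void)
{
	int prev = atomic_fetch_sub(&nr_in_flight, 1);

	assert(prev > 0);	/* every end must match an earlier begin */
}
```

A caller would schedule the work only when `switch_try_begin()` returns true and call `switch_end()` from the completion path, so a failed reservation simply skips the switch instead of blocking.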

[PATCH block 1/2] writeback, cgroup: Adjust WB_FRN_TIME_CUT_DIV to accelerate foreign inode switching

2019-08-02 Thread Tejun Heo
that it only ignores writebacks which are smaller than 12.5% of the current running average. Signed-off-by: Tejun Heo --- fs/fs-writeback.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/fs/fs-writeback.c +++ b/fs/fs-writeback.c @@ -227,7 +227,7 @@ static void wb_wait_for_completion(struc
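As a rough illustration of the 12.5% cutoff mentioned above: dividing the running average by 8 yields that threshold, since 1/8 = 0.125. The sketch below assumes a divisor of 8; `TIME_CUT_DIV` and `sample_is_ignored()` are hypothetical names for this example, not the identifiers used in fs/fs-writeback.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed divisor: avg / 8 is 12.5% of the running average. */
#define TIME_CUT_DIV 8

/* True when a sample is too small, relative to the average, to be counted. */
static bool sample_is_ignored(uint64_t sample_time, uint64_t avg_time)
{
	return sample_time < avg_time / TIME_CUT_DIV;
}
```

With an average of 1000 the cutoff is 125, so a sample of 100 would be ignored while samples of 125 or more would be counted.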

Re: [PATCH v12 0/6] Add utilization clamping support (CGroups API)

2019-07-29 Thread Tejun Heo
Hello, Looks good to me. On cgroup side, Acked-by: Tejun Heo Thanks. -- tejun

Re: [RFC 3/9] workqueue: require CPU hotplug read exclusion for apply_workqueue_attrs

2019-07-29 Thread Tejun Heo
ueue_attrs when changing > other CPU-hotplug-sensitive data structures with the CPU read lock > already held. > > Signed-off-by: Daniel Jordan Acked-by: Tejun Heo Please feel free to route with the rest of the patchset. Thanks. -- tejun

Re: [RFC 2/9] workqueue: unconfine alloc/apply/free_workqueue_attrs()

2019-07-29 Thread Tejun Heo
On Thu, Jul 25, 2019 at 05:24:58PM -0400, Daniel Jordan wrote: > padata will use these interfaces in a later patch, so unconfine them. > > Signed-off-by: Daniel Jordan Acked-by: Tejun Heo Thanks. -- tejun

Re: [PATCH] fs: kernfs: Fix possible null-pointer dereferences in kernfs_path_from_node_locked()

2019-07-29 Thread Tejun Heo
On Wed, Jul 24, 2019 at 10:22:42AM +0800, Jia-Ju Bai wrote: > In kernfs_path_from_node_locked(), there is an if statement on line 147 > to check whether buf is NULL: > if (buf) > > When buf is NULL, it is used on line 151: > len += strlcpy(buf + len, parent_str, ...) > and line 158: >

Re: linux-next boot error: WARNING: workqueue cpumask: online intersect > possible intersect

2019-07-24 Thread Tejun Heo
On Wed, Jul 24, 2019 at 10:41:29AM -0700, Eric Biggers wrote: > The real boot error "general protection fault in dma_direct_max_mapping_size" > is > fixed in mainline now. I believe that unblocks syzbot testing, since it > doesn't > appear to have been blocked by "WARNING: workqueue cpumask:

Re: [PATCH] cgroup: minor tweak for logic to get cgroup css

2019-07-23 Thread Tejun Heo
On Wed, Jul 03, 2019 at 10:07:49AM +0800, Peng Wang wrote: > We could only handle the case that css exists > and css_tryget_online() fails. > > Signed-off-by: Peng Wang Applied to cgroup/for-5.4. Thanks. -- tejun

Re: [PATCH] cgroup: Replace a seq_printf() call by seq_puts() in cgroup_print_ss_mask()

2019-07-23 Thread Tejun Heo
On Tue, Jul 02, 2019 at 07:33:08PM +0200, Markus Elfring wrote: > From: Markus Elfring > Date: Tue, 2 Jul 2019 19:26:59 +0200 > > A string which did not contain a data format specification should be put > into a sequence. Thus use the corresponding function “seq_puts”. > > This issue was

Re: [PATCH] mm/backing-dev: show state of all bdi_writeback in debugfs

2019-07-23 Thread Tejun Heo
On Wed, Jul 24, 2019 at 12:24:41AM +0300, Konstantin Khlebnikov wrote: > Debugging such dynamic structure with gdb is a pain. Use drgn. It's a lot better than hard coding these debug features into the kernel. https://github.com/osandov/drgn Thanks. -- tejun

Re: [PATCH v9 2/8] sched/core: Streamlining calls to task_rq_unlock()

2019-07-23 Thread Tejun Heo
On Tue, Jul 23, 2019 at 12:31:31PM +0200, Peter Zijlstra wrote: > On Mon, Jul 22, 2019 at 10:32:14AM +0200, Juri Lelli wrote: > > > Thanks for reporting. The set is based on cgroup/for-next (as of last > > week), though. I can of course rebase on tip/sched/core or mainline if > > needed. > > TJ;

Re: [PATCH v11 1/5] sched/core: uclamp: Extend CPU's cgroup controller

2019-07-18 Thread Tejun Heo
Hello, Patrick. On Mon, Jul 08, 2019 at 09:43:53AM +0100, Patrick Bellasi wrote: > +static inline void cpu_uclamp_print(struct seq_file *sf, > + enum uclamp_id clamp_id) > +{ > + struct task_group *tg; > + u64 util_clamp; > + u64 percent; > + u32

Re: [PATCH v8 6/8] cgroup/cpuset: Change cpuset_rwsem and hotplug lock order

2019-07-16 Thread Tejun Heo
On Fri, Jul 12, 2019 at 04:04:09PM +0200, Juri Lelli wrote: > > Should I take this as an indication that you had a look at the set and > > (apart from Peter's comments) you are OK with them? > > > > If that's the case I will send a v9 out soon. Otherwise I'd kindly ask > > you to please have a

Re: [PATCH] MAINTAINERS: add entry for block io cgroup

2019-07-12 Thread Tejun Heo
h > +F: block/blk-throttle.c > +F: block/blk-iolatency.c > +F: block/bfq-cgroup.c Given that blkcg changes are often entangled with generic block changes and best routed through block tree, I think it'd be useful to add the following. M: Tejun Heo M: Jens

Re: linux-next: build failure after merge of the block tree

2019-07-11 Thread Tejun Heo
ta.c > +++ b/fs/f2fs/data.c > @@ -513,7 +513,7 @@ int f2fs_merge_page_bio(struct f2fs_io_info *fio) > } > > if (fio->io_wbc) > - wbc_account_io(fio->io_wbc, page, PAGE_SIZE); > + wbc_account_cgroup_owner(fio->io_wbc, page, PAGE_SIZE); Acked-by: Tejun Heo Thanks. -- tejun

Re: [PATCH] ipmi_si_intf: use usleep_range() instead of busy looping

2019-07-10 Thread Tejun Heo
Hello, On Tue, Jul 09, 2019 at 06:01:44PM -0500, Corey Minyard wrote: > > I'm really not sure "carefully tuned" is applicable on indefinite busy > > looping. > > Well, yeah, but other things were tried and this was the only thing > we could find that worked. That was before the kind of SMP

Re: [Openipmi-developer] [PATCH] ipmi_si_intf: use usleep_range() instead of busy looping

2019-07-10 Thread Tejun Heo
Hello, Corey. On Tue, Jul 09, 2019 at 06:07:03PM -0500, Corey Minyard wrote: > I believe the change was 33979734cd35ae "IPMI: use schedule in kthread" > The original change that added the kthread was a9a2c44ff0a1350 > "ipmi: add timer thread". > > I mis-remembered this, we switched from doing a

Re: [PATCH] ipmi_si_intf: use usleep_range() instead of busy looping

2019-07-09 Thread Tejun Heo
On Tue, Jul 09, 2019 at 04:46:02PM -0500, Corey Minyard wrote: > On Tue, Jul 09, 2019 at 02:06:43PM -0700, Tejun Heo wrote: > > ipmi_thread() uses back-to-back schedule() to poll for command > > completion which, on some machines, can push up CPU consumption and > > heavily ta

Re: [PATCH] ipmi_si_intf: use usleep_range() instead of busy looping

2019-07-09 Thread Tejun Heo
Hello, Corey. On Tue, Jul 09, 2019 at 04:46:02PM -0500, Corey Minyard wrote: > I'm also a little confused because the CPU in question shouldn't > be doing anything else if the schedule() immediately returns here, > so it's not wasting CPU that could be used on another process. Or > is it lock

[PATCH] ipmi_si_intf: use usleep_range() instead of busy looping

2019-07-09 Thread Tejun Heo
the sensor readings to finish reasonably fast and the cpu consumption of the kthread is kept under several percent of a core. Signed-off-by: Tejun Heo --- drivers/char/ipmi/ipmi_si_intf.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers

Re: WARNING in kernfs_create_dir_ns

2019-07-09 Thread Tejun Heo
On Tue, Jul 09, 2019 at 10:23:35AM +0800, Hillf Danton wrote: > > I don't think this is the correct fix. It's being called with kobj > > whose parent's sysfs node is dangling. It gotta be fixed from the > > caller side. > > > Make sense? > > --- a/lib/kobject.c > +++ b/lib/kobject.c No, I

Re: [PATCH] cgroup: minor tweak for logic to get cgroup css

2019-07-08 Thread Tejun Heo
On Mon, Jul 08, 2019 at 05:29:49PM +0000, Roman Gushchin wrote: > On Mon, Jul 08, 2019 at 09:42:43AM -0700, Tejun Heo wrote: > > On Wed, Jul 03, 2019 at 10:07:49AM +0800, Peng Wang wrote: > > > We could only handle the case that css exists > > > and

[GIT PULL] cgroup changes for v5.3-rc1

2019-07-08 Thread Tejun Heo
) Mauro Carvalho Chehab (1): docs: cgroup-v1: convert docs to ReST and rename to *.rst Tejun Heo (3): cgroup: add cgroup_parse_float() Merge branch 'for-5.2-fixes' into for-5.3 cgroup: Move cgroup_parse_float

[GIT PULL] workqueue changes for v5.3-rc1

2019-07-08 Thread Tejun Heo
Hello, Just a couple cleanup patches. No functional changes. Thanks. The following changes since commit 249155c20f9b0754bc1b932a33344cfb4e0c2101: Merge branch 'parisc-5.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux (2019-06-25 05:52:31 +0800) are available in

Re: [PATCH] cgroup: minor tweak for logic to get cgroup css

2019-07-08 Thread Tejun Heo
On Wed, Jul 03, 2019 at 10:07:49AM +0800, Peng Wang wrote: > We could only handle the case that css exists > and css_tryget_online() fails. As css_tryget_online() can't handle NULL input, this is a bug fix. Can you please clarify that in the description? Thanks. -- tejun

Re: [PATCH v2] kernfs: fix potential null pointer dereference

2019-07-08 Thread Tejun Heo
On Mon, Jul 08, 2019 at 11:16:11PM +0800, Peng Wang wrote: > Get root safely after kn is ensured to be not null. > > Signed-off-by: Peng Wang Acked-by: Tejun Heo Thanks. -- tejun

Re: WARNING in kernfs_create_dir_ns

2019-07-08 Thread Tejun Heo
Hello, On Mon, Jul 01, 2019 at 01:52:35PM +0800, Hillf Danton wrote: > >WARNING: CPU: 0 PID: 8613 at fs/kernfs/dir.c:493 kernfs_get > >fs/kernfs/dir.c:493 [inline] > >WARNING: CPU: 0 PID: 8613 at fs/kernfs/dir.c:493 kernfs_new_node > >fs/kernfs/dir.c:700 [inline] > >WARNING: CPU: 0 PID: 8613

Re: [PATCH] cgroup: simplify code for cgroup_subtree_control_write()

2019-07-08 Thread Tejun Heo
On Mon, Jul 08, 2019 at 09:01:32PM +0800, Peng Wang wrote: > Process "enable" and "disable" earlier to simplify code. I don't think this is correct and, even if it were, the value of this change is close to none, so nack on this one. Thanks. -- tejun

Re: [PATCH] kernfs: fix potential null pointer dereference

2019-07-08 Thread Tejun Heo
) > if (likely(v != KN_DEACTIVATED_BIAS)) > return; > > + root = kernfs_root(kn); > wake_up_all(&root->deactivate_waitq); Maybe just remove the root variable altogether? Other than that, Acked-by: Tejun Heo Thanks. -- tejun

Re: [PATCH v8 6/8] cgroup/cpuset: Change cpuset_rwsem and hotplug lock order

2019-07-01 Thread Tejun Heo
Hello, On Mon, Jul 01, 2019 at 10:27:31AM +0200, Peter Zijlstra wrote: > IIRC TJ figured it wasn't strictly required to fix the lock inversion at > that time and they sorted it differently. If I (re)read the thread > correctly the other day, he didn't have fundamental objections against > it, but

Re: your mail

2019-06-27 Thread Tejun Heo
On Wed, Jun 26, 2019 at 04:52:36PM +0200, Sebastian Andrzej Siewior wrote: > A small series of tiny cleanups. Applied 1-2 to wq/for-5.3. Thanks. -- tejun

[PATCH 1/5] cgroup, blkcg: Prepare some symbols for module and !CONFIG_CGROUP usages

2019-06-27 Thread Tejun Heo
btrfs is going to use css_put() and wbc helpers to improve cgroup writeback support. Add dummy css_get() definition and export wbc helpers to prepare for module and !CONFIG_CGROUP builds. Signed-off-by: Tejun Heo Reported-by: kbuild test robot Reviewed-by: Jan Kara --- block/blk-cgroup.c

Re: [PATCH] memcg: Add kmem.slabinfo to v2 for debugging purpose

2019-06-27 Thread Tejun Heo
Hello, Waiman. On Wed, Jun 26, 2019 at 12:56:14PM -0400, Waiman Long wrote: > With memory cgroup v1, there is a kmem.slabinfo file that can be > used to view what slabs are allocated to the memory cgroup. There > is currently no such equivalent in memory cgroup v2. This file can > be useful for

Re: [PATCH 0/6] workqueue: convert to raw_spinlock_t

2019-06-26 Thread Tejun Heo
Hello, On Wed, Jun 26, 2019 at 04:12:15PM +0200, Thomas Gleixner wrote: > We are working hard to get the remaining pieces in and to the best of my > knowledge there is no hard resistance against merging them. I wonder whether it'd be useful to build some consensus around the approach. It'd be a

Re: [PATCH 0/6] workqueue: convert to raw_spinlock_t

2019-06-26 Thread Tejun Heo
Hello, On Wed, Jun 26, 2019 at 03:53:43PM +0200, Thomas Gleixner wrote: > > I don't know what to make of the series. AFAICS, there's no benefit to > > mainline. What am I missing? > > there is no downside either, right? Sure, but that usually isn't enough for merging patches, right? > It

Re: [PATCH 0/6] workqueue: convert to raw_spinlock_t

2019-06-26 Thread Tejun Heo
On Wed, Jun 26, 2019 at 09:17:19AM +0200, Sebastian Andrzej Siewior wrote: > On 2019-06-13 16:50:21 [+0200], To linux-kernel@vger.kernel.org wrote: > > Hi, > > > > the workqueue code has been reworked in -RT to use raw_spinlock_t based > > locking. This change allows to schedule worker from

Re: [PATCH v10 12/16] sched/core: uclamp: Extend CPU's cgroup controller

2019-06-24 Thread Tejun Heo
Hey, Patrick. On Mon, Jun 24, 2019 at 06:29:06PM +0100, Patrick Bellasi wrote: > > I kinda wonder whether the term bandwidth is a bit confusing because > > it's also used for cpu.max/min. Would just calling it frequency be > > clearer? > > Maybe I should find a better way to express the concept

Re: [PATCH v10 13/16] sched/core: uclamp: Propagate parent clamps

2019-06-24 Thread Tejun Heo
Hello, Patrick. On Mon, Jun 24, 2019 at 06:34:05PM +0100, Patrick Bellasi wrote: > > On Fri, Jun 21, 2019 at 09:42:14AM +0100, Patrick Bellasi wrote: > > > Since it can be interesting for userspace, e.g. system management > > > software, to know exactly what the currently propagated/enforced > >

Re: [PATCH 2/9] blkcg, writeback: Add wbc->no_wbc_acct

2019-06-24 Thread Tejun Heo
Hello, Jan. On Mon, Jun 24, 2019 at 10:21:30AM +0200, Jan Kara wrote: > OK, now I understand. Just one more question: So effectively, you are using > wbc->no_wbc_acct to pass information from btrfs code to btrfs code telling > it whether IO should or should not be accounted with wbc_account_io().

Re: [PATCH v10 13/16] sched/core: uclamp: Propagate parent clamps

2019-06-22 Thread Tejun Heo
Hello, On Fri, Jun 21, 2019 at 09:42:14AM +0100, Patrick Bellasi wrote: > Since it can be interesting for userspace, e.g. system management > software, to know exactly what the currently propagated/enforced > configuration is, the effective clamp values are exposed to user-space > by means of a

Re: [PATCH v10 12/16] sched/core: uclamp: Extend CPU's cgroup controller

2019-06-22 Thread Tejun Heo
Hello, Generally looks good to me. Some nitpicks. On Fri, Jun 21, 2019 at 09:42:13AM +0100, Patrick Bellasi wrote: > @@ -951,6 +951,12 @@ controller implements weight and absolute bandwidth > limit models for > normal scheduling policy and absolute bandwidth allocation model for > realtime

Re: [RFC] deadlock with flush_work() in UAS

2019-06-20 Thread Tejun Heo
Hello, On Tue, Jun 18, 2019 at 11:59:39AM -0400, Alan Stern wrote: > > > Even if you disagree, perhaps we should have a global workqueue with a > > > permanently set noio flag. It could be shared among multiple drivers > > > such as uas and the hub driver for purposes like this. (In fact, the

Re: [PATCH RFC] mm: memcontrol: add cgroup v2 interface to read memory watermark

2019-06-15 Thread Tejun Heo
now. Nacked-by: Tejun Heo Thanks. -- tejun

Re: [PATCH 08/10] blkcg: implement blk-ioweight

2019-06-15 Thread Tejun Heo
Hello, On Fri, Jun 14, 2019 at 10:50:34PM +0200, Toke Høiland-Jørgensen wrote: > > Within a single cgroup, the IOs are FIFO. When an IO has enough vtime > > credit, it just passes through. When it doesn't, it always waits > > behind any other IOs which are already waiting. > > OK. Is there any

Re: [PATCH v4 05/28] docs: cgroup-v1: convert docs to ReST and rename to *.rst

2019-06-14 Thread Tejun Heo
while this is not linked to > > the main index.rst file, in order to avoid build warnings. > > > > Signed-off-by: Mauro Carvalho Chehab > > Acked-by: Tejun Heo > > This one, too, has linux-next stuff that keeps it from applying to > docs-next. Tejun, would you like to carry it on top of your work? Applied to cgroup/for-5.3. Thanks. -- tejun

[GIT PULL] cgroup fixes for v5.2-rc4

2019-06-14 Thread Tejun Heo
to cpuset_cpus_allowed_fallback() Odin Ugedal (1): docs cgroups: add another example size for hugetlb Tejun Heo (6): cgroup: Use css_tryget() instead of css_tryget_online() in task_get_css() cgroup: Call cgroup_release() before __exit_signal() cgroup: Implement

Re: [PATCHSET block/for-next] IO cost model based work-conserving porportional controller

2019-06-14 Thread Tejun Heo
On Thu, Jun 13, 2019 at 06:56:10PM -0700, Tejun Heo wrote: ... > The patchset is also available in the following git branch. > > git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-iow Updated patchset available in the following branch. Just build fixes and cosmeti

Re: [RESEND PATCH v3] cpuset: restore sanity to cpuset_cpus_allowed_fallback()

2019-06-12 Thread Tejun Heo
On Wed, Jun 12, 2019 at 11:50:48AM -0400, Joel Savitz wrote: > In the case that a process is constrained by taskset(1) (i.e. > sched_setaffinity(2)) to a subset of available cpus, and all of those are > subsequently offlined, the scheduler will set tsk->cpus_allowed to > the current value of

Re: [RESEND PATCH v3] cpuset: restore sanity to cpuset_cpus_allowed_fallback()

2019-06-12 Thread Tejun Heo
est mainline kernel. > > However, this is not sane behavior. While not perfect (we'll need to stop updating task's cpumask from cpuset to make), this is still a significant improvement. Acked-by: Tejun Heo If there's no objection, I'll route it through the cgroup tree. Thanks. -- tejun

Re: [PATCH v3] cpuset: restore sanity to cpuset_cpus_allowed_fallback()

2019-06-12 Thread Tejun Heo
Hello, Joel. On Wed, Jun 12, 2019 at 11:13:15AM -0400, Joel Savitz wrote: > In the case that a process is constrained by taskset(1) (i.e. > sched_setaffinity(2)) to a subset of available cpus, and all of those are > subsequently offlined, the scheduler will set tsk->cpus_allowed to > the current

Re: [RFC v2 0/5] cgroup-aware unbound workqueues

2019-06-11 Thread Tejun Heo
Hello, Daniel. On Wed, Jun 05, 2019 at 11:32:29AM -0400, Daniel Jordan wrote: > Sure, quoting from the last ktask post: > > A single CPU can spend an excessive amount of time in the kernel operating > on large amounts of data. Often these situations arise during > initialization- > and

Re: [RFC v2 0/5] cgroup-aware unbound workqueues

2019-06-11 Thread Tejun Heo
Hello, On Thu, Jun 06, 2019 at 09:15:26AM +0300, Mike Rapoport wrote: > > Can you please go into more details on the use cases? > > If I remember correctly, the original Bandan's work was about using > workqueues instead of kthreads in vhost. For vhosts, I think it might be better to stick

Re: [PATCH v3 05/33] docs: cgroup-v1: convert docs to ReST and rename to *.rst

2019-06-11 Thread Tejun Heo
> - fix tables markups; > - add some lists markups; > - mark literal blocks; > - adjust title markups. > > At its new index.rst, let's add a :orphan: while this is not linked to > the main index.rst file, in order to avoid build warnings. > > Signed-off-by: Mauro Carv

Re: linux-next boot error: WARNING: workqueue cpumask: online intersect > possible intersect

2019-06-11 Thread Tejun Heo
Hello, On Fri, Jun 07, 2019 at 10:45:45AM +0200, Dmitry Vyukov wrote: > +workqueue maintainers and Michael who added this WARNING > > The WARNING was added in 2017, so I guess it's a change somewhere else > that triggered it. > The WARNING message does not seem to give enough info about the

Re: -next-20190607 kernel: oopses on bootup or shutdown

2019-06-11 Thread Tejun Heo
On Tue, Jun 11, 2019 at 10:57:53AM +0200, Pavel Machek wrote: > Hi! > > It failed to boot three times; now it booted but failed on shutdown. > > Hardware is thinkpad X60 (32bit x86), and I'm copying oops by hand. Can you please try next-20190611? It should be fixed now. Thanks. -- tejun

Re: KASAN: null-ptr-deref Read in css_task_iter_advance

2019-06-10 Thread Tejun Heo
Hello, Hillf. On Tue, Jun 11, 2019 at 12:59:23AM +0800, Hillf Danton wrote: > >syzbot will keep track of this bug report. See: > >https://goo.gl/tpsmEJ#status for how to communicate with syzbot. > > > Ignore my noise if you have no interest seeing the syzbot report. They're awesome. > The

Re: [PATCH v9 12/16] sched/core: uclamp: Extend CPU's cgroup controller

2019-06-05 Thread Tejun Heo
Hello, Patrick. On Wed, Jun 05, 2019 at 04:37:43PM +0100, Patrick Bellasi wrote: > > Everything sounds good to me. Please note that cgroup interface files > > actually use literal "max" for limit/protection max settings so that 0 > > and "max" mean the same things for all limit/protection knobs.

Re: [PATCH v9 12/16] sched/core: uclamp: Extend CPU's cgroup controller

2019-06-05 Thread Tejun Heo
Hello, Patrick. On Wed, Jun 05, 2019 at 04:06:30PM +0100, Patrick Bellasi wrote: > The only additional point I can think about as a (slightly) stronger > reason is that I guess we would like to have the same API for cgroups > as well as for the task specific and the system wide settings. > > The

Re: [PATCH v9 12/16] sched/core: uclamp: Extend CPU's cgroup controller

2019-06-05 Thread Tejun Heo
Hello, On Wed, Jun 05, 2019 at 03:39:50PM +0100, Patrick Bellasi wrote: > Which means we will enforce the effective values as: > >/tg1/tg11: > > util_min.effective=0 > i.e. keep the child protection since smaller than parent > > util_max.effective=800 >

Re: [PATCH v9 12/16] sched/core: uclamp: Extend CPU's cgroup controller

2019-06-05 Thread Tejun Heo
Hello, On Mon, Jun 03, 2019 at 01:29:29PM +0100, Patrick Bellasi wrote: > On 31-May 08:35, Tejun Heo wrote: > > Hello, Patrick. > > > > On Wed, May 15, 2019 at 10:44:55AM +0100, Patrick Bellasi wrote: > > [...] > > > For proportions (as opposed to

Re: [PATCH v9 12/16] sched/core: uclamp: Extend CPU's cgroup controller

2019-06-05 Thread Tejun Heo
Hello, On Mon, Jun 03, 2019 at 01:27:25PM +0100, Patrick Bellasi wrote: > All the above, to me it means that: > - cgroups are always capped by system clamps > - cgroups can further restrict system clamps > > Does that match with your view? Yeah, as long as what's defined at system level

Re: [RFC v2 0/5] cgroup-aware unbound workqueues

2019-06-05 Thread Tejun Heo
Hello, Daniel. On Wed, Jun 05, 2019 at 09:36:45AM -0400, Daniel Jordan wrote: > My use case for this work is kernel multithreading, the series formerly known > as ktask[2] that I'm now trying to combine with padata according to feedback > from the last post. Helper threads in a multithreaded job

Re: [PATCH] sched/core: Fix cpu controller for !RT_GROUP_SCHED

2019-06-05 Thread Tejun Heo
m I missing here and why (if) current behavior is needed and makes > sense. > > Any input? Yeah, RT tasks being transparent to the cpu controller when !RT_GROUP_SCHED makes sense to me, especially given that the rules around it are already inconsistent. Please feel free to add Acked-by: Tejun Heo Thanks. -- tejun

Re: [PATCH v9 12/16] sched/core: uclamp: Extend CPU's cgroup controller

2019-05-31 Thread Tejun Heo
Hello, Patrick. On Wed, May 15, 2019 at 10:44:55AM +0100, Patrick Bellasi wrote: > Extend the CPU controller with a couple of new attributes util.{min,max} > which allows to enforce utilization boosting and capping for all the > tasks in a group. Specifically: > > - util.min: defines the minimum

Re: [PATCH] sched/core: add __sched tag for io_schedule()

2019-05-31 Thread Tejun Heo
wchan" will report io_schedule() > rather than its callers when waiting io. > > Reported-by: Jilong Kou > Cc: Tejun Heo > Cc: Ingo Molnar > Cc: Peter Zijlstra > Signed-off-by: Gao Xiang Acked-by: Tejun Heo Thanks. -- tejun

Re: [PATCH] docs cgroups: add another example size for hugetlb

2019-05-30 Thread Tejun Heo
On Thu, May 30, 2019 at 12:24:25AM +0200, Odin Ugedal wrote: > Add another example to clarify that HugePages smaller than 1MB will > be displayed using "KB", with an uppercased K (eg. 20KB), and not the > normal SI prefix kilo (small k). > > Because of a misunderstanding/copy-paste error inside
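The unit convention described above (sizes below 1MB shown in "KB" with an uppercase K) can be illustrated with a small formatter. This is a sketch under assumptions, not the kernel's code: `format_hugepage_size()` is a hypothetical helper, and the GB/MB thresholds are assumed to be powers of two.

```c
#include <stdio.h>

/* Format a huge page size given in bytes as "<n>GB", "<n>MB" or "<n>KB". */
static char *format_hugepage_size(char *buf, size_t len, unsigned long long bytes)
{
	if (bytes >= (1ULL << 30))
		snprintf(buf, len, "%lluGB", bytes >> 30);
	else if (bytes >= (1ULL << 20))
		snprintf(buf, len, "%lluMB", bytes >> 20);
	else
		snprintf(buf, len, "%lluKB", bytes >> 10);	/* uppercase K, not SI "k" */
	return buf;
}
```

Under these assumptions a 64 KiB page would be rendered as `64KB` and a 2 MiB page as `2MB`.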

Re: [PATCH v2 1/3] kselftest/cgroup: fix unexpected testing failure on test_memcontrol

2019-05-24 Thread Tejun Heo
Hello, All three patches look good to me. Please feel free to add my acked-by. Shuah, should I route these through cgroup tree or would the kselftest tree be a better fit? Thanks. -- tejun

[GIT PULL] cgroup fix for v5.2-rc1

2019-05-16 Thread Tejun Heo
Hello, Linus. The cgroup2 freezer pulled in this cycle broke strace. This pull request includes a workaround for the problem. It's not a complete fix in that it may cause spurious frozen state flip-flops which is fairly minor. Will push a full fix once it's ready. Thanks. The following

Re: [PATCH RESEND] signal: unconditionally leave the frozen state in ptrace_stop()

2019-05-16 Thread Tejun Heo
> [ pre-main omitted ] > write(1, "a", 1)= 1 > exit_group(0) = ? > +++ exited with 0 +++ > > Reported-by: Alex Xu > Fixes: 76f969e8948d ("cgroup: cgroup v2 freezer") > Signed-off-by: Roman Gushchin > Acked-by: Oleg Nesterov > Cc: Tejun Heo Applied to cgroup/for-5.2-fixes. Thanks. -- tejun

[GIT PULL] cgroup changes for v5.2-rc1

2019-05-09 Thread Tejun Heo
Hello, Linus. This pull request includes Roman's cgroup2 freezer implementation. It's a separate mechanism from cgroup1 freezer. Instead of blocking user tasks in arbitrary uninterruptible sleeps, the new implementation extends jobctl stop - frozen tasks are trapped in jobctl stop until thawed

[GIT PULL] workqueue changes for v5.2-rc1

2019-05-09 Thread Tejun Heo
Hello, Linus. Only three commits, of which two are trivial. The non-trivial change is Thomas's patch to switch workqueue from sched RCU to regular one. The use of sched RCU is mostly historic and doesn't really buy us anything noticeable. Thanks. The following changes since commit

Re: [PATCH 1/4] percpu_ref: introduce PERCPU_REF_ALLOW_REINIT flag

2019-05-09 Thread Tejun Heo
> > This patch doesn't introduce any functional change to avoid any > regressions. It will be done later in the patchset after adjusting > all call sites, which are reviving percpu counters. > > Signed-off-by: Roman Gushchin For all patches in the series: Acked-by: Tejun Heo Thanks. -- tejun

Re: [PATCH] cgroup: never call do_group_exit() with task->frozen bit set

2019-05-09 Thread Tejun Heo
g. This is only place where > we can leave the loop with the task->frozen bit set and without > setting JOBCTL_TRAP_FREEZE and TIF_SIGPENDING. > > To resolve this problem, let's move cgroup_leave_frozen(true) call to > just after the fatal label. If the task is going to die, the frozen

Re: [PATCH v2 02/06] kernel: cgroup: fix misuse of %x

2019-05-06 Thread Tejun Heo
On Sun, Apr 21, 2019 at 07:47:27PM +0800, Fuqian Huang wrote: > Pointers should be printed with %p or %px rather than > cast to unsigned long type and printed with %lx. > Change %lx to %p to print the pointers. > > Signed-off-by: Fuqian Huang Applied to cgroup/for-5.2. Thanks. -- tejun

Re: [PATCH 0/2] cgroup v2 freezer follow-up patches

2019-05-06 Thread Tejun Heo
On Fri, Apr 26, 2019 at 10:59:43AM -0700, Roman Gushchin wrote: > Hi, Tejun! > > Please, pull these two follow-up patches for the cgroup v2 freezer. > > These are a fix for a spurious state transition, which could happen > due to a race condition, and a cleanup of some dead code. Both patches >

[GIT PULL] cgroup fix for v5.1-rc5

2019-04-19 Thread Tejun Heo
Hello, Linus. A patch to fix a RCU imbalance error in the devices cgroup configuration error path. Thanks. The following changes since commit 9e98c678c2d6ae3a17cb2de55d17f69dddaa231b: Linux 5.1-rc1 (2019-03-17 14:22:26 -0700) are available in the Git repository at:

Re: [PATCH v10 0/9] freezer for cgroup v2

2019-04-19 Thread Tejun Heo
On Fri, Apr 05, 2019 at 10:46:59AM -0700, Roman Gushchin wrote: > This patchset implements freezer for cgroup v2. > > It provides similar functionality as v1 freezer, but the interface > conforms to the cgroup v2 interface design principles, and it > provides a better user experience: tasks can

Re: [PATCH RFC 1/1] kernfs: keep kernfs node alive for __kernfs_remove()

2019-04-17 Thread Tejun Heo
Hello, On Wed, Apr 17, 2019 at 04:12:29PM +0000, Konstantin Khorenko wrote: > i don't know the full scenario unfortunately, but the idea is the following: > > __kernfs_remove() is called under kernfs_mutex and if >!(!kn || (kn->parent && RB_EMPTY_NODE(&kn->rb))) > > it assumes that nothing can

Re: [PATCH RFC 1/1] kernfs: keep kernfs node alive for __kernfs_remove()

2019-04-16 Thread Tejun Heo
On Tue, Apr 16, 2019 at 06:53:35PM +0300, Konstantin Khorenko wrote: > __kernfs_remove() which is called under kernfs_mutex, > assumes nobody kills kernfs node while it's working on it > and "get"s current kernfs node for that. > > But we hit a warning in kernfs_get(): kn->counter == 0 already: >

Re: [PATCH] kernel/workqueue: Verify alloc_workqueue() argument list consistency

2019-04-16 Thread Tejun Heo
On Tue, Mar 19, 2019 at 10:40:47AM -0700, Bart Van Assche wrote: > This patch avoids that gcc reports the following warning when building > with W=1: > > kernel/workqueue.c:4250:2: warning: function alloc_workqueue might be a > candidate for gnu_printf format attribute

Re: [PATCH] kernfs: fix barrier usage in __kernfs_new_node()

2019-04-16 Thread Tejun Heo
in > + * set ino first. This RELEASE is paired with atomic_inc_not_zero in >* kernfs_find_and_get_node_by_ino >*/ > - smp_mb__before_atomic(); > - atomic_set(&kn->count, 1); > + atomic_set_release(&kn->count, 1); Acked-by: Tejun Heo Thanks. -- tejun
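The pairing mentioned in the quoted comment, a RELEASE store of the count on the writer side matched by an increment-if-not-zero on the reader side, can be sketched in userspace with C11 atomics. This is an analogy under assumptions rather than the kernfs code: `struct node`, `node_publish()` and `node_get_unless_zero()` are hypothetical, and acquire ordering on the successful compare-exchange stands in for the ordering the kernel primitive provides.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct node {
	unsigned long ino;	/* plain field, written before publication */
	atomic_int count;	/* reference count; 0 means "not yet visible" */
};

/* Writer: set ino first, then publish with a RELEASE store of the count. */
static void node_publish(struct node *n, unsigned long ino)
{
	n->ino = ino;
	atomic_store_explicit(&n->count, 1, memory_order_release);
}

/* Reader: take a reference only if the count is already non-zero. */
static bool node_get_unless_zero(struct node *n)
{
	int old = atomic_load_explicit(&n->count, memory_order_relaxed);

	while (old != 0) {
		/* On failure, old is refreshed and the loop re-checks for zero. */
		if (atomic_compare_exchange_weak_explicit(&n->count, &old, old + 1,
							  memory_order_acquire,
							  memory_order_relaxed))
			return true;	/* n->ino is now safe to read */
	}
	return false;
}
```

A reader that sees a non-zero count through the acquire operation is also guaranteed to see the `ino` value written before the release store, which is the property the comment in the patch relies on.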

Re: [PATCH 2/2] sched: Distangle worker accounting from rq lock

2019-04-08 Thread Tejun Heo
sleeping() by Daniel Bristot de > Oliveira] > Signed-off-by: Sebastian Andrzej Siewior This looks good from wq side. Peter, are you okay with routing this through the wq tree? If you wanna take it through the sched tree, please feel free to add Acked-by: Tejun Heo Thanks. -- tejun

Re: [PATCH 1/2] workqueue: Use normal rcu

2019-04-08 Thread Tejun Heo
On Wed, Mar 13, 2019 at 05:55:47PM +0100, Sebastian Andrzej Siewior wrote: > From: Thomas Gleixner > > There is no need for sched_rcu. The undocumented reason why sched_rcu > is used is to avoid a few explicit rcu_read_lock()/unlock() pairs by > the fact that sched_rcu reader side critical

Re: [PATCH 1/2] workqueue: Use normal rcu

2019-04-08 Thread Tejun Heo
Hello, Sebastian. On Fri, Apr 05, 2019 at 04:42:18PM +0200, Sebastian Andrzej Siewior wrote: > On 2019-03-22 18:59:23 [+0100], To Tejun Heo wrote: > > On 2019-03-22 10:43:34 [-0700], Tejun Heo wrote: > > > Hello, > Hi, > > > > We can switch but it doesn't real

Re: [PATCH] cgroup: remove extra cgroup_migrate_finish() call

2019-04-04 Thread Tejun Heo
On Wed, Apr 03, 2019 at 04:03:54PM -0700, Shakeel Butt wrote: > The callers of cgroup_migrate_prepare_dst() correctly call > cgroup_migrate_finish() for success and failure cases both. No need to > call it in cgroup_migrate_prepare_dst() in failure case. > > Signed-off-by: Shakeel Butt Applied

Re: [PATCH 1/2] workqueue: Use normal rcu

2019-03-22 Thread Tejun Heo
Hello, On Thu, Mar 21, 2019 at 09:59:35PM +0100, Sebastian Andrzej Siewior wrote: > On 2019-03-13 17:55:47 [+0100], To linux-kernel@vger.kernel.org wrote: > > From: Thomas Gleixner > > > > There is no need for sched_rcu. The undocumented reason why sched_rcu > > is used is to avoid a few

Re: [PATCH] kernel/workqueue: Document wq_worker_last_func() argument

2019-03-19 Thread Tejun Heo
On Tue, Mar 19, 2019 at 10:45:09AM -0700, Bart Van Assche wrote: > This patch avoids that the following warning is reported when building > with W=1: > > kernel/workqueue.c:938: warning: Function parameter or member 'task' not > described in 'wq_worker_last_func' > > Signed-off-by: Bart Van

Re: [PATCH] device_cgroup: fix RCU imbalance in error case

2019-03-19 Thread Tejun Heo
On Tue, Mar 19, 2019 at 02:36:59AM +0100, Jann Horn wrote: > When dev_exception_add() returns an error (due to a failed memory > allocation), make sure that we move the RCU preemption count back to where > it was before we were called. We dropped the RCU read lock inside the loop > body, so we

Re: [PATCH] kernel/workqueue: Use __printf markup to silence compiler in function 'alloc_workqueue'

2019-03-15 Thread Tejun Heo
On Tue, Mar 12, 2019 at 09:21:26PM +0100, Mathieu Malaterre wrote: > Silence warnings (triggered at W=1) by adding relevant __printf attributes. > > kernel/workqueue.c:4249:2: warning: function 'alloc_workqueue' might be a > candidate for 'gnu_printf' format attribute

Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set

2019-03-14 Thread Tejun Heo
> and DEFINE_STATIC_SRCU() within loadable modules. > > Suggested-by: Barret Rhoden > Signed-off-by: Paul E. McKenney Looks-great-to-me-by: Tejun Heo Thanks. :) -- tejun

Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set

2019-03-13 Thread Tejun Heo
Hello, On Wed, Mar 13, 2019 at 02:22:55PM -0700, Paul E. McKenney wrote: > Should I define DEFINE_SRCU() and DEFINE_STATIC_SRCU() only if > !defined(MODULE)? Yeah, that sounds like a great idea with comments explaining why it's like that. Thanks. -- tejun

Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set

2019-03-13 Thread Tejun Heo
Hello, On Wed, Mar 13, 2019 at 03:40:04PM -0400, Barret Rhoden wrote: > Are there any other alternatives? Not using static SRCU in any code > that could be built as a module seems a little harsh. Yes, allocate the srcu dynamically on module init and destroy on module exit. That's how the other

[GIT PULL] cgroup changes for v5.1-rc1

2019-03-06 Thread Tejun Heo
>free() into cgroup_subsys->release() to fix the accounting Randy Dunlap (1): Documentation: cgroup-v2: eliminate markup warnings Tejun Heo (1): cgroup, rstat: Don't flush subtree root unless necessary Tibor Billes (1): cgroup: add documentation for pids.events file Documenta

[GIT PULL] workqueue changes for v5.1-rc1

2019-03-06 Thread Tejun Heo
Hello, Linus. All trivial. Two comment updates and one more initialization sanity check in flush_work(). Thanks. The following changes since commit d73aba1115cf40630cc8b4b7aed049ed8117b458: Merge tag 'drm-fixes-2019-01-25-1' of git://anongit.freedesktop.org/drm/drm (2019-01-25 12:19:10

Re: [PATCH] sched/core: fix buffer overflow in cgroup2 property cpu.max

2019-03-06 Thread Tejun Heo
On Wed, Mar 06, 2019 at 08:11:42PM +0300, Konstantin Khlebnikov wrote: > Add limit into sscanf format string for on-stack buffer. > > Fixes: 0d5936344f30 ("sched: Implement interface for cgroup unified > hierarchy") > Signed-off-by: Konstantin Khlebnikov Acked-by: Teju
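The fix described above, a field width in the sscanf format string, can be illustrated with a small userspace sketch. This is not the kernel code: parse_cpu_max() and TOK_LEN are hypothetical names chosen for the example, and the 20-character limit is an assumption made for illustration.

```c
#include <stdio.h>

/* Hypothetical buffer size for this sketch: 20 characters plus NUL. */
#define TOK_LEN 20

/*
 * Hypothetical userspace helper, not the kernel function.
 * Parses "<token> [period]" and returns the number of fields that
 * sscanf() converted (0, 1 or 2), or EOF on empty input.
 */
static int parse_cpu_max(const char *buf, char tok[TOK_LEN + 1],
                         unsigned long long *period)
{
    /*
     * "%20s" stores at most 20 characters and then the terminating
     * NUL, so a 21-byte buffer cannot be overrun however long the
     * input is. A bare "%s" would keep writing until the first
     * whitespace character.
     */
    return sscanf(buf, "%20s %llu", tok, period);
}
```

Called as parse_cpu_max("max 100000", tok, &period) with a char tok[TOK_LEN + 1] on the stack, the token is copied in full; for an over-long token only the first 20 characters are stored.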

Re: [PATCH] sched/core: check format and overflows in cgroup2 cpu.max

2019-03-06 Thread Tejun Heo
Hello, Konstantin. On Tue, Mar 05, 2019 at 08:03:24PM +0300, Konstantin Khlebnikov wrote: > >Ditto as the blkio patch. Unless there is a correctness problem, my > >preference is towards keeping the parsing functions simple and I don't > >think the kernel needs to play the role of strict input

Re: [PATCH v8 0/7] freezer for cgroup v2

2019-03-05 Thread Tejun Heo
Hello, Oleg. Sorry about the delay. On Mon, Feb 25, 2019 at 04:57:25PM +0100, Oleg Nesterov wrote: > > As long as the task is > > guaranteed to be trapped by signal stop afterwards (and they are), we > > likely can use them the same way. The only thing to be careful about > > would be ensuring

Re: [PATCH] sched/core: check format and overflows in cgroup2 cpu.max

2019-03-05 Thread Tejun Heo
Hello, On Wed, Feb 27, 2019 at 11:13:21AM +0300, Konstantin Khlebnikov wrote: > Cgroup2 interface for cpu bandwidth limit has some flaws: > > - on stack buffer overflow > - no checks for valid format or trailing garbage > - no checks for integer overflows > > This patch fixes all these flaws.

Re: [PATCH] blk-throttle: verify format of bandwidth limit and detect overflows

2019-03-05 Thread Tejun Heo
Hello, Konstantin. On Wed, Feb 27, 2019 at 11:05:44AM +0300, Konstantin Khlebnikov wrote: > Unlike to memory cgroup blkio throttler does not support value suffixes. > > It silently ignores everything after last digit. For example this command > will set rate limit 1 byte per second rather than 1
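The behaviour discussed in this thread, a numeric parser that silently ignores whatever follows the last digit, can be reproduced in userspace. The sketch below is only an illustration under stated assumptions: both helper names are hypothetical, the input "1M" is a made-up example value, and the strict variant is one possible way to reject trailing characters, not the approach taken by the patch.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical lenient parser: "%llu" stops at the first non-digit,
 * so a suffix after the number is silently dropped.
 * Returns 1 on success, 0 otherwise.
 */
static int parse_limit_lenient(const char *s, unsigned long long *val)
{
    return sscanf(s, "%llu", val) == 1;
}

/*
 * Hypothetical strict parser: accepts only a plain run of decimal
 * digits. Rejects empty input, a leading sign or whitespace, any
 * trailing characters, and values that do not fit in
 * unsigned long long. Returns 1 on success, 0 otherwise.
 */
static int parse_limit_strict(const char *s, unsigned long long *val)
{
    char *end;
    unsigned long long v;

    /* strtoull() itself would skip whitespace and accept a sign. */
    if (*s < '0' || *s > '9')
        return 0;

    errno = 0;
    v = strtoull(s, &end, 10);
    if (*end != '\0' || errno == ERANGE)
        return 0;

    *val = v;
    return 1;
}
```

With the made-up input "1M", the lenient helper succeeds and stores 1, while the strict helper reports failure and leaves the output untouched.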
