Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Thu, 2014-05-15 at 02:05 -0400, Tejun Heo wrote: > I'm not sure how much weight I can put on the specific use case. Even > with the direct control that the user thought to have previously, the > use case was ripe with possibilities of breakage from any number of > reasons. For example, there are driver paths which bounce to async > execution on IO exceptions (doesn't have to be hard errors) and setups > like the above would easily lock out exception handling and how's the > setup gonna work when the filesystems have to use dynamic pool of > workers as btrfs does? Oh yeah, this case isn't about _real_ realtime at all, it's just about unmanageable priority inversion. With hack, any/all kthreads spawning will start life at the priority the user specified, so as long as he doesn't prioritize userspace above that, he can do whatever evil deeds he sees fit, and box will function as expected. > The identified problem in the above case is allowing the kernel to > make reasonable forward progress even when RT processes don't concede > CPU cycles. Not only, but yeah, mostly. > If that is a use case that needs to be supported, we > better engineer an appropriate solution for that. Such solution > doesn't necessarily have to be advanced either. My solution isn't the least bit sophisticated. Dirt simple is usually best anyway, and is enough for the cases I've encountered. > Maybe all that's > necessary is marking the async mechanisms involved in IO path as such > (we already need to mark all workqueues involved in memory reclaim > path anyway) and provide a mechanism to make all of them RT when > directed. It might be simple but still would be a concious > engineering decision. You could do all singing/dancing PI boost thingy like RCU, but personally, I hate that, disable it. > I think the point I'm trying to make is that it isn't possible to > continue improving and maintaining the kernel with blanket > restrictions on internal details. If certain things shouldn't be > done, we better find out the specific reasons; otherwise, it's > impossible to weight the pros and cons of different options and make a > reasonable choice or to find out ways to accomodate those restrictions > while still achieving the original goals. > > Anyways, we're getting slightly off-topic and it seems like we'll have > to agree to disagree. Hey, we agree! :) -Mike -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
Hello, Mike. On Thu, May 15, 2014 at 07:32:29AM +0200, Mike Galbraith wrote: > On Thu, 2014-05-15 at 01:09 -0400, Tejun Heo wrote: > > > Soft/hard irq threads and anything having to do with IO mostly, which > > > including workqueues. I had to give the user a rather fugly global > > > prioritization option to let users more or less safely do the evil deeds > > > they want to and WILL do whether I agree with their motivation to do so > > > or not. I tell all users that realtime is real dangerous, but if they > > > want to do that, it's their box, so by definition perfectly fine. > > > > Frederic is working on global settings for workqueues, so that'll > > resolve some of those issues at least. > > Yeah, wrt what runs where for unbound workqueues, but not priority. Shouldn't be too difficult to extend it to cover priorities if necessary once the infrastructure is in place. > > > > If there are good enough reasons for specific ones, sure, but I don't > > > > think "we can't change any of the kthreads because someone might be > > > > diddling with it" is something we can sustain in the long term. > > > > > > I think the opposite. Taking any control the user has is pure evil. > > > > I'm not sure good/evil is the right frame to think about it. Is > > pooling worker threads evil in nature then? > > When there may be realtime consumers, yes to some extent, because it > inserts allocations he can't control directly into his world, but that's > the least of his worries. The instant userspace depends upon any kernel > proxy the user has no control over, he instantly has a priority > inversion he can do nothing about. This is exactly what happened that > prompted me to do fugly global hack. User turned pet database piggies > loose as realtime tasks for his own reasons, misguided or not, they > depend upon worker threads and kjournald et al who he can control, but > kworker threads respawn as normal tasks which can and will end up under > high priority userspace tasks. Worst case is box becomes dead, also > killing pet, best case is pet collapses to the floor in a quivering > heap. Neither makes Joe User particularly happy. I'm not sure how much weight I can put on the specific use case. Even with the direct control that the user thought to have previously, the use case was ripe with possibilities of breakage from any number of reasons. For example, there are driver paths which bounce to async execution on IO exceptions (doesn't have to be hard errors) and setups like the above would easily lock out exception handling and how's the setup gonna work when the filesystems have to use dynamic pool of workers as btrfs does? The identified problem in the above case is allowing the kernel to make reasonable forward progress even when RT processes don't concede CPU cycles. If that is a use case that needs to be supported, we better engineer an appropriate solution for that. Such solution doesn't necessarily have to be advanced either. Maybe all that's necessary is marking the async mechanisms involved in IO path as such (we already need to mark all workqueues involved in memory reclaim path anyway) and provide a mechanism to make all of them RT when directed. It might be simple but still would be a concious engineering decision. I think the point I'm trying to make is that it isn't possible to continue improving and maintaining the kernel with blanket restrictions on internal details. If certain things shouldn't be done, we better find out the specific reasons; otherwise, it's impossible to weight the pros and cons of different options and make a reasonable choice or to find out ways to accomodate those restrictions while still achieving the original goals. Anyways, we're getting slightly off-topic and it seems like we'll have to agree to disagree. Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
Hello, Mike. On Thu, May 15, 2014 at 07:32:29AM +0200, Mike Galbraith wrote: On Thu, 2014-05-15 at 01:09 -0400, Tejun Heo wrote: Soft/hard irq threads and anything having to do with IO mostly, which including workqueues. I had to give the user a rather fugly global prioritization option to let users more or less safely do the evil deeds they want to and WILL do whether I agree with their motivation to do so or not. I tell all users that realtime is real dangerous, but if they want to do that, it's their box, so by definition perfectly fine. Frederic is working on global settings for workqueues, so that'll resolve some of those issues at least. Yeah, wrt what runs where for unbound workqueues, but not priority. Shouldn't be too difficult to extend it to cover priorities if necessary once the infrastructure is in place. If there are good enough reasons for specific ones, sure, but I don't think we can't change any of the kthreads because someone might be diddling with it is something we can sustain in the long term. I think the opposite. Taking any control the user has is pure evil. I'm not sure good/evil is the right frame to think about it. Is pooling worker threads evil in nature then? When there may be realtime consumers, yes to some extent, because it inserts allocations he can't control directly into his world, but that's the least of his worries. The instant userspace depends upon any kernel proxy the user has no control over, he instantly has a priority inversion he can do nothing about. This is exactly what happened that prompted me to do fugly global hack. User turned pet database piggies loose as realtime tasks for his own reasons, misguided or not, they depend upon worker threads and kjournald et al who he can control, but kworker threads respawn as normal tasks which can and will end up under high priority userspace tasks. Worst case is box becomes dead, also killing pet, best case is pet collapses to the floor in a quivering heap. Neither makes Joe User particularly happy. I'm not sure how much weight I can put on the specific use case. Even with the direct control that the user thought to have previously, the use case was ripe with possibilities of breakage from any number of reasons. For example, there are driver paths which bounce to async execution on IO exceptions (doesn't have to be hard errors) and setups like the above would easily lock out exception handling and how's the setup gonna work when the filesystems have to use dynamic pool of workers as btrfs does? The identified problem in the above case is allowing the kernel to make reasonable forward progress even when RT processes don't concede CPU cycles. If that is a use case that needs to be supported, we better engineer an appropriate solution for that. Such solution doesn't necessarily have to be advanced either. Maybe all that's necessary is marking the async mechanisms involved in IO path as such (we already need to mark all workqueues involved in memory reclaim path anyway) and provide a mechanism to make all of them RT when directed. It might be simple but still would be a concious engineering decision. I think the point I'm trying to make is that it isn't possible to continue improving and maintaining the kernel with blanket restrictions on internal details. If certain things shouldn't be done, we better find out the specific reasons; otherwise, it's impossible to weight the pros and cons of different options and make a reasonable choice or to find out ways to accomodate those restrictions while still achieving the original goals. Anyways, we're getting slightly off-topic and it seems like we'll have to agree to disagree. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Thu, 2014-05-15 at 02:05 -0400, Tejun Heo wrote: I'm not sure how much weight I can put on the specific use case. Even with the direct control that the user thought to have previously, the use case was ripe with possibilities of breakage from any number of reasons. For example, there are driver paths which bounce to async execution on IO exceptions (doesn't have to be hard errors) and setups like the above would easily lock out exception handling and how's the setup gonna work when the filesystems have to use dynamic pool of workers as btrfs does? Oh yeah, this case isn't about _real_ realtime at all, it's just about unmanageable priority inversion. With hack, any/all kthreads spawning will start life at the priority the user specified, so as long as he doesn't prioritize userspace above that, he can do whatever evil deeds he sees fit, and box will function as expected. The identified problem in the above case is allowing the kernel to make reasonable forward progress even when RT processes don't concede CPU cycles. Not only, but yeah, mostly. If that is a use case that needs to be supported, we better engineer an appropriate solution for that. Such solution doesn't necessarily have to be advanced either. My solution isn't the least bit sophisticated. Dirt simple is usually best anyway, and is enough for the cases I've encountered. Maybe all that's necessary is marking the async mechanisms involved in IO path as such (we already need to mark all workqueues involved in memory reclaim path anyway) and provide a mechanism to make all of them RT when directed. It might be simple but still would be a concious engineering decision. You could do all singing/dancing PI boost thingy like RCU, but personally, I hate that, disable it. I think the point I'm trying to make is that it isn't possible to continue improving and maintaining the kernel with blanket restrictions on internal details. If certain things shouldn't be done, we better find out the specific reasons; otherwise, it's impossible to weight the pros and cons of different options and make a reasonable choice or to find out ways to accomodate those restrictions while still achieving the original goals. Anyways, we're getting slightly off-topic and it seems like we'll have to agree to disagree. Hey, we agree! :) -Mike -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Thu, 2014-05-15 at 01:09 -0400, Tejun Heo wrote: > Hello, Mike. > > On Thu, May 15, 2014 at 07:04:22AM +0200, Mike Galbraith wrote: > > On Thu, 2014-05-15 at 00:50 -0400, Tejun Heo wrote: > > > Do we know specific kthreads which need to be exposed with this way? > > > > Soft/hard irq threads and anything having to do with IO mostly, which > > including workqueues. I had to give the user a rather fugly global > > prioritization option to let users more or less safely do the evil deeds > > they want to and WILL do whether I agree with their motivation to do so > > or not. I tell all users that realtime is real dangerous, but if they > > want to do that, it's their box, so by definition perfectly fine. > > Frederic is working on global settings for workqueues, so that'll > resolve some of those issues at least. Yeah, wrt what runs where for unbound workqueues, but not priority. > > > If there are good enough reasons for specific ones, sure, but I don't > > > think "we can't change any of the kthreads because someone might be > > > diddling with it" is something we can sustain in the long term. > > > > I think the opposite. Taking any control the user has is pure evil. > > I'm not sure good/evil is the right frame to think about it. Is > pooling worker threads evil in nature then? When there may be realtime consumers, yes to some extent, because it inserts allocations he can't control directly into his world, but that's the least of his worries. The instant userspace depends upon any kernel proxy the user has no control over, he instantly has a priority inversion he can do nothing about. This is exactly what happened that prompted me to do fugly global hack. User turned pet database piggies loose as realtime tasks for his own reasons, misguided or not, they depend upon worker threads and kjournald et al who he can control, but kworker threads respawn as normal tasks which can and will end up under high priority userspace tasks. Worst case is box becomes dead, also killing pet, best case is pet collapses to the floor in a quivering heap. Neither makes Joe User particularly happy. -Mike -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
Hello, Mike. On Thu, May 15, 2014 at 07:04:22AM +0200, Mike Galbraith wrote: > On Thu, 2014-05-15 at 00:50 -0400, Tejun Heo wrote: > > Do we know specific kthreads which need to be exposed with this way? > > Soft/hard irq threads and anything having to do with IO mostly, which > including workqueues. I had to give the user a rather fugly global > prioritization option to let users more or less safely do the evil deeds > they want to and WILL do whether I agree with their motivation to do so > or not. I tell all users that realtime is real dangerous, but if they > want to do that, it's their box, so by definition perfectly fine. Frederic is working on global settings for workqueues, so that'll resolve some of those issues at least. > > If there are good enough reasons for specific ones, sure, but I don't > > think "we can't change any of the kthreads because someone might be > > diddling with it" is something we can sustain in the long term. > > I think the opposite. Taking any control the user has is pure evil. I'm not sure good/evil is the right frame to think about it. Is pooling worker threads evil in nature then? Even when not doing so leads to serious scalibilty issues and general poor utilization of system resources? User control, just like everything else, is one of the many aspects to be evaluated and traded off, not something to uphold religiously at all cost. Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Thu, 2014-05-15 at 00:50 -0400, Tejun Heo wrote: > Hello, Mike. > > On Thu, May 15, 2014 at 06:46:18AM +0200, Mike Galbraith wrote: > > > I think it'd be healthier to identify the use cases and then provide > > > proper interface for it. Note that workqueue can now expose interface > > > to modify concurrency, priority and cpumask to userland which > > > writeback workers are already using. > > > > You can't identify a specific thing, any/all of it can land on the > > user's diner plate, so he should be able to make the decisions. Power > > to the user and all that, if he does something stupid, tuff titty. User > > getting to call the shots, and getting to keep the pieces when he fscks > > it all up is wonderful stuff, lets kernel people off the hook :) > > Do we know specific kthreads which need to be exposed with this way? Soft/hard irq threads and anything having to do with IO mostly, which including workqueues. I had to give the user a rather fugly global prioritization option to let users more or less safely do the evil deeds they want to and WILL do whether I agree with their motivation to do so or not. I tell all users that realtime is real dangerous, but if they want to do that, it's their box, so by definition perfectly fine. > If there are good enough reasons for specific ones, sure, but I don't > think "we can't change any of the kthreads because someone might be > diddling with it" is something we can sustain in the long term. I think the opposite. Taking any control the user has is pure evil. -Mike -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
Hello, Mike. On Thu, May 15, 2014 at 06:46:18AM +0200, Mike Galbraith wrote: > > I think it'd be healthier to identify the use cases and then provide > > proper interface for it. Note that workqueue can now expose interface > > to modify concurrency, priority and cpumask to userland which > > writeback workers are already using. > > You can't identify a specific thing, any/all of it can land on the > user's diner plate, so he should be able to make the decisions. Power > to the user and all that, if he does something stupid, tuff titty. User > getting to call the shots, and getting to keep the pieces when he fscks > it all up is wonderful stuff, lets kernel people off the hook :) Do we know specific kthreads which need to be exposed with this way? If there are good enough reasons for specific ones, sure, but I don't think "we can't change any of the kthreads because someone might be diddling with it" is something we can sustain in the long term. Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Thu, 2014-05-15 at 00:06 -0400, Tejun Heo wrote: > Hey, Mike. > > On Thu, May 15, 2014 at 05:53:57AM +0200, Mike Galbraith wrote: > > Hm. The user would need to be able to identify and prioritize the > > I suppose you mean userland by "the user"? Yeah. > > things, and have his settings stick. Any dynamic pool business doing > > allocations and/or munging priorities would be highly annoying. > > There are some use cases where control over worker priority or other > attributes are necessary. I'm not sure using kthread for that reason > is a good engineering choice tho. Many of those cases end up being > accidental. It's currently the only option. For perfection, you'd have to have fine grained deterministic yada yada throughout all paths, which is kinda out for generic proxies, but it's a hell of a lot better than no control. > I think it'd be healthier to identify the use cases and then provide > proper interface for it. Note that workqueue can now expose interface > to modify concurrency, priority and cpumask to userland which > writeback workers are already using. You can't identify a specific thing, any/all of it can land on the user's diner plate, so he should be able to make the decisions. Power to the user and all that, if he does something stupid, tuff titty. User getting to call the shots, and getting to keep the pieces when he fscks it all up is wonderful stuff, lets kernel people off the hook :) -Mike -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
Hey, Mike. On Thu, May 15, 2014 at 05:53:57AM +0200, Mike Galbraith wrote: > Hm. The user would need to be able to identify and prioritize the I suppose you mean userland by "the user"? > things, and have his settings stick. Any dynamic pool business doing > allocations and/or munging priorities would be highly annoying. There are some use cases where control over worker priority or other attributes are necessary. I'm not sure using kthread for that reason is a good engineering choice tho. Many of those cases end up being accidental. I think it'd be healthier to identify the use cases and then provide proper interface for it. Note that workqueue can now expose interface to modify concurrency, priority and cpumask to userland which writeback workers are already using. In general, being restricted to using kthread internally for this reason seems wrong to me. It's too direct influence on the implementation mechanism. > I saw a case where dynamic workers inflicted a realtime regression on a > user (but what they were getting away with previously was.. horrid). Yeah, exactly. It'd be far better to identify the use case properly and provide the appropriate interface for it. That said, even if it really requires diddling with kthread directly from userland, kthread_worker can still be used. It's still one dedicated kthread but with structured usage from kernel side so that infrastructure features like freezer and possibly kgr can be implemented in a single place rather than scattered around all over the place. Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Wed, 2014-05-14 at 12:32 -0400, Tejun Heo wrote: > Hello, Jiri, Vojtech. > > On Wed, May 14, 2014 at 05:15:01PM +0200, Vojtech Pavlik wrote: > > On Wed, May 14, 2014 at 04:59:05PM +0200, Jiri Slaby wrote: > > > I see the worst case scenario. (For curious readers, it is for example > > > this kthread body: > > > while (1) { > > > some_paired_call(); /* invokes pre-patched code */ > > > if (kthread_should_stop()) { /* kgraft switches to the new code */ > > > its_paired_function(); /* invokes patched code (wrong) */ > > > break; > > > } > > > its_paired_function(); /* the same (wrong) */ > > > }) > > > > > > What to do with that now? We have come up with a couple possibilities. > > > Would you consider try_to_freeze() a good state-defining function? As it > > > is called when a kthread expects weird things can happen, it should be > > > safe to switch to the patched version in our opinion. > > > > > > The other possibility is to patch every kthread loop (~300) and insert > > > kgr_task_safe() semi-manually at some proper place. > > > > > > Or if you have any other suggestions we would appreciate that? > > > > A heretic idea would be to convert all kernel threads into functions > > that do not sleep and exit after a single iteration and are called from > > a central kthread main loop function. That would get all of > > Or converting them to use workqueues instead. Converting majority of > kthread users to workqueue is probably a good idea regardless of this > because workqueues are far easier to get right and give clear > delineation boundary between execution instances between which it's > safe to freeze and shutdown (and possibly to patch the work function). > Let alone overall lower overhead. I converted some and was planning > on converting most of them but never got around ot it. Hm. The user would need to be able to identify and prioritize the things, and have his settings stick. Any dynamic pool business doing allocations and/or munging priorities would be highly annoying. I saw a case where dynamic workers inflicted a realtime regression on a user (but what they were getting away with previously was.. horrid). -Mike -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
Hello, Jiri, Vojtech. On Wed, May 14, 2014 at 05:15:01PM +0200, Vojtech Pavlik wrote: > On Wed, May 14, 2014 at 04:59:05PM +0200, Jiri Slaby wrote: > > I see the worst case scenario. (For curious readers, it is for example > > this kthread body: > > while (1) { > > some_paired_call(); /* invokes pre-patched code */ > > if (kthread_should_stop()) { /* kgraft switches to the new code */ > > its_paired_function(); /* invokes patched code (wrong) */ > > break; > > } > > its_paired_function(); /* the same (wrong) */ > > }) > > > > What to do with that now? We have come up with a couple possibilities. > > Would you consider try_to_freeze() a good state-defining function? As it > > is called when a kthread expects weird things can happen, it should be > > safe to switch to the patched version in our opinion. > > > > The other possibility is to patch every kthread loop (~300) and insert > > kgr_task_safe() semi-manually at some proper place. > > > > Or if you have any other suggestions we would appreciate that? > > A heretic idea would be to convert all kernel threads into functions > that do not sleep and exit after a single iteration and are called from > a central kthread main loop function. That would get all of Or converting them to use workqueues instead. Converting majority of kthread users to workqueue is probably a good idea regardless of this because workqueues are far easier to get right and give clear delineation boundary between execution instances between which it's safe to freeze and shutdown (and possibly to patch the work function). Let alone overall lower overhead. I converted some and was planning on converting most of them but never got around ot it. > kthread_should_stop() and try_to_freeze() and kgr_task_safe() nicely > into one place and at the same time put enough constraint on what the > thread function can do to prevent it from breaking the assumptions of > each of these calls. Yeah, the exactly same rationales for using workqueue over kthreads. That said, even with most kthread users converted to workqueue, we'd probably want something which can really enforce correctness for the leftovers as long as we continue to expose kthread interface. Ooh, there's also kthread_worker thing which puts workqueue-like semantics on top of kthreads which can be used for whatever which can't be converted to workqueue due to special worker attributes or whatnot. So, yeah, I think there are enough tools available to put enough semantic meanings over how kthreads are used such that things like freezer or hot-code patching can be implemented in the generic framework rather than in hundred scattered places but it's likely to take a substantial amount of work. The upside is that conversions are likely beneficial on their own so they can be pushed separately. Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Wed, May 14, 2014 at 05:15:01PM +0200, Vojtech Pavlik wrote: > On Wed, May 14, 2014 at 04:59:05PM +0200, Jiri Slaby wrote: > > > I see the worst case scenario. (For curious readers, it is for example > > this kthread body: > > while (1) { > > some_paired_call(); /* invokes pre-patched code */ > > if (kthread_should_stop()) { /* kgraft switches to the new code */ > > its_paired_function(); /* invokes patched code (wrong) */ > > break; > > } > > its_paired_function(); /* the same (wrong) */ > > }) > > > > What to do with that now? We have come up with a couple possibilities. > > Would you consider try_to_freeze() a good state-defining function? As it > > is called when a kthread expects weird things can happen, it should be > > safe to switch to the patched version in our opinion. > > > > The other possibility is to patch every kthread loop (~300) and insert > > kgr_task_safe() semi-manually at some proper place. > > > > Or if you have any other suggestions we would appreciate that? > > A heretic idea would be to convert all kernel threads into functions > that do not sleep and exit after a single iteration and are called from > a central kthread main loop function. That would get all of > kthread_should_stop() and try_to_freeze() and kgr_task_safe() nicely > into one place and at the same time put enough constraint on what the > thread function can do to prevent it from breaking the assumptions of > each of these calls. Some substantial restructuring would be required for several of the kthreads I am aware of, which contain kthread_should_stop() inside loop bodies as well as on their conditions. Also, a number of them do things like wait_event() and the like, which would mean that the central kthread main loop function would need to know about the wait queues and wait conditions and handle them properly. See for example rcu_torture_barrier_cbs() and rcu_torture_barrier() in kernel/rcu/rcutorture.c [*], which wait on each other in order to test RCU's rcu_barrier() primitives. Thanx, Paul * In older kernels, this is kernel/rcu/torture.c or kernel/rcutorture.c. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Wed, May 14, 2014 at 04:59:05PM +0200, Jiri Slaby wrote: > I see the worst case scenario. (For curious readers, it is for example > this kthread body: > while (1) { > some_paired_call(); /* invokes pre-patched code */ > if (kthread_should_stop()) { /* kgraft switches to the new code */ > its_paired_function(); /* invokes patched code (wrong) */ > break; > } > its_paired_function(); /* the same (wrong) */ > }) > > What to do with that now? We have come up with a couple possibilities. > Would you consider try_to_freeze() a good state-defining function? As it > is called when a kthread expects weird things can happen, it should be > safe to switch to the patched version in our opinion. > > The other possibility is to patch every kthread loop (~300) and insert > kgr_task_safe() semi-manually at some proper place. > > Or if you have any other suggestions we would appreciate that? A heretic idea would be to convert all kernel threads into functions that do not sleep and exit after a single iteration and are called from a central kthread main loop function. That would get all of kthread_should_stop() and try_to_freeze() and kgr_task_safe() nicely into one place and at the same time put enough constraint on what the thread function can do to prevent it from breaking the assumptions of each of these calls. -- Vojtech Pavlik SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
Hi Tejun, On 05/01/2014 11:09 PM, Tejun Heo wrote: > On Thu, May 01, 2014 at 05:02:42PM -0400, Tejun Heo wrote: >> Hello, Jiri. >> >> On Thu, May 01, 2014 at 10:17:44PM +0200, Jiri Kosina wrote: >>> I agree that this expectation might really somewhat implicit and is not >>> probably properly documented anywhere. The basic observation is "whenever >>> kthread_should_stop() is being called, all data structures are in a >>> consistent state and don't need any further updates in order to achieve >>> consistency, because we can exit the loop immediately here", as >>> kthread_should_stop() is the very last thing every freezable kernel thread >> >> But kthread_should_stop() doesn't necessarily imply that "we can exit >> the loop *immediately*" at all. It just indicates that it should >> terminate in finite amount of time. I don't think it'd be too > > Just a bit of addition. Please note that kthread_should_stop(), along > with the freezer test, is actually trickier than it seems. It's very > easy to write code which works most of the time but misses wake up > from kill when the timing is just right (or wrong). It should be > interlocked with set_current_state() and other related queueing data > structure accesses. This was several years ago but when I audited > most kthread users in kernel, especially in combination with the > freezer test which also has similar requirement, surprising percentage > of users (at least several tens of pct) were getting it slightly > wrong, so kthread_should_stop() really isn't used as "we can exit > *immediately*". It just isn't that simple. I see the worst case scenario. (For curious readers, it is for example this kthread body: while (1) { some_paired_call(); /* invokes pre-patched code */ if (kthread_should_stop()) { /* kgraft switches to the new code */ its_paired_function(); /* invokes patched code (wrong) */ break; } its_paired_function(); /* the same (wrong) */ }) What to do with that now? We have come up with a couple possibilities. Would you consider try_to_freeze() a good state-defining function? As it is called when a kthread expects weird things can happen, it should be safe to switch to the patched version in our opinion. The other possibility is to patch every kthread loop (~300) and insert kgr_task_safe() semi-manually at some proper place. Or if you have any other suggestions we would appreciate that? thanks, -- js suse labs -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Wed, 2014-05-14 at 12:32 -0400, Tejun Heo wrote: Hello, Jiri, Vojtech. On Wed, May 14, 2014 at 05:15:01PM +0200, Vojtech Pavlik wrote: On Wed, May 14, 2014 at 04:59:05PM +0200, Jiri Slaby wrote: I see the worst case scenario. (For curious readers, it is for example this kthread body: while (1) { some_paired_call(); /* invokes pre-patched code */ if (kthread_should_stop()) { /* kgraft switches to the new code */ its_paired_function(); /* invokes patched code (wrong) */ break; } its_paired_function(); /* the same (wrong) */ }) What to do with that now? We have come up with a couple possibilities. Would you consider try_to_freeze() a good state-defining function? As it is called when a kthread expects weird things can happen, it should be safe to switch to the patched version in our opinion. The other possibility is to patch every kthread loop (~300) and insert kgr_task_safe() semi-manually at some proper place. Or if you have any other suggestions we would appreciate that? A heretic idea would be to convert all kernel threads into functions that do not sleep and exit after a single iteration and are called from a central kthread main loop function. That would get all of Or converting them to use workqueues instead. Converting majority of kthread users to workqueue is probably a good idea regardless of this because workqueues are far easier to get right and give clear delineation boundary between execution instances between which it's safe to freeze and shutdown (and possibly to patch the work function). Let alone overall lower overhead. I converted some and was planning on converting most of them but never got around ot it. Hm. The user would need to be able to identify and prioritize the things, and have his settings stick. Any dynamic pool business doing allocations and/or munging priorities would be highly annoying. I saw a case where dynamic workers inflicted a realtime regression on a user (but what they were getting away with previously was.. horrid). -Mike -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
Hey, Mike. On Thu, May 15, 2014 at 05:53:57AM +0200, Mike Galbraith wrote: Hm. The user would need to be able to identify and prioritize the I suppose you mean userland by the user? things, and have his settings stick. Any dynamic pool business doing allocations and/or munging priorities would be highly annoying. There are some use cases where control over worker priority or other attributes are necessary. I'm not sure using kthread for that reason is a good engineering choice tho. Many of those cases end up being accidental. I think it'd be healthier to identify the use cases and then provide proper interface for it. Note that workqueue can now expose interface to modify concurrency, priority and cpumask to userland which writeback workers are already using. In general, being restricted to using kthread internally for this reason seems wrong to me. It's too direct influence on the implementation mechanism. I saw a case where dynamic workers inflicted a realtime regression on a user (but what they were getting away with previously was.. horrid). Yeah, exactly. It'd be far better to identify the use case properly and provide the appropriate interface for it. That said, even if it really requires diddling with kthread directly from userland, kthread_worker can still be used. It's still one dedicated kthread but with structured usage from kernel side so that infrastructure features like freezer and possibly kgr can be implemented in a single place rather than scattered around all over the place. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Thu, 2014-05-15 at 00:06 -0400, Tejun Heo wrote: Hey, Mike. On Thu, May 15, 2014 at 05:53:57AM +0200, Mike Galbraith wrote: Hm. The user would need to be able to identify and prioritize the I suppose you mean userland by the user? Yeah. things, and have his settings stick. Any dynamic pool business doing allocations and/or munging priorities would be highly annoying. There are some use cases where control over worker priority or other attributes are necessary. I'm not sure using kthread for that reason is a good engineering choice tho. Many of those cases end up being accidental. It's currently the only option. For perfection, you'd have to have fine grained deterministic yada yada throughout all paths, which is kinda out for generic proxies, but it's a hell of a lot better than no control. I think it'd be healthier to identify the use cases and then provide proper interface for it. Note that workqueue can now expose interface to modify concurrency, priority and cpumask to userland which writeback workers are already using. You can't identify a specific thing, any/all of it can land on the user's diner plate, so he should be able to make the decisions. Power to the user and all that, if he does something stupid, tuff titty. User getting to call the shots, and getting to keep the pieces when he fscks it all up is wonderful stuff, lets kernel people off the hook :) -Mike -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
Hello, Mike. On Thu, May 15, 2014 at 06:46:18AM +0200, Mike Galbraith wrote: I think it'd be healthier to identify the use cases and then provide proper interface for it. Note that workqueue can now expose interface to modify concurrency, priority and cpumask to userland which writeback workers are already using. You can't identify a specific thing, any/all of it can land on the user's diner plate, so he should be able to make the decisions. Power to the user and all that, if he does something stupid, tuff titty. User getting to call the shots, and getting to keep the pieces when he fscks it all up is wonderful stuff, lets kernel people off the hook :) Do we know specific kthreads which need to be exposed with this way? If there are good enough reasons for specific ones, sure, but I don't think we can't change any of the kthreads because someone might be diddling with it is something we can sustain in the long term. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Thu, 2014-05-15 at 00:50 -0400, Tejun Heo wrote: Hello, Mike. On Thu, May 15, 2014 at 06:46:18AM +0200, Mike Galbraith wrote: I think it'd be healthier to identify the use cases and then provide proper interface for it. Note that workqueue can now expose interface to modify concurrency, priority and cpumask to userland which writeback workers are already using. You can't identify a specific thing, any/all of it can land on the user's diner plate, so he should be able to make the decisions. Power to the user and all that, if he does something stupid, tuff titty. User getting to call the shots, and getting to keep the pieces when he fscks it all up is wonderful stuff, lets kernel people off the hook :) Do we know specific kthreads which need to be exposed with this way? Soft/hard irq threads and anything having to do with IO mostly, which including workqueues. I had to give the user a rather fugly global prioritization option to let users more or less safely do the evil deeds they want to and WILL do whether I agree with their motivation to do so or not. I tell all users that realtime is real dangerous, but if they want to do that, it's their box, so by definition perfectly fine. If there are good enough reasons for specific ones, sure, but I don't think we can't change any of the kthreads because someone might be diddling with it is something we can sustain in the long term. I think the opposite. Taking any control the user has is pure evil. -Mike -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
Hello, Mike. On Thu, May 15, 2014 at 07:04:22AM +0200, Mike Galbraith wrote: On Thu, 2014-05-15 at 00:50 -0400, Tejun Heo wrote: Do we know specific kthreads which need to be exposed with this way? Soft/hard irq threads and anything having to do with IO mostly, which including workqueues. I had to give the user a rather fugly global prioritization option to let users more or less safely do the evil deeds they want to and WILL do whether I agree with their motivation to do so or not. I tell all users that realtime is real dangerous, but if they want to do that, it's their box, so by definition perfectly fine. Frederic is working on global settings for workqueues, so that'll resolve some of those issues at least. If there are good enough reasons for specific ones, sure, but I don't think we can't change any of the kthreads because someone might be diddling with it is something we can sustain in the long term. I think the opposite. Taking any control the user has is pure evil. I'm not sure good/evil is the right frame to think about it. Is pooling worker threads evil in nature then? Even when not doing so leads to serious scalibilty issues and general poor utilization of system resources? User control, just like everything else, is one of the many aspects to be evaluated and traded off, not something to uphold religiously at all cost. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Thu, 2014-05-15 at 01:09 -0400, Tejun Heo wrote: Hello, Mike. On Thu, May 15, 2014 at 07:04:22AM +0200, Mike Galbraith wrote: On Thu, 2014-05-15 at 00:50 -0400, Tejun Heo wrote: Do we know specific kthreads which need to be exposed with this way? Soft/hard irq threads and anything having to do with IO mostly, which including workqueues. I had to give the user a rather fugly global prioritization option to let users more or less safely do the evil deeds they want to and WILL do whether I agree with their motivation to do so or not. I tell all users that realtime is real dangerous, but if they want to do that, it's their box, so by definition perfectly fine. Frederic is working on global settings for workqueues, so that'll resolve some of those issues at least. Yeah, wrt what runs where for unbound workqueues, but not priority. If there are good enough reasons for specific ones, sure, but I don't think we can't change any of the kthreads because someone might be diddling with it is something we can sustain in the long term. I think the opposite. Taking any control the user has is pure evil. I'm not sure good/evil is the right frame to think about it. Is pooling worker threads evil in nature then? When there may be realtime consumers, yes to some extent, because it inserts allocations he can't control directly into his world, but that's the least of his worries. The instant userspace depends upon any kernel proxy the user has no control over, he instantly has a priority inversion he can do nothing about. This is exactly what happened that prompted me to do fugly global hack. User turned pet database piggies loose as realtime tasks for his own reasons, misguided or not, they depend upon worker threads and kjournald et al who he can control, but kworker threads respawn as normal tasks which can and will end up under high priority userspace tasks. Worst case is box becomes dead, also killing pet, best case is pet collapses to the floor in a quivering heap. Neither makes Joe User particularly happy. -Mike -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
Hi Tejun, On 05/01/2014 11:09 PM, Tejun Heo wrote: On Thu, May 01, 2014 at 05:02:42PM -0400, Tejun Heo wrote: Hello, Jiri. On Thu, May 01, 2014 at 10:17:44PM +0200, Jiri Kosina wrote: I agree that this expectation might really somewhat implicit and is not probably properly documented anywhere. The basic observation is whenever kthread_should_stop() is being called, all data structures are in a consistent state and don't need any further updates in order to achieve consistency, because we can exit the loop immediately here, as kthread_should_stop() is the very last thing every freezable kernel thread But kthread_should_stop() doesn't necessarily imply that we can exit the loop *immediately* at all. It just indicates that it should terminate in finite amount of time. I don't think it'd be too Just a bit of addition. Please note that kthread_should_stop(), along with the freezer test, is actually trickier than it seems. It's very easy to write code which works most of the time but misses wake up from kill when the timing is just right (or wrong). It should be interlocked with set_current_state() and other related queueing data structure accesses. This was several years ago but when I audited most kthread users in kernel, especially in combination with the freezer test which also has similar requirement, surprising percentage of users (at least several tens of pct) were getting it slightly wrong, so kthread_should_stop() really isn't used as we can exit *immediately*. It just isn't that simple. I see the worst case scenario. (For curious readers, it is for example this kthread body: while (1) { some_paired_call(); /* invokes pre-patched code */ if (kthread_should_stop()) { /* kgraft switches to the new code */ its_paired_function(); /* invokes patched code (wrong) */ break; } its_paired_function(); /* the same (wrong) */ }) What to do with that now? We have come up with a couple possibilities. Would you consider try_to_freeze() a good state-defining function? As it is called when a kthread expects weird things can happen, it should be safe to switch to the patched version in our opinion. The other possibility is to patch every kthread loop (~300) and insert kgr_task_safe() semi-manually at some proper place. Or if you have any other suggestions we would appreciate that? thanks, -- js suse labs -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Wed, May 14, 2014 at 04:59:05PM +0200, Jiri Slaby wrote: I see the worst case scenario. (For curious readers, it is for example this kthread body: while (1) { some_paired_call(); /* invokes pre-patched code */ if (kthread_should_stop()) { /* kgraft switches to the new code */ its_paired_function(); /* invokes patched code (wrong) */ break; } its_paired_function(); /* the same (wrong) */ }) What to do with that now? We have come up with a couple possibilities. Would you consider try_to_freeze() a good state-defining function? As it is called when a kthread expects weird things can happen, it should be safe to switch to the patched version in our opinion. The other possibility is to patch every kthread loop (~300) and insert kgr_task_safe() semi-manually at some proper place. Or if you have any other suggestions we would appreciate that? A heretic idea would be to convert all kernel threads into functions that do not sleep and exit after a single iteration and are called from a central kthread main loop function. That would get all of kthread_should_stop() and try_to_freeze() and kgr_task_safe() nicely into one place and at the same time put enough constraint on what the thread function can do to prevent it from breaking the assumptions of each of these calls. -- Vojtech Pavlik SUSE Labs -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Wed, May 14, 2014 at 05:15:01PM +0200, Vojtech Pavlik wrote: On Wed, May 14, 2014 at 04:59:05PM +0200, Jiri Slaby wrote: I see the worst case scenario. (For curious readers, it is for example this kthread body: while (1) { some_paired_call(); /* invokes pre-patched code */ if (kthread_should_stop()) { /* kgraft switches to the new code */ its_paired_function(); /* invokes patched code (wrong) */ break; } its_paired_function(); /* the same (wrong) */ }) What to do with that now? We have come up with a couple possibilities. Would you consider try_to_freeze() a good state-defining function? As it is called when a kthread expects weird things can happen, it should be safe to switch to the patched version in our opinion. The other possibility is to patch every kthread loop (~300) and insert kgr_task_safe() semi-manually at some proper place. Or if you have any other suggestions we would appreciate that? A heretic idea would be to convert all kernel threads into functions that do not sleep and exit after a single iteration and are called from a central kthread main loop function. That would get all of kthread_should_stop() and try_to_freeze() and kgr_task_safe() nicely into one place and at the same time put enough constraint on what the thread function can do to prevent it from breaking the assumptions of each of these calls. Some substantial restructuring would be required for several of the kthreads I am aware of, which contain kthread_should_stop() inside loop bodies as well as on their conditions. Also, a number of them do things like wait_event() and the like, which would mean that the central kthread main loop function would need to know about the wait queues and wait conditions and handle them properly. See for example rcu_torture_barrier_cbs() and rcu_torture_barrier() in kernel/rcu/rcutorture.c [*], which wait on each other in order to test RCU's rcu_barrier() primitives. Thanx, Paul * In older kernels, this is kernel/rcu/torture.c or kernel/rcutorture.c. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
Hello, Jiri, Vojtech. On Wed, May 14, 2014 at 05:15:01PM +0200, Vojtech Pavlik wrote: On Wed, May 14, 2014 at 04:59:05PM +0200, Jiri Slaby wrote: I see the worst case scenario. (For curious readers, it is for example this kthread body: while (1) { some_paired_call(); /* invokes pre-patched code */ if (kthread_should_stop()) { /* kgraft switches to the new code */ its_paired_function(); /* invokes patched code (wrong) */ break; } its_paired_function(); /* the same (wrong) */ }) What to do with that now? We have come up with a couple possibilities. Would you consider try_to_freeze() a good state-defining function? As it is called when a kthread expects weird things can happen, it should be safe to switch to the patched version in our opinion. The other possibility is to patch every kthread loop (~300) and insert kgr_task_safe() semi-manually at some proper place. Or if you have any other suggestions we would appreciate that? A heretic idea would be to convert all kernel threads into functions that do not sleep and exit after a single iteration and are called from a central kthread main loop function. That would get all of Or converting them to use workqueues instead. Converting majority of kthread users to workqueue is probably a good idea regardless of this because workqueues are far easier to get right and give clear delineation boundary between execution instances between which it's safe to freeze and shutdown (and possibly to patch the work function). Let alone overall lower overhead. I converted some and was planning on converting most of them but never got around ot it. kthread_should_stop() and try_to_freeze() and kgr_task_safe() nicely into one place and at the same time put enough constraint on what the thread function can do to prevent it from breaking the assumptions of each of these calls. Yeah, the exactly same rationales for using workqueue over kthreads. That said, even with most kthread users converted to workqueue, we'd probably want something which can really enforce correctness for the leftovers as long as we continue to expose kthread interface. Ooh, there's also kthread_worker thing which puts workqueue-like semantics on top of kthreads which can be used for whatever which can't be converted to workqueue due to special worker attributes or whatnot. So, yeah, I think there are enough tools available to put enough semantic meanings over how kthreads are used such that things like freezer or hot-code patching can be implemented in the generic framework rather than in hundred scattered places but it's likely to take a substantial amount of work. The upside is that conversions are likely beneficial on their own so they can be pushed separately. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Thu, May 01, 2014 at 05:02:42PM -0400, Tejun Heo wrote: > Hello, Jiri. > > On Thu, May 01, 2014 at 10:17:44PM +0200, Jiri Kosina wrote: > > I agree that this expectation might really somewhat implicit and is not > > probably properly documented anywhere. The basic observation is "whenever > > kthread_should_stop() is being called, all data structures are in a > > consistent state and don't need any further updates in order to achieve > > consistency, because we can exit the loop immediately here", as > > kthread_should_stop() is the very last thing every freezable kernel thread > > But kthread_should_stop() doesn't necessarily imply that "we can exit > the loop *immediately*" at all. It just indicates that it should > terminate in finite amount of time. I don't think it'd be too Just a bit of addition. Please note that kthread_should_stop(), along with the freezer test, is actually trickier than it seems. It's very easy to write code which works most of the time but misses wake up from kill when the timing is just right (or wrong). It should be interlocked with set_current_state() and other related queueing data structure accesses. This was several years ago but when I audited most kthread users in kernel, especially in combination with the freezer test which also has similar requirement, surprising percentage of users (at least several tens of pct) were getting it slightly wrong, so kthread_should_stop() really isn't used as "we can exit *immediately*". It just isn't that simple. Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
Hello, Jiri. On Thu, May 01, 2014 at 10:17:44PM +0200, Jiri Kosina wrote: > I agree that this expectation might really somewhat implicit and is not > probably properly documented anywhere. The basic observation is "whenever > kthread_should_stop() is being called, all data structures are in a > consistent state and don't need any further updates in order to achieve > consistency, because we can exit the loop immediately here", as > kthread_should_stop() is the very last thing every freezable kernel thread But kthread_should_stop() doesn't necessarily imply that "we can exit the loop *immediately*" at all. It just indicates that it should terminate in finite amount of time. I don't think it'd be too difficult to find cases where kthreads do some stuff before returning after testing kthread_should_stop(). e.g. after pending changes, workqueue rescuers do one final loop over pending work items after kthread_should_stop() tests positive to ensure empty queue on exit. Please note that there's no expectation of discontinuity over the test. The users may carry over any state across the test as they see fit. > is calling before starting a new iteration. > > For the sake of collecting data points -- do you happen to have any > counter-example to the assumption? Just grep for kthread_should_stop() and look for the ones which doesn't immediately perform return? I think there are more which don't return *immediately*. You'd have to audit each and everyone to determine that they don't carry over states across the test. Most will hopefully be trivial but not all. More importantly, sounds like a maintenance nightmare to me without any means to guarantee, or even reasonably increase, correctness. Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Thu, 1 May 2014, Tejun Heo wrote: > > Some threads do not use kthread_should_stop. Before we enable a > > Haven't really following kgraft development but is it safe to assume > that all kthread_should_stop() usages are clean side-effect-less > boundaries? If so, why is that property guaranteed? Is there any > mechanism for sanity checks? Maybe I'm just failing to understand how > the whole thing is supposed to work but this looks like it could > devolve into something more broken than the freezer which we haven't > fully recovered from yet. Hi Tejun, first, thanks a lot for review. I agree that this expectation might really somewhat implicit and is not probably properly documented anywhere. The basic observation is "whenever kthread_should_stop() is being called, all data structures are in a consistent state and don't need any further updates in order to achieve consistency, because we can exit the loop immediately here", as kthread_should_stop() is the very last thing every freezable kernel thread is calling before starting a new iteration. For the sake of collecting data points -- do you happen to have any counter-example to the assumption? Thanks, -- Jiri Kosina SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Wed, Apr 30, 2014 at 04:30:42PM +0200, Jiri Slaby wrote: > Some threads do not use kthread_should_stop. Before we enable a Haven't really following kgraft development but is it safe to assume that all kthread_should_stop() usages are clean side-effect-less boundaries? If so, why is that property guaranteed? Is there any mechanism for sanity checks? Maybe I'm just failing to understand how the whole thing is supposed to work but this looks like it could devolve into something more broken than the freezer which we haven't fully recovered from yet. Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Wed, Apr 30, 2014 at 04:30:42PM +0200, Jiri Slaby wrote: Some threads do not use kthread_should_stop. Before we enable a Haven't really following kgraft development but is it safe to assume that all kthread_should_stop() usages are clean side-effect-less boundaries? If so, why is that property guaranteed? Is there any mechanism for sanity checks? Maybe I'm just failing to understand how the whole thing is supposed to work but this looks like it could devolve into something more broken than the freezer which we haven't fully recovered from yet. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Thu, 1 May 2014, Tejun Heo wrote: Some threads do not use kthread_should_stop. Before we enable a Haven't really following kgraft development but is it safe to assume that all kthread_should_stop() usages are clean side-effect-less boundaries? If so, why is that property guaranteed? Is there any mechanism for sanity checks? Maybe I'm just failing to understand how the whole thing is supposed to work but this looks like it could devolve into something more broken than the freezer which we haven't fully recovered from yet. Hi Tejun, first, thanks a lot for review. I agree that this expectation might really somewhat implicit and is not probably properly documented anywhere. The basic observation is whenever kthread_should_stop() is being called, all data structures are in a consistent state and don't need any further updates in order to achieve consistency, because we can exit the loop immediately here, as kthread_should_stop() is the very last thing every freezable kernel thread is calling before starting a new iteration. For the sake of collecting data points -- do you happen to have any counter-example to the assumption? Thanks, -- Jiri Kosina SUSE Labs -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
Hello, Jiri. On Thu, May 01, 2014 at 10:17:44PM +0200, Jiri Kosina wrote: I agree that this expectation might really somewhat implicit and is not probably properly documented anywhere. The basic observation is whenever kthread_should_stop() is being called, all data structures are in a consistent state and don't need any further updates in order to achieve consistency, because we can exit the loop immediately here, as kthread_should_stop() is the very last thing every freezable kernel thread But kthread_should_stop() doesn't necessarily imply that we can exit the loop *immediately* at all. It just indicates that it should terminate in finite amount of time. I don't think it'd be too difficult to find cases where kthreads do some stuff before returning after testing kthread_should_stop(). e.g. after pending changes, workqueue rescuers do one final loop over pending work items after kthread_should_stop() tests positive to ensure empty queue on exit. Please note that there's no expectation of discontinuity over the test. The users may carry over any state across the test as they see fit. is calling before starting a new iteration. For the sake of collecting data points -- do you happen to have any counter-example to the assumption? Just grep for kthread_should_stop() and look for the ones which doesn't immediately perform return? I think there are more which don't return *immediately*. You'd have to audit each and everyone to determine that they don't carry over states across the test. Most will hopefully be trivial but not all. More importantly, sounds like a maintenance nightmare to me without any means to guarantee, or even reasonably increase, correctness. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Thu, May 01, 2014 at 05:02:42PM -0400, Tejun Heo wrote: Hello, Jiri. On Thu, May 01, 2014 at 10:17:44PM +0200, Jiri Kosina wrote: I agree that this expectation might really somewhat implicit and is not probably properly documented anywhere. The basic observation is whenever kthread_should_stop() is being called, all data structures are in a consistent state and don't need any further updates in order to achieve consistency, because we can exit the loop immediately here, as kthread_should_stop() is the very last thing every freezable kernel thread But kthread_should_stop() doesn't necessarily imply that we can exit the loop *immediately* at all. It just indicates that it should terminate in finite amount of time. I don't think it'd be too Just a bit of addition. Please note that kthread_should_stop(), along with the freezer test, is actually trickier than it seems. It's very easy to write code which works most of the time but misses wake up from kill when the timing is just right (or wrong). It should be interlocked with set_current_state() and other related queueing data structure accesses. This was several years ago but when I audited most kthread users in kernel, especially in combination with the freezer test which also has similar requirement, surprising percentage of users (at least several tens of pct) were getting it slightly wrong, so kthread_should_stop() really isn't used as we can exit *immediately*. It just isn't that simple. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Wed, Apr 30, 2014 at 08:33:27PM +0200, Vojtech Pavlik wrote: > On Wed, Apr 30, 2014 at 09:55:32AM -0700, Paul E. McKenney wrote: > > On Wed, Apr 30, 2014 at 04:30:42PM +0200, Jiri Slaby wrote: > > > Some threads do not use kthread_should_stop. Before we enable a > > > kthread support in kgr, we must make sure all those mark themselves > > > safe explicitly. > > > > Would it make sense to bury kgr_task_safe() in wait_event_interruptible() > > and friends? The kgr_task_safe() implementation looks pretty lightweight, > > so it should not be a performance problem. > > For userspace tasks, the kGraft in progress flag is cleared when > entering or exiting userspace. At that point it is safe to switch the > task to a post-patch world view. > > For kernel threads, it's a bit more complicated: They never exit the > kernel, they keep executing within the kernel continuously. The > kgr_task_safe() call is thus inserted at a location within the main loop > where a 'new loop' begins - where there are no dependencies on results > of calls of functions from the previous loop. > > Hence, putting kgr_task_safe() into every wait_event_interruptible() > wouldn't work, only a few of them are at that strategic spot where a > 'new loop' can be indicated to kGraft. > > The reason kgr_task_safe() is called from within the condition > evaluation statement in wait_event_interruptible() in this patch is > because we want it to be called as soon as a new loop begins - even if > that loop is empty because the condition to stop waiting has not been > met. > > This also means that kGraft currently cannot patch the main loops of > kernel threads themselves as the thread of execution never exits them. > > Jiří (Slabý) has some ideas about how to do without calling > kgr_task_safe() from within the kernel thread main loops, but for now, > the goal is to keep things simple and easy to understand. OK, from an RCU perspective: Acked-by: Paul E. McKenney > > One reason might this might be a bad idea is that there are calls to > > wait_event_interruptible() all over the place, which might therefore > > constrain where grafting could be safely done. That would be fair enough, > > but does that also imply new constraints on where kthread_should_stop() > > can be invoked? Any new constraints might not be a big deal given that > > a very large fraction of the kthreads (and maybe all of them) invoke > > kthread_should_stop() from their top-level function, but would be good > > to call out. > > > So, what is the story? > > kGraft currently assumes that kthread_should_stop() is always in a part > of the main loop which doesn't carry over effect dependencies from the > previous iteration. This is currently true for all the uses of > kthread_should_stop(), but indeed it is an additional constraint for the > future. Got it. It would be good to document this. ;-) Thanx, Paul > Vojtech > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Wed, Apr 30, 2014 at 09:55:32AM -0700, Paul E. McKenney wrote: > On Wed, Apr 30, 2014 at 04:30:42PM +0200, Jiri Slaby wrote: > > Some threads do not use kthread_should_stop. Before we enable a > > kthread support in kgr, we must make sure all those mark themselves > > safe explicitly. > > Would it make sense to bury kgr_task_safe() in wait_event_interruptible() > and friends? The kgr_task_safe() implementation looks pretty lightweight, > so it should not be a performance problem. For userspace tasks, the kGraft in progress flag is cleared when entering or exiting userspace. At that point it is safe to switch the task to a post-patch world view. For kernel threads, it's a bit more complicated: They never exit the kernel, they keep executing within the kernel continuously. The kgr_task_safe() call is thus inserted at a location within the main loop where a 'new loop' begins - where there are no dependencies on results of calls of functions from the previous loop. Hence, putting kgr_task_safe() into every wait_event_interruptible() wouldn't work, only a few of them are at that strategic spot where a 'new loop' can be indicated to kGraft. The reason kgr_task_safe() is called from within the condition evaluation statement in wait_event_interruptible() in this patch is because we want it to be called as soon as a new loop begins - even if that loop is empty because the condition to stop waiting has not been met. This also means that kGraft currently cannot patch the main loops of kernel threads themselves as the thread of execution never exits them. Jiří (Slabý) has some ideas about how to do without calling kgr_task_safe() from within the kernel thread main loops, but for now, the goal is to keep things simple and easy to understand. > One reason might this might be a bad idea is that there are calls to > wait_event_interruptible() all over the place, which might therefore > constrain where grafting could be safely done. That would be fair enough, > but does that also imply new constraints on where kthread_should_stop() > can be invoked? Any new constraints might not be a big deal given that > a very large fraction of the kthreads (and maybe all of them) invoke > kthread_should_stop() from their top-level function, but would be good > to call out. > So, what is the story? kGraft currently assumes that kthread_should_stop() is always in a part of the main loop which doesn't carry over effect dependencies from the previous iteration. This is currently true for all the uses of kthread_should_stop(), but indeed it is an additional constraint for the future. Vojtech -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Wed, Apr 30, 2014 at 04:30:42PM +0200, Jiri Slaby wrote: > Some threads do not use kthread_should_stop. Before we enable a > kthread support in kgr, we must make sure all those mark themselves > safe explicitly. Would it make sense to bury kgr_task_safe() in wait_event_interruptible() and friends? The kgr_task_safe() implementation looks pretty lightweight, so it should not be a performance problem. One reason might this might be a bad idea is that there are calls to wait_event_interruptible() all over the place, which might therefore constrain where grafting could be safely done. That would be fair enough, but does that also imply new constraints on where kthread_should_stop() can be invoked? Any new constraints might not be a big deal given that a very large fraction of the kthreads (and maybe all of them) invoke kthread_should_stop() from their top-level function, but would be good to call out. So, what is the story? Thanx, Paul > Signed-off-by: Jiri Slaby > Cc: Steven Rostedt > Cc: Frederic Weisbecker > Cc: Ingo Molnar > Cc: Greg Kroah-Hartman > Cc: "Theodore Ts'o" > Cc: Dipankar Sarma > Cc: "Paul E. McKenney" > Cc: Tejun Heo > --- > drivers/base/devtmpfs.c | 1 + > fs/jbd2/journal.c| 2 ++ > fs/notify/mark.c | 5 - > kernel/hung_task.c | 5 - > kernel/kthread.c | 3 +++ > kernel/rcu/tree.c| 6 -- > kernel/rcu/tree_plugin.h | 9 +++-- > kernel/workqueue.c | 1 + > 8 files changed, 26 insertions(+), 6 deletions(-) > > diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c > index 25798db14553..c7d52d1b8c9c 100644 > --- a/drivers/base/devtmpfs.c > +++ b/drivers/base/devtmpfs.c > @@ -387,6 +387,7 @@ static int devtmpfsd(void *p) > sys_chroot("."); > complete(_done); > while (1) { > + kgr_task_safe(current); > spin_lock(_lock); > while (requests) { > struct req *req = requests; > diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c > index 67b8e303946c..1b9c4c2e014a 100644 > --- a/fs/jbd2/journal.c > +++ b/fs/jbd2/journal.c > @@ -43,6 +43,7 @@ > #include > #include > #include > +#include > > #define CREATE_TRACE_POINTS > #include > @@ -260,6 +261,7 @@ loop: > write_lock(>j_state_lock); > } > finish_wait(>j_wait_commit, ); > + kgr_task_safe(current); > } > > jbd_debug(1, "kjournald2 wakes\n"); > diff --git a/fs/notify/mark.c b/fs/notify/mark.c > index 923fe4a5f503..a74b6175e645 100644 > --- a/fs/notify/mark.c > +++ b/fs/notify/mark.c > @@ -82,6 +82,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -355,7 +356,9 @@ static int fsnotify_mark_destroy(void *ignored) > fsnotify_put_mark(mark); > } > > - wait_event_interruptible(destroy_waitq, > !list_empty(_list)); > + wait_event_interruptible(destroy_waitq, ({ > + kgr_task_safe(current); > + !list_empty(_list); })); > } > > return 0; > diff --git a/kernel/hung_task.c b/kernel/hung_task.c > index 06bb1417b063..b5f85bff2509 100644 > --- a/kernel/hung_task.c > +++ b/kernel/hung_task.c > @@ -14,6 +14,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -227,8 +228,10 @@ static int watchdog(void *dummy) > for ( ; ; ) { > unsigned long timeout = sysctl_hung_task_timeout_secs; > > - while (schedule_timeout_interruptible(timeout_jiffies(timeout))) > + while > (schedule_timeout_interruptible(timeout_jiffies(timeout))) { > + kgr_task_safe(current); > timeout = sysctl_hung_task_timeout_secs; > + } > > if (atomic_xchg(_hung_task, 0)) > continue; > diff --git a/kernel/kthread.c b/kernel/kthread.c > index 9a130ec06f7a..08b979dad619 100644 > --- a/kernel/kthread.c > +++ b/kernel/kthread.c > @@ -78,6 +78,8 @@ static struct kthread *to_live_kthread(struct task_struct > *k) > */ > bool kthread_should_stop(void) > { > + kgr_task_safe(current); > + > return test_bit(KTHREAD_SHOULD_STOP, _kthread(current)->flags); > } > EXPORT_SYMBOL(kthread_should_stop); > @@ -497,6 +499,7 @@ int kthreadd(void *unused) > if (list_empty(_create_list)) > schedule(); > __set_current_state(TASK_RUNNING); > + kgr_task_safe(current); > > spin_lock(_create_lock); > while (!list_empty(_create_list)) { > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c > index 0c47e300210a..5dddedacfc06 100644 > --- a/kernel/rcu/tree.c > +++ b/kernel/rcu/tree.c > @@ -1593,9 +1593,10 @@ static int __noreturn rcu_gp_kthread(void *arg) >
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Wed, Apr 30, 2014 at 04:30:42PM +0200, Jiri Slaby wrote: > Some threads do not use kthread_should_stop. Before we enable a > kthread support in kgr, we must make sure all those mark themselves > safe explicitly. > > Signed-off-by: Jiri Slaby > Cc: Steven Rostedt > Cc: Frederic Weisbecker > Cc: Ingo Molnar > Cc: Greg Kroah-Hartman > Cc: "Theodore Ts'o" > Cc: Dipankar Sarma > Cc: "Paul E. McKenney" > Cc: Tejun Heo > --- > drivers/base/devtmpfs.c | 1 + Acked-by: Greg Kroah-Hartman -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[RFC 09/16] kgr: mark task_safe in some kthreads
Some threads do not use kthread_should_stop. Before we enable a kthread support in kgr, we must make sure all those mark themselves safe explicitly. Signed-off-by: Jiri Slaby Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: Greg Kroah-Hartman Cc: "Theodore Ts'o" Cc: Dipankar Sarma Cc: "Paul E. McKenney" Cc: Tejun Heo --- drivers/base/devtmpfs.c | 1 + fs/jbd2/journal.c| 2 ++ fs/notify/mark.c | 5 - kernel/hung_task.c | 5 - kernel/kthread.c | 3 +++ kernel/rcu/tree.c| 6 -- kernel/rcu/tree_plugin.h | 9 +++-- kernel/workqueue.c | 1 + 8 files changed, 26 insertions(+), 6 deletions(-) diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c index 25798db14553..c7d52d1b8c9c 100644 --- a/drivers/base/devtmpfs.c +++ b/drivers/base/devtmpfs.c @@ -387,6 +387,7 @@ static int devtmpfsd(void *p) sys_chroot("."); complete(_done); while (1) { + kgr_task_safe(current); spin_lock(_lock); while (requests) { struct req *req = requests; diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c index 67b8e303946c..1b9c4c2e014a 100644 --- a/fs/jbd2/journal.c +++ b/fs/jbd2/journal.c @@ -43,6 +43,7 @@ #include #include #include +#include #define CREATE_TRACE_POINTS #include @@ -260,6 +261,7 @@ loop: write_lock(>j_state_lock); } finish_wait(>j_wait_commit, ); + kgr_task_safe(current); } jbd_debug(1, "kjournald2 wakes\n"); diff --git a/fs/notify/mark.c b/fs/notify/mark.c index 923fe4a5f503..a74b6175e645 100644 --- a/fs/notify/mark.c +++ b/fs/notify/mark.c @@ -82,6 +82,7 @@ #include #include #include +#include #include #include #include @@ -355,7 +356,9 @@ static int fsnotify_mark_destroy(void *ignored) fsnotify_put_mark(mark); } - wait_event_interruptible(destroy_waitq, !list_empty(_list)); + wait_event_interruptible(destroy_waitq, ({ + kgr_task_safe(current); + !list_empty(_list); })); } return 0; diff --git a/kernel/hung_task.c b/kernel/hung_task.c index 06bb1417b063..b5f85bff2509 100644 --- a/kernel/hung_task.c +++ b/kernel/hung_task.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include #include @@ -227,8 +228,10 @@ static int watchdog(void *dummy) for ( ; ; ) { unsigned long timeout = sysctl_hung_task_timeout_secs; - while (schedule_timeout_interruptible(timeout_jiffies(timeout))) + while (schedule_timeout_interruptible(timeout_jiffies(timeout))) { + kgr_task_safe(current); timeout = sysctl_hung_task_timeout_secs; + } if (atomic_xchg(_hung_task, 0)) continue; diff --git a/kernel/kthread.c b/kernel/kthread.c index 9a130ec06f7a..08b979dad619 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -78,6 +78,8 @@ static struct kthread *to_live_kthread(struct task_struct *k) */ bool kthread_should_stop(void) { + kgr_task_safe(current); + return test_bit(KTHREAD_SHOULD_STOP, _kthread(current)->flags); } EXPORT_SYMBOL(kthread_should_stop); @@ -497,6 +499,7 @@ int kthreadd(void *unused) if (list_empty(_create_list)) schedule(); __set_current_state(TASK_RUNNING); + kgr_task_safe(current); spin_lock(_create_lock); while (!list_empty(_create_list)) { diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 0c47e300210a..5dddedacfc06 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1593,9 +1593,10 @@ static int __noreturn rcu_gp_kthread(void *arg) trace_rcu_grace_period(rsp->name, ACCESS_ONCE(rsp->gpnum), TPS("reqwait")); - wait_event_interruptible(rsp->gp_wq, + wait_event_interruptible(rsp->gp_wq, ({ +kgr_task_safe(current); ACCESS_ONCE(rsp->gp_flags) & -RCU_GP_FLAG_INIT); +RCU_GP_FLAG_INIT; })); /* Locking provides needed memory barrier. */ if (rcu_gp_init(rsp)) break; @@ -1626,6 +1627,7 @@ static int __noreturn rcu_gp_kthread(void *arg) (!ACCESS_ONCE(rnp->qsmask) && !rcu_preempt_blocked_readers_cgp(rnp)), j); +
[RFC 09/16] kgr: mark task_safe in some kthreads
Some threads do not use kthread_should_stop. Before we enable a kthread support in kgr, we must make sure all those mark themselves safe explicitly. Signed-off-by: Jiri Slaby jsl...@suse.cz Cc: Steven Rostedt rost...@goodmis.org Cc: Frederic Weisbecker fweis...@gmail.com Cc: Ingo Molnar mi...@redhat.com Cc: Greg Kroah-Hartman gre...@linuxfoundation.org Cc: Theodore Ts'o ty...@mit.edu Cc: Dipankar Sarma dipan...@in.ibm.com Cc: Paul E. McKenney paul...@linux.vnet.ibm.com Cc: Tejun Heo t...@kernel.org --- drivers/base/devtmpfs.c | 1 + fs/jbd2/journal.c| 2 ++ fs/notify/mark.c | 5 - kernel/hung_task.c | 5 - kernel/kthread.c | 3 +++ kernel/rcu/tree.c| 6 -- kernel/rcu/tree_plugin.h | 9 +++-- kernel/workqueue.c | 1 + 8 files changed, 26 insertions(+), 6 deletions(-) diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c index 25798db14553..c7d52d1b8c9c 100644 --- a/drivers/base/devtmpfs.c +++ b/drivers/base/devtmpfs.c @@ -387,6 +387,7 @@ static int devtmpfsd(void *p) sys_chroot(.); complete(setup_done); while (1) { + kgr_task_safe(current); spin_lock(req_lock); while (requests) { struct req *req = requests; diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c index 67b8e303946c..1b9c4c2e014a 100644 --- a/fs/jbd2/journal.c +++ b/fs/jbd2/journal.c @@ -43,6 +43,7 @@ #include linux/backing-dev.h #include linux/bitops.h #include linux/ratelimit.h +#include linux/sched.h #define CREATE_TRACE_POINTS #include trace/events/jbd2.h @@ -260,6 +261,7 @@ loop: write_lock(journal-j_state_lock); } finish_wait(journal-j_wait_commit, wait); + kgr_task_safe(current); } jbd_debug(1, kjournald2 wakes\n); diff --git a/fs/notify/mark.c b/fs/notify/mark.c index 923fe4a5f503..a74b6175e645 100644 --- a/fs/notify/mark.c +++ b/fs/notify/mark.c @@ -82,6 +82,7 @@ #include linux/kthread.h #include linux/module.h #include linux/mutex.h +#include linux/sched.h #include linux/slab.h #include linux/spinlock.h #include linux/srcu.h @@ -355,7 +356,9 @@ static int fsnotify_mark_destroy(void *ignored) fsnotify_put_mark(mark); } - wait_event_interruptible(destroy_waitq, !list_empty(destroy_list)); + wait_event_interruptible(destroy_waitq, ({ + kgr_task_safe(current); + !list_empty(destroy_list); })); } return 0; diff --git a/kernel/hung_task.c b/kernel/hung_task.c index 06bb1417b063..b5f85bff2509 100644 --- a/kernel/hung_task.c +++ b/kernel/hung_task.c @@ -14,6 +14,7 @@ #include linux/kthread.h #include linux/lockdep.h #include linux/export.h +#include linux/sched.h #include linux/sysctl.h #include linux/utsname.h #include trace/events/sched.h @@ -227,8 +228,10 @@ static int watchdog(void *dummy) for ( ; ; ) { unsigned long timeout = sysctl_hung_task_timeout_secs; - while (schedule_timeout_interruptible(timeout_jiffies(timeout))) + while (schedule_timeout_interruptible(timeout_jiffies(timeout))) { + kgr_task_safe(current); timeout = sysctl_hung_task_timeout_secs; + } if (atomic_xchg(reset_hung_task, 0)) continue; diff --git a/kernel/kthread.c b/kernel/kthread.c index 9a130ec06f7a..08b979dad619 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -78,6 +78,8 @@ static struct kthread *to_live_kthread(struct task_struct *k) */ bool kthread_should_stop(void) { + kgr_task_safe(current); + return test_bit(KTHREAD_SHOULD_STOP, to_kthread(current)-flags); } EXPORT_SYMBOL(kthread_should_stop); @@ -497,6 +499,7 @@ int kthreadd(void *unused) if (list_empty(kthread_create_list)) schedule(); __set_current_state(TASK_RUNNING); + kgr_task_safe(current); spin_lock(kthread_create_lock); while (!list_empty(kthread_create_list)) { diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 0c47e300210a..5dddedacfc06 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1593,9 +1593,10 @@ static int __noreturn rcu_gp_kthread(void *arg) trace_rcu_grace_period(rsp-name, ACCESS_ONCE(rsp-gpnum), TPS(reqwait)); - wait_event_interruptible(rsp-gp_wq, + wait_event_interruptible(rsp-gp_wq, ({ +kgr_task_safe(current); ACCESS_ONCE(rsp-gp_flags) -RCU_GP_FLAG_INIT);
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Wed, Apr 30, 2014 at 04:30:42PM +0200, Jiri Slaby wrote: Some threads do not use kthread_should_stop. Before we enable a kthread support in kgr, we must make sure all those mark themselves safe explicitly. Signed-off-by: Jiri Slaby jsl...@suse.cz Cc: Steven Rostedt rost...@goodmis.org Cc: Frederic Weisbecker fweis...@gmail.com Cc: Ingo Molnar mi...@redhat.com Cc: Greg Kroah-Hartman gre...@linuxfoundation.org Cc: Theodore Ts'o ty...@mit.edu Cc: Dipankar Sarma dipan...@in.ibm.com Cc: Paul E. McKenney paul...@linux.vnet.ibm.com Cc: Tejun Heo t...@kernel.org --- drivers/base/devtmpfs.c | 1 + Acked-by: Greg Kroah-Hartman gre...@linuxfoundation.org -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Wed, Apr 30, 2014 at 04:30:42PM +0200, Jiri Slaby wrote: Some threads do not use kthread_should_stop. Before we enable a kthread support in kgr, we must make sure all those mark themselves safe explicitly. Would it make sense to bury kgr_task_safe() in wait_event_interruptible() and friends? The kgr_task_safe() implementation looks pretty lightweight, so it should not be a performance problem. One reason might this might be a bad idea is that there are calls to wait_event_interruptible() all over the place, which might therefore constrain where grafting could be safely done. That would be fair enough, but does that also imply new constraints on where kthread_should_stop() can be invoked? Any new constraints might not be a big deal given that a very large fraction of the kthreads (and maybe all of them) invoke kthread_should_stop() from their top-level function, but would be good to call out. So, what is the story? Thanx, Paul Signed-off-by: Jiri Slaby jsl...@suse.cz Cc: Steven Rostedt rost...@goodmis.org Cc: Frederic Weisbecker fweis...@gmail.com Cc: Ingo Molnar mi...@redhat.com Cc: Greg Kroah-Hartman gre...@linuxfoundation.org Cc: Theodore Ts'o ty...@mit.edu Cc: Dipankar Sarma dipan...@in.ibm.com Cc: Paul E. McKenney paul...@linux.vnet.ibm.com Cc: Tejun Heo t...@kernel.org --- drivers/base/devtmpfs.c | 1 + fs/jbd2/journal.c| 2 ++ fs/notify/mark.c | 5 - kernel/hung_task.c | 5 - kernel/kthread.c | 3 +++ kernel/rcu/tree.c| 6 -- kernel/rcu/tree_plugin.h | 9 +++-- kernel/workqueue.c | 1 + 8 files changed, 26 insertions(+), 6 deletions(-) diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c index 25798db14553..c7d52d1b8c9c 100644 --- a/drivers/base/devtmpfs.c +++ b/drivers/base/devtmpfs.c @@ -387,6 +387,7 @@ static int devtmpfsd(void *p) sys_chroot(.); complete(setup_done); while (1) { + kgr_task_safe(current); spin_lock(req_lock); while (requests) { struct req *req = requests; diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c index 67b8e303946c..1b9c4c2e014a 100644 --- a/fs/jbd2/journal.c +++ b/fs/jbd2/journal.c @@ -43,6 +43,7 @@ #include linux/backing-dev.h #include linux/bitops.h #include linux/ratelimit.h +#include linux/sched.h #define CREATE_TRACE_POINTS #include trace/events/jbd2.h @@ -260,6 +261,7 @@ loop: write_lock(journal-j_state_lock); } finish_wait(journal-j_wait_commit, wait); + kgr_task_safe(current); } jbd_debug(1, kjournald2 wakes\n); diff --git a/fs/notify/mark.c b/fs/notify/mark.c index 923fe4a5f503..a74b6175e645 100644 --- a/fs/notify/mark.c +++ b/fs/notify/mark.c @@ -82,6 +82,7 @@ #include linux/kthread.h #include linux/module.h #include linux/mutex.h +#include linux/sched.h #include linux/slab.h #include linux/spinlock.h #include linux/srcu.h @@ -355,7 +356,9 @@ static int fsnotify_mark_destroy(void *ignored) fsnotify_put_mark(mark); } - wait_event_interruptible(destroy_waitq, !list_empty(destroy_list)); + wait_event_interruptible(destroy_waitq, ({ + kgr_task_safe(current); + !list_empty(destroy_list); })); } return 0; diff --git a/kernel/hung_task.c b/kernel/hung_task.c index 06bb1417b063..b5f85bff2509 100644 --- a/kernel/hung_task.c +++ b/kernel/hung_task.c @@ -14,6 +14,7 @@ #include linux/kthread.h #include linux/lockdep.h #include linux/export.h +#include linux/sched.h #include linux/sysctl.h #include linux/utsname.h #include trace/events/sched.h @@ -227,8 +228,10 @@ static int watchdog(void *dummy) for ( ; ; ) { unsigned long timeout = sysctl_hung_task_timeout_secs; - while (schedule_timeout_interruptible(timeout_jiffies(timeout))) + while (schedule_timeout_interruptible(timeout_jiffies(timeout))) { + kgr_task_safe(current); timeout = sysctl_hung_task_timeout_secs; + } if (atomic_xchg(reset_hung_task, 0)) continue; diff --git a/kernel/kthread.c b/kernel/kthread.c index 9a130ec06f7a..08b979dad619 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -78,6 +78,8 @@ static struct kthread *to_live_kthread(struct task_struct *k) */ bool kthread_should_stop(void) { + kgr_task_safe(current); + return test_bit(KTHREAD_SHOULD_STOP, to_kthread(current)-flags); } EXPORT_SYMBOL(kthread_should_stop); @@ -497,6 +499,7 @@ int kthreadd(void *unused) if (list_empty(kthread_create_list)) schedule();
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Wed, Apr 30, 2014 at 09:55:32AM -0700, Paul E. McKenney wrote: On Wed, Apr 30, 2014 at 04:30:42PM +0200, Jiri Slaby wrote: Some threads do not use kthread_should_stop. Before we enable a kthread support in kgr, we must make sure all those mark themselves safe explicitly. Would it make sense to bury kgr_task_safe() in wait_event_interruptible() and friends? The kgr_task_safe() implementation looks pretty lightweight, so it should not be a performance problem. For userspace tasks, the kGraft in progress flag is cleared when entering or exiting userspace. At that point it is safe to switch the task to a post-patch world view. For kernel threads, it's a bit more complicated: They never exit the kernel, they keep executing within the kernel continuously. The kgr_task_safe() call is thus inserted at a location within the main loop where a 'new loop' begins - where there are no dependencies on results of calls of functions from the previous loop. Hence, putting kgr_task_safe() into every wait_event_interruptible() wouldn't work, only a few of them are at that strategic spot where a 'new loop' can be indicated to kGraft. The reason kgr_task_safe() is called from within the condition evaluation statement in wait_event_interruptible() in this patch is because we want it to be called as soon as a new loop begins - even if that loop is empty because the condition to stop waiting has not been met. This also means that kGraft currently cannot patch the main loops of kernel threads themselves as the thread of execution never exits them. Jiří (Slabý) has some ideas about how to do without calling kgr_task_safe() from within the kernel thread main loops, but for now, the goal is to keep things simple and easy to understand. One reason might this might be a bad idea is that there are calls to wait_event_interruptible() all over the place, which might therefore constrain where grafting could be safely done. That would be fair enough, but does that also imply new constraints on where kthread_should_stop() can be invoked? Any new constraints might not be a big deal given that a very large fraction of the kthreads (and maybe all of them) invoke kthread_should_stop() from their top-level function, but would be good to call out. So, what is the story? kGraft currently assumes that kthread_should_stop() is always in a part of the main loop which doesn't carry over effect dependencies from the previous iteration. This is currently true for all the uses of kthread_should_stop(), but indeed it is an additional constraint for the future. Vojtech -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 09/16] kgr: mark task_safe in some kthreads
On Wed, Apr 30, 2014 at 08:33:27PM +0200, Vojtech Pavlik wrote: On Wed, Apr 30, 2014 at 09:55:32AM -0700, Paul E. McKenney wrote: On Wed, Apr 30, 2014 at 04:30:42PM +0200, Jiri Slaby wrote: Some threads do not use kthread_should_stop. Before we enable a kthread support in kgr, we must make sure all those mark themselves safe explicitly. Would it make sense to bury kgr_task_safe() in wait_event_interruptible() and friends? The kgr_task_safe() implementation looks pretty lightweight, so it should not be a performance problem. For userspace tasks, the kGraft in progress flag is cleared when entering or exiting userspace. At that point it is safe to switch the task to a post-patch world view. For kernel threads, it's a bit more complicated: They never exit the kernel, they keep executing within the kernel continuously. The kgr_task_safe() call is thus inserted at a location within the main loop where a 'new loop' begins - where there are no dependencies on results of calls of functions from the previous loop. Hence, putting kgr_task_safe() into every wait_event_interruptible() wouldn't work, only a few of them are at that strategic spot where a 'new loop' can be indicated to kGraft. The reason kgr_task_safe() is called from within the condition evaluation statement in wait_event_interruptible() in this patch is because we want it to be called as soon as a new loop begins - even if that loop is empty because the condition to stop waiting has not been met. This also means that kGraft currently cannot patch the main loops of kernel threads themselves as the thread of execution never exits them. Jiří (Slabý) has some ideas about how to do without calling kgr_task_safe() from within the kernel thread main loops, but for now, the goal is to keep things simple and easy to understand. OK, from an RCU perspective: Acked-by: Paul E. McKenney paul...@linux.vnet.ibm.com One reason might this might be a bad idea is that there are calls to wait_event_interruptible() all over the place, which might therefore constrain where grafting could be safely done. That would be fair enough, but does that also imply new constraints on where kthread_should_stop() can be invoked? Any new constraints might not be a big deal given that a very large fraction of the kthreads (and maybe all of them) invoke kthread_should_stop() from their top-level function, but would be good to call out. So, what is the story? kGraft currently assumes that kthread_should_stop() is always in a part of the main loop which doesn't carry over effect dependencies from the previous iteration. This is currently true for all the uses of kthread_should_stop(), but indeed it is an additional constraint for the future. Got it. It would be good to document this. ;-) Thanx, Paul Vojtech -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/