Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Mon 2014-09-22 13:23:54, Dmitry Torokhov wrote: On Monday, September 22, 2014 09:49:06 PM Pavel Machek wrote: On Thu 2014-09-11 13:23:54, Dmitry Torokhov wrote: On Thu, Sep 11, 2014 at 12:59:25PM -0700, James Bottomley wrote: Yes, but we mostly do this anyway. SCSI for instance does asynchronous scanning of attached devices (once the cards are probed) What would it do if the card was a bit slow to probe? but has a sync point for ordering. Quite often we do not really care about ordering of devices. I mean, does it matter if your mouse is discovered before your keyboard or after? Actually yes, I suspect it does. I do evtest /dev/input/eventX by hand, occasionally. It would be annoying if they moved between reboots. I am sorry but you will have to cope with such annoyances. It's not like we fail to boot the box here. The systems are now mostly hot-pluggable and userland is supposed to handle it, and it does, at least for input devices. If you want stable naming use udev facilities to rename devices as needed or add needed symlinks (by-id, etc.). Well, it would be nice if udev was not mandatory. Do the sync points for ordering actually cost us something? Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Tue, Sep 30, 2014 at 11:06:34PM +0200, Pavel Machek wrote: On Mon 2014-09-22 13:23:54, Dmitry Torokhov wrote: On Monday, September 22, 2014 09:49:06 PM Pavel Machek wrote: On Thu 2014-09-11 13:23:54, Dmitry Torokhov wrote: On Thu, Sep 11, 2014 at 12:59:25PM -0700, James Bottomley wrote: Yes, but we mostly do this anyway. SCSI for instance does asynchronous scanning of attached devices (once the cards are probed) What would it do if the card was a bit slow to probe? but has a sync point for ordering. Quite often we do not really care about ordering of devices. I mean, does it matter if your mouse is discovered before your keyboard or after? Actually yes, I suspect it does. I do evtest /dev/input/eventX by hand, occasionally. It would be annoying if they moved between reboots. I am sorry but you will have to cope with such annoyances. It's not like we fail to boot the box here. The systems are now mostly hot-pluggable and userland is supposed to handle it, and it does, at least for input devices. If you want stable naming use udev facilities to rename devices as needed or add needed symlinks (by-id, etc.). Well, it would be nice if udev was not mandatory. Do the sync points for ordering actually cost us something? Yes, boot time. We can save a second or two off the boot time if we probe several devices/drivers simultaneously. Thanks. -- Dmitry
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Mon, Sep 8, 2014 at 7:57 PM, Luis R. Rodriguez mcg...@do-not-panic.com wrote: Why do we care about the priority of probing tasks? Does that actually make any meaningful difference? If so, how? As I noted before -- I have yet to provide clear metrics but at least changing both init paths + probe from finit_module() to kthread certainly had a measurable time increase, I suspect using queue_work(system_unbound_wq, async_probe_work) will make probe slower. I'll get to these metrics this week.

The results are in and I'm glad to report my suspicions were incorrect about kthread() being slower than queue_work(system_unbound_wq), it actually works faster. Results will likely vary depending on subsystems but in this particular case the cxgb4 driver was tested requiring firmware loading and then without requiring firmware loading. Without firmware loading all mechanisms make probe take just about the same amount of time. What was surprising was that when firmware loading is required the amount of time it takes to run probe does vary, and quite considerably in terms of microseconds. The discrepancies are by no means terrible... but should be considered if one is thinking of large systems and if we do wish to optimize things further and offer equivalent behavior, especially when probing multiple devices with the same driver.

The method used to collect the amount of time for probe was to use:

    ktime_t calltime, delta, rettime;

    calltime = ktime_get();
    driver_attach();
    rettime = ktime_get();
    delta = ktime_sub(rettime, calltime);
    duration = (unsigned long long) ktime_to_ns(delta) >> 10;

And then print that time in microseconds right after it finishes, whether that be through the default kernel synchronous run or the async runs. The collection and testing was then done by Santosh.
Details of the collections are at: https://bugzilla.novell.com/show_bug.cgi?id=877622

The summary: The driver actually probed 2 cards in the tests so we don't have results for 1 card. The kernel serially calls probe for each device, so to get the amount of time for one run let's just divide the results by 2. For each strategy there is a run that requires firmware loading and a run where no firmware loading is required. The results for both cards are:

==============================================================|
strategy                        fw (usec)     no-fw (usec)    |
--------------------------------------------------------------|
synchronous                     48945138      2615126         |
kthread                         50132831      2619737         |
queue_work(system_unbound_wq)   49827323      2615262         |
--------------------------------------------------------------|

For one device then that comes out to:

==============================================================|
strategy                        fw (usec)     no-fw (usec)    |
--------------------------------------------------------------|
synchronous                     24472569      1307563         |
kthread                         25066415.5    1309868.5       |
queue_work(system_unbound_wq)   24913661.5    1307631         |
--------------------------------------------------------------|

Converting that to seconds:

==============================================================|
strategy                        fw (s)        no-fw (s)       |
--------------------------------------------------------------|
synchronous                     24.47         1.31            |
kthread                         25.07         1.31            |
queue_work(system_unbound_wq)   24.91         1.31            |
--------------------------------------------------------------|

Graph friendly versions of the results for probe of 1 device:

Probe with firmware: http://drvbp1.linux-foundation.org/~mcgrof/images/probe-measurements/probe-cgxb4-firmware.png
Probe without firmware: http://drvbp1.linux-foundation.org/~mcgrof/images/probe-measurements/probe-cgxb4-no-firmware.png

Luis
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Thu 2014-09-11 13:23:54, Dmitry Torokhov wrote: On Thu, Sep 11, 2014 at 12:59:25PM -0700, James Bottomley wrote: On Tue, 2014-09-09 at 16:01 -0700, Dmitry Torokhov wrote: On Tuesday, September 09, 2014 03:46:23 PM James Bottomley wrote: On Wed, 2014-09-10 at 07:41 +0900, Tejun Heo wrote: The thing is that we have to have dynamic mechanism to listen for device attachments no matter what and such mechanism has been in place for a long time at this point. The synchronous wait simply doesn't serve any purpose anymore and kinda gets in the way in that it makes it a possibly extremely slow process to tell whether loading of a module succeeded or not because the wait for the initial round of probe is piggybacked. OK, so we just fire and forget in userland ... why bother inventing an elaborate new infrastructure in the kernel to do exactly what modprobe mod would do? Just so we do not forget: we also want the no-modules case to also be able to probe asynchronously so that a slow device does not stall kernel booting. Yes, but we mostly do this anyway. SCSI for instance does asynchronous scanning of attached devices (once the cards are probed) What would it do if the card was a bit slow to probe? but has a sync point for ordering. Quite often we do not really care about ordering of devices. I mean, does it matter if your mouse is discovered before your keyboard or after? Actually yes, I suspect it does. I do evtest /dev/input/eventX by hand, occasionally. It would be annoying if they moved between reboots. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Monday, September 22, 2014 09:49:06 PM Pavel Machek wrote: On Thu 2014-09-11 13:23:54, Dmitry Torokhov wrote: On Thu, Sep 11, 2014 at 12:59:25PM -0700, James Bottomley wrote: On Tue, 2014-09-09 at 16:01 -0700, Dmitry Torokhov wrote: On Tuesday, September 09, 2014 03:46:23 PM James Bottomley wrote: On Wed, 2014-09-10 at 07:41 +0900, Tejun Heo wrote: The thing is that we have to have dynamic mechanism to listen for device attachments no matter what and such mechanism has been in place for a long time at this point. The synchronous wait simply doesn't serve any purpose anymore and kinda gets in the way in that it makes it a possibly extremely slow process to tell whether loading of a module succeeded or not because the wait for the initial round of probe is piggybacked. OK, so we just fire and forget in userland ... why bother inventing an elaborate new infrastructure in the kernel to do exactly what modprobe mod would do? Just so we do not forget: we also want the no-modules case to also be able to probe asynchronously so that a slow device does not stall kernel booting. Yes, but we mostly do this anyway. SCSI for instance does asynchronous scanning of attached devices (once the cards are probed) What would it do if the card was a bit slow to probe? but has a sync point for ordering. Quite often we do not really care about ordering of devices. I mean, does it matter if your mouse is discovered before your keyboard or after? Actually yes, I suspect it does. I do evtest /dev/input/eventX by hand, occasionally. It would be annoying if they moved between reboots. I am sorry but you will have to cope with such annoyances. It's not like we fail to boot the box here. The systems are now mostly hot-pluggable and userland is supposed to handle it, and it does, at least for input devices. If you want stable naming use udev facilities to rename devices as needed or add needed symlinks (by-id, etc.). Thanks.
-- Dmitry
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Tue, Sep 9, 2014 at 4:03 PM, Tejun Heo t...@kernel.org wrote: On Tue, Sep 09, 2014 at 12:25:29PM +0900, Tejun Heo wrote: Hello, On Mon, Sep 08, 2014 at 08:19:12PM -0700, Luis R. Rodriguez wrote: On the systemd side of things it should enable this sysctl and for older kernels what should it do? Supposing the change is backported via -stable, it can try to set the sysctl on all kernels. If the knob doesn't exist, the fix is not there and nothing can be done about it. The more I think about it, the more I think this should be a per-insmod instance thing rather than a system-wide switch. Agreed, a good use case that comes to mind would be systemd's modules-load.d lists used by systemd services to require modules. The hooks there, however, likely expect probe to complete as part of the service; since the timeout is not applicable to these, the synchronous probe for them would be good while systemd would use async probe for regular modules. Currently the kernel param code doesn't allow a generic param outside the ones specified by the module itself but adding support for something like driver.async_load=1 shouldn't be too difficult, applying that to existing systems shouldn't be much more difficult than a system-wide switch, and it'd be significantly cleaner than fiddling with driver blacklist. Agreed. Luis
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Tue, 2014-09-09 at 16:01 -0700, Dmitry Torokhov wrote: On Tuesday, September 09, 2014 03:46:23 PM James Bottomley wrote: On Wed, 2014-09-10 at 07:41 +0900, Tejun Heo wrote: The thing is that we have to have dynamic mechanism to listen for device attachments no matter what and such mechanism has been in place for a long time at this point. The synchronous wait simply doesn't serve any purpose anymore and kinda gets in the way in that it makes it a possibly extremely slow process to tell whether loading of a module succeeded or not because the wait for the initial round of probe is piggybacked. OK, so we just fire and forget in userland ... why bother inventing an elaborate new infrastructure in the kernel to do exactly what modprobe mod would do? Just so we do not forget: we also want the no-modules case to also be able to probe asynchronously so that a slow device does not stall kernel booting. Yes, but we mostly do this anyway. SCSI for instance does asynchronous scanning of attached devices (once the cards are probed) but has a sync point for ordering. The problem of speeding up boot is different from the one of init processes killing modprobes. There are elements in common, but by and large the biggest headaches at least in large device number boots have already been tackled by the enterprise crowd (they don't like their S390's or 1024 core NUMA systems taking half an hour to come up). James
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Thu, Sep 11, 2014 at 12:59:25PM -0700, James Bottomley wrote: On Tue, 2014-09-09 at 16:01 -0700, Dmitry Torokhov wrote: On Tuesday, September 09, 2014 03:46:23 PM James Bottomley wrote: On Wed, 2014-09-10 at 07:41 +0900, Tejun Heo wrote: The thing is that we have to have dynamic mechanism to listen for device attachments no matter what and such mechanism has been in place for a long time at this point. The synchronous wait simply doesn't serve any purpose anymore and kinda gets in the way in that it makes it a possibly extremely slow process to tell whether loading of a module succeeded or not because the wait for the initial round of probe is piggybacked. OK, so we just fire and forget in userland ... why bother inventing an elaborate new infrastructure in the kernel to do exactly what modprobe mod would do? Just so we do not forget: we also want the no-modules case to also be able to probe asynchronously so that a slow device does not stall kernel booting. Yes, but we mostly do this anyway. SCSI for instance does asynchronous scanning of attached devices (once the cards are probed) What would it do if the card was a bit slow to probe? but has a sync point for ordering. Quite often we do not really care about ordering of devices. I mean, does it matter if your mouse is discovered before your keyboard or after? The problem of speeding up boot is different from the one of init processes killing modprobes. Right. One is systemd doing stupid things, another is kernel could be smarter. There are elements in common, but by and large the biggest headaches at least in large device number boots have already been tackled by the enterprise crowd (they don't like their S390's or 1024 core NUMA systems taking half an hour to come up). Please do not position this as a mostly solved large systems problem. For us it is touchpad detection stalling kernel for 0.5-1 sec. Which is a lot given that we boot in seconds. Thanks.
-- Dmitry
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Thu, Sep 11, 2014 at 1:23 PM, Dmitry Torokhov dmitry.torok...@gmail.com wrote: There are elements in common, but by and large the biggest headaches at least in large device number boots have already been tackled by the enterprise crowd (they don't like their S390's or 1024 core NUMA systems taking half an hour to come up). Please do not position this as a mostly solved large systems problem. For us it is touchpad detection stalling kernel for 0.5-1 sec. Which is a lot given that we boot in seconds. Dmitry, would working on top of the async series be reasonable? Then we could address these as separate things which we'd build on top of. The one aspect I see us needing to share is the "async probe universe is OK" flag. Luis
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Thu, Sep 11, 2014 at 01:42:20PM -0700, Luis R. Rodriguez wrote: On Thu, Sep 11, 2014 at 1:23 PM, Dmitry Torokhov dmitry.torok...@gmail.com wrote: There are elements in common, but by and large the biggest headaches at least in large device number boots have already been tackled by the enterprise crowd (they don't like their S390's or 1024 core NUMA systems taking half an hour to come up). Please do not position this as a mostly solved large systems problem. For us it is touchpad detection stalling kernel for 0.5-1 sec. Which is a lot given that we boot in seconds. Dmitry, would working on top of the async series be reasonable? Then we could address these as separate things which we'd build on top of. The one aspect I see us needing to share is the "async probe universe is OK" flag. Sure. Are you planning on refreshing your series? I think the code-related discussion kind of stalled... -- Dmitry
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Thu, Sep 11, 2014 at 1:53 PM, Dmitry Torokhov dmitry.torok...@gmail.com wrote: On Thu, Sep 11, 2014 at 01:42:20PM -0700, Luis R. Rodriguez wrote: On Thu, Sep 11, 2014 at 1:23 PM, Dmitry Torokhov dmitry.torok...@gmail.com wrote: There are elements in common, but by and large the biggest headaches at least in large device number boots have already been tackled by the enterprise crowd (they don't like their S390's or 1024 core NUMA systems taking half an hour to come up). Please do not position this as a mostly solved large systems problem. For us it is touchpad detection stalling kernel for 0.5-1 sec. Which is a lot given that we boot in seconds. Dmitry, would working on top of the async series be reasonable? Then we could address these as separate things which we'd build on top of. The one aspect I see us needing to share is the "async probe universe is OK" flag. Sure. Are you planning on refreshing your series? Yes. I think the code-related discussion kind of stalled... I was just waiting for any possible brain farts to flush out before a new respin. I'll tackle this now. Luis
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Mon, Sep 8, 2014 at 10:38 PM, James Bottomley james.bottom...@hansenpartnership.com wrote: On Tue, 2014-09-09 at 10:10 +0900, Tejun Heo wrote: Hello, Luis. On Mon, Sep 08, 2014 at 06:04:23PM -0700, Luis R. Rodriguez wrote: I have no idea how the selection should be. It could be per-insmod or maybe just a system-wide flag with explicit exceptions marked on drivers is good enough. I don't know. It's perfectly understandable if we don't know what path to take yet and it's also understandable for it to take time to figure out -- meanwhile though systemd already has merged a policy of a 30 second timeout for *all drivers* though so we therefore need: I'm not too convinced this is such a difficult problem to figure out. We already have most of logic in place and the only thing missing is how to switch it. Wouldn't something like the following work?

* Add a sysctl knob to enable asynchronous device probing on module load and enable asynchronous probing globally if the knob is set.
* Identify cases which can't be asynchronous and make them synchronous. e.g. keep who's doing request_module() and avoid asynchronous probing if current is probing one of those.

What's wrong with just fixing systemd? Arbitrary timeouts in init scripts for system bring up are plain wrong ... I thought we had this sorted out ten years ago when we were first having the arguments about how long to wait for root; I'm surprised it's coming back again. By design it seems systemd should not allow worker processes to block indefinitely and in fact it currently uses the same timeout for all types of worker processes.
I last recommended a multiplier to at least allow systemd to distinguish and allow us to modify the timeout based on the type of process by using an enum used to classify these, kmod for example would be one type of command: http://lists.freedesktop.org/archives/systemd-devel/2014-August/021852.html This was deemed to introduce unnecessary complexity, but I believe this was before we realized that the timeout was penalizing kmod usage unfairly given that the original assumption that it was just init that should be penalized was incorrect given that we batch both init + probe together. I have been relaying updates back on that thread as we move along with this discussion on the issues found with the timeout, but haven't gotten feedback yet as to which path folks on systemd would like to take in light of recent discussions / clarifications. Perhaps your arguments might help folks here reconsider things a bit as well. If we want *tight* integration between init system / kernel these discussions are necessary not only when we find issues but also should be part of the design phase for major changes. If we want to sort out some sync/async mechanism for probing devices, as an agreement between the init systems and the kernel, that's fine, but its a to-be negotiated enhancement. Unfortunately as Tejun notes the train has left which already made assumptions on this. I'm afraid distributions that want to avoid this sigkill at least on the kernel front will have to work around this issue either on systemd by increasing the default timeout which is now possible thanks to Hannes' changes or by some other means such as the combination of a modified non-chatty version of this patch + a check at the end of load_module() as mentioned earlier on these threads. For the current bug fix, just fix the component that broke ... which would be systemd. 
For new systems it seems the proposed fix is to have systemd tell the kernel what it thought it should be seeing and that is all pure async probes through a sysctl, and then we'd do async probe on all modules unless a driver is specifically flagged with a need to run synchronous (we'll enable this for request_firmware() users for example to start off with). Luis
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Tue, 2014-09-09 at 12:16 -0700, Luis R. Rodriguez wrote: On Mon, Sep 8, 2014 at 10:38 PM, James Bottomley james.bottom...@hansenpartnership.com wrote: If we want to sort out some sync/async mechanism for probing devices, as an agreement between the init systems and the kernel, that's fine, but its a to-be negotiated enhancement. Unfortunately as Tejun notes the train has left which already made assumptions on this. Well, that's why it's a bug. It's a material regression impacting users. I'm afraid distributions that want to avoid this sigkill at least on the kernel front will have to work around this issue either on systemd by increasing the default timeout which is now possible thanks to Hannes' changes or by some other means such as the combination of a modified non-chatty version of this patch + a check at the end of load_module() as mentioned earlier on these threads. Increasing the default timeout in systemd seems like the obvious bug fix to me. If the patch exists already, having distros that want it use it looks to be correct ... not every bug is a kernel bug, after all. Negotiating a probe vs init split for drivers is fine too, but it's a longer term thing rather than a bug fix. For the current bug fix, just fix the component that broke ... which would be systemd. For new systems it seems the proposed fix is to have systemd tell the kernel what it thought it should be seeing and that is all pure async probes through a sysctl, and then we'd do async probe on all modules unless a driver is specifically flagged with a need to run synchronous (we'll enable this for request_firmware() users for example to start off with). I don't have very strong views on this one. 
However, I've got to say from a systems point of view that if the desire is to flag when the module is having problems, probing and initializing synchronously in a thread spawned by init which the init process can watchdog and thus can flash up warning messages seems to be more straightforwards than an elaborate asynchronous mechanism with completion signalling which achieves the same thing in a more complicated (and thus bug prone) fashion. James
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Tue, Sep 9, 2014 at 12:35 PM, James Bottomley james.bottom...@hansenpartnership.com wrote: On Tue, 2014-09-09 at 12:16 -0700, Luis R. Rodriguez wrote: On Mon, Sep 8, 2014 at 10:38 PM, James Bottomley james.bottom...@hansenpartnership.com wrote: If we want to sort out some sync/async mechanism for probing devices, as an agreement between the init systems and the kernel, that's fine, but its a to-be negotiated enhancement. Unfortunately as Tejun notes the train has left which already made assumptions on this. Well, that's why it's a bug. It's a material regression impacting users. Indeed. I believe the issue with this regression however was that the original commit e64fae55 (January 2012) was only accepted by *kernel folks* to be a real regression until recently. More than two years have gone by on growing design and assumptions on top of that original commit. I'm not sure if *systemd folks* yet believe it was a design regression? I'm afraid distributions that want to avoid this sigkill at least on the kernel front will have to work around this issue either on systemd by increasing the default timeout which is now possible thanks to Hannes' changes or by some other means such as the combination of a modified non-chatty version of this patch + a check at the end of load_module() as mentioned earlier on these threads. Increasing the default timeout in systemd seems like the obvious bug fix to me. If the patch exists already, having distros that want it use it looks to be correct ... not every bug is a kernel bug, after all. It's merged upstream on systemd now, along with a few fixes on top of it. I also see Kay merged a change to the default timeout to 60 seconds on August 30. It's unclear if these discussions had any impact on that decision or if that was just because udev firmware loading has now been ripped out. I'll note that the new 60 second timeout wouldn't suffice for cxgb4 even if it didn't do firmware loading, its probe takes over one full minute.
Negotiating a probe vs init split for drivers is fine too, but it's a longer term thing rather than a bug fix. Indeed. What I proposed with a multiplier for the timeout for the different types of built in commands was deemed complex but saw no alternatives proposed despite my interest to work on one and clarifications noted that this was a design regression. Not quite sure what else I could have done here. I'm interested in learning what the better approach is for the future as if we want to marry init + kernel we need a smooth way for us to discuss design without getting worked up about it, or taking it personally. I really want this to work as I personally like systemd so far. For the current bug fix, just fix the component that broke ... which would be systemd. For new systems it seems the proposed fix is to have systemd tell the kernel what it thought it should be seeing and that is all pure async probes through a sysctl, and then we'd do async probe on all modules unless a driver is specifically flagged with a need to run synchronous (we'll enable this for request_firmware() users for example to start off with). I don't have very strong views on this one. However, I've got to say from a systems point of view that if the desire is to flag when the module is having problems, probing and initializing synchronously in a thread spawned by init which the init process can watchdog and thus can flash up warning messages seems to be more straightforwards Indeed however it was not understood that module loading did init + probe synchronously, and indeed what you recommend is also what I was hoping systemd *should do* instead of a hard sigkill at the default timeout. than an elaborate asynchronous mechanism with completion signalling which achieves the same thing in a more complicated (and thus bug prone) fashion. I couldn't be in any more agreement with you. It takes two to tango though.
Luis
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
Hey, James. On Tue, Sep 09, 2014 at 12:35:46PM -0700, James Bottomley wrote: I don't have very strong views on this one. However, I've got to say from a systems point of view that if the desire is to flag when the module is having problems, probing and initializing synchronously in a thread spawned by init which the init process can watchdog and thus can flash up warning messages seems to be more straightforwards than an elaborate asynchronous mechanism with completion signalling which achieves the same thing in a more complicated (and thus bug prone) fashion. We no longer report back error on probe failure on module load. It used to make sense to indicate error for module load on probe failure when the hardware was a lot simpler and drivers did their own device enumeration. With the current bus / device setup, it doesn't make any sense and driver core silently suppresses all probe failures. There's nothing the probing thread can monitor anymore. In that sense, we already separated out device probing from module loading simply because the hardware reality mandated it and we have dynamic mechanisms to listen for device probes exactly for the same reason, so I think it makes sense to separate out the waiting too, at least in the long term. In a modern dynamic setup, the waits are essentially arbitrary and doesn't buy us anything. Thanks. -- tejun
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Tue, 9 Sep 2014, Luis R. Rodriguez wrote: By design it seems systemd should not allow worker processes to block indefinitely and in fact it currently uses the same timeout for all types of worker processes. And I whole-heartedly believe this is something that fundamentally needs to be addressed in systemd, not in the kernel. This approach is actually introducing user-visible regressions. Look, for example, exec() never times out. Therefore if your system is on its knees, heavily overloaded (or completely broken), you are likely to be able to `reboot' it, because exec(/sbin/reboot) ultimately succeeds. But with all the timeouts, dbus, "Failed to issue method call: Did not receive a reply" messages, this is getting close to impossible. -- Jiri Kosina SUSE Labs
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Wed, 2014-09-10 at 06:42 +0900, Tejun Heo wrote: Hey, James. On Tue, Sep 09, 2014 at 12:35:46PM -0700, James Bottomley wrote: I don't have very strong views on this one. However, I've got to say from a systems point of view that if the desire is to flag when the module is having problems, probing and initializing synchronously in a thread spawned by init which the init process can watchdog and thus can flash up warning messages seems to be more straightforwards than an elaborate asynchronous mechanism with completion signalling which achieves the same thing in a more complicated (and thus bug prone) fashion. We no longer report back error on probe failure on module load. Yes, we do; for every probe failure of a device on a driver we'll print a warning (see drivers/base/dd.c). Now if someone is proposing we should report this in a better fashion, that's probably a good idea, but I must have missed that patch. It used to make sense to indicate error for module load on probe failure when the hardware was a lot simpler and drivers did their own device enumeration. With the current bus / device setup, it doesn't make any sense and driver core silently suppresses all probe failures. There's nothing the probing thread can monitor anymore. Except the length of time taken to probe. That seems to be what systemd is interested in, hence this whole thread, right? In that sense, we already separated out device probing from module loading simply because the hardware reality mandated it and we have dynamic mechanisms to listen for device probes exactly for the same reason, so I think it makes sense to separate out the waiting too, at least in the long term. In a modern dynamic setup, the waits are essentially arbitrary and doesn't buy us anything. But that's nothing to do with sync or async. Nowadays we register a driver, the driver may bind to multiple devices. If one of those devices encounters an error during probe, we just report the fact in dmesg and move on. 
The module_init thread currently returns when all the probe routines for all enumerated devices have been called, so module_init has no indication of any failures (because they might be mixed with successes); successes are indicated as the device appears but we have nothing other than the kernel log to indicate the failures. How does moving to async probing alter this? It doesn't as far as I can see, except that module_init returns earlier but now we no longer have an indication of when the probe completes, so we have to add yet another mechanism to tell us if we're interested in that. I really don't see what this buys us. James
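A minimal userspace sketch of the watchdog idea quoted in this message: the loader runs in a child process, and the parent warns when it runs long instead of killing it. It assumes POSIX fork()/waitpid(); the helper names spawn() and watch_child() and the 10 ms polling interval are hypothetical and are taken from neither systemd nor the kernel.

```c
#include <poll.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that runs argv[0]; returns the child's pid, or -1 on error. */
static pid_t spawn(char *const argv[])
{
    pid_t pid = fork();

    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);             /* exec failed */
    }
    return pid;
}

/*
 * Wait up to timeout_ms for child `pid` to exit.
 * Returns 0 if it exited in time, 1 if it is still running (a warning is
 * printed, the child is NOT killed), -1 if waitpid() failed.
 */
static int watch_child(pid_t pid, long timeout_ms)
{
    long waited_ms = 0;

    for (;;) {
        int status;
        pid_t r = waitpid(pid, &status, WNOHANG);

        if (r == pid)
            return 0;
        if (r < 0)
            return -1;
        if (waited_ms >= timeout_ms) {
            fprintf(stderr, "warning: pid %ld still busy after %ld ms\n",
                    (long)pid, timeout_ms);
            return 1;
        }
        poll(NULL, 0, 10);      /* sleep roughly 10 ms */
        waited_ms += 10;
    }
}
```

A caller would spawn() the loader (for example a modprobe invocation), call watch_child() with its warning threshold, and keep waiting after a return value of 1 rather than sending a signal.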
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
Hello, On Tue, Sep 09, 2014 at 03:26:02PM -0700, James Bottomley wrote: We no longer report back error on probe failure on module load. Yes, we do; for every probe failure of a device on a driver we'll print a warning (see drivers/base/dd.c). Now if someone is proposing we should report this in a better fashion, that's probably a good idea, but I must have missed that patch. We can do printks all the same from anywhere. There's nothing special about printing from the module loading thread. The only way to actually take advantage of the synchronisity would be propagating error return to the waiting issuer, which we used to do but no longer can. It used to make sense to indicate error for module load on probe failure when the hardware was a lot simpler and drivers did their own device enumeration. With the current bus / device setup, it doesn't make any sense and driver core silently suppresses all probe failures. There's nothing the probing thread can monitor anymore. Except the length of time taken to probe. That seems to be what systemd is interested in, hence this whole thread, right? No, systemd in this case isn't interested in the time taken to probe at all. It is expecting module load to just do that - load the module. Modern userlands, systemd or not, no longer depend on or make use of the wait. But that's nothing to do with sync or async. Nowadays we register a driver, the driver may bind to multiple devices. If one of those devices encounters an error during probe, we just report the fact in dmesg and move on. The module_init thread currently returns when all the probe routines for all enumerated devices have been called, so module_init has no indication of any failures (because they might be mixed with successes); successes are indicated as the device appears but we have nothing other than the kernel log to indicate the failures. How does moving to async probing alter this? 
It doesn't as far as I can see, except that module_init returns earlier but now we no longer have an indication of when the probe completes, so we have to add yet another mechanism to tell us if we're interested in that. I really don't see what this buys us. The thing is that we have to have dynamic mechanism to listen for device attachments no matter what and such mechanism has been in place for a long time at this point. The synchronous wait simply doesn't serve any purpose anymore and kinda gets in the way in that it makes it a possibly extremely slow process to tell whether loading of a module succeeded or not because the wait for the initial round of probe is piggybacked. Thanks. -- tejun
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Wed, 2014-09-10 at 07:41 +0900, Tejun Heo wrote: Hello, On Tue, Sep 09, 2014 at 03:26:02PM -0700, James Bottomley wrote: We no longer report back error on probe failure on module load. Yes, we do; for every probe failure of a device on a driver we'll print a warning (see drivers/base/dd.c). Now if someone is proposing we should report this in a better fashion, that's probably a good idea, but I must have missed that patch. We can do printks all the same from anywhere. There's nothing special about printing from the module loading thread. The only way to actually take advantage of the synchronisity would be propagating error return to the waiting issuer, which we used to do but no longer can. If you want the return of an individual device probe a log scraper gives it to you ... and nothing else does currently. The advantage of the prink in dd.c is that it's standard for everything and can be scanned for ... if you take that out, you'll get complaints about the lack of standard messages (you'd be surprised at the number of enterprise monitoring systems that actually do log scraping). It used to make sense to indicate error for module load on probe failure when the hardware was a lot simpler and drivers did their own device enumeration. With the current bus / device setup, it doesn't make any sense and driver core silently suppresses all probe failures. There's nothing the probing thread can monitor anymore. Except the length of time taken to probe. That seems to be what systemd is interested in, hence this whole thread, right? No, systemd in this case isn't interested in the time taken to probe at all. It is expecting module load to just do that - load the module. Modern userlands, systemd or not, no longer depend on or make use of the wait. So what's the problem? it can just fire and forget; that's what fork() is for. But that's nothing to do with sync or async. Nowadays we register a driver, the driver may bind to multiple devices. 
If one of those devices encounters an error during probe, we just report the fact in dmesg and move on. The module_init thread currently returns when all the probe routines for all enumerated devices have been called, so module_init has no indication of any failures (because they might be mixed with successes); successes are indicated as the device appears but we have nothing other than the kernel log to indicate the failures. How does moving to async probing alter this? It doesn't as far as I can see, except that module_init returns earlier but now we no longer have an indication of when the probe completes, so we have to add yet another mechanism to tell us if we're interested in that. I really don't see what this buys us. The thing is that we have to have dynamic mechanism to listen for device attachments no matter what and such mechanism has been in place for a long time at this point. The synchronous wait simply doesn't serve any purpose anymore and kinda gets in the way in that it makes it a possibly extremely slow process to tell whether loading of a module succeeded or not because the wait for the initial round of probe is piggybacked. OK, so we just fire and forget in userland ... why bother inventing an elaborate new infrastructure in the kernel to do exactly what modprobe mod would do? James
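The userland fire-and-forget approach suggested here could look roughly like the following C sketch. It assumes POSIX fork()/execvp() and uses a double fork so that the caller never has to reap the loader; spawn_and_forget() is a hypothetical helper name, not something modprobe or systemd provides.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Start argv[0] in the background and return without waiting for it.
 * The intermediate child exits at once, so the grandchild is reparented
 * (normally to init) and the caller has nothing left to reap.
 * Returns 0 if the launch was handed off, -1 on failure.
 */
static int spawn_and_forget(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        pid_t gpid = fork();

        if (gpid < 0)
            _exit(1);
        if (gpid == 0) {
            execvp(argv[0], argv);
            _exit(127);         /* exec failed; nobody is listening */
        }
        _exit(0);               /* intermediate child: done */
    }

    /* Only the short-lived intermediate child is waited for. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

With this pattern the caller returns immediately, but it also never learns whether the command itself succeeded: the exit status of the grandchild is discarded.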
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
Hello, James. On Tue, Sep 09, 2014 at 03:46:23PM -0700, James Bottomley wrote: If you want the return of an individual device probe a log scraper gives it to you ... and nothing else does currently. The advantage of the printk in dd.c is that it's standard for everything and can be scanned for ... if you take that out, you'll get complaints about the lack of standard messages (you'd be surprised at the number of enterprise monitoring systems that actually do log scraping). Why would a log scraper care about which task is printing the messages? The printk can stay there. There's nothing wrong with it. Log scrapers tend to be asynchronous in nature but if a log scraper wants to operate synchronously for whatever reason, it can simply not turn on async probing. OK, so we just fire and forget in userland ... why bother inventing an elaborate new infrastructure in the kernel to do exactly what modprobe mod would do? I think the argument there is that the issuer wants to know whether such operations succeeded or not and wants to report and record the result and possibly take other actions in response. We're currently mixing wait and error reporting for one type of operation with wait for another. I'm not saying it's a fatal flaw or anything but it can get in the way. Thanks. -- tejun
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Tue, Sep 09, 2014 at 12:25:29PM +0900, Tejun Heo wrote: Hello, On Mon, Sep 08, 2014 at 08:19:12PM -0700, Luis R. Rodriguez wrote: On the systemd side of things it should enable this sysctl and for older kernels what should it do? Supposing the change is backported via -stable, it can try to set the sysctl on all kernels. If the knob doesn't exist, the fix is not there and nothing can be done about it. The more I think about it, the more I think this should be a per-insmod instance thing rather than a system-wide switch. Currently the kernel param code doesn't allow a generic param outside the ones specified by the module itself but adding support for something like driver.async_load=1 shouldn't be too difficult, applying that to existing systems shouldn't be much more difficult than a system-wide switch, and it'd be significantly cleaner than fiddling with driver blacklist. Thanks. -- tejun
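The "try to set the sysctl on all kernels" behaviour described above can be sketched in userspace C as follows. The function name try_set_knob() is hypothetical, and so is any concrete knob path: the thread never names the sysctl, so a path such as /proc/sys/kernel/async_probe would be an assumption for illustration only.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Try to write `value` to the knob at `path` without creating it.
 * Returns 0 if the value was written, 1 if the knob does not exist
 * (kernel without the fix: nothing can be done), -1 on any other error.
 */
static int try_set_knob(const char *path, const char *value)
{
    size_t len = strlen(value);
    ssize_t n;
    int ret;
    int fd = open(path, O_WRONLY);   /* no O_CREAT: absence must be visible */

    if (fd < 0)
        return (errno == ENOENT) ? 1 : -1;

    n = write(fd, value, len);
    ret = (n == (ssize_t)len) ? 0 : -1;

    if (close(fd) < 0)
        ret = -1;
    return ret;
}
```

Treating a missing knob as a separate, non-fatal result lets the same userland binary run unchanged on kernels with and without the backport.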
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Tue, Sep 9, 2014 at 3:26 AM, Luis R. Rodriguez mcg...@do-not-panic.com wrote: On Mon, Sep 8, 2014 at 6:22 PM, Tejun Heo t...@kernel.org wrote: On Tue, Sep 09, 2014 at 10:10:59AM +0900, Tejun Heo wrote: I'm not too convinced this is such a difficult problem to figure out. We already have most of the logic in place and the only thing missing is how to switch it. Wouldn't something like the following work? * Add a sysctl knob to enable asynchronous device probing on module load and enable asynchronous probing globally if the knob is set. Alternatively, adding a module-generic param async_probe or whatever and using that to switch the behavior should work too. I don't know which way is better but either should work fine. I take it by this you meant a generic system-wide sysctl or kernel cmd line option to enable this for all drivers? If the expectation is that this feature should be enabled unconditionally for all systemd systems, wouldn't it make more sense to make it a Kconfig option (possibly overridable from the kernel commandline in case that makes testing simpler)? Cheers, Tom
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Fri, Sep 5, 2014 at 3:40 PM, Tejun Heo t...@kernel.org wrote: Hello, Luis. On Fri, Sep 05, 2014 at 11:12:17AM -0700, Luis R. Rodriguez wrote: Meanwhile we are allowing a major design consideration such as a 30 second timeout for both init + probe all of a sudden become a hard requirement for device drivers. I see your point but can't also be introducing major design changes willy nilly either. We *need* a solution for the affected drivers. Yes, make the behavior specifically specified from userland. When did I ever say that there should be no solution for the problem? I've been saying that the behavior should be selected from userland from the get-go, haven't I? I have no idea how the selection should be. It could be per-insmod or maybe just a system-wide flag with explicit exceptions marked on drivers is good enough. I don't know. Its perfectly understandable if we don't know what path to take yet and its also understandable for it to take time to figure out -- meanwhile though systemd already has merged a policy of a 30 second timeout for *all drivers* though so we therefore need: 0) a solutions for affected combination of systemd / drivers 1) an agreed path forward If we want a tight integration between both kernel / init system we need to be able to communicate effectively folks and I'm afraid this isn't happening. I last noted on systemd-devel how the 30 second timeout issue was merged under incorrect assumptions -- that it was not just init that at times caused delays, and that since we currently batch both init and probe on the driver core we need a non fatal userspace solution [0], while we work on design on the kernel side of things for async'ing for drivers that make sense. 
A proper kernel solution may take longer than expected, we can't just assume a probe_async flag will suffice on drivers, in fact as Tejun notes, its wrong since historically we have had some random userland depend on the synhronous behaviour of module loading of some drivers, and that *could* have taken a while. Kay, Lennart, any recommendations ? [0] http://lists.freedesktop.org/archives/systemd-devel/2014-August/022696.html Also what stops drivers from going ahead and just implementing their own async probe? Would that now be frowned upon as it strives away The drivers can't. How many times should I explain the same thing over and over again. libata can't simply make probing asynchronous w.r.t. module loading no matter how it does it. Yeah, sure, there can be other drivers which can do that without most people noticing it but a storage driver isn't one of them and the storage drivers are the problematic ones already, right? Its one of the subsystems that has suffered from this, but not the only one. from the original design? The bool would let those drivers do this easily, and we would still need to identify these drivers, although this particular change can be NAK'd Oleg's suggestion on WARN_ON(fatal_signal_pending() at the end of load_module() seems to me at least needed. And if its not async probe... what do those with failed drivers do? I'm getting tired of explaining the same thing over and over again. The said change was nacked because the whole approach of let's see which drivers get reported on the issue which exists basically for all drivers and just change the behavior of them is braindead. It makes no sense whatsoever. It doesn't address the root cause of the problem while making the same class of drivers behave significantly differently for no good reason. Please stop chasing your own tail and try to understand the larger picture. Understood. 
Luis
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
Hello, Luis. On Mon, Sep 08, 2014 at 06:04:23PM -0700, Luis R. Rodriguez wrote: I have no idea how the selection should be. It could be per-insmod or maybe just a system-wide flag with explicit exceptions marked on drivers is good enough. I don't know. It's perfectly understandable if we don't know what path to take yet and it's also understandable for it to take time to figure out -- meanwhile though systemd already has merged a policy of a 30 second timeout for *all drivers* though so we therefore need: I'm not too convinced this is such a difficult problem to figure out. We already have most of the logic in place and the only thing missing is how to switch it. Wouldn't something like the following work? * Add a sysctl knob to enable asynchronous device probing on module load and enable asynchronous probing globally if the knob is set. * Identify cases which can't be asynchronous and make them synchronous. e.g. keep who's doing request_module() and avoid asynchronous probing if current is probing one of those. Thanks. -- tejun
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Tue, Sep 09, 2014 at 10:10:59AM +0900, Tejun Heo wrote: * Identify cases which can't be asynchronous and make them synchronous. e.g. keep who's doing request_module() and avoid asynchronous probing if current is probing one of those. That wouldn't work as we don't know what's gonna happen in userland but we can start with just disallowing async probing for char devices for now. Thanks. -- tejun
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Tue, Sep 09, 2014 at 10:10:59AM +0900, Tejun Heo wrote: I'm not too convinced this is such a difficult problem to figure out. We already have most of the logic in place and the only thing missing is how to switch it. Wouldn't something like the following work? * Add a sysctl knob to enable asynchronous device probing on module load and enable asynchronous probing globally if the knob is set. Alternatively, adding a module-generic param async_probe or whatever and using that to switch the behavior should work too. I don't know which way is better but either should work fine. Thanks. -- tejun
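To make the module-generic parameter idea concrete, here is a small userspace C sketch that scans a module argument string for such a flag before the remaining parameters would go to the module. The parameter name async_probe is the placeholder used in this message; wants_async_probe() and the "=0 disables it" convention are assumptions for illustration, not the kernel's param code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Scan a space-separated module argument string for the generic
 * parameter "async_probe". Returns true if it appears either bare or
 * with a value other than "0"; returns false otherwise.
 */
static bool wants_async_probe(const char *args)
{
    static const char key[] = "async_probe";
    const size_t klen = sizeof(key) - 1;
    const char *p = args;

    while (*p) {
        const char *end;
        size_t len;

        while (*p == ' ')           /* skip separators */
            p++;
        end = p;
        while (*end && *end != ' ') /* find the end of this token */
            end++;
        len = (size_t)(end - p);

        if (len >= klen && strncmp(p, key, klen) == 0) {
            if (len == klen)
                return true;        /* bare "async_probe" */
            if (p[klen] == '=')     /* "async_probe=<value>" */
                return !(len == klen + 2 && p[klen + 1] == '0');
        }
        p = end;
    }
    return false;
}
```

For example, the string "foo=1 async_probe=1" selects asynchronous probing, while "async_probe=0" and an unrelated "async_probe_extra=1" do not.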
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Mon, Sep 8, 2014 at 6:22 PM, Tejun Heo t...@kernel.org wrote: On Tue, Sep 09, 2014 at 10:10:59AM +0900, Tejun Heo wrote: I'm not too convinced this is such a difficult problem to figure out. We already have most of the logic in place and the only thing missing is how to switch it. Wouldn't something like the following work? * Add a sysctl knob to enable asynchronous device probing on module load and enable asynchronous probing globally if the knob is set. Alternatively, adding a module-generic param async_probe or whatever and using that to switch the behavior should work too. I don't know which way is better but either should work fine. I take it by this you meant a generic system-wide sysctl or kernel cmd line option to enable this for all drivers? Luis
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Mon, Sep 08, 2014 at 06:26:04PM -0700, Luis R. Rodriguez wrote: Alternatively, adding a module-generic param async_probe or whatever and using that to switch the behavior should work too. I don't know which way is better but either should work fine. I take it by this you meant a generic system-wide sysctl or kernel cmd line option to enable this for all drivers? Well, either global or per-insmod switch should work. There probably are details that I haven't mentioned - e.g. probably global switch is easier to backport and deploy to existing systems - but as long as it works I don't have fundamental objections either way. Thanks. -- tejun
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Mon, Sep 8, 2014 at 6:29 PM, Tejun Heo t...@kernel.org wrote: On Mon, Sep 08, 2014 at 06:26:04PM -0700, Luis R. Rodriguez wrote: Alternatively, adding a module-generic param async_probe or whatever and using that to switch the behavior should work too. I don't know which way is better but either should work fine. I take it by this you meant a generic system-wide sysctl or kernel cmd line option to enable this for all drivers? Well, either global or per-insmod switch should work. There probably are details that I haven't mentioned - e.g. probably global switch is easier to backport and deploy to existing systems Yes a global sysctl solution might make it easier to backport. - but as long as it works I don't have fundamental objections either way. OK then the only concern I would have with this is that the presence of such a flag doesn't necessarily mean that all drivers on a system have been tested for async probe yet. I'd feel much more comfortable if this global flag allowed say specific drivers that *did* have such a bool enabled, for example. Then that would enable synchronous behaviour for the kernel by default, require the flag for enabling the new async feature but only for drivers that have been tested. That also still would not technically solve the issue of the current existence of the timeout, unless of course we wish to ask systemd to only make the timeout take effect *iff* the global sysctl flag / whatever was enabled. Luis
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
Hello, On Mon, Sep 08, 2014 at 06:38:34PM -0700, Luis R. Rodriguez wrote: OK then one only concern I would have with this is that the presence of such a flag doesn't necessarily mean that all drivers on a system have been tested for async probe yet. I'd feel much more comfortable Given that the behavior change is from driver core and that device probing can happen post-loading anyway, I don't think we need to worry about drivers breaking from probing made asynchronous to loading. The problem is the expectation of the entity which initiated loading of the module. If it's depending on device being probed synchronously but insmod returns before that, it can break things. We probably should audit request_module() users and see which ones expect such behavior. if this global flag allowed say specific drivers that *did* have such a bool enabled, for example. Then that would enable synchronous behaviour for the kernel by default, require the flag for enabling the new async feature but only for drivers that have been tested. If we're gonna do the global switch, I personally think the right approach is blacklisting instead of the other way around because each specific driver doesn't really have much to do with it and the exceptions are about specific use cases that we don't have a good way to identify from the module loading path. That also still would not technically solve the issue of the current existence of the timeout, unless of course we wish to ask systemd to only make the timeout take effect *iff* the global sysctl flag / whatever was enabled. Userland could backport a fix to set the sysctl. Given that we need both synchronous and asynchronous behaviors, it's unlikely that we can come up with a solution which doesn't need cooperation from userland. Thanks. -- tejun
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Mon, Sep 8, 2014 at 6:47 PM, Tejun Heo t...@kernel.org wrote: Hello, On Mon, Sep 08, 2014 at 06:38:34PM -0700, Luis R. Rodriguez wrote: OK then one only concern I would have with this is that the presence of such a flag doesn't necessarily mean that all drivers on a system have been tested for asynch probe yet. I'd feel much more comfortable Given that the behvaior change is from driver core and that device probing can happen post-loading anyway, Ah but lets not forget Dmitry's requirement which is for in-kernel drivers. We'd need to deal with both built-in and modules. Dmitry's case is completely orthogonal to the systemd issue and is just needed to help not stall boot but I see no reason to blend these two issues into one requirement together. I don't think we need to worry about drivers breaking from probing made asynchronous to loading. The problem is the expectation of the entity which initiated loading of the module. If it's depending on device being probed synchronously but insmod returns before that, it can break things. We probably should audit request_module() users and see which ones expect such behavior. Sure. Based on a quick glance I see sloppy uses of this, this should probably be fixed anyway. if this global flag allowed say specific drivers that *did* have such a bool enabled, for example. Then that would enable synchronous behaviour for the kernel by default, require the flag for enabling the new async feature but only for drivers that have been tested. If we're gonna do the global switch, I personally think the right approach is blacklisting instead of the other way around because each specific driver doesn't really have much to do with it and the exceptions are about specific use cases that we don't have a good way to identify them from module loading path. OK sure... even if we did whitelist I'm afraid such a white list might be subjective in terms of design to specific systems anyway... 
I suppose the only real way to do it right is to push and strive towards a full system whitelist and address the black list as you mention. In terms of approach we would still need to decide on a path for how to do async probing for both in-kernel drivers and modules, do we want async_schedule(), or queue_work()? If async_schedule() do we want to use a new domain or a new one shared for all drivers? Priority on the scheduler was one of my other concerns which we'd need to make right to match existing load on drivers through finit_module() and synchronous probe. That also still would not technically solve the issue of the current existence of the timeout, unless of course we wish to ask systemd to only make the timeout take effect *iff* the global sysctl flag / whatever was enabled. Userland could backport a fix to set the sysctl. Given that we need both synchronous and asynchronous behaviors, it's unlikely that we can come up with a solution which doesn't need cooperation from userland. True and then the timeout would also have to be skipped for device drivers that have the sync_probe flag set, so I guess we'd need to expose that too. I'm not too sure if systemd is equipped to be happy with no timeout on module loading based on previous discussions [0] so we'd need to ensure we're all in agreement there that such drivers exist and we may need *something*, if at the very least a really long fucking timeout (TM) for such drivers. [0] http://lists.freedesktop.org/archives/systemd-devel/2014-August/021852.html Luis
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
Hello, On Mon, Sep 08, 2014 at 07:28:58PM -0700, Luis R. Rodriguez wrote: Given that the behvaior change is from driver core and that device probing can happen post-loading anyway, Ah but lets not forget Dmitry's requirement which is for in-kernel drivers. We'd need to deal with both built-in and modules. Dmitry's case is completely orthogonal to the systemd issue and is just needed to help not stall boot but I see no reason to blend these two issues into one requirement together. Maybe we can piggy back the two on the same mechanism but as you said the two issues are orthogonal. Let's keep it that way for now. We need them separate anyway for backports. In terms of approach we would still need to decide on a path for how to do asynch probing for both in-kernel drivers and modules, do we want async_schedule(), or queue_work()? If async_schedule() do we want to use a new domain or a new one shared for all drivers? Priority on I don't think async_schedule() is the right mechanism for this use case as the mechanism is inherently opportunistic. It also gets tangled up with async synchronization at the end of module loading. the schedular was one of my other concerns which we'd need to make right to match existing load on drivers through finit_module() and synchronous probe. Why do we care about the priority of probing tasks? Does that actually make any meaningful difference? If so, how? Userland could backport a fix to set the sysctl. Given that we need both synchrnous and asynchronous behaviors, it's unlikely that we can come up with a solution which doesn't need cooperation from userland. True and then the timeout would also have to be skipped for device drivers that have the sync_probe flag set, so I guess we'd need to I'm not sure about skipping for sync_probe flag. That seems like an implementation detail to me. 
Sure, we do that now because we don't have a better way of figuring out whether request_module() is waiting for it or not but hopefully we'd be able to in the future. I think we just should make exceptions sensible so that it works fine in practice for now (and I don't think that'd be too hard). So, the only cooperation necessary from userland would be just saying I don't wanna wait for device probing on module load. Thanks. -- tejun
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Mon, Sep 8, 2014 at 7:39 PM, Tejun Heo t...@kernel.org wrote: Hello, On Mon, Sep 08, 2014 at 07:28:58PM -0700, Luis R. Rodriguez wrote: Given that the behvaior change is from driver core and that device probing can happen post-loading anyway, Ah but lets not forget Dmitry's requirement which is for in-kernel drivers. We'd need to deal with both built-in and modules. Dmitry's case is completely orthogonal to the systemd issue and is just needed to help not stall boot but I see no reason to blend these two issues into one requirement together. Maybe we can piggy back the two on the same mechanism but as you said the two issues are orthogonal. Let's keep it that way for now. We need them separate anyway for backports. OK. In terms of approach we would still need to decide on a path for how to do asynch probing for both in-kernel drivers and modules, do we want async_schedule(), or queue_work()? If async_schedule() do we want to use a new domain or a new one shared for all drivers? Priority on I don't think async_schedule() is the right mechanism for this use case as the mechanism is inherently opportunistic. It also gets tangled up with async synchronization at the end of module loading. the schedular was one of my other concerns which we'd need to make right to match existing load on drivers through finit_module() and synchronous probe. Why do we care about the priority of probing tasks? Does that actually make any meaningful difference? If so, how? As I noted before -- I have yet to provide clear metrics but at least changing both init paths + probe from finit_module() to kthread certainly had a measurable time increase, I suspect using queue_work(system_unbound_wq, async_probe_work) will make probe slower. I'll get to these metrics this week. Userland could backport a fix to set the sysctl. Given that we need both synchrnous and asynchronous behaviors, it's unlikely that we can come up with a solution which doesn't need cooperation from userland. 
True and then the timeout would also have to be skipped for device drivers that have the sync_probe flag set, so I guess we'd need to I'm not sure about skipping for sync_probe flag. That seems like an implementation detail to me. Sure, we do that now because we don't have a better way of figuring out whether request_module() is waiting for it or not but hopefully we'd be able to in the future. Oh I was not thinking about just request_module() users but also any of those stragglers which we might have ended up finding through run time analysis. The alternative right now is these drivers won't load. No bueno. I think we just should make exceptions sensible so that it works fine in practice for now (and I don't think that'd be too hard). So, the only cooperation necessary from userland would be just saying I don't wanna wait for device probing on module load. But we're talking about drivers that have a flag that says 'you gotta wait sucker', what do we want systemd to do then? I'd be happy if it would not send the SIGKILL for these drivers, for example. Luis
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Mon, Sep 08, 2014 at 07:57:28PM -0700, Luis R. Rodriguez wrote: I think we just should make exceptions sensible so that it works fine in practice for now (and I don't think that'd be too hard). So, the only cooperation necessary from userland would be just saying I don't wanna wait for device probing on module load. But we're talking about drivers that have a flag that says 'you gotta wait sucker', what do we want systemd to do then? I'd be happy if it would not send the SIGKILL for these drivers, for example. Hah? Can you give me an example? I'm having a hard time imagining a driver with such a requirement given our current driver core implementation. Thanks. -- tejun
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Mon, Sep 8, 2014 at 8:03 PM, Tejun Heo t...@kernel.org wrote: On Mon, Sep 08, 2014 at 07:57:28PM -0700, Luis R. Rodriguez wrote: I think we just should make exceptions sensible so that it works fine in practice for now (and I don't think that'd be too hard). So, the only cooperation necessary from userland would be just saying I don't wanna wait for device probing on module load. But we're talking about drivers that have a flag that says 'you gotta wait sucker', what do we want systemd to do then? I'd be happy if it would not send the SIGKILL for these drivers, for example. Hah? Can you give me an example? I'm having a hard time imagining a driver with such a requirement given our current driver core implementation. I didn't say I had one in mind, but if you're certain these *shouldn't exist* that's sufficient for me as well. OK so I'll respin this series to add a sysctl that would enable async probe for *all drivers* using queue_work(system_unbound_wq) and only use sync probe for now on request_module() users, we'll address scheduling issues as they come up. I'll be ignoring built-in. On the systemd side of things it should enable this sysctl and for older kernels what should it do? Luis
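The policy Luis proposes for the respin can be written down as a small decision function. What follows is a minimal userspace sketch, not kernel code: every name in it (`choose_probe_mode`, `struct load_request`, the enum values) is hypothetical, and treating built-in drivers as unchanged (synchronous) is only one reading of "I'll be ignoring built-in".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; none of these exist in the kernel. */
enum probe_mode { PROBE_SYNC, PROBE_ASYNC };

struct load_request {
    bool built_in;           /* driver is compiled into the kernel image */
    bool via_request_module; /* load was triggered by request_module()   */
};

/*
 * Decide how a driver should be probed under the proposed policy:
 * async for all modules once userland has set the knob, except when
 * the load came from request_module(), whose caller expects the
 * device to be ready when the call returns.
 */
enum probe_mode choose_probe_mode(bool async_knob_set,
                                  const struct load_request *req)
{
    assert(req != NULL);

    if (!async_knob_set)
        return PROBE_SYNC;  /* knob unset: historical behaviour */
    if (req->built_in)
        return PROBE_SYNC;  /* assumption: built-in is left untouched */
    if (req->via_request_module)
        return PROBE_SYNC;  /* the requester waits for the device */
    return PROBE_ASYNC;
}
```

Keeping the default synchronous means nothing changes for userland that never sets the knob, which is the backward-compatibility point Tejun raises elsewhere in the thread.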
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
Hello, On Mon, Sep 08, 2014 at 08:19:12PM -0700, Luis R. Rodriguez wrote: On the systemd side of things it should enable this sysctl and for older kernels what should it do? Supposing the change is backported via -stable, it can try to set the sysctl on all kernels. If the knob doesn't exist, the fix is not there and nothing can be done about it. Thanks. -- tejun
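On the userland side, the "try to set it on all kernels" approach Tejun describes could look roughly like the sketch below. It is an illustration only: no sysctl name was settled in this thread, so the path in `HYPOTHETICAL_KNOB` is made up, and `try_enable_knob()` is a hypothetical helper rather than anything systemd ships.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical path: the thread never names the actual knob. */
#define HYPOTHETICAL_KNOB "/proc/sys/kernel/async_probe"

/*
 * Try to enable the knob at `path` by writing "1" to it.
 * Returns  1 if the knob exists and was set,
 *          0 if the knob does not exist (kernel without the fix),
 *         -1 on any other failure (permissions, short write, ...).
 */
int try_enable_knob(const char *path)
{
    int fd;
    ssize_t written;

    assert(path != NULL);

    fd = open(path, O_WRONLY);
    if (fd < 0)
        return errno == ENOENT ? 0 : -1;

    written = write(fd, "1\n", 2);
    close(fd);
    return written == 2 ? 1 : -1;
}
```

A caller would invoke `try_enable_knob(HYPOTHETICAL_KNOB)` once at startup and treat a return of 0 as "older kernel, carry on with the old behaviour", which matches the point that nothing more can be done when the knob is absent.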
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Tue, 2014-09-09 at 10:10 +0900, Tejun Heo wrote: Hello, Luis. On Mon, Sep 08, 2014 at 06:04:23PM -0700, Luis R. Rodriguez wrote: I have no idea how the selection should be. It could be per-insmod or maybe just a system-wide flag with explicit exceptions marked on drivers is good enough. I don't know. It's perfectly understandable if we don't know what path to take yet and it's also understandable for it to take time to figure out -- meanwhile, though, systemd has already merged a policy of a 30 second timeout for *all drivers*, so we therefore need: I'm not too convinced this is such a difficult problem to figure out. We already have most of the logic in place and the only thing missing is how to switch it. Wouldn't something like the following work? * Add a sysctl knob to enable asynchronous device probing on module load and enable asynchronous probing globally if the knob is set. * Identify cases which can't be asynchronous and make them synchronous. e.g. keep who's doing request_module() and avoid asynchronous probing if current is probing one of those. What's wrong with just fixing systemd? Arbitrary timeouts in init scripts for system bring up are plain wrong ... I thought we had this sorted out ten years ago when we were first having the arguments about how long to wait for root; I'm surprised it's coming back again. If we want to sort out some sync/async mechanism for probing devices, as an agreement between the init systems and the kernel, that's fine, but it's a to-be-negotiated enhancement. For the current bug fix, just fix the component that broke ... which would be systemd. James
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Thu, Sep 04, 2014 at 11:37:24PM -0700, Luis R. Rodriguez wrote:
> ...
> +	/*
> +	 * I got SIGKILL, but wait for 60 more seconds for completion
> +	 * unless chosen by the OOM killer. This delay is there as a
> +	 * workaround for boot failure caused by SIGKILL upon device
> +	 * driver initialization timeout.
> +	 *
> +	 * N.B. this will actually let the thread complete regularly,
> +	 * wait_for_completion() will be used eventually, the 60 second
> +	 * try here is just to check for the OOM over that time.
> +	 */
> +	WARN_ONCE(!test_thread_flag(TIF_MEMDIE),
> +		  "Got SIGKILL but not from OOM, if this issue is on probe use .driver.async_probe\n");
> +	for (i = 0; i < 60 && !test_thread_flag(TIF_MEMDIE); i++)
> +		if (wait_for_completion_timeout(&done, HZ))
> +			goto wait_done;
> +

Ugh... Jesus, this is way too hacky, so now we fail on 90s timeout instead of 30? Why do we even need this with the proposed async probing changes?

Thanks.

-- 
tejun
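The control flow of the quoted loop is easier to follow in isolation. Below is a minimal userspace model of it, a sketch under stated assumptions rather than the kernel code: `bounded_wait()`, `struct sim` and the two stub callbacks are hypothetical. `sim_poll()` stands in for one `wait_for_completion_timeout()` call of one second, and `sim_oom()` stands in for `test_thread_flag(TIF_MEMDIE)`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum wait_result { WAIT_DONE, WAIT_OOM, WAIT_EXPIRED };

typedef bool (*poll_fn)(void *ctx);

/*
 * Poll for completion at most `max_polls` times, giving up early if
 * the OOM flag is seen. Mirrors the shape of the quoted loop only;
 * what the caller does after WAIT_EXPIRED is outside this sketch.
 */
enum wait_result bounded_wait(poll_fn poll_done, poll_fn oom_victim,
                              void *ctx, int max_polls)
{
    int i;

    assert(poll_done != NULL && oom_victim != NULL);

    for (i = 0; i < max_polls && !oom_victim(ctx); i++)
        if (poll_done(ctx))
            return WAIT_DONE;

    return oom_victim(ctx) ? WAIT_OOM : WAIT_EXPIRED;
}

/* Simulated state: a negative threshold means "never happens". */
struct sim {
    int polls_until_done; /* completion arrives on this poll    */
    int polls_until_oom;  /* OOM flag is set after this many    */
    int polls;            /* number of polls performed so far   */
};

bool sim_poll(void *ctx)
{
    struct sim *s = ctx;

    s->polls++;
    return s->polls_until_done >= 0 && s->polls >= s->polls_until_done;
}

bool sim_oom(void *ctx)
{
    struct sim *s = ctx;

    return s->polls_until_oom >= 0 && s->polls >= s->polls_until_oom;
}
```

With `max_polls` set to 60 the loop spends at most 60 one-second polls before falling through. According to the patch comment, the caller then continues with a plain `wait_for_completion()`, so the 60 seconds are a window for noticing the OOM killer rather than a new deadline.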
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Fri, 2014-09-05 at 00:47 -0700, Luis R. Rodriguez wrote: On Fri, Sep 5, 2014 at 12:19 AM, Tejun Heo t...@kernel.org wrote: On Thu, Sep 04, 2014 at 11:37:24PM -0700, Luis R. Rodriguez wrote: ... + /* + * I got SIGKILL, but wait for 60 more seconds for completion + * unless chosen by the OOM killer. This delay is there as a + * workaround for boot failure caused by SIGKILL upon device + * driver initialization timeout. + * + * N.B. this will actually let the thread complete regularly, + * wait_for_completion() will be used eventually, the 60 second + * try here is just to check for the OOM over that time. + */ + WARN_ONCE(!test_thread_flag(TIF_MEMDIE), + "Got SIGKILL but not from OOM, if this issue is on probe use .driver.async_probe\n"); + for (i = 0; i < 60 && !test_thread_flag(TIF_MEMDIE); i++) + if (wait_for_completion_timeout(&done, HZ)) + goto wait_done; + Ugh... Jesus, this is way too hacky, so now we fail on 90s timeout instead of 30? Nope! I fell into the same trap and only with tons of patience on Tetsuo's part was I able to grok that the 60 seconds here are not for increasing the timeout, this is just time spent checking to ensure that the OOM wasn't the one who triggered the SIGKILL. Even if the drivers took eons it should be fine now, I tried it :D Why do we even need this with the proposed async probing changes? Ah -- well without it the way we find drivers that need this new async feature is by a bug report and folks saying their system can't boot, or they say their device doesn't come up. That's all. Tracing this to systemd and a timeout was one of the ugliest things ever. There are two insane bug reports you can go check: mptsas was the first: http://article.gmane.org/gmane.linux.kernel/1669550 https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1297248 <quote> (2) Currently systemd-udevd unconditionally sends SIGKILL upon hardcoded 30 seconds timeout. 
As a result, finit_module() of mptsas kernel module receives SIGKILL when waiting for error handler thread to be started. </quote> Hm. Why is this not a systemd-udevd bug for running around killing stuff when it has no idea whether progress is being made or not? -Mike
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On 09/04, Luis R. Rodriguez wrote: From: Luis R. Rodriguez mcg...@suse.com The new umh kill option has allowed kthreads to receive kill signals but they are generally accepting all sources of kill signals And I think this is right, while the original motivation was to enable through the OOM from sending the kill. even if the main concern was OOM. Users can provide a log output and it should be clear on the trace what probe / driver got the kill signal. Well, if you need a WARN output, perhaps you could just add WARN_ON(fatal_signal_pending()) at the end of load_module() ? Not only kthread_create() can fail if systemd sends SIGKILL. Although Oleg had rejected a similar change a while ago And honestly, I still dislike this change. Oleg.
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Fri, Sep 05, 2014 at 12:47:16AM -0700, Luis R. Rodriguez wrote: Ah -- well without it the way we find drivers that need this new async feature is by a bug report and folks saying their system can't boot, or they say their device doesn't come up. That's all. Tracing this to systemd and a timeout was one of the ugliest things ever. There are two insane bug reports you can go check: mptsas was the first: http://article.gmane.org/gmane.linux.kernel/1669550 https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1297248 Then cxgb4: https://bugzilla.novell.com/show_bug.cgi?id=877622 I only had Cc'd you on the newest gem pata_marvell : https://bugzilla.kernel.org/show_bug.cgi?id=59581 We can't seriously expect to be doing all this work for every driver. A WARN_ONCE() would enable us to find the drivers that need this new async probe feature. This whole approach of trying to mark specific drivers as needing async probing is completely broken for the problem at hand. It can't address the problem adequately while breaking backward compatibility. I don't think this makes much sense. Nacked-by: Tejun Heo t...@kernel.org Thanks. -- tejun
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Friday, September 05, 2014 11:12:41 PM Tejun Heo wrote: On Fri, Sep 05, 2014 at 12:47:16AM -0700, Luis R. Rodriguez wrote: Ah -- well without it the way we find drivers that need this new async feature is by a bug report and folks saying their system can't boot, or they say their device doesn't come up. That's all. Tracing this to systemd and a timeout was one of the ugliest things ever. There are two insane bug reports you can go check: mptsas was the first: http://article.gmane.org/gmane.linux.kernel/1669550 https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1297248 Then cxgb4: https://bugzilla.novell.com/show_bug.cgi?id=877622 I only had Cc'd you on the newest gem pata_marvell : https://bugzilla.kernel.org/show_bug.cgi?id=59581 We can't seriously expect to be doing all this work for every driver. A WARN_ONCE() would enable us to find the drivers that need this new async probe feature. This whole approach of trying to mark specific drivers as needing async probing is completely broken for the problem at hand. It can't address the problem adequately while breaking backward compatibility. I don't think this makes much sense. Which problem are we talking about here though? It does solve the slow device stalling the rest of the kernel booting (non-module case) for me. I also reject the notion that anyone should be relying on drivers to be fully bound on module loading. It is not the nineties anymore. We have hot pluggable buses, deferred probing, and even for not hot-pluggable ones the module providing the device itself might not be yet loaded. Any scripts that expect to find device 100% ready after module loading are simply broken. Thanks. -- Dmitry
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Fri, Sep 05, 2014 at 12:59:49PM +0200, Oleg Nesterov wrote: On 09/04, Luis R. Rodriguez wrote: From: Luis R. Rodriguez mcg...@suse.com The new umh kill option has allowed kthreads to receive kill signals but they are generally accepting all sources of kill signals And I think this is right, while the original motivation was to enable through the OOM from sending the kill. even if the main concern was OOM. Users can provide a log output and it should be clear on the trace what probe / driver got the kill signal. Well, if you need a WARN output, perhaps you could just add WARN_ON(fatal_signal_pending()) at the end of load_module() ? We could and that's a good idea, thanks! This however would at least allow the device to be functional in the case the kill was received during kthread usage, but it would certainly also set precedents for doing similar things in the kernel which I do agree is hacky. If we had upstream at least WARN_ON(fatal_signal_pending()) as you note then I think it would at least be a reasonable compromise. Not only kthread_create() can fail if systemd sends SIGKILL. Sure, although it's currently the only source found and debugged. Although Oleg had rejected a similar change a while ago And honestly, I still dislike this change. Don't blame you. The code is sensitive and hacky. Luis
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
Hello, On Fri, Sep 05, 2014 at 09:44:05AM -0700, Dmitry Torokhov wrote: Which problem are we talking about here though? It does solve the slow device stalling the rest of the kernel booting (non-module case) for me. The other one. The one with timeout. Neither cxgb4 nor pata_marvell has the slow probing stalling boot problem. I also reject the notion that anyone should be relying on drivers to be fully bound on module loading. It is not the nineties anymore. We have hot pluggable buses, deferred probing, and even for not hot-pluggable ones the module providing the device itself might not be yet loaded. Any scripts that expect to find device 100% ready after module loading are simply broken. We've been treating loading + probing as a single operation when loading drivers and the assumption has always been that the existing devices at the time of loading finished probing by the time insmod finishes. We now need to split loading and probing and wait for each of them differently. The *only* thing we can do is somehow make the issuer specify that it's gonna wait for probing separately. I'm not sure this can even be up for discussion. We're talking about a major userland visible behavior change. We simply can't change it underneath the existing users. Thanks. -- tejun
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Fri, Sep 5, 2014 at 10:49 AM, Tejun Heo t...@kernel.org wrote: Hello, On Fri, Sep 05, 2014 at 09:44:05AM -0700, Dmitry Torokhov wrote: Which problem are we talking about here though? It does solve the slow device stalling the rest of the kernel booting (non-module case) for me. The other one. The one with timeout. Neither cxgb4 nor pata_marvell has the slow probing stalling boot problem. I also reject the notion that anyone should be relying on drivers to be fully bound on module loading. It is not the nineties anymore. We have hot pluggable buses, deferred probing, and even for not hot-pluggable ones the module providing the device itself might not be yet loaded. Any scripts that expect to find device 100% ready after module loading are simply broken. We've been treating loading + probing as a single operation when loading drivers and the assumption has always been that the existing devices at the time of loading finished probing by the time insmod finishes. We now need to split loading and probing and wait for each of them differently. The *only* thing we can do is somehow make the issuer specify that it's gonna wait for probing separately. I'm not sure this can even be up for discussion. We're talking about a major userland visible behavior change. We simply can't change it underneath the existing users. Meanwhile we are allowing a major design consideration such as a 30 second timeout for both init + probe to all of a sudden become a hard requirement for device drivers. I see your point but we also can't be introducing major design changes willy-nilly. We *need* a solution for the affected drivers. Also what stops drivers from going ahead and just implementing their own async probe? Would that now be frowned upon as it strays away from the original design? 
The bool would let those drivers do this easily, and we would still need to identify these drivers, although this particular change can be NAK'd, Oleg's suggestion of WARN_ON(fatal_signal_pending()) at the end of load_module() seems to me at least needed. And if it's not async probe... what do those with failed drivers do? Luis
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Sat, Sep 06, 2014 at 07:29:56AM +0900, Tejun Heo wrote: It is for storage devices which always have guaranteed synchronous probing on module load and well-defined probing order. Sure, modern setups are a lot more dynamic but I'm quite certain that there are setups in the wild which depend on storage driver loading being synchronous. We can't simply declare one day that such behavior is broken and break, most likely, their boots. To add a bit, if the argument here is that dependency on such behavior shouldn't exist and module loading and device probing should always be asynchronous, the right approach is implementing synchronous_probing flag not the other way around. I actually wouldn't hate to see that change happening but whoever submits and routes such a change should be ready for a major shitstorm, I'm afraid. Thanks. -- tejun
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
Hello, Luis. On Fri, Sep 05, 2014 at 11:12:17AM -0700, Luis R. Rodriguez wrote: Meanwhile we are allowing a major design consideration such as a 30 second timeout for both init + probe to all of a sudden become a hard requirement for device drivers. I see your point but we also can't be introducing major design changes willy-nilly. We *need* a solution for the affected drivers. Yes, make the behavior specifically specified from userland. When did I ever say that there should be no solution for the problem? I've been saying that the behavior should be selected from userland from the get-go, haven't I? I have no idea how the selection should be. It could be per-insmod or maybe just a system-wide flag with explicit exceptions marked on drivers is good enough. I don't know. Also what stops drivers from going ahead and just implementing their own async probe? Would that now be frowned upon as it strays away The drivers can't. How many times should I explain the same thing over and over again? libata can't simply make probing asynchronous w.r.t. module loading no matter how it does it. Yeah, sure, there can be other drivers which can do that without most people noticing it but a storage driver isn't one of them and the storage drivers are the problematic ones already, right? from the original design? The bool would let those drivers do this easily, and we would still need to identify these drivers, although this particular change can be NAK'd, Oleg's suggestion of WARN_ON(fatal_signal_pending()) at the end of load_module() seems to me at least needed. And if it's not async probe... what do those with failed drivers do? I'm getting tired of explaining the same thing over and over again. The said change was nacked because the whole approach of let's see which drivers get reported on the issue which exists basically for all drivers and just change the behavior of them is braindead. It makes no sense whatsoever. 
It doesn't address the root cause of the problem while making the same class of drivers behave significantly differently for no good reason. Please stop chasing your own tail and try to understand the larger picture. Thanks. -- tejun
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On 9/5/2014 3:29 PM, Tejun Heo wrote: Hello, Dmitry. On Fri, Sep 05, 2014 at 11:10:03AM -0700, Dmitry Torokhov wrote: I do not agree that it is actually user-visible change: generally speaking you do not really know if device is there or not. They come and go. Like I said, consider all permutations, with hot-pluggable buses, deferred probing, etc, It is for storage devices which always have guaranteed synchronous probing on module load and well-defined probing order. Sure, modern setups are a lot more dynamic but I'm quite certain that there are setups in the wild which depend on storage driver loading being synchronous. We can't simply declare one day that such behavior is broken and break, most likely, their boots. we even depend on this in the mount-by-label cases: many setups assume that the internal storage prevails over the USB stick in the case of conflicts. it's a security issue; you don't want the built in secure bootloader that has a kernel root argument by label/uuid. the security there tends to assume that built-in wins over USB.
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On Fri, Sep 05, 2014 at 03:45:08PM -0700, Arjan van de Ven wrote: On 9/5/2014 3:29 PM, Tejun Heo wrote: Hello, Dmitry. On Fri, Sep 05, 2014 at 11:10:03AM -0700, Dmitry Torokhov wrote: I do not agree that it is actually user-visible change: generally speaking you do not really know if device is there or not. They come and go. Like I said, consider all permutations, with hot-pluggable buses, deferred probing, etc, It is for storage devices which always have guaranteed synchronous probing on module load and well-defined probing order. Sure, modern setups are a lot more dynamic but I'm quite certain that there are setups in the wild which depend on storage driver loading being synchronous. We can't simply declare one day that such behavior is broken and break, most likely, their boots. we even depend on this in the mount-by-label cases: many setups assume that the internal storage prevails over the USB stick in the case of conflicts. it's a security issue; you don't want the built in secure bootloader that has a kernel root argument by label/uuid. the security there tends to assume that built-in wins over USB. Ahem... and are they sure it works reliably with large storage arrays? With SCSI doing probing asynchronously already? Thanks. -- Dmitry
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
Hello, Dmitry. On Fri, Sep 05, 2014 at 03:49:17PM -0700, Dmitry Torokhov wrote: On Sat, Sep 06, 2014 at 07:31:39AM +0900, Tejun Heo wrote: On Sat, Sep 06, 2014 at 07:29:56AM +0900, Tejun Heo wrote: It is for storage devices which always have guaranteed synchronous probing on module load and well-defined probing order. Agree about probing order (IIRC that is why we had to revert the wholesale asynchronous probing a few years back) but totally disagree about synchronous module loading. I don't get it. This is a behavior userland already depends on for boots. What's there to agree or disagree? This is just a fact that we can't do this w/o disturbing some userlands in a major way. Anyway, I just posted a patch that I think preserves module loading behavior and solves my issue with built-in modules. It does not help Luis' issue though (but then I think the main problem is with systemd being stupid there). This sure can be worked around from userland side too by not imposing any timeout on module loading but that said for the same reasons that you've been arguing until now, I actually do think that it's kinda silly to make device probing synchronous to module loading at this time and age. What we disagree on is not that we want to separate those waits. It is about how to achieve it. To add a bit, if the argument here is that dependency on such behavior shouldn't exist and module loading and device probing should always be asynchronous, the right approach is implementing synchronous_probing flag not the other way around. I actually wouldn't hate to see that change happening but whoever submits and routes such a change should be ready for a major shitstorm, I'm afraid. I think we already had this storm and that is why here we have opt-in behavior for the drivers. It's a different shitstorm where we actively break bootings on some userlands. Trust me. That's gonna be a lot worse. Thanks. 
-- tejun
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
Hello, On Fri, Sep 05, 2014 at 03:52:48PM -0700, Dmitry Torokhov wrote: Ahem... and are they sure it works reliably with large storage arrays? With SCSI doing probing asynchronously already? I believe this has been mentioned before too but, yes, SCSI device probing is asynchronous and parallelized but the registration of the discovered devices is fully serialized according to driver attach order. Storage devices are probed in parallel and attached in a fully deterministic order. That part has never changed. Thanks. -- tejun
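The "probe in parallel, register in a fixed order" behaviour Tejun describes can be modelled in a few lines. This is a userspace sketch of the general idea only, not how the SCSI midlayer implements it: `struct ordered_reg`, `probe_finished()` and `MAX_SLOTS` are hypothetical names, and a slot index stands in for a device's position in driver attach order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SLOTS 16

/* Tracks which slots have finished probing and which are registered. */
struct ordered_reg {
    bool   probed[MAX_SLOTS]; /* slot has finished probing          */
    size_t nslots;            /* number of slots in attach order    */
    size_t next;              /* lowest slot not yet registered     */
};

/*
 * Record that `slot` finished probing (in whatever order probes
 * happen to complete) and write to `out` every slot that may now be
 * registered, always in ascending slot order. Returns how many slots
 * were written. A slot is held back until all earlier ones are done.
 */
size_t probe_finished(struct ordered_reg *r, size_t slot, size_t *out)
{
    size_t n = 0;

    assert(r != NULL && out != NULL);
    assert(r->nslots <= MAX_SLOTS && slot < r->nslots);

    r->probed[slot] = true;
    while (r->next < r->nslots && r->probed[r->next])
        out[n++] = r->next++;
    return n;
}
```

However the probes interleave, the sequence of registered slots is always 0, 1, 2, ..., so names derived from registration order stay stable across boots while the slow probing work still overlaps.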
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
On 9/5/2014 3:52 PM, Dmitry Torokhov wrote: On Fri, Sep 05, 2014 at 03:45:08PM -0700, Arjan van de Ven wrote: On 9/5/2014 3:29 PM, Tejun Heo wrote: Hello, Dmitry. On Fri, Sep 05, 2014 at 11:10:03AM -0700, Dmitry Torokhov wrote: I do not agree that it is actually user-visible change: generally speaking you do not really know if device is there or not. They come and go. Like I said, consider all permutations, with hot-pluggable buses, deferred probing, etc, It is for storage devices which always have guaranteed synchronous probing on module load and well-defined probing order. Sure, modern setups are a lot more dynamic but I'm quite certain that there are setups in the wild which depend on storage driver loading being synchronous. We can't simply declare one day that such behavior is broken and break, most likely, their boots. we even depend on this in the mount-by-label cases: many setups assume that the internal storage prevails over the USB stick in the case of conflicts. it's a security issue; you don't want the built in secure bootloader that has a kernel root argument by label/uuid. the security there tends to assume that built-in wins over USB. Ahem... and are they sure it works reliably with large storage arrays? With SCSI doing probing asynchronously already? you tend to trust your large storage array; you tend to not trust the walk-up USB stick.
Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
Hey, On Fri, Sep 05, 2014 at 04:22:42PM -0700, Dmitry Torokhov wrote: I don't get it. This is a behavior userland already depends on for boots. What's there to agree or disagree? This is just a fact that we can't do this w/o disturbing some userlands in a major way. I am just expressing my disbelief that somebody relies on module loading being synchronous with probing. Out of curiosity, do you have any pointers? I've seen initrd scripts which depended on the behavior to wait for storage devices over the years. AFAIK, none of the modern distros does it but this has been such a basic feature all along and it seems highly unlikely to me that there's no userland remaining out there depending on such behavior. We do have a lot of different userlands, many of them quite ad-hoc. Thanks. -- tejun