This is awesome. Thanks for taking the time to dive in and solve this one.
We do indeed have some high-turnover postgres systems that are impossible to effectively trace due to this issue. I'm excited to get this into a production kernel over here... in due time :-)

On Sat, Jul 2, 2011 at 2:27 PM, Bryan Cantrill <br...@joyent.com> wrote:
> All,
>
> A longstanding problem that we have had is that enablings on defunct
> providers (e.g., USDT probes on dead processes) are not reaped: the
> probes will exist as long as there exists an enabling for them. When
> processes are turning over frequently (or when enablings are
> long-running), this can clog up the probe space to the point that
> DTrace probe creation will silently fail (an absolutely maddening
> failure mode). This has been hit several times over the years (we
> were nailed by it on our build machines at Fishworks) -- so when Theo
> Schlossnagle mentioned to me that he was getting killed by this
> problem in an environment with rapidly turning over Postgres
> processes, I was embarrassed that I hadn't tackled it earlier. As it
> turns out, it was a tad thorny for locking reasons, but a patch for
> this problem is attached. We have integrated this into our bits at
> Joyent (internal ticket is OS-454, "enablings on defunct providers
> prevent providers from unregistering"), so you'll see this show up
> soon at http://github.com/joyent/illumos-joyent -- but I wanted to
> give everyone here a heads-up.
>
> Anyway, patch is attached, with my thanks to Adam for a helpful
> discussion on fasttrap's asynchronous provider retiring mechanics.
> Note that Adam hasn't (yet) reviewed this, and its integration
> upstream should wait until he's had a chance to look it over. Please
> let me know if you have any questions or comments!
>
> Thanks,
> Bryan
>

--
Theo Schlossnagle
http://omniti.com/is/theo-schlossnagle

_______________________________________________
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org