OK, now I'm replying to myself a second time. Really sorry. I realized we don't need any enhancement to sampling other than getting the non-check for Isolate locked by debugged (suspended) thread fixed in Windows. For sampling when Isolate is unlocked I can simply get time before and after an unlock on a profiled thread, do a StackTrace::CurrentStackTrace() and use that information to merge with the CpuProfiler::StopProfiling results, converting timer units to ticks. Not exactly rocket science.
So the simple question then is should I do a PR to fix the sampling issue on Windows? Thanks and sorry about all the noise. On Sunday, August 2, 2020 at 3:04:03 PM UTC-5, Alex Kodat wrote: > > We're using the CpuProfiler in an Isolate that can have multiple threads > running and and taking turns in the Isolate via Unlocker. Frequently, we > want to profile what's happening on a thread so we do a StartProfiling for > a thread, let it run for a while, and then collect the results via > StopProfiling. Works great. > > However, I've noticed a difference between Windows and Linux in the > profiles, specifically, Windows collects samples for a thread when it's > inside an Unlocker bracket whereas Linux does not. So if we have a native > function sleep(), that unlocks the Isolate, in Windows I see a high hit > count for stacks that have functionName "sleep" and sourceType callback in > the bottom stack entry. In Linux, there are no hits where I'm inside the > sleep native function. For our purposes, the Windows behavior is much nicer > as we can easily see how much of the thread's time is being spent in sleep. > In Linux, there is absolutely no way of knowing because the information is > not captured. > > Unfortunately, Windows nicer (IMO) behavior comes at a cost. Specifically, > if indeed other threads do run in the Isolate while the profiled thread is > sampled when unlocked, there is a tendency for core dumps while getting the > stack trace. This is unsurprising because the stack trace needs to look in > the Isolate's heap to get things like function names and that's not really > a good idea when a thread doesn't have the Isolate lock. > > The following code samples are all from src/libsampler/sampler.cc. > > Under Linux the profiler protects itself in SamplerManager::DoSample > (called by the SIGPROF handler) with: > > for (Sampler* sampler : samplers) { > ... > if (v8::Locker::IsActive() && !Locker::IsLocked(isolate)) continue; > sampler->SampleStack(state); > } > > That Locker::isLocked is a bit of a misnomer and really means > Locker::isLockedByCurrentThread and, because our SIGPROF handler runs on > the profiled thread, if that tread is in an Unlocker bracket in a native > function, isLocked returns false and we don't collect a sample. > > In Windows, the sampler does not run on the profiled thread and, instead, > does a SuspendThread of the profiled thread from another thread and then > uses GetThreadContext: > > void Sampler::DoSample() { > HANDLE profiled_thread = platform_data()->profiled_thread(); > if (profiled_thread == nullptr) return; > > const DWORD kSuspendFailed = static_cast<DWORD>(-1); > if (SuspendThread(profiled_thread) == kSuspendFailed) return; > > // Context used for sampling the register state of the profiled thread. > CONTEXT context; > memset(&context, 0, sizeof(context)); > context.ContextFlags = CONTEXT_FULL; > if (GetThreadContext(profiled_thread, &context) != 0) { > ... > SampleStack(state); > } > > There is no check if the Isolate is actually locked by the profiled thread > so we can get a sample for it but also, unfortunately, seg faults and the > like if another thread is running. > > So it seems that there really should be a check in Sampler::DoSample for > Windows to check if the Isolate is locked by the profiled thread. This > would require maybe adding another method to Locker like > isLockedByThread(ThreadId id) or something like that and then checking that > in Sampler::DoSample. When PlatformData is instantiated for Windows it > could stash the ThreadId for the current thread so it could be retrieved in > Sampler::DoSample. Doesn't seem all that daunting and am willing to put > together a PR for this if this seems reasonable and unless someone more > conversant with V8 internals doesn't pick it up. > > However, while this fix would eliminate Windows seg faults when doing > profiling in a multi-threaded environment, it would also make me sad > because now we would have no way of getting time spent unlocked in my > native function. To provide this functionality, it seems like it would be > nice to be able to create a provisional TickSample before the Isolate is > unlocked in a native function and use that in DoSample if the Isolate is > not locked by the profiled thread. Obviously, this would/should only be > done if the thread is being profiled. > > There are a lot of ways this could work but a straw man is that there > could be CpuProfiler methods called SaveProvisionalThreadSample and > ClearProvisionalThreadSample. The former, at least, would have to be called > while one holds the Isolate lock so one would presumably do such a call > right before Unlocker instantiation and then do a > ClearProvisionalThreadSample call right after the end of the Unlocker > context. Then, if DoSample decides that Isolate is not locked by the > sampled thread, it could look for a saved provisional sample for the thread > and, if available, use that for its sample. Note that a provisional > TickSample could be used many times as a thread might be unlocked over many > samples. > > Again, assuming no one else picks this up, happy to take a swing at a PR, > myself. While more complicated than the Windows fix, it doesn't seem that > daunting. > > Opinions? > > Sorry if this would have been more appropriately posted on v8-dev. > > Thanks > > -- -- v8-users mailing list v8-users@googlegroups.com http://groups.google.com/group/v8-users --- You received this message because you are subscribed to the Google Groups "v8-users" group. To unsubscribe from this group and stop receiving emails from it, send an email to v8-users+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/v8-users/1bd503d6-ddec-4dca-9c76-55877b84e283o%40googlegroups.com.