On Sat, Mar 02, 2024 at 08:53:54PM +0100, Max Boone wrote:
> 
> Thank you so much for the quick reply!
> 
> I haven't filed a bug with Debian specifically, as I'm running the Linux
> kernel built and provided by Microsoft, with Ubuntu as the OS on top. If it
> helps with the search I'd gladly run Debian and file a bug there, but I will
> still need to build my own kernel, as WSL requires some modules (such as
> HyperV storage and sockets) to be built into the kernel (=y) rather than as
> loadable modules (=m).

Ah, if you built your own kernel, then you are your own distro as far
as kernel issues are concerned.  ;-)

                                                        Thanx, Paul

> I'll stick to using the rcu list from here on to avoid spam, thanks again!
>
> On Saturday, March 02, 2024 20:43 CET, "Paul E. McKenney" 
> <[email protected]> wrote:
>  [ Adding Boqun and the rcu list on CC. ]
> 
> On Sat, Mar 02, 2024 at 07:59:08PM +0100, Max Boone wrote:
> >
> > Dear Dr. McKenney,
> >
> > For a couple of years now I've been the sometimes frustrated owner of a
> > Microsoft Surface Pro X ARM64 device. It has been getting progressively
> > better as more vendors start targeting their builds at ARM64 architectures,
> > but since the introduction of the device there have been issues with the
> > Windows Subsystem for Linux (no more than an opinionated Hyper-V VM with
> > extensive tooling) locking up and hanging.
> >
> > When this happens, traces like the following are dumped in the kernel 
> > messages:
> > https://github.com/microsoft/WSL/issues/9454#issuecomment-1942222109
> >
> > In your talk "Decoding Those Inscrutable RCU CPU Stall Warnings" you
> > mentioned that one can feel free to reach out when bumping into such
> > issues. Building other kernel releases, switching modules off and on, and
> > playing with the RCU grace period times have so far not worked for me (or
> > for others in that thread).
> >
> > Anyways, I don't really know where to start looking and the call stacks 
> > aren't very informative (to my eye) either. I'm hoping you might help me 
> > find the direction to look for the root of this problem.
> 
> I am assuming that you have filed a bug with the Debian folks, and before
> doing that, searched for similar bug reports.
> 
> At first glance, this is because things were stuck here:
> 
> [ 967.115632] clear_rseq_cs.isra.0+0x4c/0x60
> [ 967.116433] do_notify_resume+0xf8/0xeb0
> [ 967.116960] el0_svc+0x3c/0x50
> [ 967.117537] el0t_64_sync_handler+0x9c/0x120
> [ 967.118323] el0t_64_sync+0x158/0x15c
> 
> So including these function names (clear_rseq_cs() and so on) in your
> search for similar bug reports would be a good idea.
> 
> I am unfamiliar with that code.
> 
> So I added Boqun because he works with Linux on HyperV as part of his
> day job and has a great deal of experience with RCU. He will likely
> have quite a number of questions for you including exact versions,
> Debian bug number, the results of your web search, and so on. He might
> also know an ARM person to get involved in this.
> 
> Or maybe he knows the solution off the top of his head!
> 
> Thanx, Paul
> 
> 
>  
