On Sat, Oct 12, 2019 at 7:56 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
> This matches up with the intermittent infinite_recurse failures
> we've been seeing in the buildfarm.  Those are happening across
> a range of systems, but they're (almost) all Linux-based ppc64,
> suggesting that there's a longstanding arch-specific kernel bug
> involved.  For reference, I scraped the attached list of such
> failures in the last three months.  I wonder whether we can get
> the attention of any kernel hackers about that.

Yeah, I don't know anything about this stuff, but I was also beginning
to wonder whether something is busted in the arch-specific fault.c code
that checks whether a stack expansion is valid[1], in a way that fails
with a rapidly growing stack, well-timed incoming signals, and perhaps
Docker/LXC (that's what Mark's systems run, IIUC; I'm not sure about the
ARM boxes that failed, or whether it could be relevant here).  Perhaps
the arbitrary tolerances mentioned in the comment there are relevant.

[1] https://github.com/torvalds/linux/blob/master/arch/powerpc/mm/fault.c#L244