Hi,

On Thu, 29 May 2025 at 02:32, Andres Freund <and...@anarazel.de> wrote:
> On 2025-05-28 22:51:14 +0930, Robins Tharakan wrote:
> > Recently leafhopper failed again on the same test. For now I've paused it.
> >
> > To rule out the compiler (and its maturity on the architecture), I'll
> > upgrade gcc (to nightly, or something more recent) and then re-enable
> > to see if it changes anything.
>
> +1 to a gcc upgrade, gcc 11 is rather old and out of upstream support.

Ack. I've updated leafhopper to gcc master. For now (to get the machine
green / running), I've disabled some flags, which I'll revisit in some
time, but hopefully that's not about compiler maturity - which is what
I'm after here.

> A kernel upgrade would be good too. My completely baseless gut feeling is
> that some SIMD registers occasionally get corrupted, e.g. due to a kernel
> interrupt / context switch not properly storing & restoring them. Weirdly
> enough the instrumentation code is among the pieces of PG code most
> vulnerable to that because we mostly don't do enough auto-vectorizable
> math, but InstrEndLoop(), InstrStopNode() etc are trivially
> auto-vectorizable. I'm pretty sure I've previously analyzed problems
> around this, but don't remember the details (IA64 maybe?).

Fair point, I'll keep that option open. Originally, the machine was spun
up to evaluate the graviton4 ec2 instance and I'd like to explore whether
the stock-kernel / kernel-updates are able to keep the instance green
(and resort to updating the kernel only if I exhaust all other options -
pg / compiler etc.).

- robins