The true nature and state of those tests lie far deeper than almost anyone ever scratches with their trowel. To really take a peek, you have to do, at minimum, something like set up a Jenkins farm of half a dozen to a dozen machines with specs ranging from low to high, randomize parallel overlap and test order, and actually shake the Jenga tower to see what falls out.
That will expose a real view rather than a narrow slit onto a shifting, opaque structure that only looks relatively balanced from one viewpoint. Out of practical necessity, many projects just keep narrowing that slit and put more effort into keeping the structure balanced as far as that view reaches. That can go as far as: run in a known Docker environment, minimize disturbances, do test recording with light parallelism at most, and even treat a single master Jenkins run as the real deal. Developers, your luck will vary; unless you feel like adventuring, you'll have an easier time letting the source-of-truth Jenkins instance dictate what fails and what doesn't.
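
For anyone who does want to go adventuring, here is a rough sketch of the "randomize test order and see what falls out" part only. It does not cover parallel overlap or machines with different specs. Everything in it is made up for illustration: `shuffled_order`, `shake`, the shared `state` dict and the two toy tests are hypothetical names, not part of Jenkins or any test framework. The idea is just to rerun the same tests under several seeded orders and note which seeds produce failures, so a bad order can be reproduced later.

```python
import random
from typing import Callable, Dict, List

# Hypothetical shared state that the toy tests below leak through.
state: Dict[str, bool] = {}


def reset_state() -> None:
    """Clear shared state before each shuffled run."""
    state.clear()


def test_write() -> None:
    # Leaves something behind that another test silently relies on.
    state["ready"] = True


def test_read() -> None:
    # Only passes if test_write happened to run first.
    assert state.get("ready"), "depends on test_write having run"


def shuffled_order(names: List[str], seed: int) -> List[str]:
    """Return a reproducible shuffled copy of the test names for a seed."""
    order = sorted(names)
    random.Random(seed).shuffle(order)
    return order


def shake(
    tests: Dict[str, Callable[[], None]],
    setup: Callable[[], None],
    seeds: List[int],
) -> Dict[int, List[str]]:
    """Run all tests once per seed in a shuffled order.

    Returns a mapping of seed -> names of tests that failed under that order.
    Seeds whose order passed cleanly are left out.
    """
    failures: Dict[int, List[str]] = {}
    for seed in seeds:
        setup()
        failed = []
        for name in shuffled_order(list(tests), seed):
            try:
                tests[name]()
            except AssertionError:
                failed.append(name)
        if failed:
            failures[seed] = failed
    return failures


TESTS = {"test_write": test_write, "test_read": test_read}

if __name__ == "__main__":
    for seed, failed in shake(TESTS, reset_state, list(range(10))).items():
        print(f"seed {seed}: {failed} failed in order {shuffled_order(list(TESTS), seed)}")
```

Because each order is derived from its seed, any seed that shows up in the result can be rerun to get the exact same order back, which is most of what makes an order-dependent failure debuggable at all.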
