The analysis of the failure is in and it is interesting:

The problem was caused by a null pointer dereference in the kernel.
The null pointer came from a piece of "pcode" that is loaded and executed
inside the kernel driver.
The pcode file was all zeros.
When the pcode was loaded, it was run, and voilà! BSOD.
The fix was to remove the offending pcode file.

Much of this could fall under the category of "sh&^%t happens," but I
think there are three fundamental mistakes that show CrowdStrike was
incompetent and negligent.

Thoughts:
(1) loading pcode into a kernel driver. Are you kidding me?

(2) loading pcode (in any environment) without basic sanity checks
(checksum, structural verification, etc.) is total incompetence. This is a
disaster waiting to happen: even a little bit-rot could create a problem
that would be difficult to diagnose and fix.

(3) Unstaged rollout: amateur hour nonsense.
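
To make (2) concrete, here is a rough sketch of the kind of check I mean.
The file layout is entirely made up for illustration (a "PCOD" magic, a
payload length, and a CRC-32 in a 12-byte header); I have no idea what
CrowdStrike's real format looks like, and pack_blob/validate_blob are
hypothetical names. The point is only that an all-zeros file gets rejected
before anything tries to run it.

```python
import struct
import zlib

# Hypothetical layout, assumed for illustration only:
# 4-byte magic, 4-byte payload length, 4-byte CRC-32 of the payload.
MAGIC = b"PCOD"
HEADER = struct.Struct("<4sII")


def pack_blob(payload: bytes) -> bytes:
    """Build a blob in the assumed format: header followed by payload."""
    return HEADER.pack(MAGIC, len(payload), zlib.crc32(payload)) + payload


def validate_blob(blob: bytes) -> bytes:
    """Return the payload if every check passes, otherwise raise ValueError."""
    if len(blob) < HEADER.size:
        raise ValueError("blob shorter than header")
    magic, length, crc = HEADER.unpack_from(blob)
    # Structural checks: an all-zeros file fails right here.
    if magic != MAGIC:
        raise ValueError("bad magic")
    payload = blob[HEADER.size:]
    if length == 0 or length != len(payload):
        raise ValueError("length mismatch")
    # Integrity check: catches truncation repairs and bit-rot in the payload.
    if zlib.crc32(payload) != crc:
        raise ValueError("checksum mismatch")
    return payload
```

With this, validate_blob(bytes(4096)) raises ValueError instead of handing
garbage to the interpreter. Note that the magic and nonzero-length checks
matter: the CRC-32 of an empty payload is 0, so a checksum alone would not
catch a file of zeros. A CRC only guards against accidental corruption; it
says nothing about tampering, which would need a signature.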






_______________________________________________
Discuss mailing list
Discuss@driftwood.blu.org
https://driftwood.blu.org/mailman/listinfo/discuss
