Re: [PATCH v2 0/2] gitlab-ci.yml: Add jobs to test CFI
On Tue, Mar 02, 2021 at 04:01:17PM -0500, Daniele Buono wrote: > On 3/2/2021 11:40 AM, Daniel P. Berrangé wrote: > > The CFI protection is something I'd say is relevant to virtualization > > use cases, not to emulation use cases > > > > https://qemu-project.gitlab.io/qemu/system/security.html > > > > IOW, the targets that are important to test are the ones where KVM > > is available. > > > > So that's s390x, ppc, x86, mips, and arm. > > > > I think we can probably ignore mips as that's fairly niche. > > We can also reasonably limit ourselves to only test the 64-bit > > variants of the target, on the basis that 32-bit is increasingly > > legacy/niche too. > > > > So that gives us ppc64le, x86_64, aarch64 and s390x as the > > targets we should get CI coverage for CFI. > > Thanks Daniel, > I'll start working on a V3 that only contains those 4 targets, probably in > two sets of build/check/acceptance to maintain the jobs below the hour mark. > > These would still be x86 binaries that are not testing KVM, however, > because of the capabilities of the shared gitlab runners. Yes, that's fine. > I see that there's some work from Cleber Rosa to allow running custom jobs > on aarch64 and s390x systems. I think that, when the infrastructure is > ready, having a KVM-based CFI test there would help a lot in terms of > coverage for those architectures. Yep, that should be possible. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH v2 0/2] gitlab-ci.yml: Add jobs to test CFI
On 3/2/2021 11:40 AM, Daniel P. Berrangé wrote: The CFI protection is something I'd say is relevant to virtualization use cases, not to emulation use cases https://qemu-project.gitlab.io/qemu/system/security.html IOW, the targets that are important to test are the ones where KVM is available. So that's s390x, ppc, x86, mips, and arm. I think we can probably ignore mips as that's fairly niche. We can also reasonably limit ourselves to only test the 64-bit variants of the target, on the basis that 32-bit is increasingly legacy/niche too. So that gives us ppc64le, x86_64, aarch64 and s390x as the targets we should get CI coverage for CFI. Thanks Daniel, I'll start working on a V3 that only contains those 4 targets, probably in two sets of build/check/acceptance to maintain the jobs below the hour mark. These would still be x86 binaries that are not testing KVM, however, because of the capabilities of the shared gitlab runners. I see that there's some work from Cleber Rosa to allow running custom jobs on aarch64 and s390x systems. I think that, when the infrastructure is ready, having a KVM-based CFI test there would help a lot in terms of coverage for those architectures.
Re: [PATCH v2 0/2] gitlab-ci.yml: Add jobs to test CFI
On Tue, Mar 02, 2021 at 11:31:54AM -0500, Daniele Buono wrote: > > On 3/2/2021 10:38 AM, Daniel P. Berrangé wrote: > > Is this scenario going to upset CFI, or is it happy that 'void *' > > is compatible with 'mytype *', and ok with the intermediate casts > > to/from GCallback ? > > This is a valid scenario. LLVM does offer the ability of considering all > pointer types compatible, and it is being enabled in QEMU. So void* is > compatible to any type* and that would not be considered a fault. Ok that's good. > Intermediate casts are also fine since you are just passing the pointer but > not using it. The check will happen only when the function is called, at > which point it was cast back to something compatible. Makes sense. So in general, it sounds like breadth of test coverage is fairly important for the CFI jobs, at least if we're exercising different areas of functionality. So I think we do need to be testing more than just one architecture target. The CFI protection is something I'd say is relevant to virtualization use cases, not to emulation use cases https://qemu-project.gitlab.io/qemu/system/security.html IOW, the targets that are important to test are the ones where KVM is available. So that's s390x, ppc, x86, mips, and arm. I think we can probably ignore mips as that's fairly niche. We can also reasonably limit ourselves to only test the 64-bit variants of the target, on the basis that 32-bit is increasingly legacy/niche too. So that gives us ppc64le, x86_64, aarch64 and s390x as the targets we should get CI coverage for CFI. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH v2 0/2] gitlab-ci.yml: Add jobs to test CFI
On 3/2/2021 10:38 AM, Daniel P. Berrangé wrote: Is this scenario going to upset CFI, or is it happy that 'void *' is compatible with 'mytype *', and ok with the intermediate casts to/from GCallback ? This is a valid scenario. LLVM does offer the ability of considering all pointer types compatible, and it is being enabled in QEMU. So void* is compatible to any type* and that would not be considered a fault. Intermediate casts are also fine since you are just passing the pointer but not using it. The check will happen only when the function is called, at which point it was cast back to something compatible.
Re: [PATCH v2 0/2] gitlab-ci.yml: Add jobs to test CFI
On Tue, Mar 02, 2021 at 08:18:03AM -0500, Daniele Buono wrote: > On 3/2/2021 5:30 AM, Daniel P. Berrangé wrote: > > On Mon, Mar 01, 2021 at 03:39:42PM -0500, Daniele Buono wrote: > > > Hi Daniel, > > > > > > On 3/1/2021 10:08 AM, Daniel P. Berrangé wrote: > > > > What are the unique failure scenarios for CFI that these jobs are > > > > likely to expose ? Is it likely that we'll have cases where > > > > CFI succeeds in say, x86_64 target, but fails in aarch64 target ? > > > For CFI to fail (even if it shouldn't) you'll need code that is calling a > > > function pointer that was not well defined at compile time. Although > > > unlikely, that could happen everywhere in the code. > > What does "was not well defined" mean here ? > > > > At high level, the compiler creates metadata for every function. Before > jumping to a function pointer, it makes sure that the pointer and the > pointee have matching types. > Not well defined means one of these two cases: > 1. The function has a different type than the pointer -> Most likely an > error How strictly is this checked ? With GLib function prototype mismatch is not uncommon. For example GLib might need to invoke a callback with a signature: int foo(int somearg, void *opaque); The API though will often declare the callback signature to be generic void (*GCallback) (void); The caller will implement a callback with int foo(int somearg, mytype *mydata); and will use G_CALLBACK() to do an intentional bad cast to GCallback Before it invokes the callback, GLib would cast from GCallback back toint foo(int somearg, void *opaque); Notice this last arg doesn't match the type of the actual implemented callback. Is this scenario going to upset CFI, or is it happy that 'void *' is compatible with 'mytype *', and ok with the intermediate casts to/from GCallback ? > 2. The function was not available at compile time so the compiler could > not create the related metadata -> Most likely a false positive. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH v2 0/2] gitlab-ci.yml: Add jobs to test CFI
On 3/2/2021 5:30 AM, Daniel P. Berrangé wrote: On Mon, Mar 01, 2021 at 03:39:42PM -0500, Daniele Buono wrote: Hi Daniel, On 3/1/2021 10:08 AM, Daniel P. Berrangé wrote: What are the unique failure scenarios for CFI that these jobs are likely to expose ? Is it likely that we'll have cases where CFI succeeds in say, x86_64 target, but fails in aarch64 target ? For CFI to fail (even if it shouldn't) you'll need code that is calling a function pointer that was not well defined at compile time. Although unlikely, that could happen everywhere in the code. What does "was not well defined" mean here ? At high level, the compiler creates metadata for every function. Before jumping to a function pointer, it makes sure that the pointer and the pointee have matching types. Not well defined means one of these two cases: 1. The function has a different type than the pointer -> Most likely an error 2. The function was not available at compile time so the compiler could not create the related metadata -> Most likely a false positive. Thanks, Daniele
Re: [PATCH v2 0/2] gitlab-ci.yml: Add jobs to test CFI
On Mon, Mar 01, 2021 at 03:39:42PM -0500, Daniele Buono wrote: > Hi Daniel, > > On 3/1/2021 10:08 AM, Daniel P. Berrangé wrote: > > What are the unique failure scenarios for CFI that these jobs are > > likely to expose ? Is it likely that we'll have cases where > > CFI succeeds in say, x86_64 target, but fails in aarch64 target ? > > For CFI to fail (even if it shouldn't) you'll need code that is calling a > function pointer that was not well defined at compile time. Although > unlikely, that could happen everywhere in the code. What does "was not well defined" mean here ? > > So by just testing one (or few) targets we are are not covering the code in > the TCG frontend used to compile the target ISA to tcg ops, which should be > the in target/, and the architecture-dependent code in linux-user. > > That code seems unlikely (at least to me) to cause a false positive with > CFI. Examples that I've seen in QEMU so far are: > - Calling code that was JIT-ed at runtime > - Callbacks to functions that were loaded from shared libraries > - Signal Handlers > And none of those should happen there IMHO. But you know, corner cases are > still possible, and it's quite difficult to predict what new code may bring. > > We could also consider always testing one or two targets, and keep an > optional job to test them all when deemed necessary. I'm thinking for > example full testing when code in target/ or linux-user/ is considered, or > for testing pre-release code. Would be nice to have this automated but I am > not sure that's possible. > > Regards, > Daniele > Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH v2 0/2] gitlab-ci.yml: Add jobs to test CFI
Hi Daniel, On 3/1/2021 10:08 AM, Daniel P. Berrangé wrote: What are the unique failure scenarios for CFI that these jobs are likely to expose ? Is it likely that we'll have cases where CFI succeeds in say, x86_64 target, but fails in aarch64 target ? For CFI to fail (even if it shouldn't) you'll need code that is calling a function pointer that was not well defined at compile time. Although unlikely, that could happen everywhere in the code. So by just testing one (or few) targets we are are not covering the code in the TCG frontend used to compile the target ISA to tcg ops, which should be the in target/, and the architecture-dependent code in linux-user. That code seems unlikely (at least to me) to cause a false positive with CFI. Examples that I've seen in QEMU so far are: - Calling code that was JIT-ed at runtime - Callbacks to functions that were loaded from shared libraries - Signal Handlers And none of those should happen there IMHO. But you know, corner cases are still possible, and it's quite difficult to predict what new code may bring. We could also consider always testing one or two targets, and keep an optional job to test them all when deemed necessary. I'm thinking for example full testing when code in target/ or linux-user/ is considered, or for testing pre-release code. Would be nice to have this automated but I am not sure that's possible. Regards, Daniele
Re: [PATCH v2 0/2] gitlab-ci.yml: Add jobs to test CFI
On Mon, Mar 01, 2021 at 09:59:22AM -0500, Daniele Buono wrote: > Hi Daniel, > > On 3/1/2021 5:06 AM, Daniel P. Berrangé wrote: > > On Fri, Feb 26, 2021 at 10:21:06AM -0500, Daniele Buono wrote: > > > Build jobs are on the longer side (about 2h and 20m), but I thought it > > > would be better to just have 6 large jobs than tens of smaller ones. > > > > IMHO that is a not viable. > > > > Our longest job today is approx 60 minutes, and that is already > > painfully long when developers are repeatedly testing their > > patch series to find and fix bugs before posting them for review. > > I can perhaps get through 5-6 test cycles in a day. If we have a > > 2 hour 20 min job, then I'll get 2-3 test cycles a day. > > > > I don't want to see any new jobs added which increase the longest > > job execution time. We want to reduce our max job time if anything. > > > > > > I totally understand the argument. > > We could build two targets per job. That would create build jobs that > take 40 to 60-ish minutes. If that's the case, however, I would not > recommend testing all the possible targets but limit them to what > is considered a set of most common targets. I have an example of the > resulting pipeline here: > > https://gitlab.com/dbuono/qemu/-/pipelines/258983262 > > I selected intel, power, arm and s390 as "common" targets. Would > something like this be a viable alternative? Perhaps after > due thinking of what targets should be tested? What are the unique failure scenarios for CFI that these jobs are likely to expose ? Is it likely that we'll have cases where CFI succeeds in say, x86_64 target, but fails in aarch64 target ? If not, then it would be sufficient to just test a single target to smoke out CFI specific bugs, and assume it covers other targets implicitly. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH v2 0/2] gitlab-ci.yml: Add jobs to test CFI
Hi Daniel, On 3/1/2021 5:06 AM, Daniel P. Berrangé wrote: On Fri, Feb 26, 2021 at 10:21:06AM -0500, Daniele Buono wrote: Build jobs are on the longer side (about 2h and 20m), but I thought it would be better to just have 6 large jobs than tens of smaller ones. IMHO that is a not viable. Our longest job today is approx 60 minutes, and that is already painfully long when developers are repeatedly testing their patch series to find and fix bugs before posting them for review. I can perhaps get through 5-6 test cycles in a day. If we have a 2 hour 20 min job, then I'll get 2-3 test cycles a day. I don't want to see any new jobs added which increase the longest job execution time. We want to reduce our max job time if anything. I totally understand the argument. We could build two targets per job. That would create build jobs that take 40 to 60-ish minutes. If that's the case, however, I would not recommend testing all the possible targets but limit them to what is considered a set of most common targets. I have an example of the resulting pipeline here: https://gitlab.com/dbuono/qemu/-/pipelines/258983262 I selected intel, power, arm and s390 as "common" targets. Would something like this be a viable alternative? Perhaps after due thinking of what targets should be tested? Regards, Daniel
Re: [PATCH v2 0/2] gitlab-ci.yml: Add jobs to test CFI
On Fri, Feb 26, 2021 at 10:21:06AM -0500, Daniele Buono wrote: > Build jobs are on the longer side (about 2h and 20m), but I thought it > would be better to just have 6 large jobs than tens of smaller ones. IMHO that is a not viable. Our longest job today is approx 60 minutes, and that is already painfully long when developers are repeatedly testing their patch series to find and fix bugs before posting them for review. I can perhaps get through 5-6 test cycles in a day. If we have a 2 hour 20 min job, then I'll get 2-3 test cycles a day. I don't want to see any new jobs added which increase the longest job execution time. We want to reduce our max job time if anything. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
[PATCH v2 0/2] gitlab-ci.yml: Add jobs to test CFI
For a few months now QEMU has had options to enable compiler-based control-flow integrity if built with clang. While this feature has a low maintenance, It's probably still better to add tests to the CI environment to check that an update doesn't break it. The patchset allow gitlab testing of: * --enable-cfi: forward-edge cfi (function pointers) * --enable-safe-stack: backward-edge cfi (return pointers) As an added benefit, this also inherently tests LTO. The first patch allows a custom selection for linker parallelism. Currently, make parallelism at build time is based on the number of cpus available. This doesn't work well with LTO at linking, because the linker has to load in memory all the intermediate object files for optimization. If the gitlab runner happens to run two linking processes at the same time, the job will fail with an out-of-memory error, The patch leverages the ability to maintain high parallelism at compile time, but limit the number of linkers executed in parallel. The second patch introduces the ci/cd jobs in the gitlab pipeline. My original intention was to create a single chain of build -> check -> acceptance, with all the targets compiled by default. Unfortunately, the resulting artifact is too big and won't be uploaded. So I split the test in two chains, that should cover all non-deprecated targets as of today. Build jobs are on the longer side (about 2h and 20m), but I thought it would be better to just have 6 large jobs than tens of smaller ones. For build, we have to select --enable-slirp=git, because CFI needs a statically linked version of slirp, with CFI information. More info on this can be found in a comment in .gitlab-ci.yml. Test runs of the full pipeline are here (cfi-ci-v2 branch): https://gitlab.com/dbuono/qemu/-/pipelines/261311362 v2: - More details in the code about the issue of using system-wide slirp - Use meson to only limit linker parallelism instead of forcing no parallelism on the whole compilation process. Daniele Buono (2): gitlab-ci.yml: Allow custom # of parallel linkers gitlab-ci.yml: Add jobs to test CFI flags .gitlab-ci.yml | 97 ++ 1 file changed, 97 insertions(+) -- 2.30.0