Ken,
On 6/9/07, Ken Raeburn <[EMAIL PROTECTED]> wrote:
Sure, makes sense. I hope it's easy to expand Calysto's database of
such functions.
It is. Currently, it requires recompilation of Calysto, but in the
future it will be done through simple modifiable specifications.
However, there are quite a few things on my research agenda that are
much more urgent.
Can you express such things as "F returns non-zero
or sets *x to a valid pointer"?
Yes, of course.
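Concretely, a function with the contract Ken describes ("returns non-zero or sets *x to a valid pointer") might look like this in C. This is a hypothetical example of mine, not Calysto's specification syntax:

```c
#include <stdlib.h>

/* Hypothetical function obeying the contract: either it returns
 * non-zero (failure, *x untouched) or it returns 0 and stores a
 * valid pointer through x. A path-sensitive checker can then prove
 * that a caller who checks the return value never dereferences an
 * invalid pointer. */
int make_buffer(char **x, size_t n) {
    char *p = malloc(n);
    if (p == NULL)
        return -1;   /* failure path: non-zero, *x not written */
    *x = p;          /* success path: *x is a valid pointer */
    return 0;
}
```

A caller that dereferences *x only when the return value is zero stays on safe paths.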
I played a bit with Splint some time
back, and one of its most glaring problems was the inability to
understand stuff like that -- in particular, realloc() success and
failure paths.
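For context, the realloc() success/failure contract is exactly the kind of two-path behavior a checker must model: on failure it returns NULL and leaves the original block valid. A minimal correct-usage sketch in C (my illustration, not Calysto output):

```c
#include <stdlib.h>

/* Grow a buffer safely. On failure realloc() returns NULL and the
 * original block is still valid, so the old pointer must not be
 * overwritten before the result is checked. */
char *grow(char *buf, size_t new_size) {
    char *tmp = realloc(buf, new_size);
    if (tmp == NULL) {
        free(buf);   /* failure path: the old block is still ours */
        return NULL;
    }
    return tmp;      /* success path: use the (possibly moved) block */
}
```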
Note that Calysto is a completely different beast. It's fully
path-sensitive. For instance, Calysto easily proves (in 0.02 sec) that
the following assertion is always satisfied:
int f(int a, int b) {
  int a1;
  if (a % 2) { a1 = a + 1; }
  else { a1 = a; }
  int c = a1 * a1;
  assert(c % 4 == 0);
  return c;
}
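The parity argument behind that proof is simple: whether a is even or odd, a1 is always even, so c = a1*a1 is a multiple of 4. A brute-force harness (mine, independent of Calysto) confirms it:

```c
/* Re-implementation of f()'s computation; returns 1 iff the assertion
 * c % 4 == 0 holds for this input. a1 is even on both branches. */
int check(int a) {
    int a1 = (a % 2) ? a + 1 : a;   /* odd a -> a+1 (even); even a -> a */
    int c = a1 * a1;                /* square of an even number */
    return c % 4 == 0;
}
```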
Of course, to achieve path-sensitivity you need some form of theorem
prover. Calysto owes a lot of its scalability to the Spear theorem
prover ( http://www.domagoj.info/index_spear.htm ), which has been
developed (and heavily optimized) for software verification.
Currently, assertion checking is not yet ready for mass consumption, but
NULL-ptr dereference checking is very close. As Calysto evolves,
projects that join the Calysto Community will also get assertion checking.
> 2) You are willing to help me out analyzing the results.
Well, I'm willing to try. :-)
Excellent! :-)
Is it possible to save the analyzed behavior of some functions or
their uses, maybe programmatically transform some subset of it into
some input format acceptable to Calysto (e.g., do you have an input
file that describes strncmp and gethostbyname to it?), and import the
result into an analysis of another module?
It is doable, but IMHO a static checker should do checking, not linking
:-). So I prefer to get a complete input and specify only the few most
important external functions through additional constraints.
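As an illustration of what such an additional constraint could express (this stub and nondet() are hypothetical, not Calysto's actual specification format), an external function like gethostbyname can be modeled as returning either NULL or a valid pointer, so the checker explores both paths:

```c
#include <netdb.h>
#include <stddef.h>

/* nondet() stands for an unconstrained value that a checker would
 * treat symbolically; here it just alternates so that both paths are
 * reachable when the stub is executed directly. */
static int nondet(void) {
    static int flip;
    return flip ^= 1;
}

/* Model stub: gethostbyname() either fails (NULL) or succeeds with a
 * pointer to a valid struct hostent. */
struct hostent *model_gethostbyname(const char *name) {
    (void)name;
    static struct hostent h;   /* a valid object on the success path */
    if (nondet())
        return NULL;           /* failure path */
    return &h;                 /* success path: valid non-NULL pointer */
}
```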
> It is just a simple heuristic to decrease the number of false
> positives
> - if the context is available, Calysto always takes it into
> account. As
> said before, really not a big deal to change this behaviour.
Okay, that sounds good. Especially if a report on the assumptions
made can be produced, we could presumably correct certain wrong
conclusions and re-run the analysis?
Of course.
> How many should there be?
Looking at a UNIX install tree, I've got 11 shared libraries, 39
executables (of which a few, like krb5kdc and kadmind, are the most
critical, but almost all involve network traffic and are thus subject in
one way or another to malicious input, and two are normally installed
setuid root), and depending on configuration options, one or two .so
files that may get dynamically loaded by krb5kdc at run time with
dlopen. We only build shared libraries for most things currently;
it's not possible to do a fully static build of any of the
executables without some pounding on our somewhat baroque build system.
After spending a couple of hours analyzing your build process, I found
the culprit: export-check.pl fails. After commenting out lines 74 and 76,
everything went through, producing 14 shared libraries and 65 binaries.
I've run Calysto, and after 13.4 hours of computation got 784 warnings,
which is quite a lot considering the size of the project. Roughly one
third is in libkrb5.so.
I've attached the log. Please check it out and sort the binaries that
have unique_locations!=0 by priority, or at least give me the top ten
most important ones. Note that since binaries share some code, some
warnings are reported multiple times. After a quick analysis of the
reports, it seems that there are 784 unique ones. Binaries are named
source_path_binary_name.bc.
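A quick way to pull out that top list from the log (a sketch only: I'm assuming report lines look like "source_path_binary_name.bc unique_locations=N", which may not match the actual log format):

```shell
# List binaries with at least one unique warning location,
# sorted by warning count, highest first.
sort_warnings() {
  grep 'unique_locations=' "$1" \
    | grep -v 'unique_locations=0$' \
    | sort -t= -k2,2nr
}
```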
As soon as I get the top list, I'll start analyzing and postprocessing
the reports, and will send them to [EMAIL PROTECTED] when done.
> Another possible approach (that I've done with sendmail) is to compile
> all sources individually, and link them in a knapsack manner -
> maximising the number of object files linked together without linkage
> conflicts.
Ah, I see. So you generate bitcode files from LLVM for individual
source/object files manually somehow, and feed a collection of them
to Calysto?
If you do it by hand, I think you wind up needing to process all of
the library directories before you can build the KDC -- util/support,
util/et, util/profile, lib/crypto, lib/krb5, lib/gssapi, lib/rpc,
lib/des425, lib/krb4, lib/kadm5/, lib/kadm5/srv, lib/kdb, lib/apputils,
and finally kdc/. (Normally the lib/krb5 directory builds a library
that includes the util/profile objects, and lib/kadm5/srv builds a
library using the objects in lib/kadm5.) That sounds like a lot of
manual work.
I don't know what your process is for generating the intermediate BC
files; is it something that could be tied into a normal build somehow
to generate and process the additional files?
Essentially, one would need to compile an application to bitcode files.
LLVM is pretty close to supporting this, but does not have its own
binutils. To work around this, I wrote a linkage script
http://lists.cs.uiuc.edu/pipermail/llvmdev/2006-December/007577.html
that fills the gap. Using it, I've managed to build a solid number
of apps. Some apps (like krb5) require more hacking.
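As a rough sketch of the same idea with a current toolchain (this is a generic recipe of mine, not Domagoj's actual script, and it assumes clang and llvm-link are on the PATH):

```shell
# Compile every C source file to an LLVM bitcode module, then merge
# the modules into one whole-program .bc that a checker can consume.
set -e
bcs=""
for src in *.c; do
  clang -c -emit-llvm -o "${src%.c}.bc" "$src"
  bcs="$bcs ${src%.c}.bc"
done
# Intentional word splitting of $bcs: one argument per bitcode file.
llvm-link -o whole_program.bc $bcs
```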
Note that the script is quite a bit out of date, and I've changed it
in the meantime. After I move to the new LLVM 2.0, I'll resend the
new scripts so that people can compile their own code into bc files,
saving me time. In the meantime, I'll have to explore the beauty of
inventive build systems on my own :-)
Regards,
--
Domagoj Babic
http://www.domagoj.info/
________________________________________________
Kerberos mailing list [email protected]
https://mailman.mit.edu/mailman/listinfo/kerberos