Hello FreeCalypso community,

As a little fun exercise, I have just written a tool that allows us to quantitatively measure exactly how much we have deblobbed our TI-based modem firmware. For years I've been saying that our starting point (the TCS211 semi-src that has been salvaged from the ruins of former Openmoko) was approximately half source, half blobs, and that our current FreeCalypso modem fw is almost completely blob-free - but of course such statements are general terms, lacking quantitative substance. The new blobstat tool finally gives us the actual numbers which we haven't had until now.

Given a final build product that has been produced from a mixture of source components and linkable binary objects, just how can one quantify precisely what percentage is source and what percentage is blobs? The answer can be obtained from the map file produced by the linker. Just like the more popular ELF, TI's object format (a COFF variant) is based around sections: linkable objects consist of sections, and so does the final link output. Every byte in the final fw image belongs to some section, and each of these output sections from the linker's perspective is made up of various input sections (meaning sections taken from linkable objects), with a few bits added by the linker itself (long Thumb call trampolines and some filler and padding bytes). The map file produced by the linker shows the allocation of every byte in the final fw image: it lists all generated output sections, shows what input sections each output section is made of, and shows all linker-added fillers and trampolines.

My blobstat program reads and parses these map files, taking account of every code section that went into that final link. It also reads another file (a classification spec) that indicates which linkable *.lib files (or which individual objects within these libs) should be counted in the src category, versus which ones should be counted in the blob category. One can also define any other classification categories as desired.

Let's look at our starting point first. There are no surviving map or COFF files corresponding to moko11 or moko10, but there is a surviving map file corresponding to moko10-beta1; we can use this map file because there is no difference in the state of source vs blobs between moko10-beta1 and moko11. Analyzing this gsm_<blah>.map file from moko10-beta1 with blobstat, we get the following numbers:

* The total number of bytes in the final fw image that came from linkable code bits (as opposed to linker-generated fillers and trampolines) is 0x2156FC.
  For comparison, the total image size is 0x2255B4 - there is almost 64 KiB of dead space in there, filled with padding.

* The portion of the bits which were either compiled from source by OM or for which they had the exact corresponding source which they chose not to touch is 0xD3D34 bytes, or about 40% of the total.

* The portion of the bits coming from linkable *.lib files for which OM did not have corresponding source is 0x1419C8 bytes, or about 60% of the total.

Thus my original assessment of OM's firmware being about half source, half blobs was pretty close to the true numbers, which turn out to be 60% blobs, 40% source.

Now let's look at our current FreeCalypso production firmware: namely, the 20190409 build of FC Magnetite hybrid for the fcdev3b target. Here the total number of non-padding, non-filler, non-trampoline bytes (the actual code size) is 0x23F7A4, and guess what the blob percentage is...

The only parts of FC Magnetite hybrid fw which exist as blobs with no exact corresponding source are Nucleus, the OSL and OSX glue layers of GPF, and the TMS470 compiler's RTS library. These blob bits add up to a grand total of 0xA82C (43052) bytes, comprising about 1.8% of the total fw code size.

Thus we have gone from 60% blobs, 40% source to 98.2% source, 1.8% blobs.

So what are we going to do with these last remaining 43052 bytes of code which we currently use in the form of binary objects with no corresponding source? Out of this entire blob division, the one part which currently stands as the last remaining bone in our figurative throat (OSL and OSX bits of GPF) weighs 0x3A90 (14992) bytes: about one third of the total blob division, or just 0.6% of the total fw code size.

Needless to say, a blob that weighs a total of 14992 bytes and comes in the form of COFF objects with full symbolic info (-g style) is very easy to reverse-engineer and thoroughly understand. There are no mysteries in this OSL/OSX glue code, it is very thoroughly understood - at least by me.
Instead the problem is that I am not able to turn this disassembly-level understanding into recompilable C code - more precisely, I am not able to produce os_???.c code that can be fed to TI's TMS470 compiler (the specific version used in the TCS211 program) and which would produce output exactly matching the original blobs.

I have recently written an article explaining the situation with these OSL/OSX components and where our Magnetite and Selenite firmwares stand with respect to them:

https://www.freecalypso.org/hg/freecalypso-docs/file/tip/Firmware-deblobbing

Oh, and the new blobstat program resides in the freecalypso-reveng repository:

https://www.freecalypso.org/hg/freecalypso-reveng/file/tip/blobstat

Hasta la Victoria, Siempre,
Mychaela
aka The Mother
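
P.S. For anyone who wants to check the arithmetic, here is a minimal Python sketch. It is not the blobstat code and does not parse real map files: the constants are just the byte counts quoted in this post, and the tally() helper with its (library, size) record shape is a hypothetical illustration of the per-category summing that blobstat performs.

```python
# Byte counts quoted in the post (hex values as reported there).
MOKO_CODE = 0x2156FC    # moko10-beta1: bytes that came from linkable code bits
MOKO_IMAGE = 0x2255B4   # moko10-beta1: total image size, padding included
MOKO_SRC = 0xD3D34      # moko10-beta1: bytes with corresponding source
MOKO_BLOB = 0x1419C8    # moko10-beta1: bytes without corresponding source
FC_CODE = 0x23F7A4      # FC Magnetite hybrid 20190409: actual code size
FC_BLOB = 0xA82C        # FC Magnetite hybrid: all remaining blobs
FC_OSL_OSX = 0x3A90     # FC Magnetite hybrid: OSL and OSX bits of GPF


def percent(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


def tally(sections, categories, default="unclassified"):
    """Sum section sizes per classification category.

    sections: iterable of (library_name, size_in_bytes) pairs. This record
    shape is hypothetical; it is not the real linker map file format.
    categories: dict mapping library_name to a category like "src" or "blob".
    """
    totals = {}
    for library, size in sections:
        category = categories.get(library, default)
        totals[category] = totals.get(category, 0) + size
    return totals


if __name__ == "__main__":
    print("moko10-beta1 padding bytes:", MOKO_IMAGE - MOKO_CODE)
    print("moko10-beta1 src %:", percent(MOKO_SRC, MOKO_CODE))
    print("moko10-beta1 blob %:", percent(MOKO_BLOB, MOKO_CODE))
    print("Magnetite blob %:", percent(FC_BLOB, FC_CODE))
    print("Magnetite OSL/OSX %:", percent(FC_OSL_OSX, FC_CODE))
```

The src and blob counts for moko10-beta1 add up exactly to the 0x2156FC total, which is a handy sanity check that every code byte has been assigned to one of the two categories.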
