Thanks Bero. Sending this extremely useful information out to a wider audience.
Alex, I think you'll probably be very interested in this for your Mozilla work.

>> -O3
>> * What it is, does, available on
>
> -O3 enables several additional compiler optimizations such as tree
> vectorizing and loop unswitching, and optimizes for speed over code
> size somewhat more aggressively than -O2, e.g. by inlining all calls
> to small static functions.
> It is available on any platform supported by gcc.
>
>> OpenMP
>> * What it is, does, available on
>
> OpenMP is a simple API that makes it easier for a programmer to make
> use of multi-core or multi-processor systems, e.g. by automatically
> splitting marked loops into several threads.
> Example:
>
> #pragma omp parallel for
> for(int i=0; i<100; i++)
>     do_something(i);
>
> would split the 100 iterations across several threads (up to 100 in
> principle; by default typically one per CPU core).
>
> It is available on platforms supported by gcc that can use libgomp,
> gcc's OpenMP library. This includes most platforms that support POSIX
> threads - but -- initially -- not Android.
>
>> Loop parallelization
>> * What it is, does, available on
>
> Loop parallelization takes OpenMP a step further by automatically
> determining which loops are suitable for "#pragma omp parallel for"
> and similar constructs. This allows code that was written without
> multiprocessing in mind (such as most code written specifically for
> ARM platforms - multicore/SMP ARM systems are quite new) to take
> advantage of multicore/SMP systems (to some extent) without having to
> modify the code.
>
> Compiler flag: -ftree-parallelize-loops=X (where X is the number of
> threads to be optimized for - typically the number of CPU cores in the
> target system)
>
> Available on anything supported by gcc that has both libgomp and
> graphite (incl. CLooG, PPL or ISL) - the original Android toolchain
> has neither of those.
>
>> ...and any other optimizations that you've done.
>
> None of the following is enabled yet (but the support in the toolchain
> is there now), but I'm planning to enable them step by step once we
> have systems built w/ the new toolchain that actually boot:
>
> binutils: --hash-style=gnu
>     By default, ld creates SysV style hash tables for function tables
>     in shared libraries. With --hash-style=gnu, we switch to GNU style
>     hashes, making symbol lookup a lot faster. (details:
>     http://sourceware.org/ml/binutils/2006-10/msg00377.html)
>
> binutils: -Bsymbolic-functions
>     Speed up the dynamic linker by binding references to global
>     functions in shared libraries where it is known that this doesn't
>     break things (it's safe for libraries that don't have any users trying
>     to override their symbols - it's probably safe to assume e.g. skia and
>     opengl could benefit).
>     (details:
>     http://www.fkf.mpg.de/edv/docs/intel_composer/Documentation/en_US/compiler_f/main_for/copts/common_options/option_bsymbolic_functions.htm)
>
> binutils/gcc: -flto, -fwhole-program
>     Link-Time Optimization - causes code to be optimized again at link
>     time, when the compiler knows which functions are called from which
>     parts of the code, which functions are only called with constant
>     parameters, etc.
>
> gcc: -mtune=cortex-a9 (or whatever the actual target CPU is)
>     The Android build system uses -march=armv7-a, which is good -- but
>     it doesn't do any tuning for the specific CPU type (e.g. cortex-a8 vs.
>     cortex-a9).
>
> gcc: -fvisibility-inlines-hidden
>     Don't export C++ inline methods in shared libraries. Makes the
>     symbol table smaller, improving startup time and disk space efficiency.
>
> gcc: -fstrict-aliasing -Werror=strict-aliasing
>     Currently, Android uses -fno-strict-aliasing unconditionally for
>     thumb code, to work around some pieces of code that violate strict
>     aliasing rules.
>     Using -Werror=strict-aliasing, we can determine which
>     pieces of code are affected, and fix them, or limit the use of
>     -fno-strict-aliasing to the specific files that need it - enabling the
>     rather useful strict-aliasing optimization for the rest of the build.
>
> gcc: Investigate Graphite optimizations that aren't even enabled at -O3:
>     -fgraphite-identity -floop-block -floop-interchange
>     -floop-strip-mine -ftree-loop-distribution -ftree-loop-linear

_______________________________________________
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev
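
For anyone unfamiliar with the -fstrict-aliasing item above, here is a minimal sketch of the kind of code that breaks strict aliasing rules and one common way to fix it. It is not taken from the Android tree; the function name float_bits is hypothetical, and the sketch assumes a 32-bit IEEE 754 float.

```c
#include <stdint.h>
#include <string.h>

/*
 * Typical violation: a float object is read through a uint32_t lvalue.
 * With -fstrict-aliasing the compiler may assume pointers to these two
 * types never refer to the same object, so the result is undefined:
 *
 *     uint32_t float_bits_bad(float f) { return *(uint32_t *)&f; }
 */

/*
 * Hypothetical replacement that stays within the aliasing rules.
 * Copying the object representation with memcpy is well defined;
 * this assumes sizeof(float) == sizeof(uint32_t).
 */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Files fixed this way no longer need -fno-strict-aliasing; optimizing compilers generally turn a small fixed-size memcpy like this into a plain register move, so the rewrite should not cost anything at -O2.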