Package: firefox
Version:  53.0.is.52.0.2-1
Severity: normal

libyuv, a performance-critical library for firefox, is built with
-Os, which is terrible for its performance.
This particularly affects row_common.cc, which contains the generic parts
of the color transformation code.

See:
https://buildd.debian.org/status/fetch.php?pkg=firefox&arch=amd64&ver=53.0.is.52.0.2-1&stamp=1492644908&raw=0

/usr/bin/g++ -std=gnu++11 -o row_common.o -c  ...   -fPIC
-DMOZILLA_CLIENT -include
/<<PKGBUILDDIR>>/build-browser/mozilla-config.h -MD -MP -MF
.deps/row_common.o.pp -Wdate-time -D_FORTIFY_SOURCE=2 -Wall
-Wc++11-compat -Wempty-body -Wignored-qualifiers -Woverloaded-virtual
-Wpointer-arith -Wsign-compare -Wtype-limits -Wunreachable-code
-Wwrite-strings -Wno-invalid-offsetof -Wc++14-compat
-Wno-error=maybe-uninitialized -Wno-error=deprecated-declarations
-Wno-error=array-bounds -fno-lifetime-dse -fstack-protector-strong
-Wformat -Werror=format-security -fno-schedule-insns2 -fno-lifetime-dse
-fno-delete-null-pointer-checks -fno-exceptions -fno-strict-aliasing
-fno-rtti -ffunction-sections -fdata-sections -fno-exceptions
-fno-math-errno -pthread -pipe  -g -freorder-blocks -Os
-fomit-frame-pointer
/<<PKGBUILDDIR>>/media/libyuv/source/row_common.cc


The problematic part is the YuvPixel function, which is called in loops
and in turn calls tiny clamp functions.
With -Os these calls are evidently not inlined, which causes massive
overhead.
This is the top of the CPU profile on sites that e.g. display videos:
  17.25%  libxul.so                   [.] YuvPixel
   6.58%  libxul.so                   [.] Clamp
   6.46%  libxul.so                   [.] clamp255
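
To illustrate the call pattern, here is a minimal sketch, not the libyuv
source. The names YuvPixelSketch and I444ToARGBRowSketch are made up, the
clamp helpers only mirror the names in the profile, and the coefficients are
the usual full-range JPEG-style constants in 16.16 fixed point, chosen purely
for illustration. Every pixel goes through one conversion call and several
clamp calls, so if none of them is inlined the loop pays call overhead
several times per pixel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the tiny helpers seen in the profile.
static inline int32_t clamp0(int32_t v) { return v < 0 ? 0 : v; }
static inline int32_t clamp255(int32_t v) { return v > 255 ? 255 : v; }
static inline uint8_t Clamp(int32_t v) {
  return static_cast<uint8_t>(clamp255(clamp0(v)));
}

// Illustrative YUV -> BGR conversion for one pixel (assumed coefficients,
// 16.16 fixed point; not the constants libyuv uses).
static inline void YuvPixelSketch(uint8_t y, uint8_t u, uint8_t v,
                                  uint8_t* b, uint8_t* g, uint8_t* r) {
  const int32_t d = u - 128;
  const int32_t e = v - 128;
  *r = Clamp(y + (91881 * e) / 65536);
  *g = Clamp(y - (22554 * d + 46802 * e) / 65536);
  *b = Clamp(y + (116130 * d) / 65536);
}

// Generic row loop: one YuvPixelSketch call (and three Clamp calls) per pixel.
// Output layout is B, G, R, A bytes per pixel.
void I444ToARGBRowSketch(const uint8_t* src_y, const uint8_t* src_u,
                         const uint8_t* src_v, uint8_t* dst_argb,
                         size_t width) {
  assert(src_y && src_u && src_v && dst_argb);
  for (size_t x = 0; x < width; ++x) {
    uint8_t* px = dst_argb + 4 * x;
    YuvPixelSketch(src_y[x], src_u[x], src_v[x], px + 0, px + 1, px + 2);
    px[3] = 255;  // opaque alpha
  }
}
```

At -O2 or -O3 a compiler would normally fold helpers like these into the
loop body; the profile above suggests that this does not happen in the -Os
build.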

The problem is not as bad as it looks, as this generic code is only
executed on machines that do not have SSSE3, AVX2 or NEON (see
convert_argb.cc).
But there are still plenty of useful CPUs that do not have these
instruction sets and are crippled by the compiler flags used.
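
The dispatch described here can be pictured roughly as follows. This is a
sketch under assumptions, not the code in convert_argb.cc: CpuFeatures,
PickRow and both row functions are hypothetical names, and the "SIMD" row is
only a placeholder with the same behavior as the generic one. The point is
that the generic row function is what remains when none of the feature flags
is set.

```cpp
#include <cstddef>
#include <cstdint>

// Signature shared by all row implementations in this sketch.
using RowFn = void (*)(const uint8_t* src, uint8_t* dst, size_t width);

// Hypothetical feature flags; real code would query the CPU at runtime.
struct CpuFeatures {
  bool ssse3;
  bool avx2;
  bool neon;
};

// Generic C++ fallback: the kind of code that lives in row_common.cc.
void RowGenericSketch(const uint8_t* src, uint8_t* dst, size_t width) {
  for (size_t x = 0; x < width; ++x) dst[x] = src[x];
}

// Placeholder for a hand-written SIMD row (same result, for illustration).
void RowSimdSketch(const uint8_t* src, uint8_t* dst, size_t width) {
  for (size_t x = 0; x < width; ++x) dst[x] = src[x];
}

// Pick the row function once, then call it for every row of the image.
RowFn PickRow(const CpuFeatures& f) {
  if (f.avx2 || f.ssse3 || f.neon) return RowSimdSketch;
  return RowGenericSketch;  // only reached without any of the SIMD sets
}
```

On a CPU without any of these instruction sets, every row goes through the
generic function, so its compiler flags determine the whole conversion cost.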

Is it possible to compile this library with -O3 to allow the compiler to
vectorize it with the best available generic instruction set (e.g. SSE2
on x64)?

cheers,
Julian Taylor
