Carl Worth wrote:
[Pardon me for quoting several separate messages in a single reply
here---I wasn't previously on this list, and it's difficult enough to
fish out the Message-ID of a single message to try to get a properly
formatted reply.]
On Mon, 2007-02-12, William Cohen wrote:
  samples|      %|
------------------
    19279 50.6077 vmlinux-20070309.olpc1p.dc5079fafb767e4
    11959 31.3926 Xorg
     6177 16.2147 python
The rest of the thread talked plenty about the Xorg samples. But
am I reading correctly that oprofile only sees 31% in Xorg and 50% in
the kernel? Does anyone know what's happening there? And shouldn't
that stuff be an important priority?
The high number of samples for the kernel is an artifact. OProfile is using a
periodic sampling mechanism that takes a sample regardless of whether the processor
is doing something useful. Most of the samples for the kernel are for idle. This
was mentioned in an earlier email (subject "Experiences getting OProfile working on
OLPC machine"). If oprofile were using CPU_CLOCK_UNHALTED or a similar mechanism,
then there would be very few samples for the kernel.
The point is that there is a big hot spot in fbFetchTransformed. The samples for
that function alone account for over 25% of the time/samples. Discounting the kernel
idle samples, that one function has closer to 50% of the active processor cycles.
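
As a rough check of that arithmetic, here is a small sketch. It assumes,
for simplicity, that every kernel sample is idle time (the real figure is
"most", not all), and the helper name share_of_active is made up for
illustration; it is not part of oprofile.

```python
def share_of_active(share_of_total: float, idle_share: float) -> float:
    """Rescale a share of all samples to a share of non-idle samples."""
    if not 0.0 <= idle_share < 1.0:
        raise ValueError("idle_share must be in [0, 1)")
    return share_of_total / (1.0 - idle_share)


# Kernel share from the summary table, treated here as entirely idle
# (an assumption; only "most" of those samples are idle).
kernel_share = 0.506077

# "Over 25%" of all samples in the hot function, taken as exactly 25%.
fetch_share = 0.25

active_share = share_of_active(fetch_share, kernel_share)
print(f"{active_share:.1%} of non-idle samples")
```

With those inputs the hot function comes out at roughly half of the
non-idle samples, which is where the "closer to 50%" figure comes from.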
-Will
On Tue, 2007-03-03, Adam Jackson wrote:
On Mon, 2007-03-12 at 18:59 -0400, William Cohen wrote:
...
6514 68.1096 libfb.so fbFetchTransformed
Wow. I think that's the first time I've ever seen this actually show up
on a profile. I didn't think anything used Render's transformations on
account of they're so painfully slow.
But it's a new, cairo world now, Adam, and we prefer to use Render
whenever possible. So I expect you'll be seeing these paths in the X
server get exercised more and more, so we really do need to improve
this software to be much faster, (and also use the GPU whenever
possible to avoid this software altogether--obviously not an option on
the OLPC machine).
On Tue, 2007-03-13, Dan Williams wrote:
On Mon, 2007-03-12 at 18:59 -0400, William Cohen wrote:
613 6.4095 libfb.so fbFetchPixel_x8r8g8b8
446 4.6633 libfb.so fbCompositeSolidMask_nx8x0565mmx
252 2.6349 libfb.so fbStore_r5g6b5
169 1.7670 libfb.so fbRasterizeEdges
137 1.4325 libfb.so fbCompositeSrc_8888x0565mmx
This one is inevitable since we're running in 565 and you have to do
compositing and pixelsmashing to convert 8888 -> 0565. However, could
you get some idea of which lines are hot in this function?
It may be inevitable to do some conversion somewhere, but it's
definitely not inevitable that this should show up in profiling,
(though, we're talking less than 10% of 30% here, right?). But anyway,
since your destination format is 565 you should be ensuring that
you're getting your source images to 565 surfaces as soon as possible,
(convert on load, and never again after that).
Some things that can help with that are to load originally into a
surface without an alpha channel if there's no alpha in your image. Or
if there is alpha in the image, flatten as soon as possible,
(presumably, the background is often a known solid color, so hopefully this
would be possible quite early).
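
To make the "convert on load" idea concrete, here is a minimal sketch in
plain Python. It is not cairo or libfb code: the function names are made
up for illustration, the pixels are assumed to be 8-bit RGBA tuples with
straight (non-premultiplied) alpha, and the white background is just an
example. Note that cairo's own ARGB32 format is premultiplied, so real
code would blend differently.

```python
def flatten_over(r, g, b, a, bg):
    """Blend one straight-alpha 8-bit RGBA pixel over an opaque background."""
    def blend(src, dst):
        # Rounded integer form of (src*a + dst*(255 - a)) / 255.
        return (src * a + dst * (255 - a) + 127) // 255
    return blend(r, bg[0]), blend(g, bg[1]), blend(b, bg[2])


def pack_565(r, g, b):
    """Pack 8-bit R, G, B into one 16-bit r5g6b5 value by dropping low bits."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)


def convert_on_load(pixels, bg=(255, 255, 255)):
    """Flatten and pack every pixel once, so later drawing is 565 -> 565."""
    return [pack_565(*flatten_over(r, g, b, a, bg)) for (r, g, b, a) in pixels]


# Hypothetical three-pixel image: opaque red, transparent, half-alpha black.
image = [(255, 0, 0, 255), (0, 0, 0, 0), (0, 0, 0, 128)]
converted = convert_on_load(image)
print([hex(p) for p in converted])
```

The point of doing this once at load time is that the per-draw 8888 to
0565 conversion should then no longer be needed for that image.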
But maybe that's already happening here---like I said, these don't
look like big percentages here...
-Carl