Re: [E-devel] evas crashing in scale_rgba_in_to_out_clip_sample_internal() on intel Gen2 GPU

The Rasterman Wed, 17 Apr 2013 17:36:40 -0700

On Wed, 17 Apr 2013 20:19:22 +0200 Bruno <[email protected]> said:

> Hi Raster,
> 
> On Wed, 17 April 2013 Carsten Haitzler (The Rasterman) <[email protected]>
> wrote:
> > > Full valgrind log attached to the bug.
> > 
> > that's a little more useful. it gives me an actual address. the address
> > looks sane - ie it's writing to an invalid memory region. it'd be good to
> > get a backtrace there: valgrind --db-attach=yes ... that will ask to attach
> > gdb per complaint - if you say yes, you can get a backtrace (bt command)
> > and print variables/values.
> 
> Attached is a combination of valgrind output, gdb backtrace (only the
> innermost functions) as well as a detailed view of src, dst and dc.
> 
> Hope this is useful information.

ooooh. this is VERY useful. FINALLY. btw. anyone following this. listen in
PLEASE! i shall make this a lesson in "bug reporting".

first. bruno. THANKS SO MUCH FOR YOUR PATIENCE. really. providing this valgrind
trace and then now this gdb dump tells me a fascinating story. i "suspected
it", but i needed proof. it's a result of you indicating you are at 15bpp and
this segfaulting where it really shouldn't... my problem right now is to easily
reproduce 15bpp so i can verify a fix. xephyr doesnt do 15bpp (already tried),
so i need to use a real server... and that means this is going to take more
time.

now here goes my little analysis...

your dump of *src, *dst, as well as local variables gives me some KEY values.

1. src_region_x=0, src_region_y=0,
   src_region_w=1920, src_region_h=1200,
   dst_region_x=-140, dst_region_y=0,
   dst_region_w=1680, dst_region_h=1050

2. in *dst:
  w = 1400, h = 1050
  image { data = 0x7cce000, no_free = 1

3. and in evas_common_scale_rgba_in_to_out_clip_smooth_mmx() local var:
  dst_ptr = 0x7f9bc60

now these tell me a story.

1. we are scaling a 1920x1200 image down to 1680x1050 at an offset of -140, 0
in the dest buffer. the dest buffer is 1400x1050 in size (pixels). this is all
fine as clipping handles the negative x value and chops off the left/right
edges of the buffer.

2. the destination buffer itself is using foreign pixel data. ie the pixel data
is not managed by the image buffer itself (no_free is 1). that means this image
is effectively a wrapper struct around some foreign pixels from somewhere else.

3. given the start address of pixels (0x7cce000) in the buffer, the dst_ptr
(destination pointer where we are writing to at any time)... this means
valgrind catches us EXACTLY at 735000 pixels in from the start
((0x7f9bc60 - 0x7cce000) / 4). now.. given our buffer has 1400 pixels per line...
that makes it... at exactly 525 lines in... we walk outside of a valid memory
region... *EEEK*! this is bad! by... 525? thats exactly HALF of the height
(1050 / 2)... exactly HALF? well well.. 15bpp of course really uses 16bits for
storage and throws one away. so we use 2 bytes per pixel. BUT... all the
software rendering routines in evas are built for 32bpp (4 bytes per pixel)...
that is what they work with (source and destination) and our buffer is HALF the
size... given that it is a FOREIGN buffer ... i smell that 0x7cce000 is
actually the address of a shared memory buffer for an xshm image... evas is
taking a shortcut. it is trying to render FASTER by rendering DIRECTLY into the
buffer that will be handed to x to upload to the destination (window/pixmap
etc.)...
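
The arithmetic in point 3 can be checked with a short sketch. The addresses and dimensions are copied from the gdb dump; the helper names are made up for illustration, and the buffer is assumed to be tightly packed (no padding at the end of each row).

```c
#include <stddef.h>
#include <stdint.h>

/* Values copied from the gdb dump quoted above. */
static const uintptr_t dst_data = 0x7cce000; /* dst->image.data */
static const uintptr_t dst_ptr  = 0x7f9bc60; /* address of the faulting write */
enum { DST_W = 1400, DST_H = 1050 };         /* dst->w, dst->h */

/* Bytes between the start of the buffer and the faulting write. */
static size_t fault_offset_bytes(void)
{
    return (size_t)(dst_ptr - dst_data);
}

/* Row (0-based) the write lands on if every pixel takes bytes_per_pixel bytes.
 * Assumes rows are tightly packed, which is an assumption of this sketch. */
static size_t fault_row(size_t bytes_per_pixel)
{
    return fault_offset_bytes() / bytes_per_pixel / DST_W;
}

/* Total size in bytes of a tightly packed DST_W x DST_H buffer. */
static size_t buffer_bytes(size_t bytes_per_pixel)
{
    return (size_t)DST_W * DST_H * bytes_per_pixel;
}
```

With 4 bytes per pixel the write lands on row 525, and the byte offset of the fault (2940000) is exactly the size of a 1400x1050 buffer at 2 bytes per pixel, which is consistent with a 32bpp renderer running off the end of a 16-bit buffer.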

evas does this... *IF* the destination buffer is 32bpp (ie for 32bpp
displays) AND if the rgb pixel masks match the rgb ordering that evas works
with natively... i know it does this.... i wrote that code and intended exactly
this. this is why i was asking for xdpyinfo to get things like your rgb masks
and visual info...

if evas CANT do this fast path it will use a 32bpp internal image buffer and
CONVERT to 15bpp after rendering is done. this is done for all depths that are
not 32bpp. this of course slows things down.
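
As a rough illustration of that conversion step, here is a minimal sketch that packs 32bpp pixels into 15bpp (5 bits per channel) by plain truncation. The function names are hypothetical, and the real evas converters are not shown in this thread and may differ (for example by dithering); this only shows why the slow path needs a separate 16-bit destination buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack one 32bpp pixel (0xAARRGGBB) into 15bpp (x1 r5 g5 b5) by dropping
 * the low 3 bits of each channel. Alpha is discarded. */
static uint16_t argb8888_to_rgb555(uint32_t p)
{
    uint32_t r = (p >> 16) & 0xff;
    uint32_t g = (p >> 8) & 0xff;
    uint32_t b = p & 0xff;
    return (uint16_t)(((r >> 3) << 10) | ((g >> 3) << 5) | (b >> 3));
}

/* Convert n pixels from a 32bpp source row into a 16-bit destination row. */
static void convert_row_to_rgb555(const uint32_t *src, uint16_t *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = argb8888_to_rgb555(src[i]);
}
```

Note that the destination row is half the size of the source row in bytes, which is the same factor of two that shows up in the crash analysis.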

now the question is.. why the HELL is evas thinking that it can use a fast path
render direct to the dest buffer here? what is making it choose this path? what
did it get wrong? in fact this is the relevant bit of code in
evas_xlib_outbuf.c:

        if ((buf->rot == 0) &&
            (buf->priv.x11.xlib.imdepth == 32) &&
            (buf->priv.mask.r == 0xff0000) &&
            (buf->priv.mask.g == 0x00ff00) &&
            (buf->priv.mask.b == 0x0000ff))

ie if the x11 buffer is 32bpp and rotation is 0 and rgb masks are as above...
but wtf? how is this being triggered.
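
To make the condition easier to reason about, here it is restated as a standalone predicate. The struct is a hypothetical, flattened stand-in for the fields the check reads (the real ones live under buf->priv), and the 15bpp values used in the usage note below are the usual 5-5-5 masks with an assumed 16 bits per pixel, not values taken from the crash dump.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the outbuf fields used by the fast-path check. */
struct outbuf_info {
    int      rot;      /* output rotation in degrees */
    int      imdepth;  /* bits per pixel of the x(shm)image */
    uint32_t mask_r, mask_g, mask_b;
};

/* True only when rendering straight into the x(shm)image buffer is safe:
 * no rotation, 32 bits per pixel, and the native 0xRRGGBB channel layout. */
static bool can_render_direct(const struct outbuf_info *b)
{
    return (b->rot == 0) &&
           (b->imdepth == 32) &&
           (b->mask_r == 0xff0000) &&
           (b->mask_g == 0x00ff00) &&
           (b->mask_b == 0x0000ff);
}
```

A correctly described 15bpp setup, e.g. `{0, 16, 0x7c00, 0x03e0, 0x001f}`, fails this test on both depth and masks, so the fast path can only be reached if the values fed into it do not describe the buffer that was actually allocated.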

so let me ask you another q... is evas built with xlib or xcb support? there is
another xcb bit of code that does the same:

        if ((buf->rot == 0) &&
            (buf->priv.x11.xcb.imdepth == 32) &&
            (buf->priv.mask.r == 0xff0000) &&
            (buf->priv.mask.g == 0x00ff00) &&
            (buf->priv.mask.b == 0x0000ff))

all the same...

so i am now baffled as to how you can have this path triggered? the stuff that
sets up the depth is in evas_software_xlib_outbuf_setup_x(). i.e.

           if (((vis->class == TrueColor) || (vis->class == DirectColor)) &&
               (x_depth > 8))
             {
                buf->priv.mask.r = (DATA32) vis->red_mask;
                buf->priv.mask.g = (DATA32) vis->green_mask;
                buf->priv.mask.b = (DATA32) vis->blue_mask;
                if (buf->priv.x11.xlib.swap)
                  {
                     SWAP32(buf->priv.mask.r);
                     SWAP32(buf->priv.mask.g);
                     SWAP32(buf->priv.mask.b);
                  }

ok - so 15bpp can go here.. BUT we shouldnt get an imdepth of 32... and the
masks should not be the 24/32bpp masks as above. they will be the smaller 15bpp
masks... my only thoughts (and why my first q was about xdpyinfo) is that
somehow we get past this with bizarre visual info. seemingly not. all i can
conclude at the moment is that we are somehow USING a 24/32bpp visual BUT
creating a 15bpp x(shm)image. but HOW...

well evas_software_xlib_x_output_buffer_new() creates our images. and it is
passed
                                                  buf->priv.x11.xlib.vis,
                                                  buf->priv.x11.xlib.depth,
either directly or indirectly via _find_xob() (which is passed these too - this
is the shm buffer cache that allows us to keep a pool of shm buffers around to
speed up rendering framerate avoiding re-allocation - expensive - need to be
filled with 0's by kernel etc.)... ignore the depth 1 ones - these are for
bitmap shape masks... so now who sets these?

evas_software_xlib_outbuf_setup_x() does. this is called from
_output_xlib_setup() or eng_setup(). in both cases the visual and depth come
from:

info->info.visual
info->info.depth

this is engine info - passed into the engine from its setup phase... by
ecore_evas. so ecore_evas must logically be choosing a 32bit visual BUT passing
in a 15bpp depth... WTF? next breadcrumb. so ecore_evas_x.c here we come.

        einfo->info.visual = einfo->func.best_visual_get(einfo);
        einfo->info.colormap = einfo->func.best_colormap_get(einfo);
        einfo->info.depth = einfo->func.best_depth_get(einfo);

oooh ooh ooh. it uses functions provided by the evas engine to get visual/depth
etc... provided in the info struct passed out by the engine... so.. back to
engine.

for visual:
     return DefaultVisual((Display *)connection, screen);

for depth:
     return DefaultDepth((Display *)connection, screen);

(again assuming xlib build here. there are xcb paths there and i'm ALSO
assuming screen 0... no multihead here... is screen 0?)

so.. my current bet is.. that this depth does not match the visual being used
(still my first suspicion but your info has helped me narrow down the forensics
and follow the breadcrumbs up until this point). this is the point of "i need
to know more". are we using xcb? or xlib? if xlib.. then defaultvisual and
defaultdepth dont match from x.. and that is highly bizarre. that is definitely
a bug x has to fix, BUT we can put some work-around code to detect this and/or
steal depth directly from the visual, but this will have a problem - then our
depth wont match the destination window either and we're all in deep poo as our
putimages will fail as depths dont match. if its the xcb code.. then i know
what i have to enable to test that path etc. but a quick scan of it makes it
look sane to me.
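
One possible shape for the "detect this" part of such a work-around is sketched below. It is only an assumption about how a check could look, not code from evas: it counts the bits set in the visual's rgb masks and flags the case where the masks need more bits than the reported depth provides. The function names are hypothetical, and it deliberately does nothing about the putimage depth problem mentioned above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Number of bits set in v. */
static int count_bits(uint32_t v)
{
    int n = 0;
    while (v) {
        n += (int)(v & 1u);
        v >>= 1;
    }
    return n;
}

/* Colour bits implied by the visual's masks (15 for 5-5-5, 24 for 8-8-8). */
static int depth_from_masks(uint32_t r, uint32_t g, uint32_t b)
{
    return count_bits(r) + count_bits(g) + count_bits(b);
}

/* False when the masks describe more colour bits than the depth can hold,
 * e.g. 8-8-8 masks paired with a depth of 15. */
static bool masks_fit_depth(int depth, uint32_t r, uint32_t g, uint32_t b)
{
    return depth_from_masks(r, g, b) <= depth;
}
```

In the suspected failure case, `masks_fit_depth(15, 0xff0000, 0x00ff00, 0x0000ff)` returns false, while a consistent 15bpp or 24/32bpp combination passes.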

so right now, ball is back in your court. as i said - things work in 16bpp for
me in xephyr which essentially (other than a different rgb mask than 15bpp by
having an extra bit of green) is the same as 15bpp codepaths...

-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    [email protected]
