Hi all,

I have an application that hangs after a call to DRI2GetBuffersWithFormat in Mesa.
I am looking for help with the issue and with how to diagnose it. The application hangs at some point while waiting for the X server to respond. One workaround is to move the mouse, which seems to generate events and gets the communication back on track. The issue is very consistent: it occurs at most 5 minutes after the X server is started, and usually sooner. It does not occur on Ubuntu 10.10, so something differs between the two setups.

The stack trace we consistently get for the X client is:

#0 0xb7741424 in __kernel_vsyscall ()
#1 0xb6d269b4 in pthread_cond_wait () from /lib/libpthread.so.0
#2 0xb6a03344 in ?? () from /usr/lib/libX11.so.6
#3 0xb6a02d2f in ?? () from /usr/lib/libX11.so.6
#4 0xb6a1e0d3 in _XReply () from /usr/lib/libX11.so.6
#5 0xb5de69a3 in DRI2GetBuffersWithFormat (dpy=0x8397fa0, drawable=10485772, width=0x88dd5a4, height=0x88dd5a8, attachments=0xbfaf0098, count=1, outCount=0xbfaf00c4) at dri2.c:454
#6 0xb5de478f in dri2GetBuffersWithFormat (driDrawable=0x88dd580, width=0x88dd5a4, height=0x88dd5a8, attachments=0xbfaf0098, count=1, out_count=0xbfaf00c4, loaderPrivate=0x88dd4e0) at dri2_glx.c:582
#7 0xb5a7d3df in intel_update_renderbuffers (context=0x835f260, drawable=0x88dd580) at intel_context.c:290
#8 0xb5a7d8cc in intel_prepare_render (intel=0x83997c0) at intel_context.c:438
#9 0xb5a7a99d in i915_render_start (intel=0x83997c0) at i915_vtbl.c:58
#10 0xb5a8ed9b in intelRenderStart (ctx=0x83997c0) at intel_tris.c:1087
#11 0xb5b888b3 in run_render (ctx=0x83997c0, stage=0x8607d88) at tnl/t_vb_render.c:276
#12 0xb5b7cb84 in _tnl_run_pipeline (ctx=0x83997c0) at tnl/t_pipeline.c:153
#13 0xb5a8f07d in intelRunPipeline (ctx=0x83997c0) at intel_tris.c:1075
#14 0xb5b7d284 in _tnl_draw_prims (ctx=0x83997c0, arrays=0x85f5e1c, prim=0x85f47f0, nr_prims=1, ib=0x0, min_index=0, max_index=3) at tnl/t_draw.c:478
#15 0xb5b7dd86 in _tnl_vbo_draw_prims (ctx=0x83997c0, arrays=0x85f5e1c, prim=0x85f47f0, nr_prims=1, ib=0x0, index_bounds_valid=1 '\001', min_index=0, max_index=3) at tnl/t_draw.c:384
#16 0xb5b75667 in vbo_exec_vtx_flush (exec=0x85f46b8, unmap=1 '\001') at vbo/vbo_exec_draw.c:384
#17 0xb5b72560 in vbo_exec_FlushVertices_internal (ctx=0x83997c0, unmap=0 '\0') at vbo/vbo_exec_api.c:876
#18 0xb5b72638 in vbo_exec_FlushVertices (ctx=0x83997c0, flags=1) at vbo/vbo_exec_api.c:910
#19 0xb5b4bb79 in _mesa_BindTexture (target=34037, texName=19) at main/texobj.c:1058
#20 0xb5df80b1 in glBindTexture (target=34037, texture=19) at ../../../src/mapi/glapi/glapitemp.h:1627
#21 0xb4a10e44 in glSetState () from /usr/lib/directfb-1.4-5-pure/gfxdrivers/libdirectfb_gl.so
#22 0xb6dc3ee1 in ?? () from /usr/lib/libdirectfb-1.4.so.5
#23 0xb6dc698c in dfb_gfxcard_blit () from /usr/lib/libdirectfb-1.4.so.5
#24 0xb6d78679 in ?? () from /usr/lib/libdirectfb-1.4.so.5
#25 0xb6e01825 in IDirectFBSurface::Blit () from /usr/lib/lib++dfb-1.0.so.0

I see the X server getting a request and sending a reply. After this it is back in the waiting-for-command state:

#0 0xb76fb424 in __kernel_vsyscall ()
#1 0xb73c07cd in select () from /lib/libc.so.6
#2 0x0809bb28 in WaitForSomething ()
#3 0x0806e0ae in ?? ()
#4 0x092b0da8 in ?? ()
#5 0x00000002 in ?? ()
#6 0x00000000 in ?? ()

With DEBUG_COMMUNICATION defined in os/io.c, I see the following output:

[snip]
REPLY: ClientIDX: 6 XEvent: type: 0xe detail: 0x0 seq#: 0x161b
REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0xe1 len: 5 seq#: 0x161c
REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0x63 len: 5 seq#: 0x161d
REQUEST: ClientIDX: 6, type: 0x3e data: 0x7 len: 7
REPLY: ClientIDX: 6 XEvent: type: 0xe detail: 0x0 seq#: 0x161e
REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0xe1 len: 5 seq#: 0x161f
REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0x63 len: 5 seq#: 0x1620
REQUEST: ClientIDX: 6, type: 0x3e data: 0x7 len: 7
REPLY: ClientIDX: 6 XEvent: type: 0xe detail: 0x0 seq#: 0x1621
REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0xe1 len: 5 seq#: 0x1622
REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0x63 len: 5 seq#: 0x1623
REQUEST: ClientIDX: 6, type: 0x3e data: 0x7 len: 7
REPLY: ClientIDX: 6 XEvent: type: 0xe detail: 0x0 seq#: 0x1624
REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0xe1 len: 5 seq#: 0x1625
REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0x63 len: 5 seq#: 0x1626
REQUEST: ClientIDX: 6, type: 0x3e data: 0x7 len: 7
REPLY: ClientIDX: 6 XEvent: type: 0xe detail: 0x0 seq#: 0x1627
REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0xe1 len: 5 seq#: 0x1628
REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0x63 len: 5 seq#: 0x1629
REQUEST: ClientIDX: 6, type: 0x3e data: 0x7 len: 7
REPLY: ClientIDX: 6 XEvent: type: 0xe detail: 0x0 seq#: 0x162a
REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0xe1 len: 5 seq#: 0x162b
REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0x63 len: 5 seq#: 0x162c

It keeps waiting there until I move the mouse, at which point I see more REPLY responses in a row.

My faint impression is that the X server does something wrong when buffering the reply and does not flush it. What is the right approach to see when the message is actually written to the socket? There seems to be quite a lot of cleverness around writing to the socket, probably for performance reasons.

Note: this is the same issue as:
<http://lists.freedesktop.org/archives/intel-gfx/2010-November/008850.html>

Software used (somewhat updated in the meantime):
* Linux 2.6.37-rc3, but same issue with 2.6.36 (and I think with 2.6.35 as well, but I'm not completely sure).
* mesa 7.9.
* xf86-video-intel 2.12.0, 2.13.0, 2.13.901.
* libdrm 2.4.22 (and today's trunk, as it had a lot of intel patches).

The issue looks very much the same as:
<http://www.pubbs.net/201003/xorg/2227-problem-using-an-mesa-based-app-with-recent-xorgmesaxf86-video-intelloop.html>

Thanks in advance,
- Joris
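
P.S. To make the suspicion above concrete, here is a toy sketch in C of the failure mode I have in mind: a reply that is queued in a per-client output buffer but never flushed stays invisible to the client until something else triggers a flush. This is only a model under my own assumptions. The names out_queue, queue_reply, flush_queue, readable_within and demo_unflushed_reply are hypothetical and are not taken from the X server sources, and I am not claiming the server's output path is structured this way.

```c
#include <assert.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical per-client output queue; NOT the X server's real structure. */
struct out_queue {
    int fd;          /* socket the queued bytes will eventually go to */
    char buf[256];   /* pending, not yet written bytes */
    size_t used;     /* number of pending bytes in buf */
};

/* Append a reply to the queue without touching the socket. */
static int queue_reply(struct out_queue *q, const void *data, size_t len)
{
    assert(q != NULL);
    if (len > sizeof(q->buf) - q->used)
        return -1;                       /* would overflow the toy buffer */
    memcpy(q->buf + q->used, data, len);
    q->used += len;
    return 0;
}

/* Write everything queued so far to the socket. */
static int flush_queue(struct out_queue *q)
{
    size_t off = 0;

    assert(q != NULL);
    while (off < q->used) {
        ssize_t n = write(q->fd, q->buf + off, q->used - off);
        if (n < 0)
            return -1;
        off += (size_t)n;
    }
    q->used = 0;
    return 0;
}

/* Return 1 if fd becomes readable within timeout_ms, 0 on timeout, -1 on error. */
static int readable_within(int fd, int timeout_ms)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    int r = poll(&p, 1, timeout_ms);

    if (r < 0)
        return -1;
    return (r > 0 && (p.revents & POLLIN)) ? 1 : 0;
}

/*
 * Queue a 32-byte dummy reply, check whether the "client" end sees it,
 * flush, and check again.
 * Result: bit 0 set = readable before the flush, bit 1 set = readable
 * after the flush, -1 = setup error.
 */
int demo_unflushed_reply(void)
{
    int sv[2];
    const char reply[32] = { 1 };        /* dummy payload, first byte 0x1 */
    int before, after;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    struct out_queue q = { .fd = sv[0], .used = 0 };

    if (queue_reply(&q, reply, sizeof reply) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    before = readable_within(sv[1], 100);   /* nothing written yet */

    if (flush_queue(&q) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    after = readable_within(sv[1], 100);    /* bytes are now on the socket */

    close(sv[0]);
    close(sv[1]);
    return (before == 1 ? 1 : 0) | (after == 1 ? 2 : 0);
}
```

Calling demo_unflushed_reply() from a small main() should show the client end staying silent until the flush happens, which is the behaviour I seem to be observing between the server and my client. If that is what is going on, the question becomes which event in the server is supposed to trigger the flush for this reply.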