On Thu, Feb 23, 2017 at 02:22:47AM -0500, Olivier Fourdan wrote:
> Hi Peter,
> 
> > > > ---
> > > >  RFC: This is probably sub-optimal and broken in many places, but it
> > > >       seems to avoid the memory corruption (so far)... At least it's a
> > > >       start, I guess.
> > > 
> > > It's certainly an improvement in correctness, I just hate it. ;)
> > > 
> > > One problem with this is there are some requests (GetImage in
> > > particular) whose reply is split up into multiple calls to
> > > WriteToClient, which means if an XICDP call happens on the input thread
> > > while GetImage is writing, the events will be interleaved with the
> > > reply data. In addition to corrupting the returned image, the client
> > > will almost certainly choke and die while attempting to process the
> > > next reply/event.
> > > 
> > > We can paper over that by fixing the WriteToClient callers to
> > > (recursively) take the io lock as well. But doing so highlights a
> > > design issue with this approach. GetImage replies are slow because
> > > they're a lot of data, so now (if you hit the same XICDP path) you've
> > > made the input thread _block_ until the request completes, and the
> > > mouse will get stuck.
> > > 
> > > What I'd like to see is the events built asynchronously and flushed
> > > from the main thread (possibly only if we can tell we're running on the
> > > input thread).
> > 
> > unless I'm confused about the thread, it's intention was to replace the
> > sigio handler. So while it updates the pointer, it never actually *sends*
> > events, it merely generates them and shoves them into the EQ. The events are
> > then taken from there in the main thread and processed.
> 
> From the backtraces I got, it does send them, see that backtrace I captured 
> yesterday:
> 
> #0  0x0000000006f8b91f in __GI_raise (sig=sig@entry=6) at 
> ../sysdeps/unix/sysv/linux/raise.c:58
> #1  0x0000000006f8d51a in __GI_abort () at abort.c:89
> #2  0x00000000005a156e in OsAbort () at utils.c:1355
> #3  0x00000000005a7163 in AbortServer () at log.c:877
> #4  0x00000000005a7f4d in FatalError (f=f@entry=0xaaf8ecf "%s:%d assertion 
> '%s' failed\n") at log.c:1015
> #5  0x000000000a974e36 in _kgem_submit (kgem=kgem@entry=0x9617000) at 
> kgem.c:4134
> #6  0x000000000a9b9f64 in kgem_submit (kgem=0x9617000) at kgem.h:382
> #7  0x000000000a9b9f64 in sna_accel_flush (sna=sna@entry=0x850800 
> <FlushCallback>) at sna_accel.c:17375
> #8  0x000000000a9b9fec in sna_flush_callback (list=<optimized out>, 
> user_data=0x850800 <FlushCallback>, call_data=<optimized out>) at 
> sna_accel.c:17399
> #9  0x000000000043c3e4 in _CallCallbacks (pcbl=0x36dc6b30, 
> pcbl@entry=0x850800 <FlushCallback>, call_data=0x20, 
> call_data@entry=0xb0bc150) at dixutils.c:737
> #10 0x000000000059dbd3 in CallCallbacks (call_data=0xb0bc150, pcbl=0x850800 
> <FlushCallback>) at ../include/callback.h:83
> #11 0x000000000059dbd3 in FlushClient (who=who@entry=0xb0bc150, 
> oc=oc@entry=0x9552800, __extraBuf=__extraBuf@entry=0x11027a60, 
> extraCount=extraCount@entry=32) at io.c:809
> #12 0x000000000059e13f in WriteToClient (who=who@entry=0xb0bc150, 
> count=count@entry=32, __buf=__buf@entry=0x11027a60) at io.c:763
> #13 0x0000000000442572 in WriteEventsToClient 
> (pClient=pClient@entry=0xb0bc150, count=<optimized out>, count@entry=1, 
> events=events@entry=0x11027a60) at events.c:6000
> #14 0x0000000000442710 in TryClientEvents (client=0xb0bc150, dev=<optimized 
> out>, pEvents=0x11027a60, count=1, mask=<optimized out>, filter=<optimized 
> out>, grab=0x0)
>     at events.c:2021
> #15 0x0000000000445e7a in DeliverEventToInputClients 
> (dev=dev@entry=0xfc85870, inputclients=0xd5726f0, win=win@entry=0xb2fed40, 
> events=events@entry=0x11027a60, count=count@entry=1, filter=filter@entry=16, 
> grab=0x0, client_return=0x11027998, mask_return=0x11027994) at events.c:2170
> #16 0x0000000000446167 in DeliverEventToWindowMask (mask_return=0x11027994, 
> client_return=0x11027998, grab=0x0, filter=16, count=1, events=0x11027a60, 
> win=0xb2fed40, dev=0xfc85870) at events.c:2213
> #17 0x0000000000446167 in DeliverEventsToWindow (pDev=pDev@entry=0xfc85870, 
> pWin=pWin@entry=0xb2fed40, pEvents=pEvents@entry=0x11027a60, 
> count=count@entry=1, filter=filter@entry=16, grab=grab@entry=0x0) at 
> events.c:2277
> #18 0x00000000005283b9 in SendEventToAllWindows (dev=dev@entry=0xfc85870, 
> mask=16, ev=ev@entry=0x11027a60, count=count@entry=1) at exevents.c:3045
> #19 0x0000000000533d1f in send_property_event (dev=dev@entry=0xfc85870, 
> property=<optimized out>, what=<optimized out>) at xiproperty.c:208
> #20 0x00000000005346d8 in XIChangeDeviceProperty (dev=0xfc85870, 
> property=<optimized out>, type=<optimized out>, format=<optimized out>, 
> mode=<optimized out>, len=<optimized out>, value=0x11027b50, sendevent=1) at 
> xiproperty.c:802
> #21 0x000000000fe1d039 in wcmUpdateSerial () at 
> /usr/lib64/xorg/modules/input/wacom_drv.so
> #22 0x000000000fe131c2 in wcmSendEvents () at 
> /usr/lib64/xorg/modules/input/wacom_drv.so
> #23 0x000000000fe14764 in wcmEvent () at 
> /usr/lib64/xorg/modules/input/wacom_drv.so
> #24 0x000000000fe1a476 in usbParse () at 
> /usr/lib64/xorg/modules/input/wacom_drv.so
> #25 0x000000000fe11feb in wcmReadPacket () at 
> /usr/lib64/xorg/modules/input/wacom_drv.so
> #26 0x000000000fe12226 in wcmDevReadInput () at 
> /usr/lib64/xorg/modules/input/wacom_drv.so
> #27 0x000000000059cc8c in InputReady (fd=32, xevents=1, data=0xfc94e90) at 
> inputthread.c:173
> #28 0x000000000059f271 in ospoll_wait (ospoll=0xd1172f0, 
> timeout=timeout@entry=-1) at ospoll.c:412
> #29 0x000000000059cae6 in InputThreadDoWork (arg=<optimized out>) at 
> inputthread.c:360
> #30 0x0000000006d3f6ca in start_thread (arg=0x11029700) at 
> pthread_create.c:333
> #31 0x000000000705df7f in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:105
> 
> > So I'm wondering if the simple fix for all these bugs would be an
> > xorg_backtrace() + return in WriteToClient whenever it's called from within
> > the input thread.
> 
> But then what would happen with the send_property_event() from the
> XIChangeDeviceProperty() in this case?

it gets discarded, which isn't the end of the world in this case. and if
there are other callers, I'm assuming that not sending something out is
better than crashing the server or stomping memory. 

fwiw, a patch for this instance is already on the linuxwacom list.

Cheers,
   Peter
_______________________________________________
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: https://lists.x.org/mailman/listinfo/xorg-devel

Reply via email to