Re: [PATCH 0/7] HID: picoLCD updates
On Sun, 19 Aug 2012, Bruno Prémont wrote: > > I don't know bout hid_hw_close(). Certainly no more reports should be > > submitted following usbhid_stop(). > > Ok, I did just that, prevent new calls to usbhid_submit_report(), after > calling hid_hw_close(), fixed one bug in my code that triggers the NULL > pointer dereference (calling hid_set_drvdata(hdev, NULL) too early). > > Now I'm still seeing the bad paging request in _mmx_memcpy(), though rather > sporadically. > The last ones I saw were during remove() around the time of calling > hid_hw_close() > and hid_hw_stop(). Adding a printk() between the two (at least while picoLCD > is hosting fbcon) makes it very improbably for the bad page to happen. > > It looks like low-level driver did free memory in hid_hw_close() for some > in-flight URB and thus things break in following USB interrupt. No, memory is not freed in usbhid_close(). It is freed in usbhid_stop(). > From mapping trace information to source it seems: > usbhid/hid-core.c: > static int hid_submit_out(struct hid_device *hid) > { > struct hid_report *report; > char *raw_report; > struct usbhid_device *usbhid = hid->driver_data; > int r; > > report = usbhid->out[usbhid->outtail].report; > raw_report = usbhid->out[usbhid->outtail].raw_report; > > usbhid->urbout->transfer_buffer_length = ((report->size - 1) >> 3) + > 1 + (report->id > 0); > usbhid->urbout->dev = hid_to_usb_dev(hid); > if (raw_report) { > memcpy(usbhid->outbuf, raw_report, > usbhid->urbout->transfer_buffer_length); > ^^^_ this is exploding Right. Like I said, usbhid->outbuf is freed in hid_free_buffers(), which is called from usbhid_stop(). Forget about usbhid_close(). The race is between hid_submit_out() and usbhid_stop(). Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
On Sat, 18 August 2012 Alan Stern wrote: > On Sat, 18 Aug 2012, Bruno Prémont wrote: > > > One thing I just though about, how does usbhid handle the calls to > > usbhid_submit_report() when hid_hw_stop()/hid_hw_close() have already > > been called? > > Looks like they aren't synchronized at all. That's a bug. > usbhid_submit_report() should check the HID_DISCONNECTED flag. > > > I will attempt to see if it makes a difference to shortcut my > > usbhid_submit_report() calls from the point on I have called > > hid_hw_close()... > > I don't know bout hid_hw_close(). Certainly no more reports should be > submitted following usbhid_stop(). Ok, I did just that, prevent new calls to usbhid_submit_report(), after calling hid_hw_close(), fixed one bug in my code that triggers the NULL pointer dereference (calling hid_set_drvdata(hdev, NULL) too early). Now I'm still seeing the bad paging request in _mmx_memcpy(), though rather sporadically. The last ones I saw were during remove() around the time of calling hid_hw_close() and hid_hw_stop(). Adding a printk() between the two (at least while picoLCD is hosting fbcon) makes it very improbably for the bad page to happen. It looks like low-level driver did free memory in hid_hw_close() for some in-flight URB and thus things break in following USB interrupt. >From mapping trace information to source it seems: usbhid/hid-core.c: static int hid_submit_out(struct hid_device *hid) { struct hid_report *report; char *raw_report; struct usbhid_device *usbhid = hid->driver_data; int r; report = usbhid->out[usbhid->outtail].report; raw_report = usbhid->out[usbhid->outtail].raw_report; usbhid->urbout->transfer_buffer_length = ((report->size - 1) >> 3) + 1 + (report->id > 0); usbhid->urbout->dev = hid_to_usb_dev(hid); if (raw_report) { memcpy(usbhid->outbuf, raw_report, usbhid->urbout->transfer_buffer_length); ^^^_ this is exploding kfree(raw_report); usbhid->out[usbhid->outtail].raw_report = NULL; } dbg_hid("submitting out urb\n"); r = usb_submit_urb(usbhid->urbout, GFP_ATOMIC); if (r < 0) { hid_err(hid, "usb_submit_urb(out) failed: %d\n", r); return r; } usbhid->last_out = jiffies; return 0; } Bruno -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
On Sat, 18 August 2012 Alan Stern wrote: On Sat, 18 Aug 2012, Bruno Prémont wrote: One thing I just though about, how does usbhid handle the calls to usbhid_submit_report() when hid_hw_stop()/hid_hw_close() have already been called? Looks like they aren't synchronized at all. That's a bug. usbhid_submit_report() should check the HID_DISCONNECTED flag. I will attempt to see if it makes a difference to shortcut my usbhid_submit_report() calls from the point on I have called hid_hw_close()... I don't know bout hid_hw_close(). Certainly no more reports should be submitted following usbhid_stop(). Ok, I did just that, prevent new calls to usbhid_submit_report(), after calling hid_hw_close(), fixed one bug in my code that triggers the NULL pointer dereference (calling hid_set_drvdata(hdev, NULL) too early). Now I'm still seeing the bad paging request in _mmx_memcpy(), though rather sporadically. The last ones I saw were during remove() around the time of calling hid_hw_close() and hid_hw_stop(). Adding a printk() between the two (at least while picoLCD is hosting fbcon) makes it very improbably for the bad page to happen. It looks like low-level driver did free memory in hid_hw_close() for some in-flight URB and thus things break in following USB interrupt. From mapping trace information to source it seems: usbhid/hid-core.c: static int hid_submit_out(struct hid_device *hid) { struct hid_report *report; char *raw_report; struct usbhid_device *usbhid = hid-driver_data; int r; report = usbhid-out[usbhid-outtail].report; raw_report = usbhid-out[usbhid-outtail].raw_report; usbhid-urbout-transfer_buffer_length = ((report-size - 1) 3) + 1 + (report-id 0); usbhid-urbout-dev = hid_to_usb_dev(hid); if (raw_report) { memcpy(usbhid-outbuf, raw_report, usbhid-urbout-transfer_buffer_length); ^^^_ this is exploding kfree(raw_report); usbhid-out[usbhid-outtail].raw_report = NULL; } dbg_hid(submitting out urb\n); r = usb_submit_urb(usbhid-urbout, GFP_ATOMIC); if (r 0) { hid_err(hid, usb_submit_urb(out) failed: %d\n, r); return r; } usbhid-last_out = jiffies; return 0; } Bruno -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
On Sun, 19 Aug 2012, Bruno Prémont wrote: I don't know bout hid_hw_close(). Certainly no more reports should be submitted following usbhid_stop(). Ok, I did just that, prevent new calls to usbhid_submit_report(), after calling hid_hw_close(), fixed one bug in my code that triggers the NULL pointer dereference (calling hid_set_drvdata(hdev, NULL) too early). Now I'm still seeing the bad paging request in _mmx_memcpy(), though rather sporadically. The last ones I saw were during remove() around the time of calling hid_hw_close() and hid_hw_stop(). Adding a printk() between the two (at least while picoLCD is hosting fbcon) makes it very improbably for the bad page to happen. It looks like low-level driver did free memory in hid_hw_close() for some in-flight URB and thus things break in following USB interrupt. No, memory is not freed in usbhid_close(). It is freed in usbhid_stop(). From mapping trace information to source it seems: usbhid/hid-core.c: static int hid_submit_out(struct hid_device *hid) { struct hid_report *report; char *raw_report; struct usbhid_device *usbhid = hid-driver_data; int r; report = usbhid-out[usbhid-outtail].report; raw_report = usbhid-out[usbhid-outtail].raw_report; usbhid-urbout-transfer_buffer_length = ((report-size - 1) 3) + 1 + (report-id 0); usbhid-urbout-dev = hid_to_usb_dev(hid); if (raw_report) { memcpy(usbhid-outbuf, raw_report, usbhid-urbout-transfer_buffer_length); ^^^_ this is exploding Right. Like I said, usbhid-outbuf is freed in hid_free_buffers(), which is called from usbhid_stop(). Forget about usbhid_close(). The race is between hid_submit_out() and usbhid_stop(). Alan Stern -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
On Sat, 18 Aug 2012, Bruno Prémont wrote: > One thing I just though about, how does usbhid handle the calls to > usbhid_submit_report() when hid_hw_stop()/hid_hw_close() have already > been called? Looks like they aren't synchronized at all. That's a bug. usbhid_submit_report() should check the HID_DISCONNECTED flag. > I will attempt to see if it makes a difference to shortcut my > usbhid_submit_report() calls from the point on I have called hid_hw_close()... I don't know bout hid_hw_close(). Certainly no more reports should be submitted following usbhid_stop(). Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
On Sat, 18 August 2012 Bruno Prémont wrote: > One thing I just though about, how does usbhid handle the calls to > usbhid_submit_report() when hid_hw_stop()/hid_hw_close() have already > been called? > I will attempt to see if it makes a difference to shortcut my > usbhid_submit_report() calls from the point on I have called hid_hw_close()... I don't know if I'm progressing or just drawing circles... I included a check right before calling usbhid_submit_report() to prevent calling it if hid_hw_close()/hid_hw_stop() has been called and that part works as expected, just that at some moment it explodes on a NULL-pointer dereference. Initially I had usbhid_submit_report macro: #define usbhid_submit_report(a, b, c) \ do { \ picolcd_debug_out_report(d, a, b); \ usbhid_submit_report(a, b, c); \ } while (0) Now to make sure no new reports get submitted once hid_hw_stop() was called I changed it to: #define usbhid_submit_report(a, b, c) \ do { \ struct picolcd_data *d = hid_get_drvdata(a); \ unsigned long flags; \ picolcd_debug_out_report(d, a, b); \ spin_lock_irqsave(>lock, flags); \ if (!(d->status & PICOLCD_FAILED)) \ usbhid_submit_report(a, b, c); \ else \ printk(KERN_INFO "Submitting report after hw_stop!\n"); \ spin_unlock_irqrestore(>lock, flags); \ } while (0) with the printk() to spam dmesg :) The NULL-pointer dereference I got inside picolcd_fb_deferred_io() was due to me calling hid_set_drvdata(hdev, NULL) too early in remove(), fixed. But with that fixed I still encounter the bad page, probably around the time of hid_hw_stop(), though it always takes the HARD LOCK watchdog to trigger for the trace to show (probably because or IRQ context): It seems to happen during the bind/unbind following the unbind that produces a hid-picolcd 0003:04D8:C002.0003: usb_submit_urb(out) failed: -1 either a BUG or even as reboot and happend between hid_hw_close() and hid_hw_stop() though my extra printks seem to reduce probability of bad things to happen. [ 1217.643017] BUG: unable to handle kernel paging request at c7cb [ 1217.643017] IP: [] _mmx_memcpy+0x27/0x16c [ 1217.643017] *pde = 19078063 *pte = 07cb0161 [ 1217.643017] Oops: 0003 [#1] [ 1217.643017] Modules linked in: hid_picolcd fb_sys_fops sysimgblt sysfillrect syscopyarea drm_kms_helper nfs3 nfs_acl nfs lockd sunrpc [ 1217.643017] Pid: 0, comm: swapper Not tainted 3.6.0-rc2-jupiter-8-gd3c2b0c-dirty #2 NVIDIA Corporation. nFORCE-MCP/MS-6373 [ 1217.643017] EIP: 0060:[] EFLAGS: 00010013 CPU: 0 [ 1217.643017] EIP is at _mmx_memcpy+0x27/0x16c [ 1217.643017] EAX: dd40a000 EBX: dd6ea010 ECX: 035af77b EDX: d9108c80 [ 1217.643017] ESI: d9121c00 EDI: c7cb EBP: dd40bebc ESP: dd40bea0 [ 1217.643017] DS: 007b ES: 007b FS: GS: 00e0 SS: 0068 [ 1217.643017] CR0: 8005003b CR2: c7cb CR3: 1cf1c000 CR4: 07d0 [ 1217.643017] DR0: DR1: DR2: DR3: [ 1217.643017] DR6: 0ff0 DR7: 0400 [ 1217.643017] Process swapper (pid: 0, ti=dd40a000 task=c15d24c0 task.ti=c15c6000) [ 1217.643017] Stack: [ 1217.643017] 0d6d6d6f dd40beac c109bba4 c7c97080 dd6ea010 c7c97080 dceb3030 dd40bee4 [ 1217.643017] c133b364 c7d4 c7d4 dd40bed4 dceb3030 d9108c80 dd6ea010 0046 [ 1217.643017] dceb3030 dd40bf04 c133bea0 d6d6 dbe901b0 fffe dbe901b0 fffe [ 1217.643017] Call Trace: [ 1217.643017] [] ? free_compound_page+0x14/0x20 [ 1217.643017] [] hid_submit_out+0xa4/0x130 [ 1217.643017] [] hid_irq_out+0xa0/0x100 [ 1217.643017] [] usb_hcd_giveback_urb+0x4e/0x90 [ 1217.643017] [] finish_urb+0xb2/0xf0 [ 1217.643017] [] finish_unlinks+0x168/0x320 [ 1217.643017] [] ohci_irq+0x27f/0x300 [ 1217.643017] [] ? unmask_irq+0x20/0x20 [ 1217.643017] [] usb_hcd_irq+0x1e/0x40 [ 1217.643017] [] handle_irq_event_percpu+0x6d/0x1c0 [ 1217.643017] [] ? unmask_irq+0x20/0x20 [ 1217.643017] [] handle_irq_event+0x28/0x40 [ 1217.643017] [] handle_fasteoi_irq+0x8a/0xe0 [ 1217.643017] [ 1217.643017] [] ? do_IRQ+0x3a/0xb0 [ 1217.643017] [] ? common_interrupt+0x30/0x38 [ 1217.643017] [] ? default_idle+0x68/0xd0 [ 1217.643017] [] ? cpu_idle+0x8f/0xc0 [ 1217.643017] [] ? rest_init+0x67/0x70 [ 1217.643017] [] ? start_kernel+0x2a3/0x2aa [ 1217.643017] [] ? repair_env_string+0x51/0x51 [ 1217.643017] [] ? i386_start_kernel+0x44/0x46 [ 1217.643017] Code: 90 90 90 90 55 89 e5 57 56 89 d6 53 83 ec 10 89 45 f0 89 4d e4 89 e0 25 00 e0 ff ff f7 40 14 00 ff ff 07 74 17 c1 e9 02 8b 7d f0 a5 8b 4d e4 83 e1 03 74 02 f3 a4 e9 27 01 00 00 8b 45 1 [ 1217.643017] EIP: [] _mmx_memcpy+0x27/0x16c SS:ESP 0068:dd40bea0 [ 1217.643017] CR2: c7cb [ 1217.643017] ---[ end trace fbaa4c5261e1a846 ]--- Bruno -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a
Re: [PATCH 0/7] HID: picoLCD updates
On Sat, 18 August 2012 Alan Stern wrote: > On Sat, 18 Aug 2012, Bruno Prémont wrote: > > On Thu, 16 August 2012 Jiri Kosina wrote: > > > On Thu, 16 Aug 2012, Bruno Prémont wrote: > > > > > > > > I don't really understand this explanation. Once usb_kill_urb() > > > > > returns, > > > > > the URB should be available for future use (and therefore all queues > > > > > completely drained). > > > > > > > > I won't have time today to check, though my guess is that on each > > > > echo $usb-id > bind; echo $usb-id > unbind > > > > under /sys/bus/hid/drivers/hid-picolcd/ the USB urb queue fills a bit > > > > does > > > > not get cleared. > > > > > > > > Is usb_kill_urb() called when unbinding just the specific hid driver? > > > > > > Yup, through hid_hw_stop() -> usbhid_stop(). > > > > > > > If so my short timing between bind/unbind must be triggering something > > > > else... > > > > > > > > Otherwise I'm missing something as at first time I got no "output queue > > > > full" > > > > messages, but as I repeated the bind/unbind sequences the prints per > > > > bind/unbind > > > > iteration increased in number. > > > > > > > > Anyhow, on Friday evening/week-end I will continue digging and report > > > > back with my > > > > findings. > > > > Huh, after changing some of the hid-picolcd data in order to have less racy > > coupling between hid and framebuffer I'm now dying way too often in > > _mmx_memcpy > > and most of the time I don't get a (complete) trace... > > There was a similar problem reported recently. It turned out to be > caused by a __devinitconst annotation attached to a usb_device_id > table. > > If there are any __devinit* annotations in the hid-picolcd driver, you > should see if removing them helps. There is no such annotation around in hid-picolcd. One thing I just though about, how does usbhid handle the calls to usbhid_submit_report() when hid_hw_stop()/hid_hw_close() have already been called? I will attempt to see if it makes a difference to shortcut my usbhid_submit_report() calls from the point on I have called hid_hw_close()... Bruno -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
On Sat, 18 Aug 2012, Bruno Prémont wrote: > Hi Jiri, > > [CCing Alan Stern] > > On Thu, 16 August 2012 Jiri Kosina wrote: > > On Thu, 16 Aug 2012, Bruno Prémont wrote: > > > > > > I don't really understand this explanation. Once usb_kill_urb() > > > > returns, > > > > the URB should be available for future use (and therefore all queues > > > > completely drained). > > > > > > I won't have time today to check, though my guess is that on each > > > echo $usb-id > bind; echo $usb-id > unbind > > > under /sys/bus/hid/drivers/hid-picolcd/ the USB urb queue fills a bit does > > > not get cleared. > > > > > > Is usb_kill_urb() called when unbinding just the specific hid driver? > > > > Yup, through hid_hw_stop() -> usbhid_stop(). > > > > > If so my short timing between bind/unbind must be triggering something > > > else... > > > > > > Otherwise I'm missing something as at first time I got no "output queue > > > full" > > > messages, but as I repeated the bind/unbind sequences the prints per > > > bind/unbind > > > iteration increased in number. > > > > > > Anyhow, on Friday evening/week-end I will continue digging and report > > > back with my > > > findings. > > Huh, after changing some of the hid-picolcd data in order to have less racy > coupling between hid and framebuffer I'm now dying way too often in > _mmx_memcpy > and most of the time I don't get a (complete) trace... There was a similar problem reported recently. It turned out to be caused by a __devinitconst annotation attached to a usb_device_id table. If there are any __devinit* annotations in the hid-picolcd driver, you should see if removing them helps. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
Hi Jiri, [CCing Alan Stern] On Thu, 16 August 2012 Jiri Kosina wrote: > On Thu, 16 Aug 2012, Bruno Prémont wrote: > > > > I don't really understand this explanation. Once usb_kill_urb() returns, > > > the URB should be available for future use (and therefore all queues > > > completely drained). > > > > I won't have time today to check, though my guess is that on each > > echo $usb-id > bind; echo $usb-id > unbind > > under /sys/bus/hid/drivers/hid-picolcd/ the USB urb queue fills a bit does > > not get cleared. > > > > Is usb_kill_urb() called when unbinding just the specific hid driver? > > Yup, through hid_hw_stop() -> usbhid_stop(). > > > If so my short timing between bind/unbind must be triggering something > > else... > > > > Otherwise I'm missing something as at first time I got no "output queue > > full" > > messages, but as I repeated the bind/unbind sequences the prints per > > bind/unbind > > iteration increased in number. > > > > Anyhow, on Friday evening/week-end I will continue digging and report back > > with my > > findings. Huh, after changing some of the hid-picolcd data in order to have less racy coupling between hid and framebuffer I'm now dying way too often in _mmx_memcpy and most of the time I don't get a (complete) trace... The only full trace I got was: [ 3857.426136] BUG: unable to handle kernel paging request at dbdf9000 [ 3857.432555] IP: [] _mmx_memcpy+0x27/0x16c [ 3857.435906] *pde = 1bebb063 *pte = 1bdf9161 [ 3857.435906] Oops: 0003 [#1] [ 3857.435906] Modules linked in: hid_picolcd fb_sys_fops sysimgblt sysfillrect syscopyarea drm_kms_helper nfs3 nfs_acl nfs lockd sunrpc [ 3857.435906] Pid: 1935, comm: bash Not tainted 3.6.0-rc1-jupiter-00363-g0e8ccbc #1 NVIDIA Corporation. nFORCE-MCP/MS-6373 [ 3857.435906] EIP: 0060:[] EFLAGS: 00010013 CPU: 0 [ 3857.435906] EIP is at _mmx_memcpy+0x27/0x16c [ 3857.435906] EAX: dd40a000 EBX: dd6ea010 ECX: 035af77b EDX: dbdf4230 [ 3857.435906] ESI: dbe0d1b0 EDI: dbdf9000 EBP: dd40bea8 ESP: dd40be8c [ 3857.435906] DS: 007b ES: 007b FS: GS: 00e0 SS: 0068 [ 3857.435906] CR0: 8005003b CR2: dbdf9000 CR3: 1d7ee000 CR4: 07d0 [ 3857.435906] DR0: DR1: DR2: DR3: [ 3857.435906] DR6: 0ff0 DR7: 0400 [ 3857.435906] Process bash (pid: 1935, ti=dd40a000 task=dcdb6120 task.ti=dbce2000) [ 3857.435906] Stack: [ 3857.435906] 0d6d6d6f dd6eb49c dd40beb8 dbde0080 dd6ea010 dbde0080 dbd31010 dd40bed0 [ 3857.435906] c133b364 ddfaa0e0 dbd31010 dbdf4230 dd6ea010 0046 [ 3857.435906] dbd31010 dd40bef0 c133bea0 dbd0c3f0 dbd0c3f0 [ 3857.435906] Call Trace: [ 3857.435906] [] hid_submit_out+0xa4/0x130 [ 3857.435906] [] hid_irq_out+0xa0/0x100 [ 3857.435906] [] usb_hcd_giveback_urb+0x4e/0x90 [ 3857.435906] [] finish_urb+0xb2/0xf0 [ 3857.435906] [] takeback_td+0x3d/0x100 [ 3857.435906] [] dl_done_list+0x14f/0x180 [ 3857.435906] [] ohci_irq+0x191/0x300 [ 3857.435906] [] ? unmask_irq+0x20/0x20 [ 3857.435906] [] usb_hcd_irq+0x1e/0x40 [ 3857.435906] [] handle_irq_event_percpu+0x6d/0x1c0 [ 3857.435906] [] ? unmask_irq+0x20/0x20 [ 3857.435906] [] handle_irq_event+0x28/0x40 [ 3857.435906] [] handle_fasteoi_irq+0x8a/0xe0 [ 3857.435906] [ 3857.435906] [] ? do_IRQ+0x3a/0xb0 [ 3857.435906] [] ? common_interrupt+0x30/0x38 [ 3857.435906] [] ? __do_softirq+0x53/0x1c0 [ 3857.435906] [] ? local_bh_enable_ip+0x80/0x80 [ 3857.435906] [ 3857.435906] [] ? irq_exit+0x3e/0xa0 [ 3857.435906] [] ? smp_apic_timer_interrupt+0x6e/0x80 [ 3857.435906] [] ? apic_timer_interrupt+0x31/0x38 [ 3857.435906] [] ? vprintk_emit+0x34a/0x390 [ 3857.435906] [] ? printk+0x38/0x3a [ 3857.435906] [] ? picolcd_init_framebuffer+0x261/0x350 [hid_picolcd] [ 3857.435906] [] ? picolcd_probe+0x3d1/0x5a0 [hid_picolcd] [ 3857.435906] [] ? hid_device_probe+0x67/0xf0 [ 3857.435906] [] ? driver_sysfs_add+0x57/0x80 [ 3857.435906] [] ? driver_probe_device+0xbd/0x1d0 [ 3857.435906] [] ? hid_match_device+0x7b/0x90 [ 3857.435906] [] ? driver_bind+0x75/0xd0 [ 3857.435906] [] ? driver_unbind+0x90/0x90 [ 3857.435906] [] ? drv_attr_store+0x27/0x30 [ 3857.435906] [] ? sysfs_write_file+0xac/0xf0 [ 3857.435906] [] ? vfs_write+0x9c/0x130 [ 3857.435906] [] ? sys_dup3+0x11f/0x160 [ 3857.435906] [] ? sysfs_poll+0x90/0x90 [ 3857.435906] [] ? sys_write+0x3d/0x70 [ 3857.435906] [] ? sysenter_do_call+0x12/0x26 [ 3857.435906] Code: 90 90 90 90 55 89 e5 57 56 89 d6 53 83 ec 10 89 45 f0 89 4d e4 89 e0 25 00 e0 ff ff f7 40 14 00 ff ff 07 74 17 c1 e9 02 8b 7d f0 a5 8b 4d e4 83 e1 03 74 02 f3 a4 e9 27 01 00 00 8b 45 1 [ 3857.435906] EIP: [] _mmx_memcpy+0x27/0x16c SS:ESP 0068:dd40be8c [ 3857.435906] CR2: dbdf9000 [ 3857.435906] ---[ end trace 10ae520e65e3b763 ]--- [ 3857.435906] Kernel panic - not syncing: Fatal exception in interrupt It happens around the time of calling device_create_file(dev, _attr_fb_update_rate); that is, after picolcd_fb_reset(data, 1) has returned
Re: [PATCH 0/7] HID: picoLCD updates
Hi Jiri, [CCing Alan Stern] On Thu, 16 August 2012 Jiri Kosina jkos...@suse.cz wrote: On Thu, 16 Aug 2012, Bruno Prémont wrote: I don't really understand this explanation. Once usb_kill_urb() returns, the URB should be available for future use (and therefore all queues completely drained). I won't have time today to check, though my guess is that on each echo $usb-id bind; echo $usb-id unbind under /sys/bus/hid/drivers/hid-picolcd/ the USB urb queue fills a bit does not get cleared. Is usb_kill_urb() called when unbinding just the specific hid driver? Yup, through hid_hw_stop() - usbhid_stop(). If so my short timing between bind/unbind must be triggering something else... Otherwise I'm missing something as at first time I got no output queue full messages, but as I repeated the bind/unbind sequences the prints per bind/unbind iteration increased in number. Anyhow, on Friday evening/week-end I will continue digging and report back with my findings. Huh, after changing some of the hid-picolcd data in order to have less racy coupling between hid and framebuffer I'm now dying way too often in _mmx_memcpy and most of the time I don't get a (complete) trace... The only full trace I got was: [ 3857.426136] BUG: unable to handle kernel paging request at dbdf9000 [ 3857.432555] IP: [c11e2be7] _mmx_memcpy+0x27/0x16c [ 3857.435906] *pde = 1bebb063 *pte = 1bdf9161 [ 3857.435906] Oops: 0003 [#1] [ 3857.435906] Modules linked in: hid_picolcd fb_sys_fops sysimgblt sysfillrect syscopyarea drm_kms_helper nfs3 nfs_acl nfs lockd sunrpc [ 3857.435906] Pid: 1935, comm: bash Not tainted 3.6.0-rc1-jupiter-00363-g0e8ccbc #1 NVIDIA Corporation. nFORCE-MCP/MS-6373 [ 3857.435906] EIP: 0060:[c11e2be7] EFLAGS: 00010013 CPU: 0 [ 3857.435906] EIP is at _mmx_memcpy+0x27/0x16c [ 3857.435906] EAX: dd40a000 EBX: dd6ea010 ECX: 035af77b EDX: dbdf4230 [ 3857.435906] ESI: dbe0d1b0 EDI: dbdf9000 EBP: dd40bea8 ESP: dd40be8c [ 3857.435906] DS: 007b ES: 007b FS: GS: 00e0 SS: 0068 [ 3857.435906] CR0: 8005003b CR2: dbdf9000 CR3: 1d7ee000 CR4: 07d0 [ 3857.435906] DR0: DR1: DR2: DR3: [ 3857.435906] DR6: 0ff0 DR7: 0400 [ 3857.435906] Process bash (pid: 1935, ti=dd40a000 task=dcdb6120 task.ti=dbce2000) [ 3857.435906] Stack: [ 3857.435906] 0d6d6d6f dd6eb49c dd40beb8 dbde0080 dd6ea010 dbde0080 dbd31010 dd40bed0 [ 3857.435906] c133b364 ddfaa0e0 dbd31010 dbdf4230 dd6ea010 0046 [ 3857.435906] dbd31010 dd40bef0 c133bea0 dbd0c3f0 dbd0c3f0 [ 3857.435906] Call Trace: [ 3857.435906] [c133b364] hid_submit_out+0xa4/0x130 [ 3857.435906] [c133bea0] hid_irq_out+0xa0/0x100 [ 3857.435906] [c12e425e] usb_hcd_giveback_urb+0x4e/0x90 [ 3857.435906] [c12f9bd2] finish_urb+0xb2/0xf0 [ 3857.435906] [c12fa67d] takeback_td+0x3d/0x100 [ 3857.435906] [c12faccf] dl_done_list+0x14f/0x180 [ 3857.435906] [c12fc351] ohci_irq+0x191/0x300 [ 3857.435906] [c1077790] ? unmask_irq+0x20/0x20 [ 3857.435906] [c12e3ace] usb_hcd_irq+0x1e/0x40 [ 3857.435906] [c107549d] handle_irq_event_percpu+0x6d/0x1c0 [ 3857.435906] [c1077790] ? unmask_irq+0x20/0x20 [ 3857.435906] [c1075618] handle_irq_event+0x28/0x40 [ 3857.435906] [c107781a] handle_fasteoi_irq+0x8a/0xe0 [ 3857.435906] IRQ [ 3857.435906] [c1003e0a] ? do_IRQ+0x3a/0xb0 [ 3857.435906] [c140c870] ? common_interrupt+0x30/0x38 [ 3857.435906] [c1035143] ? __do_softirq+0x53/0x1c0 [ 3857.435906] [c10350f0] ? local_bh_enable_ip+0x80/0x80 [ 3857.435906] IRQ [ 3857.435906] [c103544e] ? irq_exit+0x3e/0xa0 [ 3857.435906] [c101dfde] ? smp_apic_timer_interrupt+0x6e/0x80 [ 3857.435906] [c140bee1] ? apic_timer_interrupt+0x31/0x38 [ 3857.435906] [c102fdba] ? vprintk_emit+0x34a/0x390 [ 3857.435906] [c1409f58] ? printk+0x38/0x3a [ 3857.435906] [de94b021] ? picolcd_init_framebuffer+0x261/0x350 [hid_picolcd] [ 3857.435906] [de949bd1] ? picolcd_probe+0x3d1/0x5a0 [hid_picolcd] [ 3857.435906] [c13321f7] ? hid_device_probe+0x67/0xf0 [ 3857.435906] [c128ac17] ? driver_sysfs_add+0x57/0x80 [ 3857.435906] [c128af1d] ? driver_probe_device+0xbd/0x1d0 [ 3857.435906] [c1331aab] ? hid_match_device+0x7b/0x90 [ 3857.435906] [c1289e65] ? driver_bind+0x75/0xd0 [ 3857.435906] [c1289df0] ? driver_unbind+0x90/0x90 [ 3857.435906] [c1289537] ? drv_attr_store+0x27/0x30 [ 3857.435906] [c111a59c] ? sysfs_write_file+0xac/0xf0 [ 3857.435906] [c10cca0c] ? vfs_write+0x9c/0x130 [ 3857.435906] [c10da2ef] ? sys_dup3+0x11f/0x160 [ 3857.435906] [c111a4f0] ? sysfs_poll+0x90/0x90 [ 3857.435906] [c10ccc7d] ? sys_write+0x3d/0x70 [ 3857.435906] [c140c357] ? sysenter_do_call+0x12/0x26 [ 3857.435906] Code: 90 90 90 90 55 89 e5 57 56 89 d6 53 83 ec 10 89 45 f0 89 4d e4 89 e0 25 00 e0 ff ff f7 40 14 00 ff ff 07 74 17 c1 e9 02 8b 7d f0 f3 a5 8b 4d e4 83 e1 03 74 02 f3 a4 e9 27 01 00 00 8b 45 1 [ 3857.435906] EIP: [c11e2be7] _mmx_memcpy+0x27/0x16c SS:ESP 0068:dd40be8c [ 3857.435906]
Re: [PATCH 0/7] HID: picoLCD updates
On Sat, 18 Aug 2012, Bruno Prémont wrote: Hi Jiri, [CCing Alan Stern] On Thu, 16 August 2012 Jiri Kosina jkos...@suse.cz wrote: On Thu, 16 Aug 2012, Bruno Prémont wrote: I don't really understand this explanation. Once usb_kill_urb() returns, the URB should be available for future use (and therefore all queues completely drained). I won't have time today to check, though my guess is that on each echo $usb-id bind; echo $usb-id unbind under /sys/bus/hid/drivers/hid-picolcd/ the USB urb queue fills a bit does not get cleared. Is usb_kill_urb() called when unbinding just the specific hid driver? Yup, through hid_hw_stop() - usbhid_stop(). If so my short timing between bind/unbind must be triggering something else... Otherwise I'm missing something as at first time I got no output queue full messages, but as I repeated the bind/unbind sequences the prints per bind/unbind iteration increased in number. Anyhow, on Friday evening/week-end I will continue digging and report back with my findings. Huh, after changing some of the hid-picolcd data in order to have less racy coupling between hid and framebuffer I'm now dying way too often in _mmx_memcpy and most of the time I don't get a (complete) trace... There was a similar problem reported recently. It turned out to be caused by a __devinitconst annotation attached to a usb_device_id table. If there are any __devinit* annotations in the hid-picolcd driver, you should see if removing them helps. Alan Stern -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
On Sat, 18 August 2012 Alan Stern st...@rowland.harvard.edu wrote: On Sat, 18 Aug 2012, Bruno Prémont wrote: On Thu, 16 August 2012 Jiri Kosina jkos...@suse.cz wrote: On Thu, 16 Aug 2012, Bruno Prémont wrote: I don't really understand this explanation. Once usb_kill_urb() returns, the URB should be available for future use (and therefore all queues completely drained). I won't have time today to check, though my guess is that on each echo $usb-id bind; echo $usb-id unbind under /sys/bus/hid/drivers/hid-picolcd/ the USB urb queue fills a bit does not get cleared. Is usb_kill_urb() called when unbinding just the specific hid driver? Yup, through hid_hw_stop() - usbhid_stop(). If so my short timing between bind/unbind must be triggering something else... Otherwise I'm missing something as at first time I got no output queue full messages, but as I repeated the bind/unbind sequences the prints per bind/unbind iteration increased in number. Anyhow, on Friday evening/week-end I will continue digging and report back with my findings. Huh, after changing some of the hid-picolcd data in order to have less racy coupling between hid and framebuffer I'm now dying way too often in _mmx_memcpy and most of the time I don't get a (complete) trace... There was a similar problem reported recently. It turned out to be caused by a __devinitconst annotation attached to a usb_device_id table. If there are any __devinit* annotations in the hid-picolcd driver, you should see if removing them helps. There is no such annotation around in hid-picolcd. One thing I just though about, how does usbhid handle the calls to usbhid_submit_report() when hid_hw_stop()/hid_hw_close() have already been called? I will attempt to see if it makes a difference to shortcut my usbhid_submit_report() calls from the point on I have called hid_hw_close()... Bruno -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
On Sat, 18 August 2012 Bruno Prémont bonb...@linux-vserver.org wrote: One thing I just though about, how does usbhid handle the calls to usbhid_submit_report() when hid_hw_stop()/hid_hw_close() have already been called? I will attempt to see if it makes a difference to shortcut my usbhid_submit_report() calls from the point on I have called hid_hw_close()... I don't know if I'm progressing or just drawing circles... I included a check right before calling usbhid_submit_report() to prevent calling it if hid_hw_close()/hid_hw_stop() has been called and that part works as expected, just that at some moment it explodes on a NULL-pointer dereference. Initially I had usbhid_submit_report macro: #define usbhid_submit_report(a, b, c) \ do { \ picolcd_debug_out_report(d, a, b); \ usbhid_submit_report(a, b, c); \ } while (0) Now to make sure no new reports get submitted once hid_hw_stop() was called I changed it to: #define usbhid_submit_report(a, b, c) \ do { \ struct picolcd_data *d = hid_get_drvdata(a); \ unsigned long flags; \ picolcd_debug_out_report(d, a, b); \ spin_lock_irqsave(d-lock, flags); \ if (!(d-status PICOLCD_FAILED)) \ usbhid_submit_report(a, b, c); \ else \ printk(KERN_INFO Submitting report after hw_stop!\n); \ spin_unlock_irqrestore(d-lock, flags); \ } while (0) with the printk() to spam dmesg :) The NULL-pointer dereference I got inside picolcd_fb_deferred_io() was due to me calling hid_set_drvdata(hdev, NULL) too early in remove(), fixed. But with that fixed I still encounter the bad page, probably around the time of hid_hw_stop(), though it always takes the HARD LOCK watchdog to trigger for the trace to show (probably because or IRQ context): It seems to happen during the bind/unbind following the unbind that produces a hid-picolcd 0003:04D8:C002.0003: usb_submit_urb(out) failed: -1 either a BUG or even as reboot and happend between hid_hw_close() and hid_hw_stop() though my extra printks seem to reduce probability of bad things to happen. [ 1217.643017] BUG: unable to handle kernel paging request at c7cb [ 1217.643017] IP: [c11e2be7] _mmx_memcpy+0x27/0x16c [ 1217.643017] *pde = 19078063 *pte = 07cb0161 [ 1217.643017] Oops: 0003 [#1] [ 1217.643017] Modules linked in: hid_picolcd fb_sys_fops sysimgblt sysfillrect syscopyarea drm_kms_helper nfs3 nfs_acl nfs lockd sunrpc [ 1217.643017] Pid: 0, comm: swapper Not tainted 3.6.0-rc2-jupiter-8-gd3c2b0c-dirty #2 NVIDIA Corporation. nFORCE-MCP/MS-6373 [ 1217.643017] EIP: 0060:[c11e2be7] EFLAGS: 00010013 CPU: 0 [ 1217.643017] EIP is at _mmx_memcpy+0x27/0x16c [ 1217.643017] EAX: dd40a000 EBX: dd6ea010 ECX: 035af77b EDX: d9108c80 [ 1217.643017] ESI: d9121c00 EDI: c7cb EBP: dd40bebc ESP: dd40bea0 [ 1217.643017] DS: 007b ES: 007b FS: GS: 00e0 SS: 0068 [ 1217.643017] CR0: 8005003b CR2: c7cb CR3: 1cf1c000 CR4: 07d0 [ 1217.643017] DR0: DR1: DR2: DR3: [ 1217.643017] DR6: 0ff0 DR7: 0400 [ 1217.643017] Process swapper (pid: 0, ti=dd40a000 task=c15d24c0 task.ti=c15c6000) [ 1217.643017] Stack: [ 1217.643017] 0d6d6d6f dd40beac c109bba4 c7c97080 dd6ea010 c7c97080 dceb3030 dd40bee4 [ 1217.643017] c133b364 c7d4 c7d4 dd40bed4 dceb3030 d9108c80 dd6ea010 0046 [ 1217.643017] dceb3030 dd40bf04 c133bea0 d6d6 dbe901b0 fffe dbe901b0 fffe [ 1217.643017] Call Trace: [ 1217.643017] [c109bba4] ? free_compound_page+0x14/0x20 [ 1217.643017] [c133b364] hid_submit_out+0xa4/0x130 [ 1217.643017] [c133bea0] hid_irq_out+0xa0/0x100 [ 1217.643017] [c12e425e] usb_hcd_giveback_urb+0x4e/0x90 [ 1217.643017] [c12f9bd2] finish_urb+0xb2/0xf0 [ 1217.643017] [c12fa8a8] finish_unlinks+0x168/0x320 [ 1217.643017] [c12fc43f] ohci_irq+0x27f/0x300 [ 1217.643017] [c1077790] ? unmask_irq+0x20/0x20 [ 1217.643017] [c12e3ace] usb_hcd_irq+0x1e/0x40 [ 1217.643017] [c107549d] handle_irq_event_percpu+0x6d/0x1c0 [ 1217.643017] [c1077790] ? unmask_irq+0x20/0x20 [ 1217.643017] [c1075618] handle_irq_event+0x28/0x40 [ 1217.643017] [c107781a] handle_fasteoi_irq+0x8a/0xe0 [ 1217.643017] IRQ [ 1217.643017] [c1003e0a] ? do_IRQ+0x3a/0xb0 [ 1217.643017] [c140c870] ? common_interrupt+0x30/0x38 [ 1217.643017] [c1009028] ? default_idle+0x68/0xd0 [ 1217.643017] [c1009a0f] ? cpu_idle+0x8f/0xc0 [ 1217.643017] [c13ffe97] ? rest_init+0x67/0x70 [ 1217.643017] [c161b8b8] ? start_kernel+0x2a3/0x2aa [ 1217.643017] [c161b3da] ? repair_env_string+0x51/0x51 [ 1217.643017] [c161b28e] ? i386_start_kernel+0x44/0x46 [ 1217.643017] Code: 90 90 90 90 55 89 e5 57 56 89 d6 53 83 ec 10 89 45 f0 89 4d e4 89 e0 25 00 e0 ff ff f7 40 14 00 ff ff 07 74 17 c1 e9 02 8b 7d f0 f3 a5 8b 4d e4 83 e1 03 74 02 f3 a4 e9 27 01 00 00 8b 45 1 [ 1217.643017] EIP: [c11e2be7] _mmx_memcpy+0x27/0x16c
Re: [PATCH 0/7] HID: picoLCD updates
On Sat, 18 Aug 2012, Bruno Prémont wrote: One thing I just though about, how does usbhid handle the calls to usbhid_submit_report() when hid_hw_stop()/hid_hw_close() have already been called? Looks like they aren't synchronized at all. That's a bug. usbhid_submit_report() should check the HID_DISCONNECTED flag. I will attempt to see if it makes a difference to shortcut my usbhid_submit_report() calls from the point on I have called hid_hw_close()... I don't know bout hid_hw_close(). Certainly no more reports should be submitted following usbhid_stop(). Alan Stern -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
On Thu, 16 Aug 2012, Bruno Prémont wrote: > > I don't really understand this explanation. Once usb_kill_urb() returns, > > the URB should be available for future use (and therefore all queues > > completely drained). > > I won't have time today to check, though my guess is that on each > echo $usb-id > bind; echo $usb-id > unbind > under /sys/bus/hid/drivers/hid-picolcd/ the USB urb queue fills a bit does > not get cleared. > > Is usb_kill_urb() called when unbinding just the specific hid driver? Yup, through hid_hw_stop() -> usbhid_stop(). > If so my short timing between bind/unbind must be triggering something > else... > > Otherwise I'm missing something as at first time I got no "output queue full" > messages, but as I repeated the bind/unbind sequences the prints per > bind/unbind > iteration increased in number. > > Anyhow, on Friday evening/week-end I will continue digging and report back > with my > findings. Thank you, Bruno. -- Jiri Kosina SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
On Wed, 15 August 2012 Jiri Kosina wrote: > On Wed, 15 Aug 2012, Bruno Prémont wrote: > > > I see. Alan Stern has fixed a huge pile of things in this area in > > > 3.6-rc1. > > > I have expected all of those to actually be on theoretical problems not > > > ever having happened in the wild, but it might be that you are actually > > > chasing on of those. > > > > > > Could you please retest with latest Linus' tree (or at least > > > eb055fd0560b) > > > to see whether this hasn't actually been fixed already by Alan's series? > > > > I've started trying that out, it seems Alan's work improved things. > > > > For the first few attempts I have not seen SLAB corruptions, though after > > a few rounds I hit accumulation of the following messages: > > [ 297.174828] hid-picolcd 0003:04D8:C002.0003: output queue full > > [ 297.181098] hid-picolcd 0003:04D8:C002.0003: output queue full > > [ 297.187820] hid-picolcd 0003:04D8:C002.0003: output queue full > > [ 297.194087] hid-picolcd 0003:04D8:C002.0003: output queue full > > > > with sporadically in between: > > [ 292.668019] hid-picolcd 0003:04D8:C002.0003: usb_submit_urb(out) failed: > > -1 > > > > At first glance I think the queue filling up and never draining is caused > > by hid_hw_stop() stalling the queue and the time between both being just too > > short. > > I don't really understand this explanation. Once usb_kill_urb() returns, > the URB should be available for future use (and therefore all queues > completely drained). I won't have time today to check, though my guess is that on each echo $usb-id > bind; echo $usb-id > unbind under /sys/bus/hid/drivers/hid-picolcd/ the USB urb queue fills a bit does not get cleared. Is usb_kill_urb() called when unbinding just the specific hid driver? If so my short timing between bind/unbind must be triggering something else... Otherwise I'm missing something as at first time I got no "output queue full" messages, but as I repeated the bind/unbind sequences the prints per bind/unbind iteration increased in number. Anyhow, on Friday evening/week-end I will continue digging and report back with my findings. Thanks, Bruno -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
On Wed, 15 August 2012 Jiri Kosina jkos...@suse.cz wrote: On Wed, 15 Aug 2012, Bruno Prémont wrote: I see. Alan Stern has fixed a huge pile of things in this area in 3.6-rc1. I have expected all of those to actually be on theoretical problems not ever having happened in the wild, but it might be that you are actually chasing on of those. Could you please retest with latest Linus' tree (or at least eb055fd0560b) to see whether this hasn't actually been fixed already by Alan's series? I've started trying that out, it seems Alan's work improved things. For the first few attempts I have not seen SLAB corruptions, though after a few rounds I hit accumulation of the following messages: [ 297.174828] hid-picolcd 0003:04D8:C002.0003: output queue full [ 297.181098] hid-picolcd 0003:04D8:C002.0003: output queue full [ 297.187820] hid-picolcd 0003:04D8:C002.0003: output queue full [ 297.194087] hid-picolcd 0003:04D8:C002.0003: output queue full with sporadically in between: [ 292.668019] hid-picolcd 0003:04D8:C002.0003: usb_submit_urb(out) failed: -1 At first glance I think the queue filling up and never draining is caused by hid_hw_stop() stalling the queue and the time between both being just too short. I don't really understand this explanation. Once usb_kill_urb() returns, the URB should be available for future use (and therefore all queues completely drained). I won't have time today to check, though my guess is that on each echo $usb-id bind; echo $usb-id unbind under /sys/bus/hid/drivers/hid-picolcd/ the USB urb queue fills a bit does not get cleared. Is usb_kill_urb() called when unbinding just the specific hid driver? If so my short timing between bind/unbind must be triggering something else... Otherwise I'm missing something as at first time I got no output queue full messages, but as I repeated the bind/unbind sequences the prints per bind/unbind iteration increased in number. Anyhow, on Friday evening/week-end I will continue digging and report back with my findings. Thanks, Bruno -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
On Thu, 16 Aug 2012, Bruno Prémont wrote: I don't really understand this explanation. Once usb_kill_urb() returns, the URB should be available for future use (and therefore all queues completely drained). I won't have time today to check, though my guess is that on each echo $usb-id bind; echo $usb-id unbind under /sys/bus/hid/drivers/hid-picolcd/ the USB urb queue fills a bit does not get cleared. Is usb_kill_urb() called when unbinding just the specific hid driver? Yup, through hid_hw_stop() - usbhid_stop(). If so my short timing between bind/unbind must be triggering something else... Otherwise I'm missing something as at first time I got no output queue full messages, but as I repeated the bind/unbind sequences the prints per bind/unbind iteration increased in number. Anyhow, on Friday evening/week-end I will continue digging and report back with my findings. Thank you, Bruno. -- Jiri Kosina SUSE Labs -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
On Wed, 15 Aug 2012, Bruno Prémont wrote: > > I see. Alan Stern has fixed a huge pile of things in this area in 3.6-rc1. > > I have expected all of those to actually be on theoretical problems not > > ever having happened in the wild, but it might be that you are actually > > chasing on of those. > > > > Could you please retest with latest Linus' tree (or at least eb055fd0560b) > > to see whether this hasn't actually been fixed already by Alan's series? > > I've started trying that out, it seems Alan's work improved things. > > For the first few attempts I have not seen SLAB corruptions, though after > a few rounds I hit accumulation of the following messages: > [ 297.174828] hid-picolcd 0003:04D8:C002.0003: output queue full > [ 297.181098] hid-picolcd 0003:04D8:C002.0003: output queue full > [ 297.187820] hid-picolcd 0003:04D8:C002.0003: output queue full > [ 297.194087] hid-picolcd 0003:04D8:C002.0003: output queue full > > with sporadically in between: > [ 292.668019] hid-picolcd 0003:04D8:C002.0003: usb_submit_urb(out) failed: -1 > > At first glance I think the queue filling up and never draining is caused > by hid_hw_stop() stalling the queue and the time between both being just too > short. I don't really understand this explanation. Once usb_kill_urb() returns, the URB should be available for future use (and therefore all queues completely drained). -- Jiri Kosina SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
Hi Jiri, On Wed, 15 August 2012 Jiri Kosina wrote: > I see. Alan Stern has fixed a huge pile of things in this area in 3.6-rc1. > I have expected all of those to actually be on theoretical problems not > ever having happened in the wild, but it might be that you are actually > chasing on of those. > > Could you please retest with latest Linus' tree (or at least eb055fd0560b) > to see whether this hasn't actually been fixed already by Alan's series? I've started trying that out, it seems Alan's work improved things. For the first few attempts I have not seen SLAB corruptions, though after a few rounds I hit accumulation of the following messages: [ 297.174828] hid-picolcd 0003:04D8:C002.0003: output queue full [ 297.181098] hid-picolcd 0003:04D8:C002.0003: output queue full [ 297.187820] hid-picolcd 0003:04D8:C002.0003: output queue full [ 297.194087] hid-picolcd 0003:04D8:C002.0003: output queue full with sporadically in between: [ 292.668019] hid-picolcd 0003:04D8:C002.0003: usb_submit_urb(out) failed: -1 At first glance I think the queue filling up and never draining is caused by hid_hw_stop() stalling the queue and the time between both being just too short. Maybe me (un)binding on hid-picolcd driver level is the major cause of this and (un)binding at lower usb level would kill the queue thus preventing it just growing until full... Is there a proper way to do rate-limiting or throttling when submitting reports (which are the urbs deeper in the stack) and catch -ENODEV early while remove() is delayed for locking reasons? At least until 3.5 submitting reports provides no return value for driver which could convey errors like -ENODEV, -EBUSY. After a few bind-unbind iterations I have also hit a race in hid-picolcd with regard to fbdefio. That looks like chicken-egg release sequencing between hid-picolcd and fb_info subdev -- though I'm wondering how that happens as the only framebuffer user I know of - fbcon - should have stopped using it after unregister_framebuffer() returned and released the last reference! But that part can be improved rather easily with a new spinlock synchronizing interoperation between hid side and framebuffer side. Thanks, Bruno -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
On Wed, 15 Aug 2012, Bruno Prémont wrote: > > > [ 6383.521833] > > > = > > > [ 6383.530020] BUG kmalloc-64 (Not tainted): Object already free > > > [ 6383.530020] > > > - > > > [ 6383.530020] > > > [ 6383.530020] INFO: Slab 0xdde0ea20 objects=51 used=40 fp=0xcef516e0 > > > flags=0x4080 > > > [ 6383.530020] INFO: Object 0xcef51190 @offset=400 fp=0xcef51f50 > > > [ 6383.530020] > > > [ 6383.530020] Bytes b4 cef51180: cc cc cc cc d0 12 f5 ce 5a 5a 5a 5a 5a > > > 5a 5a 5a > > > [ 6383.530020] Object cef51190: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b > > > 6b 6b > > > [ 6383.530020] Object cef511a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b > > > 6b 6b > > > [ 6383.530020] Object cef511b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b > > > 6b 6b > > > [ 6383.530020] Object cef511c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b > > > 6b a5 kkk. > > > [ 6383.530020] Redzone cef511d0: bb bb bb bb > > > > > > [ 6383.530020] Padding cef511d8: 5a 5a 5a 5a 5a 5a 5a 5a > > > > > > [ 6383.530020] Pid: 1922, comm: bash Not tainted > > > 3.5.0-jupiter-3-g8d858b1-dirty #2 > > > [ 6383.530020] Call Trace: > > > [ 6383.530020] [] print_trailer+0x11c/0x130 > > > [ 6383.530020] [] object_err+0x35/0x40 > > > [ 6383.530020] [] free_debug_processing+0x99/0x200 > > > [ 6383.530020] [] __slab_free+0x2e/0x280 > > > [ 6383.530020] [] ? hid_submit_out+0xa4/0x120 > > > [ 6383.530020] [] ? __usbhid_submit_report+0xc0/0x3c0 > > > [ 6383.530020] [] ? kfree+0xfa/0x110 > > > [ 6383.530020] [] ? picolcd_debug_out_report+0x8c4/0x8e0 > > > [hid_picolcd] > > > [ 6383.530020] [] kfree+0xfa/0x110 > > > [ 6383.530020] [] ? hid_submit_out+0xa4/0x120 > > > [ 6383.530020] [] ? hid_submit_out+0xa4/0x120 > > > [ 6383.530020] [] ? hid_submit_out+0xa4/0x120 > > > [ 6383.530020] [] hid_submit_out+0xa4/0x120 > > > [ 6383.530020] [] __usbhid_submit_report+0x158/0x3c0 > > > [ 6383.530020] [] usbhid_submit_report+0x1b/0x30 > > > [ 6383.530020] [] picolcd_fb_reset+0xb9/0x180 [hid_picolcd] > > > [ 6383.530020] [] picolcd_init_framebuffer+0x20d/0x2e0 > > > [hid_picolcd] > > > [ 6383.530020] [] picolcd_probe+0x3cc/0x580 [hid_picolcd] > > > [ 6383.530020] [] hid_device_probe+0x67/0xf0 > > > [ 6383.530020] [] ? driver_sysfs_add+0x57/0x80 > > > [ 6383.530020] [] driver_probe_device+0xbd/0x1c0 > > > [ 6383.530020] [] ? hid_match_device+0x7b/0x90 > > > [ 6383.530020] [] driver_bind+0x75/0xd0 > > > [ 6383.530020] [] ? driver_unbind+0x90/0x90 > > > [ 6383.530020] [] drv_attr_store+0x27/0x30 > > > [ 6383.530020] [] sysfs_write_file+0xac/0xf0 > > > [ 6383.530020] [] vfs_write+0x9c/0x130 > > > [ 6383.530020] [] ? sys_dup3+0x11f/0x160 > > > [ 6383.530020] [] ? sysfs_poll+0x90/0x90 > > > [ 6383.530020] [] sys_write+0x3d/0x70 > > > [ 6383.530020] [] sysenter_do_call+0x12/0x26 > > > > So I am wondering whether the path this happens on is > > > > if (!test_bit(HID_OUT_RUNNING, >iofl)) { > > usbhid_restart_out_queue(usbhid); > > > > in __usbhid_submit_report(). It would then indicate perhaps some race with > > iofl handling. > > Huh, that specific test_bit hunk I can't find in __usbhid_submit_report, > is that 3.6 material? > I'm running my tests against 3.5... I see. Alan Stern has fixed a huge pile of things in this area in 3.6-rc1. I have expected all of those to actually be on theoretical problems not ever having happened in the wild, but it might be that you are actually chasing on of those. Could you please retest with latest Linus' tree (or at least eb055fd0560b) to see whether this hasn't actually been fixed already by Alan's series? Thanks, -- Jiri Kosina SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
Hi Jiri, On Wed, 15 August 2012 Jiri Kosina wrote: > On Mon, 30 Jul 2012, Bruno Prémont wrote: > > Hi, > > > > This series updates picoLCD driver: > > - split the driver functions into separate files which get included > > depending on Kconfig selection > > (implementation for CIR using RC_CORE will follow later) > > - drop private framebuffer refcounting in favor of refcounting added > > to fb_info some time ago > > - fix various bugs issues > > - disabled firmware version checking in probe() as it does not work > > anymore since commit 4ea5454203d991ec85264f64f89ca8855fce69b0 > > [HID: Fix race condition between driver core and ll-driver] > > I have now applied the series to my 'picolcd' branch, except for 6/7, > please see the comment I sent to it separately. Will respin that one soon > > Note: I still get weird behavior on quick unbind/bind sequences > > issued via sysfs (CONFIG_SMP=n system) that are triggered by framebuffer > > support and apparently more specifically fb_defio part of it. > > > > Unfortunately I'm out of ideas as to how to track down the problem which > > shows either as SLAB corruption (detected with SLUB debugging, e.g. > > Would be nice to have this sorted out before the next merge window indeed, > so that it can go in together with the rest of the changes. > > > > > [ 6383.521833] > > = > > [ 6383.530020] BUG kmalloc-64 (Not tainted): Object already free > > [ 6383.530020] > > - > > [ 6383.530020] > > [ 6383.530020] INFO: Slab 0xdde0ea20 objects=51 used=40 fp=0xcef516e0 > > flags=0x4080 > > [ 6383.530020] INFO: Object 0xcef51190 @offset=400 fp=0xcef51f50 > > [ 6383.530020] > > [ 6383.530020] Bytes b4 cef51180: cc cc cc cc d0 12 f5 ce 5a 5a 5a 5a 5a 5a > > 5a 5a > > [ 6383.530020] Object cef51190: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b > > 6b 6b > > [ 6383.530020] Object cef511a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b > > 6b 6b > > [ 6383.530020] Object cef511b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b > > 6b 6b > > [ 6383.530020] Object cef511c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b > > 6b a5 kkk. > > [ 6383.530020] Redzone cef511d0: bb bb bb bb > > > > [ 6383.530020] Padding cef511d8: 5a 5a 5a 5a 5a 5a 5a 5a > > > > [ 6383.530020] Pid: 1922, comm: bash Not tainted > > 3.5.0-jupiter-3-g8d858b1-dirty #2 > > [ 6383.530020] Call Trace: > > [ 6383.530020] [] print_trailer+0x11c/0x130 > > [ 6383.530020] [] object_err+0x35/0x40 > > [ 6383.530020] [] free_debug_processing+0x99/0x200 > > [ 6383.530020] [] __slab_free+0x2e/0x280 > > [ 6383.530020] [] ? hid_submit_out+0xa4/0x120 > > [ 6383.530020] [] ? __usbhid_submit_report+0xc0/0x3c0 > > [ 6383.530020] [] ? kfree+0xfa/0x110 > > [ 6383.530020] [] ? picolcd_debug_out_report+0x8c4/0x8e0 > > [hid_picolcd] > > [ 6383.530020] [] kfree+0xfa/0x110 > > [ 6383.530020] [] ? hid_submit_out+0xa4/0x120 > > [ 6383.530020] [] ? hid_submit_out+0xa4/0x120 > > [ 6383.530020] [] ? hid_submit_out+0xa4/0x120 > > [ 6383.530020] [] hid_submit_out+0xa4/0x120 > > [ 6383.530020] [] __usbhid_submit_report+0x158/0x3c0 > > [ 6383.530020] [] usbhid_submit_report+0x1b/0x30 > > [ 6383.530020] [] picolcd_fb_reset+0xb9/0x180 [hid_picolcd] > > [ 6383.530020] [] picolcd_init_framebuffer+0x20d/0x2e0 > > [hid_picolcd] > > [ 6383.530020] [] picolcd_probe+0x3cc/0x580 [hid_picolcd] > > [ 6383.530020] [] hid_device_probe+0x67/0xf0 > > [ 6383.530020] [] ? driver_sysfs_add+0x57/0x80 > > [ 6383.530020] [] driver_probe_device+0xbd/0x1c0 > > [ 6383.530020] [] ? hid_match_device+0x7b/0x90 > > [ 6383.530020] [] driver_bind+0x75/0xd0 > > [ 6383.530020] [] ? driver_unbind+0x90/0x90 > > [ 6383.530020] [] drv_attr_store+0x27/0x30 > > [ 6383.530020] [] sysfs_write_file+0xac/0xf0 > > [ 6383.530020] [] vfs_write+0x9c/0x130 > > [ 6383.530020] [] ? sys_dup3+0x11f/0x160 > > [ 6383.530020] [] ? sysfs_poll+0x90/0x90 > > [ 6383.530020] [] sys_write+0x3d/0x70 > > [ 6383.530020] [] sysenter_do_call+0x12/0x26 > > So I am wondering whether the path this happens on is > > if (!test_bit(HID_OUT_RUNNING, >iofl)) { > usbhid_restart_out_queue(usbhid); > > in __usbhid_submit_report(). It would then indicate perhaps some race with > iofl handling. Huh, that specific test_bit hunk I can't find in __usbhid_submit_report, is that 3.6 material? I'm running my tests against 3.5... The nearest I have is: if (!test_bit(HID_OUT_RUNNING, >iofl)) if (!irq_out_pump_restart(hid)) set_bit(HID_OUT_RUNNING, >iofl); > Could you please stick some printk() just
Re: [PATCH 0/7] HID: picoLCD updates
On Mon, 30 Jul 2012, Bruno Prémont wrote: > Hi, > > This series updates picoLCD driver: > - split the driver functions into separate files which get included > depending on Kconfig selection > (implementation for CIR using RC_CORE will follow later) > - drop private framebuffer refcounting in favor of refcounting added > to fb_info some time ago > - fix various bugs issues > - disabled firmware version checking in probe() as it does not work > anymore since commit 4ea5454203d991ec85264f64f89ca8855fce69b0 > [HID: Fix race condition between driver core and ll-driver] I have now applied the series to my 'picolcd' branch, except for 6/7, please see the comment I sent to it separately. > Note: I still get weird behavior on quick unbind/bind sequences > issued via sysfs (CONFIG_SMP=n system) that are triggered by framebuffer > support and apparently more specifically fb_defio part of it. > > Unfortunately I'm out of ideas as to how to track down the problem which > shows either as SLAB corruption (detected with SLUB debugging, e.g. Would be nice to have this sorted out before the next merge window indeed, so that it can go in together with the rest of the changes. > > [ 6383.521833] > = > [ 6383.530020] BUG kmalloc-64 (Not tainted): Object already free > [ 6383.530020] > - > [ 6383.530020] > [ 6383.530020] INFO: Slab 0xdde0ea20 objects=51 used=40 fp=0xcef516e0 > flags=0x4080 > [ 6383.530020] INFO: Object 0xcef51190 @offset=400 fp=0xcef51f50 > [ 6383.530020] > [ 6383.530020] Bytes b4 cef51180: cc cc cc cc d0 12 f5 ce 5a 5a 5a 5a 5a 5a > 5a 5a > [ 6383.530020] Object cef51190: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b > 6b > [ 6383.530020] Object cef511a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b > 6b > [ 6383.530020] Object cef511b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b > 6b > [ 6383.530020] Object cef511c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b > a5 kkk. > [ 6383.530020] Redzone cef511d0: bb bb bb bb > > [ 6383.530020] Padding cef511d8: 5a 5a 5a 5a 5a 5a 5a 5a > > [ 6383.530020] Pid: 1922, comm: bash Not tainted > 3.5.0-jupiter-3-g8d858b1-dirty #2 > [ 6383.530020] Call Trace: > [ 6383.530020] [] print_trailer+0x11c/0x130 > [ 6383.530020] [] object_err+0x35/0x40 > [ 6383.530020] [] free_debug_processing+0x99/0x200 > [ 6383.530020] [] __slab_free+0x2e/0x280 > [ 6383.530020] [] ? hid_submit_out+0xa4/0x120 > [ 6383.530020] [] ? __usbhid_submit_report+0xc0/0x3c0 > [ 6383.530020] [] ? kfree+0xfa/0x110 > [ 6383.530020] [] ? picolcd_debug_out_report+0x8c4/0x8e0 > [hid_picolcd] > [ 6383.530020] [] kfree+0xfa/0x110 > [ 6383.530020] [] ? hid_submit_out+0xa4/0x120 > [ 6383.530020] [] ? hid_submit_out+0xa4/0x120 > [ 6383.530020] [] ? hid_submit_out+0xa4/0x120 > [ 6383.530020] [] hid_submit_out+0xa4/0x120 > [ 6383.530020] [] __usbhid_submit_report+0x158/0x3c0 > [ 6383.530020] [] usbhid_submit_report+0x1b/0x30 > [ 6383.530020] [] picolcd_fb_reset+0xb9/0x180 [hid_picolcd] > [ 6383.530020] [] picolcd_init_framebuffer+0x20d/0x2e0 > [hid_picolcd] > [ 6383.530020] [] picolcd_probe+0x3cc/0x580 [hid_picolcd] > [ 6383.530020] [] hid_device_probe+0x67/0xf0 > [ 6383.530020] [] ? driver_sysfs_add+0x57/0x80 > [ 6383.530020] [] driver_probe_device+0xbd/0x1c0 > [ 6383.530020] [] ? hid_match_device+0x7b/0x90 > [ 6383.530020] [] driver_bind+0x75/0xd0 > [ 6383.530020] [] ? driver_unbind+0x90/0x90 > [ 6383.530020] [] drv_attr_store+0x27/0x30 > [ 6383.530020] [] sysfs_write_file+0xac/0xf0 > [ 6383.530020] [] vfs_write+0x9c/0x130 > [ 6383.530020] [] ? sys_dup3+0x11f/0x160 > [ 6383.530020] [] ? sysfs_poll+0x90/0x90 > [ 6383.530020] [] sys_write+0x3d/0x70 > [ 6383.530020] [] sysenter_do_call+0x12/0x26 So I am wondering whether the path this happens on is if (!test_bit(HID_OUT_RUNNING, >iofl)) { usbhid_restart_out_queue(usbhid); in __usbhid_submit_report(). It would then indicate perhaps some race with iofl handling. Could you please stick some printk() just before and after this usbhid_restart_out_queue() call, so that we know that it's this triggering it? -- Jiri Kosina SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
On Mon, 30 Jul 2012, Bruno Prémont wrote: Hi, This series updates picoLCD driver: - split the driver functions into separate files which get included depending on Kconfig selection (implementation for CIR using RC_CORE will follow later) - drop private framebuffer refcounting in favor of refcounting added to fb_info some time ago - fix various bugs issues - disabled firmware version checking in probe() as it does not work anymore since commit 4ea5454203d991ec85264f64f89ca8855fce69b0 [HID: Fix race condition between driver core and ll-driver] I have now applied the series to my 'picolcd' branch, except for 6/7, please see the comment I sent to it separately. Note: I still get weird behavior on quick unbind/bind sequences issued via sysfs (CONFIG_SMP=n system) that are triggered by framebuffer support and apparently more specifically fb_defio part of it. Unfortunately I'm out of ideas as to how to track down the problem which shows either as SLAB corruption (detected with SLUB debugging, e.g. Would be nice to have this sorted out before the next merge window indeed, so that it can go in together with the rest of the changes. [ 6383.521833] = [ 6383.530020] BUG kmalloc-64 (Not tainted): Object already free [ 6383.530020] - [ 6383.530020] [ 6383.530020] INFO: Slab 0xdde0ea20 objects=51 used=40 fp=0xcef516e0 flags=0x4080 [ 6383.530020] INFO: Object 0xcef51190 @offset=400 fp=0xcef51f50 [ 6383.530020] [ 6383.530020] Bytes b4 cef51180: cc cc cc cc d0 12 f5 ce 5a 5a 5a 5a 5a 5a 5a 5a [ 6383.530020] Object cef51190: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [ 6383.530020] Object cef511a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [ 6383.530020] Object cef511b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [ 6383.530020] Object cef511c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkk. [ 6383.530020] Redzone cef511d0: bb bb bb bb [ 6383.530020] Padding cef511d8: 5a 5a 5a 5a 5a 5a 5a 5a [ 6383.530020] Pid: 1922, comm: bash Not tainted 3.5.0-jupiter-3-g8d858b1-dirty #2 [ 6383.530020] Call Trace: [ 6383.530020] [c10bd3cc] print_trailer+0x11c/0x130 [ 6383.530020] [c10bd415] object_err+0x35/0x40 [ 6383.530020] [c10be809] free_debug_processing+0x99/0x200 [ 6383.530020] [c10bf77e] __slab_free+0x2e/0x280 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322870] ? __usbhid_submit_report+0xc0/0x3c0 [ 6383.530020] [c10bfbda] ? kfree+0xfa/0x110 [ 6383.530020] [de932aa4] ? picolcd_debug_out_report+0x8c4/0x8e0 [hid_picolcd] [ 6383.530020] [c10bfbda] kfree+0xfa/0x110 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322284] hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322908] __usbhid_submit_report+0x158/0x3c0 [ 6383.530020] [c1322c2b] usbhid_submit_report+0x1b/0x30 [ 6383.530020] [de930789] picolcd_fb_reset+0xb9/0x180 [hid_picolcd] [ 6383.530020] [de930f1d] picolcd_init_framebuffer+0x20d/0x2e0 [hid_picolcd] [ 6383.530020] [de92fb9c] picolcd_probe+0x3cc/0x580 [hid_picolcd] [ 6383.530020] [c1319147] hid_device_probe+0x67/0xf0 [ 6383.530020] [c1282f97] ? driver_sysfs_add+0x57/0x80 [ 6383.530020] [c128329d] driver_probe_device+0xbd/0x1c0 [ 6383.530020] [c1318a1b] ? hid_match_device+0x7b/0x90 [ 6383.530020] [c12821e5] driver_bind+0x75/0xd0 [ 6383.530020] [c1282170] ? driver_unbind+0x90/0x90 [ 6383.530020] [c12818b7] drv_attr_store+0x27/0x30 [ 6383.530020] [c1114aec] sysfs_write_file+0xac/0xf0 [ 6383.530020] [c10c794c] vfs_write+0x9c/0x130 [ 6383.530020] [c10d4a1f] ? sys_dup3+0x11f/0x160 [ 6383.530020] [c1114a40] ? sysfs_poll+0x90/0x90 [ 6383.530020] [c10c7bbd] sys_write+0x3d/0x70 [ 6383.530020] [c13f2557] sysenter_do_call+0x12/0x26 So I am wondering whether the path this happens on is if (!test_bit(HID_OUT_RUNNING, usbhid-iofl)) { usbhid_restart_out_queue(usbhid); in __usbhid_submit_report(). It would then indicate perhaps some race with iofl handling. Could you please stick some printk() just before and after this usbhid_restart_out_queue() call, so that we know that it's this triggering it? -- Jiri Kosina SUSE Labs -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
Hi Jiri, On Wed, 15 August 2012 Jiri Kosina jkos...@suse.cz wrote: On Mon, 30 Jul 2012, Bruno Prémont wrote: Hi, This series updates picoLCD driver: - split the driver functions into separate files which get included depending on Kconfig selection (implementation for CIR using RC_CORE will follow later) - drop private framebuffer refcounting in favor of refcounting added to fb_info some time ago - fix various bugs issues - disabled firmware version checking in probe() as it does not work anymore since commit 4ea5454203d991ec85264f64f89ca8855fce69b0 [HID: Fix race condition between driver core and ll-driver] I have now applied the series to my 'picolcd' branch, except for 6/7, please see the comment I sent to it separately. Will respin that one soon Note: I still get weird behavior on quick unbind/bind sequences issued via sysfs (CONFIG_SMP=n system) that are triggered by framebuffer support and apparently more specifically fb_defio part of it. Unfortunately I'm out of ideas as to how to track down the problem which shows either as SLAB corruption (detected with SLUB debugging, e.g. Would be nice to have this sorted out before the next merge window indeed, so that it can go in together with the rest of the changes. [ 6383.521833] = [ 6383.530020] BUG kmalloc-64 (Not tainted): Object already free [ 6383.530020] - [ 6383.530020] [ 6383.530020] INFO: Slab 0xdde0ea20 objects=51 used=40 fp=0xcef516e0 flags=0x4080 [ 6383.530020] INFO: Object 0xcef51190 @offset=400 fp=0xcef51f50 [ 6383.530020] [ 6383.530020] Bytes b4 cef51180: cc cc cc cc d0 12 f5 ce 5a 5a 5a 5a 5a 5a 5a 5a [ 6383.530020] Object cef51190: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [ 6383.530020] Object cef511a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [ 6383.530020] Object cef511b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [ 6383.530020] Object cef511c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkk. [ 6383.530020] Redzone cef511d0: bb bb bb bb [ 6383.530020] Padding cef511d8: 5a 5a 5a 5a 5a 5a 5a 5a [ 6383.530020] Pid: 1922, comm: bash Not tainted 3.5.0-jupiter-3-g8d858b1-dirty #2 [ 6383.530020] Call Trace: [ 6383.530020] [c10bd3cc] print_trailer+0x11c/0x130 [ 6383.530020] [c10bd415] object_err+0x35/0x40 [ 6383.530020] [c10be809] free_debug_processing+0x99/0x200 [ 6383.530020] [c10bf77e] __slab_free+0x2e/0x280 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322870] ? __usbhid_submit_report+0xc0/0x3c0 [ 6383.530020] [c10bfbda] ? kfree+0xfa/0x110 [ 6383.530020] [de932aa4] ? picolcd_debug_out_report+0x8c4/0x8e0 [hid_picolcd] [ 6383.530020] [c10bfbda] kfree+0xfa/0x110 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322284] hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322908] __usbhid_submit_report+0x158/0x3c0 [ 6383.530020] [c1322c2b] usbhid_submit_report+0x1b/0x30 [ 6383.530020] [de930789] picolcd_fb_reset+0xb9/0x180 [hid_picolcd] [ 6383.530020] [de930f1d] picolcd_init_framebuffer+0x20d/0x2e0 [hid_picolcd] [ 6383.530020] [de92fb9c] picolcd_probe+0x3cc/0x580 [hid_picolcd] [ 6383.530020] [c1319147] hid_device_probe+0x67/0xf0 [ 6383.530020] [c1282f97] ? driver_sysfs_add+0x57/0x80 [ 6383.530020] [c128329d] driver_probe_device+0xbd/0x1c0 [ 6383.530020] [c1318a1b] ? hid_match_device+0x7b/0x90 [ 6383.530020] [c12821e5] driver_bind+0x75/0xd0 [ 6383.530020] [c1282170] ? driver_unbind+0x90/0x90 [ 6383.530020] [c12818b7] drv_attr_store+0x27/0x30 [ 6383.530020] [c1114aec] sysfs_write_file+0xac/0xf0 [ 6383.530020] [c10c794c] vfs_write+0x9c/0x130 [ 6383.530020] [c10d4a1f] ? sys_dup3+0x11f/0x160 [ 6383.530020] [c1114a40] ? sysfs_poll+0x90/0x90 [ 6383.530020] [c10c7bbd] sys_write+0x3d/0x70 [ 6383.530020] [c13f2557] sysenter_do_call+0x12/0x26 So I am wondering whether the path this happens on is if (!test_bit(HID_OUT_RUNNING, usbhid-iofl)) { usbhid_restart_out_queue(usbhid); in __usbhid_submit_report(). It would then indicate perhaps some race with iofl handling. Huh, that specific test_bit hunk I can't find in __usbhid_submit_report, is that 3.6 material? I'm running my tests against 3.5... The nearest I have is: if (!test_bit(HID_OUT_RUNNING, usbhid-iofl)) if (!irq_out_pump_restart(hid))
Re: [PATCH 0/7] HID: picoLCD updates
On Wed, 15 Aug 2012, Bruno Prémont wrote: [ 6383.521833] = [ 6383.530020] BUG kmalloc-64 (Not tainted): Object already free [ 6383.530020] - [ 6383.530020] [ 6383.530020] INFO: Slab 0xdde0ea20 objects=51 used=40 fp=0xcef516e0 flags=0x4080 [ 6383.530020] INFO: Object 0xcef51190 @offset=400 fp=0xcef51f50 [ 6383.530020] [ 6383.530020] Bytes b4 cef51180: cc cc cc cc d0 12 f5 ce 5a 5a 5a 5a 5a 5a 5a 5a [ 6383.530020] Object cef51190: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [ 6383.530020] Object cef511a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [ 6383.530020] Object cef511b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [ 6383.530020] Object cef511c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkk. [ 6383.530020] Redzone cef511d0: bb bb bb bb [ 6383.530020] Padding cef511d8: 5a 5a 5a 5a 5a 5a 5a 5a [ 6383.530020] Pid: 1922, comm: bash Not tainted 3.5.0-jupiter-3-g8d858b1-dirty #2 [ 6383.530020] Call Trace: [ 6383.530020] [c10bd3cc] print_trailer+0x11c/0x130 [ 6383.530020] [c10bd415] object_err+0x35/0x40 [ 6383.530020] [c10be809] free_debug_processing+0x99/0x200 [ 6383.530020] [c10bf77e] __slab_free+0x2e/0x280 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322870] ? __usbhid_submit_report+0xc0/0x3c0 [ 6383.530020] [c10bfbda] ? kfree+0xfa/0x110 [ 6383.530020] [de932aa4] ? picolcd_debug_out_report+0x8c4/0x8e0 [hid_picolcd] [ 6383.530020] [c10bfbda] kfree+0xfa/0x110 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322284] hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322908] __usbhid_submit_report+0x158/0x3c0 [ 6383.530020] [c1322c2b] usbhid_submit_report+0x1b/0x30 [ 6383.530020] [de930789] picolcd_fb_reset+0xb9/0x180 [hid_picolcd] [ 6383.530020] [de930f1d] picolcd_init_framebuffer+0x20d/0x2e0 [hid_picolcd] [ 6383.530020] [de92fb9c] picolcd_probe+0x3cc/0x580 [hid_picolcd] [ 6383.530020] [c1319147] hid_device_probe+0x67/0xf0 [ 6383.530020] [c1282f97] ? driver_sysfs_add+0x57/0x80 [ 6383.530020] [c128329d] driver_probe_device+0xbd/0x1c0 [ 6383.530020] [c1318a1b] ? hid_match_device+0x7b/0x90 [ 6383.530020] [c12821e5] driver_bind+0x75/0xd0 [ 6383.530020] [c1282170] ? driver_unbind+0x90/0x90 [ 6383.530020] [c12818b7] drv_attr_store+0x27/0x30 [ 6383.530020] [c1114aec] sysfs_write_file+0xac/0xf0 [ 6383.530020] [c10c794c] vfs_write+0x9c/0x130 [ 6383.530020] [c10d4a1f] ? sys_dup3+0x11f/0x160 [ 6383.530020] [c1114a40] ? sysfs_poll+0x90/0x90 [ 6383.530020] [c10c7bbd] sys_write+0x3d/0x70 [ 6383.530020] [c13f2557] sysenter_do_call+0x12/0x26 So I am wondering whether the path this happens on is if (!test_bit(HID_OUT_RUNNING, usbhid-iofl)) { usbhid_restart_out_queue(usbhid); in __usbhid_submit_report(). It would then indicate perhaps some race with iofl handling. Huh, that specific test_bit hunk I can't find in __usbhid_submit_report, is that 3.6 material? I'm running my tests against 3.5... I see. Alan Stern has fixed a huge pile of things in this area in 3.6-rc1. I have expected all of those to actually be on theoretical problems not ever having happened in the wild, but it might be that you are actually chasing on of those. Could you please retest with latest Linus' tree (or at least eb055fd0560b) to see whether this hasn't actually been fixed already by Alan's series? Thanks, -- Jiri Kosina SUSE Labs -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
Hi Jiri, On Wed, 15 August 2012 Jiri Kosina jkos...@suse.cz wrote: I see. Alan Stern has fixed a huge pile of things in this area in 3.6-rc1. I have expected all of those to actually be on theoretical problems not ever having happened in the wild, but it might be that you are actually chasing on of those. Could you please retest with latest Linus' tree (or at least eb055fd0560b) to see whether this hasn't actually been fixed already by Alan's series? I've started trying that out, it seems Alan's work improved things. For the first few attempts I have not seen SLAB corruptions, though after a few rounds I hit accumulation of the following messages: [ 297.174828] hid-picolcd 0003:04D8:C002.0003: output queue full [ 297.181098] hid-picolcd 0003:04D8:C002.0003: output queue full [ 297.187820] hid-picolcd 0003:04D8:C002.0003: output queue full [ 297.194087] hid-picolcd 0003:04D8:C002.0003: output queue full with sporadically in between: [ 292.668019] hid-picolcd 0003:04D8:C002.0003: usb_submit_urb(out) failed: -1 At first glance I think the queue filling up and never draining is caused by hid_hw_stop() stalling the queue and the time between both being just too short. Maybe me (un)binding on hid-picolcd driver level is the major cause of this and (un)binding at lower usb level would kill the queue thus preventing it just growing until full... Is there a proper way to do rate-limiting or throttling when submitting reports (which are the urbs deeper in the stack) and catch -ENODEV early while remove() is delayed for locking reasons? At least until 3.5 submitting reports provides no return value for driver which could convey errors like -ENODEV, -EBUSY. After a few bind-unbind iterations I have also hit a race in hid-picolcd with regard to fbdefio. That looks like chicken-egg release sequencing between hid-picolcd and fb_info subdev -- though I'm wondering how that happens as the only framebuffer user I know of - fbcon - should have stopped using it after unregister_framebuffer() returned and released the last reference! But that part can be improved rather easily with a new spinlock synchronizing interoperation between hid side and framebuffer side. Thanks, Bruno -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
On Wed, 15 Aug 2012, Bruno Prémont wrote: I see. Alan Stern has fixed a huge pile of things in this area in 3.6-rc1. I have expected all of those to actually be on theoretical problems not ever having happened in the wild, but it might be that you are actually chasing on of those. Could you please retest with latest Linus' tree (or at least eb055fd0560b) to see whether this hasn't actually been fixed already by Alan's series? I've started trying that out, it seems Alan's work improved things. For the first few attempts I have not seen SLAB corruptions, though after a few rounds I hit accumulation of the following messages: [ 297.174828] hid-picolcd 0003:04D8:C002.0003: output queue full [ 297.181098] hid-picolcd 0003:04D8:C002.0003: output queue full [ 297.187820] hid-picolcd 0003:04D8:C002.0003: output queue full [ 297.194087] hid-picolcd 0003:04D8:C002.0003: output queue full with sporadically in between: [ 292.668019] hid-picolcd 0003:04D8:C002.0003: usb_submit_urb(out) failed: -1 At first glance I think the queue filling up and never draining is caused by hid_hw_stop() stalling the queue and the time between both being just too short. I don't really understand this explanation. Once usb_kill_urb() returns, the URB should be available for future use (and therefore all queues completely drained). -- Jiri Kosina SUSE Labs -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
Hello, On Tue, Aug 14, 2012 at 08:30:44AM +0200, Bruno Prémont wrote: > > I'm kinda shooting in the dark but who flushes / cancels > > fb_info->deferred_work? > > fb_deferred_io_cleanup() does so and is called by destroy fbops > (when last reference to fb_info is returned): I see. Sorry but just from glancing it I can't really tell what's going wrong. Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
Hello Tejun, [Tejun: sorry for duplicate, did hit "reply" instead of "reply to all"] On Mon, 13 Aug 2012 16:27:08 Tejun Heo wrote: > On Thu, Aug 09, 2012 at 08:09:47PM +0200, Bruno Prémont wrote: > > As you are working on workqueues and related code, could you have a look > > at my usage of them in combination with db_defio? > > > > The delayed memory corruptions or system reboots after unbinding/unplugging > > the PicoLCD seem very much related to workqueue used to handle the deferred > > IO to framebuffer. > > > > I think things don't get cleaned-up as they should though I'm not sure > > where the trouble lies. > > > > For ease of reading, I'm inlineing below the framebuffer related code of > > PicoLCD (for complete driver after this patch series apply the whole series > > on top of 3.5 https://lkml.org/lkml/2012/7/30/375 ) > ... > > void picolcd_exit_framebuffer(struct picolcd_data *data) > > { > > struct fb_info *info = data->fb_info; > > u8 *fb_vbitmap = data->fb_vbitmap; > > > > if (!info) > > return; > > > > device_remove_file(>hdev->dev, _attr_fb_update_rate); > > info->par = NULL; > > unregister_framebuffer(info); > > data->fb_vbitmap = NULL; > > data->fb_bitmap = NULL; > > data->fb_bpp = 0; > > data->fb_info= NULL; > > kfree(fb_vbitmap); > > } > > I'm kinda shooting in the dark but who flushes / cancels > fb_info->deferred_work? fb_deferred_io_cleanup() does so and is called by destroy fbops (when last reference to fb_info is returned): static void picolcd_fb_destroy(struct fb_info *info) { /* make sure no work is deferred */ fb_deferred_io_cleanup(info); vfree((u8 *)info->fix.smem_start); framebuffer_release(info); printk(KERN_DEBUG "picolcd_fb_destroy(%p)\n", info); } = drivers/video/fb_defio.c = void fb_deferred_io_cleanup(struct fb_info *info) { struct fb_deferred_io *fbdefio = info->fbdefio; struct page *page; int i; BUG_ON(!fbdefio); cancel_delayed_work_sync(>deferred_work); /* clear out the mapping that we setup */ for (i = 0 ; i < info->fix.smem_len; i += PAGE_SIZE) { page = fb_deferred_io_page(info, i); page->mapping = NULL; } info->fbops->fb_mmap = NULL; mutex_destroy(>lock); } Thanks, Bruno -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
Hello Tejun, [Tejun: sorry for duplicate, did hit reply instead of reply to all] On Mon, 13 Aug 2012 16:27:08 Tejun Heo t...@kernel.org wrote: On Thu, Aug 09, 2012 at 08:09:47PM +0200, Bruno Prémont wrote: As you are working on workqueues and related code, could you have a look at my usage of them in combination with db_defio? The delayed memory corruptions or system reboots after unbinding/unplugging the PicoLCD seem very much related to workqueue used to handle the deferred IO to framebuffer. I think things don't get cleaned-up as they should though I'm not sure where the trouble lies. For ease of reading, I'm inlineing below the framebuffer related code of PicoLCD (for complete driver after this patch series apply the whole series on top of 3.5 https://lkml.org/lkml/2012/7/30/375 ) ... void picolcd_exit_framebuffer(struct picolcd_data *data) { struct fb_info *info = data-fb_info; u8 *fb_vbitmap = data-fb_vbitmap; if (!info) return; device_remove_file(data-hdev-dev, dev_attr_fb_update_rate); info-par = NULL; unregister_framebuffer(info); data-fb_vbitmap = NULL; data-fb_bitmap = NULL; data-fb_bpp = 0; data-fb_info= NULL; kfree(fb_vbitmap); } I'm kinda shooting in the dark but who flushes / cancels fb_info-deferred_work? fb_deferred_io_cleanup() does so and is called by destroy fbops (when last reference to fb_info is returned): static void picolcd_fb_destroy(struct fb_info *info) { /* make sure no work is deferred */ fb_deferred_io_cleanup(info); vfree((u8 *)info-fix.smem_start); framebuffer_release(info); printk(KERN_DEBUG picolcd_fb_destroy(%p)\n, info); } = drivers/video/fb_defio.c = void fb_deferred_io_cleanup(struct fb_info *info) { struct fb_deferred_io *fbdefio = info-fbdefio; struct page *page; int i; BUG_ON(!fbdefio); cancel_delayed_work_sync(info-deferred_work); /* clear out the mapping that we setup */ for (i = 0 ; i info-fix.smem_len; i += PAGE_SIZE) { page = fb_deferred_io_page(info, i); page-mapping = NULL; } info-fbops-fb_mmap = NULL; mutex_destroy(fbdefio-lock); } Thanks, Bruno -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
Hello, On Tue, Aug 14, 2012 at 08:30:44AM +0200, Bruno Prémont wrote: I'm kinda shooting in the dark but who flushes / cancels fb_info-deferred_work? fb_deferred_io_cleanup() does so and is called by destroy fbops (when last reference to fb_info is returned): I see. Sorry but just from glancing it I can't really tell what's going wrong. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
Hello, On Thu, Aug 09, 2012 at 08:09:47PM +0200, Bruno Prémont wrote: > As you are working on workqueues and related code, could you have a look > at my usage of them in combination with db_defio? > > The delayed memory corruptions or system reboots after unbinding/unplugging > the PicoLCD seem very much related to workqueue used to handle the deferred > IO to framebuffer. > > I think things don't get cleaned-up as they should though I'm not sure > where the trouble lies. > > For ease of reading, I'm inlineing below the framebuffer related code of > PicoLCD (for complete driver after this patch series apply the whole series > on top of 3.5 https://lkml.org/lkml/2012/7/30/375 ) ... > void picolcd_exit_framebuffer(struct picolcd_data *data) > { > struct fb_info *info = data->fb_info; > u8 *fb_vbitmap = data->fb_vbitmap; > > if (!info) > return; > > device_remove_file(>hdev->dev, _attr_fb_update_rate); > info->par = NULL; > unregister_framebuffer(info); > data->fb_vbitmap = NULL; > data->fb_bitmap = NULL; > data->fb_bpp = 0; > data->fb_info= NULL; > kfree(fb_vbitmap); > } I'm kinda shooting in the dark but who flushes / cancels fb_info->deferred_work? Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
Hello, On Thu, Aug 09, 2012 at 08:09:47PM +0200, Bruno Prémont wrote: As you are working on workqueues and related code, could you have a look at my usage of them in combination with db_defio? The delayed memory corruptions or system reboots after unbinding/unplugging the PicoLCD seem very much related to workqueue used to handle the deferred IO to framebuffer. I think things don't get cleaned-up as they should though I'm not sure where the trouble lies. For ease of reading, I'm inlineing below the framebuffer related code of PicoLCD (for complete driver after this patch series apply the whole series on top of 3.5 https://lkml.org/lkml/2012/7/30/375 ) ... void picolcd_exit_framebuffer(struct picolcd_data *data) { struct fb_info *info = data-fb_info; u8 *fb_vbitmap = data-fb_vbitmap; if (!info) return; device_remove_file(data-hdev-dev, dev_attr_fb_update_rate); info-par = NULL; unregister_framebuffer(info); data-fb_vbitmap = NULL; data-fb_bitmap = NULL; data-fb_bpp = 0; data-fb_info= NULL; kfree(fb_vbitmap); } I'm kinda shooting in the dark but who flushes / cancels fb_info-deferred_work? Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
Hi Tejun, As you are working on workqueues and related code, could you have a look at my usage of them in combination with db_defio? The delayed memory corruptions or system reboots after unbinding/unplugging the PicoLCD seem very much related to workqueue used to handle the deferred IO to framebuffer. I think things don't get cleaned-up as they should though I'm not sure where the trouble lies. For ease of reading, I'm inlineing below the framebuffer related code of PicoLCD (for complete driver after this patch series apply the whole series on top of 3.5 https://lkml.org/lkml/2012/7/30/375 ) Thanks, Bruno On Mon, 30 July 2012 Bruno Prémont wrote: > This series updates picoLCD driver: > - split the driver functions into separate files which get included > depending on Kconfig selection > (implementation for CIR using RC_CORE will follow later) > - drop private framebuffer refcounting in favor of refcounting added > to fb_info some time ago > - fix various bugs issues > - disabled firmware version checking in probe() as it does not work > anymore since commit 4ea5454203d991ec85264f64f89ca8855fce69b0 > [HID: Fix race condition between driver core and ll-driver] > > Note: I still get weird behavior on quick unbind/bind sequences > issued via sysfs (CONFIG_SMP=n system) that are triggered by framebuffer > support and apparently more specifically fb_defio part of it. > > Unfortunately I'm out of ideas as to how to track down the problem which > shows either as SLAB corruption (detected with SLUB debugging, e.g. > > [ 6383.521833] > = > [ 6383.530020] BUG kmalloc-64 (Not tainted): Object already free > [ 6383.530020] > - > [ 6383.530020] > [ 6383.530020] INFO: Slab 0xdde0ea20 objects=51 used=40 fp=0xcef516e0 > flags=0x4080 > [ 6383.530020] INFO: Object 0xcef51190 @offset=400 fp=0xcef51f50 > [ 6383.530020] > [ 6383.530020] Bytes b4 cef51180: cc cc cc cc d0 12 f5 ce 5a 5a 5a 5a 5a 5a > 5a 5a > [ 6383.530020] Object cef51190: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b > 6b > [ 6383.530020] Object cef511a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b > 6b > [ 6383.530020] Object cef511b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b > 6b > [ 6383.530020] Object cef511c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b > a5 kkk. > [ 6383.530020] Redzone cef511d0: bb bb bb bb > > [ 6383.530020] Padding cef511d8: 5a 5a 5a 5a 5a 5a 5a 5a > > [ 6383.530020] Pid: 1922, comm: bash Not tainted > 3.5.0-jupiter-3-g8d858b1-dirty #2 > [ 6383.530020] Call Trace: > [ 6383.530020] [] print_trailer+0x11c/0x130 > [ 6383.530020] [] object_err+0x35/0x40 > [ 6383.530020] [] free_debug_processing+0x99/0x200 > [ 6383.530020] [] __slab_free+0x2e/0x280 > [ 6383.530020] [] ? hid_submit_out+0xa4/0x120 > [ 6383.530020] [] ? __usbhid_submit_report+0xc0/0x3c0 > [ 6383.530020] [] ? kfree+0xfa/0x110 > [ 6383.530020] [] ? picolcd_debug_out_report+0x8c4/0x8e0 > [hid_picolcd] > [ 6383.530020] [] kfree+0xfa/0x110 > [ 6383.530020] [] ? hid_submit_out+0xa4/0x120 > [ 6383.530020] [] ? hid_submit_out+0xa4/0x120 > [ 6383.530020] [] ? hid_submit_out+0xa4/0x120 > [ 6383.530020] [] hid_submit_out+0xa4/0x120 > [ 6383.530020] [] __usbhid_submit_report+0x158/0x3c0 > [ 6383.530020] [] usbhid_submit_report+0x1b/0x30 > [ 6383.530020] [] picolcd_fb_reset+0xb9/0x180 [hid_picolcd] > [ 6383.530020] [] picolcd_init_framebuffer+0x20d/0x2e0 > [hid_picolcd] > [ 6383.530020] [] picolcd_probe+0x3cc/0x580 [hid_picolcd] > [ 6383.530020] [] hid_device_probe+0x67/0xf0 > [ 6383.530020] [] ? driver_sysfs_add+0x57/0x80 > [ 6383.530020] [] driver_probe_device+0xbd/0x1c0 > [ 6383.530020] [] ? hid_match_device+0x7b/0x90 > [ 6383.530020] [] driver_bind+0x75/0xd0 > [ 6383.530020] [] ? driver_unbind+0x90/0x90 > [ 6383.530020] [] drv_attr_store+0x27/0x30 > [ 6383.530020] [] sysfs_write_file+0xac/0xf0 > [ 6383.530020] [] vfs_write+0x9c/0x130 > [ 6383.530020] [] ? sys_dup3+0x11f/0x160 > [ 6383.530020] [] ? sysfs_poll+0x90/0x90 > [ 6383.530020] [] sys_write+0x3d/0x70 > [ 6383.530020] [] sysenter_do_call+0x12/0x26 > [ 6383.530020] FIX kmalloc-64: Object at 0xcef51190 not freed > > or worse spontaneous reboot of the system without any trace on netconsole or > serial console. > > echo $devid > bind; echo $devid > unbind > or > echo $devid > bind; echo $devid > unbind; sleep 0.2; echo $devid > bind; echo > $devid > unbind > > is sufficient to trigger the above issue while waiting a few seconds between > bind and unbind > shows no sign of trouble. > > Suggestions as to how to debug this and fix it are welcome! = driver/hid/hid-picolcd_fb.c ==
Re: [PATCH 0/7] HID: picoLCD updates
Hi Tejun, As you are working on workqueues and related code, could you have a look at my usage of them in combination with db_defio? The delayed memory corruptions or system reboots after unbinding/unplugging the PicoLCD seem very much related to workqueue used to handle the deferred IO to framebuffer. I think things don't get cleaned-up as they should though I'm not sure where the trouble lies. For ease of reading, I'm inlineing below the framebuffer related code of PicoLCD (for complete driver after this patch series apply the whole series on top of 3.5 https://lkml.org/lkml/2012/7/30/375 ) Thanks, Bruno On Mon, 30 July 2012 Bruno Prémont bonb...@linux-vserver.org wrote: This series updates picoLCD driver: - split the driver functions into separate files which get included depending on Kconfig selection (implementation for CIR using RC_CORE will follow later) - drop private framebuffer refcounting in favor of refcounting added to fb_info some time ago - fix various bugs issues - disabled firmware version checking in probe() as it does not work anymore since commit 4ea5454203d991ec85264f64f89ca8855fce69b0 [HID: Fix race condition between driver core and ll-driver] Note: I still get weird behavior on quick unbind/bind sequences issued via sysfs (CONFIG_SMP=n system) that are triggered by framebuffer support and apparently more specifically fb_defio part of it. Unfortunately I'm out of ideas as to how to track down the problem which shows either as SLAB corruption (detected with SLUB debugging, e.g. [ 6383.521833] = [ 6383.530020] BUG kmalloc-64 (Not tainted): Object already free [ 6383.530020] - [ 6383.530020] [ 6383.530020] INFO: Slab 0xdde0ea20 objects=51 used=40 fp=0xcef516e0 flags=0x4080 [ 6383.530020] INFO: Object 0xcef51190 @offset=400 fp=0xcef51f50 [ 6383.530020] [ 6383.530020] Bytes b4 cef51180: cc cc cc cc d0 12 f5 ce 5a 5a 5a 5a 5a 5a 5a 5a [ 6383.530020] Object cef51190: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [ 6383.530020] Object cef511a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [ 6383.530020] Object cef511b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [ 6383.530020] Object cef511c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkk. [ 6383.530020] Redzone cef511d0: bb bb bb bb [ 6383.530020] Padding cef511d8: 5a 5a 5a 5a 5a 5a 5a 5a [ 6383.530020] Pid: 1922, comm: bash Not tainted 3.5.0-jupiter-3-g8d858b1-dirty #2 [ 6383.530020] Call Trace: [ 6383.530020] [c10bd3cc] print_trailer+0x11c/0x130 [ 6383.530020] [c10bd415] object_err+0x35/0x40 [ 6383.530020] [c10be809] free_debug_processing+0x99/0x200 [ 6383.530020] [c10bf77e] __slab_free+0x2e/0x280 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322870] ? __usbhid_submit_report+0xc0/0x3c0 [ 6383.530020] [c10bfbda] ? kfree+0xfa/0x110 [ 6383.530020] [de932aa4] ? picolcd_debug_out_report+0x8c4/0x8e0 [hid_picolcd] [ 6383.530020] [c10bfbda] kfree+0xfa/0x110 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322284] hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322908] __usbhid_submit_report+0x158/0x3c0 [ 6383.530020] [c1322c2b] usbhid_submit_report+0x1b/0x30 [ 6383.530020] [de930789] picolcd_fb_reset+0xb9/0x180 [hid_picolcd] [ 6383.530020] [de930f1d] picolcd_init_framebuffer+0x20d/0x2e0 [hid_picolcd] [ 6383.530020] [de92fb9c] picolcd_probe+0x3cc/0x580 [hid_picolcd] [ 6383.530020] [c1319147] hid_device_probe+0x67/0xf0 [ 6383.530020] [c1282f97] ? driver_sysfs_add+0x57/0x80 [ 6383.530020] [c128329d] driver_probe_device+0xbd/0x1c0 [ 6383.530020] [c1318a1b] ? hid_match_device+0x7b/0x90 [ 6383.530020] [c12821e5] driver_bind+0x75/0xd0 [ 6383.530020] [c1282170] ? driver_unbind+0x90/0x90 [ 6383.530020] [c12818b7] drv_attr_store+0x27/0x30 [ 6383.530020] [c1114aec] sysfs_write_file+0xac/0xf0 [ 6383.530020] [c10c794c] vfs_write+0x9c/0x130 [ 6383.530020] [c10d4a1f] ? sys_dup3+0x11f/0x160 [ 6383.530020] [c1114a40] ? sysfs_poll+0x90/0x90 [ 6383.530020] [c10c7bbd] sys_write+0x3d/0x70 [ 6383.530020] [c13f2557] sysenter_do_call+0x12/0x26 [ 6383.530020] FIX kmalloc-64: Object at 0xcef51190 not freed or worse spontaneous reboot of the system without any trace on netconsole or serial console. echo $devid bind; echo $devid unbind or echo $devid bind; echo $devid unbind; sleep 0.2; echo $devid bind; echo $devid unbind is sufficient to trigger the above issue while waiting a few seconds between bind and unbind shows no sign
Re: [PATCH 0/7] HID: picoLCD updates
Hi David, On Tue, 31 Jul 2012 09:26:07 David Herrmann wrote: > This is not directly related to this patchset, but did you fix the > locking issue with hid-core? It is still on my todo-list but I haven't > gotten around fixing it, yet. However, I plan on fixing it this > summer, but if picolcd does not require it, it's probably not worth > the work. I worked around it with patch 6/7 as in disabling the version check. The other probing steps don't expect a response from the hardware, thus are not affected. But it would definitely be helpful to be able to revert that one. Understanding and fixing the FB-related misbehavior on quick bind/unbind seems more important though. Not related with this, getting a return value from usbhid_submit_report() as in -EAGAIN or -ENODEV would help in order to avoid the ugly hid-picolcd 0003:04D8:C002.0003: usb_submit_urb(out) failed: -19 lines in kernel log and stopping interaction with hardware when it's gone already or giving it time to consume its queue when there is lots of data being sent. Regards, Bruno -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
Hi Bruno On Mon, Jul 30, 2012 at 9:36 PM, Bruno Prémont wrote: > Hi, > > This series updates picoLCD driver: > - split the driver functions into separate files which get included > depending on Kconfig selection > (implementation for CIR using RC_CORE will follow later) > - drop private framebuffer refcounting in favor of refcounting added > to fb_info some time ago > - fix various bugs issues > - disabled firmware version checking in probe() as it does not work > anymore since commit 4ea5454203d991ec85264f64f89ca8855fce69b0 > [HID: Fix race condition between driver core and ll-driver] > > Note: I still get weird behavior on quick unbind/bind sequences > issued via sysfs (CONFIG_SMP=n system) that are triggered by framebuffer > support and apparently more specifically fb_defio part of it. > > Unfortunately I'm out of ideas as to how to track down the problem which > shows either as SLAB corruption (detected with SLUB debugging, e.g. > > [ 6383.521833] > = > [ 6383.530020] BUG kmalloc-64 (Not tainted): Object already free > [ 6383.530020] > - > [ 6383.530020] > [ 6383.530020] INFO: Slab 0xdde0ea20 objects=51 used=40 fp=0xcef516e0 > flags=0x4080 > [ 6383.530020] INFO: Object 0xcef51190 @offset=400 fp=0xcef51f50 > [ 6383.530020] > [ 6383.530020] Bytes b4 cef51180: cc cc cc cc d0 12 f5 ce 5a 5a 5a 5a 5a 5a > 5a 5a > [ 6383.530020] Object cef51190: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b > 6b > [ 6383.530020] Object cef511a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b > 6b > [ 6383.530020] Object cef511b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b > 6b > [ 6383.530020] Object cef511c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b > a5 kkk. > [ 6383.530020] Redzone cef511d0: bb bb bb bb > > [ 6383.530020] Padding cef511d8: 5a 5a 5a 5a 5a 5a 5a 5a > > [ 6383.530020] Pid: 1922, comm: bash Not tainted > 3.5.0-jupiter-3-g8d858b1-dirty #2 > [ 6383.530020] Call Trace: > [ 6383.530020] [] print_trailer+0x11c/0x130 > [ 6383.530020] [] object_err+0x35/0x40 > [ 6383.530020] [] free_debug_processing+0x99/0x200 > [ 6383.530020] [] __slab_free+0x2e/0x280 > [ 6383.530020] [] ? hid_submit_out+0xa4/0x120 > [ 6383.530020] [] ? __usbhid_submit_report+0xc0/0x3c0 > [ 6383.530020] [] ? kfree+0xfa/0x110 > [ 6383.530020] [] ? picolcd_debug_out_report+0x8c4/0x8e0 > [hid_picolcd] > [ 6383.530020] [] kfree+0xfa/0x110 > [ 6383.530020] [] ? hid_submit_out+0xa4/0x120 > [ 6383.530020] [] ? hid_submit_out+0xa4/0x120 > [ 6383.530020] [] ? hid_submit_out+0xa4/0x120 > [ 6383.530020] [] hid_submit_out+0xa4/0x120 > [ 6383.530020] [] __usbhid_submit_report+0x158/0x3c0 > [ 6383.530020] [] usbhid_submit_report+0x1b/0x30 > [ 6383.530020] [] picolcd_fb_reset+0xb9/0x180 [hid_picolcd] > [ 6383.530020] [] picolcd_init_framebuffer+0x20d/0x2e0 > [hid_picolcd] > [ 6383.530020] [] picolcd_probe+0x3cc/0x580 [hid_picolcd] > [ 6383.530020] [] hid_device_probe+0x67/0xf0 > [ 6383.530020] [] ? driver_sysfs_add+0x57/0x80 > [ 6383.530020] [] driver_probe_device+0xbd/0x1c0 > [ 6383.530020] [] ? hid_match_device+0x7b/0x90 > [ 6383.530020] [] driver_bind+0x75/0xd0 > [ 6383.530020] [] ? driver_unbind+0x90/0x90 > [ 6383.530020] [] drv_attr_store+0x27/0x30 > [ 6383.530020] [] sysfs_write_file+0xac/0xf0 > [ 6383.530020] [] vfs_write+0x9c/0x130 > [ 6383.530020] [] ? sys_dup3+0x11f/0x160 > [ 6383.530020] [] ? sysfs_poll+0x90/0x90 > [ 6383.530020] [] sys_write+0x3d/0x70 > [ 6383.530020] [] sysenter_do_call+0x12/0x26 > [ 6383.530020] FIX kmalloc-64: Object at 0xcef51190 not freed > > or worse spontaneous reboot of the system without any trace on netconsole or > serial console. > > echo $devid > bind; echo $devid > unbind > or > echo $devid > bind; echo $devid > unbind; sleep 0.2; echo $devid > bind; echo > $devid > unbind > > is sufficient to trigger the above issue while waiting a few seconds between > bind and unbind > shows no sign of trouble. > > Suggestions as to how to debug this and fix it are welcome! This is not directly related to this patchset, but did you fix the locking issue with hid-core? It is still on my todo-list but I haven't gotten around fixing it, yet. However, I plan on fixing it this summer, but if picolcd does not require it, it's probably not worth the work. Regards David -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] HID: picoLCD updates
Hi Bruno On Mon, Jul 30, 2012 at 9:36 PM, Bruno Prémont bonb...@linux-vserver.org wrote: Hi, This series updates picoLCD driver: - split the driver functions into separate files which get included depending on Kconfig selection (implementation for CIR using RC_CORE will follow later) - drop private framebuffer refcounting in favor of refcounting added to fb_info some time ago - fix various bugs issues - disabled firmware version checking in probe() as it does not work anymore since commit 4ea5454203d991ec85264f64f89ca8855fce69b0 [HID: Fix race condition between driver core and ll-driver] Note: I still get weird behavior on quick unbind/bind sequences issued via sysfs (CONFIG_SMP=n system) that are triggered by framebuffer support and apparently more specifically fb_defio part of it. Unfortunately I'm out of ideas as to how to track down the problem which shows either as SLAB corruption (detected with SLUB debugging, e.g. [ 6383.521833] = [ 6383.530020] BUG kmalloc-64 (Not tainted): Object already free [ 6383.530020] - [ 6383.530020] [ 6383.530020] INFO: Slab 0xdde0ea20 objects=51 used=40 fp=0xcef516e0 flags=0x4080 [ 6383.530020] INFO: Object 0xcef51190 @offset=400 fp=0xcef51f50 [ 6383.530020] [ 6383.530020] Bytes b4 cef51180: cc cc cc cc d0 12 f5 ce 5a 5a 5a 5a 5a 5a 5a 5a [ 6383.530020] Object cef51190: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [ 6383.530020] Object cef511a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [ 6383.530020] Object cef511b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [ 6383.530020] Object cef511c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkk. [ 6383.530020] Redzone cef511d0: bb bb bb bb [ 6383.530020] Padding cef511d8: 5a 5a 5a 5a 5a 5a 5a 5a [ 6383.530020] Pid: 1922, comm: bash Not tainted 3.5.0-jupiter-3-g8d858b1-dirty #2 [ 6383.530020] Call Trace: [ 6383.530020] [c10bd3cc] print_trailer+0x11c/0x130 [ 6383.530020] [c10bd415] object_err+0x35/0x40 [ 6383.530020] [c10be809] free_debug_processing+0x99/0x200 [ 6383.530020] [c10bf77e] __slab_free+0x2e/0x280 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322870] ? __usbhid_submit_report+0xc0/0x3c0 [ 6383.530020] [c10bfbda] ? kfree+0xfa/0x110 [ 6383.530020] [de932aa4] ? picolcd_debug_out_report+0x8c4/0x8e0 [hid_picolcd] [ 6383.530020] [c10bfbda] kfree+0xfa/0x110 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322284] hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322908] __usbhid_submit_report+0x158/0x3c0 [ 6383.530020] [c1322c2b] usbhid_submit_report+0x1b/0x30 [ 6383.530020] [de930789] picolcd_fb_reset+0xb9/0x180 [hid_picolcd] [ 6383.530020] [de930f1d] picolcd_init_framebuffer+0x20d/0x2e0 [hid_picolcd] [ 6383.530020] [de92fb9c] picolcd_probe+0x3cc/0x580 [hid_picolcd] [ 6383.530020] [c1319147] hid_device_probe+0x67/0xf0 [ 6383.530020] [c1282f97] ? driver_sysfs_add+0x57/0x80 [ 6383.530020] [c128329d] driver_probe_device+0xbd/0x1c0 [ 6383.530020] [c1318a1b] ? hid_match_device+0x7b/0x90 [ 6383.530020] [c12821e5] driver_bind+0x75/0xd0 [ 6383.530020] [c1282170] ? driver_unbind+0x90/0x90 [ 6383.530020] [c12818b7] drv_attr_store+0x27/0x30 [ 6383.530020] [c1114aec] sysfs_write_file+0xac/0xf0 [ 6383.530020] [c10c794c] vfs_write+0x9c/0x130 [ 6383.530020] [c10d4a1f] ? sys_dup3+0x11f/0x160 [ 6383.530020] [c1114a40] ? sysfs_poll+0x90/0x90 [ 6383.530020] [c10c7bbd] sys_write+0x3d/0x70 [ 6383.530020] [c13f2557] sysenter_do_call+0x12/0x26 [ 6383.530020] FIX kmalloc-64: Object at 0xcef51190 not freed or worse spontaneous reboot of the system without any trace on netconsole or serial console. echo $devid bind; echo $devid unbind or echo $devid bind; echo $devid unbind; sleep 0.2; echo $devid bind; echo $devid unbind is sufficient to trigger the above issue while waiting a few seconds between bind and unbind shows no sign of trouble. Suggestions as to how to debug this and fix it are welcome! This is not directly related to this patchset, but did you fix the locking issue with hid-core? It is still on my todo-list but I haven't gotten around fixing it, yet. However, I plan on fixing it this summer, but if picolcd does not require it, it's probably not worth the work. Regards David -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at
Re: [PATCH 0/7] HID: picoLCD updates
Hi David, On Tue, 31 Jul 2012 09:26:07 David Herrmann wrote: This is not directly related to this patchset, but did you fix the locking issue with hid-core? It is still on my todo-list but I haven't gotten around fixing it, yet. However, I plan on fixing it this summer, but if picolcd does not require it, it's probably not worth the work. I worked around it with patch 6/7 as in disabling the version check. The other probing steps don't expect a response from the hardware, thus are not affected. But it would definitely be helpful to be able to revert that one. Understanding and fixing the FB-related misbehavior on quick bind/unbind seems more important though. Not related with this, getting a return value from usbhid_submit_report() as in -EAGAIN or -ENODEV would help in order to avoid the ugly hid-picolcd 0003:04D8:C002.0003: usb_submit_urb(out) failed: -19 lines in kernel log and stopping interaction with hardware when it's gone already or giving it time to consume its queue when there is lots of data being sent. Regards, Bruno -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 0/7] HID: picoLCD updates
Hi, This series updates picoLCD driver: - split the driver functions into separate files which get included depending on Kconfig selection (implementation for CIR using RC_CORE will follow later) - drop private framebuffer refcounting in favor of refcounting added to fb_info some time ago - fix various bugs issues - disabled firmware version checking in probe() as it does not work anymore since commit 4ea5454203d991ec85264f64f89ca8855fce69b0 [HID: Fix race condition between driver core and ll-driver] Note: I still get weird behavior on quick unbind/bind sequences issued via sysfs (CONFIG_SMP=n system) that are triggered by framebuffer support and apparently more specifically fb_defio part of it. Unfortunately I'm out of ideas as to how to track down the problem which shows either as SLAB corruption (detected with SLUB debugging, e.g. [ 6383.521833] = [ 6383.530020] BUG kmalloc-64 (Not tainted): Object already free [ 6383.530020] - [ 6383.530020] [ 6383.530020] INFO: Slab 0xdde0ea20 objects=51 used=40 fp=0xcef516e0 flags=0x4080 [ 6383.530020] INFO: Object 0xcef51190 @offset=400 fp=0xcef51f50 [ 6383.530020] [ 6383.530020] Bytes b4 cef51180: cc cc cc cc d0 12 f5 ce 5a 5a 5a 5a 5a 5a 5a 5a [ 6383.530020] Object cef51190: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [ 6383.530020] Object cef511a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [ 6383.530020] Object cef511b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [ 6383.530020] Object cef511c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkk. [ 6383.530020] Redzone cef511d0: bb bb bb bb [ 6383.530020] Padding cef511d8: 5a 5a 5a 5a 5a 5a 5a 5a [ 6383.530020] Pid: 1922, comm: bash Not tainted 3.5.0-jupiter-3-g8d858b1-dirty #2 [ 6383.530020] Call Trace: [ 6383.530020] [] print_trailer+0x11c/0x130 [ 6383.530020] [] object_err+0x35/0x40 [ 6383.530020] [] free_debug_processing+0x99/0x200 [ 6383.530020] [] __slab_free+0x2e/0x280 [ 6383.530020] [] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [] ? __usbhid_submit_report+0xc0/0x3c0 [ 6383.530020] [] ? kfree+0xfa/0x110 [ 6383.530020] [] ? picolcd_debug_out_report+0x8c4/0x8e0 [hid_picolcd] [ 6383.530020] [] kfree+0xfa/0x110 [ 6383.530020] [] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [] hid_submit_out+0xa4/0x120 [ 6383.530020] [] __usbhid_submit_report+0x158/0x3c0 [ 6383.530020] [] usbhid_submit_report+0x1b/0x30 [ 6383.530020] [] picolcd_fb_reset+0xb9/0x180 [hid_picolcd] [ 6383.530020] [] picolcd_init_framebuffer+0x20d/0x2e0 [hid_picolcd] [ 6383.530020] [] picolcd_probe+0x3cc/0x580 [hid_picolcd] [ 6383.530020] [] hid_device_probe+0x67/0xf0 [ 6383.530020] [] ? driver_sysfs_add+0x57/0x80 [ 6383.530020] [] driver_probe_device+0xbd/0x1c0 [ 6383.530020] [] ? hid_match_device+0x7b/0x90 [ 6383.530020] [] driver_bind+0x75/0xd0 [ 6383.530020] [] ? driver_unbind+0x90/0x90 [ 6383.530020] [] drv_attr_store+0x27/0x30 [ 6383.530020] [] sysfs_write_file+0xac/0xf0 [ 6383.530020] [] vfs_write+0x9c/0x130 [ 6383.530020] [] ? sys_dup3+0x11f/0x160 [ 6383.530020] [] ? sysfs_poll+0x90/0x90 [ 6383.530020] [] sys_write+0x3d/0x70 [ 6383.530020] [] sysenter_do_call+0x12/0x26 [ 6383.530020] FIX kmalloc-64: Object at 0xcef51190 not freed or worse spontaneous reboot of the system without any trace on netconsole or serial console. echo $devid > bind; echo $devid > unbind or echo $devid > bind; echo $devid > unbind; sleep 0.2; echo $devid > bind; echo $devid > unbind is sufficient to trigger the above issue while waiting a few seconds between bind and unbind shows no sign of trouble. Suggestions as to how to debug this and fix it are welcome! Thanks, Bruno -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 0/7] HID: picoLCD updates
Hi, This series updates picoLCD driver: - split the driver functions into separate files which get included depending on Kconfig selection (implementation for CIR using RC_CORE will follow later) - drop private framebuffer refcounting in favor of refcounting added to fb_info some time ago - fix various bugs issues - disabled firmware version checking in probe() as it does not work anymore since commit 4ea5454203d991ec85264f64f89ca8855fce69b0 [HID: Fix race condition between driver core and ll-driver] Note: I still get weird behavior on quick unbind/bind sequences issued via sysfs (CONFIG_SMP=n system) that are triggered by framebuffer support and apparently more specifically fb_defio part of it. Unfortunately I'm out of ideas as to how to track down the problem which shows either as SLAB corruption (detected with SLUB debugging, e.g. [ 6383.521833] = [ 6383.530020] BUG kmalloc-64 (Not tainted): Object already free [ 6383.530020] - [ 6383.530020] [ 6383.530020] INFO: Slab 0xdde0ea20 objects=51 used=40 fp=0xcef516e0 flags=0x4080 [ 6383.530020] INFO: Object 0xcef51190 @offset=400 fp=0xcef51f50 [ 6383.530020] [ 6383.530020] Bytes b4 cef51180: cc cc cc cc d0 12 f5 ce 5a 5a 5a 5a 5a 5a 5a 5a [ 6383.530020] Object cef51190: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [ 6383.530020] Object cef511a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [ 6383.530020] Object cef511b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [ 6383.530020] Object cef511c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkk. [ 6383.530020] Redzone cef511d0: bb bb bb bb [ 6383.530020] Padding cef511d8: 5a 5a 5a 5a 5a 5a 5a 5a [ 6383.530020] Pid: 1922, comm: bash Not tainted 3.5.0-jupiter-3-g8d858b1-dirty #2 [ 6383.530020] Call Trace: [ 6383.530020] [c10bd3cc] print_trailer+0x11c/0x130 [ 6383.530020] [c10bd415] object_err+0x35/0x40 [ 6383.530020] [c10be809] free_debug_processing+0x99/0x200 [ 6383.530020] [c10bf77e] __slab_free+0x2e/0x280 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322870] ? __usbhid_submit_report+0xc0/0x3c0 [ 6383.530020] [c10bfbda] ? kfree+0xfa/0x110 [ 6383.530020] [de932aa4] ? picolcd_debug_out_report+0x8c4/0x8e0 [hid_picolcd] [ 6383.530020] [c10bfbda] kfree+0xfa/0x110 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322284] hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322908] __usbhid_submit_report+0x158/0x3c0 [ 6383.530020] [c1322c2b] usbhid_submit_report+0x1b/0x30 [ 6383.530020] [de930789] picolcd_fb_reset+0xb9/0x180 [hid_picolcd] [ 6383.530020] [de930f1d] picolcd_init_framebuffer+0x20d/0x2e0 [hid_picolcd] [ 6383.530020] [de92fb9c] picolcd_probe+0x3cc/0x580 [hid_picolcd] [ 6383.530020] [c1319147] hid_device_probe+0x67/0xf0 [ 6383.530020] [c1282f97] ? driver_sysfs_add+0x57/0x80 [ 6383.530020] [c128329d] driver_probe_device+0xbd/0x1c0 [ 6383.530020] [c1318a1b] ? hid_match_device+0x7b/0x90 [ 6383.530020] [c12821e5] driver_bind+0x75/0xd0 [ 6383.530020] [c1282170] ? driver_unbind+0x90/0x90 [ 6383.530020] [c12818b7] drv_attr_store+0x27/0x30 [ 6383.530020] [c1114aec] sysfs_write_file+0xac/0xf0 [ 6383.530020] [c10c794c] vfs_write+0x9c/0x130 [ 6383.530020] [c10d4a1f] ? sys_dup3+0x11f/0x160 [ 6383.530020] [c1114a40] ? sysfs_poll+0x90/0x90 [ 6383.530020] [c10c7bbd] sys_write+0x3d/0x70 [ 6383.530020] [c13f2557] sysenter_do_call+0x12/0x26 [ 6383.530020] FIX kmalloc-64: Object at 0xcef51190 not freed or worse spontaneous reboot of the system without any trace on netconsole or serial console. echo $devid bind; echo $devid unbind or echo $devid bind; echo $devid unbind; sleep 0.2; echo $devid bind; echo $devid unbind is sufficient to trigger the above issue while waiting a few seconds between bind and unbind shows no sign of trouble. Suggestions as to how to debug this and fix it are welcome! Thanks, Bruno -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/