Re: localed stuck in recent 3.18 git in copy_net_ns?

2014-10-28 Thread Kevin Fenzi
Just FYI, this solves the original issue for me as well. ;) 

Thanks for all the work in tracking it down... 

Tested-by: Kevin Fenzi 

kevin




signature.asc
Description: PGP signature


Re: localed stuck in recent 3.18 git in copy_net_ns?

2014-10-21 Thread Kevin Fenzi
On Mon, 20 Oct 2014 14:53:59 -0600
Kevin Fenzi  wrote:

> On Mon, 20 Oct 2014 16:43:26 -0400
> Dave Jones  wrote:
> 
> > I've seen similar soft lockup traces from the sys_unshare path when
> > running my fuzz tester.  It seems that if you create enough network
> > namespaces, it can take a huge amount of time for them to be
> > iterated. (Running trinity with '-c unshare' you can see the slow
> > down happen. In some cases, it takes so long that the watchdog
> > process kills it -- though the SIGKILL won't get delivered until
> > the unshare() completes)
> > 
> > Any idea what this machine had been doing prior to this that may
> > have involved creating lots of namespaces ?
> 
> That was right after boot. ;) 
> 
> This is my main rawhide running laptop.
> 
> A 'ip netns list' shows nothing.

Some more information: 

The problem started between: 

v3.17-7872-g5ff0b9e1a1da and v3.17-8307-gf1d0d14120a8

(I can try and do a bisect, but have to head out on a trip tomorrow)

In all the kernels with the problem, there is a kworker process in D. 

sysrq-t says: 
Showing all locks held in the system:
Oct 21 15:06:31 voldemort.scrye.com kernel: 4 locks held by kworker/u16:0/6:
Oct 21 15:06:31 voldemort.scrye.com kernel:  #0:  ("%s""netns"){.+.+.+}, at: [] process_one_work+0x17f/0x850
Oct 21 15:06:31 voldemort.scrye.com kernel:  #1:  (net_cleanup_work){+.+.+.}, at: [] process_one_work+0x17f/0x850
Oct 21 15:06:31 voldemort.scrye.com kernel:  #2:  (net_mutex){+.+.+.}, at: [] cleanup_net+0x8c/0x1f0
Oct 21 15:06:31 voldemort.scrye.com kernel:  #3:  (rcu_sched_state.barrier_mutex){+.+...}, at: [] _rcu_barrier+0x35/0x200

On the first run, any of the systemd units that use PrivateNetwork run
ok, but they are also set to time out after a minute. On successive
runs they hang in D as well.

kevin




Re: localed stuck in recent 3.18 git in copy_net_ns?

2014-10-20 Thread Kevin Fenzi
On Mon, 20 Oct 2014 16:43:26 -0400
Dave Jones  wrote:

> I've seen similar soft lockup traces from the sys_unshare path when
> running my fuzz tester.  It seems that if you create enough network
> namespaces, it can take a huge amount of time for them to be iterated.
> (Running trinity with '-c unshare' you can see the slow down happen.
> In some cases, it takes so long that the watchdog process kills it --
>  though the SIGKILL won't get delivered until the unshare() completes)
> 
> Any idea what this machine had been doing prior to this that may have
> involved creating lots of namespaces ?

That was right after boot. ;) 

This is my main rawhide running laptop.

A 'ip netns list' shows nothing.

kevin




localed stuck in recent 3.18 git in copy_net_ns?

2014-10-20 Thread Kevin Fenzi
Greetings. 

I'm seeing suspend/resume failures with recent 3.18 git kernels. 

Full dmesg at: http://paste.fedoraproject.org/143615/83287914/

The possibly interesting parts: 

[   78.373144] PM: Syncing filesystems ... done.
[   78.411180] PM: Preparing system for mem sleep
[   78.411995] Freezing user space processes ... 
[   98.429955] Freezing of tasks failed after 20.001 seconds (1 tasks refusing to freeze, wq_busy=0):
[   98.429971] (-localed)  D 88025f214c80 0  1866  1 0x0084
[   98.429975]  88024e777df8 0086 88009bb0 00014c80
[   98.429978]  88024e777fd8 00014c80 880250ffb110 88009bb0
[   98.429981]   81cec1a0 81cec1a4 88009bb0
[   98.429983] Call Trace:
[   98.429991]  [] schedule_preempt_disabled+0x29/0x70
[   98.429994]  [] __mutex_lock_slowpath+0xb3/0x120
[   98.429997]  [] mutex_lock+0x23/0x40
[   98.430001]  [] copy_net_ns+0x75/0x140
[   98.430005]  [] create_new_namespaces+0xfd/0x1a0
[   98.430008]  [] unshare_nsproxy_namespaces+0x5a/0xc0
[   98.430012]  [] SyS_unshare+0x193/0x340
[   98.430015]  [] system_call_fastpath+0x12/0x17

[   98.430032] Restarting tasks ... done.
[   98.480361] PM: Syncing filesystems ... done.
[   98.571645] PM: Preparing system for freeze sleep
[   98.571779] Freezing user space processes ... 
[  118.592086] Freezing of tasks failed after 20.003 seconds (1 tasks refusing to freeze, wq_busy=0):
[  118.592102] (-localed)  D 88025f214c80 0  1866  1 0x0084
[  118.592106]  88024e777df8 0086 88009bb0 00014c80
[  118.592109]  88024e777fd8 00014c80 880250ffb110 88009bb0
[  118.592111]   81cec1a0 81cec1a4 88009bb0
[  118.592114] Call Trace:
[  118.592121]  [] schedule_preempt_disabled+0x29/0x70
[  118.592125]  [] __mutex_lock_slowpath+0xb3/0x120
[  118.592127]  [] mutex_lock+0x23/0x40
[  118.592132]  [] copy_net_ns+0x75/0x140
[  118.592136]  [] create_new_namespaces+0xfd/0x1a0
[  118.592139]  [] unshare_nsproxy_namespaces+0x5a/0xc0
[  118.592143]  [] SyS_unshare+0x193/0x340
[  118.592146]  [] system_call_fastpath+0x12/0x17

[  118.592163] Restarting tasks ... done.

root         6  0.0  0.0      0     0 ?        D    13:49   0:00 [kworker/u16:0]
root      1876  0.0  0.0  41460  5784 ?        Ds   13:49   0:00 (-localed)
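
For anyone who wants to look for tasks stuck like the two above without going through ps, here is a minimal sketch that reads /proc/<pid>/stat directly. It is not something used in this thread; the function names are made up for illustration, and it only walks process entries, not individual threads.

```python
import os

def parse_stat_line(line):
    """Split a /proc/<pid>/stat line into (pid, comm, state).

    comm is wrapped in parentheses and may itself contain parentheses
    or spaces, so split on the first '(' and the last ')'.
    """
    left = line.index("(")
    right = line.rindex(")")
    pid = int(line[:left])
    comm = line[left + 1:right]
    state = line[right + 1:].split()[0]
    return pid, comm, state

def tasks_in_state(wanted="D", proc="/proc"):
    """Return (pid, comm) for every process whose state matches `wanted`."""
    found = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat_line(f.read())
        except (OSError, ValueError):
            continue  # the process exited while we were looking
        if state == wanted:
            found.append((pid, comm))
    return found

if __name__ == "__main__":
    for pid, comm in tasks_in_state("D"):
        print(pid, comm)
```

State D is uninterruptible sleep, which is why these tasks cannot be frozen for suspend.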

I'll try and bisect this, but perhaps it rings bells already for folks. 

kevin





Re: ELAN Touchscreen regression in recent 3.12 rc's? (USB)

2013-11-09 Thread Kevin Fenzi
Well, that was fun. ;) 

git bisect start
# bad: [69c88dc7d9f1a6c3eceb7058111677c640811c94] vfs: fix new kernel-doc warnings
git bisect bad 69c88dc7d9f1a6c3eceb7058111677c640811c94
# good: [31d141e3a666269a3b6fcccddb0351caf7454240] Linux 3.12-rc6
git bisect good 31d141e3a666269a3b6fcccddb0351caf7454240
# good: [b403b73c21fcab11411a1439867a3ead9117e5e4] Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
git bisect good b403b73c21fcab11411a1439867a3ead9117e5e4
# good: [d55f0691c041dba46daad7790b8f2631acb55f9a] Merge remote-tracking branch 'asoc/fix/pcm1792a' into asoc-linus
git bisect good d55f0691c041dba46daad7790b8f2631acb55f9a
# good: [20c87bd40e6c1ff7e31cc5eea4fb37829a57eb58] Merge tag 'asoc-v3.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
git bisect good 20c87bd40e6c1ff7e31cc5eea4fb37829a57eb58
# good: [d24fec3991076124e069c889c530cdc69cd43fb8] Merge tag 'jfs-3.12' of git://github.com/kleikamp/linux-shaggy
git bisect good d24fec3991076124e069c889c530cdc69cd43fb8
# good: [93cd00043fffec840fa36909c4a8eb0f735dfb04] Merge tag 'sound-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect good 93cd00043fffec840fa36909c4a8eb0f735dfb04
# good: [606d6fe3ffdb5190d4c8e4d6cd23aa6c1f9cb6ad] fs/namei.c: fix new kernel-doc warning
git bisect good 606d6fe3ffdb5190d4c8e4d6cd23aa6c1f9cb6ad
# first bad commit: [69c88dc7d9f1a6c3eceb7058111677c640811c94] vfs: fix new kernel-doc warnings

Which is... nonsense. ;) 

So, I tried the first bad commit again and it works. 

After a small period of head scratching, I noticed that the other change
in the non-working kernel was re-enabling debugging options. 

So, I rebuilt the bad kernel locally with the debugging config and
sure enough it's back. 

So, it's some debugging option causing it.
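
A bisect that blames the very commit it was started with as bad means every commit tested in between was marked good, which is a hint that the difference lies outside the source tree (here, the config). A rough sketch of checking a `git bisect log` for that pattern follows; the function names are hypothetical, and it assumes each `# bad:` / `# good:` comment sits on one unwrapped line.

```python
import re

MARK = re.compile(r"^# (bad|good|first bad commit): \[([0-9a-f]{40})\]")

def summarize_bisect_log(text):
    """Collect the commits marked in `git bisect log` output."""
    marks = {"bad": [], "good": [], "first bad commit": []}
    for line in text.splitlines():
        m = MARK.match(line)
        if m:
            marks[m.group(1)].append(m.group(2))
    return marks

def only_endpoint_was_bad(marks):
    """True if bisect blamed the commit it was started with as bad,
    i.e. no other commit was ever marked bad along the way."""
    first_bad = marks["first bad commit"]
    return bool(first_bad) and marks["bad"] == first_bad
```

When this returns True it is probably worth comparing build configs before trusting the result.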

I can try and bisect again with debugging config but I just got this laptop
recently, so it could be that all debugging kernels have the issue and it's 
nothing recent?

The diff of configs is attached in case anything leaps out at people: 

kevin
--
164c164
< CONFIG_DEBUG_BLK_CGROUP=y
---
> # CONFIG_DEBUG_BLK_CGROUP is not set
211d210
< CONFIG_PERF_USE_VMALLOC=y
217c216
< CONFIG_DEBUG_PERF_USE_VMALLOC=y
---
> # CONFIG_DEBUG_PERF_USE_VMALLOC is not set
286c285
< CONFIG_MODULE_FORCE_UNLOAD=y
---
> # CONFIG_MODULE_FORCE_UNLOAD is not set
346c345,350
< CONFIG_UNINLINE_SPIN_UNLOCK=y
---
> CONFIG_INLINE_SPIN_UNLOCK_IRQ=y
> CONFIG_INLINE_READ_UNLOCK=y
> CONFIG_INLINE_READ_UNLOCK_IRQ=y
> CONFIG_INLINE_WRITE_UNLOCK=y
> CONFIG_INLINE_WRITE_UNLOCK_IRQ=y
> CONFIG_MUTEX_SPIN_ON_OWNER=y
402,403c406,407
< CONFIG_MAXSMP=y
< CONFIG_NR_CPUS=4096
---
> # CONFIG_MAXSMP is not set
> CONFIG_NR_CPUS=128
409d412
< CONFIG_PREEMPT_COUNT=y
437c440
< CONFIG_NODES_SHIFT=10
---
> CONFIG_NODES_SHIFT=9
460c463
< CONFIG_SPLIT_PTLOCK_CPUS=99
---
> CONFIG_SPLIT_PTLOCK_CPUS=4
484c487
< CONFIG_X86_BOOTPARAM_MEMORY_CORRUPTION_CHECK=y
---
> # CONFIG_X86_BOOTPARAM_MEMORY_CORRUPTION_CHECK is not set
535c538
< CONFIG_PM_TEST_SUSPEND=y
---
> # CONFIG_PM_TEST_SUSPEND is not set
561c564
< CONFIG_ACPI_DEBUG=y
---
> # CONFIG_ACPI_DEBUG is not set
1391c1394
< CONFIG_CEPH_LIB_PRETTYDEBUG=y
---
> # CONFIG_CEPH_LIB_PRETTYDEBUG is not set
1545c1548
< CONFIG_DRBD_FAULT_INJECTION=y
---
> # CONFIG_DRBD_FAULT_INJECTION is not set
2293c2296
< CONFIG_ATH_DEBUG=y
---
> # CONFIG_ATH_DEBUG is not set
2311c2314
< CONFIG_CARL9170_DEBUGFS=y
---
> # CONFIG_CARL9170_DEBUGFS is not set
2342c2345
< CONFIG_B43_DEBUG=y
---
> # CONFIG_B43_DEBUG is not set
2348c2351
< CONFIG_B43LEGACY_DEBUG=y
---
> # CONFIG_B43LEGACY_DEBUG is not set
2384c2387
< CONFIG_IWLWIFI_DEVICE_TRACING=y
---
> # CONFIG_IWLWIFI_DEVICE_TRACING is not set
4745,4746c4748
< CONFIG_DMADEVICES_DEBUG=y
< CONFIG_DMADEVICES_VDEBUG=y
---
> # CONFIG_DMADEVICES_DEBUG is not set
5202c5204
< CONFIG_EXT4_DEBUG=y
---
> # CONFIG_EXT4_DEBUG is not set
5204c5206
< CONFIG_JBD2_DEBUG=y
---
> # CONFIG_JBD2_DEBUG is not set
5249c5251
< CONFIG_QUOTA_DEBUG=y
---
> # CONFIG_QUOTA_DEBUG is not set
5380c5382
< CONFIG_NFSD_FAULT_INJECTION=y
---
> # CONFIG_NFSD_FAULT_INJECTION is not set
5505c5507
< CONFIG_DEBUG_FORCE_WEAK_PER_CPU=y
---
> # CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set
5513,5521c5515,5516
< CONFIG_DEBUG_OBJECTS=y
< # CONFIG_DEBUG_OBJECTS_SELFTEST is not set
< CONFIG_DEBUG_OBJECTS_FREE=y
< CONFIG_DEBUG_OBJECTS_TIMERS=y
< CONFIG_DEBUG_OBJECTS_WORK=y
< CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
< CONFIG_DEBUG_OBJECTS_PERCPU_COUNTER=y
< CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT=1
< CONFIG_SLUB_DEBUG_ON=y
---
> # CONFIG_DEBUG_OBJECTS is not set
> # CONFIG_SLUB_DEBUG_ON is not set
5524,5528c5519,5520
< CONFIG_DEBUG_KMEMLEAK=y
< CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=1024
< # CONFIG_DEBUG_KMEMLEAK_TEST is not set
< CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=y
< CONFIG_DEBUG_STACK_USAGE=y
---
> # CONFIG_DEBUG_KMEMLEAK is not set
> # CONFIG_DEBUG_STACK_USAGE is not set
5548,5551c5540
< CONFIG_DETECT_HUNG_TASK=y
< CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
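
To make a config diff like the one above easier to scan, here is a small sketch that turns normal `diff` output into a per-option table. The names are made up for illustration. It assumes the '<' side is the first file given to diff (here apparently the debugging config) and the '>' side the second, and it cannot tell "is not set" apart from "absent on that side": both come out as None.

```python
import re

OPTION = re.compile(r"^(?:# )?(CONFIG_\w+)(?:=(.*)| is not set)$")

def parse_option(text):
    """Return (name, value) for one .config line; value is None for 'is not set'."""
    m = OPTION.match(text.strip())
    if not m:
        return None
    return m.group(1), m.group(2)

def changed_options(diff_text):
    """Map option name -> (first_file_value, second_file_value).

    '<' lines come from the first file, '>' lines from the second.
    """
    left, right = {}, {}
    for line in diff_text.splitlines():
        if line.startswith("< "):
            side = left
        elif line.startswith("> "):
            side = right
        else:
            continue  # hunk headers such as '164c164' and '---' separators
        parsed = parse_option(line[2:])
        if parsed:
            side[parsed[0]] = parsed[1]
    return {name: (left.get(name), right.get(name))
            for name in sorted(set(left) | set(right))}
```

From there, one way to narrow things down would be to flip the differing options in batches and rebuild, rather than bisecting commits.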

Re: ELAN Touchscreen regression in recent 3.12 rc's? (USB)

2013-11-09 Thread Kevin Fenzi
On Thu, 7 Nov 2013 16:10:07 +0800
AceLan Kao  wrote:

> Hi Kevin,
> 
> http://people.canonical.com/~acelan/elan_touchscreen/
> Here are the kernels, please test them one by one and tell me which
> one works. As your description, the first one should work, and the
> second one doesn't, I just want to make sure that, so I build the rc6
> and rc7 kernel to test. Thanks.

That's very nice of you. :) 

However, I run Fedora rawhide here, so debs aren't too useful for me. 

I just confirmed that: 

kernel-3.12.0-0.rc6.git0.1.fc21.x86_64 is ok. 
and
kernel-3.12.0-0.rc6.git1.1.fc21.x86_64 shows the issue. 

So, that's between 3.12-rc6 and 57-g69c88dc
I'm setting up to build some kernels to bisect it. 

git bisect start
# bad: [69c88dc7d9f1a6c3eceb7058111677c640811c94] vfs: fix new kernel-doc warnings
git bisect bad 69c88dc7d9f1a6c3eceb7058111677c640811c94
# good: [31d141e3a666269a3b6fcccddb0351caf7454240] Linux 3.12-rc6
git bisect good 31d141e3a666269a3b6fcccddb0351caf7454240

Will let you know what I end up with, thanks. 

kevin




Re: ELAN Touchscreen regression in recent 3.12 rc's? (USB)

2013-11-06 Thread Kevin Fenzi
On Wed, 6 Nov 2013 09:44:11 +0800
AceLan Kao  wrote:

> Hi,
> 
> Sorry, no thoughts about this.
> If Kevin would like to try, I can bisect the kernel and build the
> Ubuntu .deb package for him.
> But like I say above, those 4 commits doesn't look like to lead to the
> problem, so I'll try to bisect the whole kernel.
> 
> % git bisect start v3.12-rc7 v3.12-rc6
> Bisecting: 165 revisions left to test after this (roughly 7 steps)
> [edd31476011052d8f6591a3194ba0716b0cea681] bnx2x: Set NETIF_F_HIGHDMA
> unconditionally

Yeah, I can try and do so this weekend. Too busy during the week to get
very far on this. 

Also, it seems like on some boots it just never works, but on others it
does finally work: 

[ 3238.722796] usb 2-7: new full-speed USB device number 127 using xhci_hcd
[ 3238.735177] usb 2-7: New USB device found, idVendor=04f3, idProduct=016f
[ 3238.735183] usb 2-7: New USB device strings: Mfr=4, Product=14, SerialNumber=0
[ 3238.735185] usb 2-7: Product: Touchscreen
[ 3238.735187] usb 2-7: Manufacturer: ELAN
[ 3238.735539] usb 2-7: ep 0x2 - rounding interval to 64 microframes, ep desc says 80 microframes
[ 3238.743682] input: ELAN Touchscreen as /devices/pci:00/:00:14.0/usb2/2-7/2-7:1.0/input/input368
[ 3238.744505] hid-multitouch 0003:04F3:016F.00B4: input,hiddev0,hidraw1: USB HID v1.10 Device [ELAN Touchscreen] on usb-:00:14.0-7/input0

so, 3238 seconds after boot, with 127 tries it started working on this boot. 
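
The time and device number quoted here can be pulled out of a saved dmesg mechanically. A minimal sketch, assuming the log has been saved with one message per line (not wrapped as in this mail); the function name and the default port string are only for illustration:

```python
import re

NEW_DEVICE = re.compile(
    r"^\[\s*(\d+\.\d+)\] usb (\S+): new \S+ USB device number (\d+) using (\S+)")

def enumeration_attempts(dmesg_text, port="2-7"):
    """Return (seconds_since_boot, device_number) for each attempt on `port`."""
    attempts = []
    for line in dmesg_text.splitlines():
        m = NEW_DEVICE.match(line)
        if m and m.group(2) == port:
            attempts.append((float(m.group(1)), int(m.group(3))))
    return attempts
```

The last tuple returned gives the timestamp and device number of the most recent attempt, which is what the two figures above correspond to.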

I'll try and isolate it this weekend. Thanks for the attention everyone. 

kevin
--
> 
> Best regards,
> AceLan Kao.
> 
> 2013/11/5 Jiri Kosina :
> > On Wed, 30 Oct 2013, Nikolai Kondrashov wrote:
> >
> >> > It's hard to believe that those quirks will lead to the problem.
> >> > And actually, there are 4 new commits introduced to -rc7, and 3
> >> > of them are quirks.
> >> >
> >> > % git log --pretty=oneline v3.12-rc6..v3.12-rc7 drivers/hid
> >> > 86b84167d4e67372376a57ea9955c5d53dae232f HID: wiimote: add LEGO-wiimote VID
> >> > ad0e669b922c7790182cf19f8015b30e23ad9499 HID: Fix unit exponent parsing again
> >> > 684524d35fe8d13be1f2649633e43bd02c96c695 HID: usbhid: quirk for SiS Touchscreen
> >> > 8171a67d587a09e14a4949a81e070345fedcf410 HID: usbhid: quirk for Synaptics Large Touchccreen
> >> >
> >> > CC'd Nikolai, since his commit changes the protocol.
> >>
> >> My change is very unlikely to produce such problems. It changes
> >> calculation of axes resolution at the time the report descriptor
> >> is processed. The results of the calculation are not used by the
> >> kernel, AFAIK, but only by userspace drivers. The results should
> >> not be used to affect the interactions with the device, but only
> >> the interpretation of the reports (again, in userspace), and even
> >> that is barely done currently.
> >>
> >> Please try reverting that particular commit and see if it affects
> >> the behavior.
> >
> > AceLan,
> >
> > do you have any update, please?
> >
> > --
> > Jiri Kosina
> > SUSE Labs
> 





Re: ELAN Touchscreen regression in recent 3.12 rc's? (USB)

2013-10-29 Thread Kevin Fenzi
On Wed, 30 Oct 2013 09:41:06 +0800
AceLan Kao  wrote:

> Hi,
> 
> It's hard to believe that those quirks will lead to the problem.
> And actually, there are 4 new commits introduced to -rc7, and 3 of
> them are quirks.
> 
> % git log --pretty=oneline v3.12-rc6..v3.12-rc7 drivers/hid
> 86b84167d4e67372376a57ea9955c5d53dae232f HID: wiimote: add LEGO-wiimote VID
> ad0e669b922c7790182cf19f8015b30e23ad9499 HID: Fix unit exponent parsing again
> 684524d35fe8d13be1f2649633e43bd02c96c695 HID: usbhid: quirk for SiS Touchscreen
> 8171a67d587a09e14a4949a81e070345fedcf410 HID: usbhid: quirk for Synaptics Large Touchccreen
> 
> CC'd Nikolai, since his commit changes the protocol.

Oddly, after a while it seems to have decided to stay on: 

[ 4021.819962] usb 2-7: Manufacturer: ELAN
[ 4021.820163] usb 2-7: ep 0x2 - rounding interval to 64 microframes, ep desc says 80 microframes
[ 4021.828338] input: ELAN Touchscreen as /devices/pci:00/:00:14.0/usb2/2-7/2-7:1.0/input/input221
[ 4021.828801] hid-multitouch 0003:04F3:016F.00D3: input,hiddev0,hidraw1: USB HID v1.10 Device [ELAN Touchscreen] on usb-:00:14.0-7/input0
[ 4023.903599] usb 2-7: USB disconnect, device number 86
[ 4024.198444] usb 2-7: new full-speed USB device number 87 using xhci_hcd
[ 4024.211102] usb 2-7: New USB device found, idVendor=04f3, idProduct=016f
[ 4024.211143] usb 2-7: New USB device strings: Mfr=4, Product=14, SerialNumber=0
[ 4024.211177] usb 2-7: Product: Touchscreen
[ 4024.211197] usb 2-7: Manufacturer: ELAN
[ 4024.211489] usb 2-7: ep 0x2 - rounding interval to 64 microframes, ep desc says 80 microframes
[ 4024.219707] input: ELAN Touchscreen as /devices/pci:00/:00:14.0/usb2/2-7/2-7:1.0/input/input222
[ 4024.220555] hid-multitouch 0003:04F3:016F.00D4: input,hiddev0,hidraw1: USB HID v1.10 Device [ELAN Touchscreen] on usb-:00:14.0-7/input0

and it's been working since then. So, only after 87 cycles... 

Some kind of weird race condition?

kevin




ELAN Touchscreen regression in recent 3.12 rc's? (USB)

2013-10-28 Thread Kevin Fenzi
Greetings. 

I have a Lenovo Yoga 2 Pro. It has an ELAN touchscreen. 

In 3.12-rc5 it was detected and worked. 
In 3.12-rc6 it was detected and worked. 

Sometime after the rc6 timeframe it stopped working. It doesn't work with
rc7. 

In dmesg I get: 

[  159.519663] usb 2-7: New USB device found, idVendor=04f3, idProduct=016f
[  159.519681] usb 2-7: New USB device strings: Mfr=4, Product=14, SerialNumber=0
[  159.519683] usb 2-7: Product: Touchscreen
[  159.519685] usb 2-7: Manufacturer: ELAN
[  159.519901] usb 2-7: ep 0x2 - rounding interval to 64 microframes, ep desc says 80 microframes
[  159.528027] input: ELAN Touchscreen as /devices/pci:00/:00:14.0/usb2/2-7/2-7:1.0/input/input20
[  159.528544] hid-multitouch 0003:04F3:016F.000A: input,hiddev0,hidraw1: USB HID v1.10 Device [ELAN Touchscreen] on usb-:00:14.0-7/input0
[  161.601423] usb 2-7: USB disconnect, device number 71
[  161.896828] usb 2-7: new full-speed USB device number 72 using xhci_hcd
[  163.991296] usb 2-7: unable to read config index 0 descriptor/start: -71
[  163.991301] usb 2-7: can't read configurations, error -71
[  164.143686] usb 2-7: new full-speed USB device number 73 using xhci_hcd
[  166.239273] usb 2-7: unable to read config index 0 descriptor/start: -71
[  166.239278] usb 2-7: can't read configurations, error -71
[  166.392576] usb 2-7: new full-speed USB device number 74 using xhci_hcd
[  168.487051] usb 2-7: unable to read config index 0 descriptor/start: -71
[  168.487058] usb 2-7: can't read configurations, error -71
[  168.640308] usb 2-7: new full-speed USB device number 75 using xhci_hcd
[  170.734850] usb 2-7: unable to read config index 0 descriptor/start: -71
[  170.734856] usb 2-7: can't read configurations, error -71
[  170.734927] hub 2-0:1.0: unable to enumerate USB device on port 7
[  171.022134] usb 2-7: new full-speed USB device number 76 using xhci_hcd
[  173.116744] usb 2-7: unable to read config index 0 descriptor/start: -71
[  173.116750] usb 2-7: can't read configurations, error -71
[  173.269840] usb 2-7: new full-speed USB device number 77 using xhci_hcd
[  173.282520] usb 2-7: New USB device found, idVendor=04f3, idProduct=016f
[  173.282526] usb 2-7: New USB device strings: Mfr=4, Product=14, SerialNumber=0
[  173.282528] usb 2-7: Product: Touchscreen
[  173.282530] usb 2-7: Manufacturer: ELAN
[  173.282782] usb 2-7: ep 0x2 - rounding interval to 64 microframes, ep desc says 80 microframes
[  173.290920] input: ELAN Touchscreen as /devices/pci:00/:00:14.0/usb2/2-7/2-7:1.0/input/input21

So it seems to be bouncing in and out (but never long enough to actually work). 

Could be a regression in xhci_hcd? 

Happy to provide more info/details/etc 

kevin




Re: ALSA: Conexant CX20585 (thinkpad t510) speaker powersave regression

2013-03-10 Thread Kevin Fenzi
On Sun, 13 Jan 2013 10:06:41 +0100
Takashi Iwai  wrote:

> At Sat, 12 Jan 2013 11:40:24 -0700,
> Kevin Fenzi wrote:
> > 
> > On Tue, 08 Jan 2013 16:26:38 +0100
> > Takashi Iwai  wrote:
> > 
> > > At Sat, 22 Dec 2012 14:23:24 -0700,
> > > Kevin Fenzi wrote:
> > > > 
> > > > Greetings. 
> > > > 
> > > > I've got an issue with sound on my t510. It started in the late
> > > > 3.4.x kernels. Sound works on boot and for 5-10min after, then
> > > > the speakers stop working at all. 
> > > > 
> > > > I posted a while back on the alsa-users list, and someone else
> > > > had the same issue: 
> > > > 
> > > > http://www.mail-archive.com/alsa-user@lists.sourceforge.net/msg28769.html
> > > > 
> > > > It seems that the speakers(?) are getting moved to state D3 and
> > > > turning off. You can manually get it to come back for a few
> > > > minutes by using hda-verb to move it back to D0: 
> > > > 
> > > > hda-verb /dev/snd/hwC0D0 0x1f SET_POWER_STATE 0
> > > > 
> > > > http://www.alsa-project.org/db/?f=fd5ca157b2e76942df5ce2f0ae1a2817f3f08afd
> > > > is my alsa-info. 
> > > > 
> > > > I'd file a bug on the alsa bug tracker, but it still seems to be
> > > > down (for many months now?). No response on alsa-users, so now
> > > > that I have a bit of time, I am posting here in hopes someone
> > > > can look. ;) 
> > > > 
> > > > Happy to provide more info or try things. 
> > > 
> > > It's a known problem that some people have already reported, but
> > > currently no clue who actually turns down the pin to D3.
> > > Conexant guys wrote me that the codec doesn't do it by itself,
> > > and the driver neither, AFAIK.  A possible answer is the
> > > firmware / BIOS, but who knows.
> > > 
> > > In anyway, could you try to trace the hd-audio events and see
> > > whether the power down is *not* issued by the driver when you see
> > > this state? See Documentation/sound/alsa/HD-Audio.txt, the
> > > section "Tracepoints" for a brief instruction.
> > 
> > ok. I am not sure I am doing the right thing, but: 
> > 
> > - echo 1 > /sys/kernel/debug/tracing/events/hda/enable
> > - Run hda-verb to get sound working. 
> > - Play a video to confirm. 
> > - check /sys/kernel/debug/tracing/trace
> > - Wait a while. 
> > - Play another sound that doesn't come out of speakers. 
> > - check /sys/kernel/debug/tracing/trace for anything new. 
> > 
> > The only things I see in the trace file are my hda-verb call, and
> > the sounds playing. Nothing else. 
> > 
> > I then tried to duplicate it, but the trace file didn't seem to
> > update properly. Is there some reset needed? 
> > 
> > Or is this not the info you were looking for?
> > 
> > hda-verb-27187 [001]  78907.305149: hda_power_count: [0:0] power_count=1, power_on=1, power_transition=0
> > hda-verb-27187 [001]  78907.305152: hda_send_cmd: [0:0] val=1f70500
> 
> Yes these are what I'd like to check.

Sorry for the long delay here... was the above info of any use? 

> The point to check here is exactly which verbs have been sent, and how
> is the codec power status.  Check what is the last power_on value when
> the problem appears.  If it's power_on=1, it's fine.
> And you can concentrate on the verb to NID 0x1f, namely, hda_send_cmd
> with a value 1fx.  In the example above, 1f70500 means it's
> setting the power state of 0x1f to D0.
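
The reading of 1f70500 above can be checked by splitting the command word into its fields. This is a minimal sketch that assumes the usual HD-audio layout for 12-bit verbs (codec address in the top 4 bits, then NID, then verb, then an 8-bit payload); commands with a 4-bit verb and 16-bit payload, such as amp gain settings, are laid out differently and are not handled here. The function name is made up.

```python
def decode_hda_cmd(val):
    """Split a 32-bit HD-audio command word with a 12-bit verb into its fields."""
    return {
        "codec": (val >> 28) & 0xF,   # codec address
        "nid": (val >> 20) & 0xFF,    # node ID (widget)
        "verb": (val >> 8) & 0xFFF,   # 12-bit verb
        "param": val & 0xFF,          # 8-bit payload
    }

SET_POWER_STATE = 0x705  # verb ID for "set power state"

cmd = decode_hda_cmd(0x1f70500)
# NID 0x1f, verb 0x705, payload 0 (D0)
print(hex(cmd["nid"]), hex(cmd["verb"]), cmd["param"])
```

A payload of 3 with the same NID and verb would be the request to watch for, since that is D3.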
> 
> Last but not least, you can try my very latest code in sound-unstable
> tree:
>   git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-unstable.git
> 
> Either master or test/hda-migrate branch contains the latest code for
> Conexant codecs.

I'll note that the problem persists in 3.9.0-0.rc1

Please let me know if there's anything I can do to move this along or
try. ;( 

kevin




Re: ALSA: Conexant CX20585 (thinkpad t510) speaker powersave regression

2013-01-12 Thread Kevin Fenzi
On Tue, 08 Jan 2013 16:26:38 +0100
Takashi Iwai  wrote:

> At Sat, 22 Dec 2012 14:23:24 -0700,
> Kevin Fenzi wrote:
> > 
> > Greetings. 
> > 
> > I've got an issue with sound on my t510. It started in the late
> > 3.4.x kernels. Sound works on boot and for 5-10min after, then the
> > speakers stop working at all. 
> > 
> > I posted a while back on the alsa-users list, and someone else had
> > the same issue: 
> > 
> > http://www.mail-archive.com/alsa-user@lists.sourceforge.net/msg28769.html
> > 
> > It seems that the speakers(?) are getting moved to state D3 and
> > turning off. You can manually get it to come back for a few minutes
> > by using hda-verb to move it back to D0: 
> > 
> > hda-verb /dev/snd/hwC0D0 0x1f SET_POWER_STATE 0
> > 
> > http://www.alsa-project.org/db/?f=fd5ca157b2e76942df5ce2f0ae1a2817f3f08afd
> > is my alsa-info. 
> > 
> > I'd file a bug on the alsa bug tracker, but it still seems to be
> > down (for many months now?). No response on alsa-users, so now that
> > I have a bit of time, I am posting here in hopes someone can
> > look. ;) 
> > 
> > Happy to provide more info or try things. 
> 
> It's a known problem that some people have already reported, but
> currently no clue who actually turns down the pin to D3.
> Conexant guys wrote me that the codec doesn't do it by itself, and the
> driver neither, AFAIK.  A possible answer is the firmware / BIOS, but
> who knows.
> 
> In anyway, could you try to trace the hd-audio events and see whether
> the power down is *not* issued by the driver when you see this state?
> See Documentation/sound/alsa/HD-Audio.txt, the section "Tracepoints"
> for a brief instruction.

ok. I am not sure I am doing the right thing, but: 

- echo 1 > /sys/kernel/debug/tracing/events/hda/enable
- Run hda-verb to get sound working. 
- Play a video to confirm. 
- check /sys/kernel/debug/tracing/trace
- Wait a while. 
- Play another sound that doesn't come out of speakers. 
- check /sys/kernel/debug/tracing/trace for anything new. 
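
The steps above can be sketched roughly as follows. Assumptions: debugfs is mounted at /sys/kernel/debug, the script runs as root, and the function names are hypothetical. Truncating the trace file empties the ring buffer, which is one way to start from a clean slate when repeating the test.

```python
import os

TRACING = "/sys/kernel/debug/tracing"  # assumes debugfs is mounted here

def set_hda_events(enabled, tracing=TRACING):
    """Equivalent of: echo 1 > .../events/hda/enable (or 0 to disable)."""
    with open(os.path.join(tracing, "events", "hda", "enable"), "w") as f:
        f.write("1" if enabled else "0")

def clear_trace(tracing=TRACING):
    """Truncate the trace file, which clears the ring buffer."""
    with open(os.path.join(tracing, "trace"), "w"):
        pass

def read_trace(tracing=TRACING):
    """Return the non-comment lines currently in the trace buffer."""
    with open(os.path.join(tracing, "trace")) as f:
        return [line.rstrip("\n") for line in f if not line.startswith("#")]
```

A run would then be: set_hda_events(True), clear_trace(), reproduce the silence, and compare read_trace() before and after.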

The only things I see in the trace file are my hda-verb call, and the
sounds playing. Nothing else. 

I then tried to duplicate it, but the trace file didn't seem to update
properly. Is there some reset needed? 

Or is this not the info you were looking for?

hda-verb-27187 [001]  78907.305149: hda_power_count: [0:0] power_count=1, power_on=1, power_transition=0
hda-verb-27187 [001]  78907.305152: hda_send_cmd: [0:0] val=1f70500
hda-verb-27187 [001]  78907.305214: hda_get_response: [0:0] val=0
hda-verb-27187 [001]  78907.305214: hda_power_count: [0:0] power_count=0, power_on=1, power_transition=0
   alsa-sink-17958 [000]  78933.682478: hda_power_count: [0:0] power_count=1, power_on=1, power_transition=0
   alsa-sink-17958 [000]  78933.683330: hda_power_count: [0:0] power_count=2, power_on=1, power_transition=0
   alsa-sink-17958 [000]  78933.683335: hda_power_count: [0:0] power_count=3, power_on=1, power_transition=0
   alsa-sink-17958 [000]  78933.683336: hda_send_cmd: [0:0] val=103a047
   alsa-sink-17958 [000]  78933.683378: hda_get_response: [0:0] val=0
   alsa-sink-17958 [000]  78933.683379: hda_power_count: [0:0] power_count=2, power_on=1, power_transition=0
   alsa-sink-17958 [000]  78933.683380: hda_power_count: [0:0] power_count=3, power_on=1, power_transition=0
...

kevin




Re: ALSA: Conexant CX20585 (thinkpad t510) speaker powersave regression

2013-01-09 Thread Kevin Fenzi
On Tue, 08 Jan 2013 16:26:38 +0100
Takashi Iwai  wrote:

> It's a known problem that some people have already reported, but
> currently no clue who actually turns down the pin to D3.
> Conexant guys wrote me that the codec doesn't do it by itself, and the
> driver neither, AFAIK.  A possible answer is the firmware / BIOS, but
> who knows.

Thanks for the answer! :) 

> In anyway, could you try to trace the hd-audio events and see whether
> the power down is *not* issued by the driver when you see this state?
> See Documentation/sound/alsa/HD-Audio.txt, the section "Tracepoints"
> for a brief instruction.

I can give it a try... 

kevin




ALSA: Conexant CX20585 (thinkpad t510) speaker powersave regression

2012-12-22 Thread Kevin Fenzi
Greetings. 

I've got an issue with sound on my t510. It started in the late 3.4.x
kernels. Sound works on boot and for 5-10min after, then the speakers
stop working at all. 

I posted a while back on the alsa-users list, and someone else had the
same issue: 

http://www.mail-archive.com/alsa-user@lists.sourceforge.net/msg28769.html

It seems that the speakers(?) are getting moved to state D3 and turning
off. You can manually get it to come back for a few minutes by using
hda-verb to move it back to D0: 

hda-verb /dev/snd/hwC0D0 0x1f SET_POWER_STATE 0

http://www.alsa-project.org/db/?f=fd5ca157b2e76942df5ce2f0ae1a2817f3f08afd
is my alsa-info. 

I'd file a bug on the alsa bug tracker, but it still seems to be down
(for many months now?). No response on alsa-users, so now that I have a
bit of time, I am posting here in hopes someone can look. ;) 

Happy to provide more info or try things. 

kevin




Re: usbserial not working/oops on removal

2007-03-02 Thread Kevin Fenzi
On Sat, 3 Mar 2007 07:45:19 +0100
[EMAIL PROTECTED] (Oliver Neukum) wrote:

> Am Samstag, 3. März 2007 03:37 schrieb Kevin Fenzi:
> > I'm seeing some oddity with the latest fedora development kernel
> > and a usbserial device. 
> 
> Very interesting. Is this repeatable? 

Yep. Tried a half dozen times or so... same thing each time. 

> Does unplugging have the same
> effect?

Well, the card is a minipci, so I would prefer to avoid taking the
machine apart to unplug the card. ;) 

>   Regards
>   Oliver

kevin





usbserial not working/oops on removal

2007-03-02 Thread Kevin Fenzi
I'm seeing some oddity with the latest fedora development kernel and a
usbserial device. 

2.6.20-1.2949.fc7 #1 SMP Mon Feb 26 18:33:03 EST 2007 x86_64 x86_64
x86_64 GNU/Linux

It's an EVDO device. 

Doing: 

modprobe usbserial vendor=0x413c product=0x8128 debug=1

gets: 

drivers/usb/serial/usb-serial.c: Had to override the open usb serial operation with the generic one.
drivers/usb/serial/usb-serial.c: Had to override the write usb serial operation with the generic one.
drivers/usb/serial/usb-serial.c: Had to override the close usb serial operation with the generic one.
drivers/usb/serial/usb-serial.c: Had to override the write_room usb serial operation with the generic one.
drivers/usb/serial/usb-serial.c: Had to override the chars_in_buffer usb serial operation with the generic one.
drivers/usb/serial/usb-serial.c: Had to override the read_bulk_callback usb serial operation with the generic one.
drivers/usb/serial/usb-serial.c: Had to override the write_bulk_callback usb serial operation with the generic one.
drivers/usb/serial/usb-serial.c: USB Serial support registered for generic
drivers/usb/serial/usb-serial.c: static descriptor matches
drivers/usb/serial/usb-serial.c: found interrupt in on endpoint 0
drivers/usb/serial/usb-serial.c: found bulk in on endpoint 1
drivers/usb/serial/usb-serial.c: found bulk out on endpoint 2
usbserial_generic 1-2.2:1.0: generic converter detected
drivers/usb/serial/usb-serial.c: usb_serial_probe - setting up 1 port structures for this device
drivers/usb/serial/usb-serial.c: the device claims to support interrupt in transfers, but read_int_callback is not defined
drivers/usb/serial/usb-serial.c: get_free_serial 1
drivers/usb/serial/usb-serial.c: get_free_serial - minor base = 0
drivers/usb/serial/usb-serial.c: usb_serial_probe - registering ttyUSB255
Attempt to register invalid tty line number  (255).
usb 1-2.2: generic converter now attached to ttyUSB255
drivers/usb/serial/usb-serial.c: static descriptor matches
drivers/usb/serial/usb-serial.c: found bulk in on endpoint 0
drivers/usb/serial/usb-serial.c: found bulk out on endpoint 1
usbserial_generic 1-2.2:1.1: generic converter detected
drivers/usb/serial/usb-serial.c: usb_serial_probe - setting up 1 port structures for this device
drivers/usb/serial/usb-serial.c: get_free_serial 1
drivers/usb/serial/usb-serial.c: get_free_serial - minor base = 1
drivers/usb/serial/usb-serial.c: usb_serial_probe - registering ttyUSB255
usb-serial ttyUSB255: Error registering port device, continuing
usbcore: registered new interface driver usbserial_generic
drivers/usb/serial/usb-serial.c: USB Serial Driver core
usbcore: deregistering interface driver usbserial_generic
drivers/usb/serial/usb-serial.c: usb_serial_disconnect
drivers/usb/serial/usb-serial.c: destroy_serial - generic
drivers/usb/serial/generic.c: usb_serial_generic_shutdown
drivers/usb/serial/generic.c: generic_cleanup - port 255
drivers/usb/serial/usb-serial.c: return_serial

On an updated fc6 kernel it works fine and gives me a ttyUSB0, ttyUSB1.  
Trying to rmmod the module gets: 

Unable to handle kernel NULL pointer dereference at 0048 RIP: 
 [] klist_del+0x16/0x50
PGD 626f0067 PUD 601bc067 PMD 0 
Oops:  [1] SMP 
last sysfs file: /class/net/eth0/carrier
CPU 1 
Modules linked in: usbserial kvm_intel kvm i915 drm autofs4 hidp rfcomm l2cap 
sunrpc nf_conntrack_netbios_ns ipt_REJECT nf_conntrack_ipv4 xt_state 
nf_conntrack nfnetlink xt_tcpudp iptable_filter ip_tables x_tables acpi_cpufreq 
dm_multipath video sbs i2c_ec button bay dock battery asus_acpi ac ipv6 
parport_pc lp parport aes cbc blkcipher sha256 dm_crypt snd_hda_intel 
snd_hda_codec snd_seq_dummy hci_usb bluetooth snd_seq_oss snd_seq_midi_event 
rtc_cmos fw_ohci snd_seq tg3 rtc_core fw_core serio_raw snd_seq_device rtc_lib 
snd_pcm_oss iTCO_wdt iTCO_vendor_support snd_mixer_oss snd_pcm snd_timer snd 
soundcore shpchp i2c_i801 snd_page_alloc i2c_core sr_mod cdrom sg joydev 
dm_snapshot dm_zero dm_mirror dm_mod ata_piix ata_generic libata sd_mod 
scsi_mod ext3 jbd mbcache ehci_hcd ohci_hcd uhci_hcd
Pid: 3256, comm: rmmod Not tainted 2.6.20-1.2949.fc7 #1
RIP: 0010:[]  [] klist_del+0x16/0x50
RSP: 0018:8100606d1c88  EFLAGS: 00010296
RAX: 8100760cf2b8 RBX:  RCX: 0001
RDX: 81004f4c6778 RSI: 0001 RDI: 
RBP: 8100606d1ca8 R08: 022a R09: 0001
R10: 884479d2 R11: 00300018 R12: 8100760cf4a8
R13: 81004f4c6768 R14: 81007e386710 R15: 81007e386710
FS:  2b0136f0() GS:810003f5fcc0() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 0048 CR3: 5e918000 CR4: 26e0
Process rmmod (pid: 3256, threadinfo 8100606d, task 810065365080)
Stack:  0001 8100760cf458 8100760cf458 81004f4c6768
 8100606d1cd8 803b7d60 8100760cf458 81004f4c6768
 81004f4c6768 00

Re: X and 2.4.0 problem (video bios probing?) (SOLUTION!)

2001-01-05 Thread Kevin Fenzi


Duh. 

I figured out the problem. 2.4.0-test13-pre3 introduced the shmall
sysctl. I had installed a package called powertweak a while back, and
it looks like powertweak sets any sysctl it doesn't know to 0. 

So, the problem was that there was no shared memory for X. ;( 

I set that up to a reasonable level and all is well. 
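
For anyone who wants to check for the same condition, here is a minimal
sketch in Python. It assumes the value is exposed at
/proc/sys/kernel/shmall (the usual procfs location) and is counted in
pages; the function names are made up for illustration and are not part
of powertweak or any other tool.

```python
from pathlib import Path

# Usual procfs location of the sysctl; treated here as an assumption.
SHMALL_PATH = Path("/proc/sys/kernel/shmall")


def shm_disabled(raw: str) -> bool:
    """Return True if the shmall value (in pages) allows no shared memory."""
    return int(raw.strip()) == 0


def check_shmall(path: Path = SHMALL_PATH) -> str:
    """Describe the current shmall setting in one line."""
    try:
        raw = path.read_text()
    except OSError:
        return "shmall not readable (not Linux, or sysctl absent)"
    if shm_disabled(raw):
        return "kernel.shmall is 0: no shared memory available"
    return f"kernel.shmall = {raw.strip()} pages"


if __name__ == "__main__":
    print(check_shmall())
```

If it reports 0, writing a nonzero value back to the same file as root
should restore shared memory; what counts as reasonable depends on the
machine.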

sorry for the wild goose chase. :(

kevin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: X and 2.4.0 problem (video bios probing?) (more info)

2001-01-05 Thread Kevin Fenzi

> "Alan" == Alan Cox <[EMAIL PROTECTED]> writes:

>> (II) Loading /usr/X11R6/lib/modules/linux/libint10.a (II) Module
>> int10: vendor="The XFree86 Project" compiled for 4.0.1a, module
>> version = 1.0.0 ABI class: XFree86 Video Driver, version 0.2 (EE)
>> ATI(0): Unable to initialise int10 interface.

Alan> Thats the critical bit but it isnt directly a kernel thing. Im
Alan> not sure why it should have failed. Do you have different
Alan> .config options (eg ATI fb options ?)

By the process of elimination, I have determined that my problem
appears in the 2.4.0-test13-pre3 patch. 

2.4.0-test13-pre2 X works fine, in pre3 and beyond it fails. 
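
The elimination above amounts to a binary search over an ordered list
of kernels. A rough sketch in Python, assuming the first kernel in the
list is known good and the last is known bad; the version list and the
is_bad test are illustrative stand-ins (in practice is_bad means "boot
that kernel and see whether X starts"), and the list is not meant to be
a complete set of releases.

```python
def first_bad(versions, is_bad):
    """Return the first version for which is_bad() is true.

    Assumes versions[0] is good, versions[-1] is bad, and that every
    version after the first bad one is bad as well.
    """
    lo, hi = 0, len(versions) - 1  # versions[lo] is good, versions[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid
        else:
            lo = mid
    return versions[hi]


if __name__ == "__main__":
    # Illustrative list only.
    kernels = [
        "2.4.0-test12",
        "2.4.0-test13-pre1",
        "2.4.0-test13-pre2",
        "2.4.0-test13-pre3",
        "2.4.0-test13-pre4",
        "2.4.0",
    ]
    # Stand-in for the manual test, using the result reported above.
    broken = set(kernels[3:])
    print(first_bad(kernels, lambda v: v in broken))
```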

Anyone able to spot anything in the pre3 patch that might be causing
this? 

thanks, 

kevin



Re: X and 2.4.0 problem (video bios probing?)

2001-01-05 Thread Kevin Fenzi

> "Alan" == Alan Cox <[EMAIL PROTECTED]> writes:

>> (II) Loading /usr/X11R6/lib/modules/linux/libint10.a (II) Module
>> int10: vendor="The XFree86 Project" compiled for 4.0.1a, module
>> version = 1.0.0 ABI class: XFree86 Video Driver, version 0.2 (EE)
>> ATI(0): Unable to initialise int10 interface.

Alan> Thats the critical bit but it isnt directly a kernel thing. Im
Alan> not sure why it should have failed. Do you have different
Alan> .config options (eg ATI fb options ?)

nope. I did do a 'make oldconfig' after patching to 2.4.0 (final), but
all the other options are the same. 

I was not using the ATI fb in either case. I had it built as a module,
"ATI Rage 128 display support (EXPERIMENTAL)" (that I didn't load)
and I did have "VESA VGA graphics console" enabled. 

kevin



X and 2.4.0 problem (video bios probing?)

2001-01-05 Thread Kevin Fenzi


Hi there. I have been running the 2.4.0-test kernels on my laptop, and
upgraded to 2.4.0 as soon as it came out. A problem quickly arose: 

X won't start. 

Under 2.4.0-test12 it works fine. 

Under 2.4.0 it exits after trying to load the int10 module...

from 2.4.0-test12 (and all others):

(II) Loading sub module "int10"
(II) LoadModule: "int10"
(II) Loading /usr/X11R6/lib/modules/linux/libint10.a
(II) Module int10: vendor="The XFree86 Project"
compiled for 4.0.1a, module version = 1.0.0
ABI class: XFree86 Video Driver, version 0.2
(II) ATI(0): Primary V_BIOS segment is: 0xc000
(II) Loading sub module "ddc"
(II) LoadModule: "ddc"
(II) Loading /usr/X11R6/lib/modules/libddc.a
(II) Module ddc: vendor="The XFree86 Project"
compiled for 4.0.1a, module version = 1.0.0
ABI class: XFree86 Video Driver, version 0.2
(II) Loading sub module "vbe"
(II) LoadModule: "vbe"
(II) Loading /usr/X11R6/lib/modules/libvbe.a
(II) Module vbe: vendor="The XFree86 Project"
compiled for 4.0.1a, module version = 1.0.0
ABI class: XFree86 Video Driver, version 0.2
(II) ATI(0): VESA Bios detected
(II) ATI(0): VESA VBE Version 2.0
(II) ATI(0): VESA VBE Total Mem: 8128 kB
(II) ATI(0): VESA VBE OEM: ATI MACH64
(II) ATI(0): VESA VBE OEM Software Rev: 1.0
(II) ATI(0): VESA VBE OEM Vendor: ATI Technologies Inc.
(II) ATI(0): VESA VBE OEM Product: MACH64RM
(II) ATI(0): VESA VBE OEM Product Rev: 01.00
(II) ATI(0): VESA VBE DDC supported
(II) ATI(0): VESA VBE DDC Level none
(II) ATI(0): VESA VBE DDC transfer in appr. 2 sec.
(==) ATI(0): Chipset:  "ati".
(--) ATI(0): ATI 3D Rage Mobility graphics controller detected.
(--) ATI(0): Chip type 4C4D "LM", version 4, foundry TSMC, class 0, revision 0x01.
(--) ATI(0): AGP bus interface detected;  block I/O base is 0x2000.
(--) ATI(0): ATI Mach64 adapter detected.
(--) ATI(0): Internal RAMDAC (subtype 1) detected.
(--) ATI(0): 1400x1050 panel (ID 6) detected.
(--) ATI(0): Panel model IBM ITSX93.
(--) ATI(0): Panel clock is 107.859 MHz.

under 2.4.0 (final):

(II) Setting vga for screen 0.
(II) Loading sub module "int10"
(II) LoadModule: "int10"
(II) Loading /usr/X11R6/lib/modules/linux/libint10.a
(II) Module int10: vendor="The XFree86 Project"
compiled for 4.0.1a, module version = 1.0.0
ABI class: XFree86 Video Driver, version 0.2
(EE) ATI(0): Unable to initialise int10 interface.
(II) UnloadModule: "ati"
(II) UnloadModule: "int10"
(II) Unloading /usr/X11R6/lib/modules/linux/libint10.a
(EE) Screen(s) found, but none have a usable configuration.

Fatal server error:
no screens found

Things I have ruled out:

- Compiled with redhat 7's gcc 2.96 snapshot, and also with kgcc. No
difference. 

- Did a cat /proc/pci, lspci, and cat /proc/interrupts and compared
2.4.0 and 2.4.0-test12 and they were identical. 

- Without changing anything besides the kernel X works. If I am in
2.4.0-test12 it's fine, if I am in 2.4.0 (final) it fails. 
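
The /proc comparison in the list above can be repeated mechanically. A
small sketch in Python, assuming the output of each command was saved
as text under both kernels; the function name and the labels are
illustrative only.

```python
import difflib


def diff_captures(old_text, new_text,
                  old_label="2.4.0-test12", new_label="2.4.0"):
    """Return unified-diff lines between two saved command outputs.

    An empty list means the two captures are identical.
    """
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_label,
        tofile=new_label,
        lineterm="",
    ))


if __name__ == "__main__":
    # Placeholder strings; real input would be the saved output of
    # cat /proc/pci, lspci, or cat /proc/interrupts from each kernel.
    old = "line one\nline two\n"
    new = "line one\nline two\n"
    print(diff_captures(old, new) or "identical")
```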

Further debugging, requests for logs, patches, or ideas are welcome. 

This laptop is a dell inspiron 5000 (not 5000e). It has worked just
fine with X until now. It's a redhat 7 + all errata. It has
XFree86-4.0.1a on it. 

thanks for any suggestions. 

kevin