Re: [GIT PULL] bcache changes for 3.17
On 09/05/2014 07:10 PM, Jens Axboe wrote:
> On 09/05/2014 11:03 AM, Arne Wiebalck wrote:
>>
>> On Sep 5, 2014, at 6:41 PM, Peter Kieser
>> wrote:
>>
>>>
>>> On 2014-09-05 8:37 AM, Eddie Chapman wrote:
>>>> On 05/09/14 15:17, Jens Axboe wrote:
>>>>> (from oldest to newest). And that's just from 3.16 to 3.17-rc3, going
>>>>> all the way back to 3.10 would be a lot of work. If there's anyone that
>>>>> cares about bcache on stable kernels (and actually use it), now would be
>>>>> a good time to pipe up.
>>>>
>>>> Just "piping up" as I care about bcache and actually use it in
>>>> production on 3.10! Shame I don't have the knowledge to try and
>>>> backport these though :-)
>>>>
>>>> Eddie
>>>
>>> I'm "piping up" as well, I use bcache on 3.10 in production.
>>>
>>> -Peter
>>>
>>
>>
>> More "piping up": we currently use bcache on a few nodes in production, on
>> 3.14 and 3.15, and plan to roll it out on a wider scale now.
>> If necessary we'll go with these kernels, but we'd certainly prefer our
>> usual 3.10-based CentOS kernel.
>
> OK, so we definitely have people using it in production. My concern was
> that whomever does the backport of the appropriate patches to 3.10/14/15
> stable would have an audience for getting some amount of testing of such
> a patch series.
>
> Now we just need someone to line up to do the work...
>

OK, it's becoming insane: my system crashes every 2 days. Any process that attempts a write to the disk gets stuck, and the CPUs are at 100%.

So I can try to backport the fixes that address the following oops to kernel 3.14, but someone has to point me to the corresponding commits since I don't know bcache.

Thanks.

BUG: soft lockup - CPU#0 stuck for 22s!
[bcache_gc:152] Modules linked in: tun xt_nat xt_tcpudp mmc_block btrfs raid6_pq xor ses enclosure usb_storage veth xt_addrtype xt_conntrack ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack bridge stp llc dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c loop dm_mod iptable_filter ip_tables x_tables hid_generic usbhid hid ctr ccm fuse joydev mousedev coretemp hwmon arc4 iwldvm led_class nls_iso8859_1 nls_cp437 vfat mac80211 fat intel_rapl x86_pkg_temp_thermal iTCO_wdt intel_powerclamp iTCO_vendor_support kvm_intel snd_hda_codec_hdmi kvm snd_hda_codec_via snd_hda_codec_generic crct10dif_pclmul iwlwifi crc32_pclmul crc32c_intel btusb ghash_clmulni_intel bluetooth aesni_intel aes_x86_64 cfg80211 lrw snd_hda_intel gf128mul glue_helper ablk_helper 6lowpan_iphc cryptd r8169 snd_hda_codec psmouse rtsx_pci_ms i2c_i801 snd_hwdep serio_raw rfkill memstick mii snd_pcm wmi snd_timer snd evdev tpm_infineon mei_me tpm_tis mei tpm soundcore shpchp mac_hid lpc_ich battery ac processor thermal sch_fq_codel nfs lockd sunrpc fscache ext4 crc16 mbcache jbd2 bcache sd_mod sr_mod crc_t10dif cdrom crct10dif_common rtsx_pci_sdmmc mmc_core atkbd libps2 ahci libahci libata ehci_pci xhci_hcd ehci_hcd scsi_mod rtsx_pci usbcore usb_common i8042 serio i915 video button intel_gtt i2c_algo_bit drm_kms_helper drm i2c_core CPU: 0 PID: 152 Comm: bcache_gc Not tainted 3.14.30-1-lts #1 Hardware name: CLEVO CO.W55xEU /W55xEU , BIOS 4.6.5 03/05/2013 task: 880406b1a780 ti: 88040461e000 task.ti: 88040461e000 RIP: 0010:[] [] bch_extent_bad+0x122/0x1d0 [bcache] RSP: 0018:88040461fa90 EFLAGS: 0207 RAX: 9081 RBX: a04439b9 RCX: c90017452000 RDX: c90017468f38 RSI: 7a6b5813 RDI: 88007ff2 RBP: 88040461fac0 R08: 0013 R09: 0008 R10: 07ff R11: 880405fe8000 R12: 8804055b08a0 R13: 8804055b08a0 R14: 880404844760 R15: 0018 FS: () GS:88041e20() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 7f1b36926007 CR3: 0280c000 CR4: 001427e0 Stack: 88040461faa0 880404844760 
88040461fc48 a043ba80 8804055b08a0 880405e2dc60 88040461fad0 a043ba8a 88040461fb00 a043b879 08e8 8804055b08a0 Call Trace: [] ? bch_ptr_invalid+0x10/0x10 [bcache] [] bch_ptr_bad+0xa/0x10 [bcache] [] bch_btree_iter_next_filter+0x29/0x50 [bcache] [] btree_gc_recurse+0x175/0xc10 [bcache] [] ? bch_btree_keys_stats+0xf0/0xf0 [bcache] [] ? __bch_btree_ptr_invalid+0xa5/0xc0 [bcache] [] ? bch_btree_keys_stats+0xf0/0xf0 [bcache] [] ? btree_gc_mark_node+0x73/0x230 [bcache] [] bch_btree_gc+0x50f/0x690 [bcache] [] ? try_to_wake_up+0x20c/0x2d0 [] ? __wake_up_sync+0x20/0x20 [] bch_gc_thread+0x48/0x130 [bcache] [] ? bch_btree_gc+0x690/0x690 [bcache] [] kthread+0xea/0x100 [] ? kthread_create_on_node+0x1a0/0x1a0 [] ret_from_fork+0x7c/0xb0 [] ? kthread_create_on_node+0x1a0/0x1a0 Code: 00 00 4c 8b 84 d7 40 0c 00 00 48 89 f2 48 c1 ea 08 4c 21 fa 48 d3 ea 49 8b 88 00 0b 00 00 48 8d 14 52 48 8d 14 91 44 0f b6 42 06 <41> 29 f0 41 80 f8 80 7
kernel 3.17.1: fail to use USB3 device after resuming from suspend
Hello, After resuming from a suspend (to RAM), I can't use an external USB hard drive anymore, the kernel seems to fail to detect it. Here is the kernel log when doing a suspend/resume cycle. [Oct23 22:03] wlp2s0: deauthenticating from 92:23:b1:f9:54:e4 by local choice (Reason: 3=DEAUTH_LEAVING) [ +0.025152] cfg80211: Calling CRDA to update world regulatory domain [ +0.052636] IPv6: ADDRCONF(NETDEV_UP): wlp2s0: link is not ready [ +0.071175] PM: Syncing filesystems ... done. [ +0.099223] PM: Preparing system for mem sleep [ +0.000347] Freezing user space processes ... (elapsed 0.001 seconds) done. [ +0.001406] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. [ +0.001083] PM: Entering mem sleep [ +0.17] Suspending console(s) (use no_console_suspend to debug) [ +0.000264] sd 4:0:0:0: [sdb] Synchronizing SCSI cache [ +0.36] sd 4:0:0:0: [sdb] Stopping disk [ +0.04] sd 0:0:0:0: [sda] Synchronizing SCSI cache [ +0.49] sd 0:0:0:0: [sda] Stopping disk [ +1.058316] PM: suspend of devices complete after 1057.429 msecs [ +0.013434] PM: late suspend of devices complete after 13.410 msecs [ +0.000875] ehci-pci :00:1d.0: System wakeup enabled by ACPI [ +0.13] r8169 :03:00.2: System wakeup enabled by ACPI [ +0.000101] ehci-pci :00:1a.0: System wakeup enabled by ACPI [ +0.47] xhci_hcd :00:14.0: System wakeup enabled by ACPI [ +0.012356] PM: noirq suspend of devices complete after 13.373 msecs [ +0.000401] ACPI: Preparing to enter system sleep state S3 [ +0.002149] PM: Saving platform NVS memory [ +0.06] Disabling non-boot CPUs ... 
[ +0.75] intel_pstate CPU 1 exiting [ +0.001363] kvm: disabling virtualization on CPU1 [ +0.28] smpboot: CPU 1 is now offline [ +0.000436] intel_pstate CPU 2 exiting [ +0.001348] kvm: disabling virtualization on CPU2 [ +0.100773] smpboot: CPU 2 is now offline [ +0.000322] intel_pstate CPU 3 exiting [ +0.001260] kvm: disabling virtualization on CPU3 [ +0.101855] smpboot: CPU 3 is now offline [ +0.000246] intel_pstate CPU 4 exiting [ +0.001181] kvm: disabling virtualization on CPU4 [ +0.102021] smpboot: CPU 4 is now offline [ +0.000396] intel_pstate CPU 5 exiting [ +0.001242] kvm: disabling virtualization on CPU5 [ +0.101801] smpboot: CPU 5 is now offline [ +0.000292] intel_pstate CPU 6 exiting [ +0.001301] kvm: disabling virtualization on CPU6 [ +0.101880] smpboot: CPU 6 is now offline [ +0.000496] intel_pstate CPU 7 exiting [ +0.001265] kvm: disabling virtualization on CPU7 [ +0.101649] smpboot: CPU 7 is now offline [ +0.002022] ACPI: Low-level resume complete [ +0.43] PM: Restoring platform NVS memory [ +0.000342] Enabling non-boot CPUs ... 
[ +0.47] x86: Booting SMP configuration: [ +0.02] smpboot: Booting Node 0 Processor 1 APIC 0x2 [ +0.011516] kvm: enabling virtualization on CPU1 [ +0.002301] CPU1 is up [ +0.25] smpboot: Booting Node 0 Processor 2 APIC 0x4 [ +0.011466] kvm: enabling virtualization on CPU2 [ +0.002307] CPU2 is up [ +0.22] smpboot: Booting Node 0 Processor 3 APIC 0x6 [ +0.011469] kvm: enabling virtualization on CPU3 [ +0.002305] CPU3 is up [ +0.22] smpboot: Booting Node 0 Processor 4 APIC 0x1 [ +0.011483] kvm: enabling virtualization on CPU4 [ +0.002298] CPU4 is up [ +0.18] smpboot: Booting Node 0 Processor 5 APIC 0x3 [ +0.011437] kvm: enabling virtualization on CPU5 [ +0.002305] CPU5 is up [ +0.17] smpboot: Booting Node 0 Processor 6 APIC 0x5 [ +0.011550] kvm: enabling virtualization on CPU6 [ +0.002297] CPU6 is up [ +0.17] smpboot: Booting Node 0 Processor 7 APIC 0x7 [ +0.011457] kvm: enabling virtualization on CPU7 [ +0.002312] CPU7 is up [ +0.006813] ACPI: Waking up from system sleep state S3 [ +0.046539] ehci-pci :00:1d.0: System wakeup disabled by ACPI [ +0.000267] ehci-pci :00:1a.0: System wakeup disabled by ACPI [ +0.000123] xhci_hcd :00:14.0: System wakeup disabled by ACPI [ +0.53] PM: noirq resume of devices complete after 13.005 msecs [ +0.000523] PM: early resume of devices complete after 0.479 msecs [ +0.000120] mei_me :00:16.0: irq 28 for MSI/MSI-X [ +0.47] r8169 :03:00.2: System wakeup disabled by ACPI [ +0.97] snd_hda_intel :00:1b.0: irq 29 for MSI/MSI-X [ +0.003073] rtc_cmos 00:02: System wakeup disabled by ACPI [ +0.008776] sd 4:0:0:0: [sdb] Starting disk [ +0.09] sd 0:0:0:0: [sda] Starting disk [ +0.053104] r8169 :03:00.2 enp3s0f2: link down [ +0.254016] usb 1-4: reset full-speed USB device number 2 using xhci_hcd [ +0.16] xhci_hcd :00:14.0: Setup ERROR: setup context command for slot 1. 
[ +0.03] usb 1-4: hub failed to enable device, error -22 [ +0.019787] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300) [ +0.002347] ata3.00: configured for UDMA/100 [ +0.004325] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [ +0.014796] ata5.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded [ +0.
Re: oops on kernel 3.14.17 seems related to EFI
Hello Matt,

On 09/05/2014 02:09 PM, Matt Fleming wrote:
> (Adding linux-efi)
>
> On Fri, 05 Sep, at 08:51:31AM, Francis Moreau wrote:
>> [ +0.45] RIP: 0010:[<>] [< (null)>]
>> (null)
>> [ +0.46] RSP: 0018:8800b4001da8 EFLAGS: 00010002
>> [ +0.32] RAX: 80050033 RBX: 880406288000 RCX:
>> 880406288000
>> [ +0.41] RDX: 880406288400 RSI: 880406288000 RDI:
>>
>> [ +0.42] RBP: 8800b4001e80 R08: R09:
>> 8800b4001ec0
>> [ +0.42] R10: R11: 0246 R12:
>> 880406288400
>> [ +0.42] R13: R14: 8800b4001ec0 R15:
>> 0009b000
>> [ +0.42] FS: 7f7720a567c0() GS:88041e2c()
>> knlGS:
>> [ +0.48] CS: 0010 DS: ES: CR0: 80050033
>> [ +0.34] CR2: CR3: 0009b000 CR4:
>> 001427e0
>> [ +0.42] Stack:
>> [ +0.14] 81063281 811bd95c 0246
>> 8800da9e6628
>> [ +0.52] 8003
>> 8800b4001e50
>> [ +0.51] 80050033 7f7722abbc50 7f7722abbb50
>> 00ff
>> [ +0.52] Call Trace:
>> [ +0.22] [] ? efi_call5+0x71/0xf0
>> [ +0.35] [] ? getname_flags+0x2c/0x130
>> [ +0.37] [] ? virt_efi_get_variable+0x49/0x60
>> [ +0.51] [] efivar_entry_size+0x41/0x80
>> [ +0.30] [] efivarfs_file_read+0x49/0x100
>> [ +0.46] [] vfs_read+0x97/0x160
>> [ +0.41] [] SyS_read+0x59/0xd0
>> [ +0.41] [] system_call_fastpath+0x16/0x1b
>
> This looks like efi.systab->runtime->get_variable is NULL.
>
> Could you send a copy of the dmesg buffer? It might contain some info to
> explain this issue.
>

It happened again, and it really does seem related to the loop device: it always happens after (but not immediately after) I set up or delete loop devices, and maybe after hibernating too.

I attached the dmesg output captured after the bug triggered again.

Thanks

oops.gz Description: application/gzip
Why is max_part default value 0 for loop devices ?
Hello, I'm wondering why max_part parameter for loop device is 0 by default. Also would it make sense to allow one to change this default value at kernel build time ? Thanks. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
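For readers who want to check the value on their own machine, here is a minimal shell sketch. It assumes the loop driver exposes its parameter at /sys/module/loop/parameters/max_part; the helper name `loop_max_part` and its optional path argument are made up for illustration and are not part of any tool.

```shell
#!/bin/sh
# Print the current max_part value of the loop driver.
# The optional argument points at another file (handy for testing);
# the default sysfs path is an assumption about the running system.
loop_max_part() {
    param_file="${1:-/sys/module/loop/parameters/max_part}"
    if [ -r "$param_file" ]; then
        cat "$param_file"
    else
        echo "unknown"
    fi
}

echo "loop max_part: $(loop_max_part)"
```

Short of a build-time option, the default can be overridden at module load time with `modprobe loop max_part=8`, or with `loop.max_part=8` on the kernel command line when the driver is built in. For a single device, `losetup -P` requests a partition scan.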
Re: [GIT PULL] bcache changes for 3.17
On 09/05/2014 11:45 PM, Greg KH wrote:
> On Fri, Sep 05, 2014 at 09:31:06AM +0200, Francis Moreau wrote:
>> On 08/10/2014 09:54 AM, Peter Kieser wrote:
>>>
>>> On 2014-08-05 9:58 AM, Jens Axboe wrote:
>>>> On 08/04/2014 10:33 PM, Kent Overstreet wrote:
>>>>> Hey Jens, here's the pull request for 3.17 - typically late, but lots of
>>>>> tasty
>>>>> fixes in this one :)
>>>> Normally I'd say no, but since it's basically just fixes, I guess we can
>>>> pull it in. But generally, it has to be in my hands a week before this,
>>>> so it can simmer a bit in for-next before going in...
>>>>
>>> Are these fixes going to be backported to 3.10 or other stable releases?
>>>
>>
>> Could you please answer this question ?
>>
>> If you don't want to maintain bcache for stable kernels (I can
>> understand that), can you mark it at least as unstable/experimental
>> stuff since it really is ?
>
> WTF?
>
> Just because a maintainer/developer doesn't want to do anything for the
> stable kernel releases does _NOT_ mean the code is
> "unstable/experimental" at all.
>
> That's not how stable kernel releases work. _IF_ a maintainer wants to
> / has the time to, they can mark patches for inclusion in stable kernel
> releases. Given the huge list of patches that Jens just posted, I doubt
> that those are really something I would ever take for a stable kernel
> release.
>
> Please read Documentation/stable_kernel_rules.txt for more details
> please. And don't ask others to do backporting work for you, it's not
> ok, and is something that I have always said is never required, and is
> not going to be.
>

Wow, not sure why I deserve such anger...

Looks like you haven't understood me well, and especially I *never* asked others to do the backporting for me.

Please reread the thread; perhaps peaceful music can help too.
Re: [PATCH 3.10.y+] PM / sleep: Use valid_state() for platform-dependent sleep states only
On 09/05/2014 09:45 AM, Brian Norris wrote:
> On Fri, Sep 05, 2014 at 08:29:09AM +0200, Francis Moreau wrote:
>> On 09/04/2014 11:21 PM, Brian Norris wrote:
> [...]
>>> Signed-off-by: Rafael J. Wysocki
>>> Cc: # 3.10+: 27ddcc6596e5: PM / sleep: Add state
>>> field to pm_states[] entries
>>> Cc: # 3.10+
>>> ---
>>> This is a backport request for these two commits upstream:
>>>
>>> 27ddcc6596e5 PM / sleep: Add state field to pm_states[] entries
>>> 43e8317b0bba PM / sleep: Use valid_state() for platform-dependent sleep
>>> states only
>>>
>>
>> Wouldn't it be cleaner to have 2 separate backports then ?
>
> The first is purely a dependency for the second. It has no value on its
> own. So I thought the above form made sense and followed the process
> mentioned in Documentation/stable_kernel_rules.txt.
>
> Admittedly, it's a little asymmetric. But I really don't know what the
> "best" option is, since I'd prefer not having to send around any patch
> text at all, unless the backport is not trivial (these apply cleanly).

I don't know, I just find it cleaner to cherry-pick upstream commits when possible, so I can retrieve them easily later when inspecting a stable kernel.

My 2 cents.
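The cherry-pick approach mentioned in the message above can be sketched as follows. This is only an illustration: the helper `backport_plan` is hypothetical and merely prints commands, and it assumes a git checkout of a stable branch with the upstream history available. The two commit IDs are the ones named in the backport request. `git cherry-pick -x` appends a "(cherry picked from commit ...)" line to each commit message, which is what makes the upstream origin easy to find later.

```shell
#!/bin/sh
# Print one "git cherry-pick -x" command per upstream commit, in the order
# given (dependencies first). Nothing is executed here; the output can be
# reviewed and then piped to sh inside the stable tree.
backport_plan() {
    for commit in "$@"; do
        printf 'git cherry-pick -x %s\n' "$commit"
    done
}

# Dependency first, then the actual fix.
backport_plan 27ddcc6596e5 43e8317b0bba
```

Once applied this way, something like `git log --grep='cherry picked from commit 43e8317b0bba'` on the stable branch should locate the backported commit.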
Re: [GIT PULL] bcache changes for 3.17
On 09/05/2014 04:17 PM, Jens Axboe wrote:
>
> We need to do something about this. From this latest pull, looks like
> all should go to stable:
>
> 5b1016e62f74c53e0330403025954c8d95384c03
> 9aa61a992acceeec0d1de2cd99938421498659d5
> dbd810ab678d262d3772d29b65844d7b20dc47bc
> 8b326d3a2a76912dfed2f0ab937d59fae9512ca2
> e5112201c1285841f8b565ece5d6ae7e0d7947a2
> a664d0f05a2ec02c8f042db536d84d15d6e19e81
> c5aa4a3157b55bdca18dd2a9d9f43314470b6d32
> 9e5c353510b26500bd6b8309823ac9ef2837b761
> bcf090e0040e30f8409e6a535a01e6473afb096f
> 501d52a90cbe652b41336c206ff0e95799d5a9b5
> 8e0948080670f6330229718b15a6a1a011d441ce
> 60ae81eee86dd7a520db8c1e3d702b49fc0418b5
> 913dc33fb2720fb5f979011664294137ddd8b13b
> 6b708de64adb6dc8319e7aeac922b46904fbeeec
> 400ffaa2acd72274e2c7293a9724382383bebf3e
> d83353b319d47ef8cce82467da6a25c2d558253f
> bf0c55c986540483c34ca640f2eef4c3314388b1
> c9a78332b42cbdcdd386a95192a716b67d1711a4
> 2452cc89063a2a6890368f185c4b6d7d8802175b
> 25abade29616d42d60f9bd5e6a5ad07f7314e39e
> 5b25abade29616d42d60f9bd5e6a5ad07f7314e3
> 789d21dbd9d8889e62c79ec19585fcc97e42ef07
> 0781c8748cf1ea2b0dcd966571103909528c4efa
>
> (from oldest to newest). And that's just from 3.16 to 3.17-rc3, going
> all the way back to 3.10 would be a lot of work. If there's anyone that
> cares about bcache on stable kernels (and actually use it), now would be
> a good time to pipe up.
>

Then if it's too much work, it just confirms what was asked previously: bcache is still experimental, so mark it as such for stable kernels. I wouldn't have used bcache in that case.

Thanks
Re: oops on kernel 3.14.17 seems related to EFI
On 09/05/2014 02:09 PM, Matt Fleming wrote:
> (Adding linux-efi)
>
> On Fri, 05 Sep, at 08:51:31AM, Francis Moreau wrote:
>> [ +0.45] RIP: 0010:[<>] [< (null)>]
>> (null)
>> [ +0.46] RSP: 0018:8800b4001da8 EFLAGS: 00010002
>> [ +0.32] RAX: 80050033 RBX: 880406288000 RCX:
>> 880406288000
>> [ +0.41] RDX: 880406288400 RSI: 880406288000 RDI:
>>
>> [ +0.42] RBP: 8800b4001e80 R08: R09:
>> 8800b4001ec0
>> [ +0.42] R10: R11: 0246 R12:
>> 880406288400
>> [ +0.42] R13: R14: 8800b4001ec0 R15:
>> 0009b000
>> [ +0.42] FS: 7f7720a567c0() GS:88041e2c()
>> knlGS:
>> [ +0.48] CS: 0010 DS: ES: CR0: 80050033
>> [ +0.34] CR2: CR3: 0009b000 CR4:
>> 001427e0
>> [ +0.42] Stack:
>> [ +0.14] 81063281 811bd95c 0246
>> 8800da9e6628
>> [ +0.52] 8003
>> 8800b4001e50
>> [ +0.51] 80050033 7f7722abbc50 7f7722abbb50
>> 00ff
>> [ +0.52] Call Trace:
>> [ +0.22] [] ? efi_call5+0x71/0xf0
>> [ +0.35] [] ? getname_flags+0x2c/0x130
>> [ +0.37] [] ? virt_efi_get_variable+0x49/0x60
>> [ +0.51] [] efivar_entry_size+0x41/0x80
>> [ +0.30] [] efivarfs_file_read+0x49/0x100
>> [ +0.46] [] vfs_read+0x97/0x160
>> [ +0.41] [] SyS_read+0x59/0xd0
>> [ +0.41] [] system_call_fastpath+0x16/0x1b
>
> This looks like efi.systab->runtime->get_variable is NULL.
>
> Could you send a copy of the dmesg buffer? It might contain some info to
> explain this issue.

Unfortunately, I haven't kept a copy of it.

>
> Also, is this a regression? If so it would be excellent if you could
> pinpoint the commit that causes the problem, using git bisect.
>

I've been stuck on 3.14 for a while now (later kernels oops for other reasons), so I don't really remember. I would say no, but I'm really not sure.

Thanks
Re: [GIT PULL] bcache changes for 3.17
On 08/10/2014 09:54 AM, Peter Kieser wrote:
>
> On 2014-08-05 9:58 AM, Jens Axboe wrote:
>> On 08/04/2014 10:33 PM, Kent Overstreet wrote:
>>> Hey Jens, here's the pull request for 3.17 - typically late, but lots of
>>> tasty
>>> fixes in this one :)
>> Normally I'd say no, but since it's basically just fixes, I guess we can
>> pull it in. But generally, it has to be in my hands a week before this,
>> so it can simmer a bit in for-next before going in...
>>
> Are these fixes going to be backported to 3.10 or other stable releases?
>

Could you please answer this question?

If you don't want to maintain bcache for stable kernels (I can understand that), can you at least mark it as unstable/experimental, since it really is?

Thanks
Re: [PATCH] bcache: fix uninterruptible sleep in writeback thread
On 07/25/2014 09:30 AM, Francis Moreau wrote:
> Hi,
>
> On 06/02/2014 04:07 PM, Francis Moreau wrote:
>> Hello,
>>
>> On 05/15/2014 07:30 PM, Jens Axboe wrote:
>>> On 05/15/2014 02:02 AM, Francis Moreau wrote:
>>>> Hello Jens,
>>>>
>>>> On 05/12/2014 08:27 PM, Peter Kieser wrote:
>>>>>
>>>>> On 2014-05-05 3:30 PM, Nikolay Amiantov wrote:
>>>>>> 2014-05-02 1:52 GMT+04:00 Slava Pestov :
>>>>>>> There were two issues here:
>>>>>>>
>>>>>>> - writeback thread did not start until the device first became dirty
>>>>>>> - writeback thread used uninterruptible sleep once running
>>>>>>>
>>>>>>> Without this patch I see kernel warnings printed and a load average of
>>>>>>> 1.52 after booting my test VM. With this patch the warnings are gone and
>>>>>>> the load average is near 0.00 as expected.
>>>>>> I've tried this patch and it has indeed fixed [1]! Thanks!
>>>>>>
>>>>>> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=69471
>>>>>
>>>>> Kent,
>>>>>
>>>>> Could you please review this patch, and have it pushed upstream?
>>>>>
>>>>
>>>> Would it be possible to merge this patch directly before 3.15 is being
>>>> released since kent don't seem to care about bugs in bcache or maybe he
>>>> does but very selectively ?
>>>>
>>>> Also it would be great that stable trees will be fixed.
>>>>
>>>> Eventually I would suggest to mark bcache as an experimental thing since
>>>> it's really not ready for production, just take a look at the bcache
>>>> mailing list to see why. At least people won't be disappointed when
>>>> they'll use bcache and see ton of koops.
>>>
>>> I'd really like to get Kent to weigh in on this. Sometimes it appears
>>> straightforward to switch from uninterruptible to interruptible sleep,
>>> but then signals get in the way.
>>>
>>
>> Any progress ?
>>
>
> still present on 3.15.5-2-ARCH :-/
>

still present on 3.14.17 :-/
oops on kernel 3.14.17 seems related to EFI
Hello, Another day, another oops... [ +0.055412] BUG: unable to handle kernel NULL pointer dereference at (null) [ +0.62] IP: [< (null)>] (null) [ +0.34] PGD 2b3c067 PUD 2b3d067 PMD 2b3e067 PTE 8163 [ +0.47] Oops: 0011 [#1] SMP [ +0.27] Modules linked in: btrfs raid6_pq xor ses enclosure usb_storage tun loop joydev fuse coretemp hwmon arc4 iwldvm led_class intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm nls_iso8859_1 mac80211 nls_cp437 crct10d [ +0.000574] ac processor mac_hid nfs lockd sunrpc fscache ext4 crc16 mbcache jbd2 hid_generic usbhid hid bcache sd_mod sr_mod crc_t10dif cdrom crct10dif_common rtsx_pci_sdmmc mmc_core atkbd libps2 ahci libahci ehci_pci libata xhci_hcd e [ +0.000274] CPU: 3 PID: 12321 Comm: systemd-udevd Not tainted 3.14.17-1-lts #1 [ +0.44] Hardware name: CLEVO CO.W55xEU /W55xEU , BIOS 4.6.5 03/05/2013 [ +0.77] task: 8803a9ef60e0 ti: 8800b400 task.ti: 8800b400 [ +0.45] RIP: 0010:[<>] [< (null)>] (null) [ +0.46] RSP: 0018:8800b4001da8 EFLAGS: 00010002 [ +0.32] RAX: 80050033 RBX: 880406288000 RCX: 880406288000 [ +0.41] RDX: 880406288400 RSI: 880406288000 RDI: [ +0.42] RBP: 8800b4001e80 R08: R09: 8800b4001ec0 [ +0.42] R10: R11: 0246 R12: 880406288400 [ +0.42] R13: R14: 8800b4001ec0 R15: 0009b000 [ +0.42] FS: 7f7720a567c0() GS:88041e2c() knlGS: [ +0.48] CS: 0010 DS: ES: CR0: 80050033 [ +0.34] CR2: CR3: 0009b000 CR4: 001427e0 [ +0.42] Stack: [ +0.14] 81063281 811bd95c 0246 8800da9e6628 [ +0.52] 8003 8800b4001e50 [ +0.51] 80050033 7f7722abbc50 7f7722abbb50 00ff [ +0.52] Call Trace: [ +0.22] [] ? efi_call5+0x71/0xf0 [ +0.35] [] ? getname_flags+0x2c/0x130 [ +0.37] [] ? virt_efi_get_variable+0x49/0x60 [ +0.51] [] efivar_entry_size+0x41/0x80 [ +0.30] [] efivarfs_file_read+0x49/0x100 [ +0.46] [] vfs_read+0x97/0x160 [ +0.41] [] SyS_read+0x59/0xd0 [ +0.41] [] system_call_fastpath+0x16/0x1b [ +0.35] Code: Bad RIP value. 
[ +0.26] RIP [< (null)>] (null) [ +0.33] RSP [ +0.22] CR2: [ +0.015070] ---[ end trace 9b69115f973204df ]---

Thanks for any help.
Re: [PATCH 3.10.y+] PM / sleep: Use valid_state() for platform-dependent sleep states only
Hello,

On 09/04/2014 11:21 PM, Brian Norris wrote:
> From: "Rafael J. Wysocki"
>
> [Upstream commit 43e8317b0bba1d6eb85f38a4a233d82d7c20d732]
>
> Use the observation that, for platform-dependent sleep states
> (PM_SUSPEND_STANDBY, PM_SUSPEND_MEM), a given state is either
> always supported or always unsupported and store that information
> in pm_states[] instead of calling valid_state() every time we
> need to check it.
>
> Also do not use valid_state() for PM_SUSPEND_FREEZE, which is always
> valid, and move the pm_test_level validity check for PM_SUSPEND_FREEZE
> directly into enter_state().
>
> Signed-off-by: Rafael J. Wysocki
> Cc: # 3.10+: 27ddcc6596e5: PM / sleep: Add state
> field to pm_states[] entries
> Cc: # 3.10+
> ---
> This is a backport request for these two commits upstream:
>
> 27ddcc6596e5 PM / sleep: Add state field to pm_states[] entries
> 43e8317b0bba PM / sleep: Use valid_state() for platform-dependent sleep
> states only
>

Wouldn't it be cleaner to have 2 separate backports then ?

Thanks for the backport anyways.
Re: kernel 3.14.2 oops: seems related to EFI
On 05/20/2014 01:54 PM, Matt Fleming wrote: > On Mon, 19 May, at 09:09:58AM, Francis Moreau wrote: >> >> I don't know, I can't really afford to configure/compile/test this new >> kernel, sorry. > > It would be useful to know whether this issue still occurs when booting > with the efi=old_map kernel parameter. > the bug triggered: [ +0.002872] BUG: unable to handle kernel paging request at fffefd4a1e60 [ +0.66] IP: [] virt_efi_get_variable+0x48/0x80 [ +0.54] PGD 280f067 PUD 0 [ +0.31] Oops: [#1] PREEMPT SMP [ +0.39] Modules linked in: tun ses enclosure usb_storage loop fuse joydev coretemp hwmon arc4 nls_iso8859_1 nls_c [ +0.000691] ac ext4 crc16 mbcache jbd2 hid_generic usbhid hid bcache sd_mod sr_mod crc_t10dif cdrom crct10dif_common [ +0.000289] CPU: 7 PID: 23293 Comm: systemd-udevd Tainted: GW 3.14.4-1-ARCH #1 [ +0.57] Hardware name: CLEVO CO.W55xEU /W55xEU [ +0.87] task: 88039557bae0 ti: 8802de764000 task.ti: 8802de764000 [ +0.50] RIP: 0010:[] [] virt_efi_get_variable+0x48/0x80 [ +0.64] RSP: 0018:8802de765e58 EFLAGS: 00010082 [ +0.37] RAX: fffefd4a1e18 RBX: 8800da88f000 RCX: [ +0.48] RDX: 8800da88f400 RSI: 8800da88f000 RDI: [ +0.48] RBP: 8802de765e80 R08: 8802de765ec0 R09: [ +0.47] R10: R11: 0246 R12: 8800da88f400 [ +0.48] R13: R14: 8802de765ec0 R15: [ +0.48] FS: 7f10751057c0() GS:88041e3c() knlGS: [ +0.54] CS: 0010 DS: ES: CR0: 80050033 [ +0.40] CR2: fffefd4a1e60 CR3: 0003c4afa000 CR4: 001407e0 [ +0.48] Stack: [ +0.16] 8800da88f000 8802de765ec0 81b27c20 8802de765f48 [ +0.60] 3bc93ec9a0004bba 8802de765ea8 813dbc91 8800da88f000 [ +0.60] 7fffdc30c104 0004 8802de765ef8 81245779 [ +0.60] Call Trace: [ +0.25] [] efivar_entry_size+0x41/0x80 [ +0.44] [] efivarfs_file_read+0x49/0x100 [ +0.44] [] vfs_read+0x97/0x160 [ +0.37] [] SyS_read+0x59/0xd0 [ +0.39] [] system_call_fastpath+0x16/0x1b [ +0.41] Code: ce 4d 89 c7 e8 9a 06 00 00 65 ff 04 25 a0 c7 00 00 48 8b 05 1b d4 86 00 4d 89 f9 4d 89 f0 4c 89 e9 [ +0.000335] RIP [] virt_efi_get_variable+0x48/0x80 [ +0.49] RSP [ 
+0.26] CR2: fffefd4a1e60 [ +0.016781] ---[ end trace 5a7017feeac75345 ]---

The sad thing is that my system can't shut down properly when it happens.
Re: kernel 3.14.2 oops: seems related to EFI
On 05/20/2014 01:54 PM, Matt Fleming wrote:
> On Mon, 19 May, at 09:09:58AM, Francis Moreau wrote:
>>
>> I don't know, I can't really afford to configure/compile/test this new
>> kernel, sorry.
>
> It would be useful to know whether this issue still occurs when booting
> with the efi=old_map kernel parameter.
>

OK, I can try to boot with that parameter and see if the issue happens again. Unfortunately, if it doesn't, we won't be able to tell for sure.

Thanks
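Before drawing any conclusion from such a test, it may be worth confirming that the parameter was actually picked up after rebooting. A minimal sketch, assuming /proc/cmdline is available on the running system; the helper name `has_boot_param` is invented here for illustration.

```shell
#!/bin/sh
# Succeed if PARAM appears as a whole word in the kernel command line.
# The second argument (file to read) defaults to /proc/cmdline.
has_boot_param() {
    param=$1
    cmdline_file="${2:-/proc/cmdline}"
    for word in $(cat "$cmdline_file" 2>/dev/null); do
        [ "$word" = "$param" ] && return 0
    done
    return 1
}

if has_boot_param efi=old_map; then
    echo "efi=old_map is active"
else
    echo "efi=old_map is not set"
fi
```

The comparison is done word by word, so a longer parameter that merely starts with the same text is not reported as a match.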
Re: kernel 3.14.2 oops: seems related to EFI
On 05/18/2014 03:42 PM, Borislav Petkov wrote: > On Sat, May 17, 2014 at 05:25:47PM +0200, Francis Moreau wrote: >> [ +0.018677] general protection fault: [#1] PREEMPT SMP >> [ +0.68] Modules linked in: usb_storage tun raid1 md_mod loop fuse >> joydev coretemp hwmon arc4 intel_rapl x86_pkg_temp_thermal >> intel_powerclamp kvm_intel nls_iso8859_1 nls_cp437 iTCO_wdt kvm vfat fat >> iTCO_vendor_support iwldvm uvcvideo led_class crct10dif_pclmul >> crc32_pclmul crc32c_intel ghash_clmulni_intel mac80211 videobuf2_vmalloc >> videobuf2_memops videobuf2_core aesni_intel videodev aes_x86_64 >> snd_hda_codec_hdmi lrw gf128mul mousedev glue_helper btusb >> snd_hda_codec_via ablk_helper media cryptd iwlwifi snd_hda_codec_generic >> bluetooth psmouse microcode i2c_i801 serio_raw cfg80211 6lowpan_iphc >> rtsx_pci_ms r8169 memstick rfkill lpc_ich mii snd_hda_intel >> snd_hda_codec thermal snd_hwdep wmi snd_pcm tpm_infineon snd_timer >> tpm_tis mei_me snd tpm mei shpchp evdev soundcore processor battery >> mac_hid ac >> [ +0.000803] ext4 crc16 mbcache jbd2 hid_generic usbhid hid bcache >> sd_mod sr_mod crc_t10dif cdrom crct10dif_common rtsx_pci_sdmmc mmc_core >> atkbd libps2 ahci libahci ehci_pci libata xhci_hcd ehci_hcd scsi_mod >> rtsx_pci usbcore usb_common i8042 serio i915 video button intel_gtt >> i2c_algo_bit drm_kms_helper drm i2c_core >> [ +0.000328] CPU: 0 PID: 30835 Comm: systemd-udevd Not tainted >> 3.14.2-1-ARCH #1 >> [ +0.64] Hardware name: CLEVO CO.W55xEU >> /W55xEU , BIOS 4.6.5 >> 03/05/2013 >> [ +0.000102] task: 880405ee6bf0 ti: 880400f4a000 task.ti: >> 880400f4a000 >> [ +0.60] RIP: 0010:[] [] >> efi_call5+0x6f/0xf0 >> [ +0.71] RSP: 0018:880400f4bdb0 EFLAGS: 00010002 >> [ +0.45] RAX: 80050033 RBX: 8804040e3000 RCX: >> 8804040e3000 >> [ +0.55] RDX: 8804040e3400 RSI: 8804040e3000 RDI: >> bff7f7af > > So you get a #GP while executing call *rdi and %rdi is supposed to > contain ->get_variable. 
But instead it contains some very funky shit:
>
> 0xbff7f7af
>
> Who made it contain that nuisance of a pointer which thinks it is
> ->get_variable, huh? If only I could get my hands on that guy! :-P
>
> Ok, seriously, how reproducible is this?

I don't really know how to reproduce this. I can only say that it usually happens while partitioning the loop device, or perhaps when the kernel reads the partition table afterwards.

> Can you reproduce with the
> latest upstream kernel too, i.e. 3.15-rc5+?

I don't know, I can't really afford to configure/compile/test this new kernel, sorry.

Thanks
kernel 3.14.2 oops: seems related to EFI
[ +0.018677] general protection fault: [#1] PREEMPT SMP [ +0.68] Modules linked in: usb_storage tun raid1 md_mod loop fuse joydev coretemp hwmon arc4 intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel nls_iso8859_1 nls_cp437 iTCO_wdt kvm vfat fat iTCO_vendor_support iwldvm uvcvideo led_class crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel mac80211 videobuf2_vmalloc videobuf2_memops videobuf2_core aesni_intel videodev aes_x86_64 snd_hda_codec_hdmi lrw gf128mul mousedev glue_helper btusb snd_hda_codec_via ablk_helper media cryptd iwlwifi snd_hda_codec_generic bluetooth psmouse microcode i2c_i801 serio_raw cfg80211 6lowpan_iphc rtsx_pci_ms r8169 memstick rfkill lpc_ich mii snd_hda_intel snd_hda_codec thermal snd_hwdep wmi snd_pcm tpm_infineon snd_timer tpm_tis mei_me snd tpm mei shpchp evdev soundcore processor battery mac_hid ac [ +0.000803] ext4 crc16 mbcache jbd2 hid_generic usbhid hid bcache sd_mod sr_mod crc_t10dif cdrom crct10dif_common rtsx_pci_sdmmc mmc_core atkbd libps2 ahci libahci ehci_pci libata xhci_hcd ehci_hcd scsi_mod rtsx_pci usbcore usb_common i8042 serio i915 video button intel_gtt i2c_algo_bit drm_kms_helper drm i2c_core [ +0.000328] CPU: 0 PID: 30835 Comm: systemd-udevd Not tainted 3.14.2-1-ARCH #1 [ +0.64] Hardware name: CLEVO CO.W55xEU /W55xEU , BIOS 4.6.5 03/05/2013 [ +0.000102] task: 880405ee6bf0 ti: 880400f4a000 task.ti: 880400f4a000 [ +0.60] RIP: 0010:[] [] efi_call5+0x6f/0xf0 [ +0.71] RSP: 0018:880400f4bdb0 EFLAGS: 00010002 [ +0.45] RAX: 80050033 RBX: 8804040e3000 RCX: 8804040e3000 [ +0.55] RDX: 8804040e3400 RSI: 8804040e3000 RDI: bff7f7af [ +0.56] RBP: 880400f4be80 R08: R09: 880400f4bec0 [ +0.55] R10: R11: 0246 R12: 8804040e3400 [ +0.56] R13: R14: 880400f4bec0 R15: 0009b000 [ +0.002960] FS: 7fb6167c97c0() GS:88041e20() knlGS: [ +0.002958] CS: 0010 DS: ES: CR0: 80050033 [ +0.003177] CR2: 7fb61581f4c0 CR3: 0009b000 CR4: 001427e0 [ +0.003258] Stack: [ +0.003257] 0201 8065 8804 8801 [ +0.003328] 880400f4be50 80050033 [ 
+0.003354] 00ff 00ff [ +0.003368] Call Trace: [ +0.003389] [] ? virt_efi_get_variable+0x51/0x80 [ +0.003353] [] efivar_entry_size+0x41/0x80 [ +0.003315] [] efivarfs_file_read+0x49/0x100 [ +0.003326] [] vfs_read+0x97/0x160 [ +0.003305] [] SyS_read+0x59/0xd0 [ +0.003263] [] system_call_fastpath+0x16/0x1b [ +0.003239] Code: 89 c8 48 89 f1 80 3d e8 16 7d 00 00 74 1d 4c 89 3d c7 16 7d 00 41 0f 20 df 4c 89 3d c4 16 7d 00 4c 8b 3d c5 16 7d 00 41 0f 22 df d7 80 3d c0 16 7d 00 00 74 41 4c 8b 3d a7 16 7d 00 41 0f 22 [ +0.003648] RIP [] efi_call5+0x6f/0xf0 [ +0.003511] RSP [ +0.024630] ---[ end trace 3670998c9a49abb7 ]--- [ +0.05] note: systemd-udevd[30835] exited with preempt_count 2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Can't umount /mnt/dev after calling dd(1) and with /mnt/dev is a bind mount
up

On 03/22/2014 05:52 PM, Francis Moreau wrote:
> Hello,
>
> I'm posting here because it might be a behaviour related to the kernel
> internals that I can't explain from my user point of view :)
>
> Basically I'm doing this:
>
>   mount -o bind /dev/ /mnt/dev &&
>   chroot /mnt dd bs=440 conv=notrunc count=1 if=gptmbr.bin of=/dev/loop0
>   umount /mnt/dev
>
> but umount gives the following error: "umount: /mnt/dev: target is busy"
>
> I tried to see if any processes were still using a file in dev with
> fuser(1) but there weren't any. Furthermore, inserting a call to fuser(1)
> right before umount fixed the issue, so it really seems to be a timing issue.
>
> stracing umount showed that umount failed here:
>   umount("/mnt/dev", 0) = -1 EBUSY (Device or resource busy)
>
> I replaced the bind mount of /dev by:
>   mount -t devtmpfs none /mnt/dev
> and it worked.
>
> Could anybody tell me what I'm doing wrong?
>
> Thanks.
>
Re: Can't umount /mnt/dev after calling dd(1) and with /mnt/dev is a bind mount
Hello, On 03/22/2014 08:24 PM, Al Viro wrote: > On Sat, Mar 22, 2014 at 05:52:24PM +0100, Francis Moreau wrote: >> I'm posting here because it might be a behaviour related to the kernel >> internals that I can't explain from my user point of view :) >> >> Basically I'm doing this: >> >> mount -o bind /dev/ /mnt/dev && >> chroot /mnt dd bs=440 conv=notrunc count=1 if=gptmbr.bin of=/dev/loop0 >> umount /mnt/dev >> >> but umount gives the following error: "umount: /mnt/dev: target is busy" > > What do you have in /proc/self/mountinfo before all that? > here it is: $ cat /proc/self/mountinfo 15 19 0:3 / /proc rw,nosuid,nodev,noexec,relatime shared:5 - proc proc rw 16 19 0:14 / /sys rw,nosuid,nodev,noexec,relatime shared:6 - sysfs sys rw 17 19 0:5 / /dev rw,nosuid,relatime shared:2 - devtmpfs dev rw,size=8152392k,nr_inodes=2038098,mode=755 18 19 0:15 / /run rw,nosuid,nodev,relatime shared:12 - tmpfs run rw,mode=755 19 1 8:17 / / rw,relatime shared:1 - ext4 /dev/sdb1 rw,discard,data=ordered 20 16 0:16 / /sys/kernel/security rw,nosuid,nodev,noexec,relatime shared:7 - securityfs securityfs rw 21 17 0:17 / /dev/shm rw,nosuid,nodev shared:3 - tmpfs tmpfs rw 22 17 0:11 / /dev/pts rw,nosuid,noexec,relatime shared:4 - devpts devpts rw,gid=5,mode=620,ptmxmode=000 23 16 0:18 / /sys/fs/cgroup rw,nosuid,nodev,noexec shared:8 - tmpfs tmpfs rw,mode=755 24 23 0:19 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:9 - cgroup cgroup rw,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 25 16 0:20 / /sys/fs/pstore rw,nosuid,nodev,noexec,relatime shared:10 - pstore pstore rw 26 16 0:21 / /sys/firmware/efi/efivars rw,nosuid,nodev,noexec,relatime shared:11 - efivarfs efivarfs rw 27 23 0:22 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:13 - cgroup cgroup rw,cpuset 28 23 0:23 / /sys/fs/cgroup/cpu,cpuacct rw,nosuid,nodev,noexec,relatime shared:14 - cgroup cgroup rw,cpuacct,cpu 29 23 0:24 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime 
shared:15 - cgroup cgroup rw,memory 30 23 0:25 / /sys/fs/cgroup/devices rw,nosuid,nodev,noexec,relatime shared:16 - cgroup cgroup rw,devices 31 23 0:26 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime shared:17 - cgroup cgroup rw,freezer 32 23 0:27 / /sys/fs/cgroup/net_cls rw,nosuid,nodev,noexec,relatime shared:18 - cgroup cgroup rw,net_cls 33 23 0:28 / /sys/fs/cgroup/blkio rw,nosuid,nodev,noexec,relatime shared:19 - cgroup cgroup rw,blkio 34 15 0:29 / /proc/sys/fs/binfmt_misc rw,relatime shared:20 - autofs systemd-1 rw,fd=21,pgrp=1,timeout=300,minproto=5,maxproto=5,direct 35 16 0:7 / /sys/kernel/debug rw,relatime shared:21 - debugfs debugfs rw 36 17 0:13 / /dev/mqueue rw,relatime shared:22 - mqueue mqueue rw 37 17 0:30 / /dev/hugepages rw,relatime shared:23 - hugetlbfs hugetlbfs rw 39 16 0:32 / /sys/kernel/config rw,relatime shared:24 - configfs configfs rw 38 19 0:31 / /tmp rw shared:25 - tmpfs tmpfs rw 40 34 0:33 / /proc/sys/fs/binfmt_misc rw,relatime shared:26 - binfmt_misc binfmt_misc rw 42 19 8:1 / /boot rw,relatime shared:27 - vfat /dev/sda1 rw,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro 43 19 254:0 / /home rw,relatime shared:28 - ext4 /dev/bcache0 rw,data=ordered 77 18 0:35 / /run/user/1000 rw,nosuid,nodev,relatime shared:61 - tmpfs tmpfs rw,size=1631108k,mode=700,uid=1000,gid=100 79 16 0:36 / /sys/fs/fuse/connections rw,relatime shared:63 - fusectl fusectl rw 81 77 0:37 / /run/user/1000/gvfs rw,nosuid,nodev,relatime shared:65 - fuse.gvfsd-fuse gvfsd-fuse rw,user_id=1000,group_id=100 44 19 7:2 / /mnt rw,relatime shared:29 - ext4 /dev/loop0p2 rw,data=ordered Thanks -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
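The listing above uses the /proc/self/mountinfo layout described in proc(5): mount ID, parent ID, major:minor, root, mount point, mount options, zero or more optional fields such as "shared:N", a lone "-" separator, then filesystem type, source and superblock options. What follows is a minimal Python sketch for picking those fields apart, assuming that layout. The helper name parse_mountinfo_line is hypothetical and nothing in this thread uses it; it only makes it easier to see, for instance, that /dev is tagged shared:2 and /mnt is tagged shared:29.

```python
def parse_mountinfo_line(line):
    """Split one /proc/self/mountinfo line into named fields (proc(5) layout)."""
    # Everything before the " - " separator is per-mount data,
    # everything after it describes the filesystem itself.
    left, right = line.split(" - ", 1)
    head = left.split()
    tail = right.split()
    return {
        "mount_id": int(head[0]),
        "parent_id": int(head[1]),
        "dev": head[2],                  # major:minor
        "root": head[3],
        "mount_point": head[4],
        "options": head[5].split(","),
        "optional": head[6:],            # e.g. ["shared:2"]; may be empty
        "fstype": tail[0],
        "source": tail[1],
        "super_options": tail[2].split(",") if len(tail) > 2 else [],
    }


if __name__ == "__main__":
    # Two lines copied from the listing in this message.
    samples = [
        "17 19 0:5 / /dev rw,nosuid,relatime shared:2 - devtmpfs dev "
        "rw,size=8152392k,nr_inodes=2038098,mode=755",
        "44 19 7:2 / /mnt rw,relatime shared:29 - ext4 /dev/loop0p2 rw,data=ordered",
    ]
    for text in samples:
        entry = parse_mountinfo_line(text)
        print(entry["mount_point"], entry["fstype"], entry["optional"])
```

On a live system the same function can be applied to each line read from /proc/self/mountinfo.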
Can't umount /mnt/dev after calling dd(1) and with /mnt/dev is a bind mount
Hello,

I'm posting here because it might be a behaviour related to the kernel
internals that I can't explain from my user point of view :)

Basically I'm doing this:

  mount -o bind /dev/ /mnt/dev &&
  chroot /mnt dd bs=440 conv=notrunc count=1 if=gptmbr.bin of=/dev/loop0
  umount /mnt/dev

but umount gives the following error: "umount: /mnt/dev: target is busy"

I tried to see if any processes were still using a file in dev with
fuser(1) but there weren't any. Furthermore, inserting a call to fuser(1)
right before umount fixed the issue, so it really seems to be a timing issue.

stracing umount showed that umount failed here:
  umount("/mnt/dev", 0) = -1 EBUSY (Device or resource busy)

I replaced the bind mount of /dev by:
  mount -t devtmpfs none /mnt/dev
and it worked.

Could anybody tell me what I'm doing wrong?

Thanks.
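Since the failure looks timing-dependent, one way to probe it is to retry the unmount a few times with a short pause and record which attempt succeeds. The Python sketch below does that. The helper names retry and umount, the attempt count and the delay are all assumptions made for illustration. It is a diagnostic aid rather than a fix, and it says nothing about why the mount is busy in the first place.

```python
import subprocess
import time


def retry(action, attempts=5, delay=0.2, sleep=time.sleep):
    """Call action() until it returns True or the attempts run out.

    Returns the 1-based number of the successful attempt, or 0 if
    every attempt failed.
    """
    for n in range(1, attempts + 1):
        if action():
            return n
        if n < attempts:
            sleep(delay)  # give whatever holds the mount time to let go
    return 0


def umount(target):
    """Run umount(8) on target; True when it exits with status 0."""
    return subprocess.run(["umount", target]).returncode == 0


if __name__ == "__main__":
    tries = retry(lambda: umount("/mnt/dev"))
    if tries:
        print("umount succeeded on attempt", tries)
    else:
        print("umount still failing after all attempts")
```

If the first attempt fails and a later one succeeds, that supports the timing theory; if every attempt fails, something is holding the mount for longer.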
Re: 3.12: ethernet controller missing after resuming from suspend to RAM
On 02/12/2014 12:58 AM, Rafael J. Wysocki wrote: > On Tuesday, February 11, 2014 07:17:37 PM Peter Wu wrote: >> On Tuesday 11 February 2014 12:42:37 Mika Westerberg wrote: >>> On Mon, Feb 10, 2014 at 11:39:29PM +0100, Rafael J. Wysocki wrote: > _STA() returns 0x0A instead of 0x0F. Could there be something missing in > the ACPI hotplug code that overlooks this and removes the device on > resume? That is possible. Actually even quite likely, but let's wait for the response from Peter. >>> >>> Here's a hack that should take the 0xa return value into consideration. It >>> turned out that this case is even mentioned in the ACPI spec. >> >> Tested-by: Peter Wu >> >> This works, the devices are not lost anymore after resume. dmesg >> mentions the 04:00.* devices at resume compared to the unpatched kernel: >> >> [ 42.650721] PM: Finishing wakeup. >> [ 42.650768] acpiphp_glue: hotplug_event: Bus check notify on >> \_SB_.PCI0.RP03 >> [ 42.650772] acpiphp_glue: hotplug_event: re-enumerating slots under >> \_SB_.PCI0.RP03 >> [ 42.650874] iwlwifi :05:00.0: no hotplug settings from platform >> [ 42.650722] Restarting tasks ... done. 
>> [ 42.650985] video LNXVIDEO:00: Restoring backlight state >> [ 42.650988] video LNXVIDEO:01: Restoring backlight state >> [ 43.124208] ACPI: \_SB_.AC__: ACPI_NOTIFY_BUS_CHECK event: unsupported >> [ 43.128401] jme :04:00.5: irq 50 for MSI/MSI-X >> [ 43.153005] jme :04:00.5 eth0: Link is down >> [ 43.153030] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready >> [ 43.154364] iwlwifi :05:00.0: L1 Enabled; Disabling L0S >> [ 43.162307] iwlwifi :05:00.0: Radio type=0x1-0x3-0x1 >> [ 43.276220] acpiphp_glue: hotplug_event: Bus check notify on >> \_SB_.PCI0.RP01 >> [ 43.276223] acpiphp_glue: hotplug_event: re-enumerating slots under >> \_SB_.PCI0.RP01 >> [ 43.276257] xhci_hcd :02:00.0: no hotplug settings from platform >> [ 43.276267] acpiphp_glue: hotplug_event: Bus check notify on >> \_SB_.PCI0.RP02 >> [ 43.276268] acpiphp_glue: hotplug_event: re-enumerating slots under >> \_SB_.PCI0.RP02 >> [ 43.276355] sdhci-pci :04:00.0: no hotplug settings from platform >> [ 43.276368] pci :04:00.2: no hotplug settings from platform >> [ 43.276381] jmb38x_ms :04:00.3: no hotplug settings from platform >> [ 43.276393] jme :04:00.5: no hotplug settings from platform >> [ 43.276398] acpiphp_glue: hotplug_event: Bus check notify on >> \_SB_.PCI0.RP03 >> [ 43.276399] acpiphp_glue: hotplug_event: re-enumerating slots under >> \_SB_.PCI0.RP03 >> [ 43.276491] iwlwifi :05:00.0: no hotplug settings from platform >> [ 43.277214] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready >> >> Rafael, do you want me to test the other patch as well? > > No, thanks! > > Mika, I'll add a changelog to your patch and queue it up as a fix for 3.14. > Thanks guys for solving this issue ! Rafael, could this be submitted to stable trees (at least 3.12, 3.13) as well ? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
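The two _STA values mentioned in this thread differ only in bits 0 and 2. In the ACPI specification, bit 0 means the device is present, bit 1 that it is enabled and decoding its resources, bit 2 that it should be shown in the UI, and bit 3 that it is functioning properly. The Python sketch below decodes a value into those flags. The should_enumerate policy is an assumed reading of the "not present but functioning" case that the spec describes. It is not the code from Mika's patch, and all names in it are made up for illustration.

```python
# _STA bit layout as defined by the ACPI specification.
STA_PRESENT = 0x01      # bit 0: device is present
STA_ENABLED = 0x02      # bit 1: enabled and decoding its resources
STA_SHOW_IN_UI = 0x04   # bit 2: should be shown in the UI
STA_FUNCTIONING = 0x08  # bit 3: functioning properly

_FLAGS = [
    (STA_PRESENT, "present"),
    (STA_ENABLED, "enabled"),
    (STA_SHOW_IN_UI, "shown in UI"),
    (STA_FUNCTIONING, "functioning"),
]


def decode_sta(value):
    """Return the names of the _STA flags set in value."""
    return [name for bit, name in _FLAGS if value & bit]


def should_enumerate(value):
    """Assumed policy: keep the device if it is present or functioning.

    This is an illustration only; the real check lives in the kernel's
    ACPI hotplug code and may test different bits.
    """
    return bool(value & (STA_PRESENT | STA_FUNCTIONING))


if __name__ == "__main__":
    for sta in (0x0F, 0x0A):
        print(hex(sta), decode_sta(sta), should_enumerate(sta))
```

Under this reading a device reporting 0x0A is kept, whereas a check that only looks at the present bit would drop it on resume.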
Re: 3.12: ethernet controller missing after resuming from suspend to RAM
On 02/09/2014 07:44 PM, Bastien Traverse wrote: > Le 07/02/2014 02:19, Rafael J. Wysocki a écrit : >> Please send the output of lspci -vv right before suspend and right after >> the subsequent resume as attachments. > > You'll find them attached, but I got a strange error when I wanted to > run it as root: > $ sudo lspci -vv > lspci_vv_before > pcilib: sysfs_read_vpd: read failed: Connection timed out > $ sudo -i > # lspci -vv > pcilib: sysfs_read_vpd: read failed: Connection timed out > > So I only got the unpriviledge output. > > Some complementary lines from my journal: > > kernel: r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded > kernel: r8169 :03:00.2: can't disable ASPM; OS doesn't have ASPM control > kernel: pcieport :00:1c.3: driver skip pci_set_master, fix it! > kernel: r8169 :03:00.2: irq 44 for MSI/MSI-X > kernel: r8169 :03:00.2 eth0: RTL8411 at 0xc90016ed4000, > 00:90:f5:d7:34:53, XID 08800800 IRQ 44 > kernel: r8169 :03:00.2 eth0: jumbo features [frames: 9200 bytes, tx > checksumming: ko] > kernel: rtsx_pci :03:00.0: irq 45 for MSI/MSI-X > kernel: rtsx_pci :03:00.0: rtsx_pci_acquire_irq: pcr->msi_en = 1, > pci->irq = 45 > ... > > And one I thought of interest: > > kernel: rtsx_pci :03:00.0: vpd r/w failed. This is likely a > firmware bug on this device. Contact the card vendor for a firmware update. > > That came three times before suspend. > > Only two lines about hotplug, none special. > > Stripped journal attached for the suspend cycle. > > > Le 07/02/2014 08:29, Francis Moreau a écrit : >> yeah, but calling this "fast resolution" is quite incorrect. >> >> I don't blame anyone for this and I'm quite happy that a workaround has >> been found at last but calling this "fast resolution" is a bit funny >> compare to the PITA it was to debug this. > > Sorry, I didn't mean to underestimate the amount of work you put in that > bug resolution (actually it was the first time I heard of kernel > bisection and was pretty impressed by how you led it). 
No problem Bastien, and don't be impressed by this :). Bisection is mechanical work, but it can be useful when we users are lost and have no clue about what's going on. The real PITA in that case was rebooting the system more than 10 times in order to test each suspicious commit on a live system.

Anyway, good luck, all my hopes are on you now :)
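The "more than 10 reboots" figure is about what a bisection over roughly a thousand commits costs, because each test halves the remaining range. Below is a minimal Python sketch of the idea, assuming the history is linear, ordered oldest to newest, and that every commit after the first bad one is also bad. The function name bisect_first_bad and the integer "commits" are hypothetical; git bisect itself handles merges and skipped commits, which this toy version does not.

```python
def bisect_first_bad(commits, is_bad):
    """Binary-search commits (oldest first) for the first bad one.

    Assumes the last commit is bad and badness never goes away once
    introduced. Returns (first_bad_commit, number_of_tests_run).
    """
    lo, hi = 0, len(commits) - 1
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1                 # one build-boot-test cycle
        if is_bad(commits[mid]):
            hi = mid               # first bad commit is mid or earlier
        else:
            lo = mid + 1           # first bad commit is after mid
    return commits[lo], tests


if __name__ == "__main__":
    history = list(range(1000))    # stand-in for 1000 commits
    culprit, tests = bisect_first_bad(history, lambda c: c >= 700)
    print("first bad commit:", culprit, "found after", tests, "tests")
```

Each call to is_bad stands for one build, reboot and suspend/resume test on the real machine, which is where the pain comes from.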
Re: 3.12: ethernet controller missing after resuming from suspend to RAM
On 02/07/2014 12:15 AM, Bastien Traverse wrote:
>
> I was also hit by the rtsx driver bug
> (https://bugs.archlinux.org/task/37720) and was delighted with its fast
> resolution. I was hoping that its fix would also address the
> disappearing Ethernet bug.

Yeah, but calling this a "fast resolution" is quite incorrect.

I don't blame anyone for this and I'm quite happy that a workaround has been found at last, but calling this a "fast resolution" is a bit funny compared to the PITA it was to debug.

That's probably why I haven't tried to do similar work for this bug.

Thanks.
Re: 3.12: ethernet controller missing after resuming from suspend to RAM
On 02/06/2014 01:40 PM, Rafael J. Wysocki wrote:
> On Thursday, February 06, 2014 08:38:15 AM Francis Moreau wrote:
>> Hi,
>>
>> On 02/06/2014 12:42 AM, Bastien Traverse wrote:
>>> Hello,
>>>
>>> I'm encountering the exact same problem (same model of machine, same
>>> controller, same OS) and was wondering where this bug was at.
>>
>> I'm still living with this issue since my initial posting, unfortunately :(
>>
>> I'm wondering why PM support on this machine is so crappy (it initially
>> oopsed when resuming due to a bug in the rtsx driver).
>
> Is this a new problem in 3.12?
>

I don't know, sorry, maybe Bastien has an idea.
Re: 3.12: ethernet controller missing after resuming from suspend to RAM
Hi,

On 02/06/2014 12:42 AM, Bastien Traverse wrote:
> Hello,
>
> I'm encountering the exact same problem (same model of machine, same
> controller, same OS) and was wondering where this bug was at.

I'm still living with this issue since my initial posting, unfortunately :(

I'm wondering why PM support on this machine is so crappy (it initially oopsed when resuming due to a bug in the rtsx driver).

Thanks
Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
On 01/10/2014 10:52 AM, Samuel Ortiz wrote:
> Hi Francis,
>
> On Fri, Jan 10, 2014 at 08:26:13AM +0100, Francis Moreau wrote:
>> Hi.
>>
>> On 12/10/2013 09:29 AM, Samuel Ortiz wrote:
>>> Hi Micky,
>>>
>>> On Tue, Dec 10, 2013 at 09:56:48AM +0800, micky wrote:
>>>> Hi Francis:
>>>> On 12/10/2013 09:39 AM, wwang wrote:
>>>>> which is based on Thomas' patch.
>>>>
>>>> Can you help us test this patch, we disable irq while suspend here.
>>> I already pushed a patch from Thomas to mfd-fixes that seems to fix the
>>> resume breakage:
>>>
>>> https://git.kernel.org/cgit/linux/kernel/git/sameo/mfd-fixes.git/commit/?id=19e49e445e198197c5e243f92d333d076e23d032
>>>
>>
>> I still can't see any trace of this fix in Linus' tree.
>>
>> Shouldn't this get merged before 3.13 is out?
> Yes, it should. I just sent a pull request to Linus for that.

Thanks, you might consider sending this to the 3.12 stable tree as well.
Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
Hi.

On 12/10/2013 09:29 AM, Samuel Ortiz wrote:
> Hi Micky,
>
> On Tue, Dec 10, 2013 at 09:56:48AM +0800, micky wrote:
>> Hi Francis:
>> On 12/10/2013 09:39 AM, wwang wrote:
>>> which is based on Thomas' patch.
>>
>> Can you help us test this patch, we disable irq while suspend here.
> I already pushed a patch from Thomas to mfd-fixes that seems to fix the
> resume breakage:
>
> https://git.kernel.org/cgit/linux/kernel/git/sameo/mfd-fixes.git/commit/?id=19e49e445e198197c5e243f92d333d076e23d032
>

I still can't see any trace of this fix in Linus' tree.

Shouldn't this get merged before 3.13 is out?

Thanks
Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
On 12/18/2013 05:05 AM, micky wrote:
> Hi:
>
> It seems that the card reader was removed during suspend or resume, is
> that right? Or did you remove it by hand?

Yes, during a suspend/resume cycle.

> I want to know: with Thomas' patch, after resume, do the card reader and
> the card-reader driver still exist?

I'm not sure, but IIRC it's still loaded in the kernel after resuming.

> If not, I also want to know which function is called first,
> rtsx_pci_resume or rtsx_pci_remove; can you determine it?
> And IRQ16 seems not to be handled by the rtsx_pci driver, so with Thomas' patch,
> is something still going wrong?
>

No idea, I'm simply an unfortunate user of that driver.

Isn't the information you're asking for already in the previous posts?

Thanks.
Re: 3.12: ethernet controller missing after resuming from suspend to RAM
Hello Rafael,

Could you see something in the logs?

Thanks

On 12/12/2013 08:17 PM, Francis Moreau wrote:
> On 12/12/2013 06:58 PM, Rafael J. Wysocki wrote:
>> On Thursday, December 12, 2013 06:43:03 PM Francis Moreau wrote:
>
> [...]
>
>>>
>>> Actually I can see this now:
>>>
>>> [ 42.400974] r8169 :03:00.2: System wakeup disabled by ACPI
>>
>> This should be harmless.
>>
>> Please run lspci -vv before and after suspend and send both results.
>
> Please find them attached.
>
> Thanks.
>
Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
Hi, On 12/10/2013 02:56 AM, micky wrote: > Hi Francis: > On 12/10/2013 09:39 AM, wwang wrote: >> which is based on Thomas' patch. > > Can you help us test this patch, we disable irq while suspend here. This patch doesn't seem to help, it still oops: [ 29.843910] [ cut here ] [ 29.843917] WARNING: CPU: 0 PID: 53 at lib/debugobjects.c:260 debug_print_object+0x83/0xa0() [ 29.843921] ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x20 [ 29.843972] Modules linked in: x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm rtsx_pci_ms i915 i2c_algo_bit intel_agp intel_gtt memstick iTCO_wdt drm_kms_helper crc32c_intel video drm r8169 mei_me mii thermal agpgart mei wmi iTCO_vendor_support ac i2c_i801 i2c_core battery evdev button shpchp lpc_ich mperf processor serio_raw microcode ext4 crc16 mbcache jbd2 sr_mod cdrom sd_mod usb_storage rtsx_pci_sdmmc mmc_core ahci libahci libata scsi_mod ehci_pci xhci_hcd ehci_hcd rtsx_pci usbcore usb_common [ 29.844004] CPU: 0 PID: 53 Comm: kworker/0:1 Not tainted 3.11.0-rc2-ARCH #66 [ 29.844006] Hardware name: CLEVO CO.W55xEU /W55xEU , BIOS 4.6.5 03/05/2013 [ 29.844010] Workqueue: kacpi_hotplug hotplug_event_work [ 29.844012] 0009 880407a95a18 81459fe9 880407a95a60 [ 29.844014] 880407a95a50 8104dc7d 880406b896b8 81836fc0 [ 29.844017] 81701358 81b2f9b0 0003 880407a95ab0 [ 29.844019] Call Trace: [ 29.844024] [] dump_stack+0x54/0x8d [ 29.844027] [] warn_slowpath_common+0x7d/0xa0 [ 29.844029] [] warn_slowpath_fmt+0x4c/0x50 [ 29.844032] [] debug_print_object+0x83/0xa0 [ 29.844034] [] ? queue_work_on+0x50/0x50 [ 29.844037] [] __debug_check_no_obj_freed+0x1fb/0x240 [ 29.844044] [] ? rtsx_pci_remove+0x119/0x1d0 [rtsx_pci] [ 29.844046] [] debug_check_no_obj_freed+0x19/0x20 [ 29.844049] [] kfree+0x191/0x210 [ 29.844054] [] ? pcibios_disable_device+0x20/0x30 [ 29.844066] [] ? 
rtsx_pci_remove+0x119/0x1d0 [rtsx_pci] [ 29.844071] [] rtsx_pci_remove+0x119/0x1d0 [rtsx_pci] [ 29.844075] [] pci_device_remove+0x3b/0xb0 [ 29.844079] [] __device_release_driver+0x7f/0xf0 [ 29.844082] [] device_release_driver+0x23/0x30 [ 29.844084] [] bus_remove_device+0xf4/0x170 [ 29.844087] [] device_del+0x135/0x1d0 [ 29.844089] [] pci_stop_bus_device+0x94/0xa0 [ 29.844091] [] pci_stop_and_remove_bus_device+0x12/0x20 [ 29.844094] [] disable_slot+0x76/0xd0 [ 29.844096] [] acpiphp_check_bridge+0xa8/0xd0 [ 29.844099] [] hotplug_event+0xfa/0x210 [ 29.844101] [] hotplug_event_work+0x27/0x60 [ 29.844104] [] process_one_work+0x178/0x470 [ 29.844106] [] worker_thread+0x121/0x3a0 [ 29.844109] [] ? manage_workers.isra.21+0x2b0/0x2b0 [ 29.844111] [] kthread+0xc0/0xd0 [ 29.844114] [] ? kthread_create_on_node+0x120/0x120 [ 29.844117] [] ret_from_fork+0x7c/0xb0 [ 29.844119] [] ? kthread_create_on_node+0x120/0x120 [ 29.844120] ---[ end trace ed9751fe6c0cd9e3 ]--- [ 29.844137] kobject: ':03:00.0' (880407a010a8): kobject_uevent_env [ 29.844150] kobject: ':03:00.0' (880407a010a8): fill_kobj_path: path = '/devices/pci:00/:00:1c.3/:03:00.0' [ 29.844162] kobject: ':03:00.0' (880407a010a8): kobject_cleanup [ 29.844164] kobject: ':03:00.0' (880407a010a8): calling ktype release [ 29.844166] kobject: ':03:00.0': free name [ 29.844367] kobject: 'rx-0' (8804067ae010): kobject_cleanup [ 29.844370] kobject: 'rx-0' (8804067ae010): auto cleanup 'remove' event [ 29.844371] kobject: 'rx-0' (8804067ae010): kobject_uevent_env [ 29.844374] kobject: 'rx-0' (8804067ae010): fill_kobj_path: path = '/devices/pci:00/:00:1c.3/:03:00.2/net/enp3s0f2/queues/rx-0' [ 29.844379] kobject: 'rx-0' (8804067ae010): auto cleanup kobject_del [ 29.844383] kobject: 'rx-0' (8804067ae010): calling ktype release [ 29.844384] kobject: 'rx-0': free name [ 29.844389] kobject: 'tx-0' (880407205e18): kobject_cleanup [ 29.844390] kobject: 'tx-0' (880407205e18): auto cleanup 'remove' event [ 29.844391] kobject: 'tx-0' 
(880407205e18): kobject_uevent_env [ 29.844393] kobject: 'tx-0' (880407205e18): fill_kobj_path: path = '/devices/pci:00/:00:1c.3/:03:00.2/net/enp3s0f2/queues/tx-0' [ 29.844396] kobject: 'tx-0' (880407205e18): auto cleanup kobject_del [ 29.844398] kobject: 'tx-0' (880407205e18): calling ktype release [ 29.844399] kobject: 'tx-0': free name [ 29.844400] kobject: 'queues' (880406216c78): kobject_cleanup [ 29.844401] kobject: 'queues' (880406216c78): auto cleanup kobject_del [ 29.844403] kobject: 'queues' (880406216c78): calling ktype release [ 29.844404] kobject: 'queues' (880406216c78): kset_release [ 29.844405] kobject: 'queues': free name [
Re: 3.12: ethernet controller missing after resuming from suspend to RAM
On 12/12/2013 06:58 PM, Rafael J. Wysocki wrote: > On Thursday, December 12, 2013 06:43:03 PM Francis Moreau wrote: [...] >> >> Actually I can see this now: >> >> [ 42.400974] r8169 :03:00.2: System wakeup disabled by ACPI > > This should be harmless. > > Please run lspci -vv before and after suspend and send both results. Please find them attached. Thanks. 00:00.0 Host bridge: Intel Corporation 3rd Gen Core processor DRAM Controller (rev 09) Subsystem: CLEVO/KAPOK Computer Device 0550 Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- 00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09) (prog-if 00 [VGA controller]) Subsystem: CLEVO/KAPOK Computer Device 0550 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- [disabled] Capabilities: Kernel driver in use: i915 Kernel modules: i915 00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller (rev 04) (prog-if 30 [XHCI]) Subsystem: CLEVO/KAPOK Computer Device 0550 Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- Kernel driver in use: xhci_hcd Kernel modules: xhci_hcd 00:16.0 Communication controller: Intel Corporation 7 Series/C210 Series Chipset Family MEI Controller #1 (rev 04) Subsystem: CLEVO/KAPOK Computer Device 0550 Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- Kernel driver in use: mei_me Kernel modules: mei_me 00:1a.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #2 (rev 04) (prog-if 
20 [EHCI]) Subsystem: CLEVO/KAPOK Computer Device 0550 Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- Kernel driver in use: ehci-pci Kernel modules: ehci_pci 00:1b.0 Audio device: Intel Corporation 7 Series/C210 Series Chipset Family High Definition Audio Controller (rev 04) Subsystem: CLEVO/KAPOK Computer Device 0550 Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- Kernel driver in use: snd_hda_intel Kernel modules: snd_hda_intel 00:1c.0 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 1 (rev c4) (prog-if 00 [Normal decode]) Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- Reset- FastB2B- PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn- Capabilities: Kernel driver in use: pcieport Kernel modules: shpchp 00:1c.2 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 3 (rev c4) (prog-if 00 [Normal decode]) Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- Reset- FastB2B- PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn- Capabilities: Kernel driver in use: pcieport Kernel modules: shpchp 00:1c.3 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 4 (rev c4) (prog-if 00 [Normal decode]) Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- Reset- FastB2B- PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn- Capabilities: Kernel driver in 
use: pcieport Kernel modules: shpchp 00:1c.4 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 5 (rev c4) (prog-if 00 [Normal decode]) Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- Reset- FastB2B- PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn- Capabilities:
Re: 3.12: ethernet controller missing after resuming from suspend to RAM
On 12/12/2013 09:00 AM, Francis Moreau wrote:
> Hello,
>
> I'm encountering an issue after resuming my system from suspend to RAM:
> my ethernet controller is missing, it seems that the kernel doesn't see
> it anymore. It's missing from /sys/class/net.
>
> Before suspending, this is what lspci gives.
>
> 03:00.2 Ethernet controller: Realtek Semiconductor Co., Ltd.
> RTL8111/8168 PCI Express Gigabit Ethernet controller (rev 0a)
>   Subsystem: CLEVO/KAPOK Computer Device 0540
>   Flags: bus master, fast devsel, latency 0, IRQ 46
>   I/O ports at e000 [size=256]
>   Memory at f0a04000 (64-bit, prefetchable) [size=4K]
>   Memory at f0a0 (64-bit, prefetchable) [size=16K]
>   Capabilities: [40] Power Management version 3
>   Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
>   Capabilities: [70] Express Endpoint, MSI 01
>   Capabilities: [b0] MSI-X: Enable- Count=4 Masked-
>   Capabilities: [d0] Vital Product Data
>   Capabilities: [100] Advanced Error Reporting
>   Capabilities: [160] Device Serial Number 02-00-00-00-68-4c-e0-00
>   Kernel driver in use: r8169
>   Kernel modules: r8169
>
> I can't see anything obvious in dmesg.
>
> Could anybody help me to fix this?
>

Additional information: I have the same symptom when doing suspend to disk.

Thanks.
3.12: ethernet controller missing after resuming from suspend to RAM
Hello,

I'm encountering an issue after resuming my system from suspend to RAM:
my ethernet controller is missing, it seems that the kernel doesn't see
it anymore. It's missing from /sys/class/net.

Before suspending, this is what lspci gives.

03:00.2 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168 PCI Express Gigabit Ethernet controller (rev 0a)
    Subsystem: CLEVO/KAPOK Computer Device 0540
    Flags: bus master, fast devsel, latency 0, IRQ 46
    I/O ports at e000 [size=256]
    Memory at f0a04000 (64-bit, prefetchable) [size=4K]
    Memory at f0a0 (64-bit, prefetchable) [size=16K]
    Capabilities: [40] Power Management version 3
    Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
    Capabilities: [70] Express Endpoint, MSI 01
    Capabilities: [b0] MSI-X: Enable- Count=4 Masked-
    Capabilities: [d0] Vital Product Data
    Capabilities: [100] Advanced Error Reporting
    Capabilities: [160] Device Serial Number 02-00-00-00-68-4c-e0-00
    Kernel driver in use: r8169
    Kernel modules: r8169

I can't see anything obvious in dmesg.

Could anybody help me to fix this?

Thanks.
Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
Hi,

On 12/10/2013 02:56 AM, micky wrote:
> Hi Francis:
> On 12/10/2013 09:39 AM, wwang wrote:
>> which is based on Thomas' patch.
>
> Can you help us test this patch, we disable irq while suspend here.

I'll give it a try tonight.

Thanks.
Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
Hi,

On 12/09/2013 11:17 PM, Samuel Ortiz wrote:
> Hi Francis,
>
> Adding Lee to the Cc list.
>
> On Tue, Dec 03, 2013 at 09:14:14AM +0100, Francis Moreau wrote:
>> Now that you did the hard work, I hope driver's maintainer/developer
>> will care about this issue.
> I applied Thomas' patch to mfd-fixes.
>

Thanks a lot to you and Thomas for that.

Please don't forget to propagate the fix to the affected stable trees.

Thanks.
Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
On 12/03/2013 09:14 AM, Francis Moreau wrote:
> Hello Thomas,
>
> On 12/02/2013 12:20 PM, Thomas Gleixner wrote:
>> On Mon, 2 Dec 2013, Thomas Gleixner wrote:
>>> On Sat, 30 Nov 2013, Francis Moreau wrote:
>>>> Hello Thomas,
>>>>
>>>> Sorry for the delay.
>>>>
>>>> On 11/29/2013 10:02 AM, Thomas Gleixner wrote:
>>>>> On Fri, 29 Nov 2013, Francis Moreau wrote:
>>>>>> Since it seems to be related to rtsx driver or its upper layer, could
>>>>>> the folks involved in this area have a look at this issue please ?
>>>>>
>>>>> I'm not involved, but looking at the debug objects backtrace it's
>>>>> related to the delayed work in rtsx.
>>>>>
>>>>> Does the untested patch below cure the issue?
>>>>>
>>>>
>>>> It seems it does since I can't see the debug object trace anymore
>>>> however I can see this now:
>>>
>>>
>>>
>>>> So I don't think it completely solves the problem but it's a good start.
>>>
>>> I kinda expected that, but I wanted to confirm my suspicion, that the
>>> interrupt hits after the delayed work is canceled and just requeues it
>>> again, which then leads to an armed timer being freed further down.
>>>
>>> I'm not familiar with that driver and I leave the final fixup to the
>>> driver maintainers. It's enough data for them to figure out the real
>>> solution.
>>
>> Just had a quick look and the obvious solution is to disable the
>> interrupts at the device level _BEFORE_ doing anything else in the
>> teardown path. Updated patch below. That should avoid the nobody cared
>> splat on the other irq line.
>>
>
> Yes it does.
>
> Now that you did the hard work, I hope driver's maintainer/developer
> will care about this issue.
>

Unfortunately he/she doesn't seem to care.

Moreover, I've been hit by this now:

[ 241.003324] INFO: task kworker/u16:4:108 blocked for more than 120 seconds.
[ 241.003331] Not tainted 3.12.2-1-ARCH #1
[ 241.003332] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 241.003335] kworker/u16:4 D 880405bc8000 0 108 2 0x
[ 241.003355] Workqueue: kmemstick memstick_check [memstick]
[ 241.003358] 880405bc3c90 0046 000144c0 880405bc3fd8
[ 241.003362] 880405bc3fd8 000144c0 880405bc8000 880405bc3c68
[ 241.003366] 814ef57c 880405bc3fd8 0286
[ 241.003370] Call Trace:
[ 241.003380] [] ? schedule_timeout+0x13c/0x290
[ 241.003385] [] ? detach_if_pending+0x120/0x120
[ 241.003388] [] ? detach_if_pending+0x120/0x120
[ 241.003392] [] schedule+0x29/0x70
[ 241.003396] [] schedule_timeout+0x219/0x290
[ 241.003401] [] ? vsnprintf+0x1e1/0x680
[ 241.003405] [] wait_for_common+0xd3/0x180
[ 241.003411] [] ? wake_up_process+0x40/0x40
[ 241.003414] [] wait_for_completion+0x1d/0x20
[ 241.003419] [] memstick_set_rw_addr+0x4a/0x50 [memstick]
[ 241.003424] [] memstick_check+0x10e/0x370 [memstick]
[ 241.003429] [] process_one_work+0x167/0x450
[ 241.003432] [] worker_thread+0x121/0x3a0
[ 241.003436] [] ? manage_workers.isra.23+0x2b0/0x2b0
[ 241.003441] [] kthread+0xc0/0xd0
[ 241.003446] [] ? kthread_create_on_node+0x120/0x120
[ 241.003450] [] ret_from_fork+0x7c/0xb0
[ 241.003454] [] ? kthread_create_on_node+0x120/0x120

This looks like a different issue.

I already blacklisted this driver; maybe it's time to mark it as broken ?

Thanks.
Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
Hello Thomas,

On 12/02/2013 12:20 PM, Thomas Gleixner wrote:
> On Mon, 2 Dec 2013, Thomas Gleixner wrote:
>> On Sat, 30 Nov 2013, Francis Moreau wrote:
>>> Hello Thomas,
>>>
>>> Sorry for the delay.
>>>
>>> On 11/29/2013 10:02 AM, Thomas Gleixner wrote:
>>>> On Fri, 29 Nov 2013, Francis Moreau wrote:
>>>>> Since it seems to be related to rtsx driver or its upper layer, could
>>>>> the folks involved in this area have a look at this issue please ?
>>>>
>>>> I'm not involved, but looking at the debug objects backtrace it's
>>>> related to the delayed work in rtsx.
>>>>
>>>> Does the untested patch below cure the issue?
>>>>
>>>
>>> It seems it does since I can't see the debug object trace anymore
>>> however I can see this now:
>>
>>
>>
>>> So I don't think it completely solves the problem but it's a good start.
>>
>> I kinda expected that, but I wanted to confirm my suspicion, that the
>> interrupt hits after the delayed work is canceled and just requeues it
>> again, which then leads to an armed timer being freed further down.
>>
>> I'm not familiar with that driver and I leave the final fixup to the
>> driver maintainers. It's enough data for them to figure out the real
>> solution.
>
> Just had a quick look and the obvious solution is to disable the
> interrupts at the device level _BEFORE_ doing anything else in the
> teardown path. Updated patch below. That should avoid the nobody cared
> splat on the other irq line.
>

Yes it does.

Now that you did the hard work, I hope the driver's maintainer/developer
will care about this issue.

Thank you.
Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
On 11/30/2013 09:17 PM, Rafael J. Wysocki wrote: [...] > If your system survives resume (I guess it does?), can you please send > /proc/interrupts before and after the first suspend/resume cycle? > Please find both dumps attached. Thanks CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7 0: 18 0 0 0 0 0 0 0 IO-APIC-edge timer 1:202 2 0 0 6 6 1 0 IO-APIC-edge i8042 9:381 11 7 2 28121 3 7 IO-APIC-fasteoi acpi 12: 17 1 0 1 2 1 0 0 IO-APIC-edge i8042 16: 3 0 0 1 2 4 0 1 IO-APIC-fasteoi ehci_hcd:usb3 23: 50 4 0 0 6 0 0 1 IO-APIC-fasteoi ehci_hcd:usb4 41: 10082499229182 7653435 112137 PCI-MSI-edge xhci_hcd 42:973 1 32 65 20126 17102 PCI-MSI-edge ahci 43: 26 0 0 0 0 0 0 0 PCI-MSI-edge mei_me 45: 21 33 0 0 6 1 0 0 PCI-MSI-edge i915 NMI: 0 0 0 0 0 0 0 0 Non-maskable interrupts LOC: 2071 1279951 1023 1177764 638700 Local timer interrupts SPU: 0 0 0 0 0 0 0 0 Spurious interrupts PMI: 0 0 0 0 0 0 0 0 Performance monitoring interrupts IWI: 56 79 36 21 17 43 39 38 IRQ work interrupts RTR: 12 0 0 0 0 0 0 0 APIC ICR read retries RES: 2033 2711 2288 1925 2039 1390 916 1161 Rescheduling interrupts CAL:419438479466491512 502477 Function call interrupts TLB: 63 2 3 11 1 7 5 0 TLB shootdowns TRM: 0 0 0 0 0 0 0 0 Thermal event interrupts THR: 0 0 0 0 0 0 0 0 Threshold APIC interrupts MCE: 0 0 0 0 0 0 0 0 Machine check exceptions MCP: 4 4 4 4 4 4 4 4 Machine check polls ERR: 0 MIS: 0 CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7 0: 18 0 0 0 0 0 0 0 IO-APIC-edge timer 1:100 2 0 0 6 6 1 0 IO-APIC-edge i8042 9:179 11 7 2 26118 3 6 IO-APIC-fasteoi acpi 12: 9 1 0 1 2 1 0 0 IO-APIC-edge i8042 16: 25 0 0 1 2 4 0 1 IO-APIC-fasteoi ehci_hcd:usb3 23: 30 0 0 0 6 0 0 1 IO-APIC-fasteoi ehci_hcd:usb4 40: 6 7 3 0 0 1 4 2 PCI-MSI-edge rtsx_pci 41: 7196491228181 7448425 109137 PCI-MSI-edge xhci_hcd 42:929 1 32 65 20125 17102 PCI-MSI-edge ahci 43: 15 0 0 0 0 7 2 0 PCI-MSI-edge mei_me 45: 14 33 0 0 5 1 0 0 PCI-MSI-edge i915 NMI: 0 0 0 0 0 0 0 0 Non-maskable interrupts LOC: 1331 1141847911 1104673 577642 Local timer 
interrupts SPU: 0 0 0 0 0 0 0 0 Spurious interrupts PMI: 0 0 0 0 0
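Comparing two /proc/interrupts snapshots by eye is error-prone, so the before/after comparison Rafael asked for can be sketched as a small script. This is only a minimal sketch: the helper names `parse_interrupts` and `diff_interrupts` are hypothetical, and the two-CPU sample snapshots below are made up for illustration (they are not the dumps attached to this message).

```python
def parse_interrupts(text):
    """Map each IRQ label to (total count across CPUs, description)."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        for field in fields[:ncpus]:
            if not field.isdigit():
                break  # rows such as "ERR: 0" carry fewer counters
            counts.append(int(field))
        desc = " ".join(fields[len(counts):])
        table[label.strip()] = (sum(counts), desc)
    return table


def diff_interrupts(before, after):
    """Yield one line per IRQ that appeared, vanished, or changed its total."""
    b, a = parse_interrupts(before), parse_interrupts(after)
    for irq in sorted(set(b) | set(a)):
        if irq not in a:
            yield f"{irq}: only in first snapshot ({b[irq][1]})"
        elif irq not in b:
            yield f"{irq}: only in second snapshot ({a[irq][1]})"
        elif a[irq][0] != b[irq][0]:
            yield f"{irq}: {b[irq][0]} -> {a[irq][0]} ({a[irq][1]})"


# Illustrative snapshots only (assumed values, two CPUs).
FIRST = """
           CPU0       CPU1
  0:         18          0   IO-APIC-edge      timer
 16:         25          1   IO-APIC-fasteoi   ehci_hcd:usb3
 40:          6          7   PCI-MSI-edge      rtsx_pci
"""
SECOND = """
           CPU0       CPU1
  0:         18          0   IO-APIC-edge      timer
 16:         30          1   IO-APIC-fasteoi   ehci_hcd:usb3
"""

for change in diff_interrupts(FIRST, SECOND):
    print(change)
```

On a real system the two strings would simply be the contents of /proc/interrupts saved before suspend and after resume.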
Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
On 11/30/2013 09:17 PM, Rafael J. Wysocki wrote:
> On Saturday, November 30, 2013 04:07:36 PM Francis Moreau wrote:
>> Hello Thomas,
>>
>> Sorry for the delay.
>>
>> On 11/29/2013 10:02 AM, Thomas Gleixner wrote:
>>> On Fri, 29 Nov 2013, Francis Moreau wrote:
>>>> Since it seems to be related to rtsx driver or its upper layer, could
>>>> the folks involved in this area have a look at this issue please ?
>>>
>>> I'm not involved, but looking at the debug objects backtrace it's
>>> related to the delayed work in rtsx.
>>>
>>> Does the untested patch below cure the issue?
>>>
>>
>> It seems it does since I can't see the debug object trace anymore
>> however I can see this now:
>
> So Thomas' patch should be applied to the rtsx driver.
>
>> [ 64.498270] irq 16: nobody cared (try booting with the "irqpoll" option)
>> [ 64.498314] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.11.0-rc2-ARCH #65
>> [ 64.498316] Hardware name: CLEVO CO.W55xEU
>>/W55xEU , BIOS 4.6.5
>> 03/05/2013
>> [ 64.498317] 8804078bd38c 88041e203e48 81459fe9
>> 8804078bd300
>> [ 64.498320] 88041e203e70 810d8632 8804078bd300
>> 0010
>> [ 64.498322] 88041e203eb0 810d8a58
>> 8136a882
>> [ 64.498324] Call Trace:
>> [ 64.498325][] dump_stack+0x54/0x8d
>> [ 64.498334] [] __report_bad_irq+0x32/0xd0
>> [ 64.498337] [] note_interrupt+0x138/0x1f0
>> [ 64.498340] [] ? cpuidle_enter_state+0x52/0xc0
>> [ 64.498343] [] handle_irq_event_percpu+0xf9/0x250
>> [ 64.498345] [] handle_irq_event+0x3d/0x60
>> [ 64.498347] [] handle_fasteoi_irq+0x5a/0x100
>> [ 64.498350] [] handle_irq+0x1e/0x30
>> [ 64.498353] [] do_IRQ+0x4d/0xc0
>> [ 64.498355] [] common_interrupt+0x6d/0x6d
>> [ 64.498356][] ? cpuidle_enter_state+0x52/0xc0
>> [ 64.498360] [] ? cpuidle_enter_state+0x48/0xc0
>> [ 64.498362] [] cpuidle_idle_call+0xc9/0x280
>> [ 64.498365] [] arch_cpu_idle+0xe/0x30
>> [ 64.498368] [] cpu_startup_entry+0x257/0x2d0
>> [ 64.498370] [] rest_init+0x84/0x90
>> [ 64.498373] [] start_kernel+0x414/0x420
>> [ 64.498375] [] ? repair_env_string+0x5c/0x5c
>> [ 64.498377] [] ? early_idt_handlers+0x120/0x120
>> [ 64.498379] [] x86_64_start_reservations+0x2a/0x2c
>> [ 64.498381] [] x86_64_start_kernel+0x108/0x117
>> [ 64.498382] handlers:
>> [ 64.498402] [] usb_hcd_irq [usbcore]
>> [ 64.498422] Disabling IRQ #16
>>
>> So I don't think it completely solves the problem but it's a good start.
>
> That issue may or may not be related.
>
> If your system survives resume (I guess it does?),

My system survives resume once the DEBUG_OBJECTS facility is enabled.

> can you please send
> /proc/interrupts before and after the first suspend/resume cycle?
>

Sure, I will do that later in the day.

Thanks for your help.
Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
Hello Thomas,

Sorry for the delay.

On 11/29/2013 10:02 AM, Thomas Gleixner wrote:
> On Fri, 29 Nov 2013, Francis Moreau wrote:
>> Since it seems to be related to rtsx driver or its upper layer, could
>> the folks involved in this area have a look at this issue please ?
>
> I'm not involved, but looking at the debug objects backtrace it's
> related to the delayed work in rtsx.
>
> Does the untested patch below cure the issue?
>

It seems it does, since I can't see the debug object trace anymore.
However, I can see this now:

[ 64.498270] irq 16: nobody cared (try booting with the "irqpoll" option)
[ 64.498314] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.11.0-rc2-ARCH #65
[ 64.498316] Hardware name: CLEVO CO.W55xEU /W55xEU , BIOS 4.6.5 03/05/2013
[ 64.498317] 8804078bd38c 88041e203e48 81459fe9 8804078bd300
[ 64.498320] 88041e203e70 810d8632 8804078bd300 0010
[ 64.498322] 88041e203eb0 810d8a58 8136a882
[ 64.498324] Call Trace:
[ 64.498325][] dump_stack+0x54/0x8d
[ 64.498334] [] __report_bad_irq+0x32/0xd0
[ 64.498337] [] note_interrupt+0x138/0x1f0
[ 64.498340] [] ? cpuidle_enter_state+0x52/0xc0
[ 64.498343] [] handle_irq_event_percpu+0xf9/0x250
[ 64.498345] [] handle_irq_event+0x3d/0x60
[ 64.498347] [] handle_fasteoi_irq+0x5a/0x100
[ 64.498350] [] handle_irq+0x1e/0x30
[ 64.498353] [] do_IRQ+0x4d/0xc0
[ 64.498355] [] common_interrupt+0x6d/0x6d
[ 64.498356][] ? cpuidle_enter_state+0x52/0xc0
[ 64.498360] [] ? cpuidle_enter_state+0x48/0xc0
[ 64.498362] [] cpuidle_idle_call+0xc9/0x280
[ 64.498365] [] arch_cpu_idle+0xe/0x30
[ 64.498368] [] cpu_startup_entry+0x257/0x2d0
[ 64.498370] [] rest_init+0x84/0x90
[ 64.498373] [] start_kernel+0x414/0x420
[ 64.498375] [] ? repair_env_string+0x5c/0x5c
[ 64.498377] [] ? early_idt_handlers+0x120/0x120
[ 64.498379] [] x86_64_start_reservations+0x2a/0x2c
[ 64.498381] [] x86_64_start_kernel+0x108/0x117
[ 64.498382] handlers:
[ 64.498402] [] usb_hcd_irq [usbcore]
[ 64.498422] Disabling IRQ #16

So I don't think it completely solves the problem, but it's a good start.

Thank you.
Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
Hello,

On 11/25/2013 11:47 AM, Rafael J. Wysocki wrote:
> On Monday, November 25, 2013 08:42:21 AM Francis Moreau wrote:
>> On 11/24/2013 10:06 PM, Rafael J. Wysocki wrote:
>>> On Sunday, November 24, 2013 10:39:20 AM Francis Moreau wrote:
>>>> Hello Thomas
>>>>
>>>> On 11/22/2013 11:27 PM, Thomas Gleixner wrote:
>>>>> On Fri, 22 Nov 2013, Rafael J. Wysocki wrote:
>>>>>> On Friday, November 22, 2013 10:36:23 PM Francis Moreau wrote:
>>>>>>> Ok, I've finally managed to find out the bad commit:
>>>>>>> ad07277e82dedabacc52c82746633680a3187d25: ACPI / PM: Hold acpi_scan_lock
>>>>>>> over system PM transitions
>>>>>>>
>>>>>>> I verified that the parent commit doesn't have the problem.
>>>>>>
>>>>>> Interesting.
>>>>>>
>>>>>>> Rafael, you're the man now ;)
>>>>>>
>>>>>> I kind of don't see how that commit may result in behavior that you
>>>>>> described earlier in the thread.
>>>>>>
>>>>>> You get a memory corruption that seems to have started to happen because
>>>>>> we're holding an additional lock over suspend resume now. Something's
>>>>>> fishy
>>>>>> on that machine and we need to figure out what it is.
>>>>>
>>>>> The hiccup happens in the timer softirq.
>>>>>
>>>>> @Francis: Did you try to enable DEBUG_OBJECTS.*. If not please give it
>>>>> a try.
>>>>
>>>> This looks like it was a good idea.
>>>>
>>>> The kernel now outputs the following traces after resuming.
>>>>
>>>> [ 26.973928] WARNING: CPU: 0 PID: 4 at lib/debugobjects.c:260
>>>> debug_print_object+0x83/0xa0()
>>>> [ 26.973932] ODEBUG: free active (active state 0) object type:
>>>> timer_list hint: delayed_work_timer_fn+0x0/0x20
>>>> [ 26.973972] Modules linked in: x86_pkg_temp_thermal intel_powerclamp
>>>> rtsx_pci_ms coretemp memstick kvm_intel i2c_i801 iTCO_wdt
>>>> iTCO_vendor_support i915 i2c_algo_bit intel_agp intel_gtt drm_kms_helper
>>>> r8169 drm kvm mii agpgart i2c_core lpc_ich ac shpchp crc32c_intel
>>>> battery thermal wmi evdev mei_me video mei button mperf processor
>>>> serio_raw microcode ext4 crc16 mbcache jbd2 sr_mod cdrom sd_mod
>>>> usb_storage rtsx_pci_sdmmc mmc_core ahci libahci libata ehci_pci
>>>> ehci_hcd xhci_hcd scsi_mod rtsx_pci usbcore usb_common
>>>> [ 26.974013] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted
>>>> 3.11.0-rc2-ARCH #64
>>>> [ 26.974014] Hardware name: CLEVO CO.W55xEU
>>>>/W55xEU , BIOS 4.6.5
>>>> 03/05/2013
>>>> [ 26.974019] Workqueue: kacpi_hotplug hotplug_event_work
>>>> [ 26.974020] 0009 880407d0da18 81459fe9
>>>> 880407d0da60
>>>> [ 26.974023] 880407d0da50 8104dc7d 880407fad488
>>>> 81836fc0
>>>> [ 26.974025] 81701358 81afef70 0003
>>>> 880407d0dab0
>>>> [ 26.974027] Call Trace:
>>>> [ 26.974031] [] dump_stack+0x54/0x8d
>>>> [ 26.974043] [] warn_slowpath_common+0x7d/0xa0
>>>> [ 26.974044] [] warn_slowpath_fmt+0x4c/0x50
>>>> [ 26.974047] [] debug_print_object+0x83/0xa0
>>>> [ 26.974050] [] ? queue_work_on+0x50/0x50
>>>> [ 26.974053] [] __debug_check_no_obj_freed+0x1fb/0x240
>>>> [ 26.974059] [] ? rtsx_pci_remove+0x119/0x1d0
>>>> [rtsx_pci]
>>>
>>> So a device driven by rtsx_pcr.c is removed after resume. Without the
>>> commit
>>> you've bisected it is removed as well, but that happens during resume, so
>>> rtsx_pci_resume() is likely not called in that case.
>>
>> I'm not sure I understand your point.
>
> The problem is that with the commit you've bisected, the whole removal of
> rtsx_pcr is likely done *before* the PM core calls resume callbacks of
> device drivers (although only incidentally, because it very well may be
> done in parallel with that). However, after that commit the removal is only
> done after the resume callbacks have been called, which means that the device
> is not physically present when rtsx_pci_resume() is called. Of course,
> it may not be physically present at that point anyway, so rtsx_pci_resume()
> should have taken that into consideration already, but it doesn't from what
> I can say.
>

Since it seems to be related to the rtsx driver or its upper layer, could
the folks involved in this area have a look at this issue please ?

Thank you
Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
On 11/24/2013 10:06 PM, Rafael J. Wysocki wrote:
> On Sunday, November 24, 2013 10:39:20 AM Francis Moreau wrote:
>> Hello Thomas
>>
>> On 11/22/2013 11:27 PM, Thomas Gleixner wrote:
>>> On Fri, 22 Nov 2013, Rafael J. Wysocki wrote:
>>>> On Friday, November 22, 2013 10:36:23 PM Francis Moreau wrote:
>>>>> Ok, I've finally managed to find out the bad commit:
>>>>> ad07277e82dedabacc52c82746633680a3187d25: ACPI / PM: Hold acpi_scan_lock
>>>>> over system PM transitions
>>>>>
>>>>> I verified that the parent commit doesn't have the problem.
>>>>
>>>> Interesting.
>>>>
>>>>> Rafael, you're the man now ;)
>>>>
>>>> I kind of don't see how that commit may result in behavior that you
>>>> described earlier in the thread.
>>>>
>>>> You get a memory corruption that seems to have started to happen because
>>>> we're holding an additional lock over suspend resume now. Something's
>>>> fishy
>>>> on that machine and we need to figure out what it is.
>>>
>>> The hiccup happens in the timer softirq.
>>>
>>> @Francis: Did you try to enable DEBUG_OBJECTS.*. If not please give it
>>> a try.
>>
>> This looks like it was a good idea.
>>
>> The kernel now outputs the following traces after resuming.
>>
>> [ 26.973928] WARNING: CPU: 0 PID: 4 at lib/debugobjects.c:260
>> debug_print_object+0x83/0xa0()
>> [ 26.973932] ODEBUG: free active (active state 0) object type:
>> timer_list hint: delayed_work_timer_fn+0x0/0x20
>> [ 26.973972] Modules linked in: x86_pkg_temp_thermal intel_powerclamp
>> rtsx_pci_ms coretemp memstick kvm_intel i2c_i801 iTCO_wdt
>> iTCO_vendor_support i915 i2c_algo_bit intel_agp intel_gtt drm_kms_helper
>> r8169 drm kvm mii agpgart i2c_core lpc_ich ac shpchp crc32c_intel
>> battery thermal wmi evdev mei_me video mei button mperf processor
>> serio_raw microcode ext4 crc16 mbcache jbd2 sr_mod cdrom sd_mod
>> usb_storage rtsx_pci_sdmmc mmc_core ahci libahci libata ehci_pci
>> ehci_hcd xhci_hcd scsi_mod rtsx_pci usbcore usb_common
>> [ 26.974013] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted
>> 3.11.0-rc2-ARCH #64
>> [ 26.974014] Hardware name: CLEVO CO.W55xEU
>>/W55xEU , BIOS 4.6.5
>> 03/05/2013
>> [ 26.974019] Workqueue: kacpi_hotplug hotplug_event_work
>> [ 26.974020] 0009 880407d0da18 81459fe9
>> 880407d0da60
>> [ 26.974023] 880407d0da50 8104dc7d 880407fad488
>> 81836fc0
>> [ 26.974025] 81701358 81afef70 0003
>> 880407d0dab0
>> [ 26.974027] Call Trace:
>> [ 26.974031] [] dump_stack+0x54/0x8d
>> [ 26.974043] [] warn_slowpath_common+0x7d/0xa0
>> [ 26.974044] [] warn_slowpath_fmt+0x4c/0x50
>> [ 26.974047] [] debug_print_object+0x83/0xa0
>> [ 26.974050] [] ? queue_work_on+0x50/0x50
>> [ 26.974053] [] __debug_check_no_obj_freed+0x1fb/0x240
>> [ 26.974059] [] ? rtsx_pci_remove+0x119/0x1d0
>> [rtsx_pci]
>
> So a device driven by rtsx_pcr.c is removed after resume. Without the commit
> you've bisected it is removed as well, but that happens during resume, so
> rtsx_pci_resume() is likely not called in that case.

I'm not sure I understand your point.

>
> I bet that there's a bug either in rtsx_pci_remove() or in rtsx_pci_resume().
> The latter definitely should check if the device is actually still present
> before scheduling the delayed work, but then the Boris' patch should take care
> of that anyway.
>

With Boris' patch applied, I still have the problem.

Thanks.
Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
Hello Rafael,

On 11/22/2013 11:08 PM, Rafael J. Wysocki wrote:
> On Friday, November 22, 2013 10:36:23 PM Francis Moreau wrote:
>> On 11/22/2013 01:54 PM, Rafael J. Wysocki wrote:
>>> On Friday, November 22, 2013 10:57:25 AM Francis Moreau wrote:
>>>> On 22/11/2013 08:43, Francis Moreau wrote:
>>>>> On 21/11/2013 12:17, Jingoo Han wrote:
>>>>> [...]
>>>>>>>
>>>>>>>> Also I took a look at the changes between v3.11 and v3.12 in this area
>>>>>>>> and those changes match the issue I'm facing:
>>>>>>>>
>>>>>>>> $ git log --oneline v3.11..v3.12 -- drivers/mfd/rtsx_pcr.c
>>>>>>>> 09fd867 mfd: rtsx: Copyright modifications
>>>>>>>> eb891c6 mfd: rtsx: Configure to enter a deeper power-saving mode in S3
>>>>>>>> 7140812 mfd: rtsx: Move some actions from rtsx_pci_init_hw to
>>>>>>>> individual
>>>>>>>> extra_init_hw
>>>>>>>> 5947c16 mfd: rtsx: Add shutdown callback in rtsx_pci_driver
>>>>>>>> 773ccdf mfd: rtsx: Read vendor setting from config space
>>>>>>
>>>>>> In my opinion, rtsx_pci_resume()/rtsx_pci_suspend() in realtek PCIe card
>>>>>> reader driver may make the kernel panic.
>>>>>>
>>>>>> I think that the commit "mfd: rtsx: Configure to enter a deeper
>>>>>> power-saving mode in S3" may be the culprit.
>>>>>
>>>>> Unfortunately no, reverting this commit on top of v3.12 doesn't help. I
>>>>> also reverted 7140812, 5947c16 but it didn't improve anything.
>>>>>
>>>>> The good news is that I managed to have a "light" kernel configuration
>>>>> which is faster to build and, more importantly, it seems that the bug is
>>>>> almost 100% reproducible now.
>>>>>
>>>>> So I'll try to do another git-bisect session later.
>>>>
>>>> So after bisecting between v3.11..v3.12 range, git bisect told me:
>>>>
>>>> the first bad commit is 551f5c74e17ba9257cdc35bf657ee448cad2d5b0
>>>>
>>>> Merge branch 'acpi-processor'
>>>>
>>>> * acpi-processor:
>>>> ACPI / processor: Acquire writer lock to update CPU maps
>>>> ACPI / processor: Remove acpi_processor_get_limit_info()
>>>>
>>>> The two commits brought by the merge are not the culprits because
>>>> resetting HEAD on "ACPI / processor: Acquire writer lock to update CPU
>>>> maps" doesn't have the issue anymore.
>>>>
>>>> At that point I'm not sure how to bisect further.
>>>
>>> Does the second parent of this merge (that is, 8462d9df9d50) have the
>>> problem?
>>>
>>
>> Yes it does.
>>
>> Ok, I've finally managed to find out the bad commit:
>> ad07277e82dedabacc52c82746633680a3187d25: ACPI / PM: Hold acpi_scan_lock
>> over system PM transitions
>>
>> I verified that the parent commit doesn't have the problem.
>
> Interesting.
>
>> Rafael, you're the man now ;)
>
> I kind of don't see how that commit may result in behavior that you
> described earlier in the thread.
>
> You get a memory corruption that seems to have started to happen because
> we're holding an additional lock over suspend resume now. Something's fishy
> on that machine and we need to figure out what it is.
>
> Please file a bug at bugzilla.kernel.org against ACPI and assign it to me.
> Please put all of the relevant info in there and attach the output of dmesg
> after a fresh boot and the output of acpidump from the affected machine to
> the bug entry.
>

I just sent a new report with DEBUG_OBJECTS enabled, which seems to give
some interesting traces.

If nothing can be found from them, I'll file the bug report.

Thanks.
Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
On 11/22/2013 01:54 PM, Rafael J. Wysocki wrote:
> On Friday, November 22, 2013 10:57:25 AM Francis Moreau wrote:
>> On 22/11/2013 08:43, Francis Moreau wrote:
>>> On 21/11/2013 12:17, Jingoo Han wrote:
>>> [...]
>>>>>
>>>>>> Also I took a look at the changes between v3.11 and v3.12 in this area
>>>>>> and those changes match the issue I'm facing:
>>>>>>
>>>>>> $ git log --oneline v3.11..v3.12 -- drivers/mfd/rtsx_pcr.c
>>>>>> 09fd867 mfd: rtsx: Copyright modifications
>>>>>> eb891c6 mfd: rtsx: Configure to enter a deeper power-saving mode in S3
>>>>>> 7140812 mfd: rtsx: Move some actions from rtsx_pci_init_hw to individual
>>>>>> extra_init_hw
>>>>>> 5947c16 mfd: rtsx: Add shutdown callback in rtsx_pci_driver
>>>>>> 773ccdf mfd: rtsx: Read vendor setting from config space
>>>>
>>>> In my opinion, rtsx_pci_resume()/rtsx_pci_suspend() in realtek PCIe card
>>>> reader driver may make the kernel panic.
>>>>
>>>> I think that the commit "mfd: rtsx: Configure to enter a deeper
>>>> power-saving mode in S3" may be the culprit.
>>>
>>> Unfortunately no, reverting this commit on top of v3.12 doesn't help. I
>>> also reverted 7140812, 5947c16 but it didn't improve anything.
>>>
>>> The good news is that I managed to have a "light" kernel configuration
>>> which is faster to build and, more importantly, it seems that the bug is
>>> almost 100% reproducible now.
>>>
>>> So I'll try to do another git-bisect session later.
>>
>> So after bisecting between v3.11..v3.12 range, git bisect told me:
>>
>> the first bad commit is 551f5c74e17ba9257cdc35bf657ee448cad2d5b0
>>
>> Merge branch 'acpi-processor'
>>
>> * acpi-processor:
>> ACPI / processor: Acquire writer lock to update CPU maps
>> ACPI / processor: Remove acpi_processor_get_limit_info()
>>
>> The two commits brought by the merge are not the culprits because
>> resetting HEAD on "ACPI / processor: Acquire writer lock to update CPU
>> maps" doesn't have the issue anymore.
>>
>> At that point I'm not sure how to bisect further.
>
> Does the second parent of this merge (that is, 8462d9df9d50) have the problem?
>

Yes it does.

Ok, I've finally managed to find the bad commit:
ad07277e82dedabacc52c82746633680a3187d25: ACPI / PM: Hold acpi_scan_lock
over system PM transitions

I verified that the parent commit doesn't have the problem.

Rafael, you're the man now ;)

Thanks
Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
On 22/11/2013 08:43, Francis Moreau wrote:
> On 21/11/2013 12:17, Jingoo Han wrote:
> [...]
>>>
>>>> Also I took a look at the changes between v3.11 and v3.12 in this area
>>>> and those changes match the issue I'm facing:
>>>>
>>>> $ git log --oneline v3.11..v3.12 -- drivers/mfd/rtsx_pcr.c
>>>> 09fd867 mfd: rtsx: Copyright modifications
>>>> eb891c6 mfd: rtsx: Configure to enter a deeper power-saving mode in S3
>>>> 7140812 mfd: rtsx: Move some actions from rtsx_pci_init_hw to individual
>>>> extra_init_hw
>>>> 5947c16 mfd: rtsx: Add shutdown callback in rtsx_pci_driver
>>>> 773ccdf mfd: rtsx: Read vendor setting from config space
>>
>> In my opinion, rtsx_pci_resume()/rtsx_pci_suspend() in realtek PCIe card
>> reader driver may make the kernel panic.
>>
>> I think that the commit "mfd: rtsx: Configure to enter a deeper
>> power-saving mode in S3" may be the culprit.
>
> Unfortunately no, reverting this commit on top of v3.12 doesn't help. I
> also reverted 7140812, 5947c16 but it didn't improve anything.
>
> The good news is that I managed to have a "light" kernel configuration
> which is faster to build and, more importantly, it seems that the bug is
> almost 100% reproducible now.
>
> So I'll try to do another git-bisect session later.

So after bisecting the v3.11..v3.12 range, git bisect told me:

the first bad commit is 551f5c74e17ba9257cdc35bf657ee448cad2d5b0

Merge branch 'acpi-processor'

* acpi-processor:
  ACPI / processor: Acquire writer lock to update CPU maps
  ACPI / processor: Remove acpi_processor_get_limit_info()

The two commits brought in by the merge are not the culprits, because
resetting HEAD to "ACPI / processor: Acquire writer lock to update CPU
maps" doesn't show the issue anymore.

At that point I'm not sure how to bisect further.

Hope that helps.

Thanks
Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
On 21/11/2013 12:17, Jingoo Han wrote:
[...]
>>
>>> Also I took a look at the changes between v3.11 and v3.12 in this area
>>> and those changes match the issue I'm facing:
>>>
>>> $ git log --oneline v3.11..v3.12 -- drivers/mfd/rtsx_pcr.c
>>> 09fd867 mfd: rtsx: Copyright modifications
>>> eb891c6 mfd: rtsx: Configure to enter a deeper power-saving mode in S3
>>> 7140812 mfd: rtsx: Move some actions from rtsx_pci_init_hw to individual
>>> extra_init_hw
>>> 5947c16 mfd: rtsx: Add shutdown callback in rtsx_pci_driver
>>> 773ccdf mfd: rtsx: Read vendor setting from config space
>
> In my opinion, rtsx_pci_resume()/rtsx_pci_suspend() in realtek PCIe card
> reader driver may make the kernel panic.
>
> I think that the commit "mfd: rtsx: Configure to enter a deeper
> power-saving mode in S3" may be the culprit.

Unfortunately no, reverting this commit on top of v3.12 doesn't help. I
also reverted 7140812 and 5947c16, but it didn't improve anything.

The good news is that I managed to get a "light" kernel configuration
which is faster to build and, more importantly, it seems that the bug is
almost 100% reproducible now.

So I'll try to do another git-bisect session later.

Thanks.
Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
Hello Borislav,

On 11/19/2013 11:15 AM, Borislav Petkov wrote:
> On Tue, Nov 19, 2013 at 11:01:14AM +0100, Francis Moreau wrote:
>> I think the easiest way to do it is to install a minimal system on a
>> USB stick and try to reproduce first in order to preserve my system.
>
> Yep, sounds simple enough.
>
>> Then I'll try to see if this issue exists in a previous kernel version
>> and if so, I'll do a git-bisect session.
>>
>> I can't find a quicker way to do that although using git-bisect (which
>> implies several kernel builds) is a PITA.
>
> You can start with a coarse bisect by testing the major kernel versions
> first, i.e. 3.11, 3.10, 3.9 ... and once you find good and bad, then you
> can do the git-bisect thing.
>

Unfortunately the bisect session didn't give any conclusive results: I
couldn't be sure whether a specific revision was good or bad because the
bug wasn't reproducible every time.

But I got a different kernel oops on my stripped system that may give us
a clue: http://imgur.com/zdCknbY

Does this help ?

Thanks.
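The difficulty described above, deciding whether a revision is "good" when the bug does not trigger on every resume, can be made a little more concrete with a rough back-of-the-envelope sketch. It assumes each suspend/resume cycle on a bad kernel triggers the bug independently with probability p; the values of p below are purely illustrative (nothing in this thread measures them), and `cycles_needed` is a hypothetical helper name.

```python
import math


def cycles_needed(p, alpha=0.05):
    """Clean suspend/resume cycles needed before calling a revision 'good'.

    A bad revision survives n cycles with probability (1 - p) ** n, so we
    need the smallest n such that (1 - p) ** n <= alpha.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.ceil(math.log(alpha) / math.log(1.0 - p))


# Illustrative reproduction rates only (assumptions, not measurements).
for p in (0.5, 0.2, 0.1):
    print(f"p={p}: {cycles_needed(p)} clean cycles for 95% confidence")
```

Under these assumptions, a bug that shows up in only one resume out of ten needs a few dozen clean cycles per bisect step, which is why a configuration that reproduces the bug more reliably makes bisection far more practical.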
Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
On 17/11/2013 23:46, Borislav Petkov wrote:
> On Sun, Nov 17, 2013 at 11:34:20PM +0100, Rafael J. Wysocki wrote:
>> This looks like a softirq bug to me (and related to cpuidle).
>
> Reportedly, it happens right after resume from RAM. Francis, is that
> correct?

Yes, that's correct. I haven't been hit by this issue otherwise.
Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
On 17/11/2013 23:34, Rafael J. Wysocki wrote:
> On Sunday, November 17, 2013 11:06:12 PM Borislav Petkov wrote:
>> On Sun, Nov 17, 2013 at 09:49:40PM +0100, Francis Moreau wrote:
>>> On Sun, Nov 17, 2013 at 8:53 PM, Borislav Petkov wrote:
>>>> On Sun, Nov 17, 2013 at 07:02:21PM +0100, Francis Moreau wrote:
>>>>> Sorry I haven't taken the original picture large enough, and getting
>>>>> this kernel panic is pretty hard since the kernel usually displays the
>>>>> black screen.
>>>>
>>>> Ok, just try to make a readable picture of the whole line, next time you
>>>> trigger it.
>>>>
>>>>> I can't find any traces of this function in the dump...
>>>>
>>>> Hmm, strange. Can you upload the whole vmlinux somewhere? Or is this the
>>>> official archlinux kernel? If so, where can I get it from?
>>>
>>> Yes, you can download the bin package from :
>>> https://www.archlinux.org/packages/core/x86_64/linux/
>>>
>>> The bin package is a tar archive, so it pretty straightforward to
>>> unpack the vmlinux file (actual is filename vmlinuz-linux).
>>
>> Ok, here's what I was able to see: rIP points to call_timer_fn+0x33
>> which is this:
>>
>> 8106f590 <call_timer_fn>:
>> 8106f590:  e8 2b b2 48 00        callq  814fa7c0 <__fentry__>
>> 8106f595:  55                    push   %rbp
>> 8106f596:  65 48 8b 04 25 70 c7  mov    %gs:0xc770,%rax
>> 8106f59d:  00 00
>> 8106f59f:  48 89 e5              mov    %rsp,%rbp
>> 8106f5a2:  41 57                 push   %r15
>> 8106f5a4:  49 89 d7              mov    %rdx,%r15
>> 8106f5a7:  41 56                 push   %r14
>> 8106f5a9:  49 89 f6              mov    %rsi,%r14
>> 8106f5ac:  41 55                 push   %r13
>> 8106f5ae:  41 54                 push   %r12
>> 8106f5b0:  49 89 fc              mov    %rdi,%r12
>> 8106f5b3:  53                    push   %rbx
>> 8106f5b4:  44 8b a8 44 e0 ff ff  mov    -0x1fbc(%rax),%r13d
>> 8106f5bb:  0f 1f 44 00 00        nopl   0x0(%rax,%rax,1)
>> 8106f5c0:  4c 89 ff              mov    %r15,%rdi
>> 8106f5c3:  41 ff d6              callq  *%r14    <--- faulting insn
>> 8106f5c6:  0f 1f 44 00 00        nopl   0x0(%rax,%rax,1)
>> 8106f5cb:  65 48 8b 04 25 70 c7  mov    %gs:0xc770,%rax
>> 8106f5d2:  00 00
>> 8106f5d4:  44 39 a8 44 e0 ff ff  cmp    %r13d,-0x1fbc(%rax)
>>
>> and the virtual address in rIP is 8106f5c3, i.e. the same one
>> as in the photo. Thus, the CALL instruction tries to call the timer
>> function 'fn' which we pass as an argument to call_timer_fn.
>>
>> However, the address we're trying to call in %r14 is garbage:
>> 0x455300323d504544 and not in canonical form, causing the #GP.
>>
>> So basically what happens is suspend to RAM corrupts something
>> containing one or more timer functions and we end up calling crap after
>> resume.
>>
>> If you want to debug this further, you could try playing through
>> Documentation/power/basic-pm-debugging.txt and see whether suspend to
>> disk works. There's also a section 2 which talks about testing suspend
>> to RAM which could be of help.
>>
>> But let me add Rafael and Thomas - they should have much better ideas
>> than me.
>>
>> Guys, thread starts here:
>> http://marc.info/?l=linux-kernel&m=138468134321335
>
> This looks like a softirq bug to me (and related to cpuidle).
>
> I'm wondering if that happens with any of the older kernels or just 3.12?
>

I can try tonight to find an older kernel package and see whether it happens there too.
Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
Hello Borislav,

On 17/11/2013 23:06, Borislav Petkov wrote:
> On Sun, Nov 17, 2013 at 09:49:40PM +0100, Francis Moreau wrote:
>> On Sun, Nov 17, 2013 at 8:53 PM, Borislav Petkov wrote:
>>> On Sun, Nov 17, 2013 at 07:02:21PM +0100, Francis Moreau wrote:
>>>> Sorry I haven't taken the original picture large enough, and getting
>>>> this kernel panic is pretty hard since the kernel usually displays the
>>>> black screen.
>>>
>>> Ok, just try to make a readable picture of the whole line, next time you
>>> trigger it.
>>>
>>>> I can't find any traces of this function in the dump...
>>>
>>> Hmm, strange. Can you upload the whole vmlinux somewhere? Or is this the
>>> official archlinux kernel? If so, where can I get it from?
>>
>> Yes, you can download the bin package from :
>> https://www.archlinux.org/packages/core/x86_64/linux/
>>
>> The bin package is a tar archive, so it pretty straightforward to
>> unpack the vmlinux file (actual is filename vmlinuz-linux).
>
> Ok, here's what I was able to see: rIP points to call_timer_fn+0x33
> which is this:
>
> 8106f590 <call_timer_fn>:
> 8106f590:  e8 2b b2 48 00        callq  814fa7c0 <__fentry__>
> 8106f595:  55                    push   %rbp
> 8106f596:  65 48 8b 04 25 70 c7  mov    %gs:0xc770,%rax
> 8106f59d:  00 00
> 8106f59f:  48 89 e5              mov    %rsp,%rbp
> 8106f5a2:  41 57                 push   %r15
> 8106f5a4:  49 89 d7              mov    %rdx,%r15
> 8106f5a7:  41 56                 push   %r14
> 8106f5a9:  49 89 f6              mov    %rsi,%r14
> 8106f5ac:  41 55                 push   %r13
> 8106f5ae:  41 54                 push   %r12
> 8106f5b0:  49 89 fc              mov    %rdi,%r12
> 8106f5b3:  53                    push   %rbx
> 8106f5b4:  44 8b a8 44 e0 ff ff  mov    -0x1fbc(%rax),%r13d
> 8106f5bb:  0f 1f 44 00 00        nopl   0x0(%rax,%rax,1)
> 8106f5c0:  4c 89 ff              mov    %r15,%rdi
> 8106f5c3:  41 ff d6              callq  *%r14    <--- faulting insn
> 8106f5c6:  0f 1f 44 00 00        nopl   0x0(%rax,%rax,1)
> 8106f5cb:  65 48 8b 04 25 70 c7  mov    %gs:0xc770,%rax
> 8106f5d2:  00 00
> 8106f5d4:  44 39 a8 44 e0 ff ff  cmp    %r13d,-0x1fbc(%rax)
>
> and the virtual address in rIP is 8106f5c3, i.e. the same one
> as in the photo. Thus, the CALL instruction tries to call the timer
> function 'fn' which we pass as an argument to call_timer_fn.
>
> However, the address we're trying to call in %r14 is garbage:
> 0x455300323d504544 and not in canonical form, causing the #GP.
>

Thanks for digging this out!

Just out of curiosity, running "objdump -D" doesn't seem to show the same thing here. How did you get a dump that includes function names, for example?

> So basically what happens is suspend to RAM corrupts something
> containing one or more timer functions and we end up calling crap after
> resume.
>
> If you want to debug this further, you could try playing through
> Documentation/power/basic-pm-debugging.txt and see whether suspend to
> disk works. There's also a section 2 which talks about testing suspend
> to RAM which could be of help.

The thing is that I'd like to avoid oopsing my kernel, so as not to corrupt my filesystem.

Thanks
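The "not in canonical form" remark above can be checked by hand. The sketch below is an illustration, not something posted in the thread; it assumes the common x86_64 layout with 48 implemented virtual-address bits, where bits 63..47 must all be equal, and the helper name is made up. It also dumps the raw bytes of the bad pointer, since a corrupted function pointer sometimes turns out to be overwritten with data.

```python
def is_canonical_x86_64(addr: int, va_bits: int = 48) -> bool:
    """True if bits 63..(va_bits - 1) of a 64-bit address are all equal.

    va_bits=48 is an assumption (4-level paging); it is not stated in the thread.
    """
    top = addr >> (va_bits - 1)                 # bit 47 and everything above
    all_ones = (1 << (64 - va_bits + 1)) - 1    # 17 one-bits when va_bits == 48
    return top == 0 or top == all_ones


if __name__ == "__main__":
    r14 = 0x455300323D504544                    # value reported for %r14

    print(is_canonical_x86_64(r14))                  # False
    print(is_canonical_x86_64(0xFFFF800000000000))   # True (lowest kernel-half address)

    # Bytes as they would sit in memory on little-endian x86.
    print(r14.to_bytes(8, "little"))                 # b'DEP=2\x00SE'
```

The byte dump reads as printable ASCII ("DEP=2", a NUL, then "SE"), which would be consistent with string data having been written over the timer's function pointer. That is only an observation from the value itself, not a diagnosis.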
Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
On Sun, Nov 17, 2013 at 8:53 PM, Borislav Petkov wrote:
> On Sun, Nov 17, 2013 at 07:02:21PM +0100, Francis Moreau wrote:
>> Sorry I haven't taken the original picture large enough, and getting
>> this kernel panic is pretty hard since the kernel usually displays the
>> black screen.
>
> Ok, just try to make a readable picture of the whole line, next time you
> trigger it.
>
>> I can't find any traces of this function in the dump...
>
> Hmm, strange. Can you upload the whole vmlinux somewhere? Or is this the
> official archlinux kernel? If so, where can I get it from?

Yes, you can download the binary package from:
https://www.archlinux.org/packages/core/x86_64/linux/

The binary package is a tar archive, so it's pretty straightforward to unpack the vmlinux file (the actual filename is vmlinuz-linux).

Thanks for your help.
--
Francis
Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
On Sun, Nov 17, 2013 at 5:01 PM, Borislav Petkov wrote:
> On Sun, Nov 17, 2013 at 04:50:23PM +0100, Francis Moreau wrote:
>> AFAIK, the kernel has 2 simple patches on top of the vanilla one.
>> They're both are trivial and can't be related to this issue.
>>
>> You can have look to them here:
>> https://wiki.archlinux.org/index.php/Kernels#Official_packages
>
> Ok.
>
>> Assuming that I'm running an upstream kernel, it's almost 100%
>> reproducible.
>
> Is there any chance you can catch the whole oops, esp. keep the Code:
> line complete?

Sorry, I didn't frame the original picture widely enough, and capturing this kernel panic is pretty hard since the kernel usually just displays a black screen.

>
> Also, can you do:
>
> $ objdump -d vmlinux | less
>
> then search for 'call_timer_fn' and paste the whole function somewhere.

I can't find any trace of this function in the dump...

>
> Also, can you catch a full dmesg and upload that somewhere too?

http://paste.debian.net/66294/

Thanks.
--
Francis
Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
On 17/11/2013 14:25, Borislav Petkov wrote:
> On Sun, Nov 17, 2013 at 10:42:05AM +0100, Francis Moreau wrote:
>> Today I got a different behaviour, after resuming I got a
>> kernel panic. I could take a picture of the laptop screen:
>> http://imgur.com/f5uWFTY
>
> Does archlinux ship the upstream kernel or do they have patches ontop?
> If "yes" to the last one, try reproducing this panic with the upstream
> kernel 3.12.

AFAIK, the kernel has 2 simple patches on top of the vanilla one. Both are trivial and can't be related to this issue.

You can have a look at them here:
https://wiki.archlinux.org/index.php/Kernels#Official_packages

>
> In general, how reliably can you reproduce the kernel panic with the
> upstream 3.12 kernel?
>

Assuming that I'm running an upstream kernel, it's almost 100% reproducible.

Thanks.
3.12: kernel panic when resuming from suspend to RAM (x86_64)
Hello,

I recently acquired a new laptop. After installing archlinux, which ships a 3.12 kernel, I've had trouble after every resume from suspend to RAM.

The behaviour is as follows: each resume appears to work and my session seems to be restored, but after I type a command in the terminal I get a black screen and the fan becomes noisy, which seems to indicate that the CPUs are running flat out.

Today I got a different behaviour: after resuming I got a kernel panic. I was able to take a picture of the laptop screen: http://imgur.com/f5uWFTY

Could anybody help me sort this out?

Thanks.
Re: bcache: process get stucks when doing write IOs in writeback mode
Hello,

It doesn't seem my initial post reached LKML; maybe that's due to the dmesg file I initially attached. So I'm replying to it, hoping it gets through this time now that the attachment is gone.

On Mon, Nov 11, 2013 at 6:45 PM, Francis Moreau wrote:
> Hello,
>
> [ Resending this issue to LKML to reach a wider audience since I've
> got no answer so far on bcache mailing list and it seems a pretty
> major bug in that component ]
>
> I'm using bcache on a very basic setup: no MD or LVM involved.
> /dev/sda4 (900MB) is the backing device while /dev/sdb (120G) is the
> cache device. On top of bcache0 I'm using ext4 and I'm using it as my
> root device.
>
> I initially created the bcache0 device with defaults, using writethrough
> mode. I haven't (yet) experienced any issues using this mode: I
> successfully installed my system (archlinux) on it.
>
> I decided to switch to writeback mode and encountered the same issue
> several times: after doing a lot of IOs (for example when installing new
> packages) one process is stuck in D state. Currently I can see this:
>
> # ps aux | grep D+
> root      1080  0.0  0.0  41796  5728 pts/0    D+   12:59   0:00
> gtk-update-icon
>
> # cat /proc/1080/stack
> [] sleep_on_page+0xe/0x20
> [] wait_on_page_bit+0x7f/0x90
> [] filemap_fdatawait_range+0x11b/0x1a0
> [] filemap_write_and_wait_range+0x3f/0x70
> [] ext4_sync_file+0xba/0x390 [ext4]
> [] do_fsync+0x56/0x80
> [] SyS_fsync+0x10/0x20
> [] system_call_fastpath+0x1a/0x1f
> [] 0x
>
> From that point I'm not really sure what I should do to restore the
> system without losing or badly breaking my rootfs. Any advice is
> welcome.
>
> Please find below some additional information that might help to fix
> this issue:
>
> # mount | grep bcache
> /dev/bcache0 on / type ext4 (rw,relatime,data=ordered)
>
> # uname -r
> 3.11.6-1-ARCH
>
> # bcache-super-show /dev/sda4
> sb.magic                  ok
> sb.first_sector           8 [match]
> sb.csum                   F828E134D5AB890C [match]
> sb.version                1 [backing device]
>
> dev.label                 (empty)
> dev.uuid                  62839366-e5a9-43a9-9984-fc8f2aefe9de
> dev.sectors_per_block     1
> dev.sectors_per_bucket    1024
> dev.data.first_sector     16
> dev.data.cache_mode       1 [writeback]
> dev.data.cache_state      2 [dirty]
>
> cset.uuid                 50485be4-15f7-424f-a01b-4c65fdf8487d
>
> # bcache-super-show /dev/sdb
> sb.magic                  ok
> sb.first_sector           8 [match]
> sb.csum                   692BB25984E31571 [match]
> sb.version                3 [cache device]
>
> dev.label                 (empty)
> dev.uuid                  a63ec68a-6a71-497e-86db-0dd71bbfb404
> dev.sectors_per_block     1
> dev.sectors_per_bucket    1024
> dev.cache.first_sector    1024
> dev.cache.cache_sectors   234439680
> dev.cache.total_sectors   234440704
> dev.cache.ordered         yes
> dev.cache.discard         yes
> dev.cache.pos             0
> dev.cache.replacement     0 [lru]
>
> cset.uuid                 50485be4-15f7-424f-a01b-4c65fdf8487d
>
> I attached dmesg output which has been generated after doing "echo t
> >/proc/sysrq-trigger"
>
> Thanks
> --
> Francis

--
Francis
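The cache-device superblock figures quoted above can be sanity-checked with a little arithmetic. This is a sketch added for illustration, assuming the sector counts printed by bcache-super-show are in 512-byte units (the output itself does not state the unit); the variable names simply mirror the field names.

```python
SECTOR_BYTES = 512  # assumption: bcache-super-show counts 512-byte sectors

# Values copied from the /dev/sdb output quoted above.
first_sector = 1024
cache_sectors = 234439680
total_sectors = 234440704
sectors_per_bucket = 1024

# The cache area plus the reserved space in front of it covers the whole device.
assert first_sector + cache_sectors == total_sectors

total_bytes = total_sectors * SECTOR_BYTES
bucket_bytes = sectors_per_bucket * SECTOR_BYTES
n_buckets = cache_sectors // sectors_per_bucket

print(total_bytes)    # 120033640448, i.e. about 120 GB (decimal)
print(bucket_bytes)   # 524288, i.e. 512 KiB per bucket
print(n_buckets)      # 228945
```

Under that assumption the total comes out at roughly 120 GB, which matches the "/dev/sdb (120G)" description in the report.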
Why aren't all serial devices named ttyS* ?
Hi,

I'm just wondering why the device nodes of some serial drivers (mostly when arch != x86) are not named "ttyS[:digit:]".

For example, I have an ARM-based platform which has a serial device node named "ttymxc0".

I don't see any advantage in doing this; it only forces one to handle special cases, since most applications expect the ttyS* name.

Thanks
--
Francis
Re: Question on timekeeping subsystem
Hello Roman,

On Thu, Feb 14, 2008 at 2:37 AM, Roman Zippel <[EMAIL PROTECTED]> wrote:
>
> These mails should help to understand, what this code does:
>
> http://lkml.org/lkml/2006/3/4/61
> http://lkml.org/lkml/2006/4/3/205
>

Indeed! They look interesting after a quick look, but I haven't had time yet to read them carefully.

Thanks
--
Francis
Question on timekeeping subsystem
Hello,

I looked at this subsystem, trying to understand how it works on Linux, but, call me a dumb xxx, I think I'm really missing something.

First I tried to find some documentation on the current implementation but haven't found anything really useful. In particular there's nothing about it in the Documentation/ directory. Please correct me if I'm already wrong.

Actually I read the implementation of update_wall_time() and I really fail to understand how it works. This is probably because I don't know what the "xtime_nsec" and "error" fields in the clocksource struct are for. These fields are not documented anywhere in the source code, so it should be obvious, but unfortunately it isn't to me.

Another example: almost the first thing done by this function is:

    clock->xtime_nsec += (s64)xtime.tv_nsec << clock->shift;

What the hell is this? I know I'm stupid, but please enlighten me ;)

Thanks
--
Francis
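One plausible reading of the shifted assignment asked about above, offered as an illustration rather than as an answer given in this thread: the left shift converts nanoseconds into a fixed-point value with `shift` fractional bits, so that sub-nanosecond remainders survive from one update to the next. The sketch below is not kernel code; SHIFT and MULT are small hypothetical values picked so the arithmetic can be followed by hand, and the function names are made up.

```python
SHIFT = 8     # hypothetical number of fractional bits
MULT = 640    # hypothetical: 640 / 2**8 == 2.5 ns per clock cycle


def tick_shifted(tv_nsec: int, frac: int, cycles: int) -> tuple[int, int]:
    """Advance time by `cycles`, carrying the sub-ns fraction forward."""
    acc = (tv_nsec << SHIFT) + frac     # nanoseconds in fixed-point form
    acc += cycles * MULT                # add the elapsed time, still shifted
    tv_nsec = acc >> SHIFT              # whole nanoseconds
    frac = acc - (tv_nsec << SHIFT)     # remainder kept for the next update
    return tv_nsec, frac


def tick_naive(tv_nsec: int, cycles: int) -> int:
    """Same update, but the fraction is thrown away every time."""
    return tv_nsec + ((cycles * MULT) >> SHIFT)


if __name__ == "__main__":
    ns, frac, naive = 0, 0, 0
    for _ in range(4):                  # four updates of one cycle each
        ns, frac = tick_shifted(ns, frac, 1)
        naive = tick_naive(naive, 1)
    print(ns, frac, naive)              # 10 0 8
```

Four cycles at 2.5 ns each are 10 ns; the fixed-point accumulator gets there, while truncating on every update drifts to 8 ns. Whether the kernel's xtime_nsec and error fields behave exactly like this should be checked against the source.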
Re: [Kbuild] How to clean a particular directory ?
On Jan 31, 2008 9:54 AM, Paul Mundt <[EMAIL PROTECTED]> wrote:
> Makefile says:
>
> # Use make M=dir to specify directory of external module to build
> # Old syntax make ... SUBDIRS=$PWD is still supported
> # Setting the environment variable KBUILD_EXTMOD take precedence
> ifdef SUBDIRS
>   KBUILD_EXTMOD ?= $(SUBDIRS)
> endif
> ifdef M
>   ifeq ("$(origin M)", "command line")
>     KBUILD_EXTMOD := $(M)
>   endif
> endif
>
> so M= is apparently the newfangled (and undocumented) way of doing this.
>

Good catch. I tried "make help" to search for relevant info but didn't find anything useful.

Thanks
--
Francis
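The precedence encoded in the quoted Makefile fragment can be modelled in a few lines. This is a simplified sketch of that fragment only, not of Kbuild as a whole; the function and its parameters are made up, with `None` standing in for "variable not defined".

```python
def resolve_kbuild_extmod(env_extmod=None, subdirs=None, m=None, m_origin=None):
    """Model of the quoted fragment; returns the resulting KBUILD_EXTMOD.

    env_extmod: KBUILD_EXTMOD from the environment, or None if unset
    subdirs:    value of SUBDIRS, or None
    m:          value of M, or None
    m_origin:   what $(origin M) would report, e.g. "command line"
    """
    extmod = env_extmod
    if subdirs:                          # ifdef SUBDIRS
        if extmod is None:               # "?=" assigns only when still undefined
            extmod = subdirs
    if m:                                # ifdef M
        if m_origin == "command line":   # ifeq ("$(origin M)", "command line")
            extmod = m                   # ":=" overrides unconditionally
    return extmod


if __name__ == "__main__":
    print(resolve_kbuild_extmod(subdirs="drivers/char"))
    print(resolve_kbuild_extmod(env_extmod="/ext", subdirs="drivers/char"))
    print(resolve_kbuild_extmod(subdirs="a", m="b", m_origin="command line"))
```

Read this way, SUBDIRS is only a fallback, a KBUILD_EXTMOD already present in the environment wins over it, and M given on the command line wins over both.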
Re: [Kbuild] How to clean a particular directory ?
On Jan 31, 2008 9:48 AM, Paul Mundt <[EMAIL PROTECTED]> wrote:
>
> On Thu, Jan 31, 2008 at 09:38:10AM +0100, Francis Moreau wrote:
> > I'd like to clean a particular directory in the kernel tree.
> >
> > I tried several things such as:
> >
> > $ make drivers/char clean
> > $ make -f scripts/Makefile.clean obj=drivers/char
> >
> > But it doesn't work.
> >
> > Could anybody give me a hint ?
>
> make SUBDIRS=drivers/char clean
>
> should do the trick. Kbuild might have a magic incantation for it these
> days, but that's the way it used to work, and still seems to.
>

Thanks Paul, it works fine.
--
Francis
[Kbuild] How to clean a particular directory ?
Hello,

I'd like to clean a particular directory in the kernel tree.

I tried several things such as:

$ make drivers/char clean
$ make -f scripts/Makefile.clean obj=drivers/char

But neither works.

Could anybody give me a hint?

Thanks
--
Francis
Re: Question about DMA
[ Added Paul in CC ]

On Jan 28, 2008 11:29 AM, Haavard Skinnemoen <[EMAIL PROTECTED]> wrote:
> On Mon, 28 Jan 2008 11:22:49 +0100
> "Francis Moreau" <[EMAIL PROTECTED]> wrote:
>
> > > Please let me know if you think this will work for your hardware.
> >
> > Thanks for pointing this out. I currently can't look at this but I'll
> > try to give it a deep look this week.
>
> Great. I'll Cc you on the next round of patches.
>
> > > What platform are you working on, btw?
> > >
> >
> > SH
>
> Nice. That means we have potential users on three different
> architectures (the other two being avr32 and arm; I wouldn't be too
> surprised if powerpc and mips want a piece of the fun at some point
> too.)
>

I think it's worth CCing Paul as well. He plans to move the SH architecture over to the dmaengine API soon, and he's definitely the right person to give you useful feedback from the SH side.
--
Francis
Re: Question about DMA
Hello Haavard,

On Jan 28, 2008 10:21 AM, Haavard Skinnemoen <[EMAIL PROTECTED]> wrote:
> On Mon, 28 Jan 2008 09:55:58 +0100
> "Francis Moreau" <[EMAIL PROTECTED]> wrote:
>
> > My DMA controller has very little in common with ISA DMA one. But I'd
> > like to use it in a driver. This driver can do DMA but with the help of
> > an external DMA controller. It's only implement the "slave" side. So
> > basically this driver needs to configure one of the DMAC channels
> > before transferring data.
>
> Have a look at this thread:
>
> http://lkml.org/lkml/2007/11/23/79
>
> I'm planning to post an updated patch set this week that addresses the
> comments by Dan Williams, and that applies on top of the other DMA
> Engine patches that have been posted since then.
>
> Please let me know if you think this will work for your hardware.

Thanks for pointing this out. I can't look at it right now, but I'll try to give it a close look this week.

> What platform are you working on, btw?
>

SH

Thanks.
--
Francis
Re: Question about DMA
On Jan 28, 2008 10:04 AM, Jiri Slaby <[EMAIL PROTECTED]> wrote:
> On 01/28/2008 09:55 AM, Francis Moreau wrote:
> Which bus is it in this case?

Basically it's a bus which is used to access memories.

Thanks
--
Francis
Re: Question about DMA
Hello Jiri,

On Jan 27, 2008 11:34 PM, Jiri Slaby <[EMAIL PROTECTED]> wrote:
> On 01/27/2008 09:51 PM, Francis Moreau wrote:
> > 1/ Why does the function take only one address ? I would expect it
> > to take both a source and a destination address for the dma controller
> > to transfer data.
>
> since your device is responsible for sending data from/to local memory.
> ISA dma controller has only 2 registers -- 16-bit address to put incoming
> data to (get outcoming from) + 8-bit nonincrementing page and 16-bit
> counter.
>
> > 2/ The type of address parameter is an unsigned int. Why isn't it a
> > dma_addr_t type ?
>
> since isa dma controller can address up to 2^24 (16-bit address + 8-bit
> page) bytes of memory, i.e. 16M.
>
> Are you sure, you want use this API?
>

No ;)

My DMA controller has very little in common with the ISA DMA one, but I'd like to use it in a driver. This driver can do DMA, but only with the help of an external DMA controller: it only implements the "slave" side. So basically this driver needs to configure one of the DMAC channels before transferring data.

What other API could I use in this case? I don't think the DMA-mapping API can help here...

Thanks
--
Francis
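The 16M limit mentioned above falls straight out of the register widths. The following is a simplified model of that description (a 16-bit address register plus an 8-bit page register), not the actual programming sequence of an ISA DMA controller; the helper name is made up.

```python
ADDR_BITS = 16   # width of the address register, per the description above
PAGE_BITS = 8    # width of the page register


def isa_dma_split(phys_addr: int) -> tuple[int, int]:
    """Split a physical address into (page, offset) for the two registers."""
    limit = 1 << (ADDR_BITS + PAGE_BITS)          # 2**24 bytes
    if not 0 <= phys_addr < limit:
        raise ValueError("address not reachable with 16-bit address + 8-bit page")
    page = phys_addr >> ADDR_BITS                 # upper 8 bits
    offset = phys_addr & ((1 << ADDR_BITS) - 1)   # lower 16 bits
    return page, offset


if __name__ == "__main__":
    print(1 << (ADDR_BITS + PAGE_BITS))                 # 16777216 bytes, i.e. 16 MiB
    print([hex(v) for v in isa_dma_split(0x123456)])    # ['0x12', '0x3456']
```

Any address at or above 2^24 cannot be expressed in those two registers, so on that reading a plain unsigned int is more than wide enough for the API's address parameter.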
Question about DMA
Hello,

I have 2 questions regarding the set_dma_addr(unsigned int channel, unsigned int addr) helper.

1/ Why does the function take only one address? I would expect it to take both a source and a destination address for the DMA controller to transfer data.

2/ The type of the address parameter is unsigned int. Why isn't it dma_addr_t?

Thanks!
--
Francis
Re: Linux and remote control device
On Jan 22, 2008 8:32 PM, Chuck Ebbert <[EMAIL PROTECTED]> wrote:
> On 01/22/2008 03:01 AM, Francis Moreau wrote:
> > Hello,
> >
> > I'd like to add support for my Infrared remote control to Linux.
> >
> > So far I only see LIRCD project that make the kernel support
> > such device but I'm not sure if this project is the best choice
> > since it's not part of mainline kernels. And there are certainly
> > good reasons which I'm not aware of.
> >
>
> lirc is being worked on and is nearly ready for upstream submission.
>

Really?

I'm wondering why lirc doesn't use the input subsystem. Do you have any idea?
--
Francis
Re: Linux and remote control device
On Jan 22, 2008 11:14 AM, Vojtech Pavlik <[EMAIL PROTECTED]> wrote:
>
> On Tue, Jan 22, 2008 at 10:47:22AM +0100, Francis Moreau wrote:
> > On Jan 22, 2008 9:19 AM, Vojtech Pavlik <[EMAIL PROTECTED]> wrote:
> > > On Tue, Jan 22, 2008 at 09:01:10AM +0100, Francis Moreau wrote:
> > > > Hello,
> > > >
> > > > I'd like to add support for my Infrared remote control to Linux.
> > > >
> > > > So far I only see LIRCD project that make the kernel support
> > > > such device but I'm not sure if this project is the best choice
> > > > since it's not part of mainline kernels. And there are certainly
> > > > good reasons which I'm not aware of.
> > > >
> > > > Another possibility is to make the remote control device an input
> > > > device. But I see several flaws:
> > > >
> > > > - The IR receiver on my board can't be use as a transmitter any
> > > > more
> > >
> > > The input subsystem moves events in both directions. It would need some
> > > additional hacking to work with IR transmitters, but it might be
> > > possible.
> > >
> > > > - All scancodes are embedded in the kernel.
> > >
> > > While they reside in the kernel, they can be changed from userspace.
> > >
> > > > Do you think it's still the way to go despite the 2 points raised
> > > > above ?
> > >
> > > It might be, but I'm not 100% convinced either.
> > >
> >
> > All your answers sound good however.
> >
> > Could you please tell me why you're not sure ?
>
> It would still limit the IR ability to sending/receiving "just"
> keypresses. You may want to send and receive more (arbitrary data) over
> an IR dongle attached to the machine.
>

I don't know about such devices. The device I actually have, an infrared remote control, seems simple enough to fit the input device model. It has a wheel on it, but I think I can make that work too...

Thanks
--
Francis
Re: Linux and remote control device
On Jan 22, 2008 9:19 AM, Vojtech Pavlik <[EMAIL PROTECTED]> wrote:
> On Tue, Jan 22, 2008 at 09:01:10AM +0100, Francis Moreau wrote:
> > Hello,
> >
> > I'd like to add support for my Infrared remote control to Linux.
> >
> > So far I only see LIRCD project that make the kernel support
> > such device but I'm not sure if this project is the best choice
> > since it's not part of mainline kernels. And there are certainly
> > good reasons which I'm not aware of.
> >
> > Another possibility is to make the remote control device an input
> > device. But I see several flaws:
> >
> > - The IR receiver on my board can't be use as a transmitter any
> > more
>
> The input subsystem moves events in both directions. It would need some
> additional hacking to work with IR transmitters, but it might be
> possible.
>
> > - All scancodes are embedded in the kernel.
>
> While they reside in the kernel, they can be changed from userspace.
>
> > Do you think it's still the way to go despite the 2 points raised above ?
>
> It might be, but I'm not 100% convinced either.
>

All your answers sound good, however.

Could you please tell me why you're not sure?
--
Francis
Linux and remote control device
Hello,

I'd like to add support for my infrared remote control to Linux.

So far the only thing I've found is the LIRC project, which makes the kernel support such devices, but I'm not sure this project is the best choice since it's not part of mainline kernels. There are certainly good reasons for that which I'm not aware of.

Another possibility is to make the remote control device an input device. But I see several flaws:

- The IR receiver on my board can no longer be used as a transmitter.
- All scancodes are embedded in the kernel.

Do you think it's still the way to go despite the 2 points raised above?

Thanks!
--
Francis
Re: Why not creating a GIT RT tree ?
Hello,

On Jan 18, 2008 10:59 PM, James Cloos <[EMAIL PROTECTED]> wrote:
> >>>>> "Francis" == Francis Moreau <[EMAIL PROTECTED]> writes:
>
> Francis> I can't find a rt tree anywhere and all new rt release spoke
> Francis> about a patchset to apply on mainline kernels.
>
> It is not perfect, but I do have a git repo of the rt history-of-patches
> up at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/cloos/rt-2.6.git
> http://www.kernel.org/pub/scm/linux/kernel/git/cloos/rt-2.6.git
>
> Gitweb URL is:
>
> http://git.kernel.org/?p=linux/kernel/git/cloos/rt-2.6.git
>
> It is in the one-head per patch style, and has the single-file patches
> applied rather than the quilt queue.

Why don't you have one commit per patch?

Thanks
--
Francis
Re: Why not creating a GIT RT tree ?
On Jan 18, 2008 8:12 PM, Steven Rostedt <[EMAIL PROTECTED]> wrote:
> True, but then how would you do it. One thing is that most of these
> branches would interact with each other. Touching the same code quite
> a bit. So it doesn't always help. But pulling out patches can help us to
> an extent.
>

I see. It would probably be too painful in this context; that's a pity.

>
> Great! Looking forward to it ;-)
>

Well, actually it already seems to be supported. Some files in arch/sh are already touched by the RT patches.

I took a look at your interesting paper and now have a question about the BKL: why is it so hard to get rid of it completely?

Do you have any advice or a starting point for getting involved in the RT kernel? I'm almost new to this area, but I'd like to learn and try to contribute if possible...

Thanks!
--
Francis
Re: Why not creating a GIT RT tree ?
Hello,

On Jan 18, 2008 4:55 PM, Steven Rostedt <[EMAIL PROTECTED]> wrote:
>
> On Fri, 18 Jan 2008, Francis Moreau wrote:
>
> > Maybe I missed it but I'm wondering why GIT is not used for
> > the RT development ? I can't find a rt tree anywhere and all
> > new rt release spoke about a patchset to apply on mainline
> > kernels.
>
> The answer to this is pretty much the same as why the -mm tree isn't in
> git either.

Well, not exactly. Unlike the -mm tree, which is made of lots of patches
dealing with totally unrelated subjects, the rt patches hopefully deal only
with realtime stuff.

> The RT tree is made up of lots of patches (over 300). Our goal is to get
> RT into mainline Linux. RT isn't just one type of system, it extends all
> over the kernel, and the patches may be rewritten over and over. Managing
> this in quilt is a lot easier than managing it in git.

I'm probably missing something since I haven't looked at the RT patches
(yet), but couldn't these 300 patches be sorted by topic? If so, you could
create one branch per topic and merge all of them into your master branch,
which would be the rt kernel. Hopefully each branch wouldn't interact with
the other branches too much. All of this assumes, of course, that the
number of topics is much smaller than the number of patches (~300).

Having such a tree would be very useful for looking at the history of each
topic and for git-bisect debugging sessions, IMHO...

> That said, there's been talk about making a git tree for others based on
> the quilt queue. The thing is that a new git tree will need to be created
> for every release. Which means that it will be difficult for others to
> simply update their local repo since you will get a bunch of errors with
> not being from the same head.
>
> > Another question, is there a TODO list somewhere which would
> > help to port the RT patch to a new architecture ?
>
> Which arch? We are already on PowerPC, ARM and MIPS. Thinking about sh?

Yep. Not that I'm an expert in this architecture, but it's commonly used in
multimedia devices, where realtime is often needed.

Thanks
-- 
Francis

Why not creating a GIT RT tree ?
Hello,

Maybe I missed it, but I'm wondering why GIT is not used for RT
development. I can't find an rt tree anywhere, and all the new rt releases
speak about a patchset to apply on mainline kernels.

Another question: is there a TODO list somewhere which would help to port
the RT patch to a new architecture?

Sorry if the questions are dumb, but I'm new to this project.

Thanks
-- 
Francis

Re: x86_64: vsyscall vs vdso
On 9/16/07, Ulrich Drepper <[EMAIL PROTECTED]> wrote:
> On 9/16/07, Francis Moreau <[EMAIL PROTECTED]> wrote:
> > Another question: is vdso going to replace vsyscall at all ? If so how
> > are statically programs going to be handled ?

It's weird, because it seems that vsyscalls are only implemented on x86_64;
all other archs have only the vdso... so they seem to forget about
statically linked apps...

> Unfortunately the vsyscalls cannot ever go completely away.
> Statically linked apps, the bane of progress, will need them.

Actually, if we could easily retrieve the vdso from a process's memory
mapping (through a new syscall or /proc/self/maps), it should be easy for
gcc/ld to statically link the vdso functions into a statically linked app,
shouldn't it?
-- 
Francis

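The /proc/self/maps idea mentioned above can be sketched in userspace. This is only an illustration, not something proposed in the thread: the helper names parse_vdso_line() and find_vdso() are hypothetical, and the sketch assumes a kernel that labels the mapping "[vdso]" in the maps file. It only locates the mapping; it does not resolve any symbol inside it.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one line of /proc/<pid>/maps. If the line describes the vDSO
 * mapping, store its bounds and return 1; otherwise return 0.
 * (Hypothetical helper, for illustration only.)
 */
int parse_vdso_line(const char *line, unsigned long *start, unsigned long *end)
{
	unsigned long s, e;

	if (strstr(line, "[vdso]") == NULL)
		return 0;
	if (sscanf(line, "%lx-%lx", &s, &e) != 2)
		return 0;
	*start = s;
	*end = e;
	return 1;
}

/* Scan /proc/self/maps; return 1 and fill the bounds if a vDSO is mapped. */
int find_vdso(unsigned long *start, unsigned long *end)
{
	char line[512];
	int found = 0;
	FILE *f = fopen("/proc/self/maps", "r");

	if (f == NULL)
		return 0;
	while (!found && fgets(line, sizeof(line), f) != NULL)
		found = parse_vdso_line(line, start, end);
	fclose(f);
	return found;
}
```

A caller would use find_vdso(&start, &end) and then treat the range as an in-memory ELF image; parsing that image is left out here.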
Re: x86_64: vsyscall vs vdso
On 9/17/07, Ulrich Drepper <[EMAIL PROTECTED]> wrote:
> On 9/17/07, Francis Moreau <[EMAIL PROTECTED]> wrote:
> > I think signal trampolines will still need them too. So making
> > vsyscalls configurable doesn't seem to work, does it ?
>
> vsyscalls aren't used for that. We have a restorer in libc and could
> easily use one in the vdso. That's what is done on x86.

Sorry for my ignorance, but what is 'a restorer'?
-- 
Francis

Re: x86_64: vsyscall vs vdso
Hello Ulrich,

Thanks for taking the time to respond!

On 9/16/07, Ulrich Drepper <[EMAIL PROTECTED]> wrote:
> On 9/16/07, Francis Moreau <[EMAIL PROTECTED]> wrote:
>> I'm a bit puzzled because vdso doesn't seem to be used on my fedora 7:
>> I just compiled a trivial program which just call gettimeofday() and
>> ld.so resolves this call with vsyscall's gettimeofday.
>>
>> Now I'm wondering when vdso is used, could anybody give me a clue ?
>
> F7 was released before the vdso for x86_64 was upstream so you should
> not expect anything else. F8 will use the available vdso. This
> doesn't just happen magically, changes to libc are needed.

You're right. I don't know why I thought the vdso was older...

>> Another question: is vdso going to replace vsyscall at all ? If so how
>> are statically programs going to be handled ?
>
> Unfortunately the vsyscalls cannot ever go completely away.
> Statically linked apps, the bane of progress, will need them. There
> are also people updating kernels but not the user userland code.

Does that mean we'll need to keep 3 different implementations of gtod in
the kernel forever?

> What we will have to do in future is to make vsyscalls configurable.
> Both a compile time option and a runtime option (perhaps also under
> control of SELinux) are likely needed.

I think signal trampolines will still need them too. So making vsyscalls
configurable doesn't seem to work, does it?

I took a look at glibc-2.4 and it doesn't seem to use __kernel_vsyscall
vsyscall. Am I wrong? If so, could you point me to where it's used in the
code?

Another, not so related question, I hope you don't mind :)

I'm having a hard time understanding how ld.so works, especially the
dynamic symbol resolution, even after looking at the glibc code and reading
the ELF specification. Do you know of any other documents that could help?

Thanks a lot.
-- 
Francis

x86_64: vsyscall vs vdso
Hello,

I'm a bit puzzled because the vdso doesn't seem to be used on my Fedora 7:
I just compiled a trivial program which just calls gettimeofday(), and
ld.so resolves this call with the vsyscall gettimeofday.

Now I'm wondering when the vdso is used. Could anybody give me a clue?

Another question: is the vdso going to replace vsyscall at all? If so, how
are statically linked programs going to be handled?

Thanks,
-- 
Francis

Re: i2c transfers during interrupt context
Hello Jean!

On 8/30/07, Jean Delvare <[EMAIL PROTECTED]> wrote:
> No. You are not allowed to sleep in an interrupt handler, and most I2C
> drivers sleep during transfers.

OK, that's what I wanted to know. Maybe this rule could be enforced by
adding a "might_sleep()" to the transfer functions of i2c-core?

Also, why do some I2C drivers not sleep during transfers? What is
different about them?

> If you need to do this kind of thing, you typically have to go through a
> workqueue.

Unfortunately, in some cases the workqueue is scheduled too late and the
I2C message does not reach the device in time.

One more question, I hope I'm not abusing your time: how can the I2C bus
frequency be changed from a driver?

Thanks for your answers.
-- 
Francis

i2c transfers during interrupt context
Hello,

I have a very simple question about i2c transfers. I'm wondering whether
I'm allowed to initiate some very short i2c transfers from an interrupt
handler.

Thanks for your answers.
-- 
Francis

Re: Search for x86_64 documentation.
Hello Sebastien,

On 8/2/07, Sébastien Dugué <[EMAIL PROTECTED]> wrote:
>
> Then you may have a look at
>
> http://www.chip-architect.com/news/2003_09_21_Detailed_Architecture_of_AMDs_64bit_Core.html
>
> It's a bit more pleasant to read than the AMD or Intel programming manuals
> but in the long run you will need those along with the accompanying erratas.

Thanks for the pointer, it looks good!
-- 
Francis

Re: Search for x86_64 documentation.
On 8/1/07, Rene Herman <[EMAIL PROTECTED]> wrote:
> On 08/01/2007 05:30 PM, Francis Moreau wrote:
>
> >>> Could anyone point out some nice documentations/books describing this
> >>> architecture ?
> >>
> >> First and foremost the AMD64 architecture documentation from AMD
> >> itself:
> >>
> >> http://www.amd.com/gb-uk/Processors/TechnicalResources/0,,30_182_739_7044,00.html
> >
> > Thank you for the pointer but I already knew about them.
> >
> > I was actually more interested in books which are more pleasant to
> > read than a raw datasheet. Something like "x86_64 arch for newbies" ;)
>
> Not aware of any -- but if you write one, I'll probably buy it ;-|

By the time I finish it, you will probably want to read documentation for
the x86_512 arch ;)
-- 
Francis

Re: Search for x86_64 documentation.
Hello Andi,

On 01 Aug 2007 19:13:27 +0200, Andi Kleen <[EMAIL PROTECTED]> wrote:
> "Francis Moreau" <[EMAIL PROTECTED]> writes:
> >
> > I was actually more interested in books which are more pleasant to
> > read than a raw datasheet.
>
> The first volumes of the Intel and AMD architecture manuals are far from "raw
> datasheets". In fact they're quite well written as brief introduction
> of x86 assuming you have some basic knowledge of assembly language concepts.
>
> There might be better introductions for a total newbie but if you
> already know another assembly language and other basic concepts
> of computer architecture they should serve you very well.

Ah, OK. I had a bad experience with this kind of documentation, but I should
have taken a deeper look at these manuals before asking.

Thanks.
-- 
Francis

Re: Search for x86_64 documentation.
Hello Rene,

On 8/1/07, Rene Herman <[EMAIL PROTECTED]> wrote:
> On 08/01/2007 03:27 PM, Francis Moreau wrote:
>
> > I'm used to hack Linux on a ARM based board and would like to be
> > involved in x86_64 architecture but I don't know where I should
> > start...
> >
> > Could anyone point out some nice documentations/books describing this
> > architecture ?
>
> First and foremost the AMD64 architecture documentation from AMD itself:
>
> http://www.amd.com/gb-uk/Processors/TechnicalResources/0,,30_182_739_7044,00.html

Thank you for the pointer, but I already knew about them.

I was actually more interested in books which are more pleasant to read
than a raw datasheet. Something like "x86_64 arch for newbies" ;)

Thanks
-- 
Francis

Search for x86_64 documentation.
Hello,

I'm used to hacking Linux on an ARM-based board and would like to get
involved in the x86_64 architecture, but I don't know where I should
start...

Could anyone point out some nice documentation or books describing this
architecture?

Thanks
-- 
Francis

Question about pm and tty
Hi,

I've just finished implementing an input driver for a simple custom
keyboard (only 16 keys). I wanted to test it through a virtual terminal
(CONFIG_VT). Note that I'm not familiar with this at all, so pardon me if
I'm saying something silly.

I noticed that as soon as the terminal is initialized (tty_init() is
called), the input device is opened and thus its clock is started.

That seems strange to me: I would have thought that the input device is
opened only when /dev/tty is opened, so that the keyboard clock is started
only when the keyboard is really used.

Am I missing something?

Thanks
-- 
Francis

Re: clockevent questions
Hi Thomas,

On 5/16/07, Thomas Gleixner <[EMAIL PROTECTED]> wrote:
> Francis,
>
> On Tue, 2007-05-15 at 10:47 +0200, Francis Moreau wrote:
> > My question is about the clock resolution field which is equal to 1ns.
> > How is this possible since my timer's frequency is only 100Mhz ?
>
> you are right. It is a bit strange. The resolution info is not really
> reflecting the clock event source capability. I look if there is a sane
> solution for that.

Doesn't that make hrtimer_get_res() and its callers buggy, since they
return this 1ns value, which does not reflect the real clock event
capability?

Another question, about the output of '/proc/timer_list':

    [...]
    active timers:
     #0: , tick_sched_timer, S:01
     # expires at 64696546875000 nsecs [in 2514469 nsecs]
      .expires_next : 64696546875000 nsecs
    [...]

Don't the two expiry lines give the same information? If so, couldn't we
merge them into:

    .expires_next : 64696546875000 nsecs [in 2514469 nsecs]

Thanks
-- 
Francis

Re: clockevent questions
Thomas,

On 5/12/07, Thomas Gleixner <[EMAIL PROTECTED]> wrote:
> Well, it ends up in hrtimer_interrupt() and the code there finds out,
> that the next timer is not due right now, so it simply requests the same
> (absolute) time event again, which is processed by the clock events code
> and eventually limited to the max delta of the device again.

My timer is finally using the clockevent subsystem. Below is the output
of '/proc/timer_list':

    cat /proc/timer_list
    Timer List Version: v0.3
    HRTIMER_MAX_CLOCK_BASES: 2
    now at 64696544360531 nsecs

    cpu: 0
     clock 0:
      .index: 0
      .resolution: 1 nsecs
      .get_time: ktime_get_real
      .offset: 11791511530 nsecs
    active timers:
     clock 1:
      .index: 1
      .resolution: 1 nsecs
      .get_time: ktime_get
      .offset: 0 nsecs
    active timers:
     #0: , tick_sched_timer, S:01
     # expires at 64696546875000 nsecs [in 2514469 nsecs]
      .expires_next : 64696546875000 nsecs
      .hres_active: 1
      .nr_events : 16562163
      .nohz_mode : 0
      .idle_tick : 0 nsecs
      .tick_stopped : 0
      .idle_jiffies : 0
      .idle_calls : 0
      .idle_sleeps: 0
      .idle_entrytime : 0 nsecs
      .idle_sleeptime : 0 nsecs
      .last_jiffies : 0
      .next_jiffies : 0
      .idle_expires : 0 nsecs
    jiffies: 16485515

    Tick Device: mode: 1
    Clock Event Device: hrt
     max_delta_ns: 2147483647
     min_delta_ns: 1000
     mult: 206158430
     shift: 32
     mode: 3
     next_event: 64696546875000 nsecs
     set_next_event: hrt_next_event
     set_mode: hrt_timer_setup
     event_handler: hrtimer_interrupt

My question is about the clock resolution field, which is equal to 1ns.
How is this possible, since my timer's frequency is only 100MHz?

My 2 cents: it looks like some tabs are missing when printing the list of
hrtimers of each clock...

Thanks
-- 
Francis

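The mult and shift fields in the listing above describe how the clock event code scales a nanosecond delta into device cycles. Below is a minimal userspace sketch of that scaling, assuming the usual (ns * mult) >> shift form; the function names are hypothetical and the code is not taken from the kernel. Under that assumption, the printed pair (mult 206158430, shift 32) corresponds to roughly 48 MHz, so one device tick is far coarser than the 1 ns reported in .resolution.

```c
#include <stdint.h>

/* Scale a nanosecond delta into device cycles: (ns * mult) >> shift. */
uint64_t ns_to_cycles(uint64_t ns, uint32_t mult, uint32_t shift)
{
	return (ns * (uint64_t)mult) >> shift;
}

/*
 * Device frequency in Hz implied by a mult/shift pair, rounded to the
 * nearest integer. Assumes 1 <= shift <= 32 (hypothetical helper).
 */
uint64_t implied_freq_hz(uint32_t mult, uint32_t shift)
{
	/* number of cycles in one second's worth of nanoseconds */
	uint64_t scaled = (uint64_t)mult * 1000000000ULL;

	return (scaled + (1ULL << (shift - 1))) >> shift;
}
```

With the values from the listing, a 1000 ns request (the min_delta_ns shown) scales to a few dozen cycles, which is the granularity the hardware actually works at.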
Re: clockevent questions
On 5/12/07, Thomas Gleixner <[EMAIL PROTECTED]> wrote:
> On Sat, 2007-05-12 at 22:13 +0200, Francis Moreau wrote:
> > > Yes, it is correct. The generic timer code requests an event in the
> > > future. It does not care, whether the hardware device can handle that or
> > > not. So the clock event code limits the delta to the maximum delta the
> > > device can handle. The interrupt fires and the generic timer code
> > > reschedules the event with the remaining delta time.
> >
> > Thanks again for explanations. Could you give me a pointer of this reschedules ?
>
> Well, it ends up in hrtimer_interrupt() and the code there finds out,
> that the next timer is not due right now, so it simply requests the same
> (absolute) time event again, which is processed by the clock events code
> and eventually limited to the max delta of the device again.

OK, I'll give it a deeper look soon.

Thanks for your great work!
-- 
Francis

Re: clockevent questions
Hi Thomas,

Thanks for answering so quickly!

On 5/12/07, Thomas Gleixner <[EMAIL PROTECTED]> wrote:
> Francis,
>
> On Sat, 2007-05-12 at 16:54 +0200, Francis Moreau wrote:
> > I'm trying to use clocksource and clockevent new subsystem. My
> > platform has a timer that I'd like to use both as a clocksource and a
> > clockevent devices.
>
> See arch/i386/kernel/hpet.c

Thanks for the pointer.

> > This timer is continueous in sense that it can run
> > without any interruption
>
> -ENOPARSE
>
> Has your timer a free running counter and a match register based event
> mechanism ?

Yes.

> > so I assume I can flag the clocksource device
> > with "CLOCK_SOURCE_IS_CONTINUOUS". However I noticed that clockevent
> > device can be stopped by using "set_mode()" method. Are these two
> > behaviours compatible ?
>
> the clock event stop only stops the event mechanism, it does not stop
> the counter. See arch/i386/kernel/hpet.c

OK, I've got it now. In my initial plan, I was thinking of stopping the
counter in order to stop the event, but that's not the right thing to do.
It seems that I should disable the event interrupt instead.

> > Another question is that I have another embedded 16 bits timer that I
> > would like to use. Since the timer is only 16 bits, the maximum
> > interval is tiny, My question if a user ask for a clockevent device
> > using an interval is bigger that 2^16, the clockevent system doesn't
> > return an error. Instead it silently reduce the interval to 2^16. Is
> > this correct ? if so why ?
>
> Yes, it is correct. The generic timer code requests an event in the
> future. It does not care, whether the hardware device can handle that or
> not. So the clock event code limits the delta to the maximum delta the
> device can handle. The interrupt fires and the generic timer code
> reschedules the event with the remaining delta time.

Thanks again for the explanations. Could you give me a pointer to where
this rescheduling happens?
-- 
Francis

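The clamp-and-reschedule behaviour described above can be modelled in a few lines of userspace C. This is only a toy model under stated assumptions: time is a plain counter, the "device" can be programmed for at most max_delta units at once, and the function name is hypothetical. It counts how many interrupts fire before an absolute expiry time is reached.

```c
#include <stdint.h>

/*
 * Toy model: the timer core wants an event at absolute time 'expires',
 * but the device can only be programmed up to 'max_delta' ahead. Each
 * interrupt re-requests the same absolute expiry, clamped again.
 * Returns the number of interrupts taken until 'expires' is reached.
 */
unsigned int interrupts_until_expiry(uint64_t now, uint64_t expires,
				     uint64_t max_delta)
{
	unsigned int fired = 0;

	if (max_delta == 0)
		return 0;	/* a device that cannot be programmed at all */

	while (now < expires) {
		uint64_t delta = expires - now;

		if (delta > max_delta)
			delta = max_delta;	/* clamp to the device limit */
		now += delta;			/* the interrupt fires here */
		fired++;
	}
	return fired;
}
```

For a 16-bit timer (max_delta of 65535) and a request 200000 units ahead, the model takes three full-range interrupts plus a final shorter one, which is why the caller never sees an error for an over-long interval.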
clockevent questions
Hi,

I'm trying to use the new clocksource and clockevent subsystems. My
platform has a timer that I'd like to use both as a clocksource and as a
clockevent device.

This timer is continuous in the sense that it can run without any
interruption, so I assume I can flag the clocksource device with
"CLOCK_SOURCE_IS_CONTINUOUS". However, I noticed that a clockevent device
can be stopped through its "set_mode()" method. Are these two behaviours
compatible?

Another question: I have another embedded 16-bit timer that I would like
to use. Since the timer is only 16 bits wide, its maximum interval is
tiny. If a user asks a clockevent device for an interval bigger than 2^16,
the clockevent system doesn't return an error; instead it silently reduces
the interval to 2^16. Is this correct? If so, why?

Thanks for your answers
-- 
Francis

Re: [CRYPTO] is it really optimized ?
Hi

[Sorry for the late answer]

On 4/19/07, Francis Moreau <[EMAIL PROTECTED]> wrote:
> On 4/17/07, Roland Dreier <[EMAIL PROTECTED]> wrote:
> > > > It seems trivial to keep the last key you were given and do a quick
> > > > memcmp in your setkey method to see if it's different from the last
> > > > key you pushed to hardware, and set a flag if it is. Then only do
> > > > your set_key() if you have a new key to pass to hardware.
> > > >
> > > > I'm assuming the expense is in the aes_write() calls, and you could
> > > > avoid them if you know you're not writing something new.
> >
> > > that's a wrong assumption. aes_write()/aes_read() are both used to
> > > access to the controller and are slow (no cache involved).
> >
> > Sorry, I wasn't clear. I meant that the hardware access is what is
> > slow, and that anything you do on the CPU is relatively cheap compared
> > to that.
> >
> > So my suggestion is just to keep a cache (in CPU memory) of what you
> > have already loaded into the HW, and before reloading the HW just
> > check the cache and don't do the actual HW access if you're not going
> > to change the HW contents. So you avoid any extra aes_write and
> > aes_read calls in the cache hit case.
> >
> > This would have the advantage of making anything that does lots of
> > bulk encryption fast without special casing ecryptfs.
>
> I'm not sure how "memcmp(key, cache, KEY_SIZE)" would impact AES
> performance. I need to give it a test but can't today. I'll do tomorrow
> and give you back the result.

OK, I gave it a test, and it appears that the cache-hit case is slightly
slower than unconditional key loading. So checking whether the key is
cached takes as long as loading the key into the controller.

Here is what I did in the set_key() function:

static void set_key(const char *key)
{
	static u32 my_key[4] __cacheline_aligned;
	u32 key0 = *(const u32 *)(key + 12);
	u32 key1 = *(const u32 *)(key + 8);
	u32 key2 = *(const u32 *)(key + 4);
	u32 key3 = *(const u32 *)(key);
	int timeout = 100;
	u32 miss = 0;

	miss |= key0 ^ my_key[0];
	miss |= key1 ^ my_key[1];
	miss |= key2 ^ my_key[2];
	miss |= key3 ^ my_key[3];
	if (miss == 0)
		return;

	my_key[0] = key0;
	my_key[1] = key1;
	my_key[2] = key2;
	my_key[3] = key3;

	aes_write(be32_to_cpu(key0), AES_KEY0);
	aes_write(be32_to_cpu(key1), AES_KEY1);
	aes_write(be32_to_cpu(key2), AES_KEY2);
	aes_write(be32_to_cpu(key3), AES_KEY3);

	/* generate dkey: should take 11 cycles */
	aes_write(aes_read(AES_CR) | CR_DKEYGEN, AES_CR);
	while (aes_read(AES_CR) & CR_DKEYGEN) {
		if (--timeout == 0)
			break;
	}
}

So I was wrong: hardware access is not as expensive as I thought. But it
also means that all the instructions executed in the driver's
encrypt()/decrypt() methods have a real cost, and skipping the key loading
is a win.

Using the driver exclusively doesn't seem to be the right solution, but I
don't see another way to do that...
-- 
Francis

Re: [CRYPTO] is it really optimized ?
On 4/17/07, Roland Dreier <[EMAIL PROTECTED]> wrote:
> > > It seems trivial to keep the last key you were given and do a quick
> > > memcmp in your setkey method to see if it's different from the last
> > > key you pushed to hardware, and set a flag if it is. Then only do
> > > your set_key() if you have a new key to pass to hardware.
> > >
> > > I'm assuming the expense is in the aes_write() calls, and you could
> > > avoid them if you know you're not writing something new.
>
> > that's a wrong assumption. aes_write()/aes_read() are both used to
> > access to the controller and are slow (no cache involved).
>
> Sorry, I wasn't clear. I meant that the hardware access is what is
> slow, and that anything you do on the CPU is relatively cheap compared
> to that.
>
> So my suggestion is just to keep a cache (in CPU memory) of what you
> have already loaded into the HW, and before reloading the HW just
> check the cache and don't do the actual HW access if you're not going
> to change the HW contents. So you avoid any extra aes_write and
> aes_read calls in the cache hit case.
>
> This would have the advantage of making anything that does lots of
> bulk encryption fast without special casing ecryptfs.

I'm not sure how "memcmp(key, cache, KEY_SIZE)" would impact AES
performance. I need to give it a test but can't today. I'll do it tomorrow
and give you the result.

Thanks
-- 
Francis

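The caching idea suggested above can be sketched in userspace as follows. This is an illustration only, not the driver's code: hw_load_key() is a hypothetical stand-in for the aes_write() sequence, KEY_SIZE is assumed to be 16 bytes (AES-128), and there is no locking. Whether the comparison actually pays off depends on the hardware; the follow-up measurement in this thread found the hit case slightly slower on that particular controller.

```c
#include <stdint.h>
#include <string.h>

#define KEY_SIZE 16	/* assumed: AES-128 key length in bytes */

static uint8_t cached_key[KEY_SIZE];
static int cache_valid;
static unsigned int hw_key_loads;	/* counts simulated hardware loads */

/* Hypothetical stand-in for the slow register writes that load a key. */
static void hw_load_key(const uint8_t *key)
{
	(void)key;
	hw_key_loads++;
}

/*
 * Load the key into the "hardware" only when it differs from the last
 * key loaded. Returns 1 if the hardware was touched, 0 on a cache hit.
 */
int set_key_cached(const uint8_t *key)
{
	if (cache_valid && memcmp(key, cached_key, KEY_SIZE) == 0)
		return 0;

	memcpy(cached_key, key, KEY_SIZE);
	cache_valid = 1;
	hw_load_key(key);
	return 1;
}

/* Number of simulated hardware key loads so far. */
unsigned int key_loads(void)
{
	return hw_key_loads;
}
```

A caller that encrypts many blocks with one key would hit the cache on every block after the first, so only key changes reach the hardware.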
Re: question on generic gpio interface
On 4/17/07, David Brownell <[EMAIL PROTECTED]> wrote:
> In this case I'm not entirely sure how it'd work. I've seen a few
> drivers which let userspace peek and poke at GPIO signals -- like one
> for Gumstix boards -- but generalizing the model isn't simple.
> Sub-problems include:
>
> - Configuring the relevant pins. Especially for SOC cases, GPIO roles
>   are multiplexed with several others. So there are two issues: (a) the
>   platform-specific setup of that multiplexing, plus (b) the
>   board-specific knowledge of what pins are truly available for use as
>   GPIOs, and not otherwise in use.

What about creating a module, "user-gpio" for example, that could request
the gpios that the board has declared through the resource subsystem, like
this:

static struct resource foo_gpio_resource[] = {
	[0] = {
		.start = 10,
		.end   = 11,
		.flags = IORESOURCE_GPIO,
	},
	[1] = {
		.start = 26,
		.end   = 31,
		.flags = IORESOURCE_GPIO,
	},
};

struct platform_device foo_device_usergpio = {
	.name          = "user-gpio",
	.id            = -1,
	.num_resources = ARRAY_SIZE(foo_gpio_resource),
	.resource      = foo_gpio_resource,
};

This way the "user-gpio" module knows which pins are available to
userspace.

> - Enumerating those GPIOs to userspace. One SOC might have just a few
>   dozen, another might have a few hundred; and then there are all the
>   board-specific ones, on FPGA or I2C chips etc.

This point is actually the one I'm really not sure about... User GPIOs
would always be numbered from 0 to GPIO_USER_NR - 1, and an application
that needs to be portable should use a config file to specify which GPIO
number to use...

> - Exposing those pins to userspace. It'd be unsafe to let pins claimed
>   by drivers be managed by userspace; the default should be that only
>   unclaimed GPIOs can be accessed.

Well, an extreme solution would be a test in gpio_request(): if the gpio
number passed is a user one, gpio_request() would return an error. We
could use an is_user_gpio() function implemented by the user-gpio module.

Thanks
-- 
Francis

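The is_user_gpio() check and the 0 .. GPIO_USER_NR - 1 numbering proposed above could look roughly like the userspace sketch below. It is only an illustration: the range table mirrors the two example resources from the message (10-11 and 26-31), the struct and the user_gpio_to_nr() helper are hypothetical, and nothing here is an existing kernel interface.

```c
#include <stddef.h>

/* Inclusive range of board GPIO numbers handed over to userspace. */
struct gpio_range {
	unsigned int start;
	unsigned int end;
};

/* Mirrors the example resources from the message above. */
static const struct gpio_range user_gpio_ranges[] = {
	{ 10, 11 },
	{ 26, 31 },
};

#define NR_RANGES (sizeof(user_gpio_ranges) / sizeof(user_gpio_ranges[0]))

/* Return 1 if the board GPIO number is reserved for userspace. */
int is_user_gpio(unsigned int gpio)
{
	size_t i;

	for (i = 0; i < NR_RANGES; i++)
		if (gpio >= user_gpio_ranges[i].start &&
		    gpio <= user_gpio_ranges[i].end)
			return 1;
	return 0;
}

/*
 * Map a userspace index (0 .. GPIO_USER_NR - 1) to a board GPIO number.
 * Returns -1 if the index is out of range. (Hypothetical helper.)
 */
int user_gpio_to_nr(unsigned int index)
{
	size_t i;

	for (i = 0; i < NR_RANGES; i++) {
		unsigned int len = user_gpio_ranges[i].end -
				   user_gpio_ranges[i].start + 1;

		if (index < len)
			return (int)(user_gpio_ranges[i].start + index);
		index -= len;
	}
	return -1;
}
```

With this table, userspace sees eight GPIOs numbered 0 to 7, and a gpio_request()-style check could refuse any number for which is_user_gpio() returns 1.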
Re: [CRYPTO] is it really optimized ?
On 4/17/07, Roland Dreier <[EMAIL PROTECTED]> wrote:
> > > I wonder if there's some way you can cache the last caller and reload
> > > the key lazily (only when it changes).
> >
> > yes something that allows crypto drivers to detect if the key has
> > changed would be good.
>
> It seems trivial to keep the last key you were given and do a quick
> memcmp in your setkey method to see if it's different from the last
> key you pushed to hardware, and set a flag if it is. Then only do
> your set_key() if you have a new key to pass to hardware.
>
> I'm assuming the expense is in the aes_write() calls, and you could
> avoid them if you know you're not writing something new.

That's a wrong assumption. aes_write()/aes_read() are both used to access
the controller and are slow (no cache involved).

Thanks
-- 
Francis

Re: [CRYPTO] is it really optimized ?
On 4/17/07, Evgeniy Polyakov <[EMAIL PROTECTED]> wrote:
> On Tue, Apr 17, 2007 at 04:01:51PM +0200, Francis Moreau ([EMAIL PROTECTED]) wrote:
> > On 4/17/07, Herbert Xu <[EMAIL PROTECTED]> wrote:
> > >
> > >Yep. We don't need such a flag anyway. All we need is a way to tweak
> > >the priority and Bob's your uncle.
> > >
> >
> > Could you elaborate please, I don't see how you prevent others users
> > to use this module with priority.
> >
> > Priority is a stuff that tells you which aes implementation to use but
> > it does not prevent an implementation to be used several times...
>
> Preventing anyone from using the module is incorrect.
> How will you handle the case when you have only one algo registered and
> it will be exclusively used by ecryptfs?

As I tried to explain, in that case the admin must load the module without
the exclusive flag.

> Herbert proposes to register _second_ algo (say aes-generic(prio_100)
> and aes_for_ecryptfs(prio_1)) with lower prio, so generic access will
> never try to catch aes_for_ecryptfs, but your code still can access it
> using full name.

Yes, but my worry with this approach is that nothing prevents an admin from
loading other modules that will use aes_for_ecryptfs. And an admin is not
always aware of a module's implementation.

Thanks
-- 
Francis

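The priority scheme discussed above can be modelled with a small lookup table. The sketch below is a simplified userspace model, not the kernel crypto API: the struct, the table, and alg_lookup() are hypothetical, and the two entries reuse the example names and priorities from the message (aes-generic at 100, aes_for_ecryptfs at 1). A lookup by generic name picks the highest priority, while a lookup by full driver name reaches the low-priority entry directly.

```c
#include <stddef.h>
#include <string.h>

/* Simplified model of a registered algorithm. */
struct alg {
	const char *name;		/* generic name, e.g. "aes" */
	const char *driver_name;	/* unique implementation name */
	int priority;
};

static const struct alg algs[] = {
	{ "aes", "aes-generic",      100 },
	{ "aes", "aes_for_ecryptfs",   1 },
};

#define NR_ALGS (sizeof(algs) / sizeof(algs[0]))

/*
 * Look up an algorithm. An exact driver-name match wins immediately;
 * otherwise the highest-priority entry with a matching generic name is
 * returned. Returns NULL if nothing matches.
 */
const struct alg *alg_lookup(const char *name)
{
	const struct alg *best = NULL;
	size_t i;

	for (i = 0; i < NR_ALGS; i++) {
		const struct alg *a = &algs[i];

		if (strcmp(a->driver_name, name) == 0)
			return a;
		if (strcmp(a->name, name) == 0 &&
		    (best == NULL || a->priority > best->priority))
			best = a;
	}
	return best;
}
```

The model also shows the concern raised in the reply: nothing stops a second caller from asking for "aes_for_ecryptfs" by its full name, so a low priority alone does not make the implementation exclusive.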
Re: [CRYPTO] is it really optimized ?
On 4/17/07, Herbert Xu <[EMAIL PROTECTED]> wrote:
> Francis Moreau <[EMAIL PROTECTED]> wrote:
> > On 4/17/07, Herbert Xu <[EMAIL PROTECTED]> wrote:
> >>
> >> Actually, I was referring to your AES module :)
> >
> > Well I don't if I can do that unfortunately.
>
> What's the problem?

Always the same problem. Some stupid people here think that they have
designed the most wonderful AES hardware module. If I give you the code of
the driver, I'll be showing you a lot of "confidential" stuff.

> > Actually there's nothing really interesting in this code, only key or
> > acc loadings and that's it.
> >
> > What do you want to see exactly ?
>
> Well if your code's faster than what we have in the kernel then we should
> use yours instead.

Again, my code is faster only because I skip the key loading in the
"cia_encrypt" method. Actually, the gain is bigger in decryption mode than
in encryption mode, because I also generate the decryption key for each
block.

You see, there's absolutely no clever trick here...
-- 
Francis

Re: [CRYPTO] is it really optimized ?
On 4/17/07, Evgeniy Polyakov <[EMAIL PROTECTED]> wrote:
> If there are another users, then flag should not be set.

That depends on whether there's a 'generic' algo that can be used at
the same time. The admin should know that.

> If there are no another users, your code already has exclusive access.

Sorry, I don't understand that.

> One can not know if there will be any additional users at all (consider
> the case when new encrypted block device or ipsec negotiation started
> some time after module was loaded).

Well, I should say the administrator should know.
--
Francis
Re: [CRYPTO] is it really optimized ?
On 4/17/07, Evgeniy Polyakov <[EMAIL PROTECTED]> wrote:
> > OK, I tried to cook up something very simple. Since I don't know this
> > code, please be indulgent when reading the following patch ;)
>
> Which means that after one has loaded ecryptfs module it can not use
> ipsec and dm-crypt if there is only one crypto algo registered...

That's actually the goal, but I agree we would need a flag to pass
when loading the AES module to say "I want exclusive usage of it and
therefore it can be run faster".

If you have several users of the AES module, you can either (a) use the
non-optimized version for all users or (b) choose which user needs to
run quickly and make it use the AES hw module exclusively; the other
users would use the generic AES (the slower one).
--
Francis
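Option (b) from the message above could look roughly like the following userspace model. Everything here is hypothetical and for illustration only: `toy_engine`, `toy_pick_backend` and the "first caller claims the hardware" rule are assumptions, since the thread does not say how the exclusive user would be chosen, and none of this is kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Which implementation a user ends up with in this model. */
enum toy_backend { TOY_HW, TOY_GENERIC };

/* Hypothetical state of the hardware AES engine. */
struct toy_engine {
	const char *owner;	/* NULL until somebody has claimed the engine */
	int exclusive;		/* models the proposed module-load flag */
};

/*
 * Without the exclusive flag every user shares the hardware (the slower
 * mode, where the key has to be reloaded per block). With the flag set,
 * the first caller claims the hardware and later callers fall back to
 * the generic software AES.
 */
static enum toy_backend toy_pick_backend(struct toy_engine *eng,
					 const char *user)
{
	assert(eng != NULL && user != NULL);

	if (!eng->exclusive)
		return TOY_HW;

	if (eng->owner == NULL) {
		eng->owner = user;	/* assumption: first come, first served */
		return TOY_HW;
	}

	return strcmp(eng->owner, user) == 0 ? TOY_HW : TOY_GENERIC;
}
```

With the flag set, a first call for "ecryptfs" would get `TOY_HW` and a later call for "dm-crypt" would get `TOY_GENERIC`. This only works if a generic implementation is registered as well, which is the objection raised in the quoted reply.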
Re: [CRYPTO] is it really optimized ?
On 4/17/07, Roland Dreier <[EMAIL PROTECTED]> wrote:
> > Again, my code is faster only because I skip the key loading in
> > "cia_encrypt" method. Actually the gain is bigger in decryption mode
> > than in encryption one because I also generate the decryption key for
> > each block.
>
> I wonder if there's some way you can cache the last caller and reload
> the key lazily (only when it changes).

Yes, something that allows crypto drivers to detect whether the key has
changed would be good.

> Of course without your code it's hard to say...

Alright, you can find the main part of it below...

struct foo_aes_ctx {
	u8 key[AES_KEY_LENGTH];
};

/*
 *
 */
static inline void set_dir(int dir)
{
	u32 cr = aes_read(AES_CR);

	switch (dir) {
	case AES_DIR_ENCRYPT:
		cr |= CR_DIR;
		break;
	case AES_DIR_DECRYPT:
		cr &= ~CR_DIR;
		break;
	default:
		BUG();
	}
	aes_write(cr, AES_CR);
}

static inline void set_key(const char *key)
{
	u32 key0 = be32_to_cpup((const u32 *)(key + 12));
	u32 key1 = be32_to_cpup((const u32 *)(key + 8));
	u32 key2 = be32_to_cpup((const u32 *)(key + 4));
	u32 key3 = be32_to_cpup((const u32 *)(key));

	aes_write(key0, AES_KEY0);
	aes_write(key1, AES_KEY1);
	aes_write(key2, AES_KEY2);
	aes_write(key3, AES_KEY3);
}

/* should take only 11 cycles */
static int gen_dkey(void)
{
	int timeout = 100;

	aes_write(aes_read(AES_CR) | CR_DKEYGEN, AES_CR);
	while (aes_read(AES_CR) & CR_DKEYGEN) {
		if (--timeout == 0)
			return -EIO;
	}
	return 0;
}

static int crypt_block(int dir, u8 *dst, const u8 *src, const char *key)
{
	register u32 acc0 = be32_to_cpup((const u32 *)(src + 12));
	register u32 acc1 = be32_to_cpup((const u32 *)(src + 8));
	register u32 acc2 = be32_to_cpup((const u32 *)(src + 4));
	register u32 acc3 = be32_to_cpup((const u32 *)(src));
	unsigned long flags;
	int timeout = 100;
	int rv = 0;

	spin_lock_irqsave(&foo_aes_lock, flags);

	/* key and direction are reloaded for every single block */
	set_key(key);
	set_dir(dir);
	if (dir == AES_DIR_DECRYPT) {
		rv = gen_dkey();
		if (rv)
			goto out;
	}

	aes_write(acc0, AES_ACC0);
	aes_write(acc1, AES_ACC1);
	aes_write(acc2, AES_ACC2);
	aes_write(acc3, AES_ACC3);

	/* Again, should take only 11 cycles */
	while (aes_read(AES_CR) & 0x70) {
		if (--timeout == 0) {
			rv = -EIO;
			goto out;
		}
	}

	/* order is important, guess why ? */
	*(u32 *)(dst + 12) = cpu_to_be32(aes_read(AES_ACC0));
	*(u32 *)(dst + 8) = cpu_to_be32(aes_read(AES_ACC1));
	*(u32 *)(dst + 4) = cpu_to_be32(aes_read(AES_ACC2));
	*(u32 *)(dst) = cpu_to_be32(aes_read(AES_ACC3));
out:
	/* the lock is dropped on every path, including the timeout ones */
	spin_unlock_irqrestore(&foo_aes_lock, flags);
	return rv;
}

/*
 *
 */
static int foo_setkey(struct crypto_tfm *tfm, const u8 *key, unsigned int len)
{
	struct foo_aes_ctx *ctx = crypto_tfm_ctx(tfm);
	int rv;

	if (len == AES_KEY_LENGTH) {
		memcpy(ctx->key, key, AES_KEY_LENGTH);
		rv = 0;
	} else {
		tfm->crt_flags |= CRYPTO_TFM_RES_BAD_KEY_LEN;
		rv = -EINVAL;
	}
	return rv;
}

static void foo_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
	struct foo_aes_ctx *ctx = crypto_tfm_ctx(tfm);

	BUG_ON(!in);
	BUG_ON(!out);
	crypt_block(AES_DIR_DECRYPT, out, in, ctx->key);
}

static void foo_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
	struct foo_aes_ctx *ctx = crypto_tfm_ctx(tfm);

	BUG_ON(!in);
	BUG_ON(!out);
	crypt_block(AES_DIR_ENCRYPT, out, in, ctx->key);
}

static struct crypto_alg foo_alg = {
	.cra_name		= "aes",
	.cra_driver_name	= "aes-128-foo",
	.cra_priority		= 300,
	.cra_flags		= CRYPTO_ALG_TYPE_CIPHER,
	.cra_blocksize		= AES_MIN_BLOCK_SIZE,
	.cra_ctxsize		= sizeof(struct foo_aes_ctx),
	.cra_module		= THIS_MODULE,
	.cra_list		= LIST_HEAD_INIT(foo_alg.cra_list),
	.cra_u			= {
		.cipher = {
			.cia_min_keysize	= AES_KEY_LENGTH,
			.cia_max_keysize	= AES_KEY_LENGTH,
			.cia_setkey		= foo_setkey,
			.cia_encrypt		= foo_encrypt,
			.cia_decrypt		= foo_decrypt
		}
	}
};
--
Francis
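The lazy reload suggested in the quoted text ("reload the key lazily, only when it changes") can be sketched as a small userspace model. The names `toy_hw`, `toy_hw_load_key` and `toy_ensure_key` are hypothetical and are not part of the posted driver; `toy_hw_load_key` merely stands in for the four key-register writes done by `set_key()`. The sketch assumes the caller already holds whatever lock serialises access to the device.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_KEY_LEN 16	/* AES-128 key size in bytes */

/* Hypothetical model of the device: the key it holds and a load counter. */
struct toy_hw {
	uint8_t loaded_key[TOY_KEY_LEN];
	int key_valid;			/* 0 until a key has been loaded */
	unsigned int key_loads;		/* how often the key was written */
};

/* Stands in for the register writes that push a key into the hardware. */
static void toy_hw_load_key(struct toy_hw *hw, const uint8_t *key)
{
	memcpy(hw->loaded_key, key, TOY_KEY_LEN);
	hw->key_valid = 1;
	hw->key_loads++;
}

/*
 * Reload the key only when it differs from the one the hardware already
 * holds. The caller is assumed to hold the device lock.
 */
static void toy_ensure_key(struct toy_hw *hw, const uint8_t *key)
{
	assert(hw != NULL && key != NULL);

	if (hw->key_valid && memcmp(hw->loaded_key, key, TOY_KEY_LEN) == 0)
		return;		/* same key as last time: skip the reload */

	toy_hw_load_key(hw, key);
}
```

A per-block path would then call `toy_ensure_key()` where the posted code calls `set_key()`. In a real driver the decryption key would presumably have to be regenerated whenever the key is reloaded or the direction changes, and comparing 16 bytes per block has a cost of its own, so whether this wins over an exclusive-use mode would need measuring.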
Re: [CRYPTO] is it really optimized ?
On 4/17/07, Herbert Xu <[EMAIL PROTECTED]> wrote:
>
> Yep. We don't need such a flag anyway. All we need is a way to tweak
> the priority and Bob's your uncle.
>

Could you elaborate please, I don't see how you prevent other users
from using this module with priority.

Priority is something that tells you which aes implementation to use but
it does not prevent an implementation from being used several times...

Thanks
--
Francis