Hello Rafael,

On 11/22/2013 11:08 PM, Rafael J. Wysocki wrote:
> On Friday, November 22, 2013 10:36:23 PM Francis Moreau wrote:
>> On 11/22/2013 01:54 PM, Rafael J. Wysocki wrote:
>>> On Friday, November 22, 2013 10:57:25 AM Francis Moreau wrote:
>>>> On 22/11/2013 08:43, Francis Moreau wrote:
>>>>> On 21/11/2013 12:17, Jingoo Han wrote:
>>>>> [...]
>>>>>>>
>>>>>>>> Also I took a look at the changes between v3.11 and v3.12 in this area
>>>>>>>> and those changes match the issue I'm facing:
>>>>>>>>
>>>>>>>> $ git log --oneline v3.11..v3.12 -- drivers/mfd/rtsx_pcr.c
>>>>>>>> 09fd867 mfd: rtsx: Copyright modifications
>>>>>>>> eb891c6 mfd: rtsx: Configure to enter a deeper power-saving mode in S3
>>>>>>>> 7140812 mfd: rtsx: Move some actions from rtsx_pci_init_hw to individual extra_init_hw
>>>>>>>> 5947c16 mfd: rtsx: Add shutdown callback in rtsx_pci_driver
>>>>>>>> 773ccdf mfd: rtsx: Read vendor setting from config space
>>>>>>
>>>>>> In my opinion, rtsx_pci_resume()/rtsx_pci_suspend() in realtek PCIe card
>>>>>> reader driver may make the kernel panic.
>>>>>>
>>>>>> I think that the commit "mfd: rtsx: Configure to enter a deeper
>>>>>> power-saving mode in S3" may be the culprit.
>>>>>
>>>>> Unfortunately no, reverting this commit on top of v3.12 doesn't help. I
>>>>> also reverted 7140812, 5947c16 but it didn't improve anything.
>>>>>
>>>>> The good news is that I managed to have a "light" kernel configuration
>>>>> which is faster to build and, more importantly, it seems that the bug is
>>>>> almost 100% reproducible now.
>>>>>
>>>>> So I'll try to do another git-bisect session later.
>>>>
>>>> So after bisecting the v3.11..v3.12 range, git bisect told me:
>>>>
>>>> the first bad commit is 551f5c74e17ba9257cdc35bf657ee448cad2d5b0
>>>>
>>>> Merge branch 'acpi-processor'
>>>>
>>>> * acpi-processor:
>>>> ACPI / processor: Acquire writer lock to update CPU maps
>>>> ACPI / processor: Remove acpi_processor_get_limit_info()
>>>>
>>>> The two commits brought by the merge are not the culprits because
>>>> resetting HEAD on "ACPI / processor: Acquire writer lock to update CPU
>>>> maps" doesn't have the issue anymore.
>>>>
>>>> At that point I'm not sure how to bisect further.
>>>
>>> Does the second parent of this merge (that is, 8462d9df9d50) have the
>>> problem?
>>>
>>
>> Yes it does.
>>
>> Ok, I've finally managed to find out the bad commit:
>> ad07277e82dedabacc52c82746633680a3187d25: ACPI / PM: Hold acpi_scan_lock
>> over system PM transitions
>>
>> I verified that the parent commit doesn't have the problem.
>
> Interesting.
>
>> Rafael, you're the man now ;)
>
> I kind of don't see how that commit may result in behavior that you
> described earlier in the thread.
>
> You get a memory corruption that seems to have started to happen because
> we're holding an additional lock over suspend resume now. Something's fishy
> on that machine and we need to figure out what it is.
>
> Please file a bug at bugzilla.kernel.org against ACPI and assign it to me.
> Please put all of the relevant info in there and attach the output of dmesg
> after a fresh boot and the output of acpidump from the affected machine to
> the bug entry.
>

I just sent a new trace with DEBUG_OBJECTS enabled which seems to give some
interesting traces. If nothing can be found from them, I'll do the bug report.

Thanks.