Re: 3.12-rc5 and overwritten partition table - by powertop?
>(CCing linux-ide) > >It seems like most likely either the SATA host controller or drive >doesn't play nice with link power management enabled. Can you post the >full dmesg boot log? I found backup, yay! -> attached Compared to Stefan's report, same hardware, same SSD, only my BIOS is one rev older and my 840 Pro firmware is also one rev older. Hope it helps, J log_oct29.txt.bz2 Description: BZip2 compressed data
Re: 3.12-rc5 and overwritten partition table - by powertop?
On 10/29/2013 04:32 PM, John Twideldum wrote: The first ~170kb of /dev/sda got blown away with what seems to be a logging output by Powertop, when I was playing with the tuneables. So did you log the output to some file? I'm just trying to understand how it could get onto your disk in the first place... Attached a dump of the first 1Mb of the disk, HTH. It looks like a powertop log? (I have powertop 2.4) Yes, likely. But it is strange the corruption doesn't even end at any sensible boundary (data ends at offset 0x27b53). Shrug... My recollection what I did is this: I was looking into powertop and observing how -rc5 works now with Haswell. I saw the tuneable parameters and quite a few were "bad", so I set them to "good". Power usage dropped about one third - yay! However, changing "SATA link power" threw up complaints: Oct 29 09:09:21 localhost kernel: [ 3697.423868] ata1.00: exception Emask 0x10 SAct 0x1 SErr 0xc action 0x6 frozen Oct 29 09:09:21 localhost kernel: [ 3697.423873] ata1.00: irq_stat 0x0800, interface fatal error Oct 29 09:09:21 localhost kernel: [ 3697.423877] ata1: SError: { CommWake 10B8B } Oct 29 09:09:21 localhost kernel: [ 3697.423880] ata1.00: failed command: WRITE FPDMA QUEUED Oct 29 09:09:21 localhost kernel: [ 3697.423886] ata1.00: cmd 61/38:00:01:9e:a4/01:00:00:00:00/40 tag 0 ncq 159744 out Oct 29 09:09:21 localhost kernel: [ 3697.423886] res 50/01:00:01:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error) Oct 29 09:09:21 localhost kernel: [ 3697.423888] ata1.00: status: { DRDY } Oct 29 09:09:21 localhost kernel: [ 3697.423894] ata1: hard resetting link Oct 29 09:09:22 localhost kernel: [ 3697.743196] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300) Oct 29 09:09:22 localhost kernel: [ 3697.744707] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded Oct 29 09:09:22 localhost kernel: [ 3697.744719] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out Oct 29 09:09:22 localhost kernel: [ 3697.744725] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out Oct 29 09:09:22 localhost kernel: [ 3697.744813] ata1.00: ACPI cmd ef/10:09:00:00:00:a0 (SET FEATURES) succeeded Oct 29 09:09:22 localhost kernel: [ 3697.745212] ata1.00: failed to get NCQ Send/Recv Log Emask 0x1 Oct 29 09:09:22 localhost kernel: [ 3697.746694] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded Oct 29 09:09:22 localhost kernel: [ 3697.746705] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out Oct 29 09:09:22 localhost kernel: [ 3697.746711] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out Oct 29 09:09:22 localhost kernel: [ 3697.746779] ata1.00: ACPI cmd ef/10:09:00:00:00:a0 (SET FEATURES) succeeded Oct 29 09:09:22 localhost kernel: [ 3697.747286] ata1.00: failed to get NCQ Send/Recv Log Emask 0x1 Oct 29 09:09:22 localhost kernel: [ 3697.747432] ata1.00: configured for UDMA/133 Oct 29 09:09:22 localhost kernel: [ 3697.763181] ata1: EH complete I did not know yet about what "frozen" means, so I did not investigate and very soon powered down as I had to leave. Next time I boot up I did not boot. So data probable is just the size because as long as I had powertop running... (CCing linux-ide) It seems like most likely either the SATA host controller or drive doesn't play nice with link power management enabled. Can you post the full dmesg boot log? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 3.12-rc5 and overwritten partition table - by powertop?
Am 2013-10-29 21:10, schrieb Jan Kara: >> The first ~170kb of /dev/sda got blown away with what seems to be a logging >> output >> by Powertop, when I was playing with the tuneables. >> (Luckily the first partition starts later :-)) > So did you log the output to some file? I'm just trying to understand how > it could get onto your disk in the first place... > >> Why is that I don't know, but maybe when turning on the SATA knobs >> something goes wrong. I'm afraid to try again, but I accept rather higher >> power use than data loss again :-/ I experienced the same on the very same hardware (Lenovo T440s). Like John, I turned all those knobs in powertop, including the SATA ones. Several time I ended up with broken partition table. Once, even my EFI System partition (first partition) was broken. However, since I use EFI I was able to recover the partition table quite easily (gdisk asks for recovery from backup partition table, kudos to the designer of the GPT format!). This happens running on Arch Linux with stock 3.12.7 as well as mainline 3.13 kernel. I use latest T440s firmware (2.17). Is it possible to disable/warn user when using that knob (at least on Lenovo T440s), in order to avoid users left at an unbootable system...? dmesg output: [2.744398] ata1: SATA max UDMA/133 abar m2048@0xf063c000 port 0xf063c100 irq 59 [2.744400] ata2: DUMMY [2.744401] ata3: DUMMY [3.063804] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [3.064532] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded [3.064536] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out [3.064538] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out [3.064606] ata1.00: ACPI cmd ef/10:09:00:00:00:a0 (SET FEATURES) succeeded [3.064926] ata1.00: failed to get NCQ Send/Recv Log Emask 0x1 [3.064929] ata1.00: ATA-9: Samsung SSD 840 PRO Series, DXM05B0Q, max UDMA/133 [3.064931] ata1.00: 500118192 sectors, multi 16: LBA48 NCQ (depth 31/32), AA [3.065256] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded [3.065259] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out [3.065261] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out [3.065286] ata1.00: ACPI cmd ef/10:09:00:00:00:a0 (SET FEATURES) succeeded [3.065545] ata1.00: failed to get NCQ Send/Recv Log Emask 0x1 [3.065605] ata1.00: configured for UDMA/133 ... [ 130.578789] ata1.00: exception Emask 0x0 SAct 0x7fff SErr 0x4 action 0x0 [ 130.578794] ata1.00: irq_stat 0x4001 [ 130.578796] ata1: SError: { CommWake } [ 130.578798] ata1.00: failed command: WRITE FPDMA QUEUED [ 130.578802] ata1.00: cmd 61/10:00:f0:29:05/00:00:00:00:00/40 tag 0 ncq 8192 out [ 130.578804] ata1.00: status: { DRDY ERR } [ 130.578806] ata1.00: error: { ABRT } ... [ 130.579011] ata1.00: failed command: WRITE FPDMA QUEUED [ 130.579014] ata1.00: cmd 61/10:f0:58:7c:0f/00:00:00:00:00/40 tag 30 ncq 8192 out [ 130.579016] ata1.00: status: { DRDY ERR } [ 130.579017] ata1.00: error: { ABRT } [ 130.579207] ata1.00: failed to get NCQ Send/Recv Log Emask 0x1 [ 130.579456] ata1.00: failed to get NCQ Send/Recv Log Emask 0x1 [ 130.579511] ata1.00: configured for UDMA/133 [ 130.579583] ata1: EH complete -- Stefan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 3.12-rc5 and overwritten partition table - by powertop?
>> >> The first ~170kb of /dev/sda got blown away with what seems to be a >> >> logging output >> >> by Powertop, when I was playing with the tuneables. >> > >> >So did you log the output to some file? I'm just trying to understand how >> >it could get onto your disk in the first place... >> >> Attached a dump of the first 1Mb of the disk, HTH. >> It looks like a powertop log? >> (I have powertop 2.4) > >Yes, likely. But it is strange the corruption doesn't even end at any >sensible boundary (data ends at offset 0x27b53). Shrug... My recollection what I did is this: I was looking into powertop and observing how -rc5 works now with Haswell. I saw the tuneable parameters and quite a few were "bad", so I set them to "good". Power usage dropped about one third - yay! However, changing "SATA link power" threw up complaints: Oct 29 09:09:21 localhost kernel: [ 3697.423868] ata1.00: exception Emask 0x10 SAct 0x1 SErr 0xc action 0x6 frozen Oct 29 09:09:21 localhost kernel: [ 3697.423873] ata1.00: irq_stat 0x0800, interface fatal error Oct 29 09:09:21 localhost kernel: [ 3697.423877] ata1: SError: { CommWake 10B8B } Oct 29 09:09:21 localhost kernel: [ 3697.423880] ata1.00: failed command: WRITE FPDMA QUEUED Oct 29 09:09:21 localhost kernel: [ 3697.423886] ata1.00: cmd 61/38:00:01:9e:a4/01:00:00:00:00/40 tag 0 ncq 159744 out Oct 29 09:09:21 localhost kernel: [ 3697.423886] res 50/01:00:01:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error) Oct 29 09:09:21 localhost kernel: [ 3697.423888] ata1.00: status: { DRDY } Oct 29 09:09:21 localhost kernel: [ 3697.423894] ata1: hard resetting link Oct 29 09:09:22 localhost kernel: [ 3697.743196] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300) Oct 29 09:09:22 localhost kernel: [ 3697.744707] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded Oct 29 09:09:22 localhost kernel: [ 3697.744719] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out Oct 29 09:09:22 localhost kernel: [ 3697.744725] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out Oct 29 09:09:22 localhost kernel: [ 3697.744813] ata1.00: ACPI cmd ef/10:09:00:00:00:a0 (SET FEATURES) succeeded Oct 29 09:09:22 localhost kernel: [ 3697.745212] ata1.00: failed to get NCQ Send/Recv Log Emask 0x1 Oct 29 09:09:22 localhost kernel: [ 3697.746694] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded Oct 29 09:09:22 localhost kernel: [ 3697.746705] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out Oct 29 09:09:22 localhost kernel: [ 3697.746711] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out Oct 29 09:09:22 localhost kernel: [ 3697.746779] ata1.00: ACPI cmd ef/10:09:00:00:00:a0 (SET FEATURES) succeeded Oct 29 09:09:22 localhost kernel: [ 3697.747286] ata1.00: failed to get NCQ Send/Recv Log Emask 0x1 Oct 29 09:09:22 localhost kernel: [ 3697.747432] ata1.00: configured for UDMA/133 Oct 29 09:09:22 localhost kernel: [ 3697.763181] ata1: EH complete I did not know yet about what "frozen" means, so I did not investigate and very soon powered down as I had to leave. Next time I boot up I did not boot. So data probable is just the size because as long as I had powertop running... HTH, J -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 3.12-rc5 and overwritten partition table - by powertop?
On Tue 29-10-13 22:58:40, John Twideldum wrote: > >> The first ~170kb of /dev/sda got blown away with what seems to be a > >> logging output > >> by Powertop, when I was playing with the tuneables. > > > >So did you log the output to some file? I'm just trying to understand how > >it could get onto your disk in the first place... > > Attached a dump of the first 1Mb of the disk, HTH. > It looks like a powertop log? > (I have powertop 2.4) Yes, likely. But it is strange the corruption doesn't even end at any sensible boundary (data ends at offset 0x27b53). Shrug... > meanwhile, testdisk running in all variations recovered enough clues to > guess the partitions correctly - the data is fine :-) Good. Honza -- Jan Kara SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 3.12-rc5 and overwritten partition table - by powertop?
>> The first ~170kb of /dev/sda got blown away with what seems to be a logging >> output >> by Powertop, when I was playing with the tuneables. > >So did you log the output to some file? I'm just trying to understand how >it could get onto your disk in the first place... Attached a dump of the first 1Mb of the disk, HTH. It looks like a powertop log? (I have powertop 2.4) meanwhile, testdisk running in all variations recovered enough clues to guess the partitions correctly - the data is fine :-) J backup_overwrite.dump.bz2 Description: BZip2 compressed data
Re: 3.12-rc5 and overwritten partition table - by powertop?
On Tue 29-10-13 15:57:54, John Twideldum wrote: > replying to myself with more insights... > > >000 13.0\nalsa:hw > >020 C0D0\t99.0\nal > >040 sa:hwC0D3\t99.000 > >060 00\nbacklight:acp > >. > > The first ~170kb of /dev/sda got blown away with what seems to be a logging > output > by Powertop, when I was playing with the tuneables. > (Luckily the first partition starts later :-)) So did you log the output to some file? I'm just trying to understand how it could get onto your disk in the first place... > Why is that I don't know, but maybe when turning on the SATA knobs > something goes wrong. I'm afraid to try again, but I accept rather higher > power use than data loss again :-/ Honza -- Jan Kara SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/