Bug#533616: linux-image-2.6.29-2-amd64: occasional ext3 filesystem corruption

2009-08-06 Thread Matijs van Zuijlen
Moritz Muehlenhoff wrote:
 Did this happen again with 2.6.30 from current unstable?

No, it fortunately has not happened since the upgrade to 2.6.30.

Regards,
Matijs



-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Bug#533616: linux-image-2.6.29-2-amd64: occasional ext3 filesystem corruption

2009-08-05 Thread Moritz Muehlenhoff
On Sat, Jun 20, 2009 at 09:15:23AM +0200, Matijs van Zuijlen wrote:
 Bastian Blank wrote:
  
  [...]
  
  Log from the fsck run?
 
 Again, I don't have an actual file, just written notes. I can give you the list
 of orphan inodes, but the bitmaps were a bit much to write down. Any thoughts on
 how to capture this stuff next time this happens (it was the second time already,
 I had neglected to mention that)?

Did this happen again with 2.6.30 from current unstable? 

Cheers,
 Moritz






Bug#533616: linux-image-2.6.29-2-amd64: occasional ext3 filesystem corruption

2009-06-21 Thread Matijs van Zuijlen
Matijs van Zuijlen wrote:
 I'll try running memtester and/or memtest86 to check this.

I did one full pass of memtest86+, and no errors were found.

-- 
Matijs






Bug#533616: linux-image-2.6.29-2-amd64: occasional ext3 filesystem corruption

2009-06-20 Thread Matijs van Zuijlen
Bastian Blank wrote:
 
 [...]
 
 Log from the fsck run?

Again, I don't have an actual file, just written notes. I can give you the list
of orphan inodes, but the bitmaps were a bit much to write down. Any thoughts on
how to capture this stuff next time this happens (it was the second time already,
I had neglected to mention that)?
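
One possible way to capture the fsck output next time is sketched below in
Python. It is only a sketch under assumptions: the helper name run_and_log and
the log path /mnt/usb/fsck.log are hypothetical, the log has to live on a
separately mounted, writable device (the root filesystem is read-only at that
point), and Python has to be available in the recovery environment. The -n
option of e2fsck opens the filesystem read-only and answers "no" to all
questions, so the check itself changes nothing.

```python
import subprocess
import sys


def run_and_log(cmd, log_path):
    """Run cmd, echo its output to the terminal, and append it to log_path.

    Returns the exit status of the command.
    """
    with open(log_path, "a", encoding="utf-8", errors="replace") as log:
        proc = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep error messages in the same log
            text=True,
            errors="replace",
        )
        for line in proc.stdout:
            sys.stdout.write(line)
            log.write(line)
            log.flush()  # push each line out promptly in case the machine hangs
        return proc.wait()


# Hypothetical usage from a root prompt, with a USB stick mounted on /mnt/usb:
#   run_and_log(["e2fsck", "-n", "/dev/sda3"], "/mnt/usb/fsck.log")
```

The same helper could be pointed at dmesg to save the kernel messages before
rebooting, again writing to a device other than the affected one.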

 Some of the differences are normal if the journal
 got aborted. But overall this looks like bad hardware, most likely
 memory.

I'll try running memtester and/or memtest86 to check this.

-- 
Matijs






Bug#533616: linux-image-2.6.29-2-amd64: occasional ext3 filesystem corruption

2009-06-19 Thread Matijs van Zuijlen
Package: linux-image-2.6.29-2-amd64
Version: 2.6.29-5
Severity: grave
Justification: causes non-serious data loss

Yesterday, I found my root filesystem mounted read-only. Dmesg gave the
following messages (retyped by hand, which is why the timestamps are
missing):

  EXT3-fs error (device sda3): htree_dirblock_to_tree: bad entry in directory
#91119000: directory entry accross blocks - offset=0, inode=2364050278,
rec_len=36552, name-len=216
  Aborting journal on device sda3.
  Remounting filesystem read-only
  __journal_remove_journal_head: freeing b_comitted_data

Note the ridiculously large inode number mentioned in the first line.
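
The numeric fields of the message above can be checked mechanically. The
following Python sketch parses them and tests whether the directory entry
would extend past the end of its block, which is what the "across blocks"
wording suggests. The block size of 4096 bytes is an assumption (the report
does not state it), and the names parse_fields and crosses_block are
hypothetical helpers, not kernel or e2fsprogs functions.

```python
import re

# The message as retyped in the report, joined onto one line.
MESSAGE = (
    "EXT3-fs error (device sda3): htree_dirblock_to_tree: bad entry in "
    "directory #91119000: directory entry accross blocks - offset=0, "
    "inode=2364050278, rec_len=36552, name-len=216"
)

ASSUMED_BLOCK_SIZE = 4096  # assumption: not stated anywhere in the report


def parse_fields(message):
    """Return the numeric key=value fields of the message as a dict."""
    return {k: int(v) for k, v in re.findall(r"([A-Za-z_-]+)=(\d+)", message)}


def crosses_block(fields, block_size=ASSUMED_BLOCK_SIZE):
    """True if the entry would run past the end of its directory block."""
    return fields["offset"] + fields["rec_len"] > block_size


fields = parse_fields(MESSAGE)
print(fields)
print("entry crosses block boundary:", crosses_block(fields))
```

With these values rec_len alone (36552) is larger than the assumed block
size, so the entry cannot be valid regardless of its offset.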

After reboot, I needed to run fsck from the root prompt. An orphaned inode
list was fixed, a zero dtime was fixed, block bitmap differences were
fixed, free block counts were fixed, free inode counts were fixed. I seem
to recall inode bitmap differences were also fixed.

-- Package-specific info:
** Version:
Linux version 2.6.29-2-amd64 (Debian 2.6.29-5) (wa...@debian.org) (gcc version 4.3.3 (Debian 4.3.3-10) ) #1 SMP Sun May 17 17:15:47 UTC 2009

** Command line:
root=/dev/sda3 ro 

** Not tainted

** Kernel log:
[10757.264576] firewire_ohci :03:03.0: restoring config space at offset 0x3 (was 0x0, writing 0xf810)
[10757.264576] firewire_ohci :03:03.0: restoring config space at offset 0x1 (was 0x290, writing 0x2900216)
[10757.264576] Enabling non-boot CPUs ...
[10757.264576] SMP alternatives: switching to SMP code
[10757.387717] Booting processor 1 APIC 0x1 ip 0x6000
[10757.264016] Initializing CPU#1
[10757.264016] Calibrating delay using timer specific routine.. 4322.66 BogoMIPS (lpj=8645327)
[10757.264016] CPU: L1 I cache: 32K, L1 D cache: 32K
[10757.264016] CPU: L2 cache: 4096K
[10757.264016] CPU 1/0x1 - Node 0
[10757.264016] CPU: Physical Processor ID: 0
[10757.264016] CPU: Processor Core ID: 1
[10757.264016] CPU1: Thermal monitoring enabled (TM2)
[10757.476572] CPU1: Intel(R) Core(TM)2 CPU T7400  @ 2.16GHz stepping 06
[10757.476614] CPU0 attaching NULL sched-domain.
[10757.477015] Switched to high resolution mode on CPU 1
[10757.488758] CPU0 attaching sched-domain:
[10757.488761]  domain 0: span 0-1 level MC
[10757.488762]   groups: 0 1
[10757.488766] CPU1 attaching sched-domain:
[10757.488767]  domain 0: span 0-1 level MC
[10757.488769]   groups: 1 0
[10757.492025] CPU1 is up
[10757.492027] ACPI: Waking up from system sleep state S3
[10757.696555] ACPI: EC: non-query interrupt received, switching to interrupt mode
[10757.859613] pci :00:02.0: PME# disabled
[10757.859618] pci :00:02.1: PME# disabled
[10757.859621] pci :00:07.0: PME# disabled
[10757.859679] HDA Intel :00:1b.0: PCI INT A - GSI 22 (level, low) - IRQ 22
[10757.859685] HDA Intel :00:1b.0: setting latency timer to 64
[10757.859716] pcieport-driver :00:1c.0: setting latency timer to 64
[10757.859725] pcieport-driver :00:1c.1: setting latency timer to 64
[10757.859760] uhci_hcd :00:1d.0: PCI INT A - GSI 21 (level, low) - IRQ 21
[10757.859765] uhci_hcd :00:1d.0: setting latency timer to 64
[10757.859790] usb usb2: root hub lost power or was reset
[10757.859850] uhci_hcd :00:1d.1: PCI INT B - GSI 19 (level, low) - IRQ 19
[10757.859855] uhci_hcd :00:1d.1: setting latency timer to 64
[10757.859879] usb usb3: root hub lost power or was reset
[10757.859922] uhci_hcd :00:1d.2: PCI INT C - GSI 18 (level, low) - IRQ 18
[10757.859928] uhci_hcd :00:1d.2: setting latency timer to 64
[10757.859951] usb usb4: root hub lost power or was reset
[10757.860004] uhci_hcd :00:1d.3: PCI INT D - GSI 16 (level, low) - IRQ 16
[10757.860038] uhci_hcd :00:1d.3: setting latency timer to 64
[10757.860070] usb usb5: root hub lost power or was reset
[10757.860140] ehci_hcd :00:1d.7: PME# disabled
[10757.860144] ehci_hcd :00:1d.7: PCI INT A - GSI 21 (level, low) - IRQ 21
[10757.860150] ehci_hcd :00:1d.7: setting latency timer to 64
[10757.860156] ehci_hcd :00:1d.7: PME# disabled
[10757.860248] pci :00:1e.0: power state changed by ACPI to D0
[10757.860255] pci :00:1e.0: setting latency timer to 64
[10757.860297] PIIX_IDE :00:1f.1: power state changed by ACPI to D0
[10757.860333] PIIX_IDE :00:1f.1: power state changed by ACPI to D0
[10757.860337] PIIX_IDE :00:1f.1: PCI INT A - GSI 18 (level, low) - IRQ 18
[10757.860361] PIIX_IDE :00:1f.1: restoring config space at offset 0x1 (was 0x2880005, writing 0x285)
[10757.860370] PIIX_IDE :00:1f.1: setting latency timer to 64
[10757.860415] ata_piix :00:1f.2: PCI INT B - GSI 19 (level, low) - IRQ 19
[10757.860419] ata_piix :00:1f.2: setting latency timer to 64
[10757.860526] sky2 :01:00.0: restoring config space at offset 0x1 (was 0x40100407, writing 0x100407)
[10757.860588] sky2 :01:00.0: PME# disabled
[10757.860635] ath9k :02:00.0: PCI INT A - GSI 17 (level, low) - IRQ 17
[10757.932148] firewire_core: skipped bus generations, destroying all 

Bug#533616: linux-image-2.6.29-2-amd64: occasional ext3 filesystem corruption

2009-06-19 Thread Bastian Blank
severity 533616 important
tags 533616 moreinfo
thanks

On Fri, Jun 19, 2009 at 12:38:12PM +0200, Matijs van Zuijlen wrote:
 Yesterday, I found my root filesystem mounted read-only. Dmesg gave the
 following messages (retyped by hand, which is why the timestamps are
 missing):
 
   EXT3-fs error (device sda3): htree_dirblock_to_tree: bad entry in directory
 #91119000: directory entry accross blocks - offset=0, inode=2364050278,
 rec_len=36552, name-len=216
   Aborting journal on device sda3.
   Remounting filesystem read-only
   __journal_remove_journal_head: freeing b_comitted_data
 
 Note the ridiculously large inode number mentioned in the first line.
 
 After reboot, I needed to run fsck from the root prompt. An orphaned inode
 list was fixed, a zero dtime was fixed, block bitmap differences were
 fixed, free block counts were fixed, free inode counts were fixed. I seem
 to recall inode bitmap differences were also fixed.

Log from the fsck run? Some of the differences are normal if the journal
got aborted. But overall this looks like bad hardware, most likely
memory.

Bastian


