Re: Scrub aborts due to corrupt leaf

2018-12-31 Thread Larkin Lowrey
On 12/31/2018 7:12 PM, Qu Wenruo wrote: On 2018/12/31 11:52 PM, Larkin Lowrey wrote: On 10/11/2018 12:15 AM, Chris Murphy wrote: Is this a 68T file system? Seems excessive. Haha, by excessive I mean nuking such a big fs just for being unable to remove the space tree. I'm quite sure the devs wou

Re: Scrub aborts due to corrupt leaf

2018-12-31 Thread Qu Wenruo
On 2018/12/31 11:52 PM, Larkin Lowrey wrote: > On 10/11/2018 12:15 AM, Chris Murphy wrote: >> Is this a 68T file system? Seems excessive. >> Haha, by excessive I mean nuking such a big fs just for being unable >> to remove the space tree. I'm quite sure the devs would like to get >> that crashing

Re: Scrub aborts due to corrupt leaf

2018-12-31 Thread Larkin Lowrey
On 10/11/2018 12:15 AM, Chris Murphy wrote: Is this a 68T file system? Seems excessive. Haha, by excessive I mean nuking such a big fs just for being unable to remove the space tree. I'm quite sure the devs would like to get that crashing bug fixed, anyway. A second FS just started failing. I n

Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Chris Murphy
On Wed, Oct 10, 2018 at 10:00 PM, Chris Murphy wrote: > On Wed, Oct 10, 2018 at 9:07 PM, Larkin Lowrey > wrote: >> On 10/10/2018 10:51 PM, Chris Murphy wrote: >>> >>> On Wed, Oct 10, 2018 at 8:12 PM, Larkin Lowrey >>> wrote: On 10/10/2018 7:55 PM, Hans van Kranenburg wrote: > >

Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Chris Murphy
On Wed, Oct 10, 2018 at 9:07 PM, Larkin Lowrey wrote: > On 10/10/2018 10:51 PM, Chris Murphy wrote: >> >> On Wed, Oct 10, 2018 at 8:12 PM, Larkin Lowrey >> wrote: >>> >>> On 10/10/2018 7:55 PM, Hans van Kranenburg wrote: On 10/10/2018 07:44 PM, Chris Murphy wrote: > > > I'm

Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Larkin Lowrey
On 10/10/2018 10:51 PM, Chris Murphy wrote: On Wed, Oct 10, 2018 at 8:12 PM, Larkin Lowrey wrote: On 10/10/2018 7:55 PM, Hans van Kranenburg wrote: On 10/10/2018 07:44 PM, Chris Murphy wrote: I'm pretty sure you have to umount, and then clear the space_cache with 'btrfs check --clear-space-c

Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Chris Murphy
On Wed, Oct 10, 2018 at 8:12 PM, Larkin Lowrey wrote: > On 10/10/2018 7:55 PM, Hans van Kranenburg wrote: >> >> On 10/10/2018 07:44 PM, Chris Murphy wrote: >>> >>> >>> I'm pretty sure you have to umount, and then clear the space_cache >>> with 'btrfs check --clear-space-cache=v1' and then do a one

Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Larkin Lowrey
On 10/10/2018 7:55 PM, Hans van Kranenburg wrote: On 10/10/2018 07:44 PM, Chris Murphy wrote: I'm pretty sure you have to umount, and then clear the space_cache with 'btrfs check --clear-space-cache=v1' and then do a one time mount with -o space_cache=v2. The --clear-space-cache=v1 is optional
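The sequence quoted above (unmount, optionally clear the v1 cache, then a one-time mount with space_cache=v2) can be sketched as a dry-run command builder. This is a minimal sketch, not a tested recovery procedure: the device path and mount point in the example are hypothetical placeholders, and the function only returns the commands rather than running them.

```python
from typing import List


def space_cache_v2_commands(device: str, mountpoint: str,
                            clear_v1: bool = True) -> List[List[str]]:
    """Return argv lists for the space_cache v1 -> v2 switch.

    Nothing is executed here. `device` and `mountpoint` are supplied by
    the caller (the values used below are hypothetical).
    """
    commands = [["umount", mountpoint]]
    if clear_v1:
        # Described as optional in the thread: drop the old v1 cache
        # while the filesystem is unmounted.
        commands.append(["btrfs", "check", "--clear-space-cache=v1", device])
    # One-time mount with the v2 option, as quoted in the thread.
    commands.append(["mount", "-o", "space_cache=v2", device, mountpoint])
    return commands


if __name__ == "__main__":
    # Hypothetical device and mount point, for illustration only.
    for argv in space_cache_v2_commands("/dev/md0", "/mnt/backups"):
        print(" ".join(argv))
```

Printing the commands instead of executing them keeps the sketch safe to run on any machine; review each line before running it by hand against a real filesystem.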

Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Hans van Kranenburg
On 10/10/2018 07:44 PM, Chris Murphy wrote: > On Wed, Oct 10, 2018 at 10:04 AM, Holger Hoffstätte > wrote: >> On 10/10/18 17:44, Larkin Lowrey wrote: >> (..) >>> >>> About once a week, or so, I'm running into the above situation where >>> FS seems to deadlock. All IO to the FS blocks, there is no

Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Qu Wenruo
On 2018/10/11 1:25 AM, Larkin Lowrey wrote: > On 10/10/2018 12:04 PM, Holger Hoffstätte wrote: >> On 10/10/18 17:44, Larkin Lowrey wrote: >> (..) >>> About once a week, or so, I'm running into the above situation where >>> FS seems to deadlock. All IO to the FS blocks, there is no IO >>> activity

Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Chris Murphy
On Wed, Oct 10, 2018 at 12:31 PM, Larkin Lowrey wrote: > Interesting, because I do not see any indications of any other errors. The > fs is backed by an mdraid array and the raid checks always pass with no > mismatches, edac-util doesn't report any ECC errors, smartd doesn't report > any SMART er

Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Larkin Lowrey
On 10/10/2018 2:20 PM, Holger Hoffstätte wrote: On 10/10/18 19:25, Larkin Lowrey wrote: On 10/10/2018 12:04 PM, Holger Hoffstätte wrote: On 10/10/18 17:44, Larkin Lowrey wrote: (..) About once a week, or so, I'm running into the above situation where FS seems to deadlock. All IO to the FS bloc

Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Holger Hoffstätte
On 10/10/18 19:44, Chris Murphy wrote: On Wed, Oct 10, 2018 at 10:04 AM, Holger Hoffstätte wrote: On 10/10/18 17:44, Larkin Lowrey wrote: (..) About once a week, or so, I'm running into the above situation where FS seems to deadlock. All IO to the FS blocks, there is no IO activity at all. I

Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Holger Hoffstätte
On 10/10/18 19:25, Larkin Lowrey wrote: On 10/10/2018 12:04 PM, Holger Hoffstätte wrote: On 10/10/18 17:44, Larkin Lowrey wrote: (..) About once a week, or so, I'm running into the above situation where FS seems to deadlock. All IO to the FS blocks, there is no IO activity at all. I have to har

Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Chris Murphy
On Wed, Oct 10, 2018 at 10:04 AM, Holger Hoffstätte wrote: > On 10/10/18 17:44, Larkin Lowrey wrote: > (..) >> >> About once a week, or so, I'm running into the above situation where >> FS seems to deadlock. All IO to the FS blocks, there is no IO >> activity at all. I have to hard reboot the syst

Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Larkin Lowrey
On 10/10/2018 12:04 PM, Holger Hoffstätte wrote: On 10/10/18 17:44, Larkin Lowrey wrote: (..) About once a week, or so, I'm running into the above situation where FS seems to deadlock. All IO to the FS blocks, there is no IO activity at all. I have to hard reboot the system to recover. There are

Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Holger Hoffstätte
On 10/10/18 17:44, Larkin Lowrey wrote: (..) About once a week, or so, I'm running into the above situation where FS seems to deadlock. All IO to the FS blocks, there is no IO activity at all. I have to hard reboot the system to recover. There are no error indications except for the following whi

Re: Scrub aborts due to corrupt leaf

2018-10-10 Thread Larkin Lowrey
On 9/11/2018 11:23 AM, Larkin Lowrey wrote: On 8/29/2018 1:32 AM, Qu Wenruo wrote: On 2018/8/28 9:56 PM, Chris Murphy wrote: On Tue, Aug 28, 2018 at 7:42 AM, Qu Wenruo wrote: On 2018/8/28 9:29 PM, Larkin Lowrey wrote: On 8/27/2018 10:12 PM, Larkin Lowrey wrote: On 8/27/2018 12:46 AM, Qu Wen

Re: Scrub aborts due to corrupt leaf

2018-09-11 Thread Larkin Lowrey
On 8/29/2018 1:32 AM, Qu Wenruo wrote: On 2018/8/28 9:56 PM, Chris Murphy wrote: On Tue, Aug 28, 2018 at 7:42 AM, Qu Wenruo wrote: On 2018/8/28 9:29 PM, Larkin Lowrey wrote: On 8/27/2018 10:12 PM, Larkin Lowrey wrote: On 8/27/2018 12:46 AM, Qu Wenruo wrote: The system uses ECC memory and ed

Re: Scrub aborts due to corrupt leaf

2018-08-28 Thread Qu Wenruo
On 2018/8/28 9:56 PM, Chris Murphy wrote: > On Tue, Aug 28, 2018 at 7:42 AM, Qu Wenruo wrote: >> >> >> On 2018/8/28 9:29 PM, Larkin Lowrey wrote: >>> On 8/27/2018 10:12 PM, Larkin Lowrey wrote: On 8/27/2018 12:46 AM, Qu Wenruo wrote: > >> The system uses ECC memory and edac-util has n

Re: Scrub aborts due to corrupt leaf

2018-08-28 Thread Qu Wenruo
On 2018/8/28 9:56 PM, Chris Murphy wrote: > On Tue, Aug 28, 2018 at 7:42 AM, Qu Wenruo wrote: >> >> >> On 2018/8/28 9:29 PM, Larkin Lowrey wrote: >>> On 8/27/2018 10:12 PM, Larkin Lowrey wrote: On 8/27/2018 12:46 AM, Qu Wenruo wrote: > >> The system uses ECC memory and edac-util has n

Re: Scrub aborts due to corrupt leaf

2018-08-28 Thread Chris Murphy
On Tue, Aug 28, 2018 at 7:42 AM, Qu Wenruo wrote: > > > On 2018/8/28 9:29 PM, Larkin Lowrey wrote: >> On 8/27/2018 10:12 PM, Larkin Lowrey wrote: >>> On 8/27/2018 12:46 AM, Qu Wenruo wrote: > The system uses ECC memory and edac-util has not reported any errors. > However, I will run a

Re: Scrub aborts due to corrupt leaf

2018-08-28 Thread Qu Wenruo
On 2018/8/28 9:29 PM, Larkin Lowrey wrote: > On 8/27/2018 10:12 PM, Larkin Lowrey wrote: >> On 8/27/2018 12:46 AM, Qu Wenruo wrote: >>> The system uses ECC memory and edac-util has not reported any errors. However, I will run a memtest anyway. >>> So it should not be the memory problem.

Re: Scrub aborts due to corrupt leaf

2018-08-28 Thread Larkin Lowrey
On 8/27/2018 10:12 PM, Larkin Lowrey wrote: On 8/27/2018 12:46 AM, Qu Wenruo wrote: The system uses ECC memory and edac-util has not reported any errors. However, I will run a memtest anyway. So it should not be the memory problem. BTW, what's the current generation of the fs? # btrfs inspe

Re: Scrub aborts due to corrupt leaf

2018-08-27 Thread Chris Murphy
On Mon, Aug 27, 2018 at 8:12 PM, Larkin Lowrey wrote: > On 8/27/2018 12:46 AM, Qu Wenruo wrote: >> >> >>> The system uses ECC memory and edac-util has not reported any errors. >>> However, I will run a memtest anyway. >> >> So it should not be the memory problem. >> >> BTW, what's the current gene

Re: Scrub aborts due to corrupt leaf

2018-08-27 Thread Larkin Lowrey
On 8/27/2018 12:46 AM, Qu Wenruo wrote: The system uses ECC memory and edac-util has not reported any errors. However, I will run a memtest anyway. So it should not be the memory problem. BTW, what's the current generation of the fs? # btrfs inspect dump-super | grep generation The corrupt

Re: Scrub aborts due to corrupt leaf

2018-08-26 Thread Qu Wenruo
On 2018/8/27 10:32 AM, Larkin Lowrey wrote: > On 8/26/2018 8:16 PM, Qu Wenruo wrote: >> Corrupted tree block bytenr matches with the number reported by kernel. >> You could provide the tree block dump for bytenr 7687860535296, and >> maybe we could find out what's going wrong and fix it manually.

Re: Scrub aborts due to corrupt leaf

2018-08-26 Thread Larkin Lowrey
On 8/26/2018 8:16 PM, Qu Wenruo wrote: Corrupted tree block bytenr matches with the number reported by kernel. You could provide the tree block dump for bytenr 7687860535296, and maybe we could find out what's going wrong and fix it manually. # btrfs ins dump-tree -b 7687860535296 Thank you f

Re: Scrub aborts due to corrupt leaf

2018-08-26 Thread Qu Wenruo
On 2018/8/27 4:45 AM, Larkin Lowrey wrote: > When I do a scrub it aborts about 10% of the way in due to: > > corrupt leaf: root=7 block=7687860535296 slot=0, invalid key objectid > for csum item, have 18446744073650847734 expect 18446744073709551606 This error message explains itself. Key objec

Scrub aborts due to corrupt leaf

2018-08-26 Thread Larkin Lowrey
When I do a scrub it aborts about 10% of the way in due to: corrupt leaf: root=7 block=7687860535296 slot=0, invalid key objectid for csum item, have 18446744073650847734 expect 18446744073709551606 The filesystem in question stores my backups and I have verified all of the backups so I know
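The two objectids in the error message are easier to compare in hex. The sketch below is only arithmetic on the numbers quoted above; the function name is made up for illustration, and the reading of the expected value as the unsigned form of -10 (the csum item objectid, as far as I know from the kernel headers) is an assumption stated here rather than something taken from the thread.

```python
U64 = 1 << 64  # objectids are unsigned 64-bit values


def compare_objectids(have: int, expect: int) -> dict:
    """Show two u64 objectids in hex and how many bits differ.

    Hypothetical helper for illustration; it does not touch a filesystem.
    """
    diff = have ^ expect
    return {
        "have_hex": f"{have:#018x}",
        "expect_hex": f"{expect:#018x}",
        # Reinterpret the expected value as a signed 64-bit integer.
        "expect_signed": expect - U64 if expect >= (1 << 63) else expect,
        "xor_hex": f"{diff:#x}",
        "bits_differing": bin(diff).count("1"),
    }


if __name__ == "__main__":
    # Values copied from the "corrupt leaf" message above.
    report = compare_objectids(18446744073650847734, 18446744073709551606)
    for name, value in report.items():
        print(f"{name}: {value}")
```

With these inputs the expected value reads as -10 when treated as signed, and the two values differ in several bits rather than one, which is worth keeping in mind before assuming a single-bit memory flip.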