Re: Transaction aborted in btrfs_rename2

2016-07-29 Thread Jeff Mahoney
On 7/29/16 12:13 PM, Adam Borowski wrote:
> On Fri, Jul 29, 2016 at 11:43:29AM -0400, Jeff Mahoney wrote:
>> On 6/6/16 10:13 AM, Jeff Mahoney wrote:
>>> On 6/6/16 7:47 AM, Adam Borowski wrote:
>>>> Hi!
>>>> I just got this thrice, in 4.7-rc1 and 4.7-rc2:
>>>>
>>>> [ 1836.672368] [ cut here ]
>>>> [ 1836.672382] WARNING: CPU: 1 PID: 16348 at fs/btrfs/inode.c:9820 
>>>> btrfs_rename2+0xcd2/0x2a50
>>>> [ 1836.672385] BTRFS: Transaction aborted (error -2)
>>>> [ 1836.672396] CPU: 1 PID: 16348 Comm: gcc-6 Tainted: P   O
>>>> 4.7.0-rc2-debug+ #3
>>>> [ 1836.672415] Call Trace:
>>>> [ 1836.672423]  [] dump_stack+0x4e/0x71
>>>> [ 1836.672429]  [] __warn+0x10c/0x150
>>>> [ 1836.672433]  [] warn_slowpath_fmt+0x4a/0x50
>>>> [ 1836.672437]  [] btrfs_rename2+0xcd2/0x2a50
>>>> [ 1836.672443]  [] ? btrfs_permission+0x5b/0xc0
>>>> [ 1836.672448]  [] ? down_write+0x18/0x60
>>>> [ 1836.672453]  [] vfs_rename+0x7cc/0xc30
>>>> [ 1836.672457]  [] SyS_rename+0x32b/0x420
>>>> [ 1836.672461]  [] entry_SYSCALL_64_fastpath+0x17/0x93
>>>> [ 1836.672464] ---[ end trace 6405b6e3d0e6c945 ]---
>>>> [ 1836.672468] BTRFS warning (device sda1): btrfs_rename:9820: Aborting 
>>>> unused transaction(No such entry).
>>>> [ 1836.675505] BTRFS warning (device sda1): btrfs_rename:9820: Aborting 
>>>> unused transaction(No such entry).
>>>>
>>>
>>> Oh, interesting.  We're seeing this on our 4.4-based kernels as well but
>>> only on arm64.  That it's triggering on x86_64 is a good data point.
>>> I'm hunting this one today.
>>
>> I was finally able to track down what this was on arm64, and I'm afraid
>> the news won't help you much.  It was a bug in gcc 4.8.5 instruction
>> scheduling around function return that caused the stack pointer to be
>> restored to the position at the beginning of the function while the
>> stack was still being used via a separate register.  If an interrupt
>> arrived between those two instructions, you'd get stack corruption that
>> would present as bad hash values.
>>
>> Are you still able to reproduce this on x86_64?
> 
> Nope, not in quite a while.  I haven't used middle 4.7 rcs so I don't know
> when it went away.
> 
> I use gcc-6, too.
> 

Ok, thanks.  I've not been able to reproduce it anywhere but on arm64,
so I'm trying to find out whether there are other, platform-agnostic
ways to trigger it.

-Jeff

-- 
Jeff Mahoney
SUSE Labs





Re: Transaction aborted in btrfs_rename2

2016-07-29 Thread Adam Borowski
On Fri, Jul 29, 2016 at 11:43:29AM -0400, Jeff Mahoney wrote:
> On 6/6/16 10:13 AM, Jeff Mahoney wrote:
> > On 6/6/16 7:47 AM, Adam Borowski wrote:
> >> Hi!
> >> I just got this thrice, in 4.7-rc1 and 4.7-rc2:
> >>
> >> [ 1836.672368] [ cut here ]
> >> [ 1836.672382] WARNING: CPU: 1 PID: 16348 at fs/btrfs/inode.c:9820 
> >> btrfs_rename2+0xcd2/0x2a50
> >> [ 1836.672385] BTRFS: Transaction aborted (error -2)
> >> [ 1836.672396] CPU: 1 PID: 16348 Comm: gcc-6 Tainted: P   O
> >> 4.7.0-rc2-debug+ #3
> >> [ 1836.672415] Call Trace:
> >> [ 1836.672423]  [] dump_stack+0x4e/0x71
> >> [ 1836.672429]  [] __warn+0x10c/0x150
> >> [ 1836.672433]  [] warn_slowpath_fmt+0x4a/0x50
> >> [ 1836.672437]  [] btrfs_rename2+0xcd2/0x2a50
> >> [ 1836.672443]  [] ? btrfs_permission+0x5b/0xc0
> >> [ 1836.672448]  [] ? down_write+0x18/0x60
> >> [ 1836.672453]  [] vfs_rename+0x7cc/0xc30
> >> [ 1836.672457]  [] SyS_rename+0x32b/0x420
> >> [ 1836.672461]  [] entry_SYSCALL_64_fastpath+0x17/0x93
> >> [ 1836.672464] ---[ end trace 6405b6e3d0e6c945 ]---
> >> [ 1836.672468] BTRFS warning (device sda1): btrfs_rename:9820: Aborting 
> >> unused transaction(No such entry).
> >> [ 1836.675505] BTRFS warning (device sda1): btrfs_rename:9820: Aborting 
> >> unused transaction(No such entry).
> >> 
> > 
> > Oh, interesting.  We're seeing this on our 4.4-based kernels as well but
> > only on arm64.  That it's triggering on x86_64 is a good data point.
> > I'm hunting this one today.
> 
> I was finally able to track down what this was on arm64, and I'm afraid
> the news won't help you much.  It was a bug in gcc 4.8.5 instruction
> scheduling around function return that caused the stack pointer to be
> restored to the position at the beginning of the function while the
> stack was still being used via a separate register.  If an interrupt
> arrived between those two instructions, you'd get stack corruption that
> would present as bad hash values.
> 
> Are you still able to reproduce this on x86_64?

Nope, not in quite a while.  I haven't run the intermediate 4.7 rcs, so I
don't know when it went away.

I use gcc-6, too.

-- 
An imaginary friend squared is a real enemy.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Transaction aborted in btrfs_rename2

2016-07-29 Thread Jeff Mahoney
On 6/6/16 10:13 AM, Jeff Mahoney wrote:
> On 6/6/16 7:47 AM, Adam Borowski wrote:
>> Hi!
>> I just got this thrice, in 4.7-rc1 and 4.7-rc2:
>>
>> [ 1836.672368] [ cut here ]
>> [ 1836.672382] WARNING: CPU: 1 PID: 16348 at fs/btrfs/inode.c:9820 
>> btrfs_rename2+0xcd2/0x2a50
>> [ 1836.672385] BTRFS: Transaction aborted (error -2)
>> [ 1836.672387] Modules linked in: nvidia(PO) usb_storage
>> [ 1836.672396] CPU: 1 PID: 16348 Comm: gcc-6 Tainted: P   O
>> 4.7.0-rc2-debug+ #3
>> [ 1836.672399] Hardware name: System manufacturer System Product 
>> Name/M4A77T, BIOS 2401 05/18/2011
>> [ 1836.672402]  81f8b504 880062c47c78 8165be6d 
>> 0007
>> [ 1836.672407]  880062c47cd0  880062c47cc0 
>> 81110c1c
>> [ 1836.672411]  880062c47d20 265c814e8642  
>> 00a25ade
>> [ 1836.672415] Call Trace:
>> [ 1836.672423]  [] dump_stack+0x4e/0x71
>> [ 1836.672429]  [] __warn+0x10c/0x150
>> [ 1836.672433]  [] warn_slowpath_fmt+0x4a/0x50
>> [ 1836.672437]  [] btrfs_rename2+0xcd2/0x2a50
>> [ 1836.672443]  [] ? btrfs_permission+0x5b/0xc0
>> [ 1836.672448]  [] ? down_write+0x18/0x60
>> [ 1836.672453]  [] vfs_rename+0x7cc/0xc30
>> [ 1836.672457]  [] SyS_rename+0x32b/0x420
>> [ 1836.672461]  [] entry_SYSCALL_64_fastpath+0x17/0x93
>> [ 1836.672464] ---[ end trace 6405b6e3d0e6c945 ]---
>> [ 1836.672468] BTRFS warning (device sda1): btrfs_rename:9820: Aborting 
>> unused transaction(No such entry).
>> [ 1836.675505] BTRFS warning (device sda1): btrfs_rename:9820: Aborting 
>> unused transaction(No such entry).
>> 
>> [ 1837.935238] BTRFS warning (device sda1): btrfs_rename:9820: Aborting 
>> unused transaction(No such entry).
>> [ 1837.937602] BTRFS: error (device sda1) in btrfs_rename:9820: errno=-2 No 
>> such entry
>> [ 1837.937607] BTRFS info (device sda1): forced readonly
>> [ 1838.086754] BTRFS warning (device sda1): Skipping commit of aborted 
>> transaction.
>> [ 1838.086762] BTRFS: error (device sda1) in cleanup_transaction:1857: 
>> errno=-2 No such entry
>> [ 1838.086782] BTRFS info (device sda1): delayed_refs has NO entry
>>
>> Didn't trigger during a week of other work, yet a kernel compile triggers
>> this reliably.
>>
>> Filesystem appears consistent (btrfs check, scrub).
>> Mount options: noatime,compress=lzo,ssd,space_cache.
>>
> 
> Oh, interesting.  We're seeing this on our 4.4-based kernels as well but
> only on arm64.  That it's triggering on x86_64 is a good data point.
> I'm hunting this one today.

Hi Adam -

I was finally able to track down what this was on arm64, and I'm afraid
the news won't help you much.  It was a bug in gcc 4.8.5 instruction
scheduling around function return that caused the stack pointer to be
restored to the position at the beginning of the function while the
stack was still being used via a separate register.  If an interrupt
arrived between those two instructions, you'd get stack corruption that
would present as bad hash values.
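
To make the window concrete, here is a toy user-space model of that
ordering problem.  It is only a sketch under my own assumptions: the
array standing in for the stack, the names compute_hash() and
fake_interrupt(), and the frame sizes are all hypothetical, and none of
it is the code gcc 4.8.5 actually emitted.  It just shows why a value
read through a second register after the stack pointer has been
restored can be overwritten by an interrupt frame.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS     64            /* size of the simulated stack */
#define IRQ_FRAME_WORDS 8             /* words an "interrupt" pushes */
#define CLOBBER         0xdeadbeefULL /* pattern left by the handler */

/* Simulated stack: grows downward, sp is an index into stack_mem. */
static uint64_t stack_mem[STACK_WORDS];
static size_t sp = STACK_WORDS;

/* Models an interrupt: it may use anything below the current sp. */
static void fake_interrupt(void)
{
    for (size_t i = 1; i <= IRQ_FRAME_WORDS; i++)
        stack_mem[sp - i] = CLOBBER;
}

/*
 * Models a function epilogue.  With early_restore set, sp is put back
 * before the last read of a local through a separate "register"
 * (locals), which is the ordering described above.
 */
static uint64_t compute_hash(int early_restore, int irq_in_window)
{
    size_t saved_sp = sp;
    sp -= 4;                      /* allocate a 4-word frame */
    size_t locals = sp;           /* second register pointing at locals */
    stack_mem[locals] = 0x1234;   /* the "hash" kept in a local */

    uint64_t result;
    if (early_restore) {
        sp = saved_sp;            /* frame released too early */
        if (irq_in_window)
            fake_interrupt();     /* lands on the released frame */
        result = stack_mem[locals];
    } else {
        result = stack_mem[locals];
        sp = saved_sp;            /* correct order: read, then release */
        if (irq_in_window)
            fake_interrupt();
    }
    return result;
}
```

In this model compute_hash(1, 1) returns the clobber pattern instead of
0x1234, while compute_hash(0, 1) and compute_hash(1, 0) both return
0x1234: the bad value only shows up when the early restore and the
interrupt coincide, which fits how rarely the problem triggered.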

Are you still able to reproduce this on x86_64?

Thanks,

-Jeff

-- 
Jeff Mahoney
SUSE Labs





Re: Transaction aborted in btrfs_rename2

2016-06-06 Thread Jeff Mahoney
On 6/6/16 7:47 AM, Adam Borowski wrote:
> Hi!
> I just got this thrice, in 4.7-rc1 and 4.7-rc2:
> 
> [ 1836.672368] [ cut here ]
> [ 1836.672382] WARNING: CPU: 1 PID: 16348 at fs/btrfs/inode.c:9820 
> btrfs_rename2+0xcd2/0x2a50
> [ 1836.672385] BTRFS: Transaction aborted (error -2)
> [ 1836.672387] Modules linked in: nvidia(PO) usb_storage
> [ 1836.672396] CPU: 1 PID: 16348 Comm: gcc-6 Tainted: P   O
> 4.7.0-rc2-debug+ #3
> [ 1836.672399] Hardware name: System manufacturer System Product Name/M4A77T, 
> BIOS 2401 05/18/2011
> [ 1836.672402]  81f8b504 880062c47c78 8165be6d 
> 0007
> [ 1836.672407]  880062c47cd0  880062c47cc0 
> 81110c1c
> [ 1836.672411]  880062c47d20 265c814e8642  
> 00a25ade
> [ 1836.672415] Call Trace:
> [ 1836.672423]  [] dump_stack+0x4e/0x71
> [ 1836.672429]  [] __warn+0x10c/0x150
> [ 1836.672433]  [] warn_slowpath_fmt+0x4a/0x50
> [ 1836.672437]  [] btrfs_rename2+0xcd2/0x2a50
> [ 1836.672443]  [] ? btrfs_permission+0x5b/0xc0
> [ 1836.672448]  [] ? down_write+0x18/0x60
> [ 1836.672453]  [] vfs_rename+0x7cc/0xc30
> [ 1836.672457]  [] SyS_rename+0x32b/0x420
> [ 1836.672461]  [] entry_SYSCALL_64_fastpath+0x17/0x93
> [ 1836.672464] ---[ end trace 6405b6e3d0e6c945 ]---
> [ 1836.672468] BTRFS warning (device sda1): btrfs_rename:9820: Aborting 
> unused transaction(No such entry).
> [ 1836.675505] BTRFS warning (device sda1): btrfs_rename:9820: Aborting 
> unused transaction(No such entry).
> 
> [ 1837.935238] BTRFS warning (device sda1): btrfs_rename:9820: Aborting 
> unused transaction(No such entry).
> [ 1837.937602] BTRFS: error (device sda1) in btrfs_rename:9820: errno=-2 No 
> such entry
> [ 1837.937607] BTRFS info (device sda1): forced readonly
> [ 1838.086754] BTRFS warning (device sda1): Skipping commit of aborted 
> transaction.
> [ 1838.086762] BTRFS: error (device sda1) in cleanup_transaction:1857: 
> errno=-2 No such entry
> [ 1838.086782] BTRFS info (device sda1): delayed_refs has NO entry
> 
> Didn't trigger during a week of other work, yet a kernel compile triggers
> this reliably.
> 
> Filesystem appears consistent (btrfs check, scrub).
> Mount options: noatime,compress=lzo,ssd,space_cache.
> 

Oh, interesting.  We're seeing this on our 4.4-based kernels as well but
only on arm64.  That it's triggering on x86_64 is a good data point.
I'm hunting this one today.

-Jeff

-- 
Jeff Mahoney
SUSE Labs





Transaction aborted in btrfs_rename2

2016-06-06 Thread Adam Borowski
Hi!
I just got this thrice, in 4.7-rc1 and 4.7-rc2:

[ 1836.672368] [ cut here ]
[ 1836.672382] WARNING: CPU: 1 PID: 16348 at fs/btrfs/inode.c:9820 btrfs_rename2+0xcd2/0x2a50
[ 1836.672385] BTRFS: Transaction aborted (error -2)
[ 1836.672387] Modules linked in: nvidia(PO) usb_storage
[ 1836.672396] CPU: 1 PID: 16348 Comm: gcc-6 Tainted: P   O 4.7.0-rc2-debug+ #3
[ 1836.672399] Hardware name: System manufacturer System Product Name/M4A77T, BIOS 2401 05/18/2011
[ 1836.672402]  81f8b504 880062c47c78 8165be6d 0007
[ 1836.672407]  880062c47cd0  880062c47cc0 81110c1c
[ 1836.672411]  880062c47d20 265c814e8642  00a25ade
[ 1836.672415] Call Trace:
[ 1836.672423]  [] dump_stack+0x4e/0x71
[ 1836.672429]  [] __warn+0x10c/0x150
[ 1836.672433]  [] warn_slowpath_fmt+0x4a/0x50
[ 1836.672437]  [] btrfs_rename2+0xcd2/0x2a50
[ 1836.672443]  [] ? btrfs_permission+0x5b/0xc0
[ 1836.672448]  [] ? down_write+0x18/0x60
[ 1836.672453]  [] vfs_rename+0x7cc/0xc30
[ 1836.672457]  [] SyS_rename+0x32b/0x420
[ 1836.672461]  [] entry_SYSCALL_64_fastpath+0x17/0x93
[ 1836.672464] ---[ end trace 6405b6e3d0e6c945 ]---
[ 1836.672468] BTRFS warning (device sda1): btrfs_rename:9820: Aborting unused transaction(No such entry).
[ 1836.675505] BTRFS warning (device sda1): btrfs_rename:9820: Aborting unused transaction(No such entry).

[ 1837.935238] BTRFS warning (device sda1): btrfs_rename:9820: Aborting unused transaction(No such entry).
[ 1837.937602] BTRFS: error (device sda1) in btrfs_rename:9820: errno=-2 No such entry
[ 1837.937607] BTRFS info (device sda1): forced readonly
[ 1838.086754] BTRFS warning (device sda1): Skipping commit of aborted transaction.
[ 1838.086762] BTRFS: error (device sda1) in cleanup_transaction:1857: errno=-2 No such entry
[ 1838.086782] BTRFS info (device sda1): delayed_refs has NO entry

Didn't trigger during a week of other work, yet a kernel compile triggers
this reliably.

Filesystem appears consistent (btrfs check, scrub).
Mount options: noatime,compress=lzo,ssd,space_cache.
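
For anyone reading the log above: the "error -2" and "errno=-2" are the
kernel's negative-errno convention, i.e. -ENOENT.  A small user-space
sketch for decoding such codes follows; describe_kernel_error() is a
hypothetical helper name of mine, not a kernel or btrfs-progs function.
Note that libc's text for ENOENT will differ from the "No such entry"
wording in the log, which as far as I can tell comes from btrfs's own
message table rather than from strerror().

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper: map a kernel-style negative error code
 * (e.g. -2 from "Transaction aborted (error -2)") to libc's
 * description of the corresponding positive errno value.
 */
static const char *describe_kernel_error(int err)
{
    /* Kernel code returns -ENOENT; strerror() expects ENOENT. */
    return strerror(err < 0 ? -err : err);
}
```

On Linux ENOENT is 2, so describe_kernel_error(-2) gives the same
string as strerror(ENOENT).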

-- 
An imaginary friend squared is a real enemy.