Re: [bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later

2019-12-09 Thread Jan Stancek


- Original Message -
> 
> 
> On 12/6/19 6:09 PM, dftxbs3e wrote:
> > Hello!
> > 
> > I am very happy that someone has found this issue.
> > 
> > I have been suffering from rather random SIGBUS errors in similar
> > conditions described by the author.
> > 
> > I don't have much troubleshooting information to provide, however, I hit
> > the issue regularly so I could investigate during that.
> > 
> > How do you debug such an issue? I tried a debugger etc. but besides
> > crashing with SIGBUS, I couldnt get any other meaningful information.

If it's same issue, you could check if dropping caches helps.
Figure out what page is it with crash or systemtap and look at page->flags
and ((struct iomap_page *)page->private)->uptodate bitmap.

> 
> You may want to test the patch Christoph sent on the original thread for
> this issue.

Or v5.5-rc1, Christoph's patch has been merged:
  1cea335d1db1 ("iomap: fix sub-page uptodate handling")



Re: [bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later

2019-12-08 Thread Eric Sandeen



On 12/6/19 6:09 PM, dftxbs3e wrote:
> Hello!
> 
> I am very happy that someone has found this issue.
> 
> I have been suffering from rather random SIGBUS errors in similar
> conditions described by the author.
> 
> I don't have much troubleshooting information to provide, however, I hit
> the issue regularly so I could investigate during that.
> 
> How do you debug such an issue? I tried a debugger etc. but besides
> crashing with SIGBUS, I couldnt get any other meaningful information.

You may want to test the patch Christoph sent on the original thread for
this issue.

-Eric


[bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later

2019-12-06 Thread dftxbs3e

Hello!

I am very happy that someone has found this issue.

I have been suffering from rather random SIGBUS errors in similar
conditions described by the author.

I don't have much troubleshooting information to provide, however, I hit
the issue regularly so I could investigate during that.

How do you debug such an issue? I tried a debugger etc. but besides
crashing with SIGBUS, I couldnt get any other meaningful information.

dftxbs3e

On 12/3/19 1:50 PM, Jan Stancek wrote:

Hi,

(This bug report is summary from thread [1] with some additions)

User-space binaries on Power9 ppc64le (with 64k pages) on xfs
filesystem are sporadically hitting SIGBUS:

-- 8< --
(gdb) r
Starting program: /mnt/testarea/ltp/testcases/bin/genasin

Program received signal SIGBUS, Bus error.
dl_main (phdr=0x1040, phnum=, user_entry=0x7fffe760, 
auxv=) at rtld.c:1362
1362switch (ph->p_type)

(gdb) p ph
$1 = (const Elf64_Phdr *) 0x1040

(gdb) p *ph
Cannot access memory at address 0x1040

(gdb) info proc map
process 1110670
Mapped address spaces:

Start Addr   End Addr   Size Offset objfile
0x1000 0x10010x10x0 
/mnt/testarea/ltp/testcases/bin/genasin
0x1001 0x10030x20x0 
/mnt/testarea/ltp/testcases/bin/genasin
0x77f9 0x77fb0x20x0 [vdso]
0x77fb 0x77fe0x30x0 
/usr/lib64/ld-2.30.so
0x77fe 0x78000x20x2 
/usr/lib64/ld-2.30.so
0x7ffd 0x80000x30x0 [stack]

(gdb) x/1x 0x1040
0x1040: Cannot access memory at address 0x1040
-- >8 --

When this happens the binary continues to hit SIGBUS until page
is released, for example by: echo 3 > /proc/sys/vm/drop_caches

The issue goes back to at least v4.19.

I can semi-reliably reproduce it with LTP is installed to /mnt/testarea/ltp by:
while [ True ]; do
  echo 3 > /proc/sys/vm/drop_caches
  rm -f /mnt/testarea/ltp/results/RUNTEST.log 
/mnt/testarea/ltp/output/RUNTEST.run.log
  ./runltp -p -d results -l RUNTEST.log -o RUNTEST.run.log -f math
  grep FAIL /mnt/testarea/ltp/results/RUNTEST.log && exit 1
done

and some stress activity in other terminal (e.g. kernel build).
Sometimes in minutes, sometimes in hours. It is not reliable
enough to get meaningful bisect results.

My theory is that there's a race in iomap. There appear to be
interleaved calls to iomap_set_range_uptodate() for same page
with varying offset and length. Each call sees bitmap as _not_
entirely "uptodate" and hence doesn't call SetPageUptodate().
Even though each bit in bitmap ends up uptodate by the time
all calls finish.

For example, with following traces:

iomap_set_range_uptodate()
...
  if (uptodate && !PageError(page))
  SetPageUptodate(page);
+
+   if (mycheck(page)) {
+   trace_printk("page: %px, iop: %px, uptodate: %d, !PageError(page): %d, 
flags: %lx\n", page, iop, uptodate, !PageError(page), page->flags);
+   trace_printk("first: %u, last: %u, off: %u, len: %u, i: %u\n", 
first, last, off, len, i);
+   }

I get:
   genacos-18471 [057]    162.465730: iomap_readpages: mapping: 
c03f185a1ab0
   genacos-18471 [057]    162.465732: iomap_page_create: 
iomap_page_create page: c00c0fe26180, page->private: , iop: 
c03fc70a19c0, flags: 380001
   genacos-18471 [057]    162.465736: iomap_set_range_uptodate: 
page: c00c0fe26180, iop: c03fc70a19c0, uptodate: 0, !PageError(page): 
1, flags: 382001
   genacos-18471 [057]    162.465736: iomap_set_range_uptodate: 
first: 1, last: 14, off: 4096, len: 57344, i: 16
-0 [060] ..s.   162.534862: iomap_set_range_uptodate: 
page: c00c0fe26180, iop: c03fc70a19c0, uptodate: 0, !PageError(page): 1, 
flags: 382081
-0 [061] ..s.   162.534862: iomap_set_range_uptodate: 
page: c00c0fe26180, iop: c03fc70a19c0, uptodate: 0, !PageError(page): 1, 
flags: 382081
-0 [060] ..s.   162.534864: iomap_set_range_uptodate: 
first: 0, last: 0, off: 0, len: 4096, i: 16
-0 [061] ..s.   162.534864: iomap_set_range_uptodate: 
first: 15, last: 15, off: 61440, len: 4096, i: 16

This page doesn't have Uptodate flag set, which leads to filemap_fault()
returning VM_FAULT_SIGBUS:

crash> p/x ((struct page *) 0xc00c0fe26180)->flags
$1 = 0x382032

crash> kmem -g 0x382032
FLAGS: 382032
PAGE-FLAG   BIT  VALUE
PG_error  1  002
PG_dirty  4  010
PG_lru5  020
PG_private_2 13  0002000
PG_fscache   13  0002000
PG_savepinned 4  010
PG_double_map13  0002000

But 

Re: [bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later

2019-12-04 Thread Jan Stancek


- Original Message -
> Please try the patch below:

I ran reproducer for 18 hours on 2 systems were it previously reproduced,
there were no crashes / SIGBUS.



Re: [bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later

2019-12-03 Thread Christoph Hellwig
Please try the patch below:

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 512856a88106..340c15400423 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -28,6 +28,7 @@
 struct iomap_page {
atomic_tread_count;
atomic_twrite_count;
+   spinlock_t  uptodate_lock;
DECLARE_BITMAP(uptodate, PAGE_SIZE / 512);
 };
 
@@ -51,6 +52,7 @@ iomap_page_create(struct inode *inode, struct page *page)
iop = kmalloc(sizeof(*iop), GFP_NOFS | __GFP_NOFAIL);
atomic_set(>read_count, 0);
atomic_set(>write_count, 0);
+   spin_lock_init(>uptodate_lock);
bitmap_zero(iop->uptodate, PAGE_SIZE / SECTOR_SIZE);
 
/*
@@ -139,25 +141,38 @@ iomap_adjust_read_range(struct inode *inode, struct 
iomap_page *iop,
 }
 
 static void
-iomap_set_range_uptodate(struct page *page, unsigned off, unsigned len)
+iomap_iop_set_range_uptodate(struct page *page, unsigned off, unsigned len)
 {
struct iomap_page *iop = to_iomap_page(page);
struct inode *inode = page->mapping->host;
unsigned first = off >> inode->i_blkbits;
unsigned last = (off + len - 1) >> inode->i_blkbits;
-   unsigned int i;
bool uptodate = true;
+   unsigned long flags;
+   unsigned int i;
 
-   if (iop) {
-   for (i = 0; i < PAGE_SIZE / i_blocksize(inode); i++) {
-   if (i >= first && i <= last)
-   set_bit(i, iop->uptodate);
-   else if (!test_bit(i, iop->uptodate))
-   uptodate = false;
-   }
+   spin_lock_irqsave(>uptodate_lock, flags);
+   for (i = 0; i < PAGE_SIZE / i_blocksize(inode); i++) {
+   if (i >= first && i <= last)
+   set_bit(i, iop->uptodate);
+   else if (!test_bit(i, iop->uptodate))
+   uptodate = false;
}
 
-   if (uptodate && !PageError(page))
+   if (uptodate)
+   SetPageUptodate(page);
+   spin_unlock_irqrestore(>uptodate_lock, flags);
+}
+
+static void
+iomap_set_range_uptodate(struct page *page, unsigned off, unsigned len)
+{
+   if (PageError(page))
+   return;
+
+   if (page_has_private(page))
+   iomap_iop_set_range_uptodate(page, off, len);
+   else
SetPageUptodate(page);
 }
 


Re: [bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later

2019-12-03 Thread Darrick J. Wong
On Tue, Dec 03, 2019 at 09:35:28AM -0500, Jan Stancek wrote:
> 
> - Original Message -
> > On Tue, Dec 03, 2019 at 07:50:39AM -0500, Jan Stancek wrote:
> > > My theory is that there's a race in iomap. There appear to be
> > > interleaved calls to iomap_set_range_uptodate() for same page
> > > with varying offset and length. Each call sees bitmap as _not_
> > > entirely "uptodate" and hence doesn't call SetPageUptodate().
> > > Even though each bit in bitmap ends up uptodate by the time
> > > all calls finish.
> > 
> > Weird.  That should be prevented by the page lock that all callers
> > of iomap_set_range_uptodate.  But in case I miss something, does
> > the patch below trigger?  If not it is not jut a race, but might
> > be some weird ordering problem with the bitops, especially if it
> > only triggers on ppc, which is very weakly ordered.
> > 
> > diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> > index d33c7bc5ee92..25e942c71590 100644
> > --- a/fs/iomap/buffered-io.c
> > +++ b/fs/iomap/buffered-io.c
> > @@ -148,6 +148,8 @@ iomap_set_range_uptodate(struct page *page, unsigned 
> > off,
> > unsigned len)
> > unsigned int i;
> > bool uptodate = true;
> >  
> > +   WARN_ON_ONCE(!PageLocked(page));
> > +
> > if (iop) {
> > for (i = 0; i < PAGE_SIZE / i_blocksize(inode); i++) {
> > if (i >= first && i <= last)
> > 
> 
> Hit it pretty quick this time:
> 
> # uptime
>  09:27:42 up 22 min,  2 users,  load average: 0.09, 13.38, 26.18
> 
> # /mnt/testarea/ltp/testcases/bin/genbessel   
>   
> 
> Bus error (core dumped)
> 
> # dmesg | grep -i -e warn -e call 
>   
> 
> [0.00] dt-cpu-ftrs: not enabling: system-call-vectored (disabled or 
> unsupported by kernel)
> [0.00] random: get_random_u64 called from 
> cache_random_seq_create+0x98/0x1e0 with crng_init=0
> [0.00] rcu: Offload RCU callbacks from CPUs: (none).
> [5.312075] megaraid_sas 0031:01:00.0: megasas_disable_intr_fusion is 
> called outbound_intr_mask:0x4009
> [5.357307] megaraid_sas 0031:01:00.0: megasas_disable_intr_fusion is 
> called outbound_intr_mask:0x4009
> [5.485126] megaraid_sas 0031:01:00.0: megasas_enable_intr_fusion is 
> called outbound_intr_mask:0x4000
> 
> So, extra WARN_ON_ONCE applied on top of v5.4-8836-g81b6b96475ac
> did not trigger.
> 
> Is it possible for iomap code to submit multiple bio-s for same
> locked page and then receive callbacks in parallel?

Yes, if (say) you have 64k pages on a 4k-block filesystem and the extent
mapping for all 16 blocks aren't contiguous, then iomap will issue
separate bios for each physical fragment it finds.  iomap will call
submit_bio on those bios whenever it thinks it's done filling the bio,
so you can indeed get multiple callbacks in parallel.

--D


Re: [bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later

2019-12-03 Thread Jan Stancek


- Original Message -
> On Tue, Dec 03, 2019 at 07:50:39AM -0500, Jan Stancek wrote:
> > My theory is that there's a race in iomap. There appear to be
> > interleaved calls to iomap_set_range_uptodate() for same page
> > with varying offset and length. Each call sees bitmap as _not_
> > entirely "uptodate" and hence doesn't call SetPageUptodate().
> > Even though each bit in bitmap ends up uptodate by the time
> > all calls finish.
> 
> Weird.  That should be prevented by the page lock that all callers
> of iomap_set_range_uptodate.  But in case I miss something, does
> the patch below trigger?  If not it is not jut a race, but might
> be some weird ordering problem with the bitops, especially if it
> only triggers on ppc, which is very weakly ordered.
> 
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index d33c7bc5ee92..25e942c71590 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -148,6 +148,8 @@ iomap_set_range_uptodate(struct page *page, unsigned off,
> unsigned len)
>   unsigned int i;
>   bool uptodate = true;
>  
> + WARN_ON_ONCE(!PageLocked(page));
> +
>   if (iop) {
>   for (i = 0; i < PAGE_SIZE / i_blocksize(inode); i++) {
>   if (i >= first && i <= last)
> 

Hit it pretty quick this time:

# uptime
 09:27:42 up 22 min,  2 users,  load average: 0.09, 13.38, 26.18

# /mnt/testarea/ltp/testcases/bin/genbessel 


Bus error (core dumped)

# dmesg | grep -i -e warn -e call   


[0.00] dt-cpu-ftrs: not enabling: system-call-vectored (disabled or 
unsupported by kernel)
[0.00] random: get_random_u64 called from 
cache_random_seq_create+0x98/0x1e0 with crng_init=0
[0.00] rcu: Offload RCU callbacks from CPUs: (none).
[5.312075] megaraid_sas 0031:01:00.0: megasas_disable_intr_fusion is called 
outbound_intr_mask:0x4009
[5.357307] megaraid_sas 0031:01:00.0: megasas_disable_intr_fusion is called 
outbound_intr_mask:0x4009
[5.485126] megaraid_sas 0031:01:00.0: megasas_enable_intr_fusion is called 
outbound_intr_mask:0x4000

So, extra WARN_ON_ONCE applied on top of v5.4-8836-g81b6b96475ac
did not trigger.

Is it possible for iomap code to submit multiple bio-s for same
locked page and then receive callbacks in parallel?



Re: [bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later

2019-12-03 Thread Christoph Hellwig
On Tue, Dec 03, 2019 at 07:50:39AM -0500, Jan Stancek wrote:
> My theory is that there's a race in iomap. There appear to be
> interleaved calls to iomap_set_range_uptodate() for same page
> with varying offset and length. Each call sees bitmap as _not_
> entirely "uptodate" and hence doesn't call SetPageUptodate().
> Even though each bit in bitmap ends up uptodate by the time
> all calls finish.

Weird.  That should be prevented by the page lock that all callers
of iomap_set_range_uptodate.  But in case I miss something, does
the patch below trigger?  If not it is not jut a race, but might
be some weird ordering problem with the bitops, especially if it
only triggers on ppc, which is very weakly ordered.

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index d33c7bc5ee92..25e942c71590 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -148,6 +148,8 @@ iomap_set_range_uptodate(struct page *page, unsigned off, 
unsigned len)
unsigned int i;
bool uptodate = true;
 
+   WARN_ON_ONCE(!PageLocked(page));
+
if (iop) {
for (i = 0; i < PAGE_SIZE / i_blocksize(inode); i++) {
if (i >= first && i <= last)


[bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later

2019-12-03 Thread Jan Stancek
Hi,

(This bug report is summary from thread [1] with some additions)

User-space binaries on Power9 ppc64le (with 64k pages) on xfs
filesystem are sporadically hitting SIGBUS:

-- 8< --
(gdb) r
Starting program: /mnt/testarea/ltp/testcases/bin/genasin

Program received signal SIGBUS, Bus error.
dl_main (phdr=0x1040, phnum=, user_entry=0x7fffe760, 
auxv=) at rtld.c:1362
1362switch (ph->p_type)

(gdb) p ph
$1 = (const Elf64_Phdr *) 0x1040

(gdb) p *ph
Cannot access memory at address 0x1040

(gdb) info proc map
process 1110670
Mapped address spaces:

  Start Addr   End Addr   Size Offset objfile
  0x1000 0x10010x10x0 
/mnt/testarea/ltp/testcases/bin/genasin
  0x1001 0x10030x20x0 
/mnt/testarea/ltp/testcases/bin/genasin
  0x77f9 0x77fb0x20x0 [vdso]
  0x77fb 0x77fe0x30x0 
/usr/lib64/ld-2.30.so
  0x77fe 0x78000x20x2 
/usr/lib64/ld-2.30.so
  0x7ffd 0x80000x30x0 [stack]

(gdb) x/1x 0x1040
0x1040: Cannot access memory at address 0x1040
-- >8 --

When this happens the binary continues to hit SIGBUS until page
is released, for example by: echo 3 > /proc/sys/vm/drop_caches

The issue goes back to at least v4.19.

I can semi-reliably reproduce it with LTP is installed to /mnt/testarea/ltp by:
while [ True ]; do
echo 3 > /proc/sys/vm/drop_caches
rm -f /mnt/testarea/ltp/results/RUNTEST.log 
/mnt/testarea/ltp/output/RUNTEST.run.log
./runltp -p -d results -l RUNTEST.log -o RUNTEST.run.log -f math
grep FAIL /mnt/testarea/ltp/results/RUNTEST.log && exit 1
done

and some stress activity in other terminal (e.g. kernel build).
Sometimes in minutes, sometimes in hours. It is not reliable
enough to get meaningful bisect results.

My theory is that there's a race in iomap. There appear to be
interleaved calls to iomap_set_range_uptodate() for same page
with varying offset and length. Each call sees bitmap as _not_
entirely "uptodate" and hence doesn't call SetPageUptodate().
Even though each bit in bitmap ends up uptodate by the time
all calls finish.

For example, with following traces:

iomap_set_range_uptodate()
...
if (uptodate && !PageError(page))
SetPageUptodate(page);
+   
+   if (mycheck(page)) {
+   trace_printk("page: %px, iop: %px, uptodate: %d, 
!PageError(page): %d, flags: %lx\n", page, iop, uptodate, !PageError(page), 
page->flags);
+   trace_printk("first: %u, last: %u, off: %u, len: %u, i: %u\n", 
first, last, off, len, i);
+   }

I get:
 genacos-18471 [057]    162.465730: iomap_readpages: mapping: 
c03f185a1ab0
 genacos-18471 [057]    162.465732: iomap_page_create: 
iomap_page_create page: c00c0fe26180, page->private: , iop: 
c03fc70a19c0, flags: 380001
 genacos-18471 [057]    162.465736: iomap_set_range_uptodate: page: 
c00c0fe26180, iop: c03fc70a19c0, uptodate: 0, !PageError(page): 1, 
flags: 382001
 genacos-18471 [057]    162.465736: iomap_set_range_uptodate: 
first: 1, last: 14, off: 4096, len: 57344, i: 16
  -0 [060] ..s.   162.534862: iomap_set_range_uptodate: page: 
c00c0fe26180, iop: c03fc70a19c0, uptodate: 0, !PageError(page): 1, 
flags: 382081
  -0 [061] ..s.   162.534862: iomap_set_range_uptodate: page: 
c00c0fe26180, iop: c03fc70a19c0, uptodate: 0, !PageError(page): 1, 
flags: 382081
  -0 [060] ..s.   162.534864: iomap_set_range_uptodate: 
first: 0, last: 0, off: 0, len: 4096, i: 16
  -0 [061] ..s.   162.534864: iomap_set_range_uptodate: 
first: 15, last: 15, off: 61440, len: 4096, i: 16

This page doesn't have Uptodate flag set, which leads to filemap_fault()
returning VM_FAULT_SIGBUS:

crash> p/x ((struct page *) 0xc00c0fe26180)->flags  

   
$1 = 0x382032

crash> kmem -g 0x382032
FLAGS: 382032
  PAGE-FLAG   BIT  VALUE
  PG_error  1  002
  PG_dirty  4  010
  PG_lru5  020
  PG_private_2 13  0002000
  PG_fscache   13  0002000
  PG_savepinned 4  010
  PG_double_map13  0002000

But iomap_page->uptodate in page->private suggests all bits are uptodate:

crash> p/x ((struct page *) 0xc00c0fe26180)->private
$2 = 0xc03fc70a19c0

crash> p/x ((struct iomap_page *) 0xc03fc70a19c0)->uptodate 

   
$3 = {0x, 0x0}


It appears (after ~4 hours) that I