Re: [PATCH v2]Btrfs: pwrite blocked when writing from the mmaped buffer of the same page

2011-02-25 Thread Mitch Harder
On Thu, Feb 24, 2011 at 5:14 PM, Mitch Harder
mitch.har...@sabayonlinux.org wrote:
 On Thu, Feb 24, 2011 at 10:32 AM, Mitch Harder
 mitch.har...@sabayonlinux.org wrote:
 On Thu, Feb 24, 2011 at 10:19 AM, Chris Mason chris.ma...@oracle.com wrote:
 Excerpts from Mitch Harder's message of 2011-02-24 11:03:07 -0500:
 On Thu, Feb 24, 2011 at 10:00 AM, Chris Mason chris.ma...@oracle.com 
 wrote:
  Excerpts from Mitch Harder's message of 2011-02-24 10:55:15 -0500:
  2011/2/24 Maria Wikström ma...@ponstudios.se:
   Mon 2011-02-21 at 09:51 +0800, Zhong, Xin wrote:
   The backtrace in your attachment looks like a known bug of 2.6.37 
   which have already been fixed in 2.6.38. I have no idea why latest 
   btrfs still hang in your environment if there's no debug info...
  
  
   Haha, yes that's very hard :)
  
   2.6.38-rc6 and btrfs-unstable behaves the same way. I can close the
   process with ctrl+c and it disappear a few seconds later. There is no
   CPU usage. Reading works because I can start htop and watch svn info
   disappear, but everything writing to btrfs slows down to a crawl. It
   takes about 1 minute to log in. So I had to put the logs on an other
   partition using ext3 to get the output from sysrq+t.
  
 
  I believe I've been experiencing this issue also.  However, my problem
  usually results in a "No space left on device" error rather than a
  lock-up or crash.  But I've bisected my issue to this patch, and my
  "btrfs fi show" and "btrfs fi df" output looks similar to others who've
  posted to this thread, with all my space being allocated but not used.
 
 
  Sorry, which patch did you bisect the problem down to?
 

 The patch at the head of this thread:

 Btrfs: pwrite blocked when writing from the mmaped buffer of the same page

 Hmmm, that patch shouldn't be changing our performance under delalloc
 pressure, and it really shouldn't impact early enospc.


 I've bisected this issue around where this patch went into git, and
 I've also constructed a testing patch that reverts this patch, and
 placed it on top of the current Btrfs git sources (I understand this
 patch addresses a real issue, this was just for testing).

 It could be that this patch just uncovers another problem, but all
 my tests seem to point to this patch triggering this issue.


 I don't believe the previous ftrace I supplied had a large enough scope
 to capture the issue.

 I've expanded my ftrace buffer, and filtered out everything but btrfs*
 function calls (# echo btrfs* >
 /sys/kernel/debug/tracing/set_ftrace_filter).

 In this trace, I see btrfs spending a great deal of time in a while
 loop (while (iov_iter_count(i) > 0) {) in the btrfs_file_aio_write()
 function in file.c without exiting the function.

 I'm going to try to inject some debugging trace_printk() statements to
 find if that portion of code is proceeding normally with my test case.

 I've put my expanded trace up on my local server, but my upload
 bandwidth is pretty sad, and it may take a few minutes to transfer
 even though it's only a 6MB file.

 http://dontpanic.dyndns.org/trace-openmotif-btrfs-v3.gz


Apologies for only hitting Reply instead of Reply-All on my last message.

I've inserted additional trace_printk() calls into the
btrfs_file_aio_write() and btrfs_copy_from_user() functions in file.c
in order to characterize the problem I've been encountering.

I can see btrfs getting stuck in a loop in the while
(iov_iter_count(i) > 0) {} portion of the btrfs_file_aio_write()
function.

The loop is more-or-less following this process (from within the
while (iov_iter_count(i) > 0) {} loop):

(1) Reserve some space with btrfs_delalloc_reserve_space()
(2) Prepare the reserved space with prepare_pages()
(3) Call btrfs_copy_from_user() to copy to the prepared space.
    - From btrfs_copy_from_user()
(4) Try to copy with copied = iov_iter_copy_from_user_atomic()
(5) The above operation results in copied == 0. Break and
    return with a return value of 0 bytes copied.
(6) There is no special handling for copied == 0 in the while
    (iov_iter_count(i) > 0) {} loop, so it loops back around, reserves
    some more space, and tries again.
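
The six steps above can be modelled in user space. The following is only a
toy sketch, not the kernel code: struct sim, copy_atomic() and sim_write()
are hypothetical names that merely stand in for the roles of
btrfs_delalloc_reserve_space(), iov_iter_copy_from_user_atomic() and the
write loop. It assumes one page is reserved per pass and that a zero-length
copy never gives that reservation back, which is what the trace suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SIZE_SIM ((size_t)4096)

/* Toy state; every field is an assumption made for illustration. */
struct sim {
    size_t free_bytes;    /* space that can still be reserved */
    bool src_resident;    /* can the atomic copy read the source page? */
    unsigned iterations;  /* number of passes through the loop */
};

/* Stand-in for the atomic copy: it copies nothing when the source page
 * would need a page fault to become readable. */
static size_t copy_atomic(const struct sim *s, size_t count)
{
    return s->src_resident ? count : 0;
}

/* Returns the bytes written, or -1 once a reservation fails (ENOSPC).
 * handle_zero selects whether a zero-length copy is treated specially. */
long sim_write(struct sim *s, size_t len, bool handle_zero)
{
    size_t done = 0;

    while (done < len) {
        size_t left = len - done;
        size_t count = left < PAGE_SIZE_SIM ? left : PAGE_SIZE_SIM;
        size_t copied;

        s->iterations++;
        if (s->free_bytes < PAGE_SIZE_SIM)
            return -1;                      /* step (1) fails */
        s->free_bytes -= PAGE_SIZE_SIM;     /* steps (1) and (2) */

        copied = copy_atomic(s, count);     /* steps (3) to (5) */
        assert(copied <= count);
        if (copied == 0) {
            if (handle_zero) {
                s->free_bytes += PAGE_SIZE_SIM; /* release the page */
                s->src_resident = true;         /* fault it in, retry */
            }
            continue;                       /* step (6) */
        }
        done += copied;
    }
    return (long)done;
}
```

With handle_zero set to false and src_resident false, each pass consumes one
more page until the reservation fails, which matches the "No space left on
device" symptom. With handle_zero set to true, the second pass succeeds.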

If I look back at how the code was set up before the patch at the head
of this thread was applied (Btrfs: pwrite blocked when writing from
the mmaped buffer of the same page), the btrfs_copy_from_user()
function had some handling for copied == 0 that would change the
scope of the amount to write, and loop back to try the write again.

I attempted to construct a patch that just reverted the handling for
copied == 0 in btrfs_copy_from_user(), however, that just resulted
in my computer locking up when it reached the point where it was
previously beginning to allocate disk space.

So, I apologize for not having a patch to address the issue I'm
seeing, but I hope I've added some insight.
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2]Btrfs: pwrite blocked when writing from the mmaped buffer of the same page

2011-02-25 Thread Chris Mason
Excerpts from Mitch Harder's message of 2011-02-25 13:43:37 -0500:
  The loop is more-or-less following this process (from within the
  while (iov_iter_count(i) > 0) {} loop):
 
  (1) Reserve some space with btrfs_delalloc_reserve_space()
  (2) Prepare the reserved space with prepare_pages()
  (3) Call btrfs_copy_from_user() to copy to the prepared space.
  - From btrfs_copy_from_user()
  (4) Try to copy with copied = iov_iter_copy_from_user_atomic()
  (5) The above operation results with copied == 0. Break and
  return with a return value of 0 bytes copied.
  (6) There is no special handling for copied == 0 in the while
  (iov_iter_count(i) > 0) {} loop, so it loops back around, reserves
  some more space, and tries again.
 
  If I look back at how the code was set up before the patch at the head
  of this thread was applied (Btrfs: pwrite blocked when writing from
  the mmaped buffer of the same page), the btrfs_copy_from_user()
  function had some handling for copied == 0 that would change the
  scope of the amount to write, and loop back to try the write again.
 
  I attempted to construct a patch that just reverted the handling for
  copied == 0 in btrfs_copy_from_user(), however, that just resulted
  in my computer locking up when it reached the point where it was
  previously beginning to allocate disk space.
 
  So, I apologize for not having a patch to address the issue I'm
  seeing, but I hope I've added some insight.
 
 
 Some clarification on my previous message...
 
 After looking at my ftrace log more closely, I can see where Btrfs is
 trying to release the allocated pages.  However, the calculation for
 the number of dirty_pages is equal to 1 when copied == 0.
 
 So I'm seeing at least two problems:
 (1)  It keeps looping when copied == 0.
 (2)  One dirty page is not being released on every loop even though
 copied == 0 (at least this problem keeps it from being an infinite
 loop by eventually exhausting reservable space on the disk).

Very nice, I think you're exactly right.  I should be able to reproduce this
now, thanks!

-chris
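
The second problem Mitch lists (a page count of 1 when copied == 0) can be
illustrated with a round-up calculation. The thread does not show the
kernel's actual formula, so the sketch below is an assumption: it supposes
the count is rounded up from copied plus the offset into the first page,
which would yield 1 for any unaligned position even when nothing was copied.
The names dirty_pages_naive() and dirty_pages_guarded() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT_SIM 12
#define PAGE_SIZE_SIM ((size_t)1 << PAGE_SHIFT_SIM)

/* ASSUMED formula: round up (offset within the first page + bytes copied)
 * to whole pages. Not taken from the kernel source. */
size_t dirty_pages_naive(size_t pos, size_t copied)
{
    size_t offset = pos & (PAGE_SIZE_SIM - 1);

    return (copied + offset + PAGE_SIZE_SIM - 1) >> PAGE_SHIFT_SIM;
}

/* Same calculation, but a zero-length copy dirties no pages at all. */
size_t dirty_pages_guarded(size_t pos, size_t copied)
{
    size_t n;

    if (copied == 0)
        return 0;
    n = dirty_pages_naive(pos, copied);
    assert(n >= 1);   /* a non-empty copy touches at least one page */
    return n;
}
```

Under that assumption, dirty_pages_naive(100, 0) evaluates to 1, while
dirty_pages_guarded(100, 0) evaluates to 0, so only the guarded variant lets
the caller release the whole reservation after a failed copy.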


Re: [PATCH v2]Btrfs: pwrite blocked when writing from the mmaped buffer of the same page

2011-02-24 Thread Mitch Harder
2011/2/24 Maria Wikström ma...@ponstudios.se:
 Mon 2011-02-21 at 09:51 +0800, Zhong, Xin wrote:
 The backtrace in your attachment looks like a known bug of 2.6.37 which have 
 already been fixed in 2.6.38. I have no idea why latest btrfs still hang in 
 your environment if there's no debug info...


 Haha, yes that's very hard :)

 2.6.38-rc6 and btrfs-unstable behaves the same way. I can close the
 process with ctrl+c and it disappear a few seconds later. There is no
 CPU usage. Reading works because I can start htop and watch svn info
 disappear, but everything writing to btrfs slows down to a crawl. It
 takes about 1 minute to log in. So I had to put the logs on an other
 partition using ext3 to get the output from sysrq+t.


I believe I've been experiencing this issue also.  However, my problem
usually results in a "No space left on device" error rather than a
lock-up or crash.  But I've bisected my issue to this patch, and my
"btrfs fi show" and "btrfs fi df" output looks similar to others who've
posted to this thread, with all my space being allocated but not used.

I've been playing around with ftrace to try to get some information on
the issue.  Since I'm getting a soft error, I can use a command like
"echo 1 > tracing_on; emerge -1av openmotif; echo 0 > tracing_on" to
end the trace as soon as the build fails.
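
The shell redirects in that command amount to writing short strings into the
tracing_on control file. A minimal C sketch of the same sequence follows;
write_control() and trace_around() are hypothetical helper names, and the
tracing directory is passed in rather than assumed.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Write a short string into a control file, like `echo value > path`. */
static int write_control(const char *path, const char *value)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (!f)
        return -1;
    ok = fputs(value, f) >= 0;
    ok = (fclose(f) == 0) && ok;
    return ok ? 0 : -1;
}

/* Switch tracing on, run cmd through the shell, switch tracing off.
 * Returns the status from system(), or -1 if the control file could
 * not be written. */
int trace_around(const char *tracing_dir, const char *cmd)
{
    char path[4096];
    int status;
    int n = snprintf(path, sizeof(path), "%s/tracing_on", tracing_dir);

    assert(n > 0);
    if ((size_t)n >= sizeof(path))
        return -1;
    if (write_control(path, "1\n") != 0)
        return -1;
    status = system(cmd);
    if (write_control(path, "0\n") != 0)
        return -1;
    return status;
}
```

Run as root, a call such as trace_around("/sys/kernel/debug/tracing",
"emerge -1av openmotif") would correspond to the command quoted above,
assuming debugfs is mounted at the path used elsewhere in this thread.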

The traces are probably too large for the M/L (~275kb compressed), so
I've put them up on my local server in case anybody can find them
useful:

http://dontpanic.dyndns.org/emerge-openmotif-ftrace.gz

I'm still only capturing the tail end of the problem, but maybe
someone will find these insightful.

Let me know if you want me to increase the ftrace buffer size or
insert some trace_printk debugging statements.


Re: [PATCH v2]Btrfs: pwrite blocked when writing from the mmaped buffer of the same page

2011-02-24 Thread Chris Mason
Excerpts from Johannes Hirte's message of 2011-02-23 18:02:44 -0500:
 On Wednesday 23 February 2011 22:56:27 Chris Mason wrote:
  Excerpts from Zhong, Xin's message of 2011-02-23 02:27:05 -0500:
   In the dmesg of rc4, I can see svn hang in shrink_delalloc and there are
   two flush-btrfs threads hung there too.
   
   Josef, it seems you are the expert in this area. Could you take a quick
   look? Thanks!
  
  Ok, it does look like the flush-btrfs threads are busy trying to flush
  things.
  
  Could you please do a btrfs-show and a btrfs fi df /xxx (where xxx is
  your mount point) and send the results here?
  
  -chris
 
 failed to read /dev/sr0
 Label: none  uuid: 00eab15f-c4cf-4403-a529-9bc11fa50167
     Total devices 1 FS bytes used 47.72GB
     devid    1 size 65.69GB used 65.69GB path /dev/sda2
 
 Label: none  uuid: c6f4e6e6-c4ba-4394-9e9c-bbc3d0b32793
     Total devices 1 FS bytes used 9.48GB
     devid    1 size 20.01GB used 20.01GB path /dev/sda1
 
 Btrfs v0.19-35-g1b444cd-dirty
 
 and btrfs fi df on
 
 /
 
 Data: total=15.49GB, used=8.35GB
 System, DUP: total=8.00MB, used=12.00KB
 System: total=4.00MB, used=0.00
 Metadata, DUP: total=2.25GB, used=1.13GB
 
 /home
 
 Data: total=63.42GB, used=47.47GB
 System: total=4.00MB, used=16.00KB
 Metadata: total=2.27GB, used=251.34MB
 
 The bug is reproducible on both filesystems.

Ok, you've got a good amount of metadata space free, but it is
frantically trying to make room for the delayed allocation.  Let me see
if I can recreate this setup here.

-chris
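
The numbers Johannes posted can be cross-checked with a little arithmetic.
The sketch below sums the "total" column of each btrfs fi df listing and
compares it with the "used" figure that btrfs-show reports for the device.
It rests on three assumptions that the thread does not state: DUP chunks
occupy twice their listed size on the device, 1GB equals 1024MB, and the
20.01GB device backs / while the 65.69GB device backs /home. The struct and
function names are hypothetical.

```c
#include <assert.h>

/* One line of "btrfs fi df" output: chunk total and number of copies. */
struct chunk_total {
    double total_gb;
    int copies;       /* 2 for DUP, 1 otherwise (assumption) */
};

/* Values copied from the listings above; MB figures converted to GB. */
const struct chunk_total root_fs[] = {
    { 15.49, 1 },          /* Data */
    { 8.0 / 1024.0, 2 },   /* System, DUP */
    { 4.0 / 1024.0, 1 },   /* System */
    { 2.25, 2 },           /* Metadata, DUP */
};

const struct chunk_total home_fs[] = {
    { 63.42, 1 },          /* Data */
    { 4.0 / 1024.0, 1 },   /* System */
    { 2.27, 1 },           /* Metadata */
};

/* Raw device space taken up by the listed chunks. */
double allocated_gb(const struct chunk_total *rows, int n)
{
    double sum = 0.0;
    int i;

    for (i = 0; i < n; i++) {
        assert(rows[i].copies >= 1);
        sum += rows[i].total_gb * rows[i].copies;
    }
    return sum;
}
```

Under those assumptions the sums come to roughly 20.01GB for / and 65.69GB
for /home, the same as the devices' "used" figures. That is consistent with
every byte of both devices already being allocated to chunks, even though
the data chunks themselves are only partly filled (8.35GB of 15.49GB and
47.47GB of 63.42GB).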


Re: [PATCH v2]Btrfs: pwrite blocked when writing from the mmaped buffer of the same page

2011-02-24 Thread Mitch Harder
On Thu, Feb 24, 2011 at 10:00 AM, Chris Mason chris.ma...@oracle.com wrote:
 Excerpts from Mitch Harder's message of 2011-02-24 10:55:15 -0500:
 2011/2/24 Maria Wikström ma...@ponstudios.se:
  Mon 2011-02-21 at 09:51 +0800, Zhong, Xin wrote:
  The backtrace in your attachment looks like a known bug of 2.6.37 which 
  have already been fixed in 2.6.38. I have no idea why latest btrfs still 
  hang in your environment if there's no debug info...
 
 
  Haha, yes that's very hard :)
 
  2.6.38-rc6 and btrfs-unstable behaves the same way. I can close the
  process with ctrl+c and it disappear a few seconds later. There is no
  CPU usage. Reading works because I can start htop and watch svn info
  disappear, but everything writing to btrfs slows down to a crawl. It
  takes about 1 minute to log in. So I had to put the logs on an other
  partition using ext3 to get the output from sysrq+t.
 

 I believe I've been experiencing this issue also.  However, my problem
 usually results in a "No space left on device" error rather than a
 lock-up or crash.  But I've bisected my issue to this patch, and my
 "btrfs fi show" and "btrfs fi df" output looks similar to others who've
 posted to this thread, with all my space being allocated but not used.


 Sorry, which patch did you bisect the problem down to?


The patch at the head of this thread:

Btrfs: pwrite blocked when writing from the mmaped buffer of the same page


Re: [PATCH v2]Btrfs: pwrite blocked when writing from the mmaped buffer of the same page

2011-02-24 Thread Chris Mason
Excerpts from Mitch Harder's message of 2011-02-24 11:03:07 -0500:
 On Thu, Feb 24, 2011 at 10:00 AM, Chris Mason chris.ma...@oracle.com wrote:
  Excerpts from Mitch Harder's message of 2011-02-24 10:55:15 -0500:
  2011/2/24 Maria Wikström ma...@ponstudios.se:
   Mon 2011-02-21 at 09:51 +0800, Zhong, Xin wrote:
   The backtrace in your attachment looks like a known bug of 2.6.37 which 
   have already been fixed in 2.6.38. I have no idea why latest btrfs 
   still hang in your environment if there's no debug info...
  
  
   Haha, yes that's very hard :)
  
   2.6.38-rc6 and btrfs-unstable behaves the same way. I can close the
   process with ctrl+c and it disappear a few seconds later. There is no
   CPU usage. Reading works because I can start htop and watch svn info
   disappear, but everything writing to btrfs slows down to a crawl. It
   takes about 1 minute to log in. So I had to put the logs on an other
   partition using ext3 to get the output from sysrq+t.
  
 
  I believe I've been experiencing this issue also.  However, my problem
  usually results in a "No space left on device" error rather than a
  lock-up or crash.  But I've bisected my issue to this patch, and my
  "btrfs fi show" and "btrfs fi df" output looks similar to others who've
  posted to this thread, with all my space being allocated but not used.
 
 
  Sorry, which patch did you bisect the problem down to?
 
 
 The patch at the head of this thread:
 
 Btrfs: pwrite blocked when writing from the mmaped buffer of the same page

Hmmm, that patch shouldn't be changing our performance under delalloc
pressure, and it really shouldn't impact early enospc.

-chris


Re: [PATCH v2]Btrfs: pwrite blocked when writing from the mmaped buffer of the same page

2011-02-24 Thread Mitch Harder
On Thu, Feb 24, 2011 at 10:19 AM, Chris Mason chris.ma...@oracle.com wrote:
 Excerpts from Mitch Harder's message of 2011-02-24 11:03:07 -0500:
 On Thu, Feb 24, 2011 at 10:00 AM, Chris Mason chris.ma...@oracle.com wrote:
  Excerpts from Mitch Harder's message of 2011-02-24 10:55:15 -0500:
  2011/2/24 Maria Wikström ma...@ponstudios.se:
   Mon 2011-02-21 at 09:51 +0800, Zhong, Xin wrote:
   The backtrace in your attachment looks like a known bug of 2.6.37 
   which have already been fixed in 2.6.38. I have no idea why latest 
   btrfs still hang in your environment if there's no debug info...
  
  
   Haha, yes that's very hard :)
  
   2.6.38-rc6 and btrfs-unstable behaves the same way. I can close the
   process with ctrl+c and it disappear a few seconds later. There is no
   CPU usage. Reading works because I can start htop and watch svn info
   disappear, but everything writing to btrfs slows down to a crawl. It
   takes about 1 minute to log in. So I had to put the logs on an other
   partition using ext3 to get the output from sysrq+t.
  
 
  I believe I've been experiencing this issue also.  However, my problem
  usually results in a "No space left on device" error rather than a
  lock-up or crash.  But I've bisected my issue to this patch, and my
  "btrfs fi show" and "btrfs fi df" output looks similar to others who've
  posted to this thread, with all my space being allocated but not used.
 
 
  Sorry, which patch did you bisect the problem down to?
 

 The patch at the head of this thread:

 Btrfs: pwrite blocked when writing from the mmaped buffer of the same page

 Hmmm, that patch shouldn't be changing our performance under delalloc
 pressure, and it really shouldn't impact early enospc.


I've bisected this issue around where this patch went into git, and
I've also constructed a testing patch that reverts this patch, and
placed it on top of the current Btrfs git sources (I understand this
patch addresses a real issue, this was just for testing).

It could be that this patch just uncovers another problem, but all
my tests seem to point to this patch triggering this issue.


Re: [PATCH v2]Btrfs: pwrite blocked when writing from the mmaped buffer of the same page

2011-02-24 Thread Piotr Szymaniak
On Mon, Feb 21, 2011 at 09:51:11AM +0800, Zhong, Xin wrote:
 The backtrace in your attachment looks like a known bug of 2.6.37 which have 
 already been fixed in 2.6.38. I have no idea why latest btrfs still hang in 
 your environment if there's no debug info...

Hi list.

I've been watching this list for a while because it seems that I'm also
affected by this bug. In the archives I found someone with a Gentoo system
where `svn info' freezes (that's why I joined). Well, it seems that I have
the same issue here.

Attached is sysrq+t output from the latest _rc kernel (the first part from
when `svn info' froze on libgcrypt, and the second after I hit ctrl+c on
that emerge).

It seems that my backtrace is small compared to Maria's. Am I missing some
kernel features needed to get larger backtraces?

Piotr Szymaniak.



 -Original Message-
 From: Maria Wikström [mailto:ma...@ponstudios.se] 
 Sent: Friday, February 18, 2011 7:32 PM
 To: Zhong, Xin
 Cc: Johannes Hirte; linux-btrfs@vger.kernel.org
 Subject: RE: [PATCH v2]Btrfs: pwrite blocked when writing from the mmaped 
 buffer of the same page
 
 
 Seems like my reply got eaten by the lists spam filter, so I resend with 
 attachment compressed. Should have thought of that :p
 
 
 Fri 2011-02-11 at 12:39 +0800, Zhong, Xin wrote:
  Hi,
  
  Could you paste the output of sysrq+t here? Thanks!
 
 Yes, it's in the attachment. I tried latest btrfs from git (last commit Mon 
 Feb 14 00:45:29 2011 +) but it hang so bad that I couldn't get the output 
 from sysrq+t to hit the disk. So the output is from vanilla
 2.6.37 
 
 // Maria
 
  -Original Message-
  From: Johannes Hirte [mailto:johannes.hi...@fem.tu-ilmenau.de]
  Sent: Wednesday, February 02, 2011 7:35 AM
  To: Zhong, Xin
  Cc: Maria Wikström; linux-btrfs@vger.kernel.org
  Subject: Re: [PATCH v2]Btrfs: pwrite blocked when writing from the 
  mmaped buffer of the same page
  
  On Friday 28 January 2011 04:53:24 Zhong, Xin wrote:
   Could you describe the steps to recreate it?
   It will be a great help for me to look further. Thanks!
  
  It's a little strange. I have two systems with btrfs, both
  Gentoo-based. One is affected by this bug, the other is not. On the
  affected system it is enough to do a 'emerge dev-libs/libgcrypt' that
  should normally compile and install libgcrypt. The emerge command is
  part of portage, the package management of Gentoo.
  The strace output looks similar to the one from Maria:
  
 

-- 
The football team here is no good, but I'm learning a bit of soccer.
The coach says that soccer is football for intelligent people, and
football is football for morons.
  -- Stephen King, The Dead Zone


vanilla-2.6.38-rc6_sysrq-t.bz2
Description: BZip2 compressed data




RE: [PATCH v2]Btrfs: pwrite blocked when writing from the mmaped buffer of the same page

2011-02-20 Thread Zhong, Xin
The backtrace in your attachment looks like a known bug in 2.6.37 which has
already been fixed in 2.6.38. I have no idea why the latest btrfs still hangs
in your environment if there's no debug info...

-Original Message-
From: Maria Wikström [mailto:ma...@ponstudios.se] 
Sent: Friday, February 18, 2011 7:32 PM
To: Zhong, Xin
Cc: Johannes Hirte; linux-btrfs@vger.kernel.org
Subject: RE: [PATCH v2]Btrfs: pwrite blocked when writing from the mmaped 
buffer of the same page


Seems like my reply got eaten by the lists spam filter, so I resend with 
attachment compressed. Should have thought of that :p


Fri 2011-02-11 at 12:39 +0800, Zhong, Xin wrote:
 Hi,
 
 Could you paste the output of sysrq+t here? Thanks!

Yes, it's in the attachment. I tried the latest btrfs from git (last commit Mon Feb
14 00:45:29 2011 +) but it hung so badly that I couldn't get the output from
sysrq+t to hit the disk. So the output is from vanilla 2.6.37.

// Maria

 -Original Message-
 From: Johannes Hirte [mailto:johannes.hi...@fem.tu-ilmenau.de]
 Sent: Wednesday, February 02, 2011 7:35 AM
 To: Zhong, Xin
 Cc: Maria Wikström; linux-btrfs@vger.kernel.org
 Subject: Re: [PATCH v2]Btrfs: pwrite blocked when writing from the 
 mmaped buffer of the same page
 
 On Friday 28 January 2011 04:53:24 Zhong, Xin wrote:
  Could you describe the steps to recreate it?
  It will be a great help for me to look further. Thanks!
 
 It's a little strange. I have two systems with btrfs, both
 Gentoo-based. One is affected by this bug, the other is not. On the
 affected system it is enough to do a 'emerge dev-libs/libgcrypt' that
 should normally compile and install libgcrypt. The emerge command is
 part of portage, the package management of Gentoo.
 The strace output looks similar to the one from Maria:
 


RE: [PATCH v2]Btrfs: pwrite blocked when writing from the mmaped buffer of the same page

2011-02-10 Thread Zhong, Xin
Hi,

Could you paste the output of sysrq+t here? Thanks!

-Original Message-
From: Johannes Hirte [mailto:johannes.hi...@fem.tu-ilmenau.de] 
Sent: Wednesday, February 02, 2011 7:35 AM
To: Zhong, Xin
Cc: Maria Wikström; linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v2]Btrfs: pwrite blocked when writing from the mmaped 
buffer of the same page

On Friday 28 January 2011 04:53:24 Zhong, Xin wrote:
 Could you describe the steps to recreate it?
 It will be a great help for me to look further. Thanks!

It's a little strange. I have two systems with btrfs, both Gentoo-based. One is
affected by this bug, the other is not. On the affected system it is enough to
run 'emerge dev-libs/libgcrypt', which should normally compile and install
libgcrypt. The emerge command is part of portage, the package management of
Gentoo.
The strace output looks similar to the one from Maria:

open("/home/tmp/portage/dev-libs/libgcrypt-1.4.6/.ipc_in", O_RDONLY|
O_NONBLOCK|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFIFO|0770, st_size=0, ...}) = 0
ioctl(3, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS,
0xbff5f678) = -1 EINVAL (Invalid argument)
open("/dev/ptmx", O_RDWR)   = 5
ioctl(5, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS,
{B38400 opost isig icanon echo ...}) = 0
ioctl(5, TIOCGPTN, [2]) = 0
stat64("/dev/pts/2", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
getuid32()  = 0
ioctl(5, TIOCSPTLCK, [0])   = 0
ioctl(5, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS,
{B38400 opost isig icanon echo ...}) = 0
ioctl(5, TIOCGPTN, [2]) = 0
stat64("/dev/pts/2", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
open("/dev/pts/2", O_RDWR|O_NOCTTY) = 6
ioctl(6, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS,
{B38400 opost isig icanon echo ...}) = 0
ioctl(6, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS,
{B38400 opost isig icanon echo ...}) = 0
ioctl(6, SNDCTL_TMR_START or SNDRV_TIMER_IOCTL_TREAD or TCSETS, {B38400 -opost
isig icanon echo ...}) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS,
{B38400 opost isig icanon echo ...}) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS,
{B38400 opost isig icanon echo ...}) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS,
{B38400 opost isig icanon echo ...}) = 0
stat64("/root/.terminfo", 0xbff5e790)   = -1 ENOENT (No such file or directory)
stat64("/etc/terminfo", {st_mode=S_IFDIR|0755, st_size=14, ...}) = 0
access("/etc/terminfo/x/xterm", R_OK)   = 0
open("/etc/terminfo/x/xterm", O_RDONLY|O_LARGEFILE) = 7
read(7, "\32\0010\0\0\17\0\235\1l\5xterm|xterm terminal"..., 4097) = 3258
close(7)= 0
ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS,
{B38400 opost isig icanon echo ...}) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS,
{B38400 opost isig icanon echo ...}) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS,
{B38400 opost isig icanon echo ...}) = 0
ioctl(1, TIOCGWINSZ, {ws_row=40, ws_col=207, ws_xpixel=0, ws_ypixel=0}) = 0
access("/usr/local/sbin/stty", X_OK)= -1 ENOENT (No such file or directory)
access("/usr/local/bin/stty", X_OK) = -1 ENOENT (No such file or directory)
access("/usr/sbin/stty", X_OK)  = -1 ENOENT (No such file or directory)
access("/usr/bin/stty", X_OK)   = -1 ENOENT (No such file or directory)
access("/sbin/stty", X_OK)  = -1 ENOENT (No such file or directory)
access("/bin/stty", X_OK)   = 0
stat64("/bin/stty", {st_mode=S_IFREG|0755, st_size=58836, ...}) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
child_tidptr=0xb753d728) = 2752
waitpid(2752, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0) = 2752
--- SIGCHLD (Child exited) @ 0 (0) ---
fcntl64(5, F_GETFL) = 0x2 (flags O_RDWR)
fcntl64(5, F_SETFL, O_RDWR|O_NONBLOCK)  = 0
fstat64(5, {st_mode=S_IFCHR|0666, st_rdev=makedev(5, 2), ...}) = 0
ioctl(5, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS,
{B38400 -opost isig icanon echo ...}) = 0
open("/home/tmp/portage/dev-libs/libgcrypt-1.4.6/temp/build.log", O_WRONLY|
O_CREAT|O_APPEND|O_LARGEFILE, 0666) = 7
fstat64(7, {st_mode=S_IFREG|0660, st_size=480, ...}) = 0
_llseek(7, 0, [480], SEEK_END)  = 0
ioctl(7, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS,
0xbff5fad8) = -1 ENOTTY (Inappropriate ioctl for device)
fstat64(7, {st_mode=S_IFREG|0660, st_size=480, ...}) = 0
_llseek(7, 0, [480], SEEK_CUR)  = 0
stat64("/home/tmp/portage/dev-libs/libgcrypt-1.4.6/temp/build.log",
{st_mode=S_IFREG|0660, st_size=480, ...}) = 0
dup(1)  = 8
fstat64(8, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
ioctl(8, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE

Re: [PATCH v2]Btrfs: pwrite blocked when writing from the mmaped buffer of the same page

2011-01-28 Thread Rui Miguel Silva

On Friday, January 28, 2011 04:47:21 pm Maria Wikström wrote:
 Fri 2011-01-28 at 03:54 +0100, Johannes Hirte wrote:
  On Friday 28 January 2011 02:26:43 Zhong, Xin wrote:
   Please try the fix in below link:
   http://www.spinics.net/lists/linux-btrfs/msg08051.html
   
   Thanks!
  
  This doesn't fix it for me. At least there is a difference. Whereas the
  svn process started consuming 100% CPU without any further interaction
  before, the system just hang now. The svn process starts eating the CPU
  when I cancel the emerge via ctrl-c. Additional I see a flush-btrfs task
  now consuming CPU time.
  
  regards,
  
Johannes
  
 
 The patch makes the process exit cleanly, but it complains that there is no
 space left. There should be a few GB, but there are only a few MB!? I deleted
 5GB and checked that the space was usable. I tried again, but this time it
 almost hung the system; I could blink the LEDs and move the mouse but nothing
 more.
 I booted 2.6.36.1 and used a snapshot as root; emerge libgcrypt compiled
 and installed fine. I booted back into 2.6.37 and tried again, and the system
 hung again.
 
 // Maria
 

Hi Maria,
I had something similar with vanilla 2.6.37 just from running:
cscope -R -b -q

It always hung the process completely; the only solution was to reboot.
I just applied the btrfs patches that arrived in the main tree between
2.6.37 and 2.6.38-rc1, and they seem to fix the issue.

I did not have the time to find out which patch fixed the problem.

I hope this helps.

Cheers,
  //Rui


RE: [PATCH v2]Btrfs: pwrite blocked when writing from the mmaped buffer of the same page

2011-01-27 Thread Zhong, Xin
Please try the fix at the link below:
http://www.spinics.net/lists/linux-btrfs/msg08051.html

Thanks!

-Original Message-
From: linux-btrfs-ow...@vger.kernel.org 
[mailto:linux-btrfs-ow...@vger.kernel.org] On Behalf Of Maria Wikström
Sent: Friday, January 28, 2011 6:12 AM
To: johannes.hi...@fem.tu-ilmenau.de; Zhong, Xin
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v2]Btrfs: pwrite blocked when writing from the mmaped 
buffer of the same page

Thu 2011-01-27 at 14:09 +0100, Johannes Hirte wrote:
 On Thursday 09 December 2010 10:30:14 Zhong, Xin wrote:
  This problem was found in MeeGo testing:
  http://bugs.meego.com/show_bug.cgi?id=6672
  A file in btrfs is mmaped and the mmaped buffer is passed to pwrite to
  write to the same page of the same file. In btrfs_file_aio_write(), the
  pages are locked by prepare_pages(). So when btrfs_copy_from_user() is
  called, a page fault happens and the same page needs to be locked again in
  filemap_fault(). The fix is to move iov_iter_fault_in_readable() before
  prepare_pages() so that the page fault happens before the pages are locked,
  and also to disable page faults in the critical region in
  btrfs_copy_from_user().
  
  Reviewed-by: Yan, Zhengzheng.z@intel.com
  Signed-off-by: Zhong, Xin xin.zh...@intel.com
  ---
   fs/btrfs/file.c |   92
   1 files changed, 60 insertions(+), 32 deletions(-)
  
  diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
  index c1faded..66836d8 100644
  --- a/fs/btrfs/file.c
  +++ b/fs/btrfs/file.c
  @@ -48,30 +48,34 @@ static noinline int btrfs_copy_from_user(loff_t pos,
  int num_pages, struct page **prepared_pages,
   struct iov_iter *i)
   {
  -   size_t copied;
  +   size_t copied = 0;
  int pg = 0;
  int offset = pos & (PAGE_CACHE_SIZE - 1);
  +   int total_copied = 0;
  
  while (write_bytes > 0) {
  size_t count = min_t(size_t,
   PAGE_CACHE_SIZE - offset, write_bytes);
  struct page *page = prepared_pages[pg];
  -again:
  -   if (unlikely(iov_iter_fault_in_readable(i, count)))
  -   return -EFAULT;
  -
  -   /* Copy data from userspace to the current page */
  -   copied = iov_iter_copy_from_user(page, i, offset, count);
  +   /*
  +* Copy data from userspace to the current page
  +*
  +* Disable pagefault to avoid recursive lock since
  +* the pages are already locked
  +*/
  +   pagefault_disable();
  +   copied = iov_iter_copy_from_user_atomic(page, i, offset, count);
  +   pagefault_enable();
  
  /* Flush processor's dcache for this page */
  flush_dcache_page(page);
  iov_iter_advance(i, copied);
  write_bytes -= copied;
  +   total_copied += copied;
  
  +   /* Return to btrfs_file_aio_write to fault page */
  if (unlikely(copied == 0)) {
  -   count = min_t(size_t, PAGE_CACHE_SIZE - offset,
  - iov_iter_single_seg_count(i));
  -   goto again;
  +   break;
  }
  
  if (unlikely(copied  PAGE_CACHE_SIZE - offset)) {
  @@ -81,7 +85,7 @@ again:
  offset = 0;
  }
  }
  -   return 0;
  +   return total_copied;
   }
  
   /*
  @@ -854,6 +858,8 @@ static ssize_t btrfs_file_aio_write(struct kiocb *iocb,
  unsigned long last_index;
  int will_write;
  int buffered = 0;
  +   int copied = 0;
  +   int dirty_pages = 0;
  
  will_write = ((file-f_flags  O_DSYNC) || IS_SYNC(inode) ||
(file-f_flags  O_DIRECT));
  @@ -970,7 +976,17 @@ static ssize_t btrfs_file_aio_write(struct kiocb
  *iocb, WARN_ON(num_pages  nrptrs);
  memset(pages, 0, sizeof(struct page *) * nrptrs);
  
  -   ret = btrfs_delalloc_reserve_space(inode, write_bytes);
  +   /*
  +* Fault pages before locking them in prepare_pages
  +* to avoid recursive lock
  +*/
  +   if (unlikely(iov_iter_fault_in_readable(i, write_bytes))) {
  +   ret = -EFAULT;
  +   goto out;
  +   }
  +
  +   ret = btrfs_delalloc_reserve_space(inode,
  +   num_pages  PAGE_CACHE_SHIFT);
  if (ret)
  goto out;
  
  @@ -978,37 +994,49 @@ static ssize_t btrfs_file_aio_write(struct kiocb
  *iocb, pos, first_index, last_index,
  write_bytes);
  if (ret) {
  -   btrfs_delalloc_release_space(inode, write_bytes);
  +   btrfs_delalloc_release_space(inode,
  +   num_pages  PAGE_CACHE_SHIFT);
  goto out;
  }
  
  -   ret = btrfs_copy_from_user(pos

Re: [PATCH v2]Btrfs: pwrite blocked when writing from the mmaped buffer of the same page

2011-01-27 Thread Johannes Hirte
On Friday 28 January 2011 02:26:43 Zhong, Xin wrote:
 Please try the fix in below link:
 http://www.spinics.net/lists/linux-btrfs/msg08051.html
 
 Thanks!

This doesn't fix it for me, but at least there is a difference. Whereas the svn
process previously started consuming 100% CPU without any further interaction,
the system now just hangs. The svn process starts eating the CPU when I cancel
the emerge via ctrl-c. Additionally, I now see a flush-btrfs task consuming CPU
time.

regards,
  Johannes


RE: [PATCH v2]Btrfs: pwrite blocked when writing from the mmaped buffer of the same page

2011-01-27 Thread Zhong, Xin
Could you describe the steps to reproduce it?
That would be a great help for me in looking into this further. Thanks!

-----Original Message-----
From: Johannes Hirte [mailto:johannes.hi...@fem.tu-ilmenau.de] 
Sent: Friday, January 28, 2011 10:55 AM
To: Zhong, Xin
Cc: Maria Wikström; linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v2]Btrfs: pwrite blocked when writing from the mmaped 
buffer of the same page

On Friday 28 January 2011 02:26:43 Zhong, Xin wrote:
 Please try the fix in below link:
 http://www.spinics.net/lists/linux-btrfs/msg08051.html
 
 Thanks!

This doesn't fix it for me, but at least there is a difference. Whereas the svn
process previously started consuming 100% CPU without any further interaction,
the system now just hangs. The svn process starts eating the CPU when I cancel
the emerge via ctrl-c. Additionally, I now see a flush-btrfs task consuming CPU
time.

regards,
  Johannes


[PATCH v2]Btrfs: pwrite blocked when writing from the mmaped buffer of the same page

2010-12-09 Thread Zhong, Xin
This problem was found in MeeGo testing:
http://bugs.meego.com/show_bug.cgi?id=6672
A file in btrfs is mmaped and the mmaped buffer is passed to pwrite to
write to the same page of the same file. In btrfs_file_aio_write(), the
pages are locked by prepare_pages(). So when btrfs_copy_from_user() is
called, a page fault happens and the same page needs to be locked again
in filemap_fault(). The fix is to move iov_iter_fault_in_readable()
before prepare_pages() so that the page fault happens before the pages
are locked, and to disable page faults in the critical region in
btrfs_copy_from_user().
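
The access pattern described above can be sketched in user space roughly
as follows. This is an illustration only, not the MeeGo test case and not
part of the patch: the helper name pwrite_from_own_mapping(), the one-page
file size and the write at offset 0 are all assumptions chosen to keep the
example small.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Illustrative only: pwrite_from_own_mapping() is a hypothetical helper,
 * not code from the patch or from the bug report.
 *
 * Creates a one-page file at `path`, maps it MAP_SHARED, and passes the
 * mapping to pwrite() so that the source buffer and the destination are
 * the same page of the same file. Returns what pwrite() returned, or -1
 * if the setup failed.
 */
static ssize_t pwrite_from_own_mapping(const char *path)
{
	long page_size = sysconf(_SC_PAGESIZE);
	ssize_t written = -1;
	char *map;
	int fd;

	assert(path != NULL);
	if (page_size <= 0)
		return -1;

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	if (ftruncate(fd, page_size) == 0) {
		map = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
			   MAP_SHARED, fd, 0);
		if (map != MAP_FAILED) {
			/*
			 * The mapping has not been touched yet, so the page
			 * normally has to be faulted in while the kernel
			 * reads from `map` on behalf of pwrite().
			 */
			written = pwrite(fd, map, (size_t)page_size, 0);
			munmap(map, (size_t)page_size);
		}
	}

	close(fd);
	return written;
}
```

Called with a path on the filesystem under test, the helper would be
expected to return the page size where the write path handles this case,
and to block inside pwrite() on a kernel affected by the problem described
here.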

Reviewed-by: Yan, Zhengzheng.z@intel.com
Signed-off-by: Zhong, Xin xin.zh...@intel.com
---
 fs/btrfs/file.c |   92 ---
 1 files changed, 60 insertions(+), 32 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index c1faded..66836d8 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -48,30 +48,34 @@ static noinline int btrfs_copy_from_user(loff_t pos, int num_pages,
 					 struct page **prepared_pages,
 					 struct iov_iter *i)
 {
-	size_t copied;
+	size_t copied = 0;
 	int pg = 0;
 	int offset = pos & (PAGE_CACHE_SIZE - 1);
+	int total_copied = 0;
 
 	while (write_bytes > 0) {
 		size_t count = min_t(size_t,
 				     PAGE_CACHE_SIZE - offset, write_bytes);
 		struct page *page = prepared_pages[pg];
-again:
-		if (unlikely(iov_iter_fault_in_readable(i, count)))
-			return -EFAULT;
-
-		/* Copy data from userspace to the current page */
-		copied = iov_iter_copy_from_user(page, i, offset, count);
+		/*
+		 * Copy data from userspace to the current page
+		 *
+		 * Disable pagefault to avoid recursive lock since
+		 * the pages are already locked
+		 */
+		pagefault_disable();
+		copied = iov_iter_copy_from_user_atomic(page, i, offset, count);
+		pagefault_enable();
 
 		/* Flush processor's dcache for this page */
 		flush_dcache_page(page);
 		iov_iter_advance(i, copied);
 		write_bytes -= copied;
+		total_copied += copied;
 
+		/* Return to btrfs_file_aio_write to fault page */
 		if (unlikely(copied == 0)) {
-			count = min_t(size_t, PAGE_CACHE_SIZE - offset,
-				      iov_iter_single_seg_count(i));
-			goto again;
+			break;
 		}
 
 		if (unlikely(copied < PAGE_CACHE_SIZE - offset)) {
@@ -81,7 +85,7 @@ again:
 			offset = 0;
 		}
 	}
-	return 0;
+	return total_copied;
 }
 
 /*
@@ -854,6 +858,8 @@ static ssize_t btrfs_file_aio_write(struct kiocb *iocb,
 	unsigned long last_index;
 	int will_write;
 	int buffered = 0;
+	int copied = 0;
+	int dirty_pages = 0;
 
 	will_write = ((file->f_flags & O_DSYNC) || IS_SYNC(inode) ||
 		      (file->f_flags & O_DIRECT));
@@ -970,7 +976,17 @@ static ssize_t btrfs_file_aio_write(struct kiocb *iocb,
 		WARN_ON(num_pages > nrptrs);
 		memset(pages, 0, sizeof(struct page *) * nrptrs);
 
-		ret = btrfs_delalloc_reserve_space(inode, write_bytes);
+		/*
+		 * Fault pages before locking them in prepare_pages
+		 * to avoid recursive lock
+		 */
+		if (unlikely(iov_iter_fault_in_readable(i, write_bytes))) {
+			ret = -EFAULT;
+			goto out;
+		}
+
+		ret = btrfs_delalloc_reserve_space(inode,
+					num_pages << PAGE_CACHE_SHIFT);
 		if (ret)
 			goto out;
 
@@ -978,37 +994,49 @@ static ssize_t btrfs_file_aio_write(struct kiocb *iocb,
 				    pos, first_index, last_index,
 				    write_bytes);
 		if (ret) {
-			btrfs_delalloc_release_space(inode, write_bytes);
+			btrfs_delalloc_release_space(inode,
+					num_pages << PAGE_CACHE_SHIFT);
 			goto out;
 		}
 
-		ret = btrfs_copy_from_user(pos, num_pages,
+		copied = btrfs_copy_from_user(pos, num_pages,
 					   write_bytes, pages, i);
-		if (ret == 0) {
+		dirty_pages = (copied + PAGE_CACHE_SIZE - 1) >>
+					PAGE_CACHE_SHIFT;
+
+		if (num_pages > dirty_pages) {
+			if (copied > 0)
+