Re: [PATCH 0/3] Ceph fscache: Fix kernel panic due to a race

2014-03-06 Thread Sage Weil
Hi Milosz,

Thanks, I've added this to the testing branch!

sage


On Mon, 3 Mar 2014, Milosz Tanski wrote:

> Hey guys, I'm terribly sorry, but apparently this got stuck in my draft
> mailbox for 3 months. Since we've been running this on both our test and
> prod clusters, I would say it is sufficiently tested.
> 
> I've looked at the code and tested this for a week now on a test cluster,
> and it looks good. It does fix a real problem, and I think we should
> push it to mainline. Thanks for fixing this, Li! The reason I was not
> seeing it is that I had other fscache patches that were masking the
> problem :/
> 
> Thanks,
> - Milosz
> 
> P.S.: Sorry for the duplicate mail; the first one was not sent as plain
> text. I apparently do not know how to use Gmail.
> 
> On Fri, Jan 3, 2014 at 9:43 AM, Milosz Tanski  wrote:
> > I'm going to look at the patches and the issue in full detail. In the
> > meantime, do you have the oops backtrace? I have some other
> > fscache patches that haven't made it upstream yet that might have been
> > masking this issue for me.
> >
> > On Fri, Dec 27, 2013 at 10:51 PM, Li Wang  wrote:
> >> Hi Milosz,
> >>   As far as I know, fscache currently does not act as a
> >> write cache for Ceph, except that there is a
> >> call to ceph_readpage_to_fscache() in ceph_writepage(), but that
> >> is unrelated to our test case. From our observation,
> >> our test case never goes through ceph_writepage(); instead, it goes
> >> through ceph_writepages(). In other words, I do not think this
> >> is related to caching in the write path.
> >>   Let me try to explain the panic in more detail:
> >>
> >> (1) dd if=/dev/zero of=cephfs/foo bs=8 count=512
> >> (2) echo 3 > /proc/sys/vm/drop_caches
> >> (3) dd if=cephfs/foo of=/dev/null bs=8 count=1024
> >>
> >> For statement (1), the file is repeatedly appended to, so
> >> ceph_aio_write() repeatedly updates inode->i_size;
> >> however, these updates are not immediately reflected in
> >> object->store_limit_l. For statement (3), when we
> >> start reading the second page at [4096, 8192), Ceph finds that the page
> >> is not cached in fscache, so it decides to write this page into
> >> fscache. During this process, in cachefiles_write_page(), it finds that
> >> object->store_limit_l < 4096 (page->index << 12), which causes the panic.
> >> Does that make sense?
> >>
> >> Cheers,
> >> Li Wang
> >>
> >>
> >> On 2013/12/27 6:51, Milosz Tanski wrote:
> >>>
> >>> Li,
> >>>
> >>> I looked at the patchset. Am I correct that this only happens when we
> >>> enable caching in the write path?
> >>>
> >>> - Milosz
> >>>
> >>> On Thu, Dec 26, 2013 at 9:29 AM, Li Wang  wrote:
> 
>  From: Yunchuan Wen 
> 
>  The following scripts could easily panic the kernel,
> 
>  #!/bin/bash
>  mount -t ceph -o fsc MONADDR:/ cephfs
>  rm -rf cephfs/foo
>  dd if=/dev/zero of=cephfs/foo bs=8 count=512
>  echo 3 > /proc/sys/vm/drop_caches
>  dd if=cephfs/foo of=/dev/null bs=8 count=1024
> 
>  This is because, when writing a page into fscache, the code
>  asserts that the write position does not exceed
>  object->store_limit_l, which is supposed to be equal to inode->i_size.
>  However, in the current implementation, after a file write,
>  object->store_limit_l is not immediately synchronized with the new
>  inode->i_size, which introduces a race: writing
>  a new page into fscache can hit the ASSERT that the write position
>  has exceeded object->store_limit_l, causing a kernel panic.
>  This patch set fixes it.
> 
>  Yunchuan Wen (3):
> Ceph fscache: Add an interface to synchronize object store limit
> Ceph fscache: Update object store limit after writing
> Ceph fscache: Wait for completion of object initialization
> 
>    fs/ceph/cache.c |1 +
>    fs/ceph/cache.h |   10 ++
>    fs/ceph/file.c  |3 +++
>    3 files changed, 14 insertions(+)
> 
>  --
>  1.7.9.5
> 
> >>>
> >>>
> >>>
> >>
> >
> >
> >
> > --
> > Milosz Tanski
> > CTO
> > 10 East 53rd Street, 37th floor
> > New York, NY 10022
> >
> > p: 646-253-9055
> > e: mil...@adfin.com
> 
> 
> 
> -- 
> Milosz Tanski
> CTO
> 10 East 53rd Street, 37th floor
> New York, NY 10022
> 
> p: 646-253-9055
> e: mil...@adfin.com
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/3] Ceph fscache: Fix kernel panic due to a race

2014-03-03 Thread Milosz Tanski
Hey guys, I'm terribly sorry, but apparently this got stuck in my draft
mailbox for 3 months. Since we've been running this on both our test and
prod clusters, I would say it is sufficiently tested.

I've looked at the code and tested this for a week now on a test cluster,
and it looks good. It does fix a real problem, and I think we should
push it to mainline. Thanks for fixing this, Li! The reason I was not
seeing it is that I had other fscache patches that were masking the
problem :/

Thanks,
- Milosz

P.S.: Sorry for the duplicate mail; the first one was not sent as plain
text. I apparently do not know how to use Gmail.

On Fri, Jan 3, 2014 at 9:43 AM, Milosz Tanski  wrote:
> I'm going to look at the patches and the issue in full detail. In the
> meantime, do you have the oops backtrace? I have some other
> fscache patches that haven't made it upstream yet that might have been
> masking this issue for me.
>
> On Fri, Dec 27, 2013 at 10:51 PM, Li Wang  wrote:
>> Hi Milosz,
>>   As far as I know, fscache currently does not act as a
>> write cache for Ceph, except that there is a
>> call to ceph_readpage_to_fscache() in ceph_writepage(), but that
>> is unrelated to our test case. From our observation,
>> our test case never goes through ceph_writepage(); instead, it goes
>> through ceph_writepages(). In other words, I do not think this
>> is related to caching in the write path.
>>   Let me try to explain the panic in more detail:
>>
>> (1) dd if=/dev/zero of=cephfs/foo bs=8 count=512
>> (2) echo 3 > /proc/sys/vm/drop_caches
>> (3) dd if=cephfs/foo of=/dev/null bs=8 count=1024
>>
>> For statement (1), the file is repeatedly appended to, so
>> ceph_aio_write() repeatedly updates inode->i_size;
>> however, these updates are not immediately reflected in
>> object->store_limit_l. For statement (3), when we
>> start reading the second page at [4096, 8192), Ceph finds that the page
>> is not cached in fscache, so it decides to write this page into
>> fscache. During this process, in cachefiles_write_page(), it finds that
>> object->store_limit_l < 4096 (page->index << 12), which causes the panic.
>> Does that make sense?
>>
>> Cheers,
>> Li Wang
>>
>>
>> On 2013/12/27 6:51, Milosz Tanski wrote:
>>>
>>> Li,
>>>
>>> I looked at the patchset. Am I correct that this only happens when we
>>> enable caching in the write path?
>>>
>>> - Milosz
>>>
>>> On Thu, Dec 26, 2013 at 9:29 AM, Li Wang  wrote:

 From: Yunchuan Wen 

 The following scripts could easily panic the kernel,

 #!/bin/bash
 mount -t ceph -o fsc MONADDR:/ cephfs
 rm -rf cephfs/foo
 dd if=/dev/zero of=cephfs/foo bs=8 count=512
 echo 3 > /proc/sys/vm/drop_caches
 dd if=cephfs/foo of=/dev/null bs=8 count=1024

 This is because, when writing a page into fscache, the code
 asserts that the write position does not exceed
 object->store_limit_l, which is supposed to be equal to inode->i_size.
 However, in the current implementation, after a file write,
 object->store_limit_l is not immediately synchronized with the new
 inode->i_size, which introduces a race: writing
 a new page into fscache can hit the ASSERT that the write position
 has exceeded object->store_limit_l, causing a kernel panic.
 This patch set fixes it.

 Yunchuan Wen (3):
Ceph fscache: Add an interface to synchronize object store limit
Ceph fscache: Update object store limit after writing
Ceph fscache: Wait for completion of object initialization

   fs/ceph/cache.c |1 +
   fs/ceph/cache.h |   10 ++
   fs/ceph/file.c  |3 +++
   3 files changed, 14 insertions(+)

 --
 1.7.9.5

>>>
>>>
>>>
>>
>
>
>
> --
> Milosz Tanski
> CTO
> 10 East 53rd Street, 37th floor
> New York, NY 10022
>
> p: 646-253-9055
> e: mil...@adfin.com



-- 
Milosz Tanski
CTO
10 East 53rd Street, 37th floor
New York, NY 10022

p: 646-253-9055
e: mil...@adfin.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/3] Ceph fscache: Fix kernel panic due to a race

2014-01-03 Thread Milosz Tanski
I'm going to look at the patches and the issue in full detail. In the
meantime, do you have the oops backtrace? I have some other
fscache patches that haven't made it upstream yet that might have been
masking this issue for me.

On Fri, Dec 27, 2013 at 10:51 PM, Li Wang  wrote:
> Hi Milosz,
>   As far as I know, fscache currently does not act as a
> write cache for Ceph, except that there is a
> call to ceph_readpage_to_fscache() in ceph_writepage(), but that
> is unrelated to our test case. From our observation,
> our test case never goes through ceph_writepage(); instead, it goes
> through ceph_writepages(). In other words, I do not think this
> is related to caching in the write path.
>   Let me try to explain the panic in more detail:
>
> (1) dd if=/dev/zero of=cephfs/foo bs=8 count=512
> (2) echo 3 > /proc/sys/vm/drop_caches
> (3) dd if=cephfs/foo of=/dev/null bs=8 count=1024
>
> For statement (1), the file is repeatedly appended to, so
> ceph_aio_write() repeatedly updates inode->i_size;
> however, these updates are not immediately reflected in
> object->store_limit_l. For statement (3), when we
> start reading the second page at [4096, 8192), Ceph finds that the page
> is not cached in fscache, so it decides to write this page into
> fscache. During this process, in cachefiles_write_page(), it finds that
> object->store_limit_l < 4096 (page->index << 12), which causes the panic.
> Does that make sense?
>
> Cheers,
> Li Wang
>
>
> On 2013/12/27 6:51, Milosz Tanski wrote:
>>
>> Li,
>>
>> I looked at the patchset. Am I correct that this only happens when we
>> enable caching in the write path?
>>
>> - Milosz
>>
>> On Thu, Dec 26, 2013 at 9:29 AM, Li Wang  wrote:
>>>
>>> From: Yunchuan Wen 
>>>
>>> The following scripts could easily panic the kernel,
>>>
>>> #!/bin/bash
>>> mount -t ceph -o fsc MONADDR:/ cephfs
>>> rm -rf cephfs/foo
>>> dd if=/dev/zero of=cephfs/foo bs=8 count=512
>>> echo 3 > /proc/sys/vm/drop_caches
>>> dd if=cephfs/foo of=/dev/null bs=8 count=1024
>>>
>>> This is because, when writing a page into fscache, the code
>>> asserts that the write position does not exceed
>>> object->store_limit_l, which is supposed to be equal to inode->i_size.
>>> However, in the current implementation, after a file write,
>>> object->store_limit_l is not immediately synchronized with the new
>>> inode->i_size, which introduces a race: writing
>>> a new page into fscache can hit the ASSERT that the write position
>>> has exceeded object->store_limit_l, causing a kernel panic.
>>> This patch set fixes it.
>>>
>>> Yunchuan Wen (3):
>>>Ceph fscache: Add an interface to synchronize object store limit
>>>Ceph fscache: Update object store limit after writing
>>>Ceph fscache: Wait for completion of object initialization
>>>
>>>   fs/ceph/cache.c |1 +
>>>   fs/ceph/cache.h |   10 ++
>>>   fs/ceph/file.c  |3 +++
>>>   3 files changed, 14 insertions(+)
>>>
>>> --
>>> 1.7.9.5
>>>
>>
>>
>>
>



-- 
Milosz Tanski
CTO
10 East 53rd Street, 37th floor
New York, NY 10022

p: 646-253-9055
e: mil...@adfin.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/3] Ceph fscache: Fix kernel panic due to a race

2013-12-27 Thread Li Wang

Hi Milosz,
  As far as I know, fscache currently does not act as a
write cache for Ceph, except that there is a
call to ceph_readpage_to_fscache() in ceph_writepage(), but that
is unrelated to our test case. From our observation,
our test case never goes through ceph_writepage(); instead, it goes
through ceph_writepages(). In other words, I do not think this
is related to caching in the write path.
  Let me try to explain the panic in more detail:

(1) dd if=/dev/zero of=cephfs/foo bs=8 count=512
(2) echo 3 > /proc/sys/vm/drop_caches
(3) dd if=cephfs/foo of=/dev/null bs=8 count=1024

For statement (1), the file is repeatedly appended to, so
ceph_aio_write() repeatedly updates inode->i_size;
however, these updates are not immediately reflected in
object->store_limit_l. For statement (3), when we
start reading the second page at [4096, 8192), Ceph finds that the page
is not cached in fscache, so it decides to write this page into
fscache. During this process, in cachefiles_write_page(), it finds that
object->store_limit_l < 4096 (page->index << 12), which causes the panic.
Does that make sense?
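
To make the failure mode concrete, here is a small user-space model of the
check described above. It is illustrative only: the bare assert() and the
concrete numbers (a stale store limit of 0, 4 KiB pages) are assumptions
standing in for the real cachefiles logic, not the kernel code.

/*
 * Illustrative user-space model of the race -- NOT the kernel code.
 * Assumptions: 4 KiB pages, and a cache object whose store limit was
 * captured while the file was still empty and never resynchronized.
 */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12

int main(void)
{
    uint64_t store_limit = 0;   /* stale: recorded when the cache object was set up */
    uint64_t i_size = 0;

    /* dd if=/dev/zero of=cephfs/foo bs=8 count=512 appends 4096 bytes;
     * every append bumps i_size, but store_limit is never updated. */
    for (int i = 0; i < 512; i++)
        i_size += 8;

    /* After drop_caches, the 8192-byte read reaches the second page.
     * Its byte offset is page index << PAGE_SHIFT. */
    uint64_t page_index = 1;
    uint64_t pos = page_index << PAGE_SHIFT;    /* 4096 */

    printf("i_size=%llu pos=%llu store_limit=%llu\n",
           (unsigned long long)i_size,
           (unsigned long long)pos,
           (unsigned long long)store_limit);

    /* The cache backend requires the write to start below its recorded
     * limit; with the stale limit this fails, and in the kernel the
     * corresponding assertion panics the machine. */
    assert(pos < store_limit);
    return 0;
}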

Cheers,
Li Wang

On 2013/12/27 6:51, Milosz Tanski wrote:

Li,

I looked at the patchset. Am I correct that this only happens when we
enable caching in the write path?

- Milosz

On Thu, Dec 26, 2013 at 9:29 AM, Li Wang  wrote:

From: Yunchuan Wen 

The following scripts could easily panic the kernel,

#!/bin/bash
mount -t ceph -o fsc MONADDR:/ cephfs
rm -rf cephfs/foo
dd if=/dev/zero of=cephfs/foo bs=8 count=512
echo 3 > /proc/sys/vm/drop_caches
dd if=cephfs/foo of=/dev/null bs=8 count=1024

This is because, when writing a page into fscache, the code
asserts that the write position does not exceed
object->store_limit_l, which is supposed to be equal to inode->i_size.
However, in the current implementation, after a file write,
object->store_limit_l is not immediately synchronized with the new
inode->i_size, which introduces a race: writing
a new page into fscache can hit the ASSERT that the write position
has exceeded object->store_limit_l, causing a kernel panic.
This patch set fixes it.

Yunchuan Wen (3):
   Ceph fscache: Add an interface to synchronize object store limit
   Ceph fscache: Update object store limit after writing
   Ceph fscache: Wait for completion of object initialization

  fs/ceph/cache.c |1 +
  fs/ceph/cache.h |   10 ++
  fs/ceph/file.c  |3 +++
  3 files changed, 14 insertions(+)

--
1.7.9.5






--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/3] Ceph fscache: Fix kernel panic due to a race

2013-12-26 Thread Yunchuan Wen
Hi Milosz,
I am not very sure how to enable caching in the write path.

I just compiled the latest kernel with "Enable Ceph client caching
support", started cachefilesd, and mounted cephfs with -o fsc.
Then the kernel panics easily when the script reaches "dd
if=cephfs/foo of=/dev/null bs=8 count=1024".


2013/12/27 Milosz Tanski :
> Li,
>
> I looked at the patchset. Am I correct that this only happens when we
> enable caching in the write path?
>
> - Milosz
>
> On Thu, Dec 26, 2013 at 9:29 AM, Li Wang  wrote:
>> From: Yunchuan Wen 
>>
>> The following scripts could easily panic the kernel,
>>
>> #!/bin/bash
>> mount -t ceph -o fsc MONADDR:/ cephfs
>> rm -rf cephfs/foo
>> dd if=/dev/zero of=cephfs/foo bs=8 count=512
>> echo 3 > /proc/sys/vm/drop_caches
>> dd if=cephfs/foo of=/dev/null bs=8 count=1024
>>
>> This is because, when writing a page into fscache, the code
>> asserts that the write position does not exceed
>> object->store_limit_l, which is supposed to be equal to inode->i_size.
>> However, in the current implementation, after a file write,
>> object->store_limit_l is not immediately synchronized with the new
>> inode->i_size, which introduces a race: writing
>> a new page into fscache can hit the ASSERT that the write position
>> has exceeded object->store_limit_l, causing a kernel panic.
>> This patch set fixes it.
>>
>> Yunchuan Wen (3):
>>   Ceph fscache: Add an interface to synchronize object store limit
>>   Ceph fscache: Update object store limit after writing
>>   Ceph fscache: Wait for completion of object initialization
>>
>>  fs/ceph/cache.c |1 +
>>  fs/ceph/cache.h |   10 ++
>>  fs/ceph/file.c  |3 +++
>>  3 files changed, 14 insertions(+)
>>
>> --
>> 1.7.9.5
>>
>
>
>
> --
> Milosz Tanski
> CTO
> 10 East 53rd Street, 37th floor
> New York, NY 10022
>
> p: 646-253-9055
> e: mil...@adfin.com
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/3] Ceph fscache: Fix kernel panic due to a race

2013-12-26 Thread Milosz Tanski
Li,

I looked at the patchset. Am I correct that this only happens when we
enable caching in the write path?

- Milosz

On Thu, Dec 26, 2013 at 9:29 AM, Li Wang  wrote:
> From: Yunchuan Wen 
>
> The following scripts could easily panic the kernel,
>
> #!/bin/bash
> mount -t ceph -o fsc MONADDR:/ cephfs
> rm -rf cephfs/foo
> dd if=/dev/zero of=cephfs/foo bs=8 count=512
> echo 3 > /proc/sys/vm/drop_caches
> dd if=cephfs/foo of=/dev/null bs=8 count=1024
>
> This is because, when writing a page into fscache, the code
> asserts that the write position does not exceed
> object->store_limit_l, which is supposed to be equal to inode->i_size.
> However, in the current implementation, after a file write,
> object->store_limit_l is not immediately synchronized with the new
> inode->i_size, which introduces a race: writing
> a new page into fscache can hit the ASSERT that the write position
> has exceeded object->store_limit_l, causing a kernel panic.
> This patch set fixes it.
>
> Yunchuan Wen (3):
>   Ceph fscache: Add an interface to synchronize object store limit
>   Ceph fscache: Update object store limit after writing
>   Ceph fscache: Wait for completion of object initialization
>
>  fs/ceph/cache.c |1 +
>  fs/ceph/cache.h |   10 ++
>  fs/ceph/file.c  |3 +++
>  3 files changed, 14 insertions(+)
>
> --
> 1.7.9.5
>



-- 
Milosz Tanski
CTO
10 East 53rd Street, 37th floor
New York, NY 10022

p: 646-253-9055
e: mil...@adfin.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 0/3] Ceph fscache: Fix kernel panic due to a race

2013-12-26 Thread Li Wang
From: Yunchuan Wen 

The following scripts could easily panic the kernel,

#!/bin/bash
mount -t ceph -o fsc MONADDR:/ cephfs
rm -rf cephfs/foo
dd if=/dev/zero of=cephfs/foo bs=8 count=512
echo 3 > /proc/sys/vm/drop_caches
dd if=cephfs/foo of=/dev/null bs=8 count=1024

This is because, when writing a page into fscache, the code
asserts that the write position does not exceed
object->store_limit_l, which is supposed to be equal to inode->i_size.
However, in the current implementation, after a file write,
object->store_limit_l is not immediately synchronized with the new
inode->i_size, which introduces a race: writing
a new page into fscache can hit the ASSERT that the write position
has exceeded object->store_limit_l, causing a kernel panic.
This patch set fixes it.
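
One plausible shape for the interface the first two patches below describe
is sketched here. It assumes the fscache cookie lives in ceph_inode_info
(ci->fscache) and that the generic fscache_attr_changed() hook is what
prompts the cache backend to re-read the object size and advance
object->store_limit_l; the helper name and call site are illustrative, not
necessarily the exact code in this series.

/*
 * Sketch only, under the assumptions stated above.  The idea is to tell
 * fscache that the object's attributes changed whenever a write extends
 * i_size, so the backend refreshes object->store_limit_l from the new
 * inode->i_size before any further page is pushed into the cache.
 */
#include <linux/fscache.h>
#include "super.h"      /* fs/ceph: ceph_inode(), struct ceph_inode_info */

static inline void ceph_fscache_update_objectsize(struct inode *inode)
{
	struct ceph_inode_info *ci = ceph_inode(inode);

	if (ci->fscache)
		fscache_attr_changed(ci->fscache);
}

/*
 * Intended call site: the write path (e.g. the end of ceph_aio_write()),
 * right after inode->i_size has been extended.
 */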

Yunchuan Wen (3):
  Ceph fscache: Add an interface to synchronize object store limit
  Ceph fscache: Update object store limit after writing
  Ceph fscache: Wait for completion of object initialization

 fs/ceph/cache.c |1 +
 fs/ceph/cache.h |   10 ++
 fs/ceph/file.c  |3 +++
 3 files changed, 14 insertions(+)

-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

