Re: [PATCH 1/2] staging: zram: fix handle_pending_slot_free() and zram_reset_device() race

2013-09-23 Thread Sergey Senozhatsky
On (09/23/13 13:24), Minchan Kim wrote:
> > On (09/16/13 09:02), Minchan Kim wrote:
> > > Hello Sergey,
> > > 
> > > Sorry for the really slow response. I was really busy with internal work,
> > > and thanks for pointing out the BUG, Dan, Jerome and Sergey.
> > > I read your threads roughly so I may have missed something; if so, sorry
> > > for that. Anyway, I will put in my opinion.
> > > 
> > > On Wed, Sep 11, 2013 at 02:12:50AM +0300, Sergey Senozhatsky wrote:
> > > > Dan Carpenter noted that handle_pending_slot_free() is racy with
> > > > zram_reset_device(). Take write init_lock in zram_slot_free(), thus
> > > 
> > > Right, but "init_lock" is what I really want to remove.
> > > Yes, it's just a read-side lock so most of the time it doesn't hurt us, but it
> > > makes the code very complicated and deadlock prone, so I'd like to replace
> > > it with RCU. Yeah, it's off topic, but let me just put my opinion on the
> > > future direction.
> > > 
> > > About the bug, how about moving flush_work below down_write(init_lock)?
> > > zram_make_request is already protected by init_lock and we have a rule about
> > > lock ordering as follows, so I don't see any problem:
> > > 
> > >   init_lock
> > > zram->lock
> > > 
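[Sketch, not from the original mail: a rough illustration of what the suggested
ordering in zram_reset_device() could look like. The field names (free_work,
init_lock, init_done) are taken from the hunks quoted in this thread; the
surrounding body is abbreviated and assumed, not the actual driver code:

    static void zram_reset_device(struct zram *zram)
    {
            /*
             * Suggested ordering: take init_lock for writing first, then
             * flush the pending slot-free work.  zram_make_request() runs
             * under down_read(&zram->init_lock), and the (pre-patch) free
             * work takes zram->lock, which nests inside init_lock, so the
             * stated ordering "init_lock, then zram->lock" is preserved.
             */
            down_write(&zram->init_lock);
            flush_work(&zram->free_work);

            if (!zram->init_done)
                    goto out;

            /* ... free allocated pages, destroy meta, zram->init_done = 0 ... */
    out:
            up_write(&zram->init_lock);
    }
]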
> > > > preventing any concurrent zram_slot_free(), zram_bvec_rw() or
> > > > zram_reset_device(). This also allows us to safely check zram->init_done
> > > > in handle_pending_slot_free().
> > > > 
> > > > The initial intention was to minimize the number of handle_pending_slot_free()
> > > > calls from zram_bvec_rw(), which were slowing down READ requests due to
> > > > the slot_free_lock spin lock. Jerome Marchand suggested removing
> > > > handle_pending_slot_free() from zram_bvec_rw().
> > > > 
> > > > Link: https://lkml.org/lkml/2013/9/9/172
> > > > Signed-off-by: Sergey Senozhatsky <sergey.senozhat...@gmail.com>
> > > > 
> > > > ---
> > > > 
> > > >  drivers/staging/zram/zram_drv.c | 13 +++++--------
> > > >  1 file changed, 5 insertions(+), 8 deletions(-)
> > > > 
> > > > diff --git a/drivers/staging/zram/zram_drv.c 
> > > > b/drivers/staging/zram/zram_drv.c
> > > > index 91d94b5..7a2d4de 100644
> > > > --- a/drivers/staging/zram/zram_drv.c
> > > > +++ b/drivers/staging/zram/zram_drv.c
> > > > @@ -521,7 +521,8 @@ static void handle_pending_slot_free(struct zram 
> > > > *zram)
> > > > while (zram->slot_free_rq) {
> > > > free_rq = zram->slot_free_rq;
> > > > zram->slot_free_rq = free_rq->next;
> > > > -   zram_free_page(zram, free_rq->index);
> > > > +   if (zram->init_done)
> > > > +   zram_free_page(zram, free_rq->index);
> > > > kfree(free_rq);
> > > > }
> > > > spin_unlock(&zram->slot_free_lock);
> > > > @@ -534,16 +535,13 @@ static int zram_bvec_rw(struct zram *zram, struct 
> > > > bio_vec *bvec, u32 index,
> > > >  
> > > > if (rw == READ) {
> > > > down_read(&zram->lock);
> > > > -   handle_pending_slot_free(zram);
> > > 
> > > The read side is okay, but actually I have a nitpick.
> > > If someone polls a block in a zram-swap device, he would suddenly see the
> > > block has zero value although there was no I/O (I don't want to argue
> > > whether that's a sane user or not). It never happens on a real block device
> > > and it never happens on a zram block device; it can only happen on a
> > > zram-swap device.
> > > And such behavior has been there since we introduced swap_slot_free_notify.
> > > (Off-topic: I'd like to remove it because it creates tight coupling between
> > > zram and swap and it is obviously a layering-violation function.)
> > > So for now, I don't have a strong objection.
> > > 
> > > The idea behind removing swap_slot_free_notify is to use frontswap when the
> > > user wants to use zram as swap, so zram can be notified when a block loses
> > > its owner; but we would still need to solve the mutex problem in the notify
> > > handler.
> > > 
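[Sketch, not from the original mail: the "mutex problem" above is that
swap_slot_free_notify() is called from atomic context, so the handler cannot
take a mutex or an rw_semaphore such as init_lock. The deferred-free path being
discussed looks roughly like the following; the body is assumed from the hunks
quoted elsewhere in this thread (slot_free_rq, slot_free_lock, free_work,
add_slot_free), not copied from the driver:

    static void zram_slot_free_notify(struct block_device *bdev,
                                      unsigned long index)
    {
            struct zram *zram = bdev->bd_disk->private_data;
            struct zram_slot_free *free_rq;

            /* Atomic context: no sleeping allocation, no mutex/rwsem. */
            free_rq = kmalloc(sizeof(*free_rq), GFP_ATOMIC);
            if (!free_rq)
                    return;

            free_rq->index = index;
            /* Queue the request under the slot_free_lock spinlock ... */
            add_slot_free(zram, free_rq);
            /* ... and defer the actual zram_free_page() to process context. */
            schedule_work(&zram->free_work);
    }
]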
> > > 
> > > > ret = zram_bvec_read(zram, bvec, index, offset, bio);
> > > > up_read(&zram->lock);
> > > > } else {
> > > > down_write(&zram->lock);
> > > > -   handle_pending_slot_free(zram);
> > > 
> > > Why did you remove this on the write side?
> > > We can't predict when the work will trigger. It means the work could remove
> > > a valid block under us.
> > > 
> > 
> > 
> > not sure I understand how.
> > zram_slot_free() takes down_write(&zram->init_lock) and zram_make_request()
> > takes down_read(&zram->init_lock), thus zram_slot_free() cannot work
> > concurrently with any RW requests. RW requests are under the read lock and
> > zram_slot_free() is under the write lock.
> 
> Let's consider an example.
> The swap subsystem asked zram to free block "A" via swap_slot_free_notify,
> but zram pended the request without actually freeing the block.
> Swap then reused block "A" for new data because block "A" looked free; the
> request that had been pending for a long time was finally handled, and zram
> blindly freed the new data in block "A". :(
> That's why we should handle pending free requests right before the zram write.

Re: [PATCH 1/2] staging: zram: fix handle_pending_slot_free() and zram_reset_device() race

2013-09-22 Thread Minchan Kim
On Tue, Sep 17, 2013 at 08:24:45PM +0300, Sergey Senozhatsky wrote:
> 
> Hello,
> 
> On (09/16/13 09:02), Minchan Kim wrote:
> > Hello Sergey,
> > 
> > Sorry for the really slow response. I was really busy with internal work,
> > and thanks for pointing out the BUG, Dan, Jerome and Sergey.
> > I read your threads roughly so I may have missed something; if so, sorry
> > for that. Anyway, I will put in my opinion.
> > 
> > On Wed, Sep 11, 2013 at 02:12:50AM +0300, Sergey Senozhatsky wrote:
> > > Dan Carpenter noted that handle_pending_slot_free() is racy with
> > > zram_reset_device(). Take write init_lock in zram_slot_free(), thus
> > 
> > Right, but "init_lock" is what I really want to remove.
> > Yes, it's just a read-side lock so most of the time it doesn't hurt us, but it
> > makes the code very complicated and deadlock prone, so I'd like to replace
> > it with RCU. Yeah, it's off topic, but let me just put my opinion on the
> > future direction.
> > 
> > About the bug, how about moving flush_work below down_write(init_lock)?
> > zram_make_request is already protected by init_lock and we have a rule about
> > lock ordering as follows, so I don't see any problem:
> > 
> >   init_lock
> > zram->lock
> > 
> > > preventing any concurrent zram_slot_free(), zram_bvec_rw() or
> > > zram_reset_device(). This also allows us to safely check zram->init_done
> > > in handle_pending_slot_free().
> > > 
> > > The initial intention was to minimize the number of handle_pending_slot_free()
> > > calls from zram_bvec_rw(), which were slowing down READ requests due to
> > > the slot_free_lock spin lock. Jerome Marchand suggested removing
> > > handle_pending_slot_free() from zram_bvec_rw().
> > > 
> > > Link: https://lkml.org/lkml/2013/9/9/172
> > > Signed-off-by: Sergey Senozhatsky <sergey.senozhat...@gmail.com>
> > > 
> > > ---
> > > 
> > >  drivers/staging/zram/zram_drv.c | 13 +++++--------
> > >  1 file changed, 5 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/drivers/staging/zram/zram_drv.c 
> > > b/drivers/staging/zram/zram_drv.c
> > > index 91d94b5..7a2d4de 100644
> > > --- a/drivers/staging/zram/zram_drv.c
> > > +++ b/drivers/staging/zram/zram_drv.c
> > > @@ -521,7 +521,8 @@ static void handle_pending_slot_free(struct zram 
> > > *zram)
> > >   while (zram->slot_free_rq) {
> > >   free_rq = zram->slot_free_rq;
> > >   zram->slot_free_rq = free_rq->next;
> > > - zram_free_page(zram, free_rq->index);
> > > + if (zram->init_done)
> > > + zram_free_page(zram, free_rq->index);
> > >   kfree(free_rq);
> > >   }
> > >   spin_unlock(&zram->slot_free_lock);
> > > @@ -534,16 +535,13 @@ static int zram_bvec_rw(struct zram *zram, struct 
> > > bio_vec *bvec, u32 index,
> > >  
> > >   if (rw == READ) {
> > >   down_read(&zram->lock);
> > > - handle_pending_slot_free(zram);
> > 
> > The read side is okay, but actually I have a nitpick.
> > If someone polls a block in a zram-swap device, he would suddenly see the
> > block has zero value although there was no I/O (I don't want to argue
> > whether that's a sane user or not). It never happens on a real block device
> > and it never happens on a zram block device; it can only happen on a
> > zram-swap device.
> > And such behavior has been there since we introduced swap_slot_free_notify.
> > (Off-topic: I'd like to remove it because it creates tight coupling between
> > zram and swap and it is obviously a layering-violation function.)
> > So for now, I don't have a strong objection.
> > 
> > The idea behind removing swap_slot_free_notify is to use frontswap when the
> > user wants to use zram as swap, so zram can be notified when a block loses
> > its owner; but we would still need to solve the mutex problem in the notify
> > handler.
> > 
> > 
> > >   ret = zram_bvec_read(zram, bvec, index, offset, bio);
> > >   up_read(&zram->lock);
> > >   } else {
> > >   down_write(&zram->lock);
> > > - handle_pending_slot_free(zram);
> > 
> > Why did you remove this on the write side?
> > We can't predict when the work will trigger. It means the work could remove
> > a valid block under us.
> > 
> 
> 
> not sure I understand how.
> zram_slot_free() takes down_write(&zram->init_lock) and zram_make_request()
> takes down_read(&zram->init_lock), thus zram_slot_free() cannot work
> concurrently with any RW requests. RW requests are under the read lock and
> zram_slot_free() is under the write lock.

Let's consider an example.
The swap subsystem asked zram to free block "A" via swap_slot_free_notify,
but zram pended the request without actually freeing the block.
Swap then reused block "A" for new data because block "A" looked free; the
request that had been pending for a long time was finally handled, and zram
blindly freed the new data in block "A". :(
That's why we should handle pending free requests right before the zram write.
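[Sketch, not from the original mail: keeping the drain in the write path, as the
pre-patch code quoted above did, would look roughly like this. The argument list
of zram_bvec_rw() is partly assumed from the wrapped hunk headers:

    static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
                            int offset, struct bio *bio, int rw)
    {
            int ret;

            if (rw == READ) {
                    down_read(&zram->lock);
                    ret = zram_bvec_read(zram, bvec, index, offset, bio);
                    up_read(&zram->lock);
            } else {
                    down_write(&zram->lock);
                    /*
                     * Drain pending slot-free requests before writing, so a
                     * stale, long-pended free cannot later wipe out the data
                     * we are about to store in a reused block.
                     */
                    handle_pending_slot_free(zram);
                    ret = zram_bvec_write(zram, bvec, index, offset);
                    up_write(&zram->lock);
            }

            return ret;
    }
]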

Another way to optimize the lock overhead is to check whether the block is
pending for free right before zram_free_page() in the write path. If so, we
should remove the pending request from the slot_free_rq list so that we don't
free a valid block later. But for that case, we need a more complex data
structure to find the block fast, and many checks right before
zram_free_page(), which would make the code rather complicated.

So, do you have any real workload for us to consider whether it's really
troublesome?

Re: [PATCH 1/2] staging: zram: fix handle_pending_slot_free() and zram_reset_device() race

2013-09-17 Thread Sergey Senozhatsky

Hello,

On (09/16/13 09:02), Minchan Kim wrote:
> Hello Sergey,
> 
> Sorry for the really slow response. I was really busy with internal work,
> and thanks for pointing out the BUG, Dan, Jerome and Sergey.
> I read your threads roughly so I may have missed something; if so, sorry
> for that. Anyway, I will put in my opinion.
> 
> On Wed, Sep 11, 2013 at 02:12:50AM +0300, Sergey Senozhatsky wrote:
> > Dan Carpenter noted that handle_pending_slot_free() is racy with
> > zram_reset_device(). Take write init_lock in zram_slot_free(), thus
> 
> Right, but "init_lock" is what I really want to remove.
> Yes, it's just a read-side lock so most of the time it doesn't hurt us, but it
> makes the code very complicated and deadlock prone, so I'd like to replace
> it with RCU. Yeah, it's off topic, but let me just put my opinion on the
> future direction.
> 
> About the bug, how about moving flush_work below down_write(init_lock)?
> zram_make_request is already protected by init_lock and we have a rule about
> lock ordering as follows, so I don't see any problem:
> 
>   init_lock
> zram->lock
> 
> > preventing any concurrent zram_slot_free(), zram_bvec_rw() or
> > zram_reset_device(). This also allows us to safely check zram->init_done
> > in handle_pending_slot_free().
> > 
> > The initial intention was to minimize the number of handle_pending_slot_free()
> > calls from zram_bvec_rw(), which were slowing down READ requests due to
> > the slot_free_lock spin lock. Jerome Marchand suggested removing
> > handle_pending_slot_free() from zram_bvec_rw().
> > 
> > Link: https://lkml.org/lkml/2013/9/9/172
> > Signed-off-by: Sergey Senozhatsky <sergey.senozhat...@gmail.com>
> > 
> > ---
> > 
> >  drivers/staging/zram/zram_drv.c | 13 +++++--------
> >  1 file changed, 5 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/staging/zram/zram_drv.c 
> > b/drivers/staging/zram/zram_drv.c
> > index 91d94b5..7a2d4de 100644
> > --- a/drivers/staging/zram/zram_drv.c
> > +++ b/drivers/staging/zram/zram_drv.c
> > @@ -521,7 +521,8 @@ static void handle_pending_slot_free(struct zram *zram)
> > while (zram->slot_free_rq) {
> > free_rq = zram->slot_free_rq;
> > zram->slot_free_rq = free_rq->next;
> > -   zram_free_page(zram, free_rq->index);
> > +   if (zram->init_done)
> > +   zram_free_page(zram, free_rq->index);
> > kfree(free_rq);
> > }
> > spin_unlock(&zram->slot_free_lock);
> > @@ -534,16 +535,13 @@ static int zram_bvec_rw(struct zram *zram, struct 
> > bio_vec *bvec, u32 index,
> >  
> > if (rw == READ) {
> > down_read(&zram->lock);
> > -   handle_pending_slot_free(zram);
> 
> The read side is okay, but actually I have a nitpick.
> If someone polls a block in a zram-swap device, he would suddenly see the
> block has zero value although there was no I/O (I don't want to argue
> whether that's a sane user or not). It never happens on a real block device
> and it never happens on a zram block device; it can only happen on a
> zram-swap device.
> And such behavior has been there since we introduced swap_slot_free_notify.
> (Off-topic: I'd like to remove it because it creates tight coupling between
> zram and swap and it is obviously a layering-violation function.)
> So for now, I don't have a strong objection.
> 
> The idea behind removing swap_slot_free_notify is to use frontswap when the
> user wants to use zram as swap, so zram can be notified when a block loses
> its owner; but we would still need to solve the mutex problem in the notify
> handler.
> 
> 
> > ret = zram_bvec_read(zram, bvec, index, offset, bio);
> > up_read(&zram->lock);
> > } else {
> > down_write(&zram->lock);
> > -   handle_pending_slot_free(zram);
> 
> Why did you remove this on the write side?
> We can't predict when the work will trigger. It means the work could remove
> a valid block under us.
> 


not sure I understand how.
zram_slot_free() takes down_write(&zram->init_lock) and zram_make_request()
takes down_read(&zram->init_lock), thus zram_slot_free() cannot work
concurrently with any RW requests. RW requests are under the read lock and
zram_slot_free() is under the write lock.

> > ret = zram_bvec_write(zram, bvec, index, offset);
> > up_write(&zram->lock);
> > }
> > -
> > return ret;
> >  }
> >  
> > @@ -750,12 +748,11 @@ error:
> >  
> >  static void zram_slot_free(struct work_struct *work)
> >  {
> > -   struct zram *zram;
> > +   struct zram *zram = container_of(work, struct zram, free_work);
> >  
> > -   zram = container_of(work, struct zram, free_work);
> > -   down_write(&zram->lock);
> > +   down_write(&zram->init_lock);
> 
> I don't like this.
> The primary problem is that we should handle this as atomic, so we should use
> a spinlock instead of a mutex. Yeah, /me kicks his ass; from the beginning,
> I should have solved this problem that way.
> 
> The simple solution that popped into my mind is this:
> 
> 
> diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
> index 91d94b5..b23bf0e 100644
> --- a/drivers/staging/zram/zram_drv.c
> +++ b/drivers/staging/zram/zram_drv.c
> @@ -534,11 +534,14 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
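[Sketch, not from the original mail: for context, the pending-free machinery
being discussed looks roughly like the following, reconstructed from the hunks
quoted above; the zram_slot_free layout and the add_slot_free() body are assumed
from their use, not copied from the driver. The point of the spinlock is that
the list can be populated from the atomic swap_slot_free_notify path, where a
mutex or rw_semaphore such as init_lock cannot be taken:

    struct zram_slot_free {
            unsigned long index;
            struct zram_slot_free *next;
    };

    /* Called from atomic context: only the spinlock is taken. */
    static void add_slot_free(struct zram *zram, struct zram_slot_free *free_rq)
    {
            spin_lock(&zram->slot_free_lock);
            free_rq->next = zram->slot_free_rq;
            zram->slot_free_rq = free_rq;
            spin_unlock(&zram->slot_free_lock);
    }

    /* Called from process context (work item / reset path): drains the list. */
    static void handle_pending_slot_free(struct zram *zram)
    {
            struct zram_slot_free *free_rq;

            spin_lock(&zram->slot_free_lock);
            while (zram->slot_free_rq) {
                    free_rq = zram->slot_free_rq;
                    zram->slot_free_rq = free_rq->next;
                    if (zram->init_done)
                            zram_free_page(zram, free_rq->index);
                    kfree(free_rq);
            }
            spin_unlock(&zram->slot_free_lock);
    }
]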

Re: [PATCH 1/2] staging: zram: fix handle_pending_slot_free() and zram_reset_device() race

2013-09-13 Thread Sergey Senozhatsky
On (09/12/13 15:12), Greg KH wrote:
> On Wed, Sep 11, 2013 at 02:12:50AM +0300, Sergey Senozhatsky wrote:
> > Dan Carpenter noted that handle_pending_slot_free() is racy with
> > zram_reset_device(). Take write init_lock in zram_slot_free(), thus
> > preventing any concurrent zram_slot_free(), zram_bvec_rw() or
> > zram_reset_device(). This also allows us to safely check zram->init_done
> > in handle_pending_slot_free().
> > 
> > The initial intention was to minimize the number of handle_pending_slot_free()
> > calls from zram_bvec_rw(), which were slowing down READ requests due to
> > the slot_free_lock spin lock. Jerome Marchand suggested removing
> > handle_pending_slot_free() from zram_bvec_rw().
> > 
> > Link: https://lkml.org/lkml/2013/9/9/172
> > Signed-off-by: Sergey Senozhatsky <sergey.senozhat...@gmail.com>
> 
> I have multiple versions of this and the other zram patches from you,
> with no idea of which to accept.

yes, please, drop all patches. I did not Cc you on these two
patches to stop spamming your inbox with multiple versions.
I will send them back to you as soon as I get positive feedback.


> So, I'm going to drop them all, can you please resend what you wish to
> submit, and in the future, be a bit more obvious with your "vN"
> markings?
>

sorry for that.

-ss

> thanks,
> 
> greg k-h
> 


Re: [PATCH 1/2] staging: zram: fix handle_pending_slot_free() and zram_reset_device() race

2013-09-12 Thread Greg KH
On Wed, Sep 11, 2013 at 02:12:50AM +0300, Sergey Senozhatsky wrote:
> Dan Carpenter noted that handle_pending_slot_free() is racy with
> zram_reset_device(). Take write init_lock in zram_slot_free(), thus
> preventing any concurrent zram_slot_free(), zram_bvec_rw() or
> zram_reset_device(). This also allows us to safely check zram->init_done
> in handle_pending_slot_free().
> 
> The initial intention was to minimize the number of handle_pending_slot_free()
> calls from zram_bvec_rw(), which were slowing down READ requests due to
> the slot_free_lock spin lock. Jerome Marchand suggested removing
> handle_pending_slot_free() from zram_bvec_rw().
> 
> Link: https://lkml.org/lkml/2013/9/9/172
> Signed-off-by: Sergey Senozhatsky <sergey.senozhat...@gmail.com>

I have multiple versions of this and the other zram patches from you,
with no idea of which to accept.

So, I'm going to drop them all, can you please resend what you wish to
submit, and in the future, be a bit more obvious with your "vN"
markings?

thanks,

greg k-h


[PATCH 1/2] staging: zram: fix handle_pending_slot_free() and zram_reset_device() race

2013-09-10 Thread Sergey Senozhatsky
Dan Carpenter noted that handle_pending_slot_free() is racy with
zram_reset_device(). Take write init_lock in zram_slot_free(), thus
preventing any concurrent zram_slot_free(), zram_bvec_rw() or
zram_reset_device(). This also allows us to safely check zram->init_done
in handle_pending_slot_free().

The initial intention was to minimize the number of handle_pending_slot_free()
calls from zram_bvec_rw(), which were slowing down READ requests due to
the slot_free_lock spin lock. Jerome Marchand suggested removing
handle_pending_slot_free() from zram_bvec_rw().

Link: https://lkml.org/lkml/2013/9/9/172
Signed-off-by: Sergey Senozhatsky <sergey.senozhat...@gmail.com>

---

 drivers/staging/zram/zram_drv.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
index 91d94b5..7a2d4de 100644
--- a/drivers/staging/zram/zram_drv.c
+++ b/drivers/staging/zram/zram_drv.c
@@ -521,7 +521,8 @@ static void handle_pending_slot_free(struct zram *zram)
while (zram->slot_free_rq) {
free_rq = zram->slot_free_rq;
zram->slot_free_rq = free_rq->next;
-   zram_free_page(zram, free_rq->index);
+   if (zram->init_done)
+   zram_free_page(zram, free_rq->index);
kfree(free_rq);
}
spin_unlock(&zram->slot_free_lock);
@@ -534,16 +535,13 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
 
if (rw == READ) {
down_read(&zram->lock);
-   handle_pending_slot_free(zram);
ret = zram_bvec_read(zram, bvec, index, offset, bio);
up_read(&zram->lock);
} else {
down_write(&zram->lock);
-   handle_pending_slot_free(zram);
ret = zram_bvec_write(zram, bvec, index, offset);
up_write(&zram->lock);
}
-
return ret;
 }
 
@@ -750,12 +748,11 @@ error:
 
 static void zram_slot_free(struct work_struct *work)
 {
-   struct zram *zram;
+   struct zram *zram = container_of(work, struct zram, free_work);
 
-   zram = container_of(work, struct zram, free_work);
-   down_write(&zram->lock);
+   down_write(&zram->init_lock);
handle_pending_slot_free(zram);
-   up_write(&zram->lock);
+   up_write(&zram->init_lock);
 }
 
 static void add_slot_free(struct zram *zram, struct zram_slot_free *free_rq)


