Re: [PATCH 1/2] staging: zram: fix handle_pending_slot_free() and zram_reset_device() race
On (09/23/13 13:24), Minchan Kim wrote:
> > On (09/16/13 09:02), Minchan Kim wrote:
> > > Hello Sergey,
> > >
> > > Sorry for the really slow response. I was really busy with internal work.
> > > Thanks for pointing out the BUG, Dan, Jerome and Sergey.
> > > I read your threads roughly, so I may have missed something. If so, sorry
> > > for that. Anyway, I will put forward my opinion.
> > >
> > > On Wed, Sep 11, 2013 at 02:12:50AM +0300, Sergey Senozhatsky wrote:
> > > > Dan Carpenter noted that handle_pending_slot_free() is racy with
> > > > zram_reset_device(). Take write init_lock in zram_slot_free(), thus
> > >
> > > Right, but "init_lock" is what I really want to remove.
> > > Yes, it's just a read-side lock, so most of the time it doesn't hurt us,
> > > but it makes the code very complicated and deadlock prone, so I'd like to
> > > replace it with RCU. Yeah, it's off topic, but just let me put my opinion
> > > on the future direction.
> > >
> > > About the bug, how about moving flush_work below down_write(init_lock)?
> > > zram_make_request is already closed by init_lock and we have a rule about
> > > lock ordering as follows, so I don't see any problem:
> > >
> > >   init_lock
> > >     zram->lock
> > >
> > > > preventing any concurrent zram_slot_free(), zram_bvec_rw() or
> > > > zram_reset_device(). This also allows to safely check zram->init_done
> > > > in handle_pending_slot_free().
> > > >
> > > > Initial intention was to minimize the number of handle_pending_slot_free()
> > > > calls from zram_bvec_rw(), which were slowing down READ requests due to
> > > > the slot_free_lock spin lock. Jerome Marchand suggested to remove
> > > > handle_pending_slot_free() from zram_bvec_rw().
> > > >
> > > > Link: https://lkml.org/lkml/2013/9/9/172
> > > > Signed-off-by: Sergey Senozhatsky
> > > >
> > > > ---
> > > >
> > > >  drivers/staging/zram/zram_drv.c | 13 +++++--------
> > > >  1 file changed, 5 insertions(+), 8 deletions(-)
> > > >
> > > > diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
> > > > index 91d94b5..7a2d4de 100644
> > > > --- a/drivers/staging/zram/zram_drv.c
> > > > +++ b/drivers/staging/zram/zram_drv.c
> > > > @@ -521,7 +521,8 @@ static void handle_pending_slot_free(struct zram *zram)
> > > >  	while (zram->slot_free_rq) {
> > > >  		free_rq = zram->slot_free_rq;
> > > >  		zram->slot_free_rq = free_rq->next;
> > > > -		zram_free_page(zram, free_rq->index);
> > > > +		if (zram->init_done)
> > > > +			zram_free_page(zram, free_rq->index);
> > > >  		kfree(free_rq);
> > > >  	}
> > > >  	spin_unlock(&zram->slot_free_lock);
> > > > @@ -534,16 +535,13 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
> > > >
> > > >  	if (rw == READ) {
> > > >  		down_read(&zram->lock);
> > > > -		handle_pending_slot_free(zram);
> > >
> > > The read side is okay, but actually I have a nitpick.
> > > If someone polls a block in a zram-swap device, he would see the block
> > > suddenly has a zero value although there was no I/O (I don't want to argue
> > > whether that's a sane user or not, anyway). That never happens on a real
> > > block device and it never happens on a zram-block device; it can only
> > > happen on a zram-swap device.
> > > And such behavior has been there since we introduced swap_slot_free_notify.
> > > (Off-topic: I'd like to remove it because it makes tight coupling between
> > > zram and swap and is obviously a layering-violation function.)
> > > So for now, I don't have a strong objection.
> > >
> > > The idea behind removing swap_slot_free_notify is to use frontswap when
> > > the user wants to use zram as swap, so zram can be notified when a block
> > > loses its owner, but we should still solve the mutex problem in the
> > > notify handler.
> > > >
> > > >  		ret = zram_bvec_read(zram, bvec, index, offset, bio);
> > > >  		up_read(&zram->lock);
> > > >  	} else {
> > > >  		down_write(&zram->lock);
> > > > -		handle_pending_slot_free(zram);
> > >
> > > Why did you remove this on the write side?
> > > We can't predict when the work will trigger. It means the work could remove
> > > a valid block under us.
> >
> > Not sure I understand how.
> > zram_slot_free() takes down_write(&zram->init_lock) and zram_make_request()
> > takes down_read(&zram->init_lock), thus zram_slot_free() can not concurrently
> > work with any RW requests. RW requests are under the read() lock and
> > zram_slot_free() is under the write() lock.
>
> Let's consider an example.
> The swap subsystem asked zram to free block "A" via swap_slot_free_notify,
> but zram pended the request without really freeing the block. Swap reused
> block "A" for new data because block "A" was free, but then the request that
> had been pending for a long time was finally handled, and zram blindly freed
> the new data on block "A". :(
> That's why we should handle pending free requests right before zram-write.
>
> Another attempt to optimize the lock overhead is to check whether the block
> is pending free right before zram_free_page in the write path. If so, we
> should remove the pending request from the slot_free_rq list to keep the
> block valid later. But for that case, we need a more complex data structure
> to find the block fast, and many checks right before zram_free_page, so it
> would make the code rather complicated.
Re: [PATCH 1/2] staging: zram: fix handle_pending_slot_free() and zram_reset_device() race
On Tue, Sep 17, 2013 at 08:24:45PM +0300, Sergey Senozhatsky wrote:
> Hello,
>
> On (09/16/13 09:02), Minchan Kim wrote:
> > Hello Sergey,
> >
> > Sorry for the really slow response. I was really busy with internal work.
> > Thanks for pointing out the BUG, Dan, Jerome and Sergey.
> > I read your threads roughly, so I may have missed something. If so, sorry
> > for that. Anyway, I will put forward my opinion.
> >
> > On Wed, Sep 11, 2013 at 02:12:50AM +0300, Sergey Senozhatsky wrote:
> > > Dan Carpenter noted that handle_pending_slot_free() is racy with
> > > zram_reset_device(). Take write init_lock in zram_slot_free(), thus
> >
> > Right, but "init_lock" is what I really want to remove.
> > Yes, it's just a read-side lock, so most of the time it doesn't hurt us,
> > but it makes the code very complicated and deadlock prone, so I'd like to
> > replace it with RCU. Yeah, it's off topic, but just let me put my opinion
> > on the future direction.
> >
> > About the bug, how about moving flush_work below down_write(init_lock)?
> > zram_make_request is already closed by init_lock and we have a rule about
> > lock ordering as follows, so I don't see any problem:
> >
> >   init_lock
> >     zram->lock
> >
> > > preventing any concurrent zram_slot_free(), zram_bvec_rw() or
> > > zram_reset_device(). This also allows to safely check zram->init_done
> > > in handle_pending_slot_free().
> > >
> > > Initial intention was to minimize the number of handle_pending_slot_free()
> > > calls from zram_bvec_rw(), which were slowing down READ requests due to
> > > the slot_free_lock spin lock. Jerome Marchand suggested to remove
> > > handle_pending_slot_free() from zram_bvec_rw().
> > >
> > > Link: https://lkml.org/lkml/2013/9/9/172
> > > Signed-off-by: Sergey Senozhatsky
> > >
> > > ---
> > >
> > >  drivers/staging/zram/zram_drv.c | 13 +++++--------
> > >  1 file changed, 5 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
> > > index 91d94b5..7a2d4de 100644
> > > --- a/drivers/staging/zram/zram_drv.c
> > > +++ b/drivers/staging/zram/zram_drv.c
> > > @@ -521,7 +521,8 @@ static void handle_pending_slot_free(struct zram *zram)
> > >  	while (zram->slot_free_rq) {
> > >  		free_rq = zram->slot_free_rq;
> > >  		zram->slot_free_rq = free_rq->next;
> > > -		zram_free_page(zram, free_rq->index);
> > > +		if (zram->init_done)
> > > +			zram_free_page(zram, free_rq->index);
> > >  		kfree(free_rq);
> > >  	}
> > >  	spin_unlock(&zram->slot_free_lock);
> > > @@ -534,16 +535,13 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
> > >
> > >  	if (rw == READ) {
> > >  		down_read(&zram->lock);
> > > -		handle_pending_slot_free(zram);
> >
> > The read side is okay, but actually I have a nitpick.
> > If someone polls a block in a zram-swap device, he would see the block
> > suddenly has a zero value although there was no I/O (I don't want to argue
> > whether that's a sane user or not, anyway). That never happens on a real
> > block device and it never happens on a zram-block device; it can only
> > happen on a zram-swap device.
> > And such behavior has been there since we introduced swap_slot_free_notify.
> > (Off-topic: I'd like to remove it because it makes tight coupling between
> > zram and swap and is obviously a layering-violation function.)
> > So for now, I don't have a strong objection.
> >
> > The idea behind removing swap_slot_free_notify is to use frontswap when
> > the user wants to use zram as swap, so zram can be notified when a block
> > loses its owner, but we should still solve the mutex problem in the
> > notify handler.
> > >
> > >  		ret = zram_bvec_read(zram, bvec, index, offset, bio);
> > >  		up_read(&zram->lock);
> > >  	} else {
> > >  		down_write(&zram->lock);
> > > -		handle_pending_slot_free(zram);
> >
> > Why did you remove this on the write side?
> > We can't predict when the work will trigger. It means the work could remove
> > a valid block under us.
>
> Not sure I understand how.
> zram_slot_free() takes down_write(&zram->init_lock) and zram_make_request()
> takes down_read(&zram->init_lock), thus zram_slot_free() can not concurrently
> work with any RW requests. RW requests are under the read() lock and
> zram_slot_free() is under the write() lock.

Let's consider an example.
The swap subsystem asked zram to free block "A" via swap_slot_free_notify,
but zram pended the request without really freeing the block. Swap reused
block "A" for new data because block "A" was free, but then the request that
had been pending for a long time was finally handled, and zram blindly freed
the new data on block "A". :(
That's why we should handle pending free requests right before zram-write.

Another attempt to optimize the lock overhead is to check whether the block
is pending free right before zram_free_page in the write path. If so, we
should remove the pending request from the slot_free_rq list to keep the
block valid later. But for that case, we need a more complex data structure
to find the block fast, and many checks right before zram_free_page, so it
would make the code rather complicated.

So, do you have any real workload for us to consider whether it's really
troublesome?
Re: [PATCH 1/2] staging: zram: fix handle_pending_slot_free() and zram_reset_device() race
Hello,

On (09/16/13 09:02), Minchan Kim wrote:
> Hello Sergey,
>
> Sorry for the really slow response. I was really busy with internal work.
> Thanks for pointing out the BUG, Dan, Jerome and Sergey.
> I read your threads roughly, so I may have missed something. If so, sorry
> for that. Anyway, I will put forward my opinion.
>
> On Wed, Sep 11, 2013 at 02:12:50AM +0300, Sergey Senozhatsky wrote:
> > Dan Carpenter noted that handle_pending_slot_free() is racy with
> > zram_reset_device(). Take write init_lock in zram_slot_free(), thus
>
> Right, but "init_lock" is what I really want to remove.
> Yes, it's just a read-side lock, so most of the time it doesn't hurt us,
> but it makes the code very complicated and deadlock prone, so I'd like to
> replace it with RCU. Yeah, it's off topic, but just let me put my opinion
> on the future direction.
>
> About the bug, how about moving flush_work below down_write(init_lock)?
> zram_make_request is already closed by init_lock and we have a rule about
> lock ordering as follows, so I don't see any problem:
>
>   init_lock
>     zram->lock
>
> > preventing any concurrent zram_slot_free(), zram_bvec_rw() or
> > zram_reset_device(). This also allows to safely check zram->init_done
> > in handle_pending_slot_free().
> >
> > Initial intention was to minimize the number of handle_pending_slot_free()
> > calls from zram_bvec_rw(), which were slowing down READ requests due to
> > the slot_free_lock spin lock. Jerome Marchand suggested to remove
> > handle_pending_slot_free() from zram_bvec_rw().
> >
> > Link: https://lkml.org/lkml/2013/9/9/172
> > Signed-off-by: Sergey Senozhatsky
> >
> > ---
> >
> >  drivers/staging/zram/zram_drv.c | 13 +++++--------
> >  1 file changed, 5 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
> > index 91d94b5..7a2d4de 100644
> > --- a/drivers/staging/zram/zram_drv.c
> > +++ b/drivers/staging/zram/zram_drv.c
> > @@ -521,7 +521,8 @@ static void handle_pending_slot_free(struct zram *zram)
> >  	while (zram->slot_free_rq) {
> >  		free_rq = zram->slot_free_rq;
> >  		zram->slot_free_rq = free_rq->next;
> > -		zram_free_page(zram, free_rq->index);
> > +		if (zram->init_done)
> > +			zram_free_page(zram, free_rq->index);
> >  		kfree(free_rq);
> >  	}
> >  	spin_unlock(&zram->slot_free_lock);
> > @@ -534,16 +535,13 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
> >
> >  	if (rw == READ) {
> >  		down_read(&zram->lock);
> > -		handle_pending_slot_free(zram);
>
> The read side is okay, but actually I have a nitpick.
> If someone polls a block in a zram-swap device, he would see the block
> suddenly has a zero value although there was no I/O (I don't want to argue
> whether that's a sane user or not, anyway). That never happens on a real
> block device and it never happens on a zram-block device; it can only
> happen on a zram-swap device.
> And such behavior has been there since we introduced swap_slot_free_notify.
> (Off-topic: I'd like to remove it because it makes tight coupling between
> zram and swap and is obviously a layering-violation function.)
> So for now, I don't have a strong objection.
>
> The idea behind removing swap_slot_free_notify is to use frontswap when the
> user wants to use zram as swap, so zram can be notified when a block loses
> its owner, but we should still solve the mutex problem in the notify
> handler.
>
> >  		ret = zram_bvec_read(zram, bvec, index, offset, bio);
> >  		up_read(&zram->lock);
> >  	} else {
> >  		down_write(&zram->lock);
> > -		handle_pending_slot_free(zram);
>
> Why did you remove this on the write side?
> We can't predict when the work will trigger. It means the work could remove
> a valid block under us.

Not sure I understand how.
zram_slot_free() takes down_write(&zram->init_lock) and zram_make_request()
takes down_read(&zram->init_lock), thus zram_slot_free() can not concurrently
work with any RW requests. RW requests are under the read() lock and
zram_slot_free() is under the write() lock.

> >  		ret = zram_bvec_write(zram, bvec, index, offset);
> >  		up_write(&zram->lock);
> >  	}
> > -
> >  	return ret;
> >  }
> >
> > @@ -750,12 +748,11 @@ error:
> >
> >  static void zram_slot_free(struct work_struct *work)
> >  {
> > -	struct zram *zram;
> > +	struct zram *zram = container_of(work, struct zram, free_work);
> >
> > -	zram = container_of(work, struct zram, free_work);
> > -	down_write(&zram->lock);
> > +	down_write(&zram->init_lock);
>
> I don't like this.
> The primary problem is that we should handle it as atomic, so we should use
> a spinlock instead of a mutex. Yeah, /me kicks his ass. From the beginning,
> I should have solved this problem that way.
>
> The simple solution that popped into my mind is this:
>
> diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
> index 91d94b5..b23bf0e 100644
> --- a/drivers/staging/zram/zram_drv.c
> +++ b/drivers/staging/zram/zram_drv.c
> @@ -534,11 +534,14 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
Re: [PATCH 1/2] staging: zram: fix handle_pending_slot_free() and zram_reset_device() race
On (09/12/13 15:12), Greg KH wrote:
> On Wed, Sep 11, 2013 at 02:12:50AM +0300, Sergey Senozhatsky wrote:
> > Dan Carpenter noted that handle_pending_slot_free() is racy with
> > zram_reset_device(). Take write init_lock in zram_slot_free(), thus
> > preventing any concurrent zram_slot_free(), zram_bvec_rw() or
> > zram_reset_device(). This also allows to safely check zram->init_done
> > in handle_pending_slot_free().
> >
> > Initial intention was to minimize the number of handle_pending_slot_free()
> > calls from zram_bvec_rw(), which were slowing down READ requests due to
> > the slot_free_lock spin lock. Jerome Marchand suggested to remove
> > handle_pending_slot_free() from zram_bvec_rw().
> >
> > Link: https://lkml.org/lkml/2013/9/9/172
> > Signed-off-by: Sergey Senozhatsky
>
> I have multiple versions of this and the other zram patches from you,
> with no idea of which to accept.

Yes, please drop all of the patches. I did not Cc you on these two patches
to avoid spamming your inbox with multiple versions. I will send them back
to you as soon as I get positive feedback.

> So, I'm going to drop them all. Can you please resend what you wish to
> submit, and in the future, be a bit more obvious with your "vN"
> markings?

Sorry for that.

	-ss

> thanks,
>
> greg k-h

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] staging: zram: fix handle_pending_slot_free() and zram_reset_device() race
On Wed, Sep 11, 2013 at 02:12:50AM +0300, Sergey Senozhatsky wrote:
> Dan Carpenter noted that handle_pending_slot_free() is racy with
> zram_reset_device(). Take write init_lock in zram_slot_free(), thus
> preventing any concurrent zram_slot_free(), zram_bvec_rw() or
> zram_reset_device(). This also allows to safely check zram->init_done
> in handle_pending_slot_free().
>
> Initial intention was to minimize the number of handle_pending_slot_free()
> calls from zram_bvec_rw(), which were slowing down READ requests due to
> the slot_free_lock spin lock. Jerome Marchand suggested to remove
> handle_pending_slot_free() from zram_bvec_rw().
>
> Link: https://lkml.org/lkml/2013/9/9/172
> Signed-off-by: Sergey Senozhatsky

I have multiple versions of this and the other zram patches from you,
with no idea of which to accept.

So, I'm going to drop them all. Can you please resend what you wish to
submit, and in the future, be a bit more obvious with your "vN"
markings?

thanks,

greg k-h
[PATCH 1/2] staging: zram: fix handle_pending_slot_free() and zram_reset_device() race
Dan Carpenter noted that handle_pending_slot_free() is racy with
zram_reset_device(). Take write init_lock in zram_slot_free(), thus
preventing any concurrent zram_slot_free(), zram_bvec_rw() or
zram_reset_device(). This also allows to safely check zram->init_done
in handle_pending_slot_free().

Initial intention was to minimize the number of handle_pending_slot_free()
calls from zram_bvec_rw(), which were slowing down READ requests due to
the slot_free_lock spin lock. Jerome Marchand suggested to remove
handle_pending_slot_free() from zram_bvec_rw().

Link: https://lkml.org/lkml/2013/9/9/172
Signed-off-by: Sergey Senozhatsky
---
 drivers/staging/zram/zram_drv.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
index 91d94b5..7a2d4de 100644
--- a/drivers/staging/zram/zram_drv.c
+++ b/drivers/staging/zram/zram_drv.c
@@ -521,7 +521,8 @@ static void handle_pending_slot_free(struct zram *zram)
 	while (zram->slot_free_rq) {
 		free_rq = zram->slot_free_rq;
 		zram->slot_free_rq = free_rq->next;
-		zram_free_page(zram, free_rq->index);
+		if (zram->init_done)
+			zram_free_page(zram, free_rq->index);
 		kfree(free_rq);
 	}
 	spin_unlock(&zram->slot_free_lock);
@@ -534,16 +535,13 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,

 	if (rw == READ) {
 		down_read(&zram->lock);
-		handle_pending_slot_free(zram);
 		ret = zram_bvec_read(zram, bvec, index, offset, bio);
 		up_read(&zram->lock);
 	} else {
 		down_write(&zram->lock);
-		handle_pending_slot_free(zram);
 		ret = zram_bvec_write(zram, bvec, index, offset);
 		up_write(&zram->lock);
 	}
-
 	return ret;
 }

@@ -750,12 +748,11 @@ error:

 static void zram_slot_free(struct work_struct *work)
 {
-	struct zram *zram;
+	struct zram *zram = container_of(work, struct zram, free_work);

-	zram = container_of(work, struct zram, free_work);
-	down_write(&zram->lock);
+	down_write(&zram->init_lock);
 	handle_pending_slot_free(zram);
-	up_write(&zram->lock);
+	up_write(&zram->init_lock);
 }

 static void add_slot_free(struct zram *zram,
			  struct zram_slot_free *free_rq)