Hi again,

while refactoring the code I realized that there is another case we
are not considering:
if the max and min write pointers across all non-bad chunks are off by
more than the write unit, we will not be able to continue writing to
the line.

I'm adding a check of this in my patch.

All the best,
Hans


On Fri, Jan 25, 2019 at 10:30 PM Javier González <jav...@javigon.com> wrote:
>
>
> > On 25 Jan 2019, at 21.20, Hans Holmberg <hans.ml.holmb...@owltronix.com> 
> > wrote:
> >
> >> On Fri, Jan 25, 2019 at 7:33 PM Javier González <jav...@javigon.com> wrote:
> >>
> >>
> >>
> >>>> On 25 Jan 2019, at 17.46, Hans Holmberg <hans.ml.holmb...@owltronix.com> 
> >>>> wrote:
> >>>>
> >>>>> On Fri, Jan 25, 2019 at 3:35 PM Matias Bjørling <m...@lightnvm.io> 
> >>>>> wrote:
> >>>>>
> >>>>>> On 1/25/19 3:21 PM, Hans Holmberg wrote:
> >>>>>> On Fri, Jan 25, 2019 at 2:33 PM Javier González <jav...@javigon.com> 
> >>>>>> wrote:
> >>>>>>
> >>>>>>
> >>>>>>> On 25 Jan 2019, at 13.59, Hans Holmberg 
> >>>>>>> <hans.ml.holmb...@owltronix.com> wrote:
> >>>>>>>
> >>>>>>> On Fri, Jan 25, 2019 at 10:41 AM Javier González <jav...@javigon.com> 
> >>>>>>> wrote:
> >>>>>>>>> On 25 Jan 2019, at 09.47, Hans Holmberg 
> >>>>>>>>> <hans.ml.holmb...@owltronix.com> wrote:
> >>>>>>>>>
> >>>>>>>>> On Thu, Jan 24, 2019 at 8:51 PM Zhoujie Wu <z...@marvell.com> wrote:
> >>>>>>>>>> The write pointer of the bad block could be 0 or undefined, ignore
> >>>>>>>>>> the checking of the bad block wp for pblk_line_wp_is_unbalanced to
> >>>>>>>>>> avoid fake warning.
> >>>>>>>>>
> >>>>>>>>> fake -> spurious?
> >>>>>>>>>
> >>>>>>>>>> Signed-off-by: Zhoujie Wu <z...@marvell.com>
> >>>>>>>>>> ---
> >>>>>>>>>> v3: return in case bit >= lm->blk_per_line.
> >>>>>>>>>> v2: changed according to Javier's comments.
> >>>>>>>>>>
> >>>>>>>>>> drivers/lightnvm/pblk-recovery.c | 12 +++++++++---
> >>>>>>>>>> 1 file changed, 9 insertions(+), 3 deletions(-)
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/drivers/lightnvm/pblk-recovery.c 
> >>>>>>>>>> b/drivers/lightnvm/pblk-recovery.c
> >>>>>>>>>> index 6761d2a..02d466e 100644
> >>>>>>>>>> --- a/drivers/lightnvm/pblk-recovery.c
> >>>>>>>>>> +++ b/drivers/lightnvm/pblk-recovery.c
> >>>>>>>>>> @@ -312,21 +312,27 @@ static int pblk_line_wp_is_unbalanced(struct 
> >>>>>>>>>> pblk *pblk,
> >>>>>>>>>>      struct nvm_chk_meta *chunk;
> >>>>>>>>>>      struct ppa_addr ppa;
> >>>>>>>>>>      u64 line_wp;
> >>>>>>>>>> -       int pos, i;
> >>>>>>>>>> +       int pos, i, bit;
> >>>>>>>>>
> >>>>>>>>> We don't need both bit and i, one of them is enough.
> >>>>>>>>>
> >>>>>>>>>> -       rlun = &pblk->luns[0];
> >>>>>>>>>> +       bit = find_first_zero_bit(line->blk_bitmap, 
> >>>>>>>>>> lm->blk_per_line);
> >>>>>>>>>> +       if (bit >= lm->blk_per_line)
> >>>>>>>>>> +               return 0;
> >>>>>>>>>
> >>>>>>>>> If there is only one non-offline chunk in the line, the wp can't be 
> >>>>>>>>> unbalanced,
> >>>>>>>>> so it should be safe to return 0 here if bit >= lm->blk_per_line - 1
> >>>>>>>>>
> >>>>>>>>> If you change this please document why using a comment, as it might
> >>>>>>>>> not be obvious
> >>>>>>>>>
> >>>>>>>>>> +       rlun = &pblk->luns[bit];
> >>>>>>>>>>      ppa = rlun->bppa;
> >>>>>>>>>>      pos = pblk_ppa_to_pos(geo, ppa);
> >>>>>>>>>>      chunk = &line->chks[pos];
> >>>>>>>>>
> >>>>>>>>>>      line_wp = chunk->wp;
> >>>>>>>>>>
> >>>>>>>>>> -       for (i = 1; i < lm->blk_per_line; i++) {
> >>>>>>>>>> +       for (i = bit + 1; i < lm->blk_per_line; i++) {
> >>>>>>>>>
> >>>>>>>>>>              rlun = &pblk->luns[i];
> >>>>>>>>>>              ppa = rlun->bppa;
> >>>>>>>>>>              pos = pblk_ppa_to_pos(geo, ppa);
> >>>>>>>>>>              chunk = &line->chks[pos];
> >>>>>>>>>
> >>>>>>>>> This code is a copy of the code above, it'd be nice to refactor it
> >>>>>>>>> into a helper function or just do the chunk lookups in one place.
> >>>>>>>>>
> >>>>>>>>>> +               if (chunk->state & NVM_CHK_ST_OFFLINE)
> >>>>>>>>>> +                       continue;
> >>>>>>>>>> +
> >>>>>>>>>
> >>>>>>>>> Since we rely on the block bitmap anyway, we might as well just
> >>>>>>>>> iterate over the zeroes in the block bitmap using find_next_zero_bit
> >>>>>>>>> instead.
> >>>>>>>>> We do this in lots of other places, see: git grep -n
> >>>>>>>>> find_next_zero_bit -- drivers/lightnvm
> >>>>>>>>
> >>>>>>>> Hans, I proposed him to use the chunk->state instead. I think it is 
> >>>>>>>> way
> >>>>>>>> more robust. We introduced the block bitmap for OCSSD 1.2, because 
> >>>>>>>> there
> >>>>>>>> was no state. Now that we have state, it is better to use it 
> >>>>>>>> instead. In
> >>>>>>>> fact, we should remove the bock bitmap as we have to check for the 
> >>>>>>>> state
> >>>>>>>> either way - note that this aligns also very well with you patches
> >>>>>>>> removing the other bitmaps.
> >>>>>>>
> >>>>>>> These are just nitpicks.
> >>>>>>>
> >>>>>>> Relying on two data structures(chunks, block bitmap) to be in sync in
> >>>>>>> this function in stead of one does not make it more robust imho.
> >>>>>>> Either or (checking chunks or the block bitmap) is fine by me.
> >>>>>>> Searching the bitmap is more efficient, so that is what I proposed.
> >>>>>>
> >>>>>> chunk log page is the ground truth, so it is more robust.
> >>>>>>
> >>>>>> Also, pblk has a long way to start seeing bitmap search vs. integer
> >>>>>> comparisons in profiling.
> >>>>>
> >>>>> Hehe, yeah, but it does not hurt to use the better alternative when 
> >>>>> available.
> >>>>>
> >>>>>>>
> >>>>>>> If we want to remove the block bitmap, we can do that as a separate 
> >>>>>>> patch(set)
> >>>>>>> I do agree, having two copies of the chunk state is something worth
> >>>>>>> getting rid of :)
> >>>>>>>
> >>>>>>
> >>>>>> A good start is not adding code using what we want to remove.
> >>>>>
> >>>>> Well, I think it's very confusing to use both copies in the same 
> >>>>> function.
> >>>>>
> >>>>> Now we're nitpicking nitpicks :)
> >>>>>
> >>>>
> >>>> I look forward to a patch. Will one of you volunteer a patch?
> >>>
> >>> Sure! It'd be easier to Illustrate what I mean with a patch.
> >>
> >> Cool. Go ahead - you will see how much cleaner it is using the chunk info.
> >>
> >> Matias : In any case Zhoujie’s fix is orthogonal to this discussion and I 
> >> think you should pick it up - in one form of bb iteration or another - as 
> >> it fixes a real issue.
> >
> > Second that! I feel bad about the fix not making it in just because of
> > this squabbling.
>
> No squabbling :) We both care about pblk and want to get the code right.
>
> If you give that first pass to that patch, it would be great. We can take it 
> from there.
>
> Nos enjoy the weekend!
>
> Javier.

Reply via email to