On Fri, May 31, 2019 at 10:42:12AM +0100, John Garry wrote:
> > > > >
> > > > > -static int hisi_sas_slot_index_alloc(struct hisi_hba *hisi_hba,
> > > > > - struct scsi_cmnd *scsi_cmnd)
> > > > > +static int hisi_sas_slot_index_alloc(struct hisi_hba *hisi_hba)
> > > > > {
> > > > > int index;
> > > > > void *bitmap = hisi_hba->slot_index_tags;
> > > > > unsigned long flags;
> > > > >
> > > > > - if (scsi_cmnd)
> > > > > - return scsi_cmnd->request->tag;
> > > > > -
> > > > > spin_lock_irqsave(&hisi_hba->lock, flags);
> > > > > index = find_next_zero_bit(bitmap, hisi_hba->slot_index_count,
> > > > > hisi_hba->last_slot_index + 1);
> > > >
> > > > Then you switch to hisi_sas_slot_index_alloc() for allocating the unique
> > > > tag via spin_lock & find_next_zero_bit(). Do you think this way is more
> > > > efficient than blk-mq's sbitmap?
>
> These are not fast path, as used only for TMF, internal IO, etc.
OK, looks I miss that.
>
> Having said that, hopefully we can move to scsi_{get,put}_reserved_cmd()
> when available, so that the LLDD has to stop managing them.
Agree.
>
> > > >
> > > slot_index_alloc() is only used for commands which do _not_ have a tag
> > > (eg internal commands), or for v2 hardware which has weird allocation
> > > rules.
> >
> > But this patch has switched to this allocation unconditionally for all
> > commands:
> >
>
> As Hannes said, v2 had a few bugs which meant that we had to make a specific
> version of this function for that hw revision, cf.
> slot_index_alloc_quirk_v2_hw(), and it cannot use request queue tag.
>
> But, indeed, we could consider sbitmap for v2 hw. I'm not sure if it would
> help, considering the weird rules.
>
> > > - if (scsi_cmnd)
> > > - return scsi_cmnd->request->tag;
> > > -
> >
> > Otherwise duplicated slot can be used from different blk-mq hw queue.
> >
> > >
> > > > The worsen thing is that V3's actual max queue depth is (4096 - 96), but
> > > > this patch claims that the device can support (4096 - 96) * 32 command
> > > > slots
>
> To be clear about the hw, the hw supports max 4096 command tags and has 16
Is 4096 the max allowed host-wide command tags? Or per-queue's max commands
tags?
If it is per-queue's max command tags, V3 should be real MQ hardware,
otherwise it is still host wide. Otherwise, Hannes's patch can't work
because tag from different hw queue can be same.
> hw queues. The hw queue depth is configurable by software, and we configure
> it at 4096 per queue - same as max command tags - this is to support
> possibility of all commands using the same queue simultaneously.
Suppose 4096 is the host-wide command tags:
As Hannes's patch changed to allow each blk-mq hw queue to accept 4096 commands,
there will be big contention in driver, given now the actual .can_queue becomes
4096 * 32, and how to avoid the same tag from different hw queue?
Thanks,
Ming