The following patches were made over Linus's and Martin's 7.1 trees. They fix an issue where virtio-scsi exports a lot of non-SCSI devices, but commands get throttled too early by the cmd_per_lun_limit. For example, we export one or more NVMe or block devices and would like to just pass commands to them in a way where virtio-scsi's hw queue limits match the physical hardware. In other cases we are doing cgroup-based throttling on the host side, and we don't want the guest to block IO when the host knows we have extra bandwidth.
The patches add a new cmd_per_lun value that drivers can set to indicate that queueing should not be tracked at the device-wide level. Such drivers then rely on just the block layer hw queue limits. The patches also convert virtio-scsi to use the new value, and fix some can_queue related issues discovered while testing/reviewing.
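
As a rough illustration of the idea only (this is not the code from the patches), the sketch below models a dispatch check where a special cmd_per_lun value turns off the device-wide accounting, leaving only the hw queue depth as the limit. The names SKETCH_NO_LUN_LIMIT, sketch_dev, sketch_hwq, sketch_try_queue and sketch_fill are all made up for this example, and the real sentinel value and the places where it is checked may differ.

```c
#include <stdbool.h>

/* Hypothetical sentinel; the value the patches actually add may differ. */
#define SKETCH_NO_LUN_LIMIT (-1)

/* Simplified stand-in for per-device state (not a kernel struct). */
struct sketch_dev {
	int cmd_per_lun;	/* device-wide limit, or SKETCH_NO_LUN_LIMIT */
	int dev_inflight;	/* commands outstanding on this device */
};

/* Simplified stand-in for one block layer hw queue. */
struct sketch_hwq {
	int depth;		/* hw queue depth */
	int inflight;		/* commands outstanding on this hw queue */
};

/* Return true and account the command if it may be dispatched now. */
static bool sketch_try_queue(struct sketch_dev *dev, struct sketch_hwq *hwq)
{
	bool track = dev->cmd_per_lun != SKETCH_NO_LUN_LIMIT;

	/* The hw queue limit always applies. */
	if (hwq->inflight >= hwq->depth)
		return false;

	/* The device-wide limit applies only when the driver asked for it. */
	if (track && dev->dev_inflight >= dev->cmd_per_lun)
		return false;

	hwq->inflight++;
	if (track)
		dev->dev_inflight++;
	return true;
}

/* Queue commands until one is refused; return how many were accepted. */
static int sketch_fill(struct sketch_dev *dev, struct sketch_hwq *hwq)
{
	int queued = 0;

	while (sketch_try_queue(dev, hwq))
		queued++;
	return queued;
}
```

In this model, a device with cmd_per_lun set to 8 behind a hw queue of depth 128 stops accepting commands after 8, while the same device with the sentinel set can fill the hw queue to its full depth.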

