On Thu, 2020-11-05 at 16:09 +0000, David Woodhouse wrote:
> From: David Woodhouse <d...@amazon.co.uk>
> 
> This ended up with an odd mix of recursion (albeit *mostly*
> tail-recursion) and interation that could have been prettier. In
> addition, while recursing it potentially adjusted op->count which is
> used by callers to see the amount of I/O actually performed.
> 
> Fix it by bringing nvme_build_prpl() into the normal loop using 'i'
> as the offset in the op.
> 
> Fixes: 94f0510dc ("nvme: Split requests by maximum allowed size")
> Signed-off-by: David Woodhouse <d...@amazon.co.uk>
> ---
> v2: Fix my bug with checking a u16 for being < 0.
>     Fix the bug I inherited from commit 94f0510dc but made unconditional.
> 
>  src/hw/nvme.c | 77 ++++++++++++++++++++++-----------------------------
>  1 file changed, 33 insertions(+), 44 deletions(-)
> 


Now, on top of that we *could* do something like this...

--- a/src/hw/nvme.c
+++ b/src/hw/nvme.c
@@ -742,6 +742,15 @@ nvme_cmd_readwrite(struct nvme_namespace *ns, struct 
disk_op_s *op, int write)
             blocks = blocks_remaining < max_blocks ? blocks_remaining
                                                    : max_blocks;
 
+            if (blocks < blocks_remaining && ns->block_size < NVME_PAGE_SIZE &&
+                !(((u32)op_buf) & (ns->block_size-1))) {
+                    u32 align = ((u32)op_buf & (NVME_PAGE_SIZE - 1)) / 
ns->block_size;
+                    if (blocks > (max_blocks - align)) {
+                            dprintf(3, "Restrain to %u blocks to align (buf %p 
size %u/%u)\n", max_blocks - align, op_buf, NVME_PAGE_SIZE, ns->block_size);
+                            blocks = max_blocks - align;
+                    }
+            }
+
             if (write) {
                 memcpy(ns->dma_buffer, op_buf, blocks * ns->block_size);
             }

While debugging this I watched a boot sector, after being loaded at
0000:7c00, load another 63 sectors at 0000:7e00. It didn't get to use
the prpl at all, and looked something like this...

ns 1 read lba 0+1: 0
Booting from 0000:7c00
ns 1 read lba 1+8: 0
ns 1 read lba 9+8: 0
ns 1 read lba 17+8: 0
ns 1 read lba 25+8: 0
ns 1 read lba 33+8: 0
ns 1 read lba 41+8: 0
ns 1 read lba 49+8: 0
ns 1 read lba 57+7: 0

If we make an *attempt* to align it, such as the proof-of-concept shown
above, then it ends up getting back in sync:

ns 1 read lba 0+1: 0
Booting from 0000:7c00
Restrain to 1 blocks to align (buf 0x00007e00 size 4096/512)
ns 1 read lba 1+1: 0
ns 1 read lba 1+62: 0

I just don't know that I care enough about the optimisation to make it
worth the ugliness of that special case in the loop, which includes a
division.

Attachment: smime.p7s
Description: S/MIME cryptographic signature

_______________________________________________
SeaBIOS mailing list -- seabios@seabios.org
To unsubscribe send an email to seabios-le...@seabios.org

Reply via email to