RE: [PATCH v6 RESEND 5/5] crypto: LEA block cipher x86_64 optimization
From: Dongsoo Lee > Sent: 12 January 2024 02:29 > > For the x86_64 environment, we use AVX-512F/AVX2/SSE2 instructions. > Since LEA uses 128-bit blocks of four 32-bit integers, for optimization, > SSE2 encrypts 4 blocks, AVX2 encrypts 4/8 blocks, and AVX-512F encrypts > 4/8/16 blocks at a time. > > Our submission provides a optimized implementation of ECB, CBC > decryption, CTR, and XTS cipher operation modes on x86_64 CPUs > supporting. Given you say in 0/0: The LEA algorithm is a lightweight block cipher that processes data blocks of 128-bits and has three different key lengths, each with a different number of rounds: Just how big is it ? Doesn't look 'lightweight' to me. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
RE: [PATCH] virtio_blk: fix snprintf truncation compiler warning
From: Stefan Hajnoczi > Sent: 04 December 2023 14:08 > > Commit 4e0400525691 ("virtio-blk: support polling I/O") triggers the > following gcc 13 W=1 warnings: > > drivers/block/virtio_blk.c: In function ‘init_vq’: > drivers/block/virtio_blk.c:1077:68: warning: ‘%d’ directive output may be > truncated writing between 1 > and 11 bytes into a region of size 7 [-Wformat-truncation=] > 1077 | snprintf(vblk->vqs[i].name, VQ_NAME_LEN, > "req_poll.%d", i); > |^~ > drivers/block/virtio_blk.c:1077:58: note: directive argument in the range > [-2147483648, 65534] > 1077 | snprintf(vblk->vqs[i].name, VQ_NAME_LEN, > "req_poll.%d", i); > | ^ > drivers/block/virtio_blk.c:1077:17: note: ‘snprintf’ output between 11 and 21 > bytes into a destination > of size 16 > 1077 | snprintf(vblk->vqs[i].name, VQ_NAME_LEN, > "req_poll.%d", i); > | > ^~ > > This is a false positive because the lower bound -2147483648 is > incorrect. The true range of i is [0, num_vqs - 1] where 0 < num_vqs < > 65536. > > The code mixes int, unsigned short, and unsigned int types in addition > to using "%d" for an unsigned value. Use unsigned short and "%u" > consistently to solve the compiler warning. > > Cc: Suwan Kim > Reported-by: kernel test robot > Closes: > https://lore.kernel.org/oe-kbuild-all/202312041509.diyvet9h-...@intel.com/ > Signed-off-by: Stefan Hajnoczi > --- > drivers/block/virtio_blk.c | 8 > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c > index d53d6aa8ee69..47556d8ccc32 100644 > --- a/drivers/block/virtio_blk.c > +++ b/drivers/block/virtio_blk.c > @@ -1019,12 +1019,12 @@ static void virtblk_config_changed(struct > virtio_device *vdev) > static int init_vq(struct virtio_blk *vblk) > { > int err; > - int i; > + unsigned short i; > vq_callback_t **callbacks; > const char **names; > struct virtqueue **vqs; > unsigned short num_vqs; > - unsigned int num_poll_vqs; > + unsigned short num_poll_vqs; > struct virtio_device *vdev = vblk->vdev; > struct irq_affinity desc = { 0, }; > > @@ -1068,13 +1068,13 @@ static int init_vq(struct virtio_blk *vblk) > > for (i = 0; i < num_vqs - num_poll_vqs; i++) { Ugg doing arithmetic on char/short is likely to generate horrid code (especially on non-x86). Hint, there will be explicit masking and/or sign/zero extension. Even the array index might add extra code (although there'll be an explicit sign extend to 64bit with the current code). There really ought to be a better way to make gcc STFU. In this case 'unsigned int i' might be enough since gcc seems to have a small enough upper bound. David > callbacks[i] = virtblk_done; > - snprintf(vblk->vqs[i].name, VQ_NAME_LEN, "req.%d", i); > + snprintf(vblk->vqs[i].name, VQ_NAME_LEN, "req.%u", i); > names[i] = vblk->vqs[i].name; > } > > for (; i < num_vqs; i++) { > callbacks[i] = NULL; > - snprintf(vblk->vqs[i].name, VQ_NAME_LEN, "req_poll.%d", i); > + snprintf(vblk->vqs[i].name, VQ_NAME_LEN, "req_poll.%u", i); > names[i] = vblk->vqs[i].name; > } > > -- > 2.43.0 - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
RE: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory
From: Logan Gunthorpe > Sent: 13 March 2018 23:46 ... > As Stephen pointed out, it's a requirement of the PCIe spec that a > switch supports P2P. If you want to sell a switch that does P2P with bad > performance then that's on you to deal with. That surprises me (unless I missed something last time I read the spec). While P2P writes are relatively easy to handle, reads and any other TLP that require acks are a completely different proposition. There are no additional fields that can be set in the read TLP and will be reflected back in the ack(s) than can be used to route the acks back to the correct initiator. I'm pretty sure that to support P2P reads a switch would have to save the received read TLP and (possibly later on) issue read TLP of its own for the required data. I'm not even sure it is easy to interleave the P2P reads with those coming from the root. That requires a potentially infinite queue of pending requests. Some x86 root ports support P2P writes (maybe with a bios option). It would be a shame not to be able to do P2P writes on such systems even though P2P reads won't work. (We looked at using P2P transfers for some data, but in the end used a different scheme. For our use case P2P writes were enough. An alternative would be to access the same host memory buffer from two different devices - but there isn't an API that lets you do that.) David
RE: [PATCH] scsi: resolve COMMAND_SIZE at compile time
From: Stephen Kitt > Sent: 09 March 2018 22:34 > > COMMAND_SIZE currently uses an array of values in block/scsi_ioctl.c. > A number of device_handler functions use this to initialise arrays, > and this is flagged by -Wvla. > > This patch replaces COMMAND_SIZE with a variant using a formula which > can be resolved at compile time in cases where the opcode is fixed, > resolving the array size and avoiding the VLA. The downside is that > the code is less maintainable and that some call sites end up having > more complex generated code. > > Since scsi_command_size_tbl is dropped, we can remove the dependency > on BLK_SCSI_REQUEST from drivers/target/Kconfig. > > This was prompted by https://lkml.org/lkml/2018/3/7/621 > > Signed-off-by: Stephen Kitt > --- > block/scsi_ioctl.c | 8 > drivers/target/Kconfig | 1 - > include/scsi/scsi_common.h | 13 +++-- > 3 files changed, 11 insertions(+), 11 deletions(-) > > diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c > index 60b471f8621b..b9e176e537be 100644 > --- a/block/scsi_ioctl.c > +++ b/block/scsi_ioctl.c > @@ -41,14 +41,6 @@ struct blk_cmd_filter { > > static struct blk_cmd_filter blk_default_cmd_filter; > > -/* Command group 3 is reserved and should never be used. */ > -const unsigned char scsi_command_size_tbl[8] = > -{ > - 6, 10, 10, 12, > - 16, 12, 10, 10 > -}; > -EXPORT_SYMBOL(scsi_command_size_tbl); > - > #include > > static int sg_get_version(int __user *p) > diff --git a/drivers/target/Kconfig b/drivers/target/Kconfig > index 4c44d7bed01a..b5f05b60cf06 100644 > --- a/drivers/target/Kconfig > +++ b/drivers/target/Kconfig > @@ -4,7 +4,6 @@ menuconfig TARGET_CORE > depends on SCSI && BLOCK > select CONFIGFS_FS > select CRC_T10DIF > - select BLK_SCSI_REQUEST # only for scsi_command_size_tbl.. > select SGL_ALLOC > default n > help > diff --git a/include/scsi/scsi_common.h b/include/scsi/scsi_common.h > index 731ac09ed231..48d950666376 100644 > --- a/include/scsi/scsi_common.h > +++ b/include/scsi/scsi_common.h > @@ -15,8 +15,17 @@ scsi_varlen_cdb_length(const void *hdr) > return ((struct scsi_varlen_cdb_hdr *)hdr)->additional_cdb_length + 8; > } > > -extern const unsigned char scsi_command_size_tbl[8]; > -#define COMMAND_SIZE(opcode) scsi_command_size_tbl[((opcode) >> 5) & 7] > +/* > + * SCSI command sizes are as follows, in bytes, for fixed size commands, per > + * group: 6, 10, 10, 12, 16, 12, 10, 10. The top three bits of an opcode > + * determine its group. > + * The size table is encoded into a 32-bit value by subtracting each value > + * from 16, resulting in a value of 1715488362 > + * (6 << 28 + 6 << 24 + 4 << 20 + 0 << 16 + 4 << 12 + 6 << 8 + 6 << 4 + 10). > + * Command group 3 is reserved and should never be used. > + */ > +#define COMMAND_SIZE(opcode) \ > + (16 - (15 & (1715488362 >> (4 * (((opcode) >> 5) & 7) Why not: (6 + (15 & (0x446a6440u Specifying the constant in hex makes it more obvious. It is a shame about the 16, but an offset is easier than the negate. David
RE: [PATCH V5 3/4] Move stack parameters for sed_ioctl to prevent oversized stack with CONFIG_KASAN
From: Arnd Bergmann > Sent: 13 February 2017 17:02 ... > >> > + ioctl_ptr = memdup_user(arg, _IOC_SIZE(cmd)); > >> > + if (IS_ERR_OR_NULL(ioctl_ptr)) { > >> > + ret = PTR_ERR(ioctl_ptr); > >> > + goto out; > >> ... > >> > + out: > >> > + kfree(ioctl_ptr); > >> > + return ret; ... > >> That error path doesn't look quite right to me. ... > > good god, this is a mess this morning. Thanks for the catch, I'll review > > these > > more aggressively before sending out. > > memdup_user() never returns NULL, and generally speaking any use of > IS_ERR_OR_NULL() indicates that there is something wrong with the > interface, so aside from passing the right pointer (or NULL) into kfree() > I think using IS_ERR() is the correct solution. You missed the problem entirely. Code needs to be: if (IS_ERR(ioctl_ptr)) return PTR_ERR(ioctl_ptr); David
RE: [PATCH V5 3/4] Move stack parameters for sed_ioctl to prevent oversized stack with CONFIG_KASAN
From: Scott Bauer Sent: 13 February 2017 16:11 > When CONFIG_KASAN is enabled, compilation fails: > > block/sed-opal.c: In function 'sed_ioctl': > block/sed-opal.c:2447:1: error: the frame size of 2256 bytes is larger than > 2048 bytes [-Werror=frame- > larger-than=] > > Moved all the ioctl structures off the stack and dynamically activate > using _IOC_SIZE() Think I'd not that this simplifies the code considerably. AFAICT CONFIG_KASAN is a just brainf*ck. It at least needs annotation that copy_from_user() has properties similar to memset(). So if the size matches that of the type then no guard space (etc) is required. ... > + ioctl_ptr = memdup_user(arg, _IOC_SIZE(cmd)); > + if (IS_ERR_OR_NULL(ioctl_ptr)) { > + ret = PTR_ERR(ioctl_ptr); > + goto out; ... > + out: > + kfree(ioctl_ptr); > + return ret; > } That error path doesn't look quite right to me. David
RE: [PATCH V3 2/2] Move stack parameters for sed_ioctl to prevent oversized stack with CONFIG_KASAN
From: Johannes Thumshirn > Sent: 10 February 2017 07:46 > On 02/09/2017 06:20 PM, Scott Bauer wrote: > > When CONFIG_KASAN is enabled, compilation fails: Does CONFIG_KASAN allocate guard stack space around everything that is passed by address? That sounds completely brain-dead. There are a lot of functions that have an 'int *' argument to return a single value - and are never going to do anything else. ... > > Moved all the ioctl structures off the stack and dynamically activate > > using _IOC_SIZE() ... > > > - if (copy_from_user(&session, arg, sizeof(session))) > > - return -EFAULT; > > - return opal_erase_locking_range(dev, &session); > > + ioctl_ptr = kzalloc(cmd_size, GFP_KERNEL); > > + if (!ioctl_ptr) > > + return -ENOMEM; > > + if (copy_from_user(ioctl_ptr, arg, cmd_size)) { > > + ret = -EFAULT; > > + goto out; > > } > > Can't we use memdup_user() instead of kzalloc() + copy_from_user()? You either want the copy_from_user() or the memzero() not both. ISTM there could be two 'library' functions, maybe: void *get_ioctl_buf(unsigned int cmd, long arg) to malloc the buffer, memzero/copy_from_user, returns -EFAULT if copy fails. int put_ioctl_buf(int rval, unsigned int cmd, const void *buf) does copy_to_user() if rval >= 0 and IOR_READ, then frees buf. return value is rval unless the copyout fails. David
RE: Sed-opal fixups
From: Scott Bauer > Sent: 09 February 2017 17:20 > It may be too late to change anyhting in the uapi header. When we > switched over to using IOC_SIZE I found a bug where I had switched > up a structure in one of the series from v4 to v5 but never changed > the structure in the IOW. The structure that was in there was to small > so when we kzalloc on it we don't request enough space. It worked before > because we were using the cmd strictly as a command #, not using the IOC > and friends. > > If it's too late to modify that IOW, I can work around it by reallocing > on the correct size for that command only. I verified the rest of the > commands and the structures are the same. > > Let me know what you think, please. Maybe define IOC_OPAL_ACTIVATE_LSP_OLD to the incorrect value and IOC_OPAL_ACTIVATE_LSP to the correct one. But that relies on any users specifying the correct structure. I wouldn't guarantee that. At the top of the driver's ioctl path add: if (cmd == IOC_OPAL_ACTIVATE_LSP_OLD) cmd = IOC_OPAL_ACTIVATE_LSP; For some code I added a userspace wrapper on ioctl() to check the size of the supplied arg matched that required by the 'cmd'. I've also done the same in the kernel. (all as compile time checks). David
RE: [PATCH] block: sed-opal: reduce stack size of ioctl handler
From: Arnd Bergmann > Sent: 08 February 2017 21:15 > > When CONFIG_KASAN is in use, the sed_ioctl function uses unusually large > stack, > as each possible ioctl argument gets its own stack area plus redzone: Why not do a single copy_from_user() at the top of sed_ioctl() based on the _IOC_DIR() and __IOC_SIZE() values? Something like: int sed_ioctl(..., unsigned int cmd, void __user *arg) { u64 buf[??]; /* or a union */ unsigned int cmd_sz = _IOC_SIZE(cmd); if (_IOC_DIR(cmd) & (_IOC_WRITE | _IOC_READ) && cmd_sz > sizeof buf) return -EINVAL; if (_IOC_DIR(cmd) & _IOC_WRITE) { if (copy_from_user(buf, arg, cmd_sz)) return -EFAULT; } else { if (IOC_DIR(cmd) & _IOC_READ)) memzero(buf, cmd_sz); } switch (cmd) { ... rval = ... ... } if (rval >= 0 && (_IOC_DIR(cmd) & _IOC_READ) && copy_to_user(arg, buf, cmd_sz)); return -EFAULT; return rval; } David