[xen-unstable test] 186175: tolerable FAIL - PUSHED
flight 186175 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/186175/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt 16 saverestore-support-checkfail like 186168 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 186168 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 186168 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 186168 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 186168 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 186168 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 15 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qcow2 14 migrate-support-checkfail never pass test-amd64-amd64-libvirt-raw 14 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 15 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail never pass test-arm64-arm64-xl-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-xl-vhd 15 saverestore-support-checkfail never pass test-armhf-armhf-xl-raw 14 migrate-support-checkfail never pass test-armhf-armhf-xl-raw 15 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-qcow214 migrate-support-checkfail never pass test-armhf-armhf-xl-qcow215 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-libvirt-vhd 15 saverestore-support-checkfail never pass version targeted for testing: xen 2d93f78bfe25f695d8ffb61d110da9df293ed71b baseline version: xen 96af090e33130b0bf0953f3ccab8e7a163392318 Last test of basis 186168 2024-05-28 10:08:43 Z0 days Testing same since 186175 2024-05-28 19:08:59 Z0 days1 attempts People who touched revisions under test: Jan Beulich Jason Andryuk Nicola Vetrini Oleksii Kurochko jobs: build-amd64-xsm pass build-arm64-xsm pass build-i386-xsm pass build-amd64-xtf pass build-amd64 pass build-arm64 pass build-armhf pass build-i386
[PATCH 12/12] block: add special APIs for run-time disabling of discard and friends
A few drivers optimistically try to support discard, write zeroes and secure erase and disable the features from the I/O completion handler if the hardware can't support them. This disable can't be done using the atomic queue limits API because the I/O completion handlers can't take sleeping locks or freezer the queue. Keep the existing clearing of the relevant field to zero, but replace the old blk_queue_max_* APIs with new disable APIs that force the value to 0. Signed-off-by: Christoph Hellwig --- arch/um/drivers/ubd_kern.c | 5 ++--- block/blk-settings.c | 41 drivers/block/xen-blkfront.c | 4 ++-- drivers/scsi/sd.c| 4 ++-- include/linux/blkdev.h | 28 ++-- 5 files changed, 28 insertions(+), 54 deletions(-) diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c index a79a3b7c33a647..7eae1519300fbd 100644 --- a/arch/um/drivers/ubd_kern.c +++ b/arch/um/drivers/ubd_kern.c @@ -475,10 +475,9 @@ static void ubd_handler(void) struct request_queue *q = io_req->req->q; if (req_op(io_req->req) == REQ_OP_DISCARD) - blk_queue_max_discard_sectors(q, 0); + blk_queue_disable_discard(q); if (req_op(io_req->req) == REQ_OP_WRITE_ZEROES) - blk_queue_max_write_zeroes_sectors(q, - 0); + blk_queue_disable_write_zeroes(q); } blk_mq_end_request(io_req->req, io_req->error); kfree(io_req); diff --git a/block/blk-settings.c b/block/blk-settings.c index 0b038729608f4b..996f247fc98e80 100644 --- a/block/blk-settings.c +++ b/block/blk-settings.c @@ -293,47 +293,6 @@ int queue_limits_set(struct request_queue *q, struct queue_limits *lim) } EXPORT_SYMBOL_GPL(queue_limits_set); -/** - * blk_queue_max_discard_sectors - set max sectors for a single discard - * @q: the request queue for the device - * @max_discard_sectors: maximum number of sectors to discard - **/ -void blk_queue_max_discard_sectors(struct request_queue *q, - unsigned int max_discard_sectors) -{ - struct queue_limits *lim = >limits; - - lim->max_hw_discard_sectors = max_discard_sectors; - lim->max_discard_sectors = - min(max_discard_sectors, lim->max_user_discard_sectors); -} -EXPORT_SYMBOL(blk_queue_max_discard_sectors); - -/** - * blk_queue_max_secure_erase_sectors - set max sectors for a secure erase - * @q: the request queue for the device - * @max_sectors: maximum number of sectors to secure_erase - **/ -void blk_queue_max_secure_erase_sectors(struct request_queue *q, - unsigned int max_sectors) -{ - q->limits.max_secure_erase_sectors = max_sectors; -} -EXPORT_SYMBOL(blk_queue_max_secure_erase_sectors); - -/** - * blk_queue_max_write_zeroes_sectors - set max sectors for a single - * write zeroes - * @q: the request queue for the device - * @max_write_zeroes_sectors: maximum number of sectors to write per command - **/ -void blk_queue_max_write_zeroes_sectors(struct request_queue *q, - unsigned int max_write_zeroes_sectors) -{ - q->limits.max_write_zeroes_sectors = max_write_zeroes_sectors; -} -EXPORT_SYMBOL(blk_queue_max_write_zeroes_sectors); - void disk_update_readahead(struct gendisk *disk) { blk_apply_bdi_limits(disk->bdi, >queue->limits); diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c index fd7c0ff2139cee..9b4ec3e4908cce 100644 --- a/drivers/block/xen-blkfront.c +++ b/drivers/block/xen-blkfront.c @@ -1605,8 +1605,8 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id) blkif_req(req)->error = BLK_STS_NOTSUPP; info->feature_discard = 0; info->feature_secdiscard = 0; - blk_queue_max_discard_sectors(rq, 0); - blk_queue_max_secure_erase_sectors(rq, 0); + blk_queue_disable_discard(rq); + blk_queue_disable_secure_erase(rq); } break; case BLKIF_OP_FLUSH_DISKCACHE: diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index 03e67936b27928..56fd523b3987a5 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -837,7 +837,7 @@ static unsigned char sd_setup_protect_cmnd(struct scsi_cmnd *scmd, static void sd_disable_discard(struct scsi_disk *sdkp) { sdkp->provisioning_mode = SD_LBP_DISABLE; - blk_queue_max_discard_sectors(sdkp->disk->queue, 0); + blk_queue_disable_discard(sdkp->disk->queue); } static void
[PATCH 10/12] sr: convert to the atomic queue limits API
Assign all queue limits through a local queue_limits variable and queue_limits_commit_update so that we can't race updating them from multiple places, and free the queue when updating them so that in-progress I/O submissions don't see half-updated limits. Also use the chance to clean up variable names to standard ones. Signed-off-by: Christoph Hellwig --- drivers/scsi/sr.c | 42 +- 1 file changed, 25 insertions(+), 17 deletions(-) diff --git a/drivers/scsi/sr.c b/drivers/scsi/sr.c index 7ab000942b97fc..3f491019103e0c 100644 --- a/drivers/scsi/sr.c +++ b/drivers/scsi/sr.c @@ -111,7 +111,7 @@ static struct lock_class_key sr_bio_compl_lkclass; static int sr_open(struct cdrom_device_info *, int); static void sr_release(struct cdrom_device_info *); -static void get_sectorsize(struct scsi_cd *); +static int get_sectorsize(struct scsi_cd *); static int get_capabilities(struct scsi_cd *); static unsigned int sr_check_events(struct cdrom_device_info *cdi, @@ -473,15 +473,15 @@ static blk_status_t sr_init_command(struct scsi_cmnd *SCpnt) return BLK_STS_IOERR; } -static void sr_revalidate_disk(struct scsi_cd *cd) +static int sr_revalidate_disk(struct scsi_cd *cd) { struct scsi_sense_hdr sshdr; /* if the unit is not ready, nothing more to do */ if (scsi_test_unit_ready(cd->device, SR_TIMEOUT, MAX_RETRIES, )) - return; + return 0; sr_cd_check(>cdi); - get_sectorsize(cd); + return get_sectorsize(cd); } static int sr_block_open(struct gendisk *disk, blk_mode_t mode) @@ -494,13 +494,16 @@ static int sr_block_open(struct gendisk *disk, blk_mode_t mode) return -ENXIO; scsi_autopm_get_device(sdev); - if (disk_check_media_change(disk)) - sr_revalidate_disk(cd); + if (disk_check_media_change(disk)) { + ret = sr_revalidate_disk(cd); + if (ret) + goto out; + } mutex_lock(>lock); ret = cdrom_open(>cdi, mode); mutex_unlock(>lock); - +out: scsi_autopm_put_device(sdev); if (ret) scsi_device_put(cd->device); @@ -685,7 +688,9 @@ static int sr_probe(struct device *dev) blk_pm_runtime_init(sdev->request_queue, dev); dev_set_drvdata(dev, cd); - sr_revalidate_disk(cd); + error = sr_revalidate_disk(cd); + if (error) + goto unregister_cdrom; error = device_add_disk(>sdev_gendev, disk, NULL); if (error) @@ -714,13 +719,14 @@ static int sr_probe(struct device *dev) } -static void get_sectorsize(struct scsi_cd *cd) +static int get_sectorsize(struct scsi_cd *cd) { + struct request_queue *q = cd->device->request_queue; static const u8 cmd[10] = { READ_CAPACITY }; unsigned char buffer[8] = { }; - int the_result; + struct queue_limits lim; + int err; int sector_size; - struct request_queue *queue; struct scsi_failure failure_defs[] = { { .result = SCMD_FAILURE_RESULT_ANY, @@ -736,10 +742,10 @@ static void get_sectorsize(struct scsi_cd *cd) }; /* Do the command and wait.. */ - the_result = scsi_execute_cmd(cd->device, cmd, REQ_OP_DRV_IN, buffer, + err = scsi_execute_cmd(cd->device, cmd, REQ_OP_DRV_IN, buffer, sizeof(buffer), SR_TIMEOUT, MAX_RETRIES, _args); - if (the_result) { + if (err) { cd->capacity = 0x1f; sector_size = 2048; /* A guess, just in case */ } else { @@ -789,10 +795,12 @@ static void get_sectorsize(struct scsi_cd *cd) set_capacity(cd->disk, cd->capacity); } - queue = cd->device->request_queue; - blk_queue_logical_block_size(queue, sector_size); - - return; + lim = queue_limits_start_update(q); + lim.logical_block_size = sector_size; + blk_mq_freeze_queue(q); + err = queue_limits_commit_update(q, ); + blk_mq_unfreeze_queue(q); + return err; } static int get_capabilities(struct scsi_cd *cd) -- 2.43.0
[PATCH 11/12] block: remove unused queue limits API
Remove all APIs that are unused now that sd and sr have been converted to the atomic queue limits API. Signed-off-by: Christoph Hellwig --- block/blk-settings.c | 190 - include/linux/blkdev.h | 12 --- 2 files changed, 202 deletions(-) diff --git a/block/blk-settings.c b/block/blk-settings.c index a49abdb3554834..0b038729608f4b 100644 --- a/block/blk-settings.c +++ b/block/blk-settings.c @@ -293,24 +293,6 @@ int queue_limits_set(struct request_queue *q, struct queue_limits *lim) } EXPORT_SYMBOL_GPL(queue_limits_set); -/** - * blk_queue_chunk_sectors - set size of the chunk for this queue - * @q: the request queue for the device - * @chunk_sectors: chunk sectors in the usual 512b unit - * - * Description: - *If a driver doesn't want IOs to cross a given chunk size, it can set - *this limit and prevent merging across chunks. Note that the block layer - *must accept a page worth of data at any offset. So if the crossing of - *chunks is a hard limitation in the driver, it must still be prepared - *to split single page bios. - **/ -void blk_queue_chunk_sectors(struct request_queue *q, unsigned int chunk_sectors) -{ - q->limits.chunk_sectors = chunk_sectors; -} -EXPORT_SYMBOL(blk_queue_chunk_sectors); - /** * blk_queue_max_discard_sectors - set max sectors for a single discard * @q: the request queue for the device @@ -352,139 +334,6 @@ void blk_queue_max_write_zeroes_sectors(struct request_queue *q, } EXPORT_SYMBOL(blk_queue_max_write_zeroes_sectors); -/** - * blk_queue_max_zone_append_sectors - set max sectors for a single zone append - * @q: the request queue for the device - * @max_zone_append_sectors: maximum number of sectors to write per command - * - * Sets the maximum number of sectors allowed for zone append commands. If - * Specifying 0 for @max_zone_append_sectors indicates that the queue does - * not natively support zone append operations and that the block layer must - * emulate these operations using regular writes. - **/ -void blk_queue_max_zone_append_sectors(struct request_queue *q, - unsigned int max_zone_append_sectors) -{ - unsigned int max_sectors = 0; - - if (WARN_ON(!blk_queue_is_zoned(q))) - return; - - if (max_zone_append_sectors) { - max_sectors = min(q->limits.max_hw_sectors, - max_zone_append_sectors); - max_sectors = min(q->limits.chunk_sectors, max_sectors); - - /* -* Signal eventual driver bugs resulting in the max_zone_append -* sectors limit being 0 due to the chunk_sectors limit (zone -* size) not set or the max_hw_sectors limit not set. -*/ - WARN_ON_ONCE(!max_sectors); - } - - q->limits.max_zone_append_sectors = max_sectors; -} -EXPORT_SYMBOL_GPL(blk_queue_max_zone_append_sectors); - -/** - * blk_queue_logical_block_size - set logical block size for the queue - * @q: the request queue for the device - * @size: the logical block size, in bytes - * - * Description: - * This should be set to the lowest possible block size that the - * storage device can address. The default of 512 covers most - * hardware. - **/ -void blk_queue_logical_block_size(struct request_queue *q, unsigned int size) -{ - struct queue_limits *limits = >limits; - - limits->logical_block_size = size; - - if (limits->discard_granularity < limits->logical_block_size) - limits->discard_granularity = limits->logical_block_size; - - if (limits->physical_block_size < size) - limits->physical_block_size = size; - - if (limits->io_min < limits->physical_block_size) - limits->io_min = limits->physical_block_size; - - limits->max_hw_sectors = - round_down(limits->max_hw_sectors, size >> SECTOR_SHIFT); - limits->max_sectors = - round_down(limits->max_sectors, size >> SECTOR_SHIFT); -} -EXPORT_SYMBOL(blk_queue_logical_block_size); - -/** - * blk_queue_physical_block_size - set physical block size for the queue - * @q: the request queue for the device - * @size: the physical block size, in bytes - * - * Description: - * This should be set to the lowest possible sector size that the - * hardware can operate on without reverting to read-modify-write - * operations. - */ -void blk_queue_physical_block_size(struct request_queue *q, unsigned int size) -{ - q->limits.physical_block_size = size; - - if (q->limits.physical_block_size < q->limits.logical_block_size) - q->limits.physical_block_size = q->limits.logical_block_size; - - if (q->limits.discard_granularity < q->limits.physical_block_size) - q->limits.discard_granularity = q->limits.physical_block_size; - - if (q->limits.io_min < q->limits.physical_block_size) - q->limits.io_min
[PATCH 08/12] sd: cleanup zoned queue limits initialization
Consolidate setting zone-related queue limits in sd_zbc_read_zones instead of splitting them between sd_zbc_revalidate_zones and sd_zbc_read_zones, and move the early_zone_information initialization in sd_zbc_read_zones above setting up the queue limits. Signed-off-by: Christoph Hellwig --- drivers/scsi/sd_zbc.c | 18 -- 1 file changed, 8 insertions(+), 10 deletions(-) diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c index 806036e48abeda..1c24c844e8d178 100644 --- a/drivers/scsi/sd_zbc.c +++ b/drivers/scsi/sd_zbc.c @@ -565,12 +565,6 @@ int sd_zbc_revalidate_zones(struct scsi_disk *sdkp) sdkp->zone_info.zone_blocks = zone_blocks; sdkp->zone_info.nr_zones = nr_zones; - blk_queue_chunk_sectors(q, - logical_to_sectors(sdkp->device, zone_blocks)); - - /* Enable block layer zone append emulation */ - blk_queue_max_zone_append_sectors(q, 0); - flags = memalloc_noio_save(); ret = blk_revalidate_disk_zones(disk); memalloc_noio_restore(flags); @@ -625,6 +619,10 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp, u8 buf[SD_BUF_SIZE]) if (ret != 0) goto err; + nr_zones = round_up(sdkp->capacity, zone_blocks) >> ilog2(zone_blocks); + sdkp->early_zone_info.nr_zones = nr_zones; + sdkp->early_zone_info.zone_blocks = zone_blocks; + /* The drive satisfies the kernel restrictions: set it up */ blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q); if (sdkp->zones_max_open == U32_MAX) @@ -632,10 +630,10 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp, u8 buf[SD_BUF_SIZE]) else disk_set_max_open_zones(disk, sdkp->zones_max_open); disk_set_max_active_zones(disk, 0); - nr_zones = round_up(sdkp->capacity, zone_blocks) >> ilog2(zone_blocks); - - sdkp->early_zone_info.nr_zones = nr_zones; - sdkp->early_zone_info.zone_blocks = zone_blocks; + blk_queue_chunk_sectors(q, + logical_to_sectors(sdkp->device, zone_blocks)); + /* Enable block layer zone append emulation */ + blk_queue_max_zone_append_sectors(q, 0); return 0; -- 2.43.0
[PATCH 06/12] sd: simplify the disable case in sd_config_discard
Fall through to the main call to blk_queue_max_discard_sectors given that max_blocks has been initialized to zero above instead of duplicating the call. Signed-off-by: Christoph Hellwig --- drivers/scsi/sd.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index 09ffe9d826aeac..437743e3a0d37d 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -844,8 +844,7 @@ static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode) case SD_LBP_FULL: case SD_LBP_DISABLE: - blk_queue_max_discard_sectors(q, 0); - return; + break; case SD_LBP_UNMAP: max_blocks = min_not_zero(sdkp->max_unmap_blocks, -- 2.43.0
[PATCH 01/12] ubd: untagle discard vs write zeroes not support handling
Discard and Write Zeroes are different operation and implemented by different fallocate opcodes for ubd. If one fails the other one can work and vice versa. Split the code to disable the operations in ubd_handler to only disable the operation that actually failed. Fixes: 50109b5a03b4 ("um: Add support for DISCARD in the UBD Driver") Signed-off-by: Christoph Hellwig --- arch/um/drivers/ubd_kern.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c index ef805eaa9e013d..a79a3b7c33a647 100644 --- a/arch/um/drivers/ubd_kern.c +++ b/arch/um/drivers/ubd_kern.c @@ -471,9 +471,14 @@ static void ubd_handler(void) for (count = 0; count < n/sizeof(struct io_thread_req *); count++) { struct io_thread_req *io_req = (*irq_req_buffer)[count]; - if ((io_req->error == BLK_STS_NOTSUPP) && (req_op(io_req->req) == REQ_OP_DISCARD)) { - blk_queue_max_discard_sectors(io_req->req->q, 0); - blk_queue_max_write_zeroes_sectors(io_req->req->q, 0); + if (io_req->error == BLK_STS_NOTSUPP) { + struct request_queue *q = io_req->req->q; + + if (req_op(io_req->req) == REQ_OP_DISCARD) + blk_queue_max_discard_sectors(q, 0); + if (req_op(io_req->req) == REQ_OP_WRITE_ZEROES) + blk_queue_max_write_zeroes_sectors(q, + 0); } blk_mq_end_request(io_req->req, io_req->error); kfree(io_req); -- 2.43.0
[PATCH 04/12] sd: add a sd_disable_discard helper
Add helper to disable discard when it is not supported and use it instead of sd_config_discard in the I/O completion handler. This avoids touching more fields than required in the I/O completion handler and prepares for converting sd to use the atomic queue limits API. Signed-off-by: Christoph Hellwig --- drivers/scsi/sd.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index 15d0035048d902..a8838381823254 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -821,6 +821,12 @@ static unsigned char sd_setup_protect_cmnd(struct scsi_cmnd *scmd, return protect; } +static void sd_disable_discard(struct scsi_disk *sdkp) +{ + sdkp->provisioning_mode = SD_LBP_DISABLE; + blk_queue_max_discard_sectors(sdkp->disk->queue, 0); +} + static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode) { struct request_queue *q = sdkp->disk->queue; @@ -2245,12 +2251,12 @@ static int sd_done(struct scsi_cmnd *SCpnt) case 0x24: /* INVALID FIELD IN CDB */ switch (SCpnt->cmnd[0]) { case UNMAP: - sd_config_discard(sdkp, SD_LBP_DISABLE); + sd_disable_discard(sdkp); break; case WRITE_SAME_16: case WRITE_SAME: if (SCpnt->cmnd[1] & 8) { /* UNMAP */ - sd_config_discard(sdkp, SD_LBP_DISABLE); + sd_disable_discard(sdkp); } else { sdkp->device->no_write_same = 1; sd_config_write_same(sdkp); -- 2.43.0
[PATCH 09/12] sd: convert to the atomic queue limits API
Assign all queue limits through a local queue_limits variable and queue_limits_commit_update so that we can't race updating them from multiple places, and free the queue when updating them so that in-progress I/O submissions don't see half-updated limits. Signed-off-by: Christoph Hellwig --- drivers/scsi/sd.c | 126 -- drivers/scsi/sd.h | 6 +- drivers/scsi/sd_zbc.c | 15 ++--- 3 files changed, 84 insertions(+), 63 deletions(-) diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index 2d08b69154b995..03e67936b27928 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -101,12 +101,13 @@ MODULE_ALIAS_SCSI_DEVICE(TYPE_ZBC); #define SD_MINORS 16 -static void sd_config_discard(struct scsi_disk *, unsigned int); -static void sd_config_write_same(struct scsi_disk *); +static void sd_config_discard(struct scsi_disk *sdkp, struct queue_limits *lim, + unsigned int mode); +static void sd_config_write_same(struct scsi_disk *sdkp, + struct queue_limits *lim); static int sd_revalidate_disk(struct gendisk *); static void sd_unlock_native_capacity(struct gendisk *disk); static void sd_shutdown(struct device *); -static void sd_read_capacity(struct scsi_disk *sdkp, unsigned char *buffer); static void scsi_disk_release(struct device *cdev); static DEFINE_IDA(sd_index_ida); @@ -456,7 +457,8 @@ provisioning_mode_store(struct device *dev, struct device_attribute *attr, { struct scsi_disk *sdkp = to_scsi_disk(dev); struct scsi_device *sdp = sdkp->device; - int mode; + struct queue_limits lim; + int mode, err; if (!capable(CAP_SYS_ADMIN)) return -EACCES; @@ -472,8 +474,13 @@ provisioning_mode_store(struct device *dev, struct device_attribute *attr, if (mode < 0) return -EINVAL; - sd_config_discard(sdkp, mode); - + lim = queue_limits_start_update(sdkp->disk->queue); + sd_config_discard(sdkp, , mode); + blk_mq_freeze_queue(sdkp->disk->queue); + err = queue_limits_commit_update(sdkp->disk->queue, ); + blk_mq_unfreeze_queue(sdkp->disk->queue); + if (err) + return err; return count; } static DEVICE_ATTR_RW(provisioning_mode); @@ -556,6 +563,7 @@ max_write_same_blocks_store(struct device *dev, struct device_attribute *attr, { struct scsi_disk *sdkp = to_scsi_disk(dev); struct scsi_device *sdp = sdkp->device; + struct queue_limits lim; unsigned long max; int err; @@ -577,8 +585,13 @@ max_write_same_blocks_store(struct device *dev, struct device_attribute *attr, sdkp->max_ws_blocks = max; } - sd_config_write_same(sdkp); - + lim = queue_limits_start_update(sdkp->disk->queue); + sd_config_write_same(sdkp, ); + blk_mq_freeze_queue(sdkp->disk->queue); + err = queue_limits_commit_update(sdkp->disk->queue, ); + blk_mq_unfreeze_queue(sdkp->disk->queue); + if (err) + return err; return count; } static DEVICE_ATTR_RW(max_write_same_blocks); @@ -827,17 +840,15 @@ static void sd_disable_discard(struct scsi_disk *sdkp) blk_queue_max_discard_sectors(sdkp->disk->queue, 0); } -static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode) +static void sd_config_discard(struct scsi_disk *sdkp, struct queue_limits *lim, + unsigned int mode) { - struct request_queue *q = sdkp->disk->queue; unsigned int logical_block_size = sdkp->device->sector_size; unsigned int max_blocks = 0; - q->limits.discard_alignment = - sdkp->unmap_alignment * logical_block_size; - q->limits.discard_granularity = - max(sdkp->physical_block_size, - sdkp->unmap_granularity * logical_block_size); + lim->discard_alignment = sdkp->unmap_alignment * logical_block_size; + lim->discard_granularity = max(sdkp->physical_block_size, + sdkp->unmap_granularity * logical_block_size); sdkp->provisioning_mode = mode; switch (mode) { @@ -875,7 +886,8 @@ static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode) break; } - blk_queue_max_discard_sectors(q, max_blocks * (logical_block_size >> 9)); + lim->max_hw_discard_sectors = max_blocks * + (logical_block_size >> SECTOR_SHIFT); } static void *sd_set_special_bvec(struct request *rq, unsigned int data_len) @@ -1010,9 +1022,9 @@ static void sd_disable_write_same(struct scsi_disk *sdkp) blk_queue_max_write_zeroes_sectors(sdkp->disk->queue, 0); } -static void sd_config_write_same(struct scsi_disk *sdkp) +static void sd_config_write_same(struct scsi_disk *sdkp, + struct queue_limits *lim) { - struct request_queue *q = sdkp->disk->queue; unsigned int logical_block_size =
convert the SCSI ULDs to the atomic queue limits API
Hi all, this series converts the SCSI upper level drivers to the atomic queue limits API. The first patch is a bug fix for ubd that later patches depend on and might be worth picking up for 6.10. The second patch changes the max_sectors calculation to take the optimal I/O size into account so that sd, nbd and rbd don't have to mess with the user max_sector value. I'd love to see a careful review from the nbd and rbd maintainers for this one! The following patches clean up a few lose ends in the sd driver, and then convert sd and sr to the atomic queue limits API. The final patches remove the now unused block APIs, and convert a few to be specific to their now more narrow use case. The patches are against Jens' block-6.10 tree. Due to the amount of block layer changes in here, and other that will depend on it, it would be good if this could eventually be merged through the block tree, or at least a shared branch between the SCSI and block trees. Diffstat: arch/um/drivers/ubd_kern.c | 10 + block/blk-settings.c | 238 +-- drivers/block/nbd.c |2 drivers/block/rbd.c |1 drivers/block/xen-blkfront.c |4 drivers/scsi/sd.c| 218 --- drivers/scsi/sd.h|6 - drivers/scsi/sd_zbc.c| 27 ++-- drivers/scsi/sr.c| 42 --- include/linux/blkdev.h | 40 +++ 10 files changed, 196 insertions(+), 392 deletions(-)
[PATCH 07/12] sd: factor out a sd_discard_mode helper
Split the logic to pick the right discard mode into a little helper to prepare for further changes. Signed-off-by: Christoph Hellwig --- drivers/scsi/sd.c | 37 - 1 file changed, 20 insertions(+), 17 deletions(-) diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index 437743e3a0d37d..2d08b69154b995 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -3201,6 +3201,25 @@ static void sd_read_app_tag_own(struct scsi_disk *sdkp, unsigned char *buffer) return; } +static unsigned int sd_discard_mode(struct scsi_disk *sdkp) +{ + if (!sdkp->lbpvpd) { + /* LBP VPD page not provided */ + if (sdkp->max_unmap_blocks) + return SD_LBP_UNMAP; + return SD_LBP_WS16; + } + + /* LBP VPD page tells us what to use */ + if (sdkp->lbpu && sdkp->max_unmap_blocks) + return SD_LBP_UNMAP; + if (sdkp->lbpws) + return SD_LBP_WS16; + if (sdkp->lbpws10) + return SD_LBP_WS10; + return SD_LBP_DISABLE; +} + /** * sd_read_block_limits - Query disk device for preferred I/O sizes. * @sdkp: disk to query @@ -3239,23 +3258,7 @@ static void sd_read_block_limits(struct scsi_disk *sdkp) sdkp->unmap_alignment = get_unaligned_be32(>data[32]) & ~(1 << 31); - if (!sdkp->lbpvpd) { /* LBP VPD page not provided */ - - if (sdkp->max_unmap_blocks) - sd_config_discard(sdkp, SD_LBP_UNMAP); - else - sd_config_discard(sdkp, SD_LBP_WS16); - - } else {/* LBP VPD page tells us what to use */ - if (sdkp->lbpu && sdkp->max_unmap_blocks) - sd_config_discard(sdkp, SD_LBP_UNMAP); - else if (sdkp->lbpws) - sd_config_discard(sdkp, SD_LBP_WS16); - else if (sdkp->lbpws10) - sd_config_discard(sdkp, SD_LBP_WS10); - else - sd_config_discard(sdkp, SD_LBP_DISABLE); - } + sd_config_discard(sdkp, sd_discard_mode(sdkp)); } out: -- 2.43.0
[PATCH 02/12] block: take io_opt and io_min into account for max_sectors
The soft max_sectors limit is normally capped by the hardware limits and an arbitrary upper limit enforced by the kernel, but can be modified by the user. A few drivers want to increase this limit (nbd, rbd) or adjust it up or down based on hardware capabilities (sd). Change blk_validate_limits to default max_sectors to the optimal I/O size, or upgrade it to the preferred minimal I/O size if that is larger than the kernel default if no optimal I/O size is provided based on the logic in the SD driver. This keeps the existing kernel default for drivers that do not provide an io_opt or very big io_min value, but picks a much more useful default for those who provide these hints, and allows to remove the hacks to set the user max_sectors limit in nbd, rbd and sd. Note that rd picks a different value for the optimal I/O size vs the user max_sectors value, so this is a bit of a behavior change that could use careful review from people familiar with rbd. Signed-off-by: Christoph Hellwig --- block/blk-settings.c | 7 +++ drivers/block/nbd.c | 2 +- drivers/block/rbd.c | 1 - drivers/scsi/sd.c| 29 + 4 files changed, 13 insertions(+), 26 deletions(-) diff --git a/block/blk-settings.c b/block/blk-settings.c index effeb9a639bb45..a49abdb3554834 100644 --- a/block/blk-settings.c +++ b/block/blk-settings.c @@ -153,6 +153,13 @@ static int blk_validate_limits(struct queue_limits *lim) if (lim->max_user_sectors < PAGE_SIZE / SECTOR_SIZE) return -EINVAL; lim->max_sectors = min(max_hw_sectors, lim->max_user_sectors); + } else if (lim->io_opt) { + lim->max_sectors = + min(max_hw_sectors, lim->io_opt >> SECTOR_SHIFT); + } else if (lim->io_min && + lim->io_min > (BLK_DEF_MAX_SECTORS_CAP << SECTOR_SHIFT)) { + lim->max_sectors = + min(max_hw_sectors, lim->io_min >> SECTOR_SHIFT); } else { lim->max_sectors = min(max_hw_sectors, BLK_DEF_MAX_SECTORS_CAP); } diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index 22a79a62cc4eab..ad887d614d5b3f 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -1808,7 +1808,7 @@ static struct nbd_device *nbd_dev_add(int index, unsigned int refs) { struct queue_limits lim = { .max_hw_sectors = 65536, - .max_user_sectors = 256, + .io_opt = 256 << SECTOR_SHIFT, .max_segments = USHRT_MAX, .max_segment_size = UINT_MAX, }; diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index 26ff5cd2bf0abc..05096172f334a3 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -4954,7 +4954,6 @@ static int rbd_init_disk(struct rbd_device *rbd_dev) rbd_dev->layout.object_size * rbd_dev->layout.stripe_count; struct queue_limits lim = { .max_hw_sectors = objset_bytes >> SECTOR_SHIFT, - .max_user_sectors = objset_bytes >> SECTOR_SHIFT, .io_min = rbd_dev->opts->alloc_size, .io_opt = rbd_dev->opts->alloc_size, .max_segments = USHRT_MAX, diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index f6c822c9cbd2d3..3dff9150ce11e2 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -3593,7 +3593,7 @@ static int sd_revalidate_disk(struct gendisk *disk) struct request_queue *q = sdkp->disk->queue; sector_t old_capacity = sdkp->capacity; unsigned char *buffer; - unsigned int dev_max, rw_max; + unsigned int dev_max; SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, "sd_revalidate_disk\n")); @@ -3675,34 +3675,15 @@ static int sd_revalidate_disk(struct gendisk *disk) else blk_queue_io_min(sdkp->disk->queue, 0); - if (sd_validate_opt_xfer_size(sdkp, dev_max)) { - q->limits.io_opt = logical_to_bytes(sdp, sdkp->opt_xfer_blocks); - rw_max = logical_to_sectors(sdp, sdkp->opt_xfer_blocks); - } else { - q->limits.io_opt = 0; - rw_max = min_not_zero(logical_to_sectors(sdp, dev_max), - (sector_t)BLK_DEF_MAX_SECTORS_CAP); - } - /* * Limit default to SCSI host optimal sector limit if set. There may be * an impact on performance for when the size of a request exceeds this * host limit. */ - rw_max = min_not_zero(rw_max, sdp->host->opt_sectors); - - /* Do not exceed controller limit */ - rw_max = min(rw_max, queue_max_hw_sectors(q)); - - /* -* Only update max_sectors if previously unset or if the current value -* exceeds the capabilities of the hardware. -*/ - if (sdkp->first_scan || -
[PATCH 05/12] sd: add a sd_disable_write_same helper
Add helper to disable WRITE SAME when it is not supported and use it instead of sd_config_write_same in the I/O completion handler. This avoids touching more fields than required in the I/O completion handler and prepares for converting sd to use the atomic queue limits API. Signed-off-by: Christoph Hellwig --- drivers/scsi/sd.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index a8838381823254..09ffe9d826aeac 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -1004,6 +1004,13 @@ static blk_status_t sd_setup_write_zeroes_cmnd(struct scsi_cmnd *cmd) return sd_setup_write_same10_cmnd(cmd, false); } +static void sd_disable_write_same(struct scsi_disk *sdkp) +{ + sdkp->device->no_write_same = 1; + sdkp->max_ws_blocks = 0; + blk_queue_max_write_zeroes_sectors(sdkp->disk->queue, 0); +} + static void sd_config_write_same(struct scsi_disk *sdkp) { struct request_queue *q = sdkp->disk->queue; @@ -2258,8 +2265,7 @@ static int sd_done(struct scsi_cmnd *SCpnt) if (SCpnt->cmnd[1] & 8) { /* UNMAP */ sd_disable_discard(sdkp); } else { - sdkp->device->no_write_same = 1; - sd_config_write_same(sdkp); + sd_disable_write_same(sdkp); req->rq_flags |= RQF_QUIET; } break; -- 2.43.0
[PATCH 03/12] sd: simplify the ZBC case in provisioning_mode_store
Don't reset the discard settings to no-op over and over when a user writes to the provisioning attribute as that is already the default mode for ZBC devices. In hindsight we should have made writing to the attribute fail for ZBC devices, but the code has probably been around for far too long to change this now. Signed-off-by: Christoph Hellwig --- drivers/scsi/sd.c | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index 3dff9150ce11e2..15d0035048d902 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -461,14 +461,13 @@ provisioning_mode_store(struct device *dev, struct device_attribute *attr, if (!capable(CAP_SYS_ADMIN)) return -EACCES; - if (sd_is_zoned(sdkp)) { - sd_config_discard(sdkp, SD_LBP_DISABLE); - return count; - } - if (sdp->type != TYPE_DISK) return -EINVAL; + /* ignore the proivisioning mode for ZBB devices */ + if (sd_is_zoned(sdkp)) + return count; + mode = sysfs_match_string(lbp_mode, buf); if (mode < 0) return -EINVAL; -- 2.43.0
Re: [PATCH v2 0/7] hw/xen: Simplify legacy backends handling
ping? On 10/5/24 12:49, Philippe Mathieu-Daudé wrote: Respin of Paolo's Xen patches from https://lore.kernel.org/qemu-devel/20240509170044.190795-1-pbonz...@redhat.com/ rebased on one of my cleanup branches making backend structures const. Treat xenfb as other backends. Paolo Bonzini (2): hw/xen: initialize legacy backends from xen_bus_init() hw/xen: register legacy backends via xen_backend_init Philippe Mathieu-Daudé (5): hw/xen: Remove declarations left over in 'xen-legacy-backend.h' hw/xen: Constify XenLegacyDevice::XenDevOps hw/xen: Constify xenstore_be::XenDevOps hw/xen: Make XenDevOps structures const hw/xen: Register framebuffer backend via xen_backend_init() include/hw/xen/xen-legacy-backend.h | 15 +-- include/hw/xen/xen_pvdev.h | 3 +-- hw/9pfs/xen-9p-backend.c| 8 +++- hw/display/xenfb.c | 15 +-- hw/i386/pc.c| 1 - hw/usb/xen-usb.c| 14 -- hw/xen/xen-bus.c| 4 hw/xen/xen-hvm-common.c | 2 -- hw/xen/xen-legacy-backend.c | 24 hw/xenpv/xen_machine_pv.c | 7 +-- 10 files changed, 35 insertions(+), 58 deletions(-)
[linux-linus test] 186174: tolerable FAIL - PUSHED
flight 186174 linux-linus real [real] flight 186177 linux-linus real-retest [real] http://logs.test-lab.xenproject.org/osstest/logs/186174/ http://logs.test-lab.xenproject.org/osstest/logs/186177/ Failures :-/ but no regressions. Tests which are failing intermittently (not blocking): test-armhf-armhf-xl-raw 8 xen-bootfail pass in 186177-retest test-armhf-armhf-xl-arndale 8 xen-bootfail pass in 186177-retest test-armhf-armhf-xl-credit1 8 xen-bootfail pass in 186177-retest test-amd64-amd64-xl-qemut-debianhvm-i386-xsm 12 debian-hvm-install fail pass in 186177-retest Tests which did not succeed, but are not blocking: test-armhf-armhf-xl-arndale 15 migrate-support-check fail in 186177 never pass test-armhf-armhf-xl-arndale 16 saverestore-support-check fail in 186177 never pass test-armhf-armhf-xl-credit1 15 migrate-support-check fail in 186177 never pass test-armhf-armhf-xl-credit1 16 saverestore-support-check fail in 186177 never pass test-armhf-armhf-xl-raw 14 migrate-support-check fail in 186177 never pass test-armhf-armhf-xl-raw 15 saverestore-support-check fail in 186177 never pass test-armhf-armhf-libvirt 16 saverestore-support-checkfail like 186161 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 186161 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 186161 test-armhf-armhf-examine 8 reboot fail like 186161 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 186161 test-armhf-armhf-xl-rtds 8 xen-boot fail like 186161 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 186161 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 186161 test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qcow2 14 migrate-support-checkfail never pass test-amd64-amd64-libvirt-raw 14 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 15 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-xl-qcow214 migrate-support-checkfail never pass test-armhf-armhf-xl-qcow215 saverestore-support-checkfail never pass test-arm64-arm64-xl-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-xl-vhd 15 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-libvirt-vhd 15 saverestore-support-checkfail never pass version targeted for testing: linuxe0cce98fe279b64f4a7d81b7f5c3a23d80b92fbc baseline version: linux2bfcfd584ff5ccc8bb7acde19b42570414bf880b Last test of basis 186161 2024-05-27 16:11:54 Z1 days Testing same since 186174 2024-05-28 18:10:31 Z0 days1 attempts People who touched revisions under test: Andrii Nakryiko Carlos López Carol Soto Jarkko Sakkinen Linus Torvalds Masami Hiramatsu (Google) Matthew
Re: [RFC XEN PATCH v8 5/5] domctl: Add XEN_DOMCTL_gsi_permission to grant gsi
Hi, On 2024/5/17 19:50, Jan Beulich wrote: > On 17.05.2024 13:14, Chen, Jiqian wrote: >> On 2024/5/17 18:51, Jan Beulich wrote: >>> On 17.05.2024 12:45, Chen, Jiqian wrote: On 2024/5/16 22:01, Jan Beulich wrote: > On 16.05.2024 11:52, Jiqian Chen wrote: >> +if ( gsi >= nr_irqs_gsi ) >> +{ >> +ret = -EINVAL; >> +break; >> +} >> + >> +if ( !irq_access_permitted(current->domain, gsi) || > > I.e. assuming IRQ == GSI? Is that a valid assumption when any number of > source overrides may be surfaced by ACPI? All irqs smaller than nr_irqs_gsi are gsi, aren't they? >>> >>> They are, but there's not necessarily a 1:1 mapping. >> Oh, so do I need to add a new gsi_caps to store granted gsi? > > Probably not. You ought to be able to translate between GSI and IRQ, > and then continue to record in / check against IRQ permissions. But I found in function init_irq_data: for ( irq = 0; irq < nr_irqs_gsi; irq++ ) { int rc; desc = irq_to_desc(irq); desc->irq = irq; rc = init_one_irq_desc(desc); if ( rc ) return rc; } Does it mean that when irq < nr_irqs_gsi, the gsi and irq is a 1:1 mapping? What's more, when using PHYSDEVOP_setup_gsi, it calls mp_register_gsi, and in mp_register_gsi, it uses " desc = irq_to_desc(gsi); " to get irq_desc directly. Combining above, can we consider "gsi == irq" when irq < nr_irqs_gsi ? > > Jan -- Best regards, Jiqian Chen.
[ovmf test] 186178: all pass - PUSHED
flight 186178 ovmf real [real] http://logs.test-lab.xenproject.org/osstest/logs/186178/ Perfect :-) All tests in this flight passed as required version targeted for testing: ovmf 55f8bddade205c9cbe3583d5d31d0048cdf26ed4 baseline version: ovmf 0e3189d406684e44608e01c93f7e2d53fa07b40a Last test of basis 186176 2024-05-28 20:43:05 Z0 days Testing same since 186178 2024-05-29 00:41:09 Z0 days1 attempts People who touched revisions under test: Michael Kubacki jobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass test-amd64-amd64-xl-qemuu-ovmf-amd64 pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/osstest/ovmf.git 0e3189d406..55f8bddade 55f8bddade205c9cbe3583d5d31d0048cdf26ed4 -> xen-tested-master
[ovmf test] 186176: all pass - PUSHED
flight 186176 ovmf real [real] http://logs.test-lab.xenproject.org/osstest/logs/186176/ Perfect :-) All tests in this flight passed as required version targeted for testing: ovmf 0e3189d406684e44608e01c93f7e2d53fa07b40a baseline version: ovmf 08281572aab5b1f7e05bf26de4148af19eddc8b7 Last test of basis 186160 2024-05-27 09:43:09 Z1 days Testing same since 186176 2024-05-28 20:43:05 Z0 days1 attempts People who touched revisions under test: Michael D Kinney jobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass test-amd64-amd64-xl-qemuu-ovmf-amd64 pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/osstest/ovmf.git 08281572aa..0e3189d406 0e3189d406684e44608e01c93f7e2d53fa07b40a -> xen-tested-master
Re: [XEN PATCH 3/4] x86: address violations of MISRA C Rule 8.4
On 2024-05-28 08:28, Jan Beulich wrote: On 27.05.2024 16:53, Nicola Vetrini wrote: Rule 8.4 states: "A compatible declaration shall be visible when an object or function with external linkage is defined." These variables are only referenced from asm modules, so they need to be extern and there is negligible risk of them being used improperly without noticing. "asm modules" isn't quite correct, as there's one use from inline assembly. I have to admit I have no good wording suggestion other than explicitly covering both: "asm modules or inline assembly". Yet that then is ambiguous, as a use in inline assembly may also mean that symbol is actually visible to the compiler by being mentioned as on of the operands. Better ideas? Maybe generically "asm code" or just "asm"? It's not really relevant whether it's inline or not, as far as I understand. As a result, they can be exempted using a comment-based deviation. No functional change. Signed-off-by: Nicola Vetrini With suitably adjusted wording: Acked-by: Jan Beulich Jan -- Nicola Vetrini, BSc Software Engineer, BUGSENG srl (https://bugseng.com)
[xen-unstable-smoke test] 186172: tolerable all pass - PUSHED
flight 186172 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/186172/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass version targeted for testing: xen d27c2835e0005200d464b57156c76455d46f74bb baseline version: xen 2d93f78bfe25f695d8ffb61d110da9df293ed71b Last test of basis 186170 2024-05-28 11:00:23 Z0 days Testing same since 186172 2024-05-28 17:01:57 Z0 days1 attempts People who touched revisions under test: Andrew Cooper Jan Beulich jobs: build-arm64-xsm pass build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-arm64-arm64-xl-xsm pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/xen.git 2d93f78bfe..d27c2835e0 d27c2835e0005200d464b57156c76455d46f74bb -> smoke
Re: CVE-2021-47377: kernel: xen/balloon: use a kernel thread instead a workqueue
On Mon, May 27, 2024 at 12:58:16PM +0200, Juergen Gross wrote: > Hi, > > I'd like to dispute CVE-2021-47377: the issue fixed by upstream commit > 8480ed9c2bbd56fc86524998e5f2e3e22f5038f6 can in no way be triggered by > an unprivileged user or by a remote attack of the system, as it requires > initiation of memory ballooning of the running system. This can be done > only by either a host admin or by an admin of the guest which might > suffer the detection of the hanging workqueue. > > Please revoke this CVE. Ah, good catch, this came in as part of the GSD import, and I missed that this required that type of permissions. Now revoked, thanks for the review! greg k-h
[xen-unstable test] 186168: tolerable FAIL - PUSHED
flight 186168 xen-unstable real [real] flight 186173 xen-unstable real-retest [real] http://logs.test-lab.xenproject.org/osstest/logs/186168/ http://logs.test-lab.xenproject.org/osstest/logs/186173/ Failures :-/ but no regressions. Tests which are failing intermittently (not blocking): test-armhf-armhf-xl-credit2 8 xen-bootfail pass in 186173-retest test-amd64-amd64-xl-qemut-debianhvm-i386-xsm 12 debian-hvm-install fail pass in 186173-retest Tests which did not succeed, but are not blocking: test-armhf-armhf-xl-credit2 15 migrate-support-check fail in 186173 never pass test-armhf-armhf-xl-credit2 16 saverestore-support-check fail in 186173 never pass test-armhf-armhf-libvirt 16 saverestore-support-checkfail like 186163 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 186163 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 186163 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 186163 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 186163 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 186163 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 15 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qcow2 14 migrate-support-checkfail never pass test-amd64-amd64-libvirt-raw 14 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 15 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail never pass test-arm64-arm64-xl-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-xl-vhd 15 saverestore-support-checkfail never pass test-armhf-armhf-xl-raw 14 migrate-support-checkfail never pass test-armhf-armhf-xl-raw 15 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-qcow214 migrate-support-checkfail never pass test-armhf-armhf-xl-qcow215 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-libvirt-vhd 15 saverestore-support-checkfail never pass version targeted for testing: xen 96af090e33130b0bf0953f3ccab8e7a163392318 baseline version: xen ac572152e578a8853de0534384c1539ec21f11e0 Last test of basis 186163 2024-05-28 01:53:34 Z0 days Testing same since 186168 2024-05-28 10:08:43 Z0 days1 attempts People who touched revisions under test: Jan Beulich Nicola Vetrini Stefano Stabellini Stefano Stabellini jobs: build-amd64-xsm pass build-arm64-xsm pass build-i386-xsm
Re: [PATCH v4 4/4] x86/ucode: Utilize ucode_force and remove opt_ucode_allow_same
On 28/05/2024 4:29 pm, Fouad Hilly wrote: > Pass ucode_force to common micorocde checks and utilize it to allow for > microcode downgrade > or reapply the same version of the microcode. > Update low level Intel and AMD to check for valid signature only. Any version > checks is done > at core.c. > While adding ucode_force, opt_ucode_allow_same was removed. > Remove opt_ucode_allow_same from documentation. > > Signed-off-by: Fouad Hilly > --- > [4] > 1- As opt_ucode_allow_same is not required anymore, it has been removed while > introducing ucode_force. > 2- Apply the changes for both AMD and Intel. > 3- Remove the mention of opt_ucode_allow_same from documentation. > --- > docs/misc/xen-command-line.pandoc| 7 +-- > xen/arch/x86/cpu/microcode/amd.c | 7 --- > xen/arch/x86/cpu/microcode/core.c| 9 +++-- > xen/arch/x86/cpu/microcode/intel.c | 4 > xen/arch/x86/cpu/microcode/private.h | 2 -- > 5 files changed, 4 insertions(+), 25 deletions(-) > > diff --git a/docs/misc/xen-command-line.pandoc > b/docs/misc/xen-command-line.pandoc > index 1dea7431fab6..a42ce1039fed 100644 > --- a/docs/misc/xen-command-line.pandoc > +++ b/docs/misc/xen-command-line.pandoc > @@ -2648,7 +2648,7 @@ performance. > Alternatively, selecting `tsx=1` will re-enable TSX at the users own risk. > > ### ucode > -> `= List of [ | scan=, nmi=, allow-same= ]` > +> `= List of [ | scan=, nmi= ]` > > Applicability: x86 > Default: `nmi` > @@ -2680,11 +2680,6 @@ precedence over `scan`. > stop_machine context. In NMI handler, even NMIs are blocked, which is > considered safer. The default value is `true`. > > -'allow-same' alters the default acceptance policy for new microcode to permit > -trying to reload the same version. Many CPUs will actually reload microcode > -of the same version, and this allows for easy testing of the late microcode > -loading path. > - > ### unrestricted_guest (Intel) > > `= ` > > diff --git a/xen/arch/x86/cpu/microcode/amd.c > b/xen/arch/x86/cpu/microcode/amd.c > index f76a563c8b84..4bcc79f1ab2d 100644 > --- a/xen/arch/x86/cpu/microcode/amd.c > +++ b/xen/arch/x86/cpu/microcode/amd.c > @@ -225,13 +225,6 @@ static int cf_check apply_microcode(const struct > microcode_patch *patch) > if ( result == MIS_UCODE ) > return -EINVAL; > > -/* > - * Allow application of the same revision to pick up SMT-specific changes > - * even if the revision of the other SMT thread is already up-to-date. > - */ > -if ( result == OLD_UCODE ) > -return -EEXIST; > - I'm afraid that because of the observation leading to 977d98e67c2e, I see no option other than to plumb the force flag down into apply_microcode(). This check cannot be deleted unconditionally, or we'll try downgrading microcode even without the force flag being passed. Unless we can fix the cacheing layer to not treat "I didn't load ucode at boot" as "no idea of the symmetry of the system". I'm unsure which of these two is going to be less ugly... > if ( check_final_patch_levels(sig) ) > { > printk(XENLOG_ERR > diff --git a/xen/arch/x86/cpu/microcode/core.c > b/xen/arch/x86/cpu/microcode/core.c > index 8a9e744489b9..fc8ad8cfdd76 100644 > --- a/xen/arch/x86/cpu/microcode/core.c > +++ b/xen/arch/x86/cpu/microcode/core.c > @@ -100,8 +100,6 @@ static bool __initdata ucode_scan; > /* By default, ucode loading is done in NMI handler */ > static bool ucode_in_nmi = true; > > -bool __read_mostly opt_ucode_allow_same; > - > /* Protected by microcode_mutex */ > static struct microcode_patch *microcode_cache; > > @@ -128,8 +126,6 @@ static int __init cf_check parse_ucode(const char *s) > > if ( (val = parse_boolean("nmi", s, ss)) >= 0 ) > ucode_in_nmi = val; > -else if ( (val = parse_boolean("allow-same", s, ss)) >= 0 ) > -opt_ucode_allow_same = val; > else if ( !ucode_mod_forced ) /* Not forced by EFI */ > { > if ( (val = parse_boolean("scan", s, ss)) >= 0 ) > @@ -583,6 +579,7 @@ static long cf_check microcode_update_helper(void *data) > struct ucode_buf *buffer = data; > unsigned int cpu, updated; > struct microcode_patch *patch; > +bool ucode_force = buffer->flags == XENPF_UCODE_FORCE; > > /* cpu_online_map must not change during update */ > if ( !get_cpu_maps() ) > @@ -636,12 +633,12 @@ static long cf_check microcode_update_helper(void *data) >microcode_cache); > > if ( result != NEW_UCODE && > - !(opt_ucode_allow_same && result == SAME_UCODE) ) > + (!ucode_force || (result & ~SAME_UCODE)) ) What is "result & ~SAME_UCODE" trying to do? ~Andrew
Re: [PATCH v4 3/4] x86/ucode: Introduce --force option to xen-ucode to force skipping microcode version check
On 28/05/2024 4:29 pm, Fouad Hilly wrote: > Introduce --force option to xen-ucode to force skipping microcode version > check, which > allows the user to update x86 microcode even if both versions are the same or > downgrade. > xc_microcode_update() refactored to accept flags and utilize > xenpf_microcode_update2. > > Signed-off-by: Fouad Hilly I think it would be better to stop the subject at "... to xen-ucode". The commit message itself covers what has changed. > diff --git a/tools/misc/xen-ucode.c b/tools/misc/xen-ucode.c > index 6f9dd2a7e431..b878edf2399a 100644 > --- a/tools/misc/xen-ucode.c > +++ b/tools/misc/xen-ucode.c > @@ -13,6 +13,8 @@ > #include > #include > > +#include > + > static xc_interface *xch; > > static const char intel_id[] = "GenuineIntel"; > @@ -24,7 +26,8 @@ static void usage(FILE *stream, const char *name) > "Usage: %s [microcode file] [options]\n" > "options:\n" > " -h, --helpdisplay this help and exit\n" > - " -s, --show-cpu-info show CPU information and exit\n", > + " -s, --show-cpu-info show CPU information and exit\n" > + " -f, --force force to skip microcode version check\n", I'd phrase this as "skip certain checks; do not use unless you know exactly what you are doing" which makes it very clear that people get to keep all pieces if they try this. Otherwise (and subject to the style cleanup in the previous patch), Reviewed-by: Andrew Cooper
Re: [PATCH v4 2/4] x86/ucode: refactor xen-ucode to utilize getopt
On 28/05/2024 4:29 pm, Fouad Hilly wrote: > Use getopt_long() to handle command line arguments. > Introduce ext_err for common exit with errors. > xc_microcode_update() refactored to accept flags and utilize > xenpf_microcode_update2 Importantly, not. That's deferred until the next patch. > Introducing usage() to handle usage\help messages in a common block. > show_curr_cpu is printed to stdout only. > > Signed-off-by: Fouad Hilly > --- > [v4] > 1- Merge three patches into one. > 2- usage() to print messages to the correct stream. > 3- Update commit message and description. > --- > tools/misc/xen-ucode.c | 51 -- > 1 file changed, 39 insertions(+), 12 deletions(-) > > diff --git a/tools/misc/xen-ucode.c b/tools/misc/xen-ucode.c > index 390969db3d1c..6f9dd2a7e431 100644 > --- a/tools/misc/xen-ucode.c > +++ b/tools/misc/xen-ucode.c > @@ -11,12 +11,23 @@ > #include > #include > #include > +#include > > static xc_interface *xch; > > static const char intel_id[] = "GenuineIntel"; > static const char amd_id[] = "AuthenticAMD"; > > +static void usage(FILE *stream, const char *name) > +{ > +fprintf(stream, "%s: Xen microcode updating tool\n" > + "Usage: %s [microcode file] [options]\n" Options are typically expressed first when writing this. > + "options:\n" > + " -h, --helpdisplay this help and exit\n" > + " -s, --show-cpu-info show CPU information and exit\n", The "and exit" is pointless to state like this. > + name, name); Alignment is out by one column here. Also, previously, usage() was always followed by show_curr_cpu(). That's still easy to retain by reordering the two static functions. > +} > + > static void show_curr_cpu(FILE *f) > { > int ret; > @@ -77,6 +88,13 @@ int main(int argc, char *argv[]) > char *filename, *buf; > size_t len; > struct stat st; > +int opt; > + > +static const struct option options[] = { > +{"help", no_argument, NULL, 'h'}, > +{"show-cpu-info", no_argument, NULL, 's'}, > +{NULL, no_argument, NULL, 0} > +}; Typically we prefer static variables before automatic variables. This block can just be repositioned. > > xch = xc_interface_open(NULL, NULL, 0); > if ( xch == NULL ) > @@ -86,22 +104,25 @@ int main(int argc, char *argv[]) > exit(1); > } > > -if ( argc < 2 ) > +while ( (opt = getopt_long(argc, argv, "hs", options, NULL)) != -1 ) > { > -fprintf(stderr, > -"xen-ucode: Xen microcode updating tool\n" > -"Usage: %s [ | show-cpu-info]\n", argv[0]); > -show_curr_cpu(stderr); > -exit(2); > +switch (opt) > +{ > +case 'h': > +usage(stdout, argv[0]); > +exit(EXIT_SUCCESS); Blank line here, ... > +case 's': > +show_curr_cpu(stdout); > +exit(EXIT_SUCCESS); ... and here. > +default: > +goto ext_err; > +} > } > > -if ( !strcmp(argv[1], "show-cpu-info") ) > -{ > -show_curr_cpu(stdout); > -return 0; > -} > +if ( optind == argc ) > +goto ext_err; This is a change in API, because now "show-cpu-info" needs to take a -- to be interpreted. I suspect we want: if ( optind == argc ) goto ext_err; /* For backwards compatibility to the pre-getopt() cmdline handling */ if ( !strcmp(argv[optind], "show-cpu-info") ) { show_curr_cpu(stdout); return 0; } to retain compatibility. (Also, this means you won't have the code rejected by XenRT when this gets added to Xen.) All of this I can fix on commit, but there's quite a lot in the way of changes. ~Andrew
Re: [PATCH v4 1/4] x86/ucode: Introduce PF_microcode_update2 with flags parameter
On 28/05/2024 4:29 pm, Fouad Hilly wrote: > Refactor microcode_update() by adding flags field. > struct xenpf_microcode_update2 added with uint32_t flags field. > Introduce XENPF_microcode_update2 hypercall with flags field. > > Signed-off-by: Fouad Hilly PF_ in the subject wants expanding to the full name, i.e. XENPF_microcode_update2. Can be fixed on commit. Reviewed-by: Andrew Cooper
Re: [PATCH 2/3] xen/x86: Drop useless non-Kconfig CONFIG_* variables
On 22/05/2024 7:03 am, Jan Beulich wrote: > On 21.05.2024 19:15, Andrew Cooper wrote: >> These are all either completely unused, or do nothing useful. >> >> Signed-off-by: Andrew Cooper > Not an objection, i.e. you're fine to commit as is with Stefano's R-b, yet > still a question: > >> @@ -30,11 +29,8 @@ >> /* Intel P4 currently has largest cache line (L2 line size is 128 bytes). */ >> #define CONFIG_X86_L1_CACHE_SHIFT 7 >> >> -#define CONFIG_ACPI_SRAT 1 >> #define CONFIG_ACPI_CSTATE 1 >> >> -#define CONFIG_WATCHDOG 1 > I wonder if this wouldn't make sense to become a proper Kconfig setting, > thus ... > >> --- a/xen/include/xen/watchdog.h >> +++ b/xen/include/xen/watchdog.h >> @@ -9,8 +9,6 @@ >> >> #include >> >> -#ifdef CONFIG_WATCHDOG >> - >> /* Try to set up a watchdog. */ >> int watchdog_setup(void); >> >> @@ -23,13 +21,4 @@ void watchdog_disable(void); >> /* Is the watchdog currently enabled. */ >> bool watchdog_enabled(void); >> >> -#else >> - >> -#define watchdog_setup() ((void)0) >> -#define watchdog_enable() ((void)0) >> -#define watchdog_disable() ((void)0) >> -#define watchdog_enabled() ((void)0) >> - >> -#endif > ... assigning purpose to these stubs. Actually I need to keep CONFIG_WATCHDOG to not break the build on other architectures. watchdog_{en,dis}able() are called from common/{debugtrace,keyhandler}.c. I'll leave it in place for now, and we can Kconfig it up later. ~Andrew
Re: [PATCH] x86/svm: Rework VMCB_ACCESSORS() to use a plain type name
On Tue, 2024-05-28 at 17:37 +0200, Jan Beulich wrote: > On 28.05.2024 17:32, Andrew Cooper wrote: > > This avoids having a function call in a typeof() expression. > > > > No functional change. > > > > Signed-off-by: Andrew Cooper > > Acked-by: Jan Beulich > Release-Acked-by: Oleksii Kurochko ~ Oleksii
Re: [PATCH] x86/svm: Rework VMCB_ACCESSORS() to use a plain type name
On 28.05.2024 17:32, Andrew Cooper wrote: > This avoids having a function call in a typeof() expression. > > No functional change. > > Signed-off-by: Andrew Cooper Acked-by: Jan Beulich
[PATCH] x86/svm: Rework VMCB_ACCESSORS() to use a plain type name
This avoids having a function call in a typeof() expression. No functional change. Signed-off-by: Andrew Cooper --- CC: Jan Beulich CC: Roger Pau Monné --- xen/arch/x86/include/asm/hvm/svm/vmcb.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/xen/arch/x86/include/asm/hvm/svm/vmcb.h b/xen/arch/x86/include/asm/hvm/svm/vmcb.h index 0396d10b90ba..28f715e37692 100644 --- a/xen/arch/x86/include/asm/hvm/svm/vmcb.h +++ b/xen/arch/x86/include/asm/hvm/svm/vmcb.h @@ -630,7 +630,7 @@ vmcb_get_ ## name(const struct vmcb_struct *vmcb) \ } #define VMCB_ACCESSORS(name, cleanbit) \ -VMCB_ACCESSORS_(name, typeof(alloc_vmcb()->_ ## name), cleanbit) +VMCB_ACCESSORS_(name, typeof(((struct vmcb_struct){})._ ## name), cleanbit) VMCB_ACCESSORS(cr_intercepts, intercepts) VMCB_ACCESSORS(dr_intercepts, intercepts) base-commit: 2d93f78bfe25f695d8ffb61d110da9df293ed71b prerequisite-patch-id: 4f38737ad5ea15b0341664ee1c6b96d0d00b6700 -- 2.30.2
[PATCH v4 4/4] x86/ucode: Utilize ucode_force and remove opt_ucode_allow_same
Pass ucode_force to common micorocde checks and utilize it to allow for microcode downgrade or reapply the same version of the microcode. Update low level Intel and AMD to check for valid signature only. Any version checks is done at core.c. While adding ucode_force, opt_ucode_allow_same was removed. Remove opt_ucode_allow_same from documentation. Signed-off-by: Fouad Hilly --- [4] 1- As opt_ucode_allow_same is not required anymore, it has been removed while introducing ucode_force. 2- Apply the changes for both AMD and Intel. 3- Remove the mention of opt_ucode_allow_same from documentation. --- docs/misc/xen-command-line.pandoc| 7 +-- xen/arch/x86/cpu/microcode/amd.c | 7 --- xen/arch/x86/cpu/microcode/core.c| 9 +++-- xen/arch/x86/cpu/microcode/intel.c | 4 xen/arch/x86/cpu/microcode/private.h | 2 -- 5 files changed, 4 insertions(+), 25 deletions(-) diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc index 1dea7431fab6..a42ce1039fed 100644 --- a/docs/misc/xen-command-line.pandoc +++ b/docs/misc/xen-command-line.pandoc @@ -2648,7 +2648,7 @@ performance. Alternatively, selecting `tsx=1` will re-enable TSX at the users own risk. ### ucode -> `= List of [ | scan=, nmi=, allow-same= ]` +> `= List of [ | scan=, nmi= ]` Applicability: x86 Default: `nmi` @@ -2680,11 +2680,6 @@ precedence over `scan`. stop_machine context. In NMI handler, even NMIs are blocked, which is considered safer. The default value is `true`. -'allow-same' alters the default acceptance policy for new microcode to permit -trying to reload the same version. Many CPUs will actually reload microcode -of the same version, and this allows for easy testing of the late microcode -loading path. - ### unrestricted_guest (Intel) > `= ` diff --git a/xen/arch/x86/cpu/microcode/amd.c b/xen/arch/x86/cpu/microcode/amd.c index f76a563c8b84..4bcc79f1ab2d 100644 --- a/xen/arch/x86/cpu/microcode/amd.c +++ b/xen/arch/x86/cpu/microcode/amd.c @@ -225,13 +225,6 @@ static int cf_check apply_microcode(const struct microcode_patch *patch) if ( result == MIS_UCODE ) return -EINVAL; -/* - * Allow application of the same revision to pick up SMT-specific changes - * even if the revision of the other SMT thread is already up-to-date. - */ -if ( result == OLD_UCODE ) -return -EEXIST; - if ( check_final_patch_levels(sig) ) { printk(XENLOG_ERR diff --git a/xen/arch/x86/cpu/microcode/core.c b/xen/arch/x86/cpu/microcode/core.c index 8a9e744489b9..fc8ad8cfdd76 100644 --- a/xen/arch/x86/cpu/microcode/core.c +++ b/xen/arch/x86/cpu/microcode/core.c @@ -100,8 +100,6 @@ static bool __initdata ucode_scan; /* By default, ucode loading is done in NMI handler */ static bool ucode_in_nmi = true; -bool __read_mostly opt_ucode_allow_same; - /* Protected by microcode_mutex */ static struct microcode_patch *microcode_cache; @@ -128,8 +126,6 @@ static int __init cf_check parse_ucode(const char *s) if ( (val = parse_boolean("nmi", s, ss)) >= 0 ) ucode_in_nmi = val; -else if ( (val = parse_boolean("allow-same", s, ss)) >= 0 ) -opt_ucode_allow_same = val; else if ( !ucode_mod_forced ) /* Not forced by EFI */ { if ( (val = parse_boolean("scan", s, ss)) >= 0 ) @@ -583,6 +579,7 @@ static long cf_check microcode_update_helper(void *data) struct ucode_buf *buffer = data; unsigned int cpu, updated; struct microcode_patch *patch; +bool ucode_force = buffer->flags == XENPF_UCODE_FORCE; /* cpu_online_map must not change during update */ if ( !get_cpu_maps() ) @@ -636,12 +633,12 @@ static long cf_check microcode_update_helper(void *data) microcode_cache); if ( result != NEW_UCODE && - !(opt_ucode_allow_same && result == SAME_UCODE) ) + (!ucode_force || (result & ~SAME_UCODE)) ) { spin_unlock(_mutex); printk(XENLOG_WARNING "microcode: couldn't find any newer%s revision in the provided blob!\n", - opt_ucode_allow_same ? " (or the same)" : ""); + ucode_force? " (or a valid)" : ""); microcode_free_patch(patch); ret = -EEXIST; diff --git a/xen/arch/x86/cpu/microcode/intel.c b/xen/arch/x86/cpu/microcode/intel.c index f505aa1b7888..5e1b528ffe05 100644 --- a/xen/arch/x86/cpu/microcode/intel.c +++ b/xen/arch/x86/cpu/microcode/intel.c @@ -297,10 +297,6 @@ static int cf_check apply_microcode(const struct microcode_patch *patch) if ( result == MIS_UCODE ) return -EINVAL; -if ( result == OLD_UCODE || - (result == SAME_UCODE && !opt_ucode_allow_same) ) -return -EEXIST; - wbinvd(); wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)patch->data); diff --git a/xen/arch/x86/cpu/microcode/private.h
[PATCH v4 2/4] x86/ucode: refactor xen-ucode to utilize getopt
Use getopt_long() to handle command line arguments. Introduce ext_err for common exit with errors. xc_microcode_update() refactored to accept flags and utilize xenpf_microcode_update2 Introducing usage() to handle usage\help messages in a common block. show_curr_cpu is printed to stdout only. Signed-off-by: Fouad Hilly --- [v4] 1- Merge three patches into one. 2- usage() to print messages to the correct stream. 3- Update commit message and description. --- tools/misc/xen-ucode.c | 51 -- 1 file changed, 39 insertions(+), 12 deletions(-) diff --git a/tools/misc/xen-ucode.c b/tools/misc/xen-ucode.c index 390969db3d1c..6f9dd2a7e431 100644 --- a/tools/misc/xen-ucode.c +++ b/tools/misc/xen-ucode.c @@ -11,12 +11,23 @@ #include #include #include +#include static xc_interface *xch; static const char intel_id[] = "GenuineIntel"; static const char amd_id[] = "AuthenticAMD"; +static void usage(FILE *stream, const char *name) +{ +fprintf(stream, "%s: Xen microcode updating tool\n" + "Usage: %s [microcode file] [options]\n" + "options:\n" + " -h, --helpdisplay this help and exit\n" + " -s, --show-cpu-info show CPU information and exit\n", + name, name); +} + static void show_curr_cpu(FILE *f) { int ret; @@ -77,6 +88,13 @@ int main(int argc, char *argv[]) char *filename, *buf; size_t len; struct stat st; +int opt; + +static const struct option options[] = { +{"help", no_argument, NULL, 'h'}, +{"show-cpu-info", no_argument, NULL, 's'}, +{NULL, no_argument, NULL, 0} +}; xch = xc_interface_open(NULL, NULL, 0); if ( xch == NULL ) @@ -86,22 +104,25 @@ int main(int argc, char *argv[]) exit(1); } -if ( argc < 2 ) +while ( (opt = getopt_long(argc, argv, "hs", options, NULL)) != -1 ) { -fprintf(stderr, -"xen-ucode: Xen microcode updating tool\n" -"Usage: %s [ | show-cpu-info]\n", argv[0]); -show_curr_cpu(stderr); -exit(2); +switch (opt) +{ +case 'h': +usage(stdout, argv[0]); +exit(EXIT_SUCCESS); +case 's': +show_curr_cpu(stdout); +exit(EXIT_SUCCESS); +default: +goto ext_err; +} } -if ( !strcmp(argv[1], "show-cpu-info") ) -{ -show_curr_cpu(stdout); -return 0; -} +if ( optind == argc ) +goto ext_err; -filename = argv[1]; +filename = argv[optind]; fd = open(filename, O_RDONLY); if ( fd < 0 ) { @@ -146,4 +167,10 @@ int main(int argc, char *argv[]) close(fd); return 0; + + ext_err: +fprintf(stderr, +"%s: unable to process command line arguments\n", argv[0]); +usage(stderr, argv[0]); +exit(EXIT_FAILURE); } -- 2.42.0
[PATCH v4 0/4] x86/xen-ucode: Introduce --force option
Refactor and introduce --force option to xen-ucode, which skips microcode version check when updating x86 CPU micocode. A new hypercall introduced with flags field to facilitate the new option and allow for future flags as needed. This change is required to enable developers to load ucode that is the same version as the one already loaded or downgrade for testing. Suggested-by: Andrew Cooper Fouad Hilly (4): x86/ucode: Introduce PF_microcode_update2 with flags parameter x86/ucode: refactor xen-ucode to utilize getopt x86/ucode: Introduce --force option to xen-ucode to force skipping microcode version check x86/ucode: Utilize ucode_force and remove opt_ucode_allow_same docs/misc/xen-command-line.pandoc| 7 +--- tools/include/xenctrl.h | 3 +- tools/libs/ctrl/xc_misc.c| 12 +++--- tools/misc/xen-ucode.c | 61 ++-- xen/arch/x86/cpu/microcode/amd.c | 7 xen/arch/x86/cpu/microcode/core.c| 20 + xen/arch/x86/cpu/microcode/intel.c | 4 -- xen/arch/x86/cpu/microcode/private.h | 2 - xen/arch/x86/include/asm/microcode.h | 3 +- xen/arch/x86/platform_hypercall.c| 13 +- xen/include/public/platform.h| 14 +++ 11 files changed, 97 insertions(+), 49 deletions(-) -- 2.42.0
[PATCH v4 1/4] x86/ucode: Introduce PF_microcode_update2 with flags parameter
Refactor microcode_update() by adding flags field. struct xenpf_microcode_update2 added with uint32_t flags field. Introduce XENPF_microcode_update2 hypercall with flags field. Signed-off-by: Fouad Hilly --- [v4] 1- Commit message and description updated. 2- Changing the order of the patches. [v3] 1- Updated Commit message description. 2- Revereted changes to a stable ABI and introduced a new struct. 3- ucode_force_flag updated from static to a local variable. 4- microcode_update() updated to reject unsupported flags yet. [v2] 1- Update message description to highlight interface change. 2- Removed extra empty lines. 3- removed unnecessary define. 4- Corrected long lines. 5- Removed ternary operator. 6- Introduced static ucode_update_flags, which will be used later to determine local ucode_force_flag. --- xen/arch/x86/cpu/microcode/core.c| 11 --- xen/arch/x86/include/asm/microcode.h | 3 ++- xen/arch/x86/platform_hypercall.c| 13 - xen/include/public/platform.h| 14 ++ 4 files changed, 36 insertions(+), 5 deletions(-) diff --git a/xen/arch/x86/cpu/microcode/core.c b/xen/arch/x86/cpu/microcode/core.c index e90055772acf..8a9e744489b9 100644 --- a/xen/arch/x86/cpu/microcode/core.c +++ b/xen/arch/x86/cpu/microcode/core.c @@ -40,6 +40,8 @@ #include #include +#include + #include "private.h" /* @@ -570,6 +572,7 @@ static int cf_check do_microcode_update(void *patch) } struct ucode_buf { +unsigned int flags; unsigned int len; char buffer[]; }; @@ -708,13 +711,14 @@ static long cf_check microcode_update_helper(void *data) return ret; } -int microcode_update(XEN_GUEST_HANDLE(const_void) buf, unsigned long len) +int microcode_update(XEN_GUEST_HANDLE(const_void) buf, + unsigned long len, unsigned int flags) { int ret; struct ucode_buf *buffer; -if ( len != (uint32_t)len ) -return -E2BIG; +if ( flags & ~XENPF_UCODE_FORCE ) +return -EINVAL; if ( !ucode_ops.apply_microcode ) return -EINVAL; @@ -730,6 +734,7 @@ int microcode_update(XEN_GUEST_HANDLE(const_void) buf, unsigned long len) return -EFAULT; } buffer->len = len; +buffer->flags = flags; /* * Always queue microcode_update_helper() on CPU0. Most of the logic diff --git a/xen/arch/x86/include/asm/microcode.h b/xen/arch/x86/include/asm/microcode.h index 8f59b20b0289..57c08205d475 100644 --- a/xen/arch/x86/include/asm/microcode.h +++ b/xen/arch/x86/include/asm/microcode.h @@ -22,7 +22,8 @@ struct cpu_signature { DECLARE_PER_CPU(struct cpu_signature, cpu_sig); void microcode_set_module(unsigned int idx); -int microcode_update(XEN_GUEST_HANDLE(const_void) buf, unsigned long len); +int microcode_update(XEN_GUEST_HANDLE(const_void) buf, + unsigned long len, unsigned int flags); int early_microcode_init(unsigned long *module_map, const struct multiboot_info *mbi); int microcode_init_cache(unsigned long *module_map, diff --git a/xen/arch/x86/platform_hypercall.c b/xen/arch/x86/platform_hypercall.c index 95467b88ab64..7e3278109300 100644 --- a/xen/arch/x86/platform_hypercall.c +++ b/xen/arch/x86/platform_hypercall.c @@ -311,7 +311,18 @@ ret_t do_platform_op( guest_from_compat_handle(data, op->u.microcode.data); -ret = microcode_update(data, op->u.microcode.length); +ret = microcode_update(data, op->u.microcode.length, 0); +break; +} + +case XENPF_microcode_update2: +{ +XEN_GUEST_HANDLE(const_void) data; + +guest_from_compat_handle(data, op->u.microcode2.data); + +ret = microcode_update(data, op->u.microcode2.length, + op->u.microcode2.flags); break; } diff --git a/xen/include/public/platform.h b/xen/include/public/platform.h index 15777b541690..2725b8d1044f 100644 --- a/xen/include/public/platform.h +++ b/xen/include/public/platform.h @@ -624,6 +624,19 @@ struct xenpf_ucode_revision { typedef struct xenpf_ucode_revision xenpf_ucode_revision_t; DEFINE_XEN_GUEST_HANDLE(xenpf_ucode_revision_t); +/* Hypercall to microcode_update with flags */ +#define XENPF_microcode_update266 +struct xenpf_microcode_update2 { +/* IN variables. */ +uint32_t flags; /* Flags to be passed with ucode. */ +/* Force to skip microcode version check */ +#define XENPF_UCODE_FORCE 1 +uint32_t length; /* Length of microcode data. */ +XEN_GUEST_HANDLE(const_void) data;/* Pointer to microcode data */ +}; +typedef struct xenpf_microcode_update2 xenpf_microcode_update2_t; +DEFINE_XEN_GUEST_HANDLE(xenpf_microcode_update2_t); + /* * ` enum neg_errnoval * ` HYPERVISOR_platform_op(const struct xen_platform_op*); @@ -656,6 +669,7 @@ struct xen_platform_op { xenpf_symdata_t symdata; xenpf_dom0_console_t dom0_console;
[PATCH v4 3/4] x86/ucode: Introduce --force option to xen-ucode to force skipping microcode version check
Introduce --force option to xen-ucode to force skipping microcode version check, which allows the user to update x86 microcode even if both versions are the same or downgrade. xc_microcode_update() refactored to accept flags and utilize xenpf_microcode_update2. Signed-off-by: Fouad Hilly --- [4] 1- Add --force to xen-ucode options. 2- Update xc_microcode_update() to accept and handle flags. --- tools/include/xenctrl.h | 3 ++- tools/libs/ctrl/xc_misc.c | 12 +++- tools/misc/xen-ucode.c| 14 +++--- 3 files changed, 20 insertions(+), 9 deletions(-) diff --git a/tools/include/xenctrl.h b/tools/include/xenctrl.h index 499685594427..7fb409bc6dc4 100644 --- a/tools/include/xenctrl.h +++ b/tools/include/xenctrl.h @@ -1171,7 +1171,8 @@ typedef uint32_t xc_node_to_node_dist_t; int xc_physinfo(xc_interface *xch, xc_physinfo_t *info); int xc_cputopoinfo(xc_interface *xch, unsigned *max_cpus, xc_cputopo_t *cputopo); -int xc_microcode_update(xc_interface *xch, const void *buf, size_t len); +int xc_microcode_update(xc_interface *xch, const void *buf, +size_t len, unsigned int flags); int xc_get_cpu_version(xc_interface *xch, struct xenpf_pcpu_version *cpu_ver); int xc_get_ucode_revision(xc_interface *xch, struct xenpf_ucode_revision *ucode_rev); diff --git a/tools/libs/ctrl/xc_misc.c b/tools/libs/ctrl/xc_misc.c index 50282fd60dcc..6a60216bda03 100644 --- a/tools/libs/ctrl/xc_misc.c +++ b/tools/libs/ctrl/xc_misc.c @@ -203,11 +203,12 @@ int xc_physinfo(xc_interface *xch, return 0; } -int xc_microcode_update(xc_interface *xch, const void *buf, size_t len) +int xc_microcode_update(xc_interface *xch, const void *buf, +size_t len, unsigned int flags) { int ret; struct xen_platform_op platform_op = {}; -DECLARE_HYPERCALL_BUFFER(struct xenpf_microcode_update, uc); +DECLARE_HYPERCALL_BUFFER(struct xenpf_microcode_update2, uc); uc = xc_hypercall_buffer_alloc(xch, uc, len); if ( uc == NULL ) @@ -215,9 +216,10 @@ int xc_microcode_update(xc_interface *xch, const void *buf, size_t len) memcpy(uc, buf, len); -platform_op.cmd = XENPF_microcode_update; -platform_op.u.microcode.length = len; -set_xen_guest_handle(platform_op.u.microcode.data, uc); +platform_op.cmd = XENPF_microcode_update2; +platform_op.u.microcode2.length = len; +platform_op.u.microcode2.flags = flags; +set_xen_guest_handle(platform_op.u.microcode2.data, uc); ret = do_platform_op(xch, _op); diff --git a/tools/misc/xen-ucode.c b/tools/misc/xen-ucode.c index 6f9dd2a7e431..b878edf2399a 100644 --- a/tools/misc/xen-ucode.c +++ b/tools/misc/xen-ucode.c @@ -13,6 +13,8 @@ #include #include +#include + static xc_interface *xch; static const char intel_id[] = "GenuineIntel"; @@ -24,7 +26,8 @@ static void usage(FILE *stream, const char *name) "Usage: %s [microcode file] [options]\n" "options:\n" " -h, --helpdisplay this help and exit\n" - " -s, --show-cpu-info show CPU information and exit\n", + " -s, --show-cpu-info show CPU information and exit\n" + " -f, --force force to skip microcode version check\n", name, name); } @@ -89,10 +92,12 @@ int main(int argc, char *argv[]) size_t len; struct stat st; int opt; +uint32_t ucode_flags = 0; static const struct option options[] = { {"help", no_argument, NULL, 'h'}, {"show-cpu-info", no_argument, NULL, 's'}, +{"force", no_argument, NULL, 'f'}, {NULL, no_argument, NULL, 0} }; @@ -104,7 +109,7 @@ int main(int argc, char *argv[]) exit(1); } -while ( (opt = getopt_long(argc, argv, "hs", options, NULL)) != -1 ) +while ( (opt = getopt_long(argc, argv, "hsf", options, NULL)) != -1 ) { switch (opt) { @@ -114,6 +119,9 @@ int main(int argc, char *argv[]) case 's': show_curr_cpu(stdout); exit(EXIT_SUCCESS); +case 'f': +ucode_flags = XENPF_UCODE_FORCE; +break; default: goto ext_err; } @@ -147,7 +155,7 @@ int main(int argc, char *argv[]) } errno = 0; -ret = xc_microcode_update(xch, buf, len); +ret = xc_microcode_update(xch, buf, len, ucode_flags); if ( ret == -1 && errno == EEXIST ) printf("Microcode already up to date\n"); else if ( ret ) -- 2.42.0
[xen-unstable-smoke test] 186170: tolerable all pass - PUSHED
flight 186170 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/186170/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass version targeted for testing: xen 2d93f78bfe25f695d8ffb61d110da9df293ed71b baseline version: xen 96af090e33130b0bf0953f3ccab8e7a163392318 Last test of basis 186164 2024-05-28 02:00:23 Z0 days Testing same since 186166 2024-05-28 07:02:06 Z0 days2 attempts People who touched revisions under test: Jan Beulich Jason Andryuk Nicola Vetrini Oleksii Kurochko jobs: build-arm64-xsm pass build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-arm64-arm64-xl-xsm pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/xen.git 96af090e33..2d93f78bfe 2d93f78bfe25f695d8ffb61d110da9df293ed71b -> smoke
[PATCH v2 for-4.19 0.5/13] xen: Introduce CONFIG_SELF_TESTS
... and move x86's stub_selftest() under this new option. There is value in having these tests included in release builds too. It will shortly be used to gate the bitops unit tests on all architectures. Signed-off-by: Andrew Cooper --- CC: Jan Beulich CC: Roger Pau Monné CC: Wei Liu CC: Stefano Stabellini CC: Julien Grall CC: Volodymyr Babchuk CC: Bertrand Marquis CC: Michal Orzel CC: Oleksii Kurochko CC: Shawn Anastasio CC: consult...@bugseng.com CC: Simone Ballarin CC: Federico Serafini CC: Nicola Vetrini v2.5: * As suggested in "[PATCH v2 05/13] xen/bitops: Implement generic_f?sl() in lib/" I've gone with SELF_TESTS rather than BOOT_TESTS, because already in bitops we've got compile time tests (which aren't strictly boot time), and the livepatching testing wants to be included here and is definitely not boot time. --- xen/Kconfig.debug | 6 ++ xen/arch/x86/extable.c | 4 ++-- xen/arch/x86/setup.c | 2 +- 3 files changed, 9 insertions(+), 3 deletions(-) diff --git a/xen/Kconfig.debug b/xen/Kconfig.debug index 61b24ac552cd..07ff7eb7ba83 100644 --- a/xen/Kconfig.debug +++ b/xen/Kconfig.debug @@ -29,6 +29,12 @@ config FRAME_POINTER maybe slower, but it gives very useful debugging information in case of any Xen bugs. +config SELF_TESTS + bool "Extra self-testing" + default DEBUG + help + Enable extra unit and functional testing. + config COVERAGE bool "Code coverage support" depends on !LIVEPATCH diff --git a/xen/arch/x86/extable.c b/xen/arch/x86/extable.c index 8415cd1fa249..705cf9eb94ca 100644 --- a/xen/arch/x86/extable.c +++ b/xen/arch/x86/extable.c @@ -144,7 +144,7 @@ search_exception_table(const struct cpu_user_regs *regs, unsigned long *stub_ra) return 0; } -#ifdef CONFIG_DEBUG +#ifdef CONFIG_SELF_TESTS #include #include @@ -214,7 +214,7 @@ int __init cf_check stub_selftest(void) return 0; } __initcall(stub_selftest); -#endif +#endif /* CONFIG_SELF_TESTS */ unsigned long asmlinkage search_pre_exception_table(struct cpu_user_regs *regs) { diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c index b50c9c84af6d..dd51e68dbe5b 100644 --- a/xen/arch/x86/setup.c +++ b/xen/arch/x86/setup.c @@ -742,7 +742,7 @@ static void noreturn init_done(void) system_state = SYS_STATE_active; /* Re-run stub recovery self-tests with CET-SS active. */ -if ( IS_ENABLED(CONFIG_DEBUG) && cpu_has_xen_shstk ) +if ( IS_ENABLED(CONFIG_SELF_TESTS) && cpu_has_xen_shstk ) stub_selftest(); domain_unpause_by_systemcontroller(dom0); base-commit: 2d93f78bfe25f695d8ffb61d110da9df293ed71b -- 2.30.2
Re: [PATCH v2 8/8] xen/x86: Synthesise domain topologies
On 27/05/2024 09:20, Roger Pau Monné wrote: > On Fri, May 24, 2024 at 06:16:01PM +0100, Alejandro Vallejo wrote: >> On 24/05/2024 09:58, Roger Pau Monné wrote: >>> On Wed, May 08, 2024 at 01:39:27PM +0100, Alejandro Vallejo wrote: >>> +rc = x86_topo_from_parts(>policy, threads_per_core, cores_per_pkg); >>> >>> I assume this generates the same topology as the current code, or will >>> the population of the leaves be different in some way? >>> >> >> The current code does not populate 0xb. This generates a topology >> consistent with the existing INTENDED topology. The actual APIC IDs will >> be different though (because there's no skipping of odd values). >> >> All the dance in patch 1 was to make this migrate-safe. The x2apic ID is >> stored in the lapic hidden regs so differences with previous behaviour >> don't matter. > > What about systems without CPU policy in the migration stream, will > those also get restored as expected? > > I think you likely need to check whether 'restore' is set and keep the > old logic in that case? > > As otherwise migrated systems without a CPU policy will get the new > topology information instead of the old one? Bah. I hoped the x2apic ID restoration would mean I could get away with removing all that junk, but migrations from Xen v.StoneAge do cause mayhem. And it'd be very bizarre because the new topology leaves would not reflect the existing x2apic IDs either. I'll condense that blob of nonsense with the old scheme into a separate function so we can easily deprecate it in the future. > >> IOW, The differences are: >> * 0xb is exposed, whereas previously it wasn't >> * APIC IDs are compacted such that new_apicid=old_apicid/2 >> * There's also a cleanup of the murkier paths to put the right core >> counts in the right leaves (whereas previously it was bonkers) > > This needs to be in the commit message IMO. > Sure. >>> >>> Note that currently the host policy also gets the topology leaves >>> cleared, is it intended to not clear them anymore after this change? >>> >>> (as you only clear the leaves for the guest {max,def} policies) >>> >>> Thanks, Roger. >> >> It was like that originally in v1, I changed in v2 as part of feedback >> from Jan. > > I think that's fine, but this divergence from current behavior of > cleaning the topology for the host policy needs to be mentioned in > the commit message. > > Thanks, Roger. Sure. Cheers, Alejandro
Re: [PATCH v2 6/8] xen/lib: Add topology generator for x86
On 23/05/2024 17:50, Roger Pau Monné wrote: > On Wed, May 08, 2024 at 01:39:25PM +0100, Alejandro Vallejo wrote: >> Add a helper to populate topology leaves in the cpu policy from >> threads/core and cores/package counts. >> >> No functional change, as it's not connected to anything yet. > > There is a functional change in test-cpu-policy.c. > > Maybe the commit message needs to be updated to reflect the added > testing to test-cpu-policy.c using the newly introduced helper to > generate topologies? > Sure to this and all formatting feedback below. >> >> Signed-off-by: Alejandro Vallejo >> --- >> v2: >> * New patch. Extracted from v1/patch6 >> --- >> tools/tests/cpu-policy/test-cpu-policy.c | 128 +++ >> xen/include/xen/lib/x86/cpu-policy.h | 16 +++ >> xen/lib/x86/policy.c | 86 +++ >> 3 files changed, 230 insertions(+) >> >> diff --git a/tools/tests/cpu-policy/test-cpu-policy.c >> b/tools/tests/cpu-policy/test-cpu-policy.c >> index 301df2c00285..0ba8c418b1b3 100644 >> --- a/tools/tests/cpu-policy/test-cpu-policy.c >> +++ b/tools/tests/cpu-policy/test-cpu-policy.c >> @@ -650,6 +650,132 @@ static void test_is_compatible_failure(void) >> } >> } >> >> +static void test_topo_from_parts(void) >> +{ >> +static const struct test { >> +unsigned int threads_per_core; >> +unsigned int cores_per_pkg; >> +struct cpu_policy policy; >> +} tests[] = { >> +{ >> +.threads_per_core = 3, .cores_per_pkg = 1, >> +.policy = { >> +.x86_vendor = X86_VENDOR_AMD, >> +.topo.subleaf = { >> +[0] = { .nr_logical = 3, .level = 0, .type = 1, >> .id_shift = 2, }, >> +[1] = { .nr_logical = 1, .level = 1, .type = 2, >> .id_shift = 2, }, >> +}, >> +}, >> +}, >> +{ >> +.threads_per_core = 1, .cores_per_pkg = 3, >> +.policy = { >> +.x86_vendor = X86_VENDOR_AMD, >> +.topo.subleaf = { >> +[0] = { .nr_logical = 1, .level = 0, .type = 1, >> .id_shift = 0, }, >> +[1] = { .nr_logical = 3, .level = 1, .type = 2, >> .id_shift = 2, }, >> +}, >> +}, >> +}, >> +{ >> +.threads_per_core = 7, .cores_per_pkg = 5, >> +.policy = { >> +.x86_vendor = X86_VENDOR_AMD, >> +.topo.subleaf = { >> +[0] = { .nr_logical = 7, .level = 0, .type = 1, >> .id_shift = 3, }, >> +[1] = { .nr_logical = 5, .level = 1, .type = 2, >> .id_shift = 6, }, >> +}, >> +}, >> +}, >> +{ >> +.threads_per_core = 2, .cores_per_pkg = 128, >> +.policy = { >> +.x86_vendor = X86_VENDOR_AMD, >> +.topo.subleaf = { >> +[0] = { .nr_logical = 2, .level = 0, .type = 1, >> .id_shift = 1, }, >> +[1] = { .nr_logical = 128, .level = 1, .type = 2, >> .id_shift = 8, }, >> +}, >> +}, >> +}, >> +{ >> +.threads_per_core = 3, .cores_per_pkg = 1, >> +.policy = { >> +.x86_vendor = X86_VENDOR_INTEL, >> +.topo.subleaf = { >> +[0] = { .nr_logical = 3, .level = 0, .type = 1, >> .id_shift = 2, }, >> +[1] = { .nr_logical = 3, .level = 1, .type = 2, >> .id_shift = 2, }, >> +}, >> +}, >> +}, >> +{ >> +.threads_per_core = 1, .cores_per_pkg = 3, >> +.policy = { >> +.x86_vendor = X86_VENDOR_INTEL, >> +.topo.subleaf = { >> +[0] = { .nr_logical = 1, .level = 0, .type = 1, >> .id_shift = 0, }, >> +[1] = { .nr_logical = 3, .level = 1, .type = 2, >> .id_shift = 2, }, >> +}, >> +}, >> +}, >> +{ >> +.threads_per_core = 7, .cores_per_pkg = 5, >> +.policy = { >> +.x86_vendor = X86_VENDOR_INTEL, >> +.topo.subleaf = { >> +[0] = { .nr_logical = 7, .level = 0, .type = 1, >> .id_shift = 3, }, >> +[1] = { .nr_logical = 35, .level = 1, .type = 2, >> .id_shift = 6, }, >> +}, >> +}, >> +}, >> +{ >> +.threads_per_core = 2, .cores_per_pkg = 128, >> +.policy = { >> +.x86_vendor = X86_VENDOR_INTEL, >> +.topo.subleaf = { >> +[0] = { .nr_logical = 2, .level = 0, .type = 1, >> .id_shift = 1, }, >> +[1] = { .nr_logical = 256, .level = 1, .type = 2, >> .id_shift = 8, }, > > You don't need the array index in the initialization: > > .topo.subleaf = {
Re: [PATCH v2 05/13] xen/bitops: Implement generic_f?sl() in lib/
On 27/05/2024 9:44 am, Jan Beulich wrote: > On 24.05.2024 22:03, Andrew Cooper wrote: >> generic_f?s() being static inline is the cause of lots of the complexity >> between the common and arch-specific bitops.h >> >> They appear to be static inline for constant-folding reasons (ARM uses them >> for this), but there are better ways to achieve the same effect. >> >> It is presumptuous that an unrolled binary search is the right algorithm to >> use on all microarchitectures. Indeed, it's not for the eventual users, but >> that can be addressed at a later point. >> >> It is also nonsense to implement the int form as the base primitive and >> construct the long form from 2x int in 64-bit builds, when it's just one >> extra >> step to operate at the native register width. >> >> Therefore, implement generic_f?sl() in lib/. They're not actually needed in >> x86/ARM/PPC by the end of the cleanup (i.e. the functions will be dropped by >> the linker), and they're only expected be needed by RISC-V on hardware which >> lacks the Zbb extension. >> >> Implement generic_fls() in terms of generic_flsl() for now, but this will be >> cleaned up in due course. >> >> Provide basic runtime testing using __constructor inside the lib/ file. This >> is important, as it means testing runs if and only if generic_f?sl() are used >> elsewhere in Xen. >> >> Signed-off-by: Andrew Cooper > Reviewed-by: Jan Beulich Thanks. > with a suggestion and a question. > >> I suspect we want to swap CONFIG_DEBUG for CONFIG_BOOT_UNIT_TESTS in due >> course. These ought to be able to be used in a release build too. > +1 Actually - I might as well do this now. Start as we mean to go on. > >> --- /dev/null >> +++ b/xen/lib/generic-ffsl.c >> @@ -0,0 +1,65 @@ >> +/* SPDX-License-Identifier: GPL-2.0 */ >> + >> +#include >> +#include >> +#include >> + >> +unsigned int generic_ffsl(unsigned long x) >> +{ >> +unsigned int r = 1; >> + >> +if ( !x ) >> +return 0; >> + >> +#if BITS_PER_LONG > 32 > To be future-proof, perhaps ahead of this > > #if BITS_PER_LONG > 64 > # error "..." > #endif > > or a functionally similar BUILD_BUG_ON()? Good point. I'll fold this in to both files. > >> --- /dev/null >> +++ b/xen/lib/generic-flsl.c >> @@ -0,0 +1,68 @@ >> +/* SPDX-License-Identifier: GPL-2.0 */ >> + >> +#include >> +#include >> +#include >> + >> +/* Mask of type UL with the upper x bits set. */ >> +#define UPPER_MASK(x) (~0UL << (BITS_PER_LONG - (x))) >> + >> +unsigned int generic_flsl(unsigned long x) >> +{ >> +unsigned int r = BITS_PER_LONG; >> + >> +if ( !x ) >> +return 0; >> + >> +#if BITS_PER_LONG > 32 >> +if ( !(x & UPPER_MASK(32)) ) >> +{ >> +x <<= 32; >> +r -= 32; >> +} >> +#endif >> +if ( !(x & UPPER_MASK(16)) ) >> +{ >> +x <<= 16; >> +r -= 16; >> +} >> +if ( !(x & UPPER_MASK(8)) ) >> +{ >> +x <<= 8; >> +r -= 8; >> +} >> +if ( !(x & UPPER_MASK(4)) ) >> +{ >> +x <<= 4; >> +r -= 4; >> +} >> +if ( !(x & UPPER_MASK(2)) ) >> +{ >> +x <<= 2; >> +r -= 2; >> +} >> +if ( !(x & UPPER_MASK(1)) ) >> +{ >> +x <<= 1; >> +r -= 1; >> +} >> + >> +return r; >> +} > While, as you say, the expectation is for this code to not commonly come > into actual use, I still find the algorithm a little inefficient in terms > of the constants used, specifically considering how they would need > instantiating in resulting assembly. It may be that Arm's fancy constant- > move insns can actually efficiently synthesize them, but I think on most > other architectures it would be more efficient (and presumably no less > efficient on Arm) to shift the "remaining" value right, thus allowing for > successively smaller (and hence easier to instantiate) constants to be > used. ARM can only synthesise UPPER_MASK(16) and narrower masks, I think. That said, I'm not concerned about the (in)efficiency seeing as this doesn't get included in x86/ARM/PPC builds by the end of the series. It's RISC-V which matters, and I'm pretty sure this is the wrong algorithm to be using. Incidentally, this algorithm is terrible for superscalar pipelines, because each branch is inherently unpredictable. Both these files want rewriting based on an analysis of the H-capable Zbb-incapable RISC-V cores which exist. I expect that what we actually want is the De Bruijn form which is an O(1) algorithm, given a decent hardware multiplier. If not, there's a loop form which I expect would still be better than this. ~Andrew
Re: [PATCH v2 07/13] x86/bitops: Improve arch_ffs() in the general case
On 28.05.2024 14:30, Andrew Cooper wrote: > On 27/05/2024 2:37 pm, Jan Beulich wrote: >> On 27.05.2024 15:27, Jan Beulich wrote: >>> On 24.05.2024 22:03, Andrew Cooper wrote: --- a/xen/arch/x86/include/asm/bitops.h +++ b/xen/arch/x86/include/asm/bitops.h @@ -432,12 +432,28 @@ static inline int ffsl(unsigned long x) static always_inline unsigned int arch_ffs(unsigned int x) { -int r; +unsigned int r; + +if ( __builtin_constant_p(x > 0) && x > 0 ) +{ +/* Safe, when the compiler knows that x is nonzero. */ +asm ( "bsf %[val], %[res]" + : [res] "=r" (r) + : [val] "rm" (x) ); +} >>> In patch 11 relevant things are all in a single patch, making it easier >>> to spot that this is dead code: The sole caller already has a >>> __builtin_constant_p(), hence I don't see how the one here could ever >>> return true. With that the respective part of the description is then >>> questionable, too, I'm afraid: Where did you observe any actual effect >>> from this? Or if you did - what am I missing? >> Hmm, thinking about it: I suppose that's why you have >> __builtin_constant_p(x > 0), not __builtin_constant_p(x). I have to admit >> I'm (positively) surprised that the former may return true when the latter >> doesn't. > > So was I, but this recommendation came straight from the GCC mailing > list. And it really does work, even back in obsolete versions of GCC. > > __builtin_constant_p() operates on an expression not a value, and is > documented as such. Of course. >> Nevertheless I'm inclined to think this deserves a brief comment. > > There is a comment, and it's even visible in the snippet. The comment is about the asm(); it is neither placed to clearly relate to __builtin_constant_p(), nor is it saying anything about this specific property of it. You said you were equally surprised; don't you think that when both of us are surprised, a specific (even if brief) comment is warranted? >> As an aside, to better match the comment inside the if()'s body, how about >> >> if ( __builtin_constant_p(!!x) && x ) >> >> ? That also may make a little more clear that this isn't just a style >> choice, but actually needed for the intended purpose. > > I am not changing the logic. > > Apart from anything else, your suggestion is trivially buggy. I care > about whether the RHS collapses to a constant, and the only way of doing > that correctly is asking the compiler about the *exact* expression. > Asking about some other expression which you hope - but do not know - > that the compiler will treat equivalently is bogus. It would be > strictly better to only take the else clause, than to have both halves > emitted. > > This is the form I've tested extensively. It's also the clearest form > IMO. You can experiment with alternative forms when we're not staring > down code freeze of 4.19. "Clearest form" is almost always a matter of taste. To me, comparing unsigned values with > or < against 0 is generally at least suspicious. Using != is typically better (again: imo), and simply omitting the != 0 then is shorter with no difference in effect. Except in peculiar cases like this one, where indeed it took me some time to figure why the comparison operator may not be omitted. All that said: I'm not going to insist on any change; the R-b previously offered still stands. I would highly appreciate though if the (further) comment asked for could be added. What I definitely dislike here is you - not for the first time - turning down remarks because a change of yours is late. This feels even more so bad when considering that I'm typically replying to your patches with pretty little turnaround. Whereas various of mine, pending in part for years, do not seem to deserve any review comments at all. Unlike before, where it was "only" improvements or feature additions, meanwhile even bug fixes are left sit like that. If I may be blunt: This may not work this way for much longer. At some point I will need to artificially delay reviews, making them dependent on my own work also being allowed to make progress. I question though whether that would be in everyone's interest. Jan
[PATCH v4.2] xen/p2m: put reference for level 2 superpage
From: Penny Zheng We are doing foreign memory mapping for static shared memory, and there is a great possibility that it could be super mapped. But today, p2m_put_l3_page could not handle superpages. This commits implements a new function p2m_put_l2_superpage to handle level 2 superpages, specifically for helping put extra references for foreign superpages. Modify relinquish_p2m_mapping as well to take into account preemption when we have a level-2 foreign mapping. Currently level 1 superpages are not handled because Xen is not preemptible and therefore some work is needed to handle such superpages, for which at some point Xen might end up freeing memory and therefore for such a big mapping it could end up in a very long operation. Signed-off-by: Penny Zheng Signed-off-by: Luca Fancellu --- v4.2 changes: - rework commit message to don't explicit say the size of the mapping but only say the level, remove unnecessary p2m_is_superpage() check in p2m_put_page, remove (level >= 2) condition to call p2m_put_page inside p2m_free_entry because the code in that function can already handle the levels, move TODO comment inside p2m_put_page, avoid mentioning 2MB in the comments. (Julien) - This patch is meant to superseed the homonymous patch in the serie: https://patchwork.kernel.org/project/xen-devel/cover/20240524124055.3871399-1-luca.fance...@arm.com/ v4 changes: - optimised the path to call put_page() on the foreign mapping as Julien suggested. Add a comment in p2m_put_l2_superpage to state that any changes needs to take into account some change in the relinquish code. (Julien) v3 changes: - Add reasoning why we don't support now 1GB superpage, remove level_order variable from p2m_put_l2_superpage, update TODO comment inside p2m_free_entry, use XEN_PT_LEVEL_ORDER(2) instead of value 9 inside relinquish_p2m_mapping. (Michal) v2: - Do not handle 1GB super page as there might be some issue where a lot of calls to put_page(...) might be issued which could lead to free memory that is a long operation. v1: - patch from https://patchwork.kernel.org/project/xen-devel/patch/20231206090623.1932275-9-penny.zh...@arm.com/ --- xen/arch/arm/mmu/p2m.c | 81 +++--- 1 file changed, 61 insertions(+), 20 deletions(-) diff --git a/xen/arch/arm/mmu/p2m.c b/xen/arch/arm/mmu/p2m.c index 41fcca011cf4..1725cca649b5 100644 --- a/xen/arch/arm/mmu/p2m.c +++ b/xen/arch/arm/mmu/p2m.c @@ -753,34 +753,72 @@ static int p2m_mem_access_radix_set(struct p2m_domain *p2m, gfn_t gfn, return rc; } -/* - * Put any references on the single 4K page referenced by pte. - * TODO: Handle superpages, for now we only take special references for leaf - * pages (specifically foreign ones, which can't be super mapped today). - */ -static void p2m_put_l3_page(const lpae_t pte) +static void p2m_put_foreign_page(struct page_info *pg) { -mfn_t mfn = lpae_get_mfn(pte); - -ASSERT(p2m_is_valid(pte)); - /* - * TODO: Handle other p2m types - * * It's safe to do the put_page here because page_alloc will * flush the TLBs if the page is reallocated before the end of * this loop. */ -if ( p2m_is_foreign(pte.p2m.type) ) +put_page(pg); +} + +/* Put any references on the single 4K page referenced by mfn. */ +static void p2m_put_l3_page(mfn_t mfn, p2m_type_t type) +{ +/* TODO: Handle other p2m types */ +if ( p2m_is_foreign(type) ) { ASSERT(mfn_valid(mfn)); -put_page(mfn_to_page(mfn)); +p2m_put_foreign_page(mfn_to_page(mfn)); } /* Detect the xenheap page and mark the stored GFN as invalid. */ -else if ( p2m_is_ram(pte.p2m.type) && is_xen_heap_mfn(mfn) ) +else if ( p2m_is_ram(type) && is_xen_heap_mfn(mfn) ) page_set_xenheap_gfn(mfn_to_page(mfn), INVALID_GFN); } +/* Put any references on the superpage referenced by mfn. */ +static void p2m_put_l2_superpage(mfn_t mfn, p2m_type_t type) +{ +struct page_info *pg; +unsigned int i; + +/* + * TODO: Handle other p2m types, but be aware that any changes to handle + * different types should require an update on the relinquish code to handle + * preemption. + */ +if ( !p2m_is_foreign(type) ) +return; + +ASSERT(mfn_valid(mfn)); + +pg = mfn_to_page(mfn); + +for ( i = 0; i < XEN_PT_LPAE_ENTRIES; i++, pg++ ) +p2m_put_foreign_page(pg); +} + +/* Put any references on the page referenced by pte. */ +static void p2m_put_page(const lpae_t pte, unsigned int level) +{ +mfn_t mfn = lpae_get_mfn(pte); + +ASSERT(p2m_is_valid(pte)); + +/* + * TODO: Currently we don't handle level 1 super-page, Xen is not + * preemptible and therefore some work is needed to handle such + * superpages, for which at some point Xen might end up freeing memory + * and therefore for such a big mapping it could end up in a very long + * operation. + */ +if (
Re: [PATCH 3/4] x86/pci/xen: Fix PCIBIOS_* return code handling
On 27.05.24 14:55, Ilpo Järvinen wrote: xen_pcifront_enable_irq() uses pci_read_config_byte() that returns PCIBIOS_* codes. The error handling, however, assumes the codes are normal errnos because it checks for < 0. xen_pcifront_enable_irq() also returns the PCIBIOS_* code back to the caller but the function is used as the (*pcibios_enable_irq) function which should return normal errnos. Convert the error check to plain non-zero check which works for PCIBIOS_* return codes and convert the PCIBIOS_* return code using pcibios_err_to_errno() into normal errno before returning it. Fixes: 3f2a230caf21 ("xen: handled remapped IRQs when enabling a pcifront PCI device.") Signed-off-by: Ilpo Järvinen Cc: sta...@vger.kernel.org Reviewed-by: Juergen Gross Juergen
Re: [PATCH v2 5/8] tools/hvmloader: Retrieve (x2)APIC IDs from the APs themselves
On 27/05/2024 09:52, Roger Pau Monné wrote: >> My intent here was to prevent the compiler omitting the write (though in >> practice it didn't). I think the barrier is still required to prevent >> reordering according to the spec. > > Yes, my understating is that you added the ACCESS_ONCE() to possibly > prevent the compiler from re-ordering the CPU_TO_X2APICID[ap_cpuid]) = > read_apic_id() after the 'hlt' loop? Yes, but not only that. Also preventing the compiler from eliding the write altogether on the basis that it's not read after within the function. I have suffered that particular behaviour on older versions of GCC and it stung very much. > > AFAICT the expressions in the `for` statement are considered sequence > points, and the calling `read_apic_id()` could have side-effects, so > the compiler won't be able to elide or move the setting of > CPU_TO_X2APICID[] due to all side-effects of previous evaluations must > be complete at sequence points. > > I'm fine with leaving the wmb() + ACCESS_ONCE(). > > Thanks, Roger. Cheers, Alejandro
Re: [PATCH v2 07/13] x86/bitops: Improve arch_ffs() in the general case
On 27/05/2024 2:37 pm, Jan Beulich wrote: > On 27.05.2024 15:27, Jan Beulich wrote: >> On 24.05.2024 22:03, Andrew Cooper wrote: >>> --- a/xen/arch/x86/include/asm/bitops.h >>> +++ b/xen/arch/x86/include/asm/bitops.h >>> @@ -432,12 +432,28 @@ static inline int ffsl(unsigned long x) >>> >>> static always_inline unsigned int arch_ffs(unsigned int x) >>> { >>> -int r; >>> +unsigned int r; >>> + >>> +if ( __builtin_constant_p(x > 0) && x > 0 ) >>> +{ >>> +/* Safe, when the compiler knows that x is nonzero. */ >>> +asm ( "bsf %[val], %[res]" >>> + : [res] "=r" (r) >>> + : [val] "rm" (x) ); >>> +} >> In patch 11 relevant things are all in a single patch, making it easier >> to spot that this is dead code: The sole caller already has a >> __builtin_constant_p(), hence I don't see how the one here could ever >> return true. With that the respective part of the description is then >> questionable, too, I'm afraid: Where did you observe any actual effect >> from this? Or if you did - what am I missing? > Hmm, thinking about it: I suppose that's why you have > __builtin_constant_p(x > 0), not __builtin_constant_p(x). I have to admit > I'm (positively) surprised that the former may return true when the latter > doesn't. So was I, but this recommendation came straight from the GCC mailing list. And it really does work, even back in obsolete versions of GCC. __builtin_constant_p() operates on an expression not a value, and is documented as such. > Nevertheless I'm inclined to think this deserves a brief comment. There is a comment, and it's even visible in the snippet. > As an aside, to better match the comment inside the if()'s body, how about > > if ( __builtin_constant_p(!!x) && x ) > > ? That also may make a little more clear that this isn't just a style > choice, but actually needed for the intended purpose. I am not changing the logic. Apart from anything else, your suggestion is trivially buggy. I care about whether the RHS collapses to a constant, and the only way of doing that correctly is asking the compiler about the *exact* expression. Asking about some other expression which you hope - but do not know - that the compiler will treat equivalently is bogus. It would be strictly better to only take the else clause, than to have both halves emitted. This is the form I've tested extensively. It's also the clearest form IMO. You can experiment with alternative forms when we're not staring down code freeze of 4.19. ~Andrew
CPU_DOWN_FAILED hits ASSERTs in scheduling logic
Hello, When the stop_machine_run() call in cpu_down() fails and calls the CPU notifier CPU_DOWN_FAILED hook the following assert triggers in the scheduling code: Assertion '!cpumask_test_cpu(cpu, >initialized)' failed at common/sched/cred1 [ Xen-4.19-unstable x86_64 debug=y Tainted: C] CPU:0 RIP:e008:[] common/sched/credit2.c#csched2_free_pdata+0xc8/0x177 RFLAGS: 00010093 CONTEXT: hypervisor rax: rbx: 83202ecc2f80 rcx: 83202f3e64c0 rdx: 0001 rsi: 0002 rdi: 83202ecc2f88 rbp: 83203d58 rsp: 83203d30 r8: r9: 83202f3e6e01 r10: r11: 0f0f0f0f0f0f0f0f r12: 83202ecb80b0 r13: 0001 r14: 0282 r15: 83202ecbbf00 cr0: 8005003b cr4: 007526e0 cr3: 574c2000 cr2: fsb: gsb: gss: ds: es: fs: gs: ss: cs: e008 Xen code around (common/sched/credit2.c#csched2_free_pdata+0xc8/0x177): fe ff eb 9a 0f 0b 0f 0b <0f> 0b 49 8d 4f 08 49 8b 47 08 48 3b 48 08 75 2e Xen stack trace from rsp=83203d30: 83202d74d100 0001 82d0404c4430 0006 83203d78 82d040257454 0001 83203da8 82d04021f303 82d0404c4628 82d0404c4620 82d0404c4430 0006 83203df0 82d04022bc4c 83203e18 0001 0001 fff0 82d0405e6500 83203e08 82d040204fd5 0001 83203e30 82d0402054f0 82d0404c5860 0001 83202ec75000 83203e48 82d040348c25 83202d74d0d0 83203e68 82d0402071aa 83202ec751d0 82d0405ce210 83203e80 82d0402343c9 82d0405ce200 83203eb0 82d040234631 7fff 82d0405d5080 82d0405ce210 83203ee8 82d040321411 82d040321399 83202f3a9000 001d91a6fa2d 82d0405e6500 83203de0 82d040324391 Xen call trace: [] R common/sched/credit2.c#csched2_free_pdata+0xc8/0x177 [] F free_cpu_rm_data+0x41/0x58 [] F common/sched/cpupool.c#cpu_callback+0xfb/0x466 [] F notifier_call_chain+0x6c/0x96 [] F common/cpu.c#cpu_notifier_call_chain+0x1b/0x36 [] F cpu_down+0xa7/0x143 [] F cpu_down_helper+0x11/0x27 [] F common/domain.c#continue_hypercall_tasklet_handler+0x50/0xbd [] F common/tasklet.c#do_tasklet_work+0x76/0xaf [] F do_tasklet+0x5b/0x8d [] F arch/x86/domain.c#idle_loop+0x78/0xe6 [] F continue_running+0x5b/0x5d Panic on CPU 0: Assertion '!cpumask_test_cpu(cpu, >initialized)' failed at common/sched/credit2.c:4111 The issue seems to be that since the CPU hasn't been removed, it's still part of prv->initialized and the assert in csched2_free_pdata() called as part of free_cpu_rm_data() triggers. It's easy to reproduce by substituting the stop_machine_run() call in cpu_down() with an error. Thanks, Roger.
Re: Linux HVM fails to start with PANIC: early exception 0x00 IP 0010:clear_page_orig+0x12/0x40 error 0
On 12/05/2024 3:48 pm, Andrew Cooper wrote: > On 12/05/2024 3:16 am, Marek Marczykowski-Górecki wrote: >> Hi, >> >> I've got a report[1] that after some update Linux HVM fails to start with the >> error as in the subject. It looks to be caused by some change between >> Xen 4.17.3 and 4.17.4. Here the failure is on Linux 6.6.25 (both dom0 >> and domU), but the 6.1.62 that worked with older Xen before, now fails >> too. The full error (logged via earlyprintk=xen) is: >> >> [0.009500] Using GB pages for direct mapping >> PANIC: early exception 0x00 IP 10:b01c32e2 error 0 cr2 >> 0xa08649801000 >> [0.009606] CPU: 0 PID: 0 Comm: swapper Not tainted >> 6.6.25-1.qubes.fc37.x86_64 #1 >> [0.009665] Hardware name: Xen HVM domU, BIOS 4.17.4 04/26/2024 >> [0.009710] RIP: 0010:clear_page_orig+0x12/0x40 >> [0.009766] Code: 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 >> 90 90 90 90 90 90 90 f3 0f 1e fa 31 c0 b9 40 00 00 00 0f 1f 44 00 00 ff c9 >> <48> 89 07 48 89 47 08 48 89 47 10 48 89 47 18 48 89 47 20 48 89 47 >> [0.009862] RSP: :b0e03d58 EFLAGS: 00010016 ORIG_RAX: >> >> [0.009915] RAX: RBX: RCX: >> 003f >> [0.009967] RDX: 9801 RSI: RDI: >> a08649801000 >> [0.010015] RBP: 0001 R08: 0001 R09: >> 6b7f283562d74b16 >> [0.010063] R10: R11: R12: >> 0001 >> [0.010112] R13: R14: b0e22a08 R15: >> a0864000 >> [0.010161] FS: () GS:b16ea000() >> knlGS: >> [0.010214] CS: 0010 DS: ES: CR0: 80050033 >> [0.010257] CR2: a08649801000 CR3: 08e8 CR4: >> 00b0 >> [0.010310] Call Trace: >> [0.010341] >> [0.010372] ? early_fixup_exception+0xf7/0x190 >> [0.010416] ? early_idt_handler_common+0x2f/0x3a >> [0.010460] ? clear_page_orig+0x12/0x40 >> [0.010501] ? alloc_low_pages+0xeb/0x150 >> [0.010541] ? __kernel_physical_mapping_init+0x1d2/0x630 >> [0.010588] ? init_memory_mapping+0x83/0x160 >> [0.010631] ? init_mem_mapping+0x9a/0x460 >> [0.010669] ? memblock_reserve+0x6d/0xf0 >> [0.010709] ? setup_arch+0x796/0xf90 >> [0.010748] ? start_kernel+0x63/0x420 >> [0.010787] ? x86_64_start_reservations+0x18/0x30 >> [0.010828] ? x86_64_start_kernel+0x96/0xa0 >> [0.010868] ? secondary_startup_64_no_verify+0x18f/0x19b >> [0.010918] >> >> I'm pretty sure the exception 0 is misleading here, I don't see how it >> could be #DE. >> >> More logs (including full hypervisor log) are attached to the linked >> issue. >> >> This is on HP 240 g7, and my educated guess is it's Intel Celeron N4020 >> CPU. I cannot reproduce the issue on different hardware. >> >> PVH domains seems to work. >> >> Any ideas what could have happened here? > Yes. > > Revert the microcode back to revision 0x38 for now. The regression is fixed in revision 0x42. ~Andrew
[libvirt test] 186165: tolerable all pass - PUSHED
flight 186165 libvirt real [real] http://logs.test-lab.xenproject.org/osstest/logs/186165/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt 16 saverestore-support-checkfail like 186133 test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-arm64-arm64-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qcow2 14 migrate-support-checkfail never pass test-amd64-amd64-libvirt-raw 14 migrate-support-checkfail never pass test-armhf-armhf-libvirt 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-qcow2 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-qcow2 15 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-libvirt-vhd 15 saverestore-support-checkfail never pass version targeted for testing: libvirt a4f38f6ffe6a9edc001d18890ccfc3f38e72fb94 baseline version: libvirt 3b3efef58dc4bf6c07a73862c280e30f2023054d Last test of basis 186133 2024-05-24 04:18:53 Z4 days Testing same since 186165 2024-05-28 04:20:40 Z0 days1 attempts People who touched revisions under test: Andi Chandler Göran Uddeborg Laine Stump jobs: build-amd64-xsm pass build-arm64-xsm pass build-i386-xsm pass build-amd64 pass build-arm64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-arm64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-arm64-pvopspass build-armhf-pvopspass build-i386-pvops pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm pass test-amd64-amd64-libvirt-xsm pass test-arm64-arm64-libvirt-xsm pass test-amd64-amd64-libvirt pass test-arm64-arm64-libvirt pass test-armhf-armhf-libvirt pass test-amd64-amd64-libvirt-pairpass test-amd64-amd64-libvirt-qcow2 pass test-arm64-arm64-libvirt-qcow2 pass test-amd64-amd64-libvirt-raw pass test-arm64-arm64-libvirt-raw pass test-amd64-amd64-libvirt-vhd pass test-armhf-armhf-libvirt-vhd pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/libvirt.git 3b3efef58d..a4f38f6ffe a4f38f6ffe6a9edc001d18890ccfc3f38e72fb94 -> xen-tested-master
[xen-unstable-smoke test] 186166: regressions - FAIL
flight 186166 xen-unstable-smoke real [real] flight 186167 xen-unstable-smoke real-retest [real] http://logs.test-lab.xenproject.org/osstest/logs/186166/ http://logs.test-lab.xenproject.org/osstest/logs/186167/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-armhf-armhf-xl 8 xen-boot fail REGR. vs. 186164 Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass version targeted for testing: xen 2d93f78bfe25f695d8ffb61d110da9df293ed71b baseline version: xen 96af090e33130b0bf0953f3ccab8e7a163392318 Last test of basis 186164 2024-05-28 02:00:23 Z0 days Testing same since 186166 2024-05-28 07:02:06 Z0 days1 attempts People who touched revisions under test: Jan Beulich Jason Andryuk Nicola Vetrini Oleksii Kurochko jobs: build-arm64-xsm pass build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl fail test-arm64-arm64-xl-xsm pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Not pushing. commit 2d93f78bfe25f695d8ffb61d110da9df293ed71b Author: Nicola Vetrini Date: Tue May 28 08:52:27 2024 +0200 x86/traps: address violation of MISRA C Rule 8.4 Rule 8.4 states: "A compatible declaration shall be visible when an object or function with external linkage is defined". The function do_general_protection is either used is asm code or only within this unit, so there is no risk of this getting out of sync with its definition, but the function must remain extern. Therefore, this function is deviated using a comment-based deviation. No functional change. Signed-off-by: Nicola Vetrini Acked-by: Jan Beulich commit 8b977fe57254ff8d343e4bf50cf98fa6c2b36b8b Author: Jason Andryuk Date: Tue May 28 08:52:15 2024 +0200 CHANGELOG: Mention libxl blktap/tapback support Add entry for backendtype=tap support in libxl. blktap needs some changes to work with libxl, which haven't been merged. They are available from this PR: https://github.com/xapi-project/blktap/pull/394 Signed-off-by: Jason Andryuk Acked-by: Oleksii Kurochko (qemu changes not included)
[xen-unstable test] 186163: regressions - FAIL
flight 186163 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/186163/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-armhf 6 xen-build fail in 186157 REGR. vs. 186163 Tests which are failing intermittently (not blocking): test-amd64-amd64-qemuu-freebsd11-amd64 21 guest-start/freebsd.repeat fail pass in 186157 Tests which did not succeed, but are not blocking: test-armhf-armhf-xl-raw 1 build-check(1) blocked in 186157 n/a test-armhf-armhf-xl-qcow2 1 build-check(1) blocked in 186157 n/a test-armhf-armhf-xl-rtds 1 build-check(1) blocked in 186157 n/a test-armhf-armhf-xl-arndale 1 build-check(1) blocked in 186157 n/a test-armhf-armhf-xl-credit2 1 build-check(1) blocked in 186157 n/a test-armhf-armhf-xl 1 build-check(1) blocked in 186157 n/a test-armhf-armhf-libvirt 1 build-check(1) blocked in 186157 n/a test-armhf-armhf-xl-credit1 1 build-check(1) blocked in 186157 n/a test-armhf-armhf-examine 1 build-check(1) blocked in 186157 n/a test-armhf-armhf-xl-multivcpu 1 build-check(1) blocked in 186157 n/a build-armhf-libvirt 1 build-check(1) blocked in 186157 n/a test-armhf-armhf-libvirt-vhd 1 build-check(1) blocked in 186157 n/a test-armhf-armhf-libvirt 16 saverestore-support-check fail blocked in 186157 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 186157 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 186157 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 186157 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 186157 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 186157 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 15 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qcow2 14 migrate-support-checkfail never pass test-amd64-amd64-libvirt-raw 14 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 15 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail never pass test-arm64-arm64-xl-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-xl-vhd 15 saverestore-support-checkfail never pass test-armhf-armhf-xl-raw 14 migrate-support-checkfail never pass test-armhf-armhf-xl-raw 15 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-qcow214 migrate-support-checkfail never pass
[XEN PATCH] automation/eclair: extend and modifiy existing deviations of MISRA C:2012 Rule 16.3
Update ECLAIR configuration to deviate more cases where an unintentional fallthrough cannot happen. Add Rule 16.3 to the monitored set and tag as clean for arm. Signed-off-by: Federico Serafini --- In previous discussions about Rule 16.3, the preference expressed was to deviate cases where a fallthrough cannot occur (rather that refactoring the code). --- .../eclair_analysis/ECLAIR/deviations.ecl | 30 ++- .../eclair_analysis/ECLAIR/monitored.ecl | 1 + automation/eclair_analysis/ECLAIR/tagging.ecl | 2 +- docs/misra/deviations.rst | 28 +++-- 4 files changed, 49 insertions(+), 12 deletions(-) diff --git a/automation/eclair_analysis/ECLAIR/deviations.ecl b/automation/eclair_analysis/ECLAIR/deviations.ecl index cf62a874d9..95f07718ba 100644 --- a/automation/eclair_analysis/ECLAIR/deviations.ecl +++ b/automation/eclair_analysis/ECLAIR/deviations.ecl @@ -364,14 +364,29 @@ therefore it is deemed better to leave such files as is." -config=MC3R1.R16.2,reports+={deliberate, "any_area(any_loc(file(x86_emulate||x86_svm_emulate)))"} -doc_end --doc_begin="Switch clauses ending with continue, goto, return statements are -safe." --config=MC3R1.R16.3,terminals+={safe, "node(continue_stmt||goto_stmt||return_stmt)"} +-doc_begin="Statements that change the control flow (i.e., break, continue, goto, return) and calls to functions that does not return the control back are \"allowed terminal statements\"." +-stmt_selector+={r16_3_allowed_terminal, "node(break_stmt||continue_stmt||goto_stmt||return_stmt)||call(property(noreturn))"} +-config=MC3R1.R16.3,terminals+={safe, "r16_3_allowed_terminal"} +-doc_end + +-doc_begin="An if-else statement having both branches ending with an allowed terminal statement is itself an allowed terminal statement." +-stmt_selector+={r16_3_if, "node(if_stmt)&&(child(then,r16_3_allowed_terminal)||child(then,any_stmt(stmt,-1,r16_3_allowed_terminal)))"} +-stmt_selector+={r16_3_else, "node(if_stmt)&&(child(else,r16_3_allowed_terminal)||child(else,any_stmt(stmt,-1,r16_3_allowed_terminal)))"} +-stmt_selector+={r16_3_if_else, "r16_3_if&_3_else"} +-config=MC3R1.R16.3,terminals+={safe, "r16_3_if_else"} +-doc_end + +-doc_begin="An if-else statement having an always true condition and the true branch ending with an allowed terminal statement is itself an allowed terminal statement." +-stmt_selector+={r16_3_if_true, "r16_3_if&(cond,definitely_in(1..))"} +-config=MC3R1.R16.3,terminals+={safe, "r16_3_if_true"} +-doc_end + +-doc_begin="Switch clauses ending with a statement expression which, in turn, ends with an allowed terminal statement are safe." +-config=MC3R1.R16.3,terminals+={safe, "node(stmt_expr)&(stmt,node(compound_stmt)&_stmt(stmt,-1,r16_3_allowed_terminal||r16_3_if_else||r16_3_if_true))"} -doc_end --doc_begin="Switch clauses ending with a call to a function that does not give -the control back (i.e., a function with attribute noreturn) are safe." --config=MC3R1.R16.3,terminals+={safe, "call(property(noreturn))"} +-doc_begin="Switch clauses ending with a do-while-false which, in turn, ends with an allowed terminal statement are safe, except for debug macro ASSERT_UNREACHABLE()." +-config=MC3R1.R16.3,terminals+={safe, "!macro(name(ASSERT_UNREACHABLE))&(do_stmt)&(cond,definitely_in(0))&(body,any_stmt(stmt,-1,r16_3_allowed_terminal||r16_3_if_else||r16_3_if_true))"} -doc_end -doc_begin="Switch clauses ending with pseudo-keyword \"fallthrough\" are @@ -383,8 +398,7 @@ safe." -config=MC3R1.R16.3,reports+={safe, "any_area(end_loc(any_exp(text(/BUG\\(\\);/"} -doc_end --doc_begin="Switch clauses not ending with the break statement are safe if an -explicit comment indicating the fallthrough intention is present." +-doc_begin="Switch clauses ending with an explicit comment indicating the fallthrough intention is present are safe." -config=MC3R1.R16.3,reports+={safe, "any_area(end_loc(any_exp(text(^(?s).*/\\* [fF]all ?through.? \\*/.*$,0..1"} -doc_end diff --git a/automation/eclair_analysis/ECLAIR/monitored.ecl b/automation/eclair_analysis/ECLAIR/monitored.ecl index 4daecb0c83..45a60074f9 100644 --- a/automation/eclair_analysis/ECLAIR/monitored.ecl +++ b/automation/eclair_analysis/ECLAIR/monitored.ecl @@ -22,6 +22,7 @@ -enable=MC3R1.R14.1 -enable=MC3R1.R14.4 -enable=MC3R1.R16.2 +-enable=MC3R1.R16.3 -enable=MC3R1.R16.6 -enable=MC3R1.R16.7 -enable=MC3R1.R17.1 diff --git a/automation/eclair_analysis/ECLAIR/tagging.ecl b/automation/eclair_analysis/ECLAIR/tagging.ecl index a354ff322e..07de2e7b65 100644 --- a/automation/eclair_analysis/ECLAIR/tagging.ecl +++ b/automation/eclair_analysis/ECLAIR/tagging.ecl @@ -105,7 +105,7 @@ if(string_equal(target,"x86_64"), ) if(string_equal(target,"arm64"), - service_selector({"additional_clean_guidelines","MC3R1.R14.4||MC3R1.R16.6||MC3R1.R20.12||MC3R1.R2.1||MC3R1.R5.3||MC3R1.R7.2||MC3R1.R7.3||MC3R1.R8.6||MC3R1.R9.3"}) +
Re: [PATCH v11 2/9] xen: introduce generic non-atomic test_*bit()
On 28.05.2024 10:37, Oleksii K. wrote: > On Tue, 2024-05-28 at 08:20 +0200, Jan Beulich wrote: >> On 24.05.2024 13:08, Oleksii Kurochko wrote: >>> +/** >>> + * generic_test_bit - Determine whether a bit is set >>> + * @nr: bit number to test >>> + * @addr: Address to start counting from >>> + * >>> + * This operation is non-atomic and can be reordered. >>> + * If two examples of this operation race, one can appear to >>> succeed >>> + * but actually fail. You must protect multiple accesses with a >>> lock. >>> + */ >> >> You got carried away updating comments - there's no raciness for >> simple test_bit(). As is also expressed by its name not having those >> double underscores that the others have. > Then it is true for every function in this header. Based on the naming > the conclusion can be done if it is atomic/npn-atomic and can/can't be > reordered. So the comments aren't seem very useful. I disagree - test_bit() is different, in not being a read-modify-write operation. Jan
Re: [PATCH v16 1/5] arm/vpci: honor access size when returning an error
Hi Roger, On 28/05/2024 08:11, Roger Pau Monné wrote: On Mon, May 27, 2024 at 10:14:59PM +0100, Julien Grall wrote: Hi Roger, On 23/05/2024 08:55, Roger Pau Monné wrote: On Wed, May 22, 2024 at 06:59:20PM -0400, Stewart Hildebrand wrote: From: Volodymyr Babchuk Guest can try to read config space using different access sizes: 8, 16, 32, 64 bits. We need to take this into account when we are returning an error back to MMIO handler, otherwise it is possible to provide more data than requested: i.e. guest issues LDRB instruction to read one byte, but we are writing 0x in the target register. Shouldn't this be taken care of in the trap handler subsystem, rather than forcing each handler to ensure the returned data matches the access size? I understand how this can be useful when we return all 1s. However, in most of the current cases, we already need to deal with the masking because the data is extracted from a wider field (for instance, see the vGIC emulation). For those handlers, I would argue it would be concerning/ a bug if the handler return bits above the access size. Although, this would only impact the guest itself. Even if there was a bug in the handler, it would be mitigated by the truncation done in io.c. So overall, this seems to be a matter of taste and I don't quite (yet) see the benefits to do it in io.c. Regardless that... It's up to you really, it's all ARM code so I don't really have a stake. IMO it makes the handlers more complicated and fragile. I will let the other Arm folks commenting on it. If nothing else I would at least add an ASSERT() in io.c to ensure that the data returned from the handler matches the size constrains you expect. That would be a good idea. IOW, something like: diff --git a/xen/arch/arm/io.c b/xen/arch/arm/io.c index 96c740d5636c..b7e12df85f87 100644 --- a/xen/arch/arm/io.c +++ b/xen/arch/arm/io.c @@ -37,6 +37,7 @@ static enum io_state handle_read(const struct mmio_handler *handler, return IO_ABORT; r = sign_extend(dabt, r); +r = r & GENMASK_ULL((1U << dabt.size) * 8 - 1, 0); ... in some case we need to sign extend up to the width of the register (even if the access is 8-byte). So we would need to do the masking *before* calling sign_extend(). I would consider doing the truncation in sign_extend() if suitable, even if that's doing more than what the function name implies. If we decide to do a general truncation, then I would rather prefer if it happens outside of sign_extend(). Or the function needs to be renamed (I can't find a good name so far). Cheers, -- Julien Grall
Re: [PATCH v11 2/9] xen: introduce generic non-atomic test_*bit()
Hi Oleksii, On 28/05/2024 09:37, Oleksii K. wrote: On Tue, 2024-05-28 at 08:20 +0200, Jan Beulich wrote: On 24.05.2024 13:08, Oleksii Kurochko wrote: The following generic functions were introduced: * test_bit * generic__test_and_set_bit * generic__test_and_clear_bit * generic__test_and_change_bit These functions and macros can be useful for architectures that don't have corresponding arch-specific instructions. Also, the patch introduces the following generics which are used by the functions mentioned above: * BITOP_BITS_PER_WORD * BITOP_MASK * BITOP_WORD * BITOP_TYPE BITS_PER_BYTE was moved to xen/bitops.h as BITS_PER_BYTE is the same across architectures. The following approach was chosen for generic*() and arch*() bit operation functions: If the bit operation function that is going to be generic starts with the prefix "__", then the corresponding generic/arch function will also contain the "__" prefix. For example: * test_bit() will be defined using arch_test_bit() and generic_test_bit(). * __test_and_set_bit() will be defined using arch__test_and_set_bit() and generic__test_and_set_bit(). Signed-off-by: Oleksii Kurochko --- Reviewed-by: Jan Beulich jbeul...@suse.com? Jan gave his R-by for the previous version of the patch, but some changes were done, so I wasn't sure if I could use the R-by in this patch version. That really depends on the nature of the changes. Seeing the pretty long list below, I think it was appropriate to drop the R-b. --- Changes in V11: - fix identation in generic_test_bit() function. - move definition of BITS_PER_BYTE from /bitops.h to xen/bitops.h I realize you were asked to do this, but I'm not overly happy with it. BITS_PER_BYTE is far more similar to BITS_PER_LONG than to BITOP_BITS_PER_WORD. Plus in any event that change is entirely unrelated here. So where then it should be introduced? x86 introduces that in config.h, Arm in asm/bitops.h. I would be fine if it is defined in config.h for Arm. I am okay to revert this change and make a separate patch. [...] Also did Julien(?) really ask that these comments be reproduced on both the functions supposed to be used throughout the codebase _and_ the generic_*() ones (being merely internal helpers/fallbacks)? At least, I understood that in this way. I suggested to duplicate. But I also proposed an alternative to move the comment: "I think this comment should be duplicated (or moved to) on top of" I don't have a strong preference which one is used. +/** + * generic_test_bit - Determine whether a bit is set + * @nr: bit number to test + * @addr: Address to start counting from + * + * This operation is non-atomic and can be reordered. + * If two examples of this operation race, one can appear to succeed + * but actually fail. You must protect multiple accesses with a lock. + */ You got carried away updating comments - there's no raciness for simple test_bit(). As is also expressed by its name not having those double underscores that the others have. Then it is true for every function in this header. Based on the naming the conclusion can be done if it is atomic/npn-atomic and can/can't be reordered. So let me start with that my only request is to keep the existing comments as you move it. It looks like there were none of test_bit() before. > So the comments aren't seem very useful. The comments *are* useful. We had several instances where non-Arm folks thought all the operations were atomic on Arm as well. And the usual suggestion is to add barriers in the Arm version which is a no-go. Cheers, -- Julien Grall
Re: [PATCH v11 2/9] xen: introduce generic non-atomic test_*bit()
On Tue, 2024-05-28 at 08:20 +0200, Jan Beulich wrote: > On 24.05.2024 13:08, Oleksii Kurochko wrote: > > The following generic functions were introduced: > > * test_bit > > * generic__test_and_set_bit > > * generic__test_and_clear_bit > > * generic__test_and_change_bit > > > > These functions and macros can be useful for architectures > > that don't have corresponding arch-specific instructions. > > > > Also, the patch introduces the following generics which are > > used by the functions mentioned above: > > * BITOP_BITS_PER_WORD > > * BITOP_MASK > > * BITOP_WORD > > * BITOP_TYPE > > > > BITS_PER_BYTE was moved to xen/bitops.h as BITS_PER_BYTE is the > > same > > across architectures. > > > > The following approach was chosen for generic*() and arch*() bit > > operation functions: > > If the bit operation function that is going to be generic starts > > with the prefix "__", then the corresponding generic/arch function > > will also contain the "__" prefix. For example: > > * test_bit() will be defined using arch_test_bit() and > > generic_test_bit(). > > * __test_and_set_bit() will be defined using > > arch__test_and_set_bit() and generic__test_and_set_bit(). > > > > Signed-off-by: Oleksii Kurochko > > --- > > Reviewed-by: Jan Beulich jbeul...@suse.com? Jan gave his R-by for > > the previous > > version of the patch, but some changes were done, so I wasn't sure > > if I could > > use the R-by in this patch version. > > That really depends on the nature of the changes. Seeing the pretty > long list below, I think it was appropriate to drop the R-b. > > > --- > > Changes in V11: > > - fix identation in generic_test_bit() function. > > - move definition of BITS_PER_BYTE from /bitops.h to > > xen/bitops.h > > I realize you were asked to do this, but I'm not overly happy with > it. > BITS_PER_BYTE is far more similar to BITS_PER_LONG than to > BITOP_BITS_PER_WORD. Plus in any event that change is entirely > unrelated > here. So where then it should be introduced? x86 introduces that in config.h, Arm in asm/bitops.h. I am okay to revert this change and make a separate patch. > > > - drop the changes in arm64/livepatch.c. > > - update the the comments on top of functions: > > generic__test_and_set_bit(), > > generic__test_and_clear_bit(), generic__test_and_change_bit(), > > generic_test_bit(). > > - Update footer after Signed-off section. > > - Rebase the patch on top of staging branch, so it can be merged > > when necessary > > approves will be given. > > - Update the commit message. > > --- > > xen/arch/arm/include/asm/bitops.h | 69 --- > > xen/arch/ppc/include/asm/bitops.h | 54 - > > xen/arch/ppc/include/asm/page.h | 2 +- > > xen/arch/ppc/mm-radix.c | 2 +- > > xen/arch/x86/include/asm/bitops.h | 31 ++--- > > xen/include/xen/bitops.h | 185 > > ++ > > 6 files changed, 196 insertions(+), 147 deletions(-) > > > > --- a/xen/include/xen/bitops.h > > +++ b/xen/include/xen/bitops.h > > @@ -103,8 +103,193 @@ static inline int generic_flsl(unsigned long > > x) > > * Include this here because some architectures need > > generic_ffs/fls in > > * scope > > */ > > + > > +#define BITOP_BITS_PER_WORD 32 > > +typedef uint32_t bitop_uint_t; > > + > > +#define BITOP_MASK(nr) ((bitop_uint_t)1 << ((nr) % > > BITOP_BITS_PER_WORD)) > > + > > +#define BITOP_WORD(nr) ((nr) / BITOP_BITS_PER_WORD) > > + > > +#define BITS_PER_BYTE 8 > > This, if to be moved at all, very clearly doesn't belong here. As > much > as it's unrelated to this change, it's also unrelated to bitops as a > whole. > > > +extern void __bitop_bad_size(void); > > + > > +#define bitop_bad_size(addr) (sizeof(*(addr)) < > > sizeof(bitop_uint_t)) > > + > > +/* - Please tidy above here -- > > --- */ > > + > > #include > > > > +/** > > + * generic__test_and_set_bit - Set a bit and return its old value > > + * @nr: Bit to set > > + * @addr: Address to count from > > + * > > + * This operation is non-atomic and can be reordered. > > + * If two examples of this operation race, one can appear to > > succeed > > Why "examples"? Do you mean "instances"? It's further unclear what > "succeed" and "fail" here mean. The function after all has two > purposes: Checking and setting the specified bit. Therefore I think > you mean "succeed in updating the bit in memory", yet then it's > still unclear what the "appear" here means: The caller has no > indication of success/failure. Therefore I think "appear to" also > wants dropping. > > > + * but actually fail. You must protect multiple accesses with a > > lock. > > That's not "must". An often better alternative is to use the atomic > forms instead. "Multiple" is imprecise, too: "Potentially racy" or > some such would come closer. I think we can go only with: " This operation is non-atomic and can be reordered." and drop: " If two examples of this operation race, one
Re: [PATCH v4 3/7] xen/p2m: put reference for level 2 superpage
Hi Julien, > On 24 May 2024, at 13:56, Julien Grall wrote: > > Hi Luca, > > On 24/05/2024 13:40, Luca Fancellu wrote: >> From: Penny Zheng >> We are doing foreign memory mapping for static shared memory, and >> there is a great possibility that it could be super mapped. >> But today, p2m_put_l3_page could not handle superpages. >> This commits implements a new function p2m_put_l2_superpage to handle >> 2MB superpages, specifically for helping put extra references for >> foreign superpages. >> Modify relinquish_p2m_mapping as well to take into account preemption >> when type is foreign memory and order is above 9 (2MB). >> Currently 1GB superpages are not handled because Xen is not preemptible >> and therefore some work is needed to handle such superpages, for which >> at some point Xen might end up freeing memory and therefore for such a >> big mapping it could end up in a very long operation. >> Signed-off-by: Penny Zheng >> Signed-off-by: Luca Fancellu >> --- >> v4 changes: >> - optimised the path to call put_page() on the foreign mapping as >>Julien suggested. Add a comment in p2m_put_l2_superpage to state >>that any changes needs to take into account some change in the >>relinquish code. (Julien) >> v3 changes: >> - Add reasoning why we don't support now 1GB superpage, remove level_order >>variable from p2m_put_l2_superpage, update TODO comment inside >>p2m_free_entry, use XEN_PT_LEVEL_ORDER(2) instead of value 9 inside >>relinquish_p2m_mapping. (Michal) >> v2: >> - Do not handle 1GB super page as there might be some issue where >>a lot of calls to put_page(...) might be issued which could lead >>to free memory that is a long operation. >> v1: >> - patch from >> https://patchwork.kernel.org/project/xen-devel/patch/20231206090623.1932275-9-penny.zh...@arm.com/ >> --- >> xen/arch/arm/mmu/p2m.c | 82 +++--- >> 1 file changed, 62 insertions(+), 20 deletions(-) >> diff --git a/xen/arch/arm/mmu/p2m.c b/xen/arch/arm/mmu/p2m.c >> index 41fcca011cf4..986c5a03c54b 100644 >> --- a/xen/arch/arm/mmu/p2m.c >> +++ b/xen/arch/arm/mmu/p2m.c >> @@ -753,34 +753,66 @@ static int p2m_mem_access_radix_set(struct p2m_domain >> *p2m, gfn_t gfn, >> return rc; >> } >> -/* >> - * Put any references on the single 4K page referenced by pte. >> - * TODO: Handle superpages, for now we only take special references for leaf >> - * pages (specifically foreign ones, which can't be super mapped today). >> - */ >> -static void p2m_put_l3_page(const lpae_t pte) >> +static void p2m_put_foreign_page(struct page_info *pg) >> { >> -mfn_t mfn = lpae_get_mfn(pte); >> - >> -ASSERT(p2m_is_valid(pte)); >> - >> /* >> - * TODO: Handle other p2m types >> - * >> * It's safe to do the put_page here because page_alloc will >> * flush the TLBs if the page is reallocated before the end of >> * this loop. >> */ >> -if ( p2m_is_foreign(pte.p2m.type) ) >> +put_page(pg); >> +} >> + >> +/* Put any references on the single 4K page referenced by mfn. */ >> +static void p2m_put_l3_page(mfn_t mfn, p2m_type_t type) >> +{ >> +/* TODO: Handle other p2m types */ >> +if ( p2m_is_foreign(type) ) >> { >> ASSERT(mfn_valid(mfn)); >> -put_page(mfn_to_page(mfn)); >> +p2m_put_foreign_page(mfn_to_page(mfn)); >> } >> /* Detect the xenheap page and mark the stored GFN as invalid. */ >> -else if ( p2m_is_ram(pte.p2m.type) && is_xen_heap_mfn(mfn) ) >> +else if ( p2m_is_ram(type) && is_xen_heap_mfn(mfn) ) >> page_set_xenheap_gfn(mfn_to_page(mfn), INVALID_GFN); >> } >> +/* Put any references on the superpage referenced by mfn. */ >> +static void p2m_put_l2_superpage(mfn_t mfn, p2m_type_t type) >> +{ >> +struct page_info *pg; >> +unsigned int i; >> + >> +/* >> + * TODO: Handle other p2m types, but be aware that any changes to handle >> + * different types should require an update on the relinquish code to >> handle >> + * preemption. >> + */ >> +if ( !p2m_is_foreign(type) ) >> +return; >> + >> +ASSERT(mfn_valid(mfn)); >> + >> +pg = mfn_to_page(mfn); >> + >> +for ( i = 0; i < XEN_PT_LPAE_ENTRIES; i++, pg++ ) >> +p2m_put_foreign_page(pg); >> +} >> + >> +/* Put any references on the page referenced by pte. */ >> +static void p2m_put_page(const lpae_t pte, unsigned int level) >> +{ >> +mfn_t mfn = lpae_get_mfn(pte); >> + >> +ASSERT(p2m_is_valid(pte)); >> + >> +/* We have a second level 2M superpage */ >> +if ( p2m_is_superpage(pte, level) && (level == 2) ) > > AFAICT, p2m_put_page() can only be called if the pte points to a superpage or > page: > >if ( p2m_is_superpage(entry, level) || (level == 3) ) >{ > ... > p2m_put_page() > >} > > So do we actually need to check p2m_is_superpage()? > >> +return p2m_put_l2_superpage(mfn, pte.p2m.type); >> +else if ( level == 3 )
Re: [PATCH v16 1/5] arm/vpci: honor access size when returning an error
On Mon, May 27, 2024 at 10:14:59PM +0100, Julien Grall wrote: > Hi Roger, > > On 23/05/2024 08:55, Roger Pau Monné wrote: > > On Wed, May 22, 2024 at 06:59:20PM -0400, Stewart Hildebrand wrote: > > > From: Volodymyr Babchuk > > > > > > Guest can try to read config space using different access sizes: 8, > > > 16, 32, 64 bits. We need to take this into account when we are > > > returning an error back to MMIO handler, otherwise it is possible to > > > provide more data than requested: i.e. guest issues LDRB instruction > > > to read one byte, but we are writing 0x in the target > > > register. > > > > Shouldn't this be taken care of in the trap handler subsystem, rather > > than forcing each handler to ensure the returned data matches the > > access size? > > I understand how this can be useful when we return all 1s. > > However, in most of the current cases, we already need to deal with the > masking because the data is extracted from a wider field (for instance, see > the vGIC emulation). For those handlers, I would argue it would be > concerning/ a bug if the handler return bits above the access size. > Although, this would only impact the guest itself. Even if there was a bug in the handler, it would be mitigated by the truncation done in io.c. > So overall, this seems to be a matter of taste and I don't quite (yet) see > the benefits to do it in io.c. Regardless that... It's up to you really, it's all ARM code so I don't really have a stake. IMO it makes the handlers more complicated and fragile. If nothing else I would at least add an ASSERT() in io.c to ensure that the data returned from the handler matches the size constrains you expect. > > > > IOW, something like: > > > > diff --git a/xen/arch/arm/io.c b/xen/arch/arm/io.c > > index 96c740d5636c..b7e12df85f87 100644 > > --- a/xen/arch/arm/io.c > > +++ b/xen/arch/arm/io.c > > @@ -37,6 +37,7 @@ static enum io_state handle_read(const struct > > mmio_handler *handler, > > return IO_ABORT; > > > > r = sign_extend(dabt, r); > > +r = r & GENMASK_ULL((1U << dabt.size) * 8 - 1, 0); > > ... in some case we need to sign extend up to the width of the register > (even if the access is 8-byte). So we would need to do the masking *before* > calling sign_extend(). I would consider doing the truncation in sign_extend() if suitable, even if that's doing more than what the function name implies. Thanks, Roger.
Re: [XEN PATCH 4/4] x86/traps: address violation of MISRA C Rule 8.4
On 27.05.2024 16:53, Nicola Vetrini wrote: > Rule 8.4 states: "A compatible declaration shall be visible when > an object or function with external linkage is defined". > > The function do_general_protection is either used is asm code > or only within this unit, so there is no risk of this getting > out of sync with its definition, but the function must remain > extern. > > Therefore, this function is deviated using a comment-based deviation. > No functional change. > > Signed-off-by: Nicola Vetrini Acked-by: Jan Beulich Albeit here, too, I'm not entirely happy with the wording, but I'll let it go as is. While "there is no risk of this getting out of sync with its definition" is true for the C side, there's always the risk of assembly going out of sync with C, simply because the two cannot be properly connected by any means. Jan
Re: [XEN PATCH 3/4] x86: address violations of MISRA C Rule 8.4
On 27.05.2024 16:53, Nicola Vetrini wrote: > Rule 8.4 states: "A compatible declaration shall be visible when > an object or function with external linkage is defined." > > These variables are only referenced from asm modules, so they > need to be extern and there is negligible risk of them being > used improperly without noticing. "asm modules" isn't quite correct, as there's one use from inline assembly. I have to admit I have no good wording suggestion other than explicitly covering both: "asm modules or inline assembly". Yet that then is ambiguous, as a use in inline assembly may also mean that symbol is actually visible to the compiler by being mentioned as on of the operands. Better ideas? > As a result, they can be exempted using a comment-based deviation. > No functional change. > > Signed-off-by: Nicola Vetrini With suitably adjusted wording: Acked-by: Jan Beulich Jan
Re: [PATCH v11 2/9] xen: introduce generic non-atomic test_*bit()
On 24.05.2024 13:08, Oleksii Kurochko wrote: > The following generic functions were introduced: > * test_bit > * generic__test_and_set_bit > * generic__test_and_clear_bit > * generic__test_and_change_bit > > These functions and macros can be useful for architectures > that don't have corresponding arch-specific instructions. > > Also, the patch introduces the following generics which are > used by the functions mentioned above: > * BITOP_BITS_PER_WORD > * BITOP_MASK > * BITOP_WORD > * BITOP_TYPE > > BITS_PER_BYTE was moved to xen/bitops.h as BITS_PER_BYTE is the same > across architectures. > > The following approach was chosen for generic*() and arch*() bit > operation functions: > If the bit operation function that is going to be generic starts > with the prefix "__", then the corresponding generic/arch function > will also contain the "__" prefix. For example: > * test_bit() will be defined using arch_test_bit() and >generic_test_bit(). > * __test_and_set_bit() will be defined using >arch__test_and_set_bit() and generic__test_and_set_bit(). > > Signed-off-by: Oleksii Kurochko > --- > Reviewed-by: Jan Beulich jbeul...@suse.com? Jan gave his R-by for the > previous > version of the patch, but some changes were done, so I wasn't sure if I could > use the R-by in this patch version. That really depends on the nature of the changes. Seeing the pretty long list below, I think it was appropriate to drop the R-b. > --- > Changes in V11: > - fix identation in generic_test_bit() function. > - move definition of BITS_PER_BYTE from /bitops.h to xen/bitops.h I realize you were asked to do this, but I'm not overly happy with it. BITS_PER_BYTE is far more similar to BITS_PER_LONG than to BITOP_BITS_PER_WORD. Plus in any event that change is entirely unrelated here. > - drop the changes in arm64/livepatch.c. > - update the the comments on top of functions: generic__test_and_set_bit(), >generic__test_and_clear_bit(), generic__test_and_change_bit(), >generic_test_bit(). > - Update footer after Signed-off section. > - Rebase the patch on top of staging branch, so it can be merged when > necessary >approves will be given. > - Update the commit message. > --- > xen/arch/arm/include/asm/bitops.h | 69 --- > xen/arch/ppc/include/asm/bitops.h | 54 - > xen/arch/ppc/include/asm/page.h | 2 +- > xen/arch/ppc/mm-radix.c | 2 +- > xen/arch/x86/include/asm/bitops.h | 31 ++--- > xen/include/xen/bitops.h | 185 ++ > 6 files changed, 196 insertions(+), 147 deletions(-) > > --- a/xen/include/xen/bitops.h > +++ b/xen/include/xen/bitops.h > @@ -103,8 +103,193 @@ static inline int generic_flsl(unsigned long x) > * Include this here because some architectures need generic_ffs/fls in > * scope > */ > + > +#define BITOP_BITS_PER_WORD 32 > +typedef uint32_t bitop_uint_t; > + > +#define BITOP_MASK(nr) ((bitop_uint_t)1 << ((nr) % BITOP_BITS_PER_WORD)) > + > +#define BITOP_WORD(nr) ((nr) / BITOP_BITS_PER_WORD) > + > +#define BITS_PER_BYTE 8 This, if to be moved at all, very clearly doesn't belong here. As much as it's unrelated to this change, it's also unrelated to bitops as a whole. > +extern void __bitop_bad_size(void); > + > +#define bitop_bad_size(addr) (sizeof(*(addr)) < sizeof(bitop_uint_t)) > + > +/* - Please tidy above here - */ > + > #include > > +/** > + * generic__test_and_set_bit - Set a bit and return its old value > + * @nr: Bit to set > + * @addr: Address to count from > + * > + * This operation is non-atomic and can be reordered. > + * If two examples of this operation race, one can appear to succeed Why "examples"? Do you mean "instances"? It's further unclear what "succeed" and "fail" here mean. The function after all has two purposes: Checking and setting the specified bit. Therefore I think you mean "succeed in updating the bit in memory", yet then it's still unclear what the "appear" here means: The caller has no indication of success/failure. Therefore I think "appear to" also wants dropping. > + * but actually fail. You must protect multiple accesses with a lock. That's not "must". An often better alternative is to use the atomic forms instead. "Multiple" is imprecise, too: "Potentially racy" or some such would come closer. Also did Julien(?) really ask that these comments be reproduced on both the functions supposed to be used throughout the codebase _and_ the generic_*() ones (being merely internal helpers/fallbacks)? > +/** > + * generic_test_bit - Determine whether a bit is set > + * @nr: bit number to test > + * @addr: Address to start counting from > + * > + * This operation is non-atomic and can be reordered. > + * If two examples of this operation race, one can appear to succeed > + * but actually fail. You must protect multiple accesses with a lock. > + */ You got carried away updating comments - there's no raciness for simple