[PATCH RESEND v3 00/10] Support generic Luks encryption

2023-12-24 Thread Hyman Huang
v3:
- Rebase on master
- Add a test case for detached LUKS header
- Adjust the design to honour preallocation of the payload device
- Adjust the design to honour the payload offset from the header,
  even when detached
- Support detached LUKS header creation using qemu-img
- Support detached LUKS header querying
- Do some code clean

Thanks for commenting on this series, please review.

Best regared,

Yong

v2:
- Simplify the design by reusing the LUKS driver to implement
  the generic Luks encryption, thank Daniel for the insightful 
  advice.
- rebase on master. 

This functionality was motivated by the following to-do list seen
in crypto documents:
https://wiki.qemu.org/Features/Block/Crypto 

The last chapter says we should "separate header volume": 

The LUKS format has ability to store the header in a separate volume
from the payload. We should extend the LUKS driver in QEMU to support
this use case.

By enhancing the LUKS driver, it is possible to enable
the detachable LUKS header and, as a result, achieve
general encryption for any disk format that QEMU has
supported.

Take the qcow2 as an example, the usage of the generic
LUKS encryption as follows:

1. add a protocol blockdev node of data disk
$ virsh qemu-monitor-command vm '{"execute":"blockdev-add",
> "arguments":{"node-name":"libvirt-1-storage", "driver":"file",
> "filename":"/path/to/test_disk.qcow2"}}'

2. add a protocol blockdev node of LUKS header as above.
$ virsh qemu-monitor-command vm '{"execute":"blockdev-add",
> "arguments":{"node-name":"libvirt-2-storage", "driver":"file",
> "filename": "/path/to/cipher.gluks" }}'

3. add the secret for decrypting the cipher stored in LUKS
   header above
$ virsh qemu-monitor-command vm '{"execute":"object-add",
> "arguments":{"qom-type":"secret", "id":
> "libvirt-2-storage-secret0", "data":"abc123"}}'

4. add the qcow2-drived blockdev format node
$ virsh qemu-monitor-command vm '{"execute":"blockdev-add",
> "arguments":{"node-name":"libvirt-1-format", "driver":"qcow2",
> "file":"libvirt-1-storage"}}'

5. add the luks-drived blockdev to link the qcow2 disk with
   LUKS header by specifying the field "header"
$ virsh qemu-monitor-command vm '{"execute":"blockdev-add",
> "arguments":{"node-name":"libvirt-2-format", "driver":"luks",
> "file":"libvirt-1-format", "header":"libvirt-2-storage",
> "key-secret":"libvirt-2-format-secret0"}}'

6. add the virtio-blk device finally
$ virsh qemu-monitor-command vm '{"execute":"device_add",
> "arguments": {"num-queues":"1", "driver":"virtio-blk-pci",
> "drive": "libvirt-2-format", "id":"virtio-disk2"}}'

The generic LUKS encryption method of starting a virtual
machine (VM) is somewhat similar to hot-plug in that both
maintaining the same json command while the starting VM
changes the "blockdev-add/device_add" parameters to
"blockdev/device".

Hyman Huang (10):
  crypto: Introduce option and structure for detached LUKS header
  crypto: Support generic LUKS encryption
  qapi: Make parameter 'file' optional for BlockdevCreateOptionsLUKS
  crypto: Introduce creation option and structure for detached LUKS
header
  crypto: Mark the payload_offset_sector invalid for detached LUKS
header
  block: Support detached LUKS header creation using blockdev-create
  block: Support detached LUKS header creation using qemu-img
  crypto: Introduce 'detached-header' field in QCryptoBlockInfoLUKS
  tests: Add detached LUKS header case
  MAINTAINERS: Add section "Detached LUKS header"

 MAINTAINERS   |   5 +
 block.c   |   5 +-
 block/crypto.c| 146 ++--
 block/crypto.h|   8 +
 crypto/block-luks.c   |  49 +++-
 crypto/block.c|   1 +
 crypto/blockpriv.h|   3 +
 qapi/block-core.json  |  14 +-
 qapi/crypto.json  |  13 +-
 tests/qemu-iotests/210.out|   4 +
 tests/qemu-iotests/tests/luks-detached-header | 214 ++
 .../tests/luks-detached-header.out|   5 +
 12 files changed, 436 insertions(+), 31 deletions(-)
 create mode 100755 tests/qemu-iotests/tests/luks-detached-header
 create mode 100644 tests/qemu-iotests/tests/luks-detached-header.out

-- 
2.39.1




[PATCH RESEND v3 04/10] crypto: Introduce creation option and structure for detached LUKS header

2023-12-24 Thread Hyman Huang
Introduce 'header' field in BlockdevCreateOptionsLUKS to support
detached LUKS header creation. Meanwhile, introduce header-related
field in QCryptoBlock.

Signed-off-by: Hyman Huang 
---
 crypto/blockpriv.h   | 3 +++
 qapi/block-core.json | 3 +++
 qapi/crypto.json | 5 -
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/crypto/blockpriv.h b/crypto/blockpriv.h
index 3c7ccea504..6289aea961 100644
--- a/crypto/blockpriv.h
+++ b/crypto/blockpriv.h
@@ -42,6 +42,9 @@ struct QCryptoBlock {
 size_t niv;
 uint64_t payload_offset; /* In bytes */
 uint64_t sector_size; /* In bytes */
+
+bool detached_header; /* True if disk has a detached LUKS header */
+uint64_t detached_header_size; /* LUKS header size plus key slot size */
 };
 
 struct QCryptoBlockDriver {
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 9ac256c489..8aec179926 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -4948,6 +4948,8 @@
 # @file: Node to create the image format on, mandatory except when
 #'preallocation' is not requested
 #
+# @header: Detached LUKS header node to format. (since 9.0)
+#
 # @size: Size of the virtual disk in bytes
 #
 # @preallocation: Preallocation mode for the new image (since: 4.2)
@@ -4958,6 +4960,7 @@
 { 'struct': 'BlockdevCreateOptionsLUKS',
   'base': 'QCryptoBlockCreateOptionsLUKS',
   'data': { '*file':'BlockdevRef',
+'*header':  'BlockdevRef',
 'size': 'size',
 '*preallocation':   'PreallocMode' } }
 
diff --git a/qapi/crypto.json b/qapi/crypto.json
index fd3d46ebd1..6b4e86cb81 100644
--- a/qapi/crypto.json
+++ b/qapi/crypto.json
@@ -195,10 +195,13 @@
 # decryption key.  Mandatory except when probing image for
 # metadata only.
 #
+# @detached-header: if true, disk has detached LUKS header.
+#
 # Since: 2.6
 ##
 { 'struct': 'QCryptoBlockOptionsLUKS',
-  'data': { '*key-secret': 'str' }}
+  'data': { '*key-secret': 'str',
+'*detached-header': 'bool' }}
 
 ##
 # @QCryptoBlockCreateOptionsLUKS:
-- 
2.39.1




[PATCH RESEND v3 10/10] MAINTAINERS: Add section "Detached LUKS header"

2023-12-24 Thread Hyman Huang
I've built interests in block cryptography and also
have been working on projects related to this
subsystem.

Add a section to the MAINTAINERS file for detached
LUKS header, it only has a test case in it currently.

Signed-off-by: Hyman Huang 
---
 MAINTAINERS | 5 +
 1 file changed, 5 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 395f26ba86..f0f7b889a3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3391,6 +3391,11 @@ F: migration/dirtyrate.c
 F: migration/dirtyrate.h
 F: include/sysemu/dirtyrate.h
 
+Detached LUKS header
+M: Hyman Huang 
+S: Maintained
+F: tests/qemu-iotests/tests/luks-detached-header
+
 D-Bus
 M: Marc-André Lureau 
 S: Maintained
-- 
2.39.1




[PATCH RESEND v3 05/10] crypto: Mark the payload_offset_sector invalid for detached LUKS header

2023-12-24 Thread Hyman Huang
Set the payload_offset_sector to a value that is nearly never reached
in order to mark it as invalid and indicate that 0 should be the offset
of the read/write operation on the 'file' protocol blockdev node.

Signed-off-by: Hyman Huang 
---
 crypto/block-luks.c | 41 +++--
 1 file changed, 31 insertions(+), 10 deletions(-)

diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index fb01ec38bb..48443ffcae 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -34,6 +34,8 @@
 
 #include "qemu/bitmap.h"
 
+#define INVALID_SECTOR_OFFSET UINT32_MAX
+
 /*
  * Reference for the LUKS format implemented here is
  *
@@ -136,6 +138,13 @@ struct QCryptoBlockLUKS {
 };
 
 
+static inline uint32_t
+qcrypto_block_luks_payload_offset(uint32_t sector)
+{
+return sector == INVALID_SECTOR_OFFSET ? 0 :
+sector * QCRYPTO_BLOCK_LUKS_SECTOR_SIZE;
+}
+
 static int qcrypto_block_luks_cipher_name_lookup(const char *name,
  QCryptoCipherMode mode,
  uint32_t key_bytes,
@@ -1255,8 +1264,8 @@ qcrypto_block_luks_open(QCryptoBlock *block,
 }
 
 block->sector_size = QCRYPTO_BLOCK_LUKS_SECTOR_SIZE;
-block->payload_offset = luks->header.payload_offset_sector *
-block->sector_size;
+block->payload_offset =
+qcrypto_block_luks_payload_offset(luks->header.payload_offset_sector);
 
 return 0;
 
@@ -1529,16 +1538,28 @@ qcrypto_block_luks_create(QCryptoBlock *block,
 slot->stripes = QCRYPTO_BLOCK_LUKS_STRIPES;
 }
 
-/* The total size of the LUKS headers is the partition header + key
- * slot headers, rounded up to the nearest sector, combined with
- * the size of each master key material region, also rounded up
- * to the nearest sector */
-luks->header.payload_offset_sector = header_sectors +
-QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS * split_key_sectors;
+if (block->detached_header) {
+/*
+ * Set the payload_offset_sector to a value that is nearly never
+ * reached in order to mark it as invalid and indicate that 0 should
+ * be the offset of the read/write operation on the 'file' protocol
+ * blockdev node. Here the UINT32_MAX is choosed
+ */
+luks->header.payload_offset_sector = INVALID_SECTOR_OFFSET;
+} else {
+/*
+ * The total size of the LUKS headers is the partition header + key
+ * slot headers, rounded up to the nearest sector, combined with
+ * the size of each master key material region, also rounded up
+ * to the nearest sector
+ */
+luks->header.payload_offset_sector = header_sectors +
+QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS * split_key_sectors;
+}
 
 block->sector_size = QCRYPTO_BLOCK_LUKS_SECTOR_SIZE;
-block->payload_offset = luks->header.payload_offset_sector *
-block->sector_size;
+block->payload_offset =
+qcrypto_block_luks_payload_offset(luks->header.payload_offset_sector);
 
 /* Reserve header space to match payload offset */
 initfunc(block, block->payload_offset, opaque, _err);
-- 
2.39.1




[PATCH RESEND v3 09/10] tests: Add detached LUKS header case

2023-12-24 Thread Hyman Huang
Signed-off-by: Hyman Huang 
---
 tests/qemu-iotests/tests/luks-detached-header | 214 ++
 .../tests/luks-detached-header.out|   5 +
 2 files changed, 219 insertions(+)
 create mode 100755 tests/qemu-iotests/tests/luks-detached-header
 create mode 100644 tests/qemu-iotests/tests/luks-detached-header.out

diff --git a/tests/qemu-iotests/tests/luks-detached-header 
b/tests/qemu-iotests/tests/luks-detached-header
new file mode 100755
index 00..cf305bfa47
--- /dev/null
+++ b/tests/qemu-iotests/tests/luks-detached-header
@@ -0,0 +1,214 @@
+#!/usr/bin/env python3
+# group: rw auto
+#
+# Test detached LUKS header
+#
+# Copyright (C) 2024 SmartX Inc.
+#
+# Authors:
+# Hyman Huang 
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+import os
+import iotests
+from iotests import imgfmt, qemu_img_create, img_info_log, qemu_img_info, 
QMPTestCase
+
+
+image_size = 128 * 1024 * 1024
+
+luks_img = os.path.join(iotests.test_dir, 'luks.img')
+detached_header_img1 = os.path.join(iotests.test_dir, 'detached_header.img1')
+detached_header_img2 = os.path.join(iotests.test_dir, 'detached_header.img2')
+detached_payload_raw_img = os.path.join(iotests.test_dir, 
'detached_payload_raw.img')
+detached_payload_qcow2_img = os.path.join(iotests.test_dir, 
'detached_payload_qcow2.img')
+
+secret_obj = 'secret,id=sec0,data=foo'
+luks_opts = 'key-secret=sec0'
+
+
+class TestDetachedLUKSHeader(QMPTestCase):
+def setUp(self) -> None:
+self.vm = iotests.VM()
+self.vm.add_object(secret_obj)
+self.vm.launch()
+
+# 1. Create the normal LUKS disk with 128M size
+self.vm.blockdev_create({ 'driver': 'file',
+  'filename': luks_img,
+  'size': 0 })
+self.vm.qmp_log('blockdev-add', driver='file', filename=luks_img,
+ node_name='luks-1-storage')
+result = self.vm.blockdev_create({ 'driver': imgfmt,
+   'file': 'luks-1-storage',
+   'key-secret': 'sec0',
+   'size': image_size,
+   'iter-time': 10 })
+# None is expected
+self.assertEqual(result, None)
+
+# 2. Create the LUKS disk with detached header (raw)
+
+# Create detached LUKS header
+self.vm.blockdev_create({ 'driver': 'file',
+  'filename': detached_header_img1,
+  'size': 0 })
+self.vm.qmp_log('blockdev-add', driver='file', 
filename=detached_header_img1,
+ node_name='luks-2-header-storage')
+
+# Create detached LUKS raw payload
+self.vm.blockdev_create({ 'driver': 'file',
+  'filename': detached_payload_raw_img,
+  'size': 0 })
+self.vm.qmp_log('blockdev-add', driver='file',
+ filename=detached_payload_raw_img,
+ node_name='luks-2-payload-storage')
+
+# Format LUKS disk with detached header
+result = self.vm.blockdev_create({ 'driver': imgfmt,
+   'header': 'luks-2-header-storage',
+   'file': 'luks-2-payload-storage',
+   'key-secret': 'sec0',
+   'preallocation': 'full',
+   'size': image_size,
+   'iter-time': 10 })
+self.assertEqual(result, None)
+
+self.vm.shutdown()
+
+# 3. Create the LUKS disk with detached header (qcow2)
+
+# Create detached LUKS header using qemu-img
+res = qemu_img_create('-f', 'luks', '--object', secret_obj, '-o', 
luks_opts,
+  '-o', "detached-mode=true", detached_header_img2)
+assert res.returncode == 0
+
+# Create detached LUKS qcow2 payload
+res = qemu_img_create('-f', 'qcow2', detached_payload_qcow2_img, 
str(image_size))
+assert res.returncode == 0
+
+def tearDown(self) -> None:
+os.remove(luks_img)
+os.remove(detached_header_img1)
+os.remove(detached_header_img2)
+

[PATCH RESEND v3 01/10] crypto: Introduce option and structure for detached LUKS header

2023-12-24 Thread Hyman Huang
Add the "header" option for the LUKS format. This field would be
used to identify the blockdev's position where a detachable LUKS
header is stored.

In addition, introduce header field in struct BlockCrypto

Signed-off-by: Hyman Huang 
Reviewed-by: Daniel P. Berrangé 
Message-Id: 
<5b99f60c7317092a563d7ca3fb4b414197015eb2.1701879996.git.yong.hu...@smartx.com>
---
 block/crypto.c   | 1 +
 qapi/block-core.json | 6 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/block/crypto.c b/block/crypto.c
index 921933a5e5..f82b13d32b 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -39,6 +39,7 @@ typedef struct BlockCrypto BlockCrypto;
 struct BlockCrypto {
 QCryptoBlock *block;
 bool updating_keys;
+BdrvChild *header;  /* Reference to the detached LUKS header */
 };
 
 
diff --git a/qapi/block-core.json b/qapi/block-core.json
index ca390c5700..10be08d08f 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3352,11 +3352,15 @@
 # decryption key (since 2.6). Mandatory except when doing a
 # metadata-only probe of the image.
 #
+# @header: optional reference to the location of a blockdev
+# storing a detached LUKS header. (since 9.0)
+#
 # Since: 2.9
 ##
 { 'struct': 'BlockdevOptionsLUKS',
   'base': 'BlockdevOptionsGenericFormat',
-  'data': { '*key-secret': 'str' } }
+  'data': { '*key-secret': 'str',
+'*header': 'BlockdevRef'} }
 
 ##
 # @BlockdevOptionsGenericCOWFormat:
-- 
2.39.1




[PATCH RESEND v3 08/10] crypto: Introduce 'detached-header' field in QCryptoBlockInfoLUKS

2023-12-24 Thread Hyman Huang
When querying the LUKS disk with the qemu-img tool or other APIs,
add information about whether the LUKS header is detached.

Additionally, update the test case with the appropriate
modification.

Signed-off-by: Hyman Huang 
---
 crypto/block-luks.c| 2 ++
 qapi/crypto.json   | 3 +++
 tests/qemu-iotests/210.out | 4 
 3 files changed, 9 insertions(+)

diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index 474c7aee2e..c5e53b4ee4 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -1266,6 +1266,7 @@ qcrypto_block_luks_open(QCryptoBlock *block,
 block->sector_size = QCRYPTO_BLOCK_LUKS_SECTOR_SIZE;
 block->payload_offset =
 qcrypto_block_luks_payload_offset(luks->header.payload_offset_sector);
+block->detached_header = (block->payload_offset == 0) ? true : false;
 
 return 0;
 
@@ -1892,6 +1893,7 @@ static int qcrypto_block_luks_get_info(QCryptoBlock 
*block,
 info->u.luks.master_key_iters = luks->header.master_key_iterations;
 info->u.luks.uuid = g_strndup((const char *)luks->header.uuid,
   sizeof(luks->header.uuid));
+info->u.luks.detached_header = block->detached_header;
 
 for (i = 0; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS; i++) {
 slot = g_new0(QCryptoBlockInfoLUKSSlot, 1);
diff --git a/qapi/crypto.json b/qapi/crypto.json
index 8e81aa8454..336c880b5d 100644
--- a/qapi/crypto.json
+++ b/qapi/crypto.json
@@ -317,6 +317,8 @@
 #
 # @hash-alg: the master key hash algorithm
 #
+# @detached-header: whether the LUKS header is detached (Since 9.0)
+#
 # @payload-offset: offset to the payload data in bytes
 #
 # @master-key-iters: number of PBKDF2 iterations for key material
@@ -333,6 +335,7 @@
'ivgen-alg': 'QCryptoIVGenAlgorithm',
'*ivgen-hash-alg': 'QCryptoHashAlgorithm',
'hash-alg': 'QCryptoHashAlgorithm',
+   'detached-header': 'bool',
'payload-offset': 'int',
'master-key-iters': 'int',
'uuid': 'str',
diff --git a/tests/qemu-iotests/210.out b/tests/qemu-iotests/210.out
index 96d9f749dd..94b29b2120 100644
--- a/tests/qemu-iotests/210.out
+++ b/tests/qemu-iotests/210.out
@@ -18,6 +18,7 @@ virtual size: 128 MiB (134217728 bytes)
 encrypted: yes
 Format specific information:
 ivgen alg: plain64
+detached header: false
 hash alg: sha256
 cipher alg: aes-256
 uuid: ----
@@ -70,6 +71,7 @@ virtual size: 64 MiB (67108864 bytes)
 encrypted: yes
 Format specific information:
 ivgen alg: plain64
+detached header: false
 hash alg: sha1
 cipher alg: aes-128
 uuid: ----
@@ -125,6 +127,7 @@ virtual size: 0 B (0 bytes)
 encrypted: yes
 Format specific information:
 ivgen alg: plain64
+detached header: false
 hash alg: sha256
 cipher alg: aes-256
 uuid: ----
@@ -195,6 +198,7 @@ virtual size: 0 B (0 bytes)
 encrypted: yes
 Format specific information:
 ivgen alg: plain64
+detached header: false
 hash alg: sha256
 cipher alg: aes-256
 uuid: ----
-- 
2.39.1




[PATCH RESEND v3 03/10] qapi: Make parameter 'file' optional for BlockdevCreateOptionsLUKS

2023-12-24 Thread Hyman Huang
To support detached LUKS header creation, make the existing 'file'
filed in BlockdevCreateOptionsLUKS optional, while also adding an
extra optional 'header' field in the next commit.

Signed-off-by: Hyman Huang 
---
 block/crypto.c   | 21 ++---
 qapi/block-core.json |  5 +++--
 2 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/block/crypto.c b/block/crypto.c
index 6063879bac..78fbe79c95 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -659,9 +659,9 @@ block_crypto_co_create_luks(BlockdevCreateOptions 
*create_options, Error **errp)
 assert(create_options->driver == BLOCKDEV_DRIVER_LUKS);
 luks_opts = _options->u.luks;
 
-bs = bdrv_co_open_blockdev_ref(luks_opts->file, errp);
-if (bs == NULL) {
-return -EIO;
+if (luks_opts->file == NULL) {
+error_setg(errp, "Formatting LUKS disk requires parameter 'file'");
+return -EINVAL;
 }
 
 create_opts = (QCryptoBlockCreateOptions) {
@@ -673,10 +673,17 @@ block_crypto_co_create_luks(BlockdevCreateOptions 
*create_options, Error **errp)
 preallocation = luks_opts->preallocation;
 }
 
-ret = block_crypto_co_create_generic(bs, luks_opts->size, _opts,
- preallocation, errp);
-if (ret < 0) {
-goto fail;
+if (luks_opts->file) {
+bs = bdrv_co_open_blockdev_ref(luks_opts->file, errp);
+if (bs == NULL) {
+return -EIO;
+}
+
+ret = block_crypto_co_create_generic(bs, luks_opts->size, _opts,
+ preallocation, errp);
+if (ret < 0) {
+goto fail;
+}
 }
 
 ret = 0;
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 10be08d08f..9ac256c489 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -4945,7 +4945,8 @@
 #
 # Driver specific image creation options for LUKS.
 #
-# @file: Node to create the image format on
+# @file: Node to create the image format on, mandatory except when
+#'preallocation' is not requested
 #
 # @size: Size of the virtual disk in bytes
 #
@@ -4956,7 +4957,7 @@
 ##
 { 'struct': 'BlockdevCreateOptionsLUKS',
   'base': 'QCryptoBlockCreateOptionsLUKS',
-  'data': { 'file': 'BlockdevRef',
+  'data': { '*file':'BlockdevRef',
 'size': 'size',
 '*preallocation':   'PreallocMode' } }
 
-- 
2.39.1




[PATCH RESEND v3 06/10] block: Support detached LUKS header creation using blockdev-create

2023-12-24 Thread Hyman Huang
The LUKS disk with detached header consists of a separate LUKS
header and payload. This LUKS disk type should be formatted
as follows:

1. add the secret to lock/unlock the cipher stored in the
   detached LUKS header
$ virsh qemu-monitor-command vm '{"execute":"object-add",
> "arguments":{"qom-type": "secret", "id": "sec0", "data": "foo"}}'

2. create a header img with 0 size
$ virsh qemu-monitor-command vm '{"execute":"blockdev-create",
> "arguments":{"job-id":"job0", "options":{"driver":"file",
> "filename":"/path/to/detached_luks_header.img", "size":0 }}}'

3. add protocol blockdev node for header
$ virsh qemu-monitor-command vm '{"execute":"blockdev-add",
> "arguments": {"driver":"file", "filename":
> "/path/to/detached_luks_header.img", "node-name":
> "detached-luks-header-storage"}}'

4. create a payload img with 0 size
$ virsh qemu-monitor-command vm '{"execute":"blockdev-create",
> "arguments":{"job-id":"job1", "options":{"driver":"file",
> "filename":"/path/to/detached_luks_payload_raw.img", "size":0}}}'

5. add protocol blockdev node for payload
$ virsh qemu-monitor-command vm '{"execute":"blockdev-add",
> "arguments": {"driver":"file", "filename":
> "/path/to/detached_luks_payload_raw.img", "node-name":
> "luks-payload-raw-storage"}}'

6. do the formatting with 128M size
$ virsh qemu-monitor-command c81_node1 '{"execute":"blockdev-create",
> "arguments":{"job-id":"job2", "options":{"driver":"luks", "header":
> "detached-luks-header-storage", "file":"luks-payload-raw-storage",
> "size":134217728, "preallocation":"full", "key-secret":"sec0" }}}'

Signed-off-by: Hyman Huang 
---
 block/crypto.c  | 109 
 crypto/block-luks.c |   6 ++-
 crypto/block.c  |   1 +
 3 files changed, 106 insertions(+), 10 deletions(-)

diff --git a/block/crypto.c b/block/crypto.c
index 78fbe79c95..76cc8bda49 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -160,6 +160,48 @@ error:
 return ret;
 }
 
+static int coroutine_fn GRAPH_UNLOCKED
+block_crypto_co_format_luks_payload(BlockdevCreateOptionsLUKS *luks_opts,
+Error **errp)
+{
+BlockDriverState *bs = NULL;
+BlockBackend *blk = NULL;
+Error *local_error = NULL;
+int ret;
+
+if (luks_opts->size > INT64_MAX) {
+return -EFBIG;
+}
+
+bs = bdrv_co_open_blockdev_ref(luks_opts->file, errp);
+if (bs == NULL) {
+return -EIO;
+}
+
+blk = blk_co_new_with_bs(bs, BLK_PERM_WRITE | BLK_PERM_RESIZE,
+ BLK_PERM_ALL, errp);
+if (!blk) {
+ret = -EPERM;
+goto fail;
+}
+
+ret = blk_truncate(blk, luks_opts->size, true,
+   luks_opts->preallocation, 0, _error);
+if (ret < 0) {
+if (ret == -EFBIG) {
+/* Replace the error message with a better one */
+error_free(local_error);
+error_setg(errp, "The requested file size is too large");
+}
+goto fail;
+}
+
+ret = 0;
+
+fail:
+bdrv_co_unref(bs);
+return ret;
+}
 
 static QemuOptsList block_crypto_runtime_opts_luks = {
 .name = "crypto",
@@ -651,6 +693,7 @@ static int coroutine_fn GRAPH_UNLOCKED
 block_crypto_co_create_luks(BlockdevCreateOptions *create_options, Error 
**errp)
 {
 BlockdevCreateOptionsLUKS *luks_opts;
+BlockDriverState *hdr_bs = NULL;
 BlockDriverState *bs = NULL;
 QCryptoBlockCreateOptions create_opts;
 PreallocMode preallocation = PREALLOC_MODE_OFF;
@@ -659,8 +702,22 @@ block_crypto_co_create_luks(BlockdevCreateOptions 
*create_options, Error **errp)
 assert(create_options->driver == BLOCKDEV_DRIVER_LUKS);
 luks_opts = _options->u.luks;
 
-if (luks_opts->file == NULL) {
-error_setg(errp, "Formatting LUKS disk requires parameter 'file'");
+if (luks_opts->header == NULL && luks_opts->file == NULL) {
+error_setg(errp, "Either the parameter 'header' or 'file' should "
+   "be specified");
+return -EINVAL;
+}
+
+if (luks_opts->detached_header && luks_opts->header == NULL) {
+error_setg(errp, "Formatting a detached LUKS disk requries "
+   "'header' to be specified");
+return -EINVAL;
+}
+
+if ((luks_opts->preallocation != PREALLOC_MODE_OFF) &&
+(luks_opts->file == NULL)) {
+error_setg(errp, "Parameter 'preallocation' requries 'file' to be "
+   "specified for formatting LUKS disk");
 return -EINVAL;
 }
 
@@ -673,7 +730,40 @@ block_crypto_co_create_luks(BlockdevCreateOptions 
*create_options, Error **errp)
 preallocation = luks_opts->preallocation;
 }
 
-if (luks_opts->file) {
+if (luks_opts->header) {
+hdr_bs = bdrv_co_open_blockdev_ref(luks_opts->header, errp);
+if (hdr_bs == NULL) {
+return -EIO;
+}
+
+/*
+ * If blockdev reference of header is specified,
+ * detached_header default to 

[PATCH RESEND v3 02/10] crypto: Support generic LUKS encryption

2023-12-24 Thread Hyman Huang
By enhancing the LUKS driver, it is possible to enable
the detachable LUKS header and, as a result, achieve
general encryption for any disk format that QEMU has
supported.

Take the qcow2 as an example, the usage of the generic
LUKS encryption as follows:

1. add a protocol blockdev node of data disk
$ virsh qemu-monitor-command vm '{"execute":"blockdev-add",
> "arguments":{"node-name":"libvirt-1-storage", "driver":"file",
> "filename":"/path/to/test_disk.qcow2"}}'

2. add a protocol blockdev node of LUKS header as above.
$ virsh qemu-monitor-command vm '{"execute":"blockdev-add",
> "arguments":{"node-name":"libvirt-2-storage", "driver":"file",
> "filename": "/path/to/cipher.gluks" }}'

3. add the secret for decrypting the cipher stored in LUKS
   header above
$ virsh qemu-monitor-command vm '{"execute":"object-add",
> "arguments":{"qom-type":"secret", "id":
> "libvirt-2-storage-secret0", "data":"abc123"}}'

4. add the qcow2-drived blockdev format node
$ virsh qemu-monitor-command vm '{"execute":"blockdev-add",
> "arguments":{"node-name":"libvirt-1-format", "driver":"qcow2",
> "file":"libvirt-1-storage"}}'

5. add the luks-drived blockdev to link the qcow2 disk with
   LUKS header by specifying the field "header"
$ virsh qemu-monitor-command vm '{"execute":"blockdev-add",
> "arguments":{"node-name":"libvirt-2-format", "driver":"luks",
> "file":"libvirt-1-format", "header":"libvirt-2-storage",
> "key-secret":"libvirt-2-format-secret0"}}'

6. add the virtio-blk device finally
$ virsh qemu-monitor-command vm '{"execute":"device_add",
> "arguments": {"num-queues":"1", "driver":"virtio-blk-pci",
> "drive": "libvirt-2-format", "id":"virtio-disk2"}}'

The generic LUKS encryption method of starting a virtual
machine (VM) is somewhat similar to hot-plug in that both
maintaining the same json command while the starting VM
changes the "blockdev-add/device_add" parameters to
"blockdev/device".

Signed-off-by: Hyman Huang 
Message-Id: 
<910801f303da1601051479d3b7e5c2c6b4e01eb7.1701879996.git.yong.hu...@smartx.com>
---
 block/crypto.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/block/crypto.c b/block/crypto.c
index f82b13d32b..6063879bac 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -64,12 +64,14 @@ static int block_crypto_read_func(QCryptoBlock *block,
   Error **errp)
 {
 BlockDriverState *bs = opaque;
+BlockCrypto *crypto = bs->opaque;
 ssize_t ret;
 
 GLOBAL_STATE_CODE();
 GRAPH_RDLOCK_GUARD_MAINLOOP();
 
-ret = bdrv_pread(bs->file, offset, buflen, buf, 0);
+ret = bdrv_pread(crypto->header ? crypto->header : bs->file,
+ offset, buflen, buf, 0);
 if (ret < 0) {
 error_setg_errno(errp, -ret, "Could not read encryption header");
 return ret;
@@ -269,6 +271,7 @@ static int block_crypto_open_generic(QCryptoBlockFormat 
format,
 QCryptoBlockOpenOptions *open_opts = NULL;
 unsigned int cflags = 0;
 QDict *cryptoopts = NULL;
+const char *hdr_bdref = qdict_get_try_str(options, "header");
 
 GLOBAL_STATE_CODE();
 
@@ -277,6 +280,15 @@ static int block_crypto_open_generic(QCryptoBlockFormat 
format,
 return ret;
 }
 
+if (hdr_bdref) {
+crypto->header = bdrv_open_child(NULL, options, "header", bs,
+ _of_bds, BDRV_CHILD_METADATA,
+ false, errp);
+if (!crypto->header) {
+return -EINVAL;
+}
+}
+
 GRAPH_RDLOCK_GUARD_MAINLOOP();
 
 bs->supported_write_flags = BDRV_REQ_FUA &
-- 
2.39.1




[PATCH RESEND v3 07/10] block: Support detached LUKS header creation using qemu-img

2023-12-24 Thread Hyman Huang
Add the 'detached-mode' option to specify the creation of
a detached LUKS header. This is how it is used:
$ qemu-img create --object secret,id=sec0,data=abc123 -f luks
> -o cipher-alg=aes-256,cipher-mode=xts -o key-secret=sec0
> -o detached-mode=true header.luks

Signed-off-by: Hyman Huang 
---
 block.c  | 5 -
 block/crypto.c   | 9 -
 block/crypto.h   | 8 
 qapi/crypto.json | 5 -
 4 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/block.c b/block.c
index bfb0861ec6..fa9ce36928 100644
--- a/block.c
+++ b/block.c
@@ -7517,7 +7517,10 @@ void bdrv_img_create(const char *filename, const char 
*fmt,
 goto out;
 }
 
-if (size == -1) {
+/* Parameter 'size' is not needed for detached LUKS header */
+if (size == -1 &&
+!(!strcmp(fmt, "luks") &&
+  qemu_opt_get_bool(opts, "detached-mode", false))) {
 error_setg(errp, "Image creation needs a size parameter");
 goto out;
 }
diff --git a/block/crypto.c b/block/crypto.c
index 76cc8bda49..812c3c28f5 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -229,6 +229,7 @@ static QemuOptsList block_crypto_create_opts_luks = {
 BLOCK_CRYPTO_OPT_DEF_LUKS_IVGEN_HASH_ALG(""),
 BLOCK_CRYPTO_OPT_DEF_LUKS_HASH_ALG(""),
 BLOCK_CRYPTO_OPT_DEF_LUKS_ITER_TIME(""),
+BLOCK_CRYPTO_OPT_DEF_LUKS_DETACHED_MODE(""),
 { /* end of list */ }
 },
 };
@@ -793,6 +794,8 @@ block_crypto_co_create_opts_luks(BlockDriver *drv, const 
char *filename,
 PreallocMode prealloc;
 char *buf = NULL;
 int64_t size;
+bool detached_mode =
+qemu_opt_get_bool(opts, "detached-mode", false);
 int ret;
 Error *local_err = NULL;
 
@@ -832,8 +835,12 @@ block_crypto_co_create_opts_luks(BlockDriver *drv, const 
char *filename,
 goto fail;
 }
 
+   /* The detached_header default to true if detached-mode is specified */
+create_opts->u.luks.detached_header = detached_mode ? true : false;
+
 /* Create format layer */
-ret = block_crypto_co_create_generic(bs, size, create_opts, prealloc, 
errp);
+ret = block_crypto_co_create_generic(bs, detached_mode ? 0 : size,
+ create_opts, prealloc, errp);
 if (ret < 0) {
 goto fail;
 }
diff --git a/block/crypto.h b/block/crypto.h
index 72e792c9af..bceefd45bd 100644
--- a/block/crypto.h
+++ b/block/crypto.h
@@ -41,6 +41,7 @@
 #define BLOCK_CRYPTO_OPT_LUKS_IVGEN_HASH_ALG "ivgen-hash-alg"
 #define BLOCK_CRYPTO_OPT_LUKS_HASH_ALG "hash-alg"
 #define BLOCK_CRYPTO_OPT_LUKS_ITER_TIME "iter-time"
+#define BLOCK_CRYPTO_OPT_LUKS_DETACHED_MODE "detached-mode"
 #define BLOCK_CRYPTO_OPT_LUKS_KEYSLOT "keyslot"
 #define BLOCK_CRYPTO_OPT_LUKS_STATE "state"
 #define BLOCK_CRYPTO_OPT_LUKS_OLD_SECRET "old-secret"
@@ -100,6 +101,13 @@
 .help = "Select new state of affected keyslots (active/inactive)",\
 }
 
+#define BLOCK_CRYPTO_OPT_DEF_LUKS_DETACHED_MODE(prefix) \
+{ \
+.name = prefix BLOCK_CRYPTO_OPT_LUKS_DETACHED_MODE, \
+.type = QEMU_OPT_BOOL,\
+.help = "Create a detached LUKS header",  \
+}
+
 #define BLOCK_CRYPTO_OPT_DEF_LUKS_KEYSLOT(prefix)  \
 {  \
 .name = prefix BLOCK_CRYPTO_OPT_LUKS_KEYSLOT,  \
diff --git a/qapi/crypto.json b/qapi/crypto.json
index 6b4e86cb81..8e81aa8454 100644
--- a/qapi/crypto.json
+++ b/qapi/crypto.json
@@ -226,6 +226,8 @@
 # @iter-time: number of milliseconds to spend in PBKDF passphrase
 # processing.  Currently defaults to 2000. (since 2.8)
 #
+# @detached-mode: create a detached LUKS header. (since 9.0)
+#
 # Since: 2.6
 ##
 { 'struct': 'QCryptoBlockCreateOptionsLUKS',
@@ -235,7 +237,8 @@
 '*ivgen-alg': 'QCryptoIVGenAlgorithm',
 '*ivgen-hash-alg': 'QCryptoHashAlgorithm',
 '*hash-alg': 'QCryptoHashAlgorithm',
-'*iter-time': 'int'}}
+'*iter-time': 'int',
+'*detached-mode': 'bool'}}
 
 ##
 # @QCryptoBlockOpenOptions:
-- 
2.39.1




[v3 04/10] crypto: Introduce creation option and structure for detached LUKS header

2023-12-24 Thread Hyman Huang
Introduce 'header' field in BlockdevCreateOptionsLUKS to support
detached LUKS header creation. Meanwhile, introduce header-related
field in QCryptoBlock.

Signed-off-by: Hyman Huang 
---
 crypto/blockpriv.h   | 3 +++
 qapi/block-core.json | 3 +++
 qapi/crypto.json | 5 -
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/crypto/blockpriv.h b/crypto/blockpriv.h
index 3c7ccea504..6289aea961 100644
--- a/crypto/blockpriv.h
+++ b/crypto/blockpriv.h
@@ -42,6 +42,9 @@ struct QCryptoBlock {
 size_t niv;
 uint64_t payload_offset; /* In bytes */
 uint64_t sector_size; /* In bytes */
+
+bool detached_header; /* True if disk has a detached LUKS header */
+uint64_t detached_header_size; /* LUKS header size plus key slot size */
 };
 
 struct QCryptoBlockDriver {
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 9ac256c489..8aec179926 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -4948,6 +4948,8 @@
 # @file: Node to create the image format on, mandatory except when
 #'preallocation' is not requested
 #
+# @header: Detached LUKS header node to format. (since 9.0)
+#
 # @size: Size of the virtual disk in bytes
 #
 # @preallocation: Preallocation mode for the new image (since: 4.2)
@@ -4958,6 +4960,7 @@
 { 'struct': 'BlockdevCreateOptionsLUKS',
   'base': 'QCryptoBlockCreateOptionsLUKS',
   'data': { '*file':'BlockdevRef',
+'*header':  'BlockdevRef',
 'size': 'size',
 '*preallocation':   'PreallocMode' } }
 
diff --git a/qapi/crypto.json b/qapi/crypto.json
index fd3d46ebd1..6b4e86cb81 100644
--- a/qapi/crypto.json
+++ b/qapi/crypto.json
@@ -195,10 +195,13 @@
 # decryption key.  Mandatory except when probing image for
 # metadata only.
 #
+# @detached-header: if true, disk has detached LUKS header.
+#
 # Since: 2.6
 ##
 { 'struct': 'QCryptoBlockOptionsLUKS',
-  'data': { '*key-secret': 'str' }}
+  'data': { '*key-secret': 'str',
+'*detached-header': 'bool' }}
 
 ##
 # @QCryptoBlockCreateOptionsLUKS:
-- 
2.39.1




[v3 09/10] tests: Add detached LUKS header case

2023-12-24 Thread Hyman Huang
Signed-off-by: Hyman Huang 
---
 tests/qemu-iotests/tests/luks-detached-header | 214 ++
 .../tests/luks-detached-header.out|   5 +
 2 files changed, 219 insertions(+)
 create mode 100755 tests/qemu-iotests/tests/luks-detached-header
 create mode 100644 tests/qemu-iotests/tests/luks-detached-header.out

diff --git a/tests/qemu-iotests/tests/luks-detached-header 
b/tests/qemu-iotests/tests/luks-detached-header
new file mode 100755
index 00..cf305bfa47
--- /dev/null
+++ b/tests/qemu-iotests/tests/luks-detached-header
@@ -0,0 +1,214 @@
+#!/usr/bin/env python3
+# group: rw auto
+#
+# Test detached LUKS header
+#
+# Copyright (C) 2024 SmartX Inc.
+#
+# Authors:
+# Hyman Huang 
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+import os
+import iotests
+from iotests import imgfmt, qemu_img_create, img_info_log, qemu_img_info, 
QMPTestCase
+
+
+image_size = 128 * 1024 * 1024
+
+luks_img = os.path.join(iotests.test_dir, 'luks.img')
+detached_header_img1 = os.path.join(iotests.test_dir, 'detached_header.img1')
+detached_header_img2 = os.path.join(iotests.test_dir, 'detached_header.img2')
+detached_payload_raw_img = os.path.join(iotests.test_dir, 
'detached_payload_raw.img')
+detached_payload_qcow2_img = os.path.join(iotests.test_dir, 
'detached_payload_qcow2.img')
+
+secret_obj = 'secret,id=sec0,data=foo'
+luks_opts = 'key-secret=sec0'
+
+
+class TestDetachedLUKSHeader(QMPTestCase):
+def setUp(self) -> None:
+self.vm = iotests.VM()
+self.vm.add_object(secret_obj)
+self.vm.launch()
+
+# 1. Create the normal LUKS disk with 128M size
+self.vm.blockdev_create({ 'driver': 'file',
+  'filename': luks_img,
+  'size': 0 })
+self.vm.qmp_log('blockdev-add', driver='file', filename=luks_img,
+ node_name='luks-1-storage')
+result = self.vm.blockdev_create({ 'driver': imgfmt,
+   'file': 'luks-1-storage',
+   'key-secret': 'sec0',
+   'size': image_size,
+   'iter-time': 10 })
+# None is expected
+self.assertEqual(result, None)
+
+# 2. Create the LUKS disk with detached header (raw)
+
+# Create detached LUKS header
+self.vm.blockdev_create({ 'driver': 'file',
+  'filename': detached_header_img1,
+  'size': 0 })
+self.vm.qmp_log('blockdev-add', driver='file', 
filename=detached_header_img1,
+ node_name='luks-2-header-storage')
+
+# Create detached LUKS raw payload
+self.vm.blockdev_create({ 'driver': 'file',
+  'filename': detached_payload_raw_img,
+  'size': 0 })
+self.vm.qmp_log('blockdev-add', driver='file',
+ filename=detached_payload_raw_img,
+ node_name='luks-2-payload-storage')
+
+# Format LUKS disk with detached header
+result = self.vm.blockdev_create({ 'driver': imgfmt,
+   'header': 'luks-2-header-storage',
+   'file': 'luks-2-payload-storage',
+   'key-secret': 'sec0',
+   'preallocation': 'full',
+   'size': image_size,
+   'iter-time': 10 })
+self.assertEqual(result, None)
+
+self.vm.shutdown()
+
+# 3. Create the LUKS disk with detached header (qcow2)
+
+# Create detached LUKS header using qemu-img
+res = qemu_img_create('-f', 'luks', '--object', secret_obj, '-o', 
luks_opts,
+  '-o', "detached-mode=true", detached_header_img2)
+assert res.returncode == 0
+
+# Create detached LUKS qcow2 payload
+res = qemu_img_create('-f', 'qcow2', detached_payload_qcow2_img, 
str(image_size))
+assert res.returncode == 0
+
+def tearDown(self) -> None:
+os.remove(luks_img)
+os.remove(detached_header_img1)
+os.remove(detached_header_img2)
+

[v3 06/10] block: Support detached LUKS header creation using blockdev-create

2023-12-24 Thread Hyman Huang
The LUKS disk with detached header consists of a separate LUKS
header and payload. This LUKS disk type should be formatted
as follows:

1. add the secret to lock/unlock the cipher stored in the
   detached LUKS header
$ virsh qemu-monitor-command vm '{"execute":"object-add",
> "arguments":{"qom-type": "secret", "id": "sec0", "data": "foo"}}'

2. create a header img with 0 size
$ virsh qemu-monitor-command vm '{"execute":"blockdev-create",
> "arguments":{"job-id":"job0", "options":{"driver":"file",
> "filename":"/path/to/detached_luks_header.img", "size":0 }}}'

3. add protocol blockdev node for header
$ virsh qemu-monitor-command vm '{"execute":"blockdev-add",
> "arguments": {"driver":"file", "filename":
> "/path/to/detached_luks_header.img", "node-name":
> "detached-luks-header-storage"}}'

4. create a payload img with 0 size
$ virsh qemu-monitor-command vm '{"execute":"blockdev-create",
> "arguments":{"job-id":"job1", "options":{"driver":"file",
> "filename":"/path/to/detached_luks_payload_raw.img", "size":0}}}'

5. add protocol blockdev node for payload
$ virsh qemu-monitor-command vm '{"execute":"blockdev-add",
> "arguments": {"driver":"file", "filename":
> "/path/to/detached_luks_payload_raw.img", "node-name":
> "luks-payload-raw-storage"}}'

6. do the formatting with 128M size
$ virsh qemu-monitor-command c81_node1 '{"execute":"blockdev-create",
> "arguments":{"job-id":"job2", "options":{"driver":"luks", "header":
> "detached-luks-header-storage", "file":"luks-payload-raw-storage",
> "size":134217728, "preallocation":"full", "key-secret":"sec0" }}}'

Signed-off-by: Hyman Huang 
---
 block/crypto.c  | 109 
 crypto/block-luks.c |   6 ++-
 crypto/block.c  |   1 +
 3 files changed, 106 insertions(+), 10 deletions(-)

diff --git a/block/crypto.c b/block/crypto.c
index 78fbe79c95..76cc8bda49 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -160,6 +160,48 @@ error:
 return ret;
 }
 
+static int coroutine_fn GRAPH_UNLOCKED
+block_crypto_co_format_luks_payload(BlockdevCreateOptionsLUKS *luks_opts,
+Error **errp)
+{
+BlockDriverState *bs = NULL;
+BlockBackend *blk = NULL;
+Error *local_error = NULL;
+int ret;
+
+if (luks_opts->size > INT64_MAX) {
+return -EFBIG;
+}
+
+bs = bdrv_co_open_blockdev_ref(luks_opts->file, errp);
+if (bs == NULL) {
+return -EIO;
+}
+
+blk = blk_co_new_with_bs(bs, BLK_PERM_WRITE | BLK_PERM_RESIZE,
+ BLK_PERM_ALL, errp);
+if (!blk) {
+ret = -EPERM;
+goto fail;
+}
+
+ret = blk_truncate(blk, luks_opts->size, true,
+   luks_opts->preallocation, 0, _error);
+if (ret < 0) {
+if (ret == -EFBIG) {
+/* Replace the error message with a better one */
+error_free(local_error);
+error_setg(errp, "The requested file size is too large");
+}
+goto fail;
+}
+
+ret = 0;
+
+fail:
+bdrv_co_unref(bs);
+return ret;
+}
 
 static QemuOptsList block_crypto_runtime_opts_luks = {
 .name = "crypto",
@@ -651,6 +693,7 @@ static int coroutine_fn GRAPH_UNLOCKED
 block_crypto_co_create_luks(BlockdevCreateOptions *create_options, Error 
**errp)
 {
 BlockdevCreateOptionsLUKS *luks_opts;
+BlockDriverState *hdr_bs = NULL;
 BlockDriverState *bs = NULL;
 QCryptoBlockCreateOptions create_opts;
 PreallocMode preallocation = PREALLOC_MODE_OFF;
@@ -659,8 +702,22 @@ block_crypto_co_create_luks(BlockdevCreateOptions 
*create_options, Error **errp)
 assert(create_options->driver == BLOCKDEV_DRIVER_LUKS);
 luks_opts = _options->u.luks;
 
-if (luks_opts->file == NULL) {
-error_setg(errp, "Formatting LUKS disk requires parameter 'file'");
+if (luks_opts->header == NULL && luks_opts->file == NULL) {
+error_setg(errp, "Either the parameter 'header' or 'file' should "
+   "be specified");
+return -EINVAL;
+}
+
+if (luks_opts->detached_header && luks_opts->header == NULL) {
+error_setg(errp, "Formatting a detached LUKS disk requries "
+   "'header' to be specified");
+return -EINVAL;
+}
+
+if ((luks_opts->preallocation != PREALLOC_MODE_OFF) &&
+(luks_opts->file == NULL)) {
+error_setg(errp, "Parameter 'preallocation' requries 'file' to be "
+   "specified for formatting LUKS disk");
 return -EINVAL;
 }
 
@@ -673,7 +730,40 @@ block_crypto_co_create_luks(BlockdevCreateOptions 
*create_options, Error **errp)
 preallocation = luks_opts->preallocation;
 }
 
-if (luks_opts->file) {
+if (luks_opts->header) {
+hdr_bs = bdrv_co_open_blockdev_ref(luks_opts->header, errp);
+if (hdr_bs == NULL) {
+return -EIO;
+}
+
+/*
+ * If blockdev reference of header is specified,
+ * detached_header default to 

[v3 03/10] qapi: Make parameter 'file' optional for BlockdevCreateOptionsLUKS

2023-12-24 Thread Hyman Huang
To support detached LUKS header creation, make the existing 'file'
filed in BlockdevCreateOptionsLUKS optional, while also adding an
extra optional 'header' field in the next commit.

Signed-off-by: Hyman Huang 
---
 block/crypto.c   | 21 ++---
 qapi/block-core.json |  5 +++--
 2 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/block/crypto.c b/block/crypto.c
index 6063879bac..78fbe79c95 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -659,9 +659,9 @@ block_crypto_co_create_luks(BlockdevCreateOptions 
*create_options, Error **errp)
 assert(create_options->driver == BLOCKDEV_DRIVER_LUKS);
 luks_opts = _options->u.luks;
 
-bs = bdrv_co_open_blockdev_ref(luks_opts->file, errp);
-if (bs == NULL) {
-return -EIO;
+if (luks_opts->file == NULL) {
+error_setg(errp, "Formatting LUKS disk requires parameter 'file'");
+return -EINVAL;
 }
 
 create_opts = (QCryptoBlockCreateOptions) {
@@ -673,10 +673,17 @@ block_crypto_co_create_luks(BlockdevCreateOptions 
*create_options, Error **errp)
 preallocation = luks_opts->preallocation;
 }
 
-ret = block_crypto_co_create_generic(bs, luks_opts->size, _opts,
- preallocation, errp);
-if (ret < 0) {
-goto fail;
+if (luks_opts->file) {
+bs = bdrv_co_open_blockdev_ref(luks_opts->file, errp);
+if (bs == NULL) {
+return -EIO;
+}
+
+ret = block_crypto_co_create_generic(bs, luks_opts->size, _opts,
+ preallocation, errp);
+if (ret < 0) {
+goto fail;
+}
 }
 
 ret = 0;
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 10be08d08f..9ac256c489 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -4945,7 +4945,8 @@
 #
 # Driver specific image creation options for LUKS.
 #
-# @file: Node to create the image format on
+# @file: Node to create the image format on, mandatory except when
+#'preallocation' is not requested
 #
 # @size: Size of the virtual disk in bytes
 #
@@ -4956,7 +4957,7 @@
 ##
 { 'struct': 'BlockdevCreateOptionsLUKS',
   'base': 'QCryptoBlockCreateOptionsLUKS',
-  'data': { 'file': 'BlockdevRef',
+  'data': { '*file':'BlockdevRef',
 'size': 'size',
 '*preallocation':   'PreallocMode' } }
 
-- 
2.39.1




[v3 08/10] crypto: Introduce 'detached-header' field in QCryptoBlockInfoLUKS

2023-12-24 Thread Hyman Huang
When querying the LUKS disk with the qemu-img tool or other APIs,
add information about whether the LUKS header is detached.

Additionally, update the test case with the appropriate
modification.

Signed-off-by: Hyman Huang 
---
 crypto/block-luks.c| 2 ++
 qapi/crypto.json   | 3 +++
 tests/qemu-iotests/210.out | 4 
 3 files changed, 9 insertions(+)

diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index 474c7aee2e..c5e53b4ee4 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -1266,6 +1266,7 @@ qcrypto_block_luks_open(QCryptoBlock *block,
 block->sector_size = QCRYPTO_BLOCK_LUKS_SECTOR_SIZE;
 block->payload_offset =
 qcrypto_block_luks_payload_offset(luks->header.payload_offset_sector);
+block->detached_header = (block->payload_offset == 0) ? true : false;
 
 return 0;
 
@@ -1892,6 +1893,7 @@ static int qcrypto_block_luks_get_info(QCryptoBlock 
*block,
 info->u.luks.master_key_iters = luks->header.master_key_iterations;
 info->u.luks.uuid = g_strndup((const char *)luks->header.uuid,
   sizeof(luks->header.uuid));
+info->u.luks.detached_header = block->detached_header;
 
 for (i = 0; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS; i++) {
 slot = g_new0(QCryptoBlockInfoLUKSSlot, 1);
diff --git a/qapi/crypto.json b/qapi/crypto.json
index 8e81aa8454..336c880b5d 100644
--- a/qapi/crypto.json
+++ b/qapi/crypto.json
@@ -317,6 +317,8 @@
 #
 # @hash-alg: the master key hash algorithm
 #
+# @detached-header: whether the LUKS header is detached (Since 9.0)
+#
 # @payload-offset: offset to the payload data in bytes
 #
 # @master-key-iters: number of PBKDF2 iterations for key material
@@ -333,6 +335,7 @@
'ivgen-alg': 'QCryptoIVGenAlgorithm',
'*ivgen-hash-alg': 'QCryptoHashAlgorithm',
'hash-alg': 'QCryptoHashAlgorithm',
+   'detached-header': 'bool',
'payload-offset': 'int',
'master-key-iters': 'int',
'uuid': 'str',
diff --git a/tests/qemu-iotests/210.out b/tests/qemu-iotests/210.out
index 96d9f749dd..94b29b2120 100644
--- a/tests/qemu-iotests/210.out
+++ b/tests/qemu-iotests/210.out
@@ -18,6 +18,7 @@ virtual size: 128 MiB (134217728 bytes)
 encrypted: yes
 Format specific information:
 ivgen alg: plain64
+detached header: false
 hash alg: sha256
 cipher alg: aes-256
 uuid: ----
@@ -70,6 +71,7 @@ virtual size: 64 MiB (67108864 bytes)
 encrypted: yes
 Format specific information:
 ivgen alg: plain64
+detached header: false
 hash alg: sha1
 cipher alg: aes-128
 uuid: ----
@@ -125,6 +127,7 @@ virtual size: 0 B (0 bytes)
 encrypted: yes
 Format specific information:
 ivgen alg: plain64
+detached header: false
 hash alg: sha256
 cipher alg: aes-256
 uuid: ----
@@ -195,6 +198,7 @@ virtual size: 0 B (0 bytes)
 encrypted: yes
 Format specific information:
 ivgen alg: plain64
+detached header: false
 hash alg: sha256
 cipher alg: aes-256
 uuid: ----
-- 
2.39.1




[v3 02/10] crypto: Support generic LUKS encryption

2023-12-24 Thread Hyman Huang
By enhancing the LUKS driver, it is possible to enable
the detachable LUKS header and, as a result, achieve
general encryption for any disk format that QEMU has
supported.

Take the qcow2 as an example, the usage of the generic
LUKS encryption as follows:

1. add a protocol blockdev node of data disk
$ virsh qemu-monitor-command vm '{"execute":"blockdev-add",
> "arguments":{"node-name":"libvirt-1-storage", "driver":"file",
> "filename":"/path/to/test_disk.qcow2"}}'

2. add a protocol blockdev node of LUKS header as above.
$ virsh qemu-monitor-command vm '{"execute":"blockdev-add",
> "arguments":{"node-name":"libvirt-2-storage", "driver":"file",
> "filename": "/path/to/cipher.gluks" }}'

3. add the secret for decrypting the cipher stored in LUKS
   header above
$ virsh qemu-monitor-command vm '{"execute":"object-add",
> "arguments":{"qom-type":"secret", "id":
> "libvirt-2-storage-secret0", "data":"abc123"}}'

4. add the qcow2-drived blockdev format node
$ virsh qemu-monitor-command vm '{"execute":"blockdev-add",
> "arguments":{"node-name":"libvirt-1-format", "driver":"qcow2",
> "file":"libvirt-1-storage"}}'

5. add the luks-drived blockdev to link the qcow2 disk with
   LUKS header by specifying the field "header"
$ virsh qemu-monitor-command vm '{"execute":"blockdev-add",
> "arguments":{"node-name":"libvirt-2-format", "driver":"luks",
> "file":"libvirt-1-format", "header":"libvirt-2-storage",
> "key-secret":"libvirt-2-format-secret0"}}'

6. add the virtio-blk device finally
$ virsh qemu-monitor-command vm '{"execute":"device_add",
> "arguments": {"num-queues":"1", "driver":"virtio-blk-pci",
> "drive": "libvirt-2-format", "id":"virtio-disk2"}}'

The generic LUKS encryption method of starting a virtual
machine (VM) is somewhat similar to hot-plug in that both
maintaining the same json command while the starting VM
changes the "blockdev-add/device_add" parameters to
"blockdev/device".

Signed-off-by: Hyman Huang 
Message-Id: 
<910801f303da1601051479d3b7e5c2c6b4e01eb7.1701879996.git.yong.hu...@smartx.com>
---
 block/crypto.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/block/crypto.c b/block/crypto.c
index f82b13d32b..6063879bac 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -64,12 +64,14 @@ static int block_crypto_read_func(QCryptoBlock *block,
   Error **errp)
 {
 BlockDriverState *bs = opaque;
+BlockCrypto *crypto = bs->opaque;
 ssize_t ret;
 
 GLOBAL_STATE_CODE();
 GRAPH_RDLOCK_GUARD_MAINLOOP();
 
-ret = bdrv_pread(bs->file, offset, buflen, buf, 0);
+ret = bdrv_pread(crypto->header ? crypto->header : bs->file,
+ offset, buflen, buf, 0);
 if (ret < 0) {
 error_setg_errno(errp, -ret, "Could not read encryption header");
 return ret;
@@ -269,6 +271,7 @@ static int block_crypto_open_generic(QCryptoBlockFormat 
format,
 QCryptoBlockOpenOptions *open_opts = NULL;
 unsigned int cflags = 0;
 QDict *cryptoopts = NULL;
+const char *hdr_bdref = qdict_get_try_str(options, "header");
 
 GLOBAL_STATE_CODE();
 
@@ -277,6 +280,15 @@ static int block_crypto_open_generic(QCryptoBlockFormat 
format,
 return ret;
 }
 
+if (hdr_bdref) {
+crypto->header = bdrv_open_child(NULL, options, "header", bs,
+ _of_bds, BDRV_CHILD_METADATA,
+ false, errp);
+if (!crypto->header) {
+return -EINVAL;
+}
+}
+
 GRAPH_RDLOCK_GUARD_MAINLOOP();
 
 bs->supported_write_flags = BDRV_REQ_FUA &
-- 
2.39.1




[v3 05/10] crypto: Mark the payload_offset_sector invalid for detached LUKS header

2023-12-24 Thread Hyman Huang
Set the payload_offset_sector to a value that is nearly never reached
in order to mark it as invalid and indicate that 0 should be the offset
of the read/write operation on the 'file' protocol blockdev node.

Signed-off-by: Hyman Huang 
---
 crypto/block-luks.c | 41 +++--
 1 file changed, 31 insertions(+), 10 deletions(-)

diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index fb01ec38bb..48443ffcae 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -34,6 +34,8 @@
 
 #include "qemu/bitmap.h"
 
+#define INVALID_SECTOR_OFFSET UINT32_MAX
+
 /*
  * Reference for the LUKS format implemented here is
  *
@@ -136,6 +138,13 @@ struct QCryptoBlockLUKS {
 };
 
 
+static inline uint32_t
+qcrypto_block_luks_payload_offset(uint32_t sector)
+{
+return sector == INVALID_SECTOR_OFFSET ? 0 :
+sector * QCRYPTO_BLOCK_LUKS_SECTOR_SIZE;
+}
+
 static int qcrypto_block_luks_cipher_name_lookup(const char *name,
  QCryptoCipherMode mode,
  uint32_t key_bytes,
@@ -1255,8 +1264,8 @@ qcrypto_block_luks_open(QCryptoBlock *block,
 }
 
 block->sector_size = QCRYPTO_BLOCK_LUKS_SECTOR_SIZE;
-block->payload_offset = luks->header.payload_offset_sector *
-block->sector_size;
+block->payload_offset =
+qcrypto_block_luks_payload_offset(luks->header.payload_offset_sector);
 
 return 0;
 
@@ -1529,16 +1538,28 @@ qcrypto_block_luks_create(QCryptoBlock *block,
 slot->stripes = QCRYPTO_BLOCK_LUKS_STRIPES;
 }
 
-/* The total size of the LUKS headers is the partition header + key
- * slot headers, rounded up to the nearest sector, combined with
- * the size of each master key material region, also rounded up
- * to the nearest sector */
-luks->header.payload_offset_sector = header_sectors +
-QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS * split_key_sectors;
+if (block->detached_header) {
+/*
+ * Set the payload_offset_sector to a value that is nearly never
+ * reached in order to mark it as invalid and indicate that 0 should
+ * be the offset of the read/write operation on the 'file' protocol
+ * blockdev node. Here the UINT32_MAX is choosed
+ */
+luks->header.payload_offset_sector = INVALID_SECTOR_OFFSET;
+} else {
+/*
+ * The total size of the LUKS headers is the partition header + key
+ * slot headers, rounded up to the nearest sector, combined with
+ * the size of each master key material region, also rounded up
+ * to the nearest sector
+ */
+luks->header.payload_offset_sector = header_sectors +
+QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS * split_key_sectors;
+}
 
 block->sector_size = QCRYPTO_BLOCK_LUKS_SECTOR_SIZE;
-block->payload_offset = luks->header.payload_offset_sector *
-block->sector_size;
+block->payload_offset =
+qcrypto_block_luks_payload_offset(luks->header.payload_offset_sector);
 
 /* Reserve header space to match payload offset */
 initfunc(block, block->payload_offset, opaque, _err);
-- 
2.39.1




[v3 07/10] block: Support detached LUKS header creation using qemu-img

2023-12-24 Thread Hyman Huang
Add the 'detached-mode' option to specify the creation of
a detached LUKS header. This is how it is used:
$ qemu-img create --object secret,id=sec0,data=abc123 -f luks
> -o cipher-alg=aes-256,cipher-mode=xts -o key-secret=sec0
> -o detached-mode=true header.luks

Signed-off-by: Hyman Huang 
---
 block.c  | 5 -
 block/crypto.c   | 9 -
 block/crypto.h   | 8 
 qapi/crypto.json | 5 -
 4 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/block.c b/block.c
index bfb0861ec6..fa9ce36928 100644
--- a/block.c
+++ b/block.c
@@ -7517,7 +7517,10 @@ void bdrv_img_create(const char *filename, const char 
*fmt,
 goto out;
 }
 
-if (size == -1) {
+/* Parameter 'size' is not needed for detached LUKS header */
+if (size == -1 &&
+!(!strcmp(fmt, "luks") &&
+  qemu_opt_get_bool(opts, "detached-mode", false))) {
 error_setg(errp, "Image creation needs a size parameter");
 goto out;
 }
diff --git a/block/crypto.c b/block/crypto.c
index 76cc8bda49..812c3c28f5 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -229,6 +229,7 @@ static QemuOptsList block_crypto_create_opts_luks = {
 BLOCK_CRYPTO_OPT_DEF_LUKS_IVGEN_HASH_ALG(""),
 BLOCK_CRYPTO_OPT_DEF_LUKS_HASH_ALG(""),
 BLOCK_CRYPTO_OPT_DEF_LUKS_ITER_TIME(""),
+BLOCK_CRYPTO_OPT_DEF_LUKS_DETACHED_MODE(""),
 { /* end of list */ }
 },
 };
@@ -793,6 +794,8 @@ block_crypto_co_create_opts_luks(BlockDriver *drv, const 
char *filename,
 PreallocMode prealloc;
 char *buf = NULL;
 int64_t size;
+bool detached_mode =
+qemu_opt_get_bool(opts, "detached-mode", false);
 int ret;
 Error *local_err = NULL;
 
@@ -832,8 +835,12 @@ block_crypto_co_create_opts_luks(BlockDriver *drv, const 
char *filename,
 goto fail;
 }
 
+   /* The detached_header default to true if detached-mode is specified */
+create_opts->u.luks.detached_header = detached_mode ? true : false;
+
 /* Create format layer */
-ret = block_crypto_co_create_generic(bs, size, create_opts, prealloc, 
errp);
+ret = block_crypto_co_create_generic(bs, detached_mode ? 0 : size,
+ create_opts, prealloc, errp);
 if (ret < 0) {
 goto fail;
 }
diff --git a/block/crypto.h b/block/crypto.h
index 72e792c9af..bceefd45bd 100644
--- a/block/crypto.h
+++ b/block/crypto.h
@@ -41,6 +41,7 @@
 #define BLOCK_CRYPTO_OPT_LUKS_IVGEN_HASH_ALG "ivgen-hash-alg"
 #define BLOCK_CRYPTO_OPT_LUKS_HASH_ALG "hash-alg"
 #define BLOCK_CRYPTO_OPT_LUKS_ITER_TIME "iter-time"
+#define BLOCK_CRYPTO_OPT_LUKS_DETACHED_MODE "detached-mode"
 #define BLOCK_CRYPTO_OPT_LUKS_KEYSLOT "keyslot"
 #define BLOCK_CRYPTO_OPT_LUKS_STATE "state"
 #define BLOCK_CRYPTO_OPT_LUKS_OLD_SECRET "old-secret"
@@ -100,6 +101,13 @@
 .help = "Select new state of affected keyslots (active/inactive)",\
 }
 
+#define BLOCK_CRYPTO_OPT_DEF_LUKS_DETACHED_MODE(prefix) \
+{ \
+.name = prefix BLOCK_CRYPTO_OPT_LUKS_DETACHED_MODE, \
+.type = QEMU_OPT_BOOL,\
+.help = "Create a detached LUKS header",  \
+}
+
 #define BLOCK_CRYPTO_OPT_DEF_LUKS_KEYSLOT(prefix)  \
 {  \
 .name = prefix BLOCK_CRYPTO_OPT_LUKS_KEYSLOT,  \
diff --git a/qapi/crypto.json b/qapi/crypto.json
index 6b4e86cb81..8e81aa8454 100644
--- a/qapi/crypto.json
+++ b/qapi/crypto.json
@@ -226,6 +226,8 @@
 # @iter-time: number of milliseconds to spend in PBKDF passphrase
 # processing.  Currently defaults to 2000. (since 2.8)
 #
+# @detached-mode: create a detached LUKS header. (since 9.0)
+#
 # Since: 2.6
 ##
 { 'struct': 'QCryptoBlockCreateOptionsLUKS',
@@ -235,7 +237,8 @@
 '*ivgen-alg': 'QCryptoIVGenAlgorithm',
 '*ivgen-hash-alg': 'QCryptoHashAlgorithm',
 '*hash-alg': 'QCryptoHashAlgorithm',
-'*iter-time': 'int'}}
+'*iter-time': 'int',
+'*detached-mode': 'bool'}}
 
 ##
 # @QCryptoBlockOpenOptions:
-- 
2.39.1




[v3 10/10] MAINTAINERS: Add section "Detached LUKS header"

2023-12-24 Thread Hyman Huang
I've built interests in block cryptography and also
have been working on projects related to this
subsystem.

Add a section to the MAINTAINERS file for detached
LUKS header, it only has a test case in it currently.

Signed-off-by: Hyman Huang 
---
 MAINTAINERS | 5 +
 1 file changed, 5 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 395f26ba86..f0f7b889a3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3391,6 +3391,11 @@ F: migration/dirtyrate.c
 F: migration/dirtyrate.h
 F: include/sysemu/dirtyrate.h
 
+Detached LUKS header
+M: Hyman Huang 
+S: Maintained
+F: tests/qemu-iotests/tests/luks-detached-header
+
 D-Bus
 M: Marc-André Lureau 
 S: Maintained
-- 
2.39.1




[v3 00/10] Support generic Luks encryption

2023-12-24 Thread Hyman Huang
v3:
- Rebase on master
- Add a test case for detached LUKS header
- Adjust the design to honour preallocation of the payload device
- Adjust the design to honour the payload offset from the header,
  even when detached
- Support detached LUKS header creation using qemu-img
- Support detached LUKS header querying
- Do some code clean

Hyman Huang (10):
  crypto: Introduce option and structure for detached LUKS header
  crypto: Support generic LUKS encryption
  qapi: Make parameter 'file' optional for BlockdevCreateOptionsLUKS
  crypto: Introduce creation option and structure for detached LUKS
header
  crypto: Mark the payload_offset_sector invalid for detached LUKS
header
  block: Support detached LUKS header creation using blockdev-create
  block: Support detached LUKS header creation using qemu-img
  crypto: Introduce 'detached-header' field in QCryptoBlockInfoLUKS
  tests: Add detached LUKS header case
  MAINTAINERS: Add section "Detached LUKS header"

 MAINTAINERS   |   5 +
 block.c   |   5 +-
 block/crypto.c| 146 ++--
 block/crypto.h|   8 +
 crypto/block-luks.c   |  49 +++-
 crypto/block.c|   1 +
 crypto/blockpriv.h|   3 +
 qapi/block-core.json  |  14 +-
 qapi/crypto.json  |  13 +-
 tests/qemu-iotests/210.out|   4 +
 tests/qemu-iotests/tests/luks-detached-header | 214 ++
 .../tests/luks-detached-header.out|   5 +
 12 files changed, 436 insertions(+), 31 deletions(-)
 create mode 100755 tests/qemu-iotests/tests/luks-detached-header
 create mode 100644 tests/qemu-iotests/tests/luks-detached-header.out

-- 
2.39.1




[v3 01/10] crypto: Introduce option and structure for detached LUKS header

2023-12-24 Thread Hyman Huang
Add the "header" option for the LUKS format. This field would be
used to identify the blockdev's position where a detachable LUKS
header is stored.

In addition, introduce header field in struct BlockCrypto

Signed-off-by: Hyman Huang 
Reviewed-by: Daniel P. Berrangé 
Message-Id: 
<5b99f60c7317092a563d7ca3fb4b414197015eb2.1701879996.git.yong.hu...@smartx.com>
---
 block/crypto.c   | 1 +
 qapi/block-core.json | 6 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/block/crypto.c b/block/crypto.c
index 921933a5e5..f82b13d32b 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -39,6 +39,7 @@ typedef struct BlockCrypto BlockCrypto;
 struct BlockCrypto {
 QCryptoBlock *block;
 bool updating_keys;
+BdrvChild *header;  /* Reference to the detached LUKS header */
 };
 
 
diff --git a/qapi/block-core.json b/qapi/block-core.json
index ca390c5700..10be08d08f 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3352,11 +3352,15 @@
 # decryption key (since 2.6). Mandatory except when doing a
 # metadata-only probe of the image.
 #
+# @header: optional reference to the location of a blockdev
+# storing a detached LUKS header. (since 9.0)
+#
 # Since: 2.9
 ##
 { 'struct': 'BlockdevOptionsLUKS',
   'base': 'BlockdevOptionsGenericFormat',
-  'data': { '*key-secret': 'str' } }
+  'data': { '*key-secret': 'str',
+'*header': 'BlockdevRef'} }
 
 ##
 # @BlockdevOptionsGenericCOWFormat:
-- 
2.39.1




Re: [PATCH v5 1/2] qom: new object to associate device to numa node

2023-12-24 Thread Ankit Agrawal
Thanks Markus for the review.

>> Introduce a new acpi-generic-initiator object to allow host admin provide the
>> device and the corresponding NUMA nodes. Qemu maintain this association and
>> use this object to build the requisite GI Affinity Structure.
>
> Pardon my ignorance...  What makes this object an "initiator", and why
> is it "generic"?

In ACPI 6.3 
(https://uefi.org/sites/default/files/resources/ACPI_6_3_final_Jan30.pdf),
a new table in System Resource Affinity Table called Generic initiator Affinity 
table
was introduced to describe devices such as heterogeneous processors and
accelerators, GPUs, and I/O devices with  integrated compute or DMA engines
(termed as Generic Initiator) that are present on the system. It is used to 
associate
a proximity domain with those devices.

You may refer 5.2.16.6 in the aforementioned link. This patch implements that
structure (Table 5-78) for Qemu ACPI.

>> An admin can provide the range of nodes through a uint16 array host-nodes
>> and link it to a device by providing its id. Currently, only PCI device is
>> supported. The following sample creates 8 nodes and link them to the PCI
>> device dev0:
>>
>> -numa node,nodeid=2 \
>> -numa node,nodeid=3 \
>> -numa node,nodeid=4 \
>> -numa node,nodeid=5 \
>> -numa node,nodeid=6 \
>> -numa node,nodeid=7 \
>> -numa node,nodeid=8 \
>> -numa node,nodeid=9 \
>> -device 
>> vfio-pci-nohotplug,host=0009:01:00.0,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \
>> -object acpi-generic-initiator,id=gi0,pci-dev=dev0,host-nodes=2-9 \
>
> Does this link *all* NUMA nodes to dev0?
>
> Would an example involving two devices be more instructive?

Sure, updated in v6.

>> diff --git a/qapi/qom.json b/qapi/qom.json
>> index c53ef978ff..efcc4b8dfd 100644
>> --- a/qapi/qom.json
>> +++ b/qapi/qom.json
>> @@ -794,6 +794,21 @@
>>  { 'struct': 'VfioUserServerProperties',
>>    'data': { 'socket': 'SocketAddress', 'device': 'str' } }
>>
>> +##
>> +# @AcpiGenericInitiatorProperties:
>> +#
>> +# Properties for acpi-generic-initiator objects.
>> +#
>> +# @pci-dev: PCI device ID to be associated with the node
>> +#
>> +# @host-nodes: numa node list
>
> This feels a bit terse.  The commit message makes me guess this
> specifies the NUMA nodes to be linked to @pci-dev.  Correct?

Right, it could be made cleared. Done in v6.


[PATCH v6 2/2] hw/acpi: Implement the SRAT GI affinity structure

2023-12-24 Thread ankita
From: Ankit Agrawal 

ACPI spec provides a scheme to associate "Generic Initiators" [1]
(e.g. heterogeneous processors and accelerators, GPUs, and I/O devices with
integrated compute or DMA engines GPUs) with Proximity Domains. This is
achieved using Generic Initiator Affinity Structure in SRAT. During bootup,
Linux kernel parse the ACPI SRAT to determine the PXM ids and create a NUMA
node for each unique PXM ID encountered. Qemu currently do not implement
these structures while building SRAT.

Add GI structures while building VM ACPI SRAT. The association between
devices and nodes are stored using acpi-generic-initiator object. Lookup
presence of all such objects and use them to build these structures.

The structure needs a PCI device handle [2] that consists of the device BDF.
The vfio-pci device corresponding to the acpi-generic-initiator object is
located to determine the BDF.

[1] ACPI Spec 6.3, Section 5.2.16.6
[2] ACPI Spec 6.3, Table 5.80

Signed-off-by: Ankit Agrawal 
---
 hw/acpi/acpi-generic-initiator.c | 99 
 hw/arm/virt-acpi-build.c |  3 +
 include/hw/acpi/acpi-generic-initiator.h | 26 +++
 3 files changed, 128 insertions(+)

diff --git a/hw/acpi/acpi-generic-initiator.c b/hw/acpi/acpi-generic-initiator.c
index e05e28e962..fa5235f2bb 100644
--- a/hw/acpi/acpi-generic-initiator.c
+++ b/hw/acpi/acpi-generic-initiator.c
@@ -68,3 +68,102 @@ static void acpi_generic_initiator_class_init(ObjectClass 
*oc, void *data)
 object_class_property_add(oc, "host-nodes", "int", NULL,
 acpi_generic_initiator_set_host_nodes, NULL, NULL);
 }
+
+static int acpi_generic_initiator_list(Object *obj, void *opaque)
+{
+GSList **list = opaque;
+
+if (object_dynamic_cast(obj, TYPE_ACPI_GENERIC_INITIATOR)) {
+*list = g_slist_append(*list, ACPI_GENERIC_INITIATOR(obj));
+}
+
+object_child_foreach(obj, acpi_generic_initiator_list, opaque);
+return 0;
+}
+
+/*
+ * Identify Generic Initiator objects and link them into the list which is
+ * returned to the caller.
+ *
+ * Note: it is the caller's responsibility to free the list to avoid
+ * memory leak.
+ */
+static GSList *acpi_generic_initiator_get_list(void)
+{
+GSList *list = NULL;
+
+object_child_foreach(object_get_root(),
+ acpi_generic_initiator_list, );
+return list;
+}
+
+/*
+ * ACPI 6.3:
+ * Table 5-78 Generic Initiator Affinity Structure
+ */
+static void
+build_srat_generic_pci_initiator_affinity(GArray *table_data, int node,
+  PCIDeviceHandle *handle)
+{
+uint8_t index;
+
+build_append_int_noprefix(table_data, 5, 1);  /* Type */
+build_append_int_noprefix(table_data, 32, 1); /* Length */
+build_append_int_noprefix(table_data, 0, 1);  /* Reserved */
+build_append_int_noprefix(table_data, 1, 1);  /* Device Handle Type: PCI */
+build_append_int_noprefix(table_data, node, 4);  /* Proximity Domain */
+
+/* Device Handle - PCI */
+build_append_int_noprefix(table_data, handle->segment, 2);
+build_append_int_noprefix(table_data, handle->bdf, 2);
+for (index = 0; index < 12; index++) {
+build_append_int_noprefix(table_data, 0, 1);
+}
+
+build_append_int_noprefix(table_data, GEN_AFFINITY_ENABLED, 4); /* Flags */
+build_append_int_noprefix(table_data, 0, 4); /* Reserved */
+}
+
+void build_srat_generic_pci_initiator(GArray *table_data)
+{
+GSList *gi_list, *list = acpi_generic_initiator_get_list();
+AcpiGenericInitiator *gi;
+
+for (gi_list = list; gi_list; gi_list = gi_list->next) {
+Object *o;
+uint16_t node;
+PCIDevice *pci_dev;
+bool node_specified = false;
+
+gi = gi_list->data;
+
+o = object_resolve_path_type(gi->pci_dev, TYPE_PCI_DEVICE, NULL);
+if (!o) {
+error_printf("Specified device must be a PCI device.\n");
+exit(1);
+}
+pci_dev = PCI_DEVICE(o);
+
+for (node = 0; (node = find_next_bit(gi->host_nodes,
+ MAX_NODES, node)) != MAX_NODES; node++)
+{
+PCIDeviceHandle dev_handle;
+dev_handle.segment = 0;
+dev_handle.bdf = PCI_BUILD_BDF(pci_bus_num(pci_get_bus(pci_dev)),
+   pci_dev->devfn);
+build_srat_generic_pci_initiator_affinity(table_data,
+  node, _handle);
+node_specified = true;
+}
+
+if (!node_specified) {
+error_report("Generic Initiator device 0:%x:%x.%x has no 
associated"
+ " NUMA node.", pci_bus_num(pci_get_bus(pci_dev)),
+ PCI_SLOT(pci_dev->devfn), PCI_FUNC(pci_dev->devfn));
+error_printf("Specify NUMA node with -host-nodes option.\n");
+exit(1);
+}
+}
+
+g_slist_free(list);
+}
diff --git 

[PATCH v6 0/2] acpi: report numa nodes for device memory using GI

2023-12-24 Thread ankita
From: Ankit Agrawal 

There are upcoming devices which allow CPU to cache coherently access
their memory. It is sensible to expose such memory as NUMA nodes separate
from the sysmem node to the OS. The ACPI spec provides a scheme in SRAT
called Generic Initiator Affinity Structure [1] to allow an association
between a Proximity Domain (PXM) and a Generic Initiator (GI) (e.g.
heterogeneous processors and accelerators, GPUs, and I/O devices with
integrated compute or DMA engines).

While a single node per device may cover several use cases, it is however
insufficient for a full utilization of the NVIDIA GPUs MIG
(Mult-Instance GPUs) [2] feature. The feature allows partitioning of the
GPU device resources (including device memory) into several (upto 8)
isolated instances. Each of the partitioned memory requires a dedicated NUMA
node to operate. The partitions are not fixed and they can be created/deleted
at runtime.

Linux OS does not provide a means to dynamically create/destroy NUMA nodes
and such feature implementation is expected to be non-trivial. The nodes
that OS discovers at the boot time while parsing SRAT remains fixed. So we
utilize the GI Affinity structures that allows association between nodes
and devices. Multiple GI structures per device/BDF is possible, allowing
creation of multiple nodes in the VM by exposing unique PXM in each of these
structures.

Implement the mechanism to build the GI affinity structures as Qemu currently
does not. Introduce a new acpi-generic-initiator object that allows an
association of a set of nodes with a device. During SRAT creation, all such
objected are identified and used to add the GI Affinity Structures. Currently,
only PCI device is supported. On a multi device system, each device supporting
the features needs a unique acpi-generic-initiator object with its own set of
NUMA nodes associated to it.

The admin will create a range of 8 nodes and associate that with the device
using the acpi-generic-initiator object. While a configuration of less than
8 nodes per device is allowed, such configuration will prevent utilization of
the feature to the fullest. This setting is applicable to all the Grace+Hopper
systems. The following is an example of the Qemu command line arguments to
create 8 nodes and link them to the device 'dev0':

-numa node,nodeid=2 -numa node,nodeid=3 -numa node,nodeid=4 \
-numa node,nodeid=5 -numa node,nodeid=6 -numa node,nodeid=7 \
-numa node,nodeid=8 -numa node,nodeid=9 \
-device 
vfio-pci-nohotplug,host=0009:01:00.0,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \
-object acpi-generic-initiator,id=gi0,pci-dev=dev0,host-nodes=2-9 \

The performance benefits can be realized by providing the NUMA node distances
appropriately (through libvirt tags or Qemu params). The admin can get the
distance among nodes in hardware using `numactl -H`.

This series goes along with the vfio-pci variant driver [3] under review.

Applied over v8.2.0-rc4.

[1] ACPI Spec 6.3, Section 5.2.16.6
[2] https://www.nvidia.com/en-in/technologies/multi-instance-gpu
[3] https://lore.kernel.org/all/20231212184613.3237-1-ank...@nvidia.com/

Link for v5:
https://lore.kernel.org/all/20231203060245.31593-1-ank...@nvidia.com/

v5 -> v6
- Updated commit message for the [1/2] and the cover letter.
- Updated the acpi-generic-initiator object comment description for
  clarity on the input host-nodes.
- Rebased to v8.2.0-rc4.

v4 -> v5
- Removed acpi-dev option until full support.
- The numa nodes are saved as bitmap instead of uint16List.
- Replaced asserts to exit calls.
- Addressed other miscellaneous comments.

v3 -> v4
- changed the ':' delimited way to a uint16 array to communicate the
nodes associated with the device.
- added asserts to handle invalid inputs.
- addressed other miscellaneous v3 comments.

v2 -> v3
- changed param to accept a ':' delimited list of numa nodes, instead
of a range.
- Removed nvidia-acpi-generic-initiator object.
- Addressed miscellaneous comments in v2.

v1 -> v2
- Removed dependency on sysfs to communicate the feature with variant module.
- Use GI Affinity SRAT structure instead of Memory Affinity.
- No DSDT entries needed to communicate the PXM for the device. SRAT GI
structure is used instead.
- New objects introduced to establish link between device and nodes.

Ankit Agrawal (2):
  qom: new object to associate device to numa node
  hw/acpi: Implement the SRAT GI affinity structure

 hw/acpi/acpi-generic-initiator.c | 169 +++
 hw/acpi/meson.build  |   1 +
 hw/arm/virt-acpi-build.c |   3 +
 include/hw/acpi/acpi-generic-initiator.h |  53 +++
 qapi/qom.json|  17 +++
 5 files changed, 243 insertions(+)
 create mode 100644 hw/acpi/acpi-generic-initiator.c
 create mode 100644 include/hw/acpi/acpi-generic-initiator.h

-- 
2.34.1




[PATCH v6 1/2] qom: new object to associate device to numa node

2023-12-24 Thread ankita
From: Ankit Agrawal 

NVIDIA GPU's support MIG (Mult-Instance GPUs) feature [1], which allows
partitioning of the GPU device resources (including device memory) into
several (upto 8) isolated instances. Each of the partitioned memory needs
a dedicated NUMA node to operate. The partitions are not fixed and they
can be created/deleted at runtime.

Unfortunately Linux OS does not provide a means to dynamically create/destroy
NUMA nodes and such feature implementation is not expected to be trivial. The
nodes that OS discovers at the boot time while parsing SRAT remains fixed. So
we utilize the Generic Initiator Affinity structures that allows association
between nodes and devices. Multiple GI structures per BDF is possible,
allowing creation of multiple nodes by exposing unique PXM in each of these
structures.

Introduce a new acpi-generic-initiator object to allow host admin provide the
device and the corresponding NUMA nodes. Qemu maintain this association and
use this object to build the requisite GI Affinity Structure. On a multi
device system, each device supporting the features needs a unique
acpi-generic-initiator object with its own set of NUMA nodes associated to it.

An admin can provide the range of nodes through a uint16 array host-nodes
and link it to a device by providing its id. Currently, only PCI device is
supported. The following sample creates 8 nodes per PCI device for a VM
with 2 PCI devices and link them to the respecitve PCI device using
acpi-generic-initiator objects:

-numa node,nodeid=2 -numa node,nodeid=3 -numa node,nodeid=4 \
-numa node,nodeid=5 -numa node,nodeid=6 -numa node,nodeid=7 \
-numa node,nodeid=8 -numa node,nodeid=9 \
-device 
vfio-pci-nohotplug,host=0009:01:00.0,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \
-object acpi-generic-initiator,id=gi0,pci-dev=dev0,host-nodes=2-9 \

-numa node,nodeid=10 -numa node,nodeid=11 -numa node,nodeid=12 \
-numa node,nodeid=13 -numa node,nodeid=14 -numa node,nodeid=15 \
-numa node,nodeid=16 -numa node,nodeid=17 \
-device 
vfio-pci-nohotplug,host=0009:01:01.0,bus=pcie.0,addr=05.0,rombar=0,id=dev1 \
-object acpi-generic-initiator,id=gi1,pci-dev=dev1,host-nodes=10-17 \

[1] https://www.nvidia.com/en-in/technologies/multi-instance-gpu

Signed-off-by: Ankit Agrawal 
---
 hw/acpi/acpi-generic-initiator.c | 70 
 hw/acpi/meson.build  |  1 +
 include/hw/acpi/acpi-generic-initiator.h | 27 +
 qapi/qom.json| 17 ++
 4 files changed, 115 insertions(+)
 create mode 100644 hw/acpi/acpi-generic-initiator.c
 create mode 100644 include/hw/acpi/acpi-generic-initiator.h

diff --git a/hw/acpi/acpi-generic-initiator.c b/hw/acpi/acpi-generic-initiator.c
new file mode 100644
index 00..e05e28e962
--- /dev/null
+++ b/hw/acpi/acpi-generic-initiator.c
@@ -0,0 +1,70 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved
+ */
+
+#include "qemu/osdep.h"
+#include "hw/acpi/acpi-generic-initiator.h"
+#include "hw/pci/pci_device.h"
+#include "qapi/error.h"
+#include "qapi/qapi-builtin-visit.h"
+#include "qapi/visitor.h"
+#include "qemu/error-report.h"
+
+OBJECT_DEFINE_TYPE_WITH_INTERFACES(AcpiGenericInitiator, 
acpi_generic_initiator,
+   ACPI_GENERIC_INITIATOR, OBJECT,
+   { TYPE_USER_CREATABLE },
+   { NULL })
+
+OBJECT_DECLARE_SIMPLE_TYPE(AcpiGenericInitiator, ACPI_GENERIC_INITIATOR)
+
+static void acpi_generic_initiator_init(Object *obj)
+{
+AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj);
+bitmap_zero(gi->host_nodes, MAX_NODES);
+gi->pci_dev = NULL;
+}
+
+static void acpi_generic_initiator_finalize(Object *obj)
+{
+AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj);
+
+g_free(gi->pci_dev);
+}
+
+static void acpi_generic_initiator_set_pci_device(Object *obj, const char *val,
+  Error **errp)
+{
+AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj);
+
+gi->pci_dev = g_strdup(val);
+}
+
+static void
+acpi_generic_initiator_set_host_nodes(Object *obj, Visitor *v, const char 
*name,
+  void *opaque, Error **errp)
+{
+AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj);
+uint16List *l = NULL, *host_nodes = NULL;
+
+visit_type_uint16List(v, name, _nodes, errp);
+
+for (l = host_nodes; l; l = l->next) {
+if (l->value >= MAX_NODES) {
+error_setg(errp, "Invalid host-nodes value: %d", l->value);
+break;
+} else {
+bitmap_set(gi->host_nodes, l->value, 1);
+}
+}
+
+qapi_free_uint16List(host_nodes);
+}
+
+static void acpi_generic_initiator_class_init(ObjectClass *oc, void *data)
+{
+object_class_property_add_str(oc, "pci-dev", NULL,
+acpi_generic_initiator_set_pci_device);
+object_class_property_add(oc, "host-nodes", "int", NULL,
+

Re: [PATCH for 9.0 00/12] Map memory at destination .load_setup in vDPA-net migration

2023-12-24 Thread Lei Yang
QE tested this series with regression tests, there are no new regression issues.

Tested-by: Lei Yang 



On Sat, Dec 16, 2023 at 1:28 AM Eugenio Pérez  wrote:
>
> Current memory operations like pinning may take a lot of time at the
> destination.  Currently they are done after the source of the migration is
> stopped, and before the workload is resumed at the destination.  This is a
> period where neigher traffic can flow, nor the VM workload can continue
> (downtime).
>
> We can do better as we know the memory layout of the guest RAM at the
> destination from the moment the migration starts.  Moving that operation 
> allows
> QEMU to communicate the kernel the maps while the workload is still running in
> the source, so Linux can start mapping them.
>
> Also, the destination of the guest memory may finish before the destination
> QEMU maps all the memory.  In this case, the rest of the memory will be mapped
> at the same time as before applying this series, when the device is starting.
> So we're only improving with this series.
>
> If the destination has the switchover_ack capability enabled, the destination
> hold the migration until all the memory is mapped.
>
> This needs to be applied on top of [1]. That series performs some code
> reorganization that allows to map the guest memory without knowing the queue
> layout the guest configure on the device.
>
> This series reduced the downtime in the stop-and-copy phase of the live
> migration from 20s~30s to 5s, with a 128G mem guest and two mlx5_vdpa devices,
> per [2].
>
> Future directions on top of this series may include:
> * Iterative migration of virtio-net devices, as it may reduce downtime per 
> [3].
>   vhost-vdpa net can apply the configuration through CVQ in the destination
>   while the source is still migrating.
> * Move more things ahead of migration time, like DRIVER_OK.
> * Check that the devices of the destination are valid, and cancel the 
> migration
>   in case it is not.
>
> v1 from RFC v2:
> * Hold on migration if memory has not been mapped in full with switchover_ack.
> * Revert map if the device is not started.
>
> RFC v2:
> * Delegate map to another thread so it does no block QMP.
> * Fix not allocating iova_tree if x-svq=on at the destination.
> * Rebased on latest master.
> * More cleanups of current code, that might be split from this series too.
>
> [1] https://lists.nongnu.org/archive/html/qemu-devel/2023-12/msg01986.html
> [2] https://lists.nongnu.org/archive/html/qemu-devel/2023-12/msg00909.html
> [3] 
> https://lore.kernel.org/qemu-devel/6c8ebb97-d546-3f1c-4cdd-54e23a566...@nvidia.com/T/
>
> Eugenio Pérez (12):
>   vdpa: do not set virtio status bits if unneeded
>   vdpa: make batch_begin_once early return
>   vdpa: merge _begin_batch into _batch_begin_once
>   vdpa: extract out _dma_end_batch from _listener_commit
>   vdpa: factor out stop path of vhost_vdpa_dev_start
>   vdpa: check for iova tree initialized at net_client_start
>   vdpa: set backend capabilities at vhost_vdpa_init
>   vdpa: add vhost_vdpa_load_setup
>   vdpa: approve switchover after memory map in the migration destination
>   vdpa: add vhost_vdpa_net_load_setup NetClient callback
>   vdpa: add vhost_vdpa_net_switchover_ack_needed
>   virtio_net: register incremental migration handlers
>
>  include/hw/virtio/vhost-vdpa.h |  32 
>  include/net/net.h  |   8 +
>  hw/net/virtio-net.c|  48 ++
>  hw/virtio/vhost-vdpa.c | 274 +++--
>  net/vhost-vdpa.c   |  43 +-
>  5 files changed, 357 insertions(+), 48 deletions(-)
>
> --
> 2.39.3
>
>




Re: [PATCH v2 02/17] hw/loongarch: Add load initrd

2023-12-24 Thread gaosong

在 2023/12/21 下午3:09, maobibo 写道:



On 2023/12/18 下午5:00, Song Gao wrote:

we load initrd ramdisk after kernel_high address

Signed-off-by: Song Gao 
---
  hw/loongarch/boot.c | 29 -
  1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/hw/loongarch/boot.c b/hw/loongarch/boot.c
index 9f25ea5847..2be6dfb037 100644
--- a/hw/loongarch/boot.c
+++ b/hw/loongarch/boot.c
@@ -21,7 +21,8 @@ static uint64_t cpu_loongarch_virt_to_phys(void 
*opaque, uint64_t addr)

    static int64_t load_kernel_info(struct loongarch_boot_info *info)
  {
-    uint64_t kernel_entry, kernel_low, kernel_high;
+    uint64_t kernel_entry, kernel_low, kernel_high, initrd_size;
+    ram_addr_t initrd_offset;
  ssize_t kernel_size;
    kernel_size = load_elf(info->kernel_filename, NULL,
@@ -36,6 +37,32 @@ static int64_t load_kernel_info(struct 
loongarch_boot_info *info)

   load_elf_strerror(kernel_size));
  exit(1);
  }
+
+    if (info->initrd_filename) {
+    initrd_size = get_image_size(info->initrd_filename);
+    if (initrd_size > 0) {
+    initrd_offset = ROUND_UP(kernel_high, 64 * KiB);

Do you test self-compressed vmlinuz elf load?


The LoongArch kenrel not support build bzimage.
I think that offset of initrd had better be 4 * kernel_size from 
kernel_high, else uncompressed kernel may overwrite INITRD image.

such as:
 initrd_offset = ROUND_UP(kernel_high + 4 * kernel_size, 64 * KiB);


but I think we can do this.

Thanks.
Song Gao

Regards
Bibo Mao

+
+    if (initrd_offset + initrd_size > info->ram_size) {
+    error_report("memory too small for initial ram disk 
'%s'",

+ info->initrd_filename);
+    exit(1);
+    }
+
+    initrd_size = load_image_targphys(info->initrd_filename, 
initrd_offset,
+  info->ram_size - 
initrd_offset);

+    }
+
+    if (initrd_size == (target_ulong)-1) {
+    error_report("could not load initial ram disk '%s'",
+ info->initrd_filename);
+    exit(1);
+    }
+    } else {
+    error_report("Need initrd!");
+    exit(1);
+    }
+
  return kernel_entry;
  }






Re: [PATCH v2 03/17] hw/loongarch: Add init_cmdline

2023-12-24 Thread gaosong

在 2023/12/21 下午3:20, maobibo 写道:



On 2023/12/18 下午5:00, Song Gao wrote:

Add init_cmline and set boot_info->a0, a1

Signed-off-by: Song Gao 
---
  hw/loongarch/boot.c | 21 +
  include/hw/loongarch/virt.h |  2 ++
  target/loongarch/cpu.h  |  2 ++
  3 files changed, 25 insertions(+)

diff --git a/hw/loongarch/boot.c b/hw/loongarch/boot.c
index 2be6dfb037..4bfe24274a 100644
--- a/hw/loongarch/boot.c
+++ b/hw/loongarch/boot.c
@@ -14,6 +14,20 @@
  #include "qemu/error-report.h"
  #include "sysemu/reset.h"
  +static int init_cmdline(struct loongarch_boot_info *info)
+{
+    hwaddr cmdline_addr;
+    cmdline_addr = 0xff0ULL;
+
+    pstrcpy_targphys("cmdline", 0xff0ULL,
+ COMMAND_LINE_SIZE, info->kernel_cmdline);
There are two places using 0xff0ULL here, it had better be defined 
as macro. Also can address for cmdline be before FDT base 
address(0x10) rather than strange value 0xff0 ? -:)



+
+    info->a0 = 1;
+    info->a1 = cmdline_addr;
+
+    return 0;
+}
+
  static uint64_t cpu_loongarch_virt_to_phys(void *opaque, uint64_t 
addr)

  {
  return addr & MAKE_64BIT_MASK(0, TARGET_PHYS_ADDR_SPACE_BITS);
@@ -63,6 +77,8 @@ static int64_t load_kernel_info(struct 
loongarch_boot_info *info)

  exit(1);
  }
  +    init_cmdline(info);
+
  return kernel_entry;
  }
  @@ -73,6 +89,10 @@ static void reset_load_elf(void *opaque)
    cpu_reset(CPU(cpu));
  if (env->load_elf) {
+    if (cpu == LOONGARCH_CPU(first_cpu)) {
+    env->gpr[4] = env->boot_info->a0;
+    env->gpr[5] = env->boot_info->a1;
+    }
  cpu_set_pc(CPU(cpu), env->elf_address);
  }
  }
@@ -129,6 +149,7 @@ static void 
loongarch_direct_kernel_boot(LoongArchMachineState *lams,

  lacpu = LOONGARCH_CPU(qemu_get_cpu(i));
  lacpu->env.load_elf = true;
  lacpu->env.elf_address = kernel_addr;
+    lacpu->env.boot_info = info;
  }
  }
  diff --git a/include/hw/loongarch/virt.h b/include/hw/loongarch/virt.h
index e4126dd0e7..d21de2cef4 100644
--- a/include/hw/loongarch/virt.h
+++ b/include/hw/loongarch/virt.h
@@ -31,6 +31,8 @@
  #define VIRT_GED_MEM_ADDR   (VIRT_GED_EVT_ADDR + 
ACPI_GED_EVT_SEL_LEN)
  #define VIRT_GED_REG_ADDR   (VIRT_GED_MEM_ADDR + 
MEMORY_HOTPLUG_IO_LEN)

  +#define COMMAND_LINE_SIZE   512
The macro COMMAND_LINE_SIZE is already defined in Linux header file, 
maybe standard header file can be used.


/usr/include/asm-generic/setup.h
#define COMMAND_LINE_SIZE  512


Yes,
#include 

Thanks.
Song Gao


Regards
Bibo Mao

+
  struct LoongArchMachineState {
  /*< private >*/
  MachineState parent_obj;
diff --git a/target/loongarch/cpu.h b/target/loongarch/cpu.h
index 00d1fba597..c7c695138e 100644
--- a/target/loongarch/cpu.h
+++ b/target/loongarch/cpu.h
@@ -362,6 +362,8 @@ typedef struct CPUArchState {
  uint64_t elf_address;
  /* Store ipistate to access from this struct */
  DeviceState *ipistate;
+
+    struct loongarch_boot_info *boot_info;
  #endif
  } CPULoongArchState;






Re: [PATCH v2 01/17] hw/loongarch: Move boot fucntions to boot.c

2023-12-24 Thread gaosong

在 2023/12/21 下午3:04, maobibo 写道:



On 2023/12/18 下午5:00, Song Gao wrote:

Move some boot functions to boot.c and struct
loongarch_boot_info into struct LoongArchMachineState.

Signed-off-by: Song Gao 
---
  hw/loongarch/boot.c | 127 
  hw/loongarch/meson.build    |   1 +
  hw/loongarch/virt.c | 118 ++---
  include/hw/loongarch/boot.h |  21 ++
  include/hw/loongarch/virt.h |   2 +
  5 files changed, 155 insertions(+), 114 deletions(-)
  create mode 100644 hw/loongarch/boot.c
  create mode 100644 include/hw/loongarch/boot.h

diff --git a/hw/loongarch/boot.c b/hw/loongarch/boot.c
new file mode 100644
index 00..9f25ea5847
--- /dev/null
+++ b/hw/loongarch/boot.c
@@ -0,0 +1,127 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * LoongArch boot helper functions.
+ * Yes,  #include <>
+ * Copyright (c) 2023 Loongson Technology Corporation Limited
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "target/loongarch/cpu.h"
+#include "hw/loongarch/virt.h"
+#include "hw/loader.h"
+#include "elf.h"
+#include "qemu/error-report.h"
+#include "sysemu/reset.h"
+
+static uint64_t cpu_loongarch_virt_to_phys(void *opaque, uint64_t addr)
+{
+    return addr & MAKE_64BIT_MASK(0, TARGET_PHYS_ADDR_SPACE_BITS);
+}
+
+static int64_t load_kernel_info(struct loongarch_boot_info *info)
+{
+    uint64_t kernel_entry, kernel_low, kernel_high;
+    ssize_t kernel_size;
+
+    kernel_size = load_elf(info->kernel_filename, NULL,
+   cpu_loongarch_virt_to_phys, NULL,
+   _entry, _low,
+   _high, NULL, 0,
+   EM_LOONGARCH, 1, 0);
+
+    if (kernel_size < 0) {
+    error_report("could not load kernel '%s': %s",
+ info->kernel_filename,
+ load_elf_strerror(kernel_size));
+    exit(1);
+    }
+    return kernel_entry;
+}
+
+static void reset_load_elf(void *opaque)
+{
+    LoongArchCPU *cpu = opaque;
+    CPULoongArchState *env = >env;
+
+    cpu_reset(CPU(cpu));
+    if (env->load_elf) {
+    cpu_set_pc(CPU(cpu), env->elf_address);
+    }
+}
+
+static void fw_cfg_add_kernel_info(struct loongarch_boot_info *info,
+   FWCfgState *fw_cfg)
+{
+    /*
+ * Expose the kernel, the command line, and the initrd in fw_cfg.
+ * We don't process them here at all, it's all left to the
+ * firmware.
+ */
+    load_image_to_fw_cfg(fw_cfg,
+ FW_CFG_KERNEL_SIZE, FW_CFG_KERNEL_DATA,
+ info->kernel_filename,
+ false);
+
+    if (info->initrd_filename) {
+    load_image_to_fw_cfg(fw_cfg,
+ FW_CFG_INITRD_SIZE, FW_CFG_INITRD_DATA,
+ info->initrd_filename, false);
+    }
+
+    if (info->kernel_cmdline) {
+    fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE,
+   strlen(info->kernel_cmdline) + 1);
+    fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA,
+  info->kernel_cmdline);
+    }
+}
+
+static void loongarch_firmware_boot(LoongArchMachineState *lams,
+    struct loongarch_boot_info *info)
+{
+    fw_cfg_add_kernel_info(info, lams->fw_cfg);
+}
+
+static void loongarch_direct_kernel_boot(LoongArchMachineState *lams,
+ struct loongarch_boot_info 
*info)

+{
+    MachineState *machine = MACHINE(lams);
+    int64_t kernel_addr = 0;
+    LoongArchCPU *lacpu;
+    int i;
+
+    if (info->kernel_filename) {
+    kernel_addr = load_kernel_info(info);
+    } else {
+    error_report("Need kernel filename\n");
+    exit(1);
+    }
+
+    for (i = 0; i < machine->smp.cpus; i++) {
+    lacpu = LOONGARCH_CPU(qemu_get_cpu(i));
+    lacpu->env.load_elf = true;
+    lacpu->env.elf_address = kernel_addr;
+    }
+}
+
+void loongarch_load_kernel(MachineState *ms, struct 
loongarch_boot_info *info)

+{
+    LoongArchMachineState *lams = LOONGARCH_MACHINE(ms);
+    int i;
+
+    /* register reset function */
+    for (i = 0; i < ms->smp.cpus; i++) {
+    qemu_register_reset(reset_load_elf, 
LOONGARCH_CPU(qemu_get_cpu(i)));

+    }
+
+    info->kernel_filename = ms->kernel_filename;
+    info->kernel_cmdline = ms->kernel_cmdline;
+    info->initrd_filename = ms->initrd_filename;
+
+    if (lams->bios_loaded) {
+    loongarch_firmware_boot(lams, info);
+    } else {
+    loongarch_direct_kernel_boot(lams, info);
+    }
+}
diff --git a/hw/loongarch/meson.build b/hw/loongarch/meson.build
index c0421502ab..d306d82c2e 100644
--- a/hw/loongarch/meson.build
+++ b/hw/loongarch/meson.build
@@ -1,6 +1,7 @@
  loongarch_ss = ss.source_set()
  loongarch_ss.add(files(
  'fw_cfg.c',
+    'boot.c',
  ))
  loongarch_ss.add(when: 'CONFIG_LOONGARCH_VIRT', if_true: 
[files('virt.c'), fdt])

  loongarch_ss.add(when: 

Re: [PATCH v2 04/17] hw/loongarch: Add slave cpu boot_code

2023-12-24 Thread gaosong

在 2023/12/21 下午3:22, maobibo 写道:



On 2023/12/18 下午5:00, Song Gao wrote:

Signed-off-by: Song Gao 
---
  hw/loongarch/boot.c | 65 -
  1 file changed, 64 insertions(+), 1 deletion(-)

diff --git a/hw/loongarch/boot.c b/hw/loongarch/boot.c
index 4bfe24274a..076e795714 100644
--- a/hw/loongarch/boot.c
+++ b/hw/loongarch/boot.c
@@ -14,6 +14,62 @@
  #include "qemu/error-report.h"
  #include "sysemu/reset.h"
  +enum {
+    SLAVE_BOOT,
+};
+
+static const MemMapEntry loader_rommap[] = {
+    [SLAVE_BOOT] = {0xf10, 0x1},
+};

Address 0xf10 had better be defined before 0x10


I will correct it on v3

Thanks.
Song Gao

Regards
Bibo Mao


+
+static unsigned int slave_boot_code[] = {
+  /* Configure reset ebase. */
+    0x0400302c,   /* csrwr  $r12,0xc    */
+
+  /* Disable interrupt. */
+    0x0380100c,   /* ori    $r12,$r0,0x4    */
+    0x04000180,   /* csrxchg    $r0,$r12,0x0    */
+
+  /* Clear mailbox. */
+    0x142d,   /* lu12i.w    $r13,1(0x1) */
+    0x038081ad,   /* ori    $r13,$r13,0x20  */
+    0x06481da0,   /* iocsrwr.d  $r0,$r13    */
+
+  /* Enable IPI interrupt.  */
+    0x142c,   /* lu12i.w    $r12,1(0x1) */
+    0x0400118c,   /* csrxchg    $r12,$r12,0x4   */
+    0x02fffc0c,   /* addi.d $r12,$r0,-1(0xfff)  */
+    0x142d,   /* lu12i.w    $r13,1(0x1) */
+    0x038011ad,   /* ori    $r13,$r13,0x4   */
+    0x064819ac,   /* iocsrwr.w  $r12,$r13   */
+    0x142d,   /* lu12i.w    $r13,1(0x1) */
+    0x038081ad,   /* ori    $r13,$r13,0x20  */
+
+  /* Wait for wakeup  <.L11>:   */
+    0x06488000,   /* idle   0x0 */
+    0x0340,   /* andi   $r0,$r0,0x0 */
+    0x064809ac,   /* iocsrrd.w  $r12,$r13   */
+    0x43fff59f,   /* beqz   $r12,-12(0x74) # 48 <.L11> */
+
+  /* Read and clear IPI interrupt.  */
+    0x142d,   /* lu12i.w    $r13,1(0x1) */
+    0x064809ac,   /* iocsrrd.w  $r12,$r13   */
+    0x142d,   /* lu12i.w    $r13,1(0x1) */
+    0x038031ad,   /* ori    $r13,$r13,0xc   */
+    0x064819ac,   /* iocsrwr.w  $r12,$r13   */
+
+  /* Disable  IPI interrupt.    */
+    0x142c,   /* lu12i.w    $r12,1(0x1) */
+    0x04001180,   /* csrxchg    $r0,$r12,0x4    */
+
+  /* Read mail buf and jump to specified entry */
+    0x142d,   /* lu12i.w    $r13,1(0x1) */
+    0x038081ad,   /* ori    $r13,$r13,0x20  */
+    0x06480dac,   /* iocsrrd.d  $r12,$r13   */
+    0x00150181,   /* move   $r1,$r12    */
+    0x4c20,   /* jirl   $r0,$r1,0   */
+};
+
  static int init_cmdline(struct loongarch_boot_info *info)
  {
  hwaddr cmdline_addr;
@@ -145,10 +201,17 @@ static void 
loongarch_direct_kernel_boot(LoongArchMachineState *lams,

  exit(1);
  }
  +    rom_add_blob_fixed("slave_boot", slave_boot_code, 
sizeof(slave_boot_code),

+   loader_rommap[SLAVE_BOOT].base);
+
  for (i = 0; i < machine->smp.cpus; i++) {
  lacpu = LOONGARCH_CPU(qemu_get_cpu(i));
  lacpu->env.load_elf = true;
-    lacpu->env.elf_address = kernel_addr;
+    if (i == 0) {
+    lacpu->env.elf_address = kernel_addr;
+    } else {
+    lacpu->env.elf_address = loader_rommap[SLAVE_BOOT].base;
+    }
  lacpu->env.boot_info = info;
  }
  }






Re: [PATCH for-8.2?] target/i386: Fix 32-bit wrapping of pc/eip computation

2023-12-24 Thread Michael Tokarev

12.12.2023 20:25, Richard Henderson:

In 32-bit mode, pc = eip + cs_base is also 32-bit, and must wrap.
Failure to do so results in incorrect memory exceptions to the guest.
Before 732d548732ed, this was implicitly done via truncation to
target_ulong but only in qemu-system-i386, not qemu-system-x86_64.

To fix this, we must add conditional zero-extensions.
Since we have to test for 32 vs 64-bit anyway, note that cs_base
is always zero in 64-bit mode.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2022
Signed-off-by: Richard Henderson 
---

This may be too late for 8.2; if not, then 8.2.1 and 8.1.next.
I think I have found all forms of pc <-> eip, but another set
of eyes would be appreciated.


This change breaks trivial 4M edk2 boot - both in 8.2.0 and in
8.1.4 (which also has this commit now).

 qemu-system-x86_64 -machine q35 -no-user-config -nodefaults -display none \
  -serial stdio \
  -drive 
file=/usr/share/OVMF/OVMF_CODE_4M.secboot.fd,if=pflash,format=raw,readonly=on \
  -drive 
file=/usr/share/OVMF/OVMF_VARS_4M.ms.fd,if=pflash,format=raw,snapshot=on

After this change, nothing is printed on the serial console anymore
(or in vga, whatever). Before that commit, usual edk2 boot sequence
is seen.

Nothing has changed with the 2M variant though.

/mjt




Re: PCIe with Designware RC.

2023-12-24 Thread BALATON Zoltan

On Sun, 24 Dec 2023, Shlomo Pongratz wrote:

Thank you, see comment inside.

On Sun, Dec 24, 2023 at 1:11 PM BALATON Zoltan  wrote:


On Sun, 24 Dec 2023, Shlomo Pongratz wrote:

Hi,
I'm working on a AARCH64 project that uses the designeware
(hw/pci-host/designware.c).
I've copied the designware initialization from hw/arm/fsl-imx7.c and I
hope I've updated the dtsi correctly.
After fixing an issue with the iATU windows (see patch
https://lists.gnu.org/archive/html/qemu-devel/2023-12/msg02643.html)
I've tried to add virtualized NVMe controller.
When I added the lines:
   -device nvme,serial=deadbeef,drive=nvme0,bus=pcie \  (Or without bus=)
   -drive file=/home/pliops/disk-1.img,if=none,id=nvme1 \


You define drive with if=none,id=nvme1 but have drive=nvme0 in your
device. You should refer to the drive you want the device to use so I
think it should either be -device nvme,drive=nvme1 or the if of drive
should be nvme0.


This was a typo, All are nvme0.


Then maybe you need to give more details on what do you use as then the 
difference is only which pcie bus it's connected to. You could try 
checking in QEMU monitor with 'info qtree' where devices are attached and 
how are they organised and with 'info mtree' where BARs are mapped.


Regards,
BALATON Zoltan

Re: PCIe with Designware RC.

2023-12-24 Thread Shlomo Pongratz
Thank you, see comment inside.

On Sun, Dec 24, 2023 at 1:11 PM BALATON Zoltan  wrote:
>
> On Sun, 24 Dec 2023, Shlomo Pongratz wrote:
> > Hi,
> > I'm working on a AARCH64 project that uses the designeware
> > (hw/pci-host/designware.c).
> > I've copied the designware initialization from hw/arm/fsl-imx7.c and I
> > hope I've updated the dtsi correctly.
> > After fixing an issue with the iATU windows (see patch
> > https://lists.gnu.org/archive/html/qemu-devel/2023-12/msg02643.html)
> > I've tried to add virtualized NVMe controller.
> > When I added the lines:
> >-device nvme,serial=deadbeef,drive=nvme0,bus=pcie \  (Or without 
> > bus=)
> >-drive file=/home/pliops/disk-1.img,if=none,id=nvme1 \
>
> You define drive with if=none,id=nvme1 but have drive=nvme0 in your
> device. You should refer to the drive you want the device to use so I
> think it should either be -device nvme,drive=nvme1 or the if of drive
> should be nvme0.

This was a typo, All are nvme0.

 I don't know how this works for nvme but for CD drives
> for example adding a device would add it without disk and drive defines
> the disk to use. Not sure this makes sense for hard disks or nvme device
> but maybe the command line options don't consider that.
>
> > I could see in QEMU monitor that the NVMe device was preset i.e.
> > (qemu) info pci
> >  Bus  0, device   0, function 0:
> >PCI bridge: PCI device 16c3:abcd
> >  IRQ 0, pin A
> >  BUS 0.
> >  secondary bus 1.
> >  subordinate bus 255.
> >  IO range [0xf000, 0x0fff]
> >  memory range [0xfff0, 0x000f]
> >  prefetchable memory range [0xfff0, 0x000f]
> >  id ""
> >  Bus  0, device   1, function 0:
> >Class 0264: PCI device 1b36:0010
> >  PCI subsystem 1af4:1100
> >  IRQ 0, pin A
> >  BAR0: 64 bit memory at 0x [0x3ffe].
> >  id ""
> > However in lspci it was missing
> > # lspci
> > 00:00.0 Class 0604: 16c3:abcd
> >
> > If I used the following command
> >-drive file=/home/pliops/disk.img,if=none,id=nvme0 \
> >-device nvme,serial=deadbeef,drive=nvme0,bus=dw-pcie \
>
> Here you correctly define both media and drive so it works as expected.
> There are some shortcuts for -drive with media=disk or media=cdrom and
> if=ide or scsi that don't need a separate drive option as if=none does but
> not sure if that supports nvme. You probably have to check documentation
> or code to find out.
>
> > Then in the monitor I see:
> > (qemu) info pci
> >  Bus  0, device   0, function 0:
> >PCI bridge: PCI device 16c3:abcd
> >  IRQ 0, pin A
> >  BUS 0.
> >  secondary bus 1.
> >  subordinate bus 255.
> >  IO range [0xf000, 0x0fff]
> >  memory range [0x4000, 0x401f]
> >  prefetchable memory range [0xfff0, 0x000f]
> >  id ""
> >  Bus  1, device   0, function 0:
> >Class 0264: PCI device 1b36:0010
> >  PCI subsystem 1af4:1100
> >  IRQ 1, pin A
> >  BAR0: 64 bit memory at 0x [0x3ffe].
> >  id ""
> > That is the NVMe is on BUS 1.
> > And in lspci I can now see the device but on bus 1.
> > # lspci
> > 01:00.0 Class 0108: 1b36:0010
> > 00:00.0 Class 0604: 16c3:abcd
> >
> > Is this expected?
> >
> > But the main problem is that during the initialization of the
> > controller registers in BAR0 all the read and writes are actually done
> > into the config space.
>
> I don't know what this is but don't think it's related to the above.
>
> Regards,
> BALATON Zoltan
>
> > Any ideas?
> >
> > Thank you
> > Shlomo Pongratz.
> >
> >



[PATCH] virtio-blk: Fix potential nullpointer read access in virtio_blk_data_plane_destroy

2023-12-24 Thread Stefan Weil via
Fixes: CID 1532828
Fixes: b6948ab01d ("virtio-blk: add iothread-vq-mapping parameter")
Signed-off-by: Stefan Weil 
---
 hw/block/dataplane/virtio-blk.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 6debd4401e..97a302cf49 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -152,7 +152,7 @@ bool virtio_blk_data_plane_create(VirtIODevice *vdev, 
VirtIOBlkConf *conf,
 void virtio_blk_data_plane_destroy(VirtIOBlockDataPlane *s)
 {
 VirtIOBlock *vblk;
-VirtIOBlkConf *conf = s->conf;
+VirtIOBlkConf *conf;
 
 if (!s) {
 return;
@@ -160,6 +160,7 @@ void virtio_blk_data_plane_destroy(VirtIOBlockDataPlane *s)
 
 vblk = VIRTIO_BLK(s->vdev);
 assert(!vblk->dataplane_started);
+conf = s->conf;
 
 if (conf->iothread_vq_mapping_list) {
 IOThreadVirtQueueMappingList *node;
-- 
2.39.2




Re: PCIe with Designware RC.

2023-12-24 Thread BALATON Zoltan

On Sun, 24 Dec 2023, Shlomo Pongratz wrote:

Hi,
I'm working on a AARCH64 project that uses the designeware
(hw/pci-host/designware.c).
I've copied the designware initialization from hw/arm/fsl-imx7.c and I
hope I've updated the dtsi correctly.
After fixing an issue with the iATU windows (see patch
https://lists.gnu.org/archive/html/qemu-devel/2023-12/msg02643.html)
I've tried to add virtualized NVMe controller.
When I added the lines:
   -device nvme,serial=deadbeef,drive=nvme0,bus=pcie \  (Or without bus=)
   -drive file=/home/pliops/disk-1.img,if=none,id=nvme1 \


You define drive with if=none,id=nvme1 but have drive=nvme0 in your 
device. You should refer to the drive you want the device to use so I 
think it should either be -device nvme,drive=nvme1 or the if of drive 
should be nvme0. I don't know how this works for nvme but for CD drives 
for example adding a device would add it without disk and drive defines 
the disk to use. Not sure this makes sense for hard disks or nvme device 
but maybe the command line options don't consider that.



I could see in QEMU monitor that the NVMe device was preset i.e.
(qemu) info pci
 Bus  0, device   0, function 0:
   PCI bridge: PCI device 16c3:abcd
 IRQ 0, pin A
 BUS 0.
 secondary bus 1.
 subordinate bus 255.
 IO range [0xf000, 0x0fff]
 memory range [0xfff0, 0x000f]
 prefetchable memory range [0xfff0, 0x000f]
 id ""
 Bus  0, device   1, function 0:
   Class 0264: PCI device 1b36:0010
 PCI subsystem 1af4:1100
 IRQ 0, pin A
 BAR0: 64 bit memory at 0x [0x3ffe].
 id ""
However in lspci it was missing
# lspci
00:00.0 Class 0604: 16c3:abcd

If I used the following command
   -drive file=/home/pliops/disk.img,if=none,id=nvme0 \
   -device nvme,serial=deadbeef,drive=nvme0,bus=dw-pcie \


Here you correctly define both media and drive so it works as expected. 
There are some shortcuts for -drive with media=disk or media=cdrom and 
if=ide or scsi that don't need a separate drive option as if=none does but 
not sure if that supports nvme. You probably have to check documentation 
or code to find out.



Then in the monitor I see:
(qemu) info pci
 Bus  0, device   0, function 0:
   PCI bridge: PCI device 16c3:abcd
 IRQ 0, pin A
 BUS 0.
 secondary bus 1.
 subordinate bus 255.
 IO range [0xf000, 0x0fff]
 memory range [0x4000, 0x401f]
 prefetchable memory range [0xfff0, 0x000f]
 id ""
 Bus  1, device   0, function 0:
   Class 0264: PCI device 1b36:0010
 PCI subsystem 1af4:1100
 IRQ 1, pin A
 BAR0: 64 bit memory at 0x [0x3ffe].
 id ""
That is the NVMe is on BUS 1.
And in lspci I can now see the device but on bus 1.
# lspci
01:00.0 Class 0108: 1b36:0010
00:00.0 Class 0604: 16c3:abcd

Is this expected?

But the main problem is that during the initialization of the
controller registers in BAR0 all the read and writes are actually done
into the config space.


I don't know what this is but don't think it's related to the above.

Regards,
BALATON Zoltan


Any ideas?

Thank you
Shlomo Pongratz.






PCIe with Designware RC.

2023-12-24 Thread Shlomo Pongratz
Hi,
I'm working on a AARCH64 project that uses the designeware
(hw/pci-host/designware.c).
I've copied the designware initialization from hw/arm/fsl-imx7.c and I
hope I've updated the dtsi correctly.
After fixing an issue with the iATU windows (see patch
https://lists.gnu.org/archive/html/qemu-devel/2023-12/msg02643.html)
I've tried to add virtualized NVMe controller.
When I added the lines:
-device nvme,serial=deadbeef,drive=nvme0,bus=pcie \  (Or without bus=)
-drive file=/home/pliops/disk-1.img,if=none,id=nvme1 \
I could see in QEMU monitor that the NVMe device was preset i.e.
(qemu) info pci
  Bus  0, device   0, function 0:
PCI bridge: PCI device 16c3:abcd
  IRQ 0, pin A
  BUS 0.
  secondary bus 1.
  subordinate bus 255.
  IO range [0xf000, 0x0fff]
  memory range [0xfff0, 0x000f]
  prefetchable memory range [0xfff0, 0x000f]
  id ""
  Bus  0, device   1, function 0:
Class 0264: PCI device 1b36:0010
  PCI subsystem 1af4:1100
  IRQ 0, pin A
  BAR0: 64 bit memory at 0x [0x3ffe].
  id ""
However in lspci it was missing
# lspci
00:00.0 Class 0604: 16c3:abcd

If I used the following command
-drive file=/home/pliops/disk.img,if=none,id=nvme0 \
-device nvme,serial=deadbeef,drive=nvme0,bus=dw-pcie \
Then in the monitor I see:
(qemu) info pci
  Bus  0, device   0, function 0:
PCI bridge: PCI device 16c3:abcd
  IRQ 0, pin A
  BUS 0.
  secondary bus 1.
  subordinate bus 255.
  IO range [0xf000, 0x0fff]
  memory range [0x4000, 0x401f]
  prefetchable memory range [0xfff0, 0x000f]
  id ""
  Bus  1, device   0, function 0:
Class 0264: PCI device 1b36:0010
  PCI subsystem 1af4:1100
  IRQ 1, pin A
  BAR0: 64 bit memory at 0x [0x3ffe].
  id ""
That is the NVMe is on BUS 1.
And in lspci I can now see the device but on bus 1.
# lspci
01:00.0 Class 0108: 1b36:0010
00:00.0 Class 0604: 16c3:abcd

Is this expected?

But the main problem is that during the initialization of the
controller registers in BAR0 all the read and writes are actually done
into the config space.

Any ideas?

Thank you
Shlomo Pongratz.