date:20230522

Re: [PATCH] meson: remove -no-pie linker flag

2023-05-22 Thread Volker Rümelin


On 22.05.23 at 10:08, Paolo Bonzini wrote:

The large comment in the patch says it all; the -no-pie flag is broken and
this is why it was not included in QEMU_LDFLAGS before commit a988b4c5614
("build: move remaining compiler flag tests to meson", 2023-05-18).

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1664
Signed-off-by: Paolo Bonzini 
---
  meson.build | 13 +
  1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/meson.build b/meson.build
index 0a5cdefd4d3d..6733b2917081 100644
--- a/meson.build
+++ b/meson.build
@@ -267,10 +267,15 @@ endif
  # has explicitly disabled PIE we need to extend our cflags.
  if not get_option('b_pie')
qemu_common_flags += cc.get_supported_arguments('-fno-pie')
-  if not get_option('prefer_static')
-# No PIE is implied by -static which we added above.
-qemu_ldflags += cc.get_supported_link_arguments('-no-pie')
-  endif
+  # What about linker flags?  For a static build, no PIE is implied by -static
+  # which we added above.  For dynamic linking, adding -no-pie is messy because
+  # it overrides -shared: the linker then wants to build an executable instead
+  # of a shared library and the build fails.  Before moving this code to Meson,
+  # we went through a dozen different commits affecting the usage of -no-pie,
+  # ultimately settling for a completely broken one that added -no-pie to the
+  # compiler flags together with -fno-pie... except that -no-pie is a linker
+  # flag that has no effect on the compiler command line.  So, don't add
+  # -no-pie anywhere and cross fingers.
  endif
  
  if not get_option('stack_protector').disabled()


QEMU builds again on Windows with MSYS2 mingw64.

I also tried to build QEMU on Windows with libslirp from the subproject 
folder. The issue reported in 
https://gitlab.com/qemu-project/qemu/-/issues/1664 is fixed, but it now 
fails with a different error. This is a libslirp bug. See 
https://gitlab.freedesktop.org/slirp/libslirp/-/issues/68. The revision 
in subprojects/slirp.wrap should be at least 
fc5eaaf6f68d5cff76468c63984c33c4fb51506d.


Building QEMU on my Linux system works fine.

Tested-by: Volker Rümelin

Re: [PATCH v2 1/1] Add vpd data for Rainier machine

2023-05-22 Thread Cédric Le Goater


On 5/22/23 17:36, Ninad Palsule wrote:

The VPD data is added for system and BMC FRU. This data is fabricated.

Tested:
- The system-vpd.service is active.
- VPD service related to bmc is active.
Signed-off-by: Ninad Palsule 


Reviewed-by: Cédric Le Goater 

Thanks,

C.



---
  hw/arm/aspeed.c|  6 --
  hw/arm/aspeed_eeprom.c | 45 +-
  hw/arm/aspeed_eeprom.h |  5 +
  3 files changed, 53 insertions(+), 3 deletions(-)

diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
index 0b29028fe1..bfc2070bd2 100644
--- a/hw/arm/aspeed.c
+++ b/hw/arm/aspeed.c
@@ -788,8 +788,10 @@ static void rainier_bmc_i2c_init(AspeedMachineState *bmc)
   0x48);
  i2c_slave_create_simple(aspeed_i2c_get_bus(&soc->i2c, 8), TYPE_TMP105,
   0x4a);
-at24c_eeprom_init(aspeed_i2c_get_bus(&soc->i2c, 8), 0x50, 64 * KiB);
-at24c_eeprom_init(aspeed_i2c_get_bus(&soc->i2c, 8), 0x51, 64 * KiB);
+at24c_eeprom_init_rom(aspeed_i2c_get_bus(&soc->i2c, 8), 0x50,
+  64 * KiB, rainier_bb_fruid, rainier_bb_fruid_len);
+at24c_eeprom_init_rom(aspeed_i2c_get_bus(&soc->i2c, 8), 0x51,
+  64 * KiB, rainier_bmc_fruid, rainier_bmc_fruid_len);
  create_pca9552(soc, 8, 0x60);
  create_pca9552(soc, 8, 0x61);
  /* Bus 8: ucd90320@11 */
diff --git a/hw/arm/aspeed_eeprom.c b/hw/arm/aspeed_eeprom.c
index dc33a88a54..ace5266cec 100644
--- a/hw/arm/aspeed_eeprom.c
+++ b/hw/arm/aspeed_eeprom.c
@@ -119,9 +119,52 @@ const uint8_t yosemitev2_bmc_fruid[] = {
  0x6e, 0x66, 0x69, 0x67, 0x20, 0x41, 0xc1, 0x45,
  };
  
+const uint8_t rainier_bb_fruid[] = {

+0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x84,
+0x28, 0x00, 0x52, 0x54, 0x04, 0x56, 0x48, 0x44, 0x52, 0x56, 0x44, 0x02,
+0x01, 0x00, 0x50, 0x54, 0x0e, 0x56, 0x54, 0x4f, 0x43, 0x00, 0x00, 0x37,
+0x00, 0x4a, 0x00, 0x00, 0x00, 0x00, 0x00, 0x50, 0x46, 0x08, 0x00, 0x00,
+0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x46, 0x00, 0x52, 0x54,
+0x04, 0x56, 0x54, 0x4f, 0x43, 0x50, 0x54, 0x38, 0x56, 0x49, 0x4e, 0x49,
+0x00, 0x00, 0x81, 0x00, 0x3a, 0x00, 0x00, 0x00, 0x00, 0x00, 0x56, 0x53,
+0x59, 0x53, 0x00, 0x00, 0xbb, 0x00, 0x27, 0x00, 0x00, 0x00, 0x00, 0x00,
+0x56, 0x43, 0x45, 0x4e, 0x00, 0x00, 0xe2, 0x00, 0x27, 0x00, 0x00, 0x00,
+0x00, 0x00, 0x56, 0x53, 0x42, 0x50, 0x00, 0x00, 0x09, 0x01, 0x19, 0x00,
+0x00, 0x00, 0x00, 0x00, 0x50, 0x46, 0x01, 0x00, 0x00, 0x00, 0x36, 0x00,
+0x52, 0x54, 0x04, 0x56, 0x49, 0x4e, 0x49, 0x44, 0x52, 0x04, 0x44, 0x45,
+0x53, 0x43, 0x48, 0x57, 0x02, 0x30, 0x31, 0x43, 0x43, 0x04, 0x33, 0x34,
+0x35, 0x36, 0x46, 0x4e, 0x04, 0x46, 0x52, 0x34, 0x39, 0x53, 0x4e, 0x04,
+0x53, 0x52, 0x31, 0x32, 0x50, 0x4e, 0x04, 0x50, 0x52, 0x39, 0x39, 0x50,
+0x46, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x23, 0x00, 0x52, 0x54,
+0x04, 0x56, 0x53, 0x59, 0x53, 0x53, 0x45, 0x07, 0x49, 0x42, 0x4d, 0x53,
+0x59, 0x53, 0x31, 0x54, 0x4d, 0x08, 0x32, 0x32, 0x32, 0x32, 0x2d, 0x32,
+0x32, 0x32, 0x50, 0x46, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x23,
+0x00, 0x52, 0x54, 0x04, 0x56, 0x43, 0x45, 0x4e, 0x53, 0x45, 0x07, 0x31,
+0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x46, 0x43, 0x08, 0x31, 0x31, 0x31,
+0x31, 0x2d, 0x31, 0x31, 0x31, 0x50, 0x46, 0x04, 0x00, 0x00, 0x00, 0x00,
+0x00, 0x00, 0x15, 0x00, 0x52, 0x54, 0x04, 0x56, 0x53, 0x42, 0x50, 0x49,
+0x4d, 0x04, 0x50, 0x00, 0x10, 0x01, 0x50, 0x46, 0x04, 0x00, 0x00, 0x00,
+0x00, 0x00,
+};
+
+/* Rainier BMC FRU */
+const uint8_t rainier_bmc_fruid[] = {
+0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x84,
+0x28, 0x00, 0x52, 0x54, 0x04, 0x56, 0x48, 0x44, 0x52, 0x56, 0x44, 0x02,
+0x01, 0x00, 0x50, 0x54, 0x0e, 0x56, 0x54, 0x4f, 0x43, 0x00, 0x00, 0x37,
+0x00, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00, 0x50, 0x46, 0x08, 0x00, 0x00,
+0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x1c, 0x00, 0x52, 0x54,
+0x04, 0x56, 0x54, 0x4f, 0x43, 0x50, 0x54, 0x0e, 0x56, 0x49, 0x4e, 0x49,
+0x00, 0x00, 0x57, 0x00, 0x1e, 0x00, 0x00, 0x00, 0x00, 0x00, 0x50, 0x46,
+0x01, 0x00, 0x00, 0x00, 0x1a, 0x00, 0x52, 0x54, 0x04, 0x56, 0x49, 0x4e,
+0x49, 0x44, 0x52, 0x04, 0x44, 0x45, 0x53, 0x43, 0x48, 0x57, 0x02, 0x30,
+0x31, 0x50, 0x46, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00,
+};
+
  const size_t tiogapass_bmc_fruid_len = sizeof(tiogapass_bmc_fruid);
  const size_t fby35_nic_fruid_len = sizeof(fby35_nic_fruid);
  const size_t fby35_bb_fruid_len = sizeof(fby35_bb_fruid);
  const size_t fby35_bmc_fruid_len = sizeof(fby35_bmc_fruid);
-
  const size_t yosemitev2_bmc_fruid_len = sizeof(yosemitev2_bmc_fruid);
+const size_t rainier_bb_fruid_len = sizeof(rainier_bb_fruid);
+const size_t rainier_bmc_fruid_len = sizeof(rainier_bmc_fruid);
diff --git a/hw/arm/aspeed_eeprom.h b/hw/arm/aspeed_eeprom.h
index 86db6f0479..bbf9e54365 100644
--- a/hw/arm/aspeed_eeprom.h
+++ b/hw/arm/aspeed_eeprom.h
@@ -22,4 +22,9 @@ extern const

Re: [PATCH] ui/cursor: incomplete check for integer overflow in cursor_alloc

2023-05-22 Thread Gerd Hoffmann

> > -QEMUCursor *cursor_alloc(int width, int height)
> > +QEMUCursor *cursor_alloc(uint32_t width, uint32_t height)
> >   {
> >       QEMUCursor *c;
> 
> Can't we check width/height > 0 && <= SOME_LIMIT_THAT_MAKES_SENSE?
> 
> Maybe a 16K * 16K cursor is future proof and safe enough.

Modern physical hardware typically uses 512x512 sprites (even if only a
fraction of that is actually needed and >90% are just transparent pixels).

take care,
  Gerd
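
The bounded check suggested in the quoted review can be sketched in C as below. This is a minimal illustration, not the code from the patch: DemoCursor, demo_cursor_alloc() and CURSOR_MAX_DIM are hypothetical names, and the 512 limit is simply the sprite size mentioned above, chosen here as an assumption.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical limit; 512x512 is the sprite size cited in the thread. */
#define CURSOR_MAX_DIM 512u

typedef struct DemoCursor {
    uint32_t width;
    uint32_t height;
    uint32_t data[];   /* width * height 32-bit pixels */
} DemoCursor;

/* Returns NULL when a dimension is zero or exceeds CURSOR_MAX_DIM. */
static DemoCursor *demo_cursor_alloc(uint32_t width, uint32_t height)
{
    DemoCursor *c;
    size_t pixels;

    if (width == 0 || height == 0 ||
        width > CURSOR_MAX_DIM || height > CURSOR_MAX_DIM) {
        return NULL;
    }

    /* Both factors are at most 512, so the product cannot overflow size_t. */
    pixels = (size_t)width * height;
    c = calloc(1, sizeof(*c) + pixels * sizeof(uint32_t));
    if (c == NULL) {
        return NULL;
    }
    c->width = width;
    c->height = height;
    return c;
}
```

Rejecting out-of-range dimensions before the multiplication means the size computation never needs a separate overflow check.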

Re: [PATCH 1/4] hw/intc/loongarch_ipi: Bring back all 4 IPI mailboxes

2023-05-22 Thread Jiaxun Yang

> On May 23, 2023, at 02:25, Song Gao wrote:
> 
> 
> 
> On 2023/5/22 at 9:44 PM, Philippe Mathieu-Daudé wrote:
>> On 22/5/23 13:47, Jiaxun Yang wrote:
>>> 
>>> 
>>>> On May 22, 2023, at 04:52, Huacai Chen wrote:
>>>>
>>>> Hi, Jiaxun,
>>>>
>>>> Rename loongarch_ipi to loongson_ipi? It will be shared by both MIPS
>>>> and LoongArch in your series.
>>> 
>>> Hi Huacai,
>>> 
>>> Thanks for the point, what’s the opinion from LoongArch maintainers?
>>> 
>>> Or perhaps rename it as loong_ipi to reflect the nature that it’s shared
>>> by MIPS based Loongson and LoongArch based Loongson?
>> 
>> I'm not a LoongArch maintainer, but a model named "loong_ipi" makes
>> sense to me.
>> 
>> Please add it to the two Virt machine sections in MAINTAINERS.

Hi Song,

>> 
> 'loongson_ipi' is better; QEMU doesn't have any naming with 'loong' as a prefix.

Thanks, I’ll take loongson_ipi then.

> 
> And patch 2 should not use macros. Some attributes should be added to
> distinguish between MIPS and LoongArch.

By attribute do you mean property? If so, I don’t see any necessity; the IP block
is exactly the same on MIPS and LoongArch. I’m guarding them out because
we have a different way to get the IOCSR address space on MIPS, which is still
to be implemented.

I can further abstract out a function to get the IOCSR address space. But still,
I think the best way to distinguish the two architectures is to use TARGET_* macros,
as it doesn’t make much sense to compile unused code for another
architecture.

> 
> All references to loongarch_ipi should also be changed.
Sure.

Thanks
- Jiaxun

> 
> Thanks.
> Song Gao

Re: [PATCH] acpi/tests/avocado/bits: enable bios bits avocado tests on gitlab CI pipeline

2023-05-22 Thread Ani Sinha

> On 22-May-2023, at 11:30 PM, Thomas Huth  wrote:
> 
> On 21/05/2023 07.51, Michael S. Tsirkin wrote:
>> On Fri, May 19, 2023 at 08:44:18PM +0530, Ani Sinha wrote:
>>> 
>>> 
>>>> On 17-May-2023, at 12:23 PM, Ani Sinha wrote:
>>>>
>>>> The biosbits avocado tests on gitlab have thus far been disabled because some
>>>> packages needed by these tests were missing in the container images used by
>>>> gitlab CI. These packages have now been added with the commit:
>>>>
>>>> da9000784c90d ("tests/lcitool: Add mtools and xorriso and remove
>>>> genisoimage as dependencies")
>>>>
>>>> Therefore, this change enables the bits avocado test on gitlab.
>>>> At the same time, the bits cleanup code has also been made more robust with
>>>> this change.
>>>>
>>>> Signed-off-by: Ani Sinha
>>> 
>>> Michael, did you forget to queue this?
>> Not that I forgot but it takes me time to process new patches.
>> This came after I started testing the pull.
> 
> FYI, I've picked it up today.

Thanks Thomas! Much appreciated!

Re: [PULL v2 0/4] QAPI patches patches for 2023-05-17

2023-05-22 Thread Richard Henderson


On 5/22/23 04:20, Markus Armbruster wrote:

The following changes since commit aa222a8e4f975284b3f8f131653a4114b3d333b3:

   Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu
into staging (2023-05-19 12:17:16 -0700)

are available in the Git repository at:

   https://repo.or.cz/qemu/armbru.git  tags/pull-qapi-2023-05-17-v2

for you to fetch changes up to 0ec4468f233c53eb854f204d105d965455deec51:

   docs/interop: Delete qmp-intro.txt (2023-05-22 10:22:29 +0200)


QAPI patches patches for 2023-05-17


Applied, thanks.  Please update https://wiki.qemu.org/ChangeLog/8.1 as 
appropriate.


r~

Re: [PATCH v4 03/11] hw: allwinner-r40: Complete uart devices

2023-05-22 Thread qianfan





On 2023/5/15 at 2:55, Niek Linnenbank wrote:

Hi Qianfan,


On Wed, May 10, 2023 at 12:30 PM  wrote:

From: qianfan Zhao 

The R40 has eight UARTs, which support both 16450 and 16550 compatible modes.

Signed-off-by: qianfan Zhao 
---
 hw/arm/allwinner-r40.c         | 31 ---
 include/hw/arm/allwinner-r40.h |  8 
 2 files changed, 36 insertions(+), 3 deletions(-)

diff --git a/hw/arm/allwinner-r40.c b/hw/arm/allwinner-r40.c
index 128c0ca470..537a90b23d 100644
--- a/hw/arm/allwinner-r40.c
+++ b/hw/arm/allwinner-r40.c
@@ -45,6 +45,13 @@ const hwaddr allwinner_r40_memmap[] = {
     [AW_R40_DEV_CCU]        = 0x01c2,
     [AW_R40_DEV_PIT]        = 0x01c20c00,
     [AW_R40_DEV_UART0]      = 0x01c28000,
+    [AW_R40_DEV_UART1]      = 0x01c28400,
+    [AW_R40_DEV_UART2]      = 0x01c28800,
+    [AW_R40_DEV_UART3]      = 0x01c28c00,
+    [AW_R40_DEV_UART4]      = 0x01c29000,
+    [AW_R40_DEV_UART5]      = 0x01c29400,
+    [AW_R40_DEV_UART6]      = 0x01c29800,
+    [AW_R40_DEV_UART7]      = 0x01c29c00,


After adding the uarts to the memory map here, you should remove them 
from the unimplemented array.

Hi:

I tried this, including removing UART0 from allwinner_r40_memmap, but
then QEMU for the R40 no longer works. Only a few registers are
implemented in hw/char/serial.c, so we still need this.


     [AW_R40_DEV_GIC_DIST]   = 0x01c81000,
     [AW_R40_DEV_GIC_CPU]    = 0x01c82000,
     [AW_R40_DEV_GIC_HYP]    = 0x01c84000,
@@ -160,6 +167,10 @@ enum {
     AW_R40_GIC_SPI_UART1     =  2,
     AW_R40_GIC_SPI_UART2     =  3,
     AW_R40_GIC_SPI_UART3     =  4,


Since this patch adds UART1-7, it probably makes sense to add the
'AW_R40_GIC_SPI_UART1/2/3' lines in this patch as well.


With the two above remarks resolved, the patch looks good to me.

Reviewed-by: Niek Linnenbank 

Regards,
Niek

+    AW_R40_GIC_SPI_UART4     = 17,
+    AW_R40_GIC_SPI_UART5     = 18,
+    AW_R40_GIC_SPI_UART6     = 19,
+    AW_R40_GIC_SPI_UART7     = 20,
     AW_R40_GIC_SPI_TIMER0    = 22,
     AW_R40_GIC_SPI_TIMER1    = 23,
     AW_R40_GIC_SPI_MMC0      = 32,
@@ -387,9 +398,23 @@ static void allwinner_r40_realize(DeviceState
*dev, Error **errp)
     }

     /* UART0. For future clocktree API: All UARTS are connected
to APB2_CLK. */
-    serial_mm_init(get_system_memory(),
s->memmap[AW_R40_DEV_UART0], 2,
-                   qdev_get_gpio_in(DEVICE(&s->gic),
AW_R40_GIC_SPI_UART0),
-                   115200, serial_hd(0), DEVICE_NATIVE_ENDIAN);
+    for (int i = 0; i < AW_R40_NUM_UARTS; i++) {
+        static const int uart_irqs[AW_R40_NUM_UARTS] = {
+            AW_R40_GIC_SPI_UART0,
+            AW_R40_GIC_SPI_UART1,
+            AW_R40_GIC_SPI_UART2,
+            AW_R40_GIC_SPI_UART3,
+            AW_R40_GIC_SPI_UART4,
+            AW_R40_GIC_SPI_UART5,
+            AW_R40_GIC_SPI_UART6,
+            AW_R40_GIC_SPI_UART7,
+        };
+        const hwaddr addr = s->memmap[AW_R40_DEV_UART0 + i];
+
+        serial_mm_init(get_system_memory(), addr, 2,
+  qdev_get_gpio_in(DEVICE(&s->gic), uart_irqs[i]),
+                       115200, serial_hd(i), DEVICE_NATIVE_ENDIAN);
+    }

     /* Unimplemented devices */
     for (i = 0; i < ARRAY_SIZE(r40_unimplemented); i++) {
diff --git a/include/hw/arm/allwinner-r40.h
b/include/hw/arm/allwinner-r40.h
index 3be9dc962b..959b5dc4e0 100644
--- a/include/hw/arm/allwinner-r40.h
+++ b/include/hw/arm/allwinner-r40.h
@@ -41,6 +41,13 @@ enum {
     AW_R40_DEV_CCU,
     AW_R40_DEV_PIT,
     AW_R40_DEV_UART0,
+    AW_R40_DEV_UART1,
+    AW_R40_DEV_UART2,
+    AW_R40_DEV_UART3,
+    AW_R40_DEV_UART4,
+    AW_R40_DEV_UART5,
+    AW_R40_DEV_UART6,
+    AW_R40_DEV_UART7,
     AW_R40_DEV_GIC_DIST,
     AW_R40_DEV_GIC_CPU,
     AW_R40_DEV_GIC_HYP,
@@ -70,6 +77,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(AwR40State, AW_R40)
  * which are currently emulated by the R40 SoC code.
  */
 #define AW_R40_NUM_MMCS         4
+#define AW_R40_NUM_UARTS        8

 struct AwR40State {
     /*< private >*/
-- 
2.25.1




--
Niek Linnenbank
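
The reason the patch keeps a per-UART IRQ table can be seen in a small standalone sketch. The UART bases listed in the memory map are evenly spaced, while the SPI numbers are not contiguous. All demo_* names below are hypothetical, the patch itself looks the base up in s->memmap rather than computing it, and the SPI number used for UART0 is an assumption because it does not appear in the quoted hunks.

```c
#include <stdint.h>

#define DEMO_NUM_UARTS   8
#define DEMO_UART0_BASE  0x01c28000u  /* AW_R40_DEV_UART0 in the patch */
#define DEMO_UART_STRIDE 0x400u       /* spacing of the UART1..UART7 entries */

/*
 * SPI numbers for UART1..UART7 are taken from the patch.
 * The value 1 for UART0 is an assumption made for this sketch.
 */
static const int demo_uart_irqs[DEMO_NUM_UARTS] = {
    1, 2, 3, 4, 17, 18, 19, 20,
};

/* Base address of UART i, derived from the fixed stride. */
static uint32_t demo_uart_base(int i)
{
    return DEMO_UART0_BASE + (uint32_t)i * DEMO_UART_STRIDE;
}

/* SPI number of UART i; needs a table because of the jump from 4 to 17. */
static int demo_uart_irq(int i)
{
    return demo_uart_irqs[i];
}
```

Indexing a table keeps the realize loop uniform even though the interrupt numbers jump between UART3 and UART4.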

Re: [PATCH v8 4/7] igb: RX payload guest writting refactoring

2023-05-22 Thread Akihiko Odaki


On 2023/05/18 23:04, Tomasz Dzieciol wrote:

Refactoring is done in preparation for support of multiple advanced
descriptors RX modes, especially packet-split modes.

Signed-off-by: Tomasz Dzieciol 
---
  hw/net/e1000e_core.c |  18 ++--
  hw/net/igb_core.c| 214 +--
  tests/qtest/libqos/igb.c |   5 +
  3 files changed, 151 insertions(+), 86 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index b2e54fe802..f9ff31fd70 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -1418,11 +1418,11 @@ e1000e_write_hdr_to_rx_buffers(E1000ECore *core,
  }
  
  static void

-e1000e_write_to_rx_buffers(E1000ECore *core,
-   hwaddr ba[MAX_PS_BUFFERS],
-   e1000e_ba_state *bastate,
-   const char *data,
-   dma_addr_t data_len)
+e1000e_write_payload_frag_to_rx_buffers(E1000ECore *core,
+hwaddr ba[MAX_PS_BUFFERS],
+e1000e_ba_state *bastate,
+const char *data,
+dma_addr_t data_len)
  {
  while (data_len > 0) {
  uint32_t cur_buf_len = core->rxbuf_sizes[bastate->cur_idx];
@@ -1594,8 +1594,10 @@ e1000e_write_packet_to_guest(E1000ECore *core, struct 
NetRxPkt *pkt,
  while (copy_size) {
  iov_copy = MIN(copy_size, iov->iov_len - iov_ofs);
  
-e1000e_write_to_rx_buffers(core, ba, &bastate,

-iov->iov_base + iov_ofs, iov_copy);
+e1000e_write_payload_frag_to_rx_buffers(core, ba, &bastate,
+iov->iov_base +
+iov_ofs,
+iov_copy);
  
  copy_size -= iov_copy;

  iov_ofs += iov_copy;
@@ -1607,7 +1609,7 @@ e1000e_write_packet_to_guest(E1000ECore *core, struct 
NetRxPkt *pkt,
  
  if (desc_offset + desc_size >= total_size) {

  /* Simulate FCS checksum presence in the last descriptor 
*/
-e1000e_write_to_rx_buffers(core, ba, &bastate,
+e1000e_write_payload_frag_to_rx_buffers(core, ba, &bastate,
(const char *) &fcs_pad, e1000x_fcs_len(core->mac));
  }
  }
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index c987b26d09..7a4a01c4a1 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -941,6 +941,14 @@ igb_has_rxbufs(IGBCore *core, const E1000ERingInfo *r, 
size_t total_size)
   bufsize;
  }
  
+static uint32_t

+igb_rxhdrbufsize(IGBCore *core, const E1000ERingInfo *r)
+{
+uint32_t srrctl = core->mac[E1000_SRRCTL(r->idx) >> 2];
+return (srrctl & E1000_SRRCTL_BSIZEHDRSIZE_MASK) >>
+   E1000_SRRCTL_BSIZEHDRSIZE_SHIFT;
+}
+
  void
  igb_start_recv(IGBCore *core)
  {
@@ -1231,6 +1239,21 @@ igb_read_adv_rx_descr(IGBCore *core, union 
e1000_adv_rx_desc *desc,
  *buff_addr = le64_to_cpu(desc->read.pkt_addr);
  }
  
+typedef struct IGBPacketRxDMAState {

+size_t size;
+size_t total_size;
+size_t ps_hdr_len;
+size_t desc_size;
+size_t desc_offset;
+uint32_t rx_desc_packet_buf_size;
+uint32_t rx_desc_header_buf_size;
+struct iovec *iov;
+size_t iov_ofs;
+bool is_first;
+uint16_t written;
+hwaddr ba;
+} IGBPacketRxDMAState;
+
  static inline void
  igb_read_rx_descr(IGBCore *core, union e1000_rx_desc_union *desc,
hwaddr *buff_addr)
@@ -1514,19 +1537,6 @@ igb_pci_dma_write_rx_desc(IGBCore *core, PCIDevice *dev, 
dma_addr_t addr,
  }
  }
  
-static void

-igb_write_to_rx_buffers(IGBCore *core,
-PCIDevice *d,
-hwaddr ba,
-uint16_t *written,
-const char *data,
-dma_addr_t data_len)
-{
-trace_igb_rx_desc_buff_write(ba, *written, data, data_len);
-pci_dma_write(d, ba + *written, data, data_len);
-*written += data_len;
-}
-
  static void
  igb_update_rx_stats(IGBCore *core, const E1000ERingInfo *rxi,
  size_t pkt_size, size_t pkt_fcs_size)
@@ -1552,6 +1562,93 @@ igb_rx_descr_threshold_hit(IGBCore *core, const 
E1000ERingInfo *rxi)
 ((core->mac[E1000_SRRCTL(rxi->idx) >> 2] >> 20) & 31) * 16;
  }
  
+static void

+igb_truncate_to_descriptor_size(IGBPacketRxDMAState *pdma_st, size_t *size)
+{
+if (*size > pdma_st->rx_desc_packet_buf_size) {
+*size = pdma_st->rx_desc_packet_buf_size;
+}
+}
+
+static void
+igb_write_payload_frag_to_rx_buffers(IGBCore *core,
+ PCIDevice *d,
+ hwaddr ba,
+

Re: [PATCH v8 3/7] igb: RX descriptors guest writting refactoring

2023-05-22 Thread Akihiko Odaki


On 2023/05/18 23:04, Tomasz Dzieciol wrote:

Refactoring is done in preparation for support of multiple advanced
descriptors RX modes, especially packet-split modes.

Signed-off-by: Tomasz Dzieciol 
---
  hw/net/igb_core.c   | 178 +++-
  hw/net/igb_regs.h   |  10 +--
  hw/net/trace-events |   6 +-
  3 files changed, 101 insertions(+), 93 deletions(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index b6031dea24..c987b26d09 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -1281,15 +1281,11 @@ igb_verify_csum_in_sw(IGBCore *core,
  }
  
  static void

-igb_build_rx_metadata(IGBCore *core,
-  struct NetRxPkt *pkt,
-  bool is_eop,
-  const E1000E_RSSInfo *rss_info, uint16_t etqf, bool ts,
-  uint16_t *pkt_info, uint16_t *hdr_info,
-  uint32_t *rss,
-  uint32_t *status_flags,
-  uint16_t *ip_id,
-  uint16_t *vlan_tag)
+igb_build_rx_metadata_common(IGBCore *core,
+ struct NetRxPkt *pkt,
+ bool is_eop,
+ uint32_t *status_flags,
+ uint16_t *vlan_tag)
  {
  struct virtio_net_hdr *vhdr;
  bool hasip4, hasip6, csum_valid;
@@ -1298,7 +1294,6 @@ igb_build_rx_metadata(IGBCore *core,
  *status_flags = E1000_RXD_STAT_DD;
  
  /* No additional metadata needed for non-EOP descriptors */

-/* TODO: EOP apply only to status so don't skip whole function. */
  if (!is_eop) {
  goto func_exit;
  }
@@ -1315,59 +1310,6 @@ igb_build_rx_metadata(IGBCore *core,
  trace_e1000e_rx_metadata_vlan(*vlan_tag);
  }
  
-/* Packet parsing results */

-if ((core->mac[RXCSUM] & E1000_RXCSUM_PCSD) != 0) {
-if (rss_info->enabled) {
-*rss = cpu_to_le32(rss_info->hash);
-trace_igb_rx_metadata_rss(*rss);
-}
-} else if (hasip4) {
-*status_flags |= E1000_RXD_STAT_IPIDV;
-*ip_id = cpu_to_le16(net_rx_pkt_get_ip_id(pkt));
-trace_e1000e_rx_metadata_ip_id(*ip_id);
-}
-
-if (pkt_info) {
-*pkt_info = rss_info->enabled ? rss_info->type : 0;
-
-if (etqf < 8) {
-*pkt_info |= BIT(11) | (etqf << 4);
-} else {
-if (hasip4) {
-*pkt_info |= E1000_ADVRXD_PKT_IP4;
-}
-
-if (hasip6) {
-*pkt_info |= E1000_ADVRXD_PKT_IP6;
-}
-
-switch (l4hdr_proto) {
-case ETH_L4_HDR_PROTO_TCP:
-*pkt_info |= E1000_ADVRXD_PKT_TCP;
-break;
-
-case ETH_L4_HDR_PROTO_UDP:
-*pkt_info |= E1000_ADVRXD_PKT_UDP;
-break;
-
-case ETH_L4_HDR_PROTO_SCTP:
-*pkt_info |= E1000_ADVRXD_PKT_SCTP;
-break;
-
-default:
-break;
-}
-}
-}
-
-if (hdr_info) {
-*hdr_info = 0;
-}
-
-if (ts) {
-*status_flags |= BIT(16);
-}
-
  /* RX CSO information */
  if (hasip6 && (core->mac[RFCTL] & E1000_RFCTL_IPV6_XSUM_DIS)) {
  trace_e1000e_rx_metadata_ipv6_sum_disabled();
@@ -1423,43 +1365,108 @@ func_exit:
  static inline void
  igb_write_lgcy_rx_descr(IGBCore *core, struct e1000_rx_desc *desc,
  struct NetRxPkt *pkt,
-const E1000E_RSSInfo *rss_info, uint16_t etqf, bool ts,
+const E1000E_RSSInfo *rss_info,
  uint16_t length)
  {
-uint32_t status_flags, rss;
-uint16_t ip_id;
+uint32_t status_flags;
  
  assert(!rss_info->enabled);

+
+memset(desc, 0, sizeof(*desc));
  desc->length = cpu_to_le16(length);
-desc->csum = 0;
+igb_build_rx_metadata_common(core, pkt, pkt != NULL,
+ &status_flags,
+ &desc->special);
  
-igb_build_rx_metadata(core, pkt, pkt != NULL,

-  rss_info, etqf, ts,
-  NULL, NULL, &rss,
-  &status_flags, &ip_id,
-  &desc->special);
  desc->errors = (uint8_t) (le32_to_cpu(status_flags) >> 24);
  desc->status = (uint8_t) le32_to_cpu(status_flags);
  }
  
+static uint16_t

+igb_rx_desc_get_packet_type(IGBCore *core, struct NetRxPkt *pkt, uint16_t etqf)
+{
+uint16_t pkt_type;
+bool hasip4, hasip6;
+EthL4HdrProto l4hdr_proto;
+
+net_rx_pkt_get_protocols(pkt, &hasip4, &hasip6, &l4hdr_proto);
+
+if (hasip6 && !(core->mac[RFCTL] & E1000_RFCTL_IPV6_DIS)) {
+pkt_type = E1000_ADVRXD_PKT_IP6;
+} else if (hasip4) {
+pkt_type = E1000_ADVRXD_PKT_IP4;
+} else {
+pkt_type = 0;
+}
+
+if (etqf < 8) {


When ETQF is applied, E1000_ADVRXD_PKT_IP6 and E1000_ADVRXD_PKT_IP4 
shouldn't be set.
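
One way to read this comment is that the ETQF case and the IP classification should be mutually exclusive. A minimal sketch of that ordering follows. The names are hypothetical, the bit positions chosen for the IPv4 and IPv6 flags are assumptions for illustration only, and the BIT(11) | (etqf << 4) encoding mirrors the code removed earlier in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_BIT(n)     (1u << (n))
/* Bit positions are assumptions for illustration, not taken from the patch. */
#define DEMO_PKT_IP4    DEMO_BIT(0)
#define DEMO_PKT_IP6    DEMO_BIT(2)
#define DEMO_ETQF_NONE  8   /* etqf >= 8 means no ETQF filter matched */

static uint16_t demo_rx_packet_type(bool hasip4, bool hasip6, uint16_t etqf)
{
    if (etqf < DEMO_ETQF_NONE) {
        /* ETQF matched: report only the filter index, no IP bits. */
        return (uint16_t)(DEMO_BIT(11) | (etqf << 4));
    }
    if (hasip6) {
        return DEMO_PKT_IP6;
    }
    if (hasip4) {
        return DEMO_PKT_IP4;
    }
    return 0;
}
```

Returning early in the ETQF branch makes it impossible for the IP bits to leak into the filter-index encoding.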



+pkt_type |=

Re: [PATCH v8 0/7] igb: packet-split descriptors support

2023-05-22 Thread Akihiko Odaki


On 2023/05/18 23:04, Tomasz Dzieciol wrote:

Based-on: <20230423041833.5302-1-akihiko.od...@daynix.com>
("[PATCH v3 00/47] igb: Fix for DPDK")

Purposes of this series of patches:
* introduce packet-split RX descriptors support. This feature is used by Linux
   VF driver for MTU values from 2048.
* refactor RX descriptor handling for introduction of packet-split RX
   descriptors support
* fix descriptors flags handling

Tomasz Dzieciol (7):
   igb: remove TCP ACK detection
   igb: rename E1000E_RingInfo_st
   igb: RX descriptors guest writting refactoring
   igb: RX payload guest writting refactoring
   igb: add IPv6 extended headers traffic detection
   igb: packet-split descriptors support
   e1000e: rename e1000e_ba_state and e1000e_write_hdr_to_rx_buffers

  hw/net/e1000e_core.c |  78 ++--
  hw/net/igb_core.c| 746 ---
  hw/net/igb_regs.h|  20 +-
  hw/net/trace-events  |   6 +-
  tests/qtest/libqos/igb.c |   5 +
  5 files changed, 604 insertions(+), 251 deletions(-)



Hi,

I finally decided to test your patches, and found some problems in them
*and* in my series that this is based on. Please rebase your series onto
"[PATCH v5 00/48] igb: Fix for DPDK", which I have just sent, and review
the comments I'll give on the patches.


Regards,
Akihiko Odaki

[PATCH v5 11/48] tests/avocado: Remove unused imports

2023-05-22 Thread Akihiko Odaki

Signed-off-by: Akihiko Odaki 
---
 tests/avocado/netdev-ethtool.py | 1 -
 1 file changed, 1 deletion(-)

diff --git a/tests/avocado/netdev-ethtool.py b/tests/avocado/netdev-ethtool.py
index f7e9464184..8de118e313 100644
--- a/tests/avocado/netdev-ethtool.py
+++ b/tests/avocado/netdev-ethtool.py
@@ -7,7 +7,6 @@
 
 from avocado import skip
 from avocado_qemu import QemuSystemTest
-from avocado_qemu import exec_command, exec_command_and_wait_for_pattern
 from avocado_qemu import wait_for_console_pattern
 
 class NetDevEthtool(QemuSystemTest):
-- 
2.40.1

[PATCH v5 24/48] igb: Add more definitions for Tx descriptor

2023-05-22 Thread Akihiko Odaki

Signed-off-by: Akihiko Odaki 
Reviewed-by: Sriram Yagnaraman 
---
 hw/net/igb_regs.h | 32 +++-
 hw/net/igb_core.c |  4 ++--
 2 files changed, 29 insertions(+), 7 deletions(-)

diff --git a/hw/net/igb_regs.h b/hw/net/igb_regs.h
index 21ee9a3b2d..eb995d8b2e 100644
--- a/hw/net/igb_regs.h
+++ b/hw/net/igb_regs.h
@@ -42,11 +42,6 @@ union e1000_adv_tx_desc {
 } wb;
 };
 
-#define E1000_ADVTXD_DTYP_CTXT  0x0020 /* Advanced Context Descriptor */
-#define E1000_ADVTXD_DTYP_DATA  0x0030 /* Advanced Data Descriptor */
-#define E1000_ADVTXD_DCMD_DEXT  0x2000 /* Descriptor Extension (1=Adv) */
-#define E1000_ADVTXD_DCMD_TSE   0x8000 /* TCP/UDP Segmentation Enable */
-
 #define E1000_ADVTXD_POTS_IXSM  0x0100 /* Insert TCP/UDP Checksum */
 #define E1000_ADVTXD_POTS_TXSM  0x0200 /* Insert TCP/UDP Checksum */
 
@@ -151,6 +146,10 @@ union e1000_adv_rx_desc {
 #define IGB_82576_VF_DEV_ID0x10CA
 #define IGB_I350_VF_DEV_ID 0x1520
 
+/* VLAN info */
+#define IGB_TX_FLAGS_VLAN_MASK 0x
+#define IGB_TX_FLAGS_VLAN_SHIFT16
+
 /* from igb/e1000_82575.h */
 
 #define E1000_MRQC_ENABLE_RSS_MQ0x0002
@@ -160,6 +159,29 @@ union e1000_adv_rx_desc {
 #define E1000_MRQC_RSS_FIELD_IPV6_UDP   0x0080
 #define E1000_MRQC_RSS_FIELD_IPV6_UDP_EX0x0100
 
+/* Adv Transmit Descriptor Config Masks */
+#define E1000_ADVTXD_MAC_TSTAMP   0x0008 /* IEEE1588 Timestamp packet */
+#define E1000_ADVTXD_DTYP_CTXT0x0020 /* Advanced Context Descriptor */
+#define E1000_ADVTXD_DTYP_DATA0x0030 /* Advanced Data Descriptor */
+#define E1000_ADVTXD_DCMD_EOP 0x0100 /* End of Packet */
+#define E1000_ADVTXD_DCMD_IFCS0x0200 /* Insert FCS (Ethernet CRC) */
+#define E1000_ADVTXD_DCMD_RS  0x0800 /* Report Status */
+#define E1000_ADVTXD_DCMD_DEXT0x2000 /* Descriptor extension (1=Adv) */
+#define E1000_ADVTXD_DCMD_VLE 0x4000 /* VLAN pkt enable */
+#define E1000_ADVTXD_DCMD_TSE 0x8000 /* TCP Seg enable */
+#define E1000_ADVTXD_PAYLEN_SHIFT14 /* Adv desc PAYLEN shift */
+
+#define E1000_ADVTXD_MACLEN_SHIFT9  /* Adv ctxt desc mac len shift */
+#define E1000_ADVTXD_TUCMD_L4T_UDP 0x  /* L4 Packet TYPE of UDP */
+#define E1000_ADVTXD_TUCMD_IPV40x0400  /* IP Packet Type: 1=IPv4 */
+#define E1000_ADVTXD_TUCMD_L4T_TCP 0x0800  /* L4 Packet TYPE of TCP */
+#define E1000_ADVTXD_TUCMD_L4T_SCTP 0x1000 /* L4 packet TYPE of SCTP */
+/* IPSec Encrypt Enable for ESP */
+#define E1000_ADVTXD_L4LEN_SHIFT 8  /* Adv ctxt L4LEN shift */
+#define E1000_ADVTXD_MSS_SHIFT  16  /* Adv ctxt MSS shift */
+/* Adv ctxt IPSec SA IDX mask */
+/* Adv ctxt IPSec ESP len mask */
+
 /* Additional Transmit Descriptor Control definitions */
 #define E1000_TXDCTL_QUEUE_ENABLE  0x0200 /* Enable specific Tx Queue */
 
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 162ef26789..56a53872cf 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -418,7 +418,7 @@ igb_setup_tx_offloads(IGBCore *core, struct igb_tx *tx)
 {
 if (tx->first_cmd_type_len & E1000_ADVTXD_DCMD_TSE) {
 uint32_t idx = (tx->first_olinfo_status >> 4) & 1;
-uint32_t mss = tx->ctx[idx].mss_l4len_idx >> 16;
+uint32_t mss = tx->ctx[idx].mss_l4len_idx >> E1000_ADVTXD_MSS_SHIFT;
 if (!net_tx_pkt_build_vheader(tx->tx_pkt, true, true, mss)) {
 return false;
 }
@@ -612,7 +612,7 @@ igb_process_tx_desc(IGBCore *core,
 if (!tx->skip_cp && net_tx_pkt_parse(tx->tx_pkt)) {
 idx = (tx->first_olinfo_status >> 4) & 1;
 igb_tx_insert_vlan(core, queue_index, tx,
-tx->ctx[idx].vlan_macip_lens >> 16,
+tx->ctx[idx].vlan_macip_lens >> IGB_TX_FLAGS_VLAN_SHIFT,
 !!(tx->first_cmd_type_len & E1000_TXD_CMD_VLE));
 
 if (igb_tx_pkt_send(core, tx, queue_index)) {
-- 
2.40.1

[PATCH v5 40/48] igb: Implement igb-specific oversize check

2023-05-22 Thread Akihiko Odaki

igb has a configurable size limit for LPE, and uses different limits
depending on whether the packet is treated as a VLAN packet.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Sriram Yagnaraman 
---
 hw/net/igb_core.c | 36 +---
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 5345f57031..c04ec01117 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -980,16 +980,13 @@ igb_rx_l4_cso_enabled(IGBCore *core)
 return !!(core->mac[RXCSUM] & E1000_RXCSUM_TUOFLD);
 }
 
-static bool
-igb_rx_is_oversized(IGBCore *core, uint16_t qn, size_t size)
+static bool igb_rx_is_oversized(IGBCore *core, const struct eth_header *ehdr,
+size_t size, size_t vlan_num,
+bool lpe, uint16_t rlpml)
 {
-uint16_t pool = qn % IGB_NUM_VM_POOLS;
-bool lpe = !!(core->mac[VMOLR0 + pool] & E1000_VMOLR_LPE);
-int max_ethernet_lpe_size =
-core->mac[VMOLR0 + pool] & E1000_VMOLR_RLPML_MASK;
-int max_ethernet_vlan_size = 1522;
-
-return size > (lpe ? max_ethernet_lpe_size : max_ethernet_vlan_size);
+size_t vlan_header_size = sizeof(struct vlan_header) * vlan_num;
+size_t header_size = sizeof(struct eth_header) + vlan_header_size;
+return lpe ? size + ETH_FCS_LEN > rlpml : size > header_size + ETH_MTU;
 }
 
 static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
@@ -1002,6 +999,8 @@ static uint16_t igb_receive_assign(IGBCore *core, const 
L2Header *l2_header,
 uint16_t queues = 0;
 uint16_t oversized = 0;
 size_t vlan_num = 0;
+bool lpe;
+uint16_t rlpml;
 int i;
 
 memset(rss_info, 0, sizeof(E1000E_RSSInfo));
@@ -1021,6 +1020,14 @@ static uint16_t igb_receive_assign(IGBCore *core, const 
L2Header *l2_header,
 }
 }
 
+lpe = !!(core->mac[RCTL] & E1000_RCTL_LPE);
+rlpml = core->mac[RLPML];
+if (!(core->mac[RCTL] & E1000_RCTL_SBP) &&
+igb_rx_is_oversized(core, ehdr, size, vlan_num, lpe, rlpml)) {
+trace_e1000x_rx_oversized(size);
+return queues;
+}
+
 if (vlan_num &&
 !e1000x_rx_vlan_filter(core->mac, l2_header->vlan + vlan_num - 1)) {
 return queues;
@@ -1106,7 +1113,11 @@ static uint16_t igb_receive_assign(IGBCore *core, const 
L2Header *l2_header,
 queues &= core->mac[VFRE];
 if (queues) {
 for (i = 0; i < IGB_NUM_VM_POOLS; i++) {
-if ((queues & BIT(i)) && igb_rx_is_oversized(core, i, size)) {
+lpe = !!(core->mac[VMOLR0 + i] & E1000_VMOLR_LPE);
+rlpml = core->mac[VMOLR0 + i] & E1000_VMOLR_RLPML_MASK;
+if ((queues & BIT(i)) &&
+igb_rx_is_oversized(core, ehdr, size, vlan_num,
+lpe, rlpml)) {
 oversized |= BIT(i);
 }
 }
@@ -1662,11 +1673,6 @@ igb_receive_internal(IGBCore *core, const struct iovec 
*iov, int iovcnt,
 iov_to_buf(iov, iovcnt, iov_ofs, &buf, sizeof(buf.l2_header));
 }
 
-/* Discard oversized packets if !LPE and !SBP. */
-if (e1000x_is_oversized(core->mac, size)) {
-return orig_size;
-}
-
 net_rx_pkt_set_packet_type(core->rx_pkt,
 get_eth_packet_type(&buf.l2_header.eth));
 net_rx_pkt_set_protocols(core->rx_pkt, iov, iovcnt, iov_ofs);
-- 
2.40.1
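
The new igb_rx_is_oversized() logic can be exercised on its own with the sketch below. It is a standalone approximation, not the QEMU function: the DEMO_* constants are the standard Ethernet sizes written out as assumptions in place of QEMU's sizeof(struct eth_header), sizeof(struct vlan_header), ETH_MTU and ETH_FCS_LEN, and the unused core and ehdr parameters are dropped.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Standard Ethernet sizes, stated as assumptions in place of QEMU's headers. */
#define DEMO_ETH_HLEN     14u   /* destination + source + EtherType */
#define DEMO_VLAN_HLEN    4u    /* one 802.1Q tag */
#define DEMO_ETH_MTU      1500u
#define DEMO_ETH_FCS_LEN  4u

static bool demo_rx_is_oversized(size_t size, size_t vlan_num,
                                 bool lpe, uint16_t rlpml)
{
    size_t header_size = DEMO_ETH_HLEN + DEMO_VLAN_HLEN * vlan_num;

    if (lpe) {
        /* Long packets enabled: compare frame plus FCS with the RLPML limit. */
        return size + DEMO_ETH_FCS_LEN > rlpml;
    }
    /* Otherwise allow the standard MTU plus however many VLAN tags were seen. */
    return size > header_size + DEMO_ETH_MTU;
}
```

With long packets disabled, an untagged frame of 1514 bytes passes and one of 1515 bytes is rejected; each VLAN tag raises that threshold by four bytes.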

[PATCH v5 23/48] vmxnet3: Reset packet state after emptying Tx queue

2023-05-22 Thread Akihiko Odaki

The Tx packet state was kept after the transmit queue is emptied, but
this behavior is unreliable as the state can be reset any time a
migration happens.

Always reset the Tx packet state after the queue is emptied.

Signed-off-by: Akihiko Odaki 
---
 hw/net/vmxnet3.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
index 05f41b6dfa..18b9edfdb2 100644
--- a/hw/net/vmxnet3.c
+++ b/hw/net/vmxnet3.c
@@ -681,6 +681,8 @@ static void vmxnet3_process_tx_queue(VMXNET3State *s, int 
qidx)
  net_tx_pkt_unmap_frag_pci, PCI_DEVICE(s));
 }
 }
+
+net_tx_pkt_reset(s->tx_pkt, net_tx_pkt_unmap_frag_pci, PCI_DEVICE(s));
 }
 
 static inline void
@@ -1159,7 +1161,6 @@ static void vmxnet3_deactivate_device(VMXNET3State *s)
 {
 if (s->device_active) {
 VMW_CBPRN("Deactivating vmxnet3...");
-net_tx_pkt_reset(s->tx_pkt, net_tx_pkt_unmap_frag_pci, PCI_DEVICE(s));
 net_tx_pkt_uninit(s->tx_pkt);
 net_rx_pkt_uninit(s->rx_pkt);
 s->device_active = false;
-- 
2.40.1

[PATCH v5 43/48] e1000e: Notify only new interrupts

2023-05-22 Thread Akihiko Odaki

In MSI-X mode, if there are interrupts already notified but not cleared
and a new interrupt arrives, e1000e incorrectly notifies the notified
ones again along with the new one.

To fix this issue, replace e1000e_update_interrupt_state() with
two new functions: e1000e_raise_interrupts() and
e1000e_lower_interrupts(). These functions not only raise or lower
interrupts, but also perform the register writes that update the
interrupt state. Before performing a register write, each function
records the interrupts already raised, and compares them with the
interrupts raised after the register write to determine the interrupts
to notify.

The introduction of these functions made obsolete the tracepoints that
assume the caller of e1000e_update_interrupt_state() performs register
writes. These tracepoints are now removed, and alternative ones are
added to the new functions.
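
The before/after comparison described above can be sketched as
follows. This is a minimal illustration, not the code in this patch:
struct toy_core and toy_raise_interrupts() are hypothetical names, and
only ICR and IMS are modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified device state: only the two registers involved. */
struct toy_core {
    uint32_t icr; /* interrupt cause register */
    uint32_t ims; /* interrupt mask set register */
};

/*
 * Set cause bits in ICR and return only the causes that became pending
 * (raised and enabled) because of this write.
 */
static uint32_t toy_raise_interrupts(struct toy_core *core, uint32_t causes)
{
    /* Causes that were already pending before the register write. */
    uint32_t old_causes = core->ims & core->icr;

    core->icr |= causes;

    /* Pending now, but not pending before: these need a notification. */
    return core->ims & core->icr & ~old_causes;
}
```

A cause that was already pending yields no new notification when it is
raised again, and a masked cause yields none at all.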

Signed-off-by: Akihiko Odaki 
---
 hw/net/e1000e_core.h |   2 -
 hw/net/e1000e_core.c | 153 +++
 hw/net/trace-events  |   2 +
 3 files changed, 69 insertions(+), 88 deletions(-)

diff --git a/hw/net/e1000e_core.h b/hw/net/e1000e_core.h
index 213a70530d..66b025cc43 100644
--- a/hw/net/e1000e_core.h
+++ b/hw/net/e1000e_core.h
@@ -111,8 +111,6 @@ struct E1000Core {
 PCIDevice *owner;
 void (*owner_start_recv)(PCIDevice *d);
 
-uint32_t msi_causes_pending;
-
 int64_t timadj;
 };
 
diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index d601386992..9f185d099c 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -165,14 +165,14 @@ e1000e_intrmgr_on_throttling_timer(void *opaque)
 
 timer->running = false;
 
-if (msi_enabled(timer->core->owner)) {
-trace_e1000e_irq_msi_notify_postponed();
-/* Clear msi_causes_pending to fire MSI eventually */
-timer->core->msi_causes_pending = 0;
-e1000e_set_interrupt_cause(timer->core, 0);
-} else {
-trace_e1000e_irq_legacy_notify_postponed();
-e1000e_set_interrupt_cause(timer->core, 0);
+if (timer->core->mac[IMS] & timer->core->mac[ICR]) {
+if (msi_enabled(timer->core->owner)) {
+trace_e1000e_irq_msi_notify_postponed();
+msi_notify(timer->core->owner, 0);
+} else {
+trace_e1000e_irq_legacy_notify_postponed();
+e1000e_raise_legacy_irq(timer->core);
+}
 }
 }
 
@@ -366,10 +366,6 @@ static void
 e1000e_intrmgr_fire_all_timers(E1000ECore *core)
 {
 int i;
-uint32_t val = e1000e_intmgr_collect_delayed_causes(core);
-
-trace_e1000e_irq_adding_delayed_causes(val, core->mac[ICR]);
-core->mac[ICR] |= val;
 
 if (core->itr.running) {
 timer_del(core->itr.timer);
@@ -1974,13 +1970,6 @@ 
void(*e1000e_phyreg_writeops[E1000E_PHY_PAGES][E1000E_PHY_PAGE_SIZE])
 }
 };
 
-static inline void
-e1000e_clear_ims_bits(E1000ECore *core, uint32_t bits)
-{
-trace_e1000e_irq_clear_ims(bits, core->mac[IMS], core->mac[IMS] & ~bits);
-core->mac[IMS] &= ~bits;
-}
-
 static inline bool
 e1000e_postpone_interrupt(E1000IntrDelayTimer *timer)
 {
@@ -2038,7 +2027,6 @@ e1000e_msix_notify_one(E1000ECore *core, uint32_t cause, 
uint32_t int_cfg)
 effective_eiac = core->mac[EIAC] & cause;
 
 core->mac[ICR] &= ~effective_eiac;
-core->msi_causes_pending &= ~effective_eiac;
 
 if (!(core->mac[CTRL_EXT] & E1000_CTRL_EXT_IAME)) {
 core->mac[IMS] &= ~effective_eiac;
@@ -2130,33 +2118,17 @@ e1000e_fix_icr_asserted(E1000ECore *core)
 trace_e1000e_irq_fix_icr_asserted(core->mac[ICR]);
 }
 
-static void
-e1000e_send_msi(E1000ECore *core, bool msix)
+static void e1000e_raise_interrupts(E1000ECore *core,
+size_t index, uint32_t causes)
 {
-uint32_t causes = core->mac[ICR] & core->mac[IMS] & ~E1000_ICR_ASSERTED;
-
-core->msi_causes_pending &= causes;
-causes ^= core->msi_causes_pending;
-if (causes == 0) {
-return;
-}
-core->msi_causes_pending |= causes;
+bool is_msix = msix_enabled(core->owner);
+uint32_t old_causes = core->mac[IMS] & core->mac[ICR];
+uint32_t raised_causes;
 
-if (msix) {
-e1000e_msix_notify(core, causes);
-} else {
-if (!e1000e_itr_should_postpone(core)) {
-trace_e1000e_irq_msi_notify(causes);
-msi_notify(core->owner, 0);
-}
-}
-}
+trace_e1000e_irq_set(index << 2,
+ core->mac[index], core->mac[index] | causes);
 
-static void
-e1000e_update_interrupt_state(E1000ECore *core)
-{
-bool interrupts_pending;
-bool is_msix = msix_enabled(core->owner);
+core->mac[index] |= causes;
 
 /* Set ICR[OTHER] for MSI-X */
 if (is_msix) {
@@ -2178,40 +2150,58 @@ e1000e_update_interrupt_state(E1000ECore *core)
  */
 core->mac[ICS] = core->mac[ICR];
 
-interrupts_pending = (core->mac[IMS] & core->mac[ICR]) ? true : false;
-if (!interrupts_pending) {
-core->msi_causes_pending =

[PATCH v5 31/48] net/eth: Always add VLAN tag

2023-05-22 Thread Akihiko Odaki

It is possible to have another VLAN tag even if the packet is already
tagged.
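
As a rough byte-level sketch of what "always add" means on the wire
(not the code in this patch; toy_push_vlan() and the TOY_* constants
are made up for illustration), a new tag is inserted right after the
MAC addresses and everything behind it, including any existing tag,
moves back by four bytes:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAC_ADDRS_LEN 12 /* destination MAC (6) + source MAC (6) */
#define TOY_VLAN_HLEN 4      /* TPID (2) + TCI (2) */

/*
 * Push one VLAN tag in front of the current EtherType.
 * hdr must have room for *len + TOY_VLAN_HLEN bytes.
 * Multi-byte fields are stored big-endian, byte by byte.
 */
static void toy_push_vlan(uint8_t *hdr, size_t *len,
                          uint16_t tci, uint16_t tpid)
{
    uint8_t *tag = hdr + TOY_MAC_ADDRS_LEN;

    /* Shift the old EtherType and any existing tags by four bytes. */
    memmove(tag + TOY_VLAN_HLEN, tag, *len - TOY_MAC_ADDRS_LEN);

    tag[0] = (uint8_t)(tpid >> 8);
    tag[1] = (uint8_t)(tpid & 0xff);
    tag[2] = (uint8_t)(tci >> 8);
    tag[3] = (uint8_t)(tci & 0xff);

    *len += TOY_VLAN_HLEN;
}
```

Calling the function twice on the same header produces a double-tagged
frame with the most recently pushed tag outermost.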

Signed-off-by: Akihiko Odaki 
---
 include/net/eth.h   |  4 ++--
 hw/net/net_tx_pkt.c | 16 +++-
 net/eth.c   | 22 ++
 3 files changed, 15 insertions(+), 27 deletions(-)

diff --git a/include/net/eth.h b/include/net/eth.h
index 95ff24d6b8..048e434685 100644
--- a/include/net/eth.h
+++ b/include/net/eth.h
@@ -353,8 +353,8 @@ eth_strip_vlan_ex(const struct iovec *iov, int iovcnt, 
size_t iovoff,
 uint16_t
 eth_get_l3_proto(const struct iovec *l2hdr_iov, int iovcnt, size_t l2hdr_len);
 
-void eth_setup_vlan_headers(struct eth_header *ehdr, uint16_t vlan_tag,
-uint16_t vlan_ethtype, bool *is_new);
+void eth_setup_vlan_headers(struct eth_header *ehdr, size_t *ehdr_size,
+uint16_t vlan_tag, uint16_t vlan_ethtype);
 
 
 uint8_t eth_get_gso_type(uint16_t l3_proto, uint8_t *l3_hdr, uint8_t l4proto);
diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
index ce6b102391..af8f77a3f0 100644
--- a/hw/net/net_tx_pkt.c
+++ b/hw/net/net_tx_pkt.c
@@ -40,7 +40,10 @@ struct NetTxPkt {
 
 struct iovec *vec;
 
-uint8_t l2_hdr[ETH_MAX_L2_HDR_LEN];
+struct {
+struct eth_header eth;
+struct vlan_header vlan[3];
+} l2_hdr;
 union {
 struct ip_header ip;
 struct ip6_header ip6;
@@ -365,18 +368,13 @@ bool net_tx_pkt_build_vheader(struct NetTxPkt *pkt, bool 
tso_enable,
 void net_tx_pkt_setup_vlan_header_ex(struct NetTxPkt *pkt,
 uint16_t vlan, uint16_t vlan_ethtype)
 {
-bool is_new;
 assert(pkt);
 
 eth_setup_vlan_headers(pkt->vec[NET_TX_PKT_L2HDR_FRAG].iov_base,
-vlan, vlan_ethtype, &is_new);
+   &pkt->vec[NET_TX_PKT_L2HDR_FRAG].iov_len,
+   vlan, vlan_ethtype);
 
-/* update l2hdrlen */
-if (is_new) {
-pkt->hdr_len += sizeof(struct vlan_header);
-pkt->vec[NET_TX_PKT_L2HDR_FRAG].iov_len +=
-sizeof(struct vlan_header);
-}
+pkt->hdr_len += sizeof(struct vlan_header);
 }
 
 bool net_tx_pkt_add_raw_fragment(struct NetTxPkt *pkt, void *base, size_t len)
diff --git a/net/eth.c b/net/eth.c
index f7ffbda600..5307978486 100644
--- a/net/eth.c
+++ b/net/eth.c
@@ -21,26 +21,16 @@
 #include "net/checksum.h"
 #include "net/tap.h"
 
-void eth_setup_vlan_headers(struct eth_header *ehdr, uint16_t vlan_tag,
-uint16_t vlan_ethtype, bool *is_new)
+void eth_setup_vlan_headers(struct eth_header *ehdr, size_t *ehdr_size,
+uint16_t vlan_tag, uint16_t vlan_ethtype)
 {
 struct vlan_header *vhdr = PKT_GET_VLAN_HDR(ehdr);
 
-switch (be16_to_cpu(ehdr->h_proto)) {
-case ETH_P_VLAN:
-case ETH_P_DVLAN:
-/* vlan hdr exists */
-*is_new = false;
-break;
-
-default:
-/* No VLAN header, put a new one */
-vhdr->h_proto = ehdr->h_proto;
-ehdr->h_proto = cpu_to_be16(vlan_ethtype);
-*is_new = true;
-break;
-}
+memmove(vhdr + 1, vhdr, *ehdr_size - ETH_HLEN);
 vhdr->h_tci = cpu_to_be16(vlan_tag);
+vhdr->h_proto = ehdr->h_proto;
+ehdr->h_proto = cpu_to_be16(vlan_ethtype);
+*ehdr_size += sizeof(*vhdr);
 }
 
 uint8_t
-- 
2.40.1

[PATCH v5 33/48] tests/qtest/libqos/igb: Set GPIE.Multiple_MSIX

2023-05-22 Thread Akihiko Odaki

GPIE.Multiple_MSIX is not set by default, and needs to be set to get
interrupts from multiple MSI-X vectors.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Sriram Yagnaraman 
---
 tests/qtest/libqos/igb.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/qtest/libqos/igb.c b/tests/qtest/libqos/igb.c
index 12fb531bf0..a603468beb 100644
--- a/tests/qtest/libqos/igb.c
+++ b/tests/qtest/libqos/igb.c
@@ -114,6 +114,7 @@ static void igb_pci_start_hw(QOSGraphObject *obj)
 e1000e_macreg_write(>e1000e, E1000_RCTL, E1000_RCTL_EN);
 
 /* Enable all interrupts */
+e1000e_macreg_write(>e1000e, E1000_GPIE,  E1000_GPIE_MSIX_MODE);
 e1000e_macreg_write(>e1000e, E1000_IMS,  0x);
 e1000e_macreg_write(>e1000e, E1000_EIMS, 0x);
 
-- 
2.40.1

[PATCH v5 18/48] e1000e: Always log status after building rx metadata

2023-05-22 Thread Akihiko Odaki

Without this change, the status flags may not be traced e.g. if checksum
offloading is disabled.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/net/e1000e_core.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 38d465a203..6a213c0224 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -1244,9 +1244,8 @@ e1000e_build_rx_metadata(E1000ECore *core,
 trace_e1000e_rx_metadata_l4_cso_disabled();
 }
 
-trace_e1000e_rx_metadata_status_flags(*status_flags);
-
 func_exit:
+trace_e1000e_rx_metadata_status_flags(*status_flags);
 *status_flags = cpu_to_le32(*status_flags);
 }
 
-- 
2.40.1

[PATCH v5 37/48] igb: Implement Tx SCTP CSO

2023-05-22 Thread Akihiko Odaki

Signed-off-by: Akihiko Odaki 
Reviewed-by: Sriram Yagnaraman 
---
 hw/net/net_tx_pkt.h |  8 
 hw/net/igb_core.c   | 12 +++-
 hw/net/net_tx_pkt.c | 18 ++
 3 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/hw/net/net_tx_pkt.h b/hw/net/net_tx_pkt.h
index 4d7233e975..0a716e74a5 100644
--- a/hw/net/net_tx_pkt.h
+++ b/hw/net/net_tx_pkt.h
@@ -116,6 +116,14 @@ void net_tx_pkt_update_ip_checksums(struct NetTxPkt *pkt);
  */
 void net_tx_pkt_update_ip_hdr_checksum(struct NetTxPkt *pkt);
 
+/**
+ * Calculate the SCTP checksum.
+ *
+ * @pkt:packet
+ *
+ */
+bool net_tx_pkt_update_sctp_checksum(struct NetTxPkt *pkt);
+
 /**
  * get length of all populated data.
  *
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 95d46d6e6d..5eacf1cd8c 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -440,8 +440,9 @@ igb_tx_insert_vlan(IGBCore *core, uint16_t qn, struct 
igb_tx *tx,
 static bool
 igb_setup_tx_offloads(IGBCore *core, struct igb_tx *tx)
 {
+uint32_t idx = (tx->first_olinfo_status >> 4) & 1;
+
 if (tx->first_cmd_type_len & E1000_ADVTXD_DCMD_TSE) {
-uint32_t idx = (tx->first_olinfo_status >> 4) & 1;
 uint32_t mss = tx->ctx[idx].mss_l4len_idx >> E1000_ADVTXD_MSS_SHIFT;
 if (!net_tx_pkt_build_vheader(tx->tx_pkt, true, true, mss)) {
 return false;
@@ -452,10 +453,11 @@ igb_setup_tx_offloads(IGBCore *core, struct igb_tx *tx)
 return true;
 }
 
-if (tx->first_olinfo_status & E1000_ADVTXD_POTS_TXSM) {
-if (!net_tx_pkt_build_vheader(tx->tx_pkt, false, true, 0)) {
-return false;
-}
+if ((tx->first_olinfo_status & E1000_ADVTXD_POTS_TXSM) &&
+!((tx->ctx[idx].type_tucmd_mlhl & E1000_ADVTXD_TUCMD_L4T_SCTP) ?
+  net_tx_pkt_update_sctp_checksum(tx->tx_pkt) :
+  net_tx_pkt_build_vheader(tx->tx_pkt, false, true, 0))) {
+return false;
 }
 
 if (tx->first_olinfo_status & E1000_ADVTXD_POTS_IXSM) {
diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
index af8f77a3f0..2e5f58b3c9 100644
--- a/hw/net/net_tx_pkt.c
+++ b/hw/net/net_tx_pkt.c
@@ -16,6 +16,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/crc32c.h"
 #include "net/eth.h"
 #include "net/checksum.h"
 #include "net/tap.h"
@@ -135,6 +136,23 @@ void net_tx_pkt_update_ip_checksums(struct NetTxPkt *pkt)
  pkt->virt_hdr.csum_offset, &csum, sizeof(csum));
 }
 
+bool net_tx_pkt_update_sctp_checksum(struct NetTxPkt *pkt)
+{
+uint32_t csum = 0;
+struct iovec *pl_start_frag = pkt->vec + NET_TX_PKT_PL_START_FRAG;
+
+if (iov_from_buf(pl_start_frag, pkt->payload_frags, 8, &csum, sizeof(csum)) < sizeof(csum)) {
+return false;
+}
+
+csum = cpu_to_le32(iov_crc32c(0x, pl_start_frag, 
pkt->payload_frags));
+if (iov_from_buf(pl_start_frag, pkt->payload_frags, 8, &csum, sizeof(csum)) < sizeof(csum)) {
+return false;
+}
+
+return true;
+}
+
 static void net_tx_pkt_calculate_hdr_len(struct NetTxPkt *pkt)
 {
 pkt->hdr_len = pkt->vec[NET_TX_PKT_L2HDR_FRAG].iov_len +
-- 
2.40.1

[PATCH v5 05/48] igb: Do not require CTRL.VME for tx VLAN tagging

2023-05-22 Thread Akihiko Odaki

While the datasheet of e1000e says it checks CTRL.VME for tx VLAN
tagging, igb's datasheet has no such statements. It also says for
"CTRL.VLE":
> This register only affects the VLAN Strip in Rx it does not have any
> influence in the Tx path in the 82576.
(Appendix A. Changes from the 82575)

There is no "CTRL.VLE", so it is most likely a mistake for CTRL.VME.

Fixes: fba7c3b788 ("igb: respect VMVIR and VMOLR for VLAN")
Signed-off-by: Akihiko Odaki 
Reviewed-by: Sriram Yagnaraman 
---
 hw/net/igb_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index dbd1192a8e..96a118b6c1 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -402,7 +402,7 @@ igb_tx_insert_vlan(IGBCore *core, uint16_t qn, struct 
igb_tx *tx,
 }
 }
 
-if (insert_vlan && e1000x_vlan_enabled(core->mac)) {
+if (insert_vlan) {
 net_tx_pkt_setup_vlan_header_ex(tx->tx_pkt, vlan,
 core->mac[VET] & 0x);
 }
-- 
2.40.1

[PATCH v5 42/48] igb: Implement Tx timestamp

2023-05-22 Thread Akihiko Odaki

Signed-off-by: Akihiko Odaki 
Reviewed-by: Sriram Yagnaraman 
---
 hw/net/igb_regs.h | 3 +++
 hw/net/igb_core.c | 7 +++
 2 files changed, 10 insertions(+)

diff --git a/hw/net/igb_regs.h b/hw/net/igb_regs.h
index 894705599d..82ff195dfc 100644
--- a/hw/net/igb_regs.h
+++ b/hw/net/igb_regs.h
@@ -322,6 +322,9 @@ union e1000_adv_rx_desc {
 /* E1000_EITR_CNT_IGNR is only for 82576 and newer */
 #define E1000_EITR_CNT_IGNR 0x8000 /* Don't reset counters on write */
 
+#define E1000_TSYNCTXCTL_VALID0x0001 /* tx timestamp valid */
+#define E1000_TSYNCTXCTL_ENABLED  0x0010 /* enable tx timestampping */
+
 /* PCI Express Control */
 #define E1000_GCR_CMPL_TMOUT_MASK   0xF000
 #define E1000_GCR_CMPL_TMOUT_10ms   0x1000
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 43d23c7621..49d1917926 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -659,6 +659,13 @@ igb_process_tx_desc(IGBCore *core,
 tx->ctx[idx].vlan_macip_lens >> IGB_TX_FLAGS_VLAN_SHIFT,
 !!(tx->first_cmd_type_len & E1000_TXD_CMD_VLE));
 
+if ((tx->first_cmd_type_len & E1000_ADVTXD_MAC_TSTAMP) &&
+(core->mac[TSYNCTXCTL] & E1000_TSYNCTXCTL_ENABLED) &&
+!(core->mac[TSYNCTXCTL] & E1000_TSYNCTXCTL_VALID)) {
+core->mac[TSYNCTXCTL] |= E1000_TSYNCTXCTL_VALID;
+e1000x_timestamp(core->mac, core->timadj, TXSTMPL, TXSTMPH);
+}
+
 if (igb_tx_pkt_send(core, tx, queue_index)) {
 igb_on_tx_done_update_stats(core, tx->tx_pkt, queue_index);
 }
-- 
2.40.1

[PATCH v5 45/48] igb: Clear-on-read ICR when ICR.INTA is set

2023-05-22 Thread Akihiko Odaki

For GPIE.NSICR, Section 7.3.2.1.2 says:
> ICR bits are cleared on register read. If GPIE.NSICR = 0b, then the
> clear on read occurs only if no bit is set in the IMS or at least one
> bit is set in the IMS and there is a true interrupt as reflected in
> ICR.INTA.

e1000e behaves similarly, though it checks CTRL_EXT.IAME, which does
not exist on igb.
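
The quoted rule can be read as a small predicate. The sketch below
models only that sentence, not the full igb_mac_icr_read() logic;
toy_icr_clears_on_read() is a hypothetical name and the bit position
used for ICR.INTA is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_ICR_INTA UINT32_C(0x80000000) /* assumed position of ICR.INTA */

/* Return true if reading ICR should clear it, per the quoted sentence. */
static bool toy_icr_clears_on_read(bool nsicr, uint32_t ims, uint32_t icr)
{
    if (nsicr) {
        return true; /* GPIE.NSICR = 1b: always clear on read */
    }
    if (ims == 0) {
        return true; /* no bit is set in the IMS */
    }
    /* At least one IMS bit is set: clear only on a true interrupt. */
    return (icr & TOY_ICR_INTA) != 0;
}
```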

Suggested-by: Sriram Yagnaraman 
Signed-off-by: Akihiko Odaki 
---
 hw/net/igb_core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 823dde8f28..d00b1caa6a 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -2598,6 +2598,8 @@ igb_mac_icr_read(IGBCore *core, int index)
 } else if (core->mac[IMS] == 0) {
 trace_e1000e_irq_icr_clear_zero_ims();
 igb_lower_interrupts(core, ICR, 0x);
+} else if (core->mac[ICR] & E1000_ICR_INT_ASSERTED) {
+igb_lower_interrupts(core, ICR, 0x);
 } else if (!msix_enabled(core->owner)) {
 trace_e1000e_irq_icr_clear_nonmsix_icr_read();
 igb_lower_interrupts(core, ICR, 0x);
-- 
2.40.1

[PATCH v5 12/48] tests/avocado: Remove test_igb_nomsi_kvm

2023-05-22 Thread Akihiko Odaki

Running the test with KVM is unlikely to find more bugs, so remove
test_igb_nomsi_kvm to save the time spent running it.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Thomas Huth 
Acked-by: Alex Bennée 
---
 tests/avocado/netdev-ethtool.py | 12 +---
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/tests/avocado/netdev-ethtool.py b/tests/avocado/netdev-ethtool.py
index 8de118e313..6da800f62b 100644
--- a/tests/avocado/netdev-ethtool.py
+++ b/tests/avocado/netdev-ethtool.py
@@ -29,7 +29,7 @@ def get_asset(self, name, sha1):
 # URL into a unique one
 return self.fetch_asset(name=name, locations=(url), asset_hash=sha1)
 
-def common_test_code(self, netdev, extra_args=None, kvm=False):
+def common_test_code(self, netdev, extra_args=None):
 
 # This custom kernel has drivers for all the supported network
 # devices we can emulate in QEMU
@@ -57,9 +57,6 @@ def common_test_code(self, netdev, extra_args=None, 
kvm=False):
  '-drive', drive,
  '-device', netdev)
 
-if kvm:
-self.vm.add_args('-accel', 'kvm')
-
 self.vm.set_console(console_index=0)
 self.vm.launch()
 
@@ -86,13 +83,6 @@ def test_igb_nomsi(self):
 """
 self.common_test_code("igb", "pci=nomsi")
 
-def test_igb_nomsi_kvm(self):
-"""
-:avocado: tags=device:igb
-"""
-self.require_accelerator('kvm')
-self.common_test_code("igb", "pci=nomsi", True)
-
 # It seems the other popular cards we model in QEMU currently fail
 # the pattern test with:
 #
-- 
2.40.1

[PATCH v5 27/48] igb: Clear EICR bits for delayed MSI-X interrupts

2023-05-22 Thread Akihiko Odaki

Section 7.3.4.1 says:
> When auto-clear is enabled for an interrupt cause, the EICR bit is
> set when a cause event mapped to this vector occurs. When the EITR
> Counter reaches zero, the MSI-X message is sent on PCIe. Then the
> EICR bit is cleared and enabled to be set by a new cause event
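
A minimal sketch of the auto-clear step, assuming the registers are
plain 32-bit values (toy_msix_sent() is a hypothetical helper, not a
function from this patch):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Called after the MSI-X message for `cause` has been sent:
 * clear the EICR bit only if auto-clear (EIAC) is enabled for it.
 */
static void toy_msix_sent(uint32_t *eicr, uint32_t eiac, unsigned int cause)
{
    uint32_t effective_eiac = eiac & (UINT32_C(1) << cause);

    *eicr &= ~effective_eiac;
}
```

Doing this where the message is actually sent, rather than where it is
first requested, means a delayed message also gets its EICR bit cleared.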

Signed-off-by: Akihiko Odaki 
---
 hw/net/igb_core.c | 21 -
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 20645c4764..edda07e564 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -97,23 +97,31 @@ igb_lower_legacy_irq(IGBCore *core)
 pci_set_irq(core->owner, 0);
 }
 
-static void igb_msix_notify(IGBCore *core, unsigned int vector)
+static void igb_msix_notify(IGBCore *core, unsigned int cause)
 {
 PCIDevice *dev = core->owner;
 uint16_t vfn;
+uint32_t effective_eiac;
+unsigned int vector;
 
-vfn = 8 - (vector + 2) / IGBVF_MSIX_VEC_NUM;
+vfn = 8 - (cause + 2) / IGBVF_MSIX_VEC_NUM;
 if (vfn < pcie_sriov_num_vfs(core->owner)) {
 dev = pcie_sriov_get_vf_at_index(core->owner, vfn);
 assert(dev);
-vector = (vector + 2) % IGBVF_MSIX_VEC_NUM;
-} else if (vector >= IGB_MSIX_VEC_NUM) {
+vector = (cause + 2) % IGBVF_MSIX_VEC_NUM;
+} else if (cause >= IGB_MSIX_VEC_NUM) {
 qemu_log_mask(LOG_GUEST_ERROR,
   "igb: Tried to use vector unavailable for PF");
 return;
+} else {
+vector = cause;
 }
 
 msix_notify(dev, vector);
+
+trace_e1000e_irq_icr_clear_eiac(core->mac[EICR], core->mac[EIAC]);
+effective_eiac = core->mac[EIAC] & BIT(cause);
+core->mac[EICR] &= ~effective_eiac;
 }
 
 static inline void
@@ -1834,7 +1842,6 @@ igb_eitr_should_postpone(IGBCore *core, int idx)
 static void igb_send_msix(IGBCore *core)
 {
 uint32_t causes = core->mac[EICR] & core->mac[EIMS];
-uint32_t effective_eiac;
 int vector;
 
 for (vector = 0; vector < IGB_INTR_NUM; ++vector) {
@@ -1842,10 +1849,6 @@ static void igb_send_msix(IGBCore *core)
 
 trace_e1000e_irq_msix_notify_vec(vector);
 igb_msix_notify(core, vector);
-
-trace_e1000e_irq_icr_clear_eiac(core->mac[EICR], core->mac[EIAC]);
-effective_eiac = core->mac[EIAC] & BIT(vector);
-core->mac[EICR] &= ~effective_eiac;
 }
 }
 }
-- 
2.40.1

[PATCH v5 44/48] igb: Notify only new interrupts

2023-05-22 Thread Akihiko Odaki

This follows the corresponding change for e1000e. This fixes:
tests/avocado/netdev-ethtool.py:NetDevEthtool.test_igb

Signed-off-by: Akihiko Odaki 
---
 hw/net/igb_core.c | 201 --
 hw/net/trace-events   |  11 +-
 .../org.centos/stream/8/x86_64/test-avocado   |   1 +
 tests/avocado/netdev-ethtool.py   |   4 -
 4 files changed, 87 insertions(+), 130 deletions(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 49d1917926..823dde8f28 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -94,10 +94,7 @@ static ssize_t
 igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
  bool has_vnet, bool *external_tx);
 
-static inline void
-igb_set_interrupt_cause(IGBCore *core, uint32_t val);
-
-static void igb_update_interrupt_state(IGBCore *core);
+static void igb_raise_interrupts(IGBCore *core, size_t index, uint32_t causes);
 static void igb_reset(IGBCore *core, bool sw);
 
 static inline void
@@ -913,8 +910,8 @@ igb_start_xmit(IGBCore *core, const IGB_TxRing *txr)
 }
 
 if (eic) {
-core->mac[EICR] |= eic;
-igb_set_interrupt_cause(core, E1000_ICR_TXDW);
+igb_raise_interrupts(core, EICR, eic);
+igb_raise_interrupts(core, ICR, E1000_ICR_TXDW);
 }
 
 net_tx_pkt_reset(txr->tx->tx_pkt, net_tx_pkt_unmap_frag_pci, d);
@@ -1686,6 +1683,7 @@ igb_receive_internal(IGBCore *core, const struct iovec 
*iov, int iovcnt,
 {
 uint16_t queues = 0;
 uint32_t causes = 0;
+uint32_t ecauses = 0;
 union {
 L2Header l2_header;
 uint8_t octets[ETH_ZLEN];
@@ -1788,13 +1786,14 @@ igb_receive_internal(IGBCore *core, const struct iovec 
*iov, int iovcnt,
 causes |= E1000_ICS_RXDMT0;
 }
 
-core->mac[EICR] |= igb_rx_wb_eic(core, rxr.i->idx);
+ecauses |= igb_rx_wb_eic(core, rxr.i->idx);
 
 trace_e1000e_rx_written_to_guest(rxr.i->idx);
 }
 
 trace_e1000e_rx_interrupt_set(causes);
-igb_set_interrupt_cause(core, causes);
+igb_raise_interrupts(core, EICR, ecauses);
+igb_raise_interrupts(core, ICR, causes);
 
 return orig_size;
 }
@@ -1854,7 +1853,7 @@ void igb_core_set_link_status(IGBCore *core)
 }
 
 if (core->mac[STATUS] != old_status) {
-igb_set_interrupt_cause(core, E1000_ICR_LSC);
+igb_raise_interrupts(core, ICR, E1000_ICR_LSC);
 }
 }
 
@@ -1934,13 +1933,6 @@ igb_set_rx_control(IGBCore *core, int index, uint32_t 
val)
 }
 }
 
-static inline void
-igb_clear_ims_bits(IGBCore *core, uint32_t bits)
-{
-trace_e1000e_irq_clear_ims(bits, core->mac[IMS], core->mac[IMS] & ~bits);
-core->mac[IMS] &= ~bits;
-}
-
 static inline bool
 igb_postpone_interrupt(IGBIntrDelayTimer *timer)
 {
@@ -1963,9 +1955,8 @@ igb_eitr_should_postpone(IGBCore *core, int idx)
 return igb_postpone_interrupt(&core->eitr[idx]);
 }
 
-static void igb_send_msix(IGBCore *core)
+static void igb_send_msix(IGBCore *core, uint32_t causes)
 {
-uint32_t causes = core->mac[EICR] & core->mac[EIMS];
 int vector;
 
 for (vector = 0; vector < IGB_INTR_NUM; ++vector) {
@@ -1988,124 +1979,116 @@ igb_fix_icr_asserted(IGBCore *core)
 trace_e1000e_irq_fix_icr_asserted(core->mac[ICR]);
 }
 
-static void
-igb_update_interrupt_state(IGBCore *core)
+static void igb_raise_interrupts(IGBCore *core, size_t index, uint32_t causes)
 {
-uint32_t icr;
-uint32_t causes;
+uint32_t old_causes = core->mac[ICR] & core->mac[IMS];
+uint32_t old_ecauses = core->mac[EICR] & core->mac[EIMS];
+uint32_t raised_causes;
+uint32_t raised_ecauses;
 uint32_t int_alloc;
 
-icr = core->mac[ICR] & core->mac[IMS];
+trace_e1000e_irq_set(index << 2,
+ core->mac[index], core->mac[index] | causes);
+
+core->mac[index] |= causes;
 
 if (core->mac[GPIE] & E1000_GPIE_MSIX_MODE) {
-if (icr) {
-causes = 0;
-if (icr & E1000_ICR_DRSTA) {
-int_alloc = core->mac[IVAR_MISC] & 0xff;
-if (int_alloc & E1000_IVAR_VALID) {
-causes |= BIT(int_alloc & 0x1f);
-}
+raised_causes = core->mac[ICR] & core->mac[IMS] & ~old_causes;
+
+if (raised_causes & E1000_ICR_DRSTA) {
+int_alloc = core->mac[IVAR_MISC] & 0xff;
+if (int_alloc & E1000_IVAR_VALID) {
+core->mac[EICR] |= BIT(int_alloc & 0x1f);
 }
-/* Check if other bits (excluding the TCP Timer) are enabled. */
-if (icr & ~E1000_ICR_DRSTA) {
-int_alloc = (core->mac[IVAR_MISC] >> 8) & 0xff;
-if (int_alloc & E1000_IVAR_VALID) {
-causes |= BIT(int_alloc & 0x1f);
-}
-trace_e1000e_irq_add_msi_other(core->mac[EICR]);
+}
+/* Check if other bits (excluding the TCP Timer) are enabled. */
+if (raised_causes & ~E1000_ICR_DRSTA) {
+

[PATCH v5 06/48] igb: Clear IMS bits when committing ICR access

2023-05-22 Thread Akihiko Odaki

The datasheet makes contradictory statements regarding ICR accesses, so
it cannot be relied on to determine the behavior of ICR accesses.
However, e1000e does clear IMS bits on ICR accesses, and Linux also
expects ICR accesses to clear IMS bits according to:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/intel/igb/igb_main.c?h=v6.2#n8048
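
The resulting behavior can be sketched as a pure function of the
registers involved. This is only an illustration under assumptions:
toy_ims_after_icr_access() is a hypothetical name, the ICR.INTA bit
position is assumed, and the interrupt state update that follows in
the real code is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_ICR_INT_ASSERTED UINT32_C(0x80000000) /* assumed ICR.INTA bit */

/* Return the value of IMS after an ICR access has been committed. */
static uint32_t toy_ims_after_icr_access(bool nsicr, uint32_t ims,
                                         uint32_t icr, uint32_t iam)
{
    if (nsicr || (ims != 0 && (icr & TOY_ICR_INT_ASSERTED) != 0)) {
        ims &= ~iam; /* clear the IMS bits selected by IAM */
    }
    return ims;
}
```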

Fixes: 3a977deebe ("Introduce igb device emulation")
Signed-off-by: Akihiko Odaki 
Reviewed-by: Sriram Yagnaraman 
---
 hw/net/igb_core.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 96a118b6c1..eaca5bd2b6 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -2452,16 +2452,16 @@ igb_set_ims(IGBCore *core, int index, uint32_t val)
 static void igb_commit_icr(IGBCore *core)
 {
 /*
- * If GPIE.NSICR = 0, then the copy of IAM to IMS will occur only if at
+ * If GPIE.NSICR = 0, then the clear of IMS will occur only if at
  * least one bit is set in the IMS and there is a true interrupt as
  * reflected in ICR.INTA.
  */
 if ((core->mac[GPIE] & E1000_GPIE_NSICR) ||
 (core->mac[IMS] && (core->mac[ICR] & E1000_ICR_INT_ASSERTED))) {
-igb_set_ims(core, IMS, core->mac[IAM]);
-} else {
-igb_update_interrupt_state(core);
+igb_clear_ims_bits(core, core->mac[IAM]);
 }
+
+igb_update_interrupt_state(core);
 }
 
 static void igb_set_icr(IGBCore *core, int index, uint32_t val)
-- 
2.40.1

[PATCH v5 21/48] igb: Read DCMD.VLE of the first Tx descriptor

2023-05-22 Thread Akihiko Odaki

Section 7.2.2.3 Advanced Transmit Data Descriptor says:
> For frames that spans multiple descriptors, all fields apart from
> DCMD.EOP, DCMD.RS, DCMD.DEXT, DTALEN, Address and DTYP are valid only
> in the first descriptors and are ignored in the subsequent ones.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Sriram Yagnaraman 
---
 hw/net/igb_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index bae51cbb63..162ef26789 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -613,7 +613,7 @@ igb_process_tx_desc(IGBCore *core,
 idx = (tx->first_olinfo_status >> 4) & 1;
 igb_tx_insert_vlan(core, queue_index, tx,
 tx->ctx[idx].vlan_macip_lens >> 16,
-!!(cmd_type_len & E1000_TXD_CMD_VLE));
+!!(tx->first_cmd_type_len & E1000_TXD_CMD_VLE));
 
 if (igb_tx_pkt_send(core, tx, queue_index)) {
 igb_on_tx_done_update_stats(core, tx->tx_pkt, queue_index);
-- 
2.40.1

[PATCH v5 35/48] igb: Use UDP for RSS hash

2023-05-22 Thread Akihiko Odaki

e1000e does not support using UDP for RSS hash, but igb does.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Sriram Yagnaraman 
---
 hw/net/igb_regs.h |  3 +++
 hw/net/igb_core.c | 16 
 2 files changed, 19 insertions(+)

diff --git a/hw/net/igb_regs.h b/hw/net/igb_regs.h
index eb995d8b2e..e6ac26dc0e 100644
--- a/hw/net/igb_regs.h
+++ b/hw/net/igb_regs.h
@@ -659,6 +659,9 @@ union e1000_adv_rx_desc {
 
 #define E1000_RSS_QUEUE(reta, hash) (E1000_RETA_VAL(reta, hash) & 0x0F)
 
+#define E1000_MRQ_RSS_TYPE_IPV4UDP 7
+#define E1000_MRQ_RSS_TYPE_IPV6UDP 8
+
 #define E1000_STATUS_IOV_MODE 0x0004
 
 #define E1000_STATUS_NUM_VFS_SHIFT 14
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 6d55b43fb4..41a2e5bf7b 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -287,6 +287,11 @@ igb_rss_get_hash_type(IGBCore *core, struct NetRxPkt *pkt)
 return E1000_MRQ_RSS_TYPE_IPV4TCP;
 }
 
+if (l4hdr_proto == ETH_L4_HDR_PROTO_UDP &&
+(core->mac[MRQC] & E1000_MRQC_RSS_FIELD_IPV4_UDP)) {
+return E1000_MRQ_RSS_TYPE_IPV4UDP;
+}
+
 if (E1000_MRQC_EN_IPV4(core->mac[MRQC])) {
 return E1000_MRQ_RSS_TYPE_IPV4;
 }
@@ -322,6 +327,11 @@ igb_rss_get_hash_type(IGBCore *core, struct NetRxPkt *pkt)
 return E1000_MRQ_RSS_TYPE_IPV6TCPEX;
 }
 
+if (l4hdr_proto == ETH_L4_HDR_PROTO_UDP &&
+(core->mac[MRQC] & E1000_MRQC_RSS_FIELD_IPV6_UDP)) {
+return E1000_MRQ_RSS_TYPE_IPV6UDP;
+}
+
 if (E1000_MRQC_EN_IPV6EX(core->mac[MRQC])) {
 return E1000_MRQ_RSS_TYPE_IPV6EX;
 }
@@ -360,6 +370,12 @@ igb_rss_calc_hash(IGBCore *core, struct NetRxPkt *pkt, 
E1000E_RSSInfo *info)
 case E1000_MRQ_RSS_TYPE_IPV6EX:
 type = NetPktRssIpV6Ex;
 break;
+case E1000_MRQ_RSS_TYPE_IPV4UDP:
+type = NetPktRssIpV4Udp;
+break;
+case E1000_MRQ_RSS_TYPE_IPV6UDP:
+type = NetPktRssIpV6Udp;
+break;
 default:
 assert(false);
 return 0;
-- 
2.40.1

[PATCH v5 22/48] e1000e: Reset packet state after emptying Tx queue

2023-05-22 Thread Akihiko Odaki

Keeping Tx packet state after the transmit queue is emptied has some
problems:
- The datasheet says the descriptors can be reused after the transmit
  queue is emptied, but the Tx packet state may keep references to them.
- The Tx packet state cannot be migrated, so it can be reset any time a
  migration happens.

Always reset the Tx packet state after the queue is emptied.

Signed-off-by: Akihiko Odaki 
---
 hw/net/e1000e_core.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 6a213c0224..7dce448657 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -959,6 +959,8 @@ e1000e_start_xmit(E1000ECore *core, const E1000E_TxRing 
*txr)
 if (!ide || !e1000e_intrmgr_delay_tx_causes(core, &cause)) {
 e1000e_set_interrupt_cause(core, cause);
 }
+
+net_tx_pkt_reset(txr->tx->tx_pkt, net_tx_pkt_unmap_frag_pci, core->owner);
 }
 
 static bool
@@ -3389,8 +3391,6 @@ e1000e_core_pci_uninit(E1000ECore *core)
 qemu_del_vm_change_state_handler(core->vmstate);
 
 for (i = 0; i < E1000E_NUM_QUEUES; i++) {
-net_tx_pkt_reset(core->tx[i].tx_pkt,
- net_tx_pkt_unmap_frag_pci, core->owner);
 net_tx_pkt_uninit(core->tx[i].tx_pkt);
 }
 
@@ -3515,8 +3515,6 @@ static void e1000e_reset(E1000ECore *core, bool sw)
 e1000x_reset_mac_addr(core->owner_nic, core->mac, core->permanent_mac);
 
 for (i = 0; i < ARRAY_SIZE(core->tx); i++) {
-net_tx_pkt_reset(core->tx[i].tx_pkt,
- net_tx_pkt_unmap_frag_pci, core->owner);
 memset(&core->tx[i].props, 0, sizeof(core->tx[i].props));
 core->tx[i].skip_cp = false;
 }
-- 
2.40.1

[PATCH v5 46/48] vmxnet3: Do not depend on PC

2023-05-22 Thread Akihiko Odaki

vmxnet3 has no dependency on PC, and VMware Fusion actually makes it
available on Apple Silicon according to:
https://kb.vmware.com/s/article/90364

Signed-off-by: Akihiko Odaki 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/net/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/Kconfig b/hw/net/Kconfig
index 18c7851efe..98e00be4f9 100644
--- a/hw/net/Kconfig
+++ b/hw/net/Kconfig
@@ -56,7 +56,7 @@ config RTL8139_PCI
 
 config VMXNET3_PCI
 bool
-default y if PCI_DEVICES && PC_PCI
+default y if PCI_DEVICES
 depends on PCI
 
 config SMC91C111
-- 
2.40.1

[PATCH v5 32/48] hw/net/net_rx_pkt: Enforce alignment for eth_header

2023-05-22 Thread Akihiko Odaki

eth_strip_vlan and eth_strip_vlan_ex refer to ehdr_buf as struct
eth_header. Enforce alignment for the structure.
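
To illustrate why the member type matters, the sketch below compares a
byte array with a typed member placed after an odd-sized field. The
toy_* types are simplified stand-ins, not the QEMU definitions, and the
offsets mentioned assume a typical ABI where uint16_t is 2-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the real header structures. */
struct toy_eth_header { uint8_t dst[6]; uint8_t src[6]; uint16_t proto; };
struct toy_vlan_header { uint16_t tci; uint16_t proto; };

/* A byte array only needs 1-byte alignment, so it may land at an odd offset. */
struct toy_pkt_bytes {
    uint8_t pad;
    uint8_t ehdr_buf[sizeof(struct toy_eth_header) +
                     sizeof(struct toy_vlan_header)];
};

/* A typed member is placed at an offset suitable for struct toy_eth_header. */
struct toy_pkt_typed {
    uint8_t pad;
    struct {
        struct toy_eth_header eth;
        struct toy_vlan_header vlan;
    } ehdr_buf;
};

/* Return non-zero if p is a multiple of align. */
static int toy_is_aligned(const void *p, size_t align)
{
    return (uintptr_t)p % align == 0;
}
```

With the byte array, casting ehdr_buf to a struct pointer relies on
luck in the surrounding layout; with the typed member, the compiler
guarantees the alignment.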

Signed-off-by: Akihiko Odaki 
Reviewed-by: Sriram Yagnaraman 
---
 hw/net/net_rx_pkt.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/hw/net/net_rx_pkt.c b/hw/net/net_rx_pkt.c
index 6125a063d7..1de42b4f51 100644
--- a/hw/net/net_rx_pkt.c
+++ b/hw/net/net_rx_pkt.c
@@ -23,7 +23,10 @@
 
 struct NetRxPkt {
 struct virtio_net_hdr virt_hdr;
-uint8_t ehdr_buf[sizeof(struct eth_header) + sizeof(struct vlan_header)];
+struct {
+struct eth_header eth;
+struct vlan_header vlan;
+} ehdr_buf;
 struct iovec *vec;
 uint16_t vec_len_total;
 uint16_t vec_len;
@@ -89,7 +92,7 @@ net_rx_pkt_pull_data(struct NetRxPkt *pkt,
 if (pkt->ehdr_buf_len) {
 net_rx_pkt_iovec_realloc(pkt, iovcnt + 1);
 
-pkt->vec[0].iov_base = pkt->ehdr_buf;
+pkt->vec[0].iov_base = &pkt->ehdr_buf;
 pkt->vec[0].iov_len = pkt->ehdr_buf_len;
 
 pkt->tot_len = pllen + pkt->ehdr_buf_len;
@@ -120,7 +123,7 @@ void net_rx_pkt_attach_iovec(struct NetRxPkt *pkt,
 assert(pkt);
 
 if (strip_vlan) {
-pkt->ehdr_buf_len = eth_strip_vlan(iov, iovcnt, iovoff, pkt->ehdr_buf,
+pkt->ehdr_buf_len = eth_strip_vlan(iov, iovcnt, iovoff, &pkt->ehdr_buf,
, );
 } else {
 pkt->ehdr_buf_len = 0;
@@ -142,7 +145,7 @@ void net_rx_pkt_attach_iovec_ex(struct NetRxPkt *pkt,
 
 if (strip_vlan) {
 pkt->ehdr_buf_len = eth_strip_vlan_ex(iov, iovcnt, iovoff, vet,
-  pkt->ehdr_buf,
+  &pkt->ehdr_buf,
   , );
 } else {
 pkt->ehdr_buf_len = 0;
-- 
2.40.1
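
As a reading aid for the alignment change above, here is a minimal sketch of why a typed struct is a safer container than a byte array. The struct layouts are simplified stand-ins rather than QEMU's definitions from include/net/eth.h, and every sketch_* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the Ethernet and VLAN headers (assumed layouts). */
struct sketch_eth_header {
    uint8_t  h_dest[6];
    uint8_t  h_source[6];
    uint16_t h_proto;
};

struct sketch_vlan_header {
    uint16_t h_tci;
    uint16_t h_proto;
};

/* Old style: a byte array only needs byte alignment, so it may be placed
 * at an odd offset right after a one-byte member. */
struct sketch_old_pkt {
    uint8_t hdr_len;
    uint8_t ehdr_buf[sizeof(struct sketch_eth_header) +
                     sizeof(struct sketch_vlan_header)];
};

/* New style: the buffer is typed, so the compiler aligns it for its
 * 16-bit members and inserts padding where necessary. */
struct sketch_new_pkt {
    uint8_t hdr_len;
    struct {
        struct sketch_eth_header eth;
        struct sketch_vlan_header vlan;
    } ehdr_buf;
};

/* Offset of the buffer inside each layout. */
static size_t sketch_old_offset(void)
{
    return offsetof(struct sketch_old_pkt, ehdr_buf);
}

static size_t sketch_new_offset(void)
{
    return offsetof(struct sketch_new_pkt, ehdr_buf);
}
```

With the byte array, reading the buffer through a struct eth_header pointer depends on the array happening to land at a suitably aligned address. With the typed member, the compiler guarantees it.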

[PATCH v5 29/48] igb: Rename a variable in igb_receive_internal()

2023-05-22 Thread Akihiko Odaki

Rename variable "n" to "causes", which properly represents the content
of the variable.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Sriram Yagnaraman 
---
 hw/net/igb_core.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index edda07e564..c954369964 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -1569,7 +1569,7 @@ igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
  bool has_vnet, bool *external_tx)
 {
 uint16_t queues = 0;
-uint32_t n = 0;
+uint32_t causes = 0;
 union {
 L2Header l2_header;
 uint8_t octets[ETH_ZLEN];
@@ -1649,19 +1649,19 @@ igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
 e1000x_fcs_len(core->mac);
 
 if (!igb_has_rxbufs(core, rxr.i, total_size)) {
-n |= E1000_ICS_RXO;
+causes |= E1000_ICS_RXO;
 trace_e1000e_rx_not_written_to_guest(rxr.i->idx);
 continue;
 }
 
-n |= E1000_ICR_RXDW;
+causes |= E1000_ICR_RXDW;
 
 igb_rx_fix_l4_csum(core, core->rx_pkt);
 igb_write_packet_to_guest(core, core->rx_pkt, &rxr, &rss_info);
 
 /* Check if receive descriptor minimum threshold hit */
 if (igb_rx_descr_threshold_hit(core, rxr.i)) {
-n |= E1000_ICS_RXDMT0;
+causes |= E1000_ICS_RXDMT0;
 }
 
 core->mac[EICR] |= igb_rx_wb_eic(core, rxr.i->idx);
@@ -1669,8 +1669,8 @@ igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
 trace_e1000e_rx_written_to_guest(rxr.i->idx);
 }
 
-trace_e1000e_rx_interrupt_set(n);
-igb_set_interrupt_cause(core, n);
+trace_e1000e_rx_interrupt_set(causes);
+igb_set_interrupt_cause(core, causes);
 
 return orig_size;
 }
-- 
2.40.1

[PATCH v5 34/48] igb: Implement MSI-X single vector mode

2023-05-22 Thread Akihiko Odaki

Signed-off-by: Akihiko Odaki 
Reviewed-by: Sriram Yagnaraman 
---
 hw/net/igb_core.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index c954369964..6d55b43fb4 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -1873,7 +1873,7 @@ igb_update_interrupt_state(IGBCore *core)
 
 icr = core->mac[ICR] & core->mac[IMS];
 
-if (msix_enabled(core->owner)) {
+if (core->mac[GPIE] & E1000_GPIE_MSIX_MODE) {
 if (icr) {
 causes = 0;
 if (icr & E1000_ICR_DRSTA) {
@@ -1908,7 +1908,12 @@ igb_update_interrupt_state(IGBCore *core)
 trace_e1000e_irq_pending_interrupts(core->mac[ICR] & core->mac[IMS],
 core->mac[ICR], core->mac[IMS]);
 
-if (msi_enabled(core->owner)) {
+if (msix_enabled(core->owner)) {
+if (icr) {
+trace_e1000e_irq_msix_notify_vec(0);
+msix_notify(core->owner, 0);
+}
+} else if (msi_enabled(core->owner)) {
 if (icr) {
 msi_notify(core->owner, 0);
 }
-- 
2.40.1

[PATCH v5 41/48] igb: Implement Rx PTP2 timestamp

2023-05-22 Thread Akihiko Odaki

Signed-off-by: Akihiko Odaki 
---
 hw/net/igb_common.h |  16 +++---
 hw/net/igb_regs.h   |  23 
 hw/net/igb_core.c   | 129 
 3 files changed, 127 insertions(+), 41 deletions(-)

diff --git a/hw/net/igb_common.h b/hw/net/igb_common.h
index f2a9065791..5c261ba9d3 100644
--- a/hw/net/igb_common.h
+++ b/hw/net/igb_common.h
@@ -51,7 +51,7 @@
defreg_indexeda(x, 0), defreg_indexeda(x, 1), \
defreg_indexeda(x, 2), defreg_indexeda(x, 3)
 
-#define defregv(x) defreg_indexed(x, 0), defreg_indexed(x, 1),   \
+#define defreg8(x) defreg_indexed(x, 0), defreg_indexed(x, 1),   \
defreg_indexed(x, 2), defreg_indexed(x, 3),   \
defreg_indexed(x, 4), defreg_indexed(x, 5),   \
defreg_indexed(x, 6), defreg_indexed(x, 7)
@@ -122,6 +122,8 @@ enum {
 defreg(EICS),defreg(EIMS),defreg(EIMC),   defreg(EIAM),
 defreg(EICR),defreg(IVAR_MISC),   defreg(GPIE),
 
+defreg(TSYNCRXCFG), defreg8(ETQF),
+
 defreg(RXPBS),  defregd(RDBAL),   defregd(RDBAH), defregd(RDLEN),
 defregd(SRRCTL),defregd(RDH), defregd(RDT),
 defregd(RXDCTL),defregd(RXCTL),   defregd(RQDPC), defreg(RA2),
@@ -133,15 +135,15 @@ enum {
 
 defreg(VT_CTL),
 
-defregv(P2VMAILBOX), defregv(V2PMAILBOX), defreg(MBVFICR), defreg(MBVFIMR),
+defreg8(P2VMAILBOX), defreg8(V2PMAILBOX), defreg(MBVFICR), defreg(MBVFIMR),
 defreg(VFLRE),   defreg(VFRE),defreg(VFTE),   defreg(WVBR),
 defreg(QDE), defreg(DTXSWC),  defreg_indexed(VLVF, 0),
-defregv(VMOLR),  defreg(RPLOLR),  defregv(VMBMEM), defregv(VMVIR),
+defreg8(VMOLR),  defreg(RPLOLR),  defreg8(VMBMEM), defreg8(VMVIR),
 
-defregv(PVTCTRL), defregv(PVTEICS), defregv(PVTEIMS), defregv(PVTEIMC),
-defregv(PVTEIAC), defregv(PVTEIAM), defregv(PVTEICR), defregv(PVFGPRC),
-defregv(PVFGPTC), defregv(PVFGORC), defregv(PVFGOTC), defregv(PVFMPRC),
-defregv(PVFGPRLBC),  defregv(PVFGPTLBC),  defregv(PVFGORLBC), defregv(PVFGOTLBC),
+defreg8(PVTCTRL), defreg8(PVTEICS), defreg8(PVTEIMS), defreg8(PVTEIMC),
+defreg8(PVTEIAC), defreg8(PVTEIAM), defreg8(PVTEICR), defreg8(PVFGPRC),
+defreg8(PVFGPTC), defreg8(PVFGORC), defreg8(PVFGOTC), defreg8(PVFMPRC),
+defreg8(PVFGPRLBC),  defreg8(PVFGPTLBC),  defreg8(PVFGORLBC), defreg8(PVFGOTLBC),
 
 defreg(MTA_A),
 
diff --git a/hw/net/igb_regs.h b/hw/net/igb_regs.h
index 4b4ebd3369..894705599d 100644
--- a/hw/net/igb_regs.h
+++ b/hw/net/igb_regs.h
@@ -210,6 +210,15 @@ union e1000_adv_rx_desc {
 #define E1000_DCA_TXCTRL_CPUID_SHIFT 24 /* Tx CPUID now in the last byte */
 #define E1000_DCA_RXCTRL_CPUID_SHIFT 24 /* Rx CPUID now in the last byte */
 
+/* ETQF register bit definitions */
+#define E1000_ETQF_FILTER_ENABLE   BIT(26)
+#define E1000_ETQF_1588BIT(30)
+#define E1000_ETQF_IMM_INT BIT(29)
+#define E1000_ETQF_QUEUE_ENABLEBIT(31)
+#define E1000_ETQF_QUEUE_SHIFT 16
+#define E1000_ETQF_QUEUE_MASK  0x0007
+#define E1000_ETQF_ETYPE_MASK  0x
+
 #define E1000_DTXSWC_MAC_SPOOF_MASK   0x00FF /* Per VF MAC spoof control */
 #define E1000_DTXSWC_VLAN_SPOOF_MASK  0xFF00 /* Per VF VLAN spoof control */
 #define E1000_DTXSWC_LLE_MASK 0x00FF /* Per VF Local LB enables */
@@ -384,6 +393,20 @@ union e1000_adv_rx_desc {
 #define E1000_FRTIMER   0x01048  /* Free Running Timer - RW */
 #define E1000_FCRTV 0x02460  /* Flow Control Refresh Timer Value - RW */
 
+#define E1000_TSYNCRXCFG 0x05F50 /* Time Sync Rx Configuration - RW */
+
+/* Filtering Registers */
+#define E1000_SAQF(_n) (0x5980 + 4 * (_n))
+#define E1000_DAQF(_n) (0x59A0 + 4 * (_n))
+#define E1000_SPQF(_n) (0x59C0 + 4 * (_n))
+#define E1000_FTQF(_n) (0x59E0 + 4 * (_n))
+#define E1000_SAQF0 E1000_SAQF(0)
+#define E1000_DAQF0 E1000_DAQF(0)
+#define E1000_SPQF0 E1000_SPQF(0)
+#define E1000_FTQF0 E1000_FTQF(0)
+#define E1000_SYNQF(_n) (0x055FC + (4 * (_n))) /* SYN Packet Queue Fltr */
+#define E1000_ETQF(_n)  (0x05CB0 + (4 * (_n))) /* EType Queue Fltr */
+
 #define E1000_RQDPC(_n) (0x0C030 + ((_n) * 0x40))
 
 #define E1000_RXPBS 0x02404  /* Rx Packet Buffer Size - RW */
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index c04ec01117..43d23c7621 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -72,6 +72,24 @@ typedef struct L2Header {
 struct vlan_header vlan[2];
 } L2Header;
 
+typedef struct PTP2 {
+uint8_t message_id_transport_specific;
+uint8_t version_ptp;
+uint16_t message_length;
+uint8_t subdomain_number;
+uint8_t reserved0;
+uint16_t flags;
+uint64_t correction;
+uint8_t reserved1[5];
+uint8_t source_communication_technology;
+uint32_t source_uuid_lo;
+uint16_t source_uuid_hi;
+uint16_t source_port_id;
+uint16_t

[PATCH v5 07/48] net/net_rx_pkt: Use iovec for net_rx_pkt_set_protocols()

2023-05-22 Thread Akihiko Odaki

igb does not properly ensure the buffer passed to
net_rx_pkt_set_protocols() is contiguous for the entire L2/L3/L4 header.
Allow it to pass scattered data to net_rx_pkt_set_protocols().

Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki 
Reviewed-by: Sriram Yagnaraman 
---
 hw/net/net_rx_pkt.h | 10 ++
 include/net/eth.h   |  6 +++---
 hw/net/igb_core.c   |  2 +-
 hw/net/net_rx_pkt.c | 14 +-
 hw/net/virtio-net.c |  7 +--
 hw/net/vmxnet3.c|  7 ++-
 net/eth.c   | 18 --
 7 files changed, 34 insertions(+), 30 deletions(-)

diff --git a/hw/net/net_rx_pkt.h b/hw/net/net_rx_pkt.h
index d00b484900..a06f5c2675 100644
--- a/hw/net/net_rx_pkt.h
+++ b/hw/net/net_rx_pkt.h
@@ -55,12 +55,14 @@ size_t net_rx_pkt_get_total_len(struct NetRxPkt *pkt);
  * parse and set packet analysis results
  *
  * @pkt:packet
- * @data:   pointer to the data buffer to be parsed
- * @len:data length
+ * @iov:received data scatter-gather list
+ * @iovcnt: number of elements in iov
+ * @iovoff: data start offset in the iov
  *
  */
-void net_rx_pkt_set_protocols(struct NetRxPkt *pkt, const void *data,
-  size_t len);
+void net_rx_pkt_set_protocols(struct NetRxPkt *pkt,
+  const struct iovec *iov, size_t iovcnt,
+  size_t iovoff);
 
 /**
  * fetches packet analysis results
diff --git a/include/net/eth.h b/include/net/eth.h
index c5ae4493b4..9f19c3a695 100644
--- a/include/net/eth.h
+++ b/include/net/eth.h
@@ -312,10 +312,10 @@ eth_get_l2_hdr_length(const void *p)
 }
 
 static inline uint32_t
-eth_get_l2_hdr_length_iov(const struct iovec *iov, int iovcnt)
+eth_get_l2_hdr_length_iov(const struct iovec *iov, size_t iovcnt, size_t iovoff)
 {
 uint8_t p[sizeof(struct eth_header) + sizeof(struct vlan_header)];
-size_t copied = iov_to_buf(iov, iovcnt, 0, p, ARRAY_SIZE(p));
+size_t copied = iov_to_buf(iov, iovcnt, iovoff, p, ARRAY_SIZE(p));
 
 if (copied < ARRAY_SIZE(p)) {
 return copied;
@@ -397,7 +397,7 @@ typedef struct eth_l4_hdr_info_st {
 bool has_tcp_data;
 } eth_l4_hdr_info;
 
-void eth_get_protocols(const struct iovec *iov, int iovcnt,
+void eth_get_protocols(const struct iovec *iov, size_t iovcnt, size_t iovoff,
bool *hasip4, bool *hasip6,
size_t *l3hdr_off,
size_t *l4hdr_off,
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index eaca5bd2b6..21a8d9ada4 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -1650,7 +1650,7 @@ igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
 
 ehdr = PKT_GET_ETH_HDR(filter_buf);
 net_rx_pkt_set_packet_type(core->rx_pkt, get_eth_packet_type(ehdr));
-net_rx_pkt_set_protocols(core->rx_pkt, filter_buf, size);
+net_rx_pkt_set_protocols(core->rx_pkt, iov, iovcnt, iov_ofs);
 
 queues = igb_receive_assign(core, ehdr, size, &rss_info, external_tx);
 if (!queues) {
diff --git a/hw/net/net_rx_pkt.c b/hw/net/net_rx_pkt.c
index 39cdea06de..63be6e05ad 100644
--- a/hw/net/net_rx_pkt.c
+++ b/hw/net/net_rx_pkt.c
@@ -103,7 +103,7 @@ net_rx_pkt_pull_data(struct NetRxPkt *pkt,
 iov, iovcnt, ploff, pkt->tot_len);
 }
 
-eth_get_protocols(pkt->vec, pkt->vec_len, &pkt->hasip4, &pkt->hasip6,
+eth_get_protocols(pkt->vec, pkt->vec_len, 0, &pkt->hasip4, &pkt->hasip6,
   &pkt->l3hdr_off, &pkt->l4hdr_off, &pkt->l5hdr_off,
   &pkt->ip6hdr_info, &pkt->ip4hdr_info, &pkt->l4hdr_info);
 
@@ -186,17 +186,13 @@ size_t net_rx_pkt_get_total_len(struct NetRxPkt *pkt)
 return pkt->tot_len;
 }
 
-void net_rx_pkt_set_protocols(struct NetRxPkt *pkt, const void *data,
-  size_t len)
+void net_rx_pkt_set_protocols(struct NetRxPkt *pkt,
+  const struct iovec *iov, size_t iovcnt,
+  size_t iovoff)
 {
-const struct iovec iov = {
-.iov_base = (void *)data,
-.iov_len = len
-};
-
 assert(pkt);
 
-eth_get_protocols(&iov, 1, &pkt->hasip4, &pkt->hasip6,
+eth_get_protocols(iov, iovcnt, iovoff, &pkt->hasip4, &pkt->hasip6,
   &pkt->l3hdr_off, &pkt->l4hdr_off, &pkt->l5hdr_off,
   &pkt->ip6hdr_info, &pkt->ip4hdr_info, &pkt->l4hdr_info);
 }
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 53e1c32643..37551fd854 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1835,9 +1835,12 @@ static int virtio_net_process_rss(NetClientState *nc, const uint8_t *buf,
 VIRTIO_NET_HASH_REPORT_UDPv6,
 VIRTIO_NET_HASH_REPORT_UDPv6_EX
 };
+struct iovec iov = {
+.iov_base = (void *)buf,
+.iov_len = size
+};
 
-net_rx_pkt_set_protocols(pkt, buf + n->host_hdr_len,
- size - n->host_hdr_len);
+net_rx_pkt_set_protocols(pkt, &iov, 1, n->host_hdr_len);
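
The patch above switches the parser from one flat buffer to a scatter-gather list plus a start offset. The following minimal sketch shows how a header can be gathered from scattered data in that style. sketch_iov_to_buf is a hypothetical stand-in with the same argument order as the iov_to_buf call in the diff; it is not QEMU's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Copy up to `bytes` bytes into `buf`, starting `offset` bytes into the
 * scatter-gather list. Returns the number of bytes actually copied, which
 * is smaller than `bytes` when the list runs out of data.
 */
static size_t sketch_iov_to_buf(const struct iovec *iov, size_t iovcnt,
                                size_t offset, void *buf, size_t bytes)
{
    unsigned char *dst = buf;
    size_t done = 0;

    for (size_t i = 0; i < iovcnt && done < bytes; i++) {
        if (offset >= iov[i].iov_len) {
            /* The start offset lies beyond this element: skip it. */
            offset -= iov[i].iov_len;
            continue;
        }

        size_t avail = iov[i].iov_len - offset;
        size_t n = avail < bytes - done ? avail : bytes - done;

        memcpy(dst + done, (const unsigned char *)iov[i].iov_base + offset, n);
        done += n;
        offset = 0; /* later elements are read from their beginning */
    }

    return done;
}
```

A caller can compare the return value with the requested length to detect a packet that is too short for the header it wants to parse, which is the same check eth_get_l2_hdr_length_iov makes on `copied`.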

[PATCH v5 48/48] docs/system/devices/igb: Note igb is tested for DPDK

2023-05-22 Thread Akihiko Odaki

Signed-off-by: Akihiko Odaki 
---
 docs/system/devices/igb.rst | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/docs/system/devices/igb.rst b/docs/system/devices/igb.rst
index afe036dad2..60c10bf7c7 100644
--- a/docs/system/devices/igb.rst
+++ b/docs/system/devices/igb.rst
@@ -14,7 +14,8 @@ Limitations
 ===
 
 This igb implementation was tested with Linux Test Project [2]_ and Windows HLK
-[3]_ during the initial development. The command used when testing with LTP is:
+[3]_ during the initial development. Later it was also tested with DPDK Test
+Suite [4]_. The command used when testing with LTP is:
 
 .. code-block:: shell
 
@@ -22,8 +23,8 @@ This igb implementation was tested with Linux Test Project [2]_ and Windows HLK
 
 Be aware that this implementation lacks many functionalities available with the
 actual hardware, and you may experience various failures if you try to use it
-with a different operating system other than Linux and Windows or if you try
-functionalities not covered by the tests.
+with a different operating system other than DPDK, Linux, and Windows or if you
+try functionalities not covered by the tests.
 
 Using igb
 =
@@ -32,7 +33,7 @@ Using igb should be nothing different from using another network device. See
 :ref:`pcsys_005fnetwork` in general.
 
 However, you may also need to perform additional steps to activate SR-IOV
-feature on your guest. For Linux, refer to [4]_.
+feature on your guest. For Linux, refer to [5]_.
 
 Developing igb
 ==
@@ -68,4 +69,5 @@ References
 .. [1] https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/82576eb-gigabit-ethernet-controller-datasheet.pdf
 .. [2] https://github.com/linux-test-project/ltp
 .. [3] https://learn.microsoft.com/en-us/windows-hardware/test/hlk/
-.. [4] https://docs.kernel.org/PCI/pci-iov-howto.html
+.. [4] https://doc.dpdk.org/dts/gsg/
+.. [5] https://docs.kernel.org/PCI/pci-iov-howto.html
-- 
2.40.1

[PATCH v5 04/48] igb: Fix Rx packet type encoding

2023-05-22 Thread Akihiko Odaki

igb's advanced descriptor uses a packet type encoding different from
the one used in e1000e's extended descriptor. Fix the logic to encode
the Rx packet type accordingly.

Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki 
Reviewed-by: Sriram Yagnaraman 
---
 hw/net/igb_regs.h |  5 +
 hw/net/igb_core.c | 38 +++---
 2 files changed, 24 insertions(+), 19 deletions(-)

diff --git a/hw/net/igb_regs.h b/hw/net/igb_regs.h
index c5c5b3c3b8..21ee9a3b2d 100644
--- a/hw/net/igb_regs.h
+++ b/hw/net/igb_regs.h
@@ -641,6 +641,11 @@ union e1000_adv_rx_desc {
 
 #define E1000_STATUS_NUM_VFS_SHIFT 14
 
+#define E1000_ADVRXD_PKT_IP4 BIT(4)
+#define E1000_ADVRXD_PKT_IP6 BIT(6)
+#define E1000_ADVRXD_PKT_TCP BIT(8)
+#define E1000_ADVRXD_PKT_UDP BIT(9)
+
 static inline uint8_t igb_ivar_entry_rx(uint8_t i)
 {
 return i < 8 ? i * 4 : (i - 8) * 4 + 2;
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 464a41d0aa..dbd1192a8e 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -1227,7 +1227,6 @@ igb_build_rx_metadata(IGBCore *core,
 struct virtio_net_hdr *vhdr;
 bool hasip4, hasip6;
 EthL4HdrProto l4hdr_proto;
-uint32_t pkt_type;
 
 *status_flags = E1000_RXD_STAT_DD;
 
@@ -1266,28 +1265,29 @@ igb_build_rx_metadata(IGBCore *core,
 trace_e1000e_rx_metadata_ack();
 }
 
-if (hasip6 && (core->mac[RFCTL] & E1000_RFCTL_IPV6_DIS)) {
-trace_e1000e_rx_metadata_ipv6_filtering_disabled();
-pkt_type = E1000_RXD_PKT_MAC;
-} else if (l4hdr_proto == ETH_L4_HDR_PROTO_TCP ||
-   l4hdr_proto == ETH_L4_HDR_PROTO_UDP) {
-pkt_type = hasip4 ? E1000_RXD_PKT_IP4_XDP : E1000_RXD_PKT_IP6_XDP;
-} else if (hasip4 || hasip6) {
-pkt_type = hasip4 ? E1000_RXD_PKT_IP4 : E1000_RXD_PKT_IP6;
-} else {
-pkt_type = E1000_RXD_PKT_MAC;
-}
+if (pkt_info) {
+*pkt_info = rss_info->enabled ? rss_info->type : 0;
 
-trace_e1000e_rx_metadata_pkt_type(pkt_type);
+if (hasip4) {
+*pkt_info |= E1000_ADVRXD_PKT_IP4;
+}
 
-if (pkt_info) {
-if (rss_info->enabled) {
-*pkt_info = rss_info->type;
+if (hasip6) {
+*pkt_info |= E1000_ADVRXD_PKT_IP6;
 }
 
-*pkt_info |= (pkt_type << 4);
-} else {
-*status_flags |= E1000_RXD_PKT_TYPE(pkt_type);
+switch (l4hdr_proto) {
+case ETH_L4_HDR_PROTO_TCP:
+*pkt_info |= E1000_ADVRXD_PKT_TCP;
+break;
+
+case ETH_L4_HDR_PROTO_UDP:
+*pkt_info |= E1000_ADVRXD_PKT_UDP;
+break;
+
+default:
+break;
+}
 }
 
 if (hdr_info) {
-- 
2.40.1

[PATCH v5 14/48] net/eth: Rename eth_setup_vlan_headers_ex

2023-05-22 Thread Akihiko Odaki

The old eth_setup_vlan_headers has no users, so remove it and rename
eth_setup_vlan_headers_ex to take its place.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Philippe Mathieu-Daudé 
---
 include/net/eth.h   | 9 +
 hw/net/net_tx_pkt.c | 2 +-
 net/eth.c   | 2 +-
 3 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/include/net/eth.h b/include/net/eth.h
index 9f19c3a695..e8af5742be 100644
--- a/include/net/eth.h
+++ b/include/net/eth.h
@@ -351,16 +351,9 @@ eth_strip_vlan_ex(const struct iovec *iov, int iovcnt, size_t iovoff,
 uint16_t
 eth_get_l3_proto(const struct iovec *l2hdr_iov, int iovcnt, size_t l2hdr_len);
 
-void eth_setup_vlan_headers_ex(struct eth_header *ehdr, uint16_t vlan_tag,
+void eth_setup_vlan_headers(struct eth_header *ehdr, uint16_t vlan_tag,
 uint16_t vlan_ethtype, bool *is_new);
 
-static inline void
-eth_setup_vlan_headers(struct eth_header *ehdr, uint16_t vlan_tag,
-bool *is_new)
-{
-eth_setup_vlan_headers_ex(ehdr, vlan_tag, ETH_P_VLAN, is_new);
-}
-
 
 uint8_t eth_get_gso_type(uint16_t l3_proto, uint8_t *l3_hdr, uint8_t l4proto);
 
diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
index cc36750c9b..ce6b102391 100644
--- a/hw/net/net_tx_pkt.c
+++ b/hw/net/net_tx_pkt.c
@@ -368,7 +368,7 @@ void net_tx_pkt_setup_vlan_header_ex(struct NetTxPkt *pkt,
 bool is_new;
 assert(pkt);
 
-eth_setup_vlan_headers_ex(pkt->vec[NET_TX_PKT_L2HDR_FRAG].iov_base,
+eth_setup_vlan_headers(pkt->vec[NET_TX_PKT_L2HDR_FRAG].iov_base,
 vlan, vlan_ethtype, &is_new);
 
 /* update l2hdrlen */
diff --git a/net/eth.c b/net/eth.c
index d7b30df79f..b6ff89c460 100644
--- a/net/eth.c
+++ b/net/eth.c
@@ -21,7 +21,7 @@
 #include "net/checksum.h"
 #include "net/tap.h"
 
-void eth_setup_vlan_headers_ex(struct eth_header *ehdr, uint16_t vlan_tag,
+void eth_setup_vlan_headers(struct eth_header *ehdr, uint16_t vlan_tag,
 uint16_t vlan_ethtype, bool *is_new)
 {
 struct vlan_header *vhdr = PKT_GET_VLAN_HDR(ehdr);
-- 
2.40.1

[PATCH v5 47/48] MAINTAINERS: Add a reviewer for network packet abstractions

2023-05-22 Thread Akihiko Odaki

I have made significant changes to the network packet abstractions, so
add me as a reviewer.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Philippe Mathieu-Daudé 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index c31d2279ab..8b2ef5943c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2214,6 +2214,7 @@ F: tests/qtest/fuzz-megasas-test.c
 
 Network packet abstractions
 M: Dmitry Fleytman 
+R: Akihiko Odaki 
 S: Maintained
 F: include/net/eth.h
 F: net/eth.c
-- 
2.40.1

[PATCH v5 25/48] igb: Share common VF constants

2023-05-22 Thread Akihiko Odaki

The constants need to be consistent between the PF and VF.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Sriram Yagnaraman 
---
 hw/net/igb_common.h |  8 
 hw/net/igb.c| 10 +-
 hw/net/igbvf.c  |  7 ---
 3 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/hw/net/igb_common.h b/hw/net/igb_common.h
index 69ac490f75..f2a9065791 100644
--- a/hw/net/igb_common.h
+++ b/hw/net/igb_common.h
@@ -28,6 +28,14 @@
 
 #include "igb_regs.h"
 
+#define TYPE_IGBVF "igbvf"
+
+#define IGBVF_MMIO_BAR_IDX  (0)
+#define IGBVF_MSIX_BAR_IDX  (3)
+
+#define IGBVF_MMIO_SIZE (16 * 1024)
+#define IGBVF_MSIX_SIZE (16 * 1024)
+
 #define defreg(x) x = (E1000_##x >> 2)
 #define defreg_indexed(x, i) x##i = (E1000_##x(i) >> 2)
 #define defreg_indexeda(x, i) x##i##_A = (E1000_##x##_A(i) >> 2)
diff --git a/hw/net/igb.c b/hw/net/igb.c
index 51a7e9133e..1c989d7677 100644
--- a/hw/net/igb.c
+++ b/hw/net/igb.c
@@ -433,16 +433,16 @@ static void igb_pci_realize(PCIDevice *pci_dev, Error **errp)
 
 pcie_ari_init(pci_dev, 0x150, 1);
 
-pcie_sriov_pf_init(pci_dev, IGB_CAP_SRIOV_OFFSET, "igbvf",
+pcie_sriov_pf_init(pci_dev, IGB_CAP_SRIOV_OFFSET, TYPE_IGBVF,
 IGB_82576_VF_DEV_ID, IGB_MAX_VF_FUNCTIONS, IGB_MAX_VF_FUNCTIONS,
 IGB_VF_OFFSET, IGB_VF_STRIDE);
 
-pcie_sriov_pf_init_vf_bar(pci_dev, 0,
+pcie_sriov_pf_init_vf_bar(pci_dev, IGBVF_MMIO_BAR_IDX,
 PCI_BASE_ADDRESS_MEM_TYPE_64 | PCI_BASE_ADDRESS_MEM_PREFETCH,
-16 * KiB);
-pcie_sriov_pf_init_vf_bar(pci_dev, 3,
+IGBVF_MMIO_SIZE);
+pcie_sriov_pf_init_vf_bar(pci_dev, IGBVF_MSIX_BAR_IDX,
 PCI_BASE_ADDRESS_MEM_TYPE_64 | PCI_BASE_ADDRESS_MEM_PREFETCH,
-16 * KiB);
+IGBVF_MSIX_SIZE);
 
 igb_init_net_peer(s, pci_dev, macaddr);
 
diff --git a/hw/net/igbvf.c b/hw/net/igbvf.c
index 70beb7af50..284ea61184 100644
--- a/hw/net/igbvf.c
+++ b/hw/net/igbvf.c
@@ -50,15 +50,8 @@
 #include "trace.h"
 #include "qapi/error.h"
 
-#define TYPE_IGBVF "igbvf"
 OBJECT_DECLARE_SIMPLE_TYPE(IgbVfState, IGBVF)
 
-#define IGBVF_MMIO_BAR_IDX  (0)
-#define IGBVF_MSIX_BAR_IDX  (3)
-
-#define IGBVF_MMIO_SIZE (16 * 1024)
-#define IGBVF_MSIX_SIZE (16 * 1024)
-
 struct IgbVfState {
 PCIDevice parent_obj;
 
-- 
2.40.1

[PATCH v5 39/48] igb: Filter with the second VLAN tag for extended VLAN

2023-05-22 Thread Akihiko Odaki

Signed-off-by: Akihiko Odaki 
---
 hw/net/igb_core.c | 23 ++-
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 688eaf7319..5345f57031 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -69,7 +69,7 @@ typedef struct IGBTxPktVmdqCallbackContext {
 
 typedef struct L2Header {
 struct eth_header eth;
-struct vlan_header vlan;
+struct vlan_header vlan[2];
 } L2Header;
 
 static ssize_t
@@ -1001,7 +1001,7 @@ static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
 uint32_t f, ra[2], *macp, rctl = core->mac[RCTL];
 uint16_t queues = 0;
 uint16_t oversized = 0;
-uint16_t vid = be16_to_cpu(l2_header->vlan.h_tci) & VLAN_VID_MASK;
+size_t vlan_num = 0;
 int i;
 
 memset(rss_info, 0, sizeof(E1000E_RSSInfo));
@@ -1010,8 +1010,19 @@ static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
 *external_tx = true;
 }
 
-if (e1000x_is_vlan_packet(ehdr, core->mac[VET] & 0x) &&
-!e1000x_rx_vlan_filter(core->mac, PKT_GET_VLAN_HDR(ehdr))) {
+if (core->mac[CTRL_EXT] & BIT(26)) {
+if (be16_to_cpu(ehdr->h_proto) == core->mac[VET] >> 16 &&
+be16_to_cpu(l2_header->vlan[0].h_proto) == (core->mac[VET] & 
0x)) {
+vlan_num = 2;
+}
+} else {
+if (be16_to_cpu(ehdr->h_proto) == (core->mac[VET] & 0x)) {
+vlan_num = 1;
+}
+}
+
+if (vlan_num &&
+!e1000x_rx_vlan_filter(core->mac, l2_header->vlan + vlan_num - 1)) {
 return queues;
 }
 
@@ -1065,7 +1076,9 @@ static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
 if (e1000x_vlan_rx_filter_enabled(core->mac)) {
 uint16_t mask = 0;
 
-if (e1000x_is_vlan_packet(ehdr, core->mac[VET] & 0x)) {
+if (vlan_num) {
+uint16_t vid = be16_to_cpu(l2_header->vlan[vlan_num - 1].h_tci) & VLAN_VID_MASK;
+
 for (i = 0; i < E1000_VLVF_ARRAY_SIZE; i++) {
 if ((core->mac[VLVF0 + i] & E1000_VLVF_VLANID_MASK) == vid &&
 (core->mac[VLVF0 + i] & E1000_VLVF_VLANID_ENABLE)) {
-- 
2.40.1

[PATCH v5 02/48] hw/net/net_tx_pkt: Decouple interface from PCI

2023-05-22 Thread Akihiko Odaki

This allows the network packet abstractions to be used even if PCI is
not used.

Signed-off-by: Akihiko Odaki 
---
 hw/net/net_tx_pkt.h  | 31 ---
 hw/net/e1000e_core.c | 13 -
 hw/net/igb_core.c| 13 ++---
 hw/net/net_tx_pkt.c  | 36 +---
 hw/net/vmxnet3.c | 14 +++---
 5 files changed, 54 insertions(+), 53 deletions(-)

diff --git a/hw/net/net_tx_pkt.h b/hw/net/net_tx_pkt.h
index 5eb123ef90..4d7233e975 100644
--- a/hw/net/net_tx_pkt.h
+++ b/hw/net/net_tx_pkt.h
@@ -26,17 +26,16 @@
 
 struct NetTxPkt;
 
-typedef void (* NetTxPktCallback)(void *, const struct iovec *, int, const struct iovec *, int);
+typedef void (*NetTxPktFreeFrag)(void *, void *, size_t);
+typedef void (*NetTxPktSend)(void *, const struct iovec *, int, const struct iovec *, int);
 
 /**
  * Init function for tx packet functionality
  *
  * @pkt:packet pointer
- * @pci_dev:PCI device processing this packet
  * @max_frags:  max tx ip fragments
  */
-void net_tx_pkt_init(struct NetTxPkt **pkt, PCIDevice *pci_dev,
-uint32_t max_frags);
+void net_tx_pkt_init(struct NetTxPkt **pkt, uint32_t max_frags);
 
 /**
  * Clean all tx packet resources.
@@ -95,12 +94,11 @@ net_tx_pkt_setup_vlan_header(struct NetTxPkt *pkt, uint16_t vlan)
  * populate data fragment into pkt context.
  *
  * @pkt:packet
- * @pa: physical address of fragment
+ * @base:   pointer to fragment
  * @len:length of fragment
  *
  */
-bool net_tx_pkt_add_raw_fragment(struct NetTxPkt *pkt, hwaddr pa,
-size_t len);
+bool net_tx_pkt_add_raw_fragment(struct NetTxPkt *pkt, void *base, size_t len);
 
 /**
  * Fix ip header fields and calculate IP header and pseudo header checksums.
@@ -148,10 +146,11 @@ void net_tx_pkt_dump(struct NetTxPkt *pkt);
  * reset tx packet private context (needed to be called between packets)
  *
  * @pkt:packet
- * @dev:PCI device processing the next packet
- *
+ * @callback:   function to free the fragments
+ * @context:pointer to be passed to the callback
  */
-void net_tx_pkt_reset(struct NetTxPkt *pkt, PCIDevice *dev);
+void net_tx_pkt_reset(struct NetTxPkt *pkt,
+  NetTxPktFreeFrag callback, void *context);
 
 /**
  * Unmap a fragment mapped from a PCI device.
@@ -162,6 +161,16 @@ void net_tx_pkt_reset(struct NetTxPkt *pkt, PCIDevice *dev);
  */
 void net_tx_pkt_unmap_frag_pci(void *context, void *base, size_t len);
 
+/**
+ * map data fragment from PCI device and populate it into pkt context.
+ *
+ * @pci_dev:PCI device owning fragment
+ * @pa: physical address of fragment
+ * @len:length of fragment
+ */
+bool net_tx_pkt_add_raw_fragment_pci(struct NetTxPkt *pkt, PCIDevice *pci_dev,
+ dma_addr_t pa, size_t len);
+
 /**
  * Send packet to qemu. handles sw offloads if vhdr is not supported.
  *
@@ -182,7 +191,7 @@ bool net_tx_pkt_send(struct NetTxPkt *pkt, NetClientState *nc);
  * @ret:operation result
  */
 bool net_tx_pkt_send_custom(struct NetTxPkt *pkt, bool offload,
-NetTxPktCallback callback, void *context);
+NetTxPktSend callback, void *context);
 
 /**
  * parse raw packet data and analyze offload requirements.
diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index cfa3f55e96..15821a75e0 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -746,7 +746,8 @@ e1000e_process_tx_desc(E1000ECore *core,
 addr = le64_to_cpu(dp->buffer_addr);
 
 if (!tx->skip_cp) {
-if (!net_tx_pkt_add_raw_fragment(tx->tx_pkt, addr, split_size)) {
+if (!net_tx_pkt_add_raw_fragment_pci(tx->tx_pkt, core->owner,
+ addr, split_size)) {
 tx->skip_cp = true;
 }
 }
@@ -764,7 +765,7 @@ e1000e_process_tx_desc(E1000ECore *core,
 }
 
 tx->skip_cp = false;
-net_tx_pkt_reset(tx->tx_pkt, core->owner);
+net_tx_pkt_reset(tx->tx_pkt, net_tx_pkt_unmap_frag_pci, core->owner);
 
 tx->sum_needed = 0;
 tx->cptse = 0;
@@ -3421,7 +3422,7 @@ e1000e_core_pci_realize(E1000ECore *core,
 qemu_add_vm_change_state_handler(e1000e_vm_state_change, core);
 
 for (i = 0; i < E1000E_NUM_QUEUES; i++) {
-net_tx_pkt_init(&core->tx[i].tx_pkt, core->owner, E1000E_MAX_TX_FRAGS);
+net_tx_pkt_init(&core->tx[i].tx_pkt, E1000E_MAX_TX_FRAGS);
 }
 
 net_rx_pkt_init(&core->rx_pkt);
@@ -3446,7 +3447,8 @@ e1000e_core_pci_uninit(E1000ECore *core)
 qemu_del_vm_change_state_handler(core->vmstate);
 
 for (i = 0; i < E1000E_NUM_QUEUES; i++) {
-net_tx_pkt_reset(core->tx[i].tx_pkt, core->owner);
+net_tx_pkt_reset(core->tx[i].tx_pkt,
+ net_tx_pkt_unmap_frag_pci, core->owner);
 net_tx_pkt_uninit(core->tx[i].tx_pkt);
 }
 
@@ -3571,7
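
The decoupling in the patch above rests on passing a "free fragment" callback and an opaque context to the reset function, so the packet code no longer needs to know about PCI. A minimal sketch of that pattern follows. The callback signature mirrors NetTxPktFreeFrag from the diff, while the sketch_* types and functions are hypothetical and much simpler than struct NetTxPkt.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as NetTxPktFreeFrag in the patch: (context, base, len). */
typedef void (*SketchFreeFrag)(void *context, void *base, size_t len);

struct sketch_frag {
    void *base;
    size_t len;
};

/* Hypothetical, heavily simplified packet: a fixed array of fragments. */
struct sketch_tx_pkt {
    struct sketch_frag frags[4];
    size_t nfrags;
};

/* Record an already-mapped fragment; returns 0 when the packet is full. */
static int sketch_add_frag(struct sketch_tx_pkt *pkt, void *base, size_t len)
{
    if (pkt->nfrags == sizeof(pkt->frags) / sizeof(pkt->frags[0])) {
        return 0;
    }
    pkt->frags[pkt->nfrags].base = base;
    pkt->frags[pkt->nfrags].len = len;
    pkt->nfrags++;
    return 1;
}

/* Release every fragment through the caller-supplied callback. The packet
 * code itself stays ignorant of how the memory was mapped. */
static void sketch_reset(struct sketch_tx_pkt *pkt,
                         SketchFreeFrag callback, void *context)
{
    for (size_t i = 0; i < pkt->nfrags; i++) {
        callback(context, pkt->frags[i].base, pkt->frags[i].len);
    }
    pkt->nfrags = 0;
}

/* Example callback: a PCI user would unmap DMA here; this one only counts
 * the released bytes in the size_t that context points to. */
static void sketch_count_bytes(void *context, void *base, size_t len)
{
    (void)base;
    *(size_t *)context += len;
}
```

In the patch, PCI devices pass net_tx_pkt_unmap_frag_pci as the callback and the owning PCIDevice as the context. A non-PCI user could supply its own pair.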

[PATCH v5 28/48] e1000e: Rename a variable in e1000e_receive_internal()

2023-05-22 Thread Akihiko Odaki

Rename variable "n" to "causes", which properly represents the content
of the variable.

Signed-off-by: Akihiko Odaki 
---
 hw/net/e1000e_core.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 7dce448657..aea70b74d9 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -1650,7 +1650,7 @@ static ssize_t
 e1000e_receive_internal(E1000ECore *core, const struct iovec *iov, int iovcnt,
 bool has_vnet)
 {
-uint32_t n = 0;
+uint32_t causes = 0;
 uint8_t buf[ETH_ZLEN];
 struct iovec min_iov;
 size_t size, orig_size;
@@ -1723,32 +1723,32 @@ e1000e_receive_internal(E1000ECore *core, const struct iovec *iov, int iovcnt,
 
 /* Perform small receive detection (RSRPD) */
 if (total_size < core->mac[RSRPD]) {
-n |= E1000_ICS_SRPD;
+causes |= E1000_ICS_SRPD;
 }
 
 /* Perform ACK receive detection */
 if  (!(core->mac[RFCTL] & E1000_RFCTL_ACK_DIS) &&
  (e1000e_is_tcp_ack(core, core->rx_pkt))) {
-n |= E1000_ICS_ACK;
+causes |= E1000_ICS_ACK;
 }
 
 /* Check if receive descriptor minimum threshold hit */
 rdmts_hit = e1000e_rx_descr_threshold_hit(core, rxr.i);
-n |= e1000e_rx_wb_interrupt_cause(core, rxr.i->idx, rdmts_hit);
+causes |= e1000e_rx_wb_interrupt_cause(core, rxr.i->idx, rdmts_hit);
 
 trace_e1000e_rx_written_to_guest(rxr.i->idx);
 } else {
-n |= E1000_ICS_RXO;
+causes |= E1000_ICS_RXO;
 retval = 0;
 
 trace_e1000e_rx_not_written_to_guest(rxr.i->idx);
 }
 
-if (!e1000e_intrmgr_delay_rx_causes(core, &n)) {
-trace_e1000e_rx_interrupt_set(n);
-e1000e_set_interrupt_cause(core, n);
+if (!e1000e_intrmgr_delay_rx_causes(core, &causes)) {
+trace_e1000e_rx_interrupt_set(causes);
+e1000e_set_interrupt_cause(core, causes);
 } else {
-trace_e1000e_rx_interrupt_delayed(n);
+trace_e1000e_rx_interrupt_delayed(causes);
 }
 
 return retval;
-- 
2.40.1

[PATCH v5 36/48] igb: Implement Rx SCTP CSO

2023-05-22 Thread Akihiko Odaki

Signed-off-by: Akihiko Odaki 
Reviewed-by: Sriram Yagnaraman 
---
 hw/net/igb_regs.h |  1 +
 include/net/eth.h |  4 ++-
 include/qemu/crc32c.h |  1 +
 hw/net/e1000e_core.c  |  5 
 hw/net/igb_core.c | 15 +-
 hw/net/net_rx_pkt.c   | 64 +++
 net/eth.c |  4 +++
 util/crc32c.c |  8 ++
 8 files changed, 89 insertions(+), 13 deletions(-)

diff --git a/hw/net/igb_regs.h b/hw/net/igb_regs.h
index e6ac26dc0e..4b4ebd3369 100644
--- a/hw/net/igb_regs.h
+++ b/hw/net/igb_regs.h
@@ -670,6 +670,7 @@ union e1000_adv_rx_desc {
 #define E1000_ADVRXD_PKT_IP6 BIT(6)
 #define E1000_ADVRXD_PKT_TCP BIT(8)
 #define E1000_ADVRXD_PKT_UDP BIT(9)
+#define E1000_ADVRXD_PKT_SCTP BIT(10)
 
 static inline uint8_t igb_ivar_entry_rx(uint8_t i)
 {
diff --git a/include/net/eth.h b/include/net/eth.h
index 048e434685..75e7f1551c 100644
--- a/include/net/eth.h
+++ b/include/net/eth.h
@@ -224,6 +224,7 @@ struct tcp_hdr {
 #define IP_HEADER_VERSION_6   (6)
 #define IP_PROTO_TCP  (6)
 #define IP_PROTO_UDP  (17)
+#define IP_PROTO_SCTP (132)
 #define IPTOS_ECN_MASK0x03
 #define IPTOS_ECN(x)  ((x) & IPTOS_ECN_MASK)
 #define IPTOS_ECN_CE  0x03
@@ -379,7 +380,8 @@ typedef struct eth_ip4_hdr_info_st {
 typedef enum EthL4HdrProto {
 ETH_L4_HDR_PROTO_INVALID,
 ETH_L4_HDR_PROTO_TCP,
-ETH_L4_HDR_PROTO_UDP
+ETH_L4_HDR_PROTO_UDP,
+ETH_L4_HDR_PROTO_SCTP
 } EthL4HdrProto;
 
 typedef struct eth_l4_hdr_info_st {
diff --git a/include/qemu/crc32c.h b/include/qemu/crc32c.h
index 5b78884c38..88b4d2b3b3 100644
--- a/include/qemu/crc32c.h
+++ b/include/qemu/crc32c.h
@@ -30,5 +30,6 @@
 
 
 uint32_t crc32c(uint32_t crc, const uint8_t *data, unsigned int length);
+uint32_t iov_crc32c(uint32_t crc, const struct iovec *iov, size_t iov_cnt);
 
 #endif
diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index aea70b74d9..0b939ff5a3 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -1114,6 +1114,11 @@ e1000e_verify_csum_in_sw(E1000ECore *core,
 return;
 }
 
+if (l4hdr_proto != ETH_L4_HDR_PROTO_TCP &&
+l4hdr_proto != ETH_L4_HDR_PROTO_UDP) {
+return;
+}
+
 if (!net_rx_pkt_validate_l4_csum(pkt, &csum_valid)) {
 trace_e1000e_rx_metadata_l4_csum_validation_failed();
 return;
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 41a2e5bf7b..95d46d6e6d 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -1220,7 +1220,7 @@ igb_build_rx_metadata(IGBCore *core,
   uint16_t *vlan_tag)
 {
 struct virtio_net_hdr *vhdr;
-bool hasip4, hasip6;
+bool hasip4, hasip6, csum_valid;
 EthL4HdrProto l4hdr_proto;
 
 *status_flags = E1000_RXD_STAT_DD;
@@ -1280,6 +1280,10 @@ igb_build_rx_metadata(IGBCore *core,
 *pkt_info |= E1000_ADVRXD_PKT_UDP;
 break;
 
+case ETH_L4_HDR_PROTO_SCTP:
+*pkt_info |= E1000_ADVRXD_PKT_SCTP;
+break;
+
 default:
 break;
 }
@@ -1312,6 +1316,15 @@ igb_build_rx_metadata(IGBCore *core,
 
 if (igb_rx_l4_cso_enabled(core)) {
 switch (l4hdr_proto) {
+case ETH_L4_HDR_PROTO_SCTP:
+if (!net_rx_pkt_validate_l4_csum(pkt, &csum_valid)) {
+trace_e1000e_rx_metadata_l4_csum_validation_failed();
+goto func_exit;
+}
+if (!csum_valid) {
+*status_flags |= E1000_RXDEXT_STATERR_TCPE;
+}
+/* fall through */
 case ETH_L4_HDR_PROTO_TCP:
 *status_flags |= E1000_RXD_STAT_TCPCS;
 break;
diff --git a/hw/net/net_rx_pkt.c b/hw/net/net_rx_pkt.c
index 1de42b4f51..3575c8b9f9 100644
--- a/hw/net/net_rx_pkt.c
+++ b/hw/net/net_rx_pkt.c
@@ -16,6 +16,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/crc32c.h"
 #include "trace.h"
 #include "net_rx_pkt.h"
 #include "net/checksum.h"
@@ -554,32 +555,73 @@ _net_rx_pkt_calc_l4_csum(struct NetRxPkt *pkt)
 return csum;
 }
 
-bool net_rx_pkt_validate_l4_csum(struct NetRxPkt *pkt, bool *csum_valid)
+static bool
+_net_rx_pkt_validate_sctp_sum(struct NetRxPkt *pkt)
 {
-uint16_t csum;
+size_t csum_off;
+size_t off = pkt->l4hdr_off;
+size_t vec_len = pkt->vec_len;
+struct iovec *vec;
+uint32_t calculated = 0;
+uint32_t original;
+bool valid;
 
-trace_net_rx_pkt_l4_csum_validate_entry();
+for (vec = pkt->vec; vec->iov_len < off; vec++) {
+off -= vec->iov_len;
+vec_len--;
+}
 
-if (pkt->l4hdr_info.proto != ETH_L4_HDR_PROTO_TCP &&
-pkt->l4hdr_info.proto != ETH_L4_HDR_PROTO_UDP) {
-trace_net_rx_pkt_l4_csum_validate_not_xxp();
+csum_off = off + 8;
+
+if (!iov_to_buf(vec, vec_len, csum_off, &original, sizeof(original))) {
 return false;
 }
 
-if (pkt->l4hdr_info.proto == ETH_L4_HDR_PROTO_UDP &&
-pkt->l4hdr_info.hdr.udp.uh_sum == 0) {
-
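
For readers following the SCTP checksum path: SCTP uses CRC32c, and the patch declares iov_crc32c() to run it over a scatter-gather list. The sketch below is not the QEMU implementation; it is a minimal bitwise version under the assumption that the CRC state can simply be carried from one iovec element to the next. The sketch_ names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Feed len bytes into a raw (not yet finalized) CRC32c state.
 * 0x82F63B78 is the reflected Castagnoli polynomial. */
static uint32_t sketch_crc32c_update(uint32_t state, const uint8_t *p,
                                     size_t len)
{
    while (len--) {
        state ^= *p++;
        for (int k = 0; k < 8; k++) {
            /* Shift right; XOR in the polynomial if the low bit was set. */
            state = (state >> 1) ^ (UINT32_C(0x82F63B78) & -(state & 1u));
        }
    }
    return state;
}

/* CRC32c over every byte of a scatter-gather list, as if the
 * elements were one contiguous buffer. */
static uint32_t sketch_iov_crc32c(const struct iovec *iov, size_t iov_cnt)
{
    uint32_t state = UINT32_C(0xFFFFFFFF);   /* standard initial value */

    for (size_t i = 0; i < iov_cnt; i++) {
        state = sketch_crc32c_update(state, iov[i].iov_base, iov[i].iov_len);
    }
    return ~state;                           /* standard final inversion */
}
```

Because the state is only finalized once at the end, splitting a buffer across several iovec elements gives the same result as hashing it in one piece.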

[PATCH v5 26/48] igb: Fix igb_mac_reg_init coding style alignment

2023-05-22 Thread Akihiko Odaki

Signed-off-by: Akihiko Odaki 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/net/igb_core.c | 96 +++
 1 file changed, 48 insertions(+), 48 deletions(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 56a53872cf..20645c4764 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -4027,54 +4027,54 @@ static const uint32_t igb_mac_reg_init[] = {
 [VMOLR0 ... VMOLR0 + 7] = 0x2600 | E1000_VMOLR_STRCRC,
 [RPLOLR]= E1000_RPLOLR_STRCRC,
 [RLPML] = 0x2600,
-[TXCTL0]   = E1000_DCA_TXCTRL_DATA_RRO_EN |
- E1000_DCA_TXCTRL_TX_WB_RO_EN |
- E1000_DCA_TXCTRL_DESC_RRO_EN,
-[TXCTL1]   = E1000_DCA_TXCTRL_DATA_RRO_EN |
- E1000_DCA_TXCTRL_TX_WB_RO_EN |
- E1000_DCA_TXCTRL_DESC_RRO_EN,
-[TXCTL2]   = E1000_DCA_TXCTRL_DATA_RRO_EN |
- E1000_DCA_TXCTRL_TX_WB_RO_EN |
- E1000_DCA_TXCTRL_DESC_RRO_EN,
-[TXCTL3]   = E1000_DCA_TXCTRL_DATA_RRO_EN |
- E1000_DCA_TXCTRL_TX_WB_RO_EN |
- E1000_DCA_TXCTRL_DESC_RRO_EN,
-[TXCTL4]   = E1000_DCA_TXCTRL_DATA_RRO_EN |
- E1000_DCA_TXCTRL_TX_WB_RO_EN |
- E1000_DCA_TXCTRL_DESC_RRO_EN,
-[TXCTL5]   = E1000_DCA_TXCTRL_DATA_RRO_EN |
- E1000_DCA_TXCTRL_TX_WB_RO_EN |
- E1000_DCA_TXCTRL_DESC_RRO_EN,
-[TXCTL6]   = E1000_DCA_TXCTRL_DATA_RRO_EN |
- E1000_DCA_TXCTRL_TX_WB_RO_EN |
- E1000_DCA_TXCTRL_DESC_RRO_EN,
-[TXCTL7]   = E1000_DCA_TXCTRL_DATA_RRO_EN |
- E1000_DCA_TXCTRL_TX_WB_RO_EN |
- E1000_DCA_TXCTRL_DESC_RRO_EN,
-[TXCTL8]   = E1000_DCA_TXCTRL_DATA_RRO_EN |
- E1000_DCA_TXCTRL_TX_WB_RO_EN |
- E1000_DCA_TXCTRL_DESC_RRO_EN,
-[TXCTL9]   = E1000_DCA_TXCTRL_DATA_RRO_EN |
- E1000_DCA_TXCTRL_TX_WB_RO_EN |
- E1000_DCA_TXCTRL_DESC_RRO_EN,
-[TXCTL10]  = E1000_DCA_TXCTRL_DATA_RRO_EN |
- E1000_DCA_TXCTRL_TX_WB_RO_EN |
- E1000_DCA_TXCTRL_DESC_RRO_EN,
-[TXCTL11]  = E1000_DCA_TXCTRL_DATA_RRO_EN |
- E1000_DCA_TXCTRL_TX_WB_RO_EN |
- E1000_DCA_TXCTRL_DESC_RRO_EN,
-[TXCTL12]  = E1000_DCA_TXCTRL_DATA_RRO_EN |
- E1000_DCA_TXCTRL_TX_WB_RO_EN |
- E1000_DCA_TXCTRL_DESC_RRO_EN,
-[TXCTL13]  = E1000_DCA_TXCTRL_DATA_RRO_EN |
- E1000_DCA_TXCTRL_TX_WB_RO_EN |
- E1000_DCA_TXCTRL_DESC_RRO_EN,
-[TXCTL14]  = E1000_DCA_TXCTRL_DATA_RRO_EN |
- E1000_DCA_TXCTRL_TX_WB_RO_EN |
- E1000_DCA_TXCTRL_DESC_RRO_EN,
-[TXCTL15]  = E1000_DCA_TXCTRL_DATA_RRO_EN |
- E1000_DCA_TXCTRL_TX_WB_RO_EN |
- E1000_DCA_TXCTRL_DESC_RRO_EN,
+[TXCTL0]= E1000_DCA_TXCTRL_DATA_RRO_EN |
+  E1000_DCA_TXCTRL_TX_WB_RO_EN |
+  E1000_DCA_TXCTRL_DESC_RRO_EN,
+[TXCTL1]= E1000_DCA_TXCTRL_DATA_RRO_EN |
+  E1000_DCA_TXCTRL_TX_WB_RO_EN |
+  E1000_DCA_TXCTRL_DESC_RRO_EN,
+[TXCTL2]= E1000_DCA_TXCTRL_DATA_RRO_EN |
+  E1000_DCA_TXCTRL_TX_WB_RO_EN |
+  E1000_DCA_TXCTRL_DESC_RRO_EN,
+[TXCTL3]= E1000_DCA_TXCTRL_DATA_RRO_EN |
+  E1000_DCA_TXCTRL_TX_WB_RO_EN |
+  E1000_DCA_TXCTRL_DESC_RRO_EN,
+[TXCTL4]= E1000_DCA_TXCTRL_DATA_RRO_EN |
+  E1000_DCA_TXCTRL_TX_WB_RO_EN |
+  E1000_DCA_TXCTRL_DESC_RRO_EN,
+[TXCTL5]= E1000_DCA_TXCTRL_DATA_RRO_EN |
+  E1000_DCA_TXCTRL_TX_WB_RO_EN |
+  E1000_DCA_TXCTRL_DESC_RRO_EN,
+[TXCTL6]= E1000_DCA_TXCTRL_DATA_RRO_EN |
+  E1000_DCA_TXCTRL_TX_WB_RO_EN |
+  E1000_DCA_TXCTRL_DESC_RRO_EN,
+[TXCTL7]= E1000_DCA_TXCTRL_DATA_RRO_EN |
+  E1000_DCA_TXCTRL_TX_WB_RO_EN |
+  E1000_DCA_TXCTRL_DESC_RRO_EN,
+[TXCTL8]= E1000_DCA_TXCTRL_DATA_RRO_EN |
+  E1000_DCA_TXCTRL_TX_WB_RO_EN |
+  E1000_DCA_TXCTRL_DESC_RRO_EN,
+[TXCTL9]= E1000_DCA_TXCTRL_DATA_RRO_EN |
+  E1000_DCA_TXCTRL_TX_WB_RO_EN |
+  E1000_DCA_TXCTRL_DESC_RRO_EN,
+[TXCTL10]   = E1000_DCA_TXCTRL_DATA_RRO_EN |
+  E1000_DCA_TXCTRL_TX_WB_RO_EN |
+  E1000_DCA_TXCTRL_DESC_RRO_EN,
+[TXCTL11]   = E1000_DCA_TXCTRL_DATA_RRO_EN |
+  E1000_DCA_TXCTRL_TX_WB_RO_EN |
+  E1000_DCA_TXCTRL_DESC_RRO_EN,
+

[PATCH v5 38/48] igb: Strip the second VLAN tag for extended VLAN

2023-05-22 Thread Akihiko Odaki

Signed-off-by: Akihiko Odaki 
---
 hw/net/net_rx_pkt.h  | 19 
 include/net/eth.h|  4 ++--
 hw/net/e1000e_core.c |  3 ++-
 hw/net/igb_core.c| 14 ++--
 hw/net/net_rx_pkt.c  | 15 +
 net/eth.c| 52 
 6 files changed, 65 insertions(+), 42 deletions(-)

diff --git a/hw/net/net_rx_pkt.h b/hw/net/net_rx_pkt.h
index ce8dbdb284..55ec67a1a7 100644
--- a/hw/net/net_rx_pkt.h
+++ b/hw/net/net_rx_pkt.h
@@ -223,18 +223,19 @@ void net_rx_pkt_attach_iovec(struct NetRxPkt *pkt,
 /**
 * attach scatter-gather data to rx packet
 *
-* @pkt:packet
-* @iov:received data scatter-gather list
-* @iovcnt  number of elements in iov
-* @iovoff  data start offset in the iov
-* @strip_vlan: should the module strip vlan from data
-* @vet:VLAN tag Ethernet type
+* @pkt:  packet
+* @iov:  received data scatter-gather list
+* @iovcnt:   number of elements in iov
+* @iovoff:   data start offset in the iov
+* @strip_vlan_index: index of Q tag if it is to be stripped. negative otherwise.
+* @vet:  VLAN tag Ethernet type
+* @vet_ext:  outer VLAN tag Ethernet type
 *
 */
 void net_rx_pkt_attach_iovec_ex(struct NetRxPkt *pkt,
-   const struct iovec *iov, int iovcnt,
-   size_t iovoff, bool strip_vlan,
-   uint16_t vet);
+const struct iovec *iov, int iovcnt,
+size_t iovoff, int strip_vlan_index,
+uint16_t vet, uint16_t vet_ext);
 
 /**
  * attach data to rx packet
diff --git a/include/net/eth.h b/include/net/eth.h
index 75e7f1551c..3b80b6e07f 100644
--- a/include/net/eth.h
+++ b/include/net/eth.h
@@ -347,8 +347,8 @@ eth_strip_vlan(const struct iovec *iov, int iovcnt, size_t iovoff,
uint16_t *payload_offset, uint16_t *tci);
 
 size_t
-eth_strip_vlan_ex(const struct iovec *iov, int iovcnt, size_t iovoff,
-  uint16_t vet, void *new_ehdr_buf,
+eth_strip_vlan_ex(const struct iovec *iov, int iovcnt, size_t iovoff, int index,
+  uint16_t vet, uint16_t vet_ext, void *new_ehdr_buf,
   uint16_t *payload_offset, uint16_t *tci);
 
 uint16_t
diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 0b939ff5a3..d601386992 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -1711,7 +1711,8 @@ e1000e_receive_internal(E1000ECore *core, const struct iovec *iov, int iovcnt,
 }
 
 net_rx_pkt_attach_iovec_ex(core->rx_pkt, iov, iovcnt, iov_ofs,
-   e1000x_vlan_enabled(core->mac), core->mac[VET]);
+   e1000x_vlan_enabled(core->mac) ? 0 : -1,
+   core->mac[VET], 0);
 
 e1000e_rss_parse_packet(core, core->rx_pkt, &rss_info);
 e1000e_rx_ring_init(core, &rxr, rss_info.queue);
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 5eacf1cd8c..688eaf7319 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -1611,6 +1611,7 @@ igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
 E1000E_RxRing rxr;
 E1000E_RSSInfo rss_info;
 size_t total_size;
+int strip_vlan_index;
 int i;
 
 trace_e1000e_rx_receive_iov(iovcnt);
@@ -1672,9 +1673,18 @@ igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
 
 igb_rx_ring_init(core, &rxr, i);
 
+if (!igb_rx_strip_vlan(core, rxr.i)) {
+strip_vlan_index = -1;
+} else if (core->mac[CTRL_EXT] & BIT(26)) {
+strip_vlan_index = 1;
+} else {
+strip_vlan_index = 0;
+}
+
 net_rx_pkt_attach_iovec_ex(core->rx_pkt, iov, iovcnt, iov_ofs,
-   igb_rx_strip_vlan(core, rxr.i),
-   core->mac[VET] & 0xffff);
+   strip_vlan_index,
+   core->mac[VET] & 0xffff,
+   core->mac[VET] >> 16);
 
 total_size = net_rx_pkt_get_total_len(core->rx_pkt) +
 e1000x_fcs_len(core->mac);
diff --git a/hw/net/net_rx_pkt.c b/hw/net/net_rx_pkt.c
index 3575c8b9f9..32e5f3f9cf 100644
--- a/hw/net/net_rx_pkt.c
+++ b/hw/net/net_rx_pkt.c
@@ -137,20 +137,17 @@ void net_rx_pkt_attach_iovec(struct NetRxPkt *pkt,
 
 void net_rx_pkt_attach_iovec_ex(struct NetRxPkt *pkt,
 const struct iovec *iov, int iovcnt,
-size_t iovoff, bool strip_vlan,
-uint16_t vet)
+size_t iovoff, int strip_vlan_index,
+uint16_t vet, uint16_t vet_ext)
 {
 uint16_t tci = 0;
 uint16_t ploff = iovoff;
 assert(pkt);
 
-if (strip_vlan) {
-
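
The index selection in igb_receive_internal() can be read in isolation. The sketch below restates it as a pure function and splits a VET register value into its two EtherTypes. The sketch_ names are hypothetical, and the 0xffff mask for the inner EtherType is an assumption inferred from the `>> 16` used for the outer one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 26 of CTRL_EXT, which the patch tests to detect extended VLAN mode. */
#define SKETCH_CTRL_EXT_BIT26 (UINT32_C(1) << 26)

/* Returns -1 when nothing is stripped, 0 to strip the first tag,
 * 1 to strip the second tag (extended VLAN, outer tag kept). */
static int sketch_strip_vlan_index(bool strip_enabled, uint32_t ctrl_ext)
{
    if (!strip_enabled) {
        return -1;
    }
    return (ctrl_ext & SKETCH_CTRL_EXT_BIT26) ? 1 : 0;
}

/* Inner VLAN EtherType: assumed to live in the low 16 bits of VET. */
static uint16_t sketch_vet(uint32_t vet_reg)
{
    return (uint16_t)(vet_reg & 0xffff);
}

/* Outer VLAN EtherType: the high 16 bits of VET. */
static uint16_t sketch_vet_ext(uint32_t vet_reg)
{
    return (uint16_t)(vet_reg >> 16);
}
```

e1000e has no extended VLAN mode, which is why its caller only ever passes 0 or -1 and a vet_ext of 0.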

[PATCH v5 19/48] igb: Always log status after building rx metadata

2023-05-22 Thread Akihiko Odaki

Without this change, the status flags may not be traced e.g. if checksum
offloading is disabled.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/net/igb_core.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 209fdad862..946b917f91 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -1303,9 +1303,8 @@ igb_build_rx_metadata(IGBCore *core,
 trace_e1000e_rx_metadata_l4_cso_disabled();
 }
 
-trace_e1000e_rx_metadata_status_flags(*status_flags);
-
 func_exit:
+trace_e1000e_rx_metadata_status_flags(*status_flags);
 *status_flags = cpu_to_le32(*status_flags);
 }
 
-- 
2.40.1

[PATCH v5 20/48] igb: Remove goto

2023-05-22 Thread Akihiko Odaki

The goto is a bit confusing as it changes the control flow only if the
L4 protocol is not recognized. It also differs from e1000e, which adds
noise when comparing e1000e and igb.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Sriram Yagnaraman 
---
 hw/net/igb_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 946b917f91..bae51cbb63 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -1297,7 +1297,7 @@ igb_build_rx_metadata(IGBCore *core,
 break;
 
 default:
-goto func_exit;
+break;
 }
 } else {
 trace_e1000e_rx_metadata_l4_cso_disabled();
-- 
2.40.1

[PATCH v5 08/48] e1000e: Always copy ethernet header

2023-05-22 Thread Akihiko Odaki

e1000e_receive_internal() used to check the iov length to determine
whether to copy the iovs to a contiguous buffer, but the check is flawed
in two ways:
- It does not ensure that iovcnt > 0.
- It does not take the virtio-net header into consideration.

The size of this copy is just 18 octets, which can be even less than
the code size required for the checks. This (wrong) optimization is
probably not worth it, so just remove it.

Fixes: 6f3fbe4ed0 ("net: Introduce e1000e device emulation")
Signed-off-by: Akihiko Odaki 
---
 hw/net/e1000e_core.c | 26 ++
 1 file changed, 10 insertions(+), 16 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index c2d864a504..14b94db59c 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -1686,12 +1686,9 @@ static ssize_t
 e1000e_receive_internal(E1000ECore *core, const struct iovec *iov, int iovcnt,
 bool has_vnet)
 {
-static const int maximum_ethernet_hdr_len = (ETH_HLEN + 4);
-
 uint32_t n = 0;
-uint8_t min_buf[ETH_ZLEN];
+uint8_t buf[ETH_ZLEN];
 struct iovec min_iov;
-uint8_t *filter_buf;
 size_t size, orig_size;
 size_t iov_ofs = 0;
 E1000E_RxRing rxr;
@@ -1714,24 +1711,21 @@ e1000e_receive_internal(E1000ECore *core, const struct iovec *iov, int iovcnt,
 net_rx_pkt_unset_vhdr(core->rx_pkt);
 }
 
-filter_buf = iov->iov_base + iov_ofs;
 orig_size = iov_size(iov, iovcnt);
 size = orig_size - iov_ofs;
 
 /* Pad to minimum Ethernet frame length */
-if (size < sizeof(min_buf)) {
-iov_to_buf(iov, iovcnt, iov_ofs, min_buf, size);
-memset(&min_buf[size], 0, sizeof(min_buf) - size);
+if (size < sizeof(buf)) {
+iov_to_buf(iov, iovcnt, iov_ofs, buf, size);
+memset(&buf[size], 0, sizeof(buf) - size);
 e1000x_inc_reg_if_not_full(core->mac, RUC);
-min_iov.iov_base = filter_buf = min_buf;
-min_iov.iov_len = size = sizeof(min_buf);
+min_iov.iov_base = buf;
+min_iov.iov_len = size = sizeof(buf);
 iovcnt = 1;
 iov = &min_iov;
 iov_ofs = 0;
-} else if (iov->iov_len < maximum_ethernet_hdr_len) {
-/* This is very unlikely, but may happen. */
-iov_to_buf(iov, iovcnt, iov_ofs, min_buf, maximum_ethernet_hdr_len);
-filter_buf = min_buf;
+} else {
+iov_to_buf(iov, iovcnt, iov_ofs, buf, ETH_HLEN + 4);
 }
 
 /* Discard oversized packets if !LPE and !SBP. */
@@ -1740,9 +1734,9 @@ e1000e_receive_internal(E1000ECore *core, const struct iovec *iov, int iovcnt,
 }
 
 net_rx_pkt_set_packet_type(core->rx_pkt,
-get_eth_packet_type(PKT_GET_ETH_HDR(filter_buf)));
+get_eth_packet_type(PKT_GET_ETH_HDR(buf)));
 
-if (!e1000e_receive_filter(core, filter_buf, size)) {
+if (!e1000e_receive_filter(core, buf, size)) {
 trace_e1000e_rx_flt_dropped();
 return orig_size;
 }
-- 
2.40.1
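
To make the unconditional copy concrete: the Ethernet header may start at an offset (past a virtio-net header) and may straddle several iovec elements, so gathering it into a contiguous buffer avoids both flaws listed above. The helper below is a hypothetical stand-in for QEMU's iov_to_buf(), written only for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

/* Copy up to len bytes, starting off bytes into the scatter-gather
 * list, into buf. Returns the number of bytes actually copied, which
 * is less than len if the list is too short (including iovcnt == 0). */
static size_t sketch_iov_to_buf(const struct iovec *iov, int iovcnt,
                                size_t off, void *buf, size_t len)
{
    size_t done = 0;

    for (int i = 0; i < iovcnt && done < len; i++) {
        if (off >= iov[i].iov_len) {
            off -= iov[i].iov_len;      /* element lies before the offset */
            continue;
        }
        size_t n = iov[i].iov_len - off;
        if (n > len - done) {
            n = len - done;             /* do not overrun the destination */
        }
        memcpy((uint8_t *)buf + done,
               (const uint8_t *)iov[i].iov_base + off, n);
        done += n;
        off = 0;                        /* later elements start at byte 0 */
    }
    return done;
}
```

With such a helper the caller never dereferences iov[0] directly, so an empty list or a short first element is handled the same way as any other layout.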

[PATCH v5 16/48] e1000x: Take CRC into consideration for size check

2023-05-22 Thread Akihiko Odaki

Section 13.7.15 Receive Length Error Count says:
> Packets over 1522 bytes are oversized if LongPacketEnable is 0b
> (RCTL.LPE). If LongPacketEnable (LPE) is 1b, then an incoming packet
> is considered oversized if it exceeds 16384 bytes.

> These lengths are based on bytes in the received packet from
> <Destination Address> through <CRC>, inclusively.

As QEMU processes packets without the CRC, the number of CRC bytes
needs to be subtracted. This change adds some size definitions to eth.h,
which are used to derive the new size thresholds.

Signed-off-by: Akihiko Odaki 
---
 include/net/eth.h  |  2 ++
 hw/net/e1000x_common.c | 10 +-
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/include/net/eth.h b/include/net/eth.h
index e8af5742be..05f56931e7 100644
--- a/include/net/eth.h
+++ b/include/net/eth.h
@@ -32,6 +32,8 @@
 #define ETH_ALEN 6
 #define ETH_HLEN 14
 #define ETH_ZLEN 60 /* Min. octets in frame without FCS */
+#define ETH_FCS_LEN 4
+#define ETH_MTU 1500
 
 struct eth_header {
 uint8_t  h_dest[ETH_ALEN];   /* destination eth addr */
diff --git a/hw/net/e1000x_common.c b/hw/net/e1000x_common.c
index 6cc23138a8..212873fd77 100644
--- a/hw/net/e1000x_common.c
+++ b/hw/net/e1000x_common.c
@@ -140,16 +140,16 @@ bool e1000x_hw_rx_enabled(uint32_t *mac)
 
 bool e1000x_is_oversized(uint32_t *mac, size_t size)
 {
+size_t header_size = sizeof(struct eth_header) + sizeof(struct vlan_header);
 /* this is the size past which hardware will
drop packets when setting LPE=0 */
-static const int maximum_ethernet_vlan_size = 1522;
+size_t maximum_short_size = header_size + ETH_MTU;
 /* this is the size past which hardware will
drop packets when setting LPE=1 */
-static const int maximum_ethernet_lpe_size = 16 * KiB;
+size_t maximum_large_size = 16 * KiB - ETH_FCS_LEN;
 
-if ((size > maximum_ethernet_lpe_size ||
-(size > maximum_ethernet_vlan_size
-&& !(mac[RCTL] & E1000_RCTL_LPE)))
+if ((size > maximum_large_size ||
+(size > maximum_short_size && !(mac[RCTL] & E1000_RCTL_LPE)))
 && !(mac[RCTL] & E1000_RCTL_SBP)) {
 e1000x_inc_reg_if_not_full(mac, ROC);
 trace_e1000x_rx_oversized(size);
-- 
2.40.1
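
The arithmetic behind the new thresholds can be checked with a small standalone sketch. It assumes a 14-byte Ethernet header and a 4-byte VLAN header, which together with ETH_MTU give 1518 (1522 minus the 4-byte CRC); the sketch_ names are hypothetical and only mirror the condition in e1000x_is_oversized().

```c
#include <stdbool.h>
#include <stddef.h>

enum {
    SKETCH_ETH_HLEN    = 14,     /* destination + source + EtherType */
    SKETCH_VLAN_HLEN   = 4,      /* one 802.1Q tag (assumed size) */
    SKETCH_ETH_MTU     = 1500,
    SKETCH_ETH_FCS_LEN = 4,
    SKETCH_LPE_LIMIT   = 16384   /* 16 KiB, CRC included */
};

/* size is the frame length without the CRC, as QEMU sees it.
 * lpe and sbp stand for the RCTL.LPE and RCTL.SBP bits. */
static bool sketch_is_oversized(size_t size, bool lpe, bool sbp)
{
    /* 14 + 4 + 1500 = 1518 */
    size_t max_short = SKETCH_ETH_HLEN + SKETCH_VLAN_HLEN + SKETCH_ETH_MTU;
    /* 16384 - 4 = 16380 */
    size_t max_large = SKETCH_LPE_LIMIT - SKETCH_ETH_FCS_LEN;

    if (sbp) {
        return false;            /* "store bad packets" disables the drop */
    }
    return size > max_large || (size > max_short && !lpe);
}
```

A 1518-byte frame without CRC corresponds to the 1522 bytes quoted from the datasheet, so it is still accepted with LPE clear, while 1519 bytes is not.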

[PATCH v5 15/48] e1000x: Share more Rx filtering logic

2023-05-22 Thread Akihiko Odaki

This saves some code and enables tracepoint for e1000's VLAN filtering.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Sriram Yagnaraman 
---
 hw/net/e1000x_common.h |  4 +++-
 hw/net/e1000.c | 35 +--
 hw/net/e1000e_core.c   | 47 +-
 hw/net/e1000x_common.c | 44 +--
 hw/net/igb_core.c  | 41 +++-
 hw/net/trace-events|  4 ++--
 6 files changed, 56 insertions(+), 119 deletions(-)

diff --git a/hw/net/e1000x_common.h b/hw/net/e1000x_common.h
index 0298e06283..be291684de 100644
--- a/hw/net/e1000x_common.h
+++ b/hw/net/e1000x_common.h
@@ -107,7 +107,9 @@ bool e1000x_rx_ready(PCIDevice *d, uint32_t *mac);
 
 bool e1000x_is_vlan_packet(const void *buf, uint16_t vet);
 
-bool e1000x_rx_group_filter(uint32_t *mac, const uint8_t *buf);
+bool e1000x_rx_vlan_filter(uint32_t *mac, const struct vlan_header *vhdr);
+
+bool e1000x_rx_group_filter(uint32_t *mac, const struct eth_header *ehdr);
 
 bool e1000x_hw_rx_enabled(uint32_t *mac);
 
diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 18eb6d8876..aae5f0bdc0 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -804,36 +804,11 @@ start_xmit(E1000State *s)
 }
 
 static int
-receive_filter(E1000State *s, const uint8_t *buf, int size)
+receive_filter(E1000State *s, const void *buf)
 {
-uint32_t rctl = s->mac_reg[RCTL];
-int isbcast = is_broadcast_ether_addr(buf);
-int ismcast = is_multicast_ether_addr(buf);
-
-if (e1000x_is_vlan_packet(buf, le16_to_cpu(s->mac_reg[VET])) &&
-e1000x_vlan_rx_filter_enabled(s->mac_reg)) {
-uint16_t vid = lduw_be_p(&PKT_GET_VLAN_HDR(buf)->h_tci);
-uint32_t vfta =
-ldl_le_p((uint32_t *)(s->mac_reg + VFTA) +
- ((vid >> E1000_VFTA_ENTRY_SHIFT) & E1000_VFTA_ENTRY_MASK));
-if ((vfta & (1 << (vid & E1000_VFTA_ENTRY_BIT_SHIFT_MASK))) == 0) {
-return 0;
-}
-}
-
-if (!isbcast && !ismcast && (rctl & E1000_RCTL_UPE)) { /* promiscuous ucast */
-return 1;
-}
-
-if (ismcast && (rctl & E1000_RCTL_MPE)) {  /* promiscuous mcast */
-return 1;
-}
-
-if (isbcast && (rctl & E1000_RCTL_BAM)) {  /* broadcast enabled */
-return 1;
-}
-
-return e1000x_rx_group_filter(s->mac_reg, buf);
+return (!e1000x_is_vlan_packet(buf, s->mac_reg[VET]) ||
+e1000x_rx_vlan_filter(s->mac_reg, PKT_GET_VLAN_HDR(buf))) &&
+   e1000x_rx_group_filter(s->mac_reg, buf);
 }
 
 static void
@@ -949,7 +924,7 @@ e1000_receive_iov(NetClientState *nc, const struct iovec *iov, int iovcnt)
 return size;
 }
 
-if (!receive_filter(s, filter_buf, size)) {
+if (!receive_filter(s, filter_buf)) {
 return size;
 }
 
diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 14b94db59c..41d2435074 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -1034,48 +1034,11 @@ e1000e_rx_l4_cso_enabled(E1000ECore *core)
 }
 
 static bool
-e1000e_receive_filter(E1000ECore *core, const uint8_t *buf, int size)
+e1000e_receive_filter(E1000ECore *core, const void *buf)
 {
-uint32_t rctl = core->mac[RCTL];
-
-if (e1000x_is_vlan_packet(buf, core->mac[VET]) &&
-e1000x_vlan_rx_filter_enabled(core->mac)) {
-uint16_t vid = lduw_be_p(&PKT_GET_VLAN_HDR(buf)->h_tci);
-uint32_t vfta =
-ldl_le_p((uint32_t *)(core->mac + VFTA) +
- ((vid >> E1000_VFTA_ENTRY_SHIFT) & E1000_VFTA_ENTRY_MASK));
-if ((vfta & (1 << (vid & E1000_VFTA_ENTRY_BIT_SHIFT_MASK))) == 0) {
-trace_e1000e_rx_flt_vlan_mismatch(vid);
-return false;
-} else {
-trace_e1000e_rx_flt_vlan_match(vid);
-}
-}
-
-switch (net_rx_pkt_get_packet_type(core->rx_pkt)) {
-case ETH_PKT_UCAST:
-if (rctl & E1000_RCTL_UPE) {
-return true; /* promiscuous ucast */
-}
-break;
-
-case ETH_PKT_BCAST:
-if (rctl & E1000_RCTL_BAM) {
-return true; /* broadcast enabled */
-}
-break;
-
-case ETH_PKT_MCAST:
-if (rctl & E1000_RCTL_MPE) {
-return true; /* promiscuous mcast */
-}
-break;
-
-default:
-g_assert_not_reached();
-}
-
-return e1000x_rx_group_filter(core->mac, buf);
+return (!e1000x_is_vlan_packet(buf, core->mac[VET]) ||
+e1000x_rx_vlan_filter(core->mac, PKT_GET_VLAN_HDR(buf))) &&
+   e1000x_rx_group_filter(core->mac, buf);
 }
 
 static inline void
@@ -1736,7 +1699,7 @@ e1000e_receive_internal(E1000ECore *core, const struct iovec *iov, int iovcnt,
 net_rx_pkt_set_packet_type(core->rx_pkt,
 get_eth_packet_type(PKT_GET_ETH_HDR(buf)));
 
-if (!e1000e_receive_filter(core, buf, size)) {
+if (!e1000e_receive_filter(core, buf)) {
 trace_e1000e_rx_flt_dropped();
 return orig_size;
 }
diff
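
The VLAN filter that this patch moves into shared code is a bitmap lookup: the VLAN ID selects one 32-bit word of the VFTA table and one bit within it. The sketch below shows that lookup on a plain array. The shift and mask values (5, 0x7f, 0x1f) and the 128-entry table size are assumptions standing in for the E1000_VFTA_* constants, and the sketch_ names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_VFTA_ENTRY_SHIFT 5      /* 32 VLAN IDs per table word */
#define SKETCH_VFTA_ENTRY_MASK  0x7f   /* 128 table words */
#define SKETCH_VFTA_BIT_MASK    0x1f   /* bit position inside a word */

/* tci is the 16-bit tag control field in host byte order; only its
 * low 12 bits (the VLAN ID) take part in the lookup. */
static bool sketch_vlan_allowed(const uint32_t vfta[128], uint16_t tci)
{
    uint16_t vid = tci & 0x0fff;
    uint32_t word = vfta[(vid >> SKETCH_VFTA_ENTRY_SHIFT) &
                         SKETCH_VFTA_ENTRY_MASK];

    return (word & (UINT32_C(1) << (vid & SKETCH_VFTA_BIT_MASK))) != 0;
}
```

For example, VLAN ID 100 maps to word 3, bit 4, so setting that single bit admits tagged frames for VLAN 100 regardless of their priority bits.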

[PATCH v5 03/48] e1000x: Fix BPRC and MPRC

2023-05-22 Thread Akihiko Odaki

Before this change, e1000 and the common code updated BPRC and MPRC
depending on the matched filter, but e1000e and igb decided to update
those counters by deriving the packet type independently. This
inconsistency caused a multicast packet to be counted twice.

Updating BPRC and MPRC depending on the matched filter is fundamentally
flawed anyway, as a filter can be used for different types of packets.
For example, it is possible to filter broadcast packets with MTA.

Always determine what counters to update by inspecting the packets.

Fixes: 3b27430177 ("e1000: Implementing various counters")
Signed-off-by: Akihiko Odaki 
Reviewed-by: Sriram Yagnaraman 
---
 hw/net/e1000x_common.h |  5 +++--
 hw/net/e1000.c |  6 +++---
 hw/net/e1000e_core.c   | 20 +++-
 hw/net/e1000x_common.c | 25 +++--
 hw/net/igb_core.c  | 22 +-
 5 files changed, 33 insertions(+), 45 deletions(-)

diff --git a/hw/net/e1000x_common.h b/hw/net/e1000x_common.h
index 911abd8a90..0298e06283 100644
--- a/hw/net/e1000x_common.h
+++ b/hw/net/e1000x_common.h
@@ -91,8 +91,9 @@ e1000x_update_regs_on_link_up(uint32_t *mac, uint16_t *phy)
 }
 
 void e1000x_update_rx_total_stats(uint32_t *mac,
-  size_t data_size,
-  size_t data_fcs_size);
+  eth_pkt_types_e pkt_type,
+  size_t pkt_size,
+  size_t pkt_fcs_size);
 
 void e1000x_core_prepare_eeprom(uint16_t   *eeprom,
 const uint16_t *templ,
diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 59bacb5d3b..18eb6d8876 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -826,12 +826,10 @@ receive_filter(E1000State *s, const uint8_t *buf, int size)
 }
 
 if (ismcast && (rctl & E1000_RCTL_MPE)) {  /* promiscuous mcast */
-e1000x_inc_reg_if_not_full(s->mac_reg, MPRC);
 return 1;
 }
 
 if (isbcast && (rctl & E1000_RCTL_BAM)) {  /* broadcast enabled */
-e1000x_inc_reg_if_not_full(s->mac_reg, BPRC);
 return 1;
 }
 
@@ -922,6 +920,7 @@ e1000_receive_iov(NetClientState *nc, const struct iovec *iov, int iovcnt)
 size_t desc_offset;
 size_t desc_size;
 size_t total_size;
+eth_pkt_types_e pkt_type;
 
 if (!e1000x_hw_rx_enabled(s->mac_reg)) {
 return -1;
@@ -971,6 +970,7 @@ e1000_receive_iov(NetClientState *nc, const struct iovec *iov, int iovcnt)
 size -= 4;
 }
 
+pkt_type = get_eth_packet_type(PKT_GET_ETH_HDR(filter_buf));
 rdh_start = s->mac_reg[RDH];
 desc_offset = 0;
 total_size = size + e1000x_fcs_len(s->mac_reg);
@@ -1036,7 +1036,7 @@ e1000_receive_iov(NetClientState *nc, const struct iovec *iov, int iovcnt)
 }
 } while (desc_offset < total_size);
 
-e1000x_update_rx_total_stats(s->mac_reg, size, total_size);
+e1000x_update_rx_total_stats(s->mac_reg, pkt_type, size, total_size);
 
 n = E1000_ICS_RXT0;
 if ((rdt = s->mac_reg[RDT]) < s->mac_reg[RDH])
diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 15821a75e0..c2d864a504 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -1488,24 +1488,10 @@ e1000e_write_to_rx_buffers(E1000ECore *core,
 }
 
 static void
-e1000e_update_rx_stats(E1000ECore *core,
-   size_t data_size,
-   size_t data_fcs_size)
+e1000e_update_rx_stats(E1000ECore *core, size_t pkt_size, size_t pkt_fcs_size)
 {
-e1000x_update_rx_total_stats(core->mac, data_size, data_fcs_size);
-
-switch (net_rx_pkt_get_packet_type(core->rx_pkt)) {
-case ETH_PKT_BCAST:
-e1000x_inc_reg_if_not_full(core->mac, BPRC);
-break;
-
-case ETH_PKT_MCAST:
-e1000x_inc_reg_if_not_full(core->mac, MPRC);
-break;
-
-default:
-break;
-}
+eth_pkt_types_e pkt_type = net_rx_pkt_get_packet_type(core->rx_pkt);
+e1000x_update_rx_total_stats(core->mac, pkt_type, pkt_size, pkt_fcs_size);
 }
 
 static inline bool
diff --git a/hw/net/e1000x_common.c b/hw/net/e1000x_common.c
index 4c8e7dcf70..7694673bcc 100644
--- a/hw/net/e1000x_common.c
+++ b/hw/net/e1000x_common.c
@@ -80,7 +80,6 @@ bool e1000x_rx_group_filter(uint32_t *mac, const uint8_t *buf)
 f = mta_shift[(rctl >> E1000_RCTL_MO_SHIFT) & 3];
 f = (((buf[5] << 8) | buf[4]) >> f) & 0xfff;
 if (mac[MTA + (f >> 5)] & (1 << (f & 0x1f))) {
-e1000x_inc_reg_if_not_full(mac, MPRC);
 return true;
 }
 
@@ -212,13 +211,14 @@ e1000x_rxbufsize(uint32_t rctl)
 
 void
 e1000x_update_rx_total_stats(uint32_t *mac,
- size_t data_size,
- size_t data_fcs_size)
+ eth_pkt_types_e pkt_type,
+ size_t pkt_size,
+ size_t pkt_fcs_size)
 {
 static const int PRCregs[6] = { PRC64, PRC127, PRC255, PRC511,
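
The idea of counting by packet inspection rather than by matched filter can be sketched as follows. The packet type is derived from the destination MAC address alone, so a broadcast frame bumps the broadcast counter even if it was accepted through a multicast filter. All names here are hypothetical; the counter fields merely stand in for the BPRC and MPRC registers.

```c
#include <stdint.h>
#include <string.h>

typedef enum {
    SKETCH_PKT_UCAST,
    SKETCH_PKT_MCAST,
    SKETCH_PKT_BCAST
} sketch_pkt_type;

struct sketch_counters {
    uint32_t bprc;   /* broadcast packets received */
    uint32_t mprc;   /* multicast packets received */
};

/* Classify a frame by its 6-byte destination address. */
static sketch_pkt_type sketch_classify(const uint8_t dst[6])
{
    static const uint8_t bcast[6] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

    if (memcmp(dst, bcast, sizeof(bcast)) == 0) {
        return SKETCH_PKT_BCAST;
    }
    /* The group bit is the least significant bit of the first octet. */
    return (dst[0] & 1) ? SKETCH_PKT_MCAST : SKETCH_PKT_UCAST;
}

/* Update exactly one counter per received frame, based on its type. */
static void sketch_count_rx(struct sketch_counters *c, const uint8_t dst[6])
{
    switch (sketch_classify(dst)) {
    case SKETCH_PKT_BCAST:
        c->bprc++;
        break;
    case SKETCH_PKT_MCAST:
        c->mprc++;
        break;
    default:
        break;
    }
}
```

Since the update happens in one place and depends only on the frame, a multicast packet cannot be counted twice.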

[PATCH v5 30/48] net/eth: Use void pointers

2023-05-22 Thread Akihiko Odaki

The uses of uint8_t pointers were misleading as the buffers are never
accessed as arrays of octets, and accessing them as struct eth_header
even requires stricter alignment.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Philippe Mathieu-Daudé 
---
 include/net/eth.h | 4 ++--
 net/eth.c | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/net/eth.h b/include/net/eth.h
index 05f56931e7..95ff24d6b8 100644
--- a/include/net/eth.h
+++ b/include/net/eth.h
@@ -342,12 +342,12 @@ eth_get_pkt_tci(const void *p)
 
 size_t
 eth_strip_vlan(const struct iovec *iov, int iovcnt, size_t iovoff,
-   uint8_t *new_ehdr_buf,
+   void *new_ehdr_buf,
uint16_t *payload_offset, uint16_t *tci);
 
 size_t
 eth_strip_vlan_ex(const struct iovec *iov, int iovcnt, size_t iovoff,
-  uint16_t vet, uint8_t *new_ehdr_buf,
+  uint16_t vet, void *new_ehdr_buf,
   uint16_t *payload_offset, uint16_t *tci);
 
 uint16_t
diff --git a/net/eth.c b/net/eth.c
index b6ff89c460..f7ffbda600 100644
--- a/net/eth.c
+++ b/net/eth.c
@@ -226,11 +226,11 @@ void eth_get_protocols(const struct iovec *iov, size_t iovcnt, size_t iovoff,
 
 size_t
 eth_strip_vlan(const struct iovec *iov, int iovcnt, size_t iovoff,
-   uint8_t *new_ehdr_buf,
+   void *new_ehdr_buf,
uint16_t *payload_offset, uint16_t *tci)
 {
 struct vlan_header vlan_hdr;
-struct eth_header *new_ehdr = (struct eth_header *) new_ehdr_buf;
+struct eth_header *new_ehdr = new_ehdr_buf;
 
 size_t copied = iov_to_buf(iov, iovcnt, iovoff,
new_ehdr, sizeof(*new_ehdr));
@@ -276,7 +276,7 @@ eth_strip_vlan(const struct iovec *iov, int iovcnt, size_t iovoff,
 
 size_t
 eth_strip_vlan_ex(const struct iovec *iov, int iovcnt, size_t iovoff,
-  uint16_t vet, uint8_t *new_ehdr_buf,
+  uint16_t vet, void *new_ehdr_buf,
   uint16_t *payload_offset, uint16_t *tci)
 {
 struct vlan_header vlan_hdr;
-- 
2.40.1
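
A short illustration of why the void pointer reads better here: in C a void pointer converts to a struct pointer without a cast, and the signature no longer suggests that the caller may pass an arbitrarily aligned byte array. The struct and function below are hypothetical and only mimic the shape of the real code.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative layout only; not the QEMU struct eth_header. */
struct sketch_eth_header {
    uint8_t  h_dest[6];
    uint8_t  h_source[6];
    uint16_t h_proto;
};

/* The caller must pass storage that is suitably aligned for
 * struct sketch_eth_header; a void * parameter makes no promise
 * that byte alignment is enough. */
static void sketch_fill_header(void *new_ehdr_buf, uint16_t proto)
{
    struct sketch_eth_header *ehdr = new_ehdr_buf;   /* no cast needed */

    memset(ehdr, 0, sizeof(*ehdr));
    ehdr->h_proto = proto;
}
```

Passing the address of a real struct object, as in `sketch_fill_header(&hdr, proto)`, satisfies the alignment requirement by construction.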

[PATCH v5 17/48] e1000x: Rename TcpIpv6 into TcpIpv6Ex

2023-05-22 Thread Akihiko Odaki

e1000e and igb employ NetPktRssIpV6TcpEx for the RSS hash if the TcpIpv6
MRQC bit is set. Moreover, igb also has an MRQC bit for NetPktRssIpV6Tcp,
though it is not implemented yet. Rename it to TcpIpv6Ex to avoid
confusion.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Sriram Yagnaraman 
---
 hw/net/e1000x_regs.h | 24 
 hw/net/e1000e_core.c |  8 
 hw/net/igb_core.c|  8 
 hw/net/trace-events  |  2 +-
 4 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/hw/net/e1000x_regs.h b/hw/net/e1000x_regs.h
index 6d3c4c6d3a..13760c66d3 100644
--- a/hw/net/e1000x_regs.h
+++ b/hw/net/e1000x_regs.h
@@ -290,18 +290,18 @@
 #define E1000_RETA_IDX(hash)((hash) & (BIT(7) - 1))
 #define E1000_RETA_VAL(reta, hash)  (((uint8_t *)(reta))[E1000_RETA_IDX(hash)])
 
-#define E1000_MRQC_EN_TCPIPV4(mrqc) ((mrqc) & BIT(16))
-#define E1000_MRQC_EN_IPV4(mrqc)((mrqc) & BIT(17))
-#define E1000_MRQC_EN_TCPIPV6(mrqc) ((mrqc) & BIT(18))
-#define E1000_MRQC_EN_IPV6EX(mrqc)  ((mrqc) & BIT(19))
-#define E1000_MRQC_EN_IPV6(mrqc)((mrqc) & BIT(20))
-
-#define E1000_MRQ_RSS_TYPE_NONE (0)
-#define E1000_MRQ_RSS_TYPE_IPV4TCP  (1)
-#define E1000_MRQ_RSS_TYPE_IPV4 (2)
-#define E1000_MRQ_RSS_TYPE_IPV6TCP  (3)
-#define E1000_MRQ_RSS_TYPE_IPV6EX   (4)
-#define E1000_MRQ_RSS_TYPE_IPV6 (5)
+#define E1000_MRQC_EN_TCPIPV4(mrqc)   ((mrqc) & BIT(16))
+#define E1000_MRQC_EN_IPV4(mrqc)  ((mrqc) & BIT(17))
+#define E1000_MRQC_EN_TCPIPV6EX(mrqc) ((mrqc) & BIT(18))
+#define E1000_MRQC_EN_IPV6EX(mrqc)((mrqc) & BIT(19))
+#define E1000_MRQC_EN_IPV6(mrqc)  ((mrqc) & BIT(20))
+
+#define E1000_MRQ_RSS_TYPE_NONE   (0)
+#define E1000_MRQ_RSS_TYPE_IPV4TCP(1)
+#define E1000_MRQ_RSS_TYPE_IPV4   (2)
+#define E1000_MRQ_RSS_TYPE_IPV6TCPEX  (3)
+#define E1000_MRQ_RSS_TYPE_IPV6EX (4)
+#define E1000_MRQ_RSS_TYPE_IPV6   (5)
 
 #define E1000_ICR_ASSERTED BIT(31)
 #define E1000_EIAC_MASK0x01F0
diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 41d2435074..38d465a203 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -537,7 +537,7 @@ e1000e_rss_get_hash_type(E1000ECore *core, struct NetRxPkt *pkt)
 ip6info->rss_ex_dst_valid,
 ip6info->rss_ex_src_valid,
 core->mac[MRQC],
-E1000_MRQC_EN_TCPIPV6(core->mac[MRQC]),
+E1000_MRQC_EN_TCPIPV6EX(core->mac[MRQC]),
 E1000_MRQC_EN_IPV6EX(core->mac[MRQC]),
 E1000_MRQC_EN_IPV6(core->mac[MRQC]));
 
@@ -546,8 +546,8 @@ e1000e_rss_get_hash_type(E1000ECore *core, struct NetRxPkt *pkt)
   ip6info->rss_ex_src_valid))) {
 
 if (l4hdr_proto == ETH_L4_HDR_PROTO_TCP &&
-E1000_MRQC_EN_TCPIPV6(core->mac[MRQC])) {
-return E1000_MRQ_RSS_TYPE_IPV6TCP;
+E1000_MRQC_EN_TCPIPV6EX(core->mac[MRQC])) {
+return E1000_MRQ_RSS_TYPE_IPV6TCPEX;
 }
 
 if (E1000_MRQC_EN_IPV6EX(core->mac[MRQC])) {
@@ -581,7 +581,7 @@ e1000e_rss_calc_hash(E1000ECore *core,
 case E1000_MRQ_RSS_TYPE_IPV4TCP:
 type = NetPktRssIpV4Tcp;
 break;
-case E1000_MRQ_RSS_TYPE_IPV6TCP:
+case E1000_MRQ_RSS_TYPE_IPV6TCPEX:
 type = NetPktRssIpV6TcpEx;
 break;
 case E1000_MRQ_RSS_TYPE_IPV6:
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 934db3c3e5..209fdad862 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -301,7 +301,7 @@ igb_rss_get_hash_type(IGBCore *core, struct NetRxPkt *pkt)
 ip6info->rss_ex_dst_valid,
 ip6info->rss_ex_src_valid,
 core->mac[MRQC],
-E1000_MRQC_EN_TCPIPV6(core->mac[MRQC]),
+E1000_MRQC_EN_TCPIPV6EX(core->mac[MRQC]),
 E1000_MRQC_EN_IPV6EX(core->mac[MRQC]),
 E1000_MRQC_EN_IPV6(core->mac[MRQC]));
 
@@ -310,8 +310,8 @@ igb_rss_get_hash_type(IGBCore *core, struct NetRxPkt *pkt)
   ip6info->rss_ex_src_valid))) {
 
 if (l4hdr_proto == ETH_L4_HDR_PROTO_TCP &&
-E1000_MRQC_EN_TCPIPV6(core->mac[MRQC])) {
-return E1000_MRQ_RSS_TYPE_IPV6TCP;
+E1000_MRQC_EN_TCPIPV6EX(core->mac[MRQC])) {
+return E1000_MRQ_RSS_TYPE_IPV6TCPEX;
 }
 
 if (E1000_MRQC_EN_IPV6EX(core->mac[MRQC])) {
@@ -343,7 +343,7 @@ igb_rss_calc_hash(IGBCore *core, struct NetRxPkt *pkt, E1000E_RSSInfo *info)
 case E1000_MRQ_RSS_TYPE_IPV4TCP:
 type = NetPktRssIpV4Tcp;
 break;
-case E1000_MRQ_RSS_TYPE_IPV6TCP:
+case E1000_MRQ_RSS_TYPE_IPV6TCPEX:
 type = NetPktRssIpV6TcpEx;
 break;
 case

[PATCH v5 13/48] hw/net/net_tx_pkt: Remove net_rx_pkt_get_l4_info

2023-05-22 Thread Akihiko Odaki

This function is not used.

Signed-off-by: Akihiko Odaki 
---
 hw/net/net_rx_pkt.h | 9 -
 hw/net/net_rx_pkt.c | 5 -
 2 files changed, 14 deletions(-)

diff --git a/hw/net/net_rx_pkt.h b/hw/net/net_rx_pkt.h
index a06f5c2675..ce8dbdb284 100644
--- a/hw/net/net_rx_pkt.h
+++ b/hw/net/net_rx_pkt.h
@@ -119,15 +119,6 @@ eth_ip6_hdr_info *net_rx_pkt_get_ip6_info(struct NetRxPkt 
*pkt);
  */
 eth_ip4_hdr_info *net_rx_pkt_get_ip4_info(struct NetRxPkt *pkt);
 
-/**
- * fetches L4 header analysis results
- *
- * Return:  pointer to analysis results structure which is stored in internal
- *  packet area.
- *
- */
-eth_l4_hdr_info *net_rx_pkt_get_l4_info(struct NetRxPkt *pkt);
-
 typedef enum {
 NetPktRssIpV4,
 NetPktRssIpV4Tcp,
diff --git a/hw/net/net_rx_pkt.c b/hw/net/net_rx_pkt.c
index 63be6e05ad..6125a063d7 100644
--- a/hw/net/net_rx_pkt.c
+++ b/hw/net/net_rx_pkt.c
@@ -236,11 +236,6 @@ eth_ip4_hdr_info *net_rx_pkt_get_ip4_info(struct NetRxPkt 
*pkt)
 return &pkt->ip4hdr_info;
 }
 
-eth_l4_hdr_info *net_rx_pkt_get_l4_info(struct NetRxPkt *pkt)
-{
-return &pkt->l4hdr_info;
-}
-
 static inline void
 _net_rx_rss_add_chunk(uint8_t *rss_input, size_t *bytes_written,
   void *ptr, size_t size)
-- 
2.40.1

[PATCH v5 01/48] hw/net/net_tx_pkt: Decouple implementation from PCI

2023-05-22 Thread Akihiko Odaki

This is intended to be followed by another change for the interface.
It also fixes the leak of memory mapping when the specified memory is
partially mapped.

Fixes: e263cd49c7 ("Packet abstraction for VMWARE network devices")
Signed-off-by: Akihiko Odaki 
---
 hw/net/net_tx_pkt.h |  9 
 hw/net/net_tx_pkt.c | 53 -
 2 files changed, 42 insertions(+), 20 deletions(-)

diff --git a/hw/net/net_tx_pkt.h b/hw/net/net_tx_pkt.h
index e5ce6f20bc..5eb123ef90 100644
--- a/hw/net/net_tx_pkt.h
+++ b/hw/net/net_tx_pkt.h
@@ -153,6 +153,15 @@ void net_tx_pkt_dump(struct NetTxPkt *pkt);
  */
 void net_tx_pkt_reset(struct NetTxPkt *pkt, PCIDevice *dev);
 
+/**
+ * Unmap a fragment mapped from a PCI device.
+ *
+ * @context:PCI device owning fragment
+ * @base:   pointer to fragment
+ * @len:length of fragment
+ */
+void net_tx_pkt_unmap_frag_pci(void *context, void *base, size_t len);
+
 /**
  * Send packet to qemu. handles sw offloads if vhdr is not supported.
  *
diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
index 8dc8568ba2..aca12ff035 100644
--- a/hw/net/net_tx_pkt.c
+++ b/hw/net/net_tx_pkt.c
@@ -384,10 +384,9 @@ void net_tx_pkt_setup_vlan_header_ex(struct NetTxPkt *pkt,
 }
 }
 
-bool net_tx_pkt_add_raw_fragment(struct NetTxPkt *pkt, hwaddr pa,
-size_t len)
+static bool net_tx_pkt_add_raw_fragment_common(struct NetTxPkt *pkt,
+   void *base, size_t len)
 {
-hwaddr mapped_len = 0;
 struct iovec *ventry;
 assert(pkt);
 
@@ -395,23 +394,12 @@ bool net_tx_pkt_add_raw_fragment(struct NetTxPkt *pkt, 
hwaddr pa,
 return false;
 }
 
-if (!len) {
-return true;
- }
-
 ventry = &pkt->raw[pkt->raw_frags];
-mapped_len = len;
+ventry->iov_base = base;
+ventry->iov_len = len;
+pkt->raw_frags++;
 
-ventry->iov_base = pci_dma_map(pkt->pci_dev, pa,
-   &mapped_len, DMA_DIRECTION_TO_DEVICE);
-
-if ((ventry->iov_base != NULL) && (len == mapped_len)) {
-ventry->iov_len = mapped_len;
-pkt->raw_frags++;
-return true;
-} else {
-return false;
-}
+return true;
 }
 
 bool net_tx_pkt_has_fragments(struct NetTxPkt *pkt)
@@ -465,8 +453,9 @@ void net_tx_pkt_reset(struct NetTxPkt *pkt, PCIDevice 
*pci_dev)
 assert(pkt->raw);
 for (i = 0; i < pkt->raw_frags; i++) {
 assert(pkt->raw[i].iov_base);
-pci_dma_unmap(pkt->pci_dev, pkt->raw[i].iov_base,
-  pkt->raw[i].iov_len, DMA_DIRECTION_TO_DEVICE, 0);
+net_tx_pkt_unmap_frag_pci(pkt->pci_dev,
+  pkt->raw[i].iov_base,
+  pkt->raw[i].iov_len);
 }
 }
 pkt->pci_dev = pci_dev;
@@ -476,6 +465,30 @@ void net_tx_pkt_reset(struct NetTxPkt *pkt, PCIDevice 
*pci_dev)
 pkt->l4proto = 0;
 }
 
+void net_tx_pkt_unmap_frag_pci(void *context, void *base, size_t len)
+{
+pci_dma_unmap(context, base, len, DMA_DIRECTION_TO_DEVICE, 0);
+}
+
+bool net_tx_pkt_add_raw_fragment(struct NetTxPkt *pkt, hwaddr pa,
+size_t len)
+{
+dma_addr_t mapped_len = len;
+void *base = pci_dma_map(pkt->pci_dev, pa, &mapped_len,
+ DMA_DIRECTION_TO_DEVICE);
+if (!base) {
+return false;
+}
+
+if (mapped_len != len ||
+!net_tx_pkt_add_raw_fragment_common(pkt, base, len)) {
+net_tx_pkt_unmap_frag_pci(pkt->pci_dev, base, mapped_len);
+return false;
+}
+
+return true;
+}
+
 static void net_tx_pkt_do_sw_csum(struct NetTxPkt *pkt,
   struct iovec *iov, uint32_t iov_len,
   uint16_t csl)
-- 
2.40.1
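
Editorial note: the leak fix in the patch above comes down to one rule. Once
a mapping call has returned a non-NULL pointer, every failure path has to
unmap it, including the case where the mapping is shorter than requested.
The following is a minimal sketch of that rule under stated assumptions; it
does not use QEMU's DMA API, and fake_map(), fake_unmap() and add_fragment()
are hypothetical stand-ins invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mapper state: counts live mappings so a leak is observable. */
static int live_mappings;
static unsigned char backing[64];

/*
 * Hypothetical stand-in for a DMA map call: maps at most `limit` bytes and
 * reports the mapped length through *len (an in/out length argument).
 */
static void *fake_map(size_t limit, size_t *len)
{
    if (limit == 0) {
        return NULL;            /* nothing could be mapped */
    }
    if (*len > limit) {
        *len = limit;           /* partial mapping */
    }
    live_mappings++;
    return backing;
}

static void fake_unmap(void *base, size_t len)
{
    (void)len;
    assert(base == backing);
    live_mappings--;
}

/* Accept a fragment only if it was mapped in full; never leak a mapping. */
static bool add_fragment(size_t limit, size_t len)
{
    size_t mapped_len = len;
    void *base = fake_map(limit, &mapped_len);

    if (!base) {
        return false;           /* nothing was mapped, nothing to undo */
    }
    if (mapped_len != len) {
        fake_unmap(base, mapped_len);   /* undo the partial mapping */
        return false;
    }
    return true;                /* the caller now owns the mapping */
}
```

In this sketch a partially mapped fragment leaves live_mappings unchanged,
which is the property the old code did not guarantee.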

[PATCH v5 10/48] Fix references to igb Avocado test

2023-05-22 Thread Akihiko Odaki

Fixes: 9f95111474 ("tests/avocado: re-factor igb test to avoid timeouts")
Signed-off-by: Akihiko Odaki 
Reviewed-by: Philippe Mathieu-Daudé 
---
 MAINTAINERS| 2 +-
 docs/system/devices/igb.rst| 2 +-
 scripts/ci/org.centos/stream/8/x86_64/test-avocado | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index ef45b5e71e..c31d2279ab 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2256,7 +2256,7 @@ R: Sriram Yagnaraman 
 S: Maintained
 F: docs/system/devices/igb.rst
 F: hw/net/igb*
-F: tests/avocado/igb.py
+F: tests/avocado/netdev-ethtool.py
 F: tests/qtest/igb-test.c
 F: tests/qtest/libqos/igb.c
 
diff --git a/docs/system/devices/igb.rst b/docs/system/devices/igb.rst
index 70edadd574..afe036dad2 100644
--- a/docs/system/devices/igb.rst
+++ b/docs/system/devices/igb.rst
@@ -60,7 +60,7 @@ Avocado test and can be ran with the following command:
 
 .. code:: shell
 
-  make check-avocado AVOCADO_TESTS=tests/avocado/igb.py
+  make check-avocado AVOCADO_TESTS=tests/avocado/netdev-ethtool.py
 
 References
 ==========
diff --git a/scripts/ci/org.centos/stream/8/x86_64/test-avocado 
b/scripts/ci/org.centos/stream/8/x86_64/test-avocado
index d2c0e5fb4c..a1aa601ee3 100755
--- a/scripts/ci/org.centos/stream/8/x86_64/test-avocado
+++ b/scripts/ci/org.centos/stream/8/x86_64/test-avocado
@@ -30,7 +30,7 @@ make get-vm-images
 tests/avocado/cpu_queries.py:QueryCPUModelExpansion.test \
 tests/avocado/empty_cpu_model.py:EmptyCPUModel.test \
 tests/avocado/hotplug_cpu.py:HotPlugCPU.test \
-tests/avocado/igb.py:IGB.test \
+tests/avocado/netdev-ethtool.py:NetDevEthtool.test_igb_nomsi \
 tests/avocado/info_usernet.py:InfoUsernet.test_hostfwd \
 tests/avocado/intel_iommu.py:IntelIOMMU.test_intel_iommu \
 tests/avocado/intel_iommu.py:IntelIOMMU.test_intel_iommu_pt \
-- 
2.40.1

[PATCH v5 09/48] igb: Always copy ethernet header

2023-05-22 Thread Akihiko Odaki

igb_receive_internal() used to check the iov length to determine
whether to copy the iovs to a contiguous buffer, but the check is flawed
in two ways:
- It does not ensure that iovcnt > 0.
- It does not take virtio-net header into consideration.

The size of this copy is just 22 octets, which can be even less than
the code size required for checks. This (wrong) optimization is probably
not worth keeping, so just remove it. Removing this also allows igb to
assume aligned accesses for the ethernet header.

Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki 
Reviewed-by: Sriram Yagnaraman 
---
 hw/net/igb_core.c | 43 +++
 1 file changed, 23 insertions(+), 20 deletions(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 21a8d9ada4..1123df9e77 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -67,6 +67,11 @@ typedef struct IGBTxPktVmdqCallbackContext {
 NetClientState *nc;
 } IGBTxPktVmdqCallbackContext;
 
+typedef struct L2Header {
+struct eth_header eth;
+struct vlan_header vlan;
+} L2Header;
+
 static ssize_t
 igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
  bool has_vnet, bool *external_tx);
@@ -961,15 +966,16 @@ igb_rx_is_oversized(IGBCore *core, uint16_t qn, size_t 
size)
 return size > (lpe ? max_ethernet_lpe_size : max_ethernet_vlan_size);
 }
 
-static uint16_t igb_receive_assign(IGBCore *core, const struct eth_header 
*ehdr,
+static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
size_t size, E1000E_RSSInfo *rss_info,
bool *external_tx)
 {
 static const int ta_shift[] = { 4, 3, 2, 0 };
+const struct eth_header *ehdr = &l2_header->eth;
 uint32_t f, ra[2], *macp, rctl = core->mac[RCTL];
 uint16_t queues = 0;
 uint16_t oversized = 0;
-uint16_t vid = lduw_be_p(&PKT_GET_VLAN_HDR(ehdr)->h_tci) & VLAN_VID_MASK;
+uint16_t vid = be16_to_cpu(l2_header->vlan.h_tci) & VLAN_VID_MASK;
 bool accepted = false;
 int i;
 
@@ -1590,14 +1596,13 @@ static ssize_t
 igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
  bool has_vnet, bool *external_tx)
 {
-static const int maximum_ethernet_hdr_len = (ETH_HLEN + 4);
-
 uint16_t queues = 0;
 uint32_t n = 0;
-uint8_t min_buf[ETH_ZLEN];
+union {
+L2Header l2_header;
+uint8_t octets[ETH_ZLEN];
+} buf;
 struct iovec min_iov;
-struct eth_header *ehdr;
-uint8_t *filter_buf;
 size_t size, orig_size;
 size_t iov_ofs = 0;
 E1000E_RxRing rxr;
@@ -1623,24 +1628,21 @@ igb_receive_internal(IGBCore *core, const struct iovec 
*iov, int iovcnt,
 net_rx_pkt_unset_vhdr(core->rx_pkt);
 }
 
-filter_buf = iov->iov_base + iov_ofs;
 orig_size = iov_size(iov, iovcnt);
 size = orig_size - iov_ofs;
 
 /* Pad to minimum Ethernet frame length */
-if (size < sizeof(min_buf)) {
-iov_to_buf(iov, iovcnt, iov_ofs, min_buf, size);
-memset(&min_buf[size], 0, sizeof(min_buf) - size);
+if (size < sizeof(buf)) {
+iov_to_buf(iov, iovcnt, iov_ofs, &buf, size);
+memset(&buf.octets[size], 0, sizeof(buf) - size);
 e1000x_inc_reg_if_not_full(core->mac, RUC);
-min_iov.iov_base = filter_buf = min_buf;
-min_iov.iov_len = size = sizeof(min_buf);
+min_iov.iov_base = &buf;
+min_iov.iov_len = size = sizeof(buf);
 iovcnt = 1;
 iov = &min_iov;
 iov_ofs = 0;
-} else if (iov->iov_len < maximum_ethernet_hdr_len) {
-/* This is very unlikely, but may happen. */
-iov_to_buf(iov, iovcnt, iov_ofs, min_buf, maximum_ethernet_hdr_len);
-filter_buf = min_buf;
+} else {
+iov_to_buf(iov, iovcnt, iov_ofs, &buf, sizeof(buf.l2_header));
 }
 
 /* Discard oversized packets if !LPE and !SBP. */
@@ -1648,11 +1650,12 @@ igb_receive_internal(IGBCore *core, const struct iovec 
*iov, int iovcnt,
 return orig_size;
 }
 
-ehdr = PKT_GET_ETH_HDR(filter_buf);
-net_rx_pkt_set_packet_type(core->rx_pkt, get_eth_packet_type(ehdr));
+net_rx_pkt_set_packet_type(core->rx_pkt,
+   get_eth_packet_type(&buf.l2_header.eth));
 net_rx_pkt_set_protocols(core->rx_pkt, iov, iovcnt, iov_ofs);
 
-queues = igb_receive_assign(core, ehdr, size, &rss_info, external_tx);
+queues = igb_receive_assign(core, &buf.l2_header, size,
+&rss_info, external_tx);
 if (!queues) {
 trace_e1000e_rx_flt_dropped();
 return orig_size;
-- 
2.40.1
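
Editorial note: the idea of the patch above is to always gather the L2
header from the scattered input into one aligned buffer, and to zero-pad
short frames in that same buffer. The sketch below illustrates this with
plain POSIX struct iovec. It is not QEMU code: gather() is a hypothetical
stand-in for iov_to_buf(), and SketchL2Header, SketchBuf, MIN_FRAME_LEN and
copy_l2_header() are names assumed for this example only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

#define MIN_FRAME_LEN 60        /* assumed minimum frame length, no CRC */

/* Hypothetical L2 header: Ethernet header followed by one VLAN tag. */
typedef struct SketchL2Header {
    uint8_t dst[6];
    uint8_t src[6];
    uint16_t proto;
    uint16_t tci;
    uint16_t inner_proto;
} SketchL2Header;

/* The union gives the header view and the byte view the same storage. */
typedef union SketchBuf {
    SketchL2Header l2_header;
    uint8_t octets[MIN_FRAME_LEN];
} SketchBuf;

/* Copy up to `len` bytes starting at offset `ofs` of a scattered buffer. */
static size_t gather(const struct iovec *iov, int iovcnt, size_t ofs,
                     void *dst, size_t len)
{
    uint8_t *out = dst;
    size_t done = 0;

    for (int i = 0; i < iovcnt && done < len; i++) {
        if (ofs >= iov[i].iov_len) {
            ofs -= iov[i].iov_len;      /* skip this element entirely */
            continue;
        }
        size_t n = iov[i].iov_len - ofs;
        if (n > len - done) {
            n = len - done;
        }
        memcpy(out + done, (const uint8_t *)iov[i].iov_base + ofs, n);
        done += n;
        ofs = 0;
    }
    return done;
}

/* Always copy the header; zero-pad short frames. Returns the frame size. */
static size_t copy_l2_header(const struct iovec *iov, int iovcnt,
                             size_t size, SketchBuf *buf)
{
    if (size < sizeof(*buf)) {
        gather(iov, iovcnt, 0, buf, size);
        memset(&buf->octets[size], 0, sizeof(*buf) - size);
        return sizeof(*buf);
    }
    gather(iov, iovcnt, 0, buf, sizeof(buf->l2_header));
    return size;
}
```

Because the copy is unconditional, the sketch needs no check on the length
of the first iovec element, which is where the removed check went wrong.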

[PATCH v5 00/48] igb: Fix for DPDK

2023-05-22 Thread Akihiko Odaki

Based-on: <366bbcafdb6e0373f0deb105153768a8c0bded87.ca...@gmail.com>
("[PATCH 0/1] e1000e: Fix tx/rx counters")

This series has fixes and feature additions to pass DPDK Test Suite with
igb. It also includes a few minor changes related to networking.

Patches [01, 10] are bug fixes.
Patches [11, 14] delete code which is unnecessary.
Patches [15, 33] are minor changes.
Patches [34, 46] implement new features.
Patches [47, 48] update documentation.

Although this series includes many patches, it is not necessary to land them
all at once. For example, the bug fix patches alone may be applied first.

V4 -> V5:
- Fixed L2 packet type bit location.

V3 -> V4:
- Renamed "min_buf" variable to "buf". (Sriram Yagnaraman)
- Added patch "igb: Clear-on-read ICR when ICR.INTA is set".
  (Sriram Yagnaraman)

V2 -> V3:
- Fixed parameter name in hw/net/net_tx_pkt. (Philippe Mathieu-Daudé)
- Added patch "igb: Clear IMS bits when committing ICR access".
- Added patch "igb: Clear EICR bits for delayed MSI-X interrupts".
- Added patch "e1000e: Rename a variable in e1000e_receive_internal()".
- Added patch "igb: Rename a variable in igb_receive_internal()".
- Added patch "e1000e: Notify only new interrupts".
- Added patch "igb: Notify only new interrupts".

V1 -> V2:
- Dropped patch "Include the second VLAN tag in the buffer". The second
  VLAN tag is not used at that point and is unnecessary.
- Added patch "e1000x: Rename TcpIpv6 into TcpIpv6Ex".
- Split patch "hw/net/net_tx_pkt: Decouple from PCI".
  (Philippe Mathieu-Daudé)
- Added advanced Rx descriptor packet encoding definitions.
  (Sriram Yagnaraman)
- Added some constants to eth.h to derive packet oversize thresholds.
- Added IGB_TX_FLAGS_VLAN_SHIFT usage.
- Renamed patch "igb: Fix igb_mac_reg_init alignment".
  (Philippe Mathieu-Daudé)
- Fixed size check for packets with double VLAN. (Sriram Yagnaraman)
- Fixed timing to timestamp Tx packet.

Akihiko Odaki (48):
  hw/net/net_tx_pkt: Decouple implementation from PCI
  hw/net/net_tx_pkt: Decouple interface from PCI
  e1000x: Fix BPRC and MPRC
  igb: Fix Rx packet type encoding
  igb: Do not require CTRL.VME for tx VLAN tagging
  igb: Clear IMS bits when committing ICR access
  net/net_rx_pkt: Use iovec for net_rx_pkt_set_protocols()
  e1000e: Always copy ethernet header
  igb: Always copy ethernet header
  Fix references to igb Avocado test
  tests/avocado: Remove unused imports
  tests/avocado: Remove test_igb_nomsi_kvm
  hw/net/net_tx_pkt: Remove net_rx_pkt_get_l4_info
  net/eth: Rename eth_setup_vlan_headers_ex
  e1000x: Share more Rx filtering logic
  e1000x: Take CRC into consideration for size check
  e1000x: Rename TcpIpv6 into TcpIpv6Ex
  e1000e: Always log status after building rx metadata
  igb: Always log status after building rx metadata
  igb: Remove goto
  igb: Read DCMD.VLE of the first Tx descriptor
  e1000e: Reset packet state after emptying Tx queue
  vmxnet3: Reset packet state after emptying Tx queue
  igb: Add more definitions for Tx descriptor
  igb: Share common VF constants
  igb: Fix igb_mac_reg_init coding style alignment
  igb: Clear EICR bits for delayed MSI-X interrupts
  e1000e: Rename a variable in e1000e_receive_internal()
  igb: Rename a variable in igb_receive_internal()
  net/eth: Use void pointers
  net/eth: Always add VLAN tag
  hw/net/net_rx_pkt: Enforce alignment for eth_header
  tests/qtest/libqos/igb: Set GPIE.Multiple_MSIX
  igb: Implement MSI-X single vector mode
  igb: Use UDP for RSS hash
  igb: Implement Rx SCTP CSO
  igb: Implement Tx SCTP CSO
  igb: Strip the second VLAN tag for extended VLAN
  igb: Filter with the second VLAN tag for extended VLAN
  igb: Implement igb-specific oversize check
  igb: Implement Rx PTP2 timestamp
  igb: Implement Tx timestamp
  e1000e: Notify only new interrupts
  igb: Notify only new interrupts
  igb: Clear-on-read ICR when ICR.INTA is set
  vmxnet3: Do not depend on PC
  MAINTAINERS: Add a reviewer for network packet abstractions
  docs/system/devices/igb: Note igb is tested for DPDK

 MAINTAINERS   |   3 +-
 docs/system/devices/igb.rst   |  14 +-
 hw/net/e1000e_core.h  |   2 -
 hw/net/e1000x_common.h|   9 +-
 hw/net/e1000x_regs.h  |  24 +-
 hw/net/igb_common.h   |  24 +-
 hw/net/igb_regs.h |  67 +-
 hw/net/net_rx_pkt.h   |  38 +-
 hw/net/net_tx_pkt.h   |  46 +-
 include/net/eth.h |  29 +-
 include/qemu/crc32c.h |   1 +
 hw/net/e1000.c|  41 +-
 hw/net/e1000e_core.c  | 292 +++
 hw/net/e1000x_common.c|  79 +-
 hw/net/igb.c  |  10 +-
 hw/net/igb_core.c | 717 ++
 hw/net/igbvf.c|   7 -
 hw/net/net_rx_pkt.c

[PATCH v2] util/vfio-helpers: Use g_file_read_link()

2023-05-22 Thread Akihiko Odaki

When _FORTIFY_SOURCE=2, glibc version is 2.35, and GCC version is
12.1.0, the compiler complains as follows:

In file included from /usr/include/features.h:490,
 from /usr/include/bits/libc-header-start.h:33,
 from /usr/include/stdint.h:26,
 from 
/usr/lib/gcc/aarch64-unknown-linux-gnu/12.1.0/include/stdint.h:9,
 from /home/alarm/q/var/qemu/include/qemu/osdep.h:94,
 from ../util/vfio-helpers.c:13:
In function 'readlink',
inlined from 'sysfs_find_group_file' at ../util/vfio-helpers.c:116:9,
inlined from 'qemu_vfio_init_pci' at ../util/vfio-helpers.c:326:18,
inlined from 'qemu_vfio_open_pci' at ../util/vfio-helpers.c:517:9:
/usr/include/bits/unistd.h:119:10: error: argument 2 is null but the 
corresponding size argument 3 value is 4095 [-Werror=nonnull]
  119 |   return __glibc_fortify (readlink, __len, sizeof (char),
  |  ^~~

This error implies the allocated buffer can be NULL. Use
g_file_read_link(), which allocates the buffer automatically, to avoid
the error.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Cédric Le Goater 
---
V1 -> V2: Initialize gerr variable.

 util/vfio-helpers.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
index 2d8af38f88..f8bab46c68 100644
--- a/util/vfio-helpers.c
+++ b/util/vfio-helpers.c
@@ -106,15 +106,17 @@ struct QEMUVFIOState {
  */
 static char *sysfs_find_group_file(const char *device, Error **errp)
 {
+g_autoptr(GError) gerr = NULL;
 char *sysfs_link;
 char *sysfs_group;
 char *p;
 char *path = NULL;
 
 sysfs_link = g_strdup_printf("/sys/bus/pci/devices/%s/iommu_group", 
device);
-sysfs_group = g_malloc0(PATH_MAX);
-if (readlink(sysfs_link, sysfs_group, PATH_MAX - 1) == -1) {
-error_setg_errno(errp, errno, "Failed to find iommu group sysfs path");
+sysfs_group = g_file_read_link(sysfs_link, &gerr);
+if (gerr) {
+error_setg(errp, "Failed to find iommu group sysfs path: %s",
+   gerr->message);
 goto out;
 }
 p = strrchr(sysfs_group, '/');
-- 
2.40.1
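
Editorial note: the patch above replaces a fixed-size readlink() buffer with
g_file_read_link(), which returns a newly allocated string. For readers
without GLib at hand, the same "let the helper size the buffer" idea can be
sketched with POSIX calls only. This is not GLib's implementation;
read_link_alloc() and last_component() are hypothetical helpers written for
this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Read a symlink target into a freshly allocated, NUL-terminated string.
 * Returns NULL on failure. The caller releases the result with free().
 */
static char *read_link_alloc(const char *path)
{
    size_t cap = 64;

    for (;;) {
        char *buf = malloc(cap);
        if (!buf) {
            return NULL;        /* allocation failure is handled explicitly */
        }
        ssize_t n = readlink(path, buf, cap);
        if (n < 0) {
            free(buf);
            return NULL;
        }
        if ((size_t)n < cap) {
            buf[n] = '\0';      /* readlink() does not terminate the string */
            return buf;
        }
        free(buf);              /* possibly truncated: retry with more room */
        cap *= 2;
    }
}

/* Return the last path component, found with strrchr() as in the patch. */
static const char *last_component(const char *path)
{
    const char *p = strrchr(path, '/');
    return p ? p + 1 : path;
}
```

The buffer pointer is checked before it reaches readlink(), so the compiler
has no NULL argument to warn about, and no PATH_MAX assumption is needed.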

Re: [PATCH v2 02/16] migration: Correct transferred bytes value

2023-05-22 Thread Leonardo Brás

On Mon, 2023-05-15 at 21:56 +0200, Juan Quintela wrote:
> We forgot to add to the transferred amount of data in several places.  With
> these fixes I get:
> 
>qemu_file_transferred() + multifd_bytes == transferred
> 
> The only place where this is not true is during devices sending.  But
> going all through the full tree searching for devices that use
> QEMUFile directly is a bit too much.
> 
> Multifd, precopy and xbzrle work as expected. Postcopy still misses 35
> bytes, but searching for them is getting complicated, so I stop here.
> 
> Signed-off-by: Juan Quintela 
> ---
>  migration/ram.c   | 14 ++
>  migration/savevm.c| 19 +--
>  migration/vmstate.c   |  3 +++
>  migration/meson.build |  2 +-
>  4 files changed, 35 insertions(+), 3 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index f69d8d42b0..fd5a8db0f8 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -337,6 +337,7 @@ int64_t ramblock_recv_bitmap_send(QEMUFile *file,
>  
>  g_free(le_bitmap);
>  
> +stat64_add(&mig_stats.transferred, 8 + size + 8);
>  if (qemu_file_get_error(file)) {
>  return qemu_file_get_error(file);
>  }
> @@ -1392,6 +1393,7 @@ static int find_dirty_block(RAMState *rs, 
> PageSearchStatus *pss)
>  return ret;
>  }
>  qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
> +stat64_add(&mig_stats.transferred, 8);
>  qemu_fflush(f);
>  }
>  /*
> @@ -3020,6 +3022,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>  RAMState **rsp = opaque;
>  RAMBlock *block;
>  int ret;
> +size_t size = 0;
>  
>  if (compress_threads_save_setup()) {
>  return -1;
> @@ -3038,16 +3041,20 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>  qemu_put_be64(f, ram_bytes_total_with_ignored()
>   | RAM_SAVE_FLAG_MEM_SIZE);
>  
> +size += 8;
>  RAMBLOCK_FOREACH_MIGRATABLE(block) {
>  qemu_put_byte(f, strlen(block->idstr));
>  qemu_put_buffer(f, (uint8_t *)block->idstr, 
> strlen(block->idstr));
>  qemu_put_be64(f, block->used_length);
> +size += 1 + strlen(block->idstr) + 8;

I was thinking some of them would look better with sizeof()s instead of
literal numbers, such as:

size += sizeof(Byte) + strlen(block->idstr) + sizeof(block->used_length);

Maybe too much?

>  if (migrate_postcopy_ram() && block->page_size !=
>qemu_host_page_size) {
>  qemu_put_be64(f, block->page_size);
> +size += 8;
>  }
>  if (migrate_ignore_shared()) {
>  qemu_put_be64(f, block->mr->addr);
> +size += 8;
>  }
>  }
>  }
> @@ -3064,11 +3071,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>  
>  if (!migrate_multifd_flush_after_each_section()) {
>  qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
> +size += 8;

sizeof(uint64_t) here is probably too much.


Maybe it would be nice to have qemu_put_*() return the number of bytes
written, so that in this case:

size += qemu_put_be64(...)

What do you think?

Anyway, 

Reviewed-by: Leonardo Bras 
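
Editorial note: the suggestion above, having the put helpers return the
number of bytes they wrote, can be sketched as follows. This is only an
illustration under stated assumptions; SketchFile and the sketch_put_*()
functions are hypothetical and are not the QEMUFile API, whose qemu_put_*()
functions return void in the quoted code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory stream standing in for QEMUFile. */
typedef struct SketchFile {
    uint8_t buf[256];
    size_t pos;
} SketchFile;

/* Each helper returns the number of bytes it appended. */
static size_t sketch_put_buffer(SketchFile *f, const uint8_t *data, size_t len)
{
    assert(f->pos + len <= sizeof(f->buf));
    memcpy(f->buf + f->pos, data, len);
    f->pos += len;
    return len;
}

static size_t sketch_put_byte(SketchFile *f, uint8_t v)
{
    return sketch_put_buffer(f, &v, 1);
}

static size_t sketch_put_be64(SketchFile *f, uint64_t v)
{
    uint8_t be[8];

    for (int i = 0; i < 8; i++) {
        be[i] = (uint8_t)(v >> (56 - 8 * i));   /* most significant first */
    }
    return sketch_put_buffer(f, be, sizeof(be));
}

/* Same layout as the per-block record: length byte, idstr, used_length. */
static size_t sketch_put_block_header(SketchFile *f, const char *idstr,
                                      uint64_t used_length)
{
    size_t len = strlen(idstr);
    size_t size = 0;

    size += sketch_put_byte(f, (uint8_t)len);
    size += sketch_put_buffer(f, (const uint8_t *)idstr, len);
    size += sketch_put_be64(f, used_length);
    return size;
}
```

With this shape the byte count comes from the writers themselves, so the
literal 1 and 8 constants cannot drift out of sync with what is written.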

>  }
>  
>  qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
> +size += 8;
>  qemu_fflush(f);
>  
> +stat64_add(&mig_stats.transferred, size);
>  return 0;
>  }
>  
> @@ -3209,6 +3219,7 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>  RAMState **temp = opaque;
>  RAMState *rs = *temp;
>  int ret = 0;
> +size_t size = 0;
>  
>  rs->last_stage = !migration_in_colo_state();
>  
> @@ -3253,8 +3264,11 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>  
>  if (!migrate_multifd_flush_after_each_section()) {
>  qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
> +size += 8;
>  }
>  qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
> +size += 8;
> +stat64_add(&mig_stats.transferred, size);
>  qemu_fflush(f);
>  
>  return 0;
> diff --git a/migration/savevm.c b/migration/savevm.c
> index e33788343a..c7af9050c2 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -952,6 +952,7 @@ static void save_section_header(QEMUFile *f, 
> SaveStateEntry *se,
>  qemu_put_byte(f, section_type);
>  qemu_put_be32(f, se->section_id);
>  
> +size_t size = 1 + 4;
>  if (section_type == QEMU_VM_SECTION_FULL ||
>  section_type == QEMU_VM_SECTION_START) {
>  /* ID string */
> @@ -961,7 +962,9 @@ static void save_section_header(QEMUFile *f, 
> SaveStateEntry *se,
>  
>  qemu_put_be32(f, se->instance_id);
>  qemu_put_be32(f, se->version_id);
> +size += 1 + len + 4 + 4;
>  }
> +stat64_add(&mig_stats.transferred, size);
>  }
>  
>  /*
> @@ -973,6 +976,7 @@ static void save_section_footer(QEMUFile *f, 
> SaveStateEntry *se)
>  if

Re: [PATCH v2 01/16] migration: Don't use INT64_MAX for unlimited rate

2023-05-22 Thread Leonardo Brás

On Tue, 2023-05-16 at 14:47 +0200, Cédric Le Goater wrote:
> On 5/16/23 11:24, Juan Quintela wrote:
> > David Edmondson  wrote:
> > > Juan Quintela  writes:
> > > 
> > > > Define and use RATE_LIMIT_MAX instead.
> > > 
> > > Suggest "RATE_LIMIT_MAX_NONE".
> > 
> > Then even better
> > 
> > RATE_LIMIT_DISABLED?
> 
> I'd vote for RATE_LIMIT_DISABLED.

Me too.

> 
> > RATE_LIMIT_NONE?
> > 
> > Using MAX and NONE at the same time looks strange.
> 
> Cheers,
> 
> C.
> 

Reviewed-by: Leonardo Bras

Re: [PATCH v12 02/15] accel: collecting TB execution count

2023-05-22 Thread Wu, Fei

On 5/23/2023 8:45 AM, Richard Henderson wrote:
> On 5/18/23 06:57, Fei Wu wrote:
>> +void HELPER(inc_exec_freq)(void *ptr)
>> +{
>> +    TBStatistics *stats = (TBStatistics *) ptr;
>> +    tcg_debug_assert(stats);
>> +    ++stats->executions.normal;
>> +}
> ...
>> +static inline void gen_tb_exec_count(TranslationBlock *tb)
>> +{
>> +    if (tb_stats_enabled(tb, TB_EXEC_STATS)) {
>> +    TCGv_ptr ptr = tcg_temp_new_ptr();
>> +    tcg_gen_movi_ptr(ptr, (intptr_t)tb->tb_stats);
>> +    gen_helper_inc_exec_freq(ptr);
>> +    }
>> +}
> 
> This is 3 host instructions, easily expanded inline:
> 
> --- a/accel/tcg/translator.c
> +++ b/accel/tcg/translator.c
> @@ -11,6 +11,7 @@
>  #include "qemu/error-report.h"
>  #include "tcg/tcg.h"
>  #include "tcg/tcg-op.h"
> +#include "tcg/tcg-temp-internal.h"
>  #include "exec/exec-all.h"
>  #include "exec/gen-icount.h"
>  #include "exec/log.h"
> @@ -18,6 +19,30 @@
>  #include "exec/plugin-gen.h"
>  #include "exec/replay-core.h"
> 
> +
> +static void gen_tb_exec_count(TranslationBlock *tb)
> +{
> +    if (tb_stats_enabled(tb, TB_EXEC_STATS)) {
> +    TCGv_ptr ptr = tcg_temp_ebb_new_ptr();
> +
> +    tcg_gen_movi_ptr(ptr, (intptr_t)&tb->tb_stats->executions.normal);
> +    if (sizeof(tb->tb_stats->executions.normal) == 4) {
> +    TCGv_i32 t = tcg_temp_ebb_new_i32();
> +    tcg_gen_ld_i32(t, ptr, 0);
> +    tcg_gen_addi_i32(t, t, 1);
> +    tcg_gen_st_i32(t, ptr, 0);
> +    tcg_temp_free_i32(t);
> +    } else {
> +    TCGv_i64 t = tcg_temp_ebb_new_i64();
> +    tcg_gen_ld_i64(t, ptr, 0);
> +    tcg_gen_addi_i64(t, t, 1);
> +    tcg_gen_st_i64(t, ptr, 0);
> +    tcg_temp_free_i64(t);
> +    }
> +    tcg_temp_free_ptr(ptr);
> +    }
> +}
> +

Thank you for the method. I will try it and measure the gain; this is
indeed the hot path and usually takes a lot of time.

Thanks,
Fei.

>  bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest)
>  {
>  /* Suppress goto_tb if requested. */
> 
> 
> I'm not especially keen on embedding the TBStatistics pointer directly
> like this; for most hosts we will have to put this constant into the
> constant pool.  Whereas the pointer already exists at tb->tb_stats, and
> tb is at a constant displacement prior to the code, so we already have
> mechanisms for generating pc-relative addresses.
> 
> However, that's premature optimization.  Let's get it working first.
> 
> 
> r~
>

RE: [PATCH v1] migration: fail the cap check if it requires the use of deferred incoming

2023-05-22 Thread Wang, Wei W

On Tuesday, May 23, 2023 7:36 AM, Peter Xu wrote:
> > > We may also want to trap the channel setups on num:
> > >
> > > migrate_params_test_apply():
> > >
> > > if (params->has_multifd_channels) {
> > > dest->multifd_channels = params->multifd_channels;
> > > }
> >
> > Didn’t get this one. What do you want to add to above?
> 
> I meant after listen() is called with an explicit number in this case, should
> we disallow changing the number of multifd channels?

Got you, thanks. That seems unnecessary to me: the cap setting is required
for the use of multifd, so patching there already achieves what we want:
- users get the error message when deferred -incoming isn’t used;
- the cap setting for multifd fails, meaning that multifd won't be used (i.e.
nothing will care about multifd_channels).
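
Editorial note: the argument above is that rejecting the capability is
enough, because the channel count only matters once the capability is on.
A minimal sketch of that reasoning follows. Every name in it
(SketchMigState, sketch_set_multifd(), sketch_channels_in_use()) and the
error string are hypothetical and do not correspond to QEMU's migration
code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical destination-side state. */
typedef struct SketchMigState {
    bool incoming_started;      /* already listening, i.e. not deferred */
    bool cap_multifd;
    int multifd_channels;
} SketchMigState;

/* Reject the capability when the incoming side was not deferred. */
static bool sketch_set_multifd(SketchMigState *s, bool enable,
                               const char **errp)
{
    if (enable && s->incoming_started) {
        *errp = "multifd requires deferred incoming";
        return false;
    }
    s->cap_multifd = enable;
    return true;
}

/* The channel count is consulted only when the capability is enabled. */
static int sketch_channels_in_use(const SketchMigState *s)
{
    return s->cap_multifd ? s->multifd_channels : 0;
}
```

In this sketch, changing multifd_channels after the capability was rejected
has no observable effect, which is why a second check looks redundant.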

Re: [PATCH 1/4] hw/intc/loongarch_ipi: Bring back all 4 IPI mailboxes

2023-05-22 Thread Song Gao





On 2023/5/22 9:44 PM, Philippe Mathieu-Daudé wrote:
> On 22/5/23 13:47, Jiaxun Yang wrote:
>>
>>> On May 22, 2023, at 04:52, Huacai Chen wrote:
>>>
>>> Hi, Jiaxun,
>>>
>>> Rename loongarch_ipi to loongson_ipi? It will be shared by both MIPS
>>> and LoongArch in your series.
>>
>> Hi Huacai,
>>
>> Thanks for pointing that out; what's the opinion of the LoongArch
>> maintainers?
>>
>> Or perhaps rename it as loong_ipi to reflect the nature that it’s shared
>> by MIPS based Loongson and LoongArch based Loongson?
>
> I'm not a LoongArch maintainer, but a model named "loong_ipi" makes
> sense to me.
>
> Please add it to the two Virt machine sections in MAINTAINERS.

'loongson_ipi' is better; QEMU doesn't have any naming with 'loong' as a
prefix.

Also, patch 2 should not use macros. Some attributes should be added to
distinguish between MIPS and LoongArch.

All references to loongarch_ipi should also be changed.

Thanks.
Song Gao

Re: [PATCH v4 03/11] hw: allwinner-r40: Complete uart devices

2023-05-22 Thread qianfan





On 2023/5/15 2:55, Niek Linnenbank wrote:

Hi Qianfan,


On Wed, May 10, 2023 at 12:30 PM  wrote:

From: qianfan Zhao 

R40 has eight UARTs, support both 16450 and 16550 compatible modes.

Signed-off-by: qianfan Zhao 
---
 hw/arm/allwinner-r40.c         | 31 ---
 include/hw/arm/allwinner-r40.h |  8 
 2 files changed, 36 insertions(+), 3 deletions(-)

diff --git a/hw/arm/allwinner-r40.c b/hw/arm/allwinner-r40.c
index 128c0ca470..537a90b23d 100644
--- a/hw/arm/allwinner-r40.c
+++ b/hw/arm/allwinner-r40.c
@@ -45,6 +45,13 @@ const hwaddr allwinner_r40_memmap[] = {
     [AW_R40_DEV_CCU]        = 0x01c20000,
     [AW_R40_DEV_PIT]        = 0x01c20c00,
     [AW_R40_DEV_UART0]      = 0x01c28000,
+    [AW_R40_DEV_UART1]      = 0x01c28400,
+    [AW_R40_DEV_UART2]      = 0x01c28800,
+    [AW_R40_DEV_UART3]      = 0x01c28c00,
+    [AW_R40_DEV_UART4]      = 0x01c29000,
+    [AW_R40_DEV_UART5]      = 0x01c29400,
+    [AW_R40_DEV_UART6]      = 0x01c29800,
+    [AW_R40_DEV_UART7]      = 0x01c29c00,


After adding the uarts to the memory map here, you should remove them 
from the unimplemented array.

OK.


     [AW_R40_DEV_GIC_DIST]   = 0x01c81000,
     [AW_R40_DEV_GIC_CPU]    = 0x01c82000,
     [AW_R40_DEV_GIC_HYP]    = 0x01c84000,
@@ -160,6 +167,10 @@ enum {
     AW_R40_GIC_SPI_UART1     =  2,
     AW_R40_GIC_SPI_UART2     =  3,
     AW_R40_GIC_SPI_UART3     =  4,


Since you add UART1-7 in this patch, it probably makes sense to make the
'AW_R40_GIC_SPI_UART1/2/3' lines part of this patch as well.

OK


With the two above remarks resolved, the patch looks good to me.

Reviewed-by: Niek Linnenbank 

Regards,
Niek

+    AW_R40_GIC_SPI_UART4     = 17,
+    AW_R40_GIC_SPI_UART5     = 18,
+    AW_R40_GIC_SPI_UART6     = 19,
+    AW_R40_GIC_SPI_UART7     = 20,
     AW_R40_GIC_SPI_TIMER0    = 22,
     AW_R40_GIC_SPI_TIMER1    = 23,
     AW_R40_GIC_SPI_MMC0      = 32,
@@ -387,9 +398,23 @@ static void allwinner_r40_realize(DeviceState
*dev, Error **errp)
     }

     /* UART0. For future clocktree API: All UARTS are connected
to APB2_CLK. */
-    serial_mm_init(get_system_memory(),
s->memmap[AW_R40_DEV_UART0], 2,
-                   qdev_get_gpio_in(DEVICE(&s->gic),
AW_R40_GIC_SPI_UART0),
-                   115200, serial_hd(0), DEVICE_NATIVE_ENDIAN);
+    for (int i = 0; i < AW_R40_NUM_UARTS; i++) {
+        static const int uart_irqs[AW_R40_NUM_UARTS] = {
+            AW_R40_GIC_SPI_UART0,
+            AW_R40_GIC_SPI_UART1,
+            AW_R40_GIC_SPI_UART2,
+            AW_R40_GIC_SPI_UART3,
+            AW_R40_GIC_SPI_UART4,
+            AW_R40_GIC_SPI_UART5,
+            AW_R40_GIC_SPI_UART6,
+            AW_R40_GIC_SPI_UART7,
+        };
+        const hwaddr addr = s->memmap[AW_R40_DEV_UART0 + i];
+
+        serial_mm_init(get_system_memory(), addr, 2,
+  qdev_get_gpio_in(DEVICE(&s->gic), uart_irqs[i]),
+                       115200, serial_hd(i), DEVICE_NATIVE_ENDIAN);
+    }

     /* Unimplemented devices */
     for (i = 0; i < ARRAY_SIZE(r40_unimplemented); i++) {
diff --git a/include/hw/arm/allwinner-r40.h
b/include/hw/arm/allwinner-r40.h
index 3be9dc962b..959b5dc4e0 100644
--- a/include/hw/arm/allwinner-r40.h
+++ b/include/hw/arm/allwinner-r40.h
@@ -41,6 +41,13 @@ enum {
     AW_R40_DEV_CCU,
     AW_R40_DEV_PIT,
     AW_R40_DEV_UART0,
+    AW_R40_DEV_UART1,
+    AW_R40_DEV_UART2,
+    AW_R40_DEV_UART3,
+    AW_R40_DEV_UART4,
+    AW_R40_DEV_UART5,
+    AW_R40_DEV_UART6,
+    AW_R40_DEV_UART7,
     AW_R40_DEV_GIC_DIST,
     AW_R40_DEV_GIC_CPU,
     AW_R40_DEV_GIC_HYP,
@@ -70,6 +77,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(AwR40State, AW_R40)
  * which are currently emulated by the R40 SoC code.
  */
 #define AW_R40_NUM_MMCS         4
+#define AW_R40_NUM_UARTS        8

 struct AwR40State {
     /*< private >*/
-- 
2.25.1




--
Niek Linnenbank
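
Editorial note: the loop in the patch above relies on two facts visible in
the diff. The UART register blocks sit at a regular 0x400 stride, while the
interrupt numbers are not contiguous (UART3 is 4, UART4 is 17), which is why
the IRQs need a lookup table. The sketch below restates that with
hypothetical names (the SKETCH_* macros, sketch_uart_base(),
sketch_uart_irq()). The value 1 for UART0's interrupt is an assumption,
since that line is not shown in the quoted hunk.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_UART0_BASE  0x01c28000u  /* UART0 address from the memory map */
#define SKETCH_UART_STRIDE 0x400u       /* spacing between UART0..UART7 */
#define SKETCH_NUM_UARTS   8

/* SPI numbers; entry 0 is an assumption, the rest appear in the patch. */
static const int sketch_uart_irqs[SKETCH_NUM_UARTS] = {
    1, 2, 3, 4, 17, 18, 19, 20,
};

/* Addresses follow a fixed stride, so they can be computed. */
static uint32_t sketch_uart_base(int i)
{
    assert(i >= 0 && i < SKETCH_NUM_UARTS);
    return SKETCH_UART0_BASE + (uint32_t)i * SKETCH_UART_STRIDE;
}

/* IRQ numbers jump between UART3 and UART4, so they need a table. */
static int sketch_uart_irq(int i)
{
    assert(i >= 0 && i < SKETCH_NUM_UARTS);
    return sketch_uart_irqs[i];
}
```

The patch itself indexes the memory map with AW_R40_DEV_UART0 + i rather
than computing the address, which works because the enum entries are
consecutive.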

Re: [PATCH v4 01/11] hw: arm: Add bananapi M2-Ultra and allwinner-r40 support

2023-05-22 Thread qianfan





On 2023/5/15 2:50, Niek Linnenbank wrote:

Hi Qianfan,


On Wed, May 10, 2023 at 12:30 PM  wrote:

From: qianfan Zhao 

Allwinner R40 (sun8i) SoC features a Quad-Core Cortex-A7 ARM CPU,
and a Mali400 MP2 GPU from ARM. It's also known as the Allwinner T3
for In-Car Entertainment usage, A40i and A40pro are variants that
differ in applicable temperatures range (industrial and military).

Signed-off-by: qianfan Zhao 
---
 hw/arm/Kconfig                 |  10 +
 hw/arm/allwinner-r40.c         | 418
+
 hw/arm/bananapi_m2u.c          | 129 ++
 hw/arm/meson.build             |   1 +
 include/hw/arm/allwinner-r40.h | 110 +
 5 files changed, 668 insertions(+)
 create mode 100644 hw/arm/allwinner-r40.c
 create mode 100644 hw/arm/bananapi_m2u.c
 create mode 100644 include/hw/arm/allwinner-r40.h

diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index 2d7c457955..b7a84f6e3f 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -374,6 +374,16 @@ config ALLWINNER_H3
     select USB_EHCI_SYSBUS
     select SD

+config ALLWINNER_R40
+    bool
+    default y if TCG && ARM
+    select ALLWINNER_A10_PIT
+    select SERIAL
+    select ARM_TIMER
+    select ARM_GIC
+    select UNIMP
+    select SD
+
 config RASPI
     bool
     default y if TCG && ARM
diff --git a/hw/arm/allwinner-r40.c b/hw/arm/allwinner-r40.c
new file mode 100644
index 00..b743d64253
--- /dev/null
+++ b/hw/arm/allwinner-r40.c
@@ -0,0 +1,418 @@
+/*
+ * Allwinner R40/A40i/T3 System on Chip emulation
+ *
+ * Copyright (C) 2023 qianfan Zhao 
+ *
+ * This program is free software: you can redistribute it and/or
modify
+ * it under the terms of the GNU General Public License as
published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see
.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "qemu/bswap.h"
+#include "qemu/module.h"
+#include "qemu/units.h"
+#include "hw/qdev-core.h"
+#include "hw/sysbus.h"
+#include "hw/char/serial.h"
+#include "hw/misc/unimp.h"
+#include "hw/usb/hcd-ehci.h"
+#include "hw/loader.h"
+#include "sysemu/sysemu.h"
+#include "hw/arm/allwinner-r40.h"
+
+/* Memory map */
+const hwaddr allwinner_r40_memmap[] = {
+    [AW_R40_DEV_SRAM_A1]    = 0x,
+    [AW_R40_DEV_SRAM_A2]    = 0x4000,
+    [AW_R40_DEV_SRAM_A3]    = 0x8000,
+    [AW_R40_DEV_SRAM_A4]    = 0xb400,
+    [AW_R40_DEV_MMC0]       = 0x01c0f000,
+    [AW_R40_DEV_MMC1]       = 0x01c1,
+    [AW_R40_DEV_MMC2]       = 0x01c11000,
+    [AW_R40_DEV_MMC3]       = 0x01c12000,
+    [AW_R40_DEV_PIT]        = 0x01c20c00,
+    [AW_R40_DEV_UART0]      = 0x01c28000,
+    [AW_R40_DEV_GIC_DIST]   = 0x01c81000,
+    [AW_R40_DEV_GIC_CPU]    = 0x01c82000,
+    [AW_R40_DEV_GIC_HYP]    = 0x01c84000,
+    [AW_R40_DEV_GIC_VCPU]   = 0x01c86000,
+    [AW_R40_DEV_SDRAM]      = 0x4000
+};
+
+/* List of unimplemented devices */
+struct AwR40Unimplemented {
+    const char *device_name;
+    hwaddr base;
+    hwaddr size;
+};
+
+static struct AwR40Unimplemented r40_unimplemented[] = {
+    { "d-engine",   0x0100, 4 * MiB },
+    { "d-inter",    0x0140, 128 * KiB },
+    { "sram-c",     0x01c0, 4 * KiB },
+    { "dma",        0x01c02000, 4 * KiB },
+    { "nfdc",       0x01c03000, 4 * KiB },
+    { "ts",         0x01c04000, 4 * KiB },
+    { "spi0",       0x01c05000, 4 * KiB },
+    { "spi1",       0x01c06000, 4 * KiB },
+    { "cs0",        0x01c09000, 4 * KiB },
+    { "keymem",     0x01c0a000, 4 * KiB },
+    { "emac",       0x01c0b000, 4 * KiB },
+    { "usb0-otg",   0x01c13000, 4 * KiB },
+    { "usb0-host",  0x01c14000, 4 * KiB },
+    { "crypto",     0x01c15000, 4 * KiB },
+    { "spi2",       0x01c17000, 4 * KiB },
+    { "sata",       0x01c18000, 4 * KiB },
+    { "usb1-host",  0x01c19000, 4 * KiB },
+    { "sid",        0x01c1b000, 4 * KiB },
+    { "usb2-host",  0x01c1c000, 4 * KiB },
+    { "cs1",        0x01c1d000, 4 * KiB },
+    { "spi3",       0x01c1f000, 4 * KiB },
+    { "ccu",        0x01c2, 1 *

Re: [PATCH v4 08/11] hw: arm: allwinner-r40: Add emac and gmac support

2023-05-22 Thread qianfan





On 2023/5/16 3:58, Niek Linnenbank wrote:

Hi Qianfan,

On Wed, May 10, 2023 at 12:30 PM  wrote:

From: qianfan Zhao 

The R40 has two Ethernet controllers, named emac and gmac. The emac is
compatible with the A10, and the gmac is compatible with the H3.

Signed-off-by: qianfan Zhao 
---
 hw/arm/allwinner-r40.c         | 50
--
 hw/arm/bananapi_m2u.c          |  3 ++
 include/hw/arm/allwinner-r40.h |  6 
 3 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/hw/arm/allwinner-r40.c b/hw/arm/allwinner-r40.c
index b148c56449..c018ad231a 100644
--- a/hw/arm/allwinner-r40.c
+++ b/hw/arm/allwinner-r40.c
@@ -39,6 +39,7 @@ const hwaddr allwinner_r40_memmap[] = {
     [AW_R40_DEV_SRAM_A2]    = 0x4000,
     [AW_R40_DEV_SRAM_A3]    = 0x8000,
     [AW_R40_DEV_SRAM_A4]    = 0xb400,
+    [AW_R40_DEV_EMAC]       = 0x01c0b000,
     [AW_R40_DEV_MMC0]       = 0x01c0f000,
     [AW_R40_DEV_MMC1]       = 0x01c1,
     [AW_R40_DEV_MMC2]       = 0x01c11000,
@@ -54,6 +55,7 @@ const hwaddr allwinner_r40_memmap[] = {
     [AW_R40_DEV_UART6]      = 0x01c29800,
     [AW_R40_DEV_UART7]      = 0x01c29c00,
     [AW_R40_DEV_TWI0]       = 0x01c2ac00,
+    [AW_R40_DEV_GMAC]       = 0x01c5,
     [AW_R40_DEV_DRAMCOM]    = 0x01c62000,
     [AW_R40_DEV_DRAMCTL]    = 0x01c63000,
     [AW_R40_DEV_DRAMPHY]    = 0x01c65000,
@@ -82,7 +84,6 @@ static struct AwR40Unimplemented
r40_unimplemented[] = {
     { "spi1",       0x01c06000, 4 * KiB },
     { "cs0",        0x01c09000, 4 * KiB },
     { "keymem",     0x01c0a000, 4 * KiB },
-    { "emac",       0x01c0b000, 4 * KiB },
     { "usb0-otg",   0x01c13000, 4 * KiB },
     { "usb0-host",  0x01c14000, 4 * KiB },
     { "crypto",     0x01c15000, 4 * KiB },
@@ -131,7 +132,6 @@ static struct AwR40Unimplemented
r40_unimplemented[] = {
     { "tvd2",       0x01c33000, 4 * KiB },
     { "tvd3",       0x01c34000, 4 * KiB },
     { "gpu",        0x01c4, 64 * KiB },
-    { "gmac",       0x01c5, 64 * KiB },
     { "hstmr",      0x01c6, 4 * KiB },
     { "tcon-top",   0x01c7, 4 * KiB },
     { "lcd0",       0x01c71000, 4 * KiB },
@@ -180,6 +180,8 @@ enum {
     AW_R40_GIC_SPI_MMC1      = 33,
     AW_R40_GIC_SPI_MMC2      = 34,
     AW_R40_GIC_SPI_MMC3      = 35,
+    AW_R40_GIC_SPI_EMAC      = 55,
+    AW_R40_GIC_SPI_GMAC      = 85,
 };

 /* Allwinner R40 general constants */
@@ -276,6 +278,11 @@ static void allwinner_r40_init(Object *obj)

     object_initialize_child(obj, "twi0", &s->i2c0,
TYPE_AW_I2C_SUN6I);

+    object_initialize_child(obj, "emac", &s->emac, TYPE_AW_EMAC);
+    object_initialize_child(obj, "gmac", &s->gmac,
TYPE_AW_SUN8I_EMAC);
+    object_property_add_alias(obj, "gmac-phy-addr",
+                              OBJECT(&s->gmac), "phy-addr");
+
     object_initialize_child(obj, "dramc", &s->dramc,
TYPE_AW_R40_DRAMC);
     object_property_add_alias(obj, "ram-addr", OBJECT(&s->dramc),
                              "ram-addr");
@@ -285,6 +292,7 @@ static void allwinner_r40_init(Object *obj)

 static void allwinner_r40_realize(DeviceState *dev, Error **errp)
 {
+    const char *r40_nic_models[] = { "gmac", "emac", NULL };
     AwR40State *s = AW_R40(dev);
     unsigned i;

@@ -442,6 +450,44 @@ static void allwinner_r40_realize(DeviceState
*dev, Error **errp)
     sysbus_mmio_map(SYS_BUS_DEVICE(&s->dramc), 2,
                     s->memmap[AW_R40_DEV_DRAMPHY]);

+    /* nic support gmac and emac */
+    for (int i = 0; i < ARRAY_SIZE(r40_nic_models) - 1; i++) {
+        NICInfo *nic = &nd_table[i];
+
+        if (!nic->used) {
+            continue;
+        }


Could you please clarify the lines below? I'm not seeing the
function call 'qemu_show_nic_models()' in any of the other machine /
SoC implementations.


Also, if you intend to catch a possible input error here, it's probably
best to log/print the error for the user before calling exit()?

This is useful if the user doesn't know the NIC model names. Running this
command prints a help message:


$ qemu-system-arm -M bpim2u -net nic,model=help
Unable to init server: Could not connect: Connection refused
Available NIC models:
gmac
emac
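
The lookup that maps a requested model name onto an index in a NULL-terminated table such as r40_nic_models above can be sketched as follows. This is only an illustration under assumptions: demo_find_model() is a hypothetical helper written for this note, not QEMU's qemu_find_nic_model(), and it reports failure by returning -1 instead of printing an error.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a QEMU API): return the index of the requested
 * model in a NULL-terminated table, falling back to a default name when
 * the user did not request one. Returns -1 for an unknown model.
 */
static int demo_find_model(const char *requested,
                           const char *const *models,
                           const char *fallback)
{
    const char *name;

    assert(models != NULL && fallback != NULL);
    name = requested ? requested : fallback;

    for (int i = 0; models[i] != NULL; i++) {
        if (strcmp(name, models[i]) == 0) {
            return i;
        }
    }
    return -1;
}
```

With a table of { "gmac", "emac", NULL }, a request for "emac" yields index 1, no request at all falls back to the first entry, and an unknown name yields -1, which is the case where an error message before exit(1) would help the user.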


+        if (qemu_show_nic_models(nic->model, r40_nic_models)) {
+            exit(0);

+        }
+
+        switch (qemu_find_nic_model(nic, r40_nic_models,
r40_nic_models[0])) {
+        case 0: /* gmac */
+            qdev_set_nic_properties(DEVICE(&s->gmac), nic);
+            break;
+        case 1: /* emac */
+            qdev_set_nic_properties(DEVICE(&s->emac), nic);
+            break;
+        default:
+            exit(1);
+            break;
+        }
+    }
+

Re: [PATCH v4 07/11] hw: sd: allwinner-sdhost: Add sun50i-a64 SoC support

2023-05-22 Thread qianfan





On 2023/5/16 3:54, Niek Linnenbank wrote:



On Wed, May 10, 2023 at 12:30 PM  wrote:

From: qianfan Zhao 

The A64's SD registers are similar to the H3's, with one new register,
SAMP_DL_REG, located at 0x144. The DMA descriptor buffer size of
mmc2 is only 8K, while the other mmc controllers have 64K.

Also fix allwinner-r40's mmc controller type.

Signed-off-by: qianfan Zhao 
---
 hw/arm/allwinner-r40.c           |  2 +-
 hw/sd/allwinner-sdhost.c         | 70
++--
 include/hw/sd/allwinner-sdhost.h |  9 
 3 files changed, 77 insertions(+), 4 deletions(-)

diff --git a/hw/arm/allwinner-r40.c b/hw/arm/allwinner-r40.c
index 0e4542d35f..b148c56449 100644
--- a/hw/arm/allwinner-r40.c
+++ b/hw/arm/allwinner-r40.c
@@ -271,7 +271,7 @@ static void allwinner_r40_init(Object *obj)

     for (int i = 0; i < AW_R40_NUM_MMCS; i++) {
         object_initialize_child(obj, mmc_names[i], &s->mmc[i],
-                                TYPE_AW_SDHOST_SUN5I);
+                                TYPE_AW_SDHOST_SUN50I_A64);
     }

     object_initialize_child(obj, "twi0", &s->i2c0,
TYPE_AW_I2C_SUN6I);
diff --git a/hw/sd/allwinner-sdhost.c b/hw/sd/allwinner-sdhost.c
index 92a0f42708..f4fa2d179b 100644
--- a/hw/sd/allwinner-sdhost.c
+++ b/hw/sd/allwinner-sdhost.c
@@ -77,6 +77,7 @@ enum {
     REG_SD_DATA1_CRC  = 0x12C, /* CRC Data 1 from card/eMMC */
     REG_SD_DATA0_CRC  = 0x130, /* CRC Data 0 from card/eMMC */
     REG_SD_CRC_STA    = 0x134, /* CRC status from card/eMMC
during write */
+    REG_SD_SAMP_DL    = 0x144, /* Sample Delay Control
(sun50i-a64) */
     REG_SD_FIFO       = 0x200, /* Read/Write FIFO */
 };

@@ -158,6 +159,7 @@ enum {
     REG_SD_RES_CRC_RST      = 0x0,
     REG_SD_DATA_CRC_RST     = 0x0,
     REG_SD_CRC_STA_RST      = 0x0,
+    REG_SD_SAMPLE_DL_RST    = 0x2000,
     REG_SD_FIFO_RST         = 0x0,
 };

@@ -459,6 +461,7 @@ static uint64_t allwinner_sdhost_read(void
*opaque, hwaddr offset,
 {
     AwSdHostState *s = AW_SDHOST(opaque);
     AwSdHostClass *sc = AW_SDHOST_GET_CLASS(s);
+    bool out_of_bounds = false;
     uint32_t res = 0;

     switch (offset) {
@@ -577,13 +580,24 @@ static uint64_t allwinner_sdhost_read(void
*opaque, hwaddr offset,
     case REG_SD_FIFO:      /* Read/Write FIFO */
         res = allwinner_sdhost_fifo_read(s);
         break;
+    case REG_SD_SAMP_DL: /* Sample Delay */
+        if (sc->can_calibrate) {
+            res = s->sample_delay;
+        } else {
+            out_of_bounds = true;
+        }
+        break;
     default:
-        qemu_log_mask(LOG_GUEST_ERROR, "%s: out-of-bounds offset %"
-                      HWADDR_PRIx"\n", __func__, offset);
+        out_of_bounds = true;
         res = 0;
         break;
     }

+    if (out_of_bounds) {
+        qemu_log_mask(LOG_GUEST_ERROR, "%s: out-of-bounds offset %"
+                      HWADDR_PRIx"\n", __func__, offset);
+    }
+
     trace_allwinner_sdhost_read(offset, res, size);
     return res;
 }
@@ -602,6 +616,7 @@ static void allwinner_sdhost_write(void
*opaque, hwaddr offset,
 {
     AwSdHostState *s = AW_SDHOST(opaque);
     AwSdHostClass *sc = AW_SDHOST_GET_CLASS(s);
+    bool out_of_bounds = false;

     trace_allwinner_sdhost_write(offset, value, size);

@@ -725,10 +740,21 @@ static void allwinner_sdhost_write(void
*opaque, hwaddr offset,
     case REG_SD_DATA0_CRC: /* CRC Data 0 from card/eMMC */
     case REG_SD_CRC_STA:   /* CRC status from card/eMMC in write
operation */
         break;
+    case REG_SD_SAMP_DL: /* Sample delay control */
+        if (sc->can_calibrate) {
+            s->sample_delay = value;
+        } else {
+            out_of_bounds = true;
+        }
+        break;
     default:
+        out_of_bounds = true;
+        break;
+    }
+
+    if (out_of_bounds) {
         qemu_log_mask(LOG_GUEST_ERROR, "%s: out-of-bounds offset %"
                       HWADDR_PRIx"\n", __func__, offset);
-        break;
     }
 }

@@ -777,6 +803,7 @@ static const VMStateDescription
vmstate_allwinner_sdhost = {
         VMSTATE_UINT32(response_crc, AwSdHostState),
         VMSTATE_UINT32_ARRAY(data_crc, AwSdHostState, 8),
         VMSTATE_UINT32(status_crc, AwSdHostState),
+        VMSTATE_UINT32(sample_delay, AwSdHostState),
         VMSTATE_END_OF_LIST()
     }
 };
@@ -815,6 +842,7 @@ static void
allwinner_sdhost_realize(DeviceState *dev, Error **errp)
 static void allwinner_sdhost_reset(DeviceState *dev)
 {
     AwSdHostState *s = AW_SDHOST(dev);
+    AwSdHostClass *sc =

Re: [PATCH v4 06/11] hw/arm/allwinner-r40: add SDRAM controller device

2023-05-22 Thread qianfan





On 2023/5/16 3:47, Niek Linnenbank wrote:



On Wed, May 10, 2023 at 12:30 PM  wrote:

From: qianfan Zhao 

The SDRAM controller supports DDR2/DDR3 memory with capacities of
up to 2GiB. This commit adds emulation support for the Allwinner R40
SDRAM controller.

This driver only supports 256M, 512M and 1024M of memory for now.

Signed-off-by: qianfan Zhao 
---
 hw/arm/allwinner-r40.c                |  21 +-
 hw/arm/bananapi_m2u.c                 |   7 +
 hw/misc/allwinner-r40-dramc.c         | 513
++
 hw/misc/meson.build                   |   1 +
 hw/misc/trace-events                  |  14 +
 include/hw/arm/allwinner-r40.h        |  13 +-
 include/hw/misc/allwinner-r40-dramc.h | 108 ++
 7 files changed, 674 insertions(+), 3 deletions(-)
 create mode 100644 hw/misc/allwinner-r40-dramc.c
 create mode 100644 include/hw/misc/allwinner-r40-dramc.h

diff --git a/hw/arm/allwinner-r40.c b/hw/arm/allwinner-r40.c
index 4bc582630c..0e4542d35f 100644
--- a/hw/arm/allwinner-r40.c
+++ b/hw/arm/allwinner-r40.c
@@ -31,6 +31,7 @@
 #include "hw/loader.h"
 #include "sysemu/sysemu.h"
 #include "hw/arm/allwinner-r40.h"
+#include "hw/misc/allwinner-r40-dramc.h"

 /* Memory map */
 const hwaddr allwinner_r40_memmap[] = {
@@ -53,6 +54,9 @@ const hwaddr allwinner_r40_memmap[] = {
     [AW_R40_DEV_UART6]      = 0x01c29800,
     [AW_R40_DEV_UART7]      = 0x01c29c00,
     [AW_R40_DEV_TWI0]       = 0x01c2ac00,
+    [AW_R40_DEV_DRAMCOM]    = 0x01c62000,
+    [AW_R40_DEV_DRAMCTL]    = 0x01c63000,
+    [AW_R40_DEV_DRAMPHY]    = 0x01c65000,
     [AW_R40_DEV_GIC_DIST]   = 0x01c81000,
     [AW_R40_DEV_GIC_CPU]    = 0x01c82000,
     [AW_R40_DEV_GIC_HYP]    = 0x01c84000,
@@ -129,8 +133,6 @@ static struct AwR40Unimplemented
r40_unimplemented[] = {
     { "gpu",        0x01c4, 64 * KiB },
     { "gmac",       0x01c5, 64 * KiB },
     { "hstmr",      0x01c6, 4 * KiB },
-    { "dram-com",   0x01c62000, 4 * KiB },
-    { "dram-ctl",   0x01c63000, 4 * KiB },
     { "tcon-top",   0x01c7, 4 * KiB },
     { "lcd0",       0x01c71000, 4 * KiB },
     { "lcd1",       0x01c72000, 4 * KiB },
@@ -273,6 +275,12 @@ static void allwinner_r40_init(Object *obj)
     }

     object_initialize_child(obj, "twi0", &s->i2c0,
TYPE_AW_I2C_SUN6I);
+
+    object_initialize_child(obj, "dramc", &s->dramc,
TYPE_AW_R40_DRAMC);
+    object_property_add_alias(obj, "ram-addr", OBJECT(&s->dramc),
+                             "ram-addr");
+    object_property_add_alias(obj, "ram-size", OBJECT(&s->dramc),
+                              "ram-size");
 }

 static void allwinner_r40_realize(DeviceState *dev, Error **errp)
@@ -425,6 +433,15 @@ static void allwinner_r40_realize(DeviceState
*dev, Error **errp)
     sysbus_connect_irq(SYS_BUS_DEVICE(&s->i2c0), 0,
qdev_get_gpio_in(DEVICE(&s->gic), AW_R40_GIC_SPI_TWI0));

+    /* DRAMC */
+    sysbus_realize(SYS_BUS_DEVICE(&s->dramc), &error_fatal);
+    sysbus_mmio_map(SYS_BUS_DEVICE(&s->dramc), 0,
+                    s->memmap[AW_R40_DEV_DRAMCOM]);
+    sysbus_mmio_map(SYS_BUS_DEVICE(&s->dramc), 1,
+                    s->memmap[AW_R40_DEV_DRAMCTL]);
+    sysbus_mmio_map(SYS_BUS_DEVICE(&s->dramc), 2,
+                    s->memmap[AW_R40_DEV_DRAMPHY]);
+
     /* Unimplemented devices */
     for (i = 0; i < ARRAY_SIZE(r40_unimplemented); i++) {
 create_unimplemented_device(r40_unimplemented[i].device_name,
diff --git a/hw/arm/bananapi_m2u.c b/hw/arm/bananapi_m2u.c
index 9c5360a41b..20a4550c68 100644
--- a/hw/arm/bananapi_m2u.c
+++ b/hw/arm/bananapi_m2u.c
@@ -85,6 +85,13 @@ static void bpim2u_init(MachineState *machine)
     object_property_set_int(OBJECT(r40), "clk1-freq", 24 * 1000 *
1000,
                             &error_abort);

+    /* DRAMC */
+    r40->ram_size = machine->ram_size / MiB;
+    object_property_set_uint(OBJECT(r40), "ram-addr",
+                             r40->memmap[AW_R40_DEV_SDRAM], &error_abort);
+    object_property_set_int(OBJECT(r40), "ram-size",
+                            r40->ram_size, &error_abort);
+
     /* Mark R40 object realized */
     qdev_realize(DEVICE(r40), NULL, &error_abort);

diff --git a/hw/misc/allwinner-r40-dramc.c
b/hw/misc/allwinner-r40-dramc.c
new file mode 100644
index 00..b102bcdaba
--- /dev/null
+++ b/hw/misc/allwinner-r40-dramc.c
@@ -0,0 +1,513 @@
+/*
+ * Allwinner R40 SDRAM Controller emulation
+ *
+ * Copyright (C) 2023 qianfan Zhao 
+ *
+ * This program is free software: you can redistribute it and/or
modify
+ * it under the terms of the GNU General Public License as
published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at

Re: [PATCH v12 02/15] accel: collecting TB execution count

2023-05-22 Thread Richard Henderson


On 5/18/23 06:57, Fei Wu wrote:

+void HELPER(inc_exec_freq)(void *ptr)
+{
+    TBStatistics *stats = (TBStatistics *) ptr;
+    tcg_debug_assert(stats);
+    ++stats->executions.normal;
+}

...

+static inline void gen_tb_exec_count(TranslationBlock *tb)
+{
+    if (tb_stats_enabled(tb, TB_EXEC_STATS)) {
+        TCGv_ptr ptr = tcg_temp_new_ptr();
+        tcg_gen_movi_ptr(ptr, (intptr_t)tb->tb_stats);
+        gen_helper_inc_exec_freq(ptr);
+    }
+}


This is 3 host instructions, easily expanded inline:

--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -11,6 +11,7 @@
 #include "qemu/error-report.h"
 #include "tcg/tcg.h"
 #include "tcg/tcg-op.h"
+#include "tcg/tcg-temp-internal.h"
 #include "exec/exec-all.h"
 #include "exec/gen-icount.h"
 #include "exec/log.h"
@@ -18,6 +19,30 @@
 #include "exec/plugin-gen.h"
 #include "exec/replay-core.h"

+
+static void gen_tb_exec_count(TranslationBlock *tb)
+{
+    if (tb_stats_enabled(tb, TB_EXEC_STATS)) {
+        TCGv_ptr ptr = tcg_temp_ebb_new_ptr();
+
+        tcg_gen_movi_ptr(ptr, (intptr_t)&tb->tb_stats->executions.normal);
+        if (sizeof(tb->tb_stats->executions.normal) == 4) {
+            TCGv_i32 t = tcg_temp_ebb_new_i32();
+            tcg_gen_ld_i32(t, ptr, 0);
+            tcg_gen_addi_i32(t, t, 1);
+            tcg_gen_st_i32(t, ptr, 0);
+            tcg_temp_free_i32(t);
+        } else {
+            TCGv_i64 t = tcg_temp_ebb_new_i64();
+            tcg_gen_ld_i64(t, ptr, 0);
+            tcg_gen_addi_i64(t, t, 1);
+            tcg_gen_st_i64(t, ptr, 0);
+            tcg_temp_free_i64(t);
+        }
+        tcg_temp_free_ptr(ptr);
+    }
+}
+
 bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest)
 {
     /* Suppress goto_tb if requested. */


I'm not especially keen on embedding the TBStatistics pointer directly like this; for most 
hosts we will have to put this constant into the constant pool.  Whereas the pointer 
already exists at tb->tb_stats, and tb is at a constant displacement prior to the code, so 
we already have mechanisms for generating pc-relative addresses.


However, that's premature optimization.  Let's get it working first.


r~
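
For readers unfamiliar with TCG, the run-time effect of the inline expansion above can be sketched in plain C: a load, an add and a store on the counter, with the access width chosen from the counter's size. This is an illustration only; DemoStats and demo_bump_counter() are hypothetical names made up for this note and are not part of QEMU.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for TBStatistics; only the counter matters here. */
typedef struct {
    struct {
        unsigned long normal;
    } executions;
} DemoStats;

/*
 * Hypothetical helper: increment the counter at 'ptr' using a 32-bit or
 * 64-bit load/add/store, mirroring the sizeof() test in the patch.
 */
static void demo_bump_counter(void *ptr, size_t width)
{
    assert(width == 4 || width == 8);

    if (width == 4) {
        uint32_t t;
        memcpy(&t, ptr, sizeof(t));   /* load  */
        t += 1;                       /* add   */
        memcpy(ptr, &t, sizeof(t));   /* store */
    } else {
        uint64_t t;
        memcpy(&t, ptr, sizeof(t));
        t += 1;
        memcpy(ptr, &t, sizeof(t));
    }
}
```

Calling demo_bump_counter(&stats.executions.normal, sizeof(stats.executions.normal)) works whether the counter is 4 or 8 bytes wide on the host, which is the point of the sizeof() branch in the generated code.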

Re: [PATCH v4 05/11] hw/misc: Rename axp209 to axp22x and add support AXP221 PMU

2023-05-22 Thread qianfan





On 2023/5/16 3:29, Niek Linnenbank wrote:

Hi Qianfan,

Good idea indeed to turn this driver into a more generic one. If 
possible, it's best to reuse code rather than add new code.


On Wed, May 10, 2023 at 12:30 PM  wrote:

From: qianfan Zhao 

This patch adds minimal support for the AXP-221 PMU and connects it to
the Bananapi M2U board.

Signed-off-by: qianfan Zhao 
---
 hw/arm/Kconfig        |   3 +-
 hw/arm/bananapi_m2u.c |   6 +
 hw/misc/Kconfig       |   2 +-
 hw/misc/axp209.c      | 238 ---
 hw/misc/axp2xx.c      | 283
++
 hw/misc/meson.build   |   2 +-
 hw/misc/trace-events  |   8 +-
 7 files changed, 297 insertions(+), 245 deletions(-)
 delete mode 100644 hw/misc/axp209.c
 create mode 100644 hw/misc/axp2xx.c

diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index b7a84f6e3f..bad4ea158c 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -355,7 +355,7 @@ config ALLWINNER_A10
     select ALLWINNER_WDT
     select ALLWINNER_EMAC
     select ALLWINNER_I2C
-    select AXP209_PMU
+    select AXP2XX_PMU
     select SERIAL
     select UNIMP

@@ -378,6 +378,7 @@ config ALLWINNER_R40
     bool
     default y if TCG && ARM
     select ALLWINNER_A10_PIT
+    select AXP2XX_PMU
     select SERIAL
     select ARM_TIMER
     select ARM_GIC
diff --git a/hw/arm/bananapi_m2u.c b/hw/arm/bananapi_m2u.c
index 1d49a006b5..9c5360a41b 100644
--- a/hw/arm/bananapi_m2u.c
+++ b/hw/arm/bananapi_m2u.c
@@ -23,6 +23,7 @@
 #include "qapi/error.h"
 #include "qemu/error-report.h"
 #include "hw/boards.h"
+#include "hw/i2c/i2c.h"
 #include "hw/qdev-properties.h"
 #include "hw/arm/allwinner-r40.h"

@@ -61,6 +62,7 @@ static void bpim2u_init(MachineState *machine)
 {
     bool bootroom_loaded = false;
     AwR40State *r40;
+    I2CBus *i2c;

     /* BIOS is not supported by this board */
     if (machine->firmware) {
@@ -104,6 +106,10 @@ static void bpim2u_init(MachineState *machine)
         }
     }

+    /* Connect AXP221 */
+    i2c = I2C_BUS(qdev_get_child_bus(DEVICE(&r40->i2c0), "i2c"));
+    i2c_slave_create_simple(i2c, "axp221_pmu", 0x34);
+
     /* SDRAM */
     memory_region_add_subregion(get_system_memory(),
 r40->memmap[AW_R40_DEV_SDRAM], machine->ram);
diff --git a/hw/misc/Kconfig b/hw/misc/Kconfig
index 2ef5781ef8..efeb430a6c 100644
--- a/hw/misc/Kconfig
+++ b/hw/misc/Kconfig
@@ -176,7 +176,7 @@ config ALLWINNER_A10_CCM
 config ALLWINNER_A10_DRAMC
     bool

-config AXP209_PMU
+config AXP2XX_PMU
     bool
     depends on I2C

diff --git a/hw/misc/axp209.c b/hw/misc/axp209.c
deleted file mode 100644
index 2908ed99a6..00
--- a/hw/misc/axp209.c
+++ /dev/null
@@ -1,238 +0,0 @@
-/*
- * AXP-209 PMU Emulation
- *
- * Copyright (C) 2022 Strahinja Jankovic

- *
- * Permission is hereby granted, free of charge, to any person
obtaining a
- * copy of this software and associated documentation files (the
"Software"),
- * to deal in the Software without restriction, including without
limitation
- * the rights to use, copy, modify, merge, publish, distribute,
sublicense,
- * and/or sell copies of the Software, and to permit persons to
whom the
- * Software is furnished to do so, subject to the following
conditions:
- *
- * The above copyright notice and this permission notice shall be
included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY
KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO
EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES
OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER
- * DEALINGS IN THE SOFTWARE.
- *
- * SPDX-License-Identifier: MIT
- */
-
-#include "qemu/osdep.h"
-#include "qemu/log.h"
-#include "trace.h"
-#include "hw/i2c/i2c.h"
-#include "migration/vmstate.h"
-
-#define TYPE_AXP209_PMU "axp209_pmu"
-
-#define AXP209(obj) \
-    OBJECT_CHECK(AXP209I2CState, (obj), TYPE_AXP209_PMU)
-
-/* registers */
-enum {
-    REG_POWER_STATUS = 0x0u,
-    REG_OPERATING_MODE,
-    REG_OTG_VBUS_STATUS,
-    REG_CHIP_VERSION,
-    REG_DATA_CACHE_0,
-    REG_DATA_CACHE_1,
-    REG_DATA_CACHE_2,
-    REG_DATA_CACHE_3,
-    REG_DATA_CACHE_4,
-    REG_DATA_CACHE_5,
-    REG_DATA_CACHE_6,

Re: [PATCH v1] migration: fail the cap check if it requires the use of deferred incoming

2023-05-22 Thread Peter Xu

On Sat, May 20, 2023 at 01:42:06AM +, Wang, Wei W wrote:
> On Friday, May 19, 2023 11:34 PM, Peter Xu wrote:
> > > Ah yes indeed it keeps working, because we apply -global bits before
> > > setup sockets. Then it's fine by me since that's the only thing I
> > > would still like to keep it working. :)
> > >
> > > If so, can we reword the error message a bit?  Obviously as you said
> > > we're not really checking against -defer, but established channels.
> > > The problem is if something is established without knowing multifd
> > > being there it may not work for multifd or preempt, not strictly about 
> > > defer.
> > >
> > > How about:
> > >
> > >   "Multifd/Preempt-Mode cannot be modified if incoming channel has
> > setup"
> > >
> > > ?
> 
> Yes, I'll reword it a bit.
> 
> > 
> > We may also want to trap the channel setups on num:
> > 
> > migrate_params_test_apply():
> > 
> > if (params->has_multifd_channels) {
> > dest->multifd_channels = params->multifd_channels;
> > }
> 
> Didn’t get this one. What do you want to add to above?

I meant after listen() is called with an explicit number in this case,
should we disallow changing of multifd number of channels?

-- 
Peter Xu
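
If the answer is yes, the check could look roughly like the sketch below. Everything in it is an assumption made for illustration: DemoIncomingState and demo_set_multifd_channels() are hypothetical and do not correspond to QEMU's migration structures or to migrate_params_test_apply().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state; not QEMU's MigrationState. */
typedef struct {
    bool incoming_channel_setup;  /* true once the listening socket exists */
    int multifd_channels;         /* channel count in effect */
} DemoIncomingState;

/*
 * Hypothetical setter: apply the new channel count, or refuse it when the
 * incoming channel was already set up with a different count.
 * Returns true on success; on failure *errmsg points to a static message.
 */
static bool demo_set_multifd_channels(DemoIncomingState *s, int channels,
                                      const char **errmsg)
{
    assert(s != NULL && errmsg != NULL);

    if (s->incoming_channel_setup && channels != s->multifd_channels) {
        *errmsg = "multifd channel count cannot be modified once the "
                  "incoming channel is set up";
        return false;
    }

    s->multifd_channels = channels;
    *errmsg = NULL;
    return true;
}
```

Setting the same value again after setup is allowed in this sketch, so management tools that re-send unchanged parameters would not start failing.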

Re: [PATCH] bitops.h: Compile out asserts without --enable-debug

2023-05-22 Thread Richard Henderson


On 5/22/23 15:26, BALATON Zoltan wrote:

On Mon, 22 May 2023, Alex Bennée wrote:

(ajb: add Richard for his compiler-fu)
BALATON Zoltan  writes:

On Mon, 22 May 2023, Alex Bennée wrote:

BALATON Zoltan  writes:


The low-level extract and deposit functions provided by bitops.h are
used in performance-critical places. They crept into target/ppc via
FIELD_EX64 and are also used by softfloat, so PPC code that makes heavy
use of the FPU, where hardfloat is also disabled, is doubly affected.


Most of these asserts compile out to nothing if the compiler is able to
verify the constants are in the range. For example examining
the start of float64_add:





I don't see any check and abort steps because all the shift and mask
values are known at compile time. The softfloat compilation certainly
does have some assert points though:

readelf -s ./libqemu-ppc64-softmmu.fa.p/fpu_softfloat.c.o  |grep assert
  136:  0 NOTYPE  GLOBAL DEFAULT  UND g_assertion_mess[...]
  138:  0 NOTYPE  GLOBAL DEFAULT  UND __assert_fail

but the references are for the ISRA segments so it's tricky to know if
they get used or are just there for LTO purposes.

If there are hot-paths that show up the extract/deposit functions I
suspect a better approach would be to implement _nocheck variants (or
maybe _noassert?) and use them where required rather than turning off
the assert checking for these utility functions.


Just to clarify again, the asserts are still there when compiled with
--enable-debug. The patch only turns them off for optimised release
builds which I think makes sense if these asserts are to catch
programming errors.


Well as Peter said the general policy is to keep asserts in but I
appreciate this is a hotpath case.


I think I've also suggested adding noassert
versions of these but that wasn't a popular idea and it may also not
be easy to convert all places to use that like for example the
register fields related usage in target/ppc as that would also affect
other places.


Is code generation or device emulation really on the hot path? Generally
a well-predicted assert is in the noise for those operations.


They aren't in code generation but in helpers, as you can also see in the profile below, 
so they can be on the hot path. I've also noticed that extract8 and extract16 just call 
extract32 after adding another assert of their own in addition to the one in extract32, 
which is double overhead for really no reason. I'd delete all these asserts as the 
likelihood of bugs these could catch is very low anyway (how often do you expect somebody 
to call these with out-of-bounds values that would not be obvious from the results 
otherwise?) but leaving them in non-debug builds is totally useless in my opinion.
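
To make the discussion concrete, here is a minimal sketch of an extract helper whose range check is kept only in debug builds, plus a narrow variant layered on top of it. The names demo_assert, DEMO_DEBUG_BUILD, demo_extract32 and demo_extract8 are hypothetical and chosen for this note; the shift-and-mask expression follows the usual extract pattern, but this is not a copy of the proposed patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical switch: define DEMO_DEBUG_BUILD to keep the range checks. */
#ifdef DEMO_DEBUG_BUILD
#define demo_assert(cond) assert(cond)
#else
#define demo_assert(cond) ((void)0)
#endif

/* Extract 'length' bits starting at bit 'start' from a 32-bit value. */
static inline uint32_t demo_extract32(uint32_t value, int start, int length)
{
    demo_assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (~0U >> (32 - length));
}

/*
 * 8-bit variant built on the 32-bit one. In a debug build both checks
 * run, which is the doubled overhead mentioned above; in a non-debug
 * build neither does.
 */
static inline uint8_t demo_extract8(uint8_t value, int start, int length)
{
    demo_assert(start >= 0 && length > 0 && length <= 8 - start);
    return (uint8_t)demo_extract32(value, start, length);
}
```

When start and length are compile-time constants the compiler can usually prove the condition and drop the check anyway; the macro only matters for call sites where they are not.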



So this seems to be the simplest and most effective
approach.

The softfloat-related usage in the tests I've done seems to mostly
come from unpacking and repacking floats in softfloat, which is done
for every operation, e.g. muladd, which mp3 encoding mostly uses, does 3
unpacks and 1 pack for each call and each unpack is 3 extracts, so even
small overheads add up quickly. Just 1 muladd will result in 9
extracts and 2 deposits at least, plus updating PPC flags for each FPU
op adds a bunch more. I did some profiling with perf to find these.
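
The "each unpack is 3 extracts" arithmetic can be illustrated with a small sketch that splits a double into sign, exponent and fraction fields. It assumes the host double is IEEE 754 binary64; DemoParts, demo_extract64() and demo_unpack_float64() are hypothetical names for this note, not softfloat's own types or functions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical unpacked representation of a binary64 value. */
typedef struct {
    int sign;       /* 1 bit   */
    int exp;        /* 11 bits, still biased */
    uint64_t frac;  /* 52 bits */
} DemoParts;

/* Extract 'length' bits starting at bit 'start'; one check per call. */
static inline uint64_t demo_extract64(uint64_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 64 - start);
    return (value >> start) & (~0ULL >> (64 - length));
}

/* Three extracts per unpack: fraction, exponent, sign. */
static DemoParts demo_unpack_float64(double d)
{
    uint64_t raw;
    DemoParts p;

    memcpy(&raw, &d, sizeof(raw));  /* assumes IEEE 754 binary64 layout */
    p.frac = demo_extract64(raw, 0, 52);
    p.exp = (int)demo_extract64(raw, 52, 11);
    p.sign = (int)demo_extract64(raw, 63, 1);
    return p;
}
```

A muladd unpacks three operands, so it goes through nine such extracts before any arithmetic happens, which is why a per-call check that is not optimised away shows up in the profile.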


After some messing about trying to get lame to cross compile to a static
binary I was able to replicate what you've seen:

 11.44%  qemu-ppc64  qemu-ppc64   [.] unpack_raw64.isra.0
 11.03%  qemu-ppc64  qemu-ppc64   [.] parts64_uncanon_normal
  8.26%  qemu-ppc64  qemu-ppc64   [.] helper_compute_fprf_float64
  6.75%  qemu-ppc64  qemu-ppc64   [.] do_float_check_status
  5.34%  qemu-ppc64  qemu-ppc64   [.] parts64_muladd
  4.75%  qemu-ppc64  qemu-ppc64   [.] pack_raw64.isra.0
  4.38%  qemu-ppc64  qemu-ppc64   [.] parts64_canonicalize
  3.62%  qemu-ppc64  qemu-ppc64   [.] float64r32_round_pack_canonical
  3.32%  qemu-ppc64  qemu-ppc64   [.] helper_todouble
  2.68%  qemu-ppc64  qemu-ppc64   [.] float64_add
  2.51%  qemu-ppc64  qemu-ppc64   [.] float64_hs_compare
  2.30%  qemu-ppc64  qemu-ppc64   [.] float64r32_muladd
  1.80%  qemu-ppc64  qemu-ppc64   [.] float64r32_mul
  1.40%  qemu-ppc64  qemu-ppc64   [.] float64r32_add
  1.34%  qemu-ppc64  qemu-ppc64   [.] parts64_mul
  1.16%  qemu-ppc64  qemu-ppc64   [.] parts64_addsub
  1.14%  qemu-ppc64  qemu-ppc64   [.] helper_reset_fpstatus
  1.06%  qemu-ppc64  qemu-ppc64   [.] helper_float_check_status
  1.04%  qemu-ppc64  qemu-ppc64   [.] float64_muladd


I've run the 32-bit PPC version in qemu-system-ppc so the profile is a bit different (has more 
system related overhead that I plan to look at separately) but this part is similar to the 
above. I also wonder what makes helper_compute_fprf_float64 a bottleneck as

Re: [PATCH v7 5/7] hw/cxl/events: Add injection of General Media Events

2023-05-22 Thread Fan Ni

The 05/22/2023 16:09, Jonathan Cameron wrote:
> From: Ira Weiny 
> 
> To facilitate testing provide a QMP command to inject a general media
> event.  The event can be added to the log specified.
> 
> Signed-off-by: Ira Weiny 
> Signed-off-by: Jonathan Cameron 
> 
> ---

Reviewed-by: Fan Ni 

> v7: Various docs updates and field renames including a lot more
> specification references.
> ---
>  qapi/cxl.json   |  79 +
>  include/hw/cxl/cxl_events.h |  20 +++
>  hw/mem/cxl_type3.c  | 111 
>  hw/mem/cxl_type3_stubs.c|  10 
>  4 files changed, 220 insertions(+)
> 
> diff --git a/qapi/cxl.json b/qapi/cxl.json
> index 4849fca776..7700e26a0d 100644
> --- a/qapi/cxl.json
> +++ b/qapi/cxl.json
> @@ -5,6 +5,85 @@
>  # = CXL devices
>  ##
>  
> +##
> +# @CxlEventLog:
> +#
> +# CXL has a number of separate event logs for different types of
> +# events. Each such event log is handled and signaled independently.
> +#
> +# @informational: Information Event Log
> +#
> +# @warning: Warning Event Log
> +#
> +# @failure: Failure Event Log
> +#
> +# @fatal: Fatal Event Log
> +#
> +# Since: 8.1
> +##
> +{ 'enum': 'CxlEventLog',
> +  'data': ['informational',
> +   'warning',
> +   'failure',
> +   'fatal']
> + }
> +
> +##
> +# @cxl-inject-general-media-event:
> +#
> +# Inject an event record for a General Media Event (CXL r3.0
> +# 8.2.9.2.1.1) This event type is reported via one of the event logs
> +# specified via the log parameter.
> +#
> +# @path: CXL type 3 device canonical QOM path
> +#
> +# @log: event log to add the event to
> +#
> +# @flags: Event Record Flags. See CXL r3.0 Table 8-42 Common Event
> +# Record Format, Event Record Flags for subfield definitions.
> +#
> +# @dpa: Device Physical Address (relative to @path device). Note lower
> +#   bits include some flags. See CXL r3.0 Table 8-43 General Media
> +#   Event Record, Physical Address.
> +#
> +# @descriptor: Memory Event Descriptor with additional memory
> +#  event information. See CXL r3.0 Table 8-43 General
> +#  Media Event Record, Memory Event Descriptor for bit
> +#  definitions.
> +#
> +# @type: Type of memory event that occurred. See CXL r3.0 Table 8-43
> +#General Media Event Record, Memory Event Type for possible
> +#values.
> +#
> +# @transaction-type: Type of first transaction that caused the event
> +#to occur. See CXL r3.0 Table 8-43 General Media
> +#Event Record, Transaction Type for possible
> +#values.
> +#
> +# @channel: The channel of the memory event location. A channel is
> +#   an interface that can be independently accessed for a
> +#   transaction.
> +#
> +# @rank: The rank of the memory event location. A rank is a set of
> +#memory devices on a channel that together execute a
> +#transaction.
> +#
> +# @device: Bitmask that represents all devices in the rank associated
> +#  with the memory event location.
> +#
> +# @component-id: Device specific component identifier for the event.
> +#May describe a field replaceable sub-component of
> +#the device.
> +#
> +# Since: 8.1
> +##
> +{ 'command': 'cxl-inject-general-media-event',
> +  'data': { 'path': 'str', 'log': 'CxlEventLog', 'flags': 'uint8',
> +'dpa': 'uint64', 'descriptor': 'uint8',
> +'type': 'uint8', 'transaction-type': 'uint8',
> +'*channel': 'uint8', '*rank': 'uint8',
> +'*device': 'uint32', '*component-id': 'str' } }
> +
>  ##
>  # @cxl-inject-poison:
>  #
> diff --git a/include/hw/cxl/cxl_events.h b/include/hw/cxl/cxl_events.h
> index 4bf8b7aa08..b189193f4c 100644
> --- a/include/hw/cxl/cxl_events.h
> +++ b/include/hw/cxl/cxl_events.h
> @@ -103,4 +103,24 @@ typedef struct CXLEventInterruptPolicy {
>  /* DCD is optional but other fields are not */
>  #define CXL_EVENT_INT_SETTING_MIN_LEN 4
>  
> +/*
> + * General Media Event Record
> + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
> + */
> +#define CXL_EVENT_GEN_MED_COMP_ID_SIZE  0x10
> +#define CXL_EVENT_GEN_MED_RES_SIZE  0x2e
> +typedef struct CXLEventGenMedia {
> +CXLEventRecordHdr hdr;
> +uint64_t phys_addr;
> +uint8_t descriptor;
> +uint8_t type;
> +uint8_t transaction_type;
> +uint16_t validity_flags;
> +uint8_t channel;
> +uint8_t rank;
> +uint8_t device[3];
> +uint8_t component_id[CXL_EVENT_GEN_MED_COMP_ID_SIZE];
> +uint8_t reserved[CXL_EVENT_GEN_MED_RES_SIZE];
> +} QEMU_PACKED CXLEventGenMedia;
> +
>  #endif /* CXL_EVENTS_H */
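
The field sizes in the new struct can be cross-checked by hand: with the 48-byte common header from the earlier patch in this series, the General Media record adds up to 128 bytes, the same as a raw record (0x30 header plus 0x50 data). Below is a standalone sketch of that arithmetic. `Uuid16` stands in for QemuUUID and `PACKED` for QEMU_PACKED; both are assumptions made so the snippet compiles outside the QEMU tree, and the struct names are shortened on purpose to mark them as illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define PACKED __attribute__((packed))   /* stand-in for QEMU_PACKED (GCC/Clang) */

typedef struct { uint8_t data[16]; } Uuid16;   /* stand-in for QemuUUID */

/* Common Event Record header, fields as listed in the series. */
typedef struct {
    Uuid16 id;
    uint8_t length;
    uint8_t flags[3];
    uint16_t handle;
    uint16_t related_handle;
    uint64_t timestamp;
    uint8_t maint_op_class;
    uint8_t reserved[0xf];
} PACKED EventRecordHdr;

/* General Media Event Record, fields as listed in the hunk above. */
typedef struct {
    EventRecordHdr hdr;
    uint64_t phys_addr;
    uint8_t descriptor;
    uint8_t type;
    uint8_t transaction_type;
    uint16_t validity_flags;
    uint8_t channel;
    uint8_t rank;
    uint8_t device[3];
    uint8_t component_id[0x10];
    uint8_t reserved[0x2e];
} PACKED EventGenMedia;

/* Compile-time checks: 48-byte header, 128-byte record. */
static_assert(sizeof(EventRecordHdr) == 0x30, "common header is 48 bytes");
static_assert(sizeof(EventGenMedia) == 0x30 + 0x50, "record is 128 bytes");
```

If either size drifts (for example by dropping the packed attribute), the build fails instead of the guest seeing a malformed record.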
> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> index c9e347f42b..b1618779d2 100644
> --- a/hw/mem/cxl_type3.c
> +++ b/hw/mem/cxl_type3.c
> @@ -1181,6 +1181,117 @@ void qmp_cxl_inject_correctable_error(const char 
> *path, CxlCorErrorType type,
>

Re: Add CI configuration for Kubernetes

2023-05-22 Thread Richard Henderson


On 5/22/23 10:41, Camilla Conte wrote:

Here's a second version (v2) of patches to support the Kubernetes runner for 
Gitlab CI.
You can find the v1 thread here: 
https://lore.kernel.org/qemu-devel/20230407145252.32955-1-cco...@redhat.com/.



This does not work:

https://gitlab.com/qemu-project/qemu/-/pipelines/875254290

In particular, most jobs spent 30 minutes (until I cancelled them), e.g.

https://gitlab.com/qemu-project/qemu/-/jobs/4329346855#L7561

Client:
 Debug Mode: false
Server:
ERROR: Cannot connect to the Docker daemon at unix:///var/run/docker.sock.
Is the docker daemon running?


r~

Re: [PATCH v7 4/7] hw/cxl/events: Add event interrupt support

2023-05-22 Thread Fan Ni

The 05/22/2023 16:09, Jonathan Cameron wrote:
> From: Ira Weiny 
> 
> Replace the stubbed out CXL Get/Set Event interrupt policy mailbox
> commands.  Enable those commands to control interrupts for each of the
> event log types.
> 
> Skip the standard input mailbox length on the Set command due to DCD
> being optional.  Perform the checks separately.
> 
> Signed-off-by: Ira Weiny 
> Signed-off-by: Jonathan Cameron 
> ---

Reviewed-by: Fan Ni 
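
For readers following the policy layout, here is a small sketch of how one settings byte is packed and of the minimum-length check the commit message describes, based on the definitions in the diff below. The helper names (`int_setting`, `int_mode`, `int_vector`, `set_policy_len_ok`, `policy_has_dyn_cap`) are hypothetical, and treating any length of at least 4 as acceptable is my reading of "DCD is optional", not a copy of the mailbox handler.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { INT_NONE = 0x0, INT_MSI_MSIX = 0x1, INT_FW = 0x2, INT_RES = 0x3 };
#define INT_MODE_MASK       0x3
#define INT_SETTING_MIN_LEN 4   /* info, warn, failure, fatal */
#define INT_SETTING_FULL    5   /* ... plus the optional dynamic capacity byte */

/* Vector number goes in bits [7:4], interrupt mode in bits [1:0]. */
static inline uint8_t int_setting(unsigned vector)
{
    return (uint8_t)(((vector & 0xf) << 4) | INT_MSI_MSIX);
}

static inline unsigned int_mode(uint8_t setting)
{
    return setting & INT_MODE_MASK;
}

static inline unsigned int_vector(uint8_t setting)
{
    return setting >> 4;
}

/* Assumed check: the four mandatory bytes must be present. */
static inline bool set_policy_len_ok(size_t len)
{
    return len >= INT_SETTING_MIN_LEN;
}

/* Assumed check: the dynamic capacity byte is only read if it was sent. */
static inline bool policy_has_dyn_cap(size_t len)
{
    return len >= INT_SETTING_FULL;
}
```

With this packing, vector 5 in MSI/MSI-X mode encodes as 0x51.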

>  include/hw/cxl/cxl_device.h |   6 +-
>  include/hw/cxl/cxl_events.h |  23 
>  hw/cxl/cxl-events.c |  33 ++-
>  hw/cxl/cxl-mailbox-utils.c  | 106 +---
>  hw/mem/cxl_type3.c  |   4 +-
>  5 files changed, 147 insertions(+), 25 deletions(-)
> 
> diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> index d3aec1bc0e..1978730fba 100644
> --- a/include/hw/cxl/cxl_device.h
> +++ b/include/hw/cxl/cxl_device.h
> @@ -121,6 +121,8 @@ typedef struct CXLEventLog {
>  uint16_t overflow_err_count;
>  uint64_t first_overflow_timestamp;
>  uint64_t last_overflow_timestamp;
> +bool irq_enabled;
> +int irq_vec;
>  QemuMutex lock;
>  QSIMPLEQ_HEAD(, CXLEvent) events;
>  } CXLEventLog;
> @@ -369,7 +371,7 @@ MemTxResult cxl_type3_write(PCIDevice *d, hwaddr 
> host_addr, uint64_t data,
>  
>  uint64_t cxl_device_get_timestamp(CXLDeviceState *cxlds);
>  
> -void cxl_event_init(CXLDeviceState *cxlds);
> +void cxl_event_init(CXLDeviceState *cxlds, int start_msg_num);
>  bool cxl_event_insert(CXLDeviceState *cxlds, CXLEventLogType log_type,
>CXLEventRecordRaw *event);
>  CXLRetCode cxl_event_get_records(CXLDeviceState *cxlds, CXLGetEventPayload 
> *pl,
> @@ -378,6 +380,8 @@ CXLRetCode cxl_event_get_records(CXLDeviceState *cxlds, 
> CXLGetEventPayload *pl,
>  CXLRetCode cxl_event_clear_records(CXLDeviceState *cxlds,
> CXLClearEventPayload *pl);
>  
> +void cxl_event_irq_assert(CXLType3Dev *ct3d);
> +
>  void cxl_set_poison_list_overflowed(CXLType3Dev *ct3d);
>  
>  #endif
> diff --git a/include/hw/cxl/cxl_events.h b/include/hw/cxl/cxl_events.h
> index d4aaa894f1..4bf8b7aa08 100644
> --- a/include/hw/cxl/cxl_events.h
> +++ b/include/hw/cxl/cxl_events.h
> @@ -80,4 +80,27 @@ typedef struct CXLClearEventPayload {
>  uint16_t handle[];
>  } CXLClearEventPayload;
>  
> +/**
> + * Event Interrupt Policy
> + *
> + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> + */
> +typedef enum CXLEventIntMode {
> +CXL_INT_NONE = 0x00,
> +CXL_INT_MSI_MSIX = 0x01,
> +CXL_INT_FW   = 0x02,
> +CXL_INT_RES  = 0x03,
> +} CXLEventIntMode;
> +#define CXL_EVENT_INT_MODE_MASK 0x3
> +#define CXL_EVENT_INT_SETTING(vector) ((((uint8_t)vector & 0xf) << 4) | CXL_INT_MSI_MSIX)
> +typedef struct CXLEventInterruptPolicy {
> +uint8_t info_settings;
> +uint8_t warn_settings;
> +uint8_t failure_settings;
> +uint8_t fatal_settings;
> +uint8_t dyn_cap_settings;
> +} QEMU_PACKED CXLEventInterruptPolicy;
> +/* DCD is optional but other fields are not */
> +#define CXL_EVENT_INT_SETTING_MIN_LEN 4
> +
>  #endif /* CXL_EVENTS_H */
> diff --git a/hw/cxl/cxl-events.c b/hw/cxl/cxl-events.c
> index 5da1b76b97..d161d57456 100644
> --- a/hw/cxl/cxl-events.c
> +++ b/hw/cxl/cxl-events.c
> @@ -13,6 +13,8 @@
>  #include "qemu/bswap.h"
>  #include "qemu/typedefs.h"
>  #include "qemu/error-report.h"
> +#include "hw/pci/msi.h"
> +#include "hw/pci/msix.h"
>  #include "hw/cxl/cxl.h"
>  #include "hw/cxl/cxl_events.h"
>  
> @@ -26,7 +28,7 @@ static void reset_overflow(CXLEventLog *log)
>  log->last_overflow_timestamp = 0;
>  }
>  
> -void cxl_event_init(CXLDeviceState *cxlds)
> +void cxl_event_init(CXLDeviceState *cxlds, int start_msg_num)
>  {
>  CXLEventLog *log;
>  int i;
> @@ -37,9 +39,16 @@ void cxl_event_init(CXLDeviceState *cxlds)
>  log->overflow_err_count = 0;
>  log->first_overflow_timestamp = 0;
>  log->last_overflow_timestamp = 0;
> +log->irq_enabled = false;
> +log->irq_vec = start_msg_num++;
> +qemu_mutex_init(&log->lock);
> +QSIMPLEQ_INIT(&log->events);
>  }
> +
> +/* Override -- Dynamic Capacity uses the same vector as info */
> +cxlds->event_logs[CXL_EVENT_TYPE_DYNAMIC_CAP].irq_vec =
> +  cxlds->event_logs[CXL_EVENT_TYPE_INFO].irq_vec;
> +
>  }
>  
>  static CXLEvent *cxl_event_get_head(CXLEventLog *log)
> @@ -215,3 +224,25 @@ CXLRetCode cxl_event_clear_records(CXLDeviceState 
> *cxlds, CXLClearEventPayload *
>  
>  return CXL_MBOX_SUCCESS;
>  }
> +
> +void cxl_event_irq_assert(CXLType3Dev *ct3d)
> +{
> +CXLDeviceState *cxlds = &ct3d->cxl_dstate;
> +PCIDevice *pdev = &ct3d->parent_obj;
> +int i;
> +
> +for (i = 0; i < CXL_EVENT_TYPE_MAX; i++) {
> +CXLEventLog *log = &cxlds->event_logs[i];
> +
> +if (!log->irq_enabled || cxl_event_empty(log)) {
> +continue;
> +}
> +
> +/*  Notifies

Re: [PATCH] bitops.h: Compile out asserts without --enable-debug

2023-05-22 Thread BALATON Zoltan


On Mon, 22 May 2023, Alex Bennée wrote:

(ajb: add Richard for his compiler-fu)
BALATON Zoltan  writes:

On Mon, 22 May 2023, Alex Bennée wrote:

BALATON Zoltan  writes:


The low level extract and deposit functions provided by bitops.h are
used in performance critical places. They crept into target/ppc via
FIELD_EX64 and are also used by softfloat, so PPC code using a lot of FPU
where hardfloat is also disabled is doubly affected.


Most of these asserts compile out to nothing if the compiler is able to
verify the constants are in the range. For example examining
the start of float64_add:





I don't see any check and abort steps because all the shift and mask
values are known at compile time. The softfloat compilation certainly
does have some assert points though:

readelf -s ./libqemu-ppc64-softmmu.fa.p/fpu_softfloat.c.o  |grep assert
  136:  0 NOTYPE  GLOBAL DEFAULT  UND g_assertion_mess[...]
  138:  0 NOTYPE  GLOBAL DEFAULT  UND __assert_fail

but the references are for the ISRA segments so it's tricky to know if
they get used or are just there for LTO purposes.

If there are hot-paths that show up the extract/deposit functions I
suspect a better approach would be to implement _nocheck variants (or
maybe _noassert?) and use them where required rather than turning off
the assert checking for these utility functions.
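
A minimal sketch of the two options being discussed, not the actual bitops.h code: a range-checked extract whose check can be compiled out, and a separate variant with no check at all. The names `extract32_checked`, `extract32_nocheck` and the `BITOPS_NO_ASSERT` macro are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical switch: define BITOPS_NO_ASSERT to compile the checks out. */
#ifdef BITOPS_NO_ASSERT
#define bitops_assert(cond) ((void)0)
#else
#define bitops_assert(cond) assert(cond)
#endif

/* Extract 'length' bits starting at bit 'start', with a range check. */
static inline uint32_t extract32_checked(uint32_t value, int start, int length)
{
    bitops_assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (~0U >> (32 - length));
}

/* Same computation with no check; the caller guarantees a valid range. */
static inline uint32_t extract32_nocheck(uint32_t value, int start, int length)
{
    return (value >> start) & (~0U >> (32 - length));
}
```

When `start` and `length` are compile-time constants in range, an optimising compiler can usually drop the check in the first form on its own, which is the point made above about float64_add.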


Just to clarify again, the asserts are still there when compiled with
--enable-debug. The patch only turns them off for optimised release
builds which I think makes sense if these asserts are to catch
programming errors.


Well as Peter said the general policy is to keep asserts in but I
appreciate this is a hotpath case.


I think I've also suggested adding noassert
versions of these but that wasn't a popular idea and it may also not
be easy to convert all places to use that like for example the
register fields related usage in target/ppc as that would also affect
other places.


Is code generation or device emulation really on the hot-path? Generally
a well predicted assert is in the noise for those operations.


They aren't in code generation but in helpers as you can also see in the 
profile below and so they can be on hot path. Also I've noticed that 
extract8 and extract16 just call extract32 after adding another assert on 
their own in addition to the one in extract32 which is double overhead for 
really no reason. I'd delete all these asserts as the likelihood of bugs 
these could catch is very low anyway (how often do you expect somebody to 
call these with out of bound values that would not be obvious from the 
results otherwise?) but leaving them in non-debug builds is totally 
useless in my opinion.



So this seems to be the simplest and most effective
approach.

The softfloat related usage in these tests I've done seem to mostly
come from unpacking and repacking floats in softfloat which is done
for every operation, e.g. muladd which mp3 encoding mostly uses does 3
unpacks and 1 pack for each call and each unpack is 3 extracts so even
small overheads add up quickly. Just 1 muladd will result in 9
extracts and 2 deposits at least plus updating PPC flags for each FPU
op adds a bunch more. I did some profiling with perf to find these.
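
To make the count concrete, here is a rough sketch of what unpacking one float64 involves, assuming the usual IEEE 754 binary64 layout (52-bit fraction, 11-bit exponent, 1 sign bit). `extract64_sketch` and `unpack_float64_sketch` are illustrative names, not the softfloat functions. Each unpack costs three extracts, each with its own range check, so the three operands of a muladd give nine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool sign;
    int exp;        /* biased exponent, 11 bits */
    uint64_t frac;  /* fraction, 52 bits */
} RawFloat64;

/* Extract 'length' bits starting at 'start'; the assert is the check
 * whose cost is being debated in this thread. */
static inline uint64_t extract64_sketch(uint64_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 64 - start);
    return (value >> start) & (~0ULL >> (64 - length));
}

/* Split the raw bit pattern into fraction, exponent and sign. */
static inline RawFloat64 unpack_float64_sketch(uint64_t bits)
{
    RawFloat64 r;
    r.frac = extract64_sketch(bits, 0, 52);
    r.exp = (int)extract64_sketch(bits, 52, 11);
    r.sign = extract64_sketch(bits, 63, 1) != 0;
    return r;
}
```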


After some messing about trying to get lame to cross compile to a static
binary I was able to replicate what you've seen:

 11.44%  qemu-ppc64  qemu-ppc64   [.] unpack_raw64.isra.0
 11.03%  qemu-ppc64  qemu-ppc64   [.] parts64_uncanon_normal
  8.26%  qemu-ppc64  qemu-ppc64   [.] helper_compute_fprf_float64
  6.75%  qemu-ppc64  qemu-ppc64   [.] do_float_check_status
  5.34%  qemu-ppc64  qemu-ppc64   [.] parts64_muladd
  4.75%  qemu-ppc64  qemu-ppc64   [.] pack_raw64.isra.0
  4.38%  qemu-ppc64  qemu-ppc64   [.] parts64_canonicalize
  3.62%  qemu-ppc64  qemu-ppc64   [.] float64r32_round_pack_canonical
  3.32%  qemu-ppc64  qemu-ppc64   [.] helper_todouble
  2.68%  qemu-ppc64  qemu-ppc64   [.] float64_add
  2.51%  qemu-ppc64  qemu-ppc64   [.] float64_hs_compare
  2.30%  qemu-ppc64  qemu-ppc64   [.] float64r32_muladd
  1.80%  qemu-ppc64  qemu-ppc64   [.] float64r32_mul
  1.40%  qemu-ppc64  qemu-ppc64   [.] float64r32_add
  1.34%  qemu-ppc64  qemu-ppc64   [.] parts64_mul
  1.16%  qemu-ppc64  qemu-ppc64   [.] parts64_addsub
  1.14%  qemu-ppc64  qemu-ppc64   [.] helper_reset_fpstatus
  1.06%  qemu-ppc64  qemu-ppc64   [.] helper_float_check_status
  1.04%  qemu-ppc64  qemu-ppc64   [.] float64_muladd


I've run 32 bit PPC version in qemu-system-ppc so the profile is a bit 
different (has more system related overhead that I plan to look at 
separately) but this part is similar to the above. I also wonder what 
makes helper_compute_fprf_float64 a bottleneck as that does not seem to 
have much

Re: [PULL 00/20] Allow "make check" with "--without-default-devices"

2023-05-22 Thread Richard Henderson


On 5/22/23 04:49, Thomas Huth wrote:

The following changes since commit aa222a8e4f975284b3f8f131653a4114b3d333b3:

   Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu
into staging (2023-05-19 12:17:16 -0700)

are available in the Git repository at:

   https://gitlab.com/thuth/qemu.git  tags/pull-request-2023-05-22

for you to fetch changes up to 3884bf6468ac6bbb58c2b3feaa74e87f821b52f3:

   memory: stricter checks prior to unsetting engaged_in_io (2023-05-22 
10:35:28 +0200)


* First batch of fixes to allow "make check" with "--without-default-devices"
* Enable the "bios bits" avocado test in the gitlab-CI
* Another minor fix for the redundancy DMA blocker code


Alexander Bulekov (1):
   memory: stricter checks prior to unsetting engaged_in_io

Ani Sinha (1):
   acpi/tests/avocado/bits: enable bios bits avocado tests on gitlab CI 
pipeline

Thomas Huth (18):
   hw/i386/Kconfig: ISAPC works fine without VGA_ISA
   softmmu/vl.c: Check for the availability of the VGA device before using 
it
   hw: Move the default NIC machine class setting from the x86 to the 
generic one
   softmmu/vl.c: Disable default NIC if it has not been compiled into the 
binary
   hw/ppc: Use MachineClass->default_nic in the ppc machines
   hw/s390x: Use MachineClass->default_nic in the s390x machine
   hw/sh4: Use MachineClass->default_nic in the sh4 r2d machine
   hw/char/parallel: Move TYPE_ISA_PARALLEL to the header file
   hw/i386: Ignore the default parallel port if it has not been compiled 
into QEMU
   hw/sparc64/sun4u: Use MachineClass->default_nic and 
MachineClass->no_parallel
   tests/qtest/readconfig-test: Check for the availability of USB 
controllers
   tests/qtest/usb-hcd-uhci-test: Skip test if UHCI controller is not 
available
   tests/qtest/cdrom-test: Fix the test to also work without optional 
devices
   tests/qtest/virtio-ccw-test: Remove superfluous tests
   tests/qtest: Check for the availability of virtio-ccw devices before 
using them
   tests/qtest/meson.build: Run the net filter tests only with default 
devices
   tests/qemu-iotests/172: Run QEMU with -vga none and -nic none
   .gitlab-ci.d/buildtest.yml: Run full "make check" with 
--without-default-devices


Applied, thanks.  Please update https://wiki.qemu.org/ChangeLog/8.1 as 
appropriate.


r~

Re: [PATCH 1/1] hw/ide/core.c: fix handling of unsupported commands

2023-05-22 Thread Mateusz Albecki

Certainly seems like my patch is wrong as it will make the abort path
execute ide_cmd_done twice. During debug I came to the conclusion that
ide_cmd_done is not called at all as I was getting timeouts on the driver
side while waiting for D2H FIS. I am still not sure how I was getting this
behavior if the problem was actually with setting correct error bits. Even
so I think it can be safely assumed that Niklas' change will solve the
issue, I will try to verify it in a couple of days and if I see any problem
I will come back to you.

Mateusz

On Wed, 17 May 2023 at 23:33, John Snow  wrote:

> On Sun, Apr 16, 2023 at 6:29 PM Mateusz Albecki
>  wrote:
> >
> > From: Mateusz Albecki 
> >
> > Current code will not call ide_cmd_done when aborting the unsupported
> > command which will lead to the command timeout on the driver side instead
> > of getting a D2H FIS with ABRT indication. This can lead to problems on
> the
> > driver side as the spec mandates that device should return a D2H FIS with
> > ABRT bit set in ERR register(from SATA 3.1 section 16.3.3.8.6)
> >
> > Signed-off-by: Mateusz Albecki 
> > ---
> >  hw/ide/core.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/hw/ide/core.c b/hw/ide/core.c
> > index 45d14a25e9..d7027bbd4d 100644
> > --- a/hw/ide/core.c
> > +++ b/hw/ide/core.c
> > @@ -2146,6 +2146,7 @@ void ide_bus_exec_cmd(IDEBus *bus, uint32_t val)
> >
> >  if (!ide_cmd_permitted(s, val)) {
> >  ide_abort_command(s);
> > +ide_cmd_done(s);
> >  ide_bus_set_irq(s->bus);
> >  return;
> >  }
> > --
> > 2.40.0
> >
>
> I recently noticed that Niklas Cassel sent a patch to fix unsupported
> command handling:
> https://lists.gnu.org/archive/html/qemu-devel/2023-04/msg05552.html
>
> I suspect that his approach is the more technically correct one and
> that calling ide_cmd_done here is a heavy cudgel that may have
> unintended consequences. Am I mistaken?
> Can you check that Niklas's patch solves your issue? I think you're
> both solving the same problem. I've CC'd him on this patch as well.
>
> --js
>
>

Re: [QEMU][PATCH v5 3/4] xlnx-versal: Connect Xilinx VERSAL CANFD controllers

2023-05-22 Thread Francisco Iglesias

On [2023 May 19] Fri 13:36:57, Vikram Garhwal wrote:
> Connect CANFD0 and CANFD1 on the Versal-virt machine and update 
> xlnx-versal-virt
> document with CANFD command line examples.
> 
> Signed-off-by: Vikram Garhwal 
> Reviewed-by: Peter Maydell 

Reviewed-by: Francisco Iglesias 

> ---
>  docs/system/arm/xlnx-versal-virt.rst | 31 
>  hw/arm/xlnx-versal-virt.c| 53 
>  hw/arm/xlnx-versal.c | 37 +++
>  include/hw/arm/xlnx-versal.h | 12 +++
>  4 files changed, 133 insertions(+)
> 
> diff --git a/docs/system/arm/xlnx-versal-virt.rst 
> b/docs/system/arm/xlnx-versal-virt.rst
> index 92ad10d2da..d2d1b26692 100644
> --- a/docs/system/arm/xlnx-versal-virt.rst
> +++ b/docs/system/arm/xlnx-versal-virt.rst
> @@ -34,6 +34,7 @@ Implemented devices:
>  - DDR memory
>  - BBRAM (36 bytes of Battery-backed RAM)
>  - eFUSE (3072 bytes of one-time field-programmable bit array)
> +- 2 CANFDs
>  
>  QEMU does not yet model any other devices, including the PL and the AI 
> Engine.
>  
> @@ -224,3 +225,33 @@ To use a different index value, N, from default of 1, 
> add:
>  
>Better yet, do not use actual product data when running guest image
>on this Xilinx Versal Virt board.
> +
> +Using CANFDs for Versal Virt
> +
> +Versal CANFD controller is developed based on SocketCAN and QEMU CAN bus
> +implementation. Bus connection and socketCAN connection for each CAN module
> +can be set through command lines.
> +
> +To connect both CANFD0 and CANFD1 on the same bus:
> +
> +.. code-block:: bash
> +
> +-object can-bus,id=canbus -machine canbus0=canbus -machine canbus1=canbus
> +
> +To connect CANFD0 and CANFD1 to separate buses:
> +
> +.. code-block:: bash
> +
> +-object can-bus,id=canbus0 -object can-bus,id=canbus1 \
> +-machine canbus0=canbus0 -machine canbus1=canbus1
> +
> +The SocketCAN interface can connect to a Physical or a Virtual CAN 
> interfaces on
> +the host machine. Please check this document to learn about CAN interface on
> +Linux: docs/system/devices/can.rst
> +
> +To connect CANFD0 and CANFD1 to host machine's CAN interface can0:
> +
> +.. code-block:: bash
> +
> +-object can-bus,id=canbus -machine canbus0=canbus -machine canbus1=canbus
> +-object can-host-socketcan,id=canhost0,if=can0,canbus=canbus
> diff --git a/hw/arm/xlnx-versal-virt.c b/hw/arm/xlnx-versal-virt.c
> index 668a9d65a4..1ee2b8697f 100644
> --- a/hw/arm/xlnx-versal-virt.c
> +++ b/hw/arm/xlnx-versal-virt.c
> @@ -40,9 +40,11 @@ struct VersalVirt {
>  uint32_t clk_25Mhz;
>  uint32_t usb;
>  uint32_t dwc;
> +uint32_t canfd[2];
>  } phandle;
>  struct arm_boot_info binfo;
>  
> +CanBusState *canbus[XLNX_VERSAL_NR_CANFD];
>  struct {
>  bool secure;
>  } cfg;
> @@ -235,6 +237,38 @@ static void fdt_add_uart_nodes(VersalVirt *s)
>  }
>  }
>  
> +static void fdt_add_canfd_nodes(VersalVirt *s)
> +{
> +uint64_t addrs[] = { MM_CANFD1, MM_CANFD0 };
> +uint32_t size[] = { MM_CANFD1_SIZE, MM_CANFD0_SIZE };
> +unsigned int irqs[] = { VERSAL_CANFD1_IRQ_0, VERSAL_CANFD0_IRQ_0 };
> +const char clocknames[] = "can_clk\0s_axi_aclk";
> +int i;
> +
> +/* Create and connect CANFD0 and CANFD1 nodes to canbus0. */
> +for (i = 0; i < ARRAY_SIZE(addrs); i++) {
> +char *name = g_strdup_printf("/canfd@%" PRIx64, addrs[i]);
> +qemu_fdt_add_subnode(s->fdt, name);
> +
> +qemu_fdt_setprop_cell(s->fdt, name, "rx-fifo-depth", 0x40);
> +qemu_fdt_setprop_cell(s->fdt, name, "tx-mailbox-count", 0x20);
> +
> +qemu_fdt_setprop_cells(s->fdt, name, "clocks",
> +   s->phandle.clk_25Mhz, s->phandle.clk_25Mhz);
> +qemu_fdt_setprop(s->fdt, name, "clock-names",
> + clocknames, sizeof(clocknames));
> +qemu_fdt_setprop_cells(s->fdt, name, "interrupts",
> +   GIC_FDT_IRQ_TYPE_SPI, irqs[i],
> +   GIC_FDT_IRQ_FLAGS_LEVEL_HI);
> +qemu_fdt_setprop_sized_cells(s->fdt, name, "reg",
> + 2, addrs[i], 2, size[i]);
> +qemu_fdt_setprop_string(s->fdt, name, "compatible",
> +"xlnx,canfd-2.0");
> +
> +g_free(name);
> +}
> +}
> +
>  static void fdt_add_fixed_link_nodes(VersalVirt *s, char *gemname,
>   uint32_t phandle)
>  {
> @@ -639,12 +673,17 @@ static void versal_virt_init(MachineState *machine)
>  TYPE_XLNX_VERSAL);
>  object_property_set_link(OBJECT(&s->soc), "ddr", OBJECT(machine->ram),
>   &error_abort);
> +object_property_set_link(OBJECT(&s->soc), "canbus0", 
> OBJECT(s->canbus[0]),
> + &error_abort);
> +object_property_set_link(OBJECT(&s->soc), "canbus1", 
> OBJECT(s->canbus[1]),
> +

Re: [PATCH] qcow2: Explicit mention of padding bytes

2023-05-22 Thread Vladimir Sementsov-Ogievskiy


On 22.05.23 21:46, Eric Blake wrote:

Although we already covered the need for padding bytes with our
changes in commit 3ae3fcfa, commit 66fcbca5 (both v5.0.0) added one
byte and relied on the rest of the text for implicitly covering 7
padding bytes.  For consistency with other parts of the header (such
as the header extension format listing padding from n - m, or the
snapshot table entry listing variable padding), we might as well call
out the remaining 7 bytes as padding until such time (as any) as they
gain another meaning.

Signed-off-by: Eric Blake 
CC: Vladimir Sementsov-Ogievskiy 


Reviewed-by: Vladimir Sementsov-Ogievskiy 


---

Spring cleaning my old branches.

v3: reviving an old patch; v2 was:
https://lists.gnu.org/archive/html/qemu-devel/2020-04/msg00687.html
---
  docs/interop/qcow2.txt | 1 +
  1 file changed, 1 insertion(+)

diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt
index e7f036c286b..2c4618375ad 100644
--- a/docs/interop/qcow2.txt
+++ b/docs/interop/qcow2.txt
@@ -226,6 +226,7 @@ version 2.
   in QEMU. However, clusters with 
the
  deflate compression type do not have zlib headers.

+105 - 111:  Padding, contents defined below.
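
The byte range follows from rounding the header up to a multiple of 8 bytes. A quick arithmetic sketch, assuming the compression type is the single byte at offset 104, so that 105 bytes are in use before padding (the helper name `round_up8` and the enum are illustrative only):

```c
#include <stdint.h>

/* Round n up to the next multiple of 8. */
static inline uint32_t round_up8(uint32_t n)
{
    return (n + 7u) & ~7u;
}

/* Bytes 0..104 are in use (assumption: compression type at offset 104),
 * so the padded header ends just before offset round_up8(105). */
enum { QCOW2_USED_BYTES = 105 };
```

`round_up8(QCOW2_USED_BYTES)` is 112, which leaves the seven bytes 105 to 111 as padding.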




--
Best regards,
Vladimir

Building QEMU errors

2023-05-22 Thread Punj, Siddhartha

Hi QEMU community,

I'm trying to build QEMU on my development environment on Windows (64 bit, 
windows 10 enterprise), but am facing issues.

I'm using the MSYS2 installer as administrator with the following commands:

pacman -Syu
pacman -Su
pacman -S base-devel mingw-w64-x86_64-toolchain git python ninja
pacman -S mingw-w64-x86_64-glib2 mingw64/mingw/w64-x86_64-gtk3 
mingw64/mingw-w64-x86_64-SDL2 python-setuptools
closed console

started mingw64
git clone  https://www.gitlab.com/qemu/qemu
cd into qemu
and then ran the following command on MinGW64:


./configure --enable-gtk --enable-sdl --target-list=x86_64-softmmu 
--disable-werror --disable-stack-protector --disable-capstone --enable-debug

And I'm getting the following output in my build: ln: failed to create symbolic 
link 'x86_64-softmmu/qemu-system-x86_64.exe': No such file or directory

Is this a known issue for QEMU right now that's critical? I'm trying to 
use/modify QEMU to create a virtual PCIe device. Please let me know/how I could 
potentially resolve this.

Thanks,

Siddartha

Re: [PATCH v7 2/7] hw/cxl: Move CXLRetCode definition to cxl_device.h

2023-05-22 Thread Fan Ni

The 05/22/2023 16:09, Jonathan Cameron wrote:
> Following patches will need access to the mailbox return code
> type so move it to the header.
> 
> Reviewed-by: Ira Weiny 
> Signed-off-by: Jonathan Cameron 
> ---

Reviewed-by: Fan Ni 

>  include/hw/cxl/cxl_device.h | 28 
>  hw/cxl/cxl-mailbox-utils.c  | 28 
>  2 files changed, 28 insertions(+), 28 deletions(-)
> 
> diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> index 16993f7098..9f8ee85f8a 100644
> --- a/include/hw/cxl/cxl_device.h
> +++ b/include/hw/cxl/cxl_device.h
> @@ -83,6 +83,34 @@
>  (CXL_DEVICE_CAP_REG_SIZE + CXL_DEVICE_STATUS_REGISTERS_LENGTH + \
>   CXL_MAILBOX_REGISTERS_LENGTH + CXL_MEMORY_DEVICE_REGISTERS_LENGTH)
>  
> +/* 8.2.8.4.5.1 Command Return Codes */
> +typedef enum {
> +CXL_MBOX_SUCCESS = 0x0,
> +CXL_MBOX_BG_STARTED = 0x1,
> +CXL_MBOX_INVALID_INPUT = 0x2,
> +CXL_MBOX_UNSUPPORTED = 0x3,
> +CXL_MBOX_INTERNAL_ERROR = 0x4,
> +CXL_MBOX_RETRY_REQUIRED = 0x5,
> +CXL_MBOX_BUSY = 0x6,
> +CXL_MBOX_MEDIA_DISABLED = 0x7,
> +CXL_MBOX_FW_XFER_IN_PROGRESS = 0x8,
> +CXL_MBOX_FW_XFER_OUT_OF_ORDER = 0x9,
> +CXL_MBOX_FW_AUTH_FAILED = 0xa,
> +CXL_MBOX_FW_INVALID_SLOT = 0xb,
> +CXL_MBOX_FW_ROLLEDBACK = 0xc,
> +CXL_MBOX_FW_REST_REQD = 0xd,
> +CXL_MBOX_INVALID_HANDLE = 0xe,
> +CXL_MBOX_INVALID_PA = 0xf,
> +CXL_MBOX_INJECT_POISON_LIMIT = 0x10,
> +CXL_MBOX_PERMANENT_MEDIA_FAILURE = 0x11,
> +CXL_MBOX_ABORTED = 0x12,
> +CXL_MBOX_INVALID_SECURITY_STATE = 0x13,
> +CXL_MBOX_INCORRECT_PASSPHRASE = 0x14,
> +CXL_MBOX_UNSUPPORTED_MAILBOX = 0x15,
> +CXL_MBOX_INVALID_PAYLOAD_LENGTH = 0x16,
> +CXL_MBOX_MAX = 0x17
> +} CXLRetCode;
> +
>  typedef struct cxl_device_state {
>  MemoryRegion device_registers;
>  
> diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> index e3401b6be8..d7e114aaae 100644
> --- a/hw/cxl/cxl-mailbox-utils.c
> +++ b/hw/cxl/cxl-mailbox-utils.c
> @@ -68,34 +68,6 @@ enum {
>  #define CLEAR_POISON   0x2
>  };
>  
> -/* 8.2.8.4.5.1 Command Return Codes */
> -typedef enum {
> -CXL_MBOX_SUCCESS = 0x0,
> -CXL_MBOX_BG_STARTED = 0x1,
> -CXL_MBOX_INVALID_INPUT = 0x2,
> -CXL_MBOX_UNSUPPORTED = 0x3,
> -CXL_MBOX_INTERNAL_ERROR = 0x4,
> -CXL_MBOX_RETRY_REQUIRED = 0x5,
> -CXL_MBOX_BUSY = 0x6,
> -CXL_MBOX_MEDIA_DISABLED = 0x7,
> -CXL_MBOX_FW_XFER_IN_PROGRESS = 0x8,
> -CXL_MBOX_FW_XFER_OUT_OF_ORDER = 0x9,
> -CXL_MBOX_FW_AUTH_FAILED = 0xa,
> -CXL_MBOX_FW_INVALID_SLOT = 0xb,
> -CXL_MBOX_FW_ROLLEDBACK = 0xc,
> -CXL_MBOX_FW_REST_REQD = 0xd,
> -CXL_MBOX_INVALID_HANDLE = 0xe,
> -CXL_MBOX_INVALID_PA = 0xf,
> -CXL_MBOX_INJECT_POISON_LIMIT = 0x10,
> -CXL_MBOX_PERMANENT_MEDIA_FAILURE = 0x11,
> -CXL_MBOX_ABORTED = 0x12,
> -CXL_MBOX_INVALID_SECURITY_STATE = 0x13,
> -CXL_MBOX_INCORRECT_PASSPHRASE = 0x14,
> -CXL_MBOX_UNSUPPORTED_MAILBOX = 0x15,
> -CXL_MBOX_INVALID_PAYLOAD_LENGTH = 0x16,
> -CXL_MBOX_MAX = 0x17
> -} CXLRetCode;
> -
>  struct cxl_cmd;
>  typedef CXLRetCode (*opcode_handler)(struct cxl_cmd *cmd,
> CXLDeviceState *cxl_dstate, uint16_t 
> *len);
> -- 
> 2.39.2
> 

-- 
Fan Ni

Re: [PATCH v7 3/7] hw/cxl/events: Wire up get/clear event mailbox commands

2023-05-22 Thread Fan Ni

The 05/22/2023 16:09, Jonathan Cameron wrote:
> From: Ira Weiny 
> 
> CXL testing is benefited from an artificial event log injection
> mechanism.
> 
> Add an event log infrastructure to insert, get, and clear events from
> the various logs available on a device.
> 
> Replace the stubbed out CXL Get/Clear Event mailbox commands with
> commands that operate on the new infrastructure.
> 
> Signed-off-by: Ira Weiny 
> Signed-off-by: Jonathan Cameron 
> ---

Reviewed-by: Fan Ni 

See comments below in cxl_event_insert.

>  include/hw/cxl/cxl_device.h |  25 +
>  include/hw/cxl/cxl_events.h |  55 +
>  hw/cxl/cxl-events.c | 217 
>  hw/cxl/cxl-mailbox-utils.c  |  40 ++-
>  hw/mem/cxl_type3.c  |   1 +
>  hw/cxl/meson.build  |   1 +
>  6 files changed, 337 insertions(+), 2 deletions(-)
> 
> diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> index 9f8ee85f8a..d3aec1bc0e 100644
> --- a/include/hw/cxl/cxl_device.h
> +++ b/include/hw/cxl/cxl_device.h
> @@ -111,6 +111,20 @@ typedef enum {
>  CXL_MBOX_MAX = 0x17
>  } CXLRetCode;
>  
> +typedef struct CXLEvent {
> +CXLEventRecordRaw data;
> +QSIMPLEQ_ENTRY(CXLEvent) node;
> +} CXLEvent;
> +
> +typedef struct CXLEventLog {
> +uint16_t next_handle;
> +uint16_t overflow_err_count;
> +uint64_t first_overflow_timestamp;
> +uint64_t last_overflow_timestamp;
> +QemuMutex lock;
> +QSIMPLEQ_HEAD(, CXLEvent) events;
> +} CXLEventLog;
> +
>  typedef struct cxl_device_state {
>  MemoryRegion device_registers;
>  
> @@ -161,6 +175,8 @@ typedef struct cxl_device_state {
>  uint64_t mem_size;
>  uint64_t pmem_size;
>  uint64_t vmem_size;
> +
> +CXLEventLog event_logs[CXL_EVENT_TYPE_MAX];
>  } CXLDeviceState;
>  
>  /* Initialize the register block for a device */
> @@ -353,6 +369,15 @@ MemTxResult cxl_type3_write(PCIDevice *d, hwaddr 
> host_addr, uint64_t data,
>  
>  uint64_t cxl_device_get_timestamp(CXLDeviceState *cxlds);
>  
> +void cxl_event_init(CXLDeviceState *cxlds);
> +bool cxl_event_insert(CXLDeviceState *cxlds, CXLEventLogType log_type,
> +  CXLEventRecordRaw *event);
> +CXLRetCode cxl_event_get_records(CXLDeviceState *cxlds, CXLGetEventPayload 
> *pl,
> + uint8_t log_type, int max_recs,
> + uint16_t *len);
> +CXLRetCode cxl_event_clear_records(CXLDeviceState *cxlds,
> +   CXLClearEventPayload *pl);
> +
>  void cxl_set_poison_list_overflowed(CXLType3Dev *ct3d);
>  
>  #endif
> diff --git a/include/hw/cxl/cxl_events.h b/include/hw/cxl/cxl_events.h
> index aeb3b0590e..d4aaa894f1 100644
> --- a/include/hw/cxl/cxl_events.h
> +++ b/include/hw/cxl/cxl_events.h
> @@ -10,6 +10,8 @@
>  #ifndef CXL_EVENTS_H
>  #define CXL_EVENTS_H
>  
> +#include "qemu/uuid.h"
> +
>  /*
>   * CXL rev 3.0 section 8.2.9.2.2; Table 8-49
>   *
> @@ -25,4 +27,57 @@ typedef enum CXLEventLogType {
>  CXL_EVENT_TYPE_MAX
>  } CXLEventLogType;
>  
> +/*
> + * Common Event Record Format
> + * CXL rev 3.0 section 8.2.9.2.1; Table 8-42
> + */
> +#define CXL_EVENT_REC_HDR_RES_LEN 0xf
> +typedef struct CXLEventRecordHdr {
> +QemuUUID id;
> +uint8_t length;
> +uint8_t flags[3];
> +uint16_t handle;
> +uint16_t related_handle;
> +uint64_t timestamp;
> +uint8_t maint_op_class;
> +uint8_t reserved[CXL_EVENT_REC_HDR_RES_LEN];
> +} QEMU_PACKED CXLEventRecordHdr;
> +
> +#define CXL_EVENT_RECORD_DATA_LENGTH 0x50
> +typedef struct CXLEventRecordRaw {
> +CXLEventRecordHdr hdr;
> +uint8_t data[CXL_EVENT_RECORD_DATA_LENGTH];
> +} QEMU_PACKED CXLEventRecordRaw;
> +#define CXL_EVENT_RECORD_SIZE (sizeof(CXLEventRecordRaw))
> +
> +/*
> + * Get Event Records output payload
> + * CXL rev 3.0 section 8.2.9.2.2; Table 8-50
> + */
> +#define CXL_GET_EVENT_FLAG_OVERFLOW BIT(0)
> +#define CXL_GET_EVENT_FLAG_MORE_RECORDS BIT(1)
> +typedef struct CXLGetEventPayload {
> +uint8_t flags;
> +uint8_t reserved1;
> +uint16_t overflow_err_count;
> +uint64_t first_overflow_timestamp;
> +uint64_t last_overflow_timestamp;
> +uint16_t record_count;
> +uint8_t reserved2[0xa];
> +CXLEventRecordRaw records[];
> +} QEMU_PACKED CXLGetEventPayload;
> +#define CXL_EVENT_PAYLOAD_HDR_SIZE (sizeof(CXLGetEventPayload))
> +
> +/*
> + * Clear Event Records input payload
> + * CXL rev 3.0 section 8.2.9.2.3; Table 8-51
> + */
> +typedef struct CXLClearEventPayload {
> +uint8_t event_log;  /* CXLEventLogType */
> +uint8_t clear_flags;
> +uint8_t nr_recs;
> +uint8_t reserved[3];
> +uint16_t handle[];
> +} CXLClearEventPayload;
> +
>  #endif /* CXL_EVENTS_H */
> diff --git a/hw/cxl/cxl-events.c b/hw/cxl/cxl-events.c
> new file mode 100644
> index 0000000000..5da1b76b97
> --- /dev/null
> +++ b/hw/cxl/cxl-events.c
> @@ -0,0 +1,217 @@
> +/*
> + * CXL Event processing
> + *
> + * Copyright(C) 2023 Intel

Re: [PATCH v7 1/7] hw/cxl/events: Add event status register

2023-05-22 Thread Fan Ni

The 05/22/2023 16:09, Jonathan Cameron wrote:
> From: Ira Weiny 
> 
> The device status register block was defined.  However, there were no
> individual registers nor any data wired up.
> 
> Define the event status register [CXL 3.0; 8.2.8.3.1] as part of the
> device status register block.  Wire up the register and initialize the
> event status for each log.
> 
> To support CXL 3.0 the version of the device status register block needs
> to be 2.  Change the macro to allow for setting the version.
> 
> Signed-off-by: Ira Weiny 
> Signed-off-by: Jonathan Cameron 

Reviewed-by: Fan Ni 

> ---
>  include/hw/cxl/cxl_device.h | 23 +---
>  include/hw/cxl/cxl_events.h | 28 
>  hw/cxl/cxl-device-utils.c   | 43 -
>  3 files changed, 86 insertions(+), 8 deletions(-)
> 
> diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> index 73328a52cf..16993f7098 100644
> --- a/include/hw/cxl/cxl_device.h
> +++ b/include/hw/cxl/cxl_device.h
> @@ -13,6 +13,7 @@
>  #include "hw/cxl/cxl_component.h"
>  #include "hw/pci/pci_device.h"
>  #include "hw/register.h"
> +#include "hw/cxl/cxl_events.h"
>  
>  /*
>   * The following is how a CXL device's Memory Device registers are laid out.
> @@ -86,7 +87,16 @@ typedef struct cxl_device_state {
>  MemoryRegion device_registers;
>  
>  /* mmio for device capabilities array - 8.2.8.2 */
> -MemoryRegion device;
> +struct {
> +MemoryRegion device;
> +union {
> +uint8_t dev_reg_state[CXL_DEVICE_STATUS_REGISTERS_LENGTH];
> +uint16_t dev_reg_state16[CXL_DEVICE_STATUS_REGISTERS_LENGTH / 2];
> +uint32_t dev_reg_state32[CXL_DEVICE_STATUS_REGISTERS_LENGTH / 4];
> +uint64_t dev_reg_state64[CXL_DEVICE_STATUS_REGISTERS_LENGTH / 8];
> +};
> +uint64_t event_status;
> +};
>  MemoryRegion memory_device;
>  struct {
>  MemoryRegion caps;
> @@ -141,6 +151,9 @@ REG64(CXL_DEV_CAP_ARRAY, 0) /* Documented as 128 bit 
> register but 64 byte access
>  FIELD(CXL_DEV_CAP_ARRAY, CAP_VERSION, 16, 8)
>  FIELD(CXL_DEV_CAP_ARRAY, CAP_COUNT, 32, 16)
>  
> +void cxl_event_set_status(CXLDeviceState *cxl_dstate, CXLEventLogType 
> log_type,
> +  bool available);
> +
>  /*
>   * Helper macro to initialize capability headers for CXL devices.
>   *
> @@ -175,7 +188,7 @@ CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MEMORY_DEVICE,
>  void cxl_initialize_mailbox(CXLDeviceState *cxl_dstate);
>  void cxl_process_mailbox(CXLDeviceState *cxl_dstate);
>  
> -#define cxl_device_cap_init(dstate, reg, cap_id)   \
> +#define cxl_device_cap_init(dstate, reg, cap_id, ver)  \
>  do {   \
>  uint32_t *cap_hdrs = dstate->caps_reg_state32; \
>  int which = R_CXL_DEV_##reg##_CAP_HDR0;\
> @@ -183,7 +196,7 @@ void cxl_process_mailbox(CXLDeviceState *cxl_dstate);
>  FIELD_DP32(cap_hdrs[which], CXL_DEV_##reg##_CAP_HDR0,  \
> CAP_ID, cap_id);\
>  cap_hdrs[which] = FIELD_DP32(  \
> -cap_hdrs[which], CXL_DEV_##reg##_CAP_HDR0, CAP_VERSION, 1);\
> +cap_hdrs[which], CXL_DEV_##reg##_CAP_HDR0, CAP_VERSION, ver);  \
>  cap_hdrs[which + 1] =  \
>  FIELD_DP32(cap_hdrs[which + 1], CXL_DEV_##reg##_CAP_HDR1,  \
> CAP_OFFSET, CXL_##reg##_REGISTERS_OFFSET);  \
> @@ -192,6 +205,10 @@ void cxl_process_mailbox(CXLDeviceState *cxl_dstate);
> CAP_LENGTH, CXL_##reg##_REGISTERS_LENGTH);  \
>  } while (0)
>  
> +/* CXL 3.0 8.2.8.3.1 Event Status Register */
> +REG64(CXL_DEV_EVENT_STATUS, 0)
> +FIELD(CXL_DEV_EVENT_STATUS, EVENT_STATUS, 0, 32)
> +
>  /* CXL 2.0 8.2.8.4.3 Mailbox Capabilities Register */
>  REG32(CXL_DEV_MAILBOX_CAP, 0)
>  FIELD(CXL_DEV_MAILBOX_CAP, PAYLOAD_SIZE, 0, 5)
> diff --git a/include/hw/cxl/cxl_events.h b/include/hw/cxl/cxl_events.h
> new file mode 100644
> index 00..aeb3b0590e
> --- /dev/null
> +++ b/include/hw/cxl/cxl_events.h
> @@ -0,0 +1,28 @@
> +/*
> + * QEMU CXL Events
> + *
> + * Copyright (c) 2022 Intel
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#ifndef CXL_EVENTS_H
> +#define CXL_EVENTS_H
> +
> +/*
> + * CXL rev 3.0 section 8.2.9.2.2; Table 8-49
> + *
> + * Define these as the bit position for the event status register for ease of
> + * setting the status.
> + */
> +typedef enum CXLEventLogType {
> +CXL_EVENT_TYPE_INFO  = 0,
> +CXL_EVENT_TYPE_WARN  = 1,
> +CXL_EVENT_TYPE_FAIL  = 2,
> +
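
As a rough illustration of why the log types are defined as bit positions:
updating the status then reduces to setting or clearing a single bit. The
following is a minimal Python model, not the QEMU code. The helper name
set_event_status and the plain-integer register are assumptions made for
this sketch; only the three enum values shown above come from the patch.

```python
# Bit positions taken from the CXLEventLogType values visible in the patch.
CXL_EVENT_TYPE_INFO = 0
CXL_EVENT_TYPE_WARN = 1
CXL_EVENT_TYPE_FAIL = 2


def set_event_status(status, log_type, available):
    """Return status with the bit for log_type set or cleared.

    Hypothetical helper: the register is modelled as a plain integer.
    """
    mask = 1 << log_type
    if available:
        return status | mask
    return status & ~mask
```

With this model, marking the WARN log as having records sets bit 1 and
leaves the other log bits untouched.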

Help finding Coverity defects for generated Hexagon code

2023-05-22 Thread Anton Johansson via


Hi,

Coverity recently reported some defects in code generated by idef-parser
(email attached). These defects are expected, and we plan to emit a
/* coverity[event_tag] */ comment to suppress the specific event triggered.

However, I'm not able to find the event_tag, as I can't find the defect in
the QEMU Coverity project, and the link in the email simply brings me
to the main project page. I've tried sorting through the defects and
searching for the CID, without luck.

Any ideas? I'm not very familiar with Coverity.

--
Anton Johansson,
rev.ng Labs Srl.
--- Begin Message ---
Hi,

Please find the latest report on new defect(s) introduced to QEMU found with 
Coverity Scan.

5 new defect(s) introduced to QEMU found with Coverity Scan.
9 defect(s), reported by Coverity Scan earlier, were marked fixed in the recent 
build analyzed by Coverity Scan.

New defect(s) Reported-by: Coverity Scan
Showing 5 of 5 defect(s)


** CID 1512512:  Control flow issues  (DEADCODE)
/target/hexagon/idef-generated-emitter.indented.c: 32297 in emit_F2_dfmpylh()



*** CID 1512512:  Control flow issues  (DEADCODE)
/target/hexagon/idef-generated-emitter.indented.c: 32297 in emit_F2_dfmpylh()
32291 tcg_gen_ori_i64(tmp_3, tmp_2, qemu_tmp_1);
32292 TCGv_i64 tmp_4 = tcg_temp_new_i64();
32293 tcg_gen_mul_i64(tmp_4, tmp_0, tmp_3);
32294 int64_t qemu_tmp_2 = (int64_t)((int32_t) 0x1);
32295 TCGv_i64 tmp_5 = tcg_temp_new_i64();
32296 if (qemu_tmp_2 >= 64) {
>>> CID 1512512:  Control flow issues  (DEADCODE)
>>> Execution cannot reach this statement: "tcg_gen_movi_i64(tmp_5, 0L);".
32297 tcg_gen_movi_i64(tmp_5, 0);
32298 } else {
32299 tcg_gen_shli_i64(tmp_5, tmp_4, qemu_tmp_2);
32300 }
32301 TCGv_i64 tmp_6 = tcg_temp_new_i64();
32302 tcg_gen_add_i64(tmp_6, RxxV, tmp_5);

** CID 1512511:  Control flow issues  (DEADCODE)
/target/hexagon/idef-generated-emitter.indented.c: 32257 in emit_F2_dfmpyll()



*** CID 1512511:  Control flow issues  (DEADCODE)
/target/hexagon/idef-generated-emitter.indented.c: 32257 in emit_F2_dfmpyll()
32251 } else {
32252 tcg_gen_shri_i64(tmp_3, prod, qemu_tmp_0);
32253 }
32254 int64_t qemu_tmp_1 = (int64_t)((int32_t) 0x1);
32255 TCGv_i64 tmp_4 = tcg_temp_new_i64();
32256 if (qemu_tmp_1 >= 64) {
>>> CID 1512511:  Control flow issues  (DEADCODE)
>>> Execution cannot reach this statement: "tcg_gen_movi_i64(tmp_4, 0L);".
32257 tcg_gen_movi_i64(tmp_4, 0);
32258 } else {
32259 tcg_gen_shli_i64(tmp_4, tmp_3, qemu_tmp_1);
32260 }
32261 tcg_gen_mov_i64(RddV, tmp_4);
32262 TCGv_i64 tmp_5 = tcg_temp_new_i64();

** CID 1512510:  Control flow issues  (DEADCODE)
/target/hexagon/idef-generated-emitter.indented.c: 16045 in 
emit_M2_dpmpyss_rnd_s0()



*** CID 1512510:  Control flow issues  (DEADCODE)
/target/hexagon/idef-generated-emitter.indented.c: 16045 in 
emit_M2_dpmpyss_rnd_s0()
16039 tcg_gen_addi_i64(tmp_3, tmp_2, qemu_tmp_0);
16040 int64_t qemu_tmp_1 = (int64_t)((int32_t) 0x20);
16041 TCGv_i64 tmp_4 = tcg_temp_new_i64();
16042 {
16043 int64_t shift = qemu_tmp_1;
16044 if (qemu_tmp_1 >= 64) {
>>> CID 1512510:  Control flow issues  (DEADCODE)
>>> Execution cannot reach this statement: "shift = 63L;".
16045 shift = 64 - 1;
16046 }
16047 tcg_gen_sari_i64(tmp_4, tmp_3, shift);
16048 }
16049 TCGv_i32 tmp_5 = tcg_temp_new_i32();
16050 tcg_gen_trunc_i64_tl(tmp_5, tmp_4);

** CID 1512509:  Control flow issues  (DEADCODE)
/target/hexagon/idef-generated-emitter.indented.c: 32287 in emit_F2_dfmpylh()



*** CID 1512509:  Control flow issues  (DEADCODE)
/target/hexagon/idef-generated-emitter.indented.c: 32287 in emit_F2_dfmpylh()
32281 tcg_gen_extract_i64(tmp_1, RttV, ((int32_t) 0x1) * 32, 32);
32282 int64_t qemu_tmp_0 = (int64_t)((int32_t) 0x14);
32283 TCGv_i64 tmp_2 = tcg_temp_new_i64();
32284 if (qemu_tmp_0 != 0) {
32285 tcg_gen_extract_i64(tmp_2, tmp_1, 0, qemu_tmp_0);
32286 } else {
>>> CID 1512509:  Control flow issues  (DEADCODE)
>>> Execution cannot reach this statement: "tcg_gen_movi_i64(tmp_2, 0L);".
32287 tcg_gen_movi_i64(tmp_2, 0);
32288 }
32289 int64_t qemu_tmp_1 = (int64_t)((int32_t) 0x10);
32290 TCGv_i64 tmp_3 = tcg_temp_new_i64();
32291 tcg_gen_ori_i64(tmp_3, tmp_2, qemu_tmp_1);
32292 TCGv_i64 tmp_4 = tcg_temp_new_i64();

** CID 1512508:  Control flow issues  (DEADCODE)
/target/hexagon/idef-generated-emitter.indented.c: 32250 in emit_F2_dfmpyll()

[PATCH v4 0/1] ROM migration

2023-05-22 Thread Vladimir Sementsov-Ogievskiy

v4:
preparation patches are already merged to master
01: fix false-positive "error: ‘size’ may be used uninitialized",
keep r-bs

Vladimir Sementsov-Ogievskiy (1):
  pci: ROM preallocation for incoming migration

 hw/pci/pci.c | 79 ++--
 1 file changed, 46 insertions(+), 33 deletions(-)

-- 
2.34.1

[PATCH v4 1/1] pci: ROM preallocation for incoming migration

2023-05-22 Thread Vladimir Sementsov-Ogievskiy

On incoming migration we have the following sequence to load the option
ROM:

1. On device realize we load the ROM from the file as usual.

2. Then, on incoming migration, we rewrite the ROM from the incoming RAM
   block. If the sizes mismatch we fail, like this:

Size mismatch: :00:03.0/virtio-net-pci.rom: 0x4 != 0x8: Invalid argument

This is not ideal when we migrate to an updated distribution: we have to
keep old ROM files in the new distribution and be careful with the romfile
property to load the correct ROM file, which is actually loaded just to
allocate the ROM with the correct length.

Note that the romsize property doesn't really help: if we try to specify
it when the default romfile is larger, it fails with something like:

romfile "efi-virtio.rom" (160768 bytes) is too large for ROM size 65536

Let's just ignore the ROM file when romsize is specified and we are in
the incoming migration state. In other words, we only need to preallocate
a ROM of the specified size; the local ROM file is irrelevant.

This way:

If romsize was specified on source, we just use the same command line as
on source, and migration will work independently of local ROM files on
target.

If romsize was not specified on source (and we have a mismatching local
ROM file on the target host), we have to specify romsize on target to
match the source romsize. The romfile parameter may be kept the same as on
source or may be dropped; the file is not loaded anyway.

As a bonus we avoid an extra read of the ROM file on target.

Note: when we don't have the romsize parameter on the source command line
and need it for target, it may be calculated as the size of the ROM file
on source aligned up to a power of two (if we know which file it is) or,
alternatively, it may be retrieved from the source QEMU with the QMP
qom-get command, like

  { "execute": "qom-get",
"arguments": {
  "path": "/machine/peripheral/CARD_ID/virtio-net-pci.rom[0]",
  "property": "size" } }

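The first option (aligning the source ROM file size up to a power of two)
can be sketched as below. This is only an illustration of the arithmetic
described above; rom_size_for is a hypothetical helper name, not something
provided by QEMU.

```python
def rom_size_for(file_size):
    """Round a ROM file size in bytes up to the next power of two."""
    if file_size <= 0:
        raise ValueError("ROM file size must be positive")
    # (n - 1).bit_length() is the exponent of the smallest power of two >= n.
    return 1 << (file_size - 1).bit_length()
```

Under that rule, the 160768-byte efi-virtio.rom from the error message
above rounds up to 262144, while a file of exactly 65536 bytes stays at
65536.
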
Note: the size variable in pci_add_option_rom is additionally initialized
  to zero to avoid the false-positive
  "error: ‘size’ may be used uninitialized"

Suggested-by: Michael S. Tsirkin 
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: David Hildenbrand 
Reviewed-by: Juan Quintela 
---
 hw/pci/pci.c | 79 ++--
 1 file changed, 46 insertions(+), 33 deletions(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 1cc7c89036..a3840cc452 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -36,6 +36,7 @@
 #include "migration/vmstate.h"
 #include "net/net.h"
 #include "sysemu/numa.h"
+#include "sysemu/runstate.h"
 #include "sysemu/sysemu.h"
 #include "hw/loader.h"
 #include "qemu/error-report.h"
@@ -2308,12 +2309,18 @@ static void pci_patch_ids(PCIDevice *pdev, uint8_t 
*ptr, uint32_t size)
 static void pci_add_option_rom(PCIDevice *pdev, bool is_default_rom,
Error **errp)
 {
-int64_t size;
+int64_t size = 0;
 g_autofree char *path = NULL;
-void *ptr;
 char name[32];
 const VMStateDescription *vmsd;
 
+/*
+ * In case of incoming migration ROM will come with migration stream, no
+ * reason to load the file.  Neither we want to fail if local ROM file
+ * mismatches with specified romsize.
+ */
+bool load_file = !runstate_check(RUN_STATE_INMIGRATE);
+
 if (!pdev->romfile || !strlen(pdev->romfile)) {
 return;
 }
@@ -2343,32 +2350,35 @@ static void pci_add_option_rom(PCIDevice *pdev, bool 
is_default_rom,
 return;
 }
 
-path = qemu_find_file(QEMU_FILE_TYPE_BIOS, pdev->romfile);
-if (path == NULL) {
-path = g_strdup(pdev->romfile);
-}
+if (load_file || pdev->romsize == -1) {
+path = qemu_find_file(QEMU_FILE_TYPE_BIOS, pdev->romfile);
+if (path == NULL) {
+path = g_strdup(pdev->romfile);
+}
 
-size = get_image_size(path);
-if (size < 0) {
-error_setg(errp, "failed to find romfile \"%s\"", pdev->romfile);
-return;
-} else if (size == 0) {
-error_setg(errp, "romfile \"%s\" is empty", pdev->romfile);
-return;
-} else if (size > 2 * GiB) {
-error_setg(errp, "romfile \"%s\" too large (size cannot exceed 2 GiB)",
-   pdev->romfile);
-return;
-}
-if (pdev->romsize != -1) {
-if (size > pdev->romsize) {
-error_setg(errp, "romfile \"%s\" (%u bytes) "
-   "is too large for ROM size %u",
-   pdev->romfile, (uint32_t)size, pdev->romsize);
+size = get_image_size(path);
+if (size < 0) {
+error_setg(errp, "failed to find romfile \"%s\"", pdev->romfile);
+return;
+} else if (size == 0) {
+error_setg(errp, "romfile \"%s\" is empty", pdev->romfile);
+return;
+} else if (size > 2 * GiB) {
+error_setg(errp,
+   "romfile \"%s\" too large (size cannot exceed 2 GiB)",
+   pdev->romfile);

Re: [PATCH v20 16/21] tests/avocado: s390x cpu topology entitlement tests

2023-05-22 Thread Nina Schoetterl-Glausch

On Tue, 2023-04-25 at 18:14 +0200, Pierre Morel wrote:
> This test takes care to check the changes on different entitlements
> when the guest requests a polarization change.
> 
> Signed-off-by: Pierre Morel 
> ---
>  tests/avocado/s390_topology.py | 56 ++
>  1 file changed, 56 insertions(+)
> 
> diff --git a/tests/avocado/s390_topology.py b/tests/avocado/s390_topology.py
> index 30d3c0d0cb..64e1cc9209 100644
> --- a/tests/avocado/s390_topology.py
> +++ b/tests/avocado/s390_topology.py
> @@ -244,3 +244,59 @@ def test_polarisation(self):
>  '/bin/cat /sys/devices/system/cpu/dispatching', '0')
>  
>  self.check_topology(0, 0, 0, 0, 'medium', False)
> +
> +def test_entitlement(self):
> +"""
> +This test verifies that QEMU modifies the polarization
> +after a guest request.
> +
> +:avocado: tags=arch:s390x
> +:avocado: tags=machine:s390-ccw-virtio
> +"""
> +self.kernel_init()
> +self.vm.add_args('-smp',
> + '1,drawers=2,books=2,sockets=3,cores=2,maxcpus=24')
> +self.vm.add_args('-device', 'z14-s390x-cpu,core-id=1')
> +self.vm.add_args('-device', 'z14-s390x-cpu,core-id=2')
> +self.vm.add_args('-device', 'z14-s390x-cpu,core-id=3')

Why the -device statements? Won't they give the same result as specifying
-smp 4,...?
Same for patch 17.

> +self.vm.launch()
> +self.wait_for_console_pattern('no job control')
> +
> +self.system_init()
> +
> +res = self.vm.qmp('set-cpu-topology',
> +  {'core-id': 0, 'entitlement': 'low'})
> +self.assertEqual(res['return'], {})
> +res = self.vm.qmp('set-cpu-topology',
> +  {'core-id': 1, 'entitlement': 'medium'})
> +self.assertEqual(res['return'], {})
> +res = self.vm.qmp('set-cpu-topology',
> +  {'core-id': 2, 'entitlement': 'high'})
> +self.assertEqual(res['return'], {})
> +res = self.vm.qmp('set-cpu-topology',
> +  {'core-id': 3, 'entitlement': 'high'})
> +self.assertEqual(res['return'], {})
> +self.check_topology(0, 0, 0, 0, 'low', False)
> +self.check_topology(1, 0, 0, 0, 'medium', False)
> +self.check_topology(2, 1, 0, 0, 'high', False)
> +self.check_topology(3, 1, 0, 0, 'high', False)
> +
> +exec_command(self, 'echo 1 > /sys/devices/system/cpu/dispatching')
> +time.sleep(0.2)
> +exec_command_and_wait_for_pattern(self,
> +'/bin/cat /sys/devices/system/cpu/dispatching', '1')
> +
> +self.check_topology(0, 0, 0, 0, 'low', False)
> +self.check_topology(1, 0, 0, 0, 'medium', False)
> +self.check_topology(2, 1, 0, 0, 'high', False)
> +self.check_topology(3, 1, 0, 0, 'high', False)
> +
> +exec_command(self, 'echo 0 > /sys/devices/system/cpu/dispatching')
> +time.sleep(0.2)
> +exec_command_and_wait_for_pattern(self,
> +'/bin/cat /sys/devices/system/cpu/dispatching', '0')
> +
> +self.check_topology(0, 0, 0, 0, 'low', False)
> +self.check_topology(1, 0, 0, 0, 'medium', False)
> +self.check_topology(2, 1, 0, 0, 'high', False)
> +self.check_topology(3, 1, 0, 0, 'high', False)

Re: [PATCH v20 15/21] tests/avocado: s390x cpu topology polarisation

2023-05-22 Thread Nina Schoetterl-Glausch

Try to be consistent in the spelling of polarization.
You use an s in the title and in the test name below.

On Tue, 2023-04-25 at 18:14 +0200, Pierre Morel wrote:
> Polarization is changed on a request from the guest.
> Let's verify the polarization is accordingly set by QEMU.
> 
> Signed-off-by: Pierre Morel 
> ---
>  tests/avocado/s390_topology.py | 38 ++
>  1 file changed, 38 insertions(+)
> 
> diff --git a/tests/avocado/s390_topology.py b/tests/avocado/s390_topology.py
> index ce119a095e..30d3c0d0cb 100644
> --- a/tests/avocado/s390_topology.py
> +++ b/tests/avocado/s390_topology.py
> @@ -104,6 +104,15 @@ def kernel_init(self):
>   '-initrd', initrd_path,
>   '-append', kernel_command_line)
>  
> +def system_init(self):
> +self.log.info("System init")
> +exec_command(self, 'mount proc -t proc /proc')
> +time.sleep(0.2)
> +exec_command(self, 'mount sys -t sysfs /sys')
> +time.sleep(0.2)
> +exec_command_and_wait_for_pattern(self,
> +'/bin/cat /sys/devices/system/cpu/dispatching', '0')
> +
>  def test_single(self):
>  self.kernel_init()
>  self.vm.launch()
> @@ -206,3 +215,32 @@ def test_hotplug_full(self):
>  self.check_topology(3, 1, 1, 1, 'high', False)
>  self.check_topology(4, 1, 1, 1, 'medium', False)
>  self.check_topology(5, 2, 1, 1, 'high', True)
> +
> +def test_polarisation(self):

I would merge this test with test_query_polarization; they are very similar.

> +"""
> +This test verifies that QEMU modifies the entitlement change after
> +several guest polarization change requests.
> +
> +:avocado: tags=arch:s390x
> +:avocado: tags=machine:s390-ccw-virtio
> +"""
> +self.kernel_init()
> +self.vm.launch()
> +self.wait_for_console_pattern('no job control')
> +
> +self.system_init()
> +self.check_topology(0, 0, 0, 0, 'medium', False)
> +
> +exec_command(self, 'echo 1 > /sys/devices/system/cpu/dispatching')
> +time.sleep(0.2)

Can you find a way to wait for the event here?

> +exec_command_and_wait_for_pattern(self,
> +'/bin/cat /sys/devices/system/cpu/dispatching', '1')

I think it would be good to refactor this snippet into a function.

def guest_set_dispatching(self, dispatching):
    exec_command(self,
        f'echo {dispatching} > /sys/devices/system/cpu/dispatching')
    # TODO: wait
    exec_command_and_wait_for_pattern(self,
        '/bin/cat /sys/devices/system/cpu/dispatching', str(dispatching))

or similar, you could also put the path into a variable.
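
For the wait step, one possible shape instead of a fixed time.sleep(0.2)
is a small polling helper. This is a generic sketch: wait_until and its
parameters are hypothetical and not part of avocado_qemu, and what the
predicate actually checks is left to the test.

```python
import time


def wait_until(predicate, timeout=5.0, interval=0.1):
    """Poll predicate() until it returns true or timeout seconds elapse.

    Returns True as soon as the predicate holds, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The test would then fail explicitly when the helper returns False, rather
than relying on the sleep being long enough.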

> +
> +self.check_topology(0, 0, 0, 0, 'medium', False)
> +
> +exec_command(self, 'echo 0 > /sys/devices/system/cpu/dispatching')
> +time.sleep(0.2)
> +exec_command_and_wait_for_pattern(self,
> +'/bin/cat /sys/devices/system/cpu/dispatching', '0')
> +
> +self.check_topology(0, 0, 0, 0, 'medium', False)

Re: [PATCH v20 14/21] tests/avocado: s390x cpu topology core

2023-05-22 Thread Nina Schoetterl-Glausch

On Tue, 2023-04-25 at 18:14 +0200, Pierre Morel wrote:
> Introduction of the s390x cpu topology core functions and
> basic tests.
> 
> We test the corelation between the command line and
> the QMP results in query-cpus-fast for various CPU topology.
> 
> Signed-off-by: Pierre Morel 
> ---
>  MAINTAINERS|   1 +
>  tests/avocado/s390_topology.py | 208 +
>  2 files changed, 209 insertions(+)
>  create mode 100644 tests/avocado/s390_topology.py
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index fe5638e31d..41419840b0 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1662,6 +1662,7 @@ F: hw/s390x/cpu-topology.c
>  F: target/s390x/kvm/cpu_topology.c
>  F: docs/devel/s390-cpu-topology.rst
>  F: docs/system/s390x/cpu-topology.rst
> +F: tests/avocado/s390_topology.py
>  
>  X86 Machines
>  
> diff --git a/tests/avocado/s390_topology.py b/tests/avocado/s390_topology.py
> new file mode 100644
> index 00..ce119a095e
> --- /dev/null
> +++ b/tests/avocado/s390_topology.py
> @@ -0,0 +1,208 @@
> +# Functional test that boots a Linux kernel and checks the console
> +#
> +# Copyright IBM Corp. 2023
> +#
> +# Author:
> +#  Pierre Morel 
> +#
> +# This work is licensed under the terms of the GNU GPL, version 2 or
> +# later.  See the COPYING file in the top-level directory.
> +
> +import os
> +import shutil
> +import time
> +
> +from avocado_qemu import QemuSystemTest
> +from avocado_qemu import exec_command
> +from avocado_qemu import exec_command_and_wait_for_pattern
> +from avocado_qemu import interrupt_interactive_console_until_pattern
> +from avocado_qemu import wait_for_console_pattern
> +from avocado.utils import process
> +from avocado.utils import archive
> +
> +
> +class LinuxKernelTest(QemuSystemTest):

I'd get rid of this class, unless you plan to use it for more children.

> +KERNEL_COMMON_COMMAND_LINE = 'printk.time=0 '
> +
> +def wait_for_console_pattern(self, success_message, vm=None):

You always use the same args for this function, I'd refactor it into
def wait_until_booted(self):

> +wait_for_console_pattern(self, success_message,
> + failure_message='Kernel panic - not 
> syncing',
> + vm=vm)
> +
> +
> +class S390CPUTopology(LinuxKernelTest):
> +"""
> +S390x CPU topology consist of 4 topology layers, from bottom to top,
> +the cores, sockets, books and drawers and 2 modifiers attributes,
> +the entitlement and the dedication.
> +See: docs/system/s390x/cpu-topology.rst.
> +
> +S390x CPU topology is setup in different ways:
> +- implicitely from the '-smp' argument by completing each topology
> +  level one after the other begining with drawer 0, book 0 and socket 0.
> +- explicitely from the '-device' argument on the QEMU command line
> +- explicitely by hotplug of a new CPU using QMP or HMP
> +- it is modified by using QMP 'set-cpu-topology'
> +
> +The S390x modifier attribute entitlement depends on the machine
> +polarization, which can be horizontal or vertical.
> +The polarization is changed on a request from the guest.
> +"""
> +timeout = 90
> +
> +
> +def check_topology(self, c, s, b, d, e, t):
> +res = self.vm.qmp('query-cpus-fast')
> +line =  res['return']
> +for x in line:

for cpu in cpus

> +core = x['props']['core-id']
> +socket = x['props']['socket-id']
> +book = x['props']['book-id']
> +drawer = x['props']['drawer-id']
> +entitlement = x['entitlement']
> +dedicated = x['dedicated']
> +if core == c:
> +self.assertEqual(drawer, d)
> +self.assertEqual(book, b)
> +self.assertEqual(socket, s)
> +self.assertEqual(entitlement, e)
> +self.assertEqual(dedicated, t)
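
The rename suggested above ("for cpu in cpus") could be taken a step
further by separating the lookup from the comparison. A minimal sketch,
assuming the query-cpus-fast reply has the shape used in the quoted code;
find_cpu and topology_of are hypothetical names, and raising when no core
matches is a choice of this sketch (the quoted loop makes no assertion in
that case).

```python
def find_cpu(cpus, core_id):
    """Return the entry for core_id from a query-cpus-fast style list."""
    for cpu in cpus:
        if cpu['props']['core-id'] == core_id:
            return cpu
    raise LookupError(f'no CPU with core-id {core_id}')


def topology_of(cpu):
    """Collect the fields check_topology compares, in its argument order."""
    props = cpu['props']
    return (props['core-id'], props['socket-id'], props['book-id'],
            props['drawer-id'], cpu['entitlement'], cpu['dedicated'])
```

check_topology could then be a single assertEqual on the tuple, which also
reports all mismatching fields at once.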
> +
> +def kernel_init(self):
> +"""
> +We need a kernel supporting the CPU topology.
> +We need a minimal root filesystem with a shell.
> +"""
> +kernel_url = ('https://archives.fedoraproject.org/pub/archive'
> +  '/fedora-secondary/releases/35/Server/s390x/os'
> +  '/images/kernel.img')
> +kernel_hash = '0d1aaaf303f07cf0160c8c48e56fe638'
> +kernel_path = self.fetch_asset(kernel_url, algorithm='md5',
> +   asset_hash=kernel_hash)
> +
> +initrd_url = ('https://archives.fedoraproject.org/pub/archive'
> +  '/fedora-secondary/releases/35/Server/s390x/os'
> +  '/images/initrd.img')
> +initrd_hash = 'a122057d95725ac030e2ec51df46e172'
> +initrd_path_xz = self.fetch_asset(initrd_url, algorithm='md5',
> +  asset_hash=initrd_hash)
> +initrd_path = os.path.join(self.workdir, 'initrd-raw.img')
> +

Re: [PATCH 03/12] util/fifo8: Introduce fifo8_peek_buf()

2023-05-22 Thread Francisco Iglesias

On [2023 May 22] Mon 17:31:35, Philippe Mathieu-Daudé wrote:
> To be able to poke at FIFO content without popping it,
> introduce the fifo8_peek_buf() method by factoring
> common content from fifo8_pop_buf().
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  include/qemu/fifo8.h | 26 ++
>  util/fifo8.c | 22 ++
>  2 files changed, 44 insertions(+), 4 deletions(-)
> 
> diff --git a/include/qemu/fifo8.h b/include/qemu/fifo8.h
> index d0d02bc73d..7acf6d1347 100644
> --- a/include/qemu/fifo8.h
> +++ b/include/qemu/fifo8.h
> @@ -93,6 +93,32 @@ uint8_t fifo8_pop(Fifo8 *fifo);
>   */
>  const uint8_t *fifo8_pop_buf(Fifo8 *fifo, uint32_t max, uint32_t *numptr);
>  
> +/**
> + * fifo8_peek_buf:
> + * @fifo: FIFO to poke from
> + * @max: maximum number of bytes to pop
> + * @numptr: pointer filled with number of bytes returned (can be NULL)
> + *
> + * Pop a number of elements from the FIFO up to a maximum of max. The buffer

s/Pop/Peek into/

> + * containing the popped data is returned. This buffer points directly into

s/popped data/data peeked into/

If above sounds good:

Reviewed-by: Francisco Iglesias 


> + * the FIFO backing store and data is invalidated once any of the fifo8_* 
> APIs
> + * are called on the FIFO.

(The above reads to me as if the invalidation happens automatically, though
I'm not a native English speaker. One suggestion would be to word it like
the paragraph further below, e.g. "clients are responsible for tracking this".)

> + *
> + * The function may return fewer bytes than requested when the data wraps
> + * around in the ring buffer; in this case only a contiguous part of the data
> + * is returned.
> + *
> + * The number of valid bytes returned is populated in *numptr; will always
> + * return at least 1 byte. max must not be 0 or greater than the number of
> + * bytes in the FIFO.
> + *
> + * Clients are responsible for checking the availability of requested data
> + * using fifo8_num_used().
> + *
> + * Returns: A pointer to peekable data.
> + */
> +const uint8_t *fifo8_peek_buf(Fifo8 *fifo, uint32_t max, uint32_t *numptr);
> +
>  /**
>   * fifo8_reset:
>   * @fifo: FIFO to reset
> diff --git a/util/fifo8.c b/util/fifo8.c
> index 032e985440..e12477843e 100644
> --- a/util/fifo8.c
> +++ b/util/fifo8.c
> @@ -66,7 +66,8 @@ uint8_t fifo8_pop(Fifo8 *fifo)
>  return ret;
>  }
>  
> -const uint8_t *fifo8_pop_buf(Fifo8 *fifo, uint32_t max, uint32_t *numptr)
> +static const uint8_t *fifo8_peekpop_buf(Fifo8 *fifo, uint32_t max,
> +uint32_t *numptr, bool do_pop)
>  {
>  uint8_t *ret;
>  uint32_t num;
> @@ -74,15 +75,28 @@ const uint8_t *fifo8_pop_buf(Fifo8 *fifo, uint32_t max, 
> uint32_t *numptr)
>  assert(max > 0 && max <= fifo->num);
>  num = MIN(fifo->capacity - fifo->head, max);
>  ret = &fifo->data[fifo->head];
> -fifo->head += num;
> -fifo->head %= fifo->capacity;
> -fifo->num -= num;
> +
> +if (do_pop) {
> +fifo->head += num;
> +fifo->head %= fifo->capacity;
> +fifo->num -= num;
> +}
>  if (numptr) {
>  *numptr = num;
>  }
>  return ret;
>  }
>  
> +const uint8_t *fifo8_peek_buf(Fifo8 *fifo, uint32_t max, uint32_t *numptr)
> +{
> +return fifo8_peekpop_buf(fifo, max, numptr, false);
> +}
> +
> +const uint8_t *fifo8_pop_buf(Fifo8 *fifo, uint32_t max, uint32_t *numptr)
> +{
> +return fifo8_peekpop_buf(fifo, max, numptr, true);
> +}
> +
>  void fifo8_reset(Fifo8 *fifo)
>  {
>  fifo->num = 0;
> -- 
> 2.38.1
> 
>
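
To make the peek/pop difference and the "contiguous part only" behaviour
described in the doc comment concrete, here is a small Python model of the
logic in the quoted patch. It is a toy model under stated assumptions, not
the QEMU implementation: the class name Fifo8Model is made up, and it
returns a copy of the bytes, whereas the C functions return a pointer
directly into the backing store.

```python
class Fifo8Model:
    """Toy model of a byte ring buffer; not the QEMU Fifo8 code."""

    def __init__(self, capacity):
        self.data = bytearray(capacity)
        self.capacity = capacity
        self.head = 0   # index of the oldest byte
        self.num = 0    # number of bytes currently stored

    def push(self, byte):
        assert self.num < self.capacity
        self.data[(self.head + self.num) % self.capacity] = byte
        self.num += 1

    def _peekpop_buf(self, max_bytes, do_pop):
        assert 0 < max_bytes <= self.num
        # Only the contiguous run up to the end of the store is returned.
        num = min(self.capacity - self.head, max_bytes)
        ret = bytes(self.data[self.head:self.head + num])
        if do_pop:
            # Popping advances the head; peeking leaves the state alone.
            self.head = (self.head + num) % self.capacity
            self.num -= num
        return ret

    def peek_buf(self, max_bytes):
        return self._peekpop_buf(max_bytes, do_pop=False)

    def pop_buf(self, max_bytes):
        return self._peekpop_buf(max_bytes, do_pop=True)
```

With a capacity of 4 and the data wrapped around the end of the store, a
request for 3 bytes returns only the 2 contiguous ones, and a peek leaves
the number of stored bytes unchanged.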

Re: [PATCH v3 3/3] pci: ROM preallocation for incoming migration

2023-05-22 Thread Michael S. Tsirkin

On Mon, May 22, 2023 at 11:44:32AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 19.05.23 08:34, Michael S. Tsirkin wrote:
> > On Mon, May 15, 2023 at 03:52:29PM +0300, Vladimir Sementsov-Ogievskiy 
> > wrote:
> > > On incoming migration we have the following sequence to load option
> > > ROM:
> > > 
> > > 1. On device realize we do normal load ROM from the file
> > > 
> > > 2. Than, on incoming migration we rewrite ROM from the incoming RAM
> > > block. If sizes mismatch we fail, like this:
> > > 
> > >  Size mismatch: :00:03.0/virtio-net-pci.rom: 0x4 != 0x8: 
> > > Invalid argument
> > > 
> > > This is not ideal when we migrate to updated distribution: we have to
> > > keep old ROM files in new distribution and be careful around romfile
> > > property to load correct ROM file. Which is loaded actually just to
> > > allocate the ROM with correct length.
> > > 
> > > Note, that romsize property doesn't really help: if we try to specify
> > > it when default romfile is larger, it fails with something like:
> > > 
> > >  romfile "efi-virtio.rom" (160768 bytes) is too large for ROM size 
> > > 65536
> > > 
> > > Let's just ignore ROM file when romsize is specified and we are in
> > > incoming migration state. In other words, we need only to preallocate
> > > ROM of specified size, local ROM file is unrelated.
> > > 
> > > This way:
> > > 
> > > If romsize was specified on source, we just use same commandline as on
> > > source, and migration will work independently of local ROM files on
> > > target.
> > > 
> > > If romsize was not specified on source (and we have mismatching local
> > > ROM file on target host), we have to specify romsize on target to match
> > > source romsize. romfile parameter may be kept same as on source or may
> > > be dropped, the file is not loaded anyway.
> > > 
> > > As a bonus we avoid extra reading from ROM file on target.
> > > 
> > > Note: when we don't have romsize parameter on source command line and
> > > need it for target, it may be calculated as aligned up to power of two
> > > size of ROM file on source (if we know, which file is it) or,
> > > alternatively it may be retrieved from source QEMU by QMP qom-get
> > > command, like
> > > 
> > >{ "execute": "qom-get",
> > >  "arguments": {
> > >"path": "/machine/peripheral/CARD_ID/virtio-net-pci.rom[0]",
> > >"property": "size" } }
> > > 
> > > Suggested-by: Michael S. Tsirkin 
> > > Signed-off-by: Vladimir Sementsov-Ogievskiy 
> > > Reviewed-by: David Hildenbrand 
> > > Reviewed-by: Juan Quintela 
> > 
> > 
> > Breaks build here:
> > 
> > In function ‘pci_add_option_rom’,
> >  inlined from ‘pci_qdev_realize’ at ../hw/pci/pci.c:2155:5:
> > ../hw/pci/pci.c:2395:13: error: ‘size’ may be used uninitialized 
> > [-Werror=maybe-uninitialized]
> >   2395 | if (load_image_size(path, ptr, size) < 0) {
> >| ^~~~
> > ../hw/pci/pci.c: In function ‘pci_qdev_realize’:
> > ../hw/pci/pci.c:2312:13: note: ‘size’ was declared here
> >   2312 | int64_t size;
> >| ^~~~
> > 
> > 
> 
> Hmm, but works for me. Anyway that's obviously false-positive, if we are 
> here, size is initialized in previous block if (load_file || ..).
> 
> So, may be add simply this:
> 
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index 0f0c83c02f..075c998284 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -2307,7 +2307,7 @@ static void pci_patch_ids(PCIDevice *pdev, uint8_t 
> *ptr, uint32_t size)
>  static void pci_add_option_rom(PCIDevice *pdev, bool is_default_rom,
> Error **errp)
>  {
> -int64_t size;
> +int64_t size = 0;  /* fix "uninitialized" false-positive */

I'd even drop the comment; we will not remember to remove it.
Just mention it in the commit log.


OK, please repost with this fix. It's minor, so include the acks posted so
far. Thanks!

>  g_autofree char *path = NULL;
>  char name[32];
>  const VMStateDescription *vmsd;
> 
> 
> > 
> > > ---
> > >   hw/pci/pci.c | 77 ++--
> > >   1 file changed, 45 insertions(+), 32 deletions(-)
> > > 
> > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > index 3a0107758c..0f0c83c02f 100644
> > > --- a/hw/pci/pci.c
> > > +++ b/hw/pci/pci.c
> > > @@ -36,6 +36,7 @@
> > >   #include "migration/vmstate.h"
> > >   #include "net/net.h"
> > >   #include "sysemu/numa.h"
> > > +#include "sysemu/runstate.h"
> > >   #include "sysemu/sysemu.h"
> > >   #include "hw/loader.h"
> > >   #include "qemu/error-report.h"
> > > @@ -2308,10 +2309,16 @@ static void pci_add_option_rom(PCIDevice *pdev, 
> > > bool is_default_rom,
> > >   {
> > >   int64_t size;
> > >   g_autofree char *path = NULL;
> > > -void *ptr;
> > >   char name[32];
> > >   const VMStateDescription *vmsd;
> > > +/*
> > > + * In case of incoming migration ROM will come with migration 
> > > stream, no
> > > + * reason to load the

Re: [PATCH] ui/cursor: incomplete check for integer overflow in cursor_alloc

2023-05-22 Thread Mauro Matteo Cascella

On Mon, May 22, 2023 at 8:55 PM Philippe Mathieu-Daudé
 wrote:
>
> On 9/5/23 09:13, Marc-André Lureau wrote:
> > Hi
> >
> > On Mon, May 8, 2023 at 6:21 PM Mauro Matteo Cascella
> > <mcasc...@redhat.com> wrote:
> >
> > The cursor_alloc function still accepts a signed integer for both the
> > cursor width and height. A specially crafted negative width/height could
> > make datasize wrap around and cause the next allocation to be 0,
> > potentially leading to a heap buffer overflow. Modify QEMUCursor struct
> > and cursor_alloc prototype to accept unsigned ints.
> >
> > Fixes: CVE-2023-1601
> > Fixes: fa892e9a ("ui/cursor: fix integer overflow in cursor_alloc
> > (CVE-2021-4206)")
> > Signed-off-by: Mauro Matteo Cascella
> > Reported-by: Jacek Halon
> >
> >
> > Reviewed-by: Marc-André Lureau
> >
> > It looks like this is not exploitable, QXL code uses u16 types, and
>
> 0x * 0x * 4 still overflows on 32-bit host, right?
>
> > VMWare VGA checks for values > 256. Other paths use fixed size.
> >
> > ---
> >   include/ui/console.h | 4 ++--
> >   ui/cursor.c  | 2 +-
> >   2 files changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/ui/console.h b/include/ui/console.h
> > index 2a8fab091f..92a4d90a1b 100644
> > --- a/include/ui/console.h
> > +++ b/include/ui/console.h
> > @@ -144,13 +144,13 @@ typedef struct QemuUIInfo {
> >
> >   /* cursor data format is 32bit RGBA */
> >   typedef struct QEMUCursor {
> > -int width, height;
> > +uint32_t width, height;
> >   int hot_x, hot_y;
> >   int refcount;
> >   uint32_t data[];
> >   } QEMUCursor;
> >
> > -QEMUCursor *cursor_alloc(int width, int height);
> > +QEMUCursor *cursor_alloc(uint32_t width, uint32_t height);
> >   QEMUCursor *cursor_ref(QEMUCursor *c);
> >   void cursor_unref(QEMUCursor *c);
> >   QEMUCursor *cursor_builtin_hidden(void);
> > diff --git a/ui/cursor.c b/ui/cursor.c
> > index 6fe67990e2..b5fcb64839 100644
> > --- a/ui/cursor.c
> > +++ b/ui/cursor.c
> > @@ -90,7 +90,7 @@ QEMUCursor *cursor_builtin_left_ptr(void)
> >   return cursor_parse_xpm(cursor_left_ptr_xpm);
> >   }
> >
> > -QEMUCursor *cursor_alloc(int width, int height)
> > +QEMUCursor *cursor_alloc(uint32_t width, uint32_t height)
> >   {
> >   QEMUCursor *c;
>
> Can't we check width/height > 0 && <= SOME_LIMIT_THAT_MAKES_SENSE?

We currently ensure width/height are less than 512 in cursor_alloc.

Checking for positive values is unnecessary if we make width/height
unsigned, isn't it?
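
A minimal sketch of that reasoning, not the actual ui/cursor.c code: with unsigned parameters, a single upper-bound check also rejects what used to be negative values, and the multiplication is only done after the check. The names cursor_datasize_sketch and CURSOR_MAX_DIM are made up for illustration, and the limit of 512 is taken from the check mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit for this sketch; the thread mentions a 512 bound. */
#define CURSOR_MAX_DIM 512u

/*
 * Compute the pixel buffer size for a width x height cursor with 32-bit
 * pixels. Returns false if either dimension exceeds the limit. A value
 * that was negative as an int shows up here as a huge unsigned number,
 * so the same comparison rejects it.
 */
bool cursor_datasize_sketch(uint32_t width, uint32_t height,
                            size_t *datasize)
{
    assert(datasize != NULL);

    if (width > CURSOR_MAX_DIM || height > CURSOR_MAX_DIM) {
        return false;
    }
    /* At most 512 * 512 * 4 bytes (1 MiB), so this cannot wrap. */
    *datasize = (size_t)width * height * sizeof(uint32_t);
    return true;
}
```

A caller would bail out (for example by returning NULL from the allocator) whenever the helper returns false.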

> Maybe a 16K * 16K cursor is future proof and safe enough.
>
> >   size_t datasize = width * height * sizeof(uint32_t);
> > --
> > 2.40.1
> >
> >
> >
> >
> > --
> > Marc-André Lureau
>
--
Mauro Matteo Cascella
Red Hat Product Security
PGP-Key ID: BB3410B0

Re: [PATCH 02/12] util/fifo8: Allow fifo8_pop_buf() to not populate popped length

2023-05-22 Thread Francisco Iglesias

On [2023 May 22] Mon 17:31:34, Philippe Mathieu-Daudé wrote:
> There might be cases where we know the number of bytes we can
> pop from the FIFO, or we simply don't care how many bytes is
> returned. Allow fifo8_pop_buf() to take a NULL numptr.
> 
> Signed-off-by: Philippe Mathieu-Daudé 

Reviewed-by: Francisco Iglesias 


> ---
>  include/qemu/fifo8.h | 10 +-
>  util/fifo8.c | 12 
>  2 files changed, 13 insertions(+), 9 deletions(-)
> 
> diff --git a/include/qemu/fifo8.h b/include/qemu/fifo8.h
> index 16be02f361..d0d02bc73d 100644
> --- a/include/qemu/fifo8.h
> +++ b/include/qemu/fifo8.h
> @@ -71,7 +71,7 @@ uint8_t fifo8_pop(Fifo8 *fifo);
>   * fifo8_pop_buf:
>   * @fifo: FIFO to pop from
>   * @max: maximum number of bytes to pop
> - * @num: actual number of returned bytes
> + * @numptr: pointer filled with number of bytes returned (can be NULL)
>   *
>   * Pop a number of elements from the FIFO up to a maximum of max. The buffer
>   * containing the popped data is returned. This buffer points directly into
> @@ -82,16 +82,16 @@ uint8_t fifo8_pop(Fifo8 *fifo);
>   * around in the ring buffer; in this case only a contiguous part of the data
>   * is returned.
>   *
> - * The number of valid bytes returned is populated in *num; will always return
> - * at least 1 byte. max must not be 0 or greater than the number of bytes in
> - * the FIFO.
> + * The number of valid bytes returned is populated in *numptr; will always
> + * return at least 1 byte. max must not be 0 or greater than the number of
> + * bytes in the FIFO.
>   *
>   * Clients are responsible for checking the availability of requested data
>   * using fifo8_num_used().
>   *
>   * Returns: A pointer to popped data.
>   */
> -const uint8_t *fifo8_pop_buf(Fifo8 *fifo, uint32_t max, uint32_t *num);
> +const uint8_t *fifo8_pop_buf(Fifo8 *fifo, uint32_t max, uint32_t *numptr);
>  
>  /**
>   * fifo8_reset:
> diff --git a/util/fifo8.c b/util/fifo8.c
> index d4d1c135e0..032e985440 100644
> --- a/util/fifo8.c
> +++ b/util/fifo8.c
> @@ -66,16 +66,20 @@ uint8_t fifo8_pop(Fifo8 *fifo)
>  return ret;
>  }
>  
> -const uint8_t *fifo8_pop_buf(Fifo8 *fifo, uint32_t max, uint32_t *num)
> +const uint8_t *fifo8_pop_buf(Fifo8 *fifo, uint32_t max, uint32_t *numptr)
>  {
>      uint8_t *ret;
> +    uint32_t num;
>  
>      assert(max > 0 && max <= fifo->num);
> -    *num = MIN(fifo->capacity - fifo->head, max);
> +    num = MIN(fifo->capacity - fifo->head, max);
>      ret = &fifo->data[fifo->head];
> -    fifo->head += *num;
> +    fifo->head += num;
>      fifo->head %= fifo->capacity;
> -    fifo->num -= *num;
> +    fifo->num -= num;
> +    if (numptr) {
> +        *numptr = num;
> +    }
>      return ret;
>  }
>  
> -- 
> 2.38.1
> 
>
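
For readers skimming the thread, here is a self-contained sketch of the optional out-parameter pattern the patch introduces. It is not the QEMU implementation: SketchFifo and sketch_fifo_pop_buf are hypothetical stand-ins for Fifo8 and fifo8_pop_buf, reduced to the fields the hunk above touches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for QEMU's Fifo8, for illustration only. */
typedef struct {
    uint8_t *data;
    uint32_t capacity;
    uint32_t head;
    uint32_t num;
} SketchFifo;

/*
 * Pop up to max contiguous bytes and return a pointer into the ring
 * buffer. The count is written to *numptr only when the caller passes
 * a non-NULL pointer.
 */
const uint8_t *sketch_fifo_pop_buf(SketchFifo *fifo, uint32_t max,
                                   uint32_t *numptr)
{
    uint32_t contiguous = fifo->capacity - fifo->head;
    uint32_t num = contiguous < max ? contiguous : max;
    const uint8_t *ret;

    assert(max > 0 && max <= fifo->num);
    ret = &fifo->data[fifo->head];
    fifo->head = (fifo->head + num) % fifo->capacity;
    fifo->num -= num;
    if (numptr) {
        *numptr = num;
    }
    return ret;
}
```

A caller that already knows how many bytes are available can pass NULL; note that the pop may still return fewer than max bytes when the data wraps around the end of the buffer.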

[PATCH v3 12/19] cutils: Allow NULL str in qemu_strtosz

2023-05-22 Thread Eric Blake

All the other qemu_strto* and parse_uint allow a NULL str.  Having
qemu_strtosz not crash on qemu_strtosz(NULL, NULL, &val) is an easy
fix that adds some consistency between our string parsers.

Signed-off-by: Eric Blake 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Hanna Czenczek 
---

v3: hoist hunk of do_strtosz_full from later patch [Hanna], R-b added
---
 tests/unit/test-cutils.c | 10 +-
 util/cutils.c|  2 +-
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/tests/unit/test-cutils.c b/tests/unit/test-cutils.c
index ebc6015a600..becac209987 100644
--- a/tests/unit/test-cutils.c
+++ b/tests/unit/test-cutils.c
@@ -3285,7 +3285,12 @@ static void do_strtosz_full(const char *str, qemu_strtosz_fn fn,
     ret = fn(str, &endptr, &val);
     g_assert_cmpint(ret, ==, exp_ptr_ret);
     g_assert_cmpuint(val, ==, exp_ptr_val);
-    g_assert_true(endptr == str + exp_ptr_offset);
+    if (str) {
+        g_assert_true(endptr == str + exp_ptr_offset);
+    } else {
+        g_assert_cmpint(exp_ptr_offset, ==, 0);
+        g_assert_null(endptr);
+    }

     val = 0xbaadf00d;
     ret = fn(str, NULL, &val);
@@ -3383,6 +3388,9 @@ static void test_qemu_strtosz_float(void)

 static void test_qemu_strtosz_invalid(void)
 {
+    do_strtosz(NULL, -EINVAL, 0xbaadf00d, 0);
+
+    /* Must parse at least one digit */
     do_strtosz("", -EINVAL, 0xbaadf00d, 0);
     do_strtosz(" \t ", -EINVAL, 0xbaadf00d, 0);
     do_strtosz("crap", -EINVAL, 0xbaadf00d, 0);
diff --git a/util/cutils.c b/util/cutils.c
index 56a2aced8d4..1dc67d201dc 100644
--- a/util/cutils.c
+++ b/util/cutils.c
@@ -306,7 +306,7 @@ static int do_strtosz(const char *nptr, const char **end,
 out:
     if (end) {
         *end = endptr;
-    } else if (*endptr) {
+    } else if (nptr && *endptr) {
         retval = -EINVAL;
     }
     if (retval == 0) {
-- 
2.40.1
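
The one-line fix is easier to follow with the whole exit path in view. The sketch below is a simplified stand-in, not util/cutils.c: sketch_strtosz is a hypothetical name, and it only parses a plain decimal number (no size suffixes or fractions). What it keeps from the patch is the shape of the out: label, where endptr is NULL whenever nptr is NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Simplified sketch: parse an unsigned decimal number.
 * A NULL nptr gives -EINVAL instead of a crash. A NULL end means that
 * trailing characters are an error. *result is only written on success.
 */
int sketch_strtosz(const char *nptr, const char **end, uint64_t *result)
{
    const char *endptr = nptr;
    char *stop;
    unsigned long long val = 0;
    int retval = 0;

    assert(result != NULL);

    if (!nptr) {
        retval = -EINVAL;
        goto out;
    }
    errno = 0;
    val = strtoull(nptr, &stop, 10);
    if (stop == nptr) {
        retval = -EINVAL;           /* no digits consumed */
        goto out;
    }
    endptr = stop;
    if (errno == ERANGE) {
        retval = -ERANGE;
    }

out:
    if (end) {
        *end = endptr;
    } else if (nptr && *endptr) {
        /* The nptr check keeps us from dereferencing a NULL endptr. */
        retval = -EINVAL;
    }
    if (retval == 0) {
        *result = val;
    }
    return retval;
}
```

Without the nptr guard, the sketch_strtosz(NULL, NULL, &val) case would read through a NULL endptr, which is the crash the patch removes.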

Re: [PATCH 7/8] python/qemu: allow avocado to set logging name space

2023-05-22 Thread John Snow

On Fri, May 19, 2023 at 2:39 AM Alex Bennée  wrote:
>
>
> John Snow  writes:
>
> > On Thu, May 18, 2023 at 12:20 PM Alex Bennée  wrote:
> >>
> >> Since the update to the latest version Avocado only automatically
> >> collects logging under the avocado name space. Tweak the QEMUMachine
> >> class to allow avocado to bring logging under its name space. This
> >> also allows useful tricks like:
> >>
> >>   ./avocado --show avocado.qemu.machine run path/to/test
> >>
> >> if you want to quickly get the machine invocation out of a test
> >> without searching deeply through the logs.
> >>
> >
> > Huh. That's kind of weird though, right? Each Python module is
> > intended to log to its own namespace by design; it feels like Avocado
> > really ought to have configuration options that allows it to collect
> > logging from other namespaces. I'm not against this patch, but if for
> > instance I wind up splitting qemu.machine out as a separate module
> > someday (like I did to qemu.qmp), then it feels weird to add options
> > specifically for fudging the logging hierarchy.
>
> According to the docs it does but I couldn't get it to work so this is a
> sticking plaster over that. If it gets fixed in later avocado's it is
> easy enough to remove.
>

Fair enough ...

Cleber, any input?

> > Also, what about the QMP logging? I don't suppose this will trickle
> > down to that level either.
>
> I can certainly add that - but it would need a similar hook.

Right... this is why I am wondering if it isn't just simpler to
configure Avocado to just relay everything from the qemu.* namespace,
which will be easier in the long run ... but hey, if it's broken, it's
broken :(

ACK to the bandaid.

--js

>
> >
> > Worried this is kind of incomplete.
> >
> > --js
> >
> >> Signed-off-by: Alex Bennée 
> >> ---
> >>  python/qemu/machine/machine.py | 42 ++
> >>  tests/avocado/avocado_qemu/__init__.py |  3 +-
> >>  2 files changed, 24 insertions(+), 21 deletions(-)
> >>
> >> diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
> >> index e57c254484..402b9a0df9 100644
> >> --- a/python/qemu/machine/machine.py
> >> +++ b/python/qemu/machine/machine.py
> >> @@ -49,10 +49,6 @@
> >>
> >>  from . import console_socket
> >>
> >> -
> >> -LOG = logging.getLogger(__name__)
> >> -
> >> -
> >>  class QEMUMachineError(Exception):
> >>  """
> >>  Exception called when an error in QEMUMachine happens.
> >> @@ -131,6 +127,7 @@ def __init__(self,
> >>   drain_console: bool = False,
> >>   console_log: Optional[str] = None,
> >>   log_dir: Optional[str] = None,
> >> + log_namespace: Optional[str] = None,
> >>   qmp_timer: Optional[float] = 30):
> >>  '''
> >>  Initialize a QEMUMachine
> >> @@ -164,6 +161,11 @@ def __init__(self,
> >>  self._sock_dir = sock_dir
> >>  self._log_dir = log_dir
> >>
> >> +if log_namespace:
> >> +self.log = logging.getLogger(log_namespace)
> >> +else:
> >> +self.log = logging.getLogger(__name__)
> >> +
> >>  self._monitor_address = monitor_address
> >>
> >>  self._console_log_path = console_log
> >> @@ -382,11 +384,11 @@ def _post_shutdown(self) -> None:
> >>  Called to cleanup the VM instance after the process has exited.
> >>  May also be called after a failed launch.
> >>  """
> >> -LOG.debug("Cleaning up after VM process")
> >> +self.log.debug("Cleaning up after VM process")
> >>  try:
> >>  self._close_qmp_connection()
> >>  except Exception as err:  # pylint: disable=broad-except
> >> -LOG.warning(
> >> +self.log.warning(
> >>  "Exception closing QMP connection: %s",
> >>  str(err) if str(err) else type(err).__name__
> >>  )
> >> @@ -414,7 +416,7 @@ def _post_shutdown(self) -> None:
> >>  command = ' '.join(self._qemu_full_args)
> >>  else:
> >>  command = ''
> >> -LOG.warning(msg, -int(exitcode), command)
> >> +self.log.warning(msg, -int(exitcode), command)
> >>
> >>  self._quit_issued = False
> >>  self._user_killed = False
> >> @@ -458,7 +460,7 @@ def _launch(self) -> None:
> >>  Launch the VM and establish a QMP connection
> >>  """
> >>  self._pre_launch()
> >> -LOG.debug('VM launch command: %r', ' '.join(self._qemu_full_args))
> >> +self.log.debug('VM launch command: %r', ' '.join(self._qemu_full_args))
> >>
> >>  # Cleaning up of this subprocess is guaranteed by _do_shutdown.
> >>  # pylint: disable=consider-using-with
> >> @@ -507,7 +509,7 @@ def _early_cleanup(self) -> None:
> >>  # for QEMU to exit, while QEMU is waiting for the socket to
> >>  # become writable.
> >>  if

[PATCH v3 08/19] cutils: Allow NULL endptr in parse_uint()

2023-05-22 Thread Eric Blake

All the qemu_strto*() functions permit a NULL endptr, just like their
libc counterparts, leaving parse_uint() as the oddball that caused
SEGFAULT on NULL and required the user to call parse_uint_full()
instead.  Relax things for consistency, even though the testsuite is
the only impacted caller.  Add one more unit test to ensure even
parse_uint_full(NULL, 0, &value) works.  This also fixes our code to
uniformly favor EINVAL over ERANGE when both apply.

Also fixes a doc mismatch @v vs. a parameter named value.

Signed-off-by: Eric Blake 
Reviewed-by: Hanna Czenczek 
---

v3: commit message tweak, R-b added
---
 tests/unit/test-cutils.c | 18 --
 util/cutils.c| 34 --
 2 files changed, 28 insertions(+), 24 deletions(-)

diff --git a/tests/unit/test-cutils.c b/tests/unit/test-cutils.c
index 70469b583d3..2ac96117995 100644
--- a/tests/unit/test-cutils.c
+++ b/tests/unit/test-cutils.c
@@ -270,14 +270,26 @@ static void test_parse_uint_full_correct(void)

 static void test_parse_uint_full_erange_junk(void)
 {
-    /* FIXME - inconsistent with qemu_strto* which favors EINVAL */
+    /* EINVAL has priority over ERANGE */
     uint64_t i = 999;
     const char *str = "-2junk";
     int r;

     r = parse_uint_full(str, 0, &i);

-    g_assert_cmpint(r, ==, -ERANGE /* FIXME -EINVAL */);
+    g_assert_cmpint(r, ==, -EINVAL);
+    g_assert_cmpuint(i, ==, 0);
+}
+
+static void test_parse_uint_full_null(void)
+{
+    uint64_t i = 999;
+    const char *str = NULL;
+    int r;
+
+    r = parse_uint_full(str, 0, &i);
+
+    g_assert_cmpint(r, ==, -EINVAL);
     g_assert_cmpuint(i, ==, 0);
 }

@@ -3328,6 +3340,8 @@ int main(int argc, char **argv)
                     test_parse_uint_full_correct);
     g_test_add_func("/cutils/parse_uint_full/erange_junk",
                     test_parse_uint_full_erange_junk);
+    g_test_add_func("/cutils/parse_uint_full/null",
+                    test_parse_uint_full_null);

     /* qemu_strtoi() tests */
     g_test_add_func("/cutils/qemu_strtoi/correct",
diff --git a/util/cutils.c b/util/cutils.c
index 0e279a531aa..56a2aced8d4 100644
--- a/util/cutils.c
+++ b/util/cutils.c
@@ -722,8 +722,7 @@ const char *qemu_strchrnul(const char *s, int c)
  * parse_uint:
  *
  * @s: String to parse
- * @endptr: Destination for pointer to first character not consumed, must
- * not be %NULL
+ * @endptr: Destination for pointer to first character not consumed
  * @base: integer base, between 2 and 36 inclusive, or 0
  * @value: Destination for parsed integer value
  *
@@ -737,7 +736,8 @@ const char *qemu_strchrnul(const char *s, int c)
  *
  * Set *@endptr to point right beyond the parsed integer (even if the integer
  * overflows or is negative, all digits will be parsed and *@endptr will
- * point right beyond them).
+ * point right beyond them).  If @endptr is %NULL, any trailing character
+ * instead causes a result of -EINVAL with *@value of 0.
  *
  * If the integer is negative, set *@value to 0, and return -ERANGE.
  * (If you want to allow negative numbers that wrap around within
@@ -784,7 +784,12 @@ int parse_uint(const char *s, const char **endptr, int base, uint64_t *value)

 out:
     *value = val;
-    *endptr = endp;
+    if (endptr) {
+        *endptr = endp;
+    } else if (s && *endp) {
+        r = -EINVAL;
+        *value = 0;
+    }
     return r;
 }

@@ -795,28 +800,13 @@ out:
  * @base: integer base, between 2 and 36 inclusive, or 0
  * @value: Destination for parsed integer value
  *
- * Parse unsigned integer from entire string
+ * Parse unsigned integer from entire string, rejecting any trailing slop.
  *
- * Have the same behavior of parse_uint(), but with an additional
- * check for additional data after the parsed number. If extra
- * characters are present after a non-overflowing parsed number, the
- * function will return -EINVAL, and *@v will be set to 0.
+ * Shorthand for parse_uint(s, NULL, base, value).
  */
 int parse_uint_full(const char *s, int base, uint64_t *value)
 {
-    const char *endp;
-    int r;
-
-    r = parse_uint(s, &endp, base, value);
-    if (r < 0) {
-        return r;
-    }
-    if (*endp) {
-        *value = 0;
-        return -EINVAL;
-    }
-
-    return 0;
+    return parse_uint(s, NULL, base, value);
 }

 int qemu_parse_fd(const char *param)
-- 
2.40.1
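
To see how the relaxed contract behaves end to end, here is a standalone sketch modelled on the hunks above. It is an approximation rather than a copy of util/cutils.c: sketch_parse_uint and sketch_parse_uint_full are hypothetical names, and the body before the out: label is a plausible reconstruction built on strtoull().

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse an unsigned integer. With a non-NULL endptr, trailing text is
 * reported through *endptr. With a NULL endptr, trailing text turns the
 * result into -EINVAL with *value set to 0, even if the number itself
 * was negative or out of range.
 */
int sketch_parse_uint(const char *s, const char **endptr, int base,
                      uint64_t *value)
{
    int r = 0;
    char *endp = (char *)s;
    unsigned long long val = 0;

    assert((unsigned)base <= 36 && base != 1);
    if (!s) {
        r = -EINVAL;
        goto out;
    }

    errno = 0;
    val = strtoull(s, &endp, base);
    if (errno) {
        r = -errno;
        goto out;
    }
    if (endp == s) {
        r = -EINVAL;                /* nothing was parsed */
        goto out;
    }

    /* Reject negative numbers instead of letting them wrap around. */
    while (isspace((unsigned char)*s)) {
        s++;
    }
    if (*s == '-') {
        val = 0;
        r = -ERANGE;
    }

out:
    *value = val;
    if (endptr) {
        *endptr = endp;
    } else if (s && *endp) {
        r = -EINVAL;
        *value = 0;
    }
    return r;
}

/* Whole-string variant: shorthand for a NULL endptr. */
int sketch_parse_uint_full(const char *s, int base, uint64_t *value)
{
    return sketch_parse_uint(s, NULL, base, value);
}
```

With this shape, "-2junk" passed to the whole-string variant first becomes -ERANGE and is then overridden by -EINVAL at the out: label, which is the priority the updated unit test checks.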

[PATCH v3 06/19] cutils: Document differences between parse_uint and qemu_strtou64

2023-05-22 Thread Eric Blake

These two functions are subtly different, and not just because of
swapped parameter order.  It took me adding better unit tests to
figure out why.  Document the differences to make it more obvious to
developers trying to pick which one to use, as well as to aid in
upcoming semantic changes.

While touching the documentation, adjust a mis-statement: parse_uint
does not return -EINVAL on invalid base, but assert()s, like all the
other qemu_strto* functions that take a base argument.

Signed-off-by: Eric Blake 
Reviewed-by: Hanna Czenczek 
---
 util/cutils.c | 20 
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/util/cutils.c b/util/cutils.c
index 9b6ce9179c4..36c14b769fd 100644
--- a/util/cutils.c
+++ b/util/cutils.c
@@ -611,6 +611,8 @@ int qemu_strtoi64(const char *nptr, const char **endptr, int base,
  * Convert string @nptr to an uint64_t.
  *
  * Works like qemu_strtoul(), except it stores UINT64_MAX on overflow.
+ * (If you want to prohibit negative numbers that wrap around to
+ * positive, use parse_uint()).
  */
 int qemu_strtou64(const char *nptr, const char **endptr, int base,
                   uint64_t *result)
@@ -721,7 +723,8 @@ const char *qemu_strchrnul(const char *s, int c)
  *
  * @s: String to parse
  * @value: Destination for parsed integer value
- * @endptr: Destination for pointer to first character not consumed
+ * @endptr: Destination for pointer to first character not consumed, must
+ * not be %NULL
  * @base: integer base, between 2 and 36 inclusive, or 0
  *
  * Parse unsigned integer
@@ -729,15 +732,16 @@ const char *qemu_strchrnul(const char *s, int c)
  * Parsed syntax is like strtoull()'s: arbitrary whitespace, a single optional
  * '+' or '-', an optional "0x" if @base is 0 or 16, one or more digits.
  *
- * If @s is null, or @base is invalid, or @s doesn't start with an
- * integer in the syntax above, set *@value to 0, *@endptr to @s, and
- * return -EINVAL.
+ * If @s is null, or @s doesn't start with an integer in the syntax
+ * above, set *@value to 0, *@endptr to @s, and return -EINVAL.
  *
  * Set *@endptr to point right beyond the parsed integer (even if the integer
  * overflows or is negative, all digits will be parsed and *@endptr will
  * point right beyond them).
  *
  * If the integer is negative, set *@value to 0, and return -ERANGE.
+ * (If you want to allow negative numbers that wrap around within
+ * bounds, use qemu_strtou64()).
  *
  * If the integer overflows unsigned long long, set *@value to
  * ULLONG_MAX, and return -ERANGE.
@@ -794,10 +798,10 @@ out:
  *
  * Parse unsigned integer from entire string
  *
- * Have the same behavior of parse_uint(), but with an additional check
- * for additional data after the parsed number. If extra characters are present
- * after the parsed number, the function will return -EINVAL, and *@v will
- * be set to 0.
+ * Have the same behavior of parse_uint(), but with an additional
+ * check for additional data after the parsed number. If extra
+ * characters are present after a non-overflowing parsed number, the
+ * function will return -EINVAL, and *@v will be set to 0.
  */
 int parse_uint_full(const char *s, unsigned long long *value, int base)
 {
-- 
2.40.1
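
The difference in how negative input is treated can be shown with two small helpers. This is a sketch under assumptions: sketch_wrapping_u64 and sketch_strict_u64 are hypothetical names that only mimic the documented behaviour of qemu_strtou64() and parse_uint() for a leading '-', using base 10 and no endptr.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Wrapping flavour: "-1" is accepted and wraps around, as with strtoull(). */
int sketch_wrapping_u64(const char *s, uint64_t *result)
{
    char *end;

    assert(s != NULL && result != NULL);
    errno = 0;
    *result = strtoull(s, &end, 10);
    if (end == s) {
        return -EINVAL;             /* nothing was parsed */
    }
    return errno == ERANGE ? -ERANGE : 0;
}

/* Strict flavour: a leading '-' gives -ERANGE and a value of 0. */
int sketch_strict_u64(const char *s, uint64_t *result)
{
    const char *p = s;
    int r = sketch_wrapping_u64(s, result);

    while (isspace((unsigned char)*p)) {
        p++;
    }
    if (r != -EINVAL && *p == '-') {
        *result = 0;
        return -ERANGE;
    }
    return r;
}
```

Callers that treat the input as a count or size usually want the strict flavour; the wrapping one is only appropriate when a two's-complement style "-1" is a meaningful input.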
