Re: Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
On Fri, Jul 14, 2023 at 3:43 AM Conor Dooley wrote: > > +CC OpenSBI Mailing list > > I've not yet had the chance to bisect this, so adding the OpenSBI folks > to CC in case they might have an idea for what to try. > > And a question for you below Daniel. > > On Wed, Jul 12, 2023 at 11:14:21PM +0100, Conor Dooley wrote: > > On Wed, Jul 12, 2023 at 06:39:28PM -0300, Daniel Henrique Barboza wrote: > > > On 7/12/23 18:35, Conor Dooley wrote: > > > > On Wed, Jul 12, 2023 at 06:09:10PM -0300, Daniel Henrique Barboza wrote: > > > > > > > > > It is intentional. Those default marchid/mimpid vals were derived > > > > > from the current > > > > > QEMU version ID/build and didn't mean much. > > > > > > > > > > It is still possible to set them via "-cpu rv64,marchid=N,mimpid=N" > > > > > if needed when > > > > > using the generic (rv64,rv32) CPUs. Vendor CPUs can't have their > > > > > machine IDs changed > > > > > via command line. > > > > > > > > Sounds good, thanks. I did just now go and check icicle to see what it > > > > would report & it does not boot. I'll go bisect... > > > > > > BTW how are you booting the icicle board nowadays? I remember you > > > mentioning about > > > some changes in the FDT being required to boot and whatnot. > > > > I do direct kernel boots, as the HSS doesn't work anymore, and just lie > > a bit to QEMU about how much DDR we have. > > .PHONY: qemu-icicle > > qemu-icicle: > > $(qemu) -M microchip-icicle-kit \ > > -m 3G -smp 5 \ > > -kernel $(vmlinux_bin) \ > > -dtb $(icicle_dtb) \ > > -initrd $(initramfs) \ > > -display none -serial null \ > > -serial stdio \ > > -D qemu.log -d unimp > > > > The platform only supports 2 GiB of DDR, not 3, but if I pass 2 to QEMU > > it thinks there's 1 GiB at 0x8000_ and 1 GiB at 0x10__. The > > upstream devicetree (and current FPGA reference design) expects there to > > be 1 GiB at 0x8000_ and 1 GiB at 0x10_4000_. 
If I lie to QEMU, > > it thinks there is 1 GiB at 0x8000_ and 2 GiB at 0x10__, and > > things just work. I prefer doing it this way than having to modify the > > DT, it is a lot easier to explain to people this way. > > > > I've been meaning to work the support for the icicle & mpfs in QEMU, but > > it just gets shunted down the priority list. I'd really like if a proper > > boot flow would run in QEMU, which means fixing whatever broke the HSS, > > but I've recently picked up maintainership of dt-binding stuff in Linux, > > so I've unfortunately got even less time to try and work on it. Maybe > > we'll get some new graduate in and I can make them suffer in my stead... > > > > > If it's not too hard I'll add it in my test scripts to keep it under > > > check. Perhaps > > > we can even add it to QEMU testsuite. > > > > I don't think it really should be that bad, at least for the direct > > kernel boot, which is what I mainly care about, since I use it fairly > > often for debugging boot stuff in Linux. > > > > Anyways, aa903cf31391dd505b399627158f1292a6d19896 is the first bad commit: > > commit aa903cf31391dd505b399627158f1292a6d19896 > > Author: Bin Meng > > Date: Fri Jun 30 23:36:04 2023 +0800 > > > > roms/opensbi: Upgrade from v1.2 to v1.3 > > > > Upgrade OpenSBI from v1.2 to v1.3 and the pre-built bios images. 
> > > > And I see something like: > > qemu//build/qemu-system-riscv64 -M microchip-icicle-kit \ > > -m 3G -smp 5 \ > > -kernel vmlinux.bin \ > > -dtb icicle.dtb \ > > -initrd initramfs.cpio.gz \ > > -display none -serial null \ > > -serial stdio \ > > -D qemu.log -d unimp > > > qemu-system-riscv64: warning: disabling zca extension for hart > > 0x because privilege spec version does not match > > qemu-system-riscv64: warning: disabling zca extension for hart > > 0x0001 because privilege spec version does not match > > qemu-system-riscv64: warning: disabling zcd extension for hart > > 0x0001 because privilege spec version does not match > > qemu-system-riscv64: warning: disabling zca extension for hart > > 0x0002 because privilege spec version does not match > > qemu-system-riscv64: warning: disabling zcd extension for hart > > 0x0002 because privilege spec version does not match > > qemu-system-riscv64: warning: disabling zca extension for hart > > 0x0003 because privilege spec version does not match > > qemu-system-riscv64: warning: disabling zcd extension for hart > > 0x0003 because privilege spec version does not match > > qemu-system-riscv64: warning: disabling zca extension for hart > > 0x0004 because privilege spec version does not match > > qemu-system-riscv64: warning: disabling zcd extension for hart > > 0x0004 because privilege spec version does not match > > Why am I seeing these warnings? Does the
[PATCH] plugins: Set final instruction count in plugin_gen_tb_end
Translation logic may partially decode an instruction, then abort and
remove the instruction from the TB. This can happen, for example, when an
instruction spans two pages. In this case, plugins may get an incorrect
result when calling qemu_plugin_tb_n_insns to query for the number of
instructions in the TB. This patch updates plugin_gen_tb_end to set the
final instruction count.

Signed-off-by: Matt Borgerson
---
 accel/tcg/plugin-gen.c    | 5 ++++-
 accel/tcg/translator.c    | 2 +-
 include/exec/plugin-gen.h | 4 ++--
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index 5c13615112..809529990a 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -866,10 +866,13 @@ void plugin_gen_insn_end(void)
  * do any clean-up here and make sure things are reset in
  * plugin_gen_tb_start.
  */
-void plugin_gen_tb_end(CPUState *cpu)
+void plugin_gen_tb_end(CPUState *cpu, int num_insns)
 {
     struct qemu_plugin_tb *ptb = tcg_ctx->plugin_tb;
 
+    /* translator may have removed instructions, update final count */
+    ptb->n = num_insns;
+
     /* collect instrumentation requests */
     qemu_plugin_tb_trans_cb(cpu, ptb);
 
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 0fd9efceba..141f514886 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -215,7 +215,7 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int *max_insns,
     gen_tb_end(tb, cflags, icount_start_insn, db->num_insns);
 
     if (plugin_enabled) {
-        plugin_gen_tb_end(cpu);
+        plugin_gen_tb_end(cpu, db->num_insns);
     }
 
     /* The disas_log hook may use these values rather than recompute.  */
diff --git a/include/exec/plugin-gen.h b/include/exec/plugin-gen.h
index 52828781bc..4feaa47b08 100644
--- a/include/exec/plugin-gen.h
+++ b/include/exec/plugin-gen.h
@@ -20,7 +20,7 @@ struct DisasContextBase;
 
 bool plugin_gen_tb_start(CPUState *cpu, const struct DisasContextBase *db,
                          bool supress);
-void plugin_gen_tb_end(CPUState *cpu);
+void plugin_gen_tb_end(CPUState *cpu, int num_insns);
 void plugin_gen_insn_start(CPUState *cpu, const struct DisasContextBase *db);
 void plugin_gen_insn_end(void);
 
@@ -42,7 +42,7 @@ void plugin_gen_insn_start(CPUState *cpu, const struct DisasContextBase *db)
 static inline void plugin_gen_insn_end(void)
 { }
 
-static inline void plugin_gen_tb_end(CPUState *cpu)
+static inline void plugin_gen_tb_end(CPUState *cpu, int num_insns)
 { }
 
 static inline void plugin_gen_disable_mem_helpers(void)
-- 
2.34.1
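The stale count described in this patch can be illustrated with a small standalone model. This is only a sketch under stated assumptions: struct tb_model, translate_block() and the page-crossing rule are hypothetical stand-ins written for illustration, not QEMU code, and the real translator is considerably more involved.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u

/* Hypothetical stand-in for the plugin's view of a TB (not QEMU's struct). */
struct tb_model {
    size_t n;   /* instruction count a plugin would query */
};

/*
 * Toy translator loop: "decodes" instructions of the given lengths,
 * starting at pc. Each instruction is recorded for the plugin before it
 * is known whether it fits; if it crosses the page boundary (and is not
 * the first instruction), the translator drops it again.
 * Returns the translator's own final instruction count.
 */
static size_t translate_block(struct tb_model *ptb, unsigned long pc,
                              const unsigned *insn_len, size_t max_insns,
                              int fix_final_count)
{
    size_t num_insns = 0;
    unsigned long page_end = (pc / PAGE_SIZE + 1) * PAGE_SIZE;

    ptb->n = 0;
    for (size_t i = 0; i < max_insns; i++) {
        ptb->n++;                   /* plugin bookkeeping happens first */
        num_insns++;
        if (num_insns > 1 && pc + insn_len[i] > page_end) {
            num_insns--;            /* translator removes the instruction */
            break;                  /* ptb->n is now one too high */
        }
        pc += insn_len[i];
    }
    if (fix_final_count) {
        ptb->n = num_insns;         /* mirrors the assignment in the patch */
    }
    assert(num_insns <= ptb->n);    /* the plugin never sees fewer insns */
    return num_insns;
}
```

With three 4-byte instructions starting 10 bytes before a page boundary, the third one crosses the page and is dropped: without the final assignment the model reports three instructions to the plugin while only two were translated.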
Re: [PATCH v3 06/16] target/riscv: Restrict riscv_cpu_do_interrupt() to sysemu
On Tue, Jul 11, 2023 at 10:20 PM Philippe Mathieu-Daudé wrote:
>
> riscv_cpu_do_interrupt() is not reachable on user emulation.
>
> Signed-off-by: Philippe Mathieu-Daudé

Reviewed-by: Alistair Francis

Alistair

> ---
>  target/riscv/cpu.h        | 5 +++--
>  target/riscv/cpu_helper.c | 7 ++-----
>  2 files changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index dba78db644..0602b948d4 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -416,7 +416,6 @@ extern const char * const riscv_int_regnamesh[];
>  extern const char * const riscv_fpr_regnames[];
>
>  const char *riscv_cpu_get_trap_name(target_ulong cause, bool async);
> -void riscv_cpu_do_interrupt(CPUState *cpu);
>  int riscv_cpu_write_elf64_note(WriteCoreDumpFunction f, CPUState *cs,
>                                 int cpuid, DumpState *s);
>  int riscv_cpu_write_elf32_note(WriteCoreDumpFunction f, CPUState *cs,
> @@ -449,6 +448,7 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp);
>  #define cpu_mmu_index riscv_cpu_mmu_index
>
>  #ifndef CONFIG_USER_ONLY
> +void riscv_cpu_do_interrupt(CPUState *cpu);
>  void riscv_cpu_do_transaction_failed(CPUState *cs, hwaddr physaddr,
>                                       vaddr addr, unsigned size,
>                                       MMUAccessType access_type,
> @@ -472,7 +472,8 @@ void riscv_cpu_set_aia_ireg_rmw_fn(CPURISCVState *env, uint32_t priv,
>                                     void *rmw_fn_arg);
>
>  RISCVException smstateen_acc_ok(CPURISCVState *env, int index, uint64_t bit);
> -#endif
> +#endif /* !CONFIG_USER_ONLY */
> +
>  void riscv_cpu_set_mode(CPURISCVState *env, target_ulong newpriv);
>
>  void riscv_translate_init(void);
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index 0adde26321..597c47bc56 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -1579,7 +1579,6 @@ static target_ulong riscv_transformed_insn(CPURISCVState *env,
>
>      return xinsn;
>  }
> -#endif /* !CONFIG_USER_ONLY */
>
>  /*
>   * Handle Traps
> @@ -1589,8 +1588,6 @@ static target_ulong riscv_transformed_insn(CPURISCVState *env,
>   */
>  void riscv_cpu_do_interrupt(CPUState *cs)
>  {
> -#if !defined(CONFIG_USER_ONLY)
> -
>      RISCVCPU *cpu = RISCV_CPU(cs);
>      CPURISCVState *env = &cpu->env;
>      bool write_gva = false;
> @@ -1783,6 +1780,6 @@ void riscv_cpu_do_interrupt(CPUState *cs)
>
>      env->two_stage_lookup = false;
>      env->two_stage_indirect_lookup = false;
> -#endif
> -    cs->exception_index = RISCV_EXCP_NONE; /* mark handled to qemu */
>  }
> +
> +#endif /* !CONFIG_USER_ONLY */
> --
> 2.38.1
>
>
Re: [PATCH for-8.2 v2 2/7] target/riscv/cpu.c: skip 'bool' check when filtering KVM props
On Thu, Jul 13, 2023 at 6:59 AM Daniel Henrique Barboza wrote:
>
> After the introduction of riscv_cpu_options[] all properties in
> riscv_cpu_extensions[] are booleans. This check is now obsolete.
>
> Signed-off-by: Daniel Henrique Barboza

Reviewed-by: Alistair Francis

Alistair

> ---
>  target/riscv/cpu.c | 14 ++++----------
>  1 file changed, 4 insertions(+), 10 deletions(-)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index cdf9eeeb6b..735e0ed793 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -1907,17 +1907,11 @@ static void riscv_cpu_add_user_properties(Object *obj)
>               * Set the default to disabled for every extension
>               * unknown to KVM and error out if the user attempts
>               * to enable any of them.
> -             *
> -             * We're giving a pass for non-bool properties since they're
> -             * not related to the availability of extensions and can be
> -             * safely ignored as is.
>               */
> -            if (prop->info == &qdev_prop_bool) {
> -                object_property_add(obj, prop->name, "bool",
> -                                    NULL, cpu_set_cfg_unavailable,
> -                                    NULL, (void *)prop->name);
> -                continue;
> -            }
> +            object_property_add(obj, prop->name, "bool",
> +                                NULL, cpu_set_cfg_unavailable,
> +                                NULL, (void *)prop->name);
> +            continue;
>          }
>  #endif
>          qdev_property_add_static(dev, prop);
> --
> 2.41.0
>
>
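The comment kept by this patch ("set the default to disabled ... and error out if the user attempts to enable any of them") describes a setter that accepts only 'false'. A minimal sketch of that idea follows; set_unavailable_ext() and its error-buffer interface are hypothetical and do not reproduce QEMU's QOM visitor or Error APIs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical setter for an extension the accelerator cannot provide.
 * Writing 'false' succeeds and changes nothing; writing 'true' fails and
 * leaves a message in err.
 */
static bool set_unavailable_ext(const char *name, bool value,
                                char *err, size_t errlen)
{
    if (value) {
        snprintf(err, errlen, "extension %s is not available", name);
        return false;
    }
    return true;
}
```

Because the setter never stores anything, the property can be registered for every unknown extension without backing storage.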
Re: [PATCH for-8.2 v2 4/7] target/riscv/cpu.c: split non-ratified exts from riscv_cpu_extensions[]
On Thu, Jul 13, 2023 at 7:00 AM Daniel Henrique Barboza wrote: > > Create a new riscv_cpu_experimental_exts[] to store the non-ratified > extensions properties. Once they are ratified we'll move them back to > riscv_cpu_extensions[]. > > Change riscv_cpu_add_user_properties to keep adding them to users. > > Signed-off-by: Daniel Henrique Barboza Reviewed-by: Alistair Francis Alistair > --- > target/riscv/cpu.c | 38 +++--- > 1 file changed, 23 insertions(+), 15 deletions(-) > > diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c > index 9bbdc46126..c0826b449d 100644 > --- a/target/riscv/cpu.c > +++ b/target/riscv/cpu.c > @@ -1808,21 +1808,6 @@ static Property riscv_cpu_extensions[] = { > DEFINE_PROP_BOOL("zcmp", RISCVCPU, cfg.ext_zcmp, false), > DEFINE_PROP_BOOL("zcmt", RISCVCPU, cfg.ext_zcmt, false), > > -/* These are experimental so mark with 'x-' */ > -DEFINE_PROP_BOOL("x-zicond", RISCVCPU, cfg.ext_zicond, false), > - > -/* ePMP 0.9.3 */ > -DEFINE_PROP_BOOL("x-epmp", RISCVCPU, cfg.epmp, false), > -DEFINE_PROP_BOOL("x-smaia", RISCVCPU, cfg.ext_smaia, false), > -DEFINE_PROP_BOOL("x-ssaia", RISCVCPU, cfg.ext_ssaia, false), > - > -DEFINE_PROP_BOOL("x-zvfh", RISCVCPU, cfg.ext_zvfh, false), > -DEFINE_PROP_BOOL("x-zvfhmin", RISCVCPU, cfg.ext_zvfhmin, false), > - > -DEFINE_PROP_BOOL("x-zfbfmin", RISCVCPU, cfg.ext_zfbfmin, false), > -DEFINE_PROP_BOOL("x-zvfbfmin", RISCVCPU, cfg.ext_zvfbfmin, false), > -DEFINE_PROP_BOOL("x-zvfbfwma", RISCVCPU, cfg.ext_zvfbfwma, false), > - > DEFINE_PROP_END_OF_LIST(), > }; > > @@ -1843,6 +1828,25 @@ static Property riscv_cpu_vendor_exts[] = { > DEFINE_PROP_END_OF_LIST(), > }; > > +/* These are experimental so mark with 'x-' */ > +static Property riscv_cpu_experimental_exts[] = { > +DEFINE_PROP_BOOL("x-zicond", RISCVCPU, cfg.ext_zicond, false), > + > +/* ePMP 0.9.3 */ > +DEFINE_PROP_BOOL("x-epmp", RISCVCPU, cfg.epmp, false), > +DEFINE_PROP_BOOL("x-smaia", RISCVCPU, cfg.ext_smaia, false), > +DEFINE_PROP_BOOL("x-ssaia", RISCVCPU, 
cfg.ext_ssaia, false), > + > +DEFINE_PROP_BOOL("x-zvfh", RISCVCPU, cfg.ext_zvfh, false), > +DEFINE_PROP_BOOL("x-zvfhmin", RISCVCPU, cfg.ext_zvfhmin, false), > + > +DEFINE_PROP_BOOL("x-zfbfmin", RISCVCPU, cfg.ext_zfbfmin, false), > +DEFINE_PROP_BOOL("x-zvfbfmin", RISCVCPU, cfg.ext_zvfbfmin, false), > +DEFINE_PROP_BOOL("x-zvfbfwma", RISCVCPU, cfg.ext_zvfbfwma, false), > + > +DEFINE_PROP_END_OF_LIST(), > +}; > + > static Property riscv_cpu_options[] = { > DEFINE_PROP_UINT8("pmu-num", RISCVCPU, cfg.pmu_num, 16), > > @@ -1927,6 +1931,10 @@ static void riscv_cpu_add_user_properties(Object *obj) > for (prop = riscv_cpu_vendor_exts; prop && prop->name; prop++) { > qdev_property_add_static(dev, prop); > } > + > +for (prop = riscv_cpu_experimental_exts; prop && prop->name; prop++) { > +qdev_property_add_static(dev, prop); > +} > } > > static Property riscv_cpu_properties[] = { > -- > 2.41.0 > >
Re: [PATCH for-8.2 v2 3/7] target/riscv/cpu.c: split vendor exts from riscv_cpu_extensions[]
On Thu, Jul 13, 2023 at 6:58 AM Daniel Henrique Barboza wrote: > > Our goal is to make riscv_cpu_extensions[] hold only ratified, > non-vendor extensions. > > Create a new riscv_cpu_vendor_exts[] array for them, changing > riscv_cpu_add_user_properties() accordingly. > > Signed-off-by: Daniel Henrique Barboza Reviewed-by: Alistair Francis Alistair > --- > target/riscv/cpu.c | 34 -- > 1 file changed, 20 insertions(+), 14 deletions(-) > > diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c > index 735e0ed793..9bbdc46126 100644 > --- a/target/riscv/cpu.c > +++ b/target/riscv/cpu.c > @@ -1808,20 +1808,6 @@ static Property riscv_cpu_extensions[] = { > DEFINE_PROP_BOOL("zcmp", RISCVCPU, cfg.ext_zcmp, false), > DEFINE_PROP_BOOL("zcmt", RISCVCPU, cfg.ext_zcmt, false), > > -/* Vendor-specific custom extensions */ > -DEFINE_PROP_BOOL("xtheadba", RISCVCPU, cfg.ext_xtheadba, false), > -DEFINE_PROP_BOOL("xtheadbb", RISCVCPU, cfg.ext_xtheadbb, false), > -DEFINE_PROP_BOOL("xtheadbs", RISCVCPU, cfg.ext_xtheadbs, false), > -DEFINE_PROP_BOOL("xtheadcmo", RISCVCPU, cfg.ext_xtheadcmo, false), > -DEFINE_PROP_BOOL("xtheadcondmov", RISCVCPU, cfg.ext_xtheadcondmov, > false), > -DEFINE_PROP_BOOL("xtheadfmemidx", RISCVCPU, cfg.ext_xtheadfmemidx, > false), > -DEFINE_PROP_BOOL("xtheadfmv", RISCVCPU, cfg.ext_xtheadfmv, false), > -DEFINE_PROP_BOOL("xtheadmac", RISCVCPU, cfg.ext_xtheadmac, false), > -DEFINE_PROP_BOOL("xtheadmemidx", RISCVCPU, cfg.ext_xtheadmemidx, false), > -DEFINE_PROP_BOOL("xtheadmempair", RISCVCPU, cfg.ext_xtheadmempair, > false), > -DEFINE_PROP_BOOL("xtheadsync", RISCVCPU, cfg.ext_xtheadsync, false), > -DEFINE_PROP_BOOL("xventanacondops", RISCVCPU, cfg.ext_XVentanaCondOps, > false), > - > /* These are experimental so mark with 'x-' */ > DEFINE_PROP_BOOL("x-zicond", RISCVCPU, cfg.ext_zicond, false), > > @@ -1840,6 +1826,23 @@ static Property riscv_cpu_extensions[] = { > DEFINE_PROP_END_OF_LIST(), > }; > > +static Property riscv_cpu_vendor_exts[] = { > 
+DEFINE_PROP_BOOL("xtheadba", RISCVCPU, cfg.ext_xtheadba, false), > +DEFINE_PROP_BOOL("xtheadbb", RISCVCPU, cfg.ext_xtheadbb, false), > +DEFINE_PROP_BOOL("xtheadbs", RISCVCPU, cfg.ext_xtheadbs, false), > +DEFINE_PROP_BOOL("xtheadcmo", RISCVCPU, cfg.ext_xtheadcmo, false), > +DEFINE_PROP_BOOL("xtheadcondmov", RISCVCPU, cfg.ext_xtheadcondmov, > false), > +DEFINE_PROP_BOOL("xtheadfmemidx", RISCVCPU, cfg.ext_xtheadfmemidx, > false), > +DEFINE_PROP_BOOL("xtheadfmv", RISCVCPU, cfg.ext_xtheadfmv, false), > +DEFINE_PROP_BOOL("xtheadmac", RISCVCPU, cfg.ext_xtheadmac, false), > +DEFINE_PROP_BOOL("xtheadmemidx", RISCVCPU, cfg.ext_xtheadmemidx, false), > +DEFINE_PROP_BOOL("xtheadmempair", RISCVCPU, cfg.ext_xtheadmempair, > false), > +DEFINE_PROP_BOOL("xtheadsync", RISCVCPU, cfg.ext_xtheadsync, false), > +DEFINE_PROP_BOOL("xventanacondops", RISCVCPU, cfg.ext_XVentanaCondOps, > false), > + > +DEFINE_PROP_END_OF_LIST(), > +}; > + > static Property riscv_cpu_options[] = { > DEFINE_PROP_UINT8("pmu-num", RISCVCPU, cfg.pmu_num, 16), > > @@ -1921,6 +1924,9 @@ static void riscv_cpu_add_user_properties(Object *obj) > qdev_property_add_static(dev, prop); > } > > +for (prop = riscv_cpu_vendor_exts; prop && prop->name; prop++) { > +qdev_property_add_static(dev, prop); > +} > } > > static Property riscv_cpu_properties[] = { > -- > 2.41.0 > >
Re: [PATCH for-8.2 v2 1/7] target/riscv/cpu.c: split CPU options from riscv_cpu_extensions[]
On Thu, Jul 13, 2023 at 6:59 AM Daniel Henrique Barboza wrote: > > We'll add a new CPU type that will enable a considerable amount of > extensions. To make it easier for us we'll do a few cleanups in our > existing riscv_cpu_extensions[] array. > > Start by splitting all CPU non-boolean options from it. Create a new > riscv_cpu_options[] array for them. Add all these properties in > riscv_cpu_add_user_properties() as it is already being done today. > > No functional changes made. > > Signed-off-by: Daniel Henrique Barboza Reviewed-by: Alistair Francis Alistair > --- > target/riscv/cpu.c | 27 +++ > 1 file changed, 19 insertions(+), 8 deletions(-) > > diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c > index 9339c0241d..cdf9eeeb6b 100644 > --- a/target/riscv/cpu.c > +++ b/target/riscv/cpu.c > @@ -1751,7 +1751,6 @@ static void riscv_cpu_add_misa_properties(Object > *cpu_obj) > > static Property riscv_cpu_extensions[] = { > /* Defaults for standard extensions */ > -DEFINE_PROP_UINT8("pmu-num", RISCVCPU, cfg.pmu_num, 16), > DEFINE_PROP_BOOL("sscofpmf", RISCVCPU, cfg.ext_sscofpmf, false), > DEFINE_PROP_BOOL("Zifencei", RISCVCPU, cfg.ext_ifencei, true), > DEFINE_PROP_BOOL("Zicsr", RISCVCPU, cfg.ext_icsr, true), > @@ -1767,11 +1766,6 @@ static Property riscv_cpu_extensions[] = { > DEFINE_PROP_BOOL("pmp", RISCVCPU, cfg.pmp, true), > DEFINE_PROP_BOOL("sstc", RISCVCPU, cfg.ext_sstc, true), > > -DEFINE_PROP_STRING("priv_spec", RISCVCPU, cfg.priv_spec), > -DEFINE_PROP_STRING("vext_spec", RISCVCPU, cfg.vext_spec), > -DEFINE_PROP_UINT16("vlen", RISCVCPU, cfg.vlen, 128), > -DEFINE_PROP_UINT16("elen", RISCVCPU, cfg.elen, 64), > - > DEFINE_PROP_BOOL("smstateen", RISCVCPU, cfg.ext_smstateen, false), > DEFINE_PROP_BOOL("svadu", RISCVCPU, cfg.ext_svadu, true), > DEFINE_PROP_BOOL("svinval", RISCVCPU, cfg.ext_svinval, false), > @@ -1802,9 +1796,7 @@ static Property riscv_cpu_extensions[] = { > DEFINE_PROP_BOOL("zhinxmin", RISCVCPU, cfg.ext_zhinxmin, false), > > 
DEFINE_PROP_BOOL("zicbom", RISCVCPU, cfg.ext_icbom, true), > -DEFINE_PROP_UINT16("cbom_blocksize", RISCVCPU, cfg.cbom_blocksize, 64), > DEFINE_PROP_BOOL("zicboz", RISCVCPU, cfg.ext_icboz, true), > -DEFINE_PROP_UINT16("cboz_blocksize", RISCVCPU, cfg.cboz_blocksize, 64), > > DEFINE_PROP_BOOL("zmmul", RISCVCPU, cfg.ext_zmmul, false), > > @@ -1848,6 +1840,20 @@ static Property riscv_cpu_extensions[] = { > DEFINE_PROP_END_OF_LIST(), > }; > > +static Property riscv_cpu_options[] = { > +DEFINE_PROP_UINT8("pmu-num", RISCVCPU, cfg.pmu_num, 16), > + > +DEFINE_PROP_STRING("priv_spec", RISCVCPU, cfg.priv_spec), > +DEFINE_PROP_STRING("vext_spec", RISCVCPU, cfg.vext_spec), > + > +DEFINE_PROP_UINT16("vlen", RISCVCPU, cfg.vlen, 128), > +DEFINE_PROP_UINT16("elen", RISCVCPU, cfg.elen, 64), > + > +DEFINE_PROP_UINT16("cbom_blocksize", RISCVCPU, cfg.cbom_blocksize, 64), > +DEFINE_PROP_UINT16("cboz_blocksize", RISCVCPU, cfg.cboz_blocksize, 64), > + > +DEFINE_PROP_END_OF_LIST(), > +}; > > #ifndef CONFIG_USER_ONLY > static void cpu_set_cfg_unavailable(Object *obj, Visitor *v, > @@ -1916,6 +1922,11 @@ static void riscv_cpu_add_user_properties(Object *obj) > #endif > qdev_property_add_static(dev, prop); > } > + > +for (prop = riscv_cpu_options; prop && prop->name; prop++) { > +qdev_property_add_static(dev, prop); > +} > + > } > > static Property riscv_cpu_properties[] = { > -- > 2.41.0 > >
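The effect of this split can be sketched with plain sentinel-terminated arrays. Everything below is a simplified, hypothetical model (struct prop, enum prop_kind and the two short tables are invented for illustration and are not QEMU's Property type); it only shows why, once the non-boolean entries live in their own array, a walk over the extensions array no longer needs a per-entry type check.

```c
#include <stdbool.h>
#include <stddef.h>

enum prop_kind { PROP_BOOL, PROP_UINT, PROP_STRING };

/* Hypothetical property descriptor; a NULL name terminates a list. */
struct prop {
    const char *name;
    enum prop_kind kind;
};

/* After the split: extension flags only. */
static const struct prop cpu_extensions[] = {
    { "sscofpmf", PROP_BOOL },
    { "zicbom",   PROP_BOOL },
    { NULL,       PROP_BOOL },
};

/* After the split: everything that is not an on/off extension flag. */
static const struct prop cpu_options[] = {
    { "pmu-num",        PROP_UINT },
    { "priv_spec",      PROP_STRING },
    { "cbom_blocksize", PROP_UINT },
    { NULL,             PROP_BOOL },
};

/* Same loop shape as "for (prop = array; prop && prop->name; prop++)". */
static size_t count_props(const struct prop *array)
{
    size_t n = 0;
    for (const struct prop *p = array; p && p->name; p++) {
        n++;
    }
    return n;
}

/* True when every entry before the sentinel is a boolean. */
static bool all_bool(const struct prop *array)
{
    for (const struct prop *p = array; p && p->name; p++) {
        if (p->kind != PROP_BOOL) {
            return false;
        }
    }
    return true;
}
```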
Re: Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
On Fri, Jul 14, 2023 at 11:14 AM Daniel Henrique Barboza wrote:
>
>
>
> On 7/13/23 19:47, Conor Dooley wrote:
> > On Thu, Jul 13, 2023 at 07:35:01PM -0300, Daniel Henrique Barboza wrote:
> >> On 7/13/23 19:12, Conor Dooley wrote:
> >
> >>> And a question for you below Daniel.
> >>>
> >>> On Wed, Jul 12, 2023 at 11:14:21PM +0100, Conor Dooley wrote:
> >
> >>>>
> >>>> qemu-system-riscv64: warning: disabling zca extension for hart 0x because privilege spec version does not match
> >>>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0001 because privilege spec version does not match
> >>>> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0001 because privilege spec version does not match
> >>>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0002 because privilege spec version does not match
> >>>> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0002 because privilege spec version does not match
> >>>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0003 because privilege spec version does not match
> >>>> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0003 because privilege spec version does not match
> >>>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0004 because privilege spec version does not match
> >>>> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0004 because privilege spec version does not match
> >>>
> >>> Why am I seeing these warnings? Does the mpfs machine type need to
> >>> disable some things? It only supports rv64imafdc per the DT, and
> >>> predates things like Zca existing, so emitting warnings does not seem
> >>> fair at all to me!
> >>
> >> QEMU will disable extensions that are newer than a priv spec version
> >> that is set by the CPU. IIUC the icicle board is running a sifive_u54
> >> CPU by default. That CPU has a priv spec version 1_10_0. The CPU is
> >> also enabling C.
> >>
> >> We will enable zca if C is enabled. C and D enabled will also enable
> >> zcd. But then the priv check will disable both because zca and zcd
> >> have priv spec 1_12_0.
> >>
> >> This is a side effect of a change that I did a few months ago. Back
> >> then we weren't disabling stuff correctly.
> >
> > Yah, I did check out the blame, hence directing it at you. Thanks for
> > the explanation.
> >
> >> The warnings are annoying but are benign.
> >
> > To be honest, benign or not, this kind of thing is only going to
> > lead to grief. Even though only the direct kernel boot works, we do
> > actually have some customers that are using the icicle target in QEMU.
> >
> >> And apparently the sifive_u54 CPU has been inconsistent for some time
> >> and we noticed just now.
> >> Now, if the icicle board is supposed to have zca and zcd then we have
> >> a problem.
> >
> > I don't know, this depends on how you see things in QEMU. I would say
> > that it supports c, and not Zca/Zcf/Zcd, given it predates the
> > extensions. I have no interest in retrofitting my devicetree stuff with
> > them, for example.
> >
> >> We'll need to discuss whether we move sifive_u54 CPU priv spec to
> >> 1_12_0 (I'm not sure how this will affect other boards that use this
> >> CPU) or remove this priv spec disable code altogether from QEMU.
> >
> > I think you should stop warning for this? From my dumb-user perspective,
> > the warning only "scares" me into thinking something is wrong, when
> > there isn't. I can see a use case for the warning where someone tries to
> > enable Zca & Co. in their QEMU incantation for a CPU that does not
> > have the correct privilege level to support it, but I didn't try to set
> > any options at all in that way, so the warnings seem unfair?
>
>
> That's a fair criticism. We had similar discussions a few months back.
> It's weird to send warnings when the user didn't set the extensions
> manually, but ATM we can't tell whether an extension was user enabled
> or not.
>
> So we can either show unfair warning messages or not show warnings and
> take the risk of silently disabling extensions that users enabled in the
> command line. It seems that the former is more annoying to deal with
> than the latter.
>
> I guess I can propose a patch to remove the warnings. We can send
> warnings again when we have a better solution.

A better solution is to just not enable Zca and friends automatically,
or at least look at the priv spec before we do.

Alistair

>
>
> Daniel
>
> >
> > Cheers,
> > Conor.
>
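The two orderings discussed in this message can be compared in a small standalone model. This is a sketch under stated assumptions: enum priv_ver, struct cpu_model and both finalize functions are hypothetical and only mirror the behaviour as described in the thread (C implies zca, C plus D implies zcd, both need priv spec 1_12_0); they are not QEMU's implementation.

```c
#include <stdbool.h>

/* Hypothetical encoding of privileged spec versions, oldest first. */
enum priv_ver { PRIV_1_10 = 0, PRIV_1_11, PRIV_1_12 };

struct cpu_model {
    enum priv_ver priv_ver;
    bool ext_c, ext_d;      /* set by the CPU definition */
    bool ext_zca, ext_zcd;  /* implied extensions */
    int warnings;           /* "disabling ... extension" messages emitted */
};

/* Behaviour described in the thread: enable implied extensions, then
 * let the priv spec check disable them again, warning each time. */
static void finalize_enable_then_filter(struct cpu_model *cpu)
{
    cpu->ext_zca = cpu->ext_c;
    cpu->ext_zcd = cpu->ext_c && cpu->ext_d;

    if (cpu->priv_ver < PRIV_1_12) {
        if (cpu->ext_zca) {
            cpu->ext_zca = false;
            cpu->warnings++;
        }
        if (cpu->ext_zcd) {
            cpu->ext_zcd = false;
            cpu->warnings++;
        }
    }
}

/* Alternative suggested in the reply: look at the priv spec before
 * enabling anything implicitly, so there is nothing to warn about. */
static void finalize_check_first(struct cpu_model *cpu)
{
    bool new_enough = cpu->priv_ver >= PRIV_1_12;

    cpu->ext_zca = new_enough && cpu->ext_c;
    cpu->ext_zcd = new_enough && cpu->ext_c && cpu->ext_d;
}
```

For a CPU at priv spec 1_10_0 with C and D set, both functions end with zca and zcd disabled; only the first one produces warnings for extensions the user never asked for.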
Re: [PATCH for-8.2 v2 6/7] target/riscv: add 'max' CPU type
On Thu, Jul 13, 2023 at 7:00 AM Daniel Henrique Barboza wrote:
>
> The 'max' CPU type is used by tooling to determine what's the most
> capable CPU a current QEMU version implements. Other archs such as ARM
> implement this type. Let's add it to RISC-V.
>
> What we consider "most capable CPU" in this context is related to
> ratified, non-vendor extensions. This means that we want the 'max' CPU
> to enable all (possible) ratified extensions by default. The reasoning
> behind this design is (1) vendor extensions can conflict with each other
> and we won't play favorites deciding which one is default or not and
> (2) non-ratified extensions are always prone to changes, not being
> stable enough to be enabled by default.
>
> All this said, we're still not able to enable all ratified extensions
> due to conflicts between them. Zfinx and all its dependencies aren't
> enabled because of a conflict with RVF. zce, zcmp and zcmt are also
> disabled due to RVD conflicts. When running with 64 bits we're also
> disabling zcf.
>
> MISA bits RVG, RVJ and RVV are also being set manually since they're
> default disabled.
>
> This is the resulting 'riscv,isa' DT for this new CPU:
>
> rv64imafdcvh_zicbom_zicboz_zicsr_zifencei_zihintpause_zawrs_zfa_
> zfh_zfhmin_zca_zcb_zcd_zba_zbb_zbc_zbkb_zbkc_zbkx_zbs_zk_zkn_zknd_
> zkne_zknh_zkr_zks_zksed_zksh_zkt_zve32f_zve64f_zve64d_
> smstateen_sscofpmf_sstc_svadu_svinval_svnapot_svpbmt
>
> Signed-off-by: Daniel Henrique Barboza
> ---
>  target/riscv/cpu-qom.h |  1 +
>  target/riscv/cpu.c     | 53 ++
>  2 files changed, 54 insertions(+)
>
> diff --git a/target/riscv/cpu-qom.h b/target/riscv/cpu-qom.h
> index 04af50983e..f3fbe37a2c 100644
> --- a/target/riscv/cpu-qom.h
> +++ b/target/riscv/cpu-qom.h
> @@ -30,6 +30,7 @@
>  #define CPU_RESOLVING_TYPE TYPE_RISCV_CPU
>
>  #define TYPE_RISCV_CPU_ANY              RISCV_CPU_TYPE_NAME("any")
> +#define TYPE_RISCV_CPU_MAX              RISCV_CPU_TYPE_NAME("max")

From memory the "any" CPU was supposed to do this, so we might want to
remove it.

Alistair

>  #define TYPE_RISCV_CPU_BASE32           RISCV_CPU_TYPE_NAME("rv32")
>  #define TYPE_RISCV_CPU_BASE64           RISCV_CPU_TYPE_NAME("rv64")
>  #define TYPE_RISCV_CPU_BASE128          RISCV_CPU_TYPE_NAME("x-rv128")
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index b61465c8c4..5172566cda 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -248,6 +248,7 @@ static const char * const riscv_intr_names[] = {
>  };
>
>  static void riscv_cpu_add_user_properties(Object *obj);
> +static void riscv_init_max_cpu_extensions(Object *obj);
>
>  const char *riscv_cpu_get_trap_name(target_ulong cause, bool async)
>  {
> @@ -374,6 +375,25 @@ static void riscv_any_cpu_init(Object *obj)
>      cpu->cfg.pmp = true;
>  }
>
> +static void riscv_max_cpu_init(Object *obj)
> +{
> +    RISCVCPU *cpu = RISCV_CPU(obj);
> +    CPURISCVState *env = &cpu->env;
> +    RISCVMXL mlx = MXL_RV64;
> +
> +#ifdef TARGET_RISCV32
> +    mlx = MXL_RV32;
> +#endif
> +    set_misa(env, mlx, 0);
> +    riscv_cpu_add_user_properties(obj);
> +    riscv_init_max_cpu_extensions(obj);
> +    env->priv_ver = PRIV_VERSION_LATEST;
> +#ifndef CONFIG_USER_ONLY
> +    set_satp_mode_max_supported(RISCV_CPU(obj), mlx == MXL_RV32 ?
> +                                VM_1_10_SV32 : VM_1_10_SV57);
> +#endif
> +}
> +
>  #if defined(TARGET_RISCV64)
>  static void rv64_base_cpu_init(Object *obj)
>  {
> @@ -1934,6 +1954,38 @@ static void riscv_cpu_add_user_properties(Object *obj)
>      ADD_CPU_PROPERTIES_ARRAY(dev, riscv_cpu_experimental_exts);
>  }
>
> +/*
> + * The 'max' type CPU will have all possible ratified
> + * non-vendor extensions enabled.
> + */
> +static void riscv_init_max_cpu_extensions(Object *obj)
> +{
> +    RISCVCPU *cpu = RISCV_CPU(obj);
> +    CPURISCVState *env = &cpu->env;
> +    Property *prop;
> +
> +    /* Enable RVG, RVJ and RVV that are disabled by default */
> +    set_misa(env, env->misa_mxl, env->misa_ext | RVG | RVJ | RVV);
> +
> +    for (prop = riscv_cpu_extensions; prop && prop->name; prop++) {
> +        object_property_set_bool(obj, prop->name, true, NULL);
> +    }
> +
> +    /* Zfinx is not compatible with F. Disable it */
> +    object_property_set_bool(obj, "zfinx", false, NULL);
> +    object_property_set_bool(obj, "zdinx", false, NULL);
> +    object_property_set_bool(obj, "zhinx", false, NULL);
> +    object_property_set_bool(obj, "zhinxmin", false, NULL);
> +
> +    object_property_set_bool(obj, "zce", false, NULL);
> +    object_property_set_bool(obj, "zcmp", false, NULL);
> +    object_property_set_bool(obj, "zcmt", false, NULL);
> +
> +    if (env->misa_mxl != MXL_RV32) {
> +        object_property_set_bool(obj, "zcf", false, NULL);
> +    }
> +}
> +
>  static Property riscv_cpu_properties[] = {
>      DEFINE_PROP_BOOL("debug", RISCVCPU, cfg.debug, true),
>
> @@ -2272,6 +2324,7 @@ static const TypeInfo riscv_cpu_type_infos[] = {
>
Re: [PATCH] riscv/disas: Fix disas output of upper immediates
On Tue, Jul 11, 2023 at 5:52 PM Christoph Muellner wrote: > > From: Christoph Müllner > > The GNU assembler produces the following output for instructions > with upper immediates: > 2597auipc a1,0x2 > 24b7lui s1,0x2 > 6409lui s0,0x2 # c.lui > > The immediate operands of upper immediates are not shifted. > > However, the QEMU disassembler prints them shifted: > 2597 auipc a1,8192 > 24b7 lui s1,8192 > 6409 lui s0,8192 # c.lui > > The current implementation extracts the immediate bits and shifts the by 12, > so the internal representation of the immediate is the actual immediate. > However, the immediates are later printed using rv_fmt_rd_imm or > rv_fmt_rd_offset, which don't undo the shift. > > Let's fix this by using specific output formats for instructions > with upper immediates, that take care of the shift. > > Signed-off-by: Christoph Müllner Thanks! Applied to riscv-to-apply.next Alistair > --- > disas/riscv.c | 19 --- > disas/riscv.h | 2 ++ > 2 files changed, 18 insertions(+), 3 deletions(-) > > diff --git a/disas/riscv.c b/disas/riscv.c > index cd7b6e86a7..3873a69157 100644 > --- a/disas/riscv.c > +++ b/disas/riscv.c > @@ -1135,8 +1135,8 @@ static const rv_comp_data rvcp_fsgnjx_q[] = { > > const rv_opcode_data rvi_opcode_data[] = { > { "illegal", rv_codec_illegal, rv_fmt_none, NULL, 0, 0, 0 }, > -{ "lui", rv_codec_u, rv_fmt_rd_imm, NULL, 0, 0, 0 }, > -{ "auipc", rv_codec_u, rv_fmt_rd_offset, NULL, 0, 0, 0 }, > +{ "lui", rv_codec_u, rv_fmt_rd_uimm, NULL, 0, 0, 0 }, > +{ "auipc", rv_codec_u, rv_fmt_rd_uoffset, NULL, 0, 0, 0 }, > { "jal", rv_codec_uj, rv_fmt_rd_offset, rvcp_jal, 0, 0, 0 }, > { "jalr", rv_codec_i, rv_fmt_rd_rs1_offset, rvcp_jalr, 0, 0, 0 }, > { "beq", rv_codec_sb, rv_fmt_rs1_rs2_offset, rvcp_beq, 0, 0, 0 }, > @@ -1382,7 +1382,7 @@ const rv_opcode_data rvi_opcode_data[] = { >rv_op_addi }, > { "c.addi16sp", rv_codec_ci_16sp, rv_fmt_rd_rs1_imm, NULL, rv_op_addi, >rv_op_addi, rv_op_addi, rvcd_imm_nz }, > -{ "c.lui", rv_codec_ci_lui, rv_fmt_rd_imm, 
NULL, rv_op_lui, rv_op_lui, > +{ "c.lui", rv_codec_ci_lui, rv_fmt_rd_uimm, NULL, rv_op_lui, rv_op_lui, >rv_op_lui, rvcd_imm_nz }, > { "c.srli", rv_codec_cb_sh6, rv_fmt_rd_rs1_imm, NULL, rv_op_srli, >rv_op_srli, rv_op_srli, rvcd_imm_nz }, > @@ -4694,6 +4694,19 @@ static void format_inst(char *buf, size_t buflen, > size_t tab, rv_decode *dec) > dec->pc + dec->imm); > append(buf, tmp, buflen); > break; > +case 'U': > +fmt++; > +snprintf(tmp, sizeof(tmp), "%d", dec->imm >> 12); > +append(buf, tmp, buflen); > +if (*fmt == 'o') { > +while (strlen(buf) < tab * 2) { > +append(buf, " ", buflen); > +} > +snprintf(tmp, sizeof(tmp), "# 0x%" PRIx64, > +dec->pc + dec->imm); > +append(buf, tmp, buflen); > +} > +break; > case 'c': { > const char *name = csr_name(dec->imm & 0xfff); > if (name) { > diff --git a/disas/riscv.h b/disas/riscv.h > index 9cf901fc1e..8abb578b51 100644 > --- a/disas/riscv.h > +++ b/disas/riscv.h > @@ -227,7 +227,9 @@ enum { > #define rv_fmt_pred_succ "O\tp,s" > #define rv_fmt_rs1_rs2"O\t1,2" > #define rv_fmt_rd_imm "O\t0,i" > +#define rv_fmt_rd_uimm"O\t0,Ui" > #define rv_fmt_rd_offset "O\t0,o" > +#define rv_fmt_rd_uoffset "O\t0,Uo" > #define rv_fmt_rd_rs1_rs2 "O\t0,1,2" > #define rv_fmt_frd_rs1"O\t3,1" > #define rv_fmt_frd_rs1_rs2"O\t3,1,2" > -- > 2.41.0 > >
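To illustrate the formatting issue discussed in this patch, here is a minimal standalone sketch (not QEMU code). It assumes the decoder stores the U-type immediate already shifted left by 12, as the commit message describes, and prints the unshifted field. The name format_upper_imm() and its parameters are hypothetical, and hexadecimal output is chosen here to match the GNU assembler samples, whereas the patch itself prints with "%d".

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper, for illustration only.
 * imm is the decoded immediate, i.e. the 20-bit field already shifted
 * left by 12. with_target != 0 appends the pc-relative target (auipc).
 */
void format_upper_imm(char *buf, size_t buflen, const char *mnemonic,
                      const char *rd, int32_t imm, uint64_t pc,
                      int with_target)
{
    /* Undo the shift on an unsigned copy to recover the raw 20-bit field. */
    uint32_t field = ((uint32_t)imm >> 12) & 0xfffffu;

    if (with_target) {
        /* Sign-extend the immediate before adding it to the pc. */
        uint64_t target = pc + (uint64_t)(int64_t)imm;
        snprintf(buf, buflen, "%s %s,0x%" PRIx32 " # 0x%" PRIx64,
                 mnemonic, rd, field, target);
    } else {
        snprintf(buf, buflen, "%s %s,0x%" PRIx32, mnemonic, rd, field);
    }
}
```

With imm = 8192 (0x2 << 12), the operand is printed as 0x2 rather than 8192, which is the behaviour the patch aims for.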
Re: [PATCH] riscv/disas: Fix disas output of upper immediates
On Tue, Jul 11, 2023 at 5:52 PM Christoph Muellner wrote: > > From: Christoph Müllner > > The GNU assembler produces the following output for instructions > with upper immediates: > 2597auipc a1,0x2 > 24b7lui s1,0x2 > 6409lui s0,0x2 # c.lui > > The immediate operands of upper immediates are not shifted. > > However, the QEMU disassembler prints them shifted: > 2597 auipc a1,8192 > 24b7 lui s1,8192 > 6409 lui s0,8192 # c.lui > > The current implementation extracts the immediate bits and shifts the by 12, > so the internal representation of the immediate is the actual immediate. > However, the immediates are later printed using rv_fmt_rd_imm or > rv_fmt_rd_offset, which don't undo the shift. > > Let's fix this by using specific output formats for instructions > with upper immediates, that take care of the shift. > > Signed-off-by: Christoph Müllner Acked-by: Alistair Francis Alistair > --- > disas/riscv.c | 19 --- > disas/riscv.h | 2 ++ > 2 files changed, 18 insertions(+), 3 deletions(-) > > diff --git a/disas/riscv.c b/disas/riscv.c > index cd7b6e86a7..3873a69157 100644 > --- a/disas/riscv.c > +++ b/disas/riscv.c > @@ -1135,8 +1135,8 @@ static const rv_comp_data rvcp_fsgnjx_q[] = { > > const rv_opcode_data rvi_opcode_data[] = { > { "illegal", rv_codec_illegal, rv_fmt_none, NULL, 0, 0, 0 }, > -{ "lui", rv_codec_u, rv_fmt_rd_imm, NULL, 0, 0, 0 }, > -{ "auipc", rv_codec_u, rv_fmt_rd_offset, NULL, 0, 0, 0 }, > +{ "lui", rv_codec_u, rv_fmt_rd_uimm, NULL, 0, 0, 0 }, > +{ "auipc", rv_codec_u, rv_fmt_rd_uoffset, NULL, 0, 0, 0 }, > { "jal", rv_codec_uj, rv_fmt_rd_offset, rvcp_jal, 0, 0, 0 }, > { "jalr", rv_codec_i, rv_fmt_rd_rs1_offset, rvcp_jalr, 0, 0, 0 }, > { "beq", rv_codec_sb, rv_fmt_rs1_rs2_offset, rvcp_beq, 0, 0, 0 }, > @@ -1382,7 +1382,7 @@ const rv_opcode_data rvi_opcode_data[] = { >rv_op_addi }, > { "c.addi16sp", rv_codec_ci_16sp, rv_fmt_rd_rs1_imm, NULL, rv_op_addi, >rv_op_addi, rv_op_addi, rvcd_imm_nz }, > -{ "c.lui", rv_codec_ci_lui, rv_fmt_rd_imm, NULL, 
rv_op_lui, rv_op_lui, > +{ "c.lui", rv_codec_ci_lui, rv_fmt_rd_uimm, NULL, rv_op_lui, rv_op_lui, >rv_op_lui, rvcd_imm_nz }, > { "c.srli", rv_codec_cb_sh6, rv_fmt_rd_rs1_imm, NULL, rv_op_srli, >rv_op_srli, rv_op_srli, rvcd_imm_nz }, > @@ -4694,6 +4694,19 @@ static void format_inst(char *buf, size_t buflen, > size_t tab, rv_decode *dec) > dec->pc + dec->imm); > append(buf, tmp, buflen); > break; > +case 'U': > +fmt++; > +snprintf(tmp, sizeof(tmp), "%d", dec->imm >> 12); > +append(buf, tmp, buflen); > +if (*fmt == 'o') { > +while (strlen(buf) < tab * 2) { > +append(buf, " ", buflen); > +} > +snprintf(tmp, sizeof(tmp), "# 0x%" PRIx64, > +dec->pc + dec->imm); > +append(buf, tmp, buflen); > +} > +break; > case 'c': { > const char *name = csr_name(dec->imm & 0xfff); > if (name) { > diff --git a/disas/riscv.h b/disas/riscv.h > index 9cf901fc1e..8abb578b51 100644 > --- a/disas/riscv.h > +++ b/disas/riscv.h > @@ -227,7 +227,9 @@ enum { > #define rv_fmt_pred_succ "O\tp,s" > #define rv_fmt_rs1_rs2"O\t1,2" > #define rv_fmt_rd_imm "O\t0,i" > +#define rv_fmt_rd_uimm"O\t0,Ui" > #define rv_fmt_rd_offset "O\t0,o" > +#define rv_fmt_rd_uoffset "O\t0,Uo" > #define rv_fmt_rd_rs1_rs2 "O\t0,1,2" > #define rv_fmt_frd_rs1"O\t3,1" > #define rv_fmt_frd_rs1_rs2"O\t3,1,2" > -- > 2.41.0 > >
Re: [PATCH] docs/system/target-riscv.rst: tidy CPU firmware section
On Thu, Jul 13, 2023 at 4:47 PM Michael Tokarev wrote:
>
> 12.07.2023 17:37, Daniel Henrique Barboza wrote:
> > This is how the content of the "RISC-V CPU firmware" section is
> > displayed after the html is generated:
> >
> > "When using the sifive_u or virt machine there are three different
> > firmware boot options: 1. -bios default - This is the default behaviour
> > if no -bios option is included. (...) 3. -bios - Tells QEMU to
> > load the specified file as the firmware."
> >
> > It's all in the same paragraph, in a numbered list, and no special
> > formatting for the options.
> >
> > Tidy it a bit by adding line breaks between items and its description.
> > Remove the numbered list. And apply formatting for the options cited in
> > the middle of the text.
> >
> > Cc: qemu-triv...@nongnu.org
> > Signed-off-by: Daniel Henrique Barboza
>
> I'll pick this up for trivial-patches, but since it's the only patch there
> now, it's IMHO better to apply it together with other riscv changes if
> there will be any for 8.1. So let's pick it to both trees and the first
> to apply wins.

Sounds good to me!

Applied to riscv-to-apply.next

Alistair

>
> Thanks,
>
> /mjt
>
>
Re: [PATCH] docs/system/target-riscv.rst: tidy CPU firmware section
On Thu, Jul 13, 2023 at 12:38 AM Daniel Henrique Barboza wrote:
>
> This is how the content of the "RISC-V CPU firmware" section is
> displayed after the html is generated:
>
> "When using the sifive_u or virt machine there are three different
> firmware boot options: 1. -bios default - This is the default behaviour
> if no -bios option is included. (...) 3. -bios - Tells QEMU to
> load the specified file as the firmware."
>
> It's all in the same paragraph, in a numbered list, and no special
> formatting for the options.
>
> Tidy it a bit by adding line breaks between items and its description.
> Remove the numbered list. And apply formatting for the options cited in
> the middle of the text.
>
> Cc: qemu-triv...@nongnu.org
> Signed-off-by: Daniel Henrique Barboza

Reviewed-by: Alistair Francis

Alistair

> ---
>  docs/system/target-riscv.rst | 24
>  1 file changed, 16 insertions(+), 8 deletions(-)
>
> diff --git a/docs/system/target-riscv.rst b/docs/system/target-riscv.rst
> index 89a866e4f4..ba195f1518 100644
> --- a/docs/system/target-riscv.rst
> +++ b/docs/system/target-riscv.rst
> @@ -76,11 +76,19 @@ RISC-V CPU firmware
>
>  When using the ``sifive_u`` or ``virt`` machine there are three different
>  firmware boot options:
> -1. ``-bios default`` - This is the default behaviour if no -bios option
> -is included. This option will load the default OpenSBI firmware automatically.
> -The firmware is included with the QEMU release and no user interaction is
> -required. All a user needs to do is specify the kernel they want to boot
> -with the -kernel option
> -2. ``-bios none`` - QEMU will not automatically load any firmware. It is up
> -to the user to load all the images they need.
> -3. ``-bios `` - Tells QEMU to load the specified file as the firmware.
> +
> +* ``-bios default``
> +
> +This is the default behaviour if no ``-bios`` option is included. This option
> +will load the default OpenSBI firmware automatically. The firmware is included
> +with the QEMU release and no user interaction is required. All a user needs to
> +do is specify the kernel they want to boot with the ``-kernel`` option
> +
> +* ``-bios none``
> +
> +QEMU will not automatically load any firmware. It is up to the user to load all
> +the images they need.
> +
> +* ``-bios ``
> +
> +Tells QEMU to load the specified file as the firmware.
> --
> 2.41.0
>
>
[PATCH v4 5/6] qmp: Added new command to retrieve eBPF blob.
Add the "request-ebpf" command, which returns an eBPF program encoded in base64. The program is taken from the skeleton and is essentially an ELF object that can later be loaded with libbpf. A command is used to provide the eBPF object, rather than a separate artifact, to avoid problems with locating the eBPF object itself. Because the eBPF maps/program must correspond to QEMU, the eBPF object can't be used with a different QEMU build. The first solution was a helper that ships with QEMU and loads the appropriate eBPF objects. The problem there is finding the proper helper when the system has several different QEMUs installed and/or built from source, whose helpers may not be compatible. Another problem is QEMU being updated while a QEMU instance is running: with an updated helper, it may not be possible to hotplug a virtio-net device into the already running QEMU. Overall, requesting the eBPF object from QEMU itself avoids these failures with very little effort. Links: [PATCH 3/5] qmp: Added the helper stamp check. https://lore.kernel.org/all/20230219162100.174318-4-and...@daynix.com/ Signed-off-by: Andrew Melnychenko --- qapi/ebpf.json| 58 +++ qapi/meson.build | 1 + qapi/qapi-schema.json | 1 + 3 files changed, 60 insertions(+) create mode 100644 qapi/ebpf.json diff --git a/qapi/ebpf.json b/qapi/ebpf.json new file mode 100644 index 00..3237da69a7 --- /dev/null +++ b/qapi/ebpf.json @@ -0,0 +1,58 @@ +# -*- Mode: Python -*- +# vim: filetype=python +# +# This work is licensed under the terms of the GNU GPL, version 2 or later. +# See the COPYING file in the top-level directory. + +## +# = eBPF Objects +## + +{ 'include': 'common.json' } + +## +# @EbpfObject: +# +# Structure that holds eBPF ELF object encoded in base64. +# +# Since: 8.3 +# +## +{ 'struct': 'EbpfObject', + 'data': {'object': 'str'}, + 'if': 'CONFIG_EBPF' } + +## +# @EbpfProgramID: +# +# The eBPF programs that can be gotten with request-ebpf. 
+# +# @rss: Receive side scaling, technology that allows steering traffic +# between queues by calculating a hash. Users may set up indirection table +# and hash/packet types configurations. Used with virtio-net. +# +# Since: 8.3 +## +{ 'enum': 'EbpfProgramID', + 'if': 'CONFIG_EBPF', + 'data': [ { 'name': 'rss' } ] } + +## +# @request-ebpf: +# +# Returns eBPF object that can be loaded with libbpf. +# Management applications (e.g. libvirt) may load it and pass file +# descriptors to QEMU, which allows running QEMU without BPF capabilities. +# It's crucial that eBPF program/map is compatible with QEMU, so it's +# provided through QMP. +# +# Returns: RSS eBPF object encoded in base64. +# +# Since: 8.3 +# +## +{ 'command': 'request-ebpf', + 'data': { 'id': 'EbpfProgramID' }, + 'returns': 'EbpfObject', + 'if': 'CONFIG_EBPF' } + diff --git a/qapi/meson.build b/qapi/meson.build index 60a668b343..90047dae1c 100644 --- a/qapi/meson.build +++ b/qapi/meson.build @@ -33,6 +33,7 @@ qapi_all_modules = [ 'crypto', 'cxl', 'dump', + 'ebpf', 'error', 'introspect', 'job', diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json index 6594afba31..2c82a49bae 100644 --- a/qapi/qapi-schema.json +++ b/qapi/qapi-schema.json @@ -53,6 +53,7 @@ { 'include': 'char.json' } { 'include': 'dump.json' } { 'include': 'net.json' } +{ 'include': 'ebpf.json' } { 'include': 'rdma.json' } { 'include': 'rocker.json' } { 'include': 'tpm.json' } -- 2.40.1
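Since the command returns the ELF object as a base64 string, a small standalone encoder may help show what a client would receive. This is only a sketch in plain C (the series itself relies on GLib's g_base64_encode()); the name demo_base64_encode() is hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

static const char demo_b64_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/*
 * Hypothetical helper: returns a newly allocated, NUL-terminated base64
 * string for data[0..len), or NULL on allocation failure.
 * The caller frees the result with free().
 */
char *demo_base64_encode(const unsigned char *data, size_t len)
{
    size_t out_len = 4 * ((len + 2) / 3);
    char *out = malloc(out_len + 1);
    size_t i = 0, o = 0;
    unsigned v;

    if (out == NULL) {
        return NULL;
    }

    /* Full 3-byte groups become 4 output characters. */
    for (; i + 2 < len; i += 3) {
        v = (unsigned)data[i] << 16 | (unsigned)data[i + 1] << 8 | data[i + 2];
        out[o++] = demo_b64_alphabet[(v >> 18) & 63];
        out[o++] = demo_b64_alphabet[(v >> 12) & 63];
        out[o++] = demo_b64_alphabet[(v >> 6) & 63];
        out[o++] = demo_b64_alphabet[v & 63];
    }

    /* One or two trailing bytes are padded with '='. */
    if (len - i == 1) {
        v = (unsigned)data[i] << 16;
        out[o++] = demo_b64_alphabet[(v >> 18) & 63];
        out[o++] = demo_b64_alphabet[(v >> 12) & 63];
        out[o++] = '=';
        out[o++] = '=';
    } else if (len - i == 2) {
        v = (unsigned)data[i] << 16 | (unsigned)data[i + 1] << 8;
        out[o++] = demo_b64_alphabet[(v >> 18) & 63];
        out[o++] = demo_b64_alphabet[(v >> 12) & 63];
        out[o++] = demo_b64_alphabet[(v >> 6) & 63];
        out[o++] = '=';
    }

    out[o] = '\0';
    return out;
}
```

For the four ELF magic bytes 0x7f 'E' 'L' 'F' this yields "f0VMRg==", so a returned object that really is an ELF file would be expected to start with "f0VM".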
[PATCH v4 4/6] ebpf: Added declaration/initialization routines.
Now the binary objects may be retrieved by id. This is required for future QMP commands that may need a specific eBPF blob. Signed-off-by: Andrew Melnychenko --- ebpf/ebpf.c | 70 ebpf/ebpf.h | 31 + ebpf/ebpf_rss.c | 6 + ebpf/meson.build | 2 +- 4 files changed, 108 insertions(+), 1 deletion(-) create mode 100644 ebpf/ebpf.c create mode 100644 ebpf/ebpf.h diff --git a/ebpf/ebpf.c b/ebpf/ebpf.c new file mode 100644 index 00..ea97c0403e --- /dev/null +++ b/ebpf/ebpf.c @@ -0,0 +1,70 @@ +/* + * QEMU eBPF binary declaration routine. + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Andrew Melnychenko + * + * This work is licensed under the terms of the GNU GPL, version 2 or + * later. See the COPYING file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "qemu/queue.h" +#include "qapi/error.h" +#include "qapi/qapi-commands-ebpf.h" +#include "ebpf/ebpf.h" + +struct ElfBinaryDataEntry { +int id; +const void *data; +size_t datalen; + +QSLIST_ENTRY(ElfBinaryDataEntry) node; +}; + +static QSLIST_HEAD(, ElfBinaryDataEntry) ebpf_elf_obj_list = +QSLIST_HEAD_INITIALIZER(); + +void ebpf_register_binary_data(int id, const void *data, size_t datalen) +{ +struct ElfBinaryDataEntry *dataentry = NULL; + +dataentry = g_new0(struct ElfBinaryDataEntry, 1); +dataentry->data = data; +dataentry->datalen = datalen; +dataentry->id = id; + +QSLIST_INSERT_HEAD(&ebpf_elf_obj_list, dataentry, node); +} + +const void *ebpf_find_binary_by_id(int id, size_t *sz, Error **errp) +{ +struct ElfBinaryDataEntry *it = NULL; +QSLIST_FOREACH(it, &ebpf_elf_obj_list, node) { +if (id == it->id) { +*sz = it->datalen; +return it->data; +} +} + +error_setg(errp, "can't find eBPF object with id: %d", id); + +return NULL; +} + +EbpfObject *qmp_request_ebpf(EbpfProgramID id, Error **errp) +{ +EbpfObject *ret = NULL; +size_t size = 0; +const void *data = ebpf_find_binary_by_id(id, &size, errp); +if (!data) { +return NULL; +} + +ret = g_new0(EbpfObject, 1); +ret->object = 
g_base64_encode(data, size); + +return ret; +} diff --git a/ebpf/ebpf.h b/ebpf/ebpf.h new file mode 100644 index 00..b6266b28b8 --- /dev/null +++ b/ebpf/ebpf.h @@ -0,0 +1,31 @@ +/* + * QEMU eBPF binary declaration routine. + * + * Developed by Daynix Computing LTD (http://www.daynix.com) + * + * Authors: + * Andrew Melnychenko + * + * This work is licensed under the terms of the GNU GPL, version 2 or + * later. See the COPYING file in the top-level directory. + */ + +#ifndef EBPF_H +#define EBPF_H + +struct Error; + +void ebpf_register_binary_data(int id, const void *data, + size_t datalen); +const void *ebpf_find_binary_by_id(int id, size_t *sz, + struct Error **errp); + +#define ebpf_binary_init(id, fn) \ +static void __attribute__((constructor)) ebpf_binary_init_ ## fn(void) \ +{ \ +size_t datalen = 0;\ +const void *data = fn(&datalen); \ +ebpf_register_binary_data(id, data, datalen); \ +} + +#endif /* EBPF_H */ diff --git a/ebpf/ebpf_rss.c b/ebpf/ebpf_rss.c index 24bc6cc409..8679dc452d 100644 --- a/ebpf/ebpf_rss.c +++ b/ebpf/ebpf_rss.c @@ -13,6 +13,8 @@ #include "qemu/osdep.h" #include "qemu/error-report.h" +#include "qapi/qapi-types-misc.h" +#include "qapi/qapi-commands-ebpf.h" #include #include @@ -21,6 +23,8 @@ #include "ebpf/ebpf_rss.h" #include "ebpf/rss.bpf.skeleton.h" +#include "ebpf/ebpf.h" + #include "trace.h" void ebpf_rss_init(struct EBPFRSSContext *ctx) @@ -261,3 +265,5 @@ void ebpf_rss_unload(struct EBPFRSSContext *ctx) ctx->map_toeplitz_key = -1; ctx->map_indirections_table = -1; } + +ebpf_binary_init(EBPF_PROGRAMID_RSS, rss_bpf__elf_bytes) diff --git a/ebpf/meson.build b/ebpf/meson.build index 2f627d6c7d..c9bbaa7c90 100644 --- a/ebpf/meson.build +++ b/ebpf/meson.build @@ -1 +1 @@ -system_ss.add(when: libbpf, if_true: files('ebpf_rss.c'), if_false: files('ebpf_rss-stub.c')) +common_ss.add(when: libbpf, if_true: files('ebpf.c', 'ebpf_rss.c'), if_false: files('ebpf_rss-stub.c')) \ No newline at end of file -- 2.40.1
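The register/lookup-by-id idea in this patch can be sketched without QEMU's queue macros or GLib. The following is a simplified standalone illustration; the names demo_blob_register() and demo_blob_find() are hypothetical, and a plain singly linked list stands in for QSLIST.

```c
#include <stddef.h>
#include <stdlib.h>

/* One registered binary blob, keyed by an integer id. */
struct demo_blob_entry {
    int id;
    const void *data;
    size_t datalen;
    struct demo_blob_entry *next;
};

static struct demo_blob_entry *demo_blob_list;

/* Insert at the head of the list; returns 0 on success, -1 on failure. */
int demo_blob_register(int id, const void *data, size_t datalen)
{
    struct demo_blob_entry *entry = calloc(1, sizeof(*entry));

    if (entry == NULL) {
        return -1;
    }
    entry->id = id;
    entry->data = data;
    entry->datalen = datalen;
    entry->next = demo_blob_list;
    demo_blob_list = entry;
    return 0;
}

/* Returns the blob for id and stores its size in *sz, or NULL if absent. */
const void *demo_blob_find(int id, size_t *sz)
{
    const struct demo_blob_entry *it;

    for (it = demo_blob_list; it != NULL; it = it->next) {
        if (it->id == id) {
            *sz = it->datalen;
            return it->data;
        }
    }
    return NULL;
}
```

In the patch, registration happens from a constructor generated by ebpf_binary_init(), so each object file can contribute its blob before main() runs; the sketch leaves that part out and only shows the list handling.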
[PATCH v4 2/6] ebpf: Added eBPF initialization by fds.
This allows using eBPF file descriptors provided from outside of QEMU. QEMU may then be run without eBPF capabilities and use an RSS program provided by a management tool (e.g. libvirt). Signed-off-by: Andrew Melnychenko --- ebpf/ebpf_rss-stub.c | 6 ++ ebpf/ebpf_rss.c | 27 +++ ebpf/ebpf_rss.h | 5 + 3 files changed, 38 insertions(+) diff --git a/ebpf/ebpf_rss-stub.c b/ebpf/ebpf_rss-stub.c index e71e229190..8d7fae2ad9 100644 --- a/ebpf/ebpf_rss-stub.c +++ b/ebpf/ebpf_rss-stub.c @@ -28,6 +28,12 @@ bool ebpf_rss_load(struct EBPFRSSContext *ctx) return false; } +bool ebpf_rss_load_fds(struct EBPFRSSContext *ctx, int program_fd, + int config_fd, int toeplitz_fd, int table_fd) +{ +return false; +} + bool ebpf_rss_set_all(struct EBPFRSSContext *ctx, struct EBPFRSSConfig *config, uint16_t *indirections_table, uint8_t *toeplitz_key) { diff --git a/ebpf/ebpf_rss.c b/ebpf/ebpf_rss.c index 247f5eee1b..24bc6cc409 100644 --- a/ebpf/ebpf_rss.c +++ b/ebpf/ebpf_rss.c @@ -146,6 +146,33 @@ error: return false; } +bool ebpf_rss_load_fds(struct EBPFRSSContext *ctx, int program_fd, + int config_fd, int toeplitz_fd, int table_fd) +{ +if (ctx == NULL || ebpf_rss_is_loaded(ctx)) { +return false; +} + +if (program_fd < 0 || config_fd < 0 || toeplitz_fd < 0 || table_fd < 0) { +return false; +} + +ctx->program_fd = program_fd; +ctx->map_configuration = config_fd; +ctx->map_toeplitz_key = toeplitz_fd; +ctx->map_indirections_table = table_fd; + +if (!ebpf_rss_mmap(ctx)) { +ctx->program_fd = -1; +ctx->map_configuration = -1; +ctx->map_toeplitz_key = -1; +ctx->map_indirections_table = -1; +return false; +} + +return true; +} + static bool ebpf_rss_set_config(struct EBPFRSSContext *ctx, struct EBPFRSSConfig *config) { diff --git a/ebpf/ebpf_rss.h b/ebpf/ebpf_rss.h index ab08a7266d..239242b0d2 100644 --- a/ebpf/ebpf_rss.h +++ b/ebpf/ebpf_rss.h @@ -14,6 +14,8 @@ #ifndef QEMU_EBPF_RSS_H #define QEMU_EBPF_RSS_H +#define EBPF_RSS_MAX_FDS 4 + struct EBPFRSSContext { void *obj; int program_fd; @@ -41,6 +43,9 @@ bool 
ebpf_rss_is_loaded(struct EBPFRSSContext *ctx); bool ebpf_rss_load(struct EBPFRSSContext *ctx); +bool ebpf_rss_load_fds(struct EBPFRSSContext *ctx, int program_fd, + int config_fd, int toeplitz_fd, int table_fd); + bool ebpf_rss_set_all(struct EBPFRSSContext *ctx, struct EBPFRSSConfig *config, uint16_t *indirections_table, uint8_t *toeplitz_key); -- 2.40.1
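The validation order in ebpf_rss_load_fds() (refuse a NULL or already loaded context, refuse negative descriptors, then adopt them) can be shown in isolation. This is a reduced sketch with a hypothetical struct demo_rss_ctx; it omits the mmap step and the rollback that the patch performs when mapping fails.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for struct EBPFRSSContext. */
struct demo_rss_ctx {
    int program_fd;
    int map_configuration;
    int map_toeplitz_key;
    int map_indirections_table;
};

/* Mark every descriptor as "not set". */
void demo_rss_init(struct demo_rss_ctx *ctx)
{
    ctx->program_fd = -1;
    ctx->map_configuration = -1;
    ctx->map_toeplitz_key = -1;
    ctx->map_indirections_table = -1;
}

bool demo_rss_is_loaded(const struct demo_rss_ctx *ctx)
{
    return ctx != NULL && ctx->program_fd != -1;
}

/* Adopt externally provided descriptors; no ownership checks are done. */
bool demo_rss_adopt_fds(struct demo_rss_ctx *ctx, int program_fd,
                        int config_fd, int toeplitz_fd, int table_fd)
{
    if (ctx == NULL || demo_rss_is_loaded(ctx)) {
        return false;
    }
    if (program_fd < 0 || config_fd < 0 || toeplitz_fd < 0 || table_fd < 0) {
        return false;
    }

    ctx->program_fd = program_fd;
    ctx->map_configuration = config_fd;
    ctx->map_toeplitz_key = toeplitz_fd;
    ctx->map_indirections_table = table_fd;
    return true;
}
```

Rejecting an already loaded context first means a second call cannot silently replace descriptors that are still in use.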
[PATCH v4 1/6] ebpf: Added eBPF map update through mmap.
Change eBPF map updates to go through mmapped arrays. Mmapped arrays provide direct access to the map data, which avoids the bpf_map_update_elem() call, a call that may require capabilities that are not present. Signed-off-by: Andrew Melnychenko --- ebpf/ebpf_rss.c | 117 ++-- ebpf/ebpf_rss.h | 5 +++ 2 files changed, 99 insertions(+), 23 deletions(-) diff --git a/ebpf/ebpf_rss.c b/ebpf/ebpf_rss.c index cee658c158..247f5eee1b 100644 --- a/ebpf/ebpf_rss.c +++ b/ebpf/ebpf_rss.c @@ -27,19 +27,83 @@ void ebpf_rss_init(struct EBPFRSSContext *ctx) { if (ctx != NULL) { ctx->obj = NULL; +ctx->program_fd = -1; +ctx->map_configuration = -1; +ctx->map_toeplitz_key = -1; +ctx->map_indirections_table = -1; + +ctx->mmap_configuration = NULL; +ctx->mmap_toeplitz_key = NULL; +ctx->mmap_indirections_table = NULL; } } bool ebpf_rss_is_loaded(struct EBPFRSSContext *ctx) { -return ctx != NULL && ctx->obj != NULL; +return ctx != NULL && (ctx->obj != NULL || ctx->program_fd != -1); +} + +static bool ebpf_rss_mmap(struct EBPFRSSContext *ctx) +{ +if (!ebpf_rss_is_loaded(ctx)) { +return false; +} + +ctx->mmap_configuration = mmap(NULL, qemu_real_host_page_size(), + PROT_READ | PROT_WRITE, MAP_SHARED, + ctx->map_configuration, 0); +if (ctx->mmap_configuration == MAP_FAILED) { +trace_ebpf_error("eBPF RSS", "can not mmap eBPF configuration array"); +return false; +} +ctx->mmap_toeplitz_key = mmap(NULL, qemu_real_host_page_size(), + PROT_READ | PROT_WRITE, MAP_SHARED, + ctx->map_toeplitz_key, 0); +if (ctx->mmap_toeplitz_key == MAP_FAILED) { +trace_ebpf_error("eBPF RSS", "can not mmap eBPF toeplitz key"); +goto toeplitz_fail; +} +ctx->mmap_indirections_table = mmap(NULL, qemu_real_host_page_size(), + PROT_READ | PROT_WRITE, MAP_SHARED, + ctx->map_indirections_table, 0); +if (ctx->mmap_indirections_table == MAP_FAILED) { +trace_ebpf_error("eBPF RSS", "can not mmap eBPF indirection table"); +goto indirection_fail; +} + +return true; + +indirection_fail: +munmap(ctx->mmap_toeplitz_key, 
qemu_real_host_page_size()); +toeplitz_fail: +munmap(ctx->mmap_configuration, qemu_real_host_page_size()); + +ctx->mmap_configuration = NULL; +ctx->mmap_toeplitz_key = NULL; +ctx->mmap_indirections_table = NULL; +return false; +} + +static void ebpf_rss_munmap(struct EBPFRSSContext *ctx) +{ +if (!ebpf_rss_is_loaded(ctx)) { +return; +} + +munmap(ctx->mmap_indirections_table, qemu_real_host_page_size()); +munmap(ctx->mmap_toeplitz_key, qemu_real_host_page_size()); +munmap(ctx->mmap_configuration, qemu_real_host_page_size()); + +ctx->mmap_configuration = NULL; +ctx->mmap_toeplitz_key = NULL; +ctx->mmap_indirections_table = NULL; } bool ebpf_rss_load(struct EBPFRSSContext *ctx) { struct rss_bpf *rss_bpf_ctx; -if (ctx == NULL) { +if (ctx == NULL || ebpf_rss_is_loaded(ctx)) { return false; } @@ -66,10 +130,18 @@ bool ebpf_rss_load(struct EBPFRSSContext *ctx) ctx->map_toeplitz_key = bpf_map__fd( rss_bpf_ctx->maps.tap_rss_map_toeplitz_key); +if (!ebpf_rss_mmap(ctx)) { +goto error; +} + return true; error: rss_bpf__destroy(rss_bpf_ctx); ctx->obj = NULL; +ctx->program_fd = -1; +ctx->map_configuration = -1; +ctx->map_toeplitz_key = -1; +ctx->map_indirections_table = -1; return false; } @@ -77,15 +149,11 @@ error: static bool ebpf_rss_set_config(struct EBPFRSSContext *ctx, struct EBPFRSSConfig *config) { -uint32_t map_key = 0; - if (!ebpf_rss_is_loaded(ctx)) { return false; } -if (bpf_map_update_elem(ctx->map_configuration, -&map_key, config, 0) < 0) { -return false; -} + +memcpy(ctx->mmap_configuration, config, sizeof(*config)); return true; } @@ -93,27 +161,19 @@ static bool ebpf_rss_set_indirections_table(struct EBPFRSSContext *ctx, uint16_t *indirections_table, size_t len) { -uint32_t i = 0; - if (!ebpf_rss_is_loaded(ctx) || indirections_table == NULL || len > VIRTIO_NET_RSS_MAX_TABLE_LEN) { return false; } -for (; i < len; ++i) { -if (bpf_map_update_elem(ctx->map_indirections_table, &i, -indirections_table + i, 0) < 0) { -return false; -} -} 
+memcpy(ctx->mmap_indirections_table, indirections_table, +sizeof(*indirections_table) * len); return true; }
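The switch from per-element bpf_map_update_elem() calls to a single memcpy() into a mapped region can be demonstrated without BPF at all. The sketch below uses an anonymous shared mapping as a stand-in for a BPF array map (an assumption for illustration; in the patch the mapping is backed by the map's file descriptor). The names demo_map_page() and demo_set_table() are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map one page of shared, writable memory; returns NULL on failure. */
void *demo_map_page(void)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p;

    if (page <= 0) {
        return NULL;
    }
    p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/*
 * Copy len table entries into the mapped array in one step.
 * capacity is the number of uint16_t slots available in the mapping.
 */
bool demo_set_table(uint16_t *mapped, size_t capacity,
                    const uint16_t *table, size_t len)
{
    if (mapped == NULL || table == NULL || len > capacity) {
        return false;
    }
    memcpy(mapped, table, sizeof(*table) * len);
    return true;
}
```

The bounds check plays the role of the VIRTIO_NET_RSS_MAX_TABLE_LEN test in the patch: with a direct memcpy() there is no kernel call left to reject an oversized update.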
[PATCH v4 6/6] ebpf: Updated eBPF program and skeleton.
Updated section name, so libbpf should init/gues proper program type without specifications during open/load. Signed-off-by: Andrew Melnychenko --- ebpf/rss.bpf.skeleton.h | 1469 --- tools/ebpf/rss.bpf.c|2 +- 2 files changed, 741 insertions(+), 730 deletions(-) diff --git a/ebpf/rss.bpf.skeleton.h b/ebpf/rss.bpf.skeleton.h index 18eb2adb12..41b84aea44 100644 --- a/ebpf/rss.bpf.skeleton.h +++ b/ebpf/rss.bpf.skeleton.h @@ -176,162 +176,162 @@ err: static inline const void *rss_bpf__elf_bytes(size_t *sz) { - *sz = 20440; + *sz = 20720; return (const void *)"\ \x7f\x45\x4c\x46\x02\x01\x01\0\0\0\0\0\0\0\0\0\x01\0\xf7\0\x01\0\0\0\0\0\0\0\0\ -\0\0\0\0\0\0\0\0\0\0\0\x98\x4c\0\0\0\0\0\0\0\0\0\0\x40\0\0\0\0\0\x40\0\x0d\0\ -\x01\0\xbf\x19\0\0\0\0\0\0\xb7\x01\0\0\0\0\0\0\x63\x1a\x54\xff\0\0\0\0\xbf\xa7\ -\0\0\0\0\0\0\x07\x07\0\0\x54\xff\xff\xff\x18\x01\0\0\0\0\0\0\0\0\0\0\0\0\0\0\ +\0\0\0\0\0\0\0\0\0\0\0\xb0\x4d\0\0\0\0\0\0\0\0\0\0\x40\0\0\0\0\0\x40\0\x0d\0\ +\x01\0\xbf\x19\0\0\0\0\0\0\xb7\x01\0\0\0\0\0\0\x63\x1a\x4c\xff\0\0\0\0\xbf\xa7\ +\0\0\0\0\0\0\x07\x07\0\0\x4c\xff\xff\xff\x18\x01\0\0\0\0\0\0\0\0\0\0\0\0\0\0\ \xbf\x72\0\0\0\0\0\0\x85\0\0\0\x01\0\0\0\xbf\x06\0\0\0\0\0\0\x18\x01\0\0\0\0\0\ \0\0\0\0\0\0\0\0\0\xbf\x72\0\0\0\0\0\0\x85\0\0\0\x01\0\0\0\xbf\x08\0\0\0\0\0\0\ -\x18\0\0\0\xff\xff\xff\xff\0\0\0\0\0\0\0\0\x15\x06\x67\x02\0\0\0\0\xbf\x87\0\0\ -\0\0\0\0\x15\x07\x65\x02\0\0\0\0\x71\x61\0\0\0\0\0\0\x55\x01\x01\0\0\0\0\0\x05\ -\0\x5e\x02\0\0\0\0\xb7\x01\0\0\0\0\0\0\x63\x1a\xc8\xff\0\0\0\0\x7b\x1a\xc0\xff\ -\0\0\0\0\x7b\x1a\xb8\xff\0\0\0\0\x7b\x1a\xb0\xff\0\0\0\0\x7b\x1a\xa8\xff\0\0\0\ -\0\x63\x1a\xa0\xff\0\0\0\0\x7b\x1a\x98\xff\0\0\0\0\x7b\x1a\x90\xff\0\0\0\0\x7b\ -\x1a\x88\xff\0\0\0\0\x7b\x1a\x80\xff\0\0\0\0\x7b\x1a\x78\xff\0\0\0\0\x7b\x1a\ -\x70\xff\0\0\0\0\x7b\x1a\x68\xff\0\0\0\0\x7b\x1a\x60\xff\0\0\0\0\x7b\x1a\x58\ -\xff\0\0\0\0\x15\x09\x4d\x02\0\0\0\0\x6b\x1a\xd0\xff\0\0\0\0\xbf\xa3\0\0\0\0\0\ 
-\0\x07\x03\0\0\xd0\xff\xff\xff\xbf\x91\0\0\0\0\0\0\xb7\x02\0\0\x0c\0\0\0\xb7\ +\x18\0\0\0\xff\xff\xff\xff\0\0\0\0\0\0\0\0\x15\x06\x64\x02\0\0\0\0\xbf\x87\0\0\ +\0\0\0\0\x15\x07\x62\x02\0\0\0\0\x71\x61\0\0\0\0\0\0\x55\x01\x01\0\0\0\0\0\x05\ +\0\x5b\x02\0\0\0\0\xb7\x01\0\0\0\0\0\0\x63\x1a\xc0\xff\0\0\0\0\x7b\x1a\xb8\xff\ +\0\0\0\0\x7b\x1a\xb0\xff\0\0\0\0\x7b\x1a\xa8\xff\0\0\0\0\x7b\x1a\xa0\xff\0\0\0\ +\0\x63\x1a\x98\xff\0\0\0\0\x7b\x1a\x90\xff\0\0\0\0\x7b\x1a\x88\xff\0\0\0\0\x7b\ +\x1a\x80\xff\0\0\0\0\x7b\x1a\x78\xff\0\0\0\0\x7b\x1a\x70\xff\0\0\0\0\x7b\x1a\ +\x68\xff\0\0\0\0\x7b\x1a\x60\xff\0\0\0\0\x7b\x1a\x58\xff\0\0\0\0\x7b\x1a\x50\ +\xff\0\0\0\0\x15\x09\x4a\x02\0\0\0\0\x6b\x1a\xc8\xff\0\0\0\0\xbf\xa3\0\0\0\0\0\ +\0\x07\x03\0\0\xc8\xff\xff\xff\xbf\x91\0\0\0\0\0\0\xb7\x02\0\0\x0c\0\0\0\xb7\ \x04\0\0\x02\0\0\0\xb7\x05\0\0\0\0\0\0\x85\0\0\0\x44\0\0\0\x67\0\0\0\x20\0\0\0\ -\x77\0\0\0\x20\0\0\0\x55\0\x42\x02\0\0\0\0\xb7\x02\0\0\x10\0\0\0\x69\xa1\xd0\ +\x77\0\0\0\x20\0\0\0\x55\0\x3f\x02\0\0\0\0\xb7\x02\0\0\x10\0\0\0\x69\xa1\xc8\ \xff\0\0\0\0\xbf\x13\0\0\0\0\0\0\xdc\x03\0\0\x10\0\0\0\x15\x03\x02\0\0\x81\0\0\ \x55\x03\x0b\0\xa8\x88\0\0\xb7\x02\0\0\x14\0\0\0\xbf\xa3\0\0\0\0\0\0\x07\x03\0\ -\0\xd0\xff\xff\xff\xbf\x91\0\0\0\0\0\0\xb7\x04\0\0\x02\0\0\0\xb7\x05\0\0\0\0\0\ -\0\x85\0\0\0\x44\0\0\0\x67\0\0\0\x20\0\0\0\x77\0\0\0\x20\0\0\0\x55\0\x32\x02\0\ -\0\0\0\x69\xa1\xd0\xff\0\0\0\0\x15\x01\x30\x02\0\0\0\0\x7b\x7a\x38\xff\0\0\0\0\ -\x7b\x9a\x40\xff\0\0\0\0\x15\x01\x55\0\x86\xdd\0\0\x55\x01\x39\0\x08\0\0\0\xb7\ -\x07\0\0\x01\0\0\0\x73\x7a\x58\xff\0\0\0\0\xb7\x01\0\0\0\0\0\0\x63\x1a\xe0\xff\ -\0\0\0\0\x7b\x1a\xd8\xff\0\0\0\0\x7b\x1a\xd0\xff\0\0\0\0\xbf\xa3\0\0\0\0\0\0\ -\x07\x03\0\0\xd0\xff\xff\xff\x79\xa1\x40\xff\0\0\0\0\xb7\x02\0\0\0\0\0\0\xb7\ +\0\xc8\xff\xff\xff\xbf\x91\0\0\0\0\0\0\xb7\x04\0\0\x02\0\0\0\xb7\x05\0\0\0\0\0\ +\0\x85\0\0\0\x44\0\0\0\x67\0\0\0\x20\0\0\0\x77\0\0\0\x20\0\0\0\x55\0\x2f\x02\0\ 
+\0\0\0\x69\xa1\xc8\xff\0\0\0\0\x15\x01\x2d\x02\0\0\0\0\x7b\x7a\x30\xff\0\0\0\0\ +\x7b\x9a\x38\xff\0\0\0\0\x15\x01\x55\0\x86\xdd\0\0\x55\x01\x39\0\x08\0\0\0\xb7\ +\x07\0\0\x01\0\0\0\x73\x7a\x50\xff\0\0\0\0\xb7\x01\0\0\0\0\0\0\x63\x1a\xd8\xff\ +\0\0\0\0\x7b\x1a\xd0\xff\0\0\0\0\x7b\x1a\xc8\xff\0\0\0\0\xbf\xa3\0\0\0\0\0\0\ +\x07\x03\0\0\xc8\xff\xff\xff\x79\xa1\x38\xff\0\0\0\0\xb7\x02\0\0\0\0\0\0\xb7\ \x04\0\0\x14\0\0\0\xb7\x05\0\0\x01\0\0\0\x85\0\0\0\x44\0\0\0\x67\0\0\0\x20\0\0\ -\0\x77\0\0\0\x20\0\0\0\x55\0\x1c\x02\0\0\0\0\x69\xa1\xd6\xff\0\0\0\0\x55\x01\ -\x01\0\0\0\0\0\xb7\x07\0\0\0\0\0\0\x61\xa1\xdc\xff\0\0\0\0\x63\x1a\x64\xff\0\0\ -\0\0\x61\xa1\xe0\xff\0\0\0\0\x63\x1a\x68\xff\0\0\0\0\x71\xa9\xd9\xff\0\0\0\0\ -\x73\x7a\x5e\xff\0\0\0\0\x71\xa1\xd0\xff\0\0\0\0\x67\x01\0\0\x02\0\0\0\x57\x01\ -\0\0\x3c\0\0\0\x7b\x1a\x48\xff\0\0\0\0\xbf\x91\0\0\0\0\0\0\x57\x01\0\0\xff\0\0\ +\0\x77\0\0\0\x20\0\0\0\x55\0\x19\x02\0\0\0\0\x69\xa1\xce\xff\0\0\0\0\x55\x01\ +\x01\0\0\0\0\0\xb7\x07\0\0\0\0\0\0\x61\xa1\xd4\xff\0\0\0\0\x63\x1a\x5c\xff\0\0\
[PATCH v4 0/6] eBPF RSS through QMP support.
This series of patches provides the ability to retrieve the eBPF program
through QMP, so that a management application may load the BPF blob with the
proper capabilities. virtio-net devices can now accept eBPF programs and maps
through properties as external file descriptors. Access to the eBPF map is
direct, through the mmap() call, so it should not require additional
capabilities for bpf* calls. eBPF file descriptors can be passed to QEMU from
the parent process or over a unix socket with the sendfd() qmp command.

Possible solution for libvirt may look like this:
https://github.com/daynix/libvirt/tree/RSS_eBPF (WIP)

Changes since v3:
 * fixed issue with the build if bpf disabled
 * rebased to the last master
 * refactored according to review

Changes since v2:
 * moved/refactored QMP command
 * refactored virtio-net

Changes since v1:
 * refactored virtio-net
 * moved hunks for ebpf mmap()
 * added qmp enum for eBPF id.

Andrew Melnychenko (6):
  ebpf: Added eBPF map update through mmap.
  ebpf: Added eBPF initialization by fds.
  virtio-net: Added property to load eBPF RSS with fds.
  ebpf: Added declaration/initialization routines.
  qmp: Added new command to retrieve eBPF blob.
  ebpf: Updated eBPF program and skeleton.

 ebpf/ebpf.c                    |   70 ++
 ebpf/ebpf.h                    |   31 +
 ebpf/ebpf_rss-stub.c           |    6 +
 ebpf/ebpf_rss.c                |  150 +++-
 ebpf/ebpf_rss.h                |   10 +
 ebpf/meson.build               |    2 +-
 ebpf/rss.bpf.skeleton.h        | 1469
 hw/net/virtio-net.c            |   55 +-
 include/hw/virtio/virtio-net.h |    1 +
 qapi/ebpf.json                 |   58 ++
 qapi/meson.build               |    1 +
 qapi/qapi-schema.json          |    1 +
 tools/ebpf/rss.bpf.c           |    2 +-
 13 files changed, 1096 insertions(+), 760 deletions(-)
 create mode 100644 ebpf/ebpf.c
 create mode 100644 ebpf/ebpf.h
 create mode 100644 qapi/ebpf.json

--
2.40.1
[PATCH v4 3/6] virtio-net: Added property to load eBPF RSS with fds.
eBPF RSS program and maps may now be passed during initialization. This was initially implemented so that libvirt could launch QEMU without permissions and initialize the eBPF program through the helper. Signed-off-by: Andrew Melnychenko --- hw/net/virtio-net.c| 55 ++ include/hw/virtio/virtio-net.h | 1 + 2 files changed, 50 insertions(+), 6 deletions(-) diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index 7102ec4817..f1894f2095 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -42,6 +42,7 @@ #include "sysemu/sysemu.h" #include "trace.h" #include "monitor/qdev.h" +#include "monitor/monitor.h" #include "hw/pci/pci_device.h" #include "net_rx_pkt.h" #include "hw/virtio/vhost.h" @@ -1304,14 +1305,55 @@ static void virtio_net_detach_epbf_rss(VirtIONet *n) virtio_net_attach_ebpf_to_backend(n->nic, -1); } -static bool virtio_net_load_ebpf(VirtIONet *n) +static bool virtio_net_load_ebpf_fds(VirtIONet *n, Error **errp) { -if (!virtio_net_attach_ebpf_to_backend(n->nic, -1)) { -/* backend does't support steering ebpf */ -return false; +int fds[EBPF_RSS_MAX_FDS] = { [0 ... 
EBPF_RSS_MAX_FDS - 1] = -1}; +int nfds = 0; +int ret = true; +int i = 0; +g_auto(GStrv) fds_strs = g_strsplit(n->ebpf_rss_fds, ":", 0); + +ERRP_GUARD(); + +if (g_strv_length(fds_strs) != EBPF_RSS_MAX_FDS) { +error_setg(errp, + "Expected %d file descriptors but got %d", + EBPF_RSS_MAX_FDS, g_strv_length(fds_strs)); + return false; + } + +for (i = 0; i < nfds; i++) { +fds[i] = monitor_fd_param(monitor_cur(), fds_strs[i], errp); +if (*errp) { +ret = false; +goto exit; +} +} + +ret = ebpf_rss_load_fds(&n->ebpf_rss, fds[0], fds[1], fds[2], fds[3]); + +exit: +if (!ret || *errp) { +for (i = 0; i < nfds && fds[i] != -1; i++) { +close(fds[i]); +} } -return ebpf_rss_load(&n->ebpf_rss); +return ret; +} + +static bool virtio_net_load_ebpf(VirtIONet *n, Error **errp) +{ +bool ret = false; + +if (virtio_net_attach_ebpf_to_backend(n->nic, -1)) { +if (!(n->ebpf_rss_fds +&& virtio_net_load_ebpf_fds(n, errp))) { +ret = ebpf_rss_load(&n->ebpf_rss); +} +} + +return ret; } static void virtio_net_unload_ebpf(VirtIONet *n) @@ -3741,7 +3783,7 @@ static void virtio_net_device_realize(DeviceState *dev, Error **errp) net_rx_pkt_init(&n->rx_pkt); if (virtio_has_feature(n->host_features, VIRTIO_NET_F_RSS)) { -virtio_net_load_ebpf(n); +virtio_net_load_ebpf(n, errp); } } @@ -3903,6 +3945,7 @@ static Property virtio_net_properties[] = { VIRTIO_NET_F_RSS, false), DEFINE_PROP_BIT64("hash", VirtIONet, host_features, VIRTIO_NET_F_HASH_REPORT, false), +DEFINE_PROP_STRING("ebpf_rss_fds", VirtIONet, ebpf_rss_fds), DEFINE_PROP_BIT64("guest_rsc_ext", VirtIONet, host_features, VIRTIO_NET_F_RSC_EXT, false), DEFINE_PROP_UINT32("rsc_interval", VirtIONet, rsc_timeout, diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h index 5f5dcb4572..44faf700b4 100644 --- a/include/hw/virtio/virtio-net.h +++ b/include/hw/virtio/virtio-net.h @@ -219,6 +219,7 @@ struct VirtIONet { VirtioNetRssData rss_data; struct NetRxPkt *rx_pkt; struct EBPFRSSContext ebpf_rss; +char *ebpf_rss_fds; }; size_t 
virtio_net_handle_ctrl_iov(VirtIODevice *vdev, -- 2.40.1
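For readers following the descriptor handling in the patch above, here is a minimal standalone sketch of the split-and-validate step in plain C. It is not the patch's code. It assumes purely numeric descriptors (the patch resolves each field through monitor_fd_param(), which is not modelled here), and demo_parse_fds and DEMO_MAX_FDS are made-up names used only for illustration. The sketch keeps an explicit count of parsed fields so that the "exactly four" check is driven by what was actually parsed.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

#define DEMO_MAX_FDS 4   /* hypothetical stand-in for EBPF_RSS_MAX_FDS */

/*
 * Parse a colon-separated list such as "3:4:5:6" into fds[].
 * Returns true only if exactly DEMO_MAX_FDS non-negative integers
 * were found; on failure every slot is reset to -1.
 */
static bool demo_parse_fds(const char *s, int fds[DEMO_MAX_FDS])
{
    const char *p = s;
    int n = 0;
    int i;

    for (i = 0; i < DEMO_MAX_FDS; i++) {
        fds[i] = -1;
    }

    for (;;) {
        char *end;
        long v;

        if (n == DEMO_MAX_FDS) {
            goto fail;              /* more fields than expected */
        }
        errno = 0;
        v = strtol(p, &end, 10);
        if (end == p || errno != 0 || v < 0 || v > INT_MAX) {
            goto fail;              /* empty, non-numeric or out of range */
        }
        fds[n++] = (int)v;
        if (*end == '\0') {
            break;                  /* consumed the whole string */
        }
        if (*end != ':') {
            goto fail;              /* unexpected separator */
        }
        p = end + 1;
    }

    if (n == DEMO_MAX_FDS) {
        return true;
    }

fail:
    for (i = 0; i < DEMO_MAX_FDS; i++) {
        fds[i] = -1;
    }
    return false;
}
```

A caller would pass the property string and an int[DEMO_MAX_FDS] buffer, and only go on to load the program when the function returns true.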
Re: Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
On 7/13/23 19:47, Conor Dooley wrote:
> On Thu, Jul 13, 2023 at 07:35:01PM -0300, Daniel Henrique Barboza wrote:
> > On 7/13/23 19:12, Conor Dooley wrote:
> > > And a question for you below Daniel.
> > >
> > > On Wed, Jul 12, 2023 at 11:14:21PM +0100, Conor Dooley wrote:
> > > >
> > > > qemu-system-riscv64: warning: disabling zca extension for hart 0x because privilege spec version does not match
> > > > qemu-system-riscv64: warning: disabling zca extension for hart 0x0001 because privilege spec version does not match
> > > > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0001 because privilege spec version does not match
> > > > qemu-system-riscv64: warning: disabling zca extension for hart 0x0002 because privilege spec version does not match
> > > > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0002 because privilege spec version does not match
> > > > qemu-system-riscv64: warning: disabling zca extension for hart 0x0003 because privilege spec version does not match
> > > > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0003 because privilege spec version does not match
> > > > qemu-system-riscv64: warning: disabling zca extension for hart 0x0004 because privilege spec version does not match
> > > > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0004 because privilege spec version does not match
> > >
> > > Why am I seeing these warnings? Does the mpfs machine type need to
> > > disable some things? It only supports rv64imafdc per the DT, and
> > > predates things like Zca existing, so emitting warnings does not seem
> > > fair at all to me!
> >
> > QEMU will disable extensions that are newer than a priv spec version
> > that is set by the CPU. IIUC the icicle board is running a sifive_u54
> > CPU by default. That CPU has a priv spec version 1_10_0. The CPU is
> > also enabling C.
> >
> > We will enable zca if C is enabled. C and D enabled will also enable
> > zcd. But then the priv check will disable both because zca and zcd
> > have priv spec 1_12_0.
> >
> > This is a side effect of a change that I did a few months ago. Back
> > then we weren't disabling stuff correctly.
>
> Yah, I did check out the blame, hence directing it at you. Thanks for
> the explanation.
>
> > The warnings are annoying but are benign.
>
> To be honest, benign or not, this kind of thing is only going to
> lead to grief. Even though only the direct kernel boot works, we do
> actually have some customers that are using the icicle target in QEMU.
>
> > And apparently the sifive_u54 CPU has been inconsistent for some time
> > and we noticed just now.
> > Now, if the icicle board is supposed to have zca and zcd then we have
> > a problem.
>
> I don't know, this depends on how you see things in QEMU. I would say
> that it supports c, and not Zca/Zcf/Zcd, given it predates the
> extensions. I have no interest in retrofitting my devicetree stuff with
> them, for example.
>
> > We'll need to discuss whether we move sifive_u54 CPU priv spec to
> > 1_12_0 (I'm not sure how this will affect other boards that use this
> > CPU) or remove this priv spec disable code altogether from QEMU.
>
> I think you should stop warning for this? From my dumb-user perspective,
> the warning only "scares" me into thinking something is wrong, when
> there isn't. I can see a use case for the warning where someone tries to
> enable Zca & Co. in their QEMU incantation for a CPU that does not
> have the correct privilege level to support it, but I didn't try to set
> any options at all in that way, so the warnings seem unfair?

That's a fair criticism. We had similar discussions a few months back. It's
weird to send warnings when the user didn't set the extensions manually, but
ATM we can't tell whether an extension was user enabled or not.

So we can either show unfair warning messages or not show warnings and take
the risk of silently disabling extensions that users enabled in the command
line. It seems that the former is more annoying to deal with than the latter.

I guess I can propose a patch to remove the warnings. We can bring the
warnings back when we have a better solution.

Daniel

> Cheers,
> Conor.
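To make the trade-off in the message above concrete, here is a small sketch of what "warn only for user-enabled extensions" could look like. It is purely hypothetical: as stated above, QEMU could not tell at that point whether an extension was user enabled, so the user_set flag, the DemoExt type and demo_disable_newer_exts() are all invented for illustration and do not correspond to QEMU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical ordering of privileged spec versions. */
enum { DEMO_PRIV_1_10 = 0, DEMO_PRIV_1_11, DEMO_PRIV_1_12 };

typedef struct DemoExt {
    const char *name;
    int min_priv;    /* oldest priv spec that allows the extension */
    bool enabled;
    bool user_set;   /* hypothetical: true only if set on the command line */
} DemoExt;

/*
 * Disable every enabled extension that is newer than cpu_priv.
 * A warning is printed only when the user asked for the extension;
 * implied extensions are dropped silently.
 * Returns the number of warnings printed.
 */
static int demo_disable_newer_exts(DemoExt *exts, size_t n, int cpu_priv)
{
    int warnings = 0;

    for (size_t i = 0; i < n; i++) {
        if (!exts[i].enabled || exts[i].min_priv <= cpu_priv) {
            continue;
        }
        exts[i].enabled = false;
        if (exts[i].user_set) {
            fprintf(stderr,
                    "warning: disabling %s extension because privilege "
                    "spec version does not match\n", exts[i].name);
            warnings++;
        }
    }
    return warnings;
}
```

With a 1_10_0 CPU, an implied zca would be disabled quietly while a zcd requested on the command line would still produce one warning.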
Re: [PATCH 0/3] hw/arm/virt: Use generic CPU invalidation
Hi Richard, On 7/14/23 05:27, Richard Henderson wrote: On 7/13/23 13:34, Gavin Shan wrote: On 7/13/23 21:52, Marcin Juszkiewicz wrote: W dniu 13.07.2023 o 13:44, Peter Maydell pisze: I see this isn't a change in this patch, but given that what the user specifies is not "cortex-a8-arm-cpu" but "cortex-a8", why do we include the "-arm-cpu" suffix in the error messages? It's not valid syntax to say "-cpu cortex-a8-arm-cpu", so it's a bit misleading... Internally those cpu names are "max-{TYPE_ARM_CPU}" and similar for other architectures. I like the change but it (IMHO) needs to cut "-{TYPE_*_CPU}" string from names: 13:37 marcin@applejack:qemu$ ./build/aarch64-softmmu/qemu-system-aarch64 -M virt -cpu cortex-r5 qemu-system-aarch64: Invalid CPU type: cortex-r5-arm-cpu The valid types are: cortex-a7-arm-cpu, cortex-a15-arm-cpu, cortex-a35-arm-cpu, cortex-a55-arm-cpu, cortex-a72-arm-cpu, cortex-a76-arm-cpu, a64fx-arm-cpu, neoverse-n1-arm-cpu, neoverse-v1-arm-cpu, cortex-a53-arm-cpu, cortex-a57-arm-cpu, host-arm-cpu, max-arm-cpu 13:37 marcin@applejack:qemu$ ./build/aarch64-softmmu/qemu-system-aarch64 -M virt -cpu cortex-a57-arm-cpu qemu-system-aarch64: unable to find CPU model 'cortex-a57-arm-cpu' The suffix of CPU types are provided in hw/arm/virt.c::valid_cpu_types in PATCH[2]. In the generic validation, the complete CPU type is used. The error message also have complete CPU type there. Peter and Marcin, how about to split the CPU types to two fields, as below? In this way, the complete CPU type will be used for validation and the 'internal' names will be used for the error messages. struct MachineClass { const char *valid_cpu_type_suffix; const char **valid_cpu_types; While you're changing this: const char * const *valid_cpu_types; yes, will do. }; hw/arm/virt.c - static const char *valid_cpu_types[] = { So that you can then do static const char * const valid_cpu_types[] yes, will do. Thanks, Gavin
Re: [PATCH 0/3] hw/arm/virt: Use generic CPU invalidation
Hi Philippe, On 7/14/23 02:29, Philippe Mathieu-Daudé wrote: On 13/7/23 14:34, Gavin Shan wrote: On 7/13/23 21:52, Marcin Juszkiewicz wrote: W dniu 13.07.2023 o 13:44, Peter Maydell pisze: I see this isn't a change in this patch, but given that what the user specifies is not "cortex-a8-arm-cpu" but "cortex-a8", why do we include the "-arm-cpu" suffix in the error messages? It's not valid syntax to say "-cpu cortex-a8-arm-cpu", so it's a bit misleading... Internally those cpu names are "max-{TYPE_ARM_CPU}" and similar for other architectures. I like the change but it (IMHO) needs to cut "-{TYPE_*_CPU}" string from names: 13:37 marcin@applejack:qemu$ ./build/aarch64-softmmu/qemu-system-aarch64 -M virt -cpu cortex-r5 qemu-system-aarch64: Invalid CPU type: cortex-r5-arm-cpu The valid types are: cortex-a7-arm-cpu, cortex-a15-arm-cpu, cortex-a35-arm-cpu, cortex-a55-arm-cpu, cortex-a72-arm-cpu, cortex-a76-arm-cpu, a64fx-arm-cpu, neoverse-n1-arm-cpu, neoverse-v1-arm-cpu, cortex-a53-arm-cpu, cortex-a57-arm-cpu, host-arm-cpu, max-arm-cpu 13:37 marcin@applejack:qemu$ ./build/aarch64-softmmu/qemu-system-aarch64 -M virt -cpu cortex-a57-arm-cpu qemu-system-aarch64: unable to find CPU model 'cortex-a57-arm-cpu' The suffix of CPU types are provided in hw/arm/virt.c::valid_cpu_types in PATCH[2]. In the generic validation, the complete CPU type is used. The error message also have complete CPU type there. In some places (arm_cpu_list_entry, arm_cpu_add_definition) we use: g_strndup(typename, strlen(typename) - strlen("-" TYPE_ARM_CPU)) Maybe extract as a helper? cpu_typename_name()? :) Yeah, it's definitely a good idea. The helper is needed by all architectures, not ARM alone. The following CPU types don't have explicit definition of _CPU_TYPE_SUFFIX. We need take "-" TYPE_CPU as the suffix. 
target/microblaze/cpu.c    TYPE_MICROBLAZE_CPU
target/hppa/cpu.c          TYPE_HPPA_CPU
target/nios2/cpu.c         TYPE_NIOS2_CPU

target/microblaze/cpu-qom.h:#define TYPE_MICROBLAZE_CPU "microblaze-cpu"
target/hppa/cpu-qom.h:      #define TYPE_HPPA_CPU "hppa-cpu"
target/nios2/cpu.h:         #define TYPE_NIOS2_CPU "nios2-cpu"

I think the function name can be cpu_model_name() since we have called
it as 'model' in cpu.c::parse_cpu_option(). Something like below. Please
let me know if you have more comments.

target//cpu.h
-------------
static inline char *cpu_model_name(const char *typename)
{
    return g_strndup(typename, strlen(typename) - strlen(TYPE_XXX_CPU_SUFFIX));
}

Thanks,
Gavin
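As a rough illustration of the helper proposed above, the following standalone sketch strips a type suffix using only the C standard library. It is an assumption-laden stand-in, not the proposed QEMU helper: DEMO_CPU_TYPE_SUFFIX and demo_cpu_model_name() are hypothetical names, malloc() replaces g_strndup(), and, unlike the snippet above, the sketch first checks that the suffix is actually present before removing it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical suffix, chosen to match the "-arm-cpu" names in the thread. */
#define DEMO_CPU_TYPE_SUFFIX "-arm-cpu"

/*
 * Return a newly allocated copy of typename with the suffix removed,
 * or a plain copy if typename does not end with the suffix.
 * The caller frees the result; NULL is returned on allocation failure.
 */
static char *demo_cpu_model_name(const char *typename)
{
    size_t len = strlen(typename);
    size_t slen = strlen(DEMO_CPU_TYPE_SUFFIX);
    char *out;

    if (len >= slen &&
        strcmp(typename + len - slen, DEMO_CPU_TYPE_SUFFIX) == 0) {
        len -= slen;
    }

    out = malloc(len + 1);
    if (out) {
        memcpy(out, typename, len);
        out[len] = '\0';
    }
    return out;
}
```

An error path could then print the result of demo_cpu_model_name("cortex-a57-arm-cpu") so that the user sees the same spelling they would pass to -cpu.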
Re: Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
On Thu, Jul 13, 2023 at 11:12:33PM +0100, Conor Dooley wrote: > +CC OpenSBI Mailing list > > I've not yet had the chance to bisect this, so adding the OpenSBI folks > to CC in case they might have an idea for what to try. NVM this, I bisected it. Logs below. > And a question for you below Daniel. > > On Wed, Jul 12, 2023 at 11:14:21PM +0100, Conor Dooley wrote: > > On Wed, Jul 12, 2023 at 06:39:28PM -0300, Daniel Henrique Barboza wrote: > > > On 7/12/23 18:35, Conor Dooley wrote: > > > > On Wed, Jul 12, 2023 at 06:09:10PM -0300, Daniel Henrique Barboza wrote: > > > > > > > > > It is intentional. Those default marchid/mimpid vals were derived > > > > > from the current > > > > > QEMU version ID/build and didn't mean much. > > > > > > > > > > It is still possible to set them via "-cpu rv64,marchid=N,mimpid=N" > > > > > if needed when > > > > > using the generic (rv64,rv32) CPUs. Vendor CPUs can't have their > > > > > machine IDs changed > > > > > via command line. > > > > > > > > Sounds good, thanks. I did just now go and check icicle to see what it > > > > would report & it does not boot. I'll go bisect... > > > > > > BTW how are you booting the icicle board nowadays? I remember you > > > mentioning about > > > some changes in the FDT being required to boot and whatnot. > > > > I do direct kernel boots, as the HSS doesn't work anymore, and just lie > > a bit to QEMU about how much DDR we have. > > .PHONY: qemu-icicle > > qemu-icicle: > > $(qemu) -M microchip-icicle-kit \ > > -m 3G -smp 5 \ > > -kernel $(vmlinux_bin) \ > > -dtb $(icicle_dtb) \ > > -initrd $(initramfs) \ > > -display none -serial null \ > > -serial stdio \ > > -D qemu.log -d unimp > > > > The platform only supports 2 GiB of DDR, not 3, but if I pass 2 to QEMU > > it thinks there's 1 GiB at 0x8000_ and 1 GiB at 0x10__. The > > upstream devicetree (and current FPGA reference design) expects there to > > be 1 GiB at 0x8000_ and 1 GiB at 0x10_4000_. 
If I lie to QEMU, > > it thinks there is 1 GiB at 0x8000_ and 2 GiB at 0x10__, and > > things just work. I prefer doing it this way than having to modify the > > DT, it is a lot easier to explain to people this way. > > > > I've been meaning to work the support for the icicle & mpfs in QEMU, but > > it just gets shunted down the priority list. I'd really like if a proper > > boot flow would run in QEMU, which means fixing whatever broke the HSS, > > but I've recently picked up maintainership of dt-binding stuff in Linux, > > so I've unfortunately got even less time to try and work on it. Maybe > > we'll get some new graduate in and I can make them suffer in my stead... > > > > > If it's not too hard I'll add it in my test scripts to keep it under > > > check. Perhaps > > > we can even add it to QEMU testsuite. > > > > I don't think it really should be that bad, at least for the direct > > kernel boot, which is what I mainly care about, since I use it fairly > > often for debugging boot stuff in Linux. > > > > Anyways, aa903cf31391dd505b399627158f1292a6d19896 is the first bad commit: > > commit aa903cf31391dd505b399627158f1292a6d19896 > > Author: Bin Meng > > Date: Fri Jun 30 23:36:04 2023 +0800 > > > > roms/opensbi: Upgrade from v1.2 to v1.3 > > > > Upgrade OpenSBI from v1.2 to v1.3 and the pre-built bios images. 
> > > > And I see something like: > > qemu//build/qemu-system-riscv64 -M microchip-icicle-kit \ > > -m 3G -smp 5 \ > > -kernel vmlinux.bin \ > > -dtb icicle.dtb \ > > -initrd initramfs.cpio.gz \ > > -display none -serial null \ > > -serial stdio \ > > -D qemu.log -d unimp > > > qemu-system-riscv64: warning: disabling zca extension for hart > > 0x because privilege spec version does not match > > qemu-system-riscv64: warning: disabling zca extension for hart > > 0x0001 because privilege spec version does not match > > qemu-system-riscv64: warning: disabling zcd extension for hart > > 0x0001 because privilege spec version does not match > > qemu-system-riscv64: warning: disabling zca extension for hart > > 0x0002 because privilege spec version does not match > > qemu-system-riscv64: warning: disabling zcd extension for hart > > 0x0002 because privilege spec version does not match > > qemu-system-riscv64: warning: disabling zca extension for hart > > 0x0003 because privilege spec version does not match > > qemu-system-riscv64: warning: disabling zcd extension for hart > > 0x0003 because privilege spec version does not match > > qemu-system-riscv64: warning: disabling zca extension for hart > > 0x0004 because privilege spec version does not match > > qemu-system-riscv64: warning: disabling zcd extension for hart > > 0x0004 because privilege spec version does not
Re: Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
On Thu, Jul 13, 2023 at 07:35:01PM -0300, Daniel Henrique Barboza wrote:
> On 7/13/23 19:12, Conor Dooley wrote:
> > And a question for you below Daniel.
> >
> > On Wed, Jul 12, 2023 at 11:14:21PM +0100, Conor Dooley wrote:
> > >
> > > qemu-system-riscv64: warning: disabling zca extension for hart 0x because privilege spec version does not match
> > > qemu-system-riscv64: warning: disabling zca extension for hart 0x0001 because privilege spec version does not match
> > > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0001 because privilege spec version does not match
> > > qemu-system-riscv64: warning: disabling zca extension for hart 0x0002 because privilege spec version does not match
> > > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0002 because privilege spec version does not match
> > > qemu-system-riscv64: warning: disabling zca extension for hart 0x0003 because privilege spec version does not match
> > > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0003 because privilege spec version does not match
> > > qemu-system-riscv64: warning: disabling zca extension for hart 0x0004 because privilege spec version does not match
> > > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0004 because privilege spec version does not match
> >
> > Why am I seeing these warnings? Does the mpfs machine type need to
> > disable some things? It only supports rv64imafdc per the DT, and
> > predates things like Zca existing, so emitting warnings does not seem
> > fair at all to me!
>
> QEMU will disable extensions that are newer than a priv spec version
> that is set by the CPU. IIUC the icicle board is running a sifive_u54
> CPU by default. That CPU has a priv spec version 1_10_0. The CPU is
> also enabling C.
>
> We will enable zca if C is enabled. C and D enabled will also enable
> zcd. But then the priv check will disable both because zca and zcd
> have priv spec 1_12_0.
>
> This is a side effect of a change that I did a few months ago. Back
> then we weren't disabling stuff correctly.

Yah, I did check out the blame, hence directing it at you. Thanks for
the explanation.

> The warnings are annoying but are benign.

To be honest, benign or not, this kind of thing is only going to
lead to grief. Even though only the direct kernel boot works, we do
actually have some customers that are using the icicle target in QEMU.

> And apparently the sifive_u54 CPU has been inconsistent for some time
> and we noticed just now.
> Now, if the icicle board is supposed to have zca and zcd then we have
> a problem.

I don't know, this depends on how you see things in QEMU. I would say
that it supports c, and not Zca/Zcf/Zcd, given it predates the
extensions. I have no interest in retrofitting my devicetree stuff with
them, for example.

> We'll need to discuss whether we move sifive_u54 CPU priv spec to
> 1_12_0 (I'm not sure how this will affect other boards that use this
> CPU) or remove this priv spec disable code altogether from QEMU.

I think you should stop warning for this? From my dumb-user perspective,
the warning only "scares" me into thinking something is wrong, when
there isn't. I can see a use case for the warning where someone tries to
enable Zca & Co. in their QEMU incantation for a CPU that does not
have the correct privilege level to support it, but I didn't try to set
any options at all in that way, so the warnings seem unfair?

Cheers,
Conor.
Re: Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
On 7/13/23 19:12, Conor Dooley wrote: +CC OpenSBI Mailing list I've not yet had the chance to bisect this, so adding the OpenSBI folks to CC in case they might have an idea for what to try. And a question for you below Daniel. On Wed, Jul 12, 2023 at 11:14:21PM +0100, Conor Dooley wrote: On Wed, Jul 12, 2023 at 06:39:28PM -0300, Daniel Henrique Barboza wrote: On 7/12/23 18:35, Conor Dooley wrote: On Wed, Jul 12, 2023 at 06:09:10PM -0300, Daniel Henrique Barboza wrote: It is intentional. Those default marchid/mimpid vals were derived from the current QEMU version ID/build and didn't mean much. It is still possible to set them via "-cpu rv64,marchid=N,mimpid=N" if needed when using the generic (rv64,rv32) CPUs. Vendor CPUs can't have their machine IDs changed via command line. Sounds good, thanks. I did just now go and check icicle to see what it would report & it does not boot. I'll go bisect... BTW how are you booting the icicle board nowadays? I remember you mentioning about some changes in the FDT being required to boot and whatnot. I do direct kernel boots, as the HSS doesn't work anymore, and just lie a bit to QEMU about how much DDR we have. .PHONY: qemu-icicle qemu-icicle: $(qemu) -M microchip-icicle-kit \ -m 3G -smp 5 \ -kernel $(vmlinux_bin) \ -dtb $(icicle_dtb) \ -initrd $(initramfs) \ -display none -serial null \ -serial stdio \ -D qemu.log -d unimp The platform only supports 2 GiB of DDR, not 3, but if I pass 2 to QEMU it thinks there's 1 GiB at 0x8000_ and 1 GiB at 0x10__. The upstream devicetree (and current FPGA reference design) expects there to be 1 GiB at 0x8000_ and 1 GiB at 0x10_4000_. If I lie to QEMU, it thinks there is 1 GiB at 0x8000_ and 2 GiB at 0x10__, and things just work. I prefer doing it this way than having to modify the DT, it is a lot easier to explain to people this way. I've been meaning to work the support for the icicle & mpfs in QEMU, but it just gets shunted down the priority list. 
I'd really like if a proper boot flow would run in QEMU, which means fixing whatever broke the HSS, but I've recently picked up maintainership of dt-binding stuff in Linux, so I've unfortunately got even less time to try and work on it. Maybe we'll get some new graduate in and I can make them suffer in my stead... If it's not too hard I'll add it in my test scripts to keep it under check. Perhaps we can even add it to QEMU testsuite. I don't think it really should be that bad, at least for the direct kernel boot, which is what I mainly care about, since I use it fairly often for debugging boot stuff in Linux. Anyways, aa903cf31391dd505b399627158f1292a6d19896 is the first bad commit: commit aa903cf31391dd505b399627158f1292a6d19896 Author: Bin Meng Date: Fri Jun 30 23:36:04 2023 +0800 roms/opensbi: Upgrade from v1.2 to v1.3 Upgrade OpenSBI from v1.2 to v1.3 and the pre-built bios images. And I see something like: qemu//build/qemu-system-riscv64 -M microchip-icicle-kit \ -m 3G -smp 5 \ -kernel vmlinux.bin \ -dtb icicle.dtb \ -initrd initramfs.cpio.gz \ -display none -serial null \ -serial stdio \ -D qemu.log -d unimp qemu-system-riscv64: warning: disabling zca extension for hart 0x because privilege spec version does not match qemu-system-riscv64: warning: disabling zca extension for hart 0x0001 because privilege spec version does not match qemu-system-riscv64: warning: disabling zcd extension for hart 0x0001 because privilege spec version does not match qemu-system-riscv64: warning: disabling zca extension for hart 0x0002 because privilege spec version does not match qemu-system-riscv64: warning: disabling zcd extension for hart 0x0002 because privilege spec version does not match qemu-system-riscv64: warning: disabling zca extension for hart 0x0003 because privilege spec version does not match qemu-system-riscv64: warning: disabling zcd extension for hart 0x0003 because privilege spec version does not match qemu-system-riscv64: warning: disabling zca extension for 
hart 0x0004 because privilege spec version does not match qemu-system-riscv64: warning: disabling zcd extension for hart 0x0004 because privilege spec version does not match Why am I seeing these warnings? Does the mpfs machine type need to disable some things? It only supports rv64imafdc per the DT, and predates things like Zca existing, so emitting warnings does not seem fair at all to me! QEMU will disable extensions that are newer than a priv spec version that is set by the CPU. IIUC the icicle board is running a sifive_u54 CPU by default. That CPU has a priv spec version 1_10_0. The CPU is also enabling C. We will enable zca if C is
Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
+CC OpenSBI Mailing list I've not yet had the chance to bisect this, so adding the OpenSBI folks to CC in case they might have an idea for what to try. And a question for you below Daniel. On Wed, Jul 12, 2023 at 11:14:21PM +0100, Conor Dooley wrote: > On Wed, Jul 12, 2023 at 06:39:28PM -0300, Daniel Henrique Barboza wrote: > > On 7/12/23 18:35, Conor Dooley wrote: > > > On Wed, Jul 12, 2023 at 06:09:10PM -0300, Daniel Henrique Barboza wrote: > > > > > > > It is intentional. Those default marchid/mimpid vals were derived from > > > > the current > > > > QEMU version ID/build and didn't mean much. > > > > > > > > It is still possible to set them via "-cpu rv64,marchid=N,mimpid=N" if > > > > needed when > > > > using the generic (rv64,rv32) CPUs. Vendor CPUs can't have their > > > > machine IDs changed > > > > via command line. > > > > > > Sounds good, thanks. I did just now go and check icicle to see what it > > > would report & it does not boot. I'll go bisect... > > > > BTW how are you booting the icicle board nowadays? I remember you > > mentioning about > > some changes in the FDT being required to boot and whatnot. > > I do direct kernel boots, as the HSS doesn't work anymore, and just lie > a bit to QEMU about how much DDR we have. > .PHONY: qemu-icicle > qemu-icicle: > $(qemu) -M microchip-icicle-kit \ > -m 3G -smp 5 \ > -kernel $(vmlinux_bin) \ > -dtb $(icicle_dtb) \ > -initrd $(initramfs) \ > -display none -serial null \ > -serial stdio \ > -D qemu.log -d unimp > > The platform only supports 2 GiB of DDR, not 3, but if I pass 2 to QEMU > it thinks there's 1 GiB at 0x8000_ and 1 GiB at 0x10__. The > upstream devicetree (and current FPGA reference design) expects there to > be 1 GiB at 0x8000_ and 1 GiB at 0x10_4000_. If I lie to QEMU, > it thinks there is 1 GiB at 0x8000_ and 2 GiB at 0x10__, and > things just work. I prefer doing it this way than having to modify the > DT, it is a lot easier to explain to people this way. 
> > I've been meaning to work the support for the icicle & mpfs in QEMU, but > it just gets shunted down the priority list. I'd really like if a proper > boot flow would run in QEMU, which means fixing whatever broke the HSS, > but I've recently picked up maintainership of dt-binding stuff in Linux, > so I've unfortunately got even less time to try and work on it. Maybe > we'll get some new graduate in and I can make them suffer in my stead... > > > If it's not too hard I'll add it in my test scripts to keep it under check. > > Perhaps > > we can even add it to QEMU testsuite. > > I don't think it really should be that bad, at least for the direct > kernel boot, which is what I mainly care about, since I use it fairly > often for debugging boot stuff in Linux. > > Anyways, aa903cf31391dd505b399627158f1292a6d19896 is the first bad commit: > commit aa903cf31391dd505b399627158f1292a6d19896 > Author: Bin Meng > Date: Fri Jun 30 23:36:04 2023 +0800 > > roms/opensbi: Upgrade from v1.2 to v1.3 > > Upgrade OpenSBI from v1.2 to v1.3 and the pre-built bios images. 
> > And I see something like: > qemu//build/qemu-system-riscv64 -M microchip-icicle-kit \ > -m 3G -smp 5 \ > -kernel vmlinux.bin \ > -dtb icicle.dtb \ > -initrd initramfs.cpio.gz \ > -display none -serial null \ > -serial stdio \ > -D qemu.log -d unimp > qemu-system-riscv64: warning: disabling zca extension for hart > 0x because privilege spec version does not match > qemu-system-riscv64: warning: disabling zca extension for hart > 0x0001 because privilege spec version does not match > qemu-system-riscv64: warning: disabling zcd extension for hart > 0x0001 because privilege spec version does not match > qemu-system-riscv64: warning: disabling zca extension for hart > 0x0002 because privilege spec version does not match > qemu-system-riscv64: warning: disabling zcd extension for hart > 0x0002 because privilege spec version does not match > qemu-system-riscv64: warning: disabling zca extension for hart > 0x0003 because privilege spec version does not match > qemu-system-riscv64: warning: disabling zcd extension for hart > 0x0003 because privilege spec version does not match > qemu-system-riscv64: warning: disabling zca extension for hart > 0x0004 because privilege spec version does not match > qemu-system-riscv64: warning: disabling zcd extension for hart > 0x0004 because privilege spec version does not match Why am I seeing these warnings? Does the mpfs machine type need to disable some things? It only supports rv64imafdc per the DT, and predates things like Zca existing, so emitting warnings does not seem fair at all to me! > > OpenSBI v1.3 >_
Re: [PATCH 02/18] target/arm: Use clmul_8* routines
On 13/7/23 23:14, Richard Henderson wrote: Use generic routines for 8-bit carry-less multiply. Remove our local version of pmull_h. Signed-off-by: Richard Henderson --- target/arm/tcg/vec_internal.h | 5 --- target/arm/tcg/mve_helper.c | 8 ++--- target/arm/tcg/vec_helper.c | 63 +++ 3 files changed, 15 insertions(+), 61 deletions(-) Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH for-8.1] tcg: Use HAVE_CMPXCHG128 instead of CONFIG_CMPXCHG128
Hi Richard,

On 13/7/23 22:23, Richard Henderson wrote:
> We adjust CONFIG_ATOMIC128 and CONFIG_CMPXCHG128 with
> CONFIG_ATOMIC128_OPT in atomic128.h. It is difficult to tell when
> those changes have been applied with the ifdef we must use with
> CONFIG_CMPXCHG128. So instead use HAVE_CMPXCHG128, which triggers
> -Werror-undef when the proper header has not been included.
>
> Improves tcg_gen_atomic_cmpxchg_i128 for s390x host, which
> requires CONFIG_ATOMIC128_OPT. Without this we fall back to
> EXCP_ATOMIC to single-step 128-bit atomics, which is slow enough
> to cause some tests to time out.
>
> Reported-by: Thomas Huth
> Signed-off-by: Richard Henderson
> ---
>
> Thomas, this issue does not quite match the one you bisected, but
> other than the cmpxchg, I don't see any qemu_{ld,st}_i128
> being used in BootLinuxS390X.test_s390_ccw_virtio_tcg.
>
> As far as I can see, this wasn't broken by the addition of
> CONFIG_ATOMIC128_OPT, rather that fix didn't go far enough.
>
> Anyway, test_s390_ccw_virtio_tcg now passes in 159s on our host.

IIUC:

 - if we have CONFIG_ATOMIC128, we use qatomic_cmpxchg__nocheck;
 - else if we have CONFIG_CMPXCHG128, we use __sync_val_compare_and_swap_16;
 - in both cases we set HAVE_CMPXCHG128;
 - otherwise we cannot use atomic128 cmpxchg().

(I'm trying to figure out why we need both CONFIGs.)
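The selection described in the IIUC list above can be sketched with the preprocessor alone. This is only a toy model under stated assumptions: the DEMO_* macros are invented names that mirror the roles of CONFIG_ATOMIC128, CONFIG_CMPXCHG128 and HAVE_CMPXCHG128, and the default values are arbitrary. The point it tries to show is the one from the commit message: a macro that is always defined to 0 or 1 can be tested with #if, so a translation unit that forgets the defining header trips GCC's -Wundef, whereas an #ifdef test would silently take the "not available" branch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical build-time inputs; a real build system would set these. */
#ifndef DEMO_CONFIG_ATOMIC128
#define DEMO_CONFIG_ATOMIC128 0
#endif
#ifndef DEMO_CONFIG_CMPXCHG128
#define DEMO_CONFIG_CMPXCHG128 1
#endif

/* Always define the derived macro, to 0 or 1, never leave it undefined. */
#if DEMO_CONFIG_ATOMIC128
# define DEMO_HAVE_CMPXCHG128 1
# define DEMO_CMPXCHG128_PATH "atomic"
#elif DEMO_CONFIG_CMPXCHG128
# define DEMO_HAVE_CMPXCHG128 1
# define DEMO_CMPXCHG128_PATH "sync"
#else
# define DEMO_HAVE_CMPXCHG128 0
# define DEMO_CMPXCHG128_PATH "none"
#endif

/* Report which implementation the selection above picked. */
static const char *demo_cmpxchg128_path(void)
{
#if DEMO_HAVE_CMPXCHG128
    return DEMO_CMPXCHG128_PATH;
#else
    return "none";
#endif
}
```

With the defaults chosen here the second branch is taken, so both the "atomic" and "sync" paths end up setting the derived macro to 1, which matches the third bullet of the list.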
Re: [PATCH V3] migration: simplify notifiers
On 6/7/23 09:42, Steve Sistare wrote: Pass the callback function to add_migration_state_change_notifier so that migration can initialize the notifier on add and clear it on delete, which simplifies the call sites. Shorten the function names so the extra arg can be added more legibly. Hide the global notifier list in a new function migration_call_notifiers, and make it externally visible so future live update code can call it. Tested-by: Michael Galaxy Reviewed-by: Michael Galaxy No functional change. Signed-off-by: Steve Sistare --- hw/net/virtio-net.c | 6 +++--- hw/vfio/migration.c | 8 include/migration/misc.h | 6 -- migration/migration.c| 22 -- net/vhost-vdpa.c | 7 --- ui/spice-core.c | 3 +-- 6 files changed, 32 insertions(+), 20 deletions(-) diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index 6df6b73..c4dc795 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -3605,8 +3605,8 @@ static void virtio_net_device_realize(DeviceState *dev, Error **errp) n->primary_listener.hide_device = failover_hide_primary_device; qatomic_set(>failover_primary_hidden, true); device_listener_register(>primary_listener); -n->migration_state.notify = virtio_net_migration_state_notifier; -add_migration_state_change_notifier(>migration_state); +migration_add_notifier(>migration_state, + virtio_net_migration_state_notifier); n->host_features |= (1ULL << VIRTIO_NET_F_STANDBY); } @@ -3769,7 +3769,7 @@ static void virtio_net_device_unrealize(DeviceState *dev) if (n->failover) { qobject_unref(n->primary_opts); device_listener_unregister(>primary_listener); -remove_migration_state_change_notifier(>migration_state); +migration_remove_notifier(>migration_state); } else { assert(n->primary_opts == NULL); } diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c index c4656bb..8af0294 100644 --- a/hw/vfio/migration.c +++ b/hw/vfio/migration.c @@ -619,9 +619,9 @@ static int vfio_migration_init(VFIODevice *vbasedev) migration->vm_state = 
qdev_add_vm_change_state_handler(vbasedev->dev, vfio_vmstate_change, vbasedev); -migration->migration_state.notify = vfio_migration_state_notifier; -add_migration_state_change_notifier(>migration_state); - +migration_add_notifier(>migration_state, + vfio_migration_state_notifier); + return 0; } @@ -670,7 +670,7 @@ void vfio_migration_exit(VFIODevice *vbasedev) if (vbasedev->migration) { VFIOMigration *migration = vbasedev->migration; -remove_migration_state_change_notifier(>migration_state); +migration_remove_notifier(>migration_state); qemu_del_vm_change_state_handler(migration->vm_state); unregister_savevm(VMSTATE_IF(vbasedev->dev), "vfio", vbasedev); vfio_migration_free(vbasedev); diff --git a/include/migration/misc.h b/include/migration/misc.h index 5ebe13b..0987eb1 100644 --- a/include/migration/misc.h +++ b/include/migration/misc.h @@ -59,8 +59,10 @@ void migration_object_init(void); void migration_shutdown(void); bool migration_is_idle(void); bool migration_is_active(MigrationState *); -void add_migration_state_change_notifier(Notifier *notify); -void remove_migration_state_change_notifier(Notifier *notify); +void migration_add_notifier(Notifier *notify, +void (*func)(Notifier *notifier, void *data)); +void migration_remove_notifier(Notifier *notify); +void migration_call_notifiers(MigrationState *s); bool migration_in_setup(MigrationState *); bool migration_has_finished(MigrationState *); bool migration_has_failed(MigrationState *); diff --git a/migration/migration.c b/migration/migration.c index 5103e2f..17b4b47 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -1178,7 +1178,7 @@ static void migrate_fd_cleanup(MigrationState *s) /* It is used on info migrate. 
We can't free it */ error_report_err(error_copy(s->error)); } -notifier_list_notify(_state_notifiers, s); +migration_call_notifiers(s); block_cleanup_parameters(); yank_unregister_instance(MIGRATION_YANK_INSTANCE); } @@ -1273,14 +1273,24 @@ static void migrate_fd_cancel(MigrationState *s) } } -void add_migration_state_change_notifier(Notifier *notify) +void migration_add_notifier(Notifier *notify, +void (*func)(Notifier *notifier, void *data)) { +notify->notify = func; notifier_list_add(_state_notifiers, notify); } -void remove_migration_state_change_notifier(Notifier *notify) +void migration_remove_notifier(Notifier *notify) { -notifier_remove(notify); +
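A minimal sketch of the pattern the commit message describes: the callback is set on add and cleared on delete, so call sites no longer touch notify->notify themselves. The Notifier type, the singly linked list, and demo_cb below are hypothetical stand-ins, not QEMU's NotifierList. The body of migration_remove_notifier is an assumption based on the commit message, since the hunk is cut off in the archived text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's Notifier; the real type differs. */
typedef struct Notifier Notifier;
struct Notifier {
    void (*notify)(Notifier *notifier, void *data);
    Notifier *next;
};

/* The list stays private to this file, as the patch intends. */
static Notifier *state_notifiers;

void migration_add_notifier(Notifier *n,
                            void (*func)(Notifier *notifier, void *data))
{
    assert(func != NULL);
    n->notify = func;               /* callback is set on add */
    n->next = state_notifiers;
    state_notifiers = n;
}

/* Assumed behaviour: unlink, then clear the callback. */
void migration_remove_notifier(Notifier *n)
{
    if (!n->notify) {
        return;                     /* never added, or already removed */
    }
    for (Notifier **p = &state_notifiers; *p; p = &(*p)->next) {
        if (*p == n) {
            *p = n->next;
            break;
        }
    }
    n->notify = NULL;               /* callback is cleared on delete */
    n->next = NULL;
}

void migration_call_notifiers(void *state)
{
    Notifier *next;

    for (Notifier *n = state_notifiers; n; n = next) {
        next = n->next;     /* saved first, so a callback may remove itself */
        n->notify(n, state);
    }
}

/* Example callback (hypothetical): counts how often it was invoked. */
static int demo_calls;

static void demo_cb(Notifier *notifier, void *data)
{
    (void)notifier;
    (void)data;
    demo_calls++;
}
```

With this shape a second migration_remove_notifier() on the same notifier is a harmless no-op, which is the kind of call-site simplification the patch is after.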
Re: [PATCH for-8.2 v2 5/7] target/riscv/cpu.c: add a ADD_CPU_PROPERTIES_ARRAY() macro
On 7/13/23 17:40, Richard Henderson wrote:
> On 7/12/23 21:57, Daniel Henrique Barboza wrote:
> > +#define ADD_CPU_PROPERTIES_ARRAY(_dev, _array) \
> > +    for (prop = _array; prop && prop->name; prop++) { \
> > +        qdev_property_add_static(_dev, prop); \
> > +    } \
>
> do { } while(0)
>
> Watch the \ on the last line of the macro.
>
> Declare the iterator within the macro, rather than use one defined in
> the outer scope.

Like this?

#define ADD_CPU_PROPERTIES_ARRAY(_dev, _array) \
    do { \
        Property *prop; \
        for (prop = _array; prop && prop->name; prop++) { \
            qdev_property_add_static(_dev, prop); \
        } \
    } while(0)

> Why not use ARRAY_SIZE?

Hm, the arrays are finishing with DEFINE_PROP_END_OF_LIST() (I copied the
existing array structure), which adds an empty element, so ARRAY_SIZE
will get empty stuff in the end.

Since these are new arrays I can get rid of the end_of_list blank element
and use ARRAY_SIZE().

Daniel

> r~
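For comparison, the two variants discussed in this reply could look roughly like the sketch below. Property, the qdev_property_add_static() stub, the macro name ADD_CPU_PROPERTIES_SENTINEL, and both arrays are hypothetical stand-ins chosen for illustration, and QEMU's real types are richer. The do { } while (0) wrapper lets either macro be used as a single statement, and the iterator lives inside the macro.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not QEMU's real Property or qdev API. */
typedef struct Property {
    const char *name;
} Property;

static int props_added;

static void qdev_property_add_static(void *dev, const Property *prop)
{
    (void)dev;
    assert(prop->name != NULL);     /* a terminator must never get here */
    props_added++;
}

#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* Sentinel-terminated variant: stops at the first entry without a name. */
#define ADD_CPU_PROPERTIES_SENTINEL(_dev, _array)                     \
    do {                                                              \
        for (const Property *prop_ = (_array);                        \
             prop_ && prop_->name; prop_++) {                         \
            qdev_property_add_static(_dev, prop_);                    \
        }                                                             \
    } while (0)

/* ARRAY_SIZE variant: needs a true array and no terminator element. */
#define ADD_CPU_PROPERTIES_ARRAY(_dev, _array)                        \
    do {                                                              \
        for (size_t i_ = 0; i_ < ARRAY_SIZE(_array); i_++) {          \
            qdev_property_add_static(_dev, &(_array)[i_]);            \
        }                                                             \
    } while (0)

static const Property with_sentinel[] = {
    { "prop-a" }, { "prop-b" }, { NULL },
};

static const Property without_sentinel[] = {
    { "prop-a" }, { "prop-b" },
};
```

The ARRAY_SIZE form only works when the macro sees the array itself; once the array has decayed to a pointer, sizeof no longer gives the element count, which is one reason sentinel-terminated lists are common.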
Re: [PATCH V4 0/2] migration file URI
Tested-by: Michael Galaxy
Reviewed-by: Michael Galaxy

On 6/30/23 09:25, Steve Sistare wrote:
> Add the migration URI "file:filename[,offset=offset]".
>
> Fabiano Rosas has submitted the unit tests in the series
>   migration: Test the new "file:" migration
>
> Steve Sistare (2):
>   migration: file URI
>   migration: file URI offset
>
>  migration/file.c       | 103 +
>  migration/file.h       |  14 +++
>  migration/meson.build  |   1 +
>  migration/migration.c  |   5 +++
>  migration/trace-events |   4 ++
>  qemu-options.hx        |   7 +++-
>  6 files changed, 133 insertions(+), 1 deletion(-)
>  create mode 100644 migration/file.c
>  create mode 100644 migration/file.h
Re: [PATCH V4] migration: simplify blockers
On 7/7/23 15:20, Steve Sistare wrote: Modify migrate_add_blocker and migrate_del_blocker to take an Error ** reason. This allows migration to own the Error object, so that if an error occurs, migration code can free the Error and clear the client handle, simplifying client code. This is also a pre-requisite for future patches that will add a mode argument to migration requests to support live update, and will maintain a list of blockers for each mode. A blocker may apply to a single mode or to multiple modes, and passing Error** will allow one Error object to be registered for multiple modes. No functional change. Tested-by: Michael Galaxy Reviewed-by: Michael Galaxy Signed-off-by: Steve Sistare --- backends/tpm/tpm_emulator.c | 10 ++ block/parallels.c| 6 ++ block/qcow.c | 6 ++ block/vdi.c | 6 ++ block/vhdx.c | 6 ++ block/vmdk.c | 6 ++ block/vpc.c | 6 ++ block/vvfat.c| 6 ++ dump/dump.c | 4 ++-- hw/9pfs/9p.c | 10 ++ hw/display/virtio-gpu-base.c | 8 ++-- hw/intc/arm_gic_kvm.c| 3 +-- hw/intc/arm_gicv3_its_kvm.c | 3 +-- hw/intc/arm_gicv3_kvm.c | 3 +-- hw/misc/ivshmem.c| 8 ++-- hw/ppc/pef.c | 2 +- hw/ppc/spapr.c | 2 +- hw/ppc/spapr_events.c| 2 +- hw/ppc/spapr_rtas.c | 2 +- hw/remote/proxy.c| 7 ++- hw/s390x/s390-virtio-ccw.c | 9 +++-- hw/scsi/vhost-scsi.c | 8 +++- hw/vfio/common.c | 26 -- hw/vfio/migration.c | 16 ++-- hw/virtio/vhost.c| 8 ++-- include/migration/blocker.h | 24 +--- migration/migration.c| 22 ++ stubs/migr-blocker.c | 4 ++-- target/i386/kvm/kvm.c| 8 target/i386/nvmm/nvmm-all.c | 3 +-- target/i386/sev.c| 2 +- target/i386/whpx/whpx-all.c | 3 +-- ui/vdagent.c | 5 ++--- 33 files changed, 89 insertions(+), 155 deletions(-) diff --git a/backends/tpm/tpm_emulator.c b/backends/tpm/tpm_emulator.c index 402a2d6..bf1a90f 100644 --- a/backends/tpm/tpm_emulator.c +++ b/backends/tpm/tpm_emulator.c @@ -534,11 +534,8 @@ static int tpm_emulator_block_migration(TPMEmulator *tpm_emu) error_setg(_emu->migration_blocker, "Migration disabled: TPM emulator does not support " 
"migration"); -if (migrate_add_blocker(tpm_emu->migration_blocker, ) < 0) { +if (migrate_add_blocker(_emu->migration_blocker, ) < 0) { error_report_err(err); -error_free(tpm_emu->migration_blocker); -tpm_emu->migration_blocker = NULL; - return -1; } } @@ -1016,10 +1013,7 @@ static void tpm_emulator_inst_finalize(Object *obj) qapi_free_TPMEmulatorOptions(tpm_emu->options); -if (tpm_emu->migration_blocker) { -migrate_del_blocker(tpm_emu->migration_blocker); -error_free(tpm_emu->migration_blocker); -} +migrate_del_blocker(_emu->migration_blocker); tpm_sized_buffer_reset(_blobs->volatil); tpm_sized_buffer_reset(_blobs->permanent); diff --git a/block/parallels.c b/block/parallels.c index 18e34ae..c2d92c4 100644 --- a/block/parallels.c +++ b/block/parallels.c @@ -960,9 +960,8 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags, error_setg(>migration_blocker, "The Parallels format used by node '%s' " "does not support live migration", bdrv_get_device_or_node_name(bs)); -ret = migrate_add_blocker(s->migration_blocker, errp); +ret = migrate_add_blocker(>migration_blocker, errp); if (ret < 0) { -error_free(s->migration_blocker); goto fail; } qemu_co_mutex_init(>lock); @@ -994,8 +993,7 @@ static void parallels_close(BlockDriverState *bs) g_free(s->bat_dirty_bmap); qemu_vfree(s->header); -migrate_del_blocker(s->migration_blocker); -error_free(s->migration_blocker); +migrate_del_blocker(>migration_blocker); } static BlockDriver bdrv_parallels = { diff --git a/block/qcow.c b/block/qcow.c index 577bd70..feedad5 100644 --- a/block/qcow.c +++ b/block/qcow.c @@ -304,9 +304,8 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags, error_setg(>migration_blocker, "The qcow format used by node '%s' " "does not support live migration", bdrv_get_device_or_node_name(bs)); -ret = migrate_add_blocker(s->migration_blocker, errp); +ret = migrate_add_blocker(>migration_blocker, errp); if (ret < 0) { -error_free(s->migration_blocker); goto fail; } @@ 
-796,8 +795,7 @@
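The ownership change described in the commit message can be sketched as below. Error, error_new(), error_free(), the fixed-size blockers table, and the migration_running flag are hypothetical stand-ins for illustration; only the Error ** calling convention is taken from the patch. The point is that the callee frees the reason and clears the caller's handle on failure and on delete, so callers drop their own error_free() and NULL assignments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's Error object. */
typedef struct Error {
    char *msg;
} Error;

static Error *error_new(const char *msg)
{
    Error *e = malloc(sizeof(*e));

    assert(e != NULL);
    e->msg = malloc(strlen(msg) + 1);
    assert(e->msg != NULL);
    strcpy(e->msg, msg);
    return e;
}

static void error_free(Error *e)
{
    if (e) {
        free(e->msg);
        free(e);
    }
}

#define MAX_BLOCKERS 8
static Error *blockers[MAX_BLOCKERS];
static bool migration_running;      /* hypothetical failure condition */

int migrate_add_blocker(Error **reasonp, Error **errp)
{
    if (!migration_running) {
        for (int i = 0; i < MAX_BLOCKERS; i++) {
            if (!blockers[i]) {
                blockers[i] = *reasonp;     /* migration now owns it */
                return 0;
            }
        }
    }
    /* Failure: free the reason and clear the caller's handle. */
    if (errp) {
        *errp = error_new("cannot add migration blocker");
    }
    error_free(*reasonp);
    *reasonp = NULL;
    return -1;
}

void migrate_del_blocker(Error **reasonp)
{
    if (!*reasonp) {
        return;                     /* nothing registered: no-op */
    }
    for (int i = 0; i < MAX_BLOCKERS; i++) {
        if (blockers[i] == *reasonp) {
            blockers[i] = NULL;
        }
    }
    error_free(*reasonp);
    *reasonp = NULL;
}
```

Because migrate_del_blocker() tolerates a NULL handle, teardown code can call it unconditionally, which matches the simplified callers in the diff above.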
[PATCH 09/18] crypto: Add generic 32-bit carry-less multiply routines
Signed-off-by: Richard Henderson
---
 host/include/generic/host/crypto/clmul.h |  4 +++
 include/crypto/clmul.h                   | 23 ++
 crypto/clmul.c                           | 31
 3 files changed, 58 insertions(+)

diff --git a/host/include/generic/host/crypto/clmul.h b/host/include/generic/host/crypto/clmul.h
index cba8bbf3e4..3fbb1576cf 100644
--- a/host/include/generic/host/crypto/clmul.h
+++ b/host/include/generic/host/crypto/clmul.h
@@ -19,4 +19,8 @@
 #define clmul_16x4_even         clmul_16x4_even_gen
 #define clmul_16x4_odd          clmul_16x4_odd_gen
 
+#define clmul_32                clmul_32_gen
+#define clmul_32x2_even         clmul_32x2_even_gen
+#define clmul_32x2_odd          clmul_32x2_odd_gen
+
 #endif /* GENERIC_HOST_CRYPTO_CLMUL_H */
diff --git a/include/crypto/clmul.h b/include/crypto/clmul.h
index b701bac9d6..ce43c9aeb1 100644
--- a/include/crypto/clmul.h
+++ b/include/crypto/clmul.h
@@ -88,6 +88,29 @@ Int128 clmul_16x4_even_gen(Int128, Int128);
  */
 Int128 clmul_16x4_odd_gen(Int128, Int128);
 
+/**
+ * clmul_32:
+ *
+ * Perform a 32x32->64 carry-less multiply.
+ */
+uint64_t clmul_32_gen(uint32_t, uint32_t);
+
+/**
+ * clmul_32x2_even:
+ *
+ * Perform two 32x32->64 carry-less multiplies.
+ * The odd words of the inputs are ignored.
+ */
+Int128 clmul_32x2_even_gen(Int128, Int128);
+
+/**
+ * clmul_32x2_odd:
+ *
+ * Perform two 32x32->64 carry-less multiplies.
+ * The even words of the inputs are ignored.
+ */
+Int128 clmul_32x2_odd_gen(Int128, Int128);
+
 #include "host/crypto/clmul.h"
 
 #endif /* CRYPTO_CLMUL_H */
diff --git a/crypto/clmul.c b/crypto/clmul.c
index 69a3b6f7ff..c197cd5f21 100644
--- a/crypto/clmul.c
+++ b/crypto/clmul.c
@@ -113,3 +113,34 @@ Int128 clmul_16x4_odd_gen(Int128 n, Int128 m)
     rh = clmul_16x2_odd_gen(int128_gethi(n), int128_gethi(m));
     return int128_make128(rl, rh);
 }
+
+uint64_t clmul_32_gen(uint32_t n, uint32_t m32)
+{
+    uint64_t r = 0;
+    uint64_t m = m32;
+
+    for (int i = 0; i < 32; ++i) {
+        r ^= n & 1 ? m : 0;
+        n >>= 1;
+        m <<= 1;
+    }
+    return r;
+}
+
+Int128 clmul_32x2_even_gen(Int128 n, Int128 m)
+{
+    uint64_t rl, rh;
+
+    rl = clmul_32_gen(int128_getlo(n), int128_getlo(m));
+    rh = clmul_32_gen(int128_gethi(n), int128_gethi(m));
+    return int128_make128(rl, rh);
+}
+
+Int128 clmul_32x2_odd_gen(Int128 n, Int128 m)
+{
+    uint64_t rl, rh;
+
+    rl = clmul_32_gen(int128_getlo(n) >> 32, int128_getlo(m) >> 32);
+    rh = clmul_32_gen(int128_gethi(n) >> 32, int128_gethi(m) >> 32);
+    return int128_make128(rl, rh);
+}
-- 
2.34.1
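As a standalone illustration of the 32-bit routines above, the sketch below reuses the shift-and-XOR loop from the patch and replaces QEMU's Int128 with a hypothetical two-word struct (U128) so that it builds outside the tree. The function names drop the _gen suffix, and clmul_32_selfcheck() is only a usage example, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* 32x32->64 carry-less multiply: like long multiplication, with XOR
 * in place of addition, so no carries propagate between bit columns. */
uint64_t clmul_32(uint32_t n, uint32_t m32)
{
    uint64_t r = 0;
    uint64_t m = m32;

    for (int i = 0; i < 32; ++i) {
        r ^= (n & 1) ? m : 0;
        n >>= 1;
        m <<= 1;
    }
    return r;
}

/* Hypothetical stand-in for QEMU's Int128. */
typedef struct {
    uint64_t lo, hi;
} U128;

/* Multiply the even (low) 32-bit word of each 64-bit half. */
U128 clmul_32x2_even(U128 n, U128 m)
{
    U128 r = {
        clmul_32((uint32_t)n.lo, (uint32_t)m.lo),
        clmul_32((uint32_t)n.hi, (uint32_t)m.hi),
    };
    return r;
}

/* Multiply the odd (high) 32-bit word of each 64-bit half. */
U128 clmul_32x2_odd(U128 n, U128 m)
{
    U128 r = {
        clmul_32((uint32_t)(n.lo >> 32), (uint32_t)(m.lo >> 32)),
        clmul_32((uint32_t)(n.hi >> 32), (uint32_t)(m.hi >> 32)),
    };
    return r;
}

void clmul_32_selfcheck(void)
{
    /* (x + 1)^2 = x^2 + 1 over GF(2): the middle term cancels. */
    assert(clmul_32(3, 3) == 5);
    /* Squaring spreads the bits out to the even positions. */
    assert(clmul_32(0xffffffffu, 0xffffffffu) == 0x5555555555555555ull);
}
```

XOR-ing the even and odd results gives the "multiply sum" that the s390x and ppc helpers in this series compute.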
[PATCH 07/18] target/s390x: Use clmul_16* routines
Use generic routines for 16-bit carry-less multiply.
Remove our local version of galois_multiply16.

Signed-off-by: Richard Henderson
---
 target/s390x/tcg/vec_int_helper.c | 22 +++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/target/s390x/tcg/vec_int_helper.c b/target/s390x/tcg/vec_int_helper.c
index e110a7581a..523d6375bb 100644
--- a/target/s390x/tcg/vec_int_helper.c
+++ b/target/s390x/tcg/vec_int_helper.c
@@ -180,7 +180,6 @@ static uint##TBITS##_t galois_multiply##BITS(uint##TBITS##_t a,\
     } \
     return res; \
 }
-DEF_GALOIS_MULTIPLY(16, 32)
 DEF_GALOIS_MULTIPLY(32, 64)
 
 static S390Vector galois_multiply64(uint64_t a, uint64_t b)
@@ -226,6 +225,25 @@ void HELPER(gvec_vgfma8)(void *v1, const void *v2, const void *v3,
     *(Int128 *)v1 = int128_xor(r, *(Int128 *)v4);
 }
 
+static Int128 do_gfm16(Int128 n, Int128 m)
+{
+    Int128 e = clmul_16x4_even(n, m);
+    Int128 o = clmul_16x4_odd(n, m);
+    return int128_xor(e, o);
+}
+
+void HELPER(gvec_vgfm16)(void *v1, const void *v2, const void *v3, uint32_t d)
+{
+    *(Int128 *)v1 = do_gfm16(*(const Int128 *)v2, *(const Int128 *)v3);
+}
+
+void HELPER(gvec_vgfma16)(void *v1, const void *v2, const void *v3,
+                          const void *v4, uint32_t d)
+{
+    Int128 r = do_gfm16(*(const Int128 *)v2, *(const Int128 *)v3);
+    *(Int128 *)v1 = int128_xor(r, *(Int128 *)v4);
+}
+
 #define DEF_VGFM(BITS, TBITS) \
 void HELPER(gvec_vgfm##BITS)(void *v1, const void *v2, const void *v3, \
                              uint32_t desc) \
@@ -243,7 +261,6 @@ void HELPER(gvec_vgfm##BITS)(void *v1, const void *v2, const void *v3, \
         s390_vec_write_element##TBITS(v1, i, d); \
     } \
 }
-DEF_VGFM(16, 32)
 DEF_VGFM(32, 64)
 
 void HELPER(gvec_vgfm64)(void *v1, const void *v2, const void *v3,
@@ -279,7 +296,6 @@ void HELPER(gvec_vgfma##BITS)(void *v1, const void *v2, const void *v3,\
         s390_vec_write_element##TBITS(v1, i, d); \
     } \
 }
-DEF_VGFMA(16, 32)
 DEF_VGFMA(32, 64)
 
 void HELPER(gvec_vgfma64)(void *v1, const void *v2, const void *v3,
-- 
2.34.1
[PATCH 01/18] crypto: Add generic 8-bit carry-less multiply routines
Signed-off-by: Richard Henderson --- host/include/generic/host/crypto/clmul.h | 17 ++ include/crypto/clmul.h | 61 +++ crypto/clmul.c | 76 crypto/meson.build | 9 ++- 4 files changed, 160 insertions(+), 3 deletions(-) create mode 100644 host/include/generic/host/crypto/clmul.h create mode 100644 include/crypto/clmul.h create mode 100644 crypto/clmul.c diff --git a/host/include/generic/host/crypto/clmul.h b/host/include/generic/host/crypto/clmul.h new file mode 100644 index 00..694705f703 --- /dev/null +++ b/host/include/generic/host/crypto/clmul.h @@ -0,0 +1,17 @@ +/* + * No host specific carry-less multiply acceleration. + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#ifndef GENERIC_HOST_CRYPTO_CLMUL_H +#define GENERIC_HOST_CRYPTO_CLMUL_H + +/* Defer everything to the generic routines. */ +#define clmul_8x8_low clmul_8x8_low_gen +#define clmul_8x4_even clmul_8x4_even_gen +#define clmul_8x4_odd clmul_8x4_odd_gen +#define clmul_8x8_even clmul_8x8_even_gen +#define clmul_8x8_odd clmul_8x8_odd_gen +#define clmul_8x8_packedclmul_8x8_packed_gen + +#endif /* GENERIC_HOST_CRYPTO_CLMUL_H */ diff --git a/include/crypto/clmul.h b/include/crypto/clmul.h new file mode 100644 index 00..7f19205d6f --- /dev/null +++ b/include/crypto/clmul.h @@ -0,0 +1,61 @@ +/* + * Carry-less multiply + * SPDX-License-Identifier: GPL-2.0-or-later + * + * Copyright (C) 2023 Linaro, Ltd. + */ + +#ifndef CRYPTO_CLMUL_H +#define CRYPTO_CLMUL_H + +#include "qemu/int128.h" + +/** + * clmul_8x8_low: + * + * Perform eight 8x8->8 carry-less multiplies. + */ +uint64_t clmul_8x8_low_gen(uint64_t, uint64_t); + +/** + * clmul_8x4_even: + * + * Perform four 8x8->16 carry-less multiplies. + * The odd bytes of the inputs are ignored. + */ +uint64_t clmul_8x4_even_gen(uint64_t, uint64_t); + +/** + * clmul_8x4_odd: + * + * Perform four 8x8->16 carry-less multiplies. + * The even bytes of the inputs are ignored. 
+ */ +uint64_t clmul_8x4_odd_gen(uint64_t, uint64_t); + +/** + * clmul_8x8_even: + * + * Perform eight 8x8->16 carry-less multiplies. + * The odd bytes of the inputs are ignored. + */ +Int128 clmul_8x8_even_gen(Int128, Int128); + +/** + * clmul_8x8_odd: + * + * Perform eight 8x8->16 carry-less multiplies. + * The even bytes of the inputs are ignored. + */ +Int128 clmul_8x8_odd_gen(Int128, Int128); + +/** + * clmul_8x8_packed: + * + * Perform eight 8x8->16 carry-less multiplies. + */ +Int128 clmul_8x8_packed_gen(uint64_t, uint64_t); + +#include "host/crypto/clmul.h" + +#endif /* CRYPTO_CLMUL_H */ diff --git a/crypto/clmul.c b/crypto/clmul.c new file mode 100644 index 00..866704e751 --- /dev/null +++ b/crypto/clmul.c @@ -0,0 +1,76 @@ +/* + * No host specific carry-less multiply acceleration. + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#include "qemu/osdep.h" +#include "crypto/clmul.h" + +uint64_t clmul_8x8_low_gen(uint64_t n, uint64_t m) +{ +uint64_t r = 0; + +for (int i = 0; i < 8; ++i) { +uint64_t mask = (n & 0x0101010101010101ull) * 0xff; +r ^= m & mask; +m = (m << 1) & 0xfefefefefefefefeull; +n >>= 1; +} +return r; +} + +uint64_t clmul_8x4_even_gen(uint64_t n, uint64_t m) +{ +uint64_t r = 0; + +n &= 0x00ff00ff00ff00ffull; +m &= 0x00ff00ff00ff00ffull; + +for (int i = 0; i < 8; ++i) { +uint64_t mask = (n & 0x0001000100010001ull) * 0x; +r ^= m & mask; +n >>= 1; +m <<= 1; +} +return r; +} + +uint64_t clmul_8x4_odd_gen(uint64_t n, uint64_t m) +{ +return clmul_8x4_even_gen(n >> 8, m >> 8); +} + +Int128 clmul_8x8_even_gen(Int128 n, Int128 m) +{ +uint64_t rl, rh; + +rl = clmul_8x4_even_gen(int128_getlo(n), int128_getlo(m)); +rh = clmul_8x4_even_gen(int128_gethi(n), int128_gethi(m)); +return int128_make128(rl, rh); +} + +Int128 clmul_8x8_odd_gen(Int128 n, Int128 m) +{ +uint64_t rl, rh; + +rl = clmul_8x4_odd_gen(int128_getlo(n), int128_getlo(m)); +rh = clmul_8x4_odd_gen(int128_gethi(n), int128_gethi(m)); +return int128_make128(rl, rh); +} + +static uint64_t 
unpack_8_to_16(uint64_t x) +{ +return (x & 0x00ff) + | ((x & 0xff00) << 8) + | ((x & 0x00ff) << 16) + | ((x & 0xff00) << 24); +} + +Int128 clmul_8x8_packed_gen(uint64_t n, uint64_t m) +{ +uint64_t rl, rh; + +rl = clmul_8x4_even_gen(unpack_8_to_16(n), unpack_8_to_16(m)); +rh = clmul_8x4_even_gen(unpack_8_to_16(n >> 32), unpack_8_to_16(m >> 32)); +return int128_make128(rl, rh); +} diff --git a/crypto/meson.build b/crypto/meson.build index 5f03a30d34..9ac1a89802 100644 --- a/crypto/meson.build +++ b/crypto/meson.build @@ -48,9 +48,12 @@ if have_afalg endif crypto_ss.add(when: gnutls, if_true: files('tls-cipher-suites.c')) -util_ss.add(files('sm4.c')) -util_ss.add(files('aes.c'))
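To make the byte-lane trick in clmul_8x8_low_gen easier to follow, here is a standalone copy of that loop with comments on what each mask does. The constants are the ones visible in the patch text for this function. The name clmul_8x8_low (without the _gen suffix) and the self-check values are this sketch's own.

```c
#include <assert.h>
#include <stdint.h>

/* Eight independent 8x8->8 carry-less multiplies, one per byte lane. */
uint64_t clmul_8x8_low(uint64_t n, uint64_t m)
{
    uint64_t r = 0;

    for (int i = 0; i < 8; ++i) {
        /*
         * Take bit 0 of every byte of n and smear it across its lane:
         * 0x01 * 0xff = 0xff, and lanes cannot carry into each other.
         */
        uint64_t mask = (n & 0x0101010101010101ull) * 0xff;

        r ^= m & mask;
        /* Shift each lane of m left by one, dropping the bit that
         * would otherwise leak into the neighbouring byte. */
        m = (m << 1) & 0xfefefefefefefefeull;
        /* Bit i of each lane of n reaches position 0 after i shifts;
         * bits arriving from the next lane stay above bit 0 for i < 8. */
        n >>= 1;
    }
    return r;
}

void clmul_8x8_low_selfcheck(void)
{
    /* Per byte: 0x03 (x) 0x03 = 0x05, because (x + 1)^2 = x^2 + 1. */
    assert(clmul_8x8_low(0x0303030303030303ull, 0x0303030303030303ull)
           == 0x0505050505050505ull);
}
```

Only the low 8 bits of each product survive, so 0x80 times 0x02 in a lane yields 0x00 rather than spilling into the next byte.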
[PATCH 13/18] crypto: Add generic 64-bit carry-less multiply routine
Signed-off-by: Richard Henderson
---
 host/include/generic/host/crypto/clmul.h |  2 ++
 include/crypto/clmul.h                   |  7 +++
 crypto/clmul.c                           | 17 +
 3 files changed, 26 insertions(+)

diff --git a/host/include/generic/host/crypto/clmul.h b/host/include/generic/host/crypto/clmul.h
index 3fbb1576cf..7f70afeb57 100644
--- a/host/include/generic/host/crypto/clmul.h
+++ b/host/include/generic/host/crypto/clmul.h
@@ -23,4 +23,6 @@
 #define clmul_32x2_even         clmul_32x2_even_gen
 #define clmul_32x2_odd          clmul_32x2_odd_gen
 
+#define clmul_64                clmul_64_gen
+
 #endif /* GENERIC_HOST_CRYPTO_CLMUL_H */
diff --git a/include/crypto/clmul.h b/include/crypto/clmul.h
index ce43c9aeb1..8b4c263459 100644
--- a/include/crypto/clmul.h
+++ b/include/crypto/clmul.h
@@ -111,6 +111,13 @@ Int128 clmul_32x2_even_gen(Int128, Int128);
  */
 Int128 clmul_32x2_odd_gen(Int128, Int128);
 
+/**
+ * clmul_64:
+ *
+ * Perform a 64x64->128 carry-less multiply.
+ */
+Int128 clmul_64_gen(uint64_t, uint64_t);
+
 #include "host/crypto/clmul.h"
 
 #endif /* CRYPTO_CLMUL_H */
diff --git a/crypto/clmul.c b/crypto/clmul.c
index c197cd5f21..0be06073f0 100644
--- a/crypto/clmul.c
+++ b/crypto/clmul.c
@@ -144,3 +144,20 @@ Int128 clmul_32x2_odd_gen(Int128 n, Int128 m)
     rh = clmul_32_gen(int128_gethi(n) >> 32, int128_gethi(m) >> 32);
     return int128_make128(rl, rh);
 }
+
+Int128 clmul_64_gen(uint64_t n, uint64_t m)
+{
+    uint64_t rl = 0, rh = 0;
+
+    /* Bit 0 can only influence the low 64-bit result. */
+    if (n & 1) {
+        rl = m;
+    }
+
+    for (int i = 1; i < 64; ++i) {
+        uint64_t mask = -((n >> i) & 1);
+        rl ^= (m << i) & mask;
+        rh ^= (m >> (64 - i)) & mask;
+    }
+    return int128_make128(rl, rh);
+}
-- 
2.34.1
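A standalone version of the 64x64->128 routine, for experimenting outside the tree. U128 is a hypothetical stand-in for Int128, and the function name drops the _gen suffix. Handling bit 0 outside the loop also keeps the right-shift count in the range 1..63, since a shift by 64 on a 64-bit type is undefined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's Int128. */
typedef struct {
    uint64_t lo, hi;
} U128;

U128 clmul_64(uint64_t n, uint64_t m)
{
    uint64_t rl = 0, rh = 0;

    /* Bit 0 only affects the low half; treating it here avoids m >> 64. */
    if (n & 1) {
        rl = m;
    }

    for (int i = 1; i < 64; ++i) {
        /* All ones if bit i of n is set, else zero (branch-free select). */
        uint64_t mask = -((n >> i) & 1);

        rl ^= (m << i) & mask;          /* part of m << i below bit 64 */
        rh ^= (m >> (64 - i)) & mask;   /* part that crosses into the top */
    }

    U128 r = { rl, rh };
    return r;
}

void clmul_64_selfcheck(void)
{
    U128 r = clmul_64(3, 3);            /* (x + 1)^2 = x^2 + 1 */

    assert(r.lo == 5 && r.hi == 0);
}
```

The mask trick relies on unsigned negation: -(uint64_t)1 is all ones and -(uint64_t)0 is zero, so no branch is needed per bit.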
[PATCH 17/18] host/include/i386: Implement clmul.h
Detect PCLMUL in cpuinfo; implement the accel hooks. Signed-off-by: Richard Henderson --- host/include/i386/host/cpuinfo.h| 1 + host/include/i386/host/crypto/clmul.h | 187 host/include/x86_64/host/crypto/clmul.h | 1 + util/cpuinfo-i386.c | 1 + 4 files changed, 190 insertions(+) create mode 100644 host/include/i386/host/crypto/clmul.h create mode 100644 host/include/x86_64/host/crypto/clmul.h diff --git a/host/include/i386/host/cpuinfo.h b/host/include/i386/host/cpuinfo.h index 073d0a426f..7ae21568f7 100644 --- a/host/include/i386/host/cpuinfo.h +++ b/host/include/i386/host/cpuinfo.h @@ -27,6 +27,7 @@ #define CPUINFO_ATOMIC_VMOVDQA (1u << 16) #define CPUINFO_ATOMIC_VMOVDQU (1u << 17) #define CPUINFO_AES (1u << 18) +#define CPUINFO_PCLMUL (1u << 19) /* Initialized with a constructor. */ extern unsigned cpuinfo; diff --git a/host/include/i386/host/crypto/clmul.h b/host/include/i386/host/crypto/clmul.h new file mode 100644 index 00..0877d65ab6 --- /dev/null +++ b/host/include/i386/host/crypto/clmul.h @@ -0,0 +1,187 @@ +/* + * x86 specific clmul acceleration. 
+ * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#ifndef X86_HOST_CRYPTO_CLMUL_H +#define X86_HOST_CRYPTO_CLMUL_H + +#include "host/cpuinfo.h" +#include + +#if defined(__PCLMUL__) +# define HAVE_CLMUL_ACCEL true +# define ATTR_CLMUL_ACCEL +#else +# define HAVE_CLMUL_ACCEL likely(cpuinfo & CPUINFO_PCLMUL) +# define ATTR_CLMUL_ACCEL __attribute__((target("pclmul"))) +#endif + +static inline Int128 ATTR_CLMUL_ACCEL +clmul_64(uint64_t n, uint64_t m) +{ +union { __m128i v; Int128 s; } u; + +if (!HAVE_CLMUL_ACCEL) { +return clmul_64_gen(n, m); +} + +u.v = _mm_clmulepi64_si128(_mm_set_epi64x(0, n), _mm_set_epi64x(0, m), 0); +return u.s; +} + +static inline uint64_t ATTR_CLMUL_ACCEL +clmul_32(uint32_t n, uint32_t m) +{ +__m128i r; + +if (!HAVE_CLMUL_ACCEL) { +return clmul_32_gen(n, m); +} + +r = _mm_clmulepi64_si128(_mm_cvtsi32_si128(n), _mm_cvtsi32_si128(m), 0); +return ((__v2di)r)[0]; +} + +static inline Int128 ATTR_CLMUL_ACCEL +clmul_32x2_even(Int128 n, Int128 m) +{ +union { __m128i v; Int128 s; } ur, un, um; +__m128i n02, m02, r0, r2; + +if (!HAVE_CLMUL_ACCEL) { +return clmul_32x2_even_gen(n, m); +} + +un.s = n; +um.s = m; +n02 = _mm_slli_epi64(un.v, 32); +m02 = _mm_slli_epi64(um.v, 32); +r0 = _mm_clmulepi64_si128(n02, m02, 0x00); +r2 = _mm_clmulepi64_si128(n02, m02, 0x11); +ur.v = _mm_unpackhi_epi64(r0, r2); +return ur.s; +} + +static inline Int128 ATTR_CLMUL_ACCEL +clmul_32x2_odd(Int128 n, Int128 m) +{ +union { __m128i v; Int128 s; } ur, un, um; +__m128i n13, m13, r1, r3; + +if (!HAVE_CLMUL_ACCEL) { +return clmul_32x2_odd_gen(n, m); +} + +un.s = n; +um.s = m; +n13 = _mm_srli_epi64(un.v, 32); +m13 = _mm_srli_epi64(um.v, 32); +r1 = _mm_clmulepi64_si128(n13, m13, 0x00); +r3 = _mm_clmulepi64_si128(n13, m13, 0x11); +ur.v = _mm_unpacklo_epi64(r1, r3); +return ur.s; +} + +static inline uint64_t ATTR_CLMUL_ACCEL +clmul_16x2_even(uint64_t n, uint64_t m) +{ +__m128i r0, r2; + +if (!HAVE_CLMUL_ACCEL) { +return clmul_16x2_even_gen(n, m); +} + +r0 = 
_mm_clmulepi64_si128(_mm_cvtsi32_si128(n & 0x), + _mm_cvtsi32_si128(m & 0x), 0); +r2 = _mm_clmulepi64_si128(_mm_cvtsi32_si128((n >> 32) & 0x), + _mm_cvtsi32_si128((m >> 32) & 0x), 0); +r0 = _mm_unpacklo_epi32(r0, r2); +return ((__v2di)r0)[0]; +} + +static inline uint64_t ATTR_CLMUL_ACCEL +clmul_16x2_odd(uint64_t n, uint64_t m) +{ +__m128i r1, r3; + +if (!HAVE_CLMUL_ACCEL) { +return clmul_16x2_odd_gen(n, m); +} + +r1 = _mm_clmulepi64_si128(_mm_cvtsi32_si128((n >> 16) & 0x), + _mm_cvtsi32_si128((m >> 16) & 0x), 0); +r3 = _mm_clmulepi64_si128(_mm_cvtsi32_si128((n >> 48) & 0x), + _mm_cvtsi32_si128((m >> 48) & 0x), 0); +r1 = _mm_unpacklo_epi32(r1, r3); +return ((__v2di)r1)[0]; +} + +static inline Int128 ATTR_CLMUL_ACCEL +clmul_16x4_even(Int128 n, Int128 m) +{ +union { __m128i v; Int128 s; } ur, un, um; +__m128i mask = _mm_set_epi16(0, 0, 0, -1, 0, 0, 0, -1); +__m128i n04, m04, n26, m26, r0, r2, r4, r6; + +if (!HAVE_CLMUL_ACCEL) { +return clmul_16x4_even_gen(n, m); +} + +un.s = n; +um.s = m; +n04 = _mm_and_si128(un.v, mask); +m04 = _mm_and_si128(um.v, mask); +r0 = _mm_clmulepi64_si128(n04, m04, 0x00); +r4 = _mm_clmulepi64_si128(n04, m04, 0x11); +n26 = _mm_and_si128(_mm_srli_epi64(un.v, 32), mask); +m26 = _mm_and_si128(_mm_srli_epi64(um.v, 32), mask); +r2 = _mm_clmulepi64_si128(n26, m26, 0x00); +r6 = _mm_clmulepi64_si128(n26, m26, 0x11); + +r0 =
[PATCH 11/18] target/s390x: Use clmul_32* routines
Use generic routines for 32-bit carry-less multiply. Remove our local version of galois_multiply32. Signed-off-by: Richard Henderson --- target/s390x/tcg/vec_int_helper.c | 70 --- 1 file changed, 17 insertions(+), 53 deletions(-) diff --git a/target/s390x/tcg/vec_int_helper.c b/target/s390x/tcg/vec_int_helper.c index 523d6375bb..f5eea2330a 100644 --- a/target/s390x/tcg/vec_int_helper.c +++ b/target/s390x/tcg/vec_int_helper.c @@ -165,22 +165,6 @@ DEF_VCTZ(8) DEF_VCTZ(16) /* like binary multiplication, but XOR instead of addition */ -#define DEF_GALOIS_MULTIPLY(BITS, TBITS) \ -static uint##TBITS##_t galois_multiply##BITS(uint##TBITS##_t a, \ - uint##TBITS##_t b) \ -{ \ -uint##TBITS##_t res = 0; \ - \ -while (b) { \ -if (b & 0x1) { \ -res = res ^ a; \ -} \ -a = a << 1; \ -b = b >> 1; \ -} \ -return res; \ -} -DEF_GALOIS_MULTIPLY(32, 64) static S390Vector galois_multiply64(uint64_t a, uint64_t b) { @@ -244,24 +228,24 @@ void HELPER(gvec_vgfma16)(void *v1, const void *v2, const void *v3, *(Int128 *)v1 = int128_xor(r, *(Int128 *)v4); } -#define DEF_VGFM(BITS, TBITS) \ -void HELPER(gvec_vgfm##BITS)(void *v1, const void *v2, const void *v3, \ - uint32_t desc) \ -{ \ -int i; \ - \ -for (i = 0; i < (128 / TBITS); i++) { \ -uint##BITS##_t a = s390_vec_read_element##BITS(v2, i * 2); \ -uint##BITS##_t b = s390_vec_read_element##BITS(v3, i * 2); \ -uint##TBITS##_t d = galois_multiply##BITS(a, b); \ - \ -a = s390_vec_read_element##BITS(v2, i * 2 + 1); \ -b = s390_vec_read_element##BITS(v3, i * 2 + 1); \ -d = d ^ galois_multiply32(a, b); \ -s390_vec_write_element##TBITS(v1, i, d); \ -} \ +static Int128 do_gfm32(Int128 n, Int128 m) +{ +Int128 e = clmul_32x2_even(n, m); +Int128 o = clmul_32x2_odd(n, m); +return int128_xor(e, o); +} + +void HELPER(gvec_vgfm32)(void *v1, const void *v2, const void *v3, uint32_t d) +{ +*(Int128 *)v1 = do_gfm32(*(const Int128 *)v2, *(const Int128 *)v3); +} + +void HELPER(gvec_vgfma32)(void *v1, const void *v2, const void *v3, + const void *v4, uint32_t 
d) +{ +Int128 r = do_gfm32(*(const Int128 *)v2, *(const Int128 *)v3); +*(Int128 *)v1 = int128_xor(r, *(Int128 *)v4); } -DEF_VGFM(32, 64) void HELPER(gvec_vgfm64)(void *v1, const void *v2, const void *v3, uint32_t desc) @@ -278,26 +262,6 @@ void HELPER(gvec_vgfm64)(void *v1, const void *v2, const void *v3, s390_vec_xor(v1, , ); } -#define DEF_VGFMA(BITS, TBITS) \ -void HELPER(gvec_vgfma##BITS)(void *v1, const void *v2, const void *v3, \ - const void *v4, uint32_t desc) \ -{ \ -int i; \ - \ -for (i = 0; i < (128 / TBITS); i++) { \ -uint##BITS##_t a = s390_vec_read_element##BITS(v2, i * 2); \ -uint##BITS##_t b = s390_vec_read_element##BITS(v3, i * 2); \ -uint##TBITS##_t d = galois_multiply##BITS(a, b); \ -
[PATCH 12/18] target/ppc: Use clmul_32* routines
Use generic routines for 32-bit carry-less multiply.

Signed-off-by: Richard Henderson
---
 target/ppc/int_helper.c | 27 +++
 1 file changed, 7 insertions(+), 20 deletions(-)

diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 98d6310f59..828f04bce7 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1444,28 +1444,15 @@ void helper_vpmsumh(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
     r->s128 = int128_xor(e, o);
 }
 
-#define PMSUM(name, srcfld, trgfld, trgtyp)                   \
-void helper_##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)  \
-{                                                             \
-    int i, j;                                                 \
-    trgtyp prod[sizeof(ppc_avr_t) / sizeof(a->srcfld[0])];    \
-                                                              \
-    VECTOR_FOR_INORDER_I(i, srcfld) {                         \
-        prod[i] = 0;                                          \
-        for (j = 0; j < sizeof(a->srcfld[0]) * 8; j++) {      \
-            if (a->srcfld[i] & (1ull << j)) {                 \
-                prod[i] ^= ((trgtyp)b->srcfld[i] << j);       \
-            }                                                 \
-        }                                                     \
-    }                                                         \
-                                                              \
-    VECTOR_FOR_INORDER_I(i, trgfld) {                         \
-        r->trgfld[i] = prod[2 * i] ^ prod[2 * i + 1];         \
-    }                                                         \
+void helper_vpmsumw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+{
+    Int128 ia = a->s128;
+    Int128 ib = b->s128;
+    Int128 e = clmul_32x2_even(ia, ib);
+    Int128 o = clmul_32x2_odd(ia, ib);
+    r->s128 = int128_xor(e, o);
 }
 
-PMSUM(vpmsumw, u32, u64, uint64_t)
-
 void helper_VPMSUMD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
 {
     int i, j;
-- 
2.34.1
[PATCH 04/18] target/ppc: Use clmul_8* routines
Use generic routines for 8-bit carry-less multiply.

Signed-off-by: Richard Henderson
---
 target/ppc/int_helper.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 834da80fe3..3bf0f5dbe5 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -26,6 +26,7 @@
 #include "exec/helper-proto.h"
 #include "crypto/aes.h"
 #include "crypto/aes-round.h"
+#include "crypto/clmul.h"
 #include "fpu/softfloat.h"
 #include "qapi/error.h"
 #include "qemu/guest-random.h"
@@ -1425,6 +1426,15 @@ void helper_vbpermq(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
 #undef VBPERMQ_INDEX
 #undef VBPERMQ_DW
 
+void helper_vpmsumb(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+{
+    Int128 ia = a->s128;
+    Int128 ib = b->s128;
+    Int128 e = clmul_8x8_even(ia, ib);
+    Int128 o = clmul_8x8_odd(ia, ib);
+    r->s128 = int128_xor(e, o);
+}
+
 #define PMSUM(name, srcfld, trgfld, trgtyp)                   \
 void helper_##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)  \
 {                                                             \
@@ -1445,7 +1455,6 @@ void helper_##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b) \
     }                                                         \
 }
 
-PMSUM(vpmsumb, u8, u16, uint16_t)
 PMSUM(vpmsumh, u16, u32, uint32_t)
 PMSUM(vpmsumw, u32, u64, uint64_t)
 
-- 
2.34.1
[PATCH 08/18] target/ppc: Use clmul_16* routines
Use generic routines for 16-bit carry-less multiply.

Signed-off-by: Richard Henderson
---
 target/ppc/int_helper.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 3bf0f5dbe5..98d6310f59 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1435,6 +1435,15 @@ void helper_vpmsumb(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
     r->s128 = int128_xor(e, o);
 }
 
+void helper_vpmsumh(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+{
+    Int128 ia = a->s128;
+    Int128 ib = b->s128;
+    Int128 e = clmul_16x4_even(ia, ib);
+    Int128 o = clmul_16x4_odd(ia, ib);
+    r->s128 = int128_xor(e, o);
+}
+
 #define PMSUM(name, srcfld, trgfld, trgtyp)                   \
 void helper_##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)  \
 {                                                             \
@@ -1455,7 +1464,6 @@ void helper_##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b) \
     }                                                         \
 }
 
-PMSUM(vpmsumh, u16, u32, uint32_t)
 PMSUM(vpmsumw, u32, u64, uint64_t)
 
 void helper_VPMSUMD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
-- 
2.34.1
[PATCH 06/18] target/arm: Use clmul_16* routines
Use generic routines for 16-bit carry-less multiply. Remove our local version of pmull_w. Signed-off-by: Richard Henderson --- target/arm/tcg/vec_internal.h | 6 -- target/arm/tcg/mve_helper.c | 8 ++-- target/arm/tcg/vec_helper.c | 13 - 3 files changed, 2 insertions(+), 25 deletions(-) diff --git a/target/arm/tcg/vec_internal.h b/target/arm/tcg/vec_internal.h index c4afba6d9f..3ca1b94ccf 100644 --- a/target/arm/tcg/vec_internal.h +++ b/target/arm/tcg/vec_internal.h @@ -219,12 +219,6 @@ int16_t do_sqrdmlah_h(int16_t, int16_t, int16_t, bool, bool, uint32_t *); int32_t do_sqrdmlah_s(int32_t, int32_t, int32_t, bool, bool, uint32_t *); int64_t do_sqrdmlah_d(int64_t, int64_t, int64_t, bool, bool); -/* - * 16 x 16 -> 32 vector polynomial multiply where the inputs are - * in the low 16 bits of each 32-bit element - */ -uint64_t pmull_w(uint64_t op1, uint64_t op2); - /** * bfdotadd: * @sum: addend diff --git a/target/arm/tcg/mve_helper.c b/target/arm/tcg/mve_helper.c index 96ddfb4b3a..c666a96ba1 100644 --- a/target/arm/tcg/mve_helper.c +++ b/target/arm/tcg/mve_helper.c @@ -985,14 +985,10 @@ DO_2OP_L(vmulltuw, 1, 4, uint32_t, 8, uint64_t, DO_MUL) * Polynomial multiply. We can always do this generating 64 bits * of the result at a time, so we don't need to use DO_2OP_L. 
*/ -#define VMULLPW_MASK 0xULL -#define DO_VMULLPBW(N, M) pmull_w((N) & VMULLPW_MASK, (M) & VMULLPW_MASK) -#define DO_VMULLPTW(N, M) DO_VMULLPBW((N) >> 16, (M) >> 16) - DO_2OP(vmullpbh, 8, uint64_t, clmul_8x4_even) DO_2OP(vmullpth, 8, uint64_t, clmul_8x4_odd) -DO_2OP(vmullpbw, 8, uint64_t, DO_VMULLPBW) -DO_2OP(vmullptw, 8, uint64_t, DO_VMULLPTW) +DO_2OP(vmullpbw, 8, uint64_t, clmul_16x2_even) +DO_2OP(vmullptw, 8, uint64_t, clmul_16x2_odd) /* * Because the computation type is at least twice as large as required, diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c index 4384b6c188..1b1d5fccbc 100644 --- a/target/arm/tcg/vec_helper.c +++ b/target/arm/tcg/vec_helper.c @@ -2029,19 +2029,6 @@ void HELPER(gvec_pmull_q)(void *vd, void *vn, void *vm, uint32_t desc) clear_tail(d, opr_sz, simd_maxsz(desc)); } -uint64_t pmull_w(uint64_t op1, uint64_t op2) -{ -uint64_t result = 0; -int i; -for (i = 0; i < 16; ++i) { -uint64_t mask = (op1 & 0x00010001ull) * 0x; -result ^= op2 & mask; -op1 >>= 1; -op2 <<= 1; -} -return result; -} - void HELPER(neon_pmull_h)(void *vd, void *vn, void *vm, uint32_t desc) { int hi = simd_data(desc); -- 2.34.1
[PATCH 05/18] crypto: Add generic 16-bit carry-less multiply routines
Signed-off-by: Richard Henderson --- host/include/generic/host/crypto/clmul.h | 5 +++ include/crypto/clmul.h | 32 +++ crypto/clmul.c | 39 3 files changed, 76 insertions(+) diff --git a/host/include/generic/host/crypto/clmul.h b/host/include/generic/host/crypto/clmul.h index 694705f703..cba8bbf3e4 100644 --- a/host/include/generic/host/crypto/clmul.h +++ b/host/include/generic/host/crypto/clmul.h @@ -14,4 +14,9 @@ #define clmul_8x8_odd clmul_8x8_odd_gen #define clmul_8x8_packed clmul_8x8_packed_gen +#define clmul_16x2_even clmul_16x2_even_gen +#define clmul_16x2_odd clmul_16x2_odd_gen +#define clmul_16x4_even clmul_16x4_even_gen +#define clmul_16x4_odd clmul_16x4_odd_gen + #endif /* GENERIC_HOST_CRYPTO_CLMUL_H */ diff --git a/include/crypto/clmul.h b/include/crypto/clmul.h index 7f19205d6f..b701bac9d6 100644 --- a/include/crypto/clmul.h +++ b/include/crypto/clmul.h @@ -56,6 +56,38 @@ Int128 clmul_8x8_odd_gen(Int128, Int128); */ Int128 clmul_8x8_packed_gen(uint64_t, uint64_t); +/** + * clmul_16x2_even: + * + * Perform two 16x16->32 carry-less multiplies. + * The odd words of the inputs are ignored. + */ +uint64_t clmul_16x2_even_gen(uint64_t, uint64_t); + +/** + * clmul_16x2_odd: + * + * Perform two 16x16->32 carry-less multiplies. + * The even words of the inputs are ignored. + */ +uint64_t clmul_16x2_odd_gen(uint64_t, uint64_t); + +/** + * clmul_16x4_even: + * + * Perform four 16x16->32 carry-less multiplies. + * The odd words of the inputs are ignored. + */ +Int128 clmul_16x4_even_gen(Int128, Int128); + +/** + * clmul_16x4_odd: + * + * Perform four 16x16->32 carry-less multiplies. + * The even words of the inputs are ignored. 
+ */ +Int128 clmul_16x4_odd_gen(Int128, Int128); + #include "host/crypto/clmul.h" #endif /* CRYPTO_CLMUL_H */ diff --git a/crypto/clmul.c b/crypto/clmul.c index 866704e751..69a3b6f7ff 100644 --- a/crypto/clmul.c +++ b/crypto/clmul.c @@ -74,3 +74,42 @@ Int128 clmul_8x8_packed_gen(uint64_t n, uint64_t m) rh = clmul_8x4_even_gen(unpack_8_to_16(n >> 32), unpack_8_to_16(m >> 32)); return int128_make128(rl, rh); } + +uint64_t clmul_16x2_even_gen(uint64_t n, uint64_t m) +{ +uint64_t r = 0; + +n &= 0xull; +m &= 0xull; + +for (int i = 0; i < 16; ++i) { +uint64_t mask = (n & 0x00010001ull) * 0xull; +r ^= m & mask; +n >>= 1; +m <<= 1; +} +return r; +} + +uint64_t clmul_16x2_odd_gen(uint64_t n, uint64_t m) +{ +return clmul_16x2_even_gen(n >> 16, m >> 16); +} + +Int128 clmul_16x4_even_gen(Int128 n, Int128 m) +{ +uint64_t rl, rh; + +rl = clmul_16x2_even_gen(int128_getlo(n), int128_getlo(m)); +rh = clmul_16x2_even_gen(int128_gethi(n), int128_gethi(m)); +return int128_make128(rl, rh); +} + +Int128 clmul_16x4_odd_gen(Int128 n, Int128 m) +{ +uint64_t rl, rh; + +rl = clmul_16x2_odd_gen(int128_getlo(n), int128_getlo(m)); +rh = clmul_16x2_odd_gen(int128_gethi(n), int128_gethi(m)); +return int128_make128(rl, rh); +} -- 2.34.1
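As a rough illustration of what the generic 16-bit routine in this patch computes, the sketch below pairs a bit-serial scalar reference with a two-lane packed loop in the style of clmul_16x2_even_gen. The hex masks are garbled in this archived copy of the patch, so the constants used here are a reconstruction and should be read as assumptions. The names ref_clmul16 and packed_clmul_16x2_even are hypothetical and not part of the series.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar reference: one 16x16->32 carry-less multiply. */
static uint32_t ref_clmul16(uint16_t a, uint16_t b)
{
    uint32_t r = 0;

    for (int i = 0; i < 16; ++i) {
        if ((a >> i) & 1) {
            r ^= (uint32_t)b << i;   /* XOR replaces addition: no carries */
        }
    }
    return r;
}

/*
 * Hypothetical packed variant: two 16x16->32 products at once, taking the
 * even 16-bit words of each input.  The mask constants are assumed.
 */
static uint64_t packed_clmul_16x2_even(uint64_t n, uint64_t m)
{
    uint64_t r = 0;

    n &= 0x0000ffff0000ffffull;   /* keep bits 0..15 and 32..47 */
    m &= 0x0000ffff0000ffffull;

    for (int i = 0; i < 16; ++i) {
        /* Spread bit 0 of each 32-bit lane of n across that whole lane. */
        uint64_t mask = (n & 0x0000000100000001ull) * 0xffffffffull;
        r ^= m & mask;
        n >>= 1;
        m <<= 1;
    }
    return r;
}
```

Each 32-bit lane of the result holds one product. The left shifts of m never cross a lane boundary, because a 16-bit operand shifted by at most 15 still fits in 31 bits.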
[PATCH 18/18] host/include/aarch64: Implement clmul.h
Detect PMULL in cpuinfo; implement the accel hooks. Signed-off-by: Richard Henderson --- host/include/aarch64/host/cpuinfo.h | 1 + host/include/aarch64/host/crypto/clmul.h | 230 +++ util/cpuinfo-aarch64.c | 4 +- 3 files changed, 234 insertions(+), 1 deletion(-) create mode 100644 host/include/aarch64/host/crypto/clmul.h diff --git a/host/include/aarch64/host/cpuinfo.h b/host/include/aarch64/host/cpuinfo.h index 05feeb4f43..da268dce13 100644 --- a/host/include/aarch64/host/cpuinfo.h +++ b/host/include/aarch64/host/cpuinfo.h @@ -10,6 +10,7 @@ #define CPUINFO_LSE (1u << 1) #define CPUINFO_LSE2(1u << 2) #define CPUINFO_AES (1u << 3) +#define CPUINFO_PMULL (1u << 4) /* Initialized with a constructor. */ extern unsigned cpuinfo; diff --git a/host/include/aarch64/host/crypto/clmul.h b/host/include/aarch64/host/crypto/clmul.h new file mode 100644 index 00..7fd827898b --- /dev/null +++ b/host/include/aarch64/host/crypto/clmul.h @@ -0,0 +1,230 @@ +/* + * AArch64 specific clmul acceleration. + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#ifndef AARCH64_HOST_CRYPTO_CLMUL_H +#define AARCH64_HOST_CRYPTO_CLMUL_H + +#include "host/cpuinfo.h" +#include + +/* Both FEAT_AES and FEAT_PMULL are covered under the same macro. */ +#ifdef __ARM_FEATURE_AES +# define HAVE_CLMUL_ACCEL true +#else +# define HAVE_CLMUL_ACCEL likely(cpuinfo & CPUINFO_PMULL) +#endif +#if !defined(__ARM_FEATURE_AES) && defined(CONFIG_ARM_AES_BUILTIN) +# define ATTR_CLMUL_ACCEL __attribute__((target("+crypto"))) +#else +# define ATTR_CLMUL_ACCEL +#endif + +/* + * The 8x8->8 pmul and 8x8->16 pmull are available unconditionally. 
+ */ + +static inline uint64_t clmul_8x8_low(uint64_t n, uint64_t m) +{ +return (uint64_t)vmul_p8((poly8x8_t)n, (poly8x8_t)m); +} + +static inline Int128 clmul_8x8_packed(uint64_t n, uint64_t m) +{ +union { poly16x8_t v; Int128 s; } u; +u.v = vmull_p8((poly8x8_t)n, (poly8x8_t)m); +return u.s; +} + +static inline Int128 clmul_8x8_even(Int128 n, Int128 m) +{ +union { uint16x8_t v; Int128 s; } un, um; +uint8x8_t pn, pm; + +un.s = n; +um.s = m; +pn = vmovn_u16(un.v); +pm = vmovn_u16(um.v); +return clmul_8x8_packed((uint64_t)pn, (uint64_t)pm); +} + +static inline Int128 clmul_8x8_odd(Int128 n, Int128 m) +{ +union { uint8x16_t v; Int128 s; } un, um; +uint8x8_t pn, pm; + +un.s = n; +um.s = m; +pn = vqtbl1_u8(un.v, (uint8x8_t){ 1, 3, 5, 7, 9, 11, 13, 15 }); +pm = vqtbl1_u8(um.v, (uint8x8_t){ 1, 3, 5, 7, 9, 11, 13, 15 }); +return clmul_8x8_packed((uint64_t)pn, (uint64_t)pm); +} + +static inline uint64_t clmul_8x4_even(uint64_t n, uint64_t m) +{ +return int128_getlo(clmul_8x8_even(int128_make64(n), int128_make64(m))); +} + +static inline uint64_t clmul_8x4_odd(uint64_t n, uint64_t m) +{ +return int128_getlo(clmul_8x8_odd(int128_make64(n), int128_make64(m))); +} + +static inline Int128 clmul_16x4_packed_accel(uint16x4_t n, uint16x4_t m) +{ +union { uint32x4_t v; Int128 s; } u; +uint32x4_t r0, r1, r2; + +/* + * Considering the per-byte multiplication: + * ab + * cd + *- + * bd << 0 + * bc << 8 + * ad << 8 + * ac<< 16 + * + * We get the ac and bd rows of the result for free from the expanding + * packed multiply. Reverse the two bytes in M, repeat, and we get the + * ad and bc results, but in the wrong column; shift to fix and sum all. 
+ */ +r0 = (uint32x4_t)vmull_p8((poly8x8_t)n, (poly8x8_t)m); +r1 = (uint32x4_t)vmull_p8((poly8x8_t)n, vrev16_p8((poly8x8_t)m)); +r2 = r1 << 8; /* bc */ +r1 = r1 >> 8; /* ad */ +r1 &= (uint32x4_t){ 0x0000, 0x0000, 0x0000, 0x0000 }; +r2 &= (uint32x4_t){ 0x0000, 0x0000, 0x0000, 0x0000 }; +r0 = r0 ^ r1 ^ r2; + +u.v = r0; +return u.s; +} + +static inline Int128 clmul_16x4_even(Int128 n, Int128 m) +{ +union { uint32x4_t v; Int128 s; } um, un; +uint16x4_t pn, pm; + +/* Extract even uint16_t. */ +un.s = n; +um.s = m; +pn = vmovn_u32(un.v); +pm = vmovn_u32(um.v); +return clmul_16x4_packed_accel(pn, pm); +} + +static inline Int128 clmul_16x4_odd(Int128 n, Int128 m) +{ +union { uint8x16_t v; Int128 s; } um, un; +uint16x4_t pn, pm; + +/* Extract odd uint16_t. */ +un.s = n; +um.s = m; +pn = (uint16x4_t)vqtbl1_u8(un.v, (uint8x8_t){ 2, 3, 6, 7, 10, 11, 14, 15 }); +pm = (uint16x4_t)vqtbl1_u8(um.v, (uint8x8_t){ 2, 3, 6, 7, 10, 11, 14, 15 }); +return clmul_16x4_packed_accel(pn, pm); +} + +static inline uint64_t clmul_16x2_even(uint64_t n, uint64_t m) +{ +return int128_getlo(clmul_16x4_even(int128_make64(n), int128_make64(m))); +} + +static inline uint64_t clmul_16x2_odd(uint64_t n, uint64_t m) +{ +return int128_getlo(clmul_16x4_odd(int128_make64(n), int128_make64(m))); +} +
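The per-byte decomposition described in the comment of clmul_16x4_packed_accel can be checked in plain C without NEON. The sketch below builds a 16x16->32 carry-less product from four 8x8->16 partial products (bd, bc, ad, ac) and compares it against a bit-serial reference. All function names are hypothetical, and ref_clmul8 only stands in for what a single pmull lane would produce.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar 8x8->16 carry-less multiply (one pmull lane). */
static uint16_t ref_clmul8(uint8_t a, uint8_t b)
{
    uint16_t r = 0;

    for (int i = 0; i < 8; ++i) {
        if ((a >> i) & 1) {
            r ^= (uint16_t)((uint16_t)b << i);
        }
    }
    return r;
}

/* Hypothetical bit-serial 16x16->32 reference to compare against. */
static uint32_t ref_clmul16_bits(uint16_t n, uint16_t m)
{
    uint32_t r = 0;

    for (int i = 0; i < 16; ++i) {
        if ((n >> i) & 1) {
            r ^= (uint32_t)m << i;
        }
    }
    return r;
}

/* n = a:b and m = c:d as high:low bytes; combine the four partials. */
static uint32_t clmul16_from_bytes(uint16_t n, uint16_t m)
{
    uint8_t a = n >> 8, b = n & 0xff;
    uint8_t c = m >> 8, d = m & 0xff;

    uint32_t bd = ref_clmul8(b, d);   /* column 0  */
    uint32_t bc = ref_clmul8(b, c);   /* column 8  */
    uint32_t ad = ref_clmul8(a, d);   /* column 8  */
    uint32_t ac = ref_clmul8(a, c);   /* column 16 */

    return bd ^ (bc << 8) ^ (ad << 8) ^ (ac << 16);
}
```

In the accelerated version the bd and ac terms come out of the first vmull_p8 already in the right columns, which is why only the byte-swapped product needs shifting and masking.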
[PATCH 14/18] target/arm: Use clmul_64
Use generic routine for 64-bit carry-less multiply.

Signed-off-by: Richard Henderson
---
 target/arm/tcg/vec_helper.c | 22 ++++------------------
 1 file changed, 4 insertions(+), 18 deletions(-)

diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index c81447e674..1a21aff4d9 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -2003,28 +2003,14 @@ void HELPER(gvec_pmul_b)(void *vd, void *vn, void *vm, uint32_t desc)
  */
 void HELPER(gvec_pmull_q)(void *vd, void *vn, void *vm, uint32_t desc)
 {
-    intptr_t i, j, opr_sz = simd_oprsz(desc);
+    intptr_t i, opr_sz = simd_oprsz(desc);
     intptr_t hi = simd_data(desc);
     uint64_t *d = vd, *n = vn, *m = vm;
 
     for (i = 0; i < opr_sz / 8; i += 2) {
-        uint64_t nn = n[i + hi];
-        uint64_t mm = m[i + hi];
-        uint64_t rhi = 0;
-        uint64_t rlo = 0;
-
-        /* Bit 0 can only influence the low 64-bit result.  */
-        if (nn & 1) {
-            rlo = mm;
-        }
-
-        for (j = 1; j < 64; ++j) {
-            uint64_t mask = -((nn >> j) & 1);
-            rlo ^= (mm << j) & mask;
-            rhi ^= (mm >> (64 - j)) & mask;
-        }
-        d[i] = rlo;
-        d[i + 1] = rhi;
+        Int128 r = clmul_64(n[i + hi], m[i + hi]);
+        d[i] = int128_getlo(r);
+        d[i + 1] = int128_gethi(r);
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
-- 
2.34.1
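For readers who want to see the 64x64->128 operation in isolation, here is a minimal sketch modelled on the loop that this patch removes from gvec_pmull_q. It is not the clmul_64 implementation from crypto/clmul.c, which is not shown in this excerpt. U128Sketch is a hypothetical stand-in for QEMU's Int128, and sketch_clmul_64 is an invented name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for Int128: two explicit 64-bit halves. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} U128Sketch;

static U128Sketch sketch_clmul_64(uint64_t n, uint64_t m)
{
    U128Sketch r = { 0, 0 };

    /*
     * Bit 0 can only affect the low half.  Handling it up front also
     * keeps the loop free of an undefined shift by 64.
     */
    if (n & 1) {
        r.lo = m;
    }
    for (int j = 1; j < 64; ++j) {
        uint64_t mask = -((n >> j) & 1);   /* all-ones if bit j of n is set */

        r.lo ^= (m << j) & mask;
        r.hi ^= (m >> (64 - j)) & mask;
    }
    return r;
}
```

Squaring the all-ones polynomial makes a handy sanity check: in GF(2) the cross terms cancel, so only the even bit positions survive in both halves.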
[PATCH 15/18] target/s390x: Use clmul_64
Use the generic routine for 64-bit carry-less multiply. Remove our local version of galois_multiply64. Signed-off-by: Richard Henderson --- target/s390x/tcg/vec_int_helper.c | 62 +++ 1 file changed, 14 insertions(+), 48 deletions(-) diff --git a/target/s390x/tcg/vec_int_helper.c b/target/s390x/tcg/vec_int_helper.c index f5eea2330a..002ba67b11 100644 --- a/target/s390x/tcg/vec_int_helper.c +++ b/target/s390x/tcg/vec_int_helper.c @@ -21,13 +21,6 @@ static bool s390_vec_is_zero(const S390Vector *v) return !v->doubleword[0] && !v->doubleword[1]; } -static void s390_vec_xor(S390Vector *res, const S390Vector *a, - const S390Vector *b) -{ -res->doubleword[0] = a->doubleword[0] ^ b->doubleword[0]; -res->doubleword[1] = a->doubleword[1] ^ b->doubleword[1]; -} - static void s390_vec_and(S390Vector *res, const S390Vector *a, const S390Vector *b) { @@ -166,26 +159,6 @@ DEF_VCTZ(16) /* like binary multiplication, but XOR instead of addition */ -static S390Vector galois_multiply64(uint64_t a, uint64_t b) -{ -S390Vector res = {}; -S390Vector va = { -.doubleword[1] = a, -}; -S390Vector vb = { -.doubleword[1] = b, -}; - -while (!s390_vec_is_zero()) { -if (vb.doubleword[1] & 0x1) { -s390_vec_xor(, , ); -} -s390_vec_shl(, , 1); -s390_vec_shr(, , 1); -} -return res; -} - static Int128 do_gfm8(Int128 n, Int128 m) { Int128 e = clmul_8x8_even(n, m); @@ -247,35 +220,28 @@ void HELPER(gvec_vgfma32)(void *v1, const void *v2, const void *v3, *(Int128 *)v1 = int128_xor(r, *(Int128 *)v4); } +static Int128 do_gfm64(Int128 n, Int128 m) +{ +/* + * The two 64-bit halves are treated identically, + * therefore host ordering does not matter. 
+ */ +Int128 e = clmul_64(int128_getlo(n), int128_getlo(m)); +Int128 o = clmul_64(int128_gethi(n), int128_gethi(m)); +return int128_xor(e, o); +} + void HELPER(gvec_vgfm64)(void *v1, const void *v2, const void *v3, uint32_t desc) { -S390Vector tmp1, tmp2; -uint64_t a, b; - -a = s390_vec_read_element64(v2, 0); -b = s390_vec_read_element64(v3, 0); -tmp1 = galois_multiply64(a, b); -a = s390_vec_read_element64(v2, 1); -b = s390_vec_read_element64(v3, 1); -tmp2 = galois_multiply64(a, b); -s390_vec_xor(v1, , ); +*(Int128 *)v1 = do_gfm64(*(const Int128 *)v2, *(const Int128 *)v3); } void HELPER(gvec_vgfma64)(void *v1, const void *v2, const void *v3, const void *v4, uint32_t desc) { -S390Vector tmp1, tmp2; -uint64_t a, b; - -a = s390_vec_read_element64(v2, 0); -b = s390_vec_read_element64(v3, 0); -tmp1 = galois_multiply64(a, b); -a = s390_vec_read_element64(v2, 1); -b = s390_vec_read_element64(v3, 1); -tmp2 = galois_multiply64(a, b); -s390_vec_xor(, , ); -s390_vec_xor(v1, , v4); +Int128 r = do_gfm64(*(const Int128 *)v2, *(const Int128 *)v3); +*(Int128 *)v1 = int128_xor(r, *(Int128 *)v4); } #define DEF_VMAL(BITS) \ -- 2.34.1
[PATCH 02/18] target/arm: Use clmul_8* routines
Use generic routines for 8-bit carry-less multiply. Remove our local version of pmull_h. Signed-off-by: Richard Henderson --- target/arm/tcg/vec_internal.h | 5 --- target/arm/tcg/mve_helper.c | 8 ++--- target/arm/tcg/vec_helper.c | 63 +++ 3 files changed, 15 insertions(+), 61 deletions(-) diff --git a/target/arm/tcg/vec_internal.h b/target/arm/tcg/vec_internal.h index 1f4ed80ff7..c4afba6d9f 100644 --- a/target/arm/tcg/vec_internal.h +++ b/target/arm/tcg/vec_internal.h @@ -219,11 +219,6 @@ int16_t do_sqrdmlah_h(int16_t, int16_t, int16_t, bool, bool, uint32_t *); int32_t do_sqrdmlah_s(int32_t, int32_t, int32_t, bool, bool, uint32_t *); int64_t do_sqrdmlah_d(int64_t, int64_t, int64_t, bool, bool); -/* - * 8 x 8 -> 16 vector polynomial multiply where the inputs are - * in the low 8 bits of each 16-bit element -*/ -uint64_t pmull_h(uint64_t op1, uint64_t op2); /* * 16 x 16 -> 32 vector polynomial multiply where the inputs are * in the low 16 bits of each 32-bit element diff --git a/target/arm/tcg/mve_helper.c b/target/arm/tcg/mve_helper.c index 403b345ea3..96ddfb4b3a 100644 --- a/target/arm/tcg/mve_helper.c +++ b/target/arm/tcg/mve_helper.c @@ -26,6 +26,7 @@ #include "exec/exec-all.h" #include "tcg/tcg.h" #include "fpu/softfloat.h" +#include "crypto/clmul.h" static uint16_t mve_eci_mask(CPUARMState *env) { @@ -984,15 +985,12 @@ DO_2OP_L(vmulltuw, 1, 4, uint32_t, 8, uint64_t, DO_MUL) * Polynomial multiply. We can always do this generating 64 bits * of the result at a time, so we don't need to use DO_2OP_L. 
*/ -#define VMULLPH_MASK 0x00ff00ff00ff00ffULL #define VMULLPW_MASK 0xULL -#define DO_VMULLPBH(N, M) pmull_h((N) & VMULLPH_MASK, (M) & VMULLPH_MASK) -#define DO_VMULLPTH(N, M) DO_VMULLPBH((N) >> 8, (M) >> 8) #define DO_VMULLPBW(N, M) pmull_w((N) & VMULLPW_MASK, (M) & VMULLPW_MASK) #define DO_VMULLPTW(N, M) DO_VMULLPBW((N) >> 16, (M) >> 16) -DO_2OP(vmullpbh, 8, uint64_t, DO_VMULLPBH) -DO_2OP(vmullpth, 8, uint64_t, DO_VMULLPTH) +DO_2OP(vmullpbh, 8, uint64_t, clmul_8x4_even) +DO_2OP(vmullpth, 8, uint64_t, clmul_8x4_odd) DO_2OP(vmullpbw, 8, uint64_t, DO_VMULLPBW) DO_2OP(vmullptw, 8, uint64_t, DO_VMULLPTW) diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c index f59d3b26ea..4384b6c188 100644 --- a/target/arm/tcg/vec_helper.c +++ b/target/arm/tcg/vec_helper.c @@ -23,6 +23,7 @@ #include "tcg/tcg-gvec-desc.h" #include "fpu/softfloat.h" #include "qemu/int128.h" +#include "crypto/clmul.h" #include "vec_internal.h" /* @@ -1986,21 +1987,11 @@ void HELPER(gvec_ushl_h)(void *vd, void *vn, void *vm, uint32_t desc) */ void HELPER(gvec_pmul_b)(void *vd, void *vn, void *vm, uint32_t desc) { -intptr_t i, j, opr_sz = simd_oprsz(desc); +intptr_t i, opr_sz = simd_oprsz(desc); uint64_t *d = vd, *n = vn, *m = vm; for (i = 0; i < opr_sz / 8; ++i) { -uint64_t nn = n[i]; -uint64_t mm = m[i]; -uint64_t rr = 0; - -for (j = 0; j < 8; ++j) { -uint64_t mask = (nn & 0x0101010101010101ull) * 0xff; -rr ^= mm & mask; -mm = (mm << 1) & 0xfefefefefefefefeull; -nn >>= 1; -} -d[i] = rr; +d[i] = clmul_8x8_low(n[i], m[i]); } clear_tail(d, opr_sz, simd_maxsz(desc)); } @@ -2038,22 +2029,6 @@ void HELPER(gvec_pmull_q)(void *vd, void *vn, void *vm, uint32_t desc) clear_tail(d, opr_sz, simd_maxsz(desc)); } -/* - * 8x8->16 polynomial multiply. - * - * The byte inputs are expanded to (or extracted from) half-words. - * Note that neon and sve2 get the inputs from different positions. - * This allows 4 bytes to be processed in parallel with uint64_t. 
- */ - -static uint64_t expand_byte_to_half(uint64_t x) -{ -return (x & 0x00ff) - | ((x & 0xff00) << 8) - | ((x & 0x00ff) << 16) - | ((x & 0xff00) << 24); -} - uint64_t pmull_w(uint64_t op1, uint64_t op2) { uint64_t result = 0; @@ -2067,30 +2042,14 @@ uint64_t pmull_w(uint64_t op1, uint64_t op2) return result; } -uint64_t pmull_h(uint64_t op1, uint64_t op2) -{ -uint64_t result = 0; -int i; -for (i = 0; i < 8; ++i) { -uint64_t mask = (op1 & 0x0001000100010001ull) * 0x; -result ^= op2 & mask; -op1 >>= 1; -op2 <<= 1; -} -return result; -} - void HELPER(neon_pmull_h)(void *vd, void *vn, void *vm, uint32_t desc) { int hi = simd_data(desc); uint64_t *d = vd, *n = vn, *m = vm; -uint64_t nn = n[hi], mm = m[hi]; - -d[0] = pmull_h(expand_byte_to_half(nn), expand_byte_to_half(mm)); -nn >>= 32; -mm >>= 32; -d[1] = pmull_h(expand_byte_to_half(nn), expand_byte_to_half(mm)); +Int128 r = clmul_8x8_packed(n[hi], m[hi]); +d[0] = int128_getlo(r); +d[1] = int128_gethi(r); clear_tail(d, 16, simd_maxsz(desc)); } @@ -2101,11 +2060,13 @@ void HELPER(sve2_pmull_h)(void *vd, void *vn, void *vm, uint32_t
[RFC PATCH for-8.2 00/18] crypto: Provide clmul.h and host accel
Inspired by Ard Biesheuvel's RFC patches [1] for accelerating carry-less multiply under emulation.

This is less polished than the AES patch set:

(1) Should I split HAVE_CLMUL_ACCEL into per-width HAVE_CLMUL{N}_ACCEL?
    The "_generic" and "_accel" split is different from aes-round.h
    because of the difference in support for different widths, and it
    means that each host accel has more boilerplate.

(2) Should I bother trying to accelerate anything other than 64x64->128?
    That seems to be the one that GSM really wants anyway. I'd keep all
    of the sizes implemented generically, since that centralizes the 3
    target implementations.

(3) The use of Int128 isn't fantastic -- better would be a vector type,
    though that has its own special problems for ppc64le (see the
    endianness hoops within aes-round.h). Perhaps leave things in env
    memory, like I was mostly able to do with AES?

(4) No guest test case(s).

r~

[1] https://patchew.org/QEMU/20230601123332.3297404-1-a...@kernel.org/

Richard Henderson (18):
  crypto: Add generic 8-bit carry-less multiply routines
  target/arm: Use clmul_8* routines
  target/s390x: Use clmul_8* routines
  target/ppc: Use clmul_8* routines
  crypto: Add generic 16-bit carry-less multiply routines
  target/arm: Use clmul_16* routines
  target/s390x: Use clmul_16* routines
  target/ppc: Use clmul_16* routines
  crypto: Add generic 32-bit carry-less multiply routines
  target/arm: Use clmul_32* routines
  target/s390x: Use clmul_32* routines
  target/ppc: Use clmul_32* routines
  crypto: Add generic 64-bit carry-less multiply routine
  target/arm: Use clmul_64
  target/s390x: Use clmul_64
  target/ppc: Use clmul_64
  host/include/i386: Implement clmul.h
  host/include/aarch64: Implement clmul.h

 host/include/aarch64/host/cpuinfo.h      |   1 +
 host/include/aarch64/host/crypto/clmul.h | 230 +++
 host/include/generic/host/crypto/clmul.h |  28 +++
 host/include/i386/host/cpuinfo.h         |   1 +
 host/include/i386/host/crypto/clmul.h    | 187 ++
 host/include/x86_64/host/crypto/clmul.h  |   1 +
 include/crypto/clmul.h                   | 123
 target/arm/tcg/vec_internal.h            |  11 --
 crypto/clmul.c                           | 163
 target/arm/tcg/mve_helper.c              |  16 +-
 target/arm/tcg/vec_helper.c              | 112 ++-
 target/ppc/int_helper.c                  |  63 +++
 target/s390x/tcg/vec_int_helper.c        | 175 +++--
 util/cpuinfo-aarch64.c                   |   4 +-
 util/cpuinfo-i386.c                      |   1 +
 crypto/meson.build                       |   9 +-
 16 files changed, 865 insertions(+), 260 deletions(-)
 create mode 100644 host/include/aarch64/host/crypto/clmul.h
 create mode 100644 host/include/generic/host/crypto/clmul.h
 create mode 100644 host/include/i386/host/crypto/clmul.h
 create mode 100644 host/include/x86_64/host/crypto/clmul.h
 create mode 100644 include/crypto/clmul.h
 create mode 100644 crypto/clmul.c

-- 
2.34.1
[PATCH 16/18] target/ppc: Use clmul_64
Use generic routine for 64-bit carry-less multiply.

Signed-off-by: Richard Henderson
---
 target/ppc/int_helper.c | 17 +++--------------
 1 file changed, 3 insertions(+), 14 deletions(-)

diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 828f04bce7..4e1fa2fd68 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1455,20 +1455,9 @@ void helper_vpmsumw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
 
 void helper_VPMSUMD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
 {
-    int i, j;
-    Int128 tmp, prod[2] = {int128_zero(), int128_zero()};
-
-    for (j = 0; j < 64; j++) {
-        for (i = 0; i < ARRAY_SIZE(r->u64); i++) {
-            if (a->VsrD(i) & (1ull << j)) {
-                tmp = int128_make64(b->VsrD(i));
-                tmp = int128_lshift(tmp, j);
-                prod[i] = int128_xor(prod[i], tmp);
-            }
-        }
-    }
-
-    r->s128 = int128_xor(prod[0], prod[1]);
+    Int128 e = clmul_64(a->u64[0], b->u64[0]);
+    Int128 o = clmul_64(a->u64[1], b->u64[1]);
+    r->s128 = int128_xor(e, o);
 }
 
 #if HOST_BIG_ENDIAN
-- 
2.34.1
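The new helper_VPMSUMD body indexes u64[0] and u64[1] directly instead of going through VsrD(). Presumably this is safe because the two 128-bit products are XORed together, so it does not matter which host-order element is treated as "even". The sketch below illustrates that property with plain C. Pair128, pair_clmul_64 and sketch_vpmsumd are hypothetical names, not QEMU APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical Int128 stand-in. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} Pair128;

/* Bit-serial 64x64->128 carry-less multiply. */
static Pair128 pair_clmul_64(uint64_t n, uint64_t m)
{
    Pair128 r = { 0, 0 };

    for (int j = 0; j < 64; ++j) {
        if ((n >> j) & 1) {
            r.lo ^= m << j;
            if (j != 0) {
                r.hi ^= m >> (64 - j);   /* skip j == 0: shift by 64 is undefined */
            }
        }
    }
    return r;
}

/* Multiply-sum: XOR of the two per-doubleword products. */
static Pair128 sketch_vpmsumd(const uint64_t a[2], const uint64_t b[2])
{
    Pair128 e = pair_clmul_64(a[0], b[0]);
    Pair128 o = pair_clmul_64(a[1], b[1]);
    Pair128 r = { e.lo ^ o.lo, e.hi ^ o.hi };

    return r;
}
```

Swapping the two doublewords of both inputs swaps e and o, and XOR is commutative, so the result is unchanged.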
[PATCH 03/18] target/s390x: Use clmul_8* routines
Use generic routines for 8-bit carry-less multiply. Remove our local version of galois_multiply8. Signed-off-by: Richard Henderson --- target/s390x/tcg/vec_int_helper.c | 27 --- 1 file changed, 24 insertions(+), 3 deletions(-) diff --git a/target/s390x/tcg/vec_int_helper.c b/target/s390x/tcg/vec_int_helper.c index 53ab5c5eb3..e110a7581a 100644 --- a/target/s390x/tcg/vec_int_helper.c +++ b/target/s390x/tcg/vec_int_helper.c @@ -14,6 +14,7 @@ #include "vec.h" #include "exec/helper-proto.h" #include "tcg/tcg-gvec-desc.h" +#include "crypto/clmul.h" static bool s390_vec_is_zero(const S390Vector *v) { @@ -179,7 +180,6 @@ static uint##TBITS##_t galois_multiply##BITS(uint##TBITS##_t a,\ } \ return res; \ } -DEF_GALOIS_MULTIPLY(8, 16) DEF_GALOIS_MULTIPLY(16, 32) DEF_GALOIS_MULTIPLY(32, 64) @@ -203,6 +203,29 @@ static S390Vector galois_multiply64(uint64_t a, uint64_t b) return res; } +static Int128 do_gfm8(Int128 n, Int128 m) +{ +Int128 e = clmul_8x8_even(n, m); +Int128 o = clmul_8x8_odd(n, m); +return int128_xor(e, o); +} + +void HELPER(gvec_vgfm8)(void *v1, const void *v2, const void *v3, uint32_t d) +{ +/* + * There is no carry across the two doublewords, so their order + * does not matter, so we need not care for host endianness. 
+ */ +*(Int128 *)v1 = do_gfm8(*(const Int128 *)v2, *(const Int128 *)v3); +} + +void HELPER(gvec_vgfma8)(void *v1, const void *v2, const void *v3, + const void *v4, uint32_t d) +{ +Int128 r = do_gfm8(*(const Int128 *)v2, *(const Int128 *)v3); +*(Int128 *)v1 = int128_xor(r, *(Int128 *)v4); +} + #define DEF_VGFM(BITS, TBITS) \ void HELPER(gvec_vgfm##BITS)(void *v1, const void *v2, const void *v3, \ uint32_t desc) \ @@ -220,7 +243,6 @@ void HELPER(gvec_vgfm##BITS)(void *v1, const void *v2, const void *v3, \ s390_vec_write_element##TBITS(v1, i, d); \ } \ } -DEF_VGFM(8, 16) DEF_VGFM(16, 32) DEF_VGFM(32, 64) @@ -257,7 +279,6 @@ void HELPER(gvec_vgfma##BITS)(void *v1, const void *v2, const void *v3,\ s390_vec_write_element##TBITS(v1, i, d); \ } \ } -DEF_VGFMA(8, 16) DEF_VGFMA(16, 32) DEF_VGFMA(32, 64) -- 2.34.1
[PATCH 10/18] target/arm: Use clmul_32* routines
Use generic routines for 32-bit carry-less multiply.
Remove our local version of pmull_d.

Signed-off-by: Richard Henderson
---
 target/arm/tcg/vec_helper.c | 14 +-------------
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index 1b1d5fccbc..c81447e674 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -2057,18 +2057,6 @@ void HELPER(sve2_pmull_h)(void *vd, void *vn, void *vm, uint32_t desc)
     }
 }
 
-static uint64_t pmull_d(uint64_t op1, uint64_t op2)
-{
-    uint64_t result = 0;
-    int i;
-
-    for (i = 0; i < 32; ++i) {
-        uint64_t mask = -((op1 >> i) & 1);
-        result ^= (op2 << i) & mask;
-    }
-    return result;
-}
-
 void HELPER(sve2_pmull_d)(void *vd, void *vn, void *vm, uint32_t desc)
 {
     intptr_t sel = H4(simd_data(desc));
@@ -2077,7 +2065,7 @@ void HELPER(sve2_pmull_d)(void *vd, void *vn, void *vm, uint32_t desc)
     uint64_t *d = vd;
 
     for (i = 0; i < opr_sz / 8; ++i) {
-        d[i] = pmull_d(n[2 * i + sel], m[2 * i + sel]);
+        d[i] = clmul_32(n[2 * i + sel], m[2 * i + sel]);
     }
 }
 #endif
-- 
2.34.1
Re: [PATCH for-8.2 v2 5/7] target/riscv/cpu.c: add a ADD_CPU_PROPERTIES_ARRAY() macro
On 7/12/23 21:57, Daniel Henrique Barboza wrote:
+#define ADD_CPU_PROPERTIES_ARRAY(_dev, _array) \
+    for (prop = _array; prop && prop->name; prop++) { \
+        qdev_property_add_static(_dev, prop); \
+    } \

do { } while(0)

Watch the \ on the last line of the macro.

Declare the iterator within the macro, rather than use one defined in the outer scope.

Why not use ARRAY_SIZE?

r~
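One way the review comments could be applied is sketched below: the loop is wrapped in do { } while (0), the iterator is declared inside the macro, and the final line has no trailing backslash. This is only an illustration, not the macro that was eventually merged. SketchProperty, SketchDevice and sketch_property_add_static are hypothetical stand-ins for QEMU's Property, DeviceState and qdev_property_add_static().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real QEMU types carry far more state. */
typedef struct {
    const char *name;
} SketchProperty;

typedef struct {
    int nr_props;
} SketchDevice;

static void sketch_property_add_static(SketchDevice *dev,
                                       const SketchProperty *prop)
{
    (void)prop;
    dev->nr_props++;
}

/*
 * do { ... } while (0) turns the expansion into a single statement, and
 * the iterator p_ is local to the macro rather than taken from the caller.
 */
#define SKETCH_ADD_PROPERTIES_ARRAY(_dev, _array) \
    do { \
        for (const SketchProperty *p_ = (_array); p_ && p_->name; p_++) { \
            sketch_property_add_static((_dev), p_); \
        } \
    } while (0)

static int sketch_count_props(void)
{
    static const SketchProperty props[] = {
        { "ext-a" }, { "ext-b" }, { "ext-c" },
        { NULL },                       /* NULL name terminates the array */
    };
    SketchDevice dev = { 0 };

    /* The single-statement form is safe in an unbraced if/else. */
    if (props[0].name)
        SKETCH_ADD_PROPERTIES_ARRAY(&dev, props);
    else
        dev.nr_props = -1;

    return dev.nr_props;
}
```

A loop bounded by ARRAY_SIZE, as the review also suggests, would remove the need for the terminating entry, but it only works when the macro receives a true array rather than a pointer.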
[PATCH for-8.1] tcg: Use HAVE_CMPXCHG128 instead of CONFIG_CMPXCHG128
We adjust CONFIG_ATOMIC128 and CONFIG_CMPXCHG128 with CONFIG_ATOMIC128_OPT in atomic128.h. It is difficult to tell when those changes have been applied with the ifdef we must use with CONFIG_CMPXCHG128. So instead use HAVE_CMPXCHG128, which triggers -Werror-undef when the proper header has not been included.

Improves tcg_gen_atomic_cmpxchg_i128 for s390x host, which requires CONFIG_ATOMIC128_OPT. Without this we fall back to EXCP_ATOMIC to single-step 128-bit atomics, which is slow enough to cause some tests to time out.

Reported-by: Thomas Huth
Signed-off-by: Richard Henderson
---

Thomas, this issue does not quite match the one you bisected, but other than the cmpxchg, I don't see any qemu_{ld,st}_i128 being used in BootLinuxS390X.test_s390_ccw_virtio_tcg.

As far as I can see, this wasn't broken by the addition of CONFIG_ATOMIC128_OPT, rather that fix didn't go far enough.

Anyway, test_s390_ccw_virtio_tcg now passes in 159s on our host.

r~

---
 accel/tcg/tcg-runtime.h            | 2 +-
 include/exec/helper-proto-common.h | 2 ++
 accel/tcg/cputlb.c                 | 2 +-
 accel/tcg/user-exec.c              | 2 +-
 tcg/tcg-op-ldst.c                  | 2 +-
 accel/tcg/atomic_common.c.inc      | 2 +-
 6 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/accel/tcg/tcg-runtime.h b/accel/tcg/tcg-runtime.h
index 39e68007f9..186899a2c7 100644
--- a/accel/tcg/tcg-runtime.h
+++ b/accel/tcg/tcg-runtime.h
@@ -58,7 +58,7 @@ DEF_HELPER_FLAGS_5(atomic_cmpxchgq_be, TCG_CALL_NO_WG,
 DEF_HELPER_FLAGS_5(atomic_cmpxchgq_le, TCG_CALL_NO_WG,
                    i64, env, i64, i64, i64, i32)
 #endif
-#ifdef CONFIG_CMPXCHG128
+#if HAVE_CMPXCHG128
 DEF_HELPER_FLAGS_5(atomic_cmpxchgo_be, TCG_CALL_NO_WG,
                    i128, env, i64, i128, i128, i32)
 DEF_HELPER_FLAGS_5(atomic_cmpxchgo_le, TCG_CALL_NO_WG,
diff --git a/include/exec/helper-proto-common.h b/include/exec/helper-proto-common.h
index 4d4b022668..8b67170a22 100644
--- a/include/exec/helper-proto-common.h
+++ b/include/exec/helper-proto-common.h
@@ -7,6 +7,8 @@
 #ifndef HELPER_PROTO_COMMON_H
 #define HELPER_PROTO_COMMON_H
 
+#include "qemu/atomic128.h"  /* for HAVE_CMPXCHG128 */
+
 #define HELPER_H "accel/tcg/tcg-runtime.h"
 #include "exec/helper-proto.h.inc"
 #undef  HELPER_H
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index c2b81ec569..e0079c9a9d 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -3105,7 +3105,7 @@ void cpu_st16_mmu(CPUArchState *env, target_ulong addr, Int128 val,
 #include "atomic_template.h"
 #endif
 
-#if defined(CONFIG_ATOMIC128) || defined(CONFIG_CMPXCHG128)
+#if defined(CONFIG_ATOMIC128) || HAVE_CMPXCHG128
 #define DATA_SIZE 16
 #include "atomic_template.h"
 #endif
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index d95b875a6a..e7225e10e9 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -1385,7 +1385,7 @@ static void *atomic_mmu_lookup(CPUArchState *env, vaddr addr, MemOpIdx oi,
 #include "atomic_template.h"
 #endif
 
-#if defined(CONFIG_ATOMIC128) || defined(CONFIG_CMPXCHG128)
+#if defined(CONFIG_ATOMIC128) || HAVE_CMPXCHG128
 #define DATA_SIZE 16
 #include "atomic_template.h"
 #endif
diff --git a/tcg/tcg-op-ldst.c b/tcg/tcg-op-ldst.c
index 0fcc1618e5..d54c305598 100644
--- a/tcg/tcg-op-ldst.c
+++ b/tcg/tcg-op-ldst.c
@@ -778,7 +778,7 @@ typedef void (*gen_atomic_op_i64)(TCGv_i64, TCGv_env, TCGv_i64,
 #else
 # define WITH_ATOMIC64(X)
 #endif
-#ifdef CONFIG_CMPXCHG128
+#if HAVE_CMPXCHG128
 # define WITH_ATOMIC128(X) X,
 #else
 # define WITH_ATOMIC128(X)
diff --git a/accel/tcg/atomic_common.c.inc b/accel/tcg/atomic_common.c.inc
index ee222fd7e7..95a5c5ff12 100644
--- a/accel/tcg/atomic_common.c.inc
+++ b/accel/tcg/atomic_common.c.inc
@@ -41,7 +41,7 @@ CMPXCHG_HELPER(cmpxchgq_be, uint64_t)
 CMPXCHG_HELPER(cmpxchgq_le, uint64_t)
 #endif
 
-#ifdef CONFIG_CMPXCHG128
+#if HAVE_CMPXCHG128
 CMPXCHG_HELPER(cmpxchgo_be, Int128)
 CMPXCHG_HELPER(cmpxchgo_le, Int128)
 #endif
-- 
2.34.1
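The difference between the two preprocessor tests in this patch can be shown with a small self-contained sketch. An #ifdef on a macro that was never defined silently selects the fallback branch, whereas an #if on a macro that is always defined to 0 or 1 lets the compiler's undefined-identifier warning (-Wundef in GCC and Clang) catch a missing header. The SKETCH_* names below are hypothetical and only mimic the pattern; they are not QEMU macros.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for a header such as qemu/atomic128.h: the
 * feature macro is always defined, to either 0 or 1.
 */
#define SKETCH_HAVE_CMPXCHG128 1

/*
 * Value test.  If the defining header had not been included, -Wundef
 * would flag this line instead of quietly taking the #else branch.
 */
#if SKETCH_HAVE_CMPXCHG128
static int sketch_cmpxchg128_path(void) { return 1; }   /* helper path */
#else
static int sketch_cmpxchg128_path(void) { return 0; }   /* slow fallback */
#endif

/*
 * Existence test on a macro nobody defined: silently false, with no
 * diagnostic at all.
 */
#ifdef SKETCH_CONFIG_NEVER_DEFINED
static int sketch_ifdef_path(void) { return 1; }
#else
static int sketch_ifdef_path(void) { return 0; }
#endif
```

The second case is the failure mode the commit message describes: the build succeeds, but the slower path is compiled in without anyone noticing.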
Re: [RFC, PATCH, trivial, sample] treewide: spelling fixes in comments and some strings
.. include/standard-headers/drm/drm_fourcc.h | 8 +++--- include/standard-headers/linux/ethtool.h | 2 +- .../standard-headers/linux/virtio_console.h | 2 +- include/standard-headers/linux/virtio_i2c.h | 2 +- include/standard-headers/linux/virtio_net.h | 4 +-- It looks like these should be dropped: diff --git a/include/standard-headers/drm/drm_fourcc.h b/include/standard-headers/drm/drm_fourcc.h index 72279f4d25..b782802a34 100644 --- a/include/standard-headers/drm/drm_fourcc.h +++ b/include/standard-headers/drm/drm_fourcc.h @@ -55,3 +55,3 @@ extern "C" { * vendor-namespaced, and as such the relationship between a fourcc code and a - * modifier is specific to the modifer being used. For example, some modifiers + * modifier is specific to the modifier being used. For example, some modifiers * may preserve meaning - such as number of planes - from the fourcc code, @@ -80,3 +80,3 @@ extern "C" { * see modifiers as opaque tokens they can check for equality and intersect. - * These users musn't need to know to reason about the modifier value + * These users mustn't need to know to reason about the modifier value * (i.e. they are not expected to extract information out of the modifier). @@ -539,3 +539,3 @@ extern "C" { * are arranged in four groups (two wide, two high) with column-major layout. - * Each group therefore consits out of four 256 byte units, which are also laid + * Each group therefore consists out of four 256 byte units, which are also laid * out as 2x2 column-major. @@ -1418,3 +1418,3 @@ drm_fourcc_canonicalize_nvidia_format_mod(uint64_t modifier) * Indicates the storage is packed when pixel size is multiple of word - * boudaries, i.e. 8bit should be stored in this mode to save allocation + * boundaries, i.e. 8bit should be stored in this mode to save allocation * memory. 
diff --git a/include/standard-headers/linux/ethtool.h b/include/standard-headers/linux/ethtool.h index 99fcddf04f..e72ec42bf8 100644 --- a/include/standard-headers/linux/ethtool.h +++ b/include/standard-headers/linux/ethtool.h @@ -571,3 +571,3 @@ struct ethtool_channels { * Drivers should reject a non-zero setting of @autoneg when - * autoneogotiation is disabled (or not supported) for the link. + * autonegotiation is disabled (or not supported) for the link. * diff --git a/include/standard-headers/linux/virtio_console.h b/include/standard-headers/linux/virtio_console.h index 71f5f648e3..8cba52198a 100644 --- a/include/standard-headers/linux/virtio_console.h +++ b/include/standard-headers/linux/virtio_console.h @@ -46,3 +46,3 @@ struct virtio_console_config { - /* colums of the screens */ + /* columns of the screens */ __virtio16 cols; ...
Re: [PATCH v1 13/15] virtio-mem: Expose device memory via multiple memslots if enabled
On 16.06.2023 11:26, David Hildenbrand wrote:

Having large virtio-mem devices that only expose little memory to a VM is currently a problem: we map the whole sparse memory region into the guest using a single memslot, resulting in one gigantic memslot in KVM. KVM allocates metadata for the whole memslot, which can result in quite some memory waste.

Assuming we have a 1 TiB virtio-mem device and only expose little (e.g., 1 GiB) memory, we would create a single 1 TiB memslot and KVM has to allocate metadata for that 1 TiB memslot: on x86, this implies allocating a significant amount of memory for metadata:

(1) RMAP: 8 bytes per 4 KiB, 8 bytes per 2 MiB, 8 bytes per 1 GiB
    -> For 1 TiB: 2147483648 + 4194304 + 8192 = ~ 2 GiB (0.2 %)

    With the TDP MMU (cat /sys/module/kvm/parameters/tdp_mmu) this gets
    allocated lazily when required for nested VMs
(2) gfn_track: 2 bytes per 4 KiB
    -> For 1 TiB: 536870912 = ~512 MiB (0.05 %)
(3) lpage_info: 4 bytes per 2 MiB, 4 bytes per 1 GiB
    -> For 1 TiB: 2097152 + 4096 = ~2 MiB (0.0002 %)
(4) 2x dirty bitmaps for tracking: 2x 1 bit per 4 KiB page
    -> For 1 TiB: 536870912 = 64 MiB (0.006 %)

So we primarily care about (1) and (2). The bad thing is, that the memory consumption *doubles* once SMM is enabled, because we create the memslot once for !SMM and once for SMM.

Having a 1 TiB memslot without the TDP MMU consumes around:
* With SMM: 5 GiB
* Without SMM: 2.5 GiB
Having a 1 TiB memslot with the TDP MMU consumes around:
* With SMM: 1 GiB
* Without SMM: 512 MiB

... and that's really something we want to optimize, to be able to just start a VM with small boot memory (e.g., 4 GiB) and a virtio-mem device that can grow very large (e.g., 1 TiB).

Consequently, using multiple memslots and only mapping the memslots we really need can significantly reduce memory waste and speed up memslot-related operations.
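The per-memslot overhead figures quoted in the commit message can be reproduced with a few lines of arithmetic. The sketch below only encodes the per-page costs stated in the message (8, 2 and 4 bytes, and 1 bit per page for each of two bitmaps). The function names are hypothetical and this is not how KVM itself computes its allocations.

```c
#include <assert.h>
#include <stdint.h>

#define SK_KIB (1ull << 10)
#define SK_MIB (1ull << 20)
#define SK_GIB (1ull << 30)
#define SK_TIB (1ull << 40)

/* (1) RMAP: 8 bytes per 4 KiB, per 2 MiB and per 1 GiB page. */
static uint64_t sketch_rmap_bytes(uint64_t slot)
{
    return 8 * (slot / (4 * SK_KIB))
         + 8 * (slot / (2 * SK_MIB))
         + 8 * (slot / SK_GIB);
}

/* (2) gfn_track: 2 bytes per 4 KiB page. */
static uint64_t sketch_gfn_track_bytes(uint64_t slot)
{
    return 2 * (slot / (4 * SK_KIB));
}

/* (3) lpage_info: 4 bytes per 2 MiB and per 1 GiB page. */
static uint64_t sketch_lpage_info_bytes(uint64_t slot)
{
    return 4 * (slot / (2 * SK_MIB)) + 4 * (slot / SK_GIB);
}

/* (4) Two dirty bitmaps, 1 bit per 4 KiB page each, returned in bytes. */
static uint64_t sketch_dirty_bitmap_bytes(uint64_t slot)
{
    return 2 * (slot / (4 * SK_KIB)) / 8;
}
```

For a 1 TiB memslot these give roughly 2 GiB, 512 MiB, 2 MiB and 64 MiB respectively. Note that the value 536870912 quoted for item (4) is the number of bits across both bitmaps; in bytes that is 67108864, which matches the stated 64 MiB.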
Let's expose the sparse RAM memory region using multiple memslots, mapping only the memslots we currently need into our device memory region container.

* With VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE, we only map the memslots that
  actually have memory plugged, and dynamically (un)map when
  (un)plugging memory blocks.

* Without VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE, we always map the memslots
  covered by the usable region, and dynamically (un)map when resizing the
  usable region.

We'll auto-determine the number of memslots to use based on the suggested memslot limit provided by the core. We'll use at most 1 memslot per gigabyte. Note that our global limit of memslots across all memory devices is currently set to 256: even with multiple large virtio-mem devices, we'd still have a sane limit on the number of memslots used.

The default is a single memslot for now ("multiple-memslots=off"). The optimization must be enabled manually using "multiple-memslots=on", because some vhost setups (e.g., hotplug of vhost-user devices) might be problematic until we support more memslots especially in vhost-user backends.

Note that "multiple-memslots=on" is just a hint that multiple memslots *may* be used for internal optimizations, not that multiple memslots *must* be used. The actual number of memslots that are used is an internal detail: for example, once memslot metadata is no longer an issue, we could simply stop optimizing for that. Migration source and destination can differ on the setting of "multiple-memslots".

Signed-off-by: David Hildenbrand
---
 hw/virtio/virtio-mem-pci.c     |  21 +++
 hw/virtio/virtio-mem.c         | 265 -
 include/hw/virtio/virtio-mem.h |  23 ++-
 3 files changed, 304 insertions(+), 5 deletions(-)

diff --git a/hw/virtio/virtio-mem-pci.c b/hw/virtio/virtio-mem-pci.c
index b85c12668d..8b403e7e78 100644
--- a/hw/virtio/virtio-mem-pci.c
+++ b/hw/virtio/virtio-mem-pci.c

(...)
@@ -790,6 +921,43 @@ static void virtio_mem_system_reset(void *opaque)
     virtio_mem_unplug_all(vmem);
 }

+static void virtio_mem_prepare_mr(VirtIOMEM *vmem)
+{
+    const uint64_t region_size = memory_region_size(&vmem->memdev->mr);
+
+    g_assert(!vmem->mr);
+    vmem->mr = g_new0(MemoryRegion, 1);
+    memory_region_init(vmem->mr, OBJECT(vmem), "virtio-mem",
+                       region_size);
+    vmem->mr->align = memory_region_get_alignment(&vmem->memdev->mr);
+}
+
+static void virtio_mem_prepare_memslots(VirtIOMEM *vmem)
+{
+    const uint64_t region_size = memory_region_size(&vmem->memdev->mr);
+    unsigned int idx;
+
+    g_assert(!vmem->memslots && vmem->nb_memslots);
+    vmem->memslots = g_new0(MemoryRegion, vmem->nb_memslots);
+
+    /* Initialize our memslots, but don't map them yet. */
+    for (idx = 0; idx < vmem->nb_memslots; idx++) {
+        const uint64_t memslot_offset = idx * vmem->memslot_size;
+        uint64_t memslot_size = vmem->memslot_size;
+        char name[20];
+
+        /* The size of the last memslot might be smaller. */
+        if (idx == vmem->nb_memslots) {
^
I
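Inside a loop bounded by idx < vmem->nb_memslots, the comparison idx == vmem->nb_memslots marked by the caret can never be true, so the smaller last memslot would never get its reduced size. A minimal standalone sketch of the sizing, assuming the last slot is the one at index nb_memslots - 1 and that it takes whatever remains of the region; memslot_size_at() is a hypothetical helper, not a function from the patch:

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): size of memslot idx when a
 * region of region_size bytes is split into nb_memslots slots of
 * memslot_size bytes each, with the last slot possibly smaller.
 */
static uint64_t memslot_size_at(uint64_t region_size, uint64_t memslot_size,
                                unsigned int nb_memslots, unsigned int idx)
{
    const uint64_t memslot_offset = (uint64_t)idx * memslot_size;

    /* The last valid index is nb_memslots - 1, not nb_memslots. */
    if (idx == nb_memslots - 1) {
        return region_size - memslot_offset;
    }
    return memslot_size;
}
```

With a 2.5 GiB region split into three slots of at most 1 GiB, indexes 0 and 1 yield 1 GiB each and index 2 yields the remaining 512 MiB.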
drain_call_rcu() vs nested event loops
Hi,
I've encountered a bug where two vcpu threads enter a device's MMIO emulation callback at the same time. This is never supposed to happen thanks to the Big QEMU Lock (BQL), but drain_call_rcu() and nested event loops make it possible:

1. A device's MMIO emulation callback invokes AIO_WAIT_WHILE().
2. A device_add monitor command runs in AIO_WAIT_WHILE()'s aio_poll() nested event loop.
3. qmp_device_add() -> drain_call_rcu() is called and the BQL is temporarily dropped.
4. Another vcpu thread dispatches the same device's MMIO callback because it is now able to acquire the BQL.

I've included the backtraces below if you want to see the details. They are from a RHEL qemu-kvm 6.2.0-35 coredump but I haven't found anything in qemu.git/master that would fix this.

One fix is to make qmp_device_add() a coroutine and schedule a BH in the iohandler AioContext. That way the coroutine must wait until the nested event loop finishes before its BH executes. drain_call_rcu() will never be called from a nested event loop and the problem does not occur anymore.

Another possibility is to remove the following in monitor_qmp_dispatcher_co():

    /*
     * Move the coroutine from iohandler_ctx to qemu_aio_context for
     * executing the command handler so that it can make progress if it
     * involves an AIO_WAIT_WHILE().
     */
    aio_co_schedule(qemu_get_aio_context(), qmp_dispatcher_co);
    qemu_coroutine_yield();

By executing QMP commands in the iohandler AioContext by default, we can prevent issues like this in the future. However, there might be some QMP commands that assume they are running in the qemu_aio_context (e.g. coroutine commands that yield) and they might need to manually move to the qemu_aio_context.

What do you think?
Stefan --- Thread 41 (Thread 0x7fdc3dffb700 (LWP 910296)): #0 0x7fde88ac99bd in syscall () from /lib64/libc.so.6 #1 0x55bd7a2e066f in qemu_futex_wait (val=, f=) at /usr/src/debug/qemu-kvm-6.2.0-35.module+el8.9.0+19024+8193e2ac.x86_64/include/qemu/futex.h:29 #2 qemu_event_wait (ev=ev@entry=0x7fdc3dffa2d0) at ../util/qemu-thread-posix.c:510 #3 0x55bd7a2e8e54 in drain_call_rcu () at ../util/rcu.c:347 #4 0x55bd79f63d1e in qmp_device_add (qdict=, ret_data=, errp=) at ../softmmu/qdev-monitor.c:863 #5 0x55bd7a2d420d in do_qmp_dispatch_bh (opaque=0x7fde8c22aee0) at ../qapi/qmp-dispatch.c:129 #6 0x55bd7a2ef3bd in aio_bh_call (bh=0x7fdc6015cd50) at ../util/async.c:174 #7 aio_bh_poll (ctx=ctx@entry=0x55bd7c910f40) at ../util/async.c:174 #8 0x55bd7a2dd3b2 in aio_poll (ctx=0x55bd7c910f40, blocking=blocking@entry=true) at ../util/aio-posix.c:659 #9 0x55bd7a2effea in aio_wait_bh_oneshot (ctx=0x55bd7ca980e0, cb=cb@entry=0x55bd7a11a9c0 , opaque=opaque@entry=0x55bd7e585c40) at ../util/aio-wait.c:85 #10 0x55bd7a11b30b in virtio_blk_data_plane_stop (vdev=) at ../hw/block/dataplane/virtio-blk.c:333 #11 0x55bd7a0591e0 in virtio_bus_stop_ioeventfd (bus=bus@entry=0x55bd7cb57ba8) at ../hw/virtio/virtio-bus.c:258 #12 0x55bd7a05995f in virtio_bus_stop_ioeventfd (bus=bus@entry=0x55bd7cb57ba8) at ../hw/virtio/virtio-bus.c:250 #13 0x55bd7a05b238 in virtio_pci_stop_ioeventfd (proxy=0x55bd7cb4f9a0) at ../hw/virtio/virtio-pci.c:1289 #14 virtio_pci_common_write (opaque=0x55bd7cb4f9a0, addr=, val=, size=) at ../hw/virtio/virtio-pci.c:1289 ^^ #15 0x55bd7a0f6777 in memory_region_write_accessor (mr=0x55bd7cb50410, addr=, value=, size=1, shift=, mask=, attrs=...) at ../softmmu/memory.c:492 #16 0x55bd7a0f320e in access_with_adjusted_size (addr=addr@entry=20, value=value@entry=0x7fdc3dffa5c8, size=size@entry=1, access_size_min=, access_size_max=, access_fn=0x55bd7a0f6710 , mr=0x55bd7cb50410, attrs=...) 
at ../softmmu/memory.c:554 #17 0x55bd7a0f62a3 in memory_region_dispatch_write (mr=mr@entry=0x55bd7cb50410, addr=20, data=, op=, attrs=attrs@entry=...) at ../softmmu/memory.c:1504 #18 0x55bd7a0e7f2e in flatview_write_continue (fv=fv@entry=0x55bd7d17cad0, addr=addr@entry=4236247060, attrs=..., ptr=ptr@entry=0x7fde84003028, len=len@entry=1, addr1=, l=, mr=0x55bd7cb50410) at /usr/src/debug/qemu-kvm-6.2.0-35.module+el8.9.0+19024+8193e2ac.x86_64/include/qemu/host-utils.h:165 #19 0x55bd7a0e8093 in flatview_write (fv=0x55bd7d17cad0, addr=4236247060, attrs=..., buf=0x7fde84003028, len=1) at ../softmmu/physmem.c:2856 #20 0x55bd7a0ebc6f in address_space_write (as=, addr=, attrs=..., buf=, len=) at ../softmmu/physmem.c:2952 #21 0x55bd7a1a28b9 in kvm_cpu_exec (cpu=cpu@entry=0x55bd7cc32bf0) at ../accel/kvm/kvm-all.c:2995 #22 0x55bd7a1a36e5 in kvm_vcpu_thread_fn (arg=0x55bd7cc32bf0) at ../accel/kvm/kvm-accel-ops.c:49 #23 0x55bd7a2dfdd4 in qemu_thread_start (args=0x55bd7cc41f20) at ../util/qemu-thread-posix.c:585 #24 0x7fde88e5d1ca in start_thread () from /lib64/libpthread.so.0 #25 0x7fde88ac9e73 in clone () from /lib64/libc.so.6 Thread 1 (Thread
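The four-step sequence from the drain_call_rcu() report above can be walked through with a toy, single-threaded model. This is only an illustration under heavy assumptions: every name below is invented, the "second vcpu" is simulated by a plain function call at the moment the lock is dropped, and nothing here is QEMU code. The defer_device_add flag stands in for the proposed fix of running device_add only after the nested event loop has finished.

```c
#include <stdbool.h>

/* Toy model; all names are illustrative, none are QEMU APIs. */
struct model {
    bool bql_held;           /* is the "big lock" currently held? */
    int callback_depth;      /* threads currently inside the MMIO callback */
    int max_depth;           /* highest depth observed so far */
    bool defer_device_add;   /* sketch of the fix: keep device_add out of the nested loop */
    bool device_add_pending; /* a monitor command is waiting to run */
};

static void mmio_callback(struct model *m, bool is_second_vcpu);

/* Step 4: another vcpu dispatches the same callback if it can take the lock. */
static void second_vcpu_tries_mmio(struct model *m)
{
    if (!m->bql_held) {
        m->bql_held = true;
        mmio_callback(m, true);
        m->bql_held = false;
    }
}

/* Step 3: models a command handler that drops the lock while it waits. */
static void device_add(struct model *m)
{
    m->bql_held = false;
    second_vcpu_tries_mmio(m);
    m->bql_held = true;
}

/* Step 2: the nested event loop may run the pending monitor command. */
static void nested_event_loop(struct model *m)
{
    if (m->device_add_pending && !m->defer_device_add) {
        m->device_add_pending = false;
        device_add(m);
    }
}

/* Step 1: the MMIO callback of the first vcpu enters a nested event loop. */
static void mmio_callback(struct model *m, bool is_second_vcpu)
{
    m->callback_depth++;
    if (m->callback_depth > m->max_depth) {
        m->max_depth = m->callback_depth;
    }
    if (!is_second_vcpu) {
        nested_event_loop(m);
    }
    m->callback_depth--;
}

/* Returns how many threads were inside the callback at the same time. */
static int run_scenario(bool defer_device_add)
{
    struct model m = {
        .bql_held = true,
        .defer_device_add = defer_device_add,
        .device_add_pending = true,
    };

    mmio_callback(&m, false);
    if (m.device_add_pending) {
        /* Deferred case: the command runs after the callback returned. */
        m.device_add_pending = false;
        device_add(&m);
    }
    return m.max_depth;
}
```

In this model run_scenario(false) observes two concurrent entries into the callback, while run_scenario(true) never observes more than one, because the lock is only dropped once the first callback has returned.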
Re: [PATCH 0/3] hw/arm/virt: Use generic CPU invalidation
On 7/13/23 13:34, Gavin Shan wrote:
> Hi Peter and Marcin,
>
> On 7/13/23 21:52, Marcin Juszkiewicz wrote:
> > On 13.07.2023 at 13:44, Peter Maydell wrote:
> > > I see this isn't a change in this patch, but given that what the user specifies is not "cortex-a8-arm-cpu" but "cortex-a8", why do we include the "-arm-cpu" suffix in the error messages? It's not valid syntax to say "-cpu cortex-a8-arm-cpu", so it's a bit misleading...
> >
> > Internally those cpu names are "max-{TYPE_ARM_CPU}" and similar for other architectures.
> >
> > I like the change but it (IMHO) needs to cut "-{TYPE_*_CPU}" string from names:
> >
> > 13:37 marcin@applejack:qemu$ ./build/aarch64-softmmu/qemu-system-aarch64 -M virt -cpu cortex-r5
> > qemu-system-aarch64: Invalid CPU type: cortex-r5-arm-cpu
> > The valid types are: cortex-a7-arm-cpu, cortex-a15-arm-cpu, cortex-a35-arm-cpu, cortex-a55-arm-cpu, cortex-a72-arm-cpu, cortex-a76-arm-cpu, a64fx-arm-cpu, neoverse-n1-arm-cpu, neoverse-v1-arm-cpu, cortex-a53-arm-cpu, cortex-a57-arm-cpu, host-arm-cpu, max-arm-cpu
> >
> > 13:37 marcin@applejack:qemu$ ./build/aarch64-softmmu/qemu-system-aarch64 -M virt -cpu cortex-a57-arm-cpu
> > qemu-system-aarch64: unable to find CPU model 'cortex-a57-arm-cpu'
>
> The suffix of the CPU types is provided in hw/arm/virt.c::valid_cpu_types in PATCH[2]. In the generic validation, the complete CPU type is used. The error message also has the complete CPU type there.
>
> Peter and Marcin, how about splitting the CPU types into two fields, as below? In this way, the complete CPU type will be used for validation and the 'internal' names will be used for the error messages.
>
> struct MachineClass {
>     const char *valid_cpu_type_suffix;
>     const char **valid_cpu_types;

While you're changing this:

    const char * const *valid_cpu_types;

> };
>
> hw/arm/virt.c -
> static const char *valid_cpu_types[] = {

So that you can then do

    static const char * const valid_cpu_types[]

r~
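The two suggestions in the exchange above, a const-qualified table of valid CPU types and error messages that show the user-facing name without the internal suffix, can be sketched in standalone C. This is only an illustration under assumptions: cpu_type_is_valid() and cpu_type_user_name_len() are hypothetical helpers, the table is abbreviated, and none of this is QEMU's actual validation code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Both the strings and the array of pointers are read-only. */
static const char * const valid_cpu_types[] = {
    "cortex-a53-arm-cpu",
    "cortex-a57-arm-cpu",
    "max-arm-cpu",
    NULL,   /* terminator */
};

static const char valid_cpu_type_suffix[] = "-arm-cpu";

/* Validation compares against the complete (internal) type name. */
static bool cpu_type_is_valid(const char *type)
{
    for (size_t i = 0; valid_cpu_types[i] != NULL; i++) {
        if (strcmp(valid_cpu_types[i], type) == 0) {
            return true;
        }
    }
    return false;
}

/*
 * Length of the user-facing part of a type name, i.e. the name with the
 * suffix cut off. Names that do not end in the suffix are kept whole.
 */
static size_t cpu_type_user_name_len(const char *type, const char *suffix)
{
    const size_t len = strlen(type);
    const size_t suffix_len = strlen(suffix);

    if (len > suffix_len && strcmp(type + len - suffix_len, suffix) == 0) {
        return len - suffix_len;
    }
    return len;
}
```

An error message could then print only the user-facing part with printf("%.*s", (int)cpu_type_user_name_len(type, valid_cpu_type_suffix), type), while the comparison itself still uses the full internal name.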
[RFC, PATCH, trivial, sample] treewide: spelling fixes in comments and some strings
I got annoyed enough by various misspellings that I tried to clean them up a bit. This is the result so far:

https://gitlab.com/mjt0k/qemu/-/commit/eb5a376c7282e63d9e11eb952046b01f1a5ae7d4

Below is a diffstat plus a few actual changes as a sample. It fixes misspellings in comments and in some strings (mostly in error messages). There are a few misspelt variable names which are worth fixing too - this patch does not change variable names.

Now, given the size of the patch, I wonder how best to handle it. Hence the RFC. It took quite a lot of work to get here: no automatic tools were involved, and all changes were made after manual inspection of the context (with some help from codespell). So I'd love to avoid grouping it by maintainer; the most I can do is group by source directory. Or maybe there's some other way to attack this?

Thanks,

/mjt

---
 accel/tcg/tb-maint.c | 2 +-
 audio/mixeng.h | 2 +-
 backends/tpm/tpm_ioctl.h | 2 +-
 block.c | 2 +-
 block/block-copy.c | 4 +--
 block/export/vduse-blk.c | 2 +-
 block/export/vhost-user-blk-server.c | 2 +-
 block/export/vhost-user-blk-server.h | 2 +-
 block/file-posix.c | 8 +++---
 block/graph-lock.c | 2 +-
 block/io.c | 2 +-
 block/linux-aio.c | 2 +-
 block/mirror.c | 2 +-
 block/qcow2-refcount.c | 2 +-
 block/vhdx.c | 2 +-
 block/vhdx.h | 4 +--
 bsd-user/errno_defs.h | 2 +-
 bsd-user/freebsd/target_os_siginfo.h | 2 +-
 bsd-user/freebsd/target_os_stack.h | 4 +--
 bsd-user/freebsd/target_os_user.h | 2 +-
 bsd-user/qemu.h | 2 +-
 bsd-user/signal-common.h | 4 +--
 bsd-user/signal.c | 6 ++---
 chardev/char-socket.c | 6 ++---
 chardev/char.c | 2 +-
 contrib/plugins/cache.c | 2 +-
 contrib/plugins/lockstep.c | 2 +-
 crypto/afalg.c | 2 +-
 crypto/block-luks.c | 6 ++---
 crypto/der.c | 2 +-
 crypto/der.h | 6 ++---
 docs/about/deprecated.rst | 2 +-
 docs/devel/qapi-code-gen.rst | 2 +-
 docs/devel/qom.rst | 2 +-
 docs/system/arm/palm.rst | 2 +-
 docs/system/arm/xscale.rst | 2 +-
 docs/system/devices/can.rst | 6 ++---
 docs/system/devices/nvme.rst | 2 +-
 host/include/aarch64/host/cpuinfo.h | 2 +-
 host/include/generic/host/cpuinfo.h | 2 +-
 host/include/i386/host/cpuinfo.h | 2 +-
 host/include/ppc/host/cpuinfo.h | 2 +-
 hw/9pfs/9p-local.c | 8 +++---
 hw/9pfs/9p-proxy.c | 2 +-
 hw/9pfs/9p-synth.c | 2 +-
 hw/9pfs/9p-util.h | 2 +-
 hw/9pfs/9p.c | 4 +--
 hw/9pfs/9p.h | 2 +-
 hw/acpi/aml-build.c | 6 ++---
 hw/acpi/hmat.c | 2 +-
 hw/acpi/nvdimm.c | 2 +-
 hw/arm/aspeed.c | 2 +-
 hw/arm/mps2-tz.c | 2 +-
 hw/arm/versatilepb.c | 2 +-
 hw/audio/fmopl.c | 8 +++---
 hw/audio/fmopl.h | 2 +-
 hw/audio/gusemu_hal.c | 4 +--
 hw/audio/intel-hda-defs.h | 4 +--
 hw/block/hd-geometry.c | 4 +--
 hw/block/pflash_cfi01.c | 2 +-
 hw/char/cadence_uart.c | 2 +-
 hw/char/imx_serial.c | 2 +-
 hw/char/serial.c | 2 +-
 hw/core/generic-loader.c | 4 +--
 hw/core/loader.c | 4 +--
 hw/core/machine.c | 2 +-
 hw/core/qdev-properties-system.c | 2 +-
 hw/cpu/a15mpcore.c | 2 +-
 hw/cxl/cxl-events.c | 2 +-
 hw/cxl/cxl-host.c | 2 +-
 hw/cxl/cxl-mailbox-utils.c | 4 +--
 hw/display/bochs-display.c | 2 +-
 hw/display/qxl.c | 2 +-
 hw/display/ssd0303.c | 2
Re: [PATCH 04/11] tpm_crb: use a single read-as-mem/write-as-mmio mapping
On 7/13/23 13:18, Peter Maydell wrote:
> On Thu, 13 Jul 2023 at 18:16, Stefan Berger wrote:
> > I guess the first point would be to decide whether to support an i2c bus on the virt board and then whether we can use the aspeed bus that we know that the tpm_tis_i2c device model works with but we don't know how Windows may react to it.
> >
> > It seems sysbus is already supported there so ... we may have a 'match'?
>
> You can use sysbus devices anywhere -- they're just

'anywhere' also includes aarch64 virt board I suppose.

> "this is a memory mapped device". The question is whether we should, or whether an i2c controller is more like what the real world uses (and if so, what i2c controller).
>
> I don't want to accept changes to the virt board that are hard to live with in future, because changing virt in non-backward compatible ways is painful.

Once we have the CRB sysbus device we would keep it around forever and it seems to
- not require any changes to the virt board (iiuc) since sysbus is already being used
- works already with Windows and probably also Linux

Stefan

> -- PMM
Re: [PATCH 05/11] tpm_crb: use the ISA bus
On 7/12/23 23:51, Joelle van Dyne wrote:
> Since this device is gated to only build for targets with the PC configuration, we should use the ISA bus like with TPM TIS.
>
> Signed-off-by: Joelle van Dyne

I think this patch is good but I'd like to try it with resuming an old VM snapshot, and for this to work we need 04/11 to have the registers in the VM state.

Stefan

> ---
>  hw/tpm/tpm_crb.c | 52
>  hw/tpm/Kconfig | 2 +-
>  2 files changed, 27 insertions(+), 27 deletions(-)
>
> diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c
> index 07c6868d8d..6144081d30 100644
> --- a/hw/tpm/tpm_crb.c
> +++ b/hw/tpm/tpm_crb.c
> @@ -22,6 +22,7 @@
>  #include "hw/qdev-properties.h"
>  #include "hw/pci/pci_ids.h"
>  #include "hw/acpi/tpm.h"
> +#include "hw/isa/isa.h"
>  #include "migration/vmstate.h"
>  #include "sysemu/tpm_backend.h"
>  #include "sysemu/tpm_util.h"
> @@ -34,7 +35,7 @@
>  #include "tpm_crb.h"
>
>  struct CRBState {
> -    DeviceState parent_obj;
> +    ISADevice parent_obj;
>
>      TPMCRBState state;
>  };
> @@ -43,49 +44,49 @@ typedef struct CRBState CRBState;
>  DECLARE_INSTANCE_CHECKER(CRBState, CRB, TYPE_TPM_CRB)
>
> -static void tpm_crb_none_request_completed(TPMIf *ti, int ret)
> +static void tpm_crb_isa_request_completed(TPMIf *ti, int ret)
>  {
>      CRBState *s = CRB(ti);
>
>      tpm_crb_request_completed(&s->state, ret);
>  }
>
> -static enum TPMVersion tpm_crb_none_get_version(TPMIf *ti)
> +static enum TPMVersion tpm_crb_isa_get_version(TPMIf *ti)
>  {
>      CRBState *s = CRB(ti);
>
>      return tpm_crb_get_version(&s->state);
>  }
>
> -static int tpm_crb_none_pre_save(void *opaque)
> +static int tpm_crb_isa_pre_save(void *opaque)
>  {
>      CRBState *s = opaque;
>
>      return tpm_crb_pre_save(&s->state);
>  }
>
> -static const VMStateDescription vmstate_tpm_crb_none = {
> +static const VMStateDescription vmstate_tpm_crb_isa = {
>      .name = "tpm-crb",
> -    .pre_save = tpm_crb_none_pre_save,
> +    .pre_save = tpm_crb_isa_pre_save,
>      .fields = (VMStateField[]) {
>          VMSTATE_END_OF_LIST(),
>      }
>  };
>
> -static Property tpm_crb_none_properties[] = {
> +static Property tpm_crb_isa_properties[] = {
>      DEFINE_PROP_TPMBE("tpmdev", CRBState, state.tpmbe),
>      DEFINE_PROP_BOOL("ppi", CRBState, state.ppi_enabled, true),
>      DEFINE_PROP_END_OF_LIST(),
>  };
>
> -static void tpm_crb_none_reset(void *dev)
> +static void tpm_crb_isa_reset(void *dev)
>  {
>      CRBState *s = CRB(dev);
>
>      return tpm_crb_reset(&s->state, TPM_CRB_ADDR_BASE);
>  }
>
> -static void tpm_crb_none_realize(DeviceState *dev, Error **errp)
> +static void tpm_crb_isa_realize(DeviceState *dev, Error **errp)
>  {
>      CRBState *s = CRB(dev);
>
> @@ -100,52 +101,51 @@ static void tpm_crb_none_realize(DeviceState *dev, Error **errp)
>
>      tpm_crb_init_memory(OBJECT(s), &s->state, errp);
>
> -    memory_region_add_subregion(get_system_memory(),
> +    memory_region_add_subregion(isa_address_space(ISA_DEVICE(dev)),
>          TPM_CRB_ADDR_BASE, &s->state.mmio);
>
>      if (s->state.ppi_enabled) {
> -        memory_region_add_subregion(get_system_memory(),
> +        memory_region_add_subregion(isa_address_space(ISA_DEVICE(dev)),
>              TPM_PPI_ADDR_BASE, &s->state.ppi.ram);
>      }
>
>      if (xen_enabled()) {
> -        tpm_crb_none_reset(dev);
> +        tpm_crb_isa_reset(dev);
>      } else {
> -        qemu_register_reset(tpm_crb_none_reset, dev);
> +        qemu_register_reset(tpm_crb_isa_reset, dev);
>      }
>  }
>
> -static void tpm_crb_none_class_init(ObjectClass *klass, void *data)
> +static void tpm_crb_isa_class_init(ObjectClass *klass, void *data)
>  {
>      DeviceClass *dc = DEVICE_CLASS(klass);
>      TPMIfClass *tc = TPM_IF_CLASS(klass);
>
> -    dc->realize = tpm_crb_none_realize;
> -    device_class_set_props(dc, tpm_crb_none_properties);
> -    dc->vmsd = &vmstate_tpm_crb_none;
> +    dc->realize = tpm_crb_isa_realize;
> +    device_class_set_props(dc, tpm_crb_isa_properties);
> +    dc->vmsd = &vmstate_tpm_crb_isa;
>      dc->user_creatable = true;
>      tc->model = TPM_MODEL_TPM_CRB;
> -    tc->get_version = tpm_crb_none_get_version;
> -    tc->request_completed = tpm_crb_none_request_completed;
> +    tc->get_version = tpm_crb_isa_get_version;
> +    tc->request_completed = tpm_crb_isa_request_completed;
>
>      set_bit(DEVICE_CATEGORY_MISC, dc->categories);
>  }
>
> -static const TypeInfo tpm_crb_none_info = {
> +static const TypeInfo tpm_crb_isa_info = {
>      .name = TYPE_TPM_CRB,
> -    /* could be TYPE_SYS_BUS_DEVICE (or LPC etc) */
> -    .parent = TYPE_DEVICE,
> +    .parent = TYPE_ISA_DEVICE,
>      .instance_size = sizeof(CRBState),
> -    .class_init = tpm_crb_none_class_init,
> +    .class_init = tpm_crb_isa_class_init,
>      .interfaces = (InterfaceInfo[]) {
>          { TYPE_TPM_IF },
>          { }
>      }
>  };
>
> -static void tpm_crb_none_register(void)
> +static void tpm_crb_isa_register(void)
>  {
> -    type_register_static(&tpm_crb_none_info);
> +    type_register_static(&tpm_crb_isa_info);
>  }
>
> -type_init(tpm_crb_none_register)
> +type_init(tpm_crb_isa_register)
> diff --git
Re: [PATCH 09/11] tpm_tis_sysbus: fix crash when PPI is enabled
On 7/13/23 14:15, Joelle van Dyne wrote:
> On Thu, Jul 13, 2023 at 9:49 AM Stefan Berger wrote:
> > The tpm-tis-device doesn't work for x86_64 but for aarch64.
> >
> > We have this here in this file:
> >
> > DEFINE_PROP_BOOL("ppi", TPMStateSysBus, state.ppi_enabled, false),
> >
> > I don't know whether ppi would work on aarch64. It needs firmware support like in edk2.
> > I think the best solution is to remove this DEFINE_PROP_BOOL() and if someone wants to enable it they would have to add firmware support and test it before re-enabling it.
> >
> > Stefan
> >
> > > static void tpm_tis_sysbus_class_init(ObjectClass *klass, void *data)
>
> Yeah, I'm not sure if PPI works with AARCH64 since I didn't bother to change it to not use hard coded addresses. However, isn't that "ppi" overridable from the command line? If so, should we add a check in "realize" to error if PPI=true? Otherwise, it will just crash.

Once the option is removed via my patch (cc'ed you), then you get this once you pass ppi=on on the command line:

qemu-system-aarch64: -device tpm-tis-device,tpmdev=tpm0,ppi=on: Property 'tpm-tis-device.ppi' not found

This disables it for good.

Stefan
Re: [PATCH for-8.1 3/3] target/arm/ptw.c: Account for FEAT_RME when applying {N}SW,SA bits
On 7/10/23 16:21, Peter Maydell wrote:
> In get_phys_addr_twostage() the code that applies the effects of VSTCR.{SA,SW} and VTCR.{NSA,NSW} only updates result->f.attrs.secure. Now we also have f.attrs.space for FEAT_RME, we need to keep the two in sync.
>
> These bits only have an effect for Secure space translations, not for Root, so use the input in_space field to determine whether to apply them rather than the input is_secure. This doesn't actually make a difference because Root translations are never two-stage, but it's a little clearer.
>
> Signed-off-by: Peter Maydell
> ---
> I noticed this while reading through the ptw code...
> ---
>  target/arm/ptw.c | 13 -
>  1 file changed, 8 insertions(+), 5 deletions(-)

Reviewed-by: Richard Henderson

r~
Re: [PATCH 06/11] tpm_crb: move ACPI table building to device interface
On 7/13/23 14:10, Joelle van Dyne wrote: In that case, do you think we should have a check in "realize" to make sure the backend is 2.0? Maybe. I think at the moment it would simply not work (with existing drivers) without terminating QEMU on it due to the misconfiguration. On libvirt level we intercept this case and notify the user that the combination doesn't work. Leaving it like this would be an option... Stefan On Thu, Jul 13, 2023 at 9:08 AM Stefan Berger wrote: On 7/12/23 23:51, Joelle van Dyne wrote: This logic is similar to TPM TIS ISA device. Signed-off-by: Joelle van Dyne --- hw/i386/acpi-build.c | 23 --- hw/tpm/tpm_crb.c | 28 2 files changed, 28 insertions(+), 23 deletions(-) diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index 9c74fa17ad..b767df39df 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -1441,9 +1441,6 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, uint32_t nr_mem = machine->ram_slots; int root_bus_limit = 0xFF; PCIBus *bus = NULL; -#ifdef CONFIG_TPM -TPMIf *tpm = tpm_find(); -#endif bool cxl_present = false; int i; VMBusBridge *vmbus_bridge = vmbus_bridge_find(); @@ -1793,26 +1790,6 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, } } -#ifdef CONFIG_TPM -if (TPM_IS_CRB(tpm)) { -dev = aml_device("TPM"); -aml_append(dev, aml_name_decl("_HID", aml_string("MSFT0101"))); -aml_append(dev, aml_name_decl("_STR", - aml_string("TPM 2.0 Device"))); -crs = aml_resource_template(); -aml_append(crs, aml_memory32_fixed(TPM_CRB_ADDR_BASE, - TPM_CRB_ADDR_SIZE, AML_READ_WRITE)); -aml_append(dev, aml_name_decl("_CRS", crs)); - -aml_append(dev, aml_name_decl("_STA", aml_int(0xf))); -aml_append(dev, aml_name_decl("_UID", aml_int(1))); - -tpm_build_ppi_acpi(tpm, dev); - -aml_append(sb_scope, dev); -} -#endif - if (pcms->sgx_epc.size != 0) { uint64_t epc_base = pcms->sgx_epc.base; uint64_t epc_size = pcms->sgx_epc.size; diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c index 6144081d30..14feb9857f 100644 --- a/hw/tpm/tpm_crb.c 
+++ b/hw/tpm/tpm_crb.c @@ -19,6 +19,8 @@ #include "qemu/module.h" #include "qapi/error.h" #include "exec/address-spaces.h" +#include "hw/acpi/acpi_aml_interface.h" +#include "hw/acpi/tpm.h" #include "hw/qdev-properties.h" #include "hw/pci/pci_ids.h" #include "hw/acpi/tpm.h" @@ -116,10 +118,34 @@ static void tpm_crb_isa_realize(DeviceState *dev, Error **errp) } } +static void build_tpm_crb_isa_aml(AcpiDevAmlIf *adev, Aml *scope) +{ +Aml *dev, *crs; +CRBState *s = CRB(adev); +TPMIf *ti = TPM_IF(s); + +dev = aml_device("TPM"); +if (tpm_crb_isa_get_version(ti) == TPM_VERSION_2_0) { +aml_append(dev, aml_name_decl("_HID", aml_string("MSFT0101"))); +aml_append(dev, aml_name_decl("_STR", aml_string("TPM 2.0 Device"))); +} else { +aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0C31"))); +} CRB only exists for TPM 2.0 and that's why we didn't have a different case here before. CRB only has MSFT0101: https://elixir.bootlin.com/linux/latest/source/drivers/char/tpm/tpm_crb.c#L820 TIS has PNP0C31: https://elixir.bootlin.com/linux/latest/source/drivers/char/tpm/tpm_tis.c You should remove the check for TPM_VERSION_2_0. 
Stefan +aml_append(dev, aml_name_decl("_UID", aml_int(1))); +aml_append(dev, aml_name_decl("_STA", aml_int(0xF))); +crs = aml_resource_template(); +aml_append(crs, aml_memory32_fixed(TPM_CRB_ADDR_BASE, TPM_CRB_ADDR_SIZE, + AML_READ_WRITE)); +aml_append(dev, aml_name_decl("_CRS", crs)); +tpm_build_ppi_acpi(ti, dev); +aml_append(scope, dev); +} + static void tpm_crb_isa_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); TPMIfClass *tc = TPM_IF_CLASS(klass); +AcpiDevAmlIfClass *adevc = ACPI_DEV_AML_IF_CLASS(klass); dc->realize = tpm_crb_isa_realize; device_class_set_props(dc, tpm_crb_isa_properties); @@ -128,6 +154,7 @@ static void tpm_crb_isa_class_init(ObjectClass *klass, void *data) tc->model = TPM_MODEL_TPM_CRB; tc->get_version = tpm_crb_isa_get_version; tc->request_completed = tpm_crb_isa_request_completed; +adevc->build_dev_aml = build_tpm_crb_isa_aml; set_bit(DEVICE_CATEGORY_MISC, dc->categories); } @@ -139,6 +166,7 @@ static const TypeInfo tpm_crb_isa_info = { .class_init = tpm_crb_isa_class_init, .interfaces = (InterfaceInfo[]) { { TYPE_TPM_IF }, +{ TYPE_ACPI_DEV_AML_IF }, { } } };
Re: [PATCH for-8.1 2/3] target/arm: Fix S1_ptw_translate() debug path
On 7/10/23 16:21, Peter Maydell wrote:
> In commit XXX we rearranged the logic in S1_ptw_translate() so that the debug-access "call get_phys_addr_*" codepath is used both when S1 is doing ptw reads from stage 2 and when it is doing ptw reads from physical memory. However, we didn't update the calculation of s2ptw->in_space and s2ptw->in_secure to account for the "ptw reads from physical memory" case. This meant that debug accesses when in Secure state broke.
>
> Create a new function S2_security_space() which returns the correct security space to use for the ptw load, and use it to determine the correct .in_secure and .in_space fields for the stage 2 lookup for the ptw load.
>
> Reported-by: Jean-Philippe Brucker
> Fixes: fe4a5472ccd6 ("target/arm: Use get_phys_addr_with_struct in S1_ptw_translate")
> Signed-off-by: Peter Maydell
> ---
>  target/arm/ptw.c | 37 -
>  1 file changed, 32 insertions(+), 5 deletions(-)

Reviewed-by: Richard Henderson

r~
Re: [PATCH 09/11] tpm_tis_sysbus: fix crash when PPI is enabled
On Thu, Jul 13, 2023 at 9:49 AM Stefan Berger wrote:
>
> The tpm-tis-device doesn't work for x86_64 but for aarch64.
>
> We have this here in this file:
>
> DEFINE_PROP_BOOL("ppi", TPMStateSysBus, state.ppi_enabled, false),
>
> I don't know whether ppi would work on aarch64. It needs firmware support like in edk2.
> I think the best solution is to remove this DEFINE_PROP_BOOL() and if someone wants
> to enable it they would have to add firmware support and test it before re-enabling it.
>
> Stefan
>
> > static void tpm_tis_sysbus_class_init(ObjectClass *klass, void *data)

Yeah, I'm not sure if PPI works with AARCH64 since I didn't bother to change it to not use hard coded addresses. However, isn't that "ppi" overridable from the command line? If so, should we add a check in "realize" to error if PPI=true? Otherwise, it will just crash.
Re: [PATCH 06/11] tpm_crb: move ACPI table building to device interface
In that case, do you think we should have a check in "realize" to make sure the backend is 2.0? On Thu, Jul 13, 2023 at 9:08 AM Stefan Berger wrote: > > > > On 7/12/23 23:51, Joelle van Dyne wrote: > > This logic is similar to TPM TIS ISA device. > > > > Signed-off-by: Joelle van Dyne > > --- > > hw/i386/acpi-build.c | 23 --- > > hw/tpm/tpm_crb.c | 28 > > 2 files changed, 28 insertions(+), 23 deletions(-) > > > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c > > index 9c74fa17ad..b767df39df 100644 > > --- a/hw/i386/acpi-build.c > > +++ b/hw/i386/acpi-build.c > > @@ -1441,9 +1441,6 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, > > uint32_t nr_mem = machine->ram_slots; > > int root_bus_limit = 0xFF; > > PCIBus *bus = NULL; > > -#ifdef CONFIG_TPM > > -TPMIf *tpm = tpm_find(); > > -#endif > > bool cxl_present = false; > > int i; > > VMBusBridge *vmbus_bridge = vmbus_bridge_find(); > > @@ -1793,26 +1790,6 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, > > } > > } > > > > -#ifdef CONFIG_TPM > > -if (TPM_IS_CRB(tpm)) { > > -dev = aml_device("TPM"); > > -aml_append(dev, aml_name_decl("_HID", aml_string("MSFT0101"))); > > -aml_append(dev, aml_name_decl("_STR", > > - aml_string("TPM 2.0 Device"))); > > -crs = aml_resource_template(); > > -aml_append(crs, aml_memory32_fixed(TPM_CRB_ADDR_BASE, > > - TPM_CRB_ADDR_SIZE, > > AML_READ_WRITE)); > > -aml_append(dev, aml_name_decl("_CRS", crs)); > > - > > -aml_append(dev, aml_name_decl("_STA", aml_int(0xf))); > > -aml_append(dev, aml_name_decl("_UID", aml_int(1))); > > - > > -tpm_build_ppi_acpi(tpm, dev); > > - > > -aml_append(sb_scope, dev); > > -} > > -#endif > > - > > if (pcms->sgx_epc.size != 0) { > > uint64_t epc_base = pcms->sgx_epc.base; > > uint64_t epc_size = pcms->sgx_epc.size; > > diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c > > index 6144081d30..14feb9857f 100644 > > --- a/hw/tpm/tpm_crb.c > > +++ b/hw/tpm/tpm_crb.c > > @@ -19,6 +19,8 @@ > > #include "qemu/module.h" > > #include 
"qapi/error.h" > > #include "exec/address-spaces.h" > > +#include "hw/acpi/acpi_aml_interface.h" > > +#include "hw/acpi/tpm.h" > > #include "hw/qdev-properties.h" > > #include "hw/pci/pci_ids.h" > > #include "hw/acpi/tpm.h" > > @@ -116,10 +118,34 @@ static void tpm_crb_isa_realize(DeviceState *dev, > > Error **errp) > > } > > } > > > > +static void build_tpm_crb_isa_aml(AcpiDevAmlIf *adev, Aml *scope) > > +{ > > +Aml *dev, *crs; > > +CRBState *s = CRB(adev); > > +TPMIf *ti = TPM_IF(s); > > + > > +dev = aml_device("TPM"); > > +if (tpm_crb_isa_get_version(ti) == TPM_VERSION_2_0) { > > +aml_append(dev, aml_name_decl("_HID", aml_string("MSFT0101"))); > > +aml_append(dev, aml_name_decl("_STR", aml_string("TPM 2.0 > > Device"))); > > +} else { > > +aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0C31"))); > > +} > > CRB only exists for TPM 2.0 and that's why we didn't have a different case > here before. > > CRB only has MSFT0101: > https://elixir.bootlin.com/linux/latest/source/drivers/char/tpm/tpm_crb.c#L820 > TIS has PNP0C31: > https://elixir.bootlin.com/linux/latest/source/drivers/char/tpm/tpm_tis.c > > You should remove the check for TPM_VERSION_2_0. 
> > Stefan > > +aml_append(dev, aml_name_decl("_UID", aml_int(1))); > > +aml_append(dev, aml_name_decl("_STA", aml_int(0xF))); > > +crs = aml_resource_template(); > > +aml_append(crs, aml_memory32_fixed(TPM_CRB_ADDR_BASE, > > TPM_CRB_ADDR_SIZE, > > + AML_READ_WRITE)); > > +aml_append(dev, aml_name_decl("_CRS", crs)); > > +tpm_build_ppi_acpi(ti, dev); > > +aml_append(scope, dev); > > +} > > + > > static void tpm_crb_isa_class_init(ObjectClass *klass, void *data) > > { > > DeviceClass *dc = DEVICE_CLASS(klass); > > TPMIfClass *tc = TPM_IF_CLASS(klass); > > +AcpiDevAmlIfClass *adevc = ACPI_DEV_AML_IF_CLASS(klass); > > > > dc->realize = tpm_crb_isa_realize; > > device_class_set_props(dc, tpm_crb_isa_properties); > > @@ -128,6 +154,7 @@ static void tpm_crb_isa_class_init(ObjectClass *klass, > > void *data) > > tc->model = TPM_MODEL_TPM_CRB; > > tc->get_version = tpm_crb_isa_get_version; > > tc->request_completed = tpm_crb_isa_request_completed; > > +adevc->build_dev_aml = build_tpm_crb_isa_aml; > > > > set_bit(DEVICE_CATEGORY_MISC, dc->categories); > > } > > @@ -139,6 +166,7 @@ static const TypeInfo tpm_crb_isa_info = { > > .class_init = tpm_crb_isa_class_init, > > .interfaces = (InterfaceInfo[]) { > > { TYPE_TPM_IF }, > > +{ TYPE_ACPI_DEV_AML_IF }, > > { } > >
Re: [PATCH 07/11] hw/arm/virt: add plug handler for TPM on SysBus
On Thu, Jul 13, 2023 at 8:31 AM Peter Maydell wrote:
>
> On Thu, 13 Jul 2023 at 04:52, Joelle van Dyne wrote:
> >
> > TPM needs to know its own base address in order to generate its DSDT
> > device entry.
> >
> > Signed-off-by: Joelle van Dyne
> > ---
> >  hw/arm/virt.c | 37 +
> >  1 file changed, 37 insertions(+)
> >
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 7d9dbc2663..432148ef47 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -2732,6 +2732,37 @@ static void virt_memory_plug(HotplugHandler *hotplug_dev,
> >                           dev, &error_abort);
> >  }
> >
> > +#ifdef CONFIG_TPM
> > +static void virt_tpm_plug(VirtMachineState *vms, TPMIf *tpmif)
> > +{
> > +    PlatformBusDevice *pbus = PLATFORM_BUS_DEVICE(vms->platform_bus_dev);
> > +    hwaddr pbus_base = vms->memmap[VIRT_PLATFORM_BUS].base;
> > +    SysBusDevice *sbdev = SYS_BUS_DEVICE(tpmif);
> > +    MemoryRegion *sbdev_mr;
> > +    hwaddr tpm_base;
> > +    uint64_t tpm_size;
> > +
> > +    if (!sbdev || !object_dynamic_cast(OBJECT(sbdev), TYPE_SYS_BUS_DEVICE)) {
> > +        return;
> > +    }
> > +
> > +    tpm_base = platform_bus_get_mmio_addr(pbus, sbdev, 0);
> > +    assert(tpm_base != -1);
> > +
> > +    tpm_base += pbus_base;
> > +
> > +    sbdev_mr = sysbus_mmio_get_region(sbdev, 0);
> > +    tpm_size = memory_region_size(sbdev_mr);
> > +
> > +    if (object_property_find(OBJECT(sbdev), "baseaddr")) {
> > +        object_property_set_uint(OBJECT(sbdev), "baseaddr", tpm_base, NULL);
> > +    }
> > +    if (object_property_find(OBJECT(sbdev), "size")) {
> > +        object_property_set_uint(OBJECT(sbdev), "size", tpm_size, NULL);
> > +    }
> > +}
> > +#endif
>
> I do not like the "platform bus" at all -- it is a nasty hack.
> If the virt board needs a memory mapped TPM device it should probably
> just create one, the same way we create our other memory mapped
> devices. But...
>
> How are TPM devices typically set up/visible to the guest on
> real Arm server hardware ? Should this be a sysbus device at all?

+Alexander Graf who may answer this better.
My understanding is that we need to do this for the device to know its own address, which it needs to return in a register. On ISA devices, it is always mapped to the same physical address so there are no issues, but for Virt machines, device addresses are dynamically allocated by the PlatformBusDevice, so only at this late stage can we tell the device what its own address is.

>
> thanks
> -- PMM

Also to Stefan's question on consolidating code: that is ideal, but currently it seems like much platform setup code is duplicated amongst the various architectures' Virt machines. There would have to be a larger effort in de-duplicating a lot of that code. Indeed, we try to do this already with some of the ACPI stuff in the other patches. For this specifically, we would need to know the platform bus' base address, which is done differently in ARM64's Virt and in Loongarch's Virt. All we did was delete some existing duplicated code and replace it with different duplicated code :)
Re: [PATCH] hw/tpm: TIS on sysbus: Remove unsupport ppi command line option
Hi Stefan, On 7/13/23 19:19, Stefan Berger wrote: > The ppi command line option for the TIS device on sysbus never worked > and caused an immediate segfault. Remove support for it since it also > needs support in the firmware and needs testing inside the VM. > > Reproducer with the ppi=on option passed: > > qemu-system-aarch64 \ >-machine virt,gic-version=3 \ >-m 4G \ >-nographic -no-acpi \ >-chardev socket,id=chrtpm,path=/tmp/mytpm1/swtpm-sock \ >-tpmdev emulator,id=tpm0,chardev=chrtpm \ >-device tpm-tis-device,tpmdev=tpm0,ppi=on > [...] > Segmentation fault (core dumped) > > Signed-off-by: Stefan Berger Reviewed-by: Eric Auger Thanks! Eric > --- > hw/tpm/tpm_tis_sysbus.c | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/hw/tpm/tpm_tis_sysbus.c b/hw/tpm/tpm_tis_sysbus.c > index 45e63efd63..6724b3d4f6 100644 > --- a/hw/tpm/tpm_tis_sysbus.c > +++ b/hw/tpm/tpm_tis_sysbus.c > @@ -93,7 +93,6 @@ static void tpm_tis_sysbus_reset(DeviceState *dev) > static Property tpm_tis_sysbus_properties[] = { > DEFINE_PROP_UINT32("irq", TPMStateSysBus, state.irq_num, TPM_TIS_IRQ), > DEFINE_PROP_TPMBE("tpmdev", TPMStateSysBus, state.be_driver), > -DEFINE_PROP_BOOL("ppi", TPMStateSysBus, state.ppi_enabled, false), > DEFINE_PROP_END_OF_LIST(), > }; >
Re: [PATCH for-8.1 1/3] target/arm/ptw.c: Add comments to S1Translate struct fields
On 7/10/23 16:21, Peter Maydell wrote: Add comments to the in_* fields in the S1Translate struct that explain what they're doing. Signed-off-by: Peter Maydell --- I figured some of this out when writing commit fcc0b0418fff, and then I found I'd forgotten it all when I was trying to fix this new bug. So this time I'm writing this down :-) --- target/arm/ptw.c | 40 1 file changed, 40 insertions(+) Reviewed-by: Richard Henderson r~
Re: [PATCH 00/11] tpm: introduce TPM CRB SysBus device
On Thu, Jul 13, 2023 at 6:07 AM Stefan Berger wrote: > > > > On 7/12/23 23:51, Joelle van Dyne wrote: > > The impetus for this patch set is to get TPM 2.0 working on Windows 11 > > ARM64. > > Windows' tpm.sys does not seem to work on a TPM TIS device (as verified with > > VMWare's implementation). However, the current TPM CRB device uses a fixed > > system bus address that is reserved for RAM in ARM64 Virt machines. > > Thanks a lot for this work. The last sentence seems to hint at the current > issue > with TPM CRB on ARM64 and seems to be the only way forward there. You may want > to reformulate it a bit because it's not clear how the 'however' related to > CRB relates to TIS. > > > > > In the process of adding the TPM CRB SysBus device, we also went ahead and > > cleaned up some of the existing TPM hardware code and fixed some bugs. We > > used > > Please reorder bugs to the beginning of the series or submit in an extra > patch set > so we can backport them. Ideal would be description(s) for how to trigger the > bug(s). > > > the TPM TIS devices as a template for the TPM CRB devices and refactored out > > common code. We moved the ACPI DSDT generation to the device in order to > > handle > > dynamic base address requirements as well as reduce redundent code in > > different > s/redundent/redundant > > > > machine ACPI generation. We also changed the tpm_crb device to use the ISA > > bus > > instead of depending on the default system bus as the device only was built > > for > > the PC configuration. > > > > Another change is that the TPM CRB registers are now mapped in the same way > > that > > the pflash ROM devices are mapped. It is a memory region whose writes are > > trapped as MMIO accesses. This was needed because Apple Silicon does not > > decode > > LDP caused page faults. @agraf suggested that we do this to avoid having to > > Afaik, LDP is an ARM assembly instruction that loads two 32bit or 64bit > registers from > consecutive addresses. 
May be worth mentioning for those wondering about it... > > > do AARCH64 decoding in the HVF fault handler. > > What is HVF? Sorry, HVF is the QEMU backend for Apple's Hypervisor.framework which runs on macOS including on Apple Silicon. > > Regards, > Stefan > > > > Unfortunately, it seems like the LDP fault still happens on HVF but the > > issue > > seems to be in the HVF backend which needs to be fixed in a separate patch. > > > > One last thing that's needed to get Windows 11 to recognize the TPM 2.0 > > device > > is for the OVMF firmware to setup the TPM device. Currently, OVMF for ARM64 > > Virt > > only recognizes the TPM TIS device through a FDT entry. A workaround is to > > falsely identify the TPM CRB device as a TPM TIS device in the FDT node but > > this > > causes issues for Linux. A proper fix would involve adding an ACPI device > > driver > > in OVMF. > > > > Joelle van Dyne (11): > >tpm_crb: refactor common code > >tpm_crb: CTRL_RSP_ADDR is 64-bits wide > >tpm_ppi: refactor memory space initialization > >tpm_crb: use a single read-as-mem/write-as-mmio mapping > >tpm_crb: use the ISA bus > >tpm_crb: move ACPI table building to device interface > >hw/arm/virt: add plug handler for TPM on SysBus > >hw/loongarch/virt: add plug handler for TPM on SysBus > >tpm_tis_sysbus: fix crash when PPI is enabled > >tpm_tis_sysbus: move DSDT AML generation to device > >tpm_crb_sysbus: introduce TPM CRB SysBus device > > > > docs/specs/tpm.rst | 2 + > > hw/tpm/tpm_crb.h| 74 + > > hw/tpm/tpm_ppi.h| 10 +- > > include/hw/acpi/aml-build.h | 1 + > > include/hw/acpi/tpm.h | 3 +- > > include/sysemu/tpm.h| 3 + > > hw/acpi/aml-build.c | 7 +- > > hw/arm/virt-acpi-build.c| 38 + > > hw/arm/virt.c | 38 + > > hw/core/sysbus-fdt.c| 1 + > > hw/i386/acpi-build.c| 23 --- > > hw/loongarch/acpi-build.c | 38 + > > hw/loongarch/virt.c | 38 + > > hw/riscv/virt.c | 1 + > > hw/tpm/tpm_crb.c| 307 > > hw/tpm/tpm_crb_common.c | 224 ++ > > hw/tpm/tpm_crb_sysbus.c | 178 + > > hw/tpm/tpm_ppi.c| 5 
+- > > hw/tpm/tpm_tis_isa.c| 5 +- > > hw/tpm/tpm_tis_sysbus.c | 43 + > > tests/qtest/tpm-crb-test.c | 2 +- > > tests/qtest/tpm-util.c | 2 +- > > hw/arm/Kconfig | 1 + > > hw/riscv/Kconfig| 1 + > > hw/tpm/Kconfig | 7 +- > > hw/tpm/meson.build | 3 + > > hw/tpm/trace-events | 2 +- > > 27 files changed, 703 insertions(+), 354 deletions(-) > > create mode 100644 hw/tpm/tpm_crb.h > > create mode 100644 hw/tpm/tpm_crb_common.c > > create mode 100644 hw/tpm/tpm_crb_sysbus.c > >
[PATCH] hw/tpm: TIS on sysbus: Remove unsupport ppi command line option
The ppi command line option for the TIS device on sysbus never worked and caused an immediate segfault. Remove support for it since it also needs support in the firmware and needs testing inside the VM. Reproducer with the ppi=on option passed: qemu-system-aarch64 \ -machine virt,gic-version=3 \ -m 4G \ -nographic -no-acpi \ -chardev socket,id=chrtpm,path=/tmp/mytpm1/swtpm-sock \ -tpmdev emulator,id=tpm0,chardev=chrtpm \ -device tpm-tis-device,tpmdev=tpm0,ppi=on [...] Segmentation fault (core dumped) Signed-off-by: Stefan Berger --- hw/tpm/tpm_tis_sysbus.c | 1 - 1 file changed, 1 deletion(-) diff --git a/hw/tpm/tpm_tis_sysbus.c b/hw/tpm/tpm_tis_sysbus.c index 45e63efd63..6724b3d4f6 100644 --- a/hw/tpm/tpm_tis_sysbus.c +++ b/hw/tpm/tpm_tis_sysbus.c @@ -93,7 +93,6 @@ static void tpm_tis_sysbus_reset(DeviceState *dev) static Property tpm_tis_sysbus_properties[] = { DEFINE_PROP_UINT32("irq", TPMStateSysBus, state.irq_num, TPM_TIS_IRQ), DEFINE_PROP_TPMBE("tpmdev", TPMStateSysBus, state.be_driver), -DEFINE_PROP_BOOL("ppi", TPMStateSysBus, state.ppi_enabled, false), DEFINE_PROP_END_OF_LIST(), }; -- 2.41.0
Re: [PATCH 04/11] tpm_crb: use a single read-as-mem/write-as-mmio mapping
On Thu, 13 Jul 2023 at 18:16, Stefan Berger wrote: > I guess the first point would be to decide whether to support an i2c bus on > the virt board and then whether we can use the aspeed bus that we know that > the tpm_tis_i2c device model works with but we don't know how Windows may > react to it. > > It seems sysbus is already supported there so ... we may have a 'match'? You can use sysbus devices anywhere -- they're just "this is a memory mapped device". The question is whether we should, or whether an i2c controller is more like what the real world uses (and if so, what i2c controller). -- PMM
Re: [PATCH 04/11] tpm_crb: use a single read-as-mem/write-as-mmio mapping
On 7/13/23 13:07, Peter Maydell wrote: On Thu, 13 Jul 2023 at 17:54, Stefan Berger wrote: On 7/13/23 11:55, Peter Maydell wrote: On Thu, 13 Jul 2023 at 16:46, Stefan Berger wrote: On 7/13/23 11:34, Peter Maydell wrote: On Thu, 13 Jul 2023 at 16:28, Stefan Berger wrote: On 7/13/23 10:50, Peter Maydell wrote: I'm not a super-fan of hacking around the fact that LDP to hardware registers isn't supported in specific device models, though... What does this mean for this effort here? Usually we say "fix the guest to not try to access hardware registers with silly load/store instruction types". The other option would be "put in a large amount of effort to support emulating those instructions in QEMU userspace when KVM/HVF/etc trap and punt them to us". For the last decade or so we have taken the first of these approaches :-) Is Microsoft likely to react to use telling them "fix the guest"? They have on occasion in the past, yes. The other outstanding question here is if this TPM device should be a sysbus one at all (i.e. not i2c), which might render this part moot. Does the aarch64 virt VM support an i2c bus? Would it support the aspeed i2c bus? Does Windows then accept this i2c bus? Maybe the faster answer comes via this device that Joelle presumably has working on AARCH64 Windows. The aim is not "get Windows booting as fast as possible", though. It's to end up with a QEMU virt board that (a) is maintainable (b) is reasonably congruent with what real hardware does (c) works in a way that will also work with what other guest OSes are expecting. I don't want to accept changes to the virt board that are hard to live with in future, because changing virt in non-backward compatible ways is painful. I guess the first point would be to decide whether to support an i2c bus on the virt board and then whether we can use the aspeed bus that we know that the tpm_tis_i2c device model works with but we don't know how Windows may react to it. 
It seems sysbus is already supported there so ... we may have a 'match'?

    dev = qdev_new("arm-gicv2m");
    sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, vms->memmap[VIRT_GIC_V2M].base);
    qdev_prop_set_uint32(dev, "base-spi", irq);
    qdev_prop_set_uint32(dev, "num-spi", NUM_GICV2M_SPIS);
    sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);

Stefan

thanks -- PMM
Re: [PATCH 04/11] tpm_crb: use a single read-as-mem/write-as-mmio mapping
On Thu, 13 Jul 2023 at 17:54, Stefan Berger wrote: > > > > On 7/13/23 11:55, Peter Maydell wrote: > > On Thu, 13 Jul 2023 at 16:46, Stefan Berger wrote: > >> On 7/13/23 11:34, Peter Maydell wrote: > >>> On Thu, 13 Jul 2023 at 16:28, Stefan Berger wrote: > On 7/13/23 10:50, Peter Maydell wrote: > > I'm not a super-fan of hacking around the fact that LDP > > to hardware registers isn't supported in specific device > > models, though... > > What does this mean for this effort here? > >>> > >>> Usually we say "fix the guest to not try to access hardware > >>> registers with silly load/store instruction types". The other > >>> option would be "put in a large amount of effort to support > >>> emulating those instructions in QEMU userspace when KVM/HVF/etc > >>> trap and punt them to us". For the last decade or so we have > >>> taken the first of these approaches :-) > >> > >> Is Microsoft likely to react to use telling them "fix the guest"? > > > > They have on occasion in the past, yes. > > > > The other outstanding question here is if this TPM device > > should be a sysbus one at all (i.e. not i2c), which might > > render this part moot. > > Does the aarch64 virt VM support an i2c bus? Would it support the aspeed i2c > bus? Does Windows then accept this i2c bus? Maybe the faster answer comes via > this device that Joelle presumably has working on AARCH64 Windows. The aim is not "get Windows booting as fast as possible", though. It's to end up with a QEMU virt board that (a) is maintainable (b) is reasonably congruent with what real hardware does (c) works in a way that will also work with what other guest OSes are expecting. I don't want to accept changes to the virt board that are hard to live with in future, because changing virt in non-backward compatible ways is painful. thanks -- PMM
Re: [PATCH 04/11] tpm_crb: use a single read-as-mem/write-as-mmio mapping
On 7/13/23 11:55, Peter Maydell wrote: On Thu, 13 Jul 2023 at 16:46, Stefan Berger wrote: On 7/13/23 11:34, Peter Maydell wrote: On Thu, 13 Jul 2023 at 16:28, Stefan Berger wrote: On 7/13/23 10:50, Peter Maydell wrote: I'm not a super-fan of hacking around the fact that LDP to hardware registers isn't supported in specific device models, though... What does this mean for this effort here? Usually we say "fix the guest to not try to access hardware registers with silly load/store instruction types". The other option would be "put in a large amount of effort to support emulating those instructions in QEMU userspace when KVM/HVF/etc trap and punt them to us". For the last decade or so we have taken the first of these approaches :-) Is Microsoft likely to react to us telling them "fix the guest"? They have on occasion in the past, yes. The other outstanding question here is if this TPM device should be a sysbus one at all (i.e. not i2c), which might render this part moot. Does the aarch64 virt VM support an i2c bus? Would it support the aspeed i2c bus? Does Windows then accept this i2c bus? Maybe the faster answer comes via this device that Joelle presumably has working on AARCH64 Windows. Stefan thanks -- PMM
Re: [PATCH 09/11] tpm_tis_sysbus: fix crash when PPI is enabled
On 7/12/23 23:51, Joelle van Dyne wrote: If 'ppi' property is set, then `tpm_ppi_reset` is called on reset which SEGFAULTs because `tpmppi->buf` is not allocated. Signed-off-by: Joelle van Dyne --- hw/tpm/tpm_tis_sysbus.c | 4 1 file changed, 4 insertions(+) diff --git a/hw/tpm/tpm_tis_sysbus.c b/hw/tpm/tpm_tis_sysbus.c index 45e63efd63..1014d5d993 100644 --- a/hw/tpm/tpm_tis_sysbus.c +++ b/hw/tpm/tpm_tis_sysbus.c @@ -124,6 +124,10 @@ static void tpm_tis_sysbus_realizefn(DeviceState *dev, Error **errp) error_setg(errp, "'tpmdev' property is required"); return; } + +if (s->ppi_enabled) { +sysbus_init_mmio(SYS_BUS_DEVICE(dev), >ppi.ram); +} } The tpm-tis-device doesn't work for x86_64 but for aarch64. We have this here in this file: DEFINE_PROP_BOOL("ppi", TPMStateSysBus, state.ppi_enabled, false), I don't know whether ppi would work on aarch64. It needs firmware support like in edk2. I think the best solution is to remove this DEFINE_PROP_BOOL() and if someone wants to enable it they would have to add firmware support and test it before re-enabling it. Stefan static void tpm_tis_sysbus_class_init(ObjectClass *klass, void *data)
Re: [PATCH 0/3] hw/arm/virt: Use generic CPU invalidation
On 13/7/23 14:34, Gavin Shan wrote: Hi Peter and Marcin, On 7/13/23 21:52, Marcin Juszkiewicz wrote: On 13.07.2023 at 13:44, Peter Maydell wrote: I see this isn't a change in this patch, but given that what the user specifies is not "cortex-a8-arm-cpu" but "cortex-a8", why do we include the "-arm-cpu" suffix in the error messages? It's not valid syntax to say "-cpu cortex-a8-arm-cpu", so it's a bit misleading... Internally those cpu names are "max-{TYPE_ARM_CPU}" and similar for other architectures. I like the change but it (IMHO) needs to cut the "-{TYPE_*_CPU}" string from the names: 13:37 marcin@applejack:qemu$ ./build/aarch64-softmmu/qemu-system-aarch64 -M virt -cpu cortex-r5 qemu-system-aarch64: Invalid CPU type: cortex-r5-arm-cpu The valid types are: cortex-a7-arm-cpu, cortex-a15-arm-cpu, cortex-a35-arm-cpu, cortex-a55-arm-cpu, cortex-a72-arm-cpu, cortex-a76-arm-cpu, a64fx-arm-cpu, neoverse-n1-arm-cpu, neoverse-v1-arm-cpu, cortex-a53-arm-cpu, cortex-a57-arm-cpu, host-arm-cpu, max-arm-cpu 13:37 marcin@applejack:qemu$ ./build/aarch64-softmmu/qemu-system-aarch64 -M virt -cpu cortex-a57-arm-cpu qemu-system-aarch64: unable to find CPU model 'cortex-a57-arm-cpu' The suffix of the CPU types is provided in hw/arm/virt.c::valid_cpu_types in PATCH[2]. In the generic validation, the complete CPU type is used. The error message also has the complete CPU type there. In some places (arm_cpu_list_entry, arm_cpu_add_definition) we use: g_strndup(typename, strlen(typename) - strlen("-" TYPE_ARM_CPU)) Maybe extract as a helper? cpu_typename_name()? :)
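A rough sketch of what such a helper could look like, written with plain libc instead of glib so it builds outside the QEMU tree. The names `cpu_model_from_typename()` and `SKETCH_TYPE_ARM_CPU` are hypothetical and only illustrate the suffix stripping done by the g_strndup() expression quoted above; unlike that expression, the sketch also checks that the suffix is actually present before cutting it.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for QEMU's TYPE_ARM_CPU; an assumption made for this sketch. */
#define SKETCH_TYPE_ARM_CPU "arm-cpu"

/*
 * Hypothetical helper: return a newly allocated copy of @type_name with a
 * trailing "-" SKETCH_TYPE_ARM_CPU removed. If the suffix is absent, the
 * string is copied unchanged. Returns NULL on allocation failure; the
 * caller frees the result with free().
 */
char *cpu_model_from_typename(const char *type_name)
{
    static const char suffix[] = "-" SKETCH_TYPE_ARM_CPU;
    size_t len = strlen(type_name);
    size_t suffix_len = sizeof(suffix) - 1;
    char *model;

    /* Only strip when something is left over as the model name. */
    if (len > suffix_len &&
        strcmp(type_name + len - suffix_len, suffix) == 0) {
        len -= suffix_len;
    }

    model = malloc(len + 1);
    if (model == NULL) {
        return NULL;
    }
    memcpy(model, type_name, len);
    model[len] = '\0';
    return model;
}
```

With something along these lines, the error path could print the user-visible model name (for example "cortex-a57") rather than the internal QOM type name.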
RE: [PATCH] target/hexagon/idef-parser: Remove self-assignment
> -Original Message- > From: Anton Johansson > Sent: Thursday, July 13, 2023 7:09 AM > To: qemu-devel@nongnu.org > Cc: Brian Cain ; peter.mayd...@linaro.org > Subject: [PATCH] target/hexagon/idef-parser: Remove self-assignment > > The self assignment is clearly useless, and @1.last_column does not have > to be set for an expression with only a single token, so remove it. > > Reported-by: Peter Maydell > Signed-off-by: Anton Johansson > --- > target/hexagon/idef-parser/idef-parser.y | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/target/hexagon/idef-parser/idef-parser.y b/target/hexagon/idef- > parser/idef-parser.y > index cd2612eb8c..a6587f5bcc 100644 > --- a/target/hexagon/idef-parser/idef-parser.y > +++ b/target/hexagon/idef-parser/idef-parser.y > @@ -802,7 +802,6 @@ rvalue : FAIL > > lvalue : FAIL > { > - @1.last_column = @1.last_column; > yyassert(c, &@1, false, "Encountered a FAIL token as > lvalue.\n"); > } > | REG > -- > 2.41.0 Reviewed-by: Brian Cain
Re: [PATCH] target/hexagon/idef-parser: Remove self-assignment
On 13/7/23 14:08, Anton Johansson via wrote: The self assignment is clearly useless, and @1.last_column does not have to be set for an expression with only a single token, so remove it. Reported-by: Peter Maydell Signed-off-by: Anton Johansson --- target/hexagon/idef-parser/idef-parser.y | 1 - 1 file changed, 1 deletion(-) Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH 06/11] tpm_crb: move ACPI table building to device interface
On 7/12/23 23:51, Joelle van Dyne wrote: This logic is similar to TPM TIS ISA device. Signed-off-by: Joelle van Dyne --- hw/i386/acpi-build.c | 23 --- hw/tpm/tpm_crb.c | 28 2 files changed, 28 insertions(+), 23 deletions(-) diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index 9c74fa17ad..b767df39df 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -1441,9 +1441,6 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, uint32_t nr_mem = machine->ram_slots; int root_bus_limit = 0xFF; PCIBus *bus = NULL; -#ifdef CONFIG_TPM -TPMIf *tpm = tpm_find(); -#endif bool cxl_present = false; int i; VMBusBridge *vmbus_bridge = vmbus_bridge_find(); @@ -1793,26 +1790,6 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, } } -#ifdef CONFIG_TPM -if (TPM_IS_CRB(tpm)) { -dev = aml_device("TPM"); -aml_append(dev, aml_name_decl("_HID", aml_string("MSFT0101"))); -aml_append(dev, aml_name_decl("_STR", - aml_string("TPM 2.0 Device"))); -crs = aml_resource_template(); -aml_append(crs, aml_memory32_fixed(TPM_CRB_ADDR_BASE, - TPM_CRB_ADDR_SIZE, AML_READ_WRITE)); -aml_append(dev, aml_name_decl("_CRS", crs)); - -aml_append(dev, aml_name_decl("_STA", aml_int(0xf))); -aml_append(dev, aml_name_decl("_UID", aml_int(1))); - -tpm_build_ppi_acpi(tpm, dev); - -aml_append(sb_scope, dev); -} -#endif - if (pcms->sgx_epc.size != 0) { uint64_t epc_base = pcms->sgx_epc.base; uint64_t epc_size = pcms->sgx_epc.size; diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c index 6144081d30..14feb9857f 100644 --- a/hw/tpm/tpm_crb.c +++ b/hw/tpm/tpm_crb.c @@ -19,6 +19,8 @@ #include "qemu/module.h" #include "qapi/error.h" #include "exec/address-spaces.h" +#include "hw/acpi/acpi_aml_interface.h" +#include "hw/acpi/tpm.h" #include "hw/qdev-properties.h" #include "hw/pci/pci_ids.h" #include "hw/acpi/tpm.h" @@ -116,10 +118,34 @@ static void tpm_crb_isa_realize(DeviceState *dev, Error **errp) } } +static void build_tpm_crb_isa_aml(AcpiDevAmlIf *adev, Aml *scope) +{ +Aml *dev, *crs; +CRBState *s = 
CRB(adev); +TPMIf *ti = TPM_IF(s); + +dev = aml_device("TPM"); +if (tpm_crb_isa_get_version(ti) == TPM_VERSION_2_0) { +aml_append(dev, aml_name_decl("_HID", aml_string("MSFT0101"))); +aml_append(dev, aml_name_decl("_STR", aml_string("TPM 2.0 Device"))); +} else { +aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0C31"))); +} CRB only exists for TPM 2.0 and that's why we didn't have a different case here before. CRB only has MSFT0101: https://elixir.bootlin.com/linux/latest/source/drivers/char/tpm/tpm_crb.c#L820 TIS has PNP0C31: https://elixir.bootlin.com/linux/latest/source/drivers/char/tpm/tpm_tis.c You should remove the check for TPM_VERSION_2_0. Stefan +aml_append(dev, aml_name_decl("_UID", aml_int(1))); +aml_append(dev, aml_name_decl("_STA", aml_int(0xF))); +crs = aml_resource_template(); +aml_append(crs, aml_memory32_fixed(TPM_CRB_ADDR_BASE, TPM_CRB_ADDR_SIZE, + AML_READ_WRITE)); +aml_append(dev, aml_name_decl("_CRS", crs)); +tpm_build_ppi_acpi(ti, dev); +aml_append(scope, dev); +} + static void tpm_crb_isa_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); TPMIfClass *tc = TPM_IF_CLASS(klass); +AcpiDevAmlIfClass *adevc = ACPI_DEV_AML_IF_CLASS(klass); dc->realize = tpm_crb_isa_realize; device_class_set_props(dc, tpm_crb_isa_properties); @@ -128,6 +154,7 @@ static void tpm_crb_isa_class_init(ObjectClass *klass, void *data) tc->model = TPM_MODEL_TPM_CRB; tc->get_version = tpm_crb_isa_get_version; tc->request_completed = tpm_crb_isa_request_completed; +adevc->build_dev_aml = build_tpm_crb_isa_aml; set_bit(DEVICE_CATEGORY_MISC, dc->categories); } @@ -139,6 +166,7 @@ static const TypeInfo tpm_crb_isa_info = { .class_init = tpm_crb_isa_class_init, .interfaces = (InterfaceInfo[]) { { TYPE_TPM_IF }, +{ TYPE_ACPI_DEV_AML_IF }, { } } };
Re: [PATCH 03/11] tpm_ppi: refactor memory space initialization
On 7/12/23 23:51, Joelle van Dyne wrote: Instead of calling `memory_region_add_subregion` directly, we defer to the caller to do it. This allows us to re-use the code for a SysBus device. Signed-off-by: Joelle van Dyne Reviewed-by: Stefan Berger --- hw/tpm/tpm_ppi.h| 10 +++--- hw/tpm/tpm_crb.c| 4 ++-- hw/tpm/tpm_crb_common.c | 3 +++ hw/tpm/tpm_ppi.c| 5 + hw/tpm/tpm_tis_isa.c| 5 +++-- 5 files changed, 12 insertions(+), 15 deletions(-) diff --git a/hw/tpm/tpm_ppi.h b/hw/tpm/tpm_ppi.h index bf5d4a300f..30863c6438 100644 --- a/hw/tpm/tpm_ppi.h +++ b/hw/tpm/tpm_ppi.h @@ -20,17 +20,13 @@ typedef struct TPMPPI { } TPMPPI; /** - * tpm_ppi_init: + * tpm_ppi_init_memory: * @tpmppi: a TPMPPI - * @m: the address-space / MemoryRegion to use - * @addr: the address of the PPI region * @obj: the owner object * - * Register the TPM PPI memory region at @addr on the given address - * space for the object @obj. + * Creates the TPM PPI memory region. **/ -void tpm_ppi_init(TPMPPI *tpmppi, MemoryRegion *m, - hwaddr addr, Object *obj); +void tpm_ppi_init_memory(TPMPPI *tpmppi, Object *obj); /** * tpm_ppi_reset: diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c index 3ef4977fb5..598c3e0161 100644 --- a/hw/tpm/tpm_crb.c +++ b/hw/tpm/tpm_crb.c @@ -107,8 +107,8 @@ static void tpm_crb_none_realize(DeviceState *dev, Error **errp) TPM_CRB_ADDR_BASE + sizeof(s->state.regs), >state.cmdmem); if (s->state.ppi_enabled) { -tpm_ppi_init(>state.ppi, get_system_memory(), - TPM_PPI_ADDR_BASE, OBJECT(s)); +memory_region_add_subregion(get_system_memory(), +TPM_PPI_ADDR_BASE, >state.ppi.ram); } if (xen_enabled()) { diff --git a/hw/tpm/tpm_crb_common.c b/hw/tpm/tpm_crb_common.c index 228e2d0faf..e56e910670 100644 --- a/hw/tpm/tpm_crb_common.c +++ b/hw/tpm/tpm_crb_common.c @@ -216,4 +216,7 @@ void tpm_crb_init_memory(Object *obj, TPMCRBState *s, Error **errp) "tpm-crb-mmio", sizeof(s->regs)); memory_region_init_ram(>cmdmem, obj, "tpm-crb-cmd", CRB_CTRL_CMD_SIZE, errp); +if (s->ppi_enabled) { 
+tpm_ppi_init_memory(>ppi, obj); +} } diff --git a/hw/tpm/tpm_ppi.c b/hw/tpm/tpm_ppi.c index 7f74e26ec6..40cab59afa 100644 --- a/hw/tpm/tpm_ppi.c +++ b/hw/tpm/tpm_ppi.c @@ -44,14 +44,11 @@ void tpm_ppi_reset(TPMPPI *tpmppi) } } -void tpm_ppi_init(TPMPPI *tpmppi, MemoryRegion *m, - hwaddr addr, Object *obj) +void tpm_ppi_init_memory(TPMPPI *tpmppi, Object *obj) { tpmppi->buf = qemu_memalign(qemu_real_host_page_size(), HOST_PAGE_ALIGN(TPM_PPI_ADDR_SIZE)); memory_region_init_ram_device_ptr(>ram, obj, "tpm-ppi", TPM_PPI_ADDR_SIZE, tpmppi->buf); vmstate_register_ram(>ram, DEVICE(obj)); - -memory_region_add_subregion(m, addr, >ram); } diff --git a/hw/tpm/tpm_tis_isa.c b/hw/tpm/tpm_tis_isa.c index 91e3792248..7cd7415f30 100644 --- a/hw/tpm/tpm_tis_isa.c +++ b/hw/tpm/tpm_tis_isa.c @@ -134,8 +134,9 @@ static void tpm_tis_isa_realizefn(DeviceState *dev, Error **errp) TPM_TIS_ADDR_BASE, >mmio); if (s->ppi_enabled) { -tpm_ppi_init(>ppi, isa_address_space(ISA_DEVICE(dev)), - TPM_PPI_ADDR_BASE, OBJECT(dev)); +tpm_ppi_init_memory(>ppi, OBJECT(dev)); +memory_region_add_subregion(isa_address_space(ISA_DEVICE(dev)), +TPM_PPI_ADDR_BASE, >ppi.ram); } }
Re: [PATCH 04/11] tpm_crb: use a single read-as-mem/write-as-mmio mapping
On Thu, 13 Jul 2023 at 16:46, Stefan Berger wrote: > On 7/13/23 11:34, Peter Maydell wrote: > > On Thu, 13 Jul 2023 at 16:28, Stefan Berger wrote: > >> On 7/13/23 10:50, Peter Maydell wrote: > >>> I'm not a super-fan of hacking around the fact that LDP > >>> to hardware registers isn't supported in specific device > >>> models, though... > >> > >> What does this mean for this effort here? > > > > Usually we say "fix the guest to not try to access hardware > > registers with silly load/store instruction types". The other > > option would be "put in a large amount of effort to support > > emulating those instructions in QEMU userspace when KVM/HVF/etc > > trap and punt them to us". For the last decade or so we have > > taken the first of these approaches :-) > > Is Microsoft likely to react to us telling them "fix the guest"? They have on occasion in the past, yes. The other outstanding question here is if this TPM device should be a sysbus one at all (i.e. not i2c), which might render this part moot. thanks -- PMM
Re: [PATCH V9 00/46] Live Update
Good morning, On 7/10/23 10:10, Steven Sistare wrote: On 6/12/2023 10:59 AM, Michael Galaxy wrote: Hi Steve, On 6/7/23 12:37, Steven Sistare wrote: On 6/7/2023 11:55 AM, Michael Galaxy wrote: Another option could be to expose "-migrate-mode-disable" (instead of enable) and just enable all 3 modes by default, since we are already required to switch from "normal" mode to a CPR-specific mode when it is time to do a live update, if the intention is to preserve the capability to completely prevent a running QEMU from using these modes before the VM starts up. - Michael On 6/6/23 17:15, Michael Galaxy wrote: Hi Steve, In the current design you have, we have to specify both the command line parameter "-migrate-mode-enable cpr-reboot" *and* issue the monitor command "migrate_set_parameter mode cpr-${mode}". Is it possible to opt-in to the CPR mode just once over the monitor instead of having to specify it twice on the command line? This would also match the live migration model: You do not need to necessarily "opt in" to live migration mode through a command line parameter, you simply request it when you need to. Can CPR behave the same way? This would also make switching over to a CPR-capable version of QEMU much simpler and would even make it work for existing libvirt-managed guests as their command line parameters would no longer need to change. This would allow us to simply power-off and power-on existing VMs to make them CPR-capable and then work on a libvirt patch later when we're ready to do so. Comments? Hi Michael, Requiring -migrate-enable-mode allows qemu to initialize objects differently, if necessary, so that migration for a mode is not blocked. See callers of migrate_mode_enabled. There is only one so far, in ram_block_add. If the mode is cpr-exec, then it creates anonymous ram blocks using memfd_create, else using MAP_ANON. In the V7 series, this was controlled by a '-machine memfd-alloc=on' option. migrate-enable-mode is more future proof for the user. 
If something new must initialize differently to support cpr, then it adds a call to migrate_mode_enabled, and the command line remains the same. However, I could be persuaded to go either way. OK, so it is cpr-exec that needs this option (because of ram block allocation), not really cpr-reboot. Could the option then be made to only be required for cpr-exec and not cpr-reboot, then, since cpr-reboot doesn't require that consideration? In a different forum Juan said this is a memory issue, so it should be expressed as a memory related option. So, I will delete -migrate-enable-mode and revert back to -machine memfd-alloc, as defined in the V7 patch series. Acknowledged. I'm going to try to get my reviewed-by's in soon. Sorry I haven't done it sooner. We've finished testing these patches on our systems and are moving forward. A secondary reason for -migrate-enable-mode is to support the only-cpr-capable option. It needs to know which mode will be used, in order to check a mode-specific blocker list. Still, only-cpr-capable is also optional. If and only if one needs this option, the mode could be specified as part of the option itself, rather than requiring an extra command line parameter, no? Yes, I will make that change. - Steve Acknowledged.
Re: [PATCH 04/11] tpm_crb: use a single read-as-mem/write-as-mmio mapping
On 7/13/23 11:34, Peter Maydell wrote: On Thu, 13 Jul 2023 at 16:28, Stefan Berger wrote: On 7/13/23 10:50, Peter Maydell wrote: On Thu, 13 Jul 2023 at 15:18, Stefan Berger wrote: On 7/12/23 23:51, Joelle van Dyne wrote: On Apple Silicon, when Windows performs a LDP on the CRB MMIO space, the exception is not decoded by hardware and we cannot trap the MMIO read. This led to the idea from @agraf to use the same mapping type as ROM devices: namely that reads should be seen as memory type and writes should trap as MMIO. +++ b/hw/tpm/tpm_crb.c @@ -68,7 +68,6 @@ static const VMStateDescription vmstate_tpm_crb_none = { .name = "tpm-crb", .pre_save = tpm_crb_none_pre_save, .fields = (VMStateField[]) { -VMSTATE_UINT32_ARRAY(state.regs, CRBState, TPM_CRB_R_MAX), This has to stay here otherwise we cannot restart VMs from saved state once QEMU is upgraded. 2023-07-13T14:15:43.997718Z qemu-system-x86_64: Unknown ramblock "tpm-crb-cmd", cannot accept migration 2023-07-13T14:15:43.997813Z qemu-system-x86_64: error while loading state for instance 0x0 of device 'ram' 2023-07-13T14:15:43.997841Z qemu-system-x86_64: load of migration failed: Invalid argument More generally, for migration compatibility in the other direction you need to use memory_region_init_rom_device_nomigrate() and make sure you keep migrating the data via this, not via the MemoryRegion. I'm not a super-fan of hacking around the fact that LDP to hardware registers isn't supported in specific device models, though... What does this mean for this effort here? Usually we say "fix the guest to not try to access hardware registers with silly load/store instruction types". The other option would be "put in a large amount of effort to support emulating those instructions in QEMU userspace when KVM/HVF/etc trap and punt them to us". For the last decade or so we have taken the first of these approaches :-) Is Microsoft likely to react to use telling them "fix the guest"? Stefan thanks -- PMM
Re: [PATCH] util/interval-tree: Avoid race conditions without optimization
On 7/13/23 12:32, Peter Maydell wrote: On Fri, 7 Jul 2023 at 11:30, Richard Henderson wrote: Read the left and right trees once, so that the gating tests are meaningful. This was only a problem at -O0, where the compiler didn't CSE the two reads. Cc: qemu-sta...@nongnu.org Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell If this data structure is intended to support operations being done on it while it's being mutated, shouldn't it be using the atomic accessors, though? That would make it clearer that you can't just undo the transformation made by this patch. Yes, it probably should. I use qatomic_set() where the kernel used WRITE_ONCE, but there was no markup for the read side. r~
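To make the "read once" point concrete, here is a small sketch using C11 atomics as a stand-in for QEMU's qatomic_* helpers. The `sketch_node` type and `sketch_left_may_overlap()` are invented for illustration and are not the interval-tree code from the patch. The acquire load is an assumption of this sketch: it is meant to pair with a release store on the writer side.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node type for illustration; not QEMU's IntervalTreeNode. */
struct sketch_node {
    _Atomic(struct sketch_node *) left;
    uint64_t subtree_last;
};

/*
 * Return 1 if the left subtree may contain an interval ending at or after
 * @start, 0 otherwise. The child pointer is loaded exactly once into a
 * local, so the NULL check and the dereference are guaranteed to use the
 * same value even if another thread replaces node->left in between.
 * Reading node->left twice (once for the test, once for the access) only
 * happens to work when the compiler merges the two loads.
 */
int sketch_left_may_overlap(struct sketch_node *node, uint64_t start)
{
    struct sketch_node *left =
        atomic_load_explicit(&node->left, memory_order_acquire);

    return left != NULL && start <= left->subtree_last;
}
```

Marking the load explicitly also documents for later readers that the local copy is required for correctness and is not a stylistic choice that can be undone.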
Re: [PATCH 02/11] tpm_crb: CTRL_RSP_ADDR is 64-bits wide
On 7/12/23 23:51, Joelle van Dyne wrote: The register is actually 64-bits but in order to make this more clear than the specification, we define two 32-bit registers: CTRL_RSP_LADDR and CTRL_RSP_HADDR to match the CTRL_CMD_* naming. This deviates from the specs but is way more clear. Previously, the only CRB device uses a fixed system address so this was not an issue. However, once we support SysBus CRB device, the address can be anywhere in 64-bit space. Signed-off-by: Joelle van Dyne Reviewed-by: Stefan Berger --- include/hw/acpi/tpm.h | 3 ++- hw/tpm/tpm_crb_common.c| 3 ++- tests/qtest/tpm-crb-test.c | 2 +- tests/qtest/tpm-util.c | 2 +- 4 files changed, 6 insertions(+), 4 deletions(-) diff --git a/include/hw/acpi/tpm.h b/include/hw/acpi/tpm.h index 579c45f5ba..f60bfe2789 100644 --- a/include/hw/acpi/tpm.h +++ b/include/hw/acpi/tpm.h @@ -174,7 +174,8 @@ REG32(CRB_CTRL_CMD_SIZE, 0x58) REG32(CRB_CTRL_CMD_LADDR, 0x5C) REG32(CRB_CTRL_CMD_HADDR, 0x60) REG32(CRB_CTRL_RSP_SIZE, 0x64) -REG32(CRB_CTRL_RSP_ADDR, 0x68) +REG32(CRB_CTRL_RSP_LADDR, 0x68) +REG32(CRB_CTRL_RSP_HADDR, 0x6C) REG32(CRB_DATA_BUFFER, 0x80) #define TPM_CRB_ADDR_BASE 0xFED4 diff --git a/hw/tpm/tpm_crb_common.c b/hw/tpm/tpm_crb_common.c index 4c173affb6..228e2d0faf 100644 --- a/hw/tpm/tpm_crb_common.c +++ b/hw/tpm/tpm_crb_common.c @@ -199,7 +199,8 @@ void tpm_crb_reset(TPMCRBState *s, uint64_t baseaddr) s->regs[R_CRB_CTRL_CMD_LADDR] = (uint32_t)baseaddr; s->regs[R_CRB_CTRL_CMD_HADDR] = (uint32_t)(baseaddr >> 32); s->regs[R_CRB_CTRL_RSP_SIZE] = CRB_CTRL_CMD_SIZE; -s->regs[R_CRB_CTRL_RSP_ADDR] = (uint32_t)baseaddr; +s->regs[R_CRB_CTRL_RSP_LADDR] = (uint32_t)baseaddr; +s->regs[R_CRB_CTRL_RSP_HADDR] = (uint32_t)(baseaddr >> 32); s->be_buffer_size = MIN(tpm_backend_get_buffer_size(s->tpmbe), CRB_CTRL_CMD_SIZE); diff --git a/tests/qtest/tpm-crb-test.c b/tests/qtest/tpm-crb-test.c index 396ae3f91c..9d30fe8293 100644 --- a/tests/qtest/tpm-crb-test.c +++ b/tests/qtest/tpm-crb-test.c @@ -28,7 +28,7 @@ static void 
tpm_crb_test(const void *data) uint32_t csize = readl(TPM_CRB_ADDR_BASE + A_CRB_CTRL_CMD_SIZE); uint64_t caddr = readq(TPM_CRB_ADDR_BASE + A_CRB_CTRL_CMD_LADDR); uint32_t rsize = readl(TPM_CRB_ADDR_BASE + A_CRB_CTRL_RSP_SIZE); -uint64_t raddr = readq(TPM_CRB_ADDR_BASE + A_CRB_CTRL_RSP_ADDR); +uint64_t raddr = readq(TPM_CRB_ADDR_BASE + A_CRB_CTRL_RSP_LADDR); uint8_t locstate = readb(TPM_CRB_ADDR_BASE + A_CRB_LOC_STATE); uint32_t locctrl = readl(TPM_CRB_ADDR_BASE + A_CRB_LOC_CTRL); uint32_t locsts = readl(TPM_CRB_ADDR_BASE + A_CRB_LOC_STS); diff --git a/tests/qtest/tpm-util.c b/tests/qtest/tpm-util.c index 1c0319e6e7..dd02057fc0 100644 --- a/tests/qtest/tpm-util.c +++ b/tests/qtest/tpm-util.c @@ -25,7 +25,7 @@ void tpm_util_crb_transfer(QTestState *s, unsigned char *rsp, size_t rsp_size) { uint64_t caddr = qtest_readq(s, TPM_CRB_ADDR_BASE + A_CRB_CTRL_CMD_LADDR); -uint64_t raddr = qtest_readq(s, TPM_CRB_ADDR_BASE + A_CRB_CTRL_RSP_ADDR); +uint64_t raddr = qtest_readq(s, TPM_CRB_ADDR_BASE + A_CRB_CTRL_RSP_LADDR); qtest_writeb(s, TPM_CRB_ADDR_BASE + A_CRB_LOC_CTRL, 1);
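As a rough illustration of the register split in this patch, the sketch below stores a 64-bit address in two adjacent 32-bit registers and recombines them. The offsets 0x68 and 0x6C come from the patch; the helper names `set_rsp_addr()` and `get_rsp_addr()` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Register byte offsets from the patch; array index = offset / 4. */
#define CRB_CTRL_RSP_LADDR 0x68
#define CRB_CTRL_RSP_HADDR 0x6C

/* Store a 64-bit base address as two adjacent 32-bit registers. */
void set_rsp_addr(uint32_t *regs, uint64_t baseaddr)
{
    regs[CRB_CTRL_RSP_LADDR / 4] = (uint32_t)baseaddr;          /* low half */
    regs[CRB_CTRL_RSP_HADDR / 4] = (uint32_t)(baseaddr >> 32);  /* high half */
}

/* Recombine the halves, as a little-endian 64-bit read at LADDR would. */
uint64_t get_rsp_addr(const uint32_t *regs)
{
    return ((uint64_t)regs[CRB_CTRL_RSP_HADDR / 4] << 32) |
           regs[CRB_CTRL_RSP_LADDR / 4];
}
```

This is why the tests can keep a single 64-bit read and only change the offset name to the LADDR register: the high half sits directly behind the low half.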
Re: [PATCH 04/11] tpm_crb: use a single read-as-mem/write-as-mmio mapping
On Thu, 13 Jul 2023 at 16:28, Stefan Berger wrote: > > > > On 7/13/23 10:50, Peter Maydell wrote: > > On Thu, 13 Jul 2023 at 15:18, Stefan Berger wrote: > >> > >> > >> > >> On 7/12/23 23:51, Joelle van Dyne wrote: > >>> On Apple Silicon, when Windows performs a LDP on the CRB MMIO space, > >>> the exception is not decoded by hardware and we cannot trap the MMIO > >>> read. This led to the idea from @agraf to use the same mapping type as > >>> ROM devices: namely that reads should be seen as memory type and > >>> writes should trap as MMIO. > > >>> +++ b/hw/tpm/tpm_crb.c > >>> @@ -68,7 +68,6 @@ static const VMStateDescription vmstate_tpm_crb_none = { > >>>.name = "tpm-crb", > >>>.pre_save = tpm_crb_none_pre_save, > >>>.fields = (VMStateField[]) { > >>> -VMSTATE_UINT32_ARRAY(state.regs, CRBState, TPM_CRB_R_MAX), > >> > >> This has to stay here otherwise we cannot restart VMs from saved state > >> once QEMU is upgraded. > >> > >> 2023-07-13T14:15:43.997718Z qemu-system-x86_64: Unknown ramblock > >> "tpm-crb-cmd", cannot accept migration > >> 2023-07-13T14:15:43.997813Z qemu-system-x86_64: error while loading state > >> for instance 0x0 of device 'ram' > >> 2023-07-13T14:15:43.997841Z qemu-system-x86_64: load of migration failed: > >> Invalid argument > > > > More generally, for migration compatibility in the other > > direction you need to use memory_region_init_rom_device_nomigrate() > > and make sure you keep migrating the data via this, not > > via the MemoryRegion. > > > > I'm not a super-fan of hacking around the fact that LDP > > to hardware registers isn't supported in specific device > > models, though... > > What does this mean for this effort here? Usually we say "fix the guest to not try to access hardware registers with silly load/store instruction types". The other option would be "put in a large amount of effort to support emulating those instructions in QEMU userspace when KVM/HVF/etc trap and punt them to us". 
For the last decade or so we have taken the first of these approaches :-) thanks -- PMM
Re: [PATCH 07/11] hw/arm/virt: add plug handler for TPM on SysBus
On Thu, 13 Jul 2023 at 04:52, Joelle van Dyne wrote: > > TPM needs to know its own base address in order to generate its DSDT > device entry. > > Signed-off-by: Joelle van Dyne > --- > hw/arm/virt.c | 37 + > 1 file changed, 37 insertions(+) > > diff --git a/hw/arm/virt.c b/hw/arm/virt.c > index 7d9dbc2663..432148ef47 100644 > --- a/hw/arm/virt.c > +++ b/hw/arm/virt.c > @@ -2732,6 +2732,37 @@ static void virt_memory_plug(HotplugHandler > *hotplug_dev, > dev, _abort); > } > > +#ifdef CONFIG_TPM > +static void virt_tpm_plug(VirtMachineState *vms, TPMIf *tpmif) > +{ > +PlatformBusDevice *pbus = PLATFORM_BUS_DEVICE(vms->platform_bus_dev); > +hwaddr pbus_base = vms->memmap[VIRT_PLATFORM_BUS].base; > +SysBusDevice *sbdev = SYS_BUS_DEVICE(tpmif); > +MemoryRegion *sbdev_mr; > +hwaddr tpm_base; > +uint64_t tpm_size; > + > +if (!sbdev || !object_dynamic_cast(OBJECT(sbdev), TYPE_SYS_BUS_DEVICE)) { > +return; > +} > + > +tpm_base = platform_bus_get_mmio_addr(pbus, sbdev, 0); > +assert(tpm_base != -1); > + > +tpm_base += pbus_base; > + > +sbdev_mr = sysbus_mmio_get_region(sbdev, 0); > +tpm_size = memory_region_size(sbdev_mr); > + > +if (object_property_find(OBJECT(sbdev), "baseaddr")) { > +object_property_set_uint(OBJECT(sbdev), "baseaddr", tpm_base, NULL); > +} > +if (object_property_find(OBJECT(sbdev), "size")) { > +object_property_set_uint(OBJECT(sbdev), "size", tpm_size, NULL); > +} > +} > +#endif I do not like the "platform bus" at all -- it is a nasty hack. If the virt board needs a memory mapped TPM device it should probably just create one, the same way we create our other memory mapped devices. But... How are TPM devices typically set up/visible to the guest on real Arm server hardware ? Should this be a sysbus device at all? thanks -- PMM
Re: [PATCH 04/11] tpm_crb: use a single read-as-mem/write-as-mmio mapping
On 7/13/23 10:50, Peter Maydell wrote: On Thu, 13 Jul 2023 at 15:18, Stefan Berger wrote: On 7/12/23 23:51, Joelle van Dyne wrote: On Apple Silicon, when Windows performs a LDP on the CRB MMIO space, the exception is not decoded by hardware and we cannot trap the MMIO read. This led to the idea from @agraf to use the same mapping type as ROM devices: namely that reads should be seen as memory type and writes should trap as MMIO. +++ b/hw/tpm/tpm_crb.c @@ -68,7 +68,6 @@ static const VMStateDescription vmstate_tpm_crb_none = { .name = "tpm-crb", .pre_save = tpm_crb_none_pre_save, .fields = (VMStateField[]) { -VMSTATE_UINT32_ARRAY(state.regs, CRBState, TPM_CRB_R_MAX), This has to stay here otherwise we cannot restart VMs from saved state once QEMU is upgraded. 2023-07-13T14:15:43.997718Z qemu-system-x86_64: Unknown ramblock "tpm-crb-cmd", cannot accept migration 2023-07-13T14:15:43.997813Z qemu-system-x86_64: error while loading state for instance 0x0 of device 'ram' 2023-07-13T14:15:43.997841Z qemu-system-x86_64: load of migration failed: Invalid argument More generally, for migration compatibility in the other direction you need to use memory_region_init_rom_device_nomigrate() and make sure you keep migrating the data via this, not via the MemoryRegion. I'm not a super-fan of hacking around the fact that LDP to hardware registers isn't supported in specific device models, though... What does this mean for this effort here? Stefan thanks -- PMM
Re: [PATCH 04/11] tpm_crb: use a single read-as-mem/write-as-mmio mapping
On Thu, 13 Jul 2023 at 15:18, Stefan Berger wrote: > > > > On 7/12/23 23:51, Joelle van Dyne wrote: > > On Apple Silicon, when Windows performs a LDP on the CRB MMIO space, > > the exception is not decoded by hardware and we cannot trap the MMIO > > read. This led to the idea from @agraf to use the same mapping type as > > ROM devices: namely that reads should be seen as memory type and > > writes should trap as MMIO. These are hardware registers, right? Windows shouldn't really be doing LDP to those if it expects to be able to run in a VM... > > Once that was done, the second memory mapping of the command buffer > > region was redundent and was removed. > > > > A note about the removal of the read trap for `CRB_LOC_STATE`: > > The only usage was to return the most up-to-date value for > > `tpmEstablished`. However, `tpmEstablished` is only set when a > > TPM2_HashStart operation is called which only exists for locality 4. > > Indeed, the comment for the write handler of `CRB_LOC_CTRL` makes the > > same argument for why it is not calling the backend to reset the > > `tpmEstablished` bit. As this bit is unused, we do not need to worry > > about updating it for reads. 
> > > > Signed-off-by: Joelle van Dyne > > --- > > hw/tpm/tpm_crb.h| 2 - > > hw/tpm/tpm_crb.c| 3 - > > hw/tpm/tpm_crb_common.c | 124 > > 3 files changed, 63 insertions(+), 66 deletions(-) > > > > diff --git a/hw/tpm/tpm_crb.h b/hw/tpm/tpm_crb.h > > index da3a0cf256..7cdd37335f 100644 > > --- a/hw/tpm/tpm_crb.h > > +++ b/hw/tpm/tpm_crb.h > > @@ -26,9 +26,7 @@ > > typedef struct TPMCRBState { > > TPMBackend *tpmbe; > > TPMBackendCmd cmd; > > -uint32_t regs[TPM_CRB_R_MAX]; > > MemoryRegion mmio; > > -MemoryRegion cmdmem; > > > > size_t be_buffer_size; > > > > diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c > > index 598c3e0161..07c6868d8d 100644 > > --- a/hw/tpm/tpm_crb.c > > +++ b/hw/tpm/tpm_crb.c > > @@ -68,7 +68,6 @@ static const VMStateDescription vmstate_tpm_crb_none = { > > .name = "tpm-crb", > > .pre_save = tpm_crb_none_pre_save, > > .fields = (VMStateField[]) { > > -VMSTATE_UINT32_ARRAY(state.regs, CRBState, TPM_CRB_R_MAX), > > This has to stay here otherwise we cannot restart VMs from saved state once > QEMU is upgraded. > > 2023-07-13T14:15:43.997718Z qemu-system-x86_64: Unknown ramblock > "tpm-crb-cmd", cannot accept migration > 2023-07-13T14:15:43.997813Z qemu-system-x86_64: error while loading state for > instance 0x0 of device 'ram' > 2023-07-13T14:15:43.997841Z qemu-system-x86_64: load of migration failed: > Invalid argument More generally, for migration compatibility in the other direction you need to use memory_region_init_rom_device_nomigrate() and make sure you keep migrating the data via this, not via the MemoryRegion. I'm not a super-fan of hacking around the fact that LDP to hardware registers isn't supported in specific device models, though... thanks -- PMM
Re: [PATCH 04/11] tpm_crb: use a single read-as-mem/write-as-mmio mapping
On 7/12/23 23:51, Joelle van Dyne wrote: On Apple Silicon, when Windows performs a LDP on the CRB MMIO space, the exception is not decoded by hardware and we cannot trap the MMIO read. This led to the idea from @agraf to use the same mapping type as ROM devices: namely that reads should be seen as memory type and writes should trap as MMIO. Once that was done, the second memory mapping of the command buffer region was redundent and was removed. A note about the removal of the read trap for `CRB_LOC_STATE`: The only usage was to return the most up-to-date value for `tpmEstablished`. However, `tpmEstablished` is only set when a TPM2_HashStart operation is called which only exists for locality 4. Indeed, the comment for the write handler of `CRB_LOC_CTRL` makes the same argument for why it is not calling the backend to reset the `tpmEstablished` bit. As this bit is unused, we do not need to worry about updating it for reads. Signed-off-by: Joelle van Dyne --- hw/tpm/tpm_crb.h| 2 - hw/tpm/tpm_crb.c| 3 - hw/tpm/tpm_crb_common.c | 124 3 files changed, 63 insertions(+), 66 deletions(-) diff --git a/hw/tpm/tpm_crb.h b/hw/tpm/tpm_crb.h index da3a0cf256..7cdd37335f 100644 --- a/hw/tpm/tpm_crb.h +++ b/hw/tpm/tpm_crb.h @@ -26,9 +26,7 @@ typedef struct TPMCRBState { TPMBackend *tpmbe; TPMBackendCmd cmd; -uint32_t regs[TPM_CRB_R_MAX]; MemoryRegion mmio; -MemoryRegion cmdmem; size_t be_buffer_size; diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c index 598c3e0161..07c6868d8d 100644 --- a/hw/tpm/tpm_crb.c +++ b/hw/tpm/tpm_crb.c @@ -68,7 +68,6 @@ static const VMStateDescription vmstate_tpm_crb_none = { .name = "tpm-crb", .pre_save = tpm_crb_none_pre_save, .fields = (VMStateField[]) { -VMSTATE_UINT32_ARRAY(state.regs, CRBState, TPM_CRB_R_MAX), This has to stay here otherwise we cannot restart VMs from saved state once QEMU is upgraded. 
2023-07-13T14:15:43.997718Z qemu-system-x86_64: Unknown ramblock "tpm-crb-cmd", cannot accept migration 2023-07-13T14:15:43.997813Z qemu-system-x86_64: error while loading state for instance 0x0 of device 'ram' 2023-07-13T14:15:43.997841Z qemu-system-x86_64: load of migration failed: Invalid argument Stefan
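To make the "reads as memory, writes trap as MMIO" idea concrete, here is a small standalone model. It is only a sketch: `RomdRegion`, `romd_read32()` and `romd_write32()` are hypothetical names and do not correspond to QEMU's memory API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define REGION_SIZE 16

/* Hypothetical model of a "ROM device" style region. */
typedef struct RomdRegion {
    uint8_t backing[REGION_SIZE];   /* what guest loads see directly */
    unsigned write_traps;           /* stores that reached the device model */
} RomdRegion;

/* Loads never call into the device model: they are plain memory reads. */
uint32_t romd_read32(const RomdRegion *r, unsigned off)
{
    uint32_t v;

    assert(off + sizeof(v) <= REGION_SIZE);
    memcpy(&v, &r->backing[off], sizeof(v));
    return v;
}

/* Stores trap: the device model sees the write and updates the backing. */
void romd_write32(RomdRegion *r, unsigned off, uint32_t val)
{
    assert(off + sizeof(val) <= REGION_SIZE);
    r->write_traps++;
    memcpy(&r->backing[off], &val, sizeof(val));
}
```

In such a design the register contents live in the backing memory instead of a separate `regs[]` array, which is where the migration question in this thread comes from: the old stream carries the registers as a vmstate field, so dropping that field changes what a saved VM can be loaded into.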
Re: [PATCH] hw/pci: Warn when ARI/SR-IOV device has non-zero Function number
On 2023/07/12 21:06, Michael S. Tsirkin wrote: On Wed, Jul 12, 2023 at 08:50:32PM +0900, Akihiko Odaki wrote: On 2023/07/12 20:46, Michael S. Tsirkin wrote: On Wed, Jul 12, 2023 at 08:27:32PM +0900, Akihiko Odaki wrote: Current SR/IOV implementations assume that hardcoded Function numbers are always available and will not conflict. It is somewhat non-trivial to make the Function numbers to use controllable to avoid Function number conflicts so ensure there is only one PF to make the assumption hold true. Also warn when non-SR/IOV multifunction was attempted with ARI enabled; ARI has the next Function number field register, and currently it's hardcoded to 0, which prevents non-SR/IOV multifunction. It is certainly possible to add a logic to determine the correct next Function number according to the configuration, but it's not worth since all ARI-capable devices are also SR/IOV devices, which do not support multiple PFs as stated above. Signed-off-by: Akihiko Odaki I am not really interested in adding this stuff. The real thing to focus on is fixing ARI emulation, not warning users about ways in which it's broken. What do you think about multiple SR/IOV PFs? Do you think it's worth/easy enough to fix SR/IOV code to support it? Otherwise it's not worth fixing ARI since currently only SR/IOV devices implement it. There's nothing especially hard about it. You can in particular just assume the user knows what he's doing and not worry too much about checking. Creating invalid configs might also come handy e.g. for debug. The important thing, and that's missing ATM, is giving management ability to find out TotalVFs, VF offset and VF stride, so it can avoid creating these conflicts. For igd maybe we should make VF offset and VF stride just 1 unconditionally - I have no idea why it was made 2 ATM - could you check what does real hardware do? The current igb implementation matches real hardware. It is defined in the datasheet*, section 9.6.4.6.
I don't know why it's 2 either. * https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/82576eg-gbe-datasheet.pdf Yes, warning at least is handy for management debugging. It shouldn't be hard I think, but the logic does tend to be O(n^2). Maybe add a flag to check, and management developers can use that for debugging. --- hw/pci/pci.c | 59 +--- 1 file changed, 42 insertions(+), 17 deletions(-) diff --git a/hw/pci/pci.c b/hw/pci/pci.c index 784c02a182..50359a0f3a 100644 --- a/hw/pci/pci.c +++ b/hw/pci/pci.c @@ -2124,23 +2124,48 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp) } } -/* - * A PCIe Downstream Port that do not have ARI Forwarding enabled must - * associate only Device 0 with the device attached to the bus - * representing the Link from the Port (PCIe base spec rev 4.0 ver 0.3, - * sec 7.3.1). - * With ARI, PCI_SLOT() can return non-zero value as the traditional - * 5-bit Device Number and 3-bit Function Number fields in its associated - * Routing IDs, Requester IDs and Completer IDs are interpreted as a - * single 8-bit Function Number. Hence, ignore ARI capable devices. - */ -if (pci_is_express(pci_dev) && -!pcie_find_capability(pci_dev, PCI_EXT_CAP_ID_ARI) && -pcie_has_upstream_port(pci_dev) && -PCI_SLOT(pci_dev->devfn)) { -warn_report("PCI: slot %d is not valid for %s," -" parent device only allows plugging into slot 0.", -PCI_SLOT(pci_dev->devfn), pci_dev->name); +if (pci_is_express(pci_dev)) { +/* + * A PCIe Downstream Port that do not have ARI Forwarding enabled must + * associate only Device 0 with the device attached to the bus + * representing the Link from the Port (PCIe base spec rev 4.0 ver 0.3, + * sec 7.3.1). + * With ARI, PCI_SLOT() can return non-zero value as the traditional + * 5-bit Device Number and 3-bit Function Number fields in its + * associated Routing IDs, Requester IDs and Completer IDs are + * interpreted as a single 8-bit Function Number. Hence, ignore ARI + * capable devices. 
+ */ +if (!pcie_find_capability(pci_dev, PCI_EXT_CAP_ID_ARI) && +pcie_has_upstream_port(pci_dev) && +PCI_SLOT(pci_dev->devfn)) { +warn_report("PCI: slot %d is not valid for %s," +" parent device only allows plugging into slot 0.", +PCI_SLOT(pci_dev->devfn), pci_dev->name); +} + +/* + * Current SR/IOV implementations assume that hardcoded Function numbers + * are always available. Ensure there is only one PF to make the + * assumption hold true. + */ +if (pcie_find_capability(pci_dev, PCI_EXT_CAP_ID_SRIOV) && +PCI_FUNC(pci_dev->devfn)) { +
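The conflict check that management would need, given TotalVFs, VF offset and VF stride, might look roughly like the sketch below. It assumes the usual SR-IOV rule that the n-th VF (counting from 0 here) sits at the PF's function number plus the VF offset plus n times the VF stride. `PF`, `vf_devfn()` and `pf_conflict()` are hypothetical names, not QEMU functions, and the check is the naive quadratic one mentioned in the discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one Physical Function and its VF layout. */
typedef struct PF {
    uint16_t devfn;       /* function number of the PF itself */
    uint16_t vf_offset;   /* First VF Offset */
    uint16_t vf_stride;   /* VF Stride */
    uint16_t total_vfs;   /* TotalVFs */
} PF;

/* Function number of the i-th VF of 'pf', counting from 0. */
uint16_t vf_devfn(const PF *pf, unsigned i)
{
    return (uint16_t)(pf->devfn + pf->vf_offset + i * pf->vf_stride);
}

/* Naive O(n^2) check: does any function of 'a' collide with one of 'b'? */
bool pf_conflict(const PF *a, const PF *b)
{
    if (a->devfn == b->devfn) {
        return true;
    }
    for (unsigned i = 0; i < a->total_vfs; i++) {
        uint16_t va = vf_devfn(a, i);

        if (va == b->devfn) {
            return true;
        }
        for (unsigned j = 0; j < b->total_vfs; j++) {
            if (va == vf_devfn(b, j)) {
                return true;
            }
        }
    }
    for (unsigned j = 0; j < b->total_vfs; j++) {
        if (vf_devfn(b, j) == a->devfn) {
            return true;
        }
    }
    return false;
}
```

With purely illustrative values (offset 128, stride 2, 8 VFs), PFs at functions 0 and 1 do not collide because their VFs interleave, while PFs at functions 0 and 2 do, since both claim function 130.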
Re: [PATCH 01/11] tpm_crb: refactor common code
On 7/12/23 23:51, Joelle van Dyne wrote: In preparation for the SysBus variant, we move common code styled after the TPM TIS devices. To maintain compatibility, we do not rename the existing tpm-crb device. Signed-off-by: Joelle van Dyne --- docs/specs/tpm.rst | 1 + hw/tpm/tpm_crb.h| 76 +++ hw/tpm/tpm_crb.c| 270 ++-- hw/tpm/tpm_crb_common.c | 218 hw/tpm/meson.build | 1 + hw/tpm/trace-events | 2 +- 6 files changed, 333 insertions(+), 235 deletions(-) create mode 100644 hw/tpm/tpm_crb.h create mode 100644 hw/tpm/tpm_crb_common.c diff --git a/docs/specs/tpm.rst b/docs/specs/tpm.rst index efe124a148..2bc29c9804 100644 --- a/docs/specs/tpm.rst +++ b/docs/specs/tpm.rst @@ -45,6 +45,7 @@ operating system. QEMU files related to TPM CRB interface: - ``hw/tpm/tpm_crb.c`` + - ``hw/tpm/tpm_crb_common.c`` If you could add the command line to use for Windows on AARCH64 to this document in 11/11 that would be helpful because what is there right now only works for Linux iirc. Regarding this patch here: Reviewed-by: Stefan Berger SPAPR interface --- diff --git a/hw/tpm/tpm_crb.h b/hw/tpm/tpm_crb.h new file mode 100644 index 00..da3a0cf256 --- /dev/null +++ b/hw/tpm/tpm_crb.h @@ -0,0 +1,76 @@ +/* + * tpm_crb.h - QEMU's TPM CRB interface emulator + * + * Copyright (c) 2018 Red Hat, Inc. + * + * Authors: + * Marc-André Lureau + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory.
+ * + * tpm_crb is a device for TPM 2.0 Command Response Buffer (CRB) Interface + * as defined in TCG PC Client Platform TPM Profile (PTP) Specification + * Family “2.0” Level 00 Revision 01.03 v22 + */ +#ifndef TPM_TPM_CRB_H +#define TPM_TPM_CRB_H + +#include "exec/memory.h" +#include "hw/acpi/tpm.h" +#include "sysemu/tpm_backend.h" +#include "tpm_ppi.h" + +#define CRB_CTRL_CMD_SIZE (TPM_CRB_ADDR_SIZE - A_CRB_DATA_BUFFER) + +typedef struct TPMCRBState { +TPMBackend *tpmbe; +TPMBackendCmd cmd; +uint32_t regs[TPM_CRB_R_MAX]; +MemoryRegion mmio; +MemoryRegion cmdmem; + +size_t be_buffer_size; + +bool ppi_enabled; +TPMPPI ppi; +} TPMCRBState; + +#define CRB_INTF_TYPE_CRB_ACTIVE 0b1 +#define CRB_INTF_VERSION_CRB 0b1 +#define CRB_INTF_CAP_LOCALITY_0_ONLY 0b0 +#define CRB_INTF_CAP_IDLE_FAST 0b0 +#define CRB_INTF_CAP_XFER_SIZE_64 0b11 +#define CRB_INTF_CAP_FIFO_NOT_SUPPORTED 0b0 +#define CRB_INTF_CAP_CRB_SUPPORTED 0b1 +#define CRB_INTF_IF_SELECTOR_CRB 0b1 + +enum crb_loc_ctrl { +CRB_LOC_CTRL_REQUEST_ACCESS = BIT(0), +CRB_LOC_CTRL_RELINQUISH = BIT(1), +CRB_LOC_CTRL_SEIZE = BIT(2), +CRB_LOC_CTRL_RESET_ESTABLISHMENT_BIT = BIT(3), +}; + +enum crb_ctrl_req { +CRB_CTRL_REQ_CMD_READY = BIT(0), +CRB_CTRL_REQ_GO_IDLE = BIT(1), +}; + +enum crb_start { +CRB_START_INVOKE = BIT(0), +}; + +enum crb_cancel { +CRB_CANCEL_INVOKE = BIT(0), +}; + +#define TPM_CRB_NO_LOCALITY 0xff + +void tpm_crb_request_completed(TPMCRBState *s, int ret); +enum TPMVersion tpm_crb_get_version(TPMCRBState *s); +int tpm_crb_pre_save(TPMCRBState *s); +void tpm_crb_reset(TPMCRBState *s, uint64_t baseaddr); +void tpm_crb_init_memory(Object *obj, TPMCRBState *s, Error **errp); + +#endif /* TPM_TPM_CRB_H */ diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c index ea930da545..3ef4977fb5 100644 --- a/hw/tpm/tpm_crb.c +++ b/hw/tpm/tpm_crb.c @@ -31,257 +31,62 @@ #include "tpm_ppi.h" #include "trace.h" #include "qom/object.h" +#include "tpm_crb.h" struct CRBState { DeviceState parent_obj; -TPMBackend *tpmbe; 
-TPMBackendCmd cmd; -uint32_t regs[TPM_CRB_R_MAX]; -MemoryRegion mmio; -MemoryRegion cmdmem; - -size_t be_buffer_size; - -bool ppi_enabled; -TPMPPI ppi; +TPMCRBState state; }; typedef struct CRBState CRBState; DECLARE_INSTANCE_CHECKER(CRBState, CRB, TYPE_TPM_CRB) -#define CRB_INTF_TYPE_CRB_ACTIVE 0b1 -#define CRB_INTF_VERSION_CRB 0b1 -#define CRB_INTF_CAP_LOCALITY_0_ONLY 0b0 -#define CRB_INTF_CAP_IDLE_FAST 0b0 -#define CRB_INTF_CAP_XFER_SIZE_64 0b11 -#define CRB_INTF_CAP_FIFO_NOT_SUPPORTED 0b0 -#define CRB_INTF_CAP_CRB_SUPPORTED 0b1 -#define CRB_INTF_IF_SELECTOR_CRB 0b1 - -#define CRB_CTRL_CMD_SIZE (TPM_CRB_ADDR_SIZE - A_CRB_DATA_BUFFER) - -enum crb_loc_ctrl { -CRB_LOC_CTRL_REQUEST_ACCESS = BIT(0), -CRB_LOC_CTRL_RELINQUISH = BIT(1), -CRB_LOC_CTRL_SEIZE = BIT(2), -CRB_LOC_CTRL_RESET_ESTABLISHMENT_BIT = BIT(3), -}; - -enum crb_ctrl_req { -CRB_CTRL_REQ_CMD_READY = BIT(0), -CRB_CTRL_REQ_GO_IDLE = BIT(1), -}; - -enum crb_start { -CRB_START_INVOKE = BIT(0), -}; - -enum crb_cancel { -CRB_CANCEL_INVOKE = BIT(0), -}; - -#define TPM_CRB_NO_LOCALITY 0xff - -static uint64_t tpm_crb_mmio_read(void *opaque, hwaddr addr, - unsigned size) -{ -
QEMU Summit Minutes 2023
QEMU Summit Minutes 2023
========================

As usual, we held a QEMU Summit meeting at KVM Forum. This is an
invite-only meeting for the most active maintainers and submaintainers
in the project, and we discuss various project-wide issues, usually
process stuff. We then post the minutes of the meeting to the list as
a jumping off point for wider discussion and for those who weren't
able to attend.

Attendees
=========

"Peter Maydell"
"Alex Bennée"
"Kevin Wolf"
"Thomas Huth"
"Markus Armbruster"
"Mark Cave-Ayland"
"Philippe Mathieu-Daudé"
"Daniel P. Berrangé"
"Richard Henderson"
"Michael S. Tsirkin"
"Stefan Hajnoczi"
"Alex Graf"
"Gerd Hoffmann"
"Paolo Bonzini"
"Michael Roth"

Topic 1: Dealing with tree wide changes
=======================================

Mark Cave-Ayland raised concerns that tree wide changes often get
stuck because maintainers are conservative about merging code that
touches other subsystems and doesn't have review. He mentioned a
couple of cases of PC refactoring which had been held up and
languished on the list due to lack of review time. It can be hard to
get everything in the change reviewed, and then hard to get the change
merged, especially if it still has parts that weren't reviewed by
anybody.

Alex Bennée mentioned that maintainers can always give an Acked-by and
then rely on someone else doing the review. But even getting
Acked-by's can take time and we still have a problem with absent
maintainers.

In a brief diversion Markus mused that having more automated checking
for things like QAPI changes would help reduce the maintainer load for
the more mechanical changes.

It was pointed out we should be more accepting of merging changes
without explicit maintainer approval where the changes are surface
level system wide API changes rather than touching the guts of any
particular subsystem. This avoids the sometimes onerous task of
splitting mechanical tree-wide changes along subsystem boundaries.
A delay of one-week + send a ping + one-week was suggested as
sufficient time for maintainers to reply if they care specifically
about the series.

Alex Graf suggested that we should hold different standards of review
requirements depending on the importance of the sub-system. We should
not hold up code because a minor underused subsystem didn't get
signoff. We already informally do this but we don't make it very clear
so it can be hard to tell what is and isn't OK to let through without
review.

Topic 2: Are we happy with the email workflow?
==============================================

This was a topic to see if there was any consensus among maintainers
about the long-term acceptability of sticking with email for patch
submission and review -- in five years' time, if we're still doing it
the same way, how would we feel about it?

One area where we did get consensus was that now that we're doing CI
on gitlab we can change pull requests from maintainers from via-email
to gitlab merge requests. This would hopefully mean that instead of
the release-manager having to tell gitlab to do a merge and then
reporting back the results of any CI failures, the maintainer could
directly see the CI results and deal with fixing up failures and
resubmitting without involving the release manager. (This may have the
disbenefit that there isn't a single person any more who looks at all
the CI results and gets a sense of whether particular test cases have
pre-existing intermittent failures.)

There was less agreement on the main problem of reviewing code.

On the positive side:
 - everyone acknowledged that the email workflow was a barrier to new
   contributors
 - email is not awkward just for newcomers -- many regular developers
   have to deal with corporate mail systems, firewalls, etc, that make
   the email workflow more awkward than it was when Linux (and
   subsequently QEMU) first adopted it decades ago
 - a web UI means that unreviewed submissions are easier to track,
   rather than being simply ignored on the mailing list

But on the negative side:
 - gitlab is not set up for a "submaintainer tree" kind of workflow,
   so patches would go directly into the main tree and get no
   per-subsystem testing beyond whatever our CI can cover
 - gitlab doesn't handle adding Reviewed-by: and similar tags
 - email provides an automatic archive of the historical code review
   conversation; gitlab doesn't do this as well
 - it would increase the degree to which we might have a lock-in
   problem with gitlab (we could switch away, but it gets more
   painful)
 - it has the potential to be a bigger barrier to new contributors
   getting started with reviewing, compared to "just send an email"
 - it would probably burn the project's CI minutes more quickly as we
   would do runs per-submission, not just per-pullreq
 - might increase the awkwardness of the "some contributors/bug
   reporters/people interested in a patch are only notifiable by
   gitlab handle, and some only by email, and you
Re: [PATCH 2/2] migration: Make it clear that qemu_file_set_error() needs a negative value
Peter Maydell writes: > On Thu, 6 Jul 2023 at 20:52, Fabiano Rosas wrote: >> >> The convention in qemu-file.c is to return a negative value on >> error. >> >> The only place that could use qemu_file_set_error() to store a >> positive value to f->last_error was vmstate_save() which has been >> fixed in the previous patch. >> >> bdrv_inactivate_all() already returns a negative value on error. >> >> Document that qemu_file_set_error() needs -errno and alter the callers >> to check ret < 0. >> >> Signed-off-by: Fabiano Rosas >> --- >> migration/qemu-file.c | 2 ++ >> migration/savevm.c| 6 +++--- >> 2 files changed, 5 insertions(+), 3 deletions(-) >> >> diff --git a/migration/qemu-file.c b/migration/qemu-file.c >> index acc282654a..8276bac248 100644 >> --- a/migration/qemu-file.c >> +++ b/migration/qemu-file.c >> @@ -222,6 +222,8 @@ int qemu_file_get_error(QEMUFile *f) >> >> /* >> * Set the last error for stream f >> + * >> + * The error ('ret') should be in -errno format. >> */ >> void qemu_file_set_error(QEMUFile *f, int ret) >> { >> diff --git a/migration/savevm.c b/migration/savevm.c >> index 95c2abf47c..f3c303ab74 100644 >> --- a/migration/savevm.c >> +++ b/migration/savevm.c >> @@ -1249,7 +1249,7 @@ void qemu_savevm_state_setup(QEMUFile *f) >> QTAILQ_FOREACH(se, _state.handlers, entry) { >> if (se->vmsd && se->vmsd->early_setup) { >> ret = vmstate_save(f, se, ms->vmdesc); >> -if (ret) { >> +if (ret < 0) { >> qemu_file_set_error(f, ret); > > You say qemu_file_set_error() should take an errno, > but vmstate_save() doesn't return one. It will directly > return whatever the VMStateInfo put, pre_save, etc hooks > return, which isn't necessarily an errno. (Specifically, > patch 1 in this series makes a .put hook return -1, > rather than an errno. I'm guessing other implementations > might too, though it's a bit hard to find them. A > coccinelle script could probably locate them.) 
> All implementations return either 0, -1 or some errno; that one instance from patch 1 returns 1. But you're right, those -1 are not really errno, they are just "some negative value". Since qemu-file.c puts the error through the error.c functions and those call strerror(), all values that will go into qemu_file_set_error() should be proper errnos. I should probably audit users of qemu_file_set_error() instead and stop using it for errors that have nothing to do with the actual migration stream/QEMUFile. Currently it seems to have morphed into a mechanism to record generic migration errors.
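The problem with hooks that return a bare -1 can be sketched as follows. Everything here is hypothetical (`FakeFile`, `fake_file_set_error()`, `hook_result_to_errno()`), and mapping a non-errno failure to -EIO is just one possible choice for the illustration, not what the migration code does.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for QEMUFile; only the error slot is modelled. */
typedef struct FakeFile {
    int last_error;
} FakeFile;

/* Record an error; 'ret' is expected in -errno format (negative). */
void fake_file_set_error(FakeFile *f, int ret)
{
    assert(ret < 0);
    f->last_error = ret;
}

/* The stored value is later turned into text via strerror(). */
const char *fake_file_error_string(const FakeFile *f)
{
    return strerror(-f->last_error);
}

/*
 * Map a hook result onto -errno format. A bare -1 (or a positive
 * failure value) carries no errno meaning, so it is reported as -EIO
 * here. Caveat: -1 is numerically the same as -EPERM on Linux, so a
 * genuine -EPERM cannot be told apart from "some negative value".
 */
int hook_result_to_errno(int ret)
{
    if (ret == 0) {
        return 0;
    }
    if (ret == -1 || ret > 0) {
        return -EIO;
    }
    return ret;
}
```

Without such a mapping, a hook's -1 would be printed as whatever errno 1 happens to mean on the host, which is the kind of misleading message the -errno requirement is meant to avoid.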
Re: [PATCH 3/3] hw/arm/virt: Support host CPU type only when KVM or HVF is configured
Hi Connie, On 7/13/23 22:46, Cornelia Huck wrote: On Thu, Jul 13 2023, Gavin Shan wrote: The CPU type 'host-arm-cpu' class won't be registered until KVM or HVF is configured in target/arm/cpu64.c. Support the corresponding CPU type only when KVM or HVF is configured. Signed-off-by: Gavin Shan --- hw/arm/virt.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/hw/arm/virt.c b/hw/arm/virt.c index 43d7772ffd..ad28634445 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -217,7 +217,9 @@ static const char *valid_cpu_types[] = { #endif ARM_CPU_TYPE_NAME("cortex-a53"), ARM_CPU_TYPE_NAME("cortex-a57"), +#if defined(CONFIG_KVM) || defined(CONFIG_HVF) ARM_CPU_TYPE_NAME("host"), +#endif ARM_CPU_TYPE_NAME("max"), NULL }; Doesn't the check in parse_cpu_option() already catch the case where the "host" cpu model isn't registered? I might be getting lost in the code flow, though. Right, it's guaranteed that the needed CPU type (class) is registered by parse_cpu_option(). However, we have a different story here. The CPU type validation intends to limit the CPU types (classes) to the range supported by the specific machine (board). Taking "cortex-a8-arm-cpu" as an example, it's not expected by the hw/arm/virt machines even though it has been registered when we have CONFIG_TCG=y. The list of supported CPU types (classes) will be dumped by hw/core/machine.c::validate_cpu_type() in PATCH[1], and "host" is obviously invalid when we have CONFIG_KVM=n and CONFIG_HVF=n. We can't tell the user that "host" is supported, since that would only confuse them. Thanks, Gavin
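The distinction between "registered" and "valid for this board" can be illustrated with a standalone sketch. It mirrors the shape of the `valid_cpu_types[]` list from the patch but uses plain strings; `cpu_type_is_valid()` is a hypothetical helper, not the actual validate_cpu_type() code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Build with -DCONFIG_KVM or -DCONFIG_HVF to model an accelerated build. */

/* Per-board list of accepted CPU types, NULL-terminated. */
static const char *valid_cpu_types[] = {
    "cortex-a53",
    "cortex-a57",
#if defined(CONFIG_KVM) || defined(CONFIG_HVF)
    "host",             /* only listed when the class can exist at all */
#endif
    "max",
    NULL
};

/* True if 'name' is one of the CPU types this board accepts. */
bool cpu_type_is_valid(const char *name)
{
    for (size_t i = 0; valid_cpu_types[i] != NULL; i++) {
        if (strcmp(valid_cpu_types[i], name) == 0) {
            return true;
        }
    }
    return false;
}
```

A type such as "cortex-a8" may well be registered in a TCG build, yet the board-level check still rejects it; and because the same list is what gets printed as the set of supported types, leaving "host" in it for a build without KVM or HVF would advertise something the user cannot select.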
Re: [PATCH 07/11] hw/arm/virt: add plug handler for TPM on SysBus
On 7/12/23 23:51, Joelle van Dyne wrote:
> TPM needs to know its own base address in order to generate its DSDT
> device entry.

This and the loongarch patch seem to have largely identical virt_tpm_plug functions. Could they be consolidated in hw/tpm/virt.c ?

   Stefan

> Signed-off-by: Joelle van Dyne
> ---
> hw/arm/virt.c | 37 +
> 1 file changed, 37 insertions(+)
>
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 7d9dbc2663..432148ef47 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -2732,6 +2732,37 @@ static void virt_memory_plug(HotplugHandler *hotplug_dev,
>                           dev, &error_abort);
>  }
>
> +#ifdef CONFIG_TPM
> +static void virt_tpm_plug(VirtMachineState *vms, TPMIf *tpmif)
> +{
> +    PlatformBusDevice *pbus = PLATFORM_BUS_DEVICE(vms->platform_bus_dev);
> +    hwaddr pbus_base = vms->memmap[VIRT_PLATFORM_BUS].base;
> +    SysBusDevice *sbdev = SYS_BUS_DEVICE(tpmif);
> +    MemoryRegion *sbdev_mr;
> +    hwaddr tpm_base;
> +    uint64_t tpm_size;
> +
> +    if (!sbdev || !object_dynamic_cast(OBJECT(sbdev), TYPE_SYS_BUS_DEVICE)) {
> +        return;
> +    }
> +
> +    tpm_base = platform_bus_get_mmio_addr(pbus, sbdev, 0);
> +    assert(tpm_base != -1);
> +
> +    tpm_base += pbus_base;
> +
> +    sbdev_mr = sysbus_mmio_get_region(sbdev, 0);
> +    tpm_size = memory_region_size(sbdev_mr);
> +
> +    if (object_property_find(OBJECT(sbdev), "baseaddr")) {
> +        object_property_set_uint(OBJECT(sbdev), "baseaddr", tpm_base, NULL);
> +    }
> +    if (object_property_find(OBJECT(sbdev), "size")) {
> +        object_property_set_uint(OBJECT(sbdev), "size", tpm_size, NULL);
> +    }
> +}
> +#endif
> +
>  static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
>                                              DeviceState *dev, Error **errp)
>  {
> @@ -2803,6 +2834,12 @@ static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
>          vms->virtio_iommu_bdf = pci_get_bdf(pdev);
>          create_virtio_iommu_dt_bindings(vms);
>      }
> +
> +#ifdef CONFIG_TPM
> +    if (object_dynamic_cast(OBJECT(dev), TYPE_TPM_IF)) {
> +        virt_tpm_plug(vms, TPM_IF(dev));
> +    }
> +#endif
>  }
>
>  static void virt_dimm_unplug_request(HotplugHandler *hotplug_dev,
Re: [PATCH] linux-user: Remove pointless NULL check in clock_adjtime handling
I'll take this via target-arm.next unless there are any objections...

thanks
-- PMM

On Tue, 4 Jul 2023 at 14:26, Peter Maydell wrote:
>
> Laurent, ping? This patch has been reviewed.
>
> thanks
> -- PMM
>
> On Fri, 23 Jun 2023 at 15:44, Peter Maydell wrote:
> >
> > In the code for TARGET_NR_clock_adjtime, we set the pointer phtx to
> > the address of the local variable htx. This means it can never be
> > NULL, but later in the code we check it for NULL anyway. Coverity
> > complains about this (CID 1507683) because the NULL check comes after
> > a call to clock_adjtime() that assumes it is non-NULL.
> >
> > Since phtx is always &htx, and is used only in three places, it's not
> > really necessary. Remove it, bringing the code structure in to line
> > with that for TARGET_NR_clock_adjtime64, which already uses a simple
> > '&htx' when it wants a pointer to 'htx'.
> >
> > Signed-off-by: Peter Maydell
> > ---
> > linux-user/syscall.c | 12 +---
> > 1 file changed, 5 insertions(+), 7 deletions(-)
> >
> > diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> > index f2cb101d83c..7b2f9f7340e 100644
> > --- a/linux-user/syscall.c
> > +++ b/linux-user/syscall.c
> > @@ -10935,16 +10935,14 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
> >  #if defined(TARGET_NR_clock_adjtime) && defined(CONFIG_CLOCK_ADJTIME)
> >      case TARGET_NR_clock_adjtime:
> >          {
> > -            struct timex htx, *phtx = &htx;
> > +            struct timex htx;
> >
> > -            if (target_to_host_timex(phtx, arg2) != 0) {
> > +            if (target_to_host_timex(&htx, arg2) != 0) {
> >                  return -TARGET_EFAULT;
> >              }
> > -            ret = get_errno(clock_adjtime(arg1, phtx));
> > -            if (!is_error(ret) && phtx) {
> > -                if (host_to_target_timex(arg2, phtx) != 0) {
> > -                    return -TARGET_EFAULT;
> > -                }
> > +            ret = get_errno(clock_adjtime(arg1, &htx));
> > +            if (!is_error(ret) && host_to_target_timex(arg2, &htx)) {
> > +                return -TARGET_EFAULT;
> >              }
> >          }
> >          return ret;
> > --
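The reasoning in the commit message can be shown in a standalone sketch. It is not the linux-user code: struct sketch_timex and the sketch_* functions are hypothetical stand-ins for struct timex, target_to_host_timex(), clock_adjtime() and host_to_target_timex(), reduced to one field so that the two shapes can be compared directly.

```c
/* Hypothetical stand-in for struct timex; only one field is modelled. */
struct sketch_timex {
    long offset;
};

/* Hypothetical stand-ins for the three calls; each returns 0 on success. */
static int sketch_copy_in(struct sketch_timex *host, const long *guest)
{
    host->offset = *guest;
    return 0;
}

static int sketch_adjust(struct sketch_timex *host)
{
    host->offset += 1;   /* dereferences 'host', so it must be non-NULL */
    return 0;
}

static int sketch_copy_out(long *guest, const struct sketch_timex *host)
{
    *guest = host->offset;
    return 0;
}

/* Old shape: 'phtx' always equals &htx, so the later NULL test is dead
 * code that comes after the pointer has already been dereferenced. */
static int sketch_before(long *guest)
{
    struct sketch_timex htx, *phtx = &htx;
    int ret;

    if (sketch_copy_in(phtx, guest) != 0) {
        return -1;
    }
    ret = sketch_adjust(phtx);
    if (ret == 0 && phtx) {
        if (sketch_copy_out(guest, phtx) != 0) {
            return -1;
        }
    }
    return ret;
}

/* New shape: take the address directly wherever a pointer is needed. */
static int sketch_after(long *guest)
{
    struct sketch_timex htx;
    int ret;

    if (sketch_copy_in(&htx, guest) != 0) {
        return -1;
    }
    ret = sketch_adjust(&htx);
    if (ret == 0 && sketch_copy_out(guest, &htx)) {
        return -1;
    }
    return ret;
}
```

Both functions behave identically for every input, which is the point of the patch: the extra pointer and its NULL check add nothing except a false hint that the pointer might be NULL.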
Re: [PATCH 00/11] tpm: introduce TPM CRB SysBus device
On 7/12/23 23:51, Joelle van Dyne wrote:
> The impetus for this patch set is to get TPM 2.0 working on Windows 11 ARM64.
> Windows' tpm.sys does not seem to work on a TPM TIS device (as verified with
> VMWare's implementation). However, the current TPM CRB device uses a fixed
> system bus address that is reserved for RAM in ARM64 Virt machines.

Thanks a lot for this work. The last sentence seems to hint at the current issue with TPM CRB on ARM64 and seems to be the only way forward there. You may want to reformulate it a bit because it's not clear how the 'however' related to CRB relates to TIS.

> In the process of adding the TPM CRB SysBus device, we also went ahead and
> cleaned up some of the existing TPM hardware code and fixed some bugs. We used

Please reorder bugs to the beginning of the series or submit in an extra patch set so we can backport them. Ideal would be description(s) for how to trigger the bug(s).

> the TPM TIS devices as a template for the TPM CRB devices and refactored out
> common code. We moved the ACPI DSDT generation to the device in order to handle
> dynamic base address requirements as well as reduce redundent code in different

s/redundent/redundant

> machine ACPI generation. We also changed the tpm_crb device to use the ISA bus
> instead of depending on the default system bus as the device only was built for
> the PC configuration.
>
> Another change is that the TPM CRB registers are now mapped in the same way that
> the pflash ROM devices are mapped. It is a memory region whose writes are
> trapped as MMIO accesses. This was needed because Apple Silicon does not decode
> LDP caused page faults. @agraf suggested that we do this to avoid having to

Afaik, LDP is an ARM assembly instruction that loads two 32bit or 64bit registers from consecutive addresses. May be worth mentioning for those wondering about it...

> do AARCH64 decoding in the HVF fault handler.

What is HVF?

Regards,
   Stefan

> Unfortunately, it seems like the LDP fault still happens on HVF but the issue
> seems to be in the HVF backend which needs to be fixed in a separate patch.
>
> One last thing that's needed to get Windows 11 to recognize the TPM 2.0 device
> is for the OVMF firmware to setup the TPM device. Currently, OVMF for ARM64 Virt
> only recognizes the TPM TIS device through a FDT entry. A workaround is to
> falsely identify the TPM CRB device as a TPM TIS device in the FDT node but this
> causes issues for Linux. A proper fix would involve adding an ACPI device driver
> in OVMF.
>
> Joelle van Dyne (11):
>   tpm_crb: refactor common code
>   tpm_crb: CTRL_RSP_ADDR is 64-bits wide
>   tpm_ppi: refactor memory space initialization
>   tpm_crb: use a single read-as-mem/write-as-mmio mapping
>   tpm_crb: use the ISA bus
>   tpm_crb: move ACPI table building to device interface
>   hw/arm/virt: add plug handler for TPM on SysBus
>   hw/loongarch/virt: add plug handler for TPM on SysBus
>   tpm_tis_sysbus: fix crash when PPI is enabled
>   tpm_tis_sysbus: move DSDT AML generation to device
>   tpm_crb_sysbus: introduce TPM CRB SysBus device
>
>  docs/specs/tpm.rst          |   2 +
>  hw/tpm/tpm_crb.h            |  74 +
>  hw/tpm/tpm_ppi.h            |  10 +-
>  include/hw/acpi/aml-build.h |   1 +
>  include/hw/acpi/tpm.h       |   3 +-
>  include/sysemu/tpm.h        |   3 +
>  hw/acpi/aml-build.c         |   7 +-
>  hw/arm/virt-acpi-build.c    |  38 +
>  hw/arm/virt.c               |  38 +
>  hw/core/sysbus-fdt.c        |   1 +
>  hw/i386/acpi-build.c        |  23 ---
>  hw/loongarch/acpi-build.c   |  38 +
>  hw/loongarch/virt.c         |  38 +
>  hw/riscv/virt.c             |   1 +
>  hw/tpm/tpm_crb.c            | 307
>  hw/tpm/tpm_crb_common.c     | 224 ++
>  hw/tpm/tpm_crb_sysbus.c     | 178 +
>  hw/tpm/tpm_ppi.c            |   5 +-
>  hw/tpm/tpm_tis_isa.c        |   5 +-
>  hw/tpm/tpm_tis_sysbus.c     |  43 +
>  tests/qtest/tpm-crb-test.c  |   2 +-
>  tests/qtest/tpm-util.c      |   2 +-
>  hw/arm/Kconfig              |   1 +
>  hw/riscv/Kconfig            |   1 +
>  hw/tpm/Kconfig              |   7 +-
>  hw/tpm/meson.build          |   3 +
>  hw/tpm/trace-events         |   2 +-
>  27 files changed, 703 insertions(+), 354 deletions(-)
>  create mode 100644 hw/tpm/tpm_crb.h
>  create mode 100644 hw/tpm/tpm_crb_common.c
>  create mode 100644 hw/tpm/tpm_crb_sysbus.c