On Tue, May 19, 2026 at 12:48 PM Philippe Mathieu-Daudé
<[email protected]> wrote:
>
> On 19/5/26 18:22, James Hilliard wrote:
> > ZCB zeros the 128-byte cache block containing the base address. ZCBT has
> > the same user-mode-visible memory effect for QEMU purposes.
> >
> > Model both forms with a single decodetree wildcard entry, align the
> > address down to a 128-byte line, and store eight zero 128-bit chunks to
> > guest memory.
> >
> > Acked-by: Richard Henderson <[email protected]>
> > Signed-off-by: James Hilliard <[email protected]>
> > ---
> > Changes v8 -> v9:
> >    - Use MO_ATOM_NONE for the 128-bit zero stores so TCG does not
> >      require unavailable 128-bit atomic stores on hosts that lack them.
> >
> > Changes v7 -> v8:
> >    - Fold the ZCBT wildcard decode into the ZCB patch so the series does not
> >      add a ZCB-only decode and rewrite it in the next patch.
> >
> > Changes v6 -> v7:
> >    - Use 128-bit zero stores with MO_128 instead of sixteen 64-bit stores.
> >      (suggested by Philippe Mathieu-Daudé)
> >    - Fold ZCB and ZCBT into a single decodetree wildcard entry instead of
> >      using a duplicate entry with a selector comment.  (suggested by
> >      Philippe Mathieu-Daudé)
> >
> > Changes v2 -> v3:
> >    - Split ZCB/ZCBT out of the combined Octeon arithmetic and memory
> >      instruction patch.  (requested by Richard Henderson)
> > ---
> >   target/mips/tcg/octeon.decode      |  3 +++
> >   target/mips/tcg/octeon_translate.c | 27 +++++++++++++++++++++++++++
> >   2 files changed, 30 insertions(+)
> >
> > diff --git a/target/mips/tcg/octeon.decode b/target/mips/tcg/octeon.decode
> > index d77717cd50..01ed3b50be 100644
> > --- a/target/mips/tcg/octeon.decode
> > +++ b/target/mips/tcg/octeon.decode
> > @@ -49,6 +49,9 @@ SNEI         011100 rs:5 rt:5 imm:s10 101111 &cmpi
> >   SAA          011100 ..... ..... 00000 00000 011000 @saa
> >   SAAD         011100 ..... ..... 00000 00000 011001 @saa
> >
> > +&zcb         base
> > +ZCB          011100 base:5 00000 00000 1110- 011111 &zcb
> > +
> >   &lx          base index rd
> >   @lx          ...... base:5 index:5 rd:5 ...... ..... &lx
> >   LWX          011111 ..... ..... ..... 00000 001010 @lx
> > diff --git a/target/mips/tcg/octeon_translate.c b/target/mips/tcg/octeon_translate.c
> > index d3dfef2e0c..721a9a8d9d 100644
> > --- a/target/mips/tcg/octeon_translate.c
> > +++ b/target/mips/tcg/octeon_translate.c
> > @@ -176,6 +176,33 @@ static bool trans_saa(DisasContext *ctx, arg_saa *a, MemOp mop)
> >
> >   TRANS(SAA,  trans_saa, MO_32);
> >   TRANS(SAAD, trans_saa, MO_64);
> > +
> > +static bool trans_ZCB(DisasContext *ctx, arg_ZCB *a)
> > +{
> > +    TCGv_i64 addr = tcg_temp_new_i64();
> > +    TCGv_i64 line = tcg_temp_new_i64();
> > +    TCGv_i64 zero64 = tcg_constant_i64(0);
> > +    TCGv_i128 zero128 = tcg_temp_new_i128();
>
>         const MemOp mop = mo_endian(ctx) | MO_128 | MO_ATOM_NONE;
>
> The endianness of $zero is irrelevant :) but I prefer to keep
> it explicit for coding style.
>
> I can squash this when applying if you agree, or keep your patch as is.

Whichever you prefer is fine with me.

>
> Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
>
> > +    gen_base_offset_addr(ctx, addr, a->base, 0);
> > +    tcg_gen_concat_i64_i128(zero128, zero64, zero64);
> > +
> > +    /*
> > +     * QEMU models ZCB/ZCBT as zeroing the containing 128-byte cache line
> > +     * in guest memory.
> > +     */
> > +    tcg_gen_andi_i64(line, addr, ~0x7fULL);
> > +
> > +    for (int i = 0; i < 8; i++) {
> > +        TCGv_i64 slot = tcg_temp_new_i64();
> > +
> > +        tcg_gen_addi_i64(slot, line, i * 16);
> > +        tcg_gen_qemu_st_i128(zero128, slot, ctx->mem_idx,
> > +                             mo_endian(ctx) | MO_128 | MO_ATOM_NONE);
> > +    }
> > +
> > +    return true;
> > +}
> >   TRANS(LBX,  trans_lx, MO_SB);
> >   TRANS(LBUX, trans_lx, MO_UB);
> >   TRANS(LHX,  trans_lx, MO_SW);
> >
>
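
For readers following the thread, the address arithmetic in the patch can be sketched in plain C outside of TCG. This is only an illustrative model under assumptions: `zcb_model`, the flat `mem` buffer, and the two size constants are hypothetical names, not QEMU code, and `memset` stands in for the eight non-atomic 128-bit zero stores.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ZCB_LINE_SIZE  128u  /* cache line size named in the commit message */
#define ZCB_CHUNK_SIZE 16u   /* one 128-bit store */

/*
 * Hypothetical model: zero the 128-byte line that contains addr inside a
 * flat byte buffer, and return the aligned line address.
 * The caller must ensure the whole line lies within mem.
 */
static uint64_t zcb_model(uint8_t *mem, uint64_t addr)
{
    assert(mem != NULL);

    /* Align down to the 128-byte line, as tcg_gen_andi_i64() does above. */
    uint64_t line = addr & ~0x7fULL;

    /* Eight 16-byte chunks cover the full line (8 * 16 == 128). */
    for (int i = 0; i < 8; i++) {
        memset(mem + line + (uint64_t)i * ZCB_CHUNK_SIZE, 0, ZCB_CHUNK_SIZE);
    }
    return line;
}
```

With a buffer filled with 0xff, calling `zcb_model(mem, 128 + 77)` clears bytes 128..255 and leaves the neighbouring lines untouched, which is the "align down, then store eight 16-byte chunks" behaviour the commit message describes.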
