This patch adds CRC support for the RISC-V architecture. It adds a CRC internal function with a matching crc optab, and ten built-ins (__builtin_crc8_data8 through __builtin_crc64_data64) for computing CRCs efficiently.
If the target supports the Zbc extension, the CRC is generated with the clmul instruction; otherwise a table-based CRC is generated, using a 256-element table of precomputed CRCs. The clmul path is taken only when the CRC is narrower than the word size, because the quotient may need one more bit than the CRC itself; in the remaining cases the table-based code is used even with Zbc. Both approaches are faster than the naive bit-by-bit CRC computation.

gcc/ChangeLog:

	* builtin-types.def (BT_FN_UINT8_UINT8_UINT8_CONST_SIZE): Define.
	(BT_FN_UINT16_UINT16_UINT8_CONST_SIZE): Likewise.
	(BT_FN_UINT16_UINT16_UINT16_CONST_SIZE): Likewise.
	(BT_FN_UINT32_UINT32_UINT8_CONST_SIZE): Likewise.
	(BT_FN_UINT32_UINT32_UINT16_CONST_SIZE): Likewise.
	(BT_FN_UINT32_UINT32_UINT32_CONST_SIZE): Likewise.
	(BT_FN_UINT64_UINT64_UINT8_CONST_SIZE): Likewise.
	(BT_FN_UINT64_UINT64_UINT16_CONST_SIZE): Likewise.
	(BT_FN_UINT64_UINT64_UINT32_CONST_SIZE): Likewise.
	(BT_FN_UINT64_UINT64_UINT64_CONST_SIZE): Likewise.
	* builtins.cc (associated_internal_fn): Handle BUILT_IN_CRC8_DATA8,
	BUILT_IN_CRC16_DATA8, BUILT_IN_CRC16_DATA16,
	BUILT_IN_CRC32_DATA8, BUILT_IN_CRC32_DATA16, BUILT_IN_CRC32_DATA32,
	BUILT_IN_CRC64_DATA8, BUILT_IN_CRC64_DATA16, BUILT_IN_CRC64_DATA32,
	BUILT_IN_CRC64_DATA64.
	* builtins.def (BUILT_IN_CRC8_DATA8): New builtin.
	(BUILT_IN_CRC16_DATA8): Likewise.
	(BUILT_IN_CRC16_DATA16): Likewise.
	(BUILT_IN_CRC32_DATA8): Likewise.
	(BUILT_IN_CRC32_DATA16): Likewise.
	(BUILT_IN_CRC32_DATA32): Likewise.
	(BUILT_IN_CRC64_DATA8): Likewise.
	(BUILT_IN_CRC64_DATA16): Likewise.
	(BUILT_IN_CRC64_DATA32): Likewise.
	(BUILT_IN_CRC64_DATA64): Likewise.
	* config/riscv/bitmanip.md (crc<ANYI2:mode><ANYI:mode>4): New expander.
	* config/riscv/riscv-protos.h (expand_crc_table_based): Declare.
	(expand_crc_using_clmul): Likewise.
	* config/riscv/riscv.cc (gf2n_poly_long_div_quotient): New function.
	(generate_crc): Likewise.
	(generate_crc_table): Likewise.
	(expand_crc_table_based): Likewise.
	(expand_crc_using_clmul): Likewise.
	* config/riscv/riscv.md (UNSPEC_CRC): New unspec for CRC.
	* internal-fn.cc (crc_direct): Define.
	(expand_crc_optab_fn): New function.
	(direct_crc_optab_supported_p): Define.
	* internal-fn.def (CRC): New internal optab function.
	* optabs.def (crc_optab): New optab.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/crc-builtin-table-target32.c: New test.
	* gcc.target/riscv/crc-builtin-table-target64.c: New test.
	* gcc.target/riscv/crc-builtin-zbc32.c: New test.
	* gcc.target/riscv/crc-builtin-zbc64.c: New test.
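For readers who want to see the table-based fallback in plain C, here is a minimal sketch of the scheme described above, fixed to a 16-bit CRC over 16 bits of data. It is an illustration under stated assumptions, not the code the expander emits: all function names are hypothetical, the CRC is assumed to be MSB-first with no bit reflection and no final XOR, and the bit-by-bit routine is only a reference to compare against.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CRC of a single byte, MSB first, no reflection and no final XOR.
   Hypothetical helper; it plays the role that generate_crc has in the
   patch, fixed here to a 16-bit CRC.  */
static uint16_t
crc16_of_byte (uint8_t byte, uint16_t poly)
{
  uint16_t crc = (uint16_t) (byte << 8);
  for (int i = 0; i < 8; i++)
    crc = (crc & 0x8000) ? (uint16_t) ((crc << 1) ^ poly)
			 : (uint16_t) (crc << 1);
  return crc;
}

/* Fill TABLE with the CRC of every possible 8-bit value.  */
static void
make_crc16_table (uint16_t table[256], uint16_t poly)
{
  assert (table != NULL);
  for (unsigned i = 0; i < 256; i++)
    table[i] = crc16_of_byte ((uint8_t) i, poly);
}

/* Table-driven CRC16 over 16 bits of data, upper byte first:
   crc = (crc << 8) ^ table[(crc >> 8) ^ byte].  */
static uint16_t
crc16_data16_table (uint16_t crc, uint16_t data, const uint16_t table[256])
{
  for (int i = 0; i < 2; i++)
    {
      uint8_t byte = (uint8_t) (data >> (16 - (i + 1) * 8));
      crc = (uint16_t) ((crc << 8) ^ table[(uint8_t) (crc >> 8) ^ byte]);
    }
  return crc;
}

/* Naive bit-by-bit reference, used only for comparison.  */
static uint16_t
crc16_data16_bitwise (uint16_t crc, uint16_t data, uint16_t poly)
{
  crc ^= data;
  for (int i = 0; i < 16; i++)
    crc = (crc & 0x8000) ? (uint16_t) ((crc << 1) ^ poly)
			 : (uint16_t) (crc << 1);
  return crc;
}

/* Return 1 if the table-driven and bit-by-bit versions agree for every
   16-bit data value and a few initial CRC values, 0 otherwise.  */
static int
crc16_table_matches_bitwise (uint16_t poly)
{
  static const uint16_t inits[] = { 0x0000, 0x1234, 0xffff };
  uint16_t table[256];
  make_crc16_table (table, poly);
  for (size_t k = 0; k < sizeof inits / sizeof inits[0]; k++)
    for (uint32_t data = 0; data <= 0xffff; data++)
      if (crc16_data16_table (inits[k], (uint16_t) data, table)
	  != crc16_data16_bitwise (inits[k], (uint16_t) data, poly))
	return 0;
  return 1;
}
```

Calling crc16_table_matches_bitwise (0x1021) from a small main is one way to check the sketch. The table costs 256 entries per (CRC size, polynomial) pair, which is why sharing one table among all users of the same polynomial matters.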
From 9d2e9023c222501a1d9519bea3d5cdbd32b5a91e Mon Sep 17 00:00:00 2001
From: Mariam Arutunian <mariamarutunian@gmail.com>
Date: Thu, 3 Aug 2023 15:59:57 +0400
Subject: [PATCH] RISC-V: Added support for CRC.

If the target supports the Zbc extension, the clmul instruction is used for
the CRC code generation; otherwise, table-based CRC is generated.
A table with 256 elements is used to store precomputed CRCs.

gcc/ChangeLog:

	* builtin-types.def (BT_FN_UINT8_UINT8_UINT8_CONST_SIZE): Define.
	(BT_FN_UINT16_UINT16_UINT8_CONST_SIZE): Likewise.
	(BT_FN_UINT16_UINT16_UINT16_CONST_SIZE): Likewise.
	(BT_FN_UINT32_UINT32_UINT8_CONST_SIZE): Likewise.
	(BT_FN_UINT32_UINT32_UINT16_CONST_SIZE): Likewise.
	(BT_FN_UINT32_UINT32_UINT32_CONST_SIZE): Likewise.
	(BT_FN_UINT64_UINT64_UINT8_CONST_SIZE): Likewise.
	(BT_FN_UINT64_UINT64_UINT16_CONST_SIZE): Likewise.
	(BT_FN_UINT64_UINT64_UINT32_CONST_SIZE): Likewise.
	(BT_FN_UINT64_UINT64_UINT64_CONST_SIZE): Likewise.
	* builtins.cc (associated_internal_fn): Handle BUILT_IN_CRC8_DATA8,
	BUILT_IN_CRC16_DATA8, BUILT_IN_CRC16_DATA16,
	BUILT_IN_CRC32_DATA8, BUILT_IN_CRC32_DATA16, BUILT_IN_CRC32_DATA32,
	BUILT_IN_CRC64_DATA8, BUILT_IN_CRC64_DATA16, BUILT_IN_CRC64_DATA32,
	BUILT_IN_CRC64_DATA64.
	* builtins.def (BUILT_IN_CRC8_DATA8): New builtin.
	(BUILT_IN_CRC16_DATA8): Likewise.
	(BUILT_IN_CRC16_DATA16): Likewise.
	(BUILT_IN_CRC32_DATA8): Likewise.
	(BUILT_IN_CRC32_DATA16): Likewise.
	(BUILT_IN_CRC32_DATA32): Likewise.
	(BUILT_IN_CRC64_DATA8): Likewise.
	(BUILT_IN_CRC64_DATA16): Likewise.
	(BUILT_IN_CRC64_DATA32): Likewise.
	(BUILT_IN_CRC64_DATA64): Likewise.
	* config/riscv/bitmanip.md (crc<ANYI2:mode><ANYI:mode>4): New expander.
	* config/riscv/riscv-protos.h (expand_crc_table_based): Declare.
	(expand_crc_using_clmul): Likewise.
	* config/riscv/riscv.cc (gf2n_poly_long_div_quotient): New function.
	(generate_crc): Likewise.
	(generate_crc_table): Likewise.
	(expand_crc_table_based): Likewise.
	(expand_crc_using_clmul): Likewise.
	* config/riscv/riscv.md (UNSPEC_CRC): New unspec for CRC.
* internal-fn.cc (crc_direct): Define. (expand_crc_optab_fn): New function. (direct_crc_optab_supported_p): Define. * internal-fn.def (CRC): New internal optab function. * optabs.def (crc_optab): New optab. gcc/testsuite/ChangeLog: * gcc.target/riscv/crc-builtin-table-target32.c: New test. * gcc.target/riscv/crc-builtin-table-target64.c: New test. * gcc.target/riscv/crc-builtin-zbc32.c: New test. * gcc.target/riscv/crc-builtin-zbc64.c: New test. --- gcc/ChangeLog | 42 +++ gcc/builtin-types.def | 20 ++ gcc/builtins.cc | 11 + gcc/builtins.def | 10 + gcc/config/riscv/bitmanip.md | 35 +++ gcc/config/riscv/riscv-protos.h | 2 + gcc/config/riscv/riscv.cc | 280 ++++++++++++++++++ gcc/config/riscv/riscv.md | 3 + gcc/internal-fn.cc | 33 +++ gcc/internal-fn.def | 1 + gcc/optabs.def | 1 + gcc/testsuite/ChangeLog | 7 + .../riscv/crc-builtin-table-target32.c | 36 +++ .../riscv/crc-builtin-table-target64.c | 60 ++++ .../gcc.target/riscv/crc-builtin-zbc32.c | 20 ++ .../gcc.target/riscv/crc-builtin-zbc64.c | 35 +++ 16 files changed, 596 insertions(+) create mode 100644 gcc/testsuite/gcc.target/riscv/crc-builtin-table-target32.c create mode 100644 gcc/testsuite/gcc.target/riscv/crc-builtin-table-target64.c create mode 100644 gcc/testsuite/gcc.target/riscv/crc-builtin-zbc32.c create mode 100644 gcc/testsuite/gcc.target/riscv/crc-builtin-zbc64.c diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 2eab466a9f8..748d8be384b 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,45 @@ +2023-08-03 Mariam Arutunian <mariamarutunian@gmail.com> + + * builtin-types.def (BT_FN_UINT8_UINT8_UINT8_CONST_SIZE): Define. + (BT_FN_UINT16_UINT16_UINT8_CONST_SIZE): Likewise. + (BT_FN_UINT16_UINT16_UINT16_CONST_SIZE): Likewise. + (BT_FN_UINT32_UINT32_UINT8_CONST_SIZE): Likewise. + (BT_FN_UINT32_UINT32_UINT16_CONST_SIZE): Likewise. + (BT_FN_UINT32_UINT32_UINT32_CONST_SIZE): Likewise. + (BT_FN_UINT64_UINT64_UINT8_CONST_SIZE): Likewise. + (BT_FN_UINT64_UINT64_UINT16_CONST_SIZE): Likewise. 
+ (BT_FN_UINT64_UINT64_UINT32_CONST_SIZE): Likewise. + (BT_FN_UINT64_UINT64_UINT32_CONST_SIZE): Likewise. + * builtins.cc (associated_internal_fn): Handle BUILT_IN_CRC8_DATA8, + BUILT_IN_CRC16_DATA8, BUILT_IN_CRC16_DATA16, + BUILT_IN_CRC32_DATA8, BUILT_IN_CRC32_DATA16, BUILT_IN_CRC32_DATA32, + BUILT_IN_CRC64_DATA8, BUILT_IN_CRC64_DATA16, BUILT_IN_CRC64_DATA32, + BUILT_IN_CRC64_DATA64. + * builtins.def (BUILT_IN_CRC8_DATA8): New builtin. + (BUILT_IN_CRC16_DATA8): Likewise. + (BUILT_IN_CRC16_DATA16): Likewise. + (BUILT_IN_CRC32_DATA8): Likewise. + (BUILT_IN_CRC32_DATA16): Likewise. + (BUILT_IN_CRC32_DATA32): Likewise. + (BUILT_IN_CRC64_DATA8): Likewise. + (BUILT_IN_CRC64_DATA16): Likewise. + (BUILT_IN_CRC64_DATA32): Likewise. + (BUILT_IN_CRC64_DATA64): Likewise. + * config/riscv/bitmanip.md (crc<ANYI2:mode><ANYI:mode>4): New expander. + * config/riscv/riscv-protos.h (expand_crc_table_based): Declare. + (expand_crc_using_clmul): Likewise. + * config/riscv/riscv.cc (gf2n_poly_long_div_quotient): New function. + (generate_crc): Likewise. + (generate_crc_table): Likewise. + (expand_crc_table_based): Likewise. + (expand_crc_using_clmul): Likewise. + * config/riscv/riscv.md (UNSPEC_CRC): New unspec for CRC. + * internal-fn.cc (crc_direct): Define. + (expand_crc_optab_fn): New function. + (direct_crc_optab_supported_p): Define. + * internal-fn.def (CRC): New internal optab function. + * optabs.def (crc_optab): New optab. 
+ 2023-07-22 Vineet Gupta <vineetg@rivosinc.com> PR target/110748 diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def index 43381bc8949..e33837c27d0 100644 --- a/gcc/builtin-types.def +++ b/gcc/builtin-types.def @@ -829,6 +829,26 @@ DEF_FUNCTION_TYPE_3 (BT_FN_PTR_SIZE_SIZE_PTRMODE, BT_PTR, BT_SIZE, BT_SIZE, BT_PTRMODE) DEF_FUNCTION_TYPE_3 (BT_FN_VOID_PTR_UINT8_PTRMODE, BT_VOID, BT_PTR, BT_UINT8, BT_PTRMODE) +DEF_FUNCTION_TYPE_3 (BT_FN_UINT8_UINT8_UINT8_CONST_SIZE, BT_UINT8, BT_UINT8, + BT_UINT8, BT_CONST_SIZE) +DEF_FUNCTION_TYPE_3 (BT_FN_UINT16_UINT16_UINT8_CONST_SIZE, BT_UINT16, BT_UINT16, + BT_UINT8, BT_CONST_SIZE) +DEF_FUNCTION_TYPE_3 (BT_FN_UINT16_UINT16_UINT16_CONST_SIZE, BT_UINT16, + BT_UINT16, BT_UINT16, BT_CONST_SIZE) +DEF_FUNCTION_TYPE_3 (BT_FN_UINT32_UINT32_UINT8_CONST_SIZE, BT_UINT32, BT_UINT32, + BT_UINT8, BT_CONST_SIZE) +DEF_FUNCTION_TYPE_3 (BT_FN_UINT32_UINT32_UINT16_CONST_SIZE, BT_UINT32, + BT_UINT32, BT_UINT16, BT_CONST_SIZE) +DEF_FUNCTION_TYPE_3 (BT_FN_UINT32_UINT32_UINT32_CONST_SIZE, BT_UINT32, + BT_UINT32, BT_UINT32, BT_CONST_SIZE) +DEF_FUNCTION_TYPE_3 (BT_FN_UINT64_UINT64_UINT8_CONST_SIZE, BT_UINT64, BT_UINT64, + BT_UINT8, BT_CONST_SIZE) +DEF_FUNCTION_TYPE_3 (BT_FN_UINT64_UINT64_UINT16_CONST_SIZE, BT_UINT64, + BT_UINT64, BT_UINT16, BT_CONST_SIZE) +DEF_FUNCTION_TYPE_3 (BT_FN_UINT64_UINT64_UINT32_CONST_SIZE, BT_UINT64, + BT_UINT64, BT_UINT32, BT_CONST_SIZE) +DEF_FUNCTION_TYPE_3 (BT_FN_UINT64_UINT64_UINT64_CONST_SIZE, BT_UINT64, + BT_UINT64, BT_UINT64, BT_CONST_SIZE) DEF_FUNCTION_TYPE_4 (BT_FN_SIZE_CONST_PTR_SIZE_SIZE_FILEPTR, BT_SIZE, BT_CONST_PTR, BT_SIZE, BT_SIZE, BT_FILEPTR) diff --git a/gcc/builtins.cc b/gcc/builtins.cc index e2e99d6a995..94f88290435 100644 --- a/gcc/builtins.cc +++ b/gcc/builtins.cc @@ -2198,6 +2198,17 @@ associated_internal_fn (built_in_function fn, tree return_type) if (REAL_MODE_FORMAT (TYPE_MODE (return_type))->b == 2) return IFN_LDEXP; return IFN_LAST; + case BUILT_IN_CRC8_DATA8: + case BUILT_IN_CRC16_DATA8: + 
case BUILT_IN_CRC16_DATA16: + case BUILT_IN_CRC32_DATA8: + case BUILT_IN_CRC32_DATA16: + case BUILT_IN_CRC32_DATA32: + case BUILT_IN_CRC64_DATA8: + case BUILT_IN_CRC64_DATA16: + case BUILT_IN_CRC64_DATA32: + case BUILT_IN_CRC64_DATA64: + return IFN_CRC; default: return IFN_LAST; diff --git a/gcc/builtins.def b/gcc/builtins.def index 5953266acba..f0ddfd0ce46 100644 --- a/gcc/builtins.def +++ b/gcc/builtins.def @@ -704,6 +704,16 @@ DEF_EXT_LIB_BUILTIN (BUILT_IN_Y1L, "y1l", BT_FN_LONGDOUBLE_LONGDOUBLE, ATTR_M DEF_EXT_LIB_BUILTIN (BUILT_IN_YN, "yn", BT_FN_DOUBLE_INT_DOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO) DEF_EXT_LIB_BUILTIN (BUILT_IN_YNF, "ynf", BT_FN_FLOAT_INT_FLOAT, ATTR_MATHFN_FPROUNDING_ERRNO) DEF_EXT_LIB_BUILTIN (BUILT_IN_YNL, "ynl", BT_FN_LONGDOUBLE_INT_LONGDOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO) +DEF_GCC_BUILTIN (BUILT_IN_CRC8_DATA8, "crc8_data8", BT_FN_UINT8_UINT8_UINT8_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST) +DEF_GCC_BUILTIN (BUILT_IN_CRC16_DATA8, "crc16_data8", BT_FN_UINT16_UINT16_UINT8_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST) +DEF_GCC_BUILTIN (BUILT_IN_CRC16_DATA16, "crc16_data16", BT_FN_UINT16_UINT16_UINT16_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST) +DEF_GCC_BUILTIN (BUILT_IN_CRC32_DATA8, "crc32_data8", BT_FN_UINT32_UINT32_UINT8_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST) +DEF_GCC_BUILTIN (BUILT_IN_CRC32_DATA16, "crc32_data16", BT_FN_UINT32_UINT32_UINT16_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST) +DEF_GCC_BUILTIN (BUILT_IN_CRC32_DATA32, "crc32_data32", BT_FN_UINT32_UINT32_UINT32_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST) +DEF_GCC_BUILTIN (BUILT_IN_CRC64_DATA8, "crc64_data8", BT_FN_UINT64_UINT64_UINT8_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST) +DEF_GCC_BUILTIN (BUILT_IN_CRC64_DATA16, "crc64_data16", BT_FN_UINT64_UINT64_UINT16_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST) +DEF_GCC_BUILTIN (BUILT_IN_CRC64_DATA32, "crc64_data32", BT_FN_UINT64_UINT64_UINT32_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST) +DEF_GCC_BUILTIN (BUILT_IN_CRC64_DATA64, "crc64_data64", 
BT_FN_UINT64_UINT64_UINT64_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST) /* Category: _Complex math builtins. */ DEF_C99_COMPL_BUILTIN (BUILT_IN_CABS, "cabs", BT_FN_DOUBLE_COMPLEX_DOUBLE, ATTR_MATHFN_FPROUNDING) diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md index c42e7b890db..4c896303242 100644 --- a/gcc/config/riscv/bitmanip.md +++ b/gcc/config/riscv/bitmanip.md @@ -856,3 +856,38 @@ "TARGET_ZBC" "clmulr\t%0,%1,%2" [(set_attr "type" "clmul")]) + +;; Iterator for hardware-supported integer modes, same as ANYI +(define_mode_iterator ANYI2 [QI HI SI (DI "TARGET_64BIT")]) + +;; CRC 8, 16, 32, (64 for TARGET_64) +(define_expand "crc<ANYI2:mode><ANYI:mode>4" + ;; return value (calculated CRC) + [(set (match_operand:ANYI 0 "register_operand" "=r") + ;; initial CRC + (unspec:ANYI [(match_operand:ANYI 1 "register_operand" "r") + ;; data + (match_operand:ANYI2 2 "register_operand" "r") + ;; polynomial + (match_operand:ANYI 3)] + UNSPEC_CRC))] +"" +{ + /* TODO: We only support cases where the data size is not greater + than the CRC size. */ + if (GET_MODE (operands[0]) >= GET_MODE (operands[2])) + { + /* If we have the ZBC extension (ie, clmul) and + it is possible to store the quotient within a single variable + (E.g. CRC64's quotient may need 65 bits, + we can't keep it in 64 bit variable.) + then use clmul instruction to implement the CRC, + otherwise generate table-based CRC. 
*/ + if (TARGET_ZBC && ((TARGET_64BIT && GET_MODE (operands[0]) != DImode) + || (!TARGET_64BIT && GET_MODE (operands[0]) < SImode))) + expand_crc_using_clmul (operands); + else + expand_crc_table_based (operands); + } + DONE; +}) \ No newline at end of file diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index c9520f689e2..35bf19806df 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -131,6 +131,8 @@ extern bool riscv_shamt_matches_mask_p (int, HOST_WIDE_INT); extern void riscv_subword_address (rtx, rtx *, rtx *, rtx *, rtx *); extern void riscv_lshift_subword (machine_mode, rtx, rtx, rtx *); extern enum memmodel riscv_union_memmodels (enum memmodel, enum memmodel); +extern void expand_crc_table_based (rtx *); +extern void expand_crc_using_clmul (rtx *); /* Routines implemented in riscv-c.cc. */ void riscv_cpu_cpp_builtins (cpp_reader *); diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 332fa720f01..e15850910a2 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -7975,6 +7975,286 @@ riscv_preferred_else_value (unsigned, tree, unsigned int nops, tree *ops) return nops == 3 ? ops[2] : ops[0]; } +/* Returns the quotient of polynomial long division of x^2n by POLYNOMIAL + in GF (2^n). */ + +unsigned HOST_WIDE_INT +gf2n_poly_long_div_quotient (unsigned HOST_WIDE_INT polynomial, size_t n) +{ + vec<short> x2n; + vec<bool> pol, q; + /* Create vector of bits, for the polynomial. */ + pol.create (n + 1); + for (size_t i=0 ; i < n; i++) + { + pol.quick_push (polynomial&1); + polynomial >>= 1; + } + pol.quick_push (1); + + /* Create vector for x^2n polynomial. */ + x2n.create (2 * n - 1); + for (size_t i = 0; i < 2 * (n - 1); i++) + x2n.safe_push (0); + x2n.safe_push (1); + + q.create (n); + for (size_t i = 0; i < n; i++) + q.quick_push (0); + + /* Calculate the quotient of x^2n/polynomial. 
*/ + for (int i = n - 1; i >= 0; i--) + { + int d = x2n[i + n - 1]; + if (d == 0) + continue; + for (int j = i + n - 1; j >= i; j--) + x2n[j] = x2n[j] ^ (pol[j - i] * d); + q[i] = d; + } + + /* Get the number from the vector of 0/1s. */ + unsigned HOST_WIDE_INT quotient = 0; + for (size_t i = 0; i < q.length (); i++) + { + quotient <<= 1; + quotient = quotient | q[q.length () - i - 1]; + } + return quotient; +} + +/* Calculates CRC for initial CRC and given POLYNOMIAL. */ + +static unsigned HOST_WIDE_INT +generate_crc (unsigned HOST_WIDE_INT crc, + unsigned HOST_WIDE_INT polynomial, + unsigned crc_bits) +{ + crc = crc << (crc_bits - 8); + for (int i = 8; i > 0; --i) + { + if ((crc >> (crc_bits - 1)) & 1) + crc = (crc << 1) ^ polynomial; + else + crc <<= 1; + } + + crc <<= (sizeof (crc) * BITS_PER_UNIT - crc_bits); + crc >>= (sizeof (crc) * BITS_PER_UNIT - crc_bits); + + return crc; +} + +/* Generates CRC lookup table by calculating CRC for all possible + 8-bit data values. The table is stored with a specific name in the read-only + data section. */ + +rtx +generate_crc_table (unsigned HOST_WIDE_INT polynom, unsigned crc_bits) +{ + gcc_assert (crc_bits <= 64); + FILE *out = asm_out_file; + + /* Buf size - 33 letters + + 18 for numbers (2 for crc bit size + 2 for 0x + 16 for 64 bit polynomial) + + 1 for \0. */ + char buf[33 + 20 + 1]; + sprintf (buf, "crc_table_for_%u_bit_crc_%llx_polynomial", crc_bits, polynom); + tree id = maybe_get_identifier (buf); + if (id) + return gen_rtx_SYMBOL_REF (Pmode, IDENTIFIER_POINTER (id)); + id = get_identifier (buf); + rtx tab = gen_rtx_SYMBOL_REF (Pmode, IDENTIFIER_POINTER (id)); + + /* Create a table with 256 elements. 
*/ + unsigned table_elements_n = 0x100; + char val_align_op[7]; + if (crc_bits <= 8) + sprintf (val_align_op, ".byte"); + else if (crc_bits <= 16) + sprintf (val_align_op, ".half"); + else if (crc_bits <= 32) + sprintf (val_align_op, ".word"); + else + sprintf (val_align_op, ".dword"); + + /* Write in read only data section. */ +#if 0 + asm_fprintf (out, "\t.pushsection\t%s.%s, \"a\"\n", ".gnu.linkonce.r", buf); + asm_fprintf (out, "\t.linkonce same_contents\n"); +#else + asm_fprintf (out, "\t.pushsection\t%s.%s, \"a\"\n", ".rodata", buf); + asm_fprintf (out, "\t.weak %s\n", buf); +#endif + asm_fprintf (out, "\t.type\t%s, @object\n", buf); + asm_fprintf (out, "\t.size\t%s, %d\n", buf, table_elements_n + * (crc_bits / BITS_PER_UNIT)); + asm_fprintf (out, "%s:\n\t%s ", buf, val_align_op); + + /* Generate CRC for each value from 1 to table_elements_n number. + These are CRC table's values. */ + for (unsigned i = 0; i < table_elements_n; i++) + { + unsigned HOST_WIDE_INT crc = generate_crc (i, polynom, crc_bits); + fprintf (out, HOST_WIDE_INT_PRINT_HEX, crc); + if (i % 8 != 7) + asm_fprintf (out, ", "); + else if (i < table_elements_n - 1) + asm_fprintf (out, "\n\t%s ", val_align_op); + else + asm_fprintf (out, "\n"); + } + asm_fprintf (out, "\n\t.popsection\n"); + + return tab; +} + +/* Generates table based CRC code. + CRC is operands[1], data is operands[2] and the polynomial is operands[3]. + The function, at first, using polynomial's value generates CRC table of 256 + elements, then generates the assembly for the following code, + where crc_size and data_size may be 8, 16, 32, 64, depending on CRC: + uint_crc_size_t + crc_crc_size (uint_crc_size_t crc_init, uint_data_size_t data, size_t size) + { + uint_crc_size_t crc = crc_init; + for (int i = 0; i < data_size / 8; i++) + crc = (crc << 8) ^ crc_table[(crc >> (crc_size - 8)) + ^ (data >> (data_size - (i + 1) * 8) & 0xFF))]; + return crc; + } + + In this implementation, 256 size table is generated for all bit CRCs. 
+ So to take values from the table, we need 8-bit data. + If input data size is not 8, when first we extract upper 8 bits, + then the other 8 bits and so on. */ + +void +expand_crc_table_based (rtx *operands) +{ + gcc_assert (!CONST_INT_P (operands[0])); + gcc_assert (CONST_INT_P (operands[3])); + machine_mode crc_mode = GET_MODE (operands[0]); + machine_mode data_mode = GET_MODE (operands[2]); + unsigned HOST_WIDE_INT mode_bit_size = GET_MODE_BITSIZE (crc_mode) + .to_constant (); + unsigned HOST_WIDE_INT data_mode_bit_size = GET_MODE_BITSIZE (data_mode) + .to_constant (); + + rtx crc = operands[1]; + rtx tab = generate_crc_table (UINTVAL (operands[3]), + mode_bit_size); + for (int i = 0; i < GET_MODE_SIZE (data_mode).to_constant (); i++) + { + /* crc >> (bit_size - 8). */ + rtx op1 = gen_rtx_ASHIFTRT (crc_mode, crc, + GEN_INT (mode_bit_size - 8)); + /* data >> (8 * (GET_MODE_SIZE (data_mode).to_constant () - i - 1)). */ + rtx data = force_reg (crc_mode, gen_rtx_ASHIFTRT (crc_mode, operands[2], + GEN_INT (8 * (GET_MODE_SIZE (data_mode).to_constant () - i - 1)))); + /* data >> (8 * (GET_MODE_SIZE (data_mode) + .to_constant () - i - 1)) & 0xFF. + */ + data = gen_rtx_AND (crc_mode, data, GEN_INT (255)); + /* (crc >> (bit_size - 8)) ^ data_8bit. */ + rtx in = force_reg (crc_mode, gen_rtx_XOR (crc_mode, op1, data)); + /* ((crc >> (bit_size - 8)) ^ data_8bit) & 0xFF. */ + rtx ix = gen_rtx_AND (crc_mode, in, GEN_INT (255)); + + /* crc_mode is the return value's mode, + depends on CRC function's CRC size. 
*/ + if (crc_mode != word_mode) + ix = gen_rtx_ZERO_EXTEND (word_mode, ix); + ix = gen_rtx_ASHIFT (word_mode, ix, GEN_INT (exact_log2 ( + GET_MODE_SIZE (crc_mode).to_constant ()))); + ix = force_reg (word_mode, ix); + + /* crc_table[(crc >> (bit_size - 8)) ^ data_8bit] */ + rtx tab_el = gen_rtx_MEM (crc_mode, gen_rtx_PLUS (word_mode, ix, tab)); + + /* (crc << 8) */ + rtx high = gen_rtx_ASHIFT (crc_mode, crc, GEN_INT (8)); + if (crc_mode != word_mode) + high = force_reg (crc_mode, + gen_rtx_AND (crc_mode, high, + GEN_INT (GET_MODE_MASK (crc_mode)))); + /* crc = (crc << 8) ^ crc_table[(crc >> (bit_size - 8)) ^ data_8bit]; */ + crc = force_reg (crc_mode, gen_rtx_XOR (crc_mode, tab_el, high)); + } + riscv_emit_move (operands[0], crc); +} + +/* Generates assembly to calculate CRC using clmul instruction. + The following code will be generated when the CRC and data sizes are equal: + li a4,quotient + li a5,polynomial + xor a0,a1,a0 + clmul a0,a0,a4 + srli a0,a0,crc_size + clmul a0,a0,a5 + slli a0,a0,GET_MODE_BITSIZE (word_mode) - crc_size + srli a0,a0,GET_MODE_BITSIZE (word_mode) - crc_size + ret + crc_size may be 8, 16, 32. + Some instructions will be added for the cases when CRC's size is larger than + data's size. + operands[1] is CRC, operands[2] is data, operands[3] is the polynomial. +*/ + +void +expand_crc_using_clmul (rtx *operands) +{ + /* Check and keep arguments. */ + gcc_assert (!CONST_INT_P (operands[0])); + gcc_assert (CONST_INT_P (operands[3])); + rtx crc = operands[1]; + rtx data = operands[2]; + unsigned HOST_WIDE_INT + crc_size = GET_MODE_BITSIZE (GET_MODE (operands[0])).to_constant (); + gcc_assert (crc_size <= 32); + unsigned HOST_WIDE_INT + data_size = GET_MODE_BITSIZE (GET_MODE (data)).to_constant (); + + /* Calculate the quotient. 
*/ + unsigned HOST_WIDE_INT + q = gf2n_poly_long_div_quotient (UINTVAL (operands[3]), crc_size + 1); + + if (crc_size > data_size) + crc = force_reg (word_mode, gen_rtx_LSHIFTRT (word_mode, crc, + GEN_INT (crc_size + - data_size))); + rtx t0 = force_reg (word_mode, GEN_INT (q)); + rtx t1 = force_reg (word_mode, operands[3]); + rtx a0 = force_reg (word_mode, gen_rtx_XOR (word_mode, crc, data)); + if (TARGET_64BIT) + emit_insn (gen_riscv_clmul_di (a0, a0, t0)); + else + emit_insn (gen_riscv_clmul_si (a0, a0, t0)); + + a0 = force_reg (word_mode, gen_rtx_LSHIFTRT (word_mode, a0, + GEN_INT (crc_size))); + if (TARGET_64BIT) + emit_insn (gen_riscv_clmul_di (a0, a0, t1)); + else + emit_insn (gen_riscv_clmul_si (a0, a0, t1)); + + if (crc_size > data_size) + { + rtx crc_part = force_reg (word_mode, + gen_rtx_ASHIFT (word_mode, operands[1], + GEN_INT (data_size))); + a0 = force_reg (word_mode, gen_rtx_XOR (word_mode, a0, crc_part)); + + } + a0 = force_reg (word_mode, gen_rtx_ASHIFT (word_mode, a0, GEN_INT ( + GET_MODE_BITSIZE (word_mode) - crc_size))); + a0 = force_reg (word_mode, gen_rtx_LSHIFTRT (word_mode, a0, GEN_INT ( + GET_MODE_BITSIZE (word_mode) - crc_size))); + rtx tgt = simplify_gen_subreg (word_mode, operands[0], + GET_MODE (operands[0]), 0); + riscv_emit_move (tgt, a0); +} + /* Initialize the GCC target structure. 
*/ #undef TARGET_ASM_ALIGNED_HI_OP #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t" diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index 4615e811947..21e1bf5ba19 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -70,6 +70,9 @@ UNSPEC_CLMUL UNSPEC_CLMULH UNSPEC_CLMULR + + ;; CRC unspec + UNSPEC_CRC ]) (define_c_enum "unspecv" [ diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc index 8e294286388..dd48428dc4d 100644 --- a/gcc/internal-fn.cc +++ b/gcc/internal-fn.cc @@ -192,6 +192,7 @@ init_internal_fns () #define mask_fold_left_direct { 1, 1, false } #define mask_len_fold_left_direct { 1, 1, false } #define check_ptrs_direct { 0, 0, false } +#define crc_direct { 1, -1, true } const direct_internal_fn_info direct_internal_fn_array[IFN_LAST + 1] = { #define DEF_INTERNAL_FN(CODE, FLAGS, FNSPEC) not_direct, @@ -3866,6 +3867,37 @@ expand_convert_optab_fn (internal_fn fn, gcall *stmt, convert_optab optab, expand_fn_using_insn (stmt, icode, 1, nargs); } +/* Expand CRC call STMT using optab OPTAB. 
*/ +static void +expand_crc_optab_fn (internal_fn, gcall *stmt, convert_optab optab) +{ + tree lhs = gimple_call_lhs (stmt); + tree rhs1 = gimple_call_arg (stmt, 0); // crc + tree rhs2 = gimple_call_arg (stmt, 1); // data + tree rhs3 = gimple_call_arg (stmt, 2); // polynomial + + tree result_type = TREE_TYPE (lhs); + tree data_type = TREE_TYPE (rhs2); + + rtx dest = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); + rtx op1 = expand_normal (rhs1); + rtx op2 = expand_normal (rhs2); + gcc_assert (TREE_CODE (rhs3) == INTEGER_CST); + rtx op3 = gen_rtx_CONST_INT (TYPE_MODE (result_type), + TREE_INT_CST_LOW (rhs3)); + + class expand_operand ops[4]; + create_output_operand (&ops[0], dest, TYPE_MODE (result_type)); + create_input_operand (&ops[1], op1, TYPE_MODE (result_type)); // crc + create_input_operand (&ops[2], op2, TYPE_MODE (data_type)); // data + create_input_operand (&ops[3], op3, TYPE_MODE (result_type)); //polynomial + insn_code icode = convert_optab_handler (optab, TYPE_MODE (data_type), + TYPE_MODE (result_type)); + expand_insn (icode, 4, ops); + if (!rtx_equal_p (dest, ops[0].value)) + emit_move_insn (dest, ops[0].value); +} + /* Expanders for optabs that can use expand_direct_optab_fn. 
*/ #define expand_unary_optab_fn(FN, STMT, OPTAB) \ @@ -3996,6 +4028,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types, #define direct_cond_len_unary_optab_supported_p direct_optab_supported_p #define direct_cond_len_binary_optab_supported_p direct_optab_supported_p #define direct_cond_len_ternary_optab_supported_p direct_optab_supported_p +#define direct_crc_optab_supported_p convert_optab_supported_p #define direct_mask_load_optab_supported_p convert_optab_supported_p #define direct_load_lanes_optab_supported_p multi_vector_optab_supported_p #define direct_mask_load_lanes_optab_supported_p multi_vector_optab_supported_p diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index 04f3812326e..c8a53735451 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -153,6 +153,7 @@ along with GCC; see the file COPYING3. If not see DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _ODD, FLAGS, SELECTOR, SOPTAB##_odd, UOPTAB##_odd, TYPE) #endif +DEF_INTERNAL_OPTAB_FN (CRC, ECF_CONST | ECF_NOTHROW, crc, crc) DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load) DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes) DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE, diff --git a/gcc/optabs.def b/gcc/optabs.def index 1ea1947b3b5..7135f5334d7 100644 --- a/gcc/optabs.def +++ b/gcc/optabs.def @@ -78,6 +78,7 @@ OPTAB_CD(smsub_widen_optab, "msub$b$a4") OPTAB_CD(umsub_widen_optab, "umsub$b$a4") OPTAB_CD(ssmsub_widen_optab, "ssmsub$b$a4") OPTAB_CD(usmsub_widen_optab, "usmsub$a$b4") +OPTAB_CD(crc_optab, "crc$a$b4") OPTAB_CD(vec_load_lanes_optab, "vec_load_lanes$a$b") OPTAB_CD(vec_store_lanes_optab, "vec_store_lanes$a$b") OPTAB_CD(vec_mask_load_lanes_optab, "vec_mask_load_lanes$a$b") diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index b11b4632e1e..cc57995065e 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,10 @@ +2023-08-03 Mariam Arutunian <mariamarutunian@gmail.com> + + * 
gcc.target/riscv/crc-builtin-table-target32.c: New test. + * gcc.target/riscv/crc-builtin-table-target64.c: New test. + * gcc.target/riscv/crc-builtin-zbc32.c: New test. + * gcc.target/riscv/crc-builtin-zbc64.c: New test. + 2023-07-22 Vineet Gupta <vineetg@rivosinc.com> * gcc.target/riscv/pr110748-1.c: New Test. diff --git a/gcc/testsuite/gcc.target/riscv/crc-builtin-table-target32.c b/gcc/testsuite/gcc.target/riscv/crc-builtin-table-target32.c new file mode 100644 index 00000000000..195d8a2e207 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/crc-builtin-table-target32.c @@ -0,0 +1,36 @@ +/* { dg-do compile } */ +#include <stdint-gcc.h> + +int8_t crc8_data8 () +{ + return __builtin_crc8_data8 (0x34, 'a', 0x12); +} + +int16_t crc16_data8 () +{ + return __builtin_crc16_data8 (0x1234, 'a', 0x1021); +} + +int16_t crc16_data16 () +{ + return __builtin_crc16_data16 (0x1234, 0x3214, 0x1021); +} + +int32_t crc32_data8 () +{ + return __builtin_crc32_data8 (0xffffffff, 0x32, 0x4002123); +} + +int32_t crc32_data16 () +{ + return __builtin_crc32_data16 (0xffffffff, 0x3232, 0x4002123); +} + +int32_t crc32_data32 () +{ + return __builtin_crc32_data32 (0xffffffff, 0x123546ff, 0x4002123); +} + +/* { dg-final { scan-assembler "crc_table_for_8_bit_crc_12_polynomial" } } */ +/* { dg-final { scan-assembler "crc_table_for_16_bit_crc_1021_polynomial"} } */ +/* { dg-final { scan-assembler "crc_table_for_32_bit_crc_4002123_polynomial"} } */ \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/riscv/crc-builtin-table-target64.c b/gcc/testsuite/gcc.target/riscv/crc-builtin-table-target64.c new file mode 100644 index 00000000000..ef35ab10ebf --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/crc-builtin-table-target64.c @@ -0,0 +1,60 @@ +/* { dg-do compile { target lp64 } } */ +#include <stdint-gcc.h> + +int8_t crc8_data8 () +{ + return __builtin_crc8_data8 (0x34, 'a', 0x12); +} + +int16_t crc16_data8 () +{ + return __builtin_crc16_data8 (0x1234, 'a', 0x1021); +} + +int16_t 
crc16_data16 () +{ + return __builtin_crc16_data16 (0x1234, 0x3214, 0x1021); +} + +int32_t crc32_data8 () +{ + return __builtin_crc32_data8 (0xffffffff, 0x32, 0x4002123); +} + +int32_t crc32_data16 () +{ + return __builtin_crc32_data16 (0xffffffff, 0x3232, 0x4002123); +} + +int32_t crc32_data32 () +{ + return __builtin_crc32_data32 (0xffffffff, 0x123546ff, 0x4002123); +} + +int64_t crc64_data8 () +{ + return __builtin_crc64_data8 (0xffffffffffffffff, 0x32, 0x40021234002123); +} + +int64_t crc64_data16 () +{ + return __builtin_crc64_data16 (0xffffffffffffffff, + 0x3232, 0x40021234002123); +} + +int64_t crc64_data32 () +{ + return __builtin_crc64_data32 (0xffffffffffffffff, + 0x123546ff, 0x40021234002123); +} + +int64_t crc64_data64 () +{ + return __builtin_crc64_data64 (0xffffffffffffffff, + 0x123546ff123546ff, 0x40021234002123); +} + +/* { dg-final { scan-assembler "crc_table_for_8_bit_crc_12_polynomial" } } */ +/* { dg-final { scan-assembler "crc_table_for_16_bit_crc_1021_polynomial"} } */ +/* { dg-final { scan-assembler "crc_table_for_32_bit_crc_4002123_polynomial"} } */ +/* { dg-final { scan-assembler "crc_table_for_64_bit_crc_40021234002123_polynomial"} } */ \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/riscv/crc-builtin-zbc32.c b/gcc/testsuite/gcc.target/riscv/crc-builtin-zbc32.c new file mode 100644 index 00000000000..45a1c8a72a2 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/crc-builtin-zbc32.c @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zbc -mabi=ilp32" } */ +#include <stdint-gcc.h> + +int8_t crc8_data8 () +{ + return __builtin_crc8_data8 (0x34, 'a', 0x12); +} + +int16_t crc16_data8 () +{ + return __builtin_crc16_data8 (0x1234, 'a', 0x1021); +} + +int16_t crc16_data16 () +{ + return __builtin_crc16_data16 (0x1234, 0x3214, 0x1021); +} + +/* { dg-final { scan-assembler-times "clmul\t" 6 } } */ \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/riscv/crc-builtin-zbc64.c 
b/gcc/testsuite/gcc.target/riscv/crc-builtin-zbc64.c new file mode 100644 index 00000000000..4c619d6bd5e --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/crc-builtin-zbc64.c @@ -0,0 +1,35 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc_zbc -mabi=lp64" } */ +#include <stdint-gcc.h> + +int8_t crc8_data8 () +{ + return __builtin_crc8_data8 (0x34, 'a', 0x12); +} + +int16_t crc16_data8 () +{ + return __builtin_crc16_data8 (0x1234, 'a', 0x1021); +} + +int16_t crc16_data16 () +{ + return __builtin_crc16_data16 (0x1234, 0x3214, 0x1021); +} + +int32_t crc32_data8 () +{ + return __builtin_crc32_data8 (0xffffffff, 0x32, 0x4002123); +} + +int32_t crc32_data16 () +{ + return __builtin_crc32_data16 (0xffffffff, 0x3232, 0x4002123); +} + +int32_t crc32_data32 () +{ + return __builtin_crc32_data32 (0xffffffff, 0x123546ff, 0x4002123); +} + +/* { dg-final { scan-assembler-times "clmul\t" 12 } } */ \ No newline at end of file -- 2.25.1
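
As a companion to the clmul sequence documented above expand_crc_using_clmul, the following C sketch models the same idea in software for the case where the CRC and data sizes are both 16 bits. It rests on several assumptions: clmul64 is a hypothetical software stand-in for the Zbc clmul instruction on RV64 (low 64 bits of the carry-less product), gf2_quotient_x2n is a textbook GF(2) long division rather than a copy of gf2n_poly_long_div_quotient, and the CRC is MSB-first with no reflection and no final XOR.

```c
#include <assert.h>
#include <stdint.h>

/* Carry-less multiply returning the low 64 bits of the product.
   Hypothetical software model of the Zbc clmul instruction on RV64.  */
static uint64_t
clmul64 (uint64_t a, uint64_t b)
{
  uint64_t r = 0;
  for (int i = 0; i < 64; i++)
    if ((b >> i) & 1)
      r ^= a << i;
  return r;
}

/* Quotient of x^(2n) divided by x^n + POLY_LOW over GF(2), for n <= 31.
   The quotient has degree n, so it needs n + 1 bits.  */
static uint64_t
gf2_quotient_x2n (uint64_t poly_low, unsigned n)
{
  assert (n >= 1 && n <= 31);
  uint64_t divisor = ((uint64_t) 1 << n) | poly_low;
  uint64_t rem = (uint64_t) 1 << (2 * n);
  uint64_t q = 0;
  for (int shift = (int) n; shift >= 0; shift--)
    if (rem & ((uint64_t) 1 << (shift + n)))
      {
	rem ^= divisor << shift;
	q |= (uint64_t) 1 << shift;
      }
  return q;
}

/* CRC16 of 16 bits of data with two carry-less multiplies:
   xor, clmul by the quotient, shift right by the CRC size,
   clmul by the polynomial, then keep the low 16 bits.  */
static uint16_t
crc16_data16_clmul (uint16_t crc, uint16_t data, uint16_t poly)
{
  uint64_t mu = gf2_quotient_x2n (poly, 16);
  uint64_t m = (uint64_t) (crc ^ data);
  uint64_t q = clmul64 (m, mu) >> 16;
  return (uint16_t) clmul64 (q, poly);
}

/* Naive bit-by-bit reference, used only for comparison.  */
static uint16_t
crc16_data16_bitwise (uint16_t crc, uint16_t data, uint16_t poly)
{
  crc ^= data;
  for (int i = 0; i < 16; i++)
    crc = (crc & 0x8000) ? (uint16_t) ((crc << 1) ^ poly)
			 : (uint16_t) (crc << 1);
  return crc;
}

/* Return 1 if the clmul-based and bit-by-bit versions agree for every
   16-bit data value with initial CRC 0 and 0xffff, 0 otherwise.  */
static int
crc16_clmul_matches_bitwise (uint16_t poly)
{
  for (uint32_t data = 0; data <= 0xffff; data++)
    if (crc16_data16_clmul (0, (uint16_t) data, poly)
	!= crc16_data16_bitwise (0, (uint16_t) data, poly)
	|| crc16_data16_clmul (0xffff, (uint16_t) data, poly)
	   != crc16_data16_bitwise (0xffff, (uint16_t) data, poly))
      return 0;
  return 1;
}
```

The quotient needs one bit more than the CRC (0x11130 for polynomial 0x1021), which illustrates why a CRC as wide as the machine word cannot keep its quotient in a single register and falls back to the table-based code.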