Re: RISC-V: Added support for CRC.

Jeff Law via Gcc-patches Tue, 15 Aug 2023 21:13:05 -0700



On 8/9/23 00:32, Alexander Monakov wrote:


On Tue, 8 Aug 2023, Jeff Law wrote:

If the compiler can identify a CRC and collapse it down to a table or clmul,
that's a major win and such code does exist in the real world. That was the
whole point behind the Fedora experiment -- to determine if these things are
showing up in the real world or if this is just a benchmarking exercise.


Can you share the results of the experiment and give your estimate of what
sort of real-world improvement is expected? I already listed the popular
FOSS projects where CRC performance is important: the Linux kernel and
a few compression libraries. Those projects do not use a bitwise CRC loop,
except sometimes for table generation on startup (which needs less time
than a page fault that may be necessary to bring in a hardcoded table).

That experiment was ~7 months ago. I don't think any of the data is still around except for some extracted testcases.


For those projects that need a better CRC, why is the chosen solution
to optimize it in the compiler instead of offering them a library they
could use with any compiler?

Because if the compiler can optimize it automatically, then the projects have to do literally nothing to take advantage of it. They just compile normally and their bitwise CRC gets optimized down to either a table lookup or a clmul variant. That's the real goal here.
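For readers following along, here is a minimal sketch of the two source-level forms being discussed: a bitwise CRC loop and the table-driven equivalent. It assumes the common reflected CRC-32 parameters (polynomial 0xEDB88320, initial value and final XOR of 0xFFFFFFFF); the function names are made up for illustration and none of this is taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed parameters: reflected CRC-32 as used by zlib and Ethernet. */
#define CRC32_POLY 0xEDB88320u

/* Bitwise form: one conditional XOR per message bit. This is the kind of
   loop a CRC recognizer would have to detect. */
uint32_t crc32_bitwise(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ ((crc & 1u) ? CRC32_POLY : 0u);
    }
    return crc ^ 0xFFFFFFFFu;
}

/* 256-entry lookup table, 1 KiB of data. */
static uint32_t crc32_table[256];

/* Fill the table by running the bitwise step on every byte value. */
void crc32_init_table(void)
{
    for (uint32_t n = 0; n < 256; n++) {
        uint32_t c = n;
        for (int k = 0; k < 8; k++)
            c = (c >> 1) ^ ((c & 1u) ? CRC32_POLY : 0u);
        crc32_table[n] = c;
    }
}

/* Table-driven form: one lookup per message byte. */
uint32_t crc32_table_driven(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++)
        crc = (crc >> 8) ^ crc32_table[(crc ^ buf[i]) & 0xFFu];
    return crc ^ 0xFFFFFFFFu;
}
```

Both functions return the same value for any input once crc32_init_table() has run. The table costs 1 KiB of data, which is the space trade-off raised elsewhere in this thread.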

If a step where we provide the backend bits hooked up to a builtin isn't useful, then we won't pursue it. The thinking was it would provide value for those willing to make a slight change to their sources and at the same time we get real world exposure for the backend work of the CRC optimization effort while we polish the gimple detection bits.


Was there any thought given to embedded projects that use bitwise CRC
exactly because they have little space for a hardcoded table to spare?

It wasn't an explicit goal, but the ability to select between a table implementation and a clmul implementation in the backend seemed useful, so we wired up both.


No, not if the compiler is not GCC, or its version is less than 14. And
those projects are not going to sacrifice their portability just for
__builtin_crc.

You may be right. I don't think it's so clear cut, though.

I think offering a conventional library for CRC has substantial advantages.

That's not what I asked.  If you think there's room for improvement to a
builtin API, I'd love to hear it.

But it seems you don't think this is worth the effort at all.  That's
unfortunate, but if that's the consensus, then so be it.


I think it's a strange application of development effort. You'd get more
done coding a library.

Not if the end goal is to detect the CRC and optimize it into a table or clmul without the user having to do anything special.

Again, what we've proposed in this patch is a piece of that larger body of work, specifically the backend bits that we thought would have value independently. If the community doesn't see that carved out chunk as helpful we'll table it until the whole end-to-end path is ready for submission.

I'll note LLVM is likely going forward with CRC detection and optimization at
some point in the next ~6 months (effectively moving the implementation from
the hexagon port into the generic parts of their loop optimizer).


I don't see CRC detection in the Hexagon port. There is a recognizer for
polynomial multiplication (CRC is division, not multiplication).

Yes, you need the recognizer so that you can detect a CRC loop, then with a bit of math you turn that into a carryless multiply sequence. I find the math here mindbending, but the Hexagon bits are precisely to optimize CRC loops. Sadly the Hexagon bits are fairly specific to the CRC implementation inside coremark. The GCC bits we've been working on are much more general.
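To make the division/multiplication connection concrete, here is a small sketch under simplifying assumptions: an 8-bit CRC with polynomial 0x07, MSB-first, zero initial value and no final XOR. The helper names are hypothetical, clmul16() is a software stand-in for a hardware carryless multiply, and polymod() is a plain long-division loop standing in for whatever reduction step a real sequence would use. It is not the Hexagon code or the GCC implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed polynomial: x^8 + x^2 + x + 1, with the x^8 term implicit. */
#define CRC8_POLY 0x07u

/* Reference bitwise CRC-8: MSB-first, init 0, no final XOR. */
uint8_t crc8_bitwise(const uint8_t *buf, size_t len)
{
    uint8_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int k = 0; k < 8; k++)
            crc = (crc & 0x80u) ? (uint8_t)((crc << 1) ^ CRC8_POLY)
                                : (uint8_t)(crc << 1);
    }
    return crc;
}

/* Software carryless multiply: polynomial product over GF(2). */
uint32_t clmul16(uint16_t a, uint16_t b)
{
    uint32_t r = 0;
    for (int i = 0; i < 16; i++)
        if ((b >> i) & 1u)
            r ^= (uint32_t)a << i;
    return r;
}

/* Remainder of v(x) divided by x^8 + CRC8_POLY, by long division. */
uint8_t polymod(uint32_t v)
{
    const uint32_t p = 0x100u | CRC8_POLY;
    for (int i = 31; i >= 8; i--)
        if ((v >> i) & 1u)
            v ^= p << (i - 8);
    return (uint8_t)v;
}

/* CRC of the two-byte message (a, b) is (a*x^16 + b*x^8) mod P. Each term
   is multiplied by a precomputed constant x^k mod P, the partial products
   are XORed, and a single reduction finishes the job. */
uint8_t crc8_two_bytes_folded(uint8_t a, uint8_t b)
{
    uint8_t k16 = polymod(1u << 16); /* x^16 mod P */
    uint8_t k8  = polymod(1u << 8);  /* x^8  mod P */
    uint32_t folded = clmul16(a, k16) ^ clmul16(b, k8);
    return polymod(folded);
}
```

The folded result matches crc8_bitwise() for every two-byte input. The division is still there, but it collapses into multiplications by constants plus one final reduction, which is why a multiply instruction is enough.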

One final note. Elsewhere in this thread you described performance concerns. Right now clmuls can be implemented in 4c, fully piped. I fully expect that latency to drop within the next 12-18 months. In that world, there's not going to be much benefit to using hand-coded libraries vs just letting the compiler do it.


Jeff
