On Wed, Jul 31, 2024 at 10:15 AM Mariam Arutunian <mariamarutun...@gmail.com> wrote: > > This patch adds a new compiler pass aimed at identifying naive CRC > implementations, > characterized by the presence of a loop calculating a CRC (polynomial long > division). > Upon detection of a potential CRC, the pass prints an informational message. > > Performs CRC optimization if optimization level is >= 2, > besides optimizations for size and if fno_gimple_crc_optimization given. > > This pass is added for the detection and optimization of naive CRC > implementations, > improving the efficiency of CRC-related computations. > > This patch includes only initial fast checks for filtering out non-CRCs, > detected possible CRCs verification and optimization parts will be provided > in subsequent patches.
+fgimple-crc-optimization Please do not put IL names in flags exposed to users, this should be -foptimize-crc +This flag is enabled by default at @option{-O2} and higher +if @option{-Os} is not also specified. Please leave out "if @option{-Os} is not also specified". I wonder why for example if the CPU has a CRC instruction and the polynomial matches we would disable this when optimizing for size? A CRC instruction should be smaller? +/* This pass performs CRC optimization. */ +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "backend.h" +#include "tree.h" +#include "gimple.h" +#include "tree-pass.h" +#include "ssa.h" +#include "gimple-iterator.h" +#include "tree-cfg.h" +#include "tree-ssa-loop-niter.h" please try to prune the set of #include (I guess you've copied this from elsewhere). + /* Don't continue searching if encountered the loop's beginning. */ + if (bb_loop_header_p (gimple_bb (stmt))) + return nullptr; gimple_bb (stmt) == m_crc_loop->header it looks like this loop ignores other uses of xored_crc, like in calls? Shouldn't there be a else if (!is_gimple_debug (stmt)) return nullptr; in the FOR_EACH_IMM_USE_FAST loop? It looks like loop_may_calculate_crc walks all loops XOR stmts and for each xor_calculates_crc you call set_bbs_stmts_not_visited. This looks like it would be quadratic in complexity considering a loop with 1000s of XOR stmts _not_ computing a CRC? This would be solved by aborting the search when the first found XOR is not a CRC for example. Or by doing the initialization only once. It also seems that in xor_calculates_crc finding the shift after the XOR is much cheaper than the one before as that collects dependent stmts (but we unset visited on all stmts _again_). May I suggest to use a bitmap of SSA def SSA_NAME_VERSIONs instead of using the gimple visited flag? You are collecting a possibly very large use-def chain but only very simple checks are later done on it. Richard. > gcc/ > > * Makefile.in (OBJS): Add gimple-crc-optimization.o. > * common.opt (fgimple-crc-optimization): New option. > * common.opt.urls: Regenerate to add > fgimple-crc-optimization. > * doc/invoke.texi (-fgimple-crc-optimization): Add documentation. > * gimple-crc-optimization.cc: New file. > * gimple.cc (set_phi_stmts_not_visited): New function. > (set_gimple_stmts_not_visited): Likewise. > (set_bbs_stmts_not_visited): Likewise. > * gimple.h (set_gimple_stmts_not_visited): New extern function > declaration. > (set_phi_stmts_not_visited): New extern function declaration. > (set_bbs_stmts_not_visited): New extern function declaration. > * opts.cc (default_options_table): Add OPT_fgimple_crc_optimization. > (enable_fdo_optimizations): Enable gimple-crc-optimization. > * passes.def (pass_crc_optimization): Add new pass. > * timevar.def (TV_GIMPLE_CRC_OPTIMIZATION): New timevar. > * tree-pass.h (make_pass_crc_optimization): New extern function > declaration. > > Signed-off-by: Mariam Arutunian <mariamarutun...@gmail.com>