https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107905

            Bug ID: 107905
           Summary: 2x slowdown versus CLANG and ICL
           Product: gcc
           Version: 11.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: sanmayce at sanmayce dot com
  Target Milestone: ---

Hi,
I have encountered a performance problem with MinGW GCC 11.3.0: a nearly 2x
slowdown, compared with Clang and ICL, on a simple C function (the 3rd fastest
wildcard matching function).

```
int Tcheburaschka_Wildcard_Iterative_Kaze_CaseSensitive (const char* mask, const char* name) {
        const char* maskSTACK;
        const char* nameSTACK;
#pragma nounroll
        for (name, mask; *name; ++name, ++mask) {
                if (*mask == '*') {
                        goto Backtrack;
                //} else if ((*mask != '?') && (*name != *mask)) {
/*
                } else if ((*mask - '?') * (*name - *mask)) {
                        return 0;
                } 
*/
                }
                if ((*mask - '?') * (*name - *mask)) {
                        return 0;
                } 
        }
Backtrack:
#pragma nounroll
        for (nameSTACK = name, maskSTACK = mask; *nameSTACK; ++nameSTACK, ++maskSTACK) {
                if (*maskSTACK == '*') {
                        mask = maskSTACK+1;
                        if (!*mask) return 1;
                        name = nameSTACK;
                        goto Backtrack;
                //} else if ((*maskSTACK != '?') && (*nameSTACK != *maskSTACK)) {
/*
                } else if ((*maskSTACK - '?') * (*nameSTACK - *maskSTACK)) {
                        name++;
                        goto Backtrack;
                } 
*/
                }
                if ((*maskSTACK - '?') * (*nameSTACK - *maskSTACK)) {
                        name++;
                        goto Backtrack;
                } 
        }
        while (*maskSTACK == '*') ++maskSTACK;
        return (!*maskSTACK);
}
```
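The function above replaces the commented-out test `(*mask != '?') && (*name != *mask)` with the product `(*mask - '?') * (*name - *mask)`. The following is a minimal sketch, not part of the original report, that checks the two forms agree for every pair of `char` values. The helper names `mismatch_logical`, `mismatch_product` and `forms_agree` are hypothetical and exist only for illustration.

```c
#include <limits.h>

/* Logical form, taken from the commented-out line in the report. */
static int mismatch_logical(char m, char n) {
    return (m != '?') && (n != m);
}

/* Product form used in the report: after integer promotion the product
   is nonzero only when both differences are nonzero. */
static int mismatch_product(char m, char n) {
    return ((m - '?') * (n - m)) != 0;
}

/* Hypothetical helper: returns 1 if both forms give the same answer
   for every (mask char, name char) pair, 0 otherwise. */
static int forms_agree(void) {
    for (int m = CHAR_MIN; m <= CHAR_MAX; ++m)
        for (int n = CHAR_MIN; n <= CHAR_MAX; ++n)
            if (mismatch_logical((char)m, (char)n) !=
                mismatch_product((char)m, (char)n))
                return 0;
    return 1;
}
```

Since the two forms are logically the same, any speed difference between them would come from the code the compiler generates (a multiply versus two compares), which is presumably why the report uses the product form.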

```
[CPU: AMD Zen2 Ryzen7 4800H, @2.9GHz, Max. Boost Clock Up to 4.2GHz]
+-------------------------------------------------------------------------------+-------------------+-------------------------+-----------------------+
| Function \ Compiler                                                           | CLANG 14.0.1, -O3 | Intel's ICL 19.0.0, /O3 | MinGW gcc 11.3.0, -O3 |
+-------------------------------------------------------------------------------+-------------------+-------------------------+-----------------------+
| Dogan Kurt's 'Antimalware', 2016, Iterative (wild_iterative)                  |       70.605000 s |            102.610000 s |           83.398000 s |
| Dogan Kurt's 'Antimalware', 2016, Iterative Optimised (wild_iterative_opt)    |       61.322000 s |             74.243000 s |           66.538000 s |
| Tcheburaschka_r3, 2022, (Tcheburaschka_Wildcard_Iterative_Kaze_CaseSensitive) |       72.990000 s |             76.161000 s |          127.717000 s |
| JackHandy_Iterative, 2005, (IterativeWildcards)                               |       80.053000 s |             90.872000 s |           70.156000 s |
| Kirk J. Krauss, 2014, DrDobbs (FastWildCompare)                               |       44.113000 s |             48.109000 s |           51.018000 s |
| Alessandro Cantatore, 2003, (szWildMatch7)                                    |       98.729000 s |             85.986000 s |          121.965000 s |
| Nondeterministic Finite Automaton (wild_nfa)                                  |      162.561000 s |            200.022000 s |          176.440000 s |
+-------------------------------------------------------------------------------+-------------------+-------------------------+-----------------------+
[Note1: All functions returned 1,075,000,000 Matches - that is TRUEs, kinda
means they passed the quality test, no, really, I printed all the 1's and 0's
after each run - the sequences matched.]
[Note2: It is well-known that Maximum Turbo Modes are maintained for some 15-30
seconds, so it is good that each function takes 30+ seconds, to emulate some 8
billion real-world searches.]
```
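The report does not show the benchmark driver, so the following is only a sketch of how per-function timings like those above could be collected, assuming a simple loop timed with the standard `clock()` function. The names `match_fn`, `demo_match` and `time_matcher` are hypothetical, and `demo_match` is a stand-in that handles `?` but not `*`; it is not one of the benchmarked functions.

```c
#include <stdio.h>
#include <time.h>

/* Common signature shared by the matchers: (mask, name) -> 0 or 1. */
typedef int (*match_fn)(const char *mask, const char *name);

/* Stand-in matcher for illustration only: supports '?' but not '*'. */
static int demo_match(const char *mask, const char *name) {
    for (; *name && *mask; ++name, ++mask)
        if (*mask != '?' && *mask != *name)
            return 0;
    return !*name && !*mask;
}

/* Calls fn `iterations` times, returns the number of matches and
   stores the elapsed CPU time in seconds in *seconds. */
static unsigned long time_matcher(match_fn fn, const char *mask,
                                  const char *name,
                                  unsigned long iterations,
                                  double *seconds) {
    unsigned long matches = 0;
    clock_t start = clock();
    for (unsigned long i = 0; i < iterations; ++i)
        matches += (unsigned long)(fn(mask, name) != 0);
    *seconds = (double)(clock() - start) / CLOCKS_PER_SEC;
    return matches;
}
```

A caller would pass each matcher in turn, for example `time_matcher(demo_match, "a?c", "abc", 1000000UL, &secs)`, and compare both the elapsed time and the match count across functions. A real driver would likely vary the inputs between iterations so the compiler cannot hoist the call out of the loop.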

For more info:
https://github.com/kirkjkrauss/MatchingWildcards/issues/1#issue-1467311771
