Hello!

I ran some COPY FROM tests using master and then Nazir's v7-0001 and
v7-0002 patches applied to master.

x86 master
TXT :                 29222.524250 ms
CSV :                 36162.588500 ms
TXT with 1/3 escapes: 32922.649750 ms
CSV with 1/3 quotes:  47631.423750 ms

x86 v7-0001
TXT :                 23247.834250 ms  20.445496% improvement
CSV :                 23162.711750 ms  35.948413% improvement
TXT with 1/3 escapes: 31786.386000 ms  3.451313% improvement
CSV with 1/3 quotes:  43330.475500 ms  9.029645% improvement

x86 v7-0002
TXT :                 22394.812500 ms  23.364552% improvement
CSV :                 22374.645750 ms  38.127643% improvement
TXT with 1/3 escapes: 32378.929750 ms  1.651507% improvement
CSV with 1/3 quotes:  47139.171750 ms  1.033461% improvement

arm master
TXT :                 9448.900500 ms
CSV :                 11135.871500 ms
TXT with 1/3 escapes: 10786.418750 ms
CSV with 1/3 quotes:  14115.335500 ms

arm v7-0001
TXT :                 7271.170500 ms  23.047443% improvement
CSV :                 7259.866750 ms  34.806479% improvement
TXT with 1/3 escapes: 10894.445500 ms  1.001507% regression
CSV with 1/3 quotes:  13398.444000 ms  5.078813% improvement

arm v7-0002
TXT :                 7165.707250 ms  24.163587% improvement
CSV :                 7140.497250 ms  35.878416% improvement
TXT with 1/3 escapes: 10308.782250 ms  4.428129% improvement
CSV with 1/3 quotes:  12576.179500 ms  10.904140% improvement

v7-0001 + v7-0002 applied to master certainly seems promising: every test
is faster than master on both x86 and arm, though on x86 the cases with
escapes and quotes gain only 1-2%.
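
For anyone checking the numbers: the percentages above are consistent with
(master - patched) / master * 100, taken against the master timing for the
same machine and test. A minimal sketch of that calculation follows;
improvement_pct is just an illustrative name, not something from the patch.

```c
#include <assert.h>

/*
 * Relative change of a patched timing against the master baseline, in
 * percent.  Positive means the patched build was faster; negative means
 * it was slower (a regression).  The function name is hypothetical.
 */
double
improvement_pct(double master_ms, double patched_ms)
{
    assert(master_ms > 0.0);
    return (master_ms - patched_ms) / master_ms * 100.0;
}
```

For example, improvement_pct(29222.524250, 23247.834250) comes out at
roughly 20.4455, matching the x86 v7-0001 TXT row, and a negative result
indicates a regression.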

On Fri, Feb 13, 2026 at 5:09 PM Nathan Bossart <[email protected]>
wrote:

> On Fri, Feb 13, 2026 at 02:45:30PM +0300, Nazir Bilal Yavuz wrote:
> > Also, if I change this code to:
> >
> >     if (cstate->simd_enabled)
> >     {
> >         if (is_csv)
> >             result = CopyReadLineText(cstate, true, true);
> >         else
> >             result = CopyReadLineText(cstate, false, true);
> >     }
> >     else
> >     {
> >         if (is_csv)
> >             result = CopyReadLineText(cstate, true, false);
> >         else
> >             result = CopyReadLineText(cstate, false, false);
> >     }
> >
> > then I see ~5% performance improvement in the scalar path compared to master.
>
> Hm.  What difference do you see if you just do
>
>         if (is_csv)
>                 result = CopyReadLineText(cstate, true);
>         else
>                 result = CopyReadLineText(cstate, false);
>
> both with and without the SIMD stuff?  IIUC this is allowing the compiler
> to remove several branches in CopyReadLineText(), which might be a nice
> improvement on its own.  That being said, I'm less convinced that adding a
> simd_enabled parameter to CopyReadLineText() helps, because 1) it's
> involved in fewer branches and 2) we change it within the function, so the
> compiler can't remove the branches, anyway.  But perhaps I'm missing
> something.
>
> Some other random thoughts:
>
> +                    match = vector8_or(vector8_eq(chunk, nl), vector8_eq(chunk, cr));
>
> +                match = vector8_or(vector8_eq(chunk, nl), vector8_eq(chunk, cr));
>
> Since \n and \r are well below "normal" ASCII values, I wonder if we could
> simplify these to something like
>
>         match = vector8_gt(... vector with all lanes set to \r + 1 ..., chunk);
>
> +            /* Check if we found any special characters */
> +            mask = vector8_highbit_mask(match);
> +            if (mask != 0)
>
> vector8_highbit_mask() is somewhat expensive on AArch64, so I wonder if
> waiting until we enter the "if" block to calculate it has any benefit.
>
> +                simd_hit_eol = (c1 == '\r' || c1 == '\n') && (!is_csv || !in_quote);
>
> If (is_csv && in_quote), we shouldn't have picked up \r or \n in the first
> place, right?
>
> +                simd_hit_eof = c1 == '\\' && c2 == '.' && !is_csv;
> +
> +                /*
> +                 * Do not disable SIMD when we hit EOL or EOF characters.  In
> +                 * practice, it does not matter for EOF because parsing ends
> +                 * there, but we keep the behavior consistent.
> +                 */
> +                if (!(simd_hit_eof || simd_hit_eol))
>
> I'd think that doing less unnecessary work would outweigh the benefits of
> consistency for the EOF case.
>
> --
> nathan
>
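
A minimal standalone sketch of the constant-argument pattern discussed in
the quoted text, in case it helps anyone following along. This is not the
patch's code: scan_line and find_eol are hypothetical stand-ins, and the
CSV handling is deliberately simplified (no escape processing). The point
is only that each call site passes a literal, so a compiler that inlines
the helper can drop the is_csv branches in each copy. Plain "inline" is
just a hint, so whether that actually happens depends on the compiler and
flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a line scanner: returns the offset of the
 * first end-of-line byte, or len if there is none.  In CSV mode, EOL
 * bytes inside double quotes are skipped.
 */
static inline size_t
scan_line(const char *buf, size_t len, bool is_csv)
{
    bool        in_quote = false;

    for (size_t i = 0; i < len; i++)
    {
        char        c = buf[i];

        if (is_csv && c == '"')
            in_quote = !in_quote;
        else if ((c == '\n' || c == '\r') && !(is_csv && in_quote))
            return i;
    }
    return len;
}

/* Dispatch once on the runtime flag; each call site sees a constant. */
size_t
find_eol(const char *buf, size_t len, bool is_csv)
{
    assert(buf != NULL || len == 0);

    if (is_csv)
        return scan_line(buf, len, true);
    else
        return scan_line(buf, len, false);
}
```

A flag that is modified inside the helper cannot be folded this way, which
is the concern raised above about passing simd_enabled as a parameter.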


-- 
-- Manni Wood EDB: https://www.enterprisedb.com
