Hello,

On Wed, Dec 31, 2025 at 2:04 PM Nazir Bilal Yavuz <[email protected]> wrote:
> Hi,
>
> On Wed, 24 Dec 2025 at 18:08, KAZAR Ayoub <[email protected]> wrote:
> >
> > Hello,
> > Following the same path of optimizing COPY FROM using SIMD, I found that
> > COPY TO can also benefit from this.
> >
> > I attached a small patch that uses SIMD to skip data and advance until
> > the first special character is found, then falls back to scalar processing
> > for that character and re-enters the SIMD path again...
> > There are two ways to do this:
> > 1) Essentially we do SIMD until we find a special character, then
> > continue on the scalar path without re-entering SIMD again.
> > - This gives 10% to 30% speedups depending on the weight of special
> > characters in the attribute; we don't lose anything here since it advances
> > with SIMD until it can't (using the previous scripts: 1/3, 2/3 special
> > chars).
> >
> > 2) Do the SIMD path, then use the scalar path when we hit a special
> > character, and keep re-entering the SIMD path each time.
> > - This is equivalent to the COPY FROM story; we'll need to find the same
> > heuristic to use for both COPY FROM/TO to reduce the regressions (same
> > regressions: around 20% to 30% with 1/3, 2/3 special chars).
> >
> > Something else to note is that the scalar path for COPY TO isn't as
> > heavy as the state machine in COPY FROM.
> >
> > So if we find the sweet spot for the heuristic, doing the same for COPY
> > TO will be trivial and always beneficial.
> > Attached is 0004 which is option 1 (SIMD without re-entering), 0005 is
> > the second one.
>
> Patches look correct to me. I think we could move these SIMD code
> portions into a shared function to remove duplication, although that
> might have a performance impact. I have not benchmarked these patches
> yet.

Definitely yes.

> Another consideration is that these patches might need their own
> thread, though I am not completely sure about this yet.

I thought that since it uses the same infrastructure and the same ideas, and
it's an easier problem than COPY FROM, it might be interesting to keep and
commit them together.

Regards,
Ayoub Kazar
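
For readers following along, here is a rough standalone sketch of the two
options discussed above. It is not the code from 0004/0005. The names
skip_plain() and escape_text() are hypothetical, the "special" set (control
bytes below 0x20, backslash, delimiter) is an assumption, and the escaping is
simplified to a plain backslash prefix. It uses SSE2 when available and falls
back to scalar otherwise.

```c
#include <stddef.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Assumed "special" set: control bytes, backslash, and the delimiter. */
static int
is_special(unsigned char c, char delim)
{
    return c < 0x20 || c == '\\' || c == (unsigned char) delim;
}

/* Return a pointer to the first special byte in [p, end), or end if none. */
static const char *
skip_plain(const char *p, const char *end, char delim)
{
#if defined(__SSE2__)
    const __m128i vdelim = _mm_set1_epi8(delim);
    const __m128i vbslash = _mm_set1_epi8('\\');
    const __m128i vmaxctl = _mm_set1_epi8(0x1F);

    while (end - p >= 16)
    {
        __m128i v = _mm_loadu_si128((const __m128i *) p);
        /* unsigned v <= 0x1F  <=>  min(v, 0x1F) == v */
        __m128i ctl = _mm_cmpeq_epi8(_mm_min_epu8(v, vmaxctl), v);
        __m128i hit = _mm_or_si128(ctl,
                                   _mm_or_si128(_mm_cmpeq_epi8(v, vdelim),
                                                _mm_cmpeq_epi8(v, vbslash)));
        int mask = _mm_movemask_epi8(hit);

        if (mask != 0)
            return p + __builtin_ctz((unsigned) mask);
        p += 16;
    }
#endif
    /* Scalar tail (and the whole scan when SSE2 is unavailable). */
    while (p < end && !is_special((unsigned char) *p, delim))
        p++;
    return p;
}

/*
 * Copy in[0..len) to out, prefixing each special byte with a backslash.
 * reenter = 0: option 1, one vectorized skip, then scalar to the end.
 * reenter = 1: option 2, go back to the vectorized skip after each special.
 * out must have room for 2 * len bytes. Returns the number of bytes written.
 */
static size_t
escape_text(const char *in, size_t len, char delim, char *out, int reenter)
{
    const char *p = in;
    const char *end = in + len;
    char *o = out;
    const char *q = skip_plain(p, end, delim);

    memcpy(o, p, (size_t) (q - p));
    o += q - p;
    p = q;

    while (p < end)
    {
        if (is_special((unsigned char) *p, delim))
            *o++ = '\\';
        *o++ = *p++;

        if (reenter)
        {
            q = skip_plain(p, end, delim);
            memcpy(o, p, (size_t) (q - p));
            o += q - p;
            p = q;
        }
    }
    return (size_t) (o - out);
}
```

Both modes produce the same output; they differ only in how often the
vectorized skip is attempted, which is where the speedup/regression trade-off
mentioned above comes from.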
