On Sat, Jan 31, 2026 at 10:21 AM KAZAR Ayoub <[email protected]> wrote:
> Hello,
>
> On Wed, Jan 21, 2026 at 9:50 PM Neil Conway <[email protected]> wrote:
>
>> A few suggestions:
>>
>> * I'm curious if we'll see better performance on large inputs if we flush
>> to `line_buf` periodically (e.g., at least every few thousand bytes or so).
>> Otherwise we might see poor data cache behavior if large inputs with no
>> control characters get evicted before we've copied them over. See the
>> approach taken in escape_json_with_len() in utils/adt/json.c
>>
>
> So i gave this a try, attached is the small patch that has v3 + the
> suggestion added, here are the results with different threshold for
> line_buf refill:
>
> Execution time compared to master:
>
> Workload    v3       v3.1 (2k)  v3.1 (4k)  v3.1 (8k)  v3.1 (16k)  v3.1 (20k)  v3.1 (28k)
> text/none   -16.5%   -17.4%     -14.3%     -12.6%     -13.6%      -10.5%      -16.3%
> text/esc    +5.6%    +11.1%     +3.1%      +7.6%      +3.0%       +4.9%       +4.2%
> csv/none    -31.0%   -29.9%     -26.7%     -30.1%     -27.9%      -30.2%      -29.6%
> csv/quote   +0.2%    -0.6%      -0.4%      -1.0%      +0.1%       +2.5%       -1.0%
>
> L1d cache miss rates:
>
> Workload    Master   v3      v3.1 (2k)  v3.1 (4k)  v3.1 (8k)  v3.1 (16k)  v3.1 (20k)  v3.1 (28k)
> text/none   0.20%    0.23%   0.21%      0.22%      0.21%      0.21%       0.21%       0.22%
> text/esc    0.21%    0.22%   0.22%      0.22%      0.22%      0.21%       0.22%       0.22%
> csv/none    0.17%    0.22%   0.21%      0.22%      0.21%      0.21%       0.22%       0.22%
> csv/quote   0.18%    0.22%   0.19%      0.20%      0.20%      0.19%       0.20%       0.20%
>
> On my laptop I have 32KB L1 cache per core.
> Results are super close, it is hard to see in the cache misses numbers but
> execution times are saying other things, doing the periodic filling of
> line_buf seems good to do.
> If Manni can rerun the benchmarks on these too, it would be nice to
> confirm this.
>
>
> Regards,
> Ayoub
>

Hello, All!

Ayoub, I will try to benchmark v3.1 this week on my standalone x86 and arm
PCs. Sadly, other work has been taking priority these last couple weeks,
but I will carve out some time.

Neil, thanks so much for looking at this patch!

-Manni

-- 
Manni Wood
EDB: https://www.enterprisedb.com
