Benjamin-Philip commented on issue #5801:
URL: https://github.com/apache/couchdb/issues/5801#issuecomment-3656996512
I looked into this, and yes, `b64url` is faster on sizes above 100 bytes.

# Benchmark

I made some improvements in this [patch](https://github.com/user-attachments/files/24171160/0001-Remove-generation-overhead-from-b64url-benchmark.patch) (the changes are explained in the commit message). Applying the patch:

```sh
git checkout -b b64url-bench
git am ~/Downloads/0001-Remove-generation-overhead-from-b64url-benchmark.patch
```

and then benchmarking for different sizes:

```sh
cd src/b64url
alias bench="ERL_LIBS=_build/default/lib/b64url/ ./test/benchmark.escript"
for power in $(seq 1 3); do
    bench 1 $((10 ** power)) $((10 ** ($power + 1))) 60 100
done
```

I finally get:

```
Workers: 1, MinSize: 10, MaxSize: 100, Duration: 60, SampleSize: 100
erl : 4752923375 bytes / 60 seconds = 79215389.58 bps
nif : 3055668280 bytes / 60 seconds = 50927804.67 bps
    1.5554448125501372 times slower
Workers: 1, MinSize: 100, MaxSize: 1000, Duration: 60, SampleSize: 100
nif : 19462006451 bytes / 60 seconds = 324366774.18 bps
erl : 8901825341 bytes / 60 seconds = 148363755.68 bps
    2.186293901022968 times slower
Workers: 1, MinSize: 1000, MaxSize: 10000, Duration: 60, SampleSize: 100
nif : 29760976505 bytes / 60 seconds = 496016275.08 bps
erl : 11500940039 bytes / 60 seconds = 191682333.98 bps
    2.5876994753541642 times slower
```

As you can see, the Erlang version falls progressively further behind the NIF as the input size increases (its performance peaks at about 50 bytes), and the difference is significant (up to about 2.6x). However, if you work out the per-call time difference on a 10,000-byte input, you get a sub-millisecond value: 10,000 / 191,682,334 bps − 10,000 / 496,016,275 bps ≈ 0.032 ms.

Additionally, if you compare that last size range as the number of parallel workers increases, the gap shrinks to just 1.87x:

```sh
for power in $(seq 1 3); do
    bench $((10 ** power)) 1000 10000 60 100
done
```

```
Workers: 10, MinSize: 1000, MaxSize: 10000, Duration: 60, SampleSize: 100
nif : 114522454433 bytes / 60 seconds = 1908707573.88 bps
erl : 51406532861 bytes / 60 seconds = 856775547.68 bps
    2.2277801683817393 times slower
Workers: 100, MinSize: 1000, MaxSize: 10000, Duration: 60, SampleSize: 100
nif : 100150195411 bytes / 60 seconds = 1669169923.52 bps
erl : 46881336505 bytes / 60 seconds = 781355608.42 bps
    2.136248726618934 times slower
Workers: 1000, MinSize: 1000, MaxSize: 10000, Duration: 60, SampleSize: 100
nif : 83513632230 bytes / 60 seconds = 1391893870.50 bps
erl : 44569748335 bytes / 60 seconds = 742829138.92 bps
    1.873773924014238 times slower
```

For reference, this is my environment:

```
$ inxi -MSC
System:
  Host: rivendell Kernel: 6.17.11-200.fc42.x86_64 arch: x86_64 bits: 64
  Desktop: GNOME v: 48.7 Distro: Fedora Linux 42 (Workstation Edition)
Machine:
  Type: Laptop System: LENOVO product: 21C1S0SM00 v: ThinkPad L14 Gen 3
    serial: <superuser required>
  Mobo: LENOVO model: 21C1S0SM00 serial: <superuser required>
  UEFI: LENOVO v: R1XET54W (1.36 ) date: 07/01/2024
CPU:
  Info: 10-core (2-mt/8-st) model: 12th Gen Intel Core i7-1255U bits: 64
    type: MST AMCP cache: L2: 6.5 MiB
  Speed (MHz): avg: 4481 min/max: 400/4700:3500 cores: 1: 4481 2: 4481
    3: 4481 4: 4481 5: 4481 6: 4481 7: 4481 8: 4481 9: 4481 10: 4481
    11: 4481 12: 4481

$ erl -s erlang halt
Erlang/OTP 28 [erts-16.0.2] [source] [64-bit] [smp:12:12] [ds:12:12:10] [async-threads:1] [jit:ns]
```

## Conclusion

The point I'm trying to make is that, in my mind, a 0.032 ms overhead on the worst-case input is still competitive. In the real world, the difference might be even smaller.
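For concreteness, here is roughly what the pure-stdlib calls would look like. This is only a sketch: it assumes the OTP 26+ options map on `base64:encode/2` / `base64:decode/2`, where `#{mode => urlsafe, padding => false}` should yield the same unpadded URL-safe form that `b64url` produces.

```erlang
%% In the erl shell (OTP 26+ assumed); illustrative only.
Bin = <<"some binary">>,
Encoded = base64:encode(Bin, #{mode => urlsafe, padding => false}),
%% Should match the NIF's output: same unpadded URL-safe alphabet.
Encoded = b64url:encode(Bin),
Bin = base64:decode(Encoded, #{mode => urlsafe, padding => false}).
```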
Ultimately, it comes down to whether the sub-millisecond overhead is an acceptable tradeoff for eliminating an entire submodule (and better complying with standard Erlang practice). I'm not familiar with which components are affected by b64url's performance, and I don't know how performance sensitive CouchDB's users are. You might also conclude that removing b64url doesn't meaningfully reduce the maintenance workload, since b64url is so rarely updated.

# Moving Forward

Moving forward, I see 3 options:

## Option 1 - Completely replace b64url

We replace b64url entirely if we're not too performance sensitive and the tradeoff is worthwhile.

## Option 2 - Keep everything as is

We make no changes if we are performance sensitive, and revisit this later when stdlib performance improves.

## Option 3 - Replace b64url for data less than 100 bytes

If we're hyper performance sensitive, we handle binaries smaller than 100 bytes with `base64` to take advantage of the speedup on small inputs, and keep the NIF for everything else (see the sketch below).
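A minimal sketch of what Option 3 could look like. The `b64url_hybrid` module name and the exact 100-byte cutoff are illustrative assumptions (not a proposed API), and the stdlib calls assume OTP 26+:

```erlang
%% Sketch only: size-based dispatch between stdlib base64 and the NIF.
-module(b64url_hybrid).
-export([encode/1, decode/1]).

%% Below this size the pure-Erlang base64 was faster in the benchmark above.
-define(NIF_THRESHOLD, 100).

encode(Bin) when is_binary(Bin), byte_size(Bin) < ?NIF_THRESHOLD ->
    %% stdlib base64 (OTP 26+) can emit unpadded URL-safe output directly
    base64:encode(Bin, #{mode => urlsafe, padding => false});
encode(Bin) when is_binary(Bin) ->
    b64url:encode(Bin).

decode(Bin) when is_binary(Bin), byte_size(Bin) < ?NIF_THRESHOLD ->
    %% dispatching on the encoded size approximates the same cutoff
    base64:decode(Bin, #{mode => urlsafe, padding => false});
decode(Bin) when is_binary(Bin) ->
    b64url:decode(Bin).
```

Whether the extra branch is actually worth it over Option 1 would need its own benchmark run.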
