(couchdb-jiffy) 09/09: Version 2.0.0

vatamane Sat, 25 Apr 2026 09:22:53 -0700

This is an automated email from the ASF dual-hosted git repository.

nickva pushed a commit to tag 2.0.0
in repository https://gitbox.apache.org/repos/asf/couchdb-jiffy.git


commit 83a470a574c543812a11b6dd92dd2698eb00c615
Author: Nick Vatamaniuc <[email protected]>
AuthorDate: Sat Apr 25 12:07:30 2026 -0400

    Version 2.0.0
    
    Performance
    ===========
    
    - **SIMD vectorization** for ASCII scan-ahead loops in both decoder string parsing and encoder
      string emission. This meant replacing the byte-at-a-time scans with 16/32-byte chunked compares.
      The most interesting part is that it was done without writing a single line of assembly, just
      relying on compiler auto-vectorization. This showed a **15x** performance improvement on encoding
      large strings, as in the "Issue 90" benchmark. To get even better auto-vectorizer behavior, it's
      advisable to set `-march=native` or `-march=x86-64-v3`. That can make the auto-vectorizers in
      recent compilers switch to 256-bit AVX2 registers and instructions.
    
    - **UTF-8 skip-ahead in encoder** and faster UTF-8 validation. This is like the scan-ahead loop for
      ASCII, but for UTF-8 validation. It helps quite a bit on non-ASCII, Unicode-heavy inputs.
      The "UTF-8 unescaped" benchmark got a **5.8x** speedup from it.
    
    - Use **Ryu** for number encoding. This is the exact Ryu version from the latest Erlang/OTP release
      with all the updates and tweaks they added. This makes the float output the same as Erlang's.
      However, it also means the output is not exactly the same as before for Jiffy (we used to emit
      more fractional digits; now it switches to scientific notation a bit earlier). Number-heavy
      benchmarks like "Canada" showed a **2x** speedup.
    
    - **ffc.h** for number parsing in the decoder. This is the fastest C number parser around at this
      time. I worked with the upstream author to add a new API that parses a JSON number in a single
      call and returns either an integer or a double, as opposed to pre-parsing to figure out which
      is which first (https://github.com/kolemannix/ffc.h/pull/22). Using this library yielded a
      **4x** speedup in the number-heavy "Canada" benchmark on decoding.
    
    - **Faster array and map creation** for building the result term in fewer steps. (In the process,
      I discovered that maps with duplicates created from NIFs were subtly broken in Erlang:
      https://github.com/erlang/otp/pull/10976. The fix is now merged and should be in the recent
      Erlang patch releases.) This bulk creation improved decoding across the board. Some examples
      are **2.5x** for "JSON Generator", **2.6x** for "Github" and **3.3x** for "Blockchain". Most of
      those are mixed inputs, so number parsing and scan-ahead played a role there as well.
    
    - **Branch hints** (`JIFFY_LIKELY` / `JIFFY_UNLIKELY`) on encoder hot paths. I saw the QuickJS
      library doing this, so I experimented with it and saw a few percent speedup.
    
    - **Unity build**. Having handled a few issues over the years related to enabling, disabling and
      detecting LTO (link-time optimization) compiler features, I decided to side-step it and go with
      a unity build. This is where we include all the source files into one `jiffy.c` file and compile
      that. We get all the benefits of LTO but without having to juggle linker flags.
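    
    The scan-ahead and branch-hint items above can be sketched roughly as follows. This is a
    minimal illustration, not Jiffy's actual code: `scan_plain_ascii`, `needs_slow_path` and
    `CHUNK` are hypothetical names, and the macro bodies are an assumed definition (the commit
    names the macros but does not show them). The inner loop has a fixed trip count and no early
    exit, which is the shape auto-vectorizers tend to handle well; whether a given compiler
    actually vectorizes it depends on its version and flags.
    
    ```c
    #include <assert.h>
    #include <stddef.h>
    
    /* Assumed definitions of the branch-hint macros (GCC/Clang builtin). */
    #if defined(__GNUC__) || defined(__clang__)
    #define JIFFY_LIKELY(x)   __builtin_expect(!!(x), 1)
    #define JIFFY_UNLIKELY(x) __builtin_expect(!!(x), 0)
    #else
    #define JIFFY_LIKELY(x)   (x)
    #define JIFFY_UNLIKELY(x) (x)
    #endif
    
    #define CHUNK 16 /* hypothetical chunk size for the illustration */
    
    /* 1 if the byte needs escaping or non-ASCII handling, 0 if it can be copied as is. */
    static inline int needs_slow_path(unsigned char c)
    {
        return c < 0x20 || c == '"' || c == '\\' || c >= 0x80;
    }
    
    /* Length of the leading run of bytes that can be emitted without escaping. */
    static size_t scan_plain_ascii(const unsigned char *p, size_t len)
    {
        size_t i = 0;
    
        assert(p != NULL || len == 0);
    
        while (len - i >= CHUNK) {
            unsigned char bad = 0;
            /* Branch-free accumulation over a whole chunk. */
            for (size_t j = 0; j < CHUNK; j++) {
                unsigned char c = p[i + j];
                bad |= (unsigned char)((c < 0x20) | (c == '"') | (c == '\\') | (c >= 0x80));
            }
            if (JIFFY_UNLIKELY(bad)) {
                break; /* something in this chunk needs the slow path */
            }
            i += CHUNK;
        }
        /* Byte-at-a-time tail; also pinpoints the offending byte inside a chunk. */
        while (i < len && !needs_slow_path(p[i])) {
            i++;
        }
        return i;
    }
    ```
    
    Compiling this with optimization enabled and one of the `-march` settings mentioned above, then
    inspecting the generated assembly, is one way to check whether the chunk loop was vectorized.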
    
    Yielding & scheduler behavior
    =============================
    
    - **Reduction count bumped to 4000** to match current Erlang VM defaults.
    
    - **Bytes per reduction lowered** so cooperative yields fire more often on long input. This
      results in better latency under contention without a measurable throughput hit.
    
    Since Jiffy is a NIF, it's crucial for it to never block schedulers and always yield appropriately.
    As concurrency increases, it should degrade gracefully in proportion to the applied load. This is
    not a trivial task to accomplish in a NIF in general. Some JSON library NIFs use dirty schedulers;
    however, in the cases where Jiffy is used that wouldn't work, as dirty schedulers are still a
    limited resource and under high concurrency they would become a bottleneck.
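    
    The relationship between the reduction budget and bytes per reduction can be sketched as below.
    This is a standalone simulation, not Jiffy's code: `BYTES_PER_REDUCTION`, `work_state`,
    `work_slice` and `run_to_completion` are hypothetical names, the value 16 is made up (the real
    figure is not stated here), and the summing loop stands in for real encoding or decoding work.
    In an actual NIF the yield would return control to the Erlang scheduler (erl_nif offers
    `enif_consume_timeslice` and `enif_schedule_nif` for that purpose) rather than to a C loop.
    
    ```c
    #include <assert.h>
    #include <stddef.h>
    
    #define REDUCTION_BUDGET    4000 /* per-slice reduction count from the notes above */
    #define BYTES_PER_REDUCTION 16   /* assumed value, for illustration only */
    
    typedef struct {
        const unsigned char *data;
        size_t len;
        size_t pos;        /* where to resume after a yield */
        unsigned long sum; /* stand-in for real per-byte work */
    } work_state;
    
    /* Process at most one budget's worth of bytes. Returns 1 when done, 0 on yield. */
    static int work_slice(work_state *st)
    {
        size_t budget, remaining, end;
    
        assert(st != NULL);
    
        budget = (size_t)REDUCTION_BUDGET * BYTES_PER_REDUCTION;
        remaining = st->len - st->pos;
        end = st->pos + (remaining < budget ? remaining : budget);
    
        while (st->pos < end) {
            st->sum += st->data[st->pos++];
        }
        return st->pos == st->len;
    }
    
    /* Stand-in for the scheduler: keep resuming until finished; returns the slice count. */
    static size_t run_to_completion(work_state *st)
    {
        size_t slices = 1;
    
        while (!work_slice(st)) {
            slices++;
        }
        return slices;
    }
    ```
    
    Lowering `BYTES_PER_REDUCTION` shrinks each slice, so a long input is split into more yields
    and other processes get scheduled sooner, at the cost of more resume overhead.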
    
    A separate benchmark, `bench_scheduling.sh` in https://github.com/nickva/bench, runs concurrent
    JSON encoding and decoding scaled by the number of schedulers. Testing with a few Erlang JSON
    libraries shows something like this:
    
    ```
    ./bench_scheduling.sh
    ...
    scheduler responsiveness check
      input:       citm-catalog.json duration: 2000
      schedulers:  12 online
      impls:       json, jiffy, simdjsone, jsone, jsx
    
    [json]
      1x encdec                    n=84 p50=135.0ms p95=182.9ms p99=191.9ms max=196.7ms
      12x encdec                   n=86 p50=129.7ms p95=189.9ms p99=203.0ms max=206.2ms
      24x encdec                   n=87 p50=263.0ms p95=461.2ms p99=506.1ms max=527.1ms
    
    [jiffy]
      1x encdec                    n=309 p50=38.3ms p95=51.9ms p99=57.4ms max=66.5ms
      12x encdec                   n=300 p50=41.2ms p95=52.5ms p99=59.7ms max=66.2ms
      24x encdec                   n=306 p50=80.2ms p95=111.8ms p99=118.8ms max=140.1ms
    
    [simdjsone]
      1x encdec                    n=20 p50=690.1ms p95=784.6ms p99=784.6ms max=784.8ms
      12x encdec                   n=16 p50=790.9ms p95=887.5ms p99=887.5ms max=899.9ms
      24x encdec                   n=24 p50=1448.4ms p95=1876.7ms p99=1879.5ms max=1882.7ms
    
    [jsone]
      1x encdec                    n=60 p50=213.1ms p95=261.8ms p99=263.9ms max=264.8ms
      12x encdec                   n=60 p50=204.9ms p95=329.8ms p99=345.0ms max=350.9ms
      24x encdec                   n=52 p50=440.1ms p95=700.3ms p99=773.3ms max=817.3ms
    
    [jsx]
      1x encdec                    n=24 p50=398.8ms p95=539.0ms p99=544.1ms max=548.3ms
      12x encdec                   n=24 p50=391.5ms p95=684.9ms p99=687.0ms max=689.6ms
      24x encdec                   n=24 p50=1181.3ms p95=1479.0ms p99=1558.1ms max=1654.7ms
    ```
    
    Here we measure both the latency of sending a term back and forth between two encoder/decoder
    processes and the throughput (`n` is how many times we managed to do that).
    
    Features
    ========
    
    - **Pre-encoded JSON**: embed already-encoded JSON fragments directly in a value being encoded,
      saving a round-trip through the decoder. Use `{json, IoData}` terms and they will be embedded
      in the emitted stream as is. This was a surprisingly popular feature request over the years.
      Paul J. Davis (Jiffy's original author) suggested a nice and quick patch to make it work, so
      I went with that.
    
    - **Encode UTF-8 atoms**: atoms with non-ASCII bytes now encode as their UTF-8 source.
      Unfortunately, this is for OTP 26+ only.
    
    - **Number-as-key encoding**: integer/float map keys are encoded as string keys instead of
      erroring. Both Python and Erlang/OTP's built-in json already do this.
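    
    The idea behind the pre-encoded JSON feature can be illustrated with a toy encoder. This is a
    sketch under stated assumptions, not Jiffy's implementation: the `value` type, the `V_RAW_JSON`
    tag (standing in for a `{json, IoData}` term) and `encode_array` are all hypothetical. The point
    it shows is that a pre-encoded fragment is copied into the output byte for byte, with no parsing
    or validation, while other values go through normal formatting.
    
    ```c
    #include <assert.h>
    #include <stdio.h>
    #include <string.h>
    
    typedef enum { V_INT, V_RAW_JSON } kind;
    
    typedef struct {
        kind k;
        long long i;     /* used when k == V_INT */
        const char *raw; /* used when k == V_RAW_JSON: already-encoded JSON text */
    } value;
    
    /* Encode vals as a JSON array into out. Returns the length written, or -1 on overflow. */
    static int encode_array(const value *vals, size_t n, char *out, size_t cap)
    {
        size_t used = 0;
    
        assert(out != NULL);
        if (cap < 3) {
            return -1; /* need room for "[]" plus the terminating NUL */
        }
        out[used++] = '[';
        for (size_t i = 0; i < n; i++) {
            if (i > 0) {
                if (used + 1 >= cap) return -1;
                out[used++] = ',';
            }
            if (vals[i].k == V_RAW_JSON) {
                /* Pre-encoded fragment: copied verbatim, not re-parsed. */
                size_t rl = strlen(vals[i].raw);
                if (rl >= cap - used) return -1;
                memcpy(out + used, vals[i].raw, rl);
                used += rl;
            } else {
                int w = snprintf(out + used, cap - used, "%lld", vals[i].i);
                if (w < 0 || (size_t)w >= cap - used) return -1;
                used += (size_t)w;
            }
        }
        if (used + 1 >= cap) return -1;
        out[used++] = ']';
        out[used] = '\0';
        return (int)used;
    }
    ```
    
    Because the fragment is emitted unchanged, the caller is responsible for it being valid JSON.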
    
    Correctness & compliance
    ========================
    
    - **RFC 8259 100% compliance.** A new test suite based on `nst/JSONTestSuite` is wired in and all
      conformance tests pass.
    
    - **Big List of Naughty Strings (BLNS)** added to the test mix.
    
    Build & CI
    ==========
    
    - **OTP 21** is the new minimum.
    
    - **C coverage checks** added so the test suite reports per-file C line coverage; several
      uncovered paths were closed during this work.
---
 src/jiffy.app.src | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/jiffy.app.src b/src/jiffy.app.src
index 1908c8c..1dce860 100644
--- a/src/jiffy.app.src
+++ b/src/jiffy.app.src
@@ -1,9 +1,9 @@
 {application, jiffy, [
     {description, "JSON Decoder/Encoder."},
-    {vsn, "1.1.2"},
+    {vsn, "2.0.0"},
     {registered, []},
     {applications, [kernel, stdlib, xmerl]},
-    {maintainers, ["Paul J. Davis"]},
+    {maintainers, ["Paul J. Davis", "Nick Vatamaniuc"]},
     {licenses, ["MIT", "BSD"]},
     {links, [{"GitHub", "https://github.com/davisp/jiffy"}]},
     {files, [
