This is an automated email from the ASF dual-hosted git repository. nickva pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/couchdb-jiffy.git
commit 67571e1bbb04131d3f35441b683373476d94a6e7
Author: Nick Vatamaniuc <[email protected]>
AuthorDate: Wed Apr 1 16:58:29 2026 -0400

    Use quick lookahead parsing for numbers

    Same idea as we did for strings in [1]. For numbers we do it for any states
    which are looping (mantissa, frac and edigit). The wins are not as great since
    most numbers are not that long but it's plausible someone has some large
    dataset of mostly number arrays perhaps and it could help them.

    Benchmarking with jason's benchee setup with a new data file consisting from an
    array of largish numbers showed a 7% improvement compared to current master.

    [1] https://github.com/davisp/jiffy/commit/dbd7bcc2804ccf37ee936ff36fbe9fdeff8a3664
---
 c_src/decoder.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/c_src/decoder.c b/c_src/decoder.c
index 354c335..5021630 100644
--- a/c_src/decoder.c
+++ b/c_src/decoder.c
@@ -390,7 +390,10 @@ dec_number(Decoder* d, ERL_NIF_TERM* value)
     // to the compiler p won't alias any other pointers so it can optimize
     // access to it. Also avoid writing back do d->i on every increment,
     // instead increment a local variable (hopefully in a register) then update
-    // d->i once at the end.
+    // d->i once at the end. Also, when parsing looping states (mantissa, frac,
+    // edigit) scan-ahead quickly looking for strings of digits only. The wins
+    // will not be as big as we have for strings as most numbers are not that
+    // long, but it shouldn't hurt either.
     const unsigned char* JIFFY_RESTRICT p = d->p;
     const size_t len = d->len;
     const size_t start = d->i;
@@ -468,7 +471,9 @@ dec_number(Decoder* d, ERL_NIF_TERM* value)
                     case '7':
                     case '8':
                     case '9':
-                        idx++;
+                        while(idx < len && p[idx] >= '0' && p[idx] <= '9') {
+                            idx++;
+                        }
                         break;
                     default:
                         goto parse;
@@ -529,7 +534,9 @@ dec_number(Decoder* d, ERL_NIF_TERM* value)
                     case '7':
                     case '8':
                     case '9':
-                        idx++;
+                        while(idx < len && p[idx] >= '0' && p[idx] <= '9') {
+                            idx++;
+                        }
                         break;
                     default:
                         goto parse;
@@ -571,7 +578,9 @@ dec_number(Decoder* d, ERL_NIF_TERM* value)
                     case '7':
                     case '8':
                     case '9':
-                        idx++;
+                        while(idx < len && p[idx] >= '0' && p[idx] <= '9') {
+                            idx++;
+                        }
                         break;
                     default:
                         goto parse;
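
For readers who want to see the lookahead in isolation, the loop added in each of the three hunks can be sketched as a standalone function. The name skip_digits and its signature are hypothetical and do not exist in jiffy; the commit inlines the loop directly inside the looping states of dec_number. This is a minimal sketch of the idea, not the decoder's actual structure.

```c
#include <stddef.h>

/* Hypothetical helper, not part of jiffy: starting at idx, advance past a
 * contiguous run of ASCII digits in p[0..len) and return the index of the
 * first byte that is not a digit (or len if the buffer ends first). */
size_t
skip_digits(const unsigned char* p, size_t len, size_t idx)
{
    /* The bounds check comes first so p[idx] is never read past the end. */
    while(idx < len && p[idx] >= '0' && p[idx] <= '9') {
        idx++;
    }
    return idx;
}
```

In the diff the loop is only reached when p[idx] is already known to be a digit, so its first iteration consumes that digit just as the old single idx++ did, and the remaining iterations consume the rest of the run without going back through the outer state machine for each byte.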
