This is an automated email from the ASF dual-hosted git repository.

nickva pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/couchdb-jiffy.git

commit 67571e1bbb04131d3f35441b683373476d94a6e7
Author: Nick Vatamaniuc <[email protected]>
AuthorDate: Wed Apr 1 16:58:29 2026 -0400

    Use quick lookahead parsing for numbers
    
    Same idea as we used for strings in [1]. For numbers we apply it in every
    looping state (mantissa, frac and edigit). The wins are smaller since most
    numbers are not that long, but someone with a large dataset consisting
    mostly of number arrays could plausibly benefit.
    
    Benchmarking with jason's benchee setup and a new data file consisting of
    an array of largish numbers showed a 7% improvement compared to current
    master.
    
    [1] https://github.com/davisp/jiffy/commit/dbd7bcc2804ccf37ee936ff36fbe9fdeff8a3664
---
 c_src/decoder.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/c_src/decoder.c b/c_src/decoder.c
index 354c335..5021630 100644
--- a/c_src/decoder.c
+++ b/c_src/decoder.c
@@ -390,7 +390,10 @@ dec_number(Decoder* d, ERL_NIF_TERM* value)
     // to the compiler p won't alias any other pointers so it can optimize
     // access to it. Also avoid writing back do d->i on every increment,
     // instead increment a local variable (hopefully in a register) then update
-    // d->i once at the end.
+    // d->i once at the end. Also, when parsing looping states (mantissa, frac,
+    // edigit) scan-ahead quickly looking for strings of digits only. The wins
+    // will not be as big as we have for strings as most numbers are not that
+    // long, but it shouldn't hurt either.
     const unsigned char* JIFFY_RESTRICT p = d->p;
     const size_t len = d->len;
     const size_t start = d->i;
@@ -468,7 +471,9 @@ dec_number(Decoder* d, ERL_NIF_TERM* value)
                     case '7':
                     case '8':
                     case '9':
-                        idx++;
+                        while(idx < len && p[idx] >= '0' && p[idx] <= '9') {
+                            idx++;
+                        }
                         break;
                     default:
                         goto parse;
@@ -529,7 +534,9 @@ dec_number(Decoder* d, ERL_NIF_TERM* value)
                     case '7':
                     case '8':
                     case '9':
-                        idx++;
+                        while(idx < len && p[idx] >= '0' && p[idx] <= '9') {
+                            idx++;
+                        }
                         break;
                     default:
                         goto parse;
@@ -571,7 +578,9 @@ dec_number(Decoder* d, ERL_NIF_TERM* value)
                     case '7':
                     case '8':
                     case '9':
-                        idx++;
+                        while(idx < len && p[idx] >= '0' && p[idx] <= '9') {
+                            idx++;
+                        }
                         break;
                     default:
                         goto parse;
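
The scan-ahead loop added in each of the three hunks above can be read in
isolation as the minimal C sketch below. The helper name skip_digits is
hypothetical and does not exist in jiffy; the actual change inlines the loop
in the mantissa, frac and edigit cases of dec_number and uses the local idx
variable directly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper for illustration only, not part of jiffy.
 * Starting at idx, advance past a run of ASCII digits in p[0..len) and
 * return the index of the first non-digit byte, or len if the buffer
 * ends first. The idx < len check comes first so p[idx] is never read
 * out of bounds. */
static size_t
skip_digits(const unsigned char* p, size_t len, size_t idx)
{
    assert(p != NULL);
    while(idx < len && p[idx] >= '0' && p[idx] <= '9') {
        idx++;
    }
    return idx;
}
```

For example, on the 7-byte buffer "12345,6" with idx 0 the helper returns 5,
the position of the comma. In the decoder that is presumably the byte at which
the per-character state machine takes over again, so a long digit run costs
one tight loop instead of one switch dispatch per digit.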
