On Thu, Jan 15, 2026 at 06:32:14AM +0000, Diego A. Mundo via Chicken-hackers wrote:
> Hi all,
>
> I noticed number->string still only works with bases up to 16, while
> string->number works up to base 36. I have attached a trivial patch on top of
> master that appears to do the thing.

Hi Diego,

Thanks for sending in this patch! It reminded me of a long-standing
unfinished TODO: looking into how to deal with the character "i" inside
bases higher than 18. As you probably know, Scheme allows complex number
literals like "1+2i", but this means it's ambiguous in higher bases,
because the i could either be read as a digit in such a base, or as
syntax for indicating the end of the imaginary component.

> I'm not sure if this might have adverse effects elsewhere, but at least make
> check passes.

No, it's fine. I've attached a signed-off patch where I also update the
NEWS and the manual.

Cheers,
Peter

From 06cbb8a4ded121e23312a6f7ac86d8d5d4732f0c Mon Sep 17 00:00:00 2001
From: "Diego A. Mundo" <[email protected]>
Date: Thu, 15 Jan 2026 01:21:05 -0500
Subject: [PATCH 1/2] Make number->string work with bases up to 36

Signed-off-by: Peter Bex <[email protected]>
---
 NEWS                        |  2 ++
 manual/Module (scheme base) | 10 +++++-----
 runtime.c                   |  8 ++++----
 3 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/NEWS b/NEWS
index 698fe2d2..adb64e62 100644
--- a/NEWS
+++ b/NEWS
@@ -84,6 +84,8 @@
   - Added the (chicken version) module.
   - "delete-file*" and "delete-file" now behave consistently with
     broken symlinks.
+  - number->string now accepts bases up to 36, where before it only accepted
+    bases up to 16 (thanks to Diego A. Mundo)
 
 - Syntax expander:
   - `syntax-rules' attempts to better support tail patterns with ellipses
diff --git a/manual/Module (scheme base) b/manual/Module (scheme base)
index 7dae7d87..3d3cd302 100644
--- a/manual/Module (scheme base)
+++ b/manual/Module (scheme base)
@@ -2535,11 +2535,11 @@ reported.
 
 Radix must be an exact integer. The R7RS standard only requires
 implementations to support 2, 8, 10, or 16, but CHICKEN allows any
-radix between 2 and 36, inclusive (note: a bug in CHICKEN 5 currently
-limits the upper bound to 16). If omitted, radix defaults to
-10. The procedure number->string takes a number and a radix and
-returns as a string an external representation of the given number in
-the given radix such that
+radix between 2 and 36, inclusive (note: due to a bug, flonums with
+fractional components always use radix 10, irrespective of the argument).
+If omitted, radix defaults to 10. The procedure number->string takes
+a number and a radix and returns as a string an external
+representation of the given number in the given radix such that
 
  (let ((number number)
        (radix radix))
diff --git a/runtime.c b/runtime.c
index 084c7a4f..7f038935 100644
--- a/runtime.c
+++ b/runtime.c
@@ -11135,7 +11135,7 @@ static C_regparm double decode_flonum_literal(C_char *str)
 
 static char *to_n_nary(C_uword num, C_uword base, int negp, int as_flonum)
 {
-  static char *digits = "0123456789abcdef";
+  static char *digits = "0123456789abcdefghijklmnopqrstuvwxyz";
   char *p;
   C_uword shift = C_ilen(base) - 1;
   int mask = (1 << shift) - 1;
@@ -11203,7 +11203,7 @@ void C_ccall C_fixnum_to_string(C_word c, C_word *av)
     radix = ((c == 3) ? 10 : C_unfix(av[ 3 ])),
     neg = ((num & C_INT_SIGN_BIT) ? 1 : 0);
 
-  if (radix < 2 || radix > 16) {
+  if (radix < 2 || radix > 36) {
     barf(C_BAD_ARGUMENT_TYPE_BAD_BASE_ERROR, "number->string", C_fix(radix));
   }
 
@@ -11287,7 +11287,7 @@ void C_ccall C_integer_to_string(C_word c, C_word *av)
   int len, radix_shift;
   size_t nbits;
 
-  if ((radix < 2) || (radix > 16)) {
+  if ((radix < 2) || (radix > 36)) {
     barf(C_BAD_ARGUMENT_TYPE_BAD_BASE_ERROR, "number->string", C_fix(radix));
   }
 
@@ -11327,7 +11327,7 @@ void C_ccall C_integer_to_string(C_word c, C_word *av)
 
 static void bignum_to_str_2(C_word c, C_word *av)
 {
-  static char *characters = "0123456789abcdef";
+  static char *characters = "0123456789abcdefghijklmnopqrstuvwxyz";
   C_word
     self = av[ 0 ],
     string = av[ 1 ],
-- 
2.51.2

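For readers who want to see the effect of the longer digit table in isolation, here is a minimal standalone sketch in C. It is not CHICKEN's to_n_nary or C_fixnum_to_string; the name sketch_to_base and its signature are hypothetical, and it only handles values that fit in a long. It reuses the same 36-character table and the same 2..36 range check as the patch above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper for illustration only; not part of runtime.c.
 * Writes num in the given base (2..36) into out as a NUL-terminated
 * string. Returns 0 on success, -1 on a bad base or too-small buffer. */
static int sketch_to_base(long num, int base, char *out, size_t outlen)
{
    /* Same digit table the patch installs in runtime.c. */
    static const char digits[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    char tmp[72];               /* 64 binary digits plus sign fit easily */
    size_t n = 0, i = 0;
    unsigned long mag;

    assert(out != NULL);
    if (base < 2 || base > 36) return -1;

    /* Work on the magnitude so the most negative long is handled too. */
    mag = (num < 0) ? 0UL - (unsigned long)num : (unsigned long)num;

    /* Collect digits least-significant first. */
    do {
        tmp[n++] = digits[mag % (unsigned long)base];
        mag /= (unsigned long)base;
    } while (mag != 0);
    if (num < 0) tmp[n++] = '-';

    if (n + 1 > outlen) return -1;

    /* Reverse into the output buffer. */
    while (n > 0) out[i++] = tmp[--n];
    out[i] = '\0';
    return 0;
}
```

With this sketch, sketch_to_base(-18, 19, buf, sizeof buf) produces "-i" and sketch_to_base(37, 19, buf, sizeof buf) produces "1i", which are exactly the strings that become ambiguous with complex-number syntax once bases above 18 are printable.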
From dc7e81a5a49ff6a4226b390546dc9c09ea1bb0ff Mon Sep 17 00:00:00 2001
From: Peter Bex <[email protected]>
Date: Fri, 16 Jan 2026 13:34:49 +0100
Subject: [PATCH 2/2] Improve string->number handling of "i" in bases higher than 18

Before, it would parse "-i" as a complex number with real value zero
and imaginary value minus one. Now, that string is parsed as the
negative integer eighteen, for consistency with number->string, which
now supports such higher bases as well.

If one needs a complex number, it can be written in explicit long
form, like "0-1i". This is consistent with how number->string has
always emitted complex numbers.

We handle this by adding a context variable to the "scan-digits"
procedure which scans the integer component in a number, which is used
to find the delimitation of the string that gets passed into
C_s_a_i_digits_to_integer in scan-digits+hashes. This context
variable tells it whether we're inside the imaginary part of a
rectangular complex number literal and causes it to back up one
character if the last character happens to be an "i".

Because Scheme number syntax is surprisingly complicated, we have to
also handle the case where the number is the numerator of a fractional
number - in that case, if "i" is at the end, we do *not* want to back
up, because the complex "i" can only come at the end of the
denominator.
---
 NEWS                                      |  2 +
 library.scm                               | 63 ++++++++++++++---------
 manual/Module (scheme base)               |  8 +++
 tests/numbers-string-conversion-tests.scm | 38 ++++++++++++++
 4 files changed, 88 insertions(+), 23 deletions(-)

diff --git a/NEWS b/NEWS
index adb64e62..56904cb3 100644
--- a/NEWS
+++ b/NEWS
@@ -86,6 +86,8 @@
     broken symlinks.
   - number->string now accepts bases up to 36, where before it only accepted
     bases up to 16 (thanks to Diego A. Mundo)
+  - string->number now handles ambiguous cases involving the character "i"
+    in bases higher than 18 more consistently.
- Syntax expander: - `syntax-rules' attempts to better support tail patterns with ellipses diff --git a/library.scm b/library.scm index 9888facd..ff407b9e 100644 --- a/library.scm +++ b/library.scm @@ -2965,8 +2965,6 @@ EOF (string-append (number->string (%ratnum-numerator n) base) "/" (number->string (%ratnum-denominator n) base))) - ;; What about bases that include an "i"? That could lead to - ;; ambiguous results. ((cplxnum? n) (let ((r (%cplxnum-real n)) (i (%cplxnum-imag n)) ) (string-append @@ -3047,20 +3045,36 @@ EOF ;; position. If the cdr is false, that's the end of the string. ;; If just #f is returned, the string contains invalid number syntax. (scan-digits - (lambda (start) - (let lp ((i start)) + (lambda (start cplx?) + (let lp ((i start) + ;; Drop is true when the last read character is + ;; an "i" while reading the second part of a + ;; rectangular complex number literal *and* the + ;; radix is 19 or above. In that case, we back + ;; up one character to ensure we don't consume + ;; the trailing "i", which we otherwise would. + (drop? #f)) (if (fx= i len) - (and (fx> i start) (cons i #f)) + (and (fx> i start) + (if drop? + (cons (sub1 i) (sub1 i)) + (cons i #f))) (let ((c (string-ref str i))) (if (fx<= radix 10) (if (and (char>=? c #\0) (char<=? c 0..r)) - (lp (fx+ i 1)) + (lp (fx+ i 1) #f) (and (fx> i start) (cons i i))) (if (or (and (char>=? c #\0) (char<=? c #\9)) (and (char>=? c #\a) (char<=? c a..r)) (and (char>=? c #\A) (char<=? c A..r))) - (lp (fx+ i 1)) - (and (fx> i start) (cons i i))))))))) + (lp (fx+ i 1) + (and cplx? (fx>= radix 19) + (or (char=? c #\i) + (char=? c #\I)))) + (and (fx> i start) + (if (and drop? (not (char=? c #\/))) ;; Fractional numbers are an exception - the i may only come after the slash + (cons (sub1 i) (sub1 i)) + (cons i i)))))))))) (scan-hashes (lambda (start) (let lp ((i start)) @@ -3071,8 +3085,8 @@ EOF (lp (fx+ i 1)) (and (fx> i start) (cons i i)))))))) (scan-digits+hashes - (lambda (start neg? all-hashes-ok?) 
- (let* ((digits (and (not seen-hashes?) (scan-digits start))) + (lambda (start neg? cplx? all-hashes-ok?) + (let* ((digits (and (not seen-hashes?) (scan-digits start cplx?))) (hashes (if digits (and (cdr digits) (scan-hashes (cdr digits))) (and all-hashes-ok? (scan-hashes start)))) @@ -3091,7 +3105,7 @@ EOF (let ((sign (case (string-ref str start) ((#\+) 'pos) ((#\-) 'neg) (else #f)))) (and-let* ((start (if sign (fx+ start 1) start)) - (end (scan-digits start))) + (end (scan-digits start #f))) (cons (##core#inline_allocate ("C_s_a_i_digits_to_integer" 6) str start (car end) radix (eq? sign 'neg)) @@ -3099,7 +3113,7 @@ EOF (scan-decimal-tail ; The part after the decimal dot (lambda (start neg? decimal-head) (and (fx< start len) - (let* ((tail (scan-digits+hashes start neg? decimal-head)) + (let* ((tail (scan-digits+hashes start neg? #f decimal-head)) (next (if tail (cdr tail) start))) (and (or decimal-head (not next) (fx> next start)) ; Don't allow empty "." @@ -3121,13 +3135,13 @@ EOF (h (or decimal-head 0))) (cons (if t (+ h t) h) next))))))))) (scan-ureal - (lambda (start neg?) + (lambda (start neg? cplx?) (if (and (fx> len (fx+ start 1)) (eq? radix 10) (eq? (string-ref str start) #\.)) (begin (go-inexact! neg?) (scan-decimal-tail (fx+ start 1) neg? #f)) - (and-let* ((end (scan-digits+hashes start neg? #f))) + (and-let* ((end (scan-digits+hashes start neg? cplx? #f))) (case (and (cdr end) (string-ref str (cdr end))) ((#\.) (go-inexact! neg?) @@ -3147,7 +3161,7 @@ EOF ((#\/) (set! seen-hashes? #f) ; Reset flag for denominator (and-let* (((fx> len (cdr end))) - (d (scan-digits+hashes (fx+ (cdr end) 1) #f #f)) + (d (scan-digits+hashes (fx+ (cdr end) 1) #f cplx? #f)) (num (car end)) (denom (car d))) (if (not (eq? denom 0)) @@ -3161,7 +3175,7 @@ EOF ((+1) (cons +inf.0 (cdr d)))))))) (else end)))))) (scan-real - (lambda (start) + (lambda (start cplx?) 
(and (fx< start len) (let* ((sign (case (string-ref str start) ((#\+) 'pos) ((#\-) 'neg) (else #f))) @@ -3171,7 +3185,10 @@ EOF ((#\i #\I) (or (and sign (cond - ((fx= (fx+ next 1) len) ; [+-]i + ((and (fx= (fx+ next 1) len) ; [+-]i + ;; Reject bare "+i" in higher radixes where this would be ambiguous + (or cplx? + (fx< radix 19))) (cons (if (eq? sign 'neg) -1 1) next)) ((and (fx<= (fx+ next 5) len) (string-ci=? (substring str next (fx+ next 5)) "inf.0")) @@ -3180,7 +3197,7 @@ EOF (and (fx< (fx+ next 5) len) (fx+ next 5)))) (else #f))) - (scan-ureal next (eq? sign 'neg)))) + (scan-ureal next (eq? sign 'neg) cplx?))) ((#\n #\N) (or (and sign (fx<= (fx+ next 5) len) @@ -3189,9 +3206,9 @@ EOF (cons (make-nan) (and (fx< (fx+ next 5) len) (fx+ next 5))))) - (scan-ureal next (eq? sign 'neg)))) - (else (scan-ureal next (eq? sign 'neg))))))))) - (number (and-let* ((r1 (scan-real offset))) + (scan-ureal next (eq? sign 'neg) cplx?))) + (else (scan-ureal next (eq? sign 'neg) cplx?)))))))) + (number (and-let* ((r1 (scan-real offset #f))) (case (and (cdr r1) (string-ref str (cdr r1))) ((#f) (car r1)) ((#\i #\I) (and (fx= len (fx+ (cdr r1) 1)) @@ -3200,7 +3217,7 @@ EOF (make-rectangular 0 (car r1)))) ((#\+ #\-) (set! seen-hashes? #f) ; Reset flag for imaginary part - (and-let* ((r2 (scan-real (cdr r1))) + (and-let* ((r2 (scan-real (cdr r1) #t)) ((cdr r2)) ((fx= len (fx+ (cdr r2) 1))) ((or (eq? (string-ref str (cdr r2)) #\i) @@ -3208,7 +3225,7 @@ EOF (make-rectangular (car r1) (car r2)))) ((#\@) (set! seen-hashes? #f) ; Reset flag for angle - (and-let* ((r2 (scan-real (fx+ (cdr r1) 1))) + (and-let* ((r2 (scan-real (fx+ (cdr r1) 1) #f)) ((not (cdr r2)))) (make-polar (car r1) (car r2)))) (else #f))))) diff --git a/manual/Module (scheme base) b/manual/Module (scheme base) index 3d3cd302..a71f5620 100644 --- a/manual/Module (scheme base) +++ b/manual/Module (scheme base) @@ -2586,6 +2586,14 @@ string (e.g. "#o177"). If radix is not supplied, then the default radix is 10. 
If string is not a syntactically valid notation for a number, then string->number returns #f. +If the radix is higher than 18, the parser treats ambiguous syntax +that might be a complex number, like {{{"+i"}}} and {{{"-i"}}} (and +any prefixes like {{{"+1234i"}}}), as an integer. If you want this to +be parsed as a complex number, explicitly write down {{{"0+i"}}} to +disambiguate. Note that {{{number->string}}} will always emit complex +numbers using the full notation, so it can always be read back by +{{{string->number}}}. + (string->number "100") ===> 100 (string->number "100" 16) ===> 256 (string->number "1e2") ===> 100.0 diff --git a/tests/numbers-string-conversion-tests.scm b/tests/numbers-string-conversion-tests.scm index 577ceb56..ccf82e6a 100644 --- a/tests/numbers-string-conversion-tests.scm +++ b/tests/numbers-string-conversion-tests.scm @@ -517,3 +517,41 @@ (assert (eqv? 0.0 (string->number "0.0"))) (assert (eqv? -0.0 (string->number "-0e1"))) (assert (eqv? 0.0 (string->number "0e-1"))) + +;; Nonambiguous cases involving reading of complex numbers in lower bases +(assert (eqv? (string->number "-i" 18) (make-rectangular 0 -1))) +(assert (eqv? (string->number "-1i" 18) (make-rectangular 0 -1))) +(assert (eqv? (string->number "0-1i" 18) (make-rectangular 0 -1))) +(assert (eqv? (string->number "i" 18) #f)) +(assert (eqv? (string->number "+i" 18) (make-rectangular 0 1))) +(assert (eqv? (string->number "+1i" 18) (make-rectangular 0 1))) +(assert (eqv? (string->number "0+i" 18) (make-rectangular 0 1))) +(assert (eqv? (string->number "0+1i" 18) (make-rectangular 0 1))) +(assert (eqv? (string->number "0+1/1i" 18) (make-rectangular 0 1))) +(assert (eqv? (string->number "0+i/1i" 18) #f)) +(assert (eqv? (string->number "+i/1i" 18) #f)) + +;; Ambiguous cases involving reading of complex numbers in higher bases and their disambiguated versions +(assert (eqv? (string->number "-i" 19) -18)) +(assert (eqv? (string->number "-1i" 19) -37)) +(assert (eqv? 
(string->number "0-1i" 19) (make-rectangular 0 -1))) +(assert (eqv? (string->number "i" 19) 18)) +(assert (eqv? (string->number "+i" 19) 18)) +(assert (eqv? (string->number "+1i" 19) 37)) +(assert (eqv? (string->number "0+i" 19) (make-rectangular 0 1))) +(assert (eqv? (string->number "0+1i" 19) (make-rectangular 0 1))) +(assert (eqv? (string->number "0+1/1i" 19) (make-rectangular 0 1))) +(assert (eqv? (string->number "0+i/1i" 19) (make-rectangular 0 18))) +(assert (eqv? (string->number "+i/1i" 19) 18/37)) + +;; Nonambiguous cases (polar notation requires no trailing "i") +;; This makes sure the i is correctly consumed by the integer parser in higher bases +;; and the number is invalid in lower bases. +(assert (eqv? (string->number "+1@i" 18) #f)) +(assert (eqv? (string->number "+1@1i" 18) #f)) +(assert (eqv? (string->number "+1@1/i" 18) #f)) +(assert (eqv? (string->number "+1@1/1i" 18) #f)) +(assert (eqv? (string->number "+1@i" 19) (make-polar 1 18))) +(assert (eqv? (string->number "+1@1i" 19) (make-polar 1 37))) +(assert (eqv? (string->number "+1@1/i" 19) (make-polar 1 1/18))) +(assert (eqv? (string->number "+1@1/1i" 19) (make-polar 1 1/37))) -- 2.51.2
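
The back-up rule described in the commit message of patch 2 can be sketched outside of library.scm as follows. This is a rough C transliteration under stated assumptions, not the code CHICKEN runs: the names sketch_digit_value and sketch_scan_digits are hypothetical, the cplx flag plays the role of the new "cplx?" argument, and the character arithmetic assumes an ASCII-compatible character set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: value of c as a digit in the given radix, or -1
 * if c is not a digit in that radix. Assumes ASCII-like letter order. */
static int sketch_digit_value(char c, int radix)
{
    int v;
    if (c >= '0' && c <= '9') v = c - '0';
    else if (c >= 'a' && c <= 'z') v = c - 'a' + 10;
    else if (c >= 'A' && c <= 'Z') v = c - 'A' + 10;
    else return -1;
    return (v < radix) ? v : -1;
}

/* Hypothetical scanner: returns the index one past the last digit that
 * belongs to the integer starting at s[start]. When cplx is nonzero
 * (we are in the imaginary part of a rectangular complex literal) and
 * the radix is 19 or above, a final "i" is handed back to the caller,
 * unless a "/" follows, because the complex "i" can only come after
 * the denominator. A return value equal to start means no digits. */
static size_t sketch_scan_digits(const char *s, size_t len, size_t start,
                                 int radix, int cplx)
{
    size_t i = start;
    int drop = 0;   /* was the last consumed digit a droppable "i"? */

    assert(s != NULL && radix >= 2 && radix <= 36);
    while (i < len && sketch_digit_value(s[i], radix) >= 0) {
        drop = cplx && radix >= 19 && (s[i] == 'i' || s[i] == 'I');
        i++;
    }
    if (drop && !(i < len && s[i] == '/')) i--;
    return i;
}
```

For the imaginary part of "0+i/1i" in radix 19, scanning the numerator from index 2 stops at the slash without backing up (the "i" is the digit 18), while scanning the denominator from index 4 backs up one character and leaves the final "i" as the complex marker. That matches the test above which expects (make-rectangular 0 18).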
