Hi Mark

Final update - first, we've reused your efficient substring-replace function in https://github.com/Gnucash/gnucash/commit/7d15e6e4e727c87fb4a501e924c4ae02276e508d from a few years ago.

Second, the email thread https://lists.gnu.org/archive/html/guile-devel/2014-03/msg00060.html confirmed that a lot of guile-2.0 issues on Windows could be solved by upgrading to guile-2.2. So, GnuCash has now upgraded to guile-2.2 on Windows and the string-ports are now behaving.

Thank you (twice) :)
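For anyone finding this thread later, here is a minimal sketch of the kind of check that shows whether string ports are behaving. It is only an illustration, not code taken from GnuCash; the names `lira` and `round-tripped` are made up for the example.

```scheme
;; Sketch only: round-trip a non-ASCII character through a string port.
;; `lira' and `round-tripped' are hypothetical names for this example.
(define lira (string #\x20ba))          ; U+20BA TURKISH LIRA SIGN

(define round-tripped
  (call-with-output-string
    (lambda (port)
      (display lira port))))

;; Expected to be #t on Guile 2.2 or later, where string ports use
;; UTF-8 internally.  On an unpatched Guile 2.0 the result depends on
;; the locale encoding.
(display (string=? lira round-tripped))
(newline)
```

If this prints #f, the character was munged on its way through the port.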
On Fri, 19 Apr 2019 at 10:26, Christopher Lam <christopher....@gmail.com> wrote:

> Hi,
> The patch *does* work and handles unicode properly :) There are unintended
> consequences however, whereby other (probably C-based) string-code in
> Windows are now reading the lira-symbol into unexpected chars (eg
> lira-symbol -> "â‚°" i.e. #xe2 #x201a #xba) but this is now outside the
> scope of this post.
> Thank you again!
>
> On Thu, 18 Apr 2019 at 21:20, Mark H Weaver <m...@netris.org> wrote:
>
>> Hi again,
>>
>> Earlier, I wrote:
>>
>> > Christopher Lam <christopher....@gmail.com> writes:
>> >
>> >> Hi Mark
>> >> Thank you so much for looking into this.
>> >> I'm reviewing the GnuCash for Windows package (v3.5 released April 2019)
>> >> which contains the following libraries:
>> >> - guile 2.0.14
>> >
>> > Ah, for some reason I thought you were using Guile 2.2. That explains
>> > the problem.
>> >
>> > In Guile 2.0, string ports internally used the locale encoding by
>> > default, which meant that any characters not supported by the locale
>> > encoding would be munged.
>> >
>> > Guile 2.2 changed the behavior of string ports to always use UTF-8
>> > internally, which ensures that all valid Guile strings can pass through
>> > unmunged.
>> >
>> > So, this problem would almost certainly be fixed by updating to
>> > Guile 2.2.
>>
>> It's probably a good idea to update to Guile 2.2 anyway, but I'd like to
>> also offer the following workaround, which monkey patches the string
>> port procedures in Guile 2.0 to behave more like Guile 2.2.
>>
>> Note that it only patches the Scheme APIs for string ports, and not the
>> underlying C functions. It might be that some code, possibly within
>> Guile itself, creates a string port using the C functions, and such
>> string ports may still munge characters.
>>
>> Anyway, if you want to try it, arrange for GnuCash to evaluate the code
>> below, after initializing Guile.
>>
>> Mark
>>
>>
>> (when (string=? (effective-version) "2.0")
>>   ;; When using Guile 2.0.x, use monkey patching to change the
>>   ;; behavior of string ports to use UTF-8 as the internal encoding.
>>   ;; Note that this is the default behavior in Guile 2.2 or later.
>>   (let* ((mod                     (resolve-module '(guile)))
>>          (orig-open-input-string  (module-ref mod 'open-input-string))
>>          (orig-open-output-string (module-ref mod 'open-output-string))
>>          (orig-object->string     (module-ref mod 'object->string))
>>          (orig-simple-format      (module-ref mod 'simple-format)))
>>
>>     (define (open-input-string str)
>>       (with-fluids ((%default-port-encoding "UTF-8"))
>>         (orig-open-input-string str)))
>>
>>     (define (open-output-string)
>>       (with-fluids ((%default-port-encoding "UTF-8"))
>>         (orig-open-output-string)))
>>
>>     (define (object->string . args)
>>       (with-fluids ((%default-port-encoding "UTF-8"))
>>         (apply orig-object->string args)))
>>
>>     (define (simple-format . args)
>>       (with-fluids ((%default-port-encoding "UTF-8"))
>>         (apply orig-simple-format args)))
>>
>>     (define (call-with-input-string str proc)
>>       (proc (open-input-string str)))
>>
>>     (define (call-with-output-string proc)
>>       (let ((port (open-output-string)))
>>         (proc port)
>>         (get-output-string port)))
>>
>>     (module-set! mod 'open-input-string       open-input-string)
>>     (module-set! mod 'open-output-string      open-output-string)
>>     (module-set! mod 'object->string          object->string)
>>     (module-set! mod 'simple-format           simple-format)
>>     (module-set! mod 'call-with-input-string  call-with-input-string)
>>     (module-set! mod 'call-with-output-string call-with-output-string)
>>
>>     (when (eqv? (module-ref mod 'format) orig-simple-format)
>>       (module-set! mod 'format simple-format))))
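
The lira-symbol garbling mentioned in the 19 Apr message (#xe2 #x201a #xba) looks like UTF-8 bytes being decoded with a legacy Windows code page. The sketch below illustrates that reading; CP1252 is an assumption (the thread does not name the code page), and it relies on the system iconv knowing that name. `lira`, `utf8-bytes`, `misread` and `hex` are hypothetical names for this example.

```scheme
;; Sketch only: decode the UTF-8 bytes of the lira sign as CP1252.
;; CP1252 is an assumption; the thread does not say which code page
;; the C-side code actually used.
(use-modules (rnrs bytevectors)   ; string->utf8, bytevector->u8-list
             (ice-9 iconv))       ; bytevector->string

(define lira (string #\x20ba))                    ; U+20BA
(define utf8-bytes (string->utf8 lira))           ; its UTF-8 encoding
(define misread (bytevector->string utf8-bytes "CP1252"))

(define (hex n) (number->string n 16))

;; Bytes of the UTF-8 encoding, in hex.
(display (map hex (bytevector->u8-list utf8-bytes)))
(newline)

;; Code points obtained when those bytes are read as CP1252, in hex.
(display (map (lambda (c) (hex (char->integer c)))
              (string->list misread)))
(newline)
```

Under that assumption the second list should match the three code points quoted above, which would point at a decoding mismatch outside Guile's string ports rather than at the patch itself.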