Hi James.
> 1. "National" support. COBOL programs define the runtime encoding and
> collation of each string (sometimes implicitly). COBOL defines two
> encodings: "alphanumeric" and "national". Every alphanumeric (and
> national) variable and literal has a defined runtime encoding that is
> distinct from the compile-time and runtime locale, and from the
> encoding of the source code. This means
>
> MOVE 'foo' TO FOO.
>
> may involve iconv(3) and
>
> IF 'foo' = FOO
>
> is defined as true/false depending on the *characters* represented, not
> their encoding. That 'foo' could be CP1140 (single-byte EBCDIC) and
> FOO could be UTF-16.
>
> Cauldron Alert: Who was I talking to? Unicode equality and inequality
> require library support. Some wise person in Porto agreed that 1) GCC
> was unlikely to want to add ICU as a dependency and 2) a limited amount
> of Unicode evaluation is available in (IIRC) gnulib, which might be
> adapted to libiberty, and might serve the purpose. Problem is, I don't
> remember and didn't take notes. If that conversation sounds familiar,
> please ping me.
>
> Conversion is a solved problem. Comparison is not. And, no, afaik the
> C and C++ libraries are insufficient. Anything that relies on the
> environment is begging for trouble.

For the Algol 68 Unicode needs I am using a small set of functions that
I hand-adapted from the libunistring gnulib module.

The functions needed at compile-time are in:

  https://forge.sourceware.org/gcc/gcc-a68/src/branch/a68/gcc/algol68/a68-unistr.c

The functions needed at run-time are in:

  https://forge.sourceware.org/gcc/gcc-a68/src/branch/a68/libga68/ga68-unistr.c

If libunistring fulfills your comparison needs (see manual at [1]) then
perhaps you could use a similar strategy. It would be good to avoid
duplicating that code though.

(Note that the reason I am using a customized version is that the
elements in my strings may be stored with a stride > 1, so I modified
the libunistring functions to account for that. Not sure if Bruno would
welcome the addition of an interface where you can specify strides, but
if so, I will gladly submit a patch to libunistring.)

[1] https://www.gnu.org/software/libunistring/manual/libunistring.html
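
To give an idea of what the stride remark above means in practice, here
is a minimal sketch in C. It is not the code from a68-unistr.c or
ga68-unistr.c: the name strided_u32_cmp and its parameters are made up
for this illustration. It compares UTF-32 code points in order, much as
libunistring's u32_cmp does for contiguous strings, and performs no
normalization or collation, so it only covers the simplest notion of
equality.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libunistring or the Algol 68 sources.
   Compare the first N code points of two UTF-32 strings whose successive
   elements are STRIDE1 and STRIDE2 uint32_t units apart, respectively.
   Returns a negative value, zero, or a positive value, like memcmp.  */
static int
strided_u32_cmp (const uint32_t *s1, size_t stride1,
                 const uint32_t *s2, size_t stride2, size_t n)
{
  /* A stride of zero would read the same element over and over.  */
  assert (stride1 > 0 && stride2 > 0);

  for (size_t i = 0; i < n; i++)
    {
      uint32_t c1 = s1[i * stride1];
      uint32_t c2 = s2[i * stride2];
      if (c1 != c2)
        return c1 < c2 ? -1 : 1;
    }
  return 0;
}
```

With a stride of 1 on both sides this reduces to a plain
element-by-element comparison. A caller whose characters sit in every
second slot of a buffer would pass a stride of 2 for that operand.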
