On Tue, 2016-07-19 at 16:10 -0700, David Edelsohn wrote:
> Hi, David
>
> I don't believe that hardware easily is available.  We probably could
> arrange for access, if it is necessary, but it is not accessible
> through the IBM Community Development system for Linux on z Systems
> because this isn't Linux-based.  GCC on the system is not
> self-hosting -- I believe that GCC only is used as a cross-compiler.
>
> Thanks, David

I did some more digging, and it looks like hardware isn't necessary: I
found PR 18785 ("[4.0 Regression] isdigit builtin function fails with
EBCDIC character sets"), which led me to these options in the C family
of frontends:

  fexec-charset=
  fwide-exec-charset=

These get used for cpp_opts->narrow_charset and cpp_opts->wide_charset
respectively in libcpp; they ultimately get passed to iconv (if they
don't match any of the priority special-cases in libcpp/charset.c).

It looks like -fexec-charset=IBM1047 is the correct command-line option
for enabling EBCDIC (or rather, one of the various EBCDIC encodings),
and I was able to use this from my x86_64 host to generate .s files
with EBCDIC for the embedded strings.

I wasn't able to find an iconv code for UTF-EBCDIC:

  gcc -S ../../src/test.c -fexec-charset=UTF-EBCDIC
  cc1: error: conversion from UTF-8 to UTF-EBCDIC not supported by iconv

but "interesting" values like -fexec-charset=UTF-16 appear to satisfy
my requirement for a way to stress-test the string-literal
location-handling code.

Thanks!

> On Tue, Jul 19, 2016 at 3:39 PM, David Malcolm <dmalc...@redhat.com>
> wrote:
> > On Tue, 2016-07-19 at 12:24 -0400, David Edelsohn wrote:
> > > On Tue, Jul 19, 2016 at 12:05 PM, David Malcolm
> > > <dmalc...@redhat.com> wrote:
> > > > libcpp/charset.c has a helpful introductory comment describing
> > > > character sets, including the source and execution character
> > > > sets.
> > > >
> > > > libcpp appears to attempt to support both UTF-8 and UTF-EBCDIC
> > > > for the source character set, via:
> > > >
> > > > #if HOST_CHARSET == HOST_CHARSET_ASCII
> > > > #define SOURCE_CHARSET "UTF-8"
> > > > #define LAST_POSSIBLY_BASIC_SOURCE_CHAR 0x7e
> > > > #elif HOST_CHARSET == HOST_CHARSET_EBCDIC
> > > > #define SOURCE_CHARSET "UTF-EBCDIC"
> > > > #define LAST_POSSIBLY_BASIC_SOURCE_CHAR 0xFF
> > > > #else
> > > > #error "Unrecognized basic host character set"
> > > > #endif
> > > >
> > > > though libiberty's safe-ctype.c has:
> > > >
> > > > # if HOST_CHARSET == HOST_CHARSET_EBCDIC
> > > > #error "FIXME: write tables for EBCDIC"
> > > >
> > > > so presumably we only effectively support UTF-8 as the source
> > > > char set.
> > > >
> > > > Do we support any hosts for which the source character set is
> > > > *not* UTF-8?
> > > >
> > > > Similarly, do we support any targets for which the execution
> > > > character set is *not* UTF-8?
> > > >
> > > > This relates to the locations-within-string-literals patch I
> > > > posted here:
> > > > https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00441.html
> > > > ("[PATCH] RFC: On-demand locations within string-literals");
> > > > that patch currently has an assumption that the source
> > > > encoding == execution encoding, and I'd appreciate knowing a
> > > > configuration for which this isn't the case so I can test
> > > > accordingly.
> > >
> > > I believe that the GCC z/TPF configuration uses EBCDIC.  There
> > > also is the on-again off-again i370 port.
> > >
> > > Thanks, David
> >
> > Thanks.  Looks like the triple for the former is "s390x-ibm-tpf";
> > I'm experimenting with that as the target.
> >
> > Is there any accessible hardware for these?  I don't see them in
> > the gcc compile farm.
> >
> > Dave
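
For reference, here is a minimal sketch of the iconv step mentioned
above, i.e. converting a UTF-8 string to whatever name was given to
-fexec-charset.  This is not the libcpp code: convert_from_utf8 is a
hypothetical helper written for illustration, and whether a given name
such as "IBM1047" is accepted depends on the iconv implementation
installed on the host.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GCC or libcpp: convert the
   NUL-terminated UTF-8 string IN to the encoding named by TO_CHARSET
   (for example "IBM1047" or "UTF-16"), writing at most OUT_SIZE bytes
   to OUT.  Returns the number of bytes written, or -1 if iconv does
   not support the conversion or the conversion fails.  */
long
convert_from_utf8 (const char *to_charset, const char *in,
                   unsigned char *out, size_t out_size)
{
  assert (to_charset != NULL && in != NULL && out != NULL);

  /* iconv_open takes the destination encoding first, then the source.  */
  iconv_t cd = iconv_open (to_charset, "UTF-8");
  if (cd == (iconv_t) -1)
    return -1;   /* unknown name or unsupported conversion */

  char *in_ptr = (char *) in;   /* iconv's prototype is not const-correct */
  size_t in_left = strlen (in);
  char *out_ptr = (char *) out;
  size_t out_left = out_size;

  /* A single call is enough here; iconv advances both pointers and
     decrements both byte counts as it goes.  */
  size_t rc = iconv (cd, &in_ptr, &in_left, &out_ptr, &out_left);
  iconv_close (cd);
  if (rc == (size_t) -1)
    return -1;   /* invalid input, or OUT was too small */

  return (long) (out_size - out_left);
}
```

On a host whose iconv knows IBM1047, converting "0123456789" this way
should give the bytes 0xF0 through 0xF9, since that is where EBCDIC
places the digits.  A name that iconv does not recognize makes
iconv_open fail, which is presumably the situation behind the cc1 error
for UTF-EBCDIC quoted above.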