perry-ca wrote:

> > The -fexec-charset option used for the c++ library may not be the same as
> > in the user source. The design point is to have one typeinfo object per
> > type. In a simple example we can have two source files defining two types
> > compiled with two different -fexec-charset options. This will result in
> > typeinfo objects being defined in three different places:
> >
> > 1. the basic types in standard c++ library
> > 2. type T1 defined in the first source file
> > 3. type T2 defined in the second source file
> >
> > If these are all in different code pages, users won't be able to know what
> > code page is being returned by name().
>
> Linking objects compiled with different -fexec-charset option would be a
> programmer mistake because the objects are not ABI compatible (in some
> sense). You can't safely pass strings across object boundaries if the objects
> are compiled with different -fexec-charset options. For example, if your
> program compiled with -fexec-charset=ascii tries to examine an ebcdic-encoded
> string returned by type_info::name() (e.g., tries to demangle it), it will
> fail. If you compile libc++-/abi with -fexec-charset=ascii, you will get
> rubbish in "std::terminate called after throwing an exception of type
> std::bad_alloc", because the "std::bad_alloc" part will be demangled using
> the wrong (ASCII) decoder.
>
> These are the same problems you get when passing any strings hardcoded in any
> library to another library compiled with different -fexec-charset option. I
> still don't see how type_info::name() is different.

These are all valid concerns, and they are addressed. The compiler has the -fzos-le-char-mode=ascii|ebcdic option, which selects "ascii mode" or "ebcdic mode". I think that option is a better fit for the example you outline. In addition to the rule we're talking about here, that the typeinfo name is in the system charset, we also duplicate the name on z/OS and provide it in both ebcdic and ascii. The ascii encoding comes right after the ebcdic form, and the name() function skips the ebcdic encoding when it detects that the program is in ascii mode. Abhina put up a PR for this.

In your example, type_info::name() will return either the ascii encoding or the ebcdic encoding, depending on the mode the program is running in. That mode is determined by the -fzos-le-char-mode=ascii|ebcdic option, and that is the option that matters for your example. The user should be careful to use a -fexec-charset option that works for them; a poor choice could create a problem like the one you outline (outside of libc++). That would be a user problem.

For libc++/abi, we provide enough to make what you describe work. Libc++ is provided in many forms. For 64-bit, we provide 4 variations (ascii/ebcdic & ieee/hexfloat). However, there is only one libc++abi, and it is independent of those variations. Libc++abi includes all of the std::exception derived classes. The code in libc++abi uses a z/OS function to determine whether the program is in ascii or ebcdic mode and handles the strings accordingly (i.e. treats the strings as ascii in ascii mode).

> For example, if your program compiled with -fexec-charset=ascii tries to
> examine an ebcdic-encoded string returned by type_info::name() (e.g., tries to
> demangle it), it will fail.

This example has a number of subtleties and would actually work as presented. I get what you are driving at, so I'll change the example a little.

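As a side note on the duplicated name described above, here is a minimal sketch of how a lookup over such a pair of encodings might work. It is not the actual libc++abi code: the buffer layout (two NUL-terminated strings back to back), the names kDualName, CharMode and select_name, and the sample name "abc" are all assumptions made up for illustration, and the real runtime query for the program's mode is replaced by a plain parameter.

```cpp
#include <cstring>

// Hypothetical layout: the ebcdic name, a NUL, then the ascii name, a NUL.
// "abc" is 0x81 0x82 0x83 in ebcdic (IBM-1047) and 0x61 0x62 0x63 in ascii.
// The literals are split so the "\0" escape cannot merge with what follows.
static const char kDualName[] = "\x81\x82\x83\0" "\x61\x62\x63";

// Hypothetical stand-in for the mode chosen by -fzos-le-char-mode.
enum class CharMode { Ebcdic, Ascii };

// Return the encoding that matches the program's mode.
// In ebcdic mode the first string is used as-is; in ascii mode the
// ebcdic string (and its terminating NUL) is skipped.
const char *select_name(const char *dual, CharMode mode) {
  if (mode == CharMode::Ebcdic)
    return dual;
  return dual + std::strlen(dual) + 1;
}
```

With this layout, select_name(kDualName, CharMode::Ascii) points at the bytes 0x61 0x62 0x63, while select_name(kDualName, CharMode::Ebcdic) points at the start of the buffer. The caller never needs to know that a second encoding is stored alongside the first.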
If your program is compiled with -fzos-le-char-mode=ebcdic and the source file is compiled with -fexec-charset=ascii, and the program compares characters in the name (e.g. name[0]=='_' or strcmp(name, "abc")), then the comparison will fail. When the program is compiled with -fzos-le-char-mode=ebcdic, the user knows that all names returned by type_info::name() are in ebcdic and needs to make sure those strings are handled as such. Using -fexec-charset=ascii with -fzos-le-char-mode=ebcdic can fail in many places, not just with type_info::name() (e.g. the format string for printf). This option combination can still be helpful, for example when building an ebcdic app that reads and parses ascii files.

Your example works because demangle() is provided in libc++abi. Libc++abi will detect the ascii/ebcdic mode of the program and expect the argument to be in that encoding. If the user has code like `cxa_demangle(ti.name())`, it will always handle the string correctly. The -fexec-charset option has no effect on that code.

https://github.com/llvm/llvm-project/pull/138895
_______________________________________________
cfe-commits mailing list
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
