[ https://issues.apache.org/jira/browse/STDCXX-499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557859#action_12557859 ]
Martin Sebor commented on STDCXX-499:
-------------------------------------

The question is whether this is our problem or a problem with the locale definition (such as the Bulgarian locale on Linux in the test case above). I.e., is a locale that specifies a grouping but no thousands_sep valid? Among our own locales only one fits this description, which suggests it might be a bug in the locale definition:

$ (cd ~/stdcxx && for f in `grep -l "^grouping *[1-9]" etc/nls/src/*`; do grep -l "thousands_sep *\"\"" $f; done)
etc/nls/src/bg_BG

The latest glibc bg_BG definition is the same:
http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/localedata/locales/bg_BG?rev=1.7.2.2&content-type=text/x-cvsweb-markup&cvsroot=glibc

I opened a glibc issue to see if they agree it's a bug:
http://sources.redhat.com/bugzilla/show_bug.cgi?id=5599

If we decide to work around it, I see two possible ways of handling it in punct.cpp, after retrieving the grouping and thousands_sep for the locale using localeconv(). When grouping is valid and not empty and thousands_sep is NUL, either

  a) set grouping to "", or
  b) set thousands_sep to some non-NUL value.

Solution a) seems safer because it doesn't involve inventing a thousands_sep that's valid for the locale, but the downside is that it loses potentially useful information. Solution b) leaves open the question of which thousands_sep is appropriate for the locale.

> std::num_put inserts NUL thousand separator
> -------------------------------------------
>
>                 Key: STDCXX-499
>                 URL: https://issues.apache.org/jira/browse/STDCXX-499
>             Project: C++ Standard Library
>          Issue Type: Bug
>          Components: 22. Localization
>    Affects Versions: 4.1.2, 4.1.3, 4.1.4
>            Reporter: Martin Sebor
>            Assignee: Martin Sebor
>             Fix For: 4.2.1
>
>
> Moved from Rogue Wave Bugzilla:
> http://bugzilla.cvo.roguewave.com/show_bug.cgi?id=1913
>
> -------- Original Message --------
> Subject: num_put and null-character thousand separator
> Date: Tue, 11 Jan 2005 16:10:23 -0500
> From: Boris Gubenko <[EMAIL PROTECTED]>
> Reply-To: Boris Gubenko <[EMAIL PROTECTED]>
> Organization: Hewlett-Packard Co.
> To: Martin Sebor <[EMAIL PROTECTED]>
>
> Another locale-related issue that we fixed in rw stdlib v3.0 (and in
> v2.0 also) is making sure, that num_put does not insert null thousand
> separator character into the stream. Here is the fix in _num_put.cc
> in v3.0 :
>
> template <class _CharT, class _OutputIter /* = ostreambuf_iterator<_CharT> */>
> _TYPENAME num_put<_CharT, _OutputIter>::iter_type
> num_put<_CharT, _OutputIter>::
> _C_put (iter_type __it, ios_base &__flags, char_type __fill, int __type,
>         const void *__pval) const
> {
>     const numpunct<char_type> &__np =
>         _V3_USE_FACET (numpunct<char_type>, __flags.getloc ());
>     // FIXME: adjust buffer dynamically as necessary
>     char __buf [_RWSTD_DBL_MAX_10_EXP];
>     char *__pbuf = __buf;
>     const string __grouping = __np.grouping ();
>     const char *__grp = __grouping.c_str ();
>     const int __prec = __flags.precision ();
> #if defined(__VMS) && defined(__DECCXX) && !defined(__DECFIXCXXL1730)
>     const char __nogrouping = _RWSTD_CHAR_MAX;
>     if (!__np.thousands_sep())
>         __grp = &__nogrouping;
> #endif
>
> Here is the test:
>
> cosf.zko.dec.com> setenv LANG fr_FR.ISO8859-1
> cosf.zko.dec.com> locale -k thousands_sep
> thousands_sep=""
> cosf.zko.dec.com> cxx x.cxx && a.out
> null character thousand_sep was not inserted
> cosf.zko.dec.com> cxx x.cxx -D_RWSTD_USE_CONFIG -D_RWSTDDEBUG \
>     -I/usr/cxx1/boris/CXXL_1886-2/stdlib-4.0/stdlib/include/ \
>     -nocxxstd -L/usr/cxx1/boris/CXXL_1886-2/result/lib -lstd11s \
>     && a.out
> null character thousand_sep was inserted
> cosf.zko.dec.com>
>
> x.cxx
> -----
> #ifndef __USE_STD_IOSTREAM
> #define __USE_STD_IOSTREAM
> #endif
> #include <iostream>
> #include <sstream>
> #include <string>
> #include <locale>
> #include <locale.h>
> #ifdef __linux
> #define FRENCH_LOCALE "fr_FR"
> #else
> #define FRENCH_LOCALE "fr_FR.ISO8859-1"
> #endif
> using namespace std;
> int main()
> {
>     ostringstream os;
>     if (setlocale(LC_ALL,FRENCH_LOCALE))
>     {
>         setlocale(LC_ALL,"C");
>         os.imbue(locale(FRENCH_LOCALE));
>         os << (double) 10000.1 << endl;
>         if ( (os.str())[2] == '\0' )
>             cout << "null character thousand_sep was inserted" << endl;
>         else
>             cout << "null character thousand_sep was not inserted" << endl;
>     }
>     return 0;
> }
>
> ------- Additional Comments From [EMAIL PROTECTED] 2005-01-11 14:50:44 ----
>
> -------- Original Message --------
> Subject: Re: num_put and null-character thousand separator
> Date: Tue, 11 Jan 2005 15:50:06 -0700
> From: Martin Sebor <[EMAIL PROTECTED]>
> To: Boris Gubenko <[EMAIL PROTECTED]>
> References: <[EMAIL PROTECTED]>
>
> Boris Gubenko wrote:
> > Another locale-related issue that we fixed in rw stdlib v3.0 (and in
> > v2.0 also) is making sure, that num_put does not insert null thousand
> > separator character into the stream. Here is the fix in _num_put.cc
> > in v3.0 :
>
> I don't think this fix would be quite correct in general. NUL is
> a valid character that the locale library was specifically designed
> to be able to insert and extract just like any other. In addition,
> in the code below, operator==() need not be defined for the character
> type.
>
> > ...
> > Here is the test:
>
> Thanks for the helpful test case.
>
> My feeling is that this case points out a fundamental design
> disconnect between the C and C++ locales. In C, NUL is not
> an ordinary character -- it's a special character that terminates
> strings. In addition, C formatted I/O is done in multibyte
> characters.
> In contrast, in C++, NUL is a character like any other
> and formatted I/O is always done in single chars (or wchar_t when
> char is not wide enough), but never in multibyte characters.
>
> In C, the thousand separator is a multibyte string so even if
> grouping is non-empty, inserting an empty string will be as good
> as inserting none at all. In C++ the separator is assumed to be
> a single character so there's no way to achieve the same effect.
> Instead, whether a thousand separator gets inserted or not is
> controlled by the grouping string.
>
> One way to fix this would be to set grouping to "" if thousands_sep
> is NUL, although that wouldn't be quite correct either, because
> numpunct can be used directly by user programs. I'll have to think
> about how to deal with this. In the meantime, I filed bug 1913 for
> this problem so that you can track it.
>
> Martin

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.