2016-07-01 17:41 GMT+03:00 Mike Gelfand <mike...@mikedld.com>: > > Since you already have "bool SystemTools::GetEnv(const char* key, > std::string& result)", another option would be to use it everywhere and > maybe introduce something like "bool SystemTools::HasEnv(const char* > key)" for those several cases where you only need to check the existence. > > I implemented this, it's actually really nice API to work with.
2016-07-02 1:54 GMT+03:00 <clin...@elemtech.com>: > > ... > > > +#if defined(_WIN32) > > + // Kinda hack, with MSVC (and MinGW) for some reason std::wcout > > + // (and all other std::w*) fails once it encounters non-ASCII > > + // string unless locale is set. > > + // Note that even with this, seems it can't output characters > > + // which aren't present in ANSI codepage no matter what encoding > > + // is used for console. > > + // Also once any character outside of ANSI codepage is tried to > > + // be outputed then after there anymore won't be output from > > + // any of std::w* functions. > > + _wsetlocale(LC_ALL, L""); > > +#endif > > The _wsetlocale() may be a problem. > > See: > https://gitlab.kitware.com/cmake/cmake/commit/87be2e142 > https://cmake.org/Bug/view.php?id=16099 > > > Indeed, good catch, thanks, I didn't thought about this. But even now most these locale aware functions like tolower/toupper and others are wrong because internally we use UTF-8 and there 1 character can take more than 1 byte so these functions won't work correctly for some strings even if we don't set any locale. Now here we actually set it only on Windows because there just isn't any other way. Without setting locale we get only ASCII support and can't output even ANSI characters. With locale we can atleast output ANSI characters. Currently Microsoft C++ library doesn't support UTF-8/UTF-16 locales. Only way to output Unicode would be implement our own std::wcout which would use wide WinAPI to write to console. Anyway quick fix is to always use English locale then number parsing will be expected and set user's codepage. std::wstring locale = L"English_United States."; locale += std::to_wstring(GetACP()); _wsetlocale(LC_ALL, locale.c_str()); Of course proper Unicode support will be needed some day, but for now this is still an improvement. > @@ -30,8 +30,23 @@ namespace @KWSYS_NAMESPACE@ > > typedef std::basic_filebuf<CharType,Traits> my_base_type; > > basic_filebuf *open(char const *s,std::ios_base::openmode mode) > > { > > + std::wstring wstr = Encoding::ToWide(s); > > + const wchar_t *ws = wstr.c_str(); > > +#if defined(_MSC_VER) && _MSC_VER >= 1400 > > + const wchar_t *ss = ws; > > +#else > > + const char *ss = 0; > > + size_t length = std::wcstombs(0, ws, 0); > > + if (length != size_t(-1)) { > > + std::vector<char> str(length + 1); > > + ss = str.data(); > > + std::wcstombs((char *)ss, ws, str.size()); > > + } else { > > + ss = s; > > + } > > +#endif > > return static_cast<basic_filebuf*>( > > - my_base_type::open(Encoding::ToWide(s).c_str(), mode) > > + my_base_type::open(ss, mode) > > ); > > Yes. It makes sense to convert from utf-8 to code page when we are not > using the wide apis. > This would at least give you support for the current code page, beyond > ascii. Beyond that, the wide functions should be used. > Which compiler are you trying to support here, which doesn't give a wide > open()? > > Only MSVC have ofstream::open(const wchar_t *) so for MinGW need to use ofstream::open(const char *) or use wofstream::open(const wchar_t *) which would require quite big changes. > > } > > }; > > diff --git a/Source/kwsys/ProcessWin32.c b/Source/kwsys/ProcessWin32.c > > index 2b93e69..208e725 100644 > > --- a/Source/kwsys/ProcessWin32.c > > +++ b/Source/kwsys/ProcessWin32.c > > @@ -181,7 +181,7 @@ struct kwsysProcessPipeData_s > > /* ------------- Data managed per call to Execute ------------- */ > > > > /* Buffer for data read in this pipe's thread. */ > > - char DataBuffer[KWSYSPE_PIPE_BUFFER_SIZE]; > > + char DataBuffer[KWSYSPE_PIPE_BUFFER_SIZE*2]; > > This "*2" assumes all characters cmake sees will fit in the Basic > Multilingual Plane. > Are we OK assuming that? > If the conversion from a code page to utf-16 results in surrogate pairs, > there may not be enough space. > > Not really, it assumes character string from ANSI code page encoded to internal encoding (UTF-8) will not take more than 2x space. Which I think is a pretty good guess, because here we're processing output from a process and so output will be paths and texts usually in ASCII with not many characters which would require more than 2 bytes. And it will still work if say 1/4th uses 4 bytes and 1 byte for rest 3/4th. 1024 * 1/4 * 4 bytes + 1024 * 3/4 * 1 byte (1792) < 1024 * 2 (2048) But of course can increase it to 4x, only I think it's very unlikely that it will be used. > > > > > /* The length of the data stored in the buffer. */ > > DWORD DataLength; > > @@ -1626,6 +1626,25 @@ void kwsysProcessPipeThreadReadPipe(kwsysProcess* > cp, > > kwsysProcessPipeData* td) > > KWSYSPE_DEBUG((stderr, "read closed %d\n", td->Index)); > > } > > > > + if (td->DataLength > 0) { > > + UINT codepage = GetConsoleCP(); > > + if (!codepage) { > > + codepage = GetACP(); > > > Can this be stored in kwsysProcessCreateInformation, and not computed ever > time we read data? > > > Not kwsysProcessCreateInformation, because we can't access it from this function, but we can use kwsysProcess and I'll change to that. > > > + } > > + if (codepage != KWSYS_ENCODING_DEFAULT_CODEPAGE) { > > + int wlength = MultiByteToWideChar(codepage, 0, > td->DataBuffer, > > td->DataLength, NULL, 0); > > + wchar_t* wdata = malloc(wlength * sizeof(wchar_t)); > > + int r = MultiByteToWideChar(codepage, 0, td->DataBuffer, > > td->DataLength, wdata, wlength); > > + if (r > 0) { > > + r = > WideCharToMultiByte(KWSYS_ENCODING_DEFAULT_CODEPAGE, 0, > > wdata, wlength, td->DataBuffer, KWSYSPE_PIPE_BUFFER_SIZE * 2, NULL, > NULL); > > + if (r > 0) { > > + td->DataLength = r; > > + } > > + } > > + free(wdata); > > + } > > + } > > How about avoiding a malloc()/free() each time we read an array? > > I don't really understand what you mean, I used malloc there because we can't know beforehand how many bytes we actually will need before DataBuffer is filled. Also I don't see there any array reading. All process output is read with single ReadFile call so I don't see how malloc could cause any overhead. Do you want that this buffer is added to kwsysProcessPipeData as static array? > ProcessWin32.c already uses Encoding.h. How about using the Encoding > instead of > WideCharToMultiByte(...)? I'm fine either way. > > There we get length from td->DataLength and so I assume we might get non-null terminated data in which case we can't use those Encoding functions because they expect source to be null-terminated.
-- Powered by www.kitware.com Please keep messages on-topic and check the CMake FAQ at: http://www.cmake.org/Wiki/CMake_FAQ Kitware offers various services to support the CMake community. For more information on each offering, please visit: CMake Support: http://cmake.org/cmake/help/support.html CMake Consulting: http://cmake.org/cmake/help/consulting.html CMake Training Courses: http://cmake.org/cmake/help/training.html Visit other Kitware open-source projects at http://www.kitware.com/opensource/opensource.html Follow this link to subscribe/unsubscribe: http://public.kitware.com/mailman/listinfo/cmake-developers