Just use snprintf to copy the error message from strerror_r into a
thread-local buffer of 64 bytes or so.  Then preserve the existing
terror() interface.

Can you open a jira for this?

best,
Colin

On Thu, Dec 11, 2014 at 8:35 PM, malcolm <malcolm.kaval...@oracle.com> wrote:
> So, turns out that if I had naively changed all calls to terror or
> references to sys_errlist, to using strerror_r, then I would have broken
> code for Windows and HPUX (and possibly other OSes).
>
> If we are to assume that current code runs fine on all platforms (maybe even
> AIX an MacOS, for example), then any change/additions made to the code and
> not ifdeffed appropriately can break on other OSes. On the other hand,  too
> many ifdefs can pollute the code source and render it less readable (though
> possibly less important).
>
> In the general case what are code contributors responsibilities to adding
> code regarding OSes besides Linux ?
> What OSes does jenkins test on ?
> I guess maintainers of code on non-tested platforms are responsible for
> their own testing ?
>
> How do we avoid the ping-pong effect, i.e. I make a generic change to code
> which breaks on Windows, then the Windows maintainer reverts changes to
> break on Solaris for example ? Or does this not happen in actuality ?
>
>
> On 12/11/2014 11:25 PM, Asokan, M wrote:
>>
>> Hi Malcom,
>>    The Windows versions of strerror() and strerror_s() functions are
>> probably meant for ANSI C library functions that set errno.  For core
>> Windows API calls (like UNIX system calls), one gets the error number by
>> calling GetLastError() function.  In the code snippet I sent earlier, the
>> "code" argument is the value returned by GetLastError().  Neither strerror()
>> nor strerror_s() will give the correct error message for this error code.
>>
>> You could probably look at libwinutils.c in Hadoop source.  It uses
>> FormatMessageW (which returns messages in Unicode.)  My requirement was to
>> return messages in current system locale.
>>
>> -- Asokan
>> ________________________________________
>> From: malcolm [malcolm.kaval...@oracle.com]
>> Sent: Thursday, December 11, 2014 4:04 PM
>> To: common-dev@hadoop.apache.org
>> Subject: Re: Solaris Port
>>
>> Hi Asok,
>>
>> I googled and found that windows has strerror, and strerror_s (which is
>> the strerror_r equivalent).
>> Is there a reason why you didn't use this call ?
>>
>> On 12/11/2014 06:27 PM, Asokan, M wrote:
>>>
>>> Hi Malcom,
>>>      Recently, I had to work on a function to get system error message on
>>> various systems.  Here is the piece of code I came up with.  Hope it helps.
>>>
>>> static void get_system_error_message(char * buf, int buf_len, int code)
>>> {
>>> #if defined(_WIN32)
>>>       LPVOID lpMsgBuf;
>>>       DWORD status = FormatMessage(FORMAT_MESSAGE_ALLOCATE_BUFFER |
>>>                                    FORMAT_MESSAGE_FROM_SYSTEM |
>>>                                    FORMAT_MESSAGE_IGNORE_INSERTS,
>>>                                    NULL, code,
>>>                                    MAKELANGID(LANG_NEUTRAL,
>>> SUBLANG_DEFAULT),
>>>                                                            /* Default
>>> language */
>>>                                    (LPTSTR) &lpMsgBuf, 0, NULL);
>>>       if (status > 0)
>>>       {
>>>           strncpy(buf, (char *)lpMsgBuf, buf_len-1);
>>>           buf[buf_len-1] = '\0';
>>>           /* Free the buffer returned by system */
>>>           LocalFree(lpMsgBuf);
>>>       }
>>>       else
>>>       {
>>>           _snprintf(buf, buf_len-1 , "%s %d",
>>>               "Can't get system error message for code", code);
>>>           buf[buf_len-1] = '\0';
>>>       }
>>> #else
>>> #if defined(_HPUX_SOURCE)
>>>       {
>>>           char * msg;
>>>           errno = 0;
>>>           msg = strerror(code);
>>>           if (errno == 0)
>>>           {
>>>               strncpy(buf, msg, buf_len-1);
>>>               buf[buf_len-1] = '\0';
>>>           }
>>>           else
>>>           {
>>>               snprintf(buf, buf_len, "%s %d",
>>>                   "Can't get system error message for code", code);
>>>           }
>>>       }
>>> #else
>>>       if (strerror_r(code, buf, buf_len) != 0)
>>>       {
>>>           snprintf(buf, buf_len, "%s %d",
>>>               "Can't get system error message for code", code);
>>>       }
>>> #endif
>>> #endif
>>> }
>>>
>>> Note that HPUX does not have strerror_r() since strerror() itself is
>>> thread-safe.  Also Windows does not have snprintf().  The equivalent
>>> function _snprintf() has a subtle difference in its interface.
>>>
>>> -- Asokan
>>> ________________________________________
>>> From: malcolm [malcolm.kaval...@oracle.com]
>>> Sent: Thursday, December 11, 2014 11:02 AM
>>> To: common-dev@hadoop.apache.org
>>> Subject: Re: Solaris Port
>>>
>>> Fine with me, I volunteer to do this, if accepted.
>>>
>>> On 12/11/2014 05:48 PM, Allen Wittenauer wrote:
>>>>
>>>> sys_errlist was removed for a reason.  Creating a fake sys_errlist on
>>>> Solaris will mean the libhadoop.so will need to be tied a specific build
>>>> (kernel/include pairing) and therefore limits upward 
>>>> mobility/compatibility.
>>>> That doesn’t seem like a very good idea.
>>>>
>>>> IMO, switching to strerror_r is much preferred, since other than the
>>>> brain-dead GNU libc version, is highly portable and should work regardless
>>>> of the kernel or OS in place.
>>>>
>>>> On Dec 11, 2014, at 5:20 AM, malcolm <malcolm.kaval...@oracle.com>
>>>> wrote:
>>>>
>>>>> FYI, there are a couple more files that reference sys_errlist directly
>>>>> (not just terror within exception.c) , but also hdfs_http_client.c and
>>>>> NativeiO.c
>>>>>
>>>>> On 12/11/2014 07:38 AM, malcolm wrote:
>>>>>>
>>>>>> Hi Colin,
>>>>>>
>>>>>> Exactly, as you noticed, the problem is the thread-local buffer needed
>>>>>> to return from terror.
>>>>>> Currently, terror just returns a static string from an array, this is
>>>>>> fast, simple and error-proof.
>>>>>>
>>>>>> In order to use strerror_r inside terror,  would require allocating a
>>>>>> buffer inside terror  and depend on the caller to free the buffer after
>>>>>> using it, or to pass a buffer to terrror (which is basically the same as
>>>>>> strerror_r, rendering terror redundant).
>>>>>> Both cases require modification outside terror itself, as far as I can
>>>>>> tell, no simple fix. Unless you have an alternative which I haven't 
>>>>>> thought
>>>>>> of ?
>>>>>>
>>>>>> As far as I can tell, we have two choices:
>>>>>>
>>>>>> 1. Remove terror and replace calls with strerror_r, passing a buffer
>>>>>> from the callee.
>>>>>>       Advantage: a more modern portable interface.
>>>>>>       Disadvantage: All calls to terror need to be modified, though
>>>>>> all seem to be in a few files as far as I can tell.
>>>>>>
>>>>>> 2. Adding a sys_errlist array (ifdeffed for Solaris)
>>>>>>       Advantage: no change to any calls to terror
>>>>>>       Disadvantage: 2 additional files added to source tree (.c and
>>>>>> .h) and some minor ifdefs only used for Solaris.
>>>>>>
>>>>>> I think it is more a question of style than anything else, so I leave
>>>>>> you to make the call.
>>>>>>
>>>>>> Thanks for your patience,
>>>>>> Malcolm
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 12/10/2014 09:54 PM, Colin McCabe wrote:
>>>>>>>
>>>>>>> On Wed, Dec 10, 2014 at 2:31 AM, malcolm
>>>>>>> <malcolm.kaval...@oracle.com> wrote:
>>>>>>>>
>>>>>>>> Hi Colin,
>>>>>>>>
>>>>>>>> Thanks for the hints around JIRAs.
>>>>>>>>
>>>>>>>> You are correct errno still exists, however sys_errlist does not.
>>>>>>>>
>>>>>>>> Hadoop uses a function terror (defined in exception.c) which indexes
>>>>>>>> sys_errlist by errno to return the error message from the array.
>>>>>>>> This
>>>>>>>> function is called 26 times in various places (in 2.2)
>>>>>>>>
>>>>>>>> Originally, I thought to replace all calls to terror with strerror,
>>>>>>>> but
>>>>>>>> there can be issues with multi-threading (it returns a buffer which
>>>>>>>> can be
>>>>>>>> overwritten), so it seemed simpler just to recreate the sys_errlist
>>>>>>>> message
>>>>>>>> array.
>>>>>>>>
>>>>>>>> There is also a multi-threaded version strerror_r where you pass the
>>>>>>>> buffer
>>>>>>>> as a parameter, but this would necessitate changing every call to
>>>>>>>> terror
>>>>>>>> with mutiple lines of code.
>>>>>>>
>>>>>>> Why don't you just use strerror_r inside terror()?
>>>>>>>
>>>>>>> I wrote that code originally.  The reason I didn't want to use
>>>>>>> strerror_r there is because GNU libc provides a non-POSIX definition
>>>>>>> of strerror_r, and forcing it to use the POSIX one is a pain. But you
>>>>>>> can do it.  You also will require a thread-local buffer to hold the
>>>>>>> return from strerror_r, since it is not guaranteed to be static
>>>>>>> (although in practice it is 99% of the time-- another annoyance with
>>>>>>> the API).
>>>>>>>
>>>>>>>
>>> ________________________________
>>>
>>>
>>>
>>> ATTENTION: -----
>>>
>>> The information contained in this message (including any files
>>> transmitted with this message) may contain proprietary, trade secret or
>>> other confidential and/or legally privileged information. Any pricing
>>> information contained in this message or in any files transmitted with this
>>> message is always confidential and cannot be shared with any third parties
>>> without prior written approval from Syncsort. This message is intended to be
>>> read only by the individual or entity to whom it is addressed or by their
>>> designee. If the reader of this message is not the intended recipient, you
>>> are on notice that any use, disclosure, copying or distribution of this
>>> message, in any form, is strictly prohibited. If you have received this
>>> message in error, please immediately notify the sender and/or Syncsort and
>>> destroy all copies of this message in your possession, custody or control.
>
>

Reply via email to