[Bug 1109327] Re: who command gets "who: memory exhausted" for certain inputs

Bug Watch Updater Sun, 10 Feb 2013 20:06:01 -0800

Launchpad has imported 35 comments from the remote bug at
http://sourceware.org/bugzilla/show_bug.cgi?id=6530.

If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2008-05-18T10:23:18+00:00 Pierre Habouzit wrote:

With the following testcase, it happens while it shouldn't, according to
the manual:
-----8<-------
#include <stdio.h>
#include <locale.h>

#define STR "²éÄ¾ÂíÉ±²¡¶¾£¬ÖÜºèµtÄúµÄ360²»×¨Òµ£¡"

int main(void) {
        char buf[200];
        setlocale(LC_ALL, "");
        printf("%d\n", snprintf(buf, 150, "%.50s", STR));
        return 0;
}
----->8-------

The manual page has this to say:
 About precision:
  An optional precision, in the form of a period (&#8216;.&#8217;)  followed by 
an
  optional decimal digit string.(...) This gives (...) the maximum
  number of characters to be printed from a string for s and S
  conversions.

 About s:
  If no l modifier is present: The const char * argument is expected to
  be a pointer to an array of character type(...)

  If an l modifier is present: The const wchar_t * argument is expected
  to be a pointer to an array of wide characters.  Wide characters from
  the  array  are converted  to  multibyte  characters (...)

There is no "l" modifier, but still, the string goes through the
multibyte conversion code, and fails because the string is invalid
multibyte.

Note, it only works with non UTF-8 locale set in LC_CTYPE or LC_ALL.

This is debian bug http://bugs.debian.org/208308

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/0

------------------------------------------------------------------------
On 2008-05-18T10:38:54+00:00 Pierre Habouzit wrote:

Err the title is bogus, the thing is that sprintf returns -1 bogusly, if you 
run the previous testcase using  LC_ALL=ja_JP.EUC-JP e.g.

%s should not care about multibyte at all

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/1

------------------------------------------------------------------------
On 2010-08-30T13:47:37+00:00 Ævar Arnfjörð Bjarmason wrote:

Here's a simpler testcase for this bug courtesy of Jonathan Nieder:

    #include <stdio.h>
    #include <locale.h>

    int main(void)
    {
        int n;

        setlocale(LC_CTYPE, "");
        n = printf("%.11s\n", "Author: \277");
        perror("printf");
        fprintf(stderr, "return value: %d\n", n);
        return 0;
    }

Under a C locale that'll do the right thing:

    $ LANG=C ./test
    Author: &#65533;
    printf: Success
    return value: 10

But not under a UTF-8 locale, since \277 isn't a valid UTF-8 sequence:

    $ LANG=en_US.utf8 ./test
    printf: Invalid or incomplete multibyte or wide character
    return value: -1

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/2

------------------------------------------------------------------------
On 2012-03-11T16:31:51+00:00 Bugdal wrote:

This may be a duplicate of another longstanding bug report that Mr.
Drepper has marked as WONTFIX despite it being obviously non-conformant.
He seems to believe, for whatever reason, that avoiding writing out
incomplete multibyte sequences in incorrect programs that use %.NNs
formats haphazardly with multibyte strings is more important than not
breaking correct programs that are using %.NNs with strings not intended
to be interpreted as multibyte characters in the current locale. Neither
C nor POSIX makes any requirement that strings used with the %s format
specifier be valid multibyte character sequences in the current locale;
glibc is simply wrong on this issue.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/3

------------------------------------------------------------------------
On 2012-03-11T16:55:49+00:00 Carlos-0 wrote:

I'm moving this bug to unassigned. We need someone to review this issue
and confirm the problem.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/4

------------------------------------------------------------------------
On 2012-03-11T16:59:07+00:00 Carlos-0 wrote:

What would really make this bug stand out is if someone did the work to
review what the *standards* say about this particular behaviour.
Reviewing what should be accepted is 90% of the work in issues like
this. Help on this from the bug submitter would be very useful.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/5

------------------------------------------------------------------------
On 2012-03-11T17:03:57+00:00 Bugdal wrote:

My pleasure. Per N1256 (C99+TC1/2/3):

s
If no l length modifier is present, the argument shall be a pointer to the 
initial element of an array of character type.246) Characters from the array 
are written up to (but not including) the terminating null character. If the 
precision is specified, no more than that many bytes are written. If the 
precision is not specified or is greater than the size of the array, the array 
shall contain a null character. If an l length modifier is present, the 
argument shall be a pointer to the initial element of an array of wchar_t type. 
Wide characters from the array are converted to multibyte characters (each as 
if by a call to the wcrtomb function, with the conversion state described by an 
mbstate_t object initialized to zero before the first wide character is 
converted) up to and including a terminating null wide character. The resulting 
multibyte characters are written up to (but not including) the terminating null 
character (byte). If no precision is specified, the array shall contain a null 
wide character. If a precision is specified, no more than that many bytes are 
written (including shift sequences, if any), and the array shall contain a null 
wide character if, to equal the multibyte character sequence length given by 
the precision, the function would need to access a wide character one past the 
end of the array. In no case is a partial multibyte character written.247)

And the referenced footnotes:

246) No special provisions are made for multibyte characters.

247) Redundant shift sequences may result if multibyte characters have a
state-dependent encoding.

I don't think it could be clearer than what footnote 246 says about %s
without the l modifier.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/6

------------------------------------------------------------------------
On 2012-03-11T17:08:08+00:00 Carlos-0 wrote:

Many thanks Rich, that really helps. I'll put this into the queue to see
if we can fix it for 2.16.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/7

------------------------------------------------------------------------
On 2012-05-14T13:21:52+00:00 Bugdal wrote:

Any progress on this? I'm going to try to find and reopen the bug it's a
duplicate of; perhaps then they can be marked as duplicate.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/8

------------------------------------------------------------------------
On 2012-06-22T21:02:47+00:00 Carlos-odonell wrote:

I'm moving this to 2.17. We didn't get a chance to look into this issue.
We are in the code freeze for 2.16 and I think it will take considerable
effort for the community to develop and test a fix for this.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/9

------------------------------------------------------------------------
On 2012-06-22T21:23:20+00:00 Jonathan Nieder wrote:

Based on experience from the sinking feeling of watching target
milestones slowly march forward on the Mozilla and Chromium bugtrackers:
could you target it at no particular milestone instead? That way, anyone
stumbling on this will realize that it is their own responsibility to
get it fixed quickly if they care for that to happen.

Thanks,
Jonathan

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/10

------------------------------------------------------------------------
On 2012-06-22T22:08:14+00:00 Carlos-odonell wrote:

(In reply to comment #10)
> Based on experience from the sinking feeling of watching target milestones
> slowly march forward on the Mozilla and Chromium bugtrackers: could you target
> it at no particular milestone instead? That way, anyone stumbling on this will
> realize that it is their own responsibility to get it fixed quickly if they
> care for that to happen.
> 
> Thanks,
> Jonathan

Jonathan, That's a discussion that should be had on libc-
al...@sourceware.org. Please start a thread there :-)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/11

------------------------------------------------------------------------
On 2012-06-22T22:20:23+00:00 Bugdal wrote:

Fixing this bug should not be complex or invasive. Basically, it's a
case of special-case code having been incorrectly written for a case
that does not need to be, and should not be, special-cased. The fix
should essentially just remove this incorrect code. I haven't read the
source yet, but I'd be willing to look into it and possibly submit a
patch.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/12

------------------------------------------------------------------------
On 2012-06-23T00:54:03+00:00 Bugdal wrote:

Created attachment 6472
patch to fix the bug

This patch fixes the bug. As I suspected, it's all -'s, no +'s. If you
want to fix the indention on the one line that's left, you can and it'll
be a single +...

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/13

------------------------------------------------------------------------
On 2012-06-23T02:31:05+00:00 Jonathan Nieder wrote:

Hi,

Rich Felker wrote:

> --- vfprintf.c.orig
> +++ vfprintf.c
> @@ -1168,42 +1168,7 @@
>       else if (!is_long && spec != L_('S'))                                 \
>         {                                                                   \
>           if (prec != -1)                                                   \
> -           {                                                               \
> -             /* Search for the end of the string, but don't search past    \
> -                the length (in bytes) specified by the precision.  Also    \
> -                don't use incomplete characters.  */                       \
> -             if (_NL_CURRENT_WORD (LC_CTYPE, _NL_CTYPE_MB_CUR_MAX) == 1)   \
>                 len = __strnlen (string, prec);                             \
> -             else                                                          \
> -               {                                                           \
> -                 /* In case we have a multibyte character set the          \
> -                    situation is more complicated.  We must not copy       \
> -                    bytes at the end which form an incomplete character. */\
[...]
> -                   len = str2 - string - (ps.__count & 7);                 \
> -               }                                                           \
> -           }                                                               \
>           else                                                              \
>             len = strlen (string);                                          \
>         }                                                                   \
> This patch fixes the bug. As I suspected, it's all -'s, no +'s.

Thanks.  wprintf(3) tells me:

        s
                If no l modifier is present: The const char * argument
                is expected to be a pointer to an array of character
                type (pointer to a string) containing a multibyte character
                sequence beginning in the initial shift state.
                Characters from the array are converted to wide
                characters (each by a call to the mbrtowc(3) function
                with a conversion state starting in the initial state
                before the first byte).  The resulting wide characters
                are written up to (but not including) the terminating
                null wide character.
[etc]

C99 says that in this case "the argument shall be a pointer to the
initial element of a character array containing a multibyte character
sequence beginning in the initial shift state".  POSIX uses similar
wording.  C99 describes an error condition:

        The fwprintf function returns the number of wide characters
        transmitted, or a negative value if an output or encoding
        error occurred.

POSIX does not document what value errno should have in this case, but
presumably EILSEQ is typical, for consistency with mbrtowc.

Luckily these considerations only matter in the defined(COMPILE_WPRINTF)
case.  Your patch only modifies the !defined(COMPILE_WPRINTF) code.
So for what it's worth:

Reviewed-by: Jonathan Nieder <jrnie...@gmail.com>

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/14

------------------------------------------------------------------------
On 2012-06-23T03:04:11+00:00 Bugdal wrote:

The bug has nothing to do with wprintf. It applies to the non-wide
printf functions. The text you cited is about wprintf and is therefore
irrelevant.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/15

------------------------------------------------------------------------
On 2012-06-23T03:19:33+00:00 Jonathan Nieder wrote:

(In reply to comment #15)
> The bug has nothing to do with wprintf. It applies to the non-wide printf
> functions. The text you cited is about wprintf and is therefore irrelevant.

Did you read my message to the end?  Is your reply meant to motivate people
to help you?

I would be thrilled in printf and wprintf were defined in separate files
with shared code (if any) in a third file, but that's a topic for another
day.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/16

------------------------------------------------------------------------
On 2012-06-23T03:30:16+00:00 Bugdal wrote:

Apologies for jumping on your reply. I did read the whole thing but
didn't get what you intended the message to be, and the whole history of
this bug has been folks misquoting text that has nothing to do with the
non-long %s format in the non-wide printf functions to justify the buggy
behavior... :-(

With that said, thanks for the review.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/17

------------------------------------------------------------------------
On 2012-06-23T04:22:10+00:00 Jonathan Nieder wrote:

(In reply to comment #17)
> Apologies for jumping on your reply. I did read the whole thing but didn't get
> what you intended the message to be, and the whole history of this bug has 
> been
> folks misquoting text that has nothing to do with the non-long %s format in 
> the
> non-wide printf functions to justify the buggy behavior... :-(

I had been assuming that the bug was introduced by copying the wprintf()
implementation for reuse in printf() --- i.e., no misunderstanding of the
standard but just simple operator error.

But actually the patch introducing the regression (40c014b3, "Correct
handling of multibyte character strings in %s format with precision",
2000-07-22) only affected vsnprintf() and co.  The changelog comment
makes no sense.  There is no explanation on libc-alpha@ or libc-hacker@
from around that time.  It looks like a simple, unilateral, bad decision.

There's a patch identical to yours complete with changelog and copyright
assignment at <http://sourceware.org/ml/libc-hacker/2010-08/msg00009.html>.
It just reverts 40c014b3.  It was ignored.  A story from a worse time.

Thanks for fixing it.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/18

------------------------------------------------------------------------
On 2012-06-23T11:00:23+00:00 Joseph-codesourcery wrote:

A testcase for the fix should be added to the glibc testsuite (this 
applies to any bug fix that can reasonably be covered in the testsuite and 
isn't fixing an existing test failure).  (It's probably a good idea to add 
a test for wprintf as well to verify that wprintf continues to behave as 
expected, if this isn't already covered by existing wprintf tests.)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/19

------------------------------------------------------------------------
On 2012-06-23T11:37:57+00:00 Bugdal wrote:

I don't think there's any concern about breaking wprintf. wprintf
inherently operates by converting the argument to %s to wide characters
since its output it wide characters; as such, there's no way it could
avoid failing when the input is an invalid multibyte sequence unless it
completely ignored the failure return.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/20

------------------------------------------------------------------------
On 2012-06-23T20:34:55+00:00 Jonathan Nieder wrote:

Created attachment 6474
test conversion from UTF-8 and handling of incomplete characters (fails)

I assume Joseph means to spend some time to avoid future authors breaking
wprintf now that our attention's on it, not to avoid breaking wprintf when
applying your patch.  Your patch doesn't even touch wprintf code.

I tried writing a wprintf testcase, and it's not gone so well.  For
example, I tried

  wchar_t buf[1000];
  int n;

  if (!setlocale(LC_ALL, "ja_JP.UTF-8)) {
    perror("setlocale");
    return 1;
  }
  n = swprintf(buf, sizeof(buf)/sizeof(wchar_t), L"%3s",
        "\xef\xbd\xa1g\xef\xbd\xa2h\xef\xbd\xa3i\xef\xbd\xa4j");

I expected swprintf to return 3 and buf to equal

  { 0xff61, 'g', 0xff62, 0 }

Instead, swprintf returns 8 and converts the entire string.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/21

------------------------------------------------------------------------
On 2012-06-23T20:53:14+00:00 Bugdal wrote:

That's because you made the classic printf mistake of confusing width
and precision.

Try L"%.3s".

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/22

------------------------------------------------------------------------
On 2012-06-23T21:00:49+00:00 Jonathan Nieder wrote:

Created attachment 6475
Test swprintf's handling of incomplete multibyte characters

(In reply to comment #22)
> That's because you made the classic printf mistake of confusing width and
> precision.
> 
> Try L"%.3s".

Good catch, thanks.  Here's a corrected test, which works fine.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/23

------------------------------------------------------------------------
On 2012-06-24T15:03:10+00:00 Joseph-codesourcery wrote:

On Fri, 22 Jun 2012, carlos_odonell at mentor dot com wrote:

> > Based on experience from the sinking feeling of watching target milestones
> > slowly march forward on the Mozilla and Chromium bugtrackers: could you 
> > target
> > it at no particular milestone instead? That way, anyone stumbling on this 
> > will
> > realize that it is their own responsibility to get it fixed quickly if they
> > care for that to happen.
> > 
> > Thanks,
> > Jonathan
> 
> Jonathan, That's a discussion that should be had on libc-al...@sourceware.org.
> Please start a thread there :-)

Yes, we do need to discuss how best to use milestones based on our 
experiences of the 2.16 release cycle.  Experience with bug trackers of 
other problems would certainly be a useful contribution to that discussion 
(and the related one of how to get bugs fixed faster than they are filed).

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/24

------------------------------------------------------------------------
On 2012-06-24T21:44:37+00:00 Bugdal wrote:

When a bug has been known for over 4 years and there's such a simple
fix, why is there discussion of putting off fixing it until later? Or am
I misunderstanding Jonathan's reply again?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/25

------------------------------------------------------------------------
On 2012-06-24T22:03:11+00:00 Carlos-odonell wrote:

(In reply to comment #25)
> When a bug has been known for over 4 years and there's such a simple fix, why
> is there discussion of putting off fixing it until later? Or am I
> misunderstanding Jonathan's reply again?

This bug will likely be fixed in 2.17 development, with a backport to
2.16.1 if requested. Does that match your expectations?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/26

------------------------------------------------------------------------
On 2012-06-25T01:00:26+00:00 Jonathan Nieder wrote:

Created attachment 6477
Test sprintf's handling of incomplete multibyte characters

And here's a test to make sure the bug doesn't rear its head again in
the future.

I personally think Rich's fix should be safe to apply for 2.16.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/27

------------------------------------------------------------------------
On 2012-10-01T23:42:08+00:00 Jonathan Nieder wrote:

commit 715a900c
Author: Jeff Law <l...@redhat.com>
Date:   Fri Sep 28 12:48:42 2012 -0600

    2012-09-28  Andreas Schwab  <sch...@linux-m68k.org>

            [BZ #6530]
            * stdio-common/vfprintf.c (process_string_arg): Revert
            2000-07-22 change.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/28

------------------------------------------------------------------------
On 2012-10-02T21:20:23+00:00 Carlos-odonell wrote:

Jonathan,

Could you please open one new issue per backport request and use
keywords to mark which branch to backport the patch to.

Also we expect the person or groups that needs the backport to do the
work of making and testing the patch. The release branch manager simply
does the approval and analysis of risk for the branch.

Thanks.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/29

------------------------------------------------------------------------
On 2012-10-03T00:51:02+00:00 Jonathan Nieder wrote:

Sure, no problem.

Do I understand correctly that this means testing the patch against
2.16, 2.15, 2.14, 2.12, and 2.11 and then filing one bug for each
branch describing the result?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/30

------------------------------------------------------------------------
On 2012-10-03T00:54:00+00:00 Carlos-odonell wrote:

(In reply to comment #30)
> Sure, no problem.
> 
> Do I understand correctly that this means testing the patch against
> 2.16, 2.15, 2.14, 2.12, and 2.11 and then filing one bug for each
> branch describing the result?

Yes, that way we can track the backport to each branch individually and
the various release branch maintainers and the release branch
shareholders can comment on the change. The keyword allows a release
branch manager to quickly identify issues open against their branch.

If you have any feedback about this process I'd be more than happy to
hear it. We don't have much good formal process for bugzilla, but we're
trying :-)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/31

------------------------------------------------------------------------
On 2012-10-03T21:19:42+00:00 Joseph-codesourcery wrote:

On Wed, 3 Oct 2012, jrnieder at gmail dot com wrote:

> Do I understand correctly that this means testing the patch against
> 2.16, 2.15, 2.14, 2.12, and 2.11 and then filing one bug for each
> branch describing the result?

There is no real evidence of 2.14 or 2.12 being alive - no response from 
their maintainer when I asked on libc-alpha whether they are alive or dead 
- and 2.11 is considered dead 
<http://sourceware.org/ml/libc-alpha/2012-08/msg00557.html>.  I don't 
think it's worth doing any backports to those three branches.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/32

------------------------------------------------------------------------
On 2012-10-04T01:47:58+00:00 Jonathan Nieder wrote:

Backport requests for known living branches filed as bug 14667 and bug
14667.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/33

------------------------------------------------------------------------
On 2012-10-04T01:50:06+00:00 Jonathan Nieder wrote:

(In reply to comment #33)
> Backport requests for known living branches filed as bug 14667 and bug 14667.

... and bug 14668.  Sorry for the noise.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1109327/comments/34

** Changed in: glibc
       Status: Unknown => Fix Released

** Changed in: glibc
   Importance: Unknown => Medium

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1109327

Title:
  who command gets "who: memory exhausted" for certain inputs

To manage notifications about this bug go to:
https://bugs.launchpad.net/glibc/+bug/1109327/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1109327] Re: who command gets "who: memory exhausted" for certain inputs

Reply via email to