Re: [PATCH] Fix newly exposed bug [was RE: RFC: Fix partial NaN-parsing problem [was RE: sscanf problem]]

2005-04-29 Thread Eric Blake

According to Dave Korn on 4/28/2005 12:41 PM:
>   Heh, actually we probably have to talk about that.  The k should IIUIC be
> swallowed by the %lf and the %c should fail; this is the production
> described as NAN(n-char-sequence opt) in the C language spec, strtod
> documentation (that's 7.20.1.3.3 in WG14/N843 draft, I don't have the final
> version).  And we haven't even mentioned the lack of INF support yet :)

Per POSIX,
http://www.opengroup.org/onlinepubs/009695399/functions/sscanf.html,
NAN(n-char-sequence) is implementation defined - sscanf should only accept
n-char-sequences that are also generated by the same implementation's
printf and parsed by strtod.  And since newlib printf does not output any
n-char-sequence versions of NaN, sscanf should not parse them.  IIUC,
n-char-sequence exists to let implementations specify signalling NaNs or
exact bit patterns within the NaN.

All this means that I think Jeff is right that
 i = sscanf ("nank", "%lf%c%n", &x, &m, &n)
should succeed on newlib, but you should also remember that it is not
portable and that on non-newlib systems it might fail.

-- 
Life is short - so eat dessert first!

Eric Blake [EMAIL PROTECTED]

--
Unsubscribe info:  http://cygwin.com/ml/#unsubscribe-simple
Problem reports:   http://cygwin.com/problems.html
Documentation: http://cygwin.com/docs.html
FAQ:   http://cygwin.com/faq/



RE: [PATCH] Fix newly exposed bug [was RE: RFC: Fix partial NaN-parsing problem [was RE: sscanf problem]]

2005-04-28 Thread Gary R. Van Sickle
>   Heh, actually we probably have to talk about that.  The k 
> should IIUIC be swallowed by the %lf and the %c should fail; 
> this is the production described as NAN(n-char-sequence opt) 
> in the C language spec, strtod documentation (that's 
> 7.20.1.3.3 in WG14/N843 draft, I don't have the final 
> version).  And we haven't even mentioned the lack of INF 
> support yet :)
> 
>   However I'm on UK time, so it won't be happening today!
> 
> 
> cheers,
>   DaveK

You should switch to UTC.

;-)

-- 
Gary R. Van Sickle





RE: [PATCH] Fix newly exposed bug [was RE: RFC: Fix partial NaN-parsing problem [was RE: sscanf problem]]

2005-04-28 Thread Dave Korn
Original Message
>From: Jeff Johnston
>Sent: 28 April 2005 19:33

> Hi Dave,
> 
>Thanks for looking into this.  Your patch wasn't quite correct.  It
> ended up breaking nan-support which isn't tested in the accompanying
> testcase.  It needed to verify that (x & multiple_flags_ored_together) ==
>   multiple_flags_ored_together. Anyway, I have checked a patch in and
> verified that it works for your tests below plus it also works for a
> simple test like 
> i = sscanf ("nank", "%lf%c%n", &x, &m, &n)
> 
> -- Jeff J.


  Heh, actually we probably have to talk about that.  The k should IIUIC be
swallowed by the %lf and the %c should fail; this is the production
described as NAN(n-char-sequence opt) in the C language spec, strtod
documentation (that's 7.20.1.3.3 in WG14/N843 draft, I don't have the final
version).  And we haven't even mentioned the lack of INF support yet :)

  However I'm on UK time, so it won't be happening today!


cheers,
  DaveK
-- 
Can't think of a witty .sigline today





Re: [PATCH] Fix newly exposed bug [was RE: RFC: Fix partial NaN-parsing problem [was RE: sscanf problem]]

2005-04-28 Thread Jeff Johnston
Hi Dave,
  Thanks for looking into this.  Your patch wasn't quite correct: it ended up
breaking NaN support, which isn't tested in the accompanying testcase.  It
needed to verify that (x & multiple_flags_ored_together) ==
multiple_flags_ored_together.  Anyway, I have checked a patch in and verified
that it works for your tests below, plus it also works for a simple test like
i = sscanf ("nank", "%lf%c%n", &x, &m, &n)

-- Jeff J.

[PATCH] Fix newly exposed bug [was RE: RFC: Fix partial NaN-parsing problem [was RE: sscanf problem]]

2005-04-28 Thread Dave Korn
Original Message
>From: Jean-Christophe Kablitz
>Sent: 27 April 2005 00:22

> Hello,
> 
> I have noticed that, when parsing {a float_value immediately followed by
> 'n' or 'N'} with the "%f%c" format, the sscanf function of cygwin-1.5.16-1
> behaves differently from the sscanf function of cygwin-1.5.14-1.
> Up to and including cygwin-1.5.14-1, 'n' matches %c, while with
> cygwin-1.5.15-1 and cygwin-1.5.16-1, 'n' is no longer assigned to %c.
> 
> In the following test case, I would expect the program to output
> i=2 x=1 m=a
> i=2 x=1 m=n
> 
> as was the case up to and including cygwin-1.5.14-1.
> 
> With cygwin-1.5.15-1 and cygwin-1.5.16-1, the program outputs instead
> i=2 x=1 m=a
> i=1 x=1 m=_
> 
> Maybe I have been misusing sscanf. Or there is a relationship with the
> NaN-parsing problem of the "newlib".

  No, your use of sscanf is perfectly correct!  Yes, there is a newly
exposed bug in the NaN parsing code, as you guessed; it falsely accepted the
N as part of 'NaN'.  Then, because it had begun by parsing a number, and
because it successfully parsed the number, it didn't go through the
'nan-parsing-has-failed-so-put-back-the-eaten-chars' bit that my last fix
introduced.

> --- beginning of test case ---
> jck:/sscanf> cat ssn.c
> #include <stdio.h>
> 
> int main()
> {
>    double x;
>    char   m;
>    int    i;
> 
>    x = 0.0;
>    m = '_';
>    i = sscanf("1.0a", "%lf%c", &x, &m);
>    printf("i=%d x=%g m=%c\n", i, x, m);
>    x = 0.0;
>    m = '_';
>    i = sscanf("1.0n", "%lf%c", &x, &m);
>    printf("i=%d x=%g m=%c\n", i, x, m);
>    return 0;
> }
> 
> jck:/sscanf> gcc -O0 ssn.c -o ssn.exe
> jck:/sscanf> ./ssn.exe
> i=2 x=1 m=a
> i=1 x=1 m=_
> --- end of test case ---

  Thank you for the simple test case; I was able to reproduce the problem
easily, although not exactly: the output I got was:

[EMAIL PROTECTED] /test/sscanf> ./ssn.exe
i=2 x=1 m=a
i=0 x=0 m=_

  It turns out there has been an underlying bug that was exposed with my
earlier fix.  The problem is in /src/newlib/libc/stdio/vfscanf.c, function
__SVFSCANF_R, case CT_FLOAT, where it's parsing a float and sees an 'n':

    case 'n':
    case 'N':
      if (nancount == 0
          && (flags & (SIGNOK | NDIGITS | DPTOK | EXPOK)))
        {
          flags &= ~(SIGNOK | DPTOK | EXPOK | NDIGITS);
          nancount = 1;
          goto fok;
        }
      else if (nancount == 2)
        {
          nancount = 3;
          goto fok;
        }
      break;

  The condition at the top of the loop is meant to be testing to ensure we
haven't already parsed any of the other possible components of an FP number,
but what it actually tests is whether or not we've parsed *all* the other
possible components; that's the only case it'll refuse to accept an 'n' at
present.  The reason it used to work is because after bogusly parsing the
'n', the old version then hits this bit of code when it comes time to parse
the %c field (CT_CHAR):

    case CT_CHAR:
      /* scan arbitrary characters (sets NOSKIP) */
      if (width == 0)
        width = 1;

  I don't understand what this is doing, but it looks like some kind of
kludge that's saying "If we got here, then we know there must have been a
char to parse, so if we don't have any, we must have bogusly consumed it
already, so pretend it's there anyway".  Or something; like I say, I don't
understand it, but it looks like a kludge to me.

  Anyway, the attached patch changes the bitwise-AND (&) to an equality (==)
operator, which genuinely tests that we haven't parsed anything else at all;
it's effectively verifying that the flags haven't changed from their initial
value before beginning to attempt to parse the possible 'NaN' string.  This
fixes the testcase for me: I now see

[EMAIL PROTECTED] /test/sscanf> ./ssn.exe
i=2 x=1 m=a
i=2 x=1 m=n

and indeed, with an expanded version of it, which also verifies the number
of characters consumed, I see:

[EMAIL PROTECTED] /test/sscanf> cat ssn.c
#include <stdio.h>
int main()
{
double x;
char   m;
int    i, n;

x = 0.0;
m = '_';
n = -1;
i = sscanf("1.0a", "%lf%c%n", &x, &m, &n);
printf("i=%d x=%g m=%c n=%d\n", i, x, m, n);
x = 0.0;
m = '_';
n = -1;
i = sscanf("1.0n", "%lf%c%n", &x, &m, &n);
printf("i=%d x=%g m=%c n=%d\n", i, x, m, n);
x = 0.0;
m = '_';
n = -1;
i = sscanf("1.0na", "%lf%c%n", &x, &m, &n);
printf("i=%d x=%g m=%c n=%d\n", i, x, m, n);
x = 0.0;
m = '_';
n = -1;
i = sscanf("1.0nan", "%lf%c%n", &x, &m, &n);
printf("i=%d x=%g m=%c n=%d\n", i, x, m, n);
x = 0.0;
m = '_';
n = -1;
i = sscanf("1.0e", "%lf%c%n", &x, &m, &n);
printf("i=%d x=%g m=%c n=%d\n", i, x, m, n);
x = 0.0;
m = '_';
n = -1;
i = sscanf("1.0f", "%lf%c%n", &x, &m, &n);
printf("

RE: RFC: Fix partial NaN-parsing problem [was RE: sscanf problem]

2005-04-27 Thread Dave Korn
Original Message
>From: Jean-Christophe Kablitz
>Sent: 27 April 2005 00:22


> Maybe I have been misusing sscanf. Or there is a relationship with the
> NaN-parsing problem of the "newlib".
> 
> Best regards.
> Jean-Christophe K.


  Almost certainly so; thanks for the test case, I'll get onto it later
today.


cheers,
  DaveK
-- 
Can't think of a witty .sigline today





Re: RFC: Fix partial NaN-parsing problem [was RE: sscanf problem]

2005-04-26 Thread Jean-Christophe Kablitz
Hello,
I have noticed that, when parsing {a float_value immediately followed by
'n' or 'N'} with the "%f%c" format, the sscanf function of cygwin-1.5.16-1
behaves differently from the sscanf function of cygwin-1.5.14-1.
Up to and including cygwin-1.5.14-1, 'n' matches %c, while with cygwin-1.5.15-1
and cygwin-1.5.16-1, 'n' is no longer assigned to %c.

In the following test case, I would expect the program to output
i=2 x=1 m=a
i=2 x=1 m=n
as was the case up to and including cygwin-1.5.14-1.
With cygwin-1.5.15-1 and cygwin-1.5.16-1, the program outputs instead
i=2 x=1 m=a
i=1 x=1 m=_
Maybe I have been misusing sscanf. Or there is a relationship with the 
NaN-parsing problem of the "newlib".

Best regards.
Jean-Christophe K.
--- beginning of test case ---
jck:/sscanf> cat ssn.c
#include <stdio.h>
int main()
{
   double x;
   char   m;
   int    i;
   x = 0.0;
   m = '_';
   i = sscanf("1.0a", "%lf%c", &x, &m);
   printf("i=%d x=%g m=%c\n", i, x, m);
   x = 0.0;
   m = '_';
   i = sscanf("1.0n", "%lf%c", &x, &m);
   printf("i=%d x=%g m=%c\n", i, x, m);
   return 0;
}
jck:/sscanf> gcc -O0 ssn.c -o ssn.exe
jck:/sscanf> ./ssn.exe
i=2 x=1 m=a
i=1 x=1 m=_
--- end of test case ---


Re: RFC: Fix partial NaN-parsing problem [was RE: sscanf problem]

2005-04-05 Thread Jeff Johnston
Patch checked in.  Thanks.
-- Jeff J.
Dave Korn wrote:
Original Message
From: Dave Korn
Sent: 04 April 2005 19:07

Original Message
From: Dave Korn
Sent: 04 April 2005 18:51

Original Message
From: Michael Hines
Sent: 04 April 2005 19:43

The following program prints
i=1 x=0
instead of
i=0 x=10
when using the latest version of cygwin1.dll.

 No, hang on, on checking the newlib-l archive that seems to have been
something to do with a zero exponent.  This is a separate bug: it accepts
the first one or two characters of 'nan' and says "ok, everything's still
good", and then because it's reached the end of the string it treats that
as a successful parse; it forgets to verify that it doesn't have an
outstanding half-formed NaN.  I'll post a (provisional) patch shortly.

  Ok, this is only provisional, because as I point out I'm not quite sure
about the corner case where we've refilled the buffer.  It also has minor
formatting issues (slightly long lines in the comment, IMO).  However, it
fixes the testcase, and I've got to go home for the evening, so here's my
work-in-progress; comments welcomed.
--
[EMAIL PROTECTED] /test/sscanf> cat ss.c
#include <stdio.h>
int main() {
 int i;
 double x;
 x = 10;
 i = sscanf("n", "%lf", &x);
 printf("i=%d x=%g\n", i, x);
 i = sscanf("nan", "%lf", &x);
 printf("i=%d x=%g\n", i, x);
 return 0;
}

[EMAIL PROTECTED] /test/sscanf> gcc -O0 -g ss.c -o ss.exe
[EMAIL PROTECTED] /test/sscanf> ./ss.exe
i=0 x=10
i=1 x=NaN
[EMAIL PROTECTED] /test/sscanf>
--


cheers,
  DaveK
