[v8-dev] Re: StringToDouble rewritten not using String::Get and memory allocations.... (issue1096002)

Florian Loitsch Tue, 23 Mar 2010 02:27:41 -0700

It's a pity that most of the time more than 700 bytes on the stack are
wasted, but I don't see any easier solution. I agree with Erik (offline
discussion) that ~780 bytes on the stack should not be a big issue since the
memory is not initialized and we are not calling this function recursively.
I vote for allocating the big buffer on the stack.
btw: it would be nice to have one example of a halfway case with 771 digits.
(Note that sprintf with "%.771e" prints 771 digits after the decimal point,
i.e. 772 significant digits.)
// florian


On Mon, Mar 22, 2010 at 7:53 PM, Sergey Ryazanov <se...@google.com> wrote:

> It seems that strtod rounds a decimal to the closest number
> representable as a double ("24414062505131250" parses as
> 24414062505131248 and "24414062505131250.0.....01" parses as
> 24414062505131252). As far as I can tell from the GNU source code, it
> uses multiprecision numbers for the exact representation of numbers.
>
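
The halfway behaviour quoted above can be checked directly. This is only a
minimal sketch: ParseWithStrtod is a hypothetical helper name, and it assumes
a C library whose strtod rounds correctly (glibc's does) and the default "C"
locale.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: parses `text` with the C library's strtod.
// Assumes the whole string is a valid decimal literal.
double ParseWithStrtod(const std::string& text) {
  char* end = nullptr;
  double value = std::strtod(text.c_str(), &end);
  assert(end == text.c_str() + text.size());  // nothing was left unparsed
  return value;
}
```

Between 2^54 and 2^55 adjacent doubles are 4 apart, so 24414062505131250 lies
exactly between 24414062505131248 and 24414062505131252. The tie goes to the
neighbour with the even significand (…248), and any nonzero digit further to
the right pushes the result up to …252.
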
> Any double is representable as d*2^p where 0 <= d < 2^53 and
> -1074 <= p <= ... (the lower bound covers subnormal numbers). Any
> positive double x has an exact decimal representation:
> 1) if 0 < x < 1: not more than ~770 significant digits (2^53 * 5^1074)
> 2) 2^53 < x < DBL_MAX: not more than 309 significant digits since it's
> an integer and DBL_MAX ≈ 1.79769 × 10^308.
> 3) 1 < x < 2^53: not more than 60 significant digits.
> (Significant digits do not include leading or trailing zeros.)
>
> Suppose we have a decimal with more than 770 significant digits. We want
> to find the double closest to our number. Dropping the remaining digits
> (or changing them to any other digits) would give us the right result
> unless our number lies exactly between 2 adjacent doubles. The mean of 2
> adjacent doubles has no more than 771 digits (all the others would be
> zeros). If the first digits of our number are equal to the digits of
> that mean, the result of rounding depends on whether the rest of the
> digits are all zeros.
>
> So the conclusion is the following: if we preserve at least 771
> significant digits and replace any nonzero tail by '1', we never change
> the behavior of strtod.
>
>
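
The truncation rule in the conclusion above could be sketched roughly as
follows. This is not the code under review: TruncatedDigits and
TruncateWithStickyDigit are hypothetical names, and the sketch assumes the
input is a plain string of significant decimal digits with the sign, the
decimal point and leading zeros already removed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical result type for the sketch below.
struct TruncatedDigits {
  std::string digits;  // at most max_digits + 1 characters
  int exponent_shift;  // power of ten to add to the original decimal exponent
};

// Keeps the first `max_digits` digits. If any dropped digit is nonzero, a
// single '1' is appended so that the value still compares as "slightly
// above" the kept prefix when it is rounded later.
TruncatedDigits TruncateWithStickyDigit(const std::string& digits,
                                        std::size_t max_digits = 771) {
  assert(max_digits > 0);
  if (digits.size() <= max_digits) return {digits, 0};
  TruncatedDigits result{digits.substr(0, max_digits),
                         static_cast<int>(digits.size() - max_digits)};
  if (digits.find_first_not_of('0', max_digits) != std::string::npos) {
    result.digits.push_back('1');  // nonzero tail collapsed into one digit
    --result.exponent_shift;       // the extra digit uses one decimal place
  }
  return result;
}
```

With max_digits = 3, "1234567" becomes "1231" with a shift of 3 (1231 * 10^3),
while "1230000" becomes "123" with a shift of 4, because its tail is all
zeros. With the default of 771 the result fits in 772 characters, which is
consistent with the ~780 byte stack buffer discussed at the top of this mail.
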
> On Sat, Mar 20, 2010 at 7:56 PM,  <floitsc...@gmail.com> wrote:
> > I will discuss the
> > 100000000000000000000000.0000000000000000000000000000000000000000000001
> > issue with the V8 team on Monday.
> > The way I see it we have two options:
> > 1. Follow ECMA-262 and round down, thus being incompatible with older
> > versions.
> > 2. Fall back to a more expensive reading when there are more than 20
> > digits.
> >
> > Pros/Cons for 1:
> > Pro: Basically nothing to do. That's what we have now.
> > Cons: Incompatible, and we might read numbers the "wrong" way. On the
> > other hand, these numbers have to be written by hand
> > (toString/toExponential/toFixed will never produce a number that would
> > cause such problems). Therefore they are extremely rare.
> >
> > Pros/Cons for 2:
> > Pro: Compatible with older variants of V8. Reading is correct. Might
> > slightly simplify the fast case: the exponent would need to be in the
> > range -999 to 999.
> > Cons: We would need to keep/add a fallback method. Maybe a template
> > taking either a fixed-size buffer or a dynamic vector would do the
> > trick, though.
> >
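
One possible shape for the "template taking either a fixed-size buffer or a
dynamic vector" idea is sketched below. All names (FixedSink, GrowingSink,
CopyDigits) are made up for illustration and do not come from the patch; the
only assumption is that the digit-copying loop can be written once against a
small Append() interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Fixed-capacity sink for the fast path. The storage is deliberately left
// uninitialized; Append() refuses characters once the buffer is full.
template <std::size_t N>
class FixedSink {
 public:
  bool Append(char c) {
    if (size_ == N) return false;
    data_[size_++] = c;
    return true;
  }
  std::size_t size() const { return size_; }

 private:
  char data_[N];
  std::size_t size_ = 0;
};

// Heap-backed sink for the slow path; it never refuses a character.
class GrowingSink {
 public:
  bool Append(char c) {
    data_.push_back(c);
    return true;
  }
  std::size_t size() const { return data_.size(); }

 private:
  std::vector<char> data_;
};

// Copies the leading decimal digits of `text` into `sink` and returns how
// many digits did not fit.
template <typename Sink>
std::size_t CopyDigits(const std::string& text, Sink* sink) {
  assert(sink != nullptr);
  std::size_t dropped = 0;
  for (char c : text) {
    if (c < '0' || c > '9') break;
    if (!sink->Append(c)) ++dropped;
  }
  return dropped;
}
```

A caller could try FixedSink<20> first and only fall back to GrowingSink when
CopyDigits reports dropped digits, so the common case stays allocation-free.
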
> >
> >
> > http://codereview.chromium.org/1096002/diff/3002/4003
> > File src/conversions.cc (right):
> >
> > http://codereview.chromium.org/1096002/diff/3002/4003#newcode109
> > src/conversions.cc:109: bool operator != (EndMarker const& m) const {
> > return !(*this == m); }
> > On 2010/03/19 15:46:12, SeRya wrote:
> >> On 2010/03/18 20:34:22, Erik Corry wrote:
> >> > Some funky C++ here :-).  return !end_; seems simpler, but perhaps
> >> > this is somehow better?
> >>
> >> Just a canonical form of != which simplifies maintenance (IMHO).
> >
> > I'm with Erik here.
> > I still don't understand how this actually type-checks (although I'm by
> > no means a C++ expert).
> > Also, operator overloading should be rare in Google code.
> > Why not Peek(), AtEnd(), etc.?
> > That said, I'm not very familiar with V8 coding practices.
> >
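
For comparison, the Peek()/AtEnd() style suggested above might look like the
sketch below. CharCursor and SkipLeadingZeros are hypothetical names, not
classes from src/conversions.cc, and the sketch assumes a plain pointer range
over the input characters.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical cursor over a [begin, end) character range. Named methods
// replace comparisons against an end-marker object.
class CharCursor {
 public:
  CharCursor(const char* begin, const char* end)
      : current_(begin), end_(end) {}

  bool AtEnd() const { return current_ == end_; }

  // Reading or advancing at the end is a programming error, so assert it.
  char Peek() const {
    assert(!AtEnd());
    return *current_;
  }
  void Advance() {
    assert(!AtEnd());
    ++current_;
  }

 private:
  const char* current_;
  const char* end_;
};

// Skips leading '0' characters and returns how many were skipped.
std::size_t SkipLeadingZeros(CharCursor* cursor) {
  std::size_t skipped = 0;
  while (!cursor->AtEnd() && cursor->Peek() == '0') {
    cursor->Advance();
    ++skipped;
  }
  return skipped;
}
```

The asserts in Peek() and Advance() also make the "is there at least one
character?" precondition explicit at every access.
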
> > http://codereview.chromium.org/1096002/diff/3002/4003#newcode496
> > src/conversions.cc:496: const int max_exponent = INT_MAX / 2;
> > On 2010/03/19 15:46:12, SeRya wrote:
> >> On 2010/03/19 13:39:43, Florian Loitsch wrote:
> >> > This seems to be too complicated. A decimal number without leading
> >> > 0s may only have a decimal exponent of ~-400 to ~+400 before ending
> >> > up being infinite or 0.
> >>
> >> 1<1000 zeros>e-1000 == 1.
> >
> > Right you are.
> >
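
The counterexample above is easy to reproduce with the C library. This is a
minimal sketch under the assumption that strtod handles long digit strings
exactly (glibc's does); ParseCancelledExponent is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Builds "1" followed by `zeros` zero digits and the exponent "e-<zeros>",
// then parses the string with strtod. Mathematically the value is exactly 1.
double ParseCancelledExponent(std::size_t zeros) {
  const std::string text =
      "1" + std::string(zeros, '0') + "e-" + std::to_string(zeros);
  char* end = nullptr;
  double value = std::strtod(text.c_str(), &end);
  assert(end == text.c_str() + text.size());  // the whole string was consumed
  return value;
}
```

ParseCancelledExponent(1000) yields 1.0 although the written exponent is
-1000, so the exponent that is read cannot be clamped to about +/-400 without
also taking the number of digits into account.
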
> > http://codereview.chromium.org/1096002/diff/3002/4003#newcode519
> > src/conversions.cc:519: if (exponent != 0) {
> > On 2010/03/19 15:46:12, SeRya wrote:
> >> On 2010/03/19 13:39:43, Florian Loitsch wrote:
> >> > Not that it really matters, but you could copy the exponent
> >> > characters while reading them, and just stop after 4 digits.
> >> > This way you could avoid this part here.
> >>
> >> It would mean another chunk of code that drops leading zeros and
> >> checks for a junk tail. I'd prefer to simplify for now and maybe add
> >> this optimization later.
> >
> > My comment was based on the assumption that the read exponent was in
> > the range -400 to +400. So disregard it.
> >
> > http://codereview.chromium.org/1096002/diff/21004/27003#newcode298
> > src/conversions.cc:298: // 1. currnet == end (other ops are not
> > allowed), current != end.
> > Are we sure there is at least one character?
> > If yes assert it.
> > If not, and it is legal to access current[0] of empty string, explain.
> >
> > http://codereview.chromium.org/1096002/diff/21004/27003#newcode330
> > src/conversions.cc:330: buffer[buffer_pos++] = '-';
> > It might make sense to move the hexadecimal reading into a separate
> > function.
> >
> > http://codereview.chromium.org/1096002/diff/21004/27003#newcode405
> > src/conversions.cc:405: if (current == end) return signed_zero;
> > I think it makes more sense to structure as follows:
> > if (current == end) {
> >  if (significant_digits == 0 && !leading_zero) {
> >    // String was ".".
> >    return JUNK_STRING_VALUE;
> >  } else {
> >    goto parsing_done;
> >  }
> > }
> > if (significant_digits == 0) {
> >  octal = false;
> >  ...
> > }
> >
> > http://codereview.chromium.org/1096002/diff/21004/27003#newcode451
> > src/conversions.cc:451: }
> > How should "123e" be parsed when trailing junk is enabled?
> > As "123" or as JUNK_STRING_VALUE?
> > If it's the latter, then this is fine.
> >
> > http://codereview.chromium.org/1096002/diff/21004/27003#newcode456
> > src/conversions.cc:456: ++current;
> > As before: should "123e+" be parsed as 123 or as JUNK_STRING_VALUE when
> > trailing junk is enabled?
> >
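
As a point of reference for the two "123e" questions above, the C library's
strtod takes the longest valid prefix and leaves an incomplete exponent
unparsed. Whether V8 should do the same when trailing junk is allowed is
exactly the open question; the sketch below only shows the strtod behaviour,
and ParsedPrefixLength is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Stores the parsed value in `*value` and returns how many leading
// characters of `text` strtod accepted as part of the number.
std::size_t ParsedPrefixLength(const char* text, double* value) {
  assert(text != nullptr && value != nullptr);
  char* end = nullptr;
  *value = std::strtod(text, &end);
  return static_cast<std::size_t>(end - text);
}
```

For "123e" and "123e+" only the three characters "123" are consumed and the
value is 123; for "123e+2" all six characters are consumed and the value is
12300.
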
> > http://codereview.chromium.org/1096002
> >
>

-- 
v8-dev mailing list
v8-dev@googlegroups.com
http://groups.google.com/group/v8-dev

To unsubscribe from this group, send email to 
v8-dev+unsubscribegooglegroups.com or reply to this email with the words 
"REMOVE ME" as the subject.
