#14317: numberformat.format produces wrong results
---------------------------------------------+------------------------------
          Reporter:  akaariai                |         Owner:  akaariai
            Status:  new                     |     Milestone:
         Component:  Internationalization    |       Version:  1.2
        Resolution:                          |      Keywords:  Localization, number formatting
             Stage:  Design decision needed  |     Has_patch:  1
        Needs_docs:  1                       |   Needs_tests:  1
Needs_better_patch:  1                       |
---------------------------------------------+------------------------------
Changes (by akaariai):

  * needs_better_patch:  0 => 1
  * has_patch:  0 => 1
  * stage:  Unreviewed => Design decision needed
  * needs_tests:  0 => 1
  * needs_docs:  0 => 1

Comment:

 The proposed patch above does not work correctly. Decimal(10**100) is
 formatted in floating point precision.

 Formatting every type passed in so that the result is always correct is
 messy and will perform poorly. For example, a large float needs to either
 be cast to Decimal and then back to a string, or its exponent form needs
 to be transformed directly into a string, which is slow and error prone.
 To make things worse, the formatting of floats depends on the underlying
 C library
 (http://docs.python.org/library/functions.html#float).

 So, to solve this ticket, one possibility is to check whether
 unicode(number) has an e in it and, if so, return it directly. This way
 everything works and gives the right result, except for large floats
 (returns 1.2e+100), small floats (returns 1.2e-100) and small decimals
 (again returns 1.2e-100). It is possible to fix the small numbers by using
 '%.(decimal_pos)f', but this rounds. This means that decimals and floats
 would be rounded, while strings would not. Moreover, it is not wise to use
 '%.(decimal_pos)f' on large numbers: Decimal(10**100), or a large number
 passed in as a string, would be returned in floating point precision.
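
 A minimal sketch of the "return the exponent form directly" idea, written
 for Python 3 (so str() stands in for unicode()). The helper name
 format_or_passthrough and its parameters are hypothetical and are not the
 actual numberformat.format signature. Truncating instead of rounding is
 assumed here only to illustrate the string-based path:

```python
from decimal import Decimal


def format_or_passthrough(number, decimal_sep=".", decimal_pos=None):
    """Hypothetical helper, NOT Django's numberformat.format."""
    str_number = str(number)
    # Exponent notation ('1e+37', '1E-29'): hand the string back untouched.
    if "e" in str_number.lower():
        return str_number
    sign = ""
    if str_number.startswith("-"):
        sign, str_number = "-", str_number[1:]
    int_part, _, dec_part = str_number.partition(".")
    if decimal_pos is not None:
        # String-based path: truncate or zero-pad, no rounding.
        dec_part = dec_part[:decimal_pos].ljust(decimal_pos, "0")
    # Special values such as 'inf' and 'nan' are not handled in this sketch.
    if dec_part:
        return sign + int_part + decimal_sep + dec_part
    return sign + int_part


print(format_or_passthrough(1e37))                             # exponent form passed through
print(format_or_passthrough(Decimal(10**100)))                 # all digits kept
print(format_or_passthrough(Decimal("0.666"), decimal_pos=2))  # truncated, not rounded
```

 With this approach Decimal(10**100) keeps all of its digits, while large
 floats and small floats or decimals come back in exponent form.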

 As said, it is possible to get almost any case to work if enough special
 casing is done. But it will require much code and will probably be much
 slower than the current code, at least for some cases.

 I would say the best solution, at least for now, is to return the number
 in exponent format if that is what unicode(number) returns, and to
 document that this is how things work. This is backwards incompatible, as
 the current expectation is that any number is formatted in non-exponent
 format. However, this is not how things actually work. Another
 possibility is to cast everything to float and, if the result has an e in
 it, return it directly; otherwise continue as now. This way number
 formatting is most consistent...
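
 The float-cast alternative from the paragraph above could look roughly
 like this in Python 3. The function name exponent_form_via_float is
 hypothetical; a None return stands for "continue with the normal
 formatting path":

```python
from decimal import Decimal


def exponent_form_via_float(number):
    """Hypothetical helper: return the exponent form, or None if there is none."""
    text = repr(float(number))
    # Python switches to exponent notation for very large and very small floats.
    return text if "e" in text else None


# The same large value gives the same answer for every input type.
for value in (10**100, Decimal(10**100), "1" + "0" * 100, 1e100):
    print(type(value).__name__, exponent_form_via_float(value))

# Ordinary values fall through to the normal formatting path.
print(exponent_form_via_float(Decimal("0.666")))
```

 The trade-off is that large ints, decimals and strings lose their exact
 digits, in exchange for identical behaviour across input types.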

 Some code to show the problems:

 {{{
 >>> from decimal import Decimal as d
 >>> str(10000000000000000000000000000000000000.0)  # large float: wrong result
 '1e+37'
 >>> str(10000000000000000000000000000000000000)  # ints work correctly
 '10000000000000000000000000000000000000'
 >>> str(d('10000000000000000000000000000000000000.000000000001'))  # as do decimals
 '10000000000000000000000000000000000000.000000000001'
 >>> # formatting large decimals using '%.nf' is a bad idea
 >>> '%.2f' % d('10000000000000000000000000000000000000.000000000001')
 '9999999999999999538762658202121142272.00'
 >>> str(d('0.00000000000000000000000000001'))  # small decimals do not work correctly
 '1E-29'
 >>> '%.2f' % d('0.00000000000000000000000000001')  # for small decimals this is a nice way
 '0.00'
 >>> '%.2f' % d('0.666')  # and we get rounding
 '0.67'
 >>> # but converting this string to a 2 decimal_pos format is messy because of rounding
 >>> str(d('0.666'))
 '0.666'
 >>> str(float('inf'))  # just a reminder that there are still more possibilities...
 'inf'
 }}}

 One more thing: when considering alternatives, it is good to remember
 that this is a very performance-sensitive area. After all, the standard
 template rendering benchmark is to render a large table of numbers, and
 for good reason.

-- 
Ticket URL: <http://code.djangoproject.com/ticket/14317#comment:2>
Django <http://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
