On Thu, Feb 22, 2018 at 11:57 AM, Sebastian Berg <sebast...@sipsolutions.net> wrote:
> First, you are right...Decimal is not the right module for this. I
> think instead I should use the 'fractions' module for loading grid
> spec information from strings (command-line, configs, etc). The
> tricky part is getting the yaml reader to use it instead of
> converting to a float under the hood.

I'm not sure fractions is any better (or necessary, anyway) -- in the end, you need floats, so the inherent limitations of floats aren't the problem. In your original use-case, you wanted a 32 bit float grid in the end, so doing the calculations in 64 bit float and then downcasting is as good as you're going to get, and easy and fast. And I suppose you could use 128 bit float if you want to get to 64 bit in the end -- not as easy, and python itself doesn't have it.

> The tricky part is getting the yaml reader to use it instead of
> converting to a float under the hood.

64 bit floats support about 15 decimal digits -- are your string-based sources providing more than that? If not, then the 64 bit float version is as good as it's going to get.

> Second, what has been pointed out about the implementation of arange
> actually helps to explain some oddities I have encountered. In some
> situations, I have found that it was better for me to produce the
> reversed sequence, and then reverse that array back and use it.

Interesting -- though I'd still say "don't use arange" is the "correct" answer.

> Third, it would be nice to do what we can to improve arange()'s
> results. Would we be ok with a PR that uses fma() if it is available,
> but then falls back on a regular multiply and add if it isn't
> available, or are we going to need to implement it ourselves for
> consistency?

I would think calling fma() if supported would be fine -- if there is an easy macro to check if it's there.
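To illustrate the compute-in-64-bit-then-downcast point above (a sketch of the idea, not of NumPy internals -- the start/step values are made up):

```python
import numpy as np

start, step, n = 0.0, 0.1, 10

# Do the arithmetic in 64 bit float, then downcast once at the end:
grid64 = start + step * np.arange(n + 1)   # float64 throughout
grid32 = grid64.astype(np.float32)         # one extra rounding per point

# float64 is good for ~15 decimal digits, float32 for only ~6-7, so
# the float64 rounding error is invisible at float32 resolution.
print(np.finfo(np.float64).precision)  # 15
print(np.finfo(np.float32).precision)  # 6
```

The same reasoning is why string-based sources rarely matter here: unless they carry more than ~15 significant digits, float64 already preserves everything they specify.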
I don't know if numpy has a policy about this sort of thing, but I'm pretty sure everywhere else the final details of computation fall back to the hardware/compiler/library (i.e. Intel uses extended precision fp, other platforms don't, etc), so I can't see that having a slightly more accurate computation in arange on some platforms and not others would cause a problem. If any of the tests are checking to that level of accuracy, they should be fixed :-)

> 2. It sounds *not* like a fix, but rather a
> "make it slightly less bad, but it is still awful"

Exactly -- better accuracy is a good thing, but it's not the source of the problem here -- the source of the problem is inherent to FP, and/or a poorly specified goal. Having arange or linspace lose a couple fewer ULPs isn't going to change anything.

> Using fma inside linspace might make linspace a bit more exact if
> possible, and would be a good thing, though I am not sure we have a
> policy yet for something that is only used sometimes

See above -- nor do I, but it seems like a fine idea to me.

> It also would be nice to add stable summation to numpy in general (in
> whatever way), which maybe is half related but on nobody's specific
> todo list.

I recall a conversation on this list a (long) while back about compensated summation (Kahan summation) -- I guess nothing ever came of it?

> Lastly, there definitely needs to be a better tool for grid making.
> The problem appears easy at first, but it is fraught with many
> pitfalls and subtle issues. It is easy to say, "always use
> linspace()", but if the user doesn't have the number of pixels, they
> will need to calculate that using -- gasp! -- floating point
> numbers, which could result in the wrong answer.

Agreed -- this tends to be an inherently over-specified problem:

min_value
max_value
spacing
number_of_grid_spaces

That is four values, and only three independent ones.
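For reference, the compensated (Kahan) summation mentioned above is only a few lines -- this is a plain-Python sketch of the classic algorithm, not anything currently in numpy:

```python
def kahan_sum(values):
    """Compensated (Kahan) summation.

    Carries a running correction for the low-order bits lost in each
    addition, so the total error stays at a few ULPs instead of
    growing with the number of terms.
    """
    total = 0.0
    c = 0.0                  # running compensation
    for x in values:
        y = x - c            # apply the stored correction
        t = total + y        # low-order bits of y may be lost here...
        c = (t - total) - y  # ...recover them algebraically
        total = t
    return total

# 0.1 is not exactly representable in binary; a naive loop drifts,
# while the compensated sum stays essentially exact.
naive = sum([0.1] * 1000)
compensated = kahan_sum([0.1] * 1000)
print(naive, compensated)
```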
arange() looks like it uses: min_value, max_value, spacing -- but it doesn't really (see previous discussion), so it's not the right tool for anything.

linspace() uses: min_value, max_value, (number_of_grid_spaces + 1), which is about as good as you can get (except for that annoying + 1).

But what if you are given min_value, spacing, number_of_grid_spaces? Maybe we need a function for that? Which I think would simply be:

np.arange(number_of_grid_spaces + 1) * spacing + min_value

Which is why we probably don't need a function :-) (Note that that's only the error of one multiplication and one addition per grid point.)

Or maybe a way to take all four values, and return a "best fit" grid. The problem with that is that it's over-specified, and it may not be only fp error that makes it not fit. What should the code do then?

So Ben: What is the problem you are trying to solve? -- I'm still confused. What information do you have to define the grid?

Maybe all we need are docs for how to best compute a grid with given specifications? And point to them in the arange() and linspace() docstrings.

-CHB

> I once wanted to add a "step" argument to linspace, but didn't in the
> end, largely because it basically enforced in a very convoluted way
> that the step fit exactly to a number of steps (up to floating point
> precision), and nobody was quite sure it was a good idea, since it
> would just be useful for a little convenience when you do not want to
> calculate the steps.

Exactly!

-CHB

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959 voice
7600 Sand Point Way NE   (206) 526-6329 fax
Seattle, WA 98115        (206) 526-6317 main reception

chris.bar...@noaa.gov
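P.S. The min_value/spacing/number_of_grid_spaces one-liner above, written out as a runnable helper (the name make_grid is made up for illustration):

```python
import numpy as np

def make_grid(min_value, spacing, n_spaces):
    """Grid defined by min_value, spacing, and number of grid spaces.

    Each point is one multiply plus one add away from exact integer
    indices, so rounding error stays at a couple of ULPs per point --
    no accumulated drift the way a repeated-addition loop would have.
    """
    return min_value + spacing * np.arange(n_spaces + 1)

g = make_grid(1.0, 0.1, 5)   # 6 points from 1.0 to (about) 1.5
print(g)
```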
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion