#1161: g.region and r.info decimel issue when using grass python libs
-------------------------+--------------------------------------------------
  Reporter:  isaacullah  |       Owner:  grass-...@…
      Type:  defect      |      Status:  closed
  Priority:  normal      |   Milestone:  6.4.1
 Component:  Python      |     Version:  6.4.0
Resolution:  invalid     |    Keywords:
  Platform:  All         |         Cpu:  All
-------------------------+--------------------------------------------------

Comment(by glynn):

Replying to [comment:3 cmbarton]:
> > These functions simply parse the (decimal) output from g.region and r.info. Python has printf-like formatting operations if you wish to use them.
>
> Actually this is not what seems to be happening.

Yes it is.

> g.region and r.info produce single precision values, as expected.

g.region and r.info produce '''decimal''' values using G_format_{northing,easting}, which uses either %.15g or %.8f (except for lat/lon, which uses DMS). Both of these are better than IEEE single-precision, which has between 6 and 7 decimal digits. %.15g uses 15 significant decimal digits (trailing zeros after the decimal point are omitted, as is the decimal point itself if it is not required); %.8f uses as many digits as are required before the decimal point and a further 8 digits after it.

> But the python library functions do not seem to be getting values from these--

The Python functions are wrappers around "g.region -g" and "r.info -rgstmpud", which parse the output into a dictionary, with the strings parsed using float(), int() or float_or_dms() as appropriate.

> or are doing something strange with the values after the fact

Yes; if by "strange" you mean converting them to (double precision) binary floating point values (which is a lossy operation; 10^-n^ (for n >= 1) isn't exactly representable in binary).

OTOH, that isn't all that strange, given that the values started out as floating point before g.region/r.info converted them to decimal (which itself may be lossy; %.15g isn't quite enough for double precision, which has slightly better than 15 decimal digits of precision).

> --in order to come up with double precision values. The result is that the values in the dictionary produced by grass.region() and grass.raster_info() are *different* from the values that come from g.region or r.info. Therein lies the problem.

The values which come from g.region or r.info are '''strings''', each comprising a decimal representation of a number.

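As a rough illustration of the parsing step described above, the following sketch parses a decimal string with float(), as the wrappers do, and formats it back with two different precisions. The coordinate value and the helper name parse_and_reformat are made up for the example; this is not the code in lib/python/core.py.

```python
from decimal import Decimal


def parse_and_reformat(text):
    """Parse a decimal string with float() and format the result back."""
    value = float(text)
    return {
        "value": value,
        # Decimal(float) shows the exact binary value that was stored.
        "exact": Decimal(value),
        # 15 significant digits, the same format string G_format_* may use.
        "g15": "%.15g" % value,
        # 17 significant digits are enough to identify a double uniquely.
        "g17": "%.17g" % value,
    }


# Hypothetical northing, as it might appear in "g.region -g" output.
result = parse_and_reformat("4928000.1")

print(result["g15"])    # short form
print(result["g17"])    # long form, exposes the binary rounding
print(result["exact"])  # exact stored value
```

The 15-digit form matches the input string, while the stored binary value does not equal the decimal 4928000.1 exactly; which of the two you see depends only on how many digits are printed.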
Most of the things which you might want to do with that information will expect numbers rather than strings, so the Python functions convert them to numbers automatically.

We could use Python's "decimal" package, although that doesn't work with everything, still doesn't necessarily give you the original value, and serves no purpose other than to work around bugs in scripts which expect to be able to perform floating-point comparisons using "==" or (worse still) string comparison. But if someone is making that kind of mistake, they will have far bigger problems.

If you really need the exact output from g.region/r.info, use grass.parse_command() (which will parse key/value output into a dictionary but will leave the values as strings). But don't expect other commands to return identical strings for the same information; there is no one "correct" format string for coordinates.

> A region set using g.region is different from a region set using grass.region(). The difference is not much

In the example given, it's around 10 microns. I'm not convinced that there is a single set of geospatial data in existence which genuinely has that accuracy.

> but it is enough to cause problems if you are comparing regions in a boolean way

Which is a bug, and not one which will be solved by any changes to the Python library. Any program which parses the output from g.region or r.info will have exactly the same issues.

> or trying to overlay maps created with a setting in g.region and maps created with a setting from grass.region().

Even on the largest map, the differences are nowhere near half a cell, which is what would be required to move the sample point into the next cell.

> My only guess is that somehow grass.region() is populating its dictionary via a swig/ctype call instead of just parsing g.region.

It's just parsing the output from "g.region -g" via Python's float() function:

http://trac.osgeo.org/grass/browser/grass/trunk/lib/python/core.py#L525
http://trac.osgeo.org/grass/browser/grass/trunk/lib/python/core.py#L485

> If this guess is wrong, then something else is happening to the values after they are generated by g.region and before they go into the python dictionary.

The only "something else" is that grass.region() parses the decimal string to a float, and "print" converts it back to a decimal string. Both of these operations are lossy. But then just about anything which you do with a floating-point value is lossy, including parsing the values from the WIND/cellhd file in the first place.

Parsing a decimal string to a floating-point value is inherently lossy. Converting a floating-point value to a decimal isn't inherently lossy, but in practice you invariably use far fewer digits than are required for an exact representation, as the exact representation requires roughly 3 times as many digits as are necessary for a unique representation.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/1161#comment:4>
GRASS GIS <http://grass.osgeo.org>
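
For scripts that need to decide whether two regions are "the same", one option consistent with the points above is to compare with a tolerance tied to the cell size instead of using "==". The sketch below assumes dictionaries shaped like the output of grass.region() (keys n, s, e, w, nsres, ewres); the helper name regions_match, the default fraction and the sample values are hypothetical.

```python
def regions_match(a, b, keys=("n", "s", "e", "w"), fraction=1e-3):
    """Return True if the bounds of a and b agree to within a small
    fraction of the smaller cell dimension of region a."""
    tolerance = fraction * min(a["nsres"], a["ewres"])
    return all(abs(a[k] - b[k]) <= tolerance for k in keys)


# Made-up regions: the second differs by about 10 microns in "n",
# the third is shifted north by half a cell.
from_cli = {"n": 4928000.1, "s": 4914000.1, "e": 609000.0, "w": 590000.0,
            "nsres": 10.0, "ewres": 10.0}
from_python = dict(from_cli, n=4928000.10001)
shifted = dict(from_cli, n=4928005.1, s=4914005.1)

print(regions_match(from_cli, from_python))
print(regions_match(from_cli, shifted))
```

Tying the tolerance to the resolution keeps the check meaningful for both fine and coarse grids; a fixed absolute tolerance would be an equally reasonable choice for a single projected dataset.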