Re: Functions, floats and doubles
On Nov 13, 2009, at 1:48 PM, Mattmann, Chris A (388J) wrote:

>> Still, I think we should put it on our roadmap.

SOLR-1562.
Re: Functions, floats and doubles
On 11/13/09 11:38 AM, "Grant Ingersoll" wrote:

> If I implement Vincenty's formula for the distance between two points on an
> ellipsoid, it can be accurate down to 0.5 mm. Not doing that yet and not
> planning on implementing it, so this is not a huge issue right now.
>
> Still, I think we should put it on our roadmap.

+1

Cheers,
Chris

> On Nov 13, 2009, at 1:32 PM, Yonik Seeley wrote:
>
>> On Fri, Nov 13, 2009 at 1:01 PM, Walter Underwood
>> wrote:
>>> Float is almost never good enough. The loss of precision is horrific.
>>
>> Are you saying it's not good enough for this case (the final answer of
>> a relative distance calculation)?
>> 7 digits of precision is enough to represent a distance across the US
>> down to the meter... and points closer together would have higher
>> precision of course.
>>
>> For storage of the points themselves, 32 bit floats may also often be
>> enough (~2.4 meter resolution at the equator). Allowing doubles as an
>> option would be nice too - but I expect that doubling the fieldcache
>> may not be worth it for many.
>> Actually, a 32 bit fixed point representation would have a lot more
>> accuracy for this (256 times the resolution at the cost of on-the-fly
>> conversion to a double for calculations).
>>
>> -Yonik
>> http://www.lucidimagination.com

++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.mattm...@jpl.nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++
Re: Functions, floats and doubles
On Nov 13, 2009, at 1:35 PM, Yonik Seeley wrote:

> On Fri, Nov 13, 2009 at 1:31 PM, Grant Ingersoll wrote:
>> Yeah, in the end, Lucene Scorer returns a float... However, if we allow for
>> pseudo fields and other capabilities, it would be nice to have doubles.
>
> We have doubles already... It's just that our general purpose
> functions (like sum) don't use them.
> geo functions could certainly use them.

Yep, I have a patch for the QueryParsing, etc. to allow me to get doubles from that. It should suffice for now.

>>> But for something like gdist(point_a,point_b), the internal
>>> calculations can be done in double precision and if the result is cast
>>> to a float at the end, it should be good enough for most uses, right?
>>
>> This is what I am doing for the specific case I'm working on, but I agree
>> with Walter here. As Solr starts to evolve to power apps where you want to
>> do complex functions based on the results of queries, the loss of precision
>> can be quite bad.
>
> I agree with you all that eventually we want generic double precision support.
> What I don't understand is if anyone thinks it's a blocker for geo, and why.

Definitely not a blocker. I'll put up a patch on https://issues.apache.org/jira/browse/SOLR-1302 and people can kick it around.
Re: Functions, floats and doubles
If I implement Vincenty's formula for the distance between two points on an ellipsoid, it can be accurate down to 0.5 mm. Not doing that yet and not planning on implementing it, so this is not a huge issue right now.

Still, I think we should put it on our roadmap.

On Nov 13, 2009, at 1:32 PM, Yonik Seeley wrote:

> On Fri, Nov 13, 2009 at 1:01 PM, Walter Underwood
> wrote:
>> Float is almost never good enough. The loss of precision is horrific.
>
> Are you saying it's not good enough for this case (the final answer of
> a relative distance calculation)?
> 7 digits of precision is enough to represent a distance across the US
> down to the meter... and points closer together would have higher
> precision of course.
>
> For storage of the points themselves, 32 bit floats may also often be
> enough (~2.4 meter resolution at the equator). Allowing doubles as an
> option would be nice too - but I expect that doubling the fieldcache
> may not be worth it for many.
> Actually, a 32 bit fixed point representation would have a lot more
> accuracy for this (256 times the resolution at the cost of on-the-fly
> conversion to a double for calculations).
>
> -Yonik
> http://www.lucidimagination.com
Re: Functions, floats and doubles
Float is often OK until you try and use it for further calculation. Maybe it is good enough for printing out distance, but maybe not for further use.

wunder

On Nov 13, 2009, at 10:32 AM, Yonik Seeley wrote:

> On Fri, Nov 13, 2009 at 1:01 PM, Walter Underwood
> wrote:
>> Float is almost never good enough. The loss of precision is horrific.
>
> Are you saying it's not good enough for this case (the final answer of
> a relative distance calculation)?
> 7 digits of precision is enough to represent a distance across the US
> down to the meter... and points closer together would have higher
> precision of course.
>
> For storage of the points themselves, 32 bit floats may also often be
> enough (~2.4 meter resolution at the equator). Allowing doubles as an
> option would be nice too - but I expect that doubling the fieldcache
> may not be worth it for many.
> Actually, a 32 bit fixed point representation would have a lot more
> accuracy for this (256 times the resolution at the cost of on-the-fly
> conversion to a double for calculations).
>
> -Yonik
> http://www.lucidimagination.com
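
As a rough illustration of the "further calculation" concern, the Java sketch below adds the same small float value ten million times, once into a float accumulator and once into a double accumulator, and compares both against the product computed in double. The scenario (many equal-length legs summed into a total) and all names are assumptions made up for this example; they do not come from Solr or Lucene.

```java
// Hypothetical demo: repeated addition in float vs. double.
final class AccumulationDemo {

    /** Adds `step` to a float accumulator `count` times. */
    static float sumAsFloat(float step, int count) {
        float total = 0.0f;
        for (int i = 0; i < count; i++) {
            total += step; // rounded to float after every addition
        }
        return total;
    }

    /** Adds the same float `step` to a double accumulator `count` times. */
    static double sumAsDouble(float step, int count) {
        double total = 0.0;
        for (int i = 0; i < count; i++) {
            total += step; // step is widened to double before adding
        }
        return total;
    }

    public static void main(String[] args) {
        float step = 0.1f;       // assumed leg length, arbitrary units
        int count = 10_000_000;  // assumed number of legs
        double reference = (double) step * count;

        System.out.println("reference      = " + reference);
        System.out.println("float total    = " + sumAsFloat(step, count));
        System.out.println("double total   = " + sumAsDouble(step, count));
    }
}
```

The float total drifts because, once the running sum is large, each added step is rounded to the spacing of floats at that magnitude. A single distance cast to float at the end does not suffer from this; the drift only shows up when float results are fed into long chains of further arithmetic.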
Re: Functions, floats and doubles
On Fri, Nov 13, 2009 at 1:31 PM, Grant Ingersoll wrote:

> Yeah, in the end, Lucene Scorer returns a float... However, if we allow for
> pseudo fields and other capabilities, it would be nice to have doubles.

We have doubles already... It's just that our general purpose functions (like sum) don't use them. geo functions could certainly use them.

>> But for something like gdist(point_a,point_b), the internal
>> calculations can be done in double precision and if the result is cast
>> to a float at the end, it should be good enough for most uses, right?
>
> This is what I am doing for the specific case I'm working on, but I agree
> with Walter here. As Solr starts to evolve to power apps where you want to
> do complex functions based on the results of queries, the loss of precision
> can be quite bad.

I agree with you all that eventually we want generic double precision support. What I don't understand is if anyone thinks it's a blocker for geo, and why.

-Yonik
http://www.lucidimagination.com
Re: Functions, floats and doubles
On Fri, Nov 13, 2009 at 1:01 PM, Walter Underwood wrote:

> Float is almost never good enough. The loss of precision is horrific.

Are you saying it's not good enough for this case (the final answer of a relative distance calculation)? 7 digits of precision is enough to represent a distance across the US down to the meter... and points closer together would have higher precision of course.

For storage of the points themselves, 32 bit floats may also often be enough (~2.4 meter resolution at the equator). Allowing doubles as an option would be nice too - but I expect that doubling the fieldcache may not be worth it for many.

Actually, a 32 bit fixed point representation would have a lot more accuracy for this (256 times the resolution at the cost of on-the-fly conversion to a double for calculations).

-Yonik
http://www.lucidimagination.com
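
To put rough numbers on the resolution figures in this message, here is a minimal Java sketch. It assumes the WGS84 equatorial radius (6378137 m), and the class and method names are invented for illustration; nothing here is Solr or Lucene code. It reports the gap between adjacent 32-bit floats near a given longitude, and the step size when the full circle is divided evenly into 2^24 or 2^32 values.

```java
// Hypothetical helper: spacing of stored longitudes, expressed in meters at the equator.
final class CoordinateResolution {

    // Assumption: WGS84 equatorial radius in meters.
    static final double EQUATORIAL_RADIUS_M = 6378137.0;
    static final double EQUATOR_CIRCUMFERENCE_M = 2.0 * Math.PI * EQUATORIAL_RADIUS_M;
    static final double METERS_PER_DEGREE = EQUATOR_CIRCUMFERENCE_M / 360.0;

    /** Gap along the equator between adjacent float values near the given longitude. */
    static double floatStepMeters(float longitudeDegrees) {
        // Math.ulp gives the distance from this float to the next larger one.
        return Math.ulp(longitudeDegrees) * METERS_PER_DEGREE;
    }

    /** Step along the equator when 360 degrees are split evenly into 2^bits values. */
    static double fixedPointStepMeters(int bits) {
        return EQUATOR_CIRCUMFERENCE_M / Math.pow(2.0, bits);
    }

    public static void main(String[] args) {
        System.out.printf("float step near 179 deg: %.4f m%n", floatStepMeters(179.0f));
        System.out.printf("float step near   1 deg: %.6f m%n", floatStepMeters(1.0f));
        System.out.printf("2^24 even steps:         %.4f m%n", fixedPointStepMeters(24));
        System.out.printf("2^32 even steps:         %.6f m%n", fixedPointStepMeters(32));
    }
}
```

Dividing the equatorial circumference by 2^24 (the 24 significant bits of a float) gives about 2.4 m, which appears to be where the figure in the message comes from, and 2^32 / 2^24 = 256 accounts for the fixed-point ratio. The float gap varies with magnitude, so longitudes near zero are stored more finely than longitudes near 180 degrees.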
Re: Functions, floats and doubles
On Nov 13, 2009, at 12:58 PM, Yonik Seeley wrote:

> On Fri, Nov 13, 2009 at 12:52 PM, Grant Ingersoll wrote:
>> Implementing my first function (distance stuff) and noticed that functions
>> seem to have a float bent to them. Not even sure what would be involved,
>> but there are cases for distance that I could see wanting double precision.
>> Thoughts?
>
> It's an issue in general.

Yeah, in the end, Lucene Scorer returns a float... However, if we allow for pseudo fields and other capabilities, it would be nice to have doubles.

> But for something like gdist(point_a,point_b), the internal
> calculations can be done in double precision and if the result is cast
> to a float at the end, it should be good enough for most uses, right?

This is what I am doing for the specific case I'm working on, but I agree with Walter here. As Solr starts to evolve to power apps where you want to do complex functions based on the results of queries, the loss of precision can be quite bad.

-Grant
Re: Functions, floats and doubles
Float is almost never good enough. The loss of precision is horrific.

wunder

On Nov 13, 2009, at 9:58 AM, Yonik Seeley wrote:

> On Fri, Nov 13, 2009 at 12:52 PM, Grant Ingersoll wrote:
>> Implementing my first function (distance stuff) and noticed that functions
>> seem to have a float bent to them. Not even sure what would be involved,
>> but there are cases for distance that I could see wanting double precision.
>> Thoughts?
>
> It's an issue in general.
>
> But for something like gdist(point_a,point_b), the internal
> calculations can be done in double precision and if the result is cast
> to a float at the end, it should be good enough for most uses, right?
>
> -Yonik
> http://www.lucidimagination.com
Re: Functions, floats and doubles
On Fri, Nov 13, 2009 at 12:52 PM, Grant Ingersoll wrote:

> Implementing my first function (distance stuff) and noticed that functions
> seem to have a float bent to them. Not even sure what would be involved, but
> there are cases for distance that I could see wanting double precision.
> Thoughts?

It's an issue in general.

But for something like gdist(point_a,point_b), the internal calculations can be done in double precision and if the result is cast to a float at the end, it should be good enough for most uses, right?

-Yonik
http://www.lucidimagination.com
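
The "compute in double, cast to float at the end" idea can be sketched as follows. This is not the actual gdist implementation; it is a plain haversine great-circle distance on a sphere, with an assumed mean Earth radius of 6371008.8 m and hypothetical class and method names. All intermediate math stays in double, and only the final result is narrowed.

```java
// Hypothetical sketch of a distance function with double internals and a float result.
final class GreatCircleDistance {

    // Assumption: mean Earth radius in meters (spherical model, not an ellipsoid).
    static final double MEAN_EARTH_RADIUS_M = 6371008.8;

    /** Haversine distance in meters, computed entirely in double precision. */
    static double haversineMeters(double lat1Deg, double lon1Deg,
                                  double lat2Deg, double lon2Deg) {
        double lat1 = Math.toRadians(lat1Deg);
        double lat2 = Math.toRadians(lat2Deg);
        double sinHalfDLat = Math.sin((lat2 - lat1) / 2.0);
        double sinHalfDLon = Math.sin(Math.toRadians(lon2Deg - lon1Deg) / 2.0);
        double a = sinHalfDLat * sinHalfDLat
                 + Math.cos(lat1) * Math.cos(lat2) * sinHalfDLon * sinHalfDLon;
        // Clamp so that rounding cannot push the asin argument above 1.
        return 2.0 * MEAN_EARTH_RADIUS_M * Math.asin(Math.sqrt(Math.min(1.0, a)));
    }

    /** Same calculation, narrowed to float only as the last step. */
    static float haversineMetersAsFloat(double lat1Deg, double lon1Deg,
                                        double lat2Deg, double lon2Deg) {
        return (float) haversineMeters(lat1Deg, lon1Deg, lat2Deg, lon2Deg);
    }

    public static void main(String[] args) {
        // A quarter of the way around the equator.
        double exact = haversineMeters(0.0, 0.0, 0.0, 90.0);
        float narrowed = haversineMetersAsFloat(0.0, 0.0, 0.0, 90.0);
        System.out.println("double result: " + exact);
        System.out.println("float result:  " + narrowed);
        System.out.println("cast error:    " + Math.abs(narrowed - exact) + " m");
    }
}
```

For a distance of about ten thousand kilometers, adjacent floats are 1 m apart, so the final cast can move the result by at most half a meter. Shorter distances lose proportionally less.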
Functions, floats and doubles
Implementing my first function (distance stuff) and noticed that functions seem to have a float bent to them. Not even sure what would be involved, but there are cases for distance that I could see wanting double precision. Thoughts?

-Grant