Re: Functions, floats and doubles

2009-11-13 Thread Grant Ingersoll

On Nov 13, 2009, at 1:48 PM, Mattmann, Chris A (388J) wrote:

>> Still, I think we should put it on our roadmap.

SOLR-1562.


Re: Functions, floats and doubles

2009-11-13 Thread Mattmann, Chris A (388J)
On 11/13/09 11:38 AM, "Grant Ingersoll"  wrote:

> If I implement Vincenty's formula for the distance between two points on an
> ellipsoid, that can be accurate down to 0.5 mm.  Not doing that yet and not
> planning on implementing it, so this is not a huge issue right now.
> 
> Still, I think we should put it on our roadmap.

+1

Cheers,
Chris

> 
> 
> On Nov 13, 2009, at 1:32 PM, Yonik Seeley wrote:
> 
>> On Fri, Nov 13, 2009 at 1:01 PM, Walter Underwood wrote:
>>> Float is almost never good enough. The loss of precision is horrific.
>> 
>> Are you saying it's not good enough for this case (the final answer of
>> a relative distance calculation)?
>> 7 digits of precision is enough to represent a distance across the US
>> down to the meter... and points closer together would have higher
>> precision of course.
>> 
>> For storage of the points themselves, 32 bit floats may also often be
>> enough (~2.4 meter resolution at the equator).  Allowing doubles as an
>> option would be nice too - but I expect that doubling the fieldcache
>> may not be worth it for many.
>> Actually, a 32 bit fixed point representation would have a lot more
>> accuracy for this (256 times the resolution at the cost of on-the-fly
>> conversion to a double for calculations).
>> 
>> -Yonik
>> http://www.lucidimagination.com

++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.mattm...@jpl.nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++





Re: Functions, floats and doubles

2009-11-13 Thread Grant Ingersoll

On Nov 13, 2009, at 1:35 PM, Yonik Seeley wrote:

> On Fri, Nov 13, 2009 at 1:31 PM, Grant Ingersoll  wrote:
>> Yeah, in the end, Lucene Scorer returns a float...  However, if we allow for 
>> pseudo fields and other capabilities, it would be nice to have doubles.
> 
> We have doubles already...  It's just that our general purpose
> functions (like sum) don't use them.
> geo functions could certainly use them.

Yep, I have a patch for the QueryParsing, etc. to allow me to get doubles from 
that.  It should suffice for now.


> 
>>> But for something like gdist(point_a,point_b), the internal
>>> calculations can be done in double precision and if the result is cast
>>> to a float at the end, it should be good enough for most uses, right?
>> 
>> This is what I am doing for the specific case I'm working on, but I agree 
>> with Walter here.  As Solr starts to evolve to power apps where you want to 
>> do complex functions based on the results of queries, the loss of precision 
>> can be quite bad.
> 
> I agree with you all that eventually we want generic double precision support.
> What I don't understand is if anyone thinks it's a blocker for geo, and why.

Definitely not a blocker.  I'll put up a patch on 
https://issues.apache.org/jira/browse/SOLR-1302 and people can kick it around.

Re: Functions, floats and doubles

2009-11-13 Thread Grant Ingersoll
If I implement Vincenty's formula for the distance between two points on an
ellipsoid, that can be accurate down to 0.5 mm.  Not doing that yet and not
planning on implementing it, so this is not a huge issue right now.

Still, I think we should put it on our roadmap. 
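
To put a number on why a sub-millimeter formula would need a double result,
here is a minimal Java sketch. It compares the spacing between adjacent float
and double values near a continental-scale distance. The class name and the
4,000 km figure are assumptions for illustration only, not anything from Solr.
Near 4,000,000 m a float can only step in 0.25 m increments, which is far
coarser than 0.5 mm, while a double still resolves well below a micrometer.

```java
// Illustrative only: class name and the 4,000 km distance are assumptions.
final class DistanceResolution {

    // Spacing between adjacent float values near the given distance (meters).
    static float floatStep(double meters) {
        return Math.ulp((float) meters);
    }

    // Spacing between adjacent double values near the given distance (meters).
    static double doubleStep(double meters) {
        return Math.ulp(meters);
    }

    public static void main(String[] args) {
        double acrossUs = 4.0e6; // roughly coast to coast, in meters (assumed)
        System.out.println("float step:  " + floatStep(acrossUs) + " m");
        System.out.println("double step: " + doubleStep(acrossUs) + " m");
    }
}
```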


On Nov 13, 2009, at 1:32 PM, Yonik Seeley wrote:

> On Fri, Nov 13, 2009 at 1:01 PM, Walter Underwood wrote:
>> Float is almost never good enough. The loss of precision is horrific.
> 
> Are you saying it's not good enough for this case (the final answer of
> a relative distance calculation)?
> 7 digits of precision is enough to represent a distance across the US
> down to the meter... and points closer together would have higher
> precision of course.
> 
> For storage of the points themselves, 32 bit floats may also often be
> enough (~2.4 meter resolution at the equator).  Allowing doubles as an
> option would be nice too - but I expect that doubling the fieldcache
> may not be worth it for many.
> Actually, a 32 bit fixed point representation would have a lot more
> accuracy for this (256 times the resolution at the cost of on-the-fly
> conversion to a double for calculations).
> 
> -Yonik
> http://www.lucidimagination.com




Re: Functions, floats and doubles

2009-11-13 Thread Walter Underwood
Float is often OK until you try to use it for further calculation. Maybe it is
good enough for printing out a distance, but maybe not for further use.

wunder
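
One way to see the "further calculation" problem is to keep adding float
results together. The following is a minimal Java sketch, not Solr code: it
sums one million hypothetical route legs of 12.345 m each, once with a float
accumulator and once with a double accumulator, and reports how far each
total drifts from the product computed directly in double. The class name,
leg length and leg count are all assumptions chosen for illustration.

```java
// Illustrative only: names, leg length and leg count are assumptions.
final class AccumulationDrift {

    // Adds the same leg length 'count' times using a float accumulator.
    static float sumAsFloat(float leg, int count) {
        float total = 0f;
        for (int i = 0; i < count; i++) {
            total += leg;
        }
        return total;
    }

    // Adds the same leg length 'count' times using a double accumulator.
    static double sumAsDouble(double leg, int count) {
        double total = 0.0;
        for (int i = 0; i < count; i++) {
            total += leg;
        }
        return total;
    }

    public static void main(String[] args) {
        int legs = 1_000_000;
        double leg = 12.345;         // meters per leg (assumed)
        double reference = leg * legs; // single multiplication in double

        double floatError = Math.abs(sumAsFloat((float) leg, legs) - reference);
        double doubleError = Math.abs(sumAsDouble(leg, legs) - reference);

        System.out.println("float accumulator error:  " + floatError + " m");
        System.out.println("double accumulator error: " + doubleError + " m");
    }
}
```

Once the running float total grows large, each added leg is rounded to the
spacing of floats at that magnitude, so the error compounds instead of
staying at the seventh digit.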

On Nov 13, 2009, at 10:32 AM, Yonik Seeley wrote:

> On Fri, Nov 13, 2009 at 1:01 PM, Walter Underwood wrote:
>> Float is almost never good enough. The loss of precision is horrific.
> 
> Are you saying it's not good enough for this case (the final answer of
> a relative distance calculation)?
> 7 digits of precision is enough to represent a distance across the US
> down to the meter... and points closer together would have higher
> precision of course.
> 
> For storage of the points themselves, 32 bit floats may also often be
> enough (~2.4 meter resolution at the equator).  Allowing doubles as an
> option would be nice too - but I expect that doubling the fieldcache
> may not be worth it for many.
> Actually, a 32 bit fixed point representation would have a lot more
> accuracy for this (256 times the resolution at the cost of on-the-fly
> conversion to a double for calculations).
> 
> -Yonik
> http://www.lucidimagination.com
> 



Re: Functions, floats and doubles

2009-11-13 Thread Yonik Seeley
On Fri, Nov 13, 2009 at 1:31 PM, Grant Ingersoll  wrote:
> Yeah, in the end, Lucene Scorer returns a float...  However, if we allow for 
> pseudo fields and other capabilities, it would be nice to have doubles.

We have doubles already...  It's just that our general purpose
functions (like sum) don't use them.
geo functions could certainly use them.

>> But for something like gdist(point_a,point_b), the internal
>> calculations can be done in double precision and if the result is cast
>> to a float at the end, it should be good enough for most uses, right?
>
> This is what I am doing for the specific case I'm working on, but I agree 
> with Walter here.  As Solr starts to evolve to power apps where you want to 
> do complex functions based on the results of queries, the loss of precision 
> can be quite bad.

I agree with you all that eventually we want generic double precision support.
What I don't understand is if anyone thinks it's a blocker for geo, and why.

-Yonik
http://www.lucidimagination.com


Re: Functions, floats and doubles

2009-11-13 Thread Yonik Seeley
On Fri, Nov 13, 2009 at 1:01 PM, Walter Underwood  wrote:
> Float is almost never good enough. The loss of precision is horrific.

Are you saying it's not good enough for this case (the final answer of
a relative distance calculation)?
7 digits of precision is enough to represent a distance across the US
down to the meter... and points closer together would have higher
precision of course.

For storage of the points themselves, 32 bit floats may also often be
enough (~2.4 meter resolution at the equator).  Allowing doubles as an
option would be nice too - but I expect that doubling the fieldcache
may not be worth it for many.
Actually, a 32 bit fixed point representation would have a lot more
accuracy for this (256 times the resolution at the cost of on-the-fly
conversion to a double for calculations).

-Yonik
http://www.lucidimagination.com
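
The storage figures above can be reproduced with a little arithmetic. One
plausible reading is that ~2.4 m is the equatorial circumference divided by
2^24 (a float carries a 24-bit significand), and that a 32-bit fixed-point
value divides the same circumference into 2^32 steps, hence 256 times finer.
The Java sketch below works under that assumption. The class, the method
names and the WGS 84 equatorial radius are illustrative choices, not part of
Solr or Lucene.

```java
// Illustrative only: names and the encoding scheme are assumptions.
final class FixedPointLongitude {

    // Equatorial circumference from the WGS 84 equatorial radius (meters).
    static final double EQUATOR_METERS = 2 * Math.PI * 6378137.0;

    // Number of distinct values in a 32-bit integer (2^32).
    static final double STEPS_32 = 4294967296.0;

    // Ground distance at the equator covered by one step of a 'bits'-bit grid.
    static double metersPerStep(int bits) {
        return EQUATOR_METERS / (double) (1L << bits);
    }

    // Maps a longitude in degrees, -180 to 180, onto the full int range.
    static int encode(double lonDegrees) {
        long steps = Math.round(lonDegrees / 360.0 * STEPS_32);
        // Clamp so that exactly +180 does not overflow the int range.
        return (int) Math.max(Integer.MIN_VALUE, Math.min(Integer.MAX_VALUE, steps));
    }

    // On-the-fly conversion back to a double for calculations.
    static double decode(int fixed) {
        return fixed * (360.0 / STEPS_32);
    }

    public static void main(String[] args) {
        System.out.println("24-bit step: " + metersPerStep(24) + " m");
        System.out.println("32-bit step: " + metersPerStep(32) + " m");
        double lon = -122.4194; // sample longitude (assumed)
        System.out.println("round trip:  " + decode(encode(lon)));
    }
}
```

The decode step is the on-the-fly conversion cost mentioned above: one
integer-to-double conversion and one multiplication per stored value.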


Re: Functions, floats and doubles

2009-11-13 Thread Grant Ingersoll

On Nov 13, 2009, at 12:58 PM, Yonik Seeley wrote:

> On Fri, Nov 13, 2009 at 12:52 PM, Grant Ingersoll  wrote:
>> Implementing my first function (distance stuff) and noticed that functions
>> seem to have a float bent to them.  Not even sure what would be involved, 
>> but there are cases for distance that I could see wanting double precision.  
>> Thoughts?
> 
> 
> It's an issue in general.

Yeah, in the end, Lucene Scorer returns a float...  However, if we allow for 
pseudo fields and other capabilities, it would be nice to have doubles.

> 
> But for something like gdist(point_a,point_b), the internal
> calculations can be done in double precision and if the result is cast
> to a float at the end, it should be good enough for most uses, right?

This is what I am doing for the specific case I'm working on, but I agree with 
Walter here.  As Solr starts to evolve to power apps where you want to do 
complex functions based on the results of queries, the loss of precision can be 
quite bad.

-Grant



Re: Functions, floats and doubles

2009-11-13 Thread Walter Underwood
Float is almost never good enough. The loss of precision is horrific.

wunder

On Nov 13, 2009, at 9:58 AM, Yonik Seeley wrote:

> On Fri, Nov 13, 2009 at 12:52 PM, Grant Ingersoll  wrote:
>> Implementing my first function (distance stuff) and noticed that functions
>> seem to have a float bent to them.  Not even sure what would be involved, 
>> but there are cases for distance that I could see wanting double precision.  
>> Thoughts?
> 
> 
> It's an issue in general.
> 
> But for something like gdist(point_a,point_b), the internal
> calculations can be done in double precision and if the result is cast
> to a float at the end, it should be good enough for most uses, right?
> 
> -Yonik
> http://www.lucidimagination.com
> 



Re: Functions, floats and doubles

2009-11-13 Thread Yonik Seeley
On Fri, Nov 13, 2009 at 12:52 PM, Grant Ingersoll  wrote:
> Implementing my first function (distance stuff) and noticed that functions
> seem to have a float bent to them.  Not even sure what would be involved, but 
> there are cases for distance that I could see wanting double precision.  
> Thoughts?


It's an issue in general.

But for something like gdist(point_a,point_b), the internal
calculations can be done in double precision and if the result is cast
to a float at the end, it should be good enough for most uses, right?

-Yonik
http://www.lucidimagination.com
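
A minimal Java sketch of the approach described above: do the distance math
in double and narrow to float only when returning the final value. The
haversine formula is used here simply as a familiar great-circle calculation.
The thread does not say which formula gdist would use, and the class, method
signatures and mean Earth radius below are assumptions, not Solr's API.

```java
// Illustrative only: not Solr's gdist; names and radius are assumptions.
final class GreatCircle {

    static final double EARTH_RADIUS_KM = 6371.0088; // mean radius (assumed)

    // Haversine great-circle distance, computed entirely in double precision.
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.pow(Math.sin(dLat / 2), 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.pow(Math.sin(dLon / 2), 2);
        return 2 * EARTH_RADIUS_KM * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
    }

    // Same calculation, narrowed to float only at the very end.
    static float gdist(double lat1, double lon1, double lat2, double lon2) {
        return (float) haversineKm(lat1, lon1, lat2, lon2);
    }

    public static void main(String[] args) {
        // One degree of longitude along the equator.
        double exact = haversineKm(0.0, 0.0, 0.0, 1.0);
        float narrowed = gdist(0.0, 0.0, 0.0, 1.0);
        System.out.println("double result: " + exact + " km");
        System.out.println("float result:  " + narrowed + " km");
        System.out.println("cast error:    " + Math.abs(narrowed - exact) + " km");
    }
}
```

The single cast costs at most one rounding step (a relative error of about
6e-8), which is why it is usually acceptable for a final answer but not for
a value that feeds further arithmetic.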


Functions, floats and doubles

2009-11-13 Thread Grant Ingersoll
Implementing my first function (distance stuff) and noticed that functions seem
to have a float bent to them.  Not even sure what would be involved, but there 
are cases for distance that I could see wanting double precision.  Thoughts?

-Grant