Re: [HACKERS] float output precision questions

2002-11-07 Thread Bruce Momjian

Added to TODO:

* Add GUC variables extra_float_digits and extra_double_digits
to control output digits

---

Tom Lane wrote:
 Pedro M. Ferreira [EMAIL PROTECTED] writes:
  Have two parameters, say DOUBLE_OUTPUT and EXTRA_DIGITS. DOUBLE_OUTPUT 
  would select from decimal output or normalized output. EXTRA_DIGITS 
  would add the required extra digits, from 0 (default) to 3,  when output 
  is decimal.
 
 I'm not happy with adding the hex-output option, since it's not
 very portable and doesn't seem necessary to solve the problem anyway.
 
 But I think an EXTRA_DIGITS setting might be interesting.  In
 particular, suppose we allowed EXTRA_DIGITS to be negative?  Setting
 it to -1 or -2 would go a long way towards eliminating our problems
 with platform variations in the geometry regression test.
 
 Perhaps something like
 
 extra_float_digitsint range -2 to 2, default 0
 
 extra_float_digits adjusts the number of digits displayed for float4 and
 float8 output; the base value of 0 means we output FLT_DIG or DBL_DIG
 digits respectively.
 
 Per discussion, there's no reason to allow a value greater than 2, but
 I'm not as sure what the lower limit should be --- maybe there's some
 use in setting it less than -2?
 
   regards, tom lane
 
 ---(end of broadcast)---
 TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] float output precision questions

2002-10-31 Thread Bruno Wolff III
On Thu, Oct 31, 2002 at 12:58:21 -0500,
  Tom Lane [EMAIL PROTECTED] wrote:
 
 But I think an EXTRA_DIGITS setting might be interesting.  In
 particular, suppose we allowed EXTRA_DIGITS to be negative?  Setting
 it to -1 or -2 would go a long way towards eliminating our problems
 with platform variations in the geometry regression test.

My attempt to avoid this problem with the earthdistance regression used
a cast to numeric to limit the number of digits to the right of the
decimal point.

If the normal number of digits displayed is different between two systems,
than displaying a fixed number less digits is still going to result in
differences.

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] float output precision questions

2002-10-31 Thread Pedro M. Ferreira
Just one more note,

Maybe it makes sense that in the proposal below the parameter 
EXTRA_DIGITS could be SIGNIFICANT_DIGITS with a default value of 15 and 
maximum 18.
Its more 'documentable' and maybe easy to understand in general.


Pedro M. Ferreira wrote:

Yes. I think there are several options.
I checked the sprintf(ascii, %A, num) output format and all the 
numbers that would fail because of DBL_DIG=15 are ok. After insertion on 
a table and conversion to double after a query, comparison a==b holds.
AFAICT %A is system independent.

I would (if I may) propose the following:

Have two parameters, say DOUBLE_OUTPUT and EXTRA_DIGITS. DOUBLE_OUTPUT 
would select from decimal output or normalized output. EXTRA_DIGITS 
would add the required extra digits, from 0 (default) to 3,  when output 
is decimal.

EXTRA_DIGITS:
in the range [0:3]. 0 as defualt.

DOUBLE_OUTPUT:

'DECIMAL': sprintf(ascii, %.*g, DBL_DIG+EXTRA_DIGITS, num); (default)
'NORMALIZED': sprintf(ascii, %A, num);

The same could be done for floats (float4).

This way PG does not assume anything (DOUBLE_OUTPUT as 'NORMALIZED'), it 
does not hardwire 'inappropriate' assumptions about the number of 
significant digits in a double (default EXTRA_DIGITS=0), and it gives 
flexibility (EXTRA_DIGITS!=0) if needed.
I think this is functional and reasonable.

Regards,
Pedro M. Ferreira


regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])








--
--
Pedro Miguel Frazao Fernandes Ferreira
Universidade do Algarve
Faculdade de Ciencias e Tecnologia
Campus de Gambelas
8000-117 Faro
Portugal
Tel./Fax:  (+351) 289 800950 / 289 819403
http://w3.ualg.pt/~pfrazao


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] float output precision questions

2002-10-31 Thread Zeugswetter Andreas SB SD

 Maybe it makes sense that in the proposal below the parameter 
 EXTRA_DIGITS could be SIGNIFICANT_DIGITS with a default value 
 of 15 and maximum 18.
 Its more 'documentable' and maybe easy to understand in general.

Yes agree (or double_significant_digits or format_double_digits ?), 
but default to DBL_DIG and allow range between 1 and DBL_DIG + 3.
format_* could be used for all future output format tweaks.

Unfortunately %A is not portable :-(

Andreas

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] float output precision questions

2002-10-31 Thread Pedro M. Ferreira
Zeugswetter Andreas SB SD wrote:


Yes agree (or double_significant_digits or format_double_digits ?), 
but default to DBL_DIG and allow range between 1 and DBL_DIG + 3.
format_* could be used for all future output format tweaks.

Unfortunately %A is not portable :-(

What do you mean ?
It is C99, introduced in glibc 2.1.

What are the requirements for PostgreSQL ?

Pedro



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] float output precision questions

2002-10-31 Thread Tom Lane
Stephan Szabo [EMAIL PROTECTED] writes:
 On Wed, 30 Oct 2002, Tom Lane wrote:
 sprintf(ascii, %.*g, DBL_DIG+1, num);
 and similarly for float4.  Given carefully written float I/O routines,
 reading the latter output should reproduce the originally stored value.

 Well, on my system, it doesn't look like doing the above sprintfs will
 actually work for all numbers.  I did a simple program using an arbitrary
 big number and the DBL_DIG+1 output when stuck into another double
 actually was a different double value.  DBL_DIG+2 worked on my system,
 but...

Oh, you're right; I had forgotten about the effects of scale.
DBL_DIG=15 means that the system claims to distinguish all 15-digit
values, but in a binary system there's more headroom at the bottom end
of a decimal order of magnitude.  For example, 15-digit values are fine:

regression=# select 101::float8 - 100::float8;
 ?column?
--
1
(1 row)

regression=# select 999::float8 - 998::float8;
 ?column?
--
1
(1 row)

but the 9-etc values are over three binary orders of magnitude larger
than the 1-etc values, and so they have three less spare bits at the
right end.  The system would be lying to claim DBL_DIG=16:

regression=# select ::float8 - 9998::float8;
 ?column?
--
2
(1 row)

even though values a little over 1e15 are represented perfectly
accurately:

regression=# select 1001::float8 - 1000::float8;
 ?column?
--
1
(1 row)

If you experiment with 17-digit values, you find that the representable
values are about 2 counts apart near 1e16:

regression=# select 10001::float8 - 1::float8;
 ?column?
--
0
(1 row)

regression=# select 10002::float8 - 1::float8;
 ?column?
--
2
(1 row)

but they're about 16 counts apart near 9e16:

regression=# select 2::float8 - 0::float8;
 ?column?
--
   16
(1 row)

regression=# select 1::float8 - 0::float8;
 ?column?
--
0
(1 row)

which is exactly what you'd expect seeing that the values are about a
factor of 8 apart.

Bottom line: if DBL_DIG=15 and the float arithmetic is binary, then
there are some double values that require 17 displayed digits to
distinguish, even though not all 16-digit numbers are distinct.

So I retract my original proposal and instead suggest that we offer
a switch to display either DBL_DIG or DBL_DIG+2 significant digits
(and correspondingly increase the digits for float4).  The DBL_DIG+2
case should handle the need for exact dump/reload.

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] float output precision questions

2002-10-31 Thread Tom Lane
Pedro M. Ferreira [EMAIL PROTECTED] writes:
 Zeugswetter Andreas SB SD wrote:
 Unfortunately %A is not portable :-(

 What do you mean ?

Just what he said.

 It is C99, introduced in glibc 2.1.

 What are the requirements for PostgreSQL ?

glibc does not define the universe; nor are all platforms supporting C99
yet.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] float output precision questions

2002-10-31 Thread Tom Lane
Pedro M. Ferreira [EMAIL PROTECTED] writes:
 Its like I said before, the guys from matlab (in x86 IEEE float) go to 
 DBL_BIG+3 to have 'maximum precision'.

Apparently they have not read the canonical papers in the field.
[ googles for a moment... ]  See

 How to read floating point numbers accurately
 William D. Clinger
  
 How to print floating-point numbers accurately
 Guy L. Steele, Jr., Jon L. White 

both published at the 1990 ACM Conference on Programming Language Design
and Implementation and subsequently reprinted in ACM SIGPLAN Notices
Volume 25, Issue 6 (June 1990).

I was misremembering these papers to claim DBL_DIG+1 is enough, but
actually they prove that DBL_DIG+2 is necessary and sufficient (and
give code to do it correctly, too).

Printing DBL_DIG+3 is just producing an extra garbage digit; it won't
help matters.  Any reasonably well-written C library is going to be
able to reproduce a double value with DBL_DIG+2 digits of I/O; and if
it's not well-written, I would have no confidence in its ability to do
so with DBL_DIG+3 digits...

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] float output precision questions

2002-10-31 Thread Pedro M. Ferreira
Tom Lane wrote:


I was misremembering these papers to claim DBL_DIG+1 is enough, but
actually they prove that DBL_DIG+2 is necessary and sufficient (and
give code to do it correctly, too).


Yeahh! If there's a proof its safe to implement.
I also Googled a bit and found another paper saying that 17 is the
minimum number of significant digits guaranteed to distinguish among
IEEE double-precision floating point numbers:

Robert G. Burger and R. Kent Dybvig. Printing floating-point numbers 
quickly and accurately. In Proceedings of the ACM SIGPLAN '96 Conference 
on Programming Language Design and Implementation, pages 108--116
http://citeseer.nj.nec.com/28233.html


Printing DBL_DIG+3 is just producing an extra garbage digit; it won't
help matters.  Any reasonably well-written C library is going to be
able to reproduce a double value with DBL_DIG+2 digits of I/O; and if
it's not well-written, I would have no confidence in its ability to do
so with DBL_DIG+3 digits...


Off course. This is also good in terms of dump storage for big float8 
databases. Its one byte less for every float8.


			regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]





--
--
Pedro Miguel Frazao Fernandes Ferreira
Universidade do Algarve
Faculdade de Ciencias e Tecnologia
Campus de Gambelas
8000-117 Faro
Portugal
Tel./Fax:  (+351) 289 800950 / 289 819403
http://w3.ualg.pt/~pfrazao


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] float output precision questions

2002-10-31 Thread Tom Lane
Pedro M. Ferreira [EMAIL PROTECTED] writes:
 Have two parameters, say DOUBLE_OUTPUT and EXTRA_DIGITS. DOUBLE_OUTPUT 
 would select from decimal output or normalized output. EXTRA_DIGITS 
 would add the required extra digits, from 0 (default) to 3,  when output 
 is decimal.

I'm not happy with adding the hex-output option, since it's not
very portable and doesn't seem necessary to solve the problem anyway.

But I think an EXTRA_DIGITS setting might be interesting.  In
particular, suppose we allowed EXTRA_DIGITS to be negative?  Setting
it to -1 or -2 would go a long way towards eliminating our problems
with platform variations in the geometry regression test.

Perhaps something like

extra_float_digits  int range -2 to 2, default 0

extra_float_digits adjusts the number of digits displayed for float4 and
float8 output; the base value of 0 means we output FLT_DIG or DBL_DIG
digits respectively.

Per discussion, there's no reason to allow a value greater than 2, but
I'm not as sure what the lower limit should be --- maybe there's some
use in setting it less than -2?

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] float output precision questions

2002-10-31 Thread Pedro M. Ferreira
Tom Lane wrote:


Bottom line: if DBL_DIG=15 and the float arithmetic is binary, then
there are some double values that require 17 displayed digits to
distinguish, even though not all 16-digit numbers are distinct.

So I retract my original proposal and instead suggest that we offer
a switch to display either DBL_DIG or DBL_DIG+2 significant digits
(and correspondingly increase the digits for float4).  The DBL_DIG+2
case should handle the need for exact dump/reload.


Nice. This will be good for number storage purposes.

Shall it be done with two parameters, 'DOUBLE_FORMAT' and 
'SINGLE_FORMAT', with options 'SHORT' and 'LONG' controlling how the 
sprintf's are done ?

Will someone from pg-people do it or shall I do it for you ?
As I said previously, I have seen the GUC stuff and it seem's ok for me 
to do it. I really do not know if there are any restrictions on who 
implements what respecting PostgreSQL.
Tomorrow we have an holyday in Portugal and I shall leave for the whole 
week-end, but I can do it on monday.

Best regards,
Pedro M. Ferreira


			regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org





--
--
Pedro Miguel Frazao Fernandes Ferreira
Universidade do Algarve
Faculdade de Ciencias e Tecnologia
Campus de Gambelas
8000-117 Faro
Portugal
Tel./Fax:  (+351) 289 800950 / 289 819403
http://w3.ualg.pt/~pfrazao


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] float output precision questions

2002-10-31 Thread Pedro M. Ferreira
I sent an email before receiving the one below. I am happier also with 
the extra_digits way (from the previous email I thought the options were 
DBL_DIG or DBL_DIG+2).

I'm not happy with adding the hex-output option, since it's not
very portable and doesn't seem necessary to solve the problem anyway.


Agree.


Perhaps something like

extra_float_digits	int	range -2 to 2, default 0

extra_float_digits adjusts the number of digits displayed for float4 and
float8 output; the base value of 0 means we output FLT_DIG or DBL_DIG
digits respectively.


Agree.



Per discussion, there's no reason to allow a value greater than 2, but
I'm not as sure what the lower limit should be --- maybe there's some
use in setting it less than -2?


I could see some use. At least in my type of application. When people 
are shure they only need p significant digits, they can set 
extra_float_digits to an apropriate negative value and spare a lot in 
storage for dumps and backups. In this case it would make sense to let
extra_float_digits go to -13.

Regards,
Pedro

			regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly





--
--
Pedro Miguel Frazao Fernandes Ferreira
Universidade do Algarve
Faculdade de Ciencias e Tecnologia
Campus de Gambelas
8000-117 Faro
Portugal
Tel./Fax:  (+351) 289 800950 / 289 819403
http://w3.ualg.pt/~pfrazao


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] float output precision questions

2002-10-30 Thread Stephan Szabo
On Wed, 30 Oct 2002, Pedro Miguel Frazao Fernandes Ferreira wrote:

 In C this is possible:

 #include stdio.h
 #include stdlib.h

 int main(void)
 {
double v;
char a[30];

v=1.79769313486231571e+308;

printf(   Stored double number: %25.18g\n,v);
sprintf(a,%25.18g,v);
printf(Converted to string: %s\n,a);
v=atof(a);
printf(Converted from string to double: %25.18g\n,v);
 }

AFAICT, this is not guaranteed to give you the same representation that
you're using in the initializer however.

 Using standard PostgreSQL query output it would not be possible to get
 this number, which has representation as a double.

 I fetched the PostgreSQL source from Debian, changed
 src/backend/utils/adt/float.c to do sprintf(ascii, %25.18g, num)
 instead of sprintf(ascii, %.*g, DBL_DIG, num), compiled and installed.
 Now I can get the number as is.

 I understand that if people insert a value of 1.1 in a double, they want
 to get 1.1 without knowing that in fact the stored number is
 1.10009. But do you understand that if some people insert,
 for example, a value of 1.79769313486231571e+308 they dont want to get
 1.79769313486232e+308 which does not compare equal (in Matlab or C) to
 the first ? This is a bug.

I disagree to some extent.  I'm not sure it's meaningful to expect that
(what if the database and the client are on different architectures) in
general.  In any case, you're effectively going from decimal
representation to double to decimal representation (the string you used
to insert it - internal representation - string used to output it) and
that's only guaranteed to be correct up to DBL_DIG digits as far as I can
tell.  I think it'd be nice to have an option to get more digits for those
sorts of applications, however.

  This would probably make sense as an option, so why don't you look at the
  past discussions and see if you can come up with a solution that keeps
  everyone happy (and preferably implement it, but...) :)

 but ???

, but I realize that you might not be interested in doing such. (I figured
the last part was implied)

 I have a sugestion:

 To have parameters, say DOUBLE_FORMAT and FLOAT_FORMAT, which could have
   option values of 'SHORT' and 'LONG'.
 Option 'SHORT' would be default and produce the standard sprintf(ascii,...
 Option 'LONG' would produce sprintf(ascii, %25.18g, num).

 Other way would be to have number parameters to be used in the sprintf
 calls, in place of 25 and 18, in the format string.

From what Tom said, something similar was suggested and there were issues
brought up.  I don't know what they were, since I wasn't personally
terribly interested, but it should be in the archives.  If there were any
concerns, you'll probably need to deal with those as well.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] float output precision questions

2002-10-30 Thread Bruno Wolff III
On Wed, Oct 30, 2002 at 19:27:57 +0100,
  Peter Eisentraut [EMAIL PROTECTED] wrote:
 
 The printf(%A) output is not system-specific.

Just out of curiosity, can you tell me a web page or keywords to use
in a search to see what that format does? I tried using google, but
searching for printf with and A format didn't go to well. I only
found one correct reference and it didn't explain what %A did.
Thanks.

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] float output precision questions

2002-10-30 Thread Pedro M. Ferreira
Stephan Szabo wrote:

On Wed, 30 Oct 2002, Pedro Miguel Frazao Fernandes Ferreira wrote:


I understand that if people insert a value of 1.1 in a double, they want
to get 1.1 without knowing that in fact the stored number is
1.10009. But do you understand that if some people insert,
for example, a value of 1.79769313486231571e+308 they dont want to get
1.79769313486232e+308 which does not compare equal (in Matlab or C) to
the first ? This is a bug.



I disagree to some extent.  I'm not sure it's meaningful to expect that
(what if the database and the client are on different architectures) in
general.  In any case, you're effectively going from decimal
representation to double to decimal representation (the string you used
to insert it - internal representation - string used to output it) and
that's only guaranteed to be correct up to DBL_DIG digits as far as I can
tell.  I think it'd be nice to have an option to get more digits for those
sorts of applications, however.


In the previous email example, in C, I was going from decimal to double 
and so on, but this is not the case when I do some simulation. In this 
case it will allways be from Matlab double to PostgreSQL float8 and from 
libpq PQgetvalue() string to Matlab double. The example was just a x86 
number example where (got the string from Matlab double) query output 
would fail.

This would probably make sense as an option, so why don't you look at the
past discussions and see if you can come up with a solution that keeps
everyone happy (and preferably implement it, but...) :)


but ???

, but I realize that you might not be interested in doing such. (I figured
the last part was implied)


ok! :)

No problem. I have seen the GUC thing in the source (guc.c etc...) and it
does not look too dificult. It has lots of examples in the code itself.
What I am saying is that I can do it if pg-people agree on the (some) 
way to do it.

To have parameters, say DOUBLE_FORMAT and FLOAT_FORMAT, which could have
 option values of 'SHORT' and 'LONG'.
Option 'SHORT' would be default and produce the standard sprintf(ascii,...
Option 'LONG' would produce sprintf(ascii, %25.18g, num).

Other way would be to have number parameters to be used in the sprintf
calls, in place of 25 and 18, in the format string.




From what Tom said, something similar was suggested and there were issues

brought up.  I don't know what they were, since I wasn't personally
terribly interested, but it should be in the archives.  If there were any
concerns, you'll probably need to deal with those as well.


I looked at some of these emails and it seemed to me that the problem 
was that Tom did'nt want a parameter that would force people to know 
about printf number formatting. I think the first solution above (the 
SHORT and LONG way) is simple, maintains usual output as default and 
enables 'maximum' precision at request.



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])





--
--
Pedro Miguel Frazao Fernandes Ferreira
Universidade do Algarve
Faculdade de Ciencias e Tecnologia
Campus de Gambelas
8000-117 Faro
Portugal
Tel./Fax:  (+351) 289 800950 / 289 819403
http://w3.ualg.pt/~pfrazao


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] float output precision questions

2002-10-30 Thread Pedro Miguel Frazao Fernandes Ferreira


Peter Eisentraut wrote:



Matlab has a toolbox fucntion, claiming maximum precision, to convert
from its double type (PostgreSQL float8) to string which does a
sprintf('%25.18g',number).



Do we have some mathematical guarantee that this is sufficient and
necessary?  If so, then it might do.


It is necessary if you want to do this type of (huge amount of number 
storage) application:

 [Some client] (insert)  [PostgreSQL]  (query)  [Some client]
(double number a)(float8 number)---(double number b)

In order for a=b this is necessary. With current float8 output this is 
not allways true.

Here is the help for that particular Matlab function:

 NUM2MSTR Convert number to string in maximum precision.
S = NUM2MSTR(N) converts real numbers of input
matrix N to string output vector S, in
maximum precision.

See also NUM2STR.

If you want I can try to contact the guys who coded this function to know
if this is sufficient.

Thanks,
Pedro M. Ferreira
--
--
Pedro Miguel Frazao Fernandes Ferreira
Universidade do Algarve
Faculdade de Ciencias e Tecnologia
Campus de Gambelas
8000-117 Faro
Portugal
Tel./Fax:  (+351) 289 800950 / 289 819403
http://w3.ualg.pt/~pfrazao


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] float output precision questions

2002-10-30 Thread Pedro Miguel Frazao Fernandes Ferreira
Stephan Szabo wrote:

On Tue, 29 Oct 2002, Peter Eisentraut wrote:



Pedro Miguel Frazao Fernandes Ferreira writes:



Is there a way to set query output precision to maximum precision ?
For the type of application I mentioned this is crucial. People want to
get the 'same' numbers, from querys or dumps, as they inserted them.




How do you define maximum precision and same?  With simple test programs
in C, using two digits more than DBL_DIG for printf specifier, it's easy
for me to find numbers that change decimal string representation in the
decimal representation - double - decimal representation conversion(*).
The final double you get from the second conversion should be the same as
the first, but is that what you need or do you need a stronger guarantee
than that?


When I say same I am talking about having a number 'stored' in double 
format in some client, inserting it in PostgreSQL float8 field and get 
it to the client as it was before:

  [Some client]  (insert)   [PostgreSQL]  (query)  [Some client]
(double number a)(float8 number)---(double number b)

same is so that a==b is true.
With current float8 output this is not allways true.

I believe this should allways be true for numbers which are originally 
stored in double format.

Thanks,
Pedro M. Ferreira








--
--
Pedro Miguel Frazao Fernandes Ferreira
Universidade do Algarve
Faculdade de Ciencias e Tecnologia
Campus de Gambelas
8000-117 Faro
Portugal
Tel./Fax:  (+351) 289 800950 / 289 819403
http://w3.ualg.pt/~pfrazao


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] float output precision questions

2002-10-30 Thread Pedro Miguel Frazao Fernandes Ferreira
Bruno Wolff III wrote:

On Tue, Oct 29, 2002 at 23:19:05 +0100,
  Peter Eisentraut [EMAIL PROTECTED] wrote:


There isn't a way right now, but it's planned to be able to dump
floating-point numbers in some binary form (like printf(%A)) to be able
to restore them exactly.  Not sure how that would satisfy the needs of
client interfaces, though.



Why not print it as a floating binary number instead of a floating decimal
number? I would think that would give you better portability than a system
specific binary representation.


Having a way to get the binary storage representation for float numbers 
would be good and would suffice within the same float number standard.

Thanks,
Pedro M. Ferreira

--
--
Pedro Miguel Frazao Fernandes Ferreira
Universidade do Algarve
Faculdade de Ciencias e Tecnologia
Campus de Gambelas
8000-117 Faro
Portugal
Tel./Fax:  (+351) 289 800950 / 289 819403
http://w3.ualg.pt/~pfrazao


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html


Re: [HACKERS] float output precision questions

2002-10-30 Thread Stephan Szabo

On Wed, 30 Oct 2002, Pedro Miguel Frazao Fernandes Ferreira wrote:

 Stephan Szabo wrote:
  On Tue, 29 Oct 2002, Peter Eisentraut wrote:
 
 
 Pedro Miguel Frazao Fernandes Ferreira writes:
 
 
 Is there a way to set query output precision to maximum precision ?
 For the type of application I mentioned this is crucial. People want to
 get the 'same' numbers, from querys or dumps, as they inserted them.
 
 
  How do you define maximum precision and same?  With simple test programs
  in C, using two digits more than DBL_DIG for printf specifier, it's easy
  for me to find numbers that change decimal string representation in the
  decimal representation - double - decimal representation conversion(*).
  The final double you get from the second conversion should be the same as
  the first, but is that what you need or do you need a stronger guarantee
  than that?

 When I say same I am talking about having a number 'stored' in double
 format in some client, inserting it in PostgreSQL float8 field and get
 it to the client as it was before:

[Some client]  (insert)   [PostgreSQL]  (query)  [Some client]
 (double number a)(float8 number)---(double number b)

 same is so that a==b is true.
 With current float8 output this is not allways true.

 I believe this should allways be true for numbers which are originally
 stored in double format.

The problem is that there are two competing needs here.  One is the above,
the other other is that you get something that has the same decimal
representation (within the float's ability to store the number). Right now
the system does the latter since for most people, that seems to be the
guarantee they want.

This would probably make sense as an option, so why don't you look at the
past discussions and see if you can come up with a solution that keeps
everyone happy (and preferably implement it, but...) :)




---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] float output precision questions

2002-10-30 Thread Pedro Miguel Frazao Fernandes Ferreira
Stephan Szabo wrote:

On Wed, 30 Oct 2002, Pedro Miguel Frazao Fernandes Ferreira wrote:

Stephan Szabo wrote:

On Tue, 29 Oct 2002, Peter Eisentraut wrote:

Pedro Miguel Frazao Fernandes Ferreira writes:

Is there a way to set query output precision to maximum precision ?
For the type of application I mentioned this is crucial. People want to
get the 'same' numbers, from querys or dumps, as they inserted them.



How do you define maximum precision and same?  With simple test programs
in C, using two digits more than DBL_DIG for printf specifier, it's easy
for me to find numbers that change decimal string representation in the
decimal representation - double - decimal representation conversion(*).
The final double you get from the second conversion should be the same as
the first, but is that what you need or do you need a stronger guarantee
than that?


When I say same I am talking about having a number 'stored' in double
format in some client, inserting it in PostgreSQL float8 field and get
it to the client as it was before:

  [Some client]  (insert)   [PostgreSQL]  (query)  [Some client]
(double number a)(float8 number)---(double number b)

same is so that a==b is true.
With current float8 output this is not allways true.

I believe this should allways be true for numbers which are originally
stored in double format.


The problem is that there are two competing needs here.  One is the above,
the other other is that you get something that has the same decimal
representation (within the float's ability to store the number). Right now
the system does the latter since for most people, that seems to be the
guarantee they want.


Look at this example:

1.79769313486231571e+308 is the largest floating point number 
representable by a C double in x86.

In C this is possible:

#include stdio.h
#include stdlib.h

int main(void)
{
  double v;
  char a[30];

  v=1.79769313486231571e+308;

  printf(   Stored double number: %25.18g\n,v);
  sprintf(a,%25.18g,v);
  printf(Converted to string: %s\n,a);
  v=atof(a);
  printf(Converted from string to double: %25.18g\n,v);
}

Using standard PostgreSQL query output it would not be possible to get 
this number, which has representation as a double.

I fetched the PostgreSQL source from Debian, changed 
src/backend/utils/adt/float.c to do sprintf(ascii, %25.18g, num) 
instead of sprintf(ascii, %.*g, DBL_DIG, num), compiled and installed. 
Now I can get the number as is.

I understand that if people insert a value of 1.1 in a double, they want 
to get 1.1 without knowing that in fact the stored number is 
1.10009. But do you understand that if some people insert, 
for example, a value of 1.79769313486231571e+308 they dont want to get 
1.79769313486232e+308 which does not compare equal (in Matlab or C) to 
the first ? This is a bug.

This would probably make sense as an option, so why don't you look at the
past discussions and see if you can come up with a solution that keeps
everyone happy (and preferably implement it, but...) :)


but ???

I have a sugestion:

To have parameters, say DOUBLE_FORMAT and FLOAT_FORMAT, which could have 
 option values of 'SHORT' and 'LONG'.
Option 'SHORT' would be default and produce the standard sprintf(ascii,...
Option 'LONG' would produce sprintf(ascii, %25.18g, num).

Other way would be to have number parameters to be used in the sprintf 
calls, in place of 25 and 18, in the format string.





---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster





--
--
Pedro Miguel Frazao Fernandes Ferreira
Universidade do Algarve
Faculdade de Ciencias e Tecnologia
Campus de Gambelas
8000-117 Faro
Portugal
Tel./Fax:  (+351) 289 800950 / 289 819403
http://w3.ualg.pt/~pfrazao


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] float output precision questions

2002-10-30 Thread Peter Eisentraut
Bruno Wolff III writes:

 On Tue, Oct 29, 2002 at 23:19:05 +0100,
   Peter Eisentraut [EMAIL PROTECTED] wrote:
 
  There isn't a way right now, but it's planned to be able to dump
  floating-point numbers in some binary form (like printf(%A)) to be able
  to restore them exactly.  Not sure how that would satisfy the needs of
  client interfaces, though.

 Why not print it as a floating binary number instead of a floating decimal
 number? I would think that would give you better portability than a system
 specific binary representation.

The printf(%A) output is not system-specific.

-- 
Peter Eisentraut   [EMAIL PROTECTED]


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] float output precision questions

2002-10-30 Thread Stephan Szabo

On Wed, 30 Oct 2002, Pedro M. Ferreira wrote:

 Stephan Szabo wrote:
  On Wed, 30 Oct 2002, Pedro Miguel Frazao Fernandes Ferreira wrote:
 
 I understand that if people insert a value of 1.1 in a double, they want
 to get 1.1 without knowing that in fact the stored number is
 1.10009. But do you understand that if some people insert,
 for example, a value of 1.79769313486231571e+308 they dont want to get
 1.79769313486232e+308 which does not compare equal (in Matlab or C) to
 the first ? This is a bug.
 
 
  I disagree to some extent.  I'm not sure it's meaningful to expect that
  (what if the database and the client are on different architectures) in
  general.  In any case, you're effectively going from decimal
  representation to double to decimal representation (the string you used
  to insert it - internal representation - string used to output it) and
  that's only guaranteed to be correct up to DBL_DIG digits as far as I can
  tell.  I think it'd be nice to have an option to get more digits for those
  sorts of applications, however.

 In the previous email example, in C, I was going from decimal to double
 and so on, but this is not the case when I do some simulation. In this
 case it will allways be from Matlab double to PostgreSQL float8 and from
 libpq PQgetvalue() string to Matlab double. The example was just a x86
 number example where (got the string from Matlab double) query output
 would fail.

I was just responding to it being a bug.  I don't think that expecting
a float-db-float (double-db-double) giving the same double is always
safe when you assume that the PostgreSQL machine might be on a system with
different guarantees about precision.  In practice, it's probably not a
big deal.

  , but I realize that you might not be interested in doing such. (I figured
  the last part was implied)

 ok! :)

 No problem. I have seen the GUC thing in the source (guc.c etc...) and it
 does not look too dificult. It has lots of examples in the code itself.
 What I am saying is that I can do it if pg-people agree on the (some)
 way to do it.

Yeah, I didn't think it'd be hard, but sometimes people are unable or
unwilling to do C code for things.

 I looked at some of these emails and it seemed to me that the problem
 was that Tom did'nt want a parameter that would force people to know
 about printf number formatting. I think the first solution above (the
 SHORT and LONG way) is simple, maintains usual output as default and
 enables 'maximum' precision at request.

That seems reasonable then, Tom'll probably give any other objections he
might have if he has any.


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] float output precision questions

2002-10-30 Thread Tom Lane
Stephan Szabo [EMAIL PROTECTED] writes:
 On Wed, 30 Oct 2002, Pedro M. Ferreira wrote:
 I looked at some of these emails and it seemed to me that the problem
 was that Tom did'nt want a parameter that would force people to know
 about printf number formatting. I think the first solution above (the
 SHORT and LONG way) is simple, maintains usual output as default and
 enables 'maximum' precision at request.

 That seems reasonable then, Tom'll probably give any other objections he
 might have if he has any.

My recollection is that other people (perhaps Peter?) were the ones
objecting before.  However I'd be somewhat unhappy with the proposal
as given:

Option 'SHORT' would be default and produce the standard sprintf(ascii,...
Option 'LONG' would produce sprintf(ascii, %25.18g, num).

since this seems to me to hardwire inappropriate assumptions about the
number of significant digits in a double.  (Yes, I know practically
everyone uses IEEE floats these days.  But it's inappropriate for PG
to assume that.)

AFAICT the real issue here is that binary float representations will
have a fractional decimal digit of precision beyond what DBL_DIG claims.
I think I could support adding an option that switches between the
current output format:
sprintf(ascii, %.*g, DBL_DIG, num);
and:
sprintf(ascii, %.*g, DBL_DIG+1, num);
and similarly for float4.  Given carefully written float I/O routines,
reading the latter output should reproduce the originally stored value.
(And if the I/O routines are not carefully written, you probably lose
anyway.)  I don't see a need for allowing more flexibility than that.

Comments?

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



[HACKERS] float output precision questions

2002-10-29 Thread Pedro Miguel Frazao Fernandes Ferreira
Hi All,

I wrote a Matlab interface to PostgreSQL based on libpq. It is working 
fine, supports multiple connections, supports all data types and arrays. 
It is a C program to do the interface and some Matlab wrapper functions 
around it to do the job at application level.

Matlab has an ODBC toolbox but we dont want to buy it since the 
interface does the job and we have been using PostgreSQL for long time.
We want to use PostgreSQl to store numeric data from simulation, 
computation and data acquisition sources. Basically a big bunch of float 
numbers.

There is still one problem, regarding float output formatting in querys 
and dumps, which essential for this type of application.

If I have a float8 field (named real8 below) in a table and insert the 
following,

insert into test(real8) values (4503599627370496);
INSERT 21192 1

A select produces,

select real8 from test;
real8
-
 4.5035996273705e+15
(1 row)

This is the string I would get from libpq's PQgetvalue(). PQftype() 
correctly returns float8. pg_dump produces the same result. If I convert 
PQgetvalue() to a C double I wont get the same number I inserted. If I 
do a restore from a dump I also wont have the same number. I can see 
that the original number is correctly stored because,

select to_char(real8,'999.9') from test;
   to_char
--
 4503599627370496
(1 row)

This way PQftype wont tell this is a float8.

Is there a way to set query output precision to maximum precision ?
For the type of application I mentioned this is crucial. People want to 
get the 'same' numbers, from querys or dumps, as they inserted them.

Matlab has a toolbox fucntion, claiming maximum precision, to convert 
from its double type (PostgreSQL float8) to string which does a 
sprintf('%25.18g',number).

Would it be possible to have a a parameter which could be changed by a 
SET command, in order to control output precision ? I searched the docs 
but could not find this.

--
--
Pedro Miguel Frazao Fernandes Ferreira
Universidade do Algarve
Faculdade de Ciencias e Tecnologia
Campus de Gambelas
8000-117 Faro
Portugal
Tel./Fax:  (+351) 289 800950 / 289 819403
http://w3.ualg.pt/~pfrazao


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] float output precision questions

2002-10-29 Thread Pedro Miguel Frazao Fernandes Ferreira
Sorry. I forgot to thank for any help from all of you in the previous 
message. Thanks!  :)

Just one more thing:

I now I can go to the source and change the code which converts floats 
to strings, and have my problem solved. But this wont be general. Others 
might need this kind of application.

I could post this interface to postgres interfaces site. Do you think 
this is ok ?

Thanks again !

Pedro M. Ferreira

Pedro Miguel Frazao Fernandes Ferreira wrote:
Hi All,

I wrote a Matlab interface to PostgreSQL based on libpq. It is working 
fine, supports multiple connections, supports all data types and arrays. 
It is a C program to do the interface and some Matlab wrapper functions 
around it to do the job at application level.

Matlab has an ODBC toolbox but we dont want to buy it since the 
interface does the job and we have been using PostgreSQL for long time.
We want to use PostgreSQl to store numeric data from simulation, 
computation and data acquisition sources. Basically a big bunch of float 
numbers.

There is still one problem, regarding float output formatting in querys 
and dumps, which essential for this type of application.

If I have a float8 field (named real8 below) in a table and insert the 
following,

insert into test(real8) values (4503599627370496);
INSERT 21192 1

A select produces,

select real8 from test;
real8
-
 4.5035996273705e+15
(1 row)

This is the string I would get from libpq's PQgetvalue(). PQftype() 
correctly returns float8. pg_dump produces the same result. If I convert 
PQgetvalue() to a C double I wont get the same number I inserted. If I 
do a restore from a dump I also wont have the same number. I can see 
that the original number is correctly stored because,

select to_char(real8,'999.9') from test;
   to_char
--
 4503599627370496
(1 row)

This way PQftype wont tell this is a float8.

Is there a way to set query output precision to maximum precision ?
For the type of application I mentioned this is crucial. People want to 
get the 'same' numbers, from querys or dumps, as they inserted them.

Matlab has a toolbox fucntion, claiming maximum precision, to convert 
from its double type (PostgreSQL float8) to string which does a 
sprintf('%25.18g',number).

Would it be possible to have a a parameter which could be changed by a 
SET command, in order to control output precision ? I searched the docs 
but could not find this.



--
--
Pedro Miguel Frazao Fernandes Ferreira
Universidade do Algarve
Faculdade de Ciencias e Tecnologia
Campus de Gambelas
8000-117 Faro
Portugal
Tel./Fax:  (+351) 289 800950 / 289 819403
http://w3.ualg.pt/~pfrazao


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] float output precision questions

2002-10-29 Thread Bruce Momjian

TODO has:

   o Add SET REAL_FORMAT and SET DOUBLE_PRECISION_FORMAT
 using printf args

so we have not implemented it yet.

---

Pedro Miguel Frazao Fernandes Ferreira wrote:
 Hi All,
 
 I wrote a Matlab interface to PostgreSQL based on libpq. It is working 
 fine, supports multiple connections, supports all data types and arrays. 
 It is a C program to do the interface and some Matlab wrapper functions 
 around it to do the job at application level.
 
 Matlab has an ODBC toolbox but we dont want to buy it since the 
 interface does the job and we have been using PostgreSQL for long time.
 We want to use PostgreSQl to store numeric data from simulation, 
 computation and data acquisition sources. Basically a big bunch of float 
 numbers.
 
 There is still one problem, regarding float output formatting in querys 
 and dumps, which essential for this type of application.
 
 If I have a float8 field (named real8 below) in a table and insert the 
 following,
 
 insert into test(real8) values (4503599627370496);
 INSERT 21192 1
 
 A select produces,
 
 select real8 from test;
  real8
 -
   4.5035996273705e+15
 (1 row)
 
 This is the string I would get from libpq's PQgetvalue(). PQftype() 
 correctly returns float8. pg_dump produces the same result. If I convert 
 PQgetvalue() to a C double I wont get the same number I inserted. If I 
 do a restore from a dump I also wont have the same number. I can see 
 that the original number is correctly stored because,
 
 select to_char(real8,'999.9') from test;
 to_char
 --
   4503599627370496
 (1 row)
 
 This way PQftype wont tell this is a float8.
 
 Is there a way to set query output precision to maximum precision ?
 For the type of application I mentioned this is crucial. People want to 
 get the 'same' numbers, from querys or dumps, as they inserted them.
 
 Matlab has a toolbox fucntion, claiming maximum precision, to convert 
 from its double type (PostgreSQL float8) to string which does a 
 sprintf('%25.18g',number).
 
 Would it be possible to have a a parameter which could be changed by a 
 SET command, in order to control output precision ? I searched the docs 
 but could not find this.
 
 -- 
 --
 Pedro Miguel Frazao Fernandes Ferreira
 Universidade do Algarve
 Faculdade de Ciencias e Tecnologia
 Campus de Gambelas
 8000-117 Faro
 Portugal
 Tel./Fax:  (+351) 289 800950 / 289 819403
 http://w3.ualg.pt/~pfrazao
 
 
 ---(end of broadcast)---
 TIP 4: Don't 'kill -9' the postmaster
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] float output precision questions

2002-10-29 Thread Peter Eisentraut
Pedro Miguel Frazao Fernandes Ferreira writes:

 Is there a way to set query output precision to maximum precision ?
 For the type of application I mentioned this is crucial. People want to
 get the 'same' numbers, from querys or dumps, as they inserted them.

There isn't a way right now, but it's planned to be able to dump
floating-point numbers in some binary form (like printf(%A)) to be able
to restore them exactly.  Not sure how that would satisfy the needs of
client interfaces, though.

 Matlab has a toolbox fucntion, claiming maximum precision, to convert
 from its double type (PostgreSQL float8) to string which does a
 sprintf('%25.18g',number).

Do we have some mathematical guarantee that this is sufficient and
necessary?  If so, then it might do.

-- 
Peter Eisentraut   [EMAIL PROTECTED]


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] float output precision questions

2002-10-29 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 TODO has:
o Add SET REAL_FORMAT and SET DOUBLE_PRECISION_FORMAT
using printf args
 so we have not implemented it yet.

IIRC, the last time it was discussed there was disagreement about how
it should work; check the pghackers archives for details.  The feature
probably won't go anywhere until those issues are resolved.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] float output precision questions

2002-10-29 Thread Stephan Szabo
On Tue, 29 Oct 2002, Peter Eisentraut wrote:

 Pedro Miguel Frazao Fernandes Ferreira writes:

  Is there a way to set query output precision to maximum precision ?
  For the type of application I mentioned this is crucial. People want to
  get the 'same' numbers, from querys or dumps, as they inserted them.

How do you define maximum precision and same?  With simple test programs
in C, using two digits more than DBL_DIG for printf specifier, it's easy
for me to find numbers that change decimal string representation in the
decimal representation - double - decimal representation conversion(*).
The final double you get from the second conversion should be the same as
the first, but is that what you need or do you need a stronger guarantee
than that?



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] float output precision questions

2002-10-29 Thread Bruno Wolff III
On Tue, Oct 29, 2002 at 23:19:05 +0100,
  Peter Eisentraut [EMAIL PROTECTED] wrote:
 
 There isn't a way right now, but it's planned to be able to dump
 floating-point numbers in some binary form (like printf(%A)) to be able
 to restore them exactly.  Not sure how that would satisfy the needs of
 client interfaces, though.

Why not print it as a floating binary number instead of a floating decimal
number? I would think that would give you better portability than a system
specific binary representation.

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]