Re: [HACKERS] [GENERAL] Floating point error

2013-03-05 Thread Tom Duffey
This conversation has moved beyond my ability to be useful but I want to remind 
everyone of my original issues in case it helps you improve the docs:

1) Data shown in psql did not match data retrieved by JDBC. I had to debug 
pretty deep into the JDBC code to confirm that a value I was staring at in psql 
was different in JDBC. Pretty weird, but I figured it had something to do with 
floating point malarkey.

2) The problem in #1 could not be reproduced when running on our test database. 
Again very weird, because as far as psql was showing me the values in the two 
databases were identical. I used COPY to transfer some data from the production 
database to the test database.

I now know that what you see in psql is not necessarily what you see in JDBC. I 
also know that you need to set extra_float_digits = 3 before using COPY to 
transfer data from one database to another or risk differences in floating 
point values. Sounds like both pg_dump and the JDBC driver must be doing this 
or its equivalent on their own.
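
In our transfer script the fix boils down to something like this (a rough sketch; the table and file names here are made up):

SET extra_float_digits = 3;
COPY production_points TO '/tmp/points.copy';
-- then, connected to the test database:
COPY test_points FROM '/tmp/points.copy';

With the setting in place the text emitted by COPY carries enough digits that the restored float4/float8 values come back identical instead of being rounded to the shortened display form.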

If the numeric types page of the documentation had mentioned the 
extra_float_digits setting then I might have been able to solve my own problem. 
I'd like you to add some mention of it, even if it is just handwaving, but I 
will let you guys hash it out from here. Either way, PostgreSQL rocks!

Tom

On Mar 5, 2013, at 12:38 PM, Tom Lane  wrote:

> Maciek Sakrejda  writes:
>> On Tue, Mar 5, 2013 at 10:23 AM, Tom Lane  wrote:
>>> Basically, the default behavior is tuned to the expectations of people
>>> who think that what they put in is what they should get back, ie we
>>> don't want the system doing this by default:
>>> 
>>> regression=# set extra_float_digits = 3;
>>> SET
>>> regression=# select 0.1::float4;
>>>    float4
>>> -------------
>>>  0.100000001
>>> (1 row)
>>> 
>>> regression=# select 0.1::float8;
>>>         float8
>>> ----------------------
>>>  0.100000000000000006
>>> (1 row)
>>> 
>>> We would get a whole lot more bug reports, not fewer, if that were
>>> the default behavior.
> 
>> Isn't this a client rendering issue, rather than an on-the-wire encoding 
>> issue?
> 
> Nope, at least not unless you ask for binary output format (which
> introduces a whole different set of portability gotchas, so it's
> not the default either).
> 
>   regards, tom lane

--
Tom Duffey
tduf...@trillitech.com
414-751-0600 x102





Re: [GENERAL] Floating point error

2013-02-28 Thread Tom Duffey
Hi Everyone,

To bring closure to this thread, my whole problem was caused by not knowing 
about the extra_float_digits setting. We have a script that uses COPY to 
transfer a subset of rows from a very large production table to a test table. 
The script was not setting extra_float_digits so the values did not match even 
though they appeared to match when running queries in psql. Definitely another 
gotcha for floating point values and it might be a good idea to mention this 
setting on the "Numeric Types" page of the docs.
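
For what it's worth, a quick way to confirm whether two copies of a table really match is to compare a full-precision checksum on each side, along these lines (rough sketch; "test" and its columns are placeholders):

SET extra_float_digits = 3;
SELECT md5(string_agg(value::text, ',' ORDER BY id)) FROM test;

If the md5 results from the two databases differ, the floats are not identical even though the default display makes them look the same.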

Thanks to all who chimed in to help!

Tom

On Feb 28, 2013, at 7:05 PM, James Cloos  wrote:

>>>>>> "TD" == Tom Duffey  writes:
> 
> TD> Riddle me this. I have a database column of type "real" that gets
> TD> mapped to a Java field of type double via JDBC. ...
> 
> TD> - Selecting values from both test and production DBs using psql
> TD>   shows "10.3885" as the value
> 
> TD> - The Java app on production shows "10.3884573" while the test app
> TD>   shows "10.3885"
> 
> I suspect the issue is that psql(1) and whatever java method you use to
> convert the floats to text choose different rounding.
> 
> By default, it seems that psql(1) uses something like printf("%.4f",...)
> whereas your java app calls a routine which works more like "%.7f".
> 
> (The wire format for floats is the same as they are stored, not a text
> representation thereof.)
> 
> -JimC
> -- 
> James Cloos  OpenPGP: 1024D/ED7DAEA6

--
Tom Duffey
tduf...@trillitech.com
414-751-0600 x102





Re: [GENERAL] Floating point error

2013-02-25 Thread Tom Duffey
That's exactly what I was looking for. We use COPY to transfer data from a 1 
billion+ row table to a test database and were confused about why the results 
looked the same but obviously were not. Sounds like we need to use the 
extra_float_digits setting to include all the available information when 
transferring the data.
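
So the transfer step presumably needs to look more like this (untested sketch; the table name and predicate are just placeholders for however we pick the subset):

SET extra_float_digits = 3;
COPY (SELECT * FROM point_history WHERE timestamp > NOW() - interval '1 month')
    TO STDOUT;

with the matching COPY ... FROM on the test side left unchanged.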

Thanks for the explanation.

Tom

On Feb 25, 2013, at 8:00 AM, Albe Laurenz  wrote:

> Tom Duffey wrote:
>> Here is a smaller test case that does not involve Java. I guess this is 
>> probably just floating point error from inserting an initial value with more 
>> precision than the field can hold, but it's still a surprise.
>> 
>> Create a test table, insert a couple values and view the results:
>> 
>> CREATE TABLE test (
>>id INTEGER PRIMARY KEY,
>>value REAL NOT NULL
>> );
>> 
>> INSERT INTO test (id, value) VALUES (1, 10.3884573), (2, 10.3885);
>> SELECT * FROM test;
>> 
>>  id |  value
>> ----+---------
>>   1 | 10.3885
>>   2 | 10.3885
>> (2 rows)
> 
> SET extra_float_digits=3;
> SELECT * FROM test;
> 
>  id |   value
> ----+------------
>   1 | 10.3884573
>   2 | 10.3885002
> (2 rows)
> 
> PostgreSQL by default omits the last three digits to avoid
> differences on different architectures (I think).
> 
> When you convert to double precision, you'll see these digits.
> 
>> At this point you would think you have two equal values. Now change the type:
>> 
>> ALTER TABLE test ALTER COLUMN value TYPE DOUBLE PRECISION;
>> SELECT * FROM test;
>> 
>>  id |      value
>> ----+------------------
>>   1 | 10.3884572982788
>>   2 |  10.388500213623
>> (2 rows)
> 
> Yours,
> Laurenz Albe

--
Tom Duffey
tduf...@trillitech.com
414-751-0600 x102





Re: [GENERAL] Floating point error

2013-02-25 Thread Tom Duffey
Here is a smaller test case that does not involve Java. I guess this is 
probably just floating point error from inserting an initial value with more 
precision than the field can hold, but it's still a surprise.

Create a test table, insert a couple values and view the results:

CREATE TABLE test (
    id INTEGER PRIMARY KEY,
    value REAL NOT NULL
);

INSERT INTO test (id, value) VALUES (1, 10.3884573), (2, 10.3885);
SELECT * FROM test;

 id |  value
----+---------
  1 | 10.3885
  2 | 10.3885
(2 rows)

At this point you would think you have two equal values. Now change the type:

ALTER TABLE test ALTER COLUMN value TYPE DOUBLE PRECISION;
SELECT * FROM test;

 id |      value
----+------------------
  1 | 10.3884572982788
  2 |  10.388500213623
(2 rows)

The values are no longer equal, and the first one is in fact closer to what was 
originally inserted. Why is this? Is this simply caused by how the initially 
inserted value is stored as floating point?
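
The same effect can be reproduced without a table at all by casting through real (the output below is what I would expect given the values above; column widths may differ):

SELECT 10.3884573::real::double precision AS v1,
       10.3885::real::double precision    AS v2;

        v1        |       v2
------------------+-----------------
 10.3884572982788 | 10.388500213623
(1 row)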

If you create a "copy" of the database before changing the field type then both 
values get inserted as "10.3885", and changing the type then results in two 
equal values. Maybe this is just another pitfall of using floating point 
numbers. At this point I am just trying to identify exactly where our errors 
are being introduced, so can anyone confirm the above behavior is correct? In 
our real world example we are not changing the type but are instead getting the 
longer value shown for id = 1 above when using JDBC to retrieve values into a 
Java double field.

I ran the above on PostgreSQL 9.1.2 and 9.2.2 with the same results.

Tom

On Feb 24, 2013, at 9:17 PM, Adrian Klaver  wrote:

> On 02/24/2013 06:58 PM, Tom Duffey wrote:
>> 
>> On Feb 24, 2013, at 8:44 PM, Adrian Klaver  wrote:
>> 
>>> On 02/24/2013 06:13 PM, Tom Duffey wrote:
>>>> Hi Everyone,
>>>> 
>>>> Riddle me this. I have a database column of type "real" that gets mapped 
>>>> to a Java field of type double via JDBC. We have two databases, test and 
>>>> production, and the test database is periodically blown away and reloaded 
>>>> from a copy of production. We recently noticed that some values do not 
>>>> match when viewed within our application on test vs. production. More 
>>>> specifically:
>>>> 
>>>> - Selecting values from both test and production DBs using psql shows 
>>>> "10.3885" as the value
>>>> - The Java app on production shows "10.3884573" while the test app shows 
>>>> "10.3885"
>>>> 
>>>> I have a hunch that when the value was originally inserted into the 
>>>> production DB it probably contained more than the 6 digits supported by 
>>>> the real data type. It may have even been exactly the "10.3884573" value 
>>>> we see when retrieving via JDBC on production. What I don't understand is 
>>>> why when the value gets mapped back to Java via JDBC those extra digits 
>>>> are coming back. Can anyone explain this or do you think I'm on the wrong 
>>>> track? I stepped through code and it sure seems like the extra information 
>>>> is coming back from the JDBC driver.
>>> 
>>> Are the production and test apps running on the same platform i.e. OS, 
>>> bitness, etc.
>> 
>> Yes, the production and test apps are running on the same platform. The Java 
>> apps themselves are physically on the same Linux server. The production and 
>> test databases reside within the same instance of PostgreSQL.
>> 
>> Also, I should have mentioned up front that I am well aware of the pitfalls 
>> of using floating point values and also the fact that PostgreSQL's "real" 
>> data type supports 6 digits of precision. What I do not understand is why my 
>> JDBC driver is returning more information than what I receive in psql or if 
>> I operate on a copy of the database. This leads me to believe that more 
>> information was available at insertion time and is somehow being made 
>> available to my application even though the data type should only store 6 
>> digits. Let me see if I can write a quick little test case.
>> 
> 
> Well I guess you could look in the dump file and see what is recorded there.
> 
>> Tom
>> 
> 
> 
> -- 
> Adrian Klaver
> adrian.kla...@gmail.com

--
Tom Duffey
tduf...@trillitech.com
414-751-0600 x102





Re: [GENERAL] Floating point error

2013-02-24 Thread Tom Duffey

On Feb 24, 2013, at 8:44 PM, Adrian Klaver  wrote:

> On 02/24/2013 06:13 PM, Tom Duffey wrote:
>> Hi Everyone,
>> 
>> Riddle me this. I have a database column of type "real" that gets mapped to 
>> a Java field of type double via JDBC. We have two databases, test and 
>> production, and the test database is periodically blown away and reloaded 
>> from a copy of production. We recently noticed that some values do not match 
>> when viewed within our application on test vs. production. More specifically:
>> 
>> - Selecting values from both test and production DBs using psql shows 
>> "10.3885" as the value
>> - The Java app on production shows "10.3884573" while the test app shows 
>> "10.3885"
>> 
>> I have a hunch that when the value was originally inserted into the 
>> production DB it probably contained more than the 6 digits supported by the 
>> real data type. It may have even been exactly the "10.3884573" value we see 
>> when retrieving via JDBC on production. What I don't understand is why when 
>> the value gets mapped back to Java via JDBC those extra digits are coming 
>> back. Can anyone explain this or do you think I'm on the wrong track? I 
>> stepped through code and it sure seems like the extra information is coming 
>> back from the JDBC driver.
> 
> Are the production and test apps running on the same platform i.e. OS, 
> bitness, etc.

Yes, the production and test apps are running on the same platform. The Java 
apps themselves are physically on the same Linux server. The production and 
test databases reside within the same instance of PostgreSQL.

Also, I should have mentioned up front that I am well aware of the pitfalls of 
using floating point values and also the fact that PostgreSQL's "real" data 
type supports 6 digits of precision. What I do not understand is why my JDBC 
driver is returning more information than what I receive in psql or if I 
operate on a copy of the database. This leads me to believe that more 
information was available at insertion time and is somehow being made available 
to my application even though the data type should only store 6 digits. Let me 
see if I can write a quick little test case.

Tom

--
Tom Duffey
tduf...@trillitech.com
414-751-0600 x102





[GENERAL] Floating point error

2013-02-24 Thread Tom Duffey
Hi Everyone,

Riddle me this. I have a database column of type "real" that gets mapped to a 
Java field of type double via JDBC. We have two databases, test and production, 
and the test database is periodically blown away and reloaded from a copy of 
production. We recently noticed that some values do not match when viewed 
within our application on test vs. production. More specifically:

- Selecting values from both test and production DBs using psql shows "10.3885" 
as the value
- The Java app on production shows "10.3884573" while the test app shows 
"10.3885"

I have a hunch that when the value was originally inserted into the production 
DB it probably contained more than the 6 digits supported by the real data 
type. It may have even been exactly the "10.3884573" value we see when 
retrieving via JDBC on production. What I don't understand is why when the 
value gets mapped back to Java via JDBC those extra digits are coming back. Can 
anyone explain this or do you think I'm on the wrong track? I stepped through 
code and it sure seems like the extra information is coming back from the JDBC 
driver.

Tom

--
Tom Duffey
tduf...@trillitech.com
414-751-0600 x102





Re: [GENERAL] Query memory usage

2010-05-15 Thread Tom Duffey


On May 15, 2010, at 8:00 PM, Tom Lane wrote:


Tom Duffey  writes:

On May 15, 2010, at 7:28 PM, Tom Lane wrote:
Well, I tried executing a large "copy (select ...)" query and couldn't
see any memory bloat at all in either the backend or psql.  So there's
something relevant that you haven't told us.


I hope you are right!  The actual query is different because I was not
aware until right before I posted this question that you can have a
WHERE clause with COPY.  Here is the actual query I ran:



SELECT point_id || E'\t' || status || E'\t' || value || E'\t' ||
timestamp
FROM point_history
WHERE timestamp > NOW() - interval '18 months';


Ermm ... is that the whole query, or did you wrap it in COPY (...) TO
STDOUT?  The former case will cause psql to eat memory, because it tries
to buffer the whole result of an ordinary query.  In the latter case
psql will just stream the data through to the output file.



That's the whole query.  If I understand your reply correctly it  
sounds like psql was the culprit and that I should try again using  
COPY (...) TO STDOUT, no?
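
Presumably something along these lines (untested sketch of the rewritten query):

COPY (
    SELECT point_id || E'\t' || status || E'\t' || value || E'\t' || timestamp
    FROM point_history
    WHERE timestamp > NOW() - interval '18 months'
) TO STDOUT;

run the same way as before (psql < get_data.sql > data.sql), so that psql streams the rows to the file instead of buffering the whole result set in memory.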


Tom



Re: [GENERAL] Query memory usage

2010-05-15 Thread Tom Duffey


On May 15, 2010, at 7:28 PM, Tom Lane wrote:


Tom Duffey  writes:

On May 15, 2010, at 4:51 PM, Tom Lane wrote:

What's being done on the client side with the data?



I am executing the query in psql at the command line and piping the
result to a file, e.g.,
psql < get_data.sql > data.sql


Well, I tried executing a large "copy (select ...)" query and couldn't
see any memory bloat at all in either the backend or psql.  So there's
something relevant that you haven't told us.

Could we see the full schema (eg via psql \dt) for the table being
copied?


I hope you are right!  The actual query is different because I was not  
aware until right before I posted this question that you can have a  
WHERE clause with COPY.  Here is the actual query I ran:


SELECT point_id || E'\t' || status || E'\t' || value || E'\t' || timestamp
FROM point_history
WHERE timestamp > NOW() - interval '18 months';

And here is the table schema:

prod=> \dt point_history
           List of relations
 Schema |     Name      | Type  | Owner
--------+---------------+-------+-------
 public | point_history | table | prod
(1 row)

prod=> \d point_history
          Table "public.point_history"
  Column   |            Type             | Modifiers
-----------+-----------------------------+-----------
 point_id  | integer                     | not null
 value     | real                        | not null
 status    | integer                     | not null
 timestamp | timestamp without time zone | not null
Indexes:
    "point_history_pkey" PRIMARY KEY, btree (point_id, "timestamp")
Foreign-key constraints:
    "$1" FOREIGN KEY (point_id) REFERENCES point(id)

Tom

Re: [GENERAL] Query memory usage

2010-05-15 Thread Tom Duffey


On May 15, 2010, at 4:51 PM, Tom Lane wrote:


Tom Duffey  writes:

I have a table with several hundred million rows of timestamped
values.  Using pg_dump we are able to dump the entire table to disk no
problem.  However, I would like to retrieve a large subset of data
from this table using something like:

COPY (SELECT * FROM history WHERE timestamp > '2009-01-01') TO STDOUT;



Executing this query causes our server to consume all available swap
and crash.


What's being done on the client side with the data?  AFAIK that
operation really shouldn't consume a lot of memory on the server side.
It would help if you'd be more specific about which process is consuming
swap space.


I am executing the query in psql at the command line and piping the  
result to a file, e.g.,


psql < get_data.sql > data.sql

Tom



[GENERAL] Query memory usage

2010-05-15 Thread Tom Duffey

Hi Everyone,

I have a table with several hundred million rows of timestamped  
values.  Using pg_dump we are able to dump the entire table to disk no  
problem.  However, I would like to retrieve a large subset of data  
from this table using something like:


COPY (SELECT * FROM history WHERE timestamp > '2009-01-01') TO STDOUT;

Executing this query causes our server to consume all available swap  
and crash.  Can anyone help me figure out what needs to be done to  
allow this query to execute?  How long it takes doesn't really matter  
as long as it can be performed reliably.  The database currently lives  
on a Red Hat EL 5.3 server with 16GB RAM and 4GB swap running  
PostgreSQL 8.3.7.  Possibly relevant lines from postgresql.conf:


shared_buffers = 4GB
work_mem = 32MB
maintenance_work_mem = 1GB
effective_cache_size = 8GB

Tom



Re: [GENERAL] VMWare file system / database corruption

2009-09-21 Thread Tom Duffey


On Sep 21, 2009, at 12:40 PM, Scott Marlowe wrote:

On Mon, Sep 21, 2009 at 11:09 AM, Tom Duffey  wrote:

Hi All,

We're having numerous problems with a PostgreSQL 8.3.7 database running on a
virtual Linux server w/VMWare ESX.  This is not by choice and I have been
asking the operator of this equipment for details about the disk setup and
here's what I got:

"We have a SAN that is presenting an NFS share.  VMWare sees that share and
reads the VMDK file that makes up the virtual file system."

Does anyone with a better understanding of PostgreSQL and VMWare know if
this is an unreliable setup for PostgreSQL?  I see things like "NFS" and
"VMWare" and start to get worried.


I see VMWare and think performance issues, I see NFS and think dear
god help us all.  Even if properly set up, NFS is a problem waiting to
happen, and it's not reliable storage for a database in my opinion.
That said, lots of folks do it.  Ask for the NFS mount options from
the sysadmin.


Thanks to everyone so far for the insight.  I'm trying to get more  
details about the hardware setup but am not making much progress.


Here are some of the errors we're getting.  I searched through  
archives and they all seem to point at hardware trouble but is there  
anything else I should be looking at?


ERROR:  invalid page header in block 2 of relation "pg_toast_19466_index"

ERROR:  invalid memory alloc request size 1667592311
STATEMENT:  COPY public.version_bundle (node_id_hi, node_id_lo, bundle_data) TO stdout;

ERROR:  unexpected chunk number 1632 (expected 1629) for toast value 19711 in pg_toast_19184
STATEMENT:  COPY public.data_binval (binval_id, binval_data) TO stdout;

ERROR:  invalid page header in block 414 of relation "pg_toast_19460_index"

ERROR:  could not open segment 1 of relation 1663/16386/16535 (target block 3966127611): No such file or directory

I dealt with some of the above by reindexing or finding and deleting  
bad rows.  I can now successfully dump the database but of course have  
missing data so the application is toast.  What I'm really wondering  
now is how to prevent this from happening again and if that means  
moving the database to new hardware.
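
For reference, the reindexing I mentioned amounted to something like this (a sketch; the index names come from the errors above and it needs to be run as a superuser):

REINDEX INDEX pg_toast.pg_toast_19466_index;
REINDEX INDEX pg_toast.pg_toast_19460_index;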


Best Regards,

Tom




[GENERAL] VMWare file system / database corruption

2009-09-21 Thread Tom Duffey

Hi All,

We're having numerous problems with a PostgreSQL 8.3.7 database  
running on a virtual Linux server w/VMWare ESX.  This is not by choice  
and I have been asking the operator of this equipment for details  
about the disk setup and here's what I got:


"We have a SAN that is presenting an NFS share.  VMWare sees that  
share and reads the VMDK file that makes up the virtual file system."


Does anyone with a better understanding of PostgreSQL and VMWare know  
if this is an unreliable setup for PostgreSQL?  I see things like  
"NFS" and "VMWare" and start to get worried.


Tom

--
Tom Duffey 
Technology by Design :: http://techbydesign.com/
p: 414.431.0800





Re: [GENERAL] could not access status of transaction

2009-03-25 Thread Tom Duffey

Hi Tom,

On Mar 25, 2009, at 9:02 PM, Tom Lane wrote:


Tom Duffey  writes:

One of our databases suffered a problem yesterday during a normal
update, something we have been doing for years.  Near the end of the
process a foreign key constraint is rebuilt on a table containing
several hundred million rows.  Rebuilding the constraint failed with
the following message:



ERROR:  could not access status of transaction 4294918145
DETAIL:  Could not open file "pg_clog/0FFF": No such file or directory.


This looks like a garden-variety data corruption problem to me.
Trashed rows tend to yield this type of error because the "xmin"
transaction ID is the first field that the server can check with
any amount of finesse.  4294918145 is FFFF4001 in hex, saith my
calculator, so it looks like a bunch of bits went to ones --- or
perhaps more likely, the row offset in the page header got clobbered
and we're looking at some bytes that never were a transaction ID
at all.

So I'd try looking around for flaky RAM, failing disks, loose cables,
that sort of thing ...


Are you aware of any issues like this related to VMWare ESX?  Our  
PostgreSQL server is running in such an environment and I asked the  
guys to review your email and they thought maybe this type of  
corruption could happen when the virtual machine was moved from one  
physical server to another, which we have done once or twice in the  
past few months.


Tom

--
Tom Duffey 
Technology by Design :: http://techbydesign.com/
p: 414.431.0800



[GENERAL] could not access status of transaction

2009-03-25 Thread Tom Duffey

Hi All,

One of our databases suffered a problem yesterday during a normal  
update, something we have been doing for years.  Near the end of the  
process a foreign key constraint is rebuilt on a table containing  
several hundred million rows.  Rebuilding the constraint failed with  
the following message:


ERROR:  could not access status of transaction 4294918145
DETAIL:  Could not open file "pg_clog/0FFF": No such file or directory.

Here's the table and constraint definitions:

CREATE TABLE point_history (
point_id integer NOT NULL,
value real NOT NULL,
status integer NOT NULL,
"timestamp" timestamp without time zone NOT NULL
);

ALTER TABLE point_history ADD CONSTRAINT point_history_pkey PRIMARY  
KEY (point_id, "timestamp");
ALTER TABLE point_history ADD CONSTRAINT "$1" FOREIGN KEY (point_id)  
REFERENCES point(id);


I read about this and created the pg_clog/0FFF file, filling it  
with 256K of zeroes, and then vacuumed the database.  Then I tried  
rebuilding the constraint and received a foreign key violation:


DETAIL:  Key (point_id)=(2) is not present in table "point".

The crappy thing about this is that there was no record in the  
point_history table with point_id = 2:


db=> select * from point_history where point_id = 2;
 point_id | value | status | timestamp
----------+-------+--------+-----------
(0 rows)
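
A broader check for rows that would actually make the constraint fail would be something like this (a sketch; it assumes the table definitions above and that the table can still be scanned cleanly):

SELECT ph.point_id, ph."timestamp"
FROM point_history ph
LEFT JOIN point p ON p.id = ph.point_id
WHERE p.id IS NULL;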

I scratched my head for a while and decided to reload the database  
from a backup, which I'm still working on now.  I'm wondering if  
anyone has any thoughts or ideas about this?  I found references to  
similar problems but they were all for older versions of PostgreSQL.   
When the problem occurred we were running 8.3.6 and are now running  
8.3.7.


Tom

--
Tom Duffey 
Technology by Design :: http://techbydesign.com/
p: 414.431.0800