[SQL] how to speed up these queries ?

2005-03-03 Thread Dracula 007
Hello,
   I have two "large" tables - "sessions" (about 1.500.000 rows) and
"actions" (about 4.000.000 rows), and the "actions" table is connected
to the "sessions" (it contains a key from it). The simplified structure
of these tables is
sessions (
session_id int4,
visitor_id int4,
session_ip inet,
session_date timestamp
)
actions (
action_id int4,
session_id int4, -- foreign key, references sessions(session_id)
action_date timestamp,
action_year int2,
action_month int2,
action_day int2
)
I run SQL queries like
   SELECT
  COUNT(actions.session_id) AS sessions_count,
  COUNT(DISTINCT visitor_id) AS visitors_count,
  COUNT(DISTINCT session_ip) AS ips_count
   FROM actions LEFT JOIN sessions USING (session_id)
   GROUP BY action_year, action_month, action_day
but it's really, really slow. I've tried different indexes on different
columns, but no matter what I try I can't get it any faster.
The EXPLAIN ANALYZE of the query is

--
 Aggregate  (cost=347276.05..347276.05 rows=1 width=23) (actual time=210060.349..210060.350 rows=1 loops=1)
   ->  Hash Left Join  (cost=59337.55..305075.27 rows=4220077 width=23) (actual time=24202.338..119924.254 rows=4220077 loops=1)
         Hash Cond: ("outer".session_id = "inner".session_id)
         ->  Seq Scan on actions  (cost=0.00..114714.77 rows=4220077 width=8) (actual time=7539.653..44585.023 rows=4220077 loops=1)
         ->  Hash  (cost=47650.64..47650.64 rows=1484764 width=19) (actual time=16628.790..16628.790 rows=0 loops=1)
               ->  Seq Scan on sessions  (cost=0.00..47650.64 rows=1484764 width=19) (actual time=0.041..13378.667 rows=1484764 loops=1)
 Total runtime: 210061.073 ms
--

As you can see, it runs for about 4 minutes, which is not too fast. Is
there some way to speed up such queries?



Re: [SQL] Multiples schemas

2005-03-03 Thread Markus Schaber
Hi, Lucas,

[EMAIL PROTECTED] schrieb:

> Is there a way to construct multi-level schemas in my database?
>  Something like:
>   mysystem.finances.money.tables
>   mysystem.finances.money.functions
>   mysystem.finances.credits.tables
>   mysystem.finances.credits.functions
>   mysystem.amount.products..
>   

AFAIK, there is no way to have hierarchical schemas, but if you only
need it for humans, you can name your schemas like finances_money_tables
and finances_money_functions.
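
For instance (a minimal sketch using names from your example):

  CREATE SCHEMA finances_money;
  CREATE SCHEMA finances_credits;
  -- objects then live at finances_money.some_table, and so on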

> Or can I use another database like:
>  createdb DB1
>  createdb DB2
>  psql DB1
>  select * from DB2.schema.table

In contrib/, there's a dblink module that lets you select data from
other databases, but I'm afraid it is rather limited in usage and
performance.
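
A call looks roughly like this (a minimal, untested sketch; the
connection string and column list are made up):

  SELECT *
    FROM dblink('dbname=DB2', 'SELECT id, name FROM someschema.sometable')
      AS t(id integer, name text);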


Markus
-- 
Markus Schaber | Dipl. Informatiker | Software Development GIS

Fight against software patents in EU! http://ffii.org/
  http://nosoftwarepatents.org/



Re: [SQL] how to speed up these queries ?

2005-03-03 Thread Sean Davis
On Mar 3, 2005, at 6:05 AM, Dracula 007 wrote:
[original message and EXPLAIN ANALYZE output quoted in full -- snipped; see above]
It looks like you are always going to do a sequential scan on the
tables, as you always look at the entire table(s).  How often do you run
the query compared to the load on the database?  If you run the query
often relative to the load, could you keep a table of counts, something
like:

create table summarize_use (
action_date date,
sessions_count int,
visitors_count int,
ips_count int)
and then use triggers on the sessions and actions tables to increment
the various counts in the summarize_use table, based on the action_date
and session_date date parts?  The summarize_use table would then look
like:

action_date   sessions_count   visitors_count   ips_count
3-2-2005      15               12               12
Just a thought, and of course you would pay the price of triggers with 
each insert to the sessions or actions table.
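
A rough, untested sketch of one such trigger (all names hypothetical;
note that the DISTINCT visitor and IP counts would need extra
bookkeeping that a plain counter can't provide):

  CREATE OR REPLACE FUNCTION bump_sessions_count() RETURNS trigger AS '
  BEGIN
      UPDATE summarize_use
         SET sessions_count = sessions_count + 1
       WHERE action_date = NEW.session_date::date;
      IF NOT FOUND THEN
          -- first hit of the day: start a fresh row
          INSERT INTO summarize_use
              (action_date, sessions_count, visitors_count, ips_count)
          VALUES (NEW.session_date::date, 1, 0, 0);
      END IF;
      RETURN NEW;
  END;
  ' LANGUAGE plpgsql;

  CREATE TRIGGER sessions_summary_trg
      AFTER INSERT ON sessions
      FOR EACH ROW EXECUTE PROCEDURE bump_sessions_count();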

If you don't like that, and you run this say every night at midnight,
you could set up a cron job that runs this query and selects it into a
table for direct querying--call this table summarize_use again.  Then,
to get the full result, you would simply do something like:

select * from summarize_use
union
SELECT
  '2005-03-03'::date AS action_date,  -- keeps the two halves union-compatible
  COUNT(actions.session_id) AS sessions_count,
  COUNT(DISTINCT visitor_id) AS visitors_count,
  COUNT(DISTINCT session_ip) AS ips_count
   FROM actions LEFT JOIN sessions USING (session_id)
   WHERE
 action_year=2005 AND
 action_month=3 AND
 action_day=3;
This would then be fully up to date and would use indices on
action_year, action_month, and action_day.  DISCLAIMER--all of this is
untested.

Sean.


Re: [SQL] Building a database from a flat file

2005-03-03 Thread Sean Davis
On Mar 2, 2005, at 7:15 PM, Casey T. Deccio wrote:
A database I am currently using is built and updated periodically from 
a
flat csv file (The situation is rather unfortunate, but that's all I
have right now).  The schema I use is more complex than the flat file,
so I follow a process to populate the tables with the data from the
file.  First I slurp the whole file into one temporary table, whose
columns correspond to the columns in the file.  Then I DELETE all the
existing rows from the tables in the schema and perform a series of
queries on that table to INSERT and UPDATE rows in the tables that are
in the schema.  Then I DELETE the data from the temporary table.  I do
it this way, rather than trying to synchronize it, because of the
inconsistencies and redundancies in the flat file.

There is more than one problem with this, but the largest is that I
would like to perform this whole database rebuild within one
transaction, so other processes that need to access the database can do
so without noticing the disturbance.  However, performing this set of
events (besides populating the temporary table) within a single
transaction takes a long time--over an hour in some cases.
What are some suggestions to help improve performance with replacing 
one
set of data in a schema with another?
Why not rebuild the entire thing in a separate "build" schema, then do
only the cheap stuff, like copying tables, inside the transaction block:

BEGIN;
truncate table1;
truncate table2;
--reset any sequences you feel you need to have reset;
insert into table1 select * from build.table1;
insert into table2 select * from build.table2;
COMMIT;
-- VACUUM can't run inside a transaction block, so analyze afterwards:
vacuum analyze table1;
vacuum analyze table2;
I haven't tried this method exactly, but building in a separate schema 
(expensive) and then doing cheap operations like copying the table into 
the working schema should minimize the amount of time you spend inside 
the transaction block.

**I don't know what effect this will have on performance of the whole 
process, though**.

Sean


Re: [SQL] how to speed up these queries ?

2005-03-03 Thread Sean Davis
On Mar 3, 2005, at 8:34 AM, Dracula 007 wrote:

[Sean's earlier suggestion quoted in full -- snipped; see above]
I can't exactly predict how often these queries will be executed -
this is one example of many different queries that are executed over
these two tables (they hold the main part of our web application's
statistics), and every query can run with different parameter values
etc.

It will definitely be less than 1% of all queries run on the server,
but the problem is that running it consumes most of the CPU, so all
the other queries have to wait and time out. And we don't want the
visitors on the web to wait ...

As a temporary solution we build all the possible results once a week
(every Monday at 00:00) and store them in separate tables, so they're
pretty fast to view, but it takes about 3 hours to rebuild all the
possible stats (total, by month, by day, by week) for the different
views (pages, articles, visitors, etc.). I still hope I'll be able to
speed it up somehow.

The solution using triggers looks nice. I'll try it, and it probably
will work, but I can't predict how complicated it will be to log all
the interesting stats.

t.v.
If your group by is at its finest grain only daily, then the most you 
will run these queries is daily, correct?  Could you try some of your 
queries doing:

select BLAH,BLAH,BLAH from BLAH where action_year=2005 and action_day=3 
and action_month=3

after building indices on the columns?  If this is fast (which it very
well may be), then you can simply run that set of queries daily and
insert the result into your "big stats" table, as presumably all other
data in the "big stats" table is static once the date has passed.  That
would save you writing the triggers, which could be complicated from
what you are saying.
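
For what it's worth, the index for that query could be a single
multicolumn one (name made up, untested):

  CREATE INDEX actions_ymd_idx
      ON actions (action_year, action_month, action_day);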

Sean


Re: [WW Spam: medium] [SQL] Building a database from a flat file

2005-03-03 Thread RobertD . Stewart
Could you set up trigger functions on the temp table that would do updates
and inserts on the other tables from the data entered into the temp table?

Then all you would have to do is enter the data into the temp table.

-Original Message-
From: Casey T. Deccio [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, March 02, 2005 7:15 PM
To: pgsql-sql@postgresql.org
Subject: [WW Spam: medium] [SQL] Building a database from a flat file

[original message quoted in full -- snipped; see above]

Casey




Re: [SQL] Performance of Views

2005-03-03 Thread Christoph Haller
There is no run-time difference between querying a VIEW and executing
the underlying SELECT, because the view is expanded into the very same
query by the planner. You may want to move your question to the
[PERFORMANCE] list for hints on speeding up your query.

Regards, Christoph 


Steffen Boehme wrote:
> 
> Hello there,
> 
> I have a short question ...
> 
> I have a few tables (at the moment "only" 3, for testing) over which
> a query of this form will be made:
> 
> SELECT
> a.orderitem_id,
> a.transaction_id,
> a.order_id,
> a.shop_id,
> a.quantity,
> a.price,
> b.affiliate_id,
> c."type"
> FROM
> ss_order_orderitems a
> LEFT JOIN ss_order_affiliate_tracking b ON a.order_id = b.order_id,
> ss_shops c
> WHERE
> (a.order_id = b.order_id OR b.order_id IS NULL) AND
> a.shop_id = c.shop_id;
> 
> The query will later get a few more conditions ...
> 
> Now the problem is that the performance is not really good ... I have
> thought about the best solution for such a query and found three possibilities:
> 
> 1. a simple select over the three tables (one of them contains 16
> entries at the moment, and it's growing) in the form above (the db is
> mysql 4.1.x)
> I think this solution is not very performant ...
> 
> 2. move the data to a postgresql db with the same structure and create a
> view which makes the same query ...
> Is the performance of the same query different between a simple select
> and a view!?
> If so, I can forget the view ...
> 
> 3. put the data from the above query into one big table ...
> I know that's not a good db structure, but I don't know how I could make
> it better ...
> 
> The main question at the moment is ...
> Is the performance of the view method better than the first method on
> the existing tables!?
> 
> I hope for a few hints ...
> 
> Thanks
> Steffen
>



Re: [SQL] Building a database from a flat file

2005-03-03 Thread Markus Schaber
Hi, Casey,

Casey T. Deccio schrieb:

> There is more than one problem with this, but the largest is that I
> would like to perform this whole database rebuild within one
> transaction, so other processes that need to access the database can do
> so without noticing the disturbance.  However, performing this set of
> events (besides populating the temporary table) within a single
> transaction takes a long time--over an hour in some cases.
> 
> What are some suggestions to help improve performance with replacing one
> set of data in a schema with another?

- Create the new data in another schema, and then simply rename those
two schemas for "switch over".

- Create the new data in differently named tables, and then simply
rename all the old and new tables for "switch over".

- Have two different sets of tables (maybe two identical schemas), and
let your application work on a set of views. Then you can change the
views via "create or replace view" for switch over, as sketched below.
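
A minimal untested sketch of that last option (schema and view names
are made up):

  -- the application always queries the view "table1"
  CREATE OR REPLACE VIEW table1 AS SELECT * FROM set_a.table1;
  -- after rebuilding set_b, flip the view to the fresh data:
  CREATE OR REPLACE VIEW table1 AS SELECT * FROM set_b.table1;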

Markus

-- 
Markus Schaber | Dipl. Informatiker | Software Development GIS

Fight against software patents in EU! http://ffii.org/
  http://nosoftwarepatents.org/



Re: [SQL] Postgres performance

2005-03-03 Thread Christopher Browne
[EMAIL PROTECTED] (PFC) writes:
>> The reason PostgreSQL is slower is because it (and by extension the team
>> behind it) cares about your data.
>
>   Sure, postgres is (a bit but not much) slower for a simple
> query like SELECT * FROM one table WHERE id=some number, and
> postgres is a lot slower for UPDATES (although I heard that it's
> faster than MySQL InnoDB)...

There is a _HUGE_ set of misconceptions here.

  1.  The speed difference is only consistently true for simple selects
  when done against MyISAM tables.

  2.  That speed difference for UPDATEs is only true if you are talking
  about ONE MySQL(tm) client doing updates against MyISAM tables.

  MyISAM does not support row locks; if multiple clients are
  trying to update a table, they must fight for a single table
  lock, with the result that updating tables doesn't scale _at
  all_ with MySQL(tm) for the default table type.

If you only have one process touching the database, MySQL(tm) can
therefore look quite a lot better than PostgreSQL.  Move to 2 clients
and it's not quite so good.  Move to 100 concurrent clients all trying
to do updates and you may discover that you simply can't do that...
-- 
let name="cbbrowne" and tld="cbbrowne.com" in String.concat "@" [name;tld];;
http://www.ntlug.org/~cbbrowne/linuxxian.html
A VAX is virtually a computer, but not quite.



Re: [SQL] Building a database from a flat file

2005-03-03 Thread Casey T. Deccio
On Thu, 2005-03-03 at 06:23 -0700, Sean Davis wrote:
> Why not rebuild the entire thing in a separate "build" schema then do
> only the stuff like copying tables inside the transaction block:
> 

Building everything in the separate "build" schema works great, but it
is the DELETE (TRUNCATE won't work when foreign keys point to the table)
that took the most time in the transaction.  However, renaming the
schemas, as suggested in other posts, worked great.

Casey




Re: [SQL] Building a database from a flat file

2005-03-03 Thread Casey T. Deccio
On Thu, 2005-03-03 at 08:28 -0700, Markus Schaber wrote:
> - Create the new date in another schema, and then simply rename those
> two schemas for "switch over"
> 

This worked very well.  I created another schema ("build") and populated
the tables within build.  Then:

BEGIN;
ALTER SCHEMA public RENAME TO public_old;
ALTER SCHEMA build RENAME TO public;
COMMIT;
/* remove data from tables in build schema... */

It was very quick and seamless.

Question: is there an "easy" way to duplicate an existing schema
(tables, functions, sequences, etc.)--not the data; only the schema?
This way, I would only need to modify one schema (public) to make
changes, and the build schema could be created each time as a duplicate
of the public schema.  Maintenance would be much simpler.

> - Create the new data in differently named tables, and then simply
> rename all the old and new tables for "switch over".
> 

This probably would work too, but there may be problems with foreign
keys when renaming the tables one at a time (unless deferrable
constraints are used). To avoid any mess, the schema-rename approach
works well.

> - Have two different set of tables (maybe two identical schemas), and
> let your application work on a set of views. Then you can change the
> views via "create or replace view" for switch over.
> 
> Markus

Casey




Re: [SQL] Building a database from a flat file

2005-03-03 Thread Markus Schaber
Hi, Casey,

Casey T. Deccio schrieb:

> Question: is there an "easy" way to duplicate an existing schema
> (tables, functions, sequences, etc.)--not the data; only the schema?
> This way, I would only need to modify one schema (public) to make
> changes, and the build schema could be created each time as a duplicate
> of the public schema.  Maintenance would be much simpler.

I do not know about schemas, but for tables you can "CREATE TABLE alpha
(LIKE beta)".
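
For example (untested; this copies the column names and types, but not
indexes, defaults, or foreign keys):

  CREATE TABLE build.table1 (LIKE public.table1);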

>>- Create the new data in differently named tables, and then simply
>>rename all the old and new tables for "switch over".
> This probably would work too, but there may be problems with foreign
> keys in renaming the tables one at a time (unless deferrable is used).
> To avoid any mess, the previous one works well.

AFAIK, the foreign key relations are preserved when a table is renamed;
they stick to the table itself regardless of name changes.


Markus
-- 
Markus Schaber | Dipl. Informatiker | Software Development GIS

Fight against software patents in EU! http://ffii.org/
  http://nosoftwarepatents.org/



Re: [SQL] Building a database from a flat file

2005-03-03 Thread Bricklen Anderson
Casey T. Deccio wrote:
> Question: is there an "easy" way to duplicate an existing schema
> (tables, functions, sequences, etc.)--not the data; only the schema?
Check the docs for pg_dump (-s) for doing structure-only dumps of your schema.
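
For instance, something along these lines (database name made up; -s
dumps structure only, -n limits the dump to the named schema):

  pg_dump -s -n public mydb > schema.sql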


Re: [SQL] Building a database from a flat file

2005-03-03 Thread Sean Davis
On Mar 3, 2005, at 1:37 PM, Casey T. Deccio wrote:
[quoted message snipped -- Casey asked whether there's an "easy" way to
duplicate an existing schema (tables, functions, sequences, etc.)
without the data; see above]
Why not just create a dump of your schema (without data), drop the
schema, rebuild it from the dump, and then populate as normal?  Then do
the name switch.

Sean


Re: [SQL] Building a database from a flat file

2005-03-03 Thread Markus Schaber
Hi, Casey,

Casey T. Deccio schrieb:

> Building everything in the separate "build" schema works great, but it
> is the DELETE (TRUNCATE won't work when foreign keys point to the table)

Just as an additional info: You could temporarily drop the foreign key
constraints, run TRUNCATE, and recreate the constraints.
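
Roughly like this (untested; constraint and table names are invented):

  ALTER TABLE child_table DROP CONSTRAINT child_parent_fkey;
  TRUNCATE parent_table;
  -- reload parent_table (and child_table) here, then:
  ALTER TABLE child_table
      ADD CONSTRAINT child_parent_fkey
      FOREIGN KEY (parent_id) REFERENCES parent_table (parent_id);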

Markus

-- 
Markus Schaber | Dipl. Informatiker | Software Development GIS

Fight against software patents in EU! http://ffii.org/
  http://nosoftwarepatents.org/



Re: [SQL] definative way to place secs from epoc into timestamp

2005-03-03 Thread Andrew - Supernews
On 2005-03-03, Bret Hughes <[EMAIL PROTECTED]> wrote:
> an RFE would be to let to_timestamp produce a timestamp without time zone
> and have a to_timestamptz do the time zone thing.  Seems more consistent
> and would give me the functionality I am looking for :)

Unix epoch times correspond to timestamp _with_ time zone.

(Why are you using timestamp without time zone anyway? For recording the
time at which an event occurred that usage is simply wrong - in fact I
can't see any situation in which a Unix epoch time can correctly be
converted to a timestamp without time zone.)

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services



Re: [SQL] definative way to place secs from epoc into timestamp

2005-03-03 Thread Bret Hughes
On Thu, 2005-03-03 at 14:58, Andrew - Supernews wrote:
> On 2005-03-03, Bret Hughes <[EMAIL PROTECTED]> wrote:
> > an RFE would be to let to_timestamp produce a timestamp without time zone
> > and have a to_timestamptz do the time zone thing.  Seems more consistent
> > and would give me the functionality I am looking for :)
> 
> Unix epoch times correspond to timestamp _with_ time zone.
> 
> (Why are you using timestamp without time zone anyway? For recording the
> time at which an event occurred that usage is simply wrong - in fact I
> can't see any situation in which a Unix epoch time can correctly be
> converted to a timestamp without time zone.)
> 

Valid question.  Because there is no reason to keep up with time zones,
and because I want the same value out of the database that I put into
it.  The app that this db supports is written in PHP, and I kept
getting something different out than what I put in during the other
passes I made while trying to get my head around this.  The timestamps
have historically been stored in flat files.

here is an example of a valid use:

The table:

[EMAIL PROTECTED] bhughes]$ psql elevating -c '\d testtime'
 Table "public.testtime"
 Column |Type | Modifiers 
+-+---
 ts | timestamp without time zone | 


The script:

[EMAIL PROTECTED] elevatetest]$ cat timetest.php 
#!/usr/bin/php -q
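
[the <?php ... ?> body was lost in the archive; a plausible
reconstruction matching the output below -- the connection string and
the int2ts()/ts2int() helper functions are assumptions:]

<?php
// print the wall-clock time and epoch seconds from several sources
echo "date from date command\n";
system("date");
$secs = time();
echo "system secs   " . trim(`date +%s`) . "\n";
echo "php time secs $secs\n";
echo date("m/d/y H:i", $secs) . "\n";

// store the epoch seconds, then read the row straight back out
$db  = pg_connect("dbname=elevating");
$res = pg_query($db, "insert into testtime (ts) values (int2ts($secs))");
echo "the number of rows affected was " . pg_affected_rows($res) . "\n";
$res = pg_query($db, "select ts, ts2int(ts) from testtime");
print_r(pg_fetch_array($res));
?>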


The output:

[EMAIL PROTECTED] elevatetest]$ ./timetest.php 
date from date command
Thu Mar  3 22:30:14 EST 2005
system secs   1109907014
php time secs 1109907014
03/03/05 22:30
the number of rows affected was 1
Array
(
[0] => 2005-03-03 22:30:14
[ts] => 2005-03-03 22:30:14
[1] => 1109907014
[ts2int] => 1109907014
)


What goes in comes out.  Gotta like it.

Bret





Re: [SQL] definative way to place secs from epoc into timestamp

2005-03-03 Thread Andrew - Supernews
On 2005-03-04, Bret Hughes <[EMAIL PROTECTED]> wrote:
>> Unix epoch times correspond to timestamp _with_ time zone.
>> 
>> (Why are you using timestamp without time zone anyway? For recording the
>> time at which an event occurred that usage is simply wrong - in fact I
>> can't see any situation in which a Unix epoch time can correctly be
>> converted to a timestamp without time zone.)
>
> Valid question.  Because there is no reason to keep up with time zones

It's a common mistake to think that just because you don't need to keep
track of time zones that somehow using timestamp without time zone is
correct. It is _not_. "timestamp with time zone" and "timestamp without
time zone" have _very_ different semantics.

One way to look at it is that "timestamp with time zone" designates a
specific instant in absolute time (past or future). It is therefore the
correct type to use for recording when something happened. In contrast,
"timestamp without time zone" designates a point on the calendar, which
has a different meaning according to where you are, and when. So the
latter type crops up in some cases in calendar applications, and also in
input/output conversions, but it's more often than not the _wrong_ type
to use for storage, since the meaning changes with the timezone (and data
_does_ get moved across timezones, whether due to physical relocation or
other factors).

Unix epoch times have the same semantics as "timestamp with time zone".

> and the fact that  I want the same value from the data base that I put
> into it.

"same" in which sense? The same absolute point in time? Or the same point
on a calendar? Obviously if the timezone doesn't change, then the two are
equivalent; but which one is your application actually looking for? (If
your app is using Unix epoch times, then it's looking only at the absolute
time and not the calendar time...)

Here's an example of how it breaks (using your own conversion functions):

test=> set timezone to 'UTC';
SET
test=> insert into ttst values (int2ts(1109916954));
INSERT 887766166 1
test=> select ts,ts2int(ts) from ttst;
          ts         |   ts2int
---------------------+------------
 2005-03-04 06:15:54 | 1109916954
(1 row)

(that is the correct UTC time corresponding to 1109916954)

test=> set timezone to 'America/Denver';
SET
test=> select ts,ts2int(ts) from ttst;
          ts         |   ts2int
---------------------+------------
 2005-03-04 06:15:54 | 1109942154
(1 row)

test=> set timezone to 'America/New_York';
SET
test=> select ts,ts2int(ts) from ttst;
          ts         |   ts2int
---------------------+------------
 2005-03-04 06:15:54 | 1109934954
(1 row)

Notice the value stored in the DB didn't change, but it suddenly means
something different...

In contrast, if you do the same thing with "timestamp with time zone",
then the Unix time that you get back will _always_ be the same, as you
would expect, regardless of the time zone. Using functions identical to
yours except using "with time zone":
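
(The conversion functions themselves never appeared in the thread;
plausible definitions, purely as an assumption, might be:)

  CREATE FUNCTION int2tsz(integer) RETURNS timestamp with time zone AS
      'SELECT ''epoch''::timestamptz + $1 * ''1 second''::interval'
      LANGUAGE sql STRICT IMMUTABLE;

  CREATE FUNCTION ts2int(timestamp with time zone) RETURNS integer AS
      'SELECT extract(epoch FROM $1)::integer'
      LANGUAGE sql STRICT IMMUTABLE;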

test=> insert into tztst values (int2tsz(1109916954));
INSERT 889130554 1
test=> select ts,ts2int(ts) from tztst;
            ts          |   ts2int
------------------------+------------
 2005-03-04 06:15:54+00 | 1109916954
(1 row)

test=> set timezone to 'America/New_York';
SET
test=> select ts,ts2int(ts) from tztst;
            ts          |   ts2int
------------------------+------------
 2005-03-04 01:15:54-05 | 1109916954
(1 row)

test=> set timezone to 'America/Los_Angeles';
SET
test=> select ts,ts2int(ts) from tztst;
            ts          |   ts2int
------------------------+------------
 2005-03-03 22:15:54-08 | 1109916954
(1 row)

Notice that the stored timestamp doesn't actually change; it is displayed
differently according to the timezone. The Unix time correctly _doesn't_
change, reflecting the fact that what we stored was the absolute time.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services



Re: [SQL] definative way to place secs from epoc into timestamp

2005-03-03 Thread Michael Glaesemann
On Mar 4, 2005, at 14:47, Bret Hughes wrote:
[quoted exchange between Andrew and Bret snipped; see above]

What goes in comes out.  Gotta like it.
I think the reason this works is that your webserver and your
postgresql server are in the same time zone, which is probably an
assumption made in a great many cases. You may run into problems if at
some point the dbms and webserver are not in the same time zone and
you're relying on dbms-generated times (such as now() or
current_timestamp), or if the system is relocated to another time zone.

I think the following illustrates a problem that can occur once the
assumption of a constant time zone no longer holds.

Your system is working for you, so that's great. I just wanted to  
explore this for myself a bit more -- I find the time zone related  
material hard to get my head around myself :). Since I went through it,  
I thought I'd share it with the list.

Regards,
Michael Glaesemann
grzm myrealbox com
test=# create table ts2int (ts2int_id serial not null unique
, ts timestamp without time zone default current_timestamp
, tstz timestamptz default current_timestamp) without oids;
NOTICE:  CREATE TABLE will create implicit sequence  
"ts2int_ts2int_id_seq" for serial column "ts2int.ts2int_id"
NOTICE:  CREATE TABLE / UNIQUE will create implicit index  
"ts2int_ts2int_id_key" for table "ts2int"
CREATE TABLE
test=# \d ts2int
                                  Table "public.ts2int"
  Column   |            Type             |                           Modifiers
-----------+-----------------------------+----------------------------------------------------------------
 ts2int_id | integer                     | not null default nextval('public.ts2int_ts2int_id_seq'::text)
 ts        | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
 tstz      | timestamp with time zone    | default ('now'::text)::timestamp(6) with time zone
Indexes:
    "ts2int_ts2int_id_key" UNIQUE, btree (ts2int_id)

test=# insert into ts2int (ts) values (default);
INSERT 0 1
To simulate webserver and postgresql server being in different time  
zones, I'm
using the "at time zone" construct to convert to CST.

test=# insert into ts2int (ts) values (current_timestamp at time zone  
'CST');
INSERT 0 1
test=# select * from ts2int;
 ts2int_id |             ts             |             tstz
-----------+----------------------------+-------------------------------
         1 | 2005-03-04 15:46:20.443158 | 2005-03-04 15:46:20.443158+09
         2 | 2005-03-04 00:46:50.336831 | 2005-03-04 15:46:50.336831+09
(2 rows)

test=# select ts2int_id
, extract('epoch' from ts) as ts_epoch
, extract ('epoch' from tstz) as tstz_epoch
from ts2int;
 ts2int_id |     ts_epoch     |    tstz_epoch
-----------+------------------+------------------
         1 | 1109918780.44316 | 1109918780.44316
         2 | 1109864810.33683 | 1109918810.33683
(2 rows)
Note that ts_epoch and tstz_epoch are the same for 1, but different
for 2. Both ts and tstz are being evaluated at +9 (the postgres server
time zone offset). As ts for 2 wasn't inserted at +9, it's not the same.

test=# select ts2int_id
, extract('epoch' from ts at time zone 'CST') as ts_epoch
, extract ('epoch' from tstz) as tstz_epoch
from ts2int;
 ts2int_id |     ts_epoch     |    tstz_epoch
-----------+------------------+------------------
         1 | 1109972780.44316 | 1109918780.44316
         2 | 1109918810.33683 | 1109918810.33683
(2 rows)
Note that ts_epoch and tstz_epoch are the same for 2, but different
for 1. ts was inserted relative to CST and is now being evaluated
"at time zone 'CST'", so the ts and tstz values for 2 are "the same".
ts_epoch for 2 is also the Unix timestamp for the time that was
originally inserted.

test=# select ts2int_id
, extract('epoch' from ts) as ts_epoch
, extract ('epoch' from tstz at time zone 'CST') as tstz_epoch
from ts2int;
 ts2int_id |     ts_epoch     |    tstz_epoch
-----------+------------------+------------------
         1 | 1109918780.44316 | 1109864780.44316
         2 | 1109864810.33683 | 1109864810.33683
(2 rows)
Note again that ts_epoch and tstz_epoch are the same for 2, but
different for 1. ts is being evaluated at +9, while tstz is being
converted to CST before the epoch is extracted.