[GENERAL] Timestamp/Timezone - does this make sense?
mvh= set time zone 'UTC';
SET
mvh= select now();
              now
-------------------------------
 2007-02-13 03:37:35.660652+00
(1 row)

mvh= select timestamp with time zone '2007-01-01' at time zone 'America/Los_Angeles';
      timezone
---------------------
 2006-12-31 16:00:00
(1 row)

mvh= select timestamp '2007-01-01' at time zone 'America/Los_Angeles';
        timezone
------------------------
 2007-01-01 08:00:00+00
(1 row)

Where does that extra 8 hours come from?
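For what it's worth, AT TIME ZONE converts in opposite directions depending on the input type, which is where the 8 hours comes from. A minimal sketch, assuming the same UTC session as above:

    -- timestamptz -> timestamp: show the instant as wall-clock time in the named zone
    select timestamptz '2007-01-01 00:00+00' at time zone 'America/Los_Angeles';
    -- 2006-12-31 16:00:00

    -- timestamp -> timestamptz: read the value as Los Angeles wall-clock time,
    -- then display it in the session's zone (UTC here), hence the extra 8 hours
    select timestamp '2007-01-01 00:00' at time zone 'America/Los_Angeles';
    -- 2007-01-01 08:00:00+00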
[GENERAL] Data mining setup/hardware?
Any suggestions on how to set up a server -strictly- for data mining? This means:

- I want queries to run as quickly as possible, and
- I don't care if I lose data if the power goes down.

A solid state disk would obviously speed things up, like the to-disk sorts that Postgres does when it runs out of RAM, and help the seek time, etc. This would also be somewhat expensive, of course.

I can buy a custom server and throw lots of RAM in it and so on. Commodity PC hardware seems to be limited to about 4G, or 8G or so for the AMD64 hardware, so it seems unlikely I can run a pure memory disk.

I'm using FreeBSD, so any FreeBSD-related hints are appreciated, but I can run Linux or whatever if needed for this server.

Thanks,
--
Mike Harding [EMAIL PROTECTED]
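Since durability doesn't matter here, one place to start before buying hardware is the server configuration itself. A minimal postgresql.conf sketch for a throwaway data-mining load; the values are illustrative assumptions, and exact parameter names and units vary by PostgreSQL version (work_mem was called sort_mem before 8.0):

    fsync = off                    # don't wait for WAL to reach disk; data is lost on a crash, acceptable here
    shared_buffers = 50000         # number of 8 kB buffers in older releases; size to a large share of RAM
    work_mem = 262144              # kB per sort/hash; large values keep sorts in memory instead of spilling to disk
    maintenance_work_mem = 524288  # kB; speeds up index builds after bulk loads
    checkpoint_segments = 64       # fewer, larger checkpoints during bulk loads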
Re: [GENERAL] Hash aggregates blowing out memory
Sorry, I should have said 'vacuum analyze verbose'...

On Sat, 2005-02-26 at 00:45 -0500, Greg Stark wrote:
> Mike Harding [EMAIL PROTECTED] writes:
> > The following was run -immediately- after a vacuum.
>
> You realize vacuum doesn't update the statistics, right? You have to do
> analyze or vacuum analyze for that.

--
Mike Harding [EMAIL PROTECTED]
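For the record, a sketch of the command being suggested, run against the clicks table from the other thread (note the keyword order: VERBOSE comes before ANALYZE in the plain-command syntax):

    VACUUM VERBOSE ANALYZE clicks;
    -- ANALYZE (alone or combined with VACUUM) is what refreshes the planner's
    -- row-count and n_distinct statistics; plain VACUUM only reclaims space.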
[GENERAL] Hash aggregates blowing out memory
I've been having problems where a HashAggregate is used because of a bad estimate of the distinct number of elements involved.

In the following example the total number of domain IDs is about 2/3 of the number of rows, and it's estimated at about 1/15 of the actual value. This will occasionally cause the generated query to use a HashAggregate, and this runs the backend out of memory - it will use 700 or more meg before failing.

The following was run -immediately- after a vacuum.

explain analyze
  select sum(count) as sumc, class, domain_id into temp new_clicks
  from clicks, countries
  where date > (current_date - 20)
    and clicks.country_id = countries.country_id
  group by domain_id, class;

 GroupAggregate  (cost=1136261.89..1183383.51 rows=191406 width=12) (actual time=138375.935..163794.452 rows=3258152 loops=1)
   ->  Sort  (cost=1136261.89..1147922.66 rows=4664311 width=12) (actual time=138374.865..147308.343 rows=4514313 loops=1)
         Sort Key: clicks.domain_id, countries.class
         ->  Hash Join  (cost=4.72..421864.06 rows=4664311 width=12) (actual time=6837.405..66938.259 rows=4514313 loops=1)
               Hash Cond: ("outer".country_id = "inner".country_id)
               ->  Seq Scan on clicks  (cost=0.00..351894.67 rows=4664311 width=12) (actual time=6836.388..46865.490 rows=4514313 loops=1)
                     Filter: (date > (('now'::text)::date - 20))
               ->  Hash  (cost=4.18..4.18 rows=218 width=8) (actual time=0.946..0.946 rows=0 loops=1)
                     ->  Seq Scan on countries  (cost=0.00..4.18 rows=218 width=8) (actual time=0.011..0.516 rows=218 loops=1)
 Total runtime: 175404.738 ms
(10 rows)

--
Mike Harding [EMAIL PROTECTED]
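One thing worth checking, sketched below under the assumption that the misestimated column is clicks.domain_id: raise that column's statistics target so ANALYZE samples more rows, then look at the n_distinct value the planner actually sees.

    ALTER TABLE clicks ALTER COLUMN domain_id SET STATISTICS 1000;
    ANALYZE clicks;
    SELECT attname, n_distinct
      FROM pg_stats
     WHERE tablename = 'clicks' AND attname = 'domain_id';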
Re: [GENERAL] Hash aggregates blowing out memory
Any way to adjust n_distinct to be more accurate?

I don't think a 'disk spill' would be that bad, if you could re-sort the hash in place. If nothing else, if it could -fail- when it reaches the lower stratosphere, and re-start, it's faster than getting no result at all... sort of an auto disable of the hashagg.

On Fri, 2005-02-25 at 16:55 -0500, Tom Lane wrote:
> Mike Harding [EMAIL PROTECTED] writes:
> > I've been having problems where a HashAggregate is used because of a
> > bad estimate of the distinct number of elements involved.
>
> If you're desperate, there's always enable_hashagg. Or reduce sort_mem
> enough so that even the misestimate looks like it will exceed sort_mem.
>
> In the long run it would be nice if HashAgg could spill to disk. We
> were expecting to see a contribution of code along that line last year
> (from the CMU/Berkeley database class) but it never showed up. The
> performance implications might be a bit grim anyway :-(
>
> regards, tom lane

--
Mike Harding [EMAIL PROTECTED]
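For reference, a sketch of the session-level workarounds mentioned above; the sort_mem value is only an illustrative guess, and on 8.0 and later the setting is called work_mem:

    SET enable_hashagg = off;  -- force the sort + GroupAggregate plan for this session
    SET sort_mem = 2048;       -- kB; small enough that even the underestimated
                               -- hash table looks like it won't fit in memory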
Re: [GENERAL] row numbering
If you insert the results of your query into a table with a serial column, the serial column will do what you want...

On Sat, 2005-02-26 at 01:10 +0100, Peter Eisentraut wrote:
> josue wrote:
> > is there a way to return a column with the row number automatically
> > generated according to the way the rows were processed by the query.
>
> No, but you can easily keep a counter in the client.

--
Mike Harding [EMAIL PROTECTED]
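A minimal sketch of that serial-column approach; the table and column names here are made up for illustration:

    CREATE TEMP TABLE numbered (rownum serial, val text);
    INSERT INTO numbered (val)
        SELECT some_col FROM some_table ORDER BY some_col;  -- rownum follows insert order
    SELECT rownum, val FROM numbered ORDER BY rownum;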
[GENERAL] updateable view with join?
Suppose I have some tables and views like so for dog breeds and breeders and their inventory, where I also have a 'human' view of dog inventory in which the breed and breeder names are displayed instead of the numerical IDs. I've gotten inserts working, but I'm stymied by updates and deletes... any help?

Also, would inserting into this view be faster than the 3 SQL statements I would otherwise need to look up the IDs and insert them and the count into the inv table?

This is a stripped-down, idealized example, so I apologize for any syntax errors...

create table breeds (
    breed_id SERIAL PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);

create table breeders (
    breeder_id SERIAL PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);

create table inv (
    breeder_id integer references breeders on update cascade,
    breed_id integer references breeds on update cascade,
    count integer NOT NULL,
    PRIMARY KEY (breeder_id, breed_id)
);

create view hinv AS
    SELECT h.name AS breeder, d.name AS breed, count
    FROM breeders h, breeds d, inv
    WHERE h.breeder_id = inv.breeder_id
      AND d.breed_id = inv.breed_id;

CREATE RULE hinv_ins AS ON INSERT TO hinv DO INSTEAD
    INSERT INTO inv (breeder_id, breed_id, count)
    SELECT h.breeder_id, d.breed_id, NEW.count
    FROM breeders h, breeds d
    WHERE h.name = NEW.breeder AND d.name = NEW.breed;

update/delete???

Thanks,
Mike H.
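In case it helps to see the shape of them, here is a hedged sketch of matching update and delete rules; it assumes the tables above and only handles changing the count or removing a row through the view, not renaming a breed or breeder:

    CREATE RULE hinv_upd AS ON UPDATE TO hinv DO INSTEAD
        UPDATE inv SET count = NEW.count
        WHERE breeder_id = (SELECT breeder_id FROM breeders WHERE name = OLD.breeder)
          AND breed_id   = (SELECT breed_id   FROM breeds   WHERE name = OLD.breed);

    CREATE RULE hinv_del AS ON DELETE TO hinv DO INSTEAD
        DELETE FROM inv
        WHERE breeder_id = (SELECT breeder_id FROM breeders WHERE name = OLD.breeder)
          AND breed_id   = (SELECT breed_id   FROM breeds   WHERE name = OLD.breed);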