[GENERAL] Spurious rows returned with left join?

2006-04-04 Thread Edmund Bacon
I think I have stumbled on a bug, though I'm not entirely sure about
that. Things do seem to get a little fuzzy when using outer joins.


Consider the following:

create table t1(t1_a int);
insert into t1 values (1);
insert into t1 values (2);
insert into t1 values (3);

create table t2(t2_a int , t2_b int);
insert into t2 values (1, 1);
insert into t2 values (2, 1);

create table t3(t3_b int, t3_c int);
insert into t3 values (1, 9);
insert into t3 values (1, 10);

select * from t1
left join t2 on t1_a = t2_a
left join t3 on t2_b = t3_b
order by t1_a, t2_b;

 t1_a | t2_a | t2_b | t3_b | t3_c
------+------+------+------+------
    1 |    1 |    1 |    1 |    9
    1 |    1 |    1 |    1 |   10
    2 |    2 |    1 |    1 |    9
    2 |    2 |    1 |    1 |   10
    3 |      |      |      |
    3 |      |      |      |

Note that I get 2 rows where t1_a = 3. 


My pgsql version is:

PostgreSQL 8.1.3 on i686-redhat-linux-gnu, compiled by GCC gcc (GCC) 
3.4.4 20050721 (Red Hat 3.4.4-2)


(RedHat ES4 rpms from postgresql.org)


I have tried this same select on pgsql ver-7.4.8, CVS tip, as well as 
SQL Server 2005, MySQL and Firebird, all of which return just one row 
for t1_a = 3.


Since writing the original query I've realized that this is partly a
problem with the "inner joins before outer joins" query-writing principle
(law?), and that the query would perhaps be better written as:


select t1.*, t2.*, t3.*
from t2 join t3 on t2_b = t3_b
right join t1 on t1_a = t2_a;

which does return just one row for t1_a = 3 on all postgres versions I
currently have available.


Edmund








Re: [GENERAL] Spurious rows returned with left join?

2006-04-04 Thread Edmund Bacon

Tom Lane wrote:


Edmund Bacon [EMAIL PROTECTED] writes:
 


Consider the following:
...
Note that I get 2 rows where t1_a = 3. 
   



Are you getting a Merge Right Join plan for that?  If so, you're likely
getting bit by this bug:

2006-03-17 14:38  tgl

 


That's correct.  After doing an analyze on t1,t2,t3 the plan changed to
Hash Left Join, and I got the expected results.
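For anyone who wants to check which plan they are getting, a quick sketch
using the example tables from the original post (plain EXPLAIN just shows
the plan; EXPLAIN ANALYZE also runs the query):

explain
select * from t1
left join t2 on t1_a = t2_a
left join t3 on t2_b = t3_b
order by t1_a, t2_b;

If the top node is a Merge Right Join, that is the buggy plan; after
ANALYZE the planner here switched to a Hash Left Join and the duplicate
t1_a = 3 row went away.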


This'll be fixed in 8.1.4, or if you're in a hurry you can get the patch
from our CVS server.

regards, tom lane
 

No great worries.  As remarked in the original post, by doing the inner join
first I got the right results.  I didn't see anything in the archives
about this, so I thought it might be an unknown issue. Admittedly, I
didn't scour the archives too thoroughly ..


Edmund



Re: [GENERAL] joining tables

2006-01-23 Thread Edmund
[EMAIL PROTECTED] writes:

 Hi,
 
 If you have two tables, each with a column called keys and a column
 called values, and they are both incomplete, such as:
 
 table 1:
 
 keys | values
 -+--
 1| (null)
 2| two
 3| (null)
 
 table 2:
 
 keys | values
 -+-
 1| one
 2| (null)
 3| three
 
 is there a way to join them, in order to get:
 
 keys | values
 -+-
 1| one
 2| two
 3| three
 
 The closest I could get was with NATURAL FULL JOIN:
 
 keys | values
 -+-
 1| one
 1| (null)
 2| two
 2| (null)
 3| three
 3| (null)
 
 Thanks

Try something like:

SELECT key,
       CASE WHEN table1.value IS NOT NULL THEN table1.value
            ELSE table2.value END AS value
  FROM table1
  FULL JOIN table2 USING(key);


You might want to use 'IS DISTINCT FROM table2.value' if you want the
value for table1 to be returned in preference to table2.value.
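An equivalent formulation, for what it's worth (a sketch using the same
singular key/value column names as the query above, standing in for your
keys/values columns; COALESCE returns its first non-null argument):

SELECT key,
       COALESCE(table1.value, table2.value) AS value
  FROM table1
  FULL JOIN table2 USING(key);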





Re: [GENERAL] find last day of month

2005-12-09 Thread Edmund
Andrus Moor [EMAIL PROTECTED] writes:

 I have a table containing month column in format mm.
 
 create table months ( tmkuu c(7));
 insert into months values ('01.2005');
 insert into months values ('02.2005');
 
 How to create select statement which converts this column to date type 
 containing last day of month like
 
 '2005-01-31'
 '2005-02-28'
 

As long as you are using DateStyle = DMY, the following should work:


ebacon=# select * from months;
  tmkuu
---------
 01.2005
 02.2005
 12.2004
(3 rows)

ebacon=# select (('01.' || tmkuu)::date + interval '1 month' - interval '1 day')::date from months;
    date
------------
 2005-01-31
 2005-02-28
 2004-12-31
(3 rows)
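If you would rather not depend on the DateStyle setting, a variant of the
same idea using to_date() (just a sketch against the same months table):

select (to_date(tmkuu, 'MM.YYYY') + interval '1 month' - interval '1 day')::date
from months;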





Re: [GENERAL] Select all invalid e-mail addresses

2005-10-20 Thread Edmund
Andrus [EMAIL PROTECTED] writes:

 I have a database of e-mail addresses.
 
 I want to select the email addresses which are not valid:
 
 do not contain exactly one @ character,
 contain ;' , characters or spaces etc.
 
 What is the WHERE clause for this ?
 

There was a thread here not so long ago about matching valid email addresses.
It's not so simple.  You probably want to do a regex match - e.g.

select ... where email ~ 'regex'

However the regex for all valid email possibilities is *VERY* complex.
see: http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html

You should probably search the mailing list archives.  ISTR that there were
some suggestions on how one might simplify the search space.
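As a rough sketch only (the table and column names here are hypothetical,
and this is nowhere near full RFC 822 validation), a query that flags
addresses containing spaces, commas, or semicolons, or anything other than
exactly one @ character, would look something like:

select email
from addresses
where email !~ '^[^@[:space:],;]+@[^@[:space:],;]+$';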





Re: [GENERAL] Generating random values.

2005-08-18 Thread Edmund
[EMAIL PROTECTED] (Joshua D. Drake) writes:

 Fernando Lujan wrote:
  Hi folks,
  I have a table which contains my users... I want to insert to each
  user
  a random password, so I need a random function. Is there such function
  in Postgres? I just found the RANDOM which generates values between
  0.0 and 1.0.
  Any help or suggestion will be appreciated. :)
 
 I would do something like:
 
 select substring(md5(random() || random()), 5, 8);
 
 Sincerely,
 
 Joshua D. Drkae

Great! A simple, dumb program can generate all your passwords very
quickly.  My 2.4 GHz Pentium 4 did it in under 10 minutes.  A token set of
16 characters and a fixed length of 8 characters just isn't a very big
search space.
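For reference, the space really is small: md5() output is hex, so each
position has only 16 possible characters, and 16^8 = 2^32 = 4,294,967,296
combinations. You can see the figure directly:

select 16.0 ^ 8 as combinations;   -- 4294967296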




Re: [GENERAL] Failure to use indexes (fwd)

2005-08-02 Thread Edmund Dengler
Greetings all!

Given the quiet, I assume that there is no experience with index issues on
inherited tables? Just seeing if anybody may have any ideas or suggested
workarounds. (I seem to have found one by constructing a query that does
all the joins between inherited tables explicitly - this causes the
indexes to be used - I am still trying to make sure it is a legitimate method.)

Regards!
Ed

-- Forwarded message --
Date: Fri, 29 Jul 2005 14:23:29 -0400 (EDT)
From: Edmund Dengler [EMAIL PROTECTED]
To: Postgresql-General pgsql-general@postgresql.org
Subject: Re: [GENERAL] Failure to use indexes

Greetings!

I managed to get an even simpler example:

select *
from eventlog.record_main
where luid in (
select luid from eventlog_partition._20050706__raw_record
order by luid limit 1
  )
;

If I use "=" rather than "in", postgresql uses index scanning. As soon as
I use "in" (ie, a set to compare against), I get sequential scanning, even
though the set size is only a single element.
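For comparison, this is the "=" form that does use the index (same tables
as above):

select *
from eventlog.record_main
where luid = (
    select luid from eventlog_partition._20050706__raw_record
    order by luid limit 1
  )
;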

Regards!
Ed


On Fri, 29 Jul 2005, Edmund Dengler wrote:

 Greetings!

 I am using inherits to partition several tables. When I perform a query
 on another table, and then try to join against an inherited table set, the
 optimizer does not use any indexes to perform the join.

 This is 8.0.1. Is this fixed in 8.0.2 or 8.0.3?

 The query:

 explain
 select *
 from (
 select *  from eventlog.record_classification as record_classification
 where
   time_written >= '2005-07-06 00:00:00+00'::timestamptz
   and time_written < '2005-07-06 00:00:00+00'::timestamptz
 order by time_written, luid
 offset  0
 limit  500
   ) as classification
   join eventlog.record_main as main using (luid, time_written)
 ;

 The explanation:

   
                                       QUERY PLAN
 -----------------------------------------------------------------------------------------------
  Hash Join  (cost=51.15..20191003.89 rows=208027 width=178)
    Hash Cond: (("outer".luid = "inner".luid) AND ("outer".time_written = "inner".time_written))
    ->  Append  (cost=0.00..14641488.64 rows=554738383 width=96)
          ->  Seq Scan on record_main main  (cost=0.00..0.00 rows=1 width=96)
          ->  Seq Scan on _20050723__record_main main  (cost=0.00..94078.62 rows=3564462 width=96)
          ->  Seq Scan on _20050724__record_main main  (cost=0.00..110075.12 rows=4170512 width=96)
          ->  Seq Scan on _20050725__record_main main  (cost=0.00..122836.02 rows=4654002 width=96)
          ->  Seq Scan on _20050726__record_main main  (cost=0.00..142347.71 rows=5393271 width=96)
          ->  Seq Scan on _20050727__record_main main  (cost=0.00..130858.80 rows=4957980 width=96)

          (and so on, currently 123 such inherited tables)

    ->  Hash  (cost=51.07..51.07 rows=15 width=98)
          ->  Subquery Scan classification  (cost=50.89..51.07 rows=15 width=98)
                ->  Limit  (cost=50.89..50.92 rows=15 width=98)
                      ->  Sort  (cost=50.89..50.92 rows=15 width=98)
                            Sort Key: record_classification.time_written, record_classification.luid
                            ->  Result  (cost=0.00..50.59 rows=15 width=98)
                                  ->  Append  (cost=0.00..50.59 rows=15 width=98)
                                        ->  Seq Scan on record_classification  (cost=0.00..0.00 rows=1 width=98)
                                              Filter: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
                                        ->  Index Scan using _20050705__record_classification_time_written_idx on _20050705__record_classification record_classification  (cost=0.00..3.46 rows=1 width=54)
                                              Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
                                        ->  Index Scan using _20050701__record_classification_time_written_idx on _20050701__record_classification record_classification  (cost=0.00..3.59 rows=1 width=54)
                                              Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
                                        ->  Index Scan using _20050702__record_classification_time_written_idx on _20050702__record_classification record_classification  (cost=0.00..3.69 rows=1 width=54)
                                              Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00

Re: [GENERAL] Failure to use indexes (fwd)

2005-08-02 Thread Edmund Dengler
Greetings!

I have already increased the stats from 10 to 100. In addition, if I
specify individual tables, then the indexes are used. However, when I go
through the inherits, the indexes are not used. I will try expanding
the statistics further, but I suspect that is not the root cause of the problem.

Regards!
Ed


On Tue, 2 Aug 2005, Scott Marlowe wrote:

 On Tue, 2005-08-02 at 16:06, Dr NoName wrote:
  The solution to my problem was to increase statistics
  value and do another analyze. You can also change
  default_statistics_target parameter in
  postgresql.conf. Don't know if that's related to the
  problem you're seeing, but it's worth a try.

 Cool postgresql trick:

 alter database test set default_statistics_target=200;

 You can change the default for a database's new tables too.





[GENERAL] Failure to use indexes

2005-07-29 Thread Edmund Dengler
Greetings!

I am using inherits to partition several tables. When I perform a query
on another table, and then try to join against an inherited table set, the
optimizer does not use any indexes to perform the join.

This is 8.0.1. Is this fixed in 8.0.2 or 8.0.3?

The query:

explain
select *
from (
select *  from eventlog.record_classification as record_classification
where
  time_written >= '2005-07-06 00:00:00+00'::timestamptz
  and time_written < '2005-07-06 00:00:00+00'::timestamptz
order by time_written, luid
offset  0
limit  500
  ) as classification
  join eventlog.record_main as main using (luid, time_written)
;

The explanation:


                                      QUERY PLAN
-----------------------------------------------------------------------------------------------
 Hash Join  (cost=51.15..20191003.89 rows=208027 width=178)
   Hash Cond: (("outer".luid = "inner".luid) AND ("outer".time_written = "inner".time_written))
   ->  Append  (cost=0.00..14641488.64 rows=554738383 width=96)
         ->  Seq Scan on record_main main  (cost=0.00..0.00 rows=1 width=96)
         ->  Seq Scan on _20050723__record_main main  (cost=0.00..94078.62 rows=3564462 width=96)
         ->  Seq Scan on _20050724__record_main main  (cost=0.00..110075.12 rows=4170512 width=96)
         ->  Seq Scan on _20050725__record_main main  (cost=0.00..122836.02 rows=4654002 width=96)
         ->  Seq Scan on _20050726__record_main main  (cost=0.00..142347.71 rows=5393271 width=96)
         ->  Seq Scan on _20050727__record_main main  (cost=0.00..130858.80 rows=4957980 width=96)

         (and so on, currently 123 such inherited tables)

   ->  Hash  (cost=51.07..51.07 rows=15 width=98)
         ->  Subquery Scan classification  (cost=50.89..51.07 rows=15 width=98)
               ->  Limit  (cost=50.89..50.92 rows=15 width=98)
                     ->  Sort  (cost=50.89..50.92 rows=15 width=98)
                           Sort Key: record_classification.time_written, record_classification.luid
                           ->  Result  (cost=0.00..50.59 rows=15 width=98)
                                 ->  Append  (cost=0.00..50.59 rows=15 width=98)
                                       ->  Seq Scan on record_classification  (cost=0.00..0.00 rows=1 width=98)
                                             Filter: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
                                       ->  Index Scan using _20050705__record_classification_time_written_idx on _20050705__record_classification record_classification  (cost=0.00..3.46 rows=1 width=54)
                                             Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
                                       ->  Index Scan using _20050701__record_classification_time_written_idx on _20050701__record_classification record_classification  (cost=0.00..3.59 rows=1 width=54)
                                             Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
                                       ->  Index Scan using _20050702__record_classification_time_written_idx on _20050702__record_classification record_classification  (cost=0.00..3.69 rows=1 width=54)
                                             Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
                                       ->  Index Scan using _20050703__record_classification_time_written_idx on _20050703__record_classification record_classification  (cost=0.00..3.70 rows=1 width=54)
                                             Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))

                                       (and so on)

                                       ->  Index Scan using _20050714__record_classification_time_written_idx on _20050714__record_classification record_classification  (cost=0.00..3.69 rows=1 width=53)
                                             Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
(164 rows)

Sample tables:

eventlog=# \d eventlog_partition._20050723__record_main
  Table eventlog_partition._20050723__record_main
Column| 

Re: [GENERAL] Failure to use indexes

2005-07-29 Thread Edmund Dengler
Greetings!

I managed to get an even simpler example:

select *
from eventlog.record_main
where luid in (
select luid from eventlog_partition._20050706__raw_record
order by luid limit 1
  )
;

If I use "=" rather than "in", postgresql uses index scanning. As soon as
I use "in" (ie, a set to compare against), I get sequential scanning, even
though the set size is only a single element.

Regards!
Ed


On Fri, 29 Jul 2005, Edmund Dengler wrote:

 Greetings!

 I am using inherits to partition several tables. When I perform a query
 on another table, and then try to join against an inherited table set, the
 optimizer does not use any indexes to perform the join.

 This is 8.0.1. Is this fixed in 8.0.2 or 8.0.3?

 The query:

 explain
 select *
 from (
 select *  from eventlog.record_classification as record_classification
 where
   time_written >= '2005-07-06 00:00:00+00'::timestamptz
   and time_written < '2005-07-06 00:00:00+00'::timestamptz
 order by time_written, luid
 offset  0
 limit  500
   ) as classification
   join eventlog.record_main as main using (luid, time_written)
 ;

 The explanation:

   
                                       QUERY PLAN
 -----------------------------------------------------------------------------------------------
  Hash Join  (cost=51.15..20191003.89 rows=208027 width=178)
    Hash Cond: (("outer".luid = "inner".luid) AND ("outer".time_written = "inner".time_written))
    ->  Append  (cost=0.00..14641488.64 rows=554738383 width=96)
          ->  Seq Scan on record_main main  (cost=0.00..0.00 rows=1 width=96)
          ->  Seq Scan on _20050723__record_main main  (cost=0.00..94078.62 rows=3564462 width=96)
          ->  Seq Scan on _20050724__record_main main  (cost=0.00..110075.12 rows=4170512 width=96)
          ->  Seq Scan on _20050725__record_main main  (cost=0.00..122836.02 rows=4654002 width=96)
          ->  Seq Scan on _20050726__record_main main  (cost=0.00..142347.71 rows=5393271 width=96)
          ->  Seq Scan on _20050727__record_main main  (cost=0.00..130858.80 rows=4957980 width=96)

          (and so on, currently 123 such inherited tables)

    ->  Hash  (cost=51.07..51.07 rows=15 width=98)
          ->  Subquery Scan classification  (cost=50.89..51.07 rows=15 width=98)
                ->  Limit  (cost=50.89..50.92 rows=15 width=98)
                      ->  Sort  (cost=50.89..50.92 rows=15 width=98)
                            Sort Key: record_classification.time_written, record_classification.luid
                            ->  Result  (cost=0.00..50.59 rows=15 width=98)
                                  ->  Append  (cost=0.00..50.59 rows=15 width=98)
                                        ->  Seq Scan on record_classification  (cost=0.00..0.00 rows=1 width=98)
                                              Filter: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
                                        ->  Index Scan using _20050705__record_classification_time_written_idx on _20050705__record_classification record_classification  (cost=0.00..3.46 rows=1 width=54)
                                              Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
                                        ->  Index Scan using _20050701__record_classification_time_written_idx on _20050701__record_classification record_classification  (cost=0.00..3.59 rows=1 width=54)
                                              Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
                                        ->  Index Scan using _20050702__record_classification_time_written_idx on _20050702__record_classification record_classification  (cost=0.00..3.69 rows=1 width=54)
                                              Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
                                        ->  Index Scan using _20050703__record_classification_time_written_idx on _20050703__record_classification record_classification  (cost=0.00..3.70 rows=1 width=54)
                                              Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))

(and so

Re: [GENERAL] Copying bytea data out via pgsql

2005-07-23 Thread Edmund
[EMAIL PROTECTED] (Leonel Nunez) writes:

 John Wells wrote:
 
 Guys,
 
 I have a number of jpegs and tiffs that are stored in a bytea field in a
 PostgreSQL database by a Java program using Hibernate.
 
 I need to copy these out to a flat file via pgsql for viewing, etc.  I've
 tried
 
 psql -c 'select binarydata_field from my_image_table where id=1' mydb > flatfile
 
 but although that seems to provide the data it does not seem to pull it in
 a valid image format.  Is there some special handling I need to do here?
 
 
 You have on the flatfile info about the query.
 Let's say: leonel=> select count (*) from image;
  count
 -------
      1
 (1 row)

 You got the field name, the "-------" separator, the image, and "(1 row)"
 in the flatfile file.

 You need to make a script to put the image on that flatfile:
 just read it as any query and put the contents of that field on the file.

 I use this little Perl script

or you could use some of the other options to psql,
e.g.

psql -A -t -c "select ..."

From the man page:

       -A
       --no-align
              Switches to unaligned output mode. (The default output mode is
              otherwise aligned.)

       -t
       --tuples-only
              Turn off printing of column names and result row count footers,
              etc. This is equivalent to the \t command.

You might also want to consider the -o filename  option
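Putting those together for the original query, something like this should
land just the field value in a file (a sketch; -t drops headers and footers,
-A turns off alignment, -o sends output to the named file):

psql -A -t -o flatfile -c "select binarydata_field from my_image_table where id=1" mydb

Note that what lands in the file is still psql's text representation of the
bytea column (escape format), not raw binary bytes.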



[GENERAL] Deletes and large tables

2005-06-10 Thread Edmund Dengler
Greetings!

We have a table with more than 250 million rows. I am trying to delete the
first 100,000 rows (based on a bigint primary key), and I had to cancel
after 4 hours of the system not actually finishing the delete. I wrote a
script to delete individual rows 10,000 at a time using transactions, and
am finding each individual delete takes on the order of 0.1 seconds to 2-3
seconds. There are 4 indexes on the table, one of which is very hashlike
(ie, sequential rows are distributed throughout the index).

Is this considered normal delete speed? Currently using 8.0.1.
The drives are capable of 40MB/sec sustained transfers, but are only achieving
about 2-10kB/sec, and the process is mostly CPU bound.

Regards!
Ed



Re: [GENERAL] Deletes and large tables

2005-06-10 Thread Edmund Dengler
Greetings!

This table has 3 foreign keys, but that should not matter during deletes.
In addition, the tables being referred to are small, and should be in
cache.

There are no tables depending on it for references, so no dependent
triggers should be running.

Also, if this was a foreign key issue, I would expect I/O issues/bounds
and not CPU.

Regards!
Ed


On Fri, 10 Jun 2005, Richard Huxton wrote:

 Edmund Dengler wrote:
  Greetings!
 
  We have a table with more than 250 million rows. I am trying to delete the
  first 100,000 rows (based on a bigint primary key), and I had to cancel
  after 4 hours of the system not actually finishing the delete. I wrote a
  script to delete individual rows 10,000 at a time using transactions, and
  am finding each individual delete takes on the order of 0.1 seconds to 2-3
  seconds. There are 4 indexes on the table, one of which is very hashlike
  (ie, distribution is throught the index for sequential rows).

 I don't suppose it's off checking foreign-keys in a lot of tables is it?

 --
Richard Huxton
Archonet Ltd




Re: [GENERAL] Deletes and large tables

2005-06-10 Thread Edmund Dengler
Just did a sanity check. I dumped the DB schema, and there is indeed a
foreign key reference into the table. Interestingly, the referencing table
has no index on the referencing column, but it is a relatively small table
with only entries near the end of the large table.

So it looks like I was getting CPU bound because of a sequential scan of the
smaller table per delete.

Dropped the constraint, and deletes are now much faster.
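The other standard fix, for anyone who needs to keep the constraint, is to
index the referencing column so each per-row FK check can use an index scan
instead of a sequential scan of the referencing table (a sketch; the table
and column names here are hypothetical):

create index small_table_big_luid_idx on small_table (big_table_luid);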

Regards!
Ed

On Fri, 10 Jun 2005, David Gagnon wrote:


 This table has 3 foreign keys, but that should not matter during deletes.
 In addition, the tables being referred to are small, and should be in
 cache.
 
 
 I'm talking about FKs that point at this table... not FKs defined on this
 table that point to other tables.  If table A is referenced by 10 other
 tables, 10 referential checks are needed.


 There are no tables depending on it for references, so no dependent
 triggers should be running.
 
 Also, if this was a foreign key issue, I would expect I/O issues/bounds
 and not CPU.
 
 
 Maybe... I'm honestly not sure.

 Like I said in my previous mail... I had a similar problem (maybe not
 the same).  It was taking 10 minutes to delete 10k lines in a table.  I
 turned on some log info in postgresql.conf and saw that for each row
 deleted, 4 selects were issued to check FKs.  I dropped those FKs and
 after that the delete took less than a second.

  Hope it helps
 /David



 Regards!
 Ed
 
 
 On Fri, 10 Jun 2005, Richard Huxton wrote:
 
 
 
 Edmund Dengler wrote:
 
 
 Greetings!
 
 We have a table with more than 250 million rows. I am trying to delete the
 first 100,000 rows (based on a bigint primary key), and I had to cancel
 after 4 hours of the system not actually finishing the delete. I wrote a
 script to delete individual rows 10,000 at a time using transactions, and
 am finding each individual delete takes on the order of 0.1 seconds to 2-3
 seconds. There are 4 indexes on the table, one of which is very hashlike
 (ie, distribution is throught the index for sequential rows).
 
 
 I don't suppose it's off checking foreign-keys in a lot of tables is it?
 
 --
Richard Huxton
Archonet Ltd
 
 
 
 
 
 
 





[GENERAL] INHERITS and planning

2005-06-09 Thread Edmund Dengler
Greetings!

Is there an issue with planning when a large number of INHERITS tables
exist?

We have 2 base tables, and use INHERITS to partition the data. When we get
around 2000-2200 sub-tables (approx 1000-1100 per base table), planning a
SELECT statement on the base table (ie, to search all sub-tables) will
start slowing down dramatically (ie, feels like something exponential OR
some kind of in-memory to on-disk transition).

I haven't done enough to really plot out the planning times, but
definitely around 1600 tables we were getting sub-second plans, and around
2200 we were above 30 seconds.

Also, are there any plans to support proper partitioning/binning of data
rather than through INHERITS? I know it has been mentioned as upcoming at
some point, something similar to Oracle. I would like to put in a vote to support
auto-binning, in which a function is called to define the bin. The Oracle
model really only supports: (1) explicit partitioning (ie, every new
partition must be defined), or (2) hash binning. What we deal with is
temporal data, and we would like to bin on the hour or day automatically,
ideally so that whole bins can be truncated.

This helps us 2 ways: (1) data deletion is bulk (we currently drop a full
inherited table), (2) cancelling a VACUUM/SELECT doesn't take forever
while the execution engine finishes this table (we have had cancels
take 2 hours because the VACUUM was on a very large single table).

Regards!
Ed



[GENERAL] Optimizer and inherited tables

2005-06-08 Thread Edmund Dengler
(Sorry, wrong subject line got sent)

Greetings!

Does anybody know how well the optimizer works when dealing with inherited
tables? I am currently using 8.0.1.

I have a table called eventlog.record_main, and a number of inherited
tables to partition the data (called
eventlog_partition._day__record_main). luid is the primary key (all
tables have this indexed via the primary key).

The reason for doing this is that a single table would be way too big
(there are on average 6-7 million rows per table) so that vacuum and
deletes would be inefficient. Inserting has been much more efficient using
this mechanism.

When I try the following query, I get sequential scans:

explain select * from eventlog.record_main order by luid limit 5;

                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------
 Limit  (cost=160800332.75..160800332.77 rows=5 width=92)
   ->  Sort  (cost=160800332.75..161874465.60 rows=429653138 width=92)
         Sort Key: eventlog.record_main.luid
         ->  Result  (cost=0.00..11138614.37 rows=429653138 width=92)
               ->  Append  (cost=0.00..11138614.37 rows=429653138 width=92)
                     ->  Seq Scan on record_main  (cost=0.00..0.00 rows=1 width=92)
                     ->  Seq Scan on _20050401__record_main record_main  (cost=0.00..223029.64 rows=8620164 width=92)
                     ->  Seq Scan on _20050402__record_main record_main  (cost=0.00..201536.46 rows=7789446 width=92)
                     ->  Seq Scan on _20050403__record_main record_main  (cost=0.00..211277.34 rows=8165934 width=92)
                     ->  Seq Scan on _20050404__record_main record_main  (cost=0.00..219381.70 rows=8479170 width=92)
                     ->  Seq Scan on _20050405__record_main record_main  (cost=0.00..226305.94 rows=8746794 width=92)

(and so on)

Yet, when I run the query on a single table, I get index usage:

explain select * from eventlog_partition._20050601__record_main order by luid limit 5;
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..0.15 rows=5 width=92)
   ->  Index Scan using _20050601__record_main_pkey on _20050601__record_main  (cost=0.00..163375.23 rows=5460021 width=92)
(2 rows)

This means that any query that limits the rows will run extremely
inefficiently. Given a limit of 5, at most only 5 rows need to be
considered in each partition sub-table, so an optimal plan would run
a sub-query in each table limited to 5 rows, and then merge the results.
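A manual workaround in that spirit looks something like the following (just
a sketch: only two of the partition tables are shown, and a real query would
have to list every partition):

select * from (
    (select * from eventlog_partition._20050401__record_main order by luid limit 5)
    union all
    (select * from eventlog_partition._20050402__record_main order by luid limit 5)
) as merged
order by luid
limit 5;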

Any ideas/fixes/patches?

Regards!
Ed




Re: [GENERAL] vulnerability/SSL (fwd)

2005-06-08 Thread Edmund Dengler
Greetings!

Does anybody know how well the optimizer works when dealing with inherited
tables? I am currently using 8.0.1.

I have a table called eventlog.record_main, and a number of inherited
tables to partition the data (called
eventlog_partition._day__record_main). luid is the primary key (all
tables have this indexed via the primary key).

The reason for doing this is that a single table would be way too big
(there are on average 6-7 million rows per table) so that vacuum and
deletes would be inefficient. Inserting has been much more efficient using
this mechanism.

When I try the following query, I get sequential scans:

explain select * from eventlog.record_main order by luid limit 5;

                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------
 Limit  (cost=160800332.75..160800332.77 rows=5 width=92)
   ->  Sort  (cost=160800332.75..161874465.60 rows=429653138 width=92)
         Sort Key: eventlog.record_main.luid
         ->  Result  (cost=0.00..11138614.37 rows=429653138 width=92)
               ->  Append  (cost=0.00..11138614.37 rows=429653138 width=92)
                     ->  Seq Scan on record_main  (cost=0.00..0.00 rows=1 width=92)
                     ->  Seq Scan on _20050401__record_main record_main  (cost=0.00..223029.64 rows=8620164 width=92)
                     ->  Seq Scan on _20050402__record_main record_main  (cost=0.00..201536.46 rows=7789446 width=92)
                     ->  Seq Scan on _20050403__record_main record_main  (cost=0.00..211277.34 rows=8165934 width=92)
                     ->  Seq Scan on _20050404__record_main record_main  (cost=0.00..219381.70 rows=8479170 width=92)
                     ->  Seq Scan on _20050405__record_main record_main  (cost=0.00..226305.94 rows=8746794 width=92)

(and so on)

Yet, when I run the query on a single table, I get index usage:

explain select * from eventlog_partition._20050601__record_main order by luid limit 5;
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..0.15 rows=5 width=92)
   ->  Index Scan using _20050601__record_main_pkey on _20050601__record_main  (cost=0.00..163375.23 rows=5460021 width=92)
(2 rows)

This means that any query that limits the rows will run extremely
inefficiently. Given a limit of 5, at most only 5 rows need to be
considered in each partition sub-table, so an optimal plan would run
a sub-query in each table limited to 5 rows, and then merge the results.

Any ideas/fixes/patches?

Regards!
Ed




Re: [GENERAL] hpw to Count without group by

2005-06-02 Thread Edmund Bacon
[EMAIL PROTECTED] (Yudie Pg) writes:

 Hello,
 I have a table, structure like this:
  create table product(
  sku int4 not null,
  category int4 null,
  display_name varchar(100) null,
  rank int4 null
 );
  let's say some example data:
 sku, category, display_name, rank
 ===
 10001, 5, postgresql, 132
 10002, 5, mysql, 243
 10003, 5, oracle, 323
 10006, 7, photoshop, 53 
 10007, 7, flash mx, 88
 10008, 9, Windows XP, 44
  10008, 9, Linux, 74
  Expected query result:
  sku, category, display_name, category_count
 
 10001, 5, postgresql, 3
  10006, 7, photoshop, 2
  10008, 9, Windows XP, 2
  The idea is getting the highest-ranking product in each category and a COUNT
 of how many products are in the category, with a SINGLE query.
  The first 3 columns can be done with select distinct on (category) ...
 order by category, rank desc but that is still missing the category_count. I wish
 no subquery were needed, to keep the simplest query plan.
   Thank you. 
   Yudie G.

I do not believe you can do this without a subquery - you are trying
to get 2 separate pieces of information from your data
   * some data about the record having MAX(rank) for each category
and
   * the count of records in each category

Note, however, that you can get MAX(rank) and COUNT(category) for every
category in one sequential pass of the data, e.g.
 SELECT category, MAX(rank), COUNT(category) FROM product GROUP BY category;

Joining this with the original table is not too difficult:

SELECT sku, category, display_name, category_count 
  FROM  product
  JOIN (SELECT category, MAX(rank) AS rank, COUNT(category) AS category_count
   FROM product 
   GROUP BY category) subq
USING(category, rank)
 ORDER BY sku;

Depending on what your data looks like, you might improve things by
having an index on category, and perhaps on (category, rank).

Note that there may be a problem with this query: if you have more
than one product with the same rank in the same category, you may get
more than one record for that category.  Apply DISTINCT ON as
necessary.
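A sketch of that DISTINCT ON variant, keeping one row per category when
ranks tie:

SELECT DISTINCT ON (category)
       sku, category, display_name, category_count
  FROM product
  JOIN (SELECT category, MAX(rank) AS rank, COUNT(category) AS category_count
          FROM product
         GROUP BY category) subq
 USING(category, rank)
 ORDER BY category, sku;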

-- 
Remove -42 for email



Re: [GENERAL] Count and Results together

2005-05-20 Thread Edmund Bacon
[EMAIL PROTECTED] (Jan Sunavec) writes:

 I am using libpg.so.
I assume that you mean libpq?
 I tried to find a solution for this problem on the internet but I haven't
 found anything yet. My idea is to get the row count through some function
 written in C. Or is there any plan to add this feature to PostgreSQL?


Have you read your documentation?  Perhaps this page  would be of
interest to you?
http://www.postgresql.org/docs/8.0/interactive/libpq-exec.html#LIBPQ-EXEC-SELECT-INFO

 

-- 
Remove -42 for email



Re: [GENERAL] CSV delim quoting differences PgCOPY, Excel etc...

2005-05-18 Thread Edmund Bacon
[EMAIL PROTECTED] (Jerry Sievers) writes:

 Hello.
 
 Anyway, I am often enough having to load Pg databases using SQL COPY
 from CSV output written by Excel, that I've had to write a script to
 change the quoting behavior from Excel's putting double quotes around
 a field having embedded delim chars, to Pg's backslash quoting.
 
 I've got a hunch that one or both of the aforementioned softwares can
 change their quoting behavior.  (IT just seems reasonable... famous
 last words.)
 
 Is it true that neither of Pg SQL \copy and/or Excel can adapt as I
 wish?  What about psql \copy?  I fail to read in the docs for the Pg
 products any support for this.
 
 If Excel can adapt, at least one experienced user I've asked, doesn't
 know about it.
 
 Thanks.

As of version 8.0 Pg has been taught how to do the right thing to
copy CSV files: 

COPY mytable TO 'myfile' WITH CSV;
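For loading the Excel-generated file in the other direction, a sketch
(mytable and myfile are the placeholder names from above; HEADER is only
needed if the first line holds column names; COPY FROM reads a file on the
server, so the path must be absolute there):

COPY mytable FROM 'myfile' WITH CSV HEADER;

or, from psql without needing server-side file access:

\copy mytable from 'myfile' with csv header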
 

-- 
Remove -42 for email



Re: [GENERAL] Debugging deadlocks

2005-04-02 Thread Edmund Bacon
[EMAIL PROTECTED] (Bruno Wolff III) writes:

 Using domains is a good way to keep column constraints in just one place.
 

Speaking of domains, how do you find out what the range of a domain
is?

eg:

test=# create domain fruit as text
check( value in ('apple', 'orange', 'banana', 'pear'));
CREATE DOMAIN
test=# \dD fruit
          List of domains
 Schema | Name  | Type | Modifier
--------+-------+------+----------
 public | fruit | text |
(1 row)

test=# \dD+ fruit
          List of domains
 Schema | Name  | Type | Modifier
--------+-------+------+----------
 public | fruit | text |
(1 row)

A quick look through pg_catalog doesn't suggest anything that would
return the check conditions.  Is there any way to do this?
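One place that does seem to hold them (a sketch, not a polished answer):
domain constraints are rows in pg_constraint keyed by contypid, so something
like this should show the check expression for the domain above:

SELECT conname, pg_get_constraintdef(oid)
  FROM pg_constraint
 WHERE contypid = 'fruit'::regtype;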

-- 
Remove -42 for email



Re: [GENERAL] fied separator change from the shell command line

2005-03-13 Thread Edmund Bacon
[EMAIL PROTECTED] writes:

 Hello,
 
 I'm trying to change the usual | table field separator from the shell
 command line:
 psql -d ect -f pl_lost.sql -o pl_lost.out.txt  -F \t -U asaadmin
 
 But it doesn't work. It keeps the same | separator in the output
 file.
 Can anyone please help me? I need to output to a tab-separated file.
 
 Thanks in advance,

Changing the field separator only works with unaligned output. You
need to add a -A to your command line switches (or --no-align). Doing
this, though, drops the column headers.
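So, assuming a bash-style shell (where $'\t' produces a literal tab
character, since psql takes the -F string literally rather than interpreting
backslash escapes), the command line would look something like:

psql -d ect -f pl_lost.sql -o pl_lost.out.txt -A -F $'\t' -U asaadmin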


-- 
Remove -42 for email



Re: [GENERAL] Stuck with a query...

2005-03-13 Thread Edmund Bacon
[EMAIL PROTECTED] (Greg Stark) writes:

 Geoff Caplan [EMAIL PROTECTED] writes:
 
  Hi folks,
  
  Sorry to ask a newbie SQL question but I'm struggling...
 
 There's no efficient way to write this in standard SQL. However Postgres has
 an extension DISTINCT ON that would do it:
 
 select url, count(*)
   from (select distinct on (session_id)
                url
           from clickstream
          order by session_id, sequence_num desc
        ) as last_clicks
  group by url
 
 This isn't going to be a superfast query. It has to sort all the clickstream
 records by session and sequence, take just the last one, then probably sort
 those again.


As an experiment I tried a more  'standard SQL' approach to this problem:

SELECT url, count(1) 
  FROM clickstream 
 WHERE (session_id, sequence_num) IN 
 (SELECT session_id, max(sequence_num) 
FROM clickstream 
GROUP BY session_id)
GROUP BY url;

On a table with about 100,000 rows this runs in about 720ms on my
system, compared to the DISTINCT ON version which runs in about
1000ms.  Adding an index on (session_id, sequence_num) reduced the run
time to about 690ms, but made no difference to the DISTINCT ON
version.  With only about 10,000 rows, there's no appreciable
difference. This surprised me, because I expected the DISTINCT ON to
be better.


-- 
Remove -42 for email



Re: [GENERAL] Novice Question

2005-03-01 Thread Edmund Bacon
[EMAIL PROTECTED] (Michael Romagnoli) writes:

 What kind of command would I run if I wanted to copy an entire table
 (along with renaming it, and, of course, all data from the first table
 -
 some of which is binary)?

SELECT * INTO newtable FROM oldtable;

Note that this doesn't construct indexes, Foreign keys, constraints,
etc.
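An equivalent spelling, if you prefer it (also copies no indexes or
constraints):

CREATE TABLE newtable AS SELECT * FROM oldtable;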

If by 'binary data' you mean BLOBs, I'd expect the above to work.
Other than that, AFAIUI you have no reasonable expectation that your data is
stored in any meaningful binary format by the database.  All data
could be internally stored as strings (though that might be very
slow).



-- 
Remove -42 for email



Re: [GENERAL] basic trigger using OLD not working?

2005-02-28 Thread Edmund Bacon
[EMAIL PROTECTED] (Rick Casey) writes:


 CREATE OR REPLACE FUNCTION logPedigreesDel() RETURNS TRIGGER AS '
 begin
 RAISE EXCEPTION ''OLD.famindid = '', OLD.famindid;

  RAISE EXCEPTION ''OLD.famindid = %'', OLD.famindid;  -- note the added % placeholder

 return OLD;
 end;
 ' LANGUAGE plpgsql;


-- 
Remove -42 for email



[GENERAL] Multi-column indexes

2005-01-15 Thread Edmund Dengler
Greetings!

I have a technical question concerning multi-column indexes and their
implementation. I tried looking for the answer in the docs but couldn't
find anything.

I have the following table:

eventlog= \d agent.record
  Table agent.record
   Column   |   Type   |
Modifiers
+--+-
 luid   | bigint   | not null default 
nextval('agent.record_luid_seq'::text)
 host_luid  | bigint   |
 remote_system_luid | bigint   |
 log_luid   | bigint   | not null
 time_logged| timestamp with time zone | not null default now()
 record | bytea| not null
 error  | boolean  |
 error_reason   | text |
Indexes:
record_pkey primary key, btree (luid)
record_to_process_idx unique, btree (host_luid, log_luid, luid) WHERE 
(error IS NULL)
record_to_process_idx2 unique, btree (luid) WHERE (error IS NULL)
record_last_logged_idx btree (time_logged, host_luid, log_luid, luid)
Foreign-key constraints:
$1 FOREIGN KEY (host_luid) REFERENCES eventlog.host(luid) ON UPDATE 
CASCADE ON DELETE CASCADE
$2 FOREIGN KEY (remote_system_luid) REFERENCES 
eventlog.remote_system(luid)
$3 FOREIGN KEY (log_luid) REFERENCES eventlog.log(luid) ON UPDATE CASCADE 
ON DELETE CASCADE

consisting of 27306578 rows.


So I try running the following query:

explain analyze
select record
from agent.record
where host_luid = 3::bigint
  and log_luid = 2::bigint
  and error is null
order by host_luid desc, log_luid desc, luid desc
limit 1

I get the following query plan:


 Limit  (cost=0.00..1.47 rows=1 width=286) (actual time=249064.949..249064.950 rows=1 loops=1)
   ->  Index Scan Backward using record_to_process_idx on record  (cost=0.00..13106.73 rows=8898 width=286) (actual time=249064.944..249064.944 rows=1 loops=1)
         Index Cond: ((host_luid = 3::bigint) AND (log_luid = 2::bigint))
         Filter: (error IS NULL)
 Total runtime: 249065.004 ms
(5 rows)

Now, this plan seems kinda slow, in the sense of scanning backwards. And
it takes quite a long time (compared to seeking the last row based only on
luid, for example). It feels that if I have host_luid values of
(1,2,3,4,5), that the above is scanning through _all_ 5 entries, then 4
entries, and then finally gets to 3.

So, now to my question: is this really happening?

I guess it breaks down to how these indexes are implemented. Are
multi-column indexes implemented as true multiple level indexes, in the
sense there is a level 1 index on host_luid, pointing to a level 2 index
on log_luid, pointing to a level 3 index on luid? Or are they the
equivalent of a host_luid,log_luid,luid single index (ie, as if I
created a functional index consisting of
host_luid || ',' || log_luid || ',' || luid
)?

My initial guess was that Postgresql would search first to the host_luid
desc, then from there to the specific log_luid desc, and then from there
to the luid (ie, the equivalent of 3 btree jumps), essentially skipping
over the inappropriate host_luids of 5 and 4. But it seems to be
scanning through them, even though I have a low cost for random page
accesses in my postgresql.conf file. Are there components of the index
that allow it to skip backwards over lots of pages rather than loading them
all from disk?

Any ideas? How do multi-column indexes really work? I would hate to have
to define specific indexes for each host_luid, as that would be an
unmaintainable situation.

Thanks!
Ed




Re: [GENERAL] Multi-column indexes

2005-01-15 Thread Edmund Dengler
Hi Tom!

Yep, there are a large number of host_luid/log_luid combinations (there
are approximately 5-10 hosts and 1-3 logs per system we are running).

Thanks for the recommended workaround, I'll have a try at it at some point
tomorrow.
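For the record, the workaround applied to the original query would look
something like this (a sketch; 9223372036854775807 simply stands in for
"some very large bigint"):

select record
from agent.record
where host_luid = 3::bigint
  and log_luid = 2::bigint
  and luid <= 9223372036854775807::bigint
  and error is null
order by host_luid desc, log_luid desc, luid desc
limit 1;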

Regards!
Ed


On Sat, 15 Jan 2005, Tom Lane wrote:

 Edmund Dengler [EMAIL PROTECTED] writes:
  record_to_process_idx unique, btree (host_luid, log_luid, luid) WHERE 
  (error IS NULL)

  explain analyze
  select record
  from agent.record
  where host_luid = 3::bigint
and log_luid = 2::bigint
and error is null
  order by host_luid desc, log_luid desc, luid desc
  limit 1

   Limit  (cost=0.00..1.47 rows=1 width=286) (actual time=249064.949..249064.950 rows=1 loops=1)
     ->  Index Scan Backward using record_to_process_idx on record  (cost=0.00..13106.73 rows=8898 width=286) (actual time=249064.944..249064.944 rows=1 loops=1)
           Index Cond: ((host_luid = 3::bigint) AND (log_luid = 2::bigint))
           Filter: (error IS NULL)
   Total runtime: 249065.004 ms

 Are there a whole lotta rows with that host_luid/log_luid combination?

 What's happening is that the index search initially finds the first such
 row, and then it has to step to the last such row to start the backwards
 scan.  This is fixed as of 8.0, but all earlier releases are going to be
 slow in that scenario.  It's got nothing to do with single vs multi
 column indexes, it is just a shortcoming of the startup code for
 backwards index scans.  (I get the impression that the original
 implementation of Postgres' btree indexes only supported unique indexes,
 because there were a number of places where it was horridly inefficient
 for large numbers of equal keys.  I think this 8.0 fix is the last such
 issue.)

 Since your index has an additional column, there is a hack you can use
 to get decent performance in 7.4 and before.  Add a dummy condition on
 the last column:
    where host_luid = 3::bigint
      and log_luid = 2::bigint
      AND LUID <= someverylargevalue::bigint
      and error is null
    order by host_luid desc, log_luid desc, luid desc
    limit 1
 Now, instead of positioning to the first row with value (3,2,anything)
 the initial btree descent will home in on the end of that range, and
 so the expensive stepping over all the rows between is avoided.

   regards, tom lane





[GENERAL] select into temp tables withough using EXECUTE in plpgsql

2004-11-24 Thread Edmund Kleiser
I have a query surrounding something that seems to be a fairly frequent
FAQ. It concerns the use of temporary tables in plpgsql.

This initially resulted in the good old 'oid not found' error.
So I learnt from the mailing list that I should be 'executing' (with the EXECUTE
command) my queries, because expressions used in a PL/pgSQL function are only
prepared and saved once (without using EXECUTE).

However I would like to select distinct values in my temp table, which seems
impossible as:

 SELECT INTO is not currently supported within EXECUTE. So, the only way to 
extract a result from a dynamically-created SELECT is to use the FOR ... 
EXECUTE form described later.

(http://www.postgresql.org/docs/7.1/static/plpgsql-description.html)
I either cannot find, or do not understand, the documentation for this "FOR
... EXECUTE" form being described somewhere later in the documentation.

So to recap: I'm creating a temp table fine.
I'm EXECUTING an insert into the temp table fine.
Then I cannot select from the table
in the form:
SELECT INTO int1 count(distinct(value)) from TEMP1;
Then it happily drops the table (without the select).
Any advice on how to select from a temp table into a variable would be
gratefully received.
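For what it's worth, this is roughly what that "FOR ... EXECUTE" form looks
like when used to pull a single value out of a dynamically referenced temp
table (just a sketch, using the column and table names from above; the whole
function body is in single quotes, so quotes inside it are doubled):

CREATE OR REPLACE FUNCTION count_temp1() RETURNS integer AS '
DECLARE
    r    record;
    int1 integer;
BEGIN
    -- FOR ... EXECUTE plans the statement at run time, so the temp
    -- table''s OID is looked up fresh on each call.
    FOR r IN EXECUTE ''SELECT count(DISTINCT value) AS n FROM temp1'' LOOP
        int1 := r.n;
    END LOOP;
    RETURN int1;
END;
' LANGUAGE plpgsql;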

Many Thanks
Edmund



Re: [GENERAL] disabling constraints

2004-11-07 Thread Edmund Bacon
[EMAIL PROTECTED] (Vivek Khera) writes:

  DP == David Parker [EMAIL PROTECTED] writes:
 
 DP I would like to be able to truncate all of the tables in a schema
 DP without worrying about FK constraints. I tried issuing a SET
 DP CONSTRAINTS ALL DEFERRED before truncating, but I still get
 DP constraint errors. Is there a way to do something like:
 
 Why don't you truncate your tables in an order that won't violate
 FK's?  Or do you have circular references?
 
 -- 

Because that doesn't work:

test=# create table able(id serial primary key, data text);
NOTICE:  CREATE TABLE will create implicit sequence able_id_seq for serial 
column able.id
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index able_pkey for 
table able
CREATE TABLE
test=# create table baker(id int references able(id) deferrable, data text);
CREATE TABLE
test=# truncate able;
ERROR:  cannot truncate a table referenced in a foreign key constraint
DETAIL:  Table baker references able via foreign key constraint $1.
test=# begin;
BEGIN
test=# set constraints all deferred;
SET CONSTRAINTS
test=# truncate able;
ERROR:  cannot truncate a table referenced in a foreign key constraint
DETAIL:  Table baker references able via foreign key constraint $1.
test=# rollback;
ROLLBACK
test=#

-- 
Remove -42 for email



[GENERAL] UTF-8 and =, LIKE problems

2004-11-03 Thread Edmund Lian
I am running a web-based accounting package (SQL-Ledger) that supports 
multiple languages on PostgreSQL. When a database encoding is set to 
Unicode, multilingual operation is possible.

However, when a user's input language is set to say English, and the 
user enters data such as 79, the data that is sent back to PostgreSQL 
for storage is U+FF17 U+FF19, which are the Unicode full width 
characters for 79. So far so good.

Now, if the user switches languages and enters 79 as a search key, the 
previously entered row will not be found with the LIKE or = operators, 
and all other comparison operations will fail too. The problem is that 
the browser now sends back U+0037 U+0039, which are the Unicode half width 
(plain ASCII) characters for 79.

Semantically, one might expect U+FF17 U+FF19 to be identical to U+0037 
U+0039, but of course they aren't if a simple-minded byte-by-byte or 
character-by-character comparison is done.

In the ideal case, one would probably want to convert all full width 
chars to their half width equivalents, because the numbers look weird on 
the screen (e.g., ７９ Ｂｒｉｓｂａｎｅ Ｓｔｒｅｅｔ instead of 79 
Brisbane Street). Is there any way to get PostgreSQL to do so?

Failing this, is there any way to get PostgreSQL to be a bit smarter in 
doing comparisons? I think I'm SOL, but I thought I'd ask anyway.
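For the narrow case of digits, one sketch (assuming a Unicode-encoded
database, and that translate() handles the multibyte characters involved)
is to map the full width digits to ASCII before comparing; the table and
column names here are hypothetical:

select * from addresses
where translate(street, '０１２３４５６７８９', '0123456789')
      like '79%';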

...Edmund.


Re: [GENERAL] Issue adding foreign key

2004-10-31 Thread Edmund Bacon
[EMAIL PROTECTED] (George Woodring) writes:

 I have 2 existing tables in my db:
 
 iss= \d pollgrpinfo
  Table public.pollgrpinfo
 Column |  Type  | Modifiers
 ---++---
  pollgrpinfoid | integer| not null
  pollgrpid | integer| not null
  name  | character varying(100) |
  descript  | character varying(200) |
 Indexes:
 pollgrpinfo_pkey primary key, btree (pollgrpinfoid)
 pollgrpinfo_pollgrpid_key unique, btree (pollgrpid)
  
 iss= \d notpoll
Table public.notpoll
Column|  Type  |  Modifiers
 -++-
  notpollid   | integer| not null
  pollgrpid   | integer|
  notgroupsid | integer|
  alerting| character(1)   | default 'y'::bpchar
  disuser | character varying(50)  |
  distime | integer|
  alertingcom | character varying(200) |
 Indexes:
 notpoll_pkey primary key, btree (notpollid)
 notpoll_pollgrpid_key unique, btree (pollgrpid)
 notpoll_alerting_index btree (alerting)
 Triggers:
 RI_ConstraintTrigger_2110326 AFTER INSERT OR UPDATE ON notpoll
 FROM notgroups NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE
 PROCEDURE RI_FKey_check_ins('notgroups_exists', 'notpoll',
 'notgroups', 'UNSPECIFIED', 'notgroupsid', 'notgroupsid')
 
 I am trying to add a foreign key to the notpoll table
 
 iss= alter table notpoll add constraint pollgrp_exists foreign
 key(pollgrpid) references pollgrpinfo on delete cascade;
 ERROR:  insert or update on table notpoll violates foreign key
 constraint pollgrp_exists
 DETAIL:  Key (pollgrpid)=(7685) is not present in table pollgrpinfo.
 

When expressed as FOREIGN KEY (foo) REFERENCES mytable, postgresql
assumes that foo references the PRIMARY KEY of table mytable. If you
are not referencing the primary key, you need to tell postgresql which
column, e.g. FOREIGN KEY (foo) REFERENCES mytable(foo).
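Applied to this case, the corrected statement would point the FK at
pollgrpinfo.pollgrpid rather than at the primary key (a sketch; it will
still fail if pollgrpid values such as 7685 really are missing from
pollgrpinfo):

alter table notpoll add constraint pollgrp_exists
  foreign key (pollgrpid) references pollgrpinfo (pollgrpid)
  on delete cascade;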


-- 
Remove -42 for email



Re: [GENERAL] adding missing FROM-clause

2004-10-31 Thread Edmund Bacon
[EMAIL PROTECTED] (C G) writes:

 Dear All,
 
 I have a simple join query
 
 SELECT c1 FROM t1
 INNER JOIN
 t2 ON t2.c2 = t1.c2 WHERE t3.c3= t2.c3;
 
 Which gives the expected result but I get the message
 NOTICE:  adding missing FROM-clause entry for table t3
 
 How do I get rid of this NOTICE, i.e. how should I construct my select
 query.


SELECT c1 FROM t1, t2, t3
   WHERE t2.c2 = t1.c2 AND t3.c3 = t2.c3;

or

SELECT c1 FROM t1
  INNER JOIN t2 ON t2.c2 = t1.c2
  INNER JOIN t3 ON T3.c3 = t2.c3;


The above can also be written as

SELECT c1 FROM t1
   JOIN t2 USING(c2)
   JOIN t3 USING(c3);

or even
SELECT c1 FROM t1
  NATURAL JOIN t2
  NATURAL JOIN t3;

This last might be problematic if t3 has a column named c1. 


Question:

Is there any advantage to specifying USING() rather than ON?  I know
that if I do SELECT * from T1 JOIN t2 USING(col) then I only get 1
instance of col in the returned rows, but I'm wondering if there is
any advantage to the planner by specifying either USING() or ON?


-- 
Remove -42 for email



Re: [GENERAL] ON DELETE trigger blocks delete from my table

2004-10-30 Thread Edmund Bacon
[EMAIL PROTECTED] (Naeem Bari) writes:

 I understand. Makes sense. Is there any way for my trigger function to
 know whether it is being called on a delete or on an update? Because I do
 need to return new on update... and I really don't want to write 2
 different functions, one for update and one for delete...


Yes, plpgsql sets a variable TG_OP to INSERT, UPDATE or DELETE.

so, for example

IF ( TG_OP = ''DELETE'' ) THEN
RETURN old;
ELSE
RETURN new;
END IF;
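A complete sketch of a single function shared by UPDATE and DELETE
triggers (the function name here is hypothetical):

CREATE OR REPLACE FUNCTION my_shared_trigger() RETURNS trigger AS '
BEGIN
    IF ( TG_OP = ''DELETE'' ) THEN
        RETURN old;
    ELSE
        RETURN new;
    END IF;
END;
' LANGUAGE plpgsql;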

-- 
Remove -42 for email



Re: [GENERAL] earthdistance is not giving correct results.

2004-10-03 Thread Edmund Bacon
[EMAIL PROTECTED] (mike cox) writes:

 I'm running PostgreSQL 8.0 beta 1.  I'm using the
 earthdistance to find the distance between two
 different latitude and logitude locations. 
 Unfortunately, the result seems to be wrong.
 
 Here is what I'm doing:
 select
 earth_distance(ll_to_earth('122.55688','45.513746'),ll_to_earth('122.396357','47.648845'));
 
 The result I get is this:
 

I believe ll_to_earth() is expecting ll_to_earth(latitude, longitude).

Also, I think earth_distance returns its value in meters.
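So the same query with the arguments swapped into (latitude, longitude)
order would be (a sketch; the result is in meters):

select
earth_distance(ll_to_earth(45.513746, 122.55688),
               ll_to_earth(47.648845, 122.396357));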



Re: [GENERAL] SELECT based on function result

2004-07-18 Thread Edmund Bacon
Robert Fitzpatrick wrote:

 I have a function that tells me if a record is positive and negative
 based on several field values. I use it in select statements:
 
 ohc= SELECT sample_id, is_wipe_positive(tblleadwipe.sample_id) AS
 positive FROM tblleadwipe WHERE hud_building_id IS NOT NULL;
  sample_id | positive
 -----------+----------
         73 | f
         81 | t
 (2 rows)
 
 I see that I cannot change my WHERE statement to WHERE positive = 't'
 because the column positive does not exist. Now I am looking for the
 best way to return all records that are found positive or negative using
 a query. Can anyone offer any guidance to how I can return all the
 positives (or negatvies)? Or do I need to write another function that
 does that?
 
 --
 Robert


Any reason why:

SELECT sample_id, positive
  FROM (SELECT sample_id, 
   is_wipe_positive(tblleadwipe.sample_id) AS positive
  FROM tblleadwipe
  WHERE hud_building_id IS NOT NULL) foo
  WHERE positive = 't';

won't work? 
  
 




Re: [GENERAL] Enough RAM for entire Database.. cost aside, is this

2004-07-08 Thread Edmund Dengler
Greetings!

On Fri, 2 Jul 2004, Mike Rylander wrote:

 I find that experience does not bear this out.  There is a saying a coworker
 of mine has about apps that try to solve problems, in this case caching,
 that are well understood and generally handled well at other levels of the
 software stack... he calls them "too smart by half" :)

But on the other hand, general algorithms which are designed to work under
a wide variety of circumstances may fail in specific cases. I am thinking
of VACUUM, which would kill most caching algorithms simply because we
cannot tell the O/S "by the by, this set of pages will not be used again,
so it would be fine to use almost none of the general cache to store
them". All algorithms have assumptions about value distribution and
usage. Caches depend on locality of reference, and we do not have an easy
way to say when this is broken.

Regards!
Ed



[GENERAL] Unable to use index?

2004-04-29 Thread Edmund Dengler
Hi folks!

A query I am running does not seem to use indexes that are available
(running version 7.4.2). I have the following table:

= \d replicated
                                Table "public.replicated"
     Column      |           Type           |                      Modifiers
-----------------+--------------------------+------------------------------------------------------
 rep_id          | bigint                   | not null default nextval('replicated_id_seq'::text)
 rep_component   | character varying(100)   |
 rep_key1        | integer                  |
 rep_key2        | bigint                   |
 rep_key3        | smallint                 |
 rep_replicated  | timestamp with time zone |
 rep_remotekey1  | integer                  |
 rep_remotekey2  | bigint                   |
 rep_remotekey3  | smallint                 |
 rep_key2b       | bigint                   |
 rep_remotekey2b | bigint                   |
 rep_key4        | text                     |
Indexes:
    "replicated_pkey" primary key, btree (rep_id)
    "replicate_key1_idx" btree (rep_key1, rep_key2, rep_key3)
    "replicated_item2_idx" btree (rep_component, rep_key2, rep_key3)
    "replicated_item_idx" btree (rep_component, rep_key1, rep_key2, rep_key3)
    "replicated_key2_idx" btree (rep_key2, rep_key3)
    "replicated_key4_idx" btree (rep_key4)

= analyze verbose replicated;
INFO:  analyzing "public.replicated"
INFO:  "replicated": 362140 pages, 3 rows sampled, 45953418 estimated total rows
ANALYZE

The following does not use an index, even though two are available for the
specific selection of rep_component.

= explain analyze select * from replicated where rep_component = 'ps_probe' limit 1;
                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..0.23 rows=1 width=101) (actual time=34401.857..34401.859 rows=1 loops=1)
   ->  Seq Scan on replicated  (cost=0.00..936557.70 rows=4114363 width=101) (actual time=34401.849..34401.849 rows=1 loops=1)
         Filter: ((rep_component)::text = 'ps_probe'::text)
 Total runtime: 34401.925 ms
(4 rows)

Yet, if I do the following, an index will be used, and it runs much
faster (even when I swapped the order of the execution).

= explain analyze select * from replicated where rep_component = 'ps_probe' order by rep_component limit 1;
                                                                      QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..1.66 rows=1 width=101) (actual time=51.163..51.165 rows=1 loops=1)
   ->  Index Scan using replicated_item2_idx on replicated  (cost=0.00..6838123.76 rows=4114363 width=101) (actual time=51.157..51.157 rows=1 loops=1)
         Index Cond: ((rep_component)::text = 'ps_probe'::text)
 Total runtime: 51.265 ms
(4 rows)

Any reason why the index is not chosen? Maybe I need to up the number of
rows sampled for statistics?
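
If that is the fix, I assume it would be something along these lines (the
target value of 100 is just a guess), followed by re-checking the plan:

ALTER TABLE replicated ALTER COLUMN rep_component SET STATISTICS 100;
ANALYZE replicated;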

Regards!
Ed



[GENERAL] unsubscribe

2004-02-17 Thread Edmund Zynda

unsubscribe

Re: [GENERAL] SET within a function?

2003-10-15 Thread Edmund Dengler
I guess it comes back to the semantics of NULL. As has been pointed out in
many a database course, what we mean by NULL changes, and how we want to
use NULL depends on the circumstances.

Normally, when I am comparing rows, I do want NULL <> NULL. In
this specific instance, no value has been assigned to the specific
column for this row, so NULL is appropriate. However, there are cases
where I am trying to explicitly test for the existence of a specific row
in the table, and in this case I _do_ want a NULL == NULL type of
comparison. I could try to specify a dummy value (in this case, I could
put in -1), but then I am trying to create a second class of NULLs, and
this is usually not considered good design.

Note that a prime example of how postgresql itself is not consistent
(in the strictest sense) is GROUP BY, which treats NULL == NULL
(interesting side bar: is there a way to cause GROUP BY to treat NULLs as
not equal to each other?). As a theoretical question, how is this
justified if NULL should not equal NULL (other than that it is in the
spec)?

Also, is there a particular reason for not having a strict equality
operator (or is it simply because it is not in the specification)?
Performance? No support from the back-end? Something else?
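
For what it's worth, the closest thing I can find to a null-safe equality is
building it from IS DISTINCT FROM (if the server version at hand supports
it), along these lines -- just a sketch of the WHERE clause, and as far as I
can tell it still will not use a plain index the way a simple equality
would:

  ... AND NOT (x1 IS DISTINCT FROM old.x1)
      AND NOT (x2 IS DISTINCT FROM old.x2)
      AND NOT (x3 IS DISTINCT FROM old.x3)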

Regards,
Ed

On Wed, 15 Oct 2003, Tom Lane wrote:

 Edmund Dengler [EMAIL PROTECTED] writes:
  ... I have no real choice in this as there is no way to specify that
  NULL == NULL.

 The conventional wisdom on this is that if you think you need NULL ==
 NULL to yield true, then you are misusing NULL, and you'd better
 reconsider your data representation.  The standard semantics for NULL
 really do not support any other interpretation of NULL than "I don't
 know what this value is".  If you are trying to use NULL to mean
 something else, you will face nothing but misery.  Choose another
 representation for whatever you do mean.

   regards, tom lane





Re: [GENERAL] SET within a function?

2003-10-14 Thread Edmund Dengler
The problem I would face is that this still needs to be a sequential scan
of the table rather than an index lookup.

Regards,
Ed

On Tue, 14 Oct 2003, Arthur Ward wrote:

  Is the rewrite only for the literal 'X = NULL' or will it do a test
  against a value such as 'X = OLD.X' (and rewrite it if OLD.X is NULL)?
 
  Is there any way to match NULLS to each other (as I am looking for a
  literal row, not using NULL as the UNKNOWN). I suppose I could put in a
  dummy value for the 'Not a valid value', but it seems to be quite awkward
  when I really do want the NULL.

 I ended up writing an equivalent function for the project I'm working
 on. It goes like this in plpgsql:

   IF $1 IS NULL THEN
   RETURN $2 IS NULL;
   ELSIF $2 IS NULL THEN
   -- We already know $1 is not null.
   RETURN FALSE;
   ELSE
   -- Both args are not null.
   RETURN $1 = $2;
   END IF;

 That's the basic idea. I put a wrapper around this to generate a copy of
 it for all the data types used in my database.
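
Wrapped up as a complete function for one type (integer picked arbitrarily;
the wrapper would stamp out one of these per data type, and the name here is
only illustrative), that would be roughly:

CREATE OR REPLACE FUNCTION equal_or_both_null(integer, integer)
RETURNS boolean AS '
BEGIN
    IF $1 IS NULL THEN
        RETURN $2 IS NULL;
    ELSIF $2 IS NULL THEN
        -- We already know $1 is not null.
        RETURN FALSE;
    ELSE
        -- Both args are not null.
        RETURN $1 = $2;
    END IF;
END;
' LANGUAGE plpgsql;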




[GENERAL] SET within a function?

2003-10-13 Thread Edmund Dengler
Hi all!

I am doing some trigger functions that need to find a tuple in another
table. The problem is that this second table is doing some summarization
work, and I need nulls to equal each other.

Basically, in the trigger I do a:

  SELECT INTO ... x
    FROM table1
   WHERE ...(some straightforward x = old.x)...
     AND (x1 = old.x1 OR (x1 is null and old.x1 is null))
     AND (x2 = old.x2 OR (x2 is null and old.x2 is null))
     AND (x3 = old.x3 OR (x3 is null and old.x3 is null));

The problem is that an index is used only for the straightforward
conditions, and the x1, x2, x3 conditions are then applied as a filter over
the index scan rather than directly as index conditions. Unfortunately for
the data set I have, it can be clustered pretty badly around the
straightforward stuff, and so the scan can take multiple seconds per call.

I think if I could do a 'SET TRANSFORM_NULL_EQUALS TO ON' then this might
fix the issue (don't know, haven't tried it yet). My question is: can this
be done within a function such that at the end of the function, the value
is reset back to the value it had upon entering (kind of like 'SET LOCAL'
except for just the length of the function call). Is this possible?
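
What I had in mind is something along these lines inside the plpgsql body
(only a sketch, assuming a release with current_setting()/set_config();
old_setting would be declared as text, and I suspect TRANSFORM_NULL_EQUALS
may only rewrite a literal '= NULL', in which case it won't help here
anyway):

    old_setting := current_setting('transform_null_equals');
    PERFORM set_config('transform_null_equals', 'on', false);
    -- ... the SELECT INTO above ...
    PERFORM set_config('transform_null_equals', old_setting, false);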

Thanks!
Ed



Re: [GENERAL] SET within a function?

2003-10-13 Thread Edmund Dengler
Is the rewrite only for the literal 'X = NULL' or will it do a test
against a value such as 'X = OLD.X' (and rewrite it if OLD.X is NULL)?

Is there any way to match NULLS to each other (as I am looking for a
literal row, not using NULL as the UNKNOWN). I suppose I could put in a
dummy value for the 'Not a valid value', but it seems to be quite awkward
when I really do want the NULL.

Regards!
Ed

On Mon, 13 Oct 2003, Bruno Wolff III wrote:

 On Mon, Oct 13, 2003 at 21:16:33 -0400,
   Edmund Dengler [EMAIL PROTECTED] wrote:
 
  I think if I could do a 'SET TRANSFORM_NULL_EQUALS TO ON' then this might
  fix the issue (don't know, haven't tried it yet). My question is: can this
  be done within a function such that at the end of the function, the value
  is reset back to value upon entering (kind of like 'SET LOCAL' except for
  just the length of the function call). Is this possible?

 I don't think that will do what you want. That setting is used to
 rewrite = null as is null, not to change things so that nulls match each
 other.





Re: [GENERAL] Buglist

2003-08-21 Thread Edmund Dengler
Well, if they are locked waiting on vacuum, then vacuum should upgrade
its priority to that of the highest-priority waiting process (priority
inheritance). This way, vacuum will be running at a priority level
equivalent to whoever is waiting on it.

Regards,
Ed

On Thu, 21 Aug 2003, Andrew Sullivan wrote:

 On Wed, Aug 20, 2003 at 11:41:41PM +0200, Karsten Hilbert wrote:
  You mean, like, nice 19 or so ?

 ISTR someone reporting problems with locking on the performance list
 from doing exactly that.  The problem is that the vacuum back end
 might take a lock and then not get any processor time -- in which
 case everybody else gets their processor slice but can't do anything,
 because they have to wait until the niced vacuum process gets back in
 line.

 A

 --
 
 Andrew Sullivan 204-4141 Yonge Street
 Liberty RMS   Toronto, Ontario Canada
 [EMAIL PROTECTED]  M2P 2A8
  +1 416 646 3304 x110







Re: [GENERAL] Bulk Insert / Update / Delete

2003-08-21 Thread Edmund Dengler
Wasn't there a feature in some SQL database which was the equivalent of
UPDATE OR INSERT ... based on the primary key? Would this accomplish what
you want? (I know I have wanted this feature a couple of times, and have
simply written code or triggers to do the equivalent.) Is this a
desirable feature for Postgresql?
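
In the meantime, the usual workaround is the two-statement approach along
the lines of what is suggested below with (NOT) EXISTS -- roughly like this
(table and column names invented for illustration; the two statements should
run in one transaction):

UPDATE target
   SET val = s.val
  FROM staging s
 WHERE target.id = s.id;

INSERT INTO target (id, val)
SELECT s.id, s.val
  FROM staging s
 WHERE NOT EXISTS (SELECT 1 FROM target WHERE target.id = s.id);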

Regards,
Ed

On Thu, 21 Aug 2003, Philip Boonzaaier wrote:

 Hi Ron

 That is just the point. If Postgres cannot tell me which records exist and
 need updating, and which do not and need inserting, then what can ?

 In the old world of indexed ISAM files it is very simple - try to get the
 record ( row ) by primary key. If it is there, update it, if it is not,
 insert it.

 Now, one can do this with a higher level language and SQL combined, but is
 SQL that weak ?

 What happens when you merge two tables? Surely SQL must somehow determine
 what needs INSERTING and what needs UPDATING. Or does one try to merge,
 get a failure, and resort to writing something in Perl or C?

 Please help to un - confuse me !

 Regards

 Phil
 - Original Message -
 From: Ron Johnson [EMAIL PROTECTED]
 To: PgSQL General ML [EMAIL PROTECTED]
 Sent: Tuesday, August 19, 2003 6:45 PM
 Subject: Re: [GENERAL] Bulk Insert / Update / Delete


 On Tue, 2003-08-19 at 22:03, Philip Boonzaaier wrote:
  Hi Jason
 
  Thanks for your prompt response.
 
  I'm pretty new to SQL, so please excuse the following rather stupid
 question
  :
 
  How do I use the (NOT) EXISTS or (NOT) IN clause ? Would it be feasible,
  using your suggestion, to simply put in two SQL statements, in the same
  query - first UPDATE when EXISTS, then INSERT when NOT EXISTS, to
 accomplist
  this in one go ?
 
  Regards
 
  Phil

 How will you know which records were updated, and thus which need
 to be inserted?

 A temporary table and pl/pgsql should do the trick.

  - Original Message -
  From: Jason Godden [EMAIL PROTECTED]
  To: Philip Boonzaaier [EMAIL PROTECTED]; [EMAIL PROTECTED]
  Sent: Tuesday, August 19, 2003 4:42 PM
  Subject: Re: [GENERAL] Bulk Insert / Update / Delete
 
 
  Hi Philip,
 
  Pg is more ansi compliant than most (GoodThing (TM)).  You can use the
  'when'
  conditional but not to do what you need.  If I understand you correclty
 you
  should be able to acheive the same result using two seperate queries and
 the
  (NOT) EXISTS or (NOT) IN clause.  Failing that have a look at the fine
 docs
  on pl/pgsql and other postgresql procedural languages which allow you to
 use
  loops and conditional statements like 'if'.
 
  Rgds,
 
  J
 
  On Wed, 20 Aug 2003 12:21 pm, Philip Boonzaaier wrote:
   I want to be able to generate SQL statements that will go through a list
  of
   data, effectively row by row, enquire on the database if this exists in
  the
   selected table- If it exists, then the colums must be UPDATED, if not,
  they
   must be INSERTED.
  
   Logically then, I would like to SELECT * FROM TABLE
   WHERE Values entered here, and then IF FOUND
   UPDATE TABLE SET  Values entered here ELSE
   INSERT INTO TABLE VALUES Values entered here
   END IF;
  
   The IF statement gets rejected by the parser. So it would appear that
   PostgreSQL does not support an IF in this type of query, or maybe not at
   all.
  
   Does anyone have any suggestions as to how I can achieve this ?

 --
 -
 Ron Johnson, Jr. [EMAIL PROTECTED]
 Jefferson, LA USA

 484,246 sq mi are needed for 6 billion people to live, 4 persons
 per lot, in lots that are 60'x150'.
 That is ~ California, Texas and Missouri.
 Alternatively, France, Spain and The United Kingdom.









Re: [GENERAL] Buglist

2003-08-21 Thread Edmund Dengler
What I am pointing out is that this is all the same issue, and that
solutions to the "we can't do priorities because of locking issues"
objection have existed for many years. I/O is the same as processors: it
is a resource that needs managing. So the intelligence can be made to
exist; it just needs to be built.

Now onto other questions: can vacuuming be done without locks? Can it be
done in parts (ie, lock only a bit)? Can the I/O be better managed? Is
this a general model that would work well?

I have plenty of queries that I would love to run on an "as the system
allows" basis, or with a "keep a bit of spare cycles or I/O for the
important stuff" policy, but there is no way to specify this. So a vote
from me for any mechanism that allows priorities to be specified. If this
is a desired feature, then comes the hard part: what is feasible, what can
be done in a reasonable amount of time, and actually doing it.

Regards!
Ed

On Thu, 21 Aug 2003, Andrew Sullivan wrote:

 On Thu, Aug 21, 2003 at 12:05:28PM -0400, Edmund Dengler wrote:
  Well, if they are locked waiting on vacuum, then vacuum should upgrade
  its priority to the highest waiting process (priority inheritance).
  This way, vacuum will be running at a priority level equivalent to who is
  waiting on it.

 Right, but all that intelligence is something that isn't in there
 now.  And anyway, the real issue is I/O, not processor.

 A

 --
 
 Andrew Sullivan 204-4141 Yonge Street
 Liberty RMS   Toronto, Ontario Canada
 [EMAIL PROTECTED]  M2P 2A8
  +1 416 646 3304 x110







Re: [GENERAL] Buglist

2003-08-20 Thread Edmund Dengler
What about the use of priority inheritance to deal with the issue of
priority inversion (a standard methodology within the real-time world)?

Then we could have priorities, but still have low priority processes
bumped up if a high level one is waiting on them.

Regards,
Ed

On Wed, 20 Aug 2003, Tom Lane wrote:

 Andrew Sullivan [EMAIL PROTECTED] writes:
  I disagree.  Triggering a vacuum on a db that is nearly saturating the
  disk bandwidth has a significant impact.

  Vivek is right about this.  If your system is already very busy, then
  a vacuum on a largish table is painful.

  I don't actually think having the process done in real time will
  help, though -- it seems to me what would be more useful is an even
  lazier vacuum: something that could be told clean up as cycles are
  available, but make sure you stay out of the way.  Of course, that's
  easy to say glibly, and mighty hard to do, I expect.

 I'd love to be able to do that, but I can't think of a good way.

 Just nice'ing the VACUUM process is likely to be counterproductive
 because of locking issues (priority inversion).  Though if anyone cares
 to try it on a heavily-loaded system, I'd be interested to hear the
 results...

   regards, tom lane






Re: [GENERAL] Unused Indexes

2003-07-30 Thread EDMUND DENGLER
You need to cast the ints to bigints:

select id where col1 = 1::bigint and col2 = 1::bigint
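
Spelled out a little more fully (guessing at the table name; quoting the
literals also works, since the type is then resolved from the column):

SELECT id FROM my_table WHERE col1 = 1::bigint AND col2 = 1::bigint;
-- or
SELECT id FROM my_table WHERE col1 = '1' AND col2 = '1';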

Regards,
Ed

-Original Message-
From: Tim McAuley [EMAIL PROTECTED]
Date: Wed, 30 Jul 2003 13:46:46 
To:[EMAIL PROTECTED]
Subject: [GENERAL] Unused Indexes

Hi,

I have a table which I have populated with over 5000 entries. There is a
combined index on two of the columns (both bigint). I am trying a simple
select (i.e. select id where col1 = 1 and col2 = 1) covering these two
columns and it keeps using a seq scan. Is this correct? I would have
thought that with this number of entries an index scan would be used.

I am testing this using postgresql 7.3.3 on windows 2000 using cygwin.

Doing set enable_seqscan to off does not change the results of the 
explain operation.

I also tried setting a single index on just one of the columns and 
running an appropriate search; it still uses a seq scan. At what stage 
will the planner normally start using an index scan?

Any hints appreciated.

Tim









[GENERAL] Storing double-byte strings in text fields.

2001-02-16 Thread edmund

Hello,

I am putting together a web site to display a collection of Chinese
woodblock prints. I want to be able to store double byte values (that is
to say Big5, Unicode etc encoded) in a text field for things such as the
artist's name and the title of the print. I have the following questions:

Is this possible using a plain vanilla version of Postgres, ie without the
multi-lingual support enabled? As I understand it multi-lingual support
allows me to store table and field names etc in non-ASCII, but doesn't
really affect what goes into the fields.

Are programs such as pg_dump and the COPY method 8-bit clean, or will they
mess up the text? I have done some quick trials and it all seems OK, but I
want to be sure before committing.

If the above is not the case will the multi-lingual support fix my
problems? I tried it out but had problems with the backend crashing on
certain queries. I'd also rather not use it as it will be easier to port
my system to other servers if it just needs a plain vanilla install.

I am currently using Postgresql 7.0.3 on RedHat 6.2 (x86) and also on
YellowDog 1.2 (PPC). The web server is Apache 1.3.12 with PHP 4.0.x.


Thanks,

Edmund.


--   
*** *** 
Edmund von der Burg ***   [EMAIL PROTECTED]   *** 
***



[GENERAL] [Fwd: PostgreSQL - Desparate!]

1998-08-02 Thread Edmund Mergl

 


Pardon the intrusion, but I have a dilemma which I cannot find
the answer to, after searching the newsgroups, documentation,
and mailing lists -- My postmaster will not start.

Below is a message I've posted in several newsgroups.  If you
could make ANY suggestions on getting postmaster to work,
I would appreciate it greatly.

-Thanks

-
whenever I attempt to start the postmaster, I get the
following error:

FindBackend: found "/usr/bin/postgres" using argv[0]
FATAL: StreamServerPort: bind() failed: errno=13
Is another postmaster already running on that port?
If not, remove socket node (/tmp/.s.PGSQL.portnr) and retry.

It's been working fine for several weeks until recently.  There are
no files in /tmp relating to PostgreSQL.  The port I'm using is not in
use by another process, determined with the "ps auxw | grep postma"
command, and also "fuser -vun tcp 5432".

Here's the command I'm using to start postmaster:
nohup /usr/bin/postmaster -i -B 256 -p 5432 -d -D /var/lib/pgsql > /home/postgres/pgsql.log 2> /home/postgres/pgsql.err

I'm using PostgreSQL 6.3.2 under RedHat Linux 5.1.
Any suggestions would be greatly appreciated.

-Thanks