Re: [GENERAL] request help forming query

2008-02-25 Thread danmcb
yes, this could get called on quite large tables (maybe not
billions ...). The second solution looks useful - I'll try it on some
test data.

thanks both of you.



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] Planner: rows=1 after similar to where condition.

2008-02-25 Thread Joris Dobbelsteen
-Original Message-
From: Scott Marlowe [mailto:[EMAIL PROTECTED] 
Sent: Monday, 25 February 2008 7:14
To: Joris Dobbelsteen
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Planner: rows=1 after similar to 
where condition.

On Sun, Feb 24, 2008 at 4:35 PM, Joris Dobbelsteen 
[EMAIL PROTECTED] wrote:
 I seem to have some planner oddity, where it seems to completely  
 mispredict the output after a regex compare. I've seem it on other  
 occasions, where it completely screws up the join. You can note the  
 rows=1 after the filter.
  A similar sitution has occurred when doing a regex filter in a 
 subquery,  which was subsequently predited as 1 row and triggered 
 (oddly enough) a  sequencial scan. Doing the same using 
equality on 
 the result to  substring(text from regex) seemed to work and 
 produced a useful  plan, since it did a hash-join (as it 
should have).
  Is this a known problem? Otherwise I think I should build a smaller 
 test  case...

  Using Postgresql 8.2.6 from Debian Etch-backports.

Should be:
PostGreSQL 8.2.5 on x86_64-pc-linux-gnu (GCC 4.1.2.20061115) (Debian
4.1.1-21).
Should have paid closer attention.


  Bitmap Heap Scan on log_syslog syslog  (cost=13124.26..51855.25 
 rows=1  width=270)
Recheck Cond: (((program)::text = 'amavis'::text) AND  
 ((facility)::text = 'mail'::text))
Filter: ***SOME VERY LONG SIMILAR TO REGEX
-  BitmapAnd  (cost=13124.26..13124.26 rows=18957 width=0)
  -  Bitmap Index Scan on IX_log_syslog_program
  (cost=0.00..2223.95 rows=92323 width=0)
Index Cond: ((program)::text = 'amavis'::text)
  -  Bitmap Index Scan on IX_log_syslog_facility
  (cost=0.00..10899.81 rows=463621 width=0)
Index Cond: ((facility)::text = 'mail'::text)

It's not saying it will only get one row back for sure, it's 
saying it thinks it will return one row.  and depending on 
your query, it might.
 What's the query, and what's the explain analyze of that query?


See the attached file for the query and the explain (hopefully this
gives a consistent view and maintains the layout for easier reading).

The point is that it will NOT, not even close. The planner guesses 1
row, but the output was arround 13000 rows (of the 2.2M rows in the
table). Oddly enough the 18k rows on the bitmap and seems a very good
estimation. In fact, if I omit the SIMILAR TO, it estimates ~12000
rows, which is spot on. So it seems the SIMILAR TO really gets the
planner confused.

The real value was 12981 rows that were returned by the query in the
first case. However, since I removed this data from the original table
(its now somewhere else), I cannot present the original EXPLAIN ANALYZE
any more. The new dataset only contains ~137 (but I still have the old
statistics, I think, or at least they provide the same predictions).
I also included a run after EXPLAIN ANALYZE on the current dataset.

Hopefully this helps.

Thanks,

- Joris


QUERY=



SELECT  a.msgid, a.rcv_time, a.sent_time, a.hostname, 
a.passedas, a.from, a.to,
a.message_id, a.resent_message_id,
a.mail_id,
a.queue_id,
convert_score(a.spamscore) AS spamscore,
CAST(a.duration AS integer)
FROM(SELECT syslog.msgid, syslog.rcv_time, syslog.sent_time, 
syslog.hostname,
substring(syslog.text from 'amavis% Passed #%#, % % - %, 
Message-ID: %, mail_id: %, Hits: %, queued_as: %, % ms' for '#') as passedas,
substring(syslog.text from 'amavis% Passed %, % #%# - %, 
Message-ID: %, mail_id: %, Hits: %, queued_as: %, % ms' for '#') as from,
substring(syslog.text from 'amavis% Passed %, % % - #%#, 
Message-ID: %, mail_id: %, Hits: %, queued_as: %, % ms' for '#') as to,
substring(syslog.text from 'amavis% Passed %, % % - %, 
Message-ID: #[^]+#, (Resent-Message-ID: #%#, |)mail_id: %, Hits: %, 
queued_as: %, % ms' for '#') as message_id,
substring(syslog.text from 'amavis% Passed %, % % - %, 
Message-ID: %, Resent-Message-ID: #%#, mail_id: %, Hits: %, queued_as: %, 
% ms' for '#') as resent_message_id,
substring(syslog.text from 'amavis% Passed %, % % - %, 
Message-ID: %, mail_id: #%#, Hits: %, queued_as: %, % ms' for '#') as 
mail_id,
substring(syslog.text from 'amavis% Passed %, % % - %, 
Message-ID: %, mail_id: %, Hits: #%#, queued_as: %, % ms' for '#') as 
spamscore,
substring(syslog.text from 'amavis% Passed %, % % - %, 
Message-ID: %, mail_id: %, Hits: %, queued_as: #%#, % ms' for '#') as 
queue_id,
substring(syslog.text from 'amavis% Passed %, % % - %, 
Message-ID: %, mail_id: %, Hits: %, queued_as: %, #%# ms' for '#') as 
duration,
syslog.text,
syslog.facility, syslog.priority, syslog.program
 FROM public.log_syslog syslog
 WHERE syslog.program = 'amavis'
 AND syslog.facility = 'mail'
 AND syslog.priority = 

Re: [GENERAL] Planner: rows=1 after similar to where condition.

2008-02-25 Thread Gregory Stark
Joris Dobbelsteen [EMAIL PROTECTED] writes:

 Should be:
 PostGreSQL 8.2.5 on x86_64-pc-linux-gnu (GCC 4.1.2.20061115) (Debian 
 4.1.1-21).

In this case that may matter. One of the changes in the 8.2.6 update was:

. Improve planner's handling of LIKE/regex estimation in non-C locales (Tom)

I seem to recall this largely had to do with negated regexp matches but I
might only be remembering part of it. I would start by taking all the fixes
for known bugs before trying to diagnose a new one :)

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] Planner: rows=1 after similar to where condition.

2008-02-25 Thread Joris Dobbelsteen
-Original Message-
From: Gregory Stark [mailto:[EMAIL PROTECTED] 
Sent: Monday, 25 February 2008 12:31
To: Joris Dobbelsteen
Cc: Scott Marlowe; pgsql-general@postgresql.org
Subject: Re: Planner: rows=1 after similar to where condition.

Joris Dobbelsteen [EMAIL PROTECTED] writes:

 Should be:
 PostGreSQL 8.2.5 on x86_64-pc-linux-gnu (GCC 4.1.2.20061115) 
(Debian 4.1.1-21).

In this case that may matter. One of the changes in the 8.2.6 
update was:

. Improve planner's handling of LIKE/regex estimation in non-C 
locales (Tom)

I seem to recall this largely had to do with negated regexp 
matches but I might only be remembering part of it. I would 
start by taking all the fixes for known bugs before trying to 
diagnose a new one :)

It seems debian backports are not upgraded on my system. So I performed
the upgrade to:
PostgreSQL 8.2.6 on x86_64-pc-linux-gnu, compiled by GCC cc (GCC) 4.1.2
20061115 (prerelease) (Debian 4.1.1-21)
And indeed, I should have upgraded it before to the latest version.

It still spits out the same EXPLAIN ANALYZE result.

Bitmap Heap Scan on log_syslog syslog  (cost=11168.32..16988.84 rows=1
width=221) (actual time=11145.729..30067.606 rows=212 loops=1)
  Recheck Cond: (((program)::text = 'amavis'::text) AND
((facility)::text = 'mail'::text))
  Filter: (((priority)::text = 'notice'::text) AND ((text)::text ~
'***:^(?:amavis\\[[0-9]+\\]: \\([0-9]+-[-0-9]+\\) Passed [A-Za-z0-9]+,
[][0-9\\.]* [^]+ - [^]+, Message-ID: [^]+,
(Resent-Message-ID: [^]+, |)mail.id: [^ ,]+, Hits: [-+0-9\\.,]+,
queued.as: [^ ,]+, [0-9]+ ms)$'::text))
  -  BitmapAnd  (cost=11168.32..11168.32 rows=2095 width=0) (actual
time=608.771..608.771 rows=0 loops=1)
-  Bitmap Index Scan on IX_log_syslog_program
(cost=0.00..490.06 rows=19160 width=0) (actual time=94.982..94.982
rows=85238 loops=1)
  Index Cond: ((program)::text = 'amavis'::text)
-  Bitmap Index Scan on IX_log_syslog_facility
(cost=0.00..10677.75 rows=426214 width=0) (actual time=504.960..504.960
rows=455084 loops=1)
  Index Cond: ((facility)::text = 'mail'::text)
Total runtime: 30068.868 ms

From what little I know statistics and how, in postgres, the filter
expressions are computed, the rows=1 prediction seems still a bit odd.

It seems the fix did not effect this query.

Thanks for your support so far,

- Joris


---(end of broadcast)---
TIP 6: explain analyze is your friend


[GENERAL] HOW does 8.3 CREATE REPLACe .. FUNCTION ..COST work

2008-02-25 Thread Obe, Regina
I think I am missing something about how the new CREATE OR REPLACE
FUNCTION ...COST works or I am missing some setting in postgresql conf.
 
I was hoping I could use it to control the function that is used in
cases where only one needs to be evaluated.  Regardless of what I do it
seems to always evaluate the first function in the list.  I'm running on

PostgreSQL 8.3.0, compiled by Visual C++ build 1400
 
Here is an example of my test: Functions and tables
CREATE TABLE log_call
(
  fn_name character varying(100) NOT NULL,
  fn_calltime timestamp with time zone NOT NULL DEFAULT now()
)
WITH (OIDS=FALSE);
 
CREATE OR REPLACE FUNCTION fn_pg_costlyfunction()
  RETURNS integer AS
$$
BEGIN
 INSERT INTO log_call(fn_name) VALUES('fn_pg_costlyfunction()');
 RETURN 5;
END$$
  LANGUAGE 'plpgsql' VOLATILE
  COST 100;
 
CREATE OR REPLACE FUNCTION fn_pg_cheapfunction()
  RETURNS integer AS
$$
BEGIN
 INSERT INTO log_call(fn_name) VALUES('fn_pg_cheapfunction()');
 RETURN 5;
END$$
  LANGUAGE 'plpgsql' VOLATILE
  COST 1;
 
--- Now for the test - 
--Test 1: This shows that fn_pg_costlyfunction() is the only function
that is run - 
-- unexpected to me shouldn't no function be evaluated or the cheap one?
--What's the difference between Test 1 and Test 2 that makes Test 2 do
the RIGHT thing?
TRUNCATE TABLE log_call;
SELECT (fn_pg_costlyfunction()  2 OR fn_pg_cheapfunction()  2 OR 5 
2);
 
--Test 2: This works as I would expect - shows that none of the
functions are run presumably its going straight for 5  2
--becuase it recognizes its the cheapest route
TRUNCATE TABLE log_call;
SELECT foo.value
FROM (SELECT (fn_pg_costlyfunction()  2 OR fn_pg_cheapfunction()  2 OR
5  2 ) as value) as foo
 
--Test 3: It always runs the first function even though the cost of the
first is higher than the second
(in this case log_call contains fn_pg_costlyfunction())
TRUNCATE TABLE log_call;
SELECT foo.value
FROM (SELECT (fn_pg_costlyfunction()  2 OR fn_pg_cheapfunction()  2)
as value) as foo;
 
TRUNCATE TABLE log_call;
SELECT (fn_pg_costlyfunction()  2 OR fn_pg_cheapfunction()  2) as
value;
 
--Test 4: It always runs the first function even though the cost of the
first is higher than the second
(in this case log_call contains fn_pg_cheapfunction())
TRUNCATE TABLE log_call;
SELECT foo.value
FROM (SELECT (fn_pg_cheapfunction()  2 OR  fn_pg_costlyfunction()  2 )
as value) as foo;
 
TRUNCATE TABLE log_call;
SELECT (fn_pg_cheapfunction()  2 OR  fn_pg_costlyfunction()  2 ) as
value;
 
Thanks,
Regina
 


-
The substance of this message, including any attachments, may be
confidential, legally privileged and/or exempt from disclosure
pursuant to Massachusetts law. It is intended
solely for the addressee. If you received this in error, please
contact the sender and delete the material from any computer.


Re: [GENERAL] text and bytea

2008-02-25 Thread hernan gonzalez
  Umm, I think all you showed was that the to_ascii() function was
  broken. Postgres knows exactly what encoding the string is in, the
  backend encoding: in your case UTF-8.

That would be fine, if it were true; then, one could assume that every
postgresql function that returns a text gets ALWAYS the standard
backend encoding (again: as in Java). But consider the result
postgresql gets from this (from my example):
 encode(convert_to(c,'LATIN9'),'escape')
That's something of type text (a strign), postgresql believes it's
UTF8, but it's not  (it probably woud not even validate as a valid
utf8 sequence).
IMHO, the semantics of encode() and decode() are correct (the bridge
between bytea and text ... in the backend encoding; they should be the
only bridge), convert() is also ok (deals with bytes), but
convert_to() and convert_from() are dubious if not broken: they imply
texts in arbitrary encodings (for output or input) , lead to
anomalities and shouldnt be necessary at all.

Cheers

Hernán J. González
http://hjg.com.ar/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] text and bytea

2008-02-25 Thread Gregory Stark
hernan gonzalez [EMAIL PROTECTED] writes:

 IMHO, the semantics of encode() and decode() are correct (the bridge
 between bytea and text ... in the backend encoding; they should be the
 only bridge), convert() is also ok (deals with bytes), but
 convert_to() and convert_from() are dubious if not broken: they imply
 texts in arbitrary encodings (for output or input) , lead to
 anomalities and shouldnt be necessary at all.

postgres=# \df convert_from
 List of functions
   Schema   | Name | Result data type | Argument data types 
+--+--+-
 pg_catalog | convert_from | text | bytea, name
(1 row)

postgres=# \df convert_to
List of functions
   Schema   |Name| Result data type | Argument data types 
++--+-
 pg_catalog | convert_to | bytea| text, name
(1 row)


Looks like they produce and consume byteas to me.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Planner: rows=1 after similar to where condition.

2008-02-25 Thread Tom Lane
Joris Dobbelsteen [EMAIL PROTECTED] writes:
 Bitmap Heap Scan on log_syslog syslog  (cost=11168.32..16988.84 rows=1
 width=221) (actual time=11145.729..30067.606 rows=212 loops=1)
   Recheck Cond: (((program)::text = 'amavis'::text) AND
 ((facility)::text = 'mail'::text))
   Filter: (((priority)::text = 'notice'::text) AND ((text)::text ~
 '***:^(?:amavis\\[[0-9]+\\]: \\([0-9]+-[-0-9]+\\) Passed [A-Za-z0-9]+,
 [][0-9\\.]* [^]+ - [^]+, Message-ID: [^]+,
 (Resent-Message-ID: [^]+, |)mail.id: [^ ,]+, Hits: [-+0-9\\.,]+,
 queued.as: [^ ,]+, [0-9]+ ms)$'::text))

It's not too surprising that you'd get a small selectivity estimate
for such a long regexp; the default estimate is just based on the amount
of fixed text in the pattern, and you've got a lot.

If you increase the stats target for the column to 100 or more then it
will try actually applying the regexp to all the histogram entries.
That might or might not give you a better estimate.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] Planner: rows=1 after similar to where condition.

2008-02-25 Thread Joris Dobbelsteen
Resent due to bounce... 
orange.nl #5.0.0 X-SMTP-Server; host sss.pgh.pa.us[66.207.139.130] said:
550

-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Monday, 25 February 2008 16:34
To: Joris Dobbelsteen
Cc: Gregory Stark; Scott Marlowe; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Planner: rows=1 after similar to 
where condition. 

Joris Dobbelsteen [EMAIL PROTECTED] writes:
 Bitmap Heap Scan on log_syslog syslog  (cost=11168.32..16988.84 
 rows=1
 width=221) (actual time=11145.729..30067.606 rows=212 loops=1)
   Recheck Cond: (((program)::text = 'amavis'::text) AND 
 ((facility)::text = 'mail'::text))
   Filter: (((priority)::text = 'notice'::text) AND ((text)::text ~
 '***:^(?:amavis\\[[0-9]+\\]: \\([0-9]+-[-0-9]+\\) Passed 
[A-Za-z0-9]+,
 [][0-9\\.]* [^]+ - [^]+, Message-ID: [^]+,
 (Resent-Message-ID: [^]+, |)mail.id: [^ ,]+, Hits: [-+0-9\\.,]+,
 queued.as: [^ ,]+, [0-9]+ ms)$'::text))

It's not too surprising that you'd get a small selectivity 
estimate for such a long regexp; the default estimate is just 
based on the amount of fixed text in the pattern, and you've got a lot.

If you increase the stats target for the column to 100 or more 
then it will try actually applying the regexp to all the 
histogram entries.
That might or might not give you a better estimate.

I will try that, expect result back within a few days (have it collect
some better sample set). Unfortunally the regex is not so much for
narrowing down the selection, but rather guarenteeing the format of the
messages.
You seem to consider the common case differently, and I can agree for
most part. Unfortunally my use-case is different from the expected. That
said, might a less aggressive selectivity estimation for long strings
work better in the common case?

Might an alternative be to use a function and check for a positive
result, i.e. something that the predictor cannot take into account?

- Joris



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] HOW does 8.3 CREATE REPLACe .. FUNCTION ..COST work

2008-02-25 Thread Tom Lane
Obe, Regina [EMAIL PROTECTED] writes:
 --Test 1: This shows that fn_pg_costlyfunction() is the only function
 that is run - 
 -- unexpected to me shouldn't no function be evaluated or the cheap one?
 --What's the difference between Test 1 and Test 2 that makes Test 2 do
 the RIGHT thing?
 TRUNCATE TABLE log_call;
 SELECT (fn_pg_costlyfunction()  2 OR fn_pg_cheapfunction()  2 OR 5 
 2);

In a SELECT with no FROM we don't run the optimizer at all; the
assumption is that when the expression will only be evaluated once,
it's not worth trying to do expression simplification on it first.
 
 --Test 2: This works as I would expect - shows that none of the
 functions are run presumably its going straight for 5  2
 --becuase it recognizes its the cheapest route
 TRUNCATE TABLE log_call;
 SELECT foo.value
 FROM (SELECT (fn_pg_costlyfunction()  2 OR fn_pg_cheapfunction()  2 OR
 5  2 ) as value) as foo

That's just constant-folding: x OR TRUE is TRUE.  It has exactly
zero to do with the cost of anything.

Offhand I think the behavior you are looking for of choosing to run more
expensive subexpressions later only occurs for top-level WHERE clauses
that are combined with AND.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Planner: rows=1 after similar to where condition.

2008-02-25 Thread Joris Dobbelsteen
-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Monday, 25 February 2008 16:34
To: Joris Dobbelsteen
Cc: Gregory Stark; Scott Marlowe; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Planner: rows=1 after similar to 
where condition. 

Joris Dobbelsteen [EMAIL PROTECTED] writes:
 Bitmap Heap Scan on log_syslog syslog  (cost=11168.32..16988.84 
 rows=1
 width=221) (actual time=11145.729..30067.606 rows=212 loops=1)
   Recheck Cond: (((program)::text = 'amavis'::text) AND 
 ((facility)::text = 'mail'::text))
   Filter: (((priority)::text = 'notice'::text) AND ((text)::text ~
 '***:^(?:amavis\\[[0-9]+\\]: \\([0-9]+-[-0-9]+\\) Passed 
[A-Za-z0-9]+,
 [][0-9\\.]* [^]+ - [^]+, Message-ID: [^]+,
 (Resent-Message-ID: [^]+, |)mail.id: [^ ,]+, Hits: [-+0-9\\.,]+,
 queued.as: [^ ,]+, [0-9]+ ms)$'::text))

It's not too surprising that you'd get a small selectivity 
estimate for such a long regexp; the default estimate is just 
based on the amount of fixed text in the pattern, and you've got a lot.

If you increase the stats target for the column to 100 or more 
then it will try actually applying the regexp to all the 
histogram entries.
That might or might not give you a better estimate.

I will try that, expect result back within a few days (have it collect
some better sample set). Unfortunally the regex is not so much for
narrowing down the selection, but rather guarenteeing the format of the
messages.
You seem to consider the common case differently, and I can agree for
most part. Unfortunally my use-case is different from the expected. That
said, might a less aggressive selectivity estimation for long strings
work better in the common case?

Might an alternative be to use a function and check for a positive
result, i.e. something that the predictor cannot take into account?

- Joris


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] HOW does 8.3 CREATE REPLACe .. FUNCTION ..COST work

2008-02-25 Thread Obe, Regina
 
  --Test 2: This works as I would expect - shows that none of the
  functions are run presumably its going straight for 5  2
  --becuase it recognizes its the cheapest route
  TRUNCATE TABLE log_call;
  SELECT foo.value
  FROM (SELECT (fn_pg_costlyfunction()  2 OR fn_pg_cheapfunction() 
2 OR
  5  2 ) as value) as foo

 That's just constant-folding: x OR TRUE is TRUE.  It has exactly
 zero to do with the cost of anything.

 Offhand I think the behavior you are looking for of choosing to run
more
 expensive subexpressions later only occurs for top-level WHERE clauses
that are combined with AND.

   regards, tom lane

Tom thanks for the clarification - based on your comment I verified with
these

-- fn_pg_cheapfunction() is the only one run as you predicted
TRUNCATE TABLE log_call;
SELECT true as value 
WHERE (fn_pg_costlyfunction()  2 AND fn_pg_cheapfunction()  5 );

-- fn_pg_costlyfunction() is the only one run - again as predicted by
your statement
TRUNCATE TABLE log_call;
SELECT true as value 
WHERE (fn_pg_costlyfunction()  2 OR fn_pg_cheapfunction()  2 );

It would be really nice if this worked with OR as well.  Is it just much
harder to deal with the 
OR case in the planner or was there some other reason why the OR case
was left out?

Thanks,
Regina
-
The substance of this message, including any attachments, may be
confidential, legally privileged and/or exempt from disclosure
pursuant to Massachusetts law. It is intended
solely for the addressee. If you received this in error, please
contact the sender and delete the material from any computer.


---(end of broadcast)---
TIP 6: explain analyze is your friend


[GENERAL] process pool

2008-02-25 Thread Luca Ferrari
Hi,
sorry for this question, but as far as I know postgresql does not use a 
process pool, rather a new process is created for any connection on demand. 
If this is true, what is the reason for this? 

Thanks,
Luca

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] HOW does 8.3 CREATE REPLACe .. FUNCTION ..COST work

2008-02-25 Thread Tom Lane
Obe, Regina [EMAIL PROTECTED] writes:
 It would be really nice if this worked with OR as well.  Is it just much
 harder to deal with the 
 OR case in the planner or was there some other reason why the OR case
 was left out?

Nobody's really made a case why we should have the planner expend cycles
on that.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[GENERAL] Cascading Trigger - changing row on delete does not delete row

2008-02-25 Thread D. Dante Lorenso

All,

I have 2 tables which both have triggers on them.  When I delete a row 
on table A, a cascading trigger ends up modifying rows in table B.  The 
modified rows in table B trigger an update on rows in table A which 
happens to be the same row that I am trying to delete.


I don't get any errors from the delete, yet PostgreSQL tells me 0 rows 
affected by the delete and sure enough the row I just tried to delete is 
still there.  Running the delete a 2nd time works because the trigger 
does not cascade and effect the deleted row.


Is there a way to know that a row I am deleting is being deleted so I 
don't update it?


I thought about adding a boolean column 'is_being_deleted' but I can't 
set that to true without updating the row (which I'm trying to avoid).


I've thought about using PL/Perl to access transaction-level global 
variables where I could store the ID of the row I'm deleting and fetch 
that value in order to avoid it in my updates ... but I don't want 
invoke the PL/Perl interpreter and slow down what I'm already doing in 
PL/PGSQL.  Are there transaction-level variables in PL/PGSQL (globals)?


Suggestions?

-- Dante


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] process pool

2008-02-25 Thread Gregory Stark

Luca Ferrari [EMAIL PROTECTED] writes:

 Hi,
 sorry for this question, but as far as I know postgresql does not use a 
 process pool, rather a new process is created for any connection on demand. 
 If this is true, what is the reason for this? 

Generally to add code the question is not why not, but why? That is, the
question you should be asking is why would we do it? What advantage do you
anticipate it would have.

I assume you think it would be faster. But forking processes on Unix is very
fast compared to starting a database session and transaction so the gain would
be minimal. And in any case you can always add a pool outside of Postgres
using things like pgpool or pgbouncer.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] text and bytea

2008-02-25 Thread hernan gonzalez
 IMHO, the semantics of encode() and decode() are correct (the bridge
 between bytea and text ... in the backend encoding; they should be the
 only bridge), convert() is also ok (deals with bytes), but
 convert_to() and convert_from() are dubious if not broken: they imply
 texts in arbitrary encodings (for output or input) , lead to
 anomalities and shouldnt be necessary at all.

Sorry, my mistake. I meant the opposite:

convert_to() and convert_from()  are the correct bridge (text =
bytea) functions.
Also convert() is ok.
The objetionable ones IMHO are decode()/encode(), which can
consume/produce a non-utf8 string (I mean, not the backend encoding)

Going back to the line:

encode(convert_to(c,'LATIN9'),'escape')

Here we have:
 c = text  (ut8)
 convert_to(..). = bytea (represents a char sequence in latin9 encoding)
 encode(...) = text (in latin9 encoding?)

Cheers

Hernán J. González
http://hjg.com.ar/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] text and bytea

2008-02-25 Thread hernan gonzalez
Another example  (Psotgresql 8.3.0, UTF-8  server/client encoding)

test=# create table chartest ( c text);
test=# insert into chartest (c) values ('¡Hasta mañana!');
test=# create view vchartest as
 select encode(convert_to(c,'LATIN9'),'escape') as c1 from chartest;

test=# select c,octet_length(c) from chartest ;
   c| octet_length
+--
 ¡Hasta mañana! |   16

test=# select c1,octet_length(c1) from vchartest ;
  c1  | octet_length
--+--
 Hasta maana! |   14

(the field is seen as as text by postgresql, with the default
encoding.. UTF8; it is actually not)

test=# select * from vchartest where c1 like '%a%';
  c1
--
 Hasta maana!
(1 row)

test=# select * from vchartest where c1 ilike '%a%';
ERROR:  invalid byte sequence for encoding UTF8: 0xa1
HINT:  This error can also happen if the byte sequence does not match
the encoding expected by the server, which is controlled by
client_encoding.

That ilike breaks and like is rather random, it seems that the
later has some optimization does not check the validty of the utf8
stream. But thats not the point. The point is that IMO postgresql
should always handle text in the backend encoding, there should no
exists funcions that are designed to produce/consume texts in other
encodings. Perhaps the encode function is ill defined, and should be
rethinked. Two alternatives:
1. For special binary-to-ascii encodings (base64,hex). Keep its
present signature but remove/deprecate the escape type. It returns a
text in the backend encoding.
2  For arbitrary binary encodings. Change its signature so that it
returns bytea.
Of course, all this applies symmetrically to decode().

Appart, and in accordance with this, I think to_ascii() should accept
only one text argument.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] APEX / HTML DB for PostgreSQL

2008-02-25 Thread Enrico Sirola


Il giorno 24/feb/08, alle ore 04:53, justin tocci ha scritto:

In its most basic form a great tool would just start as a pl/pgsql  
or pl/perl function that could be used to call a table and have it  
output a batch of records to an editable html form or a colored pdf  
for nice reports. There would need to be some kind of interface for  
Apaché of course.


I guess the XML functions of 8.3 + an XSL stylesheet shoud do the job...
my 2 cents,
e.


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] autovacuum not freeing up unused space on 8.3.0

2008-02-25 Thread Alvaro Herrera
Stuart Brooks wrote:

 ERROR:  canceling autovacuum task
 CONTEXT:  automatic vacuum of table metadb.test.transactions

Are these happening regularly?  They indicate that something is
happening on the table that collides with what autovacuum needs to do,
and autovacuum defers its task.  For this to happen you need to be doing
ALTER TABLE or similar however; normal UPDATE/INSERT/DELETE should not
cause autovacuum to cancel itself.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] text and bytea

2008-02-25 Thread Tom Lane
hernan gonzalez [EMAIL PROTECTED] writes:
 The objetionable ones IMHO are decode()/encode(), which can
 consume/produce a non-utf8 string (I mean, not the backend encoding)

Huh?  Those deal with bytea too --- in fact, they've got nothing at
all to do with multibyte character representations.  They're for
handling hex and base64 and suchlike representations of binary data.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] text and bytea

2008-02-25 Thread Tom Lane
hernan gonzalez [EMAIL PROTECTED] writes:
 test=# create view vchartest as
  select encode(convert_to(c,'LATIN9'),'escape') as c1 from chartest;

Hmm.  This isn't a very sensible combination that you've written here,
but I see the point: encode(..., 'escape') is broken in that it fails
to convert high-bit-set bytes into \nnn sequences.  At least in
multibyte backend encodings, we *must* do that to produce valid textual
output.  I suspect that for consistency we should do it regardless of
backend encoding.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


[GENERAL] give a hand to a user having trouble with MySQL?

2008-02-25 Thread Wenjian Yang
Came across a blog from Ovid, one of the Perl gurus at BBC.
http://use.perl.org/~Ovid/journal/35752?from=rss

Quote

They're not kidding, either. Subquery optimization in MySQL is terribly
brokenhttp://www.xaprb.com/blog/2006/04/30/how-to-optimize-subqueries-and-joins-in-mysql/,
as we've discovered today. We have Matt Trout in for some consulting and
even ridiculously stupid subqueries have terrible query plans, forcing us to
to consider nasty workarounds, one of which is called PostgreSQL.

Even though he counts PostgreSQL as one of th enasty workarounds, maybe we
should give them a helping hand?

Wenjian


[GENERAL] Windows 8.3 installer leaves a debug flag on?

2008-02-25 Thread Dan Armbrust
To follow up on my old thread -

I tested another install of Postgres 8.3.0.1 - and on windows, the
postgresql.conf file has this in it:

shared_preload_libraries = '$libdir/plugins/plugin_debugger.dll'
#
(change requires restart)

I got the impression that that debugger wasn't intended to be enabled,
by default.  It looks like the windows installer is enabling it,
however.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[GENERAL] disabling triggers, constaints and so on

2008-02-25 Thread Geoffrey
We are still in a pickle with trying to resolve our trigger issues 
without affecting slony triggers.


The point is, we need to be able to disable triggers, check constraints, 
and foreign-key constraints without affecting slony triggers in certain 
situations.


This is all running on 7.4.19, thus, it's our understanding that using 
tgenabled is not going to be a solution.


Making our triggers smarter doesn't get us all the way there.

Suggestions?

--
Until later, Geoffrey

Those who would give up essential Liberty, to purchase a little
temporary Safety, deserve neither Liberty nor Safety.
 - Benjamin Franklin

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] Use index for upper(customername) like 'ABC%'

2008-02-25 Thread Andrus
 CREATE INDEX ie_cust_upper_name ON customer ( UPPER( customername ) )
 WHERE ( UPPER(customername) LIKE 'ABC%' );

ABC is different at every time.

This is first letters of customer name entered by user.
So I cannot add ABC to index.

Andrus.



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] RETURNS SETOF function question

2008-02-25 Thread Kynn Jones
On Sun, Feb 24, 2008 at 7:08 PM, Erik Jones [EMAIL PROTECTED] wrote:


 There was an article that covered this in the Postgres Online Journal
 (
 http://www.postgresonline.com/journal/index.php?/categories/6-pl-programming).
  Basically, do this:

 CREATE OR REPLACE FUNCTION foo(text, text)
RETURNS SETOF text
 $$
SELECT * FROM foo($1, $2);
 $$
 LANGUAGE sql;



That one goes straight into my (still tiny) bag o' Pg tricks.  It worked
like a charm.

Thanks!

G.


[GENERAL] how to auto GRANT custom ACL on a new table?

2008-02-25 Thread Maciej Sieczka

Hello,

I have asked the question on novice ML but it seems there is not much 
traffic there. I hope nobody minds I'm re-asking here, on a much more 
active list. The problem is as follows:


There are 2 role groups in my cluster: editors and viewers. In each
group there are several users.

In the DB, members of editors are allowed to create new tables in one 
schema. My problem is that only the very user who created the table is 
allowed to view and edit it, by default.


I need to modify this default PostgreSQL's behaviour, so that the ACL on
a new table in this schema is set to SELECT, INSERT, UPDATE, DELETE,
REFERENCES for editors, and SELECT for viewers, without having to 
manually GRANT rights each time a new table is created. I can't control 
this setting from the client software as these are various programs, and 
even if I could it'd be still better anyway to have it controlled in one 
single place on the server side. But how?


From reading so far I *suppose* I should create a function which calls
an appropriate GRANT, and trigger it when a new record is added to
pg_class. Is this doable?

If the function+trigger approach is optimal, can Anybody share a working 
example? I have a muddy idea it would be something like this pseudo-code:


CREATE TRIGGER trig_acl
  AFTER INSERT
  ON pg_catalog.pg_class FOR EACH ROW
  EXECUTE PROCEDURE fnct_acl();

CREATE FUNCTION fnct_acl () RETURNS opaque AS '
  DECLARE
table_name TEXT #how do I fetch the table name???
  BEGIN
GRANT ALL ON myschema.table_name TO editors;
  END;
' LANGUAGE 'plpgsql';

What should I modify to make it real code?


An extra, but desired functionality, would be if I could also prevent 
other editors from modifying the table *if* it is being currently 
being edited by somebody. Is that feasible at all?


Thank you for any pointers!

Regards,
Maciek

P.S.
I searched the archives and only found this [1] thread related, but it
does not provide an obvious (for me) solution.

[1]http://www.nabble.com/grant-select-on-all-tables-to1029307.html#a1029307


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [GENERAL] Windows 8.3 installer leaves a debug flag on?

2008-02-25 Thread Tom Lane
Dan Armbrust [EMAIL PROTECTED] writes:
 To follow up on my old thread -
 I tested another install of Postgres 8.3.0.1 - and on windows, the
 postgresql.conf file has this in it:

 shared_preload_libraries = '$libdir/plugins/plugin_debugger.dll'  
 #
 (change requires restart)

 I got the impression that that debugger wasn't intended to be enabled,
 by default.  It looks like the windows installer is enabling it,
 however.

I agree, this does not seem like a wise default.  I'm not sure what the
performance penalty is for having that preloaded, but it's likely not
zero ... and production machines really shouldn't have it installed
anyway.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[GENERAL] copy with escape

2008-02-25 Thread blackwater dev
I have data that I'm running through pg_escape_sting in php and then adding
to stdin for a copy command.  The problem is O'reilly is being changed to
O''Reilly in the string and then in the db.  I saw with the copy command I
can specify the escape but it isn't working for me.  Should this command fix
this double 'single' quote issue when I put it in the db?  And what is the
proper syntax?

COPY mytable FROM stdin with escape



Thanks!


Re: [GENERAL] copy with escape

2008-02-25 Thread Tom Lane
blackwater dev [EMAIL PROTECTED] writes:
 I have data that I'm running through pg_escape_sting in php and then adding
 to stdin for a copy command.  The problem is O'reilly is being changed to
 O''Reilly in the string and then in the db.

pg_escape_string is designed to produce a string properly quoted for use
as a literal in a SQL command.  It is completely wrong for data that is
to go into COPY input.  I kinda doubt that PHP has anything built-in
that's suitable for COPY, though I could be wrong ...

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] copy with escape

2008-02-25 Thread brian



blackwater dev wrote:

I have data that I'm running through pg_escape_sting in php and then adding
to stdin for a copy command.  The problem is O'reilly is being changed to
O''Reilly in the string and then in the db.  I saw with the copy command I
can specify the escape but it isn't working for me.  Should this command fix
this double 'single' quote issue when I put it in the db?  And what is the
proper syntax?

COPY mytable FROM stdin with escape





COPY mytable (...) FROM STDIN WITH CSV ESCAPE ';

http://www.postgresql.org/docs/8.3/static/sql-copy.html

But CSV comes with a lot of baggage. You'd be far better off doing 
tab-delimited, unquoted fields (if you have no tabs in your data).


COPY mytable (...) FROM STDIN;
...
\.

I can't remember precisely all of what pg_escape_string() does, but if 
you need it for something else you could always do this afterward ;-)


implode(\t, str_replace('', ', $row))

b

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[GENERAL] Deploying PostgreSQL on virtualized hardware

2008-02-25 Thread Colin Wetherbee
In a couple months, I'm going to be considering how best to deploy an 
application I have that uses PostgreSQL as its back-end.  It also makes 
heavy use of Perl under mod_perl and UMN MapServer with a sprinkling of 
PostGIS.


I've recently become intrigued by the idea of virtualized servers [0], 
but I don't know anyone who uses them, so I thought I'd ask the list if 
any of you have tried deploying PostgreSQL or any other disk-heavy 
software on them?


I'm not familiar with how the virtualized disks are arranged, how 
volatile the memory is, and so forth.  And, unfortunately, the 
advertising folk don't seem to be very keen on divulging the technical 
details.


It would certainly be nice to pay $60 per month for eight virtualization 
slots and scale up from there as the service grows, rather than buying 
$10,000 worth of servers when they won't be even close to fully utilized 
for a several months.


Any thoughts?

Thanks.

Colin

[0] eg. http://www.gandi.net/hebergement/offre/xen/

---(end of broadcast)---
TIP 6: explain analyze is your friend