Re: [BUGS] BUG #4927: psql does "spoil" the query before sending it to server

2009-07-17 Thread Euler Taveira de Oliveira
Tom Lane escreveu:
> Euler Taveira de Oliveira  writes:
>> Why are we analysing the query there? One possible fix is to remove the '.' 
>> as
>> delimiter in strtokx(). The trivial patch is attached.
> 
> Surely that would break a lot of other cases.
> 
Why? Even if it can't catch all cases when we remove the '.', the query will
be parsed (again) by PostgreSQL.


-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #4926: too few pathkeys for mergeclauses

2009-07-17 Thread Tom Lane
"Roman Kononov"  writes:
> Description:too few pathkeys for mergeclauses

I've applied a patch for this.  Thanks for the report.

regards, tom lane

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #4927: psql does "spoil" the query before sending it to server

2009-07-17 Thread Tom Lane
Euler Taveira de Oliveira  writes:
> Why are we analysing the query there? One possible fix is to remove the '.' as
> delimiter in strtokx(). The trivial patch is attached.

Surely that would break a lot of other cases.

regards, tom lane

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #4927: psql does "spoil" the query before sending it to server

2009-07-17 Thread Euler Taveira de Oliveira
handling numeric literals with dots in psql copy command escreveu:
> fi...@filip=# \copy ( select 1.23::numeric as num ) to 'out.csv' with csv
> header
> ERROR:  syntax error at or near "."
> LINE 1: COPY ( select 1 . 23::numeric as num ) TO STDOUT CSV HEADER
> ^
> \copy: ERROR:  syntax error at or near "."
> LINE 1: COPY ( select 1 . 23::numeric as num ) TO STDOUT CSV HEADER
> ^
> 
> query inside parentheses is OK.
> 
> same bug reproduced in 8.3.7
> 
Thanks for your report.

Why are we analysing the query there? One possible fix is to remove the '.' as
delimiter in strtokx(). The trivial patch is attached.


-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/
Index: copy.c
===
RCS file: /a/pgsql/dev/anoncvs/pgsql/src/bin/psql/copy.c,v
retrieving revision 1.80
diff -c -r1.80 copy.c
*** copy.c  26 Apr 2009 15:31:50 -  1.80
--- copy.c  17 Jul 2009 19:41:58 -
***
*** 146,152 
  
while (parens > 0)
{
!   token = strtokx(NULL, whitespace, ".,()", "\"'",
nonstd_backslash, true, 
false, pset.encoding);
if (!token)
goto error;
--- 146,152 
  
while (parens > 0)
{
!   token = strtokx(NULL, whitespace, ",()", "\"'",
nonstd_backslash, true, 
false, pset.encoding);
if (!token)
goto error;

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #4926: too few pathkeys for mergeclauses

2009-07-17 Thread Tom Lane
I wrote:
> I think what this case may show is simply that the consistency
> checking I added to create_mergejoin_plan in 8.3 is too strict.
> Not quite convinced yet though.

After further review I think that is the correct approach to take.

The proximate cause of the problem is that
find_mergeclauses_for_pathkeys is selecting an order for the merge
clauses that corresponds to a noncanonical pathkey list for the
inner relation (to wit, x, y, x).  While it would be possible in
this particular example to put the clauses in x, x, y order instead,
I don't think that is necessarily possible in every case.  The clause
ordering is constrained by the outer pathkeys and what we have here
is a demonstration that the inner pathkeys needn't match the outer
ones one-to-one.  So you could have a clause that references an inner
pathkey that is also referenced by some earlier clause that matches
a different outer pathkey, and there won't be any way to make them
adjacent.

By the time the plan gets to create_mergejoin_plan, the inner pathkey
list has been reduced to canonical form (x, y), but *this does not
represent any actual change in sort order*.  (Which is why there's
no actual bug in 8.2 and before, which blithely generate plans that
involve such "incorrect" mergeclause orderings.)  So I think we should
just weaken the checks in create_mergejoin_plan to allow such cases,
ie, each mergeclause should be allowed to match any already-used
inner pathkey.

The other approach we could possibly take is to have
find_mergeclauses_for_pathkeys reject candidate mergeclauses that
produce out-of-order inner pathkeys, but that would break at least
this Assert at joinpath.c:272:

/* Should have used them all... */
Assert(list_length(cur_mergeclauses) == list_length(mergeclause_list));

and it'd be rather expensive to test for anyway.

regards, tom lane

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] bug or simply not enough stack space?

2009-07-17 Thread Frank van Vugt
Hi Tom,

> Hmm ... the relevant code change seems to have been
> http://archives.postgresql.org/pgsql-committers/2009-04/msg00127.php

Well, though not during beta, the field testing did pay off ;)

> I think I might have been overenthusiastic in trying to free resources
> during a subtransaction abort.  Try this patch and see if you notice
> any bad side-effects.

All examples I had that crashed and burned, now work correctly and/or bail out 
correctly where needed.

No side-effects noticed.




-- 
Best,




Frank.


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #4926: too few pathkeys for mergeclauses

2009-07-17 Thread Tom Lane
Heikki Linnakangas  writes:
> Version 8.3 has the same bug, apparently introduced along with the
> equivalence classes. In 8.2, the merge condition is reduced into (i=x
> AND i=y), IOW the planner eliminates the duplicate condition. I believe
> 8.2 would otherwise have the same problem as well.

The fact that 8.2 eliminates the redundant condition is more or less
accidental, I think.  It's using equal() to detect duplicate
RestrictInfos coming up from the two input relations for the join,
while later versions rely on pointer equality for that.  You can fool
8.2 by commuting the duplicate condition, but it still doesn't fail:

regression=# explain select * from a left outer join b on i=x and i=y and x=i;
   QUERY PLAN
-
 Merge Left Join  (cost=285.12..325.93 rows=2140 width=12)
   Merge Cond: ((a.i = b.x) AND (a.i = b.y) AND (a.i = b.x))
   ->  Sort  (cost=149.78..155.13 rows=2140 width=4)
 Sort Key: a.i
 ->  Seq Scan on a  (cost=0.00..31.40 rows=2140 width=4)
   ->  Sort  (cost=135.34..140.19 rows=1940 width=8)
 Sort Key: b.x, b.y
 ->  Seq Scan on b  (cost=0.00..29.40 rows=1940 width=8)
(8 rows)

I think what this case may show is simply that the consistency
checking I added to create_mergejoin_plan in 8.3 is too strict.
Not quite convinced yet though.

Another possible solution for this particular case is to allow the
equivclass code to deduce x=y as an equivalence class, that is
the plan should enforce that check at the scan of b and then just
have one sort key for the merge.  Not sure how complicated that is,
however, and in any case it may not fix every possible failure
case for create_mergejoin_plan.

regards, tom lane

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] bug or simply not enough stack space?

2009-07-17 Thread Tom Lane
Frank van Vugt  writes:
> So this took a while, but here's your test case.
> Turns out to be quite small actually ;)

Hmm ... the relevant code change seems to have been
http://archives.postgresql.org/pgsql-committers/2009-04/msg00127.php

I think I might have been overenthusiastic in trying to free resources
during a subtransaction abort.  Try this patch and see if you notice
any bad side-effects.

regards, tom lane

Index: src/pl/plpgsql/src/pl_exec.c
===
RCS file: /cvsroot/pgsql/src/pl/plpgsql/src/pl_exec.c,v
retrieving revision 1.244
diff -c -r1.244 pl_exec.c
*** src/pl/plpgsql/src/pl_exec.c17 Jun 2009 13:46:12 -  1.244
--- src/pl/plpgsql/src/pl_exec.c17 Jul 2009 16:12:22 -
***
*** 5292,5298 
{
SimpleEcontextStackEntry *next;
  
!   FreeExprContext(simple_econtext_stack->stack_econtext);
next = simple_econtext_stack->next;
pfree(simple_econtext_stack);
simple_econtext_stack = next;
--- 5292,5299 
{
SimpleEcontextStackEntry *next;
  
!   if (event == SUBXACT_EVENT_COMMIT_SUB)
!   FreeExprContext(simple_econtext_stack->stack_econtext);
next = simple_econtext_stack->next;
pfree(simple_econtext_stack);
simple_econtext_stack = next;

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] Huge speed penalty using <>TRUE instead of =FALSE

2009-07-17 Thread Greg Stark
I don't think it even has to be so specific. We should just always  
rewrite bool <> bool into bool = NOT bool.



Hmm. That only has a 50/50 chance of  creating an indexable clause.  
Perhaps we could even rewrite it as "a = NOT b AND NOT a = b".


--
Greg

On 2009-07-17, at 3:21 PM, Tom Lane  wrote:


Peter Eisentraut  writes:

... But again, this is data type specific knowledge.


Actually, now that I think about it, the planner already has
datatype-specific knowledge about boolean equality (see
simplify_boolean_equality).  It would take just a few more lines of  
code

there to recognize "x <> true" and "x <> false" as additional variant
spellings of the generic "x" or "NOT x" constructs.  Not sure if it's
worth the trouble though; how many people really write such things?

If you really wanted to take it to extremes, you could also reduce
cases like "x > false", but that's starting to get a bit silly.

   regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] Huge speed penalty using <>TRUE instead of =FALSE

2009-07-17 Thread Tom Lane
Peter Eisentraut  writes:
> ... But again, this is data type specific knowledge.

Actually, now that I think about it, the planner already has
datatype-specific knowledge about boolean equality (see
simplify_boolean_equality).  It would take just a few more lines of code
there to recognize "x <> true" and "x <> false" as additional variant
spellings of the generic "x" or "NOT x" constructs.  Not sure if it's
worth the trouble though; how many people really write such things?

If you really wanted to take it to extremes, you could also reduce
cases like "x > false", but that's starting to get a bit silly.

regards, tom lane

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] Huge speed penalty using <>TRUE instead of =FALSE

2009-07-17 Thread Tom Lane
Jan-Ivar Mellingen  writes:
> One of our customers discovered that by replacing <>TRUE with =FALSE in
> a query of a table containing 750.000 records reduced the query time
> from about 12 seconds to about 60 milliseconds!

This is not a bug.  The set of operators that are indexable is well
documented, and <> is not one of them.

regards, tom lane

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] Huge speed penalty using <>TRUE instead of =FALSE

2009-07-17 Thread Peter Eisentraut
On Friday 17 July 2009 12:45:47 Mikael Krantz wrote:
> It might be that your column may be NULL as well as TRUE or FALSE. I
> am no expert in this matter though.

Nulls also need to be considered when attempting to substitute purportedly 
equivalent clauses.  But in this case it wouldn't actually matter, because

WHERE foo <> TRUE

and

WHERE foo = false

would both omit the row if foo is null.  Both expressions only return true if 
foo has the value "false".  But again, this is data type specific knowledge.


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] Huge speed penalty using <>TRUE instead of =FALSE

2009-07-17 Thread Mikael Krantz
It might be that your column may be NULL as well as TRUE or FALSE. I
am no expert in this matter though.

/M

On Fri, Jul 17, 2009 at 10:12 AM, Jan-Ivar
Mellingen wrote:
> One of our customers discovered that by replacing <>TRUE with =FALSE in
> a query of a table containing 750.000 records reduced the query time
> from about 12 seconds to about 60 milliseconds!
>
> The problematic query looks like this:
> SELECT * FROM AlarmLogg WHERE Logg_avsluttet <> TRUE AND
> Alarm_status='X' ORDER BY ID DESC
>
> If it is changed to this it works as expected:
> SELECT * FROM AlarmLogg WHERE Logg_avsluttet = FALSE AND
> Alarm_status='X' ORDER BY ID DESC
>
> After investigation (on a smaller dataset on my own database) I found
> that the query was resulting in a sequential scan:
>
> "explain analyze SELECT * FROM AlarmLogg WHERE Logg_avsluttet <> TRUE
> AND Alarm_status='X' ORDER BY ID DESC"
> "Sort  (cost=49936.96..49936.96 rows=1 width=405) (actual
> time=837.793..837.793 rows=0 loops=1)"
> "  Sort Key: id"
> "  Sort Method:  quicksort  Memory: 17kB"
> "  ->  Seq Scan on alarmlogg  (cost=0.00..49936.95 rows=1 width=405)
> (actual time=837.782..837.782 rows=0 loops=1)"
> "        Filter: ((logg_avsluttet <> true) AND ((alarm_status)::text =
> 'X'::text))"
> "Total runtime: 837.896 ms"
>
> The modified query gave this result:
> "explain analyze SELECT * FROM AlarmLogg WHERE Logg_avsluttet = FALSE
> AND Alarm_status='X' ORDER BY ID DESC"
> "Sort  (cost=8.36..8.37 rows=1 width=405) (actual time=0.032..0.032
> rows=0 loops=1)"
> "  Sort Key: id"
> "  Sort Method:  quicksort  Memory: 17kB"
> "  ->  Index Scan using i_alarmlogg_logg_avsluttet on alarmlogg
> (cost=0.00..8.35 rows=1 width=405) (actual time=0.024..0.024 rows=0
> loops=1)"
> "        Index Cond: (logg_avsluttet = false)"
> "        Filter: ((NOT logg_avsluttet) AND ((alarm_status)::text =
> 'X'::text))"
> "Total runtime: 0.123 ms"
>
> This is a dramatical difference, but I cannot understand why. In my head
> "<>TRUE" should behave exactly the same as "=FALSE". This looks like a
> bug to me, or am I overlooking something?
>
> This was verified on PostgreSQL 8.3.7, both on Windows Xp and Ubuntu 8.10.
>
> Some relevant details from the table definition:
> CREATE TABLE alarmlogg
> (
>   id serial NOT NULL,
>   alarm_status character varying(1) DEFAULT ''::character varying,
>   logg_avsluttet boolean DEFAULT false,
>   ...
>   CONSTRAINT alarmlogg_pkey PRIMARY KEY (id)
> )
>
> CREATE INDEX i_alarmlogg_alarm_status
>  ON alarmlogg
>  USING btree
>  (alarm_status);
>
> CREATE INDEX i_alarmlogg_logg_avsluttet
>  ON alarmlogg
>  USING btree
>  (logg_avsluttet);
>
> Regards,
> Jan-Ivar Mellingen
> Securinet AS
>
>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs
>

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


[BUGS] BUG #4927: psql does "spoil" the query before sending it to server

2009-07-17 Thread handling numeric literals with dots in psql copy command

The following bug has been logged online:

Bug reference:  4927
Logged by:  handling numeric literals with dots in psql \copy
command
Email address:  filip.rembialkow...@gmail.com
PostgreSQL version: 8.4.0
Operating system:   Linux
Description:psql does "spoil" the query before sending it to server
Details: 

8.4.0:

fi...@filip=# \copy ( select 1.23::numeric as num ) to 'out.csv' with csv
header
ERROR:  syntax error at or near "."
LINE 1: COPY ( select 1 . 23::numeric as num ) TO STDOUT CSV HEADER
^
\copy: ERROR:  syntax error at or near "."
LINE 1: COPY ( select 1 . 23::numeric as num ) TO STDOUT CSV HEADER
^

query inside parentheses is OK.

same bug reproduced in 8.3.7

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] Huge speed penalty using <>TRUE instead of =FALSE

2009-07-17 Thread Peter Eisentraut
On Friday 17 July 2009 11:12:41 Jan-Ivar Mellingen wrote:
> One of our customers discovered that by replacing <>TRUE with =FALSE in
> a query of a table containing 750.000 records reduced the query time
> from about 12 seconds to about 60 milliseconds!

> This is a dramatical difference, but I cannot understand why. In my head
> "<>TRUE" should behave exactly the same as "=FALSE". This looks like a
> bug to me, or am I overlooking something?

The planner just isn't that smart.  The boolean type is a special case where 
<> some_value implies = some_other_value, but this doesn't generalize well to 
other data types.  And the planner doesn't have a whole lot of data type 
specific knowledge.

I think a better index definition might actually be on alarm_status, with a 
partial index predicate on logg_avsluttet = false.


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] bug or simply not enough stack space?

2009-07-17 Thread Frank van Vugt
Ok,

So this took a while, but here's your test case.
Turns out to be quite small actually ;)



create table t1 (id int);

CREATE FUNCTION tr_t1_after_iud()
RETURNS trigger
LANGUAGE 'plpgsql'
VOLATILE
STRICT
SECURITY INVOKER
AS 'DECLARE
 BEGIN
 RAISE NOTICE ''%'', ROW(NEW.*);
 SELECT 1/0;
 RETURN NEW;
 END;';

CREATE TRIGGER t1_after AFTER INSERT OR UPDATE OR DELETE ON t1 FOR EACH ROW 
EXECUTE PROCEDURE tr_t1_after_iud();

begin;

savepoint s1;

INSERT INTO t1 values (1);




=> this will result in the following:

db=# INSERT INTO t1 values (1);
NOTICE:  (1)
WARNING:  AbortSubTransaction while in ABORT state
WARNING:  did not find subXID 77063 in MyProc
ERROR:  division by zero
CONTEXT:  SQL statement "SELECT 1/0"
PL/pgSQL function "tr_t1_after_iud" line 4 at SQL statement
ERROR:  tupdesc reference 0x7ffe74f24ad0 is not owned by resource owner 
SubTransaction



=> mind the fact that the savepoint is 'needed', without it there will be no 
problem

=> in my reallife example, this resulted in a "PANIC: ERRORDATA_STACK_SIZE 
exceeded", I cannot reproduce that, but as you stated earlier, this might just 
be collateral damage, which I'll be able to easily confirm once the problem 
above has been fixed.



=> cleanup:

rollback;
drop table t1;
drop function tr_t1_after_iud();


=> to avoid possible confusion

db=# select version();
version
---
 PostgreSQL 8.4.0 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 
4.2.4, 64-bit



Looking forward to your reply.



-- 
Best,




Frank.


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


[BUGS] Huge speed penalty using <>TRUE instead of =FALSE

2009-07-17 Thread Jan-Ivar Mellingen
One of our customers discovered that by replacing <>TRUE with =FALSE in
a query of a table containing 750.000 records reduced the query time
from about 12 seconds to about 60 milliseconds!

The problematic query looks like this:
SELECT * FROM AlarmLogg WHERE Logg_avsluttet <> TRUE AND
Alarm_status='X' ORDER BY ID DESC

If it is changed to this it works as expected:
SELECT * FROM AlarmLogg WHERE Logg_avsluttet = FALSE AND
Alarm_status='X' ORDER BY ID DESC

After investigation (on a smaller dataset on my own database) I found
that the query was resulting in a sequential scan:

"explain analyze SELECT * FROM AlarmLogg WHERE Logg_avsluttet <> TRUE
AND Alarm_status='X' ORDER BY ID DESC"
"Sort  (cost=49936.96..49936.96 rows=1 width=405) (actual
time=837.793..837.793 rows=0 loops=1)"
"  Sort Key: id"
"  Sort Method:  quicksort  Memory: 17kB"
"  ->  Seq Scan on alarmlogg  (cost=0.00..49936.95 rows=1 width=405)
(actual time=837.782..837.782 rows=0 loops=1)"
"Filter: ((logg_avsluttet <> true) AND ((alarm_status)::text =
'X'::text))"
"Total runtime: 837.896 ms"

The modified query gave this result:
"explain analyze SELECT * FROM AlarmLogg WHERE Logg_avsluttet = FALSE
AND Alarm_status='X' ORDER BY ID DESC"
"Sort  (cost=8.36..8.37 rows=1 width=405) (actual time=0.032..0.032
rows=0 loops=1)"
"  Sort Key: id"
"  Sort Method:  quicksort  Memory: 17kB"
"  ->  Index Scan using i_alarmlogg_logg_avsluttet on alarmlogg 
(cost=0.00..8.35 rows=1 width=405) (actual time=0.024..0.024 rows=0
loops=1)"
"Index Cond: (logg_avsluttet = false)"
"Filter: ((NOT logg_avsluttet) AND ((alarm_status)::text =
'X'::text))"
"Total runtime: 0.123 ms"

This is a dramatical difference, but I cannot understand why. In my head
"<>TRUE" should behave exactly the same as "=FALSE". This looks like a
bug to me, or am I overlooking something?

This was verified on PostgreSQL 8.3.7, both on Windows Xp and Ubuntu 8.10.

Some relevant details from the table definition:
CREATE TABLE alarmlogg
(
   id serial NOT NULL,
   alarm_status character varying(1) DEFAULT ''::character varying,
   logg_avsluttet boolean DEFAULT false,
   ...
   CONSTRAINT alarmlogg_pkey PRIMARY KEY (id)
)

CREATE INDEX i_alarmlogg_alarm_status
  ON alarmlogg
  USING btree
  (alarm_status);

CREATE INDEX i_alarmlogg_logg_avsluttet
  ON alarmlogg
  USING btree
  (logg_avsluttet);

Regards,
Jan-Ivar Mellingen
Securinet AS


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #4926: too few pathkeys for mergeclauses

2009-07-17 Thread Heikki Linnakangas
Heikki Linnakangas wrote:
> 2. make_inner_pathkeys_for_merge() should have created sort order (x, y,
> x) for the inner side.

On further thought, that would make no sense. Sort order (x, y) is
always equivalent to (x, y, x).

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #4926: too few pathkeys for mergeclauses

2009-07-17 Thread Heikki Linnakangas
Greg Stark wrote:
> On Thu, Jul 16, 2009 at 9:07 PM, Roman Kononov wrote:
>> test=# create table junk(i int);
>> CREATE TABLE
>> test=# select * from junk left outer join (select coalesce(i,1) as x,
>> coalesce(i,2) as y from junk) t on coalesce(i,3)=x and coalesce(i,4)=y and
>> coalesce(i,5)=x;
>> ERROR:  too few pathkeys for mergeclauses
> 
> Thanks for the bug report. That's definitely not supposed to be
> happening. It's always nice when it's easy to reproduce the problem
> like this.

Yep. This can be further reduced into this:

CREATE TABLE a (i integer);
CREATE TABLE b (x integer, y integer);

select * from a left outer join b on i=x and i=y and i=x;

The planner is choosing a merge join, where the outer side (table a) is
sorted by (i), and the inner side is sorted by (x, y). But that doesn't
work with the merge condition (i=x AND i=y AND i=x).

Version 8.3 has the same bug, apparently introduced along with the
equivalence classes. In 8.2, the merge condition is reduced into (i=x
AND i=y), IOW the planner eliminates the duplicate condition. I believe
8.2 would otherwise have the same problem as well.

I can see two different things that you could say is at fault here:

1. We no longer eliminate the duplicate condition, but the
find_mergeclauses_for_pathkeys() + make_inner_pathkeys_for_merge()
combination relies on there being no duplicates. We should try harder to
eliminate duplicates in left join clauses.

2. make_inner_pathkeys_for_merge() should have created sort order (x, y,
x) for the inner side.

The first solution is what we probably want, to avoid unnecessary work
at execution time.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs