Re: [BUGS] BUG #6067: In PL/pgsql, EXISTS(SELECT ... INTO...) fails

2012-08-16 Thread Bruce Momjian
On Wed, Nov 30, 2011 at 03:36:11PM -0500, Robert Haas wrote:
> On Tue, Nov 29, 2011 at 9:32 PM, Bruce Momjian  wrote:
> > Tom Lane wrote:
> >> "David Fetter"  writes:
> >> >     IF EXISTS (SELECT 1 INTO STRICT i) THEN
> >> >         RAISE NOTICE '%', a;
> >> >     END IF;
> >>
> >> Umm ... are you just complaining that the error message isn't very
> >> helpful, or are you actually expecting that to do something useful?
> >> If the latter, what exactly?  I'm particularly confused by your use
> >> of the STRICT option here, because if we did support that, I would
> >> expect the STRICT to throw an error if there were not exactly one
> >> matching row, making the EXISTS test 100% pointless.
> >>
> >> But the short answer is that we don't support INTO in sub-selects,
> >> and in general I doubt that we ever will, since in most cases the
> >> behavior wouldn't be very well-defined.  It might be worth a TODO
> >> to provide a better error message than "syntax error", though.
> >
> > Is it worth documenting, fixing, or adding this to the TODO list?
> 
> At most I would say we could try to improve the error message.

I researched this and it seems too complex to improve the error message;
I am afraid the change would have to bleed into the main backend parser.
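
For reference, here is a minimal sketch of the forms PL/pgSQL does support
for this pattern (hypothetical table t with columns id and val; the names
are made up for illustration):

DO $$
DECLARE
    i integer;
BEGIN
    -- Test for existence without INTO; no value is fetched.
    IF EXISTS (SELECT 1 FROM t WHERE t.id = 42) THEN
        RAISE NOTICE 'row exists';
    END IF;

    -- Or fetch first and test the FOUND flag afterwards.
    -- (INTO STRICT would instead raise an error unless exactly
    -- one row matches.)
    SELECT t.val INTO i FROM t WHERE t.id = 42;
    IF FOUND THEN
        RAISE NOTICE 'value is %', i;
    END IF;
END
$$;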

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] Cannot dump 8.4.8 database using later versions

2012-08-16 Thread Bruce Momjian

I don't think we ever addressed this, but since we have had minimal
complaints about it, I guess we are OK.

---

On Tue, Nov 15, 2011 at 10:04:57PM -0500, Tom Lane wrote:
> Robert Haas  writes:
> > On Tue, Nov 15, 2011 at 5:16 AM, Heikki Linnakangas
> >  wrote:
> >> NEW used to be a reserved keyword, but it's not so in 9.0 anymore. So 9.0
> >> pg_dump thinks it doesn't need to be quoted.
> 
> > Why isn't it correct?
> 
> It's correct to not quote it in pg_dump's output (since we make no
> promises that such output would load into a pre-9.0 server anyway).
> The problem is that it needs to be quoted in commands that pg_dump
> sends back to the 8.4 server.  Example:
> 
> psql (8.4.9)
> You are now connected to database "db84".
> db84=# create table "new"( f1 int, "new" text);
> 
> ... pg_dump with newer pg_dump ...
> 
> pg_dump: SQL command failed
> pg_dump: Error message from server: ERROR:  syntax error at or near "new"
> LINE 1: COPY public.new (f1, new) TO stdout;
>  ^
> pg_dump: The command was: COPY public.new (f1, new) TO stdout;
> 
> The least painful solution might be to always quote *every* identifier
> in commands sent to the source server, since we don't especially care
> how nice-looking those are.
> 
>   regards, tom lane
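
For illustration, the always-quoted form of that failing command is
accepted by an 8.4 server (just a sketch of the idea, not actual pg_dump
output):

-- Quoting every identifier sidesteps the reserved-word change for "new"
COPY "public"."new" ("f1", "new") TO stdout;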

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] 'pg_ctl restart' confused about pathname to postgresql.conf

2012-08-16 Thread Bruce Momjian

I have applied the attached doc patch to document the problem with
relative paths and pg_ctl restart.

---

On Sun, Oct 23, 2011 at 08:49:25PM -0400, Josh Kupershmidt wrote:
> On Sat, Oct 22, 2011 at 12:13 PM, Tom Lane  wrote:
> > I think the reason it has a problem is that this is what's left in
> > postmaster.opts:
> >
> > /home/tgl/pgsql/bin/postgres "-D" "baz"
> >
> > (which is an accurate representation of the command line from startup)
> > and that -D switch gets fed to the postmaster as-is during restart.
> 
> I see.
> 
> > By and large, I would not recommend using a relative pathname to start
> > the postmaster, unless you plan to start it from the same working
> > directory every time.
> 
> Well, now I know. But that really seems like an annoying and arbitrary
> restriction, not to mention not being documented anywhere AFAICT.
> 
> (I came upon this problem because I often set up servers with
> binaries, libraries, and $PGDATA all tucked away under
> /home/postgres/, and it seemed natural to use a relative pathname as
> my data directory argument to pg_ctl since my working directory will
> usually be /home/postgres/ when I'm poking at the server.)
> 
> > We could possibly avoid this by having pg_ctl try to absolute-ify the -D
> > setting during postmaster start, but I'm not convinced it's worth the
> > trouble, or even that it's appropriate for pg_ctl to editorialize on the
> > user's choice of absolute vs relative path.
> 
> I don't want to bikeshed on the mechanics of how exactly this should
> work, but it doesn't seem like it should be so hard to get this to
> DWIM. In the example I posted, the last step which fails is basically:
> 
>   pg_ctl -D /tmp/foo/bar/baz/ restart
> 
> and it just seems totally broken for that to not work: pg_ctl knows
> exactly which data directory the user means when invoked here. Plus,
> these steps would work fine instead at that point:
> 
>   pg_ctl -D /tmp/foo/bar/baz/ stop
>   pg_ctl -D /tmp/foo/bar/baz/ start
> 
> and I was under the impression (supported by the pg_ctl doc page,
> which claims "restart mode effectively executes a stop followed by a
> start") that these sequences should be equivalent.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/doc/src/sgml/ref/pg_ctl-ref.sgml b/doc/src/sgml/ref/pg_ctl-ref.sgml
new file mode 100644
index 7a4c2be..3107514
*** a/doc/src/sgml/ref/pg_ctl-ref.sgml
--- b/doc/src/sgml/ref/pg_ctl-ref.sgml
*** PostgreSQL documentation
*** 188,194 

 restart mode effectively executes a stop followed
 by a start.  This allows changing the postgres
!command-line options.

  

--- 188,196 

 restart mode effectively executes a stop followed
 by a start.  This allows changing the postgres
!command-line options.  restart might fail if
!relative paths were specified on the command line during
!server start.

  


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #7485: 9.2 beta3 libxml2 can't be loaded on Windows

2012-08-16 Thread Tom Lane
Dave Page  writes:
> On Thu, Aug 16, 2012 at 3:28 PM, Dave Page  wrote:
>> On Thu, Aug 16, 2012 at 3:26 PM, Tom Lane  wrote:
>>> Seems pretty brute-force.  Why not just
>>> 
>>> #if LIBXML_VERSION > 20703
>>> #define HAVE_XMLSTRUCTUREDERRORCONTEXT
>>> #endif
>>> 
>>> in some suitable place (probably xml.c itself, since I don't think we
>>> want to #include xmlversion.h in global headers)?
>>> 
>>> Or actually, maybe we could just unconditionally define
>>> HAVE_XMLSTRUCTUREDERRORCONTEXT in pg_config.h.win32.  Is anybody
>>> likely to still be building PG with ancient libxml on Windows?

>> That works for me. Using older libxml's would be a bad idea anyway -
>> and there are precompiled binaries available, so it's not hard to get.

> Works as in, I'm happy with it. I haven't tested. Though, it'll likely
> break some of the old BF animals (temporarily, I hope).

On the whole plan A sounds safer (ie look at LIBXML_VERSION in place of
using a configure test).  I'll work on that tonight or tomorrow, if
nobody beats me to it.

regards, tom lane


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: Re-2: [BUGS] BUG #7495: chosen wrong index

2012-08-16 Thread Kevin Grittner
 wrote:
 
> In my live environment I have a table with a boolean which is
> usually true. The boolean is false on new or changed entries, and
> if I select the false rows ordered by primary key I get slow
> queries. The table has many millions of rows and a very small
> number of rows with false.
 
That sure sounds like a situation where you should use a partial
index.
 
http://www.postgresql.org/docs/current/interactive/indexes-partial.html
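
A minimal sketch of what that could look like for the case you describe
(table and column names are hypothetical):

-- Index only the rare "false" rows, ordered by the primary key.
CREATE INDEX orders_unprocessed_idx ON orders (id) WHERE NOT processed;

-- The slow query can then be answered by an ordered scan of this
-- small partial index:
SELECT * FROM orders WHERE NOT processed ORDER BY id LIMIT 10;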
 
-Kevin


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re-2: [BUGS] BUG #7495: chosen wrong index

2012-08-16 Thread psql
"Kevin Grittner"  wrote:
> insert into bla ( a , b )
>   select a , floor(random() * 100) + 1
>   from generate_series( 1 , 1000000 ) as a ( a ) ;
>  
> On my machine, with that data, all of the queries run fast.

Yes, this runs fast for me too. But there is no relation
between a and b; with my data PostgreSQL must scan almost all of
the data to find the rows.

This is only an example. In my live environment I have a table with
a boolean which is usually true. The boolean is false on new or
changed entries, and if I select the false rows ordered by primary
key I get slow queries. The table has many millions of rows and a
very small number of rows with false. The boolean indicates that the
row has an entry in a second table, so I can select the rows without
a join, which would be expensive too.

BTW: it would be nice to have an index which works for
select * from a left join b on b.id = a.id where b.id is null.

explain select * from bla where b > 99 order by a limit 10 ;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=0.00..30.75 rows=10 width=8)
   ->  Index Scan using bla_a on bla  (cost=0.00..30747.29 rows=10000 width=8)
         Filter: (b > 99)

drop index bla_a ;

explain select * from bla where b > 99 order by a limit 10 ;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=633.50..633.52 rows=10 width=8)
   ->  Sort  (cost=633.50..658.50 rows=10000 width=8)
         Sort Key: a
         ->  Index Scan using bla_b on bla  (cost=0.00..417.40 rows=10000 width=8)
               Index Cond: (b > 99)

In the first explain, the limit reduces the cost by the factor
10/10000. That is fine for data with no correlation between a and b,
but with my data the real cost is about 1000 times higher.

BTW: in the first explain this is not really an index scan, it is a
sequential scan over an index. It would be nice to show this in
the explain output.

Perhaps the cost reduction that the limit applies to such a
sequential scan over an index should not be linear.
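
(As a side note, a sketch of how to see the misestimate directly on the
same test table bla: explain analyze prints the planner's estimated row
counts next to the actual ones.)

explain analyze select * from bla where b > 99 order by a limit 10 ;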


Best regards,
Andreas


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #7494: WAL replay speed depends heavily on the shared_buffers size

2012-08-16 Thread Valentine Gogichashvili
Hello John,


>> we see up to 10x performance increase with bigger shared_buffers in case
>> of this database. Main database entities are about 20GB in size and we see
>> that performance drops considerably when running with
>> shared_buffers smaller than that.
>>
>>
> do you adjust effective_cache_size accordingly?   with the smaller
> shared_buffers, we typically find at least half or more of physical memory
> is available as OS level disk cache, as shown by the 'cached' output of
> 'free' or whatever after the system has been running long enough to fully
> populate its disk cache.  this parameter has a significant performance
> impact on the planner's estimation of the best way of executing given
> queries.  also, especially if you're executing queries that process a lot
> of rows and have to do sorts and such, increasing work_mem is quite helpful.
>
>
Yes, effective_cache_size is set to 50% of the RAM = 64GB, but as I
mentioned already, we measure a considerable performance increase when
increasing shared_buffers to values at which the most important tables
fit completely.
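
(For completeness, a quick sketch of how we double-check the settings the
server is actually running with, using the standard pg_settings view:)

SELECT name, setting, unit
  FROM pg_settings
 WHERE name IN ('shared_buffers', 'effective_cache_size', 'work_mem');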

Regards,

-- Valentine


Re: [BUGS] BUG #7494: WAL replay speed depends heavily on the shared_buffers size

2012-08-16 Thread Valentine Gogichashvili
Hello again,

I have now restarted the slave with shared_buffers set to 32GB, and the
recovery process is at 100% CPU:

$ strace -c -f -p 27076
Process 27076 attached - interrupt to quit
 Process 27076 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 67.65    0.002127           0     18127           lseek
 28.98    0.000911           0     16496           getppid
  2.58    0.000081           0       758           read
  0.80    0.000025           4         6         1 open
  0.00    0.000000           0         4           close
  0.00    0.000000           0         1           stat
  0.00    0.000000           0       493       493 unlink
------ ----------- ----------- --------- --------- ----------------
100.00    0.003144                 35885       494 total


$ strace -c -f -p 27076
Process 27076 attached - interrupt to quit
Process 27076 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 44.44    0.000240           0      5976           lseek
 38.15    0.000206           0      5522           getppid
 17.41    0.000094           1       172           read
  0.00    0.000000           0         4         1 open
  0.00    0.000000           0         2           close
  0.00    0.000000           0         1           stat
  0.00    0.000000           0       159       159 unlink
------ ----------- ----------- --------- --------- ----------------
100.00    0.000540                 11836       160 total

Then restarted again with shared_buffers set back to 2GB:

now the replay process is relatively fast again and here is the strace:

$ strace -c -f -p 36450
Process 36450 attached - interrupt to quit
Process 39336 attached
Process 39337 attached
Process 39337 detached
Process 39338 attached
Process 39338 detached
Process 39339 attached
Process 39339 detached
Process 39336 detached
Process 39359 attached
Process 39360 attached
Process 39360 detached
Process 39361 attached
Process 39361 detached
Process 39362 attached
Process 39362 detached
Process 39359 detached
Process 40168 attached
Process 40169 attached
Process 40169 detached
Process 40170 attached
Process 40170 detached
Process 40171 attached
Process 40171 detached
Process 40168 detached
Process 36450 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 88.21    0.668159       31817        21         9 wait4
  4.69    0.035553           6      5970           read
  1.92    0.014572           0    120565           lseek
  1.91    0.014474           5      2957      2954 unlink
  1.55    0.011766           8      1545           write
  1.27    0.009582           0    109274           getppid
  0.31    0.002340           3       762           fcntl
  0.03    0.000245          20        12           clone
  0.02    0.000163           1       219           rt_sigprocmask
  0.02    0.000162           1       154        10 open
  0.01    0.000092           0       216           mmap
  0.01    0.000084           1       133        42 stat
  0.01    0.000066           6        12           getrlimit
  0.01    0.000046           0       178           close
  0.00    0.000037           1        27           munmap
  0.00    0.000035           6         6           pipe
  0.00    0.000035          12         3           set_tid_address
  0.00    0.000030           1        42           getegid
  0.00    0.000021           0       183           rt_sigaction
  0.00    0.000000           0       108           fstat
  0.00    0.000000           0       102           mprotect
  0.00    0.000000           0        36           brk
  0.00    0.000000           0         9           rt_sigreturn
  0.00    0.000000           0         9         9 ioctl
  0.00    0.000000           0        48        12 access
  0.00    0.000000           0         6           dup2
  0.00    0.000000           0         6           getpid
  0.00    0.000000           0        12           execve
  0.00    0.000000           0         9           uname
  0.00    0.000000           0        42           getuid
  0.00    0.000000           0        42           getgid
  0.00    0.000000           0        45           geteuid
  0.00    0.000000           0         6           getpgrp
  0.00    0.000000           0         6           statfs
  0.00    0.000000           0        12           arch_prctl
  0.00    0.000000           0         6         3 futex
  0.00    0.000000           0         3           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.757462                242786      3039 total

With best regards,

-- Valentine Gogichashvili


On Thu, Aug 16, 2012 at 4:53 PM, Valentine Gogichashvili
wrote:

> Hello Andreas,
>
> here is the process, that now actually is not using CPU

Re: [BUGS] BUG #7494: WAL replay speed depends heavily on the shared_buffers size

2012-08-16 Thread Valentine Gogichashvili
Hello Andreas,

here is the process, which right now is actually not using any CPU at
all; shared_buffers is set to 2GB:

50978 postgres  20   0 2288m 2.0g 2.0g S  0.0  1.6   4225:34 postgres:
startup process   recovering 000000050000262E000000FD

It is hanging on that file for several minutes now.

and here is the strace:

$ strace -c -f -p 50978
Process 50978 attached - interrupt to quit
 Process 50978 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 94.82    0.007999          37       215           select
  2.73    0.000230           1       215           getppid
  2.45    0.000207           1       215       215 stat
------ ----------- ----------- --------- --------- ----------------
100.00    0.008436                   645       215 total

What kind of additional profiling information would you like to see?

Regards,

-- Valentin


On Wed, Aug 15, 2012 at 4:09 PM, Andres Freund wrote:

> Hi,
>
> On Wednesday, August 15, 2012 12:10:42 PM val...@gmail.com wrote:
> > The following bug has been logged on the website:
> >
> > Bug reference:  7494
> > Logged by:  Valentine Gogichashvili
> > Email address:  val...@gmail.com
> > PostgreSQL version: 9.0.7
> > Operating system:   Linux version 2.6.32-5-amd64 (Debian 2.6.32-41)
> > Description:
> >
> > We are experiencing strange(?) behavior on the replication slave machines.
> > The master machine has a very heavy update load, where many processes are
> > updating lots of data. It generates up to 30GB of WAL files per hour.
> > Normally it is not a problem for the slave machines to replay this amount
> > of WAL files on time and keep up with the master. But at some moments the
> > slaves are “hanging” with 100% CPU usage on the WAL replay process and 3%
> > IOWait, needing up to 30 seconds to process one WAL file. If this tipping
> > point is reached, a huge WAL replication lag builds up quite fast, which
> > also leads to overfilling of the XLOG directory on the slave machines, as
> > the WAL receiver puts the WAL files it gets via streaming replication into
> > the XLOG directory (which, in many cases, is a separate disk partition of
> > quite limited size).
> Could you try to get a profile of that 100% cpu time?
>
> Greetings,
>
> Andres
> --
> Andres Freund   http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>


Re: [BUGS] BUG #7485: 9.2 beta3 libxml2 can't be loaded on Windows

2012-08-16 Thread Dave Page
On Thu, Aug 16, 2012 at 3:28 PM, Dave Page  wrote:
> On Thu, Aug 16, 2012 at 3:26 PM, Tom Lane  wrote:
>> Dave Page  writes:
>>> So Sachin and I have looked at this but not found a problem with the
>>> installers. He then found that one of our colleagues has already
>>> reported this as an issue with PostgreSQL and submitted a possible
>>> patch. So, we need a committer to look at fixing this:
>>
>>> http://archives.postgresql.org/pgsql-hackers/2012-06/msg00874.php
>>
>> Seems pretty brute-force.  Why not just
>>
>> #if LIBXML_VERSION > 20703
>> #define HAVE_XMLSTRUCTUREDERRORCONTEXT
>> #endif
>>
>> in some suitable place (probably xml.c itself, since I don't think we
>> want to #include xmlversion.h in global headers)?
>>
>> Or actually, maybe we could just unconditionally define
>> HAVE_XMLSTRUCTUREDERRORCONTEXT in pg_config.h.win32.  Is anybody
>> likely to still be building PG with ancient libxml on Windows?
>
> That works for me. Using older libxml's would be a bad idea anyway -
> and there are precompiled binaries available, so it's not hard to get.

Works as in, I'm happy with it. I haven't tested. Though, it'll likely
break some of the old BF animals (temporarily, I hope).

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #7485: 9.2 beta3 libxml2 can't be loaded on Windows

2012-08-16 Thread Dave Page
On Thu, Aug 16, 2012 at 3:26 PM, Tom Lane  wrote:
> Dave Page  writes:
>> So Sachin and I have looked at this but not found a problem with the
>> installers. He then found that one of our colleagues has already
>> reported this as an issue with PostgreSQL and submitted a possible
>> patch. So, we need a committer to look at fixing this:
>
>> http://archives.postgresql.org/pgsql-hackers/2012-06/msg00874.php
>
> Seems pretty brute-force.  Why not just
>
> #if LIBXML_VERSION > 20703
> #define HAVE_XMLSTRUCTUREDERRORCONTEXT
> #endif
>
> in some suitable place (probably xml.c itself, since I don't think we
> want to #include xmlversion.h in global headers)?
>
> Or actually, maybe we could just unconditionally define
> HAVE_XMLSTRUCTUREDERRORCONTEXT in pg_config.h.win32.  Is anybody
> likely to still be building PG with ancient libxml on Windows?

That works for me. Using older libxml's would be a bad idea anyway -
and there are precompiled binaries available, so it's not hard to get.


-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #7485: 9.2 beta3 libxml2 can't be loaded on Windows

2012-08-16 Thread Tom Lane
Dave Page  writes:
> So Sachin and I have looked at this but not found a problem with the
> installers. He then found that one of our colleagues has already
> reported this as an issue with PostgreSQL and submitted a possible
> patch. So, we need a committer to look at fixing this:

> http://archives.postgresql.org/pgsql-hackers/2012-06/msg00874.php

Seems pretty brute-force.  Why not just

#if LIBXML_VERSION > 20703
#define HAVE_XMLSTRUCTUREDERRORCONTEXT
#endif

in some suitable place (probably xml.c itself, since I don't think we
want to #include xmlversion.h in global headers)?

Or actually, maybe we could just unconditionally define
HAVE_XMLSTRUCTUREDERRORCONTEXT in pg_config.h.win32.  Is anybody
likely to still be building PG with ancient libxml on Windows?

regards, tom lane


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #7485: 9.2 beta3 libxml2 can't be loaded on Windows

2012-08-16 Thread Dave Page
On Wed, Aug 15, 2012 at 2:43 PM, Dave Page  wrote:
> On Wed, Aug 15, 2012 at 2:37 PM, Rikard Pavelic
>  wrote:
>> On 15.8.2012. 3:54, Craig Ringer wrote:
>>> On 08/14/2012 11:42 PM, Rikard Pavelic wrote:
 On 8.8.2012. 13:30, Craig Ringer wrote:
> On 08/08/2012 06:24 PM, rikard.pave...@zg.htnet.hr wrote:
>> The following bug has been logged on the website:
>>
>> Bug reference:  7485
>> Logged by:  Rikard Pavelic
>> Email address:  rikard.pave...@zg.htnet.hr
>> PostgreSQL version: 9.2 beta3
>> Operating system:   Windows
>> Description:
>>
>> This is still an issue as reported in
>> http://archives.postgresql.org/pgsql-bugs/2012-05/msg00231.php
>>>
> - How you installed PostgreSQL (ie the EnterpriseDB installer, from 
> source, .zip install)
 I've used EnterpriseDB installer (9.2 beta2 with this bug, and now 9.2 
 beta3)
>>>
> - The error message you are getting and the command(s) that cause it
 ERROR: could not set up XML error handler
 SQL state: 0A000
 Hint: This probably indicates that the version of libxml2 being used is 
 not compatible with the libxml2 header files that PostgreSQL was built 
 with.
 I'm calling function with type that has an XML attribute.
 I can provide a test case, but since this is just a warning of the old bug 
 I didn't.
>>>
>>> OK, I get you.
>>>
>>> Given the context of the old report this sounds like it could be an 
>>> installation/packaging issue. I'll ping the EDB folks.
>>>
>>> Which Windows is it exactly? And did you use the 32-bit or 64-bit installer?
>>>
>>> --
>>> Craig Ringer
>>>
>> Windows 7 64bit
>> PostgreSQL 9.2beta3, compiled by Visual C++ build 1600, 64-bit
>>
>
> This still seems to be an issue in the test beta4 installers rolled
> this morning. I'm looking into it.

So Sachin and I have looked at this but not found a problem with the
installers. He then found that one of our colleagues has already
reported this as an issue with PostgreSQL and submitted a possible
patch. So, we need a committer to look at fixing this:

http://archives.postgresql.org/pgsql-hackers/2012-06/msg00874.php

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] ERROR - CREATE GIST INDEX on 9.2 beta3

2012-08-16 Thread Heikki Linnakangas

On 15.08.2012 09:50, Heikki Linnakangas wrote:

On 15.08.2012 01:02, Zdeněk Jílovec wrote:

Hello,

I use PostgreSQL 9.2beta3 with PostGIS 2.0.1 and if I try create GIST
index
on column geometry(Point,2065) I get error:

test=> CREATE INDEX places_point ON places USING GIST(def_point);
ERROR: failed to re-find parent for block 18097

It works on 9.1


Hmm, I bet this is a bug in the new GiST buffering build code. There was
an earlier bug that led to "failed to re-find parent" errors that I
fixed back in May, but maybe I missed some corner case.


Zdeněk sent me the dump and instructions off-list, and I was able to 
reproduce and diagnose the bug. Many thanks for that! It was indeed a 
corner-case in the parent tracking logic.


During the build, we maintain a hash table of the parent of each page. 
The hash table is used to find the parent of a page, when a page is 
split and we have to insert the downlinks of the new pages to the 
parent. In a regular GiST insertion, we always descend the tree from the 
root to leaf, and we get the parent pointers from the stack. During a 
buffered build, we don't have such a stack available, because we can 
start the descend from a buffer in the middle of the tree. So we use the 
parent map instead.


However, the parent hash table does not track the immediate parents of 
leaf pages. That's not required, because even though we can begin the 
descent somewhere in the middle of the tree, when we descend to a leaf 
page we know the immediate parent where we came from. Not tracking the 
leaf level saves a considerable amount of memory.


But just before we descend to the leaf page, we check whether the downlink 
needs to be adjusted to accommodate the new tuple, and replace it with 
an updated tuple if so. The bug arises when updating the leaf's downlink 
splits the parent page, so that the downlink is moved to a right 
sibling. When we then descend to the leaf page, our idea of the leaf's 
parent is incorrect; the real parent is somewhere to the right of where 
we think it is.


In a normal index insert that case is covered by the logic to move right 
if the downlink is not found on the expected page. In the buffering 
build, we don't do that because we think we know exactly what the parent 
of each page is.


I committed the attached patch to fix that. With the patch, when the 
downlink is updated in the parent page, the gistbufferinginserttuples() 
function returns the block where the updated tuple was placed, so that 
when we descend to the leaf, we know the parent of the leaf correctly.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
*** a/src/backend/access/gist/gist.c
--- b/src/backend/access/gist/gist.c
***
*** 148,163  gistinsert(PG_FUNCTION_ARGS)
--- 148,169 
   * pages are released; note that new tuple(s) are *not* on the root page
   * but in one of the new child pages.
   *
+  * If 'newblkno' is not NULL, returns the block number of page the first
+  * new/updated tuple was inserted to. Usually it's the given page, but could
+  * be its right sibling if the page was split.
+  *
   * Returns 'true' if the page was split, 'false' otherwise.
   */
  bool
  gistplacetopage(Relation rel, Size freespace, GISTSTATE *giststate,
  Buffer buffer,
  IndexTuple *itup, int ntup, OffsetNumber oldoffnum,
+ BlockNumber *newblkno,
  Buffer leftchildbuf,
  List **splitinfo,
  bool markfollowright)
  {
+ 	BlockNumber blkno = BufferGetBlockNumber(buffer);
  	Page		page = BufferGetPage(buffer);
  	bool		is_leaf = (GistPageIsLeaf(page)) ? true : false;
  	XLogRecPtr	recptr;
***
*** 199,205  gistplacetopage(Relation rel, Size freespace, GISTSTATE *giststate,
  		BlockNumber oldrlink = InvalidBlockNumber;
  		GistNSN		oldnsn = 0;
  		SplitedPageLayout rootpg;
- 		BlockNumber blkno = BufferGetBlockNumber(buffer);
  		bool		is_rootsplit;
  
  		is_rootsplit = (blkno == GIST_ROOT_BLKNO);
--- 205,210 
***
*** 319,327  gistplacetopage(Relation rel, Size freespace, GISTSTATE *giststate,
  
  			for (i = 0; i < ptr->block.num; i++)
  			{
! if (PageAddItem(ptr->page, (Item) data, IndexTupleSize((IndexTuple) data), i + FirstOffsetNumber, false, false) == InvalidOffsetNumber)
  	elog(ERROR, "failed to add item to index page in \"%s\"", RelationGetRelationName(rel));
! data += IndexTupleSize((IndexTuple) data);
  			}
  
  			/* Set up rightlinks */
--- 324,342 
  
  			for (i = 0; i < ptr->block.num; i++)
  			{
! IndexTuple	thistup = (IndexTuple) data;
! 
! if (PageAddItem(ptr->page, (Item) data, IndexTupleSize(thistup), i + FirstOffsetNumber, false, false) == InvalidOffsetNumber)
  	elog(ERROR, "failed to add item to index page in \"%s\"", RelationGetRelationName(rel));
! 
! /*
!  * If this is the first inserted/updated tuple, let the caller
!  * know which page it landed on.
!  */
! if (newblkno && ItemPointerEquals(&thistup->t_tid, &(*itup)->t_tid