Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-20 Thread Chuck McDevitt
There is a problem trying to make Postgres do these things in parallel.

 

The backend code isn't thread-safe, so a multi-threaded
implementation requires quite a bit of work.

 

Using multiple processes has its own problems: the whole way locking
works equates one process with one transaction (the proc table has one
entry per process).  Processes would conflict on locks and deadlock
themselves, among many other problems.

 

It's all a good idea, but the work is probably far more than you expect.

 

Async I/O might be easier if you used pthreads, which are mostly
portable, though not to all platforms.  (Yes, they do work on Windows.)

 

From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Jeffrey Baker
Sent: 2008-10-20 22:25
To: Julius Stroffek
Cc: pgsql-hackers@postgresql.org; Dano Vojtek
Subject: Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions
wanted!

 

On Mon, Oct 20, 2008 at 12:05 PM, Julius Stroffek
<[EMAIL PROTECTED]> wrote:

Topics that seem to be of interest, most of which were already
discussed at the developers meeting in Ottawa, are
1.) parallel sorts
2.) parallel query execution
3.) asynchronous I/O
4.) parallel COPY
5.) parallel pg_dump
6.) using threads for parallel processing

[...] 

2.)
Different subtrees (or nodes) of the plan could be executed in
parallel
on different CPUs and the results of these subtrees could be
requested
either synchronously or asynchronously.


I don't see why multiple CPUs can't work on the same node of a plan.
For instance, consider a node involving a scan with an expensive
condition, like UTF-8 string length.  If you have four CPUs you can
bring to bear, each CPU could take every fourth page, computing the
expensive condition for each tuple in that page.  The results of the
scan can be retired asynchronously to the next node above.

-jwb



Re: [HACKERS] Debian no longer dumps cores?

2008-10-20 Thread tomas
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Mon, Oct 20, 2008 at 05:49:04PM -0300, Alvaro Herrera wrote:
> Hi,
> 
> My Debian system (now running Linux 2.6.26) is no longer dumping core
> files, and I can't figure out why :-(

FWIW, same happens here, out-of-the-box 2.6.26-1 vanilla Debian. Booting
with 2.6.24-1 "fixes" this. Didn't try 2.6.25-1 yet.

Regards
- -- tomás
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFI/XFUBcgs9XrR2kYRAmATAJ96SU3oNvNWeJw0VOB7RMBcL66npQCfeZ1Q
YlMSHwJ5c/XxgH3sFpDuA94=
=a1Vq
-END PGP SIGNATURE-

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-20 Thread Jeffrey Baker
On Mon, Oct 20, 2008 at 12:05 PM, Julius Stroffek
<[EMAIL PROTECTED]>wrote:

> Topics that seem to be of interest, most of which were already
> discussed at the developers meeting in Ottawa, are
> 1.) parallel sorts
> 2.) parallel query execution
> 3.) asynchronous I/O
> 4.) parallel COPY
> 5.) parallel pg_dump
> 6.) using threads for parallel processing
>
[...]

> 2.)
> Different subtrees (or nodes) of the plan could be executed in parallel
> on different CPUs and the results of these subtrees could be requested
> either synchronously or asynchronously.


I don't see why multiple CPUs can't work on the same node of a plan.  For
instance, consider a node involving a scan with an expensive condition, like
UTF-8 string length.  If you have four CPUs you can bring to bear, each CPU
could take every fourth page, computing the expensive condition for each
tuple in that page.  The results of the scan can be retired asynchronously
to the next node above.
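The striped scan described here can be sketched in miniature (a toy model, not PostgreSQL executor code; here "pages" are just lists of tuples, and the function names are hypothetical):

```python
# Sketch of the striped scan: with N workers, worker i evaluates the
# expensive condition on every N-th page, and the filtered tuples are
# retired to the node above without preserving cross-stripe order.
from concurrent.futures import ThreadPoolExecutor

def scan_striped(pages, predicate, n_workers=4):
    """Each worker takes every n_workers-th page and filters its tuples."""
    def scan_stripe(worker_id):
        out = []
        for page in pages[worker_id::n_workers]:   # pages i, i+N, i+2N, ...
            out.extend(t for t in page if predicate(t))
        return out
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = pool.map(scan_stripe, range(n_workers))
    # Flatten; order across stripes is not guaranteed in a real parallel run.
    return [t for stripe in results for t in stripe]

# Toy "expensive" condition: a string-length check, as in the post.
pages = [[("héllo",), ("hi",)], [("wörld",), ("a",)], [("xyz",)], [("née",)]]
rows = scan_striped(pages, lambda t: len(t[0]) >= 3)
```

(In CPython the threads here do not give true CPU parallelism because of the GIL; the sketch only shows the page-striping pattern.)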

-jwb


[HACKERS] pg_stat_statements in core

2008-10-20 Thread ITAGAKI Takahiro
I wrote:
> Now I'm working on storing statistics into disks on server
> shutdown. If it is impossible unless the module is in core,
> I would change my policy...

I reconsidered this part and found that pg_stat_statements needs to be
in core to write its stats to a file on server shutdown because:

  1. shared_preload_libraries are only loaded in the postmaster and backends.
  2. The postmaster should write the stats file because it is the only
     process that knows when to shut down.
  3. The postmaster cannot attach to the shared memory that contains the
     statement stats because it cannot acquire LWLocks at all.

My next plan is to put pg_stat_statements into core and treat it like
the free space map in 8.3 and earlier. The stats file would be read by
the bootstrap process and written out by the bgwriter process. All of
the code in pg_stat_statements would be merged into pgstat.c, and some
of the interface functions would be called from ExecutorRun instead of
using ExecutorRun_hook.

If there are no objections, I'll go ahead in that direction.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center





Re: [HACKERS] Window Functions: buffering strategy

2008-10-20 Thread Hitoshi Harada
2008/10/21 Heikki Linnakangas <[EMAIL PROTECTED]>:
> Hitoshi Harada wrote:
>>
>> The real problem is not how to cut off preceding rows, but how to read
>> ahead after the current row. I intend to avoid reading ahead until the
>> end of the partition only for row_number(), which doesn't need any
>> following rows. Sometimes we have to store the whole partition before
>> returning the first result and sometimes not. It depends on the function
>> category, or the function's access range. My current idea is to classify
>> the window function API into three categories, parallel to the buffering
>> strategies.
>
> Could the rows be read ahead on demand? If the window function calls
> window_getarg on a row that's not yet fetched, fetch forward to that row.

Well, it could be possible, but from my current view it would be very
complicated, perhaps impossible. So I will implement the basic approach
first, and then we can consider your approach. We will keep the API
private so that we have time to reconsider.

Regards,


-- 
Hitoshi Harada



Re: [HACKERS] Block-level CRC checks

2008-10-20 Thread Paul Schlie
Alvaro Herrera wrote:
> So this discussion died with no solution arising to the
> hint-bit-setting-invalidates-the-CRC problem.

Is there no point at which a page is logically committed to
storage, past which no mutating access may be performed?





Re: [HACKERS] Lisp as a procedural language?

2008-10-20 Thread John DeSoi


On Oct 20, 2008, at 3:00 PM, Joshua Tolley wrote:


One of the Java-as-a-procedural-language options uses RMI to get the
server talking to a separate JVM, where the actual function processing
gets done. Could a PL/Lisp work similarly (and would it be anything
approaching a good idea...)?


I think it could work, but it is hard to say how good an idea it would
be without being more familiar with the implementation details of what
it takes to create a complete procedural language.


There might be some useful ideas from SLIME
(http://common-lisp.net/project/slime/), which connects to many
different Lisp implementations to provide a Lisp IDE in Emacs.


BTW, this is Lisp's 50th birthday being celebrated today at OOPSLA.

http://www.lisp50.org/


John DeSoi, Ph.D.







[HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2008-10-20 Thread Lawrence, Ramon
We propose a patch that improves hybrid hash join's performance for
large multi-batch joins where the probe relation has skew.

 

Project name: Histojoin

Patch file: histojoin_v1.patch

 

This patch implements the Histojoin join algorithm as an optional
feature added to the standard Hybrid Hash Join (HHJ).  A flag is used to
enable or disable the Histojoin features.  When Histojoin is disabled,
HHJ acts as normal.  The Histojoin features allow HHJ to use
PostgreSQL's statistics to do skew-aware partitioning.  The basic idea
is to keep in a small in-memory hash table the build relation tuples
whose join values occur frequently in the probe relation.  This improves
the performance of HHJ by 10% to 50% for skewed data sets when multiple
batches are used.  The performance improvements of this patch can be
seen in the paper (pages 25-30) at:

 

http://people.ok.ubc.ca/rlawrenc/histojoin2.pdf
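The skew-aware partitioning idea can be sketched as follows (an illustration of the approach only, not the patch's actual C implementation; all names are hypothetical):

```python
# Build tuples whose join keys are among the probe side's most common
# values (MCVs) stay in a small in-memory table; everything else is
# hashed to batches as in a normal hybrid hash join. Probe tuples that
# hit an MCV key then join immediately, with no batch I/O.
def partition_build(build_rows, probe_mcvs, n_batches, key=lambda r: r[0]):
    in_memory = {}                          # MCV key -> build rows
    batches = [[] for _ in range(n_batches)]
    for row in build_rows:
        k = key(row)
        if k in probe_mcvs:
            in_memory.setdefault(k, []).append(row)
        else:
            batches[hash(k) % n_batches].append(row)
    return in_memory, batches

def probe_in_memory(probe_rows, in_memory, key=lambda r: r[0]):
    # Yield joined tuples for probe rows whose key is a cached MCV.
    for prow in probe_rows:
        for brow in in_memory.get(key(prow), []):
            yield brow + prow
```

Because the frequent probe keys are exactly the ones kept in memory, a skewed probe stream resolves most of its tuples without touching the spilled batches, which is where the reported 10%-50% gain comes from.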

 

All generators and materials needed to verify these results can be
provided.

 

This is a patch against the HEAD of the repository.

 

This patch does not contain platform specific code.  It compiles and has
been tested on our machines in both Windows (MSVC++) and Linux (GCC).

 

Currently the Histojoin feature is enabled by default and is used
whenever HHJ is used and Most Common Value (MCV) statistics are
available on the probe-side base relation of the join.  To disable the
feature, simply set the enable_hashjoin_usestatmcvs flag to off in the
database configuration file or at run time with the SET command.

 

One potential improvement not included in the patch: Most Common
Value (MCV) statistics are only determined when the probe relation is
produced by a scan operator.  There is a benefit to using MCVs even when
the probe relation is not a base scan, but we were unable to determine
how to retrieve statistics for a base relation after other operators
have been applied.

 

This patch was created by Bryce Cutt as part of his work on his M.Sc.
thesis.

 

--

Dr. Ramon Lawrence

Assistant Professor, Department of Computer Science, University of
British Columbia Okanagan

E-mail: [EMAIL PROTECTED]  

 



histojoin_v1.patch
Description: histojoin_v1.patch



Re: [HACKERS] Re: [COMMITTERS] pgsql: Properly access a buffer's LSN using existing access macros

2008-10-20 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Hmm, I just noticed this comment in bufpage.h (which was also in Jonah's
> patch) :-(

> typedef struct PageHeaderData
> {
> /* XXX LSN is member of *any* block, not only page-organized ones */
> ...

We don't have any non-page-organized blocks ;-)

Seriously, if the issue ever became significant I'd expect that we'd add
a buffer flag to tell whether the buffer had a LSN or not, and set this
appropriately at buffer readin time depending on where we got the page
from.  That would then cue the write logic to know what to do.  (In
particular, I suppose we'd need to do this if we try to migrate clog and
friends into the main buffer arena.)

regards, tom lane



Re: [GENERAL] [HACKERS] Hot Standby utility and administrator functions

2008-10-20 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> On Mon, 2008-10-20 at 17:44 -0300, Alvaro Herrera wrote:
>> That's been "extended with an epoch counter" per the docs; I don't think
>> that's appropriate for the new functions, is it?

> I assumed it was, so you can subtract them easily. 

> It can be done either way, I guess. Happy to provide what people need. I
> just dreamed up a few that sounded useful.

I don't think you should be inventing new functions without clear
use-cases in mind.  Depending on what the use is, I could see either the
xid or the txid definition as being *required*.

In any case, do not use the wrong return type for the definition you're
implementing.

regards, tom lane



Re: [HACKERS] Debian no longer dumps cores?

2008-10-20 Thread Greg Smith

On Mon, 20 Oct 2008, Alvaro Herrera wrote:


My Debian system (now running Linux 2.6.26) is no longer dumping core
files, and I can't figure out why :-(


My guess is that you're being nailed by one of the changes related to 
implementing the improved capabilities interface made in 2.6.25 or 2.6.26. 
I know that broke some libpcap versions for example: 
http://lkml.org/lkml/2008/4/22/18


I haven't upgraded a Debian system to a kernel that new yet myself to
know specifically what's wrong, but the blog posting at
http://linux-man-pages.blogspot.com/2008/05/capabilities-have-fully-arrived-finally.html
is a good starting point to dig for more info.  If I had to make a guess
I'd suspect whatever is dumping core doesn't have the CAP_SYS_RAWIO
capability needed to access /proc/kcore, and you need to tweak a file
related to that area rather than the ulimit stuff.


Also worth mentioning is that you may need to adjust 
/proc/sys/fs/suid_dumpable (see 
http://manpages.courier-mta.org/htmlman5/proc.5.html ) and that there have 
been distributions that just broke even the CAP_SYS_RAWIO interface at the 
kernel level; see http://vir.homelinux.org/blog/archives/30-Lost-time.html 
for an example.
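The settings mentioned in this thread can be checked from one small script along these lines (Linux-specific paths; purely an illustration, the function name is hypothetical):

```python
# Gather the usual suspects that decide whether a core file appears on
# Linux: the RLIMIT_CORE resource limit (what `ulimit -c` sets), the
# suid_dumpable sysctl, and the core_pattern target.
import os
import resource

def core_dump_settings():
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    settings = {"rlimit_core_soft": soft, "rlimit_core_hard": hard}
    for path in ("/proc/sys/fs/suid_dumpable",
                 "/proc/sys/kernel/core_pattern"):
        try:
            with open(path) as f:
                settings[os.path.basename(path)] = f.read().strip()
        except OSError:
            settings[os.path.basename(path)] = None  # not available here
    return settings

print(core_dump_settings())
```

A soft limit of 0 means no core is written regardless of the kernel-side settings; resource.RLIM_INFINITY corresponds to `ulimit -c unlimited`.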


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] Re: [COMMITTERS] pgsql: Properly access a buffer's LSN using existing access macros

2008-10-20 Thread Jonah H. Harris
On Mon, Oct 20, 2008 at 5:23 PM, Alvaro Herrera
<[EMAIL PROTECTED]> wrote:
> Hmm, I just noticed this comment in bufpage.h (which was also in Jonah's
> patch) :-(
>
> typedef struct PageHeaderData
> {
>/* XXX LSN is member of *any* block, not only page-organized ones */

Passed regressions and several benchmarks for me.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com



[HACKERS] Re: [COMMITTERS] pgsql: Properly access a buffer's LSN using existing access macros

2008-10-20 Thread Alvaro Herrera
Alvaro Herrera wrote:
> Log Message:
> ---
> Properly access a buffer's LSN using existing access macros instead of abusing
> knowledge of page layout.

Hmm, I just noticed this comment in bufpage.h (which was also in Jonah's
patch) :-(

typedef struct PageHeaderData
{
/* XXX LSN is member of *any* block, not only page-organized ones */
...


So I'm now wondering if the above patch is really correct.  It'll of
course work, but is it appropriate?

-- 
Alvaro Herrera                      http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [GENERAL] [HACKERS] Hot Standby utility and administrator functions

2008-10-20 Thread Simon Riggs

On Mon, 2008-10-20 at 17:44 -0300, Alvaro Herrera wrote:
> Simon Riggs escribió:
> > 
> > On Mon, 2008-10-20 at 16:22 -0400, Robert Haas wrote:
> > > > * pg_last_recovered_xact_xid()
> > > > Will throw an ERROR if *not* executed in recovery mode.
> > > > returns bigint
> > > >
> > > > * pg_last_completed_xact_xid()
> > > > Will throw an ERROR *if* executed in recovery mode.
> > > > returns bigint
> > > 
> > > Should these return xid?
> > 
> > Perhaps, but they match txid_current() which returns bigint.
> > http://developer.postgresql.org/pgdocs/postgres/functions-info.html
> 
> That's been "extended with an epoch counter" per the docs; I don't think
> that's appropriate for the new functions, is it?

I assumed it was, so you can subtract them easily. 

It can be done either way, I guess. Happy to provide what people need. I
just dreamed up a few that sounded useful.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] SSL cleanups/hostname verification

2008-10-20 Thread Magnus Hagander
Robert Haas wrote:
>>> How can you make that the default?  Won't it immediately break every
>>> installation without certificates?
>> *all* SSL installations have certificate on the server side. You cannot
>> run without it.
> 
> s/without certificates/with self-signed certificates/
> 
> which I would guess to be a common configuration

Self-signed certificates still work. In a self-signed scenario, the
server certificate *is* the CA certificate.

//Magnus




[HACKERS] Debian no longer dumps cores?

2008-10-20 Thread Alvaro Herrera
Hi,

My Debian system (now running Linux 2.6.26) is no longer dumping core
files, and I can't figure out why :-(

Of course, I've set ulimit -c to unlimited, and I'm running the
postmaster directly in the same shell (no pg_ctl or init scripts), but
it's still not working.  I'm not sure where else to look; Google
searches return tons of junk but nothing useful.

The kernel was compiled by me to add oprofile support, though I used
Debian's .config to generate the base config, and changed nothing else
(no time to fiddle).

If anybody has any idea of what's going on I'd appreciate it.

-- 
Alvaro Herrera                      http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] SSL cleanups/hostname verification

2008-10-20 Thread Robert Haas
>> How can you make that the default?  Won't it immediately break every
>> installation without certificates?
>
> *all* SSL installations have a certificate on the server side. You
> cannot run without it.

s/without certificates/with self-signed certificates/

which I would guess to be a common configuration

...Robert



Re: [GENERAL] [HACKERS] Hot Standby utility and administrator functions

2008-10-20 Thread Alvaro Herrera
Simon Riggs escribió:
> 
> On Mon, 2008-10-20 at 16:22 -0400, Robert Haas wrote:
> > > * pg_last_recovered_xact_xid()
> > > Will throw an ERROR if *not* executed in recovery mode.
> > > returns bigint
> > >
> > > * pg_last_completed_xact_xid()
> > > Will throw an ERROR *if* executed in recovery mode.
> > > returns bigint
> > 
> > Should these return xid?
> 
> Perhaps, but they match txid_current() which returns bigint.
> http://developer.postgresql.org/pgdocs/postgres/functions-info.html

That's been "extended with an epoch counter" per the docs; I don't think
that's appropriate for the new functions, is it?

-- 
Alvaro Herrera                      http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] HEAD is broken - transam.c

2008-10-20 Thread Alvaro Herrera
Zdenek Kotala wrote:
> Last commit broken transam.c function definitions:
>
>
> "transam.c", line 268: void function cannot return value
> "transam.c", line 280: void function cannot return value
> cc: acomp failed for transam.c

Interesting ... my compiler did not even complain.  Fixed, thanks for
the notice.

-- 
Alvaro Herrera                      http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



[HACKERS] WIP patch to add "placeholder" variables to planner

2008-10-20 Thread Tom Lane
Per earlier discussion,
http://archives.postgresql.org/pgsql-hackers/2008-10/msg00853.php
I've been fooling around with a patch to let the planner evaluate
some expressions at lower join levels and bubble the results up
like Vars.  I've got it passing the regression tests now, and though
there is more left to do I thought it'd be worth posting for comment.

One thing I'm not totally satisfied with is that I've got each instance
of a PlaceHolderVar carrying a copy of the represented expression.
This seems a bit inefficient, but it's difficult to get rid of it
without making things a lot more fragile.  There are a lot of properties
of an expression tree (for example, whether it contains any volatile
functions) that are currently extracted on-demand using recursive tree
walkers.  So we'd need walkers to be able to descend through a
PlaceHolderVar in any case, and once you buy into that it's hard to
not say the same for mutators, so there can't just be a single shared
copy of the expression.  The only significant drawback I've found is
that if the expression contains a sub-SELECT you end up with some
useless extra copies of the resulting subplan.  Which is annoying but
it isn't really going to cost anything at runtime, and the case seems
unlikely to occur much in practice anyhow.

Comments anyone?

regards, tom lane



binQdBKKQuMGj.bin
Description: placeholders-1.patch.gz



Re: [HACKERS] Hot Standby utility and administrator functions

2008-10-20 Thread Simon Riggs

On Mon, 2008-10-20 at 16:22 -0400, Robert Haas wrote:
> > * pg_last_recovered_xact_xid()
> > Will throw an ERROR if *not* executed in recovery mode.
> > returns bigint
> >
> > * pg_last_completed_xact_xid()
> > Will throw an ERROR *if* executed in recovery mode.
> > returns bigint
> 
> Should these return xid?

Perhaps, but they match txid_current() which returns bigint.
http://developer.postgresql.org/pgdocs/postgres/functions-info.html
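The "extended with an epoch counter" format is what makes txids subtract cleanly; ignoring the reserved xids that a real wraparound skips, the arithmetic looks like this (a simplified sketch, not PostgreSQL source):

```python
# txid_current() returns a bigint of the form (epoch << 32) | xid, where
# the epoch counts xid wraparounds. The combined value keeps increasing
# monotonically, so subtraction still measures distance across a
# wraparound, which a raw 32-bit xid comparison would get backwards.
def to_txid(epoch, xid):
    return (epoch << 32) | xid

before_wrap = to_txid(0, 2**32 - 10)   # shortly before wraparound
after_wrap  = to_txid(1, 5)            # shortly after

assert after_wrap > before_wrap        # ordering survives the wrap
assert after_wrap - before_wrap == 15  # distance is still meaningful
```

With plain xid return values, the same pair would compare as 5 < 2**32 - 10, which is why Simon's "so you can subtract them easily" argument favors the epoch-extended bigint.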

Thanks for checking.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Patch status for reducing de-TOAST overhead?

2008-10-20 Thread Tom Lane
"Mark Cave-Ayland" <[EMAIL PROTECTED]> writes:
> I'm just following up on the patch created here by Tom to aid with
> repeated de-TOASTing attempts:
> http://archives.postgresql.org/pgsql-hackers/2008-06/msg01096.php

> Given the performance report from Jeff
> (http://archives.postgresql.org/pgsql-hackers/2008-08/msg01178.php), is
> there still scope for getting something like this in 8.4 - or this patch
> still far shy of anything ready for a commit fest?

I'm afraid nothing's likely to get done about that for 8.4 ... the
existing patch doesn't seem satisfactory enough to apply, and we're
running out of time to do anything more extensive.

regards, tom lane



[HACKERS] HEAD is broken - transam.c

2008-10-20 Thread Zdenek Kotala

The last commit broke transam.c's function definitions:


"transam.c", line 268: void function cannot return value
"transam.c", line 280: void function cannot return value
cc: acomp failed for transam.c


Detail 
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=gothic_moth&dt=2008-10-20%2020:06:01



Zdenek



--
Zdenek Kotala  Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql




Re: [HACKERS] Hot Standby utility and administrator functions

2008-10-20 Thread Robert Haas
> * pg_last_recovered_xact_xid()
> Will throw an ERROR if *not* executed in recovery mode.
> returns bigint
>
> * pg_last_completed_xact_xid()
> Will throw an ERROR *if* executed in recovery mode.
> returns bigint

Should these return xid?

...Robert



Re: [HACKERS] Subtransaction commits and Hot Standby

2008-10-20 Thread Simon Riggs

On Mon, 2008-10-20 at 16:23 -0300, Alvaro Herrera wrote:
> Simon Riggs wrote:
> 
> > Renamed to set_status_by_pages because we never use this on the whole
> > tree. Added comments to say that.
> > 
> > Overall, cleaner and more readable now. Thanks.
> 
> Committed, thanks.

Cheers.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Index use during Hot Standby

2008-10-20 Thread Simon Riggs

On Mon, 2008-10-20 at 21:11 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > OK, I think I've found a problem.
> > 
> > In _bt_insertonpg(), if we split we do _bt_split() then do
> > _bt_insert_parent(), which then does _bt_insertonpg() recursively.
> > 
> > _bt_split() writes a WAL record but continues holding write locks.
> > btree_xlog_split() reads WAL record and does *not* continue to hold
> > write locks. So recovery locking differs from Lehman & Yao requirements
> > at that point.
> 
> Hmm. I don't have Lehman & Yao's paper at hand, but I fail to see what 
> would go wrong.
> 
> Recovery of a split works like this:
> 
> 1. Reconstruct new right sibling from scratch. Keep locked
> 2. Update old page (= new left sibling). Keep locked
> 3. Release locks on both pages.
> 4. Update the left-link of the page to the right of the new right sibling.
> 
> Searches descending work just fine without the pointer in the parent 
> page to the new right sibling, just slower because they will always land 
> on the left sibling, and might have move right from there. Searchers 
> moving from left to right work fine; they will see either the old page, 
> or both the new left and right sibling. Searchers moving right to left 
> will likewise work; they will see either the old page, or the new right, 
> then left page, or between steps 3 and 4, they will move to the left 
> page, see that the right-link doesn't point to the page it came from, 
> and move right to the new right sibling.
> 
> All that works just like during normal operation, so I don't actually 
> understand why L&Y requires that you keep the split pages locked until 
> you've locked the parent. Maybe it's needed to handle concurrent inserts 
> or splits, but there can't be any during WAL replay.

I think you're right to question that. I was happy to say "locking must
be identical", which is correct, but the assumptions are different in
recovery, as you point out. The details you cite are not as important as
the realisation that we need to get concurrency correct from the
perspective of only a single inserter and many readers. I'd overlooked
that basic assumption, but it's important we clearly state it:
"Recovery operations are serialized, and therefore recovery operations
need not fully emulate the locking required for multiple concurrent
writers."

Grokking the paper suggests to me you are correct and that the double
locking can only be required for correct ordering of page split
operations. It clearly isn't needed at all for the correctness proof
with a single inserter on p.350.

So updating blocks for a page split on any one level is performed by a
single WAL record and therefore atomic. None of the things I worried
about earlier need addressing, so we're back to where I was this
morning: no changes required for correct concurrency behaviour.

Thanks everybody for a valuable discussion.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Window Functions: buffering strategy

2008-10-20 Thread Heikki Linnakangas

Hitoshi Harada wrote:

The real problem is not how to cut off preceding rows, but how to read
ahead after the current row. I intend to avoid reading ahead until the
end of the partition only for row_number(), which doesn't need any
following rows. Sometimes we have to store the whole partition before
returning the first result and sometimes not. It depends on the function
category, or the function's access range. My current idea is to classify
the window function API into three categories, parallel to the buffering
strategies.


Could the rows be read ahead on demand? If the window function calls 
window_getarg on a row that's not yet fetched, fetch forward to that row.
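The on-demand fetch-forward idea might look like this in miniature (the names WindowBuffer and get_row are illustrative stand-ins, not the actual window-function API under discussion):

```python
# A partition buffer that pulls rows from the underlying plan only when
# a window function asks for a row it hasn't seen yet. row_number()
# would never look ahead, so nothing beyond the current row is fetched;
# lead() would trigger fetch-forward exactly as far as it needs.
class WindowBuffer:
    def __init__(self, source_iter):
        self.source = source_iter
        self.rows = []          # rows fetched so far
        self.exhausted = False

    def get_row(self, i):
        """Return row i of the partition, fetching forward as needed."""
        while len(self.rows) <= i and not self.exhausted:
            try:
                self.rows.append(next(self.source))
            except StopIteration:
                self.exhausted = True
        return self.rows[i] if i < len(self.rows) else None

buf = WindowBuffer(iter(range(10, 15)))
assert buf.get_row(3) == 13     # forces rows 0..3 to be fetched
assert len(buf.rows) == 4       # nothing beyond row 3 was read
```

The complication Hitoshi alludes to is that the real buffer also spills to a tuplestore and must cooperate with cutting off no-longer-needed preceding rows, which this sketch ignores.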



And for lag()/lead(), the spec says "OFFSET is an exact numeric
literal", but PostgreSQL doesn't have a mechanism to limit a function
argument's data type to a constant integer only. So the OFFSET for
lag()/lead() may be a dynamic integer variable. In that case, even those
functions don't know how many rows should be cut off; lag()/lead() can
access any row of the partition, per the spec.


Hmm, yeah, that's a bit of a problem :-(.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Subtransaction commits and Hot Standby

2008-10-20 Thread Alvaro Herrera
Simon Riggs wrote:

> Renamed to set_status_by_pages because we never use this on the whole
> tree. Added comments to say that.
> 
> Overall, cleaner and more readable now. Thanks.

Committed, thanks.

-- 
Alvaro Herrera                      http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



[HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-20 Thread Julius Stroffek

Hi All,

we would like to start some work on improving the performance of
PostgreSQL in a multi-CPU environment. Dano Vojtek is a student at the
Faculty of Mathematics and Physics of Charles University in Prague
(http://www.mff.cuni.cz) and is going to cover this topic in his
master's thesis. He is going to investigate the methods, write down the
possibilities, and then implement some of them for PostgreSQL.

We want to come out with a serious proposal for this work after
collecting feedback and opinions and doing a more thorough investigation.

Topics that seem to be of interest, most of which were already
discussed at the developers meeting in Ottawa, are
1.) parallel sorts
2.) parallel query execution
3.) asynchronous I/O
4.) parallel COPY
5.) parallel pg_dump
6.) using threads for parallel processing

Scaling with an increasing number of CPUs in 1.) and 2.) will hit the
I/O bottleneck at some point, and the benefit gained there should be
nearly the same as for 3.): the OS or disk can do a better job of
scheduling multiple reads from disk for the same query at the same time.

1.)
More merges could be executed on different CPUs. However, one N-way
merge on one CPU is probably better than two N/2-way merges on two CPUs
that share the same work_mem limit. This is specific to and separate
from 2.) or 3.), and if something is implemented here it could probably
share just the parallel infrastructure code.


2.)
Different subtrees (or nodes) of the plan could be executed in parallel
on different CPUs and the results of this subtrees could be requested
either synchronously or asynchronously.


3.)
The simplest possible way is to change the scan nodes so that they send
out asynchronous I/O requests for the next blocks before they run out of
tuples in the block they are currently going through. The more advanced
way would arise naturally from implementing 2.), which would lead to
different scan nodes being executed on different CPUs at the same time.


4.) and 5.)
We do not want to focus here, since there are on-going projects already.


6.)
Currently, threads are not used in PostgreSQL (except in some cases on
Windows). Using them generally would bring some problems:


a) different thread implementations on different OSes
b) crash of the whole process if the problem happens in one thread.
Backends are isolated and the problem in one backend leads to the
graceful shut down of other backends.
c) synchronization problems

* a) seems to be mostly an implementation matter. Is there any problem
with running multiple threads on any supported OS? For example, a
scheduling issue where all the threads of one process end up scheduled
on the same CPU? Or something similar?


* b) is fine if more threads are used only for processing the same query
in the same backend - if one crashes, the others could still do the
graceful shutdown.


* c) does not have to be solved in general, because the work of all the
threads will be coordinated and we can predict fairly well which data
are being accessed by which thread. Memory allocation has to be made
thread-safe without hurting performance (is a separate memory context
for each thread sufficient?). Other common code might need some changes
as well. Possibly, the synchronization/critical-section exclusion could
be done in the executor, and only where needed.


* Using processes instead of threads makes other things more complex
  - sharing objects between processes might need much more coding
  - more overhead during execution and synchronization


It seems to us that it makes sense to start working on 2.) and 3.), and
we would like to consider using more threads for processing the same
query within one backend.

We appreciate feedback, comments and/or suggestions.

Cheers

Julo




[HACKERS] Patch status for reducing de-TOAST overhead?

2008-10-20 Thread Mark Cave-Ayland
Hi everyone,

I'm just following up on the patch created here by Tom to aid with
repeated de-TOASTing attempts:
http://archives.postgresql.org/pgsql-hackers/2008-06/msg01096.php

Given the performance report from Jeff
(http://archives.postgresql.org/pgsql-hackers/2008-08/msg01178.php), is
there still scope for getting something like this into 8.4, or is this
patch still far shy of anything ready for a commit fest? I have a
feeling that working on this is currently beyond my own capability :(


ATB,

Mark.

-- 
Mark Cave-Ayland
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk
T: +44 870 608 0063




Re: [HACKERS] Block-level CRC checks

2008-10-20 Thread Heikki Linnakangas

Markus Wanner wrote:

Alvaro Herrera wrote:

So this discussion died with no solution arising to the
hint-bit-setting-invalidates-the-CRC problem.


Doesn't double-buffering solve this issue? Has somebody checked whether
it even helps performance, due to being able to release the lock on the
buffer *before* the syscall?


Double-buffering helps with the hint bit issues within shared buffers, 
but doesn't help with the torn page and hint bits problem. The torn page 
problem seems to be the show-stopper.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Lisp as a procedural language?

2008-10-20 Thread Joshua Tolley
On Mon, Oct 20, 2008 at 12:56 PM, John DeSoi <[EMAIL PROTECTED]> wrote:
>
> On Oct 19, 2008, at 1:27 PM, Douglas McNaught wrote:
>
>> SBCL is a big and very sophisticated program.  It's designed to be a
>> self-contained Lisp system and has (AFAIK) no concessions to
>> "embeddability".  It uses threads internally, and plays games with the
>> memory map to make GC more efficient.  Only a small part of it is
>> written in C, and the rest is Lisp compiled directly to binary. It
>> would almost certainly be a heroic project to make it coexist with a
>> PostgreSQL backend process--like Java, but much worse.
>>
>> It's not likely that any of the "serious" Common Lisp systems would be
>> easily embedded in Postgres.
>
>
> Probably the ideal implementation would be ECL:
>
> http://ecls.sourceforge.net/
>
> It is designed to be a full Common Lisp implementation that can be easily
> embedded in other environments.
>
> It generates C source code so you could have the option of developing with
> Lisp and then generating C language functions for additional speed or source
> code security.
>
> Not open source, but I've played around a bit with integrating LispWorks to
> get Lisp a procedural language.
>
> I'd like to see Lisp as a procedural language, but I'm not very proficient
> with C. If anyone is interested in leading the way, I would be happy to
> help.
>
>
> John DeSoi, Ph.D.
>

One of the Java-as-a-procedural-language options uses RMI to get the
server talking to a separate JVM, where the actual function processing
gets done. Could a PL/Lisp work similarly (and would it be anything
approaching a good idea...)?

- Josh / eggyknap



Re: [HACKERS] Lisp as a procedural language?

2008-10-20 Thread John DeSoi


On Oct 19, 2008, at 1:27 PM, Douglas McNaught wrote:


SBCL is a big and very sophisticated program.  It's designed to be a
self-contained Lisp system and has (AFAIK) no concessions to
"embeddability".  It uses threads internally, and plays games with the
memory map to make GC more efficient.  Only a small part of it is
written in C, and the rest is Lisp compiled directly to binary. It
would almost certainly be a heroic project to make it coexist with a
PostgreSQL backend process--like Java, but much worse.

It's not likely that any of the "serious" Common Lisp systems would be
easily embedded in Postgres.



Probably the ideal implementation would be ECL:

http://ecls.sourceforge.net/

It is designed to be a full Common Lisp implementation that can be  
easily embedded in other environments.


It generates C source code so you could have the option of developing  
with Lisp and then generating C language functions for additional  
speed or source code security.


Not open source, but I've played around a bit with integrating  
LispWorks to get Lisp a procedural language.


I'd like to see Lisp as a procedural language, but I'm not very  
proficient with C. If anyone is interested in leading the way, I would  
be happy to help.



John DeSoi, Ph.D.







Re: [HACKERS] SSL cleanups/hostname verification

2008-10-20 Thread Magnus Hagander
Tom Lane wrote:
> Magnus Hagander <[EMAIL PROTECTED]> writes:
>> Attached patch cleans up the certificate verification in libpq, and adds
>> a configuration parameter to control it. The new parameter is
>> "sslverify", and can be set to:
> 
>> * cn = default = will validate that the certificate chains to a trusted
>> root, *and* that the cn on the certificate matches the hostname
>> specified in the connection. This is the only option that prevents
>> man-in-the-middle attacks completely, and is therefore the default.
> 
> How can you make that the default?  Won't it immediately break every
> installation without certificates?

*all* SSL installations have a certificate on the server side. You
cannot run without one.

And obviously the setting only has effect if you are actually running
over SSL.

> The patch seems pretty far short of sufficient as far as supporting a
> new conninfo option goes --- for instance it appears to leak the string
> at disconnect.  Check through all the references to some existing option
> field to see if you missed anything else.

Hmm. yeah, I hadn't finished that part - and promptly forgot about that
:S Will look it over again.

//Magnus



Re: [HACKERS] Index use during Hot Standby

2008-10-20 Thread Heikki Linnakangas

Simon Riggs wrote:

OK, I think I've found a problem.

In _bt_insertonpg(), if we split we do _bt_split() then do
_bt_insert_parent(), which then does _bt_insertonpg() recursively.

_bt_split() writes a WAL record but continues holding write locks.
btree_xlog_split() reads WAL record and does *not* continue to hold
write locks. So recovery locking differs from Lehman & Yao requirements
at that point.


Hmm. I don't have Lehman & Yao's paper at hand, but I fail to see what 
would go wrong.


Recovery of a split works like this:

1. Reconstruct new right sibling from scratch. Keep locked
2. Update old page (= new left sibling). Keep locked
3. Release locks on both pages.
4. Update the left-link of the page to the right of the new right sibling.

Searches descending work just fine without the pointer in the parent 
page to the new right sibling, just slower because they will always land 
on the left sibling, and might have to move right from there. Searchers 
moving from left to right work fine; they will see either the old page, 
or both the new left and right sibling. Searchers moving right to left 
will likewise work; they will see either the old page, or the new right, 
then left page, or between steps 3 and 4, they will move to the left 
page, see that the right-link doesn't point to the page it came from, 
and move right to the new right sibling.


All that works just like during normal operation, so I don't actually 
understand why L&Y requires that you keep the split pages locked until 
you've locked the parent. Maybe it's needed to handle concurrent inserts 
or splits, but there can't be any during WAL replay.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Block level concurrency during recovery

2008-10-20 Thread Teodor Sigaev

But does holding cleanup lock on root prevent an in-progress Insert from
changing non-root pages? I assume so, just not sure how.


Yes, because the insertion process doesn't unpin the root page until the insert is done.
--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] Block level concurrency during recovery

2008-10-20 Thread Simon Riggs

On Mon, 2008-10-20 at 20:12 +0400, Teodor Sigaev wrote:
> > I don't understand why in ginVacuumPostingTreeLeaves() we lock only the
> > root page for Cleanup and no others. Why do we need to hold Cleanup lock
> > on the root? If the index is concurrent safe for existing scans, why is
> > it not safe for new scans also? And the converse: if it is not safe for
> > new scans, why is it safe for existing scans? 
> 
> Because we wish to prevent concurrent inserts and page deletion, just to
> simplify the code. LockForCleanup guarantees that no insertion process is
> at work here (it keeps the root buffer pinned for the whole insertion).
> As a side effect, new scan processes can't start.

But does holding cleanup lock on root prevent an in-progress Insert from
changing non-root pages? I assume so, just not sure how.

> Note, in most cases this keeps enough concurrency, because all of that
> concerns work on one tree in the GIN index. Usually there are a lot of
> such trees in an index - one for each lexeme, if we speak about a tsearch
> index. So there is room for improvement, but I don't believe it will give
> a big performance advantage in typical GIN usage.

I'm just worried about safety during Hot Standby, not trying to improve
anything.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Index use during Hot Standby

2008-10-20 Thread Simon Riggs

On Mon, 2008-10-20 at 16:11 +0100, Simon Riggs wrote:
> On Mon, 2008-10-20 at 18:24 +0400, Teodor Sigaev wrote:
> > > 3. Implement an extra indexAM API call that allows indexAM to decide
> > > when/if index is valid during recovery. This would also cover the second
> > > concern neatly in a single API call.
> > > 
> > > wait until after deadline to implement (2) or (3), in case somebody
> > > fixes this up in the next few weeks.
> > > 
> > 
> > IMHO, without page locking in recovery mode, btree and GIN are not usable
> > while an incomplete split exists - there is a disconnected branch in the tree.
> 
> Now
> you may be right, that we do not have the correct locking for
> concurrency in the splitting algorithms. But I haven't seen any issue
> yet myself. But I will look again. 

OK, I think I've found a problem.

In _bt_insertonpg(), if we split we do _bt_split() then do
_bt_insert_parent(), which then does _bt_insertonpg() recursively.

_bt_split() writes a WAL record but continues holding write locks.
btree_xlog_split() reads WAL record and does *not* continue to hold
write locks. So recovery locking differs from Lehman & Yao requirements
at that point.

So to make this concurrent safe, we must keep holding the locks in
btree_xlog_split() and record details of that as part of the the
log_incomplete_split process. That means skipping UnlockReleaseBuffer()
at the end of btree_xlog_split(). Those buffers would remain pinned and
locked.

If btree_xlog_insert() is called on a non-leaf node we will then get the
details of blocks still holding write locks from the incomplete split
data. Those locks could be held awhile if the incomplete split process
is screwed, but it would make this concurrent safe.

I guess we'd use the same technique for GIN. ginInsertValue() ??
Hmm, you release the lock at line 412, ginbtree.c before you get the
parent lock at line 428. That seems different from the L&Y interactions.
Am I looking in the wrong place?

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] SQL:2008 LIMIT/OFFSET

2008-10-20 Thread Tom Lane
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> BTW, I think it's a bad idea to assign made-up parse locations, as
>> you did here:

> Hmm, @$ is the location of the complete rule, so it should point to the
> "empty" spot in theory.  Or am I misunderstanding something?

Well, yeah, but what is that?  If you did get an error complaining
about, say, an invalid integral constant, the cursor wouldn't be
pointing at anything sensible.

I'm not even very sure that bison would produce a valid offset at all in
this case; it looks to me like the location macro just copies a value
that might not have been initialized.  Did you test what gets produced?

But even stipulating that the cursor would point at the next or previous
token, it seems it'd be more confusing than useful.

regards, tom lane



Re: [HACKERS] Block level concurrency during recovery

2008-10-20 Thread Teodor Sigaev

I don't understand why in ginVacuumPostingTreeLeaves() we lock only the
root page for Cleanup and no others. Why do we need to hold Cleanup lock
on the root? If the index is concurrent safe for existing scans, why is
it not safe for new scans also? And the converse: if it is not safe for
new scans, why is it safe for existing scans? 


Because we wish to prevent concurrent inserts and page deletion, just to
simplify the code. LockForCleanup guarantees that no insertion process is at
work here (it keeps the root buffer pinned for the whole insertion). As a
side effect, new scan processes can't start.


Note, in most cases this keeps enough concurrency, because all of that
concerns work on one tree in the GIN index. Usually there are a lot of such
trees in an index - one for each lexeme, if we speak about a tsearch index.
So there is room for improvement, but I don't believe it will give a big
performance advantage in typical GIN usage.


--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] Index use during Hot Standby

2008-10-20 Thread Teodor Sigaev
One more thing about GiST - when the database is switched from recovery
mode to normal mode, the insertion in GiST needs to be completed and,
possibly, the index vacuumed. Dig around
GistBulkDeleteResult->needFullVacuum and gistContinueInsert()


OK, will look at those. (Problems++).
Not a very big issue: in this case the index is still in a workable
state, just not very efficient until it is vacuumed.


--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] Window Functions: buffering strategy

2008-10-20 Thread Hitoshi Harada
2008/10/20 Heikki Linnakangas <[EMAIL PROTECTED]>:
> Hitoshi Harada wrote:
>>
>> Hi,
>>
>> 2008/10/20 Simon Riggs <[EMAIL PROTECTED]>:
>>>
>>> On Mon, 2008-10-20 at 10:32 +0900, Hitoshi Harada wrote:
>>>
 So I propose three Window node buffering strategies,
 row/frame/partition buffering so as to avoid unnecessary row
 buffering.
>>>
>>> Sounds good from here. Can I suggest you release the code in phases?
>>>
>>> It would be better if we got just one of those done (row?), than to
>>> attempt all 3 and end up with none because of emerging details.
>>
>> Thank you for your feedback.
>> Ok, actually the first will be partition buffering, because it covers
>> the requirements of all the functions. Row buffering is a kind of
>> optimization for special cases.
>
> The thought I had during the last commit fest was that the function would
> call a callback, something like window_forget(pos), that would tell the
> system that it can release any rows before the given position. That way you
> only need one method, and it'd also be optimal for functions like lag(),
> which doesn't fit perfectly into either the row or frame buffering category.
> Or do we need the information at plan-time?
>

Right. In the last commit fest we discussed a run-time cut-off signal
API. But I have finally concluded that we must know how to buffer
*before* any execution.

The real problem is not how to cut off preceding rows, but how far to
read ahead after the current row. I intend to avoid reading ahead to the
end of the partition only for functions like row_number() that don't
need any following rows. Sometimes we have to store the whole partition
before returning the first result, and sometimes not. It depends on the
function's category, or its access range. My current idea is to classify
the Window function API into three categories, parallel to the buffering
strategies.

As for lag()/lead(), the spec says "OFFSET is an exact numeric literal",
but PostgreSQL doesn't have a mechanism to limit a function argument's
data type to a Const integer only. So I am thinking the OFFSET for
lag()/lead() may be a dynamic integer variable. If so, even those
functions can't know how many rows could be cut off. Per the spec,
lag()/lead() can access any row of the partition.

Regards,


-- 
Hitoshi Harada



Re: [HACKERS] contrib/pg_stat_statements

2008-10-20 Thread Martin Pihlak
ITAGAKI Takahiro wrote:
> I'd like to submit pg_stat_statements contrib module, that counts up
> incoming statements in shared memory and summarizes the result as a view.
> It is just a statements-version of pg_stat_user_functions.
> 
Nice work! There is one feature I'd like to request -- we need to be able
to also track nested statements. This would greatly simplify diagnosing
performance problems in complex stored procedures. Perhaps the GUC
track_statements could be made an enum - none, top, all?

PS. Similar feature would be useful for auto-explain as well.

regards,
Martin



Re: [HACKERS] Index use during Hot Standby

2008-10-20 Thread Simon Riggs

On Mon, 2008-10-20 at 18:24 +0400, Teodor Sigaev wrote:
> > 3. Implement an extra indexAM API call that allows indexAM to decide
> > when/if index is valid during recovery. This would also cover the second
> > concern neatly in a single API call.
> > 
> > wait until after deadline to implement (2) or (3), in case somebody
> > fixes this up in the next few weeks.
> > 
> 
> IMHO, without page locking in recovery mode, btree and GIN are not usable
> while an incomplete split exists - there is a disconnected branch in the tree.

I think the two things are not necessarily related to each other.

Incomplete splits prevent us from using a checkpoint as a valid
restartpoint from recovery. So that means an archive recovery always
starts from a point where we have all the WAL information required to
complete any incomplete splits.

That has nothing to do with locking requirements for concurrency. Now
you may be right, that we do not have the correct locking for
concurrency in the splitting algorithms. But I haven't seen any issue
yet myself. But I will look again. If you could be more specific that
would help me.

What I would say is that we need not consider the whole index to be
unusable at a point in time. If we emulate the lock sequence in recovery
that was used on the master, then we should have the same concurrency.
So preventing access to the whole index should be unnecessary.

Thinking some more on this, the database is not in a usable state until
all splits which were incomplete at time recovery started have been
completed. This is because the first half of the split is on disk
already, but we haven't reacquired the locks we held while making the
split in the first place. So we do have at least one problem in this
area.

> GiST has a similar issue - incomplete inserts. One insertion in a leaf page
> can propagate key updates all the way up to the root, and page splits may
> occur along the way. So page-split tracking needs to be added to gistxlog.c
> to get exact knowledge of when the index is unusable.
> 
> One more thing about GiST - when the database is switched from recovery
> mode to normal mode, the insertion in GiST needs to be completed and,
> possibly, the index vacuumed. Dig around
> GistBulkDeleteResult->needFullVacuum and gistContinueInsert()

OK, will look at those. (Problems++).

Thanks,

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] SQL:2008 LIMIT/OFFSET

2008-10-20 Thread Peter Eisentraut

Tom Lane wrote:

Peter Eisentraut <[EMAIL PROTECTED]> writes:
SQL:2008 specifies the following syntax for what we have so far called 
LIMIT and OFFSET

SELECT ... [ ORDER BY ... ]
 OFFSET num {ROW|ROWS} FETCH {FIRST|NEXT} [num] {ROW|ROWS} ONLY


What does the "NEXT" option mean?


FIRST and NEXT mean exactly the same, namely nothing.


BTW, I think it's a bad idea to assign made-up parse locations, as
you did here:


+   | /*EMPTY*/ { $$ = makeIntConst(1, @$); }


Just use -1.


Hmm, @$ is the location of the complete rule, so it should point to the
"empty" spot in theory.  Or am I misunderstanding something?

It's hard to simulate an error with this of course.



Re: [HACKERS] Block level concurrency during recovery

2008-10-20 Thread Simon Riggs

On Mon, 2008-10-20 at 18:28 +0400, Teodor Sigaev wrote:
> > * For GIN indexes, we appear to not hold a Cleanup lock during
> > vacuuming, except on root page. That stops new scans from starting, but
> > it doesn't prevent progress of concurrent scans. Doesn't look correct to
> > me... so not sure what strength lock to acquire in each case. Probably

> Why do you think so?

I have questions.

I don't understand why in ginVacuumPostingTreeLeaves() we lock only the
root page for Cleanup and no others. Why do we need to hold Cleanup lock
on the root? If the index is concurrent safe for existing scans, why is
it not safe for new scans also? And the converse: if it is not safe for
new scans, why is it safe for existing scans? 

> > need to differentiate between WAL record types so we can tell which to
> > acquire CleanupLock for and which not.
> > 
> > * GIST doesn't use CleaupLocks at all. So I'm very unclear here.

> A scan process does not hold any pointer to a page now; an insertion
> process does, but it tracks changes to the page by looking at the page's
> LSN.

Sounds good.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] minimal update

2008-10-20 Thread Andrew Dunstan



Magnus Hagander wrote:

Andrew Dunstan wrote:
  

Tom Lane wrote:


Andrew Dunstan <[EMAIL PROTECTED]> writes:
 
  

OK. Where would be a good place to put the code? Maybe a new file
src/backend/utils/adt/trigger_utils.c ?



I thought the plan was to make it a contrib module.

   
  
  

Well, previous discussion did mention catalog entries, which would
suggest otherwise, but I can do it as a contrib module if that's the
consensus.



What would be the actual reason to put it in contrib and not core? Are
there any "dangers" by having it there? Or is it "just a hack" and not a
"real solution"?


  


No, it's not just a hack. It's very close to what we'd probably do if we 
built the facility right into the language, although it does involve the 
overhead of calling the trigger. However, it performs reasonably well - 
not surprising since the guts of it is just a memcmp() call.


cheers

andrew



Re: [HACKERS] Window Functions: buffering strategy

2008-10-20 Thread Heikki Linnakangas

Hitoshi Harada wrote:

Hi,

2008/10/20 Simon Riggs <[EMAIL PROTECTED]>:

On Mon, 2008-10-20 at 10:32 +0900, Hitoshi Harada wrote:


So I propose three Window node buffering strategies,
row/frame/partition buffering so as to avoid unnecessary row
buffering.

Sounds good from here. Can I suggest you release the code in phases?

It would be better if we got just one of those done (row?), than to
attempt all 3 and end up with none because of emerging details.


Thank you for your feedback.
Ok, actually the first will be partition buffering, because it covers
the requirements of all the functions. Row buffering is a kind of
optimization for special cases.


The thought I had during the last commit fest was that the function 
would call a callback, something like window_forget(pos), that would 
tell the system that it can release any rows before the given position. 
That way you only need one method, and it'd also be optimal for 
functions like lag(), which doesn't fit perfectly into either the row or 
frame buffering category. Or do we need the information at plan-time?


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] crypt auth

2008-10-20 Thread Peter Eisentraut

Tom Lane wrote:

Peter Eisentraut <[EMAIL PROTECTED]> writes:
AFAICT, removing an authentication method requires a protocol version 
bump.


Why would it require that?  There would just be some auth method codes
that remain reserved but aren't used anymore.


Yeah, I was mistaken.  AuthenticationCryptPassword would remain in the 
protocol definition, but the server would just never send it.




Re: [HACKERS] crypt auth

2008-10-20 Thread Tom Lane
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> AFAICT, removing an authentication method requires a protocol version 
> bump.

Why would it require that?  There would just be some auth method codes
that remain reserved but aren't used anymore.

regards, tom lane



Re: [HACKERS] Block level concurrency during recovery

2008-10-20 Thread Teodor Sigaev

* For GIN indexes, we appear to not hold a Cleanup lock during
vacuuming, except on root page. That stops new scans from starting, but
it doesn't prevent progress of concurrent scans. Doesn't look correct to
me... so not sure what strength lock to acquire in each case. Probably

Why do you think so?


need to differentiate between WAL record types so we can tell which to
acquire CleanupLock for and which not.

* GIST doesn't use CleaupLocks at all. So I'm very unclear here.
A scan process does not hold any pointer to a page now; an insertion
process does, but it tracks changes to the page by looking at the page's LSN.




--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] Index use during Hot Standby

2008-10-20 Thread Teodor Sigaev

3. Implement an extra indexAM API call that allows indexAM to decide
when/if index is valid during recovery. This would also cover the second
concern neatly in a single API call.

wait until after deadline to implement (2) or (3), in case somebody
fixes this up in the next few weeks.



IMHO, without page locking in recovery mode, btree and GIN are not usable
while an incomplete split exists - there is a disconnected branch in the tree.


GiST has a similar issue - incomplete inserts. One insertion in a leaf page
can propagate key updates all the way up to the root, and page splits may
occur along the way. So page-split tracking needs to be added to gistxlog.c
to get exact knowledge of when the index is unusable.


One more thing about GiST - when the database switches from recovery mode to 
normal mode, incomplete GiST insertions need to be completed and the index 
possibly vacuumed. Dig around GistBulkDeleteResult->needFullVacuum and 
gistContinueInsert().


--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] Window Functions: buffering strategy

2008-10-20 Thread Hitoshi Harada
Hi,

2008/10/20 Simon Riggs <[EMAIL PROTECTED]>:
>
> On Mon, 2008-10-20 at 10:32 +0900, Hitoshi Harada wrote:
>
>> So I propose three Window node buffering strategies,
>> row/frame/partition buffering so as to avoid unnecessary row
>> buffering.
>
> Sounds good from here. Can I suggest you release the code in phases?
>
> It would be better if we got just one of those done (row?), than to
> attempt all 3 and end up with none because of emerging details.

Thank you for your feedback.
OK, actually the first will be partition buffering, because it covers
all of the functions' requirements. Row buffering is a kind of
optimization for special cases.

Regards,

-- 
Hitoshi Harada



Re: [HACKERS] crypt auth

2008-10-20 Thread Peter Eisentraut

Magnus Hagander wrote:

I notice our docs have:

If you are at all concerned about password
sniffing attacks then md5 is preferred, with
crypt to be used only if you must support pre-7.2
clients. Plain password should be avoided especially for


At what point do we just remove the support and say that people need to
upgrade their clients? Sure, it's up to ppl not to configure it that
way, but security-wise it's a foot-gun that I think is completely
unnecessary.


AFAICT, removing an authentication method requires a protocol version 
bump.  If you think it is worth dealing with those complications, then 
go for it.  I think it might be worth it.




Re: [HACKERS] Block level concurrency during recovery

2008-10-20 Thread Simon Riggs

On Mon, 2008-10-20 at 14:23 +0100, Gregory Stark wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> 
> > * re-create the Cleanup lock on blocks, when the original operation was
> > performed while a Cleanup lock was held.
> >
> > So the plan is to introduce a new XLogLockBufferForCleanup() function
> > and then use it in all places where a cleanup lock was held during the
> > write of the WAL record.
> 
> I'm not sure I'm following what you're talking about. Are you saying the slave
> will have to stop when it reaches a cleanup log entry and wait until it can
> obtain the exclusive lock with no other pins? 

Yes. 

> That sounds like the right thing though it could cause a long pause in WAL
> recovery. 

It could. 

> Perhaps we should plan on being able to kill other processes holding
> pins on the buffer just as we discussed killing transactions holding up
> advancing the globalxmin.

I'll copy my notes straight out of the Wiki:

We can't cancel a query that holds a pin on a particular block since we
don't keep track of who holds a pin. We just know a backend has a pin
and that the startup process must wait.

This may become a problem, or it may not. In most cases, backends hold
5-15 pins concurrently and pins are held for relatively short times.
Startup process will provide debug information for when pin times are
extended, and for monitoring total delay from pin waits.

Some strategies, if this becomes a problem:

  * minimise pin hold times wherever this occurs
  * deferring application of WAL for pinned blocks (requiring us to
use conditional locking)
  * making VACUUM FREEZE hold stronger locks on standby to prevent
query access (but doesn't help HOT)
  * ensuring that we perform read ahead I/O for WAL records that
have not yet reached head of apply queue

Most people on-list were comfortable with the idea of recovery waiting
for queries to finish, up to a limit, re: discussion around
max_standby_delay. So waiting for cleanup locks is just another source
of potential wait time.

I think it will take a while for people to come up with some user
experience of whether this will be a pain or not.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Block level concurrency during recovery

2008-10-20 Thread Gregory Stark

Simon Riggs <[EMAIL PROTECTED]> writes:

> * re-create the Cleanup lock on blocks, when the original operation was
> performed while a Cleanup lock was held.
>
> So the plan is to introduce a new XLogLockBufferForCleanup() function
> and then use it in all places where a cleanup lock was held during the
> write of the WAL record.

I'm not sure I'm following what you're talking about. Are you saying the slave
will have to stop when it reaches a cleanup log entry and wait until it can
obtain the exclusive lock with no other pins? 

That sounds like the right thing though it could cause a long pause in WAL
recovery. Perhaps we should plan on being able to kill other processes holding
pins on the buffer just as we discussed killing transactions holding up
advancing the globalxmin.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!



Re: [HACKERS] SSL cleanups/hostname verification

2008-10-20 Thread Tom Lane
Magnus Hagander <[EMAIL PROTECTED]> writes:
> Attached patch cleans up the certificate verification in libpq, and adds
> a configuration paraqmeter to control it. The new parameter is
> "sslverify", and can be set to:

> * cn = default = will validate that the certificate chains to a trusted
> root, *and* that the cn on the certificate matches the hostname
> specificed in the connection. This is the only option that prevents
> man-in-the-middle attacks completely, and therefor is the default.

How can you make that the default?  Won't it immediately break every
installation without certificates?

The patch seems pretty far short of sufficient as far as supporting a
new conninfo option goes --- for instance it appears to leak the string
at disconnect.  Check through all the references to some existing option
field to see if you missed anything else.

regards, tom lane



Re: [HACKERS] contrib/pg_stat_statements

2008-10-20 Thread Heikki Linnakangas

Magnus Hagander wrote:

ITAGAKI Takahiro wrote:

Magnus Hagander <[EMAIL PROTECTED]> wrote:
I'm not sure what should be in the main and what should not.
Why is pg_freespacemap still in contrib?


I don't know, why is it? :-)


I guess that was a joke, given the smiley, but I'll bite:

1. pg_freespacemap is intimately tied to the implementation of the FSM. 
It was completely overhauled in the FSM rewrite, and there's no 
guarantee it won't change again in the future to reflect changes in the 
FSM implementation. And if it does, we won't bother to provide an easy 
upgrade path, so existing queries using it will break. For the same 
reason it's in contrib and not pgFoundry: it's very dependent on the 
server version.


2. It's not useful for the general public.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] pg_hba options parsing

2008-10-20 Thread Magnus Hagander
Tom Lane wrote:
> Magnus Hagander <[EMAIL PROTECTED]> writes:
>> Bruce Momjian wrote:
>>> This is missing 'do' or something:
>>>
>>> + #define MANDATORY_AUTH_ARG(argvar, argname, authname) \
>>> + if (argvar == NULL) {\
>>> +   ereport(LOG, \
>>> +   (errcode(ERRCODE_CONFIG_FILE_ERROR), \
>>> +errmsg("authentication method '%s' requires argument '%s' to 
>>> be set", \
>>> +   authname, argname), \
>>> +errcontext("line %d of configuration file \"%s\"", \
>>> +   line_num, HbaFileName))); \
>>> +   goto hba_other_error; \
>>> + } while (0);
> 
>> Wow. Amazing that it actually compiles and works. I guess it treats the
>> while(0) as a separate statement completely.
> 
>> The correct fix is, AFAICS, to remove the while(0).
> 
> Absolutely not!  The reason for using do/while in this sort of situation
> is to make sure that the "if" can't get matched up to an "else" in code
> following the macro.  Without do/while this macro will be a loaded
> foot-gun.

Oh, didn't think of that. I just thought of the braces part, which was
"solved" by if. Thanks for clearing that up.

Ok, will add back do/while instead.

//Magnus




[HACKERS] Block level concurrency during recovery

2008-10-20 Thread Simon Riggs
I'm looking at how to make queries safe during recovery, in the presence
of concurrent changes to blocks. In particular, concurrent removal of
rows that might be read by queries.

My thinking is 
* we ignore LockRelationForExtension(). Normal queries never request it.
All new blocks were created with that lock held and we are replaying
changes serially, so we do not need to re-create that lock. We already
do this, so no change.
* re-create the Cleanup lock on blocks, when the original operation was
performed while a Cleanup lock was held.

So the plan is to introduce a new XLogLockBufferForCleanup() function
and then use it in all places where a cleanup lock was held during the
write of the WAL record.
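As a toy illustration (hypothetical names, not the real buffer-manager code): a cleanup lock is an exclusive content lock that can be granted only while the requester holds the sole pin on the buffer, which is why redo has to wait for concurrent scans to drop their pins.

```c
#include <stdbool.h>

/* Toy model of buffer pins and cleanup locks, for illustration only --
 * this is not the PostgreSQL buffer manager. */
typedef struct
{
    int  pin_count;     /* number of backends holding a pin */
    bool excl_locked;   /* exclusive content lock held? */
} ToyBuffer;

/* A cleanup lock = exclusive lock + proof that the caller's pin is the
 * only pin.  Returns true if acquired; redo would have to wait (or
 * retry) on false. */
static bool
toy_conditional_cleanup_lock(ToyBuffer *buf)
{
    if (buf->pin_count == 1 && !buf->excl_locked)
    {
        buf->excl_locked = true;
        return true;
    }
    return false;
}
```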

This means we will need to hold cleanup lock:

* while RM_HEAP2_ID records are applied (Freeze, Clean, CleanMove)

* while an XLOG_BTREE_DELETE was generated by VACUUM - this record type
is not always generated by VACUUM. So split this WAL record into two
types XLOG_BTREE_DELETE and XLOG_BTREE_VACUUM, so we can hold Cleanup
lock while applying XLOG_BTREE_VACUUM. (This may not be required, but
I'd rather do the full locking now and relax it later).

* Whenever we apply a backup block that performs the same function as
any of the above WAL records. So we would hold Cleanup lock when
applying WAL records of types
  all RM_HEAP2_ID types
  XLOG_BTREE_VACUUM

I'm most of the way through implementing the above and will post patch
as a separate item to make it easier to inspect.

Other aspects:

* For GIN indexes, we appear to not hold a Cleanup lock during
vacuuming, except on root page. That stops new scans from starting, but
it doesn't prevent progress of concurrent scans. Doesn't look correct to
me... so not sure what strength lock to acquire in each case. Probably
need to differentiate between WAL record types so we can tell which to
acquire CleanupLock for and which not.

* GIST doesn't use CleanupLocks at all. So I'm very unclear here.

Teodor has mentioned that it should be OK for GIST/GIN. Can I double
check that based upon my inspection of the code?

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] pg_hba options parsing

2008-10-20 Thread Tom Lane
Magnus Hagander <[EMAIL PROTECTED]> writes:
> Bruce Momjian wrote:
>> This is missing 'do' or something:
>> 
>> + #define MANDATORY_AUTH_ARG(argvar, argname, authname) \
>> + if (argvar == NULL) {\
>> +   ereport(LOG, \
>> +   (errcode(ERRCODE_CONFIG_FILE_ERROR), \
>> +errmsg("authentication method '%s' requires argument '%s' to be 
>> set", \
>> +   authname, argname), \
>> +errcontext("line %d of configuration file \"%s\"", \
>> +   line_num, HbaFileName))); \
>> +   goto hba_other_error; \
>> + } while (0);

> Wow. Amazing that it actually compiles and works. I guess it treats the
> while(0) as a separate statement completely.

> The correct fix is, AFAICS, to remove the while(0).

Absolutely not!  The reason for using do/while in this sort of situation
is to make sure that the "if" can't get matched up to an "else" in code
following the macro.  Without do/while this macro will be a loaded
foot-gun.
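A minimal sketch of the foot-gun (macro names invented for illustration): without the do/while(0) wrapper, the macro's trailing 'if' silently captures a following 'else'.

```c
/* Hypothetical macros illustrating the dangling-else hazard.
 * CHECK_BARE ends in a bare 'if' statement; CHECK_WRAPPED uses the
 * do/while(0) idiom. */
#define CHECK_BARE(x)    if ((x) == 0) { bad_count++; }
#define CHECK_WRAPPED(x) do { if ((x) == 0) { bad_count++; } } while (0)

/* Returns whether the 'else' branch ran.  With the bare macro, the
 * 'else' pairs with the macro's hidden 'if', not with 'if (v > 10)'
 * as the indentation suggests. */
static int
else_taken_bare(int v)
{
    int bad_count = 0, else_taken = 0;

    if (v > 10)
        CHECK_BARE(v)   /* no semicolon is even possible before 'else' */
    else
        else_taken = 1;
    (void) bad_count;
    return else_taken;
}

static int
else_taken_wrapped(int v)
{
    int bad_count = 0, else_taken = 0;

    if (v > 10)
        CHECK_WRAPPED(v);
    else
        else_taken = 1;
    (void) bad_count;
    return else_taken;
}
```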

regards, tom lane



[HACKERS] SSL cleanups/hostname verification

2008-10-20 Thread Magnus Hagander
Attached patch cleans up the certificate verification in libpq, and adds
a configuration parameter to control it. The new parameter is
"sslverify", and can be set to:

* cn = default = will validate that the certificate chains to a trusted
root, *and* that the cn on the certificate matches the hostname
specified in the connection. This is the only option that prevents
man-in-the-middle attacks completely, and therefore is the default.

* cert = what we had before if there was a root certificate file = will
validate that the certificate chains to a trusted root, but ignore the cn.

* none = will disable certificate validation completely
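As a sketch of the matching rule the "cn" mode implies (exact match only, no alias or wildcard support, per the description above), the comparison amounts to a case-insensitive string equality between the certificate CN and the connection host name. The function name here is hypothetical:

```c
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */
#include <stdbool.h>

/* Hypothetical helper: does the server certificate's cn match the host
 * the connection was made to?  Exact match only -- no wildcards, no
 * subjectAltName handling -- mirroring the behavior described above. */
static bool
cn_matches_host(const char *cert_cn, const char *conn_host)
{
    if (cert_cn == NULL || conn_host == NULL)
        return false;
    return strcasecmp(cert_cn, conn_host) == 0;
}
```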


This means that the connection string is now in charge of the security
policy, and not just the "if file exists or not". IMHO this is the only
proper way to do it. Now, if you for some reason lose the root
certificate file, libpq will refuse to connect unless you have
explicitly told it to connect without verifying the certificate.
Previously if you accidentally lost the file, you would connect
insecurely without knowing about it.


The error messages from the patch requires the
libpq-error-message-stacking patch as well (or something like it),
otherwise the error message will often get overwritten by a later one if
we retry without SSL.


I intend to follow this up with a similar patch for the server side,
which will make it a connection option instead of being dependent on the
presence of a file. This is depending on the pg_hba options patch,
however, so it's not ready yet.


//Magnus
*** a/doc/src/sgml/libpq.sgml
--- b/doc/src/sgml/libpq.sgml
***
*** 259,264 
--- 259,291 
  
  
  
+  sslverify
+  
+   
+This option controls how libpq verifies the certificate on the
+server when performing an SSL connection. There are
+three options: none disables verification completely
+(not recommended!); cert enables verification that
+the certificate chains to a known CA only; cn will
+both verify that the certificate chains to a known CA and that
+the cn attribute of the certificate matches the
+hostname the connection is being made to (default).
+   
+ 
+   
+It is always recommended to use the cn value for
+this parameter, since this is the only option that prevents
+man-in-the-middle attacks. Note that this requires the server
+name on the certificate to match exactly with the host name
+used for the connection, and therefore does not support connections
+to aliased names. It can be used with pure IP address connections
+only if the certificate also has just the IP address in the
+cn field.
+   
+  
+ 
+ 
+ 
   requiressl
   

***
*** 6011,6019  myEventProc(PGEventId evtId, void *evtInfo, void *passThrough)

  

!To verify the server certificate is trustworthy, place certificates of
!the certificate authorities (CA) you trust in the
!file ~/.postgresql/root.crt in the user's home directory.
 (On Microsoft Windows the file is named
 %APPDATA%\postgresql\root.crt.)
 libpq will then verify that the server's
--- 6038,6048 

  

!When the sslverify parameter is set to cn or
!cert, libpq will verify that the server certificate is
!trustworthy by checking the certificate chain up to a CA.
!For this to work, place the certificate of a trusted CA
!in the file ~/.postgresql/root.crt in the user's home directory.
 (On Microsoft Windows the file is named
 %APPDATA%\postgresql\root.crt.)
 libpq will then verify that the server's
*** a/doc/src/sgml/runtime.sgml
--- b/doc/src/sgml/runtime.sgml
***
*** 1418,1426  $ kill -INT `head -1 /usr/local/pgsql/data/postmaster.pid`server.key (key) and
 server.crt (certificate) files (). The TCP client must connect using
!sslmode='require' () and have
!a ~/.postgresql/root.crt SSL certificate ().

   

--- 1418,1426 
 server.key (key) and
 server.crt (certificate) files (). The TCP client must connect using
!sslmode='require', specify sslverify='cn'
!or sslverify='cert' and have the required certificate
!files present ().

   

***
*** 1544,1551  $ kill -INT `head -1 /usr/local/pgsql/data/postmaster.pid`
   
!  It is possible for both the client and server to provide SSL keys
!  or certificates to each other. It takes some extra configuration
   on each side, but this provides stronger verification of identity
   than the mere use of passwords. It prevents a computer from
   pretending to be the server just long enough to read the password
--- 1544,1551 
  
 
   
!  It is possible for both the client and server to provid

Re: [HACKERS] SQL:2008 LIMIT/OFFSET

2008-10-20 Thread Tom Lane
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> SQL:2008 specifies the following syntax for what we have so far called 
> LIMIT and OFFSET
> SELECT ... [ ORDER BY ... ]
>  OFFSET num {ROW|ROWS} FETCH {FIRST|NEXT} [num] {ROW|ROWS} ONLY

What does the "NEXT" option mean?  I'm a bit worried that this isn't
actually quite equivalent to LIMIT.

> If we want to avoid reshuffling the expression syntax (always good to 
> avoid) and avoid making ROWS reserved, we need to make some arbitrary 
> restrictions on what kinds of expressions can be used in these clauses. 

This syntax seems sufficiently brain-dead that only standards-compliance
fanatics would use it.  Accordingly, limiting it to match the letter
of the standard (literals only) is probably sufficient.

BTW, I think it's a bad idea to assign made-up parse locations, as
you did here:

> + | /*EMPTY*/ { $$ = makeIntConst(1, @$); }

Just use -1.

regards, tom lane



Re: [HACKERS] contrib/pg_stat_statements

2008-10-20 Thread Magnus Hagander
ITAGAKI Takahiro wrote:
> Magnus Hagander <[EMAIL PROTECTED]> wrote:
> 
>>> I'd like to submit pg_stat_statements contrib module
>> Sounds very good, but why contrib and not along with the rest of the
>> stats stuff in the main backend (with an on/off switch if the overhead
>> is high)?
> 
> That's because it could be done as a contrib module ;-)
> (Yeah! Postgres is a very extensible database!)

It can also go as a pgfoundry project, no? ;-)

Given that making it a contrib module instead of core will significantly
decrease the exposure, I personally consider that a bad thing..


> I'm not sure what should be in the main and what should not.
> Why is pg_freespacemap still in contrib?

I don't know, why is it? :-)


> I think the module is not mature yet and users will want to
> modify it to their satisfaction. It will be easier if the
> module is separated from the core codes. (The same can be
said for contrib/auto_explain, which is my other proposal.)

Yes, that is a reasonable argument for keeping it in contrib. Or perhaps
even better, on pgFoundry until it is ready.

//Magnus




[HACKERS] crypt auth

2008-10-20 Thread Magnus Hagander
I notice our docs have:

If you are at all concerned about password
sniffing attacks then md5 is preferred, with
crypt to be used only if you must support pre-7.2
clients. Plain password should be avoided especially for


At what point do we just remove the support and say that people need to
upgrade their clients? Sure, it's up to ppl not to configure it that
way, but security-wise it's a foot-gun that I think is completely
unnecessary.

//Magnus



[HACKERS] Index use during Hot Standby

2008-10-20 Thread Simon Riggs
For Hot Standby I need to mark which indexes are usable or not.

There are three aspects to index use during recovery:
* Certain index types may not correctly implement fully concurrent
locking order to allow that index type to be used during recovery.
* Other indexes might become unusable as indexAM code determines
particular indexes are corrupt. 
* Any index type that is not WAL logged will never be usable during
recovery. 

I've looked into the first of those concerns and can't find any reason
to believe index _redo code is dangerous for concurrent use. In
particular, even though some actions are split across multiple WAL
records does not imply that the individual parts of that action are not
concurrent. Please could people double check that analysis?

Assuming that is OK, the second and third concerns need to be addressed.
Currently, the second concern is simply ignored. Should we continue
that?

We can handle the third concern in a couple of ways:

1. Implement WAL logging for hash indexes
2. Implement an extra flag on pg_am that would be checked when we build
relcache so that all indexes of a certain type show as invalid during
recovery.
3. Implement an extra indexAM API call that allows indexAM to decide
when/if index is valid during recovery. This would also cover the second
concern neatly in a single API call.

I heard that people were working on (1). Are they going to make the
deadline or not?

If not, I will have to do some further work on this. I suggest that I
wait until after deadline to implement (2) or (3), in case somebody
fixes this up in the next few weeks.

I'm also happy to help out a little with adding WAL to hash indexes. In
general, every place that we do MarkBufferDirty() we need to make some
WAL entries (probably). MarkBufferDirty is rarely called directly
(once...), but the next level of code is called in these places:

hash.c
hashbulkdelete()

hashovfl.c
_hash_addovflpage()
_hash_getovflpage
_hash_freeovflpage -- looks complex
_hash_initbitmap
_hash_squeezebucket

hashpage.c
_hash_metapinit()
_hash_splitbucket()
_hash_expandtable()

hashinsert.c
_hash_doinsert()


-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] contrib/pg_stat_statements

2008-10-20 Thread ITAGAKI Takahiro

Magnus Hagander <[EMAIL PROTECTED]> wrote:

> > I'd like to submit pg_stat_statements contrib module
> 
> Sounds very good, but why contrib and not along with the rest of the
> stats stuff in the main backend (with an on/off switch if the overhead
> is high)?

That's because it could be done as a contrib module ;-)
(Yeah! Postgres is a very extensible database!)

I'm not sure what should be in the main and what should not.
Why is pg_freespacemap still in contrib?

I think the module is not mature yet and users will want to
modify it to their satisfaction. It will be easier if the
module is separated from the core codes. (The same can be
said for contrib/auto_explain, which is my other proposal.)

Now I'm working on storing the statistics to disk on server
shutdown. If that is impossible unless the module is in core,
I would change my policy...

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center





Re: [HACKERS] PGDay.it collation discussion notes

2008-10-20 Thread Heikki Linnakangas

Tom Lane wrote:

Another objection to this design is that it's completely unclear that
functions from text to text should necessarily yield the same collation
that went into them, but if you treat collation as a hard-wired part of
the expression syntax tree you aren't going to be able to do anything else.
(What will you do about functions/operators taking more than one text
argument?)


Whatever the spec says. Collation is intimately associated with the 
comparison operations, and doesn't make any sense anywhere else. The way 
the default collation for a given operation is determined, by bubbling 
up the collation from the operands, through function calls and other 
expressions, is just to make life a bit easier for the developer who's 
writing the SQL. We could demand that you always explicitly specify a 
collation when you use the text equality or inequality operators, but 
because that would be quite tiresome, a reasonable default is derived 
from the context.


I believe the spec stipulates how that default is derived, so I don't 
think we need to fret over it. We'll need it eventually, but the parser 
changes is not the critical part. We can start off by deriving the 
collation from a GUC variable, for example.



I think it would be better to treat the collation indicator as part of
string *values* and let it bubble up through expressions that way.
The "expr COLLATE ident" syntax would be a simple run-time operation
that pokes a new collation into a string value.  The notion of a column
having a particular collation would then amount to a check constraint on
the values going into the column.


Looking at an individual value, collation just doesn't make sense.
Collation is property of the comparison operation, not of a value.

In the parser, we might have to do something like that though, because 
according to the standard you can tack the COLLATION keyword to string 
constants and have it bubble up. But let's keep that ugliness just 
inside the parser.


One, impractical, way to implement collation would be to have one 
operator class per collation. In fact you could do that today, with no 
backend changes, to support multiple collations. It's totally 
impractical, because for starters you'd need different comparison 
operators, with different names, for each collation. But it's the right 
mental model.


I think the right approach is to invent a new concept called "operator 
modifier". It's basically a 3rd argument to operators. It can be 
specified explicitly when an operator is used, with syntax like "<expr> 
Op <expr> USING <modifier>", or in the case of collation, it's derived from 
the context, per SQL spec. The operator modifier is tacked on to OpExprs 
and SortClauses in the parser, and passed as a 3rd argument to the 
function implementing the operator at execution time.
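The call shape can be sketched with a toy comparator (all names here are invented for illustration, not a proposed API): the modifier travels as an ordinary third argument and selects the comparison behavior, the way a collation would.

```c
#include <string.h>
#include <strings.h>

/* Toy sketch of an operator function taking an "operator modifier" as
 * a third argument.  The modifier names are made up; a real
 * implementation would resolve a collation via the locale machinery. */
static int
toy_text_cmp(const char *a, const char *b, const char *modifier)
{
    if (modifier != NULL && strcmp(modifier, "case_insensitive") == 0)
        return strcasecmp(a, b);
    /* default: plain byte-wise comparison, like the "C" collation */
    return strcmp(a, b);
}
```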


When an index is created, if the operators in the operator class take an 
operator modifier, it's stored at creation time into a new column in 
pg_index (needs to be a vector or array to handle multi-column indexes). 
The planner needs to check the modifier when it determines whether an 
index can be used or not.


BTW, this reminds me of the discussions we had about the tsearch default 
configuration. It's different, though, because in full text search, 
there's a separate tsvector data type, and the problem was with 
expression indexes, not regular ones.


Another consideration is LC_CTYPE. Just like we want to support
different collations, we should support different character
classifications for upper()/lower(). We might want to tie it into
collation, as using different ctype and collation doesn't usually make
sense, but it's something to keep in mind.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com




Re: [HACKERS] Block-level CRC checks

2008-10-20 Thread Markus Wanner
Hi,

Alvaro Herrera wrote:
> So this discussion died with no solution arising to the
> hint-bit-setting-invalidates-the-CRC problem.

Wouldn't double-buffering solve this issue? Has anybody checked whether it
even helps performance, given that the lock on the buffer could be released
*before* the syscall?

Regards

Markus Wanner




[HACKERS] Hot Standby utility and administrator functions

2008-10-20 Thread Simon Riggs

I'm looking to implement the following functions for Hot Standby, to
allow those with administrative tools or management applications to have
more control during recovery. Please let me know if other functions are
required.

What else do we need?

* pg_is_in_recovery()
returns bool (true if in recovery, false if not)

* pg_last_recovered_xact_xid()
Will throw an ERROR if *not* executed in recovery mode.
returns bigint 

* pg_last_completed_xact_xid()
Will throw an ERROR *if* executed in recovery mode.
returns bigint 

(together allows easy arithmetic on xid difference between master and
slave).

* pg_last_recovered_xact_timestamp()
returns timestamp with timezone
(allows easy arithmetic with now() to allow derivation of replication
delay etc)

* pg_freeze_recovery() - freezes recovery after the current record has
been applied. The server is still up and queries can happen, but no WAL
replay will occur. This is a temporary state change and we keep no
record of this, other than making a server log entry. If the server is
shutdown or crashes, it will unfreeze itself automatically. Has no
effect on master.
Will throw an ERROR if not executed in recovery mode.
Superusers only.
returns text (XLogRecPtr of freeze point)

* pg_unfreeze_recovery() - unfreezes recovery. Recovery will begin again
at exactly the point recovery was frozen at.
Will throw an ERROR if not executed in recovery mode.
Superusers only.
returns bool (true if unfroze, false if was not frozen when called)

* pg_end_recovery() - 
Will force recovery to end at current location. Recovery mode cannot be
easily re-entered, so there is no "restart" function.
Will throw an ERROR if not executed in recovery mode.
Superusers only.
returns text (XLogRecPtr of freeze point)

* pg_start_backup()/pg_stop_backup() could work during recovery, but the
backup history file would need to be manually inserted into the archive
once complete. Is that acceptable? (Note that we don't know where the
archive is or how to access that; the information is all in
recovery_command. We cannot assume that archive_command points to same
archive. So making it happen automatically is too much work for this
release, if ever.) If that seems useful, we could do this by avoiding
any operation that changes WAL stream during recovery: no checkpoints,
log switches etc.. 
pg_start_backup() would return XLogRecPtr of last restartpoint.
pg_stop_backup() would return last known xlrec recovered (we won't keep
track of this record by record).

* pg_reload_conf() will not force re-read of recovery.conf since that
may require extra work and doesn't seem that important, if we have the
manual override mentioned above.

All desirable? All possible? Any others?

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




[HACKERS] SQL:2008 LIMIT/OFFSET

2008-10-20 Thread Peter Eisentraut
SQL:2008 specifies the following syntax for what we have so far called 
LIMIT and OFFSET


SELECT ... [ ORDER BY ... ]
OFFSET num {ROW|ROWS} FETCH {FIRST|NEXT} [num] {ROW|ROWS} ONLY

For example,

SELECT id, name FROM tab1 ORDER BY id OFFSET 20 ROWS FETCH NEXT 10 ROWS 
ONLY;


(I understand this syntax was taken from IBM.)
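For comparison, the traditional PostgreSQL spelling of the same query:

```sql
SELECT id, name FROM tab1 ORDER BY id LIMIT 10 OFFSET 20;
```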

Supporting this in PostgreSQL poses a couple of parsing challenges that 
involve some tradeoffs.  I have attached a draft patch if you want to 
follow along.


FETCH must become reserved.  It's in the same position now that LIMIT 
and OFFSET are already.  This should be OK because FETCH is already a 
well-known SQL command for cursor use.


The trailing {ROW|ROWS} key words plus the fact that the number 
specification is optional after FETCH (defaulting to 1) cause some 
independent problems because ROWS is unreserved and ROW can introduce an 
expression (c_expr even).


If we want to avoid reshuffling the expression syntax (always good to 
avoid) and avoid making ROWS reserved, we need to make some arbitrary 
restrictions on what kinds of expressions can be used in these clauses. 
Considering that specifying arbitrary expressions in these places
isn't terribly common and the SQL standard only calls for literals, I
hope I have found a good balance that satisfies the letter of the
standard and works well in practice, with some parentheses needed in
complicated cases.  But it may be objected to because it creates some
inconsistencies between the traditional and the new syntax in more
complex cases.
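To illustrate the restriction (based on the draft grammar, so details
may shift): the traditional syntax accepts a full expression directly,
while the new FETCH clause would take only a constant or a
parenthesized expression:

```sql
-- Traditional syntax: a full expression is allowed
SELECT * FROM tab1 LIMIT 2 + 3;

-- SQL:2008 syntax: the expression needs parentheses
SELECT * FROM tab1 FETCH FIRST (2 + 3) ROWS ONLY;
```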


Another question, if we want to go this route, is whether we would want
to change the query-tree reverse-listing code to use the new syntax.


Comments?
diff -ur -x '*.o' ../cvs-pgsql/src/backend/parser/gram.y 
src/backend/parser/gram.y
--- ../cvs-pgsql/src/backend/parser/gram.y  2008-10-15 14:19:39.0 
+0300
+++ src/backend/parser/gram.y   2008-10-20 11:39:31.0 +0300
@@ -308,6 +308,8 @@
 %type reindex_type drop_type comment_type
 
 %typefetch_direction select_limit_value select_offset_value
+   select_offset_value2 
opt_select_fetch_first_value
+%typerow_or_rows first_or_next
 
 %typeOptSeqOptList SeqOptList
 %type  SeqOptElem
@@ -6575,6 +6577,13 @@
 errhint("Use separate 
LIMIT and OFFSET clauses."),
 
scanner_errposition(@1)));
}
+   /* SQL:2008 syntax variants */
+   | OFFSET select_offset_value2 row_or_rows
+   { $$ = list_make2($2, NULL); }
+   | FETCH first_or_next opt_select_fetch_first_value 
row_or_rows ONLY
+   { $$ = list_make2(NULL, $3); }
+   | OFFSET select_offset_value2 row_or_rows FETCH 
first_or_next opt_select_fetch_first_value row_or_rows ONLY
+   { $$ = list_make2($2, $6); }
;
 
 opt_select_limit:
@@ -6592,10 +6601,28 @@
}
;
 
+opt_select_fetch_first_value:
+   SignedIconst{ $$ = makeIntConst($1, @1); }
+   | '(' a_expr ')'{ $$ = $2; }
+   | /*EMPTY*/ { $$ = makeIntConst(1, @$); }
+   ;
+
 select_offset_value:
a_expr  
{ $$ = $1; }
;
 
+select_offset_value2:
+   c_expr  
{ $$ = $1; }
+   ;
+
+row_or_rows:
+   ROW { $$ = 0; }
+   | ROWS  { $$ = 0; }
+
+first_or_next:
+   FIRST_P { $$ = 0; }
+   | NEXT  { $$ = 0; }
+
 group_clause:
GROUP_P BY expr_list
{ $$ = $3; }
| /*EMPTY*/ 
{ $$ = NIL; }
@@ -9214,6 +9241,7 @@
 RoleId:ColId   
{ $$ = $1; };
 
 SignedIconst: ICONST   
{ $$ = $1; }
+   | '+' ICONST
{ $$ = + $2; }
| '-' ICONST
{ $$ = - $2; }
;
 
@@ -9346,7 +9374,6 @@
| EXPLAIN
| EXTERNAL
| FAMILY
-   | FETCH
| FIRST_P
| FORCE
| FORWARD
@@ -9636,6 +9663,7 @@
| END_P
| EXCEPT
  

Re: [HACKERS] minimal update

2008-10-20 Thread Magnus Hagander
Andrew Dunstan wrote:
> 
> 
> Tom Lane wrote:
>> Andrew Dunstan <[EMAIL PROTECTED]> writes:
>>  
>>> OK. Where would be a good place to put the code? Maybe a new file
>>> src/backend/utils/adt/trigger_utils.c ?
>>> 
>>
>> I thought the plan was to make it a contrib module.
>>
>>
>>   
> 
> Well, previous discussion did mention catalog entries, which would
> suggest otherwise, but I can do it as a contrib module if that's the
> consensus.

What would be the actual reason to put it in contrib and not core? Are
there any "dangers" by having it there? Or is it "just a hack" and not a
"real solution"?

//Magnus



Re: [HACKERS] contrib/pg_stat_statements

2008-10-20 Thread Magnus Hagander
ITAGAKI Takahiro wrote:
> Hello,
> 
> I'd like to submit pg_stat_statements contrib module, that counts up
> incoming statements in shared memory and summarizes the result as a view.
> It is just a statements-version of pg_stat_user_functions.

Sounds very good, but why contrib and not along with the rest of the
stats stuff in the main backend (with an on/off switch if the overhead
is high)?


//Magnus



Re: [HACKERS] Incorrect cursor behaviour with gist index

2008-10-20 Thread Martin Schäfer
> Okay.  I'll go fix the core code, and you can take out 
> whatever you want in GiST/GIN.

Which PostgreSQL versions will contain the fix?

Regards,

Martin Schaefer



Re: [HACKERS] libpq ssl -> clear fallback looses error messages

2008-10-20 Thread Magnus Hagander
Tom Lane wrote:
> Magnus Hagander <[EMAIL PROTECTED]> writes:
>> Here's an ugly attempt towards this. Though I'm unsure if we can change
>> the "const" on the PQerrorMessage parameter without messing with library
>> versions and such?
> 
> That's a bad idea in any case --- PQerrorMessage shouldn't be changing
> the state of anything.  The merging needs to happen earlier.

That was my general thought, but since I'd written it, I just threw it
out there..


>> Another option could be to change all the calls that set the error
>> message to *append* to the error message instead. Thoughts on that?
> 
> That might work ... but we'd probably have to add some "clear the
> message" calls in places.  It might be worth trying to restrict this
> convention to the connection-startup code.

How about something like this?

//Magnus
*** a/src/interfaces/libpq/fe-connect.c
--- b/src/interfaces/libpq/fe-connect.c
***
*** 699,705  connectNoDelay(PGconn *conn)
  	{
  		char		sebuf[256];
  
! 		printfPQExpBuffer(&conn->errorMessage,
  			libpq_gettext("could not set socket to TCP no delay mode: %s\n"),
  		  SOCK_STRERROR(SOCK_ERRNO, sebuf, sizeof(sebuf)));
  		return 0;
--- 699,705 
  	{
  		char		sebuf[256];
  
! 		appendPQExpBuffer(&conn->errorMessage,
  			libpq_gettext("could not set socket to TCP no delay mode: %s\n"),
  		  SOCK_STRERROR(SOCK_ERRNO, sebuf, sizeof(sebuf)));
  		return 0;
***
*** 729,735  connectFailureMessage(PGconn *conn, int errorno)
  		   NULL, 0,
  		   service, sizeof(service),
  		   NI_NUMERICSERV);
! 		printfPQExpBuffer(&conn->errorMessage,
  		  libpq_gettext("could not connect to server: %s\n"
  			"\tIs the server running locally and accepting\n"
  			"\tconnections on Unix domain socket \"%s\"?\n"),
--- 729,735 
  		   NULL, 0,
  		   service, sizeof(service),
  		   NI_NUMERICSERV);
! 		appendPQExpBuffer(&conn->errorMessage,
  		  libpq_gettext("could not connect to server: %s\n"
  			"\tIs the server running locally and accepting\n"
  			"\tconnections on Unix domain socket \"%s\"?\n"),
***
*** 739,745  connectFailureMessage(PGconn *conn, int errorno)
  	else
  #endif   /* HAVE_UNIX_SOCKETS */
  	{
! 		printfPQExpBuffer(&conn->errorMessage,
  		  libpq_gettext("could not connect to server: %s\n"
  	 "\tIs the server running on host \"%s\" and accepting\n"
  		"\tTCP/IP connections on port %s?\n"),
--- 739,745 
  	else
  #endif   /* HAVE_UNIX_SOCKETS */
  	{
! 		appendPQExpBuffer(&conn->errorMessage,
  		  libpq_gettext("could not connect to server: %s\n"
  	 "\tIs the server running on host \"%s\" and accepting\n"
  		"\tTCP/IP connections on port %s?\n"),
***
*** 829,839  connectDBStart(PGconn *conn)
  	if (ret || !addrs)
  	{
  		if (node)
! 			printfPQExpBuffer(&conn->errorMessage,
  			  libpq_gettext("could not translate host name \"%s\" to address: %s\n"),
  			  node, gai_strerror(ret));
  		else
! 			printfPQExpBuffer(&conn->errorMessage,
  			  libpq_gettext("could not translate Unix-domain socket path \"%s\" to address: %s\n"),
  			  portstr, gai_strerror(ret));
  		if (addrs)
--- 829,839 
  	if (ret || !addrs)
  	{
  		if (node)
! 			appendPQExpBuffer(&conn->errorMessage,
  			  libpq_gettext("could not translate host name \"%s\" to address: %s\n"),
  			  node, gai_strerror(ret));
  		else
! 			appendPQExpBuffer(&conn->errorMessage,
  			  libpq_gettext("could not translate Unix-domain socket path \"%s\" to address: %s\n"),
  			  portstr, gai_strerror(ret));
  		if (addrs)
***
*** 924,929  connectDBComplete(PGconn *conn)
--- 924,931 
  		switch (flag)
  		{
  			case PGRES_POLLING_OK:
+ /* Reset stored error messages since we now have a working connection */
+ resetPQExpBuffer(&conn->errorMessage);
  return 1;		/* success! */
  
  			case PGRES_POLLING_READING:
***
*** 1033,1039  PQconnectPoll(PGconn *conn)
  			break;
  
  		default:
! 			printfPQExpBuffer(&conn->errorMessage,
  			  libpq_gettext(
  			"invalid connection state, "
   "probably indicative of memory corruption\n"
--- 1035,1041 
  			break;
  
  		default:
! 			appendPQExpBuffer(&conn->errorMessage,
  			  libpq_gettext(
  			"invalid connection state, "
   "probably indicative of memory corruption\n"
***
*** 1077,1083  keep_going:		/* We will come back to here until there is
  			conn->addr_cur = addr_cur->ai_next;
  			continue;
  		}
! 		printfPQExpBuffer(&conn->errorMessage,
  			  libpq_gettext("could not create socket: %s\n"),
  			SOCK_STRERROR(SOCK_ERRNO, sebuf, sizeof(sebuf)));
  		break;
--- 1079,1085 
  			conn->addr_cur = addr_cur->ai_next;
  			continue;
  		}
! 		appendPQExpBuffer(&conn->errorMessage,
  			  libpq_gettext("could not creat

Re: [HACKERS] pg_hba options parsing

2008-10-20 Thread Magnus Hagander
Bruce Momjian wrote:
> OK, a few comments:
> 
> This should have spaces:
> 
>   !port->hba->ldapprefix?port->hba->ldapprefix:"",

Fixed. Though I guess pgindent would've fixed it eventually otherwise.


> This is missing 'do' or something:
> 
>   + #define MANDATORY_AUTH_ARG(argvar, argname, authname) \
>   + if (argvar == NULL) {\
>   +   ereport(LOG, \
>   +   (errcode(ERRCODE_CONFIG_FILE_ERROR), \
>   +errmsg("authentication method '%s' requires argument '%s' 
> to be set", \
>   +   authname, argname), \
>   +errcontext("line %d of configuration file \"%s\"", \
>   +   line_num, HbaFileName))); \
>   +   goto hba_other_error; \
>   + } while (0);

Wow. Amazing that it actually compiles and works. I guess it treats
the while (0) as a completely separate statement.

The correct fix is, AFAICS, to remove the while(0).


> Seems we are losing the sscanf(), which is good.

Yes, that is one of the goals :-)


If there are no further comments I will go ahead and commit this (with
the above fixes) within a couple of days.

//Magnus

