[HACKERS] [2/4] Proposal of SE-PostgreSQL patches

2008-03-17 Thread Kohei KaiGai
[2/4] - sepostgresql-sepgsql-8.4devel-3.patch.gz

This patch provides SE-PostgreSQL facilities based on PGACE.

Security-Enhanced PostgreSQL (SE-PostgreSQL) is a security extension
built into PostgreSQL to provide system-wide consistency in access
controls. It makes it possible to apply a single unified SELinux
security policy to both the operating system and the database
management system. In addition, it provides fine-grained mandatory
access control, including column- and row-level non-bypassable access
control, even for privileged database users.

  Quick overview at:
http://code.google.com/p/sepgsql/wiki/WhatIsSEPostgreSQL

(This patch is gzip'ed, because it exceeded the file-size limit.)

Thanks,
-- 
OSS Platform Development Division, NEC
KaiGai Kohei [EMAIL PROTECTED]


sepostgresql-sepgsql-8.4devel-3.patch.gz
Description: application/gzip

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [0/4] Proposal of SE-PostgreSQL patches

2008-03-17 Thread Kohei KaiGai
Zdenek Kotala wrote:
 Kohei KaiGai napsal(a):
 It seems to me some of the SE-PostgreSQL patches have not been delivered yet,
 although [3/4] and [4/4] already came through.

 Did the anti-spam system catch my previous three messages?
 If necessary, I will send them again.
 
 There is a file-size limitation. If your patch is too big (I guess over
 40kB), please gzip it or send only a URL for download.
 
   Zdenek

Thanks for the information,

Your estimate is correct. Two of them are over the limit.
So I'll send them again as gzip'ed attachments.

[EMAIL PROTECTED] a]$ ls -lh *-8.4devel-*.patch
-rw-r--r-- 1 kaigai users  17K 2008-03-17 13:01 
sepostgresql-pg_dump-8.4devel-3.patch
-rw-r--r-- 1 kaigai users 134K 2008-03-17 13:01 
sepostgresql-pgace-8.4devel-3.patch
-rw-r--r-- 1 kaigai users  17K 2008-03-17 13:01 
sepostgresql-policy-8.4devel-3.patch
-rw-r--r-- 1 kaigai users 138K 2008-03-17 13:01 
sepostgresql-sepgsql-8.4devel-3.patch

-- 
OSS Platform Development Division, NEC
KaiGai Kohei [EMAIL PROTECTED]

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rewriting Free Space Map

2008-03-17 Thread Hannu Krosing

On Sun, 2008-03-16 at 21:33 -0300, Alvaro Herrera wrote:
 Tom Lane wrote:
 
  The idea that's becoming attractive to me while contemplating the
  multiple-maps problem is that we should adopt something similar to
  the old Mac OS idea of multiple forks in a relation.  In addition
  to the main data fork which contains the same info as now, there could
  be one or more map forks which are separate files in the filesystem.

Aren't we in a way doing this for indexes?

 I think something similar could be used to store tuple visibility bits
 separately from heap tuple data itself, so +1 to this idea.

Not just bits, but the whole visibility info (xmin, xmax, tmin, tmax, plus
bits) should be stored separately.

A separate fork for visibility should be organized as a b-tree index
(as we already have well-honed mechanisms for dealing with those
effectively), but the visibility fork would be stored in a compressed
form, keeping ranges of all-visible or all-deleted tuples as two
endpoints only; the tree would also be reorganized when possible,
similar to what we currently do for HOT updates.

This will keep the visibility index really small for tables with few
updates, most likely one or two pages regardless of table size.

One important difference from indexes is that visibility info should be
stored first, before writing data to heap and creating ordinary index
entries.

 (The rough idea in my head was that you can do an indexscan and look
 up visibility bits without having to pull the whole heap along; and
 visibility updates are also cheaper, whether they come from indexscans
 or heap scans.  Of course, the implicit cost is that a seqscan needs to
 fetch the visibility pages, too; and the locking is more complex.)

Another cost is heavy inserting/updating, where there will probably be
more lock contention: because visibility info is generally smaller,
entries for new tuples will more often land on the same visibility
pages.

Of course, with visibility info in a separate fork, very narrow tables
will have the ratios reversed: for a one-byte-wide table the visibility
info will be a few times bigger than the actual data, at least initially,
before compression has kicked in.


Hannu











-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [0/4] Proposal of SE-PostgreSQL patches

2008-03-17 Thread Zdenek Kotala
Kohei KaiGai napsal(a):
 It seems to me some of the SE-PostgreSQL patches have not been delivered yet,
 although [3/4] and [4/4] already came through.

 Did the anti-spam system catch my previous three messages?
 If necessary, I will send them again.

There is a file-size limitation. If your patch is too big (I guess over
40kB), please gzip it or send only a URL for download.

Zdenek

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commit fest?

2008-03-17 Thread Magnus Hagander

Greg Smith wrote:

On Sun, 16 Mar 2008, Bruce Momjian wrote:

Oh, what I would really like is to be able to pull up
archives.postgresql.org emails based on message id, so I can link to
the entire thread.  Unfortunately, that doesn't work there, nor does
Google or any of the other Postgres email archive sites.


This is something I've been looking into for my own organization.  The
message ids are at the start of the archive web pages.  For example, the
e-mail of yours I'm replying to begins like this if you look at the page
source:


http://archives.postgresql.org/pgsql-hackers/2008-03/msg00554.php
<!-- MHonArc v2.6.16 -->
<!--X-Subject: Re: Commit fest? -->
<!--X-From-R13: Pehpr [bzwvna oehprNzbzwvna.hf -->
<!--X-Date: Sun, 16 Mar 2008 23:19:33 &#45;0300 (ADT) -->
<!--X-Message-Id: [EMAIL PROTECTED] -->
<!--X-Content-Type: text/plain -->
<!--X-Reference: [EMAIL PROTECTED] -->
<!--X-Head-End-->

I was thinking of writing something that scrapes the archives, building a
lookup table out of this information.  What would be nice is if the
X-Message-Id and X-Reference were both put into the regular HTML of
future archived messages, so that it's more likely tools like Google
could search based on them.  A brief glance at the MHonArc documentation
suggests it could even be re-run to re-convert any existing messages that
are still available, to add those as well.


We are already sucking a lot of this data down to the db on
search.postgresql.org.  Does it make sense to do it there, perhaps?  Is
there a need for anything more than a unique message-id hit?  If that's
all we need, we could easily have a URL like
http://search.postgresql.org/[EMAIL PROTECTED] redirect
to the proper page on the archives?


//Magnus

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Request for feature - ECPGget_PGconn

2008-03-17 Thread Mike Aubury

Request Overview

Add a function to return the current PGconn used within ecpg.


Background
--
For years now, within the Aubit4GL project, we've been able to access the
PGconn record used by ecpg by the highly dubious means of accessing an
internal record within ecpg (which has now been removed/hidden).
It would be really useful if we could get at the PGconn connection via a
formal API/function call.

This would be useful to others, as it would allow libpq calls on the
currently open connection to use features for which there is no direct
ecpg equivalent, or where the functionality has already been implemented
using libpq calls.
(The ability to drop to a lower level of abstraction is common in most
db-oriented languages/language extensions like esql/c.)



Implementation
--

This could be implemented by adding the following code to the existing
ecpglib/connect.c file:

PGconn *
ECPGget_PGconn(const char *connection_name)
{
    struct connection *con;

    con = ecpg_get_connection(connection_name);
    if (con == NULL)
        return NULL;

    return con->connection;
}




TIA


-- 
Mike Aubury

Aubit Computing Ltd is registered in England and Wales, Number: 3112827
Registered Address : Clayton House,59 Piccadilly,Manchester,M1 2AQ



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remove hacks for old bad qsort() implementations?

2008-03-17 Thread Zeugswetter Andreas OSB SD

 2. If you've got lots of equal keys, it's conceivable that having the
 index entries sorted by TID offers some advantage in indexscan speed.
 I'm dubious that that's useful, mainly because the planner should
prefer
 a bitmap scan in such a case; and anyway the ordering is unlikely to
 be preserved for long.  But it's something to think about.

How about always adding the TID as the last key when using qsort for
CREATE INDEX?

Andreas

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remove hacks for old bad qsort() implementations?

2008-03-17 Thread Alvaro Herrera
Tom Lane wrote:
 There are several places in tuplesort.c (and perhaps elsewhere) where
 we explicitly work around limitations of various platforms' qsort()
 functions.  Notably, there's this bit in comparetup_index_btree
 
 /*
  * If key values are equal, we sort on ItemPointer.  This does not affect
  * validity of the finished index, but it offers cheap insurance against
  * performance problems with bad qsort implementations that have trouble
  * with large numbers of equal keys.
  */

Hmm, wasn't this supposed to be there to fix a problem with Lehman &
Yao's btree definition, which required all keys to be distinct?

[ checks the README ]

Okay, it seems I'm wrong; it has nothing to do with what we pass to
qsort.

The requirement that all btree keys be unique is too onerous,
but the algorithm won't work correctly without it.  Fortunately, it is
only necessary that keys be unique on a single tree level, because L&Y
only use the assumption of key uniqueness when re-finding a key in a
parent page (to determine where to insert the key for a split page).
Therefore, we can use the link field to disambiguate multiple
occurrences of the same user key: only one entry in the parent level
will be pointing at the page we had split.


-- 
Alvaro Herrera                         http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New style of hash join proposal

2008-03-17 Thread Gregory Stark

Tom Lane [EMAIL PROTECTED] writes:

 Gregory Stark [EMAIL PROTECTED] writes:
 We currently execute a lot of joins as Nested Loops which would be more
 efficient if we could batch together all the outer keys and execute a single
 inner bitmap index scan for all of them together.

 Please give an example of what you're talking about that you think we
 can't do now.

So basically given a simple join like:

postgres=# explain analyze select * from a where i in (select i from b);
                                            QUERY PLAN
---------------------------------------------------------------------------------------------------
 Hash IN Join  (cost=3.25..181.75 rows=100 width=4) (actual time=0.704..44.262 rows=100 loops=1)
   Hash Cond: (a.i = b.i)
   ->  Seq Scan on a  (cost=0.00..140.00 rows=1 width=4) (actual time=0.034..20.864 rows=1 loops=1)
   ->  Hash  (cost=2.00..2.00 rows=100 width=4) (actual time=0.530..0.530 rows=100 loops=1)
         ->  Seq Scan on b  (cost=0.00..2.00 rows=100 width=4) (actual time=0.013..0.236 rows=100 loops=1)
 Total runtime: 44.550 ms
(6 rows)

Note that we're doing a full sequential scan of a even though we've already
finished hashing b and know full well which keys we'll need. If we have an
index on a and b is sufficiently smaller than a, as in this case, then
we could do a bitmap index scan on a and pull out just those keys.

In a simple case like this you could hack it by doing a query like:

postgres=# explain analyze select * from a where i = any (array(select i from b));
                                            QUERY PLAN
---------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on a  (cost=40.59..64.01 rows=10 width=4) (actual time=2.193..2.535 rows=100 loops=1)
   Recheck Cond: (i = ANY ($0))
   InitPlan
     ->  Seq Scan on b  (cost=0.00..2.00 rows=100 width=4) (actual time=0.024..0.279 rows=100 loops=1)
   ->  Bitmap Index Scan on ai  (cost=0.00..38.59 rows=10 width=0) (actual time=2.155..2.155 rows=100 loops=1)
         Index Cond: (i = ANY ($0))
 Total runtime: 2.829 ms
(7 rows)

This is effectively an equivalent plan, but it runs 20x faster.

But constructing an array is pointless; we could just do a hash_seq_search
directly, and in any case I'm not sure it's always possible to transform
the query in this way.

I was thinking we could pass the hash itself as a parameter to the bitmap
index scan kind of like how we pass the bitmap structures up from bitmap index
scans to bitmap heap scans and get a plan like:

          QUERY PLAN
-------------------------------------
 Hash IN Join
   Hash Cond: (a.i = b.i)
   ->  Bitmap Heap Scan on a
         Recheck Cond: (a.i = ANY ($0))
         ->  Bitmap Index Scan on ai
               Index Cond: (i = ANY ($0))
   ->  Hash
         ->  Seq Scan on b

The really promising thing about this is that it would actually simplify
the work in the case where the hash overflows.  Instead of having to set
aside tuples for subsequent batches, we would know we got all the
matching records in the index scan, so we can throw out the hash and
start a new one for each batch.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rewriting Free Space Map

2008-03-17 Thread Tom Lane
Hannu Krosing [EMAIL PROTECTED] writes:
 On Sun, 2008-03-16 at 21:33 -0300, Alvaro Herrera wrote:
 Tom Lane wrote:
 The idea that's becoming attractive to me while contemplating the
 multiple-maps problem is that we should adopt something similar to
 the old Mac OS idea of multiple forks in a relation.

 Aren't we in a way doing this for indexes?

Not really --- indexes are closer to being independent entities, since
they have their own relfilenode values, own pg_class entries, etc.  What
I'm imagining here is something that's so tightly tied to the core heap
that there's no value in managing it as a distinct entity, thus the idea
of same relfilenode with a different extension.  The existence of
multiple forks in a relation wouldn't be exposed at all at the SQL
level.

 I think something similar could be used to store tuple visibility bits
 separately from heap tuple data itself, so +1 to this idea.

 Not just bits, but whole visibility info (xmin,xmax,tmin,tmax, plus
 bits) should be stored separately.

I'm entirely un-sold on this idea, but yeah it would be something that
would be possible to experiment with once we have a multi-fork
infrastructure.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remove hacks for old bad qsort() implementations?

2008-03-17 Thread Tom Lane
Zeugswetter Andreas OSB SD [EMAIL PROTECTED] writes:
 How about always adding the TID as last key when using qsort for create
 index ?

I think you misunderstood: that's what we do now.  I'm proposing
removing it because I think it's probably useless.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: Some cleanups of enum-guc code, per comments from Tom.

2008-03-17 Thread Alvaro Herrera
Magnus Hagander wrote:
 Log Message:
 ---
 Some cleanups of enum-guc code, per comments from Tom.

These bits seem to have broken --describe-config:

LC_ALL=C postmaster --describe-config > /dev/null
internal error: unrecognized run-time parameter type
internal error: unrecognized run-time parameter type
internal error: unrecognized run-time parameter type
internal error: unrecognized run-time parameter type
internal error: unrecognized run-time parameter type

(Perhaps they were already broken in the previous patch, not sure).

-- 
Alvaro Herrera                         http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Single table forcing sequential scans on query plans

2008-03-17 Thread Alvaro Herrera
Tom Lane wrote:

 Another possibility (though not a back-patchable solution) is that
 we could just dispense with the heuristic size estimate and trust a
 zero-sized table to stay zero-sized.  This would be relying on the
 assumption that autovacuum will kick in and update the stats, leading
 to invalidation of any existing plans that assume the table is small.
 I don't feel very comfortable about that though --- throwing a few
 hundred tuples into a table might not be enough to draw autovacuum's
 attention, but it could surely be enough to create a performance
 disaster for nestloop plans.

FWIW, autovacuum fires an analyze with the 51st tuple inserted into a
table, on 8.3's default configuration.

-- 
Alvaro Herrera                         http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [0/4] Proposal of SE-PostgreSQL patches

2008-03-17 Thread KaiGai Kohei

Alvaro Herrera wrote:

Kohei KaiGai wrote:

This series of patches is the proposal of Security-Enhanced PostgreSQL
(SE-PostgreSQL) for the upstream PostgreSQL 8.4 development cycle.


Before we go any further, is this work derived from SELinux?  If so, is
it covered under the GPL?  If so, can it be licensed under BSD terms?

Obviously, if it's not BSD, we cannot include it in Postgres.


All of the SE-PostgreSQL work is licensed under BSD terms.
We have been aiming to push SE-PostgreSQL into upstream PostgreSQL from
the beginning, and we understand that choosing the GPL would make that
impossible.

Thanks,
--
KaiGai Kohei [EMAIL PROTECTED]

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [0/4] Proposal of SE-PostgreSQL patches

2008-03-17 Thread KaiGai Kohei
I'll submit the proposal of the SE-PostgreSQL patches again, because
some of the previous messages were filtered due to attachments and I
cannot yet provide the whole set of patches.
Please refer to the URLs below.

--
This series of patches is the proposal of Security-Enhanced PostgreSQL
(SE-PostgreSQL) for the upstream PostgreSQL 8.4 development cycle.

 [1/4] sepostgresql-pgace-8.4devel-3.patch
 provides PGACE (PostgreSQL Access Control Extension) framework
http://sepgsql.googlecode.com/files/sepostgresql-pgace-8.4devel-3-r704.patch

 [2/4] sepostgresql-sepgsql-8.4devel-3.patch
 provides SE-PostgreSQL feature, based on PGACE framework.

http://sepgsql.googlecode.com/files/sepostgresql-sepgsql-8.4devel-3-r704.patch

 [3/4] sepostgresql-pg_dump-8.4devel-3.patch
 enables pg_dump to dump databases with security attributes.

http://sepgsql.googlecode.com/files/sepostgresql-pg_dump-8.4devel-3-r704.patch

 [4/4] sepostgresql-policy-8.4devel-3.patch
 provides the default security policy for SE-PostgreSQL.

http://sepgsql.googlecode.com/files/sepostgresql-policy-8.4devel-3-r704.patch

We can provide a quick overview for SE-PostgreSQL at:
http://code.google.com/p/sepgsql/wiki/WhatIsSEPostgreSQL

ENVIRONMENT
---
Please confirm your environment.
The following are the requirements of SE-PostgreSQL.
 * Fedora 8 or later system
 * SELinux is enabled and working
 * kernel-2.6.24 or later
 * selinux-policy and selinux-policy-devel v3.0.8 or later
 * libselinux, policycoreutils

INSTALLATION

$ tar jxvf postgresql-snapshot.tar.bz2
$ cd postgresql-snapshot
$ patch -p1 < ../sepostgresql-pgace-8.4devel-3.patch
$ patch -p1 < ../sepostgresql-sepgsql-8.4devel-3.patch
$ patch -p1 < ../sepostgresql-pg_dump-8.4devel-3.patch
$ patch -p1 < ../sepostgresql-policy-8.4devel-3.patch

$ ./configure --enable-selinux
$ make
$ make -C contrib/sepgsql-policy
$ su
# make install

# /usr/sbin/semodule -i contrib/sepgsql-policy/sepostgresql.pp
  (NOTE: semodule is a utility to load/unload security policy modules.)

# /sbin/restorecon -R /usr/local/pgsql
  (NOTE: restorecon is a utility to initialize the security context of files.)

SETUP
-
# mkdir -p /opt/sepgsql
# chown foo_user:var_group /opt/sepgsql
# chcon -t postgresql_db_t /opt/sepgsql
  (NOTE: chcon is a utility to set up security context of files.)
# exit

$ /usr/sbin/run_init /usr/local/pgsql/bin/initdb -D /opt/sepgsql
  (NOTE: run_init is a utility to start a program as if it were launched
from an init script.)
$ /usr/local/pgsql/bin/pg_ctl -D /opt/sepgsql start


SUMMARY OF EACH PATCH
---------------------
[1/4] - sepostgresql-pgace-8.4devel-3.patch

This patch provides PGACE (PostgreSQL Access Control Extension) framework.

It is based on a similar idea to LSM (Linux Security Modules).
It provides a guest module with several hooks at strategic points.
The guest module can make its decision whether the required actions
should be allowed or not.
In addition, PGACE also provides facilities to manage the security
attributes of database objects. Any tuple can have its own security
attribute, and the guest module can refer to it to control accesses.

  A more comprehensive memo at:
http://code.google.com/p/sepgsql/wiki/WhatIsPGACE

[2/4] - sepostgresql-sepgsql-8.4devel-3.patch

This patch provides SE-PostgreSQL facilities based on PGACE.

Security-Enhanced PostgreSQL (SE-PostgreSQL) is a security extension
built into PostgreSQL to provide system-wide consistency in access
controls. It makes it possible to apply a single unified SELinux
security policy to both the operating system and the database
management system. In addition, it provides fine-grained mandatory
access control, including column- and row-level non-bypassable access
control, even for privileged database users.

  Quick overview at:
http://code.google.com/p/sepgsql/wiki/WhatIsSEPostgreSQL

[3/4] - sepostgresql-pg_dump-8.4devel-3.patch

This patch gives us a feature to dump databases with their security
attributes. It is turned on with the '--enable-selinux' option of
pg_dump/pg_dumpall when the server runs as the SE- version.
Needless to say, users need to have enough capabilities to dump the
whole database. The same applies when they try to restore it.

[4/4] - sepostgresql-policy-8.4devel-3.patch

This patch gives us the default security policy for SE-PostgreSQL.
You can build it as a security policy module. It can be linked with
the existing distributor's policy, and reloaded.

-- 
KaiGai Kohei [EMAIL PROTECTED]

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rewriting Free Space Map

2008-03-17 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 You're cavalierly waving away a whole boatload of problems that will
 arise as soon as you start trying to make the index AMs play along
 with this :-(.  

 It doesn't seem very hard.

The problem is that the index AMs are no longer in control of what goes
where within their indexes, which has always been their prerogative to
determine.  The fact that you think you can kluge btree to still work
doesn't mean that it will work for other AMs.

 Also, I don't think that use the special space will scale to handle
 other kinds of maps such as the proposed dead space map.  (This is
 exactly why I said the other day that we need a design roadmap for all
 these ideas.)

 It works for anything that scales linearly with the relation itself. The 
 proposed FSM and visibility map both fall into that category.

It can work only with prearrangement among all the maps that are trying
to coexist in the same special space.  Every time a new one comes along,
we'd have to reconsider the space allocation, re-optimize tradeoffs,
and force an initdb that could not possibly be implemented in-place.

If we had a short list of maps in mind with no real prospect of
changing, then I think this would be acceptable; but that doesn't seem
to be the case.  It's certainly foolish to start detailed design on
something like this when we don't even have the roadmap I asked
for about what sorts of maps we are agreed we want to have.

 The idea that's becoming attractive to me while contemplating the
 multiple-maps problem is that we should adopt something similar to
 the old Mac OS idea of multiple forks in a relation.

 Hmm. You also need to teach at least xlog.c and xlogutils.c about the 
 map forks, for full page images and the invalid page tracking.

Well, you'd have to teach them something anyway, for any incarnation
of maps that they might need to update.

 I also wonder what the performance impact of extending BufferTag is.

That's a fair objection, and obviously something we'd need to check.
But I don't recall seeing hash_any so high on any profile that I think
it'd be a big problem.

 My original thought was to have a separate RelFileNode for each of the 
 maps. That would require no smgr or xlog changes, and not very many 
 changes in the buffer manager, though I guess you'd need more catalog 
 changes. You had doubts about that on the previous thread 
 (http://archives.postgresql.org/pgsql-hackers/2007-11/msg00204.php), but 
 the map forks idea certainly seems much more invasive than that.

The main problems with that are (a) the need to expose every type of map
in pg_class and (b) the need to pass all those relfilenode numbers down
to pretty low levels of the system.  The nice thing about the fork idea
is that you don't need any added info to uniquely identify what relation
you're working on.  The fork numbers would be hard-wired into whatever
code needed to know about particular forks.  (Of course, these same
advantages apply to using special space in an existing file.  I'm
just suggesting that we can keep these advantages without buying into
the restrictions that special space would have.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rewriting Free Space Map

2008-03-17 Thread Lars-Erik Bjørk

On Mon, 2008-03-17 at 09:29 -0400, Tom Lane wrote:
 Hannu Krosing [EMAIL PROTECTED] writes:
  On Sun, 2008-03-16 at 21:33 -0300, Alvaro Herrera wrote:
  Tom Lane wrote:
  The idea that's becoming attractive to me while contemplating the
  multiple-maps problem is that we should adopt something similar to
  the old Mac OS idea of multiple forks in a relation.
 
  Aren't we in a way doing this for indexes?
 
 Not really --- indexes are closer to being independent entities, since
 they have their own relfilenode values, own pg_class entries, etc.  What
 I'm imagining here is something that's so tightly tied to the core heap
 that there's no value in managing it as a distinct entity, thus the idea
 of same relfilenode with a different extension.  The existence of
 multiple forks in a relation wouldn't be exposed at all at the SQL
 level.
 
  I think something similar could be used to store tuple visibility bits
  separately from heap tuple data itself, so +1 to this idea.
 
  Not just bits, but whole visibility info (xmin,xmax,tmin,tmax, plus
  bits) should be stored separately.
 
 I'm entirely un-sold on this idea, but yeah it would be something that
 would be possible to experiment with once we have a multi-fork
 infrastructure.
 
   regards, tom lane
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rewriting Free Space Map

2008-03-17 Thread Heikki Linnakangas

Tom Lane wrote:

Heikki Linnakangas [EMAIL PROTECTED] writes:
I've started working on revamping Free Space Map, using the approach 
where we store a map of heap pages on every nth heap page. What we need 
now is discussion on the details of how exactly it should work.


You're cavalierly waving away a whole boatload of problems that will
arise as soon as you start trying to make the index AMs play along
with this :-(.  


It doesn't seem very hard. An indexam wanting to use FSM needs a little 
bit of code where the relation is extended, to let the FSM initialize 
FSM pages. And then there's the B-tree metapage issue I mentioned. But 
that's all, AFAICS.



Hash for instance has very narrow-minded ideas about
page allocation within its indexes.


Hash doesn't use FSM at all.


Also, I don't think that use the special space will scale to handle
other kinds of maps such as the proposed dead space map.  (This is
exactly why I said the other day that we need a design roadmap for all
these ideas.)


It works for anything that scales linearly with the relation itself. The 
proposed FSM and visibility map both fall into that category.


A separate file is certainly more flexible. I was leaning towards that 
option originally 
(http://archives.postgresql.org/pgsql-hackers/2007-11/msg00142.php) for 
that reason.



The idea that's becoming attractive to me while contemplating the
multiple-maps problem is that we should adopt something similar to
the old Mac OS idea of multiple forks in a relation.  In addition
to the main data fork which contains the same info as now, there could
be one or more map forks which are separate files in the filesystem.
They are named by relfilenode plus an extension, for instance a relation
with relfilenode NNN would have a data fork in file NNN (plus perhaps
NNN.1, NNN.2, etc) and a map fork named something like NNN.map (plus
NNN.map.1 etc as needed).  We'd have to add one more field to buffer
lookup keys (BufferTag) to disambiguate which fork the referenced page
is in.  Having bitten that bullet, though, the idea trivially scales to
any number of map forks with potentially different space requirements
and different locking and WAL-logging requirements.


Hmm. You also need to teach at least xlog.c and xlogutils.c about the 
map forks, for full page images and the invalid page tracking. I also 
wonder what the performance impact of extending BufferTag is.


My original thought was to have a separate RelFileNode for each of the 
maps. That would require no smgr or xlog changes, and not very many 
changes in the buffer manager, though I guess you'd need more catalog 
changes. You had doubts about that on the previous thread 
(http://archives.postgresql.org/pgsql-hackers/2007-11/msg00204.php), but 
the map forks idea certainly seems much more invasive than that.


I like the map forks idea; it groups the maps nicely at the filesystem 
level, and I can see it being useful for all kinds of things in the 
future. The question is, is it really worth the extra code churn? If you 
think it is, I can try that approach.



Another possible advantage is that a new map fork could be added to an
existing table without much trouble.  Which is certainly something we'd
need if we ever hope to get update-in-place working.


Yep.


The main disadvantage I can see is that for very small tables, the
percentage overhead from multiple map forks of one page apiece is
annoyingly high.  However, most of the point of a map disappears if
the table is small, so we might finesse that by not creating any maps
until the table has reached some minimum size.


Yeah, the map fork idea is actually better than the "every nth heap 
page" approach from that point of view.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [0/4] Proposal of SE-PostgreSQL patches

2008-03-17 Thread Alvaro Herrera
Kohei KaiGai wrote:
 The series of patches is the proposal of Security-Enhanced PostgreSQL
 (SE-PostgreSQL) for the upstream PostgreSQL 8.4 development cycle.

Before we go any further, is this work derived from SELinux?  If so, is
it covered under the GPL?  If so, can it be licensed under BSD terms?

Obviously, if it's not BSD, we cannot include it in Postgres.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] New style of hash join proposal

2008-03-17 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 Please give an example of what you're talking about that you think we
 can't do now.

 Note that we're doing a full sequential scan of "a" even though we've already
 finished hashing "b" and know full well which keys we'll need. If we have an
 index on "a" and "b" is sufficiently smaller than "a", as in this case, then
 we could do a bitmap index scan on "a" and pull out just those keys.

You mean like this?

regression=# explain select * from tenk1 a where unique1 in (select f1 from 
int4_tbl b);
 QUERY PLAN 
 
-
 Nested Loop  (cost=1.06..42.52 rows=5 width=244)
   ->  HashAggregate  (cost=1.06..1.11 rows=5 width=4)
         ->  Seq Scan on int4_tbl b  (cost=0.00..1.05 rows=5 width=4)
   ->  Index Scan using tenk1_unique1 on tenk1 a  (cost=0.00..8.27 rows=1 
width=244)
 Index Cond: (a.unique1 = b.f1)
(5 rows)

In the example you give, this type of plan was rejected because there
were too many rows in the subplan (or so I suppose anyway; you might
play around with the cost constants and see what happens).

regards, tom lane



Re: [HACKERS] [0/4] Proposal of SE-PostgreSQL patches

2008-03-17 Thread Alvaro Herrera
KaiGai Kohei wrote:
 Alvaro Herrera wrote:

 Before we go any further, is this work derived from SELinux?  If so, is
 it covered under the GPL?  If so, can it be licensed under BSD terms?

 All of the SE-PostgreSQL work is licensed under BSD terms.
 We have aimed to push SE-PostgreSQL into upstream PostgreSQL from
 the beginning, and we understand that choosing the GPL would make that impossible.

Right.  The question is: since this is derived from SE-Linux, is it
affected by SE-Linux license?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] [0/4] Proposal of SE-PostgreSQL patches

2008-03-17 Thread KaiGai Kohei

Alvaro Herrera wrote:

KaiGai Kohei wrote:

Alvaro Herrera wrote:



Before we go any further, is this work derived from SELinux?  If so, is
it covered under the GPL?  If so, can it be licensed under BSD terms?

All of the SE-PostgreSQL work is licensed under BSD terms.
We have aimed to push SE-PostgreSQL into upstream PostgreSQL from
the beginning, and we understand that choosing the GPL would make that impossible.


Right.  The question is: since this is derived from SE-Linux, is it
affected by SE-Linux license?


No, SE-PostgreSQL is not derived from SELinux.

I guess you are worried that SE-PostgreSQL contains a part of SELinux licensed
under the GPL, but that is not the case.
SE-PostgreSQL communicates with SELinux to make its access control decisions
via an official interface provided by libselinux, because it does not itself
hold the information needed to make those decisions.
libselinux is linked with SE-PostgreSQL, but it is licensed as public
domain software by the NSA.

Therefore, we have no issues related to incompatible licenses.

Thanks,
--
KaiGai Kohei [EMAIL PROTECTED]



Re: [HACKERS] Re: [COMMITTERS] pgsql: Some cleanups of enum-guc code, per comments from Tom.

2008-03-17 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 These bits seem to have broken --describe-config:
 (Perhaps they were already broken in the previous patch, not sure).

It was already broken :-(.  Not sure how both Magnus and I missed
the switch in help_config.c --- I know I grepped for references to
the PGC_xxx enum constants, and I suppose he did too.

regards, tom lane



[HACKERS] postgresql.org dns problems

2008-03-17 Thread ohp
Hi All,

Is it me or none of the mirror names can be resolved?
ftp.fr.postgresql.org host unknown
ftp.fr4.postgresql.org host unknown


Regards

-- 
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
15, Chemin des Monges+33-5-61-50-97-01 (Fax)
31190 AUTERIVE   +33-6-07-63-80-64 (GSM)
FRANCE  Email: [EMAIL PROTECTED]
--
Make your life a dream, make your dream a reality. (St Exupery)



Re: [HACKERS] Commit fest?

2008-03-17 Thread Bruce Momjian
Magnus Hagander wrote:
  I was thinking of writing something that scraped the archives building a 
  lookup table out of this information.  What would be nice is if the 
  X-Message-Id and X-Reference were both put into the regular HTML for 
  future archived messages so that it's more likely tools like Google 
  could search based on them.  A brief glance at the MHonArc documentation 
  suggests that it could be run to re-convert any existing messages that are 
  still available in order to add to those even.
 
 We are sucking a lot of this data down to the db on 
 search.postgresql.org already. Does it make sense to do it there 
 perhaps? Is there need for anything more than a unique-messageid-hit? If 
 that's all we need, we could easily have an url like 
 http://search.postgresql.org/[EMAIL PROTECTED] redirect 
 to the proper page on archives?

Agreed, we just need search to index the message-id line and we can link
to that easily.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Commit fest?

2008-03-17 Thread Bruce Momjian
Greg Smith wrote:
 On Sun, 16 Mar 2008, Bruce Momjian wrote:
 
  Oh, what I would really like is to be able to pull up 
  archives.postgresql.org emails based on message id so I can link to the 
  entire thread.  Unfortunately, it doesn't work there, nor does Google or 
  any of the other Postgres email archive sites.
 
 This is something I've been looking into my own organization.  The message 
 ids are at the start of the archive web pages.  For example your e-mail 
 here I'm replying to begins like this if you look at the page source:
 
 http://archives.postgresql.org/pgsql-hackers/2008-03/msg00554.php
 <!-- MHonArc v2.6.16 -->
 <!--X-Subject: Re: Commit fest? -->
 <!--X-From-R13: Pehpr [bzwvna oehprNzbzwvna.hf -->
 <!--X-Date: Sun, 16 Mar 2008 23:19:33 &#45;0300 (ADT) -->
 <!--X-Message-Id: [EMAIL PROTECTED] -->
 <!--X-Content-Type: text/plain -->
 <!--X-Reference: [EMAIL PROTECTED] -->
 <!--X-Head-End-->
 
 I was thinking of writing something that scraped the archives building a 
 lookup table out of this information.  What would be nice is if the 
 X-Message-Id and X-Reference were both put into the regular HTML for 
 future archived messages so that it's more likely tools like Google could 
 search based on them.  A brief glance at the MHonArc documentation 
 suggests that it could be run to re-convert any existing messages that are 
 still available in order to add to those even.

Yep, also you can pull my comments by using the message id, e.g.:

  http://js-kit.com/rss/momjian.us/[EMAIL PROTECTED]

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] postgresql.org dns problems

2008-03-17 Thread Mike Aubury
I'm getting the same here...
Just went to download 8.3 (if anyone has an alternate mirror...)


On Monday 17 March 2008 15:07:58 [EMAIL PROTECTED] wrote:
 Hi All,

 Is it me or none of the mirror names can be resolved?
 ftp.fr.postgresql.org host unknown
 ftp.fr4.postgresql.org host unknown

 
 Regards

 --
 Olivier PRENANT   Tel: +33-5-61-50-97-00 (Work)
 15, Chemin des Monges+33-5-61-50-97-01 (Fax)
 31190 AUTERIVE   +33-6-07-63-80-64 (GSM)
 FRANCE  Email: [EMAIL PROTECTED]
 --
 Make your life a dream, make your dream a reality. (St Exupery)



-- 
Mike Aubury

Aubit Computing Ltd is registered in England and Wales, Number: 3112827
Registered Address : Clayton House,59 Piccadilly,Manchester,M1 2AQ





Re: [HACKERS] New style of hash join proposal

2008-03-17 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 Please give an example of what you're talking about that you think we
 can't do now.

 Note that we're doing a full sequential scan of "a" even though we've already
 finished hashing "b" and know full well which keys we'll need. If we have an
 index on "a" and "b" is sufficiently smaller than "a", as in this case, then
 we could do a bitmap index scan on "a" and pull out just those keys.

 You mean like this?

 regression=# explain select * from tenk1 a where unique1 in (select f1 from 
 int4_tbl b);
  QUERY PLAN   

 -
  Nested Loop  (cost=1.06..42.52 rows=5 width=244)
    ->  HashAggregate  (cost=1.06..1.11 rows=5 width=4)
          ->  Seq Scan on int4_tbl b  (cost=0.00..1.05 rows=5 width=4)
    ->  Index Scan using tenk1_unique1 on tenk1 a  (cost=0.00..8.27 rows=1 
 width=244)
  Index Cond: (a.unique1 = b.f1)
 (5 rows)

 In the example you give, this type of plan was rejected because there
 were too many rows in the subplan (or so I suppose anyway; you might
 play around with the cost constants and see what happens).

Sort of, except using a bitmap index scan which would be precisely what allows
it to work for much larger sets of records.

It doesn't have to be an IN join either. Hash joins are used for normal joins
for even quite large sets of records. If the join is very selective then a
nested loop is fine, and if it covers most of the table then a sequential scan
might be fine. But in many cases you're dealing with a smallish percentage of
a very large table. That's what bitmap index scans are tailor made for.

Cases like:

select * from invoice join invoice_detail using (invoice_id) where 
invoice.quarter='Q4'

Is going to pull out thousands of invoices from the invoice table, then want
to pull out all the matching invoice_detail records from the invoice_detail
table. It'll probably be 5-10% of the invoice_detail table making a sequential
scan a big waste but enough records that a nested loop with a plain index scan
will be very slow. Worse, if it does the hash join there may be enough
invoices to force the hash join to work in batches.

It would be ideal if it could scan the invoices using an index, toss them all
in a hash, then do a bitmap index scan to pull out all the matching detail
records. If there are multiple batches it can start a whole new index scan for
each of the batches.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!



Re: [HACKERS] postgresql.org dns problems

2008-03-17 Thread Dave Page
On Mon, Mar 17, 2008 at 3:07 PM,  [EMAIL PROTECTED] wrote:
 Hi All,

  Is it me or none of the mirror names can be resolved?
  ftp.fr.postgresql.org host unknown
  ftp.fr4.postgresql.org host unknown

Should be OK now:

snake:~ dpage$ nslookup ftp.fr.postgresql.org
Server: 172.24.32.2
Address:172.24.32.2#53

Non-authoritative answer:
Name:   ftp.fr.postgresql.org
Address: 194.116.145.140

snake:~ dpage$ nslookup ftp4.fr.postgresql.org
Server: 172.24.32.2
Address:172.24.32.2#53

Non-authoritative answer:
ftp4.fr.postgresql.org  canonical name = distrib-coffee.ipsl.jussieu.fr.
Name:   distrib-coffee.ipsl.jussieu.fr
Address: 134.157.176.20

-- 
Dave Page
EnterpriseDB UK Ltd: http://www.enterprisedb.com
PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk



Re: [HACKERS] Rewriting Free Space Map

2008-03-17 Thread Simon Riggs
On Sun, 2008-03-16 at 21:33 -0300, Alvaro Herrera wrote:
 Tom Lane wrote:
 
  The idea that's becoming attractive to me while contemplating the
  multiple-maps problem is that we should adopt something similar to
  the old Mac OS idea of multiple forks in a relation.  In addition
  to the main data fork which contains the same info as now, there could
  be one or more map forks which are separate files in the filesystem.
 
 I think something similar could be used to store tuple visibility bits
 separately from heap tuple data itself, so +1 to this idea.
 
 (The rough idea in my head was that you can do an indexscan and look
 up visibility bits without having to pull the whole heap along; and
 visibility updates are also cheaper, whether they come from indexscans
 or heap scans.  Of course, the implicit cost is that a seqscan needs to
 fetch the visibility pages, too; and the locking is more complex.)

I very much like the idea of a generic method for including additional
bulk metadata for a relation (heap or index). That neatly provides a
general infrastructure for lots of clever things such as dead space,
visibility or other properties, while at the same time maintaining
modularity. 

Can we call them "maps" or "metadata maps"? "Forks" sounds weird.

We don't need to assume anything about the maps themselves at this
stage, so we might imagine tightly coupled maps that are always updated
as a relation changes, or loosely coupled maps that are lazily updated
by background processes. Autovacuum then becomes the vehicle by which we
execute map maintenance procedures, defined according to which AMs are
installed and what relation options are set. So we have a completely
generalised data/metadata storage infrastructure.

Sensibly arranged this could provide an entry point for powerful new
features within existing and future index AMs. It also sounds like it
might avoid a whole class of bugs and special cases that I regrettably
foresee would be unavoidable in Heikki's proposal.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com 

  PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk




Re: [PATCHES] [HACKERS] CIC and deadlocks

2008-03-17 Thread Tom Lane
Pavan Deolasee [EMAIL PROTECTED] writes:
 [ patch to reduce probability of deadlock of CREATE INDEX CONCURRENTLY
   with other things ]

This patch no longer applies because of the VirtualXid changes.
Looking at it again, I'm fairly dissatisfied with it anyway;
I really don't like moving the GetTransactionSnapshot calls around
like that, because it opens a risk that GetTransactionSnapshot won't
get called at all.

Since the autovacuum case is already dealt with separately, I'm
thinking there is no problem here that we actually need to solve.
C.I.C. can never be guaranteed free of deadlock risk, so I don't
see a lot of value in making it free of deadlock risk against
just CLUSTER and VACUUM FULL.

regards, tom lane



Re: [HACKERS] Rewriting Free Space Map

2008-03-17 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Heikki Linnakangas [EMAIL PROTECTED] writes:
 My original thought was to have a separate RelFileNode for each of the 
 maps. That would require no smgr or xlog changes, and not very many 
 changes in the buffer manager, though I guess you'd need more catalog 
 changes. You had doubts about that on the previous thread 
 (http://archives.postgresql.org/pgsql-hackers/2007-11/msg00204.php), but 
 the map forks idea certainly seems much more invasive than that.

 The main problems with that are (a) the need to expose every type of map
 in pg_class and (b) the need to pass all those relfilenode numbers down
 to pretty low levels of the system.  The nice thing about the fork idea
 is that you don't need any added info to uniquely identify what relation
 you're working on.  The fork numbers would be hard-wired into whatever
 code needed to know about particular forks.  (Of course, these same
 advantages apply to using special space in an existing file.  I'm
 just suggesting that we can keep these advantages without buying into
 the restrictions that special space would have.)

One advantage of using separate relfilenodes would be that if we need to
regenerate a map we could do it in a new relfilenode and swap it in like we do
with heap rewrites.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!



Re: [HACKERS] Commit fest?

2008-03-17 Thread Gregory Stark

Bruce Momjian [EMAIL PROTECTED] writes:

 Magnus Hagander wrote:
 We are sucking a lot of this data down to the db on 
 search.postgresql.org already. Does it make sense to do it there 
 perhaps? Is there need for anything more than a unique-messageid-hit? If 
 that's all we need, we could easily have an url like 
 http://search.postgresql.org/[EMAIL PROTECTED] redirect 
 to the proper page on archives?

 Agreed, we just need search to index the message-id line and we can link
 to that easily.

I would very much like such a URL as well. At a guess it would require hacking
the tsearch parser we use for the search engine on the web site?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!



Re: [HACKERS] New style of hash join proposal

2008-03-17 Thread Gregory Stark
Gregory Stark [EMAIL PROTECTED] writes:

 It would be ideal if it could scan the invoices using an index, toss them all
 in a hash, then do a bitmap index scan to pull out all the matching detail
 records. If there are multiple batches it can start a whole new index scan for
 the each of the batches.

A more general solution to this would be to find a way to tackle the general
problem of postponing the heap lookups until we really need columns which
aren't present in the index keys.

So something like the aforementioned 

 select * from invoice join invoice_detail using (invoice_id) where 
invoice.quarter='Q4'

could be done doing something like

Heap Scan on invoice_detail
  -> Heap Scan on invoice
       -> Nested Loop
            -> Index Scan on invoice_quarter
                 Index Cond: (quarter='Q4')
            -> Index Scan on pk_invoice_detail
                 Index Cond: (invoice_id = $0)

But that would be a much more wide-ranging change. And it would still not be
sequential unless we do extra work to sort the index tuples by tid.

There would be plenty of fiddly bits around which paths it's safe to execute
prior to checking the visibility as well. Obviously the visibility would have
to be checked before things like Unique or Aggregate nodes.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!



Re: [HACKERS] Better error message for select_common_type()

2008-03-17 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 ... I'm not sure about the terminology "position" and "instance"; they're 
 just two coordinates to get at the problem.

 None of this will help if you have multiple unrelated clauses that 
 invoke select_common_type(), but that might be better handled using the 
 parser location mechanism.

Were there any objections to changing this patch so that it reports
the second expression's parser location, instead of some arbitrary
numbers?  The way I'm envisioning doing it is:

1. Invent an exprLocation() function (comparable to, say, exprType)
that knows how to get the parser location from any subtype of Node that
has one.

2. Make a variant of select_common_type() that takes a list of Exprs
instead of just type OIDs.  It can get the type IDs from these
using exprType(), and it can get their locations using exprLocation()
if needed.

We could almost just replace the current form of select_common_type()
with the version envisioned in #2.  In a quick grep, there is only
one usage of select_common_type() that isn't invoking it on a list
of exprType() results that could be trivially changed over to the
underlying expressions instead --- and that is in
transformSetOperationTree, which may be dealing with inputs that
are previously-resolved output types for child set operations.
I'm not sure about a better way to complain about type mismatches in
nested set operations, anyway.  We could possibly keep track of one of
the sub-expressions that had determined the resolved output type, and
point to that, but it would likely seem a bit arbitrary to the user.
Thoughts anyone?

regards, tom lane



Re: [HACKERS] Rewriting Free Space Map

2008-03-17 Thread Andrew Dunstan



Gregory Stark wrote:

Tom Lane [EMAIL PROTECTED] writes:

  

Heikki Linnakangas [EMAIL PROTECTED] writes:

My original thought was to have a separate RelFileNode for each of the 
maps. That would require no smgr or xlog changes, and not very many 
changes in the buffer manager, though I guess you'd need more catalog 
changes. You had doubts about that on the previous thread 
(http://archives.postgresql.org/pgsql-hackers/2007-11/msg00204.php), but 
the map forks idea certainly seems much more invasive than that.
  

The main problems with that are (a) the need to expose every type of map
in pg_class and (b) the need to pass all those relfilenode numbers down
to pretty low levels of the system.  The nice thing about the fork idea
is that you don't need any added info to uniquely identify what relation
you're working on.  The fork numbers would be hard-wired into whatever
code needed to know about particular forks.  (Of course, these same
advantages apply to using special space in an existing file.  I'm
just suggesting that we can keep these advantages without buying into
the restrictions that special space would have.)



One advantage of using separate relfilenodes would be that if we need to
regenerate a map we could do it in a new relfilenode and swap it in like we do
with heap rewrites.

  


Why can't you just do that with a different extension and file rename? 
You'd need an exclusive lock while swapping in the new map, but you need 
that anyway, IIRC, and this way you don't even need a catalog change.


cheers

andrew



[HACKERS] How large file is really large - pathconf results

2008-03-17 Thread Zdenek Kotala
Following the discussion about the large segment size of table files, I 
tested the pathconf function (see 
http://www.opengroup.org/onlinepubs/009695399/functions/pathconf.html).


You can see the output below:

_PC_FILESIZEBITS - 3rd column
_PC_LINK_MAX - 4th column
_PC_NAME_MAX - 5th column   
_PC_PATH_MAX - 6th column


Solaris Nevada     ZFS   64  -1          255  1024
                   UFS   41  32767       255  1024
                   FAT   33  1           8    1024
                   NFS   41  32767       255  1024
Solaris 8          UFS   41  32767       255  1024
                   NFS   40  32767       255  1024
CentOS 4 (2.6.11)  EXT3  64  32000       255  4096
                   XFS   64  2147483647  255  4096
Mac OS X Leopard   HFS+  64  32767       255  1024


The result is not really good :(. I also tested it on HP-UX 11.11/11.23, 
Tru64 V4.0 and Mac OS X Tiger (big thanks to Tomas Honzak for machine 
access); Tiger and Tru64 do not recognize the _PC_FILESIZEBITS 
definition, and HP-UX returns errno=EINVAL. I also don't trust the Linux 
result on EXT3. It seems that only Solaris and Leopard return 
relatively correct results (33 bits on FAT is probably not correct).


I attached my test program; please let me know the results from your 
favorite OS/FS (the binary must be saved on the tested FS).


However, I think we cannot use this method to test the max file size on a FS :(.

Comments, ideas?

Zdenek

PS: Does pg_dump strip a large file or not?
#include <unistd.h>
#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <limits.h>

int main(int argc, char **argv)
{
	long ret;

	errno = 0;
	ret = pathconf(argv[0], _PC_FILESIZEBITS);
	if (ret == -1)
	{
		if (errno == 0)
			printf("_PC_FILESIZEBITS = unlimited\n");
		else
			printf("_PC_FILESIZEBITS = %s\n", strerror(errno));
	}
	else
		printf("_PC_FILESIZEBITS = %li\n", ret);

	/* _PC_LINK_MAX */
	errno = 0;
	ret = pathconf(argv[0], _PC_LINK_MAX);
	if (ret == -1)
	{
		if (errno == 0)
			printf("_PC_LINK_MAX = unlimited\n");
		else
			printf("_PC_LINK_MAX = %s\n", strerror(errno));
	}
	else
		printf("_PC_LINK_MAX = %li\n", ret);

	/* _PC_NAME_MAX */
	errno = 0;
	ret = pathconf(argv[0], _PC_NAME_MAX);
	if (ret == -1)
	{
		if (errno == 0)
			printf("_PC_NAME_MAX = unlimited\n");
		else
			printf("_PC_NAME_MAX = %s\n", strerror(errno));
	}
	else
		printf("_PC_NAME_MAX = %li\n", ret);

	/* _PC_PATH_MAX */
	errno = 0;
	ret = pathconf(argv[0], _PC_PATH_MAX);
	if (ret == -1)
	{
		if (errno == 0)
			printf("_PC_PATH_MAX = unlimited\n");
		else
			printf("_PC_PATH_MAX = %s\n", strerror(errno));
	}
	else
		printf("_PC_PATH_MAX = %li\n", ret);

	return 0;
}



Re: [HACKERS] Commit fest?

2008-03-17 Thread Magnus Hagander

Gregory Stark wrote:

Bruce Momjian [EMAIL PROTECTED] writes:


Magnus Hagander wrote:
We are sucking a lot of this data down to the db on 
search.postgresql.org already. Does it make sense to do it there 
perhaps? Is there need for anything more than a unique-messageid-hit? If 
that's all we need, we could easily have an url like 
http://search.postgresql.org/[EMAIL PROTECTED] redirect 
to the proper page on archives?

Agreed, we just need search to index the message-id line and we can link
to that easily.


I would very much like such a URL as well. At a guess it would require hacking
the tsearch parser we use for the search engine on the web site?


No, it requires hacking the indexing script, and we'll store the 
message-id in its own column in the table.


//Magnus



Re: [PATCHES] [HACKERS] [0/4] Proposal of SE-PostgreSQL patches

2008-03-17 Thread Greg Smith

On Mon, 17 Mar 2008, KaiGai Kohei wrote:


I'll submit the proposal of SE-PostgreSQL patches again, because some of the 
previous messages were filtered due to attachment size and I cannot provide the 
whole set of patches yet.


This is actually what you should have done from the beginning.  And it 
only should have gone to the pgsql-hackers list, which is the only one I'm 
replying to.  Your patches are at this point a proposal, as you say in the 
subject, and those go to the pgsql-hackers list with the minimum of files 
necessary to support them.  pgsql-patches is generally aimed at patches 
that have already been discussed on the hackers list, ones that are 
basically ready to apply to the source code.


The libselinux is linked with SE-PostgreSQL, but it is licensed as 
public domain software by NSA.


As for the licensing issues here, what everyone is looking for is a clear 
statement of the SELinux license from the source of that code.  The 
official NSA statment at http://www.nsa.gov/selinux/info/license.cfm says:


All source code found on this site is released under the same terms and 
conditions as the original sources. For example, the patches to the Linux 
kernel, patches to many existing utilities, and some of the new programs 
available here are released under the terms and conditions of the GNU 
General Public License (GPL). Please refer to the source code for specific 
license information.


GPL is a perfectly good license, but it's far from clear whether code 
derived from it can be incorporated into PostgreSQL even if you wrote all 
of it yourself.  I just checked libselinux, and as you say it includes a 
LICENSE file that states "This library (libselinux) is public domain 
software, i.e. not copyrighted."  That's good, but a similar independent 
review will need to happen for every component you interact with here, on 
top of a technical review.  Luckily this is something a lot of people 
would like and that should all get taken care of.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] Commit fest?

2008-03-17 Thread Josh Berkus
All,

First, the new comment interface doesn't work on Konqueror/Safari, which is 
the problem I think many people are having.  Use firefox instead.

Second, I plan to do a wiki-level summary of pending patches for the 2nd 
commit fest.  For the first one, raw data wasn't available before the 
official start of the Fest, and now things are changing too fast for me to 
keep up.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco



Re: [HACKERS] New style of hash join proposal

2008-03-17 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 It would be ideal if it could scan the invoices using an index, toss them all
 in a hash, then do a bitmap index scan to pull out all the matching detail
 records. If there are multiple batches it can start a whole new index scan for
 each of the batches.

I don't understand which part of "we can do that now" isn't clear to you.

regards, tom lane



Re: [HACKERS] [0/4] Proposal of SE-PostgreSQL patches

2008-03-17 Thread Josh Berkus
KaiGai,

 The series of patches are the proposal of Security-Enhanced PostgreSQL
 (SE-PostgreSQL) for the upstreamed PostgreSQL 8.4 development cycle.

Since I'm (Finally!) expecting the TrustedSolaris folks to put some work into 
PostgreSQL as well this year, I'm going to ask them to look over PGACE to see 
if this implementation is (still) generic enough to support TS as well.  If 
it is, then it's probably generic enough to be a general building block.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco



Re: [HACKERS] Commit fest?

2008-03-17 Thread Greg Smith

On Mon, 17 Mar 2008, Magnus Hagander wrote:

We are sucking a lot of this data down to the db on search.postgresql.org 
already. Does it make sense to do it there perhaps?


No, using the database for this sort of thing is so old-school at this 
point.  Didn't you hear that good web applications abstract away the 
database so you don't have to worry about what's in there?  The right way 
to handle this is to push the entire archive through the client each time 
so it can make rich decisions about the data instead.  I hear Ruby on 
Rails is a good tool for this.


Is there need for anything more than a unique-messageid-hit? If that's 
all we need, we could easily have an url like 
http://search.postgresql.org/[EMAIL PROTECTED] redirect 
to the proper page on archives?


That would be perfect.  I'd like to be able to replace my saved mailbox 
with a web page containing links to the archives instead, and that would 
be easy to do with that support.  Bruce's requirements have a similar 
mapping job to accomplish, and I could imagine a useful app for that area 
that consumed an mbox file and output a page of wiki markup.  Why, the 
patch queue is practically on a wiki already!


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] Rewriting Free Space Map

2008-03-17 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 One advantage of using separate relfilenodes would be that if we need to
 regenerate a map we could do it in a new relfilenode and swap it in like we do
 with heap rewrites.

You could probably do that using a temporary fork number, if the
situation ever came up.

regards, tom lane



Re: [HACKERS] Rewriting Free Space Map

2008-03-17 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 The idea that's becoming attractive to me while contemplating the
 multiple-maps problem is that we should adopt something similar to
 the old Mac OS idea of multiple forks in a relation.

 Can we call them "maps" or "metadata maps"? "forks" sounds weird.

I'm not wedded to "forks", that's just the name that was used in the
only previous example I've seen.  Classic Mac had a "resource fork"
and a "data fork" within each file.

Don't think I like "maps" though, as (a) that prejudges what the
alternate forks might be used for, and (b) the name fails to be
inclusive of the data fork.  Other suggestions anyone?

BTW, thinking about the Mac precedent a little more, I believe
the way they grafted that Classic concept onto BSD was that
applications (which the user thinks of as single objects) are
now directories with multiple files inside them.  Probably it'd be
overkill to think of turning each relation into a subdirectory, but
then again maybe not?

regards, tom lane



Re: [HACKERS] Commit fest?

2008-03-17 Thread Bruce Momjian
Magnus Hagander wrote:
 Gregory Stark wrote:
  Bruce Momjian [EMAIL PROTECTED] writes:
  
  Magnus Hagander wrote:
  We are sucking a lot of this data down to the db on 
  search.postgresql.org already. Does it make sense to do it there 
  perhaps? Is there need for anything more than a unique-messageid-hit? If 
  that's all we need, we could easily have an url like 
  http://search.postgresql.org/[EMAIL PROTECTED] redirect 
  to the proper page on archives?
  Agreed, we just need search to index the message-id line and we can link
  to that easily.
  
  I would very much like such a URL as well. At a guess it would require 
  hacking
  the tsearch parser we use for the search engine on the web site?
 
 No, it requires hacking the indexing script, and we'll store the 
 messageid in its own column in the table.

Yea, it would be nice if we could do it.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Re: [COMMITTERS] pgsql: Some cleanups of enum-guc code, per comments from Tom.

2008-03-17 Thread Magnus Hagander

Tom Lane wrote:

Alvaro Herrera [EMAIL PROTECTED] writes:

These bits seem to have broken --describe-config:
(Perhaps they were already broken in the previous patch, not sure).


It was already broken :-(.  Not sure how both Magnus and I missed
the switch in help_config.c --- I know I grepped for references to
the PGC_xxx enum constants, and I suppose he did too.


Yeah, that's what I did, and I have no idea how I missed it.

Will fix.

//Magnus



Re: [HACKERS] New style of hash join proposal

2008-03-17 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Gregory Stark [EMAIL PROTECTED] writes:
 It would be ideal if it could scan the invoices using an index, toss them all
 in a hash, then do a bitmap index scan to pull out all the matching detail
 records. If there are multiple batches it can start a whole new index scan
 for each of the batches.

 I don't understand which part of "we can do that now" isn't clear to you.

Uh, except we can't.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning



Re: [HACKERS] Rewriting Free Space Map

2008-03-17 Thread Simon Riggs
On Mon, 2008-03-17 at 13:23 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  The idea that's becoming attractive to me while contemplating the
  multiple-maps problem is that we should adopt something similar to
  the old Mac OS idea of multiple forks in a relation.
 
  Can we call them "maps" or "metadata maps"? "forks" sounds weird.
 
 I'm not wedded to "forks", that's just the name that was used in the
 only previous example I've seen.  Classic Mac had a "resource fork"
 and a "data fork" within each file.

Layer? Slab? Sheet? Strata/um? Overlay?

Layer makes sense to me because of the way GIS and CAD systems work.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com 

  PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk




Re: [HACKERS] Small bug in new backend build method

2008-03-17 Thread Peter Eisentraut
Am Samstag, 15. März 2008 schrieb Tom Lane:
 I found that when I added a new .o file in backend/access/hash while
 working in an already-built source tree, the objfiles.txt file in that
 directory got updated, but the one in backend/access did not, leading to
 link failure.  Seems there's a missing dependency for the upper-level
 objfiles.txt files.

OK, fixed.  The output size that we were originally trying to shorten is a bit 
longer again now, but it should be bearable.



Re: [HACKERS] Rewriting Free Space Map

2008-03-17 Thread David Fetter
On Mon, Mar 17, 2008 at 01:23:46PM -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  The idea that's becoming attractive to me while contemplating
  the multiple-maps problem is that we should adopt something
  similar to the old Mac OS idea of multiple forks in a
  relation.
 
  Can we call them "maps" or "metadata maps"? "forks" sounds weird.
 
 I'm not wedded to "forks", that's just the name that was used in the
 only previous example I've seen.  Classic Mac had a "resource fork"
 and a "data fork" within each file.

 Don't think I like "maps" though, as (a) that prejudges what the
 alternate forks might be used for, and (b) the name fails to be
 inclusive of the data fork.  Other suggestions anyone?

Segment?  Section?  Module?

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] [PATCHES] [0/4] Proposal of SE-PostgreSQL patches

2008-03-17 Thread Gregory Stark
Greg Smith [EMAIL PROTECTED] writes:

 On Mon, 17 Mar 2008, KaiGai Kohei wrote:

 I'll submit the proposal of SE-PostgreSQL patches again, because some of the
 previous messages were filtered due to attachments and I cannot provide the
 whole set of patches yet.

 This is actually what you should have done from the beginning.  And it only
 should have gone to the pgsql-hackers list, which is the only one I'm replying
 to.  Your patches are at this point a proposal, as you say in the subject, and
 those go to the pgsql-hackers list with the minimum of files necessary to
 support them.  pgsql-patches is generally aimed at patches that have already
 been discussed on the hackers list, ones that are basically ready to apply to
 the source code.

Some people shout any time you send patches to -hackers. The -patches list is
there mainly to catch large attachments regardless of their maturity.

But it's true that it's best to post a plan and have discussion prior to
developing big patches.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!



Re: [HACKERS] Rewriting Free Space Map

2008-03-17 Thread Jochem van Dieten
On Mon, Mar 17, 2008 at 6:23 PM, Tom Lane wrote:
 Simon Riggs writes:

  Can we call them "maps" or "metadata maps"? "forks" sounds weird.

  I'm not wedded to "forks", that's just the name that was used in the
  only previous example I've seen.  Classic Mac had a "resource fork"
  and a "data fork" within each file.

Microsoft / NTFS calls them Data Streams:
http://www.wikistc.org/wiki/Alternate_data_streams

Jochem



Re: [HACKERS] Rewriting Free Space Map

2008-03-17 Thread Mark Cave-Ayland

On Mon, 2008-03-17 at 13:23 -0400, Tom Lane wrote:

 I'm not wedded to "forks", that's just the name that was used in the
 only previous example I've seen.  Classic Mac had a "resource fork"
 and a "data fork" within each file.

 Don't think I like "maps" though, as (a) that prejudges what the
 alternate forks might be used for, and (b) the name fails to be
 inclusive of the data fork.  Other suggestions anyone?

I believe that in the world of NTFS the concept is called "streams":
http://support.microsoft.com/kb/105763.


HTH,

Mark.

-- 
Mark Cave-Ayland
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk
T: +44 870 608 0063




Re: [HACKERS] krb_match_realm patch

2008-03-17 Thread Bruce Momjian

Added to TODO:

o Allow Kerberos to disable stripping of realms so we can
  check the [EMAIL PROTECTED] against multiple realms

  http://archives.postgresql.org/pgsql-hackers/2007-11/msg9.php


---

Magnus Hagander wrote:
 Stephen Frost wrote:
  Greetings,
  
Regarding Magnus' patch for matching against the Kerberos realm- I'd
see it as much more useful as a multi-value configuration option.
Perhaps 'krb_alt_realms' or 'krb_realms'.  This would look like:
  
Match against one, and only one, realm (does not have to be the realm
the server is in, that's dealt with seperately):
krb_realms = 'ABC.COM'
  
Don't worry about the realm ever:
krb_realms = '' # default, to match current krb5
  
Match against multiple realms:
krb_realms = 'ABC.COM, DEF.ABC.COM'
  
Note that using multiple realms implies either no overlap, or that
overlap means the same person.

Additionally, I feel we should have an explicit 'krb_strip_realm'
boolean option to enable this behaviour.  If 'krb_strip_realm' is
'false' then the full [EMAIL PROTECTED] would be used.  This would mean 
  that
more complex cross-realm could also be handled by creating users with
[EMAIL PROTECTED] and then just roles when a given user exists in multiple
realms.

I understand that we're in beta now but both of these are isolated and
rather small changes, I believe.  Also, Magnus has indicated that he'd
be willing to adjust his patch accordingly if this is agreed to
(please correct me if I'm wrong here :).
 
 I've committed the patch as it was without this, because that's still
 better than what we have now.
 
 Just for the record, I've indicated that I'm willing to add the
 multi-realm match part of that, but I'm not sure we want to dig into the
 krb_strip_realm stuff this late in the cycle. At least unless someone
 can confirm that we won't have issues *elswhere* from passing in very
 long usernames in what I believe is not entirely specified formats.
 
 I will try to work on the multi-realm stuff next week, unless someone
 wants to beat me to it...
 
 //Magnus
 
 ---(end of broadcast)---
 TIP 4: Have you searched our list archives?
 
http://archives.postgresql.org

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Rewriting Free Space Map

2008-03-17 Thread Gregory Stark

Tom Lane [EMAIL PROTECTED] writes:

 I'm not wedded to "forks", that's just the name that was used in the
 only previous example I've seen.  Classic Mac had a "resource fork"
 and a "data fork" within each file.

fwiw forks are not unique to MacOS, c.f.:
http://en.wikipedia.org/wiki/Fork_%28filesystem%29

However I'm not sure reusing any of these terms is such a hot idea. All it's
going to do is confuse someone into thinking we're actually talking about HFS
forks or NTFS data streams or whatever. Better to pick a term that isn't
already being used for such things so people don't get misled.

 BTW, thinking about the Mac precedent a little more, I believe
 the way they grafted that Classic concept onto BSD was that
 applications (which the user thinks of as single objects) are
 now directories with multiple files inside them.  Probably it'd be
 overkill to think of turning each relation into a subdirectory, but
 then again maybe not?

Well there are upsides and downsides. Many OSes have difficulties when you
have many files in a single directory. This would tend to reduce that. On the
other hand it would drastically increase the number of directory files the OS
has to keep track of and the total number of inodes being referenced.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!



Re: [HACKERS] Rewriting Free Space Map

2008-03-17 Thread Heikki Linnakangas

Tom Lane wrote:

Heikki Linnakangas [EMAIL PROTECTED] writes:

Tom Lane wrote:

You're cavalierly waving away a whole boatload of problems that will
arise as soon as you start trying to make the index AMs play along
with this :-(.  



It doesn't seem very hard.


The problem is that the index AMs are no longer in control of what goes
where within their indexes, which has always been their prerogative to
determine.  The fact that you think you can kluge btree to still work
doesn't mean that it will work for other AMs.


Well, it does work with all the existing AMs AFAICS. I do agree with the 
general point; it'd certainly be cleaner, more modular and more flexible 
if the AMs didn't need to know about the existence of the maps.



The idea that's becoming attractive to me while contemplating the
multiple-maps problem is that we should adopt something similar to
the old Mac OS idea of multiple forks in a relation.


Hmm. You also need to teach at least xlog.c and xlogutils.c about the 
map forks, for full page images and the invalid page tracking.


Well, you'd have to teach them something anyway, for any incarnation
of maps that they might need to update.


Umm, the WAL code doesn't care where the pages it operates on came from. 
Sure, we'll need rmgr-specific code that know what to do with the maps, 
but the full page image code would work without changes with the 
multiple RelFileNode approach.


The essential change with the map fork idea is that a RelFileNode no 
longer uniquely identifies a file on disk (ignoring the segmentation 
which is handled in smgr for now). Anything that operates on 
RelFileNodes, without any higher level information of what it is, needs 
to be modified to use RelFileNode+forkid instead. That includes at least 
the buffer manager, smgr, and the full page image code in xlog.c.


It's probably a pretty mechanical change, even though it affects a lot 
of code. We'd probably want to have a new struct, let's call it 
PhysFileId for now, for RelFileNode+forkid, and basically replace all 
occurrences of RelFileNode with PhysFileId in smgr, bufmgr and xlog code.
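A minimal sketch of that replacement, assuming illustrative names only (these are toy definitions for this discussion, not the actual PostgreSQL structs, and the fork list contains just the data fork plus the FSM fork under debate in this thread):

```c
#include <stdint.h>

/* Toy RelFileNode: the triple that used to identify a file on disk. */
typedef struct RelFileNode
{
    uint32_t    spcNode;        /* tablespace */
    uint32_t    dbNode;         /* database */
    uint32_t    relNode;        /* relation */
} RelFileNode;

/* Which fork of the relation a page belongs to. */
typedef enum ForkNumber
{
    MAIN_FORKNUM = 0,           /* the data fork */
    FSM_FORKNUM                 /* free space map fork, per this thread */
} ForkNumber;

/* "PhysFileId" as proposed above: RelFileNode plus a fork number
 * uniquely identifies a physical file again. */
typedef struct PhysFileId
{
    RelFileNode rnode;
    ForkNumber  forknum;
} PhysFileId;

/* BufferTag would then carry the wider identifier plus a block number;
 * this widening is the hashing cost question raised in this thread. */
typedef struct BufferTag
{
    PhysFileId  phys;
    uint32_t    blockNum;
} BufferTag;
```

Everything that previously keyed on RelFileNode alone (bufmgr, smgr, the full-page-image code in xlog) would key on the wider struct instead.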



I also wonder what the performance impact of extending BufferTag is.


That's a fair objection, and obviously something we'd need to check.
But I don't recall seeing hash_any so high on any profile that I think
it'd be a big problem.


I do remember seeing hash_any in some oprofile runs. But that's fairly 
easy to test: we don't need to actually implement any of the stuff, 
other than add a field to BufferTag, and run pgbench.


My original thought was to have a separate RelFileNode for each of the 
maps. That would require no smgr or xlog changes, and not very many 
changes in the buffer manager, though I guess you'd more catalog 
changes. You had doubts about that on the previous thread 
(http://archives.postgresql.org/pgsql-hackers/2007-11/msg00204.php), but 
the map forks idea certainly seems much more invasive than that.


The main problems with that are (a) the need to expose every type of map
in pg_class and (b) the need to pass all those relfilenode numbers down
to pretty low levels of the system. 


(a) is certainly a valid point. Regarding (b), I don't think the low 
level stuff (I assume you mean smgr, bufmgr, bgwriter, xlog by that) 
would need to be passed any additional relfilenode numbers. Or rather, 
they already work with relfilenodes, and they don't need to know whether 
the relfilenode is for an index, a heap, or an FSM attached to something 
else. The relfilenodes would be in RelationData, and we already have 
that around whenever we do anything that needs to differentiate between 
those.


Another consideration is which approach is easiest to debug. The map 
fork approach seems better on that front, as you can immediately see 
from the PhysFileId if a page is coming from an auxiliary map or the 
main data portion. That might turn out to be handy in the buffer manager 
or bgwriter as well; they don't currently have any knowledge of what a 
page contains.



The nice thing about the fork idea
is that you don't need any added info to uniquely identify what relation
you're working on.  The fork numbers would be hard-wired into whatever
code needed to know about particular forks.  (Of course, these same
advantages apply to using special space in an existing file.  I'm
just suggesting that we can keep these advantages without buying into
the restrictions that special space would have.)


I don't see that advantage. All the higher-level code that cares which 
relation you're working on already has the Relation around. All the 
lower-level stuff doesn't care.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Rewriting Free Space Map

2008-03-17 Thread Dawid Kuroczko
On Mon, Mar 17, 2008 at 6:23 PM, Tom Lane [EMAIL PROTECTED] wrote:
  I'm not wedded to "forks", that's just the name that was used in the
  only previous example I've seen.  Classic Mac had a "resource fork"
  and a "data fork" within each file.

  Don't think I like "maps" though, as (a) that prejudges what the
  alternate forks might be used for, and (b) the name fails to be
  inclusive of the data fork.  Other suggestions anyone?

Shadow?  As each err, fork trails each relfilenode? (Or perhaps shade).

Hints?  As something more generic than map?

   Regards,
 Dawid



Re: [HACKERS] New style of hash join proposal

2008-03-17 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 I don't understand which part of "we can do that now" isn't clear to you.

 Uh, except we can't.

I already demonstrated that we could.  If the problem is that the
planner is cutting over from one plan type to the other at the wrong
rowcount because of bad cost estimates, that's a different issue
altogether.

regards, tom lane



Re: [HACKERS] Commit fest?

2008-03-17 Thread Tom Lane
Josh Berkus [EMAIL PROTECTED] writes:
 First, the new comment interface doesn't work on Konqueror/Safari, which is 
 the problem I think many people are having.  Use firefox instead.

Can't say about Konqueror, but the comments work fine in Safari (as
long as you don't turn off Javascript).

regards, tom lane



Re: [HACKERS] Commit fest?

2008-03-17 Thread Bruce Momjian
Tom Lane wrote:
 Josh Berkus [EMAIL PROTECTED] writes:
  First, the new comment interface doesn't work on Konqueror/Safari, which is 
  the problem I think many people are having.  Use firefox instead.
 
 Can't say about Konqueror, but the comments work fine in Safari (as
 long as you don't turn off Javascript).

I wonder if part of the problem is that the javascript file is not on
momjian.us but on another server --- I can imagine some browsers
thinking that is a security issue.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] [PATCHES] [0/4] Proposal of SE-PostgreSQL patches

2008-03-17 Thread Greg Smith

On Mon, 17 Mar 2008, Gregory Stark wrote:


Some people shout any time you send patches to -hackers.


Right, but if you note the improved version I give the thumbs-up to didn't 
include any patches--just links to where you could get them.  There's 
little reason to include any code as an attachment for a proposal if you 
can direct people to the web for them.  That's why I suggested sending 
the minimum of files necessary, which in this case was zero.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] New style of hash join proposal

2008-03-17 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
  I don't understand which part of "we can do that now" isn't clear to you.

 Uh, except we can't.

 I already demonstrated that we could.  If the problem is that the
 planner is cutting over from one plan type to the other at the wrong
 rowcount because of bad cost estimates, that's a different issue
 altogether.

We seem to be talking past each other. The plan you showed is analogous but
using a plain old index scan. That means it has to look up each matching
record in a separate index scan, potentially repeatedly. A bitmap index scan
would be able to look up precisely the elements in the hash in sequential
order and precisely once.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] backup_label and server start

2008-03-17 Thread Bruce Momjian

Added to TODO:

o Fix server restart problem when the server was shutdown during
  a PITR backup

  http://archives.postgresql.org/pgsql-hackers/2007-11/msg00800.php


---

Albe Laurenz wrote:
  If the postmaster is stopped with 'pg_ctl stop' while an
  online backup is in progress, the 'backup_label' file will remain
  in the data directory.
 [...]
  the startup process will fail with a message like this:
 [...]
  PANIC:  could not locate required checkpoint record
  HINT:  If you are not restoring from a backup, try removing the file
  "/POSTGRES/data/PG820/backup_label".
 
  wouldn't it be a good thing
  for the startup process to ignore (and rename) the backup_label
  file if no recovery.conf is present?
 
 Tom Lane replied:
  No, it certainly wouldn't.
 
 Point taken. When backup_label is present and recovery.conf isn't,
 there is the risk that the data directory has been restored from
 an online backup, in which case using the latest available
 checkpoint would be detrimental.
 
  I don't see why we should simplify the bizarre case you're 
  talking about
 
 Well, it's not a bizarre case, it has happened twice here.
 
 If somebody stops the postmaster while an online backup is
 in progress, there is no warning at all; the server simply
 fails to restart.
 
 One of our databases is running in a RedHat cluster, which
 in this case cannot failover to another node.
 And this can also happen during an online backup.
 
 Simon Riggs replied:
  The hint is telling you how to restart the original server, not a crafty
  way of cheating the process to allow you to use it for backup.
 
  What are you trying to do?
 
 You misunderstood me, I'm not trying to cheat anything, nor do
 I want to restore a backup that way.
 
 All I want to do is restart a server after a clean shutdown.
 
 How about my second suggestion:
 
 Remove backup_label when the server shuts down cleanly.
 In that case an online backup in progress will not be useful
 anyway, and there is no need to recover on server restart.
 
 What do you think?
 
 Yours,
 Laurenz Albe
 
 ---(end of broadcast)---
 TIP 6: explain analyze is your friend

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] [COMMITTERS] pgsql: Enable probes to work with Mac OS X Leopard and other OSes that

2008-03-17 Thread Peter Eisentraut
Tom Lane wrote:
 Peter Eisentraut [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  Peter Eisentraut [EMAIL PROTECTED] writes:
  I think if they can run Gen_fmgrtab, they can get this to work.
 
  They can't.
 
  They can't run Gen_fmgrtab or they can't get this to work?

 They can't run Gen_fmgrtab.  The MSVC port has its own reimplementation
 of that script, and I suppose now it's going to need one to substitute
 for this, too.

Well, yes.  I meant to say, a build system that can supply the functionality 
of Gen_fmgrtab can surely implement this new thing.  I see there is Perl 
being used, so it should be simple.



Re: [HACKERS] New style of hash join proposal

2008-03-17 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 I already demonstrated that we could.

 We seem to be talking past each other. The plan you showed is analogous but
 using a plain old index scan.

That's only because that seemed like the appropriate thing for the given
case's statistics.  [ fiddles with example... ]

regression=# explain select * from tenk1 a where thousand in (select f1 from int4_tbl b);
                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Nested Loop  (cost=5.39..198.81 rows=51 width=244)
   ->  HashAggregate  (cost=1.06..1.11 rows=5 width=4)
         ->  Seq Scan on int4_tbl b  (cost=0.00..1.05 rows=5 width=4)
   ->  Bitmap Heap Scan on tenk1 a  (cost=4.33..39.41 rows=10 width=244)
         Recheck Cond: (a.thousand = b.f1)
         ->  Bitmap Index Scan on tenk1_thous_tenthous  (cost=0.00..4.33 rows=10 width=0)
               Index Cond: (a.thousand = b.f1)
(7 rows)


regards, tom lane



Re: [HACKERS] PG 7.3 is five years old today

2008-03-17 Thread Bruce Momjian

Added to TODO:

   o Remove pre-7.3 pg_dump code that assumes pg_depend does not exist


---

Tom Lane wrote:
 Gregory Stark [EMAIL PROTECTED] writes:
  Whether there's any need to support the old protocol in the server depends 
  on
  whether there are any clients out there which use it which is harder to
  determine and not affected by whether Postgres 7.3 is still around.
 
 Right.  There's really not much to be gained by dropping it on the
 server side anyway.  libpq might possibly be simplified by a useful
 amount, but on the other hand we probably want to keep its current
 structure for the inevitable v4 protocol.
 
 Another area where we might think about dropping some stuff is pg_dump.
 If we got rid of the requirement to support dumps from pre-7.3 servers
 then it could assume server-side dependencies exist, and lose all the
 code for trying to behave sanely without 'em.
 
   regards, tom lane
 
 ---(end of broadcast)---
 TIP 7: You can help support the PostgreSQL project by donating at
 
 http://www.postgresql.org/about/donate

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Better error message for select_common_type()

2008-03-17 Thread Tom Lane
I wrote:
 Were there any objections to changing this patch so that it reports
 the second expression's parser location, instead of some arbitrary
 numbers?  The way I'm envisioning doing it is:

 1. Invent an exprLocation() function (comparable to, say, exprType)
 that knows how to get the parser location from any subtype of Node that
 has one.

 2. Make a variant of select_common_type() that takes a list of Exprs
 instead of just type OIDs.  It can get the type IDs from these
 using exprType(), and it can get their locations using exprLocation()
 if needed.

I started to look at this and immediately found out that the above
blithe sketch has nothing to do with reality.  The problem is that
the current parser location mechanism stores locations only for
nodes that appear in raw grammar trees (gram.y output), *not* in
analyzed expressions (transformExpr output).  This was an intentional
choice based on a couple of factors:

* Once we no longer have the parser input string available, the location
information would be just so much wasted space.

* It would add a weird special case to the equalfuncs.c routines:
should location fields be compared?  (Probably not, but it seems a bit
unprincipled to ignore them.)  And other places might have comparable
uncertainties what to do with 'em.

We'd need to either go back on that decision or pass in location
information separately to select_common_type.  I think I prefer the
latter, but it's messier.

(On the third hand, you could make a case that including location
info in analyzed expressions makes it feasible to point at problems
detected at higher levels of analyze.c than just the first-level
transformExpr() call.  select_common_type's problem could be seen
as just one aspect of what might be a widespread need.)

Another problem is that only a rather small subset of raw-grammar
expression node types actually carry locations at all.  I had always
intended to go back and extend that, but it's not done yet.  One reason
it's not done is that currently a lot of expression node types are used
for both raw-grammar output and analyzed expressions, which brings us
right back up against the issue above.  I'd be inclined to fix that by
extending AExpr even more, and/or inventing an analogous raw-grammar
node type for things that take variable numbers of arguments, but
still it's more work.

So this is all eminently do-able but it seems too much to be tackling
during commit fest.  I'd like to throw this item back on the TODO list.

Or we could apply Peter's patch more or less as-is, but I don't like
that.  I don't think it solves the stated problem: if you know that CASE
branches 3 and 5 don't match, that still doesn't help you in a monster
query with lots of CASEs.  I think we can and must do better.

regards, tom lane



Re: [HACKERS] New style of hash join proposal

2008-03-17 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 I already demonstrated that we could.

 We seem to be talking past each other. The plan you showed is analogous but
 using a plain old index scan.

 That's only because that seemed like the appropriate thing for the given
 case's statistics.  [ fiddles with example... ]

 regression=# explain select * from tenk1 a where thousand in (select f1 from 
 int4_tbl b);
                               QUERY PLAN
 --------------------------------------------------------------------------------
  Nested Loop  (cost=5.39..198.81 rows=51 width=244)
    ->  HashAggregate  (cost=1.06..1.11 rows=5 width=4)
          ->  Seq Scan on int4_tbl b  (cost=0.00..1.05 rows=5 width=4)
    ->  Bitmap Heap Scan on tenk1 a  (cost=4.33..39.41 rows=10 width=244)
          Recheck Cond: (a.thousand = b.f1)
          ->  Bitmap Index Scan on tenk1_thous_tenthous  (cost=0.00..4.33 rows=10 width=0)
                Index Cond: (a.thousand = b.f1)
 (7 rows)

Sure, but that's still re-executing the bitmap index scan 51 times -- possibly
having to fetch the same records off disk repeatedly. Avoiding that is kind of
the point behind the hash join plan after all.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!



Re: [HACKERS] Better error message for select_common_type()

2008-03-17 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Or we could apply Peter's patch more or less as-is, but I don't like
 that.  I don't think it solves the stated problem: if you know that CASE
 branches 3 and 5 don't match, that still doesn't help you in a monster
 query with lots of CASEs.  I think we can and must do better.

Do we have something more helpful than branches 3 and 5? Perhaps printing
the actual transformed expressions?


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] New style of hash join proposal

2008-03-17 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
  Nested Loop  (cost=5.39..198.81 rows=51 width=244)
    ->  HashAggregate  (cost=1.06..1.11 rows=5 width=4)
          ->  Seq Scan on int4_tbl b  (cost=0.00..1.05 rows=5 width=4)
    ->  Bitmap Heap Scan on tenk1 a  (cost=4.33..39.41 rows=10 width=244)
          Recheck Cond: (a.thousand = b.f1)
          ->  Bitmap Index Scan on tenk1_thous_tenthous  (cost=0.00..4.33 rows=10 width=0)
                Index Cond: (a.thousand = b.f1)
  (7 rows)

 Sure, but that's still re-executing the bitmap index scan 51 times -- possibly
 having to fetch the same records off disk repeatedly.

It's not fetching any record repeatedly, because the HashAggregate
step eliminated duplicate keys on the other side.
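The effect is easy to see in miniature (an illustrative Python sketch, not PostgreSQL code): deduplicating the probe side first bounds the number of inner-side scans by the number of distinct keys, mimicking the HashAggregate-under-Nested-Loop shape of the plan:

```python
def semi_join_fetch_count(outer_keys, fetch):
    """Probe the inner side once per *distinct* outer key, the way the
    HashAggregate step feeds the nested loop in the plan above."""
    fetches = 0
    for k in set(outer_keys):   # HashAggregate: duplicates eliminated
        fetch(k)                # one bitmap scan per distinct key
        fetches += 1
    return fetches

probed = []
# 5 outer rows, but only 3 distinct keys -> 3 probes, no repeated fetch
n = semi_join_fetch_count([1, 2, 2, 3, 2], probed.append)
```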

regards, tom lane



Re: [HACKERS] Better error message for select_common_type()

2008-03-17 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 Or we could apply Peter's patch more or less as-is, but I don't like
 that.  I don't think it solves the stated problem: if you know that CASE
 branches 3 and 5 don't match, that still doesn't help you in a monster
 query with lots of CASEs.  I think we can and must do better.

 Do we have something more helpful than branches 3 and 5?

That's exactly the point of discussion.  A parser location is what we
need, the problem is that this patch doesn't provide it.

 Perhaps printing the actual transformed expressions?

Don't think it solves the problem either.  For instance, if there are
a hundred references to variable X in your query, printing X isn't
going to get you far.

regards, tom lane



Re: [HACKERS] Proposal: new large object API

2008-03-17 Thread Tatsuo Ishii
I have posted proposed patches to pgsql-patches.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

 I would like to propose new large object client side API for 8.4.
 
 Currently we have:
 
 Oid lo_import(PGconn *conn, const char *filename);
 
 But we do not have an API which imports a large object with a specified
 object id. This is inconvenient and inconsistent, since we already have
 lo_create() and lo_open(), which allow specifying the large object id.
 
 So I propose to add new API:
 
 int lo_import_with_oid(PGconn *conn, Oid lobjId, const char 
 *filename);
 
 Another idea is changing the signature of lo_import:
 
 Oid lo_import(PGconn *conn, Oid lobjId, const char *filename);
 
 which will be cleaner but break the backward compatibility.
 
 Comments are welcome.
 --
 Tatsuo Ishii
 SRA OSS, Inc. Japan
 



Re: [HACKERS] [0/4] Proposal of SE-PostgreSQL patches

2008-03-17 Thread KaiGai Kohei
Josh Berkus wrote:
 KaiGai,
 
 The series of patches are the proposal of Security-Enhanced PostgreSQL
 (SE-PostgreSQL) for the upstreamed PostgreSQL 8.4 development cycle.
 
 Since I'm (Finally!) expecting the TrustedSolaris folks to put some work into 
 PostgreSQL as well this year, I'm going to ask them to look over PGACE to see 
 if this implementation is (still) generic enough to support TS as well.  If 
 it is, then it's probably generic enough to be a general building block.

We can extend the PGACE framework to accommodate TrustedSolaris features.
If they need new hooks that SE-PostgreSQL does not use, those hooks can
simply keep the default behavior.
The default PGACE behavior has no effect on access controls.

A flexible framework is worthwhile for both operating systems.
Please confirm this with the TS folks.

Thanks,
-- 
OSS Platform Development Division, NEC
KaiGai Kohei [EMAIL PROTECTED]



Re: [PATCHES] [HACKERS] [0/4] Proposal of SE-PostgreSQL patches

2008-03-17 Thread KaiGai Kohei

Greg Smith wrote:

On Mon, 17 Mar 2008, KaiGai Kohei wrote:

I'll submit the proposal of SE-PostgreSQL patches again, because some of my 
previous messages were filtered due to their attachments and I cannot yet 
provide the whole set of patches.


This is actually what you should have done from the beginning.  And it 
only should have gone to the pgsql-hackers list, which is the only one 
I'm replying to.  Your patches are at this point a proposal, as you say 
in the subject, and those go to the pgsql-hackers list with the minimum 
of files necessary to support them.  pgsql-patches is generally aimed at 
patches that have already been discussed on the hackers list, ones that 
are basically ready to apply to the source code.


OK, I understand the purpose of the pgsql-hackers and pgsql-patches lists.
First, I'll have the discussion here.

The libselinux is linked with SE-PostgreSQL, but it is licensed as 
public domain software by NSA.


As for the licensing issues here, what everyone is looking for is a 
clear statement of the SELinux license from the source of that code.  
The official NSA statement at http://www.nsa.gov/selinux/info/license.cfm 
says:


All source code found on this site is released under the same terms and 
conditions as the original sources. For example, the patches to the 
Linux kernel, patches to many existing utilities, and some of the new 
programs available here are released under the terms and conditions of 
the GNU General Public License (GPL). Please refer to the source code 
for specific license information.


GPL is a perfectly good license, but it's far from clear whether code 
derived from it can be incorporated into PostgreSQL even if you wrote 
all of it yourself.  I just checked libselinux, and as you say it 
includes a LICENSE file that states "This library (libselinux) is public 
domain software, i.e. not copyrighted."  That's good, but a similar 
independent review will need to happen for every component you interact 
with here, on top of a technical review.  Luckily this is something a 
lot of people would like, and that should all get taken care of.


SE-PostgreSQL internally uses libselinux, glibc, and PostgreSQL internal
APIs like SearchSysCache().
I'm not a lawyer, but I believe they cannot force us to adopt a specific
license. So, I clearly state that the SE-PostgreSQL feature is licensed
under the same terms as PostgreSQL.
Needless to say, more comprehensive checks and reviews are welcome.

Thanks,
--
OSS Platform Development Division, NEC
KaiGai Kohei [EMAIL PROTECTED]



Re: [PATCHES] [HACKERS] [0/4] Proposal of SE-PostgreSQL patches

2008-03-17 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Tue, 18 Mar 2008 10:41:42 +0900
KaiGai Kohei [EMAIL PROTECTED] wrote:

  GPL is a perfectly good license, but it's far from clear whether
  code derived from it can be incorporated into PostgreSQL even if
  you wrote all of it yourself.  I just checked libselinux, and as
  you say it includes a LICENSE file that states This library
  (libselinux) is public domain software, i.e. not copyrighted..
  That's good, but a similar independent review will need to happen
  for every component you interact with here, on top of a technical
  review.  Luckily this is something a lot of people would like and
  that should all get taken care of.
 
 SE-PostgreSQL internally uses libselinux, glibc and PostgreSQL
 internal APIs like SearchSysCache().
 I'm not a lawyer, but I believe they cannot force us to adopt a
 specific license. So, I clearly state that the SE-PostgreSQL feature is
 licensed under the same terms as PostgreSQL.
 Needless to say, more comprehensive checks and reviews are welcome.

Hmmm,

Everything that I read says that libselinux is GPL. That could present
a problem for anyone that wants to use the BSD features of
PostgreSQL :).

I can check with SFLC if people are really curious.

Sincerely,

Joshua D. Drake


- -- 
The PostgreSQL Company since 1997: http://www.commandprompt.com/ 
PostgreSQL Community Conference: http://www.postgresqlconference.org/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
  PostgreSQL political pundit | Mocker of Dolphins

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFH3x/wATb/zqfZUUQRAq6WAJ9h2ecrYrsZ5bJUTJGhyS2LZSOqkACfeGoB
EHwcHtq7Ow5k3AlKNPwOVzs=
=yamT
-END PGP SIGNATURE-



Re: [PATCHES] [HACKERS] [0/4] Proposal of SE-PostgreSQL patches

2008-03-17 Thread KaiGai Kohei

Joshua D. Drake wrote:

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Tue, 18 Mar 2008 10:41:42 +0900
KaiGai Kohei [EMAIL PROTECTED] wrote:


GPL is a perfectly good license, but it's far from clear whether
code derived from it can be incorporated into PostgreSQL even if
you wrote all of it yourself.  I just checked libselinux, and as
you say it includes a LICENSE file that states This library
(libselinux) is public domain software, i.e. not copyrighted..
That's good, but a similar independent review will need to happen
for every component you interact with here, on top of a technical
review.  Luckily this is something a lot of people would like and
that should all get taken care of.

SE-PostgreSQL internally uses libselinux, glibc and PostgreSQL
internal APIs like SearchSysCache().
I'm not a lawyer, but I believe they cannot force us to adopt a
specific license. So, I clearly state that the SE-PostgreSQL feature is
licensed under the same terms as PostgreSQL.
Needless to say, more comprehensive checks and reviews are welcome.


Hmmm,

Everything that I read says that libselinux is GPL. That could present
a problem for anyone that wants to use the BSD features of
PostgreSQL :).


That is incorrect.
SELinux itself is indeed GPL because it is part of the kernel.
But libselinux is public domain software, as follows:
  https://selinux.svn.sourceforge.net/svnroot/selinux/trunk/libselinux/LICENSE

SE-PostgreSQL is linked with *ONLY* libselinux. It communicates with SELinux
via system calls. As you know, the GPL does not consider the invocation of
system calls to be linking. Thus, we can release SE-PostgreSQL as BSD-licensed
software.

Thanks,


I can check with SFLC if people are really curious.

Sincerely,

Joshua D. Drake

--
OSS Platform Development Division, NEC
KaiGai Kohei [EMAIL PROTECTED]



Re: [HACKERS] Recreating archive_status

2008-03-17 Thread Bruce Momjian

Added to TODO:

o Recreate pg_xlog/archive_status/ if it doesn't exist after
  restoring from a PITR backup

  http://archives.postgresql.org/pgsql-hackers/2007-12/msg00487.php


---

Simon Riggs wrote:
 During recovery procedures, there is a step that says
 
 If you didn't archive pg_xlog/ at all, then recreate it, and be sure to
 recreate the subdirectory pg_xlog/archive_status/ as well.
 
 If you forget to do this, you may not realise until the recovering
 server comes up and tries writing to the directory. The message that is
 spat out when this happens is 
 
 LOG:  could not create archive status file
 pg_xlog/archive_status/000103CE009E.ready: No such file or
 directory
 
 We could check this just as the server comes up and then re-create it if
 necessary. So we have one less step in the process to remember. Existing
 scripts which perform this automatically will not need changing.
 
 We can keep the message in case something removes the directory later.
 
 Views?
 
 -- 
   Simon Riggs
   2ndQuadrant  http://www.2ndQuadrant.com
 
 
 ---(end of broadcast)---
 TIP 6: explain analyze is your friend

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [PATCHES] [HACKERS] [0/4] Proposal of SE-PostgreSQL patches

2008-03-17 Thread Tom Lane
Joshua D. Drake [EMAIL PROTECTED] writes:
 Everything that I read says that libselinux is GPL.

What exactly did you read?  The LICENSE file in the source tarball
says

: This library (libselinux) is public domain software, i.e. not copyrighted.
: 
: Warranty Exclusion
: --
: You agree that this software is a
: non-commercially developed program that may contain bugs (as that
: term is used in the industry) and that it may not function as intended.
: The software is licensed as is. NSA makes no, and hereby expressly
: disclaims all, warranties, express, implied, statutory, or otherwise
: with respect to the software, including noninfringement and the implied
: warranties of merchantability and fitness for a particular purpose.
: 
: Limitation of Liability
: ---
: In no event will NSA be liable for any damages, including loss of data,
: lost profits, cost of cover, or other special, incidental,
: consequential, direct or indirect damages arising from the software or
: the use thereof, however caused and on any theory of liability. This
: limitation will apply even if NSA has been advised of the possibility
: of such damage. You acknowledge that this is a reasonable allocation of
: risk.


I have a feeling that NSA did not bother to run this by any actual
lawyers, because the second and third paragraphs are only meaningful as
part of a license (ie something recipients agree to) and by definition
there is no license on public-domain software.  They've given up
*everything*, including the right to make you agree to any terms of
distribution.

Not that anyone would likely be dumb enough to try to sue the NSA,
but it's still pretty funny.

regards, tom lane



Re: [HACKERS] Request for feature - ECPGget_PGconn

2008-03-17 Thread Dave Cramer

Tom,

I believe you were the one who objected to this proposal pre-8.3?

To me it seems innocuous. Most high level db abstractions allow access  
to the underlying connection.


Dave
On 17-Mar-08, at 7:26 AM, Mike Aubury wrote:



Request Overview
----------------
Add a function to return the current PGconn used within ecpg.


Background
----------
For years now within the Aubit4GL project we've been able to access the
PGconn record used by ecpg by the highly dubious means of accessing an
internal record within ecpg (which has now been removed/hidden).
It would be really useful if we could get at the PGconn connection via a
formal API/function call...

This would be useful to others as it would allow libpq calls on the
currently open connection to use features for which there is no direct
ecpg equivalent, or where the functionality has already been implemented
using libpq calls.
(The ability to drop to a lower level of abstraction is common in most
db orientated languages/language extensions like esql/c.)



Implementation
--------------

This could be implemented by adding the following code to the existing
ecpglib/connect.c file:

    PGconn *
    ECPGget_PGconn(const char *connection_name)
    {
        struct connection *con;

        con = ecpg_get_connection(connection_name);
        if (con == NULL)
            return NULL;

        return con->connection;
    }




TIA


--
Mike Aubury

Aubit Computing Ltd is registered in England and Wales, Number:  
3112827

Registered Address : Clayton House,59 Piccadilly,Manchester,M1 2AQ








Re: [PATCHES] [HACKERS] [0/4] Proposal of SE-PostgreSQL patches

2008-03-17 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Mon, 17 Mar 2008 22:45:14 -0400
Tom Lane [EMAIL PROTECTED] wrote:

 Joshua D. Drake [EMAIL PROTECTED] writes:
  Everything that I read says that libselinux is GPL.
 
 What exactly did you read?  The LICENSE file in the source tarball
 says

I was reading a directory listing. I found the actual license file and
you are correct.

 
 Not that anyone would likely be dumb enough to try to sue the NSA,
 but it's still pretty funny.
 

One of those few entities on the planet that it just really doesn't
matter how much money you have. You don't sue them. In fact, NSA who?

Joshua D. Drake


- -- 
The PostgreSQL Company since 1997: http://www.commandprompt.com/ 
PostgreSQL Community Conference: http://www.postgresqlconference.org/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
  PostgreSQL political pundit | Mocker of Dolphins

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFH3y31ATb/zqfZUUQRAgs/AKCJORQO+N4xfTeG6MXvhXFMn06DKwCfT33V
j8xTb5clqdPK3zWnA3XYuMQ=
=mVVu
-END PGP SIGNATURE-



Re: [HACKERS] [0/4] Proposal of SE-PostgreSQL patches

2008-03-17 Thread Josh Berkus
KaiGai,

 A flexible framework is worthwhile for both operating systems.
 Please confirm it to the TS folks.

Yep, that's the idea.  Glenn was fine with your stuff last year, I expect 
it'll still be fine.  

Other than SELinux and TrustedSolaris, does anyone know of other role-based or 
multilevel security frameworks we should check against?  I'd like to have any 
security framework we approve be plug-in adaptable to everything out there.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco



Re: [HACKERS] [PATCHES] [0/4] Proposal of SE-PostgreSQL patches

2008-03-17 Thread KaiGai Kohei

Gregory Stark wrote:

Greg Smith [EMAIL PROTECTED] writes:


On Mon, 17 Mar 2008, KaiGai Kohei wrote:


I'll submit the proposal of SE-PostgreSQL patches again, because some of my
previous messages were filtered due to their attachments and I cannot yet
provide the whole set of patches.

This is actually what you should have done from the beginning.  And it only
should have gone to the pgsql-hackers list, which is the only one I'm replying
to.  Your patches are at this point a proposal, as you say in the subject, and
those go to the pgsql-hackers list with the minimum of files necessary to
support them.  pgsql-patches is generally aimed at patches that have already
been discussed on the hackers list, ones that are basically ready to apply to
the source code.


Some people shout any time you send patches to -hackers. The -patches list is
there mainly to catch large attachments regardless of their maturity.

But it's true that it's best to post a plan and have discussion prior to
developing big patches.


Yes, on reflection, that might have been a better way to develop this feature.

I'm sorry that I could not submit a proposal by the v8.3 feature freeze
date, so the development of SE-PostgreSQL (based on v8.2) overlapped
with the v8.3 development cycle.

I don't want to repeat the same thing twice, so this series of patches
is submitted for the v8.4 development cycle.
The first two of them ([1/4] and [2/4]) are the significant part of
SE-PostgreSQL. We can discuss the rest of them later; they contain a
utility extension and the security policy.

Thanks,
--
OSS Platform Development Division, NEC
KaiGai Kohei [EMAIL PROTECTED]



Re: [HACKERS] count(*) performance improvement ideas

2008-03-17 Thread Stephen Denne
 

Tom Lane wrote
 Pavan Deolasee [EMAIL PROTECTED] writes:
  I am sure this must have been discussed before.
 
 Indeed.  Apparently you didn't find the threads in which the idea of
 having transactions enter delta entries was discussed?  Solves both
 the locking and the MVCC problems, at the cost that you need to make
 cleanup passes over the counter table fairly often.

I can't find any posts that directly address what I was looking for.

In my situation I have a small number of concurrent transactions with each 
transaction running a large number of single row inserts or deletes.

However I'm not after a fast count(*) from table, but more like a fast
select grouping_id, count(*) from my_table group by grouping_id

I initially thought that what you meant by having transactions enter delta 
entries was that I have a trigger that would create a row each time it was 
called, recording how many records were inserted or deleted. I didn't understand 
how this would be much of an improvement, as each of my rows would contain 
either +1 or -1.

But I just realised you might mean including a txid column in my table of 
deltas, and having my trigger insert or update the row where txid = 
txid_current() (per grouping_id).

Is that what is recommended?

No locking problems as each transaction is only updating its own rows.

Can you clarify the lack of MVCC problems?
Do new versions of rows get created if the original version of the row was 
created in the current transaction?
Does this idea apply with the same efficiency in pre 8.3, non-HOT 
implementations?
Any advice on creating or avoiding indexes on the tables in question?

I can think of two options for a performing the cleanup passes using current 
functionality:
1) Using Listen/Notify
Issue a notify whenever a new row is inserted for a transaction. They get 
delivered post transaction commit, don't they? And not at all on rollback? Have 
an application listening for them, performing the aggregation and cleanup work.
2) Use a schedule, based on local knowledge of expected number of transactions 
over time.

So I'd rephrase Pavan's suggestion as a request to have post-transaction commit 
triggers that have access to (at least) the txid of the transaction that was 
committed.

Suggested syntax is to add the option TRANSACTION (or perhaps COMMIT) to 
the CREATE TRIGGER statement:

CREATE TRIGGER name AFTER INSERT OR UPDATE OR DELETE ON table FOR EACH 
TRANSACTION EXECUTE PROCEDURE funcname ( arguments );

Any of the triggering actions on the specified table ensures that the function 
is called once if the transaction commits.
Requires a new TG_LEVEL.
TG_OP could be the first action triggered.
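The proposed once-per-transaction firing rule can be sketched like this (illustrative Python, not an implementation proposal; the class and method names are made up):

```python
class TxnTrigger:
    """Sketch of FOR EACH TRANSACTION semantics: any number of row-level
    events within one transaction result in a single procedure call at
    commit, and no call at all on rollback."""

    def __init__(self, func):
        self.func = func        # the trigger procedure
        self.first_op = None    # TG_OP: the first action that fired

    def row_event(self, op):
        """Called for each INSERT/UPDATE/DELETE row; only records state."""
        if self.first_op is None:
            self.first_op = op

    def commit(self, txid):
        if self.first_op is not None:
            self.func(txid, self.first_op)  # fires once, knows the txid
        self.first_op = None

    def rollback(self):
        self.first_op = None    # aborted transaction: no call
```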

Would implementing this be extremely difficult due to transferring information 
from within the transaction to outside the transaction?
If so, perhaps I'd get the same benefit from having a trigger set up to fire 
pre-commit (or pre-prepare), and be a part of the transaction.
Would the locking difficulties be reduced as the lock would not be required 
till late in the game, and not held for long?

Regards,
Stephen Denne.

Disclaimer:
At the Datamail Group we value team commitment, respect, achievement, customer 
focus, and courage. This email with any attachments is confidential and may be 
subject to legal privilege.  If it is not intended for you please advise by 
reply immediately, destroy it and do not copy, disclose or use it in any way.

__
  This email has been scanned by the DMZGlobal Business Quality 
  Electronic Messaging Suite.
Please see http://www.dmzglobal.com/services/bqem.htm for details.
__





Re: [HACKERS] New style of hash join proposal

2008-03-17 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
  Nested Loop  (cost=5.39..198.81 rows=51 width=244)
    ->  HashAggregate  (cost=1.06..1.11 rows=5 width=4)
          ->  Seq Scan on int4_tbl b  (cost=0.00..1.05 rows=5 width=4)
    ->  Bitmap Heap Scan on tenk1 a  (cost=4.33..39.41 rows=10 width=244)
          Recheck Cond: (a.thousand = b.f1)
          ->  Bitmap Index Scan on tenk1_thous_tenthous  (cost=0.00..4.33 rows=10 width=0)
                Index Cond: (a.thousand = b.f1)
  (7 rows)

 Sure, but that's still re-executing the bitmap index scan 51 times -- 
 possibly
 having to fetch the same records off disk repeatedly.

sorry 5 times

 It's not fetching any record repeatedly, because the HashAggregate
 step eliminated duplicate keys on the other side.

Only because it's an IN query. If it were a normal join, which hash joins are
perfectly capable of handling, then it would be. As it is, it could be fetching
the same page repeatedly, but not precisely the same tuples. 

It happens that transforming this query to 

explain analyze select * from tenk1 a where thousand = any (array(select f1 from int4_tbl b));

makes it run about 40% faster. That could just be because the hash table is
small enough that a linear search is more efficient than calling hashint4,
though. (Perhaps we should be protecting against that in dynahash, actually.)

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] count(*) performance improvement ideas

2008-03-17 Thread Tom Lane
Stephen Denne [EMAIL PROTECTED] writes:
 I initially thought that what you meant by having transactions enter delta 
 entries was that I have a trigger that would create a row each time it was 
 called, writing how many records were inserted or deleted. I didn't 
 understand how this would be much of an improvement, as each of my rows would 
 contain either +1 or -1.

Well, ideally you'd aggregate all the deltas caused by a particular
transaction into one entry in the counting table.  Whether or not that
happens, though, the point of the concept is that some background task
aggregates all the deltas from long-gone transactions into just one base
row, and then deletes the old delta entries.  To get a valid value of
COUNT(*), what onlookers must do is SUM() the base row and delta records
from all transactions that they can see under MVCC rules.  The amount
of work involved is proportional to the number of recent updates, not
the total size of the underlying table.
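
A concrete sketch of the scheme described above, with all names invented for 
illustration:

```sql
-- Hypothetical delta table for one counted table. Ordinary MVCC
-- visibility of these rows is what makes each observer's SUM correct.
CREATE TABLE my_table_count (
    is_base boolean NOT NULL DEFAULT false,  -- true for the aggregated base row
    delta   bigint  NOT NULL
);

-- Each transaction inserts one net delta entry, e.g. after adding 42 rows:
--   INSERT INTO my_table_count (delta) VALUES (42);

-- Onlookers sum the base row plus whatever deltas their snapshot can see:
SELECT SUM(delta) FROM my_table_count;

-- A background task periodically folds deltas from long-gone transactions
-- into the base row and deletes them, keeping the row count proportional
-- to recent activity.
```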

 However I'm not after a fast count(*) from table, but more like a fast
   select grouping_id, count(*) from my_table group by grouping_id

You could apply the same technique across each group id, though this
certainly is getting beyond what any built-in feature might offer.
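
Carrying the group id along is mechanical; a sketch, again with invented names:

```sql
-- Hypothetical per-group delta table: triggers insert per-transaction
-- net deltas tagged with the group they affected.
CREATE TABLE my_table_group_count (
    grouping_id integer NOT NULL,
    delta       bigint  NOT NULL
);

-- The grouped count then becomes a sum over visible deltas per group:
SELECT grouping_id, SUM(delta)
  FROM my_table_group_count
 GROUP BY grouping_id;
```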

 Can you clarify the lack of MVCC problems?

The point there is that the right answer actually depends on the
observer, since each observer might have a different snapshot and
therefore be able to see a different set of committed rows in the
underlying table.  The multiple-delta table handles this automatically,
because you can see a delta entry if and only if you could see
the underlying-table changes it represents.

 Does this idea apply with the same efficiency in pre 8.3, non-HOT 
 implementations?

I didn't claim it was amazingly efficient in any implementation ;-).
HOT in particular is nearly useless since most rows in the count
table will never be updated, only inserted and eventually deleted.
You might get some mileage on the base row, but that'd be about it.
The count table will need frequent vacuums as well as frequent
aggregation scans.

It should beat scanning a large underlying table, but it's hardly
gonna be free.

regards, tom lane
