[HACKERS] About the parameter of API: PQprepared
In libpq, the definition is: PGresult *PQprepare(PGconn *conn, const char *stmtName, const char *query, int nParams, const Oid *paramTypes). Could we remove the parameter nParams? e.g. insert into foo(id, name, address) values ($1, $2, $3). PostgreSQL could presumably parse the prepared SQL statement to get the real parameter count. Or is there another way to do it?
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
Robert Haas wrote: I think you have to resign yourself to the fact that a user who can see only a subset of the rows in a table may very well see apparent foreign-key violations. But so what? So you're leaking information about the rows that they're not supposed to be able to see. This is not what I would call national-security-grade information hiding --- leastwise *I* certainly wouldn't store nuclear weapon design information in such a database. The people that the NSA wants to defend against are more than smart enough, and persistent enough, to extract information through such loopholes. If your plan is to refuse to implement row-level security until someone produces a design that can never leak information under any circumstances regardless of how the user configures it, then I strongly suspect we'll be waiting forever. The designs being put forward here are capable of being configured in an insecure fashion, but that's true of any other security mechanism we offer, too.

In my understanding, the security evaluation criteria do not require preventing this kind of information leak (called an illicit information flow, or covert channel) in all cases. ISO/IEC 15408 is a set of specifications for the security functionality of IT products. We can find its requirements for illicit information flows as follows:
- FDP_IFF.3 Limited illicit information flows, requires the SFP to cover illicit information flows, but not necessarily eliminate them.
- FDP_IFF.4 Partial elimination of illicit information flows, requires the SFP to cover the elimination of some (but not necessarily all) illicit information flows.
- FDP_IFF.5 No illicit information flows, requires the SFP to cover the elimination of all illicit information flows.
(*) FDP_IFF = Functionalities of Data Protection, Information Flow control Functions
(*) A larger trailing number means a stricter requirement.
(*) Common Criteria part 2 at http://www.commoncriteriaportal.org/files/ccfiles/CCPART2V3.1R2.pdf
At least, we can say it is not an all-or-nothing matter.

SE-PostgreSQL needs to be good enough for the NSA, but row-level security in general does not.

BTW, it seems someone misunderstands me as working as an agent of the NSA. That's a joke, isn't it? I've put my effort into improving the security of open source software, not into working for an overseas intelligence agency. Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei [EMAIL PROTECTED] -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
Well, the PGACE documentation says: http://code.google.com/p/sepgsql/wiki/WhatIsPGACE Datum pgacePreparePlanCheck(Relation rel) In the latest patch, this hook is replaced by pgaceBeginPerformCheckFK() and pgaceEndPerformCheckFK(), but its purpose is unchanged. Sorry for the confusing legacy description. I have updated it to reflect the latest patches. Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei [EMAIL PROTECTED] -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parallel pg_restore - WIP patch
Andrew Dunstan wrote: Attached is my WIP patch for parallel pg_restore. It's still very rough, but seems to work. Anyone who can test this with highend equipment would be helping some.

Tried playing with this (on a 22GB compressed dump using 4 connections) but it does not seem to work at all for me:

pg_restore: [custom archiver] out of memory
pg_restore: [custom archiver] could not uncompress data: invalid block type
pg_restore: [custom archiver] out of memory
pg_restore: [custom archiver] could not uncompress data: invalid stored block lengths
pg_restore: [custom archiver] out of memory
pg_restore: [custom archiver] out of memory
pg_restore: [custom archiver] could not uncompress data: invalid distance too far back
pg_restore: [custom archiver] could not uncompress data: invalid distances set
pg_restore: [custom archiver] could not uncompress data: invalid code lengths set
pg_restore: [custom archiver] could not uncompress data: incorrect data check
pg_restore: [custom archiver] could not uncompress data: invalid code lengths set
pg_restore: [custom archiver] out of memory
pg_restore: [custom archiver] out of memory
pg_restore: [custom archiver] out of memory
pg_restore: [custom archiver] out of memory
pg_restore: [custom archiver] could not uncompress data: invalid literal/length code
pg_restore: [custom archiver] could not uncompress data: invalid literal/length code
pg_restore: [custom archiver] could not uncompress data: invalid block type

Each pg_restore process seems to eat a few gigabytes of memory in a few seconds. Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [PATCHES] [HACKERS] Infrastructure changes for recovery
On Thu, 2008-09-25 at 18:28 -0400, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: Version 7 Anyway, that's sufficiently bad that I'm bouncing the patch for reconsideration.

No problem, I understand this needs discussion. There's less detail here than first appears. There are some basic points to consider from which all else follows.

After reading this for awhile, I realized that there is a rather fundamental problem with it: it switches into consistent recovery mode as soon as it's read WAL beyond ControlFile->minRecoveryPoint. In a crash recovery situation that typically is before the last checkpoint (if indeed it's not still zero), and what that means is that this patch will activate the bgwriter and start letting in backends instantaneously after a crash, long before we can have any certainty that the DB state really is consistent. In a normal crash recovery situation this would be easily fixed by simply not letting it go to consistent recovery state at all, but what about recovery from a restartpoint? We don't want a slave that's crashed once to never let backends in again. But I don't see how to determine that we're far enough past the restartpoint to be consistent again. In crash recovery we assume (without proof ;-)) that we're consistent once we reach the end of valid-looking WAL, but that rule doesn't help for a slave that's following a continuing WAL sequence. Perhaps something could be done based on noting when we have to pull in a WAL segment from the recovery_command, but it sounds like a pretty fragile assumption.

Seems like we just say we only signal the postmaster if InArchiveRecovery. Archive recovery from a restartpoint is still archive recovery, so this shouldn't be a problem in the way you mention. The presence of recovery.conf overrides all other cases.

Some other issues I noted before giving up:

All of these issues raised can be addressed, but I think the main decision we need to make is not so much about running other processes but about when it can start and when they have to change mode. When they can start seems solvable, as above. When/how they must change state from recovery to normal mode seems more difficult. State change must be atomic across all processes, but also done at a micro level so that XLogFlush tests for the state change. The overhead of continually checking seems high, so I am tempted to say let's just kick 'em all off and then let them back on again. That's easily accomplished for bgwriter without anybody noticing much. For Hot Standby that would mean that a failover would kick off all query backends. I can see why that would be both desirable and undesirable. Anyway, from here I propose:
* we keep the shutdown checkpoint
* we kick off bgwriter (and any children) then let 'em back on again so they can initialise in a different mode.
To do that, I just need to dust off a previous version of the patch. So we can sort this out quickly if we have a clear way to proceed. -- other comments relate to this current patch, so further discussion of the points below may not be required, if we agree how to proceed as above.

* I'm a bit uncomfortable with the fact that the IsRecoveryProcessingMode flag is read and written with no lock. There's no atomicity or concurrent-write problem, sure, but on a multiprocessor with weak memory ordering guarantees (eg PPC) readers could get a fairly stale value of the flag.
The false to true transition happens before anyone except the startup process is running, so that's no problem; the net effect is then that backends might think that recovery mode was still active for awhile after it wasn't. This seems a bit scary, eg in the patch as it stands that'll disable XLogFlush calls that should have happened. You could fix that by grabbing/releasing some spinlock (any old one) around the accesses, but are any of the call sites performance-critical? The one in XLogInsert looks like it is, if nothing else.

Agreed. It's not a dynamic state, so I can fix that inside IsRecoveryProcessingMode() with a local state to make the check faster.

bool
IsRecoveryProcessingMode(void)
{
    if (!IsRecoveryProcessingMode)
        return false;

    {
        /* use volatile pointer to prevent code rearrangement */
        volatile XLogCtlData *xlogctl = XLogCtl;

        SpinLockAcquire(&xlogctl->mode_lck);
        RecoveryProcessingMode = XLogCtl->IsRecoveryProcessingMode;
        SpinLockRelease(&xlogctl->mode_lck);
    }

    return IsRecoveryProcessingMode;
}

This only applies if we decide not to kick everybody off, change state and then let them back on again.

* I kinda think you broke XLogFlush anyway. It's certainly clear that the WARNING case at the bottom is unreachable with the patch, and I think that means that you've messed up an important error robustness behavior. Is it still possible to get out of recovery mode if
Re: [HACKERS] PostgreSQL future ideas
[EMAIL PROTECTED] (Gevik Babakhani) writes: The advantage of C++ is that it would reduce the lot of OO code written in C in PostgreSQL, but it is such a big effort to do that for a small gain. It will increase the number of bugs. Do not forget also that a C++ compiler is not so common (or so good) on different platforms. If somebody is interested in that, yes, but as a fork (PostgreSQL++ :-). Reducing OO code that is written in C is one of my major interests. After some investigating myself it appears that having the codebase fully rewritten in C++ will have an impact on the performance. So I guess such an effort will result in the code being more C++ish and fully OO, being a mixture in C with some OO taste.

I'm not convinced that it would be a good idea at all to make the system fully OO, nor that C++ would be a meaningful tool to use to that end. After all, C++ can certainly be used in decidedly non-OO ways. For instance, STL is NOT an OO framework, and the author of STL, obviously something of a fan of C++, characterizes OO as almost as much of a hoax as Artificial Intelligence. http://en.wikipedia.org/wiki/Object-oriented_programming#Criticism I tend to agree with that characterization. Further, C++ suffers from the same not-OO-at-its-base problem as Java, which contributes complexity as well as hurting the OO-ness of it.

A better idea is to start to use C99 in PostgreSQL ;-). I have not investigated this yet. But I am very interested to know what the advantages would be to upgrade the code to C99 standards.

It would give us heartburn on any platforms where the preferred compiler doesn't grok C99, for sure. As much as I'm ok with using GCC, it would seem unfortunate to force people into using GCC everywhere, and preclude using other compilers. (And actually, I'm more ambivalent about GCC than that; I'm not totally happy with how GCC has gone, but that's another tale for another day...) -- (format nil [EMAIL PROTECTED] cbbrowne acm.org) http://linuxdatabases.info/info/linux.html NT 5.0 is the last nail in the Unix coffin. Interestingly, Unix isn't in the coffin... It's wondering what the heck is sealing itself into a wooden box 6 feet underground... -- Jason McMullan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: move column defaults into pg_attribute along with attacl
Alvaro Herrera wrote: Tom Lane wrote: I think adding a subobject column to pg_shdepend is probably the best answer --- we only didn't do that to start with because we thought it wasn't needed. Yep. I did consider adding it, but there was no use for it at the time so I just left it out. It's not like it's very difficult to add it at this point. Sorry, I lost track here and would like to know the recommended course of action for ACLs. I see these issues: * make pg_attribute store VARLENA and/or NOT NULL fields * dependency tracking of defaults (and ACLs) * where to place ACLs (pg_attribute, merge with pg_attrdef or yet another pg_attracl table) So far I understood that merging pg_attrdef into pg_attribute is unwanted due to complexity at DROP TABLE. What does the subobject column for pg_shdepend buy us? Clarification appreciated. Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parallel pg_restore - WIP patch
Hi, As I'm interested in this topic, I thought I'd take a look at the patch. I have no capability to test it on high end hardware but did some basic testing on my workstation and a basic review of the patch. I somehow had the impression that instead of creating a new connection for each restore item we would create the processes at the start and then send them the dumpId's they should be restoring. That would allow the controller to batch dumpId's together and expect the worker to process them in a transaction. But this is probably just an idea I created in my head.

Do we know why we experience tuple concurrently updated errors if we spawn threads too fast?

I completed some test restores using the pg_restore from head with the patch applied. The dump was a custom dump created with pg 8.2 and restored to an 8.2 database. To confirm this would work, I completed a restore using the standard single threaded mode. The schema restored successfully. The only errors reported involved non-existent roles. When I attempt to restore using parallel restore I get out of memory errors reported from _PrintData. The code returning the error is:

_PrintData(...
    while (blkLen != 0)
    {
        if (blkLen + 1 > ctx->inSize)
        {
            free(ctx->zlibIn);
            ctx->zlibIn = NULL;
            ctx->zlibIn = (char *) malloc(blkLen + 1);
            if (!ctx->zlibIn)
                die_horribly(AH, modulename, "out of memory\n");

            ctx->inSize = blkLen + 1;
            in = ctx->zlibIn;
        }

It appears from my debugging and looking at the code that in _PrintData;

    lclContext *ctx = (lclContext *) AH->formatData;

the memory context is shared across all threads. Which means that it's possible the memory contexts are stomping on each other. My GDB skills are not up to being able to reproduce this in a gdb session, as there are forks going on all over the place. And if you process them in a serial fashion, there aren't any errors. I'm not sure of the fix for this. But in a parallel environment it doesn't seem possible to store the memory context in the AH.

I also receive messages saying pg_restore: [custom archiver] could not read from input file: end of file. I have not investigated these further as my current guess is they are linked to the out of memory error. Given I ran into this error at my first testing attempt I haven't evaluated much else at this point in time. Now all this could be because I'm using the 8.2 archive, but it works fine in single restore mode. The dump file is about 400M compressed and an entire archive schema was removed from the restore path with a custom restore list. Command line used:

PGPORT=5432 ./pg_restore -h /var/run/postgresql -m4 --truncate-before-load -v -d tt2 -L tt.list /home/mr-russ/pg-index-test/timetable.pgdump 2> log.txt

I sent the log and this email originally to the list, but I think the attachment was too large, so I've resent without any attachments. Since my initial testing, Stefan has confirmed the problem I am having. If you have any questions, would like me to run other tests or anything, feel free to contact me. Regards Russell -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [PATCHES] [HACKERS] Infrastructure changes for recovery
On Fri, 2008-09-26 at 11:20 +0100, Simon Riggs wrote: After reading this for awhile, I realized that there is a rather fundamental problem with it: it switches into consistent recovery mode as soon as it's read WAL beyond ControlFile->minRecoveryPoint. In a crash recovery situation that typically is before the last checkpoint (if indeed it's not still zero), and what that means is that this patch will activate the bgwriter and start letting in backends instantaneously after a crash, long before we can have any certainty that the DB state really is consistent. In a normal crash recovery situation this would be easily fixed by simply not letting it go to consistent recovery state at all, but what about recovery from a restartpoint? We don't want a slave that's crashed once to never let backends in again. But I don't see how to determine that we're far enough past the restartpoint to be consistent again. In crash recovery we assume (without proof ;-)) that we're consistent once we reach the end of valid-looking WAL, but that rule doesn't help for a slave that's following a continuing WAL sequence. Perhaps something could be done based on noting when we have to pull in a WAL segment from the recovery_command, but it sounds like a pretty fragile assumption.

Seems like we just say we only signal the postmaster if InArchiveRecovery. Archive recovery from a restartpoint is still archive recovery, so this shouldn't be a problem in the way you mention. The presence of recovery.conf overrides all other cases.

Anticipating your possible responses, I would add this also: There has long been an annoying hole in the PITR scheme, which is the support of recovery using a crashed database. That is there to support split mirror snapshots, but it creates a loophole where we don't know the min recovery location, circumventing the care we (you!) took to put stop/start backup in place. I think we need to add a parameter to recovery.conf that people can use to specify a minRecoveryPoint iff there is no backup label file. They can work out what this should be by following this procedure, which we should document:
* split mirror, so you have an offline copy of the crashed database
* copy the database away to backup
* go to the running database and run pg_current_xlog_insert_location()
* use the value to specify recovery_min_location
If they don't specify this, then bgwriter will not start and you cannot run in Hot Standby mode. Their choice, so we need not worry then about the loophole any more. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
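To make Simon's procedure above concrete, here is a sketch of what recovery.conf on the copied-away cluster might look like. recovery_min_location is only the parameter being proposed in this message, not an existing setting, and the LSN shown is a made-up value standing in for the output of pg_current_xlog_insert_location() on the running primary at the time the mirror was split.

# recovery.conf on the split-mirror copy (sketch; paths and values are examples)
restore_command = 'cp /archive/%f %p'

# Proposed, not yet implemented: the WAL location the copy must replay past
# before it may be considered consistent, taken from
# pg_current_xlog_insert_location() on the live server when the mirror was split.
recovery_min_location = '2/5A3F1C80'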
Re: [HACKERS] parallel pg_restore - WIP patch
Russell Smith wrote: Hi, As I'm interested in this topic, I thought I'd take a look at the patch. I have no capability to test it on high end hardware but did some basic testing on my workstation and a basic review of the patch. I somehow had the impression that instead of creating a new connection for each restore item we would create the processes at the start and then send them the dumpId's they should be restoring. That would allow the controller to batch dumpId's together and expect the worker to process them in a transaction. But this is probably just an idea I created in my head.

Yes it is. To do that I would have to invent a protocol for talking to the workers, etc, and there is not the slightest chance I would get that done by November. And I don't see the virtue in processing them all in a transaction. I've provided a much simpler means of avoiding WAL logging of the COPY.

Do we know why we experience tuple concurrently updated errors if we spawn threads too fast?

No. That's an open item.

I completed some test restores using the pg_restore from head with the patch applied. The dump was a custom dump created with pg 8.2 and restored to an 8.2 database. To confirm this would work, I completed a restore using the standard single threaded mode. The schema restored successfully. The only errors reported involved non-existent roles. When I attempt to restore using parallel restore I get out of memory errors reported from _PrintData. The code returning the error is:

_PrintData(...
    while (blkLen != 0)
    {
        if (blkLen + 1 > ctx->inSize)
        {
            free(ctx->zlibIn);
            ctx->zlibIn = NULL;
            ctx->zlibIn = (char *) malloc(blkLen + 1);
            if (!ctx->zlibIn)
                die_horribly(AH, modulename, "out of memory\n");

            ctx->inSize = blkLen + 1;
            in = ctx->zlibIn;
        }

It appears from my debugging and looking at the code that in _PrintData;

    lclContext *ctx = (lclContext *) AH->formatData;

the memory context is shared across all threads. Which means that it's possible the memory contexts are stomping on each other. My GDB skills are not up to being able to reproduce this in a gdb session, as there are forks going on all over the place. And if you process them in a serial fashion, there aren't any errors. I'm not sure of the fix for this. But in a parallel environment it doesn't seem possible to store the memory context in the AH.

There are no threads, hence nothing is shared. fork() creates a new process, not a new thread, and all they share are file descriptors.

I also receive messages saying pg_restore: [custom archiver] could not read from input file: end of file. I have not investigated these further as my current guess is they are linked to the out of memory error. Given I ran into this error at my first testing attempt I haven't evaluated much else at this point in time. Now all this could be because I'm using the 8.2 archive, but it works fine in single restore mode. The dump file is about 400M compressed and an entire archive schema was removed from the restore path with a custom restore list. Command line used:

PGPORT=5432 ./pg_restore -h /var/run/postgresql -m4 --truncate-before-load -v -d tt2 -L tt.list /home/mr-russ/pg-index-test/timetable.pgdump 2> log.txt

I've attached the log.txt file so you can review the errors that I saw. I have adjusted the out of memory error to include a number to work out which one was being triggered. So you'll see 5 out of memory in the log file, which corresponds to the code above.

However, there does seem to be something odd happening with the compression lib, which I will investigate. Thanks for the report.
cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: new border setting in psql
So what have we decided about this suggestion? Should I submit the patch or just forget about it? So far some people like it and some people think that it is unnecessary. No one so far has suggested that it would harm the system or people's use of it. -- D'Arcy J.M. Cain [EMAIL PROTECTED] | Democracy is three wolves http://www.druid.net/darcy/| and a sheep voting on +1 416 425 1212 (DoD#0082)(eNTP) | what's for dinner. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parallel pg_restore - WIP patch
Stefan Kaltenbrunner wrote: Andrew Dunstan wrote: Attached is my WIP patch for parallel pg_restore. It's still very rough, but seems to work. Anyone who can test this with highend equipment would be helping some. tried playing with this(on a 22Gb compressed dump using 4 connections) but it does not seem to work at all for me: pg_restore: [custom archiver] out of memory pg_restore: [custom archiver] could not uncompress data: invalid block type pg_restore: [custom archiver] out of memory pg_restore: [custom archiver] could not uncompress data: invalid stored block lengths pg_restore: [custom archiver] out of memory pg_restore: [custom archiver] out of memory pg_restore: [custom archiver] could not uncompress data: invalid distance too far back pg_restore: [custom archiver] could not uncompress data: invalid distances set pg_restore: [custom archiver] could not uncompress data: invalid code lengths set pg_restore: [custom archiver] could not uncompress data: incorrect data check pg_restore: [custom archiver] could not uncompress data: invalid code lengths set pg_restore: [custom archiver] out of memory pg_restore: [custom archiver] out of memory pg_restore: [custom archiver] out of memory pg_restore: [custom archiver] out of memory pg_restore: [custom archiver] could not uncompress data: invalid literal/length code pg_restore: [custom archiver] could not uncompress data: invalid literal/length code pg_restore: [custom archiver] could not uncompress data: invalid block type each pg_restore process seem to eat a few gigabytes of memory in a few seconds. Ouch. Ok, Thanks for the report. I will investigate. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
We already have an optional OID system column that can be specified during table creation (WITH OIDS). We could have another optional oid column (WITH ROW SECURITY) called security_context which would store the oid of the role that can see the row; if the oid is zero (InvalidOid),

A role alone is not sufficient. It needs to be the proposed mapping to pg_security.

anyone can see it. SE-PostgreSQL would default to WITH ROW SECURITY and use the oid to look up strings in pg_security.

The above explanation is not correct, as Tom mentioned. The security system column is declared as TEXT type, however, every tuple has an Oid value that points into the pg_security system catalog. This avoids wasting storage. When a user reads the system column, it is translated from the Oid to the text representation.

Imho the important points Bruce wanted to make are:
1. there is only one extra oid storage column per row (regardless of whether it is translated to text upon select); this is already the case in the patch.
2. the column(s) are system columns, so they do not show up in select *.
I think having access to the oid directly could be beneficial to performance, e.g. a smart client could cache pg_security and map the OIDs to text locally instead of transferring the quite verbose text representation for every row. That may be moot, because showing the security_context definitely sounds more like an admin kind of functionality. Traditionally the column would probably be oid and SQL would need to cast it for the text representation (e.g. security_context::regsecurity). Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parallel pg_restore - WIP patch
Andrew Dunstan wrote: Do we know why we experience tuple concurrently updated errors if we spawn threads too fast? No. That's an open item.

Okay, I'll see if I can have a little more of a look into it. No promises, as the restore isn't playing nicely.

the memory context is shared across all threads. Which means that it's possible the memory contexts are stomping on each other. My GDB skills are not up to being able to reproduce this in a gdb session, as there are forks going on all over the place. And if you process them in a serial fashion, there aren't any errors. I'm not sure of the fix for this. But in a parallel environment it doesn't seem possible to store the memory context in the AH.

There are no threads, hence nothing is shared. fork() creates a new process, not a new thread, and all they share are file descriptors.

However, there does seem to be something odd happening with the compression lib, which I will investigate. Thanks for the report.

I'm sorry, I meant processes there. I'm aware there are no threads. But my feeling was that when you forked with open files you got all of the open file properties, including positions, and as you dupped the descriptor, you share all that it's pointing to with every other copy of the descriptor. My brief research on that shows that in 2005 there was a kernel mailing list discussion on this issue. http://mail.nl.linux.org/kernelnewbies/2005-09/msg00479.html was quite informative for me. I again could be wrong, but it's worth a read. If it is true, then the file needs to be reopened by each child; it can't use the duplicated descriptor. I haven't had a chance to implementation-test it as it's late here. But I'd take a stab that it will solve the compression library problems. I hope this helps, not hinders. Russell. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
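A minimal sketch (not pg_restore code) of the behaviour Russell describes: after fork(), parent and child share a single open file description, so a read() in one process moves the file offset seen by the other, whereas a fresh open() in the child would give it an independent offset. The file name is hypothetical.

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/wait.h>

int
main(void)
{
    char buf[4];
    int  fd = open("dump.bin", O_RDONLY);   /* hypothetical archive file */

    if (fd < 0)
    {
        perror("open");
        return 1;
    }

    if (fork() == 0)
    {
        /* child: this read advances the offset shared with the parent */
        read(fd, buf, sizeof(buf));
        _exit(0);
    }
    wait(NULL);

    /*
     * Parent: the offset is now 4, not 0, because the inherited descriptor
     * shares its file position with the child's copy.  A worker that needs
     * its own position must open() the file again rather than reuse fd.
     */
    printf("parent offset after child read: %ld\n",
           (long) lseek(fd, 0, SEEK_CUR));
    close(fd);
    return 0;
}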
Re: [HACKERS] FSM, now without WAL-logging
Heikki Linnakangas wrote: Zdenek Kotala wrote: Attached is a new version, now with WAL-logging of the FSM truncation. I decided to go with the separate WAL record for that, rather than piggybacking on the smgrtruncate's WAL record. It seems much better from a modularity point of view this way. I've also worked on the comments, renamed many of the internal functions in a more coherent scheme, and I also started using the struct FSMAddress you suggested a while ago. But I don't think I've changed anything that could explain that crash. Let me know if it still doesn't work.

This version works on my old repo. I performed a performance test (iGEN) on a SUN x4600 with 60 concurrent users and see these results:

Original:
-
MQThL (Maximum Qualified Throughput LIGHT): 1209.60 tpm
MQThM (Maximum Qualified Throughput MEDIUM): 2576.72 tpm
MQThH (Maximum Qualified Throughput HEAVY): 2191.20 tpm
TRANSACTION MIX
Total number of transactions = 181232
TYPE         TX. COUNT   MIX
Light:       30240       16.69%
Medium:      64418       35.54%
DSS:         19865       10.96%
Heavy:       54780       30.23%
Connection:  11929       6.58%
RESPONSE TIMES   AVG.   MAX.   90TH
Light            0.304  6.405  0.400
Medium           0.317  6.533  0.400
DSS              0.266  6.343  0.020
Heavy            0.361  6.737  3.000
Connections      0.264  5.983  0.400
Number of users = 60
Sum of Avg. RT * TPS for all Tx. Types = 32.770142

FSM with WAL
-
MQThL (Maximum Qualified Throughput LIGHT): 1199.36 tpm
MQThM (Maximum Qualified Throughput MEDIUM): 2569.12 tpm
MQThH (Maximum Qualified Throughput HEAVY): 2171.64 tpm
TRANSACTION MIX
Total number of transactions = 180625
TYPE         TX. COUNT   MIX
Light:       29984       16.60%
Medium:      64228       35.56%
DSS:         20181       11.17%
Heavy:       54291       30.06%
Connection:  11941       6.61%
RESPONSE TIMES   AVG.   MAX.   90TH
Light            0.309  6.560  0.400
Medium           0.323  6.529  0.400
DSS              0.268  6.327  0.020
Heavy            0.360  6.675  3.000
Connections      0.274  6.359  0.400
Number of users = 60
Sum of Avg. RT * TPS for all Tx. Types = 32.845712

FSM no WAL last version
---
MQThL (Maximum Qualified Throughput LIGHT): 1207.92 tpm
MQThM (Maximum Qualified Throughput MEDIUM): 2611.84 tpm
MQThH (Maximum Qualified Throughput HEAVY): 2177.68 tpm
TRANSACTION MIX
Total number of transactions = 18
TYPE         TX. COUNT   MIX
Light:       30198       16.57%
Medium:      65296       35.83%
DSS:         20118       11.04%
Heavy:       54442       29.88%
Connection:  12168       6.68%
RESPONSE TIMES   AVG.   MAX.   90TH
Light            0.301  6.106  0.400
Medium           0.315  6.130  0.400
DSS              0.261  5.977  0.020
Heavy            0.361  6.220  3.000
Connections      0.260  6.044  0.400
Number of users = 60
Sum of Avg. RT * TPS for all Tx. Types = 32.696832
-

I don't see any big difference. Throughput is similar. Only response time seems to be better with your last FSM version. I'm personally happy with the performance. Zdenek -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] About the parameter of API: PQprepared
iihero [EMAIL PROTECTED] writes: In libpq, the definition is like: PGresult * PQprepare(PGconn *conn, const char *stmtName, const char *query, int nParams, const Oid *paramTypes) Could we remove the parameter nParams? No. We are not going to break every existing user of the function. But you can already set it to zero if you don't feel like prespecifying the parameter types. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
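For anyone following the question, a minimal sketch (not from the thread) of what Tom describes: passing zero for nParams and NULL for paramTypes leaves every parameter type for the server to infer when it parses the statement; the real parameter count only matters at execution time. The connection string, table, and statement name are placeholders.

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn     *conn = PQconnectdb("dbname=test");   /* placeholder conninfo */
    PGresult   *res;
    const char *values[3] = {"1", "fred", "12 Some Street"};

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    /* nParams = 0, paramTypes = NULL: no types pre-specified, server infers $1..$3 */
    res = PQprepare(conn, "ins_foo",
                    "insert into foo(id, name, address) values ($1, $2, $3)",
                    0, NULL);
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
        fprintf(stderr, "prepare failed: %s", PQerrorMessage(conn));
    PQclear(res);

    /* the actual number of parameters is supplied when executing */
    res = PQexecPrepared(conn, "ins_foo", 3, values, NULL, NULL, 0);
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
        fprintf(stderr, "insert failed: %s", PQerrorMessage(conn));
    PQclear(res);

    PQfinish(conn);
    return 0;
}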
Re: [HACKERS] FSM, now without WAL-logging
Zdenek Kotala wrote: I performed performance test (iGEN) on SUN x4600 with 60 concurrent users and see result: ... I don't see any big difference. Throughput is similar. Only response time seems to be better with your last FSM version. I personally happy with performance.

Thanks! I've been running DBT-2 tests myself, and I'm not seeing any difference there either:

testno  TPM        NO90  time  comment
36      14051.724        6h    fsm-nowal
35      14210.761        6h    CVS HEAD
34      14391.066        2h    fsm-nowal
33      14420.868        2h    CVS HEAD

The NO90 is the 90% percentile of New Order transaction response time. There's big variation there in the 6h tests, because of a huge dip in performance near the end of the test, both with and without the patch. I don't have an explanation for the dips, but it throws off the response times for the whole tests. Given that in the 2h tests, which is exactly the same as the 6h test, just shorter, there's no degradation in the response times, I'm not worried about that. I don't have access to the site where I used to publish the test results earlier, but let me know if you want to see the full test results, and I'll try to zip them up and FTP somewhere (~500 MB uncompressed). I've also tried various pgbench tests, on a RAM disk and otherwise, as well as the table population test I ran earlier, and don't see any difference in performance. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parallel pg_restore - WIP patch
Russell Smith wrote: I'm sorry, I meant processes there. I'm aware there are no threads. But my feeling was that when you forked with open files you got all of the open file properties, including positions, and as you dupped the descriptor, you share all that it's pointing to with every other copy of the descriptor. My brief research on that shows that in 2005 there was a kernel mailing list discussion on this issue. http://mail.nl.linux.org/kernelnewbies/2005-09/msg00479.html was quite informative for me. I again could be wrong but worth a read. If it is true, then the file needs to be reopened by each child, it can't use the duplicated descriptor. I haven't had a change to implementation test is as it's late here. But I'd take a stab that it will solve the compression library problems. I hope this helps, not hinders I'm sure that's the problem. Should be fairly easily fixable, I believe. Thanks for the info. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
On Thu, Sep 25, 2008 at 08:57:46PM -0400, Tom Lane wrote: Another point is that the proposed behavior leaks quite a lot of information, since it will fail operations on the basis of tuples that supposedly aren't visible to the invoking user. While I admit that it's hard to see an alternative if we're to preserve FK integrity, I have to worry that this definition isn't going to satisfy the tin-foil-hat brigade that are supposed to be the main users of SEPostgres. If the goal is you don't know the row is there, this doesn't seem to meet it. The above point, and other similar ones in every discussion of the proposed functionality, makes me think once again either that the requirements for this feature aren't understood by everyone, or else that they're not actually explicit enough. I have a feeling it's the latter. Certainly, I've not yet read a complete security analysis of a data system security plan that outlines why the proposed model is correct. What I think is really happening with this development is that the SE-Linux understanding of security enhancement has been taken as the correct analysis for how one secures an information system. That deep assumption appears to me to be informing much of the development of SE-PostgreSQL. In particular, that deep assumption includes an assumption that consistency of access control trumps all. The Postgres developers who are questioning the SE approach are (I think) coming at this from the point of view of data systems developers, where consistency of the data set trumps all. I suspect that the tension between these approaches will not be reconciled without a fairly complete outline of possible security models for data systems, their relationship to what the OS security people have decided is the right thing to do, and the trade-offs necessary to make different priorities work. Some of the trade offs may include things like violate traditional understanding of data set consistency and possible disclosure of existence of datum. I think this will be a lot of work, and I'm not volunteering to do it. I nevertheless think that without it, the SE-PostgreSQL features will continue to be a very awkward fit. A -- Andrew Sullivan [EMAIL PROTECTED] +1 503 667 4564 x104 http://www.commandprompt.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
On Thu, Sep 25, 2008 at 10:32:24PM -0400, Tom Lane wrote: I can't escape the lurking suspicion that some bright folk inside the NSA have spent years thinking about this and have come up with some reasonably self-consistent definition of row hiding in a SQL database. But have they published it where we can find it? I have a couple contacts in the security world who might be able to help with references. I'm asking them now. A -- Andrew Sullivan [EMAIL PROTECTED] +1 503 667 4564 x104 http://www.commandprompt.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] FSM, now without WAL-logging
Heikki Linnakangas wrote: Zdenek Kotala wrote: snip I've also tried various pgbench tests, on a RAM disk and otherwise, as well as the table population test I ran earlier, and don't see any difference in performance. I think performance is OK. Zdenek -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
Andrew Sullivan [EMAIL PROTECTED] writes: The above point, and other similar ones in every discussion of the proposed functionality, makes me think once again either that the requirements for this feature aren't understood by everyone, or else that they're not actually explicit enough. I have a feeling it's the latter. Yeah, I think that's exactly the problem here: we've got this large patch and no agreement on just what requirements it's supposed to meet. Perhaps others see it differently, but I feel like I'm being told that whatever the patch does is the right thing by definition ... and yet it doesn't seem to meet what I would think are the likely requirements of the users who might actually want such features. Agreeing on the requirements seems like a necessary condition for arriving at any consensus on a patch. Where can we get some evidence that would convince everyone that the requirements for a highly secure database are X, Y and Z? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
Zeugswetter Andreas OSB sIT wrote: anyone can see it. SE-PostgreSQL would default to WITH ROW SECURITY and use the oid to look up strings in pg_security. The above explanation is not correct, as Tom mentioned. The security system column is declared as TEXT type, however, every tuple has an Oid value that points into the pg_security system catalog. This avoids wasting storage. When a user reads the system column, it is translated from the Oid to the text representation. Imho the important points Bruce wanted to make are: 1. there is only one extra oid storage column per row (regardless of whether it is translated to text upon select); this is already the case in the patch. 2. the column(s) are system columns, so they do not show up in select *. I think having access to the oid directly could be beneficial to performance, e.g. a smart client could cache pg_security and map the OIDs to text locally instead of transferring the quite verbose text representation for every row. That may be moot, because showing the security_context definitely sounds more like an admin kind of functionality. Traditionally the column would probably be oid and SQL would need to cast it for the text representation (e.g. security_context::regsecurity).

In most cases, SE-PostgreSQL does not need to translate the security identifier into its text representation, because it caches the results of access checks between the client and the recently used security contexts. SE-PostgreSQL can make its decision by referring to the internal hash table keyed by the security Oid. (See src/backend/security/sepgsql/avc.c.) When a user asks to see the security_context, it is necessary to look up pg_security to translate the security Oid into the text representation, but I guess that is not frequent. Thanks, -- KaiGai Kohei [EMAIL PROTECTED] -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
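A schematic illustration (not the actual avc.c code, and every name here is invented) of the caching idea KaiGai describes: decisions are keyed by the numeric security Oid carried on each tuple, so the verbose text form of the context never has to be produced on the hot path.

#include <stdbool.h>
#include <stdio.h>

typedef unsigned int Oid;

/* one cached decision: security Oid -> allowed/denied */
typedef struct AvcEntry
{
    Oid  secid;
    bool valid;
    bool allowed;
} AvcEntry;

#define AVC_SLOTS 256
static AvcEntry avc_cache[AVC_SLOTS];

/* stand-in for the real policy check; behaviour invented for the example */
static bool
check_policy_for(Oid secid)
{
    return (secid % 2) == 0;
}

/* look up a decision by security Oid; only cache misses consult the policy */
static bool
avc_permit(Oid secid)
{
    AvcEntry *e = &avc_cache[secid % AVC_SLOTS];

    if (!e->valid || e->secid != secid)
    {
        e->secid = secid;
        e->allowed = check_policy_for(secid);
        e->valid = true;
    }
    return e->allowed;
}

int
main(void)
{
    /* repeated checks against the same Oid reuse the cached decision */
    printf("%d %d %d\n", avc_permit(16385), avc_permit(16385), avc_permit(16386));
    return 0;
}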
[HACKERS] planned maintenance downtime - tribble.postgresql.org
The sysadmin team would like to announce a planned maintenance window for OS related updates on tribble.postgresql.org starting Sunday Sep 28 07:00 GMT (expected to last for an hour) affecting the following publicly visible services: cvs.postgresql.org wwwmaster.postgresql.org www.pgadmin.org doxygen.postgresql.org wiki.postgresql.org I would ask people to hold off on any changes or commits to the affected services during that time period until you see an explicit "it's done". regards Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parallel pg_restore - WIP patch
This version of the patch should fix the shared file descriptor bug Russell Smith noticed. It also disables the 1/2 second sleep between forks, so the performance on a small db (regression) is vastly improved. cheers andrew Index: pg_backup.h === RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_backup.h,v retrieving revision 1.47 diff -c -r1.47 pg_backup.h *** pg_backup.h 13 Apr 2008 03:49:21 - 1.47 --- pg_backup.h 26 Sep 2008 15:15:38 - *** *** 123,128 --- 123,130 int suppressDumpWarnings; /* Suppress output of WARNING entries * to stderr */ bool single_txn; + int number_of_threads; + booltruncate_before_load; bool *idWanted; /* array showing which dump IDs to emit */ } RestoreOptions; *** *** 165,170 --- 167,173 extern void CloseArchive(Archive *AH); extern void RestoreArchive(Archive *AH, RestoreOptions *ropt); + extern void RestoreArchiveParallel(Archive *AH, RestoreOptions *ropt); /* Open an existing archive */ extern Archive *OpenArchive(const char *FileSpec, const ArchiveFormat fmt); Index: pg_backup_archiver.c === RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_backup_archiver.c,v retrieving revision 1.158 diff -c -r1.158 pg_backup_archiver.c *** pg_backup_archiver.c 5 Sep 2008 23:53:42 - 1.158 --- pg_backup_archiver.c 26 Sep 2008 15:15:39 - *** *** 27,38 --- 27,50 #include unistd.h + #include sys/types.h + #include sys/wait.h + + #ifdef WIN32 #include io.h #endif #include libpq/libpq-fs.h + typedef struct _parallel_slot + { + pid_t pid; + TocEntry *te; + DumpId dumpId; + } ParallelSlot; + + #define NO_SLOT (-1) const char *progname; *** *** 70,76 --- 82,99 static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim); static OutputContext SetOutput(ArchiveHandle *AH, char *filename, int compression); static void ResetOutput(ArchiveHandle *AH, OutputContext savedContext); + static bool work_is_being_done(ParallelSlot *slot, int n_slots); + static int get_next_slot(ParallelSlot *slots, int n_slots); + static TocEntry *get_next_work_item(ArchiveHandle *AH); + static void prestore(ArchiveHandle *AH, TocEntry *te); + static void mark_work_done(ArchiveHandle *AH, pid_t worker, ParallelSlot *slots, int n_slots); + static int _restore_one_te(ArchiveHandle *ah, TocEntry *te, RestoreOptions *ropt,bool is_parallel); + static void _reduce_dependencies(ArchiveHandle * AH, TocEntry *te); + static void _fix_dependency_counts(ArchiveHandle *AH); + static void _inhibit_data_for_failed_table(ArchiveHandle *AH, TocEntry * te); + + static ArchiveHandle *GAH; /* * Wrapper functions. 
*** *** 125,137 /* Public */ void RestoreArchive(Archive *AHX, RestoreOptions *ropt) { ArchiveHandle *AH = (ArchiveHandle *) AHX; TocEntry *te; teReqs reqs; OutputContext sav; - bool defnDumped; AH-ropt = ropt; AH-stage = STAGE_INITIALIZING; --- 148,529 /* Public */ void + RestoreArchiveParallel(Archive *AHX, RestoreOptions *ropt) + { + + ArchiveHandle *AH = (ArchiveHandle *) AHX; + ParallelSlot *slots; + int next_slot; + TocEntry *next_work_item = NULL; + int work_status; + pid_t ret_child; + int n_slots = ropt-number_of_threads; + TocEntry *te; + teReqsreqs; + + + /* AH-debugLevel = 99; */ + /* some routines that use ahlog() don't get passed AH */ + GAH = AH; + + ahlog(AH,1,entering RestoreARchiveParallel\n); + + + slots = (ParallelSlot *) calloc(sizeof(ParallelSlot),n_slots); + AH-ropt = ropt; + + if (ropt-create) + die_horribly(AH,modulename, + parallel restore is incompatible with --create\n); + + if (ropt-dropSchema) + die_horribly(AH,modulename, + parallel restore is incompatible with --clean\n); + + if (!ropt-useDB) + die_horribly(AH,modulename, + parallel restore requires direct database connection\n); + + + #ifndef HAVE_LIBZ + + /* make sure we won't need (de)compression we haven't got */ + if (AH-compression != 0 AH-PrintTocDataPtr != NULL) + { + for (te = AH-toc-next; te != AH-toc; te = te-next) + { + reqs = _tocEntryRequired(te, ropt, false); + if (te-hadDumper (reqs REQ_DATA) != 0) + die_horribly(AH, modulename, + cannot restore from compressed archive (compression not supported in this installation)\n); + } + } + #endif + + ahlog(AH, 1, connecting to database for restore\n); + if (AH-version K_VERS_1_3) + die_horribly(AH, modulename, + direct database connections are not supported in pre-1.3 archives\n); + + /* XXX Should get this from the archive */ + AHX-minRemoteVersion = 070100; + AHX-maxRemoteVersion = 99; + + /* correct dependency counts in case we're doing a
Re: [HACKERS] PostgreSQL future ideas
On Sep 25, 2008, at 5:50 PM, Chris Browne wrote: [EMAIL PROTECTED] (Gevik Babakhani) writes: Advantage of C++ is that it reduce lot of OO code written in C in PostgreSQL, but it is so big effort to do that without small gain. It will increase number of bugs. Do not forget also that C++ compiler is not so common (so good) on different platforms. If somebody interesting in that yes but like a fork ( PostgreSQL++ :-). Reducing OO code that is written in C is one of my major interests. After some investigating myself it appears that having the codebase fully (rewritten in C++ will have an impact on the performance. So I guess such an effort will result the code being more C++ish and fully OO, being a mixture in C with some OO taste. I'm not convinced that it would a good idea at all to make the system fully OO, nor that C++ would be a meaningful tool to use to that end. After all, C++ can certainly be used in decidedly non-OO ways. For instance, STL is NOT an OO framework, and the author of STL, obviously something of a fan of C++, characterizes OO as almost as much of a hoax as Artificial Intelligence. http://en.wikipedia.org/wiki/Object-oriented_programming#Criticism I tend to agree with that characterization. Further, C++ suffers from the same not OO at its base problem of Java, which contributes complexity as well as hurting the OO-ness of it. Better idea is to start to use C99 in PostgreSQL ;-). I have not investigated this yet. But I am very interested to know what the advantages would be to upgrade the code to C99 standards. It would give us heartburn on any platforms where the preferred compiler doesn't grok C99, for sure. As much as I'm ok with using GCC, it would seem unfortunate to force people into using GCC everywhere, and preclude using other compilers. (And actually, I'm more ambivalent about GCC than that; I'm not totally happy with how GCC has gone, but that's another tale for another day...) Speaking of language choice, no one said that _all_ the source code would need to be rewritten. It would be nice, for example, if PostgreSQL rewrote the current GUC system with a glue language like Lua (which is also very C-like). Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PostgreSQL future ideas
A.M. wrote: Speaking of language choice, no one said that _all_ the source code would need to be rewritten. It would be nice, for example, if PostgreSQL rewrote the current GUC system with a glue language like Lua (which is also very C-like). No it wouldn't. All it would mean is that you'd need developers fluent in both languages. If this were a greenfields project we might well now make choices other than those made in the past, but that doesn't mean we should constantly revisit those decisions. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] lock contention on parallel COPY ?
I'm currently playing with Andrew's parallel restore patch and it seems that pg is far from taking advantage of the hardware I have for testing (Dual Quad Core Xeon attached to a Netapp with 68 spindles). With a concurrency of 4 I see iowait hovering at ~1%, CPU load at 20% peak and around 15 context switches/s. The load seems to progress at around the same rate as a single backend restore does on the same box. A profile during the load (a fair number of tables sized ~10-30M rows each) looks fairly similar to:

samples   %        symbol name
1933314   21.8884  LWLockAcquire
1677808   18.9957  XLogInsert
848227     9.6034  LWLockRelease
414179     4.6892  DoCopy
332633     3.7660  CopyReadLine
266580     3.0181  UnpinBuffer
221693     2.5099  heap_formtuple
176939     2.0033  .plt
171842     1.9455  PinBuffer
160470     1.8168  GetNewObjectId
154095     1.7446  heap_insert
151813     1.7188  s_lock
117849     1.3343  LockBuffer
109530     1.2401  hash_search_with_hash_value
102169     1.1567  PageAddItem
 91151     1.0320  pg_verify_mbstr_len
 82538     0.9345  CopyGetData

Using --truncate-before-load seems to help a bit but it still seems to be only barely utilizing the available resources. Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lock contention on parallel COPY ?
Stefan Kaltenbrunner [EMAIL PROTECTED] writes: samples %symbol name 1933314 21.8884 LWLockAcquire 1677808 18.9957 XLogInsert 8482279.6034 LWLockRelease 4141794.6892 DoCopy 3326333.7660 CopyReadLine 2665803.0181 UnpinBuffer 2216932.5099 heap_formtuple I suppose Andrew didn't yet put in the hack to avoid WAL logging during the COPY commands. The LWLockAcquires are presumably blocking on WALInsertLock, given that XLogInsert is also right up there ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lock contention on parallel COPY ?
Tom Lane wrote: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: samples %symbol name 1933314 21.8884 LWLockAcquire 1677808 18.9957 XLogInsert 8482279.6034 LWLockRelease 4141794.6892 DoCopy 3326333.7660 CopyReadLine 2665803.0181 UnpinBuffer 2216932.5099 heap_formtuple I suppose Andrew didn't yet put in the hack to avoid WAL logging during the COPY commands. The LWLockAcquires are presumably blocking on WALInsertLock, given that XLogInsert is also right up there ... Yes I did. That's what the --truncate-before-load switch does (or should do). cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lock contention on parallel COPY ?
Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: I suppose Andrew didn't yet put in the hack to avoid WAL logging Yes I did. That's what the --truncate-before-load switch does (or should do). Well, it doesn't seem to be having any effect. Maybe Stefan is testing a configuration with xlog archiving enabled? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lock contention on parallel COPY ?
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: I suppose Andrew didn't yet put in the hack to avoid WAL logging Yes I did. That's what the --truncate-before-load switch does (or should do). Well, it doesn't seem to be having any effect. Maybe Stefan is testing a configuration with xlog archiving enabled?

Heh, no log archiving - I actually said that I'm now playing with --truncate-before-load, which seems to cause a noticeable performance increase (as in IO generated), but I still see 13 context switches/s and a profile that looks like:

samples  %        symbol name
55526    16.5614  LWLockAcquire
29721     8.8647  DoCopy
26581     7.9281  CopyReadLine
25105     7.4879  LWLockRelease
15743     4.6956  PinBuffer
14725     4.3919  heap_formtuple
 9748     2.9075  GetNewObjectId
 8059     2.4037  pg_verify_mbstr_len
 6825     2.0356  hash_search_with_hash_value
 6386     1.9047  s_lock
 5645     1.6837  heap_insert
 5631     1.6795  PageAddItem
 4723     1.4087  pg_atoi

Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lock contention on parallel COPY ?
Stefan Kaltenbrunner [EMAIL PROTECTED] writes: heh no log archiving - I actually said that I'm now playing with --truncate-before-load which seems to cause a noticeable performance (as in IO generated) increase but I still see 13 context switches/s and a profile that looks like: samples %symbol name 5552616.5614 LWLockAcquire 29721 8.8647 DoCopy 26581 7.9281 CopyReadLine 25105 7.4879 LWLockRelease 15743 4.6956 PinBuffer 14725 4.3919 heap_formtuple Still a lot of contention for something, then. You might try turning on LWLOCK_STATS (this only requires recompiling storage/lmgr/lwlock.c) to get some evidence about what. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
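For anyone wanting to reproduce this, a sketch of one way to follow Tom's suggestion; the exact hook may vary by version, so treat the build step as an assumption rather than a recipe.

/*
 * Enable per-lock statistics by defining LWLOCK_STATS and rebuilding
 * src/backend/storage/lmgr/lwlock.c, e.g. by adding the line below near
 * the top of that file (or passing the define on the compiler command
 * line, e.g. make COPT='-DLWLOCK_STATS') and re-running make.
 */
#define LWLOCK_STATS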
Re: [HACKERS] lock contention on parallel COPY ?
Tom Lane wrote: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: heh no log archiving - I actually said that I'm now playing with --truncate-before-load which seems to cause a noticeable performance (as in IO generated) increase but I still see 13 context switches/s and a profile that looks like: samples %symbol name 5552616.5614 LWLockAcquire 29721 8.8647 DoCopy 26581 7.9281 CopyReadLine 25105 7.4879 LWLockRelease 15743 4.6956 PinBuffer 14725 4.3919 heap_formtuple Still a lot of contention for something, then. You might try turning on LWLOCK_STATS (this only requires recompiling storage/lmgr/lwlock.c) to get some evidence about what. that one generates a huge amount of logs - output for ~60s into the load is available here: http://www.kaltenbrunner.cc/files/lwstats.txt (21MB!) Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Meridiem markers (was: [BUGS] Incorrect invalid AM/PM string error from to_timestamp)
Hey folks, Tom pointed out in the parent thread [1] that the error message for bogus AM/PM markers in to_timestamp is pretty lame: ERROR: invalid AM/PM string I agree, and once I started thinking about this, I came up with other gripes concerning the treatment of 12-hour time in to_timestamp, the use of the meridiem markers (AM/PM) in particular. Currently, Postgres accepts four separate flavours for specifying meridiem markers, given by uppercase/lowercase and with/without periods: * am/pm * AM/PM * a.m./p.m. * A.M./P.M. What I find surprising about the implementation is that you must specify the correct formatting pattern for the particular flavour of meridiem marker in your input string, or you get the aforementioned lame error message. Consider: postgres=# select to_timestamp('11:47 pm 27 Sep 2008', 'HH:MI PM DD Mon '); ERROR: invalid AM/PM string Here it seems to me that Postgres is being unnecessarily pedantic. Yes, the case of the marker differs from the case of the formatting keyword, but is that really grounds for an ERROR? The user's intention is perfectly unambiguous, so why not just accept the input? I would go so far as to say that we should accept any of the 8 valid meridiem markers, regardless of which flavour is indicated by the formatting keyword. Day and month names already work this way. We don't throw an error if a user specifies a mixed-case month name like Sep but uses the uppercase formatting keyword MON. I suspect that the current behaviour isn't so much a deliberate choice to be draconian as it is a consequence of the way the code was developed; as the inverse of to_char(). One way to tidy this up would be to re-implement the meridiem markers using the seq_search functions, i.e., make it work like the day and month names. This would make it easy to accept any flavour of marker, and the error messages thrown for bogus input would then be the same as those for bogus day and month names. Note that all of the above applies equally to the era markers B.C. and A.D. Comments? Cheers, BJ [1] http://archives.postgresql.org/message-id/[EMAIL PROTECTED] -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
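To make the proposal a bit more concrete, here is a rough standalone sketch of a seq_search-style lookup that accepts any spelling of the meridiem marker case-insensitively. The function and array names are invented for illustration; this is not the actual formatting.c code.

    #include <stdio.h>
    #include <string.h>
    #include <strings.h>

    /* Hypothetical table of accepted spellings, scanned case-insensitively,
     * in the spirit of the seq_search() approach used for day/month names. */
    static const char *const meridiem_names[] = {"a.m.", "p.m.", "am", "pm", NULL};

    static int
    meridiem_search(const char *src, int *consumed)
    {
        for (int i = 0; meridiem_names[i] != NULL; i++)
        {
            size_t len = strlen(meridiem_names[i]);

            if (strncasecmp(src, meridiem_names[i], len) == 0)
            {
                *consumed = (int) len;
                return (i % 2);     /* 0 = AM, 1 = PM, whatever the spelling */
            }
        }
        return -1;                  /* caller raises the usual "invalid value" error */
    }

    int
    main(void)
    {
        int used;

        printf("pm -> %d\n", meridiem_search("pm 27 Sep", &used));   /* 1 */
        printf("P.M. -> %d\n", meridiem_search("P.M.", &used));      /* 1 */
        printf("xx -> %d\n", meridiem_search("xx", &used));          /* -1 */
        return 0;
    }

With a lookup like this, the formatting keyword only determines how to_char prints the marker; on input, any of the eight flavours would be accepted, matching the day/month-name behaviour.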
Re: [HACKERS] Meridiem markers (was: [BUGS] Incorrect invalid AM/PM string error from to_timestamp)
Brendan Jurd [EMAIL PROTECTED] writes: One way to tidy this up would be to re-implement the meridiem markers using the seq_search functions, i.e., make it work like the day and month names. This would make it easy to accept any flavour of marker, and the error messages thrown for bogus input would then be the same as those for bogus day and month names. Note that all of the above applies equally to the era markers B.C. and A.D. +1 ... making these work like month names makes sense to me. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lock contention on parallel COPY ?
Stefan Kaltenbrunner [EMAIL PROTECTED] writes: that one generates a huge amount of logs - output for ~60s into the load is available here: http://www.kaltenbrunner.cc/files/lwstats.txt (21MB!) Huh ... essentially all the contention is for OidGenLock. I take it you're loading into a table that has OIDs? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PostgreSQL future ideas
On Fri, Sep 26, 2008 at 11:52 AM, Andrew Dunstan [EMAIL PROTECTED] wrote: Speaking of language choice, no one said that _all_ the source code would need to be rewritten. It would be nice, for example, if PostgreSQL rewrote the current GUC system with a glue language like Lua (which is also very C-like). No it wouldn't. All it would mean is that you'd need developers fluent in both languages. Having done quite a bit of internals work with SAP DB (which is an amalgamation of C, C++, and Pascal), I completely agree. The entire system, if possible, should be in a single language. -- Jonah H. Harris, Senior DBA myYearbook.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lock contention on parallel COPY ?
Tom Lane wrote: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: that one generates a huge amount of logs - output for ~60s into the load is available here: http://www.kaltenbrunner.cc/files/lwstats.txt (21MB!) Huh ... essentially all the contention is for OidGenLock. I take it you're loading into a table that has OIDs? hmm that particular database has its ancient roots in 7.3 (or 7.4) and indeed it seems that some tables in the dump are preceded by a SET default_with_oids = true; Will go and eliminate those and try again. Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lock contention on parallel COPY ?
On Fri, 2008-09-26 at 12:38 -0400, Tom Lane wrote: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: heh no log archiving - I actually said that I'm now playing with --truncate-before-load which seems to cause a noticeable performance (as in IO generated) increase but I still see 13 context switches/s and a profile that looks like: samples % symbol name 55526 16.5614 LWLockAcquire 29721 8.8647 DoCopy 26581 7.9281 CopyReadLine 25105 7.4879 LWLockRelease 15743 4.6956 PinBuffer 14725 4.3919 heap_formtuple Still a lot of contention for something, then. You might try turning on LWLOCK_STATS (this only requires recompiling storage/lmgr/lwlock.c) to get some evidence about what. Probably loading a table with a generated PK or loading data in ascending sequence, so it's contending heavily for the rightmost edge of the index. We need to load data a block at a time and buffer the inserts into the index also, so we don't need to lock/unlock per row. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lock contention on parallel COPY ?
Simon Riggs [EMAIL PROTECTED] writes: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: samples % symbol name 55526 16.5614 LWLockAcquire 29721 8.8647 DoCopy 26581 7.9281 CopyReadLine 25105 7.4879 LWLockRelease 15743 4.6956 PinBuffer 14725 4.3919 heap_formtuple Probably loading a table with a generated PK or loading data in ascending sequence, so it's contending heavily for the rightmost edge of the index. No, given that DoCopy and CopyReadLine are right up there, I think we're still looking at the COPY phase, not index building. The profile will probably change completely once index building starts... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Meridiem markers (was: [BUGS] Incorrect invalid AM/PM string error from to_timestamp)
On Fri, Sep 26, 2008 at 11:25 AM, Brendan Jurd [EMAIL PROTECTED] wrote: One way to tidy this up would be to re-implement the meridiem markers using the seq_search functions, i.e., make it work like the day and month names. This would make it easy to accept any flavour of marker, and the error messages thrown for bogus input would then be the same as those for bogus day and month names. Yeah, if we use seq_search then it should be a pretty easy conversion, so +1. However that still leaves the original complaint around (at least IMHO): select to_timestamp('AN', 'AM'); ERROR: invalid AM/PM string select to_timestamp('11:47 PM 27 Sep a2008', 'HH:MI PM DD Mon '); ERROR: invalid value for in source string Now arguably most to_timestamp calls are going to be short, so I can easily look for the in my string and see what I did wrong (and DETAIL: provides Value must be an integer in the second case)... So really maybe that's good enough, I dunno... Personally I find the output of my dumb patch to be better than both above: select to_timestamp('AN', 'AM'); ERROR: invalid AM/PM string for 'AN' And we could improve that by only showing node->key->len chars, and make that work for both AM/PM and the others, etc. Then again maybe it's not worth it? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Meridiem markers (was: [BUGS] Incorrect invalid AM/PM string error from to_timestamp)
Alex Hunsaker [EMAIL PROTECTED] writes: However that still leaves the original complaint around (at least IMHO): select to_timestamp('AN', 'AM'); ERROR: invalid AM/PM string select to_timestamp('11:47 PM 27 Sep a2008', 'HH:MI PM DD Mon '); ERROR: invalid value for in source string Yeah, it would be a lot better if it said ERROR: invalid value for : a2008 The DETAIL is good too, but it's no substitute for showing the exact substring that the code is unhappy about. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
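A standalone sketch of the kind of message construction being asked for: quote just the substring the parser choked on, together with the format keyword it was parsing. In the backend this would go through ereport()/errmsg(); the helper below and its arguments are invented for illustration only.

    #include <stdio.h>
    #include <string.h>

    /* Sketch: format an error string that shows the offending part of the
     * input next to the format keyword it was being parsed for. */
    static void
    format_bad_field_error(const char *src, int field_len, const char *keyword,
                           char *buf, size_t buflen)
    {
        char bad[32];
        int  n = field_len;

        if (n > (int) sizeof(bad) - 1)
            n = (int) sizeof(bad) - 1;
        memcpy(bad, src, n);
        bad[n] = '\0';
        snprintf(buf, buflen, "invalid value \"%s\" for \"%s\"", bad, keyword);
    }

    int
    main(void)
    {
        char msg[128];

        format_bad_field_error("a2008", 5, "YYYY", msg, sizeof(msg));
        puts(msg);          /* invalid value "a2008" for "YYYY" */
        return 0;
    }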
Re: [HACKERS] lock contention on parallel COPY ?
Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: samples % symbol name 55526 16.5614 LWLockAcquire 29721 8.8647 DoCopy 26581 7.9281 CopyReadLine 25105 7.4879 LWLockRelease 15743 4.6956 PinBuffer 14725 4.3919 heap_formtuple Probably loading a table with a generated PK or loading data in ascending sequence, so it's contending heavily for the rightmost edge of the index. No, given that DoCopy and CopyReadLine are right up there, I think we're still looking at the COPY phase, not index building. The profile will probably change completely once index building starts... yeah this profile is only showing the COPY phase ... Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lock contention on parallel COPY ?
On Fri, 2008-09-26 at 14:00 -0400, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: samples % symbol name 55526 16.5614 LWLockAcquire 29721 8.8647 DoCopy 26581 7.9281 CopyReadLine 25105 7.4879 LWLockRelease 15743 4.6956 PinBuffer 14725 4.3919 heap_formtuple Probably loading a table with a generated PK or loading data in ascending sequence, so it's contending heavily for the rightmost edge of the index. No, given that DoCopy and CopyReadLine are right up there, I think we're still looking at the COPY phase, not index building. The profile will probably change completely once index building starts... Sorry, was assuming we were loading with indexes on, which is wrong. Agree the profile looks odd. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] About the parameter of API: PQprepared
On Fri, Sep 26, 2008 at 2:45 AM, iihero [EMAIL PROTECTED] wrote: In libpq, the definition is like: PGresult * PQprepare(PGconn *conn, const char *stmtName, const char *query, int nParams, const Oid *paramTypes) Could we remove the parameter nParams? e.g. insert into foo(id, name, address) values ($1, $2, $3) PostgreSQL possibly can parse the prepared sql statement to get the real paramters count. Or, is there another alternate way? if you are looking for an easier way to do things like prepared statements, etc. over libpq, you may want to look at libpqtypes (it's an external library that will require a patched libpq for versions < 8.4): http://libpqtypes.esilo.com/ merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
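For reference, a minimal sketch of how the current API is used: the caller declares the number of $n parameters up front and later binds exactly that many values. The connection string, table, and statement name below are assumed for illustration (compile with -lpq).

    #include <stdio.h>
    #include <libpq-fe.h>

    int
    main(void)
    {
        /* Connection string and table "foo" are assumed to exist */
        PGconn     *conn = PQconnectdb("dbname=postgres");
        const char *values[3] = {"1", "Alice", "Somewhere"};
        PGresult   *res;

        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            PQfinish(conn);
            return 1;
        }

        /* nParams = 3 matches the three $n placeholders; paramTypes = NULL
         * lets the server infer the parameter types */
        res = PQprepare(conn, "ins_foo",
                        "insert into foo(id, name, address) values ($1, $2, $3)",
                        3, NULL);
        if (PQresultStatus(res) != PGRES_COMMAND_OK)
            fprintf(stderr, "prepare failed: %s", PQerrorMessage(conn));
        PQclear(res);

        /* The same parameter count is supplied again at execution time */
        res = PQexecPrepared(conn, "ins_foo", 3, values, NULL, NULL, 0);
        if (PQresultStatus(res) != PGRES_COMMAND_OK)
            fprintf(stderr, "execute failed: %s", PQerrorMessage(conn));
        PQclear(res);

        PQfinish(conn);
        return 0;
    }

Since paramValues is a bare array with no terminator, the explicit count also tells libpq how many values to marshal; dropping nParams from PQprepare alone would still leave it required at execution time.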
Re: [HACKERS] lock contention on parallel COPY ?
On Fri, 2008-09-26 at 20:07 +0200, Stefan Kaltenbrunner wrote: Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: samples %symbol name 5552616.5614 LWLockAcquire 29721 8.8647 DoCopy 26581 7.9281 CopyReadLine 25105 7.4879 LWLockRelease 15743 4.6956 PinBuffer 14725 4.3919 heap_formtuple Probably loading a table with a generated PK or loading data in ascending sequence, so its contending heavily for the rightmost edge of the index. No, given that DoCopy and CopyReadLine are right up there, I think we're still looking at the COPY phase, not index building. The profile will probably change completely once index building starts... yeah this profile is only showing the COPY phase ... Try using this Postgres core patch. It's a updated version of my fast_copy.v4.patch from Patches: Bulk Insert tuning 20 Mar 2008 -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support Index: src/backend/access/heap/heapam.c === RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/access/heap/heapam.c,v retrieving revision 1.263 diff -c -r1.263 heapam.c *** src/backend/access/heap/heapam.c 11 Sep 2008 14:01:09 - 1.263 --- src/backend/access/heap/heapam.c 26 Sep 2008 19:16:27 - *** *** 1728,1733 --- 1728,1748 } } + /* + * Begin/End Bulk Inserts + * + */ + void + heap_begin_bulk_insert(void) + { + ReleaseBulkInsertBufferIfAny(); + } + + void + heap_end_bulk_insert(void) + { + ReleaseBulkInsertBufferIfAny(); + } /* * heap_insert - insert tuple into a heap *** *** 1755,1765 */ Oid heap_insert(Relation relation, HeapTuple tup, CommandId cid, ! bool use_wal, bool use_fsm) { TransactionId xid = GetCurrentTransactionId(); HeapTuple heaptup; Buffer buffer; if (relation-rd_rel-relhasoids) { --- 1770,1781 */ Oid heap_insert(Relation relation, HeapTuple tup, CommandId cid, ! bool use_wal, bool use_fsm, bool bulk_insert_request) { TransactionId xid = GetCurrentTransactionId(); HeapTuple heaptup; Buffer buffer; + bool bulk_insert = bulk_insert_request !relation-rd_istemp; if (relation-rd_rel-relhasoids) { *** *** 1812,1820 else heaptup = tup; ! /* Find buffer to insert this tuple into */ ! buffer = RelationGetBufferForTuple(relation, heaptup-t_len, ! InvalidBuffer, use_fsm); /* NO EREPORT(ERROR) from here till changes are logged */ START_CRIT_SECTION(); --- 1828,1845 else heaptup = tup; ! /* ! * Find buffer to insert this tuple into ! */ ! if (bulk_insert) ! { ! buffer = RelationGetBufferForTuple(relation, heaptup-t_len, ! GetBulkInsertBuffer(), use_fsm, true); ! SetBulkInsertBuffer(buffer); ! } ! else ! buffer = RelationGetBufferForTuple(relation, heaptup-t_len, ! InvalidBuffer, use_fsm, false); /* NO EREPORT(ERROR) from here till changes are logged */ START_CRIT_SECTION(); *** *** 1893,1899 END_CRIT_SECTION(); ! UnlockReleaseBuffer(buffer); /* * If tuple is cachable, mark it for invalidation from the caches in case --- 1918,1930 END_CRIT_SECTION(); ! /* ! * Keep buffer pinned if we are in bulk insert mode ! */ ! if (bulk_insert) ! LockBuffer(buffer, BUFFER_LOCK_UNLOCK); ! else ! UnlockReleaseBuffer(buffer); /* * If tuple is cachable, mark it for invalidation from the caches in case *** *** 1930,1936 Oid simple_heap_insert(Relation relation, HeapTuple tup) { ! return heap_insert(relation, tup, GetCurrentCommandId(true), true, true); } /* --- 1961,1967 Oid simple_heap_insert(Relation relation, HeapTuple tup) { ! 
return heap_insert(relation, tup, GetCurrentCommandId(true), true, true, false); } /* *** *** 2553,2559 { /* Assume there's no chance to put heaptup on same page. */ newbuf = RelationGetBufferForTuple(relation, heaptup-t_len, ! buffer, true); } else { --- 2584,2590 { /* Assume there's no chance to put heaptup on same page. */ newbuf = RelationGetBufferForTuple(relation, heaptup-t_len, ! buffer, true, false); } else { *** *** 2570,2576 */ LockBuffer(buffer, BUFFER_LOCK_UNLOCK); newbuf = RelationGetBufferForTuple(relation, heaptup-t_len, ! buffer, true); } else { --- 2601,2607 */ LockBuffer(buffer, BUFFER_LOCK_UNLOCK); newbuf = RelationGetBufferForTuple(relation, heaptup-t_len, ! buffer, true, false); } else { Index: src/backend/access/heap/hio.c === RCS
Re: [HACKERS] About the parameter of API: PQprepared
Merlin Moncure wrote: On Fri, Sep 26, 2008 at 2:45 AM, iihero [EMAIL PROTECTED] wrote: In libpq, the definition is like: PGresult * PQprepare(PGconn *conn, const char *stmtName, const char *query, int nParams, const Oid *paramTypes) Could we remove the parameter nParams? e.g. insert into foo(id, name, address) values ($1, $2, $3) PostgreSQL possibly can parse the prepared sql statement to get the real paramters count. Or, is there another alternate way? if you are looking for easier way to do things like prepared statements, etc. over libpq, you may want to look at libpqtypes (it's a external library that will require a patched libpq for versions 8.4): http://libpqtypes.esilo.com/ merlin Sounds like you could make use of libpqtypes, which abstracts you from the standard parameterized API. Latest version of libpqtypes is finally online (v1.2a): http://pgfoundry.org/frs/?group_id=1000370 NOTE: requires using cvs head because libpqtypes needs the new libpq-events feature. -- Andrew Chernow eSilo, LLC every bit counts http://www.esilo.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parallel pg_restore - WIP patch
Andrew Dunstan wrote: This version of the patch should fix the shared file descriptor bug Russell Smith noticed. It also disables the 1/2 second sleep between forks, so the performance on a small db (regression) is vastly improved. this works better but there is something fishy still - using the same dump file I get a proper restore using pg_restore normally. If I however use -m for a parallel one I only get parts (in this case only 243 of the 709 tables) of the database restored ... Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parallel pg_restore - WIP patch
Stefan Kaltenbrunner wrote: Andrew Dunstan wrote: This version of the patch should fix the shared file descriptor bug Russell Smith noticed. It also disables the 1/2 second sleep between forks, so the performance on a small db (regression) is vastly improved. this works better but there is something fishy still - using the same dump file I get a proper restore using pg_restore normally. If I however use -m for a parallel one I only get parts (in this case only 243 of the 709 tables) of the database restored ... Yes, there are several funny things going on, including some stuff with dependencies. I'll have a new patch tomorrow with luck. Thanks for testing. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parallel pg_restore - WIP patch
On Fri, 26 Sep 2008 17:10:44 -0400 Andrew Dunstan [EMAIL PROTECTED] wrote: Yes, there are several funny things going on, including some stuff with dependencies. I'll have a new patch tomorrow with luck. Thanks for testing. O.k. I took a look at the patch itself and although I don't understand all of it there were a couple of red flags to me: + if (ropt->create) + die_horribly(AH, modulename, + "parallel restore is incompatible with --create\n"); + This seems like an odd limitation. In my mind, the schema would not be restored in parallel. The schema before data would restore as a single thread. Even the largest schemas would only take minutes (if that). Thus something like --create should never be a problem. I also noticed you check if we have zlib? Is it even possible to use the c format without it? (that would be new to me). I noticed this line: + while((next_work_item = get_next_work_item(AH)) != NULL) + { + /* XXX need to improve this test in case there is no table data */ + /* need to test for indexes, FKs, PK, Unique, etc */ + if(strcmp(next_work_item->desc, "TABLE DATA") == 0) + break; + (void) _restore_one_te(AH, next_work_item, ropt, false); + + next_work_item->prestored = true; + + _reduce_dependencies(AH,next_work_item); + } Instead of the TABLE DATA compare, perhaps it makes sense to back patch pg_dump to have a line delimiter in the TOC? That way even if there is no TABLE DATA there would be a delimiter that says: --- BEGIN TABLE DATA --- END TABLE DATA Thus if nothing is there... nothing is there? + /* delay just long enough between forks to give the catalog some +* breathing space. Without this sleep I got +* tuple concurrently updated errors. +*/ + pg_usleep(50); + continue; /* in case the slots are not yet full */ + } Could that be solved with a lock instead? Once the lock is released Anyway... just some thoughts. I apologize if I misunderstood the patch. Sincerely, Joshua D. Drake cheers andrew -- The PostgreSQL Company since 1997: http://www.commandprompt.com/ PostgreSQL Community Conference: http://www.postgresqlconference.org/ United States PostgreSQL Association: http://www.postgresql.us/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
Dear colleagues, I said earlier I'd ask around about some of the literature on security controls vs. database accessibility and side channels. I did, and I heard back. One person told me that this conference often has things on this topic: http://www.ieee-security.org/TC/SP-Index.html From my brief glimpse of the TOCs from the proceedings, as well as some spelunking in the ACM guide, it seems to me that some people have already worked out what ought to happen in many of these cases, and all we need to do is write down what we think ought to happen for the various use cases. I note in particular that an awful lot of work seems to be coming out of the health care sector in this area. That strikes me as at least as good a guide as national security concerns, and anything that one might want to do probably ought to be able to cope with at least those two caricatures of use cases. I also found a 2007 doctoral thesis by Azhar Rauf, Colorado Technical University, _A tradeoff analysis between data accessibility and inference control for row, column, and cell level security in relational databases_. The title and abstract make me think it might be worth looking at. Hope this is helpful, A -- Andrew Sullivan [EMAIL PROTECTED] +1 503 667 4564 x104 http://www.commandprompt.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
KaiGai Kohei wrote: SE-PostgreSQL needs to be good enough for the NSA, but row-level security in general does not. BTW, it seems to me someone misunderstand I works as an agent of NSA. Is it a humor, itn't it? I've paid my effort to improve the security of open source software, not for overseas intelligence agency. The comment was not directed at you. The comment was that SE-Linux was perhaps designed by the NSA, so an implementation matching SE-Linux will be good enough for the NSA. The comment goes on to say that we might want something better or more consistent than the NSA requirements. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
Tom Lane wrote: Robert Haas [EMAIL PROTECTED] writes: I think you have to resign yourself to the fact that a user who can see only a subset of the rows in a table may very well see apparent foreign-key violations. But so what? So you're leaking information about the rows that they're not supposed to be able to see. This is not what I would call national-security-grade information hiding --- leastwise *I* certainly wouldn't store nuclear weapon design information in such a database. The people that the NSA wants to defend against are more than smart enough, and persistent enough, to extract information through such loopholes. I can't escape the lurking suspicion that some bright folk inside the NSA have spent years thinking about this and have come up with some reasonably self-consistent definition of row hiding in a SQL database. But have they published it where we can find it? I am confused how knowing that a sequence number used for a primary key exists or doesn't exist is leaking _meaningful_ information. People might know the sequence number exists, but how is that information useful. Now, if natural keys are used, that is a different story. I am, of course, supportive of digging deeper to find the best possible behavior. I am also supportive of making row-level security an SQL-level feature that can be used beyond SE-Linux, and will allow the feature to be tested on all platforms. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parallel pg_restore - WIP patch
Joshua Drake wrote: On Fri, 26 Sep 2008 17:10:44 -0400 Andrew Dunstan [EMAIL PROTECTED] wrote: Yes, there are several funny things going on, including some stuff with dependencies. I'll have a new patch tomorrow with luck. Thanks for testing. O.k. I took a look at the patch itself and although I don't understand all of it there were a couple of red flags to me: + if (ropt->create) + die_horribly(AH, modulename, + "parallel restore is incompatible with --create\n"); + This seems like an odd limitation. In my mind, the schema would not be restored in parallel. The schema before data would restore as a single thread. Even the largest schemas would only take minutes (if that). Thus something like --create should never be a problem. Originally I had everything restoring in parallel. Now I am in fact (as the patch should have shown you) restoring the first part in a single thread like you say. Thus I probably can relax that restriction. I will look and see. I also noticed you check if we have zlib? Is it even possible to use the c format without it? (that would be new to me). I noticed this line: + while((next_work_item = get_next_work_item(AH)) != NULL) + { + /* XXX need to improve this test in case there is no table data */ + /* need to test for indexes, FKs, PK, Unique, etc */ + if(strcmp(next_work_item->desc, "TABLE DATA") == 0) + break; + (void) _restore_one_te(AH, next_work_item, ropt, false); + + next_work_item->prestored = true; + + _reduce_dependencies(AH,next_work_item); + } Instead of the TABLE DATA compare, perhaps it makes sense to back patch pg_dump to have a line delimiter in the TOC? That way even if there is no TABLE DATA there would be a delimiter that says: --- BEGIN TABLE DATA --- END TABLE DATA Thus if nothing is there... nothing is there? The TOC isn't stored as a text file. So we'll need to look by entry tags. It's no big deal - there aren't a huge number. + /* delay just long enough between forks to give the catalog some + * breathing space. Without this sleep I got + * tuple concurrently updated errors. +*/ + pg_usleep(50); + continue; /* in case the slots are not yet full */ + } Could that be solved with a lock instead? Once the lock is released That sleep is now gone. Anyway... just some thoughts. I apologize if I misunderstood the patch. No problem. Thanks for looking. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
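To make the dependency handling a bit more concrete, here is a toy standalone sketch of the scheduling idea: an item becomes runnable once all of its prerequisites have been restored, and finishing an item decrements the counters of its dependents. The struct and function names are simplified stand-ins, not the real pg_backup_archiver.c code.

    #include <stdbool.h>
    #include <stdio.h>

    /* Toy TOC entry: each item tracks how many prerequisites are still
     * outstanding; a worker may pick it up once that count hits zero. */
    typedef struct TocItem
    {
        const char      *desc;
        int              deps_remaining;
        bool             restored;
        struct TocItem  *next;
    } TocItem;

    static TocItem *
    get_next_work_item(TocItem *head)
    {
        for (TocItem *te = head; te != NULL; te = te->next)
            if (!te->restored && te->deps_remaining == 0)
                return te;
        return NULL;                /* nothing runnable right now */
    }

    static void
    reduce_dependencies(TocItem *finished, TocItem **dependents, int ndeps)
    {
        finished->restored = true;
        for (int i = 0; i < ndeps; i++)
            if (dependents[i]->deps_remaining > 0)
                dependents[i]->deps_remaining--;
    }

    int
    main(void)
    {
        TocItem  data  = {"TABLE DATA", 0, false, NULL};
        TocItem  index = {"INDEX", 1, false, NULL};
        TocItem *deps_of_data[] = {&index};

        data.next = &index;

        TocItem *te = get_next_work_item(&data);     /* picks TABLE DATA */
        printf("restoring %s\n", te->desc);
        reduce_dependencies(te, deps_of_data, 1);

        te = get_next_work_item(&data);              /* now the index is runnable */
        printf("restoring %s\n", te->desc);
        return 0;
    }

The bug being reported (only part of the tables restored) would correspond, in this toy model, to items whose counters never reach zero and are therefore never picked up.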
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
Bruce Momjian [EMAIL PROTECTED] writes: I am confused how knowing that a sequence number used for a primary key exists or doesn't exist is leaking _meaningful_ information. People might know the sequence number exists, but how is that information useful. Now, if natural keys are used, that is a different story. Right. It might be that securing a database requires not just some security mechanisms but also some database design rules (like don't allow foreign keys except on synthetic IDs). But it seems to me that we are just flailing around in the dark because we don't have that bigger picture of how the features would actually get used. The literature pointers that Andrew just gave us seem promising to me. Who's going to go searching for some useful info? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PostgreSQL future ideas
[EMAIL PROTECTED] (Andrew Dunstan) writes: A.M. wrote: Speaking of language choice, no one said that _all_ the source code would need to be rewritten. It would be nice, for example, if PostgreSQL rewrote the current GUC system with a glue language like Lua (which is also very C-like). No it wouldn't. All it would mean is that you'd need developers fluent in both languages. I expect it would be both a little better *and* a little worse than that. On the better side, I don't expect that, in this instance, there would be terribly much need for anything but the shallowest understanding of Lua. If this were all there was to it, I'd contend that there's little to object to. However, there's a pretty considerable worse side, namely that developers would need to understand the API for interfacing between the bits of C that are the Lua 'external interface' and how that gets plumbed into PostgreSQL. *That's* got very little to do with language, per se; it has to do with the implementation of the language. -- let name=cbbrowne and tld=acm.org in String.concat @ [name;tld];; http://linuxdatabases.info/info/oses.html Real concurrency---in which one program actually continues to function while you call up and use another---is more amazing but of small use to the average person. How many programs do you have that take more than a few seconds to perform any task? -- New York Times, 4/25/89 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in ILIKE?
Tom Lane wrote: I think it's just a bug in 8.3. Well, here's a patch that I think fixes it. If you're happy I'll apply it to HEAD and 8.3. cheers andrew ? .deps Index: like_match.c === RCS file: /cvsroot/pgsql/src/backend/utils/adt/like_match.c,v retrieving revision 1.21 diff -c -r1.21 like_match.c *** like_match.c 1 Mar 2008 03:26:34 - 1.21 --- like_match.c 26 Sep 2008 22:38:26 - *** *** 96,105 { if (*p == '\\') { ! /* Next byte must match literally, whatever it is */ NextByte(p, plen); ! if ((plen <= 0) || *p != *t) return LIKE_FALSE; } else if (*p == '%') { --- 96,108 { if (*p == '\\') { ! /* Next char must match literally, whatever it is */ NextByte(p, plen); ! if ((plen <= 0) || TCHAR(*p) != TCHAR(*t)) return LIKE_FALSE; + NextChar(t,tlen); + NextChar(p,plen); + continue; } else if (*p == '%') { -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in ILIKE?
Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: I think it's just a bug in 8.3. Well, here's a patch that I think fixes it. If you're happy I'll apply it to HEAD and 8.3. That patch isn't gonna apply to HEAD ;-). Also the introduction of NextChar is simply broken, as it will skip additional bytes of a multibyte char without having compared 'em. All you need AFAICS is to put TCHAR()s into the *p != *t comparison, so that it matches the case for ordinary characters. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] allow has_table_privilege(..., 'usage') on sequences
At 2008-09-22 12:54:34 -0500, [EMAIL PROTECTED] wrote: can we tell there is consensus in create a new has_sequence_privilege()? Abhijit will you make it? if not i can make a try... Yes, I'll do it. -- ams -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in ILIKE?
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: I think it's just a bug in 8.3. Well, here's a patch that I think fixes it. If you're happy I'll apply it to HEAD and 8.3. That patch isn't gonna apply to HEAD ;-). Also the introduction of NextChar is simply broken, as it will skip additional bytes of a multibyte char without having compared 'em. All you need AFAICS is to put TCHAR()s into the *p != *t comparison, so that it matches the case for ordinary characters. I'll have another look. What happens in that case though if you have escape+x where x is a multibyte char? cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in ILIKE?
Andrew Dunstan [EMAIL PROTECTED] writes: I'll have another look. What happens in that case though if you have escape+x where x is a multibyte char? TCHAR()'s just a no-op in multibyte encodings --- we handle ILIKE using a preliminary downcasing pass in that case. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
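A standalone illustration of the single-byte case under discussion: after a backslash, the next pattern byte must match the next text byte literally, but for ILIKE the comparison still has to be case-folded the same way ordinary bytes are, which is what wrapping both operands in TCHAR() achieves. The helper below is only an illustration, not the actual like_match.c code.

    #include <ctype.h>
    #include <stdbool.h>
    #include <stdio.h>

    /* Compare one escaped pattern byte against one text byte.  For ILIKE in a
     * single-byte encoding both sides are folded before comparison; for LIKE
     * they are compared as-is. */
    static bool
    escaped_byte_matches(unsigned char pattern_byte, unsigned char text_byte,
                         bool case_insensitive)
    {
        if (case_insensitive)
            return tolower(pattern_byte) == tolower(text_byte);
        return pattern_byte == text_byte;
    }

    int
    main(void)
    {
        printf("%d\n", escaped_byte_matches('G', 'g', true));   /* 1: ILIKE matches */
        printf("%d\n", escaped_byte_matches('G', 'g', false));  /* 0: LIKE does not */
        return 0;
    }

In multibyte encodings the downcasing happens in a preliminary pass over both strings, so the literal byte comparison there needs no folding at all.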
Re: [HACKERS] Bug in ILIKE?
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: I'll have another look. What happens in that case though if you have escape+x where x is a multibyte char? TCHAR()'s just a no-op in multibyte encodings --- we handle ILIKE using a preliminary downcasing pass in that case. Ah. Of course. Will fix. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Meridiem markers (was: [BUGS] Incorrect invalid AM/PM string error from to_timestamp)
On Sat, Sep 27, 2008 at 4:08 AM, Tom Lane [EMAIL PROTECTED] wrote: Alex Hunsaker [EMAIL PROTECTED] writes: However that still leaves the original complaint around (at least IMHO): select to_timestamp('AN', 'AM'); ERROR: invalid AM/PM string select to_timestamp('11:47 PM 27 Sep a2008', 'HH:MI PM DD Mon '); ERROR: invalid value for in source string Yeah, it would be a lot better if it said ERROR: invalid value for : a2008 The DETAIL is good too, but it's no substitute for showing the exact substring that the code is unhappy about. I agree. And with my proposed changes to the meridiem/era marker code, it will be much easier to improve the error messages in a consistent way. I'll work on a patch for the Nov commitfest. Cheers, BJ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
The literature pointers that Andrew just gave us seem promising to me. Who's going to go searching for some useful info? I couldn't find much on the pages linked directly from the link Andrew sent, but a Google search for site:ieee-security.org database turned up the following: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.48.2185 A MAC policy framework for multilevel relational databases by Xiaolei Qian, Teresa F. Lunt I haven't read through this in detail yet, but it appears to be highly relevant to Tom's concerns about referential integrity. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
On Fri, Sep 26, 2008 at 06:15:46PM -0400, Bruce Momjian wrote: I am confused how knowing that a sequence number used for a primary key exists or doesn't exist is leaking _meaningful_ information. This sort of side-channel intelligence is _exactly_ how certain kinds of security exploits work: I'm not supposed to know that _x_ exists; but by knowing key-of-_x_, I learn that _x_ exists. From existence, I can infer something, and from that inference I construct an attack that was supposed to be forestalled by the access controls. I am by no means a security expert, but I know enough about the area to know that it is very hard to get right, and that seemingly insignificant flaws in design turn out to be major vulnerabilities. To speak about something I do know about, when DNS was designed, nobody could have imagined that the widespread availability of recursion would turn out to be a flaw. Today, it turns out that open recursion can be used in an attack that magnifies the attacker's outbound traffic by many orders of magnitude. This sort of surprise side effect is why I am so anxious that something advertised as a security system fit really well with the proposed use cases. A -- Andrew Sullivan [EMAIL PROTECTED] +1 503 667 4564 x104 http://www.commandprompt.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: I am confused how knowing that a sequence number used for a primary key exists or doesn't exist is leaking _meaningful_ information. People might know the sequence number exists, but how is that information useful. Now, if natural keys are used, that is a different story. Right. It might be that securing a database requires not just some security mechanisms but also some database design rules (like don't allow foreign keys except on synthetic IDs). But it seems to me that we are just flailing around in the dark because we don't have that bigger picture of how the features would actually get used. The literature pointers that Andrew just gave us seem promising to me. Who's going to go searching for some useful info? I found this paper from 1996: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.33.5950 Full PDF at link in right column. The interesting chapters are chapter 3, that talks about ENTITY AND REFERENTIAL INTEGRITY IN MLS DATABASES and chapter 4, COVERT CHANNELS. It mentions polyinstantiation: These security considerations have led to the notion of polyinstantiation [Denning 87]. Polyinstantiation forces a relation to contain multiple tuples with the same primary key but distinguishable by their classification levels or by the non-primary key attributes of the relation [Lunt 91]. which I think we want to avoid. It also talks about cases where the primary and foreign key rows have identical or different security settings. It talks about COVERT CHANNELS, which is information leaking. And it mentions TCSEC (Trusted Computer System Evaluation Criteria): http://en.wikipedia.org/wiki/Trusted_Computer_System_Evaluation_Criteria which I think is the proper term for the security target we are trying to address, or at least one of the targets. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
Tom Lane wrote: Andrew Sullivan [EMAIL PROTECTED] writes: The above point, and other similar ones in every discussion of the proposed functionality, makes me think once again either that the requirements for this feature aren't understood by everyone, or else that they're not actually explicit enough. I have a feeling it's the latter. Yeah, I think that's exactly the problem here: we've got this large patch and no agreement on just what requirements it's supposed to meet. Perhaps others see it differently, but I feel like I'm being told that whatever the patch does is the right thing by definition ... and yet it doesn't seem to meet what I would think are the likely requirements of the users who might actually want such features. Indeed, I may not have stated the motivation and the goal/target of SE-PostgreSQL explicitly. As I have repeated several times, SE-PostgreSQL applies the security policy of SELinux to achieve consistency in access controls. This feature makes it possible to restrict a client's privileges on accesses to database objects, just as when it accesses filesystem objects. Its background is our nightmare with web application flaws. The major purpose of this feature is to provide the most important component needed to run enterprise-class web applications with a least-privilege set that is consistent across the whole system. I know there are various opinions about SELinux, but it has succeeded in running server applications with least privilege and in preventing attacks from malicious ones. However, web applications have some characteristics different from server applications. The LAPP software stack is a very popular and widely accepted architecture for providing web services, but I had a concern about it from a security viewpoint. The LAPP stack is constructed from four components, and each of them provides its own security mechanism. For example, the OS (Linux) provides filesystem permissions and SELinux, the RDBMS (PostgreSQL) provides database ACLs, and so on. In addition, most web applications store their data in databases, not only in filesystems. This prevents web applications from working with a least-privilege set, because in-kernel SELinux cannot trap accesses to database objects. In this situation we have to set up database ACLs and choose a proper database role, but there is no explicit relationship between database roles and the least-privilege set (called a security context), so a web application can/must select an arbitrary one. In my vision, Apache assigns its content handler an individual security context based on HTTP authentication, source IP address and so on, just before the web application is invoked. Because the web application works with an individual least-privilege set, its accesses to the filesystem are restricted by the security policy. In a similar way, its accesses to the database are also restricted via SE-PostgreSQL by the same policy, with the same privilege set. (Please note that SE-PostgreSQL obtains the security context of the peer process using the getpeercon() API provided by SELinux.) This architecture makes it possible to minimize security-sensitive code, like opening the database connection. If we adopt this approach, a web application can also select a database role, but the outline of its privileges is defined by the security policy. In recent years, web application flaws have been a nightmare for us. A recent report said that 95% of the significant incidents in Japan in the first half of 2008 were related to web applications, and 75% of them were SQL injection.
My ultimate goal is to help this situation with mandatory access control and a least-privilege set for the whole LAPP stack. The fine-grained access controls are provided for practical advantage. One requirement I received is to apply SE-PostgreSQL as the backend of a medical information system shared by several small hospitals. An electronic chart has a common format, so it is a simple design to store the charts within a single table, with individual tuples carrying individual access rights so that a chart is only exposed to the patient's doctor and medical staff. The actual origin of the fine-grained access controls is a sense of rivalry towards the prior commercial security-aware RDBMSs (like Oracle Label Security and IBM DB2); however, they are necessary in order to make the access controls useful. If we don't have fine-grained controls, any web application has to have privileges to access all patients' charts, and is vulnerable to SQL injection. It is a real practical advantage, I think. Sorry for my long description. Thanks, Agreeing on the requirements seems like a necessary condition for arriving at any consensus on a patch. Where can we get some evidence that would convince everyone that the requirements for a highly secure database are X, Y and Z? regards, tom lane -- KaiGai Kohei [EMAIL PROTECTED] -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers