Re: [HACKERS] Suggestion: Which Binary?

2006-04-04 Thread Andrew Dunstan



Robert Treat wrote:


The problem is that PostgreSQL is moving out of the realm of hard-core
geeks only and more into the mainstream. I'd bet a number of our users
have very little idea how autoconf and it's progeny work. It's probably
not unlikely that those folks would be able to figure out where their
perl was, but then not know how to tell it to configure.
 


Most such users would use a binary distribution, though - either from
the OS supplier or from our collection of binaries. If people are going
to build postgres themselves from source then I *do* expect them to be
moderately hard-core geeks.

   



ISTM that by any measure of the general population, David Wheeler is a 
hard-core geek. :-)   Actually by most measures of the programming/oss 
community he is a hard core geek.  But he still got tripped up by this.  A 
lot of people never get passed ./configure;make;make install even though they 
do a lot of coding on oss projects. Why turn these people away?
 



Robert,

You missed my point completely. I am not saying we should turn away 
people like David. In fact, it was me who suggested to him that he 
should write to -hackers on this subject. All I was saying was that we 
should not feel a need to tailor the configure script for people who 
aren't hard core geeks, as Jim was suggesting. I agree that we should 
help people like David.


cheers

andrew

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[HACKERS] will it work

2006-04-04 Thread siva kumar
hai, will it be possible to use one server for Process and another server 
for Reports. We are using Postgresql as database and java 



- Sivakumar

_
NRIs Zero balance Account. FREE Money Transfers with FREE DVD 
https://www.online.citibank.co.in/portal/rca_msntagofline.htm



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] semaphore usage port based?

2006-04-04 Thread Kris Kennaway
On Sun, Apr 02, 2006 at 11:26:52PM -0400, Tom Lane wrote:
 Kris Kennaway [EMAIL PROTECTED] writes:
  On Sun, Apr 02, 2006 at 11:17:49PM -0400, Tom Lane wrote:
  I have no objection to doing that, so long as you are actually doing it
  correctly.  This example shows that each jail must have its own SysV
  semaphore key space, else information leaks anyway.
 
  By default SysV shared memory is disallowed in jails.
 
 Hm, the present problem seems to be about semaphores not shared memory

Sorry, I meant IPC.

 ... although I'd not be surprised to find that there's a similar issue
 around shared memory.  Anyway, if FBSD's position is that they are
 uninterested in supporting SysV IPC in connection with jails, then I
 think the Postgres project position has to be that we are uninterested
 in supporting Postgres inside FBSD jails.

No-one is taking a position of being uninterested, so please don't
be hasty to reciprocate.

Kris

pgpFHpzpEVx0Q.pgp
Description: PGP signature


Re: [HACKERS] semaphore usage port based?

2006-04-04 Thread Kris Kennaway
On Sun, Apr 02, 2006 at 11:17:49PM -0400, Tom Lane wrote:
 Kris Kennaway [EMAIL PROTECTED] writes:
  On Sun, Apr 02, 2006 at 11:08:11PM -0400, Tom Lane wrote:
  If this is the story, then FBSD have broken their system and must revert
  their change.  They do not have kernel behavior that totally hides the
  existence of the other process, and therefore having some calls that
  pretend it's not there is simply inconsistent.
 
  I'm guessing it's a deliberate change to prevent the information
  leakage between jails.
 
 I have no objection to doing that, so long as you are actually doing it
 correctly.  This example shows that each jail must have its own SysV
 semaphore key space, else information leaks anyway.

By default SysV shared memory is disallowed in jails.

Kris

pgpyArbJ1akOg.pgp
Description: PGP signature


Re: [HACKERS] semaphore usage port based?

2006-04-04 Thread Kris Kennaway
On Sun, Apr 02, 2006 at 11:08:11PM -0400, Tom Lane wrote:

 I venture that FBSD 6 has decided to return ESRCH (no such process)
 where FBSD 4 returned some other error that acknowledged that the
 process did exist (EPERM would be a reasonable guess).
 
 If this is the story, then FBSD have broken their system and must revert
 their change.  They do not have kernel behavior that totally hides the
 existence of the other process, and therefore having some calls that
 pretend it's not there is simply inconsistent.

I'm guessing it's a deliberate change to prevent the information
leakage between jails.

Kris


pgp0723nvyM1m.pgp
Description: PGP signature


Re: [HACKERS] semaphore usage port based?

2006-04-04 Thread Andrew Thompson
On Sun, Apr 02, 2006 at 11:41:01PM -0400, Kris Kennaway wrote:
 On Mon, Apr 03, 2006 at 12:30:58AM -0300, Marc G. Fournier wrote:
  'k, but how do I fix kill so that it has the proper behaviour if SysV is 
  enabled?
 
 Check the source, perhaps there's already a way.  If not, talk to
 whoever made the change.
 
  Maybe a mount option for procfs that allows for pre-5.x 
  behaviour?
 
 procfs has nothing to do with this though.
 
  I'm not the first one to point out that this is a problem, just 
  the first to follow it through to the cause ;(  And I believe there is 
  more then just PostgreSQL that is affected by shared memory (ie. apache2 
  needs SysV IPC enabled, so anyone doing that in a jail has it enabled 
  also) ...
 
 Also note that SysV IPC is not the problem here, it's the change in
 the behaviour of kill() that is causing postgresql to become confused.
 That's what you should investigate.

The ESRCH error is being returned from prison_check(), that would be a
good starting place.


Andrew

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] semaphore usage port based?

2006-04-04 Thread Kris Kennaway
On Mon, Apr 03, 2006 at 12:30:58AM -0300, Marc G. Fournier wrote:
 On Sun, 2 Apr 2006, Kris Kennaway wrote:
 
 On Sun, Apr 02, 2006 at 11:17:49PM -0400, Tom Lane wrote:
 Kris Kennaway [EMAIL PROTECTED] writes:
 On Sun, Apr 02, 2006 at 11:08:11PM -0400, Tom Lane wrote:
 If this is the story, then FBSD have broken their system and must revert
 their change.  They do not have kernel behavior that totally hides the
 existence of the other process, and therefore having some calls that
 pretend it's not there is simply inconsistent.
 
 I'm guessing it's a deliberate change to prevent the information
 leakage between jails.
 
 I have no objection to doing that, so long as you are actually doing it
 correctly.  This example shows that each jail must have its own SysV
 semaphore key space, else information leaks anyway.
 
 By default SysV shared memory is disallowed in jails.
 
 'k, but how do I fix kill so that it has the proper behaviour if SysV is 
 enabled?

Check the source, perhaps there's already a way.  If not, talk to
whoever made the change.

 Maybe a mount option for procfs that allows for pre-5.x 
 behaviour?

procfs has nothing to do with this though.

 I'm not the first one to point out that this is a problem, just 
 the first to follow it through to the cause ;(  And I believe there is 
 more then just PostgreSQL that is affected by shared memory (ie. apache2 
 needs SysV IPC enabled, so anyone doing that in a jail has it enabled 
 also) ...

Also note that SysV IPC is not the problem here, it's the change in
the behaviour of kill() that is causing postgresql to become confused.
That's what you should investigate.

Kris

pgpepKUMNlFKd.pgp
Description: PGP signature


Re: [HACKERS] semaphore usage port based?

2006-04-04 Thread Robert Watson


On Mon, 3 Apr 2006, Stephen Frost wrote:

So I think the code is pretty bulletproof as long as it's in a system that 
is behaving per SysV spec.  The problem in the current FBSD situation is 
that the jail mechanism is exposing semaphore sets across jails, but not 
exposing the existence of the owning processes.  That behavior is 
inconsistent: if process A can affect the state of a sema set that process 
B can see, it's surely unreasonable to pretend that A doesn't exist.


This is certainly a problem with FBSD jails...  Not only the inconsistancy, 
but what happens if someone manages to get access to the appropriate uid 
under one jail and starts sniffing or messing with the semaphores or shared 
memory segments from other jails?  If that's possible then that's a rather 
glaring security problem...


This is why it's disabled by default, and the jail documentation specifically 
advises of this possibility.  Excerpt below.


Robert N M Watson

 security.jail.sysvipc_allowed
  This MIB entry determines whether or not processes within a jail
  have access to System V IPC primitives.  In the current jail imple-
  mentation, System V primitives share a single namespace across the
  host and jail environments, meaning that processes within a jail
  would be able to communicate with (and potentially interfere with)
  processes outside of the jail, and in other jails.  As such, this
  functionality is disabled by default, but can be enabled by setting
  this MIB entry to 1.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] semaphore usage port based?

2006-04-04 Thread Robert Watson


On Mon, 3 Apr 2006, Stephen Frost wrote:

This is why it's disabled by default, and the jail documentation 
specifically advises of this possibility.  Excerpt below.


Ah, I see, glad to see it's accurately documented.


As it has been for the last five years, I believe since introduction of the 
setting to allow System V IPC to be used with documented limitations.


Given the rather significant use of shared memory by Postgres it seems to me 
that jail'ing it under FBSD is unlikely to get you the kind of isolation 
between instances that you want (the assumption being that you want to avoid 
the possibility of a user under one jail impacting a user in another jail). 
As such, I'd suggest finding something else if you truely need that 
isolation for Postgres or dropping the jails entirely.


Running the Postgres instances under different uids (as you'd probably 
expect to do anyway if not using the jails) is probably the right approach. 
Doing that and using jails would probably work, just don't delude yourself 
into thinking that you're safe from a malicious user in one jail.


Yes, there seems to be an awful lot of noise being made about the fact that 
the system does, in fact, work exactly as documented, and that the 
configuration being complained about is one that is specifically documented as 
being unsupported and undesirable.


As commented elsewhere in this thread, currently, there is no virtualization 
support for System V IPC in the FreeBSD Jail implementation.  That may change 
if/when someone implements it.  Until it's implemented, it isn't going to be 
there, and the system won't behave as though it's there no matter how much 
jumping up and down is done.


Robert N M Watson

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] semaphore usage port based?

2006-04-04 Thread Robert Watson


On Sun, 2 Apr 2006, Tom Lane wrote:

Oops.  Here is the problem: kill() is lying by claiming there is no such 
process as 83699.  It looks to me like there in fact is such a process, but 
it's in a different jail.


I venture that FBSD 6 has decided to return ESRCH (no such process) where 
FBSD 4 returned some other error that acknowledged that the process did 
exist (EPERM would be a reasonable guess).


If this is the story, then FBSD have broken their system and must revert 
their change.  They do not have kernel behavior that totally hides the 
existence of the other process, and therefore having some calls that pretend 
it's not there is simply inconsistent.


FreeBSD's mandatory access control models, such as multi-level security, biba 
integrity, and type enforcement, will generally provide consistent protection 
under the circumstances you describe: specifically, that information flow 
invariants across IPC types, including System V IPC and inter-process 
signalling, will allow flow only in keeping with the policy.


However, I guess I would counter with the following concern: the PID returned 
by semctl() has the following definition:


 GETPID   Return the pid of the last process to perform an operation
  on semaphore number semnum.

However, pid's in general uniquely identify a process only at the time they 
are recorded.  So any pid returned here is necessarily stale -- even if there 
is another process with the pid returned by GETPID, it may actually be a 
different process that has ended up with the same pid.  The longer the gap 
since the last semaphore operation, the more likely (presumably) it is that 
the pid has been recycled.  And on modern systems with thousands of processes 
and high process turn-over (i.e., systems with CGI and other sorts of 
scripting),pid reuse can happen quickly.  Is your use of the pid here 
consistent with fact that pid's are reused quickly after process exit?  Use of 
pid's in UNIX is often unreliable, and must be combined with other 
synchronizing, such as file locking on a pidfile, to ensure that the pid read 
is valid.  Even then, you can't implement atomic check-pid-and-signal using 
current UNIX APIs, which would require a notion of a process handle (or, in 
the parlance of Mach, a task port).


Another thought along these lines -- especially with the proliferation of 
fine-grained access control systems, such as Type Enforcement in SELinux, I 
would be cautious about assuming that two processes being able to manipulate 
the same sempahore implies the ability to exchange signals using the signal 
facility.


Robert N M Watson

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] semaphore usage port based?

2006-04-04 Thread Robert Watson

On Mon, 3 Apr 2006, Tom Lane wrote:


Robert Watson [EMAIL PROTECTED] writes:

Maybe I've misunderstood the problem here -- is the use of the GETPID
operation occuring within a coordinated set of server processes, or does it
also occur between client and server processes?  I think it's quite reasonable
to argue that a coordinated set of server processes should be able to see each
other, especially if they're running as the same user, in the same jail,
started at the same time.


We use the semaphore sets only within postgres server processes; we could 
hardly expect client processes to be able to get at them, since in general 
clients aren't on the same machine.  The issue here, though, is that Marc is 
trying to start multiple postgres servers in different jails, and in that 
context the different postgres servers aren't coordinated in any real 
sense.  We'd prefer that they didn't interact at all, but they are 
interacting because the SysV code isn't restricting IPC to occur only within 
a jail.


BTW, Marc, it occurs to me that a workaround for you would be to create a 
separate userid for postgres to run under in each jail; then the regular 
protection mechanisms would prevent the different postmasters from 
interfering with each others' semaphore sets.  But I think that workaround 
just makes it even clearer that the jail mechanism isn't behaving very 
sanely.


Any multi-instance application that uses unvirtualized System V IPC must know 
how to distinguish between those instances.  This is true of any potential 
communication mechanism used by multi-instance applications -- be it a command 
line argument to specify an alternative configuration file, or a configuration 
file that specifies alternative ports, working directories, mail spool 
directories, etc.  If you install two instances of sendmail, it requires some 
configuration to teach them not to step all over each other, and this is not 
an accident: if they try to use the same mail spools, ports, etc, things will 
go badly.  I can't imagine that PostgreSQL should be any different -- it has 
to be pointed at what resources to use and how to use them -- some of that 
will be a property of how it's written, and some how it's configured. 
Presumably, running multiple instances of PostgreSQL in jails should not be 
all that different from running multiple instances on any UNIX machine: they 
must not overlap where shared resources are concerned.


How is PostgreSQL deciding what semaphores to use?  Can it be instructed to 
use non-colliding ones by specifying an alternative argument to pass to 
ftok(), or ID to use directly?


I would, in general, consider the use of System V IPC across jails (as 
opposed to in a single jail) unsupported, since it's not consistent with 
the security model.


That'd be fine with me --- the problem here is that we've got unwanted 
communication across jails.  If, say, the jail ID were considered part of 
semaphore keys, we'd be in fine shape.


Well, I think it's definitely unwanted communications, but until such time as 
FreeBSD supports virtualizing the System V IPC name spaces, the fact that you 
can communicate between jails when System V IPC support is turned on for the 
jail shouldn't be a surprise, and should in fact be considered a feature. 
However, if applications behave incorrectly when treading over each other 
because either they aren't written to support specifying how not to walk over 
each other, or if they are not configured to use that support, then they're 
not going to behave well :-).


Robert N M Watson

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] semaphore usage port based?

2006-04-04 Thread Robert Watson


On Mon, 3 Apr 2006, Tom Lane wrote:

That's a fair question, but in the context of the code I believe we are 
behaving reasonably.  The reason this code exists is to provide some 
insurance against leaking semaphores when a postmaster process is terminated 
unexpectedly (ye olde often-recommended-against kill -9 postmaster, for 
instance).  If the PID returned by GETPID is nonexistent or belongs to a 
process not owned by the postgres userid then we assume that the semaphore 
set can be recycled.  We could get fooled by PID recycling if the PID 
returned by GETPID belongs to a postgres-owned process that isn't actually 
the original owner, but the penalty is just that we'll fail to recycle 
semaphores that could be released.  Not very harmful, and not very probable 
either, unless you're running postgres under a userid that's used for a lot 
of other stuff too.  There is not much risk of long-term leakage of many 
semaphore sets, even if you've got lots of postmaster crashes going on 
(which I sure hope you don't).  The code is designed to retry the same 
semaphore keys on each cycle of life, so you'd have to get fooled by chance 
coincidence of existing PIDs every time over many cycles to have a severe 
resource-leakage problem.  (BTW, Marc, that's the reason for *not* 
randomizing the key selection as you suggested.)


So I think the code is pretty bulletproof as long as it's in a system that 
is behaving per SysV spec.  The problem in the current FBSD situation is 
that the jail mechanism is exposing semaphore sets across jails, but not 
exposing the existence of the owning processes.  That behavior is 
inconsistent: if process A can affect the state of a sema set that process B 
can see, it's surely unreasonable to pretend that A doesn't exist.


Maybe I've misunderstood the problem here -- is the use of the GETPID 
operation occuring within a coordinated set of server processes, or does it 
also occur between client and server processes?  I think it's quite reasonable 
to argue that a coordinated set of server processes should be able to see each 
other, especially if they're running as the same user, in the same jail, 
started at the same time.  After all, coordinated server applications 
frequently use signals to manage resources and perform asynchronous 
notification (i.e., SIGCHLD, SIGHUP, etc).  If we're talking about clients and 
servers coordinating using the same System V IPC name space, I find myself 
less sympathetic to the idea that otherwise unrelated processes on either side 
of the IPC mechanism should be using out-of-band process operations to test 
for mutual presence.


There has been occasional investigation of virtualizing the System V IPC name 
space, but as you are no doubt aware, the name space doesn't lend itself to 
virtualization, as it fails to be conveniently hierarchical, etc.  This is 
just another of the ways in which System V IPC offers quite useful IPC 
services in less useful ways.  I would, in general, consider the use of System 
V IPC across jails (as opposed to in a single jail) unsupported, since it's 
not consistent with the security model.  However, I have doubts about the 
behavioral dependency we're talking about above.


Robert N M Watson

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] semaphore usage port based?

2006-04-04 Thread Kris Kennaway
On Mon, Apr 03, 2006 at 06:51:45PM -0400, Stephen Frost wrote:
 * Robert Watson ([EMAIL PROTECTED]) wrote:
  On Mon, 3 Apr 2006, Stephen Frost wrote:
  This is certainly a problem with FBSD jails...  Not only the 
  inconsistancy, but what happens if someone manages to get access to the 
  appropriate uid under one jail and starts sniffing or messing with the 
  semaphores or shared memory segments from other jails?  If that's possible 
  then that's a rather glaring security problem...
  
  This is why it's disabled by default, and the jail documentation 
  specifically advises of this possibility.  Excerpt below.
 
 Ah, I see, glad to see it's accurately documented.  Given the rather
 significant use of shared memory by Postgres it seems to me that
 jail'ing it under FBSD is unlikely to get you the kind of isolation
 between instances that you want (the assumption being that you want to
 avoid the possibility of a user under one jail impacting a user in
 another jail).  As such, I'd suggest finding something else if you
 truely need that isolation for Postgres or dropping the jails entirely.
 
 Running the Postgres instances under different uids (as you'd probably
 expect to do anyway if not using the jails) is probably the right
 approach.  Doing that and using jails would probably work, just don't
 delude yourself into thinking that you're safe from a malicious user in
 one jail.

Yes; however jails are still useful for administrative
compartmentalization even when you have to weaken their security
properties, such as here.

Kris


pgp2ZsCjYtna3.pgp
Description: PGP signature


Re: [HACKERS] semaphore usage port based?

2006-04-04 Thread Robert Watson


On Mon, 3 Apr 2006, Tom Lane wrote:


Robert Watson [EMAIL PROTECTED] writes:

Any multi-instance application that uses unvirtualized System V IPC must know
how to distinguish between those instances.


Sure.


How is PostgreSQL deciding what semaphores to use?  Can it be instructed to
use non-colliding ones by specifying an alternative argument to pass to
ftok(), or ID to use directly?


The problem here is not that we don't know how to avoid a collision. The 
problem is stemming from code that we added to prevent semaphore leakage 
during failure recoveries.  The code believes that it is deleting a 
semaphore set left over from a crashed previous instance of the same 
postmaster.


We don't use ftok() to determine the keys, and I'm disinclined to think that 
doing so would improve the situation: you could still have key collisions, 
they'd just be unpredictable and there'd be no convenient mechanism for 
escaping one if you hit it.


I guess what I'm saying is this: by turning on system V IPC in a jail, 
administrators accept that they are using an unsupported configuration, in 
which the security features of jail, which include hiding the process state of 
other jails, are known to conflict with the System V IPC services.  We 
specifically disable System V IPC in jails because it is known to have 
undesirable properties.  When configuring systems in that state, the 
responsibility falls on the administrator to disambiguate the configuration by 
specifying which resources must be used in order to prevent a conflict, 
because software operating in that environment will not be able to do so 
properly.  The goal of the switch to enable System V IPC is to allow IPC to be 
enabled for a single jail at a time, where it can be used to its full 
capabilities, without violating the security model.  If it is turned on for 
more than one jail, then isolation is not provided for System V IPC.


So my recommendation is, if people want to run Postgres in more than one jail 
at a time, they be provided with a configuration option to disambiguate which 
semaphore to use: they must hard-code that it will not use the same sempahore 
already in use by another Postgres instance in another Jail.  This is no 
different than specifying that if there are multiple Apache's running on a 
single system, that they run on different port/IP combinations.  If they 
aren't configured to do so, one of them will encounter an error when running, 
because the resource is already in use, and you may get unpredictable results 
if the two Apaches are started at the same time, restarted, etc, as they will 
race to acquire the resource.


Whether you pull the resource ID out of a hat, use ftok(), or whatever, I 
really mind, and have no strong opinion.  The name space of System V IPC is 
one of the known problems with the IPC model, and sadly, one accepts those 
problems by using those IPC mechanisms.


Robert N M Watson

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] semaphore usage port based?

2006-04-04 Thread Kris Kennaway
On Mon, Apr 03, 2006 at 03:42:51PM -0400, Stephen Frost wrote:
 * Tom Lane ([EMAIL PROTECTED]) wrote:
  That's a fair question, but in the context of the code I believe we are
  behaving reasonably.  The reason this code exists is to provide some
  insurance against leaking semaphores when a postmaster process is
  terminated unexpectedly (ye olde often-recommended-against kill -9
  postmaster, for instance).  If the PID returned by GETPID is
 
 Could this be handled sensibly by using SEM_UNDO?  Just a thought.
 
  So I think the code is pretty bulletproof as long as it's in a system
  that is behaving per SysV spec.  The problem in the current FBSD
  situation is that the jail mechanism is exposing semaphore sets across
  jails, but not exposing the existence of the owning processes.  That
  behavior is inconsistent: if process A can affect the state of a sema
  set that process B can see, it's surely unreasonable to pretend that A
  doesn't exist.
 
 This is certainly a problem with FBSD jails...  Not only the
 inconsistancy, but what happens if someone manages to get access to the
 appropriate uid under one jail and starts sniffing or messing with the
 semaphores or shared memory segments from other jails?  If that's
 possible then that's a rather glaring security problem...

This was stated already upthread, but sysv IPC is disabled by default
in jails for precisely this reason.  So yes, when you turn it on it's
a potential security problem if your jails are supposed to be
compartmentalized.

Kris


pgpP7JxlYtaUQ.pgp
Description: PGP signature


Re: [HACKERS] will it work

2006-04-04 Thread Jonah H. Harris
On 4/4/06, siva kumar [EMAIL PROTECTED] wrote:
 hai, will it be possible to use one server for Process and another server
 for Reports. We are using Postgresql as database and java 

You surely can use two servers, but they cannot share the same data
cluster.  You would have to replicate your process data to your
reporting database via Slony or some other replication system.


--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] I have changed employers

2006-04-04 Thread Bruce Momjian
FYI, I have left SRA and am now working for EnterpriseDB:

http://www.enterprisedb.com/news_events/press_releases/04_04_06.do

I will be doing the same community work I did before, so my role in the
project will not change.  (I will remain perpetually backlogged.  :-) )

I will always be grateful for SRA's sponsorship of my work for 4.5
years, and wish them continued success.

I am also looking forward to working with EnterpriseDB on new projects
and opportunities.

-- 
  Bruce Momjian   http://candle.pha.pa.us

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] I have changed employers

2006-04-04 Thread Alvaro Herrera
Bruce Momjian wrote:
 FYI, I have left SRA and am now working for EnterpriseDB:
 
   http://www.enterprisedb.com/news_events/press_releases/04_04_06.do

Congratulations!


 I will be doing the same community work I did before, so my role in the
 project will not change.  (I will remain perpetually backlogged.  :-) )

Bummer, I had expected they would have you working on the 2nd edition of
your book :-)

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] I have changed employers

2006-04-04 Thread Luke Lonergan
Bruce,

On 4/4/06 5:06 PM, Bruce Momjian pgman@candle.pha.pa.us wrote:

 I am also looking forward to working with EnterpriseDB on new projects
 and opportunities.

Congrats!

- Luke




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] Stats collection on Windows

2006-04-04 Thread Peter Brant
Hi all,

I think I've found the cause (or one of the causes) why stats
collection is unreliable on Windows and I'm wondering about the best way
to go about fixing it.

The problem is that process IDs on Windows seem to be assigned without
much rhyme or reason and it seems to happen relatively frequently that a
new process will be assigned the same process ID as a process which
recently died.  If this happens before the backend has been expired out
of pgstat.c's pgStatBeDead hash, the backend will be missed.

I was thinking the postmaster could maintain a backend sequence number
with similar semantics to a UNIXish process ID which could then be used
as the key for pgStatBeDead instead of the actual process ID.  Does that
sound reasonable?

Peter

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Fixing domain input

2006-04-04 Thread Michael Glaesemann


On Apr 4, 2006, at 10:39 , Tom Lane wrote:


So there's no additional risk --- in fact, arguably having such a
function crash during normal input of NULL would be a Good Thing,
because it would make it far more likely that the mistake would get
noticed and fixed before someone used it as a form of DOS attack.


Granted, finding an error earlier than later is a Good Thing, but an  
Even Better Thing would be to prevent crashing the backend, and  
afaics (as far as that is) the change you propose doesn't hurt  
anything. Do you see any way to do that?


I'm glad to see work being done on domains. I'm definitely learning  
from the discussion.


Michael Glaesemann
grzm myrealbox com




---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Fixing domain input

2006-04-04 Thread Christopher Kings-Lynne
I'm glad to see work being done on domains. I'm definitely learning from 
the discussion.


I wonder if we should implement 'GRANT USAGE ON DOMAINS' for spec 
compliance sometime...


Chris


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Fixing domain input

2006-04-04 Thread Tom Lane
Michael Glaesemann [EMAIL PROTECTED] writes:
 Granted, finding an error earlier than later is a Good Thing, but an  
 Even Better Thing would be to prevent crashing the backend, and  
 afaics (as far as that is) the change you propose doesn't hurt  
 anything. Do you see any way to do that?

For starters we'd have to forbid any user-written C functions.
Somehow that doesn't seem like a net win.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Fixing domain input

2006-04-04 Thread Michael Glaesemann


On Apr 5, 2006, at 11:46 , Tom Lane wrote:


For starters we'd have to forbid any user-written C functions.
Somehow that doesn't seem like a net win.


Ouch.

Michael Glaesemann
grzm myrealbox com




---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Stats collection on Windows

2006-04-04 Thread Tom Lane
Peter Brant [EMAIL PROTECTED] writes:
 I think I've found the cause (or one of the causes) why stats
 collection is unreliable on Windows and I'm wondering about the best way
 to go about fixing it.

 The problem is that process IDs on Windows seem to be assigned without
 much rhyme or reason and it seems to happen relatively frequently that a
 new process will be assigned the same process ID as a process which
 recently died.  If this happens before the backend has been expired out
 of pgstat.c's pgStatBeDead hash, the backend will be missed.

That's an interesting theory, but do you have any actual evidence for it?
The evidence I've seen says that our big problem on Windows is the stats
collector process just quitting due to unexplained piperead() failures.

(I mean, I'd love to blame Microsoft for everything, but even the
Redmond crowd should be able to figure out that recycling process IDs
instantly would be a stupid idea...)

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Stats collection on Windows

2006-04-04 Thread Qingqing Zhou

Tom Lane [EMAIL PROTECTED] wrote

 The problem is that process IDs on Windows seem to be assigned without
 much rhyme or reason and it seems to happen relatively frequently that a
 new process will be assigned the same process ID as a process which
 recently died.

 That's an interesting theory, but do you have any actual evidence for it?


I can confirm that on Windows 2000 the process ID is recycled instantly.

Regards,
Qingqing



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Stats collection on Windows

2006-04-04 Thread Qingqing Zhou

Tom Lane [EMAIL PROTECTED] wrote

 Redmond crowd should be able to figure out that recycling process IDs
 instantly would be a stupid idea...)


Can you explain more of this? IMHO, if we rely on feature like this, the 
difference is unstable-every-day vs. unstable-every-year.

Regards,
Qingqing 



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Stats collection on Windows

2006-04-04 Thread Tom Lane
Qingqing Zhou [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] wrote
 Redmond crowd should be able to figure out that recycling process IDs
 instantly would be a stupid idea...)

 Can you explain more of this? IMHO, if we rely on feature like this, the 
 difference is unstable-every-day vs. unstable-every-year.

The mere existence of the kill() primitive should bring to mind reasons
why it's a bad idea.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[HACKERS] Left joins and inheritance (table partitioning)

2006-04-04 Thread Rod Taylor
I've recently been playing with table partitioning limitations. Turning
over a large volume of data in inherited structures in a live
environment, and have run into a couple of snags in the planner.

The first is that LEFT JOIN will always do a sequential scan on all
inherited tables.

The second is that IN (1,4,6) works very differently than IN (SELECT id
FROM tab) when tab contains the values 1, 4, and 6. I'm not surprised
a straight left join failed, but I was surprised that IN failed to use
an index with enable_seqscan = off.


My fallback plan is to simply create a view and replace it to point to
the correct data segment when changes occur.



BEGIN;

CREATE TABLE key (keyword_id serial PRIMARY KEY);
INSERT INTO key DEFAULT VALUES;
INSERT INTO key DEFAULT VALUES;
INSERT INTO key DEFAULT VALUES;
INSERT INTO key DEFAULT VALUES;
INSERT INTO key DEFAULT VALUES;
INSERT INTO key DEFAULT VALUES;
INSERT INTO key DEFAULT VALUES;
INSERT INTO key DEFAULT VALUES;
INSERT INTO key DEFAULT VALUES;

CREATE TABLE key2 (keyword_id integer PRIMARY KEY);
INSERT INTO key2 VALUES (1);
INSERT INTO key2 VALUES (6);
INSERT INTO key2 VALUES (8);
ANALYZE key2;


CREATE TABLE foo 
( keyword_id integer PRIMARY KEY
 REFERENCES key);

CREATE TABLE foo2
( PRIMARY KEY (keyword_id)
, FOREIGN KEY (keyword_id) REFERENCES KEY
) INHERITS (foo);

INSERT INTO foo2 VALUES (1);
INSERT INTO foo2 VALUES (2);
INSERT INTO foo2 VALUES (3);
INSERT INTO foo2 VALUES (4);
INSERT INTO foo2 VALUES (5);
INSERT INTO foo2 VALUES (6);
INSERT INTO foo2 VALUES (7);
INSERT INTO foo2 VALUES (8);
INSERT INTO foo2 VALUES (9);

SET enable_seqscan = off;

EXPLAIN
SELECT * FROM key2 LEFT JOIN foo USING (keyword_id) ;

 QUERY PLAN
-
 Hash Left Join  (cost=20073.50..20191.74 rows=6 width=4)
   Hash Cond: (outer.keyword_id = inner.keyword_id)
   -  Index Scan using key2_pkey on key2  (cost=0.00..3.03 rows=3
width=4)
   -  Hash  (cost=20062.80..20062.80 rows=4280 width=4)
 -  Append  (cost=1.00..20062.80 rows=4280 width=4)
   -  Seq Scan on foo  (cost=1.00..10031.40
rows=2140 width=4)
   -  Seq Scan on foo2 foo
(cost=1.00..10031.40 rows=2140 width=4)
(7 rows)


EXPLAIN
SELECT *
  FROM key2
LEFT JOIN (SELECT keyword_id
 FROM foo
WHERE keyword_id IN (SELECT keyword_id FROM key2)
  ) AS tab USING (keyword_id) ;

QUERY PLAN
---
 Merge Left Join  (cost=20087.38..20090.46 rows=3 width=4)
   Merge Cond: (outer.keyword_id = inner.keyword_id)
   -  Index Scan using key2_pkey on key2  (cost=0.00..3.03 rows=3
width=4)
   -  Sort  (cost=20087.38..20087.39 rows=6 width=4)
 Sort Key: public.foo.keyword_id
 -  Hash IN Join  (cost=10003.04..20087.30 rows=6
width=4)
   Hash Cond: (outer.keyword_id = inner.keyword_id)
   -  Append  (cost=1.00..20062.80 rows=4280
width=4)
 -  Seq Scan on foo
(cost=1.00..10031.40 rows=2140 width=4)
 -  Seq Scan on foo2 foo
(cost=1.00..10031.40 rows=2140 width=4)
   -  Hash  (cost=3.03..3.03 rows=3 width=4)
 -  Index Scan using key2_pkey on key2
(cost=0.00..3.03 rows=3 width=4)
(12 rows)

EXPLAIN
SELECT *
  FROM key2
LEFT JOIN (SELECT keyword_id
 FROM foo
WHERE keyword_id IN (1,6,8)
  ) AS tab USING (keyword_id) ;

   QUERY PLAN

 Merge Left Join  (cost=22.08..25.16 rows=3 width=4)
   Merge Cond: (outer.keyword_id = inner.keyword_id)
   -  Index Scan using key2_pkey on key2  (cost=0.00..3.03 rows=3
width=4)
   -  Sort  (cost=22.08..22.09 rows=6 width=4)
 Sort Key: public.foo.keyword_id
 -  Append  (cost=3.01..22.00 rows=6 width=4)
   -  Bitmap Heap Scan on foo  (cost=3.01..9.50 rows=3
width=4)
 Recheck Cond: ((keyword_id = 1) OR (keyword_id = 6)
OR (keyword_id = 8))
 -  BitmapOr  (cost=3.01..3.01 rows=3 width=0)
   -  Bitmap Index Scan on foo_pkey
(cost=0.00..1.00 rows=1 width=0)
 Index Cond: (keyword_id = 1)
   -  Bitmap Index Scan on foo_pkey
(cost=0.00..1.00 rows=1 width=0)
 Index Cond: (keyword_id = 6)
   -  Bitmap Index Scan on foo_pkey
(cost=0.00..1.00 rows=1 width=0)
 Index Cond: (keyword_id = 8)
   -  Bitmap Heap Scan on foo2 foo  (cost=6.01..12.50
rows=3 width=4)
 

Re: [HACKERS] Left joins and inheritance (table partitioning)

2006-04-04 Thread Tom Lane
Rod Taylor [EMAIL PROTECTED] writes:
 I've recently been playing with table partitioning limitations. Turning
 over a large volume of data in inherited structures in a live
 environment, and have run into a couple of snags in the planner.

 The first is that LEFT JOIN will always do a sequential scan on all
 inherited tables.

Try it in CVS HEAD.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Left joins and inheritance (table partitioning)

2006-04-04 Thread Rod Taylor
On Tue, 2006-04-04 at 23:50 -0400, Tom Lane wrote:
 Rod Taylor [EMAIL PROTECTED] writes:
  I've recently been playing with table partitioning limitations. Turning
  over a large volume of data in inherited structures in a live
  environment, and have run into a couple of snags in the planner.
 
  The first is that LEFT JOIN will always do a sequential scan on all
  inherited tables.
 
 Try it in CVS HEAD.

Indeed, much better.

Sorry about the noise.

-- 


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] Summer of Code Preparation

2006-04-04 Thread Josh Berkus
Folks,

I've been warned that Summer of Code is coming up again soon.  We need to be 
ready with proposals which are officially endorsed by the PostgreSQL project.  
Which means we need:

a) Projects which could be accomplished in a summer, and
b) Students to do them.

We have one or two weeks to get this together.  Your help is greatly 
desired .. and if you're a CS student hacker reading this, drop me a line!

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] PostgreSQL Anniversary Proposals -- Important Update

2006-04-04 Thread Josh Berkus
Tatsuo,

 I'm wondering if this was approved or not...

We haven't approved *anything* yet.  The deadline was just Saturday, and I'm 
still keying stuff into the conference management system.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match