Re: [HACKERS] Suggestion: Which Binary?
Robert Treat wrote: The problem is that PostgreSQL is moving out of the realm of hard-core geeks only and more into the mainstream. I'd bet a number of our users have very little idea how autoconf and it's progeny work. It's probably not unlikely that those folks would be able to figure out where their perl was, but then not know how to tell it to configure. Most such users would use a binary distribution, though - either from the OS supplier or from our collection of binaries. If people are going to build postgres themselves from source then I *do* expect them to be moderately hard-core geeks. ISTM that by any measure of the general population, David Wheeler is a hard-core geek. :-) Actually by most measures of the programming/oss community he is a hard core geek. But he still got tripped up by this. A lot of people never get passed ./configure;make;make install even though they do a lot of coding on oss projects. Why turn these people away? Robert, You missed my point completely. I am not saying we should turn away people like David. In fact, it was me who suggested to him that he should write to -hackers on this subject. All I was saying was that we should not feel a need to tailor the configure script for people who aren't hard core geeks, as Jim was suggesting. I agree that we should help people like David. cheers andrew ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] will it work
hai, will it be possible to use one server for Process and another server for Reports. We are using Postgresql as database and java - Sivakumar _ NRIs Zero balance Account. FREE Money Transfers with FREE DVD https://www.online.citibank.co.in/portal/rca_msntagofline.htm ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] semaphore usage port based?
On Sun, Apr 02, 2006 at 11:26:52PM -0400, Tom Lane wrote: Kris Kennaway [EMAIL PROTECTED] writes: On Sun, Apr 02, 2006 at 11:17:49PM -0400, Tom Lane wrote: I have no objection to doing that, so long as you are actually doing it correctly. This example shows that each jail must have its own SysV semaphore key space, else information leaks anyway. By default SysV shared memory is disallowed in jails. Hm, the present problem seems to be about semaphores not shared memory Sorry, I meant IPC. ... although I'd not be surprised to find that there's a similar issue around shared memory. Anyway, if FBSD's position is that they are uninterested in supporting SysV IPC in connection with jails, then I think the Postgres project position has to be that we are uninterested in supporting Postgres inside FBSD jails. No-one is taking a position of being uninterested, so please don't be hasty to reciprocate. Kris pgpFHpzpEVx0Q.pgp Description: PGP signature
Re: [HACKERS] semaphore usage port based?
On Sun, Apr 02, 2006 at 11:17:49PM -0400, Tom Lane wrote: Kris Kennaway [EMAIL PROTECTED] writes: On Sun, Apr 02, 2006 at 11:08:11PM -0400, Tom Lane wrote: If this is the story, then FBSD have broken their system and must revert their change. They do not have kernel behavior that totally hides the existence of the other process, and therefore having some calls that pretend it's not there is simply inconsistent. I'm guessing it's a deliberate change to prevent the information leakage between jails. I have no objection to doing that, so long as you are actually doing it correctly. This example shows that each jail must have its own SysV semaphore key space, else information leaks anyway. By default SysV shared memory is disallowed in jails. Kris pgpyArbJ1akOg.pgp Description: PGP signature
Re: [HACKERS] semaphore usage port based?
On Sun, Apr 02, 2006 at 11:08:11PM -0400, Tom Lane wrote: I venture that FBSD 6 has decided to return ESRCH (no such process) where FBSD 4 returned some other error that acknowledged that the process did exist (EPERM would be a reasonable guess). If this is the story, then FBSD have broken their system and must revert their change. They do not have kernel behavior that totally hides the existence of the other process, and therefore having some calls that pretend it's not there is simply inconsistent. I'm guessing it's a deliberate change to prevent the information leakage between jails. Kris pgp0723nvyM1m.pgp Description: PGP signature
Re: [HACKERS] semaphore usage port based?
On Sun, Apr 02, 2006 at 11:41:01PM -0400, Kris Kennaway wrote: On Mon, Apr 03, 2006 at 12:30:58AM -0300, Marc G. Fournier wrote: 'k, but how do I fix kill so that it has the proper behaviour if SysV is enabled? Check the source, perhaps there's already a way. If not, talk to whoever made the change. Maybe a mount option for procfs that allows for pre-5.x behaviour? procfs has nothing to do with this though. I'm not the first one to point out that this is a problem, just the first to follow it through to the cause ;( And I believe there is more then just PostgreSQL that is affected by shared memory (ie. apache2 needs SysV IPC enabled, so anyone doing that in a jail has it enabled also) ... Also note that SysV IPC is not the problem here, it's the change in the behaviour of kill() that is causing postgresql to become confused. That's what you should investigate. The ESRCH error is being returned from prison_check(), that would be a good starting place. Andrew ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] semaphore usage port based?
On Mon, Apr 03, 2006 at 12:30:58AM -0300, Marc G. Fournier wrote: On Sun, 2 Apr 2006, Kris Kennaway wrote: On Sun, Apr 02, 2006 at 11:17:49PM -0400, Tom Lane wrote: Kris Kennaway [EMAIL PROTECTED] writes: On Sun, Apr 02, 2006 at 11:08:11PM -0400, Tom Lane wrote: If this is the story, then FBSD have broken their system and must revert their change. They do not have kernel behavior that totally hides the existence of the other process, and therefore having some calls that pretend it's not there is simply inconsistent. I'm guessing it's a deliberate change to prevent the information leakage between jails. I have no objection to doing that, so long as you are actually doing it correctly. This example shows that each jail must have its own SysV semaphore key space, else information leaks anyway. By default SysV shared memory is disallowed in jails. 'k, but how do I fix kill so that it has the proper behaviour if SysV is enabled? Check the source, perhaps there's already a way. If not, talk to whoever made the change. Maybe a mount option for procfs that allows for pre-5.x behaviour? procfs has nothing to do with this though. I'm not the first one to point out that this is a problem, just the first to follow it through to the cause ;( And I believe there is more then just PostgreSQL that is affected by shared memory (ie. apache2 needs SysV IPC enabled, so anyone doing that in a jail has it enabled also) ... Also note that SysV IPC is not the problem here, it's the change in the behaviour of kill() that is causing postgresql to become confused. That's what you should investigate. Kris pgpepKUMNlFKd.pgp Description: PGP signature
Re: [HACKERS] semaphore usage port based?
On Mon, 3 Apr 2006, Stephen Frost wrote: So I think the code is pretty bulletproof as long as it's in a system that is behaving per SysV spec. The problem in the current FBSD situation is that the jail mechanism is exposing semaphore sets across jails, but not exposing the existence of the owning processes. That behavior is inconsistent: if process A can affect the state of a sema set that process B can see, it's surely unreasonable to pretend that A doesn't exist. This is certainly a problem with FBSD jails... Not only the inconsistancy, but what happens if someone manages to get access to the appropriate uid under one jail and starts sniffing or messing with the semaphores or shared memory segments from other jails? If that's possible then that's a rather glaring security problem... This is why it's disabled by default, and the jail documentation specifically advises of this possibility. Excerpt below. Robert N M Watson security.jail.sysvipc_allowed This MIB entry determines whether or not processes within a jail have access to System V IPC primitives. In the current jail imple- mentation, System V primitives share a single namespace across the host and jail environments, meaning that processes within a jail would be able to communicate with (and potentially interfere with) processes outside of the jail, and in other jails. As such, this functionality is disabled by default, but can be enabled by setting this MIB entry to 1. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] semaphore usage port based?
On Mon, 3 Apr 2006, Stephen Frost wrote: This is why it's disabled by default, and the jail documentation specifically advises of this possibility. Excerpt below. Ah, I see, glad to see it's accurately documented. As it has been for the last five years, I believe since introduction of the setting to allow System V IPC to be used with documented limitations. Given the rather significant use of shared memory by Postgres it seems to me that jail'ing it under FBSD is unlikely to get you the kind of isolation between instances that you want (the assumption being that you want to avoid the possibility of a user under one jail impacting a user in another jail). As such, I'd suggest finding something else if you truely need that isolation for Postgres or dropping the jails entirely. Running the Postgres instances under different uids (as you'd probably expect to do anyway if not using the jails) is probably the right approach. Doing that and using jails would probably work, just don't delude yourself into thinking that you're safe from a malicious user in one jail. Yes, there seems to be an awful lot of noise being made about the fact that the system does, in fact, work exactly as documented, and that the configuration being complained about is one that is specifically documented as being unsupported and undesirable. As commented elsewhere in this thread, currently, there is no virtualization support for System V IPC in the FreeBSD Jail implementation. That may change if/when someone implements it. Until it's implemented, it isn't going to be there, and the system won't behave as though it's there no matter how much jumping up and down is done. Robert N M Watson ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] semaphore usage port based?
On Sun, 2 Apr 2006, Tom Lane wrote: Oops. Here is the problem: kill() is lying by claiming there is no such process as 83699. It looks to me like there in fact is such a process, but it's in a different jail. I venture that FBSD 6 has decided to return ESRCH (no such process) where FBSD 4 returned some other error that acknowledged that the process did exist (EPERM would be a reasonable guess). If this is the story, then FBSD have broken their system and must revert their change. They do not have kernel behavior that totally hides the existence of the other process, and therefore having some calls that pretend it's not there is simply inconsistent. FreeBSD's mandatory access control models, such as multi-level security, biba integrity, and type enforcement, will generally provide consistent protection under the circumstances you describe: specifically, that information flow invariants across IPC types, including System V IPC and inter-process signalling, will allow flow only in keeping with the policy. However, I guess I would counter with the following concern: the PID returned by semctl() has the following definition: GETPID Return the pid of the last process to perform an operation on semaphore number semnum. However, pid's in general uniquely identify a process only at the time they are recorded. So any pid returned here is necessarily stale -- even if there is another process with the pid returned by GETPID, it may actually be a different process that has ended up with the same pid. The longer the gap since the last semaphore operation, the more likely (presumably) it is that the pid has been recycled. And on modern systems with thousands of processes and high process turn-over (i.e., systems with CGI and other sorts of scripting),pid reuse can happen quickly. Is your use of the pid here consistent with fact that pid's are reused quickly after process exit? Use of pid's in UNIX is often unreliable, and must be combined with other synchronizing, such as file locking on a pidfile, to ensure that the pid read is valid. Even then, you can't implement atomic check-pid-and-signal using current UNIX APIs, which would require a notion of a process handle (or, in the parlance of Mach, a task port). Another thought along these lines -- especially with the proliferation of fine-grained access control systems, such as Type Enforcement in SELinux, I would be cautious about assuming that two processes being able to manipulate the same sempahore implies the ability to exchange signals using the signal facility. Robert N M Watson ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] semaphore usage port based?
On Mon, 3 Apr 2006, Tom Lane wrote: Robert Watson [EMAIL PROTECTED] writes: Maybe I've misunderstood the problem here -- is the use of the GETPID operation occuring within a coordinated set of server processes, or does it also occur between client and server processes? I think it's quite reasonable to argue that a coordinated set of server processes should be able to see each other, especially if they're running as the same user, in the same jail, started at the same time. We use the semaphore sets only within postgres server processes; we could hardly expect client processes to be able to get at them, since in general clients aren't on the same machine. The issue here, though, is that Marc is trying to start multiple postgres servers in different jails, and in that context the different postgres servers aren't coordinated in any real sense. We'd prefer that they didn't interact at all, but they are interacting because the SysV code isn't restricting IPC to occur only within a jail. BTW, Marc, it occurs to me that a workaround for you would be to create a separate userid for postgres to run under in each jail; then the regular protection mechanisms would prevent the different postmasters from interfering with each others' semaphore sets. But I think that workaround just makes it even clearer that the jail mechanism isn't behaving very sanely. Any multi-instance application that uses unvirtualized System V IPC must know how to distinguish between those instances. This is true of any potential communication mechanism used by multi-instance applications -- be it a command line argument to specify an alternative configuration file, or a configuration file that specifies alternative ports, working directories, mail spool directories, etc. If you install two instances of sendmail, it requires some configuration to teach them not to step all over each other, and this is not an accident: if they try to use the same mail spools, ports, etc, things will go badly. I can't imagine that PostgreSQL should be any different -- it has to be pointed at what resources to use and how to use them -- some of that will be a property of how it's written, and some how it's configured. Presumably, running multiple instances of PostgreSQL in jails should not be all that different from running multiple instances on any UNIX machine: they must not overlap where shared resources are concerned. How is PostgreSQL deciding what semaphores to use? Can it be instructed to use non-colliding ones by specifying an alternative argument to pass to ftok(), or ID to use directly? I would, in general, consider the use of System V IPC across jails (as opposed to in a single jail) unsupported, since it's not consistent with the security model. That'd be fine with me --- the problem here is that we've got unwanted communication across jails. If, say, the jail ID were considered part of semaphore keys, we'd be in fine shape. Well, I think it's definitely unwanted communications, but until such time as FreeBSD supports virtualizing the System V IPC name spaces, the fact that you can communicate between jails when System V IPC support is turned on for the jail shouldn't be a surprise, and should in fact be considered a feature. However, if applications behave incorrectly when treading over each other because either they aren't written to support specifying how not to walk over each other, or if they are not configured to use that support, then they're not going to behave well :-). Robert N M Watson ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] semaphore usage port based?
On Mon, 3 Apr 2006, Tom Lane wrote: That's a fair question, but in the context of the code I believe we are behaving reasonably. The reason this code exists is to provide some insurance against leaking semaphores when a postmaster process is terminated unexpectedly (ye olde often-recommended-against kill -9 postmaster, for instance). If the PID returned by GETPID is nonexistent or belongs to a process not owned by the postgres userid then we assume that the semaphore set can be recycled. We could get fooled by PID recycling if the PID returned by GETPID belongs to a postgres-owned process that isn't actually the original owner, but the penalty is just that we'll fail to recycle semaphores that could be released. Not very harmful, and not very probable either, unless you're running postgres under a userid that's used for a lot of other stuff too. There is not much risk of long-term leakage of many semaphore sets, even if you've got lots of postmaster crashes going on (which I sure hope you don't). The code is designed to retry the same semaphore keys on each cycle of life, so you'd have to get fooled by chance coincidence of existing PIDs every time over many cycles to have a severe resource-leakage problem. (BTW, Marc, that's the reason for *not* randomizing the key selection as you suggested.) So I think the code is pretty bulletproof as long as it's in a system that is behaving per SysV spec. The problem in the current FBSD situation is that the jail mechanism is exposing semaphore sets across jails, but not exposing the existence of the owning processes. That behavior is inconsistent: if process A can affect the state of a sema set that process B can see, it's surely unreasonable to pretend that A doesn't exist. Maybe I've misunderstood the problem here -- is the use of the GETPID operation occuring within a coordinated set of server processes, or does it also occur between client and server processes? I think it's quite reasonable to argue that a coordinated set of server processes should be able to see each other, especially if they're running as the same user, in the same jail, started at the same time. After all, coordinated server applications frequently use signals to manage resources and perform asynchronous notification (i.e., SIGCHLD, SIGHUP, etc). If we're talking about clients and servers coordinating using the same System V IPC name space, I find myself less sympathetic to the idea that otherwise unrelated processes on either side of the IPC mechanism should be using out-of-band process operations to test for mutual presence. There has been occasional investigation of virtualizing the System V IPC name space, but as you are no doubt aware, the name space doesn't lend itself to virtualization, as it fails to be conveniently hierarchical, etc. This is just another of the ways in which System V IPC offers quite useful IPC services in less useful ways. I would, in general, consider the use of System V IPC across jails (as opposed to in a single jail) unsupported, since it's not consistent with the security model. However, I have doubts about the behavioral dependency we're talking about above. Robert N M Watson ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] semaphore usage port based?
On Mon, Apr 03, 2006 at 06:51:45PM -0400, Stephen Frost wrote: * Robert Watson ([EMAIL PROTECTED]) wrote: On Mon, 3 Apr 2006, Stephen Frost wrote: This is certainly a problem with FBSD jails... Not only the inconsistancy, but what happens if someone manages to get access to the appropriate uid under one jail and starts sniffing or messing with the semaphores or shared memory segments from other jails? If that's possible then that's a rather glaring security problem... This is why it's disabled by default, and the jail documentation specifically advises of this possibility. Excerpt below. Ah, I see, glad to see it's accurately documented. Given the rather significant use of shared memory by Postgres it seems to me that jail'ing it under FBSD is unlikely to get you the kind of isolation between instances that you want (the assumption being that you want to avoid the possibility of a user under one jail impacting a user in another jail). As such, I'd suggest finding something else if you truely need that isolation for Postgres or dropping the jails entirely. Running the Postgres instances under different uids (as you'd probably expect to do anyway if not using the jails) is probably the right approach. Doing that and using jails would probably work, just don't delude yourself into thinking that you're safe from a malicious user in one jail. Yes; however jails are still useful for administrative compartmentalization even when you have to weaken their security properties, such as here. Kris pgp2ZsCjYtna3.pgp Description: PGP signature
Re: [HACKERS] semaphore usage port based?
On Mon, 3 Apr 2006, Tom Lane wrote: Robert Watson [EMAIL PROTECTED] writes: Any multi-instance application that uses unvirtualized System V IPC must know how to distinguish between those instances. Sure. How is PostgreSQL deciding what semaphores to use? Can it be instructed to use non-colliding ones by specifying an alternative argument to pass to ftok(), or ID to use directly? The problem here is not that we don't know how to avoid a collision. The problem is stemming from code that we added to prevent semaphore leakage during failure recoveries. The code believes that it is deleting a semaphore set left over from a crashed previous instance of the same postmaster. We don't use ftok() to determine the keys, and I'm disinclined to think that doing so would improve the situation: you could still have key collisions, they'd just be unpredictable and there'd be no convenient mechanism for escaping one if you hit it. I guess what I'm saying is this: by turning on system V IPC in a jail, administrators accept that they are using an unsupported configuration, in which the security features of jail, which include hiding the process state of other jails, are known to conflict with the System V IPC services. We specifically disable System V IPC in jails because it is known to have undesirable properties. When configuring systems in that state, the responsibility falls on the administrator to disambiguate the configuration by specifying which resources must be used in order to prevent a conflict, because software operating in that environment will not be able to do so properly. The goal of the switch to enable System V IPC is to allow IPC to be enabled for a single jail at a time, where it can be used to its full capabilities, without violating the security model. If it is turned on for more than one jail, then isolation is not provided for System V IPC. So my recommendation is, if people want to run Postgres in more than one jail at a time, they be provided with a configuration option to disambiguate which semaphore to use: they must hard-code that it will not use the same sempahore already in use by another Postgres instance in another Jail. This is no different than specifying that if there are multiple Apache's running on a single system, that they run on different port/IP combinations. If they aren't configured to do so, one of them will encounter an error when running, because the resource is already in use, and you may get unpredictable results if the two Apaches are started at the same time, restarted, etc, as they will race to acquire the resource. Whether you pull the resource ID out of a hat, use ftok(), or whatever, I really mind, and have no strong opinion. The name space of System V IPC is one of the known problems with the IPC model, and sadly, one accepts those problems by using those IPC mechanisms. Robert N M Watson ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] semaphore usage port based?
On Mon, Apr 03, 2006 at 03:42:51PM -0400, Stephen Frost wrote: * Tom Lane ([EMAIL PROTECTED]) wrote: That's a fair question, but in the context of the code I believe we are behaving reasonably. The reason this code exists is to provide some insurance against leaking semaphores when a postmaster process is terminated unexpectedly (ye olde often-recommended-against kill -9 postmaster, for instance). If the PID returned by GETPID is Could this be handled sensibly by using SEM_UNDO? Just a thought. So I think the code is pretty bulletproof as long as it's in a system that is behaving per SysV spec. The problem in the current FBSD situation is that the jail mechanism is exposing semaphore sets across jails, but not exposing the existence of the owning processes. That behavior is inconsistent: if process A can affect the state of a sema set that process B can see, it's surely unreasonable to pretend that A doesn't exist. This is certainly a problem with FBSD jails... Not only the inconsistancy, but what happens if someone manages to get access to the appropriate uid under one jail and starts sniffing or messing with the semaphores or shared memory segments from other jails? If that's possible then that's a rather glaring security problem... This was stated already upthread, but sysv IPC is disabled by default in jails for precisely this reason. So yes, when you turn it on it's a potential security problem if your jails are supposed to be compartmentalized. Kris pgpP7JxlYtaUQ.pgp Description: PGP signature
Re: [HACKERS] will it work
On 4/4/06, siva kumar [EMAIL PROTECTED] wrote: hai, will it be possible to use one server for Process and another server for Reports. We are using Postgresql as database and java You surely can use two servers, but they cannot share the same data cluster. You would have to replicate your process data to your reporting database via Slony or some other replication system. -- Jonah H. Harris, Database Internals Architect EnterpriseDB Corporation 732.331.1324 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] I have changed employers
FYI, I have left SRA and am now working for EnterpriseDB: http://www.enterprisedb.com/news_events/press_releases/04_04_06.do I will be doing the same community work I did before, so my role in the project will not change. (I will remain perpetually backlogged. :-) ) I will always be grateful for SRA's sponsorship of my work for 4.5 years, and wish them continued success. I am also looking forward to working with EnterpriseDB on new projects and opportunities. -- Bruce Momjian http://candle.pha.pa.us + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] I have changed employers
Bruce Momjian wrote: FYI, I have left SRA and am now working for EnterpriseDB: http://www.enterprisedb.com/news_events/press_releases/04_04_06.do Congratulations! I will be doing the same community work I did before, so my role in the project will not change. (I will remain perpetually backlogged. :-) ) Bummer, I had expected they would have you working on the 2nd edition of your book :-) -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] I have changed employers
Bruce, On 4/4/06 5:06 PM, Bruce Momjian pgman@candle.pha.pa.us wrote: I am also looking forward to working with EnterpriseDB on new projects and opportunities. Congrats! - Luke ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Stats collection on Windows
Hi all, I think I've found the cause (or one of the causes) why stats collection is unreliable on Windows and I'm wondering about the best way to go about fixing it. The problem is that process IDs on Windows seem to be assigned without much rhyme or reason and it seems to happen relatively frequently that a new process will be assigned the same process ID as a process which recently died. If this happens before the backend has been expired out of pgstat.c's pgStatBeDead hash, the backend will be missed. I was thinking the postmaster could maintain a backend sequence number with similar semantics to a UNIXish process ID which could then be used as the key for pgStatBeDead instead of the actual process ID. Does that sound reasonable? Peter ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Fixing domain input
On Apr 4, 2006, at 10:39 , Tom Lane wrote: So there's no additional risk --- in fact, arguably having such a function crash during normal input of NULL would be a Good Thing, because it would make it far more likely that the mistake would get noticed and fixed before someone used it as a form of DOS attack. Granted, finding an error earlier than later is a Good Thing, but an Even Better Thing would be to prevent crashing the backend, and afaics (as far as that is) the change you propose doesn't hurt anything. Do you see any way to do that? I'm glad to see work being done on domains. I'm definitely learning from the discussion. Michael Glaesemann grzm myrealbox com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Fixing domain input
I'm glad to see work being done on domains. I'm definitely learning from the discussion. I wonder if we should implement 'GRANT USAGE ON DOMAINS' for spec compliance sometime... Chris ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Fixing domain input
Michael Glaesemann [EMAIL PROTECTED] writes: Granted, finding an error earlier than later is a Good Thing, but an Even Better Thing would be to prevent crashing the backend, and afaics (as far as that is) the change you propose doesn't hurt anything. Do you see any way to do that? For starters we'd have to forbid any user-written C functions. Somehow that doesn't seem like a net win. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Fixing domain input
On Apr 5, 2006, at 11:46 , Tom Lane wrote: For starters we'd have to forbid any user-written C functions. Somehow that doesn't seem like a net win. Ouch. Michael Glaesemann grzm myrealbox com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Stats collection on Windows
Peter Brant [EMAIL PROTECTED] writes: I think I've found the cause (or one of the causes) why stats collection is unreliable on Windows and I'm wondering about the best way to go about fixing it. The problem is that process IDs on Windows seem to be assigned without much rhyme or reason and it seems to happen relatively frequently that a new process will be assigned the same process ID as a process which recently died. If this happens before the backend has been expired out of pgstat.c's pgStatBeDead hash, the backend will be missed. That's an interesting theory, but do you have any actual evidence for it? The evidence I've seen says that our big problem on Windows is the stats collector process just quitting due to unexplained piperead() failures. (I mean, I'd love to blame Microsoft for everything, but even the Redmond crowd should be able to figure out that recycling process IDs instantly would be a stupid idea...) regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Stats collection on Windows
Tom Lane [EMAIL PROTECTED] wrote The problem is that process IDs on Windows seem to be assigned without much rhyme or reason and it seems to happen relatively frequently that a new process will be assigned the same process ID as a process which recently died. That's an interesting theory, but do you have any actual evidence for it? I can confirm that on Windows 2000 the process ID is recycled instantly. Regards, Qingqing ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Stats collection on Windows
Tom Lane [EMAIL PROTECTED] wrote Redmond crowd should be able to figure out that recycling process IDs instantly would be a stupid idea...) Can you explain more of this? IMHO, if we rely on feature like this, the difference is unstable-every-day vs. unstable-every-year. Regards, Qingqing ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Stats collection on Windows
Qingqing Zhou [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] wrote Redmond crowd should be able to figure out that recycling process IDs instantly would be a stupid idea...) Can you explain more of this? IMHO, if we rely on feature like this, the difference is unstable-every-day vs. unstable-every-year. The mere existence of the kill() primitive should bring to mind reasons why it's a bad idea. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] Left joins and inheritance (table partitioning)
I've recently been playing with table partitioning limitations. Turning over a large volume of data in inherited structures in a live environment, and have run into a couple of snags in the planner. The first is that LEFT JOIN will always do a sequential scan on all inherited tables. The second is that IN (1,4,6) works very differently than IN (SELECT id FROM tab) when tab contains the values 1, 4, and 6. I'm not surprised a straight left join failed, but I was surprised that IN failed to use an index with enable_seqscan = off. My fallback plan is to simply create a view and replace it to point to the correct data segment when changes occur. BEGIN; CREATE TABLE key (keyword_id serial PRIMARY KEY); INSERT INTO key DEFAULT VALUES; INSERT INTO key DEFAULT VALUES; INSERT INTO key DEFAULT VALUES; INSERT INTO key DEFAULT VALUES; INSERT INTO key DEFAULT VALUES; INSERT INTO key DEFAULT VALUES; INSERT INTO key DEFAULT VALUES; INSERT INTO key DEFAULT VALUES; INSERT INTO key DEFAULT VALUES; CREATE TABLE key2 (keyword_id integer PRIMARY KEY); INSERT INTO key2 VALUES (1); INSERT INTO key2 VALUES (6); INSERT INTO key2 VALUES (8); ANALYZE key2; CREATE TABLE foo ( keyword_id integer PRIMARY KEY REFERENCES key); CREATE TABLE foo2 ( PRIMARY KEY (keyword_id) , FOREIGN KEY (keyword_id) REFERENCES KEY ) INHERITS (foo); INSERT INTO foo2 VALUES (1); INSERT INTO foo2 VALUES (2); INSERT INTO foo2 VALUES (3); INSERT INTO foo2 VALUES (4); INSERT INTO foo2 VALUES (5); INSERT INTO foo2 VALUES (6); INSERT INTO foo2 VALUES (7); INSERT INTO foo2 VALUES (8); INSERT INTO foo2 VALUES (9); SET enable_seqscan = off; EXPLAIN SELECT * FROM key2 LEFT JOIN foo USING (keyword_id) ; QUERY PLAN - Hash Left Join (cost=20073.50..20191.74 rows=6 width=4) Hash Cond: (outer.keyword_id = inner.keyword_id) - Index Scan using key2_pkey on key2 (cost=0.00..3.03 rows=3 width=4) - Hash (cost=20062.80..20062.80 rows=4280 width=4) - Append (cost=1.00..20062.80 rows=4280 width=4) - Seq Scan on foo (cost=1.00..10031.40 rows=2140 width=4) - Seq Scan on foo2 foo (cost=1.00..10031.40 rows=2140 width=4) (7 rows) EXPLAIN SELECT * FROM key2 LEFT JOIN (SELECT keyword_id FROM foo WHERE keyword_id IN (SELECT keyword_id FROM key2) ) AS tab USING (keyword_id) ; QUERY PLAN --- Merge Left Join (cost=20087.38..20090.46 rows=3 width=4) Merge Cond: (outer.keyword_id = inner.keyword_id) - Index Scan using key2_pkey on key2 (cost=0.00..3.03 rows=3 width=4) - Sort (cost=20087.38..20087.39 rows=6 width=4) Sort Key: public.foo.keyword_id - Hash IN Join (cost=10003.04..20087.30 rows=6 width=4) Hash Cond: (outer.keyword_id = inner.keyword_id) - Append (cost=1.00..20062.80 rows=4280 width=4) - Seq Scan on foo (cost=1.00..10031.40 rows=2140 width=4) - Seq Scan on foo2 foo (cost=1.00..10031.40 rows=2140 width=4) - Hash (cost=3.03..3.03 rows=3 width=4) - Index Scan using key2_pkey on key2 (cost=0.00..3.03 rows=3 width=4) (12 rows) EXPLAIN SELECT * FROM key2 LEFT JOIN (SELECT keyword_id FROM foo WHERE keyword_id IN (1,6,8) ) AS tab USING (keyword_id) ; QUERY PLAN Merge Left Join (cost=22.08..25.16 rows=3 width=4) Merge Cond: (outer.keyword_id = inner.keyword_id) - Index Scan using key2_pkey on key2 (cost=0.00..3.03 rows=3 width=4) - Sort (cost=22.08..22.09 rows=6 width=4) Sort Key: public.foo.keyword_id - Append (cost=3.01..22.00 rows=6 width=4) - Bitmap Heap Scan on foo (cost=3.01..9.50 rows=3 width=4) Recheck Cond: ((keyword_id = 1) OR (keyword_id = 6) OR (keyword_id = 8)) - BitmapOr (cost=3.01..3.01 rows=3 width=0) - Bitmap Index Scan on foo_pkey (cost=0.00..1.00 rows=1 width=0) Index Cond: (keyword_id = 1) - Bitmap Index Scan on foo_pkey (cost=0.00..1.00 rows=1 width=0) Index Cond: (keyword_id = 6) - Bitmap Index Scan on foo_pkey (cost=0.00..1.00 rows=1 width=0) Index Cond: (keyword_id = 8) - Bitmap Heap Scan on foo2 foo (cost=6.01..12.50 rows=3 width=4)
Re: [HACKERS] Left joins and inheritance (table partitioning)
Rod Taylor [EMAIL PROTECTED] writes: I've recently been playing with table partitioning limitations. Turning over a large volume of data in inherited structures in a live environment, and have run into a couple of snags in the planner. The first is that LEFT JOIN will always do a sequential scan on all inherited tables. Try it in CVS HEAD. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Left joins and inheritance (table partitioning)
On Tue, 2006-04-04 at 23:50 -0400, Tom Lane wrote: Rod Taylor [EMAIL PROTECTED] writes: I've recently been playing with table partitioning limitations. Turning over a large volume of data in inherited structures in a live environment, and have run into a couple of snags in the planner. The first is that LEFT JOIN will always do a sequential scan on all inherited tables. Try it in CVS HEAD. Indeed, much better. Sorry about the noise. -- ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] Summer of Code Preparation
Folks, I've been warned that Summer of Code is coming up again soon. We need to be ready with proposals which are officially endorsed by the PostgreSQL project. Which means we need: a) Projects which could be accomplished in a summer, and b) Students to do them. We have one or two weeks to get this together. Your help is greatly desired .. and if you're a CS student hacker reading this, drop me a line! -- Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] PostgreSQL Anniversary Proposals -- Important Update
Tatsuo, I'm wondering if this was approved or not... We haven't approved *anything* yet. The deadline was just Saturday, and I'm still keying stuff into the conference management system. -- Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match