Re: [HACKERS] Proposal: variant of regclass
Hello, Tom. You wrote: TL> Tatsuo Ishii writes: >> I would like to add a variant of regclass, which is exactly same as >> current regclass except it does not raise an error when the target >> table is not found. Instead it returns InvalidOid (0). TL> I've sometimes thought we should just make all the reg* input converters TL> act that way. Absolutely agree. I cannot see a case when an error is the appropriate solution. Casting nonexistent objects to NULL is the way to go for me. TL> It's not terribly consistent that they'll happily take TL> numeric inputs that don't correspond to any existing OID. And more TL> often than not, I've found the throw-an-error behavior to be annoying TL> not helpful. TL> In any case, -1 for dealing with this only for regclass and not the TL> other ones. TL> regards, tom lane -- With best wishes, Pavel mailto:pa...@gf.microolap.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #7873: pg_restore --clean tries to drop tables that don't exist
On 5 December 2013 01:33, Peter Eisentraut wrote: > Can someone in this thread clarify the commit fest situation? I see two > entries that appear to be the same: > > https://commitfest.postgresql.org/action/patch_view?id=1174 > https://commitfest.postgresql.org/action/patch_view?id=1175 > > I think the first one is a duplicate or obsolete. > #1174 looks to be a separate feature. I don't think it's dependent on #1175 from a code standpoint, but it probably needs it to work properly in all situations. I think #1175 is close to being ready for commit. Pavel, will you produce an updated patch based on our last discussion? I'll set this patch to waiting on author. Regards, Dean
Re: [HACKERS] Problem with displaying "wide" tables in psql
And my patch affects the row view only. postgres=# \x 1 postgres=# create table wide_table (value text); postgres=# insert into wide_table values ('afadsafasd fasdf asdfasd fsad fas df sadf sad f sadf sadf sa df sadfsadfasd fsad fsa df sadf asd fa sfd sadfsadf asdf sad f sadf sad fadsf'); postgres=# insert into wide_table values ('afadsafasd fasdf asdfasd'); postgres=# select * from wide_table; -[ RECORD 1 ]--- --- value | afadsafasd fasdf asdfasd fsad fas df sadf sad f sadf sadf sa df sadfsad fasd fsad fsa df sadf asd fa sfd sadfsadf asdf sad f sadf sad fadsf -[ RECORD 2 ]--- --- value | afadsafasd fasdf If we add a new column to this table and put the border on, we can see that all values in the table have the same width. postgres=# alter table wide_table add column id integer; postgres=# \pset border 2 postgres=# select * from wide_table; +-[ RECORD 1 ]-- --+ | value | afadsafasd fasdf asdfasd fsad fas df sadf sad f sadf sadf sa df sadfs adfasd fsad fsa df sadf asd fa sfd sadfsadf asdf sad f sadf sad fadsf | | id| | +-[ RECORD 2 ]-- --+ | value | afadsafasd fasdf asdfasd | | id| | +---+--- --+ My patch tries to solve these problems: -[ RECORD 1 ]--- value | afadsafasd fasdf asdfasd fsad fas df sadf sad f sadf sadf sa df sadfsad fasd fsad fsa df sadf asd fa sfd sadfsadf asdf sad f sadf sad fadsf -[ RECORD 2 ]--- value | afadsafasd fasdf asdfasd and +-[ RECORD 1 ]-+ | value | afadsafasd fasdf asdfasd fsad fas df sadf sad f sadf sadf sa df sadfs adfasd fsad fsa df sadf asd fa sfd sadfsadf asdf sad f sadf sad fadsf | | id| | +-[ RECORD 2 ]-+ | value | afadsafasd fasdf asdfasd | | id| | +---+--+ Regards 2013/12/4 Pavel Stehule > Hello > > postgres=# \pset format wrapped > Output format (format) is wrapped. > postgres=# select 'afadsafasd fasdf asdfasd fsad fas df sadf sad f sadf > sadf sa df sadfsadfasd fsad fsa df sadf asd fa sfd sadfsadf asdf sad f sadf > sad fadsf'; > > ?column? 
> > - > afadsafasd fasdf asdfasd fsad fas df sadf sad f sadf sadf sa df > sadfsadfasd fsad fsa df sadf asd fa sfd sadfsadf a. > .sdf sad f sadf sad fadsf > (1 row) > > It works as expected > > but it is not supported for row view. So any fix of this mode should be > nice > > Regards > > Pavel > > > 2013/12/4 Sergey Muraviov > >> Thank you for this trick. >> It would be nice if this trick was documented. >> >> However, with the pager I can't see wide value on one screen, select and >> copy it entirely. >> And I have to press many keys to find the necessary part of the value. >> There is no such problems with the patch. >> >> >> 2013/12/3 Pavel Stehule >> >>> Hello >>> >>> do you know a pager less trick >>> >>> http://merlinmoncure.blogspot.cz/2007/10/better-psql-with-less.html >>> >>> Regards >>> >>> Pavel Stehule >>> >>> >>> 2013/12/3 Sergey Muraviov >>> Hi. Psql definitely have a problem with displaying "wide" tables. Even in expanded mode, they look horrible. So I tried to solve this problem. Before the patch: postgres=# \x 1 Expanded display (expanded) is on. postgres=# \pset border 2 Border style (border) is 2. postgres=# select * from pg_stats; +-[ RECORD 1 ]---+-- ---
Re: [HACKERS] Extension Templates S03E11
On Wed, 2013-12-04 at 15:28 -0500, Robert Haas wrote: > My experience with software upgrades is that > cases like this, and even weirder things, happen pretty routinely, so > I think more control is good. There would still be control: just use full SQL scripts appropriately. I'm sure there's still room for surprise as extensions become more complex. But ultimately, those surprises will happen because of broken upgrade/downgrade scripts, and if those are broken, the user is probably in for a surprise in the near future anyway. It's fine with me if we help alleviate these problems by using a proper system to organize these upgrades/downgrades. But everything proposed seems pretty bad from the perspective of an extension author -- extra documentation, extra ceremony, more room for error, and more maintenance every time they release a new version. And once we document it, we have to support those behaviors for a long time, which will almost certainly prevent a better solution later. I think we should just make it simple: * If there is a full SQL script of the given version, we guarantee that we'll execute that one. * Otherwise, we execute the shortest path from a full version to the requested version. * If there's a tie, throw an error. That leaves us with plenty of room to improve the situation later, for instance if we support ordered versions. (I'm not sure if ordered versions was rejected outright, or we just didn't have time to do it properly.) Regards, Jeff Davis
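The three rules above amount to a shortest-path search over the available scripts. Here is a minimal Python sketch of that policy (purely a hypothetical model for illustration, not the actual backend extension code; note this simplistic version only detects ties between paths starting from different full versions):

```python
from collections import deque

def choose_script_path(target, full_versions, upgrade_edges):
    """Pick the scripts to run to install 'target'.

    full_versions: versions for which a full SQL script exists.
    upgrade_edges: dict mapping version -> list of versions reachable
    via a single upgrade script."""
    # Rule 1: a full script for the requested version always wins.
    if target in full_versions:
        return [target]
    # Rule 2: BFS from every full version; the shortest chain of
    # upgrade scripts ending at the target wins.
    best, tie = None, False
    for start in full_versions:
        paths = {start: [start]}
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for nxt in upgrade_edges.get(v, ()):
                if nxt not in paths:
                    paths[nxt] = paths[v] + [nxt]
                    queue.append(nxt)
        path = paths.get(target)
        if path is None:
            continue
        if best is None or len(path) < len(best):
            best, tie = path, False
        elif len(path) == len(best):
            tie = True  # Rule 3: ambiguity is an error
    if best is None:
        raise LookupError("no path to version %s" % target)
    if tie:
        raise ValueError("ambiguous upgrade path to %s" % target)
    return best
```

For example, with a full script for 1.0 and upgrade scripts 1.0→1.1→1.2, requesting 1.2 runs all three; with full scripts for both 1.0 and 2.0 and one-step upgrades from each to 3.0, requesting 3.0 errors out.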
Re: [HACKERS] Performance optimization of btree binary search
Peter Eisentraut writes: > On Wed, 2013-12-04 at 20:27 -0500, Tom Lane wrote: >> Lazy people? I'm not in a hurry to drop it; it's not costing us much to >> just sit there, other than in this connection which we see how to fix. > Actually, I think it probably costs a fair portion of extension authors > when their initial code crashes because they forgot to declare all their > functions V1. I think it might actually be more of a bother to lazy > people than a benefit. Hm. We have heard one or two complaints like that, but not a huge number. I'm worried about breaking code that's been working since god-knows-when; but I will concede there's little evidence that there's very much of that out there either. regards, tom lane
Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol
On Thu, 2013-12-05 at 09:02 +0530, Amit Kapila wrote: > This is certainly not a stupid idea, rather something on similar lines > has been discussed previously in this thread. > Tom has suggested something similar, but I am not sure if there was a > conclusion on that point. Please see the > relevant discussion at the link below: > http://www.postgresql.org/message-id/17384.1346645...@sss.pgh.pa.us Yeah, I think the environment variable idea wasn't actually refuted there.
Re: [HACKERS] Performance optimization of btree binary search
On Wed, 2013-12-04 at 20:27 -0500, Tom Lane wrote: > Lazy people? I'm not in a hurry to drop it; it's not costing us much to > just sit there, other than in this connection which we see how to fix. Actually, I think it probably costs a fair portion of extension authors when their initial code crashes because they forgot to declare all their functions V1. I think it might actually be more of a bother to lazy people than a benefit.
Re: [HACKERS] [bug fix] pg_ctl fails with config-only directory
On Wed, Dec 4, 2013 at 7:57 PM, MauMau wrote: > Hello, > > I've found a bug and would like to fix it, but I cannot figure out how to do > that well. Could you give me any advice? I encountered this on PG 9.2, but > it will probably exist in later versions. > > [Problem] > On Windows, a user with Administrator privileges can start the database > server. However, when he uses config-only directory, the database server > cannot be started. "pg_ctl start" fails with the following messages: > > Execution of PostgreSQL by a user with administrative permissions is not > permitted. > The server must be started under an unprivileged user ID to prevent > possible system security compromises. See the documentation for > more information on how to properly start the server. > > > [Cause] > pg_ctl runs "postgres -C data_directory" to know the data directory. But > postgres cannot be run by a user with Administrator privileges, and displays > the above messages. > > > [Fix] > It is ideal that users with administrative privileges can start postgres, > with the Administrator privileges removed. > > Currently, initdb and pg_ctl take trouble to invoke postgres in a process > with restricted privileges. I understand this improvement was done in 8.2 > or 8.3 for convenience. The same convenience should be available when > running postgres directly, at least "postgres -C", "postgres > --describe-config", and "postgres --single". > > Then, how can we do this? Which approach should we take? > > * Approach 1 > When postgres starts, it removes Administrator privileges from its own > process. But is this possible at all? Windows security API is complex and > provides many functions. It seems difficult to understand them. I'm afraid > it would take a long time to figure out the solution. Is there any good web > page to look at? > > * Approach 2 > Do not call check_root() on Windows when -C, --describe-config, or --single > is specified when running postgres. 
This would be easy, and should not be > dangerous in terms of security because attackers cannot get into the server > process via network. Approach 2 has been discussed previously as a way to resolve this, and it doesn't seem to be a good way to handle it. Please refer to this link: http://www.postgresql.org/message-id/1339601668-sup-4...@alvh.no-ip.org You can go through that mail chain and see if there can be a better solution than Approach 2. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
[HACKERS] same-address mappings vs. relative pointers
During development of the dynamic shared memory facility, Noah and I spent a lot of time arguing about whether it was practical to ensure that a dynamic shared memory segment got mapped at the same address in every backend that used it. The argument went something like this: Me: We'll never be able to make that work reliably. Noah: But if we can't use pointers in the dynamic shared memory segment, our lives will suck. Me: Well, we probably don't really NEED to use data structures that contain pointers all THAT much. Noah: Are you nuts? Of course we need pointers. They're ubiquitous and essential. Me: Meh. I felt somewhat vindicated when I finished the dynamic shared memory message queuing patch (which I'm still hoping someone will review sometime soon...?) since that constitutes a useful chunk of functionality that doesn't care about pointers at all. And there are surely other examples that fall into the same category; for example, an lwlock doesn't contain any pointers today, so storing one in a dynamic shared memory segment at an address that might vary from one process to another ought to work OK, too. I think there was actually a patch a few years ago that made everything use LWLock * rather than LWLockId, which would allow considerably more flexibility in laying out lwlocks in memory - so you could for example try to put them in the same cache lines as the data they protect, or different cache lines than other hot lwlocks - and would probably also be almost enough to allow placing them in a dynamic shared memory segment, which would be useful. But I'm also learning painfully that this kind of thing only goes so far. For example, I spent some time looking at what it would take to provide a dynamic shared memory equivalent of palloc/pfree, a facility that I feel fairly sure would attract a few fans.
Well, for regular palloc, we store a pointer to the memory context before the beginning of the chunk, so that when you call pfree you can find the memory context and stick the chunk on one of its free lists. So there are two pointers there: the pointer to the context is a pointer, of course, but so is the free list. Heh, heh. As I see it, if we want to have facilities like this, we'll have to either (1) make same-address mappings work for as many architectures as possible and don't support these facilities on the remainder or (2) use relative pointers instead of absolute pointers within dynamic shared memory segments, which means a loss of performance, notational clarity, and type-safety. We can also (3) adopt both approaches - some facilities can use relative pointers, which will be portable everywhere but annoying otherwise, and others can work only when same-address mappings are supported. Or we can (4) adopt neither approach, and confine ourselves to data structures that don't use pointers. I still have mixed feelings about the idea of same-address mappings. On the one hand, on 64-bit architectures, there's a huge amount of address space available. Assuming the OS does something even vaguely sensible in terms of laying out the text, heap, stack, and shared library mappings, there ought to be many petabytes of address space that never gets touched, and it's hard to see why we couldn't find some place in there to stick our stuff. But that could require quite a bit of OS-specific knowledge about how memory gets laid out. One idea that I think is originally Noah's, though I may be mutilating it, is to create a very large PROT_NONE mapping in the postmaster and then overwrite that mapping with the mapping for any dynamic shared memory segments we subsequently want to create. In that way, we essentially reserve the address space we want to use before the child is forked and things start to diverge (due to memory allocations, additional shared library loads, etc.). 
But I bet that on at least some operating systems that will actually allocate memory, or at least count toward the system's notion of overcommit, and that will be a problem. So I don't have any really good idea for how to implement this cleanly. Now, on the other hand, as far as dynamic shared memory allocation and freeing is concerned, there aren't really THAT many places where I need a pointer, so using Size or uint64 or something to store an offset instead is annoying, but I have an idea how to do this that only uses pointers in a couple of places, so I think it can be made to work. I am not sure how much complaint that will provoke, though. And even if I do it, the first poor slob that wants to store a linked list or so in dynamic shared memory is going to be unhappy if they can't get a same-address mapping. Maybe that's OK; using linked lists in shared memory might not be a great idea in the first place. I'm sure there will be more than one person who wants to do it, though. Any thoughts on what the least painful compromise is here? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Com
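For what it's worth, option (2) - relative pointers - can be modeled compactly. Here is a toy Python sketch (hypothetical and purely illustrative: a byte array stands in for the shared segment, and "pointers" are byte offsets from the segment start, with offset 0 playing the role of NULL) showing why a linked structure built this way stays valid no matter where the segment is mapped:

```python
import struct

# Each node is (value, next_offset); links are offsets relative to the
# start of the segment rather than absolute addresses.
NODE = struct.Struct("qq")

def node_write(seg, off, value, next_off):
    NODE.pack_into(seg, off, value, next_off)

def walk(seg, head_off):
    """Follow relative links from head_off, collecting values."""
    out, off = [], head_off
    while off != 0:          # 0 == InvalidOffset, our stand-in for NULL
        value, off = NODE.unpack_from(seg, off)
        out.append(value)
    return out

segment = bytearray(1024)
node_write(segment, 16, 1, 32)   # node at offset 16 links to offset 32
node_write(segment, 32, 2, 48)
node_write(segment, 48, 3, 0)    # end of list

# Because links are relative, a copy of the segment "mapped elsewhere"
# (here: literally a byte-for-byte copy) still walks correctly.
copy = bytearray(segment)
print(walk(segment, 16), walk(copy, 16))  # → [1, 2, 3] [1, 2, 3]
```

The cost this sketch glosses over is exactly the one described above: every dereference becomes base-plus-offset arithmetic, and you lose C's type checking on the "pointer" fields.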
Re: [HACKERS] Parallel Select query performance and shared buffers
On Wed, Dec 4, 2013 at 11:49 PM, Metin Doslu wrote: > Here is some extra information: > > - When we increased NUM_BUFFER_PARTITIONS to 1024, this problem > disappeared for 8 core machines and came back with 16 core machines on > Amazon EC2. Could it be related to the PostgreSQL locking mechanism? I think there is a good chance of improvement here with the patch suggested by Andres in this thread, but I still think it might not completely resolve the current problem, as there will be the overhead of associating data with shared buffers. Currently NUM_BUFFER_PARTITIONS is fixed, so maybe auto-tuning it based on some parameters can help in such situations. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] FDW: possible resjunk columns in AddForeignUpdateTargets
2013/12/5 Albe Laurenz : > Ian Lawrence Barwick wrote: >> 2013/11/8 Tom Lane : >>> [ thinks for awhile... ] Hm. In principle you can put any expression >>> you want into the tlist during AddForeignUpdateTargets. However, if it's >>> not a Var then the planner won't understand that it's something that needs >>> to be supplied by the table scan, so things won't work right in any but >>> the most trivial cases (maybe not even then :-(). >>> >>> What I'd try is creating a Var that has the attno of ctid >>> (ie, SelfItemPointerAttributeNumber) but the datatype you want, ie bytea. >>> This won't match what the catalogs say your table's ctid is, but I think >>> that nothing will care much about that. >> >> Apologies for reinvigorating this thread, but I'm running into a similar wall >> myself and would like to clarify if this approach will work at all. >> >> My foreign data source is returning a fixed-length string as a unique row >> identifier; in AddForeignUpdateTargets() I can create a Var like this: >> >> var = makeVar(parsetree->resultRelation, >>SelfItemPointerAttributeNumber, >>BPCHAROID, >>32, >>InvalidOid, >>0); >> >> but is it possible to store something other than a TIDOID here, and if so >> how? > > Subsequent analysis showed that this won't work as you have > no way to populate such a resjunk column. > resjunk columns seem to get filled with the values from the > column of the same name, so currently there is no way to invent > your own column, fill it and pass it on. > > See thread 8b848b463a71b7a905bc5ef18b95528e.squir...@sq.gransy.com > > What I ended up doing is introduce a column option that identifies > a primary key column. I add a resjunk entry for each of those and > use them to identify the correct row during an UPDATE or DELETE. > > That only works for foreign data sources that have a concept of > a primary key, but maybe you can do something similar. Thanks for confirming that, I suspected that might be the case. 
I'll have to go for Plan B (or C or D). Regards Ian Barwick
Re: [HACKERS] Parallel Select query performance and shared buffers
On Wed, Dec 4, 2013 at 10:40 AM, Claudio Freire wrote: > On Wed, Dec 4, 2013 at 12:57 AM, Amit Kapila wrote: >>> As a quick side, we also repeated the same experiment on an EC2 instance >>> with 16 CPU cores, and found that the scale out behavior became worse there. >>> (We also tried increasing the shared_buffers to 30 GB. This change >>> completely solved the scaling out problem on this instance type, but hurt >>> our performance on the hi1.4xlarge instances.) >> >> Instead of 30GB, you can try with a lesser value, but it should be close >> to your data size. > > The OS cache should have provided a similar function. The performance cannot be the same as when those pages are in shared buffers, because: a. the OS can flush those pages; b. loading them into shared buffers again will have some overhead. > In fact, larger shared buffers shouldn't have made a difference if the > main I/O pattern are sequential scans, because they use a ring buffer. Yeah, this is right, but then why was he able to see scaling when he increased shared_buffers to a larger value? With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
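The ring buffer point is worth spelling out, since it explains why shared_buffers size should be largely irrelevant to a plain sequential scan. A toy Python model (hypothetical; the real behavior lives in PostgreSQL's buffer access strategy code, where a large sequential scan cycles through a small fixed ring of buffers instead of flooding the whole cache):

```python
class RingBuffer:
    """Toy model of a seq-scan buffer access strategy: the scan reuses
    a small, fixed set of buffer slots in round-robin order."""
    def __init__(self, nslots):
        self.slots = [None] * nslots
        self.clock = 0

    def read_page(self, page):
        slot = self.clock % len(self.slots)
        self.slots[slot] = page      # evict whatever was there before
        self.clock += 1
        return slot

ring = RingBuffer(4)
# Scanning 1000 pages touches only the 4 slots in the ring, no matter
# how large the rest of the buffer pool is.
used = {ring.read_page(p) for p in range(1000)}
```

This is why a scan-heavy workload sees little benefit from a bigger cache: the ring deliberately limits its footprint, so cache-size effects must be coming from somewhere else.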
Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol
On Thu, Dec 5, 2013 at 7:25 AM, Peter Eisentraut wrote: > On Thu, 2013-11-14 at 12:11 +0530, Amit Kapila wrote: >>If an application wants to allow these connection parameters to be >> used, it would need to do PQenableStartServer() first. If it doesn't, >> those connection parameters will be rejected. > > Stupid idea: Would it work that we require an environment variable to be > set before we allow the standalone_backend connection parameter? That's > easy to do, easy to audit, and doesn't require any extra code in the > individual clients. This is certainly not a stupid idea; rather, something on similar lines has been discussed previously in this thread. Tom has suggested something similar, but I am not sure if there was a conclusion on that point. Please see the relevant discussion at the link below: http://www.postgresql.org/message-id/17384.1346645...@sss.pgh.pa.us I think the basic question at that time was why we should consider an environment variable more safe. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Extension Templates S03E11
* Peter Eisentraut (pete...@gmx.net) wrote: > On 12/3/13, 9:20 AM, Stephen Frost wrote: > > Another option, which I generally like better, is to have a new package > > format for PGXN that contains the results of "make install", > > more-or-less, synonymous to Debian source vs. .deb packages. > > > > Perhaps we could even have psql understand that format and be able to > > install the extension via a backslash command instead of having an > > external tool, but I think an external tool for dependency tracking and > > downloading of necessary dependencies ala Debian would be better than > > teaching psql to do that. > > How would that handle varying file system layouts on the backend? This discussion is all about catalog-only extensions and therefore we don't really care about anything filesystem related... Thanks, Stephen
Re: [HACKERS] Performance optimization of btree binary search
On Wed, Dec 4, 2013 at 5:28 PM, Peter Geoghegan wrote: > I'm also curious about the impact on insertion into primary key > indexes. Presently, we hold an exclusive buffer lock for the duration > of a couple of operations when checkUnique != UNIQUE_CHECK_NO. > _bt_binsrch() is one such operation. The other one there, > _bt_check_unique(), is likely to be a lot cheaper than _bt_binsrch() > on average, I think, so I'm cautiously optimistic that it'll be > noticeable. I better go and check it out. Depending on how well this goes, I might also teach _bt_doinsert() to hint to _bt_binsrch() (or as I'm calling it, _bt_page_search()) that it should look to the end of the page when searching, using a similar mechanism to the mechanism for hinting that the main Datum-compare optimization is applicable (this strategy would be abandoned if it didn't work immediately - as soon as the last item on the page turned out to be greater than or equal to the scankey value). This is something that I think would help with SERIAL columns, where it's possible in principle to pass that kind of insight around -- if you can live with making SERIAL more than mere syntactic sugar. -- Peter Geoghegan
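The hint described here has a simple shape: before doing a full binary search of the page, peek at the last item, and if the new key sorts after it, you are done in one comparison. A rough Python model (hypothetical; the real code would compare index tuples under a buffer lock inside _bt_binsrch, and the page is a sorted array of keys here):

```python
import bisect

def search_with_rightmost_hint(page, key, hint_rightmost=False):
    """Return the insertion position of key on a sorted 'page'.

    With the hint set, first compare against the last item; for
    ever-increasing keys (e.g. a SERIAL column) this answers in a
    single comparison. The hint is abandoned as soon as the last
    item turns out to be greater than or equal to the key."""
    if hint_rightmost and page and page[-1] < key:
        return len(page)                  # fast path: append position
    return bisect.bisect_left(page, key)  # ordinary binary search

page = [10, 20, 30, 40]
pos_fast = search_with_rightmost_hint(page, 50, hint_rightmost=True)  # one compare
pos_slow = search_with_rightmost_hint(page, 25, hint_rightmost=True)  # falls back
```

The trade-off is that the hint only pays off when the caller has reason to believe insertions are monotonically increasing, which is exactly the insight _bt_doinsert() would need to pass down.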
Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol
On Thu, 2013-11-14 at 12:11 +0530, Amit Kapila wrote: >If an application wants to allow these connection parameters to be > used, it would need to do PQenableStartServer() first. If it doesn't, > those connection parameters will be rejected. Stupid idea: Would it work that we require an environment variable to be set before we allow the standalone_backend connection parameter? That's easy to do, easy to audit, and doesn't require any extra code in the individual clients.
Re: [HACKERS] Re: [BUGS] BUG #7873: pg_restore --clean tries to drop tables that don't exist
Can someone in this thread clarify the commit fest situation? I see two entries that appear to be the same: https://commitfest.postgresql.org/action/patch_view?id=1174 https://commitfest.postgresql.org/action/patch_view?id=1175 I think the first one is a duplicate or obsolete.
Re: [HACKERS] Proposal: variant of regclass
> Tatsuo Ishii writes: >> I would like to add a variant of regclass, which is exactly same as >> current regclass except it does not raise an error when the target >> table is not found. Instead it returns InvalidOid (0). > > I've sometimes thought we should just make all the reg* input converters > act that way. It's not terribly consistent that they'll happily take > numeric inputs that don't correspond to any existing OID. And more > often than not, I've found the throw-an-error behavior to be annoying > not helpful. > > In any case, -1 for dealing with this only for regclass and not the > other ones. I'm happy with changing reg* at the same time. Will come up with the modified proposal. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
Re: [HACKERS] Performance optimization of btree binary search
On Wed, Dec 4, 2013 at 12:58 PM, Peter Geoghegan wrote: > I'm kind of > curious as to what this benchmark would look like on a server with > many more cores. I'm also curious about the impact on insertion into primary key indexes. Presently, we hold an exclusive buffer lock for the duration of a couple of operations when checkUnique != UNIQUE_CHECK_NO. _bt_binsrch() is one such operation. The other one there, _bt_check_unique(), is likely to be a lot cheaper than _bt_binsrch() on average, I think, so I'm cautiously optimistic that it'll be noticeable. I better go and check it out. -- Peter Geoghegan
Re: [HACKERS] Performance optimization of btree binary search
Peter Eisentraut writes: > On Wed, 2013-12-04 at 19:45 -0500, Robert Haas wrote: >> On Wed, Dec 4, 2013 at 6:56 PM, Tom Lane wrote: >>> Yeah, that's another thing we could simplify if we fixed this problem >>> at the source. I think these decisions date from a time when we still >>> cared about the speed of fmgr_oldstyle. >> Sure, let's whack that thing with a crowbar. > Or just remove it. Who still needs it? Lazy people? I'm not in a hurry to drop it; it's not costing us much to just sit there, other than in this connection which we see how to fix. regards, tom lane
Re: [HACKERS] Proposal: variant of regclass
Tatsuo Ishii writes: > I would like to add a variant of regclass, which is exactly same as > current regclass except it does not raise an error when the target > table is not found. Instead it returns InvalidOid (0). I've sometimes thought we should just make all the reg* input converters act that way. It's not terribly consistent that they'll happily take numeric inputs that don't correspond to any existing OID. And more often than not, I've found the throw-an-error behavior to be annoying not helpful. In any case, -1 for dealing with this only for regclass and not the other ones. regards, tom lane
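The proposed behavior - soft failure instead of an error - is easy to state precisely. A hypothetical Python sketch of the contract (names and the toy catalog are invented for illustration; the real converters are C input functions such as regclassin):

```python
INVALID_OID = 0  # stand-in for PostgreSQL's InvalidOid

def regclass_lookup(catalog, relname):
    """Return the OID for relname, or InvalidOid (0) -- rather than
    raising an error -- when no such relation exists."""
    return catalog.get(relname, INVALID_OID)

# toy "catalog" mapping relation names to OIDs
catalog = {"pg_class": 1259, "pg_attribute": 1249}
```

The point of the thread is that this contract should hold uniformly for all the reg* converters, not just regclass.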
Re: [HACKERS] RFC: programmable file format for postgresql.conf
On 12/4/13, 2:02 PM, Álvaro Hernández Tortosa wrote: > So optional fields are either purely optional (i.e., only for tools > that want to use them; everyone else may ignore, but preserve, them) and > some others are just NULLABLE, depending on the parameter. But my point stands: if it's optional, you can't rely on it, and if it's required, people will object because they don't want more junk in their config file. But I think this is solving the wrong problem. The metadata is already available via postgres --describe-config.
Re: [HACKERS] Performance optimization of btree binary search
On Wed, 2013-12-04 at 19:45 -0500, Robert Haas wrote: > On Wed, Dec 4, 2013 at 6:56 PM, Tom Lane wrote: > > Yeah, that's another thing we could simplify if we fixed this problem > > at the source. I think these decisions date from a time when we still > > cared about the speed of fmgr_oldstyle. > > Sure, let's whack that thing with a crowbar. Or just remove it. Who still needs it?
Re: [HACKERS] Changes in Trigger Firing
On Wed, Dec 4, 2013 at 11:16 PM, Alvaro Herrera wrote: > Sameer Kumar wrote: > I think you need better tools to guide you in exploring the source code. > For example, you can use cscope to tell you where CreateTrigStmt is > used, and you would find gram.y; and use it to tell you where > CreateTrigger is used, and you would find utility.c. > > Any half-decent code editor should be able to generate a "database" of > symbols and let you frolic around the various files quickly. Without > that, anyone would be completely lost in developing new features of even > the lowest complexity. Not always; other people also find git grep quite useful when working in a complex tree. I read somewhere that "the best editor is the one you master" (1) :) 1: http://www.postgresql.org/message-id/m2wrs6giyp@hi-media.com -- Michael
Re: [HACKERS] Status of FDW pushdowns
David Fetter writes: > On Wed, Dec 04, 2013 at 03:04:31PM -0500, Tom Lane wrote: >> If that's the argument, why not just use dblink or dbilink, and be >> happy? This discussion sounds a whole lot like it's trending to a >> conclusion of wanting one of those in core, which is not where I'd >> like to end up. > Telling people who've already installed and configured an FDW that for > perfectly ordinary expected functionality they'll need to install yet > another piece of software, configure it, keep its configuration in > sync with the FDW configuration, etc., is just a ridiculous. Perfectly ordinary expected functionality according to who? Not the SQL standard, for sure. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why we are going to have to go DirectIO
Hi, On 2013-12-03 10:44:15 -0800, Josh Berkus wrote: > I don't know where we'll get the resources to implement our own storage, > but it's looking like we don't have a choice. As long as our storage layer is as suboptimal as it is today, I think it's purely a distraction to primarily blame the kernel. We:
* cannot deal with large shared_buffers; the dirty-buffer scanning is far too expensive. The amount of memory required for locks is pretty big, and every backend carries around a pretty huge private array for the buffer pins.
* do not have scalability in pretty damn central datastructures like the buffer mapping.
* Our background eviction mechanism doesn't do anything in lots of workloads but increase contention on important data structures.
* Due to the missing efficient eviction, we synchronously write out data when acquiring a victim buffer most of the time. That's already bad if you have a kernel buffering your writes, but if you don't...
* Due to the frequency of buffer pins in almost all workloads, our tracking of the importance of individual buffers is far, far too volatile.
Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Performance optimization of btree binary search
On Wed, Dec 4, 2013 at 6:56 PM, Tom Lane wrote: > Robert Haas writes: >> Hmm. And yet, there's this: > >> * When a type narrower than Datum is stored in a Datum, we place it in the >> * low-order bits and are careful that the DatumGetXXX macro for it discards >> * the unused high-order bits (as opposed to, say, assuming they are zero). >> * This is needed to support old-style user-defined functions, since >> depending >> * on architecture and compiler, the return value of a function returning >> char >> * or short may contain garbage when called as if it returned Datum. > >> And record_image_eq does a rather elaborate dance around here, calling >> the appropriate GET_x_BYTES macro depending on the type-width. If we >> can really count on the high-order bits to be zero, that's all >> completely unnecessary tomfoolery. > > Yeah, that's another thing we could simplify if we fixed this problem > at the source. I think these decisions date from a time when we still > cared about the speed of fmgr_oldstyle. Sure, let's whack that thing with a crowbar. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Time-Delayed Standbys
On 3 December 2013 18:46, Robert Haas wrote: > On Tue, Dec 3, 2013 at 12:36 PM, Fabrízio de Royes Mello > wrote: >> On Tue, Dec 3, 2013 at 2:33 PM, Christian Kruse >> wrote: >>> >>> Hi Fabrizio, >>> >>> looks good to me. I did some testing on 9.2.4, 9.2.5 and HEAD. It >>> applies and compiles w/o errors or warnings. I set up a master and two >>> hot standbys replicating from the master, one with 5 minutes delay and >>> one without delay. After that I created a new database and generated >>> some test data: >>> >>> CREATE TABLE test (val INTEGER); >>> INSERT INTO test (val) (SELECT * FROM generate_series(0, 100)); >>> >>> The non-delayed standby nearly instantly had the data replicated, the >>> delayed standby was replicated after exactly 5 minutes. I did not >>> notice any problems, errors or warnings. >>> >> >> Thanks for your review Christian... > > So, I proposed this patch previously and I still think it's a good > idea, but it got voted down on the grounds that it didn't deal with > clock drift. I view that as insufficient reason to reject the > feature, but others disagreed. Unless some of those people have > changed their minds, I don't think this patch has much future here. I had that objection and others. Since then many people have requested this feature and have persuaded me that this is worth having and that my objections are minor points. I now agree with the need for the feature, almost as written. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Proposal: variant of regclass
I would like to propose adding a variant of regclass. Background: Pgpool-II (http://www.pgpool.net) needs to get information about tables by querying PostgreSQL's system catalogs. For efficiency and correctness of the info (search path considerations), pgpool-II issues such queries piggybacking on the user's connection to PostgreSQL, and regclass is frequently used in those queries. One problem with stock regclass is that it raises an exception when a target table is not found, which breaks the user's transaction currently running on the session. As a workaround, pgpool-II ships a non-error-raising version of regclass, called pgpool_regclass. However, it's not a perfect solution because people (or even distributions/packagers) forget to install it [1]. Another problem is that pgpool_regclass heavily depends on the internals of PostgreSQL, which change from version to version, and pgpool developers need to spend some effort to adapt to the changes. Someone suggested before that pgpool_regclass could be implemented as a pl function, but I think that is unacceptable because 1) the function is heavily used and using pl would cause performance problems, and 2) it does not solve the problem I described in [1]. Proposal: I would like to add a variant of regclass, which is exactly the same as the current regclass except that it does not raise an error when the target table is not found. Instead it returns InvalidOid (0). Pgpool-II is shipped with various distributions and used by many companies including EnterpriseDB, VMWare, SRA OSS and so on. IMO this small enhancement will benefit many PostgreSQL users through small changes to PostgreSQL. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
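A schematic sketch of the proposed semantics (plain standalone C, not actual backend code; the toy catalog and the function name `lenient_regclass` are invented here for illustration): the lookup returns the table's OID when it exists and InvalidOid (0), rather than raising an error, when it does not.

```c
#include <assert.h>
#include <string.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Toy stand-in for pg_class: relation name -> OID */
static struct { const char *relname; Oid oid; } catalog[] = {
    { "accounts", 16384 },
    { "branches", 16390 },
};

/* Proposed behavior: same as regclass on a hit, but on a miss
 * return InvalidOid instead of aborting the user's transaction. */
Oid lenient_regclass(const char *relname)
{
    for (size_t i = 0; i < sizeof(catalog) / sizeof(catalog[0]); i++)
        if (strcmp(catalog[i].relname, relname) == 0)
            return catalog[i].oid;
    return InvalidOid;   /* stock regclass would raise an error here */
}
```

With these semantics, pgpool-II's piggybacked catalog queries could probe for a table without risking the session's open transaction.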
Re: [HACKERS] Status of FDW pushdowns
On Wed, Dec 04, 2013 at 03:04:31PM -0500, Tom Lane wrote: > David Fetter writes: > > The idea here is that such a happy situation will not obtain until > > much later, if ever, and meanwhile, we need a way to get things > > accomplished even if it's inelegant, inefficient, etc. The > > alternative is that those things simply will not get accomplished > > at all. > > If that's the argument, why not just use dblink or dbilink, and be > happy? This discussion sounds a whole lot like it's trending to a > conclusion of wanting one of those in core, which is not where I'd > like to end up. Telling people who've already installed and configured an FDW that for perfectly ordinary expected functionality they'll need to install yet another piece of software, configure it, keep its configuration in sync with the FDW configuration, etc., is just ridiculous. So yes, we do need this functionality and it does need to be part of our FDW implementation. Just exactly where we draw the line between built-ins and APIs is the conversation I thought we were having. The minimal thing would be providing database handles per SQL/MED and a few tools to manipulate same. Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Changeset Extraction Interfaces
Hi, Short recap: From the perspective of the user interface, the changeset extraction feature consists of two abstract interfaces that the "user" interacts with: 1) The "slot" or "changestream" management interface, which manages individual streams of changes. The user can create and destroy a changestream and, most importantly, stream the changes. Simplified, a "logical replication slot" is a position in the WAL and a bunch of state associated with it. As long as a slot exists, the user can ask for all changes that happened since the last time he asked to be streamed out. It is abstract because different usecases require the changes to be streamed out via different methods. The series contains two implementations of that interface: I) One integrated into walsender that allows for efficient streaming, including support for synchronous replication. II) Another that is accessible via SQL functions, very useful for writing pg_regress/isolationtester tests. It is, with a relatively low amount of code, possible to add other such interfaces without touching core code. One example, which has been asked for by a number of people, is consuming the changestream in a background worker without involving SQL or connecting to a walsender. There are basically three major 'verbs' that can be performed on a stream, currently named (walsender names):
* INIT_LOGICAL_REPLICATION "name" "output_plugin"
* START_LOGICAL_REPLICATION "name" last_received ("option_name" value,...)
* FREE_LOGICAL_REPLICATION "name"
The SQL variant currently has:
* init_logical_replication(name, plugin)
* start_logical_replication(name, stream_upto, options[])
* stop_logical_replication(name)
You might have noticed the slight inconsistency... 2) The "output plugin" interface, which transforms a changestream (begin, change, commit) into the desired target format. 
There are 5 callbacks, 3 of them obligatory:
* pg_decode_init(context, is_initial) [optional]
* pg_decode_begin(context, txn)
* pg_decode_change(context, txn, relation, change)
* pg_decode_commit(context, txn)
* pg_decode_cleanup(context) [optional]
Every output plugin can be used from every slot management interface. The current pain points that I'd like to discuss are: a) Better naming for the slot management between walsender, SQL and possible future interfaces. b) Deciding which of the SQL functions should be in a contrib module, and which in core. Currently init_logical_replication() and stop_logical_replication() are in core, whereas start_logical_replication() is in the 'test_logical_decoding' extension. The reasoning behind that is that the init/stop ones are important to the DBA, and the start_logical_replication() SRF isn't all that useful in the real world because our SRFs don't support streaming changes out. c) Which data types does start_logical_replication() return? Currently it's OUT location text, OUT xid bigint, OUT data text. Making the 'data' column text has some obvious disadvantages though - there are obvious usecases for output plugins that return binary data. But making it bytea sucks, because the output is harder to read by default... d) How does a slot acquire the callbacks of an output plugin? For a), my current feeling is to name them:
* LOGICAL_DECODING_SLOT_CREATE/pg_logical_decoding_slot_create()
* LOGICAL_DECODING_SLOT_STREAM/pg_logical_decoding_slot_extract()
* LOGICAL_DECODING_SLOT_DESTROY/pg_logical_decoding_slot_destroy()
with an intentional discrepancy between stream and extract, to make the difference obvious. One day we might have the facility - which would be rather cool - to do the streaming from SQL as well. Better ideas? Leave out the "logical"? For b), I am happy with that split; I would just like others to comment. For c), I have no better idea than two functions. d) is my main question, and Robert, Peter G. 
and I previously argued about it a fair bit. I know of the following alternatives: I) The output plugin that's specified in INIT_LOGICAL_REPLICATION is actually a library name, and we simply look up the fixed symbol names in it. That's what's currently implemented. The advantage is that it's pretty easy to implement, works on a HS standby without involving the primary, and doesn't have a problem if the library is used in shared_preload_libraries. The disadvantages are: all output plugins need to be shared libraries, and there can only be one output plugin per shared library (although you could route differently, via options, but ugh). II) Keep the output plugin a library, but only look up a _PG_init_output_plugin() which registers/returns the callbacks. Pretty much the same tradeoffs as I). III) Keep the output plugin a library, but simply rely on _PG_init() calling a function to register all callbacks. Imo it's worse than I) and II) because it basically prohibits using the library in shared_preload_libraries as well, because then its _PG_init() doesn't get called when starting to stream, and another libra
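As a rough illustration of alternative II, here is a minimal standalone sketch (plain C, not actual PostgreSQL code; the struct layout and the callback signatures are invented for illustration) of a plugin library handing its callback set back through a single looked-up symbol:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback bundle, mirroring the five callbacks listed
 * earlier (init and cleanup being optional). */
typedef struct OutputPluginCallbacks {
    void (*init_cb)(void *ctx, int is_initial);   /* optional */
    void (*begin_cb)(void *ctx, void *txn);
    void (*change_cb)(void *ctx, void *txn, void *rel, void *change);
    void (*commit_cb)(void *ctx, void *txn);
    void (*cleanup_cb)(void *ctx);                /* optional */
} OutputPluginCallbacks;

static void my_begin(void *ctx, void *txn) { (void) ctx; (void) txn; }
static void my_change(void *ctx, void *txn, void *rel, void *change)
{ (void) ctx; (void) txn; (void) rel; (void) change; }
static void my_commit(void *ctx, void *txn) { (void) ctx; (void) txn; }

/* Alternative II: core looks up only this one symbol in the plugin
 * library, and the plugin fills in its callbacks. */
void _PG_init_output_plugin(OutputPluginCallbacks *cb)
{
    cb->init_cb = NULL;        /* optional callbacks may stay NULL */
    cb->begin_cb = my_begin;
    cb->change_cb = my_change;
    cb->commit_cb = my_commit;
    cb->cleanup_cb = NULL;
}

/* Small check helper: did registration fill the obligatory slots? */
int plugin_registered(void)
{
    OutputPluginCallbacks cb;
    _PG_init_output_plugin(&cb);
    return cb.begin_cb != NULL && cb.change_cb != NULL
        && cb.commit_cb != NULL && cb.init_cb == NULL;
}
```

The appeal of this shape is that one shared library could in principle expose several differently named registration symbols, sidestepping the one-plugin-per-library limitation of alternative I.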
Re: [HACKERS] Performance optimization of btree binary search
I wrote: > Yeah, that's another thing we could simplify if we fixed this problem > at the source. I think these decisions date from a time when we still > cared about the speed of fmgr_oldstyle. BTW, the text you're quoting is from 2007, but it's just documenting behavior that's mostly a lot older. It's worth reading commit 23a41573 in toto in this connection. I'm not sure if we'd want to revert that DatumGetBool change or not, if we were to clean up fmgr_oldstyle. We'd be able to do whatever was cheapest, but I'm not sure what that is. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Performance optimization of btree binary search
Robert Haas writes: > Hmm. And yet, there's this: > * When a type narrower than Datum is stored in a Datum, we place it in the > * low-order bits and are careful that the DatumGetXXX macro for it discards > * the unused high-order bits (as opposed to, say, assuming they are zero). > * This is needed to support old-style user-defined functions, since depending > * on architecture and compiler, the return value of a function returning char > * or short may contain garbage when called as if it returned Datum. > And record_image_eq does a rather elaborate dance around here, calling > the appropriate GET_x_BYTES macro depending on the type-width. If we > can really count on the high-order bits to be zero, that's all > completely unnecessary tomfoolery. Yeah, that's another thing we could simplify if we fixed this problem at the source. I think these decisions date from a time when we still cared about the speed of fmgr_oldstyle. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RFC: programmable file format for postgresql.conf
On 04/12/13 20:44, Peter Eisentraut wrote: On 12/4/13, 2:02 PM, Álvaro Hernández Tortosa wrote: So optional fields are either purely optional (i.e., only for tools that want to use them; everyone else may ignore, but preserve, them) and some other are just NULLABLEs, depending on the parameter). But my point stands: If it's optional, you can't rely on it; if it's required, people will object because they don't want more junk in their config file. OK, I get what you say. My bad: I called "optional" what is either "optional" (reserved for extension fields) or NULLABLE (fields that may be absent, meaning that they are NULL). But what matters are the required fields. You say they add "junk" to the config file. I understand what you say, but is it really junk? Is it that bad? In return for this extra information, we:
- Provide users with more help (information) to help them configure postgres (which is no easy task, especially for newcomers).
- Help and encourage app developers to create both GUI tools for easier postgresql configuration and automatic or semi-automatic configuration tools.
- Make it way easier to change postgresql parameters persistently from a SQL connection.
The tradeoff seems quite positive to me. I see no strong reason not to do it... am I missing something? But I think this is solving the wrong problem. The metadata is already available via postgres --describe-config. I think that doesn't provide any of the above benefits we would get from a programmable postgresql format such as the one I have described. Best, aht -- Álvaro Hernández Tortosa --- NOSYS Networked Open SYStems -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Performance optimization of btree binary search
On Wed, Dec 4, 2013 at 6:33 PM, Peter Geoghegan wrote: > On Wed, Dec 4, 2013 at 1:59 PM, Robert Haas wrote: >> Yeah, I think if we can make something like this work, it would be >> neat-o. Getting this working for int4 would be a good win, as Peter >> says, but getting it working for both int4 and int8 with the same code >> would be a significantly better one. > > No arguments here. I think I didn't initially suggest it myself out of > passing concern about the guarantees around how unused Datum bits are > initialized in all relevant contexts, but having looked at it for a > second I see that we are of course disciplined there. Hmm. And yet, there's this: * When a type narrower than Datum is stored in a Datum, we place it in the * low-order bits and are careful that the DatumGetXXX macro for it discards * the unused high-order bits (as opposed to, say, assuming they are zero). * This is needed to support old-style user-defined functions, since depending * on architecture and compiler, the return value of a function returning char * or short may contain garbage when called as if it returned Datum. And record_image_eq does a rather elaborate dance around here, calling the appropriate GET_x_BYTES macro depending on the type-width. If we can really count on the high-order bits to be zero, that's all completely unnecessary tomfoolery. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Performance optimization of btree binary search
Peter Geoghegan writes: > On Wed, Dec 4, 2013 at 1:59 PM, Robert Haas wrote: >> Yeah, I think if we can make something like this work, it would be >> neat-o. Getting this working for int4 would be a good win, as Peter >> says, but getting it working for both int4 and int8 with the same code >> would be a significantly better one. > No arguments here. I think I didn't initially suggest it myself out of > passing concern about the guarantees around how unused Datum bits are > initialized in all relevant contexts, but having looked at it for a > second I see that we are of course disciplined there. Hm ... actually, the comment at lines 335ff of postgres.h points out that a Datum returned from a version 0 user-defined function might contain garbage in the high order bits. We could fix that, probably, with some cleanup code added to fmgr_oldstyle. It'd waste a few cycles ... but if there's anybody out there who still cares about the performance of such functions, it's high time they fixed them to be v1, anyway. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
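To illustrate the sign-extension point being made above, here is a standalone sketch (plain C; the helpers are stand-ins with invented names, not the real postgres.h macros) showing why a careless narrow-to-Datum conversion leaves nonzero high-order bits for negative values, and why a DatumGetXXX that discards the high bits, as the quoted comment describes, still round-trips either encoding:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t Datum;   /* assuming a 64-bit pass-by-value Datum */

/* Careless conversion: the int16 is sign-extended, so a negative
 * value sets all 48 high-order bits of the Datum. */
Datum careless_int16_get_datum(int16_t x) { return (Datum) x; }

/* Zero-extending conversion: the value lives in the low-order 16
 * bits and the high-order bits are guaranteed zero. */
Datum clean_int16_get_datum(int16_t x) { return (Datum) (uint16_t) x; }

/* Retrieval that discards the unused high-order bits (the role the
 * GET_2_BYTES-style dance plays), so both encodings decode alike. */
int16_t datum_get_int16(Datum d) { return (int16_t) (d & 0xFFFF); }
```

This is why raw memcmp-style comparison of two Datums holding the "same" narrow value is unsafe unless the unused bits are normalized: the two constructors above produce bitwise-different Datums for -1, yet both decode back to -1.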
Re: [HACKERS] Performance optimization of btree binary search
On Wed, Dec 4, 2013 at 1:59 PM, Robert Haas wrote: > Yeah, I think if we can make something like this work, it would be > neat-o. Getting this working for int4 would be a good win, as Peter > says, but getting it working for both int4 and int8 with the same code > would be a significantly better one. No arguments here. I think I didn't initially suggest it myself out of passing concern about the guarantees around how unused Datum bits are initialized in all relevant contexts, but having looked at it for a second I see that we are of course disciplined there. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposed feature: Selective Foreign Keys
On 5 Dec 2013, at 03:48, Andrew Dunstan wrote: >>> Well I guess we could say something like: >>> >>>FOREIGN KEY (a-col) WHERE (a-condition) REFERENCES b(b-col) WHERE >>>(b-condition) >>> > > OK, those make sense. I wonder whether this should be done via a USING clause > on the constraint that pointed to the partial unique index. Or would that be > too obscure? Well you could put a USING clause on the end and it would read pretty unambiguously. Requiring that the user specify it rather than trying to guess which index to use would also probably be an easier path to getting that feature in, at least for a first cut. I won’t be doing work towards putting a where clause on the referenced side just yet, though. One thing at a time. Cheers Tom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposed feature: Selective Foreign Keys
On 5 Dec 2013, at 06:10, Tom Lane wrote: > Andrew Dunstan writes: Well I guess we could say something like: FOREIGN KEY (a-col) WHERE (a-condition) REFERENCES b(b-col) WHERE (b-condition) > > I like what you have above. Yeah. Given both the apparent ambiguity of the current placement, and the fact that the current placement would be right where you’d put a where clause on the referenced table, that’s the only sane way to do it. And it’s not so bad. Cheers Tom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extension Templates S03E11
On 12/3/13, 9:20 AM, Stephen Frost wrote: > Another option, which I generally like better, is to have a new package > format for PGXN that contains the results of "make install", > more-or-less, synonymous to Debian source vs. .deb packages. > > Perhaps we could even have psql understand that format and be able to > install the extension via a backslash command instead of having an > external tool, but I think an external tool for dependency tracking and > downloading of necessary dependencies ala Debian would be better than > teaching psql to do that. How would that handle varying file system layouts on the backend? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extension Templates S03E11
On 12/2/13, 2:33 PM, Greg Stark wrote: > Just tossing an idea out there. What if you could install an extension > by specifying not a local file name but a URL. Obviously there's a > security issue but for example we could allow only https URLs with > verified domain names that are in a list of approved domain names > specified by a GUC. This is similar to what autopex does (https://github.com/petere/autopex). -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extension Templates S03E11
On 12/2/13, 9:14 AM, Dimitri Fontaine wrote: > What I want to build is an “extension distribution” software that knows > how to prepare anything from PGXN (and other places) so that it's fully > ready for being used in the database. Then the main client would run as > a CREATE EXTENSION "ddl_command_start" Event Trigger and would fetch the > prepared extension for you and make it available, then leaving the main > command operate as intended. > > Which is what I think the pex extension is doing, and that's not > coincidental, but it runs the build step on the PostgreSQL server itself > and needs to have a non-trivial set of file-system privileges to be > doing so, and even needs to get root privileges with sudo for some of > its operations. You're thinking of autopex, and while that works, and can be made to work better with certain small changes, I don't think it can ever be the only solution. Many interesting extensions will have external package build- and run-time dependencies, and you need file-system level access to manage that. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WITHIN GROUP patch
Andrew Gierth writes: > "Tom" == Tom Lane writes: > Tom> But anyway, what I'm thinking right now is that these questions > Tom> would all go away if the aggregate transfunction were receiving > Tom> the rows and sticking them into the tuplestore. It could add > Tom> whatever columns it felt like. > True, but this ends up duplicating the sorting functionality of > nodeAgg that we are leveraging off in the first place. I think this > will be somewhat more intrusive and likely slower. Hm, it's just a refactoring of the same code we'd have to have anyway, so I'm not seeing a reason to assume it'd be slower. If anything, this approach would open more opportunities for function-specific optimizations, which in the long run could be faster. (I'm not claiming that any such optimizations would be in the first version.) In hindsight I wonder if it wasn't a mistake to embed ordered-aggregate support in nodeAgg.c the way we did. We could have dumped that responsibility into some sort of wrapper around specific aggregates, with an option for some aggregates to skip the wrapper and handle it themselves. A trivial, and perhaps not very useful, example is that non-order-sensitive aggregates like MIN/MAX/COUNT could have been coded to simply ignore any ordering request. I can't immediately think of any examples that are compelling enough to justify such a refactoring now --- unless it turns out to make WITHIN GROUP easier. Anyway, I'm going to go off and look at the WITHIN GROUP patch with these ideas in mind. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Performance optimization of btree binary search
On Wed, Dec 4, 2013 at 4:28 PM, Tom Lane wrote: > Peter Geoghegan writes: >> I guess I could write a proper patch to have code setting up a scankey >> also set a flag that indicated that it was acceptable to assume that >> the special built-in comparator would do fine. ... >> I'd be happy with a scheme with only one built-in comparator, and >> allowed a few types to be cataloged such that it was indicated that >> just using the "built-in" comparator was acceptable, knowledge that >> could be passed to _bt_compare via the scankey. I'm thinking of just >> int4, and maybe date and a few other such int4 "covariant" types. > > If what you're proposing is that we have a fast path that compares Datums > as Datums, I should think that that would work fine for int2 as well, > *and* for int8 on machines where int8 is pass-by-value. (Does anyone > still care much about PG's performance on 32-bit hardware?) We might > have to fool a bit with the fooGetDatum macros in some cases, eg > I think Int16GetDatum isn't careful about sign extension. Admittedly, > that might introduce an offsetting cost on some hardware, but I think > on most machines sign-extension isn't noticeably more expensive than > zero-extension. Yeah, I think if we can make something like this work, it would be neat-o. Getting this working for int4 would be a good win, as Peter says, but getting it working for both int4 and int8 with the same code would be a significantly better one. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Performance optimization of btree binary search
Peter Geoghegan writes: > I guess I could write a proper patch to have code setting up a scankey > also set a flag that indicated that it was acceptable to assume that > the special built-in comparator would do fine. ... > I'd be happy with a scheme with only one built-in comparator, and > allowed a few types to be cataloged such that it was indicated that > just using the "built-in" comparator was acceptable, knowledge that > could be passed to _bt_compare via the scankey. I'm thinking of just > int4, and maybe date and a few other such int4 "covariant" types. If what you're proposing is that we have a fast path that compares Datums as Datums, I should think that that would work fine for int2 as well, *and* for int8 on machines where int8 is pass-by-value. (Does anyone still care much about PG's performance on 32-bit hardware?) We might have to fool a bit with the fooGetDatum macros in some cases, eg I think Int16GetDatum isn't careful about sign extension. Admittedly, that might introduce an offsetting cost on some hardware, but I think on most machines sign-extension isn't noticeably more expensive than zero-extension. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
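A standalone sketch of the fast path under discussion (plain C; the names and the sign-extended representation are assumptions for illustration, not the actual patch): if the narrow integer types were stored sign-extended into the Datum, a single comparator could serve int2, int4, and pass-by-value int8 by comparing the Datums as signed words:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t Datum;

/* Sign-extending constructors: the assumed representation under
 * which one comparator works for every width. */
Datum int16_get_datum(int16_t x) { return (Datum) (int64_t) x; }
Datum int32_get_datum(int32_t x) { return (Datum) (int64_t) x; }
Datum int64_get_datum(int64_t x) { return (Datum) x; }

/* One built-in comparator for all of the above: reinterpret the
 * Datum as a signed 64-bit value and compare directly, with no
 * per-type function call through the fmgr. */
int fastpath_datum_cmp(Datum a, Datum b)
{
    int64_t ia = (int64_t) a;
    int64_t ib = (int64_t) b;
    return (ia < ib) ? -1 : (ia > ib) ? 1 : 0;
}
```

The scankey would carry a flag saying the indexed type is safe for this comparator (int4 and "covariant" types like date, per Peter's suggestion), and _bt_compare would skip the FunctionCall2Coll path entirely in that case.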
Re: [HACKERS] Why we are going to have to go DirectIO
On Wed, 04 Dec 2013 13:01:37 -0800 Josh Berkus wrote: > > Perhaps even better: the next filesystem, storage, and memory management > > summit is March 24-25. > > Link? I can't find anything Googling by that name. I'm pretty sure we > can get at least one person there. It looks like the page for the 2014 event isn't up yet. It will be attached (as usual) to the LF collaboration summit: http://events.linuxfoundation.org/events/collaboration-summit I'll make a personal note to send something here when the planning process begins and the CFP goes out. Napa Valley...one can do worse...:) jon -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why we are going to have to go DirectIO
On Wed, Dec 4, 2013 at 2:31 PM, Jonathan Corbet wrote: > For those interested in the details... (1) It's not quite 50/50, that's one > bound for how the balance is allowed to go. (2) Anybody trying to add > tunables to the kernel tends to run into resistance. Exposing thousands of > knobs tends to lead to a situation where you *have* to be an expert on all > those knobs to get decent behavior out of your system. So there is a big > emphasis on having the kernel tune itself whenever possible. Here is a > situation where that is not always happening, but a fix (which introduces > no knob) is in the works. I think there are interesting parallels here with the 'query plan hints' debate. In both cases I think the conservative voices are correct: better not to go crazy adding knobs. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why we are going to have to go DirectIO
Jonathan, > For those interested in the details... (1) It's not quite 50/50, that's one > bound for how the balance is allowed to go. (2) Anybody trying to add > tunables to the kernel tends to run into resistance. Exposing thousands of > knobs tends to lead to a situation where you *have* to be an expert on all > those knobs to get decent behavior out of your system. So there is a big > emphasis on having the kernel tune itself whenever possible. Here is a > situation where that is not always happening, but a fix (which introduces > no knob) is in the works. Yeah, we get into this argument all the time. The problem is when you run into situations where there is no optimal (or even acceptable) setting for all, or even most, users. And I'll say in advance that 2Q is one of those situations. > As an example, I've never done much with the PostgreSQL knobs on the LWN > server. I just don't have the time to mess with it, and things Work Well > Enough. Sure, and even when I teach fiddling with the knobs, there's only 12-20 knobs 95% of users need to have any interest in. But we have ~~220 settings for the other 5%, and those users would be screwed without them. > Bugs and regressions happen, and I won't say that we do a good enough job > in that regard. There has been some concern recently that we're accepting > too much marginal stuff. We have problems getting enough people to > adequately review code — I think I've heard of another project or two with > similar issues :). But nobody sees the kernel as experimental or feels > that the introduction of bugs is an acceptable thing. OK. The chain of events over the pdflush bug really felt like what I said earlier, especially since problems *were* reported shortly after kernel release and ignored. > I think you're talking to the wrong people. Quite possibly. > Perhaps even better: the next filesystem, storage, and memory management > summit is March 24-25. Link? I can't find anything Googling by that name. 
I'm pretty sure we can get at least one person there. > Gee, if only there were a web site where one could read about changes to > the Linux kernel :) Even you don't cover 100% of performance-changing commits. And I'll admit to missing issues of LWN when I'm travelling. > Seriously, though, one of the best things to do would be to make a point of > picking up a kernel around -rc3 (right around now, say, for 3.13) and > running a few benchmarks on it. If you report a performance regression at > that stage, it will get attention. Yeah, back to the "we need resources for good benchmarks" discussion fork ... -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WITHIN GROUP patch
> "Tom" == Tom Lane writes: Tom> Well, sure, but I was only suggesting adding it when the Tom> aggregate asks for it, probably via a new flag column in Tom> pg_aggregate. Sure, I was only pointing out the necessity. Tom> The question you're evading is what additional functionality Tom> could be had if the aggregate could demand a different datatype Tom> or constant value for the flag column. I don't really see a question there to answer - I simply chose to provide a general mechanism rather than make assumptions about what future users of the code would desire. I have no specific application in mind that would require some other type. >> Adding it only for hypothetical set functions is making a >> distinction in how functions are executed that I don't think is >> warranted - Tom> That seems like rather a curious argument from someone who's Tom> willing to give up the ability to specify a regular transition Tom> value concurrently with the flag column. In the current patch the idea of also specifying a regular transition value is meaningless since there is no transition function. Tom> But anyway, what I'm thinking right now is that these questions Tom> would all go away if the aggregate transfunction were receiving Tom> the rows and sticking them into the tuplestore. It could add Tom> whatever columns it felt like. True, but this ends up duplicating the sorting functionality of nodeAgg that we are leveraging off in the first place. I think this will be somewhat more intrusive and likely slower. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Performance optimization of btree binary search
Having nothing better to do over the holiday weekend, I decided to pursue a number of ideas for improving performance that I thought about a long time ago. These include: * Pre-fetching list node pointers. This looks to be moderately promising, but I'm certainly not going to be the one to land it, given present obligations. Stephen Frost may wish to pick it up, given his previous interest in the matter. This is slightly controversial, because it uses a GCC intrinsic (__builtin_prefetch), but also because the Linux kernel removed this optimization from their generic list data-structure [1]. However, that list was what we'd call an embedded list, so we probably shouldn't be totally deterred. The amount of effort that I put into this was, frankly, pretty low. A motivated person, willing to do the appropriate analysis could probably bring it further. For one thing, every single foreach() has a call to this intrinsic, even where the list doesn't store pointers (which is not undefined). At the very least that's going to bloat things up, frequently for no conceivable gain, and yet with the patch applied we're still able to see quite tangible benefits, even if it isn't exactly a stellar improvement. I have an idea that prefetching the last element at the start of the loop could be much better than what I did, because we know that those lists are mostly pretty small in practice, and that's likely to help pipelining - prefetching too late or even too early makes the optimization useless, because you may still get a cache miss. * Optimizing index scans - I noticed that binary searching accounted for many cache misses during a pgbench select.sql benchmark, instrumented with "perf record -e cache-misses". This warranted further investigation. I won't say anything further about the former optimization, except to note that it's included for comparative purposes in the set of benchmarks I've run (I haven't included a patch).
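The list-prefetch idea above can be sketched in a few lines; this is a self-contained toy (the `Node` type and helpers are hypothetical, not PostgreSQL's pg_list.h API), showing the shape of issuing `__builtin_prefetch` on the next node while processing the current one:

```c
#include <stddef.h>
#include <stdlib.h>

/* Minimal singly linked list; illustrative names only. */
typedef struct Node
{
    int          val;
    struct Node *next;
} Node;

static long sum_list(const Node *n)
{
    long total = 0;

    for (; n != NULL; n = n->next)
    {
#ifdef __GNUC__
        /* Hint the cache line of the next node before we need it.
         * As noted above, prefetching too late (or too early)
         * buys nothing - the miss still happens. */
        if (n->next)
            __builtin_prefetch(n->next);
#endif
        total += n->val;
    }
    return total;
}

static Node *cons(int v, Node *tail)
{
    Node *n = malloc(sizeof *n);

    n->val = v;
    n->next = tail;
    return n;
}
```

Whether this helps at all depends on list length and allocation locality, which is presumably why the payoff was tangible but unspectacular.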
The benchmark results are here: http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/results I took two approaches to the latter. This was the more interesting piece of work. Test sets include: * Master baseline (green) * List optimization (as mentioned above, not really relevant to the main topic of this mail) (red) * "fib btree", earlier patch, please disregard (blue) * "Fixed fib patch", Fibonacci search, no specialization (purple) * The interesting one, Fibonacci search + specialization - "fib + no jump" (turquoise, see below for details) Initially, I had a little bit of success with Fibonacci search [2] in place of binary search, in the hope that it would better take advantage of CPU cache characteristics - Fibonacci search is said to have advantages where non-uniform memory access is an issue - it minimizes the final interval. I wasn't all that optimistic that it would work that well given the smallish size of BLCKSZ relative to modern CPU L1 cache sizes [3], but it did make an appreciable dent on its own. I suppose old habits die hard, because next I hacked up _bt_compare and had it do an int4btcmp directly, in the event of encountering a scankey that had as its comparator the relevant pg_proc oid. This is very much in the style (and the spirit) of the grotty early draft patches for the inlining-comparators-for-sorting patch. Patch is attached. This is a draft, a POC, posted only to facilitate discussion and to allow my results to be independently duplicated/verified. Note that there is a bug (attributable to the new search code) that causes the regression tests to fail in exactly one place (i.e. one line of difference). I didn't think it was worth deferring discussion to deal with that, though, since I don't think it undermines anything. I'm not sure how valuable the comparator trick is if we stick with binary search - I didn't test that. I'm sure it's a question that must be considered, though.
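For readers unfamiliar with the technique: Fibonacci search probes a sorted array at positions determined by Fibonacci numbers instead of midpoints, which shrinks the final interval faster and is said to behave better when memory access is non-uniform. A minimal sketch over a sorted int array (not the patch itself, which works on btree pages):

```c
/* Fibonacci search: returns the index of key in a[0..n-1], or -1. */
static int fib_search(const int *a, int n, int key)
{
    int fibMMm2 = 0;                    /* F(m-2) */
    int fibMMm1 = 1;                    /* F(m-1) */
    int fibM = fibMMm2 + fibMMm1;       /* F(m) */
    int offset = -1;                    /* eliminated prefix */

    /* Find the smallest Fibonacci number >= n. */
    while (fibM < n)
    {
        fibMMm2 = fibMMm1;
        fibMMm1 = fibM;
        fibM = fibMMm2 + fibMMm1;
    }

    while (fibM > 1)
    {
        int i = offset + fibMMm2;

        if (i > n - 1)
            i = n - 1;

        if (a[i] < key)
        {
            /* Key is above the probe: drop one Fibonacci step. */
            fibM = fibMMm1;
            fibMMm1 = fibMMm2;
            fibMMm2 = fibM - fibMMm1;
            offset = i;
        }
        else if (a[i] > key)
        {
            /* Key is below the probe: drop two Fibonacci steps. */
            fibM = fibMMm2;
            fibMMm1 = fibMMm1 - fibMMm2;
            fibMMm2 = fibM - fibMMm1;
        }
        else
            return i;
    }

    /* One element may remain unchecked. */
    if (fibMMm1 && offset + 1 < n && a[offset + 1] == key)
        return offset + 1;
    return -1;
}
```

Like binary search it is O(log n) comparisons, but the probe sequence clusters later probes closer together, which is the claimed cache advantage.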
I have a fairly huge amount of data here, having run plenty of benchmarks over several nights. The short version is that the 'select' benchmark has just over 18% greater throughput on this machine at some client counts (in particular, when there are 4 clients - there are 4 cores, but 8 logical cores) with the attached patch. There is a 3.5% regression with one client, which is certainly not accounted for by noise. Note, however, that latency appears consistently better with the patch applied. This is a machine running on dedicated hardware: a 4-core i7-4770. The working set easily fits in its 32GiB of DDR3 RAM at all pgbench scales tested (1, 10 and 100). The kernel used is "3.8.0-31-generic #46~precise1-Ubuntu SMP". Postgres settings are typical for this kind of thing (6GiB shared_buffers), but you can refer to my pgbench-tools results for full details (drill down to an individual pgbench run for that - they're all the same). I'm kind of curious as to what this benchmark would look like on a server with many more cores. I guess I could write a proper patch to have co
Re: [HACKERS] Why we are going to have to go DirectIO
* Magnus Hagander (mag...@hagander.net) wrote: > I think that's an excellent idea. If one of our developers could find the > time to attend that, I think that could be very productive. While I'm not > on the funds team, I'd definitely vote for funding such participation out > of community funds if said developer can't do it on his own. > > But it should definitely be a developer with interest and skills in that > particular area as well of course :) So don't think I'm proposing myself, I > definitely am not :) For my part, I'm definitely interested and those dates currently look like they'd work for me. Not sure if I really meet Magnus' qualifications above, but I'd be happy to try. ;) Stark and I were having a pretty good discussion with Ted Ts'o at pgconf.eu and he certainly seemed interested and willing to at least discuss things with us.. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] pgsql: Fix a couple of bugs in MultiXactId freezing
On Wed, Dec 4, 2013 at 8:43 PM, Tom Lane wrote: > Magnus Hagander writes: > > On Tue, Dec 3, 2013 at 7:20 PM, Tom Lane wrote: > >> I assume what would happen is the slave would PANIC upon seeing a WAL > >> record code it didn't recognize. > > > I wonder if we should for the future have the START_REPLICATION command > (or > > the IDENTIFY_SYSTEM would probably make more sense - or even adding a new > > command like IDENTIFY_CLIENT. The point is, something in the replication > > protocol) have walreceiver include its version sent to the master. That > > way we could have the walsender identify a walreceiver that's too old and > > disconnect it right away - with a much nicer error message than a PANIC. > > Meh. That only helps for the case of streaming replication, and not for > the thirty-seven other ways that some WAL might arrive at something that > wants to replay it. > > It might be worth doing anyway, but I can't get excited about it for this > scenario. > It does, but I bet it's one of the by far most common cases. I'd say it's that one and restore-from-backup that would cover a huge majority of all cases. If we can cover those, we don't have to be perfect - so unless it turns out to be ridiculously complicated, I think it would be worthwhile having. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Why we are going to have to go DirectIO
On Wed, Dec 4, 2013 at 9:31 PM, Jonathan Corbet wrote: > > I also wasn't exaggerating the reception I got when I tried to talk > > about IO and PostgreSQL at LinuxCon and other events. The majority of > > Linux hackers I've talked to simply don't want to be bothered with > > PostgreSQL's performance needs, and I've heard similar things from my > > colleagues at the MySQL variants. Greg KH was the only real exception. > > > > Heck, I went to a meeting of filesystem geeks at LinuxCon and the main > > feedback I received, from Linux FS developers (Chris and Ted), was > > "PostgreSQL should implement its own storage and use DirectIO, we don't > > know why you're even trying to use the Linux IO stack." > > I think you're talking to the wrong people. Nothing you've described is a > filesystem problem; you're contending with memory management problems. > Chris and Ted weren't helpful because there's actually little they can do > to help you. I would be happy to introduce you to some people who would be > more likely to take your problems to heart. > > Mel Gorman, for example, is working on putting together a set of MM > benchmarks in the hopes of quantifying changes and catching regressions > before new code is merged. He's one of the people who has to deal with > performance regressions when they show up in enterprise kernels, and I get > the sense he'd rather do less of that. > > Perhaps even better: the next filesystem, storage, and memory management > summit is March 24-25. A session on your pain points there would bring in > a substantial portion of the relevant developers at all levels. LSFMM > is arguably the most productive kernel event I see over the course of a > year; it's where I would go first to make progress on this issue. I'm not > an LSFMM organizer, but I would be happy to work to make such a session > happen if somebody from the PostgreSQL community wanted to be there. > I think that's an excellent idea.
If one of our developers could find the time to attend that, I think that could be very productive. While I'm not on the funds team, I'd definitely vote for funding such participation out of community funds if said developer can't do it on his own. But it should definitely be a developer with interest and skills in that particular area as well of course :) So don't think I'm proposing myself, I definitely am not :) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Why we are going to have to go DirectIO
On Wed, 04 Dec 2013 11:07:04 -0800 Josh Berkus wrote: > On 12/04/2013 07:33 AM, Jonathan Corbet wrote: > > Wow, Josh, I'm surprised to hear this from you. > > Well, I figured it was too angry to propose for an LWN article. ;-) So you're going to make us write it for you :) > > The active/inactive list mechanism works great for the vast majority of > > users. The second-use algorithm prevents a lot of pathological behavior, > > like wiping out your entire cache by copying a big file or running a > > backup. We *need* that kind of logic in the kernel. > > There's a large body of research on 2Q algorithms going back to the 80s, > which is what this is. As far as I can tell, the modification was > performed without any reading of this research, since that would have > easily shown that 50/50 was unlikely to be a good division, and that in > fact there is nothing which would work except a tunable setting, because > workloads are different. In general, the movement of useful information between academia and real-world programming seems to be minimal at best. Neither side seems to find much that is useful or interesting in what the other is doing. Unfortunate. For those interested in the details... (1) It's not quite 50/50, that's one bound for how the balance is allowed to go. (2) Anybody trying to add tunables to the kernel tends to run into resistance. Exposing thousands of knobs tends to lead to a situation where you *have* to be an expert on all those knobs to get decent behavior out of your system. So there is a big emphasis on having the kernel tune itself whenever possible. Here is a situation where that is not always happening, but a fix (which introduces no knob) is in the works. As an example, I've never done much with the PostgreSQL knobs on the LWN server. I just don't have the time to mess with it, and things Work Well Enough. 
> However, this particular issue concerns me less than the general > attitude that it's OK to push in experimental IO changes which can't be > disabled by users into release kernels, as exemplified by several > problematic and inadequately tested IO changes in the 3.X kernels -- > most notably the pdflush bug. It speaks of a policy that the Linux IO > stack is not production software, and it's OK to tinker with it in ways > that break things for many users. Bugs and regressions happen, and I won't say that we do a good enough job in that regard. There has been some concern recently that we're accepting too much marginal stuff. We have problems getting enough people to adequately review code — I think I've heard of another project or two with similar issues :). But nobody sees the kernel as experimental or feels that the introduction of bugs is an acceptable thing. > I also wasn't exaggerating the reception I got when I tried to talk > about IO and PostgreSQL at LinuxCon and other events. The majority of > Linux hackers I've talked to simply don't want to be bothered with > PostgreSQL's performance needs, and I've heard similar things from my > colleagues at the MySQL variants. Greg KH was the only real exception. > > Heck, I went to a meeting of filesystem geeks at LinuxCon and the main > feedback I received, from Linux FS developers (Chris and Ted), was > "PostgreSQL should implement its own storage and use DirectIO, we don't > know why you're even trying to use the Linux IO stack." I think you're talking to the wrong people. Nothing you've described is a filesystem problem; you're contending with memory management problems. Chris and Ted weren't helpful because there's actually little they can do to help you. I would be happy to introduce you to some people who would be more likely to take your problems to heart.
Mel Gorman, for example, is working on putting together a set of MM benchmarks in the hopes of quantifying changes and catching regressions before new code is merged. He's one of the people who has to deal with performance regressions when they show up in enterprise kernels, and I get the sense he'd rather do less of that. Perhaps even better: the next filesystem, storage, and memory management summit is March 24-25. A session on your pain points there would bring in a substantial portion of the relevant developers at all levels. LSFMM is arguably the most productive kernel event I see over the course of a year; it's where I would go first to make progress on this issue. I'm not an LSFMM organizer, but I would be happy to work to make such a session happen if somebody from the PostgreSQL community wanted to be there. > > This code has been a bit slow getting into the mainline for a few reasons, > > but one of the chief ones is this: nobody is saying from the sidelines > > that they need it! If somebody were saying "Postgres would work a lot > > better with this code in place" and had some numbers to demonstrate that, > > we'd be far more likely to see it get into an upcoming release. > > Well, Citus did that; do you need mor
Re: [HACKERS] Extension Templates S03E11
On Wed, Dec 4, 2013 at 3:39 AM, Jeff Davis wrote: > On Tue, 2013-12-03 at 10:23 -0500, Robert Haas wrote: >> In more normal cases, however, the system can (and probably should) >> figure out what was intended by choosing the *shortest* path to get to >> the intended version. For example, if someone ships 1.0, 1.0--1.1, >> 1.1, and 1.1--1.2, the system should choose to run 1.1 and then >> 1.1--1.2, not 1.0 and then 1.0--1.1 and then 1.1--1.2. But that can >> be automatic: only if there are two paths of equal length (as in the >> example in the previous paragraph) do we need help from the user to >> figure out what to do. > > Why do we need help from the user? Just pick a path. > > For an extension update, I understand why someone wouldn't want to > accidentally downgrade 5 versions (dropping all of their dependent > objects) before updating to the latest. But this doesn't apply to > creation. I suppose. But suppose we have 1.0, 1.1, 1.0--1.2, and 1.1--1.2. Suppose further that 1.1 drops some interfaces present in 1.0, and 1.2 adds new stuff. If the system chooses to run 1.0 and then 1.0--1.2, it'll create all the deprecated interfaces and then drop them again. Now maybe that won't cause any problems, but I bet it will. For example, consider hstore again. If we eventually disallow => as an operator altogether, the 1.0 script won't even run any more. Of course that doesn't matter for core because we've removed it entirely from our repository and don't ship it any more, but an out-of-core extension might well keep around more old scripts than we do, to make it easier to use the same bundle with multiple server versions. Imagine, for example, that 1.0 only works on 9.4 or earlier and 1.2 only works on releases 9.2 or later. 
The extension author wants to keep the 1.0 script around for the benefit of people who haven't upgraded, so that they can still install the older version that works there; but he also needs the 1.1 base version to be preferred to the 1.0 base version, else installation of 1.2 on 10.0+ will fail completely. My experience with software upgrades is that cases like this, and even weirder things, happen pretty routinely, so I think more control is good. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
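The "shortest path from a base version wins" rule being discussed can be sketched as a small graph search. This is a hypothetical model, not extension.c's actual algorithm: versions are integer ids, each `edge[i]` is an upgrade script from `edge[i][0]` to `edge[i][1]`, and `is_base[]` marks versions that ship a full installation script. A BFS backwards from the target finds the base version requiring the fewest update steps:

```c
#define MAX_VERSIONS 8

/* Returns the base version reaching target in the fewest upgrade
 * steps, or -1 if no base version can reach it.  Assumes at most
 * MAX_VERSIONS versions. */
static int best_base(int nv, int ne, int edge[][2],
                     const int *is_base, int target)
{
    int dist[MAX_VERSIONS], queue[MAX_VERSIONS];
    int head = 0, tail = 0, best = -1, i;

    for (i = 0; i < nv; i++)
        dist[i] = -1;           /* unreached */
    dist[target] = 0;
    queue[tail++] = target;

    /* BFS over reversed upgrade edges: dist[v] = steps from v to
     * target.  Only following upgrade edges means downgrade scripts
     * are never chosen as part of an install. */
    while (head < tail)
    {
        int v = queue[head++];

        for (i = 0; i < ne; i++)
            if (edge[i][1] == v && dist[edge[i][0]] < 0)
            {
                dist[edge[i][0]] = dist[v] + 1;
                queue[tail++] = edge[i][0];
            }
    }

    for (i = 0; i < nv; i++)
        if (is_base[i] && dist[i] >= 0 &&
            (best < 0 || dist[i] < dist[best]))
            best = i;
    return best;
}
```

With versions 0=1.0, 1=1.1, 2=1.2, scripts 1.0--1.1 and 1.1--1.2, and full scripts for 1.0 and 1.1, installing 1.2 picks base 1.1 (one step, never 1.0's two). This sketch breaks ties by lowest id; as discussed above, equal-length paths are exactly where the user would instead need to supply a preference or get an error.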
Re: [HACKERS] Status of FDW pushdowns
Merlin Moncure writes: > The downside of SQL-MED, particularly the way postgres implemented the > driver API, is that each driver is responsible for all > optimization efforts and I think this is bad. There was never any intention that that would be the final state of things. All the FDW APIs are quite experimental at this point, and subject to change, and one of the reasons for change is going to be to improve the optimization situation. At the same time, it's hard to say what might constitute optimization for FDWs that aren't backed by a remote SQL database. There are always going to be reasons why an FDW will have to do some of that work for itself. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Status of FDW pushdowns
On Wed, Dec 4, 2013 at 1:39 PM, David Fetter wrote: > On Wed, Dec 04, 2013 at 12:43:44PM -0600, Merlin Moncure wrote: >> On Mon, Dec 2, 2013 at 10:26 PM, David Fetter wrote: >> > On Tue, Dec 03, 2013 at 11:15:36AM +0800, Craig Ringer wrote: >> >> On 11/28/2013 03:24 AM, David Fetter wrote: >> >> > WITH, or SRF, or whatever, the point is that we need to be able to >> >> > specify what we're sending--probably single opaque strings delimited >> >> > just as we do other strings--and what we might get back--errors only, >> >> > rows, [sets of] refcursors are the ones I can think of offhand. >> >> >> >> So, you're thinking of something like: >> >> >> >> WITH FOREIGN somecte AS $$... foreign query ...$$ >> >> SELECT ... >> >> FROM somecte; >> > >> > I was picturing something a little more like an SRF which would take >> > one opaque string, the remote command, some descriptor, perhaps an >> > enum, of what if anything might come back. Long ago, I implemented a >> > similar thing in DBI-Link. It was called >> > >> > remote_exec_dbh(data_source_id integer, query text, returns_rows bool) >> >> Couple thoughts: >> *) Any 'pass through' API should support parameterization (the FDW may >> not support that, but many will and API should allow for it). Lack >> of parameterization is a major downside of dblink. The function could >> be set up to be variadic for the parameters. > > I don't know for sure that that needs to be in version 1 of this. It > definitely shouldn't block implementing the non-parameterized one. I'm not making the case it should be version anything. But, if you went dblink style, you'd want to go variadic. It's not really any extra work and you can always embed the string if the FDW driver doesn't support parameterization. > What the standard has is literally insane. Not sure I agree. 
The guiding principle of the standard implementation AIUI is that it wants to handle connectivity management via syntax and keep the DML abstractions clean (minus some un-implementable things like RI triggers). In other words, you write exactly the same queries for native and foreign tables. This makes things much easier for people who just want to write SQL the classical way and not get into funky vendor specific APIs. The downside of SQL-MED, particularly the way postgres implemented the driver API, is that each driver is responsible for all optimization efforts and I think this is bad. So I'm openly wondering if the FDW API should expose optional query rewriting hooks. The odbc-fdw and jdbc-fdw drivers for example could then benefit from those hooks so that qual pushdown could be implemented with far less code duplication and effort and a *much* broader set of problems could be addressed by FDW. For non- or exotic- SQL implementations those hooks could be implemented locally by the driver or disabled if it doesn't make sense to use them. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WITHIN GROUP patch
Andrew Gierth writes: > "Tom" == Tom Lane writes: > Tom> Well, okay, but you've not said anything that wouldn't be > Tom> handled just as well by some logic that adds a fixed > Tom> integer-constant-zero flag column to the rows going into the > Tom> tuplesort. > Adding such a column unconditionally even for non-hypothetical > functions would break the optimization for sorting a single column > (which is a big deal, something like 3x speed difference, for by-value > types). Well, sure, but I was only suggesting adding it when the aggregate asks for it, probably via a new flag column in pg_aggregate. The question you're evading is what additional functionality could be had if the aggregate could demand a different datatype or constant value for the flag column. > Adding it only for hypothetical set functions is making a distinction > in how functions are executed that I don't think is warranted - That seems like rather a curious argument from someone who's willing to give up the ability to specify a regular transition value concurrently with the flag column. But anyway, what I'm thinking right now is that these questions would all go away if the aggregate transfunction were receiving the rows and sticking them into the tuplestore. It could add whatever columns it felt like. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extension Templates S03E11
On Tue, Dec 3, 2013 at 11:44 AM, Dimitri Fontaine wrote: >> We should also consider the possibility of a user trying to >> deliberately install an older release. For example, if the user has >> 1.0, 1.0--1.1, 1.1, 1.1--1.2, and 1.2--1.0 (a downgrade script) with >> default_full_version = 1.2, an attempt to install 1.0 should run just >> the 1.0 script, NOT 1.2 and then 1.2--1.0. > > In what I did, if you want version 1.0 and we have a script --1.0.sql > around, then we just use that script, never kicking the path chooser. Oh, right. Duh. Sorry, bad example. I do think we want to avoid using a downgrade script as part of an install though - and to install from the newest possible full version (I kind of like the term "base" version) whenever possible. >> break the tie by choosing which version number appears first in the >> aforementioned list. If that still doesn't break the tie, either >> because none of the starting points are mentioned in that list or >> because there are multiple equal-length paths starting in the same >> place, we give up and emit an error. > > Jeff also did mention about tiebreakers without entering into any level > of details. > > We won't be able to just use default_version as the tiebreaker list > here, because of the following example: > > default_version = 1.2, 1.0 > > create extension foo version '1.1'; > > With such a setup it would prefer 1.2--1.1 to 1.0--1.1, which doesn't > look like what we want. Instead, we want > > default_version = 1.2 > create_from_version_candidates = 1.0 > > create extension foo version '1.1'; > > Then the tie breaker is the 1.0 in "create_from_version_candidates" so > we would run foo--1.0.sql and then foo--1.0--1.1.sql. I guess one way to skin this cat would be to just let the user provide an ordering for the versions i.e.
version_ordering = 1.0 1.1 1.2 When the user asks for version X, we reject any paths that pass through a newer version (so that we never downgrade), and start with the path that begins as close to the target version as possible. For scenarios where people might be installing either an older or newer version, that might be easier to understand than a base-version preference list. > Barring objections, I'm going to prepare a new branch to support > developing that behavior against only file based extensions, and submit > a spin-off patch to the current CF entry. Not totally sure we're all on the same page yet, but that's not necessarily meant to dissuade you. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposed feature: Selective Foreign Keys
On 12/04/2013 02:40 PM, Tom Lane wrote: > Andrew Dunstan writes: >> Well I guess we could say something like: >> FOREIGN KEY (a-col) WHERE (a-condition) REFERENCES b(b-col) WHERE (b-condition) >> But it's somewhat ugly. > OK, those make sense. >> I wonder whether this should be done via a USING clause on the constraint that pointed to the partial unique index. Or would that be too obscure? > I like what you have above. Yeah, it requires the more verbose syntax for declaring a foreign key, but this feature is not going to be so heavily used that anyone will be in danger of worsening their carpal tunnel syndrome. Fair enough. I guess in terms of *this* feature TomD would then need to adjust the location of his WHERE clause so it's before the REFERENCES clause. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Status of FDW pushdowns
David Fetter writes: > The idea here is that such a happy situation will not obtain until > much later, if ever, and meanwhile, we need a way to get things > accomplished even if it's inelegant, inefficient, etc. The > alternative is that those things simply will not get accomplished at > all. If that's the argument, why not just use dblink or dbilink, and be happy? This discussion sounds a whole lot like it's trending to a conclusion of wanting one of those in core, which is not where I'd like to end up. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extension Templates S03E11
Jeff Davis writes: > On Tue, 2013-12-03 at 14:31 -0500, Tom Lane wrote: >> Stephen Frost writes: >>> When it comes to dump/reload, I'd much rather see a mechanism which uses >>> our deep understanding of the extension's objects (as database objects) >>> to implement the dump/reload than a text blob which is carried forward >>> from major version to major version and may even fail to run. >> Note that we're already doing that in the binary_upgrade code path. >> I agree that generalizing that approach sounds like a better idea >> than keeping a text blob around. > The reason for doing it that way in pg_upgrade was to preserve OIDs for > types, etc.: That was *a* reason, but not the only one, I believe. > That doesn't seem to apply to ordinary dump/reload. Do you think it's > good for other reasons, as well? I think Stephen has already argued why it could be a good idea here. But in a nutshell: it seems like there are two use-cases to be supported, one where you want "CREATE EXTENSION hstore" to give you some appropriate version of hstore, and one where you want to restore exactly what you had on the previous installation. It seems to me that "exploding" the extension by dumping, rather than suppressing, its component objects is by far the most reliable way of accomplishing the latter. To point out just one reason why, we've never made any effort to prohibit suitably-privileged users from modifying the objects within an extension. So even if you'd kept around the originally defining text string, it might not represent current reality. And as for relying on some URL or other --- whoever proposed that doesn't live in the same internet I do. URLs aren't immutable, even on days when you can get to them. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgsql: Fix a couple of bugs in MultiXactId freezing
Magnus Hagander writes:
> On Tue, Dec 3, 2013 at 7:20 PM, Tom Lane wrote:
>> I assume what would happen is the slave would PANIC upon seeing a WAL
>> record code it didn't recognize.
> I wonder if we should for the future have the START_REPLICATION command (or
> the IDENTIFY_SYSTEM would probably make more sense - or even adding a new
> command like IDENTIFY_CLIENT. The point is, something in the replication
> protocol) have walreceiver include its version sent to the master. That
> way we could have the walsender identify a walreceiver that's too old and
> disconnect it right away - with a much nicer error message than a PANIC.

Meh. That only helps for the case of streaming replication, and not for the thirty-seven other ways that some WAL might arrive at something that wants to replay it. It might be worth doing anyway, but I can't get excited about it for this scenario.

regards, tom lane
Re: [HACKERS] Why we are going to have to go DirectIO
On 12/04/2013 07:33 AM, Jonathan Corbet wrote:
> Wow, Josh, I'm surprised to hear this from you.
>
> The active/inactive list mechanism works great for the vast majority of
> users. The second-use algorithm prevents a lot of pathological behavior,
> like wiping out your entire cache by copying a big file or running a
> backup. We *need* that kind of logic in the kernel.
>
> The amount of automated testing, including performance testing, has
> increased markedly in the last couple of years. I bet that it would not
> be hard at all to get somebody like Fengguang Wu to add some
> Postgres-oriented I/O tests to his automatic suite:
>
> https://lwn.net/Articles/571991/
>
> Then we would all have a much better idea of how kernel releases are
> affecting one of our most important applications; developers would pay
> attention to that information.
>
> Or you could go off and do your own thing, but I believe that would leave
> us all poorer.

Thank you for your very well thought out and knowledgeable response. This is certainly helpful and highlights what a lot of us were already stating.

Sincerely,

Joshua D. Drake

-- Command Prompt, Inc. - http://www.commandprompt.com/ 509-416-6579 PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc For my dreams of your image that blossoms a rose in the deeps of my heart. - W.B. Yeats
Re: [HACKERS] Proposed feature: Selective Foreign Keys
Andrew Dunstan writes: >>> Well I guess we could say something like: >>> >>> FOREIGN KEY (a-col) WHERE (a-condition) REFERENCES b(b-col) WHERE >>> (b-condition) >>> >>> But it's somewhat ugly. > OK, those make sense. I wonder whether this should be done via a USING > clause on the constraint that pointed to the partial unique index. Or > would that be too obscure? I like what you have above. Yeah, it requires the more verbose syntax for declaring a foreign key, but this feature is not going to be so heavily used that anyone will be in danger of worsening their carpal tunnel syndrome. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Status of FDW pushdowns
On Wed, Dec 04, 2013 at 12:43:44PM -0600, Merlin Moncure wrote: > On Mon, Dec 2, 2013 at 10:26 PM, David Fetter wrote: > > On Tue, Dec 03, 2013 at 11:15:36AM +0800, Craig Ringer wrote: > >> On 11/28/2013 03:24 AM, David Fetter wrote: > >> > WITH, or SRF, or whatever, the point is that we need to be able to > >> > specify what we're sending--probably single opaque strings delimited > >> > just as we do other strings--and what we might get back--errors only, > >> > rows, [sets of] refcursors are the ones I can think of offhand. > >> > >> So, you're thinking of something like: > >> > >> WITH FOREIGN somecte AS $$... foreign query ...$$ > >> SELECT ... > >> FROM somecte; > > > > I was picturing something a little more like an SRF which would take > > one opaque string, the remote command, some descriptor, perhaps an > > enum, of what if anything might come back. Long ago, I implemented a > > similar thing in DBI-Link. It was called > > > > remote_exec_dbh(data_source_id integer, query text, returns_rows bool) > > Couple thoughts: > *) Any 'pass through' API should support parameterization (the FDW may > not support that, but many will and API should allow for it). Lack > of parameterization is a major downside of dblink. The function could > be set up to be variadic for the parameters. I don't know for sure that that needs to be in version 1 of this. It definitely shouldn't block implementing the non-parameterized one. > *) For a connectivity APIs of this style, Dblink-ish mechanic of > separating command execution from data returning commands is likely > the right way to go. Also, probably better to stick with SRF > mechanics if we go the 'function route'. So basically we are making > dblink for FDW, adding parameterization and some concept of utilizing > the foreign server. Yes, modulo the above. > All this is assuming we are adding a special remote execution function > ('fdwlink'). 
While that would be great, it's a significant deviation > from the standard into postgresql specific SRF syntax. What the standard has is literally insane. > If some of the qual pushdown deparsing functionality could be put > inside the internal FDW API, then you'd get the best of both worlds. If this were flawless on the PostgreSQL side (i.e. our path generator understood everything perfectly including aggregates) and trivial to implement correctly in FDWs, certainly. The idea here is that such a happy situation will not obtain until much later, if ever, and meanwhile, we need a way to get things accomplished even if it's inelegant, inefficient, etc. The alternative is that those things simply will not get accomplished at all. Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WITHIN GROUP patch
> "Tom" == Tom Lane writes: Tom> Well, okay, but you've not said anything that wouldn't be Tom> handled just as well by some logic that adds a fixed Tom> integer-constant-zero flag column to the rows going into the Tom> tuplesort. Adding such a column unconditionally even for non-hypothetical functions would break the optimization for sorting a single column (which is a big deal, something like 3x speed difference, for by-value types). Adding it only for hypothetical set functions is making a distinction in how functions are executed that I don't think is warranted - imagine for example a function that calculates some measure over a frequency distribution by adding a known set of boundary values to the sort; this would not be a hypothetical set function in terms of argument processing, but it would still benefit from the extra sort column. I did not want to unnecessarily restrict such possibilities. >> It would still be overloaded in some sense because a non-hypothetical >> ordered set function could still take an arbitrary number of args >> (using variadic "any") - there aren't any provided, but there's no >> good reason to disallow user-defined functions doing that - so you'd >> still need a special value like -1 for aggordnargs to handle that. Tom> Sure. But a -1 to indicate "not applicable" doesn't seem like it's Tom> too much of a stretch. It's the -2 business that's bothering me. Tom> Again, that seems unnecessarily non-orthogonal --- who's to say which Tom> functions would want to constrain the number of direct arguments and Tom> which wouldn't? (I wonder whether having this info in the catalogs Tom> isn't the wrong thing anyhow, as opposed to expecting the functions Tom> themselves to check the argument count at runtime.) Not checking the number of arguments to a function until runtime seems a bit on the perverse side. 
Having a fixed number of direct args is the "normal" case (as seen from the fact that all the non-hypothetical ordered set functions in the spec and in our patch have fixed argument counts). -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
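To make the flag-column idea being debated concrete, here is a toy model in Python. It is purely illustrative, not the patch's code, and it glosses over the tuplesort machinery entirely: for a hypothetical-set function such as rank(h) WITHIN GROUP (ORDER BY v), the hypothetical row is fed into the sort alongside the real rows with an extra constant column, and that flag decides how the row sorts among equal values.

```python
# Toy model of a hypothetical-set aggregate -- illustrative only, not the
# patch's implementation. rank(h) WITHIN GROUP (ORDER BY v) asks: what rank
# would h have if it were inserted among the sorted values? We tag real rows
# with flag 0 and the hypothetical row with flag -1, so on ties the
# hypothetical row sorts before its peers; its final position is its rank.
# The extra sort column is exactly what defeats the single-column sort
# optimization discussed above.

def hypothetical_rank(values, h):
    rows = [(v, 0) for v in values]   # real rows carry flag 0
    rows.append((h, -1))              # hypothetical row sorts before peers
    rows.sort()                       # sort by (value, flag)
    return rows.index((h, -1)) + 1
```

For example, over the values 10, 20, 20, 30 the hypothetical rank of 20 is 2, matching what SELECT rank(20) WITHIN GROUP (ORDER BY v) would return.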
Re: [HACKERS] Why we are going to have to go DirectIO
On 12/04/2013 07:33 AM, Jonathan Corbet wrote: > Wow, Josh, I'm surprised to hear this from you. Well, I figured it was too angry to propose for an LWN article. ;-) > The active/inactive list mechanism works great for the vast majority of > users. The second-use algorithm prevents a lot of pathological behavior, > like wiping out your entire cache by copying a big file or running a > backup. We *need* that kind of logic in the kernel. There's a large body of research on 2Q algorithms going back to the 80s, which is what this is. As far as I can tell, the modification was performed without any reading of this research, since that would have easily shown that 50/50 was unlikely to be a good division, and that in fact there is nothing which would work except a tunable setting, because workloads are different. Certainly the "what happens if a single file is larger than the entire recency bucket" question is addressed and debated. As an example, PostgreSQL would want to shrink the frequency list to 0%, because we already implement our own frequency list, and we already demonstrated back in version 8.1 that a 3-list system was ineffective. I can save Johannes some time: don't implement ARC. Not only is it under IBM patent, it's not effective in real-world situations. Both Postgres and Apache tried it in the early aughts. However, this particular issue concerns me less than the general attitude that it's OK to push in experimental IO changes which can't be disabled by users into release kernels, as exemplified by several problematic and inadequately tested IO changes in the 3.X kernels -- most notably the pdflush bug. It speaks of a policy that the Linux IO stack is not production software, and it's OK to tinker with it in ways that break things for many users. I also wasn't exaggerating the reception I got when I tried to talk about IO and PostgreSQL at LinuxCon and other events. 
The majority of Linux hackers I've talked to simply don't want to be bothered with PostgreSQL's performance needs, and I've heard similar things from my colleagues at the MySQL variants. Greg KH was the only real exception. Heck, I went to a meeting of filesystem geeks at LinuxCon and the main feedback I received, from Linux FS developers (Chris and Ted), was "PostgreSQL should implement its own storage and use DirectIO, we don't know why you're even trying to use the Linux IO stack." That's why I gave up on working through community channels; I face enough uphill battles in *this* project. > This code has been a bit slow getting into the mainline for a few reasons, > but one of the chief ones is this: nobody is saying from the sidelines > that they need it! If somebody were saying "Postgres would work a lot > better with this code in place" and had some numbers to demonstrate that, > we'd be far more likely to see it get into an upcoming release. Well, Citus did that; do you need more evidence? > In the end, Linux is quite responsive to the people who participate in its > development, even as testers and bug reporters. It responds rather less > well to people who find problems in enterprise kernels years later, > granted. All infrastructure software, including Postgres, has the issue that most enterprise users are using a version which was released years ago. As a result, some performance issues simply aren't going to be found until that version has been out for a couple of years. This leads to a Catch-22: enterprise users are reluctant to upgrade because of potential performance regressions, and as a result the median "enterprise" version gets further and further behind current development, and as a result the performance regressions are never fixed. We encounter this in PostgreSQL (I have customers who are still on 8.4 or 9.1 because of specific regressions), and it's even worse in the Linux world, where RHEL is still on 2.6. 
We work really hard to avoid performance regressions in Postgres versions, because we know we can't test for them adequately, and often can't fix them in release versions after the fact. But you know what? 2.6, overall, still performs better than any kernel in the 3.X series, at least for Postgres. > The amount of automated testing, including performance testing, has > increased markedly in the last couple of years. I bet that it would not > be hard at all to get somebody like Fengguang Wu to add some > Postgres-oriented I/O tests to his automatic suite: > > https://lwn.net/Articles/571991/ > > Then we would all have a much better idea of how kernel releases are > affecting one of our most important applications; developers would pay > attention to that information. Oh, good! I was working with Greg on having an automated pgBench run, but doing it on Wu's testing platform would be even better. I still need to get some automated stats digestion, since I want to at least make sure that the tests would show the three major issues which we encountered in recent Linux kernels
Re: [HACKERS] RFC: programmable file format for postgresql.conf
On 04/12/13 19:49, Peter Eisentraut wrote:
> On 12/4/13, 11:22 AM, Álvaro Hernández Tortosa wrote:
>> Would it be well-received a new file format that keeps it simple for
>> both hand editing and generation of the configuration, and at the same
>> time offers the features I have mentioned?
>
> I don't see how that would work exactly: You want to add various kinds
> of complex metadata to the configuration file, but make that metadata
> optional at the same time. The immediate result will be that almost no
> one will supply the optional metadata, and no tools will be able to
> rely on their presence.

I wouldn't say the metadata is "complex". It looks quite similar to that of pg_settings (besides, that was just brainstorming, not a formal proposal). The optional fields are basically NULLABLE attributes in pg_settings. That is, they only make sense depending on other values (in this case, the parameter name). All of the attributes that are required for tools to work are marked as non-optional. So fields are either purely optional (i.e., only for tools that want to use them; everyone else may ignore, but preserve, them) or just NULLABLE, depending on the parameter.

In any case, my idea is just to open up the question and search for the best possible set of data to be represented, and then the best possible syntax / file format for it.

aht

-- Álvaro Hernández Tortosa --- NOSYS Networked Open SYStems
Re: [HACKERS] RFC: programmable file format for postgresql.conf
On 12/4/13, 11:22 AM, Álvaro Hernández Tortosa wrote: > Would it be well-received a new file format that keeps it simple for > both hand editing and generation of the configuration, and at the same > time offers the features I have mentioned? I don't see how that would work exactly: You want to add various kinds of complex metadata to the configuration file, but make that metadata optional at the same time. The immediate result will be that almost no one will supply the optional metadata, and no tools will be able to rely on their presence. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Status of FDW pushdowns
On Mon, Dec 2, 2013 at 10:26 PM, David Fetter wrote: > On Tue, Dec 03, 2013 at 11:15:36AM +0800, Craig Ringer wrote: >> On 11/28/2013 03:24 AM, David Fetter wrote: >> > WITH, or SRF, or whatever, the point is that we need to be able to >> > specify what we're sending--probably single opaque strings delimited >> > just as we do other strings--and what we might get back--errors only, >> > rows, [sets of] refcursors are the ones I can think of offhand. >> >> So, you're thinking of something like: >> >> WITH FOREIGN somecte AS $$... foreign query ...$$ >> SELECT ... >> FROM somecte; > > I was picturing something a little more like an SRF which would take > one opaque string, the remote command, some descriptor, perhaps an > enum, of what if anything might come back. Long ago, I implemented a > similar thing in DBI-Link. It was called > > remote_exec_dbh(data_source_id integer, query text, returns_rows bool) Couple thoughts: *) Any 'pass through' API should support parameterization (the FDW may not support that, but many will and API should allow for it). Lack of parameterization is a major downside of dblink. The function could be set up to be variadic for the parameters. *) For a connectivity APIs of this style, Dblink-ish mechanic of separating command execution from data returning commands is likely the right way to go. Also, probably better to stick with SRF mechanics if we go the 'function route'. So basically we are making dblink for FDW, adding parameterization and some concept of utilizing the foreign server. All this is assuming we are adding a special remote execution function ('fdwlink'). While that would be great, it's a significant deviation from the standard into postgresql specific SRF syntax. If some of the qual pushdown deparsing functionality could be put inside the internal FDW API, then you'd get the best of both worlds. Maybe you'd still want a dblink style extension anyways, but it wouldn't be as critical. 
merlin
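To illustrate the parameterization point (the dblink downside Merlin mentions), here is a toy sketch in Python. remote_exec is a hypothetical function modeled loosely on the remote_exec_dbh signature quoted above; nothing like it exists in PostgreSQL today. The essential shape is that the query text and its parameters travel separately, so values get bound by the driver rather than spliced into the SQL string.

```python
# Hypothetical sketch only -- models a parameterized pass-through API, using
# a DB-API connection (sqlite3 here) to stand in for the foreign server.
# Parameters are passed separately from the query text and bound by the
# driver, avoiding the quote-and-concatenate dance that dblink forces.
import sqlite3

def remote_exec(conn, query, params=(), returns_rows=True):
    cur = conn.cursor()
    cur.execute(query, params)   # placeholders bound, never interpolated
    return cur.fetchall() if returns_rows else None

# Stand-in "foreign server":
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
remote_exec(conn, "INSERT INTO t VALUES (?, ?)", (1, "O'Hara"), returns_rows=False)
rows = remote_exec(conn, "SELECT name FROM t WHERE id = ?", (1,))
```

Note the O'Hara value: with string splicing, that embedded quote would need careful escaping by the caller; with binding it just works, which is the point of putting parameterization into the API.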
Re: [HACKERS] Why we are going to have to go DirectIO
On 12/04/2013 07:30 PM, Joshua D. Drake wrote:
>
> On 12/04/2013 07:32 AM, Stefan Kaltenbrunner wrote:
>>
>> On 12/04/2013 04:30 PM, Peter Eisentraut wrote:
>>> On 12/4/13, 2:14 AM, Stefan Kaltenbrunner wrote:
>>>> running a few kvm instances that get bootstrapped automatically is
>>>> something that is a solved problem.
>>>
>>> Is it sound to run performance tests on kvm?
>>
>> as sound as on any other platform imho, the performance characteristics
>> will differ between bare metal or other virtualisation platforms but the
>> future is virtual and that is what a lot of stuff runs on...
>
> In actuality you need both. We need to know what the kernel is going to
> do on bare metal. For example, 3.2 to 3.8 are total crap for random IO
> access. We will only catch that properly from bare metal tests or at
> least, we will only catch it easily on bare metal tests.
>
> If we know the standard bare metal tests are working then the next step
> up would be to test virtual.
>
> BTW: Virtualization is only one future and it is still a long way off
> from serving the needs that bare metal serves at the same level
> (speaking PostgreSQL specifically).

We need to get that off the ground - and whatever makes it easier to get off the ground will help. And if we solve the automation for virtualisation, bare metal is just a small step away (or the other way round). Getting comparable performance levels between either different postgresql versions (or patches) or different operating systems with various workloads is probably more valuable now than getting absolute peak performance levels under specific tests long term.

Stefan
Re: [HACKERS] Why we are going to have to go DirectIO
On 12/04/2013 07:32 AM, Stefan Kaltenbrunner wrote:
>
> On 12/04/2013 04:30 PM, Peter Eisentraut wrote:
>> On 12/4/13, 2:14 AM, Stefan Kaltenbrunner wrote:
>>> running a few kvm instances that get bootstrapped automatically is
>>> something that is a solved problem.
>>
>> Is it sound to run performance tests on kvm?
>
> as sound as on any other platform imho, the performance characteristics
> will differ between bare metal or other virtualisation platforms but the
> future is virtual and that is what a lot of stuff runs on...

In actuality you need both. We need to know what the kernel is going to do on bare metal. For example, 3.2 to 3.8 are total crap for random IO access. We will only catch that properly from bare metal tests or at least, we will only catch it easily on bare metal tests.

If we know the standard bare metal tests are working then the next step up would be to test virtual.

BTW: Virtualization is only one future and it is still a long way off from serving the needs that bare metal serves at the same level (speaking PostgreSQL specifically).

JD

> Stefan

-- Command Prompt, Inc. - http://www.commandprompt.com/ 509-416-6579 PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc For my dreams of your image that blossoms a rose in the deeps of my heart. - W.B. Yeats
Re: [HACKERS] Parallel Select query performance and shared buffers
> You could try my lwlock-scalability improvement patches - for some > workloads here, the improvements have been rather noticeable. Which > version are you testing? I'm testing with PostgreSQL 9.3.1.
Re: [HACKERS] Parallel Select query performance and shared buffers
On 2013-12-04 20:19:55 +0200, Metin Doslu wrote: > - When we increased NUM_BUFFER_PARTITIONS to 1024, this problem is > disappeared for 8 core machines and come back with 16 core machines on > Amazon EC2. Would it be related with PostgreSQL locking mechanism? You could try my lwlock-scalability improvement patches - for some workloads here, the improvements have been rather noticeable. Which version are you testing? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel Select query performance and shared buffers
Here is some extra information:

- When we increased NUM_BUFFER_PARTITIONS to 1024, this problem disappeared for 8-core machines and came back with 16-core machines on Amazon EC2. Would it be related to the PostgreSQL locking mechanism?

- I tried this test with 4-core machines, including my personal computer and some other instances on Amazon EC2, and I didn't see this problem with 4-core machines. I started to see this problem in PostgreSQL when the core count is 8 or more.

- Here are the results of "vmstat 1" while running 8 parallel select count(*). Normally I would expect zero idle time.

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd     free   buff    cache   si   so    bi    bo     in     cs us sy id wa st
 0  0      0 29838640  94000 38954740    0    0     0     0     22     21  0  0 100  0  0
 7  2      0 29788416  94000 38954740    0    0     0     0  53922 108490 14 24 60  1  1
 5  0      0 29747248  94000 38954740    0    0     0     0  68008 164571 22 48 27  2  1
 8  0      0 29725796  94000 38954740    0    0     0     0  43587 150574 28 54 16  1  1
 0  0      0 29838328  94000 38954740    0    0     0     0  15584 100459 26 55 18  1  0
 0  0      0 29838328  94000 38954740    0    0     0     0     42     15  0  0 100  0  0

- When I run 8 parallel wc commands or other scripts, they scale out as expected and they utilize all CPUs. This leads me to think that the problem is related to PostgreSQL instead of the OS.
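Some background on the NUM_BUFFER_PARTITIONS experiment above: PostgreSQL's shared buffer mapping table is guarded by an array of lightweight locks, and a buffer's partition is derived from its tag hash, so raising the partition count spreads concurrent lookups over more locks. A toy model (Python, illustrative only; not PostgreSQL's code) of why more partitions mean fewer lock collisions:

```python
# Toy model of partitioned locking -- illustrative only. Each buffer tag
# hashes to one of num_partitions locks; two backends contend only when
# their tags fall into the same partition, so more partitions means less
# contention for the same set of concurrent accesses.

def partition_of(tag_hash, num_partitions):
    return tag_hash % num_partitions

def collision_fraction(tag_hashes, num_partitions):
    counts = {}
    for h in tag_hashes:
        p = partition_of(h, num_partitions)
        counts[p] = counts.get(p, 0) + 1
    # fraction of accesses that landed on an already-claimed partition
    return sum(n - 1 for n in counts.values()) / len(tag_hashes)
```

With 1000 distinct tag hashes, 16 partitions leave almost every access colliding with another, while 1024 partitions can eliminate collisions entirely; whether that explains the 8-core versus 16-core behavior reported here is exactly the open question in this thread.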
Re: [HACKERS] Parallel Select query performance and shared buffers
> Didn't follow the thread from the start. So, this is EC2? Have you
> checked, with a recent enough version of top or whatever, how much time
> is reported as "stolen"?

Yes, this is EC2. "stolen" is randomly reported as 1, mostly as 0.
Re: [HACKERS] Parallel Select query performance and shared buffers
On 2013-12-04 16:00:40 -0200, Claudio Freire wrote: > On Wed, Dec 4, 2013 at 1:54 PM, Andres Freund wrote: > > All that time is spent in your virtualization solution. One thing to try > > is to look on the host system, sometimes profiles there can be more > > meaningful. > > You cannot profile the host on EC2. Didn't follow the thread from the start. So, this is EC2? Have you checked, with a recent enough version of top or whatever, how much time is reported as "stolen"? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel Select query performance and shared buffers
> You could try HVM. I've noticed it fare better under heavy CPU load,
> and it's not fully-HVM (it still uses paravirtualized network and
> I/O).

I already tried with HVM (cc2.8xlarge instance on Amazon EC2) and observed the same problem.
Re: [HACKERS] Parallel Select query performance and shared buffers
On Wed, Dec 4, 2013 at 1:54 PM, Andres Freund wrote: > On 2013-12-04 18:43:35 +0200, Metin Doslu wrote: >> > I'd strongly suggest doing a "perf record -g -a ; >> > perf report" run to check what's eating up the time. >> >> Here is one example: >> >> + 38.87% swapper [kernel.kallsyms] [k] hypercall_page >> + 9.32% postgres [kernel.kallsyms] [k] hypercall_page >> + 6.80% postgres [kernel.kallsyms] [k] xen_set_pte_at > > All that time is spent in your virtualization solution. One thing to try > is to look on the host system, sometimes profiles there can be more > meaningful. You cannot profile the host on EC2. You could try HVM. I've noticed it fare better under heavy CPU load, and it's not fully-HVM (it still uses paravirtualized network and I/O). -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changes in Trigger Firing
>> CreateTrigStmt is passed to the CreateTrigger function as an argument. I am
>> struggling to understand how the values for various members of the trigger
>> are set and where [which file] calls CreateTrigStmt.
>>
>> Can someone provide some help on this?
>
> I think you need better tools to guide you in exploring the source code.
> For example, you can use cscope to tell you where CreateTrigStmt is
> used, and you would find gram.y; and use it to tell you where
> CreateTrigger is used, and you would find utility.c.

Thanks for your advice. I was relying on the PostgreSQL documentation, which has been quite helpful so far. Let me try such a development tool.

> Any half-decent code editor should be able to generate a "database" of
> symbols and let you frolic around the various files quickly. Without
> that, anyone would be completely lost in developing new features of even
> the lowest complexity.

Got the point!
Re: [HACKERS] Proposed feature: Selective Foreign Keys
On Wed, Dec 4, 2013 at 12:18 PM, Andrew Dunstan wrote: >> Interestingly, the variant for which you can't think of a use case is >> the one I've missed most. Typical examples in my experience are >> things like project.project_manager_id references person (id) where >> person.is_project_manager, or alert (device_id) references device (id) >> where not device.deleted. >> > OK, those make sense. I wonder whether this should be done via a USING > clause on the constraint that pointed to the partial unique index. Or would > that be too obscure? I wondered that, too. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extension Templates S03E11
On Tue, 2013-12-03 at 14:31 -0500, Tom Lane wrote: > Stephen Frost writes: > > When it comes to dump/reload, I'd much rather see a mechanism which uses > > our deep understanding of the extension's objects (as database objects) > > to implement the dump/reload than a text blob which is carried forward > > from major version to major version and may even fail to run. > > Note that we're already doing that in the binary_upgrade code path. > I agree that generalizing that approach sounds like a better idea > than keeping a text blob around. The reason for doing it that way in pg_upgrade was to preserve OIDs for types, etc.: http://www.postgresql.org/message-id/20783.1297184...@sss.pgh.pa.us That doesn't seem to apply to ordinary dump/reload. Do you think it's good for other reasons, as well? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why we are going to have to go DirectIO
On 12/04/2013 04:33 PM, Jonathan Corbet wrote: > On Tue, 03 Dec 2013 10:44:15 -0800 > Josh Berkus wrote: > >> It seems clear that Kernel.org, since 2.6, has been in the business of >> pushing major, hackish, changes to the IO stack without testing them or >> even thinking too hard about what the side-effects might be. This is >> perhaps unsurprising given that two of the largest sponsors of the >> Kernel -- who, incidentally, do 100% of the performance testing -- don't >> use the IO stack. >> >> This says to me that Linux will clearly be an undependable platform in >> the future with the potential to destroy PostgreSQL performance without >> warning, leaving us scrambling for workarounds. Too bad the >> alternatives are so unpopular. > > Wow, Josh, I'm surprised to hear this from you. > > The active/inactive list mechanism works great for the vast majority of > users. The second-use algorithm prevents a lot of pathological behavior, > like wiping out your entire cache by copying a big file or running a > backup. We *need* that kind of logic in the kernel. > > Now, back in 2012, Johannes (working for one of those big contributors) > hit upon an issue where second-use falls down. So he set out to fix it: > > https://lwn.net/Articles/495543/ > > This code has been a bit slow getting into the mainline for a few reasons, > but one of the chief ones is this: nobody is saying from the sidelines > that they need it! If somebody were saying "Postgres would work a lot > better with this code in place" and had some numbers to demonstrate that, > we'd be far more likely to see it get into an upcoming release. > > In the end, Linux is quite responsive to the people who participate in its > development, even as testers and bug reporters. It responds rather less > well to people who find problems in enterprise kernels years later, > granted. > > The amount of automated testing, including performance testing, has > increased markedly in the last couple of years. 
I bet that it would not > be hard at all to get somebody like Fengguang Wu to add some > Postgres-oriented I/O tests to his automatic suite: > > https://lwn.net/Articles/571991/ > > Then we would all have a much better idea of how kernel releases are > affecting one of our most important applications; developers would pay > attention to that information. Hmm, interesting tool - I can see how that would be very useful for "early warning" style detection on the kernel development side using a small set of postgresql "benchmarks". That would basically help with the part Josh complained about, namely that it will take ages for regressions to be detected. From postgresql's pov we would also need additional long-term and more complex testing spanning different postgresql versions on various distribution platforms (because that is what people deploy in production; hand-built, git-fetched kernels are rare), using tests that might have extended runtimes and/or require external infrastructure. > > Or you could go off and do your own thing, but I believe that would leave > us all poorer. Fully agreed. Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extension Templates S03E11
On Wed, 2013-12-04 at 09:50 -0500, Stephen Frost wrote: > > I still don't see that Extension Templates are all bad: > > * They preserve the fact that two instances of the same extension > > (e.g. in different databases) were created from the same template. > > This is only true if we change the extension templates to be shared > catalogs, which they aren't today.. I agree with you about that -- I don't like per-DB templates. I guess the challenge is that we might want to use namespaces to support user-installable extensions, and namespaces reside within a DB. But I think we can find some other solution there (e.g. user names rather than schemas), and per-DB templates are just not a good solution anyway. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgsql: Fix a couple of bugs in MultiXactId freezing
On Tue, Dec 3, 2013 at 7:20 PM, Tom Lane wrote:
> Magnus Hagander writes:
> > On Tue, Dec 3, 2013 at 7:11 PM, Tom Lane wrote:
> >> Maybe we should just bite the bullet and change the WAL format for
> >> heap_freeze (inventing an all-new record type, not repurposing the old
> >> one, and allowing WAL replay to continue to accept the old one). The
> >> implication for users would be that they'd have to update slave servers
> >> before the master when installing the update; which is unpleasant, but
> >> better than living with a known data corruption case.
> >
> > Agreed. It may suck, but it sucks less.
> >
> > How badly will it break if they do the upgrade in the wrong order though?
> > Will the slaves just stop (I assume this?) or is there a risk of a
> > wrong-order upgrade causing extra breakage?
>
> I assume what would happen is the slave would PANIC upon seeing a WAL
> record code it didn't recognize. Installing the updated version should
> allow it to resume functioning. Would be good to test this, but if it
> doesn't work like that, that'd be another bug to fix IMO. We've always
> foreseen the possible need to do something like this, so it ought to
> work reasonably cleanly.

I wonder if for the future we should have the START_REPLICATION command
(or IDENTIFY_SYSTEM would probably make more sense - or even a new
command like IDENTIFY_CLIENT; the point is, something in the replication
protocol) have the walreceiver send its version to the master. That way
we could have the walsender identify a walreceiver that's too old and
disconnect it right away - with a much nicer error message than a PANIC.

Right now, the walreceiver knows the version of the walsender (through
PQserverVersion), but AFAICT there is no way for the walsender to know
which version of the receiver is connected.

-- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Proposed feature: Selective Foreign Keys
On 12/04/2013 12:00 PM, Robert Haas wrote:
> On Wed, Dec 4, 2013 at 11:44 AM, Andrew Dunstan wrote:
>>> Oh. I misinterpreted what this feature was about, then. I thought it
>>> was about restricting the reference to a subset of the *referenced*
>>> table, but it seems to be about restricting the constraint to a subset
>>> of the *referencing* table. I guess they're both useful, but the
>>> syntax...
>>>
>>> REFERENCES tab(col) WHERE (stuff)
>>>
>>> ...sure looks like the WHERE clause is syntactically associated with
>>> the table being referenced. What would we do if we eventually wanted
>>> to support both variants?
>>
>> Well I guess we could say something like:
>>
>>     FOREIGN KEY (a-col) WHERE (a-condition) REFERENCES b(b-col) WHERE (b-condition)
>>
>> But it's somewhat ugly.
>>
>> The case of restricting the allowed referent rows does look slightly like a
>> solution in search of a problem, but maybe that's just because I haven't
>> thought of a use for it yet.
>
> Interestingly, the variant for which you can't think of a use case is
> the one I've missed most. Typical examples in my experience are things
> like project.project_manager_id references person (id) where
> person.is_project_manager, or alert (device_id) references device (id)
> where not device.deleted.

OK, those make sense.

I wonder whether this should be done via a USING clause on the
constraint that pointed to the partial unique index. Or would that be
too obscure?

cheers

andrew

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
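To make the two variants concrete, here is a sketch of how the syntax under discussion might look. Neither form exists in PostgreSQL; the table names, the trailing WHERE clause, and the USING INDEX spelling are all invented for illustration, with the second example following Robert's use cases and Andrew's USING-clause idea:

```sql
-- Variant 1 (this proposal): the constraint applies only to a subset of
-- the *referencing* table, e.g. a polymorphic parent_id column.
ALTER TABLE comment
    ADD FOREIGN KEY (parent_id) REFERENCES event (id)
    WHERE (parent_entity = 'event');                    -- hypothetical syntax

-- Variant 2 (Robert's reading): only a subset of the *referenced* table
-- may be pointed at, perhaps tied to a partial unique index explicitly.
CREATE UNIQUE INDEX person_pm_idx ON person (id)
    WHERE is_project_manager;

ALTER TABLE project
    ADD FOREIGN KEY (project_manager_id)
    REFERENCES person (id) USING INDEX person_pm_idx;   -- hypothetical syntax
```

An explicit index name in variant 2 would sidestep the matching problem Robert raised earlier in the thread: the constraint's subset would be defined by the index predicate itself rather than by a second, possibly inconsistent, WHERE clause.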
Re: [HACKERS] Time-Delayed Standbys
src/backend/access/transam/xlog.c:5889: trailing whitespace. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposed feature: Selective Foreign Keys
On Wed, Dec 4, 2013 at 11:44 AM, Andrew Dunstan wrote: >> Oh. I misinterpreted what this feature was about, then. I thought it >> was about restricting the reference to a subset of the *referenced* >> table, but it seems to be about restricting the constraint to a subset >> of the *referencing* table. I guess they're both useful, but the >> syntax... >> >> REFERENCES tab(col) WHERE (stuff) >> >> ...sure looks like the WHERE clause is syntactically associated with >> the table being referenced. What would we do if we eventually wanted >> to support both variants? >> > > Well I guess we could say something like: > >FOREIGN KEY (a-col) WHERE (a-condition) REFERENCES b(b-col) WHERE >(b-condition) > > But it's somewhat ugly. > > The case of restricting the allowed referent rows does look slightly like a > solution in search of a problem, but maybe that's just because I haven't > thought of a use for it yet. Interestingly, the variant for which you can't think of a use case is the one I've missed most. Typical examples in my experience are things like project.project_manager_id references person (id) where person.is_project_manager, or alert (device_id) references device (id) where not device.deleted. Different strokes for different folks, I guess. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel Select query performance and shared buffers
On 2013-12-04 18:43:35 +0200, Metin Doslu wrote: > > I'd strongly suggest doing a "perf record -g -a ; > > perf report" run to check what's eating up the time. > > Here is one example: > > + 38.87% swapper [kernel.kallsyms] [k] hypercall_page > + 9.32% postgres [kernel.kallsyms] [k] hypercall_page > + 6.80% postgres [kernel.kallsyms] [k] xen_set_pte_at All that time is spent in your virtualization solution. One thing to try is to look on the host system, sometimes profiles there can be more meaningful. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposed feature: Selective Foreign Keys
On 12/04/2013 11:25 AM, Robert Haas wrote:
> On Tue, Dec 3, 2013 at 5:57 PM, Tom Dunstan wrote:
>> On 4 December 2013 01:24, Robert Haas wrote:
>>> Yeah, more or less, but the key is ensuring that it wouldn't let you
>>> create the constraint in the first place if the partial index
>>> specified *didn't* match the WHERE clause. For example, suppose the
>>> partial index says WHERE parent_entity = 'event' but the constraint
>>> definition is WHERE parent_event = 'somethingelse'. That ought to
>>> fail, just as creating a regular foreign constraint will fail if
>>> there's no matching unique index.
>>
>> The where clause only applies to queries against the FK table, and we
>> don’t currently fail if there isn’t a matching index on the fk column
>> when creating a FK (I’ve been bitten by that before).
>>
>> We fail if there isn’t a unique index on the referenced
>> table/column(s), but queries against that table on insert/update not
>> the FK table are unchanged (save that we don’t bother with them at all
>> if the where clause expression fails for the given tuple).
>
> Oh. I misinterpreted what this feature was about, then. I thought it
> was about restricting the reference to a subset of the *referenced*
> table, but it seems to be about restricting the constraint to a subset
> of the *referencing* table. I guess they're both useful, but the
> syntax...
>
> REFERENCES tab(col) WHERE (stuff)
>
> ...sure looks like the WHERE clause is syntactically associated with
> the table being referenced. What would we do if we eventually wanted
> to support both variants?

Well I guess we could say something like:

    FOREIGN KEY (a-col) WHERE (a-condition) REFERENCES b(b-col) WHERE (b-condition)

But it's somewhat ugly.

The case of restricting the allowed referent rows does look slightly like a
solution in search of a problem, but maybe that's just because I haven't
thought of a use for it yet.

cheers

andrew

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel Select query performance and shared buffers
> I'd strongly suggest doing a "perf record -g -a ;
> perf report" run to check what's eating up the time.

Here is one example:

+ 38.87% swapper   [kernel.kallsyms]  [k] hypercall_page
+  9.32% postgres  [kernel.kallsyms]  [k] hypercall_page
+  6.80% postgres  [kernel.kallsyms]  [k] xen_set_pte_at
+  5.83% postgres  [kernel.kallsyms]  [k] copy_user_generic_string
+  2.06% postgres  [kernel.kallsyms]  [k] file_read_actor
+  1.89% postgres  postgres           [.] heapgettup_pagemode
+  1.83% postgres  postgres           [.] hash_search_with_hash_value
+  1.33% postgres  [kernel.kallsyms]  [k] get_phys_to_machine
+  1.25% postgres  [kernel.kallsyms]  [k] find_get_page
+  1.00% postgres  postgres           [.] heapgetpage
+  0.99% postgres  [kernel.kallsyms]  [k] radix_tree_lookup_element
+  0.98% postgres  postgres           [.] advance_aggregates
+  0.96% postgres  postgres           [.] ExecProject
+  0.94% postgres  postgres           [.] advance_transition_function
+  0.88% postgres  postgres           [.] ExecScan
+  0.87% postgres  postgres           [.] HeapTupleSatisfiesMVCC
+  0.86% postgres  postgres           [.] LWLockAcquire
+  0.82% postgres  [kernel.kallsyms]  [k] put_page
+  0.82% postgres  postgres           [.] MemoryContextReset
+  0.80% postgres  postgres           [.] SeqNext
+  0.78% postgres  [kernel.kallsyms]  [k] pte_mfn_to_pfn
+  0.69% postgres  postgres           [.] ExecClearTuple
+  0.57% postgres  postgres           [.] ExecProcNode
+  0.54% postgres  postgres           [.] heap_getnext
+  0.53% postgres  postgres           [.] LWLockRelease
+  0.53% postgres  postgres           [.] ExecStoreTuple
+  0.51% postgres  libc-2.12.so       [.] __GI___libc_read
+  0.42% postgres  [kernel.kallsyms]  [k] xen_spin_lock
+  0.40% postgres  postgres           [.] ReadBuffer_common
+  0.38% postgres  [kernel.kallsyms]  [k] __do_fault
+  0.37% postgres  [kernel.kallsyms]  [k] shmem_fault
+  0.37% postgres  [kernel.kallsyms]  [k] unmap_single_vma
+  0.35% postgres  [kernel.kallsyms]  [k] __wake_up_bit
+  0.33% postgres  postgres           [.] StrategyGetBuffer
+  0.33% postgres  [kernel.kallsyms]  [k] set_page_dirty
+  0.33% postgres  [kernel.kallsyms]  [k] handle_pte_fault
+  0.33% postgres  postgres           [.] ExecAgg
+  0.31% postgres  postgres           [.] XidInMVCCSnapshot
+  0.31% postgres  [kernel.kallsyms]  [k] __audit_syscall_entry
+  0.31% postgres  postgres           [.] CheckForSerializableConflictOut
+  0.29% postgres  [kernel.kallsyms]  [k] handle_mm_fault
+  0.25% postgres  [kernel.kallsyms]  [k] shmem_getpage_gfp

On Wed, Dec 4, 2013 at 6:33 PM, Andres Freund wrote:
> On 2013-12-04 14:27:10 -0200, Claudio Freire wrote:
> > On Wed, Dec 4, 2013 at 9:19 AM, Metin Doslu wrote:
> > >
> > > Here are the results of "vmstat 1" while running 8 parallel TPC-H Simple
> > > (#6) queries: Although there is no need for I/O, "wa" fluctuates between 0
> > > and 1.
> > >
> > > procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
> > >  r  b swpd     free  buff    cache si so bi bo    in     cs us sy  id wa st
> > >  0  0    0 30093568 84892 38723896  0  0  0  0    22     14  0  0 100  0  0
> > >  8  1    0 30043056 84892 38723896  0  0  0  0 27080  52708 16 14  70  0  0
> > >  8  1    0 30006600 84892 38723896  0  0  0  0 44952 118286 43 44  12  1  0
> > >  8  0    0 29986264 84900 38723896  0  0  0 20 28043  95934 49 42   8  1  0
> > >  7  0    0 29991976 84900 38723896  0  0  0  0  8308  73641 52 42   6  0  0
> > >  0  0    0 30091828 84900 38723896  0  0  0  0  3996  30978 23 24  53  0  0
> > >  0  0    0 30091968 84900 38723896  0  0  0  0    17     23  0  0 100  0  0
> >
> > Notice the huge %sy
>
> My bet is on transparent hugepage defragmentation. Alternatively it's
> scheduler overhead, due to superfluous context switches around the buffer
> mapping locks.
>
> I'd strongly suggest doing a "perf record -g -a ;
> perf report" run to check what's eating up the time.
>
> Greetings,
>
> Andres Freund
>
> --
> Andres Freund                     http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Parallel Select query performance and shared buffers
>Notice the huge %sy >What kind of VM are you using? HVM or paravirtual? This instance is paravirtual.
Re: [HACKERS] Parallel Select query performance and shared buffers
On 2013-12-04 14:27:10 -0200, Claudio Freire wrote:
> On Wed, Dec 4, 2013 at 9:19 AM, Metin Doslu wrote:
> >
> > Here are the results of "vmstat 1" while running 8 parallel TPC-H Simple
> > (#6) queries: Although there is no need for I/O, "wa" fluctuates between 0
> > and 1.
> >
> > procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
> >  r  b swpd     free  buff    cache si so bi bo    in     cs us sy  id wa st
> >  0  0    0 30093568 84892 38723896  0  0  0  0    22     14  0  0 100  0  0
> >  8  1    0 30043056 84892 38723896  0  0  0  0 27080  52708 16 14  70  0  0
> >  8  1    0 30006600 84892 38723896  0  0  0  0 44952 118286 43 44  12  1  0
> >  8  0    0 29986264 84900 38723896  0  0  0 20 28043  95934 49 42   8  1  0
> >  7  0    0 29991976 84900 38723896  0  0  0  0  8308  73641 52 42   6  0  0
> >  0  0    0 30091828 84900 38723896  0  0  0  0  3996  30978 23 24  53  0  0
> >  0  0    0 30091968 84900 38723896  0  0  0  0    17     23  0  0 100  0  0
>
> Notice the huge %sy

My bet is on transparent hugepage defragmentation. Alternatively it's
scheduler overhead, due to superfluous context switches around the buffer
mapping locks.

I'd strongly suggest doing a "perf record -g -a ;
perf report" run to check what's eating up the time.

Greetings,

Andres Freund

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel Select query performance and shared buffers
On Wed, Dec 4, 2013 at 9:19 AM, Metin Doslu wrote:
>
> Here are the results of "vmstat 1" while running 8 parallel TPC-H Simple
> (#6) queries: Although there is no need for I/O, "wa" fluctuates between 0
> and 1.
>
> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
>  r  b swpd     free  buff    cache si so bi bo    in     cs us sy  id wa st
>  0  0    0 30093568 84892 38723896  0  0  0  0    22     14  0  0 100  0  0
>  8  1    0 30043056 84892 38723896  0  0  0  0 27080  52708 16 14  70  0  0
>  8  1    0 30006600 84892 38723896  0  0  0  0 44952 118286 43 44  12  1  0
>  8  0    0 29986264 84900 38723896  0  0  0 20 28043  95934 49 42   8  1  0
>  7  0    0 29991976 84900 38723896  0  0  0  0  8308  73641 52 42   6  0  0
>  0  0    0 30091828 84900 38723896  0  0  0  0  3996  30978 23 24  53  0  0
>  0  0    0 30091968 84900 38723896  0  0  0  0    17     23  0  0 100  0  0

Notice the huge %sy

What kind of VM are you using? HVM or paravirtual?

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
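As an aside, when eyeballing vmstat output gets tedious, the columns can be pulled apart programmatically. A minimal sketch in plain Python (not part of any patch; the sample row is taken from the vmstat output above, reading the 17 columns as r b swpd free buff cache si so bi bo in cs us sy id wa st):

```python
# Sketch: extract named columns (e.g. "sy", system CPU time) from vmstat rows.
FIELDS = "r b swpd free buff cache si so bi bo in cs us sy id wa st".split()

def parse_vmstat_row(line):
    """Map one vmstat data row to a dict of column name -> int."""
    values = [int(tok) for tok in line.split()]
    if len(values) != len(FIELDS):
        raise ValueError("unexpected vmstat row: %r" % line)
    return dict(zip(FIELDS, values))

sample = "8 1 0 30006600 84892 38723896 0 0 0 0 44952 118286 43 44 12 1 0"
row = parse_vmstat_row(sample)
print(row["sy"])  # prints 44
```

Feeding `vmstat 1` through such a filter makes it easy to alert on sustained high %sy, which is the symptom being discussed here.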
Re: [HACKERS] Proposed feature: Selective Foreign Keys
On Tue, Dec 3, 2013 at 5:57 PM, Tom Dunstan wrote: > On 4 December 2013 01:24, Robert Haas wrote: >> Yeah, more or less, but the key is ensuring that it wouldn't let you >> create the constraint in the first place if the partial index >> specified *didn't* match the WHERE clause. For example, suppose the >> partial index says WHERE parent_entity = 'event' but the constraint >> definition is WHERE parent_event = 'somethingelse'. That ought to >> fail, just as creating a regular foreign constraint will fail if >> there's no matching unique index. > > The where clause only applies to queries against the FK table, and we > don’t currently fail if there isn’t a matching index on the fk column > when creating a FK (I’ve been bitten by that before). > > We fail if there isn’t a unique index on the referenced > table/column(s), but queries against that table on insert/update not > the FK table are unchanged (save that we don’t bother with them at all > if the where clause expression fails for the given tuple). Oh. I misinterpreted what this feature was about, then. I thought it was about restricting the reference to a subset of the *referenced* table, but it seems to be about restricting the constraint to a subset of the *referencing* table. I guess they're both useful, but the syntax... REFERENCES tab(col) WHERE (stuff) ...sure looks like the WHERE clause is syntactically associated with the table being referenced. What would we do if we eventually wanted to support both variants? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RFC: programmable file format for postgresql.conf
On 04/12/13 16:51, Peter Eisentraut wrote:
> On 12/4/13, 1:42 AM, Álvaro Hernández Tortosa wrote:
>> IMHO, a data structure like the above would be completely
>> self-contained and allow any autoconfiguring tool or GUI tool to be
>> easily created, if the syntax is programmable. It would certainly make
>> the config file more verbose, but at the same time would help a lot of
>> users to configure postgres providing much more information.
>
> What you are describing appears to be isomorphic to XML and XML Schema.

I don't think XML would be a good idea. Even if it is both
programmatically and humanly editable (two of the features I was
suggesting for it), it is messy and very verbose for this purpose.

> Note that you are not required to maintain your configuration data in a
> postgresql.conf-formatted file. You can keep it anywhere you like, GUI
> around in it, and convert it back to the required format.

I think it is not a very good idea to encourage GUI tools or tools to
auto-configure postgres to use a separate configuration file and then
convert it to postgresql.conf. That introduces duplication, with nasty
problems if either source of data is modified in an unexpected way.
That's why I'm suggesting a config file that is, at the same time,
usable by both postgres and other external tools. That also enables
other features, such as editing the config file persistently through a
SQL session.

> Most of the metadata is available through postgres --describe-config,
> which is the result of a previous attempt in this area, which never
> really went anywhere.
>
> It's not like there are a bunch of GUI and autotuning tools that people
> are dying to use or developers are dying to create, but couldn't because
> editing configuration files programmatically is hard.

It might be a chicken-and-egg problem. Maybe it's hard and futile to
write these config tools since postgresql.conf doesn't support the
required features. I don't know how to measure the "interest of people",
but I have seen many comments on this mailing list about features like
this. IMHO it would be a great addition :)

> Let's also not forget the two main use cases (arguably) of the
> configuration files: hand editing, and generation by configuration
> management tools. Anything that makes these two harder is not going to
> be well-received.

100% agreed :) That's why I suggested that the format of the config file
should adhere to the requirements a) to e) mentioned in my original
email (http://www.postgresql.org/message-id/529b8d01.6060...@nosys.es).
Would a new file format be well-received if it keeps both hand editing
and generation of the configuration simple, and at the same time offers
the features I have mentioned?

Thanks for your comments,

aht

-- Álvaro Hernández Tortosa --- NOSYS Networked Open SYStems -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
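For illustration, one possible shape for such a self-describing, still hand-editable entry (every field name and the brace syntax here are invented; this is a sketch of the idea, not a concrete proposal):

```
# Hypothetical self-describing postgresql.conf entry (invented syntax):
shared_buffers = {
    value   = 128MB
    type    = memory
    min     = 128kB
    context = postmaster
    doc     = 'Amount of memory the server uses for shared buffers'
}
```

The point of the sketch is that the metadata now exposed only via postgres --describe-config would live next to the value itself, so both a human with an editor and a tool parsing the file see the same single source of truth.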
Re: [HACKERS] FDW: possible resjunk columns in AddForeignUpdateTargets
Ian Lawrence Barwick wrote:
> 2013/11/8 Tom Lane :
>> [ thinks for awhile... ] Hm. In principle you can put any expression
>> you want into the tlist during AddForeignUpdateTargets. However, if it's
>> not a Var then the planner won't understand that it's something that needs
>> to be supplied by the table scan, so things won't work right in any but
>> the most trivial cases (maybe not even then :-().
>>
>> What I'd try is creating a Var that has the attno of ctid
>> (ie, SelfItemPointerAttributeNumber) but the datatype you want, ie bytea.
>> This won't match what the catalogs say your table's ctid is, but I think
>> that nothing will care much about that.
>
> Apologies for reinvigorating this thread, but I'm running into a similar wall
> myself and would like to clarify if this approach will work at all.
>
> My foreign data source is returning a fixed-length string as a unique row
> identifier; in AddForeignUpdateTargets() I can create a Var like this:
>
>     var = makeVar(parsetree->resultRelation,
>                   SelfItemPointerAttributeNumber,
>                   BPCHAROID,
>                   32,
>                   InvalidOid,
>                   0);
>
> but is it possible to store something other than a TIDOID here, and if so how?

Subsequent analysis showed that this won't work, as you have no way to
populate such a resjunk column. resjunk columns seem to get filled with
the values from the column of the same name, so currently there is no
way to invent your own column, fill it and pass it on. See thread
8b848b463a71b7a905bc5ef18b95528e.squir...@sq.gransy.com

What I ended up doing is introduce a column option that identifies a
primary key column. I add a resjunk entry for each of those and use them
to identify the correct row during an UPDATE or DELETE.

That only works for foreign data sources that have a concept of a
primary key, but maybe you can do something similar.

Yours,
Laurenz Albe

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
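To illustrate the approach Laurenz describes, here is a condensed sketch of per-key-column resjunk targets in an FDW's AddForeignUpdateTargets callback. This is PostgreSQL backend code, so it is not standalone; nkeys, key_attnos, key_types and key_names are hypothetical variables assumed to be derived from the FDW's column options, and error handling is omitted:

```c
/* Sketch only: add one resjunk target per primary-key column so that
 * ExecForeignUpdate/ExecForeignDelete can identify the remote row.
 */
static void
myAddForeignUpdateTargets(Query *parsetree,
                          RangeTblEntry *target_rte,
                          Relation target_relation)
{
    int i;

    for (i = 0; i < nkeys; i++)
    {
        Var         *var;
        TargetEntry *tle;

        /* A real Var for the key column, with its real attno and type,
         * so the planner knows the scan must produce it. */
        var = makeVar(parsetree->resultRelation,
                      key_attnos[i],
                      key_types[i],
                      -1,            /* typmod */
                      InvalidOid,    /* collation */
                      0);

        /* Wrap it in a resjunk target entry named after the column. */
        tle = makeTargetEntry((Expr *) var,
                              list_length(parsetree->targetList) + 1,
                              pstrdup(key_names[i]),
                              true);    /* resjunk */

        parsetree->targetList = lappend(parsetree->targetList, tle);
    }
}
```

Because each Var refers to an actual column of the foreign table, the "filled with the values from the column of the same name" behavior described above works in the FDW's favor here, unlike the invented-column approach that failed.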
Re: [HACKERS] Minor patch for the uuid-ossp extension
On 11/23/13, 7:12 AM, Mario Weilguni wrote: > Well, in that case and since this is a rarely used extension (I guess > so), maybe it would be the best to simply rename that extension to > uuidossp (or whatever) and don't make any special treatment for it? Why? This is a solved problem, and renaming the extension would only cause unnecessary work. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RFC: programmable file format for postgresql.conf
On 12/4/13, 1:42 AM, Álvaro Hernández Tortosa wrote: > IMHO, a data structure like the above would be completely > self-contained and allow any autoconfiguring tool or GUI tool to be > easily created, if the syntax is programmable. It would certainly make > the config file more verbose, but at the same time would help a lot of > users to configure postgres providing much more information. What you are describing appears to be isomorphic to XML and XML Schema. Note that you are not required to maintain your configuration data in a postgresql.conf-formatted file. You can keep it anywhere you like, GUI around in it, and convert it back to the required format. Most of the metadata is available through postgres --describe-config, which is the result of a previous attempt in this area, which never really went anywhere. It's not like there are a bunch of GUI and autotuning tools that people are dying to use or developers are dying to create, but couldn't because editing configuration files programmatically is hard. Let's also not forget the two main use cases (arguably) of the configuration files: hand editing, and generation by configuration management tools. Anything that makes these two harder is not going to be well-received. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers