Re: [HACKERS] [GENERAL] VACUUM touching file but not updating relation
On Fri, Nov 18, 2011 at 3:18 PM, Tom Lane t...@sss.pgh.pa.us wrote: So the correct number of WAL records is emitted and I see no bug there. What Thom's complaining about is that the buffer may be marked dirty unnecessarily, ie when there has been no actual data change. Based on feedback from both of you, I made a change to stop the block being marked dirty, though Tom now wants that removed. Thom, the md5 checksum behaviour you analysed earlier is not happening because of the section of code you identified: that code sets some data on the page, which would cause the md5 checksum to change, so it cannot be the btree code at _bt_delitems_vacuum() that is causing this. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [COMMITTERS] pgsql: Avoid marking buffer dirty when VACUUM has no work to do.
On Tue, Nov 22, 2011 at 2:32 AM, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: Simon Riggs si...@2ndquadrant.com writes: Avoid marking buffer dirty when VACUUM has no work to do. When wal_level = 'hot_standby' we touched the last page of the relation during a VACUUM, even if nothing else had happened. That would alter the LSN of the last block and set the mtime of the relation file unnecessarily. Noted by Thom Brown. This doesn't look right to me --- you have not accounted for the possibility that btpo_cycleid or BTP_HAS_GARBAGE is changed. Also, I'm confused about the business of not setting the LSN. Thom claimed that he was seeing the page not change at all (or at least md5sum of the file didn't change) despite mtime changing. If we'd been plastering a new LSN on the page each time, then that should certainly not have been possible. So I now think maybe we've mis-analyzed what was happening in his example. I think this requires more careful analysis. Ping? If you don't respond, I'm going to take it on my own authority to revert this patch, because it's definitely broken as-is, and I don't think the consequences of not updating the page LSN have been thought through either. Tom, waiting across a weekend isn't a cause for concern. I made that change for you, so am happy to revoke for you also. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow substitute allocators for PGresult.
Hello, At Fri, 11 Nov 2011 11:29:30 +0200, Heikki Linnakangas wrote: You could use the resource owner mechanism to track them. Register a callback function with RegisterResourceReleaseCallback(). Thank you for letting me know about it. I have dug up a message in pgsql-hackers referring to the mechanism in a discussion about postgresql-fdw. I'll put further thought into dblink-plus taking it into account. By the way, memory management for the result in libpq is worth considering as a separate issue. At Sat, 12 Nov 2011 12:29:50 -0500, Tom Lane wrote: To start with, the patch proposes exposing some global variables that affect the behavior of libpq process-wide. This seems like a pretty bad design, because a single process could contain multiple usages of libpq You're right that the design is bad. I designed it to have minimal impact on libpq by limiting its usage and placing the responsibility on the users, that is, the developers of the modules using it. If there are other applications that want to use their own allocators, there are some points to be considered. Considering multi-threading, I think it is preferable to make libpq write the PGresult into memory blocks passed in from the application, as OCI does, instead of letting libpq itself request them. This approach hands the responsibility for memory management to the user and gives them the ability to avoid memory exhaustion by their own measures. On the other hand, this way can produce the situation where libpq cannot write all of the data received from the server into the memory block it was handed. So the API must be able to return partial data to the caller. Going further, if libpq could store the result directly into user-allocated memory using a tuplestore-like interface, that would be better for performance if the final storage is a tuplestore itself. I would be happy with part-by-part passing of the result, so I will think about this as the next issue. So I'd feel happier about the patch if it came along with a patch to make dblink.c use it to prevent memory leaks. I take it this is about my original patch. Mmm, I heard that dblink copies received data from the PGresult to a tuplestore not only because of the memory leak, but also for lower memory usage (after the copy is finished). I think I could show you a patch ignoring the latter, but it might take some time for me to understand dblink and tuplestore closely... If I find RegisterResourceReleaseCallback falls short of our requirements, I will show it. If not, I will withdraw this patch from the ongoing CF and propose another patch based on the discussion above at another time. Please let me have a little more time. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
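A minimal sketch of the mechanism Heikki refers to, for readers following along (the function and variable names are invented for illustration, and the single-slot tracking is a deliberate simplification, not how dblink-plus actually works):

    #include "postgres.h"
    #include "libpq-fe.h"
    #include "utils/resowner.h"

    static PGresult *tracked_result = NULL;   /* hypothetical single-slot tracker */

    static void
    cleanup_pgresult_callback(ResourceReleasePhase phase,
                              bool isCommit, bool isTopLevel, void *arg)
    {
        /* On transaction abort, free any result that was never released. */
        if (phase == RESOURCE_RELEASE_AFTER_LOCKS && !isCommit &&
            tracked_result != NULL)
        {
            PQclear(tracked_result);
            tracked_result = NULL;
        }
    }

    void
    mymodule_init(void)
    {
        RegisterResourceReleaseCallback(cleanup_pgresult_callback, NULL);
    }

The callback fires for each resource owner at each release phase, so a real module would track results per owner rather than in a single static pointer.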
Re: [HACKERS] dblink: enable parameters
On 22.11.2011 06:55, Pavel Stehule wrote: I know dblink is a deprecated interface - but it still has necessary functionality - it supports writable statements. It would be nice to add support for SQL/MED passthrough mode... That would allow you to do any queries/updates to foreign servers. It doesn't sound very difficult at first glance, though I'm not sure what it would mean for our parser, for example. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] dblink: enable parameters
On Tue, Nov 22, 2011 at 20:09, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: It would be nice to add support for SQL/MED passthrough mode... That would allow you to do any queries/updates to foreign servers. It doesn't sound very difficult at first glance, though I'm not sure what it would mean for our parser, for example. I think passthrough mode is almost impossible, or has very limited usage, because we cannot pass query texts that our parser doesn't accept; we cannot support foreign-table-specific queries. On 22.11.2011 06:55, Pavel Stehule wrote: I know dblink is a deprecated interface - but it still has necessary functionality - it supports writable statements. So, dblink on an FDW connection seems to be a possible solution. We pass the query in the form of plain text. -- Itagaki Takahiro -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
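For what it's worth, a sketch of what "dblink on an FDW connection" can already look like today, with all object names invented for the example (the foreign-data-wrapper setup follows the pattern in the dblink documentation, but verify the details before relying on them):

    -- illustrative names only
    CREATE FOREIGN DATA WRAPPER postgresql VALIDATOR postgresql_fdw_validator;
    CREATE SERVER remotedb FOREIGN DATA WRAPPER postgresql
        OPTIONS (host 'remotehost', dbname 'app', port '5432');
    CREATE USER MAPPING FOR CURRENT_USER SERVER remotedb
        OPTIONS (user 'app', password 'secret');

    -- dblink accepts a foreign server name in place of a connection string,
    -- including for writable statements:
    SELECT dblink_exec('remotedb', 'UPDATE t SET flag = true WHERE id = 1');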
Re: [HACKERS] explain analyze query execution time
On 22/11/11 02:58, Kevin Grittner wrote: Rudyar wrote: I am trying to get the execution time of a query workload. I tried using EXPLAIN ANALYZE, but this time is always higher than the execution time of the query through a client like pgAdmin3. What is the reason for that difference? It's the observer effect -- there is a cost to the timing, counting, measuring, and reporting which is done by EXPLAIN ANALYZE, which distorts what is being measured. It's just about impossible to avoid entirely. -Kevin Thanks Kevin. What tool do you recommend for measuring a query's real execution time? -- Rudyar Cortés. Computer Engineering student, Universidad Técnica Federico Santa María. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Review: CHECK FUNCTION statement
I tried to look at the patch, but it does not apply to current master, probably because of bit rot. Can you submit an updated version? The patch contains docs and regression tests and is in context diff format. I'll mark it Waiting for Author. Yours, Laurenz Albe -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] explain analyze query execution time
Rudyar wrote: What tool do you recommend for measuring a query's real execution time? The -hackers list is for discussion to coordinate development of the PostgreSQL database product. For user questions like this, please pick a more appropriate list based on the descriptions here: http://www.postgresql.org/community/lists/ While any further follow-up should be on another list, I'll briefly answer here. EXPLAIN ANALYZE is great for seeing how a query is being executed, but for accurate timing of how long the query runs without generating all that extra information, you can measure it on the client side, or turn on logging of statements running longer than some particular time. In psql you can use \timing on, in Java you can call System.currentTimeMillis() before and after running the query, etc. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
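For example, in psql (table name and timing figure are invented for illustration):

    \timing on
    SELECT count(*) FROM orders;
    Time: 12.345 ms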
Re: [HACKERS] explain analyze query execution time
On 22/11/11 10:26, Kevin Grittner wrote: Rudyar wrote: What tool do you recommend for measuring a query's real execution time? The -hackers list is for discussion to coordinate development of the PostgreSQL database product. For user questions like this, please pick a more appropriate list based on the descriptions here: http://www.postgresql.org/community/lists/ While any further follow-up should be on another list, I'll briefly answer here. EXPLAIN ANALYZE is great for seeing how a query is being executed, but for accurate timing of how long the query runs without generating all that extra information, you can measure it on the client side, or turn on logging of statements running longer than some particular time. In psql you can use \timing on, in Java you can call System.currentTimeMillis() before and after running the query, etc. -Kevin Thanks Kevin ;) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: CHECK FUNCTION statement
2011/11/22 Albe Laurenz laurenz.a...@wien.gv.at: I tried to look at the patch, but it does not apply to current master, probably because of bit rot. Can you submit an updated version? The patch contains docs and regression tests and is in context diff format. I'll mark it Waiting for Author. Please wait - I plan to work on this topic on Thursday. Regards Pavel Yours, Laurenz Albe -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Singleton range constructors versus functional coercion notation
On Mon, Nov 21, 2011 at 9:55 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Nov 20, 2011, at 10:24 PM, Jeff Davis pg...@j-davis.com wrote: Well, if there were a good shorter notation, then probably so. But it doesn't look like we have a good idea, so I'm fine with dropping it. We should also keep in mind that people who use range types can and likely will define their own convenience functions. If people use singletons, or open ranges, or closed ranges, or one-hour timestamp ranges frequently, they can make their own notational shorthand with a 3-line CREATE FUNCTION statement. We don't need to have it all in core. But if you believe that, what syntax do you think people are likely to try if they want a singleton range constructor? Leaving the user to discover the problem and try to invent a workaround is not better than doing it ourselves ... Wait, do I hear the great Tom Lane arguing for putting more than the minimal amount of stuff in core? :-) I honestly don't know what function names people will pick, and I don't care. Someone might like singleton(x), which would be impractical as a built-in because there could be more than one range type over the same base type, but if the user defines the function they can pick what's convenient for them. If they use singletons exceedingly frequently they might even want something really short, like just(x) or s(x). Or they might say daterange1(x), along the lines you suggested earlier. The point is that by not defining more than necessary in core, we give the user the flexibility to do what they want. In cases where that amounts to handing them a loaded gun with the safety off, we shouldn't do it, but that doesn't seem to be the case here. It doesn't take a lot of programming acumen to write a function that passes two copies of its single argument to a built-in. The only mistake anyone's likely to make is forgetting to declare it non-VOLATILE. But the real point is that I don't think we should assume that singleton ranges are unique in being things for which people will want shortcuts. We talked about having a behavior-changing GUC to control whether the bounds are [) or [] or () or (], but we didn't do it, because behavior-changing GUCs are a bad idea. But I fully expect that people who make heavy use of range types will (and we should encourage them to) define convenience functions with names of their own choosing that pass the bounds that are useful to them. If that isn't the very model of a use-case for inline-able SQL functions, I'm not sure what is. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
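To make the point concrete, a user-level shorthand might look like this - a sketch only, with names chosen arbitrarily, and assuming the three-argument daterange(lower, upper, bounds) constructor discussed elsewhere in this thread:

    CREATE FUNCTION singleton(date) RETURNS daterange AS
        $$ SELECT daterange($1, $1, '[]') $$
        LANGUAGE sql IMMUTABLE;

    SELECT singleton(current_date);    -- e.g. [2011-11-22,2011-11-23)

Marking it IMMUTABLE rather than leaving the default VOLATILE avoids the one pitfall Robert mentions.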
Re: [HACKERS] EXPLAIN (plan off, rewrite off) for benchmarking
On Mon, Nov 21, 2011 at 8:54 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: ... Maybe we could find a way to reduce the size of the parse tree (i.e. fewer nodes), or the number of times that it has to be walked/copied. We could eliminate some annoying tree-copy steps if we could institute the policy that parse analysis doesn't scribble on the raw parse tree, rewriter doesn't modify parse analysis output, and planner doesn't modify rewriter output. However, it would be a lot of work, and I'm not entirely sure that we'd end up with a significant speed benefit. In a lot of places, the only way to not scribble on the input is to copy it anyway ...

This is probably a stupid question, but why does it matter if parse analysis scribbles on the raw parse tree, or the rewriter on the parse analysis output? I understand that we may sometimes need to replan the output of the rewriter, so we'd better not modify it destructively, but I would have thought that parsing and parse analysis would always be done together, in which case it doesn't obviously matter. I'm probably missing something here...

Another thing we might want to consider doing is introducing some kind of container data structure other than List. I think that trying to change the semantics of the existing List datatype in any meaningful way is probably doomed, but if we introduce a new abstraction and gradually convert things over, we have a lot more flexibility. What I'm imagining is something that is optimized for holding a small number of pointers (say, 4) that are normally added in order, but which can grow if needed, at some cost in performance. For example:

    struct Thingy
    {
        unsigned short nused;
        unsigned short nalloc;
        struct Thingy *next;    /* if we run out of elements in this Thingy,
                                 * we can link to another Thingy with more space */
        void       *item[FLEXIBLE_ARRAY_MEMBER];   /* really nalloc */
    };

This would mean fewer palloc() calls and less overall memory usage than a List, and might also improve memory access locality. Right now, walking a three-element list could involve pulling in one cache line for the list, three more for the list cells, and then (presumably) three more for the elements themselves. There's not much help for the fact that the elements might end up in different cache lines, but at least if we did something like this you wouldn't need to bounce around to find the list cells. This is particularly obnoxious for things like RangeVars, where we build up the list representation mostly as an intermediate step and then flatten it again.

Now, one obvious problem with this line of attack is that each individual patch in this area isn't likely to save anything noticeable on real workloads. But I don't think that should stop us from trying. With every performance patch that goes in, the number of workloads that are dominated by the cost of backend-local memory allocation is growing. The cost is extremely distributed and it's very hard to pin down where it's all coming from, but we've gotta start somewhere. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
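Purely as an illustration of how such a container might be used, here is a possible append operation; the chunk-doubling growth policy and the palloc sizing are assumptions of this sketch, not part of the proposal above:

    #include "postgres.h"

    typedef struct Thingy
    {
        unsigned short nused;
        unsigned short nalloc;
        struct Thingy *next;        /* overflow chunk, normally NULL */
        void       *item[FLEXIBLE_ARRAY_MEMBER];   /* really nalloc entries */
    } Thingy;

    /* Append a pointer, spilling into a larger overflow chunk when full. */
    static void
    thingy_append(Thingy *t, void *datum)
    {
        while (t->nused == t->nalloc)
        {
            if (t->next == NULL)
            {
                Thingy *n = palloc(offsetof(Thingy, item) +
                                   2 * t->nalloc * sizeof(void *));

                n->nused = 0;
                n->nalloc = 2 * t->nalloc;
                n->next = NULL;
                t->next = n;
            }
            t = t->next;
        }
        t->item[t->nused++] = datum;
    }

The common case (fewer than nalloc items, added in order) touches a single chunk and needs no extra palloc() at all, which is the point of the exercise.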
Re: [HACKERS] EXPLAIN (plan off, rewrite off) for benchmarking
Robert Haas robertmh...@gmail.com writes: On Mon, Nov 21, 2011 at 8:54 PM, Tom Lane t...@sss.pgh.pa.us wrote: We could eliminate some annoying tree-copy steps if we could institute the policy that parse analysis doesn't scribble on the raw parse tree, rewriter doesn't modify parse analysis output, and planner doesn't modify rewriter output. This is probably a stupid question, but why does it matter if parse analysis scribbles on the raw parse tree, or the rewriter on the parse analysis output? Because we frequently need to save the original tree for possible re-analysis later. This doesn't matter in the simple-query protocol, but it does matter in any code path that involves plancache. I understand that we may sometimes need to replan the output of the rewriter, so we'd better not modify it destructively, but I would have thought that parse and parse analysis would always be done together, in which case it doesn't obviously matter. No, actually it's the raw grammar output tree that gets saved for re-analysis in case we are told of a DDL change. (I've considered having the code save only the original query string, but then you'd have to repeat the flex/bison work, and those things show up high enough on any profile to make it seem unlikely that this is cheaper than copying the parse tree.) It's possible that we don't need a read-only guarantee for the rewriter versus parse analysis output, but I doubt that helps much. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Singleton range constructors versus functional coercion notation
On Tue, 2011-11-22 at 09:07 -0500, Robert Haas wrote: I honestly don't know what function names people will pick, and I don't care. Someone might like singleton(x), which would be impractical as a built-in because there could be more than one range type over the same base type, but if the user defines the function they can pick what's convenient for them. If they use singletons exceedingly frequently they might even want something really short, like just(x) or s(x). Or they might say daterange1(x), along the lines you suggested earlier. For that matter, they might pick daterange(x), as I picked earlier, and run into the same problems. It's a little strange that we allow people to define functions with one argument and the same name as a type if such functions are confusing. This isn't intended as an argument in either direction, just an observation. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] strange nbtree corruption report
Excerpts from Tom Lane's message of mar nov 22 01:14:33 -0300 2011: Alvaro Herrera alvhe...@alvh.no-ip.org writes: ERROR: left link changed unexpectedly in block 3378 of index index_name CONTEXT: automatic vacuum of table table_name This was reported not once, but several dozens of times, by each new autovacuum worker that tried to vacuum the table. As far as I can see, there is just no way for this to happen ... much less happen repeatedly. It's not hard to believe that that would happen repeatedly given a corrupted set of sibling links, eg deletable page A links left to page B, which links right to C, which links right to A. The question is how the index got into such a state. A dropped update during a page split would explain it (ie, B used to be A's left sibling, then at some point B got split into B and C, but A's left-link never got updated on disk). I wonder how reliable their disk+filesystem is ... Well, there are no other signs of random data corruption, such as toast pointers getting corrupted which is the number one symptom showing up when underlying storage is flaky. However, it may be possible that there was a transient storage problem which only affected this one page; if this persisted in the way you describe, it might well explain these symptoms. Another thing I noticed is that there was corruption in heap pages (not the same server, though; it was a different Londiste slave). This was even more strange; the pages would be completely fine, except the first six words corresponding to the page header; they would be all zeros. When filled with valid-looking data (mostly I copied the bytes from neighbor pages), the rest of the page would decode fine. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Singleton range constructors versus functional coercion notation
Jeff Davis pg...@j-davis.com writes: It's a little strange that we allow people to define functions with one argument and the same name as a type if such functions are confusing. As long as your mental model is that such a function is a cast, everything is fine. The trouble with the range constructors is that they aren't really casts, as shown by the fact that when you read textrange('foo') you expect 'foo' to be text and not textrange. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] testing ProcArrayLock patches
On Tue, Nov 22, 2011 at 4:40 AM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Pavan Deolasee pavan.deola...@gmail.com wrote: It will be a great help if you could spare a few minutes to also test the patch to take out the frequently accessed PGPROC members to a different array. We are seeing good improvements on the HP-UX IA platform and the AMD Opteron, and it will be interesting to know what happens on the Intel platform too. For a read-only comparison (which was run using the simple protocol), using identical settings to the previous master run, but with the PGPROC split patch:

    m32  tps = 201738.209348 (including connections establishing)
    p32  tps = 201620.966988 (including connections establishing)
    m128 tps = 352159.631878 (including connections establishing)
    p128 tps = 363998.703900 (including connections establishing)

Clearly a win at 128 clients; not at 32. For updates:

    sm32  tps = 27392.393850 (including connections establishing)
    sp32  tps = 27995.784333 (including connections establishing)
    sm128 tps = 22261.902571 (including connections establishing)
    sp128 tps = 23690.408272 (including connections establishing)
    pm32  tps = 34983.352396 (including connections establishing)
    pp32  tps = 36076.373389 (including connections establishing)
    pm128 tps = 24164.441954 (including connections establishing)
    pp128 tps = 27070.824588 (including connections establishing)

That's a pretty decisive win all around. Thanks for running those tests. The numbers are not that bad, but definitely not as good as what we saw on some other platforms. But it's possible that they may improve in percentage terms with an even larger number of clients on this box. And given that we are seeing big gains on other platforms, hopefully this gives us confidence to proceed with the patch. Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] testing ProcArrayLock patches
Pavan Deolasee pavan.deola...@gmail.com wrote: The numbers are not that bad, but definitely not as good as what we saw on some other platforms. Well, this machine is definitely designed to hold up under high concurrency. As I understand it, each core is the memory manager for two 4GB DIMMs, with two channels to them, each with two buffers. The way the cores are connected, a core never needs to go through more than one other core to get to memory it doesn't directly manage, and that uses snoop technology which hands the cached data right over from one core to the other when possible, rather than making the core which now owns the cache line pull it from RAM. It seems the 2.6.32 kernel is able to manage that technology in a reasonable fashion. At first I was surprised to see performance top out on the update tests between 80 and 96 clients. But then, that lands almost exactly where my old reliable ((2 * core count) + effective spindle count) would predict. The SELECT-only tests peaked at 64 clients, but those were fully cached, so effective spindle count was zero, again fitting the formula. So these optimizations seem to me to break down the barriers which had previously capped the number of clients which could be handled, letting them peak at their natural levels. But it's possible that they may improve in percentage terms with an even larger number of clients on this box. I think so; I think this box is just so scalable that at 128 clients we were just barely getting past the knee in the performance graphs to where these patches help most. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] vpath builds and verbose error messages
On Fri, 2011-11-18 at 09:44 -0500, Tom Lane wrote: It wouldn't be that hard for elog.c to do strrchr(fname, '/') or something like that, but the idea that there are hundreds of full-path strings embedded in the executable is a bit annoying. I guess we could hope that the compiler is bright enough to store it only once per source file, so the actual space requirement may not be all *that* bad. A look at the output of strings shows that the compiler does appear to optimize this. So your suggestion is probably the way to go. One thing that isn't so nice about all this is that it embeds the personal directory structure of the builder of the binary into the shipped product. But gcc's cpp doesn't like redefining __FILE__, so the only way to get around that altogether would be to use something other than __FILE__ and define that for all builds. Might not be worth it. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] WIP: index support for regexp search
Hackers, a WIP patch with index support for regexp search in the pg_trgm contrib module is attached. Unlike techniques which extract continuous text parts from a regexp, this patch presents a technique of automaton transformation, which allows more comprehensive trigram extraction. A small example of the possible performance benefit:

test=# explain analyze select * from words where s ~ 'a[bc]+[de]';
                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Seq Scan on words  (cost=0.00..1703.11 rows=10 width=9) (actual time=3.092..242.303 rows=662 loops=1)
   Filter: (s ~ 'a[bc]+[de]'::text)
   Rows Removed by Filter: 97907
 Total runtime: 243.213 ms
(4 rows)

test=# explain analyze select * from words where s ~ 'a[bc]+[de]';
                                                         QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on words  (cost=260.08..295.83 rows=10 width=9) (actual time=4.166..7.506 rows=662 loops=1)
   Recheck Cond: (s ~ 'a[bc]+[de]'::text)
   Rows Removed by Index Recheck: 18
   ->  Bitmap Index Scan on words_trgm_idx  (cost=0.00..260.07 rows=10 width=0) (actual time=4.076..4.076 rows=680 loops=1)
         Index Cond: (s ~ 'a[bc]+[de]'::text)
 Total runtime: 8.424 ms
(6 rows)

The current version of the patch has some limitations: 1) The algorithm which extracts a logical expression over trigrams from the regexp has high computational complexity, so it can become really slow on regexps with many branches. Improvements to this algorithm are probably possible. 2) Naturally, there is no performance benefit if no trigrams can be extracted from the regexp; that is inevitable. 3) Currently, only GIN indexes are supported. There are no serious problems here; the GiST code just has not been written yet. 4) There appears to be a problem extracting multibyte encoded characters from pg_wchar. I've posted a question about it here: http://archives.postgresql.org/pgsql-hackers/2011-11/msg01222.php For now I've hardcoded a dirty solution, so PG_EUC_JP, PG_EUC_CN, PG_EUC_KR, PG_EUC_TW, PG_EUC_JIS_2004 are not supported yet. -- With best regards, Alexander Korotkov. trgm-regexp-0.1.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
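For reference, the index assumed by the second plan would be built with pg_trgm's GIN operator class, roughly like this (table and index names follow the example above):

    CREATE EXTENSION pg_trgm;
    CREATE INDEX words_trgm_idx ON words USING gin (s gin_trgm_ops);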
Re: [HACKERS] vpath builds and verbose error messages
Excerpts from Peter Eisentraut's message of mar nov 22 16:22:15 -0300 2011: One thing that isn't so nice about all this is that it embeds the personal directory structure of the builder of the binary into the shipped product. But gcc's cpp doesn't like redefining __FILE__, so the only way to get around that altogether would be to use something other than __FILE__ and define that for all builds. Might not be worth it. This is similar to what's suggested here: http://stackoverflow.com/questions/1591873/how-do-i-write-a-cpp-dir-macro-similar-to-file -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pg_upgrade relation OID mismatches
I thought these were fixed a while ago, but I'm still seeing these when upgrading from master to self (using the testing script sent in a while ago). This is completely reproducible. What's happening? ... Restoring user relation files /home/peter/devel/postgresql/git/postgresql/contrib/pg_upgra Mismatch of relation OID in database regression: old OID 16701, new OID 16689 Failure, exiting -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Not HOT enough
Some time ago, I changed GetOldestXmin() to ignore procs in other databases, resulting in a potentially later xmin. GetSnapshotData() was not touched when that happened, even though the comments say ...This is the same computation done by GetOldestXmin(true, true). The transam/README file says it more strongly: GetSnapshotData also performs an oldest-xmin calculation (which had better match GetOldestXmin's). Doh. As a result, VACUUM ignores procs in other databases, whereas HOT does not. That means we aren't cleaning up as much as we could when running multiple databases. If it's OK for VACUUM, then it must be OK for HOT cleanup also. Attached patch ignores procs in other databases during GetSnapshotData() when IsMVCCSnapshot(), using similar coding to GetOldestXmin(). Any doubters? I suggest this is backpatched a few releases. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services fix_getsnapshotdata.v1.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
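For readers without the source at hand, the per-database filter that GetOldestXmin() already applies looks roughly like the predicate below - a paraphrase for illustration, not the attached patch:

    #include "postgres.h"
    #include "miscadmin.h"          /* MyDatabaseId */
    #include "storage/proc.h"

    /*
     * Paraphrase: should this proc's xmin be considered when computing
     * the oldest xmin (or, with the patch, a snapshot's xmin)?
     */
    static inline bool
    proc_counts_for_xmin(volatile PGPROC *proc, bool allDbs)
    {
        return allDbs ||
               proc->databaseId == MyDatabaseId ||
               proc->databaseId == 0;   /* shared relations, walsenders, etc. */
    }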
Re: [HACKERS] pg_upgrade relation OID mismatches
Peter Eisentraut wrote: I thought these were fixed a while ago, but I'm still seeing these when upgrading from master to self (using the testing script sent in a while ago). This is completely reproducible. What's happening? ... Restoring user relation files /home/peter/devel/postgresql/git/postgresql/contrib/pg_upgra Mismatch of relation OID in database regression: old OID 16701, new OID 16689 Failure, exiting Yes, I certainly thought they were all addressed. What object is 16701 in the old database? Anything unusual about it? This is saying the relation oid was not preserved. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review for Add permission check on SELECT INTO
On Mon, Nov 21, 2011 at 6:38 AM, Albe Laurenz laurenz.a...@wien.gv.at wrote: Kohei KaiGai wrote: The attached patch is a revised version. It fixed up this bug, and revised test cases to ensure permission check error shall be raised due to the new table. Thanks. The second patch seems fine to me, I'll mark it ready for committer. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Not HOT enough
On Tue, Nov 22, 2011 at 3:23 PM, Simon Riggs si...@2ndquadrant.com wrote: Some time ago, I changed GetOldestXmin() to ignore procs in other databases, resulting in a potentially later xmin. GetSnapshotData() was not touched when that happened, even though the comments say ...This is the same computation done by GetOldestXmin(true, true). The transam/README file says it more strongly: GetSnapshotData also performs an oldest-xmin calculation (which had better match GetOldestXmin's). Doh. As a result, VACUUM ignores procs in other databases, whereas HOT does not. That means we aren't cleaning up as much as we could when running multiple databases. If it's OK for VACUUM, then it must be OK for HOT cleanup also. Attached patch ignores procs in other databases during GetSnapshotData() when IsMVCCSnapshot(), using similar coding to GetOldestXmin(). Any doubters? I think this is unsafe for shared catalogs. I suggest this is backpatched a few releases. -1. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Optimize postgres protocol for fixed size arrays
Hi, During conversion of the jdbc driver to use binary encoding when receiving array objects from postgres it was noticed that, for example for int[] arrays, the binary encoding is normally 30% to 200% larger in bytes than the text encoding. This was of concern to some users with slower links to the database. Even though the encoded size was larger, the binary encoding was still consistently faster (with a 50% speed-up for float[]). Here is a patch that adds a new flag to the protocol that is set when all elements of the array are of the same fixed size. When the bit is set the 4 byte length is only sent once and not for each element. Another restriction is that the flag can only be set when there are no NULLs in the array. The postgres part is my first try at the problem and I really need some feedback on the detection of fixed-size elements. I just made a guess and noticed that typlen != 0 seemed to work for the basic types I used for testing. First the postgres patch:

diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index bfb6065..970272f 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -86,7 +86,7 @@ static void ReadArrayBinary(StringInfo buf, int nitems,
 				FmgrInfo *receiveproc, Oid typioparam, int32 typmod,
 				int typlen, bool typbyval, char typalign,
 				Datum *values, bool *nulls,
-				bool *hasnulls, int32 *nbytes);
+				bool *hasnulls, int32 *nbytes, bool fixedlen);
 static void CopyArrayEls(ArrayType *array,
 			 Datum *values, bool *nulls, int nitems,
 			 int typlen, bool typbyval, char typalign,
@@ -1242,7 +1242,7 @@ array_recv(PG_FUNCTION_ARGS)
 						ndim, MAXDIM)));
 	flags = pq_getmsgint(buf, 4);
-	if (flags != 0 && flags != 1)
+	if ((flags & ~3) != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_BINARY_REPRESENTATION),
 				 errmsg("invalid array flags")));
@@ -1327,7 +1327,7 @@ array_recv(PG_FUNCTION_ARGS)
 					my_extra->proc, typioparam, typmod,
 					typlen, typbyval, typalign,
 					dataPtr, nullsPtr,
-					&hasnulls, &nbytes);
+					&hasnulls, &nbytes, (flags & 2) != 0);
 	if (hasnulls)
 	{
 		dataoffset = ARR_OVERHEAD_WITHNULLS(ndim, nitems);
@@ -1390,7 +1390,8 @@ ReadArrayBinary(StringInfo buf,
 				Datum *values, bool *nulls,
 				bool *hasnulls,
-				int32 *nbytes)
+				int32 *nbytes,
+				bool fixedlen)
 {
 	int			i;
 	bool		hasnull;
@@ -1403,7 +1404,7 @@ ReadArrayBinary(StringInfo buf,
 		char		csave;

 		/* Get and check the item length */
-		itemlen = pq_getmsgint(buf, 4);
+		itemlen = fixedlen ? typlen : pq_getmsgint(buf, 4);
 		if (itemlen < -1 || itemlen > (buf->len - buf->cursor))
 			ereport(ERROR,
 					(errcode(ERRCODE_INVALID_BINARY_REPRESENTATION),
@@ -1496,6 +1497,7 @@ array_send(PG_FUNCTION_ARGS)
 	int			bitmask;
 	int			nitems,
 				i;
+	int			flags;
 	int			ndim,
 			   *dim;
 	StringInfoData buf;
@@ -1535,6 +1537,8 @@ array_send(PG_FUNCTION_ARGS)
 	typbyval = my_extra->typbyval;
 	typalign = my_extra->typalign;

+	flags = ARR_HASNULL(v) ? 1 : (typlen > 0 ? 2 : 0);
+
 	ndim = ARR_NDIM(v);
 	dim = ARR_DIMS(v);
 	nitems = ArrayGetNItems(ndim, dim);
@@ -1543,7 +1547,7 @@ array_send(PG_FUNCTION_ARGS)
 	/* Send the array header information */
 	pq_sendint(&buf, ndim, 4);
-	pq_sendint(&buf, ARR_HASNULL(v) ? 1 : 0, 4);
+	pq_sendint(&buf, flags, 4);
 	pq_sendint(&buf, element_type, sizeof(Oid));
 	for (i = 0; i < ndim; i++)
 	{
@@ -1571,7 +1575,8 @@ array_send(PG_FUNCTION_ARGS)
 			itemvalue = fetch_att(p, typbyval, typlen);
 			outputbytes = SendFunctionCall(&my_extra->proc, itemvalue);
-			pq_sendint(&buf, VARSIZE(outputbytes) - VARHDRSZ, 4);
+			if ((flags & 2) == 0)
+				pq_sendint(&buf, VARSIZE(outputbytes) - VARHDRSZ, 4);
 			pq_sendbytes(&buf, VARDATA(outputbytes),
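As a reading aid, the wire layout implied by the patch above is roughly the following; the bit assignments are the patch's proposal, not part of the released protocol:

    /*
     * Array binary representation as emitted by array_send() with this patch:
     *
     *   int32  ndim       number of dimensions
     *   int32  flags      bit 0 = array contains NULLs (existing)
     *                     bit 1 = all elements have the same fixed length,
     *                             per-element lengths omitted (proposed;
     *                             never set together with bit 0)
     *   Oid    elemtype   element type OID
     *   per dimension:    int32 dimension size, int32 lower bound
     *   per element:      int32 length (only when bit 1 is clear),
     *                     followed by the element data
     */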
Re: [HACKERS] pg_upgrade relation OID mismatches
On Tue, 2011-11-22 at 15:42 -0500, Bruce Momjian wrote: Peter Eisentraut wrote: I thought these were fixed a while ago, but I'm still seeing these when upgrading from master to self (using the testing script sent in a while ago). This is completely reproducible. What's happening? ... Restoring user relation files /home/peter/devel/postgresql/git/postgresql/contrib/pg_upgra Mismatch of relation OID in database regression: old OID 16701, new OID 16689 Failure, exiting Yes, I certainly thought they were all addressed. What object is 16701 in the old database? Anything unusual about it? This is saying the relation oid was not preserved. It came in with the range types feature:

+ psql -d regression -x -c 'select * from pg_class where oid = 16701'
-[ RECORD 1 ]--+----------------
relname        | test_range_gist
relnamespace   | 2200
reltype        | 16703
reloftype      | 0
relowner       | 10
relam          | 0
relfilenode    | 16701
reltablespace  | 0
relpages       | 33
reltuples      | 6200
relallvisible  | 33
reltoastrelid  | 16704
reltoastidxid  | 0
relhasindex    | t
relisshared    | f
relpersistence | p
relkind        | r
relnatts       | 1
relchecks      | 0
relhasoids     | f
relhaspkey     | f
relhasrules    | f
relhastriggers | f
relhassubclass | f
relfrozenxid   | 1627
relacl         |
reloptions     |

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimize postgres protocol for fixed size arrays
On Tue, Nov 22, 2011 at 11:47:22PM +0200, Mikko Tiihonen wrote: Hi, During conversion of the jdbc driver to use binary encoding when receiving array objects from postgres it was noticed that for example for int[] arrays the binary encoding is normally 30% to 200% larger in bytes than the text encoding. This was of concern to some users with slower links to database. Even though the encoded size was larger the binary encoding was still constantly faster (with 50% speed up for float[]). Here is a patch that adds a new flag to the protocol that is set when all elements of the array are of same fixed size. When the bit is set the 4 byte length is only sent once and not for each element. Another restriction is that the flag can only be set when there are no NULLs in the array. Cool. This would be very useful with the DSPAM binary array driver. Although the binary is shorter because the values are 8 byte integers, they would be much shorter without the redundant sizing information. Barring issues: +1 Regards, Ken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] FlexLocks
Robert Haas robertmh...@gmail.com wrote: Updated patches attached. I've gotten through several days of performance tests for this pair of related patches, with results posted on a separate thread. I'll link those in to the CF application shortly. To summarize the other (fairly long) thread on benchmarks, it seemed like there might be a very slight slowdown at low concurrency, but it could be the random alignment of code with and without the patch; it was a small enough fraction of a percent to be negligible, in my opinion. At higher concurrency levels the patch showed significant performance improvements. Besides the overall improvement in the median tps numbers of several percent, there was significant mitigation of the performance collapse phenomenon, where some runs were much slower than others. It seems a clear win in terms of performance. I've gotten through code review of the flexlock-v2.patch, and have decided to post on that before I go through the procarraylock-v1.patch code. Not surprisingly, this patch was in good form and applied cleanly. There were doc changes, and I can't see where any changes to the tests are required. I liked the structure, and only found a few nit-picky things to point out: I didn't see why num_held_flexlocks and held_flexlocks had the static keyword removed from their declarations. FlexLockRemember() seems to have a pasto for a comment. Maybe change to something like: Add lock to list of locks held by this backend. In procarraylock.c there is this: /* If there are no lockers, clar the critical PGPROC fields. */ s/clar/clear/ I have to admit I don't have my head around the extraWaits issue, so I can't personally vouch for that code, although I have no reason to doubt it, either. Everything else was something that I at least *think* I understand, and it looked good to me. I'm not changing the status until I get through the other patch file, which should be tomorrow. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimize postgres protocol for fixed size arrays
On Tue, Nov 22, 2011 at 3:47 PM, Mikko Tiihonen mikko.tiiho...@nitorcreations.com wrote: [the preceding message, including the full patch, was quoted here in its entirety; the remainder of this reply is missing from the archive]
Re: [HACKERS] pg_upgrade relation OID mismatches
Peter Eisentraut wrote: On Tue, 2011-11-22 at 15:42 -0500, Bruce Momjian wrote: Peter Eisentraut wrote: I thought these were fixed a while ago, but I'm still seeing these when upgrading from master to self (using the testing script sent in a while ago). This is completely reproducible. What's happening? ... Restoring user relation files /home/peter/devel/postgresql/git/postgresql/contrib/pg_upgra Mismatch of relation OID in database regression: old OID 16701, new OID 16689 Failure, exiting Yes, I certainly thought they were all addressed. What object is 16701 in the old database? Anything unusual about it? This is saying the relation oid was not preserved. It came in with the range types feature:

+ psql -d regression -x -c 'select * from pg_class where oid = 16701'
-[ RECORD 1 ]--+----------------
relname        | test_range_gist
relnamespace   | 2200
reltype        | 16703
reloftype      | 0
relowner       | 10
relam          | 0
relfilenode    | 16701
reltablespace  | 0
relpages       | 33
reltuples      | 6200
relallvisible  | 33
reltoastrelid  | 16704
reltoastidxid  | 0
relhasindex    | t
relisshared    | f
relpersistence | p
relkind        | r
relnatts       | 1
relchecks      | 0
relhasoids     | f
relhaspkey     | f
relhasrules    | f
relhastriggers | f
relhassubclass | f
relfrozenxid   | 1627
relacl         |
reloptions     |

OK, that is a heap table. My only guess is that the heap is being created without binary_upgrade_next_heap_pg_class_oid being set. Looking at the code, I can't see how the heap could be created without this happening. Another idea is that pg_dumpall isn't outputting the proper value, but again, how would this data type be different from the others? I will try to research this further. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Permissions checks for range-type support functions
It strikes me that we are rather lacking in $SUBJECT. There are a couple of distinct issues: 1. Since no permissions checks are applied at runtime, a nefarious person could bypass ACL_EXECUTE checks for any function that happens to match the signature for a subtype_diff function. Just create a range type that specifies that function as subtype_diff, and think up an index operation that will call it with the arguments you want. In principle one could bypass permissions checks on canonical functions as well, though in practice I think that's uninteresting because of the restrictions on the function signature. In the analogous situation for base types, we don't worry about this because only superusers can define base types; we just presume (and document) that the superuser is effectively granting public access permissions on all functions referenced in a base type definition. This will clearly not fly for range types. The nearest analogy I can see for non-superusers is use of functions within aggregate definitions. There, we check that the aggregate definer has permission to call the referenced function at aggregate definition time, and then recheck it at the start of any query that uses the aggregate. (So, the aggregate definer can give away permission to call the target function, but he could do that anyway by creating a SECURITY DEFINER wrapper function.) For the range-type support functions, it would be simple enough to check call permission at range-type definition time. But I really don't want to put in a run-time check, because there doesn't seem to be a very practical way to do it less often than every call, and besides that it's not very clear who to blame an index operation on. Is it good enough to test this only at range-type creation time? The implication of course is that you might not be able to revoke execute permission from a bad guy, short of dropping your function. This might be all right given the relatively narrow cross-section of functions that could be called this way. 2. The ANALYZE option is flat out dangerous, because it allows any function with the signature f(internal) returns bool to be called as though it's a typanalyze function. There are a couple of such functions in the catalogs already, and either of them will probably crash the backend if invoked as typanalyze on a range column. Again, this isn't a hazard for the existing use with base types, because only superusers (presumed to be responsible adults) can define base types. But for range types it's an easy DOS attack for less responsible persons. I'm inclined to think that the right solution for this one is to drop the ANALYZE option altogether, and just have DefineRange automatically install a system-wide typanalyze function for ranges. Under what circumstances is range-type-specific knowledge going to be needed for typanalyze, anyway? Especially since the functions that would use the results will be system-wide selectivity functions associated with the ANYRANGE operators. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
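The definition-time check described above would presumably mirror what aggregate definition already does for its support functions; a sketch of that shape (not actual typecmds.c code, and the function name is invented):

    #include "postgres.h"
    #include "miscadmin.h"
    #include "utils/acl.h"
    #include "utils/lsyscache.h"

    /*
     * Sketch: verify that the user defining the range type may call one of
     * its support functions, the way aggregate definition checks its
     * transition and final functions.
     */
    static void
    check_range_support_function(Oid funcOid)
    {
        AclResult   aclresult;

        aclresult = pg_proc_aclcheck(funcOid, GetUserId(), ACL_EXECUTE);
        if (aclresult != ACLCHECK_OK)
            aclcheck_error(aclresult, ACL_KIND_PROC, get_func_name(funcOid));
    }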
Re: [HACKERS] Not HOT enough
On Tue, Nov 22, 2011 at 9:21 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Nov 22, 2011 at 3:23 PM, Simon Riggs si...@2ndquadrant.com wrote: Some time ago, I changed GetOldestXmin() to ignore procs in other databases, resulting in a potentially later xmin. GetSnapshotData() was not touched when that happened, even though the comments say ...This is the same computation done by GetOldestXmin(true, true). The transam/README file says it more strongly: GetSnapshotData also performs an oldest-xmin calculation (which had better match GetOldestXmin's). Doh. As a result, VACUUM ignores procs in other databases, whereas HOT does not. That means we aren't cleaning up as much as we could when running multiple databases. If it's OK for VACUUM, then it must be OK for HOT cleanup also. Attached patch ignores procs in other databases during GetSnapshotData() when IsMVCCSnapshot(), using similar coding to GetOldestXmin(). Any doubters? I think this is unsafe for shared catalogs. I think so too. That's why it uses IsMVCCSnapshot() to confirm when it is safe to do so. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [JDBC] Optimize postgres protocol for fixed size arrays
On 23 November 2011 10:47, Mikko Tiihonen mikko.tiiho...@nitorcreations.com wrote: Here is a patch that adds a new flag to the protocol that is set when all elements of the array are of the same fixed size. When the bit is set the 4 byte length is only sent once and not for each element. Another restriction is that the flag can only be set when there are no NULLs in the array. How does a client detect that this feature is supported? At a glance the JDBC patch doesn't use it on the send path, but presumably clients could use this when sending binary-format arrays to the server - but only if the server understood the format. (Ideally a pair of settings would be useful here - one for the server to say it understands the new format, and another, defaulting to off, for the client to say please use the new format - then you could avoid confusing old clients, too.) Oliver -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [JDBC] Optimize postgres protocol for fixed size arrays
Oliver Jowett oli...@opencloud.com writes: On 23 November 2011 10:47, Mikko Tiihonen mikko.tiiho...@nitorcreations.com wrote: Here is a patch that adds a new flag to the protocol that is set when all elements of the array are of the same fixed size. How does a client detect that this feature is supported? The only way that anything like this will go in is as part of a protocol version bump, so discoverability would reduce to knowing which protocol you're using. We should file this away as one of the things we might want to do whenever there's sufficient critical mass for a new wire protocol version. Note that COPY BINARY files would be affected too, and so we'd want to make sure that this sort of change is recognizable from a binary file's header. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] vpath builds and verbose error messages
Peter Eisentraut pete...@gmx.net writes: One thing that isn't so nice about all this is that it embeds the personal directory structure of the builder of the binary into the shipped product. But gcc's cpp doesn't like redefining __FILE__, so the only way to get around that altogether would be to use something other than __FILE__ and define that for all builds. Might not be worth it. Well, if you have a problem with that, don't use a vpath build. I don't think it's our job to work around gcc behaviors that someone else might feel to be security issues --- they should take that up with the gcc maintainers. To me, the only argument for doing anything about this at all is that we'd like Postgres' behavior (in terms of what it prints in error messages) to be consistent across different build scenarios. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Not HOT enough
On Tue, Nov 22, 2011 at 11:40 PM, Simon Riggs si...@2ndquadrant.com wrote: I think this is unsafe for shared catalogs. I think so too. That's why it uses IsMVCCSnapshot() to confirm when it is safe to do so. Ah, you mean access to shared catalogs using MVCC snapshots. Fixed. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services fix_getsnapshotdata.v2.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
On 21 November 2011 22:55, Robert Haas robertmh...@gmail.com wrote: Results on my machine, for what they're worth: I was curious as to how much of an improvement I'd made to sorting in isolation. I've added simple sort timing instrumentation to the code in a revised patch, attached, for the convenience of reviewers. This patch adds an int8 specialisation, but it also supports fast path sorting of timestamps by using that same int8 specialisation (for HAVE_INT64_TIMESTAMP builds at least - didn't know if it was worth handling the NaN crud for the other case). ISTM that various different types with identical internal representations can share specialisations like this, which I hope will alleviate some earlier concerns about both the limited applicability of this optimisation and the possible proliferation of specialisations. Here's some raw qsort figures in seconds for the query "select * from orderlines order by test" (where test is a timestamptz column I added and updated with some data, executing repeatedly on the same machine as before): Before: DEBUG: elapsed: 0.031738 DEBUG: elapsed: 0.031595 DEBUG: elapsed: 0.031502 DEBUG: elapsed: 0.031378 DEBUG: elapsed: 0.031359 DEBUG: elapsed: 0.031413 DEBUG: elapsed: 0.031499 DEBUG: elapsed: 0.031394 DEBUG: elapsed: 0.031368 After: DEBUG: elapsed: 0.013341 DEBUG: elapsed: 0.014608 DEBUG: elapsed: 0.013809 DEBUG: elapsed: 0.013244 DEBUG: elapsed: 0.014307 DEBUG: elapsed: 0.013207 DEBUG: elapsed: 0.013227 DEBUG: elapsed: 0.013264 DEBUG: elapsed: 0.013143 DEBUG: elapsed: 0.013455 DEBUG: elapsed: 0.013447 I wonder, is it worth qualifying that the Sort Method was a quicksort (fast path) sort within explain analyze output? This is a rather large improvement, so it might actually be something that people look out for, as it might be tricky to remember the exact circumstances under which the optimisation kicks in by the time we're done here. I haven't had as much time as I'd like to polish this patch, or to get clearer answers. I expect to spend more time on it over the weekend. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services From 687779e799eb0d0064f86192cc74ea473ff9a607 Mon Sep 17 00:00:00 2001 From: peter2ndQuadrant pe...@2ndquadrant.com Date: Sun, 20 Nov 2011 20:36:25 +0000 Subject: [PATCH] Initial commit of optimization Stop directly using oid Added int8 quicksort fast path specialisation, which can also be used in place of F_TIMESTAMP_CMP for HAVE_INT64_TIMESTAMP builds. Rebased, revised patch for -hackers, with timestamp and int8 fast path sorting using the same int8 specialization. Remove unneeded line Rebase for -hackers --- src/backend/utils/sort/tuplesort.c | 105 +++- src/include/utils/template_qsort_arg.h | 217 2 files changed, 316 insertions(+), 6 deletions(-) create mode 100644 src/include/utils/template_qsort_arg.h diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c index 3505236..4962973 100644 --- a/src/backend/utils/sort/tuplesort.c +++ b/src/backend/utils/sort/tuplesort.c @@ -107,11 +107,13 @@ #include "miscadmin.h" #include "pg_trace.h" #include "utils/datum.h" +#include "utils/fmgroids.h" #include "utils/logtape.h" #include "utils/lsyscache.h" #include "utils/memutils.h" #include "utils/pg_rusage.h" #include "utils/rel.h" +#include "utils/template_qsort_arg.h" #include "utils/tuplesort.h" @@ -1225,12 +1227,60 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple) } /* - * All tuples have been provided; finish the sort. 
+ * Manufacture type specific sorting specialisations + * with inline comparators + */ +static inline int +compare_int4(Datum first, Datum second) +{ + int32 first_int = DatumGetInt32(first); + int32 second_int = DatumGetInt32(second); + + if (first_int > second_int) + return 1; + else if (first_int == second_int) + return 0; + else + return -1; +} + +static inline int +compare_int8(Datum first, Datum second) +{ + int64 first_int = DatumGetInt64(first); + int64 second_int = DatumGetInt64(second); + + if (first_int > second_int) + return 1; + else if (first_int == second_int) + return 0; + else + return -1; +} + +TEMPLATE_QSORT_ARG(int4, compare_int4); +TEMPLATE_QSORT_ARG(int8, compare_int8); + +double timeval_subtract(struct timeval *x, struct timeval *y); +double timeval_subtract(struct timeval *x, struct timeval *y) +{ + struct timeval result; + /* Compute the time remaining to wait. tv_usec is certainly positive. */ + result.tv_sec = x->tv_sec - y->tv_sec; + result.tv_usec = x->tv_usec - y->tv_usec; + + /* return difference in seconds */ + return result.tv_sec + ((double) result.tv_usec / 1000000); +} + +/* + * All tuples have been provided; perform the sort. */ void tuplesort_performsort(Tuplesortstate *state) { MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext); + struct timeval begin, end; #ifdef TRACE_SORT if
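For readers following along, here is a stand-alone toy version of the technique itself: a macro-generated, type-specific sort whose comparator the compiler can inline, versus qsort() calling through a function pointer on every comparison. The macro below is an insertion sort with made-up names - only a sketch of what the patch's TEMPLATE_QSORT_ARG specialisation does with a full quicksort:

    #include <stdio.h>
    #include <stdlib.h>

    /* generic comparator: qsort() must reach it through a function pointer */
    static int
    cmp_int_ptr(const void *a, const void *b)
    {
        int av = *(const int *) a;
        int bv = *(const int *) b;
        return (av > bv) - (av < bv);
    }

    /* expands to a complete sort specialised for TYPE, so CMP is inlined */
    #define TEMPLATE_SORT(NAME, TYPE, CMP) \
    static void NAME(TYPE *a, size_t n) \
    { \
        size_t i, j; \
        for (i = 1; i < n; i++) \
        { \
            TYPE key = a[i]; \
            j = i; \
            while (j > 0 && CMP(a[j - 1], key) > 0) \
            { \
                a[j] = a[j - 1]; \
                j--; \
            } \
            a[j] = key; \
        } \
    }

    #define CMP_INT(x, y) (((x) > (y)) - ((x) < (y)))
    TEMPLATE_SORT(sort_int, int, CMP_INT)

    int
    main(void)
    {
        int a[] = {5, 1, 4, 2, 3};
        int b[] = {5, 1, 4, 2, 3};

        qsort(a, 5, sizeof(int), cmp_int_ptr);  /* indirect call per compare */
        sort_int(b, 5);                         /* comparator inlined */

        for (int i = 0; i < 5; i++)
            printf("%d %d\n", a[i], b[i]);
        return 0;
    }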
Re: [HACKERS] range_adjacent and discrete ranges
Florian Pflug f...@phlo.org writes: More formally, let there be two arbitrary ranges <a,b,i_a,i_b>, <c,d,i_c,i_d> where the first two parameters are the lower respectively upper bound, and the last two are booleans saying whether the lower respectively upper bound is inclusive (true) or exclusive (false). These ranges are then adjacent exactly if the range <b,c,!i_b,!i_c> is empty. I tried to implement this, and I think it has a small bug. It works as stated if we have b < c. However, if we have b == c, then we want to consider the ranges adjacent if i_b != i_c (ie, only one of them claims the common boundary point). But a singleton range is empty unless it's inclusive on both sides. So we have to have a special case when the bounds are equal. (If b > c, then of course we have to consider the two ranges in the opposite order.) Attached is a draft patch for this. It passes regression tests but I've not tried to exercise it with a canonical function that actually does something different. It's going to be a bit slower than Jeff's original, because it does not only range_cmp_bound_values but also a make_range cycle (in most cases). So I guess the question is how much we care about supporting canonical functions with non-default policies. Thoughts? regards, tom lane diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c index 3326cb17c895273fd01c4eda5eb0d65a521d0168..890b6850cb6f5611e9afa922b85b0c64125f31bb 100644 *** a/src/backend/utils/adt/rangetypes.c --- b/src/backend/utils/adt/rangetypes.c *** range_adjacent(PG_FUNCTION_ARGS) *** 699,704 --- 699,706 upper2; bool empty1, empty2; + RangeType *r3; + int cmp; /* Different types should be prevented by ANYRANGE matching rules */ if (RangeTypeGetOid(r1) != RangeTypeGetOid(r2)) *** range_adjacent(PG_FUNCTION_ARGS) *** 714,736 PG_RETURN_BOOL(false); /* ! * For two ranges to be adjacent, the lower boundary of one range has to ! * match the upper boundary of the other. However, the inclusivity of ! * those two boundaries must also be different. ! * ! * The semantics for range_cmp_bounds aren't quite what we need here, so ! * we do the comparison more directly. */ ! if (lower1.inclusive != upper2.inclusive) { ! if (range_cmp_bound_values(typcache, &lower1, &upper2) == 0) ! PG_RETURN_BOOL(true); } ! ! if (upper1.inclusive != lower2.inclusive) { ! if (range_cmp_bound_values(typcache, &upper1, &lower2) == 0) ! PG_RETURN_BOOL(true); } PG_RETURN_BOOL(false); --- 716,758 PG_RETURN_BOOL(false); /* ! * Given two ranges A..B and C..D, where B < C, the ranges are adjacent ! * if and only if the range B..C is empty, where inclusivity of these two ! * bounds is inverted compared to the original bounds. For discrete ! * ranges, we have to rely on the canonicalization function to normalize ! * B..C to empty if it contains no elements of the subtype. If B == C, ! * the ranges are adjacent only if these bounds have different inclusive ! * flags (i.e., exactly one of the ranges includes the common boundary). ! * And if B > C then the ranges cannot be adjacent in this order, but we ! * must consider the other order (i.e., check D <= A). */ ! cmp = range_cmp_bound_values(typcache, &upper1, &lower2); ! if (cmp < 0) { ! upper1.inclusive = !upper1.inclusive; ! upper1.lower = true; ! lower2.inclusive = !lower2.inclusive; ! lower2.lower = false; ! r3 = make_range(typcache, &upper1, &lower2, false); ! PG_RETURN_BOOL(RangeIsEmpty(r3)); } ! if (cmp == 0) { ! PG_RETURN_BOOL(upper1.inclusive != lower2.inclusive); ! } ! 
cmp = range_cmp_bound_values(typcache, &upper2, &lower1); ! if (cmp < 0) ! { ! upper2.inclusive = !upper2.inclusive; ! upper2.lower = true; ! lower1.inclusive = !lower1.inclusive; ! lower1.lower = false; ! r3 = make_range(typcache, &upper2, &lower1, false); ! PG_RETURN_BOOL(RangeIsEmpty(r3)); ! } ! if (cmp == 0) ! { ! PG_RETURN_BOOL(upper2.inclusive != lower1.inclusive); } PG_RETURN_BOOL(false); -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
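A toy model of the rule being discussed, restricted to integer ranges so the canonicalization step collapses into simple +1/-1 arithmetic (this is illustrative only, not the PostgreSQL implementation):

    #include <stdbool.h>
    #include <stdio.h>

    typedef struct
    {
        int  lo, hi;            /* bound values */
        bool lo_inc, hi_inc;    /* true = inclusive bound */
    } Range;

    /*
     * Is the "gap" range between facing bounds b and c empty once the
     * original inclusivity is inverted?  Includes the b == c special case
     * described above.
     */
    static bool
    adjacent_ordered(int b, bool b_inc, int c, bool c_inc)
    {
        int first, last;

        if (b == c)
            return b_inc != c_inc;  /* exactly one range owns the common point */
        if (b > c)
            return false;           /* wrong order; caller tries the other one */
        first = b_inc ? b + 1 : b;  /* smallest integer the gap would contain */
        last = c_inc ? c - 1 : c;   /* largest integer the gap would contain */
        return first > last;        /* adjacent iff the gap contains nothing */
    }

    static bool
    adjacent(Range x, Range y)
    {
        return adjacent_ordered(x.hi, x.hi_inc, y.lo, y.lo_inc) ||
               adjacent_ordered(y.hi, y.hi_inc, x.lo, x.lo_inc);
    }

    int
    main(void)
    {
        Range a = {1, 3, true, false};   /* [1,3) */
        Range b = {3, 6, true, false};   /* [3,6) */
        Range c = {4, 6, true, true};    /* [4,6] */
        Range d = {1, 3, true, true};    /* [1,3] */

        printf("%d\n", adjacent(a, b));  /* 1: bounds meet, exactly one includes 3 */
        printf("%d\n", adjacent(d, c));  /* 1: no integer lies between 3 and 4 */
        printf("%d\n", adjacent(b, c));  /* 0: the ranges overlap */
        return 0;
    }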
Re: [HACKERS] Permissions checks for range-type support functions
I wrote: For the range-type support functions, it would be simple enough to check call permission at range-type definition time. But I really don't want to put in a run-time check, because there doesn't seem to be a very practical way to do it less often than every call, and besides that it's not very clear who to blame an index operation on. Is it good enough to test this only at range-type creation time? The implication of course is that you might not be able to revoke execute permission from a bad guy, short of dropping your function. This might be all right given the relatively narrow cross-section of functions that could be called this way. On further reflection I think we can get away with this, because of the additional check that's already in there that insists the support functions be IMMUTABLE. That means they can't have any side-effects, which makes the potential security consequences of unauthorized calls very minimal. It's conceivable that the return value of a misused subtype_diff function could be interesting (think some sort of decryption function); but the system doesn't offer any way for a user to see that return value. It will only factor into GiST index page split decisions, and in a pretty indirect way at that. So I don't see any interesting security risk for a misused subtype_diff function; and there's not likely to be any meaningful hole for abusing canonical functions either, due to the restrictions on argument/result datatype. I still think we ought to add a creation-time permission check, just for pro-forma purposes. 2. The ANALYZE option is flat out dangerous, But this is still true. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] dblink: enable parameters
On 22.11.2011 06:55, Pavel Stehule wrote: I know dblink is a deprecated interface - but it still has necessary functionality - it supports writable statements. So, dblink on an FDW connection seems to be a possible solution. We pass a query in the form of plain text. -- it is an interesting idea Pavel Itagaki Takahiro -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Writeable FDWs?
Hello 2011/11/21 Josh Berkus j...@agliodbs.com: All, When I present the features of 9.1 to people, one of the most frequent questions is when we will get writeable FDWs for data sources where it is appropriate. Is anyone working on this? just a note - it is a relatively complex feature - and needs a 2PC transaction coordinator Regards Pavel -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
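To illustrate why a two-phase-commit coordinator comes into it: the remote write must not become visible unless the local transaction also commits, which is what PREPARE TRANSACTION / COMMIT PREPARED provide. Below is a minimal libpq sketch of the remote half only; the connection string, table, and GID are hypothetical, error handling is abbreviated, and the remote server would need max_prepared_transactions > 0:

    #include <stdio.h>
    #include <stdlib.h>
    #include <libpq-fe.h>

    static void
    run(PGconn *conn, const char *sql)
    {
        PGresult *res = PQexec(conn, sql);

        if (PQresultStatus(res) != PGRES_COMMAND_OK)
        {
            fprintf(stderr, "%s failed: %s", sql, PQerrorMessage(conn));
            PQclear(res);
            exit(1);
        }
        PQclear(res);
    }

    int
    main(void)
    {
        PGconn *remote = PQconnectdb("dbname=remote");  /* conninfo is illustrative */

        if (PQstatus(remote) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(remote));
            return 1;
        }

        run(remote, "BEGIN");
        run(remote, "INSERT INTO t VALUES (1)");         /* the remote write */
        run(remote, "PREPARE TRANSACTION 'fdw_tx_42'");  /* phase one */

        /*
         * ... the local transaction commits here; if it aborts instead, the
         * coordinator must issue ROLLBACK PREPARED 'fdw_tx_42' ...
         */

        run(remote, "COMMIT PREPARED 'fdw_tx_42'");      /* phase two */
        PQfinish(remote);
        return 0;
    }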