Re: [HACKERS] RFC: replace pg_stat_activity.waiting with something more descriptive
On Fri, Jul 10, 2015 at 10:03 PM, Alexander Korotkov <a.korot...@postgrespro.ru> wrote:
>
> On Fri, Jun 26, 2015 at 6:39 AM, Robert Haas wrote:
>>
>> On Thu, Jun 25, 2015 at 9:23 AM, Peter Eisentraut wrote:
>> > On 6/22/15 1:37 PM, Robert Haas wrote:
>> >> Currently, the only time we report a process as waiting is when it is
>> >> waiting for a heavyweight lock. I'd like to make that somewhat more
>> >> fine-grained, by reporting the type of heavyweight lock it's awaiting
>> >> (relation, relation extension, transaction, etc.). Also, I'd like to
>> >> report when we're waiting for a lwlock, and report either the specific
>> >> fixed lwlock for which we are waiting, or else the type of lock (lock
>> >> manager lock, buffer content lock, etc.) for locks of which there is
>> >> more than one. I'm less sure about this next part, but I think we
>> >> might also want to report ourselves as waiting when we are doing an OS
>> >> read or an OS write, because it's pretty common for people to think
>> >> that a PostgreSQL bug is to blame when in fact it's the operating
>> >> system that isn't servicing our I/O requests very quickly.
>> >
>> > Could that also cover waiting on network?
>>
>> Possibly. My approach requires that the number of wait states be kept
>> relatively small, ideally fitting in a single byte. And it also
>> requires that we insert pgstat_report_waiting() calls around the thing
>> that is notionally blocking. So, if there are a small number of
>> places in the code where we do network I/O, we could stick those calls
>> around those places, and this would work just fine. But if a foreign
>> data wrapper, or any other piece of code, does network I/O - or any
>> other blocking operation - without calling pgstat_report_waiting(), we
>> just won't know about it.
>
> The idea of fitting wait information into a single byte and avoiding
> both locking and atomic operations is attractive. But how long can we
> go with it?
> Could a DBA make some conclusion by querying pg_stat_activity once, or
> twice?

It could be helpful in situations where the session is stuck on a
particular lock, or when you see that most of the backends are showing a
wait on the same LWLock.

> In order to make a conclusion about system load one has to run a daemon
> or background worker which continuously samples current wait events.
> Sampling the current wait event at a high rate also adds some overhead
> to the system, as locking or atomic operations would.

The idea of sampling sounds good, but if it adds a performance penalty
to the system, then we should look into ways to avoid it in hot paths.

> Checking whether a backend is stuck isn't easy either. If you don't
> expose how long the current wait event has lasted, it's hard to
> distinguish between getting stuck on a particular lock and high
> concurrency on that lock type.
>
> I can propose the following:
>
> 1) Expose more information about the current lock to the user. For
> instance, having the duration of the current wait event, the user can
> determine whether the backend is getting stuck on a particular event
> without sampling.

For having the duration, I think you need to use gettimeofday or some
similar call to calculate the wait time. That will be okay for cases
where the wait time is long, but it could be problematic if the waits
are very short (which could well be the case for LWLocks).

> 2) Accumulate per-backend statistics about each wait event type: number
> of occurrences and total duration. With these statistics the user can
> identify system bottlenecks, again without sampling.
>
> Number #2 will be provided as a separate patch.
> Number #1 requires a different concurrency model. ldus will extract it
> from the "waits monitoring" patch shortly.

Sure, I think those should be evaluated as separate patches, and I can
look into those patches and see if something more can be exposed as part
of this patch which can be reused in those patches.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
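Alexander's point about needing continuous sampling can be illustrated with a small simulation (a Python sketch; the wait-event names, codes, and the one-byte state encoding below are hypothetical placeholders, not actual PostgreSQL wait states):

```python
import random
from collections import Counter

# Toy model of the proposal: each backend advertises its current wait
# state as a single byte (0 = not waiting). Names are illustrative only.
WAIT_EVENTS = {0: None, 1: "LWLock:buffer_content", 2: "Lock:relation", 3: "IO:read"}

def sample_wait_states(read_state, backends, n_samples):
    """Poll each backend's wait byte n_samples times and count events,
    the way an external sampling daemon would."""
    counts = Counter()
    for _ in range(n_samples):
        for pid in backends:
            ev = WAIT_EVENTS[read_state(pid)]
            if ev is not None:
                counts[ev] += 1
    return counts

# Simulated workload: backend 1 spends ~50% of its time on a buffer
# content lock, backend 2 spends ~10% of its time in reads.
random.seed(42)
def read_state(pid):
    if pid == 1:
        return 1 if random.random() < 0.5 else 0
    return 3 if random.random() < 0.1 else 0

counts = sample_wait_states(read_state, backends=[1, 2], n_samples=10000)
# The dominant wait event emerges only from repeated sampling; a single
# query of pg_stat_activity would give just one point-in-time state.
assert counts["LWLock:buffer_content"] > counts["IO:read"]
```

The sketch also shows why a lone snapshot cannot distinguish a stuck backend from an unlucky sample, which is the gap Alexander's duration proposal aims to close.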
Re: [HACKERS] strange plan with bitmap heap scan and multiple partial indexes
Hi,

On 07/11/2015 11:40 PM, Tom Lane wrote:
> Tomas Vondra writes:
>> So I think the predicate proofing is a better approach, but of course
>> the planning cost may be an issue. But maybe we can make this cheaper
>> by some clever tricks? For example, given two predicates A and B, it
>> seems that if A => B, then selectivity(A) <= selectivity(B). Could we
>> use this to skip some of the expensive stuff? We should have the
>> selectivities anyway, no?
>
> We do. The existing logic in choose_bitmap_and essentially uses the
> selectivity as a heuristic to indicate which partial indexes might have
> predicates that imply another index's predicate. The expectation is
> that the former would have selectivity strictly smaller than the
> latter, causing the former to be processed first, and then the existing
> rules about what indexes can be "added onto" a potential AND
> combination will do the trick.
>
> The reason this fails in your example is that if the two indexes have
> exactly identical selectivities (due to identical reltuples values),
> there's no certainty what order they get sorted in, and the adding-on
> rules don't catch the case where the new index would actually imply the
> old one rather than vice versa.

Ah, OK. Thanks for the explanation.

> Conceivably, we could fix this at relatively small cost in the normal
> case by considering predicate proof rules in the sort comparator, and
> only if the estimated selectivities are identical.

Yep, that's basically what I had in mind.

> Sure seems like a kluge though, and I remain unconvinced that it's
> really a case that arises that much in the real world. The short
> description of the set of indexes you showed is "redundantly silly".
> Useful sets of indexes would not likely all have indistinguishably
> small selectivities.

Fair point - the example really is artificial, and was constructed to
demonstrate planning costs of the extended predicate-proofing for
partial indexes.
And while in reality small partial indexes are quite common, they probably use predicates tailored for individual queries (and not the nested predicates as in the example). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] strange plan with bitmap heap scan and multiple partial indexes
Tomas Vondra writes:
> So I think the predicate proofing is a better approach, but of course
> the planning cost may be an issue. But maybe we can make this cheaper by
> some clever tricks? For example, given two predicates A and B, it seems
> that if A => B, then selectivity(A) <= selectivity(B). Could we use this
> to skip some of the expensive stuff? We should have the selectivities
> anyway, no?

We do. The existing logic in choose_bitmap_and essentially uses the
selectivity as a heuristic to indicate which partial indexes might have
predicates that imply another index's predicate. The expectation is that
the former would have selectivity strictly smaller than the latter,
causing the former to be processed first, and then the existing rules
about what indexes can be "added onto" a potential AND combination will
do the trick.

The reason this fails in your example is that if the two indexes have
exactly identical selectivities (due to identical reltuples values),
there's no certainty what order they get sorted in, and the adding-on
rules don't catch the case where the new index would actually imply the
old one rather than vice versa.

Conceivably, we could fix this at relatively small cost in the normal
case by considering predicate proof rules in the sort comparator, and
only if the estimated selectivities are identical. Sure seems like a
kluge though, and I remain unconvinced that it's really a case that
arises that much in the real world. The short description of the set of
indexes you showed is "redundantly silly". Useful sets of indexes would
not likely all have indistinguishably small selectivities.

Perhaps a less klugy answer is to tweak the adding-on rules some more,
but I haven't thought about exactly how.

regards, tom lane
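Tom's suggested comparator tweak can be sketched as follows (a Python model, not the actual choose_bitmap_and code: predicate implication is modeled here as set containment, and all names are illustrative):

```python
from functools import cmp_to_key

# Model a partial index predicate as the set of rows it admits, so
# "A implies B" is simply subset containment.
def implies(a, b):
    return a["rows"] <= b["rows"]

def cmp_indexes(a, b):
    # Normal case: cheap selectivity comparison, stronger predicate first.
    if a["sel"] != b["sel"]:
        return -1 if a["sel"] < b["sel"] else 1
    # Identical selectivities (e.g. identical reltuples estimates): only
    # now pay for predicate proofing, so the implying index sorts first.
    if implies(a, b) and not implies(b, a):
        return -1
    if implies(b, a) and not implies(a, b):
        return 1
    return 0

# idx001 (b < 100) implies idx002 (b < 200), but broken stats give both
# a selectivity of 0, so plain selectivity ordering is arbitrary.
idx001 = {"name": "idx001", "sel": 0.0, "rows": frozenset(range(100))}
idx002 = {"name": "idx002", "sel": 0.0, "rows": frozenset(range(200))}

ordered = sorted([idx002, idx001], key=cmp_to_key(cmp_indexes))
assert [i["name"] for i in ordered] == ["idx001", "idx002"]
```

The point of the design is that the expensive proof step runs only on exact selectivity ties, which is the normal-case cost Tom is referring to.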
Re: [HACKERS] more RLS oversights
On 07/10/2015 06:15 PM, Noah Misch wrote:
> On Fri, Jul 10, 2015 at 03:08:53PM -0700, Joe Conway wrote:
>> On 07/03/2015 10:03 AM, Noah Misch wrote:
>>> (1) CreatePolicy() and AlterPolicy() omit to call
>>> assign_expr_collations() on the node trees.
>
>> The attached fixes this issue for me, but I am unsure whether we
>> really need/want the regression test. Given the recent push to
>> increase test coverage maybe so.
>
> I wouldn't remove the test from your patch.

Ok -- pushed.

-- 
Joe Conway
Re: [HACKERS] strange plan with bitmap heap scan and multiple partial indexes
On 07/11/2015 06:32 PM, Tom Lane wrote:
> ...
> Presumably, this is happening because the numbers of rows actually
> satisfying the index predicates are so small that it's a matter of luck
> whether any of them are included in ANALYZE's sample.
>
> Given this bad data for the index sizes, it's not totally surprising
> that choose_bitmap_and() does something wacko. I'm not sure whether we
> should try to make it smarter, or write this off as "garbage in,
> garbage out".

I think we should make it smarter, if possible - while this example is
somewhat artificial, partial indexes are often used exactly like this,
i.e. to index only a very small subset of the data. A good example may
be an index on "active invoices", i.e. invoices that were not yet sorted
out. There may be a lot of invoices in the table, but only a very small
fraction of them will be active (and thus in the index). So I don't
think this is an artificial problem, and we should not write it off as
"garbage in".

> Another idea is to not trust any individual ANALYZE's estimate of the
> index rowcount so completely. (I'd thought that the moving-average
> logic would get applied to that, but it doesn't seem to be kicking in
> for some reason.)
>
> We could probably make this smarter if we were willing to apply the
> predicate-proof machinery in more situations; in this example, once we
> know that idx001 is applicable, we really should disregard idx002 and
> idx003 altogether because their predicates are implied by idx001's.
> I've always been hesitant to do that because the cost of checking
> seemed likely to greatly outweigh the benefits. But since Tomas is
> nosing around in this territory already, maybe he'd like to investigate
> that further.

I think there are two possible approaches in general - we may improve
the statistics somehow, or we may start doing the predicate proofing. I
doubt approaching this at the statistics level alone is sufficient,
because even with statistics target 10k (i.e. the most detailed one),
the sample is still fixed-size.

So there will always exist a combination of a sufficiently large data
set and a selective partial index, causing trouble with the sampling.
Moreover, I can't really think of a way to fix this at the statistics
level. Maybe there's a clever trick guarding against this particular
issue, but my personal experience is that whenever I used such a smart
hack, it eventually caused strange issues elsewhere.

So I think the predicate proofing is a better approach, but of course
the planning cost may be an issue. But maybe we can make this cheaper by
some clever tricks? For example, given two predicates A and B, it seems
that if A => B, then selectivity(A) <= selectivity(B). Could we use this
to skip some of the expensive stuff? We should have the selectivities
anyway, no?

regards

-- 
Tomas Vondra                   http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
[HACKERS] TABLESAMPLE patch is really in pretty sad shape
The two contrib modules this patch added are nowhere near fit for public
consumption. They cannot clean up after themselves when dropped:

regression=# create extension tsm_system_rows;
CREATE EXTENSION
regression=# create table big as select i, random() as x from generate_series(1,100) i;
SELECT 100
regression=# create view v1 as select * from big tablesample system_rows(10);
CREATE VIEW
regression=# drop extension tsm_system_rows;
DROP EXTENSION

The view is still there, but is badly broken:

regression=# select * from v1;
ERROR:  cache lookup failed for function 46379

Potentially this is a security issue, since a malicious user could
probably manage to create a Trojan horse function having the now-vacated
OID, whereupon use of the view would invoke that function. Worse still,
the broken pg_tablesample_method row is still there:

regression=# create extension tsm_system_rows;
ERROR:  duplicate key value violates unique constraint "pg_tablesample_method_name_index"
DETAIL:  Key (tsmname)=(system_rows) already exists.

And even if we fixed that, these modules will not survive a pg_upgrade
cycle, because pg_upgrade has no idea that it would need to create a
pg_tablesample_method row (remember that we do *not* replay the
extension script during binary upgrade). Raw inserts into system
catalogs just aren't a sane thing to do in extensions.

Some of the risks here come from what seems like a fundamentally wrong
decision to copy all of the info about a tablesample method out of the
pg_tablesample_method catalog *at parse time* and store it permanently
in the query parse tree. This makes any sort of "alter tablesample
method" DDL operation impossible in principle, because any views/rules
referencing the method have already copied the data.
On top of that, I find the basic implementation design rather dubious,
because it supposes that the tablesample filtering step must always come
first; the moment you say TABLESAMPLE you can kiss goodbye the idea that
the query will use any indexes. For example:

d2=# create table big as select i, random() as x from generate_series(1,1000) i;
SELECT 1000
d2=# create index on big(i);
CREATE INDEX
d2=# analyze big;
ANALYZE
d2=# explain analyze select * from big where i between 100 and 200;
                              QUERY PLAN
 Index Scan using big_i_idx on big  (cost=0.43..10.18 rows=87 width=12) (actual time=0.022..0.088 rows=101 loops=1)
   Index Cond: ((i >= 100) AND (i <= 200))
 Planning time: 0.495 ms
 Execution time: 0.141 ms
(4 rows)

d2=# explain analyze select * from big tablesample bernoulli(10) where i between 100 and 200;
                              QUERY PLAN
 Sample Scan (bernoulli) on big  (cost=0.00..54055.13 rows=9 width=12) (actual time=0.028..970.051 rows=13 loops=1)
   Filter: ((i >= 100) AND (i <= 200))
   Rows Removed by Filter: 999066
 Planning time: 0.182 ms
 Execution time: 970.093 ms
(5 rows)

Now, maybe I don't understand the use-case for this feature, but I
should think it's meant for dealing with tables that are so big that you
can't afford to scan all the data. So, OK, a samplescan is hopefully
cheaper than a pure seqscan, but that doesn't mean that fetching 999079
rows and discarding 999066 of them is a good plan design. There needs to
be an operational mode whereby we can use an index and do random
sampling of the TIDs the index returns. I do not insist that that has to
appear in version one of the feature --- but I am troubled by the fact
that, by exposing an oversimplified API for use by external modules,
this patch is doubling down on the assumption that no such operational
mode will ever need to be implemented.
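The row-count argument can be made concrete with a toy model of the two plan shapes (a Python sketch; the table size and helper names are illustrative, scaled down from the example above):

```python
import random

random.seed(0)
N = 100_000  # toy table: row i has value i in the indexed column

def sample_scan_bernoulli(pct, pred):
    """Sampling-first, as the patch implements it: visit every row,
    sample, then apply the WHERE-clause filter."""
    examined, out = 0, []
    for i in range(N):
        examined += 1
        if random.random() < pct / 100 and pred(i):
            out.append(i)
    return out, examined

def index_then_sample(pct, lo, hi):
    """Hypothetical index-first mode: fetch matching TIDs from the
    index, then randomly sample those."""
    tids = range(lo, hi + 1)  # what an index range scan would return
    out = [i for i in tids if random.random() < pct / 100]
    return out, len(tids)

_, examined_seq = sample_scan_bernoulli(10, lambda i: 100 <= i <= 200)
_, examined_idx = index_then_sample(10, 100, 200)
# The sampling-first plan touches every row; index-first touches only
# the rows matching the WHERE clause.
assert examined_seq == N
assert examined_idx == 101
```

Both modes draw a Bernoulli sample of the qualifying rows; the difference is purely in how many rows must be visited to get there, which is Tom's complaint about the fixed filtering order.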
There are a whole lot of lesser sins, such as documentation that was clearly never copy-edited by a native speaker of English, badly designed planner APIs (Paths really ought not have rowcounts different from the underlying RelOptInfo), potential core dumps due to dereferencing values that could be null (the guards for null values are in the wrong places entirely), etc etc. While there's nothing here that couldn't be fixed by nuking the contrib modules and putting a week or two of concentrated work into fixing the core code, I for one certainly don't have time to put that kind of effort into TABLESAMPLE right now. Nor do I really have the interest; I find this feature of pretty dubious value. What are we going to do about this?

regards, tom lane
Re: [HACKERS] [PATCH] Generalized JSON output functions
2015-07-11 19:57 GMT+02:00 Shulgin, Oleksandr:
> On Jul 11, 2015 6:19 PM, "Pavel Stehule" wrote:
>>
>> 2015-07-11 18:02 GMT+02:00 Shulgin, Oleksandr <oleksandr.shul...@zalando.de>:
>>>
>>> On Fri, Jul 10, 2015 at 5:16 PM, Pavel Stehule wrote:
>>>>> Well, one could call it premature pessimization due to dynamic call
>>>>> overhead.
>>>>>
>>>>> IMO, the fact that json_out_init_context() sets the value callback
>>>>> to json_out_value is an implementation detail, the other parts of
>>>>> code should not rely on. And for the Explain output, there is
>>>>> definitely going to be *some* code between context initialization
>>>>> and output callbacks: these are done in a number of different
>>>>> functions.
>>>>
>>>> Again - is it necessary? Postgres still uses modular code, not OOP
>>>> code. I can understand the use of this technique when I need a
>>>> possibility to change behavior. But these functions are used for
>>>> printing JSON, not printing anything else.
>>>
>>> No, it's not strictly necessary.
>>>
>>> For me it's not about procedural- vs. object- style, but rather about
>>> being able to override/extend the behavior consistently. And for that
>>> I would prefer that if I override the value callback in a JSON output
>>> context, it would be called for every value being printed, not only
>>> for some of them.
>>
>> please, can you show me any real use case? JSON is JSON, not art work.
>
> To quote my first mail:
>
> The motivation behind this is to be able to produce specially-crafted
> JSON in a logical replication output plugin, such that numeric (and
> bigint) values are quoted. This requirement, in turn, arises from the
> fact that the JavaScript specification, which is quite natural to
> expect as a consumer for this JSON data, allows silently dropping
> significant digits when converting from string to number object.
>
> I believe this is a well-known problem and I'm aware of a number of
> tricks that might be used to avoid it, but none of them seems to be
> optimal from my standpoint.
>
> I can also imagine this can be used to convert date/time to string
> differently, or to add indentation depending on the depth in the object
> hierarchy, etc.

There is a simple rule - be strict on output and tolerant on input. If I
understand the sense of this patch correctly - the target is one uniform
format of JSON documents - so there is no room for any variability.

>> Still I don't see any value in this.
>
> Huh? Why then do you spend time on review?

I am thinking that the general json functions make sense, but I
partially disagree with your implementation.

Regards

Pavel
Re: [HACKERS] [PATCH] Generalized JSON output functions
On Jul 11, 2015 6:19 PM, "Pavel Stehule" wrote:
>
> 2015-07-11 18:02 GMT+02:00 Shulgin, Oleksandr <oleksandr.shul...@zalando.de>:
>>
>> On Fri, Jul 10, 2015 at 5:16 PM, Pavel Stehule wrote:
>>>> Well, one could call it premature pessimization due to dynamic call
>>>> overhead.
>>>>
>>>> IMO, the fact that json_out_init_context() sets the value callback
>>>> to json_out_value is an implementation detail, the other parts of
>>>> code should not rely on. And for the Explain output, there is
>>>> definitely going to be *some* code between context initialization
>>>> and output callbacks: these are done in a number of different
>>>> functions.
>>>
>>> Again - is it necessary? Postgres still uses modular code, not OOP
>>> code. I can understand the use of this technique when I need a
>>> possibility to change behavior. But these functions are used for
>>> printing JSON, not printing anything else.
>>
>> No, it's not strictly necessary.
>>
>> For me it's not about procedural- vs. object- style, but rather about
>> being able to override/extend the behavior consistently. And for that
>> I would prefer that if I override the value callback in a JSON output
>> context, it would be called for every value being printed, not only
>> for some of them.
>
> please, can me show any real use case? JSON is JSON, not art work.

To quote my first mail:

The motivation behind this is to be able to produce specially-crafted
JSON in a logical replication output plugin, such that numeric (and
bigint) values are quoted. This requirement, in turn, arises from the
fact that the JavaScript specification, which is quite natural to expect
as a consumer for this JSON data, allows silently dropping significant
digits when converting from string to number object.

I believe this is a well-known problem and I'm aware of a number of
tricks that might be used to avoid it, but none of them seems to be
optimal from my standpoint.

I can also imagine this can be used to convert date/time to string
differently, or to add indentation depending on the depth in the object
hierarchy, etc.

> Still I don't see any value of this.

Huh? Why then do you spend time on review?
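The digit-loss problem described above is easy to reproduce (a Python sketch; Python floats share JavaScript's IEEE-754 double representation, which holds at most 53 bits of integer precision):

```python
import json

big = 9007199254740993            # 2**53 + 1, exact as a Python int
as_double = float(big)            # what a JavaScript consumer would see
assert as_double == 9007199254740992.0   # last digit silently dropped

# Quoting the value, as the proposed output plugin would do for numeric
# and bigint columns, keeps it intact across the string round-trip.
quoted = json.loads('{"id": "9007199254740993"}')
assert int(quoted["id"]) == big
```

Any bigint beyond 2**53 is exposed to this, which is why quoting at the producer side is attractive for replication consumers that cannot be trusted to parse numbers losslessly.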
Re: [HACKERS] strange plan with bitmap heap scan and multiple partial indexes
Andres Freund writes:
> On 2015-07-11 14:31:25 +0200, Tomas Vondra wrote:
>> While working on the "IOS with partial indexes" patch, I've noticed a
>> bit strange plan. It's unrelated to that particular patch
>> (reproducible on master), so I'm starting a new thread for it.

> It's indeed interesting. Running
> ANALYZE t; EXPLAIN SELECT a FROM t WHERE b < 100;
> repeatedly switches back and forth between the plans.

The issue basically is that ANALYZE is putting quasi-random numbers into
the reltuples estimates for the indexes. Things seem to be consistently
sane after a VACUUM:

regression=# vacuum t;
VACUUM
regression=# select relname,relpages,reltuples from pg_class where relname in ('t', 'idx001', 'idx002', 'idx003');
 relname | relpages | reltuples
---------+----------+-----------
 t       |    44248 |   9.8e+06
 idx001  |        2 |        99
 idx002  |        2 |       199
 idx003  |        2 |       299
(4 rows)

but not so much after ANALYZE:

regression=# analyze t;
ANALYZE
regression=# select relname,relpages,reltuples from pg_class where relname in ('t', 'idx001', 'idx002', 'idx003');
 relname | relpages | reltuples
---------+----------+-----------
 t       |    44248 |     1e+07
 idx001  |        2 |         0
 idx002  |        2 |         0
 idx003  |        2 |         0
(4 rows)

I've also seen states like

 relname | relpages | reltuples
---------+----------+-----------
 t       |    44248 |   9.8e+06
 idx001  |        2 |         0
 idx002  |        2 |       334
 idx003  |        2 |       334
(4 rows)

Presumably, this is happening because the numbers of rows actually
satisfying the index predicates are so small that it's a matter of luck
whether any of them are included in ANALYZE's sample. Given this bad
data for the index sizes, it's not totally surprising that
choose_bitmap_and() does something wacko. I'm not sure whether we should
try to make it smarter, or write this off as "garbage in, garbage out".

Another idea is to not trust any individual ANALYZE's estimate of the
index rowcount so completely. (I'd thought that the moving-average logic
would get applied to that, but it doesn't seem to be kicking in for some
reason.)
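The "matter of luck" here is quantifiable. With 10M rows of which only 99 satisfy idx001's predicate, and a 30,000-row ANALYZE sample (the default for statistics target 100), the chance that the sample contains no matching row at all works out as follows (a back-of-the-envelope sketch assuming a simple random sample):

```python
def p_miss_all(N, k, n):
    """Probability that a simple random sample of n rows out of N
    contains none of the k rows satisfying the predicate
    (hypergeometric: C(N-k, n) / C(N, n))."""
    p = 1.0
    for i in range(k):
        p *= (N - n - i) / (N - i)
    return p

p = p_miss_all(10_000_000, 99, 30_000)
# ANALYZE misses every matching row roughly 74% of the time, so it will
# often record reltuples = 0 for the index, and a slightly different
# nonzero estimate otherwise - exactly the quasi-random flip-flopping
# seen in the pg_class output above.
assert 0.70 < p < 0.80
```

This also illustrates Tomas's fixed-sample-size point: for any sample size there is a table large enough, and a predicate selective enough, that the miss probability stays near 1.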
We could probably make this smarter if we were willing to apply the predicate-proof machinery in more situations; in this example, once we know that idx001 is applicable, we really should disregard idx002 and idx003 altogether because their predicates are implied by idx001's. I've always been hesitant to do that because the cost of checking seemed likely to greatly outweigh the benefits. But since Tomas is nosing around in this territory already, maybe he'd like to investigate that further.

regards, tom lane
Re: [HACKERS] [PATCH] Generalized JSON output functions
2015-07-11 18:02 GMT+02:00 Shulgin, Oleksandr:
> On Fri, Jul 10, 2015 at 5:16 PM, Pavel Stehule wrote:
>>
>>> Well, one could call it premature pessimization due to dynamic call
>>> overhead.
>>>
>>> IMO, the fact that json_out_init_context() sets the value callback to
>>> json_out_value is an implementation detail, the other parts of code
>>> should not rely on. And for the Explain output, there is definitely
>>> going to be *some* code between context initialization and output
>>> callbacks: these are done in a number of different functions.
>>
>> Again - is it necessary? Postgres still uses modular code, not OOP
>> code. I can understand the use of this technique when I need a
>> possibility to change behavior. But these functions are used for
>> printing JSON, not printing anything else.
>
> No, it's not strictly necessary.
>
> For me it's not about procedural- vs. object- style, but rather about
> being able to override/extend the behavior consistently. And for that I
> would prefer that if I override the value callback in a JSON output
> context, it would be called for every value being printed, not only for
> some of them.

please, can you show me any real use case? JSON is JSON, not art work.
Still I don't see any value in this.

> Thank you for pointing out the case of Explain format, I've totally
> overlooked it in my first version. Trying to apply the proposed
> approach in the explain printing code led me to reorganize things
> slightly. I've added explicit functions for printing keys vs. values,
> thus no need to expose that key_scalar param anymore. There are now
> separate before/after key and before/after value functions as well, but
> I believe it makes for cleaner code.
>
> Most of the complexity is still in the code that decides whether or not
> to put spaces (between the values or for indentation) and newlines at
> certain points. Should we decide to unify the style we emit ourselves,
> this could be simplified, while still leaving room for great
> flexibility if overridden by an extension, for example.
>
> Have a nice weekend.

you too

Regards

Pavel

> --
> Alex
Re: [HACKERS] [PATCH] Generalized JSON output functions
On Fri, Jul 10, 2015 at 5:16 PM, Pavel Stehule wrote:
>
>> Well, one could call it premature pessimization due to dynamic call
>> overhead.
>>
>> IMO, the fact that json_out_init_context() sets the value callback to
>> json_out_value is an implementation detail, the other parts of code
>> should not rely on. And for the Explain output, there is definitely
>> going to be *some* code between context initialization and output
>> callbacks: these are done in a number of different functions.
>
> Again - is it necessary? Postgres still uses modular code, not OOP
> code. I can understand the use of this technique when I need a
> possibility to change behavior. But these functions are used for
> printing JSON, not printing anything else.

No, it's not strictly necessary.

For me it's not about procedural- vs. object- style, but rather about
being able to override/extend the behavior consistently. And for that I
would prefer that if I override the value callback in a JSON output
context, it would be called for every value being printed, not only for
some of them.

Thank you for pointing out the case of Explain format, I've totally
overlooked it in my first version. Trying to apply the proposed approach
in the explain printing code led me to reorganize things slightly. I've
added explicit functions for printing keys vs. values, thus no need to
expose that key_scalar param anymore. There are now separate
before/after key and before/after value functions as well, but I believe
it makes for cleaner code.

Most of the complexity is still in the code that decides whether or not
to put spaces (between the values or for indentation) and newlines at
certain points. Should we decide to unify the style we emit ourselves,
this could be simplified, while still leaving room for great flexibility
if overridden by an extension, for example.

Have a nice weekend.
-- 
Alex

diff --git a/contrib/hstore/hstore_io.c b/contrib/hstore/hstore_io.c
new file mode 100644
index 7d89867..1f365f5
*** a/contrib/hstore/hstore_io.c
--- b/contrib/hstore/hstore_io.c
*** hstore_to_json_loose(PG_FUNCTION_ARGS)
*** 1241,1286 ****
  	int			count = HS_COUNT(in);
  	char	   *base = STRPTR(in);
  	HEntry	   *entries = ARRPTR(in);
! 	StringInfoData tmp,
! 				dst;
  
  	if (count == 0)
  		PG_RETURN_TEXT_P(cstring_to_text_with_len("{}", 2));
  
  	initStringInfo(&tmp);
! 	initStringInfo(&dst);
! 
! 	appendStringInfoChar(&dst, '{');
  
  	for (i = 0; i < count; i++)
  	{
  		resetStringInfo(&tmp);
  		appendBinaryStringInfo(&tmp, HS_KEY(entries, base, i), HS_KEYLEN(entries, i));
! 		escape_json(&dst, tmp.data);
! 		appendStringInfoString(&dst, ": ");
  		if (HS_VALISNULL(entries, i))
! 			appendStringInfoString(&dst, "null");
  		/* guess that values of 't' or 'f' are booleans */
  		else if (HS_VALLEN(entries, i) == 1 && *(HS_VAL(entries, base, i)) == 't')
! 			appendStringInfoString(&dst, "true");
  		else if (HS_VALLEN(entries, i) == 1 && *(HS_VAL(entries, base, i)) == 'f')
! 			appendStringInfoString(&dst, "false");
  		else
  		{
  			resetStringInfo(&tmp);
  			appendBinaryStringInfo(&tmp, HS_VAL(entries, base, i), HS_VALLEN(entries, i));
  			if (IsValidJsonNumber(tmp.data, tmp.len))
! 				appendBinaryStringInfo(&dst, tmp.data, tmp.len);
  			else
! 				escape_json(&dst, tmp.data);
  		}
- 
- 		if (i + 1 != count)
- 			appendStringInfoString(&dst, ", ");
  	}
- 	appendStringInfoChar(&dst, '}');
  
! 	PG_RETURN_TEXT_P(cstring_to_text(dst.data));
  }
  
  PG_FUNCTION_INFO_V1(hstore_to_json);
--- 1241,1293 ----
  	int			count = HS_COUNT(in);
  	char	   *base = STRPTR(in);
  	HEntry	   *entries = ARRPTR(in);
! 	StringInfoData tmp;
! 	Datum		num;
! 	JsonOutContext dst;
  
  	if (count == 0)
  		PG_RETURN_TEXT_P(cstring_to_text_with_len("{}", 2));
  
  	initStringInfo(&tmp);
! 	json_out_init_context(&dst, JSON_OUT_USE_SPACES);
! 	dst.object_start(&dst);
  
  	for (i = 0; i < count; i++)
  	{
  		resetStringInfo(&tmp);
  		appendBinaryStringInfo(&tmp, HS_KEY(entries, base, i), HS_KEYLEN(entries, i));
! 		json_out_cstring_key(&dst, tmp.data);
! 
  		if (HS_VALISNULL(entries, i))
! 			dst.value(&dst, (Datum) 0, JSONTYPE_NULL, InvalidOid, InvalidOid);
! 
  		/* guess that values of 't' or 'f' are booleans */
  		else if (HS_VALLEN(entries, i) == 1 && *(HS_VAL(entries, base, i)) == 't')
! 			dst.value(&dst, BoolGetDatum(true), JSONTYPE_BOOL,
! 					  InvalidOid, InvalidOid);
  		else if (HS_VALLEN(entries, i) == 1 && *(HS_VAL(entries, base, i)) == 'f')
! 			dst.value(&dst, BoolGetDatum(false), JSONTYPE_BOOL,
! 					  InvalidOid, InvalidOid);
  		else
  		{
  			resetStringInfo(&tmp);
  			appendBinaryStringInfo(&tmp, HS_VAL(entries, base, i), HS_VALLEN(entries, i));
+ 
  			if (IsValidJsonNumber(tmp.data, tmp.len))
! 			{
! 				num = DirectFunctionCall3(numeric_in, CStringGetDatum(tmp.data), 0, -1);
! 				dst.value(&dst, num, JSONTYPE_NUMERIC,
! 						  NUMERICOID, 1702 /* numeric_out */);
! 				pfree(DatumGetPointer(num));
! 			}
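The callback-based design shown in the hunk above can be modeled in miniature as follows (a Python sketch; the method names echo the patch's JsonOutContext, but the quoting override is a hypothetical consumer, not code from the patch):

```python
import json

class JsonOutContext:
    """Toy model of an output context carrying overridable callbacks."""
    def __init__(self):
        self.buf = []
        self.first = True
        # default value callback, analogous to json_out_value
        self.value = lambda ctx, v: ctx.buf.append(json.dumps(v))

    def object_start(self):
        self.buf.append("{")

    def object_end(self):
        self.buf.append("}")

    def key(self, k):
        if not self.first:
            self.buf.append(", ")
        self.first = False
        self.buf.append(json.dumps(k) + ": ")

    def out(self):
        return "".join(self.buf)

# An "output plugin" overrides the value callback so big integers are
# quoted, protecting JavaScript consumers from silent precision loss.
def quoting_value(ctx, v):
    if isinstance(v, int) and abs(v) > 2**53:
        ctx.buf.append('"%d"' % v)
    else:
        ctx.buf.append(json.dumps(v))

ctx = JsonOutContext()
ctx.value = quoting_value
ctx.object_start()
ctx.key("id")
ctx.value(ctx, 2**60)
ctx.key("name")
ctx.value(ctx, "x")
ctx.object_end()
assert ctx.out() == '{"id": "1152921504606846976", "name": "x"}'
```

This is the crux of the thread's disagreement: the override is only sound if every value goes through the callback, which is why Oleksandr insists the callback be invoked uniformly rather than only in some code paths.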
Re: [HACKERS] strange plan with bitmap heap scan and multiple partial indexes
On 2015-07-11 14:31:25 +0200, Tomas Vondra wrote:
> While working on the "IOS with partial indexes" patch, I've noticed a bit
> strange plan. It's unrelated to that particular patch (reproducible on
> master), so I'm starting a new thread for it.
>
> To reproduce it, all you have to do is this (on a new cluster, all settings
> on default):
>
> CREATE TABLE t AS SELECT i AS a, i AS b
>          FROM generate_series(1,1000) s(i);
>
> CREATE INDEX idx001 ON t (a) where b < 100;
> CREATE INDEX idx002 ON t (a) where b < 200;
> CREATE INDEX idx003 ON t (a) where b < 300;
>
> ANALYZE t;
>
> EXPLAIN SELECT a FROM t WHERE b < 100;
>                          QUERY PLAN

It's indeed interesting. Running

    ANALYZE t; EXPLAIN SELECT a FROM t WHERE b < 100;

repeatedly switches back and forth between the plans. Note that

    Bitmap Heap Scan on t  (cost=9.01..13.02 rows=1000 width=4)

is actually costed significantly cheaper than

    Index Scan using idx001 on t  (cost=0.15..30.32 rows=1000 width=4)

so yet another weird thing is that the planner is not consistently choosing that (cheaper) plan.
When the index scan plan is chosen you interestingly get the bitmapscan plan, but at a slightly higher cost:

postgres[32046][1]=# EXPLAIN SELECT a FROM t WHERE b < 100;
┌────────────────────────────────────────────────────────────────────┐
│                             QUERY PLAN                             │
├────────────────────────────────────────────────────────────────────┤
│ Index Scan using idx001 on t  (cost=0.15..30.32 rows=1000 width=4) │
└────────────────────────────────────────────────────────────────────┘
(1 row)

Time: 0.460 ms
postgres[32046][1]=# SET enable_indexscan = false;
SET
Time: 0.119 ms
postgres[32046][1]=# EXPLAIN SELECT a FROM t WHERE b < 100;
┌──────────────────────────────────────────────────────────────────────────────┐
│                                  QUERY PLAN                                  │
├──────────────────────────────────────────────────────────────────────────────┤
│ Bitmap Heap Scan on t  (cost=27.05..31.06 rows=1000 width=4)                 │
│   Recheck Cond: ((b < 300) AND (b < 200))                                    │
│   Filter: (b < 100)                                                          │
│   ->  BitmapAnd  (cost=27.05..27.05 rows=1 width=0)                          │
│         ->  Bitmap Index Scan on idx003  (cost=0.00..13.15 rows=1000 width=0)│
│         ->  Bitmap Index Scan on idx002  (cost=0.00..13.15 rows=1000 width=0)│
└──────────────────────────────────────────────────────────────────────────────┘

Looking at the stats is interesting:

postgres[32046][1]=# ANALYZE t; SELECT relpages, reltuples, relallvisible FROM pg_class WHERE relname IN ('t', 'idx003', 'idx002', 'idx001'); EXPLAIN SELECT a FROM t WHERE b < 100;
ANALYZE
Time: 123.469 ms
┌──────────┬───────────┬───────────────┐
│ relpages │ reltuples │ relallvisible │
├──────────┼───────────┼───────────────┤
│    44248 │     1e+07 │         44248 │
│        2 │         0 │             0 │
│        2 │       667 │             0 │
│        2 │       667 │             0 │
└──────────┴───────────┴───────────────┘
(4 rows)

Time: 0.405 ms
┌────────────────────────────────────────────────────────────────────┐
│                             QUERY PLAN                             │
├────────────────────────────────────────────────────────────────────┤
│ Index Scan using idx001 on t  (cost=0.12..24.63 rows=1000 width=4) │
└────────────────────────────────────────────────────────────────────┘
(1 row)

Time: 0.269 ms
postgres[32046][1]=# ANALYZE t; SELECT relpages, reltuples, relallvisible FROM pg_class WHERE relname IN ('t', 'idx003', 'idx002', 'idx001'); EXPLAIN SELECT a FROM t WHERE b < 100;
ANALYZE
Time: 124.430 ms
┌──────────┬───────────┬───────────────┐
│ relpages │ reltuples │ relallvisible │
├──────────┼───────────┼───────────────┤
│    44248 │     1e+07 │         44248 │
│        2 │         0 │             0 │
│        2 │         0 │             0 │
│        2 │         0 │             0 │
└──────────┴───────────┴───────────────┘
(4 rows)

Time: 0.272 ms
┌──────────────────────────────────────────────────────────────────────────────┐
│                                  QUERY PLAN                                  │
├──────────────────────────────────────────────────────────────────────────────┤
│ Bitmap Heap Scan on t  (cost=9.01..13.02 rows=1000 width=4)                  │
│   Recheck Cond: ((b < 300) AND (b < 200))                                    │
│   Filter: (b < 100)                                                          │
│   ->  BitmapAnd  (cost=9.01..9.01 rows=1 width=0)                            │
│         ->  Bitmap Index Scan on idx003  (cost=0.00..4.13 rows=1000 width=0) │
│         ->  Bitmap Index Scan on idx002  (cost=0.00..4.13 rows=1000 width=0) │
└──────────────────────────────────────────────────────────────────────────────┘
(6 rows)

Time: 0.275 ms
postgres[32046][1]=# ANALYZE t; SELECT relpages, reltuples, relallvisible FROM pg_class WHERE relname IN ('t', 'idx003', 'idx002', 'idx001'); EXPLAIN SELECT a FROM t WHERE b < 100;
ANALYZE
Time: 112.592 ms
┌─
[HACKERS] strange plan with bitmap heap scan and multiple partial indexes
Hi,

While working on the "IOS with partial indexes" patch, I've noticed a somewhat strange plan. It's unrelated to that particular patch (reproducible on master), so I'm starting a new thread for it.

To reproduce it, all you have to do is this (on a new cluster, all settings on default):

CREATE TABLE t AS SELECT i AS a, i AS b
         FROM generate_series(1,1000) s(i);

CREATE INDEX idx001 ON t (a) where b < 100;
CREATE INDEX idx002 ON t (a) where b < 200;
CREATE INDEX idx003 ON t (a) where b < 300;

ANALYZE t;

EXPLAIN SELECT a FROM t WHERE b < 100;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Bitmap Heap Scan on t  (cost=9.01..13.02 rows=1000 width=4)
   Recheck Cond: ((b < 300) AND (b < 200))
   Filter: (b < 100)
   ->  BitmapAnd  (cost=9.01..9.01 rows=1 width=0)
         ->  Bitmap Index Scan on idx003  (cost=0.00..4.13 rows=1000 width=0)
         ->  Bitmap Index Scan on idx002  (cost=0.00..4.13 rows=1000 width=0)

Now, that's strange IMHO. There's a perfectly matching partial index, with exactly the same predicate (b < 100), but we instead choose the two other indexes and combine them using BitmapAnd. That seems odd: choosing one of them over the perfectly matching one would be strange too, but why use two and combine them?

Another thing is that this gets fixed by a simple VACUUM on the table.

EXPLAIN SELECT a FROM t WHERE b < 100;
                             QUERY PLAN
---------------------------------------------------------------------
 Index Scan using idx001 on t  (cost=0.14..29.14 rows=1000 width=4)

Any idea what's going on here? FWIW all this was on 51d0fe5d (July 23).

-- 
Tomas Vondra                   http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
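One plausible mechanism behind the plans flip-flopping after repeated ANALYZE runs (a back-of-the-envelope sketch of my own, not an explanation given in the thread): ANALYZE samples a bounded number of rows, and when only a few hundred rows out of millions satisfy a partial-index predicate, many samples contain no matching rows at all, so the index's reltuples estimate bounces between 0 and a scaled-up nonzero value. The concrete numbers below are assumptions for illustration (they roughly match the 1e+07-row table and reltuples of 0 vs 667 shown in the transcripts).

```python
# Illustrative numbers (assumptions): a 10M-row table, a partial-index
# predicate matching only 200 rows, and an ANALYZE sample of 30000 rows
# (300 * the default statistics target of 100).
table_rows = 10_000_000
matching_rows = 200
sample_size = 30_000

# Probability that a random sample contains *no* rows satisfying the
# partial-index predicate (binomial approximation of sampling without
# replacement).
p_no_match = (1 - matching_rows / table_rows) ** sample_size
print(f"P(sample sees zero matching rows) = {p_no_match:.2f}")

# Expected number of matching rows per sample.
expected_in_sample = sample_size * matching_rows / table_rows
print(f"expected matching rows per sample = {expected_in_sample:.2f}")
```

With these assumed numbers, roughly half of all ANALYZE runs would see no matching rows at all and estimate the partial index as empty, consistent with the estimates (and hence the chosen plan) switching back and forth between runs.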
Re: [HACKERS] Re: Removing SSL renegotiation (Was: Should we back-patch SSL renegotiation fixes?)
On 2015-07-11 21:09:05 +0900, Michael Paquier wrote:
> Something like the patches attached

Thanks for that!

> could be considered, one is for master
> and REL9_5_STABLE to remove ssl_renegotiation_limit, the second one for
> ~REL9_4_STABLE to change the default to 0.

> diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
> index c669f75..16c0ce5 100644
> --- a/doc/src/sgml/config.sgml
> +++ b/doc/src/sgml/config.sgml
> @@ -1040,7 +1040,7 @@ include_dir 'conf.d'
>          cryptanalysis when large amounts of traffic can be examined, but it
>          also carries a large performance penalty. The sum of sent and received
>          traffic is used to check the limit. If this parameter is set to 0,
> -        renegotiation is disabled. The default is 512MB.
> +        renegotiation is disabled. The default is 0.

I think we should put in a warning, or at least a note, about the dangers of enabling it (connection breaks, exposure to several open OpenSSL bugs).

Thanks,

Andres
Re: [HACKERS] Re: Removing SSL renegotiation (Was: Should we back-patch SSL renegotiation fixes?)
On Fri, Jul 10, 2015 at 7:47 PM, Andres Freund wrote:
> On 2015-07-01 23:32:23 -0400, Noah Misch wrote:
> > We'd need to be triply confident that we know better than the DBA before
> > removing flexibility in back branches.
> >
> > +1 for just changing the default.
>
> I think we do. But I also think that I pretty clearly lost this
> argument, so let's just change the default.
>
> Is anybody willing to work on this?

Something like the patches attached could be considered: one is for master and REL9_5_STABLE to remove ssl_renegotiation_limit, the second one for ~REL9_4_STABLE to change the default to 0.

Regards,
-- 
Michael

20150710_ssl_renegotiation_remove-94.patch
Description: binary/octet-stream

20150710_ssl_renegotiation_remove-master.patch
Description: binary/octet-stream
Re: [HACKERS] PATCH: index-only scans with partial indexes
Hi,

On 07/10/2015 10:43 PM, Tom Lane wrote:
> Tomas Vondra writes:
>> currently partial indexes end up not using index only scans in most
>> cases, because check_index_only() is overly conservative, as explained
>> in this comment:
>> ...
>> I've done a bunch of tests, and I do see a small (hardly noticeable)
>> increase in planning time with a long list of WHERE clauses, because all
>> of those need to be checked against the index predicate. Not sure if
>> this is what's meant by 'quite expensive' in the comment. Moreover, this
>> was more than compensated for by the IOS benefits (even with everything
>> in RAM). But maybe it's possible to fix that somehow? For example, we're
>> certainly doing those checks elsewhere when deciding which clauses need
>> to be evaluated at run-time, so maybe we could cache that somehow?
>
> The key problem here is that you're doing those proofs vastly earlier
> than before, for indexes that might not get used at all in the final
> plan. If you do some tests with multiple partial indexes you will
> probably see a bigger planning-time penalty.

Hmmm. Maybe we could get a bit smarter by looking at the attnums of each clause before doing the expensive stuff (which is predicate_implied_by, I believe), exploiting a few simple observations:

  * if the clause is already covered by attrs_used, we don't need to
    process it at all

  * if the clause uses attributes not included in the index predicate,
    we know it can't be implied

Of course, those are local optimizations, and can't fix some of the problems (e.g. a lot of partial indexes).

> Perhaps we should bite the bullet and do it anyway, but I'm pretty
> suspicious of any claim that the planning cost is minimal.

Perhaps. I'm not claiming the planning cost is minimal. It was in the tests I've done, but no doubt it's possible to construct examples where the planning time will get much worse. With 30 partial indexes, I got an increase from 0.01 ms to ~2.5 ms on simple queries.
But maybe we could get at least some of the benefits by planning the index scans like today, and then doing the IOS check later? Of course, this won't help with cases where the index scan is thrown away while the index only scan would win, but it does help with cases where we end up doing an index scan anyway.

That's essentially what I'm struggling with right now. I have a 3TB data set, and the plan looks like this:

                               QUERY PLAN
--------------------------------------------------------------------------
 Sort  (cost=1003860164.92..1003860164.92 rows=1 width=16)
   Sort Key: orders.o_orderpriority
   ->  HashAggregate
         Group Key: orders.o_orderpriority
         ->  Merge Semi Join
               Merge Cond:
               ->  Index Scan using pk_orders on orders
                     Filter: ((o_orderdate >= '1997-07-01'::date) AND
                              (o_orderdate < '1997-10-01 00:00:00'::timestamp))
               ->  Index Scan using lineitem_l_orderkey_idx_part1 on lineitem

and the visibility checks from the Index Scans are killing the I/O. An IOS is likely to perform much better here (but I haven't run the query yet).

regards

-- 
Tomas Vondra                   http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
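To make the "visibility checks are killing the I/O" point concrete, here is a rough model of my own (the row count is an assumed illustration, not a figure from the thread): a plain index scan must visit the heap for every matched tuple to check visibility, while an index-only scan only falls back to the heap for tuples on pages not marked all-visible in the visibility map.

```python
def expected_heap_fetches(matched_rows: int, all_visible_frac: float) -> int:
    """Rough model: an index-only scan needs a heap fetch only for tuples
    on pages that are not marked all-visible in the visibility map."""
    return round(matched_rows * (1.0 - all_visible_frac))


matched = 50_000_000   # assumed number of matched lineitem rows

# Plain index scan: every matched tuple requires a heap visit.
print(expected_heap_fetches(matched, 0.0))

# Index-only scan on a well-vacuumed table (99% of pages all-visible).
print(expected_heap_fetches(matched, 0.99))
```

This is why the VACUUM observation earlier in the thread matters: VACUUM is what sets visibility-map bits (and relallvisible), and without it an index-only scan degrades toward the heap-fetch behavior of a plain index scan.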