[HACKERS] problem with msvc linker - cannot build orafce
Hi, I am trying to build Orafce and I have a problem due to access to the exported variable session_timezone. The build fails with this message:

1> Creating library C:\Users\Pavel\orafce-VERSION_3_1_2\orafce-VERSION_3_1_2\msvc\/bin/x64/9.4/lib/orafce.lib and object C:\Users\Pavel\orafce-VERSION_3_1_2\orafce-VERSION_3_1_2\msvc\/bin/x64/9.4/lib/orafce.exp
1>datefce.obj : error LNK2001: unresolved external symbol session_timezone
1>C:\Users\Pavel\orafce-VERSION_3_1_2\orafce-VERSION_3_1_2\msvc\/bin/x64/9.4/lib//orafce.dll : fatal error LNK1120: 1 unresolved externals

I am using MSVC 2015 and PostgreSQL 9.4.5 from EnterpriseDB. Any idea what could be broken? Regards, Pavel
Re: [HACKERS] Rework the way multixact truncations work
On Sun, Nov 8, 2015 at 11:52 AM, Noah Misch wrote: >> I'm not following along right now - in order to make cleanups the plan is to >> revert a couple commits and then redo them prettyfied? > > Yes, essentially. Given the volume of updates, this seemed neater than > framing those updates as in-tree incremental development. I think that's an odd way of representing this work. I tend to remember roughly when major things were committed even years later. An outright revert should represent a total back out of the original commit IMV. Otherwise, a git blame can be quite misleading. I can imagine questioning my recollection, even when it is accurate, if only because I don't expect this. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] New email address
Yahoo recently changed their DMARC policy, and after some investigation and a support case with Yahoo, it is now clear that their email systems can no longer be used with the postgresql.org lists. I've migrated from kgri...@ymail.com to kgri...@gmail.com. In the process I noticed that some people have been sending mail intended for my attention to the kgri...@mail.com address that I migrated from years ago. Because of MajorDomo's elimination of duplicates, replying to a COMMITTERS message and adding my old email address caused it to go to the email equivalent of /dev/null. Apologies for what I missed due to that. I've dropped old addresses from the MajorDomo aliases list, which should help prevent a recurrence. You may want to purge the old addresses from any address book entries for me. After a little settling time I will set up auto-reply messages to point anyone who sends to those addresses to the current address. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] custom function for converting human readable sizes to bytes
On 24/11/15 06:31, Pavel Stehule wrote: 2015-11-23 18:04 GMT+01:00 Tom Lane: Jim Nasby writes: > On 11/23/15 3:11 AM, Corey Huinker wrote: >> +1 to both pg_size_bytes() and ::bytesize. Both contribute to making the >> statements more self-documenting. > The function seems like overkill to me if we have the type. Just my > opinion though. I'm thinking the type could just be called 'size' too > (or prettysize?). No reason it has to be tied to bytes (in particular > this would work for bits too). Please, no. That's *way* too generic a name. I do not actually agree with making a type for this anyway. I can tolerate a function, but adding a datatype is overkill; and it will introduce far more definitional issues than it's worth. (eg, which other types should have casts to/from it, and at what level) so pg_size_bytes is good enough for everybody? Regards Pavel regards, tom lane perhaps pg_size_bites for those people who want: KiB, MiB, GiB, TiB, PiB, ... ??? :-) Cheers, Gavin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Declarative partitioning
Robert Haas wrote: > I support building incrementally, but I don't see why we want to > change the catalog structure and then change it again. That seems > like it makes the project more work, not less. I agree with what you say. I thought you were saying that the implementation had to provide multi-partitioning from the get-go, not just the design. > To me, it seems like there is a pretty obvious approach here: each > table can be either a plain table, or a partition root (which can look > just like an empty inheritance parent). Then multi-level partitioning > falls right out of that design without needing to do anything extra. Sounds reasonable. > I think it is also worth getting the syntax right from the beginning. Yes, that's critical. We could implement the whole thing in gram.y and then have the unsupported cases throw errors; then it's easy to see that there are no grammar conflicts to deal with later. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] problem with msvc linker - cannot build orafce
Pavel Stehule writes: > I am trying to build Orafce and I have a problem due to access to the exported > variable session_timezone. > Any idea what could be broken? Lack of PGDLLIMPORT on the extern declaration, no doubt. The fact that we've not heard this before implies that either nobody has ever tried to use orafce on Windows, or it only very recently grew a dependency on session_timezone. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
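For readers not following the Windows build details: PGDLLIMPORT is the decoration MSVC needs on a global variable before extension code can reference it across the DLL boundary. A minimal sketch of the kind of one-line change Tom is pointing at follows; the file name and the exact shape of the declaration are from memory (session_timezone is declared in src/include/pgtime.h), so treat this as illustrative rather than an actual patch.

/* src/include/pgtime.h -- illustrative excerpt, not an actual patch.
 * The existing declaration is a plain
 *     extern pg_tz *session_timezone;
 * which an MSVC-built extension such as orafce cannot link against
 * (hence the LNK2001 from datefce.obj).  Adding the marker lets the
 * linker resolve the reference through the DLL's import table:
 */
extern PGDLLIMPORT pg_tz *session_timezone;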
Re: [HACKERS] custom function for converting human readable sizes to bytes
On Sun, Nov 22, 2015 at 11:24 PM, Pavel Stehule wrote: > > > 2015-11-22 23:54 GMT+01:00 Corey Huinker : > >> What about pg_size_unpretty()? >>> >> I was going to suggest pg_size_ugly(), but unpretty does emphasize the >> inverse (rather than opposite) nature of the function. >> > > "unpretty", "ugly" aren't good names > > maybe pg_size_bytes or different approach > > we can introduce data type - bytesize - default input/output is human > readable text - and conversion to bigint is implicit > > Some like > > select .. where pg_table_size(oid) > bytesize '3.5GB' > > and instead pg_size_pretty(pg_table_size(oid)) we can write > pg_table_size(oid)::bytesize > > Regards > > Pavel > +1 to both pg_size_bytes() and ::bytesize. Both contribute to making the statements more self-documenting.
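As a rough illustration of what a pg_size_bytes()-style function has to do, here is a stand-alone C sketch of parsing a human-readable size into a byte count. The unit table, the base-2 interpretation, and the error handling are illustrative choices made for this example; they are not taken from any proposed patch.

/* sketch_size_bytes.c -- stand-alone illustration, not PostgreSQL code */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <stdint.h>

/* Parse strings such as "3.5GB" or "512 MB" into a byte count.  Units are
 * interpreted as base-2 multiples, the way pg_size_pretty() produces them;
 * returns -1 on an unrecognized unit. */
static int64_t
size_bytes(const char *str)
{
    static const struct { const char *unit; int shift; } units[] = {
        {"bytes", 0}, {"kB", 10}, {"MB", 20}, {"GB", 30}, {"TB", 40}
    };
    char   *endptr;
    double  num = strtod(str, &endptr);
    size_t  i;

    while (isspace((unsigned char) *endptr))
        endptr++;
    if (*endptr == '\0')
        return (int64_t) num;       /* bare number: already bytes */
    for (i = 0; i < sizeof(units) / sizeof(units[0]); i++)
    {
        if (strcmp(endptr, units[i].unit) == 0)
            return (int64_t) (num * ((int64_t) 1 << units[i].shift));
    }
    return -1;
}

int
main(void)
{
    printf("%lld\n", (long long) size_bytes("3.5GB"));  /* 3758096384 */
    printf("%lld\n", (long long) size_bytes("512 MB")); /* 536870912 */
    return 0;
}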
Re: [HACKERS] onlyvalue aggregate (was: First Aggregate Funtion?)
On Wed, Oct 28, 2015 at 5:03 PM, Marko Tiikkaja wrote: > SELECT a, sum(amount), onlyvalue(rolling_count) > FROM > ( > SELECT a, amount, count(*) OVER (ORDER BY a) AS rolling_count > FROM tbl > ) ss > GROUP BY a; The same thing would happen even in the more common case of having functionally dependent columns if they happen to be buried in a subquery. That might well be convenient if you have some expression you want to use in multiple aggregates such as: SELECT pk, acol, avg(x), min(x), max(x) FROM ( SELECT a.pk, a.acol, b.c+b.d+b.e AS x FROM a JOIN b ON (a.pk = b.fk) ) GROUP BY pk Postgres would happily accept that if you collapsed the subquery and ran the group by directly on the join but the subquery in between is actually enough to hide the functional dependency information so it complains that acol is not functionally dependent on the group by column. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [DESIGN] ParallelAppend
On Fri, Nov 20, 2015 at 7:06 PM, Robert Haas wrote: > > On Fri, Nov 20, 2015 at 12:45 AM, Amit Kapila wrote: > > Okay, but I think that's not what I am talking about. I am talking about > > below code in cost_seqscan: > > > > - if (nworkers > 0) > > > > - run_cost = run_cost / (nworkers + 0.5); > > > > + if (path->parallel_degree > 0) > > > > + run_cost = run_cost / (path->parallel_degree + 0.5); > > > > > > It will consider 50% of master backend's effort for scan of each child > > relation, > > does that look correct to you? Wouldn't 50% of master backend's effort be > > considered to scan all the child relations? > > In the code you originally wrote, you were adding 1 there rather than > 0.5. That meant you were expecting the leader to do as much work as > each of its workers, which is clearly a bad estimate, because the > leader also has to do the work of gathering tuples from the workers. > 0.5 might not be the right value, but it's surely better than 1. > Without this patch, that 0.5 (or 50% of the leader's effort) is considered for the Gather node irrespective of the number of workers or other factors, but I think with the patch that is no longer true and that's what I am worrying about. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
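To put numbers on the estimate being debated, here is a stand-alone rendition of the quoted formula; the cost value and worker count are invented for the example, and this is not the actual cost_seqscan() code.

#include <stdio.h>

int
main(void)
{
    double run_cost = 1000.0;    /* hypothetical scan cost for one child relation */
    int    parallel_degree = 3;  /* planned number of workers */

    /* The divisor credits the leader with half a worker's effort for this
     * particular path: 1000 / 3.5 is roughly 285.7.  Amit's concern is that
     * this half-leader credit is applied per child path rather than once
     * for the Gather node. */
    run_cost = run_cost / (parallel_degree + 0.5);

    printf("per-worker run_cost = %.1f\n", run_cost);
    return 0;
}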
Re: [HACKERS] onlyvalue aggregate (was: First Aggregate Funtion?)
On 21 November 2015 at 03:54, Marko Tiikkaja wrote: > Here's v2 of the patch. How's this look? > Here are some initial review comments: * My first thought on reading this patch is that it is somewhat under-commented. For example, I would expect at least a block comment at the top of the new code added by this patch. Also the fields of the new structure could use some comments -- it might be obvious what datum and isnull are for, but fcinfo is less obvious until you read the code that uses it. Likewise the transfn is quite long, with almost no explanatory comments. * There's a clear bug in the null handling in the second branch of the transfn -- when the new value is null and the previous value wasn't. For example: SELECT single_value(x) FROM (VALUES ('x'), (null)) t(x); server closed the connection unexpectedly * In the finalfn, I think calling AggCheckCallContext() should be the first thing it does. Compare for example array_agg_array_finalfn(). * There's another less obvious bug in the way these functions handle complex types. For example: SELECT single_value(t) FROM (VALUES (1,'One'), (1,'One')) t(x,y); ERROR: cache lookup failed for type 2139062143 You might want to look at how array_agg() handles that. Also the code behind array_position() has some elements in common with this patch. * Consider collation handling, as illustrated by array_position(). So I'm afraid there's still some work to do, but there are plenty of examples in existing code to borrow from. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
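To make the null-handling bug concrete: the crash comes from the transition function assuming that once transition state exists, the stored value is non-NULL. A simplified, stand-alone sketch of the state handling being asked for follows. It is not Marko's patch; the real transition function works with FunctionCallInfo, the type's equality operator, and aggregate memory contexts, all elided here (the values_equal callback is a stand-in).

#include "postgres.h"

/* Simplified aggregate state: track "seen a row" separately from "that row
 * was NULL", so that a NULL arriving after a non-NULL value (or vice versa)
 * raises an error instead of being dereferenced. */
typedef struct OnlyValueState
{
    bool    seen;       /* has any input row been consumed yet? */
    bool    isnull;     /* was the first value NULL? */
    Datum   value;      /* the first value, if it was not NULL */
} OnlyValueState;

static void
onlyvalue_advance(OnlyValueState *state, Datum newval, bool newisnull,
                  bool (*values_equal) (Datum, Datum))
{
    if (!state->seen)
    {
        /* First row: remember it, NULL or not. */
        state->seen = true;
        state->isnull = newisnull;
        if (!newisnull)
            state->value = newval;
        return;
    }

    /* Later rows must match the first one, including its NULL-ness. */
    if (state->isnull != newisnull ||
        (!newisnull && !values_equal(state->value, newval)))
        elog(ERROR, "more than one distinct value passed to single_value");
}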
Re: [HACKERS] Identify user requested queries
On 23 November 2015 at 13:27, Praveen Mwrote: > Hi All, > > When the user attempts to make a connection with the database , the code > will look into various pg_catalog tables internally. However the user also > can query the pg_catalog tables. Is there a way to identify the user > requested (or typed query) vs the system requested (internal) queries? > As far as I know there is no simple and reliable method but I'm no expert. Most system accesses to common catalogs use the syscache, which doesn't go through the SQL parse/bind/execute process. Or they construct simple scans directly, again bypassing the full parser. The system will run internal queries with the SPI though, and that's full-fledged SQL. Triggers, rules, views, etc, use the SPI, as does plpgsql, fulltext search, XML support, and a few other parts of the system. So you cannot assume that anything using SQL is user-originated. Take a look at PostgresMain in src/backend/tcop/postgres.c for the top-level user query entry point. You'll see there that you cannot rely on testing isTopLevel because multiple statements sent as a single query string are treated as if they were a nested transaction block. (see exec_simple_query(), postgres.c around line 962). That'd also cause problems with use of PL/PgSQL. You can't assume that all SPI queries are safe, because the user can run queries via the SPI using plpgsql etc. I don't see any way to do this without introducing the concept of a "system query"... and in PostgreSQL that's not simple, because the system query could cause the invocation of user-defined operators, functions, triggers, etc, that then run user-defined code. You'd have to clear the "system query" flag whenever you entered user-defined code, then restore it on exit. That seems exceedingly hard to get right reliably. Reading between the lines, it sounds like you are looking for a way to limit end-user access to system catalogs as part of a lockdown effort, perhaps related to multi-tenancy. Correct? If so, you may wish to look at the current work on supporting row security on system catalogs, as that is probably closer to what you will need. > Also what procedure or function in the code that indicates the user can > write queries , something like I wanted to know the code where the > connection is created and available for user to use. > Start reading at src/backend/tcop/postgres.c . -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] documentation for wal_retrieve_retry_interval
On 11/19/15 11:26 PM, Michael Paquier wrote: > On Fri, Nov 20, 2015 at 1:33 AM, Peter Eisentraut wrote: >> There is no documentation of what use case the new (in 9.5) parameter >> wal_retrieve_retry_interval is for. The commit message >> (5d2b45e3f78a85639f30431181c06d4c3221c5a1) alludes to something, but >> even that is not clear, and obviously in the wrong place. Could we come >> up with something more to put into the documentation? > > Yeah, we should highlight the fact that recovery can be made more > responsive when attempting to detect WAL. In archive recovery, this > can be translated by the fact that new WAL segments can be detected > more quickly and make recovery more responsive. The opposite is > actually what led to the patch: the requirement was to limit the number > of times the archive host was requested with a server that had low > activity, the archive host being on AWS. > > An idea would be something like the patch attached. Thoughts? Sounds good. Thanks! -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Declarative partitioning
On Mon, Nov 23, 2015 at 1:44 PM, Alvaro Herrera wrote: > Robert Haas wrote: >> I support building incrementally, but I don't see why we want to >> change the catalog structure and then change it again. That seems >> like it makes the project more work, not less. > > I agree with what you say. I thought you were saying that the > implementation had to provide multi-partitioning from the get-go, not > just the design. Well, I *hope* that's going to fall out naturally. If it doesn't, I can live with that. But I hope it will. >> To me, it seems like there is a pretty obvious approach here: each >> table can be either a plain table, or a partition root (which can look >> just like an empty inheritance parent). Then multi-level partitioning >> falls right out of that design without needing to do anything extra. > > Sounds reasonable. Cool. >> I think it is also worth getting the syntax right from the beginning. > > Yes, that's critical. We could implement the whole thing in gram.y and > then have the unsupported cases throw errors; then it's easy to see that > there are no grammar conflicts to deal with later. That's worth considering, too. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] custom function for converting human readable sizes to bytes
On Mon, Nov 23, 2015 at 1:47 PM, Alvaro Herrera wrote: > Pavel Stehule wrote: > >> so pg_size_bytes is good enough for everybody? > > That seems good enough to me. > > I would have it accept GiB and GB and have both transform to base 2, and > have an optional boolean flag whose non-default value turns the GB > interpretation into base 10, leaving the GiB interpretation unaffected. I think it should be orange. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] parallelism and sorting
Hi, I've been thinking about how parallelism interacts with sorting over the last few days and I wanted to share a few preliminary thoughts. I definitely don't have all the answers worked out here yet, so thoughts are welcome. But here are a few observations: 1. Parallel sort is useful but within parallel queries and for utility commands like CREATE INDEX. Index builds can take a long time, and that time is often dominated by the time needed to sort the data, so being able to do that faster would be good. 2. Within parallel query, there are two reasons to care about data that is in sorted order. First, we might need to deliver the results to the user in a particular order, because they've specified ORDER BY whatever. Second, the optimal join strategy might be a merge join, which requires that both relations be sorted according to the join key.[1] 3. The current Gather node reads tuples from the workers in round-robin fashion, skipping over workers that don't have a tuple ready yet when it needs one. It seems potentially useful to have a Gather Merge node which would assume that the results from each worker are ordered with respect to each other, and do a final merge pass over those. Then we could get the toplevel query ordering we want using a plan like this: Gather Merge -> Sort -> Parallel Seq Scan on foo Filter: something 4. Gather Merge would be an executor node, and thus not available to any code that uses tuplesort.c directly. Also, it seems fairly mediocre for merge joins. The best we could do is something like this:[2] Merge Join -> Gather Merge -> Sort -> Parallel Seq Scan -> Gather Merge -> Sort -> Parallel Seq Scan The problem with this plan is that the join itself is not done in parallel, only the sorting. That's not great, especially if there are more joins that need to be done afterwards, necessarily not in parallel.[2] It's possible that one side of the join could be an Index Scan rather than Gather Merge -> Sort -> Parallel Seq Scan, but that doesn't change the overall picture here much. 5. Really nailing the merge join case seems to require partitioning both relations in a fashion compatible with the join attribute, and then joining the partitions separately. Consider an operator Repartition which reads rows from its child plan and returns those where hash(joincol) % NumberOfWorkers == MyWorkerNumber. The rest are sent to the worker whose worker number is hash(joincol) % NumberOfWorkers and are returned by its copy of the corrresponding Repartition operator. Then we could express a merge join reasonably well as: Gather (Merge) -> Merge Join -> Sort -> Repartition -> Parallel Seq Scan -> Sort -> Repartition -> Parallel Seq Scan The Gather could be a Gather Merge if the merge join ordering matches the final output ordering, or a simple Gather if it doesn't. Additional join steps could be inserted between the Gather (Merge) operator and the merge join. So this is a big improvement over the plan shown under point #4. However, it's probably still not optimal, because we probably want to have substantially more partitions than there are workers. Otherwise, if some workers finish before others, it's hard to spread the load. Getting this right probably requires some sort of cooperation between Gather and Repartition where they agree on a number of partitions and then the workers repeatedly pick a partition, run the plan for that partition, and then loop around to get the next unfinished partition until all are completed. 6. 
Even without repartitioning, if one side of the join has a usable index, we could instead do this: Gather (Merge) -> Merge Join -> Sort -> Parallel Seq Scan -> Index Scan However, this might not be a good idea: we'll scan the index once per worker. If we had a Parallel Index Scan which worked like a Parallel Seq Scan, in that it returned only a subset of the results in each worker but in the same order that the non-parallel version would have returned them, we could instead do this, which might or might not be better: Gather (Merge) -> Merge Join -> Sort -> Repartition -> Parallel Seq Scan -> Repartition -> Parallel Index Scan Here we scan the index just once (spread across all the workers) but we've got to repartition the rows we read from it, so I'm not sure how that's going to work out. Parallel index scan is of course useful apart from merge joins, because you can do something like this to preserve the ordering it creates: Gather Merge -> Nested Loop -> Parallel Index Scan on a -> Index Scan on b Index Qual: b.x = a.x 7. Another option, instead or in addition to introducing a Repartition operator, is to make the sort itself parallel-aware. Each worker reads rows until it fills work_mem, quicksorts them, and dumps them out as a run. Suppose there are few enough runs that we don't need multiple merge passes, and that we have some way of making every worker
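For what it's worth, the routing rule behind the Repartition operator described in point 5 is just a modulo test on a hash of the join column. A stand-alone sketch follows; the hash function itself and the mechanics of shipping a row to the owning worker's copy of the operator are not shown.

#include <stdbool.h>
#include <stdint.h>

/* Does this worker keep the row, per point 5 above:
 * hash(joincol) % NumberOfWorkers == MyWorkerNumber */
bool
keep_row_locally(uint32_t joincol_hash, int n_workers, int my_worker_number)
{
    return (int) (joincol_hash % (uint32_t) n_workers) == my_worker_number;
}

/* Otherwise, the worker whose Repartition copy should return the row: */
int
owning_worker(uint32_t joincol_hash, int n_workers)
{
    return (int) (joincol_hash % (uint32_t) n_workers);
}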
Re: [HACKERS] parallelism and sorting
On Mon, Nov 23, 2015 at 05:01:43PM -0500, Robert Haas wrote: > Hi, > > [snip] > > If we had a Parallel Index Scan which worked like a Parallel Seq > Scan, That sounds like a very handy thing to have. Any idea whether it's possible for 9.6? Is there any of the Parallel Seq Scan code that looks like it could be reused or slightly generalized for the implementation? Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Optimizing away second VACUUM heap scan when only BRIN indexes on table
The BRIN README notes: """ Since no heap TIDs are stored in a BRIN index, it's not necessary to scan the index when heap tuples are removed. It might be that some summary values can be tightened if heap tuples have been deleted; but this would represent an optimization opportunity only, not a correctness issue. It's simpler to represent this as the need to re-run summarization on the affected page range rather than "subtracting" values from the existing one. This is not currently implemented. """ While I haven't studied the question in depth, I'm pretty sure that what we do to "VACUUM" BRIN indexes has no dependency on the heap TIDs that are pointed to by the index, because there really aren't any. The README goes on to say: """ Note that if there are no indexes on the table other than the BRIN index, usage of maintenance_work_mem by vacuum can be decreased significantly, because no detailed index scan needs to take place (and thus it's not necessary for vacuum to save TIDs to remove). It's unlikely that BRIN would be the only indexes in a table, though, because primary keys can be btrees only, and so we don't implement this optimization. """ I don't think it's realistic to suppose that BRIN indexes will rarely be the only indexes on a table. I'm not especially concerned about maintenance_work_mem in this scenario, though -- I'm much more concerned about the possibly unnecessary second heap scan during a VACUUM of what is presumably a very large table. A second heap scan occurs (lazy_vacuum_heap() is called), where that would not occur if there were no indexes whatsoever (see commit ed8969b1). I realize that the second scan performed by lazy_vacuum_heap() only visits those pages known to contain dead tuples. However, the experience of seeing problems with the random sampling of ANALYZE makes me think that that might not be very helpful. There is no good reason to think that there won't be a uniform distribution of dead tuples across the heap, and so only visiting pages known to contain dead tuples might be surprisingly little help even when there are relatively few VACUUM-able tuples in the table. Has any thought been given to how we could make VACUUM avoid a second heap scan iff there are only BRIN indexes, without compromising anything else? In other words, by killing heap TIDs *before* any "VACUUM" of BRIN index(es) occurs, avoiding a call to lazy_vacuum_heap(), just as when there are no indexes on the table whatsoever? -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] problem with msvc linker - cannot build orafce
On 11/23/15 15:14, Tom Lane wrote: > Pavel Stehulewrites: >> I am trying to build Orafce and I have problem due access to exported >> variable session_timezone. >> Any idea what can be broken? > > Lack of PGDLLIMPORT on the extern declaration, no doubt. > > The fact that we've not heard this before implies that either nobody has > ever tried to use orafce on Windows, or it only very recently grew a > dependency on session_timezone. Or something changed in the build process. I've had Ken Olson report the same symbol inaccessible when he builds PL/Java with MSVC, and he says it has happened since PG 9.4. I read myself to sleep about a week ago catching up on this old thread: http://www.postgresql.org/message-id/24290.1391303...@sss.pgh.pa.us What I (think I) took away from it was: 1. Un-PGDLLIMPORTed references to global *functions* work ok. Maybe they are thunked and a little less efficient, but they work. 2. Un-PGDLLIMPORTed references to global *variables*, not so much. They used to silently link (at least on some buildfarm critters) but hold bogus data (maybe a thunk, taken as data?). 3. The one concrete *action* taken in the course of that thread was to tweak the build process to make sure such cases at least *fail* because that's better than silent bogosity. So it's possible that (3) is what makes both Orafce and PL/Java seem to have started failing "recently" even though the code in question may be years old (and for most of that time, while linking without complaint, may have been bogus without someone noticing). The question that interests me most right now: how, if at all, can the extension author/maintainer work around this issue when it crops up? Obviously, the Right Thing To Do is report it and get the PGDLLIMPORT added here, but still for years to come the extension will have to cope with being built against PG distributions that lack it. (Never mind the reason, when the extension doesn't build, it's the extension that looks bad.) Now, I thought I spotted, somewhere in that long thread, the hint of an idea that the magic works as long as the *extension* has the variable declared PGDLLIMPORT, even if it wasn't declared that way when the PG executable itself was built. Anybody else remember that, or did I imagine it? The snag seemed to be, MSVC won't tolerate two extern declarations, one PGDLLIMPORT and one without, so you can't get away with including the .h file (which might not have the PGDLLIMPORT) and then declaring it yourself. You *might* get away with creating a separate C file (how about chamberofhorrors.c?) that, rather revoltingly, *doesn't* include the proper PostgreSQL .h files, only duplicates the necessary declarations with PGDLLIMPORT added, and exports some getter/setter methods to the rest of the extension code. (I like the idea of one chamberofhorrors better than scattering such rubbish all over the project.) That might work ok for log_min_messages and client_min_messages, which are just ints. Trickier maybe for session_timezone, which is a pg_tz, so you can't really avoid pgtime.h. That leaves even more revolting options, like #define session_timezone decoy_session_timezone #include #undef session_timezone extern PGDLLIMPORT pg_tz session_timezone pg_tz decoy_session_timezone; /* unused */ Has anyone got the stomach to try such a thing and see what happens? I don't have MSVC here. -Chap -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
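Spelled out with line breaks, the trick being floated above would look roughly like the following. The inclusion of pgtime.h (the header the text names as unavoidable) and the pointer type of session_timezone are my reading of the existing declaration; the whole file is as untested as the original suggestion.

/* chamberofhorrors.c -- untested sketch of the #define workaround above */

/* Hide the real name while the PostgreSQL headers are included... */
#define session_timezone decoy_session_timezone
#include "postgres.h"
#include "pgtime.h"
#undef session_timezone

/* ...then declare the variable ourselves, this time with PGDLLIMPORT. */
extern PGDLLIMPORT pg_tz *session_timezone;

pg_tz *decoy_session_timezone;   /* unused */

/* Getter exported to the rest of the extension, so no other translation
 * unit has to reference the symbol directly. */
pg_tz *
get_session_timezone(void)
{
    return session_timezone;
}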
Re: [HACKERS] Identify user requested queries
Hi Craig, Thanks for the input. I guess i need to read more code and see if it is achievable. I started looking into the code very recently. Your inputs is very valuable to me. Thanks. Yes I am trying to do something similar to multi-tenancy. I will look at the row level security. Thanks Praveen On Mon, Nov 23, 2015 at 2:16 AM, Craig Ringerwrote: > On 23 November 2015 at 13:27, Praveen M wrote: > >> Hi All, >> >> When the user attempts to make a connection with the database , the code >> will look into various pg_catalog tables internally. However the user also >> can query the pg_catalog tables. Is there a way to identify the user >> requested (or typed query) vs the system requested (internal) queries? >> > > As far as I know there is no simple and reliable method but I'm no > expert. > > Most system accesses to common catalogs use the syscache, which doesn't go > through the SQL parse/bind/execute process. Or they construct simple scans > directly, again bypassing the full parser. The system will run internal > queries with the SPI though, and that's full-fledged SQL. Triggers, rules, > views, etc, use the SPI, as does plpgsql, fulltext search, XML support, and > a few other parts of the system. So you cannot assume that anything using > SQL is user-originated. > > Take a look at PostgresMain in src/backend/tcop/postgres.c for the > top-level user query entry point. You'll see there that you cannot rely on > testing isTopLevel because multiple statements sent as a single query > string are treated as if they were a nested transaction block. > (see exec_simple_query(), postgres.c around line 962). That'd also cause > problems with use of PL/PgSQL. > > You can't assume that all SPI queries are safe, because the user can run > queries via the SPI using plpgsql etc. > > I don't see any way to do this without introducing the concept of a > "system query"... and in PostgreSQL that's not simple, because the system > query could cause the invocation of user-defined operators, functions, > triggers, etc, that then run user-defined code. You'd have to clear the > "system query" flag whenever you entered user-defined code, then restore it > on exit. That seems exceedingly hard to get right reliably. > > Reading between the lines, it sounds like you are looking for a way to > limit end-user access to system catalogs as part of a lockdown effort, > perhaps related to multi-tenancy. Correct? If so, you may wish to look at > the current work on supporting row security on system catalogs, as that is > probably closer to what you will need. > > >> Also what procedure or function in the code that indicates the user can >> write queries , something like I wanted to know the code where the >> connection is created and available for user to use. >> > > Start reading at src/backend/tcop/postgres.c . > > -- > Craig Ringer http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services >
Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
Thanks for the review. On Tue, Nov 24, 2015 at 6:15 AM, Alvaro Herrera wrote: > I just noticed that RecoveryTest.pm is lacking "use strict; use > warnings;". With those added, there's a number of problems reported: > Most of them are easily fixable by adding the correct "my" lines; but at > least @array and $current_dir require more code to be written. > Oops. > TBH all that business with arrays that are kept in sync looks too > contrived to me. Could we have a Perl object representing each node > instead? > Not really to be honest. > That would require a "PostgresNode" package (or similar). The > RecoveryTest.pm would have a single %nodes hash. Also, you don't need > @active_nodes, just a flag in PostgresNode, and have the stop routine do > nothing if node is not marked active. Also: if you pass the "root node" > when creating a node that will become a standby, you don't need to pass > it when calling, say, enable_streaming; the root node becomes an > instance variable. (Hmm, actually, if we do that, I wonder what if in > the future we want to test node promotion and a standby is repointed to > a new master. Maybe we don't want to have this knowledge in the Perl > code at all.) > I think I get the idea. In short all the parametrization will just happen at the object level, as well as basic actions on the nodes like start, stop, restart etc. > In get_free_port, isn't it easier to use pg_isready rather than psql? > Will switch. > I've been messing with 003 because I think it's a bit too repetitive. > Will finish it after you post a fixed version of RecoveryTest.pm. > Sure, thanks. I'll rework this patch and will post a new version soon. Thanks again for the review. -- Michael
Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
On Tue, Nov 24, 2015 at 6:27 AM, Alvaro Herrera wrote: > Michael Paquier wrote: >> On Thu, Nov 19, 2015 at 12:21 AM, Alvaro Herrera >> wrote: > >> > Hi, I just started looking this over a bit. The first thing I noticed >> > is that it adds a dependency on Archive::Tar which isn't already used >> > anywhere else. Did anybody check whether this exists back in 5.8 >> > installations? >> >> Actually I didn't and that's a good point, we have decided to support >> TAP down to 5.8.9. The only reason why I introduced this dependency is >> that there is no easy native way to copy an entire folder in Perl, and >> that's for handling base backups. There are things like File::NCopy or >> File::Copy::Recursive, however it does not seem like a good idea to >> depend on modules other than IPC::Run. Would it be better to have an >> in-core module dedicated to that similar to SimpleTee.pm? Or are you >> guys fine with accepting a dependency on another module? > > It would be a lot better to not have to rely on another module existing > everywhere. I'd rather have another simple module, following > SimpleTee's example. Since this doesn't have to be terribly generic, it > should be reasonably short, I hope. Sure, that would be a simple function that does directory lookup and recursive calls. I'll move ahead with that then and reuse it in the recovery logic. >> > Why is "recovery" added to ALWAYS_SUBDIRS in src/test/Makefile instead >> > of to SUBDIRS? Seems a strange choice. >> >> Because I thought that it should not be part of the main regression >> suite, like ssl/. Feel free to correct me if my feeling is wrong. > > As I understand, the problem with "ssl" is that it messes with > system-wide settings, which is not the case here. I'm inclined to move > it to SUBDIRS. As an example, "modules" is not part of the main > regression suite either. OK, I'll move it there then. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Minor comment edits in nodeGather.c
While going through nodeGather.c, I noticed portions of the file header comment that may have been obsoleted by recent revisions of the relevant parallelism code. For example, there is a reference to a PartialSeqScan node which did not make it into the tree. Attached fixes it. Also, I am wondering whether the semantics of the Gather node are those of a Scan or of a more generic Plan. That is to ask whether the following edit makes sense:

 * nodeGather.c
- *    Support routines for scanning a plan via multiple workers.
+ *    Support routines for getting the result from a plan via multiple
+ *    workers.
  *

Thanks, Amit

diff --git a/src/backend/executor/nodeGather.c b/src/backend/executor/nodeGather.c
index b6e82d1..8c6d3e3 100644
--- a/src/backend/executor/nodeGather.c
+++ b/src/backend/executor/nodeGather.c
@@ -1,7 +1,8 @@
 /*-------------------------------------------------------------------------
  *
  * nodeGather.c
- *    Support routines for scanning a plan via multiple workers.
+ *    Support routines for getting the result from a plan via multiple
+ *    workers.
  *
  * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -11,7 +12,8 @@
  * or have not started up yet.  It then merges all of the results it produces
  * and the results from the workers into a single output stream.  Therefore,
  * it will normally be used with a plan where running multiple copies of the
- * same plan does not produce duplicate output, such as PartialSeqScan.
+ * same plan does not produce duplicate output, such as parallel-aware
+ * SeqScan.
  *
  * Alternatively, a Gather node can be configured to use just one worker
  * and the single-copy flag can be set.  In this case, the Gather node will

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New email address
On Nov 24, 2015 01:05, "Michael Paquier" wrote: > > On Tue, Nov 24, 2015 at 3:41 AM, Kevin Grittner wrote: > > Yahoo recently changed their DMARC policy, and after some > > investigation and a support case with Yahoo, it is now clear that > > their email systems can no longer be used with the postgresql.org > > lists. I've migrated from kgri...@ymail.com to kgri...@gmail.com. > > Something to be aware of as well: I noticed that sometimes your emails > coming from @ymail.com were flagged as spam by gmail. People be > careful of that if you use it. That's a direct effect of the DMARC policy change. Yahoo no longer supports their customers using mailing lists. They changed their policies for such emails to hard reject, which makes Gmail (and presumably others) stick them in spam. It would happen to all the emails except the ones where you are on direct cc. /Magnus
Re: [HACKERS] documentation for wal_retrieve_retry_interval
On Mon, Nov 23, 2015 at 11:25 PM, Peter Eisentraut wrote: > Sounds good. Thanks! Great. Thanks for considering it! -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parallelism and sorting
On Mon, Nov 23, 2015 at 5:38 PM, David Fetter wrote: > That sounds like a very handy thing to have. Any idea whether it's > possible for 9.6? Is there any of the Parallel Seq Scan code that > looks like it could be reused or slightly generalized for the > implementation? I think it would be a good idea to pattern a hypothetical Parallel Index Scan feature after what we did in commits ee7ca559fcf404f9a3bd99da85c8f4ea9fbc2e92 and f0661c4e8c44c0ec7acd4ea7c82e85b265447398, which are only about 500 lines of code combined, but I don't expect any direct code reuse to be possible. However: 1. Parallel Seq Scan is easier because we have, at present, only one heapam API. Partial Index Scan is likely to be more complicated because we need to deal not only with the indexam API but also with the individual access methods (btree, etc.). 2. In Parallel Seq Scan, the determination of what page to scan next isn't dependent on the contents of any page previously scanned. In Parallel Index Scan, it is. Therefore, the amount of effective parallelism is likely to be less. This doesn't mean that trying to parallelize things here is worthless: one backend can be fetching the next index page while some other backend is processing the tuples from a page previously read. 3. Without Gather Merge, it figures to be mostly useless, because a straight Gather node is order-destroying. I'm not prepared to speculate on whether this will get done for 9.6 at this point. I'll say it would be nice. :-) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New email address
On Tue, Nov 24, 2015 at 3:41 AM, Kevin Grittner wrote: > Yahoo recently changed their DMARC policy, and after some > investigation and a support case with Yahoo, it is now clear that > their email systems can no longer be used with the postgresql.org > lists. I've migrated from kgri...@ymail.com to kgri...@gmail.com. Something to be aware of as well: I noticed that sometimes your emails coming from @ymail.com were flagged as spam by gmail. People be careful of that if you use it. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
> On Fri, Nov 20, 2015 at 12:11 AM, Kouhei Kaigaiwrote: > >> On Thu, Nov 19, 2015 at 6:39 AM, Kouhei Kaigai > >> wrote: > >> > So, are you suggesting to make a patch that allows ForeignScan to have > >> > multiple sub-plans right now? Or, one sub-plan? > >> > >> Two: > >> > http://www.postgresql.org/message-id/CA+TgmoYZeje+ot1kX4wdoB7R7DPS0CWXAzfqZ- > >> 14ykfkgkr...@mail.gmail.com > >> > > Hmm. Two is a bit mysterious for me because two sub-plans (likely) > > means this ForeignScan node checks join clauses and reconstruct > > a joined tuple by itself but does not check scan clauses pushed- > > down (it is job of inner/outer scan plan, isn't it?). > > In this case, how do we treat N-way remote join cases (N>2) if we > > assume such a capability in FDW driver? > > > > One subplan means FDW driver run an entire join sub-tree with local > > alternative sub-plan; that is my expectation for the majority case. > > However, I cannot explain two subplans, but not multiple, well. > > What I'm imagining is that we'd add handling that allows the > ForeignScan to have inner and outer children. If the FDW wants to > delegate the EvalPlanQual handling to a local plan, it can use the > outer child for that. Or the inner one, if it likes. The other one > is available for some other purposes which we can't imagine yet. If > this is too weird, we can only add handling for an outer subplan and > forget about having an inner subplan for now. > I'd like to agree the last sentence. Having one sub-plan is better (but the second best from my standpoint) than fixed two subplans, because ... > I just thought to make > it symmetric, since outer and inner subplans are pretty deeply baked > into the structure of the system. > Yep, if we would have a special ForeignJoinPath to handle two foreign- tables join, it will be natural. However, our choice allows N-way join at once if sub-plan is consists of three or more foreign-tables. In this case, ForeignScan (scanrelid==0) can represents a sub-plan that shall be equivalent to a stack of joins; that looks like a ForeignScan has inner, outer and variable number of "middler" input streams. If and when we assume ForeignScan has own join mechanism but processes scan-qualifiers by local sub-plans, fixed-number sub-plans are not sufficient. (Probably, it is minority case although.) I'm inclined to put just one outer path at this moment, because the purpose of the FDW sub-plans is EPQ recheck right now. So, we will be able to enhance the feature when we implement other stuffs - more aggressive join push-down for example. Thanks, -- NEC Business Creation Division / PG-Strom Project KaiGai Kohei -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
On 2015/11/24 2:41, Robert Haas wrote: On Fri, Nov 20, 2015 at 12:11 AM, Kouhei Kaigai wrote: One subplan means FDW driver run an entire join sub-tree with local alternative sub-plan; that is my expectation for the majority case. What I'm imagining is that we'd add handling that allows the ForeignScan to have inner and outer children. If the FDW wants to delegate the EvalPlanQual handling to a local plan, it can use the outer child for that. Or the inner one, if it likes. The other one is available for some other purposes which we can't imagine yet. If this is too weird, we can only add handling for an outer subplan and forget about having an inner subplan for now. I just thought to make it symmetric, since outer and inner subplans are pretty deeply baked into the structure of the system. I'd vote for only allowing an outer subplan. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
On 2015/11/20 22:45, Kouhei Kaigai wrote: I wrote: * This patch means we can define fdw_recheck_quals even for the case of foreign tables with non-NIL fdw_scan_tlist. However, we discussed in another thread [1] that such foreign tables might break EvalPlanQual tests. Where are we on that issue? In case of later locking, RefetchForeignRow() will set a base tuple that have compatible layout of the base relation, not fdw_scan_tlist, because RefetchForeignRow() does not have information about scan node. IIUC, I think the base tuple would be stored into EPQ state not only in case of late row locking but in case of early row locking. * For the case of foreign joins, I think fdw_recheck_quals can be defined for example, the same way as for the case of foreign tables, ie, quals not in scan.plan.qual, or ones defined as "otherclauses" (rinfo->is_pushed_down=true) pushed down to the remote. But since it's required that the FDW has to add to the fdw_scan_tlist the set of columns needed to check quals in fdw_recheck_quals in preparation for EvalPlanQual tests, it's likely that fdw_scan_tlist will end up being long, leading to an increase in a total data transfer amount from the remote. So, that seems not practical to me. Maybe I'm missing something, but what use cases are you thinking? It is trade-off. What solution do you think we can have? To avoid data transfer used for EPQ recheck only, we can implement FDW driver to issue remote join again on EPQ recheck, however, it is not a wise design, isn't it? If we would be able to have no extra data transfer and no remote join execution during EPQ recheck, it is a perfect. I was thinking that in an approach using a local join execution plan, I would just set fdw_recheck_quals set to NIL and evaluate the otherclauses as part of the local join execution plan, so that fdw_scan_tlist won't end up being longer, as in the patch [1]. (Note that in that patch, remote_exprs==NIL when calling make_foreignscan during postgresGetForeignPlan in case of foreign joins.) Best regards, Etsuro Fujita [1] http://www.postgresql.org/message-id/5624d583.10...@lab.ntt.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parallelism and sorting
On Mon, Nov 23, 2015 at 2:01 PM, Robert Haaswrote: > I've been thinking about how parallelism interacts with sorting over > the last few days and I wanted to share a few preliminary thoughts. I > definitely don't have all the answers worked out here yet, so thoughts > are welcome. I think it's definitely a good idea to have some high level discussion of these issues now. My responses will in some cases also be high level and aspirational. > 2. Within parallel query, there are two reasons to care about data > that is in sorted order. First, we might need to deliver the results > to the user in a particular order, because they've specified ORDER BY > whatever. Second, the optimal join strategy might be a merge join, > which requires that both relations be sorted according to the join > key.[1] I gather the distinction you're making here is between a Sort node, and a node that happens to use a tuplesort without an explicit Sort node (like a "COUNT(DISTINCT(foo))" *Aggregate* node -- *not* a GroupAggregate node). I am a little concerned cases like this might accidentally not benefit due to not explicitly having a Sort node, as you refer to below. Beyond that, CREATE INDEX and CLUSTER utility cases will also need to be parallelized without all this executor infrastructure. > 3. The current Gather node reads tuples from the workers in > round-robin fashion, skipping over workers that don't have a tuple > ready yet when it needs one. It seems potentially useful to have a > Gather Merge node which would assume that the results from each worker > are ordered with respect to each other, and do a final merge pass over > those. Then we could get the toplevel query ordering we want using a > plan like this: > > Gather Merge > -> Sort > -> Parallel Seq Scan on foo > Filter: something I am of course strongly of the opinion that extending the new, improved, but pending approach to external sorts [1] is the way to go. Using the filesystem as "poor man's shared memory" when you can actually afford real shared memory now seems like much less of a problem than I thought in the past. More on that later. The problem I see here is that having real executor nodes, while preserving various useful properties of an on-the-fly merge implies a degree of cross-node promiscuity that I think won't fly. For one thing, all my tricks with memory pooling during the final on-the-fly merge become iffy, to say the least. For another, the children cannot very well feed SortTuples to the parent using the usual TupleTableSlot mechanism -- we benefit plenty from reuse of SortSupport and so on during the merge. Who would want to reconstruct something SortTuple-like on the other side (within the Gather Merge)? Besides, if one child cannot produce tuples in time, unlike many things there is a legitimate need to hold everything up. I think we should parallelize the Merge in a later release -- possibly much later. It should probably still be presented to users more or less as you outline -- it will just be an implementation detail. IOW, what explain.c calls "special child plans". Actually, the answer here is probably simple -- as you suggest separately, the "Gather Merge" actually does almost the same thing as our existing on-the-fly merge step within tuplesort.c. The difference is only that it gets information about runs to merge from the workers when they finish the sort. 
There is a little bit of bookkeeping in shared memory, plus we revise the tuplesort.c interface to allow what is essentially my new approach to external sorts to happen in phases managed at a slightly lower level by the tuplesort client. The existing interface is preserved, plus a "build your own sort" interface. Clients continue to pass back and forth a little opaque state for tuplesort's benefit, some of which is stored by a Gather-like node in shared memory, but that's it. We need a new tuplesort_end() variant, to free memory early, without releasing tapes, just for this, and the caller needs to know a bit about runs (or that partitioning is a consequence of the number of workers actually available). > 4. Gather Merge would be an executor node, and thus not available to > any code that uses tuplesort.c directly. Also, it seems fairly > mediocre for merge joins. The best we could do is something like > this:[2] > > Merge Join > -> Gather Merge > -> Sort > -> Parallel Seq Scan > -> Gather Merge > -> Sort > -> Parallel Seq Scan > > The problem with this plan is that the join itself is not done in > parallel, only the sorting. That's not great, especially if there are > more joins that need to be done afterwards, necessarily not in > parallel.[2] It's possible that one side of the join could be an > Index Scan rather than Gather Merge -> Sort -> Parallel Seq Scan, but > that doesn't change the overall picture here much. That's not a huge problem, because at least the Sort is the really expensive part. Have you tried
Re: [HACKERS] New email address
On 24 Nov 2015 01:05, "Michael Paquier" wrote: > > On Tue, Nov 24, 2015 at 3:41 AM, Kevin Grittner wrote: > > Yahoo recently changed their DMARC policy, and after some > > investigation and a support case with Yahoo, it is now clear that > > their email systems can no longer be used with the postgresql.org > > lists. I've migrated from kgri...@ymail.com to kgri...@gmail.com. > > Something to be aware of as well: I noticed that sometimes your emails > coming from @ymail.com were flagged as spam by gmail. People be > careful of that if you use it. +1, happened a lot actually.
Re: [HACKERS] parallelism and sorting
On Mon, Nov 23, 2015 at 8:45 PM, Peter Geogheganwrote: >> 2. Within parallel query, there are two reasons to care about data >> that is in sorted order. First, we might need to deliver the results >> to the user in a particular order, because they've specified ORDER BY >> whatever. Second, the optimal join strategy might be a merge join, >> which requires that both relations be sorted according to the join >> key.[1] > > I gather the distinction you're making here is between a Sort node, > and a node that happens to use a tuplesort without an explicit Sort > node (like a "COUNT(DISTINCT(foo))" *Aggregate* node -- *not* a > GroupAggregate node). Yes. Or things that aren't part of the executor at all. > I am a little concerned cases like this might > accidentally not benefit due to not explicitly having a Sort node, as > you refer to below. A valid concern. > Beyond that, CREATE INDEX and CLUSTER utility > cases will also need to be parallelized without all this executor > infrastructure. Or, alternatively, CREATE INDEX and CLUSTER could be refactored to use the executor. This is might sound crazy, but maybe it's not. Perhaps we could have the executor tree output correctly-formed index tuples that get funneled into a new kind of DestReceiver that puts them into the index. I don't know if that's a GOOD idea, but it's an idea. > The problem I see here is that having real executor nodes, while > preserving various useful properties of an on-the-fly merge implies a > degree of cross-node promiscuity that I think won't fly. For one > thing, all my tricks with memory pooling during the final on-the-fly > merge become iffy, to say the least. For another, the children cannot > very well feed SortTuples to the parent using the usual TupleTableSlot > mechanism -- we benefit plenty from reuse of SortSupport and so on > during the merge. Who would want to reconstruct something > SortTuple-like on the other side (within the Gather Merge)? Besides, > if one child cannot produce tuples in time, unlike many things there > is a legitimate need to hold everything up. I think we should > parallelize the Merge in a later release -- possibly much later. The implementation I have in mind for Gather Merge is as follows. Currently, a Gather node has two TupleTableSlots - one for tuples that the leader generates itself by running the plan before the workers get started or when they can't keep up, and a second for tuples read from the workers. What I plan to do is refactor it so that there is one TupleTableSlot per worker. If we're doing a standard Gather, we simply return a tuple from whichever slot we manage to fill first. If we're doing a Gather Merge, we fill every slot, then build a heap of the tuples and return the lowest one. When we need the next tuple, we refill that slot, restore the heap property, lather, rinse, repeat. This is basically the same way MergeAppend works, but instead of reading tuples from multiple subplans, we're reading them from multiple workers. There's really no cross-node promiscuity here - whatever is under the Gather Merge neither knows nor cares what the Gather Merge will do with the tuples, and it does not need to be fed by an explicit sort any more than MergeAppend does. >> 4. Gather Merge would be an executor node, and thus not available to >> any code that uses tuplesort.c directly. Also, it seems fairly >> mediocre for merge joins. 
The best we could do is something like >> this:[2] >> >> Merge Join >> -> Gather Merge >> -> Sort >> -> Parallel Seq Scan >> -> Gather Merge >> -> Sort >> -> Parallel Seq Scan >> >> The problem with this plan is that the join itself is not done in >> parallel, only the sorting. That's not great, especially if there are >> more joins that need to be done afterwards, necessarily not in >> parallel.[2] It's possible that one side of the join could be an >> Index Scan rather than Gather Merge -> Sort -> Parallel Seq Scan, but >> that doesn't change the overall picture here much. > > That's not a huge problem, because at least the Sort is the really > expensive part. OK, but suppose you need to do a hash or nested loop join to another table after the merge join. With this approach, you cannot parallelize that. > Have you tried contriving a merge join test case with a really cheap > sort or pair of sorts? No. My real-world experience, back before I became a full-time hacker, was that hash joins were often faster than nested loops, and merge joins were dog slow. I dunno if that's representative of other people's experience, or whether subsequent releases have changed the picture. > What I found really interesting during my experiments with the new > approach to sorting (simple hybrid sort-merge strategy) was how > performance was very consistent past a certain work_mem setting (about > 1GB IIRC). Lower settings greater than that fuzzy threshold resulted > in a shorter, maybe even much shorter time spent sorting
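The slot-per-worker merge Robert describes above (fill one slot per worker, build a heap, return the minimum, then refill that worker's slot) is an ordinary k-way merge. A stand-alone rendition with sorted integer arrays standing in for the workers' output follows; none of the executor details (TupleTableSlot, comparisons via the sort keys, waiting on a worker that has no tuple ready) are shown.

/* gather_merge_sketch.c -- stand-alone k-way merge illustration */
#include <stdio.h>

#define NWORKERS 3

/* Each "worker" produces an already-sorted stream; here, a fixed array. */
static const int worker_data[NWORKERS][4] = {
    {1, 4, 7, 10},
    {2, 5, 8, 11},
    {3, 6, 9, 12}
};
static int worker_pos[NWORKERS];    /* next unread element per worker */

typedef struct { int value; int worker; } HeapEntry;

static HeapEntry heap[NWORKERS];
static int heap_size;

static void
sift_down(int i)
{
    for (;;)
    {
        int smallest = i, l = 2 * i + 1, r = 2 * i + 2;

        if (l < heap_size && heap[l].value < heap[smallest].value)
            smallest = l;
        if (r < heap_size && heap[r].value < heap[smallest].value)
            smallest = r;
        if (smallest == i)
            break;
        HeapEntry tmp = heap[i];
        heap[i] = heap[smallest];
        heap[smallest] = tmp;
        i = smallest;
    }
}

/* Read the next value from a worker; returns 0 when it is exhausted. */
static int
fetch_from_worker(int worker, int *value)
{
    if (worker_pos[worker] >= 4)
        return 0;
    *value = worker_data[worker][worker_pos[worker]++];
    return 1;
}

int
main(void)
{
    int value, w, i;

    /* Startup: fill one slot per worker, then heapify. */
    for (w = 0; w < NWORKERS; w++)
        if (fetch_from_worker(w, &value))
        {
            heap[heap_size].value = value;
            heap[heap_size].worker = w;
            heap_size++;
        }
    for (i = heap_size / 2 - 1; i >= 0; i--)
        sift_down(i);

    /* Emit the minimum, refill from the same worker, restore the heap. */
    while (heap_size > 0)
    {
        printf("%d ", heap[0].value);
        if (fetch_from_worker(heap[0].worker, &value))
            heap[0].value = value;          /* same worker keeps the slot */
        else
            heap[0] = heap[--heap_size];    /* that worker is exhausted */
        sift_down(0);
    }
    printf("\n");
    return 0;
}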
Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
On 2015/11/09 9:26, Kouhei Kaigai wrote: The attached patch is an adjusted version of the previous one. There seem to be no changes to make_foreignscan. Is that OK? Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Revisiting pg_stat_statements and IN() (Was: Re: pg_stat_statements fingerprinting logic and ArrayExpr)
On Tue, Dec 10, 2013 at 1:30 AM, Peter Geoghegan wrote: > pg_stat_statements' fingerprinting logic considers the following two > statements as distinct: > > select 1 in (1, 2, 3); > select 1 in (1, 2, 3, 4); > > This is because the ArrayExpr jumble case jumbles any ArrayExpr's list > of elements recursively. In this case it's a list of Const nodes, and > the fingerprinting logic jumbles those nodes indifferently. > > Somebody told me that they think that pg_stat_statements should not do > that. This person felt that it would be preferable for such > expressions to be normalized without regard to the number of distinct > Const elements. I suppose that that would work by determining if the > ArrayExpr elements list was a list of Const nodes and only Const > nodes. Iff that turned out to be the case, something else would be > jumbled (something other than the list) that would essentially be a > representation of "some list of zero or more (or maybe one or more) > Const nodes with consttype of, in this example, 23". I think that this > would make at least one person happy, because of course the two > statements above would have their costs aggregated within a single > pg_stat_statements entry. Baron Schwartz recently gave a talk at PGConf Silicon Valley about the proprietary query instrumentation tool, VividCortex. The slides are available from: http://info.citusdata.com/rs/235-CNE-301/images/Analyzing_PostgreSQL_Network_Traffic_with_vc-pgsql-sniffer_-_Baron_Schwartz.pdf One specific justification he gave for not using pg_stat_statements was: "Doesn’t merge bind vars in IN()" (See slide #11) His theory is that you should allow a proprietary binary to run with root permissions, a binary that sniffs the wire protocol, mostly because pg_stat_statements has this limitation (all other pg_stat_statements limitations listed are pretty unconvincing, IMV). That doesn't seem like a great plan to me, but I think he has a point about pg_stat_statements. It's about time that we fixed this -- it isn't realistic to imagine that people are going to know to use an array constant like "= any ('{1,2,3}')" -- even a major contributor to Django that I talked to about this issue a couple of years ago didn't know about that. It also isn't portable across database systems. I wonder: * How do other people feel about this? Personally, I've seen enough problems of this kind in the field that "slippery slope" arguments against this don't seem very compelling. * How might altering the jumbling logic to make it recognize a variable number of constants as equivalent work in practice? Perhaps we should do something to flatten the representation based on which powers of two the number of constants is between. There are still some details to work out there, but that's my first idea. That seems like a good compromise between the current behavior, and completely disregarding the number of constants. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
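One way to picture the power-of-two idea: when an ArrayExpr's element list contains nothing but Consts, jumble the element type plus a bucketed count instead of each individual Const. The C sketch below assumes it would live inside pg_stat_statements.c next to the existing jumbling code; it borrows the real AppendJumble/JumbleExpr helpers but is only an illustration of the proposal, not the actual fingerprinting code.

    /* Sketch: bucket the number of Consts in an all-Const IN () list. */
    static bool
    list_all_consts(List *elements)
    {
        ListCell *lc;

        foreach(lc, elements)
        {
            if (!IsA(lfirst(lc), Const))
                return false;
        }
        return elements != NIL;
    }

    static void
    jumble_arrayexpr(pgssJumbleState *jstate, ArrayExpr *aexpr)
    {
        if (list_all_consts(aexpr->elements))
        {
            int n = list_length(aexpr->elements);
            int bucket = 0;

            while (n > 1)       /* bucket = floor(log2(n)) */
            {
                n >>= 1;
                bucket++;
            }

            /* Jumble the element type and the bucket, not the Consts themselves,
             * so "IN (1,2,3)" and "IN (1,2,3,4)" share a fingerprint while lists
             * of wildly different sizes still stay distinct. */
            AppendJumble(jstate, (const unsigned char *) &aexpr->element_typeid, sizeof(Oid));
            AppendJumble(jstate, (const unsigned char *) &bucket, sizeof(int));
        }
        else
            JumbleExpr(jstate, (Node *) aexpr->elements);   /* fall back to per-element jumbling */
    }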
Re: [HACKERS] custom function for converting human readable sizes to bytes
Jim Nasby writes: > On 11/23/15 3:11 AM, Corey Huinker wrote: >> +1 to both pg_size_bytes() and ::bytesize. Both contribute to making the >> statements more self-documenting. > The function seems like overkill to me if we have the type. Just my > opinion though. I'm thinking the type could just be called 'size' too > (or prettysize?). No reason it has to be tied to bytes (in particular > this would work for bits too). Please, no. That's *way* too generic a name. I do not actually agree with making a type for this anyway. I can tolerate a function, but adding a datatype is overkill; and it will introduce far more definitional issues than it's worth. (eg, which other types should have casts to/from it, and at what level) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Refactoring of LWLock tranches
On Fri, Nov 20, 2015 at 6:53 AM, Ildus Kurbangaliev wrote: > We keep a limited number of LWLocks in base shared memory, why not keep > their tranches in shared memory too? Other tranches can be in local > memory, we just have to save somewhere the highest id of these tranches. I just don't see it buying us anything. The tranches are small and contain only a handful of values. The values need to be present in shared memory but the tranches themselves don't. Now, if it's convenient to put them in shared memory and doesn't cause us any other problems, then maybe there's no real downside. But it's not clear to me that there's any upside either. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
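As a rough sketch of the split being described -- the locks themselves in shared memory, the small tranche descriptor in backend-local memory -- under clearly simplified assumptions: MyTrancheDesc, register_named_tranche, MY_NLOCKS and MY_TRANCHE_ID below are hypothetical stand-ins, not the exact tranche API (which was in flux at the time).

    #include "postgres.h"
    #include "storage/lwlock.h"
    #include "storage/shmem.h"

    /* Hypothetical descriptor: just a handful of values, so it can stay local. */
    typedef struct MyTrancheDesc
    {
        const char *name;           /* human-readable tranche name */
        void       *array_base;     /* where the LWLocks start in shared memory */
        Size        array_stride;   /* spacing between them */
    } MyTrancheDesc;

    static MyTrancheDesc my_tranche;    /* backend-local, not in shared memory */

    extern void register_named_tranche(int tranche_id, MyTrancheDesc *desc); /* hypothetical */

    #define MY_TRANCHE_ID 1001          /* hypothetical id */
    #define MY_NLOCKS     16

    void
    my_extension_shmem_startup(void)
    {
        bool    found;
        /* The locks themselves must live in shared memory so all backends see them.
         * (A real extension would likely use padded locks; this is a sketch.) */
        LWLock *locks = (LWLock *) ShmemInitStruct("my_extension_locks",
                                                   sizeof(LWLock) * MY_NLOCKS, &found);

        my_tranche.name = "my_extension";
        my_tranche.array_base = locks;
        my_tranche.array_stride = sizeof(LWLock);
        register_named_tranche(MY_TRANCHE_ID, &my_tranche);
    }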
Re: [HACKERS] Declarative partitioning
On Fri, Nov 20, 2015 at 6:44 AM, Alvaro Herrera wrote: > Amit Langote wrote: >> On Fri, Nov 20, 2015 at 7:20 PM, Simon Riggs wrote: >> > Drop it?? I think he means "in this initial patch", right Amit L ? >> >> Yes, there was some notion of multi-level partitioning in the earlier >> patch but I removed it from the version I posted on Oct 30. I do >> intend to re-introduce it though. > > I'm with Simon. In my own experience, it's crazy to try to introduce a > huge feature such as this one in a single enormous commit. The last > patch you posted was 300 kb without any SGML changes. > > The way parallel query is being introduced is a good example to follow > (or logical multi-master replication, for that matter) --- one > infrastructure piece at a time. > > Let's build incrementally. I support building incrementally, but I don't see why we want to change the catalog structure and then change it again. That seems like it makes the project more work, not less. To me, it seems like there is a pretty obvious approach here: each table can be either a plain table, or a partition root (which can look just like an empty inheritance parent). Then multi-level partitioning falls right out of that design without needing to do anything extra. If you want a single level of partitioning, you partition the original table. If you want two levels of partitioning, you partition the partitions. If you want three levels of partitioning, you partition those. It's being made out here that limiting ourselves to a single level of partitioning makes things simpler, but it's not very clear to me that this is actually true. I think it is also worth getting the syntax right from the beginning. Even if we decide that patch #1 won't support multi-level partitioning, we should have a plan for the syntax that can be extended to multi-level partitioning. If we decide after releasing partitioning with one syntax that we really wish we'd used some other syntax, that is going to be a really big problem - deprecating the use of => or introducing standard_conforming_strings were projects that took many years to complete. We really only get one shot to get that right. That doesn't mean it's all got to be there in version one, but there had better be a way to extend it to all the things we want to do later or we are going to be deeply, deeply sad. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] custom function for converting human readable sizes to bytes
Pavel Stehule wrote: > so pg_size_bytes is good enough for everybody? That seems good enough to me. I would have it accept GiB and GB and have both transform to base 2, and have an optional boolean flag whose non-default value turns the GB interpretation into base 10, leaving the GiB interpretation unaffected. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
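A small C sketch of that interpretation, purely illustrative: size_unit_multiplier is a hypothetical helper, not the eventual pg_size_bytes implementation. GiB-style units are always base 2, and the decimal interpretation of GB-style units only kicks in when the optional flag asks for it.

    #include "postgres.h"
    #include <string.h>

    /* Sketch of the proposed unit handling; not the committed function. */
    static int64
    size_unit_multiplier(const char *unit, bool decimal_units)
    {
        if (strcmp(unit, "GiB") == 0)
            return INT64CONST(1) << 30;                  /* always base 2 */
        if (strcmp(unit, "GB") == 0)
            return decimal_units ? INT64CONST(1000000000)
                                 : INT64CONST(1) << 30;  /* base 2 by default */
        if (strcmp(unit, "MiB") == 0)
            return INT64CONST(1) << 20;
        if (strcmp(unit, "MB") == 0)
            return decimal_units ? INT64CONST(1000000)
                                 : INT64CONST(1) << 20;
        if (strcmp(unit, "KiB") == 0)
            return INT64CONST(1) << 10;
        if (strcmp(unit, "kB") == 0)
            return decimal_units ? INT64CONST(1000)
                                 : INT64CONST(1) << 10;
        return 1;   /* bare number of bytes */
    }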
Re: [HACKERS] [PROPOSAL] Inputs on forcing VACUUM VERBOSE to write timestamp
On 9/3/15 1:50 PM, Jeff Janes wrote: Unconvinced - sounds like you're just re-inventing log_line_prefix. Many times I've wanted a client_log_line_prefix. If someone wants to invent that, I'd second it. It would be pretty awkward to have to turn that on and off to run a manual vacuum, unless there was a LOGTIME option added to vacuum as well. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Truncating/vacuuming relations on full tablespaces
On 9/4/15 7:04 AM, Thom Brown wrote: But shouldn't we not be creating FSM or VM files when truncating? Maybe, but even then you still need to create a bunch of new files (at least one for the table and one for each index), and AFAIK the first page in each file will be properly initialized, which means each file will be at least BLKSZ. ISTM that the vacuum case is one we'd really want to avoid, though, as it's trickier to work around the problem. What might make sense is a special 'free up space NOW' mode that focuses only on attempting to truncate the relation, because if you can't actually shrink the heap you're not going to make any progress. But since none of this will help at all in the default case where WAL is on the same filesystem as the data, I don't know that it's worth it. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] custom function for converting human readable sizes to bytes
On 11/23/15 3:11 AM, Corey Huinker wrote: +1 to both pg_size_bytes() and ::bytesize. Both contribute to making the statements more self-documenting. The function seems like overkill to me if we have the type. Just my opinion though. I'm thinking the type could just be called 'size' too (or prettysize?). No reason it has to be tied to bytes (in particular this would work for bits too). If we're going to add this, I suppose it should support the 'i prefixes' too (GiB, MiB, etc). -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [DESIGN] ParallelAppend
On Mon, Nov 23, 2015 at 7:45 AM, Amit Kapila wrote: > Without this patch, that 0.5 (or 50% of the leader's effort) is considered for > Gather node irrespective of the number of workers or other factors, but > I think with the patch that is no longer true and that's what I am worrying > about. Nope, that patch does not change that at all. We probably should, but this patch does not. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
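To make the point about the fixed 50% concrete, here is a toy sketch of how such a cost divisor would behave; it illustrates the behavior under discussion and is not the actual costsize.c code.

    /* Toy illustration: the leader is credited with half a worker's share,
     * no matter how many workers are planned. */
    static double
    gather_parallel_divisor(int nworkers)
    {
        double leader_contribution = 0.5;   /* fixed, irrespective of nworkers */

        return (double) nworkers + leader_contribution;
    }

    /* e.g. per-worker cost ~= total_subpath_cost / gather_parallel_divisor(nworkers) */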
Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
On Fri, Nov 20, 2015 at 12:11 AM, Kouhei Kaigai wrote: >> On Thu, Nov 19, 2015 at 6:39 AM, Kouhei Kaigai wrote: >> > So, are you suggesting to make a patch that allows ForeignScan to have >> > multiple sub-plans right now? Or, one sub-plan? >> >> Two: >> >> http://www.postgresql.org/message-id/CA+TgmoYZeje+ot1kX4wdoB7R7DPS0CWXAzfqZ- >> 14ykfkgkr...@mail.gmail.com >> > Hmm. Two is a bit mysterious for me because two sub-plans (likely) > means this ForeignScan node checks join clauses and reconstruct > a joined tuple by itself but does not check scan clauses pushed- > down (it is job of inner/outer scan plan, isn't it?). > In this case, how do we treat N-way remote join cases (N>2) if we > assume such a capability in FDW driver? > > One subplan means FDW driver run an entire join sub-tree with local > alternative sub-plan; that is my expectation for the majority case. > However, I cannot explain two subplans, but not multiple, well. What I'm imagining is that we'd add handling that allows the ForeignScan to have inner and outer children. If the FDW wants to delegate the EvalPlanQual handling to a local plan, it can use the outer child for that. Or the inner one, if it likes. The other one is available for some other purposes which we can't imagine yet. If this is too weird, we can only add handling for an outer subplan and forget about having an inner subplan for now. I just thought to make it symmetric, since outer and inner subplans are pretty deeply baked into the structure of the system. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
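A hedged sketch of what "delegate the EvalPlanQual handling to a local plan" could look like from the FDW side, assuming the local alternative join plan has already been attached as the ForeignScan's outer child at planning time. my_fdw_recheck is a hypothetical callback shape, not a settled API, and the usual executor/FDW includes are assumed.

    /* Sketch only: on an EvalPlanQual recheck, re-run the delegated local plan
     * instead of asking the remote server again. */
    static bool
    my_fdw_recheck(ForeignScanState *node, TupleTableSlot *slot)
    {
        PlanState      *outer = outerPlanState(node);   /* the delegated local plan */
        TupleTableSlot *result;

        result = ExecProcNode(outer);
        if (TupIsNull(result))
            return false;               /* the joined row no longer qualifies */

        ExecCopySlot(slot, result);     /* hand the rechecked row back to EPQ */
        return true;
    }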
Re: [HACKERS] custom function for converting human readable sizes to bytes
2015-11-23 18:04 GMT+01:00 Tom Lane: > Jim Nasby writes: > > On 11/23/15 3:11 AM, Corey Huinker wrote: > >> +1 to both pg_size_bytes() and ::bytesize. Both contribute to making the > >> statements more self-documenting. > > > The function seems like overkill to me if we have the type. Just my > > opinion though. I'm thinking the type could just be called 'size' too > > (or prettysize?). No reason it has to be tied to bytes (in particular > > this would work for bits too). > > Please, no. That's *way* too generic a name. > > I do not actually agree with making a type for this anyway. I can > tolerate a function, but adding a datatype is overkill; and it will > introduce far more definitional issues than it's worth. (eg, which > other types should have casts to/from it, and at what level) > so pg_size_bytes is good enough for everybody? Regards Pavel > > regards, tom lane >
Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
On Sat, Nov 21, 2015 at 12:38 AM, Jim Nasby wrote: > On 11/19/15 7:29 PM, Amit Langote wrote: >>> >>> Another option is to provide the means for the index scan routines to >>> > report their progress. Maybe every index AM won't use it, but it'd >>> > certainly be a lot better than staring at a long_running boolean. >> >> The boolean would be a workaround for sure. I'm also slightly tempted by >> the idea of instrumenting vacuum scans of individual index AM's bulkdelete >> methods. One precedent is how vacuum_delay_point() calls are sprinkled around in >> the code. Another problem to solve would be to figure out how to pass >> progress parameters around - via some struct or could they be globals just >> like VacuumCost* variables are... > > It just occurred to me that we could do the instrumentation in > lazy_tid_reaped(). It might seem bad to do an increment for every tuple in > an index, but we're already doing a bsearch over the dead tuple list. > Presumably that's going to be a lot more expensive than an increment > operation. I think the cost of doing an increment there would be negligible. I'm not quite sure whether that's the right place to instrument - though it looks like it might be - but I think the cost of ++something in that function isn't gonna be a problem at all. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
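For reference, a sketch of what instrumenting lazy_tid_reaped() might look like; it assumes the surrounding vacuumlazy.c context, and tuples_checked_counter is a hypothetical progress counter, not an existing variable. The bsearch over the dead-TID array dominates the cost, so the added increment is noise.

    static uint64 tuples_checked_counter;   /* hypothetical progress counter */

    /* Sketch: one increment per index tuple visited during the bulk-delete scan. */
    static bool
    lazy_tid_reaped(ItemPointer itemptr, void *state)
    {
        LVRelStats *vacrelstats = (LVRelStats *) state;
        ItemPointer res;

        ++tuples_checked_counter;           /* new: report progress per tuple checked */

        res = (ItemPointer) bsearch((void *) itemptr,
                                    (void *) vacrelstats->dead_tuples,
                                    vacrelstats->num_dead_tuples,
                                    sizeof(ItemPointerData),
                                    vac_cmp_itemptr);
        return (res != NULL);
    }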
Re: [HACKERS] [PATCH] SQL function to report log message
Hi All, On Tue, Nov 17, 2015 at 12:10 PM, Peter Eisentraut wrote: > On 11/17/15 2:16 AM, Jim Nasby wrote: > > On 11/15/15 10:56 PM, dinesh kumar wrote: > >> So, shall we make this pg_report_log TO pg_write_log OR pg_ereport OR > >> from you. > > > > Why not pg_raise to mirror plpgsql? (The function does have the same > > semantics, right? It's not doing something like only sending to the log > > and not the client?) > > I think the "raise" terminology is specific to plpgsql, as it actually > raises an exception in that language. > > Sorry for being so late on this, as I have been engaged in some other personal tasks. Could someone let me know what else I need to do to get this patch completed? Any further suggestions on the function name? If all is OK with pg_log or some other name, I will modify the patch and submit a new one. Kindly let me know. Thanks in advance. -- Regards, Dinesh manojadinesh.blogspot.com
Re: [HACKERS] more RLS oversights
Noah, * Noah Misch (n...@leadboat.com) wrote: > On Tue, Jul 28, 2015 at 04:04:29PM -0700, Joe Conway wrote: > > Pushed to HEAD and 9.5 > > I reviewed this commit, f781a0f "Create a pg_shdepend entry for each role in > TO clause of policies." Thanks for the review! > This commit rendered the > http://www.postgresql.org/docs/devel/static/role-removal.html procedure[1] > incomplete. Before dropping a role, one must additionally drop each policy > mentioning the role in pg_policy.polroles: > > begin; > create role alice; > create table t (c int); > grant select on table t to alice; > create policy p0 on t to alice using (true); > reassign owned by alice to current_user; > drop owned by alice; > drop role alice; > rollback; > > shdepDropOwned() ignores SHARED_DEPENDENCY_POLICY entries. Should it instead > remove the role from polroles, dropping the policy if that would empty > polroles? (Which should change, the documented role-removal procedure or the > DROP OWNED treatment of policies?) I would expect the DROP OWNED treatment of policies to be similar to the DROP OWNED treatment of GRANTs. I'm certainly of the opinion that this is a bug which should be addressed. As an FYI, Joe's laptop recently got stolen and he's working to get back up to speed as quickly as he can. I've just put his new key into place on gitmaster (along with a few other pginfra-related bits), but there's obviously a lot more for him to be completely up and working again. > Independently, > http://www.postgresql.org/docs/devel/static/sql-drop-owned.html deserves an > update since it discusses every other object type having role dependencies. Agreed. Thanks! Stephen signature.asc Description: Digital signature
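A sketch of the DROP OWNED behavior being suggested: strip the role from the policy's pg_policy.polroles, and queue the policy itself for deletion only if no roles would remain. remove_role_from_policy() below is a hypothetical helper (assumed to report whether any members remain), and error handling is omitted; this is not the committed fix, just an illustration under those assumptions.

    /* Sketch; assumes the usual catalog/dependency.c context and includes. */
    extern bool remove_role_from_policy(Oid policy_oid, Oid roleid);   /* hypothetical */

    static void
    drop_owned_handle_policy(Oid policy_oid, Oid roleid, ObjectAddresses *deleteobjs)
    {
        bool roles_remain = remove_role_from_policy(policy_oid, roleid);

        if (!roles_remain)
        {
            /* No roles left: drop the policy itself, analogously to how
             * DROP OWNED revokes the role's GRANTs. */
            ObjectAddress obj;

            obj.classId = PolicyRelationId;
            obj.objectId = policy_oid;
            obj.objectSubId = 0;
            add_exact_object_address(&obj, deleteobjs);
        }
    }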
Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
I just noticed that RecoveryTest.pm is lacking "use strict; use warnings;". With those added, there's a number of problems reported: Global symbol "%datadir_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 66. Global symbol "%backup_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 67. Global symbol "%archive_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 68. Global symbol "%connstr_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 69. Global symbol "%applname_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 70. Global symbol "%datadir_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 92. Global symbol "%connstr_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 93. Global symbol "%applname_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 93. Global symbol "%archive_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 104. Global symbol "%datadir_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 111. Global symbol "%archive_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 121. Global symbol "%datadir_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 130. Global symbol "%datadir_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 185. Global symbol "%datadir_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 197. Global symbol "@array" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 220. Global symbol "%backup_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 243. Global symbol "%archive_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 244. Global symbol "%datadir_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 246. Global symbol "%datadir_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 257. Global symbol "%backup_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 258. Global symbol "%archive_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 259. Global symbol "%connstr_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 260. Global symbol "%applname_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 261. Global symbol "%backup_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 272. Global symbol "%datadir_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 287. Global symbol "%backup_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 288. Global symbol "%archive_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 289. 
Global symbol "%datadir_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 292. Global symbol "$current_dir" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 294. Global symbol "%datadir_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 302. Global symbol "%datadir_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 313. Global symbol "%datadir_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 320. Global symbol "%datadir_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 367. Global symbol "%backup_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 377. Global symbol "%datadir_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 390. Global symbol "%backup_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 391. Global symbol "%archive_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 392. Global symbol "%connstr_nodes" requires explicit package name at /pgsql/source/master/src/test/perl/RecoveryTest.pm line 393. Global symbol "%applname_nodes" requires explicit package
Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
Michael Paquier wrote: > On Thu, Nov 19, 2015 at 12:21 AM, Alvaro Herrera wrote: > > Hi, I just started looking this over a bit. The first thing I noticed > > is that it adds a dependency on Archive::Tar which isn't already used > > anywhere else. Did anybody check whether this exists back in 5.8 > > installations? > > Actually I didn't and that's a good point, we have decided to support > TAP down to 5.8.9. The only reason why I introduced this dependency is > that there is no easy native way to copy an entire folder in perl, and > that's for handling base backups. There are things like File::NCopy or > File::Copy::Recursive however it does not seem like a good idea to > depend on other modules than IPC::Run. Would it be better to have an > in-core module dedicated to that similar to SimpleTee.pm? Or are you > guys fine to accept a dependency with another module? It would be a lot better to not have to rely on another module existing everywhere. I'd rather have another simple module, following SimpleTee's example. Since this doesn't have to be terribly generic, it should be reasonably short, I hope. > > Why is "recovery" added to ALWAYS_SUBDIRS in src/test/Makefile instead > > of to SUBDIRS? Seems a strange choice. > > Because I thought that it should not be part of the main regression > suite, like ssl/. Feel free to correct me if my feeling is wrong. As I understand, the problem with "ssl" is that it messes with system-wide settings, which is not the case here. I'm inclined to move it to SUBDIRS. As an example, "modules" is not part of the main regression suite either. > > In my days of Perl, it was starting to become frowned upon to call > > subroutines without parenthesizing arguments. Is that no longer the > > case? Because I notice there are many places in this patch and pre- > > existing that call psql with an argument list without parens. And it's > > a bit odd because I couldn't find any other subroutine that we're using > > in that way. > > Hm, yeah. If we decide about a perl coding policy I would be happy to > follow it. Personally I usually prefer using parentheses however if we > decide to make the calls consistent we had better address that as a > separate patch. Some votes against, some votes for. Ultimately, it seems that this depends on the committer. I don't really care all that much about this TBH. > > In 005_replay_delay there's a 2s delay configured; then we test whether > > something is replayed in 1s. I hate tests that run for a long time, but > > is 2s good enough considering that some of our test animals in buildfarm > > are really slow? > > A call to poll_query_until ensures that we wait for the standby to > replay once the minimum replay threshold is reached. Even with a slow > machine the first query would still see only 10 rows at the first try, > and then wait for the standby to replay before checking if 20 rows are > visible. Or I am not following your point. Ah, I see. Maybe it's fine then, or else I'm not following your point ;-) -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RLS open items are vague and unactionable
Noah, * Noah Misch (n...@leadboat.com) wrote: > On Mon, Sep 28, 2015 at 03:03:51PM -0400, Stephen Frost wrote: > > If SELECT rights are required then apply the SELECT policies, even if > > the actual command is an UPDATE or DELETE. This covers the RETURNING > > case which was discussed previously, so we don't need the explicit check > > for that, and further addresses the concern raised by Zhaomo about > > someone abusing the WHERE clause in an UPDATE or DELETE. > > > > Further, if UPDATE rights are required then apply the UPDATE policies, > > even if the actual command is a SELECT. This addresses the concern that > > a user might be able to lock rows they're not actually allowed to UPDATE > > through the UPDATE policies. > > > > Comments welcome, of course. Barring concerns, I'll get this pushed > > tomorrow. > > The CREATE POLICY reference page continues to describe the behavior this patch > replaced, not today's behavior. Just to be clear, I'm not ignoring this; I've been working to rework the RLS documentation to add more information to the main RLS section and to better segregate the general RLS documentation from what should really be on the CREATE POLICY page. This update will be incorporated into that and I'll be posting the whole thing to -docs soon for comment. Thanks! Stephen signature.asc Description: Digital signature
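For context, the rule described in the quoted text boils down to choosing policies by the permissions a query actually requires rather than by its command tag alone. A simplified C sketch of that rule follows; lookup_policies is a hypothetical helper, and this is not the actual get_row_security_policies() code, just an illustration of the behavior under those assumptions.

    /* Sketch; assumes the usual parser/rewriter includes (RangeTblEntry, AclMode, List). */
    extern List *lookup_policies(Oid relid, CmdType for_command);   /* hypothetical */

    static void
    collect_policies_for_rte(RangeTblEntry *rte, List **policies)
    {
        /* RETURNING, or an UPDATE/DELETE that reads the row, requires SELECT
         * rights, so SELECT policies apply even when the command is not a
         * plain SELECT. */
        if (rte->requiredPerms & ACL_SELECT)
            *policies = list_concat(*policies, lookup_policies(rte->relid, CMD_SELECT));

        /* SELECT ... FOR UPDATE/SHARE requires UPDATE rights, so UPDATE
         * policies apply even when the command is a SELECT. */
        if (rte->requiredPerms & ACL_UPDATE)
            *policies = list_concat(*policies, lookup_policies(rte->relid, CMD_UPDATE));
    }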