Re: [HACKERS] Patch to add regression tests for SCHEMA
Dear Robins,

> Here is an updated patch that uses different schema / role names for different tests (as per commitfest site feedback).

Short review of this version of the patch: the patch works for me. This test is a good thing and allows schemas to be thoroughly tested, including corner cases which must fail because of errors or permissions.

Two remarks:

- test 2 bis: why the name 'pg_asdf'? Why not 'pg_schema_sch' plus some number, to be homogeneous with the other tests?
- test 3: why not WHERE schema_name='schema_sch3' instead of two negative comparisons? ISTM that if for some reason a new schema name is added in the future, the test will fail.

-- Fabien.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
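To make the third remark concrete, here is a hypothetical reconstruction of the two query shapes being compared. The actual query in the patch may differ; the catalog view and schema names below are assumptions based on the names mentioned in the review.

```sql
-- Fragile form: excludes known schemas with negative comparisons,
-- so any schema added later makes the test's output change:
SELECT schema_name FROM information_schema.schemata
 WHERE schema_name <> 'information_schema' AND schema_name <> 'public';

-- Suggested form: name the schema under test directly,
-- so unrelated schemas cannot affect the result:
SELECT schema_name FROM information_schema.schemata
 WHERE schema_name = 'schema_sch3';
```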
Re: [HACKERS] Add some regression tests for SEQUENCE
> Have provided an updated patch as per Fabien's recent response on the Commitfest site. Any and all feedback is appreciated.

Review: this patch works for me. It adds valuable sequence test cases, especially trying corner cases with expected errors and permission denials. I suggest accepting it.

-- Fabien.
Re: [HACKERS] about index inheritance
On Tue, May 7, 2013 at 11:55 PM, Robert Haas robertmh...@gmail.com wrote:
> On Mon, May 6, 2013 at 9:30 AM, Vincenzo Melandri vmelan...@imolinfo.it wrote:
>> Hi guys, My first post here :) I stumbled into the same problem as this guy http://www.postgresql.org/message-id/4be2835a.5020...@cybertec.at , so since I have some spare time recently, I've set up the development environment for PostgreSQL and I think I may be able to contribute to the feature of index inheritance, which is currently unsupported but listed in the TODOs. I've spent some time reading the docs and I took a look at the code. Is anybody out there working on this already? I don't want to overlap someone else's effort, plus I'll gladly take any advice or join the community efforts if any, because this feature seems pretty huge to me at first glance.
> This is a really hard problem. If you pick this as your first project hacking on PostgreSQL, you will almost certainly fail.

Thank you very much, I guessed that already -.- Still, I have needed this at my office for a long time, struggled with it many times, and had to come up with some exotic solutions... Now I have spare time between projects, so I can work on it full-time. At least it's worth a try, isn't it?

Anyway, I'm working to better understand the problem, trying to identify at least the main points involved. At the moment I'm figuring out how the inheritance mechanism works for relations (in tablecmds.c). Then I'll figure out how indexes work. I guess you discussed this plenty of times in the past, but I didn't find it in the archives. Any hints for old discussions? I'll try to come up with a list of potential things to do, for you guys to validate and discuss.

PS: I wrote my last mail from an address with which I had not subscribed to the list, and still the message got through. Odd.

-- Vincenzo. http://www.linkedin.com/pub/vincenzo-melandri/14/16/730
Re: [HACKERS] GSOC Student Project Idea
On 24.04.2013 14:31, Florian Pflug wrote:
> On Apr 23, 2013, at 23:25 , Alexander Korotkov aekorot...@gmail.com wrote:
>> I've taken a brief look at the paper and implementation. As I can see, iDistance implements some global building strategy. I mean, for example, it selects some point, calculates distances from the selected point to all points in the dataset, etc. So, it uses the whole dataset at the same time. However, you can try to implement global index building in GiST or SP-GiST. In this case I think you should carefully estimate your capabilities during a single GSoC project. You would need to extend the GiST or SP-GiST interface and write a completely new implementation of the tree building algorithm. The question of how exactly to extend the GiST or SP-GiST interface for this case could turn out to be very hard, even theoretically.
> +1. That seemed to be a major roadblock to me too when I read the paper. You could work around that by making partition identification a separate step. You'd have a function idist_analyze(cfg name, table name, field name) which'd identify suitable partitions for the data distribution in table.field and store them somewhere. Such a set of pre-identified partitions would be akin to a tsearch configuration, i.e. all other parts of the iDistance machinery would use it to map points to index keys and queries to ranges of those keys. You'll want to look at how tsearch handles that, and check if the method can indeed be applied to iDistance.

You could perform that step as part of the index build. Before the index build starts to add tuples to the index, it could scan a random sample of the heap and identify the partitions based on that. If you need to store metadata, like a map of partitions, it becomes difficult to cajole this into a normal GiST or SP-GiST opclass. The API doesn't have any support for storing such metadata.
> In a first cut, you'd probably only allow inserts into the index which don't change the maximum distances from the partition centers that idist_analyze() found.

That seems like a pretty serious restriction. I'd try to write it so that you can insert any value, but if the new values are very different from any existing values, it would be OK for the index quality to degrade. For example, you could simply add any out-of-bounds values to a separate branch in the index, which would have no particular structure and would just have to be scanned on every query. You can probably do better than that, but that would be a trivial way to deal with it.

- Heikki
Re: [HACKERS] [COMMITTERS] pgsql: Fix permission tests for views/tables proven empty by constraint
On 2013-05-07 21:45:02 -0400, Tom Lane wrote:
> Greg Stark st...@mit.edu writes:
>> If we just reverted your fix and didn't fix it in 9.2, that would also fix the crash, right? The bug was only that it leaked the fact that the view was provably empty from the definition?
> Well, it might fail to report a permissions violation when the not-allowed-to-be-accessed relation could be proven to yield no rows. I agree that it's a bit hard to call that a security issue as long as you assume that the attacker has access to the system catalogs; and even if you don't assume that, being able to discern that there's a check constraint on some table doesn't seem like a big leakage.

Couldn't it also cause tables not to be locked that ought to be? That seems to be the nastier part to me.

Greetings, Andres Freund

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] GSOC Student Project Idea
On 24.04.2013 22:10, Michael Schuh wrote:
> Thank you both for the very helpful feedback. Perhaps the scope of this project (application's completeness criteria) is better as a feasibility prototyping of the global/distance-based index strategy with B+-tree and/or GiST extension possibilities.

For GSoC, we'd really like to see some code that can be committed as a result. Prototyping is important, but if that's all you do during the summer, the work is likely to go to waste if no one works actively on the prototype afterwards. At this point, I think we need a more concrete plan for how this would be implemented.

The idea of using a regular B-tree for this, with some functions to do the partition mapping, might work. However, that would be a clunky interface - I don't think it would be accepted into PostgreSQL. So I don't think that makes a good GSoC project.

If you think this can be done with the existing GiST or SP-GiST APIs, I'd like to see a more concrete plan for how that would work. What datatype would this be for? How would the partitioning be done? If the APIs need to be extended, what would the extensions look like? The summer is short, so there isn't much time for exploration - we need to have a pretty good idea of what the result will look like, right now.

- Heikki
Re: [HACKERS] Cube extension improvement, GSoC
On Sat, May 4, 2013 at 11:19 PM, Stas Kelvich stanc...@gmail.com wrote:
> I think we have at least 3 data types more or less similar to cube: 1) array of ranges 2) range of arrays 3) 2d arrays

Semantically, cube is closest to an array of ranges. However, an array of ranges has huge storage overhead. We could also declare cube as a domain over 2d arrays and declare the operations of that domain. But what should we do when arrays in different records have different numbers of elements? We can be faced with exactly the same situation with cube:

test=# create table cube_test (v cube);
CREATE TABLE
test=# insert into cube_test values (cube(array[1,2])), (cube(array[1,2,3]));
INSERT 0 2

In order to force all cubes to have the same number of dimensions, an explicit CHECK constraint on the table is required. As I remember, cube treats absent dimensions as zeros.

-- With best regards, Alexander Korotkov.
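The explicit CHECK mentioned above could look like the following sketch, using contrib/cube's cube_dim() function; the table name and the dimension count of 3 are arbitrary choices for illustration.

```sql
-- Force every cube stored in the table to be 3-dimensional.
CREATE TABLE cube_test (
    v cube CHECK (cube_dim(v) = 3)
);
INSERT INTO cube_test VALUES (cube(array[1,2,3]));  -- accepted
INSERT INTO cube_test VALUES (cube(array[1,2]));    -- rejected by the CHECK
```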
Re: [HACKERS] Terminology issue: suffix tree
On 06.05.2013 14:10, Alexander Korotkov wrote:
> On Sat, May 4, 2013 at 10:27 PM, Alexander Korotkov aekorot...@gmail.com wrote:
>> In a suffix tree we insert every suffix of the source string into the tree. http://en.wikipedia.org/wiki/Suffix_tree Actually, the opclass implements a radix tree or patricia tree. http://en.wikipedia.org/wiki/Radix_tree Likely we need a patch to rename it in all the places it is mentioned.
> Patch is attached.

Thanks, committed.

> Apparently, we have the same issue in contrib/unaccent.

Yeah. The data structure in contrib/unaccent seems to be a plain old trie, rather than a radix trie, though. According to wikipedia at least, the difference is that in a radix tree, the edges are labeled with sequences of elements, rather than single elements. Want to patch that too?

- Heikki
Re: [HACKERS] Terminology issue: suffix tree
On Wed, May 8, 2013 at 3:50 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote:
> Thanks, committed.

Thanks!

> Yeah. The data structure in contrib/unaccent seems to be a plain old trie, rather than a radix trie, though. According to wikipedia at least, the difference is that in a radix tree, the edges are labeled with sequences of elements, rather than single elements. Want to patch that too?

Agreed, trie is the most fitting term here. Patch is attached.

-- With best regards, Alexander Korotkov.

unaccent-suffix-fix.patch Description: Binary data
Re: [HACKERS] [COMMITTERS] pgsql: Fix permission tests for views/tables proven empty by constraint
Andres Freund and...@2ndquadrant.com writes:
> On 2013-05-07 21:45:02 -0400, Tom Lane wrote:
>> Well, it might fail to report a permissions violation when the not-allowed-to-be-accessed relation could be proven to yield no rows.
> Couldn't it also cause tables not to be locked that ought to be? That seems to be the nastier part to me.

In ordinary immediate execution, the parser or planner would have obtained the relevant table lock. I think what you say is possible if a prepared plan is re-executed, but TBH it doesn't sound like much of an issue to me.

regards, tom lane
[HACKERS] RETURNING syntax for COPY
Hello, as a continuation of my proposal expanding RETURNING syntax with AFTER/BEFORE, there may be enough time to implement RETURNING for COPY. I'd like to hear your opinions on that. My draft idea is:

COPY FROM ... RETURNING table_name.* - returns all values copied to the table after all triggers have fired
COPY FROM ... RETURNING BEFORE.* - returns all values copied to the table before triggers (all raw values from the file)
COPY TO ... RETURNING table_name.* - returns all values from the table
COPY TO ... RETURNING AFTER.*/BEFORE.* - the same

What do you think about it?

Regards, Karol Trzcionka
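To make the draft concrete, a hypothetical invocation might look like the following. None of this syntax exists today; the table name, file path, and options are made-up examples.

```sql
-- Proposed, NOT implemented: load a file and get back the rows as stored,
-- i.e. after any BEFORE INSERT triggers have rewritten them.
COPY mytable FROM '/tmp/data.csv' WITH (FORMAT csv)
  RETURNING mytable.*;

-- Proposed, NOT implemented: get back the raw file values, before triggers.
COPY mytable FROM '/tmp/data.csv' WITH (FORMAT csv)
  RETURNING BEFORE.*;
```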
Re: [HACKERS] improving PL/Python builds on OS X
On Fri, Dec 21, 2012 at 5:45 PM, Peter Eisentraut pete...@gmx.net wrote:
> The PL/Python build on OS X is currently hardcoded to use the system Python install. If you try to override this when running configure, you get a mysterious mix-and-match build. If you want to build against your own Python build, or MacPorts or Homebrew, you can't. This is straightforward to fix. In configure, besides checking Python include and library paths, we can also check whether it's a framework build and the right parameters for that. The attached patch does that and does the job for me. Please test it. One constraint, which is explained in the comment in src/pl/plpython/Makefile, is that in Python 2.5, there is no official way to detect either framework builds or shared libpython builds, so we can't support those versions on OS X, at least without more hardcoding of things. I'd rather phase some of that out, but if someone needs to continue to use Python 2.4 or earlier on OS X, let me know. (A more proper fix would be to split DLSUFFIX into two variables, but that seems more work than it's worth right now.)

This patch seems to have broken the build for our installers for 9.3. Because we need a consistent build of the PL interpreters on all the platforms we support, we use the ActiveState distributions of Perl, Python and TCL (we can't rely on vendor-supplied packages, because their versions vary between different Linux distros and different OS X versions). However, ActivePython doesn't include a shared library, which this change seems to require. Can that requirement be reverted?

-- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] XLogFlush invoked about twice as much after 9.2 group commit enhancement
On Tue, May 7, 2013 at 9:23 PM, Peter Geoghegan p...@heroku.com wrote:
> On Tue, May 7, 2013 at 12:48 PM, Jeff Janes jeff.ja...@gmail.com wrote:
>> Anyway, I don't see this behavior change when turning on wal_debug and looking in the logfiles for 'xlog flush request' messages.
> That could have everything to do with the hardware you're using. In general, the higher the cost of an fsync, the more useful it is to amortize that cost among concurrently committing transactions.

True, but that should come with a proportional increase in throughput, which he reported not seeing. At least originally. I'm not sure what to think now.

Cheers, Jeff
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix permission tests for views/tables proven empty by constraint
Greg Stark st...@mit.edu writes:
> That's kind of dismaying. ORMs have a tendency to create queries like this, and people may have even written such queries by hand and tested them to determine that Postgres was able to exclude the useless relation. To have them install a security update and discover that something they had previously tested no longer worked would be annoying.

Turns out there is more to this than I realized before. In an example such as the one I showed,

select * from
  ((select f1 as x from t1 offset 0)
   union all
   (select f2 as x from t2 offset 0)) ss
where false;

where an appendrel subselect member can be proven empty on the basis of outer-query clauses alone, *we don't even plan that subquery*. The fix I had in mind for this fails to capture table references from such a subquery. We could extend setrefs.c to dig into unplanned subqueries and grab RTEs out of them, but that would not be a complete fix. In particular, RTEs would not get made for inheritance children of parent tables mentioned in the query, since inheritance expansion is done by the planner. Now, that wouldn't affect permissions checks, because no extra permissions checks are done on inheritance children, but it would affect the locking behavior that Andres was worried about.

I think the only reliable way to make this optimization fully transparent would be to go ahead and plan every subquery, even when we know we'll discard the planning results immediately. That seems like quite a lot of overkill. I'm not really sure I buy Greg's argument that people might be depending on the performance benefits of not planning such subqueries, but I guess it's not impossible either. My inclination is to go ahead and write the extra code to dig RTEs out of unplanned subqueries, and not worry about failing to lock inheritance children in them. I'm also now pretty firmly in the camp of let's not try at all to fix this in the back branches.

Thoughts?

regards, tom lane
Re: [HACKERS] improving PL/Python builds on OS X
On 5/8/13 11:51 AM, Dave Page wrote:
> This patch seems to have broken the build for our installers for 9.3. Because we need a consistent build of the PL interpreters on all the platforms we support, we use the ActiveState distributions of Perl, Python and TCL (we can't rely on vendor-supplied packages, because their versions vary between different Linux distros and different OS X versions). However, ActivePython doesn't include a shared library, which this change seems to require. Can that requirement be reverted?

There was no change in this regard. A shared library was always required on OS X. ActivePython does include a shared library. I just tried it and it builds fine.
Re: [HACKERS] RETURNING syntax for COPY
Hello,

Why? What is the motivation? What is the use case?

Regards, Pavel

2013/5/8 Karol Trzcionka karl...@gmail.com
> Hello, as a continuation of my proposal expanding RETURNING syntax by AFTER/BEFORE, there can be enough time to implement RETURNING for COPY. I'd like to hear your opinion on that. My draft idea is: COPY FROM ... RETURNING table_name.* - returns all values copied to the table after all triggers fire; COPY FROM ... RETURNING BEFORE.* - returns all values copied to the table before triggers (all raw values from the file); COPY TO ... RETURNING table_name.* - returns all values from the table; COPY TO ... RETURNING AFTER.*/BEFORE.* - the same. What do you think about it? Regards, Karol Trzcionka
Re: [HACKERS] RETURNING syntax for COPY
Karol Trzcionka karl...@gmail.com writes:
> as a continuation of my proposal expanding RETURNING syntax by AFTER/BEFORE, there can be enough time to implement RETURNING for COPY.

No there isn't; what you suggest would require FE/BE protocol extensions, making it several orders of magnitude more work than the other thing.

regards, tom lane
Re: [HACKERS] RETURNING syntax for COPY
On 08.05.2013 19:44, Tom Lane wrote:
> Karol Trzcionka karl...@gmail.com writes:
>> as a continuation of my proposal expanding RETURNING syntax by AFTER/BEFORE, there can be enough time to implement RETURNING for COPY.
> No there isn't; what you suggest would require FE/BE protocol extensions, making it several orders of magnitude more work than the other thing.

I'd imagine that the flow would go something like this:

BE                  FE
CopyInResponse
                    CopyData
                    CopyData
                    ...
                    CopyDone
RowDescription
DataRow
DataRow
CommandComplete

A well-written client should be able to handle that. But a client that expects the backend to always send a CommandComplete immediately after CopyDone will get confused. Implementing that doesn't seem too difficult to me. I agree that this is much more work than the UPDATE RETURNING BEFORE/AFTER, though. Not sure if that's a good or a bad thing.

- Heikki
Re: [HACKERS] RETURNING syntax for COPY
Heikki Linnakangas hlinnakan...@vmware.com writes:
> On 08.05.2013 19:44, Tom Lane wrote:
>> No there isn't; what you suggest would require FE/BE protocol extensions, making it several orders of magnitude more work than the other thing.
> I'd imagine that the flow would go something like this: BE sends CopyInResponse; FE sends CopyData, CopyData, ..., CopyDone; BE sends RowDescription, DataRow, DataRow, CommandComplete.

That would require the backend to buffer the entire query response, which isn't a great idea. I would expect that such an operation would need to interleave CopyData to the backend with DataRow responses. Such a thing could possibly be built on COPY_BOTH mode, but it would be a lot of work (at both ends) for extremely debatable value. The general idea of COPY is to load data as fast as possible, so weighing it down with processing options seems like a pretty dubious idea even if the implementation were easy.

regards, tom lane
Re: [HACKERS] improving PL/Python builds on OS X
On Wed, May 8, 2013 at 5:34 PM, Peter Eisentraut pete...@gmx.net wrote:
> On 5/8/13 11:51 AM, Dave Page wrote:
>> This patch seems to have broken the build for our installers for 9.3. Because we need a consistent build of the PL interpreters on all the platforms we support, we use the ActiveState distributions of Perl, Python and TCL (we can't rely on vendor-supplied packages, because their versions vary between different Linux distros and different OS X versions). However, ActivePython doesn't include a shared library, which this change seems to require. Can that requirement be reverted?
> There was no change in this regard. A shared library was always required on OS X.

It's failing on Linux. Even worse, it configures fine and then builds without error. There is a message spewed out by configure, but it doesn't contain the words warning or error. Given that I explicitly said I wanted Python support when I ran configure, it should certainly fail with an error at configure time. We only noticed this was a problem when the QA guys started diving into more detailed testing, as we don't watch for every message in the 50+ MB of logs our automated build systems generate.

> ActivePython does include a shared library. I just tried it and it builds fine.

-- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] RETURNING syntax for COPY
Karol Trzcionka karl...@gmail.com writes:
> as a continuation of my proposal expanding RETURNING syntax by AFTER/BEFORE

What about implementing support for OLD/NEW in per-statement triggers? I guess you would expose the data via an SRF.

Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] RETURNING syntax for COPY
On 05/08/2013 01:16 PM, Tom Lane wrote:
> That would require the backend to buffer the entire query response, which isn't a great idea. I would expect that such an operation would need to interleave CopyData to the backend with DataRow responses. Such a thing could possibly be built on COPY_BOTH mode, but it would be a lot of work (at both ends) for extremely debatable value. The general idea of COPY is to load data as fast as possible, so weighing it down with processing options seems like a pretty dubious idea even if the implementation were easy.

That was my reaction. I would look at this very skeptically.

cheers andrew
Re: [HACKERS] RETURNING syntax for COPY
On May 8, 2013, at 1:16 PM, Tom Lane wrote:
> That would require the backend to buffer the entire query response, which isn't a great idea. I would expect that such an operation would need to interleave CopyData to the backend with DataRow responses. Such a thing could possibly be built on COPY_BOTH mode, but it would be a lot of work (at both ends) for extremely debatable value. The general idea of COPY is to load data as fast as possible, so weighing it down with processing options seems like a pretty dubious idea even if the implementation were easy.

There are cases where I indeed want to load data very quickly, but then want to perform an operation on it immediately afterwards, e.g. removing bad data that the copy just added. For instance, I do have this scenario:

WITH new_data AS (
    COPY FROM ... RETURNING id, field_to_check
)
DELETE FROM table USING new_data
 WHERE table.id = new_data.id
   AND new_data.field_to_check ~* 'bad data';

Now I can take care of all that in one step, and I know I'm only removing rows I just added. This comes up when I am importing external files from other sources where I may not necessarily want all of the rows, or some of the rows contain bad data. This also presumes that COPY works in a CTE, which I'm not sure it does (and I will do the TIAS test after I hit send on this message).

Jonathan
Re: [HACKERS] RETURNING syntax for COPY
On Wed, May 08, 2013 at 01:16:14PM -0400, Tom Lane wrote:
> That would require the backend to buffer the entire query response, which isn't a great idea. I would expect that such an operation would need to interleave CopyData to the backend with DataRow responses. Such a thing could possibly be built on COPY_BOTH mode, but it would be a lot of work (at both ends) for extremely debatable value. The general idea of COPY is to load data as fast as possible,

With utmost respect, that is one of several use cases, and any change would need to keep that use case unburdened. A sometimes-overlapping set of use cases moves data in and out of the database in a simple manner. In some of these, people might wish to trade some performance for the feature. A particular example would be one where there are several tables to be loaded, some with generated columns that later loads depend on. Yes, it's possible (kinda) to do this with the FDW machinery, but the burden is much higher, as it generally requires DDL permission each time.

> so weighing it down with processing options seems like a pretty dubious idea even if the implementation were easy.

Totally agreed that the fast load/unload code path must not be affected by any such changes.

Cheers, David.

-- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix permission tests for views/tables proven empty by constraint
On 2013-05-08 12:30:31 -0400, Tom Lane wrote:
> Turns out to be more to this than I realized before. [...] We could extend setrefs.c to dig into unplanned subqueries and grab RTEs out of them, but that would not be a complete fix. In particular, RTEs would not get made for inheritance children of parent tables mentioned in the query, since inheritance expansion is done by the planner. Now, that wouldn't affect permissions checks because no extra permissions checks are done on inheritance children, but it would affect the locking behavior that Andres was worried about.

I first thought that is fair enough, since I assumed that in most if not all cases where locking plays a user-visible role, the parent relation would get locked anyway when a child relation gets locked. Turns out we only do it the other way round, i.e. lock child relations when we lock a parent relation, even for most DDL on child relations. I am not sure if it's really problematic, but it seems to allow scenarios like:

S1: BEGIN;
S1: SELECT * FROM ((SELECT * FROM parent OFFSET 0) UNION ALL
                   (SELECT * FROM parent OFFSET 0)) f WHERE false;
    -- parent is locked now, children are not
S2: BEGIN;
S2: ALTER TABLE child_1 DROP CONSTRAINT foo;
S1: SELECT * FROM parent WHERE ...
    -- blocks, waiting for S2, since child_1 is locked

This seems like somewhat confusing behaviour, although it has gone unnoticed so far, since one normally expects that a previously acquired lock allows you to continue working with a relation. But I guess this is better fixed by making all DDL on child relations also lock their parent relation? That seems like a good idea anyway. I am not at all convinced that this must be fixed, but also not the other way round. I just wanted to point this out, since I am not sure there aren't any more problematic cases.

> I think the only reliable way to make this optimization fully transparent would be to go ahead and plan every subquery, even when we know we'll discard the planning results immediately. That seems like quite a lot of overkill. I'm not really sure I buy Greg's argument that people might be depending on the performance benefits of not planning such subqueries, but I guess it's not impossible either.

I didn't understand Greg's argument as being based on performance, but as being worried about the changed locking and such from a functional perspective. Greg? I don't really buy the performance argument either, but I agree that we shouldn't do all this in the back branches, as the bug isn't really bad and the fix has some potential for introducing problems.

> I'm also now pretty firmly in the camp of let's not try at all to fix this in the back branches.

+1, independent of where this goes.

Greetings, Andres Freund

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Terminology issue: suffix tree
On 08.05.2013 15:49, Alexander Korotkov wrote: On Wed, May 8, 2013 at 3:50 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: Yeah. The data structure in contrib/unaccent seems to be a plain old trie, rather than a radix trie, though. According to wikipedia at least, the difference is that in a radix tree, the edges are labeled with sequences of elements, rather than single elements. Want to patch that too? Agree, trie is the most conforming term here. Patch is attached. Ok, applied. - Heikki
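The distinction Heikki cites from Wikipedia can be sketched in a few lines. This is a hypothetical illustration only (not the contrib/unaccent code): a plain trie spends one edge per character, while a radix trie merges single-child chains so that edges carry substrings.

```python
# Hypothetical sketch (not PostgreSQL code) of trie vs. radix trie.
# Plain trie: every edge consumes exactly one character.
# Radix trie: chains of single-child nodes are merged into one edge.

def trie_insert(root, word):
    """Plain trie insert: one character per edge."""
    node = root
    for ch in word:
        node = node.setdefault(ch, {})
    node["$"] = True  # end-of-word marker

def radix_compress(node):
    """Merge single-child chains so edges are labeled with substrings."""
    out = {}
    for label, child in node.items():
        if label == "$":
            out["$"] = True
            continue
        # Follow the chain while there is exactly one non-terminal child.
        while len(child) == 1 and "$" not in child:
            (nxt, grand), = child.items()
            label += nxt
            child = grand
        out[label] = radix_compress(child)
    return out

root = {}
for w in ("water", "watch"):
    trie_insert(root, w)
compressed = radix_compress(root)
# The shared prefix "wat" becomes a single edge in the radix trie.
print(sorted(compressed))          # ['wat']
print(sorted(compressed["wat"]))   # ['ch', 'er']
```

In the plain trie the prefix "wat" costs three nodes; in the compressed form it is a single edge, which is exactly the property the terminology fix is about.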
Re: [HACKERS] RETURNING syntax for COPY
On Wed, May 08, 2013 at 10:55:40AM -0700, David Fetter wrote: On Wed, May 08, 2013 at 01:16:14PM -0400, Tom Lane wrote: Heikki Linnakangas hlinnakan...@vmware.com writes: On 08.05.2013 19:44, Tom Lane wrote: No there isn't; what you suggest would require FE/BE protocol extensions, making it several orders of magnitude more work than the other thing. I'd imagine that the flow would go something like this: BE FE CopyInResponse CopyData CopyData ... CopyDone RowDescription DataRow DataRow CommandComplete That would require the backend to buffer the entire query response, which isn't a great idea. I would expect that such an operation would need to interleave CopyData to the backend with DataRow responses. Such a thing could possibly be built on COPY_BOTH mode, but it would be a lot of work (at both ends) for extremely debatable value. The general idea of COPY is to load data as fast as possible, With utmost respect, that is one of several use cases, and any change would need to keep that use case unburdened. A sometimes overlapping set of use cases move data in and out of the database in a simple manner. In some of these, people might wish to trade some performance for the feature. 99% of my uses at work for COPY are as a general data import and export facility. I often find myself loading CSV files into our database for analysis and further cleanup, and then use COPY to output queries as CSV files for consumption by other members of the business. The recent work for (PRE|POST)PROCESSOR options to COPY is indicative of the fact that users are not merely using COPY to load data as fast as possible. Other discussions around a COMPRESSED option are more than just a performance enhancement, in my view, as I often receive files compressed and decompressing the data is just another step standing in the way of importing the data into the database. 
Additionally, once I have the data imported, I often take many steps to cleanup and format the data, prior to applying actual typing to a table (which invariably fails due to invalid dates, and other nonsense). COPY ... RETURNING would certainly be useful to apply additional transformations to the data before finally sending it to its ultimate destination. A particular example would be one where there are several tables to be loaded, some with generated columns that the future ones would depend on. Yes, it's possible (kinda) to do this with the FDW machinery, but the burden is much higher as it requires DDL permission in general each time. I find using the FDW machinery to perform many queries to be much slower than importing the data once and then running my queries. There is also no ability to use indexes. so weighing it down with processing options seems like a pretty dubious idea even if the implementation were easy. Totally agreed that the fast load/unload code path must not be affected by any such changes. Agreed here as well. -Ryan P. Kelly -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RETURNING syntax for COPY
* Ryan Kelly (rpkell...@gmail.com) wrote: COPY ... RETURNING would certainly be useful to apply additional transformations to the data before finally sending it to its ultimate destination. If we really think that COPY ... RETURNING is only going to be used in a CTE or similar, then we could always only support that and forgo any changes to the FE/BE protocol to support it. Or, at least, take the simplest approach to supporting it which would involve caching the data entirely before sending it back to the client (isn't that what we do on a big INSERT ... VALUES ... RETURNING anyway? people can transfer in blocks if they want to with INSERT .. VALUES or COPY .. RETURNING). Thanks, Stephen
[HACKERS] Taking the varattno in args (where part of a query)
Hi everyone, I'm working on a project and one of my tasks is trying to take the WHERE part of queries. I'm programming inside the function exec_simple_query in postgres.c. I searched inside the source code and, from what I've understood, I need to take the varattno in args in qpqual, is that right? If yes, how do I have to save it in order to have it in that function? If no, what is the easiest way to take it? Example: For a query: select firstname from friend where age=33 The attributes of the table friend are: firstname, lastname, city, state, age. I need to have the number 5 because age is the fifth attribute of friend. Thanks
Re: [HACKERS] Graph datatype addition
On Thu, May 2, 2013 at 7:58 AM, Atri Sharma atri.j...@gmail.com wrote: Sent from my iPad On 02-May-2013, at 4:33, Misa Simic misa.si...@gmail.com wrote: On Wednesday, May 1, 2013, Atri Sharma wrote: Hi all, Please find a probable prototype for the same: struct GraphNode { Oid NodeOid;// Oid of the row which is the node here. We will store an identifier to it here rather than the complete row(with data) itself. AdjacencyList *list; // Pointer to the node's adjacency list. }; struct AdjacencyList { Oid[] neighbours_list; }; struct AdjacencyList is probably the 'hottest' data structure in our entire implementation. We can think of making a cache of recently accessed struct AdjacencyList instances, or the AdjacencyList(s) of the neighbours of the recently accessed nodes, because, they are most likely to be accessed in near future. Advice here, please? So. struct AdjacencyCache { Oid[] cache_values; }; push and pop functions for AdjacencyCache follow. We need a replacement and invalidation algorithm for the cache. I feel a version of LRU should be good here. I have not given a prototype for operations and algorithm implementations. I feel,as suggested by Peter and Jaime, we can look at pgRouting code for algorithm implementations. Florian's concerns are mitigated here to some extent,IMO. Since the nodes and linkings are loosely coupled, and not represented as a single representation, updating or changing of any part or adding a new edge is no longer an expensive operation, as it only requires a lookup of GraphNode and then its AdjacencyList. If we use the cache as well, it will further reduce the lookup costs. I have not yet thought of the user visible layer as suggested by Jim. Probably. once we are ok with the internal layer, we can move to the user visible layer. Advice/Comments/Feedback please? Honestly - I think I dont understand proposal... 
Datatypes - are about values - what will be stored in that column in a table Datatype - can't have any clue about rows How I understand what you described - you can achieve the same with pure SQL - structs are equivalent to graph tables... Instead of Oid, the column will store PKs of the nodes table... Yes, I agree. I need to think more. Let me get back with a deeper proposal. Regards, Atri Hi all, In continuation of the above discussion, I have been thinking about the design of the data type. I have been thinking on the lines of making a multi dimensional data structure for the same: Specifically, I have been thinking about making multi lists for representing data. After some research, I think that the following data structure may help: Each node will be represented as: [Down Pointer][Atom][Right Pointer] Suppose, a graph is like (sorry for the bad drawing): [small example graph and its multilist layout; the fixed-width drawing was garbled in the archive] Essentially, the Atom flag denotes if the node has any out edges from it. If it has no out edge, ATOM is 0 and Down Pointer points to an auxiliary structure which holds the node's data (hence, the data can be of arbitrary size). If the node has some out degree, then, those nodes are added to a new sublist which starts from the node which spawns those nodes. The node's down pointer points to the head of the new sublist. Essentially, a sublist has all the nodes directly spawning from the head node of the sublist. This approach has multiple advantages in terms of memory and efficiency. Also, isolating subgraphs based on some criteria is pretty efficient, which is good for many analytics based operations. Access time is bounded as well. Thoughts/Comments/Feedback please? Regards, Atri -- Regards, Atri l'apprenant
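The GraphNode/AdjacencyList split plus the LRU'd AdjacencyCache proposed earlier in the thread can be modeled roughly as follows. This is a toy Python sketch, not the proposed C implementation; GRAPH and fetch_from_storage are hypothetical stand-ins for the real table lookup by Oid.

```python
from collections import OrderedDict

# Toy model of the proposal: nodes keyed by oid, adjacency lists fetched
# lazily and kept in an LRU cache of bounded capacity.

GRAPH = {1: [2, 3], 2: [4], 3: [4], 4: []}   # node oid -> neighbour oids

class AdjacencyCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()          # oid -> neighbour list, LRU order

    def fetch_from_storage(self, oid):
        return GRAPH[oid]                     # placeholder for a real lookup

    def neighbours(self, oid):
        if oid in self.entries:
            self.entries.move_to_end(oid)     # mark as most recently used
            return self.entries[oid]
        lst = self.fetch_from_storage(oid)
        self.entries[oid] = lst
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return lst

cache = AdjacencyCache(capacity=2)
cache.neighbours(1); cache.neighbours(2); cache.neighbours(3)
print(list(cache.entries))   # [2, 3] -- node 1 was evicted
```

The point of the sketch is the loose coupling discussed in the thread: adding an edge touches only one adjacency list, and the cache bounds repeated lookups of hot nodes.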
Re: [HACKERS] Graph datatype addition
On 5/8/13 1:40 PM, Atri Sharma wrote: On Thu, May 2, 2013 at 7:58 AM, Atri Sharma atri.j...@gmail.com wrote: Sent from my iPad On 02-May-2013, at 4:33, Misa Simic misa.si...@gmail.com wrote: On Wednesday, May 1, 2013, Atri Sharma wrote: Hi all, Please find a probable prototype for the same: struct GraphNode { Oid NodeOid;// Oid of the row which is the node here. We will store an identifier to it here rather than the complete row(with data) itself. AdjacencyList *list; // Pointer to the node's adjacency list. }; struct AdjacencyList { Oid[] neighbours_list; }; struct AdjacencyList is probably the 'hottest' data structure in our entire implementation. We can think of making a cache of recently accessed struct AdjacencyList instances, or the AdjacencyList(s) of the neighbours of the recently accessed nodes, because, they are most likely to be accessed in near future. Advice here, please? So. struct AdjacencyCache { Oid[] cache_values; }; push and pop functions for AdjacencyCache follow. We need a replacement and invalidation algorithm for the cache. I feel a version of LRU should be good here. I have not given a prototype for operations and algorithm implementations. I feel,as suggested by Peter and Jaime, we can look at pgRouting code for algorithm implementations. Florian's concerns are mitigated here to some extent,IMO. Since the nodes and linkings are loosely coupled, and not represented as a single representation, updating or changing of any part or adding a new edge is no longer an expensive operation, as it only requires a lookup of GraphNode and then its AdjacencyList. If we use the cache as well, it will further reduce the lookup costs. I have not yet thought of the user visible layer as suggested by Jim. Probably. once we are ok with the internal layer, we can move to the user visible layer. Advice/Comments/Feedback please? Honestly - I think I dont understand proposal... 
Datatypes - are about values - what will be stored in that column in a table Datatype - cant have any clue about rows How I understand what you described - you can achieve the same with pure SQL - struct are equvalent to graph tables... Instead od Oid column will store PKs of nodes table... Yes, I agree.I need to think more. Let me get back with a deeper proposal. Regards, Atri Hi all, In continuation of the above discussion,I have been thinking about the design of the data type. I have been thinking on the lines of making a multi dimensional data structure for the same: Specifically, I have been thinking about making multi lists for representing data. After some research, I think that the following data structure may help: Each node will be represented as: [Down Pointer][Atom][Right Pointer] Suppose, a graph is like(sorry for the bad drawing): B / AD \ / C \ E can be represented as: C's dataE's data D's data ^ ^ ^ A's data [|][1][--[|][1][--[|][1][NULL] ^ ^ [|][1][--[|][0][-[|][1][NULL] ^ B's data Essentially, the Atom flag denotes if the node has any out edges from it. If it has no out edge, ATOM is 0 and Down Pointer points to an auxiliary structure which holds the node's data(hence, the data can be of arbitrary size). If the node has some out degree, then, those nodes are added to a new sublist which starts from the node which spawns those nodes.Node's down pointer points to the head of the new sublist. Essentially, a sublist has all the nodes directly spawning from the head node of the sublist. This approach has multiple advantages in term of memory and efficiency. Also, isolating sub graphs based on some criteria is pretty efficient, which is good for many analytics based operations. Access time is restricted as well. Thoughts/Comments/Feedback please? Your second drawing didn't really make any sense to me. :( I do think it would be most productive to focus on what the API for dealing with graph data would look like before trying to handle the storage aspect. 
The storage is potentially dirt-simple, as others have shown. The only challenge would be efficiency, but it's impossible to discuss efficiency without some clue of how the data will be accessed. Frankly, for the first round of this I think it would be best if the storage really was just some raw tables. Once something is available people will start figuring out how to use it, and where the API needs to be improved. -- Jim C. Nasby, Data Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net
Re: [HACKERS] Graph datatype addition
Your second drawing didn't really make any sense to me. :( I do think it would be most productive to focus on what the API for dealing with graph data would look like before trying to handle the storage aspect. The storage is potentially dirt-simple, as others have shown. The only challenge would be efficiency, but it's impossible to discuss efficiency without some clue of how the data will be accessed. Frankly, for the first round of this I think it would be best if the storage really was just some raw tables. Once something is available people will start figuring out how to use it, and where the API needs to be improved. -- Thanks for your reply. Yes, my drawing sucks. Heh. Ok, I agree. I was pretty perked up about efficiency in storage, hence started designing. Sketching out an API in terms of functionalities will require a different viewpoint. I think make, insert, search, delete functionalities would be directly exposed to the user. Then, functionalities to isolate subgraphs based on some criterion/criteria and implementations of standard graph algorithms (BFS, DFS, Dijkstra's algorithm) can be exposed through single functions. Regards, Atri -- Regards, Atri l'apprenant
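As a rough illustration of the single-function algorithm exposure mentioned above, BFS over a plain adjacency-dict representation might look like this. A hypothetical sketch only, unrelated to any on-disk format under discussion.

```python
from collections import deque

# Minimal breadth-first search over an adjacency-dict graph.

def bfs_order(graph, start):
    """Return nodes reachable from `start` in breadth-first order."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nb in graph.get(node, ()):
            if nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return order

g = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": [], "E": []}
print(bfs_order(g, "A"))   # ['A', 'B', 'C', 'D', 'E']
```

DFS and Dijkstra would slot in behind the same kind of one-call interface, which is what makes "expose each algorithm as a single function" attractive as a first user-visible layer.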
Re: [HACKERS] about index inheritance
On Wed, May 08, 2013 at 10:19:08AM +0200, Vincenzo Melandri wrote: On Tue, May 7, 2013 at 11:55 PM, Robert Haas robertmh...@gmail.com wrote: This is a really hard problem. If you pick this as your first project hacking on PostgreSQL, you will almost certainly fail. Thank you very much, i guessed that already -.- Still, I needed that at my office for a long time, struggled with it many times and had to come out with some exotic solutions... Now I have spare time between projects, so I can work on it full-time. At least it's worth a try, isn't it? Well, you can work on it but I think it will be less programming and more coming up with a feasible solution. Anyway, I'm working to better understand the problem, trying to identify at least the main involved points. At the moment I'm figuring out how the inherit mechanism works for relations (in tablecmds.c).. Then I'll figure out about how indexes work.. While there are probably old threads in the archives, I find the easiest way to look at the problem is in the locking. In particular, I think if you can get unique indexes to work then the rest will follow. Consider the case of an inheritance hierarchy where you want a unique index on a column. Since you want to be able to create and drop children easily, each child needs to have an index just for itself. But if you insert a row into one child you need to, somehow, prevent other people also inserting the same value in a different child. Efficiently and deadlock-free. This is hard, though we're up for crazy, out-of-the-box ideas. Note, there is one very special case, namely: - The children are used for partitioning. - The unique index you want is on the partition key. Since each value can only possibly appear in one table your locking problems vanish. The question is: how often does this happen? 
Hope this helps, -- Martijn van Oosterhout klep...@svana.org http://svana.org/kleptog/ He who writes carelessly confesses thereby at the very outset that he does not attach much importance to his own thoughts. -- Arthur Schopenhauer
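Martijn's special case can be illustrated with a toy model (hypothetical, with child tables modeled as sets): when the unique key is also the partition key, routing alone guarantees a value can only ever collide inside one child, so no other child needs to be consulted or locked.

```python
# Toy model: uniqueness on the partition key needs only a local check.

children = {0: set(), 1: set(), 2: set()}    # stand-ins for child tables

def insert_unique(key):
    child = children[key % len(children)]    # partition routing on the key
    if key in child:
        raise ValueError("duplicate key %r" % key)
    child.add(key)                           # only this one child is touched

insert_unique(7)
insert_unique(8)
try:
    insert_unique(7)                         # routes to the same partition
except ValueError as e:
    print(e)                                 # duplicate key 7
```

The general case, a unique index on a non-partition column, is exactly what this model cannot express: there the check would have to visit (and coordinate with concurrent inserters into) every child, which is where the efficiency and deadlock problems live.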
Re: [HACKERS] RETURNING syntax for COPY
On 5/8/13 12:54 PM, Jonathan S. Katz wrote: On May 8, 2013, at 1:16 PM, Tom Lane wrote: Heikki Linnakangas hlinnakan...@vmware.com writes: On 08.05.2013 19:44, Tom Lane wrote: No there isn't; what you suggest would require FE/BE protocol extensions, making it several orders of magnitude more work than the other thing. I'd imagine that the flow would go something like this: BE FE CopyInResponse CopyData CopyData ... CopyDone RowDescription DataRow DataRow CommandComplete That would require the backend to buffer the entire query response, which isn't a great idea. I would expect that such an operation would need to interleave CopyData to the backend with DataRow responses. Such a thing could possibly be built on COPY_BOTH mode, but it would be a lot of work (at both ends) for extremely debatable value. The general idea of COPY is to load data as fast as possible, so weighing it down with processing options seems like a pretty dubious idea even if the implementation were easy. There are cases that I indeed want to load data very quickly, but I want to perform an operation on it immediately after, e.g. removing bad data that was immediately added from that copy. For instance, I do have this scenario: WITH new_data AS ( COPY FROM ... RETURNING id, field_to_check ) DELETE FROM table USING new_data WHERE table.id = new_data.id AND new_data.field_to_check ~* 'bad data'; Now I can take care of that all in one step, and I know I'm only removing fields I just added. This comes up when I am importing external files from other sources where I may not necessarily want all of the rows or some of the rows contain bad data. This also presumes that COPY works in a CTE, which I'm not sure it does (and I will do the TIAS test after I hit send on this message). What you're really asking for here is some kind of stream processing capability. There are spin-offs of Postgres that provide that capability (I know Neil Conway worked on some). 
Those are geared completely around stream processing, but I think it would be extremely interesting to provide some minimal support for that in community Postgres. Using your use case as an example, something like this would be very interesting: COPY table FROM ... WHERE field_to_check !~* 'bad data' ; In this case we're just applying a simple WHERE clause against each incoming row. Perhaps what I'm suggesting could be implemented with a CTE, but I'm not sure it makes sense to do it the way you propose, at least not initially. A CTE would provide so much flexibility that it'd be difficult for the optimizer to be efficient about it. Something like a WHERE clause directly on COPY would be a lot easier to handle. As someone mentioned, FDW might be another option there. -- Jim C. Nasby, Data Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
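The COPY ... WHERE idea above amounts to rejecting rows before they are ever written, rather than inserting and deleting afterwards. In client-side terms the effect can be sketched like this (hypothetical example; the regex plays the role of the proposed !~* 'bad data' clause):

```python
import csv, io, re

# Filter rows while loading instead of deleting them after the fact.

raw = "1,ok\n2,BAD DATA here\n3,fine\n"
bad = re.compile("bad data", re.IGNORECASE)

loaded = [row for row in csv.reader(io.StringIO(raw))
          if not bad.search(row[1])]          # reject rows before they land

print(loaded)   # [['1', 'ok'], ['3', 'fine']]
```

As the thread notes, never materializing the bad tuple is cheaper than the CTE formulation, which creates the row and then nukes it.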
Re: [HACKERS] RETURNING syntax for COPY
On 5/8/13 12:33 PM, Dimitri Fontaine wrote: Karol Trzcionka karl...@gmail.com writes: as a continuation of my proposal expanding RETURNING syntax by What about implementing support for OLD/NEW in per-statement triggers? I guess you would expose the data via a SRF. Per statement NEW/OLD is an interesting case, in that it shares some of the same challenges; namely how to store the NEW and OLD recordsets efficiently. I've wondered if there'd be some way to do that by just storing a list of CTIDs (not sure if that'd work with HOT for OLD though). I still like the idea of being able to exclude certain records during COPY though; not writing a tuple will always be more efficient than creating one and then nuking it after the fact. There's a similar argument to be made about in-line transforms too. -- Jim C. Nasby, Data Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] about index inheritance
On 5/8/13 2:17 PM, Martijn van Oosterhout wrote: On Wed, May 08, 2013 at 10:19:08AM +0200, Vincenzo Melandri wrote: On Tue, May 7, 2013 at 11:55 PM, Robert Haas robertmh...@gmail.com wrote: This is a really hard problem. If you pick this as your first project hacking on PostgreSQL, you will almost certainly fail. Thank you very much, i guessed that already -.- Still, I needed that at my office for a long time, struggled with it many times and had to come out with some exotic solutions... Now I have spare time between projects, so I can work on it full-time. At least it's worth a try, isn't it? Well, you can work on it but I think it will be less programming and more coming up with a feasible solution. Anyway, I'm working to better understand the problem, trying to identify at least the main involved points. At the moment I'm figuring out how the inherit mechanism works for relations (in tablecmds.c).. Then I'll figure out about how indexes work.. While there are probably old threads in the archives, I find the easiest way to look at the problem is in the locking. In particular, I think if you can get unique indexes to work then the rest will follow. Consider the case of an inheritance hierarchy where you want a unique index on a column. Since you want to be able to create and drop children easily, each child needs to have an index just for itself. But if you insert a row into one child you need to, somehow, prevent other people also inserting the same value in a different child. Efficiently and deadlock-free. This is hard, though we're up for crazy, out-of-the-box ideas. Note, there is one very special case, namely: - The children are used for partitioning. - The unique index you want is on the partition key. Since each value can only possibly appear in one table your locking problems vanish. The question is: how often does this happen? I would also consider indexes that span multiple tables but do NOT involve inheritance. 
That's the most generic case, so if you can make that work everything else should fall into place. The only caveat is that UPDATE and DELETE in an inheritance tree could produce unique challenges since they would start off by reading from more than one table. -- Jim C. Nasby, Data Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RETURNING syntax for COPY
On 05/08/2013 03:23 PM, Jim Nasby wrote: WITH new_data AS ( COPY FROM ... RETURNING id, field_to_check ) Why is this better than this, which you can do today? WITH new_data AS ( INSERT into ... FROM foreign_table_with_file_fdw RETURNING ... ) The whole reason I abandoned trying to do this sort of thing with COPY was that I realized the FDW would provide what I wanted. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RETURNING syntax for COPY
On Wed, May 08, 2013 at 03:38:10PM -0400, Andrew Dunstan wrote: On 05/08/2013 03:23 PM, Jim Nasby wrote: WITH new_data AS ( COPY FROM ... RETURNING id, field_to_check ) Why is this better than this, which you can do today? WITH new_data AS ( INSERT into ... FROM foreign_table_with_file_fdw RETURNING ... ) The whole reason I abandoned trying to do this sort of thing with COPY was that I realized the FDW would provide what I wanted. You need to first CREATE EXTENSION file_fdw. Then you need to CREATE SERVER. Then CREATE FOREIGN TABLE. Which requires appropriate permission to do those things, and certainly has no hope of working on the client side. -Ryan P. Kelly
Re: [HACKERS] GSOC Student Project Idea
On 5/8/13 3:54 AM, Heikki Linnakangas wrote: On 24.04.2013 14:31, Florian Pflug wrote: On Apr23, 2013, at 23:25 , Alexander Korotkovaekorot...@gmail.com wrote: I've taken a brief look on the paper and implementation. As I can see iDistance implements some global building strategy. I mean, for example, it selects some point, calculates distances from selected point to all points in dataset etc. So, it uses the whole dataset at the same time. However you can try to implement global index building in GiST or SP-GiST. In this case I think you should carefully estimate your capabilities during single GSoC project. You would need to extend GiST or SP-GiST interface and write completely new implementation of tree building algorithm. Question of how to exactly extend GiST or SP-GiST interface for this case could appear to be very hard even theoretically. +1. That seemed to be a major roadblock to me too when I read the paper. You could work around that by making partition identification a separate step. You'd have a function idist_analyze(cfg name, table name, field name) which'd identify suitable partitions for the data distribution in table.field and store them somewhere. Such a set of pre-identified partitions would be akin to a tsearch configuration, i.e. all other parts of the iDistance machinery would use it to map points to index keys and queries to ranges of those keys. You'll want to look at how tsearch handles that, and check if the method can indeed be applied to iDistance. You could perform that step as part of the index build. Before the index build starts to add tuples to the index, it could scan a random sample of the heap and identify the partitions based on that. If you need to store the metadata, like a map of partitions, it becomes difficult to cajole this into a normal GiST or SP-GiST opclass. The API doesn't have any support for storing such metadata. 
In a first cut, you'd probably only allow inserts into index which don't change the maximum distances from the partition centers that idist_analyze() found. That seems like a pretty serious restriction. I'd try to write it so that you can insert any value, but if the new values are very different from any existing values, it would be OK for the index quality to degrade. For example, you could simply add any out-of-bounds values to a separate branch in the index, which would have no particular structure and would just have to be scanned on every query. You can probably do better than that, but that would be a trivial way to deal with it. Or you could use the new insert to start a new partition. Heck, maybe the focus should actually be on partitions and not individual records/points. ISTM the entire challenge here is figuring out a way to maintain a set of partitions that: - Are limited enough in number that you can quickly perform operations/searches across all partitions - Yet small enough that once you've narrowed down a set of partitions you don't have a ton of raw records to still look at Before we had range types I experimented with representing time ranges as rectangles of varying size (ie: for (start, end), create rectangle(point(start,start), point(end,end)). The problem with that is you had to convert timestamp into a float, which was not exact. So when querying you could use a GiST index on all the rectangles to narrow your scope, but you still needed a set of exact clauses (ie: start = now() - '1 year' AND end = now()). Partitions would be similar in that they wouldn't be exact but could greatly narrow the search space (of course we'd want to handle the secondary exact checking internally instead of exposing the user to that). -- Jim C. 
Nasby, Data Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
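For context on the machinery being discussed, the core iDistance mapping (from the paper, not from any posted patch) turns each point into a scalar key of the form partition_id * C + distance-to-center, so an ordered index can serve range queries per partition. In the sketch below the centers are assumed to have been precomputed by something like the proposed idist_analyze() step, and C is an assumed constant larger than any in-partition distance.

```python
import math

# Rough sketch of the iDistance key mapping: assign each point to its
# nearest reference point, then linearize it as partition_id * C + distance.

C = 1000.0

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def idistance_key(point, centers):
    i = min(range(len(centers)), key=lambda j: dist(point, centers[j]))
    return i * C + dist(point, centers[i])

points = [(0, 0), (0.5, 0.5), (10, 10), (10.2, 9.8)]
centers = [(0, 0), (10, 10)]        # pretend idist_analyze() found these
keys = sorted(idistance_key(p, centers) for p in points)
# Points near center 0 get keys in [0, C); points near center 1 in [C, 2C).
print([int(k // C) for k in keys])   # [0, 0, 1, 1]
```

This also shows why the out-of-bounds discussion matters: a new point much farther from every center than anything seen at analyze time would need either its own overflow branch or a new partition, as suggested above.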
[HACKERS] Re: [GENERAL] pg_upgrade fails, mismatch of relation OID - 9.1.9 to 9.2.4
On Wed, May 8, 2013 at 02:27:18PM -0400, Evan D. Hoffman wrote: If you want to start the old cluster, you will need to remove the .old suffix from /var/lib/pgsql/9.1/data/global/pg_control.old. Because link mode was used, the old cluster cannot be safely started once the new cluster has been started. Linking user relation files /var/lib/pgsql/9.1/data/base/16406/3016054 Mismatch of relation OID in database dbname: old OID 2938685, new OID 299721 Failure, exiting [ Moved to hackers ] OK, that is odd. We preserve old/new OIDs (not relfilenodes, as someone suggested in this thread); FYI: * FYI, while pg_class.oid and pg_class.relfilenode are initially the same * in a cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM * FULL. The new cluster will have matching pg_class.oid and * pg_class.relfilenode values and be based on the old oid value. This can * cause the old and new pg_class.relfilenode values to differ. In summary, * old and new pg_class.oid and new pg_class.relfilenode will have the * same value, and old pg_class.relfilenode might differ. The problem reported is that pg_dump was not able to preserve the old/new OIDs between clusters. Can you get the answer for this query on the old cluster: SELECT relname from pg_class where oid = 2938685; and on the new cluster, assuming you used 'copy' mode so you can start the old/new clusters independently: SELECT relname from pg_class where oid = 299721; I think we will find that there is something in pg_dump related to this table that isn't preserving the OIDs. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] corrupt pages detected by enabling checksums
On 4/5/13 6:39 PM, Jeff Davis wrote: On Fri, 2013-04-05 at 10:34 +0200, Florian Pflug wrote: Maybe we could scan forward to check whether a corrupted WAL record is followed by one or more valid ones with sensible LSNs. If it is, chances are high that we haven't actually hit the end of the WAL. In that case, we could either log a warning, or (better, probably) abort crash recovery. +1. Corruption of fields which we require to scan past the record would cause false negatives, i.e. not trigger an error even though we do abort recovery mid-way through. There's a risk of false positives too, but they require quite specific orderings of writes and thus seem rather unlikely. (AFAICS, the OS would have to write some parts of record N followed by the whole of record N+1 and then crash to cause a false positive). Does the xlp_pageaddr help solve this? Also, we'd need to be a little careful when written-but-not-flushed WAL data makes it to disk, which could cause a false positive and may be a fairly common case. Apologies if this is a stupid question, but is this mostly an issue due to torn pages? IOW, if we had a way to ensure we never see torn pages, would that mean an invalid CRC on a WAL page indicated there really was corruption on that page? Maybe it's worth putting (yet more) thought into the torn page issue... :/ -- Jim C. Nasby, Data Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net
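Florian's scan-forward heuristic can be modeled in miniature (purely illustrative; real WAL records, LSNs, and CRC coverage are more involved): on a CRC failure, look ahead, and treat the failure as corruption rather than end-of-WAL if later records still validate.

```python
import zlib

# Toy model of the scan-forward idea: a "WAL" is a list of records that
# each carry a CRC over their payload.

def rec(lsn, payload):
    return {"lsn": lsn, "payload": payload, "crc": zlib.crc32(payload)}

def classify_failure(records, bad_idx):
    """Return 'corruption' if any record after bad_idx still validates."""
    for r in records[bad_idx + 1:]:
        if zlib.crc32(r["payload"]) == r["crc"]:
            return "corruption"              # readable data beyond the failure
    return "end-of-wal"                      # nothing valid follows

wal = [rec(1, b"aaa"), rec(2, b"bbb"), rec(3, b"ccc")]
wal[1]["payload"] = b"xxx"                   # corrupt the middle record
print(classify_failure(wal, 1))   # corruption
```

The false-negative case discussed above maps onto this model directly: if the corruption hits the length field needed to find record N+1 at all, the look-ahead cannot run and the failure is silently treated as end-of-WAL.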
Re: [HACKERS] Proposal to add --single-row to psql
On 5/1/13 7:36 PM, Robert Haas wrote: On Mon, Apr 29, 2013 at 4:33 PM, Jim Nasby j...@nasby.net wrote: On 4/28/13 7:50 AM, Craig Ringer wrote: I find it frustrating that I've never seen an @paraccel email address here and that few of the other vendors of highly customised Pg offshoots are contributing back. It's almost enough to make me like the GPL. FWIW, I think there's a pretty large barrier to these folks contributing back. Would the community really want to add a bunch of hooks to support something like Redshift? Or Greenplum? Or etc, etc.? Most of these guys have to change significant amounts of PG code, so much so that it's actually hard for them to stay current (which is why most of them just fork). I do think this is a shame, but I'm not sure of any good way to fix it. Yep. There are plenty of things that we do at EDB for good and valid business reasons that I can't imagine the community accepting under any circumstances. For example, Oracle compatibility is not something the community values as highly as EnterpriseDB (and our customers) do. I'm sure that many of those vendors are in similar situations - they write code that only runs on specialized hardware, or (rather commonly, I suspect) they remove parts of the functionality in order to make certain things very fast. Those are not trade-offs that make sense for PostgreSQL, but I find it hard to understand what we'd gain from preventing other people from making them. There are in fact a pretty large number of companies - EnterpriseDB, obviously, but there are many, many others - that are choosing to build businesses around PostgreSQL precisely because it *isn't* GPL. Personally, I think that's a good thing for our community in terms of mindshare even when companies choose not to contribute back - and it's even better when they do.
FWIW, one point I was trying to make that was overlooked is that it seems to be exceptionally difficult for companies to fork Postgres and then stay current (AFAIK EnterpriseDB and Mammoth are the only products that have pulled that feat off). I believe that makes it significantly harder for them to actually contribute code back that doesn't give them a business advantage, as well as making it harder to justify hacking on the community codebase because they'll just face a very large hurdle when it comes to pulling that code back into their proprietary product. I don't know of any good way to solve that problem. Maybe it's not worth solving... but I do suspect there's some useful stuff that the community has lost out on because of this. -- Jim C. Nasby, Data Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal to add --single-row to psql
On Wed, May 08, 2013 at 06:08:28PM -0500, Jim Nasby wrote: On 5/1/13 7:36 PM, Robert Haas wrote: On Mon, Apr 29, 2013 at 4:33 PM, Jim Nasby j...@nasby.net wrote: On 4/28/13 7:50 AM, Craig Ringer wrote: I find it frustrating that I've never seen an @paraccel email address here and that few of the other vendors of highly customised Pg offshoots are contributing back. It's almost enough to make me like the GPL. FWIW, I think there's a pretty large barrier to these folks contributing back. Would the community really want to add a bunch of hooks to support something like Redshift? Or Greenplum? Or etc, etc.? Most of these guys have to change significant amounts of PG code, so much so that it's actually hard for them to stay current (which is why most of them just fork). I do think this is a shame, but I'm not sure of any good way to fix it. Yep. There are plenty of things that we do at EDB for good and valid business reasons that I can't imagine the community accepting under any circumstances. For example, Oracle compatibility is not something the community values as highly as EnterpriseDB (and our customers) do. I'm sure that many of those vendors are in similar situations - they write code that only runs on specialized hardware, or (rather commonly, I suspect) they remove parts of the functionality in order to make certain things very fast. Those are not trade-offs that make sense for PostgreSQL, but I find it hard to understand what we'd gain from preventing other people from making them. There are in fact a pretty large number of companies - EnterpriseDB, obviously, but there are many, many others - that are choosing to build businesses around PostgreSQL precisely because it *isn't* GPL. Personally, I think that's a good thing for our community in terms of mindshare even when companies choose not to contribute back - and it's even better when they do.
FWIW, one point I was trying to make that was overlooked is that it seems to be exceptionally difficult for companies to fork Postgres and then stay current (AFAIK EnterpriseDB and Mammoth are the only products that have pulled that feat off). VMware and CitusDB are doing pretty well so far, but it's early days. I believe that makes it significantly harder for them to actually contribute code back that doesn't give them a business advantage, as well as making it harder to justify hacking on the community codebase because they'll just face a very large hurdle when it comes to pulling that code back into their proprietary product. I don't know of any good way to solve that problem. Maybe it's not worth solving... but I do suspect there's some useful stuff that the community has lost out on because of this. Some of this is getting solved by making PostgreSQL more pluggable in ways that isolate the proprietary stuff, i.e. make people not have to touch the PostgreSQL core code much, if at all, in order to provide whatever special features they provide. Hooks and FDWs are two such pluggable components. Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
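As one concrete instance of the pluggable pieces David mentions, a foreign data wrapper plugs in without touching core code at all. A sketch using the stock postgres_fdw extension (the server, host, user, and table names here are invented for illustration):

```sql
-- postgres_fdw ships with PostgreSQL as a contrib extension.
CREATE EXTENSION postgres_fdw;

-- Point at a (hypothetical) remote server and map the local user to it.
CREATE SERVER remote_pg FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'remote.example.com', dbname 'sales');
CREATE USER MAPPING FOR CURRENT_USER SERVER remote_pg
    OPTIONS (user 'app', password 'secret');

-- Queries against the foreign table go through the FDW API,
-- entirely outside the core heap-access paths.
CREATE FOREIGN TABLE orders_remote (id int, total numeric)
    SERVER remote_pg OPTIONS (table_name 'orders');
SELECT count(*) FROM orders_remote;
```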
Re: [HACKERS] corrupt pages detected by enabling checksums
On Wed, 2013-05-08 at 17:56 -0500, Jim Nasby wrote: Apologies if this is a stupid question, but is this mostly an issue due to torn pages? IOW, if we had a way to ensure we never see torn pages, would that mean an invalid CRC on a WAL page indicated there really was corruption on that page? Maybe it's worth putting (yet more) thought into the torn page issue... :/ Sort of. For data, a page is the logically-atomic unit that is expected to be intact. For WAL, a record is the logically-atomic unit that is expected to be intact. So it might be better to say that the issue for the WAL is torn records. A record might be larger than a page (it can hold up to three full-page images in one record), but is often much smaller. We use a CRC to validate that the WAL record is fully intact. The concern is that, if it fails the CRC check, we *assume* that it's because it wasn't completely flushed yet (i.e. a torn record). Based on that assumption, neither that record nor any later record contains committed transactions, so we can safely consider that the end of the WAL (as of the crash) and bring the system up. The problem is that the assumption is not always true: a CRC failure could also indicate real corruption of WAL records that have been previously flushed successfully, and may contain committed transactions. That can mean we bring the system up way too early, corrupting the database. Unfortunately, it seems that doing any kind of validation to determine that we have a valid end-of-the-WAL inherently requires some kind of separate durable write somewhere. It would be a tiny amount of data (an LSN and maybe some extra crosscheck information), so I could imagine that would be just fine given the right hardware; but if we just write to disk that would be pretty bad. Ideas welcome. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fast promotion failure
On Tue, May 7, 2013 at 6:57 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: While testing the bug from the Assertion failure at standby promotion, I bumped into a different bug in fast promotion. When the first checkpoint after fast promotion is performed, there is no guarantee that the checkpointer process is running with the correct, new, ThisTimeLineID. In CreateCheckPoint(), we have this:

	/*
	 * An end-of-recovery checkpoint is created before anyone is allowed to
	 * write WAL. To allow us to write the checkpoint record, temporarily
	 * enable XLogInsertAllowed. (This also ensures ThisTimeLineID is
	 * initialized, which we need here and in AdvanceXLInsertBuffer.)
	 */
	if (flags & CHECKPOINT_END_OF_RECOVERY)
		LocalSetXLogInsertAllowed();

That ensures that ThisTimeLineID is updated when performing an end-of-recovery checkpoint, but it doesn't get executed with fast promotion. The consequence is that the checkpoint is created with the old timeline, and subsequent recovery from it will fail. I ran into this with the attached script. It sets up a master (M), a standby (B), and a cascading standby (C). I'm not sure why, but when I tried to simplify the script by removing the cascading standby, it started to work. The bug occurs in standby B, so I'm not sure why the presence of the cascading standby makes any difference. Maybe it just affects the timing.

Can this really happen? ISTM that the checkpointer should detect that the recovery mode ends and call RecoveryInProgress()->InitXLOGAccess() before calling CreateCheckPoint(). Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add regression tests for COLLATE
Hi, Please find attached the updated patch. Fabien pointed out that it currently does not check for non-trivial locales. I am still on the learning curve about LOCALEs and so, let me know if this is a show-stopper. I guess I could look at it and get back in some time with more tests, as Fabien points out. (Apologies for the delay though. An update to the patch was mostly done back in April, but since most of the other code-coverage patches (SCHEMA/ROLE/etc.) had no other feedback, I worked on all of them together just this week.) -- Robins Tharakan On 12 April 2013 09:28, Michael Paquier michael.paqu...@gmail.com wrote: On Thu, Apr 11, 2013 at 4:14 PM, Robins Tharakan thara...@gmail.com wrote: Hi, Please find attached a patch to take 'make check' code-coverage of COLLATE (/src/backend/commands/collationcmds) from 0% to 96%. Any feedback is more than welcome. Also posting this to Commitfest-next. Just by having a quick look at the patch, using object names of the type cX is too generic even if the tests are done in a private schema. Why not use a name like collate_obj_X or similar? -- Michael regress_collate_v2.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
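For reference, a portable collation test can lean only on the "C" and "POSIX" collations, which exist in every installation regardless of OS locale support. A sketch of the shape such tests might take (object names follow Michael's suggested collate_obj_X convention; not taken from the actual patch):

```sql
-- "C" is always available, so tests built on it are portable.
CREATE COLLATION collate_obj_1 FROM "C";

CREATE TABLE collate_tbl (t text COLLATE collate_obj_1);
INSERT INTO collate_tbl VALUES ('b'), ('A');

-- In the "C" collation, uppercase sorts before lowercase: 'A' then 'b'.
SELECT t FROM collate_tbl ORDER BY t;

-- Corner case expected to fail: unknown OS locale name.
CREATE COLLATION collate_obj_bad (locale = 'no_such_locale');
```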
Re: [HACKERS] Patch to add regression tests for SCHEMA
Hi, Please find attached an updated patch with the said changes. I'll try to update the other patches (if they pertain to this feedback) and update on their respective threads (as well as on Commitfest). -- Robins Tharakan On 8 May 2013 13:01, Fabien COELHO coe...@cri.ensmp.fr wrote: Dear Robins, Here is an updated patch that uses different schema / role names for different tests (as per commitfest site feedback). Short review about this version of the patch: This patch works for me. This test is a good thing and allows schema to be thoroughly tested, including corner cases which must fail because of errors or permissions. Two remarks: - test 2 bis: why the name 'pg_asdf'? why not 'pg_schema_sch<some number>' to be homogeneous with the other tests? - test 3: why not WHERE schema_name='schema_sch3' instead of two negative comparisons? ISTM that if for some reason a new schema name is added in the future, the test will fail. -- Fabien. regress_schema_v4.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
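Fabien's second remark boils down to a query shape like the following sketch (the exact predicates in the patch may differ; only 'schema_sch3' is taken from his example):

```sql
-- Fragile: excludes known schemas by name, so it breaks as soon as
-- any future test (or release) adds another schema to the catalog.
SELECT count(*) FROM information_schema.schemata
 WHERE schema_name <> 'information_schema'
   AND schema_name NOT LIKE 'pg_%';

-- Robust: asserts only what this test actually created.
SELECT count(*) FROM information_schema.schemata
 WHERE schema_name = 'schema_sch3';
```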
Re: [HACKERS] improving PL/Python builds on OS X
On Wed, 2013-05-08 at 18:24 +0100, Dave Page wrote: It's failing on Linux. Even worse, it configures fine and then builds without error. There is a message spewed out by configure, but it doesn't contain the words warning or error. Given that I explicitly said I wanted Python support when I ran configure, it should certainly fail with an error at configure time. We only noticed this was a problem when the QA guys started diving in to more detailed tested, as we don't watch for every message in the 50+ MB of logs our automated build systems generate. It worked before because we used to allow linking shared libraries against static libraries on some platforms. But that was more or less a lie because it doesn't work on 64-bit platforms. ActiveState Python contains a static library with PIC files. There is no obvious way to detect that, which is why we don't support it directly. You can sort it out yourself by building with make shared_libpython=yes In the long term, find a way to detect whether the library is usable. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add regression tests for COLLATE
Robins Tharakan thara...@gmail.com writes: Fabien pointed out that currently does not check for non-trivial locales. I am still on the learning curve about LOCALEs and so, let me know if this is a show-stopper. I guess I could look at it and get back in some time with more tests as Fabien points out. You really can't, because there is no guarantee that any given machine will have anything except C and POSIX. But there's another problem: I believe this test will fail on any machine where the database is created with an encoding different from UTF8, because that encoding is named in some of the error messages in the expected output. This stuff is not easy to test in a portable way. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Logging of PAM Authentication Failure
Hello, When the client authentication method is set to pam in pg_hba.conf, connecting using psql results in an authentication failure being logged even before a password prompt is shown; nonetheless, the user is subsequently able to connect by providing a password. Following is what is logged:

Password: LOG: pam_authenticate failed: Conversation error FATAL: PAM authentication failed for user amit

To see what's going on I debugged psql and found that without the -W option this is bound to happen: psql first attempts to connect without a password (it doesn't yet know one is required), fails, and then prompts for the password. The correct password then leads to a successful connection. I tried to observe the behavior with the md5 method (without -W) and observed that no authentication failure is logged, since the server probably behaves differently in response to psql's first connection request in that case. But the pam method leads to it being logged. Is this a problem? -- Amit Langote -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add regression tests for ROLE (USER)
Hi, Please find an updated patch as per comments on Commitfest (comments replicated below for ease of understanding).

Feedback 1: fc: role_ro2/3 used twice? rt: Corrected in this update.

Feedback 2: fc: I do not understand why asdf conveys anything about an expected failure. Association of Scientists, Developers and Faculties? :-) rt: ASDF is a pattern that I learnt in one of the tests (SEQUENCE?) that pre-existed when I started working. It's slang for arbitrary text that I just reused, thinking that it is normal practice here. Anyway, I have corrected that in this update.

Feedback 3: fc: 2030/1/1 - 2030-01-01? maybe use a larger date? rt: The 2030/1/1 date is not a failure point of the test. It needs to be a valid date (but sufficiently distant so that tests don't fail). I tried setting this to 2200/1/1 and I get the same error message. Let me know if this still needs to be a large date. fb: VALID UNTIL '-12-31' works for me... rt: I thought 20 years is a date sufficiently far ahead to ensure that this test doesn't fail. Sure, have updated the test to use /1/1. Also, I have added more tests at the end to ensure date-checks are also being validated in ALTER ROLE VALID UNTIL. Let me know if you need anything else changed in this.

-- Robins Tharakan On 20 March 2013 03:41, Robins Tharakan thara...@gmail.com wrote: Hi, Please find attached a patch to take 'make check' code-coverage of ROLE (USER) from 59% to 91%. Any feedback is more than welcome. -- Robins Tharakan regress_user_v2.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
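The VALID UNTIL checks under discussion look roughly like this sketch (the role name, password, and dates are illustrative, not the patch's actual values):

```sql
-- A sufficiently distant expiry keeps the test stable for a long time.
CREATE ROLE role_valid LOGIN PASSWORD 'secret' VALID UNTIL '2200-01-01';

-- ALTER ROLE accepts the same clause, so both paths can be exercised.
ALTER ROLE role_valid VALID UNTIL '2030-01-01';

-- Corner case expected to fail: the clause takes a timestamp,
-- so an unparseable value is rejected.
ALTER ROLE role_valid VALID UNTIL 'not-a-date';
```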
Re: [HACKERS] Fast promotion failure
On Thursday, May 09, 2013 6:29 AM Fujii Masao wrote: On Tue, May 7, 2013 at 6:57 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: While testing the bug from the Assertion failure at standby promotion, I bumped into a different bug in fast promotion. When the first checkpoint after fast promotion is performed, there is no guarantee that the checkpointer process is running with the correct, new, ThisTimeLineID. In CreateCheckPoint(), we have this:

	/*
	 * An end-of-recovery checkpoint is created before anyone is allowed to
	 * write WAL. To allow us to write the checkpoint record, temporarily
	 * enable XLogInsertAllowed. (This also ensures ThisTimeLineID is
	 * initialized, which we need here and in AdvanceXLInsertBuffer.)
	 */
	if (flags & CHECKPOINT_END_OF_RECOVERY)
		LocalSetXLogInsertAllowed();

That ensures that ThisTimeLineID is updated when performing an end-of-recovery checkpoint, but it doesn't get executed with fast promotion. The consequence is that the checkpoint is created with the old timeline, and subsequent recovery from it will fail. I ran into this with the attached script. It sets up a master (M), a standby (B), and a cascading standby (C). I'm not sure why, but when I tried to simplify the script by removing the cascading standby, it started to work. The bug occurs in standby B, so I'm not sure why the presence of the cascading standby makes any difference. Maybe it just affects the timing.

Can this really happen? ISTM that the checkpointer should detect that the recovery mode ends and call RecoveryInProgress()->InitXLOGAccess() before calling CreateCheckPoint().

Without fast promotion, it will request/perform the end-of-recovery checkpoint while still in recovery (before setting xlogctl->SharedRecoveryInProgress), so I think that before any new operation can start, it can make sure that the checkpoint with the new timeline is performed.
However, with fast promotion, the request for a checkpoint is made after recovery, so some operations can happen before a checkpoint with the new timeline is taken. I think it can happen that the last checkpoint is with the old timeline while there are already operations on the new timeline, which might have caused the problem Heikki has seen. With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers