Re: [HACKERS] Postgresql Materialized views
On Sun, 2008-01-13 at 17:44 -0800, Sean Utt wrote: It is not my contention that the core developers need to be different in any way. It is also not my contention that the users need to be different in any way. First, this is an open forum, so thank you for expressing your views openly in the manner you've felt them. General comment: I think one perspective I have on the above statement is the feeling that there is a distinction between two groups of people and that one group is put here to serve the other group better. Many new users of Postgres are so used to the closed source situation of Developers being the only people who can see the code that they often perpetuate the concept of tiering or groups, when it doesn't exist. Almost all of the people on the list are users of Postgres. There's just a complete range of people from new users to experienced hackers. Postgres is well documented, well commented and completely open source, so there is no barrier to anyone who wishes to change, and if you choose to define that change positively, improve. So I support Mark Mielke's views on writing code. Anybody who wants to code, can. There's probably a project of a size and complexity that's right for your first project. Apparently the guy that invented the new scheduling algorithms for Linux wasn't even a coder, but he sat down and worked it out. This is Hackers: Write some code today, everybody. You *can*. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Transaction Snapshot Cloning
On Sat, 2008-01-12 at 16:22 -0500, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: ... So if we did this via an installable module approach then we would be able to use it much sooner for upgrading to 8.3, rather than waiting for 8.4 I think it borders on ludicrous to imagine making this work with just an installable module and no core-code changes. So no, I refuse to spend any time helping to design an implementation for 8.3. Refuse is a strong word, so apologies if I've offended. I already know how to write it, the main question was how dangerous is it and I think we answered that. Misuse seems to be the issue you seem to be worried about. In the wrong hands it could prove to be a Foot Gun with a larger than normal kill zone. Dangerous enough that you actively want me to not write it? Or just saying you're not sure it can be done? It'll be a hard enough problem to make this work for 8.4. Agreed. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Declarative partitioning grammar
Hi, Jeff Cohen wrote: We did look at allowing general functions for partitioning and this was one concern. The other is that we want to enforce that a row only gets inserted into a single partition, so we wanted a declarative syntax where it was relatively easy to check that range and list specifications don't overlap. Why do you need to define a split point so ambiguously at all? Why not just give the DBA exactly *one* place to define the split point? I don't think the separation into list, hash and range partitioning is adequate. What is the system supposed to do, if you try to insert a row which doesn't fit any of the values in your list or doesn't fit any of the ranges you defined? I prefer a partitioning grammar which doesn't interfere with constraints. We all know how to define constraints. Please don't introduce a new, ambiguous way. A partitioning definition should be able to tell the target partition for *every* row which satisfies the constraints (the real ones, not ambiguous ones). IMO, a single DDL command should only touch a single split point, i.e. split a table into two partitions, move the split point or remove the split point (joining the partitions again). Those are the only basic commands you need to be able to handle partitioning. Sorry, but for my taste, the proposed grammar is too long per command, not flexible enough and instead ambiguous for split points as well as for constraints. To me it looks like repeating the mistakes of others. Regards Markus ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Postgresql Materialized views
On Mon, 2008-01-14 at 09:22 +, Simon Riggs wrote: So I support Mark Mielke's views on writing code. Anybody who wants to code, can. There's probably a project of a size and complexity that's right for your first project. The main problem is that usually that initial thing is not what you desperately need today... so the motivation will be pretty low unless you just have loads of time to start off playing with the code. Apparently the guy that invented the new scheduling algorithms for Linux wasn't even a coder, but he sat down and worked it out. This is Hackers: Write some code today, everybody. You *can*. Certainly everybody can write code, but the barrier to accept it is pretty high in the postgres community. So you better be a damn good coder if you expect your code to be accepted... and even then with considerable fight for justifying the use case for your feature ;-) This is all good for a stable product, but it really makes the barrier between simple users and hackers pretty high. Cheers, Csaba. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Postgresql Materialized views
Dear Friends, I hope that this flame war can stop, as it is useless. The logic of free software is that developers pick-up issues, based on their skills and interest. The power of the cummunity is to gather very talented developers from all over the planet. Freedom is the logic and there is no need to drive the community. In the end, PostgreSQL relies on the knownledge of talented developers. When posting this thread, I hope that a talented developer would some day pick-up the materialized view issue and work on it, during a process of discussion. In a few days, I will post some precise statistics on how much MATERIALIZED VIEW could boost a 400.000 posts PhpBB 3.0 forum. I will write the required PL trigger, test them and install them on my server. Then each individual is able to decide whether materiazed views are important or not. This is my definition of freedom. Freedom of choice. I don't ask for more. Now, I hope that the list can return to a more peaceful state. Kind regards, Jean-Michel Pouré ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] Using tapes on tuplesort.c
Hi to all. It seems that the current PostgreSQL implementation of the Replacement Selection (RS) algorithm [Knuth] changes a logical tape for each run built. I'm trying to implement that refinement to RS using 2 heaps instead of just one (2Way RS). Recall each heap is aimed at building its corresponding physical run, both heap cooperate building its own physical run associated to the same logical run). 2Way RS stops building the current logical run just after stop building both physical runs associated to the current logical run. My question: should I use/change tape for each physical run or for each logical run? I know you'll be probably busy with issues on the new PostgreSQL release, so I'll thank you twice for your reply. Regards, Manolo.
Re: [HACKERS] Index trouble with 8.3b4
Tom Lane wrote: I've committed a patch to do that. Please test CVS HEAD and see if you still see problems. I'm happy to hear you found something and I will try CVS HEAD in a minute. In the meantime let me report that the cluster issue happens with GIST as well. I have load 5 million rows in that table and did: test=# CREATE INDEX CONCURRENTLY ts_test_tsv_gist ON public.test USING gist (tsv); CREATE INDEX test=# CLUSTER test USING ts_test_tsv_gist; ERROR: could not create unique index test_pkey DETAIL: Table contains duplicated values. test=# But as far as I understood this is already covered by your thesis. -- Best regards, Hannes Dorbath ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Index trouble with 8.3b4
Tom Lane [EMAIL PROTECTED] writes: Oooh ... I can't be sure that this is what's biting you, but I definitely see a bug that seems to match the symptoms. As the comments in index.c point out, CREATE INDEX CONCURRENTLY works like this: * validate_index() works by first gathering all the TIDs currently in the * index, using a bulkdelete callback that just stores the TIDs and doesn't * ever say delete it. (This should be faster than a plain indexscan; * also, not all index AMs support full-index indexscan.) Then we sort the * TIDs, and finally scan the table doing a merge join against the TID list * to see which tuples are missing from the index. The scan is done using the regular heapscan code, which in 8.3 has been modified to enable synchronized scanning, which means it might start from the middle of the table and wrap around. Wow, I'm glad we caught this in beta. Thanks a lot to Hannes Dorbath for testing and reporting it. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services! ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Postgresql Materialized views
Jean-Michel Pouré wrote: When posting this thread, I hope that a talented developer would some day pick-up the materialized view issue and work on it, during a process of discussion. In case it was lost in the noise - there are several capable people that have been personally on this issue for years. You are not the first to suggest it, and your opinion that the feature is valuable is shared by a larger group. There is still a difference between talking about it and doing it. A few of the doers told me off privately stating that it is not that difficult. My suspicion is that it *is* difficult and they are not doing themselves credit, or their solution is incomplete, but whatever - the result is the same. When one or more of these people are ready, you will likely see it released. It may even be complete before 2008 is complete. In a few days, I will post some precise statistics on how much MATERIALIZED VIEW could boost a 400.000 posts PhpBB 3.0 forum. I will write the required PL trigger, test them and install them on my server. The two factors of note here: 1) If you can write a PL trigger for it, the value of the feature is limited. In my own case, I found it fairly easy to write triggers, or update the summary table from the application (poor man's view). 2) In my experience, a custom PL trigger can make assumptions about the application that allow greater optimization that a general 'syntactical sugar' solution could. In my own case, performance of queries leapt from 1500 ms to 1 ms. Even if materialized views were implemented to a level that most people would consider full, I do not expect to see the same speed improvement, because a generalized implementation would not be able to make the assumptions that I can. FYI, my triggers are perhaps 10 lines each, and I believe I have three triggers in the 1500 ms - 1 ms example. I have a view and a summary table. I update the summary table from the view. In my opinion, this solution is very manageable given the 1500:1 performance improvement it grants me. Then each individual is able to decide whether materiazed views are important or not. This is my definition of freedom. Freedom of choice. I don't ask for more. Now, I hope that the list can return to a more peaceful state You are doing fine. I am sorry for assuming you intended more and giving you a cold-ish shoulder. Cheers, mark -- Mark Mielke [EMAIL PROTECTED] ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Postgresql Materialized views
Mark Mielke wrote: FYI, my triggers are perhaps 10 lines each, and I believe I have three triggers in the 1500 ms - 1 ms example. I have a view and a summary table. I update the summary table from the view. In my opinion, this solution is very manageable given the 1500:1 performance improvement it grants me. But you had to modify your queries. I would think that a materialized views implementation worth its salt would put the view to work on the original, unmodified queries. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Postgresql Materialized views
-Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED] Sent: Sunday, January 13, 2008 8:18 PM To: Sean Utt Cc: Andrew Dunstan; Joshua D. Drake; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Postgresql Materialized views Sean Utt [EMAIL PROTECTED] writes: My point is simply this: The lack of a clear formal process for feature requests leads to this degradation in the conversation. Two comments: 1) The existing informal process has served us very well for more than ten years now. I'm disinclined to consider replacing it, because that would risk altering the community's dynamics for the worse. 2) In the end, this is an open source *community*; no amount of formal feature requesting will have any material impact on what actually gets implemented, because there isn't any central control. Wow. Being new to Open Source, this amazes me. What gets implemented is whatever individual contributors choose to work on, either because they find it interesting or (in some cases) because someone pays them to do something specific. Certainly, some contributors pay attention to what's being requested, but I see no reason to think that increasing the level of formality will help them. What happens when a person adds a feature or changes the architecture of the database that is perceived by some as incorrect or going in the wrong direction? Jon ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Postgresql Materialized views
On Mon, Jan 14, 2008 at 08:30:53AM -0600, Roberts, Jon wrote: My point is that you should be able to query _table and the system should automatically use the view, without you saying so (except by initially creating them). I agree! From a BI perspective, a materialized view is worthless if you have to re-write your query. There isn't a tool on the market that is smart enough to rewrite a query to a view because all other databases handle the rewriting internally. I don't know about worthless, given that people are doing materialised views on postgres already. It is however a completely orthoginal problem. Someone needs to write the code to maintain such a view before you can even think about working on the planner. However, step 1 would be to get them onto the TODO list. Have a nice day, -- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/ Those who make peaceful revolution impossible will make violent revolution inevitable. -- John F Kennedy signature.asc Description: Digital signature
Re: [HACKERS] Postgresql Materialized views
But you had to modify your queries. I would think that a materialized views implementation worth its salt would put the view to work on the original, unmodified queries. I might be slow today (everyday? :-) ) - but what do you mean by this? The only difference between *_table and *_view is that *_table is the summary table and *_view is the view. The triggers on the tables the view is derived from select from *_view and update *_table. The queries remain unchanged except for deciding whether to use *_table or *_view. Traditionally materialized views exist, so that you do not need to code differently. Your queries still run on the detail table, but are silently answered by a suitable MV. The MV might have count + other aggregated columns grouped by some columns, and thus be able e.g. shortcircuit a select count(*) from atab. The MV should be MVCC aware (have different values for different snapshots) and not substantially reduce possible concurrency of updates to the base table. For some further background - the base tables are a mirror of accpac tables (augh!) from mssql. The view and summary table gathers information from 5 or so of these tables including aggregates, conditionals, sub-selects (different queries to the same base tables) and deep joins. Perhaps my imagination is too limited - but I don't see how it would be easy to make syntactical sugar for this and still maintain the performance I describe above. For about 30 lines of pl/pgsql and some application-side updates (again from the view to the summary table) in the synchronization script it seems acceptable. As long as you can formulate a normal view on the above statement, you should be able to tell the db to materialize that. A good MV feature would be able to use that MV regardless of whether you select from the view, or use a statement that the view is a generalization of. I think MV's where originally invented to boost benchmark results and thus had to operate on given sql to base tables. Andreas ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Postgresql Materialized views
Roberts, Jon wrote: What gets implemented is whatever individual contributors choose to work on, either because they find it interesting or (in some cases) because someone pays them to do something specific. Certainly, some contributors pay attention to what's being requested, but I see no reason to think that increasing the level of formality will help them. What happens when a person adds a feature or changes the architecture of the database that is perceived by some as incorrect or going in the wrong direction? If that's the general perception it doesn't get added to our source tree. It's very rare that it gets to anything like as formal as a vote. To avoid the possibility of people spending lots of time doing work which is ultimately not adopted, we strongly discourage ivory tower development. For major features especially, developers are encouraged to discuss early and often. But that's a different issue from which items people work on, which is very much a matter of individual choice, or at least something the community has little control over. cheers andrew ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] Pl/Java broken since Postgresql 8.3-rc1
Hi there, as you now is plJava broken with the actual security releases. There is a pljava.dll at http://www.ejurka.com/pgsql/pljava/83rc1/ to fix it for version 8.3RC1. Is a pljava.dll for version 8.2.6 out? It's very important for me, need it for my office. Juergen ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Index trouble with 8.3b4
Hannes Dorbath [EMAIL PROTECTED] writes: In the meantime let me report that the cluster issue happens with GIST as well. ... But as far as I understood this is already covered by your thesis. Right, the bug is independent of which index AM you use (though the symptoms may vary). regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] could not open relation: Invalid argument
Roberts, Jon wrote: Version: PostgreSQL 8.2.5 on i686-pc-mingw32 I recently started getting this error message randomly, could not open relation 42904/42906/42985: Invalid argument. I also got it for a couple of other files. All three files are related to tables that have just a single row each. This has been reported before, and we've patched the system so that the error reported is better. In HEAD we also patched it so that the system would automatically retry a number of times if the problem is ERROR_SHARING_VIOLATION (or something like that). That patch was not backported to 8.2 due to lack of testing -- the original reported did not come back to try the patched version. If you want to try, the patch is here: https://projects.commandprompt.com/public/pgsql/changeset/29853 Let us know how it goes. If your problem is easily reproducible and the patch makes it go away, we would consider back-patching the fix. Thanks. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Postgresql Materialized views
-BEGIN PGP SIGNED MESSAGE- Hash: RIPEMD160 My point is simply this: The lack of a clear formal process for feature requests leads to this degradation in the conversation. Without a formalized structure, the conversation devolves rapidly into an argument over semantics and word choice. ... There needs to be a way to evaluate the demand for a specific feature as well as the benefits and the effort it will require. You could always start a page on the developer's wiki: http://developer.postgresql.org/ That would seem to be a good place to at least describe the problem in detail, show how you would like a feature to behave, and have people add the pros and cons of certain approaches. Certainly would be better to have a page to point to rather than trying to trawl through mailing archives (heck, the page could even mostly be a collection of such links). - -- Greg Sabino Mullane [EMAIL PROTECTED] PGP Key: 0x14964AC8 200801141104 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -BEGIN PGP SIGNATURE- iD8DBQFHi4gvvJuQZxSWSsgRA66dAKCGCPBPDfTFDoizE0WDwXBzDK/W3ACg8dwZ 99OvuSU9PPmG6XDPPK2iQzA= =Xseg -END PGP SIGNATURE- ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Postgresql Materialized views
Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes: Traditionally materialized views exist, so that you do not need to code differently. Your queries still run on the detail table, but are silently answered by a suitable MV. The MV might have count + other aggregated columns grouped by some columns, and thus be able e.g. shortcircuit a select count(*) from atab. The MV should be MVCC aware (have different values for different snapshots) and not substantially reduce possible concurrency of updates to the base table. Note that you just raised the minimum bar for implementation of the feature by a couple orders of magnitude. We cannot automatically substitute an MV into queries unless this is guaranteed not to change the results. No lazy updates, MVCC transparency required, etc. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Postgresql Materialized views
Roberts, Jon wrote: 2) In the end, this is an open source *community*; no amount of formal feature requesting will have any material impact on what actually gets implemented, because there isn't any central control. Wow. Being new to Open Source, this amazes me. Well it depends on what you mean by formal feature requesting. I will agree with Tom that we are a bunch of cats, you can not herd us. However feature requests do get done and do have material impact. However it usually takes more work than it should to actually get the feature accepted and or committed. someone pays them to do something specific. Certainly, some contributors pay attention to what's being requested, but I see no reason to think that increasing the level of formality will help them. What happens when a person adds a feature or changes the architecture of the database that is perceived by some as incorrect or going in the wrong direction? They can't add a feature. They can submit a feature for inclusion but if we don't like it we don't take it. It is the hope that people who are trying to submit follow this: http://www.postgresql.org/docs/faqs.FAQ_DEV.html Sincerely, Joshua D. Drake Jon ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Pl/Java broken since Postgresql 8.3-rc1
Juergen, Is a pljava.dll for version 8.2.6 out? It's very important for me, need it for my office. Try e-mailing pgsql-jdbc mailing list and asking there. --Josh ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Postgresql Materialized views
Traditionally materialized views exist, so that you do not need to code differently. Your queries still run on the detail table, but are silently answered by a suitable MV. The MV might have count + other aggregated columns grouped by some columns, and thus be able e.g. shortcircuit a select count(*) from atab. The MV should be MVCC aware (have different values for different snapshots) and not substantially reduce possible concurrency of updates to the base table. Note that you just raised the minimum bar for implementation of the feature by a couple orders of magnitude. We cannot automatically substitute an MV into queries unless this is guaranteed not to change the results. No lazy updates, MVCC transparency required, etc. Yes, unfortunately. But don't you also think that this is what makes it a worthwhile feature ? I mean, we do have the doityourself triggered summary table approach, which is not overly difficult to set up. It needs some thought and possibly design by the user to solve the most obvious concurrency issues, but it is doable. Imho MV could be separated in 2 parts: 1: materialized and MVCC aware views (only used explicitly) 2: add the smarts to rewrite sql Part 1 is already useful by itself since it provides a generic and easy solution to concurrency for the user. (probably nice and mindboggling, how to best implement that, though :-) The lazy update and non MVCC approach imho sounds too much like your you can make it arbitrarily fast if it does not need to be correct :-) Andreas ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Postgresql Materialized views
Mark Mielke wrote: Mark Mielke wrote: Counts, because as we all know, PostgreSQL count(*) is slow, and in any case, my count(*) is not on the whole table, but on a subset. Doing this in a general way seems complex to me as it would need to be able to evaluate whether a given INSERT or UPDATE or one of the dependent tables would impact the WHERE clause for the materialized view, and it still wouldn't know which rows to add/update/remove without detailed analysis, so it would either be throwing out the entire materialized view and recreating it on INSERT or UPDATE (or deferring until the next query?) in which case it may be very slow, or it may be very complex. Bah. I forgot to add: The feature I've been wondering about (and not necessarily looking for somebody else to do, although I don't think I know the code well enough to do it at this point): Web applications often make the same queries over and over. While memcache can be used to cache results, the memcache interface is different from the web application interfere requiring complex code, and as well, one loses the transaction guarantees as the memcache results are not guaranteed to be up-to-date with the database. Regarding up-to-dateness note that there is a pgfoundry project that helps there. http://pgfoundry.org/projects/pgmemcache/ The other advantages of doing the caching outside the database is that (a) the memory for the cached results don't have to sit in the database machine, and (b) you can cache post-processed (rendered into HTML or gifs) fragments rather than raw data. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Index trouble with 8.3b4
Tom Lane wrote: I wrote: I think it's okay for CREATE INDEX CONCURRENTLY to use bulk-read access strategy (that is, seqscan using a limited number of buffers), but it has to be able to force the scan to start at page zero. I've committed a patch to do that. Please test CVS HEAD and see if you still see problems. With some limited testing it seems both cases are indeed fixed. I was unable to reproduce either with current CVS HEAD. Though I'll run some further tests tomorrow to back that up. Thanks for your time and prompt responses. -- Best regards, Hannes Dorbath ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] scan.l: check_escape_warning()
Michael Meskes [EMAIL PROTECTED] writes: On Fri, Jan 11, 2008 at 10:41:17AM -0500, Tom Lane wrote: Perhaps there's some discrepancy between the ecpg and backend lexers as to where these are called? You're right. There is no way to (un)select standard conforming strings which makes up for the difference. If that's how it is going to be, would you remove or #ifdef NOT_USED the escape_string_warning function? The 'function defined but not used' warning that it's producing now is kind of annoying ... regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] timestamp refactor effort
On Jan 13, 2008 9:21 AM, Tom Lane [EMAIL PROTECTED] wrote: Warren Turkal [EMAIL PROTECTED] writes: I have a question. Are the low level representations of Timestamp and TimestampTZ the same? They're the same but the interpretations are different, which is why I think it's useful to have two typedefs as a way of documenting what any given value is intended to be. The argument for having a third typedef would be exactly the same: to help document what a value is intended to be. Makes sense. wt ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Re: [COMMITTERS] pgsql: Most recent Postgres version is 8.2.6, per report from Robert
Alvaro Herrera wrote: Bruce Momjian wrote: Log Message: --- Most recent Postgres version is 8.2.6, per report from Robert Treat. Can't we make this automatically somehow? Easiest way would be to have it refer to www.postgresql.org, where the frontpage already states this... //Magnus ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Re: [COMMITTERS] pgsql: Most recent Postgres version is 8.2.6, per report from Robert
Bruce Momjian wrote: Log Message: --- Most recent Postgres version is 8.2.6, per report from Robert Treat. Can't we make this automatically somehow? Are we going to move to having the FAQ in XML Docbook FAQ format? That would rock and it would be easier to keep translations. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Index trouble with 8.3b4
On Sun, 2008-01-13 at 18:52 -0500, Tom Lane wrote: The scan is done using the regular heapscan code, which in 8.3 has been modified to enable synchronized scanning, which means it might start from the middle of the table and wrap around. If that happens, the merge join will get totally confused because it is expecting the Thank you and Hannes Dorbath for tracking this down. I wonder whether there are any other places that are silently assuming that heapscans start from page zero ... I considered that question when implementing sync scans, but I could not think of any specific areas of the code that would likely be affected. Regards, Jeff Davis ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] to_char incompatibility
You'll have to explain to Oracle and their customers that Oracle's security model is not a great idea then. I'd love to, and in fact *do* whenever I'm given the chance. In fact, Oracle's security model is pretty bad; the reason why Oracle advertises Unbreakable so hard is that they have a terrible record of security exploits, making them nearly as bad as MySQL. Heck, these days you're better off using MSSQL than Oracle to protect your data. LOL! I'm not going to trade jabs with you on which product has more exploits because that is just stupid. I'm stating that the *model* for Oracle security is very similar to the non-default behavior of PostgreSQL of using security definer. I prefer this model. I think it is a great idea and I mention Oracle because it is highly reputable database company that uses this model. For instance, if I want to allow a user to insert data, I most likely want them to ONLY do it through my method. That means creating a function with security definer set and granting the user execute on the function. I don't want the user to select my sequence or inserting data directly to the table. Also, there is no need to argue this because we can have it both ways. Security definer is an option and I recommend to always use it over the default. If you don't want to use it, don't. Jon ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Index trouble with 8.3b4
Jeff Davis [EMAIL PROTECTED] writes: On Sun, 2008-01-13 at 18:52 -0500, Tom Lane wrote: I wonder whether there are any other places that are silently assuming that heapscans start from page zero ... I considered that question when implementing sync scans, but I could not think of any specific areas of the code that would likely be affected. I went through all of the heap_beginscan calls in the code last night. pgstattuple was broken but AFAICS none of the other callers care about the visitation order. I wonder though about third-party add-ons :-( regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] to_char incompatibility
Jon, You'll have to explain to Oracle and their customers that Oracle's security model is not a great idea then. I'd love to, and in fact *do* whenever I'm given the chance. In fact, Oracle's security model is pretty bad; the reason why Oracle advertises Unbreakable so hard is that they have a terrible record of security exploits, making them nearly as bad as MySQL. Heck, these days you're better off using MSSQL than Oracle to protect your data. -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] SSL over Unix-domain sockets
Peter Eisentraut [EMAIL PROTECTED] writes: It has been reported that the data transmission overhead is much less than the connection establishing overhead, which is measured here. But this is certainly not an encouraging measurement, if we want to put this close to the default path of use. I did some more experiments to confirm Peter's results. My test case for measuring connection overhead is pgbench -c 1 -t 1000 -S -n -C bench (ie, single client, SELECT-only transaction, connecting again for each transaction). This is marginally more realistic than Peter's test since the client executes a SQL command per connection. I get $ PGSSLMODE=prefer time pgbench -c 1 -t 1000 -S -n -C bench transaction type: SELECT only scaling factor: 10 number of clients: 1 number of transactions per client: 1000 number of transactions actually processed: 1000/1000 tps = 33.078772 (including connections establishing) tps = 33.078772 (excluding connections establishing) 10.45user 0.68system 0:30.26elapsed 36%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+48465minor)pagefaults 0swaps $ PGSSLMODE=disable time pgbench -c 1 -t 1000 -S -n -C bench transaction type: SELECT only scaling factor: 10 number of clients: 1 number of transactions per client: 1000 number of transactions actually processed: 1000/1000 tps = 156.237184 (including connections establishing) tps = 156.237208 (excluding connections establishing) 0.20user 0.18system 0:06.41elapsed 6%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+2500minor)pagefaults 0swaps $ PGSSLMODE=prefer time pgbench -c 1 -t 1000 -S -n -C -h localhost bench transaction type: SELECT only scaling factor: 10 number of clients: 1 number of transactions per client: 1000 number of transactions actually processed: 1000/1000 tps = 32.320773 (including connections establishing) tps = 32.320774 (excluding connections establishing) 10.54user 1.01system 0:30.97elapsed 37%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+49807minor)pagefaults 0swaps $ PGSSLMODE=disable time pgbench -c 1 -t 1000 -S -n -C -h localhost bench transaction type: SELECT only scaling factor: 10 number of clients: 1 number of transactions per client: 1000 number of transactions actually processed: 1000/1000 tps = 144.859620 (including connections establishing) tps = 144.859641 (excluding connections establishing) 0.32user 0.62system 0:06.91elapsed 13%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+4512minor)pagefaults 0swaps I also did some tests to measure the encryption overhead for bulk data, in the form of pg_dumping a medium-size table (which is in fact just the data from the regression test's tenk1 table, repeated 128 times): [EMAIL PROTECTED] ~]$ PGSSLMODE=prefer time pg_dump -t foo regression | wc 2.71user 0.36system 0:25.09elapsed 12%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+1093minor)pagefaults 0swaps 1280054 20480136 85863449 [EMAIL PROTECTED] ~]$ PGSSLMODE=disable time pg_dump -t foo regression | wc 0.64user 0.30system 0:09.63elapsed 9%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+793minor)pagefaults 0swaps 1280054 20480136 85863449 [EMAIL PROTECTED] ~]$ PGSSLMODE=prefer time pg_dump -t foo -h localhost regression | wc 3.06user 0.45system 0:25.82elapsed 13%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+1105minor)pagefaults 0swaps 1280054 20480136 85863449 [EMAIL PROTECTED] ~]$ PGSSLMODE=disable time pg_dump -t foo -h localhost regression | wc 0.66user 0.42system 0:09.91elapsed 10%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+806minor)pagefaults 0swaps 1280054 20480136 85863449 Note that these times are for SSL enabled, but without any root.crt files, so no actual authentication is happening --- I'm not sure how much more connection-time overhead that would incur. Presumably the bulk transfer rate wouldn't change though. All these numbers are stable to within a percent or three over repeated trials. Conclusions: * SSL, even without real authentication, is *way* too expensive to enable by default. * The extra cost of going across a local TCP connection is measurable, but it's insignificant compared to the cost of turning on SSL. (This is on a Fedora 8 kernel BTW ... that result might vary on other platforms.) So you could make a pretty good case that the answer for DBAs who want to prevent spoofing is to disable socket connections in pg_hba.conf and force even local connections to come through hostssl connections. If we do want to apply Peter's patch, I think it needs to be extended so that the default behavior on sockets is the same as before, ie, no SSL. This could be done by giving libpq an additional connection parameter, say socketsslmode, having the same alternatives as sslmode but defaulting to allow instead of prefer. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill
Re: [HACKERS] Postgresql Materialized views
Added to TODO: * Add the ability to automatically create materialized views Right now materialized views require the user to create triggers on the main table to keep the summary table current. SQL syntax should be able to manager the triggers and summary table automatically. A more sophisticated implementation would automatically retrieve from the summary table when the main table is referenced, if possible. I also thought this was on the TODO list. --- Zeugswetter Andreas ADI SD wrote: Traditionally materialized views exist, so that you do not need to code differently. Your queries still run on the detail table, but are silently answered by a suitable MV. The MV might have count + other aggregated columns grouped by some columns, and thus be able e.g. shortcircuit a select count(*) from atab. The MV should be MVCC aware (have different values for different snapshots) and not substantially reduce possible concurrency of updates to the base table. Note that you just raised the minimum bar for implementation of the feature by a couple orders of magnitude. We cannot automatically substitute an MV into queries unless this is guaranteed not to change the results. No lazy updates, MVCC transparency required, etc. Yes, unfortunately. But don't you also think that this is what makes it a worthwhile feature ? I mean, we do have the doityourself triggered summary table approach, which is not overly difficult to set up. It needs some thought and possibly design by the user to solve the most obvious concurrency issues, but it is doable. Imho MV could be separated in 2 parts: 1: materialized and MVCC aware views (only used explicitly) 2: add the smarts to rewrite sql Part 1 is already useful by itself since it provides a generic and easy solution to concurrency for the user. (probably nice and mindboggling, how to best implement that, though :-) The lazy update and non MVCC approach imho sounds too much like your you can make it arbitrarily fast if it does not need to be correct :-) Andreas ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] to_char incompatibility
Jon, Also, there is no need to argue this because we can have it both ways. Security definer is an option and I recommend to always use it over the default. If you don't want to use it, don't. Security Definer has ramifications in PostgreSQL which I don't think it does in Oracle. Particularly, see: http://www.postgresql.org/docs/techdocs.77 -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] to_char incompatibility
Josh Berkus [EMAIL PROTECTED] writes: Security Definer has ramifications in PostgreSQL which I don't think it does in Oracle. Particularly, see: http://www.postgresql.org/docs/techdocs.77 BTW, that article needs to be updated to show the (much easier) way to do it as of 8.3. I concur that make all your functions security definer by default is unlikely to make a system more secure overall --- it'll just move the problems around. Especially if it's applied blindly by someone who stopped reading at that point. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Bug: Unreferenced temp tables disables vacuum to update xid
Alvaro Herrera [EMAIL PROTECTED] writes: Hmm ... that would be strange. Off-the-cuff idea: we introduced code to advance relfrozenxid in CLUSTER, TRUNCATE and table-rewriting forms of ALTER TABLE. Perhaps the problem is that we're neglecting to update it for the toast table there. AFAIR I analyzed the cases and they were all handled, but perhaps I forgot something. I found a smoking gun ... regression=# create table foo (f1 serial primary key, f2 text); NOTICE: CREATE TABLE will create implicit sequence foo_f1_seq for serial column foo.f1 NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index foo_pkey for table foo CREATE TABLE regression=# insert into foo values(1,repeat('xyzzy',10)); INSERT 0 1 regression=# insert into foo values(2,repeat('xqzzy',10)); INSERT 0 1 regression=# select relname, relkind, relfrozenxid from pg_class order by oid desc limit 6; relname| relkind | relfrozenxid ---+-+-- foo_pkey | i |0 pg_toast_707220_index | i |0 pg_toast_707220 | t | 119421 foo | r | 119421 foo_f1_seq| S |0 xmlview5 | v |0 (6 rows) regression=# cluster foo_pkey on foo; CLUSTER regression=# select relname, relkind, relfrozenxid from pg_class order by oid desc limit 6; relname| relkind | relfrozenxid ---+-+-- pg_toast_707231_index | i |0 pg_toast_707231 | t | 119424 foo_pkey | i |0 foo | r | 4195086720 foo_f1_seq| S |0 xmlview5 | v |0 (6 rows) So something is out of whack in CLUSTER. However it only seems to be broken in HEAD, so I'm not sure this helps to explain the original report. (Speculation: this is related to the rewrite to make CLUSTER MVCC-safe?) regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Declarative partitioning grammar
On Sat, Jan 12, 2008 at 04:01:19PM +0530, NikhilS wrote: Hi, We did look at allowing general functions for partitioning and this was one concern. The other is that we want to enforce that a row only gets inserted into a single partition, so we wanted a declarative syntax where it was relatively easy to check that range and list specifications don't overlap. Detection of mutually exclusive ranges might not turn out to be so easy afterall. I think there is some code in the constraint_exclusion area which might help out in this. In some prototyping code it didn't seem too difficult but if we've made a mistake we might have to look at the CE code. Thanks, Gavin ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Bug: Unreferenced temp tables disables vacuum to update xid
Tom Lane wrote: regression=# cluster foo_pkey on foo; CLUSTER regression=# select relname, relkind, relfrozenxid from pg_class order by oid desc limit 6; relname| relkind | relfrozenxid ---+-+-- pg_toast_707231_index | i |0 pg_toast_707231 | t | 119424 foo_pkey | i |0 foo | r | 4195086720 foo_f1_seq| S |0 xmlview5 | v |0 (6 rows) So something is out of whack in CLUSTER. However it only seems to be broken in HEAD, so I'm not sure this helps to explain the original report. (Speculation: this is related to the rewrite to make CLUSTER MVCC-safe?) Right ... see copy_heap_data --- it sets FreezeXid as relfrozenxid. If we were to scan each tuple as it is inserted, we could store a higher relfrozenxid, but I doubt we want to do that. Perhaps what we could do is take the relfrozenxid from the old relation and copy it over, if it's later than FreezeXid? -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Bug: Unreferenced temp tables disables vacuum to update xid
Alvaro Herrera [EMAIL PROTECTED] writes: Perhaps what we could do is take the relfrozenxid from the old relation and copy it over, if it's later than FreezeXid? It certainly doesn't seem to make any sense to allow the rel's relfrozenxid to go backwards. Indeed this coding lets it end up less than the DB's datfrozenxid, which is certainly inappropriate. What might be the best idea is to advance FreezeXid to the old relfrozenxid between the vacuum_set_xid_limits and begin_heap_rewrite calls. Then we'd be quite certain we are not lying: anything older than that did indeed get frozen. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Postgresql Materialized views
Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes: Note that you just raised the minimum bar for implementation of the feature by a couple orders of magnitude. Yes, unfortunately. But don't you also think that this is what makes it a worthwhile feature ? Well, my point is that taking automatic rewriting as a required feature has at least two negative impacts: * it rules out any form of lazy update, even though for many applications an out-of-date summary view would be acceptable for some purposes; * requiring MVCC consistency will probably hugely reduce the variety of views that we can figure out how to materialize, and cost performance even for the ones we can do at all. It's not zero-cost, even if you consider implementation effort and complexity as free (which I don't). regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Index trouble with 8.3b4
Tom Lane [EMAIL PROTECTED] writes: Jeff Davis [EMAIL PROTECTED] writes: On Sun, 2008-01-13 at 18:52 -0500, Tom Lane wrote: I wonder whether there are any other places that are silently assuming that heapscans start from page zero ... I considered that question when implementing sync scans, but I could not think of any specific areas of the code that would likely be affected. I went through all of the heap_beginscan calls in the code last night. pgstattuple was broken but AFAICS none of the other callers care about the visitation order. I wonder though about third-party add-ons :-( Perhaps we ought to have made heap_beginscan guarantee an ordered scan and made synch scans be explicitly requested. That would have touched a lot of lines but been more conservative. I'm not sure it's worth going back on it now though. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support! ---(end of broadcast)--- TIP 6: explain analyze is your friend
[HACKERS] 8.3RC1 on windows missing descriptive Event handle names
On Windows XP, using Process Explorer with the lower pane showing Handles, not all postgres.exe processes are including an Event type with a description of what the process is doing. At the moment, I have ten postgres processes shown as being at the bottom of the tree, three of which do not have such descriptions. The processes that do show an Event type handle show these names: \BaseNamedObjects\pgident: postgres: wal writer process \BaseNamedObjects\pgident: postgres: autovacuum launcher process plus one per connection like: \BaseNamedObjects\pgident: postgres: postgres mydatabase 127.0.0.1(1954) idle Comparing the list of processes to those of 8.2, I suspect that the processes missing this detail are the log writer (one of the processes does have a File Handle on the log file), and the stats collector. I have autovacuum enabled for 8.3rc1, and disabled for 8.2. PostgreSQL is started as a service. Stephen Denne Disclaimer: At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any attachments is confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately, destroy it and do not copy, disclose or use it in any way. __ This email has been scanned by the DMZGlobal Business Quality Electronic Messaging Suite. Please see http://www.dmzglobal.com/services/bqem.htm for details. __ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Declarative partitioning grammar
On Jan 14, 2008, at 1:49 AM, Markus Schiltknecht wrote: I don't think the separation into list, hash and range partitioning is adequate. What is the system supposed to do, if you try to insert a row which doesn't fit any of the values in your list or doesn't fit any of the ranges you defined? Hi Markus, If you don't define a default partition to handle outliers, the insert should fail with an error. I prefer a partitioning grammar which doesn't interfere with constraints. We all know how to define constraints. Please don't introduce a new, ambiguous way. A partitioning definition should be able to tell the target partition for *every* row which satisfies the constraints (the real ones, not ambiguous ones). IMO, a single DDL command should only touch a single split point, i.e. split a table into two partitions, move the split point or remove the split point (joining the partitions again). Those are the only basic commands you need to be able to handle partitioning. I can certainly appreciate the simplicity of this approach. It lets us use a generic check constraint to perform partitioning, so it is more general than partitioning using hash, list, and range. However, it achieves this generality at the expense of usability for typical customer cases. For example, let's look at the case of a table of 1 year of sales data, where we want to create 12 partitions -- one for each month. With the generic approach, you start with a single table, and start by splitting it into two six-month partitions: ALTER TABLE sales SPLIT where sales_date date '2007-06-01' INTO ( PARTITION first_half PARTITION second_half ); We could implement this approach using check constraints and table inheritance: the partition second_half is a child table where sales_date date '2007-06-01', and the partition first_half has the complementary constraint NOT(sales_date date '2007-06-01'). Next, you split each partition: ALTER TABLE sales SPLIT PARTITION first_half where sales_date date '2007-03-01' INTO ( PARTITION first_quarter PARTITION second_quarter ); So now the child table for first_half itself has two children. As you continue this process you construct a binary tree of table inheritance using 12 ALTER statements. In the long grammar you can create and partition the table in one statement: CREATE TABLE sales ... PARTITION BY sales_date ( start (date '2007-01-01') end (date '2008-01-01') every (interval '1 month') ); Sorry, but for my taste, the proposed grammar is too long per command, not flexible enough and instead ambiguous for split points as well as for constraints. To me it looks like repeating the mistakes of others. Thanks for your feedback. Partitioning the table using series of splits is a clever solution for situations where the partitioning operation cannot be described using simple equality (like list,hash) or ordered comparison (range). But for many common business cases, the long grammar is easier to specify. kind regards, Jeff ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] to_char incompatibility
Tom Lane [EMAIL PROTECTED] writes: Josh Berkus [EMAIL PROTECTED] writes: Security Definer has ramifications in PostgreSQL which I don't think it does in Oracle. Particularly, see: http://www.postgresql.org/docs/techdocs.77 BTW, that article needs to be updated to show the (much easier) way to do it as of 8.3. I concur that make all your functions security definer by default is unlikely to make a system more secure overall --- it'll just move the problems around. Especially if it's applied blindly by someone who stopped reading at that point. I think the reason Oracle DBAs are accustomed to using security definer for everything is that it has some further effects aside from selecting the privileges to use. Remember that in Oracle the current role also controls what we call the search_path. So selecting security definer is effectively selecting lexical scoping over dynamic scoping. It nails down all the references in the package or function at compile time. That does have more robust security implications. It's also supposed to perform better. And experience shows lexical scoping makes it easier to build large complex systems without getting bogged down in lots of action-at-a-distance. In Postgres the performance consequence is reversed. We have a performance *hit* for security definer. And the pl interpreters don't behave any differently as far as when they do their lookups. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support! ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Index trouble with 8.3b4
Gregory Stark [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: I went through all of the heap_beginscan calls in the code last night. pgstattuple was broken but AFAICS none of the other callers care about the visitation order. I wonder though about third-party add-ons :-( Perhaps we ought to have made heap_beginscan guarantee an ordered scan and made synch scans be explicitly requested. That would have touched a lot of lines but been more conservative. I'm not sure it's worth going back on it now though. Hmm. I'm too lazy to go back and look right now, but IIRC most of the hardwired heapscans are on system catalogs that are unlikely to be large enough to trigger a syncscan anyway. If we were to flip the semantics, and then change only the callers that clearly need to enable syncscans, it would not be all that large a patch I think. On the other hand it's far from clear that there's really a problem. The model for doing a block-at-a-time scan is VACUUM, and that doesn't use the heapscan infrastructure but just fetches blocks by number. It would only be people who'd copied pgstattuple's methodology that would be likely to be at risk. I'm not sure we should protect those hypothetical people at the cost of not doing syncscans for other (also hypothetical) third-party add-ons that do heapscans on large tables and wouldn't have a problem with wraparound. It's a tossup from here. Anybody have a strong opinion one way or the other? regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] could not open relation: Invalid argument
On Jan 14, 2008 11:03 AM, Alvaro Herrera [EMAIL PROTECTED] wrote: Roberts, Jon wrote: Version: PostgreSQL 8.2.5 on i686-pc-mingw32 I recently started getting this error message randomly, could not open relation 42904/42906/42985: Invalid argument. I also got it for a couple of other files. All three files are related to tables that have just a single row each. This has been reported before, and we've patched the system so that the error reported is better. In HEAD we also patched it so that the system would automatically retry a number of times if the problem is ERROR_SHARING_VIOLATION (or something like that). That patch was not backported to 8.2 due to lack of testing -- the original reported did not come back to try the patched version. If you want to try, the patch is here: https://projects.commandprompt.com/public/pgsql/changeset/29853 Let us know how it goes. If your problem is easily reproducible and the patch makes it go away, we would consider back-patching the fix. I have builded a patched version of 8.2, if you want i can send it to you... if your mail server doesn't accept large files (the zip file is 5.6 mb) then maybe someone can share some space for temporarily store it... -- regards, Jaime Casanova Programming today is a race between software engineers striving to build bigger and better idiot-proof programs and the universe trying to produce bigger and better idiots. So far, the universe is winning. Richard Cook ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] SSL over Unix-domain sockets
Tom Lane wrote: Conclusions: * SSL, even without real authentication, is *way* too expensive to enable by default. * The extra cost of going across a local TCP connection is measurable, but it's insignificant compared to the cost of turning on SSL. (This is on a Fedora 8 kernel BTW ... that result might vary on other platforms.) So you could make a pretty good case that the answer for DBAs who want to prevent spoofing is to disable socket connections in pg_hba.conf and force even local connections to come through hostssl connections. Yea, I figured using protected directories for the socket was the zero-cost solution, and if you have to do SSL, might as well just use TCP too. (If you moved the socket file to a protected directory I think you could use external_pid_file='/tmp/.s.PGSQL.5432' to prevent a spoof socket file in /tmp. Should we document that idea?) If we do want to apply Peter's patch, I think it needs to be extended so that the default behavior on sockets is the same as before, ie, no SSL. This could be done by giving libpq an additional connection parameter, say socketsslmode, having the same alternatives as sslmode but defaulting to allow instead of prefer. That seems like it is going to be added confusion; just using the protected socket diretory or TCP SSL seems less error-prone. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Declarative partitioning grammar
Jeff Cohen [EMAIL PROTECTED] writes: In the proposed solution, hash and list partitions work for all types that support an equality operator, and range partitions work for all types that support fully-ordered comparison. Surely a hashing method would require a *hashable* equality operator, ie a hash opclass; likewise range partitions would demand a matching btree opclass. You could do list partitions with an equality operator of either kind. Essentially all of the system's current knowledge about the properties of specific operators is encoded as operator classes for one of these two built-in index types. If you want to make assumptions about the behavior of an operator, it really needs to be founded on these types of opclasses --- or else you're buying into inventing a comparable amount of infrastructure for some other organizational concept. I think Peter's point was that you might want to think about generalizing your concepts so that other kinds of operator classes could someday serve as the foundations for other kinds of partitioning rules. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Declarative partitioning grammar
On Jan 12, 2008, at 9:34 AM, Peter Eisentraut wrote: Well, with an extensible system such as PostgreSQL you will need to have a partitioning scheme that can deal with extensions. Perhaps people want to partition by XML, GIS, text-search data, or whatever someone might come up with in the future. Hi Peter, In the proposed solution, hash and list partitions work for all types that support an equality operator, and range partitions work for all types that support fully-ordered comparison. kind regards, Jeff ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] SSL over Unix-domain sockets
Bruce Momjian [EMAIL PROTECTED] writes: Yea, I figured using protected directories for the socket was the zero-cost solution, and if you have to do SSL, might as well just use TCP too. (If you moved the socket file to a protected directory I think you could use external_pid_file='/tmp/.s.PGSQL.5432' to prevent a spoof socket file in /tmp. Should we document that idea?) Umm ... two questions about that: * will the postmaster fail if there's a socket where it tries to write the external_pid_file? (If it does fail, does that really fix anything? The spoofer already owns the socket.) * if there's a plain file where a client expects to find the socket, what happens? (Probably nothing very good, since the first thing the client will do is write on it.) If we do want to apply Peter's patch, I think it needs to be extended so that the default behavior on sockets is the same as before, ie, no SSL. That seems like it is going to be added confusion; just using the protected socket diretory or TCP SSL seems less error-prone. Yeah, all of this is about confusion and error-proneness. I still think that the real problem is that we don't have full control over client-side code, and therefore can't just write off the problem of a client deciding to connect to /tmp/.s.PGSQL.5432 even if the local DBA thinks the socket would be safer elsewhere. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] Re: [COMMITTERS] pgsql: Most recent Postgres version is 8.2.6, per report from Robert
Alvaro Herrera wrote: Bruce Momjian wrote: Log Message: --- Most recent Postgres version is 8.2.6, per report from Robert Treat. Can't we make this automatically somehow? Hmmm, well, that file is pulled to our web site on every change so I am not sure how we could do this automatically; updating this file is in the release check list but somehow it got skipped. Are we going to move to having the FAQ in XML Docbook FAQ format? That would rock and it would be easier to keep translations. That would make sense. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Pl/Java broken since Postgresql 8.3-rc1
On Mon, 14 Jan 2008, Josh Berkus wrote: Juergen, Is a pljava.dll for version 8.2.6 out? It's very important for me, need it for my office. Try e-mailing pgsql-jdbc mailing list and asking there. The correct list is actually [EMAIL PROTECTED] The JDBC driver and the server side language have little in common. I've put up a new pljava build against 8.2.6 for windows here: http://www.ejurka.com/pgsql/pljava/826/ In this case you need the new pljava.jar as well as the .dll because it's had some updates since the previous release as well. Kris Jurka ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] SSL over Unix-domain sockets
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Yea, I figured using protected directories for the socket was the zero-cost solution, and if you have to do SSL, might as well just use TCP too. (If you moved the socket file to a protected directory I think you could use external_pid_file='/tmp/.s.PGSQL.5432' to prevent a spoof socket file in /tmp. Should we document that idea?) Umm ... two questions about that: * will the postmaster fail if there's a socket where it tries to write the external_pid_file? (If it does fail, does that really fix anything? The spoofer already owns the socket.) I figured it would prevent someone from spoofing while the server was up, which is a _new_ problem when moving the socket. :-( My feeling on the moving of sockets risk is that you are probably going to have all your clients using the new socket directory before anyone tries to put something in /tmp, especially if you have the lock file in /tmp as outlined above. To spoof in such a situation you would need to do the attack while the server is down _and_ against a client that doesn't know the right socket location. * if there's a plain file where a client expects to find the socket, what happens? (Probably nothing very good, since the first thing the client will do is write on it.) We would have to test that. If we do want to apply Peter's patch, I think it needs to be extended so that the default behavior on sockets is the same as before, ie, no SSL. That seems like it is going to be added confusion; just using the protected socket diretory or TCP SSL seems less error-prone. Yeah, all of this is about confusion and error-proneness. I still think that the real problem is that we don't have full control over client-side code, and therefore can't just write off the problem of a client deciding to connect to /tmp/.s.PGSQL.5432 even if the local DBA thinks the socket would be safer elsewhere. Right. I think the lock file in /tmp does help somewhat. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Array behavior oddities
Bruce Momjian [EMAIL PROTECTED] writes: If any dimension is written as a slice, i.e. contains a colon, then all dimensions are treated as slices. Is the the behavior of assuming an entry with no colon is a slice what we want, or are we just stuck with it? Why do you find that surprising? It's either a slice or it isn't, there's no halfway point. Are you proposing to throw an error if only some of the subscripts have colons? What would be the point? Is there a reason out-of-bounds array accesses behave differently for slices and non-slices? History (although sloppy original implementation would do too). I'm not sure if we should try to clean it up --- there've not been that many complaints, but I'm sure we'd get complaints from people whose code stopped working, if we change it. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Array behavior oddities
In reading our array documentation I came across two unusual behaviors. The issue relates to slices: We can also access arbitrary rectangular slices of an array, or subarrays. An array slice is denoted by writing literalreplaceablelower-bound/replaceable:replaceableupper-bound/replaceable/literal for one or more array dimensions. For example, this query retrieves the first item on Bill's schedule for the first two days of the week: SELECT schedule[1:2][1:1] FROM sal_emp WHERE name = 'Bill'; First issue: If any dimension is written as a slice, i.e. contains a colon, then all dimensions are treated as slices. Any dimension that has only a single number (no colon) is treated as being from literal1/ to the number specified. For example, literal[2]/ is treated as literal[1:2]/, as in this example: Is the the behavior of assuming an entry with no colon is a slice what we want, or are we just stuck with it? Also: An array subscript expression will return null if either the array itself or any of the subscript expressions are null. Also, null is returned if a subscript is outside the array bounds (this case does not raise an error). For example, if literalschedule/ currently has the dimensions literal[1:3][1:2]/ then referencing literalschedule[3][3]/ yields NULL. Similarly, an array reference with the wrong number of subscripts yields a null rather than an error. An array slice expression likewise yields null if the array itself or any of the subscript expressions are null. However, in other corner cases such as selecting an array slice that is completely outside the current array bounds, a slice expression yields an empty (zero-dimensional) array instead of null. If the requested slice partially overlaps the array bounds, then it is silently reduced to just the overlapping region. Is there a reason out-of-bounds array accesses behave differently for slices and non-slices? Having slices and non-slices behave differently is very confusing to me. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] SSL over Unix-domain sockets
Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: Yeah, all of this is about confusion and error-proneness. I still think that the real problem is that we don't have full control over client-side code, and therefore can't just write off the problem of a client deciding to connect to /tmp/.s.PGSQL.5432 even if the local DBA thinks the socket would be safer elsewhere. Right. I think the lock file in /tmp does help somewhat. Even if it happens to work (on some platforms) it seems like a kluge. It strikes me that given the postmaster's infrastructure for listening on multiple sockets, it would be a pretty small matter of programming to teach it to listen on socket files in multiple directories not only one. If we had that, the postmaster could listen in both /tmp and your-more-secure-directory-of-choice. Surely an actual socket file would be a more useful blocker in /tmp than a dead-weight PID file. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly