Re: [HACKERS] IN vs EXISTS equivalence
On Thu, Aug 14, 2008 at 06:50:09PM +0100, Simon Riggs wrote: On Fri, 2008-08-08 at 16:23 -0400, Tom Lane wrote: NOT IN is a lot trickier, condition: you must also assume that the comparison operator involved never yields NULL for non-null inputs. That might be okay for btree comparison functions but it's not a very comfy assumption in general; we certainly haven't got any explicit knowledge that any functions are guaranteed to act that way. So this case might be worth doing later ... Just found this comment, after reading what you said on other thread about NOT IN. NOT IN is a serious performance issue for most people. We simply can't say to people you were told not to. If we can fix it easily for the majority of cases, we should. We can't let the it won't work in certain cases reason prevent various A suggestion: what about adding an attribute to functions to declare that they never return null? declare foo(int, int) returns int immutable not null as ... -dg -- David Gould [EMAIL PROTECTED] 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Window functions patch v04 for the September commit fest
Simon Riggs wrote: On Tue, 2008-09-02 at 15:51 +0300, Heikki Linnakangas wrote: The needs of access to the rows are so different that it seems best to me to delegate the buffering to the window function. That seems sensible in some ways, not others. In the API I proposed later in that mail, the buffering is actually done by the executor node, not by the window function. Instead, the window function can request abitrary rows of the frame from the executor, and can signal that some rows are no longer required, allowing them to be discarded. Some of the window functions, like lead and lag merely specify window size and shape for other functions to act upon. I don't understand that. LEAD/LAG return a value. They don't affect the size or shape of the window in any way. It doesn't affect other functions. For those types of request I don't see any need for custom functions, whereas for the comparison/calculation functions there might be a need. We don't need to implement all the things the SQL Standard calls window functions with a 1:1 mapping to Postgres functions. Sure, we have special hacks for things like MIN/MAX already. But using PostgreSQL functions does seem like the simplest solution to me, as the backend code can get quite complex if we have to add special handling for different window functions. LEAD/LAG fall quite nicely into the framework I proposed, but if something comes along that doesn't, then we'll have to extend the framework or add a special case. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Window functions patch v04 for the September commit fest
Hitoshi Harada wrote: 2008/9/2 Heikki Linnakangas [EMAIL PROTECTED]: Hitoshi Harada wrote: 2008/9/2 Heikki Linnakangas [EMAIL PROTECTED]: In my understanding, the Window Frame is defined by clauses such like ROWS BETWEEN ... , RANGE BETWEEN ... or so, contrast to Window Partition defined by PARTITION BY clause. A frame slides within a partition or there's only one frame if those clauses are not specified. The current patch has not implemented this yet. I'll update the docs. Yes, that's how I read it as well. Another way to think of it is that there's always a window frame, but if no ROWS BETWEEN or similar clause is given, the window frame spans the whole partition (or from the 1st row of the partition, up to the current row, if ORDER BY is given). I don't like to call the second type ranking aggregates because it may refer to only ranking functions though there are more types of function like ntile(), lead() and lag(). But window functions doesn't seem appropriate either since it's ambiguous with the general name of window expressions. Yep, confusing :-(. The SQL standard says that a window function is one of: a rank function, a distribution function, a row number function, a window aggregate function, the ntile function, the lead function, the lag function, the first-value function, the last-value function, the nth-value function. So, window aggregate functions are a special class of window functions, and there's no term to refer to all the rest of the window functions excluding window aggregate functions. Your docSQL spec Window expression Window function Window function Any window function other than a window aggregate function Window aggregateWindow aggregate function I tried to coin the term ranking aggregate for the SQL2008 term Any window function other than a window aggregate function, but you're right that that's still confusing, because the SQL2008 term rank function includes only RANK() and DENSE_RANK(). The spec calls them group aggregate functions, when they're used with GROUP BY, rather than as a window function. I think we could use that term. Agree. So from now on, we use window functions for all kinds of functions including window aggregates. Window expression is discarded. Window functions also means the mechanism to support these functions to process and this project. Your proposal is smarter than the current implementation. But it doesn't seem complete in some way. From logical point of view, the window functions should be allowed to access whole of rows in the frame the current row belongs to (c.f. inverse distribution functions). By the whole of rows, do you mean a) the chosen value or expression of all rows, or b) all columns of the input rows? a). I mean all input rows in a window frame. But later I found inverse distribution function is not one of window functions. That is actually one of aggregate functions. Forget about it. Different window functions have different needs. RANK() for example does need to see all columns, to compare them, but it only needs access to the previous and current row. CUME_DIST on the other hand needs access to all columns of all rows, and LAG needs access to a specific column of a fixed number of rows. And row_number needs nothing. The needs of access to the rows are so different that it seems best to me to delegate the buffering to the window function. Delegating optimization to them depending on functions' needs is a good idea. So executor can concentrate on the window function process flow. Let's unify it in executor and let trivial optimizations get into individual functions. Actually, looking closer to the ranking functions, they don't really need access to all columns. They just need to be able to compare them, according to the window ordering, so we should probably only provide access to the arguments of the aggregate, evaluated for any row in the frame, and a comparator function, that can determine if any given rows in the frame are , = or . That is kind of problem. If your task is only to define that window node executor simply stores window frame rows and pass them to window functions as they need, the rank functions' needs don't come. As you point out, rank functions need ordering key columns and its comparators. So you have to imagine what comes next? What will be wanted other than ordering key columns, if we think about universe window functions much more than SQL spec says? It might be a good idea to google around what window functions other DBMSs support, and see if this scheme could support all of them. I couldn't find any that it couldn't, but I didn't look very hard. Let's look at the trivial, generic, and slow implementation first, and then figure out what tricks we can do to make it faster. I gather that that generic algorithm, following how the SQL spec defines the window frame, looks like this: So as to satisfy all of window functions' needs, Window object
Re: [HACKERS] Window functions patch v04 for the September commit fest
On Wed, 2008-09-03 at 09:51 +0300, Heikki Linnakangas wrote: Simon Riggs wrote: On Tue, 2008-09-02 at 15:51 +0300, Heikki Linnakangas wrote: The needs of access to the rows are so different that it seems best to me to delegate the buffering to the window function. That seems sensible in some ways, not others. In the API I proposed later in that mail, the buffering is actually done by the executor node, not by the window function. Instead, the window function can request abitrary rows of the frame from the executor, and can signal that some rows are no longer required, allowing them to be discarded. I'm happy with that. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
Marko Kreen wrote: On 9/2/08, Peter Eisentraut [EMAIL PROTECTED] wrote: Marko Kreen wrote: In the meantime, here is simple patch for case-insensivity. You might be able to talk me into accepting various unambiguous, common alternative spellings of various units. But for instance allowing MB and Mb to mean the same thing is insane. How would the docs for that look like? And anyway, what is wrong with Mb for megabytes? I doesn't seem completely unreasonable to me that we'd want to express something in megabits/second in the future. For example, instead of vacuum_cost_delay, it would be cool to specify a bandwidth allowance. Megabits/second is a completely reasonable unit for that. Or a limit on network bandwidth. FWIW, I don't feel very strongly either way. I'm more than happy with the status quo. The hint in the error message very clearly spells out what the valid values are, so it's immediately clear what you need to fix if you get that wrong. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] IN vs EXISTS equivalence
On Wed, Sep 3, 2008 at 9:17 AM, daveg [EMAIL PROTECTED] wrote: On Thu, Aug 14, 2008 at 06:50:09PM +0100, Simon Riggs wrote: On Fri, 2008-08-08 at 16:23 -0400, Tom Lane wrote: NOT IN is a lot trickier, condition: you must also assume that the comparison operator involved never yields NULL for non-null inputs. That might be okay for btree comparison functions but it's not a very comfy assumption in general; we certainly haven't got any explicit knowledge that any functions are guaranteed to act that way. So this case might be worth doing later ... Just found this comment, after reading what you said on other thread about NOT IN. NOT IN is a serious performance issue for most people. We simply can't say to people you were told not to. If we can fix it easily for the majority of cases, we should. We can't let the it won't work in certain cases reason prevent various A suggestion: what about adding an attribute to functions to declare that they never return null? And if function still returns null then error will be raised? Then you will end up adding NOT NULL also to IN and OUT parameters. IIRC it was possible in Oracle to declare local variables NOT NULL. declare foo(int, int) returns int immutable not null as ... -dg -- David Gould [EMAIL PROTECTED] 510 536 1443510 282 0869 If simplicity worked, the world would be overrun with insects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
On Wed, Sep 3, 2008 at 11:20 AM, Heikki Linnakangas [EMAIL PROTECTED] wrote: Marko Kreen wrote: On 9/2/08, Peter Eisentraut [EMAIL PROTECTED] wrote: Marko Kreen wrote: In the meantime, here is simple patch for case-insensivity. You might be able to talk me into accepting various unambiguous, common alternative spellings of various units. But for instance allowing MB and Mb to mean the same thing is insane. How would the docs for that look like? And anyway, what is wrong with Mb for megabytes? From infamous wikipedia: A *megabit* is a unit of informationhttp://en.wikipedia.org/wiki/Informationor computer storage, abbreviated *Mbit* (or *Mb*). To me playing with case of acronyms and even depending on it seems more insane. It would make much more sense to have case insensitive set of acronyms and (thanks Tom for pointing out) some sanity checks when configuration is loaded to notify user when wrong ones are used for some context. I doesn't seem completely unreasonable to me that we'd want to express something in megabits/second in the future. For example, instead of vacuum_cost_delay, it would be cool to specify a bandwidth allowance. Megabits/second is a completely reasonable unit for that. Or a limit on network bandwidth. There are less confusing (better) acronyms kbit/s and mbit/s available for that. FWIW, I don't feel very strongly either way. I'm more than happy with the status quo. The hint in the error message very clearly spells out what the valid values are, so it's immediately clear what you need to fix if you get that wrong. Is the database down during that time? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
On 9/3/08, Heikki Linnakangas [EMAIL PROTECTED] wrote: Marko Kreen wrote: On 9/2/08, Peter Eisentraut [EMAIL PROTECTED] wrote: Marko Kreen wrote: In the meantime, here is simple patch for case-insensivity. You might be able to talk me into accepting various unambiguous, common alternative spellings of various units. But for instance allowing MB and Mb to mean the same thing is insane. How would the docs for that look like? And anyway, what is wrong with Mb for megabytes? I doesn't seem completely unreasonable to me that we'd want to express something in megabits/second in the future. For example, instead of vacuum_cost_delay, it would be cool to specify a bandwidth allowance. Megabits/second is a completely reasonable unit for that. Or a limit on network bandwidth. While it sounds theoretically useful, as a UI it's combines worst from both worlds. You now can confuse user with both case-sensitivity and unit-mixup. Even if we keep the current case-sensitivity, we should set policy that units that differ only with case will never be accepted. And the best way to set the policy in stone would be to make units case-insensitive. I like Asko's proposal of 'kbit/s' and 'mbit/s' for those - clear and no chance of confusion. FWIW, I don't feel very strongly either way. I'm more than happy with the status quo. The hint in the error message very clearly spells out what the valid values are, so it's immediately clear what you need to fix if you get that wrong. Well, the problem is that while database may come up with wrong values, it may be unusable for actual loads, until admin browses the logs, re-edits the config file and finally restarts in again. -- marko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
On Tue, 2008-09-02 at 16:50 +0300, Peter Eisentraut wrote: Gregory Stark wrote: Peter Eisentraut [EMAIL PROTECTED] writes: Marko Kreen wrote: In the meantime, here is simple patch for case-insensivity. You might be able to talk me into accepting various unambiguous, common alternative spellings of various units. But for instance allowing MB and Mb to mean the same thing is insane. Because you think some user will be trying to specify their shared_buffers in bits? If somebody wants then somebody can still claim that the units are wrong, as SI specifies 1k = 1000 and 1M=100, etc The binary should use IEC multipliers for kibi-, mebi-, tebi-, gibi-, etc. http://searchstorage.techtarget.com/sDefinition/0,,sid5_gci825099,00.html So we could beat correctness into DBA-s by forcing them to write KiB MiB and so on for things that they have always written kb and mb (unless they are communication engineers, in which case they may be used to writing these as kB and MB) My concern is that this information does not stay in the configuration files. It will invariably leak out into whitepapers, presentations, product documentation, and before long there will be confusion about why you can't stuff N Mb over an N Mb connection. You can't even put N Mbits of data in one sec over N Mbits/sec connection as there is some overhead from all protocol levels. Clueless people will also be confused about why they can't fit 1Mbyte of CVS data into a 1Mbyte database file or why you can't store 1024 MB of data on a 1GB disk. I am not making this up. Is there anything in conf files that can be reasonably specified in bits ? Mb does not add any typing ease (as KB might) or readability (as sec might), and there is no respectable source that will claim it is an acceptable alias for MB. Are you really afraid that someone would want to use mb to mean millibits ? As SQL is generally case insensitive, it is quite surprising to most people that GUC units are not. - Hannu -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Question regarding the database page layout.
Tom Lane [EMAIL PROTECTED] writes: Gregory Stark [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: BTW, there are at least two copies of that code to be changed. I'd suggest grepping for assignments to t_hoff to be sure there aren't more. I did send in a patch a while ago to get rid of the old HeapFormTuple() and friends. I remember discussing that idea, but I don't recall seeing an actual patch? It would have to be quite large because of the number of places using the old way. I'd also be a bit worried about breaking add-on modules to little purpose ... Huh, apparently I did it but didn't actually send in the patch: http://archives.postgresql.org/pgsql-hackers/2007-10/msg00851.php I looked around and I don't seem to have it lying around any more. (Kind of mystifying since I have tons of old source trees and patches, just not that one.) I could do the janitorial work again if we're interested. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Window functions patch v04 for the September commit fest
2008/9/3 Heikki Linnakangas [EMAIL PROTECTED]: Hitoshi Harada wrote: I'd suggest: 1. Implement Window node, with the capability to invoke an aggregate function, using the above API. Implement required parser/planner changes. Implement a few simple ranking aggregates using the API. 2. Implement glue to call normal aggregates through the new API 3. Implement the optimization to drop rows that are no longer needed (signal_cutoff can be a no-op until this phase) 4. Implement window framing (the frame can always be all rows in the partition, or all rows until the current row, until this phase) 5. Expose the new API to user-defined aggregates. It can be an internal API only used by built-in functions until this phase I believe you already have phase 1 in your patch, except for the API changes. I am willing to challenge to implement the API above, after maintain the current patch adding docs and tests. Since the API includes changes much more like Aggregate syntax than current patch, I'm not sure if I can finish it by next commit fest, which is said to be feature freeze. For safety, remain the current patch to review excluding API and executor then if I fail to finish use it for next release. Git helps it by cutting a branch, does it? How do you think? We do allow changes to the user manual after the feature freeze, so I'd suggest concentrating on the code and tests first. Code comments and internal docs are important, though, for easy review. I'm sure we won't get all the way to phase 5 for 8.4, but if we can even get 1-3, plus some of the most important window functions, this this will be a great release! OK, so first tests and internal docs/comments, then comes trying to catch API , finally docs. BTW, I think it is better to put together the discussion points we have done as general roadmap to complete window functions. It is not about the features for the next release but is the complete tasks. Where to go? Wiki, or my design docs? Regards, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Function call order dependency
Is there a knowable order in which functions are called within a query in PostgreSQL? For example I'll use the Oracle contains function, though this is not exactly what I'm doing, it just illustrates the issue clearly. select *, score(1) from mytable where contains(mytable.title, 'Winding Road', 1) order by score(1); The contains function does a match against mytable.title for the term 'Winding Road' and both returns and saves an integer score which may be retrieved later using the score(...) function. The integer used as a parameter in score(...) and contains(...) is an index to reference which score you need as more than one contains(...) call may be used in single query. This sets up an interesting issue, how can one ensure that contains() is called prior to any score() function on each row? Is this possible? Is there a specific order on which you can count? Would it be something like: where clause first, left to right, followed by select terms, left to right, and lastly the order by clause? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq object hooks (libpq events)
Alvaro Herrera wrote: There's one thing that seems a bit baroque, which is the PG_COPYRES_USE_ATTRS stuff in PQcopyResult. I think that flag introduces different enough behavior that it should be a routine of its own, say PQcopyResultAttrs. That way you would leave out the two extra params in PQcopyResult. Oh -- one last thing. I am not really sure about the flags to PQcopyResult. Should there really be flags to _remove_ behavior, instead of flags that add? i.e. instead of having 0 copy everything, and have to pass flags for things not to copy, wouldn't it be cleaner to have 0 copy only base stuff, and require flags to copy extra things? a name is attached to every event proc, so that it can be reported in error messages Can someone confirm that an event 'name' should be re-introduced, as suggested by Alvaro? Can I get a happy or sad face in regards to below? New options which add instead of remove. #define PG_COPYRES_ATTRS 0x01 #define PG_COPYRES_TUPLES 0x02 /* Implies PG_COPYRES_ATTRS */ #define PG_COPYRES_EVENTS 0x04 #define PG_COPYRES_NOTICEHOOKS0x08 // tuples implies attrs, you need the attrs to copy the tuples. if(options PG_COPYRES_TUPLES) options |= PG_COPYRES_ATTRS; // auto set option In regards to copying the attrs, the PQcopyResult still needs the ability to copy the source result's attrs. Although, it doesn't need the ability to provide custom attrs (which I removed). New prototype for copyresult: PGresult * PQcopyResult(const PGresult *src, int options); I then added a PQsetResultAttrs. copyresultattrs didn't seem like the correct name because you are no longer copying attrs from a source result. You are providing the attrs to 'set'. int PQsetResultAttrs(PGresult *res, int numAttributes, PGresAttDesc *attDescs); If the result provided to setattrs already contains attributes, I have the function failing (can't overwrite existing attrs). I think this is good behavior When PQcopyResult needs to copy the source result's attrs, it calls PQsetResultAttrs. /* Wants attrs */ if((options PG_COPYRES_ATTRS) !PQsetResultAttrs(dest, src-numAttributes, src-attDescs)) { PQclear(dest); return NULL; } So, there is some nice code reuse which indicates to me the code is segmented well (copyres setattrs). -- Andrew Chernow eSilo, LLC every bit counts http://www.esilo.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
Hannu Krosing wrote: On Tue, 2008-09-02 at 16:50 +0300, Peter Eisentraut wrote: Gregory Stark wrote: Peter Eisentraut [EMAIL PROTECTED] writes: Are you really afraid that someone would want to use mb to mean millibits ? As SQL is generally case insensitive, it is quite surprising to most people that GUC units are not. We have had this discussion before, I even submitted a patch to make them case insensitive. In retrospect I was wrong to submit that patch. SQL may be case insensitive but units are not. MB != Mb != mb , I don't think we should encourage in any way for users to do the wrong thing. Sincerely, Joshua D. Drake -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Function call order dependency
[EMAIL PROTECTED] writes: For example I'll use the Oracle contains function, though this is not exactly what I'm doing, it just illustrates the issue clearly. select *, score(1) from mytable where contains(mytable.title, 'Winding Road', 1) order by score(1); The contains function does a match against mytable.title for the term 'Winding Road' and both returns and saves an integer score which may be retrieved later using the score(...) function. This is just a bad, bad idea. Side-effects in a WHERE-clause function are guaranteed to cause headaches. When (not if) it breaks, you get to keep both pieces. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Function call order dependency
[EMAIL PROTECTED] wrote: Would it be something like: where clause first, left to right, followed by select terms, left to right, and lastly the order by clause? I don't know what ANSI says, nor do I know what PostgreSQL exactly does at the moment, but, the only thing you can reasonably count on is that the WHERE clause is evaluated before the SELECT-result-rows and the ORDER BY clause (in any SQL database). You cannot depend on any left to right order, and you cannot depend on ORDER BY being evaluated after the SELECT-result-rows. -- Sincerely, Stephen R. van den Berg. Clarions sounding *No one* expects the Spanish inquisition! -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
Greg Stark wrote: I don't think worrying about the message we send to users is reasonable. We can take responsibilty for the messages we output but punishing our users to teach them a lesson is being actively user-hostile There is no arguing that MB != Mb; nor is there anything user-hostile behind the idea of doing it the right way. Sincerely, Joshua D. Drake -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Question regarding the database page layout.
Gregory Stark [EMAIL PROTECTED] writes: I could do the janitorial work again if we're interested. I think it'd make more sense to do it incrementally rather than in one big-bang patch ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Window functions doc patch
Dear Hitoshi, I noticed the folowing typo in the doc sgml: 'rownumber()', instead of 'row_number()' ( 2x ) hth Erik Rijkers *** doc/src/sgml/func.sgml.orig 2008-09-03 17:20:28.130229027 +0200 --- doc/src/sgml/func.sgml 2008-09-03 17:21:01.331907454 +0200 *** *** 10092,10100 row entry indexterm ! primaryrownumber()/primary /indexterm !functionrownumber() OVER (ORDER BY replaceable class=parameterexpression/replaceable)/function /entry entry typebigint/type --- 10092,10100 row entry indexterm ! primaryrow_number()/primary /indexterm !functionrow_number() OVER (ORDER BY replaceable class=parameterexpression/replaceable)/function /entry entry typebigint/type -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
On Wed, 2008-09-03 at 07:52 -0700, Joshua D. Drake wrote: Hannu Krosing wrote: On Tue, 2008-09-02 at 16:50 +0300, Peter Eisentraut wrote: Gregory Stark wrote: Peter Eisentraut [EMAIL PROTECTED] writes: Are you really afraid that someone would want to use mb to mean millibits ? As SQL is generally case insensitive, it is quite surprising to most people that GUC units are not. We have had this discussion before, I even submitted a patch to make them case insensitive. In retrospect I was wrong to submit that patch. SQL may be case insensitive but units are not. MB != Mb != mb , For most people they are equal, and all mean MEGABYTE(S) though http://en.wikipedia.org/wiki/MB has lots of other possible meanings for each. I don't think we should encourage in any way for users to do the wrong thing. Can you see any scenario where accepting case insensitive units does more damage than just ignoring the conf line with incorrect casing ? Or do you mean we should discourage people from editing postgresql.conf manually and have them use some tool which prevents them entering kb ? Hannu -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
On Wed, 2008-09-03 at 08:20 -0700, Joshua D. Drake wrote: Greg Stark wrote: I don't think worrying about the message we send to users is reasonable. We can take responsibilty for the messages we output but punishing our users to teach them a lesson is being actively user-hostile There is no arguing that MB != Mb; The whole point of this discussion is, that mostly people expect MB == Mb = mb == mB, especially if they see weird constructs like kB used (k for Kilo, or actually Kibi). nor is there anything user-hostile behind the idea of doing it the right way. I was not trying to expose some sinister plan, just pointing out that users seldom expect that kind of surprise. -- Hannu -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Window functions doc patch
2008/9/4 Erikj [EMAIL PROTECTED]: Dear Hitoshi, I noticed the folowing typo in the doc sgml: 'rownumber()', instead of 'row_number()' ( 2x ) hth Erik Rijkers *** doc/src/sgml/func.sgml.orig 2008-09-03 17:20:28.130229027 +0200 --- doc/src/sgml/func.sgml 2008-09-03 17:21:01.331907454 +0200 *** *** 10092,10100 row entry indexterm ! primaryrownumber()/primary /indexterm !functionrownumber() OVER (ORDER BY replaceable class=parameterexpression/replaceable)/function /entry entry typebigint/type --- 10092,10100 row entry indexterm ! primaryrow_number()/primary /indexterm !functionrow_number() OVER (ORDER BY replaceable class=parameterexpression/replaceable)/function /entry entry typebigint/type Ah, thanks. It's my mistake. Both of SQL spec and my implementation in pg_proc say it is row_number, not rownumber. Regards, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
On Wed, 03 Sep 2008 18:32:16 +0300 Hannu Krosing [EMAIL PROTECTED] wrote: We have had this discussion before, I even submitted a patch to make them case insensitive. In retrospect I was wrong to submit that patch. SQL may be case insensitive but units are not. MB != Mb != mb , For most people they are equal, and all mean MEGABYTE(S) though http://en.wikipedia.org/wiki/MB has lots of other possible meanings for each. O.k. there is an interesting point being made here, For most people. Which people exactly? Are we talking about the general populous? If so I would argue that most people don't have a clue what MB, Mb, or mb is except to say I think that means some kind of speed or capacity. The above is not our target. If our definition of most people is, those who are reasonably technically adept and will be deploying PostgreSQL in production on some level. If someone doesn't know the difference between Mb and MB on a production system, I would not want them anywhere near any instance of a production system. If we are going to make sweeping statements (anyone on this thread) about user-hostile and most people, then we better define what those mean. This whole argument about making something easier (and incorrect) for someone who doesn't exist and has not been defined. I would be hung on this list if I made a similar argument about any other feature. I don't think we should encourage in any way for users to do the wrong thing. Can you see any scenario where accepting case insensitive units does more damage than just ignoring the conf line with incorrect casing ? Generally speaking, no I can't think of any damage that could be done from mixed casing. Especially since we would only accept certain possibilities, e.g; Mb would equal MB. It just strikes me as really bad that a project that prides itself on doing it right is willing to make this type of sacrifice. This isn't about usability. This is about doing it wrong and actively encouraging our users that wrong is ok. It could also misinform the user about what the meaning of the value means. Or do you mean we should discourage people from editing postgresql.conf manually and have them use some tool which prevents them entering kb ? Well that is a whole other argument :P. I would be happy to have that one on another thread. Sincerely, Joshua D. Drake -- The PostgreSQL Company since 1997: http://www.commandprompt.com/ PostgreSQL Community Conference: http://www.postgresqlconference.org/ United States PostgreSQL Association: http://www.postgresql.us/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Function call order dependency
[EMAIL PROTECTED] writes: For example I'll use the Oracle contains function, though this is not exactly what I'm doing, it just illustrates the issue clearly. select *, score(1) from mytable where contains(mytable.title, 'Winding Road', 1) order by score(1); The contains function does a match against mytable.title for the term 'Winding Road' and both returns and saves an integer score which may be retrieved later using the score(...) function. This is just a bad, bad idea. Side-effects in a WHERE-clause function are guaranteed to cause headaches. When (not if) it breaks, you get to keep both pieces. I was kind of afraid of that. So, how could one implement such a function set? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Function call order dependency
I was kind of afraid of that. So, how could one implement such a function set? Write a function (say, score_contains) that returns NULL whenever contains would return false, and the score otherwise. SELECT * FROM ( SELECT *, score_contains(mytable.title, 'Winding Road', 1) AS score FROM mytable ) x WHERE x.score IS NOT NULL ORDER BY x.score ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq object hooks (libpq events)
Andrew Chernow escribió: Alvaro Herrera wrote: (I also removed PQresultAlloc.) Nooo ... removing PQresultAlloc breaks libpqtypes! It also removes some of the use cases provided by PQsetvalue, which allows one to add to a result (in our case from scratch). I don't really see the point -- it's the same as pqResultAlloc, except that you have to pass an extra argument. There's no actual functionality loss. However, the only change of any significance that I introduced was that a name is attached to every event proc, so that it can be reported in error messages, as reporting only %p seems very useless. (I also removed PQresultAlloc.) I don't mind re-introducing the name, but Tom seemed very against this due to conflicts. If 2 different libraries register the same name, debugging would be painful. Hmm, is that really a good argument? I don't see how providing a pointer is a better answer than a name -- it's far less user friendly. Let's start assuming that no duplicate names would be used, and if there are any conflicts in the real world, we can just ask the user to fire up GDB to figure out where each name is pointing to. My guess is that conflicting names will be a rarity, if we ever get to see them. (Would two different libraries call themselves pqtypes, for example?) -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Function call order dependency
[EMAIL PROTECTED] writes: For example I'll use the Oracle contains function, though this is not exactly what I'm doing, it just illustrates the issue clearly. select *, score(1) from mytable where contains(mytable.title, 'Winding Road', 1) order by score(1); The contains function does a match against mytable.title for the term 'Winding Road' and both returns and saves an integer score which may be retrieved later using the score(...) function. This is just a bad, bad idea. Side-effects in a WHERE-clause function are guaranteed to cause headaches. When (not if) it breaks, you get to keep both pieces. Well, I guess I need to alter the question a bit. I need to perform an operation during query time and there are multiple results based on the outcome. For instance: (Lets try this) select myrank(t1.column1, t2.column2, 1) as rank, myscore(t1.column1,t2.column2, 1) as score from t1, t2 where myrank(t1.column1,t2.column2) 10 order by myscore(t1.column1, t2.column2, 1) desc; This is a bit messier, and I wanted to resist this approach as it is ugly. The underlying code will check the values of the first and second parameters and only perform the operation if a previous call did not already act on the current parameters. Now, can I assume that in the above select statement, that each permutation of t1.column1 and t2.column2 will only be evaluated once and that myscore(...) and myrank(...) will all be called before the next permutation is evaluated? So, basically, I don't want to recalculate the values for each and every function call as that would make the system VERY slow. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
On Wed, Sep 03, 2008 at 06:37:29PM +0300, Hannu Krosing wrote: On Wed, 2008-09-03 at 08:20 -0700, Joshua D. Drake wrote: There is no arguing that MB != Mb; The whole point of this discussion is, that mostly people expect MB == Mb = mb == mB, especially if they see weird constructs like kB used (k for Kilo, or actually Kibi). Note that in the networked computer world, MB and Mb are importantly different. The latter is relevant for the speed of your network interface, for instance. People often get this wrong when speaking carelessly, but a mistake of this sort is a serious one, given the orders of magnitude difference. A -- Andrew Sullivan [EMAIL PROTECTED] +1 503 667 4564 x104 http://www.commandprompt.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq object hooks (libpq events)
Alvaro Herrera wrote: Andrew Chernow escribió: Alvaro Herrera wrote: (I also removed PQresultAlloc.) Nooo ... removing PQresultAlloc breaks libpqtypes! It also removes some of the use cases provided by PQsetvalue, which allows one to add to a result (in our case from scratch). I don't really see the point -- it's the same as pqResultAlloc, except that you have to pass an extra argument. There's no actual functionality loss. libpqtypes uses it. libpqtypes doesn't have access to any internals of libpq, including pqResultAlloc. So, I made a public wrapper to the internal version. The point is to provide public access to the result allocator. -- Andrew Chernow eSilo, LLC every bit counts http://www.esilo.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Function call order dependency
[EMAIL PROTECTED] writes: I need to perform an operation during query time and there are multiple results based on the outcome. For instance: (Lets try this) select myrank(t1.column1, t2.column2, 1) as rank, myscore(t1.column1,t2.column2, 1) as score from t1, t2 where myrank(t1.column1,t2.column2) 10 order by myscore(t1.column1, t2.column2, 1) desc; Why not have one function that produces multiple output columns? Now, can I assume that in the above select statement, that each permutation of t1.column1 and t2.column2 will only be evaluated once and that myscore(...) and myrank(...) will all be called before the next permutation is evaluated? You can assume that functions in the SELECT target list are evaluated exactly once per output row (at least as long as no SRFs-in-the-targetlist are involved). I don't think it'd be wise to assume anything about order of evaluation, though it's probably true that it's left-to-right at the moment. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Function call order dependency
I was kind of afraid of that. So, how could one implement such a function set? Write a function (say, score_contains) that returns NULL whenever contains would return false, and the score otherwise. SELECT * FROM ( SELECT *, score_contains(mytable.title, 'Winding Road', 1) AS score FROM mytable ) x WHERE x.score IS NOT NULL ORDER BY x.score That could work, and while it fits my example, my actual need is a bit more complex. For instance, say I have two variables (I actually have a few that I need) select myvar1(1), myvar2(1), myvar3(1) from mytable where myfunction(mytable.column, 'some text to search for', 1) 2; How could I ensure that (1) myfunction is called prior to myvar1(), myvar2(), and myvar3()? I think the answer is that I can't. So, the obvious solution is to pass all the variables to all the functions and have it first come first served. The next issue is something like this: select *, myvar1(t1.col1,t2.col2,1), myvar2(t1.col1.t2.col2,1) from t1,t2 where myfunction(t1.col1,t2.col2,1) 10 order by myvar3(t1.col1,t2.col2,1) desc; Using a first come first served strategy, is there any discontinuity between the function calls for t1.col1 and t2.col2. Will they all be called for a particular combination of t1.col1 and t2.col2, in some unpredictable order before the next row(s) combination is evaluated or will I have to execute the underlying algorithm for each and every call? ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
On Wed, 2008-09-03 at 09:10 -0700, Joshua Drake wrote: If someone doesn't know the difference between Mb and MB on a production system, I would not want them anywhere near any instance of a production system. I for one can make the difference, once I can zen that we are in a domain, where Mbit makes sense. For me it does not make any sense to define shared_buffer in Mbit's. If we are going to make sweeping statements (anyone on this thread) about user-hostile and most people, then we better define what those mean. Not user-hostile but rather hostile to an overworked DBA, who tries to change some .conf param from 1MB to 512KB at 3AM to save an overloaded server and then suddenly the server won't start anymore. This whole argument about making something easier (and incorrect) for someone who doesn't exist and has not been defined. I still don't get in what non-academic way it would be incorrect to spell kilobytes as kb or KB or megabytes as mb ? Do you have any standard source where it specifies one use and forbids the other ? We are not correct according to SI (where k = 1000) nor IEC which specifies Ki for 1024. I would be hung on this list if I made a similar argument about any other feature. Its all about making it easier, not incorrect. It may be perhaps incorrect in some strictly academic sense, maybe some article submission guidelines or such. I don't think we should encourage in any way for users to do the wrong thing. Can you see any scenario where accepting case insensitive units does more damage than just ignoring the conf line with incorrect casing ? Generally speaking, no I can't think of any damage that could be done from mixed casing. Especially since we would only accept certain possibilities, e.g; Mb would equal MB. It just strikes me as really bad that a project that prides itself on doing it right is willing to make this type of sacrifice. This isn't about usability. It is all about usability. Not accepting popular spellings of common units seems to me as unnecessary nit-picking , similar to if we required _exactly_ one space on either side on = in SET statements. And we are already incorrect by allowing both '' and around the value, the latter would usually imply an identifier :P hannu=# set effective_cache_size = '132MB'; SET hannu=# set effective_cache_size = 132MB; SET Maybe we should change this too if we want to champion the right way? This is about doing it wrong and actively encouraging our users that wrong is ok. It could also misinform the user about what the meaning of the value means. You could issue a warning, something like Warning: KB is not a spelling endorsed by PostgeSQL Global Development Group, in future please use kB, a spelling that many of us feel to be the only valid one and others don't think important enough to argue with them. --- Hannu -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Commitfest has started, y'all
Folks, If you hadn't noticed, the CommitFest started this week. I'm currently assigning patches to reviewers, but if there's a patch you especially want to tackle, please put your name down right away. If you can't actually review the patch *this week*, please don't put your name down. Thanks! --Josh -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Function call order dependency
[EMAIL PROTECTED] writes: I need to perform an operation during query time and there are multiple results based on the outcome. For instance: (Lets try this) select myrank(t1.column1, t2.column2, 1) as rank, myscore(t1.column1,t2.column2, 1) as score from t1, t2 where myrank(t1.column1,t2.column2) 10 order by myscore(t1.column1, t2.column2, 1) desc; Why not have one function that produces multiple output columns? I was sort of trying to make this a fairly generic SQL extension who's methodology could be moved to other databases if needed. I guess multiple columns could work. I've got some code in another extension that does that. Now, can I assume that in the above select statement, that each permutation of t1.column1 and t2.column2 will only be evaluated once and that myscore(...) and myrank(...) will all be called before the next permutation is evaluated? You can assume that functions in the SELECT target list are evaluated exactly once per output row (at least as long as no SRFs-in-the-targetlist are involved). I don't think it'd be wise to assume anything about order of evaluation, though it's probably true that it's left-to-right at the moment. But are all the items targeted in close proximity to each other BEFORE moving on to the next row? What about the where clause? would that be called out of order of the select target list? I'm doing a fairly large amount of processing and doing it once is important. / -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
Hannu Krosing escribió: On Wed, 2008-09-03 at 09:10 -0700, Joshua Drake wrote: If we are going to make sweeping statements (anyone on this thread) about user-hostile and most people, then we better define what those mean. Not user-hostile but rather hostile to an overworked DBA, who tries to change some .conf param from 1MB to 512KB at 3AM to save an overloaded server and then suddenly the server won't start anymore. I think the energy wasted in this discussion would be better spent in working a the check-the-config-file feature. That would equally solve this problem, as well as many others. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
On Wed, Sep 03, 2008 at 01:48:18PM -0400, Alvaro Herrera wrote: I think the energy wasted in this discussion would be better spent in working a the check-the-config-file feature. That would equally solve this problem, as well as many others. This seems like a good idea to me. A -- Andrew Sullivan [EMAIL PROTECTED] +1 503 667 4564 x104 http://www.commandprompt.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [EMAIL PROTECTED]: Re: [HACKERS] [patch] GUC source file and line number]
On Tue, 2 Sep 2008, Tom Lane wrote: How about having two new columns reset value and boot value? Like it better than default value ... It's being a bit pedantic at the expense of the user, but I don't really care that much here. I exposed the boot_val and described it in the documentation as: Default value if the parameter is not explicitly set That's the value that people care about--if they comment out a setting altogether and restart the server, what will it go back to. New admins and people playing with the config files in a tuning content aren't often using sighup in my experience, they just restart the server after changes. I'm not aware of any specific use case for exposing the reset value other than for completeness sake. Having both exposed with names that don't mean anything to new admins is making the user experience more difficult than it needs to be. That was why I just picked the more important one and named it default; that makes the case for the average user so easy they don't even need to look at the documentation. I note the ongoing GUC units debate as a reminder that a technically correct UI is usually preferred in this project to an easier to use but slightly ambiguous one, and I'm not going to argue for default further if everyone else is happy with a cryptic naming instead. The important thing is that the boot_val gets exposed somehow so tool writers can trivially present it as an option. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Testers: 8.4 snapshot RPMs are available
Hi, If you are a Fedora-9 or RHEL/CentOS 5 user and want to test new features of PostgreSQL 8.4 and help development team, you may use the packages that I have just released, based on today's CVS snapshot. I am planning to push new packages each weekend during commitfest. Please note that these packages are not built using the official tarball. I built it with the help of some PostgreSQL hackers. These are not stable packages. Use them at your own risk. They are *not* production ready. As usual, the RPMs are available at http://yum.pgsqlrpms.org . Please install repository RPMs from here first: http://yum.pgsqlrpms.org/reporpms/repoview/letter_p.group.html Remove existing RPMs after taking your backups, and then install 8.4 packages. Current set is labeled as 8.4devel_03092008-1PGDG . You may want to install -debuginfo RPM, which may help PostgreSQL developers to get some data in case you find a bug. Source RPMs are also available: http://yum.pgsqlrpms.org/srpms/8.4 (and they are installable using yum). Please report any packaging related errors to me. If you find any PostgreSQL 8.4 bugs, please post them to [EMAIL PROTECTED] or fill this form: http://www.postgresql.org/support/submitbug Regards, -- Devrim GÜNDÜZ, RHCE devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr http://www.gunduz.org signature.asc Description: This is a digitally signed message part
Re: [HACKERS] [PATCH] Cleanup of GUC units code
I don't think worrying about the message we send to users is reasonable. We can take responsibilty for the messages we output but punishing our users to teach them a lesson is being actively user- hostile greg On 3 Sep 2008, at 15:52, Joshua D. Drake [EMAIL PROTECTED] wrote: Hannu Krosing wrote: On Tue, 2008-09-02 at 16:50 +0300, Peter Eisentraut wrote: Gregory Stark wrote: Peter Eisentraut [EMAIL PROTECTED] writes: Are you really afraid that someone would want to use mb to mean millibits ? As SQL is generally case insensitive, it is quite surprising to most people that GUC units are not. We have had this discussion before, I even submitted a patch to make them case insensitive. In retrospect I was wrong to submit that patch. SQL may be case insensitive but units are not. MB != Mb != mb , I don't think we should encourage in any way for users to do the wrong thing. Sincerely, Joshua D. Drake -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
Sure if people want to do it the right way more power to them. What you're talking about is punishing people when they don't live up to your standards. greg On 3 Sep 2008, at 16:20, Joshua D. Drake [EMAIL PROTECTED] wrote: Greg Stark wrote: I don't think worrying about the message we send to users is reasonable. We can take responsibilty for the messages we output but punishing our users to teach them a lesson is being actively user-hostile There is no arguing that MB != Mb; nor is there anything user- hostile behind the idea of doing it the right way. Sincerely, Joshua D. Drake -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
On Wed, 3 Sep 2008 19:36:19 +0100 Greg Stark [EMAIL PROTECTED] wrote: Sure if people want to do it the right way more power to them. What you're talking about is punishing people when they don't live up to your standards. I think I will defer to Andrew and Alvaro's opinion on the matter. Sincerely, Joshua D. Drake -- The PostgreSQL Company since 1997: http://www.commandprompt.com/ PostgreSQL Community Conference: http://www.postgresqlconference.org/ United States PostgreSQL Association: http://www.postgresql.us/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [EMAIL PROTECTED]: Re: [HACKERS] [patch] GUC source file and line number]
Greg Smith [EMAIL PROTECTED] writes: On Tue, 2 Sep 2008, Tom Lane wrote: How about having two new columns reset value and boot value? Like it better than default value ... It's being a bit pedantic at the expense of the user, but I don't really care that much here. I exposed the boot_val and described it in the documentation as: Default value if the parameter is not explicitly set If that statement were the truth, the whole truth, and nothing but the truth, and if it didn't ignore the point about explicitly set WHERE?, I'd be fine with it. That was why I just picked the more important one and named it default; More important to whom? You are adopting a very narrow mindset, which seems to be that only DBAs look at this view. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
-BEGIN PGP SIGNED MESSAGE- Hash: RIPEMD160 You do realize that misspelling unit name can cause downtime of several minutes instead of couple seconds? We can easily do restart in couple of seconds but the restart, look logs, launch editor, find value, change, save, restart cycle will take quite a lot more. Why should we increase the chance that any config edit causes problems? Minutes? It's like any other changes made to postgresql.conf: make the change, ctrl-z out of your editor, reload Postgres. If there's an error, foreground to the editor, revert the change, and reload Postgres. Not ideal (the config file checker is what we really want), but certainly quick. Secondly, humans don't have byte-exact memory, instead they generalize and deduce (eg. from nearby parameters). Thus remembering KB, MB, GB or kb, mb, gb is easier than remembering kB, MB, GB. They are all listed at the top of the file, in case there is any confusion. No need to worry about remembering things. - -- Greg Sabino Mullane [EMAIL PROTECTED] PGP Key: 0x14964AC8 200809031501 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -BEGIN PGP SIGNATURE- iEYEAREDAAYFAki+39gACgkQvJuQZxSWSsiTfACgiNPP77YKGKgyBm36ckkKZhGM b9sAn2JmrpbMkJ8dm0Wbz3TYdLo83h/S =PQvy -END PGP SIGNATURE- -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump roles support
=?ISO-8859-1?Q?Benedek_L=E1szl=F3?= [EMAIL PROTECTED] writes: pg_dumpall now just passes the --role option to pg_dump. What do you think, is it enough or it should issue the SET ROLE TO ... command in its own session too? I think it would have to, in the general case. Consider the possibility that someone has restricted access to the system catalogs, for instance. You have missed an important component of Stephen's original proposal, which was the point that something similar is needed on the restore side. This is a little bit tricky since the context at restore time is not necessarily the same as the context at dump time. When using an archive file it's not a problem: the behavior can be driven off a --role switch to pg_restore, and this is independent of what pg_dump did. In a dump to plain text, though, I'm not sure what to do. The simplest design would have pg_dump's --role switch control both what it does in its own connection to the source database, and what it puts into the output script. I'm not sure that's adequate though. Is it worth having two different switches for the two cases? If we think it's a corner case to need different role IDs, we could just leave it like that and tell anyone who needs different behaviors that they have to go through an archive file and pg_restore. Stephen, you were the one who wanted this in the first place, what's your use-cases look like? Some other review nitpicking: The documentation part of the patch is well short of acceptable IMHO, since it gives no hint of what this switch might be good for, and indeed encourages the user to confuse it with the -U switch by injecting a mention of it into the middle of a discussion about -U. It is not normally considered appropriate for individual patches to edit the release notes; and it's DEFINITELY not appropriate to put a mention of a feature addition into the wrong section of the release notes. + {role, required_argument, NULL, 'r' + 0x80}, This is not a good choice of option code IMHO ... what if the value is stored in a signed char on some machines? If you can't find a free letter you like, use a small integer code, as you can find being done elsewhere. BTW, the patch fails to compile on a strict ANSI C compiler, because you are using a C++-ism of declaring a variable mid-block. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] GUC source file and line number]
On Wed, 3 Sep 2008, Tom Lane wrote: Default value if the parameter is not explicitly set If that statement were the truth, the whole truth, and nothing but the truth, and if it didn't ignore the point about explicitly set WHERE?, I'd be fine with it. First question--how about if I changed that description to read: Default value used at server startup if the parameter is not explicitly set? I could then expose reset-val, named like that and with a description that explained the context it applies in. And then we've give people a way to experiment and understand the FAQ of why didn't the value go back to the default when I commented it out of the postgresql.conf and HUP'd the server?. Section question: with those changes, would it then be reasonable to you to keep that column named default instead of giving it a less common name? You are adopting a very narrow mindset, which seems to be that only DBAs look at this view. DBAs are the only group I am always getting questions in this area from. Everybody else seemed happy with the status quo, where the value wasn't exposed at all and you just looked in guc.c to see what it was. About once a month, somebody asks me how can I tell what the default is for *X*? I want to be able to answer this question with look in pg_settings, which is easy enough to remember, and not have to say anything else. That's the source of my mindset here, and I'm sure I'm not alone in fielding that so often. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] GUC source file and line number]
On Wed, 3 Sep 2008 16:04:12 -0400 (EDT) Greg Smith [EMAIL PROTECTED] wrote: Section question: with those changes, would it then be reasonable to you to keep that column named default instead of giving it a less common name? You are adopting a very narrow mindset, which seems to be that only DBAs look at this view. DBAs are the only group I am always getting questions in this area from. Everybody else seemed happy with the status quo, where the value wasn't exposed at all and you just looked in guc.c to see what it was. I guess I would ask, Who else would we be targeting this for?. DBAs seem to be the only logical choice. Joshua D. Drake -- The PostgreSQL Company since 1997: http://www.commandprompt.com/ PostgreSQL Community Conference: http://www.postgresqlconference.org/ United States PostgreSQL Association: http://www.postgresql.us/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
On Wed, 2008-09-03 at 13:48 -0400, Alvaro Herrera wrote: Hannu Krosing escribió: On Wed, 2008-09-03 at 09:10 -0700, Joshua Drake wrote: If we are going to make sweeping statements (anyone on this thread) about user-hostile and most people, then we better define what those mean. Not user-hostile but rather hostile to an overworked DBA, who tries to change some .conf param from 1MB to 512KB at 3AM to save an overloaded server and then suddenly the server won't start anymore. I think the energy wasted in this discussion would be better spent in working a the check-the-config-file feature. What kind of checks do you have in mind. Would this be something that works at restart, does the check and continues with old settings if new file would not load ? That would equally solve this problem, as well as many others. AFAIK the config file is checked now, and if the check fails, the database won't start. - Hannu -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
On Wed, 03 Sep 2008 23:10:24 +0300 Hannu Krosing [EMAIL PROTECTED] wrote: That would equally solve this problem, as well as many others. AFAIK the config file is checked now, and if the check fails, the database won't start. Like apachectl configcheck ... E.g; we have the ability to check if the config file is valid before we restart and have an extended outage. Joshua D. Drake - Hannu -- The PostgreSQL Company since 1997: http://www.commandprompt.com/ PostgreSQL Community Conference: http://www.postgresqlconference.org/ United States PostgreSQL Association: http://www.postgresql.us/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
Hannu Krosing escribió: On Wed, 2008-09-03 at 13:48 -0400, Alvaro Herrera wrote: I think the energy wasted in this discussion would be better spent in working a the check-the-config-file feature. What kind of checks do you have in mind. Would this be something that works at restart, does the check and continues with old settings if new file would not load ? This was discussed in some other thread. No, the idea is that before you do a reload, you run some function or invoke postmaster with certain arguments, and it checks the config file and it says this is OK or there are errors here and here. The check can be run even if the server is already running. That way you can just run it just before a reload or restart. That would equally solve this problem, as well as many others. AFAIK the config file is checked now, and if the check fails, the database won't start. ... which is not ideal. Obviously it doesn't make much sense to run the check if the server is already down, because you'll immediately know that it won't come up just by trying to start it up. (However, maybe it would be better for the init script to run the check anyway, and report the error to stderr where the user running the script can read it directly instead of having to go check the postmaster log which could be sitting somewhere else.) -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] IN vs EXISTS equivalence
Tom Lane [EMAIL PROTECTED] wrote: If you're still interested in testing CVS HEAD's handling of EXISTS, I've about finished what I wanted to do with it. It's been hectic here, but I've managed to let some stuff run in the background using an old test case from here: http://archives.postgresql.org/pgsql-hackers/2007-03/msg01408.php explain analyze SELECT A.adjustmentNo, A.tranNo, A.countyNo, H.date, H.userId, H.time FROM Adjustment A JOIN TranHeader H ON (H.tranId = A.adjustmentNo AND H.countyNo = A.countyNo AND H.tranNo = A.tranNo) WHERE H.tranType = 'A' AND A.date DATE '2006-01-01' AND H.countyNo = 66 AND A.countyNo = 66 AND EXISTS ( SELECT 1 FROM TranDetail D WHERE D.tranNo = H.tranNo AND D.countyNo = H.countyNo AND D.caseNo LIKE '2006TR%' ) ; On development machine using 8.3.3: Nested Loop (cost=0.00..399190.49 rows=1 width=37) (actual time=7184.068..3249391.592 rows=12372 loops=1) Join Filter: ((A.adjustmentNo)::text = (H.tranId)::text) - Seq Scan on Adjustment A (cost=0.00..5218.87 rows=247869 width=17) (actual time=9.804..1695.691 rows=248674 loops=1) Filter: (((date)::date '2006-01-01'::date) AND ((countyNo)::smallint = 66)) - Index Scan using TranHeader_pkey on TranHeader H (cost=0.00..1.57 rows=1 width=37) (actual time=13.056..13.056 rows=0 loops=248674) Index Cond: (((H.tranNo)::integer = (A.tranNo)::integer) AND ((H.countyNo)::smallint = 66)) Filter: (((H.tranType)::text = 'A'::text) AND (subplan)) SubPlan - Index Scan using TranDetail_TranDetCaseNo on TranDetail D (cost=0.00..1.29 rows=1 width=0) (actual time=13.017..13.017 rows=0 loops=248674) Index Cond: (((caseNo)::text = '2006TR'::text) AND ((caseNo)::text '2006TS'::text) AND ((tranNo)::integer = ($0)::integer) AND ((countyNo)::smallint = ($1)::smallint)) Filter: ((caseNo)::text ~~ '2006TR%'::text) Total runtime: 3249404.662 ms On the same machine, using the snapshot from this morning: Nested Loop (cost=1963.24..38483.54 rows=1 width=37) (actual time=372.964..986.994 rows=12372 loops=1) Join Filter: ((H.tranNo)::integer = (A.tranNo)::integer) - Merge Semi Join (cost=1963.24..31012.28 rows=21317 width=37) (actual time=372.926..839.298 rows=12372 loops=1) Merge Cond: ((H.tranNo)::integer = (D.tranNo)::integer) Join Filter: ((D.countyNo)::smallint = (H.countyNo)::smallint) - Index Scan using TranHeader_pkey on TranHeader H (cost=0.00..27848.57 rows=322517 width=37) (actual time=3.722..526.124 rows=311963 loops=1 ) Index Cond: ((countyNo)::smallint = 66) Filter: ((tranType)::text = 'A'::text) - Sort (cost=1963.17..2027.08 rows=25565 width=6) (actual time=171.512..191.688 rows=76597 loops=1) Sort Key: D.tranNo Sort Method: quicksort Memory: 6663kB - Index Scan using TranDetail_TranDetCaseNo on TranDetail D (cost=0.00..91.57 rows=25565 width=6) (actual time=0.031..100.688 rows=7659 7 loops=1) Index Cond: (((caseNo)::text = '2006TR'::text) AND ((caseNo)::text '2006TS'::text) AND ((countyNo)::smallint = 66)) Filter: ((caseNo)::text ~~ '2006TR%'::text) - Index Scan using Adjustment_pkey on Adjustment A (cost=0.00..0.34 rows=1 width=17) (actual time=0.009..0.010 rows=1 loops=12372) Index Cond: (((A.adjustmentNo)::text = (H.tranId)::text) AND ((A.countyNo)::smallint = 66)) Filter: ((A.date)::date '2006-01-01'::date) Total runtime: 991.097 ms The chosen plan looks very reasonable, and performs very well. Nice! After converting the database I originally forgot to run VACUUM ANALYZE. Even planning blind and doing hint-bit rewrites it picked a plan which ran in under 10 seconds. I'll be running other tests as I get the chance. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
On Wed, 2008-09-03 at 11:45 -0700, Joshua Drake wrote: On Wed, 3 Sep 2008 19:36:19 +0100 Greg Stark [EMAIL PROTECTED] wrote: Sure if people want to do it the right way more power to them. What you're talking about is punishing people when they don't live up to your standards. I think I will defer to Andrew and Alvaro's opinion on the matter. So Andrews opinion was that Mb (meaning Mbit) is different from MB (for megabyte) and that if someone thinks that we define shared buffers in megabits can get confused and order wrong kind of network card ? I can understand Alvaros stance more readily - if we have irrational constraints on what can go into conf file, and people wont listen to reason, then build better tools for helping people to compli to these irrational demands. It has the added benefit of helping to catch reall conf file errors. I did not realize earlier that KB vs kb vs kB vs Kb is a religious issue . I mean, there is no known written standard, which says that Mb is megabit, not megabyte or that you can (or can't) write kilo as K, but some people just believe that kB is the Way and allowing people to write kilobytes as KB or kb is evil and should be punished. To me this sounds stupid, but I understand that this is a thing that can't be argued logically and I have better things to do than changing peoples irrational beliefs. Sorry. --- Hannu -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] In case module has wrong magic, report exact problem
Marko Kreen [EMAIL PROTECTED] writes: In case module and server magic blocks do not match report exact parameters that differ. Applied with revisions --- your patch produced a message that wasn't localizable and didn't follow the style guidelines. Also it assumed that all the fields would be plain integer values, which isn't true even today. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
On Thu, 04 Sep 2008 01:26:44 +0300 Hannu Krosing [EMAIL PROTECTED] wrote: So Andrews opinion was that Mb (meaning Mbit) is different from MB (for megabyte) and that if someone thinks that we define shared buffers in megabits can get confused and order wrong kind of network card ? I was actually referring to: http://archives.postgresql.org/pgsql-hackers/2008-09/msg00206.php Sincerely, Joshua D. Drake -- The PostgreSQL Company since 1997: http://www.commandprompt.com/ PostgreSQL Community Conference: http://www.postgresqlconference.org/ United States PostgreSQL Association: http://www.postgresql.us/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] GUC source file and line number]
Joshua Drake [EMAIL PROTECTED] writes: I guess I would ask, Who else would we be targeting this for?. DBAs seem to be the only logical choice. Regular users look at pg_settings too, you know. Maybe *you* only get questions from DBAs... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] GUC source file and line number]
Greg Smith [EMAIL PROTECTED] writes: First question--how about if I changed that description to read: Default value used at server startup if the parameter is not explicitly set? ... not otherwise set would probably be an accurate phrasing. (I'm thinking of corner cases like stuff absorbed from environment variables, which aren't really explicitly set by any normal usage of that term.) I could then expose reset-val, named like that and with a description that explained the context it applies in. And then we've give people a way to experiment and understand the FAQ of why didn't the value go back to the default when I commented it out of the postgresql.conf and HUP'd the server?. You do know that's an ex-FAQ as of 8.3? If we're designing this feature to respond to that, we are wasting a lot of effort. About once a month, somebody asks me how can I tell what the default is for *X*? I wonder how certain you can be of which meaning of default they have in mind. I don't think it means the same thing to everybody that it means to you. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Our CLUSTER implementation is pessimal
Gregory Stark [EMAIL PROTECTED] writes: There are a couple problems with this: a) We need some way to decide *when* to do a sort and when to do an index scan. The planner has all this machinery but we don't really have all the pieces handy to use it in a utility statement. Why not? You don't even need any quals when trying to cost a full-index scan. b) tuplesort no longer has the pieces needed to sort whole tuples including visibility info. And actually even the old pieces that were removed had not quite the right interface and behaviour. We need to preserve t_self for the heap rewrite tools and we need to be able to use _bt_mkscankey_nodata() to generate the scan keys that match the index. So you just broke it irredeemably for non-btree indexes, no? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Conflict resolution in Multimaster replication(Postgres-R)
Hello, My basic question is: in multimaster replication, if each site goes ahead and does the modifications issued by the transaction and then sends the writeset to others in the group, how the ACID properties be maintained? Details: Suppose there are two sites in the group, lets say, A and B and are managing a database D. Two transactions TA and TB started in sites A and B respectively, at nearly same time, wanted to update same row of a table in the database. As, no locking structures and other concurrency handling structures are replicated each will go ahead and do the modifications in their corresponding databases and sends the writeset. Since, both writesets contain update to the same row, will the two transactions be rolled back or anything other than this happens? A more general question is: for Transactional isolation level 4(serializable level), the information such as locking of rows be transmitted across sites? If not, what is the mechanism to address concurrency with serializibility. Thanks, Srinivas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Conflict resolution in Multimaster replication(Postgres-R)
Hi Srinivas, Multi-master replication in Postgres-R is handled using a process called certification that ensures there are no serializability violations. Look at the paper by Kemme and Alonzo entitled Don't be Lazy, Be Consistent... (http://www.cs.mcgill.ca/~kemme/papers/vldb00.html). In the first case you describe one transaction must abort if applying them would break serializability. In the second case you describe, you must transmit read sets as well as write sets. The same sort of algorithm is applied as for writes. Please email me directly if you want more information. Thanks, Robert On 9/3/08 4:02 PM, M2Y [EMAIL PROTECTED] wrote: Hello, My basic question is: in multimaster replication, if each site goes ahead and does the modifications issued by the transaction and then sends the writeset to others in the group, how the ACID properties be maintained? Details: Suppose there are two sites in the group, lets say, A and B and are managing a database D. Two transactions TA and TB started in sites A and B respectively, at nearly same time, wanted to update same row of a table in the database. As, no locking structures and other concurrency handling structures are replicated each will go ahead and do the modifications in their corresponding databases and sends the writeset. Since, both writesets contain update to the same row, will the two transactions be rolled back or anything other than this happens? A more general question is: for Transactional isolation level 4(serializable level), the information such as locking of rows be transmitted across sites? If not, what is the mechanism to address concurrency with serializibility. Thanks, Srinivas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Robert Hodges, CTO, Continuent, Inc. Email: [EMAIL PROTECTED] Mobile: +1-510-501-3728 Skype: hodgesrm
Re: [HACKERS] [PATCH] Cleanup of GUC units code
Hannu Krosing escribió: I mean, there is no known written standard, which says that Mb is megabit, not megabyte or that you can (or can't) write kilo as K, but some people just believe that kB is the Way and allowing people to write kilobytes as KB or kb is evil and should be punished. Yes there is --- it's the SI. http://en.wikipedia.org/wiki/SI#SI_writing_style I don't know about it being evil and punishment, but it's wrong. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extending grant insert on tables to sequences
Stephen Frost [EMAIL PROTECTED] writes: * Jaime Casanova ([EMAIL PROTECTED]) wrote: updating the patch with one that only extends inserts. though, i haven't look at the col level privs patch yet. At least initially I wasn't planning to support column-level privileges for sequences, so I don't think it will affect you much. Do people think it makes sense to try and support that? USAGE certainly wouldn't be column-level in any case --- it'd be a privilege on the sequence as such. That end of it isn't the problem; the problem is that column-level privileges on the table make it hard to decide when to grant rights on the sequence, as I pointed out last time round: http://archives.postgresql.org/pgsql-hackers/2008-07/msg00624.php As your patch appears more ready-for-commit than the column-level privileges patch, I wouldn't worry about what code might have to move around, that'll be for me to deal with in a re-sync with HEAD once your patch is committed. I think that's backwards. The above message raises serious concerns about whether the USAGE-granting patch can be implemented at all in the presence of column-level privileges. I think the right thing is to get column privileges in and then see if it's possible to implement USAGE-granting compatibly. I don't want to commit a patch that is clearly going to be broken when (not if) column privileges arrive. I note also that no response was given to my worries about pg_dump behavior. In short, this patch isn't much more ready to commit than it was in the last fest. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump roles support
* Tom Lane ([EMAIL PROTECTED]) wrote: =?ISO-8859-1?Q?Benedek_L=E1szl=F3?= [EMAIL PROTECTED] writes: pg_dumpall now just passes the --role option to pg_dump. What do you think, is it enough or it should issue the SET ROLE TO ... command in its own session too? I think it would have to, in the general case. Consider the possibility that someone has restricted access to the system catalogs, for instance. I would agree with this. pg_dumpall should do the 'set role' in its session as well. You have missed an important component of Stephen's original proposal, which was the point that something similar is needed on the restore side. This is a little bit tricky since the context at restore time is not necessarily the same as the context at dump time. When using an archive file it's not a problem: the behavior can be driven off a --role switch to pg_restore, and this is independent of what pg_dump did. In a dump to plain text, though, I'm not sure what to do. The simplest design would have pg_dump's --role switch control both what it does in its own connection to the source database, and what it puts into the output script. I'm not sure that's adequate though. This makes sense to me and I feel it's adequate. If necessary, people can post-process their .sql files using sed or something similar. That's made reasonably easy by having a 'set role' in the .sql file. I actively dislike the idea that pg_restore would modify the input stream from a text file, even if it was passed a --role switch. Is it worth having two different switches for the two cases? If we think it's a corner case to need different role IDs, we could just leave it like that and tell anyone who needs different behaviors that they have to go through an archive file and pg_restore. Stephen, you were the one who wanted this in the first place, what's your use-cases look like? My primary use cases are performing a pg_dump when logging in as one user but needing the permissions of another role, followed by loading the data into another system when logging in as one user and needing to set role first to another. In at least 90% of those cases, that role is postgres, and in the other 10% most, if not all, are the same role on both sides. There are a few cases where we might change the restore-as role away from the dumped-as role, but we're happy to use pg_restore to handle that, or take care of changing the role in the .sql file (which is what we tend to use, honestly) using sed or similar. Alot of this is driven from the fact that we don't allow admins to remotely connect directly as postgres (akin to disabling remote root logins in sshd_config via PermitRootLogin, and for the same reasons). They must authenticate and connect as their own user first and then use 'set role postgres;' to gain superuser rights. Not being able to have pg_dump do that set role has been quite frustrating as we use it extensively for transferring data between systems. Some other review nitpicking: I agree with the other comments. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Extending grant insert on tables to sequences
* Tom Lane ([EMAIL PROTECTED]) wrote: Stephen Frost [EMAIL PROTECTED] writes: * Jaime Casanova ([EMAIL PROTECTED]) wrote: updating the patch with one that only extends inserts. though, i haven't look at the col level privs patch yet. At least initially I wasn't planning to support column-level privileges for sequences, so I don't think it will affect you much. Do people think it makes sense to try and support that? USAGE certainly wouldn't be column-level in any case --- it'd be a privilege on the sequence as such. That end of it isn't the problem; the problem is that column-level privileges on the table make it hard to decide when to grant rights on the sequence, as I pointed out last time round: http://archives.postgresql.org/pgsql-hackers/2008-07/msg00624.php Ah, obviously I hadn't read far enough back about this patch. I agree that sequence USAGE should be granted when insert is granted on any column. One suggestion is that as the SQL spec indicates that a table-level revoke implies a revoke on all columns, we could have the revokation of the sequence permissisons done only on table-level revokation of insert and not on any individual column-level insert, even if that was the last column which insert rights were granted on. I have to admit that I'm not a big fan of that though because a given state on the table wouldn't imply a particular state for the sequence- it would depend on how you got there. The way the code is currently laid out for the column-level privileges, it wouldn't be that difficult to go through all of the other columns and check if this was the last insert being revoked, but I don't particularly like that either, and it strikes me as 99% of the time being wasted effort. I guess if we could check for and only go through that effort when there is a sequence in place with implicit grants it might not be too bad. As your patch appears more ready-for-commit than the column-level privileges patch, I wouldn't worry about what code might have to move around, that'll be for me to deal with in a re-sync with HEAD once your patch is committed. I think that's backwards. The above message raises serious concerns about whether the USAGE-granting patch can be implemented at all in the presence of column-level privileges. I think the right thing is to get column privileges in and then see if it's possible to implement USAGE-granting compatibly. I don't want to commit a patch that is clearly going to be broken when (not if) column privileges arrive. Now that I understand the situation better, I agree with you on this. I hadn't realized this patch was about implicit grants on sequnces. Sorry for the noise. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] [patch] GUC source file and line number]
Before I respond to Tom's comments, let me step back a second and add the intro the deadline didn't leave me time for. There are two specific things the bit I added to this GUC patch is aimed at: 1) Somebody has a postgresql.conf from a random source (saw it on the Internet and pasted dubious stuff in/previous person working on the server/etc.) and wants to know the default value they'd get if they just commented a specific line or lines out. 2) A GUC tuning tool author wants to provide a UI for modifying a GUC parameter that shows the default as input to the person deciding what to set a parameter to. The interface I've always wanted to make available would be...wait a minute, I can provide a working example now. Picture this: name | Recommended | Current | Min | Default | Max -+-+-+---+-+- wal_buffers | 1024kB | 64kB| 32 kB | 64 kB | 2048 MB With your cursor lighting up either the Recommended or Current field, depending on whether you're a default approve or deny kind of tool designer. Pretty simple interface to decide what to do, right? I find that much less useful without the default value being included, but right now someone who is writing a tuning tool has to maintain their own database with that information if they want to do that. I will actually do that for earlier versions the minute I know what the 8.4 solution that makes the problem go away looks like. The above is the output from: select name, '1024kB' as Recommended, current_setting(name) as Current, case when unit='8kB' then pg_size_pretty(min_val::int8*8192) else min_val end as Min, case when unit='8kB' then pg_size_pretty(default_val::int8*8192) else default_val end as Default, case when unit='8kB' then pg_size_pretty(max_val::int8) else max_val end as Max from pg_settings where name='wal_buffers'; on my system with the patch installed. That's what I wanted but was unable to get until now. Combine that with being able to figure out what source file and line the setting was actually taken from, and the top 3 obstacles to writing a simple and easy to use read/modify/write tuning tool are all cleared. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] GUC source file and line number]
On Wed, 3 Sep 2008, Tom Lane wrote: Greg Smith [EMAIL PROTECTED] writes: First question--how about if I changed that description to read: Default value used at server startup if the parameter is not explicitly set? ... not otherwise set would probably be an accurate phrasing. (I'm thinking of corner cases like stuff absorbed from environment variables, which aren't really explicitly set by any normal usage of that term.) My opinion is that setting something in an environment variable certainly is explicitly setting it, but it doesn't matter; if the parameter is not otherwise set works just as well as far as I'm concerned. I could then expose reset-val, named like that and with a description that explained the context it applies in. And then we've give people a way to experiment and understand the FAQ of why didn't the value go back to the default... You do know that's an ex-FAQ as of 8.3? If we're designing this feature to respond to that, we are wasting a lot of effort. Sure, but there are a lot of pre-8.3 installs out there. I don't really care about the reset-val at all, so I'm not going to justify whether or not it should be included. I wonder how certain you can be of which meaning of default they have in mind. I don't think it means the same thing to everybody that it means to you. When most people say the default talking about a value in a configuration file, they mean the value the software will assume if that setting isn't there at all. In the postgresql.conf context, that means what they'll get if they start the server with that line missing or commented out (and no environment variables, etc.) which is why I mapped that to the boot_val. While I'm aware there are other uses of default that apply in this context, I think they are extremely rare compared to the common usage. The subtle distictions that require both a boot_val and a reset_val internally are only important to people who are also capable of understanding that default is a mass-consumption oriented label that's a touch fuzzy IMHO. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New FSM patch
On Fri, 2008-08-29 at 10:47 +0300, Heikki Linnakangas wrote: Here's an updated FSM patch. Can I check some aspects of this related to Hot Standby? Some of them sound obvious, but worth double checking. * There will be no need to read FSM by any normal operation of a read-only transaction, so locking correctness considerations can possibly be ignored during recovery. pg_freespacemap exists though: would we need to prevent that from executing during recovery, or will the FSM be fully readable? i.e. does redo take appropriate locks already (I don't see any Cleanup locks being required). * FSM will be continuously maintained during recovery, so FSM will now be correct and immediately available when recovery completes? * There are no cases where a screwed-up FSM will crash either recovery (FATAL+) or halt normal operation (PANIC)? * incomplete action cleanup is fairly cheap and doesn't rely on the FSM being searchable to correct the error? This last is a hard one... Do we have the concept of a invalid/corrupt FSM? What happens if the logic goes wrong and we have a corrupt page? Will that mean we can't complete actions against the heap? Are there really any changes to these files? src/include/storage/bufmgr.h src/include/postmaster/bgwriter.h -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Our CLUSTER implementation is pessimal
On Mon, 2008-09-01 at 00:25 +0100, Gregory Stark wrote: One thing that's been annoying me for a while is that our CLUSTER implementation is really very slow. When I say very slow I mean it's really very very very slow. Does this implementation work towards being able to do CREATE INDEX ... CLUSTER TABLE So that we can do both actions with just one sort of the data? I think there needs to be an option to force this to do either sorts or indexscans. On a large table you may not have the space to perform a full table sort, plus on a multi-column index we may not accurately predict the cost of an indexscan. (What is the change to elog.c about?) -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
On Wednesday 03 September 2008 16:12:29 Joshua Drake wrote: On Wed, 03 Sep 2008 23:10:24 +0300 Hannu Krosing [EMAIL PROTECTED] wrote: That would equally solve this problem, as well as many others. AFAIK the config file is checked now, and if the check fails, the database won't start. Like apachectl configcheck ... E.g; we have the ability to check if the config file is valid before we restart and have an extended outage. To paraphrase, if you can't write a config file correctly before restarting, I do not want you anywhere near any instance of a production system -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Cleanup of GUC units code
On Wed, 2008-09-03 at 20:01 -0400, Alvaro Herrera wrote: Hannu Krosing escribió: I mean, there is no known written standard, which says that Mb is megabit, not megabyte or that you can (or can't) write kilo as K, but some people just believe that kB is the Way and allowing people to write kilobytes as KB or kb is evil and should be punished. Yes there is --- it's the SI. http://en.wikipedia.org/wiki/SI#SI_writing_style I don't know about it being evil and punishment, but it's wrong. SI defines decimal-based prefixes, where k = kilo = 1000, so our current conf use is also wrong. http://en.wikipedia.org/wiki/Kilobyte ... 1024 bytes (2^10): This unit is used when expressing quantities which are based on powers of two, such as memory chip capacities. Most software also expresses storage capacity in units of 1024 bytes. Although the use of kilobyte for this unit is common, this usage has been expressly forbidden by the SI standard and other standards organisations. To indicate a quantity of 1024 bytes, the term kibibyte (KiB) has been recommended instead. And no, I am not proposing switching conf units to KiB, nor am I proposing to use MB to mean 1 000 000. I'm quite happy with us being wrong in 1000 v. 1024 area. --- Hannu -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers