Re: [HACKERS] Integer datetimes
Bruce Momjian wrote:
> Neil Conway wrote:
>> On Sat, 2007-05-05 at 20:52 -0400, Bruce Momjian wrote:
>>> What? We don't pass float as a binary to clients.
>>
>> Sure we do, if the client is sending or receiving data in binary format.
>
> But in those cases, we assume the client and server have the same
> configuration, right?

That is a correct assumption, but I did not find it in the documentation. If you look at the floating-point data type description, there is a mention of non-IEEE-754 platforms, but nothing about this assumption. I think IEEE 754 compliance must be required on all platforms.

Zdenek

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
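The same-configuration assumption above is easy to make concrete. A minimal sketch (Python; the helper names are mine, and PostgreSQL's float-timestamp epoch of 2000-01-01 is assumed) of why both ends of a binary transfer must agree on IEEE 754:

```python
import struct

# Pack a float8 timestamp (seconds since 2000-01-01, the epoch used by
# PostgreSQL's float timestamps) as a big-endian IEEE 754 double, the
# network byte order used for binary-format values on the wire.
def pack_float8_timestamp(seconds_since_pg_epoch: float) -> bytes:
    return struct.pack(">d", seconds_since_pg_epoch)

def unpack_float8_timestamp(wire: bytes) -> float:
    (value,) = struct.unpack(">d", wire)
    return value

# Round-trips only because both ends agree the payload is an IEEE 754
# double; a non-IEEE-754 peer would misinterpret these same 8 bytes.
wire = pack_float8_timestamp(12345.5)
assert len(wire) == 8
assert unpack_float8_timestamp(wire) == 12345.5
```

The value survives exactly here only because 12345.5 is representable in a binary double; the deeper point of the thread is that nothing in the protocol itself negotiates this representation.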
[HACKERS] Stats not updated after rollback -- autovacuum confused.
Hello, I have a system where there are mostly COPYs, which insert data into a table. Occasionally a COPY will fail (and thus, dead rows appear), but as far as I can tell the ROLLBACK is not reflected anywhere in pg_stat_user_tables. And since no rows are counted in n_tup_upd or n_tup_del, autovacuum will not fire for that table. I see two possible solutions:

1) Let rollback increment both n_tup_ins and n_tup_del (or maybe n_tup_upd, at least)? This would be a good safeguard, I guess.

2) ANALYZE is able to see whether a table is accumulating dead rows. It might be a good idea to make ANALYZE able to hint autovacuum that some tables need VACUUM (that they exceed the limits set for autovacuum).

The 2nd point could be a TODO item, perhaps? Something like: When ANALYZE runs, make it note removable dead rows and non-removable dead rows. If removable dead rows exceed some threshold, hint autovacuum at that table.

Regards, Dawid
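For context, autovacuum triggers a VACUUM when the reported dead-tuple count crosses a threshold derived from table size. A hedged sketch of that decision rule (the constants are illustrative defaults, not authoritative) shows why dead rows that never reach the counters keep autovacuum idle:

```python
def needs_vacuum(dead_tuples: int, reltuples: float,
                 base_threshold: int = 500,
                 scale_factor: float = 0.2) -> bool:
    """Model of autovacuum's rule: vacuum when
    dead_tuples > base_threshold + scale_factor * reltuples.
    If a rolled-back COPY's dead rows are never reported to the
    stats counters, this test never fires -- the problem above."""
    return dead_tuples > base_threshold + scale_factor * reltuples

# A failed COPY of a million rows leaves a million dead tuples...
assert needs_vacuum(1_000_000, 1_000_000)
# ...but if stats report zero dead tuples, autovacuum stays idle.
assert not needs_vacuum(0, 1_000_000)
```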
Re: [HACKERS] plperl vs. bytea
> What we've basically got here is a complaint that the default
> textual-representation-based method for transmitting PL function
> parameters and results is awkward and inefficient for bytea. So the
> first question is whether this is really localized to only bytea, and
> if not which other types have got similar issues. (Even if you make
> the case that no other scalar types need help, what of bytea[] and
> composite types containing bytea or bytea[]?)

It could be a solution for the known issues. The current textual representation is more of an ugly hack than anything else.

Regards
Pavel Stehule
Re: [HACKERS] autovacuum does not start in HEAD
Alvaro Herrera <[EMAIL PROTECTED]> wrote:
> ITAGAKI Takahiro wrote:
>> I found that autovacuum launcher does not launch any workers in HEAD.
>>
>> The attached autovacuum-fix.patch could fix the problem. I changed
>> to use 'greater or equal' instead of 'greater' at the decision of
>> the next autovacuum target.
>
> I have committed a patch which might fix this issue in autovacuum.c
> rev 1.44. Please retest.

HEAD (r1.45) is still broken. We skip entries using the test

    adl_next_worker - autovacuum_naptime < current_time <= adl_next_worker,

but the second inequality should be

    adl_next_worker - autovacuum_naptime < current_time < adl_next_worker,

because adl_next_worker can equal current_time.

@@ -1036,8 +1036,8 @@
          * Skip this database if its next_worker value falls between
          * the current time and the current time plus naptime.
          */
-        if (TimestampDifferenceExceeds(current_time,
-                                       dbp->adl_next_worker, 0) &&
+        if (!TimestampDifferenceExceeds(dbp->adl_next_worker,
+                                        current_time, 0) &&
             !TimestampDifferenceExceeds(current_time,
                                         dbp->adl_next_worker,
                                         autovacuum_naptime * 1000))

By the way, why do we need the upper bound to decide the next target? Can we simplify it to "current_time < adl_next_worker"?

@@ -1033,16 +1033,11 @@
     if (dbp->adl_datid == tmp->adw_datid)
     {
         /*
-         * Skip this database if its next_worker value falls between
-         * the current time and the current time plus naptime.
+         * Skip this database if its next_worker value is later than
+         * the current time.
          */
-        if (TimestampDifferenceExceeds(current_time,
-                                       dbp->adl_next_worker, 0) &&
-            !TimestampDifferenceExceeds(current_time,
-                                        dbp->adl_next_worker,
-                                        autovacuum_naptime * 1000))
-            skipit = true;
-
+        skipit = !TimestampDifferenceExceeds(dbp->adl_next_worker,
+                                             current_time, 0);
         break;
     }
     elem = DLGetPred(elem);

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
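The boundary case being argued over can be modeled in a few lines. This is a toy re-implementation (timestamps as integer microseconds; not the backend's actual code) of TimestampDifferenceExceeds and the two skip conditions:

```python
def timestamp_difference_exceeds(start: int, stop: int, msec: int) -> bool:
    # Model of the backend helper: true when stop - start > msec
    # milliseconds -- note the strict inequality.
    return (stop - start) > msec * 1000  # timestamps in microseconds

def skip_old(current: int, next_worker: int, naptime_us: int) -> bool:
    # Pre-patch skip test: current < next_worker <= current + naptime.
    return (timestamp_difference_exceeds(current, next_worker, 0)
            and not timestamp_difference_exceeds(current, next_worker,
                                                 naptime_us // 1000))

def skip_new(current: int, next_worker: int, naptime_us: int) -> bool:
    # Patched (simplified) skip test: current <= next_worker,
    # with no upper bound.
    return not timestamp_difference_exceeds(next_worker, current, 0)

# The boundary case the patch targets: adl_next_worker == current_time.
assert not skip_old(100, 100, 60_000_000)  # old test fails to skip it
assert skip_new(100, 100, 60_000_000)      # new test skips it
# An entry shortly in the future is skipped by both versions.
assert skip_old(100, 200, 60_000_000) and skip_new(100, 200, 60_000_000)
```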
Re: [HACKERS] [COMMITTERS] pgsql: Teach tuplesort.c about "top N" sorting, in which only the first
Jim Nasby wrote:
> If we really want to make the logfile the approved method for
> monitoring performance, then why do we have the stats infrastructure
> at all? It could all be replaced with logging output and pgfouine.

First we'd have to fix the usability problem of our redirect_stderr stuff for pgfouine (multiline messages from different backends are mixed, according to Guillaume).

--
Alvaro Herrera  http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] [DOCS] row-level stats and last analyze time
Neil Conway <[EMAIL PROTECTED]> writes:
> On Thu, 2007-26-04 at 18:07 -0400, Neil Conway wrote:
>> (1) I believe the reasoning for Tom's earlier change was not to reduce
>> the I/O between the backend and the pgstat process [...]
>
> Tom, any comments on this? Your change introduced an undocumented
> regression into 8.2. I think you're on the hook for a documentation
> update at the very least, if not a revert.

The documentation update seems the most prudent thing to me.

The problem with the prior behavior is that it guarantees that every table in the database will eventually have a pg_stat entry, even if stats_row_level and stats_block_level are both off. In a DB with lots of tables that creates a significant overhead *for a feature the DBA probably thinks is turned off*. This is not how it worked before 8.2, and so 8.2.0's behavior is arguably a performance regression compared to 8.1 and before.

Now this patch went in before we realized that 8.2.x had a bug in computing the stats-file-update delay, and it could be that after fixing that the problem is not so pressing. But I don't particularly care for new features that impose a performance penalty on those who aren't using them, and that's exactly what last_vacuum/last_analyze tracking does if we allow it to bloat the stats file in the default configuration.

The long-term answer of course is to devise a more efficient stats reporting scheme, but I'm not sure offhand what that would look like.

regards, tom lane
Re: [HACKERS] temporal variants of generate_series()
Jim Nasby <[EMAIL PROTECTED]> writes:
> Also, what would be the appropriate way to put this into initdb?

You seem to have missed a step here, which is to convince people that these belong in core at all. So far I've not even seen an argument that would justify putting them in contrib. If they *were* of sufficiently wide use to justify putting them into core, a more efficient implementation would probably be expected.

regards, tom lane
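For readers following the thread, the behavior under discussion is simple to sketch outside the backend (a Python stand-in; the eventual core function and its signature are not settled by this exchange):

```python
from datetime import datetime, timedelta

def generate_series(start: datetime, stop: datetime, step: timedelta):
    """Naive timestamp analogue of SQL generate_series(): yield
    start, start+step, ... up to and including stop."""
    current = start
    while current <= stop:
        yield current
        current += step

days = list(generate_series(datetime(2007, 5, 1),
                            datetime(2007, 5, 4),
                            timedelta(days=1)))
assert len(days) == 4
assert days[-1] == datetime(2007, 5, 4)
```

A set-returning C implementation in core would avoid materializing the whole series, which is presumably the "more efficient implementation" Tom alludes to.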
Re: [HACKERS] plperl vs. bytea
Andrew Dunstan <[EMAIL PROTECTED]> writes:
> Peter Eisentraut wrote:
>> This ought to be a property of data type plus language, not a property
>> of a function.
>
> Why should it?
> And how would you do it in such a way that it didn't break legacy code?
> My GUC proposal would have made it language+type specific, but Tom
> didn't like that approach.

It may indeed need to be language+type specific; what I was objecting to was the proposal of an ad-hoc plperl-specific solution without any consideration for other languages (or other data types for that matter). I think that's working at the wrong level of detail, at least for initial design.

What we've basically got here is a complaint that the default textual-representation-based method for transmitting PL function parameters and results is awkward and inefficient for bytea. So the first question is whether this is really localized to only bytea, and if not which other types have got similar issues. (Even if you make the case that no other scalar types need help, what of bytea[] and composite types containing bytea or bytea[]?)

After that we have to look at which PLs have the issue. I think this is largely driven by what the PL's internal type system is like, in particular does it have a datatype that is a natural conversion target for bytea, or other types with the same issue? (Tcl for instance once did not have 8-bit-clean strings, though I think it does today.)

After we've got a handle on the scope of the problem we can start to think about solutions.

regards, tom lane
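To make the bytea cost concrete: in the textual "escape" representation, every non-printable byte expands to a four-character octal escape. A rough model of that expansion (my sketch, not the server's actual encoder):

```python
def bytea_escape(data: bytes) -> str:
    """Rough model of the bytea escape output format: printable ASCII
    passes through, a backslash doubles, and every other byte becomes
    a four-character \\nnn octal escape."""
    out = []
    for b in data:
        if b == 0x5C:              # backslash
            out.append("\\\\")
        elif 0x20 <= b <= 0x7E:    # printable ASCII
            out.append(chr(b))
        else:
            out.append("\\%03o" % b)
    return "".join(out)

blob = bytes(range(32))            # 32 control bytes
text = bytea_escape(blob)
assert len(text) == 32 * 4         # 4x expansion for non-printable data
assert bytea_escape(b"abc") == "abc"
```

So a PL function receiving a mostly-binary bytea value as text pays up to a 4x size penalty each way, on top of the encode/decode work itself.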
[HACKERS] Re: [COMMITTERS] psqlodbc - psqlodbc: Put Autotools-generated files into subdirectory
Peter Eisentraut wrote:
> Hiroshi Inoue wrote:
>> Hiroshi Inoue wrote:
>>> User Petere wrote:
>>>> Log Message:
>>>> ---
>>>> Put Autotools-generated files into subdirectory config/; add macro
>>>> files used from PostgreSQL there so you don't need a PostgreSQL
>>>> source tree to bootstrap the code.
>>>
>>>> Added Files:
>>>> ---
>>>> psqlodbc/config: general.m4 (r1.1)
>>>
>>> Under what license is this file distributed ?
>>
>> Could you please reply to my question ?
>
> It's from the PostgreSQL source tree, so whatever license that has.

So what is it? Could you please take the developers in the psqlodbc project into account a little more?
Re: [HACKERS] Integer datetimes
On Sun, 2007-06-05 at 13:09 -0400, Bruce Momjian wrote:
> Also, are we sure we can load a dump that used the float format? What
> happens for a date out of int8 range?

AFAIK we should always be able to reload timestamp values that are in the legal range for an int8-based timestamp. For values outside that range, the restore will fail, just as it would if you tried to move an application from PG 8.2 with float timestamps to PG 8.2 with integer timestamps. The user can always reconfigure with --disable-integer-datetimes.

-Neil
Re: [PATCHES] [HACKERS] Arrays of Complex Types
On Sun, May 06, 2007 at 01:33:47PM -0400, Andrew Dunstan wrote:
> However, there are still some oddities. For example, a change to or
> removal of the base type affects the array type, but the array type
> can be directly operated on (e.g. alter type _aa set schema foo ).
> I'm inclined to say we should prevent direct operations on array
> types, and they should live or die by their parent types.
>
> Thoughts?

+1 on binding the array types tightly to the parent types.

Cheers,
D
--
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
phone: +1 415 235 3778  AIM: dfetter666  Skype: davidfetter
Remember to vote!
Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
Re: [PATCHES] [HACKERS] Arrays of Complex Types
I wrote:
> OK, summarising what looks to me like a consensus position, ISTM the
> plan could be:
>
> . fix makeArrayTypeName() or similar to make it try harder to
>   generate a unique non-clashing name
> . remove the existing "62 instead of 63" name length restrictions
> . autogenerate array types for all explicitly or implicitly created
>   composite types other than for system catalog objects.
> . defer for the present any consideration of a "CREATE TYPE foo AS
>   ARRAY ..." command.
>
> Regarding catalog objects, we might have to try a little harder than
> just not generating in bootstrap mode - IIRC we generate system views
> (including pg_stats) in non-bootstrap mode. Maybe we just need to
> exempt anything in the pg_catalog namespace.

What would happen if a user created a view over pg_statistic? Should the test be to avoid arrays for things that depend on the catalogs? Or maybe we should go to the heart of the problem and simply check for pseudo-types directly.

I've been working on David's patch and done the following:

. inhibit creation of array types for composites during initdb
. some bug fixes
. have CheckAttributeType recurse into composite types, so you can no
  longer create a table/type with a composite field which contains a
  pseudo-type column (like pg_statistic)

However, there are still some oddities. For example, a change to or removal of the base type affects the array type, but the array type can be directly operated on (e.g. alter type _aa set schema foo ). I'm inclined to say we should prevent direct operations on array types, and they should live or die by their parent types.

Thoughts?

cheers

andrew
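A hedged sketch of the "try harder" naming rule in the plan above: prepend an underscore, truncate to the 63-byte identifier limit, and disambiguate on collision. The collision loop here is illustrative only; a real makeArrayTypeName() would consult pg_type:

```python
NAMEDATALEN = 64  # PostgreSQL identifiers are limited to NAMEDATALEN - 1 bytes

def make_array_type_name(type_name: str, existing: set) -> str:
    """Prepend '_' and truncate to 63 bytes; on collision, keep
    prepending underscores. A simplified stand-in for a uniqueness
    loop over the type catalog; a real implementation must also
    give up eventually if the name space is exhausted."""
    candidate = ("_" + type_name)[: NAMEDATALEN - 1]
    while candidate in existing:
        candidate = ("_" + candidate)[: NAMEDATALEN - 1]
    return candidate

assert make_array_type_name("point", set()) == "_point"
assert make_array_type_name("point", {"_point"}) == "__point"
# Truncation keeps the generated name inside the 63-byte limit.
assert len(make_array_type_name("x" * 100, set())) == 63
```

This also shows why the old "62 instead of 63" restriction existed: reserving a byte for the underscore up front avoids truncation, at the cost of shortening every user-visible type name.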
Re: [HACKERS] Integer datetimes
Jim Nasby wrote:
> On May 5, 2007, at 10:38 AM, Neil Conway wrote:
>> On Sat, 2007-05-05 at 11:03 -0400, Tom Lane wrote:
>>> I'm not necessarily opposed to changing the default configure
>>> selection, but I am opposed to removing the FP code entirely.
>>
>> I would be satisfied with changing the default to integer and
>> deprecating the FP code (but keeping it around as a configure option).
>> Are there any objections to doing this for 8.3?
>
> One question... I've always assumed that FP date times suffer from
> the inexact math issues that floats do; is that true?
>
> The only use I can think of for huge date values would be astronomy.
> I know they deal with huge numbers, so maybe huge times as well. If
> there is that kind of demand perhaps we'd want to continue supporting
> FP dates... maybe via contrib, or as a different base data type.

Also, are we sure we can load a dump that used the float format? What happens for a date out of int8 range?

--
Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] [COMMITTERS] psqlodbc - psqlodbc: Put Autotools-generated files into subdirectory
Hiroshi Inoue wrote:
> Hiroshi Inoue wrote:
>> User Petere wrote:
>>> Log Message:
>>> ---
>>> Put Autotools-generated files into subdirectory config/; add macro
>>> files used from PostgreSQL there so you don't need a PostgreSQL
>>> source tree to bootstrap the code.
>>
>>> Added Files:
>>> ---
>>> psqlodbc/config:
>>> general.m4 (r1.1)
>>
>> Under what license is this file distributed ?
>
> Could you please reply to my question ?

It's from the PostgreSQL source tree, so whatever license that has.

--
Peter Eisentraut http://developer.postgresql.org/~petere/
Re: [HACKERS] plperl vs. bytea
Andrew Dunstan wrote:
> It's not. If we really want to tackle this root and branch without
> upsetting legacy code, I think we'd need to have a way of marking
> data items as binary in the grammar, e.g.
>
> create function myfunc(myarg binary bytea) returns binary bytea
> language plperl as $$ ...$$;

This ought to be a property of data type plus language, not a property of a function.

--
Peter Eisentraut http://developer.postgresql.org/~petere/
Re: [HACKERS] plperl vs. bytea
Peter Eisentraut wrote:
> Andrew Dunstan wrote:
>> It's not. If we really want to tackle this root and branch without
>> upsetting legacy code, I think we'd need to have a way of marking
>> data items as binary in the grammar, e.g.
>>
>> create function myfunc(myarg binary bytea) returns binary bytea
>> language plperl as $$ ...$$;
>
> This ought to be a property of data type plus language, not a property
> of a function.

Why should it? And how would you do it in such a way that it didn't break legacy code? My GUC proposal would have made it language+type specific, but Tom didn't like that approach.

cheers

andrew
Re: [HACKERS] [COMMITTERS] pgsql: Teach tuplesort.c about "top N" sorting, in which only the first
Jim Nasby <[EMAIL PROTECTED]> writes:
> There's several problems with that. First, trace_sort isn't
> documented (or at least it's not in postgresql.conf), so most folks
> don't know it exists. Second, in order to see its output you have to
> drop log_min_messages to debug. That results in a huge log volume,
> especially on a production system.

Nonsense ... trace_sort output is at LOG level.

regards, tom lane
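For reference, the "top N" tuplesort behavior this thread's subject refers to keeps only the best N rows in a bounded heap rather than sorting the whole input; a minimal sketch of the idea:

```python
import heapq

def top_n_smallest(iterable, n):
    """Keep a max-heap of size n (values negated on Python's min-heap):
    each new item is compared against the current worst of the kept n,
    so memory stays O(n) however long the input is."""
    heap = []  # negated values => largest-of-the-kept sits at heap[0]
    for item in iterable:
        if len(heap) < n:
            heapq.heappush(heap, -item)
        elif -heap[0] > item:          # item beats the current worst
            heapq.heapreplace(heap, -item)
    return sorted(-x for x in heap)

assert top_n_smallest([5, 1, 9, 3, 7, 2], 3) == [1, 2, 3]
assert top_n_smallest(range(1000, 0, -1), 2) == [1, 2]
```

This is why an ORDER BY ... LIMIT N query can avoid a full external sort: the working set is bounded by N, not by the input size.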
Re: [HACKERS] Integer datetimes
On May 5, 2007, at 10:38 AM, Neil Conway wrote:
> On Sat, 2007-05-05 at 11:03 -0400, Tom Lane wrote:
>> I'm not necessarily opposed to changing the default configure
>> selection, but I am opposed to removing the FP code entirely.
>
> I would be satisfied with changing the default to integer and
> deprecating the FP code (but keeping it around as a configure option).
> Are there any objections to doing this for 8.3?

One question... I've always assumed that FP datetimes suffer from the inexact math issues that floats do; is that true?

The only use I can think of for huge date values would be astronomy. I know they deal with huge numbers, so maybe huge times as well. If there is that kind of demand perhaps we'd want to continue supporting FP dates... maybe via contrib, or as a different base data type.

--
Jim Nasby  [EMAIL PROTECTED]
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
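Jim's precision question has a quick quantitative answer: a float8 timestamp counts seconds from 2000-01-01, and the spacing between adjacent doubles grows with magnitude, so microsecond precision is lost within a few centuries of the epoch (a back-of-envelope check, not PostgreSQL code):

```python
import math

# Float timestamps: seconds since 2000-01-01 stored in a C double.
# The gap between adjacent doubles near t is ulp(t); once that gap
# exceeds 1e-6 s, distinct microseconds collapse to the same value.
t = 300 * 365.25 * 86400        # ~300 years from the epoch, in seconds
assert math.ulp(t) > 1e-6       # microsecond precision already lost here

# Integer timestamps: microseconds since 2000-01-01 in an int64 stay
# exact over the whole supported range (roughly +/- 292,000 years).
usecs = int(t * 1_000_000)
assert usecs < 2**63            # comfortably inside int8 range
```

So the trade is exactly the one discussed in this thread: float timestamps reach astronomically distant dates but with degrading sub-second precision, while integer timestamps are uniformly microsecond-exact within a bounded range.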
Re: [HACKERS] Managing the community information stream
Dave Page wrote:
> Bruce Momjian wrote:
>> The idea of the patch number in the subject line works with that
>> streaming model because it merely marks streams so they can be
>> grouped. The defining event that marks the stream is a post to the
>> patches list. We already number posts to the bugs list, so in a way
>> we could improve tracking there and somehow link it to TODO items
>> and patch submissions, but because many TODO items are not the
>> result of bug reports but come out of general discussions, I am not
>> sure tracking would work as well there. And what about features? Do
>> you start assigning numbers there, and what is your trigger event?
>> In my opinion, as you start trying to place more structure on the
>> stream, the stream itself starts to degrade in its dynamism and ease
>> of use. To me, that is the fundamental issue, and risk.
>
> Bruce, I cannot really add to that except to say that you neatly
> summarized what I've completely failed to in my last few emails to
> Andrew. I agree completely.

Frankly, this strikes me as painting lipstick on a pig. Try searching the mailing list archives to find information. It's hard. It sucks badly. So often you have to post a query on a mailing list, which you have to join unless you want your query to sit in limbo for days. If you think this is treating users nicely then you have a different idea from me of what that means. Yes, what I'm proposing means work, and no it can't be fully automated. That doesn't mean it's not worth doing.

Case 1 (bug): Recently I had a problem with Gaim/Pidgin on my fc6 boxes. I went to the bug site, clicked a few buttons and found that our own Devrim Gunduz had reported the problem. Later, when I found out some more information, I went back and added it to the bug. When the RedHat/Fedora guys get around to fixing it they will know what the problem is and what the solution is. They will have all the info gathered in one spot.

Case 2 (feature): Several years ago I wanted to find out what had happened about BZ support for Postgres. It was in their roadmap doc, so I went and looked at the tracking item. Nothing seemed to be happening, so I asked. Then I reviewed the patches (plural, note - another reason why tracking patches rather than action items is not necessarily good) that related to the item. I didn't like the direction they were going so I did some work and proposed an alternative. That got picked up by Ed Sobol and Max Kanat-Alexander (iirc) and the result is that today there is full support for Postgres in BZ mainline. If someone wants to review the history it is all there, with patches and comments all gathered neatly.

Oh, the answer to Bruce's question about when to create a feature item? You could well do it at the time when today you create a TODO item. However, we might even do better. For example, we might well add feature requests that are denied. That would help people to see if something has been proposed before.

I could go on but I'm actually trying to get some code written today :-)

cheers

andrew
Re: [HACKERS] [COMMITTERS] pgsql: Teach tuplesort.c about "top N" sorting, in which only the first
On May 6, 2007, at 9:32 AM, Tom Lane wrote:
> Jim Nasby <[EMAIL PROTECTED]> writes:
>> There's several problems with that. First, trace_sort isn't
>> documented (or at least it's not in postgresql.conf), so most folks
>> don't know it exists. Second, in order to see its output you have to
>> drop log_min_messages to debug. That results in a huge log volume,
>> especially on a production system.
>
> Nonsense ... trace_sort output is at LOG level.

I stand corrected. But my point still remains. It wouldn't be unusual for a website database to be running several sorts a second; that means 4 lines per sort, which is still a large amount of data.

If we really want to make the logfile the approved method for monitoring performance, then why do we have the stats infrastructure at all? It could all be replaced with logging output and pgfouine. Why are we maintaining two separate sets of code for monitoring database performance?

--
Jim Nasby  [EMAIL PROTECTED]
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Re: [HACKERS] New idea for patch tracking
On May 5, 2007, at 11:40 AM, Dave Page wrote:
>>> Barring a few trivial details, that sounds almost identical to what
>>> I proposed.
>>
>> Well, Andrew says everyone he talks to doesn't want it. They want a
>> more comprehensive solution that goes from bug to patch.
>
> I don't recall him saying that, though I do know that's /his/ opinion.
> It's certainly *not* the opinion of most of the people I've spoken
> with. I don't disagree with the idea in principle though, but I don't
> believe it will work for us because it's so fundamentally different
> from the way we currently work and still wouldn't solve the problem
> of capturing all the relevant discussion regarding a given patch (or
> bug) without a reasonable amount of manual work, or grafting a large
> part of what I'm proposing on the side.

IIRC, every recent debate about going to a bug/issue (and now patch) tracker has ultimately boiled down to the impact it would have on our current work processes: it has to work 100% painlessly off of the mailing lists. It's got to do more than just pipe changes to the mailing list; it's got to be able to be driven by the list as well. That's the real challenging part. People have suggested different trackers that have varying amounts of email capability, but I don't think any of them have had the full capability that we'd need. At best they might accept comments on a bug/issue via email, but to work for the community they'd need to go beyond that. You'd have to be able to resolve via email (preferably tied to -committers). You'd need to be able to mark a bug as invalid. You'd need to be able to open a new issue via email. And change status. And assign it to someone. And it would have to actually thread discussion to be useful. Probably some other things as well. Since a system like that doesn't exist I think it's going to be up to us to create one.
When it comes to the full set of features you'd expect out of an issue tracker, it would probably make sense to start with an existing project and try and add this functionality. But right now I don't think such an effort would work well, because we don't know well enough how all these new features should work.

But writing a patch tracker would be simpler than a full issue tracker. It's also something we could more easily do in a piece-meal fashion, since the only users will be developers. Building such a tool would provide a wealth of experience that could then be applied to tackling a full-blown issue tracking system.

The system Bruce and Dave have outlined shouldn't be terribly hard to implement. Let's start with that and see what we learn (as I've already told Dave, this is something I'll help with). Otherwise we'll once again have spent another chunk of community effort on a tracker discussion that results in nothing being done (I guess it has been 6 months since it was last brought up, so we were due again anyway...)

--
Jim Nasby  [EMAIL PROTECTED]
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Re: [HACKERS] Managing the community information stream
Bruce Momjian wrote:
> The idea of the patch number in the subject line works with that
> streaming model because it merely marks streams so they can be
> grouped. The defining event that marks the stream is a post to the
> patches list. We already number posts to the bugs list, so in a way
> we could improve tracking there and somehow link it to TODO items
> and patch submissions, but because many TODO items are not the
> result of bug reports but come out of general discussions, I am not
> sure tracking would work as well there. And what about features? Do
> you start assigning numbers there, and what is your trigger event?
> In my opinion, as you start trying to place more structure on the
> stream, the stream itself starts to degrade in its dynamism and ease
> of use. To me, that is the fundamental issue, and risk.

Bruce, I cannot really add to that except to say that you neatly summarized what I've completely failed to in my last few emails to Andrew. I agree completely.

Regards, Dave.