Re: [HACKERS] sp-gist porting to postgreSQL
Oleg,

Thanks for your prompt reply. Actually, I am able to create a new access method for testing and to add an operator class for the type "integer" using the new access method. I then created a table with two integer fields, one indexed using the new access method and the other using a btree index, and everything is OK so far. Even EXPLAIN output for queries shows that the indexes are used correctly, as they should be. I am using PostgreSQL version 8.0.0beta3 from CVS.

Thanks

Ramy

-----Original Message-----
From: Oleg Bartunov [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, November 10, 2004 12:35 AM
To: Ramy M. Hassan; Pgsql Hackers
Cc: Teodor Sigaev; Walid G. Aref
Subject: Re: sp-gist porting to postgreSQL

Ramy,

glad to hear from you! AFAIK, PostgreSQL doesn't support several indices for the same type. I think this is a problem of the optimizer. Probably other hackers know better. I am forwarding your message to the -hackers mailing list, which is the relevant place for GiST discussion.

regards, Oleg

On Tue, 9 Nov 2004, Ramy M. Hassan wrote:

> Dear Oleg and Teodor,
> Thanks for offering help.
> I have a design question for now.
> Currently in the PostgreSQL GiST implementation, I noticed that the way to
> have a GiST-based index is to define an operator class for a certain type
> using the GiST index. There is no new index type defined from the point of
> view of PostgreSQL (nothing is added to pg_am). This means that for a
> certain type there can be only one GiST-based index: there is no way, in
> the same server, to use GiST to implement an xtree index and a ytree index
> for the same type, even if they index different fields in different
> relations. Is that correct?
> What about doing it the other way (I am talking about SP-GiST now), by
> providing the extension writer with an API to instantiate a standalone
> SP-GiST-based index (for example, a trie index) that has a record in the
> pg_am relation? In my view this would give more flexibility, and it would
> also spare the extension writer from learning the PostgreSQL API (maybe one
> day SP-GiST will be ported to another database engine); he would just need
> to learn the SP-GiST API, which will probably be less study (and this is
> what GiST and SP-GiST are all about, if I understand correctly).
> Please let me know your opinions on this.
>
> Thanks
>
> Ramy

Regards,
Oleg
_____
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
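For readers who want to reproduce what Ramy describes, a minimal sketch follows. Everything here is hypothetical: in 8.0 a new access method has to be registered by inserting a row into pg_am by hand (elided here, along with its C support routines), and the operator strategy and support numbers depend entirely on what the new access method expects.

    -- assumes a new access method 'sptrie' has already been added to pg_am
    CREATE OPERATOR CLASS int4_sptrie_ops
        DEFAULT FOR TYPE int4 USING sptrie AS
            OPERATOR 1 <,
            OPERATOR 3 =,
            FUNCTION 1 btint4cmp(int4, int4);

    CREATE TABLE t (a int4, b int4);
    CREATE INDEX t_a_sptrie ON t USING sptrie (a);  -- new access method
    CREATE INDEX t_b_btree  ON t USING btree (b);   -- ordinary btree

    EXPLAIN SELECT * FROM t WHERE a = 42;  -- should scan t_a_sptrie
    EXPLAIN SELECT * FROM t WHERE b = 42;  -- should scan t_b_btree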
Re: [HACKERS] sp-gist porting to postgreSQL
Ramy,

glad to hear from you! AFAIK, PostgreSQL doesn't support several indices for the same type. I think this is a problem of the optimizer. Probably other hackers know better. I am forwarding your message to the -hackers mailing list, which is the relevant place for GiST discussion.

regards, Oleg

On Tue, 9 Nov 2004, Ramy M. Hassan wrote:

> Dear Oleg and Teodor,
> Thanks for offering help.
> I have a design question for now.
> Currently in the PostgreSQL GiST implementation, I noticed that the way to
> have a GiST-based index is to define an operator class for a certain type
> using the GiST index. There is no new index type defined from the point of
> view of PostgreSQL (nothing is added to pg_am). This means that for a
> certain type there can be only one GiST-based index: there is no way, in
> the same server, to use GiST to implement an xtree index and a ytree index
> for the same type, even if they index different fields in different
> relations. Is that correct?
> What about doing it the other way (I am talking about SP-GiST now), by
> providing the extension writer with an API to instantiate a standalone
> SP-GiST-based index (for example, a trie index) that has a record in the
> pg_am relation? In my view this would give more flexibility, and it would
> also spare the extension writer from learning the PostgreSQL API (maybe one
> day SP-GiST will be ported to another database engine); he would just need
> to learn the SP-GiST API, which will probably be less study (and this is
> what GiST and SP-GiST are all about, if I understand correctly).
> Please let me know your opinions on this.
>
> Thanks
>
> Ramy

Regards,
Oleg
_____
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
Re: [HACKERS] A modest proposal: get rid of GUC's USERLIMIT variable
Greg Stark wrote:
> Tom Lane <[EMAIL PROTECTED]> writes:
> > I'd like to propose that we get rid of GUC's USERLIMIT category and
> > convert all the variables in it to plain SUSET. In my mind, USERLIMIT
> > is a failed experiment: it's way too complicated, and it still doesn't
> > do quite what it was intended to do, because there are times when it
> > can't check whether you're a superuser.
> >
> > The only variables that are in the category are log-verbosity-related:
>
> Would that mean I wouldn't be able to change the logging level on the fly
> at all?
>
> That would disappoint at least one user, myself. I've found the best
> debugging compromise is to leave log_statement off in general but have a
> magic parameter I can pass to the application that will set
> log_statement = true for a single transaction.
>
> That way I can look at what queries transpired in my session without
> having to dig through hundreds of other queries from other sessions. And
> have the complete logs for the session I'm debugging without the
> performance impact in the normal case.

Yes, this would not be possible for non-superusers under the new proposal. You could set the setting for non-superusers per-user, but not per-session.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                    |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
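For illustration, the per-user (but not per-session) path Bruce mentions would look something like this; the role name is hypothetical.

    -- run as superuser; takes effect at the start of appuser's next session
    ALTER USER appuser SET log_min_duration_statement = 1000;
    -- under the proposal, the stored value is accepted at session start
    -- without being rechecked against the non-superuser's own permissions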
Re: [HACKERS] A modest proposal: get rid of GUC's USERLIMIT variable category
Tom Lane <[EMAIL PROTECTED]> writes:

> I'd like to propose that we get rid of GUC's USERLIMIT category and
> convert all the variables in it to plain SUSET. In my mind, USERLIMIT
> is a failed experiment: it's way too complicated, and it still doesn't
> do quite what it was intended to do, because there are times when it
> can't check whether you're a superuser.
>
> The only variables that are in the category are log-verbosity-related:

Would that mean I wouldn't be able to change the logging level on the fly at all?

That would disappoint at least one user, myself. I've found the best debugging compromise is to leave log_statement off in general but have a magic parameter I can pass to the application that will set log_statement = true for a single transaction.

That way I can look at what queries transpired in my session without having to dig through hundreds of other queries from other sessions. And have the complete logs for the session I'm debugging without the performance impact in the normal case.

--
greg
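A sketch of the per-transaction trick Greg describes, assuming the 7.4-era boolean form of log_statement (in 8.0 it becomes an enum: none/ddl/mod/all):

    BEGIN;
    SET LOCAL log_statement = true;  -- reverts automatically at COMMIT/ROLLBACK
    -- ... run the queries to be traced ...
    COMMIT;

Under the proposal, issuing this SET would require superuser rights, which is exactly the regression Greg is worried about.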
Re: [HACKERS] [Pgsphere-dev] GIST index concurrency concern
Daniel,

concurrency is a big issue with the current implementation of GiST. But it shouldn't bite you for READ ops! The -hackers mailing list is a very relevant mailing list for GiST discussions. It's a pity that we several times planned to work on GiST concurrency and recovery but never got the chance :) I see Neil has become interested in GiST concurrency, though.

Oleg

On Tue, 9 Nov 2004, Daniel Ceregatti wrote:

> Hi,
>
> It's recently come to my attention that GiST indices suffer from
> concurrency issues. I have already developed a dating site using GiST
> indexes on user attributes, via the intarray contrib module, and for Earth
> distance/radius calculations using pg_sphere. I'm wondering if I haven't
> shot myself in the foot here.
>
> So far, I understand that a GiST index will be locked by a backend for any
> DML. Basically I'm concerned that my database will not scale in the manner
> that I was hoping, because the sites that access the database will be used
> by many concurrent users, doing some DML. I expect my site to sustain
> something around 1000-3000 new user acquisitions per day, each of which
> will account for an insert into 3 GiST indices. Additionally there will be
> people updating their attributes and locations as well, but this will
> probably account for only a small fraction of the DML. We don't allow
> people to delete stuff.
>
> My concern now is this concurrency issue. My question is: Is there anyone
> out there using a GiST index on a database with a lot of DML? Should I be
> concerned with this issue at all? If so, what can be done to minimize the
> impact of heavy DML on a GiST index? I've pondered rolling all DML into
> queues via triggers and then de-queuing it in one transaction every so
> often, say every 15 minutes, via cron. Any other suggestions?
>
> I'm posting to this list because I understand that both Oleg and Teodor
> read it, and I found no other relevant list. If I've misposted, please
> accept my apology and direct me to the appropriate list.
>
> Thanks,
>
> Daniel

Regards,
Oleg
_____
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
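The queue-and-batch idea Daniel floats could look roughly like the sketch below. Every name here is hypothetical (table, trigger, and role alike); it only illustrates the shape of the approach, not a vetted implementation.

    -- writes land in an unindexed queue instead of touching the GiST index
    CREATE TABLE attrs_queue (userid integer, attrs integer[],
                              queued_at timestamptz DEFAULT now());

    CREATE FUNCTION queue_attrs() RETURNS trigger AS '
    BEGIN
        IF current_user = ''flusher'' THEN
            RETURN NEW;   -- let the batch job write through
        END IF;
        INSERT INTO attrs_queue (userid, attrs)
            VALUES (NEW.userid, NEW.attrs);
        RETURN NULL;      -- BEFORE trigger: NULL suppresses the direct insert
    END;
    ' LANGUAGE plpgsql;

    CREATE TRIGGER attrs_redirect BEFORE INSERT ON user_attrs
        FOR EACH ROW EXECUTE PROCEDURE queue_attrs();

    -- cron, connected as the role flusher, applies the queue in one
    -- transaction every ~15 minutes:
    --   BEGIN;
    --   INSERT INTO user_attrs (userid, attrs)
    --       SELECT userid, attrs FROM attrs_queue;
    --   DELETE FROM attrs_queue;
    --   COMMIT;

This concentrates all GiST index maintenance into one short burst per interval instead of spreading lock contention across every user-facing write.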
[HACKERS] GiST: range of penalty method?
Does anyone know what the expected range of the "penalty" GiST method is? (i.e., is the legal range documented anywhere? Failing that, what does existing GiST-based code expect?)

While rewriting gistchoose() in gist.c to be less obfuscated, it occurred to me that (a) I don't think the existing code will work as intended if a negative penalty is returned, and (b) it would be good to define a "minimum legal penalty". Once the minimum penalty has been returned for a particular location, GiST knows that all other locations where it might insert the node will have a penalty greater than or equal to that value, so gistchoose() can bail out earlier.

Therefore, I'd like to make 0.0 the minimum legal penalty and require all GiST "penalty" methods to return values >= 0. I think requiring that the penalty not be NaN would also be a good idea. Comments?

Note that making this change to CVS tip results in a regression failure in contrib/btree_gist. The regression.diffs are attached. The regression tests for rtree_gist, tsearch, tsearch2 and pg_trgm succeed after making the modification.

-Neil

*** ./expected/bit.out	2004-11-09 11:20:51 +11:00
--- ./results/bit.out	2004-11-10 15:44:52 +11:00
***************
*** 33,38 ****
--- 33,39 ----
  (1 row)

  CREATE INDEX bitidx ON bittmp USING GIST ( a );
+ ERROR:  illegal penalty from GiST penalty method: -116307439489975337053247905800904507392.00
  SET enable_seqscan=off;
  SELECT count(*) FROM bittmp WHERE a < '01101100010001011101100011100';
  count

==========

*** ./expected/varbit.out	2004-11-09 11:20:51 +11:00
--- ./results/varbit.out	2004-11-10 15:44:52 +11:00
***************
*** 33,38 ****
--- 33,39 ----
  (1 row)

  CREATE INDEX varbitidx ON varbittmp USING GIST ( a );
+ ERROR:  illegal penalty from GiST penalty method: -92381335565846851340979378715858305024.00
  SET enable_seqscan=off;
  SELECT count(*) FROM varbittmp WHERE a < '1110100111010'::varbit;
  count

==========
[HACKERS] CREATE or REPLACE function pg_catalog.*
Hi,

When doing CREATE OR REPLACE FUNCTION on a builtin function, it seems to have no effect if it's in the 'C' language. SQL functions seem to work, but as neilc pointed out, that may be because the SQL function is being inlined. For 'C' language functions, the builtin function is still called, not the user-defined function.

...

John
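For context, the kind of replacement being attempted presumably looks something like this; the target function, module path, and symbol name are only placeholders, not necessarily what John actually ran.

    -- replace a builtin with a C function from a loadable module
    CREATE OR REPLACE FUNCTION pg_catalog.lower(text) RETURNS text
        AS '/path/to/mymodule', 'my_lower'
        LANGUAGE C IMMUTABLE STRICT;
    -- reported behavior: calls to lower() still reach the original
    -- builtin, not my_lower()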
Re: [HACKERS] CVS should die (was: Possible make_oidjoins_check ...)
> This doesn't really answer the question of what tool Postgres might
> change to, but it seems that Subversion is a good tool one should
> consider. And by golly, CVS is bad. Just consider the cons: having
> to forbid renames in all but the most necessary cases just invites
> cruft into any project.

Interesting reading:

http://better-scm.berlios.de/comparison/comparison.html
http://zooko.com/revision_control_quick_ref.html

Cheers,

Steve
[HACKERS] A modest proposal: get rid of GUC's USERLIMIT variable category
I'd like to propose that we get rid of GUC's USERLIMIT category and convert all the variables in it to plain SUSET. In my mind, USERLIMIT is a failed experiment: it's way too complicated, and it still doesn't do quite what it was intended to do, because there are times when it can't check whether you're a superuser.

The only variables that are in the category are log-verbosity-related:

regression=# select name from pg_settings where context = 'userlimit';
            name
----------------------------
 log_duration
 log_executor_stats
 log_min_duration_statement
 log_min_error_statement
 log_min_messages
 log_parser_stats
 log_planner_stats
 log_statement
 log_statement_stats
(9 rows)

What the USERLIMIT code tries to do is allow non-superusers to "increase" but not "decrease" the logging verbosity for their sessions only. (For instance, a non-superuser could turn log_duration on, but can't turn it off if the DBA has turned it on.) However, the usefulness of this capability is really pretty debatable. A non-superuser presumably doesn't have access to the postmaster log file anyhow, so why does he need to be able to turn up the logging? You could even argue that being able to flood the logs with stuff the DBA doesn't want is a mild form of DOS attack.

If we do get rid of USERLIMIT, another benefit accrues: we can assume that ALTER USER and ALTER DATABASE settings were fully checked when they were installed, and thereby accept them at session start without any extra permissions check. This would mean that, for example, a superuser could use ALTER USER to set these variables on a per-user basis for non-superusers, and it would actually work. Right now the value is rechecked as if the setting were being issued by the non-superuser, and so it may fail.

For more discussion see this thread in pgsql-bugs:
http://archives.postgresql.org/pgsql-bugs/2004-11/msg00101.php

regards, tom lane
Re: [HACKERS] Call for objections: simplify stable functions during estimation
Robert Treat <[EMAIL PROTECTED]> writes:
> On Tuesday 09 November 2004 11:28, Tom Lane wrote:
>> (One of the potential objections went away when
>> we started enforcing that stable functions don't have side-effects.)

> Since we know people will be calling volatile functions inside stable
> functions (see thread from last week if you need a refresher as to why) is
> there any serious negative side-effect in those cases?

If you are making an end-run around the rule, IMHO it's up to you to make sure that the behavior of the function is sane. In practice, the planner is only going to be estimating values for functions that appear in WHERE/GROUP BY/HAVING clauses, and anyone who puts a function with real side-effects in such places is in deep trouble anyway.

The real bottom line here is that it's better to take the current value of the function as the planner estimate than to fall back to completely default selectivity estimates. You can doubtless invent scenarios where this is wrong, but they are far outweighed by cases where it is right.

regards, tom lane
Re: [HACKERS] Call for objections: simplify stable functions during estimation
On Tuesday 09 November 2004 11:28, Tom Lane wrote:
> Awhile back, there was some discussion about pre-folding now() and
> related functions when the planner is trying to estimate selectivities.
> This would allow reasonable plans to be made for cases like
>     WHERE moddate >= current_date - 10;
> without having to indulge in any crude hacks with mislabeled wrapper
> functions, such as you can find all too often in the archives :-(
>
> I was a bit hesitant about it at the time because I wasn't sure of all
> the implications; but I've looked the idea over again, and as far as I
> can see it's reasonable to pre-fold *all* stable functions when deriving
> statistical estimates. (One of the potential objections went away when
> we started enforcing that stable functions don't have side-effects.)

Since we know people will be calling volatile functions inside stable functions (see the thread from last week if you need a refresher as to why), is there any serious negative side-effect in those cases?

--
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
Re: [HACKERS] [GENERAL] server auto-restarts and ipcs
"Ed L." <[EMAIL PROTECTED]> writes: > A power failure led to failed postmaster restart using 7.4.6 (see output > below). The short-term fix is usually to delete the pid file and restart. > I often wonder why ipcs never seems to show the shared memory > block in question? > 2004-11-08 17:17:22.398 [18038] FATAL: pre-existing shared memory block (key > 9746001, ID 658210829) is still in use I did a bit of experimentation and found that the Linux kernel does seem to reproducibly assign similar shmem IDs from one boot cycle to the next. Here's a smoking-gun case: $ sudo ipcs -m -- Shared Memory Segments keyshmid owner perms bytes nattch status 0x0052e2c1 65536 postgres 60010436608 1 0x 131073 gdm 600393216 2 dest 0x00530201 163842 tgl 60010395648 2 [ reboot ] $ sudo ipcs -m -- Shared Memory Segments keyshmid owner perms bytes nattch status 0x0052e2c1 65536 postgres 60010436608 1 0x00530201 98305 tgl 60010395648 2 0x 163842 gdm 600393216 2 dest The "tgl" entry is a manually-started postmaster, which in the second boot cycle I was able to start before gdm came up. Notice that gdm has been handed out a shmid that belonged to a different userID in the previous boot cycle. What this says is that given a little bit of variability in the boot cycle, it is fairly likely for the postmaster.pid file to contain a shared memory ID that has already been assigned to another daemon in the current boot cycle. The way that PGSharedMemoryIsInUse() is coded, this will result in a failure as exhibited by Ed, because shmctl() will return EACCES and we interpret that as a conflicting shmem segment. (The reason this is considered dangerous is it suggests that there might be backends still alive from a crashed previous postmaster; we dare not start new backends that are not in sync with the old ones.) After thinking about this awhile, I believe that it is safe to consider EACCES as a don't-care situation. EACCES could only happen if the shmem ID belongs to a different userid, which implies that it is not a postgres shared memory segment. Even if you are running postmasters under multiple userids, this can be ignored, because all that we care about is whether the shared memory segment could indicate the presence of backends running in the current $PGDATA directory. With the file permissions that we use, it is not possible for a shared memory segment to belong to a userid different from the one that owns the data directory, and so any postmaster having a different userid must be managing a different data directory. So we could reduce our exposure to failure-to-start conditions by allowing the EACCES case in PGSharedMemoryIsInUse. Does anyone see a flaw in this reasoning? This isn't a complete solution, because if you are running multiple postmasters under the *same* userid, they could still get confused. We could probably fix that by marking each shmem seg to indicate which data directory it goes with (eg, store the directory's inode number in the seg header). If we see an apparently live shmem segment of our own userid, we could attach to it and check the header to determine whether it's really a conflict or not. There might be some portability issues here though; didn't we find out that Windows doesn't really have inode numbers? regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Call for objections: simplify stable functions during
On Tue, 9 Nov 2004, Tom Lane wrote:

> Awhile back, there was some discussion about pre-folding now() and
> related functions when the planner is trying to estimate selectivities.
> This would allow reasonable plans to be made for cases like
>     WHERE moddate >= current_date - 10;
> without having to indulge in any crude hacks with mislabeled wrapper
> functions, such as you can find all too often in the archives :-(
>
> I was a bit hesitant about it at the time because I wasn't sure of all
> the implications; but I've looked the idea over again, and as far as I
> can see it's reasonable to pre-fold *all* stable functions when deriving
> statistical estimates. (One of the potential objections went away when
> we started enforcing that stable functions don't have side-effects.)
>
> The infrastructure for this is already there, because of Oliver Jowett's
> previous work to teach eval_const_expressions() whether it's folding
> the expression "for real" or just for estimation; it's basically a one
> line change to treat stable functions differently in the two cases.
>
> I know it's a bit late in the cycle, but I'd like to go ahead and make
> this change for 8.0. Objections?

From a performance tuning standpoint, I can't argue against it ... go for it ...

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]           Yahoo!: yscrappy              ICQ: 7615664
Re: [HACKERS] Call for objections: simplify stable functions
On Tue, 2004-11-09 at 16:28, Tom Lane wrote:
> Awhile back, there was some discussion about pre-folding now() and
> related functions when the planner is trying to estimate selectivities.
> This would allow reasonable plans to be made for cases like
>     WHERE moddate >= current_date - 10;
> without having to indulge in any crude hacks with mislabeled wrapper
> functions, such as you can find all too often in the archives :-(
>
> I was a bit hesitant about it at the time because I wasn't sure of all
> the implications; but I've looked the idea over again, and as far as I
> can see it's reasonable to pre-fold *all* stable functions when deriving
> statistical estimates. (One of the potential objections went away when
> we started enforcing that stable functions don't have side-effects.)
>
> The infrastructure for this is already there, because of Oliver Jowett's
> previous work to teach eval_const_expressions() whether it's folding
> the expression "for real" or just for estimation; it's basically a one
> line change to treat stable functions differently in the two cases.
>
> I know it's a bit late in the cycle, but I'd like to go ahead and make
> this change for 8.0. Objections?

None. IMHO, poor performance is a valid bug that must be addressed in the beta cycle. We've chased out most of the functional deficiencies; now it's time to concentrate on the performance ones.

--
Best Regards, Simon Riggs
Re: [HACKERS] debugging PostgreSQL
On Tue, Nov 09, 2004 at 09:16:34AM +0100, [EMAIL PROTECTED] wrote:
> Could someone be so kind to give me some pointers about how
> to debug pg. I would like to know which debugger you use under windows and
> linux

GDB. The mechanism is simple: start a connection, get the backend's PID with ps(1) in a terminal window, and then call "gdb -p <pid>". I assume this would also work with other debuggers or GDB frontends, but I haven't tried.

--
Alvaro Herrera ()
"El número de instalaciones de UNIX se ha elevado a 10, y se espera que este número aumente" (UPM, 1972)
("The number of UNIX installations has risen to 10, and this number is expected to increase.")
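If you want the PID of your own session without hunting through ps output, the server can also tell you directly:

    -- run in the session you intend to debug; returns the PID of the
    -- backend process serving this connection
    SELECT pg_backend_pid();

Then attach with gdb -p using the returned number.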
Re: [HACKERS] Reorganization of the translation files
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> I was thinking about organizing the translation files in a more efficient
> manner. (not for 8.0, obviously)
> [snip]
> And it would easily solve issues like translatable strings
> appearing in the pgport library, which has no makefile structure to support
> translations (which would be a waste for like 5 strings), and no run-time
> support either.

Sounds like a win to me on those grounds alone; but probably the translators need to have the biggest say here, since they'll be affected the most.

One question is whether the gettext code is markedly less efficient when accessing a large .mo file than a small one; if so, the consolidation would probably hurt client performance.

regards, tom lane
[HACKERS] Reorganization of the translation files
I was thinking about organizing the translation files in a more efficient manner. (Not for 8.0, obviously.)

Right now we have one PO file for each combination of program (or library) and language. The idea was that one only has to install the translation files for the PostgreSQL pieces that one actually uses on a particular system, and this is basically how it has worked out.

Nevertheless, I think it may make sense to provide only a single PO file for each language, covering the entire PostgreSQL source tree. Because if you think about it, the above space-saving scheme actually wastes space. If you install a PostgreSQL server package today, you install 14 translations, 93% of which you don't need. On the other hand, if we only provided one PO file per language and encouraged packagers to create a separate package for each language (say, postgresql-i18n-de), then a client-only installation wastes more like 60%, and a server installation (which usually includes the clients) wastes nothing. Moreover, English speakers have the option to install no translations at all.

Now, the above can be accomplished by realigning the packaging without changing the PO files, but if we were to do that, then there is little reason to keep separate files. Moreover, having one big file would have several other advantages: It would save space, because many strings are duplicated in several PO files. It would make life easier for those installing the translations. I suspect it would also make life easier for translators. It would be easier to release translation updates or new translations between code releases. And it would easily solve issues like translatable strings appearing in the pgport library, which has no makefile structure to support translations (which would be a waste for like 5 strings), and no run-time support either.

Comments?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
[HACKERS] Call for objections: simplify stable functions during estimation
Awhile back, there was some discussion about pre-folding now() and related functions when the planner is trying to estimate selectivities. This would allow reasonable plans to be made for cases like

    WHERE moddate >= current_date - 10;

without having to indulge in any crude hacks with mislabeled wrapper functions, such as you can find all too often in the archives :-(

I was a bit hesitant about it at the time because I wasn't sure of all the implications; but I've looked the idea over again, and as far as I can see it's reasonable to pre-fold *all* stable functions when deriving statistical estimates. (One of the potential objections went away when we started enforcing that stable functions don't have side-effects.)

The infrastructure for this is already there, because of Oliver Jowett's previous work to teach eval_const_expressions() whether it's folding the expression "for real" or just for estimation; it's basically a one line change to treat stable functions differently in the two cases.

I know it's a bit late in the cycle, but I'd like to go ahead and make this change for 8.0. Objections?

regards, tom lane
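To make the "crude hack" concrete: the workaround this proposal obsoletes is a wrapper function deliberately mislabeled IMMUTABLE so the planner folds it at plan time. A sketch (the function name is made up):

    -- lies to the planner: current_date is stable, not immutable, so this
    -- label is wrong on purpose, just to force plan-time constant-folding
    CREATE FUNCTION fold_current_date() RETURNS date
        AS 'SELECT current_date' LANGUAGE sql IMMUTABLE;

    -- used as:  WHERE moddate >= fold_current_date() - 10

With the proposed change, plain "WHERE moddate >= current_date - 10" gets a sensible selectivity estimate and the mislabeled wrapper becomes unnecessary.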
Re: [HACKERS] Increasing the length of
On Mon, 2004-11-08 at 22:32, Tom Lane wrote:
> Another relevant question is why you are expecting to get this
> information through pgstats and not by looking in the postmaster log.

This is only available if you log all queries, which isn't normally done while you are in production. When you hit a long-running query, you wish you had that enabled, and if it was, you could look there. It would be best to leave postmaster logging turned off, then allow dynamic inspection of the query iff you have a rogue query. This is an important admin consideration for data warehousing.

> I don't know about you, but I don't have any tools that are designed to
> cope nicely with looking at tables that have columns that might be many
> K wide. Looking in the log seems a much nicer way of examining the full
> text of extremely long queries. So I think it's actually a good thing
> that pgstats truncates the queries at some reasonable width.

You're right... if the query was in the log, that's where I'd look. In general, I'm not bothered whether I can see the whole query or not. But it would be good to have a mode of operation that allows the whole query to be seen via pg_stat_activity when required.

Could this allow some dynamic behaviour? I.e., set it low, as Tom suggests, for most of the time, then set it higher, as Greg suggests, upon demand, for a short period only? [Not sure, but I think the query text may only be sent once at the start of the query, then never again... so there may be a hole in this thinking.]

--
Best Regards, Simon Riggs
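For reference, the "dynamic inspection" under discussion is just a query against the statistics view (column names as of 8.0; it also requires the stats_command_string setting to be on, or current_query stays empty):

    SELECT datname, procpid, usename, current_query
    FROM pg_stat_activity;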
Re: [HACKERS] Increasing the length of pg_stat_activity.current_query...
Josh Berkus wrote:
> Tom,
>
>> Another relevant question is why you are expecting to get this
>> information through pgstats and not by looking in the postmaster log.
>> I don't know about you, but I don't have any tools that are designed to
>> cope nicely with looking at tables that have columns that might be many
>> K wide. Looking in the log seems a much nicer way of examining the full
>> text of extremely long queries. So I think it's actually a good thing
>> that pgstats truncates the queries at some reasonable width.
>
> Because pg_stat_activity can be queried dynamically, and the log can't.

I've been planning to post a lengthy mail after the 8.0 release, but it seems a good idea to do it now. When comparing pgsql to MSSQL in practice, I encounter a similar problem to Josh's. I get a server hammered by countless queries, some of them not too well constructed and thus soaking up CPU from all users.

On MSSQL, I'd be using the Profiler, which lets me tap one or more connections and log whatever I think is important to trace down the problem. This lets me filter out the uninteresting 99.9% of queries which would make my log unreadable. Additionally, some performance measures are recorded for each query, enabling me to spot the bad guys, analyze them, and improve them.

On pgsql, all logging goes unstructured into one file; I can't even start and stop a new log on demand for my observation period (somebody refused to implement a manual log rotation function, "nobody needs that"...). On a server addressed by 100 users, with several dozen queries fired every second, it's hard work to locate the offending query.

It appears to me that simply increasing the max query length won't do the trick (and 16k would not be enough). What I'd like to see is the possibility to tap one or more backends (superuser only, of course) and put them in a logging mode which will record the complete query, including performance counters, to some process in a lossless way. When I say tapping, I mean that the backend configuration switch is *not* set by the very same backend, but from a different superuser backend.

Regards,
Andreas
Re: [HACKERS] ExclusiveLock
On Mon, 2004-11-08 at 21:37, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > Recent runs of DBT-2 show very occasional ExclusiveLock(s) being held
> > by transactions, sometimes waiting to be granted.
>
> I think you are right that these reflect heap or btree-index extension
> operations. Those do not actually take locks on the *table* however,
> but locks on a single page within it (which are completely orthogonal to
> table locks and don't conflict). The pg_locks output leaves something
> to be desired, because you can't tell the difference between table and
> page locks.

Good. Thought it was worth discussion...

> It's odd that your example does not appear to show someone else holding
> a conflicting lock.

There is... I didn't copy the whole lock table output. Here it is:

    relname    |  pid  |       mode       | granted
---------------+-------+------------------+---------
 new_order     | 21735 | AccessShareLock  | t
 new_order     | 21735 | RowExclusiveLock | t
 orders        | 21715 | AccessShareLock  | t
 orders        | 21715 | RowExclusiveLock | t
 pg_class      | 23254 | AccessShareLock  | t
 order_line    | 21715 | AccessShareLock  | t
 order_line    | 21715 | RowExclusiveLock | t
 order_line    | 21735 | ExclusiveLock    | f
 new_order     | 21715 | AccessShareLock  | t
 new_order     | 21715 | RowExclusiveLock | t
 customer      | 21715 | AccessShareLock  | t
 pk_order_line | 21735 | AccessShareLock  | t
 pk_order_line | 21735 | RowExclusiveLock | t
 item          | 21715 | AccessShareLock  | t
 orders        | 21735 | AccessShareLock  | t
 orders        | 21735 | RowExclusiveLock | t
 order_line    | 21735 | AccessShareLock  | t
 order_line    | 21735 | RowExclusiveLock | t
 stock         | 21715 | AccessShareLock  | t
 stock         | 21715 | RowExclusiveLock | t
 order_line    | 21715 | ExclusiveLock    | t
 pk_order_line | 21715 | RowExclusiveLock | t
 pg_locks      | 23254 | AccessShareLock  | t
 district      | 21715 | AccessShareLock  | t
 district      | 21715 | RowShareLock     | t
 district      | 21715 | RowExclusiveLock | t
 warehouse     | 21715 | AccessShareLock  | t
 customer      | 21735 | AccessShareLock  | t
 customer      | 21735 | RowExclusiveLock | t
(29 rows)

Pids 21715 and 21735 are conflicting. There's also another example where the lock table output is > 1400 rows, with two lock requests pending.
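A listing like the one above can be produced with a join along these lines (the exact query Simon used isn't shown in the message):

    -- pg_locks.relation is the locked relation's OID; rows without a
    -- relation (e.g. transaction locks) are dropped by the inner join
    SELECT c.relname, l.pid, l.mode, l.granted
    FROM pg_locks l JOIN pg_class c ON c.oid = l.relation
    ORDER BY l.pid, c.relname;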
The oprofile for this run looks like this (but is not, of course, a snapshot at a point in time like the lock list):

CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples  %        app name                 symbol name
170746   42.7220  vmlinux-2.6.8.1-osdl2    ia64_pal_call_static
18934     4.7374  libc-2.3.2.so            (no symbols)
10691     2.6750  postgres                 FunctionCall2
9814      2.4555  postgres                 hash_seq_search
8654      2.1653  postgres                 SearchCatCache
7389      1.8488  postgres                 AllocSetAlloc
6122      1.5318  postgres                 hash_search
5707      1.4279  postgres                 OpernameGetCandidates
4901      1.2263  postgres                 StrategyDirtyBufferList
4627      1.1577  postgres                 XLogInsert
4424      1.1069  postgres                 pglz_decompress
4371      1.0937  vmlinux-2.6.8.1-osdl2    __copy_user
3796      0.9498  vmlinux-2.6.8.1-osdl2    finish_task_switch
3483      0.8715  postgres                 LWLockAcquire
3458      0.8652  postgres                 eqjoinsel
3001      0.7509  vmlinux-2.6.8.1-osdl2    get_exec_dcookie
2824      0.7066  postgres                 AtEOXact_CatCache
2745      0.6868  postgres                 _bt_compare
2730      0.6831  postgres                 nocachegetattr
2715      0.6793  postgres                 SearchCatCacheList
2659      0.6653  postgres                 MemoryContextAllocZeroAligned
2604      0.6515  postgres                 yyparse
2553      0.6388  postgres                 eqsel
2127      0.5322  postgres                 deconstruct_array
1921      0.4806  postgres                 hash_any
1919      0.4801  postgres                 int4eq
1855      0.4641  postgres                 LWLockRelease
1839      0.4601  postgres                 StrategyBufferLookup
1777      0.4446  postgres                 GetSnapshotData
1729      0.4326  postgres                 heap_getsysattr
1595      0.3991  postgres                 DLMoveToFront
1586      0.3968  postgres                 MemoryContextAlloc
1485      0.3716  vmlinux-2.6.8.1-osdl2    try_atomic_semop
1455      0.3641  postgres                 anonymous symbol from section .plt
1409      0.3525  postgres                 lappend
1352      0.3383  postgres                 heap_release_fetch
1270      0.3178  postgres                 PinBuffer
1141      0.2855  postgres                 DirectFunctionCall1
1132      0.2832  postgres                 base_yylex
982       0.2457  postgres
[HACKERS] debugging PostgreSQL
Dear Folks,

Could someone be so kind as to give me some pointers on how to debug pg? I would like to know which debugger you use under Windows and Linux.

Regards,

Gevik