Re: [HACKERS] Reducing stats collection overhead
On 31-7-2007 5:07 Alvaro Herrera wrote:
> Arjen van der Meijden wrote:
>> Afaik Tom hadn't finished his patch when I was testing things, so I don't know. But we're in the process of benchmarking a new system (dual quad-core Xeon) and we'll have a look at how it performs in the postgres 8.2dev we used before, the stable 8.2.4 and a fresh HEAD checkout (which we'll call 8.3dev). I'll let you guys (or at least Tom) know how they compare in our benchmark.
> So, ahem, did it work? :-)

The machine turned out to have a faulty mainboard, so we had to concentrate first on figuring out why it was unstable, and then on whether the replacement mainboard made it stable in a long durability test. Of course that behaviour only appeared with mysql and not with postgresql, so we had to run our mysql version of the benchmark a few hundred times, rather than testing various versions, until the machine had to go into production. So we haven't tested postgresql 8.3dev on that machine, sorry.

Best regards,

Arjen

On 18-5-2007 15:12 Alvaro Herrera wrote:
> Tom Lane wrote:
>> Arjen van der Meijden told me that according to the tweakers.net benchmark, HEAD is noticeably slower than 8.2.4, and I soon confirmed here that for small SELECT queries issued as separate transactions, there's a significant difference. I think much of the difference stems from the fact that we now have stats_row_level ON by default, and so every transaction sends a stats message that wasn't there by default in 8.2. When you're doing a few thousand transactions per second (not hard for small read-only queries) that adds up.
> So, did this patch make the performance problem go away?

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Machine available for community use
Devrim =?ISO-8859-1?Q?G=DCND=DCZ?= <[EMAIL PROTECTED]> writes: > On Mon, 2007-07-30 at 23:36 -0400, Tom Lane wrote: >> There isn't any "secret sauce" in the RHEL build process > Really? Are the compiler options, etc, public? Certainly. If you doubt it, try comparing pg_config output for the RHEL and CentOS packages. (And if the CFLAGS entries are different, you should be mentioning it to the CentOS package maintainer, not me.) > Actually I did not compare -- But both of them were 4.3 (RHEL 4.3 and > CentOS 4.3). I'm assuming that they have the same package versions, > right? > BTW, they were stock 4.3 -- no updates, etc. RHEL 4.3 was obsoleted more than a year ago, so I'd like to think that nobody finds "no update" comparisons to be very relevant today ... regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Machine available for community use
Hi,

On Mon, 2007-07-30 at 23:36 -0400, Tom Lane wrote:
>> Each test took 1-2 days -- I will insist that CentOS performs worse than RHEL.
> I'm finding that hard to believe too.

I have felt the same; that's why I repeated the test twice.

> There isn't any "secret sauce" in the RHEL build process

Really? Are the compiler options, etc, public?

> --- the CentOS guys should have been able to duplicate the RHEL RPMs exactly. Now it's possible that CentOS had lagged in updating some performance-relevant package; did you compare package versions across both OSes?

Actually I did not compare -- but both of them were 4.3 (RHEL 4.3 and CentOS 4.3). I'm assuming that they have the same package versions, right?

BTW, they were stock 4.3 -- no updates, etc.

I hope I will be able to publish at least the graphs, so that the community can take a look at what is going on.

Regards,
-- Devrim GÜNDÜZ
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/

signature.asc Description: This is a digitally signed message part
Re: [HACKERS] [GENERAL] ascii() for utf8
Moving to -hackers.

On Jul 27, 2007, at 1:22 PM, Stuart wrote:
> Does Postgresql have a function like ascii() that will return the unicode codepoint value for a utf8 character? (And symmetrically the same question for chr(), of course.) I didn't find anything in the docs, so I think the answer is no, which leads me to ask... Why not? (Hard to believe there's a lack of need, without concluding that either ascii() is not needed or utf8 text is little used.) Are there technical problems in implementing such a function? Has anyone else already done this (i.e., is there somewhere I could get it from)? Is there some other non-obvious way to get the codepoint value for a utf8 character? I think I could use plperl or plpython for this, but that seems like an awful lot of overhead for such a basic task.

I suspect that this is just a matter of no one scratching the itch. I suspect a patch would be accepted, or you could possibly put something on pgFoundry. I'd set it up so that ascii() and chr() act according to the appropriate locale setting (I'm not sure which one would be appropriate).

-- Decibel!, aka Jim Nasby [EMAIL PROTECTED]
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

---(end of broadcast)---
TIP 6: explain analyze is your friend
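For what it's worth, the semantics being asked for are just an ord()/chr() pair over Unicode code points. A minimal sketch in Python of the kind of logic a plpython function could wrap (function names here are made up for illustration, not an existing API):

```python
# Sketch of the requested ascii()/chr()-style semantics over Unicode
# code points. Hypothetical helper names; this only illustrates the
# mapping a plpython-based implementation would need to perform.

def utf8_codepoint(ch: str) -> int:
    """Return the Unicode code point of a single character."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    return ord(ch)

def codepoint_to_utf8(cp: int) -> str:
    """Inverse operation: the character for a given code point."""
    return chr(cp)

assert utf8_codepoint("A") == 65          # agrees with ascii('A')
assert utf8_codepoint("€") == 0x20AC      # beyond what ascii() handled for ASCII-only text
assert codepoint_to_utf8(0x00E9) == "é"
```

The overhead the poster worries about is the per-call PL interpreter round trip, not the computation itself, which is trivial.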
Re: [HACKERS] Machine available for community use
Devrim =?ISO-8859-1?Q?G=DCND=DCZ?= <[EMAIL PROTECTED]> writes: > On Mon, 2007-07-30 at 19:14 -0700, Joshua D. Drake wrote: >>> and RHEL performed much better than CentOS. >> Not to be unkind, but I doubt that on an identical configuration. > Each test took 1-2 days -- I will insist that CentOS performs poorer > than RHEL. I'm finding that hard to believe too. There isn't any "secret sauce" in the RHEL build process --- the CentOS guys should have been able to duplicate the RHEL RPMs exactly. Now it's possible that CentOS had lagged in updating some performance-relevant package; did you compare package versions across both OSes? regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Reducing stats collection overhead
Arjen van der Meijden wrote: > Afaik Tom hadn't finished his patch when I was testing things, so I don't > know. But we're in the process of benchmarking a new system (dual quad-core > Xeon) and we'll have a look at how it performs in the postgres 8.2dev we > used before, the stable 8.2.4 and a fresh HEAD-checkout (which we'll call > 8.3dev). I'll let you guys (or at least Tom) know how they compare in our > benchmark. So, ahem, did it work? :-) > On 18-5-2007 15:12 Alvaro Herrera wrote: >> Tom Lane wrote: >>> Arjen van der Meijden told me that according to the tweakers.net >>> benchmark, HEAD is noticeably slower than 8.2.4, and I soon confirmed >>> here that for small SELECT queries issued as separate transactions, >>> there's a significant difference. I think much of the difference stems >>> from the fact that we now have stats_row_level ON by default, and so >>> every transaction sends a stats message that wasn't there by default >>> in 8.2. When you're doing a few thousand transactions per second >>> (not hard for small read-only queries) that adds up. >> So, did this patch make the performance problem go away? -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Quick idea for reducing VACUUM contention
ITAGAKI Takahiro wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> wrote:
> > > I think we might need additional "freezing-xmax" operations to avoid XID-wraparound in the first pass of vacuum, though it rarely occurs.
> > I'm not sure I follow. Can you elaborate? Do you mean storing a separate relfrozenxmax for each table or something like that?
> We need to work around wraparound of xmax in dead tuples. If we fail to vacuum them and the XID counter wraps, we cannot remove them until the next XID wraparound, because we treat them as deleted in the *future*.

Oh, but this should not be a problem, because a tuple is either frozen or removed completely -- xmax cannot precede xmin.

> > > We just add the XID of the vacuum to dead tuples we see in the first pass. When backends find a dead tuple and see that the transaction identified by the XID in it has committed, they can freely reuse the area of the dead tuple, because we can assume the index entries pointing to the tuple have been removed by the vacuum.
> > I would be worried about leftover index entries being later used by new tuples in the heap. Then when you visit the index, find that entry, go to the heap and find the new tuple and return it, which could be bogus.
> To avoid dangling index entries, I'm thinking about reusing dead tuples only if we see that the VACUUM transaction has committed successfully. That means the VACUUM transaction removed all index entries corresponding to those dead tuples; they are now heap-only tuples, so we can recycle them in the same manner as HOT-updated tuples.

Hmm. OK, I admit I have no idea how HOT works.

-- Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] default_text_search_config and expression indexes
Alvaro Herrera wrote: > Bruce Momjian wrote: > > Bruce Momjian wrote: > > > > 3) Remove default_text_search_config and require the > > > configuration to be specified in each function call. > > > > > > If we remove default_text_search_config, it would also make ::tsvector > > > casting useless as well. > > > > OK, I just found a case that I think is going to make #3 a requirement > > (remove default_text_search_config). > > > > How is a CREATE INDEX ... to_tsvector(col) going to restore from a > > pg_dump? I see no way of guaranteeing that the > > default_text_search_config is correct on the restore, and in fact I > > don't think we have any way of knowing the default_text_search_config > > used for the index. > > Make pg_dump emit only CREATE INDEX sentences with two-param format. In > fact I think it would make sense to convert internally the one-param > format to two-param, before hitting the catalogs. > > This would also solve your problem about usability of WHERE clauses, if > you rewrite the one-param calls to two-params before the optimizer kicks > in. Yes, that could be done but as far as I know we weren't planning to have those areas of our backend be aware of configuration parameters to text search functions, and I doubt we want to do that for 8.3, if ever. There are many tsearch functions and doing this would spill tsearch function checks all over the backend, reducing our modularity. The default capability just isn't worth it, and in addition is error-prone. -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Quick idea for reducing VACUUM contention
Alvaro Herrera <[EMAIL PROTECTED]> wrote:
> > I think we might need additional "freezing-xmax" operations to avoid XID-wraparound in the first pass of vacuum, though it rarely occurs.
> I'm not sure I follow. Can you elaborate? Do you mean storing a separate relfrozenxmax for each table or something like that?

We need to work around wraparound of xmax in dead tuples. If we fail to vacuum them and the XID counter wraps, we cannot remove them until the next XID wraparound, because we treat them as deleted in the *future*.

> > We just add the XID of the vacuum to dead tuples we see in the first pass. When backends find a dead tuple and see that the transaction identified by the XID in it has committed, they can freely reuse the area of the dead tuple, because we can assume the index entries pointing to the tuple have been removed by the vacuum.
> I would be worried about leftover index entries being later used by new tuples in the heap. Then when you visit the index, find that entry, go to the heap and find the new tuple and return it, which could be bogus.

To avoid dangling index entries, I'm thinking about reusing dead tuples only if we see that the VACUUM transaction has committed successfully. That means the VACUUM transaction removed all index entries corresponding to those dead tuples; they are now heap-only tuples, so we can recycle them in the same manner as HOT-updated tuples.

Regards,
--- ITAGAKI Takahiro NTT Open Source Software Center

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
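The wraparound hazard described here comes from PostgreSQL's modular 32-bit XID comparison (TransactionIdPrecedes-style logic): once the XID counter advances about 2^31 past a dead tuple's unvacuumed xmax, that xmax suddenly compares as being in the future. A hedged Python sketch of the comparison (the real code also special-cases permanent XIDs such as FrozenTransactionId, omitted here):

```python
# Toy model of PostgreSQL's circular XID comparison, to show how a
# stale xmax flips from "past" to "future" after wraparound.
# This is an illustration, not the backend's actual code.

def xid_precedes(a: int, b: int) -> bool:
    """True if XID a logically precedes b under modular 32-bit comparison,
    i.e. the signed 32-bit value of (a - b) is negative."""
    diff = (a - b) & 0xFFFFFFFF
    return diff >= 0x80000000

old_xmax = 1000
# Shortly after the delete, the xmax correctly looks like the past:
assert xid_precedes(old_xmax, 100_000)

# But ~2^31 transactions later, the same xmax compares as "future",
# so the dead tuple can no longer be recognized as removable:
far_future = (old_xmax + 2**31 + 1) & 0xFFFFFFFF
assert not xid_precedes(old_xmax, far_future)
```

This is exactly why freezing (or some equivalent "freezing-xmax" step) is needed before a dead tuple's xmax falls 2^31 XIDs behind the counter.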
Re: [HACKERS] default_text_search_config and expression indexes
Bruce Momjian wrote: > Bruce Momjian wrote: > > 3) Remove default_text_search_config and require the > >configuration to be specified in each function call. > > > > If we remove default_text_search_config, it would also make ::tsvector > > casting useless as well. > > OK, I just found a case that I think is going to make #3 a requirement > (remove default_text_search_config). > > How is a CREATE INDEX ... to_tsvector(col) going to restore from a > pg_dump? I see no way of guaranteeing that the > default_text_search_config is correct on the restore, and in fact I > don't think we have any way of knowing the default_text_search_config > used for the index. Make pg_dump emit only CREATE INDEX sentences with two-param format. In fact I think it would make sense to convert internally the one-param format to two-param, before hitting the catalogs. This would also solve your problem about usability of WHERE clauses, if you rewrite the one-param calls to two-params before the optimizer kicks in. -- Alvaro Herrera http://www.amazon.com/gp/registry/DXLWNGRJD34J "Nadie esta tan esclavizado como el que se cree libre no siendolo" (Goethe) ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Machine available for community use
Hi,

On Mon, 2007-07-30 at 19:14 -0700, Joshua D. Drake wrote:
>> and RHEL performed much better than CentOS.
> Not to be unkind, but I doubt that on an identical configuration.

Since I don't have permission to distribute the benchmark results, I will be happy to spend the time re-running these tests if someone provides me an identical machine. Each test took 1-2 days -- but I will insist that CentOS performs worse than RHEL.

BTW, I will ask for permission to distribute the graphs that I produced using gnuplot -- maybe those graphs will shed some light.

Regards,
-- Devrim GÜNDÜZ
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/

signature.asc Description: This is a digitally signed message part
Re: [HACKERS] Machine available for community use
Devrim GÜNDÜZ wrote:
> Hi,
> RHEL has better performance than CentOS -- I guess it is the compiler options that Red Hat is using while compiling their RPMs. I have performed a test using the OSDL test suite a few months ago on a system that has:
> * 8 x86_64 CPUs @ 3200.263
> * 16 Gigabytes of RAM
> * PostgreSQL 8.1.5 (PGDG packages)
> and RHEL performed much better than CentOS.
> Regards,

Not to be unkind, but I doubt that on an identical configuration.

Joshua D. Drake

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] default_text_search_config and expression indexes
Bruce Momjian wrote:
> We have to decide if we want a GUC default_text_search_config, and if so, when it can be changed.
>
> Right now there are three ways to create a tsvector (or tsquery):
>
> ::tsvector
> to_tsvector(value)
> to_tsvector(config, value)
>
> (ignoring plainto_tsvector)
>
> Only the last one specifies the configuration. The others use the configuration specified by default_text_search_config. (We had a previous discussion on what the default value of default_text_search_config should be, and it was decided it should be set via initdb based on a flag or the locale.)
>
> Now, because most people use a single configuration, they can just set default_text_search_config and there is no need to specify the configuration name.
>
> However, expression indexes cause a problem here:
>
> http://momjian.us/expire/fulltext/HTML/textsearch-tables.html#TEXTSEARCH-TABLES-INDEX
>
> We recommend that users create an expression index on the column they want to do a full text search on, e.g.
>
> CREATE INDEX pgweb_idx ON pgweb USING gin(to_tsvector(body));
>
> However, the big problem is that the expressions used in expression indexes should not change their output based on the value of a GUC variable (because it would corrupt the index), but in the case above, default_text_search_config controls what configuration is used, and hence the output of to_tsvector changes if default_text_search_config changes.
>
> We have a few possible options:
>
> 1) Document the problem and do nothing else.
> 2) Make default_text_search_config a postgresql.conf-only setting, thereby making it impossible to change by non-super users, or make it a super-user-only setting.
> 3) Remove default_text_search_config and require the configuration to be specified in each function call.
>
> If we remove default_text_search_config, it would also make ::tsvector casting useless as well.
OK, I just found a case that I think is going to make #3 a requirement (remove default_text_search_config). How is a CREATE INDEX ... to_tsvector(col) going to restore from a pg_dump? I see no way of guaranteeing that the default_text_search_config is correct on the restore, and in fact I don't think we have any way of knowing the default_text_search_config used for the index. And if we have to require the configuration name in CREATE INDEX, it has to be used in WHERE, so we might as well just remove the default capability and always require the configuration name. -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
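The index-corruption hazard being discussed can be shown abstractly: if the indexed expression reads a mutable global setting (the analogue of default_text_search_config), entries written under one setting become unfindable under another, even though nothing in the stored index changed. A toy Python model (not PostgreSQL code):

```python
# Toy model of why an expression index must not depend on a GUC:
# the "index" stores keys computed under one global setting, and
# probes computed under a different setting miss those entries.

CONFIG = {"default_config": "english"}

def to_key(text: str) -> str:
    """Stand-in for one-argument to_tsvector(): output depends on a global."""
    return CONFIG["default_config"] + ":" + text.lower()

index = {}

CONFIG["default_config"] = "english"
index[to_key("Dog")] = "row 1"          # index entry built under 'english'

CONFIG["default_config"] = "french"
assert to_key("Dog") not in index       # same row is now unfindable via the index
assert index.get("english:dog") == "row 1"  # yet the stored entry never changed
```

The two-argument form to_tsvector(config, value) avoids this by baking the configuration into the expression itself, which is why options #2 and #3 above both aim to keep the GUC from varying under an existing index.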
[HACKERS] Building Hash Index by Presorting Tuples
Hi,

We are trying to sort the index tuples before inserting them into hash buckets, to improve build speed. Here is our plan:

1. Build a spool that contains all the index tuples to be inserted into the buckets. This is done.

2. Sort the index tuples in the spool according to the bucket number to which they should belong. This results in accessing each bucket once and only once.

3. For (2) to work, we need an estimate of the number of buckets. This is done.

4. After sorting the index tuples, insert them into the hash in bucket order.

Our challenge: we need to determine the final bucket number for the itup (index tuple).

1. To do the above, we need to apply a mask to the hash value of the index tuple. First, we calculate the hash value of the index tuple. Then, we calculate the mask using:

(1 << ceiling(log2(estimate of buckets needed))) - 1

So, if we need 6 buckets, the mask would be 7, or binary 111. If we needed 100 buckets, the mask would be 127, or binary 1111111. If we AND this mask with the hash of the key, we keep only the least significant bits needed to do the compare. A 32-bit hash value may look like:

1011010100101010110101010101

Let's say we just need 6 buckets; apply the mask 111 and we get:

1011010100101010110101010101 (the hash value of the key)
0000000000000000000000000111 (the mask, ANDed)
0000000000000000000000000101 (the resulting bucket number = 5)

If we needed 100 buckets, the calculation would look like:

1011010100101010110101010101 (the hash value of the key)
0000000000000000000001111111 (the mask, ANDed)
0000000000000000000001010101 (the resulting bucket number = 85)

2. However, in practice, when we apply a mask of value, say, (binary), our resulting bucket numbers are not evenly distributed.

3. Do we look for a better hash function? Or can we modify the existing hash?

Comments are welcome.

-Tom

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
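The mask arithmetic above can be sketched as follows. This is a plain Python illustration of the formula in the post, not PostgreSQL's actual hash-index code:

```python
# Sketch of the bucket-number calculation from the post: round the
# bucket estimate up to a power of two, build the low-bit mask, and
# AND it with the tuple's hash value.

import math

def bucket_mask(nbuckets_estimate: int) -> int:
    """mask = (1 << ceil(log2(n))) - 1, covering at least n buckets."""
    return (1 << math.ceil(math.log2(nbuckets_estimate))) - 1

def bucket_for(hash_value: int, nbuckets_estimate: int) -> int:
    """Final bucket number = hash AND mask."""
    return hash_value & bucket_mask(nbuckets_estimate)

h = 0b1011010100101010110101010101     # the example hash value from the post
assert bucket_mask(6) == 0b111         # 6 buckets  -> mask 7
assert bucket_mask(100) == 0b1111111   # 100 buckets -> mask 127
assert bucket_for(h, 6) == 5           # low 3 bits: 101
assert bucket_for(h, 100) == 85        # low 7 bits: 1010101
```

Note that masking only yields an even spread if the hash function already distributes its low-order bits uniformly, which is exactly the distribution problem raised in point 2.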
[HACKERS] Reducing Transaction Start/End Contention
Jignesh Shah's scalability testing on Solaris has revealed further tuning opportunities surrounding the start and end of a transaction. Tuning that should be especially important since async commit is likely to allow much higher transaction rates than were previously possible.

There is strong contention on the ProcArrayLock in Exclusive mode, with the top path being CommitTransaction(). This becomes clear as the number of connections increases, but it seems likely that the contention can be caused in a range of other circumstances. My thoughts on the causes of this contention are that the following 3 tasks contend with each other in the following way:

CommitTransaction():
    takes ProcArrayLock Exclusive, but only needs access to one ProcArray element
  waits for
GetSnapshotData():
    takes ProcArrayLock Shared
ReadNewTransactionId():
    takes XidGenLock Shared
  which waits for
GetNextTransactionId():
    takes XidGenLock Exclusive
ExtendCLOG():
    takes ClogControlLock Exclusive, WALInsertLock Exclusive
    (two possible places where I/O is required)
ExtendSubtrans():
    takes SubtransControlLock()
    (one possible place where I/O is required)
Avoids lock on ProcArrayLock:
    atomically updates one ProcArray element

or more simply:

CommitTransaction()      -- i.e. once per transaction
  waits for
GetSnapshotData()        -- i.e. once per SQL statement
  which waits for
GetNextTransactionId()   -- i.e. once per transaction

This gives some goals for scalability improvements and some proposals. (1) and (2) are proposals for 8.3 tuning; the others are directions for further research.

Goal: Reduce the total time that GetSnapshotData() waits for GetNextTransactionId().

1. Increase size of Clog-specific BLCKSZ

Clog currently uses BLCKSZ to define the size of clog buffers. This can be changed to use CLOG_BLCKSZ, which would then be set to 32768. This will naturally increase the amount of memory allocated to the clog, so we need not alter CLOG_BUFFERS above 8 if we do this (as previously suggested, with successful results).
This will also reduce the number of ExtendClog() calls, which will probably reduce the overall contention also.

2. Perform ExtendClog() as a background activity

A background process can look at the next transactionid once each cycle without holding any lock. If the xid is almost at the point where a new clog page would be allocated, then it will allocate one prior to the new page being absolutely required. Doing this as a background task would mean that we do not need to hold the XidGenLock in exclusive mode while we do this, which means that GetSnapshotData() and CommitTransaction() would also be less likely to block. Also, if any clog writes need to be performed when the page is moved forwards, this would also be performed in the background.

3. Consider whether ProcArrayLock should use a new queued-shared lock mode that puts a maximum wait time on ExclusiveLock requests.

It would be fairly hard to implement this well as a timer, but it might be possible to place a limit on queue length, i.e. allow Shared locks to be granted immediately if a Shared holder already exists, but only if there is a queue of no more than N exclusive-mode requests. This might prevent the worst cases of exclusive lock starvation.

4. Since shared locks are currently queued behind exclusive requests when they cannot be immediately satisfied, it might be worth reconsidering the way LWLockRelease works also. When we wake up the queue we only wake the Shared requests that are adjacent to the head of the queue. Instead we could wake *all* waiting Shared requestors, e.g. with a lock queue like this:

(HEAD) S<-S<-X<-S<-X<-S<-X<-S

Currently we would wake the 1st and 2nd waiters only. If we were to wake the 3rd, 5th and 7th waiters also, then the queue would reduce in length very quickly, if we assume generally uniform service times. (If the head of the queue is X, then we wake only that one process, and I'm not proposing we change that.) That would mean queue jumping, right?
Well, that's what already happens in other circumstances, so there cannot be anything intrinsically wrong with allowing it; the only question is: would it help? We need not wake the whole queue; there may be some generally more beneficial heuristic.

The reason for considering this is not to speed up shared requests but to reduce the queue length, and thus the waiting time for the exclusive requestors. Each time a shared request is dequeued, we effectively re-enable queue jumping, so a shared request arriving during that point will actually jump ahead of shared requests that were unlucky enough to arrive while an exclusive lock was held. Worse than that, the new incoming shared requests exacerbate the starvation, so the more non-adjacent groups of shared lock requests there are in the queue, the worse the starvation of the exclusive requestors becomes. We are effectively randomly starving some shared locks as well as exclusive locks in the current scheme, based upon the state of the lock when they make their req
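As a toy model (plain Python, not PostgreSQL's LWLock code), the two wake-up policies under discussion can be contrasted using the (HEAD) S<-S<-X<-S<-X<-S<-X<-S queue from the proposal:

```python
# Toy comparison of LWLockRelease wake-up policies on a wait queue of
# 'S' (shared) and 'X' (exclusive) requestors, head at the left end.
# Purely illustrative; the real lock manager tracks processes, not letters.

def wake_adjacent_shared(queue):
    """Current behaviour: wake the run of 'S' at the head (or a lone 'X')."""
    if not queue:
        return [], []
    if queue[0] == "X":
        return [queue[0]], queue[1:]
    i = 0
    while i < len(queue) and queue[i] == "S":
        i += 1
    return queue[:i], queue[i:]

def wake_all_shared(queue):
    """Proposed heuristic: wake every 'S'; exclusive waiters keep their order."""
    if queue and queue[0] == "X":
        return [queue[0]], queue[1:]
    return [m for m in queue if m == "S"], [m for m in queue if m == "X"]

q = list("SSXSXSXS")   # (HEAD) S<-S<-X<-S<-X<-S<-X<-S

assert wake_adjacent_shared(q) == (["S", "S"], list("XSXSXS"))
assert wake_all_shared(q) == (["S"] * 5, ["X"] * 3)
```

Under the proposal the queue collapses from eight waiters to three exclusive waiters in one release, which is the queue-length reduction the post is after.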
Re: [HACKERS] Quick idea for reducing VACUUM contention
Jim Nasby wrote:
> On Jul 27, 2007, at 1:49 AM, Alvaro Herrera wrote:
>> ITAGAKI Takahiro wrote:
>>> "Simon Riggs" <[EMAIL PROTECTED]> wrote:
>>> Read the heap blocks in sequence, but make a conditional lock for cleanup on each block. If we don't get it, sleep, then try again when we wake up. If we fail the second time, just skip the block completely.
>>
>> It would be cool if we could do something like sweep a range of pages, initiate I/O for those that are not in shared buffers, and while that is running, lock and clean up the ones that are in shared buffers, skipping those that are not lockable right away; when that's done, go back to those buffers that were gotten from I/O and clean those up. And retry the locking for those that couldn't be locked the first time around, also conditionally. And when that's all done, a third pass could get those blocks that weren't cleaned up in any of the previous passes (and this time the lock would not be conditional).
>
> Would that be substantially easier than just creating a bgreader?

I'm not sure about easier, but I'm not sure that the bgreader can do the same job. ISTM that the bgreader would be mostly in charge of reading in advance of backends, whereas what I'm proposing is mostly about finding the best spot for locking. It might turn out to be more trouble than it's worth, though, for sure. And in any case I'm not in a hurry to implement it.

In any case, I'm not so sure about skipping vacuuming a block if it's not lockable.

-- Alvaro Herrera http://www.amazon.com/gp/registry/CTMLCN8V17R4
"Crear es tan difícil como ser libre" (Elsa Triolet)

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [HACKERS] Machine available for community use
Hi,

On Wed, 2007-07-25 at 20:22 -0700, Joshua D. Drake wrote:
>> I'm checking into this, but it may take a few days to get an answer (particularly since I'm planning to take Friday through Monday off).
> Well, if we go RHEL, why not CentOS 5 and just call it good?

...because RHEL and CentOS are not really identical. They are just binary-compatible. RHEL has better performance than CentOS -- I guess it is the compiler options that Red Hat is using while compiling their RPMs.

I have performed a test using the OSDL test suite a few months ago on a system that has:

* 8 x86_64 CPUs @ 3200.263
* 16 Gigabytes of RAM
* PostgreSQL 8.1.5 (PGDG packages)

and RHEL performed much better than CentOS.

Regards,
-- Devrim GÜNDÜZ
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/

signature.asc Description: This is a digitally signed message part
Re: [HACKERS] default_text_search_config and expression indexes
Oleg Bartunov wrote:
> > OK, here is what I am thinking. If we make default_text_search_config super-user-only, then the user can't do SET (using "zero_damaged_pages" as a superuser-only example):
> >
> > test=> set zero_damaged_pages = on;
> > ERROR: permission denied to set parameter "zero_damaged_pages"
> >
> > test=> alter user guest set zero_damaged_pages = on;
> > ERROR: permission denied to set parameter "zero_damaged_pages"
> >
> > but the super-user can set it in postgresql.conf, or:
> >
> > test=# alter user guest set zero_damaged_pages = on;
> > ALTER ROLE
> >
> > or
> >
> > test=# alter database vendor3 set zero_damaged_pages = on;
> > ALTER DATABASE
> >
> > meaning while it will be super-user-only, the administrator can set the default for specific databases and users. Is that the best approach?
> >
> > A user can still override the default by specifying the configuration in the function call.
>
> This is ok, but it will not work in a hosting environment and still doesn't prevent errors.

Agreed. Super-user-only now seems strange to me because it isn't a security issue, but rather an attempt to avoid people causing errors. The fundamental issue is that if you do a query using tsvector and tsquery, everything will work fine, because default_text_search_config will be the same for both. The problem is if you do an expression index lookup that doesn't specify the configuration name and your default_text_search_config doesn't match the index, or you INSERT or UPDATE into an expression index with a mismatched default_text_search_config.

If we do make default_text_search_config super-user-only, it prevents a database owner from doing ALTER DATABASE db1 SET default_text_search_config = 'english', which seems like a pretty big limitation, because I think per-database default_text_search_config makes the most sense.
And, again, if you specify the configuration in the expression index you have to specify it in the WHERE clause and then default_text_search_config is pretty useless. If we required the configuration to always be specified, you could still store multiple configurations in the same column by having a secondary column hold the configuration name: CREATE INDEX i on x USING gist (to_tsvector(config_col, body)); -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Export convert_to_scalar
On Mon, Jul 30, 2007 at 01:04:35PM +0100, Gregory Stark wrote: > > One of the ideas for pgadmin for the future will be a statistics table > visualizer which amongst other things could chart the histograms so people can > see if they represent the distribution of data they expect. > > To do this pgadmin would need access to convert_to_scalar so it understands > where to plot values from the histogram (and because where to place them is > the only way to tell how tall to make the bars). > > As the comments indicate convert_to_scalar is a bit of a hack and we may want > to reimplement it some day but there's no particular need to do so for this > task. > > There are two options here, we can either provide a user-accessible catalog > function to access convert_to_scalar in pg_proc, or we can just export > convert_to_scalar and allow pgadmin's adminpack to define the corresponding > function. > > I've actually already done both so I could play with it, I can post whichever > people prefer or both to patches. It would be good to get one or the other > done in 8.3 so that if I or anyone else implements the charting with for > pgadmin they can support 8.3 instead of having to wait until 8.4. Unless there are very good arguments for not doing it, I'd very much like to see it in core rather than adminpack. Most people don't install adminpack, really. And it doesn't play too well in a hosted scenario. So I'd rather see more functionality moved from adminpack into core than the other way around.. //Magnus ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] Export convert_to_scalar
One of the ideas for pgadmin for the future will be a statistics table visualizer which, amongst other things, could chart the histograms so people can see if they represent the distribution of data they expect.

To do this pgadmin would need access to convert_to_scalar so it understands where to plot values from the histogram (and because where to place them is the only way to tell how tall to make the bars).

As the comments indicate, convert_to_scalar is a bit of a hack and we may want to reimplement it some day, but there's no particular need to do so for this task.

There are two options here: we can either provide a user-accessible catalog function to access convert_to_scalar in pg_proc, or we can just export convert_to_scalar and allow pgadmin's adminpack to define the corresponding function.

I've actually already done both so I could play with it; I can post whichever people prefer, or both, to patches. It would be good to get one or the other done in 8.3 so that if I or anyone else implements the charting for pgadmin they can support 8.3 instead of having to wait until 8.4.

-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
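For readers unfamiliar with what the charting client would do with it: conceptually, convert_to_scalar maps a value of some indexable type onto a numeric position between two boundary values, which is exactly what a histogram charter needs for its x axis. A toy Python sketch of that idea (this is not the real C implementation, whose interface and type coverage differ; the helper below only handles numbers and datetimes):

```python
from datetime import datetime

def convert_to_scalar(value, lo, hi):
    """Toy analogue of the planner's convert_to_scalar(): map a value
    onto a 0..1 position between two histogram bounds.  Assumes the
    inputs reduce to floats (numbers or datetimes only, for brevity)."""
    def as_float(v):
        return v.timestamp() if isinstance(v, datetime) else float(v)
    v, l, h = as_float(value), as_float(lo), as_float(hi)
    if h == l:
        return 0.5  # degenerate range: place the value in the middle
    return (v - l) / (h - l)

# Given histogram bounds (as pg_stats.histogram_bounds would supply),
# a client like pgadmin could place equal-frequency buckets on an axis:
bounds = [0, 10, 15, 40, 100]
positions = [convert_to_scalar(b, bounds[0], bounds[-1]) for b in bounds]
```

Since each bucket holds the same number of rows, narrow gaps between adjacent positions mean dense regions of the data, which is what the bar heights would visualize.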
Re: [HACKERS] pipe chunking vs Windows
Magnus Hagander wrote:
>> However, the problem is that in backporting it we'd make a slight
>> behaviour change - the log file just gets LF instead of CRLF line
>> endings. I'm inclined to say that's a better result than living with
>> the bug, though.
>
> Can't we add back the CRLF combo when writing the file out from the
> syslogger process?

Yes, you're right. I can do this by setting text mode on the syslogFile. I'll have a patch for review shortly.

Note: this is behaviour we definitely won't want for CSV logs, because it will translate embedded LF as well as LF at the end of a log line.

This whole little mess chewed up all the time I wanted to spend yesterday making progress on CSV logs :-(

cheers

andrew

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq
Re: [HACKERS] pipe chunking vs Windows
On Sun, Jul 29, 2007 at 07:43:34PM -0400, Andrew Dunstan wrote:
> Andreas Pflug wrote:
>> Andrew Dunstan wrote:
>>> I have no idea why that's done - it goes back to the origins of the
>>> syslogger - probably because someone mistakenly thinks all Windows
>>> text files have to have CRLF line endings.
>>>
>>> I tried changing that to _O_BINARY, and calling _setmode on both the
>>> pipe before it's duped into stderr and stderr after the dup and
>>> both. Nothing seemed to work.
>>
>> AFAIR the flag has to be set again in each child process.
>
> OK, this was the bit of info I was missing. I put a call in
> SubPostmasterMain() like this:
>
>     _setmode(fileno(stderr), _O_BINARY);
>
> and one in the syslogger code the same, just after the pipe is dup'ed
> into stderr.
>
> After that I saw no more corruption issues.

Seems right - for anything EXEC_BACKEND, if you change a state from the default it won't be inherited, so you need to explicitly add it in the child.

> However, the problem is that in backporting it we'd make a slight
> behaviour change - the log file just gets LF instead of CRLF line
> endings. I'm inclined to say that's a better result than living with
> the bug, though.

Can't we add back the CRLF combo when writing the file out from the syslogger process?

//Magnus

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match
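The corruption mechanism under discussion is easy to demonstrate outside Windows: any length-prefixed chunk protocol breaks once a text-mode layer rewrites LF as CRLF underneath it. A toy Python illustration (the framing below is made up for the demo; the real syslogger chunk header is different):

```python
import struct

def make_chunk(payload: bytes) -> bytes:
    # Made-up framing: 2-byte big-endian length, then the payload.
    # (The real syslogger protocol differs; this only shows the bug.)
    return struct.pack(">H", len(payload)) + payload

def text_mode_write(data: bytes) -> bytes:
    # What a Windows text-mode stream does on output: LF -> CRLF.
    return data.replace(b"\n", b"\r\n")

def read_chunk(stream: bytes) -> bytes:
    (length,) = struct.unpack(">H", stream[:2])
    return stream[2:2 + length]

msg = b"line one\nline two"
chunk = make_chunk(msg)

# In binary mode the chunk round-trips cleanly...
assert read_chunk(chunk) == msg
# ...but text mode inserts CRs after the length was computed, so the
# reader pulls the wrong bytes and everything after is misaligned.
assert read_chunk(text_mode_write(chunk)) != msg
```

This is why the fix is to force _O_BINARY on the pipe in every (EXEC_BACKEND) child, and why reintroducing CRLF, if wanted, must happen only when the syslogger finally writes the reassembled lines to the log file.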
Re: [HACKERS] pipe chunking vs Windows
On Sun, Jul 29, 2007 at 06:31:04PM -0400, Andrew Dunstan wrote:
> Andreas Pflug wrote:
>>> I have no idea why that's done - it goes back to the origins of the
>>> syslogger - probably because someone mistakenly thinks all Windows
>>> text files have to have CRLF line endings.
>>
>> Yes this was intentional, notepad still doesn't like LF line endings.
>> Not my preferred text viewer, but the only one that's always
>> available.
>
> Not for Wordpad though, and it's pretty universal too. And Notepad
> won't load a file of any great size anyway. Furthermore, we just can't
> have

Just for the record, that hasn't been true for a long time. Notepad opens large files just fine (well, "fine" is a matter of definition, since it needs to load the whole file in RAM, which is interesting when you open gigabyte-size files. But IIRC, so does Wordpad). And Notepad doesn't carry the risk of people saving the file in Word format (yes, this happens all too frequently from what I've seen).

//Magnus

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster