Re: [HACKERS] Machine available for community use
Tom Lane wrote:

FWIW, it's looking like Red Hat will donate a RHEL/RHN subscription if we want one, though I don't have final approval quite yet.

One possible point favoring the use of CentOS over RHEL: it's a little easier for community members to reproduce or test any findings, i.e. you don't have to get a RHEL sub!

Cheers

Mark

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] HOT patch - version 11
Hi All,

Please see version 11 of the HOT patch posted on -patches. The concept of marking pruned tuples with LP_DELETE and reusing such tuples for subsequent UPDATEs has been removed and replaced with a simpler mechanism that repairs page fragmentation as and when possible. The logic of chain pruning has been simplified a lot. In addition, there are fewer new WAL log records; we instead reuse the existing WAL record types to log the operations.

A few 4-hour DBT2 runs have confirmed that this simplification hasn't taken away any performance gains; rather, we are seeing better performance with the changes. The gain can be attributed to the fact that more HOT updates are now possible even if the tuple length changes between updates (since we do complete page defragmentation).

The changes are mostly isolated to the above area, apart from some stray bug fixes.

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com
[HACKERS] Lock and Waiters
I have a question on locks and waiting. The README at pgsql/src/backend/storage/lmgr/README says:

Each waiter is awoken if (a) its request does not conflict with already-granted locks, and (b) its request does not conflict with the requests of prior un-wakable waiters.

Suppose there is a process P which is holding a lock, and there are individual waiters p1 p2 p3 p4 p5 p6 requiring the same lock. Since they all conflict with P, a wait queue will build up in the order p1 p2 p3 p4 p5 p6.

Now suppose P releases its lock. By rule (a), p1 will certainly wake up. What is the status of p2? It was in conflict with P, so should we say that it will not wake up? The same goes for p2 ... p6. Under what circumstances will p2 also be woken up, given that the lock held by P has been released?

Secondly, if p2 is not woken up and p3's lock doesn't conflict with (P and p2), will p3 move ahead of p2 by rule (b)?

Thanks,
Kenneth
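The wake-up rule quoted from the README can be sketched in a few lines. This is a minimal illustration in Python, not PostgreSQL's actual code: the function name `wake_waiters`, the data layout, and the restriction to four table-level lock modes are all assumptions made for the example. The one design point it tries to show is that a lock granted to an earlier waiter counts as "already granted" when later waiters are checked, and that a waiter left asleep blocks later waiters whose requests conflict with its own.

```python
# Conflict sets for four of PostgreSQL's table-level lock modes,
# restricted to conflicts among these four (illustrative subset).
CONFLICTS = {
    "AccessShare": {"AccessExclusive"},
    "RowExclusive": {"Share", "AccessExclusive"},
    "Share": {"RowExclusive", "AccessExclusive"},
    "AccessExclusive": {"AccessShare", "RowExclusive", "Share",
                        "AccessExclusive"},
}


def conflicts_with_any(mode, others):
    """True if `mode` conflicts with at least one mode in `others`."""
    return any(other in CONFLICTS[mode] for other in others)


def wake_waiters(granted, wait_queue):
    """Scan the wait queue front to back; return the names that wake up.

    granted    -- lock modes still held after the release (hypothetical input)
    wait_queue -- list of (name, requested_mode) in queue order
    """
    granted = list(granted)   # copy: newly granted locks are appended
    ahead_requests = []       # modes requested by waiters left asleep
    woken = []
    for name, mode in wait_queue:
        if (not conflicts_with_any(mode, granted)             # rule (a)
                and not conflicts_with_any(mode, ahead_requests)):  # rule (b)
            granted.append(mode)  # later waiters now see this lock as held
            woken.append(name)
        else:
            ahead_requests.append(mode)
    return woken


# P has released; every waiter wants the same self-conflicting lock.
queue = [(f"p{i}", "AccessExclusive") for i in range(1, 7)]
print(wake_waiters([], queue))

# Another holder keeps RowExclusive; p2 wants Share, p3 wants AccessShare.
print(wake_waiters(["RowExclusive"], [("p2", "Share"), ("p3", "AccessShare")]))
```

Under these assumptions, the first call wakes only p1: once p1's request is granted, p2 conflicts with p1's lock rather than with P's. The second call wakes p3 but not p2, because p3's request conflicts neither with the granted lock nor with p2's pending request.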
[HACKERS] How do I connect postgres table structures and view structures to an existing svn repository?
Hi, How do I connect postgres table structures and view structures to an existing svn repository? Thanks, -- John J. Mitchell
Re: [HACKERS] How do I connect postgres table structures and view structures to an existing svn repository?
On Wednesday, 1 August 2007 14:52, John Mitchell wrote:

How do I connect postgres table structures and view structures to an existing svn repository?

This question does not relate to the development of PostgreSQL. Please ask on a different mailing list.

-- 
Peter Eisentraut http://developer.postgresql.org/~petere/
Re: [HACKERS] Machine available for community use
Let us know when/if, and we'll pay Command Prompt to install the base OS on the system. All that we're waiting on at this point is the final decision on the OS.

Gavin

On 7/31/07, Tom Lane wrote:

Josh Berkus writes:

Hey, this is looking like a serious case of Bike Shedding. That is, a dozen people are arguing about what color to paint the bike shed instead of getting it built.[1]

FWIW, it's looking like Red Hat will donate a RHEL/RHN subscription if we want one, though I don't have final approval quite yet.

regards, tom lane
Re: [HACKERS] GIT patch
Alvaro Herrera wrote:

I've started reading the GIT patch to see if I can help with the review.

Thanks.

First thing I notice is that there are several things that seem left over; for example the comments in pg_proc for the new functions are incomplete. ... I'm also finding a certain lack of code commentary that makes the reviewing a bit harder.

Sorry about that. As the patch stands, I tried to keep it as non-invasive as possible, with minimum changes to existing APIs. That's because in the winter we were discussing changes to the indexam API to support the bitmap index am, and also GIT. I wanted to just have a patch to do performance testing with, without getting into the API changes. I've been reluctant to spend time cleaning up the code and comments, knowing that that's going to change a lot, depending on what kind of an API we settle on and what capabilities we're going to have in the executor. And also because there was no acceptance of even the general design, so it might just be rejected.

Please read the discussions on the thread "bitmapscan changes":

http://archives.postgresql.org/pgsql-patches/2007-03/msg00163.php

There are basically three slightly different alternative designs:

1. A grouped index tuple contains a bitmap of offsetnumbers, representing a bunch of heap tuples stored on the same heap page, that all have a key between the key stored on the index tuple and the next index tuple. We don't keep track of the ordering of the heap tuples represented by one group index tuple. When doing a normal, non-bitmap, index scan, they need to be sorted. This is what the patch currently implements.

2. Same as 1, but instead of storing the offsetnumbers in a bitmap, they're sorted in a list (variable-sized array, really), which keeps the ordering between the tuples intact. No sorting needed on index scans, and we can do binary searches using the list. But it takes more space than a bitmap.

3. Same as 1, but mandate that all the heap tuples that are represented by the same grouped index tuple must be in index order on the heap page. If an out-of-order tuple is inserted, we need to split the grouped index tuple into two groups, to uphold that invariant. No sorting needed on index scans and we can do binary searches. But it takes more space when the heap is not perfectly in order, and makes the index degrade into a normal b-tree more quickly when the table is updated.

I'm leaning towards 2 or 3 myself at the moment, to keep it simple. In any case.

-- 
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
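The difference between designs 1 and 2 above can be sketched with a toy model. This is only an illustration in Python, not code from the GIT patch: `heap_page`, `bitmap_add`, `bitmap_scan` and `list_add` are hypothetical names, and a heap page is modeled as a plain dict from offset number to key. Design 3 is not covered.

```python
import bisect

# Hypothetical heap page: offset number -> key stored in that heap tuple.
heap_page = {1: 40, 2: 10, 3: 30, 4: 20}


# Design 1: the group is a bitmap; bit n set means offset n belongs to it.
def bitmap_add(bitmap, offset):
    """Return the bitmap with `offset` added; insertion order is lost."""
    return bitmap | (1 << offset)


def bitmap_scan(bitmap, page):
    """Return the group's offsets in key order.

    The bitmap records membership only, so an ordered index scan has to
    look up each key and sort.
    """
    offsets = [off for off in page if bitmap & (1 << off)]
    return sorted(offsets, key=lambda off: page[off])


# Design 2: the group is a list of offsets kept in key order.
def list_add(offsets, offset, page):
    """Return a new list with `offset` inserted at its key-order position."""
    keys = [page[o] for o in offsets]
    pos = bisect.bisect_right(keys, page[offset])
    return offsets[:pos] + [offset] + offsets[pos:]


bitmap = 0
ordered = []
for off in (1, 2, 3, 4):
    bitmap = bitmap_add(bitmap, off)
    ordered = list_add(ordered, off, heap_page)

print(bitmap_scan(bitmap, heap_page))  # sorted at scan time
print(ordered)                         # already in key order
```

Both calls yield the same key-ordered offsets; the trade-off in this model is that the bitmap pays for a sort at scan time, while the list pays for ordered insertion and a larger representation.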
Re: [HACKERS] stats_block_level
On Jul 29, 2007, at 9:37 AM, Tom Lane wrote:

Erik Jones writes:

improvement that went into that release. I could test turning it back on this week if you like -- I certainly would like to have my blks_read/cache_hits stats back. Toggling stats_block_level will respond to a reload, yes?

Yes, as long as you had stats_start_collector on at startup.

regards, tom lane

OK, we finally had a day with both our sysadmin and me in the office. We flipped stats_block_level back on and saw no noticeable change in our iostats. We're going to leave it on, and if anything starts to go crazy in the next couple of days I'll be sure to let you know.

Erik Jones
Software Developer | Emma®
Re: [HACKERS] default_text_search_config and expression indexes
Bruce Momjian wrote:
Oleg Bartunov wrote:

What is the basis of your assumption? In my opinion, it's a very limited use of text search, because it doesn't support ranking. In 4-5 years of tsearch2 usage I never used it and never saw it on the mailing lists. This is a very user-oriented feature and we could probably ask -general people for their opinion.

I think I asked about this kind of usage a couple of years back, and Oleg pointed out other reasons why it wasn't as good an idea too.

http://archives.postgresql.org/pgsql-general/2005-10/msg00475.php
http://archives.postgresql.org/pgsql-general/2005-10/msg00477.php

The particular question I had asked was why the functional index was slower than maintaining the extra column; the explanation was that the lossy index having to call the function (including parsing, dictionary lookup, etc.) to re-check the data made it inadvisable to avoid the extra column anyway.

I doubt 'general' is going to understand the details of merging this into the backend. I assume we have enough people on hackers to decide this.

Are you saying the majority of users have a separate column with a trigger?

I think so. At least when I was using it in 2005, the second column with the trigger was faster than using a functional index.

We need more feedback from users.

Well, I am waiting for other hackers to get involved, but if they don't, I have to evaluate it myself on the email lists.

Personally, I think documentation changes would be an OK way to handle it. Something that makes it extremely clear to the user the advantages of having the extra column and the risks of avoiding it.
Re: [HACKERS] default_text_search_config and expression indexes
Oleg Bartunov wrote:
On Tue, 31 Jul 2007, Bruce Momjian wrote:
Oleg Bartunov wrote:
On Tue, 31 Jul 2007, Bruce Momjian wrote:

And if we have to require the configuration name in CREATE INDEX, it has to be used in WHERE, so we might as well just remove the default capability and always require the configuration name.

This is a very rare use case for text searching:
1. an expression index without a configuration name
2. default_text_search_config can be changed by somebody

If you are going to be using the configuration name with the create expression index, you have to use it in the WHERE clause (or the index doesn't work), and I assume that is 90% of the text search uses. I don't see it as rare at all.

What is the basis of your assumption? In my opinion, it's a very limited use of text search, because it doesn't support ranking. In 4-5 years of tsearch2 usage I never used it and never saw it on the mailing lists. This is a very user-oriented feature and we could probably ask -general people for their opinion.

I doubt 'general' is going to understand the details of merging this into the backend. I assume we have enough people on hackers to decide this.

I don't mean the technical details, but the use case. Do they need an expression index without ranking, at the cost of the ability to use the default configuration in other cases too? My prediction is that people never thought about this possibility until we told them about it.

In a choice between expression indexes and default_text_search_config, there is no question in my mind that expression indexes are more useful. Lack of default_text_search_config only means you have to specify the configuration name every time, and can't do casting to a text search data type.

Are you saying the majority of users have a separate column with a trigger? Does the trigger specify the configuration? I don't see that as a parameter argument to tsvector_update_trigger(). If you reload a pg_dump, what does it use for the configuration?
Yes, a separate column with a custom trigger works fine. It's up to you how to keep your data current, and it's up to you how to write the trigger. Our tsvector_update_trigger() is a tsvector_update_trigger_example()!

Well, that is the major problem --- that this is very error-prone, especially considering that the tsvector_update_trigger() doesn't get it right either.

Why is a separate column better than the index? Just ranking?

Ranking + composite documents. I already mentioned that this could be rather expensive. Also, having a separate column allows people various ways to say what a document is, and even to change it.

OK, I am confused why an expression index can't use those features if a separate column can. I realize the index can't store that information, but why can the code pick it out of a heap column but not run the function on the heap row to get that information? I assume it is something that is just hard to implement.

The reason the expression index is nice is that this feature has to be easy to use for people who are new to full text and even PostgreSQL. Right now /contrib is fine for experts to use, but we want a larger user base for this feature.

I agree here. This was one of the main reasons for our work for 8.3. Probably we should think in another direction: not to curtail tsearch2 and confuse its rather large existing user base, but to add the ability to somehow save the configuration used to create a *document*, either implicitly (in an expression index, or just gin(text_column)) or explicitly (a separate column). There is no problem with the index itself!

Agreed. We need to find a way to save the configuration when the output of a text search function is stored, either in an expression index or via a trigger into a separate column, but only if we allow the default configuration to be changed by non-super-users.

Should we hold the patch for 8.4?

If we don't agree to say in the docs that implicit usage of the text search configuration in the CREATE INDEX command isn't supported.
Could we leave default_text_search_config for super-users, at least? Anyway, let's wait and see what other people say.

The big problem is that not many people have taken the time to fully understand how full text search works. I hoped that putting the updated documentation online would help:

http://momjian.us/expire/fulltext/HTML/textsearch.html

but it seems it hasn't.

What we could do is make default_text_search_config super-user-only and tell users at the start that if default_text_search_config doesn't match the language they want to use, then they have to read a documentation section that explains the problem of configuration mismatches.

The problem with that is that we should be setting default_text_search_config in the pg_dump output, like we do for client_encoding, but because it is super-user-only, it will fail for non-super-user restores.

So, I am back to thinking
Re: [HACKERS] default_text_search_config and expression indexes
Ron Mayer wrote:
Bruce Momjian wrote:
Oleg Bartunov wrote:

What is the basis of your assumption? In my opinion, it's a very limited use of text search, because it doesn't support ranking. In 4-5 years of tsearch2 usage I never used it and never saw it on the mailing lists. This is a very user-oriented feature and we could probably ask -general people for their opinion.

I think I asked about this kind of usage a couple of years back, and Oleg pointed out other reasons why it wasn't as good an idea too.

http://archives.postgresql.org/pgsql-general/2005-10/msg00475.php
http://archives.postgresql.org/pgsql-general/2005-10/msg00477.php

The particular question I had asked was why the functional index was slower than maintaining the extra column; the explanation was that the lossy index having to call the function (including parsing, dictionary lookup, etc.) to re-check the data made it inadvisable to avoid the extra column anyway.

I doubt 'general' is going to understand the details of merging this into the backend. I assume we have enough people on hackers to decide this.

Are you saying the majority of users have a separate column with a trigger?

I think so. At least when I was using it in 2005, the second column with the trigger was faster than using a functional index.

OK, it is good you measured it. I wonder how GIN would behave because it is not lossy.

We need more feedback from users.

Well, I am waiting for other hackers to get involved, but if they don't, I have to evaluate it myself on the email lists.

Personally, I think documentation changes would be an OK way to handle it. Something that makes it extremely clear to the user the advantages of having the extra column and the risks of avoiding it.

Sure, but you have to make sure you use the right configuration in the trigger, no? Does the tsquery have to use the same configuration?
-- 
Bruce Momjian http://momjian.us
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] default_text_search_config and expression indexes
Bruce Momjian wrote:
Ron Mayer wrote:
Bruce Momjian wrote:
Oleg Bartunov wrote:

What is the basis of your assumption?

I think I asked about this kind of usage a couple of years back;...

http://archives.postgresql.org/pgsql-general/2005-10/msg00475.php
http://archives.postgresql.org/pgsql-general/2005-10/msg00477.php

...why the functional index was slower than maintaining the extra column; with the explanation that the lossy index having to call the function (including parsing, dictionary lookup, etc) for re-checking the data ...

... Are you saying the majority of users have a separate column with a trigger?

I think so. At least when I was using it in 2005, the second column with the trigger was faster than using a functional index.

OK, it is good you measured it. I wonder how GIN would behave because it is not lossy.

Too bad I don't have the same database around anymore. It seems the re-parsing for re-checking with the lossy index was very expensive, though. In the end, I suspect it depends greatly on what fraction of rows match.

We need more feedback from users.

Well, I am waiting for other hackers to get involved, but if they don't, I have to evaluate it myself on the email lists.

Personally, I think documentation changes would be an OK way to handle it. Something that makes it extremely clear to the user the advantages of having the extra column and the risks of avoiding it.

Sure, but you have to make sure you use the right configuration in the trigger, no? Does the tsquery have to use the same configuration?

I wish I knew this myself. :-) Whatever I had done happened to work, but that was largely through people on IRC walking me through it.
Re: [HACKERS] default_text_search_config and expression indexes
Ron Mayer wrote:

We need more feedback from users.

Well, I am waiting for other hackers to get involved, but if they don't, I have to evaluate it myself on the email lists.

Personally, I think documentation changes would be an OK way to handle it. Something that makes it extremely clear to the user the advantages of having the extra column and the risks of avoiding it.

Sure, but you have to make sure you use the right configuration in the trigger, no? Does the tsquery have to use the same configuration?

I wish I knew this myself. :-) Whatever I had done happened to work, but that was largely through people on IRC walking me through it.

This illustrates the major issue --- that this has to be simple for people to get started, while keeping the capabilities for experienced users. I am now thinking that making users always specify the configuration name and not allowing :: casting is going to be the best approach. We can always add more in 8.4 after it is in wide use.

-- 
Bruce Momjian http://momjian.us
EnterpriseDB http://www.enterprisedb.com
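The configuration-mismatch risk discussed in this thread can be shown with a toy model. The following is a sketch in Python under loud assumptions: it does not use PostgreSQL or tsearch2 at all, and `english_like`, `simple_like`, `build_index` and `search` are hypothetical stand-ins for "a stemming configuration", "a non-stemming configuration", "what gets stored in the index or extra column", and "the query". It only illustrates why the stored document and the query need to be normalized the same way.

```python
STOP_WORDS = {"the", "a", "of"}


def english_like(text):
    """Toy stemming configuration: lowercase, drop stop words,
    strip one trailing 's'. Not a real stemmer."""
    words = (w.lower() for w in text.split())
    return {w[:-1] if w.endswith("s") else w
            for w in words if w not in STOP_WORDS}


def simple_like(text):
    """Toy non-stemming configuration: lowercase only."""
    return {w.lower() for w in text.split()}


CONFIGS = {"english_like": english_like, "simple_like": simple_like}


def build_index(docs, config):
    """Map each document id to its set of normalized terms."""
    return {doc_id: CONFIGS[config](text) for doc_id, text in docs.items()}


def search(index, query, config):
    """Return ids of documents that contain every normalized query term."""
    terms = CONFIGS[config](query)
    return sorted(doc_id for doc_id, doc_terms in index.items()
                  if terms <= doc_terms)


docs = {1: "The indexes of PostgreSQL", 2: "A simple trigger"}
index = build_index(docs, "english_like")

print(search(index, "indexes", "english_like"))  # same configuration
print(search(index, "indexes", "simple_like"))   # mismatched configuration
```

In this model the first search finds document 1, while the second finds nothing, because the stored terms were stemmed and the query term was not. Naming the configuration explicitly on both sides, as proposed above, removes the dependence on whatever default happens to be in effect.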
Re: [HACKERS] GIT patch
Heikki Linnakangas wrote:
Alvaro Herrera wrote:

I've started reading the GIT patch to see if I can help with the review.

As the patch stands, I tried to keep it as non-invasive as possible, with minimum changes to existing APIs. That's because in the winter we were discussing changes to the indexam API to support the bitmap index am, and also GIT. I wanted to just have a patch to do performance testing with, without getting into the API changes.

Hmm, doesn't it seem like the lack of feedback and the failed bitmap patch played against final development of this patch?

At this point I feel like the patch still needs some work and reshuffling before it is in an acceptable state. The fact that there are some API changes for which the patch needs to be adjusted makes me feel like we should put this patch on hold for 8.4. So we would first get the API changes discussed and done, and then adapt this patch to them.

Of the three proposals you suggest, I think the first one

1. A grouped index tuple contains a bitmap of offsetnumbers, representing a bunch of heap tuples stored on the same heap page, that all have a key between the key stored on the index tuple and the next index tuple. We don't keep track of the ordering of the heap tuples represented by one group index tuple. When doing a normal, non-bitmap, index scan, they need to be sorted. This is what the patch currently implements.

makes the most sense -- the index is kept simple and fast, and doing the sorting during an indexscan seems a perfectly acceptable compromise, knowing that the number of tuples that can be returned for sorting is limited by the heap block size.

-- 
Alvaro Herrera http://www.amazon.com/gp/registry/5ZYLFMCVHXC
Re: [HACKERS] default_text_search_config and expression indexes
Bruce Momjian wrote:
Ron Mayer wrote:

I wish I knew this myself. :-) Whatever I had done happened to work, but that was largely through people on IRC walking me through it.

This illustrates the major issue --- that this has to be simple for people to get started, while keeping the capabilities for experienced users. I am now thinking that making users always specify the configuration name and not allowing :: casting is going to be the best approach. We can always add more in 8.4 after it is in wide use.

That's fair. Either the docs need to make it totally obvious or the software should force people to do something safe.