Re: [HACKERS] Machine available for community use

2007-08-01 Thread Mark Kirkwood

Tom Lane wrote:


FWIW, it's looking like Red Hat will donate a RHEL/RHN subscription if
we want one, though I don't have final approval quite yet.


One possible point favoring the use of CentOS over RHEL - it's a little
easier for community members to reproduce or test any findings... i.e.
you don't have to get a RHEL sub!


Cheers

Mark

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[HACKERS] HOT patch - version 11

2007-08-01 Thread Pavan Deolasee
Hi All,

Please see version 11 of the HOT patch posted on -patches.

The concept of marking the pruned tuples with LP_DELETE and
reusing such tuples for subsequent UPDATEs has been removed
and replaced with a simpler mechanism of repairing the page
fragmentation as and when possible.

The logic of chain pruning has been simplified a lot. In addition, there
are fewer new WAL records. Instead, we reuse the existing WAL
record types to log the operations.

A few 4-hour DBT2 runs have confirmed that this simplification hasn't
taken away any performance gains; rather, we are seeing better performance
with the changes. The gain can be attributed to the fact that more
HOT updates are now possible even if the tuple length changes between
updates (since we do complete page defragmentation).
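
The effect of defragmentation on longer tuples can be pictured with a
toy model. This is only a sketch in Python, not code from the patch: the
function names and byte counts are made up, and it ignores line pointers,
alignment and page headers.

```python
def fits_by_slot_reuse(dead_slot_sizes, new_len):
    """Earlier idea: reuse the space of one pruned tuple as-is."""
    return any(size >= new_len for size in dead_slot_sizes)


def fits_after_defrag(dead_slot_sizes, free_tail, new_len):
    """Current idea: compact the page so reclaimed space is contiguous."""
    return sum(dead_slot_sizes) + free_tail >= new_len


# Three pruned tuple versions of 60 bytes each (hypothetical sizes),
# and an UPDATE that produces an 80-byte tuple.
dead = [60, 60, 60]
reuse_ok = fits_by_slot_reuse(dead, 80)      # no single slot is big enough
defrag_ok = fits_after_defrag(dead, 0, 80)   # 180 contiguous bytes are enough
print(reuse_ok, defrag_ok)
```

In this model a tuple that grows between updates never fits into a
reused slot, but it does fit once the holes are merged.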

The changes are mostly isolated in the above area apart from some
stray bug fixes.

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


[HACKERS] Lock and Waiters

2007-08-01 Thread kenneth d'souza



I have a question about locks and waiting. The README at
pgsql/src/backend/storage/lmgr/README says:

Each waiter is awoken if (a) its request does not conflict with
already-granted locks, and (b) its request does not conflict with the
requests of prior un-wakable waiters.

Imagine a process P that holds a lock, and individual waiters p1 p2 p3
p4 p5 p6 that require the same lock. Since they conflict with P, a wait
queue p1 p2 p3 p4 p5 p6 builds up. Now imagine that P releases its lock.
By rule (a), p1 will certainly wake up. What is the status of p2? It was
in conflict with P, so should we conclude that it will not wake up? The
same question applies to p2 ... p6.

Under what circumstances will p2 also be woken up, given that the lock
held by P has been released?
Secondly, if p2 is not woken up and p3's request doesn't conflict with
(P and p2), will p3 move ahead of p2 by rule (b)?

Thanks,
Kenneth
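
One way to read the README rule quoted in this message is as a single
pass over the wait queue. The sketch below is a toy model in Python, not
the lock manager's C code: the two lock modes, the conflict table and
the name wake_waiters are invented for illustration, and it assumes that
a woken waiter is granted its lock immediately.

```python
# Toy model of the wake-up rule; not PostgreSQL source code.
# Hypothetical lock modes: "S" (shared) and "X" (exclusive).
CONFLICTS = {("S", "X"), ("X", "S"), ("X", "X")}


def conflicts(a, b):
    return (a, b) in CONFLICTS


def wake_waiters(granted, queue):
    """Scan the wait queue once, front to back.

    granted: lock modes currently held.
    queue:   (name, requested_mode) pairs in arrival order.
    Returns (woken_names, still_waiting).
    """
    granted = list(granted)
    blocked_modes = []  # requests of earlier waiters that could not be woken
    woken, waiting = [], []
    for name, mode in queue:
        ok_a = not any(conflicts(mode, g) for g in granted)        # rule (a)
        ok_b = not any(conflicts(mode, b) for b in blocked_modes)  # rule (b)
        if ok_a and ok_b:
            granted.append(mode)  # assumption: waking up grants the lock
            woken.append(name)
        else:
            blocked_modes.append(mode)
            waiting.append((name, mode))
    return woken, waiting


# P has released its lock, so nothing is granted any more.
all_exclusive = [("p%d" % i, "X") for i in range(1, 7)]
print(wake_waiters([], all_exclusive))

# p1 and p3 want S, p2 wants X: p3 is compatible with p1 but not with p2.
mixed = [("p1", "S"), ("p2", "X"), ("p3", "S")]
print(wake_waiters([], mixed))
```

In this model p2 wakes together with p1 only when their requests are
compatible with each other, and p3 passes p2 only when p3's request
conflicts neither with the granted locks nor with p2's request.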



[HACKERS] How do I connect postgres table structures and view structures to an existing svn repository?

2007-08-01 Thread John Mitchell
Hi,

How do I connect postgres table structures and view structures to an
existing svn repository?

Thanks,

-- 
John J. Mitchell


Re: [HACKERS] How do I connect postgres table structures and view structures to an existing svn repository?

2007-08-01 Thread Peter Eisentraut
On Wednesday, 1 August 2007 14:52, John Mitchell wrote:
 How do I connect postgres table structures and view structures to an
 existing svn repository?

This question does not relate to the development of PostgreSQL.  Please ask on 
a different mailing list.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] Machine available for community use

2007-08-01 Thread Gavin M. Roy
Let us know when/if and we'll pay Command Prompt to install the base OS on
the system.  All that we're waiting on at this point is the final decision
on the OS.

Gavin

On 7/31/07, Tom Lane [EMAIL PROTECTED] wrote:

 Josh Berkus [EMAIL PROTECTED] writes:
  Hey, this is looking like a serious case of Bike Shedding.  That is, a
  dozen people are arguing about what color to paint the bike shed instead
  of getting it built.[1]

 FWIW, it's looking like Red Hat will donate a RHEL/RHN subscription if
 we want one, though I don't have final approval quite yet.

 regards, tom lane




Re: [HACKERS] GIT patch

2007-08-01 Thread Heikki Linnakangas
Alvaro Herrera wrote:
 I've started reading the GIT patch to see if I can help with the review.

Thanks.

 First thing I notice is that there are several things that seem left
 over; for example the comments in pg_proc for the new functions are
 incomplete.
 
 ...
 I'm also finding a certain lack of code commentary that makes the
 reviewing a bit harder.

Sorry about that.

As the patch stands, I tried to keep it as non-invasive as possible,
with minimum changes to existing APIs. That's because in the winter we
were discussing changes to the indexam API to support the bitmap index
am, and also GIT. I wanted to just have a patch to do performance
testing with, without getting into the API changes.

I've been reluctant to spend time to clean up the code and comments,
knowing that that's going to change a lot, depending on what kind of an
API we settle on and what capabilities we're going to have in the
executor. And also because there was no acceptance of even the general
design, so it might just be rejected. Please read the discussions on the
thread bitmapscan changes:

http://archives.postgresql.org/pgsql-patches/2007-03/msg00163.php

There are basically three slightly different alternative designs:

1. A grouped index tuple contains a bitmap of offsetnumbers,
representing a bunch of heap tuples stored on the same heap page, that
all have a key between the key stored on the index tuple and the next
index tuple. We don't keep track of the ordering of the heap tuples
represented by one group index tuple. When doing a normal, non-bitmap,
index scan, they need to be sorted. This is what the patch currently
implements.

2. Same as 1, but instead of storing the offsetnumbers in a bitmap,
they're sorted in a list (variable-sized array, really), which keeps the
ordering between the tuples intact. No sorting needed on index scans,
and we can do binary searches using the list. But takes more space than
a bitmap.

3. Same as 1, but mandate that all the heap tuples that are represented
by the same grouped index tuple must be in index order on the heap page.
If an out-of-order tuple is inserted, we need to split the grouped index
tuple into two groups, to uphold that invariant. No sorting needed on
index scans and we can do binary searches. But takes more space when the
heap is not perfectly in order and makes the index degrade into a
normal b-tree more quickly when the table is updated.

I'm leaning towards 2 or 3 myself at the moment, to keep it simple. In
any case.
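
To make the difference between designs 1 and 2 concrete, here is a toy
model in Python. It is only a sketch: the class names, the dict standing
in for a heap page and the integer keys are all invented for
illustration, while the patch itself is C and keeps these structures
inside b-tree pages.

```python
# Toy heap page: offset number -> key. Offsets follow insertion order,
# which need not match key order.
heap_page = {1: 40, 2: 10, 3: 30, 4: 20}


class BitmapGroup:
    """Design 1: one bit per offset number, no ordering kept."""

    def __init__(self):
        self.bits = 0

    def add(self, offset):
        self.bits |= 1 << offset

    def offsets(self):
        return [o for o in range(self.bits.bit_length())
                if (self.bits >> o) & 1]

    def scan_in_key_order(self, page):
        # An ordered index scan has to sort the group's tuples by key.
        return sorted(self.offsets(), key=lambda o: page[o])


class ListGroup:
    """Design 2: offsets kept in key order in a variable-sized list."""

    def __init__(self):
        self.offs = []

    def _position(self, key, page):
        # Binary search over the list, comparing keys fetched from the page.
        lo, hi = 0, len(self.offs)
        while lo < hi:
            mid = (lo + hi) // 2
            if page[self.offs[mid]] < key:
                lo = mid + 1
            else:
                hi = mid
        return lo

    def add(self, offset, page):
        self.offs.insert(self._position(page[offset], page), offset)

    def scan_in_key_order(self):
        return list(self.offs)  # already ordered, no sort needed

    def find(self, key, page):
        pos = self._position(key, page)
        if pos < len(self.offs) and page[self.offs[pos]] == key:
            return self.offs[pos]
        return None


bitmap, ordered = BitmapGroup(), ListGroup()
for off in heap_page:
    bitmap.add(off)
    ordered.add(off, heap_page)

print(bitmap.scan_in_key_order(heap_page))  # sorted at scan time
print(ordered.scan_in_key_order())          # order maintained at insert time
print(ordered.find(30, heap_page))
```

Both groups return the offsets in the same key order. The bitmap pays
for it with a sort on every ordered scan, the list with extra space and
work at insert time.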

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] stats_block_level

2007-08-01 Thread Erik Jones

On Jul 29, 2007, at 9:37 AM, Tom Lane wrote:


Erik Jones [EMAIL PROTECTED] writes:

improvement that went into that release.  I could test turning it
back on this week if you like -- I certainly would like to have my
blks_read/cach_hits stats back.  Toggling stats_block_level will
respond to a reload, yes?


Yes, as long as you had stats_start_collector on at startup.

regards, tom lane


Ok, we finally had a day with both our sysadmin and me in the office,
flipped stats_block_level back on, and noticed no noticeable change in
our iostats.  We're going to leave it on, and if anything starts to go
crazy in the next couple of days I'll be sure to let you know.


Erik Jones

Software Developer | Emma®
[EMAIL PROTECTED]
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com





Re: [HACKERS] default_text_search_config and expression indexes

2007-08-01 Thread Ron Mayer
Bruce Momjian wrote:
 Oleg Bartunov wrote:
 What is the basis of your assumption? In my opinion, it's a very limited
 use of text search, because it doesn't support ranking. In 4-5 years
 of tsearch2 usage I never used it and I never saw it on the mailing lists.
 This is a very user-oriented feature and we could probably ask
 -general people for their opinion.

I think I asked about this kind of usage a couple years back;
and Oleg pointed out other reasons why it wasn't as good an
idea too.

http://archives.postgresql.org/pgsql-general/2005-10/msg00475.php
http://archives.postgresql.org/pgsql-general/2005-10/msg00477.php

The particular question I had asked was why the functional index was
slower than maintaining the extra column; the explanation was
that the lossy index has to call the function (including
parsing, dictionary lookup, etc.) to re-check the data, which made
it inadvisable to avoid the extra column anyway.
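
The cost argument can be illustrated with a toy count of how often the
expensive step runs at query time. Everything in this Python sketch is
invented for illustration: to_vector stands in for the real parsing and
dictionary work, and the lossy index is modeled simply as a list of
candidate rows that still have to be re-checked.

```python
calls = {"n": 0}


def to_vector(text):
    """Stand-in for the expensive step (parsing, dictionary lookup, ...)."""
    calls["n"] += 1
    return frozenset(text.lower().split())


docs = ["Fat cats sat", "the cat ran", "dogs bark", "a fat rat"]

# Extra column: the vector is computed once when the row is written,
# which is the job a trigger would do.
stored = [to_vector(d) for d in docs]
calls["n"] = 0  # from here on, count query-time work only


def query_stored(word, candidates):
    # Re-check reads the stored vector.
    return [i for i in candidates if word in stored[i]]


def query_functional(word, candidates):
    # Re-check recomputes the vector for every candidate row.
    return [i for i in candidates if word in to_vector(docs[i])]


candidates = [0, 1, 3]  # rows a lossy index might hand back for "fat"

hits_stored = query_stored("fat", candidates)
calls_stored = calls["n"]
hits_functional = query_functional("fat", candidates)
calls_functional = calls["n"] - calls_stored

print(hits_stored, calls_stored)
print(hits_functional, calls_functional)
```

Both variants find the same rows, but only the functional variant pays
for one to_vector call per candidate, so its cost grows with the number
of rows the index returns for re-checking.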

 I doubt 'general' is going to understand the details of merging this
 into the backend.  I assume we have enough people on hackers to decide
 this.
 
 Are you saying the majority of users have a separate column with a
 trigger?

I think so.   At least when I was using it in 2005 the second
column with the trigger was faster than using a functional index.

 We need more feedback from users.
 
 Well, I am waiting for other hackers to get involved, but if they don't,
 I have to evaluate it myself on the email lists.

Personally, I think documentation changes would be an OK way
to handle it.   Something that makes it extremely clear to the
user the advantages of having the extra column and the risks
of avoiding it.



Re: [HACKERS] default_text_search_config and expression indexes

2007-08-01 Thread Bruce Momjian
Oleg Bartunov wrote:
 On Tue, 31 Jul 2007, Bruce Momjian wrote:
 
  Oleg Bartunov wrote:
  On Tue, 31 Jul 2007, Bruce Momjian wrote:
 
  And if we have to require the configuration name in CREATE INDEX, it has
  to be used in WHERE, so we might as well just remove the default
  capability and always require the configuration name.
 
  this is a very rare use case for text searching
  1. expression index without configuration name
  2. default_text_search_config can be changed by somebody
 
  If you are going to be using the configuration name with the create
  expression index, you have to use it in the WHERE clause (or the index
  doesn't work), and I assume that is 90% of the text search uses.  I
  don't see it as rare at all.
 
  What is the basis of your assumption? In my opinion, it's a very limited
  use of text search, because it doesn't support ranking. In 4-5 years
  of tsearch2 usage I never used it and I never saw it on the mailing lists.
  This is a very user-oriented feature and we could probably ask
  -general people for their opinion.
 
  I doubt 'general' is going to understand the details of merging this
  into the backend.  I assume we have enough people on hackers to decide
  this.
 
 I don't mean the technical details, but the use case. Do they need an
 expression index without ranking, and are they willing to sacrifice the
 ability to use the default configuration in other cases too? My prediction
 is that people never thought about this possibility until we told them
 about it.

In a choice between expression indexes and default_text_search_config,
there is no question in my mind that expression indexes are more useful.
Lack of default_text_search_config only means you have to specify the
configuration name every time, and can't do casting to a text search
data type.

  Are you saying the majority of users have a separate column with a
  trigger?  Does the trigger specify the configuration?  I don't see that
  as a parameter argument to tsvector_update_trigger().  If you reload a
  pg_dump, what does it use for the configuration?
 
 
 yes, a separate column with a custom trigger works fine. It's up to you how
 to keep your data up to date and it's up to you how to write the trigger.
 Our tsvector_update_trigger() is a tsvector_update_trigger_example() !

Well, that is the major problem --- that this is very error-prone,
especially considering that the tsvector_update_trigger() doesn't get it
right either.

  Why is a separate column better than the index?  Just ranking?
 
 ranking + composite documents. I already mentioned that this could be
 rather expensive. Also, having a separate column allows people various
 ways to say what a document is and even change it.

OK, I am confused why an expression index can't use those features if a
separate column can.  I realize the index can't store that information,
but why can the code pick it out of a heap column but not run the
function on the heap row to get that information?  I assume it is
something that is just hard to implement.

  The reason the expression index is nice is this feature has to be easy
  to use for people who are new to full text and even PostgreSQL.  Right
  now /contrib is fine for experts to use, but we want a larger user base
  for this feature.
 
 I agree here. This was one of the main reasons for our work for 8.3.
 Probably, we should think in another direction - not to curtail tsearch2
 and confuse the rather large existing user base, but to add the ability to
 somehow save the configuration used for creating the *document*
 either implicitly (in an expression index, or just gin(text_column)), or
 explicitly (separate column). There is no problem with the index itself !

Agreed.  We need to find a way to save the configuration when the output
of a text search function is stored, either in an expression index or
via a trigger into a separate column, but only if we allow the default
configuration to be changed by non-super-users.
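
The mismatch problem can be reproduced in miniature. The two
"configurations" in this Python sketch are made-up normalizers, not real
text search configurations: one strips a trailing "s", the other only
lowercases. The point is just that stored output is meaningful only
together with the configuration that produced it.

```python
def simple_cfg(word):
    """Hypothetical configuration: lowercase only."""
    return word.lower()


def stemming_cfg(word):
    """Hypothetical configuration: lowercase and strip a trailing 's'."""
    w = word.lower()
    return w[:-1] if w.endswith("s") else w


def to_vector(cfg, text):
    return {cfg(w) for w in text.split()}


def matches(vector, cfg, query_word):
    return cfg(query_word) in vector


doc = "Indexes store lexemes"

# Stored output (index entry or trigger-maintained column) built under
# one configuration.
vec = to_vector(stemming_cfg, doc)

same_cfg = matches(vec, stemming_cfg, "indexes")    # found
changed_cfg = matches(vec, simple_cfg, "indexes")   # silently missed

# Saving the configuration next to the stored output removes the guesswork.
saved = {"cfg": stemming_cfg, "vec": vec}
with_saved_cfg = matches(saved["vec"], saved["cfg"], "indexes")

print(same_cfg, changed_cfg, with_saved_cfg)
```

Nothing fails loudly when the default changes; the query simply stops
matching, which is why the mismatch is easy to miss.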

 
  Should we hold the patch for 8.4?
 
 If we can't agree to say in the docs that implicit usage of a text search
 configuration in the CREATE INDEX command is not supported. Could we leave
 default_text_search_config for super-users, at least ?
 
 Anyway, let's wait what other people say.

The big problem is that not many people have taken the time to fully
understand how full text search works. I hoped that putting the updated
documentation online would help:

http://momjian.us/expire/fulltext/HTML/textsearch.html

but it seems it hasn't.

What we could do is make default_text_search_config
super-user-only and tell users at the start that if
default_text_search_config doesn't match the language they want to use,
then they have to read a documentation section that explains the problem
of configuration mismatches.

The problem with that is that we should be setting
default_text_search_config in the pg_dump output, like we do for
client_encoding, but because it is super-user-only, it will fail for
non-super-user restores.

So, I am back to thinking 

Re: [HACKERS] default_text_search_config and expression indexes

2007-08-01 Thread Bruce Momjian
Ron Mayer wrote:
 Bruce Momjian wrote:
  Oleg Bartunov wrote:
  What is the basis of your assumption? In my opinion, it's a very limited
  use of text search, because it doesn't support ranking. In 4-5 years
  of tsearch2 usage I never used it and I never saw it on the mailing lists.
  This is a very user-oriented feature and we could probably ask
  -general people for their opinion.
 
 I think I asked about this kind of usage a couple years back;
 and Oleg pointed out other reasons why it wasn't as good an
 idea too.
 
 http://archives.postgresql.org/pgsql-general/2005-10/msg00475.php
 http://archives.postgresql.org/pgsql-general/2005-10/msg00477.php
 
 The particular question I had asked was why the functional index was
 slower than maintaining the extra column; the explanation was
 that the lossy index has to call the function (including
 parsing, dictionary lookup, etc.) to re-check the data, which made
 it inadvisable to avoid the extra column anyway.
 
  I doubt 'general' is going to understand the details of merging this
  into the backend.  I assume we have enough people on hackers to decide
  this.
  
  Are you saying the majority of users have a separate column with a
  trigger?
 
 I think so.   At least when I was using it in 2005 the second
 column with the trigger was faster than using a functional index.

OK, it is good you measured it.  I wonder how GIN would behave because
it is not lossy.

  We need more feedback from users.
  
  Well, I am waiting for other hackers to get involved, but if they don't,
  I have to evaluate it myself on the email lists.
 
 Personally, I think documentation changes would be an OK way
 to handle it.   Something that makes it extremely clear to the
 user the advantages of having the extra column and the risks
 of avoiding it.

Sure, but you have to make sure you use the right configuration in the
trigger, no?  Does the tsquery have to use the same configuration?

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] default_text_search_config and expression indexes

2007-08-01 Thread Ron Mayer
Bruce Momjian wrote:
 Ron Mayer wrote:
 Bruce Momjian wrote:
 Oleg Bartunov wrote:
 What is a basis of your assumption ? 
 I think I asked about this kind of usage a couple years back;...

 http://archives.postgresql.org/pgsql-general/2005-10/msg00475.php
 http://archives.postgresql.org/pgsql-general/2005-10/msg00477.php

 ...why the functional index was
 slower than maintaining the extra column; with the explanation
 that the lossy index having to call the function (including
 parsing, dictionary lookup, etc) for re-checking the data ...
 ...

 Are you saying the majority of users have a separate column with a
 trigger?
 I think so.   At least when I was using it in 2005 the second
 column with the trigger was faster than using a functional index.
 
 OK, it is good you measured it.  I wonder how GIN would behave because
 it is not lossy.

Too bad I don't have the same database around anymore.
It seems the re-parsing for re-checking for the lossy index was very
expensive, tho.
In the end, I suspect it depends greatly on what fraction of rows match.

 We need more feedback from users.
 Well, I am waiting for other hackers to get involved, but if they don't,
 I have to evaluate it myself on the email lists.
 Personally, I think documentation changes would be an OK way
 to handle it.   Something that makes it extremely clear to the
 user the advantages of having the extra column and the risks
 of avoiding it.
 
 Sure, but you have to make sure you use the right configuration in the
 trigger, no?  Does the tsquery have to use the same configuration?

I wish I knew this myself. :-)   Whatever I had done happened to work
but that was largely through people on IRC walking me through it.



Re: [HACKERS] default_text_search_config and expression indexes

2007-08-01 Thread Bruce Momjian
Ron Mayer wrote:
  We need more feedback from users.
  Well, I am waiting for other hackers to get involved, but if they don't,
  I have to evaluate it myself on the email lists.
  Personally, I think documentation changes would be an OK way
  to handle it.   Something that makes it extremely clear to the
  user the advantages of having the extra column and the risks
  of avoiding it.
  
  Sure, but you have to make sure you use the right configuration in the
  trigger, no?  Does the tsquery have to use the same configuration?
 
 I wish I knew this myself. :-)   Whatever I had done happened to work
 but that was largely through people on IRC walking me through it.

This illustrates the major issue --- that this has to be simple for
people to get started, while keeping the capabilities for experienced
users.

I am now thinking that making users always specify the configuration
name and not allowing :: casting is going to be the best approach.  We
can always add more in 8.4 after it is in wide use.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] GIT patch

2007-08-01 Thread Alvaro Herrera
Heikki Linnakangas wrote:
 Alvaro Herrera wrote:
  I've started reading the GIT patch to see if I can help with the review.

 As the patch stands, I tried to keep it as non-invasive as possible,
 with minimum changes to existing APIs. That's because in the winter we
 were discussing changes to the indexam API to support the bitmap index
 am, and also GIT. I wanted to just have a patch to do performance
 testing with, without getting into the API changes.

Hmm, do say, doesn't it seem like the lack of feedback and the failed
bitmap patch played against final development of this patch?  At this
point I feel like the patch still needs some work and reshuffling before
it is in an acceptable state.  The fact that there are some API changes
for which the patch needs to be adjusted makes me feel like we should
put this patch on hold for 8.4.  So we would first get the API changes
discussed and done and then adapt this patch to them.

Of the three proposals you suggest, I think the first one

 1. A grouped index tuple contains a bitmap of offsetnumbers,
 representing a bunch of heap tuples stored on the same heap page, that
 all have a key between the key stored on the index tuple and the next
 index tuple. We don't keep track of the ordering of the heap tuples
 represented by one group index tuple. When doing a normal, non-bitmap,
 index scan, they need to be sorted. This is what the patch currently
 implements.

makes the most sense -- the index is kept simple and fast, and doing the
sorting during an indexscan seems a perfectly acceptable compromise,
knowing that the number of tuples that can be returned for sorting is
limited by the heap block size.

-- 
Alvaro Herrera  http://www.amazon.com/gp/registry/5ZYLFMCVHXC
Everything that I think about is more fascinating than the crap in your head.
   (Dogbert's interpretation of blogger philosophy)



Re: [HACKERS] default_text_search_config and expression indexes

2007-08-01 Thread Ron Mayer
Bruce Momjian wrote:
 Ron Mayer wrote:
 I wish I knew this myself. :-)   Whatever I had done happened to work
 but that was largely through people on IRC walking me through it.
 
 This illustrates the major issue --- that this has to be simple for
 people to get started, while keeping the capabilities for experienced
 users.
 
 I am now thinking that making users always specify the configuration
 name and not allowing :: casting is going to be the best approach.  We
 can always add more in 8.4 after it is in wide use.

That's fair.   Either the docs need to make it totally obvious or
the software should force people to do something safe.
