Re: [HACKERS] Index trouble with 8.3b4

2008-01-15 Thread Kenneth Marshall
On Mon, Jan 14, 2008 at 10:10:54PM -0500, Tom Lane wrote:
 Gregory Stark [EMAIL PROTECTED] writes:
  Tom Lane [EMAIL PROTECTED] writes:
  I went through all of the heap_beginscan calls in the code last night.
  pgstattuple was broken but AFAICS none of the other callers care about
  the visitation order.  I wonder though about third-party add-ons :-(
 
  Perhaps we ought to have made heap_beginscan guarantee an ordered scan and
  made synch scans be explicitly requested. That would have touched a lot of
  lines but been more conservative. I'm not sure it's worth going back on it 
  now
  though.
 
 Hmm.  I'm too lazy to go back and look right now, but IIRC most of the
 hardwired heapscans are on system catalogs that are unlikely to be large
 enough to trigger a syncscan anyway.  If we were to flip the semantics,
 and then change only the callers that clearly need to enable syncscans,
 it would not be all that large a patch I think.
 
 On the other hand it's far from clear that there's really a problem.
 The model for doing a block-at-a-time scan is VACUUM, and that doesn't
 use the heapscan infrastructure but just fetches blocks by number.
 It would only be people who'd copied pgstattuple's methodology that
 would be likely to be at risk.  I'm not sure we should protect those
 hypothetical people at the cost of not doing syncscans for other
 (also hypothetical) third-party add-ons that do heapscans on large
 tables and wouldn't have a problem with wraparound.
 
 It's a tossup from here.  Anybody have a strong opinion one way or the
 other?
 
   regards, tom lane
 
The principle of least surprise would have us default to syncscans and
assume that third-party add-ons can or will handle the wraparound.
Speaking as someone who has brought a server to its knees with multiple
full-table scans, this choice at least helps to bound the resource usage
of many simultaneous scans.

Cheers,
Ken Marshall



Re: [HACKERS] Index trouble with 8.3b4

2008-01-15 Thread Jeff Davis
On Mon, 2008-01-14 at 22:10 -0500, Tom Lane wrote:
 It's a tossup from here.  Anybody have a strong opinion one way or the
 other?
 

To me a heapscan means read all the tuples (without implying order)
and an ordered heap scan is a special case that should be made explicit.
But this is not a strong opinion.

Regards,
Jeff Davis




Re: [HACKERS] Index trouble with 8.3b4

2008-01-15 Thread Gregory Stark
Jeff Davis [EMAIL PROTECTED] writes:

 On Mon, 2008-01-14 at 22:10 -0500, Tom Lane wrote:
 It's a tossup from here.  Anybody have a strong opinion one way or the
 other?

 To me a heapscan means read all the tuples (without implying order)
 and an ordered heap scan is a special case that should be made explicit.
 But this is not a strong opinion.

I had another thought. Perhaps in use_assert_checking mode we should have it
start from a random position every time. Or perhaps just a random position
from amongst the first n pages.
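
As a toy illustration of that (standalone C, not anything from heapam.c;
NBLOCKS and RAND_PAGES are invented numbers for the demo), picking a start
page among the first n pages and then visiting every page with wraparound
might look like:

/*
 * Toy sketch only: print the visitation order produced by starting a
 * block-at-a-time scan at a pseudo-random page chosen from the first
 * RAND_PAGES pages and wrapping around.
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define NBLOCKS    10		/* pretend heap size in pages */
#define RAND_PAGES 4		/* randomize the start over the first n pages */

int
main(void)
{
	unsigned int start, i;

	srand((unsigned int) time(NULL));
	start = (unsigned int) rand() % RAND_PAGES;

	printf("visiting pages:");
	for (i = 0; i < NBLOCKS; i++)
		printf(" %u", (start + i) % NBLOCKS);
	printf("\n");
	return 0;
}

Order-dependent callers would then fail quickly under an assert-enabled build
instead of only when a synchronized scan happens to be in progress.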

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning



Re: [HACKERS] Index trouble with 8.3b4

2008-01-15 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 I had another thought. Perhaps in use_assert_checking mode we should have it
 start from a random position every time. Or perhaps just a random position
 from amongst the first n pages.

Interesting idea, but I fear it'd make a lot of the regression tests
fail.  Might be worth trying to see how bad things are there, though.

regards, tom lane



Re: [HACKERS] Index trouble with 8.3b4

2008-01-14 Thread Hannes Dorbath

Tom Lane wrote:

> I've committed a patch to do that.  Please test CVS HEAD and see if you
> still see problems.


I'm happy to hear you found something and I will try CVS HEAD in a minute.

In the meantime let me report that the cluster issue happens with GIST
as well. I have loaded 5 million rows into that table and did:


test=# CREATE INDEX CONCURRENTLY ts_test_tsv_gist ON public.test 
USING gist (tsv);

CREATE INDEX
test=# CLUSTER test USING ts_test_tsv_gist;
ERROR:  could not create unique index "test_pkey"
DETAIL:  Table contains duplicated values.
test=#

But as far as I understood this is already covered by your thesis.


--
Best regards,
Hannes Dorbath



Re: [HACKERS] Index trouble with 8.3b4

2008-01-14 Thread Gregory Stark

Tom Lane [EMAIL PROTECTED] writes:

 Oooh ... I can't be sure that this is what's biting you, but I
 definitely see a bug that seems to match the symptoms.  As the comments
 in index.c point out, CREATE INDEX CONCURRENTLY works like this:

  * validate_index() works by first gathering all the TIDs currently in the
  * index, using a bulkdelete callback that just stores the TIDs and doesn't
  * ever say delete it.  (This should be faster than a plain indexscan;
  * also, not all index AMs support full-index indexscan.)  Then we sort the
  * TIDs, and finally scan the table doing a merge join against the TID list
  * to see which tuples are missing from the index.

 The scan is done using the regular heapscan code, which in 8.3 has been
 modified to enable synchronized scanning, which means it might start
 from the middle of the table and wrap around.  

Wow, I'm glad we caught this in beta. Thanks a lot to Hannes Dorbath for
testing and reporting it.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!



Re: [HACKERS] Index trouble with 8.3b4

2008-01-14 Thread Tom Lane
Hannes Dorbath [EMAIL PROTECTED] writes:
 In the meantime let me report that the cluster issue happens with GIST 
 as well. ...
 But as far as I understood this is already covered by your thesis.

Right, the bug is independent of which index AM you use (though the
symptoms may vary).

regards, tom lane



Re: [HACKERS] Index trouble with 8.3b4

2008-01-14 Thread Hannes Dorbath

Tom Lane wrote:

> I wrote:
>> I think it's okay for CREATE INDEX CONCURRENTLY to use bulk-read access
>> strategy (that is, seqscan using a limited number of buffers), but it
>> has to be able to force the scan to start at page zero.
>
> I've committed a patch to do that.  Please test CVS HEAD and see if you
> still see problems.


With some limited testing it seems both cases are indeed fixed. I was 
unable to reproduce either with current CVS HEAD. Though I'll run some 
further tests tomorrow to back that up.



Thanks for your time and prompt responses.


--
Best regards,
Hannes Dorbath



Re: [HACKERS] Index trouble with 8.3b4

2008-01-14 Thread Jeff Davis
On Sun, 2008-01-13 at 18:52 -0500, Tom Lane wrote:
 The scan is done using the regular heapscan code, which in 8.3 has been
 modified to enable synchronized scanning, which means it might start
 from the middle of the table and wrap around.  If that happens, the
 merge join will get totally confused because it is expecting the

Thank you and Hannes Dorbath for tracking this down.

 I wonder whether there are any other places that are silently assuming
 that heapscans start from page zero ...
 

I considered that question when implementing sync scans, but I could not
think of any specific areas of the code that would likely be affected.

Regards,
Jeff Davis




Re: [HACKERS] Index trouble with 8.3b4

2008-01-14 Thread Tom Lane
Jeff Davis [EMAIL PROTECTED] writes:
 On Sun, 2008-01-13 at 18:52 -0500, Tom Lane wrote:
 I wonder whether there are any other places that are silently assuming
 that heapscans start from page zero ...

 I considered that question when implementing sync scans, but I could not
 think of any specific areas of the code that would likely be affected.

I went through all of the heap_beginscan calls in the code last night.
pgstattuple was broken but AFAICS none of the other callers care about
the visitation order.  I wonder though about third-party add-ons :-(

regards, tom lane



Re: [HACKERS] Index trouble with 8.3b4

2008-01-14 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Jeff Davis [EMAIL PROTECTED] writes:
 On Sun, 2008-01-13 at 18:52 -0500, Tom Lane wrote:
 I wonder whether there are any other places that are silently assuming
 that heapscans start from page zero ...

 I considered that question when implementing sync scans, but I could not
 think of any specific areas of the code that would likely be affected.

 I went through all of the heap_beginscan calls in the code last night.
 pgstattuple was broken but AFAICS none of the other callers care about
 the visitation order.  I wonder though about third-party add-ons :-(

Perhaps we ought to have made heap_beginscan guarantee an ordered scan and
made synch scans be explicitly requested. That would have touched a lot of
lines but been more conservative. I'm not sure it's worth going back on it now
though.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!



Re: [HACKERS] Index trouble with 8.3b4

2008-01-14 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 I went through all of the heap_beginscan calls in the code last night.
 pgstattuple was broken but AFAICS none of the other callers care about
 the visitation order.  I wonder though about third-party add-ons :-(

 Perhaps we ought to have made heap_beginscan guarantee an ordered scan and
 made synch scans be explicitly requested. That would have touched a lot of
 lines but been more conservative. I'm not sure it's worth going back on it now
 though.

Hmm.  I'm too lazy to go back and look right now, but IIRC most of the
hardwired heapscans are on system catalogs that are unlikely to be large
enough to trigger a syncscan anyway.  If we were to flip the semantics,
and then change only the callers that clearly need to enable syncscans,
it would not be all that large a patch I think.

On the other hand it's far from clear that there's really a problem.
The model for doing a block-at-a-time scan is VACUUM, and that doesn't
use the heapscan infrastructure but just fetches blocks by number.
It would only be people who'd copied pgstattuple's methodology that
would be likely to be at risk.  I'm not sure we should protect those
hypothetical people at the cost of not doing syncscans for other
(also hypothetical) third-party add-ons that do heapscans on large
tables and wouldn't have a problem with wraparound.

It's a tossup from here.  Anybody have a strong opinion one way or the
other?

regards, tom lane



Re: [HACKERS] Index trouble with 8.3b4

2008-01-13 Thread Hannes Dorbath

Gregory Stark wrote:

> Gregory Stark [EMAIL PROTECTED] writes:
>
>> On the other hand we can't just ignore all vacuums because someone could issue
>> a manual vacuum inside a transaction (I think?).
>
> Doh, ignore this. sigh.



I started from scratch to put up a test case. I cannot trigger "ERROR:
item pointer (0,1) already exists" again as the deadlock issue reported
by Gregory Stark hit me every time. Is this fixed in RC1? Can I retry
with that?


The only good news is that I think I found the CLUSTER issue:

It was not a GIST index I created; I had accidentally created a BTREE index:
http://theendofthetunnel.de/cluster.txt

In case it helps, here are the data and setup:

8.3-b4 built from source

./configure --prefix=/usr/local --enable-thread-safety --with-perl 
--with-openssl --with-libxml --with-libxslt


initdb line:
initdb -D /data/pgsql --locale='de_DE.utf8' --lc-collate='C'

Only listen_addresses and pg_hba.conf were touched.

Please get the -Fc dump (37MB) from:

http://theendofthetunnel.de/dump.bin
http://theendofthetunnel.de/glob.sql


--
Best regards,
Hannes Dorbath



Re: [HACKERS] Index trouble with 8.3b4

2008-01-13 Thread Guillaume Smet
On Jan 13, 2008 7:50 PM, Hannes Dorbath [EMAIL PROTECTED] wrote:
 I started from scratch to put up a test case. I cannot trigger "ERROR:
 item pointer (0,1) already exists" again as the deadlock issue reported
 by Gregory Stark hit me every time. Is this fixed in RC1? Can I retry
 with that?

No, it's not fixed in RC1. You have to compile CVS HEAD to have it fixed.

--
Guillaume



Re: [HACKERS] Index trouble with 8.3b4

2008-01-13 Thread Hannes Dorbath

Hannes Dorbath wrote:

> Guillaume Smet wrote:
>> On Jan 13, 2008 7:50 PM, Hannes Dorbath [EMAIL PROTECTED] wrote:
>>
>>> I started from scratch to put up a test case. I cannot trigger "ERROR:
>>> item pointer (0,1) already exists" again as the deadlock issue reported
>>> by Gregory Stark hit me every time. Is this fixed in RC1? Can I retry
>>> with that?
>>
>> No, it's not fixed in RC1. You have to compile CVS HEAD to have it fixed.
>
> OK, the deadlock is gone. I can only provoke it when issuing the create
> index statement from 2 terminals at the same time. But I think this is
> intended. I keep trying to catch the GIN error though.


Well, or maybe not really intended that way. Both terminals error out with:

ERROR:  relation "ts_test_tsv" already exists

But the index was created.

ERROR:  relation "ts_test_tsv" already exists
test=# drop INDEX ts_test_tsv ;
DROP INDEX

--
Best regards,
Hannes Dorbath



Re: [HACKERS] Index trouble with 8.3b4

2008-01-13 Thread Hannes Dorbath

Guillaume Smet wrote:

> On Jan 13, 2008 7:50 PM, Hannes Dorbath [EMAIL PROTECTED] wrote:
>
>> I started from scratch to put up a test case. I cannot trigger "ERROR:
>> item pointer (0,1) already exists" again as the deadlock issue reported
>> by Gregory Stark hit me every time. Is this fixed in RC1? Can I retry
>> with that?
>
> No, it's not fixed in RC1. You have to compile CVS HEAD to have it fixed.


OK, the deadlock is gone. I can only provoke it when issuing the create
index statement from 2 terminals at the same time. But I think this is
intended. I keep trying to catch the GIN error though.



--
Best regards,
Hannes Dorbath



Re: [HACKERS] Index trouble with 8.3b4

2008-01-13 Thread Hannes Dorbath

Hannes Dorbath wrote:

> ERROR:  relation "ts_test_tsv" already exists
> test=# drop INDEX ts_test_tsv ;
> DROP INDEX


This is a general thing I'd like to ask. If the creation of an index
fails, why is it nevertheless there? No matter whether it is the deadlock
or my GIN error, why isn't the whole operation rolled back? And what state
does it leave me in? Do I end up with a corrupt index on my table?


--
Best regards,
Hannes Dorbath



Re: [HACKERS] Index trouble with 8.3b4

2008-01-13 Thread Tom Lane
Hannes Dorbath [EMAIL PROTECTED] writes:
 This is a general thing I'd like to ask. If the creation of an index 
 fails, why is it nevertheless there?

It's a rather ugly consequence of the fact that CREATE INDEX
CONCURRENTLY requires more than one transaction.  If the later ones
fail, the invalid index is still there.

It'd be nice to clean that up sometime, but don't hold your breath.

regards, tom lane



Re: [HACKERS] Index trouble with 8.3b4

2008-01-13 Thread Hannes Dorbath

Tom Lane wrote:

> Hannes Dorbath [EMAIL PROTECTED] writes:
>> This is a general thing I'd like to ask. If the creation of an index
>> fails, why is it nevertheless there?
>
> It's a rather ugly consequence of the fact that CREATE INDEX
> CONCURRENTLY requires more than one transaction.  If the later ones
> fail, the invalid index is still there.
>
> It'd be nice to clean that up sometime, but don't hold your breath.


OK, I have my GIN failure back with CVS HEAD:

test=# UPDATE test SET tsv = to_tsvector(text);
UPDATE 753100
test=# CREATE INDEX CONCURRENTLY ts_test_tsv ON public.test USING 
gin (tsv);

ERROR:  item pointer (8,23) already exists
test=# drop INDEX ts_test_tsv ;
DROP INDEX
test=# CREATE INDEX CONCURRENTLY ts_test_tsv ON public.test USING 
gin (tsv);

CREATE INDEX
test=#

I have a hard time pinning it down. Currently all I can say is: It
happens the first time after I bulk load data into that table.


I cannot catch it with pg_dump -- after a restore it works. I can
reproduce it here more or less reliably. Maybe I should just bzip
$PGDATA and send it.


--
Best regards,
Hannes Dorbath



Re: [HACKERS] Index trouble with 8.3b4

2008-01-13 Thread Hannes Dorbath

Hannes Dorbath wrote:

> Currently all I can say is: It happens the first time after I bulk load data
> into that table.


I have the bad feeling that I need to correct this to "It happens when
autovacuum is active on the table."


Is it by any chance possible that CREATE INDEX CONCURRENTLY might read
garbage while autovacuum is busy with the table?



--
Best regards,
Hannes Dorbath



Re: [HACKERS] Index trouble with 8.3b4

2008-01-13 Thread Tom Lane
Oooh ... I can't be sure that this is what's biting you, but I
definitely see a bug that seems to match the symptoms.  As the comments
in index.c point out, CREATE INDEX CONCURRENTLY works like this:

 * validate_index() works by first gathering all the TIDs currently in the
 * index, using a bulkdelete callback that just stores the TIDs and doesn't
 * ever say delete it.  (This should be faster than a plain indexscan;
 * also, not all index AMs support full-index indexscan.)  Then we sort the
 * TIDs, and finally scan the table doing a merge join against the TID list
 * to see which tuples are missing from the index.

The scan is done using the regular heapscan code, which in 8.3 has been
modified to enable synchronized scanning, which means it might start
from the middle of the table and wrap around.  If that happens, the
merge join will get totally confused because it is expecting the
tuples to be returned in increasing ctid order.  This will result in
misidentifying a bunch of TIDs as not being in the table, allowing
duplicate entries to be made in the index.  And both of the misbehaviors
you originally showed can be explained by duplicate index entries
(actual or attempted).

Furthermore, the first duplicate TIDs to be entered will tend to be
low-numbered TIDs, which explains why you were consistently getting
GIN complaints about low-numbered TIDs, which I was having a hard time
thinking of a mechanism for otherwise.
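
To see concretely why out-of-order visitation breaks the merge, here is a toy
model (standalone C, not the validate_index() code; the block count and the
start block are invented).  Every tuple is already in the sorted index TID
list, so a correct merge should report nothing as missing -- yet a scan that
starts mid-table and wraps around misreports exactly the low-numbered TIDs:

/*
 * Toy model: one tuple per block, every tuple already indexed.  The merge
 * advances through the sorted index TID list only forward, assuming the
 * heap hands back TIDs in increasing order.
 */
#include <stdio.h>

#define NBLOCKS 8

/* Count heap TIDs wrongly reported as "not in the index". */
static int
misreported_missing(const int visit[NBLOCKS])
{
	int		indexed[NBLOCKS];
	int		i, pos = 0, missing = 0;

	for (i = 0; i < NBLOCKS; i++)
		indexed[i] = i;			/* every TID is already indexed */

	for (i = 0; i < NBLOCKS; i++)
	{
		int		heap_tid = visit[i];

		while (pos < NBLOCKS && indexed[pos] < heap_tid)
			pos++;
		if (pos < NBLOCKS && indexed[pos] == heap_tid)
			pos++;				/* found in the index, as expected */
		else
			missing++;			/* wrongly treated as missing */
	}
	return missing;
}

int
main(void)
{
	int		ordered[NBLOCKS], wrapped[NBLOCKS];
	int		i, start = 5;		/* pretend the syncscan hint was block 5 */

	for (i = 0; i < NBLOCKS; i++)
	{
		ordered[i] = i;
		wrapped[i] = (start + i) % NBLOCKS;
	}

	printf("scan from block 0: %d TIDs misreported as missing\n",
		   misreported_missing(ordered));
	printf("scan from block %d, wrapping: %d TIDs misreported as missing\n",
		   start, misreported_missing(wrapped));
	return 0;
}

With the wrapped visitation order the merge runs off the end of the sorted
list before it ever sees the first few blocks, so those low-numbered TIDs get
re-inserted as "missing" -- the duplicate-entry mechanism described above.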

I can now reproduce the failure: the trick is to get the syncscan start
pointer to not be on page zero.  For example,

-- load up table
begin;
declare c cursor for select id from test;
fetch 1 from c;
commit;
CREATE INDEX CONCURRENTLY cluster_test ON public.test USING gin (tsv);
ERROR:  item pointer (0,1) already exists

I think it's okay for CREATE INDEX CONCURRENTLY to use bulk-read access
strategy (that is, seqscan using a limited number of buffers), but it
has to be able to force the scan to start at page zero.  Right now,
heapam.c doesn't offer any API to control this, but we can certainly
add one.
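
Schematically, the kind of knob being discussed might look like the sketch
below (purely hypothetical names -- ScanDescDemo, scan_init(), syncscan_hint
-- none of this is the actual heapam.c API):

#include <stdbool.h>
#include <stdio.h>

typedef struct ScanDescDemo
{
	unsigned int nblocks;		/* relation size in blocks */
	unsigned int start_block;	/* where this scan will begin */
	bool		allow_sync;		/* may we join a synchronized scan? */
} ScanDescDemo;

/* pretend this is where some concurrent seqscan last reported its position */
static unsigned int syncscan_hint = 42;

static void
scan_init(ScanDescDemo *scan, unsigned int nblocks, bool allow_sync)
{
	scan->nblocks = nblocks;
	scan->allow_sync = allow_sync;
	scan->start_block = allow_sync ? (syncscan_hint % nblocks) : 0;
}

int
main(void)
{
	ScanDescDemo plain, cic;

	scan_init(&plain, 1000, true);	/* ordinary seqscan: take the hint */
	scan_init(&cic, 1000, false);	/* C.I.C.: must start at page zero */

	printf("plain seqscan starts at block %u\n", plain.start_block);
	printf("C.I.C. scan starts at block %u\n", cic.start_block);
	return 0;
}

Here an ordinary seqscan takes the synchronized-scan hint as its start block,
while a caller that needs increasing-ctid order declines the hint and is
pinned to block zero.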

I wonder whether there are any other places that are silently assuming
that heapscans start from page zero ...

regards, tom lane



Re: [HACKERS] Index trouble with 8.3b4

2008-01-13 Thread Tom Lane
Hannes Dorbath [EMAIL PROTECTED] writes:
 I have the bad feeling that I need to correct this to "It happens when
 autovacuum is active on the table."

Ah-hah, I realize how to explain that too, now.  If you start the CREATE
INDEX CONCURRENTLY while an autovacuum is in progress on that table, the
autovacuum gets booted off so that C.I.C. can get the lock.  And, most
likely, it leaves the syncscan pointer positioned someplace past block
zero.  So the part that is a bit hard to reproduce is to launch the
C.I.C. while an autovacuum is in progress.

regards, tom lane



Re: [HACKERS] Index trouble with 8.3b4

2008-01-13 Thread Tom Lane
I wrote:
 Hannes Dorbath [EMAIL PROTECTED] writes:
 I have the bad feeling that I need to correct this to "It happens when
 autovacuum is active on the table."

 Ah-hah, I realize how to explain that too, now.

Hmm, no, scratch that: neither VACUUM nor ANALYZE use a standard
heapscan, so they won't move the syncscan pointer.  Were you performing
some other query that did a partial seqscan of the table?

regards, tom lane



Re: [HACKERS] Index trouble with 8.3b4

2008-01-13 Thread Tom Lane
I wrote:
 I think it's okay for CREATE INDEX CONCURRENTLY to use bulk-read access
 strategy (that is, seqscan using a limited number of buffers), but it
 has to be able to force the scan to start at page zero.

I've committed a patch to do that.  Please test CVS HEAD and see if you
still see problems.

regards, tom lane



Re: [HACKERS] Index trouble with 8.3b4

2008-01-08 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Further poking around shows that the unrecognized locktag is because
 lmgr.c:DescribeLockTag was never taught about virtual xids.

That's fixed, thanks for the patch.

 The pid it's waiting on is long since gone but looks like it was probably an
 autovacuum process. I have a vague recollection that you had rigged CREATE
 INDEX CONCURRENTLY to ignore vacuum processes when checking for conflicting
 processes.

I'm still not too clear on the underlying bug though.

regards, tom lane



Re: [HACKERS] Index trouble with 8.3b4

2008-01-08 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 The pid it's waiting on is long since gone but looks like it was probably an
 autovacuum process. I have a vague recollection that you had rigged CREATE
 INDEX CONCURRENTLY to ignore vacuum processes when checking for conflicting
 processes. Since any such process will be blocked on our session-level
 ShareUpdateExclusiveLock it will always cause a deadlock and we would rather
 it just hang out and wait until our index build is finished.

OK, after reading the code some more I think I've got the point.  The
scenario is that autovacuum is waiting to get ShareUpdateExclusiveLock
(it can't already have it, because the CREATE INDEX CONCURRENTLY does)
and then one of C.I.C's three wait steps decides it has to wait for the
autovacuum.  It cannot be one of the first two, because those only block
for xacts that *already have* a conflicting lock.  The problem must be
at the third wait step, which waits out all xacts that might conceivably
be interested in recently-dead tuples that are not in the index.

Now an unindexed dead tuple is not a problem from vacuum's point of
view, nor does ANALYZE care, so AFAICS there is no need for this step
to wait for autovacuum processes --- nor indeed for manual vacuums.
So we can avoid the deadlock if we just exclude those processes from
the list of ones to wait for.

I suggest we extend GetCurrentVirtualXIDs() with an additional
parameter includeVacuums, and have it skip vacuum procs if that's
set.  (Hmm, maybe a more flexible approach is to make the parameter
a bitmask, and ignore any procs for which param & vacuumFlags is
not zero.)
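
In outline, the filtering rule could look like this self-contained sketch
(DemoProc and collect_vxids_to_wait_for() are invented for illustration, and
PROC_IN_VACUUM is assumed here to be a flag bit marking vacuum-only backends):

#include <stdio.h>

#define PROC_IN_VACUUM	0x01	/* assumed flag bit for vacuum-only backends */

typedef struct DemoProc
{
	int		backendId;
	int		localXid;
	int		vacuumFlags;	/* bitmask describing what the proc is doing */
} DemoProc;

/*
 * Collect the virtual xids we must wait for, skipping any proc whose
 * vacuumFlags intersects excludeMask.
 */
static int
collect_vxids_to_wait_for(const DemoProc *procs, int nprocs,
						  int excludeMask, const DemoProc **out)
{
	int		i, n = 0;

	for (i = 0; i < nprocs; i++)
	{
		if (procs[i].vacuumFlags & excludeMask)
			continue;			/* e.g. an autovacuum: no need to wait */
		out[n++] = &procs[i];
	}
	return n;
}

int
main(void)
{
	DemoProc	procs[] = {
		{1, 100, 0},				/* ordinary backend */
		{2, 101, PROC_IN_VACUUM},	/* autovacuum worker: skipped */
		{3, 102, 0},
	};
	const DemoProc *wait[3];
	int			i, n;

	n = collect_vxids_to_wait_for(procs, 3, PROC_IN_VACUUM, wait);
	for (i = 0; i < n; i++)
		printf("wait for vxid %d/%d\n", wait[i]->backendId, wait[i]->localXid);
	return 0;
}

Passing PROC_IN_VACUUM as the exclusion mask makes the third wait step skip
the autovacuum worker, which is exactly the deadlock case above.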

Comments?

regards, tom lane



Re: [HACKERS] Index trouble with 8.3b4

2008-01-08 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 It cannot be one of the first two, because those only block
 for xacts that *already have* a conflicting lock.  The problem must be
 at the third wait step, which waits out all xacts that might conceivably
 be interested in recently-dead tuples that are not in the index.

Ah, I had missed that point.

 Now an unindexed dead tuple is not a problem from vacuum's point of
 view, nor does ANALYZE care, so AFAICS there is no need for this step
 to wait for autovacuum processes --- nor indeed for manual vacuums.
 So we can avoid the deadlock if we just exclude those processes from
 the list of ones to wait for.

That's what I had in mind.

 I suggest we extend GetCurrentVirtualXIDs() with an additional
 parameter includeVacuums, and have it skip vacuum procs if that's
 set.  (Hmm, maybe a more flexible approach is to make the parameter
 a bitmask, and ignore any procs for which param & vacuumFlags is
 not zero.)

 Comments?

Only that the restrictions on what VACUUM is allowed to do seem to be piling up.
We may have to write up a separate document explaining what specialized set of
rules VACUUM operates under.

Also, ANALYZE was included in the latest security changes. Is there some way
that ANALYZE could trigger some user-defined function being invoked which
could in turn run some SQL using this index? I suppose a very strange
expression index where the expression involved a recursive SQL query back to
the same table (presumably being careful to avoid an infinite loop) could be
possible.

I am hoping our other things which ignore VACUUM such as the globalxmin
calculation are careful not to ignore VACUUM ANALYZE processes?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!



Re: [HACKERS] Index trouble with 8.3b4

2008-01-08 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 Now an unindexed dead tuple is not a problem from vacuum's point of
 view, nor does ANALYZE care, so AFAICS there is no need for this step
 to wait for autovacuum processes --- nor indeed for manual vacuums.

 Also, ANALYZE was included in the latest security changes. Is there some way
 that ANALYZE could trigger some user-defined function being invoked which
 could in turn run some SQL using this index?

Hmm.  ANALYZE itself doesn't look into the indexes, but it does invoke
user-defined functions that could nominally run queries.  However, a
function in an index that runs a query that examines its own table seems
implausible, and very unlikely to work right anyway.  You could hardly
expect such a function to be really immutable -- consider for example
that it would be unlikely to deliver the same results during CREATE
INDEX on an already-filled table that it would if the rows were being
inserted with the index already existing.  So I'm not really worried
about that scenario.

regards, tom lane



Re: [HACKERS] Index trouble with 8.3b4

2008-01-08 Thread Alvaro Herrera
Gregory Stark wrote:

 I am hoping our other things which ignore VACUUM such as the globalxmin
 calculation are careful not to ignore VACUUM ANALYZE processes?

It doesn't matter -- the ANALYZE is done in a separate transaction (so
the VACUUM part is ignored, the ANALYZE part is not).

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Index trouble with 8.3b4

2008-01-07 Thread Gregory Stark
Gregory Stark [EMAIL PROTECTED] writes:

 Tom Lane [EMAIL PROTECTED] writes:

 I didn't have any luck reproducing either of these behaviors --- maybe
 it's data-dependent.  Can you extract a test case?

 I haven't been able to reproduce this either but I produced an entirely
 different problem:

 postgres=# create index concurrently dg5 on doc using gin 
 (to_tsvector('english',d));
 ERROR:  deadlock detected
 DETAIL:  Process 7076 waits for ShareLock on unrecognized locktag type 5; 
 blocked by process 10497.
 Process 10497 waits for ShareUpdateExclusiveLock on relation 24656 of 
 database 11511; blocked by process 7076.

Further poking around shows that the unrecognized locktag is because
lmgr.c:DescribeLockTag was never taught about virtual xids, i.e. something like
this (untested):

--- lmgr.c  04 Jan 2008 15:12:37 +0000  1.95
+++ lmgr.c  07 Jan 2008 15:54:49 +0000
@@ -739,6 +739,12 @@
 							 tag->locktag_field2,
 							 tag->locktag_field1);
 			break;
+		case LOCKTAG_VIRTUALTRANSACTION:
+			appendStringInfo(buf,
+							 _("virtual transaction %d/%u"),
+							 tag->locktag_field1,
+							 tag->locktag_field2);
+			break;
 		case LOCKTAG_TRANSACTION:
 			appendStringInfo(buf,
 							 _("transaction %u"),


The pid it's waiting on is long since gone but looks like it was probably an
autovacuum process. I have a vague recollection that you had rigged CREATE
INDEX CONCURRENTLY to ignore vacuum processes when checking for conflicting
processes. Since any such process will be blocked on our session-level
ShareUpdateExclusiveLock it will always cause a deadlock and we would rather
it just hang out and wait until our index build is finished.

On the other hand we can't just ignore all vacuums because someone could issue
a manual vacuum inside a transaction (I think?). But this is a general problem
with all the places where we check if another transaction is just running
vacuum, such as checking for globalxmin. We should only be ignoring
transactions which were started just to execute a vacuum.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] Index trouble with 8.3b4

2008-01-07 Thread Gregory Stark
Gregory Stark [EMAIL PROTECTED] writes:

 On the other hand we can't just ignore all vacuums because someone could issue
 a manual vacuum inside a transaction (I think?). 

Doh, ignore this. sigh.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!
