Re: [HACKERS] tsearch_core for inclusion
On Fri, 16 Mar 2007, Magnus Hagander wrote: Most people whom I talk to about tsearch who want the syntax changed to make it easier want something akin to just CREATE INDEX fti1 on t1(c1) USING FULLTEXT and then be done with it. This patch isn't going to give people that. Since we use a standard postgresql-ish CREATE INDEX command, I assume people want to skip creation of the tsvector column? That would be great. But how would they manage complex document indexing, when a document is a combination (with different weights) of many text attributes from several tables, for example? Just to give you some more work, could I have both, please. Seriously, if the current powerful functionality could be combined with a dead simple solution for new users and those who don't need it, that would be very good. This could be an auto-generated hidden column or something, as long as the user doesn't need to see or care about it in the simple case. Hmm, then we should think about a very generic fts configuration behind such a simple solution. We need to know what requirements and features should be supported. For example, the CREATE INDEX fti1 on t1(c1) USING FULLTEXT syntax assumes the default configuration only. Joshua, Tom proposed something like CREATE INDEX fti1 on t1(c1) USING FULLTEXT [WITH] GIST | GIN, so is it worth extending it to specify the fts configuration, like CREATE INDEX fti1 on t1(c1) USING FULLTEXT [public.pg] [WITH] GIST | GIN? Normally, the fts configuration is used when creating the tsvector, so CREATE INDEX doesn't need to know it. Hmm, looks rather messy. Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
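To make the syntax variants under discussion concrete, here is a sketch; note that the FULLTEXT clause and the configuration name are only proposals from this thread, not implemented syntax:

```sql
-- Proposed "dead simple" form (hypothetical syntax):
CREATE INDEX fti1 ON t1 (c1) USING FULLTEXT;

-- Extended proposal with access method and fts configuration
-- (also hypothetical; "public.pg" names an fts configuration):
CREATE INDEX fti1 ON t1 (c1) USING FULLTEXT public.pg WITH GIST;

-- What works today with tsearch2: an explicit tsvector column
-- maintained by the user, plus an index on it.
ALTER TABLE t1 ADD COLUMN c1_tsv tsvector;
UPDATE t1 SET c1_tsv = to_tsvector(c1);
CREATE INDEX fti1 ON t1 USING gist (c1_tsv);
```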
Re: [HACKERS] UPDATE using sub selects
Hi, What's the expected result if the subselect returns more than one tuple? Error, per SQL99 section 7.14: 1) If the cardinality of a row subquery is greater than 1 (one), then an exception condition is raised: cardinality violation. I expect no update at all in the case of a void result set; is that the case? No, you get nulls; it's a subquery, not a join. Per SQL99 7.1: c) If the row value constructor is a row subquery, then: i) Let R be the result of the row subquery and let D be the degree of R. ii) If the cardinality of R is 0 (zero), then the result of the row value constructor is D null values. iii) If the cardinality of R is 1 (one), then the result of the row value constructor is R. regards, tom lane To allow both of the above to hold, I think the subselect will have to be treated like an EXPR_SUBLINK subquery. I was wondering if we have a similar mechanism for plain selects/subselects to check and restrict their output to a single row. Regards, Nikhils -- EnterpriseDB http://www.enterprisedb.com
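A minimal illustration of the semantics Tom quotes (the table and column names are invented, and the multi-column UPDATE form shown is the feature under discussion, not something supported at the time of this thread):

```sql
-- Hypothetical setup:
CREATE TABLE t (id int, a int, b int);
CREATE TABLE s (id int, a int, b int);

-- Row subquery on the right-hand side of a multi-column UPDATE:
UPDATE t SET (a, b) = (SELECT s.a, s.b FROM s WHERE s.id = t.id);

-- Per SQL99 7.14: if the subquery returns more than one row for some
-- target row, an error ("cardinality violation") is raised.
-- Per SQL99 7.1: if it returns zero rows, a and b are set to NULL --
-- it is a subquery, not a join, so unmatched rows in t are still
-- updated, to NULLs.
```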
Re: [HACKERS] tsearch_core for inclusion
Yeah, that one. It might be more consistent to spell it as fulltext_ops but I wouldn't insist on it. Hmm, you are prompting an idea to me of how to simplify usage of a full text index in simple cases. CREATE INDEX idxname ON tblname USING gin (textcolumn fulltext_ops); The fulltext_ops opclass parses the document similarly to to_tsvector and stores the lexemes in the gin index. It's a full equivalent of CREATE INDEX ... ( to_tsvector( textcolumn ) ) And let us define the operation text @ text, which is equivalent to text @@ plainto_tsquery(text), so queries will look like SELECT * FROM tblname WHERE textcolumn @ textquery; fulltext_ops can speed up both operations, text @@ tsquery and text @ text, because the gin API has an extractQuery method which is called once per index scan and can parse the query into lexemes. One disadvantage: with this approach it isn't possible to make fast ranking - there is no stored parsed text. And fulltext_ops may be done for the GiST index too, but the fulltext opclass will be lossy, which means slow search due to reparsing the texts for each index match. BTW, simple syntax sugar for CREATE INDEX (fulltext_ops) may be done: CREATE INDEX idxname ON tblname USING FULLTEXT (textcolumn) -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/
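A side-by-side sketch of the equivalence Teodor describes; fulltext_ops and the @ operator are proposals from this message, not existing objects:

```sql
-- Proposed: index the raw text column directly; the opclass does the parsing.
CREATE INDEX idxname ON tblname USING gin (textcolumn fulltext_ops);

-- Functionally equivalent to today's expression-index formulation:
CREATE INDEX idxname ON tblname USING gin (to_tsvector(textcolumn));

-- Proposed shorthand operator: text @ text would behave like
-- textcolumn @@ plainto_tsquery('some query words'):
SELECT * FROM tblname WHERE textcolumn @ 'some query words';
```

The point of extractQuery being called once per index scan is that the right-hand text only has to be parsed into lexemes once, not once per heap row.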
Re: [HACKERS] tsearch_core for inclusion
I'm also concerned about the stability of the tsearch api in general wrt including it in core. Currently the recommended upgrade practice is to dump/reload without tsearch, installing the new server's version of tsearch That is because the pg_ts* tables, function names and internal API change. Putting tsearch in core eliminates a lot of such problems. For example, who notices changes in the pg_am table from release to release? Really it is developers/hackers, not the usual users -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/
Re: [HACKERS] [RFC] CLUSTER VERBOSE
Grzegorz Jaskiewicz wrote: Because CLUSTER is divided into two major operations (data reordering, index rebuild), I see it this way: CLUSTER on I: index name T: table name, data reordering CLUSTER on I: index name T: table name, index rebuild Something like that would be nice to see how long each step takes, like vacuum verbose. And then: CLUSTER 10% CLUSTER 12%, etc. We don't have progress indicators for any other commands, and I don't see why we should add one for cluster in particular. Sure, progress indicators are nice, but we should rather try to add some kind of general progress indicator support that would support SELECTs, for example. I know it's much harder, but also much more useful. I am looking for opinions on what information should be presented. What would be useful is some kind of metric of how (de)clustered the table was before CLUSTER, and the same dead vs. live row counts that vacuum verbose prints. We don't really have a good metric for clusteredness, as has been discussed before, so if you can come up with a good one that would be useful in the planner as well, that would be great. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
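One existing, if crude, proxy for "how clustered is this table already" is the planner's correlation statistic gathered by ANALYZE; this is only a sketch, and as Heikki notes, no really satisfactory metric exists:

```sql
-- correlation near +1 or -1 means heap order closely follows the
-- column's sort order; near 0 means the table is effectively
-- unclustered on that column. 'mytable' is a hypothetical name.
SELECT tablename, attname, correlation
FROM pg_stats
WHERE tablename = 'mytable';
```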
Re: [HACKERS] pltcl vs. multilib machines
Tom Lane wrote: It turns out that this is because the link command for pltcl includes -L/usr/lib, so that gets searched before /usr/lib64. And the reason the command includes that is that that's what it says in TCL_LIB_SPEC in /usr/lib/tclConfig.sh. There is also a /usr/lib64/tclConfig.sh which says the right things, but we aren't finding that because this is how we determine where to look for tclConfig.sh: $ echo 'puts $auto_path' | tclsh /usr/share/tcl8.4 /usr/share /usr/lib /usr/lib64 $ Perhaps I should lobby the Red Hat guys to change the order of that result, but really this is more our problem than theirs: whichever way tclsh reports it, it will be wrong for trying to build Postgres with the other word width on a multilib machine. Not the ideal answer, but I've had this in my config script for quite a while now: --with-tclconfig=/usr/lib64 Joe
[HACKERS] Lock table in non-volatile functions
Hi all, I'm observing that it is not allowed to LOCK a table in a STABLE/IMMUTABLE function, but at the same time a SELECT FOR UPDATE is allowed. Is that normal?
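A minimal reproduction of the behaviour being reported; the function and table names are invented for illustration:

```sql
CREATE TABLE t (id int);

-- Rejected: an explicit LOCK TABLE inside a non-volatile function
-- fails with a read-only-command error when called.
CREATE FUNCTION f_lock() RETURNS void AS $$
BEGIN
    LOCK TABLE t IN EXCLUSIVE MODE;
END;
$$ LANGUAGE plpgsql STABLE;

-- Reportedly accepted: SELECT ... FOR UPDATE in a similar STABLE
-- function, even though it also takes locks (Tom's reply below notes
-- that CommandIsReadOnly() should reject this too).
CREATE FUNCTION f_select() RETURNS void AS $$
BEGIN
    PERFORM 1 FROM t WHERE id = 1 FOR UPDATE;
END;
$$ LANGUAGE plpgsql STABLE;
```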
Re: [HACKERS] [RFC] CLUSTER VERBOSE
On Mar 16, 2007, at 9:53 AM, Heikki Linnakangas wrote: Grzegorz Jaskiewicz wrote: Because CLUSTER is divided into two major operations (data reordering, index rebuild), I see it this way: CLUSTER on I: index name T: table name, data reordering CLUSTER on I: index name T: table name, index rebuild Something like that would be nice to see how long each step takes, like vacuum verbose. yup. I am looking for opinions on what information should be presented. What would be useful is some kind of metric of how (de)clustered the table was before CLUSTER, and the same dead vs. live row counts that vacuum verbose prints. Is that information available in cluster.c atm? I am looking for some hints here. One of the reasons I decided to go with this patch is to learn something - and cluster seems to touch the very bone of postgres, the tuple system (just like vacuum), and indexes. I would appreciate any hints. We don't really have a good metric for clusteredness, as has been discussed before, so if you can come up with a good one that would be useful in the planner as well, that would be great. I really don't know where and how I should calculate such a param. Any hints? thanks. -- Grzegorz Jaskiewicz C/C++ freelance for hire
Re: [HACKERS] Bug in UTF8-Validation Code?
On Wednesday, 14 March 2007 08:01, Michael Paesold wrote: Andrew Dunstan wrote: This strikes me as essential. If the db has a certain encoding, ISTM we are promising that all the text data is valid for that encoding. The question in my mind is how we help people to recover from the fact that we haven't done that. I would also say that it's a bug that escape sequences can get characters into the database that are not valid in the specified encoding. If you compare the encoding to table constraints, there is no way to simply escape a constraint check. This seems to violate the principle of consistency in ACID. Additionally, if you include pg_dump in ACID, it also violates durability, since it cannot restore what it wrote itself. Is there anything in the SQL spec that asks for such a behaviour? I guess not. A DBA will usually not even learn about this issue until they are presented with a failing restore. Is there anything I can do to help with this problem? Maybe implementing a new GUC variable that turns off accepting wrongly encoded sequences (so DBAs can still turn it on if they really depend on it)? Best regards, Mario Weilguni
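A sketch of the escape-sequence loophole being discussed, assuming a UTF8-encoded database (the table name is invented):

```sql
-- The octal escapes smuggle a byte sequence that is not valid UTF-8
-- past encoding validation; at the time of this thread it was accepted:
CREATE TABLE enc_test (t text);
INSERT INTO enc_test VALUES (E'\377\377');

-- pg_dump later emits those raw bytes as-is, and restoring the dump
-- fails with "invalid byte sequence for encoding UTF8" -- the
-- failing-restore problem a DBA first discovers much too late.
```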
[HACKERS] Question: pg_class attributes and race conditions ?
What is the safest way to access/modify a pg_class attribute and still avoid any race conditions with the other backends? A specific example is: To solve the CREATE INDEX problem with HOT, I am thinking of adding (along with other things) a pg_class boolean attribute, say hot_update_enable. All backends are supposed to check this attribute before they perform an UPDATE. The attribute would usually be available in relation->rd_rel. My understanding is that the backend which sets this attribute must first acquire a lock on the heap relation of sufficient strength so as to ensure that there are no concurrent UPDATErs, update the pg_class row and then release the lock on the relation. This would ensure that no backend has a stale Relation pointer with a stale value of hot_update_enable. Also, should I use heap_inplace_update() rather than simple_heap_update(), because I want other backends to see the change immediately irrespective of their snapshot? Is this a fair analysis? Are there any rules I must follow to avoid deadlocks and race conditions? I know we should not be requesting a higher grade lock while holding a lower grade lock, but are there any other restrictions/best practices? Thanks, Pavan -- EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Bug in UTF8-Validation Code?
Mario Weilguni wrote: Is there anything I can do to help with this problem? Maybe implementing a new GUC variable that turns off accepting wrong encoded sequences (so DBAs still can turn it on if they really depend on it)? I think that this should be done away with unconditionally. Or does anybody have a good point for allowing corrupt data in text columns? Maybe it is the way it is now because nobody could be bothered to add the appropriate checks... Yours, Laurenz Albe
Re: [HACKERS] Bug in UTF8-Validation Code?
Albe Laurenz wrote: Mario Weilguni wrote: Is there anything I can do to help with this problem? Maybe implementing a new GUC variable that turns off accepting wrong encoded sequences (so DBAs still can turn it on if they really depend on it)? I think that this should be done away with unconditionally. Or does anybody have a good point for allowing corrupt data in text columns? Maybe it is the way it is now because nobody could be bothered to add the appropriate checks... I agree. It's more or less an integrity violation, IMNSHO. cheers andrew
Re: [HACKERS] tsearch_core for inclusion
On Friday 16 March 2007 04:44, Teodor Sigaev wrote: I'm also concerned about the stability of the tsearch api in general wrt including it in core. Currently the recommended upgrade practice is to dump/reload without tsearch, installing the new server's version of tsearch That is because the pg_ts* tables, function names and internal API change. Putting tsearch in core eliminates a lot of such problems. For example, who notices changes in the pg_am table from release to release? Really it is developers/hackers, not the usual users I've run into problems on very vanilla setups that I am sure other users are going to run across... see the following for details: http://people.planetpostgresql.org/xzilla/index.php?/archives/291-The-pain-that-is-tsearch2-8.1-8.2-upgrading.html I don't see how the proposal is going to solve that type of problem, but maybe I am overlooking something? -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
Re: [HACKERS] My honours project - databases using dynamically attached entity-properties
[EMAIL PROTECTED] wrote: Does hstore nest? My impression is that it doesn't. Which might well not matter, of course. If what you mean is to have mappings of mappings then no. Hstore implements a data type for a (finite) mapping (a set of key - value pairs, think hash for perl folks), with operations like H1 contains H2 (in the sense that all key-value pairs in H2 are also in H1) supported by an index. Keys and values are strings. As a perl folk I think of hashes as nestable :-). Unlike hstore, the keys are strings but the values can be anything, including a hashref or arrayref. Anyway, this means that you can't use hstore to cover the same field as YAML or JSON. That doesn't mean it's not useful - far from it. cheers andrew
Re: [HACKERS] UPDATE using sub selects
NikhilS [EMAIL PROTECTED] writes: To allow both of the above to hold, I think the subselect will have to be treated like a EXPR_SUBLINK subquery. I was wondering if we have a similar mechanism for plain selects/subselects to check and restrict their output to a single row. No. Offhand I think you'd either need to relax EXPR_SUBLINK to allow multiple output columns, or invent a ROW_SUBLINK SubLinkType that is just like EXPR_SUBLINK except for allowing multiple output columns. The latter would probably be less likely to break other things... regards, tom lane
Re: [HACKERS] tsearch_core for inclusion
Teodor Sigaev [EMAIL PROTECTED] writes: Hmm, you are prompting an idea to me how to simplify usage of full text index in simple cases. CREATE INDEX idxname ON tblname USING gin (textcolumn fulltext_ops); +1 ... makes the easy cases easy, doesn't make the hard cases any harder. BTW, simple syntax sugar for CREATE INDEX (fulltext_ops) may be done: CREATE INDEX idxname ON tblname USING FULLTEXT (textcolumn) Not in favor of this, because FULLTEXT isn't an AM ... how would you know whether to use GIST or GIN? Actually, if you wanted to simplify life a bit, you could mark fulltext_ops as being the default opclass for text (and varchar I guess) under GIST and GIN. Then it reduces to just CREATE INDEX idxname ON tblname USING gin (textcolumn); regards, tom lane
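Tom's default-opclass simplification could be sketched like this; fulltext_ops is still the hypothetical opclass from this thread, and a real GIN opclass definition also needs its support functions (extractValue, extractQuery, etc.), omitted here:

```sql
-- If fulltext_ops were created as the default opclass for text under gin:
CREATE OPERATOR CLASS fulltext_ops
    DEFAULT FOR TYPE text USING gin AS
    OPERATOR 1 @@ (text, tsquery);  -- illustrative strategy entry only

-- ...then the opclass name could be omitted entirely:
CREATE INDEX idxname ON tblname USING gin (textcolumn);
```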
Re: [HACKERS] tsearch_core for inclusion
I don't see how the proposal is going to solve that type of problem, but maybe I am overlooking something? The same way as for other system table objects: they aren't dumped and aren't restored. In 8.3, it seems, the API to index AMs will be changed - will anybody except pghackers see that? New opclass layout, new opfamily table - users don't see those changes at all. -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/
Re: [HACKERS] Question: pg_class attributes and race conditions ?
Pavan Deolasee wrote: What is the safest way to access/modify the pg_class attribute and still avoid any race conditions with the other backends? A specific example is: To solve the CREATE INDEX problem with HOT, I am thinking of adding (along with other things) a pg_class boolean attribute, say hot_update_enable. All backends are supposed to check this attribute before they perform an UPDATE. The attribute would usually be available in relation->rd_rel. My understanding is that the backend which sets this attribute must first acquire a lock on the heap relation of sufficient strength so as to ensure that there are no concurrent UPDATErs, update the pg_class row and then release the lock on the relation. This would ensure that no backend has a stale Relation pointer with a stale value of hot_update_enable. FWIW this is pretty much the same thing I wanted to do with setting relfrozenxid to FrozenTransactionId. To this end I wrote a patch to add a catalog pg_ntclass (later renamed to pg_class_nt), which was ultimately rejected for reasons I can't remember at the moment. Maybe it would be illuminating to investigate that -- please see the archives. (I still think it would be good to have a pg_class_nt catalog, so it's not a dead idea). -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] UPDATE using sub selects
Hi, On 3/16/07, Tom Lane [EMAIL PROTECTED] wrote: NikhilS [EMAIL PROTECTED] writes: To allow both of the above to hold, I think the subselect will have to be treated like an EXPR_SUBLINK subquery. I was wondering if we have a similar mechanism for plain selects/subselects to check and restrict their output to a single row. No. Offhand I think you'd either need to relax EXPR_SUBLINK to allow multiple output columns, or invent a ROW_SUBLINK SubLinkType that is just like EXPR_SUBLINK except for allowing multiple output columns. The latter would probably be less likely to break other things... Yeah, I was looking at EXPR_SUBLINK and its single-column use case and arrived at the same conclusion, that inventing a new sublink type would be better. It is indeed becoming a not so simple and narrow fix, as you had mentioned earlier in your first response :) Regards, Nikhils -- EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Lock table in non-volatile functions
Gaetano Mendola [EMAIL PROTECTED] writes: I'm observing that it is not allowed to LOCK a table in a STABLE/IMMUTABLE function, but at the same time a SELECT FOR UPDATE is allowed. Really? AFAICS, CommandIsReadOnly() will reject SELECT FOR UPDATE too. regards, tom lane
Re: [HACKERS] tsearch_core for inclusion
Teodor Sigaev wrote: CREATE INDEX idxname ON tblname USING gin (textcolumn fulltext_ops); The fulltext_ops opclass parses the document similarly to to_tsvector and stores the lexemes in the gin index. It's a full equivalent of CREATE INDEX ... ( to_tsvector( textcolumn ) ) And let us define the operation text @ text, which is equivalent to text @@ plainto_tsquery(text), so queries will look like SELECT * FROM tblname WHERE textcolumn @ textquery; fulltext_ops can speed up both operations, text @@ tsquery and text @ text, because the gin API has an extractQuery method which is called once per index scan and can parse the query into lexemes. One disadvantage: with this approach it isn't possible to make fast ranking - there is no stored parsed text. And fulltext_ops may be done for the GiST index too, but the fulltext opclass will be lossy, which means slow search due to reparsing the texts for each index match. Just a thought: If the patch that implements the GENERATED ALWAYS syntax is accepted, then creating a separate field that holds the parsed text, and an index on that column, becomes as easy as: alter table t1 add column text_parsed generated always as to_tsvector(text); create index idx on t1 using gin (text_parsed fulltext_ops); I know that there is a trigger function in tsearch that supports something similar, but I really like the simplicity of the statements above. On a related note - will to_tsvector and to_tsquery be renamed to something like ft_parse_text() and ft_parse_query() if tsearch2 goes into core? It seems like the ts part of those names would be the only reference left to the name tsearch if they are not, which could be somewhat confusing for users. greetings, Florian Pflug
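The trigger-based idiom Florian alludes to is what the tsearch2 contrib module already provides; a rough sketch follows, with illustrative column names, assuming the contrib trigger function tsearch2() maintains the tsvector column:

```sql
-- Today's approach: a materialized tsvector column kept up to date by
-- a trigger, instead of the proposed GENERATED ALWAYS column.
ALTER TABLE t1 ADD COLUMN text_parsed tsvector;
UPDATE t1 SET text_parsed = to_tsvector(text);

CREATE TRIGGER t1_tsv_update
    BEFORE INSERT OR UPDATE ON t1
    FOR EACH ROW
    EXECUTE PROCEDURE tsearch2(text_parsed, text);

CREATE INDEX idx ON t1 USING gin (text_parsed);
```

Unlike the on-the-fly fulltext_ops idea, the stored tsvector also allows fast ranking, at the cost of the extra column.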
Re: [HACKERS] Question: pg_class attributes and race conditions ?
Pavan Deolasee [EMAIL PROTECTED] writes: My understanding is that the backend which sets this attribute must first acquire a lock on the heap relation of sufficient strength so as to ensure that there are no concurrent UPDATErs, update the pg_class row and then release the lock on the relation. In what context are you proposing to do that, and won't this high-strength lock in itself lead to deadlocks? The whole thing sounds exceedingly ugly anyway --- for example what happens if the backend doing the CREATE INDEX fails and is therefore unable to clear the flag again? regards, tom lane
Re: [HACKERS] tsearch_core for inclusion
On a related note - will to_tsvector and to_tsquery be renamed to something like ft_parse_text() and ft_parse_query() if tsearch2 goes Furthering this... perhaps even: ft_search() ft_query() Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
Re: [HACKERS] tsearch_core for inclusion
Tom Lane wrote: Actually, if you wanted to simplify life a bit, you could mark fulltext_ops as being the default opclass for text (and varchar I guess) under GIST and GIN. Then it reduces to just CREATE INDEX idxname ON tblname USING gin (textcolumn); Nice. This gets my vote. cheers andrew
Re: [HACKERS] tsearch_core for inclusion
On Fri, 16 Mar 2007, Joshua D. Drake wrote: On a related note - will to_tsvector and to_tsquery be renamed to something like ft_parse_text() and ft_parse_query() if tsearch2 goes Furthering this... perhaps even: ft_search() ft_query() ts_ means Text Search; I don't think ft_ (Full Text) is better. Going further, it should be fts_ (Full Text Search), but we have many concerns about compatibility and stability of the api, so I'd prefer to stay with ts_. More important is what syntax we should accept for the final patch: the originally proposed or the modified version? On http://mira.sai.msu.su/~megera/pgsql/ftsdoc/fts-syntax-compare.html they are D.1.2 and D.1.3. As I understood, Teodor's proposal about the CREATE INDEX command is ok for all. Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
Re: [HACKERS] tsearch_core for inclusion
Florian G. Pflug wrote: Teodor Sigaev wrote: CREATE INDEX idxname ON tblname USING gin (textcolumn fulltext_ops); The fulltext_ops opclass parses the document similarly to to_tsvector and stores the lexemes in the gin index. It's a full equivalent of CREATE INDEX ... ( to_tsvector( textcolumn ) ) And let us define the operation text @ text, which is equivalent to text @@ plainto_tsquery(text), so queries will look like SELECT * FROM tblname WHERE textcolumn @ textquery; fulltext_ops can speed up both operations, text @@ tsquery and text @ text, because the gin API has an extractQuery method which is called once per index scan and can parse the query into lexemes. One disadvantage: with this approach it isn't possible to make fast ranking - there is no stored parsed text. And fulltext_ops may be done for the GiST index too, but the fulltext opclass will be lossy, which means slow search due to reparsing the texts for each index match. Just a thought: If the patch that implements the GENERATED ALWAYS syntax is accepted, then creating a separate field that holds the parsed text, and an index on that column, becomes as easy as: alter table t1 add column text_parsed generated always as to_tsvector(text); create index idx on t1 using gin (text_parsed fulltext_ops); or to take Tom's idea into consideration: ALTER TABLE t1 ADD COLUMN text_parsed GENERATED ALWAYS AS to_tsvector(text); CREATE INDEX idxname ON t1 USING gin (text_parsed); which looks pretty nice and simple to me I know that there is a trigger function in tsearch that supports something similar, but I really like the simplicity of the statements above. On a related note - will to_tsvector and to_tsquery be renamed to something like ft_parse_text() and ft_parse_query() if tsearch2 goes into core? It seems like the ts part of those names would be the only reference left to the name tsearch if they are not, which could be somewhat confusing for users.
Well, either renaming those functions (and completely destroying the upgrade path for any current users) or just referring to it as "text search" in the docs (so that the prefix makes sense). Stefan
Re: [HACKERS] tsearch_core for inclusion
Oleg Bartunov wrote: On Fri, 16 Mar 2007, Joshua D. Drake wrote: On a related note - will to_tsvector and to_tsquery be renamed to something like ft_parse_text() and ft_parse_query() if tsearch2 goes Furthering this... perhaps even: ft_search() ft_query() ts_ means Text Search; I don't think ft_ (Full Text) is better. Going further, it should be fts_ (Full Text Search), but we have many concerns about compatibility and stability of the api, so I'd prefer to stay with ts_. Hm, so it could be fts_parse_query() and fts_parse_text(). You could alias them to to_tsvector() and to_tsquery() to achieve API compatibility. I agree that the names of these functions are really a minor issue, and API compatibility is more important. But confusing names can be the source of a lot of errors for new users, so there *is* a point in naming things consistently. And if the cost is basically an entry in pg_proc, why not do it? greetings, Florian Pflug
Re: [HACKERS] Question: pg_class attributes and race conditions ?
Tom Lane wrote: In what context are you proposing to do that, and won't this high-strength lock in itself lead to deadlocks? The whole thing sounds exceedingly ugly anyway --- for example what happens if the backend doing the CREATE INDEX fails and is therefore unable to clear the flag again? Let me state the problem and a vague solution I am thinking of. I would appreciate comments and suggestions. The major known issue left with HOT is support for CREATE INDEX and CREATE INDEX CONCURRENTLY. The problem is with HEAP_ONLY tuples in the heap which do not have index entries in the existing indexes. When we build a new index, some or all of the HEAP_ONLY tuples may need index entries in the new index. It would be very ugly if we tried to keep the existing indexes without index entries for those tuples. A clean solution would be to add index entries for the HEAP_ONLY tuples in the existing indexes and break all the HOT-chains. I will leave out the details and rather explain what I have in mind at a high level. Any help to fill in the details or any suggestions to do things differently would immensely help. This is what I have in mind: In the context of CREATE INDEX [CONCURRENTLY], we first disable HOT-updates on the table. This would ensure that no new HOT tuples are added while we CHILL the heap. (How do we do this?) We then start scanning the heap and start building the new index. If a HEAP_ONLY tuple is found which needs to be indexed, we mark the tuple with a CHILL_IN_PROGRESS flag and insert index entries into all the existing indexes. (The buffer is exclusively locked and the operation is WAL logged.) We do this until the entire heap is scanned. At this point, we would have inserted the missing index entries for the HEAP_ONLY tuples. Until this point, we don't use the direct index entries to fetch the HEAP_ONLY tuples, to avoid duplicate fetches of the same tuple. We now wait for all the concurrent index scans to end and then disable the HOT-chain following logic for fetching tuples. (How do we do this?) At this point, all index scans would ONLY use the direct path from the index to fetch tuples. The HOT-chains are not followed, to avoid duplicate fetches of the same tuple. A second pass over the heap is now required to clear the CHILL_IN_PROGRESS, HEAP_ONLY and HEAP_HOT_UPDATED flags. At the end of this step, all the indexes and the table are in sync. Once again we need to ensure that there are no concurrent index scans in progress and then enable HOT-fetch. Also, HOT-updates can be turned on again. If CREATE INDEX crashes, VACUUM is required to clear the CHILL_IN_PROGRESS flags, and the corresponding index entries are removed. Since VACUUM runs mutually exclusive to CREATE INDEX, we don't need any special mechanism to handle race conditions between them. There are some other details, like running multiple CREATE INDEXes in parallel and still being able to CHILL the table safely. Maybe one of them needs to act as the chiller and the others wait for it to finish successfully. Any thoughts on the overall approach? Any suggestions to simplify things or any alternate designs? Could something as simple as CHILLing the table while holding a VACUUM FULL strength lock be acceptable? Thanks, Pavan -- EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Question: pg_class attributes and race conditions ?
Pavan Deolasee [EMAIL PROTECTED] writes: Any thoughts on the overall approach ? Fragile and full of race conditions :-(. I thought from the beginning that CREATE INDEX might be a showstopper for the whole HOT concept, and it's starting to look like that's the case. I think what we need to get away from is the assumption that HOT-ness for one index is the same as HOT-ness for all. What if we only applied HOT to primary-key indexes, so that there was certainly not more than one index per table that the property applies to? regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] New Project: PostGIS application to Missing People
Luis, This is a proposal to design a new concept for an integrated PostGIS application, and to implement features to improve the tracking of information about missing people. This application would be useful in disaster scenarios: looking for missing kids, rescuing kidnapped people, human rights watch, etc. This task surely is not an easy one, but I think it would be a great service to the global community and I would really enjoy mentoring this project. That is my motivation. Per my private e-mail, I think you misunderstood. I was suggesting that you find a *student* to propose the project. Given that we, as a project, are responsible for our mentors and the outcome of SoC projects, we really can't accept mentors who are not familiar to us. For that matter, mentors don't get to choose the projects. The students do. We just choose what we want to accept. If the student proposing the project speaks Spanish, that should not be a problem. -- Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Question: pg_class attributes and race conditions ?
Tom Lane wrote: Pavan Deolasee [EMAIL PROTECTED] writes: Any thoughts on the overall approach ? Fragile and full of race conditions :-(. I thought from the beginning that CREATE INDEX might be a showstopper for the whole HOT concept, and it's starting to look like that's the case. I think what we need to get away from is the assumption that HOT-ness for one index is the same as HOT-ness for all. What if we only applied HOT to primary-key indexes, so that there was certainly not more than one index per table that the property applies to? Just to throw my two bits in here :). If we do that, how does that effect the idea that most people in the web world use (id serial primary key), even though that is not what they are searching on? More specifically, does HOT help conditions where a composite comes into play (session_id,last_active) ... which would be a more heavily updated index than just the primary key. Sincerely, Joshua D. Drake regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Question: pg_class attributes and race conditions ?
Tom Lane wrote: What if we only applied HOT to primary-key indexes, so that there was certainly not more than one index per table that the property applies to? The main objective of HOT is to enable retail vacuum of HOT-updated tuples. Doing the above would make it useless for that purpose, at least when there's more than one index on the table. Granted, there's a lot of tables with just one index out there, but it's a big limitation nevertheless. An extension of that idea, though, is to store a flag per index in the HOT-updated tuple. We would then need a mapping between bits in the tuple header and indexes, for example as a new column in pg_index. Let's assume that we'd use one byte in the tuple header to store HOT-flags. That's enough to store the HOT-status for 8 indexes. A set bit means that the index corresponding to that bit doesn't have an index pointer in it. When you CREATE INDEX, assign a bit for the new index that's not currently in use. When you scan the table to build the index, clear that bit for every tuple where it's set, and insert an index entry as usual. DROP INDEX wouldn't need to scan the heap to clear the flags, because we clear them on CREATE INDEX when necessary. If you run out of bits in the header, IOW have more than 8 indexes on a table, indexes unlucky enough to not have a bit assigned to them wouldn't be HOT-updateable. This would also enable us to skip index inserts for those indexes whose key columns are not updated, and do the index inserts as usual for the rest. The limitation that you can only retail-vacuum HOT-updated tuples when none of the indexed keys were changed remains, but we've accepted that already. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
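Heikki's per-index flag byte could work roughly like this (a hedged sketch in Python; the 8-bit budget and the bit-to-index mapping via pg_index come from the proposal, while the function names are invented):

```python
# Sketch of a per-index HOT flag byte: one bit per index, assigned at
# CREATE INDEX time via a hypothetical new pg_index column.
MAX_HOT_BITS = 8  # one spare byte in the tuple header, per the proposal

def assign_bit(used_bits):
    """On CREATE INDEX: grab a bit not currently in use. None means the
    table already has 8 flagged indexes, so this index won't be
    HOT-updateable at all."""
    for bit in range(MAX_HOT_BITS):
        if bit not in used_bits:
            used_bits.add(bit)
            return bit
    return None

def has_index_pointer(tuple_flags, bit):
    """A set bit means the corresponding index has NO pointer to the tuple;
    a clear bit means the usual direct index entry exists."""
    return not (tuple_flags >> bit) & 1

used = set()
pk_bit = assign_bit(used)       # first index gets bit 0
other_bit = assign_bit(used)    # second index gets bit 1
flags = 1 << other_bit          # a HOT update skipped only the second index
```

A table scan building a new index would then clear its own bit tuple by tuple, which is what lets DROP INDEX skip the heap scan entirely.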
Re: [HACKERS] Question: pg_class attributes and race conditions ?
Tom Lane wrote: Pavan Deolasee [EMAIL PROTECTED] writes: Any thoughts on the overall approach ? Fragile and full of race conditions :-(. Yes, it looks a bit complex. But IMHO we can get around that. Do you have any ideas in mind about doing that? I thought from the beginning that CREATE INDEX might be a showstopper for the whole HOT concept, and it's starting to look like that's the case. I remember you raised this concern very early, but I am hopeful that we will be able to solve this. Would it be acceptable to have a simple (though not the best) solution for this release and then improve on it later? As I mentioned earlier, one option is to CHILL the table, if required, holding an AccessExclusive lock, just like VACUUM FULL. I am assuming here that CREATE INDEX is not such a common activity; isn't that true? I think what we need to get away from is the assumption that HOT-ness for one index is the same as HOT-ness for all. What if we only applied HOT to primary-key indexes, so that there was certainly not more than one index per table that the property applies to? I think that will take away the ability to reuse HEAP_ONLY tuples without vacuuming the heap and index. Thanks, Pavan -- EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Question: pg_class attributes and race conditions ?
On Fri, 2007-03-16 at 21:56 +0530, Pavan Deolasee wrote: Any thoughts on the overall approach ? Any suggestions to simplify things or any alternate designs ?

Well, your design is very different from what we discussed, so I think I should post my proposed design alongside this, for further discussion.

Proposed solutions for CREATE INDEX and CREATE INDEX CONCURRENTLY:

CREATE INDEX CONCURRENTLY has no feature changes to work with HOT. CREATE INDEX works normally, except when HOT tuples are found, in which case special processing occurs requiring additional locking. The index build occurs in a single scan, as now.

CREATE INDEX CONCURRENTLY - We perform no chilling during the initial scan. We index the tuple identified by SnapshotNow, but we take the root tuple's htid, not the htid of the tuple version being indexed. We assume that the tuple version indexed will be the root of the HOT chain by the time the index is complete. Currently, we wait until all pre-existing transactions exit before we allow this to COMMIT. With HOT, we simply move the wait so it occurs *before* the second scan; then we can prune the HOT chains as we pass through the heap on the second scan. There will be no pre-existing HOT tuples and so no chilling is required.

CREATE INDEX - We add a field, xchill, to pg_class that stores TransactionIds. This is set to InvalidTransactionId if no index has been built yet. The concept of a main indexer is introduced, so we allow multiple concurrent index builds, but only one of these can chill tuples at a time.

1. In IndexBuildHeapScan, as we pass through the table:

a) If we find any HOT rows, we check xchill and do one of steps (i-iii). Until this point, it hasn't been important whether we are the main or a secondary indexer.

i) If xchill is InvalidTransactionId or is committed, then we attempt to become main indexer immediately, following these steps before we continue building the index (1b):
-- If the table is temp, or if we created the table, then we immediately become the main indexer, so return immediately. If the table being indexed is already visible to everybody, then:
-- Update the pg_class entry for the table, setting the xchill field for the table to the builder's Xid. (Use heap_inplace_update for this, which is OK whether we commit or abort.)
-- Acquire AccessExclusiveLock on all existing indexes (not the heap).

ii) If xchill is in-progress, we wait for that transaction to complete, then do either step i) or iii). We cannot continue building our index until the other transaction commits, because we cannot yet see the other index, yet we have to insert into it in order to correctly chill a tuple to allow *our* index to be built.

iii) If xchill is aborted, we abort also, saying that a VACUUM is needed.

b) If we get here then we are the main indexer and can chill tuples. As we move through the scan we chill all HOT tuples, mark them HEAP_CHILL_IN_PROGRESS, write WAL for this, and insert index entries for them in all existing indexes, as well as this one. Then remove the CHILL_IN_PROGRESS flags, without writing WAL.

c) Release locks on indexes, before end of transaction.

2. If we crash or a transaction abort occurs:
- We cannot prune a HEAP_ONLY_TUPLE that points to a tuple with HEAP_CHILL_IN_PROGRESS.
- VACUUM must be used to clean up after an aborted index build, and needs some additional code to allow this to occur.

3. Concurrent index builds are allowed. If we are not the main indexer, then we can attempt to build an index, but any scan that sees a HOT tuple will block and wait for the main index builder to complete before it proceeds.

4. When an indexscan reads the table, if it finds a HEAP_CHILL_IN_PROGRESS tuple it may or may not be valid. Concurrent index scans and tuple chilling can mean that an index scan finds the same tuple twice, by different routes, if a CREATE INDEX crashed. To avoid this, an IndexScan will only consider a HEAP_CHILL_IN_PROGRESS tuple visible if it came across it via the indirect route, i.e. it followed the path from the root HOT tuple.

In this design, CREATE INDEX does have a deadlock risk when it is used within a transaction *and* the index is being built on a publicly visible table (i.e. not just-built and not temp). IMHO that risk is acceptable, since if users are worried about concurrent access to a table during CREATE INDEX they can use CREATE INDEX CONCURRENTLY.

-- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Question: pg_class attributes and race conditions ?
Joshua D. Drake [EMAIL PROTECTED] writes: Just to throw my two bits in here :). If we do that, how does that effect the idea that most people in the web world use (id serial primary key), even though that is not what they are searching on? affect. But I think you're right that generally you'll have two indexes. More specifically, does HOT help conditions where a composite comes into play (session_id,last_active) ... which would be a more heavily updated index than just the primary key. Well, if you're indexing a column that you're updating then you've already failed your saving throw. The case we're trying to deal with is when you're updating columns that *aren't* indexed, and therefore really don't need redundant index pointers for each tuple version whose indexed values are identical to the old version's. Especially since those index pointers are what's preventing us from vacuuming the old tuple versions. If you are updating an index key then there's no question you're going to need vacuum to clean out your index. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Bug in UTF8-Validation Code?
On Tue, 2007-03-13 at 12:00 +0100, Mario Weilguni wrote: Hi, I've a problem with a database, I can dump the database to a file, but restoration fails, happens with 8.1.4. I reported the same problem a while back: http://archives.postgresql.org/pgsql-bugs/2006-10/msg00246.php Some people think it's a bug, some people don't. It is technically documented behavior, but I don't think the documentation is clear enough. I think it is a bug that should be fixed, and here's another message in the thread that expresses my opinion: http://archives.postgresql.org/pgsql-bugs/2006-11/msg00033.php If you look at that email, it includes some examples of surprising behaviors caused by that bug, particularly with bytea. In some applications (for which it's impractical to change the source code), I actually use a CHECK constraint (which raises an exception on invalid utf8 data) on every text column so that some binary data doesn't break my slony replication. I'd like to see this fixed. Regards, Jeff Davis ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
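The CHECK-constraint workaround Jeff describes amounts to validating every text value as well-formed UTF-8 on the way in. The same test is easy to express outside the database (a Python sketch of the idea, not the actual constraint function he uses):

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Accept exactly the byte strings that are well-formed UTF-8,
    mirroring what a strict server-side encoding check would allow."""
    try:
        raw.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False

# A lead byte like 0xC3 promises a continuation byte (0x80-0xBF);
# following it with 0x28 ('(') makes the sequence invalid.
```

A server-side CHECK constraint performing the equivalent test is what keeps binary-ish data from entering text columns in the first place, so dumps and Slony replication streams stay restorable.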
Re: [HACKERS] Question: pg_class attributes and race conditions ?
Heikki Linnakangas wrote: Tom Lane wrote: What if we only applied HOT to primary-key indexes, so that there was certainly not more than one index per table that the property applies to? The main objective of HOT is to enable retail vacuum of HOT-updated tuples. Doing the above would make it useless for that purpose, at least when there's more than one index on the table. Granted, there's a lot of tables with just one index out there, but it's a big limitation nevertheless. Agree. An extension of that idea, though, is to store a flag per index in the HOT-updated tuple. We would then need a mapping between bits in the tuple header and indexes, for example as a new column in pg_index. I like the idea. The major objection would be that it adds a byte to the tuple header, which, when considered along with the null bitmap, may actually make the header 8 bytes larger in the worst case. I am also worried about the additional complexity introduced with this. We can and should work on this idea; I am wondering whether it would be too much to do before the feature freeze. I am personally inclined towards doing something simpler to tackle the CREATE INDEX issue at the moment. But if that is not acceptable, and/or you or anyone else is willing to help me on this, we can work on a better solution. Thanks, Pavan -- EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Bug in UTF8-Validation Code?
On Wed, 2007-03-14 at 01:29 -0600, Michael Fuhr wrote: On Tue, Mar 13, 2007 at 04:42:35PM +0100, Mario Weilguni wrote: On Tuesday, 13 March 2007 16:38, Joshua D. Drake wrote: Is this any different than the issues of moving 8.0.x to 8.1 UTF8? Where we had to use iconv? What issues? I've upgraded several 8.0 databases to 8.1 without having to use iconv. Did I miss something? http://www.postgresql.org/docs/8.1/interactive/release-8-1.html Some users are having problems loading UTF-8 data into 8.1.X. This is because previous versions allowed invalid UTF-8 byte sequences to be entered into the database, and this release properly accepts only valid UTF-8 sequences. One way to correct a dumpfile is to run the command iconv -c -f UTF-8 -t UTF-8 -o cleanfile.sql dumpfile.sql. If the above quote were actually true, then Mario wouldn't be having a problem. Instead, it's half-true: invalid byte sequences are rejected in some situations and accepted in others. If postgresql consistently rejected or consistently accepted invalid byte sequences, that would not cause problems with COPY (meaning problems with pg_dump, slony, etc.). Regards, Jeff Davis ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Question: pg_class attributes and race conditions ?
On Fri, 2007-03-16 at 12:40 -0400, Tom Lane wrote: Pavan Deolasee [EMAIL PROTECTED] writes: Any thoughts on the overall approach ? Fragile and full of race conditions :-(. I thought from the beginning that CREATE INDEX might be a showstopper for the whole HOT concept, and it's starting to look like that's the case. Seems like we can fix all but some strange CREATE INDEX use cases. Since we have CREATE INDEX CONCURRENTLY, seems like HOT is a showstopper for the whole CREATE INDEX concept. I think what we need to get away from is the assumption that HOT-ness for one index is the same as HOT-ness for all. Sounds interesting. I'd not considered that before. What if we only applied HOT to primary-key indexes, so that there was certainly not more than one index per table that the property applies to? On its own, I don't think this is a sufficiently wide use-case. Perhaps we should do this PLUS make HOT-semantics optional for each additional index. i.e. HOT is always enforced on primary indexes and optionally on other indexes (but not by default). If you accept the HOT option on an index, you then accept the additional issues surrounding chilling tuples. Bear in mind that there aren't any at all if you use CREATE INDEX CONCURRENTLY and many other cases. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Question: pg_class attributes and race conditions ?
Gregory Stark wrote: Joshua D. Drake [EMAIL PROTECTED] writes: Just to throw my two bits in here :). If we do that, how does that effect the idea that most people in the web world use (id serial primary key), even though that is not what they are searching on? affect. But I think you're right that generally you'll have two indexes. More specifically, does HOT help conditions where a composite comes into play (session_id,last_active) ... which would be a more heavily updated index than just the primary key. Well if you're indexing a column that you're updating then you've already failed your saving throw. Just for everyone who missed this: Greg Stark obviously spends his time playing (or at some time played) D&D. I have an Epic-level Sorcerer, how about you Greg? ;) Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] Buildfarm feature request: some way to track/classify failures
The current buildfarm webpages make it easy to see when a branch tip is seriously broken, but it's not very easy to investigate transient failures, such as a regression test race condition that only materializes once in a while. I would like to have a way of seeing just the failed build attempts across all machines running a given branch. Ideally it would be possible to tag failures as to the cause (if known) and/or symptom pattern, and then be able to examine just the ones without known cause or having similar symptoms. I'm not sure how much of this is reasonable to try to do with webpages similar to what we've got. But the data is all in a database AIUI, so another possibility is to do this work via SQL. That'd require having the ability to pull the information from the buildfarm database so someone else could manipulate it. So I guess the first question is can you make the build data available, and the second is whether you're interested in building more flexible views or just want to let someone else do that. Also, if anyone does make an effort to tag failures, it'd be good to somehow push that data back into the master database, so that we don't end up duplicating such work. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [PATCHES] Bitmapscan changes
Heikki Linnakangas wrote: Joshua D. Drake wrote: This is what I suggest. Provide a tarball of -head with the patch applied. Here you are: http://community.enterprisedb.com/git/pgsql-git-20070315.tar.gz Provide a couple of use cases that can be run with explanation of how to verify the use cases. There's a number of simple test cases on the web page that I've used (perfunittests). I can try to simplify them and add explanations. O.k. maybe I am the only one, but I actually dug through the archives for what website you were talking about and then said, "Aha!, he means: http://community.enterprisedb.com/git/". So I will accept my own paper bag, and hopefully save some from the same fate by posting the above link. Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] [PATCHES] Bitmapscan changes
Heikki Linnakangas wrote: Joshua D. Drake wrote: This is what I suggest. Provide a tarball of -head with the patch applied. Here you are: http://community.enterprisedb.com/git/pgsql-git-20070315.tar.gz Provide a couple of use cases that can be run with explanation of how to verify the use cases. There's a number of simple test cases on the web page that I've used (perfunittests). I can try to simplify them and add explanations. This URL is not working: http://community.enterprisedb.com/git/git-perfunittests-20070222.tar.gz File not found. Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Buildfarm feature request: some way to track/classify failures
Tom Lane wrote: The current buildfarm webpages make it easy to see when a branch tip is seriously broken, but it's not very easy to investigate transient failures, such as a regression test race condition that only materializes once in a while. I would like to have a way of seeing just the failed build attempts across all machines running a given branch. Ideally it would be possible to tag failures as to the cause (if known) and/or symptom pattern, and then be able to examine just the ones without known cause or having similar symptoms. I'm not sure how much of this is reasonable to try to do with webpages similar to what we've got. But the data is all in a database AIUI, so another possibility is to do this work via SQL. That'd require having the ability to pull the information from the buildfarm database so someone else could manipulate it. So I guess the first question is can you make the build data available, and the second is whether you're interested in building more flexible views or just want to let someone else do that. Also, if anyone does make an effort to tag failures, it'd be good to somehow push that data back into the master database, so that we don't end up duplicating such work. If the data is already there and just not represented, just let me know exactly what you want and I will implement pages for that data happily. Joshua D. Drake regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [PATCHES] Bitmapscan changes
Joshua D. Drake wrote: This URL is not working: http://community.enterprisedb.com/git/git-perfunittests-20070222.tar.gz Sorry about that, typo in the filename. Fixed. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Buildfarm feature request: some way to track/classify failures
Tom Lane wrote: The current buildfarm webpages make it easy to see when a branch tip is seriously broken, but it's not very easy to investigate transient failures, such as a regression test race condition that only materializes once in a while. I would like to have a way of seeing just the failed build attempts across all machines running a given branch. Ideally it would be possible to tag failures as to the cause (if known) and/or symptom pattern, and then be able to examine just the ones without known cause or having similar symptoms. I'm not sure how much of this is reasonable to try to do with webpages similar to what we've got. But the data is all in a database AIUI, so another possibility is to do this work via SQL. That'd require having the ability to pull the information from the buildfarm database so someone else could manipulate it. So I guess the first question is can you make the build data available, and the second is whether you're interested in building more flexible views or just want to let someone else do that. Also, if anyone does make an effort to tag failures, it'd be good to somehow push that data back into the master database, so that we don't end up duplicating such work. Well, the db is currently running around 13Gb, so that's not something to be exported lightly ;-) If we upgraded from Postgres 8.0.x to 8.2.x we could make use of some features, like dynamic partitioning and COPY from queries, that might make life easier (CP people: that's a hint :-) ) I don't want to fragment effort, but I also know CP don't want open access, for obvious reasons. We can also look at a safe API that we could make available freely. I've already done this over SOAP (see example client at http://people.planetpostgresql.org/andrew/index.php?/archives/14-SOAP-server-for-Buildfarm-dashboard.html ). Doing updates is a whole other matter, of course. Lastly, note that some buildfarm enhancements are on the SOC project list.
I have no idea if anyone will express any interest in that, of course. It's not very glamorous work. cheers andrew ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] tsearch_core for inclusion
On Friday 16 March 2007 10:45, Teodor Sigaev wrote: I don't see how the proposal is going to solve that type of problem, but maybe I am overlooking something? The same way as other system table objects: they don't dump, they don't restore. In 8.3, it seems, the API to index AMs will be changed - will anybody except pghackers see that? New opclass layout, new opfamily table - users don't see those changes at all. If I have configured my tsearch install for spanish (spanish dictionary/stemmers/synonyms/etc...) aren't I going to lose all that on the next database upgrade? -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Buildfarm feature request: some way to track/classify failures
Well, the db is currently running around 13Gb, so that's not something to be exported lightly ;-) If we upgraded from Postgres 8.0.x to 8.2.x we could make use of some features, like dynamic partitioning and copy from queries, that might make life easier (CP people: that's a hint :-) ) Yeah, Yeah... I need to get you off that machine as a whole :) Which is on the list but I am waiting for 8.3 *badda bing*. Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Buildfarm feature request: some way to track/classify failures
Andrew Dunstan [EMAIL PROTECTED] writes: Well, the db is currently running around 13Gb, so that's not something to be exported lightly ;-) Yeah. I would assume though that the vast bulk of that is captured log files. For the purposes I'm imagining, it'd be sufficient to export only the rest of the database --- or ideally, records including all the other fields and a URL for each log file. For the small number of log files you actually need to examine, you'd chase the URL. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [PATCHES] Bitmapscan changes
Heikki Linnakangas wrote: Joshua D. Drake wrote: This URL is not working: http://community.enterprisedb.com/git/git-perfunittests-20070222.tar.gz Sorry about that, typo in the filename. Fixed. Here are my results on a modest 3800X2 2 Gig of ram, RAID 1 dual SATA http://pgsql.privatepaste.com/170yD8c0gr Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Question: pg_class attributes and race conditions ?
Simon Riggs wrote: What if we only applied HOT to primary-key indexes, so that there was certainly not more than one index per table that the property applies to? On its own, I don't think this is a sufficiently wide use-case. Perhaps we should do this PLUS make HOT-semantics optional for each additional index, i.e. HOT is always enforced on primary indexes and optionally on other indexes (but not by default). Here's a very simple, low-tech idea: how about checking whether the new index requires chilling tuples; if it does, then elog(ERROR) until all the indexes have been manually chilled, which would be done with an ALTER INDEX ... CHILL command or something like that. Only when all indexes are known to be chilled can you create another one, and then the user can hotify indexes as appropriate. (Disclaimer: I haven't followed the HOT design closely enough to know if this makes sense.) -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
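In terms of commands, Alvaro's low-tech idea would play out something like this (hypothetical syntax throughout -- neither the error message nor ALTER INDEX ... CHILL exists; this is only a sketch of the proposed user-visible workflow):

```sql
-- With un-chilled HOT tuples present, creating a new index simply fails:
CREATE INDEX idx_new ON foo (bar);
-- ERROR:  cannot create index: table "foo" has un-chilled HOT tuples

-- The DBA chills each existing index by hand, then retries:
ALTER INDEX foo_pkey CHILL;
CREATE INDEX idx_new ON foo (bar);
```

The appeal of this variant is that the expensive chilling work happens only on explicit request, outside the CREATE INDEX code path itself.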
Re: [HACKERS] [PATCHES] Bitmapscan changes
Joshua D. Drake wrote: Heikki Linnakangas wrote: Joshua D. Drake wrote: This URL is not working: http://community.enterprisedb.com/git/git-perfunittests-20070222.tar.gz Sorry about that, typo in the filename. Fixed. Here are my results on a modest 3800X2 2 Gig of ram, RAID 1 dual SATA Thanks for looking into this, though that test alone doesn't really tell us anything. You'd have to run the same tests with and without clustered indexes enabled, and compare. With the default settings the test data fits in memory anyway, so you're not seeing the I/O benefit but only the CPU overhead. Attached is a larger test case with a data set of 2 GB. Run the git_demo_init.sql first to create tables and indexes, and git_demo_run.sql to perform selects on them. The test runs for quite a long time, depending on your hardware, and prints the time spent on the selects, with and without clustered index. You'll obviously need to run it with the patch applied. I'd suggest enabling stats_block_level to see the effect on buffer cache hit/miss ratio. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com git_demo.tar.gz Description: GNU Zip compressed data ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
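The measurement setup Heikki suggests might look like this (a sketch using the 8.2-era GUC name; `stats_block_level` was later folded into `track_counts`/`track_io_timing`-style settings):

```sql
-- Enable block-level statistics so the pg_statio_* views are populated
-- (requires superuser when set per-session; otherwise set it in
-- postgresql.conf and reload):
SET stats_block_level = on;

-- Reset the counters so the hit/miss ratio reflects only the test run:
SELECT pg_stat_reset();

-- ... run git_demo_run.sql here ...

-- Afterwards, inspect the buffer cache behaviour per table:
SELECT relname, heap_blks_read, heap_blks_hit,
       round(heap_blks_hit::numeric
             / nullif(heap_blks_hit + heap_blks_read, 0), 3) AS hit_ratio
FROM pg_statio_user_tables;
```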
Re: [HACKERS] Buildfarm feature request: some way to track/classify failures
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Well, the db is currently running around 13Gb, so that's not something to be exported lightly ;-) Yeah. I would assume though that the vast bulk of that is captured log files. For the purposes I'm imagining, it'd be sufficient to export only the rest of the database --- or ideally, records including all the other fields and a URL for each log file. For the small number of log files you actually need to examine, you'd chase the URL. OK, for anyone that wants to play, I have created an extract that contains a summary of every non-CVS-related failure we've had. It's a single table looking like this:

CREATE TABLE mfailures (
    sysname text,
    snapshot timestamp without time zone,
    stage text,
    conf_sum text,
    branch text,
    changed_this_run text,
    changed_since_success text,
    log_archive_filenames text[],
    build_flags text[]
);

The dump is just under 1Mb and can be downloaded from http://www.pgbuildfarm.org/mfailures.dump If this is useful we can create it or something like it on a regular basis (say nightly). The summary log for a given build can be got from: http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=sysname&dt=snapshot To look at the log for a given run stage select http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=sysname&dt=snapshot&stg=stagename - the stage names available (if any) are the entries in log_archive_filenames, stripped of the .log suffix. We can make these available over an API that isn't plain http if people want. Or we can provide a version of the buildlog that is stripped of the html. cheers andrew ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
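Given the schema above, the kind of failure classification the thread is after could start with queries like these (illustrative only; the animal name is hypothetical):

```sql
-- Which build stages fail most often, per branch?
SELECT branch, stage, count(*) AS failures
FROM mfailures
GROUP BY branch, stage
ORDER BY failures DESC;

-- Recent failures for one buildfarm member ('some_animal' is a placeholder):
SELECT snapshot, stage, branch
FROM mfailures
WHERE sysname = 'some_animal'
ORDER BY snapshot DESC
LIMIT 20;
```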
Re: [HACKERS] Buildfarm feature request: some way to track/classify failures
On Fri, 16 Mar 2007, Andrew Dunstan wrote: OK, for anyone that wants to play, I have created an extract that contains a summary of every non-CVS-related failure we've had. It's a single table looking like this: CREATE TABLE mfailures ( sysname text, snapshot timestamp without time zone, stage text, conf_sum text, branch text, changed_this_run text, changed_since_success text, log_archive_filenames text[], build_flags text[] ); Sweet. Should be interesting to look at. The dump is just under 1Mb and can be downloaded from http://www.pgbuildfarm.org/mfailures.dump Sure about that?

--14:45:45--  http://www.pgbuildfarm.org/mfailures.dump
           => `mfailures.dump'
Resolving www.pgbuildfarm.org... 207.173.203.146
Connecting to www.pgbuildfarm.org|207.173.203.146|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 9,184,142 (8.8M) [text/plain]

-- BOO! We changed Coke again! BLEAH! BLEAH! ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Question: pg_class attributes and race conditions ?
On Fri, 2007-03-16 at 16:59 -0400, Alvaro Herrera wrote: Here's a very simple, low-tech idea. How about checking whether the new index requires chilling tuples; if it does, then elog(ERROR) until all the indexes have been manually chilled, which would be done with an ALTER INDEX ... CHILL command or something like that. Only when all indexes are known chilled, you can create another one, and then the user can hotify indexes as appropriate. Well, I've spent two weeks searching for a design that does CREATE INDEX without changing existing functionality. What's been proposed is very close, but not exact. CREATE INDEX CONCURRENTLY can work, so we're just discussing the other increasingly edgy cases. I agree some kind of compromise on CREATE INDEX seems to be required if we want HOT without some drastic reductions in function. I'm happy to go for low tech approaches, or anything really. Simple is good, so we can hit feature freeze. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] tsearch_core for inclusion
Robert Treat wrote: On Friday 16 March 2007 10:45, Teodor Sigaev wrote: I don't see how the proposal is going to solve that type of problem, but maybe I am overlooking something? The same way as other system table objects: they aren't dumped, they aren't restored. In 8.3, it seems, the API to index AMs will be changed - will anybody except pghackers see that? New opclass layout, new opfamily table - users don't see those changes at all. If I have configured my tsearch install for spanish (spanish dictionary/stemmers/synonyms/etc...) aren't I going to lose all that on the next database upgrade? I believe what Teodor meant is that the *tables* won't be dumped as such by pg_dump (the same as all other tables in pg_catalog). But pg_dump would of course need to dump the information stored in the tables - it would put some CREATE FULLTEXT ... statements into your dump. It's really the same as for any other database object like a function, a type, ... greetings, Florian Pflug ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
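For illustration of Florian's point that the configuration would be dumped as ordinary DDL rather than as raw catalog rows: the `CREATE FULLTEXT ...` syntax was still under discussion in this thread, but the shape of what eventually shipped in 8.3 (as text search objects) looks roughly like this. The specific configuration shown is a hedged example, not from the thread:

```sql
-- The kind of statements a dump would carry so a spanish setup survives
-- a dump/restore cycle (8.3-style syntax, shown only for illustration):
CREATE TEXT SEARCH CONFIGURATION my_spanish ( COPY = pg_catalog.spanish );
ALTER TEXT SEARCH CONFIGURATION my_spanish
    ALTER MAPPING FOR asciiword, word WITH spanish_stem;
```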
[HACKERS] Adding a typmod field to Const et al
A month or so back I wrote: BTW, I think a good case could be made that the core of the problem is exactly that struct Const doesn't carry typmod, and thus that we lose information about constructs like 'foo'::char(7). We should fix that, and also anywhere else in the expression tree structure where we are discarding knowledge about the typmod of a result. This has got some urgency because of Teodor's recent work on allowing user-defined types to have typmods --- we can expect massive growth in the number of scenarios where it matters. I looked into this and determined that the interesting cases seem to be:

Const: needs a struct field added
ArrayRef: ditto; but we could drop refrestype which is redundant
SubLink: EXPR and ARRAY cases should recurse to subplan target item, as exprType() does
ArrayExpr: should adopt the same behavior as Coalesce and similar nodes, ie, if all the elements show the same type/typmod then return that typmod instead of -1

With these changes, exprTypmod covers all the same cases as exprType, except for cases that demonstrably don't have a typmod, such as the result of a non-length-coercion function, or nodes that have a hardwired result type such as BOOL that doesn't take a typmod. Comments, objections? regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
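The ArrayExpr rule can be made concrete with a small example (the table is hypothetical, used only to show when element typmods agree):

```sql
-- The motivating construct: folding 'foo'::char(7) into a Const currently
-- discards the "(7)" because Const has no typmod field.
SELECT 'foo'::char(7);

-- Proposed ArrayExpr behavior, matching COALESCE and friends:
CREATE TABLE tm (a numeric(5,2), b numeric(5,2), c numeric(7,0));

SELECT ARRAY[a, b] FROM tm;  -- all elements numeric(5,2): array keeps that typmod
SELECT ARRAY[a, c] FROM tm;  -- element typmods differ: result typmod is -1
```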
Re: [HACKERS] Buildfarm feature request: some way to track/classify failures
Jeremy Drake wrote: The dump is just under 1Mb and can be downloaded from http://www.pgbuildfarm.org/mfailures.dump Sure about that? HTTP request sent, awaiting response... 200 OK Length: 9,184,142 (8.8M) [text/plain] Damn these new specs. They made me skip a digit. cheers andrew ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Adding a typmod field to Const et al
Tom Lane [EMAIL PROTECTED] writes: ArrayExpr: should adopt the same behavior as Coalesce and similar nodes, ie, if all the elements show the same type/typmod then return that typmod instead of -1 ... Comments, objections? I'm not entirely convinced by this one. Does that mean expressions like this would throw an error if col1 was declared as a numeric(1)? ARRAY[col1] || 10 -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Adding a typmod field to Const et al
Gregory Stark [EMAIL PROTECTED] writes: I'm not entirely convinced by this one. Does that mean expressions like this would throw an error if col1 was declared as a numeric(1)? ARRAY[col1] || 10 No, because the result of the || operator won't have a specific typmod. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
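Tom's answer can be restated with a sketch (hypothetical table, for illustration only):

```sql
CREATE TABLE t (col1 numeric(1));

-- Under the proposed rule ARRAY[col1] would carry typmod numeric(1),
-- but the result of the || (array append) operator is an operator
-- result, which has no specific typmod (it is -1).  So appending 10
-- does not get length-checked against numeric(1):
SELECT ARRAY[col1] || 10 FROM t;
```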
Re: [HACKERS] Lock table in non-volatile functions
Tom Lane wrote: Gaetano Mendola [EMAIL PROTECTED] writes: I'm observing that it is not allowed to LOCK a table in a STABLE/IMMUTABLE function but at the same time a SELECT FOR UPDATE is allowed. Really? AFAICS, CommandIsReadOnly() will reject SELECT FOR UPDATE too.

kalman=# select version();
                                               version
------------------------------------------------------------------------------------------------------
 PostgreSQL 8.2.3 on i686-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.1 20070105 (Red Hat 4.1.1-51)
(1 row)

kalman=# CREATE TABLE test( a integer );
CREATE TABLE
kalman=# INSERT INTO test VALUES ( 1 );
INSERT 0 1
kalman=# CREATE OR REPLACE FUNCTION sp_test()
kalman-# RETURNS INTEGER AS $body$
kalman$# DECLARE
kalman$#    my_integer integer;
kalman$#    my_port_set RECORD;
kalman$# BEGIN
kalman$#    FOR my_port_set IN
kalman$#       SELECT a
kalman$#       FROM test
kalman$#       FOR UPDATE
kalman$#    LOOP
kalman$#       my_integer = 0;
kalman$#    END LOOP;
kalman$#    RETURN 0;
kalman$# END;
kalman$# $body$ language 'plpgsql'
kalman-# STABLE;
CREATE FUNCTION
kalman=# select sp_test();
 sp_test
---------
       0
(1 row)

BTW why forbid the lock in a non volatile function or (if you fix this) the SELECT FOR UPDATE ? Regards Gaetano Mendola ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Lock table in non-volatile functions
Gaetano Mendola [EMAIL PROTECTED] writes: Tom Lane wrote: Really? AFAICS, CommandIsReadOnly() will reject SELECT FOR UPDATE too. kalman$# FOR my_port_set IN kalman$# SELECT a kalman$# FROM test kalman$# FOR UPDATE kalman$# LOOP Hm, that's a bug --- SPI_cursor_open is failing to check for a read-only query. BTW why forbid the lock in a non volatile function or (if you fix this) the SELECT FOR UPDATE ? Well, as for the lock, a non-volatile function isn't supposed to have any side-effects, and taking a lock is certainly a side-effect, no? Now I suppose it'll be taking AccessShareLock anyway if it reads any tables, so maybe we could negotiate about what sort of locks could be allowed; but I'd certainly argue that allowing it to take any kind of exclusive lock would be a Bad Idea. As for SELECT FOR UPDATE, there's a very good reason for disallowing that even without considering what locks it takes. In a READ COMMITTED transaction, SELECT FOR UPDATE can return row states that aren't visible according to the nominal transaction snapshot, and so it violates the promise of stable results. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
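Tom's READ COMMITTED argument can be sketched as a two-session scenario (illustrative, not from the thread; it reuses Gaetano's `test` table):

```sql
-- A STABLE function runs under the snapshot of the query that calls it.
-- Suppose the calling query's snapshot sees a = 1, and meanwhile another
-- session commits:  UPDATE test SET a = 2;

-- Inside the STABLE function, a plain SELECT honours that snapshot:
SELECT a FROM test;             -- still sees a = 1

-- ...but FOR UPDATE locks and returns the latest committed row version:
SELECT a FROM test FOR UPDATE;  -- would return a = 2

-- The function can thus observe data its calling query cannot, breaking
-- the promise that a STABLE function gives consistent results within one
-- statement.
```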
Re: [HACKERS] [PATCHES] Bitmapscan changes
Heikki Linnakangas wrote: Joshua D. Drake wrote: Heikki Linnakangas wrote: Joshua D. Drake wrote: This URL is not working: http://community.enterprisedb.com/git/git-perfunittests-20070222.tar.gz Sorry about that, typo in the filename. Fixed. Here are my results on a modest 3800X2 2 Gig of ram, RAID 1 dual SATA

 heap_pages | normal_index_pages | clustered_index_pages
------------+--------------------+-----------------------
     216217 |             109679 |                  1316

 select_with_normal_index
--------------------------
                       10
(1 row)

Time: 1356524.743 ms

 select_with_normal_index
--------------------------
                       10
(1 row)

Time: 1144832.597 ms

 select_with_normal_index
--------------------------
                       10
(1 row)

Time: 445.236 ms

And now run the same tests with clustered index Timing is on.

 select_with_clustered_index
-----------------------------
                          10
(1 row)

Time: 815622.768 ms

 select_with_clustered_index
-----------------------------
                          10
(1 row)

Time: 535749.457 ms

 select_with_clustered_index
-----------------------------
                          10
(1 row)

select relname,indexrelname,idx_blks_read,idx_blks_hit from pg_statio_all_indexes where schemaname = 'public';

   relname    |         indexrelname         | idx_blks_read | idx_blks_hit
--------------+------------------------------+---------------+--------------
 narrowtable  | narrowtable_index            |        296973 |       904654
 narrowtable2 | narrowtable2_clustered_index |         44556 |       857269
(2 rows)

select relname,heap_blks_read,heap_blks_hit,idx_blks_read,idx_blks_hit from pg_statio_user_tables ;

   relname    | heap_blks_read | heap_blks_hit | idx_blks_read | idx_blks_hit
--------------+----------------+---------------+---------------+--------------
 narrowtable2 |         734312 |      40304136 |         44556 |       857269
 narrowtable  |         952044 |      40002609 |        296973 |       904654

Seems like a clear win to me. Anyone else want to try? Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
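A quick way to summarize those pg_statio numbers as a hit ratio (a sketch against the same standard views used above; `idx_blks_read` counts blocks fetched from disk, `idx_blks_hit` counts buffer cache hits):

```sql
SELECT indexrelname,
       idx_blks_read,
       idx_blks_hit,
       round(idx_blks_hit::numeric
             / nullif(idx_blks_hit + idx_blks_read, 0), 3) AS hit_ratio
FROM pg_statio_all_indexes
WHERE schemaname = 'public';
```

On the figures reported here, the clustered index incurs far fewer disk reads (44556 vs 296973) for a comparable number of hits, which is where the I/O win comes from.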