Re: [HACKERS] TABLE-function patch vs plpgsql

2008-07-18 Thread Pavel Stehule
2008/7/18 Tom Lane [EMAIL PROTECTED]:
 Pavel Stehule [EMAIL PROTECTED] writes:
 Maybe we can use some well-defined implicit record, maybe NEW (or
 RESULT, ROW_RESULT, ROW, TABLE_ROW, ...) like a trigger has - something like

 That sounds like exactly the sort of kluge-solution that I didn't
 want to get involved with ...

 Anyway, the core feature is in, and we still have several months
 before 8.4 feature freeze to debate how plpgsql ought to interact
 with it.


I agree.
Regards

p.s.

Another solution - using referenced types:

declare foorec fcename%ROWTYPE       -- allows only the current function's name
        fooscalar fcename.col%TYPE
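
For example (only a sketch - this syntax does not exist, it is what the
proposal above would look like):

    create function list_names()
    returns table (first_name text, last_name text) as $$
    declare
        r  list_names%ROWTYPE;          -- row matching this function's result table
        fn list_names.first_name%TYPE;  -- scalar matching one result column
    begin
        return;
    end;
    $$ language plpgsql;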

regards
Pavel Stehule

and many thanks for committing this patch


regards, tom lane




Re: [HACKERS] Load spikes on 8.1.11

2008-07-18 Thread Andrew Sullivan
On Fri, Jul 18, 2008 at 10:41:36AM +0530, Gurjeet Singh wrote:
 
 Just started the 'INIT cluster' Slonik command and that spiked too.. for more
 than 10 minutes now!!

Are you attempting to do Slony changes (such as install Slony) on an
active database?  I strongly encourage you to read the Slony manual.
Slony, frankly, sucks for this use case.  The manual says as much,
although in more orotund phrases than that.

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/



Re: [HACKERS] [PATCH]-hash index improving

2008-07-18 Thread Simon Riggs

On Thu, 2008-07-17 at 16:37 -0700, Dann Corbit wrote:

 Large table unique index equality search should be very fast with hashed
 index (and the only place where any advantage will be seen).  Hashed
 indexes are useless for any search besides equality and gain more and
 more when the levels of the b-tree index increase.

I think a comparison with a btree using a functional index should be
shown.
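
For example, something like the following (hashtext() here just standing
in for whatever hash function is appropriate - only a sketch of the
comparison I mean):

    CREATE TABLE words (w text);
    CREATE INDEX words_hash ON words USING hash (w);
    CREATE INDEX words_btfn ON words USING btree (hashtext(w));

    -- hash index probe
    SELECT * FROM words WHERE w = 'foo';
    -- functional btree probe, rechecking the real column
    SELECT * FROM words WHERE hashtext(w) = hashtext('foo') AND w = 'foo';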

 The only way to get better performance from hash based indexes is to
 read fewer index pages than if a tree-based index were used.  So I think
 that the scheme used to create the index pages is the focus to make them
 worthwhile.

Agreed. Some math on that, plus a clear focus on making this faster than
a btree is critical to this project.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] TABLE-function patch vs plpgsql

2008-07-18 Thread Marko Kreen
On 7/18/08, Tom Lane [EMAIL PROTECTED] wrote:
 I've been working on the TABLE-function patch, and I am coming to the
  conclusion that it's really a bad idea for plpgsql to not associate
  variables with output columns --- that is, I think we should make
  RETURNS TABLE columns semantically just the same as OUT parameters.
  Here are some reasons:

  1. It's ludicrous to argue that standards compliance requires the
  behavior-as-submitted.  plpgsql is not specified by the SQL standard.

Yes, but it would be a good feature addition to plpgsql.
Currently there is no way to suppress the local variable
creation.  The proposed behaviour would give that possibility.

  2. Not having the parameter names available means that you don't have
  access to their types either, which is a big problem for polymorphic
  functions.  Read the last couple paragraphs of section 38.3.1:
  
 http://developer.postgresql.org/pgdocs/postgres/plpgsql-declarations.html#PLPGSQL-DECLARATION-ALIASES
  as well as the following 38.3.2.  How would you do those things with
  a polymorphic TABLE column?

This does not make sense as Postgres does not support
polymorphic table columns...

For polymorphic function arguments user should use OUT parameters.

I think that's the point - it should not be just syntactic sugar for
OUT parameters, let it be different.

  3. Not treating the parameters as assignable variables makes RETURN NEXT
  nearly worthless in a TABLE function.  Since they're not assignable,
  you can't use the parameterless form of RETURN NEXT (which'd return
  the current values of the variables).  The only alternative available
  is to return a record or row variable; but there's no convenient way
  to declare such a variable, since after all the whole point here is
  that the function's output rowtype is anonymous.

  4. It's a whole lot easier to explain things if we can just say that
  OUT parameters and TABLE parameters work alike.  This is especially
  true when they actually *are* alike for all the other available PLs.

Which other PLs create local variables for OUT parameters?

  If we insist on the current definition then we are eventually going to
  need to kluge up some solutions to #2 and #3, which seems like make-work
  to me when we already have smooth solutions to these problems for
  OUT parameters.

  Comments?

I would prefer: no local vars, no polymorphism, and funcname%rowtype.

Some functions are better written with OUT parameters, but some are
better written with a record variable for returning rows.  The new
behaviour would allow picking the best coding style for each function.
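
To illustrate the two styles (a sketch; the first assumes Tom's proposed
semantics, where the TABLE columns are assignable like OUT parameters):

    create function squares(n int) returns table (x int, x2 int) as $$
    begin
        for i in 1..n loop
            x := i;          -- assign the output columns directly...
            x2 := i * i;
            return next;     -- ...and emit their current values
        end loop;
    end;
    $$ language plpgsql;

With the submitted behaviour, x and x2 would not exist as variables and
you would return a record variable instead (declared via the
funcname%rowtype idea above).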

-- 
marko



Re: [HACKERS] PATCH: CITEXT 2.0 v4

2008-07-18 Thread Michael Paesold

David E. Wheeler writes:


On Jul 17, 2008, at 03:45, Michael Paesold wrote:

Wouldn't it be possible to create a variant of regexp_replace, i.e.  
regexp_replace(citext,citext,text), which would again lower-case  
the first two arguments before passing the input to  
regexp_replace(text,text,text)?


Sure, but then you end up with this:

template1=# select regexp_replace( 'Fxx'::citext, 'X'::citext, 'o');
 regexp_replace
----------------
 foo
(1 row)


Yeah, you are right, I see. :-)


Which is just wrong. I'm going to look at the regex C functions  
today and see if there's an easy way to just always pass them the  
'i' flag, which would do the trick. That still won't help replace(),  
split_part(), or translate(), however.


Calling regex functions with the case-insensitivity option would be  
great. It should also be possible to rewrite replace() into  
regexp_replace() by first escaping the regex meta characters.
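
Only a sketch of that idea (a hypothetical helper, not part of the
patch; a backslash in the replacement string would still need similar
treatment):

    CREATE FUNCTION ci_replace(str text, src text, repl text)
    RETURNS text AS $$
        SELECT regexp_replace(
                   $1,
                   -- escape regex metacharacters in the search string
                   regexp_replace($2, E'([][^$.|?*+(){}\\\\])', E'\\\\\\1', 'g'),
                   $3,
                   'gi')   -- 'i' = case-insensitive, 'g' = replace all
    $$ LANGUAGE sql IMMUTABLE;

    -- SELECT ci_replace('Foo Bar foo', 'FOO', 'baz')  =>  'baz Bar baz'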


Actually re-implementing those functions in a case-insensitive way  
would still be an option, but of course some amount of work. The  
question is how much of a use case there is.


Best Regards
Michael Paesold



Re: [HACKERS] [PATCH]-hash index improving

2008-07-18 Thread Gregory Stark
Simon Riggs [EMAIL PROTECTED] writes:

 On Thu, 2008-07-17 at 16:37 -0700, Dann Corbit wrote:

 Large table unique index equality search should be very fast with hashed
 index (and the only place where any advantage will be seen).  Hashed
 indexes are useless for any search besides equality and gain more and
 more when the levels of the b-tree index increase.

 I think a comparison with a btree using a functional index should be
 shown.

To do that you'll have to think about the use cases you think hash should win
on. 

For cpu-bound databases with small indexes there might be a win if you can
avoid the binary search of all the elements on a page. (Have we modified btree
to do that or does it still scan sequentially on the leaf pages?)

For i/o-bound databases with very large indexes there should be an opportunity
where btree lookups are O(log n) and hash lookups can in theory be O(1).

However, to get O(1), hash lookups need to do extra work at insert time. If they
don't expand the hash as necessary then they end up just being a linear
speedup to whatever lookup algorithm is used to scan the buckets. That isn't
going to win over btree, which is already doing a binary search.

The extra work at insert time is O(n log n) amortized, but I'm not sure
good amortized performance is good enough for Postgres. Users are unhappy when
their average performance is good but 1 in 1000 inserts thrashes their i/o
rewriting the whole index... 

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!



Re: [HACKERS] [PATCH]-hash index improving

2008-07-18 Thread Simon Riggs

On Fri, 2008-07-18 at 11:07 +0100, Gregory Stark wrote:
 Simon Riggs [EMAIL PROTECTED] writes:

 hash lookups can in theory be O(1).

I'm not sure that applies here. I'm interested in how *this*
patch will work, not in more generic algorithm theory.

To patch authors: Can we please see a table showing the expected number of
logical I/Os (i.e. block accesses) for btrees and hash indexes?

e.g. for 100-byte rows...

rows     btree   hash
-----    -----   ----
10^2
10^3
10^4
10^5
10^6
10^7
10^8

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Postgres-R: primary key patches

2008-07-18 Thread chris
[EMAIL PROTECTED] (Markus Wanner) writes:
 as you might know, Postgres-R relies on primary keys to address tuples
 of a table. It cannot replicate tables without a primary key.

Slony-I does the same, with the variation that it permits the option
of using a candidate primary key, namely an index that is unique+NOT
NULL.
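
For instance (illustrative schema):

    CREATE TABLE account (
        acct_no integer NOT NULL,
        owner   text
    );
    CREATE UNIQUE INDEX account_acct_no_key ON account (acct_no);

Here acct_no identifies rows just as well as a declared PRIMARY KEY
would.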

If it is possible to support that broader notion, that might make the
addition of this sort of logic more widely useful.

 Primary keys currently aren't really used within the executor, so I
 had to extend and modify Postgres here and there, to get the
 required information. To ease reviewing I have split out these
 modifications and present them here as two separate little patches.

I know Jan Wieck has in mind the idea of adding an interface to enable
doing highly efficient IUD (Insert/Update/Delete) via generating a way
to do direct heap updates, which would be *enormously* more efficient
than the present need (in Slony-I, for instance) to parse, plan and
execute thousands of IUD statements.  For UPDATE/DELETE to work
requires utilizing (candidate) primary keys, so there is some
seemingly relevant similarity there.
-- 
select 'cbbrowne' || '@' || 'linuxfinances.info';
http://cbbrowne.com/info/lsf.html
Rules  of the  Evil Overlord  #145. My  dungeon cell  decor  will not
feature exposed pipes.  While they add to the  gloomy atmosphere, they
are good  conductors of vibrations and  a lot of  prisoners know Morse
code. http://www.eviloverlord.com/



Re: [HACKERS] Postgres-R: primary key patches

2008-07-18 Thread Markus Wanner

Hello Chris,

chris wrote:

Slony-I does the same, with the variation that it permits the option
of using a candidate primary key, namely an index that is unique+NOT
NULL.

If it is possible to support that broader notion, that might make
addition of these sorts of logic more widely useful.


Well, yeah, that's technically not much different, so it would probably 
be very easy to extend Postgres-R to work on any arbitrary index.


But what do we have primary keys for, in the first place? Isn't it 
exactly the *primary* key into the table, which you want to use for 
replication? Or do we need an additional per-table configuration option 
for that? A REPLICATION KEY besides the PRIMARY KEY?



I know Jan Wieck has in mind the idea of adding an interface to enable
doing highly efficient IUD (Insert/Update/Delete) via generating a way
to do direct heap updates, which would be *enormously* more efficient
than the present need (in Slony-I, for instance) to parse, plan and
execute thousands of IUD statements.  For UPDATE/DELETE to work
requires utilizing (candidate) primary keys, so there is some
seemingly relevant similarity there.


Definitely. The remote backend does exactly that for Postgres-R: it 
takes a change set, which consists of one or more tuple collections, and 
then applies these collections. See ExecProcessCollection() in execMain.c.


(Although, I'm still less than thrilled about the internal storage 
format of these tuple collections. That can certainly be improved and 
simplified.)


Regards

Markus



Re: [HACKERS] [PATCH]-hash index improving

2008-07-18 Thread Alvaro Herrera
Gregory Stark wrote:

 For cpu-bound databases with small indexes there might be a win if you can
 avoid the binary search of all the elements on a page. (Have we modified btree
 to do that or does it still scan sequentially on the leaf pages?)

Hmm?  It has used binary search for as long as I can remember ... see
_bt_first and _bt_binsrch.

-- 
Alvaro Herrera                 http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] Load spikes on 8.1.11

2008-07-18 Thread David Fetter
On Fri, Jul 18, 2008 at 10:15:42AM +0530, Gurjeet Singh wrote:
 On Fri, Jul 18, 2008 at 10:05 AM, Gurjeet Singh [EMAIL PROTECTED]
 wrote:
 
  On Fri, Jul 18, 2008 at 9:58 AM, Tom Lane [EMAIL PROTECTED] wrote:
 
  Gurjeet Singh [EMAIL PROTECTED] writes:
   During these spikes, in the 'top' sessions we see the 'idle' PG
   processes consuming between 2 and 5% CPU, and since the box has 8 CPUs
   (2 sockets, each CPU a quad-core Intel Xeon) and somewhere around 200
   Postgres processes, the load spikes to above 200; and it does this
   very sharply.
 
  This looks like heavy contention for a spinlock.  You need to get a
  higher-level analysis of what's happening before anyone can say much
  more than that.
 
  Note that 8.1 is pretty much ancient history as far as scalability to
  8-core hardware goes.  You should probably consider updating to 8.3
  before investing too much time in tracking down what's happening.
  If you can still show the problem on 8.3 then there would be some
  interest in fixing it ...
 
 
  Upgrading is on the cards, but not as high priority as I would like it to
  be! This is a production box, and we desperately need some respite from these
  spikes.

  Can you please elaborate on what high-level diagnosis you would need?
 
  I just ran 'DROP SCHEMA <_slony schema> CASCADE;' and it spiked again, on a
  very lightly loaded box!!
 
  Thanks for all your help.
 
 
 Would reducing the number of connections on the DB help in reducing the
 spike?

Just generally, reducing the number of connections to the DB will help
in reducing resource consumption.

When you first get a chance, use or set up a test environment where
you can test the upgrade to 8.3.latest.

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] Postgres-R: primary key patches

2008-07-18 Thread David Fetter
On Fri, Jul 18, 2008 at 03:04:08PM +0200, Markus Schiltknecht wrote:
 Hello Chris,

 chris wrote:
 Slony-I does the same, with the variation that it permits the
 option of using a candidate primary key, namely an index that is
 unique+NOT NULL.

 If it is possible to support that broader notion, that might make
 addition of these sorts of logic more widely useful.

 Well, yeah, that's technically not much different, so it would
 probably  be very easy to extend Postgres-R to work on any arbitrary
 Index.

 But what do we have primary keys for, in the first place? Isn't it
 exactly the *primary* key into the table, which you want to use for
 replication? Or do we need an additional per-table configuration
 option  for that? A REPLICATION KEY besides the PRIMARY KEY?

We have them because people are used to thinking in terms of a
PRIMARY KEY, not because that concept is actually distinguishable
from a non-partial UNIQUE NOT NULL constraint.

While I'm a chicken rather than a pig on this project
(http://en.wikipedia.org/wiki/The_Chicken_and_the_Pig), I believe that
covering the more general case right from the start would be a much
better plan.

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] [PATCH]-hash index improving

2008-07-18 Thread Heikki Linnakangas

Gregory Stark wrote:

For i/o-bound databases with very large indexes there should be an opportunity
where btree lookups are O(log n) and hash lookups can in theory be O(1).


Ignoring the big-O complexity, if a hash index only stores a 32-bit hash 
code instead of the whole key, it could be a big win in storage size, 
and therefore in cache-efficiency and performance, when the keys are 
very long.


Granted, it's not very common to use a 1K text field as a key column...

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] [PATCH]-hash index improving

2008-07-18 Thread Gregory Stark
Heikki Linnakangas [EMAIL PROTECTED] writes:

 Gregory Stark wrote:
 For i/o-bound databases with very large indexes there should be an
 opportunity where btree lookups are O(log n) and hash lookups can in
 theory be O(1).

 Ignoring the big-O complexity, if a hash index only stores a 32-bit hash code
 instead of the whole key, it could be a big win in storage size, and therefore
 in cache-efficiency and performance, when the keys are very long.

I think it has to show an improvement over an expression index over
(hashany(col)) and not just an improvement over an index over col due to col
being large.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!



Re: [HACKERS] [PATCH]-hash index improving

2008-07-18 Thread Jonah H. Harris
On Fri, Jul 18, 2008 at 10:44 AM, Heikki Linnakangas
[EMAIL PROTECTED] wrote:
 Ignoring the big-O complexity, if a hash index only stores a 32-bit hash
 code instead of the whole key, it could be a big win in storage size, and
 therefore in cache-efficiency and performance, when the keys are very long.

Agreed.  My thinking is that there's either something inherently wrong
with the implementation, or we're performing so many disk I/Os that
it's nearly equivalent to b-tree.  Tom has a couple suggestions which
Xiao and I will explore.

 Granted, it's not very common to use a 1K text field as a key column...

Especially for direct equality comparison :)

-- 
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | [EMAIL PROTECTED]
Edison, NJ 08837 | http://www.enterprisedb.com/



Re: [HACKERS] TABLE-function patch vs plpgsql

2008-07-18 Thread Tom Lane
Marko Kreen [EMAIL PROTECTED] writes:
 On 7/18/08, Tom Lane [EMAIL PROTECTED] wrote:
 1. It's ludicrous to argue that standards compliance requires the
 behavior-as-submitted.  plpgsql is not specified by the SQL standard.

 Yes, but it would be a good feature addition to plpgsql.
 Currently there is no way to suppress the local variable
 creation.  The proposed behaviour would give that possibility.

Why would anyone consider that a feature?

 2. Not having the parameter names available means that you don't have
 access to their types either, which is a big problem for polymorphic
 functions.

 This does not make sense as Postgres does not support
 polymorphic table columns...

No, but it certainly supports polymorphic function output parameters,
and that's what these really are.

 I think that's the point - it should not be just syntactic sugar for
 OUT parameters, let it be different.

Why?  All you're doing is proposing that we deliberately cripple
the semantics.

regards, tom lane



Re: [HACKERS] [PATCHES] WITH RECURSIVE patches 0717

2008-07-18 Thread David Fetter
On Fri, Jul 18, 2008 at 10:41:20AM +0900, Tatsuo Ishii wrote:
   Here are the latest WITH RECURSIVE patches against CVS HEAD created by
   Yoshiyuki Asaba and minor corrections by Tatsuo Ishii.
  
  I tried this patch vs. CVS HEAD using my usual configure options with
  only --with-prefix set, then tried to make, and got:
  
  make[3]: *** No rule to make target `parse_cte.o', needed by 
  `objfiles.txt'.  Stop.
  make[3]: Leaving directory `/home/shackle/pgsql/src/backend/parser'
  make[2]: *** [parser-recursive] Error 2
  make[2]: Leaving directory `/home/shackle/pgsql/src/backend'
  make[1]: *** [all] Error 2
  make[1]: Leaving directory `/home/shackle/pgsql/src'
  make: *** [all] Error 2
  
  Is there something missing?
 
 Oops. I forgot to include patches against newly added files. Please
 try the included patches.

This now compiles.

I have a test case that hangs and then crashes the backend.

WITH t(i) AS (
SELECT * FROM generate_series(1,5)
)
SELECT
t1.i,
2*t2.i
FROM
t AS t1
JOIN
t AS t2 USING(i);

server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

An equivalent query with RECURSIVE breaks in a different, in some
sense even more severe, way, in that it just hands out a wrong result
set:

WITH RECURSIVE t(i) AS (
VALUES(1::int)
UNION ALL
SELECT i+1 FROM t WHERE i < 5
)
SELECT
t1.i,
2*t2.i
FROM
t AS t1
JOIN
t AS t2 USING(i);

 i | ?column? 
---+----------
 1 |2
 2 |4
 3 |6
 4 |8
 5 |   10
(5 rows)

While this case is trivial, others are not.  For example, if someone
wishes to do a k-deep summary on a parts explosion n levels deep, n > k,
one way to do this would be to JOIN the k-deep part of the path
enumeration to the parts greater than k deep.
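
For instance, something like this (hypothetical part(id, parent_id)
schema), which references the recursive result twice:

WITH RECURSIVE walk(id, depth) AS (
    SELECT id, 1 FROM part WHERE parent_id IS NULL
  UNION ALL
    SELECT p.id, w.depth + 1
    FROM part p JOIN walk w ON p.parent_id = w.id
)
SELECT shallow.id AS summary, deep.id AS detail
FROM walk AS shallow
JOIN walk AS deep ON deep.depth > shallow.depth
WHERE shallow.depth = 3;    -- k = 3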

What would need to be fixed in order to make the above things work?

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] Postgres-R: primary key patches

2008-07-18 Thread Tom Lane
David Fetter [EMAIL PROTECTED] writes:
 On Fri, Jul 18, 2008 at 03:04:08PM +0200, Markus Schiltknecht wrote:
 But what do we have primary keys for, in the first place?

 We have them because people are used to thinking in terms of a
 PRIMARY KEY, not because that concept is actually distinguishable
 from a non-partial UNIQUE NOT NULL constraint.

No, we have them because the SQL standard actually assigns a distinct
meaning to a primary key.  (It's the default foreign key reference
column(s) for the table --- and in that context it's clear that
There Can Be Only One.)

regards, tom lane



Re: [HACKERS] [PATCHES] WITH RECURSIVE patches 0717

2008-07-18 Thread David Fetter
On Fri, Jul 18, 2008 at 07:56:09AM -0700, David Fetter wrote:
 On Fri, Jul 18, 2008 at 10:41:20AM +0900, Tatsuo Ishii wrote:
    Here are the latest WITH RECURSIVE patches against CVS HEAD created by
    Yoshiyuki Asaba and minor corrections by Tatsuo Ishii.
   
   I tried this patch vs. CVS HEAD using my usual configure options with
   only --with-prefix set, then tried to make, and got:
   
   make[3]: *** No rule to make target `parse_cte.o', needed by 
   `objfiles.txt'.  Stop.
   make[3]: Leaving directory `/home/shackle/pgsql/src/backend/parser'
   make[2]: *** [parser-recursive] Error 2
   make[2]: Leaving directory `/home/shackle/pgsql/src/backend'
   make[1]: *** [all] Error 2
   make[1]: Leaving directory `/home/shackle/pgsql/src'
   make: *** [all] Error 2
   
   Is there something missing?
  
  Oops. I forgot to include patches against newly added files. Please
  try the included patches.
 
 This now compiles.
 
 I have a test case that hangs and then crashes the backend.
 
 WITH t(i) AS (
 SELECT * FROM generate_series(1,5)
 )
 SELECT
 t1.i,
 2*t2.i
 FROM
 t AS t1
 JOIN
 t AS t2 USING(i);
 
 server closed the connection unexpectedly
 This probably means the server terminated abnormally
 before or while processing the request.
 The connection to the server was lost. Attempting reset: Failed.
 
 An equivalent query with RECURSIVE breaks in a different, in some
 sense even more severe, way, in that it just hands out a wrong result
 set:

D'oh!  That's what I get for sending this before waking up.  It works
just fine.

 While this case is trivial, others are not.  For example, if someone
 wishes to do a k-deep summary on a parts explosion n levels deep,
 n > k, one way to do this would be to JOIN the k-deep part of the path
 enumeration to the parts greater than k deep.
 
 What would need to be fixed in order to make the above things work?

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] Postgres-R: primary key patches

2008-07-18 Thread Gregory Stark
David Fetter [EMAIL PROTECTED] writes:

 On Fri, Jul 18, 2008 at 03:04:08PM +0200, Markus Schiltknecht wrote:
 Hello Chris,

 chris wrote:
 Slony-I does the same, with the variation that it permits the
 option of using a candidate primary key, namely an index that is
 unique+NOT NULL.

 If it is possible to support that broader notion, that might make
 addition of these sorts of logic more widely useful.

 Well, yeah, that's technically not much different, so it would
 probably  be very easy to extend Postgres-R to work on any arbitrary
 Index.

 But what do we have primary keys for, in the first place? Isn't it
 exactly the *primary* key into the table, which you want to use for
 replication? Or do we need an additional per-table configuration
 option  for that? A REPLICATION KEY besides the PRIMARY KEY?

Hm, it occurs to me that really Slony should be saying
    WHERE (col1,col2,...) = ('x','y','z',...)

and letting the server figure out what access method is best for finding the
candidate record. That could mean using the primary key index, or it could
mean using some other index (perhaps a partial index for example).
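
i.e. something like (names illustrative):

    UPDATE tbl SET payload = 'new'
     WHERE (col1, col2) = ('x', 'y');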

It would be nice if there was a way for Slony to express to the server that
really, it only needs any UNIQUE NOT NULL combination of columns to match.
Once the server has any such combination which matches it can skip checking
the rest. I can't think of any way to write such a query in SQL.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] Postgres-R: primary key patches

2008-07-18 Thread Markus Wanner

Hi,

David Fetter wrote:

While I'm a chicken rather than a pig on this project
http://en.wikipedia.org/wiki/The_Chicken_and_the_Pig, I believe that
covering the more general case right from the start would be a much
better plan.


I was trying to say that Postgres-R internally relies only on a unique 
index with not null constraint. It doesn't care if you name it PRIMARY 
KEY or REPLICATION KEY or whatever.


So, it's just a question of the syntax. We already have PRIMARY KEYs, 
and those are pretty much what I think is needed in 99% of all cases as 
the pointer to the replication


While I'm normally an absolute fan of generality,

I think you didn't quite get the point.



Re: [HACKERS] Postgres-R: primary key patches

2008-07-18 Thread Markus Wanner

Hi,

sorry, some strange key-combination made my mail client send too early...

I myself wrote:
I was trying to say that Postgres-R internally relies only on a unique 
index with not null constraint. It doesn't care if you name it PRIMARY 
KEY or REPLICATION KEY or whatever.


So, it's just a question of the syntax. We already have PRIMARY KEYs, 
and those are pretty much what I think is needed in 99% of all cases as 
the pointer to the replication


.. as the pointer to the index to use for replication.

Offering the user a possibility to choose another (index + not null) 
would require something like ALTER TABLE ... ADD REPLICATION KEY ... or 
some such. Mostly syntactic sugar, which can be added as soon as we 
really need it.
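
E.g. (purely hypothetical syntax, nothing like it exists):

    ALTER TABLE account ADD REPLICATION KEY USING INDEX account_acct_no_key;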



While I'm normally an absolute fan of generality,

I think you didn't quite get the point.


Iiik.. that's what I didn't want to send and wanted to delete before 
sending... :-)  Sorry.


Regards

Markus





Re: [HACKERS] Postgres-R: primary key patches

2008-07-18 Thread Markus Wanner

Hi,

Tom Lane wrote:

It's the default foreign key reference
column(s) for the table


That's why I think it makes for a pretty good replication key as well.

Regards

Markus



Re: [HACKERS] TABLE-function patch vs plpgsql

2008-07-18 Thread Marko Kreen
On 7/18/08, Tom Lane [EMAIL PROTECTED] wrote:
 Marko Kreen [EMAIL PROTECTED] writes:
   On 7/18/08, Tom Lane [EMAIL PROTECTED] wrote:

  1. It's ludicrous to argue that standards compliance requires the
   behavior-as-submitted.  plpgsql is not specified by the SQL standard.

   Yes, but it would be a good feature addition to plpgsql.
   Currently there is no way to suppress the local variable
   creation.  The proposed behaviour would give that possibility.

 Why would anyone consider that a feature?

Well, it's an issue of one big global namespace vs. several local
namespaces.

If you have a function that has a lot of OUT parameters and also
local variables, it gets confusing fast.  It would be good to avoid
OUT parameters polluting the local variable space.

   2. Not having the parameter names available means that you don't have
   access to their types either, which is a big problem for polymorphic
   functions.

  This does not make sense as Postgres does not support
   polymorphic table columns...

 No, but it certainly supports polymorphic function output parameters,
  and that's what these really are.

I was referring to the syntax of the feature: RETURNS TABLE

   I think that's the point - it should not be just syntactic sugar for
   OUT parameters, let it be different.

 Why?  All you're doing is proposing that we deliberately cripple
  the semantics.

plpgsql is already rather crippled; we could use this feature
to give it additional flexibility.

-- 
marko



Re: [HACKERS] Postgres-R: primary key patches

2008-07-18 Thread Markus Wanner

Hi,

I realize that you are talking about Slony; let me answer for the 
Postgres-R case anyway.


Gregory Stark wrote:
Hm, it occurs to me that really Slony should be saying 
 WHERE (col1,col2,...) = ('x','y','z',...)


Hm.. that would mean increasing the amount of work for the remote 
backend, which applies remote transactions. For scalability reasons, I'm 
trying to keep that minimal.



and letting the server figure out what access method is best for finding the
candidate record. That could mean using the primary key index, or it could
mean using some other index (perhaps a partial index for example).


For Postgres-R, I think that would only be a gain in those cases where 
all tuples of a collection (or even the entire change set) affect only 
tuples from a partial index. That doesn't look like it's worth the 
trouble, IMO. Or do you think that's a frequent case?


Thinking about it, I'd even say that frequently hitting only one index 
is favorable because of caching effects. Dunno.



It would be nice if there was a way for Slony to express to the server that
really, it only needs any UNIQUE NOT NULL combination of columns to match.
Once the server has any such combination which matches it can skip checking
the rest. I can't think of any way to write such a query in SQL.


I don't quite get your point here. For UPDATEs which change the PRIMARY 
KEY, the sender currently sends the *old* values plus the changes. In 
that case, you certainly don't want to send the entire old tuple, but 
only the fields for *one* KEY. That's what I'm calling the replication 
key. (And currently equals the PRIMARY KEY).


Maybe I'm thinking too much in terms of Postgres-R instead of Slony, 
which is what you are talking about.


Regards

Markus



Re: [HACKERS] [PATCH]-hash index improving

2008-07-18 Thread Tom Lane
Jonah H. Harris [EMAIL PROTECTED] writes:
 Agreed.  My thinking is that there's either something inherently wrong
 with the implementation, or we're performing so many disk I/Os that
 it's nearly equivalent to b-tree.  Tom has a couple suggestions which
 Xiao and I will explore.

I finally got a chance to look through the patch in some detail.
If I haven't missed something, there are just two improvement ideas
embodied in it:

1. Store just the hash value, and not the index key value, in hash
index entries.  (This means that all hash indexscans become lossy
and have to be rechecked at the heap.)

2. Keep the contents of each index page ordered by hash value, and use
binary instead of linear search to find the matching item(s) during an
indexscan.  (This depends on #1 because recomputing the hash values
during the binary search would be too expensive --- although you could
also make it work by storing *both* the hash value and the original
key.)

I suppose that the main point of #1 is to reduce index size by
allowing more tuples to fit in each bucket.  However, the patch
neglects to adjust the target-fillfactor calculation in _hash_metapinit,
which means that the code won't actually put any more tuples per bucket
(on average) than it did before.  So the index isn't actually any
smaller and you get no I/O savings --- you just have more unused
space on a typical page.  Fixing that might help.

FWIW, I had always assumed that part of the solution to hash's woes
would involve decoupling the bucket size from the page size, so
that you could have multiple buckets per page.  But maybe the
binary-search idea makes that unnecessary.  I'm not sure.  A whole
lot depends on how evenly the buckets get filled.  You should probably
investigate how many tuples actually end up in each bucket with and
without the patch.

In the realm of micro-optimizations that might be significant, I think
you really need to get rid of all those _create_hash_desc calls,
particularly the one in _hash_checkqual which is part of the inner loop
of an indexscan.  Not only are they slow but they probably represent a
serious memory leak in a scan that returns many tuples.  For reading the
hash value out of an existing index tuple, I don't think you should be
bothering with a tupdesc at all --- don't use index_getattr, just map a
C struct onto the known layout of an index tuple with a single never-null
int field.  This would particularly make for a noticeable improvement in
the speed of _hash_binsearch.  The tupdescs used in storing an index
entry are probably needed, but you could just use a single statically
declared tupdesc (look at the ones in relcache.c for examples of
building one as a constant).

regards, tom lane



Re: [HACKERS] Postgres-R: primary key patches

2008-07-18 Thread Alvaro Herrera
Markus Wanner wrote:

 Gregory Stark wrote:

 It would be nice if there was a way for Slony to express to the server that
 really, it only needs any UNIQUE NOT NULL combination of columns to match.
 Once the server has any such combination which matches it can skip checking
 the rest. I can't think of any way to write such a query in SQL.

 I don't quite get your point here. For UPDATEs which change the PRIMARY  
 KEY, the sender currently sends the *old* values plus the changes. In  
  that case, you certainly don't want to send the entire old tuple, but  
 only the fields for *one* KEY. That's what I'm calling the replication  
 key. (And currently equals the PRIMARY KEY).

I think the point here is that you need to distinguish which tuple you
need to update.  For this, our Replicator uses the primary key only;
there's no way to use another candidate key (unique not null).  It would
certainly be possible to use a different candidate key, but as far as I
know no customer has ever requested this.

(FWIW we don't send the old values -- only the original PK columns, the
values of columns that changed, and the update mask in terms of
heap_modify_tuple.)

-- 
Alvaro Herrera                 http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] PATCH: CITEXT 2.0 v4

2008-07-18 Thread David E. Wheeler

On Jul 18, 2008, at 01:39, Michael Paesold wrote:

Calling regex functions with the case-insensitivity option would be  
great. It should also be possible to rewrite replace() into  
regexp_replace() by first escaping the regex meta characters.


Actually re-implementing those functions in a case-insensitive way  
would still be an option, but of course some amount of work. The  
question is how much of a use case there is.


Not much for me. I might use the regex functions, but would be happy  
to manually pass the i flag.


However, if someone with a lot more C and Pg core knowledge wanted to  
sit down with me for a couple hours next week and help me bang out  
these functions, that would be great. I'd love to have the  
implementation be that much more complete.


I do believe that, as it stands now in the v4 patch, citext is pretty  
close to ready, and certainly committable.


Thanks,

David



Re: [HACKERS] typedefs for indent

2008-07-18 Thread Bruce Momjian
Andrew Dunstan wrote:
 
 
 Alvaro Herrera wrote:
  Andrew Dunstan wrote:

  OK, I have spent some time generating and filtering typedefs via objdump  
  on various platforms. I filtered them and Bruce's list to eliminate  
  items not actually found in the sources, thus:
  
 
  Did this go anywhere?

 
 
 I'm still trying to get a working objdump for OSX. Automating this is 
 difficult because we need to make sure we get all (or pretty close to 
 all) the typedefs we can get on each platform for various build 
 configurations.

At this point I would like to get a typedef list into CVS, even if it is
not perfect --- it is better than what we have now.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Postgres-R: primary key patches

2008-07-18 Thread Markus Wanner

Hi,

Alvaro Herrera wrote:

I think the point here is that you need to distinguish which tuple you
need to update.  For this, our Replicator uses the primary key only;
there's no way to use another candidate key (unique not null).  It would
certainly be possible to use a different candidate key,


Yeah, and for this to work, the *sender* needs to decide on a key to use.


but as far as I
know no customer has ever requested this.


I can't see the use case for a separate REPLICATION KEY, different from 
the PRIMARY KEY, either.



(FWIW we don't send the old values -- only the original PK columns, the
values of columns that changed, and the update mask in terms of
heap_modify_tuple.)


Yup, that's pretty much the same what I'm doing for Postgres-R.

Regards

Markus



Re: [HACKERS] .psqlrc output for \pset commands

2008-07-18 Thread Bruce Momjian
Bruce Momjian wrote:
  $ psql test
    Output format is wrapped.
  psql (8.4devel)
  Type "help" for help.
  
   Is this desirable?  \set QUIET at the top of .psqlrc fixes it, but I am
   wondering if we should be automatically doing quiet while .psqlrc is
   processed.
  
  I was wondering about this myself, but I'm still not used to the new banner.
  It seems kind of... curt. Perhaps it should just be a single line instead of
  two lines both around 20 characters...
  
  Anyways the thing that struck me as odd was the messages appearing *before*
  the header. It seems to me the header should print followed by .psqlrc 
  output
  followed by normal output.
 
 Do you like this better?
 
   $ psql test
   psql (8.4devel)
    Type "help" for help.
   Output format is wrapped.
   
    test=>
 
 The attached patch accomplishes this.

With no feedback, applied.  ;-)

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] [PATCH]-hash index improving

2008-07-18 Thread Kenneth Marshall
I just ran my original 16M word test case against the patched
version, and like Tom noted below, the tuples-per-bucket
calculation is wrong, which results in identical index sizes
for both the original version and the hash-value-only version.

On Fri, Jul 18, 2008 at 12:23:25PM -0400, Tom Lane wrote:
 Jonah H. Harris [EMAIL PROTECTED] writes:
  Agreed.  My thinking is that there's either something inherently wrong
  with the implementation, or we're performing so many disk I/Os that
  it's nearly equivalent to b-tree.  Tom has a couple suggestions which
  Xiao and I will explore.
 
 I finally got a chance to look through the patch in some detail.
 If I haven't missed something, there are just two improvement ideas
 embodied in it:
 
 1. Store just the hash value, and not the index key value, in hash
 index entries.  (This means that all hash indexscans become lossy
 and have to be rechecked at the heap.)
 
 2. Keep the contents of each index page ordered by hash value, and use
 binary instead of linear search to find the matching item(s) during an
 indexscan.  (This depends on #1 because recomputing the hash values
 during the binary search would be too expensive --- although you could
 also make it work by storing *both* the hash value and the original
 key.)
 
 I suppose that the main point of #1 is to reduce index size by
 allowing more tuples to fit in each bucket.  However, the patch
 neglects to adjust the target-fillfactor calculation in _hash_metapinit,
 which means that the code won't actually put any more tuples per bucket
 (on average) than it did before.  So the index isn't actually any
 smaller and you get no I/O savings --- you just have more unused
 space on a typical page.  Fixing that might help.
 
 FWIW, I had always assumed that part of the solution to hash's woes
 would involve decoupling the bucket size from the page size, so
 that you could have multiple buckets per page.  But maybe the
 binary-search idea makes that unnecessary.  I'm not sure.  A whole
 lot depends on how evenly the buckets get filled.  You should probably
 investigate how many tuples actually end up in each bucket with and
 without the patch.
 
I think that while the binary-search idea will improve the lookup
over the original sequential scan of the bucket, it makes updates
much more expensive, particularly with buckets approaching 100%
full. The idea that I have been mulling over tries to improve access
times by breaking a bucket into mini virtual buckets within a page. We
restrict the size of the mini-bucket to pagesize/2^n. The sweet spot
should be around n = 7 or 8, which for an 8k pagesize yields a
mini-bucket size of 64 or 32 bytes (8192/2^7 = 64, 8192/2^8 = 32).
Then the search for a value in a page is to read the virtual bucket
corresponding to the top n bits of the hash value.

The second piece is to take advantage of the fact that the size of
the mini-bucket is not an even multiple of the size of a hash index
tuple, and aggregate all the lost space for use as the first overflow
page for all of a page's mini-buckets. This avoids the I/O needed to
read a full overflow page from disk and accommodates the imperfections
in the hash function distribution. The overflow pages, both the virtual
first one and subsequent real pages, would benefit from the binary lookups.
It may also be worth storing the high and low hash values specially, to
avoid searching a page if the sought value cannot be on it.

 In the realm of micro-optimizations that might be significant, I think
 you really need to get rid of all those _create_hash_desc calls,
 particularly the one in _hash_checkqual which is part of the inner loop
 of an indexscan.  Not only are they slow but they probably represent a
 serious memory leak in a scan that returns many tuples.  For reading the
 hash value out of an existing index tuple, I don't think you should be
 bothering with a tupdesc at all --- don't use index_getattr, just map a
 C struct onto the known layout of a indextuple with a single never-null
 int field.  This would particularly make for a noticeable improvement in
 the speed of _hash_binsearch.  The tupdescs used in storing an index
 entry are probably needed, but you could just use a single statically
 declared tupdesc (look at the ones in relcache.c for examples of
 building one as a constant).
 
+1

   regards, tom lane
 

I think that this sort of virtual bucket would allow us to take
better advantage of the O(1) behavior. What do you all think?

Regards,
Ken



Re: [HACKERS] [PATCHES] WITH RECURSIVE patches 0717

2008-07-18 Thread Erik
On Fri, July 18, 2008 03:41, Tatsuo Ishii wrote:
  Here are the latest WITH RECURSIVE patches against CVS HEAD created by
  Yoshiyuki Asaba and minor corrections by Tatsuo Ishii.

 I tried this patch vs. CVS HEAD using my usual configure options with
 only --with-prefix set, then tried to make, and got:

 make[3]: *** No rule to make target `parse_cte.o', needed by `objfiles.txt'.
  Stop.
 make[3]: Leaving directory `/home/shackle/pgsql/src/backend/parser'
 make[2]: *** [parser-recursive] Error 2
 make[2]: Leaving directory `/home/shackle/pgsql/src/backend'
 make[1]: *** [all] Error 2
 make[1]: Leaving directory `/home/shackle/pgsql/src'
 make: *** [all] Error 2

 Is there something missing?

 Oops. I forgot to include patches against newly added files. Please
 try the included patches.



This crashes the backend:

WITH RECURSIVE t(n) AS (
VALUES (1)
  UNION ALL
SELECT n+1 FROM t WHERE n < 5 ORDER BY 1
)
SELECT n FROM t;

apparently because of the "ORDER BY 1"

("ORDER BY t.n" will just error out)
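
Moving the sort outside the recursive term is presumably the intended
spelling, and is worth testing:

WITH RECURSIVE t(n) AS (
    VALUES (1)
  UNION ALL
    SELECT n+1 FROM t WHERE n < 5
)
SELECT n FROM t ORDER BY 1;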


Compiled with:

./configure \
 --prefix=${install_dir} \
 --with-pgport=${pgport} \
 --quiet  \
 --enable-depend  \
 --enable-cassert \
 --enable-debug   \
 --with-openssl


hth

Erik Rijkers








Re: [HACKERS] [PATCHES] WIP: executor_hook for pg_stat_statements

2008-07-18 Thread Tom Lane
ITAGAKI Takahiro [EMAIL PROTECTED] writes:
 The attached patch is the proposal. It adds two global symbols:
   * ExecutorRun_hook - replacing behavior of ExecutorRun()
   * standard_ExecutorRun() - default behavior of ExecutorRun()

Applied.

 And also modifies one function:
   * ExecuteQuery() - It passes the prepared query's text to the portal so that
 the prepared query's text is available at the executor level.
 This change is almost free because it copies only the string pointer,
 not the string buffer.

This patch is unsafe because the portal could outlive the cached plan
source (consider the case that a called function does a DEALLOCATE).
However, I don't see any compelling argument against doing a pstrdup
here.  I did that and also went around and made assumptions uniform
about always having a source string for a cached plan or Portal.
So ActivePortal->sourceText should be a safe thing to consult to
see the source text of the most closely nested query being executed.
(Inside a plpgsql function, for instance, this would be the current
SQL statement of the function.)

 The attached archive pg_stat_statements.tar.gz is a demonstration of
 ExecutorRun_hook. It collects per-statement statistics: number of times planned
 and executed, plan cost, execution time, and buffer gets/reads/writes.

I don't think this works yet --- you are still using debug_query_string,
and you're assuming it will be consistent with ActivePortal->sourceText,
which it won't be in function calls and other situations.

regards, tom lane



Re: [HACKERS] [PATCH]-hash index improving

2008-07-18 Thread Kenneth Marshall
FYI,

I just patched the fill-factor calculation and re-ran my test.
The index size dropped from 513M to 43M, which is the same disk
footprint as the corresponding btree index.

Have a nice weekend.
Ken

On Fri, Jul 18, 2008 at 12:23:14PM -0500, Kenneth Marshall wrote:
 I just ran my original 16M word test case against the patched
 version, and like Tom noted below, the tuples-per-bucket
 calculation is wrong, which results in identical index sizes
 for both the original version and the hash-value-only version.
 



Re: [HACKERS] temp table problem

2008-07-18 Thread Tom Lane
Abbas [EMAIL PROTECTED] writes:
 I have come across a problem. When you try to access a temp table
 created via SPI_EXEC, you get a "table not found" error.

   SPI_EXEC("CREATE TEMP TABLE my_temp_table(first_name text, last_name
 text)", UTILITY);
   SPI_EXEC("REVOKE ALL ON TABLE my_temp_table FROM PUBLIC", UTILITY);

 The second statement generates a "table not found" error, although the
 first statement was successful.

Works for me ...

ret = SPI_execute("CREATE TEMP TABLE my_temp_table(first_name text, "
                  "last_name text)", false, 0);

if (ret != SPI_OK_UTILITY)
    elog(ERROR, "SPI_execute(CREATE) returned %d", ret);

ret = SPI_execute("REVOKE ALL ON TABLE my_temp_table FROM PUBLIC",
                  false, 0);

if (ret != SPI_OK_UTILITY)
    elog(ERROR, "SPI_execute(REVOKE) returned %d", ret);

What PG version are you testing?  Maybe you need to show a complete
test case, instead of leaving us to guess at details?

regards, tom lane



Re: [HACKERS] Load spikes on 8.1.11

2008-07-18 Thread Gurjeet Singh
On Fri, Jul 18, 2008 at 7:15 PM, David Fetter [EMAIL PROTECTED] wrote:

 On Fri, Jul 18, 2008 at 10:15:42AM +0530, Gurjeet Singh wrote:
  On Fri, Jul 18, 2008 at 10:05 AM, Gurjeet Singh [EMAIL PROTECTED]
 
  wrote:
 
   On Fri, Jul 18, 2008 at 9:58 AM, Tom Lane [EMAIL PROTECTED] wrote:
  
   Gurjeet Singh [EMAIL PROTECTED] writes:
    During these spikes, in the 'top' sessions we see the 'idle' PG
    processes consuming between 2 and 5% CPU, and since the box has 8 CPUs
    (2 sockets, each CPU a quad-core Intel Xeon) and somewhere around 200
    Postgres processes, the load spikes to above 200; and it does this
    very sharply.
  
   This looks like heavy contention for a spinlock.  You need to get a
   higher-level analysis of what's happening before anyone can say much
   more than that.
  
   Note that 8.1 is pretty much ancient history as far as scalability to
   8-core hardware goes.  You should probably consider updating to 8.3
   before investing too much time in tracking down what's happening.
   If you can still show the problem on 8.3 then there would be some
   interest in fixing it ...
  
  
  Upgrading is on the cards, but not as high priority as I would like it
  to be! This is a production box, and we desperately need some respite
  from these spikes.

  Can you please elaborate on what high-level diagnosis you would need?
  
  I just ran 'DROP SCHEMA <_slony schema> CASCADE;' and it spiked again,
  on a very lightly loaded box!!
  
   Thanks for all your help.
  
 
  Would reducing the number of connections on the DB help in reducing the
  spike?

 Just generally, reducing the number of connections to the DB will help
 in reducing resource consumption.


Will try this option, at least in the next schema upgrade or when setting up
Slony.



 When you first get a chance, use or set up a test environment where
 you can test the upgrade to 8.3.latest.


Based on the thread above, we seem to be moving towards greater consensus on
upgrading. One of the major hurdles in our environment's upgrade is the loss
of implicit casts to text in 8.3.
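
(The classic example - fine on 8.1, an error on 8.3 without an explicit
cast:

    SELECT substr(current_date, 1, 4);    -- 8.3 wants current_date::text
)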

Following is the environment we have:

select version();
                              version
--------------------------------------------------------------------
 PostgreSQL 8.1.11 on x86_64-unknown-linux-gnu, compiled by GCC gcc
 (GCC) 3.4.6 20060404 (Red Hat 3.4.6-9)
(1 row)

I cannot see oprofile installed on this box, so will try to get that
installed and get you guys some more details when this happens next.

Thanks,
-- 
[EMAIL PROTECTED]
[EMAIL PROTECTED] gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB http://www.enterprisedb.com

Mail sent from my BlackLaptop device


Re: [HACKERS] temp table problem

2008-07-18 Thread Robert Haas
I can't help suspecting that the two statements in question were run
in different sessions (or at least different transactions?).

...Robert

On Fri, Jul 18, 2008 at 9:11 PM, Tom Lane [EMAIL PROTECTED] wrote:
 Abbas [EMAIL PROTECTED] writes:
 I have come across a problem. When you try to access a temp table
 created via SPI_EXEC, you get a "table not found" error.

   SPI_EXEC("CREATE TEMP TABLE my_temp_table(first_name text, last_name
 text)", UTILITY);
   SPI_EXEC("REVOKE ALL ON TABLE my_temp_table FROM PUBLIC", UTILITY);

 The second statement generates a "table not found" error, although the
 first statement was successful.

 Works for me ...

ret = SPI_execute("CREATE TEMP TABLE my_temp_table(first_name text, "
                  "last_name text)", false, 0);

if (ret != SPI_OK_UTILITY)
    elog(ERROR, "SPI_execute(CREATE) returned %d", ret);

ret = SPI_execute("REVOKE ALL ON TABLE my_temp_table FROM PUBLIC",
                  false, 0);

if (ret != SPI_OK_UTILITY)
    elog(ERROR, "SPI_execute(REVOKE) returned %d", ret);

 What PG version are you testing?  Maybe you need to show a complete
 test case, instead of leaving us to guess at details?

regards, tom lane





[HACKERS] Getting to universal binaries for Darwin

2008-07-18 Thread Tom Lane
Awhile back we determined that the big stumbling block for building
Postgres universal binaries for OS X was that we were using ld
to produce intermediate SUBSYS.o files, and it didn't want to preserve
multi-architecture components of input .o files.  Peter got rid of that
hack recently, so I thought I'd see what it takes to build a universal
binary with CVS HEAD.

The good news is that the tools problem seems to be solved.  If you
add something like "-arch i386 -arch ppc" to CFLAGS and build normally,
you get real working multiarch binaries and libraries.  (At least it
works for me on OS X 10.5.4 --- no promises about older toolchains.)

The bad news is that if you only do that, only the arch that you
actually build on will work.  We have configure set up to insert
various hardware-dependent definitions into pg_config.h and
ecpg_config.h, and if you don't have the right values visible for
each compilation, the resulting executables will fail.

You can get around that by hacking up the generated config files
with #ifdef __i386__ and so on to expose the correct values of
the hardware-dependent symbols to each build.  Of course you have
to know what the correct values are --- if you don't have a sample
of each architecture handy to run configure against, it'd be easy
to miss some things.  And even then it's pretty tedious.  I am
not sure if it is possible or worth the trouble to try to automate
this part better.

The other big problem is that genbki.sh wants to propagate some of those
hardware-dependent symbols into postgres.bki, in particular
FLOAT4PASSBYVAL and FLOAT8PASSBYVAL.  This is a complete nonstarter,
because we put postgres.bki under share/ which means it is supposed to
be architecture independent.  (I'm really glad I realized this before we
released 8.4, because it'd violate Red Hat's multilib file rules...)
I think we can pretty easily fix that by moving the responsibility
for substituting these values into initdb, though.

Modulo the above problems, I was able to build i386+ppc binaries that
do in fact work on both architectures.  I haven't got any 64-bit Apple
machines to play with, so there might be 64-bit issues I missed.
Still, this is a huge step forward compared to what was discussed here:
http://archives.postgresql.org/pgsql-general/2008-02/msg00200.php

regards, tom lane



Re: [HACKERS] phrase search

2008-07-18 Thread Sushant Sinha
I looked at query operators for tsquery, and here are some of the new
query operators for position-based queries. I am just proposing some
changes, along with the questions I have.

1. What is the meaning of such a query operator?

foo #5 bar  - true if the document has word foo followed by bar at the
5th position

foo #<5 bar - true if the document has word foo followed by bar within
5 positions

foo #>5 bar - true if the document has word foo followed by bar after 5
positions

then some other ways it can be used are
!(foo #<5 bar) - true if the document never has any foo followed by bar
within 5 positions

etc.
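
For concreteness, queries using the proposed operators might look like
this (hypothetical - to_tsquery knows nothing of these operators today):

-- 'quick' is at position 2 and 'fox' at position 4
SELECT to_tsvector('the quick brown fox') @@ to_tsquery('quick #2 fox');
SELECT to_tsvector('the quick brown fox') @@ to_tsquery('quick #<3 fox');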

2. How to implement such query operators?

Should we modify QueryItem to include additional distance information or
is there any other way to accomplish it?

Is the following list sufficient to accomplish this?
a. Modify to_tsquery
b. Modify TS_execute in tsvector_op.c to check the new operator

Is there anything needed in the rewrite subsystem?

3. Are these valid uses of the operators, and if yes what would they
mean?

foo #5 (bar & cup)

If no, then should the operator be applied to only two QI_VAL's?

4. If the operator only applies to two query items, can we create an
index such that (foo, bar) -> documents[min distance, max distance]?
How difficult is it to implement an index like this?


Thanks,
-Sushant.

On Thu, 2008-06-05 at 19:37 +0400, Teodor Sigaev wrote:
  I can add index support and support for arbitrary distances between
  lexemes.
  It appears to me that supporting arbitrary boolean expressions will be
  complicated. Can we pull out something from TSQuery?
 
 I don't much like the idea of a separate interface for phrase search. Your 
 patch could be a module, used by people who really want to have phrase 
 search.
 
 Introducing a new operator in tsquery allows reuse of the already existing 
 infrastructure of tsquery, such as concatenations (&&, ||, !!), the rewrite 
 subsystem, etc.  But new operations/types specially designed for phrase 
 search would mean doing that work over again.
 

