[HACKERS] SoC ECPG Enhancements

2007-03-15 Thread Peter Eisentraut
The Google Summer of Code web page lists as a possible project:

ECPG Enhancements: Enable ECPG to generate calls directly to libpq 
rather than calls to its own libraries.

What would be the point of that?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [PATCHES] Bitmapscan changes

2007-03-15 Thread Heikki Linnakangas

Hannu Krosing wrote:

On Wed, 2007-03-14 at 10:22, Heikki Linnakangas wrote:
Clustered indexes have roughly the same performance effect and use cases 
as clustered indexes on MS SQL Server, and Index-Organized Tables on 
Oracle, but the way I've implemented them is significantly different. On 
other DBMSs, the index and heap are combined into a single b-tree 
structure. The way I've implemented them is less invasive: there are no 
changes to the heap, for example, and it doesn't require moving live tuples.


Do you keep visibility info in the index ?


No.


If there is no visibility data in the index, then I can't see how it gets
the same performance effect as Index-Organized Tables, as a lot of random
heap access is still needed.


Let me illustrate the effect in the best case, with a table that 
consists of just the key:


Normal b-tree:

Root - leaf - heap

aaa - aaa - aaa
   bbb - bbb
   ccc - ccc
ddd - ddd - ddd
   eee - eee
   fff - fff
ggg - ggg - ggg
   hhh - hhh
   iii - iii

Clustered b-tree:

Root - heap

aaa - aaa
   bbb
   ccc
ddd - ddd
   eee
   fff
ggg - ggg
   hhh
   iii

The index is much smaller, one level shallower in the best case. A 
smaller index means that more of it fits in cache. If you're doing 
random access through the index, you need less I/O because you don't 
need to fetch as many index pages. You need to access the heap anyway 
for the visibility information, as you pointed out, but the savings 
come from doing less index I/O.


How close to the best case do you get in practice? It depends on your 
schema (narrow tables, or tables with wide keys, gain the most) and on the 
clusteredness of the table.
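
Heikki's point about index size can be checked on any existing installation. A minimal sketch using pg_relation_size (available since 8.1); the table and index names here are illustrative, not from the patch:

```sql
-- Compare heap size to index size for a hypothetical table.
-- The closer index_bytes is to heap_bytes (i.e. the narrower the table),
-- the more a clustered index of this kind would save.
SELECT pg_relation_size('some_table')      AS heap_bytes,
       pg_relation_size('some_table_pkey') AS index_bytes;
```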


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] [PATCHES] Bitmapscan changes

2007-03-15 Thread Heikki Linnakangas

Alvaro Herrera wrote:

Which is why we don't do things that way.  The code must fit within the
general architecture before application -- particularly if it's an
internal API change.  That's what the review process is for.


Yes, of course. As I've said, I have the time to work on this, but I 
need to get the review process *started*. Otherwise I'll just tweak and 
polish the patch for weeks, and end up with something that gets rejected 
in the end anyway.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



[HACKERS] where to add/change commands

2007-03-15 Thread Grzegorz Jaskiewicz

Hi folks

I am trying to make the CLUSTER command just a bit more verbose, as an 
option. I added bits to gram.y that should support CLUSTER [VERBOSE] ..., 
but psql doesn't seem to pick it up.
Where else do I need to update it, besides gram.y, cluster.c and of 
course adding a new field to ClusterStmt?
I tried searching the list for such hints, but didn't find anything 
useful.


thanks.

--
Grzegorz Jaskiewicz
[EMAIL PROTECTED]

C/C++ Freelance to hire.




Re: [HACKERS] CLUSTER and MVCC

2007-03-15 Thread Heikki Linnakangas

Heikki Linnakangas wrote:
One potential issue I'm seeing is that if we rely on the unbroken chain 
starting from  OldestXmin, and that tuple isn't there because of a bug, 
for example, the later version of the tuple is skipped and the row is lost.


After further thought, I don't feel comfortable with the idea because of 
 the above loss of robustness.


I'm thinking of keeping an in-memory mapping of old and new tids of 
updated tuples while clustering, instead. That means that cluster 
requires a little bit of memory for each RECENTLY_DEAD updated tuple. In 
the worst case you run out of memory if there are too many of those in 
the table, but I doubt that's going to be a problem in practice.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] where to add/change commands

2007-03-15 Thread Heikki Linnakangas

Grzegorz Jaskiewicz wrote:
I am trying to make the CLUSTER command just a bit more verbose, as an 
option. I added bits to gram.y that should support CLUSTER [VERBOSE] ..., but 
psql doesn't seem to pick it up.


psql? There's some code in src/bin/psql/common.c to detect commands that 
can't be run in a transaction block; maybe that's what you're 
looking for.


Or did you mean something else? How doesn't it pick it up?

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] [PATCHES] Bitmapscan changes

2007-03-15 Thread Heikki Linnakangas

Joshua D. Drake wrote:

This is what I suggest.

Provide a tarball of -head with the patch applied.


Here you are:

http://community.enterprisedb.com/git/pgsql-git-20070315.tar.gz


Provide a couple of use cases that can be run with explanation of how to
verify the use cases.


There are a number of simple test cases on the web page that I've used 
(perfunittests). I can try to simplify them and add explanations.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] where to add/change commands

2007-03-15 Thread Grzegorz Jaskiewicz


On Mar 15, 2007, at 11:36 AM, Heikki Linnakangas wrote:


Grzegorz Jaskiewicz wrote:
I am trying to make the CLUSTER command just a bit more verbose, as an 
option. I added bits to gram.y that should support CLUSTER 
[VERBOSE] ..., but psql doesn't seem to pick it up.


psql? There's some code to detect commands that can't be run in a  
transaction block in src/bin/psql/common.c, maybe that's what  
you're looking for.



meaning that if I type in CLUSTER VERBOSE in psql, I get a syntax error.


Or did you mean something else? How doesn't it pick it up?



well, a patch is probably worth 1000 words. Hope attaching such a 
small file isn't a crime in this neck of the woods.


clusterverbose.patch
Description: Binary data


--
Grzegorz Jaskiewicz
[EMAIL PROTECTED]






Re: [HACKERS] where to add/change commands

2007-03-15 Thread NikhilS

Hi,



psql? There's some code to detect commands that can't be run in a
transaction block in src/bin/psql/common.c, maybe that's what you're
looking for.

Or did you mean something else? How doesn't it pick it up?




I think he probably meant that he was getting a syntax error, even after
making all the changes.

Grzegorz, I would have suggested making an entry for VERBOSE in
parser/keywords.c, but it already seems to contain one. I hope you are
using the opt_verbose rule in your gram.y in the CLUSTER [VERBOSE] case.

Regards,
Nikhils

--
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] where to add/change commands

2007-03-15 Thread Grzegorz Jaskiewicz


On Mar 15, 2007, at 11:46 AM, NikhilS wrote:


Grzegorz, I would have suggested to make an entry for VERBOSE in  
parser/keywords.c, but it already seems to contain an entry for  
VERBOSE. I hope you are using the opt_verbose rule in your gram.y  
in the CLUSTER [VERBOSE] case.




Sure I am. My 'bison' skills are not very advanced, but I was trying to  
mimic the 'VACUUM' syntax there. See the patch.



--
Grzegorz Jaskiewicz
[EMAIL PROTECTED]


C/C++ freelance to hire



Re: [HACKERS] where to add/change commands

2007-03-15 Thread Peter Eisentraut
Grzegorz Jaskiewicz wrote:
 meaning that if I type in CLUSTER VERBOSE in psql, I get a syntax
 error.

Your patch works perfectly fine for me.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] where to add/change commands

2007-03-15 Thread Grzegorz Jaskiewicz


On Mar 15, 2007, at 12:26 PM, Peter Eisentraut wrote:



Your patch works perfectly fine for me.


Ok, I'll try make distclean / make -j2 / sudo make install, reboot and  
will see.
I was kind of believing that remaking the whole thing from scratch over  
and over again wouldn't be a necessity here.

thanks.

--
Grzegorz Jaskiewicz

C/C++ freelance for hire








Re: [HACKERS] where to add/change commands

2007-03-15 Thread Gregory Stark
Grzegorz Jaskiewicz [EMAIL PROTECTED] writes:

 On Mar 15, 2007, at 12:26 PM, Peter Eisentraut wrote:


 Your patch works perfectly fine for me.

 Ok, I'll try make distclean / make -j2 / sudo make install, reboot and will see.
 I was kind of believing that remaking the whole thing from scratch over and over
 again wouldn't be a necessity here.
 thanks.

You may want to configure with --enable-depend. It shouldn't be necessary to
get make to notice changes to gram.y but perhaps your problem lies elsewhere.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] where to add/change commands

2007-03-15 Thread Grzegorz Jaskiewicz

OK, it works now.
Thanks.

I am really serious about this patch. I would like it to not  
only show which tables/indices are being clustered, but also some  
sort of progress information (print the percentage of the job done, etc.).


Any hints on what might be useful for others? Perhaps there's a  
slim chance I could offer it to the mainline?


--
Grzegorz Jaskiewicz

C/C++ freelance for hire








[HACKERS] [RFC] CLUSTER VERBOSE

2007-03-15 Thread Grzegorz Jaskiewicz

Hi folks,

I figured I should start a brand new thread for this one, so here you  
go.



I am in need of a verbose CLUSTER, i.e. one that would give me  
feedback and progress.
Because CLUSTER is divided into two major operations (data  
reordering, index rebuild), I see it this way:


CLUSTER on I: index name T: table name, data reordering
CLUSTER on I: index name T: table name, index rebuild

and then:
CLUSTER 10%
CLUSTER 12%, etc.

(Yeah, I know how hard it is to write good progress reporting...)
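
To make the proposal concrete, the invocation would presumably mirror the existing CLUSTER forms. The VERBOSE variants below are hypothetical, modeled on VACUUM VERBOSE; they exist in no released version:

```sql
-- Existing 8.2 syntax:
CLUSTER some_index ON some_table;

-- Proposed (hypothetical):
CLUSTER VERBOSE some_index ON some_table;
CLUSTER VERBOSE;  -- re-cluster all previously clustered tables, with progress
```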

I don't have even the slightest doubt that it can be useful, just like  
VACUUM VERBOSE is. So no question about it.

I am seeking comments and ideas.
The patch would not be very intrusive; at the moment no one is using  
VERBOSE with CLUSTER, because it is not there. And nothing would change  
in this area.

I am looking for opinions on what information should be presented.
Perhaps there's also a use for some information it might gather  
elsewhere (stats, etc.), but that's not really my point at the moment.


Thanks for all comments.
BTW, I would really appreciate not being CC'd on this; I have been  
subscribed here for yeaaars now (since 8.0 times).


ta.


--
Grzegorz Jaskiewicz

C/C++ freelance for hire








Re: [HACKERS] how to add seconds to a TimestampTz

2007-03-15 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  Is there a better way than going to time_t and back?
 
 Isn't the standard SQL-level locution
   timestamptz + numeric_value * '1 second'::interval
 ?  I'm not sure what would be the most convenient realization
 of this at the C level, but please stay away from time_t ...

Yes, it is, but we would have to be calling interval input,
interval-times-integer, then interval-plus-timestamp, each time we
scheduled a worker (to calculate the time of next worker start), which
sounds like too much work ... an idea would be to store the result of
interval input and multiplication, and recalculate each time we got
SIGHUP.

But then, maybe this is premature optimization on my part.  I'll write
it in the natural form for now, and then we'll see if we should rewrite
it; and then, maybe it's easier to write something equivalent to
TimestampDifference ...
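
Written out, the SQL-level idiom Tom suggests looks like this (plain SQL, independent of the autovacuum code being discussed):

```sql
-- integer * interval is valid, so a naptime in seconds composes directly:
SELECT now() + 60 * interval '1 second';  -- one minute from now

SELECT timestamptz '2007-03-15 00:00:00+00' + 90 * interval '1 second';
-- 90 seconds past midnight UTC, rendered in the session's timezone
```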

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] how to add seconds to a TimestampTz

2007-03-15 Thread Grzegorz Jaskiewicz


On Mar 15, 2007, at 5:58 AM, Tom Lane wrote:



?  I'm not sure what would be the most convenient realization
of this at the C level, but please stay away from time_t ...


what's wrong with time_t? Does postgres have some sort of time API  
that can be used instead?


--
Grzegorz Jaskiewicz

C/C++ freelance for hire








[HACKERS] Additional global stats for pg_database?

2007-03-15 Thread Magnus Hagander
Hi!

I'd like to add the ability to show global counters for a couple of
things currently available in pg_stat_all_tables (for example the number
of rows added, etc.) for the entire cluster, so that a whole cluster can
be monitored without having to connect to each individual database.

Should I try adding this to the existing pg_stat_database view, or should
I look at creating a separate view or two?

//Magnus



Re: [HACKERS] how to add seconds to a TimestampTz

2007-03-15 Thread Sailesh Krishnamurthy
If you read the autovacuum_naptime into an Interval object once, why can't 
you just use timestamptz_pl_interval? Surely you won't be using the interval 
input/output repeatedly.

Regards
Sailesh

--
Sailesh Krishnamurthy
Amalgamated Insight
[W] (650) 242-3503
[C] (650) 804-6585

-Original Message-
From: [EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED] On Behalf Of Alvaro Herrera
Sent: Wednesday, March 14, 2007 3:46 PM
To: Hackers
Subject: [HACKERS] how to add seconds to a TimestampTz

Is there a better way than going to time_t and back?  I am currently
using this:

db->next_worker =
time_t_to_timestamptz(timestamptz_to_time_t(current_time) +
  autovacuum_naptime);

(db->next_worker is a TimestampTz, as is current_time;
autovacuum_naptime is an integer number of seconds.)

but it doesn't seem clean, and the comments in the functions more or
less say that their use is discouraged.

I saw about doing it via the interval input/output but that's an awful
lot of work ...

Is this the first time this is going to be done in the backend?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support





Re: [HACKERS] My honours project - databases using dynamically attached entity-properties

2007-03-15 Thread Edward Stanley
On Wed, 14 Mar 2007, David Fetter wrote:
 On Tue, Mar 13, 2007 at 05:54:34PM +, Richard Huxton wrote:
  David Fetter wrote:
  On Tue, Mar 13, 2007 at 02:21:37PM +, Richard Huxton wrote:
  David Fetter wrote:
  On Tue, Mar 13, 2007 at 09:31:45AM +, Richard Huxton wrote:
  * Another good example is the questionnaire.
  
  With all due respect, this is a solved problem *without EAV or
  run-time DDL*.  The URL below has one excellent approach to this.
  
  http://www.varlena.com/GeneralBits/110.php
  
  Which broadly speaking was the solution I used for my
  questionnaire, except I had a restricted set of types so basically
  just coerced them to text and side-stepped the inheritance issue.
  To the extent that it's dynamic, it's still just EAV though.
  
  That's precisely the difference between the above solution and
  yours, and it's the difference between a good design and one that
  will come up and bit you on the as^Hnkle.
 
  It's still basically EAV (either approach).  The key fault with EAV
  is that the tables have no semantic meaning - answer_int contains
  number of oranges, days since birth and the price of a tube ticket
  in pennies.

 Stuffing all of those into an answer_int is *precisely* what the end
 user must not do.  That's pilot error.

  Now, with a questionnaire that might not matter because everything
  is an answer and you're not necessarily going to do much more than
  count/aggregate it.

 See above.

  It doesn't remove the need for run-time DDL if you allow users to
  add their own questions.
  
  Sure it does.   When a user, who should be talking with you, wants
  to ask a new kind of question, that's the start of a discussion
  about what new kind(s) of questions would be generally applicable
  in the questionnaire schema.  Then, when you come to an agreement,
  you roll it into the new schema, and the whole system gets an
  improvement.
 
  Fine, but if you're not letting the user extend the system, then
  it's not really addressing Edward's original posting, is it?

 It's my contention that Edward's original idea is ill-posed.  SQL is
 just fine for doing this kind of thing, and it's *not that hard*.

  If the user's talking to me, I might as well just write the DDL
  myself - it's the talk that'll take the time, not writing a dozen
  lines of SQL.

 It's the talk that's the important part.  Machines are really bad at
 seeing the broader picture.  In the attempt to save a few minutes'
 discussion, he's trying to borrow that time from a system asked to do
 things that computers are inherently bad at doing, and every end user
 will pay that time back at a very high rate of interest.  This is
 precisely the kind of false economy that so plagues software
 development and maintenance these days.

  The interesting part of the problem (from a Comp-Sci point of view)
  is precisely in automating part of that discussion.  It's providing
  an abstraction so that you don't end up with a mass of attributes
  while still providing freedom to the user.

 This freedom and efficiency you're talking about is better supplied,
 IMHO, by putting a standard DDL for questionnaires up on a pgfoundry
 or an SF.net.  That way, improvements to the DDL get spread all over
 the world, and a very large amount of wheel reinvention gets avoided.
 Reusable components are a big chunk of both freedom and efficiency. :)

 Cheers,
 D

Maybe I should rethink the problem a bit - from the very brief initial 
research I've done, it seems EAV schemas have two common uses: 

1) When new attributes have to be created on-the-fly 
2) When the number of possible properties for an entity greatly (orders of 
magnitude) exceeds the number of properties any one entity is likely to have. 

I'm not sure about solving the first problem - there seems to be a lot of 
debate around this. I can see reasons for and against allowing this. However 
I think the second is a very real problem. One such example is a patient 
record system.

For each patient we have a table of common data (dob, sex, height, weight etc) 
but as well as this a patient can present with many symptoms. This might be a 
table of 40,000 possible symptoms. 

Let's say we want to run a query on these symptoms (using a boolean expression) 
to return the patient records which match the query string on the symptoms.

(This turns out to be a very similar problem to the 'tags' example I first 
presented) - assume a similar schema. With more than a couple of symptoms and 
a complex tree, the resulting SQL can span pages. 
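
To make the page-spanning concrete, here is a sketch of how one such boolean query ("fever AND cough AND NOT rash") might look against a plain join-table schema. The table and column names (patient, symptom, patient_presents_with, name) are illustrative assumptions, not from the original post:

```sql
-- Assumed illustrative schema:
--   patient(patient_id, ...), symptom(symptom_id, name),
--   patient_presents_with(patient_id, symptom_id)
SELECT p.patient_id
FROM patient p
WHERE EXISTS (SELECT 1 FROM patient_presents_with w
              JOIN symptom s USING (symptom_id)
              WHERE w.patient_id = p.patient_id AND s.name = 'fever')
  AND EXISTS (SELECT 1 FROM patient_presents_with w
              JOIN symptom s USING (symptom_id)
              WHERE w.patient_id = p.patient_id AND s.name = 'cough')
  AND NOT EXISTS (SELECT 1 FROM patient_presents_with w
              JOIN symptom s USING (symptom_id)
              WHERE w.patient_id = p.patient_id AND s.name = 'rash');
```

Each extra term in the boolean expression adds another EXISTS block, which is how the generated SQL grows to span pages.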

When I first started thinking about this project I believed the two problems 
essentially to be the same class of problem, but this may not be the case.

What do people think?

 - Also, thanks everyone for your input thus far. It has been very valuable.


Eddie Stanley



Re: [HACKERS] My honours project - databases using dynamically attached entity-properties

2007-03-15 Thread Eddie Stanley

David Fetter wrote:

On Wed, Mar 14, 2007 at 12:07:45PM +1300, Edward Stanley wrote:
  


Maybe I should rethink the problem a bit - from the very brief
initial research I've done, it seems EAV schemas have two common
uses:

1) When new attributes have to be created on-the-fly
2) When the number of possible properties for an entity greatly (orders of 
magnitude) exceeds the number of properties any one entity is likely to have.

Um, no.  The first use case is bad coding practice, and the second is
a classic case for a join table, which is the standard way to handle
M:N relationships.

I'm not sure about solving the first problem - there seems to be a lot of
debate around this. I can see reasons for and against allowing this. However 
I think the second is a very real problem. One such example is a patient 
record system.


For each patient we have a table of common data (dob, sex, height, weight etc) 
but as well as this a patient can present with many symptoms. This might be a 
table of 40,000 possible symptoms. 



Here's how I'd do that:

CREATE TABLE patient (
patient_id SERIAL PRIMARY KEY, /* for simplicity.  Some
  combination of columns in the
  table would also have a UNIQUE
  NOT NULL constraint on it.
*/
...
);

CREATE TABLE symptom (
symptom_id SERIAL PRIMARY 

Re: [HACKERS] Calculated view fields (8.1 != 8.2)

2007-03-15 Thread Gaetano Mendola

Tom Lane wrote:
 Gaetano Mendola [EMAIL PROTECTED] writes:
 I can imagine a case where a lower module exports a view to an upper layer,
 stating the interface as a list of fields:
 
 first_name, last_name, 
 
 with a *hidden* field that is a function call that updates the statistics on
 how many times a given record was selected; then this technique can not be
 used anymore starting with 8.2.x.
 
 You're living in a dream world if you think that works reliably in *any*
 version of Postgres.  But for starters, what is your definition of
 selected --- pulled from the physical table?  Accumulated into an
 aggregate?  Delivered as a recognizable row to the client?  Delivered N
 times to the client due to joining N times to some other table?

Well, that was not a good example. I don't have any problem in marking, from now
on, all my functions as stable/immutable (the ones I use in views), but I still
believe it is a source of bad performance to evaluate a function on discarded rows,
and at the same time this breaks the principle of least surprise.

Regards
Gaetano Mendola




Re: [HACKERS] My honours project - databases using dynamically attached entity-properties

2007-03-15 Thread Trent Shipley
On Wednesday 2007-03-14 08:26, Csaba Nagy wrote:
 On Wed, 2007-03-14 at 16:08, [EMAIL PROTECTED] wrote:
  On Wed, Mar 14, 2007 at 02:28:03PM +, Gregory Stark wrote:
   David Fetter [EMAIL PROTECTED] writes:
CREATE TABLE symptom (
symptom_id SERIAL PRIMARY KEY, /* See above. */
...
);
   
CREATE TABLE patient_presents_with (
patient_id INTEGER NOT NULL REFERENCES patient(patient_id),
symptom_id INTEGER NOT NULL REFERENCES symptom(symptom_id),
UNIQUE(patient_id, symptom_id)
);
  
   I'm just glad I don't have your doctor. I hope mine doesn't think
   symptoms are all boolean values.
 
  Where is the boolean above? It is M:N, with each having whatever data
  is required.

 The boolean is assumed in the symptoms table. In any case, even if it's
 not a boolean value, even if maybe the symptoms table is a complex one
 on its own, it still is one single type for all symptoms of all
 patients. The real problem is that in some real world applications you
 have a mix of wildly varying types of attributes a user might want to
 use, and you can't know what those will be beforehand... the symptoms
 thing is simple to solve in the way David did it, but there really are
 other situations which a simple m:n can't easily cover. How would you
 handle a database of user settings for 10K different applications and
 100M different users, where each application must be able to store its
 own (type safe!) settings in the same structure, and applications come
 and go with their own settings? Come up with a good solution to this,
 combined with queries like give me all the users who have this set of
 settings set to these values running fast, and then you're talking.

 Cheers,
 Csaba.

At that point traditional relational database design starts to break down.  
If you need THAT MUCH flexibility, it's probably time to look at network 
databases, knowledge bases, extensible knowledge bases, and ad hoc knowledge 
bases (OWL, RDF, etc). Flexibility, friendliness to marketeers or 
accountants, extensibility: none of these are really the strong points of 
relational databases. Databases as they exist today do best with finite 
domains that can be formally organized.



Re: [HACKERS] Additional global stats for pg_database?

2007-03-15 Thread Alvaro Herrera
Magnus Hagander wrote:
 Hi!
 
 I'd like to add the ability to show global counters of a couple of
 things currently available in pg_stat_all_tables (for example number of
 rows added etc) in a global counter for the entire cluster, for
 monitoring of an entire cluster without having to connect to each
 individual database. 
 
 Should I try adding this to the exsting pg_stat_database view, or should
 I look at creating a separate view or two?

I'd say add that to each database entry.  To get the sum of all
databases you'd need some sort of refactoring, because currently when a
backend reads the stats file it only gets its own stats.  One idea is to
create an additional counter in the stats file, stored separately from
database entries.

If you add tuples updated/deleted/inserted counters to databases, autovacuum
can make use of that too (it's even mentioned in an XXX comment in the
code somewhere).

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] CLUSTER and MVCC

2007-03-15 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 I'm thinking of keeping an in-memory mapping of old and new tids of 
 updated tuples while clustering, instead. That means that cluster 
 requires a little bit of memory for each RECENTLY_DEAD updated tuple. In 
 the worst case that means that you run out of memory if there's too many 
 of those in the table, but I doubt that's going to be a problem in practice.

That is more or less isomorphic to what VACUUM FULL does.  While people
have complained about VACUUM FULL's memory usage on occasion, just at
the moment I feel that the main problem with it is complexity.  If we
still haven't gotten all the bugs out of VACUUM FULL after more than
eight years of work on it, what are the odds that we can make CLUSTER
do it right the first time?

regards, tom lane



Re: [HACKERS] Backend crash in 8.2.3 with plpgsql function

2007-03-15 Thread Gaetano Mendola

Tom Lane wrote:
 SM [EMAIL PROTECTED] writes:
 I got a backend crash in Postgresql 8.2.3 with the plpgsql function.
 The following statement in psql causes a signal 11:
 psql=# create function testpl() returns void as 'begin return; 
 end;'language 'plpgsql';
 
 Worksforme ...

For me too.

$ psql
Welcome to psql 8.2.3, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
   \h for help with SQL commands
   \? for help with psql commands
   \g or terminate with semicolon to execute query
   \q to quit

kalman=# create function testpl() returns void as 'begin return; end;'language 
'plpgsql';
CREATE FUNCTION
kalman=# select version();
                                version
------------------------------------------------------------------------
 PostgreSQL 8.2.3 on i686-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.1 
20070105 (Red Hat 4.1.1-51)
(1 row)




Regards
Gaetano Mendola



Re: [HACKERS] UPDATE using sub selects

2007-03-15 Thread Gaetano Mendola

NikhilS wrote:

 I have tried some update-subselect variations and they seem to work. For
 example the case in the src/test/regress/sql/update.sql, which used to
 fail till now, seems to work:
 
 UPDATE update_test SET (a,b) = (select a,b FROM update_test where c =
 'foo')
   WHERE a = 10;

What's the expected result if the subselect returns more than one tuple?
And I'd expect no update at all in case of an empty result set; is that the case?



Regards
Gaetano Mendola



Re: [HACKERS] Additional global stats for pg_database?

2007-03-15 Thread Magnus Hagander
On Thu, Mar 15, 2007 at 10:25:57AM -0400, Alvaro Herrera wrote:
 Magnus Hagander wrote:
  Hi!
  
  I'd like to add the ability to show global counters of a couple of
  things currently available in pg_stat_all_tables (for example number of
  rows added etc) in a global counter for the entire cluster, for
  monitoring of an entire cluster without having to connect to each
  individual database. 
  
  Should I try adding this to the existing pg_stat_database view, or should
  I look at creating a separate view or two?
 
 I'd say add that to each database entry.  To get the sum of all
 databases you'd need some sort of refactoring, because currently when a
 backend reads the stats file it only gets its own stats.  One idea is to
 create an additional counter in the stats file, stored separately from
 database entries.
 
 If you add a tuples updated/deleted/inserted to databases, autovacuum
 can make use of that too (it's even mentioned on an XXX comment in the
 code somewhere).

As you can surely tell, I didn't actually check what code would be
needed before I asked ;-) I'll see what/when I can do.

//Magnus



Re: [HACKERS] how to add seconds to a TimestampTz

2007-03-15 Thread Alvaro Herrera
Grzegorz Jaskiewicz wrote:
 
 On Mar 15, 2007, at 5:58 AM, Tom Lane wrote:
 
 
 ?  I'm not sure what would be the most convenient realization
 of this at the C level, but please stay away from time_t ...
 
 What's wrong with time_t? Does postgres have some sort of time API
 that can be used instead?

Sure.  See TimestampTz (as mentioned in the subject).
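
At the SQL level the equivalent operation is plain interval arithmetic; a minimal sketch (the C-level analogue would use the backend's TimestampTz arithmetic rather than time_t):

```sql
-- Adding seconds to a timestamptz via interval arithmetic:
SELECT now() + interval '30 seconds';

-- Or with a computed number of seconds:
SELECT now() + 42 * interval '1 second';
```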

-- 
Alvaro Herrera                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] CLUSTER and MVCC

2007-03-15 Thread Heikki Linnakangas

Tom Lane wrote:

Heikki Linnakangas [EMAIL PROTECTED] writes:
I'm thinking of keeping an in-memory mapping of old and new tids of 
updated tuples while clustering, instead. That means that cluster 
requires a little bit of memory for each RECENTLY_DEAD updated tuple. In 
the worst case that means that you run out of memory if there's too many 
of those in the table, but I doubt that's going to be a problem in practice.


That is more or less isomorphic to what VACUUM FULL does.  While people
have complained about VACUUM FULL's memory usage on occasion, just at
the moment I feel that the main problem with it is complexity.  If we
still haven't gotten all the bugs out of VACUUM FULL after more than
eight years of work on it, what are the odds that we can make CLUSTER
do it right the first time?


Well, I can't guarantee that there's no bugs.

To copy a chain correctly, we need to correctly detect tuples that have 
a t_ctid pointing to a non-dead tuple (non-dead meaning 
HeapTupleSatisfiesVacuum(tuple) != DEAD), and tuples that are being 
pointed to by a non-dead tuple. If we incorrectly detect that a tuple 
belongs to either of those categories, when in fact it doesn't, we don't 
corrupt anything, but we waste a little bit of memory memorizing the 
tuple unnecessarily.


To detect tuples in the first category, we need to check that xmax of 
the tuple isn't invalid, and t_ctid doesn't point to itself.


To detect tuples in the second category, we need to check that xmin 
isn't invalid, and is greater than OldestXmin.


With both categories correctly identified, it's just a matter of mapping 
old ctids to corresponding tids in the new heap.


Unlike in my first proposal, if something nevertheless goes wrong in 
detecting the chains, we only lose the chaining between the tuples, but 
we don't otherwise lose any data. The latest version of each row is fine 
anyway. I think this approach is pretty robust, and it fails in a good way.
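
A hedged sketch of the kind of update chain being discussed (the table name is invented; making the old version RECENTLY_DEAD rather than DEAD requires another session holding a snapshot old enough to keep OldestXmin back):

```sql
-- An UPDATE creates a chain: the old tuple's t_ctid points at the new one.
CREATE TABLE chain_demo (k int PRIMARY KEY, v text);
INSERT INTO chain_demo VALUES (1, 'old');
UPDATE chain_demo SET v = 'new' WHERE k = 1;

-- The system columns show the surviving end of the chain; CLUSTER must
-- preserve the old->new linkage for RECENTLY_DEAD predecessors.
SELECT ctid, xmin, xmax, v FROM chain_demo;
```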


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] [PATCHES] Bitmapscan changes

2007-03-15 Thread Joshua D. Drake
Heikki Linnakangas wrote:
 Joshua D. Drake wrote:
 This is what I suggest.

 Provide a tarball of -head with the patch applied.
 
 Here you are:
 
 http://community.enterprisedb.com/git/pgsql-git-20070315.tar.gz
 
 Provide a couple of use cases that can be run with explanation of how to
 verify the use cases.
 
 There's a number of simple test cases on the web page that I've used
 (perfunittests). I can try to simplify them and add explanations.

I am downloading now.

Joshua D. Drake




-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [HACKERS] My honours project - databases using dynamically attached entity-properties

2007-03-15 Thread Ron Mayer
Josh Berkus wrote:
 And then what? dynamically construct all your SQL queries?
 Sure, sounds like a simple solution to me...
 
 Not to mention DB security issues.  How do you secure your database when 
 your web client has DDL access?
 
 So, Edward, the really *interesting* idea would be to come up with a 
 secure, normalized way to do UDFs *without* EAV tables.  People would be 
 very impressed.
 

I have a system with many essentially user-defined fields, and was
thinking of creating something similar to an Array type and writing
some GIST indexes for it.

My current workaround is to store them as a YAML document and use
tsearch to index it (with application logic to further refine the
results) - but an EAV datatype that could be put in tables and
effectively indexed would be of quite a bit of interest here.
And yes, a better way to do UDFs would be even cooler.



Re: [HACKERS] My honours project - databases using dynamically attached entity-properties

2007-03-15 Thread tomas

On Thu, Mar 15, 2007 at 08:31:24AM -0700, Ron Mayer wrote:
 Josh Berkus wrote:
  And then what? dynamically construct all your SQL queries?
  Sure, sounds like a simple solution to me...
  
  Not to mention DB security issues.  How do you secure your database when 
  your web client has DDL access?
  
  So, Edward, the really *interesting* idea would be to come up with a 
  secure, normalized way to do UDFs *without* EAV tables.  People would be 
  very impressed.
  
 
 I have a system with many essentially user-defined fields, and was
 thinking of creating something similar to an Array type and writing
 some GIST indexes for it.

Do give hstore a try. It indexes over name/value mappings (the same
authors have intarray, indexing over arrays). If it doesn't solve your
problem it'll be at least a very good starting point for what you
mention above (basically it's a lossy GiST index based on a hash bitmap;
I was quite impressed).
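
A hedged sketch of that suggestion (assumes contrib/hstore is installed; table name and the exact containment operator are assumptions and may differ between hstore versions):

```sql
CREATE TABLE things (id serial PRIMARY KEY, attrs hstore);
INSERT INTO things (attrs) VALUES ('color=>red, size=>large');

-- The lossy GiST index over the name/value pairs mentioned above:
CREATE INDEX things_attrs_idx ON things USING gist (attrs);

-- Find rows whose attrs contain a given key/value pair:
SELECT id FROM things WHERE attrs @> 'color=>red';
```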

Regards
- -- tomás




Re: [HACKERS] My honours project - databases using dynamically attached entity-properties

2007-03-15 Thread A.M.


On Mar 15, 2007, at 11:31 , Ron Mayer wrote:


Josh Berkus wrote:

And then what? dynamically construct all your SQL queries?
Sure, sounds like a simple solution to me...


Not to mention DB security issues.  How do you secure your  
database when

your web client has DDL access?

So, Edward, the really *interesting* idea would be to come up with a
secure, normalized way to do UDFs *without* EAV tables.  People  
would be

very impressed.



I have a system with many essentially user-defined fields, and was
thinking of creating something similar to an Array type and writing
some GIST indexes for it.

My current workaround is to store them as a YAML document and use
tsearch to index it (with application logic to further refine the
results) - but a EAV datatype that could be put in tables and
effectively indexed would be of quite a bit of interest here.
And yes, a better say to do UDFs would be even cooler.


Out of all the databases that I have used, PostgreSQL offers the most
flexible DDL, mostly for one reason: DDL statements can operate within
transactions.


To handle arbitrary strings as column identifiers, the column names  
could actually be stripped down to lower-case letters and the real  
title could be stored in a separate table or as column comments.
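
A hedged sketch of that renaming scheme (table and column names are invented for illustration):

```sql
-- Sanitized identifier for the engine, human-readable title as a comment:
ALTER TABLE user_data ADD COLUMN field_0042 text;
COMMENT ON COLUMN user_data.field_0042 IS 'Favourite Colour (user-defined)';

-- The real title can be read back from the catalogs when building the UI:
SELECT col_description('user_data'::regclass, attnum)
FROM pg_attribute
WHERE attrelid = 'user_data'::regclass AND attname = 'field_0042';
```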


Mr. Berkus' concern regarding the security implications is already  
handled by privilege separation or security-definer functions.


The OP's concern about the difficulty of querying a schema
structure is alleviated via any number of APIs in Perl, JDBC, etc.


It seems to me that postgresql is especially well-suited to run DDL  
at runtime, so what's the issue?



-M



[HACKERS] tsearch_core for inclusion

2007-03-15 Thread Teodor Sigaev
Last time there was a fight about the syntax of the introduced commands, so we
(Oleg and I) developed a variant of the patch with another syntax. We will not
change the docs until agreement is reached; the current version is at
http://mira.sai.msu.su/~megera/pgsql/ftsdoc/


The following demonstrates a subset of the FTS syntax, using the example from
http://mira.sai.msu.su/~megera/pgsql/ftsdoc/fts-complete-tut.html.


Syntax which was proposed:
begin;
CREATE FULLTEXT CONFIGURATION public.pg ( LOCALE = 'ru_RU.UTF-8' ) LIKE english WITH MAP;

CREATE FULLTEXT DICTIONARY pg_dict ( OPT = 'pg_dict.txt' ) LIKE synonym;
CREATE FULLTEXT DICTIONARY en_ispell
( OPT = 'DictFile=/usr/local/share/dicts/ispell/english-utf8.dict,
 AffFile=/usr/local/share/dicts/ispell/english-utf8.aff,
 StopFile=/usr/local/share/dicts/ispell/english-utf8.stop'
) LIKE ispell_template;
ALTER FULLTEXT DICTIONARY en_stem SET OPT='/usr/local/share/dicts/ispell/english-utf8.stop';

ALTER FULLTEXT MAPPING ON pg FOR lword,lhword,lpart_hword
 WITH pg_dict,en_ispell,en_stem;
DROP FULLTEXT MAPPING ON pg FOR email, url, sfloat, uri, float;
end;
Patch is http://www.sigaev.ru/misc/tsearch_core-0.38.gz

Revising the above FTS syntax as Peter suggested, we removed the unnecessary
parentheses and equal signs, and OPT becomes OPTION.

begin;
CREATE FULLTEXT CONFIGURATION public.pg LOCALE 'ru_RU.UTF-8' LIKE english WITH MAP;
CREATE FULLTEXT DICTIONARY pg_dict OPTION 'pg_dict.txt'  LIKE synonym;
CREATE FULLTEXT DICTIONARY en_ispell
OPTION 'DictFile=/usr/local/share/dicts/ispell/english-utf8.dict,
 AffFile=/usr/local/share/dicts/ispell/english-utf8.aff,
 StopFile=/usr/local/share/dicts/ispell/english-utf8.stop'
LIKE ispell_template;
ALTER FULLTEXT DICTIONARY en_stem SET OPTION '/usr/local/share/dicts/ispell/english-utf8.stop';
ALTER FULLTEXT MAPPING ON pg FOR lword,lhword,lpart_hword
 WITH pg_dict,en_ispell,en_stem;
DROP FULLTEXT MAPPING ON pg FOR email, url, sfloat, uri, float;
end;
Patch is http://www.sigaev.ru/misc/tsearch_core-0.38.1.gz

Comparing that syntaxes with current tsearch2 is placed at 
http://mira.sai.msu.su/~megera/pgsql/ftsdoc/fts-syntax-compare.html


So, which syntax is more attractive? And are there any other objections?


--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] My honours project - databases using dynamically attached entity-properties

2007-03-15 Thread Gregory Stark
Ron Mayer [EMAIL PROTECTED] writes:

 I have a system with many essentially user-defined fields, and was
 thinking of creating something similar to an Array type and writing
 some GIST indexes for it.

 My current workaround is to store them as a YAML document and use
 tsearch to index it (with application logic to further refine the
 results) - but a EAV datatype that could be put in tables and
 effectively indexed would be of quite a bit of interest here.
 And yes, a better say to do UDFs would be even cooler.

I think that's what hstore does. Honestly I'm not entirely certain though.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] My honours project - databases using dynamically attached entity-properties

2007-03-15 Thread Csaba Nagy
On Thu, 2007-03-15 at 17:01, A.M. wrote:
 It seems to me that postgresql is especially well-suited to run DDL  
 at runtime, so what's the issue?

The issue is that some applications are not well suited to run DDL at
runtime :-)

As I already mentioned in another post in this thread, our application
also has a requirement for user-defined fields in one table. The problem is
that this table is so heavily accessed during peak hours that it is simply
impossible to take an exclusive lock on it without causing an extended
perceived downtime of the application. And guess what, users will always
want to add new fields during peak hours...

We did solve this in our case with some application logic, but a generic
solution would be nice ;-)

Cheers,
Csaba.





Re: [HACKERS] My honours project - databases using dynamically attached entity-properties

2007-03-15 Thread Andrew Dunstan

Gregory Stark wrote:

Ron Mayer [EMAIL PROTECTED] writes:

  

I have a system with many essentially user-defined fields, and was
thinking of creating something similar to an Array type and writing
some GIST indexes for it.

My current workaround is to store them as a YAML document and use
tsearch to index it (with application logic to further refine the
results) - but a EAV datatype that could be put in tables and
effectively indexed would be of quite a bit of interest here.
And yes, a better say to do UDFs would be even cooler.



I think that's what hstore does. Honestly I'm not entirely certain though.

  


Does hstore nest? My impression is that it doesn't. Which might well not 
matter, of course.


cheers

andrew



[HACKERS] Proposal: minor SPI interface change

2007-03-15 Thread Tom Lane
While fooling with getting SPI to use the plancache, I got annoyed again
about how SPI_prepare() and related functions refer to SPI plan pointers
as void *.  I'm all for having the structure be opaque to callers, but
to me void * means any old pointer, which this surely is not.  I'd
like to change things so that spi.h does

typedef struct _SPI_plan *SPIPlanPtr;

and then SPI_prepare is declared to return SPIPlanPtr not void *, and
likewise for the other SPI functions dealing with plans.

AFAICS this does not break code that refers to plan pointers as void *
because C compilers will allow implicit casts between void * and
SPIPlanPtr.  However, for code that we feel like updating, the result
is more readable and less error-prone.

There are people out there who want their code to compile against
multiple PG versions.  To use the improved notation and still compile
against pre-8.3 headers, they could do

#if CATALOG_VERSION_NO < whatever
typedef void *SPIPlanPtr;
#endif

Comments, objections?

regards, tom lane



Re: [HACKERS] [RFC] CLUSTER VERBOSE

2007-03-15 Thread Dawid Kuroczko

On 3/15/07, Grzegorz Jaskiewicz [EMAIL PROTECTED] wrote:

I figure - I should start brand new thread for this one - so here you
go.


I am in a need for verbose CLUSTER. Ie. one that would give me
feedback and progress.
Because CLUSTER is divided into two major operations, (data
reordering, index rebuild) - I see it this way:

CLUSTER on I: index name T: table name, data reordering
CLUSTER on I: index name T: table name, index rebuild

and then:
CLUSTER 10%
CLUSTER 12% , etc


Well, I'm afraid that would be inconsistent with other VERBOSE
commands (VACUUM VERBOSE), which don't give a progress
indication other than reporting that a specific stage has finished.

I think if you want to add VERBOSE to cluster, it should behave
exactly like all other 'VERBOSE' commands.

And as for progress indication, there has been proposals for more
or less similar feature, like:
http://archives.postgresql.org/pgsql-hackers/2006-07/msg00719.php

As I recall the ideas which caught most traction were
indicating current progress via shared memory (pg_stat_activity)
and a GUC variable which instructs the server to send notices
indicating the progress status. The latter is harder.

I'm afraid creating such a feature 'just for CLUSTER' is not the greatest
idea -- there are lots of other places where having a progress bar would
be a great benefit.  REINDEX, most ALTER TABLEs, CREATE INDEX, even
long-running SELECTs, UPDATEs and DELETEs, not to mention VACUUM,
would equally benefit from it.  I think you will have a hard time trying
to push a CLUSTER-specific extension when there is a need for a more
generic one.
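
For reference, this is roughly all that can be observed today (columns as in the 8.2-era pg_stat_activity view; any actual progress column would be a new, hypothetical addition):

```sql
-- Today one can only see that a CLUSTER is running, not how far along it is:
SELECT procpid, current_query
FROM pg_stat_activity
WHERE current_query LIKE 'CLUSTER%';
```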

  Regards,
 Dawid



[HACKERS] pltcl vs. multilib machines

2007-03-15 Thread Tom Lane
So I just updated my x86_64 machine to Fedora Core 6, and I find that
rebuilding our CVS HEAD gives a batch of noise that was not there
before:

/usr/bin/ld: skipping incompatible /usr/lib/libtcl8.4.so when searching for 
-ltcl8.4
/usr/bin/ld: skipping incompatible /usr/lib/libdl.so when searching for -ldl
/usr/bin/ld: skipping incompatible /usr/lib/libdl.a when searching for -ldl
/usr/bin/ld: skipping incompatible /usr/lib/libpthread.so when searching for 
-lpthread
/usr/bin/ld: skipping incompatible /usr/lib/libpthread.a when searching for 
-lpthread
/usr/bin/ld: skipping incompatible /usr/lib/libieee.a when searching for -lieee
/usr/bin/ld: skipping incompatible /usr/lib/libm.so when searching for -lm
/usr/bin/ld: skipping incompatible /usr/lib/libm.a when searching for -lm
/usr/bin/ld: skipping incompatible /usr/lib/libc.so when searching for -lc
/usr/bin/ld: skipping incompatible /usr/lib/libc.a when searching for -lc
/usr/bin/ld: skipping incompatible /usr/lib/libc.so when searching for -lc
/usr/bin/ld: skipping incompatible /usr/lib/libc.a when searching for -lc

It turns out that this is because the link command for pltcl includes
-L/usr/lib, so that gets searched before /usr/lib64.  And the reason the
command includes that is that that's what it says in TCL_LIB_SPEC in
/usr/lib/tclConfig.sh.  There is also a /usr/lib64/tclConfig.sh which
says the right things, but we aren't finding that because this is how we
determine where to look for tclConfig.sh:

$ echo 'puts $auto_path' | tclsh
/usr/share/tcl8.4 /usr/share /usr/lib /usr/lib64
$

Perhaps I should lobby the Red Hat guys to change the order of that
result, but really this is more our problem than theirs: whichever
way tclsh reports it, it will be wrong for trying to build Postgres
with the other word width on a multilib machine.

The ideal thing would be to try to verify that the found tclConfig.sh
is compatible with the compiler switches we want to use, but I can't
think of any reasonably robust way to do that (ie, something that's
likely to work with non-gcc compilers...).  Any ideas?

regards, tom lane



Re: [HACKERS] UPDATE using sub selects

2007-03-15 Thread Tom Lane
Gaetano Mendola [EMAIL PROTECTED] writes:
 NikhilS wrote:
 I have tried some update-subselect variations and they seem to work. For
 example the case in the src/test/regress/sql/update.sql, which used to
 fail till now, seems to work:
 
 UPDATE update_test SET (a,b) = (select a,b FROM update_test where c =
 'foo')
 WHERE a = 10;

 What's the expected result if the tuple from subselect is more than 1?

Error, per SQL99 section 7.14:

 1) If the cardinality of a row subquery is greater than 1 (one),
then an exception condition is raised: cardinality violation.

 I expect no update at all in case of void result set, is this the case ?

No, you get nulls; it's a subquery not a join.  Per SQL99 7.1:

c) If the row value constructor is a row subquery, then:

  i) Let R be the result of the row subquery and let D be the
 degree of R.

 ii) If the cardinality of R is 0 (zero), then the result of the
 row value constructor is D null values.

iii) If the cardinality of R is 1 (one), then the result of the
 row value constructor is R.
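
A hedged sketch of the two cases, using the regression-test table mentioned upthread (the 'no-such' value is invented to force an empty subquery result):

```sql
-- Subquery returns zero rows: a and b are set to NULL (SQL99 7.1 c-ii).
UPDATE update_test SET (a, b) =
    (SELECT a, b FROM update_test WHERE c = 'no-such')
WHERE a = 10;

-- Subquery returns more than one row: cardinality violation error (7.14).
UPDATE update_test SET (a, b) =
    (SELECT a, b FROM update_test)
WHERE a = 10;
```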

regards, tom lane



Re: [HACKERS] tsearch_core for inclusion

2007-03-15 Thread Robert Treat
On Thursday 15 March 2007 12:17, Teodor Sigaev wrote:
 Last try there was a fight about syntax of introduced commands. And we
 (Oleg and me) developed variant of patch with another syntax. We will not
 change docs until agreement will be reached, current version
 http://mira.sai.msu.su/~megera/pgsql/ftsdoc/

 Following demonstrates subset of FTS syntax using example from
 http://mira.sai.msu.su/~megera/pgsql/ftsdoc/fts-complete-tut.html.


This is nice. 

snip
 Comparing that syntaxes with current tsearch2 is placed at
 http://mira.sai.msu.su/~megera/pgsql/ftsdoc/fts-syntax-compare.html

 So, which is syntax more attractive? 

Honestly I don't find any of these syntaxes to be head and shoulders above the
others, but I would probably lean toward the original syntax, since it has some
level of familiarity among the existing user base.

 And is there some another objections?

Most people I talk to about tsearch who want the syntax changed to make
it easier want something akin to just CREATE INDEX fti1 on t1(c1) USING
FULLTEXT and then to be done with it.  This patch isn't going to give people
that.

I'm also concerned about the stability of the tsearch api in general wrt 
including it in core.  Currently the recommended upgrade practice is to 
dump/reload without tsearch, installing the new servers version of tsearch 
instead.  IMHO this is not an acceptable solution for core-included features. 
So is this actually going to be improved in a core tsearch?  system tables 
are not dumped by default, so that seems easier, until you consider that your 
custom tsearch install will then be lost on upgrade... oops! 

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL



Re: [HACKERS] tsearch_core for inclusion

2007-03-15 Thread Joshua D. Drake

 And is there some another objections?
 
 Most people whom I talk to about tsearch who want the syntax changed to make 
 it easier want something akin to just CREATE INDEX fti1 on t1(c1) USING 
 FULLTEXT and then be done with it.  This patch isn't going to give people 
 that. 

+1 (again) Although I would see something like this:

CREATE INDEX fti1 on t1(c1) USING FULLTEXT [WITH] GIST | GIN

Joshua D. Drake


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [HACKERS] tsearch_core for inclusion

2007-03-15 Thread Tom Lane
Joshua D. Drake [EMAIL PROTECTED] writes:
 +1 (again) Although I would see something like this:
 CREATE INDEX fti1 on t1(c1) USING FULLTEXT [WITH] GIST | GIN

Surely the CREATE INDEX syntax has got enough warts on it already.
Can't we express this as a particular operator class, or something
that doesn't add any new syntax?

regards, tom lane



Re: [HACKERS] tsearch_core for inclusion

2007-03-15 Thread Joshua D. Drake
Tom Lane wrote:
 Joshua D. Drake [EMAIL PROTECTED] writes:
 +1 (again) Although I would see something like this:
 CREATE INDEX fti1 on t1(c1) USING FULLTEXT [WITH] GIST | GIN
 
 Surely the CREATE INDEX syntax has got enough warts on it already.
 Can't we express this as a particular operator class, or something
 that doesn't add any new syntax?

Do you mean something like:

CREATE INDEX fti1 ON t1 USING GIST|GIN(c1) WITH (FULLTEXT)

The above I got from the idea of FILLFACTOR so you could have something
like WITH (FILLFACTOR = 70, FULLTEXT) (not even sure if that is relevant)

OR

CREATE INDEX ftil ON t1 USING GIST|GIN(C1 FULLTEXT);

Where FULLTEXT is like VARCHAR OPS?

I could live with that.

Sincerely,

Joshua D. Drake




 
 


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [HACKERS] tsearch_core for inclusion

2007-03-15 Thread Tom Lane
Joshua D. Drake [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Surely the CREATE INDEX syntax has got enough warts on it already.

 Do you mean something like:
 CREATE INDEX ftil ON t1 USING GIST|GIN(C1 FULLTEXT);
 Where FULLTEXT is like VARCHAR OPS?

Yeah, that one.  It might be more consistent to spell it as fulltext_ops
but I wouldn't insist on it.

Of course the issue not addressed here is where you specify all the
secondary configuration data (the stuff currently handled by config
tables in the contrib implementation).  Perhaps the WITH clause would
work for that, though in the current code WITH is targeted at the index
AM not individual opclasses.

regards, tom lane



Re: [HACKERS] tsearch_core for inclusion

2007-03-15 Thread Joshua D. Drake
Tom Lane wrote:
 Joshua D. Drake [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Surely the CREATE INDEX syntax has got enough warts on it already.
 
 Do you mean something like:
 CREATE INDEX ftil ON t1 USING GIST|GIN(C1 FULLTEXT);
 Where FULLTEXT is like VARCHAR OPS?
 
 Yeah, that one.  It might be more consistent to spell it as fulltext_ops
 but I wouldn't insist on it.

*shrug* fulltext_ops is probably more accurate but FULLTEXT is more
friendly :). I find you normally can't have both; my vote would probably
be for consistency.

 
 Of course the issue not addressed here is where you specify all the
 secondary configuration data (the stuff currently handled by config
 tables in the contrib implementation).  Perhaps the WITH clause would
 work for that, though in the current code WITH is targeted at the index
 AM not individual opclasses.

Not sure what to say here. WITH seems logical and I don't think we want
to add yet another keyword but I certainly see your point.

Sincerely,

Joshua D. Drake



 
 


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [HACKERS] tsearch_core for inclusion

2007-03-15 Thread Oleg Bartunov

On Thu, 15 Mar 2007, Robert Treat wrote:


On Thursday 15 March 2007 12:17, Teodor Sigaev wrote:

Last try there was a fight about syntax of introduced commands. And we
(Oleg and me) developed variant of patch with another syntax. We will not
change docs until agreement will be reached, current version
http://mira.sai.msu.su/~megera/pgsql/ftsdoc/

Following demonstrates subset of FTS syntax using example from
http://mira.sai.msu.su/~megera/pgsql/ftsdoc/fts-complete-tut.html.



This is nice.

snip

Comparing that syntaxes with current tsearch2 is placed at
http://mira.sai.msu.su/~megera/pgsql/ftsdoc/fts-syntax-compare.html

So, which is syntax more attractive?


Honestly I don't find any of these syntax's to be head and shoulders above the
others, but would probably lean toward the original syntax, since it has some
level of familiarity among the existing user base.


And are there any other objections?


Most people whom I talk to about tsearch who want the syntax changed to make
it easier want something akin to just CREATE INDEX fti1 on t1(c1) USING
FULLTEXT and then be done with it.  This patch isn't going to give people
that.


Since we use the standard postgresql-ish CREATE INDEX command, I assume
people want to skip creation of the tsvector column? But how would they manage
complex document indexing when a document is a combination (with different
weights) of many text attributes from several tables, for example? There are
several other issues with that approach; for example, we need to store
positional information somewhere for ranking. It's awkward to parse the
document every time to get this information.
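To make the weighting point concrete, here is a sketch in contrib/tsearch2
style (the table and column names are invented, and the 'default'
configuration and rank() call follow the 8.2-era module, so treat this as an
illustration rather than the proposed syntax):

```sql
-- Invented schema: combine several text attributes into one
-- weighted tsvector (tsearch2-style calls).
UPDATE articles SET fts =
    setweight(to_tsvector('default', coalesce(title,    '')), 'A') ||
    setweight(to_tsvector('default', coalesce(abstract, '')), 'B') ||
    setweight(to_tsvector('default', coalesce(body,     '')), 'D');

CREATE INDEX articles_fts_idx ON articles USING gist (fts);

-- Ranking relies on the positional/weight information stored above:
SELECT title
FROM articles
WHERE fts @@ to_tsquery('default', 'postgresql')
ORDER BY rank(fts, to_tsquery('default', 'postgresql')) DESC;
```

This is exactly the information that would be lost if the document had to be
re-parsed on every query instead of being stored in a tsvector.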


I'm also concerned about the stability of the tsearch API in general wrt
including it in core.  Currently the recommended upgrade practice is to
dump/reload without tsearch, installing the new server's version of tsearch
instead.  IMHO this is not an acceptable solution for core-included features.
So is this actually going to be improved in a core tsearch?  System tables
are not dumped by default, so that seems easier, until you consider that your
custom tsearch install will then be lost on upgrade... oops!


This is the exact reason why we want to include tsearch in core; it has been
discussed several times.

Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [HACKERS] tsearch_core for inclusion

2007-03-15 Thread Oleg Bartunov

On Thu, 15 Mar 2007, Joshua D. Drake wrote:




And are there any other objections?


Most people whom I talk to about tsearch who want the syntax changed to make
it easier want something akin to just CREATE INDEX fti1 on t1(c1) USING
FULLTEXT and then be done with it.  This patch isn't going to give people
that.


+1 (again) Although I would see something like this:

CREATE INDEX fti1 on t1(c1) USING FULLTEXT [WITH] GIST | GIN


Yet another syntax addition :)
What's wrong with the standard CREATE INDEX?


Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [HACKERS] SoC ECPG Enhancements

2007-03-15 Thread Josh Berkus
Peter,

 The Google Summer of Code web page lists as possible project

 ECPG Enhancements: Enable ECPG to generate calls directly for libpq
 rather than calls to its own libraries.

 What would be the point of that?

More importantly, Michael hasn't volunteered to mentor this year, so I think 
we should probably take the idea down.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco



Re: [HACKERS] My honours project - databases using dynamically attached entity-properties

2007-03-15 Thread tomas
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Thu, Mar 15, 2007 at 12:49:10PM -0400, Andrew Dunstan wrote:
 Gregory Stark wrote:
 Ron Mayer [EMAIL PROTECTED] writes:
 
   
 I have a system with many essentially user-defined fields, and was
 thinking of creating something similar to an Array type and writing
 some GIST indexes for it.

[...]

 I think that's what hstore does. Honestly I'm not entirely certain though.

 Does hstore nest? My impression is that it doesn't. Which might well not 
 matter, of course.

If what you mean is to have mappings of mappings then no.

Hstore implements a data type for a (finite) mapping (a set of key/value
pairs; think hash, for the Perl folks), with operations like "H1 contains
H2" (in the sense that all key/value pairs in H2 are also in H1)
supported by an index. Keys and values are strings.
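As a concrete sketch of the containment operation (the table and data are
invented; the containment operator is spelled @> in 8.2-era hstore, @ in
older releases):

```sql
-- Invented example of the hstore containment ("H1 contains H2") test.
CREATE TABLE docs (id serial PRIMARY KEY, attrs hstore);
INSERT INTO docs (attrs) VALUES ('color=>red, size=>large'::hstore);

-- GiST index supporting containment queries:
CREATE INDEX docs_attrs_idx ON docs USING gist (attrs);

-- All key/value pairs on the right side must appear in attrs:
SELECT id FROM docs WHERE attrs @> 'color=>red'::hstore;
```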

But I don't see why it couldn't be extended to more structured data
types (one of the good things about hstore at the moment is its surprising
simplicity, since it provides a good example of what GiST is good for,
so I'd understand if the authors didn't want to add that much
hair without need).

In my view it's one of the cutest applications of GiST.

Regards
- -- tomás
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFF+i5OBcgs9XrR2kYRAprYAJ46ecAnSNWnh+rizGdhWLutU7BphACdHweH
k3rdsxmUzuxthwldXvJ3UKw=
=+sT5
-END PGP SIGNATURE-




Re: [HACKERS] tsearch_core for inclusion

2007-03-15 Thread Magnus Hagander
  Most people whom I talk to about tsearch who want the syntax changed to make
  it easier want something akin to just CREATE INDEX fti1 on t1(c1) USING
  FULLTEXT and then be done with it.  This patch isn't going to give people
  that.
 
 Since we use standard postgresql-ish CREATE INDEX command, I assume 
 people want to skip creation of tsvector column ?

That would be great.

 How they could manage
 complex document indexing, when document is a combination (with different
 weights) of many text attributes from several tables, for example ?

Just to give you some more work, could I have both, please.

Seriously, if the current powerful functionality could be combined with a dead
simple solution for new users and for those who don't need the full power, that
would be very good.

This could be an auto-generated hidden column or something, as long as the user
doesn't need to see or care about it in the simple case.
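For reference, the pattern such a hidden column would automate is what
contrib/tsearch2 users set up by hand today (the table/column names are
invented and the trigger-function call follows the 8.2-era tsearch2 module,
so take this as a sketch):

```sql
-- Invented example: an explicit tsvector column kept current by a
-- trigger, i.e. what a hidden auto-generated column would automate.
ALTER TABLE messages ADD COLUMN fts tsvector;
UPDATE messages SET fts = to_tsvector('default', coalesce(body, ''));
CREATE INDEX messages_fts_idx ON messages USING gist (fts);

-- contrib/tsearch2's trigger keeps fts in sync on INSERT/UPDATE:
CREATE TRIGGER messages_fts_update
    BEFORE INSERT OR UPDATE ON messages
    FOR EACH ROW EXECUTE PROCEDURE tsearch2(fts, body);
```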

/Magnus

