Re: [HACKERS] getting oid of function

2005-02-14 Thread Michael Fuhr
On Mon, Feb 14, 2005 at 12:47:44PM +0500, Sibtay Abbas wrote:
 
 Is it possible to get the oid of a function on the basis of its name?

One way is to cast the function name to regproc (or, with arguments,
to regprocedure) and then to oid:

SELECT 'atan'::regproc::oid;
SELECT 'length(text)'::regprocedure::oid;

See "Object Identifier Types" in the documentation for more info:

http://www.postgresql.org/docs/8.0/static/datatype-oid.html

 The scenario which I am currently facing is that I have the function name; now
 I want to search the pg_proc system catalog on the basis of the function
 name and retrieve its Oid.

SELECT oid FROM pg_proc WHERE proname = 'funcname';

A function can have multiple records in pg_proc if it can take
different types and/or numbers of arguments, so you might have to
allow for that.

 Another confusion which I am facing is that I am not sure whether the Oid
 of a function is entered in the pg_proc system catalog or not, because I
 am not able to identify any relevant field.

oid is a system column; tools that describe tables usually don't
show system columns.  You can query pg_attribute to see all of a
table's columns.

http://www.postgresql.org/docs/8.0/static/ddl-system-columns.html
http://www.postgresql.org/docs/8.0/static/catalog-pg-attribute.html

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] Schema name of function

2005-02-14 Thread John Hansen
Just got reminded...

Is there a way for a C function to determine the name of the schema in which it
was created?

... John





Re: [HACKERS] Design notes for BufMgrLock rewrite

2005-02-14 Thread Sailesh Krishnamurthy
 "Tom" == Tom Lane [EMAIL PROTECTED] writes:

Tom and changing the buf_table hash table.  The only common
Tom operation that needs exclusive lock is reading in a page that
Tom was not in shared buffers already, which will require at
Tom least a kernel call and usually a wait for I/O, so it will be
Tom slow anyway.

Why not a separate lock per bucket chain in the hash table, in addition
to the system-wide LWLock? It's not so much that such an operation will be
slow anyway, but that such a slow operation will unnecessarily block
other operations.

-- 
Pip-pip
Sailesh
http://www.cs.berkeley.edu/~sailesh





Re: [HACKERS] Dealing with network-dead clients

2005-02-14 Thread Richard Huxton
Oliver Jowett wrote:
I'm currently trying to find a clean way to deal with network-dead 
clients that are in a transaction and holding locks etc.

The normal "client closes socket" case works fine. The scenario I'm
worried about is when the client machine falls off the network entirely 
for some reason (ethernet problem, kernel panic, machine catches 
fire..). From what I can see, if the connection is idle at that point, 
the server won't notice this until TCP-level SO_KEEPALIVE kicks in, 
which by default takes over 2 hours on an idle connection. I'm looking 
for something more like a 30-60 second turnaround if the client is 
holding locks.

3) implement an idle timeout on the server so that open transactions 
that are idle for longer than some period are automatically aborted.

(3) seems like a proper solution. I've searched the archives a bit and 
transaction timeouts have been suggested before, but there seems to be 
some resistance to them.
Have you come across the pgpool connection-pooling project? 
http://pgpool.projects.postgresql.org/

Might be easier to put a timeout+disconnect in there.
--
  Richard Huxton
  Archonet Ltd


Re: [HACKERS] Schema name of function

2005-02-14 Thread Michael Fuhr
On Mon, Feb 14, 2005 at 07:32:15PM +1100, John Hansen wrote:
 
 Is there a way for a C function to determine the name of the
 schema in which it was created?

Dunno if there's anything as simple as whats_my_schema(), but
fcinfo->flinfo->fn_oid should contain the function's oid.  If nobody
mentions an easier way, you could use SPI to query pg_proc and
pg_namespace.

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/



Re: [HACKERS] Dealing with network-dead clients

2005-02-14 Thread Oliver Jowett
Richard Huxton wrote:
Oliver Jowett wrote:
I'm currently trying to find a clean way to deal with network-dead 
clients that are in a transaction and holding locks etc.

Have you come across the pgpool connection-pooling project? 
http://pgpool.projects.postgresql.org/
I've looked at it, haven't used it.
Might be easier to put a timeout+disconnect in there.
It seems like I have the same design issues even if the code lives in 
pgpool. Also, I'm reluctant to introduce another bit of software into 
the system just for the sake of timeouts; we have no other need for 
pgpool functionality.

-O


[HACKERS] Help me recovering data

2005-02-14 Thread Kouber Saparev
Hi folks,

I ran into big trouble - it seems that my DB is lost.

select * from pg_database gives me 0 rows, but I still can connect to
databases with \c and even select from tables there, although they're also
not visible with \dt.

After asking the guys in the [EMAIL PROTECTED] channel they told
me that the reason is the Transaction ID wraparound, because I have never
run VACUUM on the whole database.

So they proposed to ask here for help. I have stopped the server, but what
could I do in order to save the data if it's possible at all?

You could also take a look at my pg_controldata output:
http://rafb.net/paste/results/ghcIb880.html

Regards,
Kouber Saparev




Re: [HACKERS] getting oid of function

2005-02-14 Thread Sibtay Abbas
Thank you for the detailed reply.
But what I wanted to know is how we can actually get a function's
oid from its name from within the PostgreSQL code itself.

Actually I've introduced a new statement in plpgsql that supports calling
a plpgsql function from within a plpgsql function, for example:

CALL function_name(params);

Here I only have the function name; I need to fetch the Oid of the
function so that I can construct FmgrInfo and FunctionCallInfoData
structures, and then compile (if required) and execute that function.

There is a function like SearchSysCache(PROCOID, ...)
that retrieves the pg_proc tuple on the basis of the function Oid
passed to it. Can I use this function to return a pg_proc tuple on
the basis of a function name?
Or is there an alternate function available that can achieve this?

Thank you

regards
Sibtay

In case I am taking the wrong approach, I would be thankful if anyone
could correct me.

On Mon, 14 Feb 2005 01:14:07 -0700, Michael Fuhr [EMAIL PROTECTED] wrote:
 On Mon, Feb 14, 2005 at 12:47:44PM +0500, Sibtay Abbas wrote:
 
  Is it possible to get the oid of a function on the basis of its name?.
 
 One way is to cast the function name to regproc (or, with arguments,
 to regprocedure) and then to oid:
 
 SELECT 'atan'::regproc::oid;
 SELECT 'length(text)'::regprocedure::oid;
 
 See Object Identifier Types in the documentation for more info:
 
 http://www.postgresql.org/docs/8.0/static/datatype-oid.html
 
  The scenario which i am currently facing is that i have the function name, 
  now
  i want search the pg_proc system catalog on the basis of the function
  name and retrieve its Oid.
 
 SELECT oid FROM pg_proc WHERE proname = 'funcname';
 
 A function can have multiple records in pg_proc if it can take
 different types and/or numbers of arguments, so you might have to
 allow for that.
 
  Another confusion which i am facing is that, I am not sure whether Oid
  of a function is entered in pg_proc system catalog or not. Because i
  am not able to identify any relevant field.
 
 oid is a system column; tools that describe tables usually don't
 show system columns.  You can query pg_attribute to see all of a
 table's columns.
 
 http://www.postgresql.org/docs/8.0/static/ddl-system-columns.html
 http://www.postgresql.org/docs/8.0/static/catalog-pg-attribute.html
 
 --
 Michael Fuhr
 http://www.fuhr.org/~mfuhr/




Re: [HACKERS] Query optimizer 8.0.1 (and 8.0)

2005-02-14 Thread pgsql
 Probably off-topic, but I think it's worth seeing what astronomers are
 doing with their very big spatial databases. For example, we are working
 with a catalog of more than 500,000,000 rows, and we use a special
 transformation of coordinates to integer numbers that preserves object
 closeness.
 I hope we can show postgresql is good enough to be used in astronomy
 for very big catalogs. Currently, MS SQL is in use.
 See http://www.sdss.jhu.edu/htm/ for details. We use another technique.

You know, I don't think a lot of people get the issues I was describing,
or maybe they don't believe it, I don't know. But I think it would
be a useful contrib project to create an 'analyze_special('table',
'column', 'method')' function that does a better job of calculating the
stats for tables that contain multiple trend waveforms. A separate function
will probably work well, as the trends within the data probably only apply
to specific rows.

It's interesting, because I don't think it needs to calculate a perfect
representation of the data so much as a better clue to its nature for the
optimizer.

When I get the time (or can get someone to pay me to do it) I'm going to
try it.



   Oleg
 On Wed, 9 Feb 2005 [EMAIL PROTECTED] wrote:

 I wrote a message called "One Big trend vs multiple smaller trends in
 table statistics" that, I think, explains what we've been seeing.


 [EMAIL PROTECTED] wrote:

 In this case, the behavior observed could be changed by altering the
 sample size for a table. I submit that an arbitrary fixed sample size is
 not a good base for the analyzer, but that the sample size should be
 based on the size of the table or some calculation of its deviation.


Mark,

 Do you have any evidence that the Sample Size had anything to do
 with the performance problem you're seeing?

 Sample size is only a bandaid for the issue; however, more samples
 always provide more information.



 I also do a lot with the complete Census/TIGER database.

 Every problem I have with the optimizer comes down to the
 fact that the data is loaded (and ordered on disk) by
 State/County FIPS codes, and then queried by zip-code
 or by city name.  Like this:

  Alabama 36101 [hundreds of pages with zip's in 36***]
  Alaska  99686 [hundreds of pages with zip's in 9]
  Arizona 85701 [hundreds of pages with zip's in 855**]

 Note that the zip codes are *NOT* sequential.

 Again, read "One Big Trend..." and let me know what you think. I think
 it describes exactly the problem that we see.

 For now, the solution that works for me is to seriously up the value of
 targrows in analyze.c. It makes it take longer, and while the stats are
 not correct because they are not designed to detect these sorts of
 patterns, a larger sample allows them to be "less wrong" enough to give a
 better hint to the planner.





   Regards,
   Oleg
 _
 Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
 Sternberg Astronomical Institute, Moscow University (Russia)
 Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
 phone: +007(095)939-16-83, +007(095)939-23-83






Re: [HACKERS] Schema name of function

2005-02-14 Thread Tom Lane
Michael Fuhr [EMAIL PROTECTED] writes:
 On Mon, Feb 14, 2005 at 07:32:15PM +1100, John Hansen wrote:
 Is there a way for a C function to determine the name of the
 schema in which it was created?

 Dunno if there's anything as simple as whats_my_schema(), but
 fcinfo->flinfo->fn_oid should contain the function's oid.  If nobody
 mentions an easier way, you could use SPI to query pg_proc and
 pg_namespace.

In C, it'd be a lot easier (and faster) to do a couple of SearchSysCache
calls than to use SPI to get those rows.

regards, tom lane



Re: [HACKERS] [GENERAL] WARNING: could not remove database directory

2005-02-14 Thread Bruce Momjian

8.0.X will have proper filename reporting for rmtree() failures.

---

Tom Lane wrote:
 Bruce Momjian pgman@candle.pha.pa.us writes:
  Tom Lane wrote:
  [ looks at code... ]  dbcommands.c is expecting that rmtree() will have
  printed out a more-detailed message about the problem, but someone has
  carefully removed every trace of error reporting from rmtree().
 
  I think the issue is that we didn't want different object files for
  client and server output messages, and returning error codes and
  having every calling location print strings was unmaintainable.
 
 But we already bit that bullet.  Look at the other routines in dirmod.c:
 
 #ifndef FRONTEND
 ereport(ERROR,
 (errcode_for_file_access(),
  errmsg("Error setting junction for %s: %s",
 nativeTarget, msg)));
 #else
 fprintf(stderr, "Error setting junction for %s: %s\n",
 nativeTarget, msg);
 #endif
 
 It's certainly not realistic to pass back enough information from
 rmtree() to let the caller print a useful error message, so I think
 we have to add reporting code along this line to rmtree().
 
   regards, tom lane
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] Help me recovering data

2005-02-14 Thread Tom Lane
Kouber Saparev [EMAIL PROTECTED] writes:
 After asking the guys in the [EMAIL PROTECTED] channel they told
 me that the reason is the Transaction ID wraparound, because I have never
 ran VACUUM on the whole database.

 So they proposed to ask here for help. I have stopped the server, but what
 could I do in order to save the data if it's possible at all?

I think you're pretty well screwed as far as getting it *all* back goes,
but you could use pg_resetxlog to back up the NextXID counter enough to
make your tables and databases reappear (and thereby lose the effects of
however many recent transactions you back up over).

Once you've found a NextXID setting you like, I'd suggest an immediate
pg_dumpall/initdb/reload to make sure you have a consistent set of data.
Don't VACUUM, or indeed modify the DB at all, until you have gotten a
satisfactory dump.

Then put in a cron job to do periodic vacuuming ;-)
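Such a cron entry could look something like this (the path and schedule are illustrative, not from the thread; vacuumdb ships with PostgreSQL):

```
# database-wide VACUUM ANALYZE every night at 03:00
0 3 * * *  /usr/local/pgsql/bin/vacuumdb --all --analyze
```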

regards, tom lane



Re: [HACKERS] Help me recovering data

2005-02-14 Thread Kouber Saparev
 Once you've found a NextXID setting you like, I'd suggest an immediate
 pg_dumpall/initdb/reload to make sure you have a consistent set of data.
 Don't VACUUM, or indeed modify the DB at all, until you have gotten a
 satisfactory dump.

 Then put in a cron job to do periodic vacuuming ;-)

Thank you, I just discovered in the mailing lists that I'm neither the first
nor the last guy to have forgotten to VACUUM the database. ;-)

Regards,
Kouber Saparev




Re: [HACKERS] Help me recovering data

2005-02-14 Thread Christopher Kings-Lynne
I think you're pretty well screwed as far as getting it *all* back goes,
but you could use pg_resetxlog to back up the NextXID counter enough to
make your tables and databases reappear (and thereby lose the effects of
however many recent transactions you back up over).
Once you've found a NextXID setting you like, I'd suggest an immediate
pg_dumpall/initdb/reload to make sure you have a consistent set of data.
Don't VACUUM, or indeed modify the DB at all, until you have gotten a
satisfactory dump.
Then put in a cron job to do periodic vacuuming ;-)
This might seem like a stupid question, but since this is a massive data 
loss potential in PostgreSQL, what's so hard about having the 
checkpointer or something check the transaction counter when it runs and 
 either issue a db-wide vacuum if it's about to wrap, or simply 
disallow any new transactions?

I think people'd rather their db just stopped accepting new transactions 
rather than just losing data...

Chris


Re: [HACKERS] Query optimizer 8.0.1 (and 8.0)

2005-02-14 Thread Ron Mayer
[EMAIL PROTECTED] wrote:
You know, I don't think a lot of people get the issues I was describing,
or maybe they don't believe it, I don't know, but, I think that it would
be a useful contrib project to create an 'analyze_special('table',
'column', 'method')' function that does a better job at calculating the
stats for table that contain multiple trend waveforms. A separate function
will probably work well as the trends within the data probably only apply
to specific rows.
I've done something similar, but simpler, for the Census/TIGER data.
If you loaded each TIGER file sequentially, like I did, the data
was all grouped by county when it was loaded - so basically all
the geographical columns (zip, county, state, census-tract) are
actually grouped tightly on disk -- though ANALYZE can't see this
because they're not strictly ascending or descending.
Since I observed that the geospatial columns were all
clustered pretty well, I merely set the correlation
value to the same fairly large value for all the
geometric rows with a bunch of statements like this:
  update pg_statistic
     set stanumbers3[1] = 0.8
   where starelid = 31412043
     and staattnum = 3;
Instead of a complicated analyze function, how about just
letting the user tell the optimizer that a column is
clustered well with a function like:
 force_correlation_stat('schema', 'table', 'column', 'value')
That would work well for your data.  Since you
know your distinct values lie on a relatively small
number of pages, if you merely did:
 force_correlation('tiger','rt1','zipl',0.8);
 force_correlation('tiger','rt1','statel',0.8);
 force_correlation('tiger','rt1','countyl',0.8);
the optimizer would then see that not many disk
pages would need to be hit for a single zip code.

It's interesting, because I don't think it needs to calculate a perfect
representation of the data so much as a better clue to its nature for the
optimizer.
Indeed.  Using the very arbitrary number 0.8 for the
correlation, for all the geographic-related columns in the
tiger data, the optimizer guessed a good plan almost every
time on my company's 200GB geographical database.
When I get the time (or can get someone to pay me to do it) I'm going to
try it.
I still suspect that the correct way to do it would not be
to use the single correlation, but 2 stats - one for estimating
how sequential/random accesses would be; and one for estimating
the number of pages that would be hit.  I think the existing
correlation does well for the first estimate; but for many data
sets, poorly for the second type.
If you want to start a contrib project that looks into additional
stats that may help, I might be interested.
  Ron


Re: [HACKERS] Help me recovering data

2005-02-14 Thread pgsql
It must be possible to create a tool based on the PostgreSQL sources that
can read all the tuples in a database and dump them to a file stream. All
the data remains in the file until overwritten after a vacuum.
It *should* be doable.

If the data in the table is worth anything, then it would be worth
extracting.

It would, of course, be a tool of last resort.



 Kouber Saparev [EMAIL PROTECTED] writes:
 After asking the guys in the [EMAIL PROTECTED] channel they told
 me that the reason is the Transaction ID wraparound, because I have never
 ran VACUUM on the whole database.

 So they proposed to ask here for help. I have stopped the server, but what
 could I do in order to save the data if it's possible at all?

 I think you're pretty well screwed as far as getting it *all* back goes,
 but you could use pg_resetxlog to back up the NextXID counter enough to
 make your tables and databases reappear (and thereby lose the effects of
 however many recent transactions you back up over).

 Once you've found a NextXID setting you like, I'd suggest an immediate
 pg_dumpall/initdb/reload to make sure you have a consistent set of data.
 Don't VACUUM, or indeed modify the DB at all, until you have gotten a
 satisfactory dump.

 Then put in a cron job to do periodic vacuuming ;-)

   regards, tom lane






Re: [HACKERS] Help me recovering data

2005-02-14 Thread pgsql
 I think you're pretty well screwed as far as getting it *all* back goes,
 but you could use pg_resetxlog to back up the NextXID counter enough to
 make your tables and databases reappear (and thereby lose the effects of
 however many recent transactions you back up over).

 Once you've found a NextXID setting you like, I'd suggest an immediate
 pg_dumpall/initdb/reload to make sure you have a consistent set of data.
 Don't VACUUM, or indeed modify the DB at all, until you have gotten a
 satisfactory dump.

 Then put in a cron job to do periodic vacuuming ;-)

 This might seem like a stupid question, but since this is a massive data
 loss potential in PostgreSQL, what's so hard about having the
 checkpointer or something check the transaction counter when it runs and
   either issue a db-wide vacuum if it's about to wrap, or simply
 disallow any new transactions?

 I think people'd rather their db just stopped accepting new transactions
 rather than just losing data...


I would certainly prefer the system to issue an error and stop working
rather than suffer complete data loss.




Re: [HACKERS] Help me recovering data

2005-02-14 Thread Doug McNaught
Christopher Kings-Lynne [EMAIL PROTECTED] writes:

 This might seem like a stupid question, but since this is a massive
 data loss potential in PostgreSQL, what's so hard about having the
 checkpointer or something check the transaction counter when it runs
 and either issue a db-wide vacuum if it's about to wrap, or simply
 disallow any new transactions?

I think autovac-in-backend is the preferred solution to this, and it's
definitely on the TODO list...

-Doug



Re: [HACKERS] Schema name of function

2005-02-14 Thread Michael Fuhr
On Mon, Feb 14, 2005 at 11:11:53AM -0500, Tom Lane wrote:
 
 In C, it'd be a lot easier (and faster) to do a couple of SearchSysCache
 calls than to use SPI to get those rows.

The following appears to work -- does it look right, aside from the
missing error checking?

  tuple = SearchSysCache(PROCOID,
  ObjectIdGetDatum(fcinfo->flinfo->fn_oid),
 0, 0, 0);
  nsoid = SysCacheGetAttr(PROCOID, tuple,
  Anum_pg_proc_pronamespace, isnull);
  schemaname = get_namespace_name(nsoid);
  ReleaseSysCache(tuple);

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/



Re: [HACKERS] getting oid of function

2005-02-14 Thread Michael Fuhr
On Mon, Feb 14, 2005 at 05:02:19PM +0500, Sibtay Abbas wrote:
 
 Actually I've introduced a new statement in plpgsql that supports calling
 a plpgsql function from within a plpgsql function, for example:
 
 CALL function_name(params);

How would this differ from PERFORM?

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/



Re: [HACKERS] Schema name of function

2005-02-14 Thread Tom Lane
Michael Fuhr [EMAIL PROTECTED] writes:
 On Mon, Feb 14, 2005 at 11:11:53AM -0500, Tom Lane wrote:
 In C, it'd be a lot easier (and faster) to do a couple of SearchSysCache
 calls than to use SPI to get those rows.

 The following appears to work -- does it look right, aside from the
 missing error checking?

   tuple = SearchSysCache(PROCOID,
   ObjectIdGetDatum(fcinfo->flinfo->fn_oid),
  0, 0, 0);
   nsoid = SysCacheGetAttr(PROCOID, tuple,
   Anum_pg_proc_pronamespace, isnull);
   schemaname = get_namespace_name(nsoid);
   ReleaseSysCache(tuple);

That would work.  Since pronamespace is one of the fixed non-nullable
columns of pg_proc, you don't really need to use SysCacheGetAttr: you
can just map the C struct onto the tuple and grab the field directly.

nsoid = ((Form_pg_proc) GETSTRUCT(tuple))->pronamespace;

utils/cache/lsyscache.c contains lots of examples of this sort of thing.
(get_namespace_name is one, in fact.)

regards, tom lane



Re: [HACKERS] Help me recovering data

2005-02-14 Thread Tom Lane
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
 This might seem like a stupid question, but since this is a massive data 
 loss potential in PostgreSQL, what's so hard about having the 
 checkpointer or something check the transaction counter when it runs and 
   either issue a db-wide vacuum if it's about to wrap, or simply 
 disallow any new transactions?

The checkpointer is entirely incapable of either detecting the problem
(it doesn't have enough infrastructure to examine pg_database in a
reasonable way) or preventing backends from doing anything if it did
know there was a problem.

 I think people'd rather their db just stopped accepting new transactions 
 rather than just losing data...

Not being able to issue new transactions *is* data loss --- how are you
going to get the system out of that state?

autovacuum is the correct long-term solution to this, not some kind of
automatic hara-kiri.

regards, tom lane



Re: [HACKERS] enforcing a plan (in brief)

2005-02-14 Thread pgsql
 On Thu, 2005-02-10 at 14:37 -0500, Bruce Momjian wrote:
 No, we feel that is of limited value.  If the optimizer isn't doing
 things properly, we will fix it.

 I agree that improving the optimizer is the right answer for normal
 usage, so I can't get excited about query-level plan hints, but I can
 see the capability to instruct the planner being useful in an academic
 context.


I think that is sort of arrogant. Look at Oracle: you can give the planner
hints in the form of comments.

The idea of constructing a planner that will always do the best job is
like creating a program that can predict the weather. There are too many
subtle variations in datasets that are impossible to really evaluate. I
posted a message last week called "One Big trend vs multiple smaller
trends", and you'll see what I mean.

Yea, on a simple data organization, you could make a great planner, but
someone who has studied the nature of their data can almost always toss
their hands up in frustration because the planner isn't working right.

I have had multiple issues with the inability to guide the planner in its
decisions. I'll give a couple of examples:

A music database where the artist name is "Various Artists": given any
normal database of music recordings, "Various Artists" will be *THE* most
popular artist, usually close to almost half the data. Most of the time
I've had to turn off sequential scans for these queries. (I filter out
various artists.) Being able to say:

select * from cdtitles where artist = 'foo' /* index scan
cdtitles_artist_ndx */ ;

would be helpful, as disabling sequential scan isn't always the right thing
either.

The whole "Query optimizer 8.0.1 (and 8.0)" series of posts shows a
different problem.

It all comes down to the fact that the planner *can not* be perfect, and
thus will always be lacking in some respect. This is because you can not
anticipate every physical data storage pattern; therefore, the analyzer
will not correctly characterize them, and the planner will not create an
optimal plan.

Allowing the user to suggest alternate query strategies is a good idea.







Re: [HACKERS] Schema name of function

2005-02-14 Thread John Hansen
Beautiful, just what I was looking for.

Thnx,

John

 -Original Message-
 From: Tom Lane [mailto:[EMAIL PROTECTED]]
 Sent: Tuesday, February 15, 2005 6:31 AM
 To: Michael Fuhr
 Cc: John Hansen; pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] Schema name of function 
 
 Michael Fuhr [EMAIL PROTECTED] writes:
  On Mon, Feb 14, 2005 at 11:11:53AM -0500, Tom Lane wrote:
  In C, it'd be a lot easier (and faster) to do a couple of 
  SearchSysCache calls than to use SPI to get those rows.
 
  The following appears to work -- does it look right, aside from the 
  missing error checking?
 
tuple = SearchSysCache(PROCOID,
    ObjectIdGetDatum(fcinfo->flinfo->fn_oid),
   0, 0, 0);
nsoid = SysCacheGetAttr(PROCOID, tuple,
Anum_pg_proc_pronamespace, isnull);
schemaname = get_namespace_name(nsoid);
ReleaseSysCache(tuple);
 
 That would work.  Since pronamespace is one of the fixed 
 non-nullable columns of pg_proc, you don't really need to use 
 SysCacheGetAttr: you can just map the C struct onto the tuple 
 and grab the field directly.
 
   nsoid = ((Form_pg_proc) GETSTRUCT(tuple))->pronamespace;
 
 utils/cache/lsyscache.c contains lots of examples of this 
 sort of thing.
 (get_namespace_name is one, in fact.)
 
   regards, tom lane
 
 



[HACKERS] 8.0.X and the ARC patent

2005-02-14 Thread Bruce Momjian
FYI, core has discussed the pending IBM ARC patent and the usage of
those ideas in 8.0.

Tom has found a 2Q cache algorithm that predates the ARC patent and is
very similar to ARC.  The major difference is that it doesn't auto-size
the ARC sub-buffers.  

Core believes it is best to backpatch this 2Q algorithm into 8.0.X to
avoid any possible patent problems if the patent is granted and
enforced.

We are testing the use of the 2Q code to see if it has any performance
impact.  The 8.0.X release that uses 2Q will have more extensive testing
than a normal minor release.  8.1 will have a new cache algorithm that
hopefully will remove the buffer contention problems experienced by SMP
machines.

For development, this means we will _not_ have a shortened, non-initdb
8.1 release but a regular release cycle with the typical big batch of
features.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] WAL: O_DIRECT and multipage-writer

2005-02-14 Thread Bruce Momjian

This thread has been saved for the 8.1 release:

http://momjian.postgresql.org/cgi-bin/pgpatches2

---

ITAGAKI Takahiro wrote:
 Hello, all.
 
 I think that there is room for improvement in WAL. 
 Here is a patch for it.
   - Multiple pages are written in one write() if they are contiguous.
   - Add 'open_direct' to wal_sync_method.
 
 The WAL writer writes one page per write(). This is not efficient
 when wal_sync_method is 'open_sync', because the writer waits for
 IO completion at each write(). A multipage writer can reduce syscalls
 and improve IO throughput.

 'open_direct' uses O_DIRECT instead of O_SYNC. O_DIRECT implies synchronous
 writing, so it may behave much like open_sync, but it can reduce memcpy()
 overhead and save the OS's disk cache memory.
 
 I benchmarked this patch with pgbench. It works well and
 improved tps by 50% on my machine. WAL seems to be the bottleneck
 on machines with slow disks.
 
 This patch has not yet been tested enough. I would like it to be examined
 closely and taken into PostgreSQL.
 
 There are still many TODOs:
   - Is this logic really correct?
   - O_DIRECT_BUFFER_ALIGN should be adjusted at runtime, not compile time.
   - Consider using writev() instead of write().
 Buffers are noncontiguous when the WAL ring buffer rotates.
   - If wal_sync_method is not open_direct, XLOG_EXTRA_BUFFERS can be 0.
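The writev() idea in the TODO list — gathering the two noncontiguous segments of a wrapped ring buffer into one syscall instead of two write() calls — can be sketched roughly like this (the function name and the temp-file path used below are illustrative, not from the patch):

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Flush 'len' bytes of a wrapped ring buffer starting at offset 'head'
 * with a single writev() call.  When the region wraps past the end of
 * the buffer, it is described as two iovec segments.
 */
ssize_t flush_ring(int fd, const char *ring, size_t cap,
                   size_t head, size_t len)
{
    struct iovec iov[2];
    int iovcnt = 1;

    iov[0].iov_base = (void *) (ring + head);
    iov[0].iov_len = len;
    if (head + len > cap) {            /* region wraps: two segments */
        iov[0].iov_len = cap - head;
        iov[1].iov_base = (void *) ring;
        iov[1].iov_len = len - iov[0].iov_len;
        iovcnt = 2;
    }
    return writev(fd, iov, iovcnt);
}
```

With O_SYNC (or O_DIRECT) on the descriptor, the two segments then cost one synchronous IO completion wait rather than two, which is exactly the saving the patch is after.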
 
 
 Sincerely,
 ITAGAKI Takahiro
 
 
 
 -- pgbench result --
 
 $ ./pgbench -s 100 -c 50 -t 400
 
 - 8.0.0 default + fsync:
 tps = 20.630632 (including connections establishing)
 tps = 20.636768 (excluding connections establishing)
 - multipage-writer + open_direct:
 tps = 33.761917 (including connections establishing)
 tps = 33.778320 (excluding connections establishing)
 
 Environment:
   OS : Linux kernel 2.6.9
   CPU: Pentium 4 3GHz
   disk   : ATA 5400rpm (Data and WAL are placed on same partition.)
   memory : 1GB
   config : shared_buffers=1, wal_buffers=256,
XLOG_SEG_SIZE=256MB, checkpoint_segment=4
 
 ---
 ITAGAKI Takahiro [EMAIL PROTECTED]
 NTT Cyber Space Laboratories
 Nippon Telegraph and Telephone Corporation.

[ Attachment, skipping... ]

 
 ---(end of broadcast)---
 TIP 5: Have you checked our extensive FAQ?
 
http://www.postgresql.org/docs/faq

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] 8.0.X and the ARC patent

2005-02-14 Thread pgsql
Might it be possible to contact IBM directly and ask if they will allow
usage of the patent for PostgreSQL? They've released 500 patents to open
source; maybe they'll grant a waiver for this as well.

There is an advantage beyond just not having to re-write the code: it
would also be sort of an IBM blessing, and great PR.

I will be at Linux World and will see if there is an IBM booth; maybe I
can get some contact info.


 FYI, core has discussed the pending IBM ARC patent and the usage of
 those ideas in 8.0.

 Tom has found a 2Q cache algorithm that predates the ARC patent and is
 very similar to ARC.  The major difference is that it doesn't auto-size
 the ARC sub-buffers.

 Core believes it is best to backpatch this 2Q algorithm into 8.0.X to
 avoid any possible patent problems if the patent is granted and
 enforced.

 We are testing the use of the 2Q code to see if it has any performance
 impact.  The 8.0.X release that uses 2Q will have more extensive testing
 than a normal minor release.  8.1 will have a new cache algorithm that
 hopefully will remove the buffer contention problems experienced by SMP
 machines.

 For development, this means we will _not_ have a shortened, non-initdb
 8.1 release but a regular release cycle with the typical big batch of
 features.

 --
   Bruce Momjian|  http://candle.pha.pa.us
   pgman@candle.pha.pa.us   |  (610) 359-1001
   +  If your life is a hard drive, |  13 Roberts Road
   +  Christ can be your backup.|  Newtown Square, Pennsylvania
 19073

 ---(end of broadcast)---
 TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] 8.0.X and the ARC patent

2005-02-14 Thread Bruce Momjian
[EMAIL PROTECTED] wrote:
 Might it be possible to contact IBM directly and ask if they will allow
 usage of the patent for PostgreSQL. They've let 500 patents for open
 source, maybe they'll give a write off for this as well.
 
 There is an advantage beyond just not having to re-write the code, but it
 would also be sort of an IBM blessing, great PR.
 
 I will be at Linux World and see if there is an IBM booth, maybe I can
 get  some contact info.

I doubt they will give us something that extends to companies that sell
PostgreSQL so I don't see the point.

---


 
 
  FYI, core has discussed the pending IBM ARC patent and the usage of
  those ideas in 8.0.
 
  Tom has found a 2Q cache algorithm that predates the ARC patent and is
  very similar to ARC.  The major difference is that it doesn't auto-size
  the ARC sub-buffers.
 
  Core believes it is best to backpatch this 2Q algorithm into 8.0.X to
  avoid any possible patent problems if the patent is granted and
  enforced.
 
  We are testing the use of the 2Q code to see if it has any performance
  impact.  The 8.0.X release that uses 2Q will have more extensive testing
  than a normal minor release.  8.1 will have a new cache algorithm that
  hopefully will remove the buffer contention problems experienced by SMP
  machines.
 
  For development, this means we will _not_ have a shortened, non-initdb
  8.1 release but a regular release cycle with the typical big batch of
  features.
 
  --
Bruce Momjian|  http://candle.pha.pa.us
pgman@candle.pha.pa.us   |  (610) 359-1001
+  If your life is a hard drive, |  13 Roberts Road
+  Christ can be your backup.|  Newtown Square, Pennsylvania
  19073
 
  ---(end of broadcast)---
  TIP 5: Have you checked our extensive FAQ?
 
 http://www.postgresql.org/docs/faq
 
 
 
 ---(end of broadcast)---
 TIP 7: don't forget to increase your free space map settings
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] UTF8 or Unicode

2005-02-14 Thread Bruce Momjian
Should our multi-byte encoding be referred to as UTF8 or Unicode?
I know UTF8 is a type of unicode but do we need to rename anything from
Unicode to UTF8?

Someone asked me via private email.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] 8.0.X and the ARC patent

2005-02-14 Thread pgsql
 [EMAIL PROTECTED] wrote:
 Might it be possible to contact IBM directly and ask if they will allow
 usage of the patent for PostgreSQL. They've let 500 patents for open
 source, maybe they'll give a write off for this as well.

 There is an advantage beyond just not having to re-write the code, but
 it
 would also be sort of an IBM blessing, great PR.

 I will be at Linux World and see if there is an IBM booth, maybe I can
 get  some contact info.

 I doubt they will give us something that extends to companies that sell
 PostgreSQL so I don't see the point.

Actually, I think that's wrong. IBM has been really gung-ho about Linux.
Of course this is an obvious move against Microsoft's domination of the
server market, but they have made some very strong open source statements
and have released about 500 patents to open source projects.

The current open source patents extend to companies that sell other
products, so why not PostgreSQL as well?

There is a *LOT* of crap going on with patents; there are so many issues
and motives that it is hard to keep track of who is doing what and why. My
bet is that it is 50/50. It all depends on whether IBM wants to hurt Oracle
more than it wants to defend DB2.




---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] UTF8 or Unicode

2005-02-14 Thread Abhijit Menon-Sen
At 2005-02-14 21:14:54 -0500, pgman@candle.pha.pa.us wrote:

 Should our multi-byte encoding be referred to as UTF8 or Unicode?

The *encoding* should certainly be referred to as UTF-8. Unicode is a
character set, not an encoding; Unicode characters may be encoded with
UTF-8, among other things.

(One might think of a charset as being a set of integers representing
characters, and an encoding as specifying how those integers may be
converted to bytes.)
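That distinction (integers in, bytes out) is easy to make concrete: a minimal code-point-to-UTF-8 encoder is just a handful of bit shifts. The sketch below is a generic illustration, not PostgreSQL's conversion code:

```c
#include <assert.h>

/*
 * Encode one Unicode code point (an integer in the character set) as
 * UTF-8 bytes (the encoding).  Returns the number of bytes written,
 * or 0 for an invalid code point.
 */
int utf8_encode(unsigned cp, unsigned char *out)
{
    if (cp < 0x80) {                       /* 1 byte: ASCII */
        out[0] = cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 2 bytes */
        out[0] = 0xC0 | (cp >> 6);
        out[1] = 0x80 | (cp & 0x3F);
        return 2;
    }
    if (cp < 0x10000) {                    /* 3 bytes */
        if (cp >= 0xD800 && cp <= 0xDFFF)  /* surrogates aren't characters */
            return 0;
        out[0] = 0xE0 | (cp >> 12);
        out[1] = 0x80 | ((cp >> 6) & 0x3F);
        out[2] = 0x80 | (cp & 0x3F);
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* 4 bytes */
        out[0] = 0xF0 | (cp >> 18);
        out[1] = 0x80 | ((cp >> 12) & 0x3F);
        out[2] = 0x80 | ((cp >> 6) & 0x3F);
        out[3] = 0x80 | (cp & 0x3F);
        return 4;
    }
    return 0;
}
```

So "Unicode" names the set of code points while "UTF-8" names one particular byte serialization of them, which is why the encoding label ought to say UTF8.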

 I know UTF8 is a type of unicode but do we need to rename anything
 from Unicode to UTF8?

I don't know. I'll go through the documentation to see if I can find
anything that needs changing.

-- ams

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] 8.0.X and the ARC patent

2005-02-14 Thread Joshua D. Drake
Bruce Momjian wrote:
[EMAIL PROTECTED] wrote:
 

Might it be possible to contact IBM directly and ask if they will allow
usage of the patent for PostgreSQL. They've let 500 patents for open
source, maybe they'll give a write off for this as well.
There is an advantage beyond just not having to re-write the code, but it
would also be sort of an IBM blessing, great PR.
I will be at Linux World and see if there is an IBM booth, maybe I can
get  some contact info.
   

I doubt they will give us something that extends to companies that sell
PostgreSQL so I don't see the point.
 

Also, if I recall, didn't Tom already have a patch ready to be
tested for the 2Q stuff?
Sincerely,
Joshua D. Drake
---
 

   

FYI, core has discussed the pending IBM ARC patent and the usage of
those ideas in 8.0.
Tom has found a 2Q cache algorithm that predates the ARC patent and is
very similar to ARC.  The major difference is that it doesn't auto-size
the ARC sub-buffers.
Core believes it is best to backpatch this 2Q algorithm into 8.0.X to
avoid any possible patent problems if the patent is granted and
enforced.
We are testing the use of the 2Q code to see if it has any performance
impact.  The 8.0.X release that uses 2Q will have more extensive testing
than a normal minor release.  8.1 will have a new cache algorithm that
hopefully will remove the buffer contention problems experienced by SMP
machines.
For development, this means we will _not_ have a shortened, non-initdb
8.1 release but a regular release cycle with the typical big batch of
features.
--
 Bruce Momjian|  http://candle.pha.pa.us
 pgman@candle.pha.pa.us   |  (610) 359-1001
 +  If your life is a hard drive, |  13 Roberts Road
 +  Christ can be your backup.|  Newtown Square, Pennsylvania
19073
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faq
 

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
   

 


--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - [EMAIL PROTECTED] - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] 8.0.X and the ARC patent

2005-02-14 Thread Bruce Momjian
Joshua D. Drake wrote:
 Bruce Momjian wrote:
 
 [EMAIL PROTECTED] wrote:
   
 
 Might it be possible to contact IBM directly and ask if they will allow
 usage of the patent for PostgreSQL. They've let 500 patents for open
 source, maybe they'll give a write off for this as well.
 
 There is an advantage beyond just not having to re-write the code, but it
 would also be sort of an IBM blessing, great PR.
 
 I will be at Linux World and see if there is an IBM booth, maybe I can
 get  some contact info.
 
 
 
 I doubt they will give us something that extends to companies that sell
 PostgreSQL so I don't see the point.
   
 
 
 Also if I recall didn't Tom already have a patch ready to be
  tested for the 2Q stuff?

Yes, he does.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] UTF8 or Unicode

2005-02-14 Thread Bruce Momjian
Abhijit Menon-Sen wrote:
 At 2005-02-14 21:14:54 -0500, pgman@candle.pha.pa.us wrote:
 
  Should our multi-byte encoding be referred to as UTF8 or Unicode?
 
 The *encoding* should certainly be referred to as UTF-8. Unicode is a
 character set, not an encoding; Unicode characters may be encoded with
 UTF-8, among other things.
 
 (One might think of a charset as being a set of integers representing
 characters, and an encoding as specifying how those integers may be
 converted to bytes.)
 
  I know UTF8 is a type of unicode but do we need to rename anything
  from Unicode to UTF8?
 
 I don't know. I'll go through the documentation to see if I can find
 anything that needs changing.

I looked at encoding.sgml and that mentions Unicode, and then UTF8 as an
acronym. I am wondering if we need to make UTF8 first and Unicode
second.  Does initdb accept UTF8 as an encoding?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [pgsql-hackers] Allow GRANT/REVOKE permissions to be applied

2005-02-14 Thread Bruce Momjian

I have added this syntax to the TODO list:

* Allow GRANT/REVOKE permissions to be applied to all schema objects with one
  command

  The proposed syntax is:
GRANT SELECT ON ALL TABLES IN public TO phpuser;
GRANT SELECT ON NEW TABLES IN public TO phpuser;


---

Matthias Schmidt wrote:
 Hi *,
 
 I will start implementing this stuff based on this syntax:
 
 GRANT SELECT ON ALL TABLES IN public TO phpuser;
 GRANT SELECT ON NEW TABLES IN public TO phpuser;
 
 so there are two separate commands to use.
 
 is everybody fine with this approach?
 
 cheers,
 
 Matthias
 
 PS: Tom, shouldn't we mention in the documentation that one can grant
 on views and sequences as well?
 
 
 
 Am 29.01.2005 um 01:43 schrieb Tom Lane:
 
  Josh Berkus josh@agliodbs.com writes:
  GRANT SELECT ON ALL, NEW TABLES IN public TO phpuser;
  ... does both.
 
  Ah, I overlooked that part of your message.  I think the above probably
  doesn't work in bison, but if not we could spell it like
 
GRANT SELECT ON ALL AND NEW TABLES IN public TO phpuser;
 
  Or just make the user enter two commands for this case.  Aside from
  syntactic simplicity, that might be a good idea anyway.  The NEW TABLES
  case is *fundamentally* different from every other form of GRANT, in
  that it causes future actions.  So it might be a wise idea from the
  standpoint of understandability to keep it as a separate command from
  the immediate-acting ALL TABLES.
 
  regards, tom lane
 
  ---(end of 
  broadcast)---
  TIP 8: explain analyze is your friend
 
 
 --
 Matthias Schmidt
 Viehtriftstr. 49
 
 67346 Speyer
 GERMANY
 
 Tel.: +49 6232 4867
 Fax.: +49 6232 640089
 
 
 ---(end of broadcast)---
 TIP 2: you can get off all lists at once with the unregister command
 (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Refactoring

2005-02-14 Thread Bruce Momjian
Manfred Koizar wrote:
 On Wed, 19 Jan 2005 18:57:48 +0100, I wrote:
   My first vacuum.c
 refactoring patch, rev 1.281 2004-06-08, added these comments in
 repair_frag():
 
 /*
  * VACUUM FULL has an exclusive lock on the relation.  So
  * normally no other transaction can have pending INSERTs or
  * DELETEs in this relation.  A tuple is either
  *   (a) a tuple in a system catalog, inserted or deleted by
  *   a not yet committed transaction or
  *   (b) dead (XMIN_INVALID or XMAX_COMMITTED) or
  *   (c) inserted by a committed xact (XMIN_COMMITTED) or
  *   (d) moved by the currently running VACUUM.
  * In case (a) we wouldn't be in repair_frag() at all.
  * In case (b) we cannot be here, because scan_heap() has
  * already marked the item as unused, see continue above.
  * Case (c) is what normally is to be expected.
  * Case (d) is only possible, if a whole tuple chain has been
  * moved while processing this or a higher numbered block.
  */
 
 It turns out that this comment is not quite correct.  It is incomplete.
 Case (b) should be: known dead (XMIN_INVALID, or XMAX_COMMITTED and xmax
 is visible to all active transactions).
 
 And there is a fifth possibility: (e) deleted (XMAX_COMMITTED) but at
 least one active transaction does not see the deleting transaction.
 
 The patch seems to imply that case (e) is a subcase of (b), but
 effectively tuples in this state are treated more like (c).

OK, comment updated to:

/* ---
 * VACUUM FULL has an exclusive lock on the relation.  So
 * normally no other transaction can have pending INSERTs or
 * DELETEs in this relation.  A tuple is either:
 *  (a) a tuple in a system catalog, inserted or deleted
 *  by a not yet committed transaction
 *  (b) known dead (XMIN_INVALID, or XMAX_COMMITTED and xmax
 *  is visible to all active transactions)
 *  (c) inserted by a committed xact (XMIN_COMMITTED)
 *  (d) moved by the currently running VACUUM.
 *  (e) deleted (XMAX_COMMITTED) but at least one active
 *  transaction does not see the deleting transaction
 * In case (a) we wouldn't be in repair_frag() at all.
 * In case (b) we cannot be here, because scan_heap() has
 * already marked the item as unused, see continue above. Case
 * (c) is what normally is to be expected. Case (d) is only
 * possible, if a whole tuple chain has been moved while
 * processing this or a higher numbered block.
 * ---
 */

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] getting oid of function

2005-02-14 Thread Neil Conway
On Mon, 2005-02-14 at 17:02 +0500, Sibtay Abbas wrote:
 Thank you for the detailed reply.
 But what I wanted to know is how we can actually get a function's
 oid from its name from within the PostgreSQL code itself.

You'll want to query the syscache. Note that due to function
overloading, there may be multiple functions with the same name, so
you'll need to figure out which one ought to be invoked by using the
number and types of the parameters. See FuncnameGetCandidates() in
namespace.c for an example.

-Neil



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] enforcing a plan (in brief)

2005-02-14 Thread Tom Lane
[EMAIL PROTECTED] writes:
 I think that is sort of arrogant. Look at Oracle, you can give the planner
 hints in the form of comments.

Arrogant or not, that's the general view of the people who work on the
planner.

The real issue is not so much whether the planner will always get things
right --- it won't, and no one imagines that it will ever be perfect.
The real issue is that we have limited manpower, and designing and
implementing a useful hint facility is a nontrivial project.  (Not to
mention that maintaining such a thing in the face of frequent,
fundamental changes to the underlying planner and executor capabilities
would be an outright nightmare.)

The people who are actually doing the work think their time is more
usefully spent on improving the planner's intelligence than on devising
ways to override it.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] enforcing a plan (in brief)

2005-02-14 Thread Bruce Momjian
Tom Lane wrote:
 [EMAIL PROTECTED] writes:
  I think that is sort of arrogant. Look at Oracle, you can give the planner
  hints in the form of comments.
 
 Arrogant or not, that's the general view of the people who work on the
 planner.
 
 The real issue is not so much whether the planner will always get things
 right --- it won't, and no one imagines that it will ever be perfect.
 The real issue is that we have limited manpower, and designing and
 implementing a useful hint facility is a nontrivial project.  (Not to
 mention that maintaining such a thing in the face of frequent,
 fundamental changes to the underlying planner and executor capabilities
 would be an outright nightmare.)

And the user maintenance of updating those hints for every release of
PostgreSQL as we improve the database engine.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] 8.0.X and the ARC patent

2005-02-14 Thread Tom Lane
[EMAIL PROTECTED] writes:
 Might it be possible to contact IBM directly and ask if they will allow
 usage of the patent for PostgreSQL. They've let 500 patents for open
 source, maybe they'll give a write off for this as well.

If there were hard evidence that the ARC algorithm was far better than
the alternatives, it might be worth going in that direction.  But there
is no such evidence.  Jan has retracted his original opinion that the
ARC code is a big improvement over what we had before, and I haven't
seen anyone else putting up benchmark numbers that say we need to keep
ARC.

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] unicode upper/lower functions

2005-02-14 Thread Bruce Momjian

This has been saved for the 8.1 release:

http://momjian.postgresql.org/cgi-bin/pgpatches2

---

John Hansen wrote:
 Hi list,
 
 
 Attached for your perusal, unicode versions of upper/lower, that work
 independent of locale except for the following languages:
 
 Turkish, Azeri, and Lithuanian.
 There are 15 locale specific cases in total not covered.
 
 
 -- 
 John Hansen [EMAIL PROTECTED]
 GeekNET

[ Attachment, skipping... ]

 
 ---(end of broadcast)---
 TIP 7: don't forget to increase your free space map settings

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] 8.0.X and the ARC patent

2005-02-14 Thread Bruce Momjian
Tom Lane wrote:
 [EMAIL PROTECTED] writes:
  Might it be possible to contact IBM directly and ask if they will allow
  usage of the patent for PostgreSQL. They've let 500 patents for open
  source, maybe they'll give a write off for this as well.
 
 If there were hard evidence that the ARC algorithm was far better than
 the alternatives, it might be worth going in that direction.  But there
 is no such evidence.  Jan has retracted his original opinion that the
 ARC code is a big improvement over what we had before, and I haven't
 seen anyone else putting up benchmark numbers that say we need to keep
 ARC.

And ARC has locking requirements that will make it very hard to fix our
SMP buffer management problems in 8.1.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] [HACKERS] WAL: O_DIRECT and multipage-writer (+ memory

2005-02-14 Thread Bruce Momjian

This has been saved for the 8.1 release:

http://momjian.postgresql.org/cgi-bin/pgpatches2

---

ITAGAKI Takahiro wrote:
 Hello everyone.
 
 I fixed two bugs in the patch that I sent before.
 Check and test new one, please.
 
 1. Fix update timing of Write->curridx. (pointed out by Tom)
  Change to update it soon after write().
 
 2. Fix buffer alignment routine on 64bit cpu. (pointed by Mark)
  I checked it on Xeon EM64T and it worked properly, but I don't have 
 IA64...
 
 
 BTW, I found a memory leak in BootStrapXLOG(). The buffer allocated by malloc()
 is never free()d. ISSUE_BOOTSTRAP_MEMORYLEAK in this patch points it out.
 (But this leak is not serious, because this function is called only once.)
 
 
 ITAGAKI Takahiro

[ Attachment, skipping... ]

 
 ---(end of broadcast)---
 TIP 7: don't forget to increase your free space map settings

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] enforcing a plan (in brief)

2005-02-14 Thread Neil Conway
On Mon, 2005-02-14 at 22:56 -0500, Bruce Momjian wrote:
 And the user maintenance of updating those hints for every release of
 PostgreSQL as we improve the database engine.

... and maintaining those hints as the data changes over time.

But I think this thread has been hijacked toward a subject that has been
beaten to death in the past, and away from something that I think might
be worth exploring. IMHO, people deploying PostgreSQL for production use
are just one of the groups of users of this project. Another group are
those people using PostgreSQL in an academic environment. I think it
would be really cool to make it absolutely simple to use PostgreSQL as a
starting point for DBMS research. That would mean things like:

- good, thorough documentation of the internals (naturally this would
help attract OSS developers as well)

- APIs that allow people to drive the planner and executor
programmatically (as in the original question)

- plugin APIs that make it relatively easy to replace the implementation
of a subsystem whole-sale (if there's a cost to these APIs in terms of
complexity or performance, it is perhaps not worth doing)

(Of course, I'm partially guessing here -- but if those people who
actually _are_ using PostgreSQL in an academic context have some
additional ideas for how we can make your lives easier, I'd be curious
to hear them.)

-Neil



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] enforcing a plan (in brief)

2005-02-14 Thread pgsql
 [EMAIL PROTECTED] writes:
 I think that is sort of arrogant. Look at Oracle, you can give the
 planner
 hints in the form of comments.

 Arrogant or not, that's the general view of the people who work on the
 planner.

 The real issue is not so much whether the planner will always get things
 right --- it won't, and no one imagines that it will ever be perfect.
 The real issue is that we have limited manpower, and designing and
 implementing a useful hint facility is a nontrivial project.  (Not to
 mention that maintaining such a thing in the face of frequent,
 fundamental changes to the underlying planner and executor capabilities
 would be an outright nightmare.)

 The people who are actually doing the work think their time is more
 usefully spent on improving the planner's intelligence than on devising
 ways to override it.


I know I come on strong, and I know I'm probably irritating in many ways;
however, I have been a PostgreSQL user since just after it was renamed
PostgreSQL from Postgres95. I've seen a lot of changes, and almost all of
them have been quite good.

I have over 10 years of using it on various projects. While I have not
been lucky enough to get a gig in which I could contribute more, I do try
to contribute and sometimes it is quite difficult.

The one thing that I think you guys miss is actually using PostgreSQL in
some projects where the company and the deliverables don't give a rat's
ass about whether you use PostgreSQL or Oracle or something else. Over the
years I have beat my head against the walls suggesting features, most of
which eventually have come to PostgreSQL, but every one was a fight.

I have some well-founded opinions about PostgreSQL, hard earned from real
world situations. Agree or not, I have experience with this database and I
have hit many of its shortcomings.

One consistent problem is the planner not being able to handle this or
that scenario. At this stage, the *best* way to improve the planner is to
add the ability to place hints in the plan. It *is* good enough for 90% of
the types of queries you would ever want to do. I am dubious that you can
get it demonstrably better in the last 10% or so without making it worse.

Simple hints would go a HUGE way toward improving the last 10%. Many of the
"Why doesn't PostgreSQL use my index?" questions would go away. Most of the
time Tom spends looking at people's pg_stats info would drop. It would
actually save time.

As a PostgreSQL user, I can tell you with 100% confidence, if I had this
tool, I could do my job easier. I can also tell you that while I have
genuine appreciation for the current quality of the planner, I still would
like to be able to tailor queries specifically to test various approaches
for performance reasons.

---(end of broadcast)---
TIP 8: explain analyze is your friend


[HACKERS] I will be in Boston

2005-02-14 Thread Bruce Momjian
I will be in Boston for LinuxWorld from Tuesday through Thursday.  I
will read email only occasionally.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] 8.0.X and the ARC patent

2005-02-14 Thread Tom Lane
Bruce Momjian pgman@candle.pha.pa.us writes:
 And ARC has locking requirements that will make it very hard to fix our
 SMP buffer management problems in 8.1.

I am working on a buffer manager rewrite using the BufMgrLock breakup
and clock sweep management algorithm we've been discussing.  At the
moment it's passing the regression tests but I'm sure there's some bugs
left :-(.  I just now tried it on the infamous context-swap-storm test
case using a 4-way machine at Red Hat.  PG 8.0 shows 20K or more CS/sec
and under 30% CPU usage in this situation.  The patch shows 99% CPU
utilization and about 200 CS/sec (which is about nil, because the
machine shows ~100 CS/sec with nothing running except vmstat).

Still to be determined: what we lose in extra I/O from the presumably
less efficient cache management; also what sort of slowdown occurs on
a single-CPU machine that isn't going to get any benefit from the
increased amount of lock management.  But it looks promising.
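
For readers unfamiliar with the algorithm being discussed, here is a
minimal sketch of clock-sweep replacement. This is illustrative only, NOT
the actual PostgreSQL buffer manager code: the buffer count, field names,
and usage-count cap are invented for the example. Each buffer's usage
count is bumped when the buffer is accessed and decremented as the clock
hand sweeps past; a buffer whose count has reached zero becomes the
eviction victim.

```c
#define NBUFFERS 4
#define MAX_USAGE_COUNT 5

typedef struct
{
	int	page;			/* page number held, -1 if empty */
	int	usage_count;	/* bumped on access, decayed by the sweep */
} Buffer;

static Buffer buffers[NBUFFERS];
static int clock_hand = 0;

/* Record an access to the buffer in the given slot. */
static void
access_buffer(int slot)
{
	if (buffers[slot].usage_count < MAX_USAGE_COUNT)
		buffers[slot].usage_count++;
}

/*
 * Advance the hand until a buffer with usage_count == 0 is found,
 * decrementing counts along the way; return the victim slot.
 */
static int
clock_sweep(void)
{
	for (;;)
	{
		Buffer *buf = &buffers[clock_hand];
		int		slot = clock_hand;

		clock_hand = (clock_hand + 1) % NBUFFERS;
		if (buf->usage_count == 0)
			return slot;
		buf->usage_count--;
	}
}
```

The appeal for SMP is that victim selection needs no single hot LRU list:
frequently used buffers survive sweeps because their counts stay above
zero, giving an LRU approximation with much cheaper locking.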

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] 8.0.X and the ARC patent

2005-02-14 Thread Thomas Hallgren
[EMAIL PROTECTED] wrote:
[EMAIL PROTECTED] wrote:
Might it be possible to contact IBM directly and ask if they will allow
usage of the patent for PostgreSQL? They've released 500 patents for open
source; maybe they'll give a waiver for this as well.
There is an advantage beyond just not having to re-write the code: it
would also be sort of an IBM blessing, great PR.
I will be at Linux World and see if there is an IBM booth, maybe I can
get  some contact info.
I doubt they will give us something that extends to companies that sell
PostgreSQL so I don't see the point.

Actually, I think that's wrong. IBM has been really gung-ho about Linux.
Of course this is an obvious move against Microsoft's domination of the
server market, but they have made some very strong open source statements
and have released about 500 patents to open source projects.
The current open source patents extend to companies that sell other
products, so why not PostgreSQL as well?
There is a *LOT* of crap going on with patents; there are so many issues
and motives that it is hard to keep track of who is doing what and why. My
bet is that it is 50/50. It all depends on whether IBM wants to hurt
Oracle more than it wants to defend DB2.
Seeing up close what IBM is doing with Eclipse, I must concur with this.
I would be surprised if they said no. IBM knows that making revenue from
software licenses is becoming increasingly harder. My impression is that
they want to become less dependent on that and instead become more
services-oriented. This patent was filed before they started to change
direction big time. Personally, I don't think they would care about it
at all now.

So IMHO, ask IBM. It can't hurt.
Regards,
Thomas Hallgren
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] enforcing a plan (in brief)

2005-02-14 Thread Greg Stark

Neil Conway [EMAIL PROTECTED] writes:

 - good, thorough documentation of the internals (naturally this would
 help attract OSS developers as well)

I don't know what software you work with, but the Postgres source is far
and away the best-documented source I've had the pleasure to read. I think
it's challenging to jump into because it's a legitimately complex piece of
software, not because of any deficiency in the documentation.

 - plugin APIs that make it relatively easy to replace the implementation
 of a subsystem whole-sale (if there's a cost to these APIs in terms of
 complexity or performance, it is perhaps not worth doing)

And Postgres's extensibility features like plugin languages and indexing
methods are one of its strengths.

 - APIs that allow people to drive the planner and executor
 programmatically (as in the original question)

Actually, I think that would be a neat experiment. I've often wondered about
an environment where SQL is the source language and it's compiled statically
into data structures representing plans.

But you have to be careful, it would be easy to come up with nonsensical
plans, or even plans that would be infinite loops or cause crashes.
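
A rough approximation of that environment already exists at the session
level: in PostgreSQL 8.0, PREPARE plans a statement once and EXECUTE
reuses the stored plan data structure (the table name below is
hypothetical):

```sql
-- Plan once, at "compile time" for the session:
PREPARE get_order (int) AS
    SELECT * FROM orders WHERE customer_id = $1;

-- Reuse the stored plan with different parameters:
EXECUTE get_order(42);
EXECUTE get_order(7);
```

Because the plan is built from a parsed, validated statement, the
nonsensical-plan problem Greg raises doesn't arise here; it would only
appear if plan structures could be constructed directly.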

-- 
greg


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] enforcing a plan (in brief)

2005-02-14 Thread Mark Kirkwood
Although this is all true, consider that adding hints will mean that the
Pg developers *never* get the bug reports that drive the optimizer
improvement process. This will have the effect of stagnating its
development. I think this would be a bad thing :-)
As an aside, note that DB2 UDB does not let you hint its optimizer
either... I have heard it argued (by an IBM acquaintance of mine) that
their optimizer is better than that of the other database whose name
begins with O, precisely because of this (he is biased, of course, but it
is an interesting point).
regards
Mark

[EMAIL PROTECTED] wrote:
One consistent problem is the planner not being able to handle this or
that scenario. At this stage, the *best* way to improve the planner is to
add the ability to place hints in the plan. It *is* good enough for 90% of
the types of queries you would ever want to do. I am dubious that you can
get it demonstrably better in the last 10% or so without making it worse.
Simple hints would go a HUGE way toward improving the last 10%. Many of
the "Why doesn't PostgreSQL use my index?" questions would go away. Most
of the time Tom spends looking at people's pg_stats info would drop. It
would actually save time.
As a PostgreSQL user, I can tell you with 100% confidence, if I had this
tool, I could do my job easier. I can also tell you that while I have
genuine appreciation for the current quality of the planner, I still would
like to be able to tailor queries specifically to test various approaches
for performance reasons.

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly