Re: [HACKERS] Additional stats for Relations

2006-10-14 Thread NikhilS
Hi Jim,

On 10/13/06, Jim C. Nasby [EMAIL PROTECTED] wrote:
 On Fri, Oct 13, 2006 at 06:17:47PM +0530, NikhilS wrote:
  Currently a "select * from pg_statio_user_tables;" displays only the
  heap_blks_read and heap_blks_hit stats, amongst others, for the main
  relation. It would be good to have the following stats collected too;
  I think these stats can be used to better statistically
  analyze/understand the block I/O activity on the relation:
 
  heap_blks_reused: The number of buffers returned by the FSM for use to
  store a new tuple in
 
 The description on this is misleading... the FSM doesn't return buffers,
 it returns pages that have free space on them.

The FSM returns the block number from which we fetch the buffer. This is
similar to the way we track buffer_read stats in ReadBuffer.

 I'm also not sure if this metric is what you actually want, since a
 single page can be returned many times from the FSM even between
 vacuums. Tracking how many pages for a relation have been put into the
 FSM might be more useful...

Pages might be put into the FSM, but with this metric don't we get the
actual usage of the pages from the FSM? Agreed, a single page can be
returned multiple times, but since each return serves a new tuple,
shouldn't we track it?

  heap_blks_extend: The number of times file extend was invoked on the
  relation
  heap_blks_truncate: The total number of blocks that have been truncated
  due to vacuum activity
 
  As an addendum to the truncate stats above, we can also have the
  following additional stats:
  heap_blks_maxtruncate: The max number of blocks truncated in one go
  heap_blks_ntruncate: The number of times truncate was called on this
  relation
 
  I can come up with a patch (already have one) for the above. Any
  thoughts/comments?
 
 Do you have a use-case for this info? I can see where it might be neat
 to know, but I'm not sure how you'd actually use it in the real world.

The use-case, according to me, is that these stats help prove the
effectiveness of autovacuum/vacuum operations. By varying some autovacuum
GUC variables and doing subsequent runs (with pgbench, for example), one
can find the optimum values for those variables using these stats.
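For reference, the proposed counters could be kept in a per-relation struct along these lines. This is a standalone sketch: only the counter names come from the proposal above; the struct and function names are made up for illustration, and in PostgreSQL the counters would live in the stats collector's per-table entry.

```c
#include <assert.h>

/* Hypothetical per-relation I/O counters, using the names from the
 * proposal above.  Standalone illustration, not PostgreSQL code. */
typedef struct RelIOStats
{
    long heap_blks_reused;      /* blocks handed back by the FSM */
    long heap_blks_extend;      /* relation-extend calls */
    long heap_blks_truncate;    /* total blocks truncated by vacuum */
    long heap_blks_maxtruncate; /* largest single truncation */
    long heap_blks_ntruncate;   /* number of truncate calls */
} RelIOStats;

/* Record one vacuum truncation of nblocks blocks; this would be
 * invoked from vacuum's truncate path. */
static void
stats_count_truncate(RelIOStats *s, long nblocks)
{
    s->heap_blks_truncate += nblocks;
    s->heap_blks_ntruncate++;
    if (nblocks > s->heap_blks_maxtruncate)
        s->heap_blks_maxtruncate = nblocks;
}
```

Two truncations of 10 and 4 blocks would then yield heap_blks_truncate = 14, heap_blks_ntruncate = 2, heap_blks_maxtruncate = 10.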

Regards,
Nikhils
EnterpriseDB   http://www.enterprisedb.com
--
All the world's a stage, and most of us are desperately unrehearsed.


Re: [HACKERS] Interface of the R-tree in order to work with postgresql

2006-10-14 Thread Markus Schaber
Hi, Jorge,

jorge alberto wrote:

 I'm Jorge from Peru, South America, and this is my first post.
 
 I want to know how I can add a new spatial access method to PostgreSQL.
 I'm doing research on spatial access methods (reading a lot of papers
 and programming a lot too), but I also want to know how I can add my
 new data structure (if someday I get it, of course =) ) to PostgreSQL.
 I mean: where can I find the .h that describes the interface that a
 spatial access method, like the R-tree, must have in order to work
 with PostgreSQL?
 
 I think that to understand how PostgreSQL works with spatial access
 methods, like the R-tree, I need to find a .h or .c to begin reading
 the code, so can you tell me where to start?

Besides the hints others posted here, you might want to look at
contrib/tsearch2 and PostGIS for examples of how to use GiST.

HTH,
Markus

-- 
Markus Schaber | Logical Tracking & Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org





Re: [HACKERS] query optimization with UDFs

2006-10-14 Thread Markus Schaber
Hi, Tom,

Tom Lane wrote:
 Neil Conway [EMAIL PROTECTED] writes:
 BTW, I think it would make sense to implement a limited subset of the
 xfunc ideas: add options to CREATE FUNCTION to allow cost information to
 be specified, and then take advantage of this information instead of
 using the existing constant kludges. This would be a tangible
 improvement, and would have minimal impact on the planner.
 
 The trick is to figure out what a useful parameterized cost model would
 look like.  IIRC, the main reason the xfunc code rotted on the vine was
 that its cost parameters didn't seem to be either easy to select or
 powerful in predicting actual cost.  We'd have to do better this time.

I don't know what the xfunc people did, but at least for some varlen
data types (Arrays, PostGIS, text), some function costs (concatenation,
GeomUnion etc.) can be estimated via the average field size of the tables.

Has that idea been considered?
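A concrete version of the suggestion above, purely for illustration: size-parameterize the cost of a varlena function by the arguments' average widths (which the planner could take from the stored column statistics). All names and coefficients here are made up.

```c
#include <assert.h>

/* Illustrative size-parameterized cost estimate for a varlena function
 * such as text concatenation: a fixed per-call cost plus a per-byte
 * cost applied to the sum of the arguments' average field widths.
 * The coefficients are arbitrary, not real planner constants. */
static double
est_concat_cost(double avg_width1, double avg_width2)
{
    const double per_call = 1.0;   /* fixed invocation overhead */
    const double per_byte = 0.01;  /* cost per byte copied */

    return per_call + per_byte * (avg_width1 + avg_width2);
}
```

With average widths of 100 and 300 bytes this gives a cost of about 5.0; the point is only that the estimate scales with field size rather than being a constant.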

HTH,
Markus

-- 
Markus Schaber | Logical Tracking & Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org





Re: [HACKERS] query optimization with UDFs

2006-10-14 Thread Tom Lane
Markus Schaber [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 The trick is to figure out what a useful parameterized cost model would
 look like.  IIRC, the main reason the xfunc code rotted on the vine was
 that its cost parameters didn't seem to be either easy to select or
 powerful in predicting actual cost.  We'd have to do better this time.

 I don't know what the xfunc people did, but at least for some varlen
 data types (Arrays, PostGIS, text), some function costs (concatenation,
 GeomUnion etc.) can be estimated via the average field size of the tables

 Has that idea been considered?

[ shrug... ]  Concatenation is definitely not one of the functions we
need to worry about.  In fact, I'd say that only functions containing
database accesses are really worth worrying about --- and the hard
part of modeling them is the possible variation in the number of rows
touched depending on their parameters.
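The difficulty Tom describes can be made concrete with a minimal two-parameter model (names and coefficients are illustrative, not a proposed API): even once a function has per-call and per-row cost parameters, the total cost still depends on an estimate of rows touched, and that estimate varies with the function's arguments, which is the hard part.

```c
#include <assert.h>

/* Toy parameterized function-cost model: a fixed cost per invocation
 * plus a per-row cost scaled by the estimated number of rows the
 * function touches internally.  The struct and its fields are
 * illustrative only. */
typedef struct FuncCost
{
    double per_call; /* setup cost per invocation */
    double per_row;  /* cost per row touched inside the function */
} FuncCost;

static double
func_cost(const FuncCost *c, double est_rows)
{
    return c->per_call + c->per_row * est_rows;
}
```

For per_call = 10 and per_row = 0.5, touching 0 rows costs 10 and touching 100 rows costs 60 — so a single constant cost is wrong whenever est_rows depends on the call's parameters.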

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] New shared memory hooks proposal (was Re: pre_load_libraries)

2006-10-14 Thread Tom Lane
Marc Munro [EMAIL PROTECTED] writes:
 The attached patch provides add-ins with the means to register for
 shared memory and LWLocks.

I finally got around to reviewing this patch, and realized that it's got
a pretty fundamental design flaw: it isn't useful under Windows (or any
other EXEC_BACKEND platform), because there isn't any provision for
backends to locate structs allocated by other backends by means of
searching in shared memory.  AFAICS the code can only do something
useful in a platform where allocations made in the postmaster process
can be found by backends via fork inheritance of pointers.

The right way to handle shmem allocations is to use ShmemInitStruct
to either allocate a shared struct for the first time or attach to a
previously made instance of the struct.  (This struct could be a
memory allocation arena itself, but that's not the core code's problem.)
Now we could extend the patch so that each addin has its own
ShmemIndex within its private workspace, but I think that's probably
overkill.  My inclination is to rip out ShmemAllocFromContext and expect
addins to use ShmemInitStruct the same as everyone else.  The hook
callable at shared_preload_libraries time should just serve to add
the necessary amount to the computed size of the shared memory segment.

RegisterAddinLWLock is broken in the same way: it could only be used
safely if the registered lock ID were remembered in shared memory,
but since shared memory doesn't exist at the time it's supposed to be
called, there's no way to do that.  Again, it'd seem to work as long as
the lock ID value were inherited via fork, but that's gonna fail on
EXEC_BACKEND builds.  I think we should probably take this out in favor
of something that just increments a counter that replaces
NUM_USER_DEFINED_LWLOCKS, and expect people to use LWLockAssign() at an
appropriate time while initializing their shared memory areas.

It strikes me that there's a race condition here, which we've not seen
in previous use because backends expect all standard shared memory
structs to have already been initialized by the postmaster.  An add-on
will instead have to operate under the regime of "first backend wanting
to use the struct must init it".  Although ShmemInitStruct returns a
"found" bool telling you you've got to init it, there's no interlock
ensuring that you can do so before someone else comes along and tries to
use the struct (which he'll assume is done because he sees found = true).
And, per above discussion, an add-on can't solve this for itself using
an add-on LWLock, because it really has to acquire its add-on locks
while initializing that same shmem struct, which is where it's going to
keep the locks' identity :-(

So I think we need to provide a standard LWLock reserved for the purpose
of synchronizing first-time use of a shmem struct.  The coding rules for
an add-on would then look like:

* in the shared_preload_libraries hook:

RequestAddinShmemSpace(size);
RequestAddinLWLocks(n);

* in a backend, to access a shared memory struct:

static mystruct *ptr = NULL;

if (!ptr)
{
boolfound;

LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
ptr = ShmemInitStruct("my struct name", size, &found);
if (!ptr)
elog(ERROR, "out of shared memory");
if (!found)
{
/* initialize contents of shmem area */
/* acquire any requested LWLocks using: */
ptr->mylockid = LWLockAssign();
}
LWLockRelease(AddinShmemInitLock);
}
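The first-caller-initializes protocol above can be exercised in a single-process mock-up. Everything here is simulated: sim_shmem_init_struct() stands in for ShmemInitStruct's contract (first caller gets found = false and must initialize; later callers get found = true), and the AddinShmemInitLock acquire/release appears only as comments since there is no concurrency in the sketch.

```c
#include <assert.h>

/* Mock of the shmem struct an add-in would keep. */
typedef struct MyStruct
{
    int lockid;       /* stands in for an assigned LWLock id */
    int initialized;  /* set once by the first attacher */
} MyStruct;

static MyStruct shmem_area;      /* stands in for the shmem segment */
static int      shmem_allocated; /* stands in for the index lookup */

/* Simulated ShmemInitStruct: returns the (one) area and reports
 * whether it already existed. */
static MyStruct *
sim_shmem_init_struct(int *found)
{
    *found = shmem_allocated;
    shmem_allocated = 1;
    return &shmem_area;
}

/* The coding pattern from the message above, minus real locking. */
static MyStruct *
attach_my_struct(void)
{
    int       found;
    MyStruct *ptr;

    /* LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE); */
    ptr = sim_shmem_init_struct(&found);
    if (!found)
    {
        ptr->lockid = 42; /* stands in for LWLockAssign() */
        ptr->initialized = 1;
    }
    /* LWLockRelease(AddinShmemInitLock); */
    return ptr;
}
```

Two calls to attach_my_struct() return the same pointer, and only the first call runs the initialization branch — which is exactly the window the proposed AddinShmemInitLock has to protect.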


Thoughts?

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] Problems building 8.2beta1 on macos G5 xserve

2006-10-14 Thread Sean Davis
Trying to build 8.2beta1 on MacOS G5 Xserver, OS version 10.4.7.  I got 
this:




gcc -no-cpp-precomp -O2 -Wall -Wmissing-prototypes -Wpointer-arith 
-Winline -Wendif-labels -fno-strict-aliasing   -dynamiclib -install_name 
/usr/local/pgsql/lib/libpq.5.dylib -compatibility_version 5 
-current_version 5.0 -exported_symbols_list exports.list 
-multiply_defined suppress  fe-auth.o fe-connect.o fe-exec.o fe-misc.o 
fe-print.o fe-lobj.o fe-protocol2.o fe-protocol3.o pqexpbuffer.o 
pqsignal.o fe-secure.o md5.o ip.o wchar.o encnames.o noblock.o 
pgstrcasecmp.o thread.o   -L../../../src/port -L/sw/lib   -o libpq.5.0.dylib
/usr/bin/libtool: for architecture: cputype (16777234) cpusubtype (0) 
file: -lSystem is not an object file (not allowed in a library)

make[3]: *** [libpq.5.0.dylib] Error 1
make[2]: *** [all] Error 2
make[1]: *** [all] Error 2
make: *** [all] Error 2

% uname -a
Darwin Sherlock 8.7.0 Darwin Kernel Version 8.7.0: Fri May 26 15:20:53 
PDT 2006; root:xnu-792.6.76.obj~1/RELEASE_PPC Power Macintosh powerpc


Any suggestions?

Thanks,
Sean

---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] Not quite there on timezone names in timestamp input

2006-10-14 Thread Tom Lane
This doesn't work:

regression=# select '2006-07-13 09:20:00 EST5EDT'::timestamptz;
ERROR:  invalid input syntax for type timestamp with time zone: "2006-07-13
09:20:00 EST5EDT"

It never has worked in the past, of course, but I expected it to work in
HEAD seeing that the zic database includes EST5EDT as one of the defined
zone names (and pg_timezone_names shows it as available).

On investigation, the problem seems to lie with ParseDateTime(), which
is being quite finicky about what it will believe is a timezone name.
In particular it won't accept digits as part of the name.  That can
probably be fixed but I'm wondering about the other more or less special
cases in the timezone name list, such as 

America/Port-au-Prince
GB-Eire
GMT-0
Etc/GMT+8
W-SU

In particular it seems tough to accept GMT-0 as a timezone name without
breaking recognition of month-name dates such as 'Nov-09-2006'.  Any
thoughts about appropriate heuristics?  Should we just accept a few
cases where we don't recognize zic timezone names?  (If so, we'd better
hack pg_timezone_names to not show those cases...)
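One possible heuristic for the digits-in-zone-names ambiguity, purely illustrative and not the code in datetime.c: accept digits inside a token as part of a timezone name unless the token's alphabetic prefix is a month abbreviation, in which case it is handed to the month-name date path.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

static const char *months[] = {
    "jan", "feb", "mar", "apr", "may", "jun",
    "jul", "aug", "sep", "oct", "nov", "dec", NULL
};

/* Return 1 if a token containing digits should be tried as a timezone
 * name, 0 if it should be parsed as a month-name date. */
static int
is_tz_candidate(const char *token)
{
    char prefix[4];
    int  i;

    /* take up to three leading alphabetic chars, lowercased */
    for (i = 0; i < 3 && isalpha((unsigned char) token[i]); i++)
        prefix[i] = (char) tolower((unsigned char) token[i]);
    prefix[i] = '\0';

    for (i = 0; months[i]; i++)
        if (strcmp(prefix, months[i]) == 0)
            return 0;       /* 'Nov-09-2006' style date */
    return 1;               /* 'EST5EDT', 'GMT-0', 'Etc/GMT+8', ... */
}
```

This classifies EST5EDT, GMT-0, and W-SU as zone candidates and Nov-09-2006 as a date, but it would misfire on any zone whose name happens to start with a month abbreviation, which is why checking the token against the actual zic name list (or just accepting a few unsupported names) may still be needed.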

Another problem here is case folding.  The existing code smashes the
string to initcap format up to the '/' if any, but doesn't fold the part
after the '/', leaving us with a weird combination of case sensitivity
and insensitivity, not to mention another reason why EST5EDT won't work.
I'd like it to be entirely case insensitive, but am not sure how to make
the file lookup work that way.  Thoughts?

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Problems building 8.2beta1 on macos G5 xserve

2006-10-14 Thread Tom Lane
Sean Davis [EMAIL PROTECTED] writes:
 Trying to build 8.2beta1 on MacOS G5 Xserver, OS version 10.4.7.  I got 
 this:
 /usr/bin/libtool: for architecture: cputype (16777234) cpusubtype (0) 
 file: -lSystem is not an object file (not allowed in a library)

What Xcode version have you got?  My recollection is that this is the
symptom of trying to build with pre-10.4 Xcode tools on 10.4.

(Yeah, time to download that 800MB file again :-( ... but at least it's
free...)

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Problems building 8.2beta1 on macos G5 xserve

2006-10-14 Thread Sean Davis

Tom Lane wrote:

Sean Davis [EMAIL PROTECTED] writes:
  
Trying to build 8.2beta1 on MacOS G5 Xserver, OS version 10.4.7.  I got 
this:
/usr/bin/libtool: for architecture: cputype (16777234) cpusubtype (0) 
file: -lSystem is not an object file (not allowed in a library)



What Xcode version have you got?  My recollection is that this is the
symptom of trying to build with pre-10.4 Xcode tools on 10.4.

(Yeah, time to download that 800MB file again :-( ... but at least it's
free...)
  
Looks like that might be the case (my gcc is from 2003!).  I'll try 
again after updating and let you know for sure.


Thanks, Tom.

Sean

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] [BUGS] BUG #2683: spi_exec_query in plperl returns column names which are not marked as UTF8

2006-10-14 Thread Tom Lane
Vitali Stupin [EMAIL PROTECTED] writes:
 If database uses UTF8 encoding, then spi_exec_query in plperl should return
 query results in UTF8 encoding. But unfortunately only data is marked as
 UTF8, while column names are not.

It looks to me like basically everywhere in plperl.c that does newSVpv()
should follow it with

#if PERL_BCDVERSION >= 0x5006000L
if (GetDatabaseEncoding() == PG_UTF8)
SvUTF8_on(sv);
#endif

whereas currently there are only a couple of places that do that.

I'm tempted to consolidate this into a function on the order of
newSVstring(const char *) or some such.  Comments?
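The consolidated helper might look like the following. This is an untested sketch against the Perl and PostgreSQL internal APIs, just factoring out the pattern quoted above; the name newSVstring is the suggestion from this message, not an existing function.

```c
/* Sketch of a consolidated plperl.c helper: build an SV from a
 * C string, marking it UTF-8 when the database encoding is UTF8. */
static SV *
newSVstring(const char *str)
{
    SV *sv = newSVpv(str, 0);

#if PERL_BCDVERSION >= 0x5006000L
    if (GetDatabaseEncoding() == PG_UTF8)
        SvUTF8_on(sv);
#endif
    return sv;
}
```

Every newSVpv() call in plperl.c that produces user-visible text (column names included) would then go through this one function.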

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] [HACKERS] large object regression tests

2006-10-14 Thread Bruce Momjian

Patch applied.  Thanks.

---


Jeremy Drake wrote:
 On Sun, 24 Sep 2006, Jeremy Drake wrote:
 
  On Thu, 21 Sep 2006, Tom Lane wrote:
 
   I think we could do without the Moby Dick extract too ...
 
  I am open to suggestions.  I saw one suggestion that I use an image of an
  elephant, but I suspect that was tongue-in-cheek.  I am not very fond of
  the idea of generating repetitious data, as I think it would be more
  difficult to determine whether or not the loseek/tell functions put me in
  the right place in the middle of the file.
 
 I just had the idea that I could use one of the existing data files which
 are used for testing COPY instead of the Moby Dick extract.  They are
 already there, a few of them are pretty good sized, they have data in the
 file which is not just simple repetition so it would be pretty obvious if
 the seek function broke, and they are very unlikely to change.  I am
 considering changing the test I put together to use tenk.data as the input
 file tomorrow and send in what I have again, since I also am doing a test
 of \lo_import (which also requires a patch to psql I sent in earlier to
 fix the output of the \lo_* commands to respect the output settings).
 
 -- 
 When does summertime come to Minnesota, you ask?
 Well, last year, I think it was a Tuesday.
 
 ---(end of broadcast)---
 TIP 3: Have you checked our extensive FAQ?
 
http://www.postgresql.org/docs/faq

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] [HACKERS] large object regression tests

2006-10-14 Thread Bruce Momjian

Sorry, please disregard.  It was meant for a Japanese FAQ update.

---

Bruce Momjian wrote:
 
 Patch applied.  Thanks.
 [...]

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings