Re: [HACKERS] [PATCHES] Explain XML patch v2

2008-07-04 Thread Tom Raney

Quoting daveg <[EMAIL PROTECTED]>:


> On Wed, Jul 02, 2008 at 09:01:18AM -0700, David Fetter wrote:
>> On Wed, Jul 02, 2008 at 05:57:29PM +0200, Peter Eisentraut wrote:
>>> It would also be interesting if EXPLAIN could optionally be a
>>> function that returns a datum of type XML, to allow further
>>> processing.
>>
>> It would be better to have a function which allows people to plug in
>> their own serialization.  A JSON or YAML one, for example, would be
>> much lighter weight on both ends.
>
> +1 for either of these.
>
> -dg



So, this leads me to the idea of assembling the EXPLAIN data  
internally in an output-neutral data structure.  At the very end of  
processing, one decision statement would decide which output plugin to  
use for output.  Sprinkling XML print statements throughout the code  
(as currently done in the patch) while functional, is not ideal.  And,  
the escaping of XML content should ideally be done in the serializer  
anyway.
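
A minimal sketch of that idea, assuming a much simpler plan node than the real one. Every name here (ExplainNode, OutBuf, explain_render, the two serializers) is hypothetical and not taken from the patch; the point is only that the tree carries no formatting, escaping happens inside the XML serializer, and one decision at the end picks the output plugin.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical output-neutral plan node: no formatting decisions in here. */
typedef struct ExplainNode {
    const char *label;                  /* e.g. "Seq Scan on t" */
    double total_cost;
    const struct ExplainNode *children; /* array of child nodes, or NULL */
    int nchildren;
} ExplainNode;

/* Bounded output buffer shared by all serializers. */
typedef struct OutBuf {
    char *data;
    size_t cap;
    size_t len;
} OutBuf;

/* A serializer "plugin" is just a function with this signature. */
typedef void (*ExplainSerializer)(const ExplainNode *node, OutBuf *out);

static void out_append(OutBuf *out, const char *s) {
    size_t n = strlen(s);
    if (n > out->cap - out->len - 1)
        n = out->cap - out->len - 1;    /* truncate rather than overflow */
    memcpy(out->data + out->len, s, n);
    out->len += n;
    out->data[out->len] = '\0';
}

/* Escaping lives in the XML serializer, not in the code building the tree. */
static void out_append_xml_escaped(OutBuf *out, const char *s) {
    char one[2] = {'\0', '\0'};
    for (; *s; s++) {
        if (*s == '<')
            out_append(out, "&lt;");
        else if (*s == '&')
            out_append(out, "&amp;");
        else if (*s == '"')
            out_append(out, "&quot;");
        else {
            one[0] = *s;
            out_append(out, one);
        }
    }
}

static void serialize_xml(const ExplainNode *node, OutBuf *out) {
    char cost[32];
    int i;
    snprintf(cost, sizeof(cost), "%.2f", node->total_cost);
    out_append(out, "<plan label=\"");
    out_append_xml_escaped(out, node->label);
    out_append(out, "\" cost=\"");
    out_append(out, cost);
    out_append(out, "\">");
    for (i = 0; i < node->nchildren; i++)
        serialize_xml(&node->children[i], out);
    out_append(out, "</plan>");
}

static void serialize_text(const ExplainNode *node, OutBuf *out) {
    char cost[32];
    int i;
    snprintf(cost, sizeof(cost), " (cost=%.2f)\n", node->total_cost);
    out_append(out, node->label);
    out_append(out, cost);
    for (i = 0; i < node->nchildren; i++)
        serialize_text(&node->children[i], out);
}

/* The single decision point: choose the output plugin at the very end. */
const char *explain_render(const ExplainNode *root, const char *format,
                           char *buf, size_t cap) {
    ExplainSerializer serialize =
        strcmp(format, "xml") == 0 ? serialize_xml : serialize_text;
    OutBuf out = {buf, cap, 0};
    assert(cap > 0);
    buf[0] = '\0';
    serialize(root, &out);
    return buf;
}
```

Adding a JSON or YAML output would then mean writing one more function with the ExplainSerializer signature, without touching the code that builds the tree.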


Of course, this realization didn't occur to me until *after* I had  
spent a bit of time coding up the patch in its current form.  Oh well.


Thoughts?

Regarding the XML datum, in order to support that, will all users need  
to compile with libxml?  Are there any lighter weight solutions to  
serialize other than libxml?


-Tom Raney




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] gsoc, text search selectivity and dllist enhancments

2008-07-04 Thread Heikki Linnakangas

Tom Lane wrote:

> The data structure I'd suggest is a simple array of pointers
> to the underlying hash table entries.  Since you have a predetermined
> maximum number of lexemes to track, you can just palloc the array once
> --- you don't need the expansibility properties of a list.


The number of lexemes isn't predetermined. It's 2 * (longest tsvector 
seen so far), and we don't know beforehand how long the longest tsvector is.


repalloc()ing shouldn't be a problem, though.
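
A rough sketch of what growing such an array could look like. It uses plain realloc() as a stand-in for repalloc() so that it compiles outside the backend, and the names (EntryArray and its functions) are hypothetical. The capacity is raised to 2 * (longest tsvector seen so far) whenever a longer tsvector shows up.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical tracking array: pointers to hash table entries owned elsewhere. */
typedef struct EntryArray {
    void **items;
    size_t count;
    size_t capacity;
} EntryArray;

/* Make room for at least `needed` pointers; returns 0 on success, -1 on OOM. */
int entry_array_reserve(EntryArray *arr, size_t needed) {
    size_t newcap;
    void **grown;

    assert(arr != NULL);
    if (needed <= arr->capacity)
        return 0;
    newcap = arr->capacity ? arr->capacity : 8;
    while (newcap < needed)
        newcap *= 2;                    /* double to keep growth amortized */
    grown = realloc(arr->items, newcap * sizeof(void *));
    if (grown == NULL)
        return -1;                      /* the old array is still valid */
    arr->items = grown;
    arr->capacity = newcap;
    return 0;
}

/* Called for each tsvector: the limit is 2 * (longest tsvector seen so far). */
int entry_array_note_tsvector(EntryArray *arr, size_t tsvector_len) {
    return entry_array_reserve(arr, 2 * tsvector_len);
}

/* Append one hash table entry pointer, growing the array if necessary. */
int entry_array_push(EntryArray *arr, void *entry) {
    if (entry_array_reserve(arr, arr->count + 1) != 0)
        return -1;
    arr->items[arr->count++] = entry;
    return 0;
}
```

Because the array holds only pointers, growing it never moves the hash table entries themselves.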

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] the un-vacuumable table

2008-07-04 Thread Gregory Stark
"Tom Lane" <[EMAIL PROTECTED]> writes:

> Have you looked into the machine's kernel log to see if there is any
> evidence of low-level distress (hardware or filesystem level)?  I'm
> wondering if ENOSPC is being reported because it is the closest
> available errno code, but the real problem is something different than
> the error message text suggests.  Other than the errno the symptoms
> all look quite a bit like a bad-sector problem ...

Uhm, just for the record FileWrite returns error messages which get printed
this way for two reasons other than write(2) returning ENOSPC:

1) if FileAccess has to reopen the file then open(2) could return an error. I
don't see how open returns ENOSPC without O_CREAT (and that's cleared for
reopening)

2) If write(2) returns < 0 but doesn't set errno. That also seems like a
strange case that shouldn't happen, but perhaps there's some reason it can.
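
The second case is commonly handled with a pattern like the one below: clear errno before the call and, if the write comes back short or failed without setting errno, assume the disk is full. This is a sketch of that pattern under those assumptions, not a copy of the FileWrite source; the function name is made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Write `amount` bytes and make sure a usable errno is left behind on failure.
 * Returns whatever write(2) returned: a byte count, or -1 on error.
 */
ssize_t write_reporting_enospc(int fd, const void *buf, size_t amount) {
    ssize_t written;

    assert(buf != NULL);
    errno = 0;                      /* lets us tell whether write() set it */
    written = write(fd, buf, amount);

    /* Short or failed write with errno untouched: assume no space is left. */
    if (written != (ssize_t) amount && errno == 0)
        errno = ENOSPC;

    return written;
}
```

With this pattern an "out of space" message does not prove that write(2) itself reported ENOSPC, which is why the kernel log is still worth checking.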

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



[HACKERS] Multi-column GIN

2008-07-04 Thread Heikki Linnakangas

Dumb question:

What's the benefit of a multi-column GIN index over multiple 
single-column GIN indexes?


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] [PATCHES] Multi-column GIN

2008-07-04 Thread Teodor Sigaev

> What's the benefit of a multi-column GIN index over multiple
> single-column GIN indexes?

Page 12 from presentation on PgCon 
(http://www.sigaev.ru/gin/fastinsert_and_multicolumn_GIN.pdf):


          Multicolumn index   2 single-column indexes
Size:     539 Mb              538 Mb
Speed:    *1.885* ms          4.994 ms
Index:    ~340 s              ~200 s
Insert:   72 s/1              66 s/1



--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] A Windows x64 port of PostgreSQL

2008-07-04 Thread Martijn van Oosterhout
On Thu, Jul 03, 2008 at 01:45:06AM -0400, Ken Camann wrote:
> When 32-bit arrived (much too late, at Microsoft) most x86 compilers
> that had formerly used the segmented memory model made int 4 bytes
> like people felt "it was supposed to be" but left long at 4 the way it
> was so as not to bloat all the variables to double words on such a
> register-poor architecture as x86. 

The usual way to talk about these things is that unix systems went for
LP64 and Microsoft apparently went for LLP64. This link

http://www.unix.org/version2/whatsnew/lp64_wp.html

talks about it. I don't think anyone is going to change their position
on this now.
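
For reference, the two models differ only in the width of long: LP64 has a 4-byte int and 8-byte long and pointers, while LLP64 keeps int and long at 4 bytes and makes only long long and pointers 8 bytes. A small sketch that classifies a platform from those sizes (the function names are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Map the sizes of int, long and pointers to the usual data model name. */
const char *data_model_name(size_t int_size, size_t long_size, size_t ptr_size) {
    assert(int_size > 0 && long_size > 0 && ptr_size > 0);
    if (int_size == 4 && long_size == 8 && ptr_size == 8)
        return "LP64";                  /* typical 64-bit Unix */
    if (int_size == 4 && long_size == 4 && ptr_size == 8)
        return "LLP64";                 /* 64-bit Windows */
    if (int_size == 4 && long_size == 4 && ptr_size == 4)
        return "ILP32";                 /* typical 32-bit systems */
    return "other";
}

/* Classify the platform this code was compiled for. */
const char *this_platform_data_model(void) {
    return data_model_name(sizeof(int), sizeof(long), sizeof(void *));
}
```

Code that stores pointer-sized values in long is therefore safe under LP64 but not under LLP64, where size_t or intptr_t is the safer choice.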

Have a nice day,
-- 
Martijn van Oosterhout   <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while 
> boarding. Thank you for flying nlogn airlines.




Re: [HACKERS] [PATCHES] Explain XML patch v2

2008-07-04 Thread Peter Eisentraut
On Friday, 4 July 2008, Tom Raney wrote:
> Regarding the XML datum, in order to support that, will all users need  
> to compile with libxml?  Are there any lighter weight solutions to  
> serialize other than libxml?

You can create XML without libxml.



Re: [HACKERS] [PATCHES] Multi-column GIN

2008-07-04 Thread Oleg Bartunov

On Fri, 4 Jul 2008, Teodor Sigaev wrote:

>> What's the benefit of a multi-column GIN index over multiple
>> single-column GIN indexes?
>
> Page 12 from presentation on PgCon
> (http://www.sigaev.ru/gin/fastinsert_and_multicolumn_GIN.pdf):
>
>           Multicolumn index   2 single-column indexes
> Size:     539 Mb              538 Mb
> Speed:    *1.885* ms          4.994 ms
> Index:    ~340 s              ~200 s
> Insert:   72 s/1              66 s/1

Well, another reason is index feature-completeness.

Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



[HACKERS] log_rotation_age integer overflow display quirk

2008-07-04 Thread Stefan Kaltenbrunner
I just noticed that setting log_rotation_age to a value larger than 24 
days results in rather weird output (I have not actually tested yet if 
that affects the functionality too or just the output):



test=# show log_rotation_age;
 log_rotation_age
------------------
 -2134967296ms
(1 row)

this is a 64bit build of 8.3.3 on Debian/Linux.


Stefan



Re: [HACKERS] Adding variables for segment_size, wal_segment_size and block sizes

2008-07-04 Thread Simon Riggs

On Thu, 2008-07-03 at 16:36 +0200, Bernd Helmle wrote:
> --On Monday, June 30, 2008 18:47:33 -0400 Bruce Momjian <[EMAIL PROTECTED]> 
> wrote:
> 
> >>
> >> I'd like to implement them if we agree on them
> >
> > Bernd, have you made any progress on this?
> 
> Here's a patch for this. I'll add it to the commit fest wiki page if it's 
> okay for you.

It's small and uncontentious, please add it to the wiki.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] [GENERAL] pg crashing

2008-07-04 Thread Magnus Hagander
Tom Lane wrote:
> Magnus Hagander <[EMAIL PROTECTED]> writes:
>>> below this is going to convert \ to /, wouldn't it be clearer to
>>> describe the path prefix as Global/PostgreSQL: in the first place?
> 
>> Eh, that shows another bug I think. It should *not* convert the \ in
>> "Global\", because that one is interpreted by the Win32 API call!
> 
> I was wondering about that.  What are the implications of that?

First, that the name isn't nicely readable when browsing with Process
Explorer. Second, that they will all go in the local namespace, which
means you can in theory start two postmasters in the same directory from
different terminal server sessions (this was the way it was on 8.2
already, it's a new feature for 8.3 that simply didn't work)
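
A sketch of the intended behaviour, assuming the name is built as the prefix "Global\PostgreSQL:" followed by the data directory path. The function name and the exact construction are hypothetical; the point is that only the path part has its backslashes converted, so the "Global\" prefix reaches the Win32 API call unchanged.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SHMEM_PREFIX "Global\\PostgreSQL:"

/*
 * Build a shared memory object name from a data directory path.
 * Returns a malloc'd string the caller must free, or NULL if out of memory.
 */
char *build_shmem_name(const char *datadir) {
    size_t prefix_len = strlen(SHMEM_PREFIX);
    size_t total;
    char *name;
    char *cp;

    assert(datadir != NULL);
    total = prefix_len + strlen(datadir) + 1;   /* +1 for the terminating NUL */
    name = malloc(total);
    if (name == NULL)
        return NULL;
    strcpy(name, SHMEM_PREFIX);
    strcat(name, datadir);

    /* Convert backslashes in the path part only, never in the prefix. */
    for (cp = name + prefix_len; *cp; cp++)
        if (*cp == '\\')
            *cp = '/';
    return name;
}
```

Starting the loop at the beginning of the string instead would turn the prefix into "Global/", and the object would end up in the session-local namespace.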



>> I think it should be per this patch. Seems right?
> 
> Pls fix the comment on the malloc, too.

Right, will do and commit.

//Magnus



Re: [HACKERS] A Windows x64 port of PostgreSQL

2008-07-04 Thread chris
[EMAIL PROTECTED] ("Ken Camann") writes:
> On Thu, Jul 3, 2008 at 12:39 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
>> "Ken Camann" <[EMAIL PROTECTED]> writes:
>>> EMT64/AMD64 is new compared to the older architectures, I
>>> would guess the older ones predate the time when it became a somewhat
>>> de facto standard to leave "long int" at 4 bytes, and make "long long"
>>> the new 64-bit type.
>>
>> Apparently your definition of "de facto standard" is "any idiotic
>> decision Microsoft cares to make".  AFAIK there is *no* system other
>> than WIN64 where long is narrower than size_t; and I rather doubt that
>> there ever will be any.  "long long" is generally understood to mean
>> a type that the hardware supports, but not very efficiently (ie, it's
>> double the native word width) --- and one would hope that pointers
>> are not in that category.  On a 64-bit machine LL really ought to
>> denote 128-bit arithmetic ... I wonder what curious syntax Microsoft
>> will invent when they realize their compilers ought to support that?
>
> Actually, it isn't my definition.  I haven't really worked with enough
> compilers to make a claim like that, I got that impression (and the
> phrase "de facto") from the proceedings of the C++0x standards
> committee where they finalized long long as being required to be 8
> bytes.  I think it ultimately does come from Microsoft/Intel, because
> they did follow the old width convention in the 16 bit days, that is
> sizeof(int) was 2 (word width) and sizeof(long) was 4.

AFAIK, we oughtn't care what C++ standards say, because PostgreSQL is
implemented in C, and therefore needs to follow what the *C* standards
say.

> When 32-bit arrived (much too late, at Microsoft) most x86 compilers
> that had formerly used the segmented memory model made int 4 bytes
> like people felt "it was supposed to be" but left long at 4 the way it
> was so as not to bloat all the variables to double words on such a
> register-poor architecture as x86.  I actually think of Borland Turbo
> C++ and Intel more than Microsoft when I think of this decision.  For
> that reason, I would have thought you would see the same thing on all
> x86 systems...but now I realize that's stupid.  Once you leave Windows
> it's a gcc world, so it would be the way it always should have been, on
> every POSIX system.  Even then though, if I were to use Linux on x64,
> wouldn't sizeof(int) be 4 and not 8?

#include <stdio.h>

int main (int argc, char **argv) {
   int i;
   long j;
   long long k;
   /* sizeof yields a size_t, so print it with %zu rather than %d */
   printf ("size of int: %zu\n", sizeof(i));
   printf ("size of long: %zu\n", sizeof(j));
   printf ("size of long long: %zu\n", sizeof(k));
   return 0;
}

Output on 32 bit Linux:
size of int: 4
size of long: 4
size of long long: 8

Output on 64 bit Linux:
size of int: 4
size of long: 8
size of long long: 8
-- 
output = ("cbbrowne" "@" "linuxfinances.info")
http://linuxfinances.info/info/
"The real  romance is   out   ahead and   yet to come.The computer
revolution hasn't started yet. Don't be misled by the enormous flow of
money into bad defacto standards for unsophisticated buyers using poor
adaptations of incomplete ideas." -- Alan Kay



Re: [HACKERS] Concurrent Restores

2008-07-04 Thread Zdenek Kotala

Volkan YAZICI wrote:
> Hi,
>
> pg_dump is capable of dumping objects with respect to their dependency
> relations. It'd be really awesome if pg_dump can also handle
> parallelizing primary key, foreign key and index creation queries into
> separate files. Would such a thing be possible? Comments?


You can find a discussion about pg_dump improvement on this page
http://wiki.postgresql.org/wiki/PgCon_2008_Developer_Meeting


Zdenek

--
Zdenek Kotala  Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql




Re: [HACKERS] log_rotation_age integer overflow display quirk

2008-07-04 Thread Bernd Helmle
--On Friday, July 04, 2008 11:31:07 +0200 Stefan Kaltenbrunner 
<[EMAIL PROTECTED]> wrote:

> I just noticed that setting log_rotation_age to a value larger than 24
> days results in rather weird output (I have not actually tested yet if
> that affects the functionality too or just the output):
>
> test=# show log_rotation_age;
>  log_rotation_age
> ------------------
>  -2134967296ms
> (1 row)


This seems to be a bug in _ShowOption(), where the corresponding value is 
converted into milliseconds to get the biggest possible time unit to 
display. This overflows the result variable (which is declared as int), 
causing this strange output.
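
The arithmetic can be checked by hand. Assuming log_rotation_age is stored in minutes and multiplied by 60000 to get milliseconds, any setting above 35791 minutes (about 24.85 days) no longer fits in a signed 32-bit int. The value shown above, -2134967296, is exactly what 36000 minutes (25 days) wraps to: 36000 * 60000 = 2160000000, and 2160000000 - 2^32 = -2134967296. A small sketch, with hypothetical helper names, that does the conversion in 64 bits and reproduces the wrapped value:

```c
#include <assert.h>
#include <stdint.h>

#define MS_PER_MIN 60000

/* Convert minutes to milliseconds in 64-bit arithmetic, which cannot overflow here. */
int64_t minutes_to_ms(int32_t minutes) {
    assert(minutes >= 0);
    return (int64_t) minutes * MS_PER_MIN;
}

/* What a two's complement 32-bit int would hold after wraparound. */
int64_t wrap_to_int32(int64_t value) {
    uint32_t low = (uint32_t) value;    /* well-defined: reduces modulo 2^32 */
    return low > INT32_MAX ? (int64_t) low - 4294967296LL : (int64_t) low;
}
```

Here wrap_to_int32(minutes_to_ms(36000)) gives -2134967296, matching the SHOW output, while minutes_to_ms(36000) itself gives the intended 2160000000. Doing the unit conversion in a 64-bit variable would be one way to avoid the problem.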


--
 Thanks

   Bernd



[HACKERS] Review: DTrace probes (merged version)

2008-07-04 Thread Zdenek Kotala
I reviewed the merged patch from Robert Treat. First of all, the patch does not 
work: several regression tests fail (SunOS 5.11 snv_86 sun4u sparc SUNW,Sun-Fire-V240)


     truncate             ... ok
     alter_table          ... FAILED (test process exited with exit code 2)
     sequence             ... ok
     polymorphism         ... ok
     rowtypes             ... ok
     returning            ... ok
     largeobject          ... FAILED (test process exited with exit code 2)
     xml                  ... ok
test stats                ... FAILED (test process exited with exit code 2)
test tablespace           ... FAILED (test process exited with exit code 2)


However, I went through the code and have the following comments:

1) Naming convention:

 - Some probes use "*end", some "*done". It would be good to select one name.
 - I prefer to use clog instead of slru in probe names. clog is widely known.
 - It seems better to me to have checkpoint-clog..., checkpoint-subtrans 
instead of clog-checkpoint.
 - buffer-flush was originally dirty-buffer-write-start. I prefer Robert Lor's 
naming.


2) storage read write probes

The smgr-read* and smgr-write* probes are in md.c. I personally think it makes 
more sense to put them into smgr.c. The only advantage of having them in md.c 
is that the actual number of bytes can be monitored.


3) query rewrite probe

There are several probes for query measurement, but the query rewrite phase is 
missing. See analyze_and_rewrite or pg_parse_and_rewrite.


4) autovacuum_start

The autovacuum_start probe stands alone. I propose the following probes for completeness:

proc-autovacuum-start
proc-autovacuum-stop
proc-bgwriter-start
proc-bgwriter-stop
proc-backend-start
proc-backend-stop
proc-master-start
proc-master-stop

5) Need explanation of usage:

I have some doubts about the following probes. Could you please explain how 
they are meant to be used? An example dtrace script is welcome.


 - all exec-* probes
 - mark-dirty, local-mark-dirty


6) several comments about placement:

I published the patch at http://reviewdemo.postgresql.org/r/25/ and added 
several comments there.


7) SLRU/CLOG

SLRU probes could return more info, for example whether the page was in a 
buffer, or whether a physical write was not necessary, and so on.



That's all for the moment.

Zdenek  
--
Zdenek Kotala  Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql




Re: [HACKERS] Review: DTrace probes (merged version)

2008-07-04 Thread Alvaro Herrera
Zdenek Kotala wrote:

> 1) Naming convention:
>
>  - Some probes uses "*end", some "*done". It would be good to select one name.
>  - I prefer to use clog instead of slru in probes name. clog is widely known.

But slru is also used in pg_subtrans and pg_multixact.  Which maybe
says that we oughta have separate probes for these rather than a single
one in slru.  Otherwise it's going to be difficult telling one from the
other, yes?


> Autovacuum_start probe is alone. I propose following probes for completeness:
>
> proc-autovacuum-start
> proc-autovacuum-stop
> proc-bgwriter-start
> proc-bgwriter-stop

Separate proc-autovacuum-worker-start and proc-autovacuum-launcher-start,
perhaps.  Not that I see any usefulness in tracking autovacuum launcher
start and stop, but then if we're tracking bgwriter start and stop then
it makes the same sense.

> proc-master-start
> proc-master-stop

What's "master" here?


-- 
Alvaro Herrera                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] gsoc, text search selectivity and dllist enhancments

2008-07-04 Thread Tom Lane
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> The data structure I'd suggest is a simple array of pointers
>> to the underlying hash table entries.  Since you have a predetermined
>> maximum number of lexemes to track, you can just palloc the array once
>> --- you don't need the expansibility properties of a list. 

> The number of lexemes isn't predetermined. It's 2 * (longest tsvector 
> seen so far), and we don't know beforehand how long the longest tsvector is.

Hmm, I had just assumed without looking too closely that it was stats
target times a fudge factor.  What is the rationale for doing it as
above?  I don't think I like the idea of the limit varying over the
course of the scan --- that means that lexemes in different places
in the input will have significantly different probabilities of
surviving to the final result.

regards, tom lane



Re: [HACKERS] [PATCHES] Explain XML patch v2

2008-07-04 Thread Tom Lane
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> On Friday, 4 July 2008, Tom Raney wrote:
>> Regarding the XML datum, in order to support that, will all users need  
>> to compile with libxml?  Are there any lighter weight solutions to  
>> serialize other than libxml?

> You can create XML without libxml.

Seems to me that anyone who wants this feature will probably also want
the existing libxml-based features, so they'll be building with libxml
anyway.  So I'd not be in favor of expending any extra code on a
roll-your-own solution.

regards, tom lane



Re: [HACKERS] [COMMITTERS] pgsql: Fix a couple of bugs in win32 shmem name generation: * Don't cut

2008-07-04 Thread Tom Lane
Magnus Hagander <[EMAIL PROTECTED]> writes:
> Heikki Linnakangas wrote:
>> What happens if someone launches version 8.3.3 postgres.exe and version
>> 8.3.4 postgres.exe at the same time, on the same data directory? Will
>> the interlock that prevents two postmasters from starting at the same
>> time work?

> Hmm. Didn't think of that :(

> Yeah, it seems that that part of it would fail. In a lot of cases you'd
> still get kicked off by socket conflicts and such, but the shared memory
> part would not notice someone is already there, no.

> Not sure if it's reason enough to revert - since it fixes other cases. I
> guess in theory we could check both the old and the new name, but that's
> going to be a considerably more complex patch.

According to what you just told me, the original coding is storing the
name in a "local namespace", which presumably means it won't conflict
anyway.  Ergo, the existing coding is simply broken and there's nothing
we can do about it.

regards, tom lane



Re: [HACKERS] Review: DTrace probes (merged version)

2008-07-04 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
>> Autovacuum_start probe is alone. I propose following probes for completeness:
>> 
>> proc-autovacuum-start
>> proc-autovacuum-stop
>> proc-bgwriter-start
>> proc-bgwriter-stop

> Separate proc-autovacuum-worker-start and proc-autovacuum-launcher-start,
> perhaps.  Not that I see any usefulness in tracking autovacuum launcher
> start and stop, but then if we're tracking bgwriter start and stop then
> it makes the same sense.

I see no value in cluttering the system with useless probes.  The worker
start/stop are the only ones here with any conceivable application IMHO.

regards, tom lane



Re: [HACKERS] Review: DTrace probes (merged version)

2008-07-04 Thread Zdenek Kotala

Alvaro Herrera wrote:
> Zdenek Kotala wrote:
>
>> 1) Naming convention:
>>
>>  - Some probes uses "*end", some "*done". It would be good to select one name.
>>  - I prefer to use clog instead of slru in probes name. clog is widely known.
>
> But slru is also used in pg_subtrans and pg_multixact.  Which maybe
> says that we oughta have separate probes for these rather than a single
> one in slru.  Otherwise it's going to be difficult telling one from the
> other, yes?

Yeah, you are right, I missed that it is used in other parts too. slru is OK.

>> Autovacuum_start probe is alone. I propose following probes for completeness:
>>
>> proc-autovacuum-start
>> proc-autovacuum-stop
>> proc-bgwriter-start
>> proc-bgwriter-stop
>
> Separate proc-autovacuum-worker-start and proc-autovacuum-launcher-start,
> perhaps.  Not that I see any usefulness in tracking autovacuum launcher
> start and stop, but then if we're tracking bgwriter start and stop then
> it makes the same sense.

The advantage of tracking the start and stop of processes is that you can stop
the process in a dtrace script at the beginning and, for example, attach a
debugger, or start counting the number of writes per process, and so on.

>> proc-master-start
>> proc-master-stop
>
> What's "master" here?

Main process - postmaster.


Zdenek

--
Zdenek Kotala  Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql




Re: [HACKERS] Review: DTrace probes (merged version)

2008-07-04 Thread Zdenek Kotala

Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
>>> Autovacuum_start probe is alone. I propose following probes for completeness:
>>>
>>> proc-autovacuum-start
>>> proc-autovacuum-stop
>>> proc-bgwriter-start
>>> proc-bgwriter-stop
>>
>> Separate proc-autovacuum-worker-start and proc-autovacuum-launcher-start,
>> perhaps.  Not that I see any usefulness in tracking autovacuum launcher
>> start and stop, but then if we're tracking bgwriter start and stop then
>> it makes the same sense.
>
> I see no value in cluttering the system with useless probes.  The worker
> start/stop are the only ones here with any conceivable application IMHO.

As I answered to Alvaro, I needed to catch the start of a backend several times 
to track call flow or attach a debugger. It is possible to use some other 
dtrace magic for that, but it is not easy, and there is no way to determine 
what kind of process it is. For example, how would you measure how many writes 
the bgwriter performs?


Zdenek


--
Zdenek Kotala  Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql




Re: [HACKERS] Review: DTrace probes (merged version)

2008-07-04 Thread Alvaro Herrera
Zdenek Kotala wrote:
> Alvaro Herrera wrote:

>>> proc-master-start
>>> proc-master-stop
>>
>> What's "master" here?
>
> Main process - postmaster.

Huh, surely we're not interested in tracking postmaster start?

-- 
Alvaro Herrera                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] Review: DTrace probes (merged version)

2008-07-04 Thread Alvaro Herrera
Zdenek Kotala wrote:
> Tom Lane wrote:

>> I see no value in cluttering the system with useless probes.  The worker
>> start/stop are the only ones here with any conceivable application IMHO.
>
> As I answered to Alvaro. I needed to catch start of backend several times 
> to track call flow or attach debugger. It is possible to use some other 
> dtrace magic for that, but it is not easy and there is not way how to 
> determine what kind of process it is.  For example how to measure how 
> many writes performs bgwriter?

If you need to attach a debugger to a backend, you can use the -W switch
(even on PGOPTIONS if you need it for a particular backend, AFAIR).  If
you want to "truss" it I guess you can use -W too.

Does it have any usefulness beyond that?

-- 
Alvaro Herrera                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] Review: DTrace probes (merged version)

2008-07-04 Thread Zdenek Kotala

Alvaro Herrera wrote:
> Zdenek Kotala wrote:
>> Alvaro Herrera wrote:
>>>> proc-master-start
>>>> proc-master-stop
>>>
>>> What's "master" here?
>>
>> Main process - postmaster.
>
> Huh, surely we're not interested in tracking postmaster start?



It depends. See the following schematic example:


postgres::proc-master-start
{
   tracking = 1;
}

/* the probe description is provider:module:function:name */
syscall::open:entry
/ tracking == 1 /
{
   printf("%s\n", copyinstr(arg0));   /* print file name */
}

postgres::proc-master-stop
{
   tracking = 0;
}

It is very useful because, for example, it tells you when you can start and 
stop monitoring. But I'm not a DTrace expert; maybe there is a different way 
to do it.


Zdenek


--
Zdenek Kotala  Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql




Re: [HACKERS] Review: DTrace probes (merged version)

2008-07-04 Thread Zdenek Kotala

Alvaro Herrera wrote:
> Zdenek Kotala wrote:
>> Tom Lane wrote:
>>> I see no value in cluttering the system with useless probes.  The worker
>>> start/stop are the only ones here with any conceivable application IMHO.
>>
>> As I answered to Alvaro. I needed to catch start of backend several times 
>> to track call flow or attach debugger. It is possible to use some other 
>> dtrace magic for that, but it is not easy and there is not way how to 
>> determine what kind of process it is.  For example how to measure how 
>> many writes performs bgwriter?
>
> If you need to attach a debugger to a backend, you can use the -W switch
> (even on PGOPTIONS if you need it for a particular backend, AFAIR).  If
> you want to "truss" it I guess you can use -W too.
>
> Does it have any usefulness beyond that?



Why use a million tools when you can use one? Also, truss monitors only syscalls, 
but with dtrace you are able to use/trace over 8 probes in the kernel, libc 
and so on. I agree that for a debugger you can use the -W option, but in situations 
where you are not able to use this switch (e.g. on a customer production machine) 
dtrace is the only possible solution. That is why I think that these probes are useful.



Zdenek

--
Zdenek Kotala  Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql




[HACKERS] Re: [COMMITTERS] pgsql: Fix a couple of bugs in win32 shmem name generation: * Don't cut

2008-07-04 Thread Magnus Hagander
Tom Lane wrote:
> Magnus Hagander <[EMAIL PROTECTED]> writes:
>> Heikki Linnakangas wrote:
>>> What happens if someone launches version 8.3.3 postgres.exe and version
>>> 8.3.4 postgres.exe at the same time, on the same data directory? Will
>>> the interlock that prevents two postmasters from starting at the same
>>> time work?
> 
>> Hmm. Didn't think of that :(
> 
>> Yeah, it seems that that part of it would fail. In a lot of cases you'd
>> still get kicked off by socket conflicts and such, but the shared memory
>> part would not notice someone is already there, no.
> 
>> Not sure if it's reason enough to revert - since it fixes other cases. I
>> guess in theory we could check both the old and the new name, but that's
>> going to be a considerably more complex patch.
> 
> According to what you just told me, the original coding is storing the
> name in a "local namespace", which presumably means it won't conflict
> anyway.  Ergo, the existing coding is simply broken and there's nothing
> we can do about it.

Local namespace = Session local, not process local. So it would properly
protect against two processes started in the same session. One session
is, for example, an interactive login. But not if they were started by
different users, since they'd be in different sessions.

//Magnus




[HACKERS] Re: [COMMITTERS] pgsql: Fix a couple of bugs in win32 shmem name generation: * Don't cut

2008-07-04 Thread Alvaro Herrera
Magnus Hagander wrote:
> Tom Lane wrote:

> > According to what you just told me, the original coding is storing the
> > name in a "local namespace", which presumably means it won't conflict
> > anyway.  Ergo, the existing coding is simply broken and there's nothing
> > we can do about it.
> 
> Local namespace = Session local, not process local. So it would properly
> protect against two processes started in the same session. One session
> is, for example, an interactive login. But not if they were started by
> different users, since they'd be in different sessions.

But those different users would not have access to the same set of
files, so it wouldn't work anyway, right?

-- 
Alvaro Herrera                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] gsoc, text search selectivity and dllist enhancments

2008-07-04 Thread Heikki Linnakangas

Tom Lane wrote:
> "Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
>> Tom Lane wrote:
>>> The data structure I'd suggest is a simple array of pointers
>>> to the underlying hash table entries.  Since you have a predetermined
>>> maximum number of lexemes to track, you can just palloc the array once
>>> --- you don't need the expansibility properties of a list.
>>
>> The number of lexemes isn't predetermined. It's 2 * (longest tsvector
>> seen so far), and we don't know beforehand how long the longest tsvector is.
>
> Hmm, I had just assumed without looking too closely that it was stats
> target times a fudge factor.  What is the rationale for doing it as
> above?  I don't think I like the idea of the limit varying over the
> course of the scan --- that means that lexemes in different places
> in the input will have significantly different probabilities of
> surviving to the final result.


Well, clearly if the list is smaller than the longest tsvector, 
inserting all elements of that long tsvector will flush out all other 
entries. Or if we throw away the newly inserted entries, some elements 
will never have a chance to climb up the list. I'm not sure where the 
"times two" figure comes from, maybe it's just a fudge factor, but the 
bottom line is that the minimum size needed depends on the size of the 
longest tsvector.
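
To illustrate the flushing effect, here is a minimal sketch. It is not the 
patch's code: LexemeTracker, tracker_add() and survives_long_vector() are 
made-up names, lexemes are plain ints, and the list is a simple 
most-recently-seen-first array that evicts its oldest entry when full.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_TRACKED 64

/* Hypothetical tracker: ids[0] is the most recently seen lexeme. */
typedef struct
{
    int     ids[MAX_TRACKED];
    size_t  len;        /* entries currently tracked */
    size_t  capacity;   /* tracking limit, at most MAX_TRACKED */
} LexemeTracker;

static void
tracker_init(LexemeTracker *t, size_t capacity)
{
    t->len = 0;
    t->capacity = capacity > MAX_TRACKED ? MAX_TRACKED : capacity;
}

static bool
tracker_contains(const LexemeTracker *t, int id)
{
    for (size_t i = 0; i < t->len; i++)
        if (t->ids[i] == id)
            return true;
    return false;
}

/* Record one lexeme: move it to the front, evicting the oldest if full. */
static void
tracker_add(LexemeTracker *t, int id)
{
    size_t  pos = t->len;

    if (t->capacity == 0)
        return;

    for (size_t i = 0; i < t->len; i++)
    {
        if (t->ids[i] == id)
        {
            pos = i;
            break;
        }
    }

    if (pos == t->len)          /* not tracked yet */
    {
        if (t->len < t->capacity)
            t->len++;           /* still room: the list grows by one */
        else
            pos = t->len - 1;   /* full: the oldest entry gets overwritten */
    }

    /* Shift entries [0, pos) down one slot and put the lexeme in front. */
    memmove(&t->ids[1], &t->ids[0], pos * sizeof(int));
    t->ids[0] = id;
}

/*
 * Track lexeme 1, then feed one "tsvector" of vector_len distinct lexemes.
 * Returns whether lexeme 1 is still tracked afterwards.
 */
static bool
survives_long_vector(size_t capacity, int vector_len)
{
    LexemeTracker t;

    tracker_init(&t, capacity);
    tracker_add(&t, 1);
    for (int i = 0; i < vector_len; i++)
        tracker_add(&t, 100 + i);
    return tracker_contains(&t, 1);
}
```

With a limit of 4 and a 5-lexeme tsvector, lexeme 1 is flushed out; with a 
limit of 2 * 5 = 10 it stays in the list.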


(Jan is offline until Saturday...)

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] PATCH: CITEXT 2.0

2008-07-04 Thread David E. Wheeler

On Jul 3, 2008, at 09:53, Alvaro Herrera wrote:

>> Thanks. What would citext_hash() look like? I don't see a text_hash() to
>> borrow from anywhere in src/.
>
> See hash_any().  I assume the difficulty is making sure that
> hash("FOO") = hash("foo") ...


Great, big help, thank you. So does this look sensible?

Datum
citext_hash(PG_FUNCTION_ARGS)
{
    char   *txt;
    char   *str;
    Datum   result;

    txt = cilower( PG_GETARG_TEXT_PP(0) );
    str = VARDATA_ANY(txt);

    result = hash_any((unsigned char *) str, VARSIZE_ANY_EXHDR(txt));

    /* Avoid leaking memory for toasted inputs */
    PG_FREE_IF_COPY(txt, 0);
    pfree( str );

    return result;
}

And how might I be able to test that it actually works?
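
For what it's worth, the property Alvaro mentions, hash("FOO") = 
hash("foo"), can be checked outside the backend with a stand-alone sketch 
like the one below. Assumptions: FNV-1a stands in for hash_any(), tolower() 
stands in for cilower() and only folds single-byte characters, and ci_hash() 
is a made-up name.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* One FNV-1a step; used here only as a stand-in for hash_any(). */
static uint32_t
fnv1a_step(uint32_t h, unsigned char c)
{
    return (h ^ c) * 16777619u;
}

/*
 * Hash len bytes of s after folding each byte to lower case.
 * tolower() is a simplification: it does not handle multibyte
 * characters the way a locale-aware lowercasing routine would.
 */
static uint32_t
ci_hash(const char *s, size_t len)
{
    uint32_t    h = 2166136261u;    /* FNV offset basis */

    for (size_t i = 0; i < len; i++)
        h = fnv1a_step(h, (unsigned char) tolower((unsigned char) s[i]));
    return h;
}
```

The check is then that inputs differing only in case hash to the same 
value, while different strings normally do not. Presumably the same 
comparison could be made from SQL against citext_hash() once it is 
installed.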

Best,

David



Re: [HACKERS] PATCH: CITEXT 2.0

2008-07-04 Thread David E. Wheeler

On Jul 2, 2008, at 22:14, Tom Lane wrote:

> The "leak" is irrelevant for larger/smaller.  The only place where it's
> actually useful to do PG_FREE_IF_COPY is in a btree or hash index
> support function.  In other cases you can assume that you're being
> called in a memory context that's too short-lived for it to matter.


Stupid question: for the btree index support function, is that *only*  
the function referenced in the OPERATOR CLASS, or does it also apply  
to functions that implement the operators in that class? IOW, do I  
need to worry about memory leaks in citext_eq, citext_ne, citext_gt,  
etc., or only in citext_cmp()?


Thanks,

David



Re: [HACKERS] PATCH: CITEXT 2.0

2008-07-04 Thread David E. Wheeler
Replying to myself, but I've made some local changes (see other  
messages) and just wanted to follow up on some of my own comments.


On Jul 2, 2008, at 21:38, David E. Wheeler wrote:


4) Operator =  citext_eq is not correct. See comment 
http://doxygen.postgresql.org/varlena_8c.html#8621d064d14f259c594e4df3c1a64cac


So should citextcmp() call strncmp() instead of varstr_cmp()? The  
latter is what I saw in varlena.c.


I'm guessing that the answer is "no," since varstr_cmp() uses  
strncmp() internally, as appropriate to the locale. Correct?


There must be a difference between equality and collation. For  
example, in Czech 'láska' and 'laská' are different words, which  
means that 'láska' != 'laská'. But there is no difference in their  
collation order. See the Unicode Collation Algorithm for details.


I'll leave the collation stuff to the functions I call (*far* from  
my specialty), but I'll add a test for this and make sure it works  
as expected. Um, although, with what collation should it be tested?  
The tests I wrote assume en_US.UTF-8.


I added this test and it passes:

SELECT isnt( 'láska'::citext, 'laská'::citext, 'Different accented  
characters should not be equivalent' );
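
As far as I understand it, the reason this can pass even where the locale 
collates the two words identically is that a comparison can break 
strcoll() ties bytewise, which I believe is what varstr_cmp() does. A 
minimal stand-alone sketch of that idea follows; compare_with_tiebreak() 
and strings_equal() are made-up names, not PostgreSQL functions.

```c
#include <string.h>

/*
 * Compare two NUL-terminated strings using the current LC_COLLATE rules,
 * falling back to a bytewise comparison when strcoll() reports a tie.
 * That way two strings compare as equal only if they are byte-identical.
 */
static int
compare_with_tiebreak(const char *a, const char *b)
{
    int     result = strcoll(a, b);

    if (result == 0)
        result = strcmp(a, b);
    return result;
}

/* Equality defined on top of the tie-broken comparison. */
static int
strings_equal(const char *a, const char *b)
{
    return compare_with_tiebreak(a, b) == 0;
}
```

Under this scheme collation decides the ordering, but equality never 
depends on whether the locale happens to treat two different strings as 
ties.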


5) There are several commented-out lines in the CREATE OPERATOR  
statements, mostly related to NEGATOR. Is there some reason for that?


I copied it from the original citext.sql. Not sure what effect it has.


I restored these (and one of them was wrong anyway).


Also, OPERATOR || probably has the wrong negator.


Right, good catch.


Stupid question: What would the negation of || actually be? There  
isn't one, is there?


Thanks!

David