Re: [HACKERS] What has happened to pgxs?

2005-09-23 Thread Thomas Hallgren

Michael Paesold wrote:


Thomas Hallgren wrote:

I'm using a pre-compiled installation of PostgreSQL. All I want to do 
is use pgxs to be able to compile PL/Java.


There is other stuff that seems strange to me. Why do you append 
'postgresql' to the include directories further down in the file? I 
had to remove that in order to compile.



Perhaps this is just a problem with the pre-compiled installation? 
Perhaps they are using strange configure options and then relocate the 
stuff with the installer? I don't know much about Windows stuff, just 
a wild guess. I would file a bug report in the pginstaller project on 
pgfoundry. pgxs should work.


I followed your advice. Here's a link: 
http://pgfoundry.org/tracker/index.php?func=detail&aid=1000388&group_id=107&atid=126


There's another issue with the Makefile.global.in that I feel should be 
addressed here. The file contains a lot of entries like:


ifeq "$(findstring pgsql, $(pkgincludedir))" ""
ifeq "$(findstring postgres, $(pkgincludedir))" ""
override pkgincludedir := $(pkgincludedir)/postgresql
endif
endif

Guess what happens if the install location itself contains the string 
postgres?
A more correct way of doing it is probably to check if the directory in 
question *ends with* pgsql or postgres rather than if it contains it.
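
One way to spell that ends-with test in GNU make, as an illustrative 
sketch only (the shipped Makefile.global does not do this; note that 
$(filter) matches whole whitespace-separated words, so a prefix 
containing spaces would still need the short Windows form):

ifeq (,$(filter %pgsql %postgres %postgresql,$(pkgincludedir)))
override pkgincludedir := $(pkgincludedir)/postgresql
endif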


Regards,
Thomas Hallgren



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] What has happened to pgxs?

2005-09-23 Thread Michael Paesold

Thomas Hallgren wrote:
I'm using a pre-compiled installation of PostgreSQL. All I want to do is 
use pgxs to be able to compile PL/Java.


There is other stuff that seems strange to me. Why do you append 
'postgresql' to the include directories further down in the file? I had 
to remove that in order to compile.


Perhaps this is just a problem with the pre-compiled installation? 
Perhaps they are using strange configure options and then relocate the 
stuff with the installer? I don't know much about Windows stuff, just a 
wild guess. I would file a bug report in the pginstaller project on 
pgfoundry. pgxs should work.


Best Regards,
Michael Paesold

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] PCTFree Results

2005-09-23 Thread Jim C. Nasby
On Thu, Sep 22, 2005 at 10:05:57PM -0400, Tom Lane wrote:
 With respect to the original point, I'm pretty nervous about either
 accepting or rejecting a performance-oriented patch on the strength
 of a single test case.  This report certainly doesn't favor the PCTFREE
 patch, but it probably shouldn't kill it either.  Anyone want to try it
 on some other test cases?

I *think* that a better test would be a table that is seeing a lot of
'update churn', or one that's seeing a lot of insert and delete activity
spread randomly around. It's very possible that dbt2 doesn't put enough
activity on each page to make any real difference, especially if the old
behaviour was to leave 10% free by default.

But it's been quite some time since the patch was discussed and I don't
remember specifics. Hopefully the author will speak up.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] What has happened to pgxs?

2005-09-23 Thread Thomas Hallgren

Sorry, that conclusion was wrong. What happens is:
1. I change the prefix in Makefile.global to say, 
C:/Progra~1/PostgreSQL/8.1-beta2 (this is the default for the installer).

2. I compile.

That triggers the append of 'postgresql' on all directories since my 
install location *does not* contain the word 'postgres' or 'pgsql'.


Regards,
Thomas Hallgren

Thomas Hallgren wrote:
There's another issue with the Makefile.global.in that I feel should be 
addressed here. The file contains a lot of entries like:


ifeq "$(findstring pgsql, $(pkgincludedir))" ""
ifeq "$(findstring postgres, $(pkgincludedir))" ""
override pkgincludedir := $(pkgincludedir)/postgresql
endif
endif

Guess what happens if the install location itself contains the string 
postgres?
A more correct way of doing it is probably to check if the directory in 
question *ends with* pgsql or postgres rather than if it contains it.


Regards,
Thomas Hallgren



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match





---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] pgxs and pginstaller

2005-09-23 Thread Dave Page
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Bruce Momjian
 Sent: 22 September 2005 23:07
 To: Thomas Hallgren
 Cc: pgsql-hackers@postgresql.org
 Subject: [HACKERS] pgxs and pginstaller
 
  There is other stuff that seems strange to me. Why do you append 
  'postgresql' to the include directories further down in the 
 file? I had 
  to remove that in order to compile.
 
 Perhaps the builders of pginstaller could answer this question.

We don't. We don't modify any of the files packaged in the installer
from when they are originally built. In the case of PostgreSQL itself,
it's built in /usr/local/src/postgresql-8.x, installed to
/usr/local/pgsql and then packaged up from there.

Regards, Dave

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] Releasing memory during External sorting?

2005-09-23 Thread Simon Riggs
I have concerns about whether we are overallocating memory for use in
external sorts. (All code relating to this is in tuplesort.c)

When we begin a sort we allocate (work_mem | maintenance_work_mem) and
attempt to do the sort in memory. If the sort set is too big to fit in
memory we then write to disk and begin an external sort. The same memory
allocation is used for both types of sort, AFAICS.

The external sort algorithm benefits from some memory but not much.
Knuth says that the amount of memory required is very low, with a value
typically less than 1 kB. I/O overheads mean that there is benefit from
having longer sequential writes, so the optimum is much larger than
that. I've not seen any data that indicates that a setting higher than
16 MB adds any value at all to a large external sort. I have some
indications from private tests that very high memory settings may
actually hinder performance of the sorts, though I cannot explain that
and wonder whether it is the performance tests themselves that have
issues.

Does anyone have any clear data that shows the value of large settings
of work_mem when the data to be sorted is much larger than memory? (I am
well aware of the value of setting work_mem higher for smaller sorts, so
any performance data needs to reflect only very large sorts). 

If not, I would propose that when we move from qsort to tapesort mode we
free the larger work_mem setting (if one exists) and allocate only a
lower, though still optimal setting for the tapesort. That way the
memory can be freed for use by other users or the OS while the tapesort
proceeds (which is usually quite a while...).

Feedback, please.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] pgxs and pginstaller

2005-09-23 Thread Thomas Hallgren

Dave Page wrote:
 
In the case of PostgreSQL itself, it's built in /usr/local/src/postgresql-8.x,

 installed to /usr/local/pgsql and then packaged up from there.


Perhaps you should build it in the directory that the installer suggests 
as default for the installation? That way, most of the installations 
will work out of the box. If you should try that, be careful to use the 
Windows short representation (i.e. C:/Progra~1/PostgreSQL). Whitespace 
doesn't play well with most makefiles.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Table Partitioning is in 8.1

2005-09-23 Thread Simon Riggs
On Thu, 2005-09-22 at 14:37 -0500, Jim C. Nasby wrote:
 On Thu, Sep 22, 2005 at 10:11:50AM +0100, Simon Riggs wrote:
  On Wed, 2005-09-21 at 15:39 -0400, Tom Lane wrote:
   Simon Riggs [EMAIL PROTECTED] writes:
Is it possible that the Release Notes do not fully explain the
Constraint Exclusion feature? Or is it the consensus that it works but
not quite well enough to make a song and dance about yet?
   
   I hardly think that the existing constraint-exclusion code is enough for
   us to claim we support table partitioning.  There's too much grunt
   work that the DBA still has to do to set up a partitioning arrangement.
  
  So you think the DBA can do partitioning? Good.
  
  Setting up partitioning in Oracle or SQLServer2005 requires lots of
  syntax and multiple commands. There are fewer commands with PostgreSQL
  and they are ISO/ANSI compliant also.
 
 Actually, IIRC it takes 2 commands; one to initially setup the
 partitioning and one to create new partitions as needed. 3 commands if
 you count DROP PARTITON.

Which is what it takes in PostgreSQL.

 It's been a while since I looked at what you've done, but I seem to
 recall needing to manually maintain rules every time you create a new
 partition.

All current implementations I am aware of require Declarative statements
each time a new partition is started.

But there is no *need* to create RULEs. You would need to do that if you
wanted automatic routing of INSERT statements. The current PostgreSQL
implementation does not do this, and yes, adding it would require creating
a set of RULE statements, though that could be done automatically with a
simple script. My experience with Oracle has been
that one loads into specifically identified partitions, so the loading
case for PostgreSQL is essentially identical to the common case for
Oracle. My experience may not be everybody's, so I may be off-base on
this, but I've not put a priority on solving that issue. Do people
consider that aspect essential to the wider use of partitioning?
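
A minimal sketch of the setup under discussion, with invented table and
column names: inheritance plus a CHECK constraint is what constraint
exclusion works from, and the RULE is only needed if you want automatic
INSERT routing through the parent:

CREATE TABLE measurement (logdate date, value int);

CREATE TABLE measurement_2005_09 (
    CHECK (logdate >= '2005-09-01' AND logdate < '2005-10-01')
) INHERITS (measurement);

-- Optional: route INSERTs against the parent into the child.
CREATE RULE measurement_ins_2005_09 AS
    ON INSERT TO measurement
    WHERE NEW.logdate >= '2005-09-01' AND NEW.logdate < '2005-10-01'
    DO INSTEAD
        INSERT INTO measurement_2005_09 VALUES (NEW.logdate, NEW.value);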

I don't consider CE to be complete, any more than any newly introduced
major feature set, but IMHO the important first order functionality is
now available in PostgreSQL.

I'm looking at probably Tuesday to submit first draft docs.

Best Regards, Simon Riggs





---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] pgxs and pginstaller

2005-09-23 Thread Dave Page
 

 -Original Message-
 From: Thomas Hallgren [mailto:[EMAIL PROTECTED] 
 Sent: 23 September 2005 10:49
 To: Dave Page
 Cc: pgsql-hackers@postgresql.org
 Subject: Re: pgxs and pginstaller
 
 Dave Page wrote:
   
  In the case of PostgreSQL itself, it's built in 
 /usr/local/src/postgresql-8.x,
   installed to /usr/local/pgsql and then packaged up from there.
  
 Perhaps you should build it in the directory that the 
 installer suggests 
 as default for the installation? That way, most of the installations 
 will work out of the box. If you should try that, be careful 
 to use the 
 Windows short representation (i.e. C:/Progra~1/PostgreSQL). 
 Whitespace 
 doesn't play well with most makefiles.

You can't because we build it under Msys which uses the Unix style
paths. GCC has a tendency to barf horribly on Windows style paths as
well, as we found when trying to use pg_config with Slony.

I know little about pgxs, so if you can detail what settings need to
be hacked during installation in the bug you opened I'd appreciate it. It
should be relatively trivial to rewrite in a custom action.

Regards, Dave

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match



Re: [HACKERS] pgxs and pginstaller

2005-09-23 Thread Thomas Hallgren

Dave Page wrote:

 -Original Message-
 From: Thomas Hallgren [mailto:[EMAIL PROTECTED]
 Sent: 23 September 2005 10:49
 To: Dave Page
 Cc: pgsql-hackers@postgresql.org
 Subject: Re: pgxs and pginstaller

 Dave Page wrote:
  In the case of PostgreSQL itself, it's built in
  /usr/local/src/postgresql-8.x,
  installed to /usr/local/pgsql and then packaged up from there.

 Perhaps you should build it in the directory that the installer suggests
 as default for the installation? That way, most of the installations
 will work out of the box. If you should try that, be careful to use the
 Windows short representation (i.e. C:/Progra~1/PostgreSQL). Whitespace
 doesn't play well with most makefiles.

 You can't because we build it under Msys which uses the Unix style
 paths. GCC has a tendency to barf horribly on Windows style paths as
 well, as we found when trying to use pg_config with Slony.

 I know little about pgxs, so if you can detail what settings need to
 be hacked during installation in the bug you opened I'd appreciate it. It
 should be relatively trivial to rewrite in a custom action.

 Regards, Dave

It's two things, basically. One I think the installer team can fix; the 
other is IMHO an error in Makefile.global.in and should be fixed there.


The first thing is the lib/pgxs/src/Makefile.global that is installed. 
It contains two paths that need to change: the abs_top_srcdir and the 
prefix. In my patched file I use these settings:


abs_top_srcdir = /c/Progra~1/PostgreSQL/8.1-beta2/src/postgresql-8.1beta2
prefix := /c/Progra~1/PostgreSQL/8.1-beta2

That seems to work fine. The second problem is that when I made that 
change, I also had to change code that appends 'postgresql' to all paths 
unless the path contains the word 'pgsql' or 'postgres'. A comment 
states that this is to avoid 'directory clutter'. A better check would 
perhaps be to hardcode rules that explicitly avoid some well known 
directories (/usr/include etc.). That would give the user greater 
freedom to choose a prefix without this side-effect.


My temporary patch for this was to remove all lines starting with 'ifeq' 
or 'endif' between lines 66 and 102.


A very different solution to the whole problem would be to let the 
installer make extra checks when the development package is selected. It 
could for instance ask the user what directory he plans to use as the 
'/usr/local' (suggest 'C:\msys\local' perhaps?) and then use 'pgsql' in 
that directory as the location for the installation.


Regards,
Thomas Hallgren






---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] pgxs and pginstaller

2005-09-23 Thread Dave Page
 

 -Original Message-
 From: Thomas Hallgren [mailto:[EMAIL PROTECTED] 
 Sent: 23 September 2005 11:17
 To: Dave Page
 Cc: pgsql-hackers@postgresql.org
 Subject: Re: pgxs and pginstaller
 
 It's two things basically. One I think that the installer 
 team can fix, 
 the other is IMHO an error in the Makefile.global.in and 
 should be fixed 
 there.
 
 The first thing is the lib/pgxs/src/Makefile.global that is 
 installed. 
 It contains two paths that need to change. The 
 abs_top_srcdir and the 
 prefix. In my patched file I use these settings:
 
 abs_top_srcdir = 
 /c/Progra~1/PostgreSQL/8.1-beta2/src/postgresql-8.1beta2
 prefix := /c/Progra~1/PostgreSQL/8.1-beta2

Hmm, those are still msys paths which the installer doesn't know about.
Will it work with:

abs_top_srcdir = C:\\Program Files\\PostgreSQL\\8.1-beta2\\src\\postgresql-8.1beta2
prefix := C:\\Program Files\\PostgreSQL\\8.1-beta2

Those should be Windows and Msys friendly.

 That seems to work fine. The second problem is that when I made that 
 change, I also had to change code that appends 'postgresql' 
 to all paths 
 unless the path contains the word 'pgsql' or 'postgres'. A comment 
 states that this is to avoid 'directory clutter'. A better 
 check would 
 perhaps be to hardcode rules that explicitly avoid some well known 
 directories (/usr/include etc.). That would give the user greater 
 freedom to choose a prefix without this side-effect.
 
 My temporary patch for this was to remove all lines starting 
 with 'ifeq' 
 or 'endif' between lines 66 and 102.

I agree with your earlier comment - this should definitely be fixed in
PostgreSQL, not pgInstaller.

Regards, Dave.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


R: [HACKERS] Table Partitioning is in 8.1

2005-09-23 Thread Paolo Magnoli
Hi, I seem to recall that in Oracle you load into specific partitions
without specifically naming them in insert statements (in other words you
insert into table, the engine redirects data to the corresponding
partition), I quickly looked at postgresql partitioning and it seems to me
that you need to insert into a specified derived table (unless you set up
rules).
It would be good to have an insert behaviour similar to Oracle by default.
Also I see that the original table is always scanned, partition exclusion
happens only on the derived tables, is this correct?
Regards

paolo

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Simon Riggs
 Sent: Friday, 23 September 2005 11:51
 To: Jim C. Nasby
 Cc: Tom Lane; pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] Table Partitioning is in 8.1


 On Thu, 2005-09-22 at 14:37 -0500, Jim C. Nasby wrote:
  On Thu, Sep 22, 2005 at 10:11:50AM +0100, Simon Riggs wrote:
   On Wed, 2005-09-21 at 15:39 -0400, Tom Lane wrote:
Simon Riggs [EMAIL PROTECTED] writes:
 Is it possible that the Release Notes do not fully explain the
 Constraint Exclusion feature? Or is it the consensus that
 it works but
 not quite well enough to make a song and dance about yet?
   
I hardly think that the existing constraint-exclusion code
 is enough for
us to claim we support table partitioning.  There's too much grunt
work that the DBA still has to do to set up a partitioning
 arrangement.
  
   So you think the DBA can do partitioning? Good.
  
   Setting up partitioning in Oracle or SQLServer2005 requires lots of
   syntax and multiple commands. There are fewer commands with PostgreSQL
   and they are ISO/ANSI compliant also.
 
  Actually, IIRC it takes 2 commands; one to initially setup the
  partitioning and one to create new partitions as needed. 3 commands if
  you count DROP PARTITON.

 Which is what it takes in PostgreSQL.

  It's been a while since I looked at what you've done, but I seem to
  recall needing to manually maintain rules every time you create a new
  partition.

 All current implementations I am aware of require Declarative statements
 each time a new partition is started.

 But there is no *need* to create RULEs. You would need to do that if you
 wanted automatic routing of INSERT statements. The current PostgreSQL
 implementation does not do this, and yes, adding it would require creating
 a set of RULE statements, though that could be done automatically with a
 simple script. My experience with Oracle has been
 that one loads into specifically identified partitions, so the loading
 case for PostgreSQL is essentially identical to the common case for
 Oracle. My experience may not be everybody's, so I may be off-base on
 this, but I've not put a priority on solving that issue. Do people
 consider that aspect essential to the wider use of partitioning?

 I don't consider CE to be complete, any more than any newly introduced
 major feature set, but IMHO the important first order functionality is
 now available in PostgreSQL.

 I'm looking at probably Tuesday to submit first draft docs.

 Best Regards, Simon Riggs





 ---(end of broadcast)---
 TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: R: [HACKERS] Table Partitioning is in 8.1

2005-09-23 Thread Simon Riggs
On Fri, 2005-09-23 at 12:30 +0200, Paolo Magnoli wrote:

 It would be good to have an insert behaviour similar to Oracle by default.

OK, thanks.

 Also I see that the original table is always scanned, partition exclusion
 happens only on the derived tables, is this correct?

Yes, though if you avoid placing any rows in that table it is quick.

I expect to remove that restriction in 8.2 by declarative SQL.

Best Regards, Simon Riggs



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[HACKERS] stack depth limit exceeded problem.

2005-09-23 Thread Thomas Hallgren

Hi,
I have a problem with PL/Java that, if it's going to have a good 
solution, requires your help.


PL/Java runs a JVM. Since a JVM is multi threaded, PL/Java goes to 
fairly extreme measures to ensure that only one thread at a time can 
access the backend. So far, this has worked well but there is one small 
problem. Here's a use-case:


Someone loads a library that contains a method that spawns a new thread. 
That thread is the first to access some class. The class loader will now 
make an attempt to load it. PL/Java uses SPI to load classes so a call 
is made to SPI. This call is not made from the main thread that 
originally called the PL/Java function. That thread is suspended at this 
point.


Now, the check_stack_depth() in postgres.c is called. The new thread has 
a stack of its own of course, so it fails.


I know that multi threading is very controversial and I'm in no way 
asking that the backend should support it. What I would like is a 
workaround for my problem. The easiest way would be if I could change 
the stack_base_ptr temporarily when this happens, a try/catch that kicks 
in when I detect a call from a thread other than main. The only other 
solution is to set the max_stack_depth to a ridiculously high value and 
effectively turn stack checking off. I don't want to do that.
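
As a rough sketch of that workaround (assuming stack_base_ptr remains an 
exported global as it is today; the helper itself is hypothetical, not 
an existing backend API):

#include "postgres.h"
#include "miscadmin.h"

/* hypothetical helper: run fn() from a foreign thread while
 * check_stack_depth() measures against that thread's stack */
static void
call_from_foreign_thread(void (*fn)(void *), void *arg)
{
    char   *save_base = stack_base_ptr;  /* main thread's stack base */
    char    local;                       /* approximates this thread's base */

    stack_base_ptr = &local;
    fn(arg);
    stack_base_ptr = save_base;          /* restore before returning */
}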


Any opinions on this?

Kind regards,
Thomas Hallgren


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] What has happened to pgxs?

2005-09-23 Thread Michael Paesold

Thomas Hallgren wrote:


Sorry, that conclusion was wrong. What happens is:
1. I change the prefix in Makefile.global to say, 
C:/Progra~1/PostgreSQL/8.1-beta2 (this is the default for the installer).

2. I compile.

That triggers the append of 'postgresql' on all directories since my 
install location *does not* contain the word 'postgres' or 'pgsql'.

...

Thomas Hallgren wrote:
There's another issue with the Makefile.global.in that I feel should be 
addressed here. The file contains a lot of entries like:


ifeq "$(findstring pgsql, $(pkgincludedir))" ""
ifeq "$(findstring postgres, $(pkgincludedir))" ""
override pkgincludedir := $(pkgincludedir)/postgresql
endif
endif


Bruce, others, could this comparison at least be made case-insensitive, so 
that it finds PostgreSQL and does not append postgresql in that 
case?


That would be the least invasive fix for the Windows case, I guess, where 
the default installation directory contains PostgreSQL.


Best Regards,
Michael Paesold 



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] What has happened to pgxs?

2005-09-23 Thread Tom Lane
Thomas Hallgren [EMAIL PROTECTED] writes:
 There's another issue with the Makefile.global.in that I feel should be 
 addressed here. The file contains a lot of entries like:

 ifeq "$(findstring pgsql, $(pkgincludedir))" ""
 ifeq "$(findstring postgres, $(pkgincludedir))" ""
 override pkgincludedir := $(pkgincludedir)/postgresql
 endif
 endif

 Guess what happens if the install location itself contains the string 
 postgres?

That's the way it's supposed to work.  The point of this code is just
to not dump the install files directly into common directories like
/usr/local/include.  If postgres appears anywhere in the string then
it's obviously not a common directory.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Releasing memory during External sorting?

2005-09-23 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 If not, I would propose that when we move from qsort to tapesort mode we
 free the larger work_mem setting (if one exists) and allocate only a
 lower, though still optimal setting for the tapesort. That way the
 memory can be freed for use by other users or the OS while the tapesort
 proceeds (which is usually quite a while...).

On most platforms it's quite unlikely that any memory would actually get
released back to the OS before transaction end, because the memory
blocks belonging to the tuplesort context will be intermixed with blocks
belonging to other contexts.  So I think this is pretty pointless.
(If you can't afford to have the sort using all of sort_mem, you've set
sort_mem too large, anyway.)
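
A tiny standalone C illustration of that point, assuming a brk-based 
malloc (allocators that mmap large blocks behave differently):

#include <stdlib.h>

int main(void)
{
    char *low  = malloc(1024);              /* small block, low address */
    char *big  = malloc(64 * 1024 * 1024);  /* sort-sized block */
    char *high = malloc(1024);              /* small block above it */

    /* 'big' goes back to malloc's free list, but the break cannot drop
     * below 'high', so the process footprint is unchanged until the
     * blocks above it are freed too. */
    free(big);

    free(low);
    free(high);
    return 0;
}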

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] pgxs and pginstaller

2005-09-23 Thread Tom Lane
Dave Page dpage@vale-housing.co.uk writes:
 My temporary patch for this was to remove all lines starting 
 with 'ifeq' 
 or 'endif' between lines 66 and 102.

 I agree with your earlier comment - this should definitely be fixed in
 PostgreSQL, not pgInstaller.

It's not broken, and I've seen no argument as to why we ought to change it.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] stack depth limit exceeded problem.

2005-09-23 Thread Tom Lane
Thomas Hallgren [EMAIL PROTECTED] writes:
 Someone loads a library that contains a method that spawns a new thread. 

They already broke the backend when they did that.  max_stack_depth is
just the tip of the iceberg.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] What has happened to pgxs?

2005-09-23 Thread Tom Lane
Michael Paesold [EMAIL PROTECTED] writes:
 Bruce, others, could this comparison at least be made case-insensitive, so 
 that it finds PostgreSQL and does not append postgresql in that 
 case?

We could certainly add PostgreSQL to the set of checked-for strings,
but...

 That would be the least invasive fix for the Windows case, I guess, where 
 the default installation directory contains PostgreSQL.

It does?  Dave just told us that the standard installer package is built
to install into /usr/local/pgsql.  So I'm not seeing where the complaint
is coming from.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] pgxs and pginstaller

2005-09-23 Thread Thomas Hallgren

Tom Lane wrote:

 Dave Page dpage@vale-housing.co.uk writes:
  My temporary patch for this was to remove all lines starting with 'ifeq'
  or 'endif' between lines 66 and 102.

  I agree with your earlier comment - this should definitely be fixed in
  PostgreSQL, not pgInstaller.

 It's not broken, and I've seen no argument as to why we ought to change it.

Well, at least make it case insensitive so that a location containing 
PostgreSQL doesn't cause this behavior.


But quite frankly, wouldn't it be much better if the rule(s) were based 
on known common locations rather than just assuming that a directory is 
a common one if it doesn't contain the string postgres? You are limiting 
the valid settings of the 'prefix' option quite a bit at present. Why?


Regards,
Thomas Hallgren



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] stack depth limit exceeded problem.

2005-09-23 Thread Thomas Hallgren

Tom Lane wrote:

 Thomas Hallgren [EMAIL PROTECTED] writes:
  Someone loads a library that contains a method that spawns a new thread.

 They already broke the backend when they did that.  max_stack_depth is
 just the tip of the iceberg.


I knew I'd get a response like that from you :-)

Why is the backend broken? There's no concurrency issue. Only one thread 
is executing.


Regards,
Thomas Hallgren



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] pgxs and pginstaller

2005-09-23 Thread Dave Page
 

 -Original Message-
 From: Tom Lane [mailto:[EMAIL PROTECTED] 
 Sent: 23 September 2005 15:16
 To: Dave Page
 Cc: Thomas Hallgren; pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] pgxs and pginstaller 
 
 Dave Page dpage@vale-housing.co.uk writes:
  My temporary patch for this was to remove all lines starting 
  with 'ifeq' 
  or 'endif' between lines 66 and 102.
 
  I agree with your earlier comment - this should definitely 
 be fixed in
  PostgreSQL, not pgInstaller.
 
 It's not broken, and I've seen no argument as to why we ought 
 to change it.

Because it adds '/postgresql' to the path on Windows when it shouldn't,
as Thomas pointed out.

Regards, Dave.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] What has happened to pgxs?

2005-09-23 Thread Dave Page
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Tom Lane
 Sent: 23 September 2005 15:24
 To: Michael Paesold
 Cc: Thomas Hallgren; pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] What has happened to pgxs? 
 
 Michael Paesold [EMAIL PROTECTED] writes:
  Bruce, others, could this comparison be made 
 case-insensitive at least, so 
  that it at least finds PostgreSQL and does not append 
 postgresql in that 
  case?
 
 We could certainly add PostgreSQL to the set of checked-for strings,
 but...
 
  That would be the least invasive fix for the Windows case, 
 I guess, where 
  the default installation directory contains PostgreSQL.
 
 It does?  Dave just told us that the standard installer 
 package is built
 to install into /usr/local/pgsql.  So I'm not seeing where 
 the complaint
 is coming from.

No, I said it's built into the installer from /usr/local/pgsql (the path
in the msys dev environment). It actually installs into C:\Program
Files\PostgreSQL\8.X which is where users run it from.

Regards, Dave

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] What has happened to pgxs?

2005-09-23 Thread Tom Lane
Dave Page dpage@vale-housing.co.uk writes:
 It does?  Dave just told us that the standard installer 
 package is built to install into /usr/local/pgsql.

 No, I said it's built into the installer from /usr/local/pgsql (the path
 in the msys dev environment). It actually installs into C:\Program
 Files\PostgreSQL\8.X which is where users run it from.

Hmm ... so the real issue is that pgxs sees the installation directory
as named differently from what it was named during backend build.  OK,
that makes this behavior a problem, considering that we nominally
support being able to relocate installations.  Not sure what to do
about it though.  Perhaps pgxs should be interrogating pg_config for
the various path names instead of assuming it can recompute them?
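
Something along these lines, as an illustrative sketch (the switches 
exist in pg_config today; the variable assignments are invented):

PG_CONFIG ?= pg_config
bindir     := $(shell $(PG_CONFIG) --bindir)
includedir := $(shell $(PG_CONFIG) --includedir)
pkglibdir  := $(shell $(PG_CONFIG) --pkglibdir)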

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] What has happened to pgxs?

2005-09-23 Thread Dave Page
 

 -Original Message-
 From: Tom Lane [mailto:[EMAIL PROTECTED] 
 Sent: 23 September 2005 15:48
 To: Dave Page
 Cc: Michael Paesold; Thomas Hallgren; pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] What has happened to pgxs? 
 
 Dave Page dpage@vale-housing.co.uk writes:
  It does?  Dave just told us that the standard installer 
  package is built to install into /usr/local/pgsql.
 
  No, I said it's built into the installer from 
 /usr/local/pgsql (the path
  in the msys dev environment). It actually installs into C:\Program
  Files\PostgreSQL\8.X which is where users run it from.
 
 Hmm ... so the real issue is that pgxs sees the installation directory
 as named differently from what it was named during backend build.  OK,
 that makes this behavior a problem, considering that we nominally
 support being able to relocate installations.  Not sure what to do
 about it though.  Perhaps pgxs should be interrogating pg_config for
 the various path names instead of assuming it can recompute them?

That would be a definite improvement; however, it may well run into the
whitespace issues that Thomas mentioned - apparently the makefiles need
short Windows filenames if there are any spaces in them - e.g., instead
of:

C:/Program Files/PostgreSQL

We need

C:/Progra~1/PostgreSQL

The GetShortPathName() API should do this, though some reversing of the
/'s might be required first (and if so, they'll need to be flipped back
again afterwards). I'm tied up with other stuff right now though so I
can't really look atm.
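
An untested Win32 sketch of that conversion (GetShortPathNameA is the 
API call in question; everything else is illustrative):

#include <windows.h>
#include <stdio.h>

int main(void)
{
    char path[MAX_PATH] = "C:/Program Files/PostgreSQL";
    char shortpath[MAX_PATH];
    char *p;

    for (p = path; *p; p++)             /* flip to backslashes first */
        if (*p == '/') *p = '\\';

    if (GetShortPathNameA(path, shortpath, MAX_PATH) > 0)
    {
        for (p = shortpath; *p; p++)    /* and back again afterwards */
            if (*p == '\\') *p = '/';
        printf("%s\n", shortpath);      /* e.g. C:/PROGRA~1/PostgreSQL */
    }
    return 0;
}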

Regards, Dave.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Releasing memory during External sorting?

2005-09-23 Thread Simon Riggs
On Fri, 2005-09-23 at 10:09 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  If not, I would propose that when we move from qsort to tapesort mode we
  free the larger work_mem setting (if one exists) and allocate only a
  lower, though still optimal setting for the tapesort. That way the
  memory can be freed for use by other users or the OS while the tapesort
  proceeds (which is usually quite a while...).
 
 On most platforms it's quite unlikely that any memory would actually get
 released back to the OS before transaction end, because the memory
 blocks belonging to the tuplesort context will be intermixed with blocks
 belonging to other contexts.  So I think this is pretty pointless.

I take it you mean pointless because of the way the memory allocation
works, rather than because giving memory back isn't worthwhile?

Surely the sort memory would be allocated in contiguous chunks? In some
cases we might be talking about more than a GB of memory, so it'd be
good to get that back ASAP. I'm speculating...

 (If you can't afford to have the sort using all of sort_mem, you've set
 sort_mem too large, anyway.)

Sort takes care to allocate only what it needs as it starts up. All I'm
suggesting is to take the same care when the sort mode changes. If the
above argument held water then we would just allocate all the memory in
one lump at startup, because we can afford to, so I don't buy that. 

Since we know the predicted size of the sort set prior to starting the
sort node, could we not use that information to allocate memory
appropriately? i.e. if sort size is predicted to be more than twice the
size of work_mem, then just move straight to the external sort algorithm
and set the work_mem down at the lower limit?

That is, unless somebody has evidence that having a very large memory
has any performance benefit for external sorting?

Best Regards, Simon Riggs





---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Releasing memory during External sorting?

2005-09-23 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 Since we know the predicted size of the sort set prior to starting the
 sort node, could we not use that information to allocate memory
 appropriately? i.e. if sort size is predicted to be more than twice the
 size of work_mem, then just move straight to the external sort algorithm
 and set the work_mem down at the lower limit?

Have you actually read the sort code?

During the run-forming phase it's definitely useful to eat all the
memory you can: that translates directly to longer initial runs and
hence fewer merge passes.  During the run-merging phase it's possible
that using less memory would not hurt performance any, but as already
stated, I don't think it will actually end up cutting the backend's
memory footprint --- the sbrk point will be established during the run
forming phase and it's unlikely to move back much until transaction end.

Also, if I recall the development of that code correctly, the reason for
using more than minimum memory during the merge phase is that writing or
reading lots of tuples at once improves sequentiality of access to the
temp files.  So I'm not sure that cutting down the memory wouldn't hurt
performance.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: R: [HACKERS] Table Partitioning is in 8.1

2005-09-23 Thread Bricklen Anderson
Paolo Magnoli wrote:
 Hi, I seem to recall that in Oracle you load into specific partitions
 without specifically naming them in insert statements (in other words you
 insert into table, the engine redirects data to the corrisponding
 partition), 

This is correct

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Releasing memory during External sorting?

2005-09-23 Thread Pailloncy Jean-Gerard
 On most platforms it's quite unlikely that any memory would actually get
 released back to the OS before transaction end, because the memory
 blocks belonging to the tuplesort context will be intermixed with blocks
 belonging to other contexts.  So I think this is pretty pointless.
 (If you can't afford to have the sort using all of sort_mem, you've set
 sort_mem too large, anyway.)

On OpenBSD 3.8, malloc uses mmap(2) and no longer sbrk.
So, as soon as a block is freed, it returns to the OS.
Access to the freed pointer crashes immediately.

Cordialement,
Jean-Gérard Pailloncy


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] PgAdmin3 1.2.2 Install on Win32 Disrupts Pg Service

2005-09-23 Thread Lee, Patricia S.
System: Windows XP Pro SP2 
I have been running Postgres 8.0.3 for several months with the embedded 
PgAdmin3 1.2.1 on the same machine with no problems. Postgres is configured to
start automatically as a service. I installed the PgAdmin3 1.2.2 which
requested a Windows reboot. I rebooted without stopping Postgres and
then couldn't restart the Postgres service. Manual restarts with pg_ctl
and postmaster commands complained that the user must not have admin
privilege. So, I changed the user's group to non-privileged status and
uninstalled PgAdmin3 1.2.2. Now, the Postgres service starts and runs,
PgAdmin3 1.2.1 works (and PgAdmin3 1.2.2 works remotely) but pg_ctl
status claims there's no postmaster nor postgres running. What might the
installation of PgAdmin3 1.2.2 have done to upset this installation?
What shall I try next to restore normal pg_ctl activity on the host?

Thanks for your consideration. 

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] 2 forks for md5?

2005-09-23 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian pgman@candle.pha.pa.us writes:
  Yea, we could do that, but does it make sense to downgrade the
  connection message, especially since the connection authorized message
  doesn't contain the hostname.  We would have to add the host name to the
  connection authorized message and at that point there is little need
  for the connection received message.
 
 The connection-authorized message could be made to carry all the info
 for the normal successful-connection case, but for connection failures
 (not only bad password, but any other startup failure) it isn't going
 to help.  So on reflection I think we'd better keep the
 connection-received message --- else we'd have to add the equivalent
 info to all the failure-case messages.
 
 I'm coming to agree with Andrew that a documentation patch might be the
 best answer.  But where to put it ... under the description of the
 log_connections GUC var?

I am thinking we should wait for someone else to notice the double log
entries before mentioning it in the docs.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [GENERAL] 8.1 observation

2005-09-23 Thread Bruce Momjian
Tom Lane wrote:
 Tony Caduto [EMAIL PROTECTED] writes:
  I just noticed that pronargs in pg_proc does not show the full arg 
  count, seems only to show count of IN args.
  shouldn't this show the full arg count including in/out/inout?
 
 There was some discussion of that just a day or so ago; so far no one's
 come up with a reasonable suggestion for what the output should look like.

Is this a TODO?  You don't really pass the OUT parameters as parameters
to the function, so the current behavior seems fine to me.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [GENERAL] 8.1 observation

2005-09-23 Thread Tom Lane
Bruce Momjian pgman@candle.pha.pa.us writes:
 Tom Lane wrote:
 Tony Caduto [EMAIL PROTECTED] writes:
 I just noticed that pronargs in pg_proc does not show the full arg 
 count, seems only to show count of IN args.
 shouldn't this show the full arg count including in/out/inout?
 
 There was some discussion of that just a day or so ago; so far no one's
 come up with a reasonable suggestion for what the output should look like.

 Is this a TODO?  You don't really pass the OUT parameters as parameters
 to the function, so the current behavior seems fine to me.

It's not really fine, because the only info you see about the result
type is record, which is less detail than you should get (or be able
to get, anyway --- perhaps only \df+ need show the OUT parameters).

A related gripe is that \df doesn't show parameter names, which is a
pretty big loss from a documentation standpoint.

The hard part is fitting all that info into a tabular display.
\df output already tends to exceed the width of a terminal window ...

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] 64-bit API for large objects

2005-09-23 Thread Jeremy Drake
This patch implements the ability for large objects to be larger than 2GB.
I believe the limit to now be about 2TB, based on the fact that the large
object page size is 2048 bytes, and the page number is still 32 bits.

There are a few things about this patch which probably require tweaking or
at least a closer inspection from the list.

1) The lo_*64 functions are added to catalog/pg_proc.h (spacing/exact
location TBD atm) with OID set to 0; all other entries in this file have
OIDs explicitly defined.

2) The lo_*64 functions, in order to be convenient from the client end, have
counterparts added to libpq like the existing lo_* functions.  The client side
of libpq did not previously know anything about int64 or how to
send/receive them.  I added an include of postgres-fe.h (which according
to the comment in that file looks like it should go there) so int64 would
be defined, and also implemented functions (code mostly stolen from the server
libpq format functions for same) to convert them to/from network byte
order.  I did this in a somewhat inconsistent way between the get and put,
as I did not want to change the existing api at all, and existing code as
little as possible.
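
The conversion amounts to something like the following sketch (the 
helper name is hypothetical, not the patch's actual code):

#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* hypothetical helper: store a 64-bit value into buf in network byte
 * order as two 32-bit halves, since there is no standard htonll() */
static void
put_int64_net(char *buf, int64_t v)
{
    uint32_t hi = htonl((uint32_t) ((uint64_t) v >> 32));
    uint32_t lo = htonl((uint32_t) ((uint64_t) v & 0xffffffffU));

    memcpy(buf, &hi, 4);
    memcpy(buf + 4, &lo, 4);
}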

3) The 32 bit box I tested this on was a PII 300MHz laptop.  Not exactly
the fastest.  The test consisted entirely of making sure it compiled.
Perhaps someone with a fast IA32 box and spare cycles can test it?  Also,
so far the only platforms I have tried to compile this on have been:

* Linux 2.6 (gentoo), AMD64, gcc-3.4.4
* Solaris 8, SPARCv9, gcc-3.4.2
* Linux 2.6 (debian unstable), i686, gcc-3.4.x (laptop, don't remember
exact version).

Would probably be a good idea to verify this on other platforms as well,
or at least other compilers.

Hopefully I did not break anything too badly with this.  All of the
regression tests still pass after the patch, and I made a version of the
tests/examples/testlo which uses 64bit (in the patch) which works also.  I
grepped in the regression tests, and I could not find any usage of large
objects in them, which I found to be rather odd, which is why I used
testlo and my new testlo64 to test them instead.
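
As a client-side usage sketch, assuming the new calls mirror the existing 
lo_lseek()/lo_tell() signatures with a 64-bit offset (the exact prototypes 
are defined by the patch, so treat these as assumptions):

#include <libpq-fe.h>
#include <stdio.h>

/* seek a large object past the old 2GB limit (sketch only) */
static void
seek_past_2gb(PGconn *conn, int fd)
{
    long long target = 3LL * 1024 * 1024 * 1024;   /* 3 GB, > 2^31 */

    if (lo_lseek64(conn, fd, target, SEEK_SET) < 0)
        fprintf(stderr, "lo_lseek64 failed: %s", PQerrorMessage(conn));
    else
        printf("offset is now %lld\n", (long long) lo_tell64(conn, fd));
}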

On Tue, 20 Sep 2005, Jonah H. Harris wrote:

 Cool. We look forward to it.

 On 9/19/05, Mark Dilger [EMAIL PROTECTED] wrote:
 
  Jonah H. Harris wrote:
   Mark,
  
   If you don't mind contributing the changes, we'd be glad to take a look
   at them. Thanks.
  
   -Jonah
  
 
  Ok, we will post it back soon. We have tested it on two different 64-bit
  architectures (Sparc and AMD) and are now testing on pentium before
  posting up
  to the list.
 
  mark
 



 --
 Respectfully,

 Jonah H. Harris, Database Internals Architect
 EnterpriseDB Corporation
 http://www.enterprisedb.com/


-- 
Mere nonexistence is a feeble excuse for declaring a thing unseeable. You
*can* see dragons.  You just have to look in the right direction.
-- John Hasler

diff -Nur postgresql-8.0.3-orig/src/backend/libpq/be-fsstubs.c postgresql-8.0.3/src/backend/libpq/be-fsstubs.c
--- postgresql-8.0.3-orig/src/backend/libpq/be-fsstubs.c    2004-12-31 13:59:50.0 -0800
+++ postgresql-8.0.3/src/backend/libpq/be-fsstubs.c 2005-09-18 17:22:17.0 -0700
@@ -233,6 +233,34 @@
PG_RETURN_INT32(status);
 }
 
+
+Datum
+lo_lseek64(PG_FUNCTION_ARGS)
+{
+   int32   fd = PG_GETARG_INT32(0);
+   int64   offset = PG_GETARG_INT64(1);
+   int32   whence = PG_GETARG_INT32(2);
+   MemoryContext currentContext;
+   int64   status;
+
+   if (fd < 0 || fd >= cookies_size || cookies[fd] == NULL)
+   {
+   ereport(ERROR,
+   (errcode(ERRCODE_UNDEFINED_OBJECT),
+errmsg("invalid large-object descriptor: %d", fd)));
+   PG_RETURN_INT64(-1);
+   }
+
+   Assert(fscxt != NULL);
+   currentContext = MemoryContextSwitchTo(fscxt);
+
+   status = inv_seek64(cookies[fd], offset, whence);
+
+   MemoryContextSwitchTo(currentContext);
+
+   PG_RETURN_INT64(status);
+}
+
 Datum
 lo_creat(PG_FUNCTION_ARGS)
 {
@@ -283,6 +311,28 @@
PG_RETURN_INT32(inv_tell(cookies[fd]));
 }
 
+
+Datum
+lo_tell64(PG_FUNCTION_ARGS)
+{
+   int32   fd = PG_GETARG_INT32(0);
+
+   if (fd < 0 || fd >= cookies_size || cookies[fd] == NULL)
+   {
+   ereport(ERROR,
+   (errcode(ERRCODE_UNDEFINED_OBJECT),
+errmsg("invalid large-object descriptor: %d", fd)));
+   PG_RETURN_INT64(-1);
+   }
+
+   /*
+* We assume we do not need to switch contexts for inv_tell. That is
+* true for now, but is probably more than this module ought to
+* assume...
+*/
+   PG_RETURN_INT64(inv_tell(cookies[fd]));
+}
+
 Datum
 lo_unlink(PG_FUNCTION_ARGS)
 {
diff -Nur postgresql-8.0.3-orig/src/backend/storage/large_object/inv_api.c 

Re: [HACKERS] Releasing memory during External sorting?

2005-09-23 Thread Martijn van Oosterhout
On Fri, Sep 23, 2005 at 06:39:35PM +0200, Pailloncy Jean-Gerard wrote:
 On most platforms it's quite unlikely that any memory would actually
 get released back to the OS before transaction end, because the
 memory blocks belonging to the tuplesort context will be intermixed
 with blocks belonging to other contexts.  So I think this is pretty
 pointless. (If you can't afford to have the sort using all of
 sort_mem, you've set sort_mem too large, anyway.)

 On OpenBSD 3.8 malloc use mmap(2) and no more sbrk.
 So, as soon as the bloc is free, it returns to the OS.
 Access to the freed pointer crashs immediatly.

Interesting point. Glibc also uses mmap() but only for allocations
greater than a few K, otherwise it's a waste of space.

I guess you would have to look into the postgresql allocator to see if
it doesn't divide the mmap()ed space up between multiple contexts.
Large allocations certainly appear to be passed off to malloc() but I
don't think execSort allocates all its space in one go, it just counts
the space allocated by palloc().

So, unless someone goes and changes the tuplesort code to allocate
big blocks and use them only for tuples, I think you're going to run
into issues with data interleaved, meaning not much to give back to the
OS...
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.




Re: [HACKERS] Improved \df(+) in psql + backward-compatibility

2005-09-23 Thread Bruce Momjian

Seems this item will have to remain for 8.2.  I have added this to TODO:

o Display IN, INOUT, and OUT parameters in \df+

  It probably requires psql to output newlines in the proper
  column, which is already on the TODO list.


---

Tom Lane wrote:
 David Fetter [EMAIL PROTECTED] writes:
  I've noticed that \df doesn't do quite what it might when a function
  is created with named input parameters.  Please find enclosed a patch
  against CVS TIP that does this better.
 
 Meanwhile, getting back to the actual merits of the patch ... this is
 not right yet, because it will do the wrong thing when there are OUT
 parameters.  (The proargnames array includes both IN and OUT params,
 and you can't assume that proargnames and proargtypes have corresponding
 subscripts.)  It would probably be a good idea to discuss what display
 we want for a function with OUT parameters, anyway.  The strict columnar
 representation that \df currently uses doesn't scale very well :-(
 
   regards, tom lane
 
 ---(end of broadcast)---
 TIP 4: Have you searched our list archives?
 
http://archives.postgresql.org
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] pg_dump fails to set index ownership

2005-09-23 Thread Bruce Momjian
Tom Lane wrote:
 Michael Fuhr [EMAIL PROTECTED] writes:
  Regarding the removal of ALTER INDEX OWNER commands from pg_dump,
  indexes are now restored with the wrong ownership if the user doing
  the restore is different than the user who owned the original index
 
 pg_dump is not the source of the problem.  We should instead arrange
 that an index's relowner value is copied directly from the parent table
 during CREATE INDEX.  This is probably more important now with roles,
 since GetUserId() might not have a lot to do with the table's owner ID.

My testing indicated this is fixed.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] pg_dump fails to set index ownership

2005-09-23 Thread Bruce Momjian

Yep, testing confirms this is fixed.  Thanks for the report.

---

Michael Fuhr wrote:
 Regarding the removal of ALTER INDEX OWNER commands from pg_dump,
 indexes are now restored with the wrong ownership if the user doing
 the restore is different than the user who owned the original index
 (if this sounds familiar, I reported the same problem for 8.0.0rc4
 in January).  ALTER INDEX OWNER no longer works, and ALTER TABLE
 OWNER won't change the index ownership if the table ownership doesn't
 actually change (i.e., nothing happens if the new owner and the old
 owner are the same).  Should CREATE INDEX automatically set index
 ownership to be the same as the table ownership?  Or did I miss
 past discussion about that?
 
 Seems like this ought to be fixed before beta1 is announced so it
 doesn't bite people who are trying 8.1 for the first time.
 
 postgres=# CREATE ROLE test LOGIN PASSWORD 'test';
 CREATE ROLE
 postgres=# CREATE DATABASE test1;
 CREATE DATABASE
 postgres=# CREATE DATABASE test2;
 CREATE DATABASE
 postgres=# \c test1 test
 Password for user test: 
 You are now connected to database test1 as user test.
 test1= CREATE TABLE foo (id serial PRIMARY KEY, val text);   
 NOTICE:  CREATE TABLE will create implicit sequence foo_id_seq for serial 
 column foo.id
 NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index foo_pkey for 
 table foo
 CREATE TABLE
 test1= CREATE INDEX foo_val_idx ON foo (val);
 CREATE INDEX
 test1= \q
 
 % pg_dump -U postgres test1 | psql -U postgres test2
 SET
 SET
 SET
 COMMENT
 SET
 SET
 SET
 CREATE TABLE
 ALTER TABLE
  setval 
 --------
       1
 (1 row)
 
 ALTER TABLE
 CREATE INDEX
 REVOKE
 REVOKE
 GRANT
 GRANT
 
 % psql -q -U test test2
 Password for user test: 
 test2= \d
List of relations
  Schema |Name|   Type   | Owner 
 ++--+---
  public | foo| table| test
  public | foo_id_seq | sequence | test
 (2 rows)
 
 test2= \di
 List of relations
  Schema |Name | Type  |  Owner   | Table 
 +-+---+--+---
  public | foo_pkey| index | postgres | foo
  public | foo_val_idx | index | postgres | foo
 (2 rows)
 
 test2= DROP INDEX foo_val_idx;
 ERROR:  must be owner of relation foo_val_idx
 test2= \c test2 postgres
 Password for user postgres: 
 You are now connected to database test2 as user postgres.
 test2=# ALTER INDEX foo_val_idx OWNER TO test;
 WARNING:  cannot change owner of index foo_val_idx
 HINT:  Change the ownership of the index's table, instead.
 ALTER INDEX
 test2=# ALTER TABLE foo OWNER TO test;
 ALTER TABLE
 test2=# \di
 List of relations
  Schema |Name | Type  |  Owner   | Table 
 +-+---+--+---
  public | foo_pkey| index | postgres | foo
  public | foo_val_idx | index | postgres | foo
 (2 rows)
 
 test2=# ALTER TABLE foo OWNER TO postgres;
 ALTER TABLE
 test2=# ALTER TABLE foo OWNER TO test;
 ALTER TABLE
 test2=# \di
   List of relations
  Schema |Name | Type  | Owner | Table 
 +-+---+---+---
  public | foo_pkey| index | test  | foo
  public | foo_val_idx | index | test  | foo
 (2 rows)
 
 -- 
 Michael Fuhr
 
 ---(end of broadcast)---
 TIP 2: Don't 'kill -9' the postmaster
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] Patching dblink.c to avoid warning about open transaction

2005-09-23 Thread Jonathan Beit-Aharon

Hi,
I'm not a member of this list (yet), so please
CC me on responses and
discussion.

The patch below seems to be a completion of work already
started: the boolean remoteTrFlag was already defined, and all
I had to add was its setting and two references.

I hope someone will find it useful,
Jonathan


--- dblink.c	Sat Jan  1 00:43:05 2005
+++ /home/jbeitaharon/dev/third/postgreSQL/contrib/dblink/dblink.c	Thu Sep 22 16:10:20 2005
@@ -329,12 +329,16 @@
 
 	if (!conn)
 		DBLINK_CONN_NOT_AVAIL;
+
+	if (rcon)
+		rcon->remoteTrFlag = (PQtransactionStatus(conn) != PQTRANS_IDLE);
 
-	res = PQexec(conn, "BEGIN");
-	if (PQresultStatus(res) != PGRES_COMMAND_OK)
-		DBLINK_RES_INTERNALERROR("begin error");
-
-	PQclear(res);
+	if ((!rcon) || (!(rcon->remoteTrFlag))) {
+		res = PQexec(conn, "BEGIN");
+		if (PQresultStatus(res) != PGRES_COMMAND_OK)
+			DBLINK_RES_INTERNALERROR("begin error");
+		PQclear(res);
+	}
 
 	appendStringInfo(str, "DECLARE %s CURSOR FOR %s", curname, sql);
 	res = PQexec(conn, str->data);
@@ -424,12 +428,13 @@
 
 	PQclear(res);
 
-	/* commit the transaction */
-	res = PQexec(conn, "COMMIT");
-	if (PQresultStatus(res) != PGRES_COMMAND_OK)
-		DBLINK_RES_INTERNALERROR("commit error");
-
-	PQclear(res);
+	if ((!rcon) || (!(rcon->remoteTrFlag))) {
+		/* commit the transaction */
+		res = PQexec(conn, "COMMIT");
+		if (PQresultStatus(res) != PGRES_COMMAND_OK)
+			DBLINK_RES_INTERNALERROR("commit error");
+		PQclear(res);
+	}
 
 	PG_RETURN_TEXT_P(GET_TEXT("OK"));
 }




Re: [HACKERS] [PERFORM] Releasing memory during External sorting?

2005-09-23 Thread Ron Peacetree
From: Tom Lane [EMAIL PROTECTED]
Sent: Sep 23, 2005 2:15 PM
Subject: Re: [PERFORM] Releasing memory during External sorting? 

Mark Lewis [EMAIL PROTECTED] writes:
 operations != passes.  If you were clever, you could probably write a
 modified bubble-sort algorithm that only made 2 passes.  A pass is a
 disk scan, operations are then performed (hopefully in memory) on what
 you read from the disk.  So there's no theoretical log N lower-bound on
 the number of disk passes.

Given infinite memory that might be true, but I don't think I believe it
for limited memory.  If you have room for K tuples in memory then it's
impossible to perform more than K*N useful comparisons per pass (ie, as
each tuple comes off the disk you can compare it to all the ones
currently in memory; anything more is certainly redundant work).  So if
K < log N it's clearly not gonna work.

Actually, it's far better than that.  I recall a paper I saw in one of the
algorithms journals 15+ years ago that proved that if you knew the range
of the data, regardless of what that range was, and had n^2 space, you
could sort n items in O(n) time.

Turns out that with very modest constraints on the range of the data and
substantially less extra space (about the same as you'd need for
Replacement Selection + External Merge Sort), you can _still_ sort in
O(n) time.


It's possible that you could design an algorithm that works in a fixed
number of passes if you are allowed to assume you can hold O(log N)
tuples in memory --- and in practice that would probably work fine,
if the constant factor implied by the O() isn't too big.  But it's not
really solving the general external-sort problem.

If you know nothing about the data to be sorted and must guard against
the worst possible edge cases, AKA the classic definition of the general
external sorting problem,  then one can't do better than some variant
of Replacement Selection + Unbalanced Multiway Merge.

OTOH, ITRW things are _not_ like that.  We know the range of the data
in our DB fields or we can safely assume it to be relatively constrained.
This allows us access to much better external sorting algorithms.

For example Postman Sort (the 2005 winner of the PennySort benchmark)
is basically an IO optimized version of an external Radix Sort.


Ron

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PERFORM] Releasing memory during External sorting?

2005-09-23 Thread Ron Peacetree
From: Simon Riggs [EMAIL PROTECTED]
Sent: Sep 23, 2005 5:37 AM
Subject: [PERFORM] Releasing memory during External sorting?

I have concerns about whether we are overallocating memory for use in
external sorts. (All code relating to this is in tuplesort.c)

A decent external sorting algorithm, say a Merge Sort + Radix (or
Distribution Counting) hybrid with appropriate optimizations for small sub-
files, should become more effective / efficient the more RAM you give it. 


The external sort algorithm benefits from some memory but not much.

That's probably an artifact of the PostgreSQL external sorting code and _not_
due to some fundamental external sorting issue.


Knuth says that the amount of memory required is very low, with a value
typically less than 1 kB.

Required means the external sort can operate in that little memory.  How
much memory is required for optimal performance is another matter.


I/O overheads mean that there is benefit from having longer sequential
writes, so the optimum is much larger than that. I've not seen any data
that indicates that a setting higher than 16 MB adds any value at all to a 
large external sort.

It should.  A first-pass upper bound would be the amount of RAM needed for
Replacement Selection to create a run (i.e., sort) of the whole file.  That should
be ~ the amount of RAM to hold 1/2 the file in a Replacement Selection pass.

At the simplest, for any file over 32MB the optimum should be more than 
16MB.
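
For reference, here is a minimal sketch of Replacement Selection run
generation (illustrative C for this discussion, not PostgreSQL's tuplesort
code; each item carries a run tag so the heap orders by (run, value), and on
random input runs come out about twice the size of memory):

#include <stdio.h>
#include <stdlib.h>

typedef struct { int run; int value; } Item;

static int item_less(Item a, Item b)
{
	return a.run != b.run ? a.run < b.run : a.value < b.value;
}

static void sift_down(Item *h, size_t n, size_t i)
{
	for (;;) {
		size_t l = 2 * i + 1, r = l + 1, m = i;

		if (l < n && item_less(h[l], h[m])) m = l;
		if (r < n && item_less(h[r], h[m])) m = r;
		if (m == i) return;
		Item t = h[i]; h[i] = h[m]; h[m] = t;
		i = m;
	}
}

/* Reads ints from 'in' and writes sorted runs (with "run N:" headers)
 * to 'out', holding at most mem_items items in memory. */
void replacement_selection(FILE *in, FILE *out, size_t mem_items)
{
	Item *heap = malloc(mem_items * sizeof(Item));
	size_t n = 0;
	int v, cur_run = -1;

	while (n < mem_items && fscanf(in, "%d", &v) == 1)
		heap[n++] = (Item){0, v};
	for (size_t i = n; i-- > 0; )
		sift_down(heap, n, i);

	while (n > 0) {
		Item top = heap[0];

		if (top.run != cur_run) {	/* previous run ended */
			cur_run = top.run;
			fprintf(out, "run %d:\n", cur_run);
		}
		fprintf(out, "%d\n", top.value);
		if (fscanf(in, "%d", &v) == 1)
			/* a new item joins the current run only if it can still
			 * be emitted in order; otherwise defer it to the next run */
			heap[0] = (Item){ v >= top.value ? cur_run : cur_run + 1, v };
		else
			heap[0] = heap[--n];
		sift_down(heap, n, 0);
	}
	free(heap);
}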


 I have some indications from private tests that very high memory settings
may actually hinder performance of the sorts, though I cannot explain that
and wonder whether it is the performance tests themselves that have issues.

Hmmm.  Are you talking about amounts so high that you are throwing the OS
into paging and swapping thrash behavior?  If not, then the above is weird.


Does anyone have any clear data that shows the value of large settings
of work_mem when the data to be sorted is much larger than memory? (I am
well aware of the value of setting work_mem higher for smaller sorts, so
any performance data needs to reflect only very large sorts). 

This is not PostgreSQL specific, but it does prove the point that the
performance of external sorts benefits greatly from large amounts of RAM
being available:

http://research.microsoft.com/barc/SortBenchmark/

Looking at the particulars of the algorithms listed there should shed a lot
of light on what a good external sorting algorithm looks like:
1= HD IO matters the most.
 1a= Seeking behavior is the largest factor in poor performance.
2= No optimal external sorting algorithm should use more than 2 passes.
3= Optimal external sorting algorithms should use 1 pass if at all possible.
4= Use as much RAM as possible, and use it as efficiently as possible.
5= The amount of RAM needed to hide the latency of a HD subsystem goes up as
the _square_ of the difference between the bandwidth of the HD subsystem and
memory.
6= Be cache friendly.
7= For large numbers of records whose sorting key is substantially smaller
than the record itself, use a pointer + compressed key representation and
write the data to HD in sorted order (replace HD seeks with RAM seeks;
minimize RAM seeks).
8= Since your performance will be constrained by HD IO first and RAM IO
second, up to a point it is worth it to spend more CPU cycles to save on IO.

Given the large and growing gap between CPU IO, RAM IO, and HD IO, these issues
are becoming more important for _internal_ sorts as well.  


Feedback, please.

Best Regards, Simon Riggs

Hope this is useful,
Ron

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [PERFORM] Releasing memory during External sorting?

2005-09-23 Thread Ron Peacetree
Yep.  Also, bear in mind that the lg(n!) ~= n*lg(n) - n lower bound on
the number of comparisons:
a= says nothing about the amount of data movement used.
b= only holds for generic comparison-based sorting algorithms.

As Knuth says (vol 3, p180), Distribution Counting sorts without
ever comparing elements to each other at all, and so does Radix
Sort.  Similar comments can be found in many algorithms texts.
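
To make that concrete, here is a minimal distribution-counting sort in C (a
sketch for this discussion, assuming unsigned keys known to lie in
[0, range); this is not code from Knuth or PostgreSQL):

#include <stdlib.h>

/* Distribution Counting: sorts n keys from [0, range) in O(n + range)
 * time and performs no key-to-key comparisons at all. */
void counting_sort(const unsigned *in, unsigned *out, size_t n, unsigned range)
{
	size_t *count = calloc(range, sizeof(size_t));

	if (count == NULL)
		abort();
	for (size_t i = 0; i < n; i++)		/* histogram the keys */
		count[in[i]]++;
	for (unsigned k = 1; k < range; k++)	/* prefix sums: final end positions */
		count[k] += count[k - 1];
	for (size_t i = n; i-- > 0; )		/* stable placement, right to left */
		out[--count[in[i]]] = in[i];
	free(count);
}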

Any time we know that the range of the data to be sorted is substantially
restricted compared to the number of items to be sorted, we can sort in
less than O(lg(n!)) time.  DB fields tend to take on few values and are
therefore substantially restricted.

Given the proper resources and algorithms, O(n) sorts are very plausible
when sorting DB records.

All of the fastest external sorts of the last decade or so take advantage of
this.  Check out that URL I posted.

Ron


-Original Message-
From: Mark Lewis [EMAIL PROTECTED]
Sent: Sep 23, 2005 1:43 PM
To: Tom Lane [EMAIL PROTECTED]
Subject: Re: [PERFORM] Releasing memory during External sorting?

operations != passes.  If you were clever, you could probably write a
modified bubble-sort algorithm that only made 2 passes.  A pass is a
disk scan, operations are then performed (hopefully in memory) on what
you read from the disk.  So there's no theoretical log N lower-bound on
the number of disk passes.

Not that I have anything else useful to add to this discussion, just a
tidbit I remembered from my CS classes back in college :)

-- Mark

On Fri, 2005-09-23 at 13:17 -0400, Tom Lane wrote:
 Ron Peacetree [EMAIL PROTECTED] writes:
  2= No optimal external sorting algorithm should use more than 2 passes.
  3= Optimal external sorting algorithms should use 1 pass if at all possible.
 
 A comparison-based sort must use at least N log N operations, so it
 would appear to me that if you haven't got approximately log N passes
 then your algorithm doesn't work.
 
   regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [PERFORM] Releasing memory during External sorting?

2005-09-23 Thread Mark Lewis
operations != passes.  If you were clever, you could probably write a
modified bubble-sort algorithm that only made 2 passes.  A pass is a
disk scan, operations are then performed (hopefully in memory) on what
you read from the disk.  So there's no theoretical log N lower-bound on
the number of disk passes.

Not that I have anything else useful to add to this discussion, just a
tidbit I remembered from my CS classes back in college :)

-- Mark

On Fri, 2005-09-23 at 13:17 -0400, Tom Lane wrote:
 Ron Peacetree [EMAIL PROTECTED] writes:
  2= No optimal external sorting algorithm should use more than 2 passes.
  3= Optimal external sorting algorithms should use 1 pass if at all possible.
 
 A comparison-based sort must use at least N log N operations, so it
 would appear to me that if you haven't got approximately log N passes
 then your algorithm doesn't work.
 
   regards, tom lane
 
 ---(end of broadcast)---
 TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [PERFORM] Releasing memory during External sorting?

2005-09-23 Thread Dann Corbit
For the subfiles, load the top element of each subfile into a priority
queue.  Extract the min element and write it to disk.  If the next value
is the same, then the queue does not need to be adjusted.  If the next
value in the subfile changes, then adjust it.

Then, when the lowest element in the priority queue changes, adjust the
queue.

Keep doing that until the queue is empty.

You can create all the subfiles in one pass over the data.

You can read all the subfiles, merge them, and write them out in a
second pass (no matter how many of them there are).

Replacement selection is not a good idea any more, since obvious better
ideas should take over.  Longer runs are of no value if you do not have
to do multiple merge passes.

I have explained this general technique in the book C Unleashed,
chapter 13.

Sample code is available on the book's home page.
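
For illustration, here is a compact sketch of that single merge pass in C
(assumptions: the sorted runs are binary files of ints; this is a sketch for
this discussion, not the book's sample code):

#include <stdio.h>
#include <stdlib.h>

/* One heap entry per run: the run's current front value and its stream. */
typedef struct { int value; FILE *run; } HeapItem;

static void sift_down(HeapItem *h, size_t n, size_t i)
{
	for (;;) {
		size_t l = 2 * i + 1, r = l + 1, m = i;

		if (l < n && h[l].value < h[m].value) m = l;
		if (r < n && h[r].value < h[m].value) m = r;
		if (m == i) return;
		HeapItem t = h[i]; h[i] = h[m]; h[m] = t;
		i = m;
	}
}

/* Merge k sorted runs into 'out' in a single pass over the data. */
void merge_runs(FILE **runs, size_t k, FILE *out)
{
	HeapItem *heap = malloc(k * sizeof(HeapItem));
	size_t n = 0;

	if (heap == NULL)
		return;
	for (size_t i = 0; i < k; i++)	/* prime the queue with each run's head */
		if (fread(&heap[n].value, sizeof(int), 1, runs[i]) == 1)
			heap[n++].run = runs[i];
	for (size_t i = n; i-- > 0; )
		sift_down(heap, n, i);

	while (n > 0) {
		fwrite(&heap[0].value, sizeof(int), 1, out);	/* emit the minimum */
		if (fread(&heap[0].value, sizeof(int), 1, heap[0].run) == 1)
			sift_down(heap, n, 0);	/* refill from the same run */
		else {
			heap[0] = heap[--n];	/* run exhausted: shrink the heap */
			sift_down(heap, n, 0);
		}
	}
	free(heap);
}

Note that the "don't adjust the queue if the next value is the same"
refinement falls out for free here: sift_down returns immediately when the
heap property already holds.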

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:pgsql-hackers-
 [EMAIL PROTECTED] On Behalf Of Ron Peacetree
 Sent: Friday, September 23, 2005 11:41 AM
 To: Mark Lewis; Tom Lane; pgsql-hackers@postgresql.org; pgsql-
 [EMAIL PROTECTED]
 Subject: Re: [HACKERS] [PERFORM] Releasing memory during External
sorting?
 
 Yep.  Also, bear in mind that the lg(n!) ~= n*lg(n) - n lower bound on
 the number of comparisons:
 a= says nothing about the amount of data movement used.
 b= only holds for generic comparison-based sorting algorithms.
 
 As Knuth says (vol 3, p180), Distribution Counting sorts without
 ever comparing elements to each other at all, and so does Radix
 Sort.  Similar comments can be found in many algorithms texts.
 
 Any time we know that the range of the data to be sorted is
substantially
 restricted compared to the number of items to be sorted, we can sort
in
 less than O(lg(n!)) time.  DB fields tend to take on few values and
are
 therefore substantially restricted.
 
 Given the proper resources and algorithms, O(n) sorts are very
plausible
 when sorting DB records.
 
 All of the fastest external sorts of the last decade or so take
advantage
 of
 this.  Check out that URL I posted.
 
 Ron
 
 
 -Original Message-
 From: Mark Lewis [EMAIL PROTECTED]
 Sent: Sep 23, 2005 1:43 PM
 To: Tom Lane [EMAIL PROTECTED]
 Subject: Re: [PERFORM] Releasing memory during External sorting?
 
 operations != passes.  If you were clever, you could probably write a
 modified bubble-sort algorithm that only made 2 passes.  A pass is a
 disk scan, operations are then performed (hopefully in memory) on what
 you read from the disk.  So there's no theoretical log N lower-bound
on
 the number of disk passes.
 
 Not that I have anything else useful to add to this discussion, just a
 tidbit I remembered from my CS classes back in college :)
 
 -- Mark
 
 On Fri, 2005-09-23 at 13:17 -0400, Tom Lane wrote:
  Ron Peacetree [EMAIL PROTECTED] writes:
   2= No optimal external sorting algorithm should use more than 2
 passes.
   3= Optimal external sorting algorithms should use 1 pass if at all
 possible.
 
  A comparison-based sort must use at least N log N operations, so it
  would appear to me that if you haven't got approximately log N
passes
  then your algorithm doesn't work.
 
  regards, tom lane
 
 ---(end of
broadcast)---
 TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that
your
message can get through to the mailing list cleanly

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Patching dblink.c to avoid warning about open transaction

2005-09-23 Thread Tom Lane
Jonathan Beit-Aharon [EMAIL PROTECTED] writes:
 nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp; if (!conn)br
 nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;
  DBLINK_CONN_NOT_AVAIL;br
 +br
 +nbsp;nbsp;nbsp;nbsp;nbsp;nbsp; if (rcon)br
 +nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;
  rcon-gt;remoteTrFlag = (PQtransactionStatus(conn) !=
 PQTRANS_IDLE);br
 br
 -nbsp;nbsp;nbsp;nbsp;nbsp;nbsp; res = PQexec(conn, BEGIN);br
 -nbsp;nbsp;nbsp;nbsp;nbsp;nbsp; if (PQresultStatus(res) != 
 PGRES_COMMAND_OK)br

 [etc]

Could we see this in a less broken format?

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] ALTER ROLES - questions

2005-09-23 Thread Bruce Momjian

I see a lot of ALTER ROLE items not documented:

SYSID
IN ROLE
ROLE/USER
ADMIN

Is anyone working on documenting these?  I see these flags as used by
and documented in CREATE ROLE too.  Should they be disabled for ALTER
ROLE?

---

Stefan Kaltenbrunner wrote:
 Hi!
 
 I played around with roles a bit today and noticed some minor things:
 
 ALTER ROLE seems to support ALTER ROLE name ROLE name - but that
 form is not mentioned in the docs:
 
 playground=# CREATE ROLE myrole;
 CREATE ROLE
 playground=# CREATE ROLE myrole2;
 CREATE ROLE
 playground=# ALTER ROLE myrole ROLE myrole2;
 ALTER ROLE
 
 ALTER ROLE name IN ROLE name (undocumented but seems logical to try
 because CREATE ROLE supports that) seems to result in the following a
 bit cryptic error message:
 
 playground=# CREATE ROLE myrole;
 CREATE ROLE
 playground=# CREATE ROLE myrole2;
 CREATE ROLE
 playground=# ALTER ROLE myrole IN ROLE myrole2;
 ERROR:  option addroleto not recognized
 
 
 I understand that adding/removing role membership can be done by the
 means of GRANT/REVOKE but at least improving the error message(or
 stopping the parser from accepting that syntax) a bit would be nice :-)
 
 Stefan
 
 ---(end of broadcast)---
 TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Patching dblink.c to avoid warning about open transaction

2005-09-23 Thread Bruce Momjian
Tom Lane wrote:
 Jonathan Beit-Aharon [EMAIL PROTECTED] writes:
  nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp; if (!conn)br
  nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;
   DBLINK_CONN_NOT_AVAIL;br
 
 Could we see this in a less broken format?

Here is the patch in text format.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073
--- dblink.c	Sat Jan  1 00:43:05 2005
+++ /home/jbeitaharon/dev/third/postgreSQL/contrib/dblink/dblink.c	Thu Sep 22 16:10:20 2005
@@ -329,12 +329,16 @@
 
 	if (!conn)
 		DBLINK_CONN_NOT_AVAIL;
+
+	if (rcon)
+		rcon->remoteTrFlag = (PQtransactionStatus(conn) != PQTRANS_IDLE);
 
-	res = PQexec(conn, "BEGIN");
-	if (PQresultStatus(res) != PGRES_COMMAND_OK)
-		DBLINK_RES_INTERNALERROR("begin error");
-
-	PQclear(res);
+	if ((!rcon) || (!(rcon->remoteTrFlag))) {
+		res = PQexec(conn, "BEGIN");
+		if (PQresultStatus(res) != PGRES_COMMAND_OK)
+			DBLINK_RES_INTERNALERROR("begin error");
+		PQclear(res);
+	}
 
 	appendStringInfo(str, "DECLARE %s CURSOR FOR %s", curname, sql);
 	res = PQexec(conn, str->data);
@@ -424,12 +428,13 @@
 
 	PQclear(res);
 
-	/* commit the transaction */
-	res = PQexec(conn, "COMMIT");
-	if (PQresultStatus(res) != PGRES_COMMAND_OK)
-		DBLINK_RES_INTERNALERROR("commit error");
-
-	PQclear(res);
+	if ((!rcon) || (!(rcon->remoteTrFlag))) {
+		/* commit the transaction */
+		res = PQexec(conn, "COMMIT");
+		if (PQresultStatus(res) != PGRES_COMMAND_OK)
+			DBLINK_RES_INTERNALERROR("commit error");
+		PQclear(res);
+	}
 
 	PG_RETURN_TEXT_P(GET_TEXT("OK"));
 }

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] 64-bit API for large objects

2005-09-23 Thread Tom Lane
Jeremy Drake [EMAIL PROTECTED] writes:
 2) The lo_*64, in order to be convenient from the client end, have
 functions added to libpq as the existing lo_* functions.  The client side
 of libpq did not previously know anything about int64 or how to
 send/receive them.  I added an include of postgres-fe.h (which according
 to the comment in that file looks like it should go there) so int64 would
 be defined,

Unfortunately that's completely unacceptable from a namespace-pollution
point of view.

The real problem here is that int64 isn't a well-defined portable
datatype, and so it's going to be very hard to export these functions in
a way that won't break on different platforms, applications compiled
with a different compiler than libpq was, etc.

For that matter, we can't even guarantee that they work at all: not all
platforms even *have* int64 types.  We have so far avoided putting any
fundamental dependencies on int64 arithmetic into the system, and I'm a
bit worried that this patch will break LO support entirely on platforms
that don't have working int64 arithmetic.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [COMMITTERS] pgsql: Basic documentation for ROLEs.

2005-09-23 Thread Bruce Momjian

Added to TODO:

* Simplify dropping roles that have objects in several databases


---


Alvaro Herrera wrote:
 On Fri, Aug 12, 2005 at 09:53:59PM -0400, Bruce Momjian wrote:
  
  Can I have a TODO item for this?
 
 Something like ease dropping roles which have dependencies spilt over
 several databases ... ?
 
  Alvaro Herrera wrote:
   On Sat, Jul 30, 2005 at 12:19:41AM -0400, Bruce Momjian wrote:
   
I have just loaded the patches list with all outstanding patches that
need consideration, and updated the open items list:

http://momjian.postgresql.org/cgi-bin/pgpatches
http://momjian.postgresql.org/cgi-bin/pgopenitems
   
   The main shared dependency patch is applied.  I still owe a patch to
   implement DROP OWNED and REASSIGN OWNED, to drop or give away
   objects owned by a list of roles.
 
 -- 
 Alvaro Herrera (alvherre[a]alvh.no-ip.org)
 Major Fambrough: You wish to see the frontier?
 John Dunbar: Yes sir, before it's gone.
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] pg_dump fails to set index ownership

2005-09-23 Thread Michael Fuhr
On Fri, Sep 23, 2005 at 04:45:02PM -0400, Bruce Momjian wrote:
 Tom Lane wrote:
  Michael Fuhr [EMAIL PROTECTED] writes:
   Regarding the removal of ALTER INDEX OWNER commands from pg_dump,
   indexes are now restored with the wrong ownership if the user doing
   the restore is different than the user who owned the original index
  
  pg_dump is not the source of the problem.  We should instead arrange
  that an index's relowner value is copied directly from the parent table
  during CREATE INDEX.  This is probably more important now with roles,
  since GetUserId() might not have a lot to do with the table's owner ID.
 
 My testing indicated this is fixed.

Tom fixed it shortly after making that post:

http://archives.postgresql.org/pgsql-committers/2005-08/msg00347.php

-- 
Michael Fuhr

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] ALTER ROLES - questions

2005-09-23 Thread Tom Lane
Bruce Momjian pgman@candle.pha.pa.us writes:
 I see a lot of ALTER ROLE items not documented:

   SYSID
   IN ROLE
   ROLE/USER
   ADMIN

 Is anyone working on documenting these?

No, because they're not actually supported.  The grammar happens to
accept them because we use the same productions for CREATE ROLE and
ALTER ROLE, but they are not intended for use.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] ALTER ROLES - questions

2005-09-23 Thread Bruce Momjian

OK, I have added comments to gram.y to document what is happening.

ALTER ROLE ... ROLE is supported for ALTER GROUP.  The others, like IN
ROLE, are supported by CREATE, but when used by ALTER throw an error
printing their internal names, so it is kind of cryptic.  Not sure what
we should do to improve this.  Let's see if others complain.

---

Stefan Kaltenbrunner wrote:
 Hi!
 
 I played around with roles a bit today and noticed some minor things:
 
 ALTER ROLE seems to support ALTER ROLE name ROLE name - but that
 form is not mentioned in the docs:
 
 playground=# CREATE ROLE myrole;
 CREATE ROLE
 playground=# CREATE ROLE myrole2;
 CREATE ROLE
 playground=# ALTER ROLE myrole ROLE myrole2;
 ALTER ROLE
 
 ALTER ROLE name IN ROLE name (undocumented but seems logical to try
 because CREATE ROLE supports that) seems to result in the following a
 bit cryptic error message:
 
 playground=# CREATE ROLE myrole;
 CREATE ROLE
 playground=# CREATE ROLE myrole2;
 CREATE ROLE
 playground=# ALTER ROLE myrole IN ROLE myrole2;
 ERROR:  option addroleto not recognized
 
 
 I understand that adding/removing role membership can be done by the
 means of GRANT/REVOKE but at least improving the error message(or
 stopping the parser from accepting that syntax) a bit would be nice :-)
 
 Stefan
 
 ---(end of broadcast)---
 TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] 64-bit API for large objects

2005-09-23 Thread David Fetter
On Fri, Sep 23, 2005 at 05:40:09PM -0400, Tom Lane wrote:
 Jeremy Drake [EMAIL PROTECTED] writes:
 
 The real problem here is that int64 isn't a well-defined portable
 datatype, and so it's going to be very hard to export these
 functions in a way that won't break on different platforms,
 applications compiled with a different compiler than libpq was, etc.
 
 For that matter, we can't even guarantee that they work at all: not
 all platforms even *have* int64 types.  We have so far avoided
 putting any fundamental dependencies on int64 arithmetic into the
 system, and I'm a bit worried that this patch will break LO support
 entirely on platforms that don't have working int64 arithmetic.

What platforms that PG supports don't have int64 arithmetic?

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778

Remember to vote!

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] stack depth limit exceeded problem.

2005-09-23 Thread Oliver Jowett
Thomas Hallgren wrote:

 PL/Java runs a JVM. Since a JVM is multi-threaded, PL/Java goes to
 fairly extreme measures to ensure that only one thread at a time can
 access the backend. So far, this has worked well, but there is one small
 problem. [...]

I assume this means you have a single lock serializing requests to the
backend?

If you can't solve the depth-checking problem (Tom doesn't seem to like
the idea of multiple threads calling into the backend..), what about
turning the original thread (i.e. the main backend thread) into a
backend interface thread that does nothing but feed callbacks into the
backend on request? Then run all the user code in a separate thread that
passes backend requests to the interface thread rather than directly
executing them, as in the sketch below. If it starts extra threads which
make DB requests, the mechanism stays the same..
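
A bare-bones sketch of that interface-thread idea with POSIX threads (the
names and the single-slot design here are illustrative only, not PL/Java's
actual code; the mutex and condition variables are assumed to be initialized
with the usual PTHREAD_*_INITIALIZER macros):

#include <pthread.h>

/* One pending backend request at a time; user threads queue up on it. */
typedef struct {
	pthread_mutex_t lock;
	pthread_cond_t  posted, done;
	void          (*fn)(void *);	/* pending request, NULL when idle */
	void           *arg;
} Dispatcher;

/* Runs forever on the original backend thread: it alone touches the
 * backend, executing whatever callbacks the user threads hand it. */
void dispatcher_loop(Dispatcher *d)
{
	pthread_mutex_lock(&d->lock);
	for (;;) {
		while (d->fn == NULL)
			pthread_cond_wait(&d->posted, &d->lock);
		d->fn(d->arg);			/* all backend access happens here */
		d->fn = NULL;
		pthread_cond_broadcast(&d->done);
	}
}

/* Called from any user thread; blocks until the request has executed. */
void dispatcher_call(Dispatcher *d, void (*fn)(void *), void *arg)
{
	pthread_mutex_lock(&d->lock);
	while (d->fn != NULL)			/* wait for the request slot */
		pthread_cond_wait(&d->done, &d->lock);
	d->fn = fn;
	d->arg = arg;
	pthread_cond_signal(&d->posted);
	while (d->fn != NULL)			/* wait for our request to finish */
		pthread_cond_wait(&d->done, &d->lock);
	pthread_mutex_unlock(&d->lock);
}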

-O

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] PgAdmin3 1.2.2 Install on Win32 Disrupts Pg Service

2005-09-23 Thread Dave Page
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Lee, 
 Patricia S.
 Sent: 23 September 2005 17:46
 To: pgsql-hackers@postgresql.org; [EMAIL PROTECTED]
 Subject: [HACKERS] PgAdmin3 1.2.2 Install on Win32 Disrupts Pg Service
 
 System: Windows XP Pro SP2 
 Running Postgres 8.0.3 for several months with the embedded PgAdmin3
 1.2.1 on the same machine with no problems. Postgres is configured to
 start automatically as a service. I installed the PgAdmin3 1.2.2 which
 requested a Windows reboot. I rebooted without stopping Postgres and
 then couldn't restart the Postgres service. Manual restarts 
 with pg_ctl
 and postmaster commands complained that the user must not have admin
 privilege. So, I changed the user's group to non-privileged status and
 uninstalled PgAdmin3 1.2.2. Now, the Postgres service starts and runs,
 PgAdmin3 1.2.1 works (and PgAdmin3 1.2.2 works remotely) but pg_ctl
 status claims there's no postmaster nor postgres running. What might
 have the installation of PgAdmin3 1.2.2 done to upset this 
 installation?

Nothing that I can imagine. It ships with the same versions of the
shared DLLs as come with 8.0.3 anyway, and everything else installs in
its own directory and really doesn't care what else is installed. Even
if pgAdmin 1.2.2 installed a corrupted DLL for some odd reason,
uninstalling it again wouldn't have caused the old file to be restored,
so things would still be broken.

I suspect the pgAdmin installation is a red herring for a whole different
problem. You could try running the PostgreSQL 8.0.3 installer to repair
the installation, then upgrade the pgAdmin.exe file within the embedded
installation with the one from the 1.2.2 pgAdmin installer. 

Regards, Dave.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] 64-bit API for large objects

2005-09-23 Thread Tom Lane
Jeremy Drake [EMAIL PROTECTED] writes:
 On Fri, 23 Sep 2005, Tom Lane wrote:
 Unfortunately that's completely unacceptable from a namespace-pollution
 point of view.

 I don't quite understand.

postgres_fe.h defines a ton of stuff that has no business being
visible to libpq's client applications.  It's designed to be used by
our *own* client-side code (psql and the like), but we have not made
any attempt to keep it from defining stuff that would likely break
other peoples' code.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] 64-bit API for large objects

2005-09-23 Thread Jeremy Drake
I sent this from the wrong address last time so it did not go to the list,
I apologize to anyone who may be getting it again...

On Fri, 23 Sep 2005, Tom Lane wrote:

 Jeremy Drake [EMAIL PROTECTED] writes:
  2) The lo_*64, in order to be convenient from the client end, have
  functions added to libpq as the existing lo_* functions.  The client side
  of libpq did not previously know anything about int64 or how to
  send/receive them.  I added an include of postgres-fe.h (which according
  to the comment in that file looks like it should go there) so int64 would
  be defined,

 Unfortunately that's completely unacceptable from a namespace-pollution
 point of view.

I don't quite understand.  Allow me to cite the source, so we both are
referring to the same thing here...

[EMAIL PROTECTED] postgresql-8.0.3 $ head -n17 src/include/postgres_fe.h
/*-
 *
 * postgres_fe.h
 *Primary include file for PostgreSQL client-side .c files
 *
 * This should be the first file included by PostgreSQL client libraries and
 * application programs --- but not by backend modules, which should include
 * postgres.h.
 *
 *
 * Portions Copyright (c) 1996-2005, PostgreSQL Global Development Group
 * Portions Copyright (c) 1995, Regents of the University of California
 *
 * $PostgreSQL: pgsql/src/include/postgres_fe.h,v 1.10 2004/12/31 22:03:19 
pgsql Exp $
 *
 *-
 */

Now I may not completely understand the term "client", but I think libpq
is a client library and anything which may use it would be an application
program.  So it seems it was an oversight on the part of libpq not to
include it.  Does the term "client" not mean what I thought it did
(anything which connects to a PostgreSQL server)?


 The real problem here is that int64 isn't a well-defined portable
 datatype, and so it's going to be very hard to export these functions in
 a way that won't break on different platforms, applications compiled
 with a different compiler than libpq was, etc.

Umm, what wouldn't break if you switched compilers in a way that redefined
sizeof(things)?  I happen to know that even using the same compiler, just
changing a compile flag (-m64) which changes the sizes of integral types
(sizeof(long) from 32 to 64 bits), will make such actions stop working on
one of my tested platforms.  It sucks; I happen not to be fond of this
because I tend not to have every library on my box built for both, but it
is the way life is.  I do not know of a platform where the size of an
integral type can change and still be able to link against existing
libraries.  And if the size of some type is not changing, then things
should already be correctly set for the platform.  But I admit I have not
met every platform in existence.  Do you happen to be able to cite a
platform where this is the case?


 For that matter, we can't even guarantee that they work at all: not all
 platforms even *have* int64 types.  We have so far avoided putting any
 fundamental dependencies on int64 arithmetic into the system, and I'm a
 bit worried that this patch will break LO support entirely on platforms
 that don't have working int64 arithmetic.

They should in fact break gracefully on such platforms, or at least as
gracefully as any other int64-using code might.  I did check a couple of
places for #ifdef INT64_BROKEN (or whatever it is called) to make sure
that on those platforms something at least somewhat sane would happen
(they use 32 bits instead).  Also, on those platforms, you could always
use the non-64 versions if you were concerned about that.  The patches
would allow seeking past the old limit using the 32-bit function in stages
(seek 2G, seek 2G, seek 2G would put you at 6G) if you do not mind weird
return values and tell not working.  And if you use a platform which does
not support 64-bit integral types, then you cannot reasonably expect those
functions to work correctly anyway.  But they should compile, at least.


   regards, tom lane

 ---(end of broadcast)---
 TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match


-- 
I don't wanna argue, and I don't wanna fight,
But there will definitely be a party tonight...

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Releasing memory during External sorting?

2005-09-23 Thread Meir Maor
Calculating optimal memory for a disk-based sort is based only on minimizing IO.
A previous post stated we can merge as many subfiles as we want in a single
pass; this is not accurate, because we want to eliminate disk seeks in the
merge phase as well. The merging should be done by reading blocks of data
from each subfile: if we have data of size N and memory M, then we will have
K = N/M subfiles to merge after sorting each one.
In the merge operation, if we want to merge all blocks in one pass, we will
read M/K of data from each subfile into memory and begin merging, reading
another M/K block whenever the buffer from a subfile is empty.
We would like disk seek time to be irrelevant compared to sequential IO time.
We notice that we are performing IO in blocks of M/K = M/(N/M) = M^2/N.
Let us assume that sequential IO runs at 100MB/s and that a random seek
requires ~15ms. If we want seek time to be irrelevant by one order of
magnitude, then in the time of one random seek we can read 1.5MB of data,
and we get optimal performance if we perform IO in blocks of 15MB.
Since in the merge algorithm shown above we perform IO in blocks of M/K, we
would like M > K*15MB, which results in a very large memory requirement:
M^2 > N*15MB
M > sqrt(N*15MB)
For example, for sorting 10GB of data, we would like M > ~380MB
for optimal performance.

Alternatively, we can choose a different algorithm in which we merge only a
constant number of subfiles together at a time, but then we require multiple
passes to merge the entire file: log(K) passes over the entire data. This
approach obviously improves as memory increases.

The first approach requires 2 passes over the entire data and K^2 + K random
seeks; the second approach (when merging l blocks at a time) requires
log_l(K) passes over the data and K*l + K random seeks.

On 9/23/05, Simon Riggs [EMAIL PROTECTED] wrote:
 I have concerns about whether we are overallocating memory for use in
 external sorts. (All code relating to this is in tuplesort.c)

 When we begin a sort we allocate (work_mem | maintenance_work_mem) and
 attempt to do the sort in memory. If the sort set is too big to fit in
 memory we then write to disk and begin an external sort. The same memory
 allocation is used for both types of sort, AFAICS.

 The external sort algorithm benefits from some memory but not much.
 Knuth says that the amount of memory required is very low, with a value
 typically less than 1 kB. I/O overheads mean that there is benefit from
 having longer sequential writes, so the optimum is much larger than
 that. I've not seen any data that indicates that a setting higher than
 16 MB adds any value at all to a large external sort. I have some
 indications from private tests that very high memory settings may
 actually hinder performance of the sorts, though I cannot explain that
 and wonder whether it is the performance tests themselves that have
 issues.

 Does anyone have any clear data that shows the value of large settings
 of work_mem when the data to be sorted is much larger than memory? (I am
 well aware of the value of setting work_mem higher for smaller sorts, so
 any performance data needs to reflect only very large sorts).

 If not, I would propose that when we move from qsort to tapesort mode we
 free the larger work_mem setting (if one exists) and allocate only a
 lower, though still optimal setting for the tapesort. That way the
 memory can be freed for use by other users or the OS while the tapesort
 proceeds (which is usually quite a while...).

 Feedback, please.

 Best Regards, Simon Riggs

 ---(end of broadcast)---
 TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] 64-bit API for large objects

2005-09-23 Thread Jeremy Drake
On Fri, 23 Sep 2005, Tom Lane wrote:

 postgres_fe.h defines a ton of stuff that has no business being
 visible to libpq's client applications.  It's designed to be used by
 our *own* client-side code (psql and the like), but we have not made
 any attempt to keep it from defining stuff that would likely break
 other peoples' code.

So does this mean that there is a different, more advanced client library
(one more likely to break random other code) where this call would fit
better?  If so, I would be happy to change the patch to put it there.  I
did not see one, but I did not look very hard.

If not, what is a client-side programmer to do if they want to pass int64s
around?  Every client app has to basically write its own htonll (or
whatever you want to call it), perform its own detection of what type is a
64-bit int, and cache the OIDs for the fastcall interface itself?  There
seems to be a lot of overhead which libpq saves you from.  Or the client
program could detect the type, also detect a function which would
reasonably serve as an atoll on the platform,
snprintf(buf, 1024, "SELECT lo_seek64(%d, %lld, %d)", fh, offset,
SEEK_SET); exec the buf, check to see if any tuples came back, if so (get
the first column of the first tuple, call atoll on that) else handle
error, and in either case free the result?
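
For what it's worth, the hand-rolled htonll piece of that might look like
the following (a sketch assuming int64_t from <stdint.h>; this is not part
of libpq):

#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>	/* htonl() */

/* Big-endian (network order) encoding of a 64-bit value, built from two
 * htonl() calls so it behaves identically on any host byte order. */
static void encode_int64_be(int64_t v, unsigned char buf[8])
{
	uint64_t u = (uint64_t) v;
	uint32_t hi = htonl((uint32_t) (u >> 32));
	uint32_t lo = htonl((uint32_t) (u & 0xFFFFFFFFu));

	memcpy(buf, &hi, 4);
	memcpy(buf + 4, &lo, 4);
}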



In any case, are there any comments on the changes below libpq (the
functions visible to queries on down)?  I don't want to get hung up in the
client issues just to find out later that the server stuff was completely
insane anyway...  The client library seems to me to be less important
anyway.  If the server can support it, the client can always manage to do
it somehow, and then once the client lib can support it, it should be
fairly transparent to swap that out later, so that code using the
workaround could be updated without immediately breaking other code.

So that means that if I get good feedback on the server-side code, I could
start having people code to it using one of the workaround methods listed
above, and then if we manage to come up with some way which would be more
correct (if that is the right word) than the libpq hack I did, they could
gradually switch over to that (or use sed -i).

-- 
All that glitters has a high refractive index.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [PERFORM] Releasing memory during External sorting?

2005-09-23 Thread Tom Lane
Ron Peacetree [EMAIL PROTECTED] writes:
 2= No optimal external sorting algorithm should use more than 2 passes.
 3= Optimal external sorting algorithms should use 1 pass if at all possible.

A comparison-based sort must use at least N log N operations, so it
would appear to me that if you haven't got approximately log N passes
then your algorithm doesn't work.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PERFORM] Releasing memory during External sorting?

2005-09-23 Thread Tom Lane
Mark Lewis [EMAIL PROTECTED] writes:
 operations != passes.  If you were clever, you could probably write a
 modified bubble-sort algorithm that only made 2 passes.  A pass is a
 disk scan, operations are then performed (hopefully in memory) on what
 you read from the disk.  So there's no theoretical log N lower-bound on
 the number of disk passes.

Given infinite memory that might be true, but I don't think I believe it
for limited memory.  If you have room for K tuples in memory then it's
impossible to perform more than K*N useful comparisons per pass (ie, as
each tuple comes off the disk you can compare it to all the ones
currently in memory; anything more is certainly redundant work).  So if
K < log N it's clearly not gonna work.
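
Spelled out as a back-of-the-envelope bound (a restatement of the argument
above, with P passes, K tuples held in memory, and N tuples total):

\[
P \cdot K N \;\ge\; N \log_2 N
\quad\Longrightarrow\quad
P \;\ge\; \frac{\log_2 N}{K}
\]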

It's possible that you could design an algorithm that works in a fixed
number of passes if you are allowed to assume you can hold O(log N)
tuples in memory --- and in practice that would probably work fine,
if the constant factor implied by the O() isn't too big.  But it's not
really solving the general external-sort problem.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [PERFORM] Releasing memory during External sorting?

2005-09-23 Thread Dann Corbit
The cited book also explains how to use a callback function to perform
arbitrary radix sorts (you simply need a method that returns the
[bucketsize] most significant bits for a given data type, for the length
of the key).

So you can sort fairly arbitrary data in linear time (of course if the
key is long then O(n*log(n)) will be better anyway.)

But in any case, if we are talking about external sorting, then disk
time will be so totally dominant that the choice of algorithm is
practically irrelevant.
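
Such a callback-driven radix sort might be sketched as below (illustrative C
for this discussion, not the book's code; the callback maps an element to a
uint32 key, and we sort on 8-bit digits, least significant digit first):

#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint32_t (*key_fn)(const void *elem);

/* LSD radix sort on a 32-bit key supplied by 'key': four stable
 * counting passes over 8-bit digits. */
void radix_sort(void *base, size_t n, size_t size, key_fn key)
{
	unsigned char *a = base, *b = malloc(n * size);

	if (b == NULL)
		abort();
	for (int shift = 0; shift < 32; shift += 8) {
		size_t count[257] = {0};

		for (size_t i = 0; i < n; i++)	/* histogram this digit */
			count[((key(a + i * size) >> shift) & 0xFF) + 1]++;
		for (int d = 0; d < 256; d++)	/* prefix sums: start offsets */
			count[d + 1] += count[d];
		for (size_t i = 0; i < n; i++) {	/* stable scatter */
			size_t digit = (key(a + i * size) >> shift) & 0xFF;
			memcpy(b + count[digit]++ * size, a + i * size, size);
		}
		unsigned char *t = a; a = b; b = t;	/* swap buffers */
	}
	free(b);	/* four passes: the sorted data ended up back in 'base' */
}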

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:pgsql-hackers-
 [EMAIL PROTECTED] On Behalf Of Dann Corbit
 Sent: Friday, September 23, 2005 2:21 PM
 To: Ron Peacetree; Mark Lewis; Tom Lane; pgsql-hackers@postgresql.org;
 pgsql-performance@postgresql.org
 Subject: Re: [HACKERS] [PERFORM] Releasing memory during External
sorting?
 
 For the subfiles, load the top element of each subfile into a priority
 queue.  Extract the min element and write it to disk.  If the next
value
 is the same, then the queue does not need to be adjusted.  If the next
 value in the subfile changes, then adjust it.
 
 Then, when the lowest element in the priority queue changes, adjust
the
 queue.
 
 Keep doing that until the queue is empty.
 
 You can create all the subfiles in one pass over the data.
 
 You can read all the subfiles, merge them, and write them out in a
 second pass (no matter how many of them there are).
 
 Replacement selection is not a good idea any more, since obvious
better
 ideas should take over.  Longer runs are of no value if you do not
have
 to do multiple merge passes.
 
 I have explained this general technique in the book C Unleashed,
 chapter 13.
 
 Sample code is available on the book's home page.
 
  -Original Message-
  From: [EMAIL PROTECTED] [mailto:pgsql-hackers-
  [EMAIL PROTECTED] On Behalf Of Ron Peacetree
  Sent: Friday, September 23, 2005 11:41 AM
  To: Mark Lewis; Tom Lane; pgsql-hackers@postgresql.org; pgsql-
  [EMAIL PROTECTED]
  Subject: Re: [HACKERS] [PERFORM] Releasing memory during External
 sorting?
 
  Yep.  Also, bear in mind that the lg(n!) ~= n*lg(n) - n lower bound on
  the number of comparisons:
  a= says nothing about the amount of data movement used.
  b= only holds for generic comparison-based sorting algorithms.
 
  As Knuth says (vol 3, p180), Distribution Counting sorts without
  ever comparing elements to each other at all, and so does Radix
  Sort.  Similar comments can be found in many algorithms texts.
 
  Any time we know that the range of the data to be sorted is
 substantially
  restricted compared to the number of items to be sorted, we can sort
 in
  less than O(lg(n!)) time.  DB fields tend to take on few values and
 are
  therefore substantially restricted.
 
  Given the proper resources and algorithms, O(n) sorts are very
 plausible
  when sorting DB records.
 
  All of the fastest external sorts of the last decade or so take
 advantage
  of
  this.  Check out that URL I posted.
 
  Ron
 
 
  -Original Message-
  From: Mark Lewis [EMAIL PROTECTED]
  Sent: Sep 23, 2005 1:43 PM
  To: Tom Lane [EMAIL PROTECTED]
  Subject: Re: [PERFORM] Releasing memory during External sorting?
 
  operations != passes.  If you were clever, you could probably write
a
  modified bubble-sort algorithm that only made 2 passes.  A pass is a
  disk scan, operations are then performed (hopefully in memory) on
what
  you read from the disk.  So there's no theoretical log N lower-bound
 on
  the number of disk passes.
 
  Not that I have anything else useful to add to this discussion, just
a
  tidbit I remembered from my CS classes back in college :)
 
  -- Mark
 
  On Fri, 2005-09-23 at 13:17 -0400, Tom Lane wrote:
   Ron Peacetree [EMAIL PROTECTED] writes:
2= No optimal external sorting algorithm should use more than 2
  passes.
3= Optimal external sorting algorithms should use 1 pass if at
all
  possible.
  
   A comparison-based sort must use at least N log N operations, so
it
   would appear to me that if you haven't got approximately log N
 passes
   then your algorithm doesn't work.
  
 regards, tom lane
 
  ---(end of
 broadcast)---
  TIP 1: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that
 your
 message can get through to the mailing list cleanly
 
 ---(end of
broadcast)---
 TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that
your
message can get through to the mailing list cleanly

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings