[HACKERS] idea for concurrent seqscans

2005-02-25 Thread Jeff Davis
I had an idea that might improve parallel seqscans on the same relation.

If you have lots of concurrent seqscans going on a large relation, the
cache hit ratio is very low. But, if the seqscans are concurrent on the
same relation, there may be something to gain by starting a seqscan near
the page being accessed by an already-in-progress seqscan, and wrapping
back around to that start location. That would make some use of the
shared buffers, which would otherwise just be cache pollution.

I made a proof-of-concept implementation, which is entirely in heapam.c,
except for one addition to the HeapScanDesc struct in relscan.h. It is
not at all up to production quality; there are things I know that need
to be addressed. Basically, I just modified heapam.c to be able to start
at any page in the relation. Then, every time it reads a new page, I
have it mark the relation's oid and the page number in a shared mem
segment. Everytime a new scan is started, it reads the shared mem
segment, and if the relation's oid matches, it starts the scan at the
page number it found in the shared memory. Otherwise, it starts the scan
at 0.

There are a couple obvious issues, one is that my whole implementation
doesn't account for reverse scans at all (since initscan doesn't know
what direction the scan will move in), but that shouldn't be a major
problem since at worst it will be the current behavior (aside: can
someone tell me how to force reverse scans so I can test that better?).
Another is that there's a race condition with the shared mem, and that's
out of pure laziness on my part.

This method is really only effective at all if there is a significant
amount of disk i/o. If it's pulling the data from O/S buffers the
various scans will diverge too much and not be using eachother's shared
buffers.

I tested with shared_buffers=500 and all stats on. I used 60 threads
performing 30 seqscans each in my script ssf.rb (I refer to my
modification as sequential scan follower or ssf). 

Here are some results with my modifications:
$ time ./ssf.rb # my script

real4m22.476s
user0m0.389s
sys 0m0.186s

test=# select relpages from pg_class where relname='test_ssf';
 relpages
--
 1667
(1 row)

test=# select count(*) from test_ssf;
 count

 20
(1 row)

test=# select pg_stat_get_blocks_hit(17232) as hit,
pg_stat_get_blocks_fetched(17232) as total;
  hit   |  total
+-
 971503 | 3353963
(1 row)

Or, approx. 29% cache hit.

Here are the results without my modifications:

test=# select relpages from pg_class where relname='test_ssf';
 relpages
--
 1667
(1 row)

test=# select count(*) from test_ssf;
 count

 20
(1 row)

test=# select pg_stat_get_blocks_hit(17231) as hit,
pg_stat_get_blocks_fetched(17231) as total;
  hit   |  total
+-
 19 | 3353963
(1 row)

Or, approx. 6% cache hit. Note: the oid is different, because I have two
seperately initdb'd data directories, one for the modified version, one
for the unmodified 8.0.0.

This is the first time I've really modified the PG source code to do
anything that looked promising, so this is more of a question than
anything else. Is it promising? Is this a potentially good approach? I'm
happy to post more test data and more documentation, and I'd also be
happy to bring the code to production quality. However, before I spend
too much more time on that, I'd like to get a general response from a
3rd party to let me know if I'm off base.

Regards,
Jeff Davis

--- postgresql-8.0.0/src/backend/access/heap/heapam.c	2004-12-31 13:59:16.0 -0800
+++ postgresql-8.0.0-ssf/src/backend/access/heap/heapam.c	2005-02-24 20:37:24.596626668 -0800
@@ -65,6 +65,50 @@
  * 
  */
 
+
+#include sys/shm.h
+#include sys/types.h
+#include sys/ipc.h
+/*
+ Retrieves a location of an in-progress seqscan on relid
+ and returns the page location
+*/
+static BlockNumber get_page_loc(Oid relid) {
+  int shmid;
+  char *shm;
+  Oid shmrelid;
+  BlockNumber loc;
+  shmid = shmget(0x11aa55cc,(sizeof(Oid)+sizeof(BlockNumber)),
+		 0666|IPC_CREAT);
+  shm = shmat(shmid,NULL,0);
+  shmrelid = *((Oid*)shm);
+  loc = *((BlockNumber*)(shm+sizeof(Oid)));
+  shmdt(shm);
+  /*elog(NOTICE,Getting shm: %u %u,shmrelid,loc);*/
+  if (shmrelid == relid)
+return loc;
+  else
+return 0;
+}
+
+/*
+ Places (relid,loc) in a shared mem structure
+*/
+static int report_page_loc(Oid relid, BlockNumber loc) 
+{
+  int shmid;
+  char *shm;
+  shmid = shmget(0x11aa55cc,(sizeof(Oid)+sizeof(BlockNumber)),
+		 0666|IPC_CREAT);
+  shm = shmat(shmid,NULL,0);
+
+  /*elog(NOTICE,Setting shm: %u %u,relid,loc);*/
+  *((Oid*)shm) = relid;
+  *((BlockNumber*)(shm+sizeof(Oid))) = loc;
+  shmdt(shm);
+  return 0;
+}
+
 /* 
  *		initscan - scan code common to heap_beginscan and heap_rescan
  * 
@@ -85,6 +129,18 @@
 	scan-rs_ctup.t_data = NULL;
 	scan-rs_cbuf = 

Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-02-25 Thread John Hansen
 John Hansen wrote:
  currently, upper/lower does not work with 2+ byte unicode 
 characters, 
  on any OS under the C locale.
 
 Sure it does.  It's just that the defined behavior of the C 
 locale is often useless in practice.

select upper('æøå');
ERROR:  invalid multibyte character for locale
HINT:  The server's LC_CTYPE locale is probably incompatible with the database 
encoding.

Consequently it seems that is does not work.

... John

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-02-25 Thread Zeugswetter Andreas DAZ SD

 Are you verifying that all the data that was committed was actually stored? 
 Or
 just verifying that the database works properly after rebooting?
 
 I verified the data.

Does pg startup increase the xid by some amount (say 1000 xids) after crash ?
Else I think you would also need to rollback a range of xids after
the crash, to see if you don't loose data by reusing and rolling back xids.

The risk is datapages reaching the disk before WAL, because the disk rearranges.
I think you would not notice such corruption (with pg_dump) unless you do the
range rollback.

Andreas

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] UTF8 or Unicode

2005-02-25 Thread Karel Zak
On Thu, 2005-02-24 at 23:51 -0500, Bruce Momjian wrote:
 Tatsuo Ishii wrote:
  I do not object the changing UNICODE-UTF-8, but all these discussions
  sound a little bit funny to me.
  
  If you want to blame UNICODE, you should blame LATIN1 etc. as
  well. LATIN1(ISO-8859-1) is actually a character set name, not an
  encoding name. ISO-8859-1 can be encoded in 8-bit single byte
  stream. But it can be encoded in 7-bit too. So when we refer to
  LATIN1(ISO-8859-1), it's not clear if it's encoded in 7/8-bit.
 
 Wow, Tatsuo has a point here.  Looking at encnames.c, I see:
 
 UNICODE, PG_UTF8
 
 but also:
 
 WIN, PG_WIN1251
 LATIN1, PG_LATIN1

 so I see what he is saying.  We are not consistent in favoring the
 official names vs. the common names.

Yes. I said already. For example WIN is extremely bad alias. It all is
heritage from old versions.

 I will work on a patch that people can review and test.

Thanks.

Karel

-- 
Karel Zak [EMAIL PROTECTED]


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] UTF8 or Unicode

2005-02-25 Thread Peter Eisentraut
Am Freitag, 25. Februar 2005 05:51 schrieb Bruce Momjian:
 so I see what he is saying.  We are not consistent in favoring the
 official names vs. the common names.

 I will work on a patch that people can review and test.

I think this is what we should do:

UNICODE = UTF8
ALT = WIN866
WIN = WIN1251
TCVN = WIN1258

That should clear it up.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] Development Plans

2005-02-25 Thread Simon Riggs
I'm giving a talk next week on PostgreSQL 8, so I would like some input
from the community on a few issues, so that my answers are as close to
majority opinion as possible.

One of the most frequent set of questions I get asked is around the
development vision and release strategy of PostgreSQL.

- When is the next release due?
- What will be in release 8.1?
- What are you working towards? Performance? Stability? X?

These are all good questions, you'll note. They mostly indicate that the
person asking the question has already got the more basic messages, and
are preparing themselves to fully accept PostgreSQL as the way forward 
*for them*.

I think I've come to understand the answers to many of these questions,
but these answers are not written down. When I do answer them, I try to
make it clear that I present a personal opinion only - but that always
gets strange looks. People really do not understand why there is no
official answer, and take that as a black mark. 
Other projects such as Ubuntu, Fedora and OpenOffice have much of this
type of information easily available - certainly commercial software
vendors spend a good deal of time on providing this information.
Could we find a way of expressing the project philosophy in writing, so
I can convey that message out to the world, exactly as intended, without
any Riggs filtering?

My own understandings would be...

- When is the next release due?
I'm happy with the Zen approach of there is no answer, the code comes
when it is time and HACKERS list IS the process. 
Many people take the lack of a planned release date as clear indication
that there is no strictly controlled release process, however-much I
state that there really is one. In the absence of release dates, could
we write down some indication of what the release process is, so
everybody understands there is one.
My understanding is:
- new release forked in code repository
- feature freeze, beta phase starts
- string freeze, to allow translations
- release candidate process
- release
Right now, I have zero idea which quarter, let alone which month feature
freeze for 8.1 is in. I think it will be in 2005, but I'm not sure.
[That makes it fairly difficult to get sponsorship for release of new
features, since I cannot guarantee which year they'll be in.]

- what will be in release 8.1?
The TODO list contains a partial mechanism for recording what is being
worked upon by various people. Could that process by beefed up somewhat,
so there is a clear list of Features in Next Release, as part of the
TODO list on the main web site? A caveat could easily warn that this is
a provisional list only and offers no guarantees of inclusion.
I'm happy to make certain commitments to particular features already on
the TODO list. I'm sure others are too.

- What are you working towards? Performance? Stability? X?
When I explain that the pace of development has increased, people
immediately ask which direction things are going in.
In the run-up to r8.0, PITR was listed as an URGENT feature. This gave
some indication of the direction that Core wished the code base to
travel in and gave many a strong indication that they understood the
momentum and trajectory of development.
That section has been removed now.
This is a more difficult one to answer, especially since it really
covers what the major sponsors want.

I'm fairly clear about my own directions: Enterprise features to enhance
robustness, performance and scalability, plus Data Warehousing features
- nearly all of which come from clients or interested parties.
Does anybody else wish to share theirs?

Just so it is clear, there is not a single criticism on this page from
me; I am merely passing on a pattern that I think I see emerging from a
variety of conversations with clients, course attendees and
press/exhibition contacts. 

Many thanks,

Best Regards, Simon Riggs



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] 8.0.X and the ARC patent

2005-02-25 Thread Andrew Dunstan

Bruce Momjian wrote:
Simon Riggs wrote:
 

On Mon, 2005-02-14 at 18:17 -0500, Bruce Momjian wrote:
   

For development, this means we will _not_ have a shortened, non-initdb
8.1 release but a regular release cycle with the typical big batch of
features.
 

Might we set a rough date for Beta freeze for 8.1 then?
September 30th 2005 ?
I see only benefit from publishing a not-before date now. It's up to
Core if it slips, but it'll really help with gaining funding if people
can accurately determine whether or not features can be added for
inclusion in the next release. There are lots of potential donors
waiting, so lets give them some certainty about which release their
payback will occur in
   

Yea, probably September, but you can't dump a huge feature on us in
August either without having talked about it first, so knowing the date
might not be that helpful.
 


Am I misunderstanding here? September 30th is 15 months after the 
feature freeze for 8.0. On that basis we might reasonably expect 8.1 to 
appear around April 2006. Is that what's intended?

cheers
andrew
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faq


Re: [HACKERS] [pgsql-hackers-win32] Repleacement for src/port/snprintf.c

2005-02-25 Thread Nicolai Tufar
On Thu, 24 Feb 2005 22:18:11 -0500, Tom Lane [EMAIL PROTECTED] wrote:
 Didn't we do that already?

No :(   I promised to do it a couple of weeks ago but could not get to do it.
Now with Magnus's help I finaly did it. The last patch should be fine.

 regards, tom lane
Nicolai

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] 8.0.X and the ARC patent

2005-02-25 Thread Simon Riggs
On Fri, 2005-02-25 at 02:47 -0400, Marc G. Fournier wrote:
 On Fri, 25 Feb 2005, Bruce Momjian wrote:
 
  Simon Riggs wrote:
  On Mon, 2005-02-14 at 18:17 -0500, Bruce Momjian wrote:
  For development, this means we will _not_ have a shortened, non-initdb
  8.1 release but a regular release cycle with the typical big batch of
  features.
 
  Might we set a rough date for Beta freeze for 8.1 then?
 
  September 30th 2005 ?
 
  I see only benefit from publishing a not-before date now. It's up to
  Core if it slips, but it'll really help with gaining funding if people
  can accurately determine whether or not features can be added for
  inclusion in the next release. There are lots of potential donors
  waiting, so lets give them some certainty about which release their
  payback will occur in
 
  Yea, probably September, but you can't dump a huge feature on us in
  August either without having talked about it first, so knowing the date
  might not be that helpful.
 
 I thought we were looking at a 12-18 month cycle for 8.1?  Which would put 
 beta around January '06, no?

I'm happy with Core taking this decision, I'd just like to know +/- 1
month when that date is, i.e. which quarter of which year.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] 8.0.X and the ARC patent

2005-02-25 Thread Simon Riggs
On Fri, 2005-02-25 at 00:18 -0500, Bruce Momjian wrote:
 Simon Riggs wrote:
  On Mon, 2005-02-14 at 18:17 -0500, Bruce Momjian wrote:
   For development, this means we will _not_ have a shortened, non-initdb
   8.1 release but a regular release cycle with the typical big batch of
   features.
  
  Might we set a rough date for Beta freeze for 8.1 then?
  
  September 30th 2005 ?
  
  I see only benefit from publishing a not-before date now. It's up to
  Core if it slips, but it'll really help with gaining funding if people
  can accurately determine whether or not features can be added for
  inclusion in the next release. There are lots of potential donors
  waiting, so lets give them some certainty about which release their
  payback will occur in
 
 Yea, probably September, but you can't dump a huge feature on us in
 August either without having talked about it first, so knowing the date
 might not be that helpful.
 

[Sorry, just brought this up as part of another more general thread.]

That's fine. Just have two (rough dates):

- Major Feature Design Freeze - June 2005 (or 2Q2005)
Major features may require significant discussion, prototyping and
performance testing before agreement to include. You are advised that
major features presented for initial code review may not be accepted
after this date.

- Overall Feature Freeze - Sept 2005 (or 3Q2005)
All code implementing new features, large or small, should be complete
by this date. Features may be rejected if supporting tests and
documentation are not provided.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Where are we on stored procedures?

2005-02-25 Thread Merlin Moncure
 1. Executing outside the database engine, and therefore being able to
 start/commit transactions.  (This is *fundamentally* different from
our
 current concept of functions, and I think that any implementation that
 tries to gloss over the difference will be doomed to failure.)

Back in the early days of nested transactions, you could begin/commit
while within a transaction...they could be pushed and popped off of a
stack.

Supposing you could do that once again, would there be any reason why a
SP should be extra-transactional?

Merlin

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Interesting NetBSD annual report

2005-02-25 Thread D'Arcy J.M. Cain
On Thu, 24 Feb 2005 23:00:48 -0500 (EST)
Bruce Momjian pgman@candle.pha.pa.us wrote:
 I read the following report from the NetBSD group:
 
   http://kerneltrap.org/node/4680
 
 It has some interesting points.  First, they analyze how the fit with
 other open source database offerings.  Their position is somewhat
 similar to ours.
 
 Their development style is also similar to ours.

Not too surprising since both projects came out of Berkeley.

 They complain about release delays and the challenge of organizing
 volunteers, just like us.

Is there any volunteer organization that's any different?  :-)

 Second, they have a much more rigid hierarchy structure.  There are
 good and bad things about that and you can see it in the report.

It could be that the people really do make a difference but, as a NetBSD
developer, I never really feel that the hierarchy gets in the way of
doing things.  I think that in any organization the people get stuff
done regardless of what structure is in place.  It depends more on the
people than the structure.

-- 
D'Arcy J.M. Cain darcy@druid.net |  Democracy is three wolves
http://www.druid.net/darcy/|  and a sheep voting on
+1 416 425 1212 (DoD#0082)(eNTP)   |  what's for dinner.

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] idea for concurrent seqscans

2005-02-25 Thread Simon Riggs
On Fri, 2005-02-25 at 00:34 -0800, Jeff Davis wrote:
 I had an idea that might improve parallel seqscans on the same relation.
 
 If you have lots of concurrent seqscans going on a large relation, the
 cache hit ratio is very low. But, if the seqscans are concurrent on the
 same relation, there may be something to gain by starting a seqscan near
 the page being accessed by an already-in-progress seqscan, and wrapping
 back around to that start location. That would make some use of the
 shared buffers, which would otherwise just be cache pollution.

This is cool and was on my list of would-like-to-implement features. 

It's usually known as Synchronised Scanning. AFAIK it is free of any
patent restriction: it has already been implemented by both Teradata and
RedBrick.

 This is the first time I've really modified the PG source code to do
 anything that looked promising, so this is more of a question than
 anything else. Is it promising? Is this a potentially good approach? I'm
 happy to post more test data and more documentation, and I'd also be
 happy to bring the code to production quality. 

I'll be happy to help you do this, at least for design and code review.

I'll come back later with more detailed comments on your thoughts so
far.

 However, before I spend
 too much more time on that, I'd like to get a general response from a
 3rd party to let me know if I'm off base.

Third party?

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] [JDBC] Where are we on stored procedures?

2005-02-25 Thread Francisco Figueiredo Jr.
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
Tom Lane wrote:
| Francisco Figueiredo Jr. [EMAIL PROTECTED] writes:
|
|Could I add another item?
|
|
|Could we have the row count of statements executed inside a
|procedure/function returned to client?
|
|
| IMHO that request is completely bogus; if the procedure wants to tell
| the client that, it's the procedure's responsibility to return the
| number as a result.  Doing what you ask (a) would arguably be a security
| violation, and (b) the info would be impossible for the client to
| interpret in any but the most trivial cases anyway.  Consider for
| instance a procedure that contains different queries in the THEN and
| the ELSE arms of an IF, or contains loops executed a variable number
| of times.  Without access to the results of the control flow tests, the
| client could not know which number means what.
Hi Tom Lane.
Thanks for feedback.
In fact, I think people keep requesting me support on Npgsql for that
because MS Sql server supports it and they are porting their code to use
Postgresql and facing that difficult.
Indeed, for (a) we could use an approach similar to Ms sql server. We
could have a way of the procedure say if it wanted to send the rows
affected information or not.
And about (b) I think that it is only on trivial cases that people
really want to use this feature :)
But thinking on other side I think this in only a differ in backend's
idiom. I already answered your solution of get diagnostics and I will
provide a litle FAQ about it so people can see how it is done on postgresql.
Regards,
Francisco Figueiredo Jr.
- --
Regards,
Francisco Figueiredo Jr.
Membro Fundador do Projeto MonoBrasil - MonoBrasil Project Founder Member
http://monobrasil.softwarelivre.org

- -
Science without religion is lame;
religion without science is blind.
~  ~ Albert Einstein
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
iQEVAwUBQh854P7iFmsNzeXfAQJ7UQf+Jqd4G7EE7YFyDLRBsuoGLZTYl7N/ZY0M
yeIgRbTNw3HPeHb+OiAI+MMXCWxu0YU4OgLOvqkls9NKHkeuMjdSwmsCfwPL0qfc
sZT6UqQJyap6skH1VJ7jQEhfgjesfpqYzUoyeJUJDiRyFsEbM4sbdnDdpMawf/jq
jgzx50lpgQtFOblsaMlfklLiUZa47FeMC8M64XcmzE+IIdfJcx0kWYQ+LKtoimNs
9SH7Z1v3lnNKvucLSw5eMT/y2Ldu8Qgr4/eK7JroKcI4JbHOlNtirrovAoinBjwB
Otcwtd2tRqffFCrF6MqsSzuXn+S2C4Wj9/0qWZ1CIzevigb+QfXOIw==
=h32q
-END PGP SIGNATURE-
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] [pgsql-advocacy] Development Plans

2005-02-25 Thread Marc G. Fournier
On Fri, 25 Feb 2005, Simon Riggs wrote:
- When is the next release due?
Based on past discussions on a 12 to 18 month dev cycle for 8.1, and based 
on past track record, I'd say closer to the 18 month, so figure on June 
'06, with freeze in January of '06 (12 month dev, 6 month beta) ... 
subject to change, but I wouldn't expect a freeze until Jan '06, beta 
period might be shorter though ...


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Some download statistics

2005-02-25 Thread Jim C. Nasby
On Thu, Feb 24, 2005 at 10:27:03PM +0100, Ernst Herzberg wrote:
 On Thursday 24 February 2005 21:43, Marc G. Fournier wrote:
 [...]
  
 pub/source/v.8.0.1/
 postgresql-8.0.1.tar.bz2
 postgresql-8.0.1.tar.bz2.md5
 postgresql-8.0.1.tar.gz
 postgresql-8.0.1.tar.gz.md5
 split-tarballs/
 postgresql-base-8.0.1.tar.bz2
 postgresql-base-8.0.1.tar.bz2.md5
 etc...
 
  That sounds like a reasonable compromise ... I could do that right away,
  if nobody is in disagreement ... ?
 
 Hm, there is a distribution that uses the split tarballs: gentoo.
 
 snipped from /usr/portage/dev-db/postgresql/postgresql-8.0.1-r1.ebuild :
 
 SRC_URI=mirror://postgresql/source/v${PV}/${PN}-base-${MY_PV}.tar.bz2
 mirror://postgresql/source/v${PV}/${PN}-opt-${MY_PV}.tar.bz2
 doc? ( mirror://postgresql/source/v${PV}/${PN}-docs-${MY_PV}.tar.bz2 )
 
 Better is to drop a 'bug' to bugs.gentoo.org so that the ebuild will be 
 fixed;-)

How exactly is that a bug? Why should someone spend time downloading the
docs if they don't want to install them? And gentoo isn't the only OS
that uses them.

Unless generating the splits requires a non-trivial amount of work, I
see no reason not to have them.
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: Where do you want to go today?
Linux: Where do you want to go tomorrow?
FreeBSD: Are you guys coming, or what?

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Where are we on stored procedures?

2005-02-25 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Essentially I'm thinking about the JDBC solution, but automated a bit
 better.

 So would your proposal invent a new stored procedure construct, or 
 just add some sugar to the existing function stuff? i.e. will you be 
 able to issue a CREATE FUNCTION that specifies OUT parameters?

I certainly intend to be able to say OUT in CREATE FUNCTION.  I'm not
sure what you consider to be a new construct.  I'm not thinking of
making a new system catalog, for instance, just new column(s) in
pg_proc.

 This doesn't address the question of SETOF results, of course.  I'm
 leaning towards returning those as cursors.

 This is part of the reason I liked the approach of introduced SQL-level 
 variables. Besides being a feature that has some use in itself, it could 
 be extended reasonably cleanly to allow (effectively) SETOF variables 
 and rowtype variables.

AFAICS that would force every SETOF result to be materialized, which
would be a bit of an efficiency hit.  The main reason I don't like it,
though, is that variables in a language that hasn't got control
structures are fundamentally wrong.  (And no, I'm not interested in
adding IF and LOOP later ;-)) They could only be useful as an
intermediate step in pushing data from the server to the client, so
why invent all the extra concept and mechanism instead of just pushing
the data immediately?

Basically what I am thinking is that we have all the infrastructure
today to solve the OUT-parameter problem, it's just not wrapped up in
an easy-to-use package.

 But if you can pass over what you have, I'd like to see about
 pressing forward.

 Sure, I've attached a very WIP patch with the utility command 
 definitions; unfortunately I don't think it will be of much use, as much 
 of it is CREATE PROCEDURE-related boilerplate.

Thanks.

 Gavin will update the 
 matching-arguments-by-name code to HEAD at some point in the future; I 
 believe that works fine for functions (since we just error out in case 
 of ambiguity), so we can include it in 8.1 independently on any other 
 work on SPs.

Agreed, that should be mostly orthogonal.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Where are we on stored procedures?

2005-02-25 Thread Tom Lane
Merlin Moncure [EMAIL PROTECTED] writes:
 1. Executing outside the database engine, and therefore being able to
 start/commit transactions.  (This is *fundamentally* different from our
 current concept of functions, and I think that any implementation that
 tries to gloss over the difference will be doomed to failure.)

 Back in the early days of nested transactions, you could begin/commit
 while within a transaction...they could be pushed and popped off of a
 stack.

Those weren't actual transaction starts or commits though; they were
(in our current terminology) savepoints.  People want *real* commits
in stored procedures, not fakes.  For instance consider an SP that
wants to issue a series of VACUUM commands --- if there's a hidden
transaction block around the SP, this must fail.  Locking considerations
may force use of real commits and real rollbacks, too.  You can't
release a lock by committing a subtransaction.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-02-25 Thread Tom Lane
John Hansen [EMAIL PROTECTED] writes:
 Sure it does.  It's just that the defined behavior of the C 
 locale is often useless in practice.

 select upper('æøå');
 ERROR:  invalid multibyte character for locale
 HINT:  The server's LC_CTYPE locale is probably incompatible with the 
 database encoding.

 Consequently it seems that is does not work.

It fails on my machine should not be read as it doesn't work for anyone.
It all depends on how your local mbstowcs() works.

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] int64/double for time/timestamp

2005-02-25 Thread Teodor Sigaev
Urgh.  This is clearly a bug.  All the code in utils/adt seems to be
correctly set up to treat TimeADT as an integral value, but then the two
macros quoted are converting the value to float8 and back again ... so
what's actually on disk is the float8 equivalent of what the int64 value
is supposed to be :-(.  As long as the macros are used *consistently* to
fetch and store time datums, no one would notice --- you could only see
a difference if the int64 values got large enough to not be represented
completely accurately as floats, which I believe is impossible for type
time.
So the fact that you're seeing a bug in btree_gist suggests that
someplace you're cheating and bypassing the FooGetDatum/DatumGetFoo
macros.
We'll obviously want to fix this going forward for efficiency reasons,
but it's an initdb-forcer because it'll change the on-disk
representation of time columns.  So we can't change it in 8.0 or before.
So, will we do it? I can do, but I don't know: Is there a place which contains 
storage version (except file PG_VERSION)?

--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [HACKERS] int64/double for time/timestamp

2005-02-25 Thread Tom Lane
Teodor Sigaev [EMAIL PROTECTED] writes:
 We'll obviously want to fix this going forward for efficiency reasons,
 but it's an initdb-forcer because it'll change the on-disk
 representation of time columns.  So we can't change it in 8.0 or before.

 So, will we do it? I can do, but I don't know: Is there a place which
 contains storage version (except file PG_VERSION)?

catversion.h would need to be advanced for such a change.  We haven't
got anything finer-grained than that.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] UTF8 or Unicode

2005-02-25 Thread Bruce Momjian
Peter Eisentraut wrote:
 Am Freitag, 25. Februar 2005 05:51 schrieb Bruce Momjian:
  so I see what he is saying.  We are not consistent in favoring the
  official names vs. the common names.
 
  I will work on a patch that people can review and test.
 
 I think this is what we should do:
 
 UNICODE = UTF8
 ALT = WIN866
 WIN = WIN1251
 TCVN = WIN1258

OK, but what about latin1?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] UTF8 or Unicode

2005-02-25 Thread Tom Lane
Bruce Momjian pgman@candle.pha.pa.us writes:
 Peter Eisentraut wrote:
 I think this is what we should do:
 
 UNICODE = UTF8
 ALT = WIN866
 WIN = WIN1251
 TCVN = WIN1258

 OK, but what about latin1?

I think LATIN1 is fine as-is.  It's a reasonably popular name for the
character set, and despite Tatsuo's complaint, it's not going to confuse
anyone in practice --- the 7-bit version of that standard has no traction.
The reason UNICODE is a bad name for UTF8 is exactly that there are
multiple physical encodings of Unicode that are in common use.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] [pgsql-advocacy] Development Plans

2005-02-25 Thread Bruce Momjian
Marc G. Fournier wrote:
 On Fri, 25 Feb 2005, Simon Riggs wrote:
 
  - When is the next release due?
 
 Based on past discussions on a 12 to 18 month dev cycle for 8.1, and based 
 on past track record, I'd say closer to the 18 month, so figure on June 
 '06, with freeze in January of '06 (12 month dev, 6 month beta) ... 
 subject to change, but I wouldn't expect a freeze until Jan '06, beta 
 period might be shorter though ...

Agreed.  That is good target, but might change, as Marc said.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] 8.0.X and the ARC patent

2005-02-25 Thread Bruce Momjian
Marc G. Fournier wrote:
 On Fri, 25 Feb 2005, Bruce Momjian wrote:
 
  Simon Riggs wrote:
  On Mon, 2005-02-14 at 18:17 -0500, Bruce Momjian wrote:
  For development, this means we will _not_ have a shortened, non-initdb
  8.1 release but a regular release cycle with the typical big batch of
  features.
 
  Might we set a rough date for Beta freeze for 8.1 then?
 
  September 30th 2005 ?
 
  I see only benefit from publishing a not-before date now. It's up to
  Core if it slips, but it'll really help with gaining funding if people
  can accurately determine whether or not features can be added for
  inclusion in the next release. There are lots of potential donors
  waiting, so lets give them some certainty about which release their
  payback will occur in
 
  Yea, probably September, but you can't dump a huge feature on us in
  August either without having talked about it first, so knowing the date
  might not be that helpful.
 
 I thought we were looking at a 12-18 month cycle for 8.1?  Which would put 
 beta around January '06, no?

Oh, yea. I didn't realize people agreed with that, but I certainly
agree.

So that would put us at January.  Simon, I think the major issue is how
much does it help us to fix a date now vs. what is the benefit to
allowing us to state it later.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Interesting NetBSD annual report

2005-02-25 Thread Jim C. Nasby
On Fri, Feb 25, 2005 at 08:42:51AM -0500, D'Arcy J.M. Cain wrote:
  They complain about release delays and the challenge of organizing
  volunteers, just like us.
 
 Is there any volunteer organization that's any different?  :-)

FreeBSD doesn't seem to suffer much from either issue. Then again, they
also have something like 2000 committers, probably about 100 of which
are very active.
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: Where do you want to go today?
Linux: Where do you want to go tomorrow?
FreeBSD: Are you guys coming, or what?

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] 8.0.X and the ARC patent

2005-02-25 Thread Tom Lane
Marc G. Fournier [EMAIL PROTECTED] writes:
 Yea, probably September, but you can't dump a huge feature on us in
 August either without having talked about it first, so knowing the date
 might not be that helpful.

 I thought we were looking at a 12-18 month cycle for 8.1?  Which would put 
 beta around January '06, no?

Although we've dropped the idea of letting the ARC problem drive a very
short 8.1 cycle, I would still like to see us shoot for a relatively
short 8.1 cycle --- less than a year for sure.  The main reason is that
I think we'll be flushing out performance and feature issues in the
Windows port that we cannot reasonably back-patch into 8.0.*.  PITR also.
In general it seems to me that 8.1 will need to have a consolidation and
fill-in-the-blanks flavor after what we did for 8.0, and that will be
helped by a shorter devel cycle.

As a proposal: feature freeze maybe early summer (June or July), beta
maybe Aug/Sep, final as always when it's ready (maybe Oct/Nov with
a good tailwind).

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Development Plans

2005-02-25 Thread Bruce Momjian
Simon Riggs wrote:
 I'm giving a talk next week on PostgreSQL 8, so I would like some input
 from the community on a few issues, so that my answers are as close to
 majority opinion as possible.
 
 One of the most frequent set of questions I get asked is around the
 development vision and release strategy of PostgreSQL.
 
 - When is the next release due?
 - What will be in release 8.1?
 - What are you working towards? Performance? Stability? X?
 
 These are all good questions, you'll note. They mostly indicate that the
 person asking the question has already got the more basic messages, and
 are preparing themselves to fully accept PostgreSQL as the way forward 
 *for them*.
 
 I think I've come to understand the answers to many of these questions,
 but these answers are not written down. When I do answer them, I try to
 make it clear that I present a personal opinion only - but that always
 gets strange looks. People really do not understand why there is no
 official answer, and take that as a black mark. 
 Other projects such as Ubuntu, Fedora and OpenOffice have much of this
 type of information easily available - certainly commercial software
 vendors spend a good deal of time on providing this information.
 Could we find a way of expressing the project philosophy in writing, so
 I can convey that message out to the world, exactly as intended, without
 any Riggs filtering?

I have no idea how to predict what will be in 8.1.  I couldn't predict
what would be in 8.0 until just before feature freeze, so the idea that
we would have any clue about 8.1 is unrealistic.

How do other open source projects predict these things? The most visible
project I know that did that was Mozilla, and it was very unpredictive,
and they had a higher percentage of paid folks than we do.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] [pgsql-advocacy] Development Plans

2005-02-25 Thread Robert Treat
On Fri, 2005-02-25 at 05:34, Simon Riggs wrote:
 I'm giving a talk next week on PostgreSQL 8, so I would like some input
 from the community on a few issues, so that my answers are as close to
 majority opinion as possible.
 
 One of the most frequent set of questions I get asked is around the
 development vision and release strategy of PostgreSQL.
 
 - When is the next release due?
 - What will be in release 8.1?
 - What are you working towards? Performance? Stability? X?
 
 These are all good questions, you'll note. They mostly indicate that the
 person asking the question has already got the more basic messages, and
 are preparing themselves to fully accept PostgreSQL as the way forward 
 *for them*.
 
 I think I've come to understand the answers to many of these questions,
 but these answers are not written down. When I do answer them, I try to
 make it clear that I present a personal opinion only - but that always
 gets strange looks. People really do not understand why there is no
 official answer, and take that as a black mark. 
 Other projects such as Ubuntu, Fedora and OpenOffice have much of this
 type of information easily available - certainly commercial software
 vendors spend a good deal of time on providing this information.
 Could we find a way of expressing the project philosophy in writing, so
 I can convey that message out to the world, exactly as intended, without
 any Riggs filtering?
 
 My own understandings would be...
 
 - When is the next release due?
 I'm happy with the Zen approach of there is no answer, the code comes
 when it is time and HACKERS list IS the process. 
 Many people take the lack of a planned release date as clear indication
 that there is no strictly controlled release process, however-much I
 state that there really is one. In the absence of release dates, could
 we write down some indication of what the release process is, so
 everybody understands there is one.
 My understanding is:
 - new release forked in code repository
 - feature freeze, beta phase starts
 - string freeze, to allow translations
 - release candidate process
 - release
 Right now, I have zero idea which quarter, let alone which month feature
 freeze for 8.1 is in. I think it will be in 2005, but I'm not sure.
 [That makes it fairly difficult to get sponsorship for release of new
 features, since I cannot guarantee which year they'll be in.]
 

As Marc stated, core has decided on a 12 month development cycle, so
that might look like January of next year.  If I were speaking to
someone looking to sponsor development, I would tell them they would
need to being work in Q4 of 2005 at the very latest to have a chance of
something going in. As far as when they can expect to see it in a
released branch, Q2 of 2006 would probably be a reasonable estimate.
(I'm hopeful that beta wont take as long this time around since win32
should be more stable and the buildfarm should also be of some help, but
I can't imagine we'd get through it in less than 3 months) 

--- update ---

Was just about to hit send on this email when I saw Tom's post come up
on -hackers talking about a 6 month cycle... so maybe all of the above
is off... 


 - what will be in release 8.1?
 The TODO list contains a partial mechanism for recording what is being
 worked upon by various people. Could that process by beefed up somewhat,
 so there is a clear list of Features in Next Release, as part of the
 TODO list on the main web site? A caveat could easily warn that this is
 a provisional list only and offers no guarantees of inclusion.
 I'm happy to make certain commitments to particular features already on
 the TODO list. I'm sure others are too.
 

Actually the TODO list is really only a good indicator of things already
in the code... to get an idea of items being discussed you really just
have to hang out on the lists to see what people are working on.  A list
of some of what are probably the big things being worked on for 8.1
include:

* Improved Resource Managment (this encapsulates the buffer manager
changes and arc replacement and similar work)

* Integrated pg_autovacuum

* Stored procedures (with transaction control and multiple result sets
among other features, not to be confused with function support)

* SQL compatible recursive WITH statements

* 2 Phase Commit

As always none of that stuff is guaranteed but those items have someone 
who has stated they want to make it happen for 8.1 (and by make it
happen I mean they plan to code it)  There are a couple of other big
things floating out there too (bitmapped indexes are a good example),
but I think those are the most concrete. There are also some things that
are cool but probably not big things (OS/2 support for instance)

 - What are you working towards? Performance? Stability? X?
 When I explain that the pace of development has increased, people
 immediately ask which direction things are going in.
 In the run-up to r8.0, PITR was listed as an URGENT feature. This gave
 

Re: [HACKERS] [pgsql-advocacy] Development Plans

2005-02-25 Thread Christopher Kings-Lynne
I would agree that since we have a number of commercial developers who
do have intentions of working on specific items in 8.1, it would be nice
to list those items somewhere (the urgent section of the TODO seems
fine), but it is up to those developers to speak up.
I'd like to bundle pg_dump and pg_dumpall into a single binary, allowing 
pg_dumpall in custom (binary) format.  Dunno if I'll get the time though :)

I'd also like to backport 8.0's pg_dump to 7.4.x since I think it has 
'bug fixes'...

Chris
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] 8.0.X and the ARC patent

2005-02-25 Thread Matthew T. O'Connor
Tom Lane wrote:
Although we've dropped the idea of letting the ARC problem drive a very
short 8.1 cycle, I would still like to see us shoot for a relatively
short 8.1 cycle --- less than a year for sure.  The main reason is that
I think we'll be flushing out performance and feature issues in the
Windows port that we cannot reasonably back-patch into 8.0.*.  PITR also.
In general it seems to me that 8.1 will need to have a consolidation and
fill-in-the-blanks flavor after what we did for 8.0, and that will be
helped by a shorter devel cycle.
As a proposal: feature freeze maybe early summer (June or July), beta
maybe Aug/Sep, final as always when it's ready (maybe Oct/Nov with
a good tailwind).
That sounds good.  I would think that lots of users probably won't use 
the Windows port in production until 8.1 (performance reasons, paranoia 
etc...)  I would hate to make put such a long delay in their adoption 
plans. 

One thing to consider while discussing the length of the cycle is what 
features are people planning on putting in?  The 8.0 cycle had to be 
long due to the many huge improvements.  I'm not aware of any 8.1 plans 
that are that ambitious, so why plan a long cycle when there are no 
features requiring it?  Am I missing something?

Matthew
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] Development Plans

2005-02-25 Thread Matthew T. O'Connor
Bruce Momjian wrote:
I have no idea how to predict what will be in 8.1.  I couldn't predict
what would be in 8.0 until just before feature freeze, so the idea that
we would have any clue about 8.1 is unrealistic.
 

I agree there is no way to accurately predict such things, however there 
was a long list of large features that people were trying to get in for 
8.0.  So while you can't be sure about what will make it for 8.1, I'm 
not even aware of any large projects that anyone is working on.  BTW 
when I say large project I'm talking about things like the Windows port, 
PITR, nested Xacts etc   Are there any big projects are people 
working on to get into 8.1?

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] 8.0.X and the ARC patent

2005-02-25 Thread Tom Lane
Matthew T. O'Connor matthew@zeut.net writes:
 One thing to consider while discussing the length of the cycle is what 
 features are people planning on putting in?  The 8.0 cycle had to be 
 long due to the many huge improvements.  I'm not aware of any 8.1 plans 
 that are that ambitious, so why plan a long cycle when there are no 
 features requiring it?  Am I missing something?

The subtext here is that people are trying to decide what they intend to
shoot for in this cycle, and so they are asking Core what the schedule
target is.  You really misunderstand the dynamics.  8.0 didn't start out
to be what it ended up being; there was no master plan, and never has
been.  The most organization we've ever had is for Core to set a feature
freeze target date well in advance.

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] 8.0.X and the ARC patent

2005-02-25 Thread Bruce Momjian
Tom Lane wrote:
 Marc G. Fournier [EMAIL PROTECTED] writes:
  Yea, probably September, but you can't dump a huge feature on us in
  August either without having talked about it first, so knowing the date
  might not be that helpful.
 
  I thought we were looking at a 12-18 month cycle for 8.1?  Which would put 
  beta around January '06, no?
 
 Although we've dropped the idea of letting the ARC problem drive a very
 short 8.1 cycle, I would still like to see us shoot for a relatively
 short 8.1 cycle --- less than a year for sure.  The main reason is that
 I think we'll be flushing out performance and feature issues in the
 Windows port that we cannot reasonably back-patch into 8.0.*.  PITR also.
 In general it seems to me that 8.1 will need to have a consolidation and
 fill-in-the-blanks flavor after what we did for 8.0, and that will be
 helped by a shorter devel cycle.
 
 As a proposal: feature freeze maybe early summer (June or July), beta
 maybe Aug/Sep, final as always when it's ready (maybe Oct/Nov with
 a good tailwind).

That is fine with me too.  Let's see how mush 8.0 fixing we need in 8.1
and that will help determine the cutoff date, as will completed features
that we want to get into a public release.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Development Plans

2005-02-25 Thread Jim C. Nasby
On Fri, Feb 25, 2005 at 10:49:59AM -0500, Bruce Momjian wrote:
 I have no idea how to predict what will be in 8.1.  I couldn't predict
 what would be in 8.0 until just before feature freeze, so the idea that
 we would have any clue about 8.1 is unrealistic.
 
 How do other open source projects predict these things? The most visible
 project I know that did that was Mozilla, and it was very unpredictive,
 and they had a higher percentage of paid folks than we do.

Not to sound like a broken FreeBSD drum, but they manage to do it, and
afaik a pretty good job of it.
http://www.freebsd.org/releases/5.4R/todo.html is an example.
http://www.freebsd.org/releases/5.4R/schedule.html is also interesting.

I suspect a big part of why/how they can do this is they have a much
larger developer pool than PostgreSQL; I believe there's over 2000
people with commit access, and there are probably 100-200 people who are
actively developing code for FBSD. But it's been some time since I
followed the details of FBSD development, so I could be way off on these
WAGs.
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: Where do you want to go today?
Linux: Where do you want to go tomorrow?
FreeBSD: Are you guys coming, or what?

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [pgsql-advocacy] Development Plans

2005-02-25 Thread Peter Eisentraut
Am Freitag, 25. Februar 2005 15:42 schrieb Marc G. Fournier:
 Based on past discussions on a 12 to 18 month dev cycle for 8.1, and based
 on past track record, I'd say closer to the 18 month, so figure on June
 '06, with freeze in January of '06 (12 month dev, 6 month beta) ...
 subject to change, but I wouldn't expect a freeze until Jan '06, beta
 period might be shorter though ...

I certainly hope that this is wrong.  I think the past cycles that saw major 
releases every 11 or 12 months were OK, albeit too long for my taste, but 18 
months is way too long both from the users' and the developers' point of 
view.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] UTF8 or Unicode

2005-02-25 Thread Peter Eisentraut
Am Freitag, 25. Februar 2005 16:26 schrieb Bruce Momjian:
 OK, but what about latin1?

The following character set names are specified in the SQL standard and 
therefore somewhat non-negotiable:

SQL_CHARACTER
GRAPHIC_IRV
LATIN1
ISO8BIT
UTF16
UTF8
UCS2
SQL_TEXT
SQL_IDENTIFIER

So we have to use LATIN1, even though it creates an inconsistency.  We 
discussed this a while ago during the last great renaming, I think.

Btw., I think ISO8BIT is the correct name for what we call SQL_ASCII, but I 
haven't analyzed that in detail, yet.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Development Plans

2005-02-25 Thread Tom Lane
Matthew T. O'Connor matthew@zeut.net writes:
 Are there any big projects are people 
 working on to get into 8.1?

I'm privately hoping to get bitmap index operations into 8.1 (that is,
build a bitmap of tuple locations from an index, possibly AND or OR the
results of multiple indexes, and finally visit the heap in CTID order).
This is not as big as say the Windows port, but it's easily a solid
month or two of effort.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [pgsql-advocacy] [HACKERS] Development Plans

2005-02-25 Thread Peter Eisentraut
Am Freitag, 25. Februar 2005 16:49 schrieb Bruce Momjian:
 I have no idea how to predict what will be in 8.1.  I couldn't predict
 what would be in 8.0 until just before feature freeze, so the idea that
 we would have any clue about 8.1 is unrealistic.

A good guess for the features contained in PostgreSQL version X are those that 
were first mentioned as targeted for PostgreSQL version X-2.  Seriously.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Development Plans

2005-02-25 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 I'm happy with the Zen approach of there is no answer, the code comes
 when it is time and HACKERS list IS the process. 
 Many people take the lack of a planned release date as clear indication
 that there is no strictly controlled release process, however-much I
 state that there really is one. In the absence of release dates, could
 we write down some indication of what the release process is, so
 everybody understands there is one.

What we do not have is a marketing-driven development process ;-).
We let the decisions be dictated by the maturity of the code, not by
an arbitrary predetermined release date.  We do try to set a feature
freeze date in advance, but this has always been a pretty sloppy
cutoff --- the reason we set it in advance is just so that individual
developers can make their own plans about how much they think they can
get done in time for the next release.  (Awhile back we didn't even do
that, but we found that we were forever slipping a release because first
one feature then another was almost ready.  We have learned that we
must be willing to say too late for this release.)  Once we are past
feature freeze, the beta schedule and release schedule are determined
entirely by Core's judgment about the state of the code.

I think most of the developers consider our process a strength, not
a weakness.  It's certainly conditioned by the realities of an open-
source project in which most of the workers are part-time volunteers,
but it works well for us.

 Right now, I have zero idea which quarter, let alone which month feature
 freeze for 8.1 is in. I think it will be in 2005, but I'm not sure.

I'm sorry about the flux around the 8.1 schedule.  It's been caused by
what is hopefully a one-time problem, namely uncertainty about how
we ought to deal with the ARC patent problem.  Normally we would have
set a fairly definite feature freeze date before now.

 - what will be in release 8.1?

Whatever people get done.  In a project where the work is done by
volunteers, it's just about pointless to imagine that we can predict it.
I can say what I'm hoping to work on personally, but how much of it will
get done I don't really know.  Multiply that across a few dozen people
and what you have is pretty squishy.

I wouldn't mind seeing people be a little more vocal on the hackers list
about what they plan to be doing, just so that there's not duplication
of effort.  But trying to assemble that into some sort of published
Master Plan sounds to me like just a recipe for making ourselves look
foolish when the Plan ends up having little relationship to reality.

 - What are you working towards? Performance? Stability? X?

Yes.

Bruce used to try to describe our releases as being oriented towards some
particular goal, but I always thought that these were after-the-fact
descriptions that had nothing to do with the real development process.
The truth is that individual developers work on what they feel like,
or find interesting, or in some cases get paid to do.  But just as
there's not a master plan, there's not really some guiding vision that
in a particular release we are going to focus on X or Y or Z.

Note: the above is just my two cents and shouldn't be read as speaking
for Core.  I think if you asked the other members of core you'd get
roughly similar answers, but each with his own spin.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [pgsql-advocacy] [HACKERS] Development Plans

2005-02-25 Thread Christopher Kings-Lynne
I wouldn't mind seeing people be a little more vocal on the hackers list
about what they plan to be doing, just so that there's not duplication
of effort.  But trying to assemble that into some sort of published
Master Plan sounds to me like just a recipe for making ourselves look
foolish when the Plan ends up having little relationship to reality.
The things I would _like_ to do:
* Combine pg_dump and pg_dumpall into a single binary with common 
options, allowing a full cluster custom format dump.

* Make psql backwards compatible like pg_dump is
* Work with a few of the IRC guys on a pgsql version of SQL*Loader. 
This would be a rad high speed data loader, with overflow files of rows 
that couldn't be loaded, etc.

Chris
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] [pgsql-advocacy] Development Plans

2005-02-25 Thread Marc G. Fournier
On Fri, 25 Feb 2005, Peter Eisentraut wrote:
Am Freitag, 25. Februar 2005 15:42 schrieb Marc G. Fournier:
Based on past discussions on a 12 to 18 month dev cycle for 8.1, and based
on past track record, I'd say closer to the 18 month, so figure on June
'06, with freeze in January of '06 (12 month dev, 6 month beta) ...
subject to change, but I wouldn't expect a freeze until Jan '06, beta
period might be shorter though ...
I certainly hope that this is wrong.  I think the past cycles that saw major
releases every 11 or 12 months were OK, albeit too long for my taste, but 18
months is way too long both from the users' and the developers' point of
view.
Hey, don't shoot the messenger ... I'm just going by what ppl have been 
throwing around ... personally, I'd like to a see around an 8-12 month 
cycle being the max ... Freeze in Sept, Rel January ... but, of course, 
when Sept rolls aroound, there will be someone petitioning for an 
extension to get this 'one last feature in', and someone else will rally 
up for the cause, and ... 'k, I've gotten a bit jaded over the years? :)

But, 8-12 month seems to be a reasonable amount of time to aim for ... ?

Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Some download statistics

2005-02-25 Thread Marc G. Fournier
On Fri, 25 Feb 2005, Jim C. Nasby wrote:
Unless generating the splits requires a non-trivial amount of work, I
see no reason not to have them.
its all automated ... still wish I could figure out a way of splitting off 
*just* libppq, since that would drop downloads of the full package 
significantly ... in my case, I need postgresql server on maybe one 
machine, but need just libpq on every other one that I run ..

I've gotta sit down and see if I can  figure out how to do thatt .. would 
really be interesting to see the stats if  we had a seperate 'libpq' 
download :)


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] Interesting NetBSD annual report

2005-02-25 Thread Marc G. Fournier
On Fri, 25 Feb 2005, Jim C. Nasby wrote:
On Fri, Feb 25, 2005 at 08:42:51AM -0500, D'Arcy J.M. Cain wrote:
They complain about release delays and the challenge of organizing
volunteers, just like us.
Is there any volunteer organization that's any different?  :-)
FreeBSD doesn't seem to suffer much from either issue. Then again, they
also have something like 2000 committers, probably about 100 of which
are very active.
FreeBSD suffers from *alot* of other things because of havin gtoo many 
committers :(  As hard-core FreeBSD as I've been over the years, things 
haven't been the same since Jordan left ... to the point that I'm already 
planning on moving over to DragonFlyBSD (fork project of FreeBSD ... 
same base code as what we're currently using [ie. FreeBSD 4.x] but 
with *alot* of fixes for some pretty major problems that still exist 
in the current version of FreeBSD) over the next few months ... they've 
already re-written the VFS system, which we tax heavily, but haven't 
finished the rewrite of the unionfs code (which is how we tax VFS) which 
is in the works ...

Love the OS ... not so haappy with the direction its taken :(

Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [HACKERS] [pgsql-advocacy] Development Plans

2005-02-25 Thread Josh Berkus
Simon,

Welcome back!   Ready to get to work?   I need to talk to you about some 
stuff 

 - When is the next release due?

Each of our previous 4 releases has taken between 11 and 14 months.   So, 
early 2006 would not be unlikely; however, we have set no dates yet.

 - What will be in release 8.1?

Can't answer that.Let me give you the text I've used with reporters:

Currently there are developers working on bitmap indexes, database roles, 
two-phase commit, faster GiST and R-tree indexes, integrated autovacuum, SQL 
standard compliant procedures, and further improvements in memory usage.  
However, it is still quite early in the development cycle, and it is very 
likely some of these features won't be ready, or won't be good enough, for 
8.1, and is equally likely that we will receive and accept four or five other 
features that I don't know about yet.

 - What are you working towards? Performance? Stability? X?

X, definitely X.  

We're working toward PostgreSQL being indisputably the very best SQL RDBMS in 
the world.

 I think I've come to understand the answers to many of these questions,
 but these answers are not written down. When I do answer them, I try to
 make it clear that I present a personal opinion only - but that always
 gets strange looks. People really do not understand why there is no
 official answer, and take that as a black mark.

Well, they're used to dealing with private companies and company-sponsored 
projects.   These things have marketing-driven agendas.   We are a 
non-commercial, all-volunteer OSS project.You will need to educate people 
on this.

 Other projects such as Ubuntu, Fedora and OpenOffice have much of this
 type of information easily available

OpenOffice.org and Fedora are both single-company-sponsored projects, with 
marketing-driven goals.  I don't know about Ubuntu.

 - certainly commercial software 
 vendors spend a good deal of time on providing this information.

Yep.  And commercial vendors ship releases whether or not that release is 
stable or actually contains the features advertised.

 Could we find a way of expressing the project philosophy in writing, so
 I can convey that message out to the world, exactly as intended, without
 any Riggs filtering?

That's not a small order, if we want to do it right.Why don't you prepare 
a Faq-ish page that covers these issues based on the responses you've 
received on this thread?  I can add it to the Press FAQ.

 Right now, I have zero idea which quarter, let alone which month feature
 freeze for 8.1 is in. I think it will be in 2005, but I'm not sure.
 [That makes it fairly difficult to get sponsorship for release of new
 features, since I cannot guarantee which year they'll be in.]

Think early 2006.   Tentatively.  After all, we haven't discussed feature 
freeze date yet.

 The TODO list contains a partial mechanism for recording what is being
 worked upon by various people. Could that process by beefed up somewhat,
 so there is a clear list of Features in Next Release, as part of the
 TODO list on the main web site? A caveat could easily warn that this is
 a provisional list only and offers no guarantees of inclusion.
 I'm happy to make certain commitments to particular features already on
 the TODO list. I'm sure others are too.

I've been in favor of converting the TODO list into a pgFoundry-based Task 
List, so that TODOs can be claimed by people, for some time.

 I'm fairly clear about my own directions: Enterprise features to enhance
 robustness, performance and scalability, plus Data Warehousing features
 - nearly all of which come from clients or interested parties.
 Does anybody else wish to share theirs?

Well, I'll be working on Data Warehousing too.   

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] idea for concurrent seqscans

2005-02-25 Thread Jeff Davis
On Fri, 2005-02-25 at 13:38 +, Simon Riggs wrote:
 On Fri, 2005-02-25 at 00:34 -0800, Jeff Davis wrote:
  I had an idea that might improve parallel seqscans on the same relation.
  
  If you have lots of concurrent seqscans going on a large relation, the
  cache hit ratio is very low. But, if the seqscans are concurrent on the
  same relation, there may be something to gain by starting a seqscan near
  the page being accessed by an already-in-progress seqscan, and wrapping
  back around to that start location. That would make some use of the
  shared buffers, which would otherwise just be cache pollution.
 
 This is cool and was on my list of would-like-to-implement features. 
 
 It's usually known as Synchronised Scanning. AFAIK it is free of any
 patent restriction: it has already been implemented by both Teradata and
 RedBrick.
 
  This is the first time I've really modified the PG source code to do
  anything that looked promising, so this is more of a question than
  anything else. Is it promising? Is this a potentially good approach? I'm
  happy to post more test data and more documentation, and I'd also be
  happy to bring the code to production quality. 
 
 I'll be happy to help you do this, at least for design and code review.
 
 I'll come back later with more detailed comments on your thoughts so
 far.
 

Good to hear. I'll clean up the code and document some more tests. Three
questions come to mind right now:
(1) Do we care about reverse scans being done with synchronized
scanning? If so, is there a good way to know in advance whether it is
going to be a forward or reverse scan (i.e. before heap_getnext())?
(2) Where is the appropriate place to put the page location of an
in-progress scan? Are there other pieces of shared memory that aren't
disk buffers that I should be making use of?


  However, before I spend
  too much more time on that, I'd like to get a general response from a
  3rd party to let me know if I'm off base.
 
 Third party?
 

A 2nd party? Anyone else? That was a typo :)

Regards,
Jeff Davis


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] idea for concurrent seqscans

2005-02-25 Thread Tom Lane
Jeff Davis [EMAIL PROTECTED] writes:
 (1) Do we care about reverse scans being done with synchronized
 scanning? If so, is there a good way to know in advance whether it is
 going to be a forward or reverse scan (i.e. before heap_getnext())?

There are no reverse heapscans --- the only case where you'll see
direction = backwards is while backing up a cursor with FETCH BACKWARD.
I don't think you need to optimize that case.

What I'm more concerned about is your use of shared memory.  I didn't
have time to look at the patch, but how are you determining an upper
bound on the amount of memory you need?  What sort of locking and
contention issues will there be?

Another point is that this will render the results from heapscans
unstable, since different executions of the same query might start
at different points.  This would for example probably break many
of the regression tests.  We can deal with that if we have to, but
it raises the bar of how much benefit I'd want to see from the patch.

One detail that might or might not be significant: different scans are
very likely to have slightly different ideas about where the end of the
table is, since they determine this with an lseek(SEEK_END) at the
instant they start the scan.  I don't think this invalidates your idea
but you need to watch out for corner-case bugs in the coding.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] [pgsql-hackers] Daily digest v1.4988 (21 messages)

2005-02-25 Thread Josh Berkus
Tom,

 As a proposal: feature freeze maybe early summer (June or July), beta
 maybe Aug/Sep, final as always when it's ready (maybe Oct/Nov with
 a good tailwind).

I thought we were trying to get away from a midsummer feature freeze, due to 
the general lack of personnel in that season?   I can tell you that, while 
I'm probably the least critical person for a feature freeze, I will be 
unavailable for anything development-related from July 10 to August 6th.   
And at least a dozen PG people will be presenting at OSCON, which means that 
their attention will be divided in the last week of July.  And there's a 
bunch of European conventions in June, for that matter.

So I'd advocate either freezing in May, or in September.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Development schedule

2005-02-25 Thread Josh Berkus
Tom,

 Well, if you take the summer-vacation argument seriously, then nothing
 will get done between May and September anyway, so we may as well freeze
 in May ;-)

 I'd be happy with saying June 1.

Hey, you and Bruce are the ones who'll get stuck with all the code checking if 
nobody else is available, like last year.  So it's your call.

Better, you should maybe check with the committers when people will be 
available.

Also, what do you think of Simon's plan for a 2-stage feature freeze?   Maybe 
not so far apart ... maybe a month apart?

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Development schedule

2005-02-25 Thread Tom Lane
Josh Berkus josh@agliodbs.com writes:
 I thought we were trying to get away from a midsummer feature freeze, due to 
 the general lack of personnel in that season?
 ...
 So I'd advocate either freezing in May, or in September.

Well, if you take the summer-vacation argument seriously, then nothing
will get done between May and September anyway, so we may as well freeze
in May ;-)

I'd be happy with saying June 1.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] [pgsql-hackers] Daily digest v1.4988 (21 messages)

2005-02-25 Thread Peter Eisentraut
Josh Berkus wrote:
 I thought we were trying to get away from a midsummer feature freeze,
 due to the general lack of personnel in that season?

Better to do feature freeze with no one around than development or 
release preparations with no one around, no?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Development schedule

2005-02-25 Thread Marc G. Fournier
On Fri, 25 Feb 2005, Josh Berkus wrote:
Also, what do you think of Simon's plan for a 2-stage feature freeze? 
Maybe not so far apart ... maybe a month apart?
I missed that ... could you re-summarize?

Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Development schedule

2005-02-25 Thread Marc G. Fournier
On Fri, 25 Feb 2005, Tom Lane wrote:
Josh Berkus josh@agliodbs.com writes:
I thought we were trying to get away from a midsummer feature freeze, due to
the general lack of personnel in that season?
...
So I'd advocate either freezing in May, or in September.
Well, if you take the summer-vacation argument seriously, then nothing
will get done between May and September anyway, so we may as well freeze
in May ;-)
I'd be happy with saying June 1.
I concur with Josh on this ... that kinda wastes the 'two months of 
summer' when ppl are really sporatically around, so no really testing will 
get done ... I'd rather see a Sept 1st feature freeze, once most ppl are 
back from holidays and are a bit more steady ... it means those working on 
the big features have a few extra months to hammer out the kinks, and 
those  testing are a bit more 'consistent/focused' then they are when they 
are planning, or on, holidays ;)


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-02-25 Thread John Hansen
  select upper('æøå');
  ERROR:  invalid multibyte character for locale
  HINT:  The server's LC_CTYPE locale is probably 
 incompatible with the database encoding.
 
  Consequently it seems that is does not work.
 
 It fails on my machine should not be read as it doesn't 
 work for anyone.
 It all depends on how your local mbstowcs() works.

Ok,... Do you have an example of a system on which it works?


... John

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Development schedule

2005-02-25 Thread Tom Lane
Josh Berkus josh@agliodbs.com writes:
 Also, what do you think of Simon's plan for a 2-stage feature freeze?   Maybe
 not so far apart ... maybe a month apart?

I didn't feel a need for it.  It's true that the closer we get to
feature freeze, the smaller the patch you should expect to drop on us
sight unseen.  Simon's proposal implies that this is a binary condition,
but it's really more of a continuous process.  Another point is that
we've never wanted to discourage people from going full tilt right up
to feature freeze; if we say you must have something credible X months
before freeze, that diminishes the value of free time that people might
have after that point.

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Development schedule

2005-02-25 Thread Josh Berkus
Marc,

 I missed that ... could you re-summarize?

Sure, Simon proposed that we have a feature freeze for major features (like 
bitmapped indexes and 2PC) before the feature freeze for minor 
features (like new system views).   The reason being that the major features 
need a lot more code checking, and may affect the implementation of the minor 
features.

I'd suggest a month interval.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] [pgsql-hackers] Daily digest v1.4988 (21 messages)

2005-02-25 Thread Marc G. Fournier
On Fri, 25 Feb 2005, Peter Eisentraut wrote:
Josh Berkus wrote:
I thought we were trying to get away from a midsummer feature freeze,
due to the general lack of personnel in that season?
Better to do feature freeze with no one around than development or
release preparations with no one around, no?
I'd say the other way around ... at least when 'noone is around', one 
person that is can still work on the feature they are working on ... 
feature freeze when nobody around means the code stagnants since nobody is 
around to test/give feedback ...


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] idea for concurrent seqscans

2005-02-25 Thread Jeff Davis
On Fri, 2005-02-25 at 12:54 -0500, Tom Lane wrote:
 Jeff Davis [EMAIL PROTECTED] writes:
  (1) Do we care about reverse scans being done with synchronized
  scanning? If so, is there a good way to know in advance whether it is
  going to be a forward or reverse scan (i.e. before heap_getnext())?
 
 There are no reverse heapscans --- the only case where you'll see
 direction = backwards is while backing up a cursor with FETCH BACKWARD.
 I don't think you need to optimize that case.
 

Ok, I was wondering about that.

 What I'm more concerned about is your use of shared memory.  I didn't
 have time to look at the patch, but how are you determining an upper
 bound on the amount of memory you need?  What sort of locking and
 contention issues will there be?

Right now a scanning backend puts the page it's scanning into shared
memory when it gets a new page (so it's not every tuple). I haven't
determined whether this will be a major point of locking contention.
However, one possible implementation seems to solve both problems at
once:
Let's say we just had a static hash table of size
100*sizeof(oid)*sizeof(blocknumber) (to hold the relation's oid and the
page number it's currently scanning). The relid would predetermine the
placement in the table. If there's a collision, overwrite. I don't think
much is lost in that case, unless, for example, two tables in an
important join have oids that hash to the same value. In that case the
effectiveness of synchronized scanning will be lost, but not worse than
the current behavior.
Let's say we didn't use any locks at all. Are there any real dangers
there? If there's a race, and one backend gets some garbage data, it can
just say this is out of bounds, start the scan at 0. Since it's a
static hash table, we don't have to worry about following a bad pointer,
etc. If that looks like it will be a problem, I can test with locking
also to see what kind of contention there is.

The current patch I sent was very much a proof of concept, but all it
did was have a shared mem segment of size 8 bytes (only holds info for
one relid at a time). That would probably be somewhat effective in many
cases, but of course we want it larger than that (800? 8KB?).

In short, I tried to overcome these problems with simplicity. Where
simplicity doesn't work I default to starting the scan at 0. Hopefully
those non-simple cases (like hash collisions and shared memory races)
are rare enough that we don't lose all that we gain.

 Another point is that this will render the results from heapscans
 unstable, since different executions of the same query might start
 at different points.  This would for example probably break many
 of the regression tests.  We can deal with that if we have to, but
 it raises the bar of how much benefit I'd want to see from the patch.
 

I didn't consider that. Is there a reason the regression tests assume
the results will be returned in a certain order (or a consistent order)?

 One detail that might or might not be significant: different scans are
 very likely to have slightly different ideas about where the end of the
 table is, since they determine this with an lseek(SEEK_END) at the
 instant they start the scan.  I don't think this invalidates your idea
 but you need to watch out for corner-case bugs in the coding.
 

I only see that as an issue in initscan(), where it sets the start page.
A simple bounds check would cure that, no? If it was out of bounds, set
the start page to zero, and we didn't lose much. I need a bounds check
there anyway, since the data we get from shared memory needs to be
validated. That bounds check would be comparing against the current
backend's scan-rs_nblocks, which should be the correct number for that
backend.

Regards,
Jeff Davis



---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-02-25 Thread Tom Lane
John Hansen [EMAIL PROTECTED] writes:
 It fails on my machine should not be read as it doesn't 
 work for anyone.
 It all depends on how your local mbstowcs() works.

 Ok,... Do you have an example of a system on which it works?

On HPUX 10.20, mbstowcs seems to treat all byte values as single-byte
characters in C locale, so my sample-of-one says that it works
everywhere ;-).

Nonetheless, it's clear that in C locale mbstowcs cannot be buying us
anything compared to using the old ctype.h macros, so I'm fine with
adding a check on the locale as per previous discussion.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [JDBC] [HACKERS] Where are we on stored procedures?

2005-02-25 Thread Tom Lane
Markus Schaber [EMAIL PROTECTED] writes:
 Tom Lane schrieb:
 given the improved support in 8.0 for anonymous record types, we could
 in theory have the backend invent a record type on-the-fly to match
 whatever list of OUT parameters a particular function has.)

 It would not be necessarily on the fly, at least in the first step we
 possibly get away with declaraing the returned tuples at creation time
 and implicitly creating those tuple types. The declaration could be like
 returns (touchedrows int, somethingelse datetime), setof (article int,
 description text) for a function/method that has two resultsets, one of
 those with always one row.

The advantage of not explicitly creating the rowtypes is that we don't
need to worry about choosing nonconflicting names for them.  So I think
I'd go down the anonymous-rowtype path even in the first cut.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] idea for concurrent seqscans

2005-02-25 Thread Tom Lane
Jeff Davis [EMAIL PROTECTED] writes:
 I didn't consider that. Is there a reason the regression tests assume
 the results will be returned in a certain order (or a consistent order)?

We use diff as the checking tool.

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] UTF8 or Unicode

2005-02-25 Thread Bruce Momjian
Peter Eisentraut wrote:
 Am Freitag, 25. Februar 2005 16:26 schrieb Bruce Momjian:
  OK, but what about latin1?
 
 The following character set names are specified in the SQL standard and 
 therefore somewhat non-negotiable:
 
 SQL_CHARACTER
 GRAPHIC_IRV
 LATIN1
 ISO8BIT
 UTF16
 UTF8
 UCS2
 SQL_TEXT
 SQL_IDENTIFIER
 
 So we have to use LATIN1, even though it creates an inconsistency.  We 
 discussed this a while ago during the last great renaming, I think.
 

Oh, UTF8 and not UTF-8?  I thought UTF-8 was the standard name, but if
ANSI uses UTF8 we will have to use that.

 Btw., I think ISO8BIT is the correct name for what we call SQL_ASCII, but I 
 haven't analyzed that in detail, yet.

OK, please let us know.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-02-25 Thread John Hansen
 On HPUX 10.20, mbstowcs seems to treat all byte values as 
 single-byte characters in C locale, so my sample-of-one says 
 that it works everywhere ;-).

Right, so for the sample SQL I sent earlier, the result would be the same as 
the input?
That's hardly a working upper/lower

If a character doesn't have case then fine, but one that does, should at least 
produce a warning if it cannot be converted.


... John

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32

2005-02-25 Thread Tom Lane
John Hansen [EMAIL PROTECTED] writes:
 Right, so for the sample SQL I sent earlier, the result would be the same as 
 the input?
 That's hardly a working upper/lower

[ shrug... ]  It works per the locale definition, which is that only
7-bit-ASCII a-z/A-Z get converted.

The bottom line here is that we rely on the locale setting for this
behavior, and that's not likely to change real soon.  If you dislike
the locale definition then you should be using a different locale.
In particular I think the issue here is really that your platform's
definition of C locale says that bytes above x7F are illegal
characters.  My platform's doesn't.  The thing to be changing is the
locale definition.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


[HACKERS] CAN-2005-0247

2005-02-25 Thread yo mero
Hello  I saw this :

http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=2005-0247

is  8.0.1  Safe ?
is there a patch ?

thank you 

Leonel

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Development schedule

2005-02-25 Thread Tom Lane
Marc G. Fournier [EMAIL PROTECTED] writes:
 Josh Berkus josh@agliodbs.com writes:
 I thought we were trying to get away from a midsummer feature freeze, due to
 the general lack of personnel in that season?

 I concur with Josh on this ... that kinda wastes the 'two months of 
 summer' when ppl are really sporatically around, so no really testing will 
 get done ... I'd rather see a Sept 1st feature freeze, once most ppl are 
 back from holidays and are a bit more steady ... it means those working on 
 the big features have a few extra months to hammer out the kinks, and 
 those  testing are a bit more 'consistent/focused' then they are when they 
 are planning, or on, holidays ;)

The thing is, if we target feature freeze for September then I think
there is 0 chance of the 8.1 cycle being less than a year -- even with
a fairly short feature freeze and beta cycle you're getting into
December unless there are no slips at all.  And we tried and failed to
release in December this last time; it's got the same
people-aren't-paying-attention problem as the summer.

If this were an ordinary devel cycle then I'd be fine with it running a
year, but I think we really do need to plan for a shorter than normal
cycle so we can clean up 8.0 kinks in a reasonably timely fashion.

Also, I'm unconvinced that we can't do post-feature-freeze cleanup
during the summer.  If we have say a beta2 by the time September
comes, then people returning from vacation will have something to
beat on, and I think it will go well.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] CAN-2005-0247

2005-02-25 Thread Tom Lane
yo mero [EMAIL PROTECTED] writes:
 Hello  I saw this :
 http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=2005-0247
 is  8.0.1  Safe ?

No --- at least not the source-code releases.  Red Hat's 8.0.1 RPMs
include the patch, and I think the latest community RPMs do too.

 is there a patch ?

http://developer.postgresql.org/cvsweb.cgi/pgsql/src/pl/plpgsql/src/gram.y.diff?r1=texttr1=1.64.4.1r2=texttr2=1.64.4.3

The patches for other branches besides 8.0 are available from CVS
as well.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Interesting NetBSD annual report

2005-02-25 Thread Jim C. Nasby
On Fri, Feb 25, 2005 at 01:27:09PM -0400, Marc G. Fournier wrote:
 On Fri, 25 Feb 2005, Jim C. Nasby wrote:
 
 On Fri, Feb 25, 2005 at 08:42:51AM -0500, D'Arcy J.M. Cain wrote:
 They complain about release delays and the challenge of organizing
 volunteers, just like us.
 
 Is there any volunteer organization that's any different?  :-)
 
 FreeBSD doesn't seem to suffer much from either issue. Then again, they
 also have something like 2000 committers, probably about 100 of which
 are very active.
 
 FreeBSD suffers from *alot* of other things because of havin gtoo many 
 committers :(  As hard-core FreeBSD as I've been over the years, things 
 haven't been the same since Jordan left ... to the point that I'm already 
 planning on moving over to DragonFlyBSD (fork project of FreeBSD ... 
 same base code as what we're currently using [ie. FreeBSD 4.x] but 
 with *alot* of fixes for some pretty major problems that still exist 
 in the current version of FreeBSD) over the next few months ... they've 
 already re-written the VFS system, which we tax heavily, but haven't 
 finished the rewrite of the unionfs code (which is how we tax VFS) which 
 is in the works ...
 
 Love the OS ... not so haappy with the direction its taken :(

Hrm, guess it's been too long since I've been subscribed to -current. I
know they're still working bugs out of 5.x, but I attributed that to the
major changes made. One of the big reasons I've been such a fan of FBSD
is the technical direction was set by a group of very competent people,
instead of just one person.
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: Where do you want to go today?
Linux: Where do you want to go tomorrow?
FreeBSD: Are you guys coming, or what?

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] CAN-2005-0247

2005-02-25 Thread yo mero
On Fri, 25 Feb 2005 14:40:25 -0500, Tom Lane [EMAIL PROTECTED] wrote:
 yo mero [EMAIL PROTECTED] writes:
  Hello  I saw this :
  http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=2005-0247
  is  8.0.1  Safe ?
 
 No --- at least not the source-code releases.  Red Hat's 8.0.1 RPMs
 include the patch, and I think the latest community RPMs do too.
 
  is there a patch ?
 
 http://developer.postgresql.org/cvsweb.cgi/pgsql/src/pl/plpgsql/src/gram.y.diff?r1=texttr1=1.64.4.1r2=texttr2=1.64.4.3
 
 The patches for other branches besides 8.0 are available from CVS
 as well.
 
 regards, tom lane
 


patched  , compiled,   postgresql restarted 

thank you 


leonel

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Interesting NetBSD annual report

2005-02-25 Thread Marc G. Fournier
On Fri, 25 Feb 2005, Jim C. Nasby wrote:
Hrm, guess it's been too long since I've been subscribed to -current. I
know they're still working bugs out of 5.x, but I attributed that to the
major changes made. One of the big reasons I've been such a fan of FBSD
is the technical direction was set by a group of very competent people,
instead of just one person.
The problem right now is that there are too many 'very competent people' 
...


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faq


Re: [HACKERS] Development Plans

2005-02-25 Thread Bruce Momjian
Jim C. Nasby wrote:
 On Fri, Feb 25, 2005 at 10:49:59AM -0500, Bruce Momjian wrote:
  I have no idea how to predict what will be in 8.1.  I couldn't predict
  what would be in 8.0 until just before feature freeze, so the idea that
  we would have any clue about 8.1 is unrealistic.
  
  How do other open source projects predict these things? The most visible
  project I know that did that was Mozilla, and it was very unpredictive,
  and they had a higher percentage of paid folks than we do.
 
 Not to sound like a broken FreeBSD drum, but they manage to do it, and
 afaik a pretty good job of it.
 http://www.freebsd.org/releases/5.4R/todo.html is an example.
 http://www.freebsd.org/releases/5.4R/schedule.html is also interesting.
 
 I suspect a big part of why/how they can do this is they have a much
 larger developer pool than PostgreSQL; I believe there's over 2000
 people with commit access, and there are probably 100-200 people who are
 actively developing code for FBSD. But it's been some time since I
 followed the details of FBSD development, so I could be way off on these
 WAGs.

Uh, we could do it too if we didn't require each release to be as stable
as the previous one.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [pgsql-advocacy] [HACKERS] Development Plans

2005-02-25 Thread Bruce Momjian
Tom Lane wrote:
 Bruce used to try to describe our releases as being oriented towards some
 particular goal, but I always thought that these were after-the-fact
 descriptions that had nothing to do with the real development process.
 The truth is that individual developers work on what they feel like,
 or find interesting, or in some cases get paid to do.  But just as
 there's not a master plan, there's not really some guiding vision that
 in a particular release we are going to focus on X or Y or Z.

Right, my analysis was always after-the-fact in an attempt to put an
understandable story around the release features, nothing more.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Development schedule

2005-02-25 Thread Bruce Momjian
Tom Lane wrote:
 Marc G. Fournier [EMAIL PROTECTED] writes:
  Josh Berkus josh@agliodbs.com writes:
  I thought we were trying to get away from a midsummer feature freeze, due 
  to
  the general lack of personnel in that season?
 
  I concur with Josh on this ... that kinda wastes the 'two months of 
  summer' when ppl are really sporatically around, so no really testing will 
  get done ... I'd rather see a Sept 1st feature freeze, once most ppl are 
  back from holidays and are a bit more steady ... it means those working on 
  the big features have a few extra months to hammer out the kinks, and 
  those  testing are a bit more 'consistent/focused' then they are when they 
  are planning, or on, holidays ;)
 
 The thing is, if we target feature freeze for September then I think
 there is 0 chance of the 8.1 cycle being less than a year -- even with
 a fairly short feature freeze and beta cycle you're getting into
 December unless there are no slips at all.  And we tried and failed to
 release in December this last time; it's got the same
 people-aren't-paying-attention problem as the summer.
 
 If this were an ordinary devel cycle then I'd be fine with it running a
 year, but I think we really do need to plan for a shorter than normal
 cycle so we can clean up 8.0 kinks in a reasonably timely fashion.

Let's see how much 8.0 cleanup we need.  At this point I haven't seen
any major things needing cleanup.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Development schedule

2005-02-25 Thread Tom Lane
Bruce Momjian pgman@candle.pha.pa.us writes:
 Tom Lane wrote:
 If this were an ordinary devel cycle then I'd be fine with it running a
 year, but I think we really do need to plan for a shorter than normal
 cycle so we can clean up 8.0 kinks in a reasonably timely fashion.

 Let's see how much 8.0 cleanup we need.  At this point I haven't seen
 any major things needing cleanup.

However, people are asking us for a schedule target now; wait and see
isn't the answer they need.  My feeling is that we should bet on there
being some issues, rather than bet on there not being any.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Development schedule

2005-02-25 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian pgman@candle.pha.pa.us writes:
  Tom Lane wrote:
  If this were an ordinary devel cycle then I'd be fine with it running a
  year, but I think we really do need to plan for a shorter than normal
  cycle so we can clean up 8.0 kinks in a reasonably timely fashion.
 
  Let's see how much 8.0 cleanup we need.  At this point I haven't seen
  any major things needing cleanup.
 
 However, people are asking us for a schedule target now; wait and see
 isn't the answer they need.  My feeling is that we should bet on there
 being some issues, rather than bet on there not being any.

Uh, they want to know now?  My feeling is that if there were major
issues, we would have heard about them already --- it has been over a
month since 8.0.  It has been a long time since we have had to push out
a major to fix a hard problem in the previous major.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[HACKERS] Modifying COPY TO

2005-02-25 Thread Dave Held
Title: Modifying COPY TO






I am interested in hacking COPY TO such that one can specify that

rows are copied in a certain index order. I got as far as 

src/backend/commands/copy.c:CopyTo(), and it looks like I would need

to modify the call to heap_beginscan() so that it uses a key. However,

I couldn't figure out how to provide one, or if I'm even looking at the

right area. Ideally, this behavior would be specified with a flag,

perhaps: WITH INDEX index_name or WITH PRIMARY KEY

or something similar.


The motivation for this change is as follows. I have a fairly large

database (10 million+ records) that mirrors the data in a proprietary

system. The only access to that data is through exported flat files.

Currently, those flat files are copied directly into a staging area in the 

db via a COPY FROM, the actual tables are truncated, and the

staging data is inserted into the live tables. Since the data is read-only,

it doesn't matter that it is recreated every day. However, as you

can imagine, the import process takes quite a while (several hours).

Also, rebuilding the db from scratch every day loses any statistical

information gathered from the execution of queries during the day.


A possibility that I would like to pursue is to keep the staging data

from the previous day, do a COPY TO, import the new data into

another staging table with a COPY FROM, then export the fresh

data with another COPY TO. Then, I can write a fast C/C++

program to do a line-by-line comparison of each record, isolating

the ones that have changed from the previous day. I can then

emit those records in a change file that should be relatively small

and easy to update. Of course, this scheme can only work if

COPY TO emits the records in a reliable order.


Any assistance on this project would be greatly appreciated. The

best I can see, I'm stuck on line 1053 from copy.c:


 scandesc = heap_beginscan(rel, mySnapshot, 0, NULL);


I suspect that I want it to look like this:


 scandesc = heap_beginscan(rel, mySnapshot, 1, key);


where 'key' is an appropriately constructed ScanKey. It looks

like I want to call ScanKeyEntryInitialize(), but I'm not sure what

parameters I need to pass to it to get an index or the primary

key. I mostly need help building the ScanKey object. I think I 

can figure out how to hack the custom option, etc. I should 

mention that I am using the 7.4.7 codebase on Linux 2.4.


__

David B. Held

Software Engineer/Array Services Group

200 14th Ave. East, Sartell, MN 56377

320.534.3637 320.253.7800 800.752.8129






Re: [HACKERS] Modifying COPY TO

2005-02-25 Thread Tom Lane
Dave Held [EMAIL PROTECTED] writes:
 I am interested in hacking COPY TO such that one can specify that
 rows are copied in a certain index order.  I got as far as=20
 src/backend/commands/copy.c:CopyTo(), and it looks like I would need
 to modify the call to heap_beginscan() so that it uses a key.

Actually you'd need to change it to be an index_beginscan call.

Offhand I don't think you need a scan key; the point of a scan key is to
filter the returned rows, and it doesn't sound like that's what you
want.  So it should work to do

scandesc = index_beginscan(rel, idx, mySnapshot, 0, NULL);

(plus appropriate changes to the usage of the scandesc)
which leaves only the problem of opening the proper index.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Modifying COPY TO

2005-02-25 Thread Stephen Frost
* Dave Held ([EMAIL PROTECTED]) wrote:
 A possibility that I would like to pursue is to keep the staging data
 from the previous day, do a COPY TO, import the new data into
 another staging table with a COPY FROM, then export the fresh
 data with another COPY TO.  Then, I can write a fast C/C++
 program to do a line-by-line comparison of each record, isolating
 the ones that have changed from the previous day.  I can then
 emit those records in a change file that should be relatively small
 and easy to update.  Of course, this scheme can only work if
 COPY TO emits the records in a reliable order.

sort -u old new  changes

?

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] idea for concurrent seqscans

2005-02-25 Thread Jim C. Nasby
On Fri, Feb 25, 2005 at 01:30:57PM -0500, Tom Lane wrote:
 Jeff Davis [EMAIL PROTECTED] writes:
  I didn't consider that. Is there a reason the regression tests assume
  the results will be returned in a certain order (or a consistent order)?
 
 We use diff as the checking tool.

Doesn't the SQL spec specifically state that the only time you'll get
results in a deterministic order is if you use ORDER BY? Assuming
otherwise seems a bad idea (though at least in the case of testing it
makes the test more strenuous rather than less...)
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: Where do you want to go today?
Linux: Where do you want to go tomorrow?
FreeBSD: Are you guys coming, or what?

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] idea for concurrent seqscans

2005-02-25 Thread Jeff Davis
On Fri, 2005-02-25 at 13:30 -0500, Tom Lane wrote:
 Jeff Davis [EMAIL PROTECTED] writes:
  I didn't consider that. Is there a reason the regression tests assume
  the results will be returned in a certain order (or a consistent order)?
 
 We use diff as the checking tool.
 

Well, that does make testing more difficult, or it at least requires
extra work to make the regression tests understand the results better.

I'll sumbmit a better patch, and then if everyone decides it's worth the
hassle with the regression tests, we can use it in 8.1. Some more
testing is required to see if the results are really as good as we hope.

Regards,
Jeff Davis


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Modifying COPY TO

2005-02-25 Thread Jim C. Nasby
Instead of just being able to specify an index to use, how hard would it
be to allow COPY TO to work from a generic query instead of just a
table? That way you wouldn't need to do a diff, you could just export
only new records. ISTM there's far more use for copying from a query
result that just copying by index order.
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: Where do you want to go today?
Linux: Where do you want to go tomorrow?
FreeBSD: Are you guys coming, or what?

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] idea for concurrent seqscans

2005-02-25 Thread Jeff Davis
On Fri, 2005-02-25 at 18:03 -0600, Jim C. Nasby wrote:
 On Fri, Feb 25, 2005 at 01:30:57PM -0500, Tom Lane wrote:
  Jeff Davis [EMAIL PROTECTED] writes:
   I didn't consider that. Is there a reason the regression tests assume
   the results will be returned in a certain order (or a consistent order)?
  
  We use diff as the checking tool.
 
 Doesn't the SQL spec specifically state that the only time you'll get
 results in a deterministic order is if you use ORDER BY? Assuming
 otherwise seems a bad idea (though at least in the case of testing it
 makes the test more strenuous rather than less...)

True, that was my reasoning when I proposed synchronized scanning.

Keep in mind that this is a criticism of only the regression tests, not
the RDBMS itself.

I don't know much about the regression tests, so maybe it's impractical
to not assume consistent order. I'm sure the developers will vote one
way or the other. I hate to throw away a potential performance boost,
but I also hate to burden the developers with rewriting a lot of
regression tests when their time could be better spent elsewhere.

Regards,
Jeff Davis







---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] idea for concurrent seqscans

2005-02-25 Thread Bruce Momjian

Sorry, please disregard my ramblings.  I thought it was a different
question.

---

pgman wrote:
 Jim C. Nasby wrote:
  On Fri, Feb 25, 2005 at 01:30:57PM -0500, Tom Lane wrote:
   Jeff Davis [EMAIL PROTECTED] writes:
I didn't consider that. Is there a reason the regression tests assume
the results will be returned in a certain order (or a consistent order)?
   
   We use diff as the checking tool.
  
  Doesn't the SQL spec specifically state that the only time you'll get
  results in a deterministic order is if you use ORDER BY? Assuming
  otherwise seems a bad idea (though at least in the case of testing it
  makes the test more strenuous rather than less...)
 
 The only trick I can think of is to use SELECT ... INTO TEMPORARY tab
 ... oRDER BY and then use COPY to dump the table.  It will then dump in
 the order of the ORDER BY.
 
 -- 
   Bruce Momjian|  http://candle.pha.pa.us
   pgman@candle.pha.pa.us   |  (610) 359-1001
   +  If your life is a hard drive, |  13 Roberts Road
   +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Modifying COPY TO

2005-02-25 Thread Bruce Momjian
Jim C. Nasby wrote:
 Instead of just being able to specify an index to use, how hard would it
 be to allow COPY TO to work from a generic query instead of just a
 table? That way you wouldn't need to do a diff, you could just export
 only new records. ISTM there's far more use for copying from a query
 result that just copying by index order.

The reason we don't support queries with COPY is because COPY reads from
the raw heap rather than going through the executor.  We could
special-case it so queries go through the executor but would take a
little work.

The only trick I can think of for now is to use SELECT ... INTO
TEMPORARY tab ... oRDER BY and then use COPY to dump the table.  It will
then dump in the order of the ORDER BY.


-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] idea for concurrent seqscans

2005-02-25 Thread Bruce Momjian
Jim C. Nasby wrote:
 On Fri, Feb 25, 2005 at 01:30:57PM -0500, Tom Lane wrote:
  Jeff Davis [EMAIL PROTECTED] writes:
   I didn't consider that. Is there a reason the regression tests assume
   the results will be returned in a certain order (or a consistent order)?
  
  We use diff as the checking tool.
 
 Doesn't the SQL spec specifically state that the only time you'll get
 results in a deterministic order is if you use ORDER BY? Assuming
 otherwise seems a bad idea (though at least in the case of testing it
 makes the test more strenuous rather than less...)

The only trick I can think of is to use SELECT ... INTO TEMPORARY tab
... oRDER BY and then use COPY to dump the table.  It will then dump in
the order of the ORDER BY.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] idea for concurrent seqscans

2005-02-25 Thread Jim C. Nasby
On Fri, Feb 25, 2005 at 04:30:17PM -0800, Jeff Davis wrote:
 On Fri, 2005-02-25 at 18:03 -0600, Jim C. Nasby wrote:
  On Fri, Feb 25, 2005 at 01:30:57PM -0500, Tom Lane wrote:
   Jeff Davis [EMAIL PROTECTED] writes:
I didn't consider that. Is there a reason the regression tests assume
the results will be returned in a certain order (or a consistent order)?
   
   We use diff as the checking tool.
  
  Doesn't the SQL spec specifically state that the only time you'll get
  results in a deterministic order is if you use ORDER BY? Assuming
  otherwise seems a bad idea (though at least in the case of testing it
  makes the test more strenuous rather than less...)
 
 True, that was my reasoning when I proposed synchronized scanning.
 
 Keep in mind that this is a criticism of only the regression tests, not
 the RDBMS itself.
 
 I don't know much about the regression tests, so maybe it's impractical
 to not assume consistent order. I'm sure the developers will vote one
 way or the other. I hate to throw away a potential performance boost,
 but I also hate to burden the developers with rewriting a lot of
 regression tests when their time could be better spent elsewhere.

Certainly, but I suspect it's just a matter of adding ORDER BY to
everything, which just about anyone (even myself!) should be able to do.
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: Where do you want to go today?
Linux: Where do you want to go tomorrow?
FreeBSD: Are you guys coming, or what?

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Modifying COPY TO

2005-02-25 Thread Andrew - Supernews
On 2005-02-25, Dave Held [EMAIL PROTECTED] wrote:
 A possibility that I would like to pursue is to keep the staging data
 from the previous day, do a COPY TO, import the new data into
 another staging table with a COPY FROM, then export the fresh
 data with another COPY TO.  Then, I can write a fast C/C++
 program to do a line-by-line comparison of each record, isolating
 the ones that have changed from the previous day.  I can then
 emit those records in a change file that should be relatively small
 and easy to update.

I have an application that does something like this, but rather than use an
external program, I do the comparison in the database itself:

  - import data from external system into a temporary table
  - compare the temporary table against the live data (a full outer join
is a convenient way of doing this - I create an index on the temp table
first)
  - perform insert/update/delete for each record that was added, changed
or removed

In my case the compare/update is in a pl/pgsql function. My data is only
2-3 million rows, a bit smaller than yours, but I have to update hourly,
not daily, and spend no more than 5-10 minutes on each update (currently
I can do it in 5: 2 to load the data, 3 to do the compare/update).

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Development Plans

2005-02-25 Thread Jeff Hoffmann
On Feb 25, 2005, at 10:36 AM, Tom Lane wrote:
Matthew T. O'Connor matthew@zeut.net writes:
Are there any big projects are people
working on to get into 8.1?
I'm privately hoping to get bitmap index operations into 8.1 (that is,
build a bitmap of tuple locations from an index, possibly AND or OR the
results of multiple indexes, and finally visit the heap in CTID order).
This is not as big as say the Windows port, but it's easily a solid
month or two of effort.
I'd be very excited to see this implemented so I'm happy someone so 
high-profile seems to be interested in it getting done.  Were you 
planning on doing it yourself or are there people you know who are 
actively working on it?  When you say hoping is that probably, but 
no guarantees or is it more it'd be nice if someone took the lead on 
this?  I'm not offering, I don't think I'd be capable of doing it, but 
I'd certainly be interested in following the development and helping 
with testing.

--
Jeff Hoffmann
[EMAIL PROTECTED]
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [HACKERS] idea for concurrent seqscans

2005-02-25 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes:
 but I also hate to burden the developers with rewriting a lot of
 regression tests when their time could be better spent elsewhere.

 Certainly, but I suspect it's just a matter of adding ORDER BY to
 everything, which just about anyone (even myself!) should be able to do.

Performance is not the issue; test coverage, however, is an issue.
See the comment at the end of
http://developer.postgresql.org/docs/postgres/regress-evaluation.html#AEN22383

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Modifying COPY TO

2005-02-25 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes:
 ... ISTM there's far more use for copying from a query
 result that just copying by index order.

Yeah.  The other point is that it's entirely likely that an external
sort will be faster than using an indexscan to produce the sorted order.
If you instead create a command like 
COPY FROM (SELECT whatever FROM foo ORDER BY something)
then you give the optimizer a chance at deciding what to do ... not that
I promise it will get it right, but a fixed choice is certain to be
wrong.

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] idea for concurrent seqscans

2005-02-25 Thread Jim C. Nasby
On Fri, Feb 25, 2005 at 11:51:40PM -0500, Tom Lane wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:
  but I also hate to burden the developers with rewriting a lot of
  regression tests when their time could be better spent elsewhere.
 
  Certainly, but I suspect it's just a matter of adding ORDER BY to
  everything, which just about anyone (even myself!) should be able to do.
 
 Performance is not the issue; test coverage, however, is an issue.
 See the comment at the end of
 http://developer.postgresql.org/docs/postgres/regress-evaluation.html#AEN22383

Assuming you're talkning about You might wonder why we don't order all
the regression test queries explicitly to get rid of this issue once and
for all. The reason is that that would make the regression tests less
useful, not more, since they'd tend to exercise query plan types that
produce ordered results to the exclusion of those that don't., good
point. I can think of 2 ways around this:

1) Select into a temptable, then select out of it with an order by

2) Run the output through sort before doing the diff

Is there any reason one of these wouldn't work?
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: Where do you want to go today?
Linux: Where do you want to go tomorrow?
FreeBSD: Are you guys coming, or what?

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly