Re: [HACKERS] PostGIS Integration

2004-02-04 Thread Christopher Kings-Lynne
I can't see any way to handle parameterized types without extending the
grammar individually for each one --- otherwise it's too hard to tell
them apart from function calls.  That makes it a bit hard to do 'em
as plug-ins :-(.  The grammar hacks are certainly ugly though, and if
someone could think of a way, I'm all ears...
Disallow it in table definitions, but allow it in domain definitions...

Chris

---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [HACKERS] PostGIS Integration

2004-02-04 Thread Tom Lane
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
 I can't see any way to handle parameterized types without extending the
 grammar individually for each one --- otherwise it's too hard to tell
 them apart from function calls.

 Disallow it in table definitions, but allow it in domain definitions...

Those two cases are not hard, because in those scenarios the parser
knows it is expecting a type specification.  The real problem is this
syntax for typed literals:
typename 'string'
which occurs in ordinary expressions.  So when you see name( you
aren't real sure if you're seeing the start of a function call or the
start of a typed-literal construct.  And it's very hard to postpone that
decision until you see what comes after the right paren.
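To illustrate the collision (a sketch, not from the thread, relying on the special-cased grammar for built-in types like varchar):

```sql
-- Typed literal: here "varchar(2)" is a type specification, handled by
-- a dedicated grammar production for this built-in type ...
SELECT varchar(2) 'ab';

-- ... while here "substr(" opens an ordinary function call.  Both begin
-- with identifier-then-left-paren, which is all the parser sees at first.
SELECT substr('abcd', 1, 2);
```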

regards, tom lane



Re: [HACKERS] PostGIS Integration

2004-02-04 Thread Tom Lane
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
 Those two cases are not hard, because in those scenarios the parser
 knows it is expecting a type specification.  The real problem is this
 syntax for typed literals:
 typename 'string'

 Just disallow that particular case for custom types :P

Well, maybe we could --- comments?  Tom Lockhart went to some lengths to
support that, but now that he's gafiated we could perhaps rethink it.
AFAICS the SQL spec only requires this syntax for certain built-in types.
Tom wanted to generalize that to all datatypes that Postgres supports,
and that seems like a reasonable goal ... but if it prevents getting to
other reasonable goals then we ought to think twice.

 Will this work:  'string'::typename

Yes, since the :: cues the parser to expect a typename next.
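In other words, the postfix-cast form is unambiguous for any type, including user-defined ones; a small sketch using a built-in geometric type:

```sql
-- The :: (or CAST) form works for arbitrary types, because the parser
-- knows a type name must follow the cast operator.
SELECT '(1,2)'::point;
SELECT CAST ('(1,2)' AS point);
```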

regards, tom lane



Re: [HACKERS] Recursive queries?

2004-02-04 Thread Hans-Jürgen Schönig
Christopher Kings-Lynne wrote:
There is a website somewhere where a guy posts his patch he is 
maintaining that does it.  I'll try to find it...


Found it.  Check it out:

http://gppl.terminal.ru/index.eng.html

Patch is current for 7.4, Oracle syntax.

Chris


I had a look at the patch.
It is still in development, but it seems to work nicely - at least I have
been able to get the same results as with Oracle.

I will try it with a lot of data this afternoon so that we can compare
Oracle vs. Pg performance. I expect horrible results ;).

Does this patch have a serious chance to make it into Pg some day?
I think Oracle's syntax is not perfect, but it is easy to handle and many
people are used to it. In people's minds, recursive queries = CONNECT BY,
and many people (like me) sadly miss it.

If this patch has a serious chance I'd like to do some investigation and
some real-world data testing.

	Regards,

		Hans

--
Cybertec Geschwinde u Schoenig
Schoengrabern 134, A-2020 Hollabrunn, Austria
Tel: +43/2952/30706 or +43/664/233 90 75
www.cybertec.at, www.postgresql.at, kernel.cybertec.at


Re: [HACKERS] PITR Dead horse?

2004-02-04 Thread Nicolai Tufar
I would like to join this effort too.
I was afraid that the people at Red Hat were already
halfway through and about to release their work
shortly. But that does not seem to be the case.

Regards,
Nicolai

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:pgsql-hackers-[EMAIL PROTECTED]] On Behalf Of Koichi Suzuki
 Sent: Wednesday, February 04, 2004 11:25 AM
 To: Tatsuo Ishii
 Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
 Subject: Re: [HACKERS] PITR Dead horse?

 Hi, this is Suzuki from NTT DATA Intellilink.

 I told Bruce Momjan that my colleagues and I are interested in
 implementing PITR at the BOF at NY LW2004.  NTT's laboratory is very
 interested in this issue and I'm planning to work with them.  I hope
 we can cooperate.

 Tatsuo Ishii wrote:

 Has this been beaten to death now? Just curious if PITR is in the dev
 tree yet. Been out of the loop. TIA.

  My co-workers and I are very interested in implementing PITR. We
  will tackle this for 7.5 if no one objects.
  --
  Tatsuo Ishii




Re: [HACKERS] Recursive queries?

2004-02-04 Thread Andrew Rawnsley
I haven't had any problems with it so far, although I haven't really 
stressed it yet.  I was going to make this very plea...

I agree that the syntax can probably be improved, but it's familiar to
those of us unfortunate enough to have used (or still have to use)
Oracle. I imagine that bringing it more in line with the standard would
be what people would prefer.

On Feb 4, 2004, at 5:28 AM, Hans-Jürgen Schönig wrote:

Christopher Kings-Lynne wrote:
There is a website somewhere where a guy posts his patch he is 
maintaining that does it.  I'll try to find it...
Found it.  Check it out:
http://gppl.terminal.ru/index.eng.html
Patch is current for 7.4, Oracle syntax.
Chris


I had a look at the patch.
It is still in development but it seems to work nicely - at least I 
have been able to get the same results with Oracle.

I will try it with a lot of data this afternoon so that we can compare 
Oracle vs. Pg performance. I expect horrible results ;).

Does this patch have a serious chance to make it into Pg some day?
I think Oracle's syntax is not perfect but is easy to handle and many 
people are used to it. In people's mind recursive queries = CONNECT BY 
and many people (like me) miss it sadly.

If this patch has a serious chance I'd like to do some investigation 
and some real-world data testing.

	Regards,

		Hans

--
Cybertec Geschwinde u Schoenig
Schoengrabern 134, A-2020 Hollabrunn, Austria
Tel: +43/2952/30706 or +43/664/233 90 75
www.cybertec.at, www.postgresql.at, kernel.cybertec.at



Andrew Rawnsley
President
The Ravensfield Digital Resource Group, Ltd.
(740) 587-0114
www.ravensfield.com


[HACKERS] array surprising behavior

2004-02-04 Thread Alvaro Herrera
Hi,

I think this is most surprising behavior -- shouldn't the UPDATE raise
an error?

alvherre=# create table foo (a int[]);
CREATE TABLE
alvherre=# insert into foo values (null);
INSERT 33649 1
alvherre=# update foo set a[3] = '42';
UPDATE 1
alvherre=# select a, a is null from foo;
 a | ?column?
---+--
   | t
(1 fila)
alvherre=# select version();
  version
---
 PostgreSQL 7.5devel on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.1 (Mandrake Linux 9.2 3.3.1-2mdk)
(1 fila)


(This is CVS tip as of a couple weeks ago)
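Until the semantics are settled, one way to sidestep the surprise is to assign a complete array value rather than a subscript of a NULL array; a hedged sketch against the table above:

```sql
-- Assigning a whole array value works regardless of whether the column
-- was NULL beforehand (ARRAY[] constructor syntax, available since 7.4).
UPDATE foo SET a = ARRAY[0, 0, 42];
SELECT a, a IS NULL FROM foo;   -- the column is now non-null
```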

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
El destino baraja y nosotros jugamos (A. Schopenhauer)



Re: [HACKERS] Recursive queries?

2004-02-04 Thread Christopher Browne
Clinging to sanity, [EMAIL PROTECTED] (Andrew Rawnsley) mumbled into her beard:
 I haven't had any problems with it so far, although I haven't really
 stressed it yet.  I was going to make this very plea...

 I agree that the syntax can probably be improved, but its familiar to
 those of us unfortunate enough to have used (or still have to use)
 Oracle. I imagine that bringing it more in line with any standard
 would be what people would prefer.

The SQL:1999 form is, instead, of the form:

 with recquery (a,b,c,d) as
   (select a1,b1,c1,d1 from some_table where d1 < 21)
 select * from recquery;

Notice that I have indented this in the same way a Lisp programmer
would indent a LET form...

 (let
((a value-for-a)
 (b value-for-b)
 (c compute-c)
 (d 42)) ;;; The ultimate answer...
   (compute-something-with-values a b c d))

In ML, there is an analogous let/in construct:

#let a = 1 and
 b = 2 and
 c = 3
  in
 a + b * c;;
- : int = 7

That example is a bit oversimplified, as it does not do anything
recursive.  To express a recursive relationship, the query
likely needs a UNION ALL, and looks more like the following:

 with recquery (a,b,c,d) as
   (select a,b,c,d from base_table root   -- Root-level entries
      where c < 200
    union all
    select child.a, child.b, child.c, child.d
      from recquery parent, base_table child  -- Self-reference here
      where parent.a = child.b  -- The link between nodes...
        and child.c < 200)
 select a,b,c,d from recquery;

The fact that the form of this resembles that of the Lisp/ML let
forms means that WITH can be useful in structuring queries as well.
For instance, if you're computing a value that gets used
several times, putting it into a WITH clause can avoid the
need to compute it more than once.

with notrec (radius, pi, month) as
   (select radius, 3.1412, date_trunc('month', txn_date) from pie_table)
 select month, sum(pi * radius * radius) as area, count(*)
   from notrec
   where month between '2003-01-01' and '2004-01-01'
   group by month;

has some 'elegance' by virtue of calling date_trunc() only once, versus

  select date_trunc('month', txn_date) as month,
         sum(3.1412 * radius * radius) as area, count(*)
    from pie_table
    where date_trunc('month', txn_date) between '2003-01-01' and '2004-01-01'
    group by month;

Admittedly, date_trunc() may not be an ideal example, as the date
constraint would work just as well on an untruncated date; the point is
that the no-WITH approach contains an extra use of date_trunc().
But the recomputation that takes place when a functional value is used
both in the result list and in the WHERE clause is something that
WITH can eliminate.
-- 
aa454,@,freenet.carleton.ca
http://www.ntlug.org/~cbbrowne/emacs.html
Lisp Users:
Due to the holiday next Monday, there will be no garbage collection.



Re: [HACKERS] PITR Dead horse?

2004-02-04 Thread Marc G. Fournier
On Wed, 4 Feb 2004, Tatsuo Ishii wrote:

  I and some other developers are also interested in.
  Do you think we can work together?

 Sure. Why not. I think it would be practical to decide who is the
 leader of this project, though.

Is this something large enough, like the win32 stuff, that having a side
list for discussions is worth setting up?


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664



Re: [HACKERS] Recursive queries?

2004-02-04 Thread Robert Treat
On Wed, 2004-02-04 at 05:28, Hans-Jürgen Schönig wrote:
 Christopher Kings-Lynne wrote:
  There is a website somewhere where a guy posts his patch he is 
  maintaining that does it.  I'll try to find it...
  
  
  Found it.  Check it out:
  
  http://gppl.terminal.ru/index.eng.html
  
  Patch is current for 7.4, Oracle syntax.
  
  Chris
 
 
 I had a look at the patch.
 It is still in development but it seems to work nicely - at least I have 
 been able to get the same results with Oracle.
 
 I will try it with a lot of data this afternoon so that we can compare 
 Oracle vs. Pg performance. I expect horrible results ;).
 
 Does this patch have a serious chance to make it into Pg some day?
 I think Oracle's syntax is not perfect but is easy to handle and many 
 people are used to it. In people's mind recursive queries = CONNECT BY 
 and many people (like me) miss it sadly.
 
 If this patch has a serious chance I'd like to do some investigation and 
 some real-world data testing.
 

Seems it has no chance of getting in as it is GPL'd code... so step one
would be convincing him to relicense it. 

As a side note, I thought Joe Conway also had an implementation of
this...

Robert Treat 
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL




Re: [HACKERS] Recursive queries?

2004-02-04 Thread Tom Lane
Hans-Jürgen Schönig [EMAIL PROTECTED] writes:
 Does this patch have a serious chance to make it into Pg some day?
 I think Oracle's syntax is not perfect but is easy to handle and many 
 people are used to it. In people's mind recursive queries = CONNECT BY 
 and many people (like me) miss it sadly.

I would prefer to see us supporting the SQL-standard syntax (WITH etc),
as it is (1) standard and (2) more flexible than CONNECT BY.  The Red
Hat work mentioned earlier in the thread was aimed at supporting the
standard syntax.
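For comparison, a hierarchical query in the two syntaxes might look like this (the emp/mgr schema is the usual illustrative example, not something from the patch):

```sql
-- Oracle-style, as implemented by the patch discussed in this thread:
SELECT empno, ename
  FROM emp
  START WITH mgr IS NULL
  CONNECT BY PRIOR empno = mgr;

-- SQL-standard equivalent:
WITH RECURSIVE tree(empno, ename) AS (
    SELECT empno, ename FROM emp WHERE mgr IS NULL
  UNION ALL
    SELECT e.empno, e.ename
      FROM emp e JOIN tree t ON e.mgr = t.empno
)
SELECT empno, ename FROM tree;
```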

regards, tom lane



Re: [HACKERS] PITR Dead horse?

2004-02-04 Thread Tom Lane
Marc G. Fournier [EMAIL PROTECTED] writes:
 Is this something large enough, like the win32 stuff, that having a side
 list for discussions is worth setting up?

In terms of the amount of code to be written, I expect it's larger than
the win32 porting effort.  And it should be mostly pretty separate from
hacking the core backend, since most of what remains to do is writing
external management utilities (I think).

I've been dissatisfied with having the separate pgsql-hackers-win32
list; I feel it just fragments the discussion, and people tend to end up
crossposting to -hackers anyway.  But a separate list for PITR work
might be a good idea despite that experience, since it seems like it'd
be a more separable project.

Any other opinions out there?

regards, tom lane



Re: [HACKERS] Recursive queries?

2004-02-04 Thread Hannu Krosing
Robert Treat wrote on Wednesday, 2004-02-04 at 16:55:

 Seems it has no chance of getting in as it is GPL'd code... so step one
 would be convincing him to relicense it. 
 
 As a side note, I thought Joe Conway also had an implementation of
 this...

IIRC, Joe Conway had done the simple join-by-parent-id variant as a
set-returning function.

---
Hannu




Re: [HACKERS] Recursive queries?

2004-02-04 Thread Hannu Krosing
Tom Lane wrote on Wednesday, 2004-02-04 at 06:04:
 Christopher Kings-Lynne [EMAIL PROTECTED] writes:
  Wasn't there some guy at RedHat doing it?
 
 Andrew Overholt did some work on SQL99 recursive queries, but went back
 to university without having gotten to the point where it actually
 worked.  One of the many things on my to-do list is to pick up and
 finish Andrew's work on this.  If someone has time to work on it,
 let me know and I'll try to get what he had over to you.

I attach my early attempts at doing the same.

I also sent this to Andrew while he was working on it, and he claimed
that his version was similar. I think he sent me a slightly more
advanced version of this, but I'm currently unable to locate it.

This has mainly the syntax part (without recursion control, IIRC) and
some initial documentation (in Python's StructuredText and HTML formats).

If I find Andrew's variant I'll try to post it too.

-
Hannu



pg-with-recursive.tar.gz
Description: application/compressed-tar



Re: [HACKERS] Recursive queries?

2004-02-04 Thread Hannu Krosing
Christopher Browne wrote on Wednesday, 2004-02-04 at 15:10:

 The fact that the form of this resembles that of the Lisp/ML let
 forms means that WITH can be useful in structuring queries as well.
 For instance, supposing you're computing a value that gets used
 several times, putting it into a WITH clause might allow evading the
 need to compute it more than once.

The main difference between putting a subquery in WITH and in FROM is
that subqueries given in the FROM clause don't see each other, while the
ones given in WITH see the ones preceding them, and the ones in WITH
RECURSIVE see all queries in the WITH RECURSIVE clause.
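A small sketch of that visibility difference:

```sql
-- Later WITH items can reference earlier ones ...
WITH a AS (SELECT 1 AS x),
     b AS (SELECT x + 1 AS y FROM a)   -- legal: b sees a
SELECT y FROM b;

-- ... whereas sibling subqueries in FROM cannot see each other:
-- SELECT * FROM (SELECT 1 AS x) a,
--               (SELECT x + 1 AS y) b;   -- error: x is not visible
```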

--
Hannu




Re: [HACKERS] PostGIS Integration

2004-02-04 Thread strk
tgl wrote:
 Christopher Kings-Lynne [EMAIL PROTECTED] writes:
  Those two cases are not hard, because in those scenarios the parser
  knows it is expecting a type specification.  The real problem is this
  syntax for typed literals:
  typename 'string'
 
  Just disallow that particular case for custom types :P
 
 Well, maybe we could --- comments?  Tom Lockhart went to some lengths to
 support that, but now that he's gafiated we could perhaps rethink it.
 AFAICS the SQL spec only requires this syntax for certain built-in types.
 Tom wanted to generalize that to all datatypes that Postgres supports,
 and that seems like a reasonable goal ... but if it prevents getting to
 other reasonable goals then we ought to think twice.

If it's not needed for SQL conformance,
I don't think we really need to generalize that,
as long as there are other means to get the same result...

'string'::type(parameter) can be the general Postgres version,
while varchar(2) 'string' can be the standard SQL version (not general).

--strk;

 


Re: [HACKERS] Recursive queries?

2004-02-04 Thread Hans-Jürgen Schönig
Tom Lane wrote:
Hans-Jürgen Schönig [EMAIL PROTECTED] writes:

Does this patch have a serious chance to make it into Pg some day?
I think Oracle's syntax is not perfect but is easy to handle and many 
people are used to it. In people's mind recursive queries = CONNECT BY 
and many people (like me) miss it sadly.


I would prefer to see us supporting the SQL-standard syntax (WITH etc),
as it is (1) standard and (2) more flexible than CONNECT BY.  The Red
Hat work mentioned earlier in the thread was aimed at supporting the
standard syntax.
			regards, tom lane


I had already expected an answer like that.
In my very personal opinion (don't cut my head off) I'd vote for both
syntaxes.
The reasons for that are fairly easy to explain:

- I have to agree with Tom (1, 2).

- CONNECT BY makes sense because it makes it easier to build applications
supporting both Oracle and PostgreSQL. In the case of more complex
applications (CONNECT BY is definitely more than pure storage of simple
data), Oracle-Pg compatibility is really important (I have seen that a
dozen times).

From a marketing point of view both versions make sense - Oracle-to-Pg
migration is a growing market.
From a technical point of view I completely agree with Tom (I have
learned in the past that Tom is usually right).

	Regards,

		Hans

--
Cybertec Geschwinde u Schoenig
Schoengrabern 134, A-2020 Hollabrunn, Austria
Tel: +43/2952/30706 or +43/664/233 90 75
www.cybertec.at, www.postgresql.at, kernel.cybertec.at


Re: [HACKERS] Recursive queries?

2004-02-04 Thread Tom Lane
Hans-Jürgen Schönig [EMAIL PROTECTED] writes:
 In my very personal opinion (don't cut my head off) I'd vote for both 
 syntaxes.

I'm not opposed to that, although it would be a good idea to check that
Oracle doesn't have some patent covering their syntax.

However, if we go for that then what we probably want to do is implement
the SQL-spec syntax and then add something to translate the Oracle
syntax into a SQL parsetree.  We shouldn't need two implementations
in the guts of the system, and I'd expect that translating in the other
direction (SQL WITH to an Oracle internal implementation) wouldn't work,
because WITH does more.

I dunno whether the patch mentioned earlier in this thread could serve
as a starting point for that or not.

regards, tom lane



Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-04 Thread Bruce Momjian
 I am concerned that the bgwriter will not be able to keep up with the
 I/O generated by even a single backend restoring a database, let alone a
 busy system.  To me, the write() performed by the bgwriter, because it
 is I/O, will typically be the bottleneck on any system that is I/O bound
 (especially as the kernel buffers fill) and will not be able to keep up
 with active backends now freed from writes.
 
 The idea to fallback when the bgwriter can not keep up is to have the
 backends sync the data, which seems like it would just slow down an
 I/O-bound system further.
 
 I talked to Magnus about this, and we considered various ideas, but
 could not come up with a clean way of having the backends communicate to
 the bgwriter about their own non-sync writes.  We had the ideas of using
 shared memory or a socket, but these seemed like choke-points.
 
 Here is my new idea.  (I will keep throwing out ideas until I hit on a
 good one.)  The bgwriter is going to have to check before every write to
 determine if the file is already recorded as needing fsync during
 checkpoint.  My idea is to have that checking happen during the bgwriter
 buffer scan, rather than at write time.  If we add a shared-memory
 boolean for each buffer, backends needing to write buffers can write
 buffers already recorded as safe to write by the bgwriter scanner.  I
 don't think the bgwriter is going to be able to keep up with I/O-bound
 backends, but I do think it can scan and set those booleans fast enough
 for the backends to then perform the writes.  (We might need a separate
 bgwriter thread to do this, or a separate process.)
 
 As I remember, our new queue system has a list of buffers that are most
 likely to be replaced, so the bgwriter can scan those first and make
 sure they have their booleans set.
 
 There is an issue that these booleans are set without locking, so there
 might need to be a double-check of them by backends, first before the
 write, then after just before they replace the buffer.  The bgwriter
 would clear the bits before the checkpoint starts.

Now that no one is ill from my fsync buffer boolean idea, let me give
some implementation details.  :-)

First, we need to add a bit to each shared buffer descriptor (sbufdesc)
that indicates whether the background writer (bgwriter) has recorded the
file associated with the buffer as needing fsync. This bit will be set
only by the background writer, usually during its normal buffer scan
looking for buffers to write. The background writer doesn't write all
dirty buffers on each buffer pass, but it could record the buffers that
need fsync on each pass, allowing backends to write those buffers if
buffer space becomes limited. (Not sure, but perhaps setting the buffer
bit could be done with only a shared lock on the buffer, because no one
else sets the bit.)

(One idea would be to move the fsync bit into its own byte in shared
memory so it is more centralized and no locking is required to set the
bit. Also, should we have one byte per shared buffer to indicate dirty
buffers so the bgwriter can find them more efficiently?)

The bit can be cleared if either the background writer writes the page,
or a backend writes the page.

Right now, the checkpoint process writes out all dirty buffers. We might
need to change this so the background writer does this because only it
can record files needing fsync.  During checkpoint, the background
writer should write out all buffers. It will not be recording any new
fsync bits during this scan because it is writing every dirty buffer.
(If it did do this, it could mark an fsync bit that was written only
during or after the fsync it performs later.)

Once it is done, it should move the hash of files needing fsync to a
backup pointer and create a new empty list and do a scan so backends can
do writes. A subprocess should do fsync of all files, either using
fork() and having the child read the saved pointer hash, or for
EXEC_BACKEND, write a temp file that the child can read.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] array surprising behavior

2004-02-04 Thread Joe Conway
Alvaro Herrera wrote:
I think this is most surprising behavior -- shouldn't the UPDATE raise
an error?
Surprising, but not new (it has probably been there since the Berkeley
code), and it has come up before on one of the lists (I think it might
even have been pgsql-bugs).

regression=# select version();
version

 PostgreSQL 7.3.5 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2.2 20030222 (Red Hat Linux 3.2.2-5)
(1 row)

regression=# create table foo (a int[]);
CREATE TABLE
regression=# insert into foo values (null);
INSERT 1104092 1
regression=# update foo set a[3] = '42';
UPDATE 1
regression=# select a, a is null from foo;
 a | ?column?
---+--
   | t
(1 row)
I'm still hoping to scrounge up the time to continue working on arrays 
for 7.5, including figuring out how to deal with this.

Joe





Re: [HACKERS] implemented missing bitSetBit() and bitGetBit()

2004-02-04 Thread Neil Conway
David Helgason [EMAIL PROTECTED] writes:
 I needed these, so I went and implemented them myself.

I didn't see any followup to this: do we want to include this in the
main tree, contrib/, or not at all?

-Neil (who has no opinion on the matter, but just wants to make sure
this doesn't fall through the cracks)




Re: [HACKERS] implemented missing bitSetBit() and bitGetBit()

2004-02-04 Thread Dann Corbit
Bit sets are remarkably useful functionality.   I suggest putting it
into the core product.

 -Original Message-
 From: Neil Conway [mailto:[EMAIL PROTECTED] 
 Sent: Wednesday, February 04, 2004 11:22 AM
 To: David Helgason
 Cc: [EMAIL PROTECTED]
 Subject: Re: [HACKERS] implemented missing bitSetBit() and bitGetBit()
 
 
 David Helgason [EMAIL PROTECTED] writes:
  I needed these, so I went and implemented them myself.
 
 I didn't see any followup to this: do we want to include this 
 in the main tree, contrib/, or not at all?
 
 -Neil (who has no opinion on the matter, but just wants to 
 make sure this doesn't fall through the cracks)
 
 


[HACKERS] Question on database backup

2004-02-04 Thread Michael Brusser
We have customers who prefer to use their own backup facilities
instead of what we provide in the app (we use pg_dump).
I hear speed is at least one consideration.

The questions I need to answer are these:

1) Is it absolutely safe to do a file copy (cpio, or
something else, whatever the robust backup app would use)
of the Postgres db when it's completely shut down?

2) Same question, but the database is up and running in
read-only mode. We're making sure that no updates are
taking place.

If it matters - this is on Solaris, HP, Linux.
We're using v7.3.2 running on UDS and v7.3.4 on TCP/IP.
We provide no explicit settings for WAL, fsync and the like.
And (yes, I know) they often install it on NFS.

Thank you much.
Mike.









Re: [HACKERS] implemented missing bitSetBit() and bitGetBit()

2004-02-04 Thread Peter Eisentraut
Neil Conway wrote:
 David Helgason [EMAIL PROTECTED] writes:
  I needed these, so I went and implemented them myself.

 I didn't see any followup to this: do we want to include this in the
 main tree, contrib/, or not at all?

getbit sounds a lot like what substring() does.  So perhaps setbit could 
actually be handled by replace()?  That would be a more general 
solution (since it would handle more than one bit at a time).




Re: [HACKERS] implemented missing bitSetBit() and bitGetBit()

2004-02-04 Thread David Helgason
On 4. feb 2004, at 20:51, Peter Eisentraut wrote:
Neil Conway wrote:
David Helgason [EMAIL PROTECTED] writes:
I needed these, so I went and implemented them myself.
I didn't see any followup to this: do we want to include this in the
main tree, contrib/, or not at all?
getbit sounds a lot like what substring() does.  So perhaps setbit could
actually be handled by replace()?  That would be a more general
solution (since it would handle more than one bit at a time).
I sort of agree, but it's currently documented like I implemented it 
(afaics), so it's a simple thing to include.

I feel a bit bad for not having done a full patch with test-cases and 
.bki modifications etc., but it seemed a pretty daunting task (for my 
schedule at least).

Hope someone can use it though.

David Helgason,
Over the Edge Entertainments


Re: [HACKERS] Question on database backup

2004-02-04 Thread scott.marlowe
On Wed, 4 Feb 2004, Michael Brusser wrote:

 We have customers who prefer to use their backup facilities
 instead of what we provide in the app (we use pg_dump)
 I hear speed is at least one consideration.
 
 The questions I need to answer are these:
 
 1) Is it absolutely safe to do a file copy (cpio, or
 something else, whatever the robust backup app would use)
 of the Postgres db when it's completely shut down?

Yes, it is.

 2) Same question, but the database is up and running in
 read-only mode. We're making sure that no updates are 
 taking place.

Most likely, it is.  No guarantees if the database is up and running, even 
if you're certain there are no updates happening.

Also, you can use a snapshotting file system to make a backup image and 
then back up the image, while the database is up and being accessed, both 
by readers and writers, assuming you get REAL snapshots.

 If it matters - this is on Solaris, HP, Linux.
 We're using v.7.3.2 running on UDS and v.7.3.4 on TCP/IP
 We provide no explicit settings for wal, fsync and the like.
 And (yes, I know) they often install it on NFS.

OK, here are a couple more issues to chew on as well.  Suppose you have a 
backup, and the database server was compiled with UNKNOWN switches.  The 
machine burns to the ground.  Now you get to try to figure out how to 
compile the database on the new server so it can read the old dataset.  
This may or may not be a complete friggin' nightmare for you.  Dumps can 
move between versions / hardware configs / differently compiled versions 
of PostgreSQL with some amount of reliability.  Binary copies may or may 
not move so easily.

Scenario II: the HP burns to the ground, and your boss just buys a big old 
Intel box.  How do you get your data up and running with a binary backup?  
You don't.
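A small illustration of why scenario II is fatal for binary backups: the same 32-bit integer is laid out differently on big-endian hardware (e.g. HP PA-RISC) than on little-endian hardware (e.g. Intel), and raw data files carry the layout of the machine that wrote them.

```python
import struct

# Pack the integer 1 as big-endian (">i") and little-endian ("<i").
big_endian = struct.pack(">i", 1)
little_endian = struct.pack("<i", 1)

print(big_endian)     # b'\x00\x00\x00\x01'  -- what a big-endian box writes
print(little_endian)  # b'\x01\x00\x00\x00'  -- what a little-endian box writes
```

A data file full of the left layout is gibberish to a server expecting the right one, which is exactly why a pg_dump (plain text) moves between architectures while a file-level copy may not.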

Scenario III: subtle corruption gets into your dataset due to a bad block 
or whatnot.  No one notices for a while.  Suddenly, someone does.  
With only file system backups, which contain no error messages, how do you 
determine when the corruption occurred and get the uncorrupted data out, 
leaving the corrupted data behind?

Plain and simple: PostgreSQL is designed to be backed up by pg_dump.  
Using anything else isn't supported, so to speak, and may cause you 
untold grief in the future.  That said, sometimes file backups are the 
perfect solution; just go into it with your eyes open to the possible 
problems, and I'd make a pg_dump every so often just in case.


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-04 Thread Kevin Brown
Tom Lane wrote:
 Kevin Brown [EMAIL PROTECTED] writes:
  Instead, have each backend maintain its own separate list in shared
  memory.  The only readers of a given list would be the backend it belongs
  to and the bgwriter, and the only time bgwriter attempts to read the
  list is at checkpoint time.
 
  The sum total size of all the lists shouldn't be that much larger than
  it would be if you maintained it as a global list.
 
 I fear that is just wishful thinking.  Consider the system catalogs as a
 counterexample of files that are likely to be touched/modified by many
 different backends.

Oh, I'm not arguing that there won't be a set of files touched by a lot
of backends, just that the number of such files is likely to be relatively
small -- a few tens of files, perhaps.  That admittedly can add up
fast, but see below.


 The bigger problem though with this is that it makes the problem of
 list overflow much worse.  The hard part about shared memory management
 is not so much that the available space is small, as that the available
 space is fixed --- we can't easily change it after postmaster start.
 The more finely you slice your workspace, the more likely it becomes
 that one particular part will run out of space.  So the inefficient case
 where a backend isn't able to insert something into the appropriate list
 will become considerably more of a factor.

Well, running out of space in the list isn't that much of a problem.  If
the backends run out of list space (and the max size of the list could
be a configurable thing, either as a percentage of shared memory or as
an absolute size), then all that happens is that the background writer
might end up fsync()ing some files that have already been fsync()ed.
But that's not that big of a deal -- the fact they've already been
fsync()ed means that there shouldn't be any data in the kernel buffers
left to write to disk, so subsequent fsync()s should return quickly.
How quickly depends on the individual kernel's implementation of the
dirty buffer list as it relates to file descriptors.

Perhaps a better way to do it would be to store the list of all the
relfilenodes of everything in pg_class, with a flag for each indicating
whether or not an fsync() of the file needs to take place.  When anything
writes to a file without O_SYNC or a trailing fsync(), it sets the flag
for the relfilenode of what it's writing.  Then at checkpoint time, the
bgwriter can scan the list and fsync() everything that has been flagged.
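As a toy model of the flagged-relfilenode idea sketched above (all names and data structures here are illustrative assumptions, not actual PostgreSQL code): backends flag a relfilenode whenever they write to its file, and at checkpoint time the bgwriter fsync()s exactly the flagged entries and clears the flags.

```python
needs_fsync = {}  # relfilenode -> bool ("has been written without fsync")

def mark_written(relfilenode: int) -> None:
    """Called by a backend after writing a file without O_SYNC or fsync()."""
    needs_fsync[relfilenode] = True

def checkpoint(fsync=lambda rfn: None) -> list:
    """bgwriter at checkpoint: fsync() every flagged file, clear the flags."""
    synced = []
    for rfn, flagged in needs_fsync.items():
        if flagged:
            fsync(rfn)              # stand-in for fsync()ing the real file
            needs_fsync[rfn] = False
            synced.append(rfn)
    return synced

mark_written(16384)
mark_written(16385)
mark_written(16384)                  # duplicate writes set the flag only once
print(checkpoint())                  # [16384, 16385]
print(checkpoint())                  # [] -- nothing left to sync
```

Note that a duplicate write costs nothing here (the flag is already set), which is the advantage over an append-only list that can overflow.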

The relfilenode list should be relatively small in size: at most 16
bytes per item (and that on a 64-bit machine).  A database that has 4096
file objects would have a 64K list at most.  Not bad.

Because each database backend can only see the class objects associated
with the database it's connected to or the global objects (if there's a
way to see all objects I'd like to know about it, but pg_class only
shows objects in the current database or objects which are visible to
all databases), the relfilenode list might have to be broken up into one
list per database, with perhaps a separate list for global objects.

The interesting question in that situation is how to handle object
creation and removal, which should be a relatively rare occurrence
(fortunately), so it supposedly doesn't have to be all that efficient.


-- 
Kevin Brown   [EMAIL PROTECTED]

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Beta freeze? (was Re: [HACKERS] array surprising behavior)

2004-02-04 Thread Alvaro Herrera
On Wed, Feb 04, 2004 at 11:06:29AM -0800, Joe Conway wrote:
 Alvaro Herrera wrote:
 I think this is most surprising behavior -- shouldn't the UPDATE raise
 an error?
 
 Surprising, but not new (it probably goes back to the Berkeley 
 code), and has come up before on one of the lists (I think it might even 
 have been pgsql-bugs).

Too bad :-(

 I'm still hoping to scrounge up the time to continue working on arrays 
 for 7.5, including figuring out how to deal with this.

I have the same hope, though I know for sure that I won't have any time
to work on anything until March, and then I will probably devote most of
my time to Pg.  When is beta freeze supposed to happen?

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
Siempre hay que alimentar a los dioses, aunque la tierra esté seca (Orual)

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-04 Thread Tom Lane
Kevin Brown [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 The more finely you slice your workspace, the more likely it becomes
 that one particular part will run out of space.  So the inefficient case
 where a backend isn't able to insert something into the appropriate list
 will become considerably more of a factor.

 Well, running out of space in the list isn't that much of a problem.  If
 the backends run out of list space (and the max size of the list could
 be a configurable thing, either as a percentage of shared memory or as
 an absolute size), then all that happens is that the background writer
 might end up fsync()ing some files that have already been fsync()ed.
 But that's not that big of a deal -- the fact they've already been
 fsync()ed means that there shouldn't be any data in the kernel buffers
 left to write to disk, so subsequent fsync()s should return quickly.

Yes, it's a big deal.  You're arguing as though the bgwriter is the
thing that needs to be fast, when actually what we care about is the
backends being fast.  If the bgwriter isn't doing the vast bulk of the
writing (and especially the fsync waits) then we are wasting our time
having one at all.  So we need a scheme that makes it as unlikely as
possible that backends will have to do their own fsyncs.  Small
per-backend fsync lists aren't the way to do that.

 Perhaps a better way to do it would be to store the list of all the
 relfilenodes of everything in pg_class, with a flag for each indicating
 whether or not an fsync() of the file needs to take place.

You're forgetting that we have a fixed-size workspace to do this in ...
and no way to know at postmaster start how many relations there are in
any of our databases, let alone predict how many there might be later on.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] PITR Dead horse?

2004-02-04 Thread Simon Riggs
Tom Lane wrote
 Marc G. Fournier [EMAIL PROTECTED] writes:
  Is this something large enough, like the win32 stuff, that having a
  side list for discussions is worth setting up?
 
 In terms of the amount of code to be written, I expect it's larger than
 the win32 porting effort.  And it should be mostly pretty separate from
 hacking the core backend, since most of what remains to do is writing
 external management utilities (I think).

Yes it is! I'd like to start the discussion about PITR and try to go
through some functional requirements and how those might be implemented.
The Win32 port has a self-evident set of functional requirements; I'm
not sure that the PITR stuff is as clear - so I couldn't pass any
judgement at all (even if I did know the code well enough) on how big a
coding task that is, but I can see that the analysis and discussion is
large indeed.

 I've been dissatisfied with having the separate pgsql-hackers-win32
 list; I feel it just fragments the discussion, and people tend to end up
 crossposting to -hackers anyway.  But a separate list for PITR work
 might be a good idea despite that experience, since it seems like it'd
 be a more separable project.

I'd vote for a new list dedicated to discussing Robustness issues,
such as PITR and the fsync/sync issues. IMHO, PostgreSQL has the
Functionality and Performance, it just needs some rock-solid analysis of
where-things-can-go-wrong with it, so that the business data centre
people will be able to use it with absolute confidence...even if the
answer is we've got every base covered. For me, the issues about
robustness are as much to do with risk reduction and confidence building
as they are about specific features in that area. [Wow, I expect some
flames on those comments!]

The list probably would remain clearly differentiated, in the same way
[Performance] covers lots of areas not discussed in [Hackers].

Not hung up on the name either, just something that indicates
breadth-of-scope, e.g. Availability or Data Protection or Resilience
etc..; maybe the Advocates would like to name it? It might even be a
press-release: PostgreSQL community focuses new efforts towards
Robustness features for its next major release.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] PITR Dead horse?

2004-02-04 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 I'd vote for a new list dedicated to discussing Robustness issues,
 such as PITR and the fsync/sync issues. IMHO, PostgreSQL has the
 Functionality and Performance, it just needs some rock-solid analysis of
 where-things-can-go-wrong with it, so that the business data centre
 people will be able to use it with absolute confidence...even if the
 answer is we've got every base covered. For me, the issues about
 robustness are as much to do with risk reduction and confidence building
 as they are about specific features in that area. [Wow, I expect some
 flames on those comments!]

You're right.  Exactly where do you expect to find the expertise and
interest to do such an analysis?  On pghackers, that's where.  There's
no reason to invent a new mailing list for what should be a continuing
topic in pghackers.  And to the extent that you were to move such a
discussion somewhere else, you'd just risk losing the attention of the
pair of eyeballs that might notice a hole in your analysis.

 Not hung up on the name either, just something that indicates
 breadth-of-scope, e.g. Availability or Data Protection or Resilience
 etc..; maybe the Advocates would like to name it? It might even be a
 press-release: PostgreSQL community focuses new efforts towards
 Robustness features for its next major release.

I think such a press release would be counterproductive, as it would
immediately make people question whether we have reliability problems.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: Beta freeze? (was Re: [HACKERS] array surprising behavior)

2004-02-04 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 When is beta freeze supposed to happen?

AFAIK, no date has been set at all.  I doubt we'll even think about it
until we see how the Windows port effort goes.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] PITR Dead horse?

2004-02-04 Thread Nicolai Tufar
Totally agree. Robustness and rock-solidness are the only
things missing for PostgreSQL to become the killer of certain
commercial enterprise databases out there. And the only thing 
that is missing in this respect is PITR. PITR should be there:
INGRES had it in '84, and some people ask why PostgreSQL does 
not have it.

I am well versed in the internals of PITR features of a certain
leading enterprise-class database out there. And would like to
contribute (write code) to this effort  as much as I can.

Best regards,
Nicolai Tufar



  -Original Message-
  From: [EMAIL PROTECTED] [mailto:pgsql-hackers-[EMAIL PROTECTED]] On Behalf Of Simon Riggs
  Sent: Thursday, February 05, 2004 1:33 AM
  To: 'Tom Lane'; 'Marc G. Fournier'
  Cc: 'Tatsuo Ishii'; [EMAIL PROTECTED]; [EMAIL PROTECTED]; pgsql-[EMAIL PROTECTED]
  Subject: Re: [HACKERS] PITR Dead horse?
 
  [Simon Riggs's message, quoted here in full, snipped -- see the
  previous post in this thread]

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] COPY from question

2004-02-04 Thread Slavisa Garic
Hi Kevin,

On Tue, 3 Feb 2004, Kevin Brown wrote:

 Slavisa Garic wrote:
  Using the pg module in Python I am trying to run the COPY command to populate
  a large table. I am using this to replace the INSERT, which takes a
  few hours to add 7 entries where COPY takes a minute and a half.
 
 That difference in speed seems quite large.  Too large.  Are you batching
 your INSERTs into transactions (you should be in order to get good
 performance)?  Do you have a ton of indexes on the table?  Does it have
 triggers on it or some other thing (if so then COPY may well wind up doing
 the wrong thing since the triggers won't fire for the rows it inserts)?
 
 I don't know what kind of schema you're using, but it takes perhaps a
 couple of hours to insert 2.5 million rows on my system.  But the rows
 in my schema may be much smaller than yours.

You are right about the indexes. There are quite a few of them (5-6 without
looking at the schema). The problem is that I do need those indexes, as I
have a lot of SELECTs on that table and the inserts only happen once.

You are also right about the rows (I think), as I have about 15-20 columns.
This could be split into a few other tables, and it used to be, but I have
merged them because of the requirement for faster SELECTs. With the
current schema most of my modules that access the database are not
required to do expensive JOINs as they used to. Because faster SELECTs are
more important to me than faster INSERTs I had to do this. This wasn't a
problem for me until I started creating experiments which had more
than 20 thousand jobs, which translates to 20 thousand rows in this big
table.

I do batch INSERTs into one big transaction (1000 rows at a time). While I
did get some improvement compared to one transaction per insert, it
was still not fast enough (well, not for me :) ). Could you please
elaborate on the triggers? I have no idea what kinds of triggers there are
in PGSQL or relational databases.

With regard to my problem, I did solve it by piping the data into
COPY's stdin. Now I have about 75000 rows inserted in 40 seconds, which is
extremely good for me.
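The piping approach described above can be sketched roughly as follows: render the rows in COPY's tab-separated text format into an in-memory stream, then hand that stream to the driver's COPY-from-stdin entry point. (The commented-out `copy_from` call is an assumed driver API along the lines of psycopg2's; the table and column names are made up for illustration.)

```python
import io

# Rows to load -- in practice these would come from the experiment generator.
rows = [(1, "alice"), (2, "bob")]

# Render each row as one tab-separated, newline-terminated line,
# which is the default text format COPY ... FROM STDIN expects.
buf = io.StringIO()
for row in rows:
    buf.write("\t".join(str(col) for col in row) + "\n")
buf.seek(0)

# cur.copy_from(buf, "jobs", columns=("id", "name"))  # assumed driver call
print(repr(buf.getvalue()))  # '1\talice\n2\tbob\n'
```

Note that real data may need escaping of tabs, newlines, and backslashes embedded in the values; this sketch skips that.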

Thank you for your help,
Regards,
Slavisa

 
 -- 
 Kevin Brown [EMAIL PROTECTED]
 
 ---(end of broadcast)---
 TIP 4: Don't 'kill -9' the postmaster
 


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] PITR Dead horse?

2004-02-04 Thread Josh Berkus
Simon,

 I'd vote for a new list dedicated to discussing Robustness issues,
 such as PITR and the fsync/sync issues.
 snip
 The list probably would remain clearly differentiated, in the same way
 [Performance] covers lots of areas not discussed in [Hackers].

Actually, Simon, you're welcome to bring this discussion over to PERFORMANCE.  
We discuss scalability and HA on Performance frequently, and I don't feel 
that the discussion you refer to would be out of place.

But Tom is right that you need the feedback of a lot of the people on Hackers 
once you start discussing a code solution, so there's not much point in 
starting a new mailing list that all the same people need to subscribe to.  
Certainly Jan had enough trouble getting meaningful feedback on the sync 
issue here; on his own list he'd still be talking to himself.

As far as promoting an image of reliability, that belongs on Advocacy.   The 
image and the reality don't sync much; we're already about 500% more reliable 
than MS SQL Server, but ask any ten CIOs what they think.   That's just 
marketing.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster