Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Cesar Suga

Hi,

I also wrote Bruce about that.

It happens that, if you 'freely advertise' commercial solutions (rather 
than they doing so by other vehicles) you will always happen to be an 
'updater' to the docs if they change their product lines, if they change 
their business model, if and if.


If you cite a commercial solution, as a fair game you should cite *all* 
of them. If one enterprise has the right to be listed in the 
documentation, all of them might, as you will never be favouring one of 
them.


That's the main motivation to write this. Moreover, if there are also 
commercial solutions for high-end installs and they are cited as 
providers to those solutions, it (to a point) disencourages those of 
gathering themselves and writing open source extensions to PostgreSQL.


As Bruce stated, then should the documentation contemplate 
EnterpriseDB's Oracle functions? Should PostgreSQL also come with it? 
Wouldn't it be painful to make, say, another description for an 
alternate product other than EnterpriseDB if it arises?


If people (who read the documentation) professionally work with 
PostgreSQL, they may already have been briefed by those commercial 
offerings in some way.


I think only the source and its tightly coupled (read: can compile along 
with, free as PostgreSQL) components should be packaged into the tarball.


However, I find Bruce's unofficial wiki idea a good one for comparisons.

Regards,
Cesar

Steve Atkins wrote:


On Oct 24, 2006, at 9:20 PM, Bruce Momjian wrote:


Steve Atkins wrote:

If we are to add them, I need to hear that from people who haven't
worked in PostgreSQL commerical replication companies.


I'm not coming to PostgreSQL for open source solutions. I'm coming
to PostgreSQL for _good_ solutions.

I want to see what solutions might be available for a problem I have.
I certainly want to know whether they're freely available, commercial
or some flavour of open source, but I'd like to know about all of them.

A big part of the value of Postgresql is the applications and 
extensions

that support it. Hiding the existence of some subset of those just
because of the way they're licensed is both underselling postgresql
and doing something of a disservice to the user of the document.


OK, does that mean we mention EnterpriseDB in the section about Oracle
functions?  Why not mention MS SQL if they have a better solution?  I
just don't see where that line can clearly be drawn on what to include.
Do we mention Netiza, which is loosely based on PostgreSQL?   It just
seems very arbitrary to include commercial software.  If someone wants
to put in on a wiki, I think that would be fine because that doesn't
seems as official.


Good question. The line needs to be drawn somewhere. It's basically
your judgement, tempered by other peoples feedback, though. If it
were me, I'd ask myself Would I mention this product if it were open
source? Would mentioning it help people using the document?.

Cheers,
  Steve


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org




---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [SPAM?] Re: Asynchronous I/O Support

2006-10-25 Thread NikhilS
Hi, 

While we are at async i/o. I think direct i/o and concurrent i/o also deserve a look at. The archives suggest that Bruce had some misgivings about dio because of no kernel caching, but almost all databases seem to (carefully) use dio (Solaris, Linux, ?) and cio (AIX) extensively nowadays.

Since these can be turned on a per file basis, perf testing them out should be simpler too. 

Regards,
Nikhils
On 10/25/06, Martijn van Oosterhout kleptog@svana.org wrote:
On Tue, Oct 24, 2006 at 12:53:23PM -0700, Ron Mayer wrote: Anyway, for those who want to see what they do in Linux,
 http://www.gelato.unsw.edu.au/lxr/source/mm/fadvise.c Pretty scary that Bruce said it could make older linuxes dump core - there isn't a lot of code there.
The bug was probably in the glibc interface to the kernel. Google foundthis:http://sourceware.org/ml/libc-hacker/2004-03/msg0.html
i.e. posix_fadvise appears to have been broken on all 64-bitarchitechtures prior to March 2004 due to a silly linking error.And then things like this:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=313219Which suggest that prior to glibc 2.3.5, posix_fadvise crashed on 2.4kernels. That's a fairly recent version, so the bug would still befairly widespead.
Have a nice day,--Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ From each according to his ability. To each according to his ability to litigate.
-BEGIN PGP SIGNATURE-Version: GnuPG v1.4.1 (GNU/Linux)iD8DBQFFPnYrIB7bNG8LQkwRAuAqAJ4uqx8y9LxUa9RcEDm7CPwZ2lkS2wCfYxjB2KzJ7iDYU21lumcZT6cHeLI==MzUY-END PGP SIGNATURE-
-- All the world's a stage, and most of us are desperately unrehearsed.


Re: [HACKERS] Replication documentation addition

2006-10-25 Thread Hannu Krosing
Ühel kenal päeval, T, 2006-10-24 kell 22:57, kirjutas Bruce Momjian:
 I don't think the PostgreSQL documentation should be mentioning
 commercial solutions.

IMNSHO, having commercial solutions based on postgresql which extend
postgres in directions not (yet?) done by core postgres is nothing to be
ashamed of.

And we should at least mention the OSS version of Bizgres as a place
where quite a lot of initial development is done on performance
improvements considered too risky for mainline postgresql.

And if you need a more technical reason, you can use free libpq and psql
to connect to even Bizgres MPP ;)


 ---
 
 Luke Lonergan wrote:
  Bruce, 
  
   -Original Message-
   From: [EMAIL PROTECTED] 
   [mailto:[EMAIL PROTECTED] On Behalf Of Bruce Momjian
   Sent: Tuesday, October 24, 2006 5:16 PM
   To: Hannu Krosing
   Cc: PostgreSQL-documentation; PostgreSQL-development
   Subject: Re: [HACKERS] Replication documentation addition
   
   
   OK, I have updated the URL.  Please let me know how you like it.
  
  There's a typo on line 8, first paragraph:
  
  perhaps with only one server allowing write rwork together at the same
  time.
  
  Also, consider this wording of the last description:
  
  Single-Query Clustering...
  
  Replaced by:
  
  Shared Nothing Clustering
  ---
  
  This allows multiple servers with separate disks to work together on a
  each query.
  In shared nothing clusters, the work of answering each query is
  distributed among
  the servers to increase the performance through parallelism.  These
  systems will
  typically feature high availability by using other forms of replication
  internally.
  
  While there are no open source options for this type of clustering,
  there are several
  commercial products available that implement this approach, making
  PostgreSQL achieve
  very high performance for multi-Terabyte business intelligence
  databases.
  
  - Luke
 
-- 

Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Markus Schiltknecht

Hi,

Bruce Momjian wrote:
I have updated the text.  Please let me know what else I should change. 
I am unsure if I should be mentioning commercial PostgreSQL products in

our documentation.


I support your POV and vote for not including any pointers to commercial 
extensions in the official documentation. If at all, they should go to 
'external-projects.sgml', where PostGIS, PgAdmin and other projects are 
mentioned.


I can't really get excited about the exclusion of the term 
'replication', because it's what most people are looking for. It's a 
well known term. Sorry if it sounded that way, but I've not meant to 
avoid that term.


The newly created terms 'Query Broadcast Load Balancing' or even worse 
'Multi-Master Load Balancing' are more confusing than helpful, because 
these terms do not exist. (See the googlefight in [1])


Can we name the chapter Fail-over, Load-Balancing and Replication 
Options? That would fit everything and contain the necessary buzz words.


Also, I'm still missing Multi- vs Single-Master, which are also commonly 
used terms.


IMHO, it does not make sense to speak of a synchronous replication for a 
'Shared Disk Fail Over'. It's not replication, because there's no replica.


The Data Partitioning paragraph should probably mention it's close 
relation with data partitioning across table spaces (and make the 
differences clear).


What you call 'Query Broadcast Load Balancing' is also a multi-master 
replication, thus naming only the later 'Multi-Master Load Balancing' 
misleading.


I'd propose to add a subsection 'Synchronous, Multi-Master Replication' 
and explain the different possibilities on how to do that:


* Query-Based
* with 2PC
* Distributed SHMEM
* (perhaps mention the optimized Postgres-R algorithm ;-)

What you called 'Single-Query Clustering' is probably better known as 
'Parallel Query Execution'. It can be combined with all types of 
replication (every combination of async / sync and Single- / 
Multi-Master). It's maybe load balancing, but it depends on some form of 
replication to distribute the data first.


I liked Chris Browns documentation in [2] which was clearer regarding 
replication (which can be used to do fail-over, load-balancing, 
data-partitioning or parallel query execution). I'd like to keep all 
those things a little more separate to get them clear.


Regards

Markus

[1]: Googlefight: Multi-Master Load Balancing vs Multi-Master 
Replication: http://tinyurl.com/y3k76r


[2]: Chris Browns proposal for a replication documentation:
http://archives.postgresql.org/pgsql-patches/2006-08/msg00026.php

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Magnus Hagander
 I don't think the PostgreSQL documentation should be 
 mentioning commercial solutions.

I think maybe the PostgreSQL documentation should be careful about
trying to list a complete list of commercial *or* free solutions.
Instead linking to something on the main website or on techdocs that can
more easily be updated.

//Magnus

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Shane Ambler

Bruce Momjian wrote:


OK, does that mean we mention EnterpriseDB in the section about Oracle
functions?  Why not mention MS SQL if they have a better solution?  I
just don't see where that line can clearly be drawn on what to include.
Do we mention Netiza, which is loosely based on PostgreSQL?   It just
seems very arbitrary to include commercial software.  If someone wants
to put in on a wiki, I think that would be fine because that doesn't
seems as official.


I agree that the commercial offerings shouldn't be named directly in the 
docs, but it should be mentioned that some commercial options are 
available and a starting point to find more information.


If potential new users look through the docs and it says no options 
available for what they want or consider they will need in the future 
then they go elsewhere, if they know that some options are available 
then they will look further if they want that feature.


something like
There are currently no open source solutions available for this option 
but there are some commercial offerings. More details of some available 
solutions can be found at postgresql.org/support/




--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [JDBC] server process (PID 1188) exited with exit code

2006-10-25 Thread JEAN-PIERRE PELLETIER

Here is what I get with loglevel=2

08:47:18.718 (1) PostgreSQL 8.2devel JDBC3 with SSL (build 503)
08:47:18.718 (1) Trying to establish a protocol version 3 connection to 
localhost:5432
08:47:18.859 (1)  FE= StartupPacket(user=postgres, database=main, 
client_encoding=UNICODE, DateStyle=ISO)

08:47:19.218 (1)  =BE AuthenticationOk
08:47:19.234 (1)  =BE ParameterStatus(client_encoding = UNICODE)
08:47:19.234 (1)  =BE ParameterStatus(DateStyle = ISO, MDY)
08:47:19.234 (1)  =BE ParameterStatus(integer_datetimes = off)
08:47:19.234 (1)  =BE ParameterStatus(is_superuser = on)
08:47:19.234 (1)  =BE ParameterStatus(server_encoding = LATIN1)
08:47:19.234 (1)  =BE ParameterStatus(server_version = 8.2beta1)
08:47:19.234 (1)  =BE ParameterStatus(session_authorization = postgres)
08:47:19.234 (1)  =BE ParameterStatus(standard_conforming_strings = off)
08:47:19.234 (1)  =BE ParameterStatus(TimeZone = US/Eastern)
08:47:19.234 (1)  =BE BackendKeyData(pid=3248,ckey=166035706)
08:47:19.234 (1)  =BE ReadyForQuery(I)
08:47:19.234 (1) compatible = 8.2
08:47:19.234 (1) loglevel = 2
08:47:19.234 (1) prepare threshold = 5
getConnection returning 
driver[className=org.postgresql.Driver,[EMAIL PROTECTED]
08:47:19.296 (1) simple execute, 
[EMAIL PROTECTED], 
maxRows=0, fetchSize=0, flags=17
08:47:19.296 (1)  FE= Parse(stmt=null,query=select $1 from (select * from 
pg_database) t,oids={23})

08:47:19.296 (1)  FE= Bind(stmt=null,portal=null,$1=1)
08:47:19.296 (1)  FE= Describe(portal=null)
08:47:19.296 (1)  FE= Execute(portal=null,limit=0)
08:47:19.296 (1)  FE= Sync
08:47:19.718 (1)  FE= Terminate
08:47:19.718 (1) Discarding IOException on close:
java.net.SocketException: Connection reset by peer: socket write error
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
at org.postgresql.core.PGStream.flush(PGStream.java:532)
	at 
org.postgresql.core.v3.ProtocolConnectionImpl.close(ProtocolConnectionImpl.java:131)
	at 
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:215)
	at 
org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:452)
	at 
org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:354)
	at 
org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:258)

at org.apache.jsp.Test_jsp._jspService(Test_jsp.java:104)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
	at 
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:332)

at 
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:314)
at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:264)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
	at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252)
	at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
	at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
	at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
	at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
	at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
	at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
	at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
	at 
org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:831)
	at 
org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:639)

at 
org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1196)
at java.lang.Thread.run(Unknown Source)
org.postgresql.util.PSQLException: An I/O error occured while sending to the 
backend.
	at 
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:216)
	at 
org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:452)
	at 
org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:354)
	at 
org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:258)

at org.apache.jsp.Test_jsp._jspService(Test_jsp.java:104)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
	at 
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:332)

at 
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:314)

Re: [HACKERS] Incorrect behavior with CE and ORDER BY

2006-10-25 Thread Matteo Beccati

Tom Lane ha scritto:

Alvaro Herrera [EMAIL PROTECTED] writes:

Limit (50)
  Sort (key: pse_lastlogin)
Result
   Append
  Limit (50)
 SeqScan tbl_profile_search
  Limit (50)
 Indexscan tbl_profile_search_interest_1
  Limit (50)
 IndexScan on the index mentioned above


is wrong because there's no guarantee that the first 50 elements of a
seqscan will be anything special.  You could imagine dealing with that
by sorting the seqscan results and limiting to 50, or by not
sorting/limiting that data at all but letting the upper sort see all the
seqscan entries.  Offhand I think either of those could win depending on
how many elements the seqscan will yield.  Also, it might be interesting
to consider inventing a merge plan node type that takes N
already-sorted inputs and produces a sorted output stream.  Then we'd
need to trade off this approach versus doing the top-level sort, which
could cope with some of its inputs not being pre-sorted.

This seems to have some aspects in common with the recent discussion
about how to optimize min/max aggregates across an appendrel set.


The plan proposed by Alvaro reminds me of:

http://archives.postgresql.org/pgsql-performance/2005-09/msg00047.php

My proposal was in fact (Alvaro's plan + first Tom's suggested change):

Limit (50)
  Sort (key: pse_lastlogin)
Result
   Append
  Limit (50)
 Sort (key: pse_lastlogin)
SeqScan tbl_profile_search
  Limit (50)
 Indexscan tbl_profile_search_interest_1
  Limit (50)
 IndexScan on the index mentioned above

The plan was generated rewriting the query to use explicit subselect and 
forcing the planner to order by and limit for each subquery.


I've tried a few times to write a patch to handle it, but I wasn't able 
to do it because of my lack of internals knowledge and spare time.



Best regards
--
Matteo Beccati
http://phpadsnew.com
http://phppgads.com

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Jim C. Nasby
On Wed, Oct 25, 2006 at 08:22:25PM +0930, Shane Ambler wrote:
 Bruce Momjian wrote:
 
 OK, does that mean we mention EnterpriseDB in the section about Oracle
 functions?  Why not mention MS SQL if they have a better solution?  I
 just don't see where that line can clearly be drawn on what to include.
 Do we mention Netiza, which is loosely based on PostgreSQL?   It just
 seems very arbitrary to include commercial software.  If someone wants
 to put in on a wiki, I think that would be fine because that doesn't
 seems as official.
 
 I agree that the commercial offerings shouldn't be named directly in the 
 docs, but it should be mentioned that some commercial options are 
 available and a starting point to find more information.
 
 If potential new users look through the docs and it says no options 
 available for what they want or consider they will need in the future 
 then they go elsewhere, if they know that some options are available 
 then they will look further if they want that feature.
 
 something like
 There are currently no open source solutions available for this option 
 but there are some commercial offerings. More details of some available 
 solutions can be found at postgresql.org/support/

I think this is probably the best compromise. Keep in mind that many
people who are looking at us will also be looking at MySQL, which is
itself a commercial offering. It's good to let folks know that with
PostgreSQL, they have more control over how much money they spend for
commercial add-ons and support.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Jim C. Nasby
On Wed, Oct 25, 2006 at 11:38:11AM +0200, Markus Schiltknecht wrote:
 I can't really get excited about the exclusion of the term 
 'replication', because it's what most people are looking for. It's a 
 well known term. Sorry if it sounded that way, but I've not meant to 
 avoid that term.
snip 
 IMHO, it does not make sense to speak of a synchronous replication for a 
 'Shared Disk Fail Over'. It's not replication, because there's no replica.

Those to statements are at odds with each other, at least based on
everyone I've ever talked to in a commercial setting. People will use
terms like 'replication', 'HA' or 'clustering' fairly interchangably.
Usually what these folks want is some kind of high-availability
solution. A few are more concerned with scalability. Sometimes it's a
combination of both. That's why I think it's good for the chapter to
deal with both aspects of this.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Joshua D. Drake
 
 I am not inclined to add commercial offerings.  If people wanted
 commercial database offerings, they can get them from companies that
 advertize.  People are coming to PostgreSQL for open source solutions,
 and I think mentioning commercial ones doesn't make sense.
 
 If we are to add them, I need to hear that from people who haven't
 worked in PostgreSQL commerical replication companies.
 

You did, Josh Berkus. Secondly, as many people have stated in the past
not one replication suits everyone's needs and as PostgreSQL has many
replication solutions, it only makes sense to list the more prominent
ones, commercial or not.

Joshua D. Drake


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Joshua D. Drake

 A big part of the value of Postgresql is the applications and extensions
 that support it. Hiding the existence of some subset of those just
 because of the way they're licensed is both underselling postgresql
 and doing something of a disservice to the user of the document.
 
 OK, does that mean we mention EnterpriseDB in the section about Oracle
 functions?

Way to compare apples to houses their Bruce. We are talking about
*PostgreSQL* replication solutions. Not *Oracle* compatibility
functions, However, *if* we had an Oracle compatibility section, I would
say, Yes it does make sense to list EnterpriseDB as a Proprietary
Commercial solution to migrating from Oracle.

  Why not mention MS SQL if they have a better solution?

Because we aren't talking about MS SQL, we are talking about PostgreSQL.

  I
 just don't see where that line can clearly be drawn on what to include.
 Do we mention Netiza, which is loosely based on PostgreSQL?   It just
 seems very arbitrary to include commercial software.

It is no more arbitrary than including *any* information on PostgreSQL
replication solutions, because PostgreSQL doesn't have any.

PostgreSQL doesn't do replication, except for PITR (and that is pushing
it as a replication solution).

Now.. there are *projects* that enable PostgreSQL to do replication.
Some of them are Open Source, some of them are commercial products.

Sincerely,

Joshua D. Drake


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [JDBC] server process (PID 1188) exited with exit code

2006-10-25 Thread Tom Lane
JEAN-PIERRE PELLETIER [EMAIL PROTECTED] writes:
 08:47:19.296 (1)  FE= Parse(stmt=null,query=select $1 from (select * from 
 pg_database) t,oids={23})

Actually, now that I look closely, this command is almost certainly
triggering this beta1 bug:
http://archives.postgresql.org/pgsql-committers/2006-10/msg00107.php

Please try beta2 and see if it isn't fixed.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Markus Schaber
Hi, Cesar,

Cesar Suga wrote:
 If people (who read the documentation) professionally work with
 PostgreSQL, they may already have been briefed by those commercial
 offerings in some way.
 
 I think only the source and its tightly coupled (read: can compile along
 with, free as PostgreSQL) components should be packaged into the tarball.
 
 However, I find Bruce's unofficial wiki idea a good one for comparisons.

My suggestion is that the docs should mention only the pure existence of
important third-party packages and projects in those places where it
talks about the deficits that are supposedly fixed by those.

E. G. There are some third-party packages and projects that aim to
provide multi-master replication, you can search for more information at
http://[unofficial wiki page url] or your favourite search engine.

This way, the docs stay neutral, but point the user to possible
solutions of his problem.

HTH,
Markus
-- 
Markus Schaber | Logical TrackingTracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org



signature.asc
Description: OpenPGP digital signature


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Joshua D. Drake
Cesar Suga wrote:
 Hi,
 
 I also wrote Bruce about that.
 
 It happens that, if you 'freely advertise' commercial solutions (rather
 than they doing so by other vehicles) you will always happen to be an
 'updater' to the docs if they change their product lines, if they change
 their business model, if and if.

That is no different than the open source offerings. We have had several
open source offerings that have died over the years. Replicator, for
example has always been Replicator and has been around longer than any
of the current replication solutions.

 
 If you cite a commercial solution, as a fair game you should cite *all*
 of them.

No. That doesn't make any sense either. I assume we aren't going to list
all PostgreSQL OSS replication solutions (there are at least a dozen or
more).

You list the ones that are stable in their existence (commercial or not).

 If one enterprise has the right to be listed in the
 documentation, all of them might, as you will never be favouring one of
 them.

You are looking at this the wrong way. This isn't about *any*
enterprise. It is about a PostgreSQL Solution. There happens to be two
or three known working open source solutions, and two or three known
working commercial solutions.

 
 That's the main motivation to write this. Moreover, if there are also
 commercial solutions for high-end installs and they are cited as
 providers to those solutions, it (to a point) disencourages those of
 gathering themselves and writing open source extensions to PostgreSQL.


No it doesn't. Because there is always the, It want's to be free! crowd.


 If people (who read the documentation) professionally work with
 PostgreSQL, they may already have been briefed by those commercial
 offerings in some way.

Maybe, maybe not.

Sincerely,

Joshua D. Drake


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Bruce Momjian

I would think that companies that sell closed-source solutions for
PostgreSQL would be modest enough not to push their own agenda for the
documentation.  I think they should just sit back and hope others
suggest it.

[ Josh Berkus recently left Green Plum for Sun. ]

---

Joshua D. Drake wrote:
 
  A big part of the value of Postgresql is the applications and extensions
  that support it. Hiding the existence of some subset of those just
  because of the way they're licensed is both underselling postgresql
  and doing something of a disservice to the user of the document.
  
  OK, does that mean we mention EnterpriseDB in the section about Oracle
  functions?
 
 Way to compare apples to houses their Bruce. We are talking about
 *PostgreSQL* replication solutions. Not *Oracle* compatibility
 functions, However, *if* we had an Oracle compatibility section, I would
 say, Yes it does make sense to list EnterpriseDB as a Proprietary
 Commercial solution to migrating from Oracle.
 
   Why not mention MS SQL if they have a better solution?
 
 Because we aren't talking about MS SQL, we are talking about PostgreSQL.
 
   I
  just don't see where that line can clearly be drawn on what to include.
  Do we mention Netiza, which is loosely based on PostgreSQL?   It just
  seems very arbitrary to include commercial software.
 
 It is no more arbitrary than including *any* information on PostgreSQL
 replication solutions, because PostgreSQL doesn't have any.
 
 PostgreSQL doesn't do replication, except for PITR (and that is pushing
 it as a replication solution).
 
 Now.. there are *projects* that enable PostgreSQL to do replication.
 Some of them are Open Source, some of them are commercial products.
 
 Sincerely,
 
 Joshua D. Drake
 
 
 -- 
 
   === The PostgreSQL Company: Command Prompt, Inc. ===
 Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
 Providing the most comprehensive  PostgreSQL solutions since 1997
  http://www.commandprompt.com/
 
 Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
 
 
 ---(end of broadcast)---
 TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Replication documentation addition

2006-10-25 Thread Bruce Momjian

I have added this text:

Commercial Solutions


Because PostgreSQL is open source and easily extended, a number of
companies have taken PostgreSQL and created commercial closed-source
solutions with unique failover, replication, and load balancing
capabilities.


---

Hannu Krosing wrote:
 ?hel kenal p?eval, T, 2006-10-24 kell 22:57, kirjutas Bruce Momjian:
  I don't think the PostgreSQL documentation should be mentioning
  commercial solutions.
 
 IMNSHO, having commercial solutions based on postgresql which extend
 postgres in directions not (yet?) done by core postgres is nothing to be
 ashamed of.
 
 And we should at least mention the OSS version of Bizgres as a place
 where quite a lot of initial development is done on performance
 improvements considered too risky for mainline postgresql.
 
 And if you need a more technical reason, you can use free libpq and psql
 to connect to even Bizgres MPP ;)
 
 
  ---
  
  Luke Lonergan wrote:
   Bruce, 
   
-Original Message-
From: [EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED] On Behalf Of Bruce Momjian
Sent: Tuesday, October 24, 2006 5:16 PM
To: Hannu Krosing
Cc: PostgreSQL-documentation; PostgreSQL-development
Subject: Re: [HACKERS] Replication documentation addition


OK, I have updated the URL.  Please let me know how you like it.
   
   There's a typo on line 8, first paragraph:
   
   perhaps with only one server allowing write rwork together at the same
   time.
   
   Also, consider this wording of the last description:
   
   Single-Query Clustering...
   
   Replaced by:
   
   Shared Nothing Clustering
   ---
   
   This allows multiple servers with separate disks to work together on a
   each query.
   In shared nothing clusters, the work of answering each query is
   distributed among
   the servers to increase the performance through parallelism.  These
   systems will
   typically feature high availability by using other forms of replication
   internally.
   
   While there are no open source options for this type of clustering,
   there are several
   commercial products available that implement this approach, making
   PostgreSQL achieve
   very high performance for multi-Terabyte business intelligence
   databases.
   
   - Luke
  
 -- 
 
 Hannu Krosing
 Database Architect
 Skype Technologies O?
 Akadeemia tee 21 F, Tallinn, 12618, Estonia
 
 Skype me:  callto:hkrosing
 Get Skype for free:  http://www.skype.com

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Joshua D. Drake
Bruce Momjian wrote:
 I would think that companies that sell closed-source solutions for
 PostgreSQL would be modest enough not to push their own agenda for the
 documentation.  I think they should just sit back and hope others
 suggest it.
 
 [ Josh Berkus recently left Green Plum for Sun. ]

Bruce, you are making an idiot of yourself. With this statement you have
implied that Josh Berkus, are core member somehow has his own agenda
that is not in the interests of the PostgreSQL community.

Further that, you are suggesting that I as a member of Command Prompt
has an agenda that is not in the interests of the PostgreSQL community.

It was rude, uncalled for, inaccurate, and frankly disgusting.


Joshua D. Drake


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Magnus Hagander
  I also wrote Bruce about that.
  
  It happens that, if you 'freely advertise' commercial solutions 
  (rather than they doing so by other vehicles) you will 
 always happen 
  to be an 'updater' to the docs if they change their product 
 lines, if 
  they change their business model, if and if.
 
 That is no different than the open source offerings. We have 
 had several open source offerings that have died over the 
 years. Replicator, for example has always been Replicator and 
 has been around longer than any of the current replication solutions.

I think this is a good reason not to list *any* of the products by name
in the documentation, but instead refer to a page on say techdocs that
can be more easily updated. And that can contain both free and non-free
projects, under clear headlines showing the difference.

The documentation is about PostgreSQL, not about third-party products,
be they free or commercial. Our *website*, however, should give guidance
on which specific products we (as a community) know are stable and
usable along with PostgreSQL (as we do today under downloads, but could
very well do based on specific uses like replication as well)

//Magnus

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Joshua D. Drake

 they change their business model, if and if.
 That is no different than the open source offerings. We have 
 had several open source offerings that have died over the 
 years. Replicator, for example has always been Replicator and 
 has been around longer than any of the current replication solutions.
 
 I think this is a good reason not to list *any* of the products by name
 in the documentation, but instead refer to a page on say techdocs that
 can be more easily updated. And that can contain both free and non-free
 projects, under clear headlines showing the difference.
 
 The documentation is about PostgreSQL, not about third-party products,
 be they free or commercial. Our *website*, however, should give guidance
 on which specific products we (as a community) know are stable and
 usable along with PostgreSQL (as we do today under downloads, but could
 very well do based on specific uses like replication as well)
 

I can agree with this :)

Sincerely,

Joshua D. Drake



-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] materialised view

2006-10-25 Thread Jim C. Nasby
On Wed, Oct 25, 2006 at 09:24:16AM +0530, rajesh boppana wrote:
 i want to implement materialized views in postgresql . to do as i
 want to modify the code in backend but i don't know what r the files i have
 to modify. so please help me by mentioning about the backend code.

If you're going to do this, I'd recommend looking at a modular approach.
Many of the things you need to do materialized views are also needed by
replication, such as the ability to log changes to tables and ship those
somewhere.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 I think this is a good reason not to list *any* of the products by name
 in the documentation, but instead refer to a page on say techdocs that
 can be more easily updated.

I agree with that.  If we have statements about other projects in our
docs, we will have a problem with not being able to update those
statements in a timely fashion when the other projects change.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[HACKERS] Bug in 8.2B1 plpgsql ...

2006-10-25 Thread James Robinson
Seems that plpgsql in 8.2B1 thinks that selects of the form '   
and foo not in (select  ... )' should be function calls, not  
subselects. These worked fine in 8.1.


Here's a smallish script which reproduces the problem on 8.2RC1 / OSX:

If you comment out the 'and NEW.id not in (select t1_id from  
skip_t1_ids) ' clause in the trigger, then the script completes.



social=# \i 8.2.bug.sql
BEGIN
psql:8.2.bug.sql:15: NOTICE:  CREATE TABLE / PRIMARY KEY will create  
implicit index t1_pkey for table t1

CREATE TABLE
CREATE TABLE
CREATE SEQUENCE
CREATE TABLE
CREATE FUNCTION
CREATE TRIGGER
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
psql:8.2.bug.sql:52: ERROR:  cache lookup failed for function 0
CONTEXT:  SQL statement SELECT   $1$2  and  $3  not in (select  
t1_id from skip_t1_ids)

PL/pgSQL function track_t1_changes line 2 at if
ROLLBACK



8.2.bug.sql
Description: Binary data





James Robinson
Socialserve.com


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] Nasty btree deletion bug

2006-10-25 Thread Tom Lane
I've been analyzing Ed L's recent report of index corruption:
http://archives.postgresql.org/pgsql-general/2006-10/msg01183.php
(thanks to Ed for letting me have the actual index to study).
I think I've figured out what happened to it.  nbtree/README says

: The notion of a half-dead page means that the key space relationship between
: the half-dead page's level and its parent's level may be a little out of
: whack: key space that appears to belong to the half-dead page's parent on the
: parent level may really belong to its right sibling.  We can tolerate this,
: however, because insertions and deletions on upper tree levels are always
: done by reference to child page numbers, not keys.  The only cost is that
: searches may sometimes descend to the half-dead page and then have to move
: right, rather than going directly to the sibling page.

but unfortunately this analysis is too simplistic.  In the situation
where a half-dead page is its parent's rightmost child (which is the
only case where we expect half-deadness to persist long), that page's
former key space now belongs to its right sibling, which means that
some keys that are less than the parent's high key now belong to the
keyspace of pages below the parent's right sibling.  This is OK as far
as search behavior goes --- but suppose that we get a continuing stream
of insertions of new keys in that key range.  This will result in page
splits that cause keys in that key range to bubble up into the upper
levels.  If that keeps happening long enough, eventually we will split
the parent's right sibling at a key value less than the parent's high
key, and then we will insert that key into the grandparent level just
to the right of the parent's right sibling.  Now we have an index
that's actually corrupt, because we have out-of-order index keys in
the grandparent level, which can cause searches for keys in their range
to fail (a search may descend too far to the right to find the entries
it should have found).

Since only internal pages can be half-dead, this failure requires at
least a three-level index, and it requires enough deletions within a
small range for for a level-1 page to become empty (hence half dead)
followed by a large number of insertions in that same range.  Ed's
index was probably more prone to this than average because he was
indexing very wide values (~500 byte text), leading to low btree fanout
and a relatively narrow value range for a level-1 page.  Still I'm a
bit surprised that we've not figured it out before, because the bug is
presumably present all the way back to 7.4 when the btree deletion code
was added.

I haven't thought of a suitable fix yet --- clearly we're going to have
to change the concept of half-deadness to some extent.  But I have to
leave for a dentist appointment, so I figured I'd post what I know.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [JDBC] server process (PID 1188) exited with exit code

2006-10-25 Thread JEAN-PIERRE PELLETIER

Yes, the problem is gone in 8.2 beta2.

Thanks all for an outstanding product and support,

Jean-Pierre Pelletier


From: Tom Lane [EMAIL PROTECTED]
To: JEAN-PIERRE PELLETIER [EMAIL PROTECTED]
CC: pgsql-hackers@postgresql.org, [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: [HACKERS] [JDBC] server process (PID 1188) exited with exit 
code Date: Wed, 25 Oct 2006 10:15:03 -0400


JEAN-PIERRE PELLETIER [EMAIL PROTECTED] writes:
 08:47:19.296 (1)  FE= Parse(stmt=null,query=select $1 from (select * 
from

 pg_database) t,oids={23})

Actually, now that I look closely, this command is almost certainly
triggering this beta1 bug:
http://archives.postgresql.org/pgsql-committers/2006-10/msg00107.php

Please try beta2 and see if it isn't fixed.

regards, tom lane




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Joshua D. Drake
Tom Lane wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
 I think this is a good reason not to list *any* of the products by name
 in the documentation, but instead refer to a page on say techdocs that
 can be more easily updated.
 
 I agree with that.  If we have statements about other projects in our
 docs, we will have a problem with not being able to update those
 statements in a timely fashion when the other projects change.

This being said, I would say that the replication documentation needs to
be on Techdocs or some place similar and that we should have a link in
the PostgreSQL docs that points to the techdocs article and possibly:
http://www.postgresql.org/download/ .

Sincerely,

Joshua D. Drake



 
   regards, tom lane
 
 ---(end of broadcast)---
 TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
 


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Bruce Momjian
Markus Schiltknecht wrote:
 Hi,
 
 Bruce Momjian wrote:
  I have updated the text.  Please let me know what else I should change. 
  I am unsure if I should be mentioning commercial PostgreSQL products in
  our documentation.
 
 I support your POV and vote for not including any pointers to commercial 
 extensions in the official documentation. If at all, they should go to 
 'external-projects.sgml', where PostGIS, PgAdmin and other projects are 
 mentioned.
 
 I can't really get excited about the exclusion of the term 
 'replication', because it's what most people are looking for. It's a 
 well known term. Sorry if it sounded that way, but I've not meant to 
 avoid that term.

OK, I have re-added the term replication as appropriate.

 The newly created terms 'Query Broadcast Load Balancing' or even worse 
 'Multi-Master Load Balancing' are more confusing than helpful, because 
 these terms do not exist. (See the googlefight in [1])

OK, renamed.

 Can we name the chapter Fail-over, Load-Balancing and Replication 
 Options? That would fit everything and contain the necessary buzz words.

Yes. Done, cluster added too.

 Also, I'm still missing Multi- vs Single-Master, which are also commonly 
 used terms.

Yea, not sure how to get those in because it somewhat confuses the
purpose of the solution.

 IMHO, it does not make sense to speak of a synchronous replication for a 
 'Shared Disk Fail Over'. It's not replication, because there's no replica.

Agreed.  Modified.

 The Data Partitioning paragraph should probably mention it's close 
 relation with data partitioning across table spaces (and make the 
 differences clear).

Uh, so you I/O load with table spaces.  Uh, that seems too far a reach
to mention here.

 What you call 'Query Broadcast Load Balancing' is also a multi-master 
 replication, thus naming only the later 'Multi-Master Load Balancing' 
 misleading.

Renamed.

 I'd propose to add a subsection 'Synchronous, Multi-Master Replication' 
 and explain the different possibilities on how to do that:
 
 * Query-Based
 * with 2PC
 * Distributed SHMEM
 * (perhaps mention the optimized Postgres-R algorithm ;-)
 
 What you called 'Single-Query Clustering' is probably better known as 
 'Parallel Query Execution'. It can be combined with all types of 
 replication (every combination of async / sync and Single- / 
 Multi-Master). It's maybe load balancing, but it depends on some form of 
 replication to distribute the data first.

Good term.  Added.

 I liked Chris Browns documentation in [2] which was clearer regarding 
 replication (which can be used to do fail-over, load-balancing, 
 data-partitioning or parallel query execution). I'd like to keep all 
 those things a little more separate to get them clear.

Please let me know how you like the new version at the ftp URL.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Bruce Momjian
Jim C. Nasby wrote:
 On Wed, Oct 25, 2006 at 11:38:11AM +0200, Markus Schiltknecht wrote:
  I can't really get excited about the exclusion of the term 
  'replication', because it's what most people are looking for. It's a 
  well known term. Sorry if it sounded that way, but I've not meant to 
  avoid that term.
 snip 
  IMHO, it does not make sense to speak of a synchronous replication for a 
  'Shared Disk Fail Over'. It's not replication, because there's no replica.
 
 Those to statements are at odds with each other, at least based on
 everyone I've ever talked to in a commercial setting. People will use
 terms like 'replication', 'HA' or 'clustering' fairly interchangably.
 Usually what these folks want is some kind of high-availability
 solution. A few are more concerned with scalability. Sometimes it's a
 combination of both. That's why I think it's good for the chapter to
 deal with both aspects of this.

OK, I did break it out somewhat for clarity.  Let me know how it looks
now.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Markus Schiltknecht

Hi,

Jim C. Nasby wrote:

Those to statements are at odds with each other, at least based on
everyone I've ever talked to in a commercial setting. People will use
terms like 'replication', 'HA' or 'clustering' fairly interchangably.
Usually what these folks want is some kind of high-availability
solution. A few are more concerned with scalability. Sometimes it's a
combination of both. That's why I think it's good for the chapter to
deal with both aspects of this.


Yabut... at least the PostgreSQL manual should uses the terms correctly.

And while I do perfectly agree that it's a fail-over solution and it 
should be mentioned in that section, I'm arguing that it's not replication.


Regards

Markus

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Bruce Momjian
Tom Lane wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
  I think this is a good reason not to list *any* of the products by name
  in the documentation, but instead refer to a page on say techdocs that
  can be more easily updated.
 
 I agree with that.  If we have statements about other projects in our
 docs, we will have a problem with not being able to update those
 statements in a timely fashion when the other projects change.

I mention only Slony and pgpool as examples of replication types.  They
seem to have risen to high enough visiblity to do that. I have not
mentioned any other solutions.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Joshua D. Drake
Bruce Momjian wrote:
 Tom Lane wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
 I think this is a good reason not to list *any* of the products by name
 in the documentation, but instead refer to a page on say techdocs that
 can be more easily updated.
 I agree with that.  If we have statements about other projects in our
 docs, we will have a problem with not being able to update those
 statements in a timely fashion when the other projects change.
 
 I mention only Slony and pgpool as examples of replication types.  They
 seem to have risen to high enough visiblity to do that. I have not
 mentioned any other solutions.

What about Slony-II or pgpool2? Which are fundamentally different from
their v1 counterparts (o.k. slony-ii isn't out yet but still).

I +1 that we move to have all of the replication documentation pushed to
techdocs or other facility and just have a link from the docs.

Joshua D. Drake


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Bruce Momjian
Joshua D. Drake wrote:
 Bruce Momjian wrote:
  Tom Lane wrote:
  Magnus Hagander [EMAIL PROTECTED] writes:
  I think this is a good reason not to list *any* of the products by name
  in the documentation, but instead refer to a page on say techdocs that
  can be more easily updated.
  I agree with that.  If we have statements about other projects in our
  docs, we will have a problem with not being able to update those
  statements in a timely fashion when the other projects change.
  
  I mention only Slony and pgpool as examples of replication types.  They
  seem to have risen to high enough visiblity to do that. I have not
  mentioned any other solutions.
 
 What about Slony-II or pgpool2? Which are fundamentally different from
 their v1 counterparts (o.k. slony-ii isn't out yet but still).
 
 I +1 that we move to have all of the replication documentation pushed to
 techdocs or other facility and just have a link from the docs.

What I did was to mention Slony and pgpool as examples, so people
realize there are many other soluions.  It would be good to have a
companion web site that could list them all, both open source and
commercial.  That is going to take a lot more work, but I think would
have great value, especially since our documentation will clearly
outline the terms.  What you don't want to do is to throw up a list and
have people try to figure out what solutions they cover.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Replication documentation addition

2006-10-25 Thread Richard Troy

Hi Hannu, everyone,

I apologize for not having read the document in question - will do
shortly. My comments are brought about by the dialogue I read on list this
morning...

  Here is a new replication documentation section I want to add for 8.2:
 
  ftp://momjian.us/pub/postgresql/mypatches/replication


  Data Partitioning
  -
 
  Data partitioning splits the database into data sets.  To achieve
  replication, each data set can only be modified by one server.  For
  example, data can be partitioned by offices, e.g. London and Paris.
  While London and Paris servers have all data records, only London can
  modify London records, and Paris can only modify Paris records.  Such
  partitioning is usually accomplished in application code, though rules
  and triggers can help enforce partitioning and keep the read-only data
  sets current.  Slony can also be used in such a setup.  While Slony
  replicates only entire tables, London and Paris can be placed in
  separate tables, and inheritance can be used to access from both tables
  using a single table name.

 Maybe another use of partitioning should also be mentioned. That is ,
 when partitioning is used to overcome limitations of single servers
 (especially IO and memory, but also CPU), and only a subset of data is
 stored and processed on each server.

  I think the official term for this kind of replication is
  Shared-Nothing Clustering.

Data partitioning has two fundamental flavors, horizontal and
vertical, quite a handful of implementations, and even more motivations
behind why one uses either strategy and whatever implementation. The same
is true for clustering - a few fundamental strategies, with a larger
number of implementations and yet more motivations. Replication,
meanwhile, is yet another beast altogether, sharing the same fundamentals
of multiple flavors, implementations and motivations. … I strongly urge
keeping any documentation on these (and related) topics strictly distinct
and separate.

In my view, one should define the terms first, separately, distinctly, and
as succinctly as possible, and, following this, a dialogue on how these
may be combined can be entertained. The definitions of each should be both
complete and academic in flavor and may include implementation and
motivational  information, but never muddy the water by mixing with
other concepts - not yet, not until after all the fundamentals have been
introduced.

I don't know much about what PostgreSql has been doing in these areas of
late - nothing, I gather from someone's post this morning - but I'll try
to help out as I can with a paragraph or two - whatever you want,
whatever's welcome - as I was there when Randy Eash created the first
commercial RDBMS replicator - for Ingres - and since I created the first
commercial RDBMS front-end failover technology, also for Ingres, so I have
a pretty good handle on all the issues.

Also, I liked what Markus Schiltknecht wrote, but will have to read the
original before I can comment on his specific points.

 I am not inclined to add commercial offerings.  If people wanted
 commercial database offerings, they can get them from companies that
 advertize.  People are coming to PostgreSQL for open source solutions,
 and I think mentioning commercial ones doesn't make sense.

 If we are to add them, I need to hear that from people who haven't
 worked in PostgreSQL commerical replication companies.

 I'm not coming to PostgreSQL for open source solutions. I'm coming
 to PostgreSQL for _good_ solutions.

 I want to see what solutions might be available for a problem I have.
 I certainly want to know whether they're freely available, commercial
 or some flavour of open source, but I'd like to know about all of them.

 A big part of the value of Postgresql is the applications and extensions
 that support it. Hiding the existence of some subset of those just
 because of the way they're licensed is both underselling postgresql
 and doing something of a disservice to the user of the document.

 If potential new users look through the docs and it says no options
 available for what they want or consider they will need in the future
 then they go elsewhere, if they know that some options are available
 then they will look further if they want that feature.


I agree that people look through the materials on the web site,
documentation especially, and make choices based upon what they see. Many
of us don't have time to spend a day searching the web for things we don't
even know exist. By including more information, more users will be
attracted to PostgreSql, whether it be in the documentation or web site. I
have been SURE that certain things must exist in the PG world, but haven't
known about them with certainty due to time constraints, but would gladly
point our customers at Postgres solutions if only I knew about them. Count
this paragraph as praise for doing _something_more_ to help get more
information to (prospective) users.

Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread David Fetter
On Wed, Oct 25, 2006 at 11:38:11AM +0200, Markus Schiltknecht wrote:

 Can we name the chapter Fail-over, Load-Balancing and Replication 
 Options? That would fit everything and contain the necessary buzz words.
...

 IMHO, it does not make sense to speak of a synchronous replication for a 
 'Shared Disk Fail Over'. It's not replication, because there's no replica.

As you point out, there is no replica of the data, but there is some
protection against machine failure, which puts it firmly in the
Fail-over part above.

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Bruce Momjian
David Fetter wrote:
 On Wed, Oct 25, 2006 at 11:38:11AM +0200, Markus Schiltknecht wrote:
 
  Can we name the chapter Fail-over, Load-Balancing and Replication 
  Options? That would fit everything and contain the necessary buzz words.
 ...
 
  IMHO, it does not make sense to speak of a synchronous replication for a 
  'Shared Disk Fail Over'. It's not replication, because there's no replica.
 
 As you point out, there is no replica of the data, but there is some
 protection against machine failure, which puts it firmly in the
 Fail-over part above.

Right, but his point was not to call it synchronous.  I have fixed that
in the current version.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Casey Duncan
Totally agree. The docs will tend to outlive whatever projects or  
websites they mention. Best to not bake that into stone.


-Casey

On Oct 25, 2006, at 3:36 AM, Magnus Hagander wrote:


I don't think the PostgreSQL documentation should be
mentioning commercial solutions.


I think maybe the PostgreSQL documentation should be careful about
trying to list a complete list of commercial *or* free solutions.
Instead linking to something on the main website or on techdocs  
that can

more easily be updated.

//Magnus

---(end of  
broadcast)---

TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [JDBC] server process (PID 1188) exited with exit code -1073741819, 8.2 beta1

2006-10-25 Thread Thomas H.

This is a server bug, I will post to hackers for you,


Please provide a complete test case.  I tried to reproduce the failure
in libpq, with


tom, i've just noticed this is the exact same error message  errorcode as i 
get when updating a table that contains a tsearch2 vector column


2006-10-25 20:08:42 [3420] LOG:  0: server process (PID 2332) exited 
with exit code -1073741819

2006-10-25 20:08:42 [3420] LOCATION:  LogChildExit, postmaster.c:2385

(test case provided, see Re: [BUGS] 8.2beta1 (w32): server process crash 
(tsvector)  from oct. 17th)


maybe dave is also using tsearch2 in the involved tables, or is this a 
general error code?


regards,
thomas 




---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Replication documentation addition

2006-10-25 Thread Richard Troy

 Here is a new replication documentation section I want to add for 8.2:

 ftp://momjian.us/pub/postgresql/mypatches/replication


...Read the document, as promissed...

First paragraph, (fail over) is inconsistent with title, failover, as
are other spots throughout the document. The whole document should be
consistent and I vote for failover and not fail over.

Fourth paragraph, This sync problem is the fundamental difficulty for
servers working together; Sync problem hasn't been defined. Actually,
you're talking about the consistent attribute of the acid properties of
all competent databases: Atomic, Consistency, Isolation, and Durability.
At least define the term you are using - probably most easily done in the
preceeding paragraph.

The fifth paragraph needs a lot more help, I think. Howabout this
alternative:

So called two phaised commit was developed as a strategy in which two or
more databases are updated simultaneously and none of the data is
committed until all are committed. This guarantees consistency between the
databases with all propagation delay being absorbed by the writer at write
time. There are times when this propagation delay is large, so sometimes
alternatives are worked out which we'll call here asynchronous updates,
however, in these cases, there is always a window of time in which some
transaction can be lost should a failure occurr. For this reason,
asynchronous updates are only used when the possibility of such losses is
acceptible.

Paragraphs six through to shared disk failover seem very awkward to me.
I don't like them at all.

Shared disk failover has nothing to do with the sync problem as it's
not a multiple-database solution. It's an uptime, 24 X 7 X 365 issue.
Further, it also has nothing to do with disk arrays, though it is often
used with RAID to help avoid disk based corruption problems.

The point about Warm Standby needs to include a warning about WAL that it
MUST be sensitive to the semantics of the database design or else it's
fatally flawed. I'm talking about referential integrety. That is to say,
it's inappropriate to capture updates on a table by table basis, as some
such systems do, (I have no idea what's done by anyone in the PG world on
this right now) because an update to one table (esp. inserts) very often
go hand in glove with updates in other tables and to get one without the
other can corrupt a database.

The description of Continuously running replication server should
include the critical caveat - repeated if you think it's already said
elsewhere - that it is ONLY suitable for applications in which a loss of
(missing) update data doesn't matter. For example, an airline reservation
system would be an inappropriate application for such a solution because
what seats are available cannot be guaranteed to be correct.

Regarding data partitioning, I strongly disagree with the opening sentence
in that it doesn't split a database into sets, it splits tables into sets.
Data partitioning is often done within a single database on a single
server and therefore, as a concept, has nothing whatsoever to do with
different servers. Similarly, the second paragraph of this section is
problematic. Please define your term first, then talk about some
implementations - this is muddying the water. Further, there are both
vertical and horizontal partitioning - you mention neither - and each has
its own distinct uses. If partitioning is mentioned, it should be more
complete.

Next, Query Broadcast Load Balancing... also needs a lot of work. First,
it's foremost in my memory that sending read queries everywhere and
returning the first result set back is a key way to improve application
performance at the cost of additional load on other systems - I guess
that's not at all what the document is after here, but it's a worthy part
of a dialogue on broadcasting queries. In other words, this has more parts
to it than just what the document now entertains. Secondly, the document
doesn't address _at_all_ whether this is a two-phaise-commit environment
or not. If not, how are updates managed? If each server operates
independently and one of them fails, what do you do then? How do you know
_any_ server got an insert/update? ...  Each server _can't_ operate
independently unless the application does its own insert/update commits to
every one of them - and that can't be fast, nor does it load balance,
though it may contribute to superior uptime performance by the
application.

Next up; I'm not aware of any current products or projects that provide
parallel query execution, though Informix might - I can ask a colleague or
two. Either way, it's probably best to simply define the term (perhaps in
a little more detail), and not mention solutions - they change with time
anyway.

While I've never used Oracle's clustering tools, I've read up on them and
have customers who use them, and I think this description of Oracle
clustering is a mis-read on what the Oracle system actually does. A check

Re: [HACKERS] Replication documentation addition

2006-10-25 Thread Bruce Momjian
Alexey Klyukin wrote:
 Hi,
 
 A typo:
 (a write to any server has to be _propogated_)
 s/propogated/propagated

Thanks, fixed.

---


 
 Bruce Momjian wrote:
  Here is a new replication documentation section I want to add for 8.2:
 
  ftp://momjian.us/pub/postgresql/mypatches/replication
 
  Comments welcomed.
 

 -- 
 Regards,
 
 Alexey Klyukinalexk(at)vollmond.org.ua
 Simferopol, Crimea, Ukraine.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Replication documentation addition

2006-10-25 Thread Josh Berkus
Bruce,

 ftp://momjian.us/pub/postgresql/mypatches/replication

I'm still not seeing anything in this patch that tells users where they can 
get replication solutions for PostgreSQL, either OSS or commercial.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Replication documentation addition

2006-10-25 Thread Bruce Momjian
Richard Troy wrote:
 
  Here is a new replication documentation section I want to add for 8.2:
 
  ftp://momjian.us/pub/postgresql/mypatches/replication
 
 
 ...Read the document, as promissed...
 
 First paragraph, (fail over) is inconsistent with title, failover, as
 are other spots throughout the document. The whole document should be
 consistent and I vote for failover and not fail over.

OK.  Fixed to failover

 Fourth paragraph, This sync problem is the fundamental difficulty for
 servers working together; Sync problem hasn't been defined. Actually,
 you're talking about the consistent attribute of the acid properties of
 all competent databases: Atomic, Consistency, Isolation, and Durability.
 At least define the term you are using - probably most easily done in the
 preceeding paragraph.

OK, sync problem term removed, and spelled out fully.

 The fifth paragraph needs a lot more help, I think. Howabout this
 alternative:
 
 So called two phaised commit was developed as a strategy in which two or
 more databases are updated simultaneously and none of the data is
 committed until all are committed. This guarantees consistency between the
 databases with all propagation delay being absorbed by the writer at write
 time. There are times when this propagation delay is large, so sometimes
 alternatives are worked out which we'll call here asynchronous updates,
 however, in these cases, there is always a window of time in which some
 transaction can be lost should a failure occurr. For this reason,
 asynchronous updates are only used when the possibility of such losses is
 acceptible.

I have modified the paragraph to use some of your terms.

 Paragraphs six through to shared disk failover seem very awkward to me.
 I don't like them at all.
 
 Shared disk failover has nothing to do with the sync problem as it's
 not a multiple-database solution. It's an uptime, 24 X 7 X 365 issue.
 Further, it also has nothing to do with disk arrays, though it is often
 used with RAID to help avoid disk based corruption problems.

Yes, please see updated version.  I removed the sync problem term from
there.

 The point about Warm Standby needs to include a warning about WAL that it
 MUST be sensitive to the semantics of the database design or else it's
 fatally flawed. I'm talking about referential integrety. That is to say,
 it's inappropriate to capture updates on a table by table basis, as some
 such systems do, (I have no idea what's done by anyone in the PG world on
 this right now) because an update to one table (esp. inserts) very often
 go hand in glove with updates in other tables and to get one without the
 other can corrupt a database.

We don't have that problem.  We recover only full transactions.

 The description of Continuously running replication server should
 include the critical caveat - repeated if you think it's already said
 elsewhere - that it is ONLY suitable for applications in which a loss of
 (missing) update data doesn't matter. For example, an airline reservation
 system would be an inappropriate application for such a solution because
 what seats are available cannot be guaranteed to be correct.

I have added note about data loss for the Slony item.

 Regarding data partitioning, I strongly disagree with the opening sentence
 in that it doesn't split a database into sets, it splits tables into sets.

OK, changed.

 Data partitioning is often done within a single database on a single
 server and therefore, as a concept, has nothing whatsoever to do with
 different servers. Similarly, the second paragraph of this section is

Uh, why would someone split things up like that on a single server?

 problematic. Please define your term first, then talk about some
 implementations - this is muddying the water. Further, there are both
 vertical and horizontal partitioning - you mention neither - and each has
 its own distinct uses. If partitioning is mentioned, it should be more
 complete.

Uh, what exactly needs to be defined.

 Next, Query Broadcast Load Balancing... also needs a lot of work. First,
 it's foremost in my memory that sending read queries everywhere and
 returning the first result set back is a key way to improve application
 performance at the cost of additional load on other systems - I guess
 that's not at all what the document is after here, but it's a worthy part
 of a dialogue on broadcasting queries. In other words, this has more parts
 to it than just what the document now entertains. Secondly, the document

Uh, do we want to go into that here?  I guess I could.

 doesn't address _at_all_ whether this is a two-phaise-commit environment
 or not. If not, how are updates managed? If each server operates
 independently and one of them fails, what do you do then? How do you know
 _any_ server got an insert/update? ...  Each server _can't_ operate
 independently unless the application does its own insert/update commits to
 every one of them - and that can't be fast, nor does it 

Re: [HACKERS] Replication documentation addition

2006-10-25 Thread Bruce Momjian
Josh Berkus wrote:
 Bruce,
 
ftp://momjian.us/pub/postgresql/mypatches/replication
 
 I'm still not seeing anything in this patch that tells users where they can 
 get replication solutions for PostgreSQL, either OSS or commercial.

It isn't designed for that.  It is designed for people to understand
what they want, and then they can look around for solutions.  I think
most agree we don't want a list of solutions in the documentation,
though I have a few as examples.  Also, some of the solutions don't
require software, but just configuration or special hardware.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Bug in 8.2B1 plpgsql ...

2006-10-25 Thread Tom Lane
James Robinson [EMAIL PROTECTED] writes:
 Seems that plpgsql in 8.2B1 thinks that selects of the form '   
 and foo not in (select  ... )' should be function calls, not  
 subselects. These worked fine in 8.1.

Fixed, thanks.  It's not actually plpgsql's fault ...

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] Traveling for two weeks

2006-10-25 Thread Bruce Momjian
FYI, I am leaving Friday for a two-week trip for EnterpriseDB.  I am
going to Tokyo, Islamabad (Pakistan), and Pune (India).  I return on
Friday, November 10.  I will have Internet connectivity, but of course I
will not be online as frequently as usual.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Dawid Kuroczko

On 10/25/06, Bruce Momjian [EMAIL PROTECTED] wrote:

Joshua D. Drake wrote:
 Bruce Momjian wrote:
  Tom Lane wrote:
  Magnus Hagander [EMAIL PROTECTED] writes:
  I think this is a good reason not to list *any* of the products by name
  in the documentation, but instead refer to a page on say techdocs that
  can be more easily updated.
  I agree with that.  If we have statements about other projects in our
  docs, we will have a problem with not being able to update those
  statements in a timely fashion when the other projects change.
 
  I mention only Slony and pgpool as examples of replication types.  They
  seem to have risen to high enough visiblity to do that. I have not
  mentioned any other solutions.

 What about Slony-II or pgpool2? Which are fundamentally different from
 their v1 counterparts (o.k. slony-ii isn't out yet but still).

 I +1 that we move to have all of the replication documentation pushed to
 techdocs or other facility and just have a link from the docs.

What I did was to mention Slony and pgpool as examples, so people
realize there are many other soluions.  It would be good to have a
companion web site that could list them all, both open source and
commercial.  That is going to take a lot more work, but I think would
have great value, especially since our documentation will clearly
outline the terms.  What you don't want to do is to throw up a list and
have people try to figure out what solutions they cover.


I'm in quite an unique situation right now, working with a few DBAs
who have deep knowledge but no PostgreSQL background, so I have
a good view how PostgreSQL is perceived by people with fair knowledge
of other databases.

What I have noticed is a deep respect for community.  If they ask about
replication solution, and I tell about Slony, they ask if Slony is provided
with the postgresql-contrib. Well... no, and it won't be.  Then they look
back, think a while and say somethig on the lines of: well, $SOME_OTHER
_DATABASE was using external replication solutions so it is all right.

But then, before I talked with them, they did some quick research on
PostgreSQL and their perception was that there's no replication / replication
is shady in PostgreSQL.  It would be quite convenient to tell them:
No replication? Did you actually read the manual? here goes URL
Well, pointing them to slony page is a solution but of a lesser caliber
(how should they know about Slony anyway? They are newbies).
Pointing them at The Documentation is a Good Argument (and it may
cause them to look for some other information, like SQL syntax or
PostgreSQL-specific catalog views there, which is Good).

Enough background.

Bruce, I've read Your documentation and I was left a bit with a feeling
that it's a bit too generic.  It's almost as if it could be about just about
any major database, not PostgreSQL specific.  I feel that, when I'm
reading PostgreSQL docs I would like to know how to set up multi-master
replication with PostgreSQL not an explanation what a multi-master
replication is. It's not about the actual documentation content, but rather
on accents distribution.  Now it is something like: These are the types
of replication solutions possible, some of them can be done with PostgreSQL,
I think it should be rather: With PostgreSQL and some third-party tools you
can achieve such and such replication solutions, oh and by the way, research
is done on such and such replication method, but it's not a production quality
yet.

And I try to think as my DBA-mates would do if they read the documentation,
I'm not sure they would end up enlighted after reading the docs -- thay would
probably say: hey, I knew that, it's well structured there, but I
still don't know
what should I use, or maybe where can I read something about this slony
thing anyway?.

It may be my closed thinking schema though.  What I feel is that such
outsider, after reading these docs should end with Aha! I should be using
Slony for my purposes.  Or pgpool, if it's what she needs.  I believe Tom's
remark that it does NOT belong in the PostgreSQL documentation is quite
right (though I wish there IS some reference to external replication packages,
mainly because over and over again I need to prove PostgreSQL CAN be
replicated, and it's not uncommon).  However I'm still unconvinced about
TechDocs -- TechDocs are good but still they are a bit scattered and
unorganised.  I am a PostgreSQL enthusiast, but it took me a while to
learn about them, and for newbies not biased towards PostgreSQL it may
take even more time.  If it is linked from within the documentation, random
DBAs might read it, and I wish they do.

Right now I am more and more biased towards an additional documentation
book for PostgreSQL, something like DBA guide or handbook.  In format
similar to the PostgreSQL documentation, but inside oriented around
configuring other tools around and together with PostgreSQL.  I shall send
here some drafts withing 10-days time to seed a 

Re: [HACKERS] Replication documentation addition

2006-10-25 Thread Josh Berkus
Bruce,

 It isn't designed for that.  It is designed for people to understand
 what they want, and then they can look around for solutions.  I think
 most agree we don't want a list of solutions in the documentation,
 though I have a few as examples.  

Do they?   I've seen no discussion of the matter.  I think we should have 
them.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Replication documentation addition

2006-10-25 Thread Bruce Momjian
Josh Berkus wrote:
 Bruce,
 
  It isn't designed for that.  It is designed for people to understand
  what they want, and then they can look around for solutions.  I think
  most agree we don't want a list of solutions in the documentation,
  though I have a few as examples.  
 
 Do they?   I've seen no discussion of the matter.  I think we should have 
 them.

Most people didn't want a list because there is no way to keep it
current in the docs, and a secondary web site was suggested for the
list.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Bruce Momjian
Dawid Kuroczko wrote:
 Bruce, I've read Your documentation and I was left a bit with a feeling
 that it's a bit too generic.  It's almost as if it could be about just about
 any major database, not PostgreSQL specific.  I feel that, when I'm
 reading PostgreSQL docs I would like to know how to set up multi-master
 replication with PostgreSQL not an explanation what a multi-master
 replication is. It's not about the actual documentation content, but rather
 on accents distribution.  Now it is something like: These are the types
 of replication solutions possible, some of them can be done with PostgreSQL,
 I think it should be rather: With PostgreSQL and some third-party tools you
 can achieve such and such replication solutions, oh and by the way, research
 is done on such and such replication method, but it's not a production quality
 yet.
 
 And I try to think as my DBA-mates would do if they read the documentation,
 I'm not sure they would end up enlighted after reading the docs -- thay would
 probably say: hey, I knew that, it's well structured there, but I
 still don't know
 what should I use, or maybe where can I read something about this slony
 thing anyway?.

Well, the idea is to have a web site that lists all the solutions that
can be updated regularly, perhaps using the categories from the
documentation.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Replication documentation addition

2006-10-25 Thread Josh Berkus
Bruce,

 Most people didn't want a list because there is no way to keep it
 current in the docs, and a secondary web site was suggested for the
 list.

So, like www.postgresql.org/docs/techdocs/replication?   That would work.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Replication documentation addition

2006-10-25 Thread Bruce Momjian
Josh Berkus wrote:
 Bruce,
 
  Most people didn't want a list because there is no way to keep it
  current in the docs, and a secondary web site was suggested for the
  list.
 
 So, like www.postgresql.org/docs/techdocs/replication?   That would work.

Yes.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Jim C. Nasby
On Wed, Oct 25, 2006 at 04:42:17PM -0400, Bruce Momjian wrote:
 Dawid Kuroczko wrote:
  Bruce, I've read Your documentation and I was left a bit with a feeling
  that it's a bit too generic.  It's almost as if it could be about just about
  any major database, not PostgreSQL specific.  I feel that, when I'm
  reading PostgreSQL docs I would like to know how to set up multi-master
  replication with PostgreSQL not an explanation what a multi-master
  replication is. It's not about the actual documentation content, but rather
  on accents distribution.  Now it is something like: These are the types
  of replication solutions possible, some of them can be done with 
  PostgreSQL,
  I think it should be rather: With PostgreSQL and some third-party tools you
  can achieve such and such replication solutions, oh and by the way, research
  is done on such and such replication method, but it's not a production 
  quality
  yet.
  
  And I try to think as my DBA-mates would do if they read the documentation,
  I'm not sure they would end up enlighted after reading the docs -- thay 
  would
  probably say: hey, I knew that, it's well structured there, but I
  still don't know
  what should I use, or maybe where can I read something about this slony
  thing anyway?.
 
 Well, the idea is to have a web site that lists all the solutions that
 can be updated regularly, perhaps using the categories from the
 documentation.

And the docs should point to that page, prominently (presumably that
will happen after the page actually exists).

Something else worth doing though is to have a paragraph explaining why
there's no built-in replication. I don't have time to write something
right now, but I can do it later tonight if no one beats me to it.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Nasty btree deletion bug

2006-10-25 Thread Tom Lane
I wrote:
 I've been analyzing Ed L's recent report of index corruption:
 http://archives.postgresql.org/pgsql-general/2006-10/msg01183.php

After some thought, it still seems to me to be impractical to delete the
rightmost child of a btree page while other children remain.  Doing this
would require either moving the child's keyspace assignment left to its
left sibling, or moving that keyspace right to the first child of the
parent's right sibling, and either of these choices means having to
adjust page high keys in a way that's not certain to work.  (For
instance, moving the keyspace left means assigning the victim page's
high key to its left sibling, and there might not be room in the left
sibling page if the desired key is longer than what's there.  In the
other case the same problem of having to replace a key with a
potentially longer one exists, but it manifests at the grandparent
level.)

So I think the rule needs to be don't delete the rightmost child unless
it's the only child, in which case you can delete the parent too --- but
the same restriction must be observed at the next level up.

I believe we can handle this by doing a precheck before starting a
delete of a level-zero page: scan up the stack to check that the
condition is satisfied for each level that needs to be deleted.
The tricky part is that we don't want to exclusive-lock all those
pages at once (we could do it without deadlock risk, but the concurrency
hit seems bad).  I think though that we can assume that if the condition
is checked it will remain true until we finish the delete:

1. A page that isn't rightmost child can't become so while we're not
looking, because that would require a delete to occur, and only VACUUM
does index page deletes, and we already disallow two concurrent VACUUMs
on the same index.

2. A page that is an only child could acquire a sibling only if it's
split, but that would imply an insert (in fact multiple inserts) into
the original level-zero page.  We'll recheck emptiness of the level-zero
page after we acquire write lock on it to begin the actual deletion,
at which point it's still safe to abandon the delete attempt.

The recent patch to allow ordinary non-vacuum processes to delete index
entries retail makes #2 a little trickier than meets the eye.  One could
imagine a scenario where between the times VACUUM leaves the level-zero
page and reacquires lock on it, enough entries were inserted to split
the page and then they all got deleted again by that patch.  However
that patch in its present form cannot leave a page in a completely
empty state, because it's only invoked as part of an insertion attempt.
(If it did manage to delete all the existing entries, then the same
process would insert a new entry onto the same page before unlocking
it.)  So I think it's OK, but we'll need to be wary about any proposals
to remove index entries in other contexts.

The concept of a half-dead page would remain, but it'd be a transient
state that would normally only persist for a moment between atomic
page-delete actions.  If we crash between two such actions, the
half-dead page would remain present, but would be found and cleaned up
by the next VACUUM.  In the meantime it wouldn't cause any problem
because the keyspace it gives up will belong to a sibling of the same
parent at whatever level the delete is ultimately supposed to stop at,
and so inserts and even splits in that keyspace won't create an
inconsistency.  Alternatively, we could have WAL crash recovery complete
the multilevel deletion using the same sort of remember-pending-actions
logic we use now to handle splits.

Comments?  Have I missed anything?

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Bruce Momjian
Jim C. Nasby wrote:
 Something else worth doing though is to have a paragraph explaining why
 there's no built-in replication. I don't have time to write something
 right now, but I can do it later tonight if no one beats me to it.

I thought that was implied in the early paragraph about why there are
many solutions.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Cesar Suga

Joshua D. Drake wrote:

Cesar Suga wrote:
  

Hi,

I also wrote Bruce about that.

It happens that, if you 'freely advertise' commercial solutions (rather
than they doing so by other vehicles) you will always happen to be an
'updater' to the docs if they change their product lines, if they change
their business model, if and if.



That is no different than the open source offerings. We have had several
open source offerings that have died over the years. Replicator, for
example has always been Replicator and has been around longer than any
of the current replication solutions.
  

The documentation comes with the open source tarball.

I would welcome if the docs point to an unofficial wiki (maintained 
externally from authoritative PostgreSQL developers) or a website 
listing them and giving a brief of each solution.


postgresql.org already does this for events (commercial training!) and 
news. Point to postgresql.org/download/commercial as there *already* are 
brief descriptions, pricing and website links.

If you cite a commercial solution, as a fair game you should cite *all*
of them.



No. That doesn't make any sense either. I assume we aren't going to list
all PostgreSQL OSS replication solutions (there are at least a dozen or
more).

You list the ones that are stable in their existence (commercial or not).
  
And how would you determine it? Years of existance? Contribution to 
PostgreSQL's source code? It is not easy and wouldn't be fair. There are 
ones that certainly will be listed, and other doubtful ones (which would 
perhaps complain, that's why I said 'all' - if they are not stable, 
either they stay out of the market or fix their problems).

If one enterprise has the right to be listed in the
documentation, all of them might, as you will never be favouring one of
them.



You are looking at this the wrong way. This isn't about *any*
enterprise. It is about a PostgreSQL Solution. There happens to be two
or three known working open source solutions, and two or three known
working commercial solutions.
  

(see first three paragraphs)

That's the main motivation to write this. Moreover, if there are also
commercial solutions for high-end installs and they are cited as
providers to those solutions, it (to a point) disencourages those of
gathering themselves and writing open source extensions to PostgreSQL.



No it doesn't. Because there is always the, It want's to be free! crowd.
  
Yes, I agree there are. But also development in *that* cutting-edge is 
scarce. It feels that something had filled the gap if you list some 
commercial solution, mainly people in the trenches (DBAs). They would, 
obviously, firstly seek the commercial solutions as they are interested. 
So they click 'commercial products' in the main website.

If people (who read the documentation) professionally work with
PostgreSQL, they may already have been briefed by those commercial
offerings in some way.



Maybe, maybe not.

Sincerely,

Joshua D. Drake
  
And I agree with your point, still. However, that would open a precedent 
for people to have to maintain lists of stable software in every 
documentation area.


Regards,
Cesar


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


[HACKERS] pg_get_domaindef()

2006-10-25 Thread FAST PostgreSQL
Hi All,

I am now trying to implement pg_get_domaindef() function which is in the TODO 
list and ran into a minor issue.

When the following command is given

CREATE DOMAIN testdomain AS text CONSTRAINT testconstraint NOT NULL;

I couldn't find the CONSTRAINT name ('testconstraint' in this case) being 
stored in the system catalog. Any idea where I can find it?

Or is it acceptable, for the above statement, for pg_get_domaindef() to 
return 

CREATE DOMAIN testdomain AS text CONSTRAINT NOT NULL;

Rgds,
Arul Shaji
This is an email from Fujitsu Australia Software Technology Pty Ltd, ABN 27 003 
693 481. It is confidential to the ordinary user of the email address to which 
it was addressed and may contain copyright and/or legally privileged 
information. No one else may read, print, store, copy or forward all or any of 
it or its attachments. If you receive this email in error, please return to 
sender. Thank you.

If you do not wish to receive commercial email messages from Fujitsu Australia 
Software Technology Pty Ltd, please email [EMAIL PROTECTED]


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings