Re: [HACKERS] Auto Partitioning

2007-04-06 Thread Richard Troy

> David Fetter wrote:
> > On Fri, Apr 06, 2007 at 09:22:55AM -0700, Joshua D. Drake wrote:
> >
> >>> The people that use it are the people stuck by dogmatic rules about
> >>> "every table must have a primary key" or "every logical constraint
> >>> must be protected by a database constraint". Ie, database shops run
> >>> by the CYA principle.
> >> Or ones that actually believe that every table where possible should
> >> have a primary key.
> >>
> >> There are very, very few instances in good design where a table does
> >> not have a primary key.
> >>
> >> It has nothing to do with CYA.
> >
> > That depends on what you mean by CYA.  If you mean, "taking a
> > precaution just so you can show it's not your fault when the mature
> > hits the fan," I agree.  If you mean, "taking a precaution that will
> > actually prevent a problem from occurring in the first place," it
> > definitely does.
>
> Heh, fair enough. When I think of CYA, I think of the former.
>
> Joshua D. Drake

...I was thinking the point was more about "primary key" as syntax, as
opposed to a table having an attribute (or attributes) that DML coders
acknowledge as the appropriate way to use the stored data. That is, I may
very well _not_ want the overhead of an index of any kind, forced
uniqueness, etc., but might still think of a given attribute as the
primary key. Use of constraints in lieu of "primary key" comes to mind...
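
To make the distinction concrete, a minimal sketch (table and column names
invented, purely illustrative):

    -- Conceptual key only: DML coders treat part_no as "the" key, but
    -- there is no PRIMARY KEY syntax - hence no index and no forced
    -- uniqueness.
    CREATE TABLE part_history (
        part_no     integer NOT NULL,    -- the acknowledged lookup attribute
        observed_at timestamp NOT NULL,
        weight_kg   numeric
    );

    -- The syntactic form, by contrast, buys an implicit unique index:
    -- CREATE TABLE parts (part_no integer PRIMARY KEY, ...);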

'Course, maybe I missed the point! -smile-

'Nother thought: CYA _can_ have odious performance costs if
over-implemented. It's a matter of using actual use cases - or observed
behavior - to tailor the CYA solution to fit the need without undue
overhead.

Rgds,
Richard


-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [HACKERS] Proposal: Commit timestamp

2007-02-09 Thread Richard Troy

On Fri, 9 Feb 2007, Jan Wieck wrote:
>
> No matter how many different models you have in parallel, one single
> transaction will be either a master, a slave or an isolated local thing.
> The proposed changes allow to tell the session which of these three
> roles it is playing and the triggers and rules can be configured to fire
> during master/local role, slave role, always or never. That
> functionality will work for master-slave as well as multi-master.
>
> Although my current plan isn't creating such a blended system, the
> proposed trigger and rule changes are designed to support exactly that
> in a 100% backward compatible way.
>
> Jan

Fantastic! ...At some point you'll be thinking of the management end -
turning it on or off, etc. That might be where these other points come
more into play.

Richard

-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [HACKERS] Proposal: Commit timestamp

2007-02-09 Thread Richard Troy

On Fri, 9 Feb 2007, Andrew Dunstan wrote:
> Richard Troy wrote:
> > In more specific terms, and I'm just brainstorming in public here, perhaps
> > we can use the power of Schemas within a database to manage such
> > divisions; commands which pertain to replication can/would include a
> > schema specifier and elements within the schema can be replicated one way
> > or another, at the whim of the DBA / Architect. For backwards
> > compatibility, if a schema isn't specified, it indicates that the command
> > pertains to the entire database.
>
> I understand that you're just thinking aloud, but overloading namespaces
> in this way strikes me as awful. Applications and extensions, which are
> the things that have need of namespaces, should not have to care about
> replication. If we have to design them for replication we'll be on a
> fast track to nowhere IMNSHO.

Well, Andrew, replication _is_ an application. Or, you could think of
replication as an extension to an application. I was under the impression
that _users_ decide to put tables in schema spaces based upon _user_ need,
and that Postgres developers' use of them for other purposes was
encroaching on user choices, not the other way around. Either way,
claiming "need" like this strikes me as stuck-in-a-rut or dogmatic
thinking. Besides, don't we have schema nesting to help resolve any such
"care"? And what do you mean by "design them for replication"?

While I'm in no way stuck on blending replication strategies via schemas,
it does strike me as an appropriate concept and I'd prefer to have it
evaluated based on technical merit - possibly citing workarounds or
solutions to technical issues, which is what I gather has been the
tradition of this group: use case first, technical merit second... Other
alternatives, ISTM, will have virtually the same look/feel as a schema
from an external perspective, and the more I think of it the more I think
using schemas is a sound, clean approach. That it offends someone's sense
of aesthetics seems to me a poor rationale for not choosing it. Another
question might be: What's lacking in the implementation of schemas that
makes this a poor choice, and what could be done about it without much
effort?

Regards,
Richard

-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [HACKERS] Proposal: Commit timestamp

2007-02-09 Thread Richard Troy

On Fri, 9 Feb 2007, Jan Wieck wrote:
> > [ I wrote ]
> > It'd be great if Jan considers the blending of replication;
>
> Please elaborate. I would really like to get all you can contribute.

Thanks Jan,

Prefaced with the caveat that I really haven't read everything you've
written on this (or what other people are doing, either), and that I've
got a terrible flu right now (fever, etc), I'll give it a go - hopefully
it's actually helpful. To wit:

In general terms, "blending of replication [techniques]" means to me that
one can have a single database instance serve as a master and as a slave
(to use only one set of terminology), and as a multi-master, too, all
simultaneously, letting the DBA / Architect choose which portions serve
which roles (purposes). All replication features would respect the
boundaries of such choices automatically, as it's all blended.

In more specific terms, and I'm just brainstorming in public here, perhaps
we can use the power of Schemas within a database to manage such
divisions; commands which pertain to replication can/would include a
schema specifier and elements within the schema can be replicated one way
or another, at the whim of the DBA / Architect. For backwards
compatibility, if a schema isn't specified, it indicates that the command
pertains to the entire database.

At the very least, a schema division strategy for replication leverages
an existing DB-component binding/dividing mechanism that most everyone is
familiar with. While there are/may be database-wide, nay, installation-
wide constructs as in your Commit Timestamp proposal, I don't see that
there's any conflict - at least, from what I understand of existing
systems and proposals to date.
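
To make the brainstorm a little more tangible, a purely hypothetical
sketch - none of the replication syntax below exists, and the schema names
are invented:

    -- Real DDL: schemas dividing locally-mastered and subscribed data.
    CREATE SCHEMA factory;   -- this site is the master for these tables
    CREATE SCHEMA hq;        -- these tables are subscribed from headquarters

    -- Imagined commands, kept as comments since they are pure brainstorm:
    -- a schema specifier scopes the replication role, and omitting it
    -- applies the command database-wide.
    --   ALTER REPLICATION SET ROLE master FOR SCHEMA factory;
    --   ALTER REPLICATION SET ROLE slave  FOR SCHEMA hq;
    --   ALTER REPLICATION SET ROLE slave;    -- whole database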

HTH,
Richard

-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [HACKERS] Proposal: Commit timestamp

2007-02-08 Thread Richard Troy

On Thu, 8 Feb 2007, Joshua D. Drake wrote:
>
> Well how deep are we talking here? My understanding of what Jan wants to
> do is simple.
>
> Be able to declare which triggers are fired depending on the state of
> the cluster.
>
> In Jan's terms, the Origin or Subscriber. In Replicator terms the Master
> or Slave.
>
> This is useful because I may have a trigger on the Master and the same
> trigger on the Slave. You do not want the trigger to fire on the Slave
> because we are doing data replication. In short, the we replicate the
> result, not the action.
>
> However, you may want triggers that are on the Slave to fire separately.
> A reporting server that generates materialized views is a good example.
> Don't tie up the Master with what a Slave can do.
>

It'd be great if Jan considers the blending of replication; any given DB
instance shouldn't be only a master/originator or only a slave/subscriber.
A solution that lets you blend replication strategies in a single db is,
from my point of view, very important.
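
To sketch the shape of that in SQL (table and trigger names invented; the
spellings below are one concrete way such role-dependent control could
look):

    -- Triggers marked by the role in which they should fire:
    ALTER TABLE orders ENABLE TRIGGER audit_trg;            -- origin/local roles
    ALTER TABLE orders ENABLE REPLICA TRIGGER matview_trg;  -- slave/subscriber only
    ALTER TABLE orders ENABLE ALWAYS TRIGGER check_trg;     -- fires in any role

    -- A replication daemon applying already-replicated rows would declare
    -- its role for the session:
    SET session_replication_role = replica;    -- origin | replica | local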

> > I have no clue what got you into what you are doing here.

Jan, some sleep now and then might be helpful to your public disposition.
-smile-

peace,
Richard

-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [HACKERS] Proposal: Commit timestamp

2007-02-07 Thread Richard Troy

> Jan Wieck wrote:
> > Are we still discussing if the Postgres backend may provide support for
> > a commit timestamp, that follows the rules for Lamport timestamps in a
> > multi-node cluster?

...I thought you said in this thread that you haven't and weren't going to
work on any kind of logical proof of its correctness, saw no value in
prototyping your way to a clear (convincing) argument, and were
withdrawing the proposal due to all the issues others raised which were,
in light of this, unanswerable beyond conjecture. I thought that the
thread was continuing because other people saw value in the kernel of the
idea, would support it if it could be shown to be correct/useful, were
disappointed you'd leave it at that, and wanted to continue to see if
something positive might come of the dialogue. So, the thread weaved
around a bit. I think that if you want to nail this down, people here are
willing to be convinced, but that hasn't happened yet.
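
For reference, the Lamport rule at the heart of the proposal is tiny; a
sketch in SQL, assuming a hypothetical single-row clock table: a local
commit bumps the counter, and an incoming remote timestamp pushes the
counter past whatever the sender had seen, preserving happened-before
ordering across nodes.

    CREATE TABLE node_clock (ts bigint NOT NULL);   -- hypothetical; one row
    INSERT INTO node_clock VALUES (0);

    -- next_commit_ts(0) for a local commit; next_commit_ts(remote_ts)
    -- when applying a transaction received from another node.
    CREATE FUNCTION next_commit_ts(remote_ts bigint) RETURNS bigint
    LANGUAGE sql AS $$
        UPDATE node_clock SET ts = greatest(ts, remote_ts) + 1 RETURNING ts;
    $$;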

On Wed, 7 Feb 2007, Markus Schiltknecht wrote:
> I'm only trying to get a discussion going, because a) I'm interested in
> how you plan to solve these problems and b) in the past, most people
> were complaining that all the different replication efforts didn't try
> to work together. I'm slowly trying to open up and discuss what I'm
> doing with Postgres-R on the lists.
>
> Just yesterday at the SFPUG meeting, I've experienced how confusing it
> is for the users to have such a broad variety of (existing and upcoming)
> replication solutions. And I'm all for working together and probably
> even for merging different replication solutions.

In support of that idea, I offer this: When Randy Eash wrote the world's
first replication system for Ingres circa 1990, his work included ideas
and features that are right now in the Postgres world fragmented among
several existing replication / replication-related products, along with
some things that are only now in discussion in this group. As discussed at
the SFPUG meeting last night, real-world use cases are seldom if ever
completely satisfied with a one-size-fits-all replication strategy. For
example, a manufacturing company might want all factories to be capable of
operating autonomously while still both reporting activities to and taking
direction from corporate headquarters. To do this without having multiple
databases at each site, a single database instance would likely be both a
master and a slave, but for differing aspects of the business's needs.
Business rules would resolve the conflicts - say, the manufacturing node
always wins when it comes to data that pertains to its own work - rather
than something like last-timestamp-wins serialization.
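
A hedged sketch of that resolution order, every name invented: the
ownership rule is consulted first, and the commit timestamp only breaks
ties between non-owners.

    CREATE FUNCTION pick_winner(owner_node int,
                                local_node int,  local_ts bigint,
                                remote_node int, remote_ts bigint)
    RETURNS text LANGUAGE sql AS $$
        SELECT CASE
            WHEN owner_node = local_node  THEN 'local'   -- business rule wins
            WHEN owner_node = remote_node THEN 'remote'
            WHEN remote_ts  > local_ts    THEN 'remote'  -- timestamp tiebreak
            ELSE 'local'
        END;
    $$;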

Like Markus, I would like to see the various replication efforts merged as
best they can be, because even if the majority of users don't use a little
bit of everything, surely the more interesting cases would like to, and
the entire community is better served if the various "solutions" are in
harmony.

Richard


-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [HACKERS] The may/can/might business

2007-02-01 Thread Richard Troy

On Thu, 1 Feb 2007, Bruce Momjian wrote:
> Tom Lane wrote:
> > 3606c3606
> > <errmsg("aggregate function calls cannot be nested")));
> > ---
> > >errmsg("aggregate function calls may not be nested")));
> >
> > I don't think that this is an improvement, or even correct English.
> >
> > You have changed a message that states that an action is logically
> > impossible into one that implies we are arbitrarily refusing to let
> > the user do something that *could* be done, if only we'd let him.
> >
> > There is relevant material in the message style guidelines, section
> > 45.3.8: it says that "cannot open file "%s" ... indicates that the
> > functionality of opening the named file does not exist at all in the
> > program, or that it's conceptually impossible."
>
> Uh, I think you might be reading the diff backwards.  The current CVS
> wording is "cannot".

No, Bruce, he got it exactly right: "cannot" indicates, as Tom put it,
"logical impossibility," whereas "may not" suggests that something could
happen but it's being prevented. His parsing of the English was spot-on.

RT


-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [HACKERS] Updateable cursors

2007-01-29 Thread Richard Troy

On Wed, 24 Jan 2007, John Bartlett wrote:
[regarding optional DBA/SysAdmin logging of Updateable Cursors]
>
> I can see where you are coming from but I am not sure if a new log entry
> would be such a good idea. The result of creating such a low level log could
> be to increase the amount of logging by a rather large amount.
>

Given that logging can be controlled via the contents of postgresql.conf,
this sounds like an answer from someone who's never had to support a
production environment. Putting a check for log_min_error_statement being
set to, say, info, hardly seems like a big burden to me. A casual study of
postgresql.conf reveals we already have many controls for getting things
logged when we want/need them - all of which were deemed appropriate
previously. So ISTM that if the DBA/SysAdmin thinks they need the
information, who are you to tell them, in effect, "No, I don't want you to
have to spend any of your machine's performance giving you the information
you need"?
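
To ground that, a few of the existing knobs of that kind (a sketch; the
values are illustrative):

    # Existing postgresql.conf controls - the DBA pays the logging cost
    # only when a setting is turned up:
    log_min_error_statement    = info   # log statements at/above this severity
    log_min_duration_statement = 250    # log statements slower than 250 ms
    log_connections            = on     # one line per connection attempt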

Help your user by giving them information when they want it. ... Do you
argue that this is useless information?

Richard

-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [HACKERS] DROP FUNCTION failure: cache lookup failed for relation

2007-01-28 Thread Richard Troy


> It seems a general solution would involve having dependency.c take
> exclusive locks on all types of objects (not only tables) as it scans
> them and decides they need to be deleted later.  And when adding a
> pg_depend entry, we'd need to take a shared lock and then recheck to
> make sure the object still exists.  This would be localized in
> dependency.c, but it still seems like quite a lot of mechanism and
> cycles added to every DDL operation.  And I'm not at all sure that
> we'd not be opening ourselves up to deadlock problems.
>
> I'm a bit tempted to fix only the table case and leave the handling of
> non-table objects as is.  Comments?
>
>   regards, tom lane

The taking of DDL locks is very unlikely to create a performance problem
for anyone as DML statements typically far outnumber DDL statements.
Further, in my experience, DDL statements are very carefully thought
through and are usually either completely automated by well crafted
programs or are performed by one person at a time - the DBA. I therefore
conclude that any deadlock risk is triflingly small and would be a
self-inflicted circumstance.

Richard

-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [HACKERS] Proposal: Commit timestamp

2007-01-25 Thread Richard Troy

On Thu, 25 Jan 2007, Jan Wieck wrote:
>
> For a future multimaster replication system, I will need a couple of
> features in the PostgreSQL server itself. I will submit separate
> proposals per feature so that discussions can be kept focused on one
> feature per thread.

Hmm... "will need" ... Have you prototyped this system yet? ISTM you can
prototype your proposal using "external" components so you can work out
the kinks first.

Richard


-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [HACKERS] Proposal: allow installation of any contrib module

2007-01-25 Thread Richard Troy

FWIW:

> * Better packaging support, eg make it easier to add/remove an extension
> module and control how pg_dump deals with it.  We talked about that
> awhile back but nobody did anything with the ideas.

+1

> * Better documentation for the contrib modules; some of them are
> reasonably well doc'd now, but many are not, and in almost all cases
> it's only plain text not SGML.

+1

> * Better advertising, for instance make the contrib documentation
> available on the website (which probably requires SGML conversion
> to happen first...)

+1


RT

-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [HACKERS] Updateable cursors

2007-01-23 Thread Richard Troy

On Wed, 24 Jan 2007, FAST PostgreSQL wrote:
>
> We are trying to develop the updateable cursors functionality into
> Postgresql. I have given below details of the design and also issues we are
> facing.  Looking forward to the advice on how to proceed with these issues.
>
> Rgds,
> Arul Shaji
>

Hi Arul,

...I can see people are picking apart the implementation details so you're
getting good feedback on your ambitious proposal. Looks like you've put a
lot of thought/work into it.

I've never been a fan of cursors because they encourage bad behavior;
"Think time" in a transaction sometimes becomes "lunch time" for users and
in any event long lock duration is something to be avoided for the sake of
concurrency and sometimes performance (vacuum, etc). My philosophy is "get
in and get out quick."
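
To illustrate the worry, a sketch of the kind of usage the proposal would
enable (WHERE CURRENT OF being the updateable-cursor form under
discussion; all names invented): the FOR UPDATE row locks live as long as
the enclosing transaction does - right through "think time."

    BEGIN;
    DECLARE c CURSOR FOR
        SELECT id, qty FROM line_item WHERE order_id = 42 FOR UPDATE;
    FETCH NEXT FROM c;        -- locks the fetched row
    -- ... user "think time" (or lunch) happens here ...
    UPDATE line_item SET qty = qty + 1 WHERE CURRENT OF c;
    CLOSE c;
    COMMIT;                   -- locks released only now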

Ten years ago in May, our first customer insisted we implement what has
become our primary API library in Java, and somewhat later I was shocked
to learn that for whatever reason Java ResultSets are supposed to be
implemented as _updateable_cursors._ This created serious security issues
for handing off results to other programs through the library - ones that
don't even have the ability to connect to the target database. Having
confirmed the behavior in Informix, we went through some hoops to remove
the need to pass ResultSets around. (If I had only known Postgres didn't
implement the RS as an updateable cursor, I'd have pushed for our primary
platform to be Postgres!)

What impresses me is that Postgres has survived so well without updateable
cursors. To my mind it illustrates that they aren't widely used. I'm
wondering what troubles lurk ahead once they're available. As a
DBA/SysAdmin, I'd be quite happy if there existed some kind of log
element indicating updateable cursors were in use that I could search
for easily whenever trying to diagnose some performance or deadlocking
problem, etc - say, log file entries that indicated the opening and later
closing of such a cursor, with an id of some kind that allowed matching up
open/close pairs. I also think that the documentation should be
updated to not only indicate usage of this new feature, but provide
cautionary warnings about the potential locking issues and, for the
authors of libraries, Java in particular, the possible security issues.

Regards,
Richard

-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [HACKERS] Function execution costs 'n all that

2007-01-15 Thread Richard Troy

On Mon, 15 Jan 2007, Neil Conway wrote:
> On Mon, 2007-01-15 at 10:51 -0800, Richard Troy wrote:
> > I therefore propose that the engine evaluate -
> > benchmark, if you will - all functions as they are ingested, or
> > vacuum-like at some later date (when valid data for testing may exist),
> > and assign a cost relative to what it already knows - the built-ins, for
> > example.
>
> That seems pretty unworkable. It is unsafe, for one: evaluating a
> function may have side effects (inside or outside the database), so the
> DBMS cannot just invoke user-defined functions at whim. Also, the
> relationship between a function's arguments and its performance will
> often be highly complex -- it would be very difficult, not to mention
> computationally infeasible, to reconstruct that relationship
> automatically, especially without any real knowledge about the
> function's behavior.
>
> -Neil

Hi Neil,

Tom had already proposed:
>
> I'm envisioning that the CREATE FUNCTION syntax would add optional
> clauses
>
>COST function-name-or-numeric-constant
>ROWS function-name-or-numeric-constant
>
> that would be used to fill these columns.

I was considering these ideas in the mix; let the user provide either a
numeric constant or a function, the distinction here being that instead of
running that function at planning time, it could be run "off-line", so to
speak.
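
Under the proposed clause, the number produced off-line would simply be
recorded with the function; a sketch (function name and cost figure
invented):

    -- COST is in planner units relative to a simple built-in C function;
    -- ROWS (for set-returning functions) would be filled the same way.
    CREATE FUNCTION expensive_match(text, text) RETURNS boolean
        AS $$ SELECT $1 ~ $2 $$
        LANGUAGE sql
        COST 500;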

Richard

-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [HACKERS] Function execution costs 'n all that

2007-01-15 Thread Richard Troy
On Mon, 15 Jan 2007, Tom Lane wrote:

> So I've been working on the scheme I suggested a few days ago of
> representing "equivalence classes" of variables explicitly, and avoiding
> the current ad-hocery of generating and then removing redundant clauses
> in favor of generating only the ones we want in the first place.  Any
> clause that looks like an equijoin gets sent to the EquivalenceClass
> machinery by distribute_qual_to_rels, and not put into the
> restrictlist/joinlist data structure at all.  Then we make passes over
> the EquivalenceClass lists at appropriate times to generate the clauses
> we want.  This is turning over well enough now to pass the regression
> tests,

That was quick...

> In short, this approach results in a whole lot less stability in the
> order in which WHERE clauses are evaluated.  That might be a killer
> objection to the whole thing, but on the other hand we've never made
> any strong promises about WHERE evaluation order.

Showing my ignorance here, but I've never been a fan of "syntax based
optimization," though it is better than no optimization. If people are
counting on order for optimization, then, hmmm... If you can provide a way
to at least _try_ to do better, then don't worry about it. It will improve
with time.

> Instead, I'm thinking it might be time to re-introduce some notion of
> function execution cost into the system, and make use of that info to
> sort WHERE clauses into a reasonable execution order.

Ingres did/does it that way, IIRC. It's a solid strategy.

>  This example
> would be fixed with even a very stupid rule-of-thumb about SQL functions
> being more expensive than C functions, but if we're going to go to the
> trouble it seems like it'd be a good idea to provide a way to label
> user-defined functions with execution costs.
>
> Would a simple constant value be workable, or do we need some more
> complex model (and if so what)?

Ingres would, if I'm not mistaken, improve over time through histograms
built from historical use. Short of that, you've got classes of functions
- aggregations, for example - and there's sure to be missing information
for making a great decision at planning time. However, I take it that the
cost here is primarily CPU and not I/O. I therefore propose that the
engine evaluate - benchmark, if you will - all functions as they are
ingested, or vacuum-like at some later date (when valid data for testing
may exist), and assign a cost relative to what it already knows - the
built-ins, for example. Doing so could allow this strategy to be
functional in short order and be improved with time, so all the work
doesn't have to be implemented on day 1. And DBA/sys-admin tweaking can
always be done by updating the catalogues.
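
Until the engine does that automatically, a crude off-line version of the
same benchmark can be run by hand from psql (table and function names
hypothetical): time the candidate against a cheap built-in over the same
rows and record the ratio.

    \timing on
    SELECT count(*) FROM parts WHERE my_match(name, 'foo.*');  -- candidate
    SELECT count(*) FROM parts WHERE length(name) > 3;         -- baseline
    -- the runtime ratio suggests a relative cost to assign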

HTH,
Richard

-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [HACKERS] [GENERAL] Checkpoint request failed on version 8.2.1.

2007-01-11 Thread Richard Troy

On Thu, 11 Jan 2007, Tom Lane wrote:

...snip...
>
> (You know, of course, that my opinion is that no sane person would run a
> production database on Windows in the first place.  So the data-loss
> risk to me seems less of a problem than the unexpected-failures problem.
> It's not like there aren't a ton of other data-loss scenarios in that OS
> that we can't do anything about...)
>
>   regards, tom lane
>

PLEASE OH PLEASE document every f-ing one of them! (And I don't mean
document Windows issues as comments in the source code. Best would be in
the official documentation/on a web page.) On occasion, I could *really*
use such a list! (If such already exists, please point me at it!)

Thing is, Tom, not everybody has the same level of information you have on
the subject...

Regards,
Richard

-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [HACKERS] ideas for auto-processing patches

2007-01-10 Thread Richard Troy

On Wed, 10 Jan 2007, Jim C. Nasby wrote:
>
> On Thu, Jan 11, 2007 at 08:04:41AM +0900, Michael Glaesemann wrote:
> > >Wouldn't there be some value to knowing whether the patch failed
> > >due to
> > >bitrot vs it just didn't work on some platforms out of the gate?
> >
> > I'm having a hard time figuring out what that value would be. How
> > would that knowledge affect what's needed to fix the patch?
>
> I was thinking that knowing it did work at one time would be useful, but
> maybe that's not the case...
>

"Has it ever worked" is the singularly most fundamental technical support
question; yes, it has value.

One question here - rhetorical, perhaps - is; What changed and when? Often
when things changed can help get you to what changed. (This is what logs
are for, and not just automated computer logs, but system management
things like, "I upgraded GCC today.") And that can help you focus in on
what to do to fix the problem. (such as looking to the GCC release notes)

A non-rhetorical question is: Shouldn't the build process mechanism/system
know when _any_ aspect of a build has failed (including patches)? I'd
think so, especially in a build-farm scenario.

...Just my two cents - and worth every penny! -smile-

Richard

-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [HACKERS] "recovering prepared transaction" after serverrestart

2006-11-03 Thread Richard Troy

On Fri, 3 Nov 2006, Tom Lane wrote:
>
> > Is there a way to see prepared transactions where the original session
> > that prepared then has died? Perhaps the message at startup should be
> > "you have at least one prepared transaction that needs resolution".
>
> I am completely baffled by this focus on database startup time.  That's
> not where the problem is.
>
>   regards, tom lane
>

I'm not alluding to anyone in particular, just responding to the focus on
startup time. When I joined Ingres as a Consultant (back when that was a
revered job), we saw this a lot, too, bubbling through the ranks from
technical support. Engineering was having a cow over it. We Consultants
were expected to backline such problems and be the interface between
engineering and the rest of the world. What we found was that in what we'd
call the legitimate cases, the cause for concern over startup time had to
do with bugs that forced, one way or another, a server restart.

Illegitimate cases - the VAST majority - were the result of, well, let's
call them less-than-successful DBAs, thrashing their installations with
their management breathing down their necks, often with flailing arms and
fire coming out of their mouths, saying things like, "I bet my business on
this!"... The usual causes there were inappropriate configurations, and a
critical cause of _that_ was an installation toolset that didn't help
people size/position things properly. Often a sales guy or trainee would
configure a test system and then the customer would put that into
production without ever reexamining the settings.

I realized there was an opportunity here; I put together a training
program and we sold it as a service, along with installation, to new
customers to help them get off on the right foot. Once we did that, new
customers were essentially put on notice that they could either pay us to
help set them up or do it themselves, but that continuing along with what
the salesman or junior techie had done wasn't sufficient for a production
environment that you could bet your business on. ...The complaint and
concern about startup time dropped out of sight nearly immediately...

Opportunity here, for PostgreSQL: a technical document of some kind
entitled something like "How to move your testing environment into
production."

No, unfortunately, I can't volunteer to be the point person on this one.
And to the underlying question: is this the case with PostgreSQL? I can't
say...

Regards,
Richard


-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [HACKERS] Design Considerations for New Authentication Methods

2006-11-02 Thread Richard Troy


> Username/password is not acceptable in a number of situations.  This is
> not intended to replace them.  This would be in *addition* to supporting
> the current auth methods.  I don't understand at all how you feel it'd
> be nice to have yet shouldn't be done.
>
>Thanks,
>
>Stephen

...I thought you said this _needs_ to be done - by using words like
"unacceptable" and "required" - and I disagree. There's a difference
between what needs to be done and what is desired to be done. Further, I
never said "shouldn't."

Richard

-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [HACKERS] Design Considerations for New Authentication Methods

2006-11-02 Thread Richard Troy
On Thu, 2 Nov 2006, Magnus Hagander wrote:
> >
> > I expect we'll need a mapping of some sort, or perhaps a
> > sasl_regexp or similar to what is done in OpenLDAP.  I don't
> > recall PG supporting using the DN from a client cert in an
> > SSL connection as a PG username but perhaps I missed it somewhere...
>
> You can't today.
> If we want to add username mapping in SASL or whatever, it might be a
> good idea to look at generalizing the authuser-to-dbuser mapping stuff
> (like we have for identmap now) into something that can be used for all
> external auth methods. Instead of inventing one for every method.
>
> //Magnus


Well, there's simply no need. While I can agree that more could be done,
I'm not convinced there's a need, because what we have now works fine. Let
me support my view by stating first that I consider combining the
encryption of a communications channel with user authentication to be a
very poor choice. I gather from the paragraph above that this is a
foregone conclusion. Apologies if I'm mistaken.

Just so my point - that another strategy is not needed - is understood,
let's agree that SSL is just preventing sniffers from capturing whatever
else goes on in "our conversation."  Great. What's inside that
communication? Why, there's a perfectly workable username/password
authentication that happens! Sure, someone could steal that data somehow
and break in, but that requires one of the two systems to be breached, and
that's a security problem that's out of scope for Postgres.
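
In pg_hba.conf terms, the combination just described is a one-liner (a
sketch):

    # SSL encrypts the channel; authentication inside it is still an
    # ordinary username/password (md5) exchange.
    # TYPE     DATABASE   USER   CIDR-ADDRESS   METHOD
    hostssl    all        all    0.0.0.0/0      md5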

Would signed certificates be preferred? Well, sure, they're nice. I don't
object, and in fact welcome some improvements here. For example, I'd love
the choice of taking an individual user's certificate and authenticating
completely based upon that. However, while this _seems_ to simplify
things, it really just trades off with the added cost of managing those
certs - username/password is slam-dunk simple and has the advantage that
users can share one authentication.

Unless I've really overlooked something basic, there's nothing lacking in
the existing scheme...

Richard

-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [HACKERS] bug in on_error_rollback !?

2006-10-28 Thread Richard Troy

Gurjeet,

I see that the question of case sensitivity in psql is still being
discussed. "I don't have a dog in that fight," but thought I might make a
suggestion. To wit:

I propose you adopt the standard that I personally adopted eons ago -
literally perhaps 20 years ago - and that has by now saved me many days of
time, I'm sure: ALWAYS presume case sensitivity and code _exactly_ that
way every time. (And develop your own capitalization standard, too, so
you'll always know which way it goes.) You'll never be disappointed that
way and you won't create hidden bugs. If you want to keep arguing that
Postgres should change to meet your expectations, fine, and if it changes,
great for you, but you'll just have the same problem someday with some
other package - better you change your habits instead!

Richard

On Sat, 28 Oct 2006, Gurjeet Singh wrote:

> Date: Sat, 28 Oct 2006 20:01:00 +0530
> From: Gurjeet Singh <[EMAIL PROTECTED]>
> To: Peter Eisentraut <[EMAIL PROTECTED]>
> Cc: pgsql-hackers@postgresql.org, Andrew Dunstan <[EMAIL PROTECTED]>,
>  Bernd Helmle <[EMAIL PROTECTED]>
> Subject: Re: [HACKERS] bug in on_error_rollback !?
>
> On 10/27/06, Peter Eisentraut <[EMAIL PROTECTED]> wrote:
> >
> > In psql, the psql
> > parts follow the syntax rules of psql, the SQL parts follow the syntax
> > rules of SQL.  The syntax rules of psql in turn are inspired by Unix
> > shells, sort of because psql is used that way.  (Surely one wouldn't
> > want the argument to \i be case-insensitive?)
>
>
> A very good reasoning... I completely agree...
>
> But you'd also agree that since the psql variables can (and most often they
> are) used in SQL statements, we should consider making at least \set case
> insensitive!
>
> postgres=# \set x 1
> postgres=# select :x;
>  ?column?
> --
> 1
> (1 row)
>
> postgres=# select :X;
> ERROR:  syntax error at or near ":"
> LINE 1: select :X;
>^
> postgres=#
>
> 
> what harm allowing "\set on_error_rollback" would be: it certainly
> won't break any existing scripts.
> ...
> I wrote this feature (but someone else
> chose the name!) and I still occasionally write it lowercase and wonder
> why it isn't working. :)
> 
>
> I agree, we can't make every '\' command case-insensitive, but a few,
> where it makes absolute sense, should be subject to reconsideration. We have
> the choice of making it more user-friendly, and less confusing.
>
>
>
>

-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [HACKERS] Replication documentation addition

2006-10-27 Thread Richard Troy

On Wed, 25 Oct 2006, Bruce Momjian wrote:

   ...snip...
>
> > Data partitioning is often done within a single database on a single
> > server and therefore, as a concept, has nothing whatsoever to do with
> > different servers. Similarly, the second paragraph of this section is
>
> Uh, why would someone split things up like that on a single server?
>
> > problematic. Please define your term first, then talk about some
> > implementations - this is muddying the water. Further, there are both
> > vertical and horizontal partitioning - you mention neither - and each has
> > its own distinct uses. If partitioning is mentioned, it should be more
> > complete.
>
> Uh, what exactly needs to be defined.

OK, "Data partitioning"; data partitioning begins in the RDB world with
the very notion of tables, and we partition our data during schema
development with the goal of "normalizing" the design - "thrid normal
form" being the one most Professors talk about as a target. "Data
partitioning", then, is the intentional denormalization of the design to
accomplish some goal(s) - not all of which are listed in this document's
title. In this context, data partitioning takes two forms based upon which
axis of a two-dimensional table is to be divided, with the vertical
partition dividing attributes (as in a master/detail relationship with
one-to-one mapping), and the horizontal partition dividing based on one or
more attributes domain, or value (as in your example of London records
being kept in a database in London, while Paris records are kept in
Paris).
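
A minimal sketch of the two axes, all names invented:

    -- Vertical: attributes split into a narrow "hot" table plus a
    -- one-to-one detail table.
    CREATE TABLE item        (item_id bigint PRIMARY KEY, status int);
    CREATE TABLE item_detail (item_id bigint PRIMARY KEY REFERENCES item,
                              description text, notes text);

    -- Horizontal: identical shape, rows divided by an attribute's value.
    CREATE TABLE orders_london (order_id bigint PRIMARY KEY, placed_on date);
    CREATE TABLE orders_paris  (order_id bigint PRIMARY KEY, placed_on date);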

The point I was making was that that section of the document was in error
because it presumed there was only one form of data partitioning and that
it was horizontal. (The document is now missing, so I can't look at the
current content - it was here:
ftp://momjian.us/pub/postgresql/mypatches/replication.)

In answer to your query about why someone would use such partitioning, the
nearly universal answer is performance, and the distant second answer is
security. In one example that comes immediately to mind, there is a table
which is a central core of an application and, as such, there's a lot to
say about the items in this table. The table's size is in the tens to
hundreds of millions of rows, and it needs to be joined with something
else in a huge fraction of queries. For performance reasons, the table's
width was therefore kept as tiny as possible, and detail table(s) are used
for the remaining attributes that logically belong in the table - it's a
vertical partition. It's an exceptionally common technique - so common, it
probably didn't occur to you that you were even talking about it when you
spoke of "data partitioning."

> > Next, Query Broadcast Load Balancing... also needs a lot of work. First,
> > it's foremost in my memory that sending read queries everywhere and
> > returning the first result set back is a key way to improve application
> > performance at the cost of additional load on other systems - I guess
> > that's not at all what the document is after here, but it's a worthy part
> > of a dialogue on broadcasting queries. In other words, this has more parts
> > to it than just what the document now entertains. Secondly, the document
>
> Uh, do we want to go into that here?  I guess I could.
>
> > doesn't address _at_all_ whether this is a two-phase-commit environment
> > or not. If not, how are updates managed? If each server operates
> > independently and one of them fails, what do you do then? How do you know
> > _any_ server got an insert/update? ...  Each server _can't_ operate
> > independently unless the application does its own insert/update commits to
> > every one of them - and that can't be fast, nor does it load balance,
> > though it may contribute to superior uptime performance by the
> > application.
>
> I think having the application middle layer do the commits is how it
> works now.  Can someone explain how pgpool works, or should we mention
> how two-phase commit has to be done here?  pgpool2 has additional
> features.

Well, you hadn't mentioned two-phase commit at all, and it surely belongs
somewhere in this document - it's a core PG feature and enables a lot of
the alternative solutions the document discusses.
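
For reference, the feature itself (available since PostgreSQL 8.1;
requires max_prepared_transactions > 0, and the table name here is
invented):

    BEGIN;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    PREPARE TRANSACTION 'txn_42';   -- phase one: durable, survives a crash
    -- ...an external coordinator collects votes from all servers...
    COMMIT PREPARED 'txn_42';       -- phase two (or ROLLBACK PREPARED)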

What it needs to say but doesn't (didn't?) is that the load from read
queries can be distributed for load-balancing purposes but that there's no
benefit possible for writes, and that replication overhead costs could
possibly overwhelm the benefits in high-update scenarios. The point that
each server operates independently is only true if you ignore the
necessary replication - which, to my mind, links the systems.

Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-26 Thread Richard Troy

> The documentation comes with the open source tarball.

Yuck.

>
> I would welcome if the docs point to an unofficial wiki (maintained
> externally from authoritative PostgreSQL developers) or a website
> listing them and giving a brief of each solution.
>
> postgresql.org already does this for events (commercial training!) and
> news. Point to postgresql.org/download/commercial as there *already* are
> brief descriptions, pricing and website links.

I wouldn't have looked in "download" for such a thing. Nor would I expect
everyone with a Postgres-related solution to want to post it on
PostgreSQL.org for download.

However, I agree that a simple web page listing such things is needed. It's
easy to manage - way easier to manage than the development of a competent
relational database engine! It's just a bunch of text, after all, and
errors aren't that critical and will tend to self-correct through user
attention.

> >
> > You list the ones that are stable in their existence (commercial or not).
> >
> And how would you determine it? Years of existence? Contribution to
> PostgreSQL's source code? It is not easy and wouldn't be fair. There are
> ones that certainly will be listed, and other doubtful ones (which would
> perhaps complain, that's why I said 'all' - if they are not stable,
> either they stay out of the market or fix their problems).

You have to just trust people. If it's clear that "this isn't
PostgreSql.org", stuff can be unstable, etc - it isn't the group's
problem.

> > No it doesn't. Because there is always the "It wants to be free!" crowd.
> >
> Yes, I agree there are. But also development in *that* cutting-edge is
> scarce. It feels that something had filled the gap if you list some
> commercial solution, mainly people in the trenches (DBAs). They would,
> obviously, firstly seek the commercial solutions as they are interested.
> So they click 'commercial products' in the main website.

Not necessarily. Most times, I'll seek the better solution, which may or
may not be commercial. Sometimes I'll avoid a commercial version because I
don't like the company!

... But getting genuine donations of time - without direct $$
self-interest attached - is a whole 'nother kettle o' fish. For example,
there are a lot of students out there that are excellent and would love to
have a mechanism to gain something for their resumes before entering the
business world. ...There might be some residual interest at UCB, for
example. Attracting this kind of support is a completely different
dialogue, but on _this_ topic, surely the "it wants to be free!" crowd
can't (or shouldn't, in my view) be used as an excuse for not publishing
pointers to commercial solutions that involve PostgreSQL. Do it already!

> >> If people (who read the documentation) professionally work with
> >> PostgreSQL, they may already have been briefed by those commercial
> >> offerings in some way.
> >>
> >
> > Maybe, maybe not.

The "may" is a wiggler; sounds like an excuse with a back door. The real
answer is "probably not!" I'm in that world. I haven't been briefed. Ever.

> And I agree with your point, still. However, that would open a precedent
> for people to have to maintain lists of stable software in every
> documentation area.

All that's needed is ONE list, with clear disclaimer. It'll be all text
and links, and maybe the odd small .gif logo, if permitted, so it won't be
a huge thing. Come on now, are there thousands of such products? Tens
sounds more plausible.

Regards,
Richard

-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [HACKERS] Replication documentation addition

2006-10-26 Thread Richard Troy

On Wed, 25 Oct 2006, Josh Berkus wrote:
>
> Bruce,
>
> > It isn't designed for that.  It is designed for people to understand
> > what they want, and then they can look around for solutions.  I think
> > most agree we don't want a list of solutions in the documentation,
> > though I have a few as examples.
>
> Do they?   I've seen no discussion of the matter.  I think we should have
> them.
>
>

I completely agree; if you want to attract competent people from the
business world, one thing you have to do is respect their time by helping
them find information, especially about things they don't know exist. All
that's needed are pointers, but the pointers need to be to solid
documents/resources, not just the top of a heap - if you'll forgive the
pun.

Richard



-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [HACKERS] Replication documentation addition

2006-10-25 Thread Richard Troy
...define the term (perhaps in a little more detail), and not mention
solutions - they change with time anyway.

While I've never used Oracle's clustering tools, I've read up on them and
have customers who use them, and I think this description of Oracle
clustering is a misread of what the Oracle system actually does. A check
with a true Oracle clustering expert is in order here.

Hope this helps. If asked, I'm willing to (re)write some of the bits
discussed above.

Regards,
Richard

-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [HACKERS] Replication documentation addition

2006-10-25 Thread Richard Troy
...By including more information, more users will be
attracted to PostgreSql, whether it be in the documentation or web site. I
have been SURE that certain things must exist in the PG world, but haven't
known about them with certainty due to time constraints, but would gladly
point our customers at Postgres solutions if only I knew about them. Count
this paragraph as praise for doing _something_more_ to help get more
information to (prospective) users.

Consider someone like me; my company supports five RDBMSes, one of them
being Postgres. We are probably not unique in that we've written an SQL
dialect translator so we could write our own code in one code line to run
anywhere, against any RDBMS (it can learn new dialects) - or perhaps
others keep multiple code lines containing variant dialects. Either way,
we "don't care" whether our customer has Oracle or PostgreSQL, so long as
they buy our stuff. But when our customers - or prospects - come to us
with a given scenario, the more we know about Postgres - and its community
- the more likely we can steer them to a PG solution, which we would
prefer anyway, for lots of reasons: historical, personal, and technical -
not to mention cost. The trouble is, Oracle, for example, has already told
(sold?) them on whatever, and we need a rebuttal ready at hand or they'll
go with Oracle. We just don't have the time to fight that battle, nor do
we wish to risk the sale when we can work with Oracle just fine.

In sum, I agree with Tom Lane and the others who chimed in with "keep the
docs clean, use the web site for mentioning other projects/products." And
again I applaud this new effort.

Regards,
Richard

-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/

