AW: AW: [HACKERS] Re: [SQL] behavior of ' = NULL' vs. MySQL vs. Standards

2001-06-18 Thread Zeugswetter Andreas SB


> -- If I have interpreted SQL92 correctly UNKNOWN IS TRUE should return
> FALSE, and UNKNOWN IS NOT TRUE is equivalent to NOT (UNKNOWN IS TRUE) ==>
> TRUE. Is this correct?

No, I do not think it is valid to say "should return true|false";
I think they should return UNKNOWN. Only when it comes to evaluating
"... WHERE UNKNOWN;" can you translate it to "... WHERE FALSE;" (or in the
output function).

My interpretation would be:
UNKNOWN IS TRUE --> FALSE
UNKNOWN IS NOT TRUE --> FALSE
NOT (UNKNOWN IS TRUE)   --> FALSE
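
A quick way to probe which interpretation a given server actually implements
(a sketch only; NULL::bool stands in for UNKNOWN here, and the results depend
on the engine and version):

    -- Each column shows how the server evaluates the corresponding expression.
    SELECT (NULL::bool) IS TRUE        AS unknown_is_true,
           (NULL::bool) IS NOT TRUE    AS unknown_is_not_true,
           NOT ((NULL::bool) IS TRUE)  AS not_unknown_is_true;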

Andreas




[HACKERS] Re: temp-table-related failure in regression tests

2001-06-18 Thread Tom Lane

I wrote:
> Is anyone else seeing this with current CVS, or is it my own breakage?

Ah, the problem is RelationGetRelationName didn't know about the
new temprel naming convention.

I quick-hacked rel.h to fix this, but we need a better solution.
I don't much like having rel.h include temprel.h --- seems like the
include should go the other way.  Should is_temp_relname get moved
to rel.h?

regards, tom lane




[HACKERS] temp-table-related failure in regression tests

2001-06-18 Thread Tom Lane

Is anyone else seeing this with current CVS, or is it my own breakage?

*** ./expected/alter_table.out  Wed May 30 12:38:38 2001
--- ./results/alter_table.out   Tue Jun 19 00:45:22 2001
***
*** 340,347 
  ALTER TABLE FKTABLE ADD FOREIGN KEY(ftest1) references pktable(ptest1);
  NOTICE:  ALTER TABLE ... ADD CONSTRAINT will create implicit trigger(s) for FOREIGN KEY check(s)
  DROP TABLE pktable;
! NOTICE:  DROP TABLE implicitly drops referential integrity trigger from table "fktable"
! NOTICE:  DROP TABLE implicitly drops referential integrity trigger from table "fktable"
  DROP TABLE fktable;
  CREATE TEMP TABLE PKTABLE (ptest1 int, ptest2 text,
 PRIMARY KEY(ptest1, ptest2));
--- 340,347 
  ALTER TABLE FKTABLE ADD FOREIGN KEY(ftest1) references pktable(ptest1);
  NOTICE:  ALTER TABLE ... ADD CONSTRAINT will create implicit trigger(s) for FOREIGN KEY check(s)
  DROP TABLE pktable;
! NOTICE:  DROP TABLE implicitly drops referential integrity trigger from table "pg_temp_15818_3"
! NOTICE:  DROP TABLE implicitly drops referential integrity trigger from table "pg_temp_15818_3"
  DROP TABLE fktable;
  CREATE TEMP TABLE PKTABLE (ptest1 int, ptest2 text,
 PRIMARY KEY(ptest1, ptest2));

==
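
For anyone wondering where the pg_temp names in the diff come from, a hedged
sketch of how to see the mangled names yourself (the exact naming scheme is
inferred from the output above):

    -- Temp tables are stored in pg_class under mangled physical names such as
    -- pg_temp_<backend>_<counter>; the NOTICE above leaks that internal name
    -- instead of the user-visible one.
    CREATE TEMP TABLE pktable (ptest1 int PRIMARY KEY);
    SELECT relname FROM pg_class WHERE relname LIKE 'pg_temp%';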


regards, tom lane




Re: [HACKERS] timestamp with/without time zone

2001-06-18 Thread Tom Lane

Bruce Momjian <[EMAIL PROTECTED]> writes:
> I am confused what you are suggesting here.

*** src/backend/utils/adt/format_type.c.orig    Wed May 23 18:10:19 2001
--- src/backend/utils/adt/format_type.c Mon Jun 18 21:41:53 2001
***
*** 178,184 
break;
  
case TIMESTAMPOID:
!   buf = pstrdup("timestamp with time zone");
break;
  
case VARBITOID:
--- 178,184 
break;
  
case TIMESTAMPOID:
!   buf = pstrdup("timestamp");
break;
  
case VARBITOID:


Clear enough?
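
For readers following along, a minimal way to see the effect of that one-liner
(ts_check is a hypothetical scratch table; psql's \d output goes through
format_type() here):

    -- Before the patch a plain "timestamp" column was displayed as
    -- "timestamp with time zone"; after it, simply as "timestamp".
    CREATE TABLE ts_check (ts timestamp);
    -- then, in psql:  \d ts_check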

regards, tom lane




Re: [HACKERS] timestamp with/without time zone

2001-06-18 Thread Tom Lane

Bruce Momjian <[EMAIL PROTECTED]> writes:
> Very few people know the standards stuff so it seems we should just call
> it timestamp and do the best we can.  Basically by mentioning "with
> timezone" we are making the standards people happy but confusing our
> users.

I don't believe we're making any standards-lovers happy either, because
the datatype in question *is* *not* SQL9x's TIMESTAMP WITH TIME ZONE.
Given that no one actually wants to change its behavior to conform to
either of the standard's datatypes, ISTM that calling it something
different from either of those two is the appropriate path.

At some point (if someone is foolish enough to want to implement the
spec's semantics) we might have three distinct datatypes called
timestamp, timestamp with time zone, and timestamp without time zone,
with the first of these (the existing type) being the recommended
choice.  What we have at the moment is that lacking implementations
for the last two, we map them into the first one.  That doesn't seem
unreasonable to me.  But to have a clean upgrade path from one to three
types, we need to be sure we call the existing type what it is, and not
mislabel it as one of the spec-compliant types.
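
To make the upgrade path concrete, a small sketch (whether the WITHOUT TIME
ZONE spelling is accepted, and what each spelling maps to, depends on the
server version):

    -- Three spellings; at the time of this discussion they all land on the
    -- same underlying type, but under the path described above they would
    -- eventually become three distinct types.
    CREATE TABLE ts_variants (
        a timestamp,
        b timestamp with time zone,
        c timestamp without time zone
    );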

regards, tom lane




Re: [HACKERS] Re: Various silliness in heap_getnext and related routines

2001-06-18 Thread Ross J. Reedstrom

On Mon, Jun 11, 2001 at 12:21:56PM -0400, Robert E. Bruccoleri wrote:
> Dear Tom,
> > 
> > 
> > [EMAIL PROTECTED] (Robert E. Bruccoleri) writes:
> > > BTW, given the high level of support that you provide to the PostgreSQL
> > > community, it's very accurate to state that support for PostgreSQL
> > > is far superior to that of Oracle, especially for SGI systems.
> > 
> > It's all about having the source code available, I think.  After all,
> > it was you who identified the location of the problem...
> 
> Yes, but it's not just having the source code. Although I could see
> a potential problem, it was your knowledge of the source code to recommend
> a patch that worked immediately, and your willingness to help
> that together makes PostgreSQL support so good. The same knowledge and
> helpful attitude applies to all the PostgreSQL developers.

Gee, guys, I'm tearing up here. ;-)

sniff - 

Ross





[HACKERS] Re: [SQL] LEFT JOIN ...

2001-06-18 Thread The Hermit Hacker


Perfect, thank you ... I knew I was overlooking something obvious ... the
query just flies now ...

On Mon, 18 Jun 2001, Tom Lane wrote:

> The Hermit Hacker <[EMAIL PROTECTED]> writes:
> >> Try adding ... AND n.nid = 15748 ...  to the WHERE.
>
> > n.nid is the note id ... nl.id is the contact id ...
>
> Ooops, I misread "n.nid = nl.nid" as "n.nid = nl.id".  Sorry for the
> bogus advice.
>
> Try rephrasing as
>
> FROM (note_links nl JOIN notes n ON (n.nid = nl.nid))
>  LEFT JOIN calendar c ON (n.nid = c.nid)
> WHERE ...
>
> The way you were writing it forced the LEFT JOIN to be done first,
> whereas what you want is for the note_links-to-notes join to be done
> first.  See
> http://www.ca.postgresql.org/users-lounge/docs/7.1/postgres/explicit-joins.html
>
>   regards, tom lane
>

Marc G. Fournier   ICQ#7615664   IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: [EMAIL PROTECTED]   secondary: scrappy@{freebsd|postgresql}.org





Re: AW: [HACKERS] Re: [SQL] behavior of ' = NULL' vs. MySQL vs. Standards

2001-06-18 Thread Tom Lane

"Joe Conway" <[EMAIL PROTECTED]> writes:
> I also noticed that in PostgreSQL I can do the following (both before and
> after this patch):
> select f2 is null from foo;
> whereas in both Oracle and MSSQL it causes a syntax error. Any thoughts on
> this?

I dug into this further and discovered that indeed it is not SQL92
... but it is SQL99.  Amazingly enough, SQL92 doesn't allow boolean
expressions as a possible type of general expression:

  <value expression> ::=
        <numeric value expression>
      | <string value expression>
      | <datetime value expression>
      | <interval value expression>

It only allows them as <search condition>s, which is to say WHERE,
HAVING, CASE WHEN, CHECK, and one or two other places.

But SQL99 gets it right:

  <value expression> ::=
        <numeric value expression>
      | <string value expression>
      | <datetime value expression>
      | <interval value expression>
      | <boolean value expression>
      | <user-defined type value expression>
      | <row value expression>
      | <reference value expression>
      | <collection value expression>

Looks like we're ahead of the curve here...
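
The practical difference, using the table from Joe's example:

    -- SQL99 (and PostgreSQL) allow a boolean test as an ordinary value
    -- expression, for instance in the select list:
    SELECT f2 IS NULL AS f2_is_null FROM foo;
    -- SQL92 only admits it as a search condition, e.g.:
    SELECT * FROM foo WHERE f2 IS NULL;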

regards, tom lane




[HACKERS] Re: [SQL] LEFT JOIN ...

2001-06-18 Thread Tom Lane

The Hermit Hacker <[EMAIL PROTECTED]> writes:
>> Try adding ... AND n.nid = 15748 ...  to the WHERE.

> n.nid is the note id ... nl.id is the contact id ...

Ooops, I misread "n.nid = nl.nid" as "n.nid = nl.id".  Sorry for the
bogus advice.

Try rephrasing as

FROM (note_links nl JOIN notes n ON (n.nid = nl.nid))
 LEFT JOIN calendar c ON (n.nid = c.nid)
WHERE ...

The way you were writing it forced the LEFT JOIN to be done first,
whereas what you want is for the note_links-to-notes join to be done
first.  See
http://www.ca.postgresql.org/users-lounge/docs/7.1/postgres/explicit-joins.html
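
Spelled out against the full query from the original post (column names taken
from that post), the rewrite would look roughly like:

    -- n.nid = nl.nid has moved from WHERE into the inner JOIN's ON clause.
    SELECT n.note, n.nid, n.type, c.act_type, c.status, nl.contact_lvl,
           CASE WHEN c.act_start IS NULL
                THEN date_part('epoch', n.added)
                ELSE date_part('epoch', c.act_start)
           END AS start
      FROM (note_links nl JOIN notes n ON (n.nid = nl.nid))
           LEFT JOIN calendar c ON (n.nid = c.nid)
     WHERE (n.type = 'A' OR n.type = 'N' OR n.type = 'H' OR n.type = 'C')
       AND (nl.id = 15748 AND contact_lvl = 'company')
     ORDER BY start DESC;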

regards, tom lane




Re: [HACKERS] POSTMASTER

2001-06-18 Thread Reinoud van Leeuwen

On 18 Jun 2001 17:00:41 -, you wrote:

>
>Hello All.
>
>How can i limit how much of cpu the postmaster can use?

Maybe your host OS can limit the resource usage of the userid that
postmaster runs under?
-- 
__
"Nothing is as subjective as reality"
Reinoud van Leeuwen   [EMAIL PROTECTED]
http://www.xs4all.nl/~reinoud
__




Re: [HACKERS] corrupted document in 7.1.2

2001-06-18 Thread Tom Lane

Sergio Bruder <[EMAIL PROTECTED]> writes:
> [bruder@mapi2 postgresql-7.1.2]$ file src/bin/pgaccess/doc/html/tutorial/copyright.html
> src/bin/pgaccess/doc/html/tutorial/copyright.html: GNU tar archive
> [bruder@mapi2 postgresql-7.1.2]$ tar -tvf src/bin/pgaccess/doc/html/tutorial/copyright.html
> -rw-r--r-- root/root    1195 1999-07-13 02:19:00 index.html
> -rw-r--r-- root/root    2354 1999-07-13 02:14:11 intro.html
> -rw-r--r-- sarah/sarah  4442 1999-07-13 02:18:45 irix.html
> -rw-r--r-- root/root    3442 1999-07-13 01:54:13 problems.html
> -rw-r--r-- root/root    4087 1999-07-13 02:07:22 start.html

Oh-ho, so that's why it's full of nulls and so on.  Looks like we should
extract the component files in the master sources...

regards, tom lane




[HACKERS] Re: [SQL] LEFT JOIN ...

2001-06-18 Thread The Hermit Hacker

On Mon, 18 Jun 2001, Tom Lane wrote:

> The Hermit Hacker <[EMAIL PROTECTED]> writes:
> >   FROM note_links nl, notes n LEFT JOIN calendar c ON (n.nid = c.nid)
> >   WHERE (n.type = 'A' OR n.type = 'N' OR n.type = 'H' OR n.type = 'C')
> > AND (nl.id = 15748 AND contact_lvl = 'company')
> > AND n.nid = nl.nid
> >   ORDER BY start DESC;
>
> > Is there some way to write the above so that it evaluates:
> > first, so that it only has to do the LEFT JOIN on the *one* n.nid that is
> > returned, instead of the 86736 that are in the table?
>
> Try adding ... AND n.nid = 15748 ...  to the WHERE.  It's not very
> bright about making that sort of transitive-equality deduction for
> itself...

n.nid is the note id ... nl.id is the contact id ...

I'm trying to pull out all notes for the company with an id of 15748:

sepick=# select * from note_links where id = 15748;
  nid  |  id   | contact_lvl | owner
-------+-------+-------------+-------
 84691 | 15748 | company     | f
(1 row)






Re: [HACKERS] POSTMASTER

2001-06-18 Thread Ryan Mahoney

Write very optimized statements and run them infrequently  ;)

I don't really think it's possible.  You need to understand how your 
application will be used, what the resource costs are, and plan 
accordingly (load balancing, etc.).

-r

At 05:00 PM 6/18/01 +, gabriel wrote:


>Hello All.
>
>How can i limit how much of cpu the postmaster can use?
>
>thanks
>Gabriel...
>









Re: AW: [HACKERS] Re: [SQL] behavior of ' = NULL' vs. MySQL vs. Standards

2001-06-18 Thread Tom Lane

"Joe Conway" <[EMAIL PROTECTED]> writes:
> Attached is a patch for a new NullTest node type for review and comment.

I assume you are just looking for review at this point; I would not
recommend applying to CVS until the BooleanTest part is done too.
(Since parsetree changes affect stored rules, the change really should
include a catversion.h increment, and thus it's best to bunch this sort
of change together to avoid forcing extra initdbs on other hackers.)
I'll look through the code later, but...

> Based on this, should support for converting "a = null" to "a is null"
> be dropped?

My opinion on that is already on record ;-)

> I also noticed that in PostgreSQL I can do the following (both before and
> after this patch):
> select f2 is null from foo;
> whereas in both Oracle and MSSQL it causes a syntax error. Any thoughts on
> this?

Curious; I'd have said that that is clearly within the spec.  Can anyone
check it on some other engines?

regards, tom lane




[HACKERS] Re: [SQL] LEFT JOIN ...

2001-06-18 Thread Stephan Szabo


I think that using INNER JOIN between nl and n (on n.nid=nl.nid) or
joining those tables in a subquery might work.

On Mon, 18 Jun 2001, The Hermit Hacker wrote:

> Is there some way to write the above so that it evaluates:
> 
>  WHERE (n.type = 'A' OR n.type = 'N' OR n.type = 'H' OR n.type = 'C')
>AND (nl.id = 15748 AND contact_lvl = 'company')
>AND n.nid = nl.nid
> 
> first, so that it only has to do the LEFT JOIN on the *one* n.nid that is
> returned, instead of the 86736 that are in the table?





[HACKERS] Re: [SQL] LEFT JOIN ...

2001-06-18 Thread Tom Lane

The Hermit Hacker <[EMAIL PROTECTED]> writes:
>   FROM note_links nl, notes n LEFT JOIN calendar c ON (n.nid = c.nid)
>   WHERE (n.type = 'A' OR n.type = 'N' OR n.type = 'H' OR n.type = 'C')
> AND (nl.id = 15748 AND contact_lvl = 'company')
> AND n.nid = nl.nid
>   ORDER BY start DESC;

> Is there some way to write the above so that it evaluates:
> first, so that it only has to do the LEFT JOIN on the *one* n.nid that is
> returned, instead of the 86736 that are in the table?

Try adding ... AND n.nid = 15748 ...  to the WHERE.  It's not very
bright about making that sort of transitive-equality deduction for
itself...

regards, tom lane




[HACKERS] LEFT JOIN ...

2001-06-18 Thread The Hermit Hacker


Morning ...

I'm trying to wrack my brain over something here, and no matter
how I try and look at it, I'm drawing a blank ...

I have two tables that are dependent on each other:

notes (86736 tuples) and note_links (173473 tuples)

The relationship is that one note can have several 'ppl' link'd to
it ...

I have a third table: calendar (11014 tuples) ... those calendar
entries link to a note.

So you have something like:

personA ---
personB --|--> note_links --> notes --[maybe]--> calendar entry
personC ---

Now, the query I'm working with is:

SELECT n.note, n.nid, n.type, c.act_type, c.status, nl.contact_lvl,
       CASE WHEN c.act_start IS NULL
            THEN date_part('epoch', n.added)
            ELSE date_part('epoch', c.act_start)
       END AS start
  FROM note_links nl, notes n LEFT JOIN calendar c ON (n.nid = c.nid)
 WHERE (n.type = 'A' OR n.type = 'N' OR n.type = 'H' OR n.type = 'C')
   AND (nl.id = 15748 AND contact_lvl = 'company')
   AND n.nid = nl.nid
 ORDER BY start DESC;

Which explains out as:

NOTICE:  QUERY PLAN:

Sort  (cost=7446.32..7446.32 rows=1 width=88)
  ->  Nested Loop  (cost=306.52..7446.31 rows=1 width=88)
        ->  Index Scan using note_links_id on note_links nl  (cost=0.00..3.49 rows=1 width=16)
        ->  Materialize  (cost=6692.63..6692.63 rows=60015 width=72)
              ->  Hash Join  (cost=306.52..6692.63 rows=60015 width=72)
                    ->  Seq Scan on notes n  (cost=0.00..2903.98 rows=60015 width=36)
                    ->  Hash  (cost=206.22..206.22 rows=10122 width=36)
                          ->  Seq Scan on calendar c  (cost=0.00..206.22 rows=10122 width=36)

EXPLAIN

and takes forever to run ...

Now, if I eliminate the LEFT JOIN part of the above, *one* tuple is
returned ... so even with the LEFT JOIN, only *one* tuple is going to be
returned ...

Is there some way to write the above so that it evaluates:

 WHERE (n.type = 'A' OR n.type = 'N' OR n.type = 'H' OR n.type = 'C')
   AND (nl.id = 15748 AND contact_lvl = 'company')
   AND n.nid = nl.nid

first, so that it only has to do the LEFT JOIN on the *one* n.nid that is
returned, instead of the 86736 that are in the table?

Marc G. Fournier   ICQ#7615664   IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: [EMAIL PROTECTED]   secondary: scrappy@{freebsd|postgresql}.org






Re: [HACKERS] Doc translation

2001-06-18 Thread Tom Lane

Peter Eisentraut <[EMAIL PROTECTED]> writes:
> About "I don't want to download all this stuff I can't read":  We already
> have chunked distribution tarballs.  It would be possible to "chunk out"
> the files pertaining to a particular language

That works for picking up tarballs, but (AFAIK) not for people who
update from the CVS server.  However, seeing that doc/src/sgml is
presently only about a tenth of the total CVS tree bulk, it'll probably
be a long while before the docs form an overwhelming load on CVS users.

Your other arguments seem good to me, so I agree with keeping the
translated documents in the main tree, at least for now.

regards, tom lane




Re: [HACKERS] Doc translation

2001-06-18 Thread The Hermit Hacker

On Mon, 18 Jun 2001, Peter Eisentraut wrote:

> Tatsuo Ishii writes:
>
> > Hi, some PostgreSQL users in Japan have been translating 7.1 docs into
> > Japanese. I hope the work would finish within 1-2 months. My question
> > is how the translated docs could be merged into the doc source tree
> > once it is done. Maybe doc/ja/src/sgml?
>
> A while ago I sent a proposal to -docs about how to handle this (basically
> doc/src/sgml-<lang>).  No one protested so I was going to implement it; in
> fact, I already have in some private branch I have lying around here.  It
> even includes some nice side-effects, such as fallback to English for
> incomplete translations (so you can look at the result while translation
> is still going on) and the integration of the translated SQL reference
> pages with the internationalized psql that is currently taking shape.
> (Someone else is working on a French translation and has been very anxious
> for this to happen, too.)
>
> I would not be in favour of a separate CVS module, for several reasons:
> First, it will marginalize the efforts.  I bet there are a sufficient
> number of people how would be willing to track documentation upgrades and
> keep their language up-to-date.  Second, the build machinery would get
> gratuitously complicated and spread around (makefiles, stylesheets,
> graphics, URL strings, etc.).  Third, the (undoubtedly real) problem of
> keeping these translations up to date would not be helped by this at all.
> The maintainers of these translations will simply have to be honest to not
> label their documentation set as corresponding to version X.Y when the
> content is still based on the original documentation for X.(Y-2).
>
> About "I don't want to download all this stuff I can't read":  We already
> have chunked distribution tarballs.  It would be possible to "chunk out"
> the files pertaining to a particular language (which would include the
> message catalogs as well).
>
> While other open source projects sometimes keep their entire documentation
> in a separate cvs module, they generally keep all languages together for
> the reasons given above.

I definitely have no problems with this ... one comment about Tom's "how
to keep releases together" point though ... that is what Tags/Branches are
for ... as long as we tag all modules, things shouldn't "fall out of sync"
...

But, if you already have a clean method of doing this, please, by all
means, continue ...





Re: AW: AW: [HACKERS] Call for alpha testing: planner statistics revisions

2001-06-18 Thread Tom Lane

Zeugswetter Andreas SB  <[EMAIL PROTECTED]> writes:
> I mean that it is probably not useful to maintain distribution statistics
> for a table that small (e.g. <= 3000 rows and less than 512 kB in size).

Actually, stats are quite interesting for smaller tables too.  Maybe not
so much for the table itself (ie, deciding between seq and index scan is
not so critical), but to estimate sizes of joins against other tables.

>> Not if it takes hours to get the stats.  I'm more interested in keeping
>> ANALYZE cheap and encouraging DBAs to run it frequently, so that the
>> stats stay up-to-date.  It doesn't matter how perfect the stats were
>> when they were made, if the table has changed since then.

> That is true, but this is certainly a tradeoff situation. For a huge table
> that is quite static you would certainly want most accurate statistics even
> if it takes hours to compute once a month.

Sure.  My thought is that one would do this by increasing the SET
STATISTICS targets for such tables, thus yielding more detailed stats
that take longer to compute.  What we need now is experimentation to
find out how well this works in practice.  It might well be that more
knobs will turn out to be useful, but let's not add complexity until
we've proven it to be necessary ...

regards, tom lane




AW: AW: [HACKERS] Call for alpha testing: planner statistics revisions

2001-06-18 Thread Zeugswetter Andreas SB


> > Because we do not want the dba to decide which statistics are optimal,
> > there should probably be an analyze helper application that is invoked
> > with "vacuum analyze database optimal" or some such, that also decides 
> > whether a table was sufficiently altered to justify new stats gathering
> > or vacuum.
> 
> And on what are you going to base "sufficiently altered"?

Probably current table size vs table size in statistics and maybe timestamp
when statistics were last updated. Good would also be the active row count, but 
we don't have cheap access to the current value.

The point is that if the combined effort of all "hackers" (with the help of
some large-scale users) cannot come to a more or less generally adequate answer,
the field dba most certainly won't either.

Andreas




AW: AW: AW: [HACKERS] Call for alpha testing: planner statistics revisions

2001-06-18 Thread Zeugswetter Andreas SB


> > The point is that if the combined effort of all "hackers" (with the
> > help of some large-scale users) cannot come to a more or less
> > generally adequate answer, the field dba most certainly won't either.
> 
> True, but I regard your "if" as unproven.  The reason for this call for
> alpha testing is to find out whether we have a good enough solution or
> not.  I feel no compulsion to assume that it's not good enough on the
> basis of no evidence.

Yes, sure, sorry. I certainly don't mean to be offensive. I am just 
very interested in this area, and the reasoning behind your decisions.
Time to start reading all your code comments, and doing test cases :-)

Andreas




AW: AW: [HACKERS] Call for alpha testing: planner statistics revisions

2001-06-18 Thread Zeugswetter Andreas SB


> > 3. if at all, an automatic analyze should do the samples on small tables,
> > and accurate stats on large tables
> 
> Other way 'round, surely?  It already does that: if your table has fewer
> rows than the sampling target, they all get used.

I mean that it is probably not useful to maintain distribution statistics
for a table that small (e.g. <= 3000 rows and less than 512 kB in size).
So let me reword: do the samples for medium-sized tables.

> > When on the other hand the optimizer does a "mistake" on a huge table
> > the difference is easily a matter of hours, thus you want accurate stats.
> 
> Not if it takes hours to get the stats.  I'm more interested in keeping
> ANALYZE cheap and encouraging DBAs to run it frequently, so that the
> stats stay up-to-date.  It doesn't matter how perfect the stats were
> when they were made, if the table has changed since then.

That is true, but this is certainly a tradeoff situation. For a huge table
that is quite static you would certainly want most accurate statistics even
if it takes hours to compute once a month.

My comments are based on praxis and not theory :-) Of course current 
state of the art optimizer implementations might lag well behind state of
the art theory from ACM SIGMOD :-)

Andreas




Re: AW: AW: [HACKERS] Call for alpha testing: planner statistics revisions

2001-06-18 Thread Tom Lane

Zeugswetter Andreas SB  <[EMAIL PROTECTED]> writes:
>> And on what are you going to base "sufficiently altered"?

> Probably current table size vs table size in statistics and maybe
> timestamp when statistics were last updated. Good would also be the
> active row count, but we don't have cheap access to the current value.

Once we get done with online VACUUM and internal free space re-use
(which is next on my to-do list), growth of the physical file will be
a poor guide to number of updated tuples, too.  So the above proposal
reduces to "time since last update", for which we do not need any
backend support: people already run VACUUM ANALYZE from cron tasks.

> The point is that if the combined effort of all "hackers" (with the
> help of some large-scale users) cannot come to a more or less
> generally adequate answer, the field dba most certainly won't either.

True, but I regard your "if" as unproven.  The reason for this call for
alpha testing is to find out whether we have a good enough solution or
not.  I feel no compulsion to assume that it's not good enough on the
basis of no evidence.

regards, tom lane




Re: [HACKERS] initdb from current cvs failed

2001-06-18 Thread Tom Lane

Oleg Bartunov <[EMAIL PROTECTED]> writes:
> I tried to setup postgresql from current cvs and got
> initdb failure:

You may need to do a full recompile; I've been altering some in-memory
data structures recently.  If you don't enable dependency tracking,
you definitely need "make clean" and rebuild after every cvs update.

regards, tom lane




Re: [HACKERS] Doc translation

2001-06-18 Thread Tom Lane

The Hermit Hacker <[EMAIL PROTECTED]> writes:
> we could move the english docs out of pgsql itself and into this module
> too, as:
> pgsql-docs/en

Hmm, I'm not sure that that's a good idea; seems it would lose the
coupling between versions of the source and versions of the
documentation.

I quite agree that we should have an official distribution of
non-English documentation if possible.  I'm just wondering how best to
keep track of which set of docs goes with which Postgres release.
Since the English docs are (we hope) kept up to date with the sources,
it seems best to keep those as part of the master CVS tree.

We could imagine keeping non-English docs in the same tree, but that
would require lots of attention to branch management --- for example,
we'd have to be careful to commit these Japanese translations of 7.1
docs into the REL7_1_STABLE branch.  OTOH maybe that's just as true
if there's a separate CVS tree for docs; you'd still want to deal with
a new version per source release.  So maybe a single tree is the right
answer after all.

Anyone have experience with managing this sort of situation under CVS?
Is separate tree or combined tree better?

regards, tom lane




Re: AW: [HACKERS] Call for alpha testing: planner statistics revisions

2001-06-18 Thread Tom Lane

Zeugswetter Andreas SB  <[EMAIL PROTECTED]> writes:
> Imho that is not optimal :-) ** ducks head, to evade flying hammer **
> 1. the random sample approach should be explicitly requested with some 
> syntax extension

I don't think so ... with the current implementation you *must* do
approximate ANALYZE for large tables, or face memory overflow.
We can debate where the threshold should be, but you can't get around
the fact that approximation is essential with large tables.

> 2. the sample size should also be tuneable with some analyze syntax 
> extension (the dba chooses the tradeoff between accuracy and runtime)

The sample size is already driven by the largest SET STATISTICS value
for any of the columns of the table being analyzed.  I'm not sure if we
need a user-tweakable multiplier or not.  The current multiplier is 300
(ie, 3000 sample rows with the default SET STATISTICS target of 10).
This is not a random choice; there is some theory behind it:

 * The following choice of minrows is based on the paper
 * "Random sampling for histogram construction: how much is enough?"
 * by Surajit Chaudhuri, Rajeev Motwani and Vivek Narasayya, in
 * Proceedings of ACM SIGMOD International Conference on Management
 * of Data, 1998, Pages 436-447.  Their Corollary 1 to Theorem 5
 * says that for table size n, histogram size k, maximum relative
 * error in bin size f, and error probability gamma, the minimum
 * random sample size is
 *        r = 4 * k * ln(2*n/gamma) / f^2
 * Taking f = 0.5, gamma = 0.01, n = 1 million rows, we obtain
 *        r = 305.82 * k
 * Note that because of the log function, the dependence on n is
 * quite weak; even at n = 1 billion, a 300*k sample gives <= 0.59
 * bin size error with probability 0.99.  So there's no real need to
 * scale for n, which is a good thing because we don't necessarily
 * know it at this point.
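
For concreteness, the per-column knob being referred to (table and column
names here are hypothetical):

    -- Raise the statistics target for one column of a large table, then
    -- re-analyze; the sample size follows the largest target among the
    -- table's columns (roughly 300 rows per unit of target).
    ALTER TABLE bigtable ALTER COLUMN somecol SET STATISTICS 100;
    ANALYZE bigtable;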

> 3. if at all, an automatic analyze should do the samples on small tables,
> and accurate stats on large tables

Other way 'round, surely?  It already does that: if your table has fewer
rows than the sampling target, they all get used.

> When on the other hand the optimizer does a "mistake" on a huge table
> the difference is easily a matter of hours, thus you want accurate stats.

Not if it takes hours to get the stats.  I'm more interested in keeping
ANALYZE cheap and encouraging DBAs to run it frequently, so that the
stats stay up-to-date.  It doesn't matter how perfect the stats were
when they were made, if the table has changed since then.

> Because we do not want the dba to decide which statistics are optimal,
> there should probably be an analyze helper application that is invoked
> with "vacuum analyze database optimal" or some such, that also decides 
> whether a table was sufficiently altered to justify new stats gathering
> or vacuum.

And on what are you going to base "sufficiently altered"?

regards, tom lane




Re: [HACKERS] Doc translation

2001-06-18 Thread The Hermit Hacker


Tatsuo ... setting up a separate CVS module for this does sound like a
great idea ... you already have access to the CVS repository, right?  Can
you send me a tar file containing what you have so far, and I'll get it
into CVS and then you'll be able to update that at will?

If we set it up as:

pgsql-docs/ja

we could move the english docs out of pgsql itself and into this module
too, as:

pgsql-docs/en

and any other language too ...

On Mon, 18 Jun 2001, Tatsuo Ishii wrote:

> > Tatsuo Ishii <[EMAIL PROTECTED]> writes:
> > > Hi, some PostgreSQL users in Japan have been translating 7.1 docs into
> > > Japanese. I hope the work would finish within 1-2 months. My question
> > > is how the translated docs could be merged into the doc source tree
> > > once it is done. Maybe doc/ja/src/sgml?
> >
> > Hmm, *should* they be merged into the source tree, or distributed as
> > a separate tarball?  I'm concerned that they'd always be out of sync
> > with the English docs :-(
>
> Right. However, it would be greater for Japanese users to have the
> Japanese docs in the *official* distribution of PostgreSQL, than
> getting them from other places. What about setting up a new CVS module
> for the Japanese docs, isolated from the source and English doc
> module(pgsql)?
> --
> Tatsuo Ishii
>
>

Marc G. Fournier   ICQ#7615664   IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: [EMAIL PROTECTED]   secondary: scrappy@{freebsd|postgresql}.org





AW: [HACKERS] timestamp with/without time zone

2001-06-18 Thread Zeugswetter Andreas SB


> Let's switch 'timestamp with time zone' back to 'timestamp'. This just
> makes no sense.

Imho it only makes no sense because the impl does not conform to the standard :-(
The "with time zone" form requests that the client timezone be stored in the row.
Plain "timestamp" wants no timezone arithmetic/input or output at all.

Andreas




AW: [HACKERS] Call for alpha testing: planner statistics revisions

2001-06-18 Thread Zeugswetter Andreas SB

First of all thanks for the great effort, it will surely be appreciated :-)

> * On large tables, ANALYZE uses a random sample of rows rather than
> examining every row, so that it should take a reasonably short time
> even on very large tables.  Possible downside: inaccurate stats.
> We need to find out if the sample size is large enough.

Imho that is not optimal :-) ** ducks head, to evade flying hammer **
1. the random sample approach should be explicitly requested with some 
syntax extension
2. the sample size should also be tuneable with some analyze syntax 
extension (the dba chooses the tradeoff between accuracy and runtime)
3. if at all, an automatic analyze should do the samples on small tables,
and accurate stats on large tables

The reasoning behind this is that when the optimizer makes a "mistake"
on small tables, the runtime penalty is small, and probably even beats
the cost of an accurate statistics lookup. (3-page table --> no stats
needed except table size)

When on the other hand the optimizer does a "mistake" on a huge table
the difference is easily a matter of hours, thus you want accurate stats.

Because we do not want the dba to decide which statistics are optimal,
there should probably be an analyze helper application that is invoked
with "vacuum analyze database optimal" or some such, that also decides 
whether a table was sufficiently altered to justify new stats gathering
or vacuum. The decision, what to do may also be based on a runtime limit, 
that the dba specifies ("do the most important stats/vacuums you can do 
within ~3 hours"). 

These points are also based on experience with huge SAP/R3 installations
and the way statistics are gathered there.

Andreas




AW: [HACKERS] RE: Row Versioning, for jdbc updateable result sets

2001-06-18 Thread Zeugswetter Andreas SB


> "Dave Cramer" <[EMAIL PROTECTED]> writes:
> > Can the version # of
> > the row be made available to the client?
> 
> There is no "version # of the row" in postgres, unless you 
> set up such a
> thing for yourself (which could certainly be done, using triggers).

And in addition there is no row version in SQL in general.
So I have the question whether it is actually intended to solve
updateable result sets with proprietary row versions, or whether
someone implemented it that way to optimize concurrent access for 
another db system that blocks readers?
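
A minimal sketch of the trigger-based approach Tom mentions (all names are
hypothetical, plpgsql must be installed in the database, and RETURNS opaque
is the 7.1-era spelling):

    -- Maintain a do-it-yourself row version number, bumped on every UPDATE.
    CREATE TABLE widgets (
        id      integer PRIMARY KEY,
        payload text,
        row_ver integer NOT NULL DEFAULT 1
    );

    CREATE FUNCTION bump_row_ver() RETURNS opaque AS '
    BEGIN
        NEW.row_ver := OLD.row_ver + 1;
        RETURN NEW;
    END;
    ' LANGUAGE 'plpgsql';

    CREATE TRIGGER widgets_bump_ver
        BEFORE UPDATE ON widgets
        FOR EACH ROW EXECUTE PROCEDURE bump_row_ver();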

Andreas 
