AW: AW: [HACKERS] Re: [SQL] behavior of ' = NULL' vs. MySQL vs. Standards
> -- If I have interpreted SQL92 correctly UNKNOWN IS TRUE should return
> FALSE, and UNKNOWN IS NOT TRUE is equivalent to NOT (UNKNOWN IS TRUE)
> ==> TRUE. Is this correct?

No, I do not think it is valid to say "should return true|false"; I think
they should return UNKNOWN. Only when it comes to evaluating
"... WHERE UNKNOWN;" can you translate it to "... WHERE FALSE;", or in
the output function. My interpretation would be:

    UNKNOWN IS TRUE        --> FALSE
    UNKNOWN IS NOT TRUE    --> FALSE
    NOT (UNKNOWN IS TRUE)  --> FALSE

Andreas

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
    (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
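The distinction Andreas draws — comparisons involving NULL yield UNKNOWN, which only collapses to "not true" when a WHERE clause is evaluated — can be checked against any SQL engine. A minimal sketch using SQLite through Python's standard library (not PostgreSQL, but the three-valued-logic behavior is the same for these cases; table name `t` is made up for the illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# A comparison against NULL yields UNKNOWN, which surfaces as NULL
# when the expression appears in a select list:
assert cur.execute("SELECT NULL = NULL").fetchone()[0] is None
assert cur.execute("SELECT NOT (NULL = NULL)").fetchone()[0] is None

# IS NULL is a predicate, not a comparison, so it returns a real boolean:
assert cur.execute("SELECT NULL IS NULL").fetchone()[0] == 1

# Only at the WHERE clause does UNKNOWN get treated as "not true":
cur.execute("CREATE TABLE t (x INTEGER)")
cur.execute("INSERT INTO t VALUES (1)")
rows = cur.execute("SELECT x FROM t WHERE NULL = NULL").fetchall()
assert rows == []  # the UNKNOWN qualification excludes every row
```

Note that NOT applied to UNKNOWN is still UNKNOWN, which is why translating UNKNOWN to FALSE too early (before a surrounding NOT) would give wrong answers.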
[HACKERS] Re: temp-table-related failure in regression tests
I wrote:
> Is anyone else seeing this with current CVS, or is it my own breakage?

Ah, the problem is RelationGetRelationName didn't know about the new
temprel naming convention. I quick-hacked rel.h to fix this, but we need
a better solution. I don't much like having rel.h include temprel.h ---
seems like the include should go the other way. Should is_temp_relname
get moved to rel.h?

regards, tom lane
[HACKERS] temp-table-related failure in regression tests
Is anyone else seeing this with current CVS, or is it my own breakage?

*** ./expected/alter_table.out	Wed May 30 12:38:38 2001
--- ./results/alter_table.out	Tue Jun 19 00:45:22 2001
***************
*** 340,347 ****
  ALTER TABLE FKTABLE ADD FOREIGN KEY(ftest1) references pktable(ptest1);
  NOTICE:  ALTER TABLE ... ADD CONSTRAINT will create implicit trigger(s) for FOREIGN KEY check(s)
  DROP TABLE pktable;
! NOTICE:  DROP TABLE implicitly drops referential integrity trigger from table "fktable"
! NOTICE:  DROP TABLE implicitly drops referential integrity trigger from table "fktable"
  DROP TABLE fktable;
  CREATE TEMP TABLE PKTABLE (ptest1 int, ptest2 text, PRIMARY KEY(ptest1, ptest2));
--- 340,347 ----
  ALTER TABLE FKTABLE ADD FOREIGN KEY(ftest1) references pktable(ptest1);
  NOTICE:  ALTER TABLE ... ADD CONSTRAINT will create implicit trigger(s) for FOREIGN KEY check(s)
  DROP TABLE pktable;
! NOTICE:  DROP TABLE implicitly drops referential integrity trigger from table "pg_temp_15818_3"
! NOTICE:  DROP TABLE implicitly drops referential integrity trigger from table "pg_temp_15818_3"
  DROP TABLE fktable;
  CREATE TEMP TABLE PKTABLE (ptest1 int, ptest2 text, PRIMARY KEY(ptest1, ptest2));

regards, tom lane
Re: [HACKERS] timestamp with/without time zone
Bruce Momjian <[EMAIL PROTECTED]> writes:
> I am confused what you are suggesting here.

*** src/backend/utils/adt/format_type.c.orig	Wed May 23 18:10:19 2001
--- src/backend/utils/adt/format_type.c	Mon Jun 18 21:41:53 2001
***************
*** 178,184 ****
  			break;

  		case TIMESTAMPOID:
! 			buf = pstrdup("timestamp with time zone");
  			break;

  		case VARBITOID:
--- 178,184 ----
  			break;

  		case TIMESTAMPOID:
! 			buf = pstrdup("timestamp");
  			break;

  		case VARBITOID:

Clear enough?

regards, tom lane
Re: [HACKERS] timestamp with/without time zone
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Very few people know the standards stuff so it seems we should just call
> it timestamp and do the best we can. Basically by mentioning "with
> timezone" we are making the standards people happy but confusing our
> users.

I don't believe we're making any standards-lovers happy either, because
the datatype in question *is* *not* SQL9x's TIMESTAMP WITH TIME ZONE.
Given that no one actually wants to change its behavior to conform to
either of the standard's datatypes, ISTM that calling it something
different from either of those two is the appropriate path.

At some point (if someone is foolish enough to want to implement the
spec's semantics) we might have three distinct datatypes called
timestamp, timestamp with time zone, and timestamp without time zone,
with the first of these (the existing type) being the recommended
choice. What we have at the moment is that lacking implementations for
the last two, we map them into the first one. That doesn't seem
unreasonable to me. But to have a clean upgrade path from one to three
types, we need to be sure we call the existing type what it is, and not
mislabel it as one of the spec-compliant types.

regards, tom lane
Re: [HACKERS] Re: Various silliness in heap_getnext and related routines
On Mon, Jun 11, 2001 at 12:21:56PM -0400, Robert E. Bruccoleri wrote:
> Dear Tom,
>
>> [EMAIL PROTECTED] (Robert E. Bruccoleri) writes:
>>> BTW, given the high level of support that you provide to the PostgreSQL
>>> community, it's very accurate to state that support for PostgreSQL
>>> is far superior to that of Oracle, especially for SGI systems.
>>
>> It's all about having the source code available, I think. After all,
>> it was you who identified the location of the problem...
>
> Yes, but it's not just having the source code. Although I could see
> a potential problem, it was your knowledge of the source code to recommend
> a patch that worked immediately, and your willingness to help
> that together makes PostgreSQL support so good. The same knowledge and
> helpful attitude applies to all the PostgreSQL developers.

Gee, guys, I'm tearing up here. ;-) sniff

- Ross
[HACKERS] Re: [SQL] LEFT JOIN ...
Perfect, thank you ... i knew I was overlooking something obvious ... the
query just flies now ...

On Mon, 18 Jun 2001, Tom Lane wrote:
> The Hermit Hacker <[EMAIL PROTECTED]> writes:
>>> Try adding ... AND n.nid = 15748 ... to the WHERE.
>> n.nid is the note id ... nl.id is the contact id ...
>
> Ooops, I misread "n.nid = nl.nid" as "n.nid = nl.id". Sorry for the
> bogus advice.
>
> Try rephrasing as
>
>     FROM (note_links nl JOIN notes n ON (n.nid = nl.nid))
>       LEFT JOIN calendar c ON (n.nid = c.nid)
>     WHERE ...
>
> The way you were writing it forced the LEFT JOIN to be done first,
> whereas what you want is for the note_links-to-notes join to be done
> first. See
> http://www.ca.postgresql.org/users-lounge/docs/7.1/postgres/explicit-joins.html
>
> regards, tom lane

Marc G. Fournier                   ICQ#7615664   IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: [EMAIL PROTECTED]   secondary: scrappy@{freebsd|postgresql}.org
Re: AW: [HACKERS] Re: [SQL] behavior of ' = NULL' vs. MySQL vs. Standards
"Joe Conway" <[EMAIL PROTECTED]> writes:
> I also noticed that in PostgreSQL I can do the following (both before and
> after this patch):
> select f2 is null from foo;
> whereas in both Oracle and MSSQL it causes a syntax error. Any thoughts on
> this?

I dug into this further and discovered that indeed it is not SQL92 ...
but it is SQL99. Amazingly enough, SQL92 doesn't allow boolean
expressions as a possible type of general expression:

    <value expression> ::=
          <numeric value expression>
        | <string value expression>
        | <datetime value expression>
        | <interval value expression>

It only allows them as <search condition>s, which is to say WHERE,
HAVING, CASE WHEN, CHECK, and one or two other places. But SQL99 gets
it right:

    <value expression> ::=
          <numeric value expression>
        | <string value expression>
        | <datetime value expression>
        | <interval value expression>
        | <boolean value expression>
        | <user-defined type value expression>
        | <row value expression>
        | <reference value expression>
        | <collection value expression>

Looks like we're ahead of the curve here...

regards, tom lane
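The concrete case under discussion — a boolean-valued expression in the select list rather than in a search condition — is easy to demonstrate. A sketch using SQLite via Python (`foo` and `f2` are the names from Joe's example; SQLite, like PostgreSQL, follows the SQL99 reading here):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE foo (f2 INTEGER)")
cur.executemany("INSERT INTO foo VALUES (?)", [(1,), (None,)])

# SQL92 only permits a boolean expression inside a <search condition>
# (WHERE, HAVING, CASE WHEN, CHECK, ...); SQL99 additionally allows it
# as a general <value expression>, i.e. directly in the select list:
result = [r[0] for r in cur.execute("SELECT f2 IS NULL FROM foo ORDER BY rowid")]
assert result == [0, 1]  # false for the non-NULL row, true for the NULL row
```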
[HACKERS] Re: [SQL] LEFT JOIN ...
The Hermit Hacker <[EMAIL PROTECTED]> writes:
>>> Try adding ... AND n.nid = 15748 ... to the WHERE.
>> n.nid is the note id ... nl.id is the contact id ...

Ooops, I misread "n.nid = nl.nid" as "n.nid = nl.id". Sorry for the
bogus advice.

Try rephrasing as

    FROM (note_links nl JOIN notes n ON (n.nid = nl.nid))
      LEFT JOIN calendar c ON (n.nid = c.nid)
    WHERE ...

The way you were writing it forced the LEFT JOIN to be done first,
whereas what you want is for the note_links-to-notes join to be done
first. See
http://www.ca.postgresql.org/users-lounge/docs/7.1/postgres/explicit-joins.html

regards, tom lane
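Tom's rewrite — join note_links to notes first, then LEFT JOIN calendar onto the result — can be sketched on a toy schema. This uses SQLite through Python with made-up sample rows (the table and column names come from the thread; the data values are invented for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
    CREATE TABLE notes      (nid INTEGER, note TEXT);
    CREATE TABLE note_links (nid INTEGER, id INTEGER);
    CREATE TABLE calendar   (nid INTEGER, act_start TEXT);
    INSERT INTO notes      VALUES (84691, 'call back'), (84692, 'other');
    INSERT INTO note_links VALUES (84691, 15748), (84692, 99999);
    INSERT INTO calendar   VALUES (84691, '2001-06-18');
""")

# Explicit JOIN syntax binds left-to-right: the inner join
# note_links->notes happens first, and the LEFT JOIN against calendar
# only sees the (single) row that survives it:
rows = cur.execute("""
    SELECT n.nid, c.act_start
      FROM note_links nl
      JOIN notes n ON n.nid = nl.nid
      LEFT JOIN calendar c ON n.nid = c.nid
     WHERE nl.id = 15748
""").fetchall()
assert rows == [(84691, '2001-06-18')]
```

The original formulation (`FROM note_links nl, notes n LEFT JOIN calendar c ...`) forced the planner of that era to outer-join all of notes against calendar before the restriction on nl.id could prune anything.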
Re: [HACKERS] POSTMASTER
On 18 Jun 2001 17:00:41 -, you wrote:
> Hello All.
>
> How can i limit how much of cpu the postmaster can use?

Maybe your host OS can limit the resource usage of the userid that
postmaster runs under?

--
"Nothing is as subjective as reality"
Reinoud van Leeuwen   [EMAIL PROTECTED]   http://www.xs4all.nl/~reinoud
Re: [HACKERS] corrupted document in 7.1.2
Sergio Bruder <[EMAIL PROTECTED]> writes:
> [bruder@mapi2 postgresql-7.1.2]$ file src/bin/pgaccess/doc/html/tutorial/copyright.html
> src/bin/pgaccess/doc/html/tutorial/copyright.html: GNU tar archive
> [bruder@mapi2 postgresql-7.1.2]$ tar -tvf src/bin/pgaccess/doc/html/tutorial/copyright.html
> -rw-r--r-- root/root    1195 1999-07-13 02:19:00 index.html
> -rw-r--r-- root/root    2354 1999-07-13 02:14:11 intro.html
> -rw-r--r-- sarah/sarah  4442 1999-07-13 02:18:45 irix.html
> -rw-r--r-- root/root    3442 1999-07-13 01:54:13 problems.html
> -rw-r--r-- root/root    4087 1999-07-13 02:07:22 start.html

Oh-ho, so that's why it's full of nulls and so on. Looks like we should
extract the component files in the master sources...

regards, tom lane
[HACKERS] Re: [SQL] LEFT JOIN ...
On Mon, 18 Jun 2001, Tom Lane wrote:
> The Hermit Hacker <[EMAIL PROTECTED]> writes:
>> FROM note_links nl, notes n LEFT JOIN calendar c ON (n.nid = c.nid)
>> WHERE (n.type = 'A' OR n.type = 'N' OR n.type = 'H' OR n.type = 'C')
>>   AND (nl.id = 15748 AND contact_lvl = 'company')
>>   AND n.nid = nl.nid
>> ORDER BY start DESC;
>
>> Is there some way to write the above so that it evaluates:
>> first, so that it only has to do the LEFT JOIN on the *one* n.nid that is
>> returned, instead of the 86736 that are in the table?
>
> Try adding ... AND n.nid = 15748 ... to the WHERE. It's not very
> bright about making that sort of transitive-equality deduction for
> itself...

n.nid is the note id ... nl.id is the contact id ... I'm trying to pull
out all notes for the company with an id of 15748:

sepick=# select * from note_links where id = 15748;
  nid  |  id   | contact_lvl | owner
-------+-------+-------------+-------
 84691 | 15748 | company     | f
(1 row)
Re: [HACKERS] POSTMASTER
Write very optimized statements and run them infrequently ;)

I don't really think it's possible. You need to understand how your
application will be used, what the resource costs are, and plan
accordingly (load balance, etc.)

-r

At 05:00 PM 6/18/01 +, gabriel wrote:
> Hello All.
>
> How can i limit how much of cpu the postmaster can use?
>
> thanks
> Gabriel...
Re: AW: [HACKERS] Re: [SQL] behavior of ' = NULL' vs. MySQL vs. Standards
"Joe Conway" <[EMAIL PROTECTED]> writes:
> Attached is a patch for a new NullTest node type for review and comment.

I assume you are just looking for review at this point; I would not
recommend applying to CVS until the BooleanTest part is done too.
(Since parsetree changes affect stored rules, the change really should
include a catversion.h increment, and thus it's best to bunch this sort
of change together to avoid forcing extra initdbs on other hackers.)

I'll look through the code later, but...

> Based on this, should support for converting "a = null" to "a is null"
> be dropped?

My opinion on that is already on record ;-)

> I also noticed that in PostgreSQL I can do the following (both before and
> after this patch):
> select f2 is null from foo;
> whereas in both Oracle and MSSQL it causes a syntax error. Any thoughts on
> this?

Curious; I'd have said that that is clearly within the spec. Can anyone
check it on some other engines?

regards, tom lane
[HACKERS] Re: [SQL] LEFT JOIN ...
I think that using INNER JOIN between nl and n (on n.nid = nl.nid) or
joining those tables in a subquery might work.

On Mon, 18 Jun 2001, The Hermit Hacker wrote:
> Is there some way to write the above so that it evaluates:
>
>     WHERE (n.type = 'A' OR n.type = 'N' OR n.type = 'H' OR n.type = 'C')
>       AND (nl.id = 15748 AND contact_lvl = 'company')
>       AND n.nid = nl.nid
>
> first, so that it only has to do the LEFT JOIN on the *one* n.nid that is
> returned, instead of the 86736 that are in the table?
[HACKERS] Re: [SQL] LEFT JOIN ...
The Hermit Hacker <[EMAIL PROTECTED]> writes:
> FROM note_links nl, notes n LEFT JOIN calendar c ON (n.nid = c.nid)
> WHERE (n.type = 'A' OR n.type = 'N' OR n.type = 'H' OR n.type = 'C')
>   AND (nl.id = 15748 AND contact_lvl = 'company')
>   AND n.nid = nl.nid
> ORDER BY start DESC;

> Is there some way to write the above so that it evaluates:
> first, so that it only has to do the LEFT JOIN on the *one* n.nid that is
> returned, instead of the 86736 that are in the table?

Try adding ... AND n.nid = 15748 ... to the WHERE. It's not very
bright about making that sort of transitive-equality deduction for
itself...

regards, tom lane
[HACKERS] LEFT JOIN ...
Morning ...

I'm trying to rack my brain over something here, and no matter how I try
and look at it, I'm drawing a blank ...

I have two tables that are dependent on each other: notes (86736 tuples)
and note_links (173473 tuples). The relationship is that one note can
have several 'ppl' link'd to it ... I have a third table: calendar
(11014 tuples) ... those calendar entries link to a note. So you have
something like:

    personA ---
    personB --|--> note_links --> notes --[maybe]--> calendar entry
    personC ---

now, the query I'm working with is:

    SELECT n.note, n.nid, n.type, c.act_type, c.status, nl.contact_lvl,
           CASE WHEN c.act_start IS NULL THEN date_part('epoch', n.added)
                ELSE date_part('epoch', c.act_start)
           END AS start
      FROM note_links nl, notes n LEFT JOIN calendar c ON (n.nid = c.nid)
     WHERE (n.type = 'A' OR n.type = 'N' OR n.type = 'H' OR n.type = 'C')
       AND (nl.id = 15748 AND contact_lvl = 'company')
       AND n.nid = nl.nid
     ORDER BY start DESC;

Which explains out as:

    NOTICE:  QUERY PLAN:

    Sort  (cost=7446.32..7446.32 rows=1 width=88)
      ->  Nested Loop  (cost=306.52..7446.31 rows=1 width=88)
            ->  Index Scan using note_links_id on note_links nl  (cost=0.00..3.49 rows=1 width=16)
            ->  Materialize  (cost=6692.63..6692.63 rows=60015 width=72)
                  ->  Hash Join  (cost=306.52..6692.63 rows=60015 width=72)
                        ->  Seq Scan on notes n  (cost=0.00..2903.98 rows=60015 width=36)
                        ->  Hash  (cost=206.22..206.22 rows=10122 width=36)
                              ->  Seq Scan on calendar c  (cost=0.00..206.22 rows=10122 width=36)

    EXPLAIN

and takes forever to run ... Now, if I eliminate the LEFT JOIN part of
the above, *one* tuple is returned ... so even with the LEFT JOIN, only
*one* tuple is going to be returned ...

Is there some way to write the above so that it evaluates:

    WHERE (n.type = 'A' OR n.type = 'N' OR n.type = 'H' OR n.type = 'C')
      AND (nl.id = 15748 AND contact_lvl = 'company')
      AND n.nid = nl.nid

first, so that it only has to do the LEFT JOIN on the *one* n.nid that
is returned, instead of the 86736 that are in the table?

Marc G. Fournier                   ICQ#7615664   IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: [EMAIL PROTECTED]   secondary: scrappy@{freebsd|postgresql}.org
Re: [HACKERS] Doc translation
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> About "I don't want to download all this stuff I can't read": We already
> have chunked distribution tarballs. It would be possible to "chunk out"
> the files pertaining to a particular language

That works for picking up tarballs, but (AFAIK) not for people who
update from the CVS server. However, seeing that doc/src/sgml is
presently only about a tenth of the total CVS tree bulk, it'll probably
be a long while before the docs form an overwhelming load on CVS users.

Your other arguments seem good to me, so I agree with keeping the
translated documents in the main tree, at least for now.

regards, tom lane
Re: [HACKERS] Doc translation
On Mon, 18 Jun 2001, Peter Eisentraut wrote:
> Tatsuo Ishii writes:
>> Hi, some PostgreSQL users in Japan have been translating 7.1 docs into
>> Japanese. I hope the work would finish within 1-2 months. My question
>> is how the translated docs could be merged into the doc source tree
>> once it is done. Maybe doc/ja/src/sgml?
>
> A while ago I sent a proposal to -docs about how to handle this (basically
> doc/src/sgml-<lang>). No one protested so I was going to implement it; in
> fact, I already have in some private branch I have lying around here. It
> even includes some nice side-effects, such as fallback to English for
> incomplete translations (so you can look at the result while translation
> is still going on) and the integration of the translated SQL reference
> pages with the internationalized psql that is currently taking shape.
> (Someone else is working on a French translation and has been very anxious
> for this to happen, too.)
>
> I would not be in favour of a separate CVS module, for several reasons:
> First, it will marginalize the efforts. I bet there are a sufficient
> number of people who would be willing to track documentation upgrades and
> keep their language up-to-date. Second, the build machinery would get
> gratuitously complicated and spread around (makefiles, stylesheets,
> graphics, URL strings, etc.). Third, the (undoubtedly real) problem of
> keeping these translations up to date would not be helped by this at all.
> The maintainers of these translations will simply have to be honest to not
> label their documentation set as corresponding to version X.Y when the
> content is still based on the original documentation for X.(Y-2).
>
> About "I don't want to download all this stuff I can't read": We already
> have chunked distribution tarballs. It would be possible to "chunk out"
> the files pertaining to a particular language (which would include the
> message catalogs as well).
>
> While other open source projects sometimes keep their entire documentation
> in a separate cvs module, they generally keep all languages together for
> the reasons given above.

I definitely have no problems with this ... one comment about Tom's "how
to keep releases together" point though ... that is what Tags/Branches
are for ... as long as we tag all modules, things shouldn't "fall out of
sync" ... But, if you already have a clean method of doing this, please,
by all means, continue ...
Re: AW: AW: [HACKERS] Call for alpha testing: planner statistics revisions
Zeugswetter Andreas SB <[EMAIL PROTECTED]> writes:
> I mean, that it is probably not useful to maintain distribution statistics
> for a table that is that small at all (e.g. <= 3000 rows and less than
> 512 k size).

Actually, stats are quite interesting for smaller tables too. Maybe not
so much for the table itself (ie, deciding between seq and index scan is
not so critical), but to estimate sizes of joins against other tables.

>> Not if it takes hours to get the stats. I'm more interested in keeping
>> ANALYZE cheap and encouraging DBAs to run it frequently, so that the
>> stats stay up-to-date. It doesn't matter how perfect the stats were
>> when they were made, if the table has changed since then.

> That is true, but this is certainly a tradeoff situation. For a huge table
> that is quite static you would certainly want most accurate statistics even
> if it takes hours to compute once a month.

Sure. My thought is that one would do this by increasing the SET
STATISTICS targets for such tables, thus yielding more detailed stats
that take longer to compute. What we need now is experimentation to
find out how well this works in practice. It might well be that more
knobs will turn out to be useful, but let's not add complexity until
we've proven it to be necessary ...

regards, tom lane
AW: AW: [HACKERS] Call for alpha testing: planner statistics revisions
>> Because we do not want the dba to decide which statistics are optimal,
>> there should probably be an analyze helper application that is invoked
>> with "vacuum analyze database optimal" or some such, that also decides
>> whether a table was sufficiently altered to justify new stats gathering
>> or vacuum.
>
> And on what are you going to base "sufficiently altered"?

Probably current table size vs table size in statistics, and maybe the
timestamp when statistics were last updated. Good would also be the
active row count, but we don't have cheap access to the current value.

The point is, that if the combined effort of all "hackers" (with the
help of some large scale users) cannot come to a more or less generally
adequate answer, the field dba most certainly won't either.

Andreas
AW: AW: AW: [HACKERS] Call for alpha testing: planner statistics revisions
>> The point is, that if the combined effort of all "hackers" (with the
>> help of some large scale users) cannot come to a more or less
>> generally adequate answer, the field dba most certainly won't either.
>
> True, but I regard your "if" as unproven. The reason for this call for
> alpha testing is to find out whether we have a good enough solution or
> not. I feel no compulsion to assume that it's not good enough on the
> basis of no evidence.

Yes, sure, sorry. I certainly don't mean to be offensive. I am just
very interested in this area, and the reasoning behind your decisions.
Time to start reading all your code comments, and doing test cases :-)

Andreas
AW: AW: [HACKERS] Call for alpha testing: planner statistics revisions
>> 3. if at all, an automatic analyze should do the samples on small tables,
>> and accurate stats on large tables
>
> Other way 'round, surely? It already does that: if your table has fewer
> rows than the sampling target, they all get used.

I mean that it is probably not useful to maintain distribution
statistics for a table that is that small at all (e.g. <= 3000 rows and
less than 512 k size). So let me reword: do the samples for medium
sized tables.

>> When on the other hand the optimizer does a "mistake" on a huge table
>> the difference is easily a matter of hours, thus you want accurate stats.
>
> Not if it takes hours to get the stats. I'm more interested in keeping
> ANALYZE cheap and encouraging DBAs to run it frequently, so that the
> stats stay up-to-date. It doesn't matter how perfect the stats were
> when they were made, if the table has changed since then.

That is true, but this is certainly a tradeoff situation. For a huge
table that is quite static you would certainly want most accurate
statistics even if it takes hours to compute once a month. My comments
are based on practice and not theory :-) Of course current
state-of-the-art optimizer implementations might lag well behind
state-of-the-art theory from ACM SIGMOD :-)

Andreas
Re: AW: AW: [HACKERS] Call for alpha testing: planner statistics revisions
Zeugswetter Andreas SB <[EMAIL PROTECTED]> writes:
>> And on what are you going to base "sufficiently altered"?

> Probably current table size vs table size in statistics and maybe
> timestamp when statistics were last updated. Good would also be the
> active row count, but we don't have cheap access to the current value.

Once we get done with online VACUUM and internal free space re-use
(which is next on my to-do list), growth of the physical file will be a
poor guide to the number of updated tuples, too. So the above proposal
reduces to "time since last update", for which we do not need any
backend support: people already run VACUUM ANALYZE from cron tasks.

> The point is, that if the combined effort of all "hackers" (with the
> help of some large scale users) cannot come to a more or less
> generally adequate answer, the field dba most certainly won't either.

True, but I regard your "if" as unproven. The reason for this call for
alpha testing is to find out whether we have a good enough solution or
not. I feel no compulsion to assume that it's not good enough on the
basis of no evidence.

regards, tom lane
Re: [HACKERS] initdb from current cvs failed
Oleg Bartunov <[EMAIL PROTECTED]> writes:
> I tried to setup postgresql from current cvs and got
> initdb failure:

You may need to do a full recompile; I've been altering some in-memory
data structures recently. If you don't enable dependency tracking, you
definitely need "make clean" and rebuild after every cvs update.

regards, tom lane
Re: [HACKERS] Doc translation
The Hermit Hacker <[EMAIL PROTECTED]> writes:
> we could move the english docs out of pgsql itself and into this module
> too, as:
>     pgsql-docs/en

Hmm, I'm not sure that that's a good idea; seems it would lose the
coupling between versions of the source and versions of the
documentation.

I quite agree that we should have an official distribution of
non-English documentation if possible. I'm just wondering how best to
keep track of which set of docs goes with which Postgres release.

Since the English docs are (we hope) kept up to date with the sources,
it seems best to keep those as part of the master CVS tree. We could
imagine keeping non-English docs in the same tree, but that would
require lots of attention to branch management --- for example, we'd
have to be careful to commit these Japanese translations of 7.1 docs
into the REL7_1_STABLE branch.

OTOH maybe that's just as true if there's a separate CVS tree for docs;
you'd still want to deal with a new version per source release. So
maybe a single tree is the right answer after all.

Anyone have experience with managing this sort of situation under CVS?
Is separate tree or combined tree better?

regards, tom lane
Re: AW: [HACKERS] Call for alpha testing: planner statistics revisions
Zeugswetter Andreas SB <[EMAIL PROTECTED]> writes:
> Imho that is not optimal :-) ** ducks head, to evade flying hammer **
> 1. the random sample approach should be explicitly requested with some
> syntax extension

I don't think so ... with the current implementation you *must* do
approximate ANALYZE for large tables, or face memory overflow. We can
debate where the threshold should be, but you can't get around the fact
that approximation is essential with large tables.

> 2. the sample size should also be tuneable with some analyze syntax
> extension (the dba chooses the tradeoff between accuracy and runtime)

The sample size is already driven by the largest SET STATISTICS value
for any of the columns of the table being analyzed. I'm not sure if we
need a user-tweakable multiplier or not. The current multiplier is 300
(ie, 3000 sample rows with the default SET STATISTICS target of 10).
This is not a random choice; there is some theory behind it:

 * The following choice of minrows is based on the paper
 * "Random sampling for histogram construction: how much is enough?"
 * by Surajit Chaudhuri, Rajeev Motwani and Vivek Narasayya, in
 * Proceedings of ACM SIGMOD International Conference on Management
 * of Data, 1998, Pages 436-447.  Their Corollary 1 to Theorem 5
 * says that for table size n, histogram size k, maximum relative
 * error in bin size f, and error probability gamma, the minimum
 * random sample size is
 *        r = 4 * k * ln(2*n/gamma) / f^2
 * Taking f = 0.5, gamma = 0.01, n = 1 million rows, we obtain
 *        r = 305.82 * k
 * Note that because of the log function, the dependence on n is
 * quite weak; even at n = 1 billion, a 300*k sample gives <= 0.59
 * bin size error with probability 0.99.  So there's no real need to
 * scale for n, which is a good thing because we don't necessarily
 * know it at this point.

> 3. if at all, an automatic analyze should do the samples on small tables,
> and accurate stats on large tables

Other way 'round, surely? It already does that: if your table has fewer
rows than the sampling target, they all get used.

> When on the other hand the optimizer does a "mistake" on a huge table
> the difference is easily a matter of hours, thus you want accurate stats.

Not if it takes hours to get the stats. I'm more interested in keeping
ANALYZE cheap and encouraging DBAs to run it frequently, so that the
stats stay up-to-date. It doesn't matter how perfect the stats were
when they were made, if the table has changed since then.

> Because we do not want the dba to decide which statistics are optimal,
> there should probably be an analyze helper application that is invoked
> with "vacuum analyze database optimal" or some such, that also decides
> whether a table was sufficiently altered to justify new stats gathering
> or vacuum.

And on what are you going to base "sufficiently altered"?

regards, tom lane
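The Chaudhuri-Motwani-Narasayya bound quoted in the comment above is easy to check numerically; the numbers in the comment (the 305.82 multiplier and the 0.59 error at a billion rows) fall straight out of the formula:

```python
import math

def min_sample_rows(k, n=1_000_000, f=0.5, gamma=0.01):
    """Minimum random sample size r = 4*k*ln(2*n/gamma) / f^2 for a
    k-bin histogram with relative bin-size error <= f holding with
    probability 1 - gamma (Chaudhuri, Motwani & Narasayya 1998)."""
    return 4 * k * math.log(2 * n / gamma) / f ** 2

# With f = 0.5, gamma = 0.01, n = 1 million, the per-bin multiplier
# is ~305.82, which ANALYZE rounds down to 300:
assert abs(min_sample_rows(1) - 305.82) < 0.5

# Conversely, holding the sample at 300*k rows and solving the same
# formula for f at n = 1 billion reproduces the "<= 0.59 bin size
# error" claim -- the dependence on n is only logarithmic:
f_at_billion = math.sqrt(4 * math.log(2 * 1_000_000_000 / 0.01) / 300)
assert abs(f_at_billion - 0.59) < 0.01
```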
Re: [HACKERS] Doc translation
Tatsuo ... setting up a separate CVS module for this does sound like a
great idea ... you already have access to the CVS repository, right?
Can you send me a tar file containing what you have so far, and I'll
get it into CVS, and then you'll be able to update that at will?

If we set it up as:

    pgsql-docs/ja

we could move the english docs out of pgsql itself and into this module
too, as:

    pgsql-docs/en

and any other language too ...

On Mon, 18 Jun 2001, Tatsuo Ishii wrote:
>> Tatsuo Ishii <[EMAIL PROTECTED]> writes:
>>> Hi, some PostgreSQL users in Japan have been translating 7.1 docs into
>>> Japanese. I hope the work would finish within 1-2 months. My question
>>> is how the translated docs could be merged into the doc source tree
>>> once it is done. Maybe doc/ja/src/sgml?
>>
>> Hmm, *should* they be merged into the source tree, or distributed as
>> a separate tarball? I'm concerned that they'd always be out of sync
>> with the English docs :-(
>
> Right. However, it would be greater for Japanese users to have the
> Japanese docs in the *official* distribution of PostgreSQL, than
> getting them from other places. What about setting up a new CVS module
> for the Japanese docs, isolated from the source and English doc
> module (pgsql)?
> --
> Tatsuo Ishii

Marc G. Fournier                   ICQ#7615664   IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: [EMAIL PROTECTED]   secondary: scrappy@{freebsd|postgresql}.org
AW: [HACKERS] timestamp with/without time zone
> Let's switch 'timestamp with time zone' back to 'timestamp'. This just
> makes no sense.

Imho it only makes no sense because the implementation does not conform
to the standard :-(  "With time zone" requests that the client's time
zone be stored with the row. Plain "timestamp" wants no time zone
arithmetic on input or output at all.

Andreas
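The semantic split Andreas describes — a zone-carrying value that names an absolute instant versus a zone-free wall-clock reading — can be illustrated with Python's datetime as a stand-in (this is not PostgreSQL's implementation, just an analogy; the client zone chosen is arbitrary):

```python
from datetime import datetime, timezone, timedelta

# "With time zone": the client's zone is part of the value, so the value
# pins down an absolute instant and zone arithmetic is meaningful.
client_zone = timezone(timedelta(hours=2))   # hypothetical client offset
aware = datetime(2001, 6, 19, 12, 0, tzinfo=client_zone)
utc_view = aware.astimezone(timezone.utc)
print(utc_view)        # the same instant, rendered in UTC

# Plain "timestamp": a wall-clock reading with no zone attached, and no
# zone arithmetic on input or output.
naive = datetime(2001, 6, 19, 12, 0)
print(naive.tzinfo)    # None: zone-free by design

# The two notions are deliberately incomparable.
try:
    aware < naive
except TypeError as err:
    print("cannot compare aware and naive:", err)
```

The point of the analogy: conflating the two types loses information in one direction (the stored zone) and invents it in the other.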
AW: [HACKERS] Call for alpha testing: planner statistics revisions
First of all, thanks for the great effort; it will surely be
appreciated :-)

> * On large tables, ANALYZE uses a random sample of rows rather than
>   examining every row, so that it should take a reasonably short time
>   even on very large tables. Possible downside: inaccurate stats.
>   We need to find out if the sample size is large enough.

Imho that is not optimal :-)  ** ducks head, to evade flying hammer **

1. The random-sample approach should be explicitly requested with some
   syntax extension.
2. The sample size should also be tunable with some ANALYZE syntax
   extension (the dba chooses the tradeoff between accuracy and
   runtime).
3. If anything, an automatic analyze should sample the small tables and
   gather accurate stats on the large tables.

The reasoning behind this is that when the optimizer makes a "mistake"
on a small table, the runtime penalty is small, and probably even beats
the cost of an accurate statistics lookup (a 3-page table needs no
stats beyond the table size). When on the other hand the optimizer
makes a "mistake" on a huge table, the difference is easily a matter of
hours, thus you want accurate stats.

Because we do not want the dba to decide which statistics are optimal,
there should probably be an analyze helper application, invoked with
"vacuum analyze database optimal" or some such, that also decides
whether a table was sufficiently altered to justify gathering new stats
or vacuuming. The decision about what to do might also be based on a
runtime limit that the dba specifies ("do the most important
stats/vacuums you can do within ~3 hours").

These points are also based on experience with huge SAP R/3
installations and the way statistics are gathered there.

Andreas
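The "sufficiently altered" test that the proposed helper application would need could be as simple as tracking modifications since the last ANALYZE and re-analyzing once they exceed a base threshold plus a fraction of the table size. The sketch below is purely illustrative of that idea — the bookkeeping, names, and thresholds are invented, not anything PostgreSQL implements:

```python
from dataclasses import dataclass

# Hypothetical per-table bookkeeping for the proposed analyze helper.
@dataclass
class TableStats:
    name: str
    row_count: int
    changes_since_analyze: int   # inserts + updates + deletes

def needs_analyze(t: TableStats, base: int = 50, frac: float = 0.10) -> bool:
    """Re-analyze once churn exceeds a fixed base plus a fraction of
    the table size. The base term lets small tables absorb more churn
    relative to their size, matching the cheap-mistake argument."""
    return t.changes_since_analyze > base + frac * t.row_count

tiny = TableStats("lookup", row_count=30, changes_since_analyze=20)
huge = TableStats("orders", row_count=5_000_000, changes_since_analyze=600_000)
print(needs_analyze(tiny))   # small churn on a tiny table: skip
print(needs_analyze(huge))   # big churn on a huge table: re-analyze
```

A runtime budget could then be layered on top by sorting the tables that pass this test by estimated benefit and stopping when the budget runs out.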
AW: [HACKERS] RE: Row Versioning, for jdbc updateable result sets
> "Dave Cramer" <[EMAIL PROTECTED]> writes:
> > Can the version # of the row be made available to the client?
>
> There is no "version # of the row" in postgres, unless you set up such
> a thing for yourself (which could certainly be done, using triggers).

And in addition there is no row version in SQL in general. So my
question is whether it is actually intended to solve updateable result
sets with proprietary row versions, or whether someone implemented it
that way to optimize concurrent access for another DB system, one that
blocks readers?

Andreas
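For context, the trigger-maintained version column Tom mentions is typically used for optimistic concurrency: the client remembers the version it read and re-checks it at update time. A minimal in-memory sketch of that check (illustrative only — the table contents, field names, and functions are all invented, and a real implementation would do this inside the database):

```python
class StaleRowError(Exception):
    """Raised when another writer bumped the row version first."""

# In-memory stand-in for a table with a trigger-maintained version column.
rows = {1: {"name": "widget", "price": 10, "version": 1}}

def update_row(row_id, new_values, expected_version):
    """Apply an update only if the row version still matches what the
    client read earlier -- the check an updateable result set needs."""
    row = rows[row_id]
    if row["version"] != expected_version:
        raise StaleRowError(f"row {row_id} changed concurrently")
    row.update(new_values)
    row["version"] += 1       # the "trigger" bumping the version
    return row

snapshot = dict(rows[1])                            # client reads the row
update_row(1, {"price": 12}, snapshot["version"])   # matches: succeeds
try:
    update_row(1, {"price": 15}, snapshot["version"])  # stale snapshot
except StaleRowError as err:
    print("update rejected:", err)
```

This also shows why the scheme suits systems where readers would otherwise block writers: the version check replaces holding a read lock across the user's think time.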