Re: [HACKERS] FUNC_MAX_ARGS benchmarks
Bruce Momjian wrote: > Tom Lane wrote: > >>Bruce Momjian <[EMAIL PROTECTED]> writes: >> >>>OK, time to get moving folks. Looks like the increase in the function >>>args to 32 and the NAMEDATALEN to 128 has been sufficiently tested. >> >>I'm convinced by Joe's numbers that FUNC_MAX_ARGS = 32 shouldn't hurt >>too much. But have we done equivalent checks on NAMEDATALEN? In >>particular, do we know what it does to the size of template1? > > > No, I thought we saw the number, was 30%? No, we did a test for 64. > Can someone get us that number for 128? > These are all with FUNC_MAX_ARGS = 16. #define NAMEDATALEN 32 du -h --max-depth=1 /opt/data/pgsql/data/base/ 2.7M/opt/data/pgsql/data/base/1 2.7M/opt/data/pgsql/data/base/16862 2.7M/opt/data/pgsql/data/base/16863 2.7M/opt/data/pgsql/data/base/16864 3.2M/opt/data/pgsql/data/base/16865 2.7M/opt/data/pgsql/data/base/16866 2.7M/opt/data/pgsql/data/base/17117 19M /opt/data/pgsql/data/base #define NAMEDATALEN 64 du -h --max-depth=1 /opt/data/pgsql/data/base/ 3.0M/opt/data/pgsql/data/base/1 3.0M/opt/data/pgsql/data/base/16863 3.0M/opt/data/pgsql/data/base/16864 3.0M/opt/data/pgsql/data/base/16865 3.5M/opt/data/pgsql/data/base/16866 3.0M/opt/data/pgsql/data/base/16867 19M /opt/data/pgsql/data/base #define NAMEDATALEN 128 du -h --max-depth=1 /opt/data/pgsql/data/base/ 3.8M/opt/data/pgsql/data/base/1 3.8M/opt/data/pgsql/data/base/16863 3.8M/opt/data/pgsql/data/base/16864 3.8M/opt/data/pgsql/data/base/16865 4.4M/opt/data/pgsql/data/base/16866 3.8M/opt/data/pgsql/data/base/16867 23M /opt/data/pgsql/data/base Joe ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Set 'o patches
Thomas Lockhart <[EMAIL PROTECTED]> writes: > I've committed changes to do the following: > o Fix buffer overrun possibilities in date/time handling > o Handle fixed-length char and bit literals > o Implement IS OF type predicate > o Define macros to manipulate date/time typmod values > o Map hex string literals to bit string type (may change later) > o Map CREATE TABLE/OF to inheritance. May change later > o Implement WAL log file location support using "-X" and PGXLOG Would it be out of line to question the fact that none of these commit messages showed any documentation updates? regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] Set 'o patches
I've committed changes to do the following: o Fix buffer overrun possibilities in date/time handling o Handle fixed-length char and bit literals o Implement IS OF type predicate o Define macros to manipulate date/time typmod values o Map hex string literals to bit string type (may change later) o Map CREATE TABLE/OF to inheritance. May change later o Implement WAL log file location support using "-X" and PGXLOG - Thomas ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] cvs changes and broken links
On Sat, 3 Aug 2002, Joe Conway wrote: > I couldn't keep up with the list traffic this week, but I thought I saw > enough to convince me that after it was all said and done, I would still > be able to do `cvs co pgsql`. I'm finding today that after using cvsup > to sync up, I can no longer checkout pgsql, but pgsql-server instead. Is > this intended, or are there more changes left to be made? Okay, I've tested both using the regular CVS and the anoncvs servers, and both co pgsql just fine ... can you email me (off list) the errors you are seeing, as well as your CVSROOT? ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] FUNC_MAX_ARGS benchmarks
Bruce Momjian wrote: > Tom Lane wrote: >>I'm convinced by Joe's numbers that FUNC_MAX_ARGS = 32 shouldn't hurt >>too much. But have we done equivalent checks on NAMEDATALEN? In >>particular, do we know what it does to the size of template1? > No, I thought we saw the number, was 30%? No, we did a test for 64. > Can someone get us that number for 128? > I'll do 32, 64, and 128 and report back on template1 size. Joe ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Table inheritance versus views
On 3 Aug 2002, Hannu Krosing wrote: > hannu=# create table animal (name text, legcount int); > CREATE > hannu=# insert into animal values('pig',4); > INSERT 34183 1 > hannu=# select * from animal; > name | legcount > --+-- > pig |4 > (1 row) > > hannu=# create table bird (wingcount int) inherits (animal); > CREATE > hannu=# insert into bird values('hen',2,2); > INSERT 34189 1 > hannu=# select * from animal; > name | legcount > --+-- > pig |4 > hen |2 > (2 rows) You can do this just as well with views. In posgres, it's harder only because you're forced to create rules for updating views. But it's possible to have the system automatically do the right thing. cjs -- Curt Sampson <[EMAIL PROTECTED]> +81 90 7737 2974 http://www.netbsd.org Don't you know, in this new Dark Age, we're all light. --XTC ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Why is MySQL more chosen over PostgreSQL
On 29 Jul 2002, Stephen Deasey wrote: > Table inheritance offers data model extensibility. New (derived) tables > can be added to the system, and will work with existing code that > opperates on the base tables, without having to hack up all the code. And why does this not work with the standard relational mechanism? (Where a "derived table" would be just another table with a foreign key pointing back to the base table.) cjs -- Curt Sampson <[EMAIL PROTECTED]> +81 90 7737 2974 http://www.netbsd.org Don't you know, in this new Dark Age, we're all light. --XTC ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] fate of CLUSTER command ?
"Christopher Kings-Lynne" <[EMAIL PROTECTED]> writes: >> Also, is there any value to contrib/fulltextindex now that we have >> contrib/tsearch? > I haven't looked at tsearch yet, but I expect it's way better than > fulltextindex. However there's more than a few of us using fulltextindex, > so I think it will need to stay for some while. Right, at least a couple releases. > I'm working on a new version of it for 7.3. What have you got in mind? regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] fate of CLUSTER command ?
> Yes, I have always liked CLUSTER with full text searches because you are > usually hitting multiple rows with a single equaltiy restriction, and > CLUSTER puts all the hits on the same page. > > If you look in contrib/fulltextindex, you will see mention of CLUSTER in > the README. It may make sense to add that to your documentation. > > Also, is there any value to contrib/fulltextindex now that we have > contrib/tsearch? I haven't looked at tsearch yet, but I expect it's way better than fulltextindex. However there's more than a few of us using fulltextindex, so I think it will need to stay for some while. I'm working on a new version of it for 7.3. I can put pointers in the README about checking out tsearch... Chris ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Please, apply ltree patch
Patch applied. Thanks. --- Oleg Bartunov wrote: > Bruce, > > please find attached patch to current CVS ( contrib/ltree ) > > Changes: > > July 31, 2002 >Now works on 64-bit platforms. >Added function lca - lowest common ancestor >Version for 7.2 is distributed as separate package - >http://www.sai.msu.su/~megera/postgres/gist/ltree/ltree-7.2.tar.gz > > > Regards, > Oleg > _ > Oleg Bartunov, sci.researcher, hostmaster of AstroNet, > Sternberg Astronomical Institute, Moscow University (Russia) > Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ > phone: +007(095)939-16-83, +007(095)939-23-83 Content-Description: [ Attachment, skipping... ] -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] [PATCHES] START TRANSACTION
Alvaro Herrera <[EMAIL PROTECTED]> writes: > That makes me wonder: should I produce some regression tests for > CLUSTER? It'd be a good thing. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Patch for "Bug of PL/pgSQL parser"
Patch rejected. Tom Lane pointed out some mistakes in this patch, and the patch does not show any corrections. --- eutm wrote: >Some weeks ago i wrote about one problem(called as > "Bug of PL/pgSQL parser"): > > "eutm" <[EMAIL PROTECTED]> writes: > > Dear Sirs!:)I encounted one small problem,working with > > PostgreSQL 7.3devel.It can look a > > bit strange,but i have to use whitespaces in names of > databases,tables,fields > > and so on(like "roomno jk").It's possible to create them all and work > with them > > (INSERT,DELETE,UPDATE),but PL/pgSQL parser(compiler ?) can't execute such > > statements... > > Today i send a simple patch(my first:)). > Regards,Eugene. > > > > > [ Attachment, skipping... ] > > ---(end of broadcast)--- > TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Why is MySQL more chosen over PostgreSQL?
On 3 Aug 2002, Hannu Krosing wrote: > On Sat, 2002-08-03 at 16:32, Curt Sampson wrote: > > On 2 Aug 2002, Hannu Krosing wrote: > > > > Perhaps this is the problem. I disagree that it's a "higher" level. > > I don't mean "morally higher" ;) > Just more concise and easier to grasp, same as VIEW vs. TABLE + ON xxx > DO INSTEAD rules. That's because we don't do a good job of implementing updatable views. Views ought to be as fully updatable as possible given the definition, without having to define rules for doing this. Simple views such as CREATE TABLE tab1 ( id int, foo text ) CREATE TABLE tab2 ( id int, bar text ) CREATE VIEW something AS SELECT tab1.id, tab1.foo, tab2.bar FROM tab1, tab2 WHERE tab1.id = tab2.id ought to be completely updatable without any special rules. For further info see the detailed discussion of this in Date's database textbook. > That is the same way that C is "higher" than ASM and ASM is higher than > writing code directly using hex editor. No, this is the same way that Smalltalk is "higher" than Lisp. (I.e., it isn't.) cjs -- Curt Sampson <[EMAIL PROTECTED]> +81 90 7737 2974 http://www.netbsd.org Don't you know, in this new Dark Age, we're all light. --XTC ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [PATCHES] START TRANSACTION
Peter Eisentraut dijo: > Neil Conway writes: > > > The attached patch implements START TRANSACTION, per SQL99. The > > functionality of the command is basically identical to that of > > BEGIN; it just accepts a few extra options (only one of which > > PostgreSQL currently implements), and is standards-compliant. > > The patch includes a simple regression test and documentation. > > Very nice patch, but I don't think we need the regression test. It's a > bit too simple. That makes me wonder: should I produce some regression tests for CLUSTER? -- Alvaro Herrera () "Investigación es lo que hago cuando no sé lo que estoy haciendo" (Wernher von Braun) ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] getpid() function
Bruce Momjian <[EMAIL PROTECTED]> writes: > Sorry, I mean pg_backend_pid. Okay, I was unsure if that was a typo or not. > I could expose backend_id but it may > confuse people so pid is probably better. If you had the id, you could > use pg_stat_get_backend_pid() to get the pid. Yeah, I thought of suggesting pg_backend_id() to return MyBackendId and then pg_stat_get_backend_pid() to get the PID, but was stopped by the thought that this breaks down if the stats collector isn't running. While I'm not convinced that there's any need for backend PID that's not connected to looking at stats-collector results, it's probably foolish to set up a mechanism that doesn't work outside that context. Let's go with pg_backend_pid(). regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] CLUSTER and indisclustered
Bruce Momjian <[EMAIL PROTECTED]> writes: > Tom, should we be updating that flag after we CLUSTER instead of > requiring an ANALYZE after the CLUSTER? Could do that I suppose, but I'm not super-excited about it. ANALYZE is quite cheap these days (especially in comparison to CLUSTER ;-)). I'd settle for a note in the CLUSTER docs that recommends a subsequent ANALYZE --- this seems no different from recommending ANALYZE after bulk data load or other major update of a table. regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] getpid() function
Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > >> Let's take it out and wait to see if anyone really still wants it. > > > Just when I am ready to throw it away, I come up with a use for the > > function: > > > test=> select * from pg_stat_activity where procpid != backend_pid(); > > > This shows all activity _except_ my session, which pgmonitor or others > > may want to use, and I can think of no other way to do it. > > Hm. Actually this seems like an argument for exposing MyBackendId, since > what pg_stat_activity really depends on is BackendId. But as that view > is presently defined, you'd not be able to write > WHERE backendid = my_backend_id() > because the view doesn't expose backendid. Yes. > > Comments? Maybe this is why it should be called pg_backend_id and put > > in the stat section. > > *Please* don't call it pg_backend_id --- that invites confusion with > BackendId which is a different thing. > > I'd suggest pg_backend_pid. Sorry, I mean pg_backend_pid. I could expose backend_id but it may confuse people so pid is probably better. If you had the id, you could use pg_stat_get_backend_pid() to get the pid. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] CLUSTER and indisclustered
Gavin Sherry <[EMAIL PROTECTED]> writes: > On Sat, 3 Aug 2002, Tom Lane wrote: >> AFAICT you're assuming that the table is *exactly* ordered by the >> clustered attribute. While this is true at the instant CLUSTER >> completes, the exact ordering will be destroyed by the first insert or >> update :-(. I can't see much value in creating a whole new scan type > Sorry, I meant to say that heap_insert() etc would need to set > indisclustered to false. <> You could do that, but only if you are prepared to invent a mechanism that will instantly invalidate any existing query plans that assume the clustered ordering is good. Up to now we've only allowed the planner to make decisions that impact performace, not correctness of the result. I'm uncomfortable with the idea that a "clusterscan" plan could silently return wrong answers after someone else updates the table and doesn't tell us they did. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] getpid() function
Bruce Momjian <[EMAIL PROTECTED]> writes: >> Let's take it out and wait to see if anyone really still wants it. > Just when I am ready to throw it away, I come up with a use for the > function: > test=> select * from pg_stat_activity where procpid != backend_pid(); > This shows all activity _except_ my session, which pgmonitor or others > may want to use, and I can think of no other way to do it. Hm. Actually this seems like an argument for exposing MyBackendId, since what pg_stat_activity really depends on is BackendId. But as that view is presently defined, you'd not be able to write WHERE backendid = my_backend_id() because the view doesn't expose backendid. > Comments? Maybe this is why it should be called pg_backend_id and put > in the stat section. *Please* don't call it pg_backend_id --- that invites confusion with BackendId which is a different thing. I'd suggest pg_backend_pid. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] CLUSTER and indisclustered
Gavin Sherry wrote: > > Gavin, is that a big win compared to just using the index and looping > > through the entries, knowing that the index matches are on the same > > page, and the heap matches are on the same page. > > Bruce, > > It would cut out the index over head. Besides at (1) (above) we would have > determined that an index scan was too expensive and we would be using a > SeqScan instead. This would just be faster, since a) we would locate the > tuples more intelligently b) we wouldn't need to scan the whole heap once > we'd found all tuples matching the scan key. Yes, but in a clustered table, an index scan is _never_ (?) more expensive than a sequential scan, at least if the optimizer is working correctly. Index scans are slower only because they assume random heap access, but with a clustered table, there is no random heap access. The index takes to right to the spot to start. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] CLUSTER and indisclustered
Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > Tom, should we be updating that flag after we CLUSTER instead of > > requiring an ANALYZE after the CLUSTER? > > Could do that I suppose, but I'm not super-excited about it. ANALYZE is > quite cheap these days (especially in comparison to CLUSTER ;-)). I'd > settle for a note in the CLUSTER docs that recommends a subsequent > ANALYZE --- this seems no different from recommending ANALYZE after bulk > data load or other major update of a table. OK. I am sure it is not obvious to people to ANALYZE because the data in their table hasn't changed, just the ordering. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] CLUSTER and indisclustered
On Sat, 3 Aug 2002, Bruce Momjian wrote: > Gavin Sherry wrote: > > Hi all, > > > > It occured to me on the plane home that now that CLUSTER is fixed we may > > be able to put pg_index.indisclustered to use. If CLUSTER was to set > > indisclustered to true when it clusters a heap according to the given > > index, we could speed up sequantial scans. There are two possible ways. > > > > 1) Planner determines that a seqscan is appropriate *and* the retrieval is > > qualified by the key(s) of one of the relation's indexes > > 2) Planner determines that the relation is clustered on disk according to > > the index over the key(s) used to qualify the retrieval > > 3) Planner sets an appropriate nodeTag for the retrieval (SeqScanCluster?) > > 4) ExecProcNode() calls some new scan routine, ExecSeqScanCluster() ? > > 5) ExecSeqScanCluster() calls ExecScan() with a new ExecScanAccessMtd (ie, > > different from SeqNext) called SeqClusterNext > > 6) SeqClusterNext() has all the heapgettup() logic with two > > exceptions: a) we find the first tuple more intelligently (instead of > > scanning from the first page) b) if we have found tuple(s) matching the > > ScanKey when we encounter an non-matching tuple (via > > HeapTupleSatisfies() ?) we return a NULL'ed out tuple, terminating the > > scan > > Gavin, is that a big win compared to just using the index and looping > through the entries, knowing that the index matches are on the same > page, and the heap matches are on the same page. Bruce, It would cut out the index over head. Besides at (1) (above) we would have determined that an index scan was too expensive and we would be using a SeqScan instead. This would just be faster, since a) we would locate the tuples more intelligently b) we wouldn't need to scan the whole heap once we'd found all tuples matching the scan key. Gavin ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] getpid() function
Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > As I remember, most cases where people have recently been asking for > > backend pid were related to temp tables because they were named by pid. > > Ah, good point. > > > I don't think they are anymore. (?) > > Check. > > > We can do two things. We can either rename it to pg_backend_pid and > > move it to the statistics section in the docs, where the backend pids of > > all active backends are available, or remove my code additions and see > > if anyone asks for it in 7.3. > > Let's take it out and wait to see if anyone really still wants it. Just when I am ready to throw it away, I come up with a use for the function: test=> select * from pg_stat_activity where procpid != backend_pid(); This shows all activity _except_ my session, which pgmonitor or others may want to use, and I can think of no other way to do it. Comments? Maybe this is why it should be called pg_backend_id and put in the stat section. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] CLUSTER and indisclustered
Gavin Sherry wrote: > Hi all, > > It occured to me on the plane home that now that CLUSTER is fixed we may > be able to put pg_index.indisclustered to use. If CLUSTER was to set > indisclustered to true when it clusters a heap according to the given > index, we could speed up sequantial scans. There are two possible ways. > > 1) Planner determines that a seqscan is appropriate *and* the retrieval is > qualified by the key(s) of one of the relation's indexes > 2) Planner determines that the relation is clustered on disk according to > the index over the key(s) used to qualify the retrieval > 3) Planner sets an appropriate nodeTag for the retrieval (SeqScanCluster?) > 4) ExecProcNode() calls some new scan routine, ExecSeqScanCluster() ? > 5) ExecSeqScanCluster() calls ExecScan() with a new ExecScanAccessMtd (ie, > different from SeqNext) called SeqClusterNext > 6) SeqClusterNext() has all the heapgettup() logic with two > exceptions: a) we find the first tuple more intelligently (instead of > scanning from the first page) b) if we have found tuple(s) matching the > ScanKey when we encounter an non-matching tuple (via > HeapTupleSatisfies() ?) we return a NULL'ed out tuple, terminating the > scan Gavin, is that a big win compared to just using the index and looping through the entries, knowing that the index matches are on the same page, and the heap matches are on the same page. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] CLUSTER and indisclustered
Tom Lane wrote: > Gavin Sherry <[EMAIL PROTECTED]> writes: > > It occured to me on the plane home that now that CLUSTER is fixed we may > > be able to put pg_index.indisclustered to use. If CLUSTER was to set > > indisclustered to true when it clusters a heap according to the given > > index, we could speed up sequantial scans. > > AFAICT you're assuming that the table is *exactly* ordered by the > clustered attribute. While this is true at the instant CLUSTER > completes, the exact ordering will be destroyed by the first insert or > update :-(. I can't see much value in creating a whole new scan type > that's only usable on a perfectly-clustered table. > > The existing approach to making the planner smart about clustered tables > is to compute a physical-vs-logical-order-correlation statistic and use > that to adjust the estimated cost of indexscans. I believe this is a > more robust approach than considering a table to be "clustered" or "not > clustered", since it can deal with the gradual degradation of clustered > order over time. However, I will not make any great claims for the > specific equations currently used for this purpose --- they're surely in > need of improvement. Feel free to take a look and see if you have any > ideas. The collection of the statistic is in commands/analyze.c and the > use of it is in optimizer/path/costsize.c. Tom, should we be updating that flag after we CLUSTER instead of requiring an ANALYZE after the CLUSTER? -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] CLUSTER and indisclustered
On Sat, 3 Aug 2002, Tom Lane wrote: > Gavin Sherry <[EMAIL PROTECTED]> writes: > > It occured to me on the plane home that now that CLUSTER is fixed we may > > be able to put pg_index.indisclustered to use. If CLUSTER was to set > > indisclustered to true when it clusters a heap according to the given > > index, we could speed up sequantial scans. > > AFAICT you're assuming that the table is *exactly* ordered by the > clustered attribute. While this is true at the instant CLUSTER > completes, the exact ordering will be destroyed by the first insert or > update :-(. I can't see much value in creating a whole new scan type Sorry, I meant to say that heap_insert() etc would need to set indisclustered to false. I do see some worth in this however. Naturally, in a situation where a database is being modified very often this is of little value. However, for applications focussed on analysing large amounts of static data this could increase performance significantly. Once I get some time I will attempt to explore this further in `diff -c` format :-). Gavin ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] FUNC_MAX_ARGS benchmarks
Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > OK, time to get moving folks. Looks like the increase in the function > > args to 32 and the NAMEDATALEN to 128 has been sufficiently tested. > > I'm convinced by Joe's numbers that FUNC_MAX_ARGS = 32 shouldn't hurt > too much. But have we done equivalent checks on NAMEDATALEN? In > particular, do we know what it does to the size of template1? No, I thought we saw the number, was 30%? No, we did a test for 64. Can someone get us that number for 128? -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] PITR, checkpoint, and local relations
Sounds like a win all around; make PITR easier and temp tables faster. --- Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > There is debate on whether the local buffers are even valuable > > considering the headache they cause in other parts of the system. > > More specifically, the issue is that when (if) you commit, the contents > of the new table now have to be pushed out to shared storage. This is > moderately annoying in itself (among other things, it implies fsync'ing > those tables before commit). But the real reason it comes up now is > that the proposed PITR scheme can't cope gracefully with tables that > are suddenly there but weren't participating in checkpoints before. > > It looks to me like we should stop using local buffers for ordinary > tables that happen to be in their first transaction of existence. > But, per Vadim's suggestion, we shouldn't abandon the local buffer > manager altogether. What we could and should use it for is TEMP tables, > which have no need to be checkpointed or WAL-logged or fsync'd or > accessible to other backends *ever*. Also, a temp table can leave > blocks in local buffers across transactions, which makes local buffers > considerably more useful than they are now. > > If temp tables didn't use the shared bufmgr nor did updates to them get > WAL-logged, they'd be noticeably more efficient than plain tables, which > IMHO would be a Good Thing. Such tables would be essentially invisible > to WAL and PITR (at least their contents would be --- I assume we'd > still log file creation and deletion). But I can't see anything wrong > with that. > > In short, the proposal runs something like this: > > * Regular tables that happen to be in their first transaction of > existence are not treated differently from any other regular table so > far as buffer management or WAL or PITR go. (rd_myxactonly either goes > away or is used for much less than it is now.) > > * TEMP tables use the local buffer manager for their entire existence. > (This probably means adding an "rd_istemp" flag to relcache entries, but > I can't see anything wrong with that.) > > * Local bufmgr semantics are twiddled to reflect this reality --- in > particular, data in local buffers can be held across transactions, there > is no end-of-transaction write (much less fsync). A TEMP table that > isn't too large might never touch disk at all. > > * Data operations in TEMP tables do not get WAL-logged, nor do we > WAL-log page images of local-buffer pages. > > > These changes seem very attractive to me even without regard for making > the world safer for PITR. I'm willing to volunteer to make them happen, > if there are no objections. > > regards, tom lane > -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] CLUSTER and indisclustered
Gavin Sherry <[EMAIL PROTECTED]> writes: > It occured to me on the plane home that now that CLUSTER is fixed we may > be able to put pg_index.indisclustered to use. If CLUSTER was to set > indisclustered to true when it clusters a heap according to the given > index, we could speed up sequantial scans. AFAICT you're assuming that the table is *exactly* ordered by the clustered attribute. While this is true at the instant CLUSTER completes, the exact ordering will be destroyed by the first insert or update :-(. I can't see much value in creating a whole new scan type that's only usable on a perfectly-clustered table. The existing approach to making the planner smart about clustered tables is to compute a physical-vs-logical-order-correlation statistic and use that to adjust the estimated cost of indexscans. I believe this is a more robust approach than considering a table to be "clustered" or "not clustered", since it can deal with the gradual degradation of clustered order over time. However, I will not make any great claims for the specific equations currently used for this purpose --- they're surely in need of improvement. Feel free to take a look and see if you have any ideas. The collection of the statistic is in commands/analyze.c and the use of it is in optimizer/path/costsize.c. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] getpid() function
Bruce Momjian <[EMAIL PROTECTED]> writes: > As I remember, most cases where people have recently been asking for > backend pid were related to temp tables because they were named by pid. Ah, good point. > I don't think they are anymore. (?) Check. > We can do two things. We can either rename it to pg_backend_pid and > move it to the statistics section in the docs, where the backend pids of > all active backends are available, or remove my code additions and see > if anyone asks for it in 7.3. Let's take it out and wait to see if anyone really still wants it. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] FUNC_MAX_ARGS benchmarks
Bruce Momjian <[EMAIL PROTECTED]> writes: > OK, time to get moving folks. Looks like the increase in the function > args to 32 and the NAMEDATALEN to 128 has been sufficiently tested. I'm convinced by Joe's numbers that FUNC_MAX_ARGS = 32 shouldn't hurt too much. But have we done equivalent checks on NAMEDATALEN? In particular, do we know what it does to the size of template1? regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] PITR, checkpoint, and local relations
Bruce Momjian <[EMAIL PROTECTED]> writes: > There is debate on whether the local buffers are even valuable > considering the headache they cause in other parts of the system. More specifically, the issue is that when (if) you commit, the contents of the new table now have to be pushed out to shared storage. This is moderately annoying in itself (among other things, it implies fsync'ing those tables before commit). But the real reason it comes up now is that the proposed PITR scheme can't cope gracefully with tables that are suddenly there but weren't participating in checkpoints before. It looks to me like we should stop using local buffers for ordinary tables that happen to be in their first transaction of existence. But, per Vadim's suggestion, we shouldn't abandon the local buffer manager altogether. What we could and should use it for is TEMP tables, which have no need to be checkpointed or WAL-logged or fsync'd or accessible to other backends *ever*. Also, a temp table can leave blocks in local buffers across transactions, which makes local buffers considerably more useful than they are now. If temp tables didn't use the shared bufmgr nor did updates to them get WAL-logged, they'd be noticeably more efficient than plain tables, which IMHO would be a Good Thing. Such tables would be essentially invisible to WAL and PITR (at least their contents would be --- I assume we'd still log file creation and deletion). But I can't see anything wrong with that. In short, the proposal runs something like this: * Regular tables that happen to be in their first transaction of existence are not treated differently from any other regular table so far as buffer management or WAL or PITR go. (rd_myxactonly either goes away or is used for much less than it is now.) * TEMP tables use the local buffer manager for their entire existence. (This probably means adding an "rd_istemp" flag to relcache entries, but I can't see anything wrong with that.) * Local bufmgr semantics are twiddled to reflect this reality --- in particular, data in local buffers can be held across transactions, there is no end-of-transaction write (much less fsync). A TEMP table that isn't too large might never touch disk at all. * Data operations in TEMP tables do not get WAL-logged, nor do we WAL-log page images of local-buffer pages. These changes seem very attractive to me even without regard for making the world safer for PITR. I'm willing to volunteer to make them happen, if there are no objections. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[HACKERS] CLUSTER and indisclustered
Hi all, It occured to me on the plane home that now that CLUSTER is fixed we may be able to put pg_index.indisclustered to use. If CLUSTER was to set indisclustered to true when it clusters a heap according to the given index, we could speed up sequantial scans. There are two possible ways. 1) Planner determines that a seqscan is appropriate *and* the retrieval is qualified by the key(s) of one of the relation's indexes 2) Planner determines that the relation is clustered on disk according to the index over the key(s) used to qualify the retrieval 3) Planner sets an appropriate nodeTag for the retrieval (SeqScanCluster?) 4) ExecProcNode() calls some new scan routine, ExecSeqScanCluster() ? 5) ExecSeqScanCluster() calls ExecScan() with a new ExecScanAccessMtd (ie, different from SeqNext) called SeqClusterNext 6) SeqClusterNext() has all the heapgettup() logic with two exceptions: a) we find the first tuple more intelligently (instead of scanning from the first page) b) if we have found tuple(s) matching the ScanKey when we encounter an non-matching tuple (via HeapTupleSatisfies() ?) we return a NULL'ed out tuple, terminating the scan Any reason this isn't possible? Any reason it couldn't dramatically speed up the performance of the type of query i've mentioned? Gavin ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] FUNC_MAX_ARGS benchmarks
Hannu Krosing <[EMAIL PROTECTED]> writes: >> Lack of btree index support for _oid would be the first hurdle. > Is that index really needed, or is it there just to enforce uniqueness ? Both. > Also, (imho ;) btree index support should be done for all array types > which have comparison ops for elements at once (with semantics similar > to string) not one by one for individual types. Fine, send a patch ;-) >> Even if we wanted to do that work, there'd be some serious breakage >> of client queries because of the historical differences in output format >> and subscripting. (oidvector indexes from 0, _oid from 1. Which is >> pretty bogus, but if the regression tests are anything to judge by there >> are probably a lot of queries out there that know this.) > I would guess that oidvector is sufficiently obscure type and that > nobody actually uses oidvector for user tables. No, you miss my point: client queries that do subscripting on proargtypes will break. Since the regression tests find this a useful thing to do, I suspect there are clients out there that do too. > But going to _oid will free us from arbitrary limits on argument count. I didn't say it wouldn't be a good idea in the long run. I'm saying I don't think it's happening for 7.3, given that Aug 31 is not that far away anymore and that a lot of cleanup work remains undone on other already-committed features. FUNC_MAX_ARGS=32 could happen for 7.3, though. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] getpid() function
As I remember, most cases where people have recently been asking for backend pid were related to temp tables because they were named by pid. I don't think they are anymore. (?) We can do two things. We can either rename it to pg_backend_pid and move it to the statistics section in the docs, where the backend pids of all active backends are available, or remove my code additions and see if anyone asks for it in 7.3. --- Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > > Perhaps a more relevant question is why are we cluttering the namespace > > > with any such function at all? What's the use case for it? > > > It was requested because it is exposed in libpq and people need it to > > generate unique names and stuff like that from within psql and > > functions. Seems like a valid use for the pid. > > The sole reason libpq exposes it is so that you can tell a self-notify > from an incoming notify. (ie, given you are LISTENing on a condition > that both you and other clients send NOTIFYs for, is this particular > message one that you sent yourself, or not? Compare the originator PID > in the NOTIFY message to your backend_pid to find out.) I put that > feature in back around 6.4, because it allowed some important > optimizations in an app I had that used LISTEN/NOTIFY a lot. > > Since NOTIFY messages aren't even visible at the SQL level, the above is > not a reason for making PIDs visible at the SQL level. > > I'm really dubious about using backend PID for the sort of purpose you > suggest. Unique names would be *much* more safely handled with, say, > a sequence generator. If you are not using libpq or another client > library that can give you a backend-PID API call, then very likely you > don't have a lot of control over the backend connection either, and > shouldn't assume that backend PID is going to be stable for you. > (Think about pooled connections in a webserver, etc.) > > Finally, the most legitimate uses of PID (like discovering a backend PID > to send SIGINT to, when some client query is running wild) are not > supported at all by a function that can only return your own backend's > PID, because that's seldom the PID you need to know. The > pg_stat_activity view handles this much better. > > So I'm still unconvinced that we need or want this ... > > regards, tom lane > > ---(end of broadcast)--- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED]) > -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] FUNC_MAX_ARGS benchmarks
OK, time to get moving folks. Looks like the increase in the function args to 32 and the NAMEDATALEN to 128 has been sufficiently tested. Tom has some ideas on removing some memset() calls for function args to speed things up, but we don't have to wait for that go get going. The end of August is nearing. Is there any reason to delay the change further? --- Joe Conway wrote: > Tom Lane wrote: > > Did you happen to make any notes about the disk space occupied by the > > database? One thing I was worried about was the bloat that'd occur > > in pg_proc, pg_index, and pg_proc_proname_args_nsp_index. Aside from > > costing disk space, this would indirectly slow things down due to more > > I/O to read these tables --- an effect that probably your test couldn't > > measure, since it wasn't touching very many entries in any of those > > tables. > > #define INDEX_MAX_KEYS16 > #define FUNC_MAX_ARGSINDEX_MAX_KEYS > du -h --max-depth=1 /opt/data/pgsql/data/base/ > 2.7M/opt/data/pgsql/data/base/1 > 2.7M/opt/data/pgsql/data/base/16862 > 2.7M/opt/data/pgsql/data/base/16863 > 2.7M/opt/data/pgsql/data/base/16864 > 3.2M/opt/data/pgsql/data/base/16865 > 2.7M/opt/data/pgsql/data/base/16866 > 17M /opt/data/pgsql/data/base > > #define INDEX_MAX_KEYS32 > #define FUNC_MAX_ARGSINDEX_MAX_KEYS > du -h --max-depth=1 /opt/data/pgsql/data/base/ > 3.1M/opt/data/pgsql/data/base/1 > 3.1M/opt/data/pgsql/data/base/16862 > 3.1M/opt/data/pgsql/data/base/16863 > 3.1M/opt/data/pgsql/data/base/16864 > 3.6M/opt/data/pgsql/data/base/16865 > 3.1M/opt/data/pgsql/data/base/16866 > 19M /opt/data/pgsql/data/base > > #define INDEX_MAX_KEYS64 > #define FUNC_MAX_ARGSINDEX_MAX_KEYS > du -h --max-depth=1 /opt/data/pgsql/data/base/ > 3.9M/opt/data/pgsql/data/base/1 > 3.9M/opt/data/pgsql/data/base/16862 > 3.9M/opt/data/pgsql/data/base/16863 > 3.9M/opt/data/pgsql/data/base/16864 > 4.4M/opt/data/pgsql/data/base/16865 > 3.9M/opt/data/pgsql/data/base/16866 > 24M /opt/data/pgsql/data/base > > #define INDEX_MAX_KEYS128 > #define FUNC_MAX_ARGSINDEX_MAX_KEYS > du -h --max-depth=1 /opt/data/pgsql/data/base/ > 5.7M/opt/data/pgsql/data/base/1 > 5.7M/opt/data/pgsql/data/base/16862 > 5.7M/opt/data/pgsql/data/base/16863 > 5.7M/opt/data/pgsql/data/base/16864 > 6.3M/opt/data/pgsql/data/base/16865 > 5.7M/opt/data/pgsql/data/base/16866 > 35M /opt/data/pgsql/data/base > > > Joe > > > ---(end of broadcast)--- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED]) > -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] fate of CLUSTER command ?
Also, let me add that CLUSTER in 7.3 will be fully functional because we will no longer be changing the oid of the table during cluster. This will allow people to use CLUSTER more frequently/safely. --- Bruce Momjian wrote: > Oleg Bartunov wrote: > > I just tried CLUSTER command at fts.postgresql.org to cluster > > fts index and got very visual performance win. Unfortunately > > I had to restore permissions and recreate other indices by hand. > > So, I'm interested what's a future of CLUSTER command ? > > Yes, I have always liked CLUSTER with full text searches because you are > usually hitting multiple rows with a single equaltiy restriction, and > CLUSTER puts all the hits on the same page. > > If you look in contrib/fulltextindex, you will see mention of CLUSTER in > the README. It may make sense to add that to your documentation. > > Also, is there any value to contrib/fulltextindex now that we have > contrib/tsearch? > > -- > Bruce Momjian| http://candle.pha.pa.us > [EMAIL PROTECTED] | (610) 853-3000 > + If your life is a hard drive, | 830 Blythe Avenue > + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 > > ---(end of broadcast)--- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED]) > -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] fate of CLUSTER command ?
Oleg Bartunov wrote: > I just tried CLUSTER command at fts.postgresql.org to cluster > fts index and got very visual performance win. Unfortunately > I had to restore permissions and recreate other indices by hand. > So, I'm interested what's a future of CLUSTER command ? Yes, I have always liked CLUSTER with full text searches because you are usually hitting multiple rows with a single equaltiy restriction, and CLUSTER puts all the hits on the same page. If you look in contrib/fulltextindex, you will see mention of CLUSTER in the README. It may make sense to add that to your documentation. Also, is there any value to contrib/fulltextindex now that we have contrib/tsearch? -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] FUNC_MAX_ARGS benchmarks
Hannu Krosing wrote: > On Sat, 2002-08-03 at 23:20, Tom Lane wrote: > > Hannu Krosing <[EMAIL PROTECTED]> writes: > > > How hard would it be to change pg_proc.proargtypes from oidvector to _oid > > > > Lack of btree index support for _oid would be the first hurdle. > > Is that index really needed, or is it there just to enforce uniqueness ? Needed to look up functions based on their args. The big issue of using arrays is that we don't have cache capability for variable length fields. Until we get that, we are stuck with NAMEDATALEN taking the full length, and oidvector taking the full length. And if we went with variable length, there may be a performance penalty. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] PITR, checkpoint, and local relations
Christopher Kings-Lynne wrote: > > The main area where it seems to get heavy use is during index builds, > > and for 'CREATE TABLE AS SELECT...'. > > > > So I will remove the local buffer manager as part of the PITR patch, > > unless there is further objection. > > Would someone mind filling me in as to what the local bugger manager is and > how it is different (and not useful) compared to the shared buffer manager? Sure. I think I can handle that. When you create a table in a transaction, there isn't any committed state to the table yet, so any table modifications are kept in a local buffer, which is local memory to the backend(?). No one needs to see it because it isn't visible to anyone yet. Same for indexes. Anyway, the WAL activity doesn't handle local buffers the same as shared buffers because there is no crisis if the system crashes. There is debate on whether the local buffers are even valuable considering the headache they cause in other parts of the system. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] FUNC_MAX_ARGS benchmarks
On Sat, 2002-08-03 at 23:20, Tom Lane wrote: > Hannu Krosing <[EMAIL PROTECTED]> writes: > > How hard would it be to change pg_proc.proargtypes from oidvector to _oid > > Lack of btree index support for _oid would be the first hurdle. Is that index really needed, or is it there just to enforce uniqueness ? Would the lookup not be in some internal cache most of the time ? Also, (imho ;) btree index support should be done for all array types which have comparison ops for elements at once (with semantics similar to string) not one by one for individual types. It should be in some ways quite similar to multi-key indexes, so perhaps some code could be borrowed from there. Otoh, It should be a SMOP to write support for b-tree indexes just for _oid :-p , most likely one could re-use code from oidvector ;) > Even if we wanted to do that work, there'd be some serious breakage > of client queries because of the historical differences in output format > and subscripting. (oidvector indexes from 0, _oid from 1. Which is > pretty bogus, but if the regression tests are anything to judge by there > are probably a lot of queries out there that know this.) I would guess that oidvector is sufficiently obscure type and that nobody actually uses oidvector for user tables. It is also only used in two tables and one index in system tables: hannu=# select relname,relkind from pg_class where oid in ( hannu-# select attrelid from pg_attribute where atttypid=30); relname | relkind -+- pg_index| r pg_proc_proname_narg_type_index | i pg_proc | r (3 rows) > > This could also get the requested 2% speedup, > > I'm not convinced that _oid would be faster. Neither am I, but it _may_ be that having generally shorter oid arrays wins us enough ;) > All in all, it doesn't seem worth the trouble compared to just kicking > FUNC_MAX_ARGS up a notch. At least not right now. I think we've > created quite enough system-catalog changes for one release cycle ;-) But going to _oid will free us from arbitrary limits on argument count. Or at least from small arbitrary limits, as there will probably still be the at-least-three-btree-keys-must-fit-in-page limit (makes > 2600 args/function) and maybe some other internal limits as well. -- Hannu ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] FUNC_MAX_ARGS benchmarks
Tom Lane wrote: > Did you happen to make any notes about the disk space occupied by the > database? One thing I was worried about was the bloat that'd occur > in pg_proc, pg_index, and pg_proc_proname_args_nsp_index. Aside from > costing disk space, this would indirectly slow things down due to more > I/O to read these tables --- an effect that probably your test couldn't > measure, since it wasn't touching very many entries in any of those > tables. #define INDEX_MAX_KEYS16 #define FUNC_MAX_ARGSINDEX_MAX_KEYS du -h --max-depth=1 /opt/data/pgsql/data/base/ 2.7M/opt/data/pgsql/data/base/1 2.7M/opt/data/pgsql/data/base/16862 2.7M/opt/data/pgsql/data/base/16863 2.7M/opt/data/pgsql/data/base/16864 3.2M/opt/data/pgsql/data/base/16865 2.7M/opt/data/pgsql/data/base/16866 17M /opt/data/pgsql/data/base #define INDEX_MAX_KEYS32 #define FUNC_MAX_ARGSINDEX_MAX_KEYS du -h --max-depth=1 /opt/data/pgsql/data/base/ 3.1M/opt/data/pgsql/data/base/1 3.1M/opt/data/pgsql/data/base/16862 3.1M/opt/data/pgsql/data/base/16863 3.1M/opt/data/pgsql/data/base/16864 3.6M/opt/data/pgsql/data/base/16865 3.1M/opt/data/pgsql/data/base/16866 19M /opt/data/pgsql/data/base #define INDEX_MAX_KEYS64 #define FUNC_MAX_ARGSINDEX_MAX_KEYS du -h --max-depth=1 /opt/data/pgsql/data/base/ 3.9M/opt/data/pgsql/data/base/1 3.9M/opt/data/pgsql/data/base/16862 3.9M/opt/data/pgsql/data/base/16863 3.9M/opt/data/pgsql/data/base/16864 4.4M/opt/data/pgsql/data/base/16865 3.9M/opt/data/pgsql/data/base/16866 24M /opt/data/pgsql/data/base #define INDEX_MAX_KEYS128 #define FUNC_MAX_ARGSINDEX_MAX_KEYS du -h --max-depth=1 /opt/data/pgsql/data/base/ 5.7M/opt/data/pgsql/data/base/1 5.7M/opt/data/pgsql/data/base/16862 5.7M/opt/data/pgsql/data/base/16863 5.7M/opt/data/pgsql/data/base/16864 6.3M/opt/data/pgsql/data/base/16865 5.7M/opt/data/pgsql/data/base/16866 35M /opt/data/pgsql/data/base Joe ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] fate of CLUSTER command ?
Oleg Bartunov dijo: > On Sat, 3 Aug 2002, Alvaro Herrera wrote: > > > Oleg Bartunov dijo: > > > > > I just tried CLUSTER command at fts.postgresql.org to cluster > > > fts index and got very visual performance win. Unfortunately > > > I had to restore permissions and recreate other indices by hand. > > > So, I'm interested what's a future of CLUSTER command ? > > > > I'm working on CLUSTER. I have a problem with dependency tracking right > > now that I need to get fixed before the patch gets accepted, but that > > shouldn't take long (hopefully). > > God news. Will it go to 7.3 ? In fact, I have just corrected the error and am submitting the patch for revision and possible inclusion. Please test it and check if it does what you need. Let me know if it doesn't, because it should. -- Alvaro Herrera () "Find a bug in a program, and fix it, and the program will work today. Show the program how to find and fix a bug, and the program will work forever" (Oliver Silfridge) ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] fate of CLUSTER command ?
On Sat, 3 Aug 2002, Alvaro Herrera wrote: > Oleg Bartunov dijo: > > > I just tried CLUSTER command at fts.postgresql.org to cluster > > fts index and got very visual performance win. Unfortunately > > I had to restore permissions and recreate other indices by hand. > > So, I'm interested what's a future of CLUSTER command ? > > I'm working on CLUSTER. I have a problem with dependency tracking right > now that I need to get fixed before the patch gets accepted, but that > shouldn't take long (hopefully). > > The patch supposedly fixes all the concerns about CLUSTER (permissions, > other indexes, inheritance). > God news. Will it go to 7.3 ? > Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
[HACKERS] cvs changes and broken links
I couldn't keep up with the list traffic this week, but I thought I saw enough to convince me that after it was all said and done, I would still be able to do `cvs co pgsql`. I'm finding today that after using cvsup to sync up, I can no longer checkout pgsql, but pgsql-server instead. Is this intended, or are there more changes left to be made? Also, as side note, the link for cvsup is broken: http://developer.postgresql.org/TODO/docs/cvsup.html and CVS tree Oragnization: http://developer.postgresql.org/TODO/docs/cvs-tree.html Thanks, Joe ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] fate of CLUSTER command ?
Oleg Bartunov dijo: > I just tried CLUSTER command at fts.postgresql.org to cluster > fts index and got very visual performance win. Unfortunately > I had to restore permissions and recreate other indices by hand. > So, I'm interested what's a future of CLUSTER command ? I'm working on CLUSTER. I have a problem with dependency tracking right now that I need to get fixed before the patch gets accepted, but that shouldn't take long (hopefully). The patch supposedly fixes all the concerns about CLUSTER (permissions, other indexes, inheritance). -- Alvaro Herrera () Officer Krupke, what are we to do? Gee, officer Krupke, Krup you! (West Side Story, "Gee, Officer Krupke") ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] FUNC_MAX_ARGS benchmarks
Hannu Krosing <[EMAIL PROTECTED]> writes: > How hard would it be to change pg_proc.proargtypes from oidvector to _oid Lack of btree index support for _oid would be the first hurdle. Even if we wanted to do that work, there'd be some serious breakage of client queries because of the historical differences in output format and subscripting. (oidvector indexes from 0, _oid from 1. Which is pretty bogus, but if the regression tests are anything to judge by there are probably a lot of queries out there that know this.) > This could also get the requested 2% speedup, I'm not convinced that _oid would be faster. All in all, it doesn't seem worth the trouble compared to just kicking FUNC_MAX_ARGS up a notch. At least not right now. I think we've created quite enough system-catalog changes for one release cycle ;-) regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] FUNC_MAX_ARGS benchmarks
On Sat, 2002-08-03 at 18:41, Tom Lane wrote: > Joe Conway <[EMAIL PROTECTED]> writes: > > I ran a crude test as follows (using a PHP script on the same machine. > > Nothing else going on at the same time): > > > do 100 times > >select 2+2+2+2+2+2+ ... iterated 9901 times > > > The results were as follows: > > INDEX_MAX_KEYS1632 64 128 > > -+---+--+ > > Time in seconds 4849 51 55 > > Okay, that seems like a good basic test. > > Did you happen to make any notes about the disk space occupied by the > database? One thing I was worried about was the bloat that'd occur > in pg_proc, pg_index, and pg_proc_proname_args_nsp_index. Aside from > costing disk space, this would indirectly slow things down due to more > I/O to read these tables --- an effect that probably your test couldn't > measure, since it wasn't touching very many entries in any of those > tables. How hard would it be to change pg_proc.proargtypes from oidvector to _oid and hope that toasting will take care of the rest ? This could also get the requested 2% speedup, not to mention the potential for up to 64K arguments ;) --- Hannu ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] FUNC_MAX_ARGS benchmarks
Tom Lane wrote: > Did you happen to make any notes about the disk space occupied by the > database? One thing I was worried about was the bloat that'd occur > in pg_proc, pg_index, and pg_proc_proname_args_nsp_index. Aside from > costing disk space, this would indirectly slow things down due to more > I/O to read these tables --- an effect that probably your test couldn't > measure, since it wasn't touching very many entries in any of those > tables. No, but it's easy enough to repeat. I'll do that today sometime. Joe ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Table inheritance versus views
> On Fri, 2002-08-02 at 22:39, [EMAIL PROTECTED] wrote: > >>On 29 Jul 2002 18:27:40 MDT, the world broke into rejoicing as >>Stephen Deasey <[EMAIL PROTECTED]> said: >> >>>Curt Sampson wrote: >>> I'm still waiting to find out just what advantage table inheritance offers. I've asked a couple of times here, and nobody has even started to come up with anything. >>> >>>Table inheritance offers data model extensibility. New (derived) tables >>>can be added to the system, and will work with existing code that >>>operates on the base tables, without having to hack up all the code. >> >>But it kind of begs the question of why you're creating the new table in >>the first place. >> >>The new table certainly _won't_ work with existing code, at least from >>the perspective that the existing code doesn't _refer_ to that table. Since OpenACS has been brought up in this thread, I thought I'd join the list for a day or two and offer my perspective as the project manager. 1. Yes, we use views in our quasi-object oriented data model. They're automatically generated when content types are built by the content repository, for instance. 2. Yes, you can model anything you can model with PG's OO extensions using views. If you haven't implemented some way to generate the view automatically then a bit more work is required compared to using PG's OO extensions. 3. The view approach requires joins on all the subtype tables. If I declare type 'foo' then the view that returns all of foo's columns joins on all the subtype tables, while in the PG OO case all of foo's columns are stored in foo meaning I can get them all back with a simple query on the table. The PG OO approach can be considerably more efficient than the view approach, and this is important to some folks, no matter how many appeals to authority are made to various bibles on relational theory written by Date and Darwen. 4. The killer that makes the current implementation unusable for us is the fact that there's no form of indexing that spans all the tables inherited from a base type. This means there's no cheap enforcement of uniqueness constraints across a set of object types, among other things. Being able to inherit indexes and constraints would greatly increase the utility of PG's OO extensions. 5. If PG's OO extensions included inheritance of indexes and constraints, there's no doubt we'd use them in the OpenACS project, because when researching PG we compared datamodels written in this style vs. modelling the object relationships manually with automatically generated views. We found the datamodel written using PG's OO extensions not only potentially more efficient, but more readable as well. As far as whether or not there's a significant maintenance cost associated with keeping the existing OO stuff in PG, Tom Lane's voice is authorative while, when it comes to PG internals, Curt Sampson doesn't know squat. -- Don Baccus Portland, OR http://donb.photo.net, http://birdnotes.net, http://openacs.org ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[HACKERS] fate of CLUSTER command ?
I just tried CLUSTER command at fts.postgresql.org to cluster fts index and got very visual performance win. Unfortunately I had to restore permissions and recreate other indices by hand. So, I'm interested what's a future of CLUSTER command ? Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] FUNC_MAX_ARGS benchmarks
Joe Conway <[EMAIL PROTECTED]> writes: > I ran a crude test as follows (using a PHP script on the same machine. > Nothing else going on at the same time): > do 100 times >select 2+2+2+2+2+2+ ... iterated 9901 times > The results were as follows: > INDEX_MAX_KEYS1632 64 128 > -+---+--+ > Time in seconds 4849 51 55 Okay, that seems like a good basic test. Did you happen to make any notes about the disk space occupied by the database? One thing I was worried about was the bloat that'd occur in pg_proc, pg_index, and pg_proc_proname_args_nsp_index. Aside from costing disk space, this would indirectly slow things down due to more I/O to read these tables --- an effect that probably your test couldn't measure, since it wasn't touching very many entries in any of those tables. Looks like we could go for 32 without much problem, though. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] getpid() function
Hannu Krosing <[EMAIL PROTECTED]> writes: > You claimed that NOTIFY uses some _other_ backend id (i.e. not process > id). I did? Must have been momentary brain fade on my part. It's always been process ID. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Why is MySQL more chosen over PostgreSQL?
On Sat, 2002-08-03 at 16:32, Curt Sampson wrote: > On 2 Aug 2002, Hannu Krosing wrote: > > > On Fri, 2002-08-02 at 12:15, Curt Sampson wrote: > > > On 2 Aug 2002, Hannu Krosing wrote: > > > > > > > Could you brief me why do they discourage a syntactical frontent to a > > > > feature that is trivially implemented ? > > > > > > What's the point of adding it? It's just one more thing to learn. > > > > You don't have to learn it if you don't want to. But once you do, you > > have a higher level way of expressing a whole class of models. > > Perhaps this is the problem. I disagree that it's a "higher" level. I don't mean "morally higher" ;) Just more concise and easier to grasp, same as VIEW vs. TABLE + ON xxx DO INSTEAD rules. With INSTEAD rules you can do more than a VIEW does, but when all you want is a VIEW, then it is easier to define a VIEW, thus VIEW is a higher level construct than TABLE + ON xxx DO INSTEAD That is the same way that C is "higher" than ASM and ASM is higher than writing code directly using hex editor. -- Hannu ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Table inheritance versus views
On Fri, 2002-08-02 at 22:39, [EMAIL PROTECTED] wrote: > On 29 Jul 2002 18:27:40 MDT, the world broke into rejoicing as > Stephen Deasey <[EMAIL PROTECTED]> said: > > Curt Sampson wrote: > >> I'm still waiting to find out just what advantage table inheritance > >> offers. I've asked a couple of times here, and nobody has even > >> started to come up with anything. > > > Table inheritance offers data model extensibility. New (derived) tables > > can be added to the system, and will work with existing code that > > operates on the base tables, without having to hack up all the code. > > But it kind of begs the question of why you're creating the new table in > the first place. > > The new table certainly _won't_ work with existing code, at least from > the perspective that the existing code doesn't _refer_ to that table. The beuty of OO is that it does not need to : hannu=# create table animal (name text, legcount int); CREATE hannu=# insert into animal values('pig',4); INSERT 34183 1 hannu=# select * from animal; name | legcount --+-- pig |4 (1 row) hannu=# create table bird (wingcount int) inherits (animal); CREATE hannu=# insert into bird values('hen',2,2); INSERT 34189 1 hannu=# select * from animal; name | legcount --+-- pig |4 hen |2 (2 rows) -- Hannu ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] getpid() function
On Sat, 2002-08-03 at 01:25, Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > > Perhaps a more relevant question is why are we cluttering the namespace > > > with any such function at all? What's the use case for it? > > > It was requested because it is exposed in libpq and people need it to > > generate unique names and stuff like that from within psql and > > functions. Seems like a valid use for the pid. > > The sole reason libpq exposes it is so that you can tell a self-notify > from an incoming notify. (ie, given you are LISTENing on a condition > that both you and other clients send NOTIFYs for, is this particular > message one that you sent yourself, or not? Compare the originator PID > in the NOTIFY message to your backend_pid to find out.) I put that > feature in back around 6.4, because it allowed some important > optimizations in an app I had that used LISTEN/NOTIFY a lot. > > Since NOTIFY messages aren't even visible at the SQL level, the above is > not a reason for making PIDs visible at the SQL level. When I last time showed how backend_pid function can be trivially defined as hannu=# create function getpid() returns int hannu-# as '/lib/libc.so.6','getpid' language 'C'; CREATE hannu=# select getpid(); getpid 2832 (1 row) You claimed that NOTIFY uses some _other_ backend id (i.e. not process id). But when I now tested it it seems that this is not the case, notify does use the actual process id. hannu=# listen a; LISTEN hannu=# notify a; NOTIFY Asynchronous NOTIFY 'a' from backend with pid 2832 received. > > So I'm still unconvinced that we need or want this ... > And you can do it trivially as long as we support old-style C functions anyway. Hannu ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] WAL file location
On Fri, 2 Aug 2002, Thomas Lockhart wrote: > [Symlinks] don't scale, Given that we have only one directory for the log file, this would not appear to be a problem. > they are not portable, That's certainly a problem if we intend to run on systems without them. > and it is difficult for > applications (like the Postgres backend) to know that they are dealing > with a simlink or a real file. Er...that's the whole point of symlinks. Not that I really care either way about the whole issue, so long as we do *something*. cjs -- Curt Sampson <[EMAIL PROTECTED]> +81 90 7737 2974 http://www.netbsd.org Don't you know, in this new Dark Age, we're all light. --XTC ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Why is MySQL more chosen over PostgreSQL?
On 2 Aug 2002, Hannu Krosing wrote: > On Fri, 2002-08-02 at 12:15, Curt Sampson wrote: > > On 2 Aug 2002, Hannu Krosing wrote: > > > > > Could you brief me why do they discourage a syntactical frontent to a > > > feature that is trivially implemented ? > > > > What's the point of adding it? It's just one more thing to learn. > > You don't have to learn it if you don't want to. But once you do, you > have a higher level way of expressing a whole class of models. Perhaps this is the problem. I disagree that it's a "higher" level. cjs -- Curt Sampson <[EMAIL PROTECTED]> +81 90 7737 2974 http://www.netbsd.org Don't you know, in this new Dark Age, we're all light. --XTC ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] PITR, checkpoint, and local relations
> The main area where it seems to get heavy use is during index builds, > and for 'CREATE TABLE AS SELECT...'. > > So I will remove the local buffer manager as part of the PITR patch, > unless there is further objection. Would someone mind filling me in as to what the local bugger manager is and how it is different (and not useful) compared to the shared buffer manager? Chris ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] []performance issues
> So I am still interested in PostgreSQL's ability to deal with > multimillon records tables. Postgres has no problem with multimillion row tables - many people on this list run them - just don't do sequential scans on them if you can't afford the time it takes. Chris ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster