Re: [HACKERS] Socket problem using beta2 on Windows-XP
Martijn van Oosterhout wrote: On Thu, Sep 29, 2005 at 08:50:30AM +0200, Thomas Hallgren wrote: Hi, I've installed PostgreSQL 8.1-beta2 as a service on my Windows-XP box. It runs fine but I get repeated messages like this in the log: 2005-09-29 00:41:09 FATAL: could not duplicate socket 1880 for use in backend: error code 10038 That's from postmaster.c:write_inheritable_socket(). Error 10038 is WSAENOTSOCK. Very odd, time to get out the debugger? Get a backtrace at least. I finally managed to debug the postmaster and I'm now pretty sure the message is not from the postmaster itself. I put a breakpoint where the message is printed (postmaster.c:3762) and in errstart() where elevel >= ERROR (elog.c:152) but I never get there although the message is printed. I know that my debugger works because if I put a break on elog.c:194 it stops for other messages. Regards, Thomas Hallgren ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] effective SELECT from child tables
On Sat, Oct 01, 2005 at 04:35:49PM +0200, Martijn van Oosterhout wrote: > On Sat, Oct 01, 2005 at 10:05:22AM -0400, [EMAIL PROTECTED] wrote: > > It has the 'side or additional benefit' being requested here. The ability > > to filter the child table by some attribute. For example, if the child > > tables are used for partitioning, and the attribute were to keep a date > > range, the field restriction optimization could be used to automatically > > determine the set of tables to use for the date range specified. With > > such a change, it would even work automatically if the date ranges > > overlapped for some reason. Selecting a table name by date is hacky. This > > sort of solution would be a general solution to the problem. > This is what "Constraint Exclusion" does. It uses CHECK constraints on > a table to filter out tables that obviously don't apply to a query. Good point. I'll have to invent another use case. :-) > It's just the the specific case of "tableoid = XXX" is not supported > right now. Yes. This is what I was looking for Simon to admit. :-) Cheers, mark -- [EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] __ . . _ ._ . . .__. . ._. .__ . . . .__ | Neighbourhood Coder |\/| |_| |_| |/|_ |\/| | |_ | |/ |_ | | | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bind them... http://mark.mielke.cc/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Socket problem using beta2 on Windows-XP
I added some traces to the code. I know that the following happens when I start a postmaster. StartupDatabase will call internal_fork_exec, it calls write_inheritable_socket 4 times and succeeds. During the first iteration of ServerLoop: StartBackgroundWriter will call internal_fork_exec and succeed. pgstat_forkexec will call internal_fork_exec and succeed. In the second iteration of ServerLoop, pgstat_forkexec will again call will call internal_fork_exec. This time it fails. According to the log it fails on line: write_inheritable_socket(¶m->pgStatSock, pgStatSock, childPid); i.e. on it's second call to write_inheriable_socket. The failure is in a postgres.exe process, not postmaster.exe (and that's why I can't debug propery on Windoz). Hope this helps. Regards, Thomas Hallgren Magnus Hagander wrote: If it's two zombies per minute, then I bet it's the stat collector and stat bufferer. They are restarted by the postmaster if not found to be running. That would make some sense, because the stat processes need to set up new sockets (for the pipe between them). The autovacuum theory didn't hold any water in my eyes because autovacuum doesn't create any new sockets. However, why two zombies? That would mean that the grandchild process started, which should mean that the pipe was already created ... Does Windows have any equivalent of strace whereby we could watch what's happening during stats process launch? First of all, I won't be able to dig into this any more until next week - sorry about that. But others are always free to :-) There is no strace equivalent builtin, but you can get an addon from http://www.bindview.com/Services/RAZOR/Utilities/Windows/strace_readme.c fm. Don't put it on a production box permanently, though, it tends to cause BSODs in some cases. //Magnus ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] [HACKERS] Query in SQL statement
> -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] Behalf Of Jim C. Nasby > Sent: Friday, September 30, 2005 4:49 PM > Subject: Re: [PERFORM] [HACKERS] Query in SQL statement > I suggest ditching the CamelCase and going with underline_seperators. > I'd also not use the bareword id, instead using bad_user_id. And I'd > name the table bad_user. But that's just me. :) I converted a db from MS SQL, where tables and fields were CamelCase, and just lowercased the ddl to create the tables. So table and fields names were all created in lowercase, but I didn't have to change any of the application code: the SELECT statements worked fine with mixed case. -- sample DDL CREATE TABLE testtable ( fieldone int4 ) insert into TestTable (fieldone) values (11); -- These statements will both work: -- lowercase SELECT fieldone FROM testtable; -- CamelCase SELECT FieldOne FROM TestTable; -Roger > Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PERFORM] A Better External Sort?
On Sat, Oct 01, 2005 at 11:26:07PM -0400, Tom Lane wrote: > Martijn van Oosterhout writes: > > Anyway, to bring some real info I just profiled PostgreSQL 8.1beta > > doing an index create on a 2960296 row table (3 columns, table size > > 317MB). > > 3 columns in the index you mean? What were the column datatypes? > Any null values? Nope, three columns in the table, one column in the index, no nulls. The indexed column was integer. I did it once with around 6500 values repeated over and over, lots of duplicate kays. And once on a serial column but it made no descernable difference either way. Although the comparison function was called less (only 76 million times), presumably because it was mostly sorted already. > > The number 1 bottleneck with 41% of user time is comparetup_index. > > ... > > The thing is, I can't see anything in comparetup_index() that could > > take much time. > > The index_getattr and heap_getattr macros can be annoyingly expensive. And yet they are optimised for the common case. nocache_index_getattr was only called 7 times, which is about what you expect. I'm getting annotated output now, to determine which line takes the time... Actually, my previous profile overstated stuff a bit. Profiling turned off optimisation so I put it back and you get better results but the order doesn't change much. By line results are below. The top two are the index_getattr calls in comparetup_index. Third and fourth are the HEAPCOMPARES in tuplesort_heap_siftup. Then comes the inlineApplySortFunction call (which isn't being inlined, despite suggesting it should be, -Winline warns about this). Looks to me that there are no real gains to be made in this function. What is needed is an algorithmic change to call this function less often... Have a nice weekend, % cumulative self self total time seconds secondscalls ms/call ms/call name 9.40 22.5622.56 comparetup_index (tuplesort.c:2042 @ 8251060) 5.07 34.7312.17 comparetup_index (tuplesort.c:2043 @ 82510c0) 4.73 46.0911.36 tuplesort_heap_siftup (tuplesort.c:1648 @ 825074d) 3.48 54.45 8.36 tuplesort_heap_siftup (tuplesort.c:1661 @ 82507a9) 2.80 61.18 6.73 comparetup_index (tuplesort.c:2102 @ 8251201) 2.68 67.62 6.44 comparetup_index (tuplesort.c:2048 @ 8251120) 2.16 72.82 5.20 tuplesort_heap_siftup (tuplesort.c:1652 @ 825076d) 1.88 77.34 4.52 76025782 0.00 0.00 comparetup_index (tuplesort.c:2016 @ 8251010) 1.82 81.70 4.36 76025782 0.00 0.00 inlineApplySortFunction (tuplesort.c:1833 @ 8251800) 1.73 85.85 4.15 readtup_heap (tuplesort.c:2000 @ 8250fd8) 1.67 89.86 4.01 AllocSetAlloc (aset.c:568 @ 824bec0) 1.61 93.72 3.86 comparetup_index (tuplesort.c:2025 @ 825102f) 1.47 97.25 3.53 76025785 0.00 0.00 btint4cmp (nbtcompare.c:74 @ 80924a0) 1.11 99.92 2.67 readtup_datum (tuplesort.c:2224 @ 82517c4) 1.10102.55 2.64 comparetup_index (tuplesort.c:2103 @ 82511e7) % cumulative self self total time seconds secondscalls s/call s/call name 28.34 68.0168.01 76025782 0.00 0.00 comparetup_index 13.56100.5432.53 7148934 0.00 0.00 tuplesort_heap_siftup 8.66121.3320.79 76025782 0.00 0.00 inlineApplySortFunction 4.43131.9610.63 13084567 0.00 0.00 AllocSetAlloc 3.73140.90 8.94 76025785 0.00 0.00 btint4cmp 2.15146.07 5.17 6095625 0.00 0.00 LWLockAcquire 2.02150.92 4.85 2960305 0.00 0.00 heapgettup 1.98155.66 4.74 7148934 0.00 0.00 tuplesort_heap_insert 1.78159.94 4.28 2960312 0.00 0.00 slot_deform_tuple 1.73164.09 4.15 readtup_heap 1.67168.09 4.00 6095642 0.00 0.00 LWLockRelease 1.53171.76 3.68 2960308 0.00 0.00 index_form_tuple 1.44175.21 3.45 13083442 0.00 0.00 AllocSetFree 1.28178.28 3.07 8377285 0.00 0.00 LogicalTapeWrite 1.25181.29 3.01 8377285 0.00 0.00 LogicalTapeRead 1.11183.96 2.67 readtup_datum 1.06186.51 2.551 2.55 123.54 IndexBuildHeapScan -- Martijn van Oosterhout http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so
Re: [HACKERS] effective SELECT from child tables
On P, 2005-10-02 at 01:24 -0400, Greg Stark wrote: > Hannu Krosing <[EMAIL PROTECTED]> writes: > > > I also often wish that this would be possible when someone adds a column > > with a default value to a multi-million row table on a 24/7 production > > system and insists on filling all existing columns with the default. > > > > A rule "ON SELECT FROM table_x WHERE col_x IS NULL return > > col_x=default_for_col_x" would solve that nicely. > > Of course that only works if the reason they want to set fill the rows with > the default value isn't precisely because NULL is a perfectly reasonable thing > for the column to have (but not what they want for the existing rows). Sure. What would be needed for adding new colums with default filling would be some end-of-tuple marker or stored column count or tuple version nr, and then a rule (or just default behaviour) of showing default value for *missing* columns (colno > nr of stored columns). -- Hannu Krosing <[EMAIL PROTECTED]> ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] effective SELECT from child tables
On Sun, Oct 02, 2005 at 03:57:37PM +0300, Hannu Krosing wrote: > On P, 2005-10-02 at 01:24 -0400, Greg Stark wrote: > > Of course that only works if the reason they want to set fill the rows with > > the default value isn't precisely because NULL is a perfectly reasonable > > thing > > for the column to have (but not what they want for the existing rows). > > Sure. > > What would be needed for adding new colums with default filling would be > some end-of-tuple marker or stored column count or tuple version nr, and > then a rule (or just default behaviour) of showing default value for > *missing* columns (colno > nr of stored columns). Actually, PostgreSQL does know the number of columns in a tuple. It would be possible get change heap_getattr to return the default value. However, from a semantic point of view, it would be a bit strange. If you added a column, updated some rows then set a default, that default might end up applying to every row, except the ones you already modified. With careful coding you may be able to get around this. However, a good argument can be made that setting the DEFAULT for a column shouldn't change data anywhere. What about if I want to change the default for new values but not for old ones. That wouldn't work if the database starts adding values randomly, depending on when they are read... Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them. pgpxc3tWm4D4W.pgp Description: PGP signature
Re: [HACKERS] 8.1beta2 pg_dumpall inconsistencies
"Joshua D. Drake" <[EMAIL PROTECTED]> writes: > [EMAIL PROTECTED] pgsqldev]$ bin/pg_dumpall -U jd -D -o -p 5500 > ... > pg_dump: INSERT (-d, -D) and OID (-o) options cannot be used together > pg_dump: (The INSERT command cannot set OIDs.) > pg_dumpall: pg_dump failed on database "postgres", exiting > Why am I getting a partial dump when the options are not compatible? Because -D and -o are pg_dump options, which pg_dumpall just passes down without inspection. I'm not sure that it makes sense to try to replicate pg_dump's validity checking logic in pg_dumpall; I fear we'd be more likely to cause problems by letting them get out of sync than to solve problems by complaining a little sooner. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Socket problem using beta2 on Windows-XP
On Sun, Oct 02, 2005 at 12:20:05PM +0200, Thomas Hallgren wrote: > I added some traces to the code. I know that the following happens when > I start a postmaster. > In the second iteration of ServerLoop, pgstat_forkexec will again call > will call internal_fork_exec. This time it fails. > According to the log it fails on line: > >write_inheritable_socket(¶m->pgStatSock, pgStatSock, childPid); Well, pgStatSock is the only SOCK_DGRAM socket, all the others are SOCK_STREAM, maybe that's the difference? It's also connected to itself, although for DGRAM sockets that's not that special. The documentation isn't totally clear about this. Yet the error thrown should terminate the process, yet it obviously isn't. Very odd. Any Windows programmers with ideas? -- Martijn van Oosterhout http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them. pgpKbZv1lHAcM.pgp Description: PGP signature
Re: [HACKERS] effective SELECT from child tables
Martijn van Oosterhout writes: > However, from a semantic point of view, it would be a bit strange. If > you added a column, updated some rows then set a default, that default > might end up applying to every row, except the ones you already > modified. With careful coding you may be able to get around this. I suppose the update would have to check for these new columns and fill in the default value when first updated. But what do you do if the DEFAULT expression isn't immutable. Volatile DEFAULT expressions are extremely common in databases... namely sequences. It's too bad though. An efficient way to add a new column with a default would clear up one of the missing feature of Postgres. Virtually every time I add a new column I have to do this ALTER TABLE, ALTER COLUMN SET DEFAULT, UPDATE TABLE, VACUUM FULL, ALTER COLUMN SET NOT NULL dance. It would be real nice if you could add a new NOT NULL column with a default with a single command and just move on. -- greg ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] External Sort timing debug statements
On Sun, 2005-10-02 at 19:43 +0100, Simon Riggs wrote: > The following patch implements a fairly light set of timing statements > aimed at understanding external sort performance. There is no attempt to > alter the algorithms. Minor update of patch, use this version please. > Best Regards, Simon Riggs Index: src/backend/utils/misc/guc.c === RCS file: /projects/cvsroot/pgsql/src/backend/utils/misc/guc.c,v retrieving revision 1.289 diff -c -c -r1.289 guc.c *** src/backend/utils/misc/guc.c 24 Sep 2005 22:54:39 - 1.289 --- src/backend/utils/misc/guc.c 2 Oct 2005 19:58:02 - *** *** 92,97 --- 92,98 extern int CommitSiblings; extern char *default_tablespace; extern bool fullPageWrites; + extern bool debug_sort; static const char *assign_log_destination(const char *value, bool doit, GucSource source); *** *** 764,770 false, NULL, NULL }, #endif ! { {"log_hostname", PGC_SIGHUP, LOGGING_WHAT, gettext_noop("Logs the host name in the connection logs."), --- 765,779 false, NULL, NULL }, #endif ! { ! {"debug_sort", PGC_USERSET, DEVELOPER_OPTIONS, ! gettext_noop("no description available"), ! NULL, ! GUC_NOT_IN_SAMPLE ! }, ! &debug_sort, ! false, NULL, NULL ! }, { {"log_hostname", PGC_SIGHUP, LOGGING_WHAT, gettext_noop("Logs the host name in the connection logs."), Index: src/backend/utils/sort/tuplesort.c === RCS file: /projects/cvsroot/pgsql/src/backend/utils/sort/tuplesort.c,v retrieving revision 1.50 diff -c -c -r1.50 tuplesort.c *** src/backend/utils/sort/tuplesort.c 23 Sep 2005 15:36:57 - 1.50 --- src/backend/utils/sort/tuplesort.c 2 Oct 2005 19:58:02 - *** *** 85,90 --- 85,92 #include "postgres.h" + #include + #include "access/heapam.h" #include "access/nbtree.h" #include "catalog/pg_amop.h" *** *** 394,399 --- 396,404 */ static Tuplesortstate *qsort_tuplesortstate; + /* debug variables */ + bool debug_sort = false; + struct timeval start_t, current_t; /* * tuplesort_begin_xxx *** *** 422,427 --- 427,439 state = (Tuplesortstate *) palloc0(sizeof(Tuplesortstate)); + if (debug_sort) + { + ereport(NOTICE, + (errmsg("tuplesort begin work_mem= %u", workMem))); + gettimeofday(&start_t, NULL); + } + state->status = TSS_INITIAL; state->randomAccess = randomAccess; state->availMem = workMem * 1024L; *** *** 454,459 --- 466,475 Tuplesortstate *state = tuplesort_begin_common(workMem, randomAccess); int i; + if (debug_sort) + ereport(NOTICE, + (errmsg("+0 secs heap sort nkeys= %u", nkeys))); + AssertArg(nkeys > 0); state->comparetup = comparetup_heap; *** *** 499,504 --- 515,524 { Tuplesortstate *state = tuplesort_begin_common(workMem, randomAccess); + if (debug_sort) + ereport(NOTICE, + (errmsg("+0 secs index sort"))); + state->comparetup = comparetup_index; state->copytup = copytup_index; state->writetup = writetup_index; *** *** 666,671 --- 686,698 /* * Nope; time to switch to tape-based operation. */ + if (debug_sort) + { + gettimeofday(¤t_t, NULL); + ereport(NOTICE, + (errmsg("+%ld secs switching to external sort", + (long) (current_t.tv_sec - start_t.tv_sec; + } inittapes(state); /* *** *** 1042,1047 --- 1069,1082 int j; int a; + if (debug_sort) + { + gettimeofday(¤t_t, NULL); + ereport(NOTICE, + (errmsg("+%ld secs starting build of next run", + (long) (current_t.tv_sec - start_t.tv_sec; + } + /* Step D3: advance j (destTape) */ if (state->tp_dummy[state->destTape] < state->tp_dummy[state->destTape + 1]) { *** *** 1082,1087 --- 1117,1131 Assert(state->status == TSS_BUILDRUNS); Assert(state->memtupcount == 0); + if (debug_sort) + { + gettimeofday(¤t_t, NULL); + ereport(NOTICE, + (errmsg("+%ld secs run building complete nruns= %u", + (long) (current_t.tv_sec - start_t.tv_sec), + state->currentRun))); + } + /* * If we produced only one initial run (quite likely if the total data * volume is between 1X and 2X workMem), we can just use that tape as *** *** 1093,1100 --- 1137,1158 /* must freeze and rewind the finished output tape */ LogicalTapeFreeze(state->tapeset, state->result_tape); state->status = TSS_SORTEDONTAPE; + if (debug_sort) + { + gettimeofday(¤t_t, NULL); + ereport(NOTICE, + (errmsg("+%ld secs no final merge required"
Re: [HACKERS] [PERFORM] A Better External Sort?
Ok, I tried two optimisations: 1. By creating a special version of comparetup_index for single key integer indexes. Create an index_get_attr with byval and len args. By using fetch_att and specifying the values at compile time, gcc optimises the whole call to about 12 instructions of assembly rather than the usual mess. 2. By specifying: -Winline -finline-limit-1500 (only on tuplesort.c). This causes inlineApplySortFunction() to be inlined, like the code obviously expects it to be. default build (baseline)235 seconds -finline only 217 seconds (7% better) comparetup_index_fastbyval4 only221 seconds (6% better) comparetup_index_fastbyval4 and -finline203 seconds (13.5% better) This is indexing the integer sequence column on a 2.7 million row table. The times are as given by gprof and so exclude system call time. Basically, I recommend adding "-Winline -finline-limit-1500" to the default build while we discuss other options. comparetup_index_fastbyval4 patch attached per example. -- Martijn van Oosterhout http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them. --- pgsql-clean/src/include/access/itup.h 2005-10-02 21:30:20.327464320 +0200 +++ pgsql-sort/src/include/access/itup.h2005-10-02 16:04:00.0 +0200 @@ -126,6 +126,34 @@ ) \ ) +#define index_get_attr(tup, attnum, tupleDesc, attbyval, attlen, isnull) \ +( \ + AssertMacro(PointerIsValid(isnull) && (attnum) > 0), \ + *(isnull) = false, \ + !IndexTupleHasNulls(tup) ? \ + ( \ + (tupleDesc)->attrs[(attnum)-1]->attcacheoff >= 0 ? \ + ( \ + fetch_att((char *) (tup) + IndexInfoFindDataOffset((tup)->t_info) \ + + (tupleDesc)->attrs[(attnum)-1]->attcacheoff, attbyval, attlen) \ + ) \ + : \ + nocache_index_getattr((tup), (attnum), (tupleDesc), (isnull)) \ + ) \ + : \ + ( \ + (att_isnull((attnum)-1, (char *)(tup) + sizeof(IndexTupleData))) ? \ + ( \ + *(isnull) = true, \ + (Datum)NULL \ + ) \ + : \ + ( \ + nocache_index_getattr((tup), (attnum), (tupleDesc), (isnull)) \ + ) \ + ) \ +) + /* routines in indextuple.c */ extern IndexTuple index_form_tuple(TupleDesc tupleDescriptor, --- pgsql-clean/src/backend/utils/sort/tuplesort.c 2005-09-24 23:23:39.0 +0200 +++ pgsql-sort/src/backend/utils/sort/tuplesort.c 2005-10-02 21:29:39.349086302 +0200 @@ -375,6 +375,8 @@ unsigned int len); static int comparetup_index(Tuplesortstate *state, const void *a, const void *b); +static int comparetup_index_fastbyval4(Tuplesortstate *state, +const void *a, const void *b); static void *copytup_index(Tuplesortstate *state, void *tup); static void writetup_index(Tuplesortstate *state, int tapenum, void *tup); static void *readtup_index(Tuplesortstate *state, int tapenum, @@ -498,8 +500,12 @@ int workMem, bool randomAccess) { Tuplesortstate *state = tuplesort_begin_common(workMem, randomAccess); + TupleDesc tupDes = RelationGetDescr(indexRel); - state->comparetup = comparetup_index; + if( tupDes->natts == 1 && tupDes->attrs[0]->attbyval == 1 && tupDes->attrs[0]->attlen == 4 ) + state->comparetup = comparetup_index_fastbyval4; + else + state->comparetup = comparetup_index; state->copytup = copytup_index; state->writetup = writetup_index; state->readtup = readtup_index; @@ -2102,6 +2108,92 @@ return 0; } +static int +comparetup_index_fastbyval4(Tuplesortstate *state, const void *a, const void *b) +{ + /* +* This is almost the same as _bt_tuplecompare(), but we need to keep +* track of whether any null fields are present. Also see the special +* treatment for equal keys at the end. +*/ + IndexTuple tuple1 = (IndexTuple) a; + IndexTuple tuple2 = (IndexTuple) b; + Relationrel = state->indexRel; + ScanKey scankey = state->indexScanKey; + TupleDesc tupDes; + boolequal_hasnull = false; + + tupDes = RelationGetDescr(rel); + + ScanKey entry = &scankey[0]; + Datum datum1, + datum2; + boolisnull1, + isnull2; + int32 compare; + + datum1 = index_get_attr(tuple1, 1, tupDes, 1, 4,
[HACKERS] External Sort timing debug statements
The following patch implements a fairly light set of timing statements aimed at understanding external sort performance. There is no attempt to alter the algorithms. Each major point in the algorithms is marked as shown in this example: postgres=# set debug_sort=true; SET postgres=# explain analyze select * from test2 order by col1,col2; NOTICE: tuplesort begin work_mem= 1024 NOTICE: +0 secs heap sort nkeys= 2 NOTICE: +0 secs switching to external sort NOTICE: +1129 secs starting build of next run NOTICE: +2229 secs run building complete nruns= 2 NOTICE: +2229 secs merging runs with 6 tapes ... NOTICE: +3036 secs starting final merge I'll add other information, as requested. The "6 tapes" is currently hardcoded, though is included in expectation of implementing variable numbers of tapes. I'm not sure if I got the header file correct for full portability of gettimeofday(). Please correct me, if this is the case. Please post sort performance data back via this post. Best Regards, Simon Riggs Index: src/backend/utils/misc/guc.c === RCS file: /projects/cvsroot/pgsql/src/backend/utils/misc/guc.c,v retrieving revision 1.289 diff -c -c -r1.289 guc.c *** src/backend/utils/misc/guc.c 24 Sep 2005 22:54:39 - 1.289 --- src/backend/utils/misc/guc.c 2 Oct 2005 18:00:45 - *** *** 92,97 --- 92,98 extern int CommitSiblings; extern char *default_tablespace; extern bool fullPageWrites; + extern bool debug_sort; static const char *assign_log_destination(const char *value, bool doit, GucSource source); *** *** 764,770 false, NULL, NULL }, #endif ! { {"log_hostname", PGC_SIGHUP, LOGGING_WHAT, gettext_noop("Logs the host name in the connection logs."), --- 765,779 false, NULL, NULL }, #endif ! { ! {"debug_sort", PGC_USERSET, DEVELOPER_OPTIONS, ! gettext_noop("no description available"), ! NULL, ! GUC_NOT_IN_SAMPLE ! }, ! &debug_sort, ! false, NULL, NULL ! }, { {"log_hostname", PGC_SIGHUP, LOGGING_WHAT, gettext_noop("Logs the host name in the connection logs."), Index: src/backend/utils/sort/tuplesort.c === RCS file: /projects/cvsroot/pgsql/src/backend/utils/sort/tuplesort.c,v retrieving revision 1.50 diff -c -c -r1.50 tuplesort.c *** src/backend/utils/sort/tuplesort.c 23 Sep 2005 15:36:57 - 1.50 --- src/backend/utils/sort/tuplesort.c 2 Oct 2005 18:00:46 - *** *** 394,399 --- 394,402 */ static Tuplesortstate *qsort_tuplesortstate; + /* debug variables */ + bool debug_sort = false; + struct timeval start_t, current_t; /* * tuplesort_begin_xxx *** *** 422,427 --- 425,437 state = (Tuplesortstate *) palloc0(sizeof(Tuplesortstate)); + if (debug_sort) + { + ereport(NOTICE, + (errmsg("tuplesort begin work_mem= %u", workMem))); + gettimeofday(&start_t, NULL); + } + state->status = TSS_INITIAL; state->randomAccess = randomAccess; state->availMem = workMem * 1024L; *** *** 454,459 --- 464,473 Tuplesortstate *state = tuplesort_begin_common(workMem, randomAccess); int i; + if (debug_sort) + ereport(NOTICE, + (errmsg("+0 secs heap sort nkeys= %u", nkeys))); + AssertArg(nkeys > 0); state->comparetup = comparetup_heap; *** *** 499,504 --- 513,522 { Tuplesortstate *state = tuplesort_begin_common(workMem, randomAccess); + if (debug_sort) + ereport(NOTICE, + (errmsg("+0 secs index sort"))); + state->comparetup = comparetup_index; state->copytup = copytup_index; state->writetup = writetup_index; *** *** 666,671 --- 684,697 /* * Nope; time to switch to tape-based operation. */ + if (debug_sort) + { + gettimeofday(¤t_t, NULL); + ereport(NOTICE, + (errmsg("+%ld secs switching to external sort", + (long) (current_t.tv_sec - start_t.tv_sec), + state->currentRun))); + } inittapes(state); /* *** *** 1042,1047 --- 1068,1082 int j; int a; + if (debug_sort) + { + gettimeofday(¤t_t, NULL); + ereport(NOTICE, + (errmsg("+%ld secs starting build of next run", + (long) (current_t.tv_sec - start_t.tv_sec), + state->currentRun))); + } + /* Step D3: advance j (destTape) */ if (state->tp_dummy[state->destTape] < state->tp_dummy[state->destTape + 1]) { *** *** 1082,1087 --- 1117,1131 Assert(state->status == TSS_BUILDRUNS); Assert(state->memtupcount == 0); + if (debug_sort) + { + gettimeofday(¤t_t, NULL); + ereport(NOTICE, + (errmsg("+%ld secs run building complete n
Re: [HACKERS] effective SELECT from child tables
If one defines a restriction such that 'COLUMN = VALUE' for a specific table, in a theoretical sort of model that completely ignores implementation difficulty, or changes to the restriction, I think it would be safe to not store COLUMN in the tuple. If the tuple is stored, then COLUMN = VALUE, so when fetched, the value will be VALUE. Back to the real world, this would be difficult to implement without treating the column special from the point of table creation, and preventing the restriction from being altered without re-building the table... :-) Cheers, mark -- [EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] __ . . _ ._ . . .__. . ._. .__ . . . .__ | Neighbourhood Coder |\/| |_| |_| |/|_ |\/| | |_ | |/ |_ | | | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bind them... http://mark.mielke.cc/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] effective SELECT from child tables
On Sun, Oct 02, 2005 at 11:51:27PM +0300, Hannu Krosing wrote: > Right. Actually the "default" value returned for missing columns should > be different from the default for new values and should be settable only > once, when adding the column, else issues would become really really > weird. Right, the only way I could really imagine it working is have a flag, attautodefault which if set would return the default instead of NULL. It would only ever be set if the ADD COLUMN SET DEFAULT happened together. But does this mean you have one magic default fixed for the life of the column and the actual default which can be changed anytime? Seems messy, though possible... Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them. pgpf0P5jSQf7Z.pgp Description: PGP signature
Re: [HACKERS] effective SELECT from child tables
On P, 2005-10-02 at 15:30 +0200, Martijn van Oosterhout wrote: > On Sun, Oct 02, 2005 at 03:57:37PM +0300, Hannu Krosing wrote: > > On P, 2005-10-02 at 01:24 -0400, Greg Stark wrote: > > > Of course that only works if the reason they want to set fill the rows > > > with > > > the default value isn't precisely because NULL is a perfectly reasonable > > > thing > > > for the column to have (but not what they want for the existing rows). > > > > Sure. > > > > What would be needed for adding new colums with default filling would be > > some end-of-tuple marker or stored column count or tuple version nr, and > > then a rule (or just default behaviour) of showing default value for > > *missing* columns (colno > nr of stored columns). > > Actually, PostgreSQL does know the number of columns in a tuple. It > would be possible get change heap_getattr to return the default value. Good. (I have not looked at tuple layout for a few years :) > However, from a semantic point of view, it would be a bit strange. If > you added a column, updated some rows then set a default, that default > might end up applying to every row, except the ones you already > modified. With careful coding you may be able to get around this. If I do the add column and set default in the same transaction, this should not be a problem. > However, a good argument can be made that setting the DEFAULT for a > column shouldn't change data anywhere. What about if I want to change > the default for new values but not for old ones. That wouldn't work if > the database starts adding values randomly, depending on when they are > read... Right. Actually the "default" value returned for missing columns should be different from the default for new values and should be settable only once, when adding the column, else issues would become really really weird. -- Hannu Krosing <[EMAIL PROTECTED]> ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] Inherited indexes.
Hi. To allow indexes to be inherited so unique, foreign keys and such works properly with inheritance has been on the todo for quite some time. I thought that most probably it is a very non trivial thing, perhaps completely rethinking how indexes are done. Or perhaps it is not a feature that is requested allot and therefor no one ever got around to it. I am optimistic so I hoped for the second alternative and begun browsing the sources to see what could be done. Well, from what I have been able to figure out it is not trivial, at least not to me. To be honest I have not completely figured out how the existing indexes works and fit into the constraints. I am not quite sure what I am asking for, quite allot I guess. Is there someone already working on it? If so or if someone is considering perhaps I should start with one of the tasks clearly marked as easy, as an novice to postgresql hacking might be of better use then. Or maybe it is quite easy with the right directions, as in; non complex but takes time. So some one who knows what needs to be done, but do not have the time themselves could give an outline? regards -- //Fredrik Olsson Treyst AB +46-19-362182 [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] pgAdmin guru hints
In-time for PostgreSQL 8.1 gold, The pgAdmin Developer Team will release pgAdmin III V1.4, which has a new Guru hint feature that tries to give users helpful hints on typical beginner's pitfalls. The current set of hints is at http://svn.pgadmin.org/cgi-bin/viewcvs.cgi/trunk/pgadmin3/docs/en_US/hints/ You're invited to make your comments on the hints: are these correct, is there something missing or misleading? Regards, Andreas ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] pg_dump versioning
Watching the discussion about how to handle nextval() and keep it dump compatible makes me wonder: would it be useful to encode database or dump version info into dumps? ISTM it would make it much more feasible to handle changes to how things work automatically. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Inherited indexes.
Fredrik Olsson <[EMAIL PROTECTED]> writes: > To allow indexes to be inherited so unique, foreign keys and such works > properly with inheritance has been on the todo for quite some time. I > thought that most probably it is a very non trivial thing, perhaps > completely rethinking how indexes are done. Yup, you're right. There are a couple of major problems, to my mind: 1. A cross-table index would need to store a table OID as well as the existing block/offset information in order to tell you what an entry is pointing at. An extra 4 bytes per index entry (8 bytes if MAXALIGN is 8) is a lot of overhead, so you'd not want to pay that all the time. Which means two index tuple header formats to support, which looks painful. How can that be handled cleanly and efficiently? 2. Nobody has any idea how to handle the locking requirements. For the most part, we assume that a lock on a table protects its associated indexes too. What happens when an index is shared by multiple tables? Are there deadlock problems? A particularly nasty example is that in a unique index, inserting into one table may require visiting other tables (that you've not even got lock on) to see if potentially conflicting rows are still live. > Or perhaps it is not a feature that is requested allot and therefor no > one ever got around to it. No, it's been requested plenty, but it looks hard. See the pghackers archives for previous discussions. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] Last call for back-branch fixes
Core's current plan is to wrap update releases for 8.0, 7.4, 7.3 branches tomorrow (Monday evening North-American-east-coast time). If you're aware of any critical open problems in those branches, speak now ... regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] Updated documentation for new sequence binding
I have marged Tom's description of the new sequence binding with text I was working on. I modified it to follow the existing "we used to do X, now we do Y" pattern in the surrounding entries: http://candle.pha.pa.us/tmp/pgsql/release.html#RELEASE-8-1 I also added an SQL query that will show and update a pre-8.1 loaded schema to the new oid-based sequence storage use. -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] effective SELECT from child tables
Martijn van Oosterhout writes: > On Sun, Oct 02, 2005 at 11:51:27PM +0300, Hannu Krosing wrote: > > Right. Actually the "default" value returned for missing columns should > > be different from the default for new values and should be settable only > > once, when adding the column, else issues would become really really > > weird. > > Right, the only way I could really imagine it working is have a flag, > attautodefault which if set would return the default instead of NULL. No, I think redefining NULL is a non-starter. This whole thing only has legs if Postgres can distinguish between a column that has never been set and a column that has NULL. Actually the only case where I don't see bizarre semantic consequences is the case of a newly created column. It would be nice to be able to do: ALTER TABLE ADD foo integer DEFAULT 1 And there's no question of what what the semantics of this are. No question of columns magically acquiring a value where they were NULL before. The *main* reason I would be happy about this is that it would let me add NOT NULL columns efficiently. I would expect NOT NULL to be allowed whenever a default is provided since there's obviously no way it can cause a problem. (I don't follow the "NULL is evil all the time" philosophy but I do try to set columns NOT NULL wherever I can. It always annoys me that when adding a new column I have to choose between a massive disk intensive batch job or compromising the design.) On the other hand if you do ALTER TABLE ADD foo integer and then later do ALTER TABLE ALTER foo SET DEFAULT 1 then there is a window where all those foos are NULL and then they magically become 1? That doesn't seem tenable. -- greg ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Updated documentation for new sequence binding
pgman wrote: > I have marged Tom's description of the new sequence binding with text I > was working on. I modified it to follow the existing "we used to do X, > now we do Y" pattern in the surrounding entries: > > http://candle.pha.pa.us/tmp/pgsql/release.html#RELEASE-8-1 Sorry, this is a better URL: http://candle.pha.pa.us/main/writings/pgsql/sgml/release.html#RELEASE-8-1 -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] effective SELECT from child tables
Greg Stark <[EMAIL PROTECTED]> writes: > It would be nice to be able to do: > ALTER TABLE ADD foo integer DEFAULT 1 > And there's no question of what what the semantics of this are. Sure, but you can only optimize this if the default expression is immutable... > On the other hand if you do > ALTER TABLE ADD foo integer > and then later do > ALTER TABLE ALTER foo SET DEFAULT 1 > then there is a window where all those foos are NULL and then they magically > become 1? That doesn't seem tenable. It'd also be contrary to the SQL spec, AFAICS. Here's another interesting case to think about: ALTER TABLE ADD foo integer DEFAULT 1 ... ALTER TABLE ALTER foo SET DEFAULT 2 You'll have to pay the table-traversal cost on one step or the other. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] pg_dump versioning
Jim C. Nasby wrote: > Watching the discussion about how to handle nextval() and keep it dump > compatible makes me wonder: would it be useful to encode database or > dump version info into dumps? ISTM it would make it much more feasible > to handle changes to how things work automatically. Yes, we have talked about that, and it would work for pg_dump-generated dumps. What it might not work for cleanly are dumps moved a different way, or application changes, so while it might help in some cases, being able to get things working without the version info is desirable. If we ever get to a case where we _need_ to use it, it would be good to have, just in case. -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] pg_dump versioning
Bruce Momjian writes: > Jim C. Nasby wrote: >> Watching the discussion about how to handle nextval() and keep it dump >> compatible makes me wonder: would it be useful to encode database or >> dump version info into dumps? > If we ever get to a case where we _need_ to use it, it would be good to > have, just in case. The trouble is that it won't help you until years after you invest the work --- invariably, the version info you wish you had is for distinguishing between different *old* releases. I'm not real excited about it myself. My own design experience says that version numbers aren't that hot as a way of determining "does this data have the X nature?". If you can't test for the problem directly, you haven't thought hard enough. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] pgAdmin guru hints
On Mon, Oct 03, 2005 at 02:16:37AM +0200, Andreas Pflug wrote: > In-time for PostgreSQL 8.1 gold, The pgAdmin Developer Team will > release pgAdmin III V1.4, which has a new Guru hint feature that > tries to give users helpful hints on typical beginner's pitfalls. > > The current set of hints is at > http://svn.pgadmin.org/cgi-bin/viewcvs.cgi/trunk/pgadmin3/docs/en_US/hints/ > > You're invited to make your comments on the hints: are these > correct, is there something missing or misleading? Dunno if this is handled yet, but in the previous version, changes disallowed by a RULE showed up as though they had happened. Is this fixed in 1.4-to-be? Cheers, D -- David Fetter [EMAIL PROTECTED] http://fetter.org/ phone: +1 510 893 6100 mobile: +1 415 235 3778 Remember to vote! ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] effective SELECT from child tables
Tom Lane <[EMAIL PROTECTED]> writes: > Here's another interesting case to think about: > > ALTER TABLE ADD foo integer DEFAULT 1 > ... > ALTER TABLE ALTER foo SET DEFAULT 2 > > You'll have to pay the table-traversal cost on one step or the other. A good point. I wonder if this could be solved by recording the xid of the ALTER transaction along with the default value. I imagine that might get complicated quickly though. -- greg ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Updated documentation for new sequence binding
Bruce Momjian wrote: pgman wrote: I have marged Tom's description of the new sequence binding with text I was working on. I modified it to follow the existing "we used to do X, now we do Y" pattern in the surrounding entries: http://candle.pha.pa.us/tmp/pgsql/release.html#RELEASE-8-1 Sorry, this is a better URL: http://candle.pha.pa.us/main/writings/pgsql/sgml/release.html#RELEASE-8-1 Btw. I think the header "Add proper sequence function dependencies for DEFAULT" is in the wrong font, i.e. it's all monospace. Best Regards Michael Paesold ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org