Re: [HACKERS] recent --with-libxml support
On Fri, 22 Dec 2006, Jeremy Drake wrote:
> On Sat, 23 Dec 2006, Tom Lane wrote:
> > Peter Eisentraut <[EMAIL PROTECTED]> writes:
> > > Jeremy Drake wrote:
> > >> #0  0xb7c4dc85 in memcpy () from /lib/tls/libc.so.6
> > >> #1  0x08190f59 in appendBinaryStringInfo (str=0xbfd87f90,
> > >>     data=0x841ffc0 "qux", datalen=138543040) at stringinfo.c:192
> > >> #2  0x0828377f in map_sql_identifier_to_xml_name (ident=0x841ffc0
> > >>     "qux", fully_escaped=0 '\0') at xml.c:933
> > > Obviously the datalen has gone off the map.
> > I wouldn't put 100% faith in that display, unless Jeremy built with -O0.
> I built this one with gcc 3.4.5 using --enable-debug --enable-cassert
> configure options.  I will try with -O0 and see what I get...

I just tried the same thing, but passing CFLAGS="-g -O0" to configure, and the xml test passed.  Maybe a '\0' termination issue?

I also recompiled everything with the defaults again (-O2) and the xml test crashed in the same place.  So it is an issue of -O0 works vs -O2 does not.  Hate those...

--
When I get real bored, I like to drive downtown and get a great parking spot, then sit in my car and count how many people ask me if I'm leaving.
        -- Steven Wright

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [HACKERS] recent --with-libxml support
On Sat, 23 Dec 2006, Tom Lane wrote:
> Peter Eisentraut <[EMAIL PROTECTED]> writes:
> > Jeremy Drake wrote:
> >> #0  0xb7c4dc85 in memcpy () from /lib/tls/libc.so.6
> >> #1  0x08190f59 in appendBinaryStringInfo (str=0xbfd87f90,
> >>     data=0x841ffc0 "qux", datalen=138543040) at stringinfo.c:192
> >> #2  0x0828377f in map_sql_identifier_to_xml_name (ident=0x841ffc0
> >>     "qux", fully_escaped=0 '\0') at xml.c:933
> > Obviously the datalen has gone off the map.
> I wouldn't put 100% faith in that display, unless Jeremy built with -O0.

I built this one with gcc 3.4.5 using --enable-debug --enable-cassert configure options.  I will try with -O0 and see what I get...

--
NAPOLEON: What shall we do with this soldier, Guiseppe?  Everything he says is wrong.
GUISEPPE: Make him a general, Excellency, and then everything he says will be right.
        -- G. B. Shaw, "The Man of Destiny"

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] recent --with-libxml support
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> Jeremy Drake wrote:
>> #0  0xb7c4dc85 in memcpy () from /lib/tls/libc.so.6
>> #1  0x08190f59 in appendBinaryStringInfo (str=0xbfd87f90,
>>     data=0x841ffc0 "qux", datalen=138543040) at stringinfo.c:192
>> #2  0x0828377f in map_sql_identifier_to_xml_name (ident=0x841ffc0
>>     "qux", fully_escaped=0 '\0') at xml.c:933
> Obviously the datalen has gone off the map.

I wouldn't put 100% faith in that display, unless Jeremy built with -O0.  If it is accurate then the question is how could mblen fail so badly?

			regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?
http://archives.postgresql.org
Re: [HACKERS] recent --with-libxml support
Jeremy Drake wrote:
> #0  0xb7c4dc85 in memcpy () from /lib/tls/libc.so.6
> #1  0x08190f59 in appendBinaryStringInfo (str=0xbfd87f90,
>     data=0x841ffc0 "qux", datalen=138543040) at stringinfo.c:192
> #2  0x0828377f in map_sql_identifier_to_xml_name (ident=0x841ffc0
>     "qux", fully_escaped=0 '\0') at xml.c:933

Obviously the datalen has gone off the map.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] recent --with-libxml support
On Fri, 22 Dec 2006, Tom Lane wrote:
> Jeremy Drake <[EMAIL PROTECTED]> writes:
> >> Can you provide a stack trace for that crash?
> > #0  0xb7c4dc85 in memcpy () from /lib/tls/libc.so.6
> > #1  0x08190f59 in appendBinaryStringInfo (str=0xbfd87f90,
> >     data=0x841ffc0 "qux", datalen=138543040) at stringinfo.c:192
> > #2  0x0828377f in map_sql_identifier_to_xml_name (ident=0x841ffc0 "qux",
> >     fully_escaped=0 '\0') at xml.c:933
>
> Hmm ... it seems to work for me here, using Fedora 5's libxml.
>
> Are you by any chance running this with a non-C locale?  The trace
> suggests an encoding-mismatch sort of issue...

Nope.  I saw another buildfarm member that looks like it croaked in the same place:

http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=sponge&dt=2006-12-22%2022:30:02

So I guess it is not just me...

--
If you think education is expensive, try ignorance.
        -- Derek Bok, president of Harvard

---(end of broadcast)---
TIP 4: Have you searched our list archives?
http://archives.postgresql.org
Re: [HACKERS] Interface for pg_autovacuum
On Thursday 21 December 2006 10:57, Dave Page wrote:
> Simon Riggs wrote:
> > On Wed, 2006-12-20 at 09:47 -0500, Jim Nasby wrote:
> >> On the other hand, this would be the only part of the system where
> >> the official interface/API is a system catalog table.  Do we really
> >> want to expose the internal representation of something as our API?
> >> That doesn't seem wise to me...
> >
> > Define and agree the API (the hard bit) and I'll code it (the easy bit).
> > We may as well have something on the table, even if it changes later.
> >
> > Dave: How does PgAdmin handle setting table-specific autovacuum
> > parameters?  (Does it?)
>
> Yes, it adds/removes/edits rows in pg_autovacuum as required.

We do this in phppgadmin too, although I also added a screen that shows a list of entries with schema and table names (rather than vacrelid), since otherwise it is too much of a pain to keep things straight.  My intent is also to add controls at the table level (where we'll know the vacrelid anyway), though it will probably be put off until there is more demand for it.

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] recent --with-libxml support
Jeremy Drake <[EMAIL PROTECTED]> writes:
>> Can you provide a stack trace for that crash?
> #0  0xb7c4dc85 in memcpy () from /lib/tls/libc.so.6
> #1  0x08190f59 in appendBinaryStringInfo (str=0xbfd87f90,
>     data=0x841ffc0 "qux", datalen=138543040) at stringinfo.c:192
> #2  0x0828377f in map_sql_identifier_to_xml_name (ident=0x841ffc0 "qux",
>     fully_escaped=0 '\0') at xml.c:933

Hmm ... it seems to work for me here, using Fedora 5's libxml.

Are you by any chance running this with a non-C locale?  The trace suggests an encoding-mismatch sort of issue...

			regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [HACKERS] Strange pgsql crash on MacOSX
Shane Ambler <[EMAIL PROTECTED]> writes:
> postgres=# \q
> psql(24931) malloc: *** error for object 0x180a800: incorrect checksum
> for freed object - object was probably modified after being freed, break
> at szone_error to debug
> psql(24931) malloc: *** set a breakpoint in szone_error to debug
> Segmentation fault

I think we've seen something like this before in connection with readline/libedit follies.  Does the crash go away if you invoke psql with "-n" option?  If so, exactly which version of readline or libedit are you using?

FWIW, I do not see this on a fully up-to-date 10.4.8 G4 laptop.  I see

$ ls -l /usr/lib/libedit*
-rwxr-xr-x   1 root   wheel   112404 Sep 29 20:59 /usr/lib/libedit.2.dylib
lrwxr-xr-x   1 root   wheel       15 Apr 26  2006 /usr/lib/libedit.dylib -> libedit.2.dylib
$

so it seems that Apple did update libedit not too long ago ...

			regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Operator class group proposal
Gregory Stark <[EMAIL PROTECTED]> writes:
> So the only reason we needed the cross-data-type operators was to get better
> estimates?  I thought without them you couldn't get an index-based plan at all.

Oh, hm, there is that --- you won't get a nestloop with inner indexscan unless the join expression uses the unmodified inner variable (unless you do something weird like provide an index on the casted value...)

However, we need to be pretty wary about widening the families unless we're sure that the semantics are right.  In particular, I think that numeric-vs-float crosstype operators would violate the transitive law: you could have values for which A=B and B=C but A!=C.  This is because we smash numerics to float for comparison, and so there are distinct numeric values that can compare equal to the same float.  bigint against float has the same problem.

It'd be OK to integrate integers and numeric into one class, but how much real value is there in that?

			regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
[HACKERS] Strange pgsql crash on MacOSX
I have a dual G4 1.25GHz with 2GB RAM running Mac OSX 10.4.8 and PostgreSQL 8.2.0.

This only happened to me today, and with everything I have tried it always happens now - it had been running fine before.  The only thing I can think of that has changed in the last few days is that I have installed the last 2 security updates from Apple and the X11 update (X11 1.1.3) that Apple released a while ago -

http://www.apple.com/support/downloads/securityupdate2006008ppc.html
http://www.apple.com/support/downloads/securityupdate20060071048clientppc.html

The first one I can't see having anything to do with postgres, as it is I believe only updating Java.  The other one updates a few different areas and may be the culprit.  I can't think of anything else I have changed just recently - certainly not in the last couple of days.

To test and try to track down the cause, I restarted my machine, then started by unzipping the 8.2.0 released source and did the following steps (this example is with clean data files and everything default - the startup script has been there a while, and using pg_ctl instead makes no difference).  make check passes all tests -

./configure --prefix=/usr/local/pgsql
make check
sudo make install
cd /usr/local/pgsql
sudo mkdir data
sudo chown pgsql:pgsql data
sudo chmod 700 data
sudo -u pgsql /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
sudo /Library/StartupItems/PostgreSQL/PostgreSQL start

Then I get the following -

[devbox:~] shane% psql
Welcome to psql 8.2.0, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help with psql commands
       \g or terminate with semicolon to execute query
       \q to quit

postgres=# \q
psql(24931) malloc: *** error for object 0x180a800: incorrect checksum for freed object - object was probably modified after being freed, break at szone_error to debug
psql(24931) malloc: *** set a breakpoint in szone_error to debug
Segmentation fault
[devbox:~] shane%

The serverlog gives me -

[devbox:local/pgsql/data] root# cat serverlog
LOG:  database system was shut down at 2006-12-23 12:27:44 CST
LOG:  checkpoint record is at 0/42BEB8
LOG:  redo record is at 0/42BEB8; undo record is at 0/0; shutdown TRUE
LOG:  next transaction ID: 0/593; next OID: 10820
LOG:  next MultiXactId: 1; next MultiXactOffset: 0
LOG:  database system is ready

Apple's crashreporter gives me -

Date/Time:      2006-12-23 12:28:21.499 +1030
OS Version:     10.4.8 (Build 8L127)
Report Version: 4

Command: psql
Path:    /usr/local/pgsql/bin/psql
Parent:  tcsh [294]
Version: ??? (???)
PID:     24931
Thread:  0

Exception: EXC_BAD_ACCESS (0x0001)
Codes:     KERN_INVALID_ADDRESS (0x0001) at 0x3430616b

Thread 0 Crashed:
0   libSystem.B.dylib   0x90006cd8  szone_free + 3148
1   libSystem.B.dylib   0x900152d0  fclose + 176
2   libedit.2.dylib     0x96b5c334  history_end + 1632
3   libedit.2.dylib     0x96b5c7bc  history + 468
4   libedit.2.dylib     0x96b5ec58  write_history + 84
5   psql                0x8350      saveHistory + 208
6   psql                0x8428      finishInput + 120
7   libSystem.B.dylib   0x90014578  __cxa_finalize + 260
8   libSystem.B.dylib   0x9001      exit + 36
9   psql                0x1d00      _start + 764
10  psql                0x1a00      start + 48

Thread 0 crashed with PPC Thread State 64:
  srr0: 0x90006cd8   srr1: 0xd030   vrsave: 0x
    cr: 0x42002444    xer: 0x2001       lr: 0x90006ca4   ctr: 0x900143a0
    r0: 0x90006ca4     r1: 0xb610       r2: 0x42002442    r3: 0x000d
    r4: 0x             r5: 0x000d       r6: 0x80808080    r7: 0x0003
    r8: 0x39333100     r9: 0xb545      r10: 0x           r11: 0x42002442
   r12: 0x900143a0    r13: 0x          r14: 0x           r15: 0x
   r16: 0x            r17: 0x0052      r18: 0x0400       r19: 0x0054
   r20: 0x02a4        r21: 0x0180a800  r22: 0xa0001fac   r23: 0x02a8
   r24: 0x0002        r25: 0x0002      r26: 0x0001       r27: 0x34306167
   r28: 0x0180        r29: 0x0180a400  r30: 0x2e616767   r31: 0x900060a0

Binary Images Description:
0x1000     - 0x36fff     psql                   /usr/local/pgsql/bin/psql
0x3f000    - 0x54fff     libpq.5.dylib          /usr/local/pgsql/lib/libpq.5.dylib
0x8fe0     - 0x8fe51fff  dyld 45.3              /usr/lib/dyld
0x9000     - 0x901bcfff  libSystem.B.dylib      /usr/lib/libSystem.B.dylib
0x90214000 - 0x90219fff  libmathCommon.A.dylib  /usr/lib/system/libmathCommon.A.dylib
0x9110f000 - 0x9111dfff  libz.1.dylib           /usr/lib/libz.1.dylib
0x969c3000 - 0x969f1fff  libncurses.5.4.dylib   /usr/lib/libncurses.5.4.dylib
0x96b4d000 - 0x96b63fff  libedit.2.dylib        /usr/lib/libedit.2.dylib

Model: PowerMac3,6, BootROM 4.4.8f2, 2 proces
Re: [HACKERS] Companies Contributing to Open Source
Kevin Grittner wrote:
> >>> On Tue, Dec 19, 2006 at 6:13 PM, in message
> <[EMAIL PROTECTED]>, Bruce Momjian <[EMAIL PROTECTED]> wrote:
> > if the company dies, the community keeps going (as it did after Great
> > Bridge, without a hiccup), but if the community dies, the company dies
> > too.
>
> This statement seems to ignore organizations for which PostgreSQL is an
> implementation detail in their current environment.  While we appreciate
> PostgreSQL and are likely to try to make an occasional contribution,
> where it seems to be mutually beneficial, the Wisconsin State Courts
> would survive the collapse of the PostgreSQL community.

Yes, the statement relates mostly to companies that sell/support/enhance open source software, rather than users who are using the software in their businesses.  And that text isn't in the article; it was just in an email to make a distinction.

I think I have improved the slant of the article.  Let me know if it needs further improvement.  Thanks.

---

> While I can only guess at the reasons you may have put the slant you
> did on the document, I think it really should reflect the patient
> assistance the community provides to those who read the developers FAQ
> and make a good faith effort to comply with what is outlined there.  The
> cooperative, professional, and helpful demeanor of the members of this
> community is something which should be balanced against the community's
> need to act as a gatekeeper on submissions.
>
> I have recent experience as a first time employee contributor.  When we
> hit a bump in our initial use of PostgreSQL because of the non-standard
> character string literals, you were gracious in accepting our quick
> patch as being possibly of some value in the implementation of the
> related TODO item.  You were then helpful in our effort to do a proper
> implementation of the TODO item which fixes it.  I see that the patch I
> submitted was improved by someone before it made the release, which is
> great.
> > This illustrates how the process can work. I informed management of > the problem, and presented the options -- we could do our own little > hack that we then had to maintain and apply as the versions moved along, > or we could try to do fix which the community would accept and have that > feature "just work" for us for all subsequent releases. The latter was > a little more time up front, but resulted in a better quality product > for us, and less work in the long term. It was also presumably of some > benefit to the community, which has indirect benefit to our > organization. Nobody here wants to switch database products again soon, > so if we can solve our problem in a way that helps the product gain > momentum, all the better. > > I ran a consulting business for decades, and I know that there is a > great variation in the attitudes among managers. Many are quite > reasonable. I'm reminded of a meeting early in my career with a > businessman who owned and operated half a dozen successful businesses in > a variety of areas. He proposed a deal that I was on the verge of > accepting, albeit somewhat reluctantly. He stopped me and told me that > he hoped to continue to do business with me, so any deal we made had to > benefit and work for both of us or it was no good at all; if I was > uncomfortable with something in the proposal, we should talk it out. > That's the core of what we're trying to say in this document, isn't it? > The rest is an executive overview of the developer FAQ? I can't help > feeling that even with the revisions so far it could have a more > positive "spin". > > -Kevin > -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Companies Contributing to Open Source
OK, based on this feedback and others, I have made a new version of the article:

http://momjian.us/main/writings/pgsql/company_contributions/

There are no new concepts, just a more balanced article with some of the awkward wording improved.  I also added a link to the article from the developer's FAQ.

---

Joshua D. Drake wrote:
> Hello,
>
> O.k. below are some comments.  Your article although well written has a
> distinct, from the community perspective ;) and I think there are some
> points from the business side that are missed.
>
> ---
> Employees working in open source communities have two bosses -- the
> companies that employ them, and the open source community, which must
> review their proposals and patches.  Ideally, both groups would want the
> same thing, but often companies have different priorities in terms of
> deadlines, time investment, and implementation details.  And,
> unfortunately for companies, open source communities rarely adjust their
> requirements to match company needs.  They would often rather "do
> without" than tie their needs to those of a company.
> ---
>
> Employees don't have two bosses, at least not in the presentation above.
> In the community the employee may choose to do it the community's way or
> not.  That choice is much more defined within a boss's purview.
>
> A company's priorities have a priority that is very powerful that the
> community does not, and I believe it should be reflected in a document such
> as this: to actually feed the employee.  There is a tendency for the
> community to forget that every minute spent on community work is a
> direct neglect of the immediate (note that I say immediate) bottom line.
> That means that priorities must be balanced so that profit can be made,
> employees can get bonuses and, god forbid, a steady paycheck.
>
> ---
> This makes the employee's job difficult.  When working with the
> community, it can be difficult to meet company demands.  If the company
> doesn't understand how the community works, the employee can be seen as
> defiant, when in fact the employee has no choice but to work in the
> community process and within the community timetable.
>
> By serving two masters, employees often exhibit various behaviors that
> make their community involvement ineffective.  Below I outline the steps
> involved in open source development, and highlight the differences
> experienced by employees involved in such activities.
> ---
>
> The first paragraph seems to need some qualification.  An employee is
> hired to work in the best interests of the company, not the community.
> Those two things may overlap, but that is subject to the company's
> declaration.  If the employee is not doing the task as delegated, that is
> defiant.
>
> I am suspecting that your clarification would be something to the effect
> of:
>
> When a company sets forth to donate resources to the community, it can
> make an employee's job difficult.  It is important for the company to
> understand exactly what it is giving and the process that gift entails.
>
> Or something like that.
>
> I take exception to the term serving two masters; I am certainly not the
> master of my team, but that may just be me.
>
> ---
> Employees usually circulate their proposal inside their companies first
> before sharing it with the community.  Unfortunately, many employees
> never take the additional step of sharing the proposal with the
> community.  This means the employee is not benefitting from community
> oversight and suggestions, often leading to a major rewrite when a patch
> is submitted to the community.
> ---
>
> I think the above is not quite accurate.  I see few proposals actually
> come across to the community either, and those that do seem to get bogged
> down instead of progress being made.
>
> The most successful topics I have seen are those that usually have some
> footing behind them *before* they bring it to the community.
>
> ---
> For employees, patch review often happens in the company first.  Only
> when the company is satisfied is the patch submitted to the community.
> This is often done because of the perception that poor patches reflect
> badly on the company.  The problem with this patch pre-screening is that
> it prevents parallel review, where the company and community are both
> reviewing the patch.  Parallel review speeds completion and avoids
> unnecessary refactoring.
> ---
>
> It does affect the perception of the company.  Maybe not to the community,
> but as someone who reads comments on the patches that come through... I
> do not look forward to the day when I have a customer that says, didn't
> you submit that patch that was torn apart by...
>
> ---
> As you can see, community involvement has unique challenges for company
> employees.  There are often many mismatches between company needs and
> community needs, and the company must decide if it is worth honoring the
> co
Re: [HACKERS] Load distributed checkpoint
Gregory Stark wrote:
> "Bruce Momjian" <[EMAIL PROTECTED]> writes:
> > I have a new idea.  Rather than increasing write activity as we approach
> > checkpoint, I think there is an easier solution.  I am very familiar
> > with the BSD kernel, and it seems they have a similar issue in trying to
> > smooth writes:
>
> Just to give a bit of context for this.  The traditional mechanism for
> syncing buffers to disk on BSD which this daemon was a replacement for was
> to simply call "sync" every 30s.  Compared to that this daemon certainly
> smooths the I/O out over the 30s window...
>
> Linux has a more complex solution to this (of course) which has undergone
> a few generations over time.  Older kernels had a user space daemon called
> bdflush which called an undocumented syscall every 5s.  More recent ones
> have a kernel thread called pdflush.  I think both have various mostly
> undocumented tuning knobs but neither makes any sort of guarantee about
> the amount of time a dirty buffer might live before being synced.
>
> Your thinking is correct but that's already the whole point of bgwriter
> isn't it?  To get the buffers out to the kernel early in the checkpoint
> interval so that come checkpoint time they're hopefully already flushed
> to disk.  As long as your checkpoint interval is well over 30s only the
> last 30s (or so, it's a bit fuzzier on Linux) should still be at risk of
> being pending.
>
> I think the main problem with an additional pause in the hopes of getting
> more buffers synced is that during the 30s pause on a busy system there
> would be a continual stream of new dirty buffers being created as bgwriter
> works and other backends need to reuse pages.  So when the fsync is
> eventually called there will still be a large amount of i/o to do.
> Fundamentally the problem is that fsync is too blunt an instrument.  We
> only need to fsync the buffers we care about, not the entire file.
Well, one idea would be for the bgwriter not to do many write()'s between the massive checkpoint write()'s and the fsync()'s.  That would cut down on the extra I/O that fsync() would have to do.

The problem I see with making the bgwriter do more writes between checkpoints is the overhead of those scans, and the overhead of writing pages that will be dirtied again before the checkpoint.  With the delay-between-stages idea, we don't need to guess how aggressive the bgwriter needs to be --- we can just do the writes and wait for a while.

On an idle system, would someone dirty a large file and watch the disk I/O to see how long it takes for the I/O to complete to disk?

In what we have now, we are either having the bgwriter do too much I/O between checkpoints, or guaranteeing an I/O storm during a checkpoint by doing lots of write()'s and then calling fsync() right away.  I don't see how we are ever going to get that properly tuned.  Would someone code up a patch and test it?

--
Bruce Momjian   [EMAIL PROTECTED]
EnterpriseDB    http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Companies Contributing to Open Source
"Simon Riggs" <[EMAIL PROTECTED]> writes: > In a humble, non-confrontational tone: Why/How does a patch imply a fait > accompli, or show any disrespect? Well depending on the circumstances it could show the poster isn't interested in the judgement of the existing code authors. It can be hard to tell someone that their last 6 months of work was all in a direction that other developers would rather Postgres not head. However I think people are over-generalising if they think this is always true. Patches are often submitted by people who invite comment and are open to new ideas and reworking their approach. Whether the submission is as a fait accompli or as the beginning of a dialogue (imho a more productive dialogue than the usual hand-waving on -hackers) is determined more by the attitude of the presenter and willingness to take criticisms and make changes than it is by the mere fact that they've written code without prior approval. The flip side of all of this is that "the community" doesn't always engage when people do ask for feedback. I asked for comments on how best to proceed getting info down to the Sort node from a higher Limit node to implement the limit-sort optimization and didn't get any guidance. As a result I'm kind of stuck. I can proceed without feedback but I fear I would be, in fact, presenting the result as a fait accompli which would end up getting rejected if others were less comfortable with breaking the planner and executor abstractions (or if I choose not to do so and they decide the necessary abstractions are needless complexity). -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Load distributed checkpoint
On Fri, 22 Dec 2006, Simon Riggs wrote:
> I have also seen cases where the WAL drive, even when separated, appears
> to spike upwards during a checkpoint.  My best current theory, so far
> untested, is that the WAL and data drives are using the same CFQ
> scheduler and that the scheduler actively slows down WAL requests when
> it need not.  Mounting the drives as separate block drives with separate
> schedulers, CFQ for data and Deadline for WAL, should help.

The situation I've been seeing is that the database needs a new block to complete a query and issues a read request to get it, but that read is behind the big checkpoint fsync.  The client sits there for quite some time waiting for the fsync to finish before it gets the data it needs, and now your trivial select took seconds to complete.  It's fairly easy to replicate this problem using pgbench on Linux--I've seen a query sit there for 15 seconds when going out of my way to aggravate the behavior.  One of Takayuki's posts here mentioned a worst-case delay of 13 seconds; that's the problem rearing its ugly head.

You may be right that what you're seeing would be solved with a more complicated tuning on a per-device basis (which, by the way, isn't available unless you're running a more recent Linux kernel than most distributions have available).  You can tune the schedulers all day and not make a lick of difference to what I've been running into; I know, I tried.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
Re: [HACKERS] Operator class group proposal
Tom Lane <[EMAIL PROTECTED]> writes:
> No, what you'll get is something like
>
>     int4var::float8 float8eq float8var
>
> which is perfectly mergejoinable ... however, it's not clear that the
> planner will make very good estimates about the value of the cast
> expression.  I'm not sure if it's worth introducing a pile more
> crosstype operators to change that situation --- improving
> the selectivity functions to handle casts better might be a wiser
> approach.

So the only reason we needed the cross-data-type operators was to get better estimates?  I thought without them you couldn't get an index-based plan at all.

--
Gregory Stark
EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Load distributed checkpoint
"Bruce Momjian" <[EMAIL PROTECTED]> writes: > I have a new idea. Rather than increasing write activity as we approach > checkpoint, I think there is an easier solution. I am very familiar > with the BSD kernel, and it seems they have a similar issue in trying to > smooth writes: Just to give a bit of context for this. The traditional mechanism for syncing buffers to disk on BSD which this daemon was a replacement for was to simply call "sync" every 30s. Compared to that this daemon certainly smooths the I/O out over the 30s window... Linux has a more complex solution to this (of course) which has undergone a few generations over time. Older kernels had a user space daemon called bdflush which called an undocumented syscall every 5s. More recent ones have a kernel thread called pdflush. I think both have various mostly undocumented tuning knobs but neither makes any sort of guarantee about the amount of time a dirty buffer might live before being synced. Your thinking is correct but that's already the whole point of bgwriter isn't it? To get the buffers out to the kernel early in the checkpoint interval so that come checkpoint time they're hopefully already flushed to disk. As long as your checkpoint interval is well over 30s only the last 30s (or so, it's a bit fuzzier on Linux) should still be at risk of being pending. I think the main problem with an additional pause in the hopes of getting more buffers synced is that during the 30s pause on a busy system there would be a continual stream of new dirty buffers being created as bgwriter works and other backends need to reuse pages. So when the fsync is eventually called there will still be a large amount of i/o to do. Fundamentally the problem is that fsync is too blunt an instrument. We only need to fsync the buffers we care about, not the entire file. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] recent --with-libxml support
On Fri, 22 Dec 2006, Tom Lane wrote:
> Jeremy Drake <[EMAIL PROTECTED]> writes:
> > As seen, I needed to add an include dir for configure to pass.  However,
> > make check fails now with the backend crashing.  This can be seen in the
> > buildfarm results for mongoose.
>
> Can you provide a stack trace for that crash?

#0  0xb7c4dc85 in memcpy () from /lib/tls/libc.so.6
#1  0x08190f59 in appendBinaryStringInfo (str=0xbfd87f90,
    data=0x841ffc0 "qux", datalen=138543040) at stringinfo.c:192
#2  0x0828377f in map_sql_identifier_to_xml_name (ident=0x841ffc0 "qux",
    fully_escaped=0 '\0') at xml.c:933
#3  0x0811ce83 in transformXmlExpr (pstate=0x84202b8, x=0x8420034)
    at parse_expr.c:1426
#4  0x0811ac91 in transformExpr (pstate=0x84202b8, expr=0x8420034)
    at parse_expr.c:238
#5  0x0811ceb4 in transformXmlExpr (pstate=0x84202b8, x=0x8420174)
    at parse_expr.c:1456
#6  0x0811ac91 in transformExpr (pstate=0x84202b8, expr=0x8420174)
    at parse_expr.c:238
#7  0x081288a4 in transformTargetEntry (pstate=0x84202b8, node=0x8420174,
    expr=0x0, colname=0x0, resjunk=0 '\0') at parse_target.c:74
#8  0x0812890e in transformTargetList (pstate=0x84202b8, targetlist=0x1)
    at parse_target.c:146
#9  0x080ffcef in transformStmt (pstate=0x84202b8, parseTree=0x84201fc,
    extras_before=0xbfd882c4, extras_after=0xbfd882c8) at analyze.c:2102
#10 0x08101421 in do_parse_analyze (parseTree=0x841ffc0, pstate=0x84202b8)
    at analyze.c:251
#11 0x0810227a in parse_analyze (parseTree=0x84201fc,
    sourceText=0x841ffc0 "qux", paramTypes=0x841ffc0, numParams=138543040)
    at analyze.c:173
#12 0x0820b66e in pg_analyze_and_rewrite (parsetree=0x84201fc,
    query_string=0x841fb74 "SELECT xmlconcat(xmlcomment('hello'),\n", ' ' , "xmlelement(NAME qux, 'foo'),\n", ' ' , "xmlcomment('world'));",
    paramTypes=0x0, numParams=0) at postgres.c:567
#13 0x0820b91e in exec_simple_query (
    query_string=0x841fb74 "SELECT xmlconcat(xmlcomment('hello'),\n", ' ' , "xmlelement(NAME qux, 'foo'),\n", ' ' , "xmlcomment('world'));")
    at postgres.c:875
#14 0x0820d72b in PostgresMain (argc=4, argv=0x83c5c2c,
    username=0x83c5bfc "jeremyd") at postgres.c:3418
#15 0x081dfbd7 in ServerLoop () at postmaster.c:2924
#16 0x081e132c in PostmasterMain (argc=3, argv=0x83c4550) at postmaster.c:958
#17 0x081991e0 in main (argc=3, argv=0x83c4550) at main.c:188

--
In Tennessee, it is illegal to shoot any game other than whales from a moving automobile.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Load distributed checkpoint
On 12/22/06, Takayuki Tsunakawa <[EMAIL PROTECTED]> wrote:
> From: Inaam Rana
> > Which IO Scheduler (elevator) you are using?
>
> Elevator? Sorry, I'm not familiar with the kernel implementation, so I
> don't know what it is. My Linux distribution is Red Hat Enterprise Linux
> 4.0 for AMD64/EM64T, and the kernel is 2.6.9-42.ELsmp. I probably haven't
> changed any kernel settings, except for IPC settings to run PostgreSQL.

There are four IO schedulers in Linux: anticipatory, CFQ (the default), deadline, and noop. For typical OLTP-type loads, deadline is generally recommended. If you are constrained on CPU and you have a good controller, then it is better to use noop.

Deadline attempts to merge requests by maintaining two red-black trees in sector sort order, and it also ensures that a request is serviced within a given time by using a FIFO. I don't expect it to do magic, but I was wondering whether it may dilute the issue of fsync() elbowing out WAL writes.

You can look into /sys/block//queue/scheduler to see which scheduler you are using.

regards,
inaam

-- 
Inaam Rana
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Load distributed checkpoint
I have a new idea. Rather than increasing write activity as we approach checkpoint, I think there is an easier solution. I am very familiar with the BSD kernel, and it seems they have a similar issue in trying to smooth writes:

	http://www.brno.cas.cz/cgi-bin/bsdi-man?proto=1.1&query=update&msection=4&apropos=0

UPDATE(4)                BSD Programmer's Manual                UPDATE(4)

NAME
     update - trickle sync filesystem caches to disk

DESCRIPTION
     At system boot time, the kernel starts filesys_syncer, process 3.
     This process helps protect the integrity of disk volumes by ensuring
     that volatile cached filesystem data are written to disk within the
     vfs.generic.syncdelay interval, which defaults to thirty seconds (see
     sysctl(8)).

     When a vnode is first written it is placed vfs.generic.syncdelay
     seconds down on the trickle sync queue. If it still exists and has
     dirty data when it reaches the top of the queue, filesys_syncer
     writes it to disk. This approach evens out the load on the underlying
     I/O system and avoids writing short-lived files. The papers on
     trickle-sync tend to favor aging based on buffers rather than files.
     However, BSD/OS synchronizes on file age rather than buffer age
     because the data structures are much smaller as there are typically
     far fewer files than buffers. Although this can make the I/O bursty
     when a big file is written to disk, it is still much better than the
     wholesale writes that were being done by the historic update process
     which wrote all dirty data buffers every 30 seconds.

     It also adapts much better to the soft update code which wants to
     control aging to improve performance (inodes age in one third of
     vfs.generic.syncdelay seconds, directories in one half of
     vfs.generic.syncdelay seconds). This ordering ensures that most
     dependencies are gone (e.g., inodes are written when directory
     entries want to go to disk) reducing the amount of work that the
     soft update code needs to do.
I assume other kernels have similar I/O smoothing, so that data sent to the kernel via write() gets to disk within 30 seconds. I assume write() is not our checkpoint performance problem, but the transfer to disk via fsync(). Perhaps a simple solution is to do the write()'s of all dirty buffers as we do now at checkpoint time, but delay 30 seconds and then do fsync() on all the files. The goal here is that during the 30-second delay, the kernel will be forcing data to the disk, so the fsync() we eventually do will only be for the write() of buffers during the 30-second delay, and because we wrote all dirty buffers 30 seconds ago, there shouldn't be too many of them. I think the basic difference between this and the proposed patch is that we do not put delays in the buffer write() or fsync() phases --- we just put a delay _between_ the phases, and wait for the kernel to smooth it out for us. The kernel certainly knows more about what needs to get to disk, so it seems logical to let it do the I/O smoothing. --- Bruce Momjian wrote: > > I have thought a while about this and I have some ideas. > > Ideally, we would be able to trickle the sync of individuals blocks > during the checkpoint, but we can't because we rely on the kernel to > sync all dirty blocks that haven't made it to disk using fsync(). We > could trickle the fsync() calls, but that just extends the amount of > data we are writing that has been dirtied post-checkpoint. In an ideal > world, we would be able to fsync() only part of a file at a time, and > only those blocks that were dirtied pre-checkpoint, but I don't see that > happening anytime soon (and one reason why many commercial databases > bypass the kernel cache). 
>
> So, in the real world, one conclusion seems to be that our existing
> method of tuning the background writer just isn't good enough for the
> average user:
>
>	#bgwriter_delay = 200ms         # 10-10000ms between rounds
>	#bgwriter_lru_percent = 1.0     # 0-100% of LRU buffers scanned/round
>	#bgwriter_lru_maxpages = 5      # 0-1000 buffers max written/round
>	#bgwriter_all_percent = 0.333   # 0-100% of all buffers scanned/round
>	#bgwriter_all_maxpages = 5      # 0-1000 buffers max written/round
>
> These settings control what the bgwriter does, but they do not clearly
> relate to the checkpoint timing, which is the purpose of the bgwriter,
> and they don't change during the checkpoint interval, which is also less
> than ideal. If set t
Re: [HACKERS] recent --with-libxml support
Jeremy Drake <[EMAIL PROTECTED]> writes: > As seen, I needed to add an include dir for configure to pass. However, > make check fails now with the backend crashing. This can be seen in the > buildfarm results for mongoose. Can you provide a stack trace for that crash? regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Operator class group proposal
Gregory Stark <[EMAIL PROTECTED]> writes:
> I thought that would just be formalizing what we currently have. But I just
> discovered to my surprise that it's not. I don't see any cross-data-type
> operators between any of the integer types and numeric, or between any of the
> floating point types and numeric, or between any of the integers and the
> floating point types.

Correct.

> So does that mean we currently have three separate arithmetic "operator class
> groups" such as they currently exist and you can't currently do merge joins
> between some combinations of these arithmetic types?

No, what you'll get is something like

	int4var::float8 float8eq float8var

which is perfectly mergejoinable ... however, it's not clear that the planner will make very good estimates about the value of the cast expression. I'm not sure if it's worth introducing a pile more crosstype operators to change that situation --- improving the selectivity functions to handle casts better might be a wiser approach.

			regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
Re: [HACKERS] Companies Contributing to Open Source
Guido Barosio wrote: > > "Companies often bring fresh prespective, ideas, and testing > infrastucture to a project." > > > "prespective" || "perspective" ? Thanks, fixed. --- > > g.- > > > On 12/21/06, Kevin Grittner <[EMAIL PROTECTED]> wrote: > > >>> On Tue, Dec 19, 2006 at 6:13 PM, in message > > <[EMAIL PROTECTED]>, Bruce Momjian > > <[EMAIL PROTECTED]> wrote: > > > if the company dies, the community keeps going (as it did after > > Great > > > Bridge, without a hickup), but if the community dies, the company > > dies > > > too. > > > > This statement seems to ignore organizations for which PostgreSQL is an > > implementation detail in their current environment. While we appreciate > > PostgreSQL and are likely to try to make an occasional contribution, > > where it seems to be mutually beneficial, the Wisconsin State Courts > > would survive the collapse of the PostgreSQL community. > > > > While I can only guess at the reasons you may have put the slant you > > did on the document, I think it really should reflect the patient > > assistance the community provides to those who read the developers FAQ > > and make a good faith effort to comply with what is outlined there. The > > cooperative, professional, and helpful demeanor of the members of this > > community is something which should balanced against the community's > > need to act as a gatekeeper on submissions. > > > > I have recent experience as a first time employee contributor. When we > > hit a bump in our initial use of PostgreSQL because of the non-standard > > character string literals, you were gracious in accepting our quick > > patch as being possibly of some value in the implementation of the > > related TODO item. You were then helpful in our effort to do a proper > > implementation of the TODO item which fixes it. I see that the patch I > > submitted was improved by someone before it made the release, which is > > great. > > > > This illustrates how the process can work. 
I informed management of > > the problem, and presented the options -- we could do our own little > > hack that we then had to maintain and apply as the versions moved along, > > or we could try to do fix which the community would accept and have that > > feature "just work" for us for all subsequent releases. The latter was > > a little more time up front, but resulted in a better quality product > > for us, and less work in the long term. It was also presumably of some > > benefit to the community, which has indirect benefit to our > > organization. Nobody here wants to switch database products again soon, > > so if we can solve our problem in a way that helps the product gain > > momentum, all the better. > > > > I ran a consulting business for decades, and I know that there is a > > great variation in the attitudes among managers. Many are quite > > reasonable. I'm reminded of a meeting early in my career with a > > businessman who owned and operated half a dozen successful businesses in > > a variety of areas. He proposed a deal that I was on the verge of > > accepting, albeit somewhat reluctantly. He stopped me and told me that > > he hoped to continue to do business with me, so any deal we made had to > > benefit and work for both of us or it was no good at all; if I was > > uncomfortable with something in the proposal, we should talk it out. > > That's the core of what we're trying to say in this document, isn't it? > > The rest is an executive overview of the developer FAQ? I can't help > > feeling that even with the revisions so far it could have a more > > positive "spin". 
> > > > -Kevin > > > > > > > > ---(end of broadcast)--- > > TIP 5: don't forget to increase your free space map settings > > > > > -- > Guido Barosio > --- > http://www.globant.com > [EMAIL PROTECTED] > > ---(end of broadcast)--- > TIP 5: don't forget to increase your free space map settings -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] recent --with-libxml support
I adjusted my buildfarm config (mongoose) to attempt to build HEAD --with-libxml. I added the following to build-farm.conf: if ($branch eq 'HEAD' || $branch ge 'REL8_3') { push(@{$conf{config_opts}}, "--with-includes=/usr/include/et:/usr/include/libxml2"); push(@{$conf{config_opts}}, "--with-libxml"); } As seen, I needed to add an include dir for configure to pass. However, make check fails now with the backend crashing. This can be seen in the buildfarm results for mongoose. According to gentoo portage, I have libxml2 version 2.6.26 installed on my system. I am not clear if I should have pointed it at libxml version 1 or 2, but configure seemed to be happy with libxml2. If it needs version 1, perhaps configure should do something to keep it from using version 2. Here is the diff for the xml regression test: *** ./expected/xml.out Thu Dec 21 16:47:22 2006 --- ./results/xml.out Thu Dec 21 16:59:32 2006 *** *** 58,68 SELECT xmlelement(name element, xmlattributes (1 as one, 'deuce' as two), 'content'); !xmlelement ! ! content ! (1 row) ! SELECT xmlelement(name element, xmlattributes ('unnamed and wrong')); ERROR: unnamed attribute value must be a column reference --- 58,64 SELECT xmlelement(name element, xmlattributes (1 as one, 'deuce' as two), 'content'); ! ERROR: cache lookup failed for type 0 SELECT xmlelement(name element, xmlattributes ('unnamed and wrong')); ERROR: unnamed attribute value must be a column reference *** *** 73,145 (1 row) SELECT xmlelement(name employee, xmlforest(name, age, salary as pay)) FROM emp; ! xmlelement ! -- ! sharon251000 ! sam302000 ! bill201000 ! jeff23600 ! cim30400 ! linda19100 ! (6 rows) ! ! SELECT xmlelement(name wrong, 37); ! ERROR: argument of XMLELEMENT must be type xml, not type integer ! SELECT xmlpi(name foo); ! xmlpi ! - ! ! (1 row) ! ! SELECT xmlpi(name xmlstuff); ! ERROR: invalid XML processing instruction ! DETAIL: XML processing instruction target name cannot start with "xml". ! SELECT xmlpi(name foo, 'bar'); ! 
xmlpi ! - ! ! (1 row) ! ! SELECT xmlpi(name foo, 'in?>valid'); ! ERROR: invalid XML processing instruction ! DETAIL: XML processing instruction cannot contain "?>". ! SELECT xmlroot ( ! xmlelement ( ! name gazonk, ! xmlattributes ( ! 'val' AS name, ! 1 + 1 AS num ! ), ! xmlelement ( ! NAME qux, ! 'foo' ! ) ! ), ! version '1.0', ! standalone yes ! ); ! xmlroot ! -- ! foo ! (1 row) ! ! SELECT xmlserialize(content data as character varying) FROM xmltest; ! data ! ! one ! two ! (2 rows) ! ! -- Check mapping SQL identifier to XML name ! SELECT xmlpi(name ":::_xml_abc135.%-&_"); ! xmlpi ! - ! ! (1 row) ! ! SELECT xmlpi(name "123"); ! xmlpi ! --- ! ! (1 row) ! --- 69,75 (1 row) SELECT xmlelement(name employee, xmlforest(name, age, salary as pay)) FROM emp; ! server closed the connection unexpectedly ! This probably means the server terminated abnormally ! before or while processing the request. ! connection to server was lost -- The very powerful and the very stupid have one thing in common. Instead of altering their views to fit the facts, they alter the facts to fit their views ... which can be very uncomfortable if you happen to be one of the facts that needs altering. -- Doctor Who, "Face of Evil" ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Operator class group proposal
"Tom Lane" <[EMAIL PROTECTED]> writes: > A class group is associated with a specific index AM and can contain only > opclasses for that AM. We might for instance invent "numeric" and > "numeric_reverse" groups for btree, to contain the default opclasses and > reverse-sort opclasses for the standard arithmetic types. I thought that would just be formalizing what we currently have. But I just discovered to my surprise tat it's not. I don't see any cross-data-type operators between any of the integer types and numeric, or between any of the floating point types and numeric, or between any of the integers and the floating point types. So does that mean we currently have three separate arithmetic "operator class groups" such as they currently exist and you can't currently do merge joins between some combinations of these arithmetic types? What puzzles me is that we used to have problems with bigint columns where people just did "WHERE bigint_col = 1". But my testing shows similar constructs between integer and numeric or other types with no cross-data-type comparator don't lead to similar problems. The system happily introduces casts now and uses the btree operator. So I must have missed another change that was also relevant to this in addition to the cross datatype operators. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] configure problem --with-libxml
Martijn van Oosterhout writes:
> On Fri, Dec 22, 2006 at 03:03:49PM +0100, Peter Eisentraut wrote:
>> The reason why I did not do this was that this could resolve
>> to -I/usr/include or -I/usr/local/include, but adding such a standard
>> path explicitly is wrong on some systems.

> But if people on such a system want to use libxml2, and they install it
> in /usr/include then they're screwed anyway. There's no way to tell the
> compiler to use only some files in a directory.

That's not the point; the point is that an *explicit* -I can be wrong (because it can change the search order of the default directories). Perhaps it'd be worth trying to add xml2-config's output only if the first probe for the headers fails?

			regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
Re: [HACKERS] Load distributed checkpoint
On Thu, 2006-12-21 at 18:46 +0900, ITAGAKI Takahiro wrote: > "Takayuki Tsunakawa" <[EMAIL PROTECTED]> wrote: > > > > If you use Linux, it has very unpleased behavior in fsync(); It locks all > > > metadata of the file being fsync-ed. We have to wait for the completion of > > > fsync when we do read(), write(), and even lseek(). > > > > Oh, really, what an evil fsync is! Yes, I sometimes saw a backend > > waiting for lseek() to complete when it committed. But why does the > > backend which is syncing WAL/pg_control have to wait for syncing the > > data file? They are, not to mention, different files, and WAL and > > data files are stored on separate disks. > > Backends call lseek() in planning, so they have to wait fsync() to > the table that they will access. Even if all of data in the file is in > the cache, lseek() conflict with fsync(). You can see a lot of backends > are waiting in planning phase in checkpoints, not executing phase. It isn't clear to me why you are doing planning during a test at all. If you are doing replanning during test execution then the real performance problem will be the planning, not the fact that the fsync stops planning from happening. Prepared queries are only replanned manually, so the chances of replanning during a checkpoint are fairly low. So although it sounds worrying, I'm not sure that we'll want to alter the use of lseek during planning - though there may be other arguments also. I have also seen cases where the WAL drive, even when separated, appears to spike upwards during a checkpoint. My best current theory, so far untested, is that the WAL and data drives are using the same CFQ scheduler and that the scheduler actively slows down WAL requests when it need not. Mounting the drives as separate block drives with separate schedulers, CFQ for data and Deadline for WAL should help. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Companies Contributing to Open Source
On Wed, 2006-12-20 at 00:29 -0800, David Fetter wrote:
> On Tue, Dec 19, 2006 at 05:40:12PM -0500, Bruce Momjian wrote:
> > Lukas Kahwe Smith wrote:
> > > I think another point you need to bring out more clearly is that
> > > the community is also often "miffed" if they feel they have been
> > > left out of the design and testing phases. This is sometimes just
> > > a reflex that is not always based on technical reasoning. It's just
> > > that, as you correctly point out, they are worried about being
> > > "high-jacked" by companies.
> >
> > I hate to mention an emotional community reaction in this document.
>
> You don't have to name it that if you don't want to, although respect
> (or at least a good simulation of it) is crucial when dealing with any
> person or group.

I'm very interested in this, because it does seem to me that there is an emotional reaction to many things.

> Handing the community a /fait accompli/ is a great
> way to convey disrespect, no matter how well-meaning the process
> originally was.

In a humble, non-confrontational tone: Why/How does a patch imply a fait accompli, or show any disrespect? My own reaction to Teodor's recent submission, or Kai-Uwe Sattler's recent contributions, has been: great news, the patches need some work, but thanks.

Please explain on, or off, list to help me understand.

-- 
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [HACKERS] configure problem --with-libxml
On Fri, Dec 22, 2006 at 03:03:49PM +0100, Peter Eisentraut wrote:
> Nikolay Samokhvalov wrote:
> > another way is:
> > export CPPFLAGS=$(xml2-config --cflags); ./configure --with-libxml
> >
> > I think that such a thing can be used in the configure script itself;
> > otherwise a lot of people will try, fail, and not use SQL/XML at all.
>
> The reason why I did not do this was that this could resolve
> to -I/usr/include or -I/usr/local/include, but adding such a standard
> path explicitly is wrong on some systems.

But if people on such a system want to use libxml2, and they install it in /usr/include, then they're screwed anyway. There's no way to tell the compiler to use only some files in a directory.

Put another way: if adding the include path for libxml2 breaks their build environment, they can't use libxml2. Having configure play dumb isn't helping anyone. It won't work on any more or fewer systems.

Have a nice day,
-- 
Martijn van Oosterhout http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
Re: [HACKERS] xmlagg is not supported?
Pavel Stehule wrote: > why xmlagg is missing in SQL/XML support? Because the version contained in the patch did not work properly. It should be added back, of course. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] xmlagg is not supported?
Nikolay Samokhvalov wrote: > Another thing that was removed is XMLCOMMENT.. XMLCOMMENT works. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] configure problem --with-libxml
Nikolay Samokhvalov wrote:
> another way is:
> export CPPFLAGS=$(xml2-config --cflags); ./configure --with-libxml
>
> I think that such a thing can be used in the configure script itself;
> otherwise a lot of people will try, fail, and not use SQL/XML at all.

The reason why I did not do this was that this could resolve to -I/usr/include or -I/usr/local/include, but adding such a standard path explicitly is wrong on some systems.

Clearly, we need to improve this, but I don't know how yet. Ideas welcome.

-- 
Peter Eisentraut http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Load distributed checkpoint
> > If you use linux, try the following settings:
> > 1. Decrease /proc/sys/vm/dirty_ratio and dirty_background_ratio.

You will need to pair this with bgwriter_* settings, else too few pages are written to the OS in between checkpoints.

> > 2. Increase wal_buffers to reduce WAL flushing.

You will want the possibility of single group writes to be able to reach 256kb. The default is thus not enough when you have enough RAM. You also want enough so that new txns don't need to wait for an empty buffer (one that is only freed by a write).

> > 3. Set wal_sync_method to open_sync; O_SYNC is faster than fsync().

O_SYNC's only advantage over fdatasync is that it saves a system call, since it still passes through the OS cache, but the disadvantage is that it does not let the OS group writes. Thus it is more susceptible to too few wal_buffers. What you want is O_DIRECT + enough wal_buffers to allow 256k writes.

> > 4. Separate data and WAL files into different partitions or disks.

While this is generally suggested, I somehow doubt its validity when you only have a few disk spindles. If, e.g., you only have 2-3 (mirrored) disks, I wouldn't do it (at least on the average 70/30 read/write systems).

Andreas

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [HACKERS] Load distributed checkpoint
> (3) is very strange. Your machine seems to be too restricted > by WAL so that other factors cannot be measured properly. Right... It takes as long as 15 seconds to fsync 1GB file. It's strange. This is a borrowed PC server, so the disk may be RAID 5? However, the WAL disk and DB disks show the same throughput. I'll investigate. I may have to find another machine. - Pentium4 3.6GHz with HT / 3GB RAM / Windows XP :-) Oh, Windows. Maybe the fsync() problem Itagaki-san pointed out does not exist. BTW, your env is showing attractive result, isn't it? - Original Message - From: "ITAGAKI Takahiro" <[EMAIL PROTECTED]> To: "Takayuki Tsunakawa" <[EMAIL PROTECTED]> Cc: Sent: Friday, December 22, 2006 6:09 PM Subject: Re: [HACKERS] Load distributed checkpoint "Takayuki Tsunakawa" <[EMAIL PROTECTED]> wrote: > (1) Default case(this is show again for comparison and reminder) > 235 80 226 77 240 > (2) Default + WAL 1MB case > 302 328 82 330 85 > (3) Default + wal_sync_method=open_sync case > 162 67 176 67 164 > (4) (2)+(3) case > 322 350 85 321 84 > (5) (4) + /proc/sys/vm/dirty* tuning > 308 349 84 349 84 (3) is very strange. Your machine seems to be too restricted by WAL so that other factors cannot be measured properly. I'll send results on my machine. - Pentium4 3.6GHz with HT / 3GB RAM / Windows XP :-) - shared_buffers=1GB - wal_sync_method = open_datasync - wal_buffers = 1MB - checkpoint_segments = 16 - checkpoint_timeout = 5min I repeated "pgbench -c16 -t500 -s50" and picked up results around checkpoints. [HEAD] ... 560.8 373.5 <- checkpoint is here 570.8 ... [with patch] ... 562.0 528.4 <- checkpoint (fsync) is here 547.0 ... Regards, --- ITAGAKI Takahiro NTT Open Source Software Center ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] New version of money type
On Thu, 21 Dec 2006 10:47:52 -0500 Tom Lane <[EMAIL PROTECTED]> wrote: > One bug I see in it is that you'd better make the alignment 'd' if the > type is to be int8. Also I much dislike these changes: > > - int32 i = PG_GETARG_INT32(1); > + int64 i = PG_GETARG_INT32(1); As I have made the few corrections that you pointed out, should I go ahead and commit so that it can be tested in a wider group? Also, there are further ideas out there to improve the type further that would be easier to handle with this out of the way. -- D'Arcy J.M. Cain | Democracy is three wolves http://www.druid.net/darcy/| and a sheep voting on +1 416 425 1212 (DoD#0082)(eNTP) | what's for dinner. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] xmlagg is not supported?
Another thing that was removed is XMLCOMMENT.. On 12/22/06, Nikolay Samokhvalov <[EMAIL PROTECTED]> wrote: Hmm... In my patch (http://chernowiki.ru/index.php?node=98) I didn't remove this, moreover I've fixed a couple of issues... Looks like it was removed by Peter (both patches he mailed lack it). Actually, without this function a set is SQL/XML publishing functions becomes rather poor. Peter? On 12/22/06, Pavel Stehule <[EMAIL PROTECTED]> wrote: > Hello, > > why xmlagg is missing in SQL/XML support? > > Regards > Pavel Stehule > > _ > Citite se osamele? Poznejte nekoho vyjmecneho diky Match.com. > http://www.msn.cz/ > > > ---(end of broadcast)--- > TIP 7: You can help support the PostgreSQL project by donating at > > http://www.postgresql.org/about/donate > -- Best regards, Nikolay -- Best regards, Nikolay ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] xmlagg is not supported?
Hmm... In my patch (http://chernowiki.ru/index.php?node=98) I didn't remove this, moreover I've fixed a couple of issues... Looks like it was removed by Peter (both patches he mailed lack it). Actually, without this function a set is SQL/XML publishing functions becomes rather poor. Peter? On 12/22/06, Pavel Stehule <[EMAIL PROTECTED]> wrote: Hello, why xmlagg is missing in SQL/XML support? Regards Pavel Stehule _ Citite se osamele? Poznejte nekoho vyjmecneho diky Match.com. http://www.msn.cz/ ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate -- Best regards, Nikolay ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] configure problem --with-libxml
another way is:

	export CPPFLAGS=$(xml2-config --cflags); ./configure --with-libxml

I think that such a thing can be used in the configure script itself; otherwise a lot of people will try, fail, and not use SQL/XML at all.

On 12/22/06, Pavel Stehule <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I try to compile postgres with SQL/XML, but I finished on
>
>   checking libxml/parser.h usability... no
>   checking libxml/parser.h presence... no
>   checking for libxml/parser.h... no
>   configure: error: header file is required for XML support
>
> I have Fedora Core 6, and libxml2-devel is installed. I checked
> parser.h, and this file is in the /usr/include/libxml2/libxml/ directory.
>
> I am sorry, but the configure file is a Spanish village for me, and I
> can't correct it.
>
> Regards
> Pavel Stehule

-- 
Best regards,
Nikolay

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
Re: [HACKERS] Load distributed checkpoint
From: "Greg Smith" <[EMAIL PROTECTED]> > This is actually a question I'd been meaning to throw out myself to this > list. How hard would it be to add an internal counter to the buffer > management scheme that kept track of the current number of dirty pages? > I've been looking at the bufmgr code lately trying to figure out how to > insert one as part of building an auto-tuning bgwriter, but it's unclear > to me how I'd lock such a resource properly and scalably. I have a > feeling I'd be inserting a single-process locking bottleneck into that > code with any of the naive implementations I considered. To put it in an extreme way, how about making bgwriter count the dirty buffers periodically scanning all the buffers? Do you know the book "Principles of Transaction Processing"? Jim Gray was one of the reviewers of this book. http://www.amazon.com/gp/aa.html?HMAC=&CartId=&Operation=ItemLookup&&ItemId=1558604154&ResponseGroup=Request,Large,Variations&bStyle=aaz.jpg&MerchantId=All&isdetail=true&bsi=Books&logo=foo&Marketplace=us&AssociateTag=pocketpc In chapter 8, the author describes fuzzy checkpoint combined with two-checkpoint approach. In his explanation, recovery manager (which would be bgwriter in PostgreSQL) scans the buffers and records the list of dirty buffers at each checkpoint. This won't need any locking in PostgreSQL if I understand correctly. Then, the recovery manager performs the next checkpoint after writing those dirty buffers. In two-checkpoint approach, crash recovery starts redoing from the second to last checkpoint. Two-checkpoint is described in Jim Gray's book, too. But they don't refer to how the recovery manager tunes the speed of writing. > slightly different from the proposals here. What if all the database page > writes (background writer, buffer eviction, or checkpoint scan) were > counted and periodic fsync requests send to the bgwriter based on that? 
> For example, when I know I have a battery-backed caching controller that
> will buffer 64MB worth of data for me, if I forced a fsync after every
> 6000 8K writes, no single fsync would get stuck waiting for the disk to
> write for longer than I'd like.

That seems interesting.

> You can do sync writes with perfectly good performance on systems with a
> good battery-backed cache, but I think you'll get creamed in comparisons
> against MySQL on IDE disks if you start walking down that path; since
> right now a fair comparison with similar logging behavior is an even match
> there, that's a step backwards.

I wonder what characteristics SATA disks have compared to IDE. Recent PCs are equipped with SATA disks, aren't they? How do you think your approach would compare to MySQL on IDE disks?

> Also on the topic of sync writes to the database proper: wouldn't using
> O_DIRECT for those be potentially counter-productive? I was under the
> impression that one of the behaviors counted on by Postgres was that data
> evicted from its buffer cache, eventually intended for writing to disk,
> was still kept around for a bit in the OS buffer cache. A subsequent read
> because the data was needed again might find the data already in the OS
> buffer, therefore avoiding an actual disk read; that substantially reduces
> the typical penalty for the database engine making a bad choice on what to
> evict. I fear a move to direct writes would put more pressure on the LRU
> implementation to be very smart, and that's code that you really don't
> want to be more complicated.

I'm worried about this, too.
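The "let the bgwriter count dirty pages by scanning" idea discussed above can be illustrated with a toy model. This is only a sketch, not PostgreSQL's actual bufmgr code: the names (`mark_dirty`, `count_dirty`, `NBUFFERS`) are made up for illustration, and a buffer header is reduced to a single dirty flag. The point it demonstrates is that the periodic count needs no global lock, since the scanner only reads one flag per buffer, and that a fuzzy-checkpoint-style pass can record and flush the dirty set in one sweep.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a shared buffer pool: one dirty flag per buffer.
 * In real PostgreSQL the flag lives in each buffer header. */
#define NBUFFERS 16

static bool buf_dirty[NBUFFERS];

/* Mark a buffer dirty, as a backend would after modifying a page. */
static void mark_dirty(int buf)
{
    buf_dirty[buf] = true;
}

/* Periodic bgwriter pass: count dirty buffers by scanning all of
 * them, reading one flag per buffer and taking no global lock. */
static int count_dirty(void)
{
    int n = 0;
    for (int i = 0; i < NBUFFERS; i++)
        if (buf_dirty[i])
            n++;
    return n;
}

/* "Checkpoint": visit every dirty buffer, write it out (here just
 * clear the flag), and report how many there were -- mimicking the
 * fuzzy-checkpoint step of recording the dirty list, then flushing. */
static int checkpoint(void)
{
    int flushed = 0;
    for (int i = 0; i < NBUFFERS; i++)
        if (buf_dirty[i]) {
            buf_dirty[i] = false;   /* stand-in for writing the page */
            flushed++;
        }
    return flushed;
}
```

In a real implementation the scan would race with backends dirtying pages concurrently, so the count is only approximate — which is fine for the tuning purpose discussed here.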
Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2
> >> You could make a case that we need *three* numbers: a permanent column
> >> ID, a display position, and a storage position.
>
> > Could this not be handled by some catalog fixup after an add/drop? If we
> > get to having 3 numbers you will almost have me convinced that this
> > might be too complicated after all.
>
> Actually, the more I think about it the more I think that 3 numbers
> might be the answer. 99% of the code would use only the permanent ID.

I am still of the opinion that the system tables as such are too visible to users and addon developers for us to change the meaning of attnum. And I don't quite see what the point is. To alter a table's column you need an exclusive lock, and plan invalidation (or are you intending to invalidate only plans that reference * ?). Once there, you can just as well fix the numbering. Yes, it is more work :-(

Andreas
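The three-number scheme floated in the quoted text can be sketched as follows. This is hypothetical, not the actual pg_attribute layout (which today has only attnum); the field names `lognum` and `physnum` are invented for illustration. The idea is that each column carries a permanent ID that never changes, while the display position (what SELECT * shows) and the storage position (where the value sits in the on-disk tuple) can be renumbered independently:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical column descriptor with the three numbers discussed. */
typedef struct Column {
    int id;       /* permanent column ID, never reused or renumbered */
    int lognum;   /* display position, as seen by SELECT *           */
    int physnum;  /* storage position in the on-disk tuple           */
} Column;

/* Look a column up by display position; the claim in the thread is
 * that 99% of code would instead reference the permanent id and
 * never touch the other two numbers. Returns NULL if not found. */
static const Column *by_display_pos(const Column *cols, int ncols, int lognum)
{
    for (int i = 0; i < ncols; i++)
        if (cols[i].lognum == lognum)
            return &cols[i];
    return NULL;
}

/* Example: a 3-column table where the second-created column (id 2)
 * has been moved to display first, but is stored last on disk --
 * no permanent ID had to change to get there. */
static const Column example[3] = {
    { 1, 2, 1 },
    { 2, 1, 3 },
    { 3, 3, 2 },
};
```

Andreas' objection still applies to this sketch: any code (or user query against the catalogs) that assumes attnum means all three things at once would need auditing.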
Re: [HACKERS] configure problem --with-libxml
I solved it via a symlink, but this is much cleaner. Maybe the configure script needs a little bit more intelligence. All people on RH systems will have to do this :-(

Thank you
Pavel Stehule

From: Stefan Kaltenbrunner <[EMAIL PROTECTED]>
To: Pavel Stehule <[EMAIL PROTECTED]>
CC: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] configure problem --with-libxml
Date: Fri, 22 Dec 2006 09:49:00 +0100

Pavel Stehule wrote:

Hello, I tried to compile postgres with SQL/XML, but I stopped at:

checking libxml/parser.h usability... no
checking libxml/parser.h presence... no
checking for libxml/parser.h... no
configure: error: header file <libxml/parser.h> is required for XML support

I have Fedora Core 6, and I have libxml2-devel installed. I checked, and parser.h is in the /usr/include/libxml2/libxml/ directory. I am sorry, but the configure file is a Spanish village to me (i.e., all Greek to me), and I can't correct it myself.

try adding --with-includes=/usr/include/libxml2 to your configure line

Stefan
Re: [HACKERS] Load distributed checkpoint
"Takayuki Tsunakawa" <[EMAIL PROTECTED]> wrote:
> (1) Default case (shown again for comparison and reminder)
> 235 80 226 77 240
> (2) Default + WAL 1MB case
> 302 328 82 330 85
> (3) Default + wal_sync_method=open_sync case
> 162 67 176 67 164
> (4) (2)+(3) case
> 322 350 85 321 84
> (5) (4) + /proc/sys/vm/dirty* tuning
> 308 349 84 349 84

(3) is very strange. Your machine seems to be so restricted by WAL that other factors cannot be measured properly. I'll send results from my machine:

- Pentium4 3.6GHz with HT / 3GB RAM / Windows XP :-)
- shared_buffers = 1GB
- wal_sync_method = open_datasync
- wal_buffers = 1MB
- checkpoint_segments = 16
- checkpoint_timeout = 5min

I repeated "pgbench -c16 -t500 -s50" and picked up the results around checkpoints.

[HEAD]
...
560.8
373.5 <- checkpoint is here
570.8
...

[with patch]
...
562.0
528.4 <- checkpoint (fsync) is here
547.0
...

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
[HACKERS] xmlagg is not supported?
Hello, why is xmlagg missing from the SQL/XML support?

Regards
Pavel Stehule
Re: [HACKERS] configure problem --with-libxml
Pavel Stehule wrote:

Hello, I tried to compile postgres with SQL/XML, but I stopped at:

checking libxml/parser.h usability... no
checking libxml/parser.h presence... no
checking for libxml/parser.h... no
configure: error: header file <libxml/parser.h> is required for XML support

I have Fedora Core 6, and I have libxml2-devel installed. I checked, and parser.h is in the /usr/include/libxml2/libxml/ directory. I am sorry, but the configure file is a Spanish village to me (i.e., all Greek to me), and I can't correct it myself.

try adding --with-includes=/usr/include/libxml2 to your configure line

Stefan
[HACKERS] configure problem --with-libxml
Hello, I tried to compile postgres with SQL/XML, but I stopped at:

checking libxml/parser.h usability... no
checking libxml/parser.h presence... no
checking for libxml/parser.h... no
configure: error: header file <libxml/parser.h> is required for XML support

I have Fedora Core 6, and I have libxml2-devel installed. I checked, and parser.h is in the /usr/include/libxml2/libxml/ directory. I am sorry, but the configure file is a Spanish village to me (i.e., all Greek to me), and I can't correct it myself.

Regards
Pavel Stehule
Re: [HACKERS] Load distributed checkpoint
From: Inaam Rana
> Which IO scheduler (elevator) are you using?

Elevator? Sorry, I'm not familiar with the kernel implementation, so I don't know what it is. My Linux distribution is Red Hat Enterprise Linux 4.0 for AMD64/EM64T, and the kernel is 2.6.9-42.ELsmp. I probably haven't changed any kernel settings, except for the IPC settings needed to run PostgreSQL.