Re: [HACKERS] recent --with-libxml support

2006-12-22 Thread Jeremy Drake
On Fri, 22 Dec 2006, Jeremy Drake wrote:

> On Sat, 23 Dec 2006, Tom Lane wrote:
>
> > Peter Eisentraut <[EMAIL PROTECTED]> writes:
> > > Jeremy Drake wrote:
> > >> #0  0xb7c4dc85 in memcpy () from /lib/tls/libc.so.6
> > >> #1  0x08190f59 in appendBinaryStringInfo (str=0xbfd87f90,
> > >> data=0x841ffc0 "qux", datalen=138543040) at stringinfo.c:192
> > >> #2  0x0828377f in map_sql_identifier_to_xml_name (ident=0x841ffc0
> > >> "qux", fully_escaped=0 '\0') at xml.c:933
> >
> > > Obviously the datalen has gone off the map.
> >
> > I wouldn't put 100% faith in that display, unless Jeremy built with -O0.
>
> I built this one with gcc 3.4.5 using --enable-debug --enable-cassert
> configure options.  I will try with -O0 and see what I get...

I just tried the same thing, but passing CFLAGS="-g -O0" to configure and
the xml test passed.  Maybe a '\0' termination issue?

I also recompiled everything with the defaults again (-O2) and the xml
test crashed in the same place.

So it is an issue of -O0 works vs -O2 does not.  Hate those...



-- 
When I get real bored, I like to drive downtown and get a great
parking spot, then sit in my car and count how many people ask me if
I'm leaving.
-- Steven Wright

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] recent --with-libxml support

2006-12-22 Thread Jeremy Drake
On Sat, 23 Dec 2006, Tom Lane wrote:

> Peter Eisentraut <[EMAIL PROTECTED]> writes:
> > Jeremy Drake wrote:
> >> #0  0xb7c4dc85 in memcpy () from /lib/tls/libc.so.6
> >> #1  0x08190f59 in appendBinaryStringInfo (str=0xbfd87f90,
> >> data=0x841ffc0 "qux", datalen=138543040) at stringinfo.c:192
> >> #2  0x0828377f in map_sql_identifier_to_xml_name (ident=0x841ffc0
> >> "qux", fully_escaped=0 '\0') at xml.c:933
>
> > Obviously the datalen has gone off the map.
>
> I wouldn't put 100% faith in that display, unless Jeremy built with -O0.

I built this one with gcc 3.4.5 using --enable-debug --enable-cassert
configure options.  I will try with -O0 and see what I get...


-- 
NAPOLEON: What shall we do with this soldier, Guiseppe?
Everything he says is wrong.
GUISEPPE: Make him a general, Excellency,
and then everything he says will be right.

-- G. B. Shaw, "The Man of Destiny"



Re: [HACKERS] recent --with-libxml support

2006-12-22 Thread Tom Lane
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> Jeremy Drake wrote:
>> #0  0xb7c4dc85 in memcpy () from /lib/tls/libc.so.6
>> #1  0x08190f59 in appendBinaryStringInfo (str=0xbfd87f90,
>> data=0x841ffc0 "qux", datalen=138543040) at stringinfo.c:192
>> #2  0x0828377f in map_sql_identifier_to_xml_name (ident=0x841ffc0
>> "qux", fully_escaped=0 '\0') at xml.c:933

> Obviously the datalen has gone off the map.

I wouldn't put 100% faith in that display, unless Jeremy built with -O0.
If it is accurate then the question is how could mblen fail so badly?

regards, tom lane



Re: [HACKERS] recent --with-libxml support

2006-12-22 Thread Peter Eisentraut
Jeremy Drake wrote:
> #0  0xb7c4dc85 in memcpy () from /lib/tls/libc.so.6
> #1  0x08190f59 in appendBinaryStringInfo (str=0xbfd87f90,
> data=0x841ffc0 "qux", datalen=138543040) at stringinfo.c:192
> #2  0x0828377f in map_sql_identifier_to_xml_name (ident=0x841ffc0
> "qux", fully_escaped=0 '\0') at xml.c:933

Obviously the datalen has gone off the map.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] recent --with-libxml support

2006-12-22 Thread Jeremy Drake
On Fri, 22 Dec 2006, Tom Lane wrote:

> Jeremy Drake <[EMAIL PROTECTED]> writes:
> >> Can you provide a stack trace for that crash?
>
> > #0  0xb7c4dc85 in memcpy () from /lib/tls/libc.so.6
> > #1  0x08190f59 in appendBinaryStringInfo (str=0xbfd87f90,
> > data=0x841ffc0 "qux", datalen=138543040) at stringinfo.c:192
> > #2  0x0828377f in map_sql_identifier_to_xml_name (ident=0x841ffc0 "qux",
> > fully_escaped=0 '\0') at xml.c:933
>
> Hmm ... it seems to work for me here, using Fedora 5's libxml.
>
> Are you by any chance running this with a non-C locale?  The trace
> suggests an encoding-mismatch sort of issue...

Nope.

I saw another buildfarm member that looks like it croaked in the same
place:
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=sponge&dt=2006-12-22%2022:30:02

So I guess it is not just me...


-- 
If you think education is expensive, try ignorance.
-- Derek Bok, president of Harvard



Re: [HACKERS] Interface for pg_autovacuum

2006-12-22 Thread Robert Treat
On Thursday 21 December 2006 10:57, Dave Page wrote:
> Simon Riggs wrote:
> > On Wed, 2006-12-20 at 09:47 -0500, Jim Nasby wrote:
> >> On the other hand, this would be the only part of the system where
> >> the official interface/API is a system catalog table. Do we really
> >> want to expose the internal representation of something as our API?
> >> That doesn't seem wise to me...
> >
> > Define and agree the API (the hard bit) and I'll code it (the easy bit).
> >
> > We may as well have something on the table, even if it changes later.
> >
> > Dave: How does PgAdmin handle setting table-specific autovacuum
> > parameters? (Does it?)
>
> Yes, it adds/removes/edits rows in pg_autovacuum as required.
>

We do this in phppgadmin too, although I also added a screen that shows a list 
of entries with schema and table names (rather than vacrelid), since otherwise 
it is too much of a pain to keep things straight.  My intent is also to add 
controls at the table level (where we'll know the vacrelid anyway), though it 
will probably be put off until there is more demand for it. 

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL



Re: [HACKERS] recent --with-libxml support

2006-12-22 Thread Tom Lane
Jeremy Drake <[EMAIL PROTECTED]> writes:
>> Can you provide a stack trace for that crash?

> #0  0xb7c4dc85 in memcpy () from /lib/tls/libc.so.6
> #1  0x08190f59 in appendBinaryStringInfo (str=0xbfd87f90,
> data=0x841ffc0 "qux", datalen=138543040) at stringinfo.c:192
> #2  0x0828377f in map_sql_identifier_to_xml_name (ident=0x841ffc0 "qux",
> fully_escaped=0 '\0') at xml.c:933

Hmm ... it seems to work for me here, using Fedora 5's libxml.

Are you by any chance running this with a non-C locale?  The trace
suggests an encoding-mismatch sort of issue...

regards, tom lane



Re: [HACKERS] Strange pgsql crash on MacOSX

2006-12-22 Thread Tom Lane
Shane Ambler <[EMAIL PROTECTED]> writes:
> postgres=# \q
> psql(24931) malloc: *** error for object 0x180a800: incorrect checksum
> for freed object - object was probably modified after being freed, break
> at szone_error to debug
> psql(24931) malloc: *** set a breakpoint in szone_error to debug
> Segmentation fault

I think we've seen something like this before in connection with
readline/libedit follies.  Does the crash go away if you invoke
psql with "-n" option?  If so, exactly which version of readline or
libedit are you using?

FWIW, I do not see this on a fully up-to-date 10.4.8 G4 laptop.
I see

$ ls -l /usr/lib/libedit*
-rwxr-xr-x   1 root  wheel  112404 Sep 29 20:59 /usr/lib/libedit.2.dylib
lrwxr-xr-x   1 root  wheel  15 Apr 26  2006 /usr/lib/libedit.dylib -> libedit.2.dylib
$

so it seems that Apple did update libedit not too long ago ...

regards, tom lane



Re: [HACKERS] Operator class group proposal

2006-12-22 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> So the only reason we needed the cross-data-type operators was to get better
> estimates? I thought without them you couldn't get an index-based plan at all.

Oh, hm, there is that --- you won't get a nestloop with inner indexscan
unless the join expression uses the unmodified inner variable (unless
you do something weird like provide an index on the casted value...)

However, we need to be pretty wary about widening the families unless
we're sure that the semantics are right.  In particular, I think that
numeric-vs-float crosstype operators would violate the transitive law:
you could have values for which A=B and B=C but A!=C.  This is because
we smash numerics to float for comparison, and so there are distinct
numeric values that can compare equal to the same float.  bigint against
float same problem.  It'd be OK to integrate integers and numeric into
one class, but how much real value is there in that?

regards, tom lane



[HACKERS] Strange pgsql crash on MacOSX

2006-12-22 Thread Shane Ambler

I have a dual G4 1.25Ghz with 2GB RAM running Mac OSX 10.4.8 and
PostgreSQL 8.2.0

This first happened to me today, and with everything I have tried it 
always happens now - it had been running fine before.


The only thing I can think of that has changed in the last few days is I
have installed the last 2 security updates from Apple and the X11 update 
(X11 1.1.3) that Apple released a while ago -


http://www.apple.com/support/downloads/securityupdate2006008ppc.html
http://www.apple.com/support/downloads/securityupdate20060071048clientppc.html

The first one I can't see having anything to do with postgres, as I 
believe it only updates Java. The other one updates a few different areas
and may be the culprit.

I can't think of anything else I have changed just recently - certainly 
not in the last couple of days.


To test and try to track down the cause, I restarted my machine, 
then started over by unzipping the released 8.2.0 source and doing the 
following steps (this example is with clean data files and everything 
default - the startup script has been there a while, and using pg_ctl 
instead makes no difference; make check passes all tests):


./configure --prefix=/usr/local/pgsql
make check
sudo make install
cd /usr/local/pgsql
sudo mkdir data
sudo chown pgsql:pgsql data
sudo chmod 700 data
sudo -u pgsql /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
sudo /Library/StartupItems/PostgreSQL/PostgreSQL start

Then I get the following -
[devbox:~] shane% psql
Welcome to psql 8.2.0, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
   \h for help with SQL commands
   \? for help with psql commands
   \g or terminate with semicolon to execute query
   \q to quit

postgres=# \q
psql(24931) malloc: *** error for object 0x180a800: incorrect checksum
for freed object - object was probably modified after being freed, break
at szone_error to debug
psql(24931) malloc: *** set a breakpoint in szone_error to debug
Segmentation fault
[devbox:~] shane%

The serverlog gives me -
[devbox:local/pgsql/data] root# cat serverlog
LOG:  database system was shut down at 2006-12-23 12:27:44 CST
LOG:  checkpoint record is at 0/42BEB8
LOG:  redo record is at 0/42BEB8; undo record is at 0/0; shutdown TRUE
LOG:  next transaction ID: 0/593; next OID: 10820
LOG:  next MultiXactId: 1; next MultiXactOffset: 0
LOG:  database system is ready


Apple's crashreporter gives me -

Date/Time:  2006-12-23 12:28:21.499 +1030
OS Version: 10.4.8 (Build 8L127)
Report Version: 4

Command: psql
Path:    /usr/local/pgsql/bin/psql
Parent:  tcsh [294]

Version: ??? (???)

PID:     24931
Thread:  0

Exception:  EXC_BAD_ACCESS (0x0001)
Codes:  KERN_INVALID_ADDRESS (0x0001) at 0x3430616b

Thread 0 Crashed:
0   libSystem.B.dylib   0x90006cd8 szone_free + 3148
1   libSystem.B.dylib   0x900152d0 fclose + 176
2   libedit.2.dylib 0x96b5c334 history_end + 1632
3   libedit.2.dylib 0x96b5c7bc history + 468
4   libedit.2.dylib 0x96b5ec58 write_history + 84
5   psql0x8350 saveHistory + 208
6   psql0x8428 finishInput + 120
7   libSystem.B.dylib   0x90014578 __cxa_finalize + 260
8   libSystem.B.dylib   0x9001 exit + 36
9   psql0x1d00 _start + 764
10  psql0x1a00 start + 48

Thread 0 crashed with PPC Thread State 64:
  srr0: 0x90006cd8  srr1: 0xd030  vrsave: 0x
    cr: 0x42002444   xer: 0x2001    lr: 0x90006ca4   ctr: 0x900143a0
    r0: 0x90006ca4    r1: 0xb610     r2: 0x42002442    r3: 0x000d
    r4: 0x        r5: 0x000d     r6: 0x80808080    r7: 0x0003
    r8: 0x39333100    r9: 0xb545    r10: 0x       r11: 0x42002442
   r12: 0x900143a0   r13: 0x       r14: 0x       r15: 0x
   r16: 0x       r17: 0x0052   r18: 0x0400   r19: 0x0054
   r20: 0x02a4   r21: 0x0180a800   r22: 0xa0001fac   r23: 0x02a8
   r24: 0x0002   r25: 0x0002   r26: 0x0001   r27: 0x34306167
   r28: 0x0180   r29: 0x0180a400   r30: 0x2e616767   r31: 0x900060a0

Binary Images Description:
0x1000 - 0x36fff         psql                   /usr/local/pgsql/bin/psql
0x3f000 - 0x54fff        libpq.5.dylib          /usr/local/pgsql/lib/libpq.5.dylib
0x8fe0 - 0x8fe51fff      dyld 45.3              /usr/lib/dyld
0x9000 - 0x901bcfff      libSystem.B.dylib      /usr/lib/libSystem.B.dylib
0x90214000 - 0x90219fff  libmathCommon.A.dylib  /usr/lib/system/libmathCommon.A.dylib
0x9110f000 - 0x9111dfff  libz.1.dylib           /usr/lib/libz.1.dylib
0x969c3000 - 0x969f1fff  libncurses.5.4.dylib   /usr/lib/libncurses.5.4.dylib
0x96b4d000 - 0x96b63fff  libedit.2.dylib        /usr/lib/libedit.2.dylib

Model: PowerMac3,6, BootROM 4.4.8f2, 2 proces

Re: [HACKERS] Companies Contributing to Open Source

2006-12-22 Thread Bruce Momjian
Kevin Grittner wrote:
> >>> On Tue, Dec 19, 2006 at  6:13 PM, in message
> <[EMAIL PROTECTED]>, Bruce Momjian
> <[EMAIL PROTECTED]> wrote:
> > if the company dies, the community keeps going (as it did after
> Great
> > Bridge, without a hickup), but if the community dies, the company
> dies
> > too.
>  
> This statement seems to ignore organizations for which PostgreSQL is an
> implementation detail in their current environment.  While we appreciate
> PostgreSQL and are likely to try to make an occasional contribution,
> where it seems to be mutually beneficial, the Wisconsin State Courts
> would survive the collapse of the PostgreSQL community.

Yes, the statement relates mostly to companies that sell/support/enhance
open source software, rather than users who are using the software in
their businesses.  And that text isn't in the article, it was just in an
email to make a distinction.

I think I have improved the slant of the article.  Let me know if it
needs further improvement.  Thanks.

---


>  
> While I can only guess at the reasons you may have put the slant you
> did on the document, I think it really should reflect the patient
> assistance the community provides to those who read the developers FAQ
> and make a good faith effort to comply with what is outlined there.  The
> cooperative, professional, and helpful demeanor of the members of this
> community is something which should be balanced against the community's
> need to act as a gatekeeper on submissions.
>  
> I have recent experience as a first time employee contributor.  When we
> hit a bump in our initial use of PostgreSQL because of the non-standard
> character string literals, you were gracious in accepting our quick
> patch as being possibly of some value in the implementation of the
> related TODO item.  You were then helpful in our effort to do a proper
> implementation of the TODO item which fixes it.  I see that the patch I
> submitted was improved by someone before it made the release, which is
> great.
>  
> This illustrates how the process can work.  I informed management of
> the problem, and presented the options -- we could do our own little
> hack that we then had to maintain and apply as the versions moved along,
> or we could try to do a fix which the community would accept and have that
> feature "just work" for us for all subsequent releases.  The latter was
> a little more time up front, but resulted in a better quality product
> for us, and less work in the long term.  It was also presumably of some
> benefit to the community, which has indirect benefit to our
> organization.  Nobody here wants to switch database products again soon,
> so if we can solve our problem in a way that helps the product gain
> momentum, all the better.
>  
> I ran a consulting business for decades, and I know that there is a
> great variation in the attitudes among managers.  Many are quite
> reasonable.  I'm reminded of a meeting early in my career with a
> businessman who owned and operated half a dozen successful businesses in
> a variety of areas.  He proposed a deal that I was on the verge of
> accepting, albeit somewhat reluctantly.  He stopped me and told me that
> he hoped to continue to do business with me, so any deal we made had to
> benefit and work for both of us or it was no good at all; if I was
> uncomfortable with something in the proposal, we should talk it out. 
> That's the core of what we're trying to say in this document, isn't it? 
> The rest is an executive overview of the developer FAQ?  I can't help
> feeling that even with the revisions so far it could have a more
> positive "spin".
>  
> -Kevin
>  

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Companies Contributing to Open Source

2006-12-22 Thread Bruce Momjian

OK, based on this feedback and others, I have made a new version of the
article:

http://momjian.us/main/writings/pgsql/company_contributions/

There are no new concepts, just a more balanced article with some of the
awkward wording improved.  I also added a link to the article from the
developer's FAQ.

---

Joshua D. Drake wrote:
> Hello,
> 
> O.k. below are some comments. Your article, although well written, has a
> distinct slant from the community perspective ;) and I think there are
> some points from the business side that are missed.
> 
> ---
> Employees working in open source communities have two bosses -- the
> companies that employ them, and the open source community, which must
> review their proposals and patches. Ideally, both groups would want the
> same thing, but often companies have different priorities in terms of
> deadlines, time investment, and implementation details. And,
> unfortunately for companies, open source communities rarely adjust their
> requirements to match company needs. They would often rather "do
> without" than tie their needs to those of a company.
> ---
> 
> Employees don't have two bosses, at least not in the presentation above.
> In the community the employee may choose to do it the community's way or
> not. That choice is much more defined within a boss's purview. 
> 
> A company's priorities include one very powerful priority that the
> community does not have, and I believe it should be reflected in a
> document such as this: actually feeding the employee. There is a
> tendency for the community to forget that every minute spent on
> community work is a direct cost to the immediate (note that I say
> immediate) bottom line. That means that priorities must be balanced so
> that profit can be made, employees can get bonuses and, god forbid, a
> steady paycheck.
> 
> ---
> This makes the employee's job difficult. When working with the
> community, it can be difficult to meet company demands. If the company
> doesn't understand how the community works, the employee can be seen as
> defiant, when in fact the employee has no choice but to work in the
> community process and within the community timetable.
> 
> By serving two masters, employees often exhibit various behaviors that
> make their community involvement ineffective. Below I outline the steps
> involved in open source development, and highlight the differences
> experienced by employees involved in such activities.
> ---
> 
> The first paragraph seems to need some qualification. An employee is
> hired to work in the best interests of the company, not the community.
> Those two things may overlap, but that is subject to the company's
> declaration. If the employee is not doing the task as delegated, that is
> defiant.
> 
> I am suspecting that your clarification would be something to the effect
> of:
> 
> When a company sets forth to donate resources to the community, it can
> make an employee's job difficult. It is important for the company to
> understand exactly what it is giving and the process that gift entails.
> 
> Or something like that.
> 
> I take exception to the term "serving two masters"; I am certainly not
> the master of my team, but that may just be me.
> 
> ---
> Employees usually circulate their proposal inside their companies first
> before sharing it with the community. Unfortunately, many employees
> never take the additional step of sharing the proposal with the
> community. This means the employee is not benefitting from community
> oversight and suggestions, often leading to a major rewrite when a patch
> is submitted to the community.
> ---
> 
> I think the above is not quite accurate. I see few proposals actually
> come across to the community either and those that do seem to get bogged
> down instead of progress being made.
> 
> The most successful topics I have seen are those that usually have some
> footing behind them *before* they bring it to the community.
> 
> ---
> For employees, patch review often happens in the company first. Only
> when the company is satisfied is the patch submitted to the community.
> This is often done because of the perception that poor patches reflect
> badly on the company. The problem with this patch pre-screening is that
> it prevents parallel review, where the company and community are both
> reviewing the patch. Parallel review speeds completion and avoids
> unnecessary refactoring.
> ---
> 
> It does affect the perception of the company. Maybe not to the community,
> but as someone who reads comments on the patches that come through... I
> do not look forward to the day when I have a customer who says, didn't
> you submit that patch that was torn apart by...
> 
> ---
> As you can see, community involvement has unique challenges for company
> employees. There are often many mismatches between company needs and
> community needs, and the company must decide if it is worth honoring the
> co

Re: [HACKERS] Load distributed checkpoint

2006-12-22 Thread Bruce Momjian
Gregory Stark wrote:
> 
> "Bruce Momjian" <[EMAIL PROTECTED]> writes:
> 
> > I have a new idea.  Rather than increasing write activity as we approach
> > checkpoint, I think there is an easier solution.  I am very familiar
> > with the BSD kernel, and it seems they have a similar issue in trying to
> > smooth writes:
> 
> Just to give a bit of context for this. The traditional mechanism for syncing
> buffers to disk on BSD which this daemon was a replacement for was to simply
> call "sync" every 30s. Compared to that this daemon certainly smooths the I/O
> out over the 30s window...
> 
> Linux has a more complex solution to this (of course) which has undergone a
> few generations over time. Older kernels had a user space daemon called
> bdflush which called an undocumented syscall every 5s. More recent ones have a
> kernel thread called pdflush. I think both have various mostly undocumented
> tuning knobs but neither makes any sort of guarantee about the amount of time
> a dirty buffer might live before being synced.
> 
> Your thinking is correct but that's already the whole point of bgwriter isn't
> it? To get the buffers out to the kernel early in the checkpoint interval so
> that come checkpoint time they're hopefully already flushed to disk. As long
> as your checkpoint interval is well over 30s only the last 30s (or so, it's a
> bit fuzzier on Linux) should still be at risk of being pending.
> 
> I think the main problem with an additional pause in the hopes of getting more
> buffers synced is that during the 30s pause on a busy system there would be a
> continual stream of new dirty buffers being created as bgwriter works and
> other backends need to reuse pages. So when the fsync is eventually called
> there will still be a large amount of i/o to do. Fundamentally the problem is
> that fsync is too blunt an instrument. We only need to fsync the buffers we
> care about, not the entire file.

Well, one idea would be for the bgwriter not to do many write()'s
between the massive checkpoint write()'s and the fsync()'s.  That would
cut down on the extra I/O that fsync() would have to do.

The problem I see with making the bgwriter do more writes between
checkpoints is the overhead of those scans, and the overhead of doing
writes that will later be dirtied again before the checkpoint.  With the
delay-between-stages idea, we don't need to guess how aggressive the
bgwriter needs to be --- we can just do the writes, and wait for a
while.

On an idle system, would someone dirty a large file, and watch the disk
I/O to see how long it takes for the I/O to complete to disk?

In what we have now, we are either having the bgwriter do too much I/O
between checkpoints, or guaranteeing an I/O storm during a checkpoint by
doing lots of write()'s and then calling fsync() right away.  I don't
see how we are ever going to get that properly tuned.

Would someone code up a patch and test it?

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Companies Contributing to Open Source

2006-12-22 Thread Gregory Stark

"Simon Riggs" <[EMAIL PROTECTED]> writes:

> In a humble, non-confrontational tone: Why/How does a patch imply a fait
> accompli, or show any disrespect?

Well depending on the circumstances it could show the poster isn't interested
in the judgement of the existing code authors. It can be hard to tell someone
that their last 6 months of work was all in a direction that other developers
would rather Postgres not head.

However I think people are over-generalising if they think this is always
true. 

Patches are often submitted by people who invite comment and are open to new
ideas and reworking their approach. Whether the submission is as a fait
accompli or as the beginning of a dialogue (imho a more productive dialogue
than the usual hand-waving on -hackers) is determined more by the attitude of
the presenter and willingness to take criticisms and make changes than it is
by the mere fact that they've written code without prior approval.

The flip side of all of this is that "the community" doesn't always engage
when people do ask for feedback. I asked for comments on how best to proceed
getting info down to the Sort node from a higher Limit node to implement the
limit-sort optimization and didn't get any guidance. As a result I'm kind of
stuck. I can proceed without feedback but I fear I would be, in fact,
presenting the result as a fait accompli which would end up getting rejected
if others were less comfortable with breaking the planner and executor
abstractions (or if I choose not to do so and they decide the necessary
abstractions are needless complexity).

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [HACKERS] Load distributed checkpoint

2006-12-22 Thread Greg Smith

On Fri, 22 Dec 2006, Simon Riggs wrote:


> I have also seen cases where the WAL drive, even when separated, appears
> to spike upwards during a checkpoint. My best current theory, so far
> untested, is that the WAL and data drives are using the same CFQ
> scheduler and that the scheduler actively slows down WAL requests when
> it need not. Mounting the drives as separate block drives with separate
> schedulers, CFQ for data and Deadline for WAL, should help.


The situation I've been seeing is that the database needs a new block to 
complete a query and issues a read request to get it, but that read is 
behind the big checkpoint fsync.  Client sits there for quite some time 
waiting for the fsync to finish before it gets the data it needs, and now 
your trivial select took seconds to complete.  It's fairly easy to 
replicate this problem using pgbench on Linux--I've seen a query sit there 
for 15 seconds when going out of my way to aggravate the behavior.  One of 
Takayuki's posts here mentioned a worst-case delay of 13 seconds, that's 
the problem rearing its ugly head.


You may be right that what you're seeing would be solved with a more 
complicated tuning on a per-device basis (which, by the way, isn't 
available unless you're running a more recent Linux kernel than many 
distributions have available).  You can tune the schedulers all day and 
not make a lick of difference to what I've been running into; I know, I 
tried.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] Operator class group proposal

2006-12-22 Thread Gregory Stark

Tom Lane <[EMAIL PROTECTED]> writes:

> No, what you'll get is something like
> 
>   int4var::float8 float8eq float8var
> 
> which is perfectly mergejoinable ... however, it's not clear that the
> planner will make very good estimates about the value of the cast
> expression.  I'm not sure if it's worth introducing a pile more
> crosstype operators to change that situation --- improving
> the selectivity functions to handle casts better might be a wiser
> approach.

So the only reason we needed the cross-data-type operators was to get better
estimates? I thought without them you couldn't get an index-based plan at all.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [HACKERS] Load distributed checkpoint

2006-12-22 Thread Gregory Stark

"Bruce Momjian" <[EMAIL PROTECTED]> writes:

> I have a new idea.  Rather than increasing write activity as we approach
> checkpoint, I think there is an easier solution.  I am very familiar
> with the BSD kernel, and it seems they have a similar issue in trying to
> smooth writes:

Just to give a bit of context for this. The traditional mechanism for syncing
buffers to disk on BSD which this daemon was a replacement for was to simply
call "sync" every 30s. Compared to that this daemon certainly smooths the I/O
out over the 30s window...

Linux has a more complex solution to this (of course) which has undergone a
few generations over time. Older kernels had a user space daemon called
bdflush which called an undocumented syscall every 5s. More recent ones have a
kernel thread called pdflush. I think both have various mostly undocumented
tuning knobs but neither makes any sort of guarantee about the amount of time
a dirty buffer might live before being synced.

Your thinking is correct, but isn't that already the whole point of bgwriter?
To get the buffers out to the kernel early in the checkpoint interval so
that come checkpoint time they're hopefully already flushed to disk. As long
as your checkpoint interval is well over 30s, only the last 30s (or so; it's a
bit fuzzier on Linux) should still be at risk of being pending.

I think the main problem with an additional pause in the hope of getting more
buffers synced is that during the 30s pause on a busy system there would be a
continual stream of new dirty buffers being created as bgwriter works and
other backends need to reuse pages. So when the fsync is eventually called
there will still be a large amount of I/O to do. Fundamentally the problem is
that fsync is too blunt an instrument: we only need to fsync the buffers we
care about, not the entire file.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] recent --with-libxml support

2006-12-22 Thread Jeremy Drake
On Fri, 22 Dec 2006, Tom Lane wrote:

> Jeremy Drake <[EMAIL PROTECTED]> writes:
> > As seen, I needed to add an include dir for configure to pass.  However,
> > make check fails now with the backend crashing.  This can be seen in the
> > buildfarm results for mongoose.
>
> Can you provide a stack trace for that crash?

#0  0xb7c4dc85 in memcpy () from /lib/tls/libc.so.6
#1  0x08190f59 in appendBinaryStringInfo (str=0xbfd87f90,
data=0x841ffc0 "qux", datalen=138543040) at stringinfo.c:192
#2  0x0828377f in map_sql_identifier_to_xml_name (ident=0x841ffc0 "qux",
fully_escaped=0 '\0') at xml.c:933
#3  0x0811ce83 in transformXmlExpr (pstate=0x84202b8, x=0x8420034)
at parse_expr.c:1426
#4  0x0811ac91 in transformExpr (pstate=0x84202b8, expr=0x8420034)
at parse_expr.c:238
#5  0x0811ceb4 in transformXmlExpr (pstate=0x84202b8, x=0x8420174)
at parse_expr.c:1456
#6  0x0811ac91 in transformExpr (pstate=0x84202b8, expr=0x8420174)
at parse_expr.c:238
#7  0x081288a4 in transformTargetEntry (pstate=0x84202b8, node=0x8420174,
expr=0x0, colname=0x0, resjunk=0 '\0') at parse_target.c:74
#8  0x0812890e in transformTargetList (pstate=0x84202b8, targetlist=0x1)
at parse_target.c:146
#9  0x080ffcef in transformStmt (pstate=0x84202b8, parseTree=0x84201fc,
extras_before=0xbfd882c4, extras_after=0xbfd882c8) at analyze.c:2102
#10 0x08101421 in do_parse_analyze (parseTree=0x841ffc0, pstate=0x84202b8)
at analyze.c:251
#11 0x0810227a in parse_analyze (parseTree=0x84201fc,
sourceText=0x841ffc0 "qux", paramTypes=0x841ffc0, numParams=138543040)
at analyze.c:173
#12 0x0820b66e in pg_analyze_and_rewrite (parsetree=0x84201fc,
query_string=0x841fb74 "SELECT xmlconcat(xmlcomment('hello'),\n", ' '
, "xmlelement(NAME qux, 'foo'),\n", ' ' , "xmlcomment('world'));", paramTypes=0x0, numParams=0) at
postgres.c:567
#13 0x0820b91e in exec_simple_query (
query_string=0x841fb74 "SELECT xmlconcat(xmlcomment('hello'),\n", ' '
, "xmlelement(NAME qux, 'foo'),\n", ' ' , "xmlcomment('world'));") at postgres.c:875
#14 0x0820d72b in PostgresMain (argc=4, argv=0x83c5c2c,
username=0x83c5bfc "jeremyd") at postgres.c:3418
#15 0x081dfbd7 in ServerLoop () at postmaster.c:2924
#16 0x081e132c in PostmasterMain (argc=3, argv=0x83c4550) at
postmaster.c:958
#17 0x081991e0 in main (argc=3, argv=0x83c4550) at main.c:188


-- 
In Tennessee, it is illegal to shoot any game other than whales from a
moving automobile.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Load distributed checkpoint

2006-12-22 Thread Inaam Rana

On 12/22/06, Takayuki Tsunakawa <[EMAIL PROTECTED]> wrote:


 From: Inaam Rana
> Which IO Shceduler (elevator) you are using?

Elevator?  Sorry, I'm not familiar with the kernel implementation, so I
don't know what it is.  My Linux distribution is Red Hat Enterprise Linux 4.0
for AMD64/EM64T, and the kernel is 2.6.9-42.ELsmp.  I probably haven't changed
any kernel settings, except for IPC settings to run PostgreSQL.



There are four IO schedulers in Linux: anticipatory, CFQ (the default),
deadline, and noop. For typical OLTP-type loads, deadline is generally
recommended. If you are constrained on CPU and you have a good controller,
then it's better to use noop.
Deadline attempts to merge requests by maintaining two red-black trees in
sector sort order, and it also ensures that a request is serviced within a
given time by using a FIFO. I don't expect it to do magic, but I was wondering
whether it might dilute the issue of fsync() elbowing out WAL writes.

You can look into /sys/block/<device>/queue/scheduler to see which scheduler
you are using.

regards,
inaam


--
Inaam Rana
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] Load distributed checkpoint

2006-12-22 Thread Bruce Momjian

I have a new idea.  Rather than increasing write activity as we approach
checkpoint, I think there is an easier solution.  I am very familiar
with the BSD kernel, and it seems they have a similar issue in trying to
smooth writes:


http://www.brno.cas.cz/cgi-bin/bsdi-man?proto=1.1&query=update&msection=4&apropos=0

UPDATE(4)   BSD Programmer's Manual   UPDATE(4)

NAME
 update - trickle sync filesystem caches to disk

DESCRIPTION
 At system boot time, the kernel starts filesys_syncer, process
 3.  This process helps protect the integrity of disk volumes
 by ensuring that volatile cached filesystem data are written
 to disk within the vfs.generic.syncdelay interval which defaults
 to thirty seconds (see sysctl(8)).  When a vnode is first
 written it is placed vfs.generic.syncdelay seconds down on
 the trickle sync queue.  If it still exists and has dirty data
 when it reaches the top of the queue, filesys_syncer writes
 it to disk.  This approach evens out the load on the underlying
 I/O system and avoids writing short-lived files.  The papers
 on trickle-sync tend to favor aging based on buffers rather
 than files.  However, BSD/OS synchronizes on file age rather
 than buffer age because the data structures are much smaller
 as there are typically far fewer files than buffers.  Although
 this can make the I/O bursty when a big file is written to
 disk, it is still much better than the wholesale writes that
 were being done by the historic update process which wrote
 all dirty data buffers every 30 seconds.  It also adapts much
 better to the soft update code which wants to control aging
 to improve performance (inodes age in one third of
 vfs.generic.syncdelay seconds, directories in one half of
 vfs.generic.syncdelay seconds).  This ordering ensures that
 most dependencies are gone (e.g., inodes are written when
 directory entries want to go to disk) reducing the amount
 of work that the soft update code needs to do.

I assume other kernels have similar I/O smoothing, so that data sent to
the kernel via write() gets to disk within 30 seconds.  

I assume write() is not our checkpoint performance problem, but the
transfer to disk via fsync().  Perhaps a simple solution is to do the
write()'s of all dirty buffers as we do now at checkpoint time, but
delay 30 seconds and then do fsync() on all the files.  The goal here is
that during the 30-second delay, the kernel will be forcing data to the
disk, so the fsync() we eventually do will only be for the write() of
buffers during the 30-second delay, and because we wrote all dirty
buffers 30 seconds ago, there shouldn't be too many of them.

I think the basic difference between this and the proposed patch is that
we do not put delays in the buffer write() or fsync() phases --- we just
put a delay _between_ the phases, and wait for the kernel to smooth it
out for us.  The kernel certainly knows more about what needs to get to
disk, so it seems logical to let it do the I/O smoothing.

---

Bruce Momjian wrote:
> 
> I have thought a while about this and I have some ideas.
> 
> Ideally, we would be able to trickle the sync of individuals blocks
> during the checkpoint, but we can't because we rely on the kernel to
> sync all dirty blocks that haven't made it to disk using fsync().  We
> could trickle the fsync() calls, but that just extends the amount of
> data we are writing that has been dirtied post-checkpoint.  In an ideal
> world, we would be able to fsync() only part of a file at a time, and
> only those blocks that were dirtied pre-checkpoint, but I don't see that
> happening anytime soon (and one reason why many commercial databases
> bypass the kernel cache).
> 
> So, in the real world, one conclusion seems to be that our existing
> method of tuning the background writer just isn't good enough for the
> average user:
> 
>   #bgwriter_delay = 200ms # 10-10000ms between rounds
>   #bgwriter_lru_percent = 1.0 # 0-100% of LRU buffers scanned/round
>   #bgwriter_lru_maxpages = 5  # 0-1000 buffers max written/round
>   #bgwriter_all_percent = 0.333   # 0-100% of all buffers scanned/round
>   #bgwriter_all_maxpages = 5  # 0-1000 buffers max written/round
> 
> These settings control what the bgwriter does, but they do not clearly
> relate to the checkpoint timing, which is the purpose of the bgwriter,
> and they don't change during the checkpoint interval, which is also less
> than ideal.  If set t

Re: [HACKERS] recent --with-libxml support

2006-12-22 Thread Tom Lane
Jeremy Drake <[EMAIL PROTECTED]> writes:
> As seen, I needed to add an include dir for configure to pass.  However,
> make check fails now with the backend crashing.  This can be seen in the
> buildfarm results for mongoose.

Can you provide a stack trace for that crash?

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Operator class group proposal

2006-12-22 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> I thought that would just be formalizing what we currently have. But I just
> discovered to my surprise tat it's not. I don't see any cross-data-type
> operators between any of the integer types and numeric, or between any of the
> floating point types and numeric, or between any of the integers and the
> floating point types.

Correct.

> So does that mean we currently have three separate arithmetic "operator class
> groups" such as they currently exist and you can't currently do merge joins
> between some combinations of these arithmetic types?

No, what you'll get is something like

int4var::float8 float8eq float8var

which is perfectly mergejoinable ... however, it's not clear that the
planner will make very good estimates about the value of the cast
expression.  I'm not sure if it's worth introducing a pile more
crosstype operators to change that situation --- improving
the selectivity functions to handle casts better might be a wiser
approach.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Companies Contributing to Open Source

2006-12-22 Thread Bruce Momjian
Guido Barosio wrote:
> 
> "Companies often bring fresh prespective, ideas, and testing
> infrastucture to a project."
> 
> 
>  "prespective" || "perspective" ?

Thanks, fixed.

---


> 
> g.-
> 
> 
> On 12/21/06, Kevin Grittner <[EMAIL PROTECTED]> wrote:
> > >>> On Tue, Dec 19, 2006 at  6:13 PM, in message
> > <[EMAIL PROTECTED]>, Bruce Momjian
> > <[EMAIL PROTECTED]> wrote:
> > > if the company dies, the community keeps going (as it did after
> > Great
> > > Bridge, without a hiccup), but if the community dies, the company
> > dies
> > > too.
> >
> > This statement seems to ignore organizations for which PostgreSQL is an
> > implementation detail in their current environment.  While we appreciate
> > PostgreSQL and are likely to try to make an occasional contribution,
> > where it seems to be mutually beneficial, the Wisconsin State Courts
> > would survive the collapse of the PostgreSQL community.
> >
> > While I can only guess at the reasons you may have put the slant you
> > did on the document, I think it really should reflect the patient
> > assistance the community provides to those who read the developers FAQ
> > and make a good faith effort to comply with what is outlined there.  The
> > cooperative, professional, and helpful demeanor of the members of this
> > community is something which should be balanced against the community's
> > need to act as a gatekeeper on submissions.
> >
> > I have recent experience as a first time employee contributor.  When we
> > hit a bump in our initial use of PostgreSQL because of the non-standard
> > character string literals, you were gracious in accepting our quick
> > patch as being possibly of some value in the implementation of the
> > related TODO item.  You were then helpful in our effort to do a proper
> > implementation of the TODO item which fixes it.  I see that the patch I
> > submitted was improved by someone before it made the release, which is
> > great.
> >
> > This illustrates how the process can work.  I informed management of
> > the problem, and presented the options -- we could do our own little
> > hack that we then had to maintain and apply as the versions moved along,
> > or we could try to do fix which the community would accept and have that
> > feature "just work" for us for all subsequent releases.  The latter was
> > a little more time up front, but resulted in a better quality product
> > for us, and less work in the long term.  It was also presumably of some
> > benefit to the community, which has indirect benefit to our
> > organization.  Nobody here wants to switch database products again soon,
> > so if we can solve our problem in a way that helps the product gain
> > momentum, all the better.
> >
> > I ran a consulting business for decades, and I know that there is a
> > great variation in the attitudes among managers.  Many are quite
> > reasonable.  I'm reminded of a meeting early in my career with a
> > businessman who owned and operated half a dozen successful businesses in
> > a variety of areas.  He proposed a deal that I was on the verge of
> > accepting, albeit somewhat reluctantly.  He stopped me and told me that
> > he hoped to continue to do business with me, so any deal we made had to
> > benefit and work for both of us or it was no good at all; if I was
> > uncomfortable with something in the proposal, we should talk it out.
> > That's the core of what we're trying to say in this document, isn't it?
> > The rest is an executive overview of the developer FAQ?  I can't help
> > feeling that even with the revisions so far it could have a more
> > positive "spin".
> >
> > -Kevin
> >
> >
> >
> > ---(end of broadcast)---
> > TIP 5: don't forget to increase your free space map settings
> >
> 
> 
> -- 
> Guido Barosio
> ---
> http://www.globant.com
> [EMAIL PROTECTED]
> 
> ---(end of broadcast)---
> TIP 5: don't forget to increase your free space map settings

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] recent --with-libxml support

2006-12-22 Thread Jeremy Drake
I adjusted my buildfarm config (mongoose) to attempt to build HEAD
--with-libxml.  I added the following to build-farm.conf:

if ($branch eq 'HEAD' || $branch ge 'REL8_3')
{
 push(@{$conf{config_opts}},
"--with-includes=/usr/include/et:/usr/include/libxml2");
 push(@{$conf{config_opts}}, "--with-libxml");
}

As seen, I needed to add an include dir for configure to pass.  However,
make check fails now with the backend crashing.  This can be seen in the
buildfarm results for mongoose.

According to gentoo portage, I have libxml2 version 2.6.26 installed on my
system.

I am not clear if I should have pointed it at libxml version 1 or 2, but
configure seemed to be happy with libxml2.  If it needs version 1, perhaps
configure should do something to keep it from using version 2.

Here is the diff for the xml regression test:

*** ./expected/xml.out  Thu Dec 21 16:47:22 2006
--- ./results/xml.out   Thu Dec 21 16:59:32 2006
***
*** 58,68 
  SELECT xmlelement(name element,
xmlattributes (1 as one, 'deuce' as two),
'content');
!xmlelement
! 
!  content
! (1 row)
!
  SELECT xmlelement(name element,
xmlattributes ('unnamed and wrong'));
  ERROR:  unnamed attribute value must be a column reference
--- 58,64 
  SELECT xmlelement(name element,
xmlattributes (1 as one, 'deuce' as two),
'content');
! ERROR:  cache lookup failed for type 0
  SELECT xmlelement(name element,
xmlattributes ('unnamed and wrong'));
  ERROR:  unnamed attribute value must be a column reference
***
*** 73,145 
  (1 row)

  SELECT xmlelement(name employee, xmlforest(name, age, salary as pay)) FROM 
emp;
!   xmlelement
! --
!  sharon251000
!  sam302000
!  bill201000
!  jeff23600
!  cim30400
!  linda19100
! (6 rows)
!
! SELECT xmlelement(name wrong, 37);
! ERROR:  argument of XMLELEMENT must be type xml, not type integer
! SELECT xmlpi(name foo);
!   xmlpi
! -
!  
! (1 row)
!
! SELECT xmlpi(name xmlstuff);
! ERROR:  invalid XML processing instruction
! DETAIL:  XML processing instruction target name cannot start with "xml".
! SELECT xmlpi(name foo, 'bar');
! xmlpi
! -
!  
! (1 row)
!
! SELECT xmlpi(name foo, 'in?>valid');
! ERROR:  invalid XML processing instruction
! DETAIL:  XML processing instruction cannot contain "?>".
! SELECT xmlroot (
!   xmlelement (
! name gazonk,
! xmlattributes (
!   'val' AS name,
!   1 + 1 AS num
! ),
! xmlelement (
!   NAME qux,
!   'foo'
! )
!   ),
!   version '1.0',
!   standalone yes
! );
!  xmlroot
! 
--
!  foo
! (1 row)
!
! SELECT xmlserialize(content data as character varying) FROM xmltest;
! data
! 
!  one
!  two
! (2 rows)
!
! -- Check mapping SQL identifier to XML name
! SELECT xmlpi(name ":::_xml_abc135.%-&_");
!   xmlpi
! -
!  
! (1 row)
!
! SELECT xmlpi(name "123");
!  xmlpi
! ---
!  
! (1 row)
!
--- 69,75 
  (1 row)

  SELECT xmlelement(name employee, xmlforest(name, age, salary as pay)) FROM 
emp;
! server closed the connection unexpectedly
!   This probably means the server terminated abnormally
!   before or while processing the request.
! connection to server was lost




-- 
The very powerful and the very stupid have one thing in common.
Instead of altering their views to fit the facts, they alter the facts
to fit their views ... which can be very uncomfortable if you happen to
be one of the facts that needs altering.
-- Doctor Who, "Face of Evil"

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Operator class group proposal

2006-12-22 Thread Gregory Stark

"Tom Lane" <[EMAIL PROTECTED]> writes:

> A class group is associated with a specific index AM and can contain only
> opclasses for that AM.  We might for instance invent "numeric" and
> "numeric_reverse" groups for btree, to contain the default opclasses and
> reverse-sort opclasses for the standard arithmetic types.

I thought that would just be formalizing what we currently have. But I just
discovered to my surprise that it's not. I don't see any cross-data-type
operators between any of the integer types and numeric, or between any of the
floating point types and numeric, or between any of the integers and the
floating point types.

So does that mean we currently have three separate arithmetic "operator class
groups" such as they currently exist and you can't currently do merge joins
between some combinations of these arithmetic types?

What puzzles me is that we used to have problems with bigint columns where
people just did "WHERE bigint_col = 1". But my testing shows similar constructs
between integer and numeric or other types with no cross-data-type comparator
don't lead to similar problems. The system happily introduces casts now and
uses the btree operator. So I must have missed another change that was also
relevant here, in addition to the cross-data-type operators.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] configure problem --with-libxml

2006-12-22 Thread Tom Lane
Martijn van Oosterhout  writes:
> On Fri, Dec 22, 2006 at 03:03:49PM +0100, Peter Eisentraut wrote:
>> The reason why I did not do this was that this could resolve
>> to -I/usr/include or -I/usr/local/include, but adding such a standard
>> path explicitly is wrong on some systems.

> But if people on such a system want to use libxml2, and they install it
> in /usr/include then they're screwed anyway. There's no way to tell the
> compiler to use only some files in a directory.

That's not the point, the point is that an *explicit* -I can be wrong
(because it can change the search order of the default directories).

Perhaps it'd be worth trying to add xml2-config's output only if the
first probe for the headers fails?

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Load distributed checkpoint

2006-12-22 Thread Simon Riggs
On Thu, 2006-12-21 at 18:46 +0900, ITAGAKI Takahiro wrote:
> "Takayuki Tsunakawa" <[EMAIL PROTECTED]> wrote:
> 
> > > If you use Linux, it has very unpleasant behavior in fsync(): it locks all
> > > metadata of the file being fsync-ed. We have to wait for the completion of
> > > fsync when we do read(), write(), and even lseek().
> > 
> > Oh, really, what an evil fsync is!  Yes, I sometimes saw a backend
> > waiting for lseek() to complete when it committed.  But why does the
> > backend which is syncing WAL/pg_control have to wait for syncing the
> > data file?  They are, not to mention, different files, and WAL and
> > data files are stored on separate disks.
> 
> Backends call lseek() in planning, so they have to wait for fsync() on
> the table that they will access. Even if all of the data in the file is in
> the cache, lseek() conflicts with fsync(). You can see a lot of backends
> waiting in the planning phase during checkpoints, not the executing phase.

It isn't clear to me why you are doing planning during a test at all.

If you are doing replanning during test execution then the real
performance problem will be the planning, not the fact that the fsync
stops planning from happening.

Prepared queries are only replanned manually, so the chances of
replanning during a checkpoint are fairly low. So although it sounds
worrying, I'm not sure that we'll want to alter the use of lseek during
planning - though there may be other arguments also.

I have also seen cases where the WAL drive, even when separated, appears
to spike upwards during a checkpoint. My best current theory, so far
untested, is that the WAL and data drives are using the same CFQ
scheduler and that the scheduler actively slows down WAL requests when
it need not. Mounting the drives as separate block devices with separate
schedulers, CFQ for data and deadline for WAL, should help.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Companies Contributing to Open Source

2006-12-22 Thread Simon Riggs
On Wed, 2006-12-20 at 00:29 -0800, David Fetter wrote:
> On Tue, Dec 19, 2006 at 05:40:12PM -0500, Bruce Momjian wrote:
> > Lukas Kahwe Smith wrote:
> > > I think another point you need to bring out more clearly is that
> > > the community is also often "miffed" if they feel they have been
> > > left out of the design and testing phases. This is sometimes just
> > > a reflex that is not always based on technical reasoning. It's just
> > > that, as you correctly point out, they are worried about being
> > > "hijacked" by companies.
> > 
> > I hate to mention an emotional community reaction in this document.
> 
> You don't have to name it that if you don't want to, although respect
> (or at least a good simulation of it) is crucial when dealing with any
> person or group. 

I'm very interested in this, because it does seem to me that there is an
emotional reaction to many things.

>  Handing the community a /fait accompli/ is a great
> way to convey disrespect, no matter how well-meaning the process
> originally was.

In a humble, non-confrontational tone: Why/How does a patch imply a fait
accompli, or show any disrespect?

My own reaction to Teodor's recent submission, or Kai-Uwe Sattler's
recent contributions has been: great news, patches need some work, but
thanks.

Please explain on, or off, list to help me understand.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] configure problem --with-libxml

2006-12-22 Thread Martijn van Oosterhout
On Fri, Dec 22, 2006 at 03:03:49PM +0100, Peter Eisentraut wrote:
> Nikolay Samokhvalov wrote:
> > another way is:
> > export CPPFLAGS=$(xml2-config --cflags); ./configure --with-libxml
> >
> > I think such a thing could be used in the configure script itself;
> > otherwise a lot of people will try, fail, and not use SQL/XML at
> > all.
> 
> The reason why I did not do this was that this could resolve 
> to -I/usr/include or -I/usr/local/include, but adding such a standard 
> path explicitly is wrong on some systems.

But if people on such a system want to use libxml2, and they install it
in /usr/include then they're screwed anyway. There's no way to tell the
compiler to use only some files in a directory.

Put another way, if adding the include path for libxml2 breaks their
build environment, they can't use libxml2. Having configure play dumb
isn't helping anyone. It won't work on any more or fewer systems.

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
From each according to his ability. To each according to his ability to
litigate.


signature.asc
Description: Digital signature


Re: [HACKERS] xmlagg is not supported?

2006-12-22 Thread Peter Eisentraut
Pavel Stehule wrote:
> why xmlagg is missing in SQL/XML support?

Because the version contained in the patch did not work properly.  It 
should be added back, of course.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] xmlagg is not supported?

2006-12-22 Thread Peter Eisentraut
Nikolay Samokhvalov wrote:
> Another thing that was removed is XMLCOMMENT..

XMLCOMMENT works.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] configure problem --with-libxml

2006-12-22 Thread Peter Eisentraut
Nikolay Samokhvalov wrote:
> another way is:
> export CPPFLAGS=$(xml2-config --cflags); ./configure --with-libxml
>
> I think such a thing could be used in the configure script itself;
> otherwise a lot of people will try, fail, and not use SQL/XML at
> all.

The reason why I did not do this was that this could resolve 
to -I/usr/include or -I/usr/local/include, but adding such a standard 
path explicitly is wrong on some systems.

Clearly, we need to improve this, but I don't know how yet.  Ideas 
welcome.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Load distributed checkpoint

2006-12-22 Thread Zeugswetter Andreas ADI SD

> > If you use linux, try the following settings:
> >  1. Decrease /proc/sys/vm/dirty_ratio and dirty_background_ratio.

You will need to pair this with bgwriter_* settings, otherwise too few
pages are written to the OS in between checkpoints.

> >  2. Increase wal_buffers to reduce WAL flushing.

You will want the possibility of single group writes to be able to reach
256 kB, so the default is not enough when you have plenty of RAM.
You also want enough so that new txns don't need to wait for an empty
buffer (one that is only freed by a write).

> >  3. Set wal_sync_method to open_sync; O_SYNC is faster than fsync().

O_SYNC's only advantage over fdatasync is that it saves a system call,
since it still passes through the OS cache; the disadvantage is that it
does not let the OS group writes. Thus it is more susceptible to too few
wal_buffers. What you want is O_DIRECT plus enough wal_buffers to allow
256 kB writes.

> >  4. Separate data and WAL files into different partitions or disks.

While this is generally suggested, I somehow doubt its validity when you
only have a few disk spindles. If, e.g., you only have 2-3 (mirrored)
disks, I wouldn't do it (at least on the average 70/30 read/write system).

Andreas

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Load distributed checkpoint

2006-12-22 Thread Takayuki Tsunakawa
> (3) is very strange. Your machine seems to be too restricted
> by WAL so that other factors cannot be measured properly.

Right... It takes as long as 15 seconds to fsync a 1 GB file.  It's
strange.  This is a borrowed PC server, so the disk may be RAID 5?
However, the WAL disk and DB disks show the same throughput.  I'll
investigate.  I may have to find another machine.

- Pentium4 3.6GHz with HT / 3GB RAM / Windows XP :-)

Oh, Windows.  Maybe the fsync() problem Itagaki-san pointed out does
not exist.
BTW, your env is showing an attractive result, isn't it?

- Original Message - 
From: "ITAGAKI Takahiro" <[EMAIL PROTECTED]>
To: "Takayuki Tsunakawa" <[EMAIL PROTECTED]>
Cc: 
Sent: Friday, December 22, 2006 6:09 PM
Subject: Re: [HACKERS] Load distributed checkpoint


"Takayuki Tsunakawa" <[EMAIL PROTECTED]> wrote:

> (1) Default case(this is show again for comparison and reminder)
> 235  80  226  77  240
> (2) Default + WAL 1MB case
> 302  328  82  330  85
> (3) Default + wal_sync_method=open_sync case
> 162  67  176  67  164
> (4) (2)+(3) case
> 322  350  85  321  84
> (5) (4) + /proc/sys/vm/dirty* tuning
> 308  349  84  349  84

(3) is very strange. Your machine seems to be too restricted
by WAL so that other factors cannot be measured properly.


I'll send results on my machine.

- Pentium4 3.6GHz with HT / 3GB RAM / Windows XP :-)
- shared_buffers=1GB
- wal_sync_method = open_datasync
- wal_buffers = 1MB
- checkpoint_segments = 16
- checkpoint_timeout = 5min

I repeated "pgbench -c16 -t500 -s50"
and picked up results around checkpoints.

[HEAD]
...
560.8
373.5 <- checkpoint is here
570.8
...

[with patch]
...
562.0
528.4 <- checkpoint (fsync) is here
547.0
...

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center





---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] New version of money type

2006-12-22 Thread D'Arcy J.M. Cain
On Thu, 21 Dec 2006 10:47:52 -0500
Tom Lane <[EMAIL PROTECTED]> wrote:
> One bug I see in it is that you'd better make the alignment 'd' if the
> type is to be int8.  Also I much dislike these changes:
> 
> - int32   i = PG_GETARG_INT32(1);
> + int64   i = PG_GETARG_INT32(1);

As I have made the few corrections that you pointed out, should I go
ahead and commit so that it can be tested in a wider group?  Also,
there are further ideas out there to improve the type that would be
easier to handle with this out of the way.

-- 
D'Arcy J.M. Cain  |  Democracy is three wolves
http://www.druid.net/darcy/|  and a sheep voting on
+1 416 425 1212 (DoD#0082)(eNTP)   |  what's for dinner.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] xmlagg is not supported?

2006-12-22 Thread Nikolay Samokhvalov

Another thing that was removed is XMLCOMMENT..

On 12/22/06, Nikolay Samokhvalov <[EMAIL PROTECTED]> wrote:

Hmm... In my patch (http://chernowiki.ru/index.php?node=98) I didn't
remove this, moreover I've fixed a couple of issues...

Looks like it was removed by Peter (both patches he mailed lack it).

Actually, without this function the set of SQL/XML publishing functions
becomes rather poor.

Peter?

On 12/22/06, Pavel Stehule <[EMAIL PROTECTED]> wrote:
> Hello,
>
> why xmlagg is missing in SQL/XML support?
>
> Regards
> Pavel Stehule
>
>
>
> ---(end of broadcast)---
> TIP 7: You can help support the PostgreSQL project by donating at
>
> http://www.postgresql.org/about/donate
>


--
Best regards,
Nikolay




--
Best regards,
Nikolay

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] xmlagg is not supported?

2006-12-22 Thread Nikolay Samokhvalov

Hmm... In my patch (http://chernowiki.ru/index.php?node=98) I didn't
remove this, moreover I've fixed a couple of issues...

Looks like it was removed by Peter (both patches he mailed lack it).

Actually, without this function the set of SQL/XML publishing functions
becomes rather poor.

Peter?

On 12/22/06, Pavel Stehule <[EMAIL PROTECTED]> wrote:

Hello,

why xmlagg is missing in SQL/XML support?

Regards
Pavel Stehule



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate




--
Best regards,
Nikolay

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] configure problem --with-libxml

2006-12-22 Thread Nikolay Samokhvalov

another way is:
export CPPFLAGS=$(xml2-config --cflags); ./configure --with-libxml

I think such a check could go into the configure script itself;
otherwise a lot of people will try, fail, and not use SQL/XML at all.

On 12/22/06, Pavel Stehule <[EMAIL PROTECTED]> wrote:

Hello,

I try to compile postgres with SQL/XML, but I finished on

checking libxml/parser.h usability... no
checking libxml/parser.h presence... no
checking for libxml/parser.h... no
configure: error: header file <libxml/parser.h> is required for XML support

I have Fedora Core 6, and libxml2-devel I have installed. I checked parser.h
and this file is in /usr/include/libxml2/libxml/ directory

I am sorry, but the configure file is all Greek to me, and I can't
correct it.

Regards
Pavel Stehule



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org




--
Best regards,
Nikolay

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Load distributed checkpoint

2006-12-22 Thread Takayuki Tsunakawa
From: "Greg Smith" <[EMAIL PROTECTED]>
> This is actually a question I'd been meaning to throw out myself to
> this list.  How hard would it be to add an internal counter to the
> buffer management scheme that kept track of the current number of
> dirty pages?  I've been looking at the bufmgr code lately trying to
> figure out how to insert one as part of building an auto-tuning
> bgwriter, but it's unclear to me how I'd lock such a resource
> properly and scalably.  I have a feeling I'd be inserting a
> single-process locking bottleneck into that code with any of the
> naive implementations I considered.

To put it in an extreme way, how about making the bgwriter count the
dirty buffers by periodically scanning all the buffers?  Do you know
the book "Principles of Transaction Processing"?  Jim Gray was one of
the reviewers of this book.

http://www.amazon.com/gp/aa.html?HMAC=&CartId=&Operation=ItemLookup&&ItemId=1558604154&ResponseGroup=Request,Large,Variations&bStyle=aaz.jpg&MerchantId=All&isdetail=true&bsi=Books&logo=foo&Marketplace=us&AssociateTag=pocketpc

In chapter 8, the author describes fuzzy checkpoints combined with the
two-checkpoint approach.  In his explanation, the recovery manager
(which would be the bgwriter in PostgreSQL) scans the buffers and
records the list of dirty buffers at each checkpoint.  This won't need
any locking in PostgreSQL if I understand correctly.  Then, the
recovery manager performs the next checkpoint after writing those
dirty buffers.  In the two-checkpoint approach, crash recovery starts
redoing from the second-to-last checkpoint.  Two-checkpoint is
described in Jim Gray's book, too, but neither book refers to how the
recovery manager tunes the speed of writing.
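
The scan-and-count idea can be sketched as follows (a minimal sketch
over a mock buffer array; `FakeBufferDesc` and `count_dirty_buffers`
are illustrative names, not PostgreSQL's real BufferDesc interface):

```c
/* Simplified stand-in for a PostgreSQL buffer header; the real
 * BufferDesc keeps its dirty bit in a flags word guarded by a spinlock.
 * This mock models only what the periodic scan needs. */
typedef struct
{
    int buf_id;
    int is_dirty;   /* nonzero if the page has unwritten changes */
} FakeBufferDesc;

/* Periodic scan, as the bgwriter might run it: walk the whole pool and
 * count dirty pages without taking any global lock.  The result is only
 * approximate -- pages can be dirtied or cleaned mid-scan -- but a fuzzy
 * estimate is all a checkpoint-pacing heuristic needs. */
int
count_dirty_buffers(const FakeBufferDesc *buffers, int nbuffers)
{
    int ndirty = 0;

    for (int i = 0; i < nbuffers; i++)
        if (buffers[i].is_dirty)
            ndirty++;
    return ndirty;
}
```

Since the scan takes no locks, it can run at whatever interval the
bgwriter likes without creating the single-process bottleneck Greg was
worried about.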


> slightly different from the proposals here.  What if all the database
> page writes (background writer, buffer eviction, or checkpoint scan)
> were counted and periodic fsync requests sent to the bgwriter based
> on that?  For example, when I know I have a battery-backed caching
> controller that will buffer 64MB worth of data for me, if I forced a
> fsync after every 6000 8K writes, no single fsync would get stuck
> waiting for the disk to write for longer than I'd like.

That seems interesting.
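
The write-counting idea might look roughly like this (a sketch under
the stated 64MB-cache assumption; `WriteCounter` and `note_page_write`
are hypothetical names, not existing bgwriter code):

```c
/* Hypothetical pacing knob: with a 64MB battery-backed controller cache
 * and 8K pages, an fsync roughly every 6000 writes (the figure from the
 * quoted message, not a measured value) keeps less than a cache-full of
 * data pending. */
#define WRITES_PER_FSYNC 6000

typedef struct
{
    long page_writes;    /* page writes since the last fsync request */
    long fsyncs_issued;  /* fsync requests sent so far */
} WriteCounter;

/* Account for one database page write -- background writer, backend
 * eviction, or checkpoint scan alike.  Returns 1 when enough writes
 * have accumulated that an fsync request should be sent. */
int
note_page_write(WriteCounter *wc)
{
    if (++wc->page_writes >= WRITES_PER_FSYNC)
    {
        wc->page_writes = 0;
        wc->fsyncs_issued++;
        return 1;
    }
    return 0;
}
```

The per-write cost is a single increment, so per-backend counters
merged periodically by the bgwriter would avoid contention if it
mattered.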

> You can do sync writes with perfectly good performance on systems
> with a good battery-backed cache, but I think you'll get creamed in
> comparisons against MySQL on IDE disks if you start walking down
> that path; since right now a fair comparison with similar logging
> behavior is an even match there, that's a step backwards.

I wonder what characteristics SATA disks have compared to IDE.  Recent
PCs are equipped with SATA disks, aren't they?
How do you think your approach would compare to MySQL on IDE disks?

> Also on the topic of sync writes to the database proper:  wouldn't
> using O_DIRECT for those be potentially counter-productive?  I was
> under the impression that one of the behaviors counted on by
> Postgres was that data evicted from its buffer cache, eventually
> intended for writing to disk, was still kept around for a bit in the
> OS buffer cache.  A subsequent read because the data was needed
> again might find the data already in the OS buffer, therefore
> avoiding an actual disk read; that substantially reduces the typical
> penalty for the database engine making a bad choice on what to
> evict.  I fear a move to direct writes would put more pressure on
> the LRU implementation to be very smart, and that's code that you
> really don't want to be more complicated.

I'm worried about this, too.




---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2

2006-12-22 Thread Zeugswetter Andreas ADI SD

> >> You could make a case that we need *three* numbers: a permanent
> >> column ID, a display position, and a storage position.
> 
> > Could this not be handled by some catalog fixup after an add/drop?
> > If we get to having 3 numbers you will almost have me convinced
> > that this might be too complicated after all.
> 
> Actually, the more I think about it the more I think that 3 numbers
> might be the answer.  99% of the code would use only the permanent ID.

I am still of the opinion that the system tables as such are too
visible to users and addon developers to change the meaning of attnum.

And I don't quite see what the point is.  To alter a table's column
you need an exclusive lock, and plan invalidation (or are you
intending to invalidate only plans that reference * ?).  Once there
you can just as well fix the numbering.
Yes, it is more work :-(
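
A rough sketch of what the three numbers could look like per column
(struct and field names are hypothetical, not actual pg_attribute
columns):

```c
#include <stdint.h>

/* Sketch of the "three numbers" idea: a permanent ID that most code
 * uses, plus separate display and storage positions. */
typedef struct
{
    int16_t att_id;       /* permanent logical ID; "99% of the code" uses this */
    int16_t att_display;  /* ordering for SELECT * and column display */
    int16_t att_storage;  /* physical position within the heap tuple */
} ColumnNumbers;

/* Look up which column (by permanent ID) occupies a given display slot;
 * returns -1 if no column is shown at that position. */
int16_t
id_at_display_pos(const ColumnNumbers *cols, int ncols, int16_t pos)
{
    for (int i = 0; i < ncols; i++)
        if (cols[i].att_display == pos)
            return cols[i].att_id;
    return -1;
}
```

Only the display and storage fields would need fixing up after an
ALTER TABLE; code keyed on the permanent ID would be unaffected.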

Andreas

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] configure problem --with-libxml

2006-12-22 Thread Pavel Stehule

I solved it via a symlink, but this is much cleaner.

Maybe the configure script needs a little bit more intelligence.  All
people on RH systems have to do this :-(


Thank you

Pavel Stehule



From: Stefan Kaltenbrunner <[EMAIL PROTECTED]>
To: Pavel Stehule <[EMAIL PROTECTED]>
CC: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] configure problem --with-libxml
Date: Fri, 22 Dec 2006 09:49:00 +0100

Pavel Stehule wrote:

Hello,

I try to compile postgres with SQL/XML, but I finished on

checking libxml/parser.h usability... no
checking libxml/parser.h presence... no
checking for libxml/parser.h... no
configure: error: header file <libxml/parser.h> is required for XML support


I have Fedora Core 6, and libxml2-devel I have installed. I checked 
parser.h and this file is in /usr/include/libxml2/libxml/ directory


I am sorry, but the configure file is all Greek to me, and I can't
correct it.


try adding --with-includes=/usr/include/libxml2 to your configure line


Stefan





---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] Load distributed checkpoint

2006-12-22 Thread ITAGAKI Takahiro
"Takayuki Tsunakawa" <[EMAIL PROTECTED]> wrote:

> (1) Default case(this is show again for comparison and reminder)
> 235  80  226  77  240
> (2) Default + WAL 1MB case
> 302  328  82  330  85
> (3) Default + wal_sync_method=open_sync case
> 162  67  176  67  164
> (4) (2)+(3) case
> 322  350  85  321  84
> (5) (4) + /proc/sys/vm/dirty* tuning
> 308  349  84  349  84

(3) is very strange.  Your machine seems to be so constrained by WAL
that other factors cannot be measured properly.


I'll send results on my machine.

- Pentium4 3.6GHz with HT / 3GB RAM / Windows XP :-)
- shared_buffers=1GB
- wal_sync_method = open_datasync
- wal_buffers = 1MB
- checkpoint_segments = 16
- checkpoint_timeout = 5min

I repeated "pgbench -c16 -t500 -s50"
and picked up results around checkpoints.

[HEAD]
...
560.8
373.5 <- checkpoint is here
570.8
...

[with patch]
...
562.0
528.4 <- checkpoint (fsync) is here
547.0
...

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] xmlagg is not supported?

2006-12-22 Thread Pavel Stehule

Hello,

why xmlagg is missing in SQL/XML support?

Regards
Pavel Stehule




---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] configure problem --with-libxml

2006-12-22 Thread Stefan Kaltenbrunner

Pavel Stehule wrote:

Hello,

I try to compile postgres with SQL/XML, but I finished on

checking libxml/parser.h usability... no
checking libxml/parser.h presence... no
checking for libxml/parser.h... no
configure: error: header file <libxml/parser.h> is required for XML support

I have Fedora Core 6, and libxml2-devel I have installed. I checked 
parser.h and this file is in /usr/include/libxml2/libxml/ directory


I am sorry, but the configure file is all Greek to me, and I can't
correct it.


try adding --with-includes=/usr/include/libxml2 to your configure line


Stefan

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] configure problem --with-libxml

2006-12-22 Thread Pavel Stehule

Hello,

I try to compile postgres with SQL/XML, but I finished on

checking libxml/parser.h usability... no
checking libxml/parser.h presence... no
checking for libxml/parser.h... no
configure: error: header file <libxml/parser.h> is required for XML support

I have Fedora Core 6, and libxml2-devel I have installed. I checked parser.h 
and this file is in /usr/include/libxml2/libxml/ directory


I am sorry, but the configure file is all Greek to me, and I can't
correct it.


Regards
Pavel Stehule



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Load distributed checkpoint

2006-12-22 Thread Takayuki Tsunakawa
From: Inaam Rana
> Which IO Shceduler (elevator) you are using?

Elevator?  Sorry, I'm not familiar with the kernel implementation, so
I don't know what it is.  My Linux distribution is Red Hat Enterprise
Linux 4.0 for AMD64/EM64T, and the kernel is 2.6.9-42.ELsmp.  I
probably haven't changed any kernel settings, except for the IPC
settings needed to run PostgreSQL.