from:"Jonah H. Harris"

Re: [HACKERS] ordered aggregates using WITHIN GROUP (was Re: can somebody execute this query on Oracle 11.2g and send result?)

2010-01-29 Thread Jonah H. Harris

On Fri, Jan 29, 2010 at 11:57 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 I find it doubtful that it's actually necessary in Oracle's version
 of listagg ...


Eh?

http://download.oracle.com/docs/cd/E11882_01/server.112/e10592/functions087.htm

Defines:

*LISTAGG* (measure_expr [, 'delimiter_expr'])
  *WITHIN GROUP* (order_by_clause) [*OVER* query_partition_clause]


-- 
Jonah H. Harris

Re: [HACKERS] ordered aggregates using WITHIN GROUP (was Re: can somebody execute this query on Oracle 11.2g and send result?)

2010-01-29 Thread Jonah H. Harris

On Fri, Jan 29, 2010 at 12:09 PM, Jonah H. Harris jonah.har...@gmail.comwrote:

 On Fri, Jan 29, 2010 at 11:57 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 I find it doubtful that it's actually necessary in Oracle's version
 of listagg ...


 Eh?


 http://download.oracle.com/docs/cd/E11882_01/server.112/e10592/functions087.htm

 Defines:

 *LISTAGG* (measure_expr [, 'delimiter_expr'])
   *WITHIN GROUP* (order_by_clause) [*OVER* query_partition_clause]



SQL Server's listagg is similar to the PG implementation.  It seems Oracle
thinks people would prefer to order the list and for that reason, made their
listagg a rank function type.  Having done quite a bit of work generating
delimited lists/arrays based on ordering in PG, I generally agree that it's
what I would generally want.

-- 
Jonah H. Harris

Re: [HACKERS] can somebody execute this query on Oracle 11.2g and send result?

2010-01-28 Thread Jonah H. Harris

On Thu, Jan 28, 2010 at 9:10 AM, Pavel Stehule pavel.steh...@gmail.comwrote:

 Hello,

 I can't to install Oracle, and need to know result.

 CREATE TABLE foo(a varchar(10), b varchar(10));

 INSERT INTO foo VALUES('aaa',',');
 INSERT INTO foo VALUES('bbb',';');
 INSERT INTO foo VALUES('ccc','+');

 SELECT listagg(a,b) FROM foo;


That's not how listagg works.

The syntax is listagg(expression [, delimiter]) WITHIN GROUP (order by
clause) [OVER partition clause]
If a delimiter is defined, it must be a constant.

Query: SELECT listagg(a, ',') WITHIN GROUP (ORDER BY a) FROM foo;
Result: aaa,bbb,ccc

Query: SELECT listagg(a, ';') WITHIN GROUP (ORDER BY a) FROM foo;
Result: aaa;bbb;ccc

Query: SELECT listagg(a, '+') WITHIN GROUP (ORDER BY a) FROM foo;
Result: aaa+bbb+ccc

-- 
Jonah H. Harris

Re: [HACKERS] Block-level CRC checks

2009-12-03 Thread Jonah H. Harris

On Tue, Dec 1, 2009 at 1:27 PM, Joshua D. Drake j...@commandprompt.comwrote:

 On Tue, 2009-12-01 at 13:20 -0500, Robert Haas wrote:
  Does $COMPETITOR offer this feature?
 

 My understanding is that MSSQL does. I am not sure about Oracle. Those
 are the only two I run into (I don't run into MySQL at all). I know
 others likely compete in the DB2 space.


To my knowledge, MySQL, InnoDB, BerkeleyDB, solidDB, Oracle, SQL Server,
Sybase, DB2, eXtremeDB, RDB, and Teradata all checksum pages.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] table corrupted

2009-10-26 Thread Jonah H. Harris

On Thu, Oct 22, 2009 at 7:16 PM, Joshua D. Drake j...@commandprompt.comwrote:

 On Thu, 2009-10-22 at 14:28 -0200, João Eugenio Marynowski wrote:
  Hi
 
 Repair? Not likely. Get past? Maybe.


I don't know how valuable your data is, but I've performed data recovery on
tens of PG databases suffering from both hardware and software corruption on
versions 7.0 through 8.3.  My rate is $300-600 USD/hour depending on the
database/table size and the extent of the corruption.

If you're just trying to save what's not corrupted, there's quite a few
examples online.

-- 
Jonah H. Harris

Re: [HACKERS] table corrupted

2009-10-26 Thread Jonah H. Harris

On Mon, Oct 26, 2009 at 12:55 PM, Joshua D. Drake j...@commandprompt.comwrote:

 This reply is wholly inappropriate for a Pg list. We are here to help
 people. If you have a consultancy, please feel free to list that but any
 discussion of rates is just plain rude. Please use better discretion in
 the future.


Perhaps.  Though, I only posted because you made it sound somewhat
impossible and because I only know of a few ppl in the PG community that
offer it and/or have done is successfully.  Maybe letting people know there
are options, other than being screwed, is wrong...  my bad :-)

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] 8.5 TODO: any info on Create dump tool for write-ahead logs... in PITR section (1.4)?

2009-10-05 Thread Jonah H. Harris

On Sun, Oct 4, 2009 at 10:28 PM, Fujii Masao masao.fu...@gmail.com wrote:

 I think that xlogdump (http://xlogviewer.projects.postgresql.org/) is
 the first step to address that TODO item. Though I'm not sure if the
 xlogdump project is still active.


I believe it has been dead for quite awhile now.  Though, Tom may still
maintain his own xlogdump.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] PGCluster-II Progress

2009-09-15 Thread Jonah H. Harris

On Tue, Sep 15, 2009 at 7:48 AM, Marcos Luis Ortiz Valmaseda mlor...@uci.cu
 wrote:

 Yeah, the problem here is that CyberCluster is based yet on PostgreSQL 8.1
 and is a very old version to use it.

 I found the developer of PgCluster-II: Atsushi MITANI - mit...@sraw.co.jp


Yeah, AFAICS, PGCluster II is and has been dead for years.

I did an architectural review of PGCluster II while at EnterpriseDB, and
it's never going to work from a performance perspective.  Unfortunately, the
architecture (as it stands) requires coordination of pretty much all locks
and semaphores, and will only run on a single system because it requires a
nearly-identical shared memory segment for each instance.  Unlike Oracle
RAC, which was designed to be run on separate nodes and share/coordinate
only relevant resources, PGCluster II coordinates almost everything
(transaction ids, buffer locks, etc.).

As an example, Oracle doesn't need to communicate with every node in the
cluster to generate a new transaction id (SCN in Oracle parlance) because it
employs a system based on Lamport timestamps, but PGCluster II has to
coordinate a global transaction id across all nodes.

Also, PGCluster II has no concept of shared data/local WAL.  Whereas Oracle
supports threads of REDO/UNDO generated by the local node, all WAL in
PGCluster II has to be coordinated between nodes, which creates a bottleneck
on *all* concurrent workloads.

When I first saw the demonstration of PGCluster II, I was in awe.
Unfortunately, when I reviewed the architecture, I saw that it was too good
to be true.  Perhaps it has been rearchitected in private to overcome some
of these issues, but I'm not aware of it.  All attempts to talk to Atsushi
about it were met with no response.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] 8.5 release timetable, again

2009-08-27 Thread Jonah H. Harris

On Thu, Aug 27, 2009 at 3:53 PM, David Fetter da...@fetter.org wrote:

  I would appreciate it if somebody could send out some messages of
  calm, while I/we work. The time for open review will come around
  soon enough.

 With all due respect, the time for open review is now.  You have
 already tried closed development several times, and it each time has
 been, more or less, a spectacular failure.


Unlike Robert and Heikki, I don't see you contributing to or assisting
Simon's work.  And, while I may be wrong, I doubt that you assisted in
funding any of Simon's work on hot standby either.  As such, it's my opinion
that continuing to criticize him from the sidelines is not only rude, but is
also a bad idea as it relates to his motivation in working on this feature.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] Problem with accesing Oracle from plperlu functionwhen using remote pg client.

2009-03-16 Thread Jonah H. Harris

On Mon, Mar 16, 2009 at 11:09 AM, Tomasz Olszak tols...@o2.pl wrote:

 So it looks like that plperlu function is executing from remote and local
 clients with the same set of environment variable.


It has nothing to do with the environment variables.


 So I don't have a clue how can I iron out this issue.


Finally, my low-level Oracle knowledge does benefit Postgres :)

It's a TNS parsing error due to a combination of Oracle's use of a Lispish
s-expression-like name-value pair format and Postgres' process listing
format for remote connections.

On connection, the Oracle client sends the current application name to the
Oracle server (which is listed in the V$SESSION view); in the case of
Postgres, the program name is the current backend process name text.
Because Oracle picks up Postgres' backend text, postgres: www postgres
192.168.1.1(13243), the (13243) screws up Oracle's TNS parser which
prevents it from resolving the connection.  This doesn't happen when you're
connected to PG locally, because the backend text is, postgres: www
postgres [local].

The solution to this is to change the following line in
src/backend/postmaster/postmaster.c:

remote_port[0] == '\0' ? %s : %s(%s)

TO

remote_port[0] == '\0' ? %s : %s[%s]
OR
remote_port[0] == '\0' ? %s : %s:%s

Which I would prefer as a nice change to make overall.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] Problem with accesing Oracle from plperlu functionwhen using remote pg client.

2009-03-16 Thread Jonah H. Harris

On Mon, Mar 16, 2009 at 1:06 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 We're not going to break a bunch of other applications in order to make
 some undocumented, unsupported Oracle thingie work (until they change
 it...).  Got another solution?


Unfortunately, that's the way Oracle has done it since before the existence
of POSTGRES.

I first encountered it while working on database links from PG to Oracle at
EnterpriseDB, and the reason Tomasz couldn't find the answer to this online
is because it's such a rare problem that Oracle has no reason to change it.
Really, how many people have parenthesis in their program names?  Similarly,
the problem has always existed when connecting to Oracle from Postgres using
DBI-Link or the oralink contrib module, there's just so few PG people
connecting to Oracle that it hasn't really come up before.

As for alternate solutions, the only thing I can think of is a config
parameter to disable rewrite of the ps line.  Frankly, I don't recall ever
seeing a script that looked for (port) in the process list, but there are
probably some home-grown ones out there.  As for me, I'd prefer to separate
the host and port via a colon, just as everything else does, but that isn't
backward compatible.

I would expect this to become more of an issue when we start getting SQL/MED
more closely integrated with the server and people can more easily connect
to other databases.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] Problem with accesing Oracle from plperlu functionwhen using remote pg client.

2009-03-16 Thread Jonah H. Harris

On Mon, Mar 16, 2009 at 12:50 PM, Tomasz Olszak tols...@o2.pl wrote:

 Thank you very much, I tried to solve it for about 2 weeks. I know that few
 people in the net have the same problem too.


No problem :)

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] Problem with accesing Oracle from plperlu functionwhen using remote pg client.

2009-03-16 Thread Jonah H. Harris

On Mon, Mar 16, 2009 at 12:36 PM, Alvaro Herrera alvhe...@commandprompt.com
 wrote:

 Jonah H. Harris escribió:

 Wow, that's a really idiotic thing for Oracle to do.


Well, being able to find out what applications are connected to the database
is nice.  But, it would also be nice if they stopped parsing the program
name if/when it encounters a left/right parenthesis.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] Problem with accesing Oracle from plperlu functionwhen using remote pg client.

2009-03-16 Thread Jonah H. Harris

On Mon, Mar 16, 2009 at 2:04 PM, Alvaro Herrera
alvhe...@commandprompt.comwrote:

 We already have one; it's called update_process_title.


I have it turned off, and I still see the remote IP/port in the process
list.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] Problem with accesing Oracle from plperlu functionwhen using remote pg client.

2009-03-16 Thread Jonah H. Harris

On Mon, Mar 16, 2009 at 2:26 PM, Jonah H. Harris jonah.har...@gmail.comwrote:

 On Mon, Mar 16, 2009 at 2:04 PM, Alvaro Herrera 
 alvhe...@commandprompt.com wrote:

  We already have one; it's called update_process_title.


 I have it turned off, and I still see the remote IP/port in the process
 list.


Ahh, this is why:

init_ps_display():set_ps_display(initial_str, true);

Perhaps it should obey the configuration setting as well?

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] Problem with accesing Oracle from plperlu functionwhen using remote pg client.

2009-03-16 Thread Jonah H. Harris

On Mon, Mar 16, 2009 at 2:34 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 I'm finding it hard to believe that there is no way to override what
 Oracle's client library does --- there are *plenty* of situations where
 you don't really want a client command line exposed to the whole world.


AFAIK, there is no way to override that.  It's very low-level in their
client stack, is operating-system specific, and has been there forever.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] Problem with accesing Oracle from plperlu functionwhen using remote pg client.

2009-03-16 Thread Jonah H. Harris

On Mon, Mar 16, 2009 at 3:03 PM, Heikki Linnakangas 
heikki.linnakan...@enterprisedb.com wrote:

 Someone should raise a support request / whatever they call them with
 Oracle to get this fixed on their side..


Heh.  Why would they fix it when it's only a problem for  1% of their users
in odd corner cases?

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] Problem with accesing Oracle from plperlu functionwhen using remote pg client.

2009-03-16 Thread Jonah H. Harris

On Mon, Mar 16, 2009 at 3:21 PM, Heikki Linnakangas 
heikki.linnakan...@enterprisedb.com wrote:

 Because that's what a respectable business does when a customer runs into a
 bug with software they sell.


It's not a bug, it's expected behavior.  Not that I think it couldn't be
better handled.

I'm not trying to dig at this, but looking at it in terms of flexibility,
rather than us change the way we display a port in the ps-line because it
may break a couple hundred scripts, you seem to think it's more reasonable
for a company with a product utilized by millions of users, installed in
countless governments, and deployed in mission-critical areas, to risk
changing a fairly mature and well-tested behavior because it affects fewer
than 1% of its users per year; specifically, users who are trying to
interoperate with a competing database?  If it were my business, it doesn't
seem like something I would put much effort into :)

Whether or not they actually will fix it, I don't know, but they surely
 won't if no-one complains them about it.


Wouldn't hurt :)

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] Problem with accesing Oracle from plperlu functionwhen using remote pg client.

2009-03-16 Thread Jonah H. Harris

On Mon, Mar 16, 2009 at 5:22 PM, Heikki Linnakangas 
heikki.linnakan...@enterprisedb.com wrote:

 Hmm, I wonder if you could do something malicious with it. Like, run a
 query along the lines of SELECT $$ (HOST=10.0.0.123) $$, connect()...  to
 divert the connection to another server.


Not any more malicious than a connection string in and of itself.   It's
only used as a hierarchical name-value pair string, nothing is executed from
it.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] Problem with accesing Oracle from plperlu functionwhen using remote pg client.

2009-03-16 Thread Jonah H. Harris

On Mon, Mar 16, 2009 at 8:50 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
  Hmm, I wonder if you could do something malicious with it.

 There are any number of scenarios where exposing the client command-line
 contents to other database users represents a security hole, quite
 independently of whether anything falls over depending on the line
 contents.  (I wonder whether there are any Oracle clients that accept
 a password on the command line, for instance.)


Sure they let you pass the password on the command line, but they don't
recommend it.  Most of the utilities accept the syntax:

utility user/p...@instance

Just doing u...@instance will generally prompt for a password.

Ahh, the number of passwords I've recovered from shell history files as a
consultant... good times :)

The only reason this complaint is directed to us, and not Oracle,
 is that the complainant knows how far he's likely to get complaining
 to Oracle :-(


I don't doubt that.  But, like I said, it's really a matter of the
application name.  In our case, Postgres falls into that corner case and we
either choose to do something about it or we don't.  I put the temporary
solution out there for anyone that has the problem.  If we want to fix it
long-term, we'd have to look at one of the previously discussed alternatives
to using (port).  I don't particularly care one way or another, but if we
were to change the ps line format, I just wanted to say that I preferred
host:port rather than host(port).

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] Out parameters handling

2009-03-06 Thread Jonah H. Harris

On Fri, Mar 6, 2009 at 4:29 PM, Asko Oja asc...@gmail.com wrote:

 It was one of my worst Friday's finding out that this brain dead
 implementation of out parameters had been part of fuck up again.


:)



 This time we did notice it two days too late.
 I wish for a way to use out parameters in functions only through some
 predefined prefix like in triggers new and old. Means i  would like to limit
 referencing to out parameters to one prefix only defined in the beginning of
 declare section of stored procedure.
 It really sucks what kind of mistakes you can pass to production
 unknowingly. I would much prefer a way to prevent such nonsense.
 Here was the case where out parameters were with same names with select
 into field names resulting in null outcome. Just yesterday we had similar
 case with update statement.


Well, it's a problem with the language not parsing things correctly and
doing, in many cases, brain-dead replacements.  I don't know of any
developer using OUT parameters that doesn't run into this problem at one
time or another :(

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] SYNONYMs revisited

2009-03-04 Thread Jonah H. Harris

On Wed, Mar 4, 2009 at 9:34 AM, Joshua Tolley eggyk...@gmail.com wrote:

 Way back in this thread[1] one of the arguments against allowing
 some version of CREATE SYNONYM was that we couldn't create a synonym for
 an object in a remote database. Will the SQL/MED work make this sort of
 thing a possibility? I realize since it's not standard anyway, there's
 still a discussion or two to be had about how precisely it should work,
 but thought I'd raise the possibility.


While shaking my head In that movie-like slow-motion used as a precursor to
an almost disastrous event, I see myself saying, no...

OK, back to reality.

SQL/MED does support foreign tables, which are basically synonyms for remote
tables.  Other than that, it has no real similarity to synonym behavior for
other database objects such as views, functions, or local tables.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] Hadoop backend?

2009-02-23 Thread Jonah H. Harris

On Sun, Feb 22, 2009 at 3:47 PM, Robert Haas robertmh...@gmail.com wrote:

 In theory, I think you could make postgres work on any type of
 underlying storage you like by writing a second smgr implementation
 that would exist alongside md.c.  The fly in the ointment is that
 you'd need a more sophisticated implementation of this line of code,
 from smgropen:

reln-smgr_which = 0;   /* we only have md.c at present */


I believe there is more than that which would need to be done nowadays.  I
seem to recall that the storage manager abstraction has slowly been
dedicated/optimized for md over the past 6 years or so.  It may even be
easier/preferred to write a hadoop specific access method depending on what
you're looking for from hadoop.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] pg_restore --multi-thread

2009-02-12 Thread Jonah H. Harris

On Thu, Feb 12, 2009 at 11:37 AM, Joshua D. Drake j...@commandprompt.comwrote:

 --num-workers or --num-connections would both work.


--num-parallel?

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] Optimization rules for semi and anti joins

2009-02-11 Thread Jonah H. Harris

On Wed, Feb 11, 2009 at 8:05 AM, David Fetter da...@fetter.org wrote:

 As has been discussed here many, many times, the only kind of person
 who should be doing a patent search is a company's IP attorney, which
 you are not, and even if you were, under no circumstances would such a
 person paste that link in a public forum.


First of all, it was not an intentional patent search.  Secondly, I don't
believe there's any restriction of explicitly what can and cannot be posted
on a public Postgres mailing list.


 Should we have a kick-off policy for this kind of misbehavior?


Shut up David.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] Optimization rules for semi and anti joins

2009-02-11 Thread Jonah H. Harris

On Wed, Feb 11, 2009 at 11:19 AM, David Fetter da...@fetter.org wrote:

 This is a very big deal, as you are exposing every US PostgreSQL
 contributor to triple damages for knowing infringement.  Are you
 saying you're going to pay all that out of your own pocket?  Are you
 making a legal commitment, say, with a few tens of million dollars in
 escrow to back it?


Per IRC, this discussion will (and likely should) be taken elsewhere.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] Optimization rules for semi and anti joins

2009-02-10 Thread Jonah H. Harris

On Tue, Feb 10, 2009 at 3:10 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 I wrote (in response to Kevin Grittner's recent issues):
  Reflecting on this further, I suspect there are also some bugs in the
  planner's rules about when semi/antijoins can commute with other joins;

 After doing some math I've concluded this is in fact the case.  Anyone
 want to check my work?


FWIW, the logic looks correct to me.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] Optimization rules for semi and anti joins

2009-02-10 Thread Jonah H. Harris

On Tue, Feb 10, 2009 at 8:09 PM, Jonah H. Harris jonah.har...@gmail.comwrote:

 On Tue, Feb 10, 2009 at 3:10 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 I wrote (in response to Kevin Grittner's recent issues):
  Reflecting on this further, I suspect there are also some bugs in the
  planner's rules about when semi/antijoins can commute with other joins;

 After doing some math I've concluded this is in fact the case.  Anyone
 want to check my work?


 FWIW, the logic looks correct to me.


Cripes!  I just had an idea and it looks like the buggers beat me to it :(

http://www.google.com/patents?id=4bqBEBAJdq=null+aware+anti-join

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] Optimization rules for semi and anti joins

2009-02-10 Thread Jonah H. Harris

On Tue, Feb 10, 2009 at 8:41 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Jonah H. Harris jonah.har...@gmail.com writes:
  Cripes!  I just had an idea and it looks like the buggers beat me to it
 :(
  http://www.google.com/patents?id=4bqBEBAJdq=null+aware+anti-join

 I wonder if the USPTO is really clueless enough to accept this?
 Claim 1 would give Oracle ownership of the definition of NOT IN,
 and few of the other claims seem exactly non-obvious either.


Yeah, I just looked up semi and anti-join optimization patents and
Oracle/IBM have a ton.  What an obvious exploitation of math for business
gain.  I doubt they'd be enforceable.  I wish they'd just do away with
software patents altogether :(

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] add_path optimization

2009-02-04 Thread Jonah H. Harris

On Wed, Feb 4, 2009 at 10:12 AM, Greg Stark st...@enterprisedb.com wrote:

 On Wed, Feb 4, 2009 at 3:07 PM, Kevin Grittner
  It's been about 23 hours and it's still running.

 @snide(Gee, it sure would be nice if we continued with that
 explain-in-progress patch I had sent in earlier...:)


Agreed.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] 8.4 release planning

2009-01-28 Thread Jonah H. Harris

On Wed, Jan 28, 2009 at 4:28 AM, Peter Eisentraut pete...@gmx.net wrote:

 Greg Smith wrote:

 PostgreSQL advocacy point, one of the questions Tom asked about a bit
 upthread is still a bit hazy here.  There are commercial database offerings
 selling into the trusted space already.  While the use-cases you describe
 make perfect sense, I don't think it's clear to everyone yet if there's a
 unique draw to a PostgreSQL + selinux solution that the class of customers
 you're talking about would prefer it to purchasing one of those products.
  Is the cost savings the main driver here, or is there something else about
 a secure LAPP stack that makes it particularly compelling?


 According to the data available to me, it is a combination of doing it
 better than the other guys (e.g., a SELinux type interface instead of
 something handcrafted) and the usual cost savings.


I don't know about better, but I would definitely say that it's a more
integrated (with the OS) solution.  Can you get Oracle to use SELinux
policies?  Sure.  But it would take a combination of Label Security, Fine
Grained Access Control tweaks, custom C functions, and custom policies to
handle the access control.  And, it would cost a helluva lot of money.

In short, this would make Postgres quite a bit more appetizing to those who
need this functionality, those who prefer SELinux-based policies, and those
who don't have the time/money to do it in systems like Oracle.  How many
people is that?  Based on my consulting experience and questions from
DoD/DoE people specifically, I think the number of people needing this
feature is fairly small right now.  But, it wouldn't hurt us to have it.

Just to make it clear, this feature wouldn't make Postgres a trusted
database in any certification sense.  So, using that term would likely cause
confusion and get people who used it thinking it had an EAL certification
into trouble.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] How to get SE-PostgreSQL acceptable

2009-01-28 Thread Jonah H. Harris

On Wed, Jan 28, 2009 at 9:49 PM, KaiGai Kohei kai...@ak.jp.nec.com wrote:

 IIRC, 0racle or M$ has a patent to rewrite WHERE clause for security
 purpose, so Tom suggested it should be implemented using a hook
 deployed within executor.


Yes, it was Oracle.  There are a couple newer revisions, but they're all
based primarily on Patent #6487552, Database Fine-grained Access Control,
Filed Oct 5, 1998/Issued Nov 26, 2002.  The patent covers defining a
security context, retrieving-defined policies from that context, and
applying those policies by directly calling a security-context-related
stored procedure in the WHERE clause as well as dynamically adding
security-related predicates to the WHERE-clause.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: 8.4 release planning (was Re: [HACKERS] [COMMITTERS] pgsql: Automatic view update rules)

2009-01-26 Thread Jonah H. Harris

On Sun, Jan 25, 2009 at 12:06 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Particularly with regard to hot standby, which by any sane reading was
 not close to being committable on 1 November (a fortiori from the fact
 that it's *still* not committable despite large amounts of later work).


While I haven't follwed every detail of this patch set, I'm not quite sure I
see that as being very fair to Simon.

Simon has put a lot of time into Hot Standby and has followed the
pseudo-defacto community process from design through what he believes to be
near-completion; he can't be sure of completion until someone reviews his
work.

Honestly, I'm not trying to marginalize your effort reviewing other patches,
and I know everyone is busy.  Hell, as much as I'd love to have this
feature, I too have been unable to find enough time to review Simon's
stuff.  However, over the past few months, I seem to recall seeing Simon
submit countless requests for review and regardless of whether it was
completely ready to go November 1st or not, at this time I don't think
anyone but Simon has a complete view of what his patch(es) even look like.
Yet, albeit with almost no review from the committers, Simon has continually
worked through testing, revising his patches, and requesting information and
suggestions from the community.

Frankly Tom, I don't know of anyone in the community with as much experience
in the recovery code as you and Simon.  So, any of the major edge case
problems will probably only be found by you regardless of how many of us
review Simon's work.

I do know that this is a feature which a large number of Postgres users
really want and were counting on being in 8.4.  Looking forward, if no one
wanted to review these patches in November, and seemingly no one wants to
review them now, how can we expect this to change for 8.5?  Can anyone point
out something Simon did wrong in this process?

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] 8.4 release planning

2009-01-26 Thread Jonah H. Harris

On Mon, Jan 26, 2009 at 11:11 AM, Merlin Moncure mmonc...@gmail.com wrote:

 What about a compromise solution: release 8.4 now, then focus on
 wrapping up the big ticket items that didn't make it into 8.4 into a
 quick (as possible) 8.5 release.  This means no fests.


That would depend on timing then.  Trying to get people to upgrade to 8.4 is
going to be difficult if they're waiting on Hot Standby, which means less
in-the-field testing of the 8.4 code base until the 8.5 release.  Similarly,
if we're looking at a quick 8.5 around September/October (having no commit
fests), that means it will probably be early 2011 for 8.6, which is fairly
unacceptable for the other patches currently in the queue.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] 8.4 release planning

2009-01-26 Thread Jonah H. Harris

On Mon, Jan 26, 2009 at 11:33 AM, Tom Lane t...@sss.pgh.pa.us wrote:

  That would depend on timing then.  Trying to get people to upgrade to 8.4
 is
  going to be difficult if they're waiting on Hot Standby, which means less
  in-the-field testing of the 8.4 code base until the 8.5 release.

 [ deja vu... ]  Just like no one was going to bother upgrading to 8.3
 because what they wanted wouldn't be there till 8.4, and the similar
 claims we heard about 8.2 and 8.1 before that ...


I'm not trying to be an alarmist, I'm just stating what I saw when I was @
EDB.  Customers, especially those with large databases or small admin teams,
would definitely wait for features before upgrading.  Some people waited
specifically for HOT or features that would benefit them specifically.  My
only gripe with a small window between 8.4 and 8.5 was just that I believe
people would be more likely to wait until 8.5 rather than upgrading twice in
the same year.  Though, as I generally like people to be using the latest
version of PG, I'd certainly be happy to be wrong on this.


  Similarly,
  if we're looking at a quick 8.5 around September/October (having no
 commit
  fests), that means it will probably be early 2011 for 8.6, which is
 fairly
  unacceptable for the other patches currently in the queue.

 Right, one of the major considerations here is allowing other
 development to get started again (and not be looking at two years wait
 to see the light of day).


Agreed.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] [PATCH] EnableDisableTrigger Cleanup Questions

2009-01-21 Thread Jonah H. Harris

On Thu, Nov 6, 2008 at 12:03 AM, Jonah H. Harris jonah.har...@gmail.comwrote:

 As I wasn't sure whether anyone agrees with my distaste for
 repurposing tgenabled as mentioned above, I have attached is a patch
 which minimally corrects the function comment for EnableDisableTrigger
 where fires_when is concerned.


Was there a reason that this cleanup patch wasn't applied?

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] [PATCH] EnableDisableTrigger Cleanup Questions

2009-01-21 Thread Jonah H. Harris

On Wed, Jan 21, 2009 at 2:02 PM, Robert Haas robertmh...@gmail.com wrote:

  Was there a reason that this cleanup patch wasn't applied?
 
  1. It was submitted after the deadline for CommitFest:November.
 
  Well, it's just comment changes...

 Oh, didn't realize that.  That's what I get for replying without
 reading the patch...


Yes :)


-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] Warning about the 8.4 release

2009-01-06 Thread Jonah H. Harris

On Tue, Jan 6, 2009 at 12:38 PM, Robert Haas robertmh...@gmail.com wrote:

 - WIP: Hash Join-Filter Pruning using Bloom Filters is in the commitfest


I'm pulling this patch and resubmitting for 8.5.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: [HACKERS] Lockfree hashtables

2008-12-31 Thread Jonah H. Harris

On Wed, Dec 31, 2008 at 7:33 AM, Stephen R. van den Berg s...@cuci.nlwrote:

 The other day I bumped into some ideas about lockfree hashtables.
 Are these of any use in PostgreSQL?


Lock-free and wait-free algorithms have been used in various databases, but
most people tend to shy away from them because of their complexity,
difficulty to debug, and low-level portability issues.

I've used them in the past (lock-free hash tables and skip lists), and
they're pretty awesome if used properly, but the majority of PG's current
performance problems aren't generally found as part of our hash table
implementation (which I'm quite fond of actually).  FWIS, I think we'll look
more into this sometime in the future.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

Re: About CMake (was Re: [HACKERS] [COMMITTERS] pgsql: Append major version number and for libraries soname major)

2008-12-29 Thread Jonah H. Harris

On Mon, Dec 29, 2008 at 11:47 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 But of course those are just as straightforward in autoconf.  It's
 the not-straightforward stuff that's going to be a PITA to translate.

As much as I dislike autotools, I really despise CMake; it's a nasty
piece of work and I hope we don't switch to it.  Though, as I must've
missed it, what's the main complaint with the current build system?

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-12-15 Thread Jonah H. Harris

On Mon, Dec 15, 2008 at 7:24 AM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 Here's an updated patch against head.

 Thanks.

No problemo.

 NOTE, it appears that this (and the previous) patch PANIC with
 concurrent transaction log activity while database system is shutting
 down on shutdown if checksumming is enabled.  This appears to be due
 to FlushBuffer (lines 1821-1828) during the checkpoint-at-shutdown.

 Yeah, I reported this issue several times.

Hmm.  Well, the easiest thing would be to add a !shutdown check for
logging the hint bits during the shutdown checkpoint :)  Of course,
that would break the page for recovery, which was the whole point of
putting that in place.  I'd have to look at xlog and see whether that
check can be deferred or changed.  Or, did you already research this
issue?

 Similarly, I ran a pgbench, performed a manual checkpoint, and
 corrupted the tellers table myself using hexedit but the system didn't
 pick up the corruption at all :(

 Heh :-)

:(

 Alvaro, have you given up on the patch or are you just busy on
 something else at the moment?

 I've given up until we find a good way to handle hint bits.  Various
 schemes have been proposed but they all have more or less fatal flaws.

Agreed.  Though, I don't want to see this patch get dropped from 8.4.

ALL, Alvaro has tried a couple different methods, does anyone have any
other ideas?

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-12-15 Thread Jonah H. Harris

On Mon, Dec 15, 2008 at 10:13 AM, Bruce Momjian br...@momjian.us wrote:
 Jonah H. Harris wrote:
  Alvaro, have you given up on the patch or are you just busy on
  something else at the moment?
 
  I've given up until we find a good way to handle hint bits.  Various
  schemes have been proposed but they all have more or less fatal flaws.

 Agreed.  Though, I don't want to see this patch get dropped from 8.4.

 ALL, Alvaro has tried a couple different methods, does anyone have any
 other ideas?

 Feature freeze is not the time to be looking for new ideas.  I suggest
 we save this for 8.5.

Well, we may not need a new idea.  Currently, the problem I see with
the checkpoint-at-shutdown looks like it could possibly be easily
solved.  Though, there may be other issues I'm not familiar with.  Has
anyone reviewed this yet?

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-12-15 Thread Jonah H. Harris

On Mon, Dec 15, 2008 at 11:29 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Jonah H. Harris jonah.har...@gmail.com writes:
 On Mon, Dec 15, 2008 at 10:13 AM, Bruce Momjian br...@momjian.us wrote:
 Feature freeze is not the time to be looking for new ideas.  I suggest
 we save this for 8.5.

 Well, we may not need a new idea.

 We don't really have an acceptable solution for the conflict with hint
 bit behavior.  The shutdown issue is minor, agreed, but that's not the
 stumbling block.

Agreed on the shutdown issue.  But, didn't this patch address the hint
bit setting as discussed?  After performing a cursory look at the
patch, it appears that hint-bit changes are detected and a WAL entry
is written on buffer flush if hint bits had been changed.  I don't see
anything wrong with this in theory.  Am I missing something?

Now, in the case where hint bits have been updated and a WAL record is
required because the buffer is being flushed, requiring the WAL to be
flushed up to that point may be a killer on performance.  Has anyone
tested it?

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-12-15 Thread Jonah H. Harris

On Mon, Dec 15, 2008 at 11:50 AM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 That only does heap hint bits, but it does nothing about pd_flags, the
 btree flags (btpo_cycleid I think), and something else I don't recall at
 the moment.  This was all solvable however.  The big problem with it was
 that it was using a new bit in pd_flags in unsafe ways.  To make it safe
 you'd have to grab a lock on the page, which is very probably problematic.

:(

 Now, in the case where hint bits have been updated and a WAL record is
 required because the buffer is being flushed, requiring the WAL to be
 flushed up to that point may be a killer on performance.  Has anyone
 tested it?

 I didn't measure it but I'm sure it'll be plenty slow.

Yeah.  What really sucks is that it would be fairly unpredictable and
could easily result in unexpected production performance issues.

It is pretty late in the process to continue with this design-related
discussion, but I really wanted to see it in 8.4.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-12-15 Thread Jonah H. Harris

On Mon, Dec 15, 2008 at 12:30 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 How hard would it be to just take an exclusive lock on the page when setting
 all these hint bits?

 I guess it will be intolerably slow then.  If we were to say we have
 CRC now, but if you enable it you have 1% of the performance we will
 get laughed at.

Well, Oracle does tell users that enabling full CRC checking will cost
~5% performance overhead, which is reasonable to me.  I'm not
pessimistic enough to think we'd be down to 1% the performance of a
non-CRC enabled system, but the locking overhead would probably be
fairly high.  The problem is, at this point, we don't really know what
the impact would be either way :(

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SQL/MED compatible connection manager

2008-12-15 Thread Jonah H. Harris

On Fri, Dec 12, 2008 at 7:55 AM, Peter Eisentraut pete...@gmx.net wrote:
 Now I have a question about the FDW C interface.  The way I understand it,
 an SQL/MED-enabled server and a FDW each have a specific API by which they
 communicate.  Supposedly, each database vendor should be able to ship a
 binary library for its FDW and each SQL/MED-enabled server should be able to
 load and use it.  (If you don't believe in binary compatibility, then I
 think there should at least be source-level interface compatibility.)

Yes, all FDWs should be similar to ODBC drivers in that they are
self-contained and interface with the database through a defined API.
What happens inside them should be irrelevant to PG.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-12-14 Thread Jonah H. Harris

On Sun, Dec 14, 2008 at 4:51 PM, Josh Berkus j...@agliodbs.com wrote:
 v11 doesn't apply to cvs head anymore

 I'm not currently working on this patch, sorry.


 Should we pull it from 8.4, then?

Here's an updated patch against head.

NOTE, it appears that this (and the previous) patch PANIC with
concurrent transaction log activity while database system is shutting
down on shutdown if checksumming is enabled.  This appears to be due
to FlushBuffer (lines 1821-1828) during the checkpoint-at-shutdown.
Other than that, I haven't looked into what needs to be done to fix
it.

Similarly, I ran a pgbench, performed a manual checkpoint, and
corrupted the tellers table myself using hexedit but the system didn't
pick up the corruption at all :(

Alvaro, have you given up on the patch or are you just busy on
something else at the moment?

-- 
Jonah H. Harris, Senior DBA
myYearbook.com


blkcrc-12.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COCOMO Indians

2008-12-11 Thread Jonah H. Harris

On Thu, Dec 11, 2008 at 4:43 AM, Dmitry Turin
[EMAIL PROTECTED] wrote:
 We would like to obtain your opinion on these two questions:

This is the wrong place to do it.

 2) We are captivated by price of Indians,
 we listened much about low quality of code, written by Indians,
 we are fearing, that American company will resale implementation to Indian 
 subcontractor
 (i.e. real developers will be Indians anyway).

Did you really just say that?

 (SQL50, HTML60)

Because it seems that you haven't got the hint yet, I'll just say it
frankly: No one really cares about your desired additions to Postgres.

-Jonah

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Simple postgresql.conf wizard

2008-11-25 Thread Jonah H. Harris

On Tue, Nov 25, 2008 at 8:38 PM, Joshua D. Drake [EMAIL PROTECTED] wrote:
 I don't this is not empirical but really, 150 is very reasonable. Let's
 just set it to that by default and be done with it. It won't hurt
 anything and if they need more than that, they are already investigating
 either via the lists or via a vendor anyway.

Agreed.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Simple postgresql.conf wizard

2008-11-14 Thread Jonah H. Harris

On Fri, Nov 14, 2008 at 2:52 AM, Heikki Linnakangas
[EMAIL PROTECTED] wrote:
 Other systems do it.  For example, Oracle tracks column usage and
 attempts to determine the optimal statistics for that column (based on
 the queries that used it) on an iterative basis.  We don't track
 column usage at all, so that option wouldn't be quite that easy to
 implement.  Though, there are certain things ANALYZE would be able to
 determine with a little help, such as knowing to collect more samples
 for columns it finds extremely skewed data in.

 That kind of feedback loops are a bit dangerous. For starters, it would mean
 that your test system would behave differently than your production system,
 just because you run different queries on it. There's also all kinds of
 weird dynamic behaviors that could kick in. For example, a query could run
 fine for the first few hundred times, but then the analyzer notices that a
 certain column is being accessed frequently and decides to increase the
 stats target for it, which changes the plan, for worse. Usually the new plan
 would be better, but the planner isn't perfect.

Oracle already thought of that a long time ago, which is why the plan
has to come out better for it to take effect.  As for bad plans, you
obviously haven't used Postgres in production enough to deal with it
continually changing plans for the worse due to index bloat, data
skew, phase of the moon, etc. :)

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Simple postgresql.conf wizard

2008-11-14 Thread Jonah H. Harris

On Fri, Nov 14, 2008 at 10:50 AM, Heikki Linnakangas
[EMAIL PROTECTED] wrote:
 Oracle already thought of that a long time ago, which is why the plan
 has to come out better for it to take effect.

 Huh? We would never willingly choose a worse plan, of course, but the point
 is that what looks like a better plan, with a smaller cost estimate, is
 sometimes actually worse.

Oracle bases it on cost and elapsed execution time.

  As for bad plans, you
 obviously haven't used Postgres in production enough to deal with it
 continually changing plans for the worse due to index bloat, data
 skew, phase of the moon, etc. :)

 You're right, I haven't, but yes I know that's a problem. We've chatted
 about that with Greg sometimes. It would be nice to have more stable plans.
 My favorite idea is to stop using the current relation size in the planner,
 and use the value snapshotted at ANALYZE instead. That way, the planner
 would be completely deterministic, based on the statistics. Then, we could
 have tools to snapshot the statistics, move them to a test system, store
 them, revert back to old statistics etc.

Yes, plan stability would be a Good Thing(tm) IMO.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Simple postgresql.conf wizard

2008-11-14 Thread Jonah H. Harris

On Fri, Nov 14, 2008 at 11:43 AM, Tom Lane [EMAIL PROTECTED] wrote:
 IMHO, the only thing worse than an unstable plan is a stable one.

Your opinion contradicts the majority of the industry then, I'm
afraid.  Like query hints, people are sometimes smarter than the
optimizer.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Simple postgresql.conf wizard

2008-11-14 Thread Jonah H. Harris

On Fri, Nov 14, 2008 at 12:07 PM, Tom Lane [EMAIL PROTECTED] wrote:
 And, very often, they're not --- or more likely, they were smarter than
 the optimizer last year, but now conditions have changed.  Failing to
 adapt to new conditions is exactly the problem with query hints, and
 in general with any insistence that plans should be stable.

Well, at least they didn't have to wait a year to fix the problem.

Similarly, whether or not the plan changed due to bad hints or bad
plans, detecting the change is relatively easy, so I don't really see
an argument based on *why* the plan failed.  In my, and many others
opinion, if you decide to take your query plan into your own hands,
it's your problem if it fails.  I do agree that hints are a little too
nice and simple, and generally get people into trouble because they're
hard-coded in an app, tend to cause issues later, and are then
difficult to track down.  Oracle solved that years ago as well, which
is why they support more advanced plan stability features than just
hints.

However, given the number of large-scale OLTP sites I've been to, I
can tell you from experience that post-ANALYZE plan changes wreak
complete havoc on a system and in many cases, bring it to its knees.
In those cases, the proper query plan is well-known, and a hint (or
some other form of plan stability) is all that would be required to
prevent it from happening.

This is pretty off-topic for this thread, so I'll postpone the
discussion for 8.5.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Simple postgresql.conf wizard

2008-11-13 Thread Jonah H. Harris

On Thu, Nov 13, 2008 at 3:20 PM, Grzegorz Jaskiewicz
[EMAIL PROTECTED] wrote:
 If that's the situation, me thinks you guys have to start thinking about
 some sort of automated way to increase this param per column as needed.
 Is there any way planner could actually tell, that it would do better job
 with more stats for certain column ?

Other systems do it.  For example, Oracle tracks column usage and
attempts to determine the optimal statistics for that column (based on
the queries that used it) on an iterative basis.  We don't track
column usage at all, so that option wouldn't be quite that easy to
implement.  Though, there are certain things ANALYZE would be able to
determine with a little help, such as knowing to collect more samples
for columns it finds extremely skewed data in.

There are other things that could be done as well... so the answer is, yes.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Enabling archive_mode without restart

2008-11-12 Thread Jonah H. Harris

 Anyway, I think this is worth fixing before release but it clearly isn't
 worth attempting to rush a patch in the next few hours. I don't think
 we'll find anyone who is happy with making it a restart-required option.

I couldn't find a patch/commit for this and was just wondering whether
someone is addressing it for 8.4?  In a large-scale OLTP environment,
uptime is paramount, and having to restart the database to enable PITR
is a big PITA.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Hash Join-Filter Pruning using Bloom Filters

2008-11-10 Thread Jonah H. Harris

On Mon, Nov 10, 2008 at 2:42 PM, Lawrence, Ramon [EMAIL PROTECTED] wrote:
 I have tested the Bloom filter patch.  It compiles cleanly against HEAD.

Thank you for testing this!

 As indicated, the performance improvements for hash join are good,
 especially when the build table is filtered with a selection condition.
 Performance improvements range from a couple of percent up to 20% for
 multi-batch joins.  Note that the bloom filter will slightly slow
 queries where the filter has no benefit.

I have a new patch which does not create a bloom filter unless it sees
that the hash join is going to batch.  I'll send it along later
tonight.

 I have not looked at the actual implementation of the Bloom filter, but
 will proceed to do that next.  One issue to be considered is how the
 space used for the bloom filter is related to the work_mem allocated to
 the join. That is, does the bloom filter consume some of the work_mem
 space or is it treated as additional memory allocated to the join.

Currently it's additional space not accounted for by work_mem.
Additionally, it's a good amount more space than is required.  This is
fixed in the newer patch as well.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Page space reservation (pgupgrade)

2008-11-09 Thread Jonah H. Harris

On Sun, Nov 9, 2008 at 7:55 PM, Decibel! [EMAIL PROTECTED] wrote:
 On Nov 8, 2008, at 8:35 PM, Jonah H. Harris wrote:
 That's my question.  Why is this needed at all?

 I suspect this is to deal with needing to reserve space in a cluster that
 you're planning on upgrading to a new version that would take more space,
 but I think the implementation is probably too simplistic.

Well, if that's what it is, I think it's a fairly poor design
decision.  When I upgrade Oracle, SQL Server, or MySQL, I don't need
to plan the amount of free space in my blocks a year or more before an
upgrade.  In fact, I don't have to plan it at all... it's completely
handled by the in-place upgrade.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SQL5 budget

2008-11-09 Thread Jonah H. Harris

On Sun, Nov 9, 2008 at 7:41 PM, Decibel! [EMAIL PROTECTED] wrote:
 I think you're barking up the wrong tree here; the community can't really do
 hacking for hire. If you want to pay for something to be implemented (which
 is great!), you'll need to talk to companies that do Postgres consulting.
 You can find examples on the website and through google. You could also try
 posting to pgsql-jobs.

I would suggest submitting it to pgsql-jobs.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] [PATCH] Recreate Missing WAL Directories (from TODO)

2008-11-08 Thread Jonah H. Harris

When performing a PITR copy of a data cluster, the pg_xlog directory
is generally omitted.  As such, when starting the copy up for
replay/recovery, the WAL directories need to be recreated.  This patch
checks to see whether XLOGDIR and XLOGDIR/archive_status exist on
XLOGStartup and if not, recreates them.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com


arcstatdir.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Recreate Missing WAL Directories (from TODO)

2008-11-08 Thread Jonah H. Harris

On Sat, Nov 8, 2008 at 4:08 PM, Tom Lane [EMAIL PROTECTED] wrote:
 This has been suggested before but I'm unconvinced that it's a good
 idea.  It's reasonably common for pg_xlog to be a symlink.  If you
 neglect to re-establish the symlink then what would happen is that xlog
 gets recreated on the data disk, and with no notice you are running in
 a degraded mode.

Agreed on the basis that people sometimes forget to symlink.  That's
the reason why I was echoing a message.  Initially the message was
WARNING, but I degraded it to LOG.

 It might be reasonable to auto-recreate XLOGDIR/archive_status, though.

Attached.

BTW, I have seen people create both pg_xlog and archive_status as
files, which is why I'm validating that in this function rather than
waiting for it to error-out later in the code.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com


arcstatdir_v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Page space reservation (pgupgrade)

2008-11-08 Thread Jonah H. Harris

On Sat, Nov 8, 2008 at 8:08 PM, Tom Lane [EMAIL PROTECTED] wrote:
 Zdenek Kotala [EMAIL PROTECTED] writes:
 Attached patch allows to setup storage parameter for space
 reservation.

 What is the point of this?

That's my question.  Why is this needed at all?

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] EnableDisableTrigger Cleanup Questions

2008-11-06 Thread Jonah H. Harris

On Thu, Nov 6, 2008 at 9:01 AM, Tom Lane [EMAIL PROTECTED] wrote:
 It would have been useful to make this criticism before 8.3 was
 released.  I don't think it's reasonable to change it now.

Well, I didn't have time to review code back in the 8.3 days, and ugly
is ugly regardless of when it was originally committted.  I'm not
saying it needs to be an 8.4 fix, just that as a whole, several of the
components of that patch (including rewrite) seem to be a little
hackish and that they could be cleaned up in 8.5.  I would imagine
someone will be working on trigger-related code in 8.5, and just
thought it would be nice to clean it up if one had the time to do so.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] EnableDisableTrigger Cleanup Questions

2008-11-06 Thread Jonah H. Harris

On Thu, Nov 6, 2008 at 10:08 AM, Tom Lane [EMAIL PROTECTED] wrote:
 I have no objection to cleaning up the backend internals, but system
 catalog definitions are client-visible.  I don't think we should thrash
 the catalog definitions for minor aesthetic improvements.  Since 8.3 is
 already out, that means client-side code (like pg_dump and psql, and
 probably other programs we don't control) is going to have to deal with
 the existing definition for the foreseeable future.  Dealing with this
 definition *and* a slightly cleaner one isn't a net improvement from the
 client standpoint.

Well, it didn't seem like anyone had an issue changing the definition
at 8.3 time.  As for pg_dump/psql, those changes are fairly simple.
And, there aren't that many PG utilities out there.  PGAdmin looks
like it would require a 1-3 line change (depending on coding
preferences) and I don't see anything that checks it in Slony.

I'm fine with cleaning up the internal-side, I just don't think
there's that much relying on tgenabled.  In fact, Google code search
seems to show more things relying on a boolean tgenabled rather than
the current implementation.

Oh well, it was just a thought.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] broken URL in commitfest page

2008-11-05 Thread Jonah H. Harris

On Wed, Nov 5, 2008 at 12:35 PM, Simon Riggs [EMAIL PROTECTED] wrote:
 The Join Removal item fails to point to a patch, also.

 I've removed that entry now. The patch was being worked on by Jonah but
 it looks like we didn't make the deadline.

Well, what is the official deadline on it?  It, like several other
patches on the wiki, was a WIP.  I'm hopeful that RI-based join
elimination for JOIN_INNER should be ready tonight based on your and
Tom's comments.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] [PATCH] EnableDisableTrigger Cleanup Questions

2008-11-05 Thread Jonah H. Harris

While working on the join elimination patch, I was going through the
trigger code and found quite a bit of nastiness in regard to naming
and variable repurposing related to the addition of replication roles
in 8.3.  The most obvious issue is that tgenabled was switched from a
bool to char to support replication roles.  From a naming standpoint,
the term enabled generally implies boolean and is fairly
consistently used as such in other functions within the core.  My
initial preference would be to return tgenabled to its original
boolean for use only in enabling/disabling triggers.  Then, I'd
probably add another boolean entry (tgreplica?) for use in determining
whether the trigger should be fired on origin/local or replica.
Otherwise, the naming of EnableDisableTrigger and friends seems a bit
contradictory due to the fact that it has the ability to convert a
trigger into a replica trigger.  Similarly, I can't see any reason for
keeping the structure member name the same, especially when the change
from bool to char broke backward compatibility anyway.  Thoughts?

As I wasn't sure whether anyone agrees with my distaste for
repurposing tgenabled as mentioned above, I have attached is a patch
which minimally corrects the function comment for EnableDisableTrigger
where fires_when is concerned.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com


endisable_trig_fctn_commnt_cleanup.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] [PATCH] Cleanup of PLpgSQL_recfield

2008-11-04 Thread Jonah H. Harris

While looking to add some functionality to PL/pgSQL, I found that the
rfno member of the PLpgSQL_recfield structure is unused.  This patch
is just a cleanup and doesn't seem along the same lines as the patches
in CommitFest... should I add it to the wiki anyway?

-- 
Jonah H. Harris, Senior DBA
myYearbook.com


plpgsql_unused_recrfno.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Cleanup of PLpgSQL_recfield

2008-11-04 Thread Jonah H. Harris

On Tue, Nov 4, 2008 at 3:57 PM, Tom Lane [EMAIL PROTECTED] wrote:
 I am not real sure why the code is inconsistent about spelling the
 second field's name differently in some of the structs, but it seems
 like a bad idea --- as you've demonstrated, it invites confusion.
 What would probably be better is a patch to rename exprno, rfno, etc
 to all be called dno to make this connection more obvious.

Attached.  Passed regressions and basic testing.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com


plpgsql_datumnaming_cleanup.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] WIP: Hash Join-Filter Pruning using Bloom Filters

2008-11-02 Thread Jonah H. Harris

All,

Attached is an initial patch I've been playing with which uses Bloom
filters to reduce unnecessary processing of outer tuples in hash
joins.  In short, this works by creating a Bloom filter, adding all
relevant tuples for the inner relation, and querying the filter (for
existence) when retrieving tuples from the outer relation.  This
avoids unnecessary tuple movement and bucket searches for matches we
already know can't exist.  Currently it works only for JOIN_INNER, but
could be modified to optimize anti/semi joins as well.  Similarly, I
created a GUC to enable pruning, named bloom_pruning.

Rather than performing k hash functions, this implementation simply
sets a bit based on the already-computed hash value.  I wanted to send
this around for reviews and comments before working on it further.  As
this isn't overly intrusive, if someone can commit to reviewing and
providing input, I'll commit to having this ready for 8.4.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com


bloompruning_v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Hash Join-Filter Pruning using Bloom Filters

2008-11-02 Thread Jonah H. Harris

On Sun, Nov 2, 2008 at 5:36 PM, Hannes Eder [EMAIL PROTECTED] wrote:
 On Sun, Nov 2, 2008 at 10:49 PM, Jonah H. Harris [EMAIL PROTECTED] wrote:
 Similarly, I
 created a GUC to enable pruning, named bloom_pruning.

 I guess calls to bloom_filter_XXX should be surrounded by if
 (bloom_pruning) ... or a similar construct, i.e. make use of the GUC
 variable bloom_pruning in the rest of the code.

It's effective as-is for a preliminary patch.  The GUC code is the
least of my worries.

 Can you provide some figures on the performance impact of the bloom filter?

It depends on the queries.  I've been trying to find a good suite of
hash join tests... but not much luck.

CREATE TABLE t1 (id INTEGER PRIMARY KEY, x INTEGER);
CREATE TABLE t2 (id INTEGER PRIMARY KEY, x INTEGER);
INSERT INTO t1 (SELECT ge, ge % 100 FROM generate_series(1, 100) ge);
INSERT INTO t2 (SELECT * FROM t1);
VACUUM ANALYZE;
SELECT COUNT(*)
  FROM t1, t2
 WHERE t1.id = t2.id
   AND t1.x  30
   AND t2.x  10;
SET bloom_pruning TO off;
EXPLAIN
SELECT COUNT(*)
  FROM t1, t2
 WHERE t1.id = t2.id
   AND t1.x  30
   AND t2.x  10;
\timing
SELECT COUNT(*)
  FROM t1, t2
 WHERE t1.id = t2.id
   AND t1.x  30
   AND t2.x  10;
\timing
EXPLAIN
SELECT *
  FROM t1, t2
 WHERE t1.id = t2.id
   AND t1.x  30
   AND t2.x  10;
\timing
SELECT *
  FROM t1, t2
 WHERE t1.id = t2.id
   AND t1.x  30
   AND t2.x  10;
\timing
SET bloom_pruning TO on;
\timing
SELECT COUNT(*)
  FROM t1, t2
 WHERE t1.id = t2.id
   AND t1.x  30
   AND t2.x  10;
\timing
EXPLAIN
SELECT *
  FROM t1, t2
 WHERE t1.id = t2.id
   AND t1.x  30
   AND t2.x  10;
\timing
SELECT *
  FROM t1, t2
 WHERE t1.id = t2.id
   AND t1.x  30
   AND t2.x  10;
\timing

-- Without Pruning
Time: 1142.843 ms
Time: 1567.355 ms

-- With Pruning
Time: 891.557 ms
Time: 1269.634 ms



-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-10-30 Thread Jonah H. Harris

On Thu, Oct 30, 2008 at 10:33 AM, Zdenek Kotala [EMAIL PROTECTED] wrote:
 Please, DO NOT MOVE position of page version in PageHeader structure! And
 PG_PAGE_LAYOUT_VERSION should be bump to 5.

Umm, any in-place upgrade should be capable of handling changes to the
page header.  Of, did I miss something significant in the in-place
upgrade design?

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-10-30 Thread Jonah H. Harris

On Thu, Oct 30, 2008 at 11:14 AM, Tom Lane [EMAIL PROTECTED] wrote:
 Jonah H. Harris [EMAIL PROTECTED] writes:
 On Thu, Oct 30, 2008 at 10:33 AM, Zdenek Kotala [EMAIL PROTECTED] wrote:
 Please, DO NOT MOVE position of page version in PageHeader structure! And
 PG_PAGE_LAYOUT_VERSION should be bump to 5.

 Umm, any in-place upgrade should be capable of handling changes to the
 page header.

 Well, yeah, but it has to be able to tell which version it's dealing
 with.  I quite agree with Zdenek that keeping the version indicator
 in a fixed location is appropriate.

Most of the other databases I've worked, which don't have different
types of pages, put the page version as the first element of the page.
 That would let us put the crc right after it.  Thoughts?

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-10-30 Thread Jonah H. Harris

On Thu, Oct 30, 2008 at 11:27 AM, Tom Lane [EMAIL PROTECTED] wrote:
 Jonah H. Harris [EMAIL PROTECTED] writes:
 On Thu, Oct 30, 2008 at 11:14 AM, Tom Lane [EMAIL PROTECTED] wrote:
 Well, yeah, but it has to be able to tell which version it's dealing
 with.  I quite agree with Zdenek that keeping the version indicator
 in a fixed location is appropriate.

 Most of the other databases I've worked, which don't have different
 types of pages, put the page version as the first element of the page.
  That would let us put the crc right after it.  Thoughts?

 Fixed location does not mean let's move it.

Just trying to be helpful.  Just thought I might give some insight as
to what others, who had implemented in-place upgrade functionality
years before Postgres' existence, had done.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-10-30 Thread Jonah H. Harris

On Thu, Oct 30, 2008 at 12:14 PM, Alvaro Herrera
[EMAIL PROTECTED] wrote:
 Gregory Stark escribió:

 What I'm wondering though -- are we going to make CRCs mandatory? Or set 
 aside
 the 4 bytes even if you're not using them? Because if the size of the page
 header varies depending on whether you're using CRCs that sounds like it 
 would
 be quite a pain.

 Not mandatory, but the space needs to be set aside.  (Otherwise you
 couldn't turn it on after running with it turned off, which would rule
 out using the database after initdb).

Agreed.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pre-MED

2008-10-29 Thread Jonah H. Harris

On Wed, Oct 29, 2008 at 12:40 PM, David Fetter [EMAIL PROTECTED] wrote:
 Please find enclosed a WIP patch to add the ability for functions to
 see the qualifiers of the query in which they're called.  It's not
 working just yet, and I'm not sure how best to get it working, but I'd
 like to see this as part of 8.4, as SQL/MED is just way too ambitious
 given the time frame.

To be more specific, SQL/MED is going to be 8.5.  This is an overall
improvement for accessing the predicate.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SQL/MED compatible connection manager

2008-10-27 Thread Jonah H. Harris

On Mon, Oct 27, 2008 at 10:06 AM, Martin Pihlak [EMAIL PROTECTED] wrote:
 So the proposal is to implement a small subset of SQL/MED to cope with
 connection info management -- connection manager. This will only manage the
 connection metadata and provide the required system catalogs and commands for
 maintaining them. The actual connection management (open/close etc.) is still
 handled by the client modules.

Per SQL:2008, there are no expected changes to SQL/MED from SQL:2003.
As such, two weeks ago, I completed a full review of SQL/MED and am
planning to fully implement it for 8.5.  Currently, I'm working on a
proof of concept and have created a SQL/MED access method (sqlmed) as
well as started implementing the FDW API and hooks into the optimizer
to support remote capabilities, costing, and predicate pushdown.  The
first wrappers I intend to support are ODBC and

This is a large project, and I'm certainly open to assistance :)

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SQL/MED compatible connection manager

2008-10-27 Thread Jonah H. Harris

On Mon, Oct 27, 2008 at 10:35 AM, Jonah H. Harris
[EMAIL PROTECTED] wrote:
 The first wrappers I intend to support are ODBC and

Damn multiple windows :)

The first wrappers I intend to support are ODBC and CSV/fixed-width text.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SQL/MED compatible connection manager

2008-10-27 Thread Jonah H. Harris

On Mon, Oct 27, 2008 at 11:31 AM, Martin Pihlak [EMAIL PROTECTED] wrote:
 Cool. Have you published some notes on it (wiki etc)?

Not yet.  Discussed it a little on irc, but nothing substantial.  I'll
look at updating the Wiki hopefully today.

 It certainly is an undertaking :) I'm mostly interested in the connection
 management -- so hopefully I can help there.

That would be awesome!

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-24 Thread Jonah H. Harris

On Fri, Oct 24, 2008 at 7:59 AM, Hannu Krosing [EMAIL PROTECTED] wrote:
 On Fri, 2008-10-24 at 00:52 -0400, Jonah H. Harris wrote:
 While we could build an
 abstract prefetch interface and simply use fadvise for it now (rather
 than OS-specific code), I don't see an easy win in any case.

 When building an abstract interface, always use at least two
 implementations (I guess that would be fadvise on linux and AIO on
 solaris in this case). You are much more likely to get the interface
 right this way.

I agree, I just wasn't sure as to whether Greg's patch supported both methods.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Jonah H. Harris

On Thu, Oct 23, 2008 at 4:53 PM, Greg Smith [EMAIL PROTECTED] wrote:
 I think the current plan is to use posix_advise() to allow parallel I/O,
 rather than async I/O becuase posix_advise() will require fewer code
 changes.

 These are not necessarily mutually exclusive designs.  fadvise works fine on
 Linux, but as far as I know only async I/O works on Solaris.  Linux also has
 an async I/O library, and it's not clear to me yet whether that might work
 even better than the fadvise approach.

fadvise is a kludge.  While it will help, it still makes us completely
reliant on the OS.  For performance reasons, we should be supporting a
multi-block read directly into shared buffers.  IIRC, we currently
have support for rings in the buffer pool, which we could read
directly into.  Though, an LRU-based buffer manager design would be
more optimal in this case.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Jonah H. Harris

On Thu, Oct 23, 2008 at 8:44 PM, Bruce Momjian [EMAIL PROTECTED] wrote:
 True, it is a kludge but if it gives us 95% of the benfit with 10% of
 the code, it is a win.

I'd say, optimistically, maybe 30-45% the benefit over a proper
multi-block read using O_DIRECT.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Jonah H. Harris

On Thu, Oct 23, 2008 at 10:36 PM, Greg Stark
[EMAIL PROTECTED] wrote:
 I couldn't get async I/O to work on Linux. That is it worked but performed
 the same as reading one block at a time. On solaris the situation is
 reversed.

Hmm, then obviously you did something wrong, because my tests showed
it quite well.  Pull the source to iozone or fio.

 In what way is fadvise a kludge?

non-portable, requires more user-to-system CPU, ... need I go on?

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Jonah H. Harris

On Fri, Oct 24, 2008 at 12:42 AM, Tom Lane [EMAIL PROTECTED] wrote:
 non-portable, requires more user-to-system CPU, ... need I go on?

 I'd be interested to know which of these proposals you claim *is*
 portable.  The single biggest reason to reject 'em all is that
 they aren't.

Yes, that was bad wording on my part.  What I mean to say was
unpredictable.  Different OSes and filesystems handle fadvise
differently (or not at all), which makes any claim to performance gain
configuration-dependent.  My preferred method, using O_DIRECT and
fetching directly into shared buffers, is not without its issues or
challenges as well.  However, by abstracting the multi-block read
interface, we could use more optimal calls depending on the OS.

Having done a bit of research and testing in this area (AIO and buffer
management), I don't see any easy solution.  fadvise will work on some
systems and will likely give some gain on them, but won't work for
everyone.  The alternative is to abstract prefetching and allow
platform-specific code, which we rarely do.  While we could build an
abstract prefetch interface and simply use fadvise for it now (rather
than OS-specific code), I don't see an easy win in any case.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Properly access a buffer's LSN using existing access macros

2008-10-20 Thread Jonah H. Harris

On Mon, Oct 20, 2008 at 5:23 PM, Alvaro Herrera
[EMAIL PROTECTED] wrote:
 Hmm, I just noticed this comment in bufpage.h (which was also in Jonah's
 patch) :-(

 typedef struct PageHeaderData
 {
/* XXX LSN is member of *any* block, not only page-organized ones */

Passed regressions and several benchmarks for me.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-10-17 Thread Jonah H. Harris

On Fri, Oct 17, 2008 at 11:26 AM, Alvaro Herrera
[EMAIL PROTECTED] wrote:
 So this discussion died with no solution arising to the
 hint-bit-setting-invalidates-the-CRC problem.

I've been busy.

 Apparently the only solution in sight is to WAL-log hint bits.  Simon
 opines it would be horrible from a performance standpoint to WAL-log
 every hint bit set, and I think we all agree with that.  So we need to
 find an alternative mechanism to WAL log hint bits.

Agreed.

 I thought about causing a process that's about to write a page check a
 flag that says this page has been dirtied by someone who didn't bother
 to generate WAL.  If the flag is set, then the writer process is forced
 to write a WAL record containing all hint bits in the page, and only
 then it is allowed to write the page (and thus calculate the new CRC).

Interesting idea... let me ponder it for a bit.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-10-17 Thread Jonah H. Harris

On Fri, Oct 17, 2008 at 12:05 PM, Greg Stark
[EMAIL PROTECTED] wrote:
 Heikki had a clever idea earlier which was to have two crc checks- one which
 skips the hint bits and one dedicated to hint bits. If the second doesn't
 match we clear all the hint bits.

Sounds overcomplicated to me.

 The problem with that is that skipping the hint bits for the main crc would
 slow it down severely. It would make a lot of sense if the hint bits were
 all in a contiguous block of memory but I can't see how to make that add up.

Agreed.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-10-02 Thread Jonah H. Harris

On Thu, Oct 2, 2008 at 9:07 AM, Brian Hurt [EMAIL PROTECTED] wrote:
 I have a stupid question wrt hint bits and CRC checksums- it seems to me
 that it should be possible, if you change the hint bits, to be able to very
 easily calculate what the change in the CRC checksum should be.

Doesn't the problem still remain?  The problem being that the buffer
can be changed as it's written, yes?

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-10-02 Thread Jonah H. Harris

On Thu, Oct 2, 2008 at 9:36 AM, Brian Hurt [EMAIL PROTECTED] wrote:
 Another possibility is to just not checksum the hint bits...

Seems like that would just complicate matters and prevent a viable checksum.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-10-02 Thread Jonah H. Harris

On Thu, Oct 2, 2008 at 9:42 AM, Gregory Stark [EMAIL PROTECTED] wrote:
 It's even worse than that. Two processes can both be fiddling hint bits on
 different tuples (or even the same tuple) at the same time.

Agreed.  Back to the double-buffer idea, we could have a temporary
BLCKSZ buffer we could use immediately before write() which we could
copy the block to, perform the checksum on, and write out... is that
what you were thinking Tom?

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-10-02 Thread Jonah H. Harris

On Thu, Oct 2, 2008 at 10:09 AM, Andrew Chernow [EMAIL PROTECTED] wrote:
 I read through this patch and am curious why 0xdeadbeef was used as an
 uninitialized value for the page crc.  Is this value somehow less likely to
 have collisons than zero (or any other arbitrary value)?

It was just an arbitrary value I chose to identify non-checksummed
pages; I believe would have the same collision rate as anything else.

 Would it not be better to add a boolean bit or byte to inidcate the crc
 state?

Ideally, though we don't have any spare bits to play with in MAXALIGN=4.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-10-02 Thread Jonah H. Harris

On Thu, Oct 2, 2008 at 10:27 AM, Heikki Linnakangas
[EMAIL PROTECTED] wrote:
 Ideally, though we don't have any spare bits to play with in MAXALIGN=4.

 In the page header? There's plenty of free bits in pd_flags.

Ahh, didn't see that.  Good catch!

 But isn't it a bit dangerous to have a single flag on the page indicating
 whether the CRC is valid or not? Any corruption that flips that bit would
 make the CRC check to be skipped.

Agreed.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-10-02 Thread Jonah H. Harris

On Thu, Oct 2, 2008 at 10:41 AM, Tom Lane [EMAIL PROTECTED] wrote:
 Not checksumming the hint bits *is* a solution to the torn page problem.

 Yeah, but it has enough drawbacks that I'd like to keep looking for
 alternatives.

Agreed.

 One argument that I've not seen raised is that not checksumming the hint
 bits leaves you open to a single-bit error that incorrectly sets a hint
 bit.

Agreed.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-10-02 Thread Jonah H. Harris

So, it comes down to two possible designs, each with its own set of challenges.

Just to see where to go from here... I want to make sure the options
I've seen in this thread are laid out clearly:

1. Hold an exclusive lock on the buffer during the call to smgrwrite
OR
2. Doublebuffer the write
OR
3. Do some crufty magic to ignore hint-bit updates

Because option 3 not only complicates the entire thing, but also makes
corruption more difficult to detect, I don't consider it viable.  Can
anyone provide a reason that makes this option viable?

Option 1 will prevent hint-bit updates during write, which means we
can checksum the buffer and not worry about it.  Also, is only the
buffer content lock required?  This could potentially slow down
concurrent transactions reading the block and/or writing hint bits.

Option #2 consists of copying the block to a temporary buffer,
checksumming it, and pushing the checksummed block down to write() (at
smgr/md/fd depending on where we want to perform the checksum).

From my perspective, I prefer #2 and performing it at the sgmr layer,
but I am open to suggestions.  Tom, what are your thoughts?  #1 isn't
very difficult, but I can see it potentially causing a number of
side-problems and it would require a fair amount of testing.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-10-02 Thread Jonah H. Harris

On Thu, Oct 2, 2008 at 12:05 PM, Aidan Van Dyk [EMAIL PROTECTED] wrote:
 How does your current write strategy handle this situation.  I mean,
 how do you currently guarnetee that between when you call write() and
 the kernel copies the buffer internally, no hint-bit are updated?

Working on the exact double-buffering technique now.

 #define write(fd, buf, count) buffer_crc_write(fd, buf, count)

I certainly wouldn't interpose the write() call itself; that's just
asking for trouble.

 whatever protection you have on the regular write is sufficient.  The
 time of the protection will need to start before the buffer period
 instead of just the write, (and maybe not the write syscall anymore) but
 with CPU caches and speed, the buffer period should be = the time of
 the write() syscall...  Your fsync is your on disk guarentee, not the
 write, and that won't change.

Agreed.

 But I thought you didn't really care about hint-bit updates, even in the
 current strategy... but I'm fully ignorant about the code, sorry...

The current implementation does not take it into account.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-10-02 Thread Jonah H. Harris

On Thu, Oct 2, 2008 at 12:51 PM, Aidan Van Dyk [EMAIL PROTECTED] wrote:
  But I thought you didn't really care about hint-bit updates, even in the
  current strategy... but I'm fully ignorant about the code, sorry...

 The current implementation does not take it into account.

 So if PG currently doesn't care about the hit-bits being updated, during
 the write, then why should introducing a double-buffer introduce the a
 torn-page problem Tom mentions?  I admit, I'm fishing for information
 from those in the know, because I haven't been looking at the code long
 enough (or all of it enough) to to know all the ins-and-outs...

PG doesn't care because during hint-bits aren't logged and during
normal WAL replay, the old page will be pulled from the WAL.  I
believe what Tom is referring to is that the buffer PG sends to
write() can still be modified by way of SetHintBits between the time
smgrwrite is called and the time the actual write takes place, which
is why we can't rely on a checksum of the buffer pointer passed to
smgrwrite and friends.

If we're double-buffering the write, I don't see where we could be
introducing a torn-page, as we'd actually be writing a copied version
of the buffer.  Will look into this.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-10-02 Thread Jonah H. Harris

On Thu, Oct 2, 2008 at 1:07 PM, Bruce Momjian [EMAIL PROTECTED] wrote:
 If we're double-buffering the write, I don't see where we could be
 introducing a torn-page, as we'd actually be writing a copied version
 of the buffer.  Will look into this.

 The torn page is during kernel write to disk, I assume, so it is still
 possible.

Well, we can't really control too much of that.  The most common
solution to that I've seen is to double-write the page (which some
OSes already do regardless).  Or, are you meaning something else?

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-10-02 Thread Jonah H. Harris

 It's not the buffeting it's the checksum. The problem arises if a page is
 read in but no wal logged modifications are done against it. If a hint bit
 is modified it won't be wal logged but the page is marked dirty.

Ah.  Thanks Greg.  Let me look into this a bit before I respond :)

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-10-02 Thread Jonah H. Harris

On Thu, Oct 2, 2008 at 1:58 PM, Gregory Stark [EMAIL PROTECTED] wrote:
 On recovery after a torn-page write, won't the recovery of the
 full_page_write WAL + WAL changes get us back to the page as it was
 before the buffer+checksum+write?

 Hint bit setting doesn't trigger a WAL record.

Hence, no page image is written to WAL for later use in recovery.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-10-02 Thread Jonah H. Harris

On Thu, Oct 2, 2008 at 1:44 PM, Alvaro Herrera
[EMAIL PROTECTED] wrote:
 How about when a hint bit is set and the page is not already dirty, set
 the checksum to the always valid value?  The problem I have with this
 idea is that there would be lots of pages excluded from the CRC checks,
 a non-trivial percentage of the time.

I don't like that because it trades-off corruption detection (the
whole point of this feature) for a slight performance improvement.

 Maybe we could mix this with Simon's approach to counting hint bit
 setting, and calculate a valid CRC on the page every n-th non-logged
 change.

I still think we should only calculate checksums on the actual write.
And, this still seems to have an issue with WAL, unless Simon's
original idea somehow included recording hint bit settings/dirtying
the page in WAL.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-10-01 Thread Jonah H. Harris

On Wed, Oct 1, 2008 at 9:25 AM, Tom Lane [EMAIL PROTECTED] wrote:
 Harald Armin Massa [EMAIL PROTECTED] writes:
 WHAT should happen when corrupted data is detected?

 Same thing that happens now, ie, query fails with an error.  This would
 just be an extension of the existing validity checks done at page read
 time.

Agreed.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Block-level CRC checks

2008-10-01 Thread Jonah H. Harris

On Wed, Oct 1, 2008 at 11:36 AM, Tom Lane [EMAIL PROTECTED] wrote:
 I probably wouldn't compare checksumming *every* WAL record to a
 single block-level checksum.

 No, not at all.  Block-level checksums would be an order of magnitude
 more expensive: they're on bigger chunks of data and they'd be done more
 often.

That's debatable and would be dependent on cache and the workload.

In our case however, because shared buffers doesn't scale, we would
end up doing a lot more block-level checksums than the other vendors
just pushing the block to/from the OS cache.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

1 2 3 4 5 6 >

1 - 100 of 580 matches

Mail list logo