Re: [HACKERS] Performance Enhancement/Fix for Array Utility Functions

2010-07-27 Thread Mike Lewis
>
>
> > 1. As-is, it's a significant *pessimization* for small arrays, because
> > the heap_tuple_untoast_attr_slice code does a palloc/copy even when one
> > is not needed because the data is already not toasted.  I think there
> > needs to be a code path that avoids that.
>
> This seems like it shouldn't be too hard to fix, and I think it should be
> fixed.
>

Do you have any suggestions where to start?  I do agree that this should be
fixed as well.   I don't have too much time to dedicate to this project.  I
can try to put in some time this weekend though if it isn't looking too bad.


> > 2. Arrays that are large enough to be pushed out to toast storage are
> > almost certainly going to get compressed first.  The potential win in
> > this case is very limited because heap_tuple_untoast_attr_slice will
> > fetch and decompress the whole thing.  Admittedly this is a limitation
> > of the existing code and not a fault of the patch proper, but still, if
> > you want to make something that's generically useful, you need to do
> > something about that.  Perhaps pglz_decompress() could be extended with
> > an argument to say "decompress no more than this much" --- although that
> > would mean adding another test to its inner loop, so we'd need to check
> > for performance degradation.  I'm also unsure how to predict how much
> > compressed data needs to be read in to get at least N bytes of
> > decompressed data.
>
> But this part seems to me to be something that can, and probably
> should, be left for a separate patch.  I don't see that we're losing
> anything this way; we're just not winning as much as we otherwise
> might.  To really fix this, I suspect we'd need a version of
> pglz_decompress that can operate in "streaming mode"; you tell it how
> many bytes you want, and it returns a code indicating that either (a)
> it decompressed that many bytes or (b) it decompressed less than that
> many bytes and you can call it again with another chunk of data if you
> want to keep going.  That'd probably be a good thing to have, but I
> don't think it needs to be a prerequisite for this patch.
>

Hopefully this is the case.  I can try tackling the first part, however.

Thanks,
Mike
--
Michael Lewis
lolrus.org
mikelikes...@gmail.com


Re: [HACKERS] Improper usage of MemoryContext in nodeSubplan.c for buildSubPlanHash() function. This maybe causes allocate memory failed.

2010-07-27 Thread Tom Lane
I wrote:
> "Tao Ma"  writes:
>> This is a potential memory error in nodeSubplan.c or execGrouping.c
>> Using select '1'::TEXT IN ((SELECT '1'::NAME) UNION ALL SELECT '1'::NAME);
>> to reproduce this bug.
>> ...
>> To fix this problem, we can use another memory context to pass in to
>> BuildTupleHashTable() as the hashtable's tempcxt, or use some other
>> memory context as the hash table's tempcxt, or other ways.

> Yeah, I think you're right --- we can't get away with reusing the
> innerecontext's per-tuple context for the hashtable temp contexts.
> The best solution is probably to make an additional context that
> does nothing but act as the hashtable temp context.

I've committed a fix along those lines.  Thanks for the report!

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostGIS vs. PGXS in 9.0beta3

2010-07-27 Thread Andrew Dunstan



Tom Lane wrote:
> Josh Berkus  writes:
>> A 9.0b3 tester reported this issue with our single most popular
>> PostgreSQL extension, PostGIS:
>>
>> ==
>> Postgis makes use of 'PGXS' in postgresql > 8.2. Within postgresql-9,
>> datadir and many other variables are defined as multiple values with an
>> append operator, like this:
>>
>> $ grep -i datadir /usr/pgsql-9.0/lib/pgxs/src/Makefile.global
>> [snip]
>> datadir := /usr/pgsql-9.0/share
>
> This analysis is nonsense on its face --- := is not an append operator
> and we do not have any multiple values for datadir.
>
> The referenced postgis-users thread seems to indicate that the postgis
> guys found and fixed a problem in their own makefiles.  If not, we need
> a clearer description of what they think the problem is.

The real problem has nothing to do with any of the analysis, as you say. 
It is this: they have an override file for PGXS and it uses 
$(mkinstalldirs) which we got rid of about a year ago. So apparently 
they haven't been testing much against any of our alphas or betas or 
they would have seen this long ago. The correct fix is to do the 
following in the PostGIS source root:


   sed -i -e 's/mkinstalldirs/MKDIR_P/' postgis/Makefile.pgxs

cheers

andrew



Re: [HACKERS] page corruption on 8.3+ that makes it to standby

2010-07-27 Thread Jeff Davis
On Tue, 2010-07-27 at 15:23 -0700, Jeff Davis wrote:
> On Tue, 2010-07-27 at 17:18 -0400, Robert Haas wrote:
> > > My first concern with that idea was that it may create an inconsistency
> > > between the primary and the standby. The primary could have a bunch of
> > > zero pages that never make it to the standby.
> > 
> > Maybe I'm slow on the uptake here, but don't the pages start out
> > all-zeroes on the standby just as they do on the primary? The only way
> > it seems like this would be a problem is if a page that previously
> > contained data on the primary was subsequently zeroed without writing
> > a WAL record - or am I confused?
> 
> The case I was concerned about is when you have a table on the primary
> with a bunch of zero pages at the end. Then you SET TABLESPACE, and none
> of the copied pages (or even the fact that they exist) would be sent to
> the standby, but they would exist on the primary. And later pages may
> have data, so the standby may see page N but not N-1.
> 
> Generally, most of the code is not expecting to read or write past the
> end of the file, unless it's doing an extension.
> 
> However, I think everything is fine during recovery, because it looks
> like it's designed to create zero pages as needed. So your idea seems
> safe to me, although I do still have some doubts because of my lack of
> knowledge in this area; particularly hot standby conflict
> detection/resolution.
> 
> My idea was different: still log the zero page, just don't set LSN or
> TLI for a zero page in log_newpage() or heap_xlog_newpage(). This isn't
> as clean as your idea, but I'm a little more confident that it is
> correct.
> 

Both potential fixes attached and both appear to work.

fix1 -- Only call PageSetLSN/TLI inside log_newpage() and
heap_xlog_newpage() if the page is not zeroed.

fix2 -- Don't call log_newpage() at all if the page is zeroed.

Please review. I don't have a strong opinion about which one should be
applied.

Regards,
Jeff Davis
*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
***
*** 4079,4086  log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,
  
  	recptr = XLogInsert(RM_HEAP_ID, XLOG_HEAP_NEWPAGE, rdata);
  
! 	PageSetLSN(page, recptr);
! 	PageSetTLI(page, ThisTimeLineID);
  
  	END_CRIT_SECTION();
  
--- 4079,4093 
  
  	recptr = XLogInsert(RM_HEAP_ID, XLOG_HEAP_NEWPAGE, rdata);
  
! 	/*
! 	 * The new page may be uninitialized. If so, we can't set the LSN
! 	 * and TLI because that would corrupt the page.
! 	 */
! 	if (!PageIsNew(page))
! 	{
! 		PageSetLSN(page, recptr);
! 		PageSetTLI(page, ThisTimeLineID);
! 	}
  
  	END_CRIT_SECTION();
  
***
*** 4266,4273  heap_xlog_newpage(XLogRecPtr lsn, XLogRecord *record)
  	Assert(record->xl_len == SizeOfHeapNewpage + BLCKSZ);
  	memcpy(page, (char *) xlrec + SizeOfHeapNewpage, BLCKSZ);
  
! 	PageSetLSN(page, lsn);
! 	PageSetTLI(page, ThisTimeLineID);
  	MarkBufferDirty(buffer);
  	UnlockReleaseBuffer(buffer);
  }
--- 4273,4288 
  	Assert(record->xl_len == SizeOfHeapNewpage + BLCKSZ);
  	memcpy(page, (char *) xlrec + SizeOfHeapNewpage, BLCKSZ);
  
! 	/*
! 	 * The new page may be uninitialized. If so, we can't set the LSN
! 	 * and TLI because that would corrupt the page.
! 	 */
! 	if (!PageIsNew(page))
! 	{
! 		PageSetLSN(page, lsn);
! 		PageSetTLI(page, ThisTimeLineID);
! 	}
! 
  	MarkBufferDirty(buffer);
  	UnlockReleaseBuffer(buffer);
  }
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
***
*** 7082,7089  copy_relation_data(SMgrRelation src, SMgrRelation dst,
  
  		smgrread(src, forkNum, blkno, buf);
  
! 		/* XLOG stuff */
! 		if (use_wal)
  			log_newpage(&dst->smgr_rnode, forkNum, blkno, page);
  
  		/*
--- 7082,7094 
  
  		smgrread(src, forkNum, blkno, buf);
  
! 		/*
! 		 * XLOG stuff
! 		 *
! 		 * Don't log page if it's new, because that will set the LSN
! 		 * and TLI, corrupting the page.
! 		 */
! 		if (use_wal && !PageIsNew(page))
  			log_newpage(&dst->smgr_rnode, forkNum, blkno, page);
  
  		/*



Re: [HACKERS] Performance Enhancement/Fix for Array Utility Functions

2010-07-27 Thread Robert Haas
On Fri, Jul 16, 2010 at 4:43 PM, Tom Lane  wrote:
> Daniel Farina  writes:
>> Generally I think the delimited untoasting of metadata from arrays
>> separately from the payload is Not A Bad Idea.
>
> I looked at this patch a bit.  I agree that it could be a big win for
> large external arrays, but ...
>
> 1. As-is, it's a significant *pessimization* for small arrays, because
> the heap_tuple_untoast_attr_slice code does a palloc/copy even when one
> is not needed because the data is already not toasted.  I think there
> needs to be a code path that avoids that.

This seems like it shouldn't be too hard to fix, and I think it should be fixed.

> 2. Arrays that are large enough to be pushed out to toast storage are
> almost certainly going to get compressed first.  The potential win in
> this case is very limited because heap_tuple_untoast_attr_slice will
> fetch and decompress the whole thing.  Admittedly this is a limitation
> of the existing code and not a fault of the patch proper, but still, if
> you want to make something that's generically useful, you need to do
> something about that.  Perhaps pglz_decompress() could be extended with
> an argument to say "decompress no more than this much" --- although that
> would mean adding another test to its inner loop, so we'd need to check
> for performance degradation.  I'm also unsure how to predict how much
> compressed data needs to be read in to get at least N bytes of
> decompressed data.

But this part seems to me to be something that can, and probably
should, be left for a separate patch.  I don't see that we're losing
anything this way; we're just not winning as much as we otherwise
might.  To really fix this, I suspect we'd need a version of
pglz_decompress that can operate in "streaming mode"; you tell it how
many bytes you want, and it returns a code indicating that either (a)
it decompressed that many bytes or (b) it decompressed less than that
many bytes and you can call it again with another chunk of data if you
want to keep going.  That'd probably be a good thing to have, but I
don't think it needs to be a prerequisite for this patch.

I'm going to mark this patch "Waiting on Author".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Improper usage of MemoryContext in nodeSubplan.c for buildSubPlanHash() function. This maybe causes allocate memory failed.

2010-07-27 Thread Tom Lane
"Tao Ma"  writes:
> This is a potential memory error in nodeSubplan.c or execGrouping.c
> Using select '1'::TEXT IN ((SELECT '1'::NAME) UNION ALL SELECT '1'::NAME);
> to reproduce this bug.
> ...
> To fix this problem, we can use another memory context to pass in to
> BuildTupleHashTable() as the hashtable's tempcxt, or use some other
> memory context as the hash table's tempcxt, or other ways.

Yeah, I think you're right --- we can't get away with reusing the
innerecontext's per-tuple context for the hashtable temp contexts.
The best solution is probably to make an additional context that
does nothing but act as the hashtable temp context.

This bug's been there a long time :-(.  Surprising that no one found it
before.  It would be mostly masked in a non-assert build, but not
entirely, I think.

regards, tom lane



Re: [HACKERS] Toward a column reorder solution

2010-07-27 Thread David E. Wheeler
On Jul 27, 2010, at 3:01 PM, Joshua D. Drake wrote:

> Correct. We are also hoping to get some sponsorship for it.
> 
> https://www.fossexperts.com/

Frigging copycat.

Any sponsorship for PGXN in there? ;-P

Best,

David




Re: [RRR] Review: Re: [PATCH] Re: [HACKERS] Adding xpath_exists function

2010-07-27 Thread Jeff Davis
On Tue, 2010-07-27 at 19:41 -0400, Robert Haas wrote:
> On Tue, Jul 27, 2010 at 7:33 PM, David Fetter  wrote:
> >Minor quibble with the regression tests: should we be using
> >dollar quotes in things like this?  Doubled-up quote marks:
> >
> >SELECT xpath_exists('//town[text() = ''Cwmbran'']',
> > '<towns><town>Bidford-on-Avon</town><town>Cwmbran</town><town>Bristol</town></towns>'::xml);
> >
> >Dollar quote:
> >
> >SELECT xpath_exists($$//town[text() = 'Cwmbran']$$,
> > '<towns><town>Bidford-on-Avon</town><town>Cwmbran</town><town>Bristol</town></towns>'::xml);
> 
> Personally, I don't really see that as an improvement.  Dollar-quotes
> are really nice for longer strings, or where you would otherwise have
> quadrupled quotes (or more), but I don't see a big advantage to it
> here.  Still, it's a question of opinion more than anything else.

I like the idea of using dollar quotes, but I think they should be used
for both arguments (or neither). Using $$ for one and then shifting to
"'" for the second argument with no whitespace at all seems like the
least readable option.

Regards,
Jeff Davis




Re: [HACKERS] Parsing of aggregate ORDER BY clauses

2010-07-27 Thread Tom Lane
Robert Haas  writes:
> Am I misreading this, or did you just answer an "either-or" question with 
> "yes"?

I meant "Yes, that's an issue."

regards, tom lane



Re: [HACKERS] SSL cipher and version

2010-07-27 Thread Robert Haas
On Tue, Jul 27, 2010 at 12:06 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Mon, Jul 26, 2010 at 9:57 AM, Dave Page  wrote:
>>> On Mon, Jul 26, 2010 at 2:49 PM, Robert Haas  wrote:
 Any objections to me committing this?
>>>
>>> Might wanna fix this first:
>>>
>>> +PG_FUNCTION_INFO_V1(ssl_veresion);
>
>> Wow.  It works remarkably well without fixing that, but I'll admit
>> that does seem lucky.
>
> Well, it's got no arguments, which is the main thing that works
> differently in call protocol V1.  I think you'd find that the
> PG_RETURN_NULL case doesn't really work though ...

It seems to work, but it might be that something's broken under the hood.

Anyhow, committed with that correction.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: Review: Re: [PATCH] Re: [HACKERS] Adding xpath_exists function

2010-07-27 Thread Robert Haas
On Tue, Jul 27, 2010 at 7:33 PM, David Fetter  wrote:
>        Minor quibble with the regression tests: should we be using
>        dollar quotes in things like this?  Doubled-up quote marks:
>
>        SELECT xpath_exists('//town[text() = ''Cwmbran'']',
> '<towns><town>Bidford-on-Avon</town><town>Cwmbran</town><town>Bristol</town></towns>'::xml);
>
>        Dollar quote:
>
>        SELECT xpath_exists($$//town[text() = 'Cwmbran']$$,
> '<towns><town>Bidford-on-Avon</town><town>Cwmbran</town><town>Bristol</town></towns>'::xml);

Personally, I don't really see that as an improvement.  Dollar-quotes
are really nice for longer strings, or where you would otherwise have
quadrupled quotes (or more), but I don't see a big advantage to it
here.  Still, it's a question of opinion more than anything else.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Parsing of aggregate ORDER BY clauses

2010-07-27 Thread Robert Haas
On Tue, Jul 27, 2010 at 7:16 PM, Tom Lane  wrote:
> Daniel Grace  writes:
>>  But if we SELECT
>> SOME_INTEGER_AGGREGATE(DISTINCT floatcol ORDER BY floatcol), should
>> the DISTINCT operate on floatcol (i.e. 1.1 and 1.2 are distinct, even
>> if it means the function is called with '1' twice) or
>> floatcol::integer (1.1 and 1.2 are not distinct)?
>
> Yes.  The current implementation has the advantage that any
> unique-ifying step is guaranteed to produce outputs that are distinct
> from the point of view of the aggregate function, whereas if we try to
> keep the two operations at arms-length, then either we lose that
> property or we sort-and-unique twice :-(.

Am I misreading this, or did you just answer an "either-or" question with "yes"?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Review: Re: [PATCH] Re: [HACKERS] Adding xpath_exists function

2010-07-27 Thread David Fetter
== Submission review ==

* Is the patch in context diff format?

Yes.

* Does it apply cleanly to the current CVS HEAD?

Yes.

patch -p1 < ../xpath_exists-3.patch 
patching file doc/src/sgml/func.sgml
Hunk #1 succeeded at 8642 (offset 16 lines).
patching file src/backend/utils/adt/xml.c
patching file src/include/catalog/pg_proc.h
Hunk #1 succeeded at 4391 (offset 6 lines).
patching file src/include/utils/xml.h
patching file src/test/regress/expected/xml.out
patching file src/test/regress/sql/xml.sql

* Does it include reasonable tests, necessary doc patches, etc?

Tests:

As this is new functionality, it doesn't really need to test
much for interactions with other parts of the system.

I'm not really an XML expert, so I'd like to punt as to
whether it tests enough functionality.

Minor quibble with the regression tests: should we be using
dollar quotes in things like this?  Doubled-up quote marks:

SELECT xpath_exists('//town[text() = ''Cwmbran'']',
'<towns><town>Bidford-on-Avon</town><town>Cwmbran</town><town>Bristol</town></towns>'::xml);

Dollar quote:

SELECT xpath_exists($$//town[text() = 'Cwmbran']$$,
'<towns><town>Bidford-on-Avon</town><town>Cwmbran</town><town>Bristol</town></towns>'::xml);


Doc patches: Good, apart from cross-Atlantic differences in spelling
(speciali[sz]ed, for example).

== Usability review ==  

Read what the patch is supposed to do, and consider:

* Does the patch actually implement that? 

Yes.

* Do we want that? 

Yes.

* Do we already have it? 

Not really.  There are kludges to accomplish these things, but
they're available mostly in the sense that a general-purpose
language allows you to write code to do anything a Turing machine
can do.

* Does it follow SQL spec, or the community-agreed behavior? 

Yes.

* Does it include pg_dump support (if applicable)?

Not applicable.

* Are there dangers? 

Not that I can see.

* Have all the bases been covered?

To the extent of my XML knowledge, yes.

== Feature test ==

Apply the patch, compile it and test:

* Does the feature work as advertised?

Yes.

* Are there corner cases the author has failed to consider?

Not that I've found.  See above re: XML and my vast ignorance of
same.

* Are there any assertion failures or crashes?

No.

== Performance review ==

* Does the patch slow down simple tests? 

No.

* If it claims to improve performance, does it?

No such claim made.  The kludges needed to reproduce the
functionality would certainly consume an enormous number of
developer hours, though.

* Does it slow down other things?

Not that I've found.  There might be some minuscule slowdown due to
the existence of more code paths, but we're a long, long way from
having that be something other than noise.

== Coding review ==

Read the changes to the code in detail and consider:

* Does it follow the project 
[http://developer.postgresql.org/pgdocs/postgres/source.html coding 
guidelines]? 

Yes.

* Are there portability issues? 

Not that I can see.

* Will it work on Windows/BSD etc? 

Should do.

* Are the comments sufficient and accurate?

Yes.

* Does it do what it says, correctly?

Yes, subject to, etc.

* Does it produce compiler warnings?

No.

* Can you make it crash?

No.

== Architecture review ==

Consider the changes to the code in the context of the project as a whole:

* Is everything done in a way that fits together coherently with other 
features/modules? 

Yes.

* Are there interdependencies that can cause problems?

Not that I've found.

Cheers,
David.
On Tue, Jun 29, 2010 at 11:37:28AM +0100, Mike Fowler wrote:
> Mike Fowler wrote:
> >Bruce Momjian wrote:
> >>I have added this to the next commit-fest:
> >>
> >>https://commitfest.postgresql.org/action/commitfest_view?id=6
> >Thanks Bruce. Attached is a revised patch which changes the code
> >slightly such that it uses an older version of the libxml library.
> >I've added comments to the code so that we remember why we didn't
> >use the latest function.
> 
> After seeing some other posts in the last couple of days, I realised
> I hadn't documented the function in the SGML. I have now done so,
> and added a couple of tests with XML literals. Please find the patch
> attached. Now time to go correct the xmlexists patch too...
> 
> -- 
> Mike Fowler
> Registered Linux user: 379787
> 

> *** a/doc/src/sgml/func.sgml
> --- b/doc/src/sgml/func.sgml
> ***
> *** 8626,8631  SELECT xpath('/my:a/text()', '<my:a xmlns:my="http://example.com">test</my:a>',
> --- 8626,8664 
>   (1 row)
>   ]]>
>  
> +
> +
> + xpath_exists
> + 
> + 
> +  xpath_exists
> + 
> + 
> + 
> +  xpath_exists(xpath, 
> xml, 
> nsarray)
> + 
> + 
> + 
> +  The function xpath_exists is a specialised form
> +  of the xpath function. Though the functions are
> +  syntactically the same the xp

Re: [HACKERS] Parsing of aggregate ORDER BY clauses

2010-07-27 Thread Tom Lane
Daniel Grace  writes:
> One possible concern might be typecasts that aren't a 1:1
> representation.  While no two VARCHARs are going to produce the same
> TEXT, this is not true in other cases (1.1::float::integer and
> 1.2::float::integer both produce 1, for instance).

> Off the top of my head, I can't think of a good example where this
> would cause a problem -- it'd be easy enough to manufacture a possible
> test case, but it'd be so contrived and I don't know if it's something
> that would be seen in production code.  But if we SELECT
> SOME_INTEGER_AGGREGATE(DISTINCT floatcol ORDER BY floatcol), should
> the DISTINCT operate on floatcol (i.e. 1.1 and 1.2 are distinct, even
> if it means the function is called with '1' twice) or
> floatcol::integer (1.1 and 1.2 are not distinct)?

Yes.  The current implementation has the advantage that any
unique-ifying step is guaranteed to produce outputs that are distinct
from the point of view of the aggregate function, whereas if we try to
keep the two operations at arms-length, then either we lose that
property or we sort-and-unique twice :-(.

If memory serves, this type of consideration is also why DISTINCT and
GROUP BY are made to follow ORDER BY's choice of semantics in an
ordinary SELECT query --- you might find that surprising, but if they
weren't on the same page it could be even more surprising.

So on reflection I think that the current fix is the best one and
we don't want to reconsider it later.

regards, tom lane



Re: [HACKERS] Toward a column reorder solution

2010-07-27 Thread Andrew Dunstan



Nilson Damasceno wrote:
> Tom's message was in Dec/2006. We are in 2010.

So what? The problem hasn't changed.

> Sincerely, I'm not afraid to seem naive, but I believe that a column
> that inherits the persistent semantics of attnum solves 99.9% of the
> column-reordering problem for legacy software.

You're assuming that the only thing we want to be able to do related to
column position is to reorder columns logically. That assumption is not
correct.


(Incidentally, please don't top-answer).

cheers

andrew




Re: [HACKERS] do we need to postpone beta4?

2010-07-27 Thread Robert Haas
On Tue, Jul 27, 2010 at 6:42 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Tue, Jul 27, 2010 at 4:48 PM, Robert Haas  wrote:
>>> On Tue, Jul 27, 2010 at 4:09 PM, Tom Lane  wrote:
 Yup, that's what I think.  In fact I think September might be
 optimistic.  This is what happens when you fork early and allow
 developers to start focusing on new development instead of testing
 the release branch.
>
>>> [poorly worded protest]
>
>> Sorry - I apologize for that email.  As has been pointed out to me
>> off-list, that was too strongly worded and not constructive.  Still, I
>> don't think there is much evidence for the proposition that the
>> current delays are caused by having branched early.  I think they're
>> caused by people being out of town.
>
> Well, they're surely both contributing factors.  There's no way to run a
> controlled experiment to determine how much each one is hurting us, so
> opinions about which is worse can never be more than opinions.  I'm
> sticking with mine though, and for weak evidence will point to the
> amount of -hackers traffic about 9.1 CF items versus the amount of
> traffic about how to fix the known bugs.

I guess I'd counter by pointing out that there are half a dozen bugs
and almost 70 patches in the CommitFest.  And, again, it's not as if
bugs are sitting there being ignored for months on end.  To the
contrary, we've been largely ignoring new patches for the past five
months, but we rarely ignore bugs.  When 2 or 3 days go by without a
response to a serious bug report, people start posting messages like
"Hello? Hello? What's going on?" (there are several examples of this
in just the last week, from at least two different contributors).

> Anyway, I'm back from vacation and will start looking at those bugs as
> soon as I've caught up on email.

Thanks.  Let me know if I'm not picking up something you think I
should be looking at.  I've been attempting to stay on top of both bug
reports and the CommitFest in your absence, which has been keeping me
extremely busy, which may account for some portion of the testiness of
my previous response.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] do we need to postpone beta4?

2010-07-27 Thread Tom Lane
Robert Haas  writes:
> On Tue, Jul 27, 2010 at 4:48 PM, Robert Haas  wrote:
>> On Tue, Jul 27, 2010 at 4:09 PM, Tom Lane  wrote:
>>> Yup, that's what I think.  In fact I think September might be
>>> optimistic.  This is what happens when you fork early and allow
>>> developers to start focusing on new development instead of testing
>>> the release branch.

>> [poorly worded protest]

> Sorry - I apologize for that email.  As has been pointed out to me
> off-list, that was too strongly worded and not constructive.  Still, I
> don't think there is much evidence for the proposition that the
> current delays are caused by having branched early.  I think they're
> caused by people being out of town.

Well, they're surely both contributing factors.  There's no way to run a
controlled experiment to determine how much each one is hurting us, so
opinions about which is worse can never be more than opinions.  I'm
sticking with mine though, and for weak evidence will point to the
amount of -hackers traffic about 9.1 CF items versus the amount of
traffic about how to fix the known bugs.

Anyway, I'm back from vacation and will start looking at those bugs as
soon as I've caught up on email.

regards, tom lane



Re: [HACKERS] page corruption on 8.3+ that makes it to standby

2010-07-27 Thread Jeff Davis
On Tue, 2010-07-27 at 17:18 -0400, Robert Haas wrote:
> > My first concern with that idea was that it may create an inconsistency
> > between the primary and the standby. The primary could have a bunch of
> > zero pages that never make it to the standby.
> 
> Maybe I'm slow on the uptake here, but don't the pages start out
> all-zeroes on the standby just as they do on the primary? The only way
> it seems like this would be a problem is if a page that previously
> contained data on the primary was subsequently zeroed without writing
> a WAL record - or am I confused?

The case I was concerned about is when you have a table on the primary
with a bunch of zero pages at the end. Then you SET TABLESPACE, and none
of the copied pages (or even the fact that they exist) would be sent to
the standby, but they would exist on the primary. And later pages may
have data, so the standby may see page N but not N-1.

Generally, most of the code is not expecting to read or write past the
end of the file, unless it's doing an extension.

However, I think everything is fine during recovery, because it looks
like it's designed to create zero pages as needed. So your idea seems
safe to me, although I do still have some doubts because of my lack of
knowledge in this area; particularly hot standby conflict
detection/resolution.

My idea was different: still log the zero page, just don't set LSN or
TLI for a zero page in log_newpage() or heap_xlog_newpage(). This isn't
as clean as your idea, but I'm a little more confident that it is
correct.

Regards,
Jeff Davis





Re: [HACKERS] PostGIS vs. PGXS in 9.0beta3

2010-07-27 Thread Tom Lane
Josh Berkus  writes:
> A 9.0b3 tester reported this issue with our single most popular 
> PostgreSQL extension, PostGIS:

> ==
> Postgis makes use of 'PGXS' in postgresql > 8.2. Within postgresql-9, 
> datadir and many other variables are defined as multiple values with an 
> append operator, like this:

> $ grep -i datadir /usr/pgsql-9.0/lib/pgxs/src/Makefile.global
> [snip]
> datadir := /usr/pgsql-9.0/share

This analysis is nonsense on its face --- := is not an append operator
and we do not have any multiple values for datadir.

The referenced postgis-users thread seems to indicate that the postgis
guys found and fixed a problem in their own makefiles.  If not, we need
a clearer description of what they think the problem is.

regards, tom lane



Re: [HACKERS] Toward a column reorder solution

2010-07-27 Thread Nilson Damasceno
Andrew,

Tom's message was from Dec/2006. We are in 2010.

Sincerely, I'm not afraid to seem naive, but I believe that a column that
inherits the persistent semantics of attnum solves 99.9% of the
column-reordering problem for legacy software.

The exceptions seem to be:

1) Software that addresses buffers based on attnum. Typically core/hacker
software.

2) INSERTs that don't name the columns. This could be handled, once attnum
becomes changeable, with a server or database variable
*allow_attnum_changes* defaulting to false.


The problem Tom raised about the need for a primary key for attributes
is much the same as with the current ways of reordering columns:

a) recreate the table

b) "shift" the columns.


Nilson

On Tue, Jul 27, 2010 at 6:45 PM, Andrew Dunstan  wrote:

>
>
> Nilson wrote:
>
>> Quoting "wiki.postgresql.org/wiki/Alter_column_position":
>>
>> "The idea of allowing re-ordering of column position is not one the
>> postgresql developers are against, it is more a case where no one has
>> stepped forward to do the work."
>>
>> Well, a hard journey starts with a single step.
>>
>> Why not, in the next release that requires to run initdb, add a *attraw*
>>  column (a better name is welcome) in the catalog that stores the physical
>> position of column forever, i.e., the same semantics of *attnum*?
>>
>> Then, in a future release - 9.1 for example - the postgres team can make
>>  *attnum* changeable using something like ALTER COLUMN POSITION?
>>
>> Pros:
>>
>> - Requires only a couple of changes in main postgreSQL code. It seems to
>> be very simple.
>>
>> - Allows a smooth and decentralized rewrite of the whole code that may
>> needs the  *attraw *attribute - postgreSQL, contribs, pgAdmin, drivers,
>> tools  etc. This will give time to developers of that code to detect the
>> impact of  semantics change, make the arrangements  necessary and also allow
>> the release of production level software using the new feature before
>> *attnum *becomes changeable.
>> So, when *attnum *becomes read/write, all that software will be ready.
>>
>> Cons
>>
>> - More 4 bytes in each row of the catalog.
>>
>> Nilson
>>
>
>
> Please review the previous discussions on this. In particular, see this
> proposal from Tom Lane that I believe represents the consensus way we want
> to go on this: <
> http://archives.postgresql.org/pgsql-hackers/2006-12/msg00983.php>
>
> cheers
>
> andrew
>


Re: [HACKERS] Toward a column reorder solution

2010-07-27 Thread Joshua D. Drake
On Tue, 2010-07-27 at 17:56 -0400, Andrew Dunstan wrote:
> 
> Robert Haas wrote:
> >> Please review the previous discussions on this. In particular, see this
> >> proposal from Tom Lane that I believe represents the consensus way we want
> >> to go on this:
> >> 
> >> 
> >
> > Alvaro is planning to work on this for 9.1, I believe.
> >
> > http://archives.postgresql.org/pgsql-hackers/2010-07/msg00188.php
> >
> >   
> 
> Yay!

Correct. We are also hoping to get some sponsorship for it.

https://www.fossexperts.com/

Sincerely,

Joshua D. Drake

-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt




Re: [HACKERS] Toward a column reorder solution

2010-07-27 Thread Andrew Dunstan



Robert Haas wrote:

Please review the previous discussions on this. In particular, see this
proposal from Tom Lane that I believe represents the consensus way we want
to go on this:




Alvaro is planning to work on this for 9.1, I believe.

http://archives.postgresql.org/pgsql-hackers/2010-07/msg00188.php

  


Yay!

cheers

andrew



Re: [HACKERS] Toward a column reorder solution

2010-07-27 Thread Robert Haas
On Tue, Jul 27, 2010 at 5:45 PM, Andrew Dunstan  wrote:
>
>
> Nilson wrote:
>>
>> Quoting "wiki.postgresql.org/wiki/Alter_column_position":
>>
>> "The idea of allowing re-ordering of column position is not one the
>> postgresql developers are against, it is more a case where no one has
>> stepped forward to do the work."
>>
>> Well, a hard journey starts with a single step.
>>
>> Why not, in the next release that requires to run initdb, add a *attraw*
>>  column (a better name is welcome) in the catalog that stores the physical
>> position of column forever, i.e., the same semantics of *attnum*?
>>
>> Then, in a future release - 9.1 for example - the postgres team can make
>>  *attnum* changeable using something like ALTER COLUMN POSITION?
>>
>> Pros:
>>
>> - Requires only a couple of changes in main postgreSQL code. It seems to
>> be very simple.
>>
>> - Allows a smooth and decentralized rewrite of the whole code that may
>> needs the  *attraw *attribute - postgreSQL, contribs, pgAdmin, drivers,
>> tools  etc. This will give time to developers of that code to detect the
>> impact of  semantics change, make the arrangements  necessary and also allow
>> the release of production level software using the new feature before
>> *attnum *becomes changeable.
>> So, when *attnum *becomes read/write, all that software will be ready.
>>
>> Cons
>>
>> - More 4 bytes in each row of the catalog.
>>
>> Nilson
>
>
> Please review the previous discussions on this. In particular, see this
> proposal from Tom Lane that I believe represents the consensus way we want
> to go on this:
> 

Alvaro is planning to work on this for 9.1, I believe.

http://archives.postgresql.org/pgsql-hackers/2010-07/msg00188.php

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] do we need to postpone beta4?

2010-07-27 Thread Robert Haas
On Tue, Jul 27, 2010 at 4:48 PM, Robert Haas  wrote:
> On Tue, Jul 27, 2010 at 4:09 PM, Tom Lane  wrote:
>> Robert Haas  writes:
>>> Well, that's pretty much saying we won't release before September.
>>
>> Yup, that's what I think.  In fact I think September might be
>> optimistic.  This is what happens when you fork early and allow
>> developers to start focusing on new development instead of testing
>> the release branch.
>
> [poorly worded protest]

Sorry - I apologize for that email.  As has been pointed out to me
off-list, that was too strongly worded and not constructive.  Still, I
don't think there is much evidence for the proposition that the
current delays are caused by having branched early.  I think they're
caused by people being out of town.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Toward a column reorder solution

2010-07-27 Thread Andrew Dunstan



Nilson wrote:
Quoting "wiki.postgresql.org/wiki/Alter_column_position":


"The idea of allowing re-ordering of column position is not one the 
postgresql developers are against, it is more a case where no one has 
stepped forward to do the work."


Well, a hard journey starts with a single step.

Why not, in the next release that requires to run initdb, add a 
*attraw*  column (a better name is welcome) in the catalog that stores 
the physical position of column forever, i.e., the same semantics of 
*attnum*?


Then, in a future release - 9.1 for example - the postgres team can 
make  *attnum* changeable using something like ALTER COLUMN POSITION?


Pros:

- Requires only a couple of changes in main postgreSQL code. It seems 
to be very simple.


- Allows a smooth and decentralized rewrite of the whole code that may 
needs the  *attraw *attribute - postgreSQL, contribs, pgAdmin, 
drivers, tools  etc. 
This will give time to developers of that code to detect the impact 
of  semantics change, make the arrangements  necessary and also allow 
the release of production level software using the new feature before 
*attnum *becomes changeable.

So, when *attnum *becomes read/write, all that software will be ready.

Cons

- More 4 bytes in each row of the catalog.

Nilson



Please review the previous discussions on this. In particular, see this 
proposal from Tom Lane that I believe represents the consensus way we 
want to go on this: 



cheers

andrew



[HACKERS] Toward a column reorder solution

2010-07-27 Thread Nilson
Quoting "wiki.postgresql.org/wiki/Alter_column_position":

"The idea of allowing re-ordering of column position is not one the
postgresql developers are against, it is more a case where no one has
stepped forward to do the work."

Well, a hard journey starts with a single step.

Why not, in the next release that requires running initdb, add an *attraw*
column (a better name is welcome) to the catalog that stores the physical
position of the column forever, i.e., with the same semantics as *attnum*?

Then, in a future release - 9.1 for example - the postgres team could make
*attnum* changeable using something like ALTER COLUMN POSITION?

Pros:

- Requires only a couple of changes in the main PostgreSQL code. It seems
to be very simple.

- Allows a smooth and decentralized rewrite of all the code that may need
the *attraw* attribute - PostgreSQL, contribs, pgAdmin, drivers, tools,
etc. This will give the developers of that code time to detect the impact
of the semantics change, make the necessary arrangements, and also release
production-level software using the new feature before *attnum* becomes
changeable.
So, when *attnum* becomes read/write, all that software will be ready.

Cons:

- Four more bytes in each row of the catalog.

Nilson


Re: [HACKERS] page corruption on 8.3+ that makes it to standby

2010-07-27 Thread Robert Haas
On Tue, Jul 27, 2010 at 5:08 PM, Jeff Davis  wrote:
> On Tue, 2010-07-27 at 15:50 -0400, Robert Haas wrote:
>> > 1. Have log_newpage() and heap_xlog_newpage() only call PageSetLSN() and
>> > PageSetTLI() if the page is not new. This seems slightly awkward because
>> > most WAL replay stuff doesn't have to worry about zero pages, but in
>> > this case I think it does.
>> >
>> > 2. Have copy_relation_data() initialize new pages. I don't like this
>> > because (a) it's not really the job of SET TABLESPACE to clean up zero
>> > pages; and (b) it could be an index with different special size, etc.,
>> > and it doesn't seem like a good place to figure that out.
>>
>> It appears to me that all of the callers of log_newpage() other than
>> copy_relation_data() do so with pages that they've just constructed,
>> and which therefore can't be new.  So maybe we could just modify
>> copy_relation_data to check PageIsNew(buf), or something like that,
>> and only call log_newpage() if that returns true.
>
> My first concern with that idea was that it may create an inconsistency
> between the primary and the standby. The primary could have a bunch of
> zero pages that never make it to the standby.

Maybe I'm slow on the uptake here, but don't the pages start out
all-zeroes on the standby just as they do on the primary? The only way
it seems like this would be a problem is if a page that previously
contained data on the primary was subsequently zeroed without writing
a WAL record - or am I confused?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] page corruption on 8.3+ that makes it to standby

2010-07-27 Thread Jeff Davis
On Tue, 2010-07-27 at 15:50 -0400, Robert Haas wrote:
> > 1. Have log_newpage() and heap_xlog_newpage() only call PageSetLSN() and
> > PageSetTLI() if the page is not new. This seems slightly awkward because
> > most WAL replay stuff doesn't have to worry about zero pages, but in
> > this case I think it does.
> >
> > 2. Have copy_relation_data() initialize new pages. I don't like this
> > because (a) it's not really the job of SET TABLESPACE to clean up zero
> > pages; and (b) it could be an index with different special size, etc.,
> > and it doesn't seem like a good place to figure that out.
> 
> It appears to me that all of the callers of log_newpage() other than
> copy_relation_data() do so with pages that they've just constructed,
> and which therefore can't be new.  So maybe we could just modify
> copy_relation_data to check PageIsNew(buf), or something like that,
> and only call log_newpage() if that returns true.

My first concern with that idea was that it may create an inconsistency
between the primary and the standby. The primary could have a bunch of
zero pages that never make it to the standby.

However, it looks like all WAL recovery stuff passes true for "init"
when calling XLogReadBuffer(), so I think it's safe.

I'll test it and submit a patch.

Regards,
Jeff Davis




Re: [HACKERS] do we need to postpone beta4?

2010-07-27 Thread Robert Haas
On Tue, Jul 27, 2010 at 4:09 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> Well, that's pretty much saying we won't release before September.
>
> Yup, that's what I think.  In fact I think September might be
> optimistic.  This is what happens when you fork early and allow
> developers to start focusing on new development instead of testing
> the release branch.

I call bullshit.  The six items in the "code" section of the open
items list were reported 14, 5, 5, 1, 27, and 0 days ago.  The 27-day
old item is purely cosmetic and there's absolutely zero evidence that
Simon hasn't done it yet because he's been busy working on 9.1
development.  It's much more likely that he hasn't gotten around to
taking care of that (and his outstanding 9.1 patch) because he's been
busy with everything else in his life other than pgsql-hackers.  The
remaining items have an average age of precisely 5 days, which hardly
sounds like we've been sitting on our hands, especially when you
consider that both you and Heikki have been on vacation for longer
than that.  And it's not as if I haven't been following those issues,
either.  Had you and Heikki and Peter fallen down a well more or less
permanently, I would have patched about half of those bugs by now.
The fact that I haven't done so is not because I'm busy working on 9.1
development, but because I respect your expertise and wish to have the
benefit of it so as to reduce the chances that I will break things,
or, for that matter, fix them in a way that's adequate but not the one
you happen to prefer.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Copy path in Dynamic programming

2010-07-27 Thread Tom Lane
Robert Haas  writes:
> On Thu, Jul 22, 2010 at 12:38 PM, vamsi krishna
>  wrote:
>> if lev=5, and let's say there are two combinations setA = {1,2,3,4,5} and
>> setB = {6,7,8,9,10}.
>> 
>> I want to reuse the plan of {1,2,3,4,5} for {6,7,8,9,10}.

> I don't think that makes any sense.

Yeah.  The theoretical problem is that substituting setB for setA could
change the estimated rowcounts, and thus you should not use the same
path.  The practical problem is that the Path datastructure doesn't
store the specific quals to be used, in general --- it relies on the
RelOptInfo structures to remember which quals need to be checked during
a scan.  So you can't just "copy the path and substitute some other
qual".  You could maybe do it if you copied the entire planner
workspace, but that's not too practical from a memory consumption
standpoint.  Not to mention that a lot of the node types in question
don't have copyfuncs.c support, which is only partly laziness --- it's
also the case that copyObject() is incapable of coping with circular
linkages, and the planner data structures are full of those.

regards, tom lane



Re: [HACKERS] do we need to postpone beta4?

2010-07-27 Thread Tom Lane
Robert Haas  writes:
> Well, that's pretty much saying we won't release before September.

Yup, that's what I think.  In fact I think September might be
optimistic.  This is what happens when you fork early and allow
developers to start focusing on new development instead of testing
the release branch.

> Which is kind of a bummer, but I guess that's what happens when we get
> into vacation season.

Yes.  If we were at full strength maybe August would be make-able, but
there are too many people on vacation right now, and way too many
distractions to boot.

In any case, now that 9.0 is branched there is not any
project-scheduling reason why the final release needs to happen any
particular time.  I think we need to fall back on our traditional mantra
"we'll release it when it's ready" rather than fret about whether it's
August or September or whatever.

regards, tom lane



Re: [HACKERS] do we need to postpone beta4?

2010-07-27 Thread Robert Haas
On Tue, Jul 27, 2010 at 3:53 PM, Bruce Momjian  wrote:
>> Well, that's pretty much saying we won't release before September.
>> Which is kind of a bummer, but I guess that's what happens when we get
>> into vacation season.
>
> Yeah, if we are lucky we can do RC1 in mid-August and still release
> final in August, but that assumes everything is done by then, and that
> we only need 1-2 RCs.  Early September looks more likely.  The good
> thing is that we are deciding now and are continually seeing if we can
> tighten the schedule.  (If we stop trying to tighten the schedule, we
> get a loose schedule, which no one likes.)

I am assuming that if we release beta4 with known bugs, beta5 in
mid-August is inevitable.  And we're going to need at least a couple
of weeks after beta5 before RC1, even assuming no new issues come up.
ISTM that if everything goes well we can expect to release in
*mid*-September.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] patch (for 9.1) string functions

2010-07-27 Thread Pavel Stehule
2010/7/26 Robert Haas :
> On Mon, Jul 26, 2010 at 2:09 PM, Pavel Stehule  
> wrote:
>> I understand, but with only text accepting, then CONCAT will has much
>> less benefit - you can't do a numeric list, for example
>>
>> see concat(c1::text, ',', c2::text, ',' ...)
>>
>> with this is much simpler use a pipes '' || c1 || ',' || c2 ... and
>> this operator does necessary cast self.
>>
>> This function is probably one use case of exception from our rules.
>
> At least two, right?  Because for that use case you'd probably want
> concat_ws().

sorry, yes

Pavel

> In fact, it's hard for me to think of a variadic text
> function where you wouldn't want the "no casts" behavior you're
> getting via ANY.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>



Re: [HACKERS] patch (for 9.1) string functions

2010-07-27 Thread Robert Haas
On Mon, Jul 26, 2010 at 3:49 PM, Merlin Moncure  wrote:
> concat() is not a variadic text function. it is variadic "any" that
> happens to do text coercion (not casting) inside the function. The
> the assumption that concat is casting internally is probably wrong.
> Suppose I had hacked the int->text cast to call a custom function -- I
> would very much expect concat() not to produce output from that
> function, just the vanilla output text (I could always force the cast
> if I wanted to).
>
> concat is just a function that does something highly similar to
> casting.  suppose I had a function count_memory(variadic "any") that
> summed memory usage of input args -- forcing casts would make no sense
> in that context (I'm not suggesting that you think so -- just bringing
> up a case that illustrates how forcing cast into the function can
> change behavior in subtle ways).

Right, but I already said I wasn't objecting to the use of variadic
ANY in cases like that - only in cases where, as here, you were
basically taking any old arguments and forcing them all to text.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] patch (for 9.1) string functions

2010-07-27 Thread Robert Haas
On Mon, Jul 26, 2010 at 2:09 PM, Pavel Stehule  wrote:
> I understand, but with only text accepting, then CONCAT will has much
> less benefit - you can't do a numeric list, for example
>
> see concat(c1::text, ',', c2::text, ',' ...)
>
> with this is much simpler use a pipes '' || c1 || ',' || c2 ... and
> this operator does necessary cast self.
>
> This function is probably one use case of exception from our rules.

At least two, right?  Because for that use case you'd probably want
concat_ws().  In fact, it's hard for me to think of a variadic text
function where you wouldn't want the "no casts" behavior you're
getting via ANY.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] patch (for 9.1) string functions

2010-07-27 Thread Merlin Moncure
On Mon, Jul 26, 2010 at 3:07 PM, Robert Haas  wrote:
> On Mon, Jul 26, 2010 at 2:09 PM, Pavel Stehule  
> wrote:
>> I understand, but with only text accepting, then CONCAT will has much
>> less benefit - you can't do a numeric list, for example
>>
>> see concat(c1::text, ',', c2::text, ',' ...)
>>
>> with this is much simpler use a pipes '' || c1 || ',' || c2 ... and
>> this operator does necessary cast self.
>>
>> This function is probably one use case of exception from our rules.
>
> At least two, right?

correct: there would be at least two.

> Because for that use case you'd probably want
> concat_ws().  In fact, it's hard for me to think of a variadic text
> function where you wouldn't want the "no casts" behavior you're
> getting via ANY.

concat() is not a variadic text function. it is variadic "any" that
happens to do text coercion (not casting) inside the function.  The
assumption that concat is casting internally is probably wrong.
Suppose I had hacked the int->text cast to call a custom function -- I
would very much expect concat() not to produce output from that
function, just the vanilla output text (I could always force the cast
if I wanted to).

concat is just a function that does something highly similar to
casting.  suppose I had a function count_memory(variadic "any") that
summed memory usage of input args -- forcing casts would make no sense
in that context (I'm not suggesting that you think so -- just bringing
up a case that illustrates how forcing cast into the function can
change behavior in subtle ways).

merlin



Re: [HACKERS] patch (for 9.1) string functions

2010-07-27 Thread Pavel Stehule
2010/7/26 Robert Haas :
> On Mon, Jul 26, 2010 at 11:39 AM, Merlin Moncure  wrote:
>>> I'm just very skeptical that we should give our functions argument
>>> types that are essentially fantasy.  CONCAT() doesn't concatenate
>>> integers or intervals or boxes: it concatenates strings, and only
>>> strings.  Surely the right fix, if our casting rules are too
>>> restrictive, is to fix the casting rules; not to start adding a bunch
>>> of kludgery in every function we define.
>>>
>>> The problem with your canonical example (and the other examples of
>>> this type I've seen) is that they are underspecified.  Given two
>>> identically-named operators, one of which accepts types T1 and T2, and
>>> the other of which accepts types T3 and T4, what is one to do with T1
>>> OP T4?  You can cast T1 to T3 and call the first operator or you can
>>> cast T4 to T2 and call the second one, and it's really not clear which
>> is right, so you had better throw an error.  The same applies to
>>> functions: given foo(text) and foo(date) and the call
>>> foo('2010-04-15'), you had better complain, or you may end up with a
>>> very sad user.  But sometimes our current casting rules require casts
>>> in situations where they don't seem necessary, such as
>> LPAD(integer_column, 5, '0').  That's not ambiguous because there's
>>> only one definition of LPAD, and the chances that the user will be
>>> unpleasantly surprised if you call it seem quite low.
>>>
>>> In other words, this problem is not unique to CONCAT().
>>
>> shoot, can't resist :-).
>>
>> Are the casting rules broken? If you want to do lpad w/o casts for
>> integers, you can define it explicitly -- feature, not bug.  You can
>> basically do this for any function with fixed arguments -- either in
>> userland or core.  lpad(int) actually introduces a problem case with
>> the minus sign possibly requiring application specific intervention,
>> so things are probably correct exactly as they are.
>
> Huh?  If you're arguing that LPAD should require a cast on an integer
> argument because the defined behavior of the function might not be
> what someone wants, then apparently explicit casts should be required
> in all cases.  If you're arguing that I can eliminate the need for an
> explicit cast by overloading LPAD(), I agree with you, but that's not
> a feature.
>
>> ISTM you are unhappy with the "any" variadic mechanism in general, not
>> the casting rules.
>
> No, I'm unhappy with the use of "any" to make an end-run around the
> casting rules.  If you're writing a function that operates on an
> argument of any type, like pg_sizeof() - or if you're writing a
> function that does something like append two arrays of unspecified but
> identical type or, perhaps, search an array of unspecified type for an
> element of that same type - or if you're writing a function where the
> types of the arguments can't be known in advance, like printf(), well,
> then any is what you need.  But the only argument anyone has put
> forward for making CONCAT() accept ANY instead of TEXT is that it
> might require casting otherwise.  My response to that is "well then
> you have to cast it, or fix the casting rules".

I understand, but if it accepts only text, then CONCAT will have much
less benefit - you can't do a numeric list, for example:

see concat(c1::text, ',', c2::text, ',' ...)

Compared with this, it is much simpler to use pipes: '' || c1 || ',' || c2
..., since that operator does the necessary casts itself.

This function is probably one use case for an exception from our rules.

Regards

Pavel Stehule
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>



Re: [HACKERS] patch (for 9.1) string functions

2010-07-27 Thread Robert Haas
On Mon, Jul 26, 2010 at 11:39 AM, Merlin Moncure  wrote:
>> I'm just very skeptical that we should give our functions argument
>> types that are essentially fantasy.  CONCAT() doesn't concatenate
>> integers or intervals or boxes: it concatenates strings, and only
>> strings.  Surely the right fix, if our casting rules are too
>> restrictive, is to fix the casting rules; not to start adding a bunch
>> of kludgery in every function we define.
>>
>> The problem with your canonical example (and the other examples of
>> this type I've seen) is that they are underspecified.  Given two
>> identically-named operators, one of which accepts types T1 and T2, and
>> the other of which accepts types T3 and T4, what is one to do with T1
>> OP T4?  You can cast T1 to T3 and call the first operator or you can
>> cast T4 to T2 and call the second one, and it's really not clear which
>> is right, so you had better throw an error.  The same applies to
>> functions: given foo(text) and foo(date) and the call
>> foo('2010-04-15'), you had better complain, or you may end up with a
>> very sad user.  But sometimes our current casting rules require casts
>> in situations where they don't seem necessary, such as
>> LPAD(integer_column, 5, '0').  That's not ambiguous because there's
>> only one definition of LPAD, and the chances that the user will be
>> unpleasantly surprised if you call it seem quite low.
>>
>> In other words, this problem is not unique to CONCAT().
>
> shoot, can't resist :-).
>
> Are the casting rules broken? If you want to do lpad w/o casts for
> integers, you can define it explicitly -- feature, not bug.  You can
> basically do this for any function with fixed arguments -- either in
> userland or core.  lpad(int) actually introduces a problem case with
> the minus sign possibly requiring application specific intervention,
> so things are probably correct exactly as they are.

Huh?  If you're arguing that LPAD should require a cast on an integer
argument because the defined behavior of the function might not be
what someone wants, then apparently explicit casts should be required
in all cases.  If you're arguing that I can eliminate the need for an
explicit cast by overloading LPAD(), I agree with you, but that's not
a feature.

> ISTM you are unhappy with the "any" variadic mechanism in general, not
> the casting rules.

No, I'm unhappy with the use of "any" to make an end-run around the
casting rules.  If you're writing a function that operates on an
argument of any type, like pg_sizeof() - or if you're writing a
function that does something like append two arrays of unspecified but
identical type or, perhaps, search an array of unspecified type for an
element of that same type - or if you're writing a function where the
types of the arguments can't be known in advance, like printf(), well,
then any is what you need.  But the only argument anyone has put
forward for making CONCAT() accept ANY instead of TEXT is that it
might require casting otherwise.  My response to that is "well then
you have to cast it, or fix the casting rules".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] patch (for 9.1) string functions

2010-07-27 Thread Pavel Stehule
>
> so the "any" datatype is the last possibility - a workaround for custom functions.

Probably the most correct implementation of CONCAT would require parser
changes - then you could use an internal concat function with text-only
parameters. VARIADIC with "any" is just a workaround, but it is enough.

Regards

Pavel



>> The problem with your canonical example (and the other examples of
>> this type I've seen) is that they are underspecified.  Given two
>> identically-named operators, one of which accepts types T1 and T2, and
>> the other of which accepts types T3 and T4, what is one to do with T1
>> OP T4?  You can cast T1 to T3 and call the first operator or you can
>> cast T4 to T2 and call the second one, and it's really not clear which
>> is right, so you had better throw an error.  The same applies to
>> functions: given foo(text) and foo(date) and the call
>> foo('2010-04-15'), you had better complain, or you may end up with a
>> very sad user.  But sometimes our current casting rules require casts
>> in situations where they don't seem necessary, such as
>> LPAD(integer_column, '0', 5).  That's not ambiguous because there's
>> only one definition of LPAD, and the chances that the user will be
>> unpleasantly surprised if you call it seem quite low.
>>
>> In other words, this problem is not unique to CONCAT().
>>
>> --
>> Robert Haas
>> EnterpriseDB: http://www.enterprisedb.com
>> The Enterprise Postgres Company
>>
>



Re: [HACKERS] patch (for 9.1) string functions

2010-07-27 Thread Pavel Stehule
2010/7/26 Robert Haas :
> On Mon, Jul 26, 2010 at 10:39 AM, Merlin Moncure  wrote:
>> On Mon, Jul 26, 2010 at 9:26 AM, Robert Haas  wrote:
>>> On Mon, Jul 26, 2010 at 9:10 AM, Merlin Moncure  wrote:
> CONCAT('foo', NULL) => 'foo' really the behavior that everyone else
> implements here?  And why does CONCAT() take a variadic "ANY"
> argument?  Shouldn't that be variadic TEXT?

 What does that accomplish, besides forcing you to sprinkle every
 concat call with text casts (maybe that's not a bad thing?)?
>>>
>>> You could ask the same thing about the existing || operator.  And in
>>> fact, we used to have that behavior.  We changed it in 8.3.  Perhaps
>>> that was a good decision and perhaps it wasn't, but I don't think
>>> using CONCAT() to make an end-run around that decision is the way to
>>> go.
>>
>> It was absolutely a good decision because it prevented type inference
>> in ways that were ambiguous or surprising (for a canonical case see:
>> http://www.mail-archive.com/pgsql-gene...@postgresql.org/msg93224.html).
>>
>> || operator is still pretty tolerant in the 8.3+ world.
>> select interval_col || bool_col; -- error
>> select interval_col::text || bool_col; -- text concatenation
>> select text_col || interval_col || bool_col; -- text concatenation
>>
>> variadic text would require text casts on EVERY non 'unknown' argument
>> which drops it below the threshold of usefulness IMO -- it would be
>> far stricter than vanilla || concatenation.  Ok, pure bikeshed here
>> (shutting my trap now!), but concat() is one of those wonder functions
>> that you want to make as usable and terse as possible.  I don't see
>> the value in making it overly strict.
>
> I'm just very skeptical that we should give our functions argument
> types that are essentially fantasy.  CONCAT() doesn't concatenate
> integers or intervals or boxes: it concatenates strings, and only
> strings.  Surely the right fix, if our casting rules are too
> restrictive, is to fix the casting rules; not to start adding a bunch
> of kludgery in every function we define.
>

The rules are probably correct. The problem is in the function-lookup
algorithm - it is too simple (and fast: just a single pass). Some
exceptions like COALESCE are handled specially at the parser level; we
cannot enforce that kind of casting at the user level. PostgreSQL is
stricter with timestamps and dates than other databases. Sometimes you
have to write explicit casts even though the desired datatype is clear
from the context -

SELECT date_trunc('day', current_date) - the result is a timestamp, but
from the context it is clear that the result should be a date ... When I
proposed a parser hook I was thinking about these functions. With that
hook, you could enforce any necessary casting, just as some built-in
functions do.

so the "any" datatype is the last possibility - a workaround for custom functions.

> The problem with your canonical example (and the other examples of
> this type I've seen) is that they are underspecified.  Given two
> identically-named operators, one of which accepts types T1 and T2, and
> the other of which accepts types T3 and T4, what is one to do with T1
> OP T4?  You can cast T1 to T3 and call the first operator or you can
> cast T4 to T2 and call the second one, and it's really not clear which
> is right, so you had better throw an error.  The same applies to
> functions: given foo(text) and foo(date) and the call
> foo('2010-04-15'), you had better complain, or you may end up with a
> very sad user.  But sometimes our current casting rules require casts
> in situations where they don't seem necessary, such as
> LPAD(integer_column, '0', 5).  That's not ambiguous because there's
> only one definition of LPAD, and the chances that the user will be
> unpleasantly surprised if you call it seem quite low.
>
> In other words, this problem is not unique to CONCAT().
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>



[HACKERS] Improper usage of MemoryContext in nodeSubplan.c for buildSubPlanHash() function. This maybe causes allocate memory failed.

2010-07-27 Thread Tao Ma

Hi all,

There is a potential memory error in nodeSubplan.c / execGrouping.c. Use
select '1'::TEXT IN ((SELECT '1'::NAME) UNION ALL SELECT '1'::NAME);
to reproduce the bug.

In a build configured with --enable-cassert, inspect the memory that slot1's
tts_values[0] points to before and after the statement
MemoryContextReset(evalContext) in execTuplesMatch(). You will find that the
memory has been set to 0x7f7f7f7f.

Here is my code analysis:
The executor node SubPlanState has an ExprContext named 'innerecontext',
which it creates for itself in ExecInitSubPlan(). The 'innerecontext' is also
used to build a projection info, 'projRight', for the SubPlanState node.

When the SubPlanState node is executed, a hash table for the subplan is
built, so buildSubPlanHash() is called. In that function, the 'hashtable' of
SubPlanState is initialized by calling BuildTupleHashTable(), and the code
passes the 'ecxt_per_tuple_memory' of 'innerecontext' to
BuildTupleHashTable() as the hashtable's tempcxt.

At this point we can conclude that 'projRight' and the hashtable's 'tempcxt'
both use the same memory context, namely the 'innerecontext' of the
SubPlanState node. So:
1) The memory of all tuples fetched from 'projRight' is allocated in the
'innerecontext' of the SubPlanState node.
2) All other intermediate results allocated in the hashtable's tempcxt in
fact also reside in the 'innerecontext' of the SubPlanState node.

Now next:
In buildSubPlanHash(), we fetch tuples from 'projRight' to fill the hash
table of the SubPlanState node. As noted above, all the fetched tuples are
allocated in the 'innerecontext'. While filling the tuple hash table, if a
tuple already exists in the hash table, the match function execTuplesMatch()
is called to compare tuples; in that function, the statement:
MemoryContextReset(evalContext);
resets 'evalContext' to free some memory. But 'evalContext' is the same
context as the 'innerecontext' in which the fetched tuples are allocated, so
this statement actually resets the 'innerecontext', and the memory of all
the fetched tuples is lost. That's the problem.

In fact, this error shows up only in debug builds, not in release builds:
when configured with --enable-cassert, freed memory is overwritten with
0x7f7f7f7f, but in release builds it is not. PostgreSQL does not always
report an error for this invalid memory use, because with the pattern
0x7f7f7f7f the tests in the macros VARATT_IS_COMPRESSED() and
VARATT_IS_EXTERNAL() are always false on !WORDS_BIGENDIAN platforms such as
i386 GNU/Linux, while on WORDS_BIGENDIAN platforms such as AIX (POWER6,
AIX V6.1) VARATT_IS_COMPRESSED() is true and an error may be reported.

To fix this problem, we could pass a different memory context to
BuildTupleHashTable() as the hashtable's tempcxt, use some other memory
context as the hash table's tempcxt, or find another way.
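The aliasing described above can be modelled in a few lines. This is a toy sketch in plain Python, not PostgreSQL's actual MemoryContext API: a context stands in for a palloc arena, and the bug is wiring the hash table's temp context to the very arena that holds the fetched tuples.

```python
class MemoryContext:
    """Toy stand-in for a palloc arena: allocations live until reset()."""
    def __init__(self):
        self.chunks = []

    def alloc(self, value):
        self.chunks.append(value)
        return value

    def reset(self):
        # Everything allocated in this context is invalidated at once.
        self.chunks.clear()

# Buggy wiring: the hash table's tempcxt *is* the inner context, so
# resetting temporary storage also frees the fetched tuples.
inner = MemoryContext()            # models 'innerecontext'
tuple_a = inner.alloc("tuple A")   # a tuple fetched from projRight
tempcxt = inner                    # hashtable's tempcxt aliases it
tempcxt.reset()                    # execTuplesMatch resets evalContext
assert tuple_a not in inner.chunks  # the fetched tuple's memory is gone

# Proposed fix: give the hash table its own short-lived context.
inner_fixed = MemoryContext()
tempcxt_fixed = MemoryContext()
tuple_b = inner_fixed.alloc("tuple B")
tempcxt_fixed.reset()               # resetting temp storage is now safe
assert tuple_b in inner_fixed.chunks  # the fetched tuple survives
```

The model only captures the ownership relationship, but it shows why passing a separate context as tempcxt removes the failure mode.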


Any questions, please contact me.
Regards






Re: [HACKERS] multibyte charater set in levenshtein function

2010-07-27 Thread Alexander Korotkov
Here is a new version of my patch. It contains the following changes:

1) I've merged the single-byte and multibyte versions of levenshtein_internal
and levenshtein_less_equal_internal using macros and includes.
2) I found that levenshtein takes reasonable time even for long strings.
Here is an example with strings that contain random combinations of words.

test=# select count(*) from words;
 count
---
 98569
(1 row)

test=# select * from words limit 10;
  id   |                                        word                                         | next_id
-------+-------------------------------------------------------------------------------------+---------
   200 | Agnew diodes drilled composts Wassermann nonprofit adjusting Chautauqua            |   17370
  4589 | Dhaka craziness savvies teenager ploughs Barents's unwed zither                    |   70983
  5418 | Eroses gauchos bowls smellier weeklies tormentors cornmeal punctuality             |   96685
  6103 | G prompt boron overthrew cricking begonia snuffers blazed                          |   34518
  4707 | Djibouti Mari pencilling approves pointing's pashas magpie rams                    |   71200
 10125 | Lyle Edgardo livers gruelled capable cancels gnawing's inhospitable                |   28018
  9615 | Leos deputy evener yogis assault palatals Octobers surceased                       |   21878
 15640 | Starr's arrests malapropism Koufax's aesthetes Howell rustier Algeria              |   19316
 15776 | Sucre battle's misapprehension's predicating nutmegged electrification bowl planks |   65739
 16878 | Updike Forster ragout keenly exhalation grayed switch guava's                      |   43013
(10 rows)

Time: 26,125 ms

Time: 137,061 ms
test=# select avg(length(word)) from words;
 avg
-
 74.6052207083362923
(1 row)

Time: 96,415 ms
test=# select * from words where levenshtein(word, 'Dhaka craziness savvies
teeae luhs Barentss unwe zher')<=10;
  id  |                              word                               | next_id
------+-----------------------------------------------------------------+---------
 4589 | Dhaka craziness savvies teenager ploughs Barents's unwed zither |   70983
(1 row)

Time: 2957,426 ms
test=# select * from words where levenshtein_less_equal(word, 'Dhaka
craziness savvies teeae luhs Barentss unwe zher',10)<=10;
  id  |                              word                               | next_id
------+-----------------------------------------------------------------+---------
 4589 | Dhaka craziness savvies teenager ploughs Barents's unwed zither |   70983
(1 row)

Time: 84,755 ms

It takes O(max(n, m) * max_d / min(ins_c, del_c)) time in the worst case.
That's why I changed the limitation of this function to
max_d <= 255 * min(ins_c, del_c)
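The max_d cutoff can be sketched as follows. This is a simplified Python model of the idea, not the patch's C code: rows of the usual dynamic-programming table are computed normally, but once every cell in a row exceeds max_d the final distance must exceed it too, so the loop can stop early.

```python
def levenshtein_less_equal(a: str, b: str, max_d: int,
                           ins_c: int = 1, del_c: int = 1,
                           sub_c: int = 1) -> int:
    """Levenshtein distance with an early-exit bound max_d."""
    m, n = len(a), len(b)
    prev = [j * ins_c for j in range(n + 1)]   # row for empty prefix of a
    for i in range(1, m + 1):
        cur = [i * del_c] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else sub_c
            cur[j] = min(prev[j] + del_c,      # delete a[i-1]
                         cur[j - 1] + ins_c,   # insert b[j-1]
                         prev[j - 1] + cost)   # substitute / match
        if min(cur) > max_d:
            return max_d + 1  # whole row over the bound: give up early
        prev = cur
    return prev[n]
```

For example, `levenshtein_less_equal("kitten", "sitting", 10)` returns 3, while a pair whose distance exceeds max_d returns max_d + 1 as soon as an entire row crosses the bound.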

3) I found that there is no need to store character offsets in
levenshtein_less_equal_internal_mb. Instead, we can store the first position
in the string where the distance value was not greater than max_d.

4) I found that there is a theoretical maximum distance based on the string
lengths; it represents the case where no characters match. If the
theoretical maximum distance is less than the given maximum distance, we can
use the theoretical maximum instead. This improves the behavior of
levenshtein_less_equal with a high max_d, but it is still slower than
levenshtein:

test=# select * from words where levenshtein_less_equal(word, 'Dhaka
craziness savvies teeae luhs Barentss unwe zher',1000)<=10;
  id  |                              word                               | next_id
------+-----------------------------------------------------------------+---------
 4589 | Dhaka craziness savvies teenager ploughs Barents's unwed zither |   70983
(1 row)

Time: 4102,731 ms
test=# select * from words where levenshtein(word, 'Dhaka craziness savvies
teeae luhs Barentss unwe zher')<=10;
  id  |                              word                               | next_id
------+-----------------------------------------------------------------+---------
 4589 | Dhaka craziness savvies teenager ploughs Barents's unwed zither |   70983
(1 row)

Time: 2983,244 ms


With best regards,
Alexander Korotkov.


fuzzystrmatch-0.5.tar.gz
Description: GNU Zip compressed data



Re: [HACKERS] do we need to postpone beta4?

2010-07-27 Thread Bruce Momjian
Robert Haas wrote:
> On Tue, Jul 27, 2010 at 2:11 PM, Tom Lane  wrote:
> > Robert Haas  writes:
> >> I think we should consider postponing beta4.  I count eleven
> >> non-documentation, 9.0-specific bug fixes on REL9_0_STABLE, but there
> >> are currently five items on the open 9.0 issues list, at least one of
> >> which appears to be a new bug in 9.0, and we just got a bug report on
> >> pgsql-bugs from Valentine Gogichashvili complaining of what looks to
> >> be a crasher in the redo path for heap_xlog_update().  It seems
> >> unlikely at this point that we can have all of these issues fixed and
> >> still have time for a full buildfarm cycle before the wrap.  Does it
> >> make sense to put out a beta with known bugs (presumably requiring
> >> another beta) at this point, or should we push this off a bit?
> >
> > If we don't wrap a beta this week, the next possible window is several
> > weeks away, because various people will be on vacation.  So I think we
> > should get the existing fixes out there, even if there are known bugs
> > remaining.  A beta is not an RC.
> 
> Well, that's pretty much saying we won't release before September.
> Which is kind of a bummer, but I guess that's what happens when we get
> into vacation season.

Yeah, if we are lucky we can do RC1 in mid-August and still release
final in August, but that assumes everything is done by then, and that
we only need 1-2 RCs.  Early September looks more likely.  The good
thing is that we are deciding now and are continually seeing if we can
tighten the schedule.  (If we stop trying to tighten the schedule, we
get a loose schedule, which no one likes.)

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] page corruption on 8.3+ that makes it to standby

2010-07-27 Thread Robert Haas
On Tue, Jul 27, 2010 at 2:06 PM, Jeff Davis  wrote:
> I reported a problem here:
>
> http://archives.postgresql.org/pgsql-bugs/2010-07/msg00173.php
>
> Perhaps I used a poor subject line, but I believe it's a serious issue.
> That reproducible sequence seems like an obvious bug to me on 8.3+, and
> what's worse, the corruption propagates to the standby as I found out
> today (through a test, fortunately).

I think that the problem is not so much your choice of subject line as
your misfortune to discover this bug when Tom and Heikki were both on
vacation.

> The only mitigating factor is that it doesn't actually lose data, and
> you can fix it (I believe) with zero_damaged_pages (or careful use of
> dd).
>
> There are two fixes that I can see:
>
> 1. Have log_newpage() and heap_xlog_newpage() only call PageSetLSN() and
> PageSetTLI() if the page is not new. This seems slightly awkward because
> most WAL replay stuff doesn't have to worry about zero pages, but in
> this case I think it does.
>
> 2. Have copy_relation_data() initialize new pages. I don't like this
> because (a) it's not really the job of SET TABLESPACE to clean up zero
> pages; and (b) it could be an index with different special size, etc.,
> and it doesn't seem like a good place to figure that out.

It appears to me that all of the callers of log_newpage() other than
copy_relation_data() do so with pages that they've just constructed,
and which therefore can't be new.  So maybe we could just modify
copy_relation_data() to check PageIsNew(buf), or something like that,
and only call log_newpage() if that returns false.
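The proposed guard can be illustrated with a toy model (plain Python, not PostgreSQL's actual page or WAL APIs): a page that is still all zeros has never been initialized, and the idea is to skip WAL-logging such pages so replay never stamps an LSN onto an otherwise-empty page.

```python
PAGE_SIZE = 8192  # BLCKSZ in a stock build

def page_is_new(buf: bytes) -> bool:
    # Mirrors the spirit of PageIsNew(): an all-zero page was
    # never initialized.
    return not any(buf)

def copy_relation_data(pages, wal_log):
    """Toy copy loop: WAL-log only pages that have real content."""
    for buf in pages:
        if not page_is_new(buf):
            wal_log(buf)  # stand-in for log_newpage()
```

Under this sketch, copying a relation whose last segment contains a zero page simply skips the WAL record for that page, leaving it zero on the standby as well.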

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company




Re: [HACKERS] Preliminary review of Synchronous Replication patches

2010-07-27 Thread Yeb Havinga

Kevin Grittner wrote:

Unless there are objections, I will mark the patch by Zoltán
Böszörményi as Returned with Feedback in a couple days, and ask that
everyone interested in this feature focus on advancing the patch by
Fujii Masao.  Given the scope and importance of this area, I think
we could benefit from another person or two signing on officially as
Reviewers.
  

Hello Zoltán, Fujii, Kevin and list,

Here follows a severely time-boxed review of both synchronous replication
patches. Expect this review to be incomplete, though I believe the
general remarks to be valid. Off-list I've received word from Zoltan
that work on a new patch is planned. It would be great if ideas from
both patches could be merged into one.

regards,
Yeb Havinga


Patch A: Zoltán Böszörményi
Patch B: Fujii Masao

* Is the patch in context diff format?
A: Yes
B: Yes

* Does it apply cleanly to the current CVS HEAD?
A: No
B: Yes

* Does it include reasonable tests, necessary doc patches, etc?
A: Doc patches, and a contrib module for debugging purposes.
B: Doc patches.

For neither patch is the documentation final; see e.g.
* 25.2 - The section starting with "It should be noted that the log
shipping is asynchronous, i.e.," should be updated.
* 25.2.5 - Ditto for "Streaming replication is asynchronous, so there..."

Testing any replication setup is not possible with the current
regression tool suite, and adding that is beyond the scope of any
synchronous replication patch. Perhaps the work of Markus Wanner with
dtester (that adds a make dcheck) can be assimilated for this purpose.

Both patches add synchronous replication to the current single-master
streaming replication features in 9.0, which means that there now is a
mechanism where a commit on the master does not finish until 'some or
more of the replicas are in sync'

* Does the patch actually implement that?
A: Yes
B: No; if no standbys are connected, commits do not wait. (If I
understand the code in WaitXLogSend correctly.)

* Do we want that?
For database users who never want to lose a single committed record,
synchronous replication is a relatively easy setup for achieving a high
level of 'security' (where secure means: less chance of losing data).
The easiness is relative to current solutions (the whole range of
mirrored disks, controllers, backups, log shipping, etc.).

* Do we already have it?
No

* Does it follow SQL spec, or the community-agreed behavior?
A: Unknown, though the choices of guc parameters suggest the patch has
been made to support actual use cases.
B: Design of patch B has been discussed on the mailing list as well as
the wiki (for links I refer to my previous mail)

* Are there dangers?
Besides the usual errors causing transactions to fail, in my opinion the
single largest danger would be that the master thinks a standby is in
sync, where in fact it isn't.

* Have all the bases been covered?
Patch A seems to cover two-phase commit where patch B does not. (I'm
time constrained and currently do not have a replicated cluster with
patch A anymore, so I cannot test).

# Does the feature work as advertised?
Patch A: yes
Patch B: yes

I ran a few tests with failures and had some recoverable problems, of
which I'm unsure they are related to the sync rep patches or streaming
replication in general. Work in the direction of a replication
regression test would be useful.

# Are there any assertion failures or crashes?
No

A note: reading source code of both patches makes it clear that these
are patches from experienced PostgreSQL programmers. The source code
reads just like 'normal' high quality postgresql source.

Differences:
Patch A synchronizes by sending back the Xids that have been received
and written to disk. When the master receives the xids, the respective
backends having those xids active are unlocked and signalled to
continue. This requires some locking of the procarray. The libpq protocol
was adapted so that a client (the walreceiver) can report back the xids.
Patch B synchronizes by waiting to send WAL data until the sync
replicas have sent back LSN pointers into the WAL files they have
synced to, in one of three ways. The libpq protocol was adapted so that
the walreceiver can report back WAL file positions.

Perhaps the weakness of both patches is that they are not one. In my
opinion, both patches have their strengths. It would be great if these
strengths could be unified in a single patch.

patch A strengths:
* the design of the guc parameters - the meaning of
min_sync_replication_clients. As I understand it, it is not possible
with patch B to define a minimum number of synchronous standby servers a
transaction must be replicated to before the commit succeeds. Perhaps
the name could be changed (quorum_min_sync_standbys?), but in my opinion
the definitive sync replication patch needs a parameter with this number
and exact meaning. The boolean 'synchronous_slave' guc is also a good
one in my opinion.

patch B strengths:
* having the walsender synced by waiting fo

Re: [HACKERS] ALTER TABLE ... DISABLE TRIGGER vs. AccessExclusiveLock

2010-07-27 Thread Robert Haas
On Tue, Jul 27, 2010 at 3:07 PM, James Robinson
 wrote:
> Experience and a read through backend/commands/tablecmds.c's AlterTable()
> indicate that ALTER TABLE ... DISABLE TRIGGER obtains an exclusive lock on
> the table (as does any ALTER TABLE).
>
> Blocking other readers from a table when we've, within the body of a
> transaction performing a bulk update operation where we don't want / need
> triggers to fire, seems at first glance to be over-kill. I can see how
> AlterTable()'s complex logic is made less complex through 'get and keep a
> big lock', since most of its operational modes really do need exclusive
> access, but is it strictly required for ... DISABLE / REENABLE TRIGGER?
>
> Could, say, RowExclusiveLock hypothetically provide adequate protection,
> allowing concurrent reads, but blocking out any other writers (for ENABLE /
> DISABLE TRIGGER) -- such as if driven through a new statement other than
> ALTER TABLE -- such as "DISABLE TRIGGER foo ON tbar" ?

Funny you should mention this.  There is a pending patch to do
something very much along these line.

https://commitfest.postgresql.org/action/patch_view?id=347

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] do we need to postpone beta4?

2010-07-27 Thread Robert Haas
On Tue, Jul 27, 2010 at 2:11 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> I think we should consider postponing beta4.  I count eleven
>> non-documentation, 9.0-specific bug fixes on REL9_0_STABLE, but there
>> are currently five items on the open 9.0 issues list, at least one of
>> which appears to be a new bug in 9.0, and we just got a bug report on
>> pgsql-bugs from Valentine Gogichashvili complaining of what looks to
>> be a crasher in the redo path for heap_xlog_update().  It seems
>> unlikely at this point that we can have all of these issues fixed and
>> still have time for a full buildfarm cycle before the wrap.  Does it
>> make sense to put out a beta with known bugs (presumably requiring
>> another beta) at this point, or should we push this off a bit?
>
> If we don't wrap a beta this week, the next possible window is several
> weeks away, because various people will be on vacation.  So I think we
> should get the existing fixes out there, even if there are known bugs
> remaining.  A beta is not an RC.

Well, that's pretty much saying we won't release before September.
Which is kind of a bummer, but I guess that's what happens when we get
into vacation season.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



[HACKERS] ALTER TABLE ... DISABLE TRIGGER vs. AccessExclusiveLock

2010-07-27 Thread James Robinson

Hackers,

Experience and a read through backend/commands/tablecmds.c's  
AlterTable() indicate that ALTER TABLE ... DISABLE TRIGGER obtains an  
exclusive lock on the table (as does any ALTER TABLE).


Blocking other readers from a table when we've, within the body of a  
transaction performing a bulk update operation where we don't want /  
need triggers to fire, seems at first glance to be over-kill. I can  
see how AlterTable()'s complex logic is made less complex through 'get  
and keep a big lock', since most of its operational modes really do  
need exclusive access, but is it strictly required for ... DISABLE /  
REENABLE TRIGGER?


Could, say, RowExclusiveLock hypothetically provide adequate  
protection, allowing concurrent reads, but blocking out any other  
writers (for ENABLE / DISABLE TRIGGER) -- such as if driven through a  
new statement other than ALTER TABLE -- such as "DISABLE TRIGGER foo  
ON tbar" ?


Thanks!

James Robinson
Socialserve.com




Re: [HACKERS] git config user.email

2010-07-27 Thread Tom Lane
Robert Haas  writes:
> On Thu, Jul 22, 2010 at 5:41 AM, Magnus Hagander  wrote:
>> *Personally*, I'd prefer to keep using my main email address for
>> commits.

> As for me, I'd much prefer to be rh...@postgresql.org than
> robertmh...@gmail.com.

"Prefer" is exactly the key word here.  I see no reason not to let each
committer exercise his personal preference as to which address to use.
We should suggest that reasonably stable ones be chosen, but it's not
the project's business to make that decision for people.  And in any
case it's impossible to be sure of the longevity of email addresses
more than a few years out, unless your crystal ball works a lot better
than mine.

(My own take is that I absolutely refuse to use t...@postgresql.org as
a primary mail address.  Its spam filtering is nearly nonexistent.
What comes through there is not *quite* filed directly to /dev/null
here, but it's darn close to that.)

regards, tom lane



Re: [HACKERS] PostGIS vs. PGXS in 9.0beta3

2010-07-27 Thread Andrew Dunstan



Robert Haas wrote:
> On Tue, Jul 27, 2010 at 1:13 PM, Josh Berkus  wrote:
>> http://postgis.refractions.net/pipermail/postgis-users/2010-May/026654.html
>
> It's not obvious that there's an unresolved issue here; downthread
> there's some indication this might be an environment problem?
>
> http://postgis.refractions.net/pipermail/postgis-users/2010-May/026658.html

No, I have just reproduced this with a totally vanilla environment.

I think PostGIS probably needs to patch their makefile(s).

cheers

andrew



Re: [HACKERS] do we need to postpone beta4?

2010-07-27 Thread Joshua D. Drake
On Tue, 2010-07-27 at 14:11 -0400, Tom Lane wrote:
> Robert Haas  writes:
> > I think we should consider postponing beta4.  I count eleven
> > non-documentation, 9.0-specific bug fixes on REL9_0_STABLE, but there
> > are currently five items on the open 9.0 issues list, at least one of
> > which appears to be a new bug in 9.0, and we just got a bug report on
> > pgsql-bugs from Valentine Gogichashvili complaining of what looks to
> > be a crasher in the redo path for heap_xlog_update().  It seems
> > unlikely at this point that we can have all of these issues fixed and
> > still have time for a full buildfarm cycle before the wrap.  Does it
> > make sense to put out a beta with known bugs (presumably requiring
> > another beta) at this point, or should we push this off a bit?
> 
> If we don't wrap a beta this week, the next possible window is several
> weeks away, because various people will be on vacation.  So I think we
> should get the existing fixes out there, even if there are known bugs
> remaining.  A beta is not an RC.

If we document the bugs, then +1; if we don't, -1. I.e., we let people
know where we KNOW there are warts.

JD

> 
>   regards, tom lane
> 

-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt




[HACKERS] page corruption on 8.3+ that makes it to standby

2010-07-27 Thread Jeff Davis
I reported a problem here:

http://archives.postgresql.org/pgsql-bugs/2010-07/msg00173.php

Perhaps I used a poor subject line, but I believe it's a serious issue.
That reproducible sequence seems like an obvious bug to me on 8.3+, and
what's worse, the corruption propagates to the standby as I found out
today (through a test, fortunately).

The only mitigating factor is that it doesn't actually lose data, and
you can fix it (I believe) with zero_damaged_pages (or careful use of
dd).

There are two fixes that I can see:

1. Have log_newpage() and heap_xlog_newpage() only call PageSetLSN() and
PageSetTLI() if the page is not new. This seems slightly awkward because
most WAL replay stuff doesn't have to worry about zero pages, but in
this case I think it does.

2. Have copy_relation_data() initialize new pages. I don't like this
because (a) it's not really the job of SET TABLESPACE to clean up zero
pages; and (b) it could be an index with different special size, etc.,
and it doesn't seem like a good place to figure that out.
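Option 1 might look roughly like the following in heap_xlog_newpage()
(an untested sketch, not a patch; log_newpage() would need the
equivalent guard on the WAL-logging side):

```c
/* heap_xlog_newpage(), sketch of option 1: restore the page image,
 * but only stamp LSN/TLI if the page actually has a header to stamp. */
memcpy(page, (char *) xlrec + SizeOfHeapNewpage, BLCKSZ);

if (!PageIsNew(page))
{
    /* An all-zeros page has no initialized header; writing an LSN into
     * it would leave a page that is neither valid nor all-zeros. */
    PageSetLSN(page, lsn);
    PageSetTLI(page, ThisTimeLineID);
}
MarkBufferDirty(buffer);
```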

Comments?

Regards,
Jeff Davis




Re: [HACKERS] do we need to postpone beta4?

2010-07-27 Thread Tom Lane
Robert Haas  writes:
> I think we should consider postponing beta4.  I count eleven
> non-documentation, 9.0-specific bug fixes on REL9_0_STABLE, but there
> are currently five items on the open 9.0 issues list, at least one of
> which appears to be a new bug in 9.0, and we just got a bug report on
> pgsql-bugs from Valentine Gogichashvili complaining of what looks to
> be a crasher in the redo path for heap_xlog_update().  It seems
> unlikely at this point that we can have all of these issues fixed and
> still have time for a full buildfarm cycle before the wrap.  Does it
> make sense to put out a beta with known bugs (presumably requiring
> another beta) at this point, or should we push this off a bit?

If we don't wrap a beta this week, the next possible window is several
weeks away, because various people will be on vacation.  So I think we
should get the existing fixes out there, even if there are known bugs
remaining.  A beta is not an RC.

regards, tom lane



[HACKERS] do we need to postpone beta4?

2010-07-27 Thread Robert Haas
I think we should consider postponing beta4.  I count eleven
non-documentation, 9.0-specific bug fixes on REL9_0_STABLE, but there
are currently five items on the open 9.0 issues list, at least one of
which appears to be a new bug in 9.0, and we just got a bug report on
pgsql-bugs from Valentine Gogichashvili complaining of what looks to
be a crasher in the redo path for heap_xlog_update().  It seems
unlikely at this point that we can have all of these issues fixed and
still have time for a full buildfarm cycle before the wrap.  Does it
make sense to put out a beta with known bugs (presumably requiring
another beta) at this point, or should we push this off a bit?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] PostGIS vs. PGXS in 9.0beta3

2010-07-27 Thread Robert Haas
On Tue, Jul 27, 2010 at 1:13 PM, Josh Berkus  wrote:
> http://postgis.refractions.net/pipermail/postgis-users/2010-May/026654.html

It's not obvious that there's an unresolved issue here; downthread
there's some indication this might be an environment problem?

http://postgis.refractions.net/pipermail/postgis-users/2010-May/026658.html

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Query optimization problem

2010-07-27 Thread Tom Lane
Robert Haas  writes:
> On Tue, Jul 20, 2010 at 11:23 AM, Dimitri Fontaine
>  wrote:
>> The specific diff between the two queries is :
>> 
>>   JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
>> - WHERE (d1.ID=234409763) or (d2.ID=234409763)
>> + WHERE (d2.BasedOn=234409763) or (d2.ID=234409763)
>> 
>> So the OP would appreciate that the planner is able to consider applying
>> the restriction on d2.BasedOn rather than d1.ID given that d2.BasedOn is
>> the same thing as d1.ID, from the JOIN.
>> 
>> I have no idea if Equivalence Classes are where to look for this, and if
>> they're meant to extend up to there, and if that's something possible or
>> wise to implement, though.

> I was thinking of the equivalence class machinery as well.  I think
> the OR clause may be the problem.  If you just had d1.ID=constant, I
> think it would infer that d1.ID, d2.BasedOn, and the constant formed
> an equivalence class.

Right.  Because of the OR, it is *not* possible to conclude that
d2.basedon is always equal to 234409763, which is the implication of
putting them into an equivalence class.

In the example, we do have d1.id and d2.basedon grouped in an
equivalence class.  So in principle you could substitute d1.id into the
WHERE clause in place of d2.basedon, once you'd checked that it was
being used with an operator that's compatible with the specific
equivalence class (ie it's in one of the eclass's opfamilies, I think).
The problem is to recognize that such a rewrite would be a win --- it
could just as easily be a big loss.
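Spelled out against the example from upthread, the substitution under
discussion would be:

```sql
-- As written: the OR references both d1 and d2, so neither relation
-- can be fully filtered before the join.
WHERE (d1.ID = 234409763) OR (d2.ID = 234409763)

-- After substituting d2.BasedOn for d1.ID (valid because the join
-- condition d2.BasedOn = d1.ID puts them in one equivalence class),
-- both OR arms mention only d2, enabling a scan of d2 alone:
WHERE (d2.BasedOn = 234409763) OR (d2.ID = 234409763)
```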

Even if we understood how to direct the rewriting process, I'm really
dubious that it would win often enough to justify the added planning
time.  The particular problem here seems narrow enough that solving it
on the client side is probably a whole lot easier and cheaper than
trying to get the planner to do it.

regards, tom lane



[HACKERS] PostGIS vs. PGXS in 9.0beta3

2010-07-27 Thread Josh Berkus

Hackers,

A 9.0b3 tester reported this issue with our single most popular 
PostgreSQL extension, PostGIS:


==
Postgis makes use of 'PGXS' in postgresql > 8.2. Within postgresql-9, 
datadir and many other variables are defined as multiple values with an 
append operator, like this:


$ grep -i datadir /usr/pgsql-9.0/lib/pgxs/src/Makefile.global
[snip]
datadir := /usr/pgsql-9.0/share

Postgis-1.5.1's Makefile.pgxs makefile override tries to use datadir as 
a flat variable, and doesn't try to expand the array-like structure with 
a for loop or similar. So when 'make install' within postgis-1.5 
configured against pgsql-9.x is run, 'make' treats the second value 
within DATADIR as a command it should just run, which fails:


http://postgis.refractions.net/pipermail/postgis-users/2010-May/026654.html

I've tried the latest tarball SVN exports of both postgis-1.5.x and 
postgis-2.x without success.

==

I'm unsure of what the resolution of this issue should be ... it seems 
like the fix belongs in PostGIS ... but I also think we can't go final 
until this is resolved, given how widely PostGIS is used.


--
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] merge command - GSoC progress

2010-07-27 Thread Robert Haas
On Tue, Jul 27, 2010 at 1:04 AM, Boxuan Zhai  wrote:
> I have a version in which the merge command can run. It accepts the standard
> merge command and can do UPDATE, INSERT and DELETE actions now. But we
> cannot yet put additional qualifications on actions. There are some bugs when
> we try to evaluate the quals, which make the system quit. I will fix them soon.

This patch doesn't compile.  You're using zbxprint() from a bunch of
places where it's not defined.  I get compile warnings for all of
those files and then a link failure at the end.  You might find it
useful to create src/Makefile.custom in your local tree and put
COPT=-Werror in there; it tends to prevent problems of this kind.
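Concretely, that suggestion amounts to:

```shell
# From the top of a PostgreSQL source tree: make all compiler warnings
# fatal, so calls to undeclared functions (like zbxprint here) fail at
# compile time instead of at link time.
mkdir -p src                                # no-op in a real source tree
echo 'COPT=-Werror' >> src/Makefile.custom
```

Makefile.global picks up src/Makefile.custom automatically if it exists,
so no configure step is needed.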

Undefined symbols:
  "_zbxprint", referenced from:
  _transformStmt in analyze.o
  _ExecInitMergeAction in nodeModifyTable.o
  _ExecModifyTable in nodeModifyTable.o
  _ExecInitModifyTable in nodeModifyTable.o
  _merge_action_planner in planner.o

Not that it's as high-priority as getting this fully working, but you
should revert the useless changes in this patch - e.g. the one-line
change to heaptuple.c is obvious debugging leftovers, and all of the
changes to execQual.c and execUtils.c are whitespace-only.  You should
also try to make your code and comments conform to project style
guidelines.  In general, you'll find it easier to keep track of your
changes (and you'll have fewer spurious changes) if you use git diff
master...yourbranch instead of marking comments, etc. with ZBX or
similar.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Synchronous replication

2010-07-27 Thread Dimitri Fontaine
On 27 Jul 2010, at 15:12, Joshua Tolley  wrote:
> My concern is that in a quorum system, if the quorum number is less than the
> total number of replicas, there's no way to know *which* replicas composed the
> quorum for any given transaction, so we can't know which servers to fail to if
> the master dies. This isn't different from Oracle, where it looks like
> essentially the "quorum" value is always 1. Your scenario shows that all
> replicas are not created equal, and that sometimes we'll be interested in WAL
> getting committed on a specific subset of the available servers. If I had two
> nearby replicas called X and Y, and one at a remote site called Z, for
> instance, I'd set quorum to 2, but really I'd want to say "wait for server X
> and Y before committing, but don't worry about Z".
> 
> I have no idea how to set up our GUCs to encode a situation like that :)

You make it so that Z does not take a vote, by setting it async.

Regards,
-- 
dim


Re: [HACKERS] [COMMITTERS] pgsql: Add restart_after_crash GUC.

2010-07-27 Thread Robert Haas
On Tue, Jul 27, 2010 at 2:33 AM, Fujii Masao  wrote:
> On Tue, Jul 20, 2010 at 9:47 AM, Robert Haas  wrote:
>> Log Message:
>> ---
>> Add restart_after_crash GUC.
>
> In postgresql.conf.sample, on/off is used as a boolean value.
> But true/false is used for exit_on_error and restart_after_crash.
> Sorry, I had overlooked that inconsistency when reviewing the
> original patch. I attached the bug-fix patch.

Good catch, thanks.  Committed.
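For reference, the fix is just a spelling change in
postgresql.conf.sample; both spellings parse, but the sample file uses
on/off for every other boolean (the trailing comments here are
approximate):

```
# before
#exit_on_error = false            # terminate session on any error?
#restart_after_crash = true       # reinitialize after backend crash?

# after
#exit_on_error = off              # terminate session on any error?
#restart_after_crash = on         # reinitialize after backend crash?
```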

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Synchronous replication

2010-07-27 Thread Joshua Tolley
On Tue, Jul 27, 2010 at 10:53:45PM +0900, Fujii Masao wrote:
> On Tue, Jul 27, 2010 at 10:12 PM, Joshua Tolley  wrote:
> > My concern is that in a quorum system, if the quorum number is less than
> > the total number of replicas, there's no way to know *which* replicas
> > composed the quorum for any given transaction, so we can't know which
> > servers to fail to if the master dies.
> 
> What about checking the current WAL receive location of each standby by
> using pg_last_xlog_receive_location()? The standby which has the newest
> location should be failed over to.

That makes sense. Thanks.

> > This isn't different from Oracle, where it looks like
> > essentially the "quorum" value is always 1. Your scenario shows that all
> > replicas are not created equal, and that sometimes we'll be interested in
> > WAL getting committed on a specific subset of the available servers. If I
> > had two nearby replicas called X and Y, and one at a remote site called Z,
> > for instance, I'd set quorum to 2, but really I'd want to say "wait for
> > server X and Y before committing, but don't worry about Z".
> >
> > I have no idea how to set up our GUCs to encode a situation like that :)
> 
> Yeah, quorum commit alone cannot cover that situation. I think that
> current approach (i.e., quorum commit plus replication mode per standby)
> would cover that. In your example, you can choose "recv", "fsync" or
> "replay" as replication_mode in X and Y, and choose "async" in Z.

Clearly I need to read through the GUCs and docs better. I'll try to keep
quiet until that's finished :)


--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com




Re: [HACKERS] Synchronous replication

2010-07-27 Thread Fujii Masao
On Tue, Jul 27, 2010 at 10:12 PM, Joshua Tolley  wrote:
> I don't think it can support the case you're interested in, though I'm not
> terribly expert on it. I'm definitely not arguing for the syntax Oracle uses,
> or something similar; I much prefer the flexibility we're proposing, and agree
> with Yeb Havinga in another email who suggests we spell out in documentation
> some recipes for achieving various possible scenarios given whatever GUCs we
> settle on.

Agreed. I'll add it to my TODO list.

> My concern is that in a quorum system, if the quorum number is less than the
> total number of replicas, there's no way to know *which* replicas composed the
> quorum for any given transaction, so we can't know which servers to fail to if
> the master dies.

What about checking the current WAL receive location of each standby by
using pg_last_xlog_receive_location()? The standby which has the newest
location should be failed over to.
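In practice that check might look like this, run against every surviving
standby at failover time:

```sql
-- 9.0 function: reports how far this standby has received WAL from the
-- master, as an XLOG location. Run it on each standby and fail over to
-- the one returning the greatest value (e.g. something like '0/13000178').
SELECT pg_last_xlog_receive_location();
```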

> This isn't different from Oracle, where it looks like
> essentially the "quorum" value is always 1. Your scenario shows that all
> replicas are not created equal, and that sometimes we'll be interested in WAL
> getting committed on a specific subset of the available servers. If I had two
> nearby replicas called X and Y, and one at a remote site called Z, for
> instance, I'd set quorum to 2, but really I'd want to say "wait for server X
> and Y before committing, but don't worry about Z".
>
> I have no idea how to set up our GUCs to encode a situation like that :)

Yeah, quorum commit alone cannot cover that situation. I think that
current approach (i.e., quorum commit plus replication mode per standby)
would cover that. In your example, you can choose "recv", "fsync" or
"replay" as replication_mode in X and Y, and choose "async" in Z.
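Under that approach, Joshua's X/Y/Z scenario might be written as
something like the following (illustrative only -- parameter names
follow this thread's in-progress patch, not any released version):

```
# On nearby standbys X and Y (synchronous; master waits for fsync ACK):
replication_mode = 'fsync'

# On remote standby Z (asynchronous; never delays commits on the master):
replication_mode = 'async'

# On the master: commit waits for ACKs from the 2 synchronous standbys.
quorum = 2
```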

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Synchronous replication

2010-07-27 Thread Fujii Masao
On Tue, Jul 27, 2010 at 8:48 PM, Yeb Havinga  wrote:
> Is there a reason not to send the signal in XlogFlush itself, so it would be
> called at
>
> CreateCheckPoint(), EndPrepare(), FlushBuffer(),
> RecordTransactionAbortPrepared(), RecordTransactionCommit(),
> RecordTransactionCommitPrepared(), RelationTruncate(),
> SlruPhysicalWritePage(), write_relmap_file(), WriteTruncateXlogRec(), and
> xact_redo_commit().

Yes, it's because there is no need to send WAL immediately anywhere
other than the following functions:

* EndPrepare()
* RecordTransactionAbortPrepared()
* RecordTransactionCommit()
* RecordTransactionCommitPrepared()

Some functions call XLogFlush() to follow the basic WAL rule. In the
standby, WAL records are always flushed to disk prior to any corresponding
data-file change. So, we don't need to replicate the result of XLogFlush()
immediately for the WAL rule.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Synchronous replication

2010-07-27 Thread Joshua Tolley
On Tue, Jul 27, 2010 at 01:41:10PM +0900, Fujii Masao wrote:
> On Tue, Jul 27, 2010 at 12:36 PM, Joshua Tolley  wrote:
> > Perhaps I'm hijacking the wrong thread for this, but I wonder if the quorum
> > idea is really the best thing for us. I've been thinking about Oracle's way
> > of doing things[1]. In short, there are three different modes: availability,
> > performance, and protection. "Protection" appears to mean that at least one
> > standby has applied the log; "availability" means at least one standby has
> > received the log info (it doesn't specify whether that info has been fsynced
> > or applied, but presumably does not mean "applied", since it's distinct from
> > "protection" mode); "performance" means replication is asynchronous. I'm not
> > sure this method is perfect, but it might be simpler than the quorum
> > behavior that has been considered, and adequate for actual use cases.
> 
> In my case, I'd like to set up one synchronous standby on the near rack for
> high-availability, and one asynchronous standby on the remote site for
> disaster recovery. Can Oracle's way cover the case?

I don't think it can support the case you're interested in, though I'm not
terribly expert on it. I'm definitely not arguing for the syntax Oracle uses,
or something similar; I much prefer the flexibility we're proposing, and agree
with Yeb Havinga in another email who suggests we spell out in documentation
some recipes for achieving various possible scenarios given whatever GUCs we
settle on.

> "availability" mode with two standbys might create a sort of similar
> situation. That is, since the ACK from the near standby arrives first, the
> near standby acts synchronous and the remote one asynchronous. But the ACK
> from the remote standby can arrive first, so it's not guaranteed that the
> near standby has received the log info before transaction commit returns a
> "success" to the client. In this case, we have to fail over to the remote
> standby even if it's not under control of a clusterware. This is a problem
> for me.

My concern is that in a quorum system, if the quorum number is less than the
total number of replicas, there's no way to know *which* replicas composed the
quorum for any given transaction, so we can't know which servers to fail to if
the master dies. This isn't different from Oracle, where it looks like
essentially the "quorum" value is always 1. Your scenario shows that all
replicas are not created equal, and that sometimes we'll be interested in WAL
getting committed on a specific subset of the available servers. If I had two
nearby replicas called X and Y, and one at a remote site called Z, for
instance, I'd set quorum to 2, but really I'd want to say "wait for server X
and Y before committing, but don't worry about Z".

I have no idea how to set up our GUCs to encode a situation like that :)

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com




Re: [HACKERS] Synchronous replication

2010-07-27 Thread Yeb Havinga

Fujii Masao wrote:
>> I noted the changes in XlogSend where instead of *caughtup = true/false it
>> now returns !MyWalSnd->sndrqst. That value is initialized to false in that
>> procedure and it cannot be changed to true during execution of that
>> procedure, or can it?
>
> That value is set to true in WalSndWakeup(). If WalSndWakeup() is called
> after initialization of that value in XLogSend(), *caughtup is set to false.

Ah, so it can be changed by another backend process.

Another question:

Is there a reason not to send the signal in XlogFlush itself, so it 
would be called at


CreateCheckPoint(), EndPrepare(), FlushBuffer(), 
RecordTransactionAbortPrepared(), RecordTransactionCommit(), 
RecordTransactionCommitPrepared(), RelationTruncate(), 
SlruPhysicalWritePage(), write_relmap_file(), WriteTruncateXlogRec(), 
and xact_redo_commit().


regards,
Yeb Havinga




Re: [HACKERS] Synchronous replication

2010-07-27 Thread Fujii Masao
On Tue, Jul 27, 2010 at 5:42 PM, Yeb Havinga  wrote:
> I'd like to bring forward another suggestion (please tell me when it is
> becoming spam). My feeling about replication_mode as is, is that is says in
> the same parameter something about async or sync, as well as, if sync, which
> method of feedback to the master. OTOH having two parameters would need
> documentation that the feedback method may only be set if the
> replication_mode was sync, as well as checks. So it is actually good to have
> it all in one parameter
>
> But somehow the shoe pinches, because async feels different from the other
> three parameters. There is a way to move async out of the enumeration:
>
> synchronous_replication_mode = off | recv | fsync | replay

ISTM that we need to get more feedback from users to determine which
is the best. So, how about leaving the parameter as it is and revisiting
this topic later? Since it's not difficult to change the parameter later,
we will not regret even if we delay that determination.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Synchronous replication

2010-07-27 Thread Fujii Masao
On Tue, Jul 27, 2010 at 7:39 PM, Yeb Havinga  wrote:
> Fujii Masao wrote:
>>
>> The attached patch changes the backend so that it signals walsender to
>> wake up from the sleep and send WAL immediately. It doesn't include any
>> other synchronous replication stuff.
>>
>
> Hello Fujii,

Thanks for the review!

> I noted the changes in XlogSend where instead of *caughtup = true/false it
> now returns !MyWalSnd->sndrqst. That value is initialized to false in that
> procedure and it cannot be changed to true during execution of that
> procedure, or can it?

That value is set to true in WalSndWakeup(). If WalSndWakeup() is called
after initialization of that value in XLogSend(), *caughtup is set to false.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Synchronous replication

2010-07-27 Thread Yeb Havinga

Fujii Masao wrote:
> The attached patch changes the backend so that it signals walsender to
> wake up from the sleep and send WAL immediately. It doesn't include any
> other synchronous replication stuff.

Hello Fujii,

I noted the changes in XlogSend where instead of *caughtup = true/false 
it now returns !MyWalSnd->sndrqst. That value is initialized to false in 
that procedure and it cannot be changed to true during execution of that 
procedure, or can it?


regards,
Yeb Havinga




Re: [HACKERS] Synchronous replication

2010-07-27 Thread Yeb Havinga

Joshua Tolley wrote:
> Perhaps I'm hijacking the wrong thread for this, but I wonder if the quorum
> idea is really the best thing for us.

For reference: it appeared in a long thread a while ago:
http://archives.postgresql.org/pgsql-hackers/2010-05/msg01226.php

> In short, there are three different modes: availability,
> performance, and protection. "Protection" appears to mean that at least one
> standby has applied the log; "availability" means at least one standby has
> received the log info

Maybe we could do both, by describing use cases for the availability,
performance and protection setups in the documentation and how they
would be reflected in the standby-related parameters.


regards,
Yeb Havinga




Re: [HACKERS] Synchronous replication

2010-07-27 Thread Yeb Havinga

Fujii Masao wrote:
> On Mon, Jul 26, 2010 at 8:25 PM, Robert Haas  wrote:
>> On Mon, Jul 26, 2010 at 6:48 AM, Marko Tiikkaja
>>  wrote:
>>> On 7/26/10 1:44 PM +0300, Fujii Masao wrote:
>>>> On Mon, Jul 26, 2010 at 6:36 PM, Yeb Havinga  wrote:
>>>>> I wasn't entirely clear. My suggestion was to have only
>>>>>
>>>>>   acknowledge_commit = {no|recv|fsync|replay}
>>>>>
>>>>> instead of
>>>>>
>>>>>   replication_mode = {async|recv|fsync|replay}
>>>>
>>>> Okay, I'll change the patch accordingly.
>>>
>>> For what it's worth, I think replication_mode is a lot clearer.
>>> Acknowledge_commit sounds like it would do something similar to
>>> asynchronous_commit.
>>
>> I agree.
>
> As the result of the vote, I'll leave the parameter "replication_mode"
> as it is.

I'd like to bring forward another suggestion (please tell me when it is 
becoming spam). My feeling about replication_mode as is, is that it says 
in the same parameter something about async vs. sync as well as, if 
sync, which method of feedback to the master. OTOH, having two parameters 
would need documentation that the feedback method may only be set if 
replication_mode is sync, as well as checks. So it is actually good to 
have it all in one parameter.


But somehow the shoe pinches, because async feels different from the 
other three parameters. There is a way to move async out of the enumeration:


synchronous_replication_mode = off | recv | fsync | replay

This also looks a bit like the "synchronous_replication = N # similar in 
name to synchronous_commit" Simon Riggs proposed in 
http://archives.postgresql.org/pgsql-hackers/2010-05/msg01418.php


regards,
Yeb Havinga



PS: Please bear with me, I thought a bit about a way to make clear what 
deduction users must make when figuring out if the replication mode is 
synchronous. That question might be important when counting 'which 
servers are the synchronous standbys' to debug quorum settings.


replication_mode

from the assumption !async -> sync
and !async -> recv|fsync|replay
to infer recv|fsync|replay -> synchronous_replication.

synchronous_replication_mode

from the assumption !off -> on
and !off -> recv|fsync|replay
to infer recv|fsync|replay -> synchronous_replication.

I think the last one is easier for humans to make, since everybody will 
make the !off -> on assumption, but not the !async -> sync one without 
having verified it in the documentation.


