Re: [HACKERS] [GENERAL] Floating point error

2013-03-01 Thread Albe Laurenz
Tom Duffey wrote (on -general):
 To bring closure to this thread, my whole problem was caused by not
 knowing about the extra_float_digits setting. We have a script that
 uses COPY to transfer a subset of rows from a very large production
 table to a test table. The script was not setting extra_float_digits,
 so the values did not match even though they appeared to match when
 running queries in psql. Definitely another gotcha for floating point
 values, and it might be a good idea to mention this setting on the
 Numeric Types page of the docs.

True; how about this patch.
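
To make the gotcha concrete, here is a minimal illustration (my own
sketch, not from Tom's report; the exact trailing digits are
platform-dependent):

CREATE TABLE floats (v float8);
INSERT INTO floats VALUES (0.1);

-- The default output rounds to 15 significant digits, hiding the noise:
SHOW extra_float_digits;  -- 0
COPY floats TO STDOUT;    -- 0.1

-- With the maximum setting, COPY emits enough digits to reproduce the
-- stored binary value exactly:
SET extra_float_digits = 3;
COPY floats TO STDOUT;    -- 0.100000000000000006 (or similar)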

Yours,
Laurenz Albe


float.patch
Description: float.patch



Re: [HACKERS] Materialized views WIP patch

2013-03-01 Thread Ants Aasma
On Thu, Feb 28, 2013 at 7:52 PM, Kevin Grittner kgri...@ymail.com wrote:
 Barring a sudden confluence of opinion, I will go with TRUNCATE for
 the initial spelling.  I tend to favor that spelling for several
 reasons.  One was the size of the patch needed to add the opposite
 of REFRESH to the backend code:

FWIW, I found Andres's point about closing the door on updatable views
quite convincing. If at any point we want to add updatable
materialized views, it seems like a bad inconsistency to have TRUNCATE
mean something completely different from DELETE. While update
capability for materialized views might not be a common use case, I
don't think it's fair to shut the door on it completely for the sake of
an easier implementation and shorter syntax. Especially as the shorter
syntax is semantically inconsistent: a normal TRUNCATE removes the
data, while on a materialized view it would just make the data
inaccessible until the next refresh.

Sorry for weighing in late, but it seemed to me that this point didn't
get enough consideration.

Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de




Re: [HACKERS] Materialized views WIP patch

2013-03-01 Thread Kevin Grittner
Ants Aasma a...@cybertec.at wrote:
 Kevin Grittner kgri...@ymail.com wrote:
 Barring a sudden confluence of opinion, I will go with TRUNCATE
 for the initial spelling.  I tend to favor that spelling for
 several reasons.  One was the size of the patch needed to add
 the opposite of REFRESH to the backend code:

 FWIW, I found Andres's point about closing the door on updatable
 views quite convincing. If at any point we want to add updatable
 materialized views, it seems like a bad inconsistency to have
 TRUNCATE mean something completely different from DELETE. While
 update capability for materialized views might not be a common
 use case, I don't think it's fair to completely shut the door on
 it to have easier implementation and shorter syntax. Especially
 as the shorter syntax is semantically inconsistent - normal
 truncate removes the data, materialized view just makes the data
 inaccessible until the next refresh.

 Sorry for weighing in late, but it seemed to me that this point
 didn't get enough consideration.

Personally, I don't understand why anyone would want updateable
materialized views.  That's probably because 99% of the cases where
I've seen that someone wanted them, they wanted them updated to
match the underlying data using some technique that didn't require
the modification or commit of the underlying data to carry the
overhead of maintaining the MV.  In other words, they do not want
the MV to be up-to-date for performance reasons.  That is a big
part of the reason for *wanting* to use an MV.  How do you make an
asynchronously-maintained view updateable?

In addition, in at least 80% of the cases I've seen where people want
an MV, it is for summary information, which does not tie a single MV
row to a single underlying row.  If someone updates an aggregate number
like an average, I see no reasonable way to map that back to the
underlying data in a meaningful way.

I see the contract of a materialized view as providing a
table-backed relation which shows the result set of a query as of
some point in time.  Perhaps it is a failure of imagination, but I
don't see where modifying that relation directly is compatible with
that contract.

Can you describe a meaningful use case for an updateable
materialized view?

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Statistics and selectivity estimation for ranges

2013-03-01 Thread Alexander Korotkov
On Wed, Feb 13, 2013 at 5:55 PM, Alexander Korotkov aekorot...@gmail.com wrote:

 On Wed, Feb 13, 2013 at 5:28 PM, Heikki Linnakangas 
 hlinnakan...@vmware.com wrote:

 On 04.01.2013 10:42, Alexander Korotkov wrote:

 /*
  * Calculate selectivity of && operator using histograms of range
  * lower bounds and histogram of range lengths.
  */
 static double
 calc_hist_selectivity_overlap(TypeCacheEntry *typcache, RangeBound *lower,
                               RangeBound *upper, RangeBound *hist_lower,
                               int hist_nvalues, Datum *length_hist_values,
                               int length_hist_nvalues)


 We already have code to estimate &&, based on the lower and upper bound
 histograms:

             case OID_RANGE_OVERLAP_OP:
             case OID_RANGE_CONTAINS_ELEM_OP:
                 /*
                  * A && B = NOT (A << B OR A >> B).
                  *
                  * "range @> elem" is equivalent to "range && [elem,elem]".
                  * The caller already constructed the singular range from
                  * the element constant, so just treat it the same as &&.
                  */
                 hist_selec =
                     calc_hist_selectivity_scalar(typcache, const_lower,
                                                  hist_upper, nhist, false);
                 hist_selec +=
                     (1.0 - calc_hist_selectivity_scalar(typcache, const_upper,
                                                         hist_lower, nhist, true));
                 hist_selec = 1.0 - hist_selec;
                 break;


 I don't think the method based on lower bound and length histograms is
 any better. In fact, my gut feeling is that it's less accurate. I'd suggest
 dropping that part of the patch.


 Right. That estimate has the accuracy of the histogram, while the
 estimate based on lower-bound and length histograms relies on an
 additional assumption about the independence of lower bound and length.
 We can sum the A << B and A >> B probabilities because those events are
 mutually exclusive. That is pretty evident, but I would like to mention
 it in the comments, because the typical assumption about events in
 statistics calculations is their independence.


These changes were made in the attached patch.
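
As a quick way to see the new estimate at work once the patch is applied
(a sketch with made-up data, not part of the regression tests):

CREATE TABLE res (r int4range);
INSERT INTO res SELECT int4range(i, i + 10) FROM generate_series(1, 10000) i;
ANALYZE res;
EXPLAIN SELECT * FROM res WHERE r && int4range(100, 200);
-- The estimated row count should now follow the histogram-based
-- 1 - (P(r << q) + P(r >> q)) computation instead of a default guess.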

--
With best regards,
Alexander Korotkov.


range_stat-0.11.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] Materialized views WIP patch

2013-03-01 Thread Ants Aasma
On Fri, Mar 1, 2013 at 4:18 PM, Kevin Grittner kgri...@ymail.com wrote:
 Personally, I don't understand why anyone would want updateable
 materialized views.  That's probably because 99% of the cases where
 I've seen that someone wanted them, they wanted them updated to
 match the underlying data using some technique that didn't require
 the modification or commit of the underlying data to carry the
 overhead of maintaining the MV.  In other words, they do not want
 the MV to be up-to-date for performance reasons.  That is a big
 part of the reason for *wanting* to use an MV.  How do you make an
 asynchronously-maintained view updateable?

 In addition, in at least 80% of the cases I've seen where people want
 an MV, it is for summary information, which does not tie a single MV
 row to a single underlying row.  If someone updates an aggregate number
 like an average, I see no reasonable way to map that back to the
 underlying data in a meaningful way.

 I see the contract of a materialized view as providing a
 table-backed relation which shows the result set of a query as of
 some point in time.  Perhaps it is a failure of imagination, but I
 don't see where modifying that relation directly is compatible with
 that contract.

 Can you describe a meaningful use case for an updateable
 materialized view?

I actually agree that the overwhelming majority of users don't need or
want updateable materialized views. My point was that we can't rule out
at this point that people will think of a good use for them. I don't
have any real use cases myself, but I can imagine a few situations
where updateable materialized views wouldn't be nonsensical.

One case would be if the underlying data is bulk-loaded and then
subsetted into smaller materialized views for processing with
off-the-shelf tools that expect tables. One might want to propagate
changes from those applications back to the base data.

The other case would be the theoretical future where materialized
views can be incrementally and transactionally maintained, in that
case being able to express modifications on the views could actually
make sense.

I understand that the examples are completely hypothetical and could
be solved by using regular tables. I just have a feeling that we will
regret conflating TRUNCATE semantics for a slight gain in
implementation and notational convenience. To give another example of
potential future update semantics: if we were to allow users to
manually maintain materialized view contents using DML commands, one
would expect TRUNCATE to mean "make this matview empty", not "make
this matview unavailable".

Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de




[HACKERS] Optimizing pglz compressor

2013-03-01 Thread Heikki Linnakangas
I spotted some low-hanging fruit in the pglz compression routine. It 
uses a hash table to keep track of string prefixes it has seen:


#define PGLZ_HISTORY_LISTS  8192    /* must be power of 2 */
#define PGLZ_HISTORY_SIZE   4096

/* --
 * Statically allocated work arrays for history
 * --
 */
static PGLZ_HistEntry *hist_start[PGLZ_HISTORY_LISTS];
static PGLZ_HistEntry hist_entries[PGLZ_HISTORY_SIZE];

The hist_start array has to be zeroed at every invocation of 
pglz_compress(). That's relatively expensive when compressing small 
values; on a 64-bit machine, 8192 * 8 = 64 kB.


The pointers in hist_start point to entries in hist_entries. If we 
replace the pointers with int16 indexes into hist_entries, we can shrink 
the hist_start array to 8192 * 2 = 16 kB.


Also, in PGLZ_HistEntry:

typedef struct PGLZ_HistEntry
{
    struct PGLZ_HistEntry *next;    /* links for my hash key's list */
    struct PGLZ_HistEntry *prev;
    int         hindex;             /* my current hash key */
    const char *pos;                /* my input position */
} PGLZ_HistEntry;

we can replace next and prev with indexes into the hist_entries array, 
halving the size of that struct, and thus the hist_entries array from 
32*4096=128kB to 64kB. I'm not sure how significant that is - 
hist_entries array doesn't need to be zeroed out like hist_start does - 
but it might buy us something in CPU cache efficiency.


I ran some performance tests with this:

 testname  | unpatched | patched
---+---+-
 5k text   |   1.55933 | 1.57337
 512b random   |   6.70911 | 5.41420
 100k of same byte |   1.92319 | 1.94880
 2k random |   4.29494 | 3.88782
 100k random   |   1.10400 | 1.07982
 512b text |   9.26736 | 7.55828
(6 rows)

The datum used in the "5k text" test was the first 5k from the 
doc/src/sgml/func.sgml file in the source tree, and "512b text" was 
the first 512 bytes from the same file. The data in the random tests was 
completely random, and in the "100k of same byte" test, a single byte 
was repeated 100,000 times (i.e. it compresses really well).


Each test inserts rows into a temporary table. The table has 10 bytea 
columns, and the same value is inserted in every column. The number of 
rows inserted was scaled so that each test inserts roughly 500 MB of 
data. Each test was repeated 20 times, and the values listed above are 
the minimum runtimes in seconds.
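
For anyone who wants to reproduce this, one test round looks roughly
like the following (my reconstruction from the description above, not
Heikki's actual script; pick :nrows so that about 500 MB is inserted):

CREATE TEMP TABLE ctest (
    c1 bytea, c2 bytea, c3 bytea, c4 bytea, c5 bytea,
    c6 bytea, c7 bytea, c8 bytea, c9 bytea, c10 bytea
);

\timing on
INSERT INTO ctest
SELECT d, d, d, d, d, d, d, d, d, d
FROM (SELECT :'datum'::bytea AS d) AS v,
     generate_series(1, :nrows);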


I spotted this while looking at Amit's WAL update delta encoding patch. 
My earlier suggestion to just use the pglz compressor for the delta 
encoding didn't work too well because the pglz compressor was too 
expensive, especially for small values. This patch might help with that.


In summary, this seems like a pretty clear win for short values, and a 
wash for long values. Not surprising, as this greatly lowers the startup 
cost of pglz_compress(). We're past feature freeze, but how would people 
feel about committing this?


--
- Heikki
diff --git a/src/backend/utils/adt/pg_lzcompress.c b/src/backend/utils/adt/pg_lzcompress.c
index 66c64c1..b4a8b8b 100644
--- a/src/backend/utils/adt/pg_lzcompress.c
+++ b/src/backend/utils/adt/pg_lzcompress.c
@@ -198,9 +198,9 @@
  */
 typedef struct PGLZ_HistEntry
 {
-	struct PGLZ_HistEntry *next;	/* links for my hash key's list */
-	struct PGLZ_HistEntry *prev;
-	int			hindex;			/* my current hash key */
+	int16		next;			/* links for my hash key's list */
+	int16		prev;
+	uint32		hindex;			/* my current hash key */
 	const char *pos;			/* my input position */
 } PGLZ_HistEntry;
 
@@ -241,9 +241,11 @@ const PGLZ_Strategy *const PGLZ_strategy_always = &strategy_always_data;
  * Statically allocated work arrays for history
  * --
  */
-static PGLZ_HistEntry *hist_start[PGLZ_HISTORY_LISTS];
-static PGLZ_HistEntry hist_entries[PGLZ_HISTORY_SIZE];
+static int16 hist_start[PGLZ_HISTORY_LISTS];
+static PGLZ_HistEntry hist_entries[PGLZ_HISTORY_SIZE + 1];
 
+/* Element 0 in hist_entries is unused, and means 'invalid'. */
+#define INVALID_ENTRY			0
 
 /* --
  * pglz_hist_idx -
@@ -279,25 +281,25 @@ static PGLZ_HistEntry hist_entries[PGLZ_HISTORY_SIZE];
 #define pglz_hist_add(_hs,_he,_hn,_recycle,_s,_e) \
 do {                                                   \
             int __hindex = pglz_hist_idx((_s),(_e));       \
-            PGLZ_HistEntry **__myhsp = &(_hs)[__hindex];   \
+            int16 *__myhsp = &(_hs)[__hindex];             \
             PGLZ_HistEntry *__myhe = &(_he)[_hn];          \
             if (_recycle) {                                \
-                if (__myhe->prev == NULL)                  \
+                if (__myhe->prev == INVALID_ENTRY)         \
                     (_hs)[__myhe->hindex] = __myhe->next;  \
                 else                                       \
-                    __myhe->prev->next = __myhe->next;     \
-                if (__myhe->next != NULL)                  \
-                    __myhe->next->prev = __myhe->prev;     \
+                    (_he)[__myhe->prev].next = __myhe->next;\
+                if (__myhe->next != INVALID_ENTRY)         \
+                    (_he)[__myhe->next].prev = __myhe->prev;\

Re: [HACKERS] scanner/parser minimization

2013-03-01 Thread Peter Eisentraut
On 2/28/13 3:34 PM, Robert Haas wrote:
 It's
 possible to vastly reduce the size of the scanner output, and
 therefore of gram.c, by running flex with -Cf rather than -CF, which
 changes the table representation completely.  I assume there is a
 sound performance reason why we don't do this, but it might be worth
 checking that if we haven't lately.  When compiled with -Cf, the size
 of gram.o drops from 1019260 bytes to 703600, which is a large
 savings.

The option choice is based on the recommendation in the flex manual.  It
wouldn't hurt to re-test it.




Re: [HACKERS] Optimizing pglz compressor

2013-03-01 Thread Alvaro Herrera
Heikki Linnakangas wrote:

 I spotted this while looking at Amit's WAL update delta encoding
 patch. My earlier suggestion to just use the pglz compressor for the
 delta encoding didn't work too well because the pglz compressor was
 too expensive, especially for small values. This patch might help
 with that.
 
 In summary, this seems like a pretty clear win for short values, and
 a wash for long values. Not surprising, as this greatly lowers the
 startup cost of pglz_compress(). We're past feature freeze, but how
 would people feel about committing this?

Surely we're not past feature freeze.  If we were, we'd have to reject
all remaining patches from the commitfest, which is not what we want to
do at this stage, is it?

My take on this is that if this patch is necessary to get Amit's patch
to a commitable state, it's fair game.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Optimizing pglz compressor

2013-03-01 Thread Heikki Linnakangas

On 01.03.2013 17:37, Alvaro Herrera wrote:

Heikki Linnakangas wrote:


In summary, this seems like a pretty clear win for short values, and
a wash for long values. Not surprising, as this greatly lowers the
startup cost of pglz_compress(). We're past feature freeze, but how
would people feel about committing this?


Surely we're not past feature freeze.  If we were, we'd have to reject
all remaining patches from the commitfest, which is not what we want to
do at this stage, is it?


Right, no, that's not what I meant by feature freeze. I don't know what 
the correct term here would be, but what I meant is that we're not 
considering new features for 9.3 anymore, except those that are in the 
commitfest.



My take on this is that if this patch is necessary to get Amit's patch
to a commitable state, it's fair game.


I don't think it's necessary for that, but let's see.

- Heikki




Re: [HACKERS] Optimizing pglz compressor

2013-03-01 Thread Stephen Frost
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
 Surely we're not past feature freeze.  If we were, we'd have to reject
 all remaining patches from the commitfest, which is not what we want to
 do at this stage, is it?

Actually, I think we're getting very close to exactly that point: we're
not making progress through the CommitFest, and if we don't triage and
close it out, we're never going to get a release out.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] JSON Function Bike Shedding

2013-03-01 Thread Merlin Moncure
On Fri, Feb 22, 2013 at 11:50 AM, David E. Wheeler
da...@justatheory.com wrote:
 On Feb 22, 2013, at 9:37 AM, Robert Haas robertmh...@gmail.com wrote:

 What I think is NOT tolerable is choosing a set of short but arbitrary
 names which are different from anything that we have now and
 pretending that we'll want to use those again for the next data type
 that comes along.  That's just wishful thinking.  Programmers who
 believe that their decisions will act as precedent for all future code
 are almost inevitably disappointed.  Precedent grows organically out
 of what happens; it's very hard to create it ex nihilo, especially
 since we have no clear idea what future data types we'll likely want
 to add.  Sure, if we add something that's just like JSON but with a
 few extra features, we'll be able to reuse the names no problem.  But
 that's unlikely, because we typically resist the urge to add things
 that are too much like what we already have.  The main reason we're
 adding JSON when we already have hstore is because JSON has become
 something of a standard.  We probably WILL add more container types
 in the future, but I'd guess that they are likely to be as different
 from JSON as JSON is from XML, or from arrays.  I'm not convinced we
 can define a set of semantics that are going to sweep that broadly.

 Maybe. I would argue, however, that a key/value-oriented data type will 
 always call those things keys and values. So keys() and vals() (or 
 get_keys() and get_vals()) seem pretty reasonable to me.

 Anyway, back to practicalities, Andrew last posted:

 I am going to go the way that involves the least amount of explicit casting 
 or array construction. So get_path() stays, but becomes non-variadic. get() 
 can take an int or variadic text[], so you can do:

get(myjson,0)
get(myjson,'f1')
get(myjson,'f1','2','f3')
get_path(myjson,'{f1,2,f3}')

 I would change these to mention the return types:

get_json(myjson,0)
get_json(myjson,'f1')
get_json(myjson,'f1','2','f3')
get_path_json(myjson,'{f1,2,f3}')

 And then the complementary text-returning versions:

get_text(myjson,0)
get_text(myjson,'f1')
get_text(myjson,'f1','2','f3')
get_path_text(myjson,'{f1,2,f3}')

 I do think that something like length() has pretty good semantics across data 
 types, though. So to update the proposed names, taking the discussion into 
 account, I now propose:

 Existing Name              Proposed Name
 -------------------------  ---------------
 json_array_length()        length()
 json_each()                each_json()
 json_each_as_text()        each_text()
 json_get()                 get_json()
 json_get_as_text()         get_text()
 json_get_path()            get_path_json()
 json_get_path_as_text()    get_path_text()
 json_object_keys()         get_keys()
 json_populate_record()     to_record()
 json_populate_recordset()  to_records()
 json_unnest()              get_values()
 json_agg()                 json_agg()

 I still prefer to_record() and to_records() to populate_record(). It just 
 feels more like a cast to me. I dislike json_agg(), but assume we're stuck 
 with it.

 But at this point, I’m happy to leave Andrew to it. The functionality is 
 awesome.


Agreed: +1 to your thoughts here.  But also +1 to the originals, and +1
to Robert's point of view as well.  This feature is of huge strategic
importance to the project, and we need to lock this down and commit it.
There is a huge difference between "I slightly prefer some different
names" and "the feature has issues".

So, I think the various positions are clear: this is one argument I'd
be happy to lose (or win).

merlin




Re: [HACKERS] Enabling Checksums

2013-03-01 Thread Daniel Farina
On Sun, Feb 24, 2013 at 10:30 PM, Greg Smith g...@2ndquadrant.com wrote:
 Attached are some bit-rot updates to the checksums patches.  The
 replace-tli one still works fine

I rather badly want this feature, and if the open issues with the
patch have hit zero, I'm thinking about applying it, shipping it, and
turning it on.  Given that the heap format has not changed, the main
affordance I may check for is whether I can work in backwards
compatibility (while not maintaining the checksums, of course) in case
of an emergency.

-- 
fdr




Re: [HACKERS] Support for REINDEX CONCURRENTLY

2013-03-01 Thread Fujii Masao
On Fri, Mar 1, 2013 at 12:57 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Thu, Feb 28, 2013 at 11:26 PM, Fujii Masao masao.fu...@gmail.com wrote:

 I found one problem in the latest patch. I got the segmentation fault
 when I executed the following SQLs.

 CREATE TABLE hoge (i int);
 CREATE INDEX hogeidx ON hoge(abs(i));
 INSERT INTO hoge VALUES (generate_series(1,10));
 REINDEX TABLE CONCURRENTLY hoge;

 The error messages are:

 LOG:  server process (PID 33641) was terminated by signal 11: Segmentation
 fault
 DETAIL:  Failed process was running: REINDEX TABLE CONCURRENTLY hoge;

 Oops. Index expressions were not correctly extracted when building
 columnNames for index_create in index_concurrent_create.
 Fixed in this new patch. Thanks for catching that.

I found another problem in the latest patch. When I issued the following SQLs,
I got the assertion failure.

CREATE EXTENSION pg_trgm;
CREATE TABLE hoge (col1 text);
CREATE INDEX hogeidx ON hoge USING gin (col1 gin_trgm_ops) WITH
(fastupdate = off);
INSERT INTO hoge SELECT random()::text FROM generate_series(1,100);
REINDEX TABLE CONCURRENTLY hoge;

The error message that I got is:

TRAP: FailedAssertion("!(((array)->elemtype) == 25)", File:
"reloptions.c", Line: 874)
LOG:  server process (PID 45353) was terminated by signal 6: Abort trap
DETAIL:  Failed process was running: REINDEX TABLE CONCURRENTLY hoge;

ISTM that the patch doesn't handle the gin option "fastupdate = off" correctly.

Anyway, I think you should test whether REINDEX CONCURRENTLY works well
with every type of index before posting the next patch. Otherwise, I
might find another problem ;P

@@ -1944,7 +2272,8 @@ index_build(Relation heapRelation,
Relation indexRelation,
IndexInfo *indexInfo,
bool isprimary,
-   bool isreindex)
+   bool isreindex,
+   bool istoastupdate)

istoastupdate seems to be unused.

Regards,

-- 
Fujii Masao




Re: [HACKERS] Commitfest progress

2013-03-01 Thread Josh Berkus

 As I stepped up to work on the CF and then became immediately swamped in
 other work I bear some of the responsibility for not keeping things
 rolling.

Just FYI, this is exactly why I wanted a 2nd CF manager for this CF.

 It'd be really good if anyone with a patch in the CF could follow up
 with an opinion on its readiness and, if appropriate, push it to the
 next CF. I'll be going through the patch list and following up with my
 own summaries, but I remain pretty swamped so it'll be a trickle rather
 than a flood.

I'll try to find time this weekend to go through the pending patches and
check status.  I'll start at the bottom of the list and you can start at
the top.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Re: proposal: a width specification for %s specifier (format function), fix behavior when positional and ordered placeholders are used

2013-03-01 Thread Pavel Stehule
Hello

2013/2/28 Dean Rasheed dean.a.rash...@gmail.com:
 On 28 February 2013 11:25, Kyotaro HORIGUCHI
 horiguchi.kyot...@lab.ntt.co.jp wrote:
 Umm. sorry,

 If you have no problem with this, I'll send this to committer.

 I just found that this patch already has a reviewer. I had seen
 only the Status field in the patch list.

 Should I leave this to you, Dean?


 Sorry, I've been meaning to review this properly for some time, but
 I've been swamped with other work, so I'm happy for you to take over.

 My overall impression is that the patch is in good shape, and provides
 valuable new functionality, and it is probably close to being ready
 for committer.

 I think that the only other behavioural glitch I spotted was that it
 fails to catch one overflow case, which should generate an "out of
 range" error:

 select format('|%*s|', -2147483648, 'foo');
 Result: |foo|

 because -(-2147483648) is still -2147483648 in signed 32-bit integers.

Fixed; another overflow check has been added.
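
For the archives, the underlying C pitfall in standalone form (a
sketch, not the patch code):

#include <limits.h>
#include <stdio.h>

int
main(void)
{
    int     width = INT_MIN;    /* -2147483648 */

    /*
     * Normalizing a negative width with "-width" overflows here: on
     * two's-complement machines the result wraps back to INT_MIN
     * (formally undefined behavior in C), so a "too large?" check
     * performed after the negation never fires.
     */
    printf("%d\n", -width);     /* typically prints -2147483648 */
    return 0;
}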

Regards

Pavel


 Apart from that, I didn't find any problems during my testing.

 Thanks for your review.

 Regards,
 Dean


format-width-20130301.patch
Description: Binary data



Re: [HACKERS] find libxml2 using pkg-config

2013-03-01 Thread Noah Misch
On Mon, Jan 14, 2013 at 06:51:05AM -0500, Peter Eisentraut wrote:
 In multi-arch OS installations, using a single foo-config script to find
 libraries is problematic, because you don't know which architecture it
 will point to, and you can't choose which one you want.  Using
 pkg-config is better in that situation, because you can use its
 environment variables to point to your preferred version
 of /usr/lib*/pkgconfig or similar.

./configure XML2_CONFIG=/my/preferred/xml2-config achieves this today.

 In configure, we use xml2-config and pthread-config.  The latter doesn't
 exist on my system, so I'm just dealing with xml2-config now.

pthread-config is now quite deep into legacy territory; no concern there.

 The attached patch looks for pkg-config first, and finds libxml2 using
 that if available.  Otherwise it falls back to using xml2-config.

There's a risk in making anything configurable in two places, with one
taking precedence.  Some user unaware of $PLACE_1 will scratch his head
after changing $PLACE_2 to no effect.  For example, I've been selecting
an override libxml2 by changing my PATH.  With this patch, I will need
to also change PKG_CONFIG_PATH; otherwise, my system libxml2 will
quietly take precedence.  I can adapt easily enough, but I wonder
whether the patch will be a net win generally for folks building
PostgreSQL.

I'd like this change more if it could be construed as starting us down
the road of using pkg-config, or any one particular configuration
discovery method, for all dependencies.  But many dependencies don't
ship .pc files at all (perl, ldap, tcl, pam).  Others ship a .pc, but we
use neither it nor a -config script (openssl, gssapi, xslt).  I see this
patch as adding one more thing for builders to comprehend without making
things notably easier for complex situations.  Adopting this patch
wouldn't appall me, but I would rather not.


 +  AC_CHECK_PROGS(PKG_CONFIG, pkg-config)

This deserves an AC_ARG_VAR(PKG_CONFIG) somewhere.  On the other hand,
AC_ARG_VAR(XML2_CONFIG) is already missing, so perhaps cleaning all that is a
separate change.

 +  if test -n "$PKG_CONFIG" && $PKG_CONFIG --exists libxml-2.0; then
 +    CPPFLAGS="$CPPFLAGS `$PKG_CONFIG libxml-2.0 --cflags-only-I`"

Under pkg-config, we'll get -I options only, but ...

 +  for pgac_option in `$XML2_CONFIG --cflags`; do
 +    case $pgac_option in
 +      -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";;

... we'll convey both -I and -D options from xml2-config.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] Support for REINDEX CONCURRENTLY

2013-03-01 Thread Fujii Masao
On Sat, Mar 2, 2013 at 2:43 AM, Fujii Masao masao.fu...@gmail.com wrote:
 Fixed in this new patch. Thanks for catching that.

After "make installcheck" finished, I connected to the regression database
and issued "REINDEX DATABASE CONCURRENTLY regression", and then
I got the error:

ERROR:  constraints cannot have index expressions
STATEMENT:  REINDEX DATABASE CONCURRENTLY regression;

OTOH, "REINDEX DATABASE regression" did not generate an error.

Is this a bug?

Regards,

-- 
Fujii Masao




Re: [HACKERS] bugfix: --echo-hidden is not supported by \sf statements

2013-03-01 Thread Pavel Stehule
2013/2/27 Stephen Frost sfr...@snowman.net:
 * Pavel Stehule (pavel.steh...@gmail.com) wrote:
 I don't agree that it works well - that you cannot use short type
 names is a significant issue

 This is for psql.  In what use-case do you see that being a serious
 limitation?

 I might support having psql be able to fall-back to checking if the
 function name is unique (or perhaps doing that first before going on to
 look at the function arguments) but I don't think this should all be
 punted to the backend where only 9.3+ would have any real support for a
 capability which already exists in other places and should be trivially
 added to these.

The functionality can be the same; only the error message will be a
little bit different.




 Thanks,

 Stephen




Re: [HACKERS] Support for REINDEX CONCURRENTLY

2013-03-01 Thread Peter Eisentraut
REINDEX CONCURRENTLY resets the statistics in pg_stat_user_indexes,
whereas plain REINDEX does not.  I think they should be preserved in
either case.
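
An easy way to see the difference (a sketch; the object names are
invented and the REINDEX syntax is the one proposed by this patch):

SELECT idx_scan, idx_tup_read
FROM pg_stat_user_indexes
WHERE indexrelname = 'some_index';

REINDEX TABLE CONCURRENTLY some_table;

-- Re-running the first query: with the current patch the counters come
-- back as zero, whereas after a plain REINDEX they are unchanged.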




Re: [HACKERS] sql_drop Event Trigger

2013-03-01 Thread Alvaro Herrera
Dimitri Fontaine wrote:

 The good news is that the patch to do that has already been sent on this
 list, and got reviewed in detail by Álvaro, who offered incremental
 changes. Version 3 of that patch is to be found in:
 
   http://www.postgresql.org/message-id/m2fw19n1hr@2ndquadrant.fr

Here's a v4 of that patch.  I added support for DROP OWNED, and made
the object name and schema name available to the pg_dropped_objects
function.

Since we're now in agreement that this is the way to go, I think this
only needs a few more tweaks to get to a committable state, as well as
some useful tests and doc changes.  (v3 has docs which I didn't include
here but are probably useful almost verbatim.)

Do we want some more stuff provided by pg_dropped_objects?  We now have
classId, objectId, objectSubId, object name, schema name.  One further
thing I think we need is the object's type, i.e. a simple untranslated
string: "table", "view", "operator" and so on.  AFAICT this requires a
nearly-duplicate of getObjectDescription.
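
For concreteness, the kind of usage this enables (a sketch against the
interface discussed in this thread; the result column names are my
guesses, not a settled API):

CREATE FUNCTION log_drops() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
    obj record;
BEGIN
    FOR obj IN SELECT * FROM pg_dropped_objects()
    LOOP
        RAISE NOTICE 'dropped: classid=% objid=% objsubid=% name=% schema=%',
            obj.classid, obj.objid, obj.objsubid, obj.objname, obj.schemaname;
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER track_drops ON sql_drop
    EXECUTE PROCEDURE log_drops();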

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index d203725..2233158 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -267,6 +267,12 @@ performDeletion(const ObjectAddress *object,
 	{
 		ObjectAddress *thisobj = targetObjects->refs + i;
 
+		if ((!(flags & PERFORM_DELETION_INTERNAL)) &&
+			EventTriggerSupportsObjectType(getObjectClass(thisobj)))
+		{
+			evtrig_sqldrop_add_object(thisobj);
+		}
+
 		deleteOneObject(thisobj, depRel, flags);
 	}
 
@@ -349,6 +355,12 @@ performMultipleDeletions(const ObjectAddresses *objects,
 	{
 		ObjectAddress *thisobj = targetObjects->refs + i;
 
+		if ((!(flags & PERFORM_DELETION_INTERNAL)) &&
+			EventTriggerSupportsObjectType(getObjectClass(thisobj)))
+		{
+			evtrig_sqldrop_add_object(thisobj);
+		}
+
 		deleteOneObject(thisobj, depRel, flags);
 	}
 
@@ -366,6 +378,10 @@ performMultipleDeletions(const ObjectAddresses *objects,
  * This is currently used only to clean out the contents of a schema
  * (namespace): the passed object is a namespace.  We normally want this
  * to be done silently, so there's an option to suppress NOTICE messages.
+ *
+ * Note we don't fire object drop event triggers here; it would be wrong to do
+ * so for the current only use of this function, but if more callers are added
+ * this might need to be reconsidered.
  */
 void
 deleteWhatDependsOn(const ObjectAddress *object,
diff --git a/src/backend/commands/alter.c b/src/backend/commands/alter.c
index 269d19c..347c405 100644
--- a/src/backend/commands/alter.c
+++ b/src/backend/commands/alter.c
@@ -746,58 +746,6 @@ ExecAlterOwnerStmt(AlterOwnerStmt *stmt)
 }
 
 /*
- * Return a copy of the tuple for the object with the given object OID, from
- * the given catalog (which must have been opened by the caller and suitably
- * locked).  NULL is returned if the OID is not found.
- *
- * We try a syscache first, if available.
- *
- * XXX this function seems general in possible usage.  Given sufficient callers
- * elsewhere, we should consider moving it to a more appropriate place.
- */
-static HeapTuple
-get_catalog_object_by_oid(Relation catalog, Oid objectId)
-{
-	HeapTuple	tuple;
-	Oid			classId = RelationGetRelid(catalog);
-	int			oidCacheId = get_object_catcache_oid(classId);
-
-	if (oidCacheId > 0)
-	{
-		tuple = SearchSysCacheCopy1(oidCacheId, ObjectIdGetDatum(objectId));
-		if (!HeapTupleIsValid(tuple))  /* should not happen */
-			return NULL;
-	}
-	else
-	{
-		Oid			oidIndexId = get_object_oid_index(classId);
-		SysScanDesc	scan;
-		ScanKeyData	skey;
-
-		Assert(OidIsValid(oidIndexId));
-
-		ScanKeyInit(&skey,
-	ObjectIdAttributeNumber,
-	BTEqualStrategyNumber, F_OIDEQ,
-	ObjectIdGetDatum(objectId));
-
-		scan = systable_beginscan(catalog, oidIndexId, true,
-								  SnapshotNow, 1, &skey);
-		tuple = systable_getnext(scan);
-		if (!HeapTupleIsValid(tuple))
-		{
-			systable_endscan(scan);
-			return NULL;
-		}
-		tuple = heap_copytuple(tuple);
-
-		systable_endscan(scan);
-	}
-
-	return tuple;

Re: [HACKERS] posix_fadvise missing in the walsender

2013-03-01 Thread Florian Weimer
* Jeff Janes:

 Does the kernel really read a data block from disk into memory in
 order to immediately overwrite it?  I would have thought it would
 optimize that away, at least if the writes are sized and aligned to
 512 or 1024 bytes blocks (which WAL should be).

With Linux, you'd have to use O_DIRECT to get that effect (but don't
do that); otherwise writes happen at page-size granularity, so writing
in 512- or 1024-byte blocks really should trigger a read-modify-write
cycle.
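
For reference, the direct-I/O variant alluded to above looks roughly
like this (a sketch only; alignment requirements are device-dependent,
512 bytes is assumed, error handling is omitted, and per the advice
above you don't actually want to do this):

#define _GNU_SOURCE             /* for O_DIRECT on Linux */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
    void   *buf;
    int     fd = open("walfile", O_WRONLY | O_CREAT | O_DIRECT, 0600);

    /* O_DIRECT requires aligned memory, offset and length */
    posix_memalign(&buf, 512, 512);
    memset(buf, 0, 512);

    /* bypasses the page cache, so no kernel read-modify-write cycle */
    pwrite(fd, buf, 512, 0);

    close(fd);
    free(buf);
    return 0;
}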




Re: [HACKERS] Memory leakage associated with plperl spi_prepare/spi_freeplan

2013-03-01 Thread Tom Lane
Alex Hunsaker bada...@gmail.com writes:
 On Tue, Feb 26, 2013 at 3:56 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm inclined to think the right fix is to make a small memory context
 for each prepared plan made by plperl_spi_prepare().  The qdesc for it
 could be made right in the context (getting rid of the unchecked
 malloc's near the top of the function), the FmgrInfos and their
 subsidiary data could live there too, and plperl_spi_freeplan could
 replace its retail free's with a single MemoryContextDelete.

 Seemed fairly trivial, find the above approach in the attached.

Applied with some fixes.
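
For the archives, the shape of the fix (a sketch of the approach, not
the committed diff; the context name and size constants are my
assumptions):

	MemoryContext plan_cxt;
	plperl_query_desc *qdesc;

	/* One small memory context per prepared plan. */
	plan_cxt = AllocSetContextCreate(TopMemoryContext,
									 "PL/Perl spi_prepare query",
									 ALLOCSET_SMALL_MINSIZE,
									 ALLOCSET_SMALL_INITSIZE,
									 ALLOCSET_SMALL_MAXSIZE);
	qdesc = (plperl_query_desc *)
		MemoryContextAllocZero(plan_cxt, sizeof(plperl_query_desc));
	qdesc->plan_cxt = plan_cxt;
	/* ... the FmgrInfos and other subsidiary data also live in plan_cxt ... */

	/* plperl_spi_freeplan() then replaces its retail free's with: */
	MemoryContextDelete(qdesc->plan_cxt);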

 I added a
 CHECK_FOR_INTERRUPTS() at the top of plperl_spi_prepare(); it was fairly
 annoying that I couldn't ctrl+c my way out of a test function.

Good idea, but it wasn't safe where it was --- needs to be inside the
PG_TRY(), so as to convert from postgres to perl error handling.

 One annoyance is that it still leaks :-(.

I fixed that, at least for the function-lifespan leakage from
spi_prepare() --- is that what you meant?

 It would be nice to squish the other leaks due to perm_fmgr_info()... but
 this is a start.

Yeah, I'm sure there's more left to do in the area --- but at least you
can create and free plans without it leaking.

regards, tom lane




Re: [HACKERS] Memory leakage associated with plperl spi_prepare/spi_freeplan

2013-03-01 Thread Alex Hunsaker
On Fri, Mar 1, 2013 at 7:38 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Alex Hunsaker bada...@gmail.com writes:

 Applied with some fixes.

Thanks! Your version looks much better than mine.

  One annoyance is that it still leaks :-(.

 I fixed that, at least for the function-lifespan leakage from
 spi_prepare() --- is that what you meant?


Yep, I don't see the leak with your version.




Re: [HACKERS] Commitfest progress

2013-03-01 Thread Craig Ringer
On 03/02/2013 02:36 AM, Josh Berkus wrote:
 As I stepped up to work on the CF and then became immediately swamped in
 other work I bear some of the responsibility for not keeping things
 rolling.
 Just FYI, this is exactly why I wanted a 2nd CF manager for this CF.

 It'd be really good if anyone with a patch in the CF could follow up
 with an opinion on its readiness and, if appropriate, push it to the
 next CF. I'll be going through the patch list and following up with my
 own summaries, but I remain pretty swamped so it'll be a trickle rather
 than a flood.
 I'll try to find time this weekend to go through the pending patches and
 check status.  I'll start at the bottom of the list and you can start at
 the top.
Works for me, since I haven't been able to find time for it during the
week.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


