Re: [HACKERS] RFC: Making TRUNCATE more "MVCC-safe"

2012-03-04 Thread Simon Riggs
On Sun, Mar 4, 2012 at 2:28 AM, Marti Raudsepp  wrote:
> On Sat, Mar 3, 2012 at 14:53, Simon Riggs  wrote:
>> Thanks Noah for drawing attention to this thread. I hadn't been
>> watching. As you say, this work would allow me to freeze rows at load
>> time and avoid the overhead of hint bit setting, which avoids
>> performance issues from hint bit setting in checksum patch.
>>
>> I've reviewed this patch and it seems OK to me. Good work Marti.
>
> Thanks! This approach wasn't universally liked, but if it gets us
> tangible benefits (COPY with frozenxid) then I guess it's a reason to
> reconsider.

Comments I see support this idea. If we did this purely for truncate
correctness I probably wouldn't care either, which is why I didn't
read this thread in the first place.
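
To recap the anomaly being fixed, a minimal two-session sketch (the
StrictMVCC behaviour is per the patches below; exact error wording may
differ):

-- session 1
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT count(*) FROM t;   -- takes an MVCC snapshot

-- session 2
TRUNCATE t;               -- commits, swapping in an empty relfilenode

-- session 1, same snapshot
SELECT count(*) FROM t;
-- unpatched: returns 0, silently violating the snapshot
-- patched, with StrictMVCC on: a serialization failure is raised instead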

>> v2 patch attached, updated to latest HEAD. Patch adds
>> * relvalidxid is reset by VACUUM/ANALYZE to avoid wraparound failure
>
> Personally I'd rather keep this out of ANALYZE -- since its purpose is
> to collect stats; VACUUM is responsible for correctness (xid
> wraparound etc). But I don't feel strongly about this.

If there were a reason to do it, then I would agree. Later you point
out a reason, so I will make this change.

> A more important consideration is how this interacts with hot standby.
> Currently you compare OldestXmin to relvalidxmin to decide when to
> reset it. But the standby's OldestXmin may be older than the master's.
> (If VACUUM removes rows then this causes a recovery conflict, but
> AFAICT that won't happen if only relvalidxmin changes)

As of 9.1, the standby's oldestxmin is incorporated into the master's
via hot_standby_feedback, so it wouldn't typically be a problem these
days. It's possible that the standby has this set off by choice, in
which case anomalies could exist in the case that a VACUUM doesn't
clean any rows, as you say.

So we'll use vacrelstats->latestRemovedXid instead of OldestXmin when we
call vac_update_relstats(), which means ANALYZE should always pass
InvalidTransactionId.

> It might be more robust to wait until relfrozenxid exceeds
> relvalidxmin -- by then, recovery conflict mechanisms will have taken
> care of killing all older snapshots, or am I mistaken?

It would be better to set it as early as possible to reduce the cost
of the test in heap_beginscan().

> And a few typos the code...
>
> + gettext_noop("When enabled viewing a truncated or newly created table "
> + "will throw a serialization error to prevent MVCC "
> + "discrepancies that were possible priot to 9.2.")
>
> "prior" not "priot"

Yep

> + * Reset relvalidxmin if its old enough
>
> Should be "it's" in this context.

Cool, thanks for the review.

v3 attached.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index c910863..4387f49 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -73,7 +73,7 @@
 
 /* GUC variable */
 bool		synchronize_seqscans = true;
-
+bool		StrictMVCC = true;
 
 static HeapScanDesc heap_beginscan_internal(Relation relation,
 		Snapshot snapshot,
@@ -1175,6 +1175,24 @@ heap_beginscan_internal(Relation relation, Snapshot snapshot,
 	HeapScanDesc scan;
 
 	/*
+	 * If the snapshot is older than relvalidxmin, then either the table
+	 * has only recently been created or a TRUNCATE has removed data
+	 * that we should still be able to see. Either way, we aren't allowed
+	 * to view the rows in StrictMVCC mode.
+	 */
+	if (StrictMVCC &&
+		TransactionIdIsValid(relation->rd_rel->relvalidxmin) &&
+		TransactionIdIsValid(snapshot->xmax) &&
+		NormalTransactionIdPrecedes(snapshot->xmax, relation->rd_rel->relvalidxmin))
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+				 errmsg("canceling statement due to conflict with TRUNCATE TABLE on %s",
+						NameStr(relation->rd_rel->relname)),
+				 errdetail("User query attempting to see row versions that have been removed.")));
+	}
+
+	/*
 	 * increment relation ref count while scanning relation
 	 *
 	 * This is just to make really sure the relcache entry won't go away while
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index aef410a..3f9bd5d 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -787,6 +787,7 @@ InsertPgClassTuple(Relation pg_class_desc,
 	values[Anum_pg_class_relhastriggers - 1] = BoolGetDatum(rd_rel->relhastriggers);
 	values[Anum_pg_class_relhassubclass - 1] = BoolGetDatum(rd_rel->relhassubclass);
 	values[Anum_pg_class_relfrozenxid - 1] = TransactionIdGetDatum(rd_rel->relfrozenxid);
+	values[Anum_pg_class_relvalidxmin - 1] = TransactionIdGetDatum(rd_rel->relvalidxmin);
 	if (relacl != (Datum) 0)
 		values[Anum_pg_class_relacl - 1] = relacl;
 	else
@@ -884,6 +885,7 @@ AddNewRelationTuple(Relation pg_class_desc,
 		new_rel_reltup->relfrozenxid = InvalidTransactionId;
 	}
 
+	new_rel_rel

Re: [HACKERS] xlog location arithmetic

2012-03-04 Thread Magnus Hagander
On Sun, Feb 26, 2012 at 00:53, Euler Taveira de Oliveira
 wrote:
> On 25-02-2012 09:23, Magnus Hagander wrote:
>> Do we even *need* the validate_xlog_location() function? If we just
>> remove those calls, won't we still catch all the incorrectly formatted
>> ones in the errors of the sscanf() calls? Or am I too deep into
>> weekend-mode and missing something obvious?
>>
> sscanf() is too fragile for input sanity check. Try
> pg_xlog_location_diff('12/3', '-10/0'), for example. I won't object removing
> that function if you protect xlog location input from silly users.

Ah, good point. No, that's the reason I was missing :-)

Patch applied, thanks!
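
For the record, the failure mode being guarded against looks like this
(Euler's example from above; the post-patch error wording is assumed,
not quoted):

SELECT pg_xlog_location_diff('12/3', '-10/0');
-- with sscanf() alone, the '-10/0' would be silently misparsed;
-- with validate_xlog_location() kept, it raises an invalid-input error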

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] xlog location arithmetic

2012-03-04 Thread Magnus Hagander
On Tue, Feb 28, 2012 at 07:21, Fujii Masao  wrote:
> On Sun, Feb 26, 2012 at 8:53 AM, Euler Taveira de Oliveira
>  wrote:
>> On 25-02-2012 09:23, Magnus Hagander wrote:
>>> Do we even *need* the validate_xlog_location() function? If we just
>>> remove those calls, won't we still catch all the incorrectly formatted
>>> ones in the errors of the sscanf() calls? Or am I too deep into
>>> weekend-mode and missing something obvious?
>>>
>> sscanf() is too fragile for input sanity check. Try
>> pg_xlog_location_diff('12/3', '-10/0'), for example. I won't object removing
>> that function if you protect xlog location input from silly users.
>
> After this patch will have been committed, it would be better to change
> pg_xlogfile_name() and pg_xlogfile_name_offset() so that they use
> the validate_xlog_location() function to validate the input.

And I've done this part as well.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Our regex vs. POSIX on "longest match"

2012-03-04 Thread Brendan Jurd
On 4 March 2012 17:53, Tom Lane  wrote:
> Brendan Jurd  writes:
>> I'll admit that this is a pretty obscure point, but we do appear to be
>> in direct violation of POSIX here.
>
> How so?  POSIX doesn't contain any non-greedy constructs.  If you use
> only the POSIX-compatible greedy constructs, the behavior is compliant.

While it's true that POSIX doesn't contemplate non-greed, after
reading the spec I would have expected an expression *as a whole* to
still prefer the longest match, insofar as that is possible while
respecting non-greedy particles.  I just ran some quick experiments in
Perl, Python and PHP, using 'xy1234' ~ 'y*?\d+'.  All returned
'y1234', which to my mind is the most obvious answer, and one which
still makes sense in light of what POSIX has to say.  Whereas Postgres
(and presumably Tcl) return 'y1'.

What I found surprising here is that our implementation allows an
earlier quantifier to clobber the greediness of a later quantifier.
There's a certain disregard for the intentions of the author of the
regex.  If I had wanted the whole expression to be non-greedy, I could
have written 'y*?\d+?'.  But since I wrote 'y*?\d+', it is clear that
I meant for the first atom to be non-greedy, and for the second to be
greedy.
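
In SQL, the comparison looks like this (results as reported above):

SELECT substring('xy1234' from 'y*?\d+');
-- PostgreSQL (Spencer/Tcl engine): 'y1'  -- the leading non-greedy
--                                           quantifier makes the whole
--                                           RE non-greedy
-- Perl, Python, PHP equivalent:    'y1234'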

> The issue that is obscure is, once you define some non-greedy
> constructs, how to define how they should act in combination with greedy
> ones.  I'm not sure to what extent the engine's behavior is driven by
> implementation restrictions and to what extent it's really the sanest
> behavior Henry could think of.  I found a comment from him about it:
> http://groups.google.com/group/comp.lang.tcl/msg/c493317cc0d10d50
> but it's short on details as to what alternatives he considered.

Thanks for the link; it is always good to get more insight into
Henry's approach.  I'm beginning to think I should just start reading
everything he ever posted to comp.lang.tcl ...

Cheers,
BJ



Re: [HACKERS] RFC: Making TRUNCATE more "MVCC-safe"

2012-03-04 Thread Simon Riggs
On Sun, Mar 4, 2012 at 9:59 AM, Simon Riggs  wrote:
> On Sun, Mar 4, 2012 at 2:28 AM, Marti Raudsepp  wrote:
>> On Sat, Mar 3, 2012 at 14:53, Simon Riggs  wrote:
>>> Thanks Noah for drawing attention to this thread. I hadn't been
>>> watching. As you say, this work would allow me to freeze rows at load
>>> time and avoid the overhead of hint bit setting, which avoids
>>> performance issues from hint bit setting in checksum patch.
>>>
>>> I've reviewed this patch and it seems OK to me. Good work Marti.

...

> v3 attached.

More detailed thoughts show that the test in heap_beginscan_internal()
is not right enough, i.e. wrong.

We need a specific XidInMVCCSnapshot test on the relvalidxid, so it
needs to be a specific xid, not an xmin because otherwise we can get
concurrent transactions failing, not just older transactions.

If we're going to freeze tuples on load this needs to be watertight, so
some minor rework is needed.

Of course, if we only have a valid xid on the class we might get new
columns added when we do repeated SELECT * statements using the same
snapshot while concurrent DDL occurs. That is impractical, so if we
define this as applying to rows it can work; if we want it to apply to
everything it's getting more difficult.

Longer term we might fix this by making all catalog access use MVCC,
but that suffers the same problems with ALTER TABLEs that rewrite rows
to add columns. I don't see a neat future solution that is worth
waiting for.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Collect frequency statistics for arrays

2012-03-04 Thread Alexander Korotkov
On Sun, Mar 4, 2012 at 5:38 AM, Tom Lane  wrote:

> 1. I'm still unhappy about the loop that fills the count histogram,
> as I noted earlier today.  It at least needs a decent comment and some
> overflow protection, and I'm not entirely convinced that it doesn't have
> more bugs than the overflow issue.
>

Attached patch is focused on fixing this. The "frac" variable overflow is
avoided by making it int64. I hope the comments clarify things. In
general this loop copies the behaviour of the histogram-constructing loop
of the compute_scalar_stats function, but instead of an array of values we
have an array of unique DECs and their frequencies.

--
With best regards,
Alexander Korotkov.
*** a/src/backend/utils/adt/array_typanalyze.c
--- b/src/backend/utils/adt/array_typanalyze.c
***
*** 581,587  compute_array_stats(VacAttrStats *stats, AnalyzeAttrFetchFunc fetchfunc,
  			DECountItem **sorted_count_items;
  			int			count_item_index;
  			int			delta;
! 			int			frac;
  			float4	   *hist;
  
  			/* num_hist must be at least 2 for the loop below to work */
--- 581,587 
  			DECountItem **sorted_count_items;
  			int			count_item_index;
  			int			delta;
! 			int64		frac;
  			float4	   *hist;
  
  			/* num_hist must be at least 2 for the loop below to work */
***
*** 612,633  compute_array_stats(VacAttrStats *stats, AnalyzeAttrFetchFunc fetchfunc,
  			hist[num_hist] = (double) element_no / (double) nonnull_cnt;
  
  			/*
! 			 * Construct the histogram.
! 			 *
! 			 * XXX this needs work: frac could overflow, and it's not clear
! 			 * how or why the code works.  Even if it does work, it needs
! 			 * documented.
  			 */
  			delta = analyzed_rows - 1;
  			count_item_index = 0;
! 			frac = sorted_count_items[0]->frequency * (num_hist - 1);
  			for (i = 0; i < num_hist; i++)
  			{
  while (frac <= 0)
  {
  	count_item_index++;
  	Assert(count_item_index < count_items_count);
! 	frac += sorted_count_items[count_item_index]->frequency * (num_hist - 1);
  }
  hist[i] = sorted_count_items[count_item_index]->count;
  frac -= delta;
--- 612,642 
  			hist[num_hist] = (double) element_no / (double) nonnull_cnt;
  
  			/*
! 			 * Construct the histogram of DECs.  The object of this loop is
! 			 * to copy the minimum and maximum DECs, plus evenly-spaced DECs
! 			 * in between ("space" here being the number of arrays that
! 			 * correspond to a DEC).  If we imagine an ordered array of DECs
! 			 * in which each input array has a corresponding DEC item, the
! 			 * i'th histogram value would be
! 			 * DECs[i * (analyzed_rows - 1) / (num_hist - 1)].  Instead of
! 			 * such an array we have sorted_count_items, which holds the
! 			 * unique DEC values with their frequencies.  Think of "frac" as
! 			 * (index in DECs of the next sorted_count_items element - index
! 			 * in DECs of the last histogram value) * (num_hist - 1); a
! 			 * non-positive fraction tells us to advance sorted_count_items.
  			 */
  			delta = analyzed_rows - 1;
  			count_item_index = 0;
! 			frac = (int64)sorted_count_items[0]->frequency * 
!    (int64)(num_hist - 1);
  			for (i = 0; i < num_hist; i++)
  			{
  while (frac <= 0)
  {
  	count_item_index++;
  	Assert(count_item_index < count_items_count);
! 	frac += (int64)sorted_count_items[count_item_index]->frequency * 
! 		(int64)(num_hist - 1);
  }
  hist[i] = sorted_count_items[count_item_index]->count;
  frac -= delta;



Re: [HACKERS] Patch: improve selectivity estimation for IN/NOT IN

2012-03-04 Thread Euler Taveira de Oliveira
On 04-03-2012 00:20, Daniele Varrazzo wrote:
> It looks like you have grand plans for array estimation. My patch has
> a much more modest scope, and I'm hoping it could be applied to
> currently maintained PG versions, as I consider the currently produced
> estimations a bug.
> 
We don't normally add new features to stable branches unless it is a bug. In
the optimizer case, planner regression is a bug (that this case is not).


-- 
   Euler Taveira de Oliveira - Timbira   http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento



Re: [HACKERS] Collect frequency statistics for arrays

2012-03-04 Thread Alexander Korotkov
On Sun, Mar 4, 2012 at 5:38 AM, Tom Lane  wrote:

> 2. The tests in the above-mentioned message show that in most cases
> where mcelem_array_contained_selec falls through to the "rough
> estimate", the resulting rowcount estimate is just 1, ie we are coming
> out with very small selectivities.  Although that path will now only be
> taken when there are no stats, it seems like we'd be better off to
> return DEFAULT_CONTAIN_SEL instead of what it's doing.  I think there
> must be something wrong with the "rough estimate" logic.  Could you
> recheck that?
>

I think the problem with the "rough estimate" is that the assumption of
independent item occurrences is very unsuitable even for a rough
estimate. The following example shows that the "rough estimate" really
does work when item occurrences are independent.

Generate a test table where item occurrences really are independent.

test=# create table test as select ('{'||(select string_agg(s,',') from
(select case when (t*0 + random()) < 0.1 then i::text else null end from
generate_series(1,100) i) as x(s))||'}')::int[] AS val  from
generate_series(1,10000) t;

SELECT 10000

test=# analyze test;
ANALYZE

Do some test.

test=# explain analyze select * from test where val <@
array[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60];

  QUERY PLAN


--
 Seq Scan on test  (cost=0.00..239.00 rows=151 width=61) (actual
time=0.325..32.556 rows=163 loops=1)
   Filter: (val <@
'{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60}'::integer[])
   Rows Removed by Filter: 9837
 Total runtime: 32.806 ms
(4 rows)

Delete DECHIST statistics.

test=# update pg_statistic set stakind1 = 0, staop1 = 0, stanumbers1 =
null, stavalues1 = null where starelid = (select oid from pg_class where
relname = 'test') and stakind1 = 5;
UPDATE 0
test=# update pg_statistic set stakind2 = 0, staop2 = 0, stanumbers2 =
null, stavalues2 = null where starelid = (select oid from pg_class where
relname = 'test') and stakind2 = 5;
UPDATE 0
test=# update pg_statistic set stakind3 = 0, staop3 = 0, stanumbers3 =
null, stavalues3 = null where starelid = (select oid from pg_class where
relname = 'test') and stakind3 = 5;
UPDATE 0
test=# update pg_statistic set stakind4 = 0, staop4 = 0, stanumbers4 =
null, stavalues4 = null where starelid = (select oid from pg_class where
relname = 'test') and stakind4 = 5;
UPDATE 1
test=# update pg_statistic set stakind5 = 0, staop5 = 0, stanumbers5 =
null, stavalues5 = null where starelid = (select oid from pg_class where
relname = 'test') and stakind5 = 5;
UPDATE 0

Do another test.

test=# explain analyze select * from test where val <@
array[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60];

  QUERY PLAN


--
 Seq Scan on test  (cost=0.00..239.00 rows=148 width=61) (actual
time=0.332..32.952 rows=163 loops=1)
   Filter: (val <@
'{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60}'::integer[])
   Rows Removed by Filter: 9837
 Total runtime: 33.225 ms
(4 rows)

In this particular case the "rough estimate" is quite accurate, but in
most cases it behaves really badly. That is why I started to invent
calc_distr etc. So I think returning DEFAULT_CONTAIN_SEL is OK unless
we have some better ideas.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Patch: improve selectivity estimation for IN/NOT IN

2012-03-04 Thread Daniele Varrazzo
On Sun, Mar 4, 2012 at 1:44 PM, Euler Taveira de Oliveira
 wrote:
> On 04-03-2012 00:20, Daniele Varrazzo wrote:
>> It looks like you have grand plans for array estimation. My patch has
>> a much more modest scope, and I'm hoping it could be applied to
>> currently maintained PG versions, as I consider the currently produced
>> estimations a bug.
>>
> We don't normally add new features to stable branches unless it is a bug. In
> the optimizer case, planner regression is a bug (that this case is not).

Please note that we are talking about planning errors leading to
estimates of records in the millions instead of in the units, in
queries where all the elements are known (most common elements, with
the right stats, included in the query), even with a partial index
whose size could never physically contain all the estimated rows,
screaming "something broken here!". So it's not a regression, as the
computation has always been broken, but I think it can hardly be
considered not a bug.

OTOH, I expect the decision to leave things as they are could be
taken on the basis of the possibility that some program may stop
working in reaction to an altered query plan: I'm not going to argue
about this, although I feel it unlikely.

-- Daniele



Re: [HACKERS] ECPG FETCH readahead

2012-03-04 Thread Boszormenyi Zoltan
Hi,

first, thank you for answering and for the review.

On 2012-03-02 17:41, Noah Misch wrote:
> On Thu, Dec 29, 2011 at 10:46:23AM +0100, Boszormenyi Zoltan wrote:
>> On 2011-11-16 20:51, Boszormenyi Zoltan wrote:
>>> On 2010-10-14 11:56, Boszormenyi Zoltan wrote:
> On Thu, Jun 24, 2010 at 8:19 AM, Michael Meskes  
> wrote:
>> On Thu, Jun 24, 2010 at 12:04:30PM +0300, Heikki Linnakangas wrote:
>>> Is there a reason not to enable it by default? I'm a bit worried
>>> that it will receive no testing if it's not always on.
>>>   
>> Yes, there is a reason, namely that you don't need it in native mode at 
>> all.
>> ECPG can read as many records as you want in one go, something ESQL/C
>> apparently cannot do. This patch is a workaround for that restriction. I 
>> still
>> do not really see that this feature gives us an advantage in native mode
>> though.
> We yet lack a consensus on whether native ECPG apps should have access to the
> feature.

I don't even remember any opinion on this matter.
So, at this point I don't know whether it's lack of interest.
We also have a saying, "silence means agreement". :-)

>   My 2c is to make it available, because it's useful syntactic sugar.

Thanks, we thought the same.

> If your program independently processes each row of an arbitrary-length result
> set, current facilities force you to add an extra outer loop to batch the
> FETCHes for every such code site.  Applications could define macros to
> abstract that pattern, but this seems common-enough to justify bespoke
> handling.

We have similar opinions.

>   Besides, minimalists already use libpq directly.

Indeed. On the other hand, ECPG provides a safety net with syntax checking,
so it's useful for non-minimalist types. :-)

> I suggest enabling the feature by default but drastically reducing the default
> readahead chunk size from 256 to, say, 5.  That still reduces the FETCH round
> trip overhead by 80%, but it's small enough not to attract pathological
> behavior on a workload where each row is a 10 MiB document.

I see. How about 8? A nice "round" power-of-2 value, still small, and it
avoids 87.5% of the overhead.
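
(To spell out the round-trip arithmetic: a chunk of N rows turns N FETCH
round trips into one, saving a fraction 1 - 1/N of them, so 5 saves 80%,
8 saves 87.5%, and the old default of 256 would save ~99.6%.)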

BTW, the feature defaulted to disabled in order to avoid "make check"
breakage; see below.

>   I would not offer
> an ecpg-time option to disable the feature per se.  Instead, let the user set
> the default chunk size at ecpg time.  A setting of 1 effectively disables the
> feature, though one could later re-enable it with ECPGFETCHSZ.

This means all code previously going through ECPGdo() would go through
ECPGopen()/ECPGfetch()/ECPGclose(). This is more intrusive, and all
regression tests that were only testing certain features would also
test the readahead feature.

Also, the test for WHERE CURRENT OF at ecpg time would have to be done
at runtime, possibly making previously working code fail if ECPGFETCHSZ is 
enabled.

How about still allowing "NO READAHEAD" cursors that compile into plain
ECPGdo()? This way, ECPGFETCHSZ doesn't interfere with WHERE CURRENT OF.
But this would mean code changes everywhere WHERE CURRENT OF is used.

Or how about a new feature in the backend, so ECPG can do
UPDATE/DELETE ... WHERE OFFSET N OF cursor
with the offset computed from the actual cursor position and the position
known by the application? This way an app can use readahead and still work
on rows collected by the cursor with WHERE CURRENT OF, which gets converted
to WHERE OFFSET N OF behind the scenes.
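
A sketch of what that might look like, with syntax invented purely for
illustration (nothing like this exists in the backend today):

UPDATE tab SET col = 1 WHERE OFFSET -3 OF mycur;
-- hypothetical: "-3" meaning three rows behind where the server-side
-- cursor actually is, computed by ECPG from its readahead buffer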

>
>>> The ASAP took a little long. The attached patch is in git diff format,
>>> because (1) the regression test intentionally doesn't do ECPGdebug()
>>> so the patch isn't dominated by a 2MB stderr file, so this file is empty
>>> and (2) regular diff cannot cope with empty new files.
> Avoid the empty file with fprintf(STDERR, "Dummy non-empty error output\n");

Fixed.

>
>>> - *NEW FEATURE* Readahead can be individually enabled or disabled
>>>   by ECPG-side grammar:
>>> DECLARE curname [ [ NO ] READAHEAD ] CURSOR FOR ...
>>>   Without [ NO ] READAHEAD, the default behaviour is used for cursors.
> Likewise, this may as well take a chunk size rather than a yes/no.

Done.

> The patch adds warnings:
> error.c: In function `ecpg_raise':
> error.c:281: warning: format not a string literal and no format arguments
> error.c:281: warning: format not a string literal and no format arguments

Fixed.

> The patch adds few comments and no larger comments explaining its higher-level
> ideas.  That makes it much harder to review.  In this regard it follows the
> tradition of the ECPG code, but let's depart from that tradition for new work.
> I mention a few cases below where the need for commentary is acute.

Understood. Adding comments as I go over that code again.

> I tested full reads of various "SELECT * FROM generate_series(1, $1)" commands
> over a 50ms link, and the patch gives a sound performance improvement.  With
> no reada

Re: [HACKERS] Patch review for logging hooks (CF 2012-01)

2012-03-04 Thread Andrew Dunstan



On 01/20/2012 10:01 AM, Robert Haas wrote:
> On Fri, Jan 20, 2012 at 2:47 AM, Greg Smith  wrote:
>>> The updated patch looks good, marking as 'Ready for Committer'
>>
>> Patches without documentation are never ready for commit.  For this one, I'm
>> not sure if that should be in the form of a reference example in contrib, or
>> just something that documents that the hook exists and what the ground rules
>> are for grabbing it.
>
> Hooks are frequently not documented, and we only sometimes even bother
> to include an example in contrib.  We should probably at least have a
> working example for testing purposes, though, whether or not we end up
> committing it.



I'm just looking at this patch, and I agree, it should be testable. I'm 
wondering if it wouldn't be a good idea to have a module or set of 
modules for demonstrating and testing bits of the API that we expose. 
src/test/api or something similar? I'm not sure how we'd automate a test 
for this case, though. I guess we could use something like pg_logforward 
and have a UDP receiver catch the messages and write them to a file. 
Something like that should be possible to rig up in Perl. But all that 
seems a lot of work at this stage of the game. So the question is do we 
want to commit this patch without it?


cheers

andrew



Re: [HACKERS] ECPG FETCH readahead

2012-03-04 Thread Michael Meskes
On Fri, Mar 02, 2012 at 11:41:05AM -0500, Noah Misch wrote:
> We yet lack a consensus on whether native ECPG apps should have access to the
> feature.  My 2c is to make it available, because it's useful syntactic sugar.
> If your program independently processes each row of an arbitrary-length result
> set, current facilities force you to add an extra outer loop to batch the
> FETCHes for every such code site.  Applications could define macros to
> abstract that pattern, but this seems common-enough to justify bespoke
> handling.  Besides, minimalists already use libpq directly.

Sorry, I don't really understand what you're saying here. The program logic
won't change at all when using this feature, or what am I misunderstanding?

> I suggest enabling the feature by default but drastically reducing the default
> readahead chunk size from 256 to, say, 5.  That still reduces the FETCH round
> trip overhead by 80%, but it's small enough not to attract pathological
> behavior on a workload where each row is a 10 MiB document.  I would not offer
> an ecpg-time option to disable the feature per se.  Instead, let the user set
> the default chunk size at ecpg time.  A setting of 1 effectively disables the
> feature, though one could later re-enable it with ECPGFETCHSZ.

Using 1 to effectively disable the feature is fine with me, but I strongly
object to enabling this feature by default. It's fairly easy to create cases
with pathological behaviour, and this feature is not standard by any means. I
figure a normal programmer would expect only one row being transferred when
fetching one.

Other than that, thanks for the great review.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
Jabber: michael.meskes at googlemail dot com
VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL



Re: [HACKERS] ECPG FETCH readahead

2012-03-04 Thread Boszormenyi Zoltan
On 2012-03-04 17:16, Michael Meskes wrote:
> On Fri, Mar 02, 2012 at 11:41:05AM -0500, Noah Misch wrote:
>> We yet lack a consensus on whether native ECPG apps should have access to the
>> feature.  My 2c is to make it available, because it's useful syntactic sugar.
>> If your program independently processes each row of an arbitrary-length 
>> result
>> set, current facilities force you to add an extra outer loop to batch the
>> FETCHes for every such code site.  Applications could define macros to
>> abstract that pattern, but this seems common-enough to justify bespoke
>> handling.  Besides, minimalists already use libpq directly.
> Sorry, I don't really understand what you're saying here. The program logic
> won't change at all when using this feature or what do I misunderstand?

The program logic shouldn't change at all. He meant that extra coding effort
is needed if you want manual caching. It requires 2 loops instead of 1 if you 
use
FETCH N (N>1).

>
>> I suggest enabling the feature by default but drastically reducing the 
>> default
>> readahead chunk size from 256 to, say, 5.  That still reduces the FETCH round
>> trip overhead by 80%, but it's small enough not to attract pathological
>> behavior on a workload where each row is a 10 MiB document.  I would not 
>> offer
>> an ecpg-time option to disable the feature per se.  Instead, let the user set
>> the default chunk size at ecpg time.  A setting of 1 effectively disables the
>> feature, though one could later re-enable it with ECPGFETCHSZ.
> Using 1 to effectively disable the feature is fine with me,

Something else would be needed. For DML with WHERE CURRENT OF cursor,
the feature should stay disabled even when the environment variable is set,
without adding any decoration to the DECLARE statement. The safe thing
would be to distinguish between uncached (but cacheable) and uncacheable
cases. A single value cannot work.

>  but I strongly
> object to enabling this feature by default. It's fairly easy to create
> cases with pathological behaviour, and this feature is not standard by any
> means. I figure a normal programmer would expect only one row being
> transferred when fetching one.
>
> Other than that, thanks for the great review.
>
> Michael

Best regards,
Zoltán Böszörményi

-- 
--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/




Re: [HACKERS] RFC: Making TRUNCATE more "MVCC-safe"

2012-03-04 Thread Simon Riggs
On Sun, Mar 4, 2012 at 1:02 PM, Simon Riggs  wrote:
> On Sun, Mar 4, 2012 at 9:59 AM, Simon Riggs  wrote:
>> On Sun, Mar 4, 2012 at 2:28 AM, Marti Raudsepp  wrote:
>>> On Sat, Mar 3, 2012 at 14:53, Simon Riggs  wrote:
 Thanks Noah for drawing attention to this thread. I hadn't been
 watching. As you say, this work would allow me to freeze rows at load
 time and avoid the overhead of hint bit setting, which avoids
 performance issues from hint bit setting in checksum patch.

 I've reviewed this patch and it seems OK to me. Good work Marti.
>
> ...
>
>> v3 attached.
>
> More detailed thoughts show that the test in heap_beginscan_internal()
> is not right enough, i.e. wrong.
>
> We need a specific XidInMVCCSnapshot test on the relvalidxid, so it
> needs to be a specific xid, not an xmin because otherwise we can get
> concurrent transactions failing, not just older transactions.

Marti, please review this latest version which has new isolation tests added.

This does both TRUNCATE and CREATE TABLE.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index c910863..3d90125 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -73,7 +73,7 @@
 
 /* GUC variable */
 bool		synchronize_seqscans = true;
-
+bool		StrictMVCC = true;
 
 static HeapScanDesc heap_beginscan_internal(Relation relation,
 		Snapshot snapshot,
@@ -1175,6 +1175,24 @@ heap_beginscan_internal(Relation relation, Snapshot snapshot,
 	HeapScanDesc scan;
 
 	/*
+	 * If the snapshot can't see relvalidxid then that means either the table
+	 * is newly created or has recently been truncated. Either way, we aren't
+	 * allowed to view the rows in StrictMVCC mode.
+	 */
+	if (IsMVCCSnapshot(snapshot) &&
+		StrictMVCC &&
+		XidInMVCCSnapshot(relation->rd_rel->relvalidxid, snapshot))
+	{
+		/* Unless we created or truncated the table recently ourselves */
+		if (relation->rd_createSubid == InvalidSubTransactionId &&
+			relation->rd_newRelfilenodeSubid == InvalidSubTransactionId)
+			ereport(ERROR,
+					(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+					 errmsg("cannot access recently added or recently removed data in %s",
+							NameStr(relation->rd_rel->relname))));
+	}
+
+	/*
 	 * increment relation ref count while scanning relation
 	 *
 	 * This is just to make really sure the relcache entry won't go away while
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index aef410a..eb93a7c 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -787,6 +787,7 @@ InsertPgClassTuple(Relation pg_class_desc,
 	values[Anum_pg_class_relhastriggers - 1] = BoolGetDatum(rd_rel->relhastriggers);
 	values[Anum_pg_class_relhassubclass - 1] = BoolGetDatum(rd_rel->relhassubclass);
 	values[Anum_pg_class_relfrozenxid - 1] = TransactionIdGetDatum(rd_rel->relfrozenxid);
+	values[Anum_pg_class_relvalidxid - 1] = TransactionIdGetDatum(rd_rel->relvalidxid);
 	if (relacl != (Datum) 0)
 		values[Anum_pg_class_relacl - 1] = relacl;
 	else
@@ -872,6 +873,7 @@ AddNewRelationTuple(Relation pg_class_desc,
 		 * that will do.
 		 */
 		new_rel_reltup->relfrozenxid = RecentXmin;
+		new_rel_reltup->relvalidxid = GetCurrentTransactionId();
 	}
 	else
 	{
@@ -882,6 +884,7 @@ AddNewRelationTuple(Relation pg_class_desc,
 		 * commands/sequence.c.)
 		 */
 		new_rel_reltup->relfrozenxid = InvalidTransactionId;
+		new_rel_reltup->relvalidxid = InvalidTransactionId;
 	}
 
 	new_rel_reltup->relowner = relowner;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index bfbe642..0d96a6a 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -2854,7 +2854,7 @@ reindex_index(Oid indexId, bool skip_constraint_checks)
 		}
 
 		/* We'll build a new physical relation for the index */
-		RelationSetNewRelfilenode(iRel, InvalidTransactionId);
+		RelationSetNewRelfilenode(iRel, InvalidTransactionId, InvalidTransactionId);
 
 		/* Initialize the index and rebuild */
 		/* Note: we do not need to re-establish pkey setting */
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index b40e57b..44b891c 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -538,6 +538,7 @@ do_analyze_rel(Relation onerel, VacuumStmt *vacstmt, bool inh)
 			totalrows,
 			visibilitymap_count(onerel),
 			hasindex,
+			InvalidTransactionId,
 			InvalidTransactionId);
 
 	/*
@@ -558,6 +559,7 @@ do_analyze_rel(Relation onerel, VacuumStmt *vacstmt, bool inh)
 totalindexrows,
 0,
 false,
+InvalidTransactionId,
 InvalidTransactionId);
 		}
 	}
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index 1f95abc..ab4aec2 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -287,

Re: [HACKERS] Our regex vs. POSIX on "longest match"

2012-03-04 Thread Tom Lane
Brendan Jurd  writes:
> On 4 March 2012 17:53, Tom Lane  wrote:
>> Brendan Jurd  writes:
>>> I'll admit that this is a pretty obscure point, but we do appear to be
>>> in direct violation of POSIX here.

>> How so?  POSIX doesn't contain any non-greedy constructs.  If you use
>> only the POSIX-compatible greedy constructs, the behavior is compliant.

> While it's true that POSIX doesn't contemplate non-greed, after
> reading the spec I would have expected an expression *as a whole* to
> still prefer the longest match, insofar as that is possible while
> respecting non-greedy particles.

It's not apparent to me that a construct that is outside the spec
shouldn't be expected to change the overall behavior.  In particular,
what if the RE contains only non-greedy quantifiers --- would you still
expect longest match overall?  Henry certainly didn't.

> I just ran some quick experiments in
> Perl, Python and PHP, using 'xy1234' ~ 'y*?\d+'.  All returned
> 'y1234', which to my mind is the most obvious answer, and one which
> still makes sense in light of what POSIX has to say.  Whereas Postgres
> (and presumably Tcl) return 'y1'.

Well, that's just an arbitrary example.  The cases I remember people
complaining about in practice were the other way round: greedy
quantifier followed by non-greedy, and they were unhappy that the
non-greediness was effectively not respected (because the overall RE was
taken as greedy).  So you can't fix the issue by pointing to POSIX and
saying "overall greedy is always the right thing".
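
A concrete case of that complaint, in SQL (expected behaviour per the rule
that the first quantifier sets the RE's overall attribute; worth verifying):

SELECT substring('xy1234' from 'y+\d+?');
-- returns 'y1234': y+ makes the whole RE greedy, so the non-greedy \d+?
-- is effectively ignored; Perl would match just 'y1' here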

Another thing I've seen people complain about is that an alternation
(anything with top-level "|") is always taken as greedy overall.
This seems a bit surprising to me --- I'd have expected a rule like
"inherits its greediness from the leftmost branch that has one",
though I'm not sure whether that would really be much of an improvement
in practice.  (It would at least fix the problem that a leading "(a|b)"
determines greediness of the containing RE whereas the logically
equivalent "[ab]" does not.)  Again, pointing to POSIX and saying
"overall greedy is the right thing" doesn't help.

regards, tom lane



Re: [HACKERS] Patch: improve selectivity estimation for IN/NOT IN

2012-03-04 Thread Tom Lane
Daniele Varrazzo  writes:
> On Sun, Mar 4, 2012 at 1:44 PM, Euler Taveira de Oliveira
>  wrote:
>> On 04-03-2012 00:20, Daniele Varrazzo wrote:
>>> It looks like you have grand plans for array estimation. My patch has
>>> a much more modest scope, and I'm hoping it could be applied to
>>> currently maintained PG versions, as I consider the currently produced
>>> estimations a bug.

>> We don't normally add new features to stable branches unless it is a bug. In
>> the optimizer case, planner regression is a bug (that this case is not).

> Please note that we are talking about planning errors leading to
> estimates of records in the millions instead of in the units,

We're also talking about a proposed patch that is not clearly a bug fix,
but is a change in a heuristic, meaning it has the potential to make
things worse in some cases.  (Notably, for arrays that *don't* contain
all-distinct values, the estimates are likely to get worse.)  So I
wasn't thinking of it as being back-patchable anyway.  It's generally
better not to destabilize planner behavior in minor releases, even if
it's a 90%-of-the-time improvement.

regards, tom lane



Re: [HACKERS] Command Triggers, patch v11

2012-03-04 Thread Dimitri Fontaine
Tom Lane  writes:
> FWIW, I agree with Thom on this.  If we do it as you suggest, I
> confidently predict that it will be less than a year before we seriously
> regret it.  Given all the discussion around this, it's borderline insane
> to believe that the set of parameters to be passed to command triggers
> is nailed down and won't need to change in the future.
>
> As for special coding of support, it hardly seems onerous when every
> language that has triggers at all has got some provision for the
> existing trigger parameters.  A bit of copying and pasting should get
> the job done.

So, for that to happen I need to add a new macro and use it in most call
sites of CALLED_AS_TRIGGER(fcinfo). That includes the setup switch in
src/pl/plpgsql/src/pl_comp.c doCompile() and plpgsql_call_handler() for
starters.

Let's call the macro CALLED_AS_COMMAND_TRIGGER(fcinfo), and I would
extend CommandContextData to be a Node of type CommandTriggerData, that
needs to be added to the NodeTag enum as T_CommandTriggerData.

With that in place I can stuff the data into the function's call context
(via InitFunctionCallInfoData) then edit the call handlers of included
procedure languages until they are able to init their language variables
with the info.

Then, do we want the command trigger functions to return type trigger or
another specific type?  I guess we want to forbid registering just any
function as a command trigger?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] Collect frequency statistics for arrays

2012-03-04 Thread Tom Lane
Alexander Korotkov  writes:
> On Sun, Mar 4, 2012 at 5:38 AM, Tom Lane  wrote:
>> 2. The tests in the above-mentioned message show that in most cases
>> where mcelem_array_contained_selec falls through to the "rough
>> estimate", the resulting rowcount estimate is just 1, ie we are coming
>> out with very small selectivities.  Although that path will now only be
>> taken when there are no stats, it seems like we'd be better off to
>> return DEFAULT_CONTAIN_SEL instead of what it's doing.  I think there
>> must be something wrong with the "rough estimate" logic.  Could you
>> recheck that?

> I think the problem with the "rough estimate" is that the assumption of
> independent item occurrences is very unsuitable even for a rough
> estimate. The following example shows that the "rough estimate" really
> does work when item occurrences are independent. ...
> In this particular case the "rough estimate" is quite accurate, but in
> most cases it behaves really badly. That is why I started to invent
> calc_distr etc. So I think returning DEFAULT_CONTAIN_SEL is OK unless
> we have some better ideas.

OK.  Looking again at that code, I notice that it also punts and returns
DEFAULT_CONTAIN_SEL if it's not given MCELEM stats, which it more or
less has to because without even a minfreq the whole calculation is just
hot air.  And there are no plausible scenarios where compute_array_stats
would produce an MCELEM slot but no count histogram.  So that says there
is no point in sweating over this case, unless you have an idea how to
produce useful results without MCELEM.

So I think it's sufficient to punt at the top of the function if no
histogram, and take out the various attempts to cope with the case.

regards, tom lane



Re: [HACKERS] Our regex vs. POSIX on "longest match"

2012-03-04 Thread hubert depesz lubaczewski
On Sun, Mar 04, 2012 at 12:34:22PM -0500, Tom Lane wrote:
> Well, that's just an arbitrary example.  The cases I remember people
> complaining about in practice were the other way round: greedy
> quantifier followed by non-greedy, and they were unhappy that the
> non-greediness was effectively not respected (because the overall RE was
> taken as greedy).  So you can't fix the issue by pointing to POSIX and
> saying "overall greedy is always the right thing".

I was one of those complaining, and my point was that deciding for the whole
regexp whether it's greedy or non-greedy is a bug (well, it might be
documented, but it's still *very* unexpected).

I stand by the position that mixing greedy and non-greedy operators should
be possible, and that it should work according to POLA - i.e. the greediness
of a given atom shouldn't be influenced by other atoms.

Best regards,

depesz


-- 
The best thing about modern society is how easy it is to avoid contact with it.
 http://depesz.com/



Re: [HACKERS] autovacuum locks

2012-03-04 Thread Greg Jaskiewicz

On 2 Mar 2012, at 15:28, Robert Haas wrote:

> On Fri, Mar 2, 2012 at 6:22 AM, Gregg Jaskiewicz  wrote:
>> Looking at the system bit more now, it look like 'waiting' states are
>> changing for both the query and autovacuum in pg_stat_activity.
>> But very slowly. It looks like they both got into that sort of state
>> that keeps them on the edge of starvation.
>> 
>> So this isn't really a deadlock, but rather very poor performance in
>> this corner case.
> 
> Are you making any use of cursors?

nope. 




Re: [HACKERS] review: CHECK FUNCTION statement

2012-03-04 Thread Pavel Stehule
Hello

2012/3/4 Alvaro Herrera :
>
> Excerpts from Alvaro Herrera's message of Sat Mar 03 17:56:23 -0300 2012:
>> Excerpts from Alvaro Herrera's message of Sat Mar 03 16:54:19 -0300 2012:
>> > Excerpts from Pavel Stehule's message of Sat Mar 03 02:25:52 -0300 2012:
>>
>> > > 3. THE ARE NOT CARET - this is really important
>>
>> > I am not sure about the caret thingy -- mainly because I don't think it
>> > works all that well.
>
> It doesn't work correctly with your patch; see sample below.  Note the
> caret is pointing to an entirely nonsensical position.  I'm not sure
> about duplicating the libpq line-counting logic in the backend.
>
> Also note i18n seems to be working well, except for the "In function"
> header, "query", and the error level.  That seems easily fixable.
>
> I remain unconvinced that this is the best possible output.
>
> alvherre=# create function f() returns int language plpgsql as $$
> begin select
> var
> from
> foo; end; $$;
> CREATE FUNCTION
> alvherre=# check function f();
>                     CHECK FUNCTION
> -
>  In function: 'f()'
>  error:42P01:2:sentencia SQL:no existe la relación «foo»
>  query:select                                           +
>  var                                                    +
>  from                                                   +
>  foo
>                       ^
> (4 filas)
>

This should be fixed. I checked expressions, and those work correctly
(I expect). The caret helps - really. Sometimes a man is blind :).

I'll look at this topic tomorrow, and I hope it will be simple to solve.

Regards

Pavel

> --
> Álvaro Herrera 
> The PostgreSQL Company - Command Prompt, Inc.
> PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] xlog min recovery request ... is past current point ...

2012-03-04 Thread Heikki Linnakangas

On 03.02.2012 18:32, Christophe Pettus wrote:

PostgreSQL 9.0.4:

While bringing up a streaming replica, and while it is working its way through 
the WAL segments before connecting to the primary, I see a lot of messages of 
the form:

2012-02-01 21:26:13.978 PST,,,24448,,4f2a1e61.5f80,54,,2012-02-01 21:25:53 PST,1/0,0,LOG,0,"restored log file 
""00010DB40065"" from archive",""
2012-02-01 21:26:14.032 PST,,,24448,,4f2a1e61.5f80,55,,2012-02-01 21:25:53 
PST,1/0,0,WARNING,01000,"xlog min recovery request DB5/42E15098 is past current point 
DB4/657FA490","writing block 5 of relation base/155650/156470_vm
xlog redo insert: rel 1663/155650/1658867; tid 9640/53"""
2012-02-01 21:26:14.526 PST,,,24448,,4f2a1e61.5f80,56,,2012-02-01 21:25:53 PST,1/0,0,LOG,0,"restored log file 
""00010DB40066"" from archive",""

All of these are on _vm relations.  The recovery completed successfully and the 
secondary connected to the primary without issue, so: Are these messages 
something to be concerned over?


Hmm, I think I see how that can happen:

0. A heap page has its bit set in visibility map to begin with

1. A heap tuple is inserted/updated/deleted. This clears the VM bit.
2. time passes, and more WAL is generated
3. The page is vacuumed, and the visibility map bit is set again.

In the standby, this can happen while replaying the WAL, if you restart 
the standby so that some WAL is re-replayed:


1. The update of the heap tuple is replayed. This clears the VM bit.
2. The VACUUM is replayed, setting the VM bit again, and updating the VM 
page's LSN.

3. Shutdown and restart standby
4. The heap update is replayed again. This again clears the VM bit, but 
does not set the LSN


If the VM page is now evicted from the buffer cache, you get the WARNING 
you saw, because the page is dirty, yet its LSN is beyond the current 
point in recovery.


AFAICS that's totally harmless, but the warning is quite alarming, so 
we'll have to figure out a way to fix that. Not sure how; perhaps we 
need to set the LSN on the VM page when the VM bit is cleared, but I 
don't remember off the top of my head if there was some important reason 
why we don't do that currently.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Our regex vs. POSIX on "longest match"

2012-03-04 Thread Tom Lane
hubert depesz lubaczewski  writes:
> I stand on position that mixing greedy and non-greedy operators should
> be possible, and that it should work according to POLA - i.e. greedines
> of given atom shouldn't be influenced by other atoms.

[ shrug... ]  That sounds good, but it's pretty much vacuous as far as
defining a principled alternative behavior goes.  It's easy to
demonstrate cases where atoms *must* be influenced by other ones.
A trivial example is
(.*)(.*)
It doesn't matter whether the second atom is greedy or not: it's not
going to get to eat anything because the first one does instead.
IOW this is just the same as
(.*)(.*?)
--- they are both overall-greedy.
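
A quick SQL check of that equivalence:

SELECT regexp_matches('abcdef', '(.*)(.*)');    -- {abcdef,""}
SELECT regexp_matches('abcdef', '(.*)(.*?)');   -- {abcdef,""}
-- either way, the first atom eats the entire string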

regards, tom lane



Re: [HACKERS] Collect frequency statistics for arrays

2012-03-04 Thread Tom Lane
Alexander Korotkov  writes:
> On Sun, Mar 4, 2012 at 5:38 AM, Tom Lane  wrote:
>> 1. I'm still unhappy about the loop that fills the count histogram,
>> as I noted earlier today.  It at least needs a decent comment and some
>> overflow protection, and I'm not entirely convinced that it doesn't have
>> more bugs than the overflow issue.

> Attached patch is focused on fixing this. The "frac" variable overflow is
> avoided by making it int64. I hope the comments clarify things. In
> general this loop copies the behaviour of the histogram-constructing loop
> of the compute_scalar_stats function, but instead of an array of values we
> have an array of unique DECs and their frequencies.

OK, I reworked this a bit and committed it.  Thanks.

regards, tom lane



Re: [HACKERS] Collect frequency statistics for arrays

2012-03-04 Thread Tom Lane
BTW, one other thing about the count histogram: seems like we are
frequently generating uselessly large ones.  For instance, do ANALYZE
in the regression database and then run

select tablename,attname,elem_count_histogram from pg_stats
  where elem_count_histogram is not null;

You get lots of entries that look like this:

 pg_proc | proallargtypes | 
{1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,5,5,5,6,6,6,2.80556}
 pg_proc | proargmodes| 
{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1.6}
 pg_proc | proargnames| 
{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,6,6,6,7,7,7,7,8,8,8,14,14,15,16,3.8806}
 pg_proc | proconfig  | 
{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}
 pg_class| reloptions | 
{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}

which seems to me to be a rather useless expenditure of space.
Couldn't we reduce the histogram size when there aren't many
different counts?

It seems fairly obvious to me that we could bound the histogram
size with (max count - min count + 1), but maybe something even
tighter would work; or maybe I'm missing something and this would
sacrifice accuracy.

regards, tom lane



Re: [HACKERS] Parameterized-path cost comparisons need some work

2012-03-04 Thread Robert Haas
On Sun, Mar 4, 2012 at 12:20 AM, Tom Lane  wrote:
> After looking at the results, I think that the fallacy in what we've
> been discussing is this: a parameterized path may well have some extra
> selectivity over a less-parameterized one, but perhaps *not enough to be
> meaningful*.  The cases I was getting hits on were where the rowcount
> estimate got rounded off to be the same as for the less-parameterized
> path.  (In this connection it's worth noting that most of the hits were
> for rowcount estimates of only 1 or 2 rows.)  So basically, the scenario
> is where you have restriction clauses that are already enough to get
> down to a small number of rows retrieved, and then you have some join
> clauses that are not very selective and don't reduce the rowcount any
> further.  Or maybe you have some nicely selective join clauses, and then
> adding more joins to some other relations doesn't help any further.

OK, makes sense.

> One annoying thing about that is that it will reduce the usefulness of
> add_path_precheck, because that's called before we compute the rowcount
> estimates (and indeed not having to make the rowcount estimates is one
> of the major savings from the precheck).  I think what we'll have to do
> is assume that a difference in parameterization could result in a
> difference in rowcount, and hence only a dominant path with exactly the
> same parameterization can result in failing the precheck.
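
In sketch form (Python, with illustrative names -- the real test is
add_path_precheck() in pathnode.c, comparing sets of required outer rels):

    from collections import namedtuple

    Path = namedtuple('Path', ['cost', 'param'])   # param ~ required outer rels

    def add_path_precheck(new_cost, new_param, existing_paths):
        # With no rowcount estimates yet, an existing path may reject the
        # new one only if it is at least as cheap AND parameterized exactly
        # the same way; a differently-parameterized path might imply a
        # different rowcount, so the new path must survive to add_path().
        for old in existing_paths:
            if old.cost <= new_cost and old.param == new_param:
                return False                # dominated: don't build it
        return True

    paths = [Path(10.0, frozenset({'t2'}))]
    print(add_path_precheck(12.0, frozenset(), paths))   # True: params differ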

I wish we had some way of figuring out how much this - and maybe some
of the other new planning possibilities like index-only scans - were
going to cost us on typical medium-to-large join problems.  In the
absence of real-world data it's hard to know how worried we should be.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] review: CHECK FUNCTION statement

2012-03-04 Thread Alvaro Herrera

Excerpts from Pavel Stehule's message of Sun Mar 04 16:33:08 -0300 2012:
> 
> Hello
> 
> 2012/3/4 Alvaro Herrera :

> >                     CHECK FUNCTION
> > ------------------------------------------------------------
> >  In function: 'f()'
> >  error:42P01:2:SQL statement:relation «foo» does not exist
> >  query:select                                           +
> >  var                                                    +
> >  from                                                   +
> >  foo
> >                       ^
> > (4 rows)
> >
> 
> This should be fixed. I checked expressions, and those work correctly
> (as I expected). The caret really does help - sometimes one is blind :).

Agreed.

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: GiST for range types (was Re: [HACKERS] Range Types - typo + NULL string constructor)

2012-03-04 Thread Tom Lane
Alexander Korotkov  writes:
> On Mon, Jan 30, 2012 at 1:39 AM, Jeff Davis  wrote:
>> Marking "ready for committer", but please apply my comment fixes at your
>> discretion.

> Patch with your comment fixes is attached.

Applied with revisions, some cosmetic, some not so much.

regards, tom lane



Re: [HACKERS] Our regex vs. POSIX on "longest match"

2012-03-04 Thread Brendan Jurd
On 5 March 2012 04:34, Tom Lane  wrote:
> Brendan Jurd  writes:
>> On 4 March 2012 17:53, Tom Lane  wrote:
>>> Brendan Jurd  writes:
>> While it's true that POSIX doesn't contemplate non-greed, after
>> reading the spec I would have expected an expression *as a whole* to
>> still prefer the longest match, insofar as that is possible while
>> respecting non-greedy particles.
>
> It's not apparent to me that a construct that is outside the spec
> shouldn't be expected to change the overall behavior.  In particular,
> what if the RE contains only non-greedy quantifiers --- would you still
> expect longest match overall?  Henry certainly didn't.

Well, no, but then that wouldn't be "prefer the longest match,
insofar as that is possible while respecting non-greedy particles".
If all the quantifiers in an expression are non-greedy, then it is
trivially true that the only way to respect the author's intention is
to return the shortest match.

> Well, that's just an arbitrary example.  The cases I remember people
> complaining about in practice were the other way round: greedy
> quantifier followed by non-greedy, and they were unhappy that the
> non-greediness was effectively not respected (because the overall RE was
> taken as greedy).

I am unhappy about the reverse example too, and for the same reasons.

If we look at 'xy1234' ~ 'y*\d+?', Postgres gives us 'y1234'
(disregarding the non-greediness of the latter quantifier), and Python
gives us 'y1' (respecting both quantifiers).
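
The Python side is easy to check (the comment shows the Postgres result
just described):

    import re
    print(re.search(r'y*\d+?', 'xy1234').group(0))   # -> 'y1'
    # Postgres: SELECT substring('xy1234' from 'y*\d+?');  -> 'y1234'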

So in Postgres, no matter how you mix the greediness up, some of your
quantifiers will not be respected.

> So you can't fix the issue by pointing to POSIX and
> saying "overall greedy is always the right thing".

"... insofar as that is possible while respecting non-greedy particles".

I will take Henry's word for it that this problem is harder than it
looks, but in any case, I think we may not presume to know better than
the author of the regex about the greediness of his quantifiers.

> Another thing I've seen people complain about is that an alternation
> (anything with top-level "|") is always taken as greedy overall.

Yeah, that is quirky.

Cheers,
BJ



Re: [HACKERS] Our regex vs. POSIX on "longest match"

2012-03-04 Thread Robert Haas
On Sun, Mar 4, 2012 at 3:20 PM, Tom Lane  wrote:
> hubert depesz lubaczewski  writes:
>> I stand by the position that mixing greedy and non-greedy operators should
>> be possible, and that it should work according to POLA - i.e. the greediness
>> of a given atom shouldn't be influenced by other atoms.
>
> [ shrug... ]  That sounds good, but it's pretty much vacuous as far as
> defining a principled alternative behavior goes.  It's easy to
> demonstrate cases where atoms *must* be influenced by other ones.
> A trivial example is
>        (.*)(.*)
> It doesn't matter whether the second atom is greedy or not: it's not
> going to get to eat anything because the first one does instead.
> IOW this is just the same as
>        (.*)(.*?)
> --- they are both overall-greedy.

But I don't think this makes Brendan's example not a bug.  In this
case, we can't eat anything because there's nothing to eat.  In his
example, there's something to eat, and it's sitting in a place where
it logically seems to line up with a greedy quantifier, and yet the
quantifier doesn't eat it.  Now in fairness, I've seen my fair share
of apparently-buggy behavior from Perl's regex engine over the years
as well.

I think the right way to imagine this is as though the regular
expression were being matched to the source text in left-to-right
fashion.  For example, suppose the RE is [ab]+?[bc]+ and the source
text is aabbcc.  Clearly there's a match at position 1, but the match
could be anywhere between 3 and 6 characters in length, depending on
how greedy we think the RE should be, and the exact source text that
gets matched to each atom is up for grabs.  Enumerating all the
possibilities where each atom matches a string that is consistent with
its definition, leaving greediness aside, we get this:

[aa,b]
[aa,bb]
[aa,bbc]
[aa,bbcc]
[aab,b]
[aab,bc]
[aab,bcc]
[aabb,c]
[aabb,cc]

To distinguish among these, we look first at [ab]+? and decide that,
since it is non-greedy, the right overall answer must be one of the
ones that assigns to [ab]+? a match of minimal length within the space
of possible matches.  That narrows the field to just the first four
choices listed above.  Then we look at [bc]+ and, since it is greedy,
give it a match of maximal length within the space of possible
matches, leading to [ab]+? = aa and [bc]+ = bbcc.  Had the RE been
[ab]+?[bc]+?, the same algorithm would assign [ab]+? = aa and [bc]+? =
b; had it been [ab]+[bc]+? the same algorithm would assign [ab]+ =
aabb and [bc]+? = c.  Testing, I find that this matches what Perl does
in all of those cases.
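
A tiny script makes that resolution order concrete (Python; this is
only an illustration of the procedure, not how a real engine works):

    def match_two_atoms(text, cls1, greedy1, cls2, greedy2):
        # Enumerate every split of a prefix into two non-empty pieces
        # drawn from the two character classes...
        splits = [(text[:i], text[i:j])
                  for i in range(1, len(text))
                  for j in range(i + 1, len(text) + 1)
                  if set(text[:i]) <= set(cls1)
                  and set(text[i:j]) <= set(cls2)]
        # ...then fix the left atom's length first, then the right one's.
        pick = max if greedy1 else min
        want = pick(len(a) for a, b in splits)
        splits = [s for s in splits if len(s[0]) == want]
        pick = max if greedy2 else min
        want = pick(len(b) for a, b in splits)
        return next(s for s in splits if len(s[1]) == want)

    print(match_two_atoms('aabbcc', 'ab', False, 'bc', True))   # ('aa', 'bbcc')
    print(match_two_atoms('aabbcc', 'ab', False, 'bc', False))  # ('aa', 'b')
    print(match_two_atoms('aabbcc', 'ab', True,  'bc', False))  # ('aabb', 'c')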

Things get a bit more complicated when you throw alternation into the
picture, but I think it's basically right to view it as a greedy
operation.  Anything else seems rather unprincipled.

It may seem surprising that a+?|b+? matched against aaa will match the
first branch to aaa rather than just a, but there's no real reason to
suppose that the ? which applies to the plus should somehow bleed up
and affect the interpretation of the alternation.  The RE might be
something like a+b+?|b+?a+ where the sub-REs have different greediness
interpretations and there's no obvious principled way to decide which
one should apply to the parent; or even something like cat|catfish
where the sub-REs don't contain any greedy/non-greedy operators at all
and yet we still have to assign some greediness to the alternation.  I
think it's right to view every RE construct as greedy unless it's got
an explicit "not-greedy" flag attached to it; after all, that's the
traditional behavior of REs from time immemorial.  Someone could
invent a non-greedy form of alternation if they were so inclined.

This is different from what Perl does, but I think Perl's behavior
here is batty: given a+|a+b+ and the string aaabbb, it picks the first
branch and matches only aaa.  My whole being recoils: that surely
looks like a non-greedy interpretation of a regex with only greedy
quantifiers.  It turns out that if you write a+b+|a+ then it matches
the whole string; apparently, it selects the syntactically first
branch that can match, regardless of the length of the match, which
strikes me as nearly pure evil.  PostgreSQL - correctly, IMHO -
matches the longest substring available regardless of the syntactic
ordering; AIUI, the original charter for REs was to always match the
longest possible substring; non-greedy quantifiers were added later,
and should not be taken as a license to change the semantics of REs
that don't use them.
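
The contrast is easy to see with a backtracking engine (Python's re
behaves like Perl here):

    import re
    # The syntactically first alternative that can match wins,
    # regardless of the length of the match:
    print(re.match(r'a+|a+b+', 'aaabbb').group(0))   # 'aaa'
    print(re.match(r'a+b+|a+', 'aaabbb').group(0))   # 'aaabbb'
    # Per the above, Postgres instead returns the longest match,
    # 'aaabbb', for either ordering of the branches.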

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
