Re: [HACKERS] [sqlsmith] Failing assertions in spgtextproc.c

2015-12-19 Thread Andres Freund
On 2015-12-19 07:04:09 +0100, Andreas Seltenreich wrote:
> output below.  Since sqlsmith is no longer restricted to read-only
> statements, the chances for reproduction are low :-/.

How many backends are concurrently writing data with sqlsmith? Just one?
If so, and possibly even otherwise, it might be worthwhile to set up
PITR and record the LSN at the beginning of transactions. That'd allow us
to replay the state of the database to just before the failing xact.
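
A minimal sketch of the recording side of that idea (hypothetical client
code using libpq; pg_current_xlog_location() is the 9.x name of the
function, and error handling is omitted):

#include <stdio.h>
#include <libpq-fe.h>

/* Record the current WAL position when a generated transaction starts,
 * so PITR can later replay to just before a failing one. */
static void
log_xact_start_lsn(PGconn *conn, FILE *log)
{
	PGresult   *res = PQexec(conn, "SELECT pg_current_xlog_location()");

	if (PQresultStatus(res) == PGRES_TUPLES_OK)
		fprintf(log, "xact start LSN: %s\n", PQgetvalue(res, 0, 0));
	PQclear(res);
}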

Regards,

Andres




Re: [HACKERS] [patch] Proposal for \rotate in psql

2015-12-19 Thread Pavel Stehule
2015-12-19 6:55 GMT+01:00 Pavel Stehule :

>
>
> 2015-12-18 21:21 GMT+01:00 Daniel Verite :
>
>> Pavel Stehule wrote:
>>
>> > The symbol 'X' in two-column mode should be centered - now it is
>> > aligned to the left, which is not nice
>>
>> Currently print.c does not support centered alignment, only left and
>> right.
>> Should we add it, it would have to work for all output formats
>> (except obviously for "unaligned"):
>> - aligned
>> - wrapped
>> - html
>> - latex
>> - latex-longtable
>> - troff-ms
>> - asciidoc
>>
>> Because of this, I believe that adding support for a 'c' alignment
>> might be a significant patch by itself, and that it should be considered
>> separately.
>>
>
> ok
>
>
>>
>> I agree that if it existed, the crosstabview command should use it
>> as you mention, but I'm not volunteering to implement it myself, at
>> least not in the short term.
>>
>
> I'll look at how much work it is
>

The attached patch adds center alignment.

Center alignment is now allowed everywhere that left/right alignment was
allowed; a sketch of the padding arithmetic follows below.
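
For reference, the centering arithmetic as a standalone sketch (not the
patch itself - the patch only emits the leading spaces and lets the
existing code output the cell text):

#include <stdio.h>
#include <string.h>

/* Print "text" centered in a field of "width" columns.  With integer
 * division, the left padding gets the smaller half of the free space. */
static void
print_centered(FILE *fout, const char *text, int width)
{
	int			len = (int) strlen(text);
	int			space = (width > len) ? width - len : 0;
	int			left = space / 2;

	fprintf(fout, "%*s%s%*s", left, "", text, space - left, "");
}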

Regards

Pavel


>
> Regards
>
> Pavel
>
>
>>
>> Best regards,
>> --
>> Daniel Vérité
>> PostgreSQL-powered mailer: http://www.manitou-mail.org
>> Twitter: @DanielVerite
>>
>
>
diff --git a/src/bin/psql/crosstabview.c b/src/bin/psql/crosstabview.c
new file mode 100644
index 6d13b57..c9e11c2
*** a/src/bin/psql/crosstabview.c
--- b/src/bin/psql/crosstabview.c
*************** printCrosstab(const PGresult *results,
*** 203,209 ****
  	 * alignment is determined by PQftype(). Otherwise the contents are
  	 * made-up strings, so the alignment is 'l'
  	 */
! 	if (PQnfields(results) == 3)
  	{
  		int			colnum;		/* column placed inside the grid */
  		/*
--- 203,213 ----
  	 * alignment is determined by PQftype(). Otherwise the contents are
  	 * made-up strings, so the alignment is 'l'
  	 */
! 	if (PQnfields(results) == 2)
! 	{
! 		col_align = 'c';
! 	}
! 	else if (PQnfields(results) == 3)
  	{
  		int			colnum;		/* column placed inside the grid */
  		/*
diff --git a/src/bin/psql/print.c b/src/bin/psql/print.c
new file mode 100644
index b2f8c2b..6e2cc6e
*** a/src/bin/psql/print.c
--- b/src/bin/psql/print.c
*************** print_aligned_text(const printTableConte
*** 1042,1059 ****
  				{
  					/* spaces first */
  					fprintf(fout, "%*s",
  							width_wrap[j] - chars_to_output, "");
- 					fputnbytes(fout,
- 							   (char *) (this_line->ptr + bytes_output[j]),
- 							   bytes_to_output);
  				}
! 				else	/* Left aligned cell */
  				{
! 					/* spaces second */
! 					fputnbytes(fout,
! 							   (char *) (this_line->ptr + bytes_output[j]),
! 							   bytes_to_output);
  				}
  
  				bytes_output[j] += bytes_to_output;
  
  				/* Do we have more text to wrap? */
--- 1042,1059 ----
  				{
  					/* spaces first */
  					fprintf(fout, "%*s",
  							width_wrap[j] - chars_to_output, "");
  				}
! 				else if (cont->aligns[j] == 'c')	/* Center aligned cell */
  				{
! 					/* spaces first */
! 					fprintf(fout, "%*s",
! 							(width_wrap[j] - chars_to_output) / 2, "");
  				}
  
+ 				fputnbytes(fout,
+ 						   (char *) (this_line->ptr + bytes_output[j]),
+ 						   bytes_to_output);
+ 
  				bytes_output[j] += bytes_to_output;
  
  				/* Do we have more text to wrap? */
*************** print_aligned_text(const printTableConte
*** 1083,1095 ****
  				 * If left-aligned, pad out remaining space if needed (not
  				 * last column, and/or wrap marks required).
Re: [HACKERS] [sqlsmith] Failing assertions in spgtextproc.c

2015-12-19 Thread Tom Lane
Andreas Seltenreich  writes:
> I do see two assertions in spgtextproc.c fail on occasion when testing
> with sqlsmith:
> TRAP: FailedAssertion([...], File: "spgtextproc.c", Line: 424)
> TRAP: FailedAssertion([...], File: "spgtextproc.c", Line: 564)

Can you show us the definition of the index that's causing this,
and some samples of the data you're putting in it?

regards, tom lane




Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-12-19 Thread Rushabh Lathia
On Sat, Dec 19, 2015 at 2:17 AM, Robert Haas  wrote:

> On Thu, Dec 17, 2015 at 3:32 AM, Ashutosh Bapat
>  wrote:
> > On Wed, Dec 9, 2015 at 12:14 AM, Robert Haas 
> wrote:
> >> On Wed, Dec 2, 2015 at 6:45 AM, Rushabh Lathia <
> rushabh.lat...@gmail.com>
> >> wrote:
> >> > Thanks Ashutosh.
> >> >
> >> > Re-reviewed and Re-verified the patch, pg_sort_all_pd_v5.patch
> >> > looks good to me.
> >>
> >> This patch needs a rebase.
> >
> > Done.
>
> Thanks.
>
> >> It's not going to work to say this is a patch proposed for commit when
> >> it's still got a TODO comment in it that obviously needs to be
> >> changed.   And the formatting of that long comment is pretty weird,
> >> too, and not consistent with other functions in that same file (e.g.
> >> get_remote_estimate, ec_member_matches_foreign, create_cursor).
> >>
> >
> > The TODO was present in v4 but not in v5, and is not present in the v6
> > attached here. I formatted the comment according to
> > estimate_path_cost_size() and convert_prep_stmt_params().
>
> Hrm, I must have been looking at the wrong version somehow.  Sorry about
> that.
>
> >> Aside from that, I think before we commit this, somebody should do
> >> some testing that demonstrates that this is actually a good idea.  Not
> >> as part of the test case set for this patch, but just in general.
> >> Merge joins are typically going to be relevant for large tables, but
> >> the examples in the regression tests are necessarily tiny.  I'd like
> >> to see some sample data and some sample queries that get appreciably
> >> faster with this code.  If we can't find any, we don't need the code.
> >>
> >
> > I tested the patch on my laptop with two types of queries: a join between
> > two foreign tables on different foreign servers (both pointing back to
> > the local server itself) and a join between one foreign and one local
> > table. The foreign tables and servers are created using the attached
> > sort_pd_setup.sql. The foreign tables pointed to tables with an index
> > useful for the join clause. Both joining tables had 10M rows. The
> > execution time of the query was measured over 100 runs, and the average
> > and standard deviation were calculated (using the function
> > query_execution_stats() in the script sort_pd.sql); they are presented
> > below.
>
> OK, cool.
>
> I went over this patch in some detail today and did a lot of cosmetic
> cleanup.  The results are attached.  I'm fairly happy with this
> version, but let me know what you think.  Of course, feedback from
> others is more than welcome also.
>
>
I had a quick look at the patch changes and also performed a basic
sanity check. The patch looks good to me.

Thanks.

--
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>



-- 
Rushabh Lathia
www.EnterpriseDB.com


Re: [HACKERS] Costing foreign joins in postgres_fdw

2015-12-19 Thread Albe Laurenz
Robert Haas wrote:
>> Maybe, to come up with something remotely realistic, a formula like
>>
>> sum of locally estimated costs of sequential scan for the base table
>> plus count of estimated result rows (times a factor)
>
> Was this meant to say "the base tables", plural?

Yes.

> I think whatever we do here should try to extend the logic in
> postgres_fdw's estimate_path_cost_size() to foreign tables in some
> reasonably natural way, but I'm not sure exactly what that should look
> like.  Maybe do what that function currently does for single-table
> scans, and then add all the values up, or something like that.  I'm a
> little worried, though, that the planner might then view a query that
> will be executed remotely as a nested loop with inner index-scan as
> not worth pushing down, because in that case the join actually will
> not touch every row from both tables, as a hash or merge join would.

That's exactly what I meant, minus a contribution for the estimated
result set size.

There are cases where a nested loop is faster than a hash join,
but I think it is rare that this is by orders of magnitude.

So I'd say it is a decent rough estimate, and that's the best we can
hope for here, if we cannot ask the remote server.
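
To make the arithmetic concrete, a minimal standalone sketch of that rough
formula (the function name and the per_row_factor parameter are
hypothetical; this is not postgres_fdw's estimate_path_cost_size()):

typedef double Cost;			/* PostgreSQL defines Cost as double */

/* Rough remote-join cost: the local sequential-scan estimate of every
 * base table, plus a per-row charge for the estimated result size. */
static Cost
rough_foreign_join_cost(const Cost *base_seqscan_costs, int nbase,
						double est_result_rows, double per_row_factor)
{
	Cost		total = 0.0;
	int			i;

	for (i = 0; i < nbase; i++)
		total += base_seqscan_costs[i];
	return total + est_result_rows * per_row_factor;
}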

Yours,
Laurenz Albe



Re: [HACKERS] Making tab-complete.c easier to maintain

2015-12-19 Thread Tom Lane
Michael Paquier  writes:
> On Sat, Dec 19, 2015 at 5:42 AM, Tom Lane  wrote:
>> 2. Why does MatchAnyExcept use "'" as the inversion flag, rather than
>> say "!" or "~" ?  Seems pretty random.

> Actually, "'" is not that much a good idea, no? There could be single
> quotes in queries so there is a risk of interfering with the
> completion... What do you think would be good candidates? "?", "!",
> "#" or "&"?

We don't care whether the character appears in queries; it only matters
that it not be something we'd need to use in patterns.  My inclination
is to use "!".

regards, tom lane




Re: [HACKERS] Making tab-complete.c easier to maintain

2015-12-19 Thread Michael Paquier
On Sun, Dec 20, 2015 at 6:24 AM, Tom Lane  wrote:
> 1. I think it would be a good idea to convert the matching rules for
> backslash commands too.  To do that, we'd need to provide a case-sensitive
> equivalent to word_match and the matching macros.  I think we'd also have
> to extend word_match to allow a trailing wildcard character, maybe "*".
>
> 2. I believe that a very large fraction of the TailMatches() rules really
> ought to be Matches(), ie, they should not consider matches that don't
> start at the start of the line.  And there's another bunch that could
> be Matches() if the author hadn't been unaccountably lazy about checking
> all words of the expected command.  If we converted as much as we could
> that way, it would make psql_completion faster because many inapplicable
> rules could be discarded after a single integer comparison on
> previous_words_count, and it would greatly reduce the risk of inapplicable
> matches.  We can't do that for rules meant to apply to DML statements,
> since they can be buried in WITH, EXPLAIN, etc ... but an awful lot of
> the DDL rules could be changed.
>
> 3. The HeadMatches macros are pretty iffy because they can only look back
> nine words.  I'm tempted to redesign get_previous_words so it just
> tokenizes the whole line rather than having an arbitrary limitation.
> (For that matter, it's long overdue for it to be able to deal with
> multiline input...)
>
> I might go look at #3, but I can't currently summon the energy to tackle
> #1 or #2 --- any volunteers?

I could have a look at both of them and submit a patch for the next CF;
neither seems all that complicated.
-- 
Michael




Re: [HACKERS] Making tab-complete.c easier to maintain

2015-12-19 Thread Tom Lane
Michael Paquier  writes:
> On Sat, Dec 19, 2015 at 5:42 AM, Tom Lane  wrote:
>> 1. It seems inconsistent that all the new macros are named in CamelCase
>> style, whereas there is still plenty of usage of the existing macros like
>> COMPLETE_WITH_LIST.  It looks pretty jarring IMO.  I think we should
>> either rename the new macros back to all-upper-case style, or rename the
>> existing macros in CamelCase style.
>> 
>> I slightly favor the latter option; we're already pretty much breaking any
>> hope of tab-complete fixes applying backwards over this patch, so changing
>> the code even more doesn't seem like a problem.  Either way, it's a quick
>> search-and-replace.  Thoughts?

> Both would be fine, honestly. But if we look at the current code, there
> are already a lot of macros IN_UPPER_CASE, so wouldn't it make more sense,
> on the contrary, to have MATCHES_N and MATCHES_EXCEPT?

After some contemplation I decided that what was bugging me was mainly
the inconsistency of having some completion actions in camelcase and
immediately adjacent actions in all-upper-case.  Making the test macros
camelcase isn't such a problem as long as they all look similar, and
I think it's more readable anyway.  So I changed CompleteWithListN to
COMPLETE_WITH_LISTN and otherwise left the naming alone.

I've committed this now with a number of changes, many of them just
stylistic.  Notable is that I rewrote word_matches to rely on
pg_strncasecmp rather than using toupper/tolower directly, so as to avoid
any possible change in semantics.  (The code as submitted was flat wrong
anyway, since it wasn't being careful about passing only unsigned chars to
those functions.)  I also got rid of its inconsistent treatment of empty
strings, in favor of not ever calling word_matches() on nonexistent words
in the first place.  That just requires testing previous_words_count in
the TailMatches macros.  I think it'd now be possible for
get_previous_words to skip generating empty strings for word positions
before the start of the line, but I have not experimented.
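
For readers unfamiliar with the unsigned-char issue just mentioned: the
classic pitfall is that passing a plain, possibly signed char to toupper()
or tolower() is undefined for bytes >= 0x80; the safe idiom is a sketch
like this:

#include <ctype.h>

/* toupper()'s argument must be representable as unsigned char (or EOF);
 * casting first avoids undefined behavior when plain char is signed. */
static int
safe_toupper(char c)
{
	return toupper((unsigned char) c);
}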

I found a couple more small errors in the converted match rules too,
but I have to admit my eyes started to glaze over while looking through
them.  It's possible there are some remaining errors there.  On the
other hand, it's at least as likely that we've gotten rid of some bugs.

There's a good deal more that could be done here:

1. I think it would be a good idea to convert the matching rules for
backslash commands too.  To do that, we'd need to provide a case-sensitive
equivalent to word_match and the matching macros.  I think we'd also have
to extend word_match to allow a trailing wildcard character, maybe "*".

2. I believe that a very large fraction of the TailMatches() rules really
ought to be Matches(), ie, they should not consider matches that don't
start at the start of the line.  And there's another bunch that could
be Matches() if the author hadn't been unaccountably lazy about checking
all words of the expected command.  If we converted as much as we could
that way, it would make psql_completion faster because many inapplicable
rules could be discarded after a single integer comparison on
previous_words_count, and it would greatly reduce the risk of inapplicable
matches.  We can't do that for rules meant to apply to DML statements,
since they can be buried in WITH, EXPLAIN, etc ... but an awful lot of
the DDL rules could be changed.

3. The HeadMatches macros are pretty iffy because they can only look back
nine words.  I'm tempted to redesign get_previous_words so it just
tokenizes the whole line rather than having an arbitrary limitation.
(For that matter, it's long overdue for it to be able to deal with
multiline input...)

I might go look at #3, but I can't currently summon the energy to tackle
#1 or #2 --- any volunteers?
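
As a concrete starting point for #1, a standalone sketch of a
case-sensitive match with a trailing "*" wildcard (hypothetical code, not
the existing word_matches):

#include <stdbool.h>
#include <string.h>

/* Case-sensitive word match with an optional trailing '*' wildcard;
 * e.g. the pattern "\\d*" would match "\\dt", "\\di", and so on. */
static bool
word_matches_cs(const char *pattern, const char *word)
{
	size_t		plen = strlen(pattern);

	if (plen > 0 && pattern[plen - 1] == '*')
		return strncmp(pattern, word, plen - 1) == 0;
	return strcmp(pattern, word) == 0;
}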

regards, tom lane




Re: [HACKERS] GIN data corruption bug(s) in 9.6devel

2015-12-19 Thread Tomas Vondra

Hi,

On 11/06/2015 02:09 AM, Tomas Vondra wrote:

Hi,

On 11/06/2015 01:05 AM, Jeff Janes wrote:

On Thu, Nov 5, 2015 at 3:50 PM, Tomas Vondra
 wrote:

...


I can do that - I see there are three patches in the two threads:

   1) gin_pending_lwlock.patch (Jeff Janes)
   2) gin_pending_pagelock.patch (Jeff Janes)
   3) gin_alone_cleanup-2.patch (Teodor Sigaev)

Should I test all of them? Or is (1) obsoleted by (2) for example?


1 is obsolete.  Either 2 or 3 should fix the bug, provided this is the
bug you are seeing.  They have different performance side effects, but
as far as fixing the bug they should be equivalent.


OK, I'll do testing with those two patches then, and I'll also note the
performance difference (the data load was very stable). Of course, it's
just one particular workload.

I'll post an update after the weekend.


I've finally managed to test the two patches. Sorry for the delay.

I've repeated the workload on 9.5, 9.6, and 9.6 with (1) or (2), looking
for lockups or data corruption. I've also measured the duration of the
script, to see what the impact on performance is. The configuration
(shared_buffers, work_mem ...) was exactly the same in all cases.


9.5 : runtime ~1380 seconds
9.6 : runtime ~1380 seconds (but lockups and data corruption)
9.6+(1) : runtime ~1380 seconds
9.6+(2) : runtime ~1290 seconds

So both patches seem to do the trick, but (2) is faster. Not sure if 
this is expected. (BTW all the results are without asserts enabled).


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Making tab-complete.c easier to maintain

2015-12-19 Thread Tom Lane
I wrote:
> 3. The HeadMatches macros are pretty iffy because they can only look back
> nine words.  I'm tempted to redesign get_previous_words so it just
> tokenizes the whole line rather than having an arbitrary limitation.
> (For that matter, it's long overdue for it to be able to deal with
> multiline input...)

I poked into this and found that it's really not hard, if you don't mind
still another global variable associated with tab-completion.  See
attached patch.

The main objection to this change might be speed, but I experimented with
the longest query in information_schema.sql (CREATE VIEW usage_privileges,
162 lines) and found that pressing tab at the end of that seemed to take
well under a millisecond on my workstation.  So I think it's probably
insignificant, especially compared to any of the code paths that send a
query to the server for tab completion.

regards, tom lane

diff --git a/src/bin/psql/input.c b/src/bin/psql/input.c
index 2bc065a..ccd9a3e 100644
*** a/src/bin/psql/input.c
--- b/src/bin/psql/input.c
*** static void finishInput(void);
*** 53,64 
   * gets_interactive()
   *
   * Gets a line of interactive input, using readline if desired.
   * The result is a malloc'd string.
   *
   * Caller *must* have set up sigint_interrupt_jmp before calling.
   */
  char *
! gets_interactive(const char *prompt)
  {
  #ifdef USE_READLINE
  	if (useReadline)
--- 53,69 
   * gets_interactive()
   *
   * Gets a line of interactive input, using readline if desired.
+  *
+  * prompt: the prompt string to be used
+  * query_buf: buffer containing lines already read in the current command
+  * (query_buf is not modified here, but may be consulted for tab completion)
+  *
   * The result is a malloc'd string.
   *
   * Caller *must* have set up sigint_interrupt_jmp before calling.
   */
  char *
! gets_interactive(const char *prompt, PQExpBuffer query_buf)
  {
  #ifdef USE_READLINE
  	if (useReadline)
*** gets_interactive(const char *prompt)
*** 76,81 
--- 81,89 
  		rl_reset_screen_size();
  #endif
  
+ 		/* Make current query_buf available to tab completion callback */
+ 		tab_completion_query_buf = query_buf;
+ 
  		/* Enable SIGINT to longjmp to sigint_interrupt_jmp */
  		sigint_interrupt_enabled = true;
  
*** gets_interactive(const char *prompt)
*** 85,90 
--- 93,101 
  		/* Disable SIGINT again */
  		sigint_interrupt_enabled = false;
  
+ 		/* Pure neatnik-ism */
+ 		tab_completion_query_buf = NULL;
+ 
  		return result;
  	}
  #endif
diff --git a/src/bin/psql/input.h b/src/bin/psql/input.h
index abd7012..c9d30b9 100644
*** a/src/bin/psql/input.h
--- b/src/bin/psql/input.h
***
*** 38,44 
  #include "pqexpbuffer.h"
  
  
! char	   *gets_interactive(const char *prompt);
  char	   *gets_fromFile(FILE *source);
  
  void		initializeInput(int flags);
--- 38,44 
  #include "pqexpbuffer.h"
  
  
! char	   *gets_interactive(const char *prompt, PQExpBuffer query_buf);
  char	   *gets_fromFile(FILE *source);
  
  void		initializeInput(int flags);
diff --git a/src/bin/psql/mainloop.c b/src/bin/psql/mainloop.c
index b6cef94..cb86457 100644
*** a/src/bin/psql/mainloop.c
--- b/src/bin/psql/mainloop.c
*** MainLoop(FILE *source)
*** 133,139 
  			/* May need to reset prompt, eg after \r command */
  			if (query_buf->len == 0)
  prompt_status = PROMPT_READY;
! 			line = gets_interactive(get_prompt(prompt_status));
  		}
  		else
  		{
--- 133,139 
  			/* May need to reset prompt, eg after \r command */
  			if (query_buf->len == 0)
  prompt_status = PROMPT_READY;
! 			line = gets_interactive(get_prompt(prompt_status), query_buf);
  		}
  		else
  		{
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 731df2a..fec7c38 100644
*** a/src/bin/psql/tab-complete.c
--- b/src/bin/psql/tab-complete.c
*** extern char *filename_completion_functio
*** 70,75 
--- 70,82 
  #define WORD_BREAKS		"\t\n@$><=;|&{() "
  
  /*
+  * Since readline doesn't let us pass any state through to the tab completion
+  * callback, we have to use this global variable to let get_previous_words()
+  * get at the previous lines of the current command.  Ick.
+  */
+ PQExpBuffer tab_completion_query_buf = NULL;
+ 
+ /*
   * This struct is used to define "schema queries", which are custom-built
   * to obtain possibly-schema-qualified names of database objects.  There is
   * enough similarity in the structure that we don't want to repeat it each
*** static char *pg_strdup_keyword_case(cons
*** 923,929 
  static char *escape_string(const char *text);
  static PGresult *exec_query(const char *query);
  
! static int	get_previous_words(int point, char **previous_words, int nwords);
  
  static char *get_guctype(const char *varname);
  
--- 930,936 
  static char *escape_string(const char *text);
  static PGresult *exec_query(const char *query);
  
! 

Re: [HACKERS] POC: Cache data in GetSnapshotData()

2015-12-19 Thread Mithun Cy
On Thu, Dec 17, 2015 at 3:15 AM, Mithun Cy 
wrote:

> I have rebased the patch and tried to run pgbench.

> I see memory corruptions, attaching the valgrind report for the same.
Sorry, I forgot to add the rebased patch; adding it now.
After some analysis I saw that writing the shared snapshot to shared
memory is not protected by an exclusive write lock, and this leads to
memory corruption.
I think that until this is fixed, measuring the performance will not be
very useful.
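
To illustrate the failure mode in isolation (a generic pthreads sketch,
not the LWLock-based fix the patch itself would need): the shared cache
must be written under an exclusive lock, or concurrent readers can see a
torn copy.

#include <pthread.h>
#include <string.h>

typedef struct
{
	int			xmin;
	int			xmax;
	int			xcnt;
	int			xip[64];
} CachedSnap;

static CachedSnap shared_cache;
static pthread_rwlock_t cache_lock = PTHREAD_RWLOCK_INITIALIZER;

static void
cache_store(const CachedSnap *snap)
{
	pthread_rwlock_wrlock(&cache_lock);		/* exclusive: no torn writes */
	memcpy(&shared_cache, snap, sizeof(CachedSnap));
	pthread_rwlock_unlock(&cache_lock);
}

static void
cache_load(CachedSnap *out)
{
	pthread_rwlock_rdlock(&cache_lock);		/* shared: many readers OK */
	memcpy(out, &shared_cache, sizeof(CachedSnap));
	pthread_rwlock_unlock(&cache_lock);
}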
-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com
*** a/src/backend/commands/cluster.c
--- b/src/backend/commands/cluster.c
***
*** 1559,1564  finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
--- 1559,1565 
  			elog(ERROR, "cache lookup failed for relation %u", OIDOldHeap);
  		relform = (Form_pg_class) GETSTRUCT(reltup);
  
+ 		Assert(TransactionIdIsNormal(frozenXid));
  		relform->relfrozenxid = frozenXid;
  		relform->relminmxid = cutoffMulti;
  
*** a/src/backend/storage/ipc/procarray.c
--- b/src/backend/storage/ipc/procarray.c
***
*** 410,415  ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
--- 410,416 
  		if (LWLockConditionalAcquire(ProcArrayLock, LW_EXCLUSIVE))
  		{
  			ProcArrayEndTransactionInternal(proc, pgxact, latestXid);
+ ProcGlobal->cached_snapshot_valid = false;
  			LWLockRelease(ProcArrayLock);
  		}
  		else
***
*** 558,563  ProcArrayGroupClearXid(PGPROC *proc, TransactionId latestXid)
--- 559,566 
  		nextidx = pg_atomic_read_u32(&procglobal->nextClearXidElem);
  	}
  
+ ProcGlobal->cached_snapshot_valid = false;
+ 
  	/* We're done with the lock now. */
  	LWLockRelease(ProcArrayLock);
  
***
*** 1543,1548  GetSnapshotData(Snapshot snapshot)
--- 1546,1553 
  	 errmsg("out of memory")));
  	}
  
+ snapshot->takenDuringRecovery = RecoveryInProgress();
+ 
  	/*
  	 * It is sufficient to get shared lock on ProcArrayLock, even if we are
  	 * going to set MyPgXact->xmin.
***
*** 1557,1565  GetSnapshotData(Snapshot snapshot)
  	/* initialize xmin calculation with xmax */
  	globalxmin = xmin = xmax;
  
! 	snapshot->takenDuringRecovery = RecoveryInProgress();
  
! 	if (!snapshot->takenDuringRecovery)
  	{
  		int		   *pgprocnos = arrayP->pgprocnos;
  		int			numProcs;
--- 1562,1592 
  	/* initialize xmin calculation with xmax */
  	globalxmin = xmin = xmax;
  
! 	if (!snapshot->takenDuringRecovery && ProcGlobal->cached_snapshot_valid)
! 	{
! 		Snapshot csnap = &ProcGlobal->cached_snapshot;
! 		TransactionId *saved_xip;
! 		TransactionId *saved_subxip;
  
! 		saved_xip = snapshot->xip;
! 		saved_subxip = snapshot->subxip;
! 
! 		memcpy(snapshot, csnap, sizeof(SnapshotData));
! 
! 		snapshot->xip = saved_xip;
! 		snapshot->subxip = saved_subxip;
! 
! 		memcpy(snapshot->xip, csnap->xip,
! 			   sizeof(TransactionId) * csnap->xcnt);
! 		memcpy(snapshot->subxip, csnap->subxip,
! 			   sizeof(TransactionId) * csnap->subxcnt);
! 		globalxmin = ProcGlobal->cached_snapshot_globalxmin;
! 		xmin = csnap->xmin;
! 
! 		Assert(TransactionIdIsValid(globalxmin));
! 		Assert(TransactionIdIsValid(xmin));
! 	}
! 	else if (!snapshot->takenDuringRecovery)
  	{
  		int		   *pgprocnos = arrayP->pgprocnos;
  		int			numProcs;
***
*** 1577,1591  GetSnapshotData(Snapshot snapshot)
  			TransactionId xid;
  
  			/*
! 			 * Backend is doing logical decoding which manages xmin
! 			 * separately, check below.
  			 */
! 			if (pgxact->vacuumFlags & PROC_IN_LOGICAL_DECODING)
! continue;
! 
! 			/* Ignore procs running LAZY VACUUM */
! 			if (pgxact->vacuumFlags & PROC_IN_VACUUM)
! continue;
  
  			/* Update globalxmin to be the smallest valid xmin */
  			xid = pgxact->xmin; /* fetch just once */
--- 1604,1615 
  			TransactionId xid;
  
  			/*
! 			 * Ignore procs running LAZY VACUUM (which don't need to retain
! 			 * rows) and backends doing logical decoding (which manages xmin
! 			 * separately, check below).
  			 */
! 			if (pgxact->vacuumFlags & (PROC_IN_LOGICAL_DECODING | PROC_IN_VACUUM))
! continue;
  
  			/* Update globalxmin to be the smallest valid xmin */
  			xid = pgxact->xmin; /* fetch just once */
***
*** 1653,1658  GetSnapshotData(Snapshot snapshot)
--- 1677,1706 
  }
  			}
  		}
+ 
+ 		/* update cache */
+ 		{
+ 			Snapshot csnap = &ProcGlobal->cached_snapshot;
+ 			TransactionId *saved_xip;
+ 			TransactionId *saved_subxip;
+ 
+ 			ProcGlobal->cached_snapshot_globalxmin = globalxmin;
+ 
+ 			saved_xip = csnap->xip;
+ 			saved_subxip = csnap->subxip;
+ 			memcpy(csnap, snapshot, sizeof(SnapshotData));
+ 			csnap->xip = saved_xip;
+ 			csnap->subxip = saved_subxip;
+ 
+ 			/* not yet stored. Yuck */
+ 			csnap->xmin = xmin;
+ 
+ 			memcpy(csnap->xip, snapshot->xip,
+    sizeof(TransactionId) * csnap->xcnt);
+ 			memcpy(csnap->subxip, snapshot->subxip,
+    

Re: [HACKERS] extend pgbench expressions with functions

2015-12-19 Thread Michael Paquier
On Sat, Dec 19, 2015 at 6:56 AM, Fabien COELHO  wrote:
>> It was definitely useful to debug the double/int type stuff within
>> expressions when writing a non trivial pgbench script. It is probably less
>> interesting if there are only integers.
>
>
> After looking again at the code, I remembered why doubles are useful: they
> are needed for random exponential & gaussian, because the last parameter
> is a double.
>
> I do not care about the sqrt, but doubles must be allowed to keep those,
> and the randoms are definitely useful for a pgbench script. Now the patch
> may just keep double constants, but it would look awkward, and the doc
> must explain why 1.3 and 1+2 are okay, but not 1.3 + 2.4.
>
> So I'm less keen on removing double expressions, because it removes a key
> feature. If it is a blocker I'll go for just the constant, but this looks
> to me like a stupid compromise.

Hm, say that you do that in a script:
\set aid double(1.4)
\set bid random_gaussian(1, 10, :aid)
Then what is passed as the third argument to random_gaussian is 1, and not
1.4, no? If all assignments to variables are unconditionally
integers, why is it useful to make the cast function double()
user-visible? Now, by looking at the code, I agree that you would need
to keep things like DOUBLE and coerceToDouble();
PGBENCH_RANDOM_GAUSSIAN and its other friend use them directly. I
am just doubting that it is actually necessary to make that visible at
the user level if there is no direct use...
-- 
Michael




Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2015-12-19 Thread Alexander Korotkov
On Fri, Dec 18, 2015 at 10:53 PM, Artur Zakirov 
wrote:

> On 18.12.2015 22:43, Artur Zakirov wrote:
>
>> Hello.
>>
>> PostgreSQL has a contrib module named pg_trgm. It is used for fuzzy
>> text search. It provides some functions and operators for determining the
>> similarity of the given texts using trigram matching.
>>
>> Sorry, I have forgotten to mark the previous message with [PROPOSAL].
>

I think you shouldn't mark the thread as [PROPOSAL] since you're providing
a full patch.


> I registered the patch in commitfest:
> https://commitfest.postgresql.org/8/448/


Great.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] pg_tables bug?

2015-12-19 Thread Gaetano Mendola
On Sat, Dec 19, 2015, 01:50 Andrew Dunstan  wrote:

>
>
>
>
> On 12/18/2015 05:18 PM, Gaetano Mendola wrote:
> > From documentation about "CREATE DATABASE name WITH TABLESPACE =
> > tablespace_name":
> >
> > tablespace_name
> > The name of the tablespace that will be associated with the new
> > database, or DEFAULT to
> > use the template database's tablespace. This tablespace will be the
> > default tablespace used
> > for objects created in this database. See CREATE TABLESPACE for more
> > information.
> >
> > I'm sure that my tables are created in the tablespace, but those are
> > not reported in pg_tables, in pg_dump, or by \d.
>
> 1. Please don't top-post on the PostgreSQL lists. See
> 
>
> 2. The system is working as designed and as documented - see the
> comments in the docs on pg_tables. If nothing is shown for the table's
> tablespace then it will be in the default tablespace for the database.
> That's what you're seeing. You appear to be assuming incorrectly that it
> means that the table will be in the system's default tablespace.
>

I replied using an incorrectly set-up client, sorry about that. I'm not new
to this list. I now understand how it works. Having many databases and many
tablespaces is a nightmare this way. I will make my own view to fix the
annoyance.

>


Re: [HACKERS] Making tab-complete.c easier to maintain

2015-12-19 Thread Michael Paquier
On Sat, Dec 19, 2015 at 5:42 AM, Tom Lane  wrote:
> Thomas Munro  writes:
>> [ tab-complete-macrology-v11.patch.gz ]
>
> A couple of stylistic reactions after looking through the patch for the
> first time in a long time:
>
> 1. It seems inconsistent that all the new macros are named in CamelCase
> style, whereas there is still plenty of usage of the existing macros like
> COMPLETE_WITH_LIST.  It looks pretty jarring IMO.  I think we should
> either rename the new macros back to all-upper-case style, or rename the
> existing macros in CamelCase style.
>
> I slightly favor the latter option; we're already pretty much breaking any
> hope of tab-complete fixes applying backwards over this patch, so changing
> the code even more doesn't seem like a problem.  Either way, it's a quick
> search-and-replace.  Thoughts?

Both would be fine, honestly. But if we look at the current code, there
are already a lot of macros IN_UPPER_CASE, so wouldn't it make more sense,
on the contrary, to have MATCHES_N and MATCHES_EXCEPT?

> 2. Why does MatchAnyExcept use "'" as the inversion flag, rather than
> say "!" or "~" ?  Seems pretty random.

Actually, "'" is not that much a good idea, no? There could be single
quotes in queries so there is a risk of interfering with the
completion... What do you think would be good candidates? "?", "!",
"#" or "&"?
-- 
Michael




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-12-19 Thread Peter Geoghegan
On Sat, Dec 19, 2015 at 3:26 PM, Michael Paquier
 wrote:
> +1. There are folks around doing tests on 9.5 now; it would not be
> right to undermine the effort they have been putting into it until now.

This is a total misrepresentation of what I've said.

-- 
Peter Geoghegan




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-12-19 Thread Michael Paquier
On Sat, Dec 19, 2015 at 1:58 AM, Robert Haas wrote:
> No, it's far too late to be pushing this into 9.5.  We are at RC1 now
> and hoping to cut a final release right after Christmas.  I think it's
> quite wrong to argue that these changes have no risk of destabilizing
> 9.5.  Nobody is exempt from having bugs in their code - not me, not
> you, not Tom Lane.  But quite apart from that, there seems to be no
> compelling benefit to having these changes in 9.5.  You say that the
> branches will diverge needlessly, but the whole point of having
> branches is that we do need things to diverge.  The question isn't
> "why shouldn't these go into 9.5?" but "do these fix something that is
> clearly broken in 9.5 and must be fixed to avoid hurting users?".
> Andres has said clearly that he doesn't think so, and Heikki didn't
> seem convinced that we wanted the changes at all.  I've read over the
> thread and I think that even if all the good things you say about this
> patch are 100% true, it doesn't amount to a good reason to back-patch.
> Code that does something possibly non-sensical or sub-optimal isn't a
> reason to back-patch in the absence of a clear, user-visible
> consequence.

+1. There are folks around doing tests on 9.5 now; it would not be
right to undermine the effort they have been putting into it until now.

> I think it's a shame that we haven't gotten this patch dealt with just
> because when somebody submits a patch in June, it's not very nice for
> it to still be pending in December, but since this stuff is even
> further outside my area of expertise than the sorting stuff, and since
> me and my split personalities only have so many hours in the day, I'm
> going to have to leave it to somebody else to pick up anyhow.  But
> that's a separate issue from whether this should be back-patched.

Many patches wait in the queue for months; that's not new. Some of
them wait even longer than that.

For those reasons, and because you are visibly still willing to work
on it, I am moving this patch to the next CF.
Regards,
-- 
Michael




Re: [HACKERS] An unlikely() experiment

2015-12-19 Thread David Rowley
On 20 December 2015 at 03:06, Andres Freund  wrote:

> On 2015-12-20 02:49:13 +1300, David Rowley wrote:
> > So I choose to ignore that, and give it a try anyway... elog(ERROR)
> > for example, surely that branch should always be in a cold path?
> > ... ereport() too perhaps?
>
> Hm, not exactly what you were thinking of, but it'd definitely be
> interesting to add unlikely annotations to debug elogs, in some
> automated manner (__builtin_choose_expr?). I've previously hacked
> elog/ereport to only support >= ERROR, and that resulted in a speedup,
> removing a lot of elog(DEBUG*)-triggered jumps.
>
>
Good thinking. It would be interesting to see benchmarks with the hacked
version of elog().

> To make sure I understand correctly: Before doing that, you added the
> errcheck()'s manually? At each elog(ERROR) callsite?
>
>
Yeah, manually, at most call sites, although I did nothing in cases like:

if (somethinggood)
   dosomethinggood();
else
  elog(ERROR, ...);

Quite possibly in that case we could use likely(), but there are also cases
where elog() is in the default: of a switch(), or contained in an else of
an if/else if/else block.

I also left out errcheck() in places where the condition called a function
with some side effect. I did this as I also wanted to test a hard-coded
false condition to see how much overhead remained. I did nothing with
ereport().


> One way to do this would be to add elog_on() / ereport_on() macros,
> directly containing the error message. Like
> #define elog_on(cond, elevel, ...) \
> do { \
> 	if (unlikely(cond)) \
> 	{ \
> 		elog(elevel, __VA_ARGS__); \
> 	} \
> } while(0)
>

Interesting idea. Do you think that is something we could do a complete
replacement with, or are you thinking of just the hotter code paths?

I've attached the patch this time; it should apply cleanly to bbbd807. It
would be good to see some performance tests on other hardware.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


errcheck1.patch.bz2
Description: BZip2 compressed data



Re: [HACKERS] WIP: bloom filter in Hash Joins with batches

2015-12-19 Thread Oleg Bartunov
Tomas,

have you seen
http://www.postgresql.org/message-id/4b4dd67f.9010...@sigaev.ru
I have a very limited internet connection (no graphics), so I may be
missing something.

Oleg

On Wed, Dec 16, 2015 at 4:15 AM, Tomas Vondra 
wrote:

> Hi,
>
> while working on the Hash Join improvements, I've been repeatedly running
> into the idea of bloom filter - various papers on hash joins mention bloom
> filters as a way to optimize access to the hash table by doing fewer
> lookups, etc.
>
> Sadly, I've been unable to actually see any real benefit of using a bloom
> filter, which I believe is mostly due to NTUP_PER_BUCKET=1, which makes the
> lookups much more efficient (so the room for bloom filter improvements is
> narrow).
>
> The one case where a bloom filter might still help is when the bloom
> filter fits into the L3 cache (a few MBs) while the hash table (or more
> accurately the buckets) does not. Then there's a chance that the bloom
> filter (which needs to do multiple lookups) might help.
>
> But I think there's another case where a bloom filter might be way more
> useful in Hash Join - when we do batching. What we do currently is that we
> simply
>
> 1) build the batches for the hash table (inner relation)
>
> 2) read the outer relation (usually the larger one), and split it
>into batches just like the hash table
>
> 3) while doing (2) we join the first batch, and write the remaining
>batches to disk (temporary files)
>
> 4) we read the batches one by one (for both tables) and do the join
>
> Now, imagine that only some of the rows in the outer table actually match
> a row in the hash table. Currently, we do write those rows into the
> temporary file, but with a bloom filter on the whole hash table (all the
> batches at once) we can skip that for some types of joins.
>
> For inner join we can immediately discard the outer row, for left join we
> can immediately output the row. In both cases we can completely eliminate
> the overhead with writing the tuple to the temporary file and then reading
> it again.
>
> The attached patch is a PoC of this approach - I'm pretty sure it's not
> perfectly correct (e.g. I only tried it with inner join), but it's good
> enough for demonstrating the benefits. It's rather incomplete (see the end
> of this e-mail), and I'm mostly soliciting some early feedback at this
> point.
>
> The numbers presented here are for a test case like this:
>
> CREATE TABLE dim (id INT, dval TEXT);
> CREATE TABLE fact (id INT, fval TEXT);
>
> INSERT INTO dim SELECT i, md5(i::text)
>   FROM generate_series(1,1000) s(i);
>
> -- repeat 10x
> INSERT INTO fact SELECT * FROM dim;
>
> and a query like this
>
> SELECT COUNT(fval) FROM fact JOIN dim USING (id) WHERE dval < 'a';
>
> with different values in the WHERE condition to select a fraction of the
> inner 'dim' table - this directly affects what portion of the 'fact' table
> has a matching row, and also the size of the hash table (and number of
> batches).
>
> Now, some numbers from a machine with 8GB of RAM (the 'fact' table has
> ~6.5GB of data, so there's actually quite a bit of memory pressure, forcing
> the temp files to disk etc.).
>
> With work_mem=16MB, it looks like this:
>
> batches   filter   select.    bloom   master   bloom/master
> ------------------------------------------------------------
>    4         1      6.25%     23871    48631      49.09%
>    8         2     12.50%     25752    56692      45.42%
>    8         3     18.75%     31273    57455      54.43%
>   16         4     25.01%     37430    62325      60.06%
>   16         5     31.25%     39005    61143      63.79%
>   16         6     37.50%     46157    63533      72.65%
>   16         7     43.75%     53500    65483      81.70%
>   32         8     49.99%     53952    65730      82.08%
>   32         9     56.23%     55187    67521      81.73%
>   32         a     62.49%     64454    69448      92.81%
>   32         b     68.73%     66937    71692      93.37%
>   32         c     74.97%     73323    72060     101.75%
>   32         d     81.23%     76703    73513     104.34%
>   32         e     87.48%     81970    74890     109.45%
>   32         f     93.74%     86102    76257     112.91%
>
> The 'batches' means how many batches were used for the join, 'filter' is
> the value used in the WHERE condition, selectivity is the fraction of the
> 'dim' table that matches the condition (and also the 'fact'). Bloom and
> master are timings of the query in milliseconds, and bloom/master is a
> comparison of the runtimes - so for example 49% means the hash join with
> a bloom filter was running ~2x as fast.
>
> Admittedly, work_mem=16MB is quite low, but that's just a way to force
> batching. What really matters is the number of batches and selectivity (how
> many tuples we can eliminate 
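
For reference, a minimal self-contained bloom filter sketch (illustrative
only; not the code from the attached PoC patch):

#include <stdbool.h>
#include <stdint.h>

#define BLOOM_BITS 8192			/* bit-array size */

typedef struct
{
	uint8_t		bits[BLOOM_BITS / 8];
} BloomFilter;

/* Two cheap integer mixers giving independent probe positions. */
static uint32_t
hash1(uint32_t x)
{
	x ^= x >> 16;
	x *= 0x85ebca6b;
	x ^= x >> 13;
	return x;
}

static uint32_t
hash2(uint32_t x)
{
	x *= 0xc2b2ae35;
	x ^= x >> 16;
	return x;
}

static void
bloom_add(BloomFilter *f, uint32_t key)
{
	uint32_t	a = hash1(key) % BLOOM_BITS;
	uint32_t	b = hash2(key) % BLOOM_BITS;

	f->bits[a / 8] |= 1 << (a % 8);
	f->bits[b / 8] |= 1 << (b % 8);
}

/* "false" means definitely not in the hash table: the outer row can be
 * discarded (inner join) or output (left join) without writing it to a
 * batch file.  "true" may be a false positive. */
static bool
bloom_test(const BloomFilter *f, uint32_t key)
{
	uint32_t	a = hash1(key) % BLOOM_BITS;
	uint32_t	b = hash2(key) % BLOOM_BITS;

	return (f->bits[a / 8] & (1 << (a % 8))) != 0 &&
		   (f->bits[b / 8] & (1 << (b % 8))) != 0;
}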

Re: [HACKERS] extend pgbench expressions with functions

2015-12-19 Thread Fabien COELHO



After looking again at the code, I remembered why doubles are useful: they
are needed for random exponential & gaussian, because the last parameter
is a double.

I do not care about the sqrt, but doubles must be allowed to keep those,
and the randoms are definitely useful for a pgbench script. Now the patch
may just keep double constants, but it would look awkward, and the doc
must explain why 1.3 and 1+2 are okay, but not 1.3 + 2.4.

So I'm less keen on removing double expressions, because it removes a key
feature. If it is a blocker I'll go for just the constant, but this looks
to me like a stupid compromise.


Hm, say that you do that in a script:
\set aid double(1.4)
\set bid random_gaussian(1, 10, :aid)
Then what is passed as the third argument to random_gaussian is 1, and
not 1.4, no?


Indeed.

Maybe pgbench should just generate an error when a variable is assigned a 
double, so that the user must explicitly add an int() cast.


If all assignments to variables are unconditionally integers, why
is it useful to make the cast function double() user-visible?


I'm not sure whether we are talking about the same thing:
 - there is a "double" type managed within expressions, but not variables
 - there is a double() function, which takes an int and casts it to double

I understood that you were suggesting removing all "double" expressions,
but now it seems to be just about the double() function.


Now, by looking at the code, I agree that you would need
to keep things like DOUBLE and coerceToDouble();
PGBENCH_RANDOM_GAUSSIAN and its other friend use them directly.


Yep.

I am just doubting that it is actually necessary to make that visible at
the user level if there is no direct use...


If there are both ints and doubles, then being able to cast makes sense,
so I just put in both functions without thinking more deeply about it.


So I would suggest generating an error when a double expression is
assigned to a variable, so as to avoid any surprise.


If both types are kept, I would like to keep the debug functions, which
are really just a debug tool for looking at what is going on within
expressions.


--
Fabien.




[HACKERS] An unlikely() experiment

2015-12-19 Thread David Rowley
Many of you might know of GCC's __builtin_expect() and that it allows us to
tell the compiler which code branches are more likely to occur. The
documentation on this does warn that normally it's a bad idea to use this
"as programmers are notoriously bad at predicting how their programs
actually perform" [1].  So I choose to ignore that, and give it a try
anyway... elog(ERROR) for example, surely that branch should always be in a
cold path? ... ereport() too perhaps?

Anyway, I knocked up a patch which went and added an errcheck() macro,
defined the same as unlikely() is, and I stuck these around just the
"if" conditions of tests before elog(ERROR) calls, such as:

if (errcheck(newoff == InvalidOffsetNumber))
	elog(ERROR, "failed to add BRIN tuple to new page");

gcc version 5.2.1 on i7-4712HQ

pgbench -M prepared -T 60 -S
Master: 12246 tps
Patched 13132 tsp (107%)

pgbench -M prepared -T 60 -S -c 8 -j 8
Master: 122982 tps
Patched: 129318 tps (105%)

I thought this was quite interesting.

Ok, so perhaps it's not that likely that we'd want to litter these
errcheck()s around everywhere, so I added a bit more logging as I thought
it's probably just a handful of calls that are making this improvement. I
changed the macro to call a function to log the __FILE__ and __LINE__, then
I loaded the results into Postgres, aggregated by file and line and sorted
by count(*) desc. Here's a sample of them (against bbbd807):

     file     | line | count
--------------+------+--------
 fmgr.c       | 1326 | 158756
 mcxt.c       |  902 | 147184
 mcxt.c       |  759 | 113162
 lwlock.c     | 1545 |  67585
 lwlock.c     |  938 |  67578
 stringinfo.c |  253 |  55364
 mcxt.c       |  831 |  47984
 fmgr.c       | 1030 |  36271
 syscache.c   |  978 |  28182
 dynahash.c   |  981 |  22721
 dynahash.c   | 1665 |  19994
 mcxt.c       |  931 |  18594

Perhaps it would be worth doing something with the hottest ones?

Alternatively, if there was some way to mark the path as cold from within
the path, rather than from the if() condition before the path, then we
could perhaps do something in the elog() macro instead. I just couldn't
figure out a way to do this.

Does anyone have any thoughts on this?

[1] https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] An unlikely() experiment

2015-12-19 Thread Andres Freund
Hi,

On 2015-12-20 02:49:13 +1300, David Rowley wrote:
> Many of you might know of GCC's __builtin_expect() and that it allows us to
> tell the compiler which code branches are more likely to occur.

It also tells the reader that, which I also find rather helpful.


> The documentation on this does warn that normally it's a bad idea to
> use this "as programmers are notoriously bad at predicting how their
> programs actually perform" [1].

I think we indeed have to be careful not to add this to 40/60 type
branches. But 99/1 or so paths are ok.


> So I choose to ignore that, and give
> it a try anyway... elog(ERROR) for example, surely that branch should
> always be in a cold path? ... ereport() too perhaps?

Hm, not exactly what you were thinking of, but it'd definitely be
interesting to add unlikely annotations to debug elogs, in some
automated manner (__builtin_choose_expr?). I've previously hacked
elog/ereport to only support >= ERROR, and that resulted in a speedup,
removing a lot of elog(DEBUG*)-triggered jumps.


> Anyway, I knocked up a patch which went an added an errcheck() macro which
> is defined the same as unlikely() is, and I stuck these around just the
> "if" conditions for tests before elog(ERROR) calls, such as:

Personally I'd rather go with a plain unlikely() - I don't think
errcheck as a name really buys us that much, and it's a bit restrictive.


> gcc version 5.2.1 on i7-4712HQ
>
> pgbench -M prepared -T 60 -S
> Master: 12246 tps
> Patched 13132 tsp (107%)
>
> pgbench -M prepared -T 60 -S -c 8 -j 8
> Master: 122982 tps
> Patched: 129318 tps (105%)

Not bad at all.


> Ok, so perhaps it's not that likely that we'd want to litter these
> errcheck()s around everywhere, so I added a bit more logging as I thought
> it's probably just a handful of calls that are making this improvement. I
> changed the macro to call a function to log the __FILE__ and __LINE__, then
> I loaded the results into Postgres, aggregated by file and line and sorted
> by count(*) desc. Here's a sample of them (against bbbd807):
>
>      file     | line | count
> --------------+------+--------
>  fmgr.c       | 1326 | 158756
>  mcxt.c       |  902 | 147184
>  mcxt.c       |  759 | 113162
>  lwlock.c     | 1545 |  67585
>  lwlock.c     |  938 |  67578
>  stringinfo.c |  253 |  55364
>  mcxt.c       |  831 |  47984
>  fmgr.c       | 1030 |  36271
>  syscache.c   |  978 |  28182
>  dynahash.c   |  981 |  22721
>  dynahash.c   | 1665 |  19994
>  mcxt.c       |  931 |  18594

To make sure I understand correctly: Before doing that, you added the
errcheck()'s manually? At each elog(ERROR) callsite?


> Perhaps it would be worth doing something with the hottest ones?

One way to do this would be to add elog_on() / ereport_on() macros,
directly containing the error message. Like
#define elog_on(cond, elevel, ...) \
do { \
	if (unlikely(cond)) \
	{ \
		elog(elevel, __VA_ARGS__); \
	} \
} while(0)

Greetings,

Andres

