Re: [HACKERS] Review: GiST support for UUIDs

2015-12-25 Thread Teodor Sigaev

Thank you, but I have some notes about this:
static float
uuid_parts_distance(pg_uuid_t *a, pg_uuid_t *b)
{
pg_uuid_t ua, ub;
const double mp = pow(2, -64);

uuid_cnv(a, &ua);
uuid_cnv(b, &ub);

Assert(ua.v64[0] > ub.v64[0]);
uint64 high = ua.v64[0] - ub.v64[0];
uint64 low = ua.v64[1] - ub.v64[1];
if (low > ua.v64[1])
high--;

return (float) (ldexp(high, 64) + (double) low * mp);
}

First, variables (high and low) should not be declared in the middle of a code
block. Second, the assert will fail if ua.v64[0] == ub.v64[0] and
ua.v64[1] > ub.v64[1], although that is a possible and allowed case. Third, you
actually multiply the high value by 2^64 and the low value by 2^-64; it seems
only one multiplication should be needed.
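A minimal Python sketch of a distance that addresses all three points (illustrative only, not the patch's actual C code): a UUID is modeled as a (high, low) pair of unsigned 64-bit halves, equal high halves are handled without asserting, and a single power-of-two scaling is applied:

```python
# Illustrative sketch, not the btree_gist C code: a UUID is modeled
# as a (high, low) pair of unsigned 64-bit values.

MASK64 = (1 << 64) - 1

def uuid_distance(a, b):
    """Absolute distance between two UUIDs given as (high, low) pairs,
    with the low half scaled by 2**-64 so one scaling step suffices."""
    va = (a[0] << 64) | a[1]
    vb = (b[0] << 64) | b[1]
    diff = abs(va - vb)                  # symmetric: no assert needed
    # single scaling: high half contributes as-is, low half as 2**-64
    return (diff >> 64) + (diff & MASK64) * 2.0 ** -64

# equal high halves are a legal case and no longer problematic
assert uuid_distance((5, 10), (5, 3)) == 7 * 2.0 ** -64
assert uuid_distance((7, 0), (5, 0)) == 2.0
```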



Ildus Kurbangaliev wrote:

On Wed, 23 Dec 2015 16:36:23 -0800
Paul Jungwirth  wrote:


On 12/23/2015 08:10 AM, Ildus Kurbangaliev wrote:

There is an improved version of the patch. The main idea is to
represent a uuid as two uint64 values, and base comparisons and penalty
calculation on these values. This approach is much faster
than using memcmp for uuid comparisons.


Thank you for picking this up! I'm sorry I was not able to work on it
the last few months. I'm very glad to see someone wrapping it up. I'm
not a reviewer, but personally it looks like a good change to me.

Happy holidays,

Paul






Thanks! The patch was almost done and ready. I attached a new version of
the patch with compatibility changes.



--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: pg_trgm: gin index scan performance for similarity search

2015-12-25 Thread Teodor Sigaev

Good catch, committed.

Fornaroli Christophe wrote:

Hi,

I think that we can improve the gin index scan performance for similarity search
defined in the pg_trgm extension. The similarity function is (for the default
case where DIVUNION is defined in the code):

 count / (len1 + len2 - count) >= trgm_limit

where
   len1 is the number of unique trigrams for the first string,
   len2 is the same number for the second string,
   count is the number of common trigrams between both strings,
   trgm_limit is a user-specified limit in [0, 1].
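As a rough illustration of this formula, here is a simplified Python sketch (the padding and case rules are approximations of pg_trgm's actual behavior, not its C implementation):

```python
def trigrams(s):
    # pg_trgm-style padding, simplified: two spaces before, one after
    s = '  ' + s.lower() + ' '
    return {s[i:i + 3] for i in range(len(s) - 2)}

def similarity(s1, s2):
    t1, t2 = trigrams(s1), trigrams(s2)
    count = len(t1 & t2)                # common trigrams
    return count / (len(t1) + len(t2) - count)

assert similarity('abc123', 'abc123') == 1.0   # identical strings
assert similarity('abc', 'xyz') == 0.0         # nothing in common
```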

The code used to determine if a tuple may match the query string is:

 case SimilarityStrategyNumber:
 /* Count the matches */
 ntrue = 0;
 for (i = 0; i < nkeys; i++)
 {
 if (check[i] != GIN_FALSE)
 ntrue++;
 }
#ifdef DIVUNION
 res = (nkeys == ntrue) ? GIN_MAYBE : (((((float4) ntrue) /
((float4) (nkeys - ntrue))) >= trgm_limit) ? GIN_MAYBE : GIN_FALSE);
#else
 res = (nkeys == 0) ? GIN_FALSE : (((((float4) ntrue) / ((float4)
nkeys)) >= trgm_limit) ? GIN_MAYBE : GIN_FALSE);
#endif

where
   ntrue is the number of common trigrams in both strings,
   nkeys is the number of trigrams in the search string.

This code uses ntrue / (nkeys - ntrue) as an upper bound for the similarity.
But if there are ntrue trigrams in common, we know that the indexed string is
at least ntrue trigrams long. We can then use a tighter upper bound:
ntrue / (ntrue + (nkeys - ntrue)), i.e. ntrue / nkeys. Attached is a patch
that makes this change.
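The effect of the tighter bound can be sketched as follows (a hedged illustration; variable names follow the code above):

```python
def old_bound(ntrue, nkeys):
    # master's upper bound: ntrue / (nkeys - ntrue), valid for ntrue < nkeys
    return ntrue / (nkeys - ntrue)

def new_bound(ntrue, nkeys):
    # patch: the indexed string has at least ntrue trigrams, so the
    # similarity cannot exceed ntrue / (ntrue + (nkeys - ntrue))
    return ntrue / nkeys

# The new bound never exceeds the old one, so with the same trgm_limit
# the index scan rejects more non-matching tuples before the recheck.
trgm_limit = 0.3
for ntrue in range(0, 7):
    assert new_bound(ntrue, 7) <= old_bound(ntrue, 7)

# example: 2 of 7 query trigrams match
assert old_bound(2, 7) >= trgm_limit   # master: maybe-match, goes to recheck
assert new_bound(2, 7) < trgm_limit    # patched: rejected by the index
```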

Here are some performance gains with this test case:

create table foo as select
   substring(md5(random()::text) for random() * 5) || '123' as bar
from generate_series(1,100);

create index on foo using gin (bar gin_trgm_ops);

patched:

test=# explain analyze select count(*) from foo where bar % 'abc123';
   QUERY PLAN
---
  Aggregate  (cost=2511.14..2511.15 rows=1 width=0) (actual
time=807.434..807.435 rows=1 loops=1)
->  Bitmap Heap Scan on foo  (cost=99.75..2508.64 rows=1000 width=0) (actual
time=109.893..787.261 rows=54746 loops=1)
  Recheck Cond: (bar % 'abc123'::text)
  Rows Removed by Index Recheck: 55125
  Heap Blocks: exact=4514
  ->  Bitmap Index Scan on foo_bar_idx  (cost=0.00..99.50 rows=1000
width=0) (actual time=108.456..108.456 rows=109871 loops=1)
Index Cond: (bar % 'abc123'::text)
  Planning time: 0.353 ms
  Execution time: 807.593 ms
(9 rows)

test=# explain analyze select count(*) from foo where bar % 'abcdef';
QUERY PLAN

  Aggregate  (cost=2511.14..2511.15 rows=1 width=0) (actual time=4.829..4.830
rows=1 loops=1)
->  Bitmap Heap Scan on foo  (cost=99.75..2508.64 rows=1000 width=0) (actual
time=3.512..4.794 rows=5 loops=1)
  Recheck Cond: (bar % 'abcdef'::text)
  Rows Removed by Index Recheck: 137
  Heap Blocks: exact=139
  ->  Bitmap Index Scan on foo_bar_idx  (cost=0.00..99.50 rows=1000
width=0) (actual time=3.355..3.355 rows=142 loops=1)
Index Cond: (bar % 'abcdef'::text)
  Planning time: 0.363 ms
  Execution time: 5.061 ms
(9 rows)


master:

test=# explain analyze select count(*) from foo where bar % 'abc123';
   QUERY PLAN
---
  Aggregate  (cost=2511.14..2511.15 rows=1 width=0) (actual
time=6416.554..6416.554 rows=1 loops=1)
->  Bitmap Heap Scan on foo  (cost=99.75..2508.64 rows=1000 width=0) (actual
time=484.359..6389.819 rows=54746 loops=1)
  Recheck Cond: (bar % 'abc123'::text)
  Rows Removed by Index Recheck: 945250
  Heap Blocks: exact=4514
  ->  Bitmap Index Scan on foo_bar_idx  (cost=0.00..99.50 rows=1000
width=0) (actual time=482.677..482.677 rows=96 loops=1)
Index Cond: (bar % 'abc123'::text)
  Planning time: 0.359 ms
  Execution time: 6416.945 ms
(9 rows)

test=# explain analyze select count(*) from foo where bar % 'abcdef';
QUERY PLAN
-
  Aggregate  (cost=2511.14..2511.15 rows=1 width=0) (actual time=30.678..30.679
rows=1 loops=1)
->  Bitmap Heap Scan on foo  (cost=99.75..2508.64 rows=1000 width=0) (actual
time=9.020..30.643 rows=5 loops=1)
  Recheck Cond: (bar % 

Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW

2015-12-25 Thread Etsuro Fujita

On 2015/12/24 4:34, Robert Haas wrote:

On Wed, Dec 23, 2015 at 5:50 AM, Rushabh Lathia
 wrote:

+1.

I like the idea of a separate FDW API for the DML pushdown. I was wondering:
can't we re-use IterateForeignScan(ForeignScanState *node) rather than
introducing a new IterateDMLPushdown(ForeignScanState *node) API?



Yeah, I think we need to ask ourselves what advantage we're getting
out of adding any new core APIs.  Marking the scan as a pushed-down
update or delete has some benefit in terms of making the information
visible via EXPLAIN, but even that's a pretty thin benefit.  The
iterate method seems to just complicate the core code without any
benefit at all.  More generally, there is very, very little code in
this patch that accomplishes anything that could not be done just as
well with the existing methods.  So why are we even doing these core
changes?


From the FDWs' point of view, ISTM that what FDWs have to do for
IterateDMLPushdown is quite different from what they have to do for
IterateForeignScan; eg, IterateDMLPushdown must work in accordance with the
presence/absence of a RETURNING list.  (In addition, IterateDMLPushdown has
been designed so that it must make the scan tuple available to the later
RETURNING projection in nodeModifyTable.c.)  So, I think it's better for
FDWs to have separate APIs for the DML pushdown, making the FDW code much
simpler.  Based on that idea, I added the postgres_fdw changes to the patch.
Attached is an updated version of the patch, which is still WIP, but I'd be
happy to get any feedback.



Tom seemed to think that we could centralize some checks in the core
code, say, related to triggers, or to LIMIT.  But there's nothing like
that in this patch, so I'm not really understanding the point.


For the trigger check, I added relation_has_row_level_triggers.  I placed
that function in postgres_fdw.c in the updated patch, but I think that by
placing it in the core, FDWs could share it.  As for the LIMIT, I'm not sure
we can do anything about that.


I think the current design allows us to handle a pushed-down update on a
join, ie, "UPDATE foo ... FROM bar ..." where both foo and bar are remote,
which was Tom's concern, but I'll leave that for another patch.  Also, I
think the current design could be extended to push down INSERT .. RETURNING ..,
but I'd like to leave that for future work.


I'll add this to the next CF.

Best regards,
Etsuro Fujita
*** a/contrib/postgres_fdw/deparse.c
--- b/contrib/postgres_fdw/deparse.c
***
*** 816,822  deparseTargetList(StringInfo buf,
   *
   * If params is not NULL, it receives a list of Params and other-relation Vars
   * used in the clauses; these values must be transmitted to the remote server
!  * as parameter values.
   *
   * If params is NULL, we're generating the query for EXPLAIN purposes,
   * so Params and other-relation Vars should be replaced by dummy values.
--- 816,822 
   *
   * If params is not NULL, it receives a list of Params and other-relation Vars
   * used in the clauses; these values must be transmitted to the remote server
!  * as parameter values.  Caller is responsible for initializing it to empty.
   *
   * If params is NULL, we're generating the query for EXPLAIN purposes,
   * so Params and other-relation Vars should be replaced by dummy values.
***
*** 833,841  appendWhereClause(StringInfo buf,
  	int			nestlevel;
  	ListCell   *lc;
  
- 	if (params)
- 		*params = NIL;			/* initialize result list to empty */
- 
  	/* Set up context struct for recursion */
  	context.root = root;
  	context.foreignrel = baserel;
--- 833,838 
***
*** 971,976  deparseUpdateSql(StringInfo buf, PlannerInfo *root,
--- 968,1030 
  }
  
  /*
+  * deparse remote UPDATE statement
+  *
+  * The statement text is appended to buf, and we also create an integer List
+  * of the columns being retrieved by RETURNING (if any), which is returned
+  * to *retrieved_attrs.
+  */
+ void
+ deparsePushedDownUpdateSql(StringInfo buf, PlannerInfo *root,
+ 		   Index rtindex, Relation rel,
+ 		   List	*targetlist,
+ 		   List *targetAttrs,
+ 		   List	*remote_conds,
+ 		   List **params_list,
+ 		   List *returningList,
+ 		   List **retrieved_attrs)
+ {
+ 	RelOptInfo *baserel = root->simple_rel_array[rtindex];
+ 	deparse_expr_cxt context;
+ 	bool		first;
+ 	ListCell   *lc;
+ 
+ 	if (params_list)
+ 		*params_list = NIL;		/* initialize result list to empty */
+ 
+ 	/* Set up context struct for recursion */
+ 	context.root = root;
+ 	context.foreignrel = baserel;
+ 	context.buf = buf;
+ 	context.params_list = params_list;
+ 
+ 	appendStringInfoString(buf, "UPDATE ");
+ 	deparseRelation(buf, rel);
+ 	appendStringInfoString(buf, " SET ");
+ 
+ 	first = true;
+ 	foreach(lc, targetAttrs)
+ 	{
+ 		int			attnum = lfirst_int(lc);
+ 		TargetEntry *tle = get_tle_by_resno(targetlist, 

Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-12-25 Thread Vinayak Pokale
Hi,
Please find attached patch addressing following comments.

>The relation OID should be reported and not its name. In case of a
>relation rename that would not be cool for tracking, and most users
>are surely going to join with other system tables using it.
The relation OID is reported instead of relation name.
The following interface function is called at the beginning to report the
relation OID once.
void pgstat_report_command_target(Oid relid)

>Regarding pg_stat_get_vacuum_progress(): I think a backend can simply be
>skipped if (!has_privs_of_role(GetUserId(), beentry->st_userid)) (cannot
>put that in plain English, :))
Updated in the attached patch.

In the previous patch, ACTIVITY_IS_VACUUM was set unnecessarily for
VACOPT_FULL, even though such vacuums are not covered by lazy_scan_heap().
I have updated that in the attached patch and also renamed ACTIVITY_IS_VACUUM
to COMMAND_LAZY_VACUUM.

Added documentation for the view.
Some more comments still need to be addressed.

Regards,

Vinayak Pokale

On Sat, Dec 12, 2015 at 2:07 AM, Robert Haas  wrote:

> On Fri, Dec 11, 2015 at 1:25 AM, Michael Paquier
>  wrote:
> >> For another thing, there are definitely going to be
> >> some people that want the detailed information - and I can practically
> >> guarantee that if we don't make it available, at least one person will
> >> write a tool that tries to reverse-engineer the detailed progress
> >> information from whatever we do report.
> >
> > OK, so this justifies the fact of having detailed information, but
> > does it justify the fact of having precise and accurate data? ISTM
> > that having detailed information and precise information are two
> > different things. The level of details is defined depending on how
> > verbose we want the information to be, and the list you are giving
> > would fulfill this requirement nicely for VACUUM. The level of
> > precision/accuracy at which this information is provided though
> > depends at which frequency we want to send this information. For
> > long-running VACUUM it does not seem necessary to update the fields of
> > the progress tracker each time a counter needs to be incremented.
> > CLUSTER has been mentioned as well as a potential target for the
> > progress facility, but it seems that it enters as well in the category
> > of things that would need to be reported on a slower frequency pace
> > than "each-time-a-counter-is-incremented".
> >
> > My impression is just based on the needs of VACUUM and CLUSTER.
> > Perhaps I am lacking imagination regarding the potential use cases of
> > the progress facility though in cases where we'd want to provide
> > extremely precise progress information :)
> > It just seems to me that this is not a requirement for VACUUM or
> > CLUSTER. That's all.
>
> It's not a hard requirement, but it should be quite easy to do without
> adding any significant overhead.  All you need to do is something
> like:
>
> foo->changecount++;
> pg_write_barrier();
> foo->count_of_blocks++;
> pg_write_barrier();
> foo->changecount++;
>
> I suspect that's plenty cheap enough to do for every block.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
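A single-process Python sketch of the changecount protocol quoted above (illustrative control flow only; real shared-memory code also needs the write barriers shown, plus read barriers on the reader side):

```python
class Progress:
    """Writer side of the changecount protocol: the counter is odd
    while an update is in flight and even when the fields are stable."""
    def __init__(self):
        self.changecount = 0
        self.count_of_blocks = 0

    def advance_block(self):
        self.changecount += 1        # odd: update in progress
        self.count_of_blocks += 1    # the actual progress field(s)
        self.changecount += 1        # even again: update complete

def read_consistent(p):
    """Reader side: retry until the counter is even and unchanged
    across the copy, guaranteeing a consistent snapshot."""
    while True:
        before = p.changecount
        snapshot = p.count_of_blocks
        if before % 2 == 0 and p.changecount == before:
            return snapshot          # no writer was active mid-copy

p = Progress()
for _ in range(5):
    p.advance_block()
assert read_consistent(p) == 5
```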


Vacuum_progress_checker_v8.patch
Description: Binary data



Re: [HACKERS] MergeAttributes type (mod) conflict error detail

2015-12-25 Thread Tom Lane
Amit Langote  writes:
> I wonder if the following error detail text could say more than it does
> currently for the following rather artificial example case:

> CREATE TABLE p1(a char(3));
> CREATE TABLE p2(a char(2));

> CREATE TABLE c(d int) INHERITS (p1, p2);
> NOTICE:  merging multiple inherited definitions of column "a"
> ERROR:  inherited column "a" has a type conflict
> DETAIL:  character versus character

> Any specific reason why it doesn't spell out typmods in the above detail
> message?

I agree this could stand to be improved.  There are probably a couple of
reasons why this code is like it is:

* We did not use to have format_type_with_typemod(), only
format_type_be().  That's easily changed now of course.

* There's a rough policy in the parser to prefer TypeNameToString
when complaining about a TypeName input, rather than reconstructing
the type name from the OID.  The reason for this is that we'd rather
complain about the type name as entered, not the canonical type name
--- for example, if the user typed "float8" it might be a bit confusing
if the parser then complains about "double precision".

I'm not entirely sure though that that argument should be applied
to this particular case, because the other type referred to will
certainly get displayed in canonical form.

So we could either apply your patch as written, or we could replace
only the format_type_be calls with format_type_with_typemod, and
then fix TypeNameToString so that it will show the typmod if any.
(We'd need to consider whether that behavior is OK for all callers.)

Even if we decide this particular case is best handled by presenting
canonical type names on both sides, maybe it would be wise to look
into whether TypeNameToString should be changed for other callers.

regards, tom lane




Re: [HACKERS] Performance degradation in commit ac1d794

2015-12-25 Thread Tom Lane
Andres Freund  writes:
> On December 25, 2015 7:10:23 PM GMT+01:00, Tom Lane  
> wrote:
>> Seems like what you've got here is a kernel bug.

> I wouldn't go as far as calling it a kernel bug. We're still doing 300k tps.
> And we're triggering the performance degradation by adding another socket
> (IIRC) to the poll(2) call.

Hmm.  And all those FDs point to the same pipe.  I wonder if we're looking
at contention for some pipe-related data structure inside the kernel.

regards, tom lane




[HACKERS] Performance degradation in commit ac1d794

2015-12-25 Thread Васильев Дмитрий
Hello hackers!

I've found that commit ac1d794 causes up to a 3x performance degradation.

I tried to run pgbench -s 1000 -j 48 -c 48 -S -M prepared on a 70 CPU-core
machine:
commit ac1d794 gives me 363,474 tps,
the previous commit a05dc4d gives me 956,146 tps,
and master (3d0c50f) with ac1d794 reverted gives me 969,265 tps.

it's shocking

---
Dmitry Vasilyev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


Re: [HACKERS] Performance degradation in commit ac1d794

2015-12-25 Thread Васильев Дмитрий
2015-12-25 20:18 GMT+03:00 Andres Freund :

> On December 25, 2015 6:08:15 PM GMT+01:00, "Васильев Дмитрий" <
> d.vasil...@postgrespro.ru> wrote:
> >Hello hackers!
> >
> >I suddenly found commit ac1d794 gives up to 3 times performance
> >degradation.
> >
> >I tried to run pgbench -s 1000 -j 48 -c 48 -S -M prepared on 70
> >CPU-core
> >machine:
> >commit ac1d794 gives me 363,474 tps
> >and previous commit a05dc4d gives me 956,146
> >and master( 3d0c50f ) with revert ac1d794 gives me 969,265
> >
> >it's shocking
>
>
>
> You're talking about
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=blobdiff;f=src/backend/libpq/be-secure.c;h=2ddcf428f89fd12c230d6f417c2f707fbd97bf39;hp=26d8faaf773a818b388b899b8d83d617bdf7af9b;hb=ac1d794;hpb=a05dc4d7fd57d4ae084c1f0801973e5c1a1aa26e
>
> If so, could you provide a hierarchical before/after profile?
>
> Andres
> Hi,
>
> ---
> Please excuse brevity and formatting - I am writing this on my mobile
> phone.
>

> You're talking about
http://git.postgresql.org/gitweb/?p=postgresql.git;a=blobdiff;f=src/backend/libpq/be-secure.c;h=2ddcf428f89fd12c230d6f417c2f707fbd97bf39;hp=26d8faaf773a818b388b899b8d83d617bdf7af9b;hb=ac1d794;hpb=a05dc4d7fd57d4ae084c1f0801973e5c1a1aa26e

Yes, about this.

> If so, could you provide a hierarchical before/after profile?

Performance | Git hash commit                          | Date
~ 360k tps  | c3e7c24a1d60dc6ad56e2a0723399f1570c54224 | Thu Nov 12 09:12:18 2015 -0500
~ 360k tps  | ac1d7945f866b1928c2554c0f80fd52d7f92     | Thu Nov 12 09:00:33 2015 -0500
~ 960k tps  | a05dc4d7fd57d4ae084c1f0801973e5c1a1aa26e | Thu Nov 12 07:40:31 2015 -0500


---
Dmitry Vasilyev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


Re: [HACKERS] Performance degradation in commit ac1d794

2015-12-25 Thread Andres Freund
On December 25, 2015 6:27:06 PM GMT+01:00, "Васильев Дмитрий" 
>>​If so, could you provide a hierarchical before/after profile?
>
>Performance | Git hash commit                          | Date
>~ 360k tps  | c3e7c24a1d60dc6ad56e2a0723399f1570c54224 | Thu Nov 12 09:12:18 2015 -0500
>~ 360k tps  | ac1d7945f866b1928c2554c0f80fd52d7f92     | Thu Nov 12 09:00:33 2015 -0500
>~ 960k tps  | a05dc4d7fd57d4ae084c1f0801973e5c1a1aa26e | Thu Nov 12 07:40:31 2015 -0500

Profile as in perf, oprofile or something similar.

--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.




Re: [HACKERS] Commit fest status for 2015-11

2015-12-25 Thread Michael Paquier
On Fri, Dec 25, 2015 at 5:20 AM, Gavin Flower
 wrote:
> On 25/12/15 01:45, Michael Paquier wrote:
> [...]
>>
>> And the CF is now closed. Here is the final score:
>> Committed: 30.
>> Moved to next CF: 42.
>> Rejected: 9.
>> Returned with Feedback: 22.
>> Total: 103.
>> Regards,
>
>
> You didn't say how many regards...

One.

> [More seriously]
> Many thanks to you, and the other Postgres developers, for all your hard
> work & dedication, much appreciated!

Thanks. I am moving back to more normal hacking activity, and there is
much to be done. Being a CFM is a highly energy-draining activity.
-- 
Michael




Re: [HACKERS] Remove Windows crash dump support?

2015-12-25 Thread Craig Ringer
On 25 December 2015 at 19:45, Michael Paquier 
wrote:

> On Thu, Dec 24, 2015 at 5:57 PM, Dave Page  wrote:
> > On Thu, Dec 24, 2015 at 2:14 AM, Craig Ringer 
> wrote:
> >> On 22 December 2015 at 23:48, Alex Ignatov 
> wrote:
> >>
> >>>
> >>> I think that you can debug crash dump since windbg exists.
> >>
> >>
> >> Nobody in their right mind uses windbg though. Visual Studio is really
> where
> >> it's at and the Express versions make it much more practical.
>
> Well, FWIW, I have been working lately on a bug hidden in a custom
> background worker on Windows that crashed in some weird way with a
> 0x00C5, and while I have a reproducible test case I have not yet
> found the time to look at it, but I would think that this is
> generating a core dump, and that I will need to use windbg for that.
> Hence count me on the -1 team.
>

Huh?

You can use VS Express to debug a core dump.  Per links upthread you can
get the platform to generate core dumps for you. No windbg required.

If you want to torture yourself using windbg, go ahead, but it really isn't
necessary; you can use a sane debugger.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Performance degradation in commit ac1d794

2015-12-25 Thread Tom Lane
Васильев Дмитрий  writes:
> Samples: 1M of event 'cycles', Event count (approx.): 816922259995, UID: pgpro
>   Overhead  Shared Object   Symbol
>   69,72%  [kernel]    [k] _raw_spin_lock_irqsave
>    1,43%  postgres    [.] _bt_compare
>    1,19%  postgres    [.] LWLockAcquire
>    0,99%  postgres    [.] hash_search_with_hash_value
>    0,61%  postgres    [.] PinBuffer

Seems like what you've got here is a kernel bug.

regards, tom lane




Re: [HACKERS] Review: GiST support for UUIDs

2015-12-25 Thread Ildus Kurbangaliev
On Fri, 25 Dec 2015 13:34:25 +0300
Teodor Sigaev  wrote:

> Thank you, but I have some notes about this:
> static float
> uuid_parts_distance(pg_uuid_t *a, pg_uuid_t *b)
> {
>  pg_uuid_t ua, ub;
>  const double mp = pow(2, -64);
> 
>  uuid_cnv(a, &ua);
>  uuid_cnv(b, &ub);
> 
>  Assert(ua.v64[0] > ub.v64[0]);
>  uint64 high = ua.v64[0] - ub.v64[0];
>  uint64 low = ua.v64[1] - ub.v64[1];
>  if (low > ua.v64[1])
>  high--;
> 
>  return (float) (ldexp(high, 64) + (double) low * mp);
> }
> 
> First, variables (high and low) should not be declared in the middle
> of a code block. Second, the assert will fail if ua.v64[0] == ub.v64[0] and
> ua.v64[1] > ub.v64[1], although that is a possible and allowed case.
> Third, you actually multiply the high value by 2^64 and the low value by
> 2^-64; it seems only one multiplication should be needed.

Thank you for review. Fixed.

-- 
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/contrib/btree_gist/Makefile b/contrib/btree_gist/Makefile
index a4b2cc7..43a6c5d 100644
--- a/contrib/btree_gist/Makefile
+++ b/contrib/btree_gist/Makefile
@@ -6,16 +6,16 @@ OBJS =  btree_gist.o btree_utils_num.o btree_utils_var.o btree_int2.o \
 btree_int4.o btree_int8.o btree_float4.o btree_float8.o btree_cash.o \
 btree_oid.o btree_ts.o btree_time.o btree_date.o btree_interval.o \
 btree_macaddr.o btree_inet.o btree_text.o btree_bytea.o btree_bit.o \
-btree_numeric.o $(WIN32RES)
+btree_numeric.o btree_uuid.o $(WIN32RES)
 
 EXTENSION = btree_gist
-DATA = btree_gist--1.1.sql btree_gist--unpackaged--1.0.sql \
-	btree_gist--1.0--1.1.sql
+DATA = btree_gist--1.2.sql btree_gist--unpackaged--1.0.sql \
+	btree_gist--1.0--1.1.sql btree_gist--1.1--1.2.sql
 PGFILEDESC = "btree_gist - B-tree equivalent GiST operator classes"
 
 REGRESS = init int2 int4 int8 float4 float8 cash oid timestamp timestamptz \
 time timetz date interval macaddr inet cidr text varchar char bytea \
-bit varbit numeric not_equal
+bit varbit numeric uuid not_equal
 
 SHLIB_LINK += $(filter -lm, $(LIBS))
 
diff --git a/contrib/btree_gist/btree_gist--1.1--1.2.sql b/contrib/btree_gist/btree_gist--1.1--1.2.sql
new file mode 100644
index 000..376e884
--- /dev/null
+++ b/contrib/btree_gist/btree_gist--1.1--1.2.sql
@@ -0,0 +1,64 @@
+/* contrib/btree_gist/btree_gist--1.1--1.2.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION btree_gist UPDATE TO '1.2'" to load this file. \quit
+
+-- Add support for indexing UUID columns
+
+-- define the GiST support methods
+CREATE FUNCTION gbt_uuid_consistent(internal,uuid,int2,oid,internal)
+RETURNS bool
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT;
+
+CREATE FUNCTION gbt_uuid_fetch(internal)
+RETURNS internal
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT;
+
+CREATE FUNCTION gbt_uuid_compress(internal)
+RETURNS internal
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT;
+
+CREATE FUNCTION gbt_uuid_penalty(internal,internal,internal)
+RETURNS internal
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT;
+
+CREATE FUNCTION gbt_uuid_picksplit(internal, internal)
+RETURNS internal
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT;
+
+CREATE FUNCTION gbt_uuid_union(bytea, internal)
+RETURNS gbtreekey16
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT;
+
+CREATE FUNCTION gbt_uuid_same(internal, internal, internal)
+RETURNS internal
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT;
+
+-- Create the operator class
+CREATE OPERATOR CLASS gist_uuid_ops
+DEFAULT FOR TYPE uuid USING gist
+AS
+	OPERATOR	1	<   ,
+	OPERATOR	2	<=  ,
+	OPERATOR	3	=   ,
+	OPERATOR	4	>=  ,
+	OPERATOR	5	>   ,
+	FUNCTION	1	gbt_uuid_consistent (internal, uuid, int2, oid, internal),
+	FUNCTION	2	gbt_uuid_union (bytea, internal),
+	FUNCTION	3	gbt_uuid_compress (internal),
+	FUNCTION	4	gbt_decompress (internal),
+	FUNCTION	5	gbt_uuid_penalty (internal, internal, internal),
+	FUNCTION	6	gbt_uuid_picksplit (internal, internal),
+	FUNCTION	7	gbt_uuid_same (internal, internal, internal),
+	STORAGE		gbtreekey32;
+
+ALTER OPERATOR FAMILY gist_uuid_ops USING gist ADD
+	OPERATOR	6	<>  (uuid, uuid) ,
+	FUNCTION	9 (uuid, uuid) gbt_uuid_fetch (internal) ;
diff --git a/contrib/btree_gist/btree_gist--1.1.sql b/contrib/btree_gist/btree_gist--1.1.sql
deleted file mode 100644
index cdec964..000
--- a/contrib/btree_gist/btree_gist--1.1.sql
+++ /dev/null
@@ -1,1570 +0,0 @@
-/* contrib/btree_gist/btree_gist--1.0.sql */
-
--- complain if script is sourced in psql, rather than via CREATE EXTENSION
-\echo Use "CREATE EXTENSION btree_gist" to load this file. \quit
-
-CREATE FUNCTION gbtreekey4_in(cstring)
-RETURNS gbtreekey4
-AS 'MODULE_PATHNAME', 'gbtreekey_in'
-LANGUAGE C IMMUTABLE STRICT;
-
-CREATE FUNCTION gbtreekey4_out(gbtreekey4)
-RETURNS cstring
-AS 'MODULE_PATHNAME', 'gbtreekey_out'
-LANGUAGE C IMMUTABLE STRICT;
-
-CREATE TYPE 

Re: [HACKERS] Performance degradation in commit ac1d794

2015-12-25 Thread Васильев Дмитрий
2015-12-25 20:44 GMT+03:00 Andres Freund :

> On December 25, 2015 6:27:06 PM GMT+01:00, "Васильев Дмитрий"
> >>​If so, could you provide a hierarchical before/after profile?
> >
> >​Performance | Git hash commit| Date
> >~ 360k tps | c3e7c24a1d60dc6ad56e2a0723399f1570c54224 | Thu Nov 12
> >09:12:18
> >2015 -0500
> >~ 360k tps | ac1d7945f866b1928c2554c0f80fd52d7f92| Thu Nov 12
> >09:00:33 2015 -0500
> >~ 960k tps | a05dc4d7fd57d4ae084c1f0801973e5c1a1aa26e  | Thu Nov 12
> >07:40:31 2015 -0500
>
> Profile as in perf oprofile or something.
>
> ---
> Please excuse brevity and formatting - I am writing this on my mobile
> phone.
>

ac1d794:

Samples: 1M of event 'cycles', Event count (approx.): 816922259995, UID: pgpro
  Overhead  Shared Object   Symbol
  69,72%  [kernel]    [k] _raw_spin_lock_irqsave
   1,43%  postgres    [.] _bt_compare
   1,19%  postgres    [.] LWLockAcquire
   0,99%  postgres    [.] hash_search_with_hash_value
   0,61%  postgres    [.] PinBuffer
   0,46%  postgres    [.] GetSnapshotData

a05dc4d:

Samples: 1M of event 'cycles', Event count (approx.): 508150718694, UID: pgpro
  Overhead  Shared Object   Symbol
   4,77%  postgres    [.] GetSnapshotData
   4,30%  postgres    [.] _bt_compare
   3,13%  postgres    [.] hash_search_with_hash_value
   3,08%  postgres    [.] LWLockAcquire
   2,09%  postgres    [.] LWLockRelease
   2,03%  postgres    [.] PinBuffer




Perf record generates a lot of data:

time perf record -u pgpro -g --call-graph=dwarf
^C[ perf record: Woken up 0 times to write data ]
Warning:
Processed 1078453 events and lost 18257 chunks!
Check IO/CPU overload!
[ perf record: Captured and wrote 8507.985 MB perf.data (1055663 samples) ]
real    0m8.791s
user    0m0.678s
sys     0m8.120s



If you want, I can give you ssh access.


Re: [HACKERS] Remove Windows crash dump support?

2015-12-25 Thread Michael Paquier
On Thu, Dec 24, 2015 at 5:57 PM, Dave Page  wrote:
> On Thu, Dec 24, 2015 at 2:14 AM, Craig Ringer  wrote:
>> On 22 December 2015 at 23:48, Alex Ignatov  wrote:
>>
>>>
>>> I think that you can debug crash dump since windbg exists.
>>
>>
>> Nobody in their right mind uses windbg though. Visual Studio is really where
>> it's at and the Express versions make it much more practical.

Well, FWIW, I have been working lately on a bug hidden in a custom
background worker on Windows that crashed in some weird way with a
0x00C5, and while I have a reproducible test case I have not yet
found the time to look at it, but I would think that this is
generating a core dump, and that I will need to use windbg for that.
Hence count me on the -1 team.

>> You can't even install Debugging Tools for Windows and Windbg standalone
>> anymore.
>>
>>>
>>> Also I think that the number of Postgres installations on Windows is so
>>> tiny because people even today think that it is not as solid as the unix
>>> version; that's why you think that nobody uses your code ;).
>>
>>
>> I disagree. Windows Pg users are often at bigger companies and don't talk
>> about PostgreSQL as much. Often for fear of reprisals from other database
>> vendors they have ongoing relationships with.  At least that's been my
>> experience and I'm sure EDB folks will concur.
>
> In my experience PG isn't used much in production on Windows in bigger
> companies.

Well, there is indeed not much, but it is used. At least I work for
one and Postgres is embedded in a couple of products on Windows.

> It's used a *lot* (and is quite probably the most
> frequently downloaded build from EDB or postgresql.org) as an embedded
> database in some applications, and for development/test. There are a
> huge number of Windows PostgreSQL users out there.

And they don't those ones.
-- 
Michael




Re: [HACKERS] Performance degradation in commit ac1d794

2015-12-25 Thread Васильев Дмитрий
2015-12-25 20:27 GMT+03:00 Васильев Дмитрий :

>
> 2015-12-25 20:18 GMT+03:00 Andres Freund :
>
>> On December 25, 2015 6:08:15 PM GMT+01:00, "Васильев Дмитрий" <
>> d.vasil...@postgrespro.ru> wrote:
>> >Hello hackers!
>> >
>> >I suddenly found commit ac1d794 gives up to 3 times performance
>> >degradation.
>> >
>> >I tried to run pgbench -s 1000 -j 48 -c 48 -S -M prepared on 70
>> >CPU-core
>> >machine:
>> >commit ac1d794 gives me 363,474 tps
>> >and previous commit a05dc4d gives me 956,146
>> >and master( 3d0c50f ) with revert ac1d794 gives me 969,265
>> >
>> >it's shocking
>>
>>
>>
>> You're talking about
>> http://git.postgresql.org/gitweb/?p=postgresql.git;a=blobdiff;f=src/backend/libpq/be-secure.c;h=2ddcf428f89fd12c230d6f417c2f707fbd97bf39;hp=26d8faaf773a818b388b899b8d83d617bdf7af9b;hb=ac1d794;hpb=a05dc4d7fd57d4ae084c1f0801973e5c1a1aa26e
>>
>> If so, could you provide a hierarchical before/after profile?
>>
>> Andres
>> Hi,
>>
>> ---
>> Please excuse brevity and formatting - I am writing this on my mobile
>> phone.
>>
>
> ​> ​You're talking about
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=blobdiff;f=src/backend/libpq/be-secure.c;h=2ddcf428f89fd12c230d6f417c2f707fbd97bf39;hp=26d8faaf773a818b388b899b8d83d617bdf7af9b;hb=ac1d794;hpb=a05dc4d7fd57d4ae084c1f0801973e5c1a1aa26e
>
> Yes, about this.
>
> >​If so, could you provide a hierarchical before/after profile?
>
> Performance | Git hash commit                          | Date
> ~ 360k tps  | c3e7c24a1d60dc6ad56e2a0723399f1570c54224 | Thu Nov 12 09:12:18 2015 -0500
> ~ 360k tps  | ac1d7945f866b1928c2554c0f80fd52d7f92     | Thu Nov 12 09:00:33 2015 -0500
> ~ 960k tps  | a05dc4d7fd57d4ae084c1f0801973e5c1a1aa26e | Thu Nov 12 07:40:31 2015 -0500
>
>
>
> ---
> Dmitry Vasilyev
> Postgres Professional: http://www.postgrespro.com
> Russian Postgres Company​
>
>
I came across it while studying these results:

vanilla 9.5 = 30020c3fc3b6de5592978313df929d370f5770ce
vanilla 9.6 = c4a8812cf64b142685e39a69694c5276601f40e4

 info  | clients |   tps
---+-+-
 vanilla 9.5   |   1 |   30321
 vanilla 9.5   |   8 |  216542
 vanilla 9.5   |  16 |  412526
 vanilla 9.5   |  32 |  751331
 vanilla 9.5   |  48 |  956146
  <- this point
 vanilla 9.5   |  56 |  990122
 vanilla 9.5   |  64 |  842436
 vanilla 9.5   |  72 |  913272
 vanilla 9.5   |  82 |  659332
 vanilla 9.5   |  92 |  630111
 vanilla 9.5   |  96 |  616863
 vanilla 9.5   | 110 |  592080
 vanilla 9.5   | 120 |  575831
 vanilla 9.5   | 130 |  557521
 vanilla 9.5   | 140 |  537951
 vanilla 9.5   | 150 |  517019
 vanilla 9.5   | 160 |  502312
 vanilla 9.5   | 170 |  489162
 vanilla 9.5   | 180 |  477178
 vanilla 9.5   | 190 |  464620
 vanilla 9.6   |   1 |   31738
 vanilla 9.6   |   8 |  219692
 vanilla 9.6   |  16 |  422933
 vanilla 9.6   |  32 |  375546
 vanilla 9.6   |  48 |  363474
  <- this point
 vanilla 9.6   |  56 |  352943
 vanilla 9.6   |  64 |  334498
 vanilla 9.6   |  72 |  369802
 vanilla 9.6   |  82 |  604867
 vanilla 9.6   |  92 |  871048
 vanilla 9.6   |  96 |  969265
 vanilla 9.6   | 105 |  996794
 vanilla 9.6   | 110 |  932853
 vanilla 9.6   | 115 |  758485
 vanilla 9.6   | 120 |  721365
 vanilla 9.6   | 125 |  632265
 vanilla 9.6   | 130 |  624666
 vanilla 9.6   | 135 |  582120
 vanilla 9.6   | 140 |  583080
 vanilla 9.6   | 150 |  555608
 vanilla 9.6   | 160 |  533340
 vanilla 9.6   | 170 |  520308
 vanilla 9.6   | 180 |  504536
 vanilla 9.6   | 190 |  496967​


---
Dmitry Vasilyev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company​​


Re: [HACKERS] Performance degradation in commit ac1d794

2015-12-25 Thread Andres Freund
On December 25, 2015 7:10:23 PM GMT+01:00, Tom Lane  wrote:
>Васильев Дмитрий writes:
>>  Samples: 1M of event 'cycles', Event count (approx.): 816922259995, UID: pgpro
>>  Overhead  Shared Object  Symbol
>>    69,72%  [kernel]       [k] _raw_spin_lock_irqsave
>>     1,43%  postgres       [.] _bt_compare
>>     1,19%  postgres       [.] LWLockAcquire
>>     0,99%  postgres       [.] hash_search_with_hash_value
>>     0,61%  postgres       [.] PinBuffer
>
>Seems like what you've got here is a kernel bug.

I wouldn't go as far as calling it a kernel bug. We're still doing 300k tps. And
we're triggering the performance degradation by adding another socket (IIRC) to
the poll(2) call.

It would certainly be interesting to see the expanded tree below the spinlock. I
wonder if this is related to directed wakeups.

Andres

--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.




Re: [HACKERS] Performance degradation in commit ac1d794

2015-12-25 Thread Andres Freund
On December 25, 2015 6:08:15 PM GMT+01:00, "Васильев Дмитрий" 
 wrote:
>Hello hackers!
>
>I suddenly found commit ac1d794 gives up to 3 times performance
>degradation.
>
>I tried to run pgbench -s 1000 -j 48 -c 48 -S -M prepared on 70
>CPU-core
>machine:
>commit ac1d794 gives me 363,474 tps
>and previous commit a05dc4d gives me 956,146
>and master( 3d0c50f ) with revert ac1d794 gives me 969,265
>
>it's shocking



You're talking about 
http://git.postgresql.org/gitweb/?p=postgresql.git;a=blobdiff;f=src/backend/libpq/be-secure.c;h=2ddcf428f89fd12c230d6f417c2f707fbd97bf39;hp=26d8faaf773a818b388b899b8d83d617bdf7af9b;hb=ac1d794;hpb=a05dc4d7fd57d4ae084c1f0801973e5c1a1aa26e

If so, could you provide a hierarchical before/after profile?

Andres
Hi,

--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.




[HACKERS] 9.5rc1 brin_summarize_new_values

2015-12-25 Thread Jeff Janes
brin_summarize_new_values() does not check that it is called on the
correct type of index, nor does it check permissions.




Re: [HACKERS] Commit fest status for 2015-11

2015-12-25 Thread David Fetter
On Thu, Dec 24, 2015 at 09:45:40PM +0900, Michael Paquier wrote:
> On Thu, Dec 24, 2015 at 12:09 PM, Michael Paquier
>  wrote:
> > On Tue, Dec 22, 2015 at 10:45 PM, Michael Paquier
> >  wrote:
> >> On Tue, Dec 22, 2015 at 4:04 PM, Michael Paquier
> >>  wrote:
> >>> OK, if those don't get addressed soon, they will be moved to the next
> >>> CF with the same status.
> >>
> >> After a first pass, numbers are getting down. I am just too tired to
> >> continue:
> >> Needs review: 13.
> >> Waiting on Author: 15.
> >> Ready for Committer: 5
> >>
> >> Two additional patches have been marked as ready for committer:
> >> 1) Bug fix for pg_dump --jobs when password is passed as a parameter in
> >> a connection string:
> >> https://commitfest.postgresql.org/7/405/
> >> 2) Default roles, in Stephen's plate:
> >> https://commitfest.postgresql.org/7/49/
> >
> > I had a look at all the remaining entries and treated each one of
> > them, patches marked as ready for committer have been moved to next CF
> > with the same status. There are currently 3 remaining entries in the
> > CF app where it would be cool authors and/or reviewers give a status
> > of the situation. Once this is addressed the CF of 2015-11 will be
> > closed.
> >
> > Note particularly to authors: if your patch has been returned with
> > feedback, but you are still working on it, please make sure that it is
> > moved to next CF soon. Or, if your patch has been moved to next CF,
> > *but* you are not working on it, it would be nice to mark it as
> > returned with feedback in 2016-01.
> 
> And the CF is now closed. Here is the final score:
> Committed: 30.
> Moved to next CF: 42.
> Rejected: 9.
> Returned with Feedback: 22.
> Total: 103.
> Regards,

Thanks very much for doing the thankless!

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] Performance degradation in commit ac1d794

2015-12-25 Thread Васильев Дмитрий
2015-12-25 21:28 GMT+03:00 Tom Lane :

> Andres Freund  writes:
> > On December 25, 2015 7:10:23 PM GMT+01:00, Tom Lane 
> wrote:
> >> Seems like what you've got here is a kernel bug.
>
> > I wouldn't go as far as calling it a kernel bug. Were still doing 300k
> tps. And were triggering the performance degradation by adding another
> socket (IIRC) to the poll(2) call.
>
> Hmm.  And all those FDs point to the same pipe.  I wonder if we're looking
> at contention for some pipe-related data structure inside the kernel.
>
> regards, tom lane
>

I ran bt on the backends and found them in the following state:

#0  0x7f77b0e5bb60 in __poll_nocancel () from /lib64/libc.so.6
#1  0x006a7cd0 in WaitLatchOrSocket (latch=0x7f779e2e96c4,
wakeEvents=wakeEvents@entry=19, sock=9, timeout=timeout@entry=0) at
pg_latch.c:333
#2  0x00612c7d in secure_read (port=0x17e6af0, ptr=0xcc94a0
, len=8192) at be-secure.c:147
#3  0x0061be36 in pq_recvbuf () at pqcomm.c:915
#4  pq_getbyte () at pqcomm.c:958
#5  0x00728ad5 in SocketBackend (inBuf=0x7ffd8b6b1460) at
postgres.c:345
​
Perf shows _raw_spin_lock_irqsave, remove_wait_queue and add_wait_queue.
There are screenshots: http://i.imgur.com/pux2bGJ.png
http://i.imgur.com/LJQbm2V.png


​​---
Dmitry Vasilyev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company​


Re: [HACKERS] Performance degradation in commit ac1d794

2015-12-25 Thread Tom Lane
Andres Freund  writes:
> There's a couple solutions I can think of to that problem:
> 1) Use epoll()/kqueue, or other similar interfaces that don't require
>re-registering fds at every invocation. My guess is that that'd be
>desirable for performance anyway.

Portability, on the other hand, would be problematic.

> 2) Create a pair of fds between postmaster/backend for each
>backend. While obviously increasing the number of FDs noticeably,
>it's interesting for other features as well: If we ever want to do FD
>passing from postmaster to existing backends, we're going to need
>that anyway.

Maybe; it'd provide another limit on how many backends we could run.

> 3) Replace the postmaster_alive_fds socketpair by some other signalling
>mechanism. E.g. sending a procsignal to each backend, which sets the
>latch and a special flag in the latch structure.

And what would send the signal?  The entire point here is to notice the
situation where the postmaster has crashed.  It can *not* depend on the
postmaster taking some action.

regards, tom lane




Re: [HACKERS] Performance degradation in commit ac1d794

2015-12-25 Thread Васильев Дмитрий
2015-12-25 22:42 GMT+03:00 Васильев Дмитрий :

>
>
> 2015-12-25 21:28 GMT+03:00 Tom Lane :
>
>> Andres Freund  writes:
>> > On December 25, 2015 7:10:23 PM GMT+01:00, Tom Lane 
>> wrote:
>> >> Seems like what you've got here is a kernel bug.
>>
>> > I wouldn't go as far as calling it a kernel bug. Were still doing 300k
>> tps. And were triggering the performance degradation by adding another
>> socket (IIRC) to the poll(2) call.
>>
>> Hmm.  And all those FDs point to the same pipe.  I wonder if we're looking
>> at contention for some pipe-related data structure inside the kernel.
>>
>> regards, tom lane
>>
>
> ​I did bt on backends and found it in following state:
>
> #0  0x7f77b0e5bb60 in __poll_nocancel () from /lib64/libc.so.6
> #1  0x006a7cd0 in WaitLatchOrSocket (latch=0x7f779e2e96c4,
> wakeEvents=wakeEvents@entry=19, sock=9, timeout=timeout@entry=0) at
> pg_latch.c:333
> #2  0x00612c7d in secure_read (port=0x17e6af0, ptr=0xcc94a0
> , len=8192) at be-secure.c:147
> #3  0x0061be36 in pq_recvbuf () at pqcomm.c:915
> #4  pq_getbyte () at pqcomm.c:958
> #5  0x00728ad5 in SocketBackend (inBuf=0x7ffd8b6b1460) at
> postgres.c:345
> ​
> Perf shows _raw_spin_lock_irqsave call remove_wait_queue add_wait_queue
> There’s screenshots: http://i.imgur.com/pux2bGJ.png
> http://i.imgur.com/LJQbm2V.png​
>
>
> ​​---
> Dmitry Vasilyev
> Postgres Professional: http://www.postgrespro.com
> Russian Postgres Company​
>
>
I'm sorry, I meant that the remove_wait_queue and add_wait_queue functions
call _raw_spin_lock_irqsave, which is what holds most of the processor time.

uname -a: 3.10.0-229.20.1.el7.x86_64 #1 SMP Tue Nov 3 19:10:07 UTC 2015
x86_64 x86_64 x86_64 GNU/Linux


​​---
Dmitry Vasilyev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company​


Re: [HACKERS] Performance degradation in commit ac1d794

2015-12-25 Thread Andres Freund
On 2015-12-25 13:28:55 -0500, Tom Lane wrote:
> Hmm.  And all those FDs point to the same pipe.  I wonder if we're looking
> at contention for some pipe-related data structure inside the kernel.

Sounds fairly likely - and not too surprising. In this scenario we have a
couple hundred thousand registrations/unregistrations of a pretty
fundamentally shared resource (the wait queue for changes to the pipe). Not
that surprising that it becomes a problem.

There's a couple solutions I can think of to that problem:
1) Use epoll()/kqueue, or other similar interfaces that don't require
   re-registering fds at every invocation. My guess is that that'd be
   desirable for performance anyway.

2) Create a pair of fds between postmaster/backend for each
   backend. While obviously increasing the number of FDs noticeably,
   it's interesting for other features as well: If we ever want to do FD
   passing from postmaster to existing backends, we're going to need
   that anyway.

3) Replace the postmaster_alive_fds socketpair by some other signalling
   mechanism. E.g. sending a procsignal to each backend, which sets the
   latch and a special flag in the latch structure.

Andres




Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2015-12-25 Thread Pavel Stehule
2015-12-23 21:36 GMT+01:00 Daniel Verite :

>Hi,
>
> Here's an updated patch that replaces sorted arrays by AVL binary trees
> when gathering distinct values for the columns involved in the pivot.
> The change is essential for large resultsets. For instance,
> it allows processing a query like this (10 million rows x 10 columns):
>
> select x,(random()*10)::int, (random()*1000)::int from
>   generate_series(1,1000) as x
>  \crosstabview
>
> which takes about 30 seconds to run and display on my machine with the
> attached patch. That puts it seemingly in the same ballpark as
> the equivalent test with the server-side crosstab().
>
> With the previous iterations of the patch, this test would never end,
> even with much smaller sets, as the execution time of the 1st step
> grew exponentially with the number of distinct keys.
> The exponential effect starts to be felt at about 10k values on my low-end
> CPU,
> and from there quickly becomes problematic.
>
> As a client-side display feature, processing millions of rows like in
> the query above does not necessarily make sense, it's pushing the
> envelope, but stalling way below 100k rows felt lame, so I'm happy to get
> rid of that limitation.
>
> However, there is another one. The above example does not need or request
> an additional sort step, but if it did, sorting more than 65535 entries in
> the vertical header would error out, because values are shipped as
> parameters to PQexecParams(), which only accepts that many.
> To avoid the problem, when the rows in the output "grid" exceed 2^16 and
> they need to be sorted, the user must let the sort be driven by ORDER BY
> beforehand in the query, knowing that the pivot will keep the original
> ordering intact in the vertical header.
>
> I'm still thinking about extending this based on Pavel's diff for the
> "label" column, so that
>  \crosstabview [+|-]colV[:colSortV] [+|-]colH[:colSortH]
> would mean to use colV/H as grid headers but sort them according
> to colSortV/H.
> I prefer that syntax over adding more parameters, and also I'd like
> to have it work in both V and H directions.
>

This syntax is good - simple, readable

Pavel

>
> Aside from the AVL trees, there are a few other minor changes in that
> patch:
> - move non-exportable structs from the .h to the .c
> - move code in common.c to respect alphabetical ordering
> - if vertical sort is requested, add explicit check against more than 65535
>   params instead of letting the sort query fail
> - report all failure cases of the sort query
> - rename sortColumns to serverSort, and use the term "columns" less in
>   comments and variables.
>
>
> Best regards,
> --
> Daniel Vérité
> PostgreSQL-powered mailer: http://www.manitou-mail.org
> Twitter: @DanielVerite
>
>
>
>