Re: [HACKERS] Combining Aggregates

2016-03-20 Thread Haribabu Kommi
On Sun, Mar 20, 2016 at 2:23 PM, David Rowley
 wrote:
>
> I've had a look over this. I had to first base it on the 0005 patch,
> as it seemed like the pg_aggregate.h changes didn't include the
> serialfn and deserialfn changes, and an OID was being consumed by
> another function I added in patch 0003.
>
> On testing I also noticed some wrong results, which on investigation,
> are due to the wrong array elements being added together.
>
> For example:
>
> postgres=# select stddev(num) from f;
>       stddev
> ------------------
>  28867.5149028984
> (1 row)
>
>
> postgres=# set max_parallel_degree=8;
> SET
> postgres=# select stddev(num) from f;
>  stddev
> --------
>       0
> (1 row)
>
> + N += transvalues2[0];
> + sumX += transvalues2[1];
> + CHECKFLOATVAL(sumX, isinf(transvalues1[1]) || isinf(transvalues2[1]), true);
> + sumX2 += transvalues1[2];
>
> The last line should use transvalues2[2], not transvalues1[2].

Thanks.

> There's also quite a few mistakes in float8_regr_combine()
>
> + sumX2 += transvalues2[2];
> + CHECKFLOATVAL(sumX2, isinf(transvalues1[2]) || isinf(transvalues2[1]), true);
>
> Wrong array element on isinf() check
>
> + sumY2 += transvalues2[4];
> + CHECKFLOATVAL(sumY2, isinf(transvalues1[4]) || isinf(transvalues2[3]), true);
>
> Wrong array element on isinf() check
>
> + sumXY += transvalues2[5];
> + CHECKFLOATVAL(sumXY, isinf(transvalues1[5]) || isinf(transvalues2[1]) ||
> +  isinf(transvalues2[3]), true);
>
> Wrong array element on isinf() check, and the final
> isinf(transvalues2[3]) check does not need to be there.

Thanks for the changes. I just followed the float8_regr_accum function while
writing the float8_regr_combine function. Now I understand the proper usage
of isinf().
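
For anyone following along, here is a minimal standalone sketch (plain C,
not the backend code) of merging two float8_accum-style transition states
{N, sum(X), sum(X*X)}; it shows why every merged term has to come from the
second state:

    #include <math.h>
    #include <stdio.h>

    /* merge transition state t2 into t1; layout is {N, sumX, sumX2} */
    static void
    combine_states(double *t1, const double *t2)
    {
        t1[0] += t2[0];     /* N */
        t1[1] += t2[1];     /* sum(X) */
        t1[2] += t2[2];     /* sum(X*X) -- the term that was read from
                             * transvalues1 in the buggy version */
    }

    int
    main(void)
    {
        double  a[3] = {3, 6.0, 14.0};  /* state for {1,2,3} */
        double  b[3] = {2, 9.0, 41.0};  /* state for {4,5} */

        combine_states(a, b);
        /* sample stddev from the merged state:
         * sqrt((N*sumX2 - sumX^2) / (N*(N-1))) */
        printf("stddev = %f\n",
               sqrt((a[0] * a[2] - a[1] * a[1]) / (a[0] * (a[0] - 1))));
        return 0;
    }

Compiled with "cc combine.c -lm", this prints stddev = 1.581139, the sample
standard deviation of {1,2,3,4,5}.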


Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2016-03-20 Thread Haribabu Kommi
On Fri, Mar 18, 2016 at 7:46 PM, Shulgin, Oleksandr
<oleksandr.shul...@zalando.de> wrote:
> On Fri, Mar 18, 2016 at 7:53 AM, Haribabu Kommi <kommi.harib...@gmail.com>
> wrote:
>>
>> On Thu, Mar 17, 2016 at 6:56 PM, Shulgin, Oleksandr
>> <oleksandr.shul...@zalando.de> wrote:
>> >
>> > You mean change context name and correct the comment?  I didn't suggest
>> > to
>> > change the function name.
>>
>> Changed the context name and the comment only.
>
> Check.
>
> +} lookup_hba_line_context;
> ^ but why TAB here?

Corrected. I am not sure why pgindent is adding a tab here.

>> >> > Still remains an issue of representing special keywords in database
>> >> > and
>> >> > user_name fields, but there was no consensus about that though.
>> >>
>> >> How about adding keyword_database and keyword_user columns to listing
>> >> out the keywords.  These columns will be populated only when the hba
>> >> line
>> >> contains any keywords.
>> >
>> >
>> > Hm... that could work too.
>>
>> Here I attached patch with the added two keyword columns.
>
> + if (!tok->quoted && strcmp(tok->string, "all") == 0)
>
> token_is_keyword(tok, "all") ?

Updated.

>> During the testing with different IP comparison methods like 'samehost' or
>> 'samenet', the address details are not displayed. Is there any need of
>> showing the IP compare method also?
>
> Do you mean return "samehost/samenet/all" in the yet another keyword_address
> out parameter or something like that?

Yes. Currently, if the user specifies the samehost/samenet/all keywords
instead of an IP address or hostname, they are not displayed. I feel adding
another keyword_address column is worthwhile.

Updated patch is attached.

Regards,
Hari Babu
Fujitsu Australia


pg-hba-lookup-21-03-2016.patch
Description: Binary data



Re: [HACKERS] Parallel Aggregate

2016-03-19 Thread Haribabu Kommi
On Thu, Mar 17, 2016 at 2:13 PM, James Sewell  wrote:
>
> Hi again,
>
> This is probably me missing something, but is there a reason parallel 
> aggregate doesn't seem to ever create append nodes containing Index scans?
>
> SET random_page_cost TO 0.2;
> SET max_parallel_degree TO 8;
>
> postgres=# explain SELECT sum(count_i) FROM base GROUP BY view_time_day;
>                                          QUERY PLAN
> ---------------------------------------------------------------------------------------------
>  Finalize GroupAggregate  (cost=310596.32..310598.03 rows=31 width=16)
>    Group Key: view_time_day
>    ->  Sort  (cost=310596.32..310596.79 rows=186 width=16)
>          Sort Key: view_time_day
>          ->  Gather  (cost=310589.00..310589.31 rows=186 width=16)
>                Number of Workers: 5
>                ->  Partial HashAggregate  (cost=310589.00..310589.31 rows=31 width=16)
>                      Group Key: view_time_day
>                      ->  Parallel Seq Scan on base  (cost=0.00..280589.00 rows=600 width=12)
>
>
> SET max_parallel_degree TO 0;
>
> postgres=# explain SELECT sum(count_i) FROM base GROUP BY view_time_day;
>                                               QUERY PLAN
> ------------------------------------------------------------------------------------------------------
>  GroupAggregate  (cost=0.56..600085.92 rows=31 width=16)
>    Group Key: view_time_day
>    ->  Index Only Scan using base_view_time_day_count_i_idx on base  (cost=0.56..450085.61 rows=3000 width=12)
> (3 rows)


To get a good parallelism benefit, the workers have to execute most of the
plan in parallel. If only some part of the upper plan runs in parallel, the
benefit may be small. At present only the seq scan node supports parallelism
at the scan node level; index scan is not parallel-aware as of now. Because
of this, the plan is chosen based on the overall cost of the parallel
aggregate + parallel seq scan.

If index scan is made parallel-aware in the future, it is possible that a
parallel aggregate + parallel index scan plan may be chosen.

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Combining Aggregates

2016-03-19 Thread Haribabu Kommi
On Wed, Mar 16, 2016 at 10:08 PM, David Rowley
<david.row...@2ndquadrant.com> wrote:
> On 16 March 2016 at 23:54, Haribabu Kommi <kommi.harib...@gmail.com> wrote:
>> On Wed, Mar 16, 2016 at 8:34 AM, David Rowley
>> <david.row...@2ndquadrant.com> wrote:
>>> Yes me too, so I spent several hours yesterday writing all of the
>>> combine functions and serialisation/deserialisation that are required
>>> for all of SUM(), AVG() STDDEV*(). I also noticed that I had missed
>>> using some existing functions for bool_and() and bool_or() so I added
>>> those to pg_aggregate.h. I'm just chasing down a crash bug on
>>> HAVE_INT128 enabled builds, so should be posting a patch quite soon. I
>>> didn't touch the FLOAT4 and FLOAT8 aggregates as I believe Haribabu
>>> has a patch for that over on the parallel aggregate thread. I've not
>>> looked at it in detail yet.
>>
>> The additional combine function patch that I posted handles all float4 and
>> float8 aggregates. There is an OID conflict with the latest source code,
>> I will update the patch and post it in that thread.
>
> Thanks! I just send a series of patches which add a whole host of
> serial/deserial functions, and a patch which adds some documentation.
> Maybe you could base your patch on the 0005 patch, and update the
> documents to remove the "All types apart from floating-point types"
> text and replace that with "Yes".

Here I attached an updated float aggregates patch based on the 0005 patch.

Regards,
Hari Babu
Fujitsu Australia


0006-float-aggregates-17-03-2016.patch
Description: Binary data



Re: [HACKERS] Combining Aggregates

2016-03-19 Thread Haribabu Kommi
On Thu, Mar 17, 2016 at 10:59 PM, Tomas Vondra
 wrote:
> Hi,
>
> On 03/17/2016 12:53 PM, David Rowley wrote:
>>
> ...
>>
>>
>> I just had a quick skim over the patch and noticed the naming
>> convention you're using for the combine function is *_pl, and you have
>> float8_pl. There's already a function named float8pl() which is quite
>> close to what you have. I've been sticking to *_combine() for these,
>> so maybe float8_combine() and float8_regr_combine() are better names.
>
>
> +1 to the _combine naming convention.

Thanks for the input. That makes sense; an updated patch is attached with
the changes.

Regards,
Hari Babu
Fujitsu Australia


0006-float-aggregates-18-03-2016.patch
Description: Binary data



Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2016-03-19 Thread Haribabu Kommi
On Wed, Mar 16, 2016 at 9:49 PM, Shulgin, Oleksandr
<oleksandr.shul...@zalando.de> wrote:
> On Tue, Mar 15, 2016 at 7:23 PM, David Steele <da...@pgmasters.net> wrote:
>>
>> On 3/3/16 12:16 AM, Haribabu Kommi wrote:
>> > On Fri, Feb 5, 2016 at 2:29 PM, Haribabu Kommi
>> > <kommi.harib...@gmail.com> wrote:
>> >>
>> >> This patch needs to be applied on top discard_hba_and_ident_cxt patch
>> >> that is posted earlier.
>> >
>> > Here I attached a re-based patch to the latest head with inclusion of
>> > discard_hba_ident_cxt patch for easier review as a single patch.
>>
>> Alex, Scott, do you have an idea of when you'll be able to review this
>> new version?
>
>
> The new version applies with some fuzziness to the current master and
> compiles cleanly.
>
> Some comments:
>
> +/* Context to use with hba_line_callback function. */
> +typedef struct
> +{
> +   MemoryContext memcxt;
> +   TupleDesc   tupdesc;
> +   Tuplestorestate *tuple_store;
> +}  HbalineContext;
>
> Rather "with *lookup_hba_line_callback*", as hba_line_callback() is a
> generic one.

Fine. I will change the function and context names.

> + line_number |  mode   | type  | database | user_name |  address  |                 netmask                 | hostname | method | options |          reason
> +-------------+---------+-------+----------+-----------+-----------+-----------------------------------------+----------+--------+---------+--------------------------
> +          84 | skipped | local | {all}    | {all}     |           |                                         |          | trust  | {}      | connection type mismatch
> +          86 | skipped | host  | {all}    | {all}     | 127.0.0.1 | 255.255.255.255                         |          | trust  | {}      | IP address mismatch
> +          88 | matched | host  | {all}    | {all}     | ::1       | ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff |          | trust  | {}      |
>
> Hm... now I'm not sure if we really need the "mode" column.  It should be
> clear that we skipped every line that had a non-NULL "reason".  I guess we
> could remove "mode" and rename "reason" to "skip_reason"?

OK. Let's hear from others on this as well.

> Still remains an issue of representing special keywords in database and
> user_name fields, but there was no consensus about that though.

How about adding keyword_database and keyword_user columns to list out the
keywords? These columns will be populated only when the hba line contains
any keywords.


Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2016-03-18 Thread Haribabu Kommi
On Thu, Mar 17, 2016 at 6:56 PM, Shulgin, Oleksandr
<oleksandr.shul...@zalando.de> wrote:
> On Thu, Mar 17, 2016 at 2:12 AM, Haribabu Kommi <kommi.harib...@gmail.com>
> wrote:
>>
>> On Wed, Mar 16, 2016 at 9:49 PM, Shulgin, Oleksandr
>> <oleksandr.shul...@zalando.de> wrote:
>> >
>> > Some comments:
>> >
>> > +/* Context to use with hba_line_callback function. */
>> > +typedef struct
>> > +{
>> > +   MemoryContext memcxt;
>> > +   TupleDesc   tupdesc;
>> > +   Tuplestorestate *tuple_store;
>> > +}  HbalineContext;
>> >
>> > Rather "with *lookup_hba_line_callback*", as hba_line_callback() is a
>> > generic one.
>>
>> Fine. I will change the function and context names.
>
>
> You mean change context name and correct the comment?  I didn't suggest to
> change the function name.

Changed the context name and the comment only.

>> > Still remains an issue of representing special keywords in database and
>> > user_name fields, but there was no consensus about that though.
>>
>> How about adding keyword_database and keyword_user columns to listing
>> out the keywords.  These columns will be populated only when the hba line
>> contains any keywords.
>
>
> Hm... that could work too.

Here I attached a patch with the two keyword columns added.
During testing with different IP comparison methods like 'samehost' or
'samenet', the address details are not displayed. Is there any need to show
the IP comparison method as well?

Regards,
Hari Babu
Fujitsu Australia


pg-hba-lookup-18-03-2016.patch
Description: Binary data



Re: [HACKERS] Combining Aggregates

2016-03-16 Thread Haribabu Kommi
On Wed, Mar 16, 2016 at 8:34 AM, David Rowley
 wrote:
> On 16 March 2016 at 06:39, Tomas Vondra  wrote:
>> After looking at the parallel aggregate patch, I also looked at this one, as
>> it's naturally related. Sadly I haven't found any issue that I could nag
>> about ;-) The patch seems well baked, as it was in the oven for quite a long
>> time already.
>
> Thanks for looking at this.
>
>> The one concern I do have is that it only adds (de)serialize functions for
>> SUM(numeric) and AVG(numeric). I think it's reasonable not to include that
>> into the patch, but it will look a bit silly if that's all that gets into
>> 9.6.
>
> Yes me too, so I spent several hours yesterday writing all of the
> combine functions and serialisation/deserialisation that are required
> for all of SUM(), AVG() STDDEV*(). I also noticed that I had missed
> using some existing functions for bool_and() and bool_or() so I added
> those to pg_aggregate.h. I'm just chasing down a crash bug on
> HAVE_INT128 enabled builds, so should be posting a patch quite soon. I
> didn't touch the FLOAT4 and FLOAT8 aggregates as I believe Haribabu
> has a patch for that over on the parallel aggregate thread. I've not
> looked at it in detail yet.

The additional combine function patch that I posted handles all the float4
and float8 aggregates. There is an OID conflict with the latest source code;
I will update the patch and post it in that thread.

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Parallel Aggregate

2016-03-13 Thread Haribabu Kommi
On Mon, Mar 14, 2016 at 8:44 AM, David Rowley
 wrote:
> On 12 March 2016 at 16:31, David Rowley  wrote:
>> I've attached an updated patch which is based on commit 7087166,
>> things are really changing fast in the grouping path area at the
>> moment, but hopefully the dust is starting to settle now.
>
> The attached patch fixes a harmless compiler warning about a possible
> uninitialised variable.

The setrefs.c fix for updating the finalize-aggregate target list is nice.
I tested all the float aggregates and they are working fine.

Overall the patch is fine. I will do some tests and provide an update later.

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] pam auth - add rhost item

2016-03-12 Thread Haribabu Kommi
On Sun, Mar 13, 2016 at 8:07 AM, Grzegorz Sampolski  wrote:
> Hi.
> Thank you for improving the documentation, and yes, I'm fine with these changes.

Thanks. Changed the patch status to ready for committer.

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] pam auth - add rhost item

2016-03-12 Thread Haribabu Kommi
On Fri, Mar 11, 2016 at 12:11 AM, Grzegorz Sampolski  wrote:
> Hi.
> Attached is a new patch with updated documentation and a small change to
> the coding style, as you suggested.


Thanks for the update. Here I attached an updated patch with additional
documentation changes. If you are fine with the changes, I will mark the
patch as ready for committer.


Regards,
Hari Babu
Fujitsu Australia


pam_auth_updated.patch
Description: Binary data



[HACKERS] Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Windows service is not starting so there’s message in log: FATAL: "could not create shared memory segment “Global/PostgreSQL.851401618”: Permission

2016-03-11 Thread Haribabu Kommi
On Fri, Mar 11, 2016 at 11:15 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Fri, Mar 11, 2016 at 5:21 PM, Haribabu Kommi <kommi.harib...@gmail.com>
> wrote:
>>
>> On Fri, Mar 11, 2016 at 12:00 AM, Amit Kapila <amit.kapil...@gmail.com>
>> wrote:
>> >
>> > Okay, so one probable theory for such an error could be that when there
>> > is
>> > already an object with same name exists, this API requests access to the
>> > that existing object and found that it can't access it due to some
>> > reason.
>> > On googling, I found some people suggesting to try by disabling UAC [1]
>> > on
>> > your m/c, can you once try that to see what is the result (this
>> > experiment
>> > is just to find out the actual reason of failure, rather than a
>> > permanent
>> > change suggestion).
>>
>> Thanks for the details. Currently I am unable to change the UAC settings
>> in my
>> laptop. I will try to do it in a different system and let you know the
>> result later.
>>
>>
>>
>> >> I am not able to find the reason for this error. This error is
>> >> occurring
>> >> only
>> >> when the PostgreSQL is started as a service only.
>> >>
>> >
>> > Did you use pg_ctl register/unregister to register different services.
>> > Can
>> > you share the detail steps and OS version on which you saw this
>> > behaviour?
>>
>> Operating system - windows 7
>> Binary - PostgreSQL 9.5 (This doesn't matter, 9.4+ can produce the
>> problem)
>>
>> 1. Create two standard users in the system (test_user1 and test_user2)
>
> I think one possibility is that one user is not able to access the object
> created by another user, if possible can you as well try with just one user
> (Have same user for both the services).

Yes, it works when both services run as the same user. The main problem is
that running PostgreSQL as a service for two different users on the same
system does not work, because the same random number gets generated for the
two services.

Regards,
Hari Babu
Fujitsu Australia




[HACKERS] Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Windows service is not starting so there’s message in log: FATAL: "could not create shared memory segment “Global/PostgreSQL.851401618”: Permission

2016-03-11 Thread Haribabu Kommi
On Fri, Mar 11, 2016 at 12:00 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Wed, Mar 9, 2016 at 5:46 PM, Haribabu Kommi <kommi.harib...@gmail.com>
> wrote:
>> On Wed, Mar 9, 2016 at 10:06 PM, Amit Kapila <amit.kapil...@gmail.com>
>> wrote:
>> > On Wed, Mar 9, 2016 at 11:46 AM, Haribabu Kommi
>> > <kommi.harib...@gmail.com>
>> > wrote:
>> >>
>> >>
>> >> I tried replacing the random() with PostmasterRandom() for a test and it
>> >> worked.
>> >> This is generating different random values, so the issue is not
>> >> occurring.
>> >>
>> >> "Global/PostgreSQL.2115609797"
>> >>
>> >> I feel we should add the data directory path + the random number to
>> >> generate the name for dynamic shared memory; this can fix the problem.
>> >>
>> >
>> > As mentioned above, I think if we can investigate why this error is
>> > generated, that will be helpful.  Currently the code ensures that if the
>> > segment already exists, it should retry to create a segment with other
>> > name
>> > (refer dsm_impl_windows()), so the point of investigation is, why it is
>> > not
>> > going via that path?  I am guessing due to some reason
>> > CreateFileMapping()
>> > is returning NULL in this case whereas ideally it should return the
>> > existing
>> > handle with an error ERROR_ALREADY_EXISTS.
>>
>> DEBUG:  mapped win32 error code 5 to 13
>>
>> Yes, the CreateFileMapping() is returning NULL with an error of
>> ERROR_ACCESS_DENIED.
>>
>
> Okay, so one probable theory for such an error could be that when there is
> already an object with same name exists, this API requests access to the
> that existing object and found that it can't access it due to some reason.
> On googling, I found some people suggesting to try by disabling UAC [1] on
> your m/c, can you once try that to see what is the result (this experiment
> is just to find out the actual reason of failure, rather than a permanent
> change suggestion).

Thanks for the details. Currently I am unable to change the UAC settings on
my laptop. I will try it on a different system and let you know the result
later.


>> I am not able to find the reason for this error. This error is occurring
>> only
>> when the PostgreSQL is started as a service only.
>>
>
> Did you use pg_ctl register/unregister to register different services.  Can
> you share the detail steps and OS version on which you saw this behaviour?

Operating system - Windows 7
Binary - PostgreSQL 9.5 (the version doesn't matter; 9.4+ can reproduce the problem)

1. Create two standard users in the system (test_user1 and test_user2).
2. Create two database clusters, one belonging to each user listed above.
3. Now register the services for the two users using pg_ctl (see the example
after this list).
4. Grant these users permission to log on as a service by changing the
service properties.
5. Now try to start the services; the second service fails with the error
message.
6. Error details can be found in the Event Log viewer.
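
For example, step 3 can be done like this (hypothetical service names,
passwords and data directories):

    pg_ctl register -N pgsql_user1 -U test_user1 -P <password> -D C:\data1
    pg_ctl register -N pgsql_user2 -U test_user2 -P <password> -D C:\data2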

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Parallel Aggregate

2016-03-09 Thread Haribabu Kommi
On Mon, Mar 7, 2016 at 4:39 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Haribabu Kommi <kommi.harib...@gmail.com> writes:
>> 2. Temporary fix for float aggregate types in _equalAggref because of
>> a change in aggtype to trans type, otherwise the parallel aggregation
>> plan failure in set_plan_references. whenever the aggtype is not matched,
>> it verifies with the trans type also.
>
> That is a completely unacceptable kluge.  Quite aside from being ugly as
> sin, it probably breaks more things than it fixes, first because it breaks
> the fundamental semantics of equal() across the board, and second because
> it puts catalog lookups into equal(), which *will* cause problems.  You
> can not expect that this will get committed, not even as a "temporary fix".

I am not able to find a better solution to this problem, so I will describe
the problem and why I made the change; if you can give me a pointer on where
to look, that would be helpful.

In parallel aggregate, the aggregate operation is divided into two steps,
finalize and partial aggregate. The partial aggregate is executed in the
worker and returns the transition data, which is of type aggtranstype. This
can work even if we don't change the targetlist Aggref return type from
aggtype to aggtranstype for aggregates whose aggtype is a variable-length
data type: the output slot is generated with a variable-length type, so
sending the aggtranstype data, which is also variable length, still works.

But when it comes to the float aggregates, the aggtype is fixed length while
the aggtranstype is a variable-length data type. So if we try to change the
aggtype of an Aggref to aggtranstype in set_plan_references, only the partial
aggregate targetlist gets changed, because set_plan_references works from the
top of the plan downward.

To avoid this problem, I changed the targetlist type during partial aggregate
path generation itself, and that leads to the failure in _equalAggref during
set_plan_references. That is why I put in the temporary fix.

Do you have any pointers for handling this problem?

Regards,
Hari Babu
Fujitsu Australia




[HACKERS] Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Windows service is not starting so there’s message in log: FATAL: "could not create shared memory segment “Global/PostgreSQL.851401618”: Permission

2016-03-09 Thread Haribabu Kommi
On Thu, Mar 10, 2016 at 5:30 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Wed, Mar 9, 2016 at 7:16 AM, Haribabu Kommi <kommi.harib...@gmail.com> 
> wrote:
>> On Wed, Mar 9, 2016 at 10:06 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>>> On Wed, Mar 9, 2016 at 11:46 AM, Haribabu Kommi <kommi.harib...@gmail.com>
>>> wrote:
>>>>
>>>>
>>>> I tried replacing the random() with PostmasterRandom() for a test and it
>>>> worked.
>>>> This is generating different random values, so the issue is not occurring.
>>>>
>>>> "Global/PostgreSQL.2115609797"
>>>>
>>>> I feel we should add the data directory path + the random number to
>>>> generate the name for dynamic shared memory; this can fix the problem.
>>>>
>>>
>>> As mentioned above, I think if we can investigate why this error is
>>> generated, that will be helpful.  Currently the code ensures that if the
>>> segment already exists, it should retry to create a segment with other name
>>> (refer dsm_impl_windows()), so the point of investigation is, why it is not
>>> going via that path?  I am guessing due to some reason CreateFileMapping()
>>> is returning NULL in this case whereas ideally it should return the existing
>>> handle with an error ERROR_ALREADY_EXISTS.
>>
>> DEBUG:  mapped win32 error code 5 to 13
>>
>> Yes, the CreateFileMapping() is returning NULL with an error of
>> ERROR_ACCESS_DENIED.
>> I am not able to find the reason for this error. This error is occurring only
>> when the PostgreSQL is started as a service only.
>
> Another question is: why are both postmasters returning the same
> random number?  That's not very, uh, random.

The random number is generated by our own implementation of the random
function, which internally calls the pg_lrand48 function to get the random
value. That function generates the random number from the specified random
seed using a fixed calculation. Because of this, the same random number gets
generated every time.

On Linux, the random function comes from glibc. There, too, the same random
number is generated as the first value, but if that number is already in use
by some process, a different random number is generated for the next
PostgreSQL process.
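
To illustrate the point outside PostgreSQL (standalone C, not backend code):
with an identical seed, an lrand48-style generator returns an identical
first value, which is why two postmasters seeded alike pick the same segment
name on their first attempt.

    #include <stdio.h>
    #include <stdlib.h>

    int
    main(void)
    {
        srand48(12345);              /* both services effectively seed alike */
        printf("%ld\n", lrand48());  /* same first value on every run */
        return 0;
    }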

Regards,
Hari Babu
Fujitsu Australia




[HACKERS] Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Windows service is not starting so there’s message in log: FATAL: "could not create shared memory segment “Global/PostgreSQL.851401618”: Permission

2016-03-09 Thread Haribabu Kommi
On Wed, Mar 9, 2016 at 10:06 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Wed, Mar 9, 2016 at 11:46 AM, Haribabu Kommi <kommi.harib...@gmail.com>
> wrote:
>>
>>
>> I tried replacing the random() with PostmasterRandom() for a test and it
>> worked.
>> This is generating different random values, so the issue is not occurring.
>>
>> "Global/PostgreSQL.2115609797"
>>
>> I feel we should add the data directory path + the random number to
>> generate the name for dynamic shared memory; this can fix the problem.
>>
>
> As mentioned above, I think if we can investigate why this error is
> generated, that will be helpful.  Currently the code ensures that if the
> segment already exists, it should retry to create a segment with other name
> (refer dsm_impl_windows()), so the point of investigation is, why it is not
> going via that path?  I am guessing due to some reason CreateFileMapping()
> is returning NULL in this case whereas ideally it should return the existing
> handle with an error ERROR_ALREADY_EXISTS.

DEBUG:  mapped win32 error code 5 to 13

Yes, CreateFileMapping() is returning NULL with an error of
ERROR_ACCESS_DENIED. I am not able to find the reason for this error; it
occurs only when PostgreSQL is started as a service.

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] pam auth - add rhost item

2016-03-08 Thread Haribabu Kommi
On Tue, Mar 8, 2016 at 10:43 PM, Grzegorz Sampolski 
wrote:
> Hi Hari.
> To use pam modules you can use whatever backend authentication method
> you want.
>
> This is example configuration:
>
> Install this library https://github.com/pam-pgsql/pam-pgsql
> Create some example database, schema access and two tables,
> pam_auth and pam_account, with these example definitions:
>
> pam_account:
> db_user character varying(16) NOT NULL,
> host character varying(255) NOT NULL
>
> pam_auth:
> db_user character varying(16) NOT NULL,
> password character varying(512) NOT NULL
>
> Sample /etc/pam_pgsql.conf:
> connect = dbname= user= password=
> auth_query = SELECT password FROM access.pam_auth WHERE db_user = %u LIMIT 1
> acct_query = SELECT '0','0','' FROM access.pam_account WHERE db_user =
> %u AND (host = %h OR %h LIKE host) ORDER BY host DESC LIMIT 1;
> pw_type = crypt

Thanks for the details. I was able to test the host restriction based on the
host from which the connection request comes. This patch has the advantage of
giving the connected host address details to the PAM modules so they can
permit or restrict authentication.

A small change in the code: correct the following

+ if (retval) {

to

if (retval)
{

to match the code style used elsewhere.


> I will try to update the documentation in regard to these changes, but
> please take into account that my English isn't very fluent. So if I make
> some mistakes, please correct me.

I am also not a great English speaker :), but we can try to make it as good
as possible; the community can help correct it later if they find any
problems or improvements.

Regards,
Hari Babu
Fujitsu Australia


[HACKERS] Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Windows service is not starting so there’s message in log: FATAL: "could not create shared memory segment “Global/PostgreSQL.851401618”: Permission

2016-03-08 Thread Haribabu Kommi
On Sun, Oct 18, 2015 at 1:03 AM, Tom Lane  wrote:
> Amit Kapila  writes:
>> On Sat, Oct 17, 2015 at 12:07 AM, Robert Haas  wrote:
>>> Maybe we need to be using PostmasterRandom() rather than random() for
>>> the control segment name.
>
>> +1.  Though I think it is better to investigate the actual cause before
>> doing this.
>
> BackendRun() deliberately prevents that from working.  And it also sets
> srandom() to a new value for each subprocess, so that AFAICS this idea
> would be a net negative.  If you are seeing duplicate key values getting
> selected, the problem is elsewhere.

Coming back to an old thread: recently I hit a problem starting two
PostgreSQL services under a user that is not an administrator. The error
message is as follows.

FATAL:  could not create shared memory segment
"Global/PostgreSQL.851401618": Permission denied

The issue happens only with processes that are running as a service.
I observed that the name used when creating the dynamic shared memory
segment is the same for the two services, because of which the access-denied
error is thrown by the operating system, and that leads to the failure.

The main PG shared memory name always includes the data directory path, as
below, because of which it never collides between two services.

"Global/PostgreSQL:C:/work/FEP/installation/bin/data"

But the dynamic shared memory name is formed from a random number, and this
number gets generated the same for the two services, which leads to the
failure.

"Global/PostgreSQL.85140161"

I tried replacing random() with PostmasterRandom() for a test and it worked.
That generates different random values, so the issue does not occur.

"Global/PostgreSQL.2115609797"

I feel we should use the data directory path + the random number to generate
the name for dynamic shared memory; this can fix the problem.

Comments?
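
A minimal sketch of the proposed naming scheme (illustrative only; the
variable names are made up and this is not the actual dsm_impl.c code):

    #include <stdio.h>

    int
    main(void)
    {
        const char *data_dir = "C:/work/FEP/installation/bin/data";
        unsigned int handle = 851401618;    /* the random part */
        char        name[256];

        /* mixing the data directory in keeps the names distinct even when
         * both services draw the same random number */
        snprintf(name, sizeof(name), "Global/PostgreSQL.%s.%u",
                 data_dir, handle);
        printf("%s\n", name);
        return 0;
    }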

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] pam auth - add rhost item

2016-03-07 Thread Haribabu Kommi
On Tue, Dec 29, 2015 at 10:46 AM, Grzegorz Sampolski  wrote:
> Hi.
> I thought link on commitfest to github url was sufficient.
> Sorry. Attached new patch.

I reviewed and tested the patch. Adding the new RHOST item to the items
passed for PAM authentication doesn't have any impact on existing behavior.

As Tomas said upthread, RHOST is the item that I also think can be added to
PAM authentication.

I am not able to test PAM authentication using RHOST; can you please let me
know how to do so?

Also, the patch lacks documentation changes. As it adds the new pamusedns
option and also sends RHOST, the documentation needs to be updated.

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Parallel Aggregate

2016-03-06 Thread Haribabu Kommi
On Sun, Mar 6, 2016 at 10:21 PM, Haribabu Kommi
<kommi.harib...@gmail.com> wrote:
>
> Pending:
> 1. Explain plan needs to be corrected for parallel grouping similar like
> parallel aggregate.

Here I attached an updated patch with further changes:

1. Explain plan changes for parallel grouping.

2. A temporary fix for float aggregate types in _equalAggref, needed because
of the change from aggtype to trans type; otherwise the parallel aggregation
plan fails in set_plan_references. Whenever the aggtype does not match, it
also verifies against the trans type.

3. Generates a parallel path for all partial paths and adds it to the
path_list; based on the cheapest path, the plan is chosen.


To apply this patch, first apply the patch in [1]

[1] - http://www.postgresql.org/message-id/14172.1457228...@sss.pgh.pa.us


Regards,
Hari Babu
Fujitsu Australia


parallelagg_v2.patch
Description: Binary data



Re: [HACKERS] Parallel Aggregate

2016-03-06 Thread Haribabu Kommi
On Fri, Mar 4, 2016 at 3:00 PM, David Rowley
<david.row...@2ndquadrant.com> wrote:
> On 17 February 2016 at 17:50, Haribabu Kommi <kommi.harib...@gmail.com> wrote:
>> Here I attached a draft patch based on previous discussions. It still needs
>> better comments and optimization.
>
> Over in [1] Tom posted a large change to the grouping planner which
> causes large conflict with the parallel aggregation patch. I've been
> looking over Tom's patch and reading the related thread and I've
> observed 3 things:
>
> 1. Parallel Aggregate will be much easier to write, and less code, when
> based on top of Tom's upper planner changes. The latest patch does
> add a bit of cruft (e.g create_gather_plan_from_subplan()) which won't
> be required after Tom pushes the changes to the upper planner.
> 2. If we apply parallel aggregate before Tom's upper planner changes
> go in, then Tom needs to reinvent it again when rebasing his patch.
> This seems senseless, so this is why I did this work.
> 3. Based on the thread, most people are leaning towards getting Tom's
> changes in early to allow a bit more settle time before beta, and
> perhaps also to allow other patches to go in after (e.g this)
>
> So, I've done a bit of work and I've rewritten the parallel aggregate
> code to base it on top of Tom's patch posted in [1]. There's a few
> things that are left unsolved at this stage.
>
> 1. exprType() for Aggref still returns the aggtype, where it needs to
> return the trans type for partial agg nodes, this need to return the
> trans type rather than the aggtype. I had thought I might fix this by
> adding a proxy node type that sits in the targetlist until setrefs.c
> where it can be plucked out and replaced by the Aggref. I need to
> investigate this further.
> 2. There's an outstanding bug relating to HAVING clause not seeing the
> right state of aggregation and returning wrong results. I've not had
> much time to look into this yet, but I suspect its an existing bug
> that's already in master from my combine aggregate patch. I will
> investigate this on Sunday.
>

Thanks for updating the patch. Here I attached an updated patch with the
additional changes:

1. Parallel aggregation now works with expressions along with aggregate
functions.
2. Aggref returns the trans type instead of the agg type; this change adds
support for parallel aggregation of the float aggregates, but it still needs
a fix in the _equalAggref function.

Pending:
1. The explain plan needs to be corrected for parallel grouping, similar to
parallel aggregate.

To apply this patch, first apply the patch in [1]

[1] - http://www.postgresql.org/message-id/14172.1457228...@sss.pgh.pa.us

Regards,
Hari Babu
Fujitsu Australia


parallelagg_v1.patch
Description: Binary data



Re: [HACKERS] ExecGather() + nworkers

2016-03-04 Thread Haribabu Kommi
On Fri, Mar 4, 2016 at 10:33 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Fri, Mar 4, 2016 at 11:57 AM, Haribabu Kommi <kommi.harib...@gmail.com>
> wrote:
>>
>> On Wed, Jan 13, 2016 at 7:19 PM, Amit Kapila <amit.kapil...@gmail.com>
>> wrote:
>> >>
>> >
>> > Changed the code such that nworkers_launched gets used wherever
>> > appropriate instead of nworkers.  This includes places other than
>> > pointed out above.
>>
>> The changes of the patch are simple optimizations that are trivial.
>> I didn't find any problem regarding the changes. I think the same
>> optimization is required in "ExecParallelFinish" function also.
>>
>
> There is already one change as below for ExecParallelFinish() in patch.
>
> @@ -492,7 +492,7 @@ ExecParallelFinish(ParallelExecutorInfo *pei)
>
>   WaitForParallelWorkersToFinish(pei->pcxt);
>
>   /* Next, accumulate buffer usage. */
> - for (i = 0; i < pei->pcxt->nworkers; ++i)
> + for (i = 0; i < pei->pcxt->nworkers_launched; ++i)
>   InstrAccumParallelQuery(&pei->buffer_usage[i]);
>
>
> Can you be slightly more specific, where exactly you are expecting more
> changes?

I missed it when comparing the existing code with the patch.
Everything is fine with the patch. I marked it as ready for committer.

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] ExecGather() + nworkers

2016-03-03 Thread Haribabu Kommi
On Wed, Jan 13, 2016 at 7:19 PM, Amit Kapila  wrote:
> On Mon, Jan 11, 2016 at 9:16 AM, Amit Kapila 
> wrote:
>> On Mon, Jan 11, 2016 at 3:14 AM, Peter Geoghegan  wrote:
>> >
>> > On Sun, Jan 10, 2016 at 9:13 AM, Robert Haas 
>> > wrote:
>> > >> I'm not sure why the test for nworkers following the
>> > >> LaunchParallelWorkers() call doesn't look like this, though:
>> > >>
>> > >> /* Set up tuple queue readers to read the results. */
>> > >> if (pcxt->nworkers_launched > 0)
>> > >> {
>> > >> ...
>> > >> }
>> > >
>> > > Hmm, yeah, I guess it could do that.
>> >
>> > That would make it clearer as an example.
>> >
>> > >> But going to this additional trouble (detecting no workers launched
>> > >> on
>> > >> the basis of !nworkers_launched) suggests that simply testing
>> > >> nworkers_launched would be wrong, which AFAICT it isn't. Can't we
>> > >> just
>> > >> do that, and in so doing also totally remove the "for" loop shown
>> > >> here?
>> > >
>> > > I don't see how the for loop goes away.
>> >
>> > I meant that some code in the "for" loop goes away. Not all of it.
>> > Just the more obscure code. As I said, I'm mostly pointing this out
>> > out of concern for making it clearer as example code.
>> >
>>
>> Right, I can write a patch to do it in a way you are suggesting if you
>> are not planning to do it.
>>
>
> Changed the code such that nworkers_launched gets used wherever
> appropriate instead of nworkers.  This includes places other than
> pointed out above.

The changes in the patch are simple, trivial optimizations. I didn't find
any problem with them. I think the same optimization is required in the
"ExecParallelFinish" function as well.

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Incorrect error message in InitializeSessionUserId

2016-03-03 Thread Haribabu Kommi
On Wed, Mar 2, 2016 at 12:21 AM, Dmitriy Sarafannikov
 wrote:
> Hi all,
>
> I have found incorrect error message in InitializeSessionUserId function
> if you try to connect to database by role Oid (for example
> BackgroundWorkerInitializeConnectionByOid).
> If the role has no permission to log in, you will see an error message like this:
> FATAL:  role "(null)" is not permitted to log in
>
> I changed few lines of code and fixed this.
> Patch is attached.
> I want to add this patch to commitfest.
> Any objections?
>

The patch changes the code to take the role name from the role tuple instead
of using the provided rolename variable, because the rolename variable can be
NULL if the connection is from a background worker.

The patch is fine; I didn't find any problems, so I marked it as ready for
committer.

IMO this patch may need to be backpatched to the supported branches, as it is
a bug fix. The committer can decide.
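
A minimal sketch of the idea behind the fix (illustrative, not the exact
patch): read the role name back from the fetched pg_authid tuple instead of
trusting the caller-supplied rolename, which can be NULL:

    Form_pg_authid rform = (Form_pg_authid) GETSTRUCT(roleTup);
    char       *rname = NameStr(rform->rolname);    /* valid even when the
                                                     * caller passed only an
                                                     * OID */

    if (!rform->rolcanlogin)
        ereport(FATAL,
                (errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION),
                 errmsg("role \"%s\" is not permitted to log in", rname)));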

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] On columnar storage (2)

2016-03-03 Thread Haribabu Kommi
On Thu, Mar 3, 2016 at 7:46 PM, Bert  wrote:
>
> Thank you for the performance test. But please note that the patch is
> 'thrown away' and will be totally rewritten. I have no idea of the status
> of the second / third attempt, however.
> However, what is interesting is that for some queries this patch is already
> on par with VCI. Which db is that exactly?

The performance report was taken on a WIP columnar storage patch for the
PostgreSQL database. Only the storage part of the code is finished. To test
the performance, we used a custom plan to generate plans that can use the
columnar storage. That is how we ran the performance test.

I want to integrate this patch with the syntax proposed by Alvaro for
columnar storage and share it with the community. Before that, I want to
share the current storage design with the community for review, by preparing
a README file. I will try to send this soon.


Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2016-03-02 Thread Haribabu Kommi
On Fri, Feb 5, 2016 at 2:29 PM, Haribabu Kommi <kommi.harib...@gmail.com> wrote:
>
> This patch needs to be applied on top of the discard_hba_and_ident_cxt
> patch that was posted earlier.

Here I attached a patch re-based to the latest head, with the
discard_hba_ident_cxt patch included, for easier review as a single patch.

Regards,
Hari Babu
Fujitsu Australia


pg_hba_lookup_poc_v13.patch
Description: Binary data



Re: [HACKERS] On columnar storage (2)

2016-03-02 Thread Haribabu Kommi
On Mon, Feb 1, 2016 at 12:11 AM, Alvaro Herrera
 wrote:
> So we discussed some of this stuff during the developer meeting in
> Brussels and the main conclusion is that we're going to split this up in
> multiple independently useful pieces, and write up the general roadmap
> in the wiki so that we can discuss in detail on-list.
>
> I'm marking this as Returned with Feedback now.
>
> Thanks everybody,

Here I attached the DBT-3 performance report measured on the prototype patch
written for columnar storage, as mentioned in my earlier mail, with the WOS
and ROS design.

To measure the benefits of this design, we currently made the following
changes:
1. Created the columnar storage index similar to the other index methods.
2. Used a custom plan to generate plans that can use the columnar storage.
3. Optimized parallelism to use the columnar storage.

The code is not fully ready yet; I posted the performance results to get a
view from the community on whether this approach is really beneficial.

I will provide the full details of the design and the WIP patches later.

Regards,
Hari Babu
Fujitsu Australia


DBT3_performance_vci_community.xls
Description: MS-Excel spreadsheet



Re: [HACKERS] Parallel Aggregate

2016-02-16 Thread Haribabu Kommi
On Sat, Feb 13, 2016 at 3:51 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Sun, Feb 7, 2016 at 8:21 PM, Haribabu Kommi
> <kommi.harib...@gmail.com> wrote:
>> Here I attached updated patch with the corrections.
>
> So, what about the main patch, for parallel aggregation itself?  I'm
> reluctant to spend too much time massaging combine functions if we
> don't have the infrastructure to use them.

Here I attached a draft patch based on previous discussions. It still needs
better comments and optimization.

Overview:
1. Before creating the plan for the best path, verify whether a parallel
aggregate plan is possible. If it is, check whether it is cheaper than the
normal aggregate; if parallel is cheaper, replace the best path with the
cheapest partial path.

2. While generating the parallel aggregate plan, first generate the
targetlist of the partial aggregate by generating bare aggregate references
and group-by expressions.

3. Change the aggref->aggtype to aggtranstype in the partial aggregate
targetlist so that proper tuple data is returned from the worker.

4. Generate the partial aggregate node using the generated targetlist.

5. Add Gather and finalize aggregate nodes on top of the partial aggregate
plan, as sketched below.
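
Put together, the generated plan has the same shape as the EXPLAIN output
shown earlier in this thread:

    Finalize Aggregate
      ->  Gather
            ->  Partial Aggregate
                  ->  Parallel Seq Scan on ...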

To do:
1. Optimize the aggregate cost calculation; currently it is performed many
times.
2. Better comments, etc.

Please verify whether the patch is going in the direction you expect.

Regards,
Hari Babu
Fujitsu Australia


parallelagg_poc_v7.patch
Description: Binary data



Re: [HACKERS] Parallel Aggregate

2016-02-08 Thread Haribabu Kommi
On Mon, Feb 8, 2016 at 9:01 AM, Robert Haas <robertmh...@gmail.com> wrote:
>  On Thu, Jan 21, 2016 at 11:25 PM, Haribabu Kommi
> <kommi.harib...@gmail.com> wrote:
>>  [ new patch ]
>
> This patch contains a number of irrelevant hunks that really ought not
> to be here and make the patch harder to understand, like this:
>
> -* Generate appropriate target list for
> scan/join subplan; may be
> -* different from tlist if grouping or
> aggregation is needed.
> +* Generate appropriate target list for
> subplan; may be different from
> +* tlist if grouping or aggregation is needed.
>
> Please make a habit of getting rid of that sort of thing before submitting.

Sure. I will take care of such things in the future.

> Generally, I'm not quite sure I understand the code here.  It seems to
> me that what we ought to be doing is that grouping_planner, right
> after considering using a presorted path (that is, just after the if
> (sorted_path) block between lines 1822-1850), ought to then consider
> using a partial path.  For the moment, it need not consider the
> possibility that there may be a presorted partial path, because we
> don't have any way to generate those yet.  (I have plans to fix that,
> but not in time for 9.6.)  So it can just consider doing a Partial
> Aggregate on the cheapest partial path using an explicit sort, or
> hashing; then, above the Gather, it can finalize either by hashing or
> by sorting and grouping.
>
> The trick is that there's no path representation of an aggregate, and
> there won't be until Tom finishes his upper planner path-ification
> work.  But it seems to me we can work around that.  Set best_path to
> the cheapest partial path, add a partial aggregate rather than a
> regular one around where it says "Insert AGG or GROUP node if needed,
> plus an explicit sort step if necessary", and then push a Gather node
> and a Finalize Aggregate onto the result.

Thanks, I will update the patch accordingly. Along with those changes, I
will try to calculate the cost of the normal aggregate without generating the
plan and compare it against the parallel plan cost before generating the
actual plan, because with a small number of groups the normal aggregate
performs better than the parallel aggregate in tests.


Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] backpatch for REL9_4_STABLE of commit 40482e606733675eb9e5b2f7221186cf81352da1

2016-02-08 Thread Haribabu Kommi
On Mon, Feb 8, 2016 at 8:20 PM, Huong Dangminh
 wrote:
> Hi,
>
> I think this fixed is also required for REL9_4_STABLE.
> Please confirm the attached patch.

Yes, this fix was missed for the 9.4 stable branch during backpatching;
it is present on all the other supported branches.

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Combining Aggregates

2016-02-07 Thread Haribabu Kommi
On Thu, Jan 21, 2016 at 3:42 PM, David Rowley
 wrote:
> On 21 January 2016 at 08:06, Robert Haas  wrote:
>>
>> On Wed, Jan 20, 2016 at 7:38 AM, David Rowley
>>  wrote:
>> > Agreed. So I've attached a version of the patch which does not have any
>> > of
>> > the serialise/deserialise stuff in it.
>>
>> I re-reviewed this and have committed most of it with only minor
>> kibitzing.  A few notes:
>
>
> I've attached the re-based remainder, which includes the serial/deserial
> again.
>
> I'll submit this part to March 'fest, where hopefully we'll also have
> something to utilise it.
>

While testing parallel aggregate with float4 and float8 types based on the
latest patch, I found the following problems:

+ /*
+ * For partial aggregation we must adjust the return types of
+ * the Aggrefs
+ */
+ if (!aggplan->finalizeAggs)
+ set_partialagg_aggref_types(root, plan,
+ aggplan->serialStates);

[...]

+ aggref->aggtype = aggform->aggserialtype;
+ else
+ aggref->aggtype = aggform->aggtranstype;

Changing the aggref->aggtype to aggtranstype or aggserialtype only changes
it in the partial aggregate plan, as set_upper_references starts from the top
plan and works downward. Because of this, the targetlists of the nodes below
the finalize aggregate still point to the original type.

To fix this problem, I tried updating the targetlist aggref->aggtype with the
trans type during aggregate plan creation itself, but that leads to a problem
in setting the upper plan references. This is because, while fixing the
aggregate references of the plans above the partial aggregate, the Aggref at
the upper plan nodes doesn't match the Aggref coming from the partial
aggregate node, due to the aggtype difference in the _equalAggref function:

COMPARE_SCALAR_FIELD(aggtype);

As a temporary fix I compared it against aggtranstype and aggserialtype as
well, and then it works fine, but I don't see that change as the correct
approach. Do you have a better idea for solving this problem?
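
For context, COMPARE_SCALAR_FIELD in src/backend/nodes/equalfuncs.c is a
plain field comparison, which is why the aggtype difference alone makes the
two Aggrefs compare unequal:

    #define COMPARE_SCALAR_FIELD(fldname) \
        do { \
            if (a->fldname != b->fldname) \
                return false; \
        } while (0)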

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Parallel Aggregate

2016-02-07 Thread Haribabu Kommi
On Mon, Feb 8, 2016 at 2:00 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Sun, Jan 24, 2016 at 7:56 PM, Haribabu Kommi
> <kommi.harib...@gmail.com> wrote:
>> On Sat, Jan 23, 2016 at 12:59 PM, Haribabu Kommi
>> <kommi.harib...@gmail.com> wrote:
>>> Here I attached updated patch with additional combine function for
>>> two stage aggregates also.
>>
>> A wrong combine function was added in pg_aggregate.h in the earlier
>> patch that leading to
>> initdb problem. Corrected one is attached.
>
> I'm not entirely sure I know what's going on here, but I'm pretty sure
> that it makes no sense for the new float8_pl function to reject
> non-aggregate callers at the beginning and then have a comment at the
> end indicating what it does when not invoked as an aggregate.
> Similarly for the other new function.
>
> It would be a lot more clear what this patch was trying to accomplish
> if the new functions had header comments explaining their purpose -
> not what they do, but why they exist.

I added some header comments explaining the need for these functions and
when they will be used. These combine functions are necessary for float4 and
float8 parallel aggregation.

> float8_regr_pl is labeled in pg_proc.h as an aggregate transition
> function, but I'm wondering if it should say combine function.

Corrected.

> The changes to pg_aggregate.h include a large number of
> whitespace-only changes which are unacceptable.  Please change only
> the lines that need to be changed.

I tried to align the other rows with the new combine function addition, which
led to the whitespace problem; I will take care of such things in the future.
Here I attached an updated patch with the corrections.

Regards,
Hari Babu
Fujitsu Australia


additional_combine_fns_v3.patch
Description: Binary data



Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2016-02-04 Thread Haribabu Kommi
On Tue, Feb 2, 2016 at 8:57 AM, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
> Haribabu Kommi wrote:
>
>> Hi, Do you have any further comments on the patch that needs to be
>> taken care?
>
> I do.  I think the jsonb functions you added should be added to one of
> the json .c files instead, since they seem of general applicability.

Moved these functions into the jsonb_util.c file.

> But actually, I don't think you have ever replied to the question of why
> are you using JSON in the first place; isn't a normal array suitable?

It was discussed and suggested in [1] to use JSON for the options instead of
an array; because of that I changed it.

> The callback stuff is not documented in check_hba() at all.  Can you
> please add an explanation just above the function, so that people trying
> to use it know what can the callback be used for?  Also a few lines
> above the callback itself would be good.

Added some details explaining the callback function.

>Also, the third argument of
> check_hba() is a translatable message so you should enclose it in _() so
> that it is picked up for translation.  The "skipped"/"matched" words
> (and maybe others?) need to be marked similarly.

The mode column values (skipped/matched) and the reason-for-mismatch details
are now enclosed in _() for translation. Do you find any other details that
need to be enclosed?
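
A minimal illustration of the change (hypothetical surrounding code; the
strings are the ones from the sample output earlier in the thread):

    /* wrap user-facing words in _() so gettext picks them up */
    values[1] = CStringGetTextDatum(_("skipped"));
    reason = _("connection type mismatch");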

> That "Failed" in the errmsg in pg_hba_lookup() should be lowercase.

Corrected.

> Moving to next CF.

Thanks. An updated patch is attached with the above corrections.
This patch needs to be applied on top of the discard_hba_and_ident_cxt
patch posted earlier.

[1] - http://www.postgresql.org/message-id/5547db0a.2020...@gmx.net

Regards,
Hari Babu
Fujitsu Australia


pg_hba_lookup_poc_v12.patch
Description: Binary data



Re: [HACKERS] Parallel Aggregate

2016-01-24 Thread Haribabu Kommi
On Sat, Jan 23, 2016 at 12:59 PM, Haribabu Kommi
<kommi.harib...@gmail.com> wrote:
>
> Here I attached updated patch with additional combine function for
> two stage aggregates also.

A wrong combine function was added to pg_aggregate.h in the earlier
patch, which led to an initdb failure. A corrected version is attached.

Regards,
Hari Babu
Fujitsu Australia


additional_combine_fns_v2.patch
Description: Binary data



Re: [HACKERS] Parallel Aggregate

2016-01-22 Thread Haribabu Kommi
On Fri, Jan 22, 2016 at 10:13 PM, David Rowley
<david.row...@2ndquadrant.com> wrote:
> On 22 January 2016 at 17:25, Haribabu Kommi <kommi.harib...@gmail.com> wrote:
>> Along with these changes, I added a float8 combine function to see
>> how it works under parallel aggregate, it is working fine for float4, but
>> giving small data mismatch with float8 data type.
>>
>> postgres=# select avg(f3), avg(f4) from tbl;
>>        avg         |     avg
>> -------------------+--------------
>>   1.10000002384186 | 100.12344879
>> (1 row)
>>
>> postgres=# set enable_parallelagg = true;
>> SET
>> postgres=# select avg(f3), avg(f4) from tbl;
>>        avg         |     avg
>> -------------------+--------------
>>   1.10000002384186 | 100.12344918
>> (1 row)
>>
>> Column - f3 - float4
>> Column - f4 - float8
>>
>> similar problem for all float8 var_pop, var_samp, stddev_pop and stddev_samp
>> aggregates. Any special care is needed for float8 datatype?
>
> I'm not sure if this is what's going on here, as I don't really know
> the range of numbers that you've used to populate f4 with. It would be
> good to know, does "f4" contain negative values too?

No negative values are present in the f4 column.
The following are the SQL statements used:

create table tbl(f1 int, f2 char(100), f3 float4, f4 float8);
insert into tbl values(generate_series(1,10), 'Fujitsu', 1.1, 100.12345);


> It's not all that hard to demonstrate the instability of addition with
> float8. Take the following example:
>
> create table d (d float8);
> insert into d 
> values(1223123223412324.2231),(0.23),(-1223123223412324.2231);
>
> # select sum(d order by random()) from d;
>  sum
> -
>0
> (1 row)
>
> same query, once more.
>
> # select sum(d order by random()) from d;
>sum
> --
>  2.3e-013
> (1 row)
>
> Here the result just depends on the order which the numbers have been
> added. You may need to execute a few more times to see the result
> change.
>
> Perhaps a good test would be to perform a sum(f4 order by random()) in
> serial mode, and see if you're getting a stable result from the
> numbers that you have populated the table with.
>
> If that's the only problem at play here, then I for one am not worried
> about it, as the instability already exists today depending on which
> path is chosen to scan the relation. For example an index scan is
> likely not to return rows in the same order as a seq scan.
>
> We do also warn about this in the manual: "Inexact means that some
> values cannot be converted exactly to the internal format and are
> stored as approximations, so that storing and retrieving a value might
> show slight discrepancies. Managing these errors and how they
> propagate through calculations is the subject of an entire branch of
> mathematics and computer science and will not be discussed here,
> except for the following points:" [1]
>
>
> [1] http://www.postgresql.org/docs/devel/static/datatype-numeric.html
>

Thanks for the detailed explanation. Now I understand.

Here I have attached an updated patch that also adds combine functions
for two-stage aggregates.


Regards,
Hari Babu
Fujitsu Australia


additional_combine_fns_v1.patch
Description: Binary data



Re: [HACKERS] Combining Aggregates

2016-01-21 Thread Haribabu Kommi
On Thu, Jan 21, 2016 at 3:52 PM, David Rowley
<david.row...@2ndquadrant.com> wrote:
> On 21 January 2016 at 15:53, Haribabu Kommi <kommi.harib...@gmail.com>
> wrote:
>>
>> On Thu, Jan 21, 2016 at 1:33 PM, Haribabu Kommi
>> <kommi.harib...@gmail.com> wrote:
>> >
>> > Here I attached updated patch of parallel aggregate based on the latest
>> > changes in master. Still it lack of cost comparison of normal aggregate
>> > to
>> > parallel aggregate because of difficulty. This cost comparison is
>> > required
>> > in parallel aggregate as this is having some regression when the number
>> > of groups are less in the query plan.
>> >
>>
>> Updated patch is attached after removing a warning in building group
>> aggregate path.
>
>
> Hi,
>
> Thanks for updating the patch. I'd like to look at this with priority, but
> can you post it on the Parallel Agg thread? that way anyone following there
> can chime in over there rather than here.  I've still got a bit of work to
> do (in the not too distant future) on the serial/deserial part, so would be
> better to keep this thread for discussion on that.

Thanks for the details, and sorry for sending the parallel aggregate
patch to this thread. I will take care of that from now on.


Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Parallel Aggregate

2016-01-21 Thread Haribabu Kommi
On Fri, Jan 22, 2016 at 7:44 AM, David Rowley
<david.row...@2ndquadrant.com> wrote:
> On 21 January 2016 at 18:26, Haribabu Kommi <kommi.harib...@gmail.com> wrote:
>> Here I attached update parallel aggregate patch on top of recent commits
>> of combine aggregate and parallel join patch. It still lacks of cost 
>> comparison
>> code to compare both parallel and normal aggregates.
>
> Thanks for the updated patch.
>
> I'm just starting to look over this now.
>
> # create table t1 as select x from generate_series(1,1000000) x(x);
> # vacuum ANALYZE t1;
> # set max_parallel_degree =8;
> # explain select sum(x) from t1;
>QUERY PLAN
> -
>  Aggregate  (cost=9633.33..9633.34 rows=1 width=4)
>->  Parallel Seq Scan on t1  (cost=0.00..8591.67 rows=416667 width=4)
> (2 rows)
>
> I'm not quite sure what's happening here yet as I've not ran it
> through my debugger, but how can we have a Parallel Seq Scan without a
> Gather node? It appears to give correct results, so I can only assume
> it's not actually a parallel scan at all.
>
> Let's check:
>
> # select relname,seq_scan from pg_stat_user_tables where relname ='t1';
>  relname | seq_scan
> -+--
>  t1  |0
> (1 row)
>
> # explain analyze select sum(x) from t1;
> QUERY PLAN
> --
>  Aggregate  (cost=9633.33..9633.34 rows=1 width=4) (actual
> time=161.820..161.821 rows=1 loops=1)
>->  Parallel Seq Scan on t1  (cost=0.00..8591.67 rows=416667
> width=4) (actual time=0.051..85.348 rows=1000000 loops=1)
>  Planning time: 0.040 ms
>  Execution time: 161.861 ms
> (4 rows)
>
> # select relname,seq_scan from pg_stat_user_tables where relname ='t1';
>  relname | seq_scan
> -+--
>  t1  |1
> (1 row)
>
> Only 1 scan.
>
>
> # explain analyze select * from t1 where x=1;
>QUERY PLAN
> 
>  Gather  (cost=1000.00..10633.43 rows=1 width=4) (actual
> time=0.231..49.105 rows=1 loops=1)
>Number of Workers: 2
>->  Parallel Seq Scan on t1  (cost=0.00..9633.33 rows=0 width=4)
> (actual time=29.060..45.302 rows=0 loops=3)
>  Filter: (x = 1)
>  Rows Removed by Filter: 333333
>  Planning time: 0.049 ms
>  Execution time: 51.438 ms
> (7 rows)
>
> # select relname,seq_scan from pg_stat_user_tables where relname ='t1';
>  relname | seq_scan
> -+--
>  t1  |4
> (1 row)
>
> 3 more scans. This one seems to actually be parallel, and makes sense
> based on "Number of Workers: 2"


The problem was that the Gather path generated on the partial path list
was not getting added to the path list; because of that, there was a
mismatch between the sorted path and cheapest_path, which led to a wrong
plan.

As a temporary fix, I set sorted_path and cheapest_path to the same
path, and it works fine.


> Also looking at the patch:
>
> +bool
> +aggregates_allow_partial(Node *clause)
> +{
>
> In the latest patch that I sent on the combine aggregates thread:
> http://www.postgresql.org/message-id/CAKJS1f_in9J_ru4gPfygCQLUeB3=rzq3kg6rnpn-fzzhddi...@mail.gmail.com
> I made it so there's 3 possible return values from this function. As
> your patch stands now, if I create an aggregate function with an
> INTERNAL state with a combine function set, then this patch might try
> to parallel aggregate that and pass around the pointer to the internal
> state in the Tuple going from the worker to the main process, when the
> main process dereferences this pointer we'll get a segmentation
> violation. So I'd say you should maybe use a modified version of my
> latest aggregates_allow_partial() and check for PAT_ANY, and only
> parallelise the aggregate if you get that value.  If the use of
> partial aggregate was within a single process then you could be quite
> content with PAT_INTERNAL_ONLY. You'll just need to pull out the logic
> that checks for serial and deserial functions, since that's not in
> yet, and just have it return PAT_INTERNAL_ONLY if INTERNAL aggregates
> are found which have combine functions set.
>

I took the suggested code changes from the combine aggregate patch and
adjusted this patch accordingly.

Along with these changes, I added a float8 combine function to see
how it works under parallel aggregate; it is working fine for float4,
but gives a small data mismatch with the float8 data type.
Re: [HACKERS] Parallel Aggregate

2016-01-20 Thread Haribabu Kommi
On Thu, Dec 24, 2015 at 5:12 AM, Robert Haas  wrote:
> On Mon, Dec 21, 2015 at 6:38 PM, David Rowley
>  wrote:
>> On 22 December 2015 at 04:16, Paul Ramsey  wrote:
>>>
>>> Shouldn’t parallel aggregate come into play regardless of scan
>>> selectivity?
>>
>> I'd say that the costing should take into account the estimated number of
>> groups.
>>
>> The more tuples that make it into each group, the more attractive parallel
>> grouping should seem. In the extreme case if there's 1 tuple per group, then
>> it's not going to be of much use to use parallel agg, this would be similar
>> to a scan with 100% selectivity. So perhaps the costings for it can be
>> modeled around a the parallel scan costing, but using the estimated groups
>> instead of the estimated tuples.
>
> Generally, the way that parallel costing is supposed to work (with the
> parallel join patch, anyway) is that you've got the same nodes costed
> the same way you would otherwise, but the row counts are lower because
> you're only processing 1/Nth of the rows.  That's probably not exactly
> the whole story here, but it's something to think about.

Here I attached update parallel aggregate patch on top of recent commits
of combine aggregate and parallel join patch. It still lacks of cost comparison
code to compare both parallel and normal aggregates.


Regards,
Hari Babu
Fujitsu Australia


parallelagg_poc_v5.patch
Description: Binary data



Re: [HACKERS] Combining Aggregates

2016-01-20 Thread Haribabu Kommi
On Thu, Jan 21, 2016 at 12:32 PM, David Rowley
 wrote:
> On 21 January 2016 at 04:59, Robert Haas  wrote:
>>
>> On Wed, Jan 20, 2016 at 7:53 AM, David Rowley
>>  wrote:
>> > On 21 January 2016 at 01:44, Robert Haas  wrote:
>> >>
>> >> On Wed, Jan 20, 2016 at 7:38 AM, David Rowley
>> >>  wrote:
>> >> >> To my mind, priority #1 ought to be putting this fine new
>> >> >> functionality to some use.  Expanding it to every aggregate we've
>> >> >> got
>> >> >> seems like a distinctly second priority.  That's not to say that
>> >> >> it's
>> >> >> absolutely gotta go down that way, but those would be my priorities.
>> >> >
>> >> > Agreed. So I've attached a version of the patch which does not have
>> >> > any
>> >> > of
>> >> > the serialise/deserialise stuff in it.
>> >> >
>> >> > I've also attached a test patch which modifies the grouping planner
>> >> > to
>> >> > add a
>> >> > Partial Aggregate node, and a final aggregate node when it's
>> >> > possible.
>> >> > Running the regression tests with this patch only shows up variances
>> >> > in
>> >> > the
>> >> > EXPLAIN outputs, which is of course expected.
>> >>
>> >> That seems great as a test, but what's the first patch that can put
>> >> this to real and permanent use?
>> >
>> > There's no reason why parallel aggregates can't use the
>> > combine_aggregate_state_d6d480b_2016-01-21.patch patch.
>>
>> I agree.  Are you going to work on that?  Are you expecting me to work
>> on that?  Do you think we can use Haribabu's patch?  What other
>> applications are in play in the near term, if any?
>
>
> At the moment I think everything which will use this is queued up behind the
> pathification of the grouping planner which Tom is working on. I think
> naturally Parallel Aggregate makes sense to work on first, given all the
> other parallel stuff in this release. I plan on working on that that by
> either assisting Haribabu, or... whatever else it takes.
>

Here I have attached an updated parallel aggregate patch based on the
latest changes in master. It still lacks a cost comparison between the
normal and parallel aggregate because of the difficulty involved; this
comparison is needed because parallel aggregation shows some regression
when the query plan produces only a few groups.

Regards,
Hari Babu
Fujitsu Australia


parallelagg_poc_v4.patch
Description: Binary data



Re: [HACKERS] Combining Aggregates

2016-01-20 Thread Haribabu Kommi
On Thu, Jan 21, 2016 at 1:33 PM, Haribabu Kommi
<kommi.harib...@gmail.com> wrote:
>
> Here I attached updated patch of parallel aggregate based on the latest
> changes in master. Still it lack of cost comparison of normal aggregate to
> parallel aggregate because of difficulty. This cost comparison is required
> in parallel aggregate as this is having some regression when the number
> of groups are less in the query plan.
>

An updated patch is attached after fixing a warning in building the
group aggregate path.

Regards,
Hari Babu
Fujitsu Australia


parallelagg_poc_v5.patch
Description: Binary data



Re: [HACKERS] Combining Aggregates

2016-01-18 Thread Haribabu Kommi
On Mon, Jan 18, 2016 at 10:32 PM, David Rowley
 wrote:
> On 18 January 2016 at 16:44, Robert Haas  wrote:
>>
>> On Sun, Jan 17, 2016 at 9:26 PM, David Rowley
>>  wrote:
>> > hmm, so wouldn't that mean that the transition function would need to
>> > (for
>> > each input tuple):
>> >
>> > 1. Parse that StringInfo into tokens.
>> > 2. Create a new aggregate state object.
>> > 3. Populate the new aggregate state based on the tokenised StringInfo,
>> > this
>> > would perhaps require that various *_in() functions are called on each
>> > token.
>> > 4. Add the new tuple to the aggregate state.
>> > 5. Build a new StringInfo based on the aggregate state modified in 4.
>> >
>> > ?
>>
>> I don't really know what you mean by parse the StringInfo into tokens.
>> The whole point of the expanded-object interface is to be able to keep
>> things in an expanded internal form so that you *don't* have to
>> repeatedly construct and deconstruct internal data structures.
>
>
> That was a response to Haribabu proposal, although perhaps I misunderstood
> that. However I'm not sure how a PolyNumAggState could be converted to a
> string and back again without doing any string parsing.

I was thinking of a direct mapping of the structure via a text pointer,
something like the following:

result = PG_ARGISNULL(0) ? NULL : (text *) PG_GETARG_POINTER(0);
state = (PolyNumAggState *) VARDATA(result);

To handle big-endian vs. little-endian machines, we may need some extra changes.

Instead of adding three new columns to the pg_aggregate catalog table to
handle the internal types, IMO either something like the above or some
other approach to handling the internal types would be better.
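
For completeness, the write side of that direct mapping would presumably
be the mirror image, something like the naive sketch below. This is
illustration only: apart from the endianness issue mentioned above, a
state that holds pointers internally (as PolyNumAggState does, as I
understand it) cannot really survive a raw copy like this.

text   *result;

/* pack the state into a text varlena by raw copy (naive sketch) */
result = (text *) palloc(VARHDRSZ + sizeof(PolyNumAggState));
SET_VARSIZE(result, VARHDRSZ + sizeof(PolyNumAggState));
memcpy(VARDATA(result), state, sizeof(PolyNumAggState));
PG_RETURN_TEXT_P(result);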

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2016-01-17 Thread Haribabu Kommi
On Thu, Dec 31, 2015 at 10:47 AM, Haribabu Kommi
<kommi.harib...@gmail.com> wrote:
> On Wed, Dec 30, 2015 at 9:48 PM, Shulgin, Oleksandr
> <oleksandr.shul...@zalando.de> wrote:
>> On Wed, Dec 30, 2015 at 4:31 AM, Haribabu Kommi <kommi.harib...@gmail.com>
>> wrote:
>>>
>>>
>>> Adding quotes to pg_hba_lookup function makes it different from others.
>>> The issues regarding the same is already discussed in [1].
>>>
>>> select a.database[1], b.datname from
>>> pg_hba_lookup('postgres','kommih','::1')
>>> as a, pg_database as b where a.database[1]
>>> = b.datname;
>>>
>>> The queries like above are not possible with quoted output. It is very
>>> rare that the
>>> pg_hba_lookup function used in join operations, but still it is better
>>> to provide
>>> data without quotes. so I reverted these changes in the attached latest
>>> patch.
>>
>>
>> That's a good point.  I wonder that maybe instead of re-introducing quotes
>> we could somehow make the unquoted keywords that have special meaning stand
>> out, e.g:
>>
>> database  | {$sameuser}
>> user_name | {$all}
>>
>> That should make it obvious which of the values are placeholders and doesn't
>> interfere with joining database or user catalogs (while I would call
>> "sameuser" a very unlikely name for a database, "all" might be not that
>> unlikely name for a user, e.g. someone called like "Albert L. Lucky" could
>> use that as a login name).
>
> It is not only the problem with joins, the following two cases works
> without quotes only.
> With quotes the query doesn't match with the database name.
>
> select * from pg_hba_lookup('Test', 'kommih','127.0.0.1') where
> database = '{"Test"}';
> select * from pg_hba_lookup('Test', 'kommih','127.0.0.1') where
> database = '{Test}';

Hi, do you have any further comments on the patch that need to be
taken care of?

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Combining Aggregates

2016-01-17 Thread Haribabu Kommi
On Sat, Jan 16, 2016 at 12:00 PM, David Rowley
 wrote:
> On 16 January 2016 at 03:03, Robert Haas  wrote:
>>
>> On Tue, Dec 29, 2015 at 7:39 PM, David Rowley
>>  wrote:
>> >> No, the idea I had in mind was to allow it to continue to exist in the
>> >> expanded format until you really need it in the varlena format, and
>> >> then serialize it at that point.  You'd actually need to do the
>> >> opposite: if you get an input that is not in expanded format, expand
>> >> it.
>> >
>> > Admittedly I'm struggling to see how this can be done. I've spent a good
>> > bit
>> > of time analysing how the expanded object stuff works.
>> >
>> > Hypothetically let's say we can make it work like:
>> >
>> > 1. During partial aggregation (finalizeAggs = false), in
>> > finalize_aggregates(), where we'd normally call the final function,
>> > instead
>> > flatten INTERNAL states and store the flattened Datum instead of the
>> > pointer
>> > to the INTERNAL state.
>> > 2. During combining aggregation (combineStates = true) have all the
>> > combine
>> > functions written in such a ways that the INTERNAL states expand the
>> > flattened states before combining the aggregate states.
>> >
>> > Does that sound like what you had in mind?
>>
>> More or less.  But what I was really imagining is that we'd get rid of
>> the internal states and replace them with new datatypes built to
>> purpose.  So, for example, for string_agg(text, text) you could make a
>> new datatype that is basically a StringInfo.  In expanded form, it
>> really is a StringInfo.  When you flatten it, you just get the string.
>> When somebody expands it again, they again have a StringInfo.  So the
>> RW pointer to the expanded form supports append cheaply.
>
>
> That sounds fine in theory, but where and how do you suppose we determine
> which expand function to call? Nothing exists in the catalogs to decide this
> currently.

I am thinking that the transition function could return and accept
StringInfoData instead of the internal PolyNumAggState data, for
int8_avg_accum for example.

The StringInfoData would be formed from the members of the
PolyNumAggState structure. The StringInfoData given as input would be
transformed back into PolyNumAggState data to finish the calculation,
and the result would be formed into StringInfoData again and returned.
Similar changes would be needed for the final function's input type as
well. I am not sure whether this approach would have an impact on
performance.
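
To make that concrete, the round trip I have in mind looks roughly like
the sketch below, using an artificially simplified two-field state (the
real PolyNumAggState has more members; all names here are illustrative
only). Using the existing pqformat routines would also take care of the
byte-order question, since they read and write in network byte order:

#include "postgres.h"
#include "libpq/pqformat.h"

/* flatten two hypothetical state fields into a bytea */
static bytea *
polynum_state_flatten(int64 N, int64 sumX)
{
    StringInfoData buf;

    pq_begintypsend(&buf);
    pq_sendint64(&buf, N);
    pq_sendint64(&buf, sumX);
    return pq_endtypsend(&buf);
}

/* rebuild the fields from the flattened form (assumes a detoasted value) */
static void
polynum_state_expand(bytea *flat, int64 *N, int64 *sumX)
{
    StringInfoData buf;

    initStringInfo(&buf);
    appendBinaryStringInfo(&buf, VARDATA(flat), VARSIZE(flat) - VARHDRSZ);
    *N = pq_getmsgint64(&buf);
    *sumX = pq_getmsgint64(&buf);
    pfree(buf.data);
}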


Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Combining Aggregates

2016-01-14 Thread Haribabu Kommi
On Fri, Jan 15, 2016 at 3:34 PM, David Rowley
 wrote:
> On 8 January 2016 at 22:43, David Rowley 
> wrote:
>>
>> I've attached some re-based patched on current master. This is just to fix
>> a duplicate OID problem.
>>
>
> I've attached two updated patched to fix a conflict with a recent change to
> planner.c

I am getting the following compilation error and warning with the latest
patch, caused by a function prototype mismatch.

aggregatecmds.c: In function ‘DefineAggregate’:
aggregatecmds.c:93:8: warning: variable ‘serialTypeType’ set but not
used [-Wunused-but-set-variable]
  char  serialTypeType = 0;
^
clauses.c:434:1: error: conflicting types for ‘partial_aggregate_walker’
 partial_aggregate_walker(Node *node, partial_agg_context *context)
 ^
clauses.c:100:13: note: previous declaration of
‘partial_aggregate_walker’ was here
 static bool partial_aggregate_walker(Node *node, void *context);
 ^

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Weighted Stats

2016-01-07 Thread Haribabu Kommi
On Mon, Dec 21, 2015 at 1:50 PM, David Fetter  wrote:
> On Sun, Dec 20, 2015 at 06:13:33PM -0600, Jim Nasby wrote:
>> On 11/2/15 5:46 PM, David Fetter wrote:
>> >I'd like to add weighted statistics to PostgreSQL
>>
>> Anything happen with this? If community isn't interested, ISTM it'd be good
>> to put this in PGXN.
>
> I think it's already in PGXN as an extension, and I'll get another
> version out this early this week, as it involves mostly adding some
> tests.
>
> I'll do the float8 ones for core this week, too, and unless there's a
> really great reason to do more data types on the first pass, it should
> be in committable shape.

I reviewed the patch; the following are my observations.

1. +   precision, numeric, or interval

The interval type gives a problem: since the interval data type is not
actually supported, it should be removed from the list of supported inputs.

postgres=# select weighted_avg(f7,f1) from tbl;
ERROR:  function weighted_avg(interval, smallint) does not exist at character 8
HINT:  No function matches the given name and argument types. You
might need to add explicit type casts.


2. +float8_weighted_avg(PG_FUNCTION_ARGS)

It would be helpful to provide some information in a function header
about how the weighted average is calculated, as the other weighted
functions do.


3. + transvalues = check_float8_array(transarray,
"float8_weighted_stddev_accum", 4);

The second parameter to check_float8_array should be "float8_weighted_accum".


4. There is an OID conflict of 4066 with latest master code.


5.+ A += newvalW * ( newvalX - transvalues[2] ) / W;
+ CHECKFLOATVAL(A, isinf(newvalW) || isinf(newvalX - transvalues[2])
|| isinf(1.0/W), true);

+ Q += newvalW * (newvalX - transvalues[2]) * (newvalX - A);
+ CHECKFLOATVAL(A, isinf(newvalX -  transvalues[3]) || isinf(newvalX -
A) || isinf(1.0/W), true);


Does the whole calculation need to be passed to CHECKFLOATVAL, or is it
enough to pass just the variables involved in the calculation? If the
full expressions are required, they should be something like the
following:

CHECKFLOATVAL(A, isinf(transvalues[2]) || isinf(newvalW) ||
isinf(newvalX - transvalues[2]) || isinf(1.0/W), true);

CHECKFLOATVAL(Q, isinf(transvalues[3]) || isinf(newvalX -
transvalues[2]) || isinf(newvalX - A) || isinf(1.0/W), true);


I verified the stddev transition and final function calculations against
Wikipedia, and they are correct.
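
For reference, the recurrence I checked (West's weighted incremental
algorithm as described on Wikipedia, with W the running sum of weights,
A the weighted mean, and Q the weighted sum of squared deviations,
matching the patch's transition state) is:

    W_k = W_{k-1} + w_k
    A_k = A_{k-1} + (w_k / W_k) * (x_k - A_{k-1})
    Q_k = Q_{k-1} + w_k * (x_k - A_{k-1}) * (x_k - A_k)

with the population variance at the end being Q_n / W_n and the stddev
its square root.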

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Function and view to retrieve WAL receiver status

2016-01-06 Thread Haribabu Kommi
On Wed, Jan 6, 2016 at 8:00 PM, Michael Paquier
 wrote:
> On Wed, Jan 6, 2016 at 3:04 PM, Michael Paquier
>  wrote:
>> Attached is an updated patch.
>
> Forgot to update rules.out...

Thanks for the update. Patch looks good to me.
I marked it as ready for committer.

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Multi-tenancy with RLS

2016-01-06 Thread Haribabu Kommi
On Thu, Jan 7, 2016 at 2:29 PM, Stephen Frost  wrote:
> Robert,
>
> * Robert Haas (robertmh...@gmail.com) wrote:
>
>> Apart from the issue of whether this is doomed for some architectural
>> reason, it is not entirely clear to me that there's any consensus that
>> we want this.  I don't think that I understand the issues here well
>> enough to proffer an opinion of my own just yet... but I'd like to
>> hear what other people think.
>
> I'm certainly of the opinion that we want this or something similar.
>
> The big caveat kicking around in my head is if we want to have our own
> set of defined policies or if we want to give flexibility to the
> administrator to define their own policies.  In particular, I'm
> wondering about things like:
>
> CREATE POLICY namespace_limit ON pg_namespace TO company1 USING
>   (substring(nspname,1,8) = 'company1_');
>
> Which is a bit different, as I understand it, from what Haribadu has
> been proposing and quite a bit more complicated, as we'd then have to
> make the internal lookups respect the policy (so things like CREATE
> SCHEMA would have to check if you're allowed to actually create that
> schema, which would be based on the policy...).

I feel we may need both: our own set of built-in policies and the
ability for users to create/alter/drop the catalog policies. That way we
can support both simple and complex scenarios: with the default policies
a user can set up multi-tenancy easily, and with the edit option a user
can tune the policies for their own scenario.

One problem I see with either approach: with our own set of policies,
the object entries present in the catalog tables are visible to any user
holding any kind of privilege on those objects. If a user tries to
create an object that is already present in the catalog relation, it
will produce an error, yet the user cannot view the existing object
because of the permissions.

To avoid such problems, the administrator would have to add policies
such as "namespace_prefix" to all catalog tables.

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Function and view to retrieve WAL receiver status

2016-01-05 Thread Haribabu Kommi
On Sat, Dec 19, 2015 at 12:54 AM, Michael Paquier
 wrote:
> On Fri, Dec 18, 2015 at 8:39 AM, Robert Haas  wrote:
>> On Mon, Dec 14, 2015 at 7:23 PM, Michael Paquier
>>  wrote:
>>> On Tue, Dec 15, 2015 at 5:27 AM, Gurjeet Singh wrote:
 The function, maybe. But emitting an all-nulls row from a view seems
 counter-intuitive, at least when looking at it in context of relational
 database.
>>>
>>> OK, noted. Any other opinions?
>>
>> I wouldn't bother with the view.  If we're going to do it, I'd say
>> just provide the function and let people SELECT * from it if they want
>> to.
>
> OK, I took some time to write a patch for that as attached, added in
> the next CF here:
> https://commitfest.postgresql.org/8/447/
> I am fine switching to an SRF depending on other opinions of people
> here, it just seems like an overkill knowing the uniqueness of the WAL
> sender in a server.
>
> I have finished with a function and a system view, this came up more
> in line with the existing things like pg_stat_archiver, and this makes
> as well the documentation clearer, at least that was my feeling when
> hacking that.

I also feel that showing NULL values when there is no walreceiver may
not be good. Instead of an SRF, how about adding "WHERE s.pid IS NOT
NULL" to the system view to avoid showing NULL values? The pid value
cannot be NULL unless there is no walreceiver.


Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Function and view to retrieve WAL receiver status

2016-01-05 Thread Haribabu Kommi
On Tue, Jan 5, 2016 at 10:24 PM, Michael Paquier
<michael.paqu...@gmail.com> wrote:
> On Tue, Jan 5, 2016 at 7:49 PM, Haribabu Kommi <kommi.harib...@gmail.com> 
> wrote:
>> On Sat, Dec 19, 2015 at 12:54 AM, Michael Paquier
>> <michael.paqu...@gmail.com> wrote:
>>> On Fri, Dec 18, 2015 at 8:39 AM, Robert Haas <robertmh...@gmail.com> wrote:
>>>> On Mon, Dec 14, 2015 at 7:23 PM, Michael Paquier
>>>> <michael.paqu...@gmail.com> wrote:
>>>>> On Tue, Dec 15, 2015 at 5:27 AM, Gurjeet Singh wrote:
>>>>>> The function, maybe. But emitting an all-nulls row from a view seems
>>>>>> counter-intuitive, at least when looking at it in context of relational
>>>>>> database.
>>>>>
>>>>> OK, noted. Any other opinions?
>>>>
>>>> I wouldn't bother with the view.  If we're going to do it, I'd say
>>>> just provide the function and let people SELECT * from it if they want
>>>> to.
>>>
>>> OK, I took some time to write a patch for that as attached, added in
>>> the next CF here:
>>> https://commitfest.postgresql.org/8/447/
>>> I am fine switching to an SRF depending on other opinions of people
>>> here, it just seems like an overkill knowing the uniqueness of the WAL
>>> sender in a server.
>>>
>>> I have finished with a function and a system view, this came up more
>>> in line with the existing things like pg_stat_archiver, and this makes
>>> as well the documentation clearer, at least that was my feeling when
>>> hacking that.
>>
>> I also feel showing NULL values may not be good, when there is
>> no walreceiver. Instead of SRF function to avoid showing NULL vallues
>> how about adding "WHERE s.pid IS NOT NULL" to the system view.
>> pid value cannot be NULL, until unless there is no walreceiver.
>
> Yeah, I would not mind switching it to that. A couple of other stat
> catalog views do it as well.

Following are my observations on the latest patch.

+ If no WAL receiver is present on the server queried,
+   a single tuple filled with NULL values is returned instead.
+  

The above documentation change is not required if we change the system
view.

+s.received_up_to_lsn,

The column name could be changed to "received_lsn", matching
"received_tli"; the "up_to" part may not be required.

+ XLogRecPtr received_up_lsn;
+ TimeLineID received_up_tli;

Same as the above comment.

+ /* lock? */

I found that the walrcv data is updated only under the mutex. It would
be better to take that mutex here too, to provide a consistent snapshot
of the data to the user.
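
Something along these lines is what I had in mind (a sketch only,
assuming the usual WalRcvData field names; the patch's names may differ
slightly):

/* copy the shared walreceiver state under its spinlock so the
 * function reports a consistent snapshot */
SpinLockAcquire(&walrcv->mutex);
state = walrcv->walRcvState;
received_lsn = walrcv->receivedUpto;
received_tli = walrcv->receivedTLI;
SpinLockRelease(&walrcv->mutex);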


Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Multi-tenancy with RLS

2016-01-04 Thread Haribabu Kommi
On Mon, Jan 4, 2016 at 8:34 PM, Amit Langote
<langote_amit...@lab.ntt.co.jp> wrote:
> On 2016/01/04 14:43, Haribabu Kommi wrote:
>>>
>>> Here I attached new series of patches with a slightly different approach.
>>> Instead of creating the policies on the system catalog tables whenever
>>> the catalog security command is executed, just enable row level security
>>> on the system catalog tables. During the relation build, in
>>> RelationBuildRowSecurity function, if it is a system relation, frame the
>>> policy using the policy query which we earlier used to create by parsing it.
>>>
>>> With the above approach, in case of any problems in the policy, to use
>>> the corrected policy, user just needs to replace the binaries. whereas in
>>> earlier approach, either pg_upgrade or disabling and enabling of catalog
>>> security is required.
>>>
>>> Currently it is changed only for shared system catalog tables and also the
>>> way of enabling catalog security on shared system catalog tables is through
>>> initdb only. This also can be changed later. I will do similar changes for
>>> remaining catalog tables.
>>>
>>> Any comments on the approach?
>>
>> Instead of creating policies during the "alter database" command for database
>> catalog tables, generating at relation building is leading to an
>> infinite recursion
>> loop because of transformExpr call for the qual. Any ideas to handle the 
>> same?
>
> I tried your latest patch to see what may have caused the infinite
> recursion. The recursion occurs during backend startup itself, right?
>
> ISTM, doing transformWhereClause during RelationCacheInitializePhase3()
> would not work. Things like operators, functions within the policy qual
> require namespace lookup which down the line would call
> RelationBuildRowSecurity for pg_namespace build and so on thus causing the
> infinite recursion. Perhaps, it would have to be done in a separate phase
> after the phase 3 but I'm not sure.

Thanks for the test. Yes, the issue happens at backend startup itself.
I will try separating the initialization of the security policies out of
init phase 3.

> I wonder why do the policy quals need to be built from scratch every time
> in RelationBuildRowSecurity for system tables (shared or otherwise)? Why
> not just read from the pg_policy catalog like regular relations if ALTER
> DATABASE CATALOG SECURITY TRUE already created those entries? Maybe I
> missing something though.

Yes, creating the policies once and using them every time a relation is
built works as long as no problem is found in the policies. The
row-level security policies on the catalog tables are created
automatically when the user enables catalog security, so the user has no
control over these policies.

If we later find a problem in these policies and want to correct them,
for the shared system catalog tables the user needs to run pg_upgrade,
and for the other catalog tables the user needs to disable and re-enable
catalog security in every database.

Instead, if we always build the policies at run time for catalog tables,
the user can just replace the binaries with the latest ones. Currently
this works fine for shared system catalog tables. I will try separating
RelationBuildRowSecurity from init phase 3.

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-30 Thread Haribabu Kommi
On Wed, Dec 30, 2015 at 9:48 PM, Shulgin, Oleksandr
<oleksandr.shul...@zalando.de> wrote:
> On Wed, Dec 30, 2015 at 4:31 AM, Haribabu Kommi <kommi.harib...@gmail.com>
> wrote:
>>
>>
>> Adding quotes to pg_hba_lookup function makes it different from others.
>> The issues regarding the same is already discussed in [1].
>>
>> select a.database[1], b.datname from
>> pg_hba_lookup('postgres','kommih','::1')
>> as a, pg_database as b where a.database[1]
>> = b.datname;
>>
>> The queries like above are not possible with quoted output. It is very
>> rare that the
>> pg_hba_lookup function used in join operations, but still it is better
>> to provide
>> data without quotes. so I reverted these changes in the attached latest
>> patch.
>
>
> That's a good point.  I wonder that maybe instead of re-introducing quotes
> we could somehow make the unquoted keywords that have special meaning stand
> out, e.g:
>
> database  | {$sameuser}
> user_name | {$all}
>
> That should make it obvious which of the values are placeholders and doesn't
> interfere with joining database or user catalogs (while I would call
> "sameuser" a very unlikely name for a database, "all" might be not that
> unlikely name for a user, e.g. someone called like "Albert L. Lucky" could
> use that as a login name).

It is not only a problem with joins; the following two cases work only
without quotes. With quotes, the query does not match the database name.

select * from pg_hba_lookup('Test', 'kommih','127.0.0.1') where
database = '{"Test"}';
select * from pg_hba_lookup('Test', 'kommih','127.0.0.1') where
database = '{Test}';


Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Multi-tenancy with RLS

2015-12-29 Thread Haribabu Kommi
On Thu, Dec 17, 2015 at 12:46 PM, Haribabu Kommi
<kommi.harib...@gmail.com> wrote:
> Rebased patch is attached as it is having an OID conflict with the
> latest set of changes
> in the master branch.

Here I have attached a new series of patches with a slightly different
approach. Instead of creating the policies on the system catalog tables
whenever the catalog security command is executed, we just enable row
level security on the system catalog tables. During relation build, in
the RelationBuildRowSecurity function, if it is a system relation, we
frame the policy by parsing the policy query that we previously used at
creation time.

With this approach, if there is any problem in a policy, the user just
needs to replace the binaries to get the corrected policy, whereas with
the earlier approach either pg_upgrade or disabling and re-enabling
catalog security was required.

Currently this is changed only for the shared system catalog tables, and
enabling catalog security on them is possible only through initdb; this
can also be changed later. I will make similar changes for the remaining
catalog tables.

Any comments on the approach?

Regards,
Hari Babu
Fujitsu Australia


3_shared_catalog_tenancy_v2.patch
Description: Binary data


1_any_privilege_option_v2.patch
Description: Binary data


2_view_security_definer_v2.patch
Description: Binary data



Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-29 Thread Haribabu Kommi
On Wed, Dec 30, 2015 at 1:07 AM, Shulgin, Oleksandr
 wrote:
>
> This is close enough, but what I actually mean by "a callback" is more or
> less like the attached version.

Thanks for the changes.

> While at it, I've also added some trivial code to preserve keyword quoting
> in database and user fields, as well as added netmask output parameter; also
> documentation is extended a little.

Thanks for the documentation changes. Regarding the quoting: in any
system catalog table, quoted objects are represented without quotes, as
shown below.

postgres=> select datname from pg_database;
    datname
---------------
 postgres
 template1
 template0
 test_user2_db
 TEST_USER1_DB
 test_user2_dB
(6 rows)

Adding quotes to the pg_hba_lookup function's output would make it
different from the others. The issues around this were already discussed
in [1].

select a.database[1], b.datname
  from pg_hba_lookup('postgres','kommih','::1') as a, pg_database as b
 where a.database[1] = b.datname;

Queries like the above are not possible with quoted output. It is very
rare for the pg_hba_lookup function to be used in join operations, but
it is still better to provide the data without quotes, so I reverted
these changes in the attached latest patch.

> The biggest question for me is the proper handling of memory contexts for
> HBA and ident files data.  I think it makes sense to release them explicitly
> because with the current state of affairs, we have dangling pointers in
> parsed_{hba,ident}_{context,lines} after release of PostmasterContext.  The
> detailed comment in postgres.c around
> MemoryContextDelete(PostmasterContext); suggests that it's not easy already
> to keep track of the all things that might be affected by this cleanup step:
>
> /*
> * If the PostmasterContext is still around, recycle the space; we don't
> * need it anymore after InitPostgres completes.  Note this does not trash
> * *MyProcPort, because ConnCreate() allocated that space with malloc()
> * ... else we'd need to copy the Port data first.  Also, subsidiary data
> * such as the username isn't lost either; see ProcessStartupPacket().
> */
>
> Not sure if we need any advanced machinery here like some sort of cleanup
> hooks list?  For now I've added discard_{hba,ident}() functions and call
> them explicitly where appropriate.

The added functions properly free the hba and ident contexts once their
use is finished. I removed the discard function calls from the
PerformAuthentication function in EXEC_BACKEND mode, as they are called
once PerformAuthentication finishes.

The discard-context patch is separated from the pg_hba_lookup patch for
easier understanding; the pg_hba_lookup patch needs to be applied on top
of the discard_hba_and_ident_cxt patch.

[1] 
http://www.postgresql.org/message-id/cafj8prarzdscocmk30gyydogiwutucz7eve-bbg+wv2wg5e...@mail.gmail.com

Regards,
Hari Babu
Fujitsu Australia


pg_hba_lookup_poc_v11.patch
Description: Binary data


discard_hba_and_ident_cxt.patch
Description: Binary data



Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-28 Thread Haribabu Kommi
On Mon, Dec 28, 2015 at 9:09 PM, Shulgin, Oleksandr
<oleksandr.shul...@zalando.de> wrote:
> On Thu, Dec 24, 2015 at 5:16 AM, Haribabu Kommi <kommi.harib...@gmail.com>
> wrote:
>>
>> On Thu, Dec 24, 2015 at 2:37 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> > "Shulgin, Oleksandr" <oleksandr.shul...@zalando.de> writes:
>> >> 1. Have you considered re-loading the HBA file upon call to this
>> >> function
>> >> in a local context instead of keeping it in the backends memory?
>> >
>> > Aside from the security questions, please consider that this feature
>> > should
>> > work similarly to the current implementation of the pg_file_settings
>> > view,
>> > namely it tells you about what is *currently* in the on-disk files, not
>> > necessarily what is the active setting in the postmaster's memory.
>> > A backend could not be entirely sure about the postmaster's state
>> > anyway;
>> > and even if it could be, one of the major applications for features like
>> > this is testing manual changes to the files before you SIGHUP the
>> > postmaster.  So re-reading the files on each usage is a Good Thing, IMO,
>> > even if it sounds inefficient.
>> >
>> >> 2. I also wonder why JSONB arrays for database/user instead of TEXT[]?
>> >
>> > Yes, that seems rather random to me too.
>>
>> Here I attached updated patch with the following changes,
>> - Local loading of HBA file to show the authentication data
>> - Changed database and user types are text[]
>
>
> Still this requires a revert of the memory context handling commit for
> load_hba() and load_ident().  I think you can get around the problem by
> changing these functions to work with CurrentMemoryContext and set it
> explicitly to the newly allocated PostmasterContext in
> PerformAuthentication().  In your function you could then create a temporary
> context to be discarded before leaving the function.

Thanks for the review, though I didn't fully understand your point.

In the attached patch, load_hba uses PostmasterContext if it is present,
and CurrentMemoryContext otherwise. PostmasterContext is present only
during the backend start phase.

> I still think you should not try to re-implement check_hba(), but extend
> this function with means to report line skip reasons as per your
> requirements.  Having an optional callback function might be a good fit (a
> possible use case is logging the reasons line by line).

The check_hba function is enhanced to fill in the hba line details along
with the reason for a mismatch: whenever check_hba finds a mismatch, the
fill_hbaline function is called to frame the tuple and insert it into
the tuple store.
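
In other words, the shape of the reporting hook is roughly the following
(the names here are invented for illustration and are not the patch's
actual declarations):

/*
 * Invoked by check_hba() for every pg_hba.conf line examined;
 * reason is NULL when the line matched, otherwise it explains
 * why the line was skipped.
 */
typedef void (*hba_line_callback) (HbaLine *hba,
                                   int line_number,
                                   const char *reason,
                                   void *context);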

Regards,
Hari Babu
Fujitsu Australia


pg_hba_lookup_poc_v9.patch
Description: Binary data



Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-23 Thread Haribabu Kommi
On Wed, Dec 23, 2015 at 8:54 PM, Shulgin, Oleksandr
<oleksandr.shul...@zalando.de> wrote:
> On Wed, Dec 16, 2015 at 9:33 AM, Haribabu Kommi <kommi.harib...@gmail.com>
> wrote:
>>
>>
>> Function is changed to accept default values.
>>
>> Apart from the above, added a local memory context to allocate the memory
>> required for forming tuple for each line. This context resets for every
>> hba line
>> to avoid consuming unnecessary memory for scenarios of huge pg_hba.conf
>> files.
>>
>> In the revert_hba_context_release_in_backend patch, apart from reverting
>> the commit - 1e24cf64. In tokenize_file function, changed the new context
>> allocation from CurrentMemoryContext instead of TopMemoryContext.
>>
>> Patch apply process:
>> 1. revert_hba_context_release_in_backend_2.patch
>> 2. pg_hba_lookup_poc_v7.patch
>
>
> Hello,
>
> 1. Have you considered re-loading the HBA file upon call to this function in
> a local context instead of keeping it in the backends memory?  I do not
> expect that the revert of 1e24cf645d24aab3ea39a9d259897fd0cae4e4b6 would be
> accepted, as the commit message refers to potential security problems with
> keeping this data in backend memory:
>
> ... This saves a
> probably-usually-negligible amount of space per running backend.  It
> also
> avoids leaving potentially-security-sensitive data lying around in
> memory
> in processes that don't need it.  You'd have to be unusually paranoid to
> think that that amounts to a live security bug, so I've not gone so far
> as
> to forcibly zero the memory; but there surely isn't a good reason to
> keep
> this data around.

Yes, it is possible to load the file locally whenever the lookup
function is called. The only thing I am concerned about is the
performance impact of loading a huge file on every call.

> 2. I also wonder why JSONB arrays for database/user instead of TEXT[]?

When I first tried this functionality as a view, it became very
difficult to deal with the keyword database and user names, so at that
time I chose jsonb instead of text[], thinking it would make the
keywords easier to handle. Later the view approach itself was dropped.
I can change it if others feel the same way.

> 3. What happens with special keywords for database column like
> sameuser/samerole/samegroup and for special values in the user column?

There is no special handling of the keywords in this approach. Based on
the inputs to the function, it checks all hba lines for a match.

For example, if a configuration line contains 'all' for the database and
user names, then whatever database name and user name the caller
supplies, this line will match and be returned.

> 4. Would it be possible to also include the raw unparsed line from the HBA
> file?  Just the line number is probably enough when you have access to the
> host, but to show the results to someone else you might need to copy the raw
> line manually.  Not a big deal anyway.

IMO, since we already show all of the line's information in separate
columns, including the raw line may not look good.

> 5. Some tests demonstrating possible output would be really nice to have.

Do you mean regression tests? In the installcheck case, the results
depend on the configuration of the running server, so it is difficult to
write tests that pass in all scenarios. For that reason I didn't add
any.


Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Combining Aggregates

2015-12-23 Thread Haribabu Kommi
On Wed, Dec 23, 2015 at 7:50 PM, David Rowley
 wrote:
> One other part that I'm not too sure on how to deal with is how to set the
> data type for the Aggrefs when we're not performing finalization on the
> aggregate node. The return type for the Aggref in this case will be either
> the transtype, or the serialtype, depending on if we're serializing the
> states or not. To do this, I've so far just come up with
> set_partialagg_aggref_types() which is called during setrefs. The only other
> time that I can think to do this return type update would be when building
> the partial agg node's target list. I'm open to better ideas on this part.


Thanks for the patch. I am not sure this is the proper place for the
change, but using the transtype here will make all the float4 and float8
aggregates work in parallel. For most of these aggregates the return
type is pass-by-value while the transition type is variable-length.

We may need to write proper combine functions for these types to avoid
wrong results under parallelism.

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Combining Aggregates

2015-12-23 Thread Haribabu Kommi
On Thu, Dec 24, 2015 at 1:12 PM, David Rowley
<david.row...@2ndquadrant.com> wrote:
> On 24 December 2015 at 13:55, Haribabu Kommi <kommi.harib...@gmail.com>
> wrote:
>>
>> On Wed, Dec 23, 2015 at 7:50 PM, David Rowley
>> <david.row...@2ndquadrant.com> wrote:
>> > One other part that I'm not too sure on how to deal with is how to set
>> > the
>> > data type for the Aggrefs when we're not performing finalization on the
>> > aggregate node. The return type for the Aggref in this case will be
>> > either
>> > the transtype, or the serialtype, depending on if we're serializing the
>> > states or not. To do this, I've so far just come up with
>> > set_partialagg_aggref_types() which is called during setrefs. The only
>> > other
>> > time that I can think to do this return type update would be when
>> > building
>> > the partial agg node's target list. I'm open to better ideas on this
>> > part.
>>
>>
>> Thanks for the patch. I am not sure about the proper place of this change,
>> but changing it with transtype will make all float4 and float8 aggregates
>> to
>> work in parallel. Most of these aggregates return type is typbyval and
>> transition type is a variable length.
>>
>> we may need to write better combine functions for these types to avoid
>> wrong
>> results because of parallel.
>
>
> I might be misunderstanding you here, but  yeah, well, if by "write better"
> you mean "write some", then yeah :)  I only touched sum(), min() and max()
> so far as I didn't need to do anything special with these. I'm not quite
> sure what you mean with the "wrong results" part. Could you explain more?

Sorry for not providing clear information. To check whether replacing
aggtype with aggtranstype works for float8 avg, I tried adding a
float8_combine_accum function for float8 avg, similar to float8_accum,
with the difference that the two transition values are added together
instead of adding newval to the existing transvalue:

N += transvalues1[0];
sumX += transvalues1[1];
sumX2 += transvalues1[2];

But the result differed from the normal aggregate. The data in the
column is 1.1:

postgres=# select avg(f3) from tbl where f1 < 1001 group by f3;
          avg
-----------------------
 2.16921537594434e-316
(1 row)

postgres=# set enable_parallelagg = false;
SET
postgres=# select avg(f3) from tbl where f1 < 1001 group by f3;
 avg
------
 1.11
(1 row)


At first I thought it was a problem in the combine function, but the
problem seems to be somewhere else in the parallel aggregate code. Sorry
for the noise.
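
For the record, written out in full the experiment amounted to something
like the sketch below (illustration only; a real version would also need
the aggregate-context check that float8_accum performs before modifying
its input in place, and check_float8_array is the existing helper in
float.c):

#include "postgres.h"
#include "fmgr.h"
#include "utils/array.h"
#include <math.h>

Datum
float8_avg_combine_sketch(PG_FUNCTION_ARGS)
{
    ArrayType  *transarray1 = PG_GETARG_ARRAYTYPE_P(0);
    ArrayType  *transarray2 = PG_GETARG_ARRAYTYPE_P(1);
    float8     *tv1,
               *tv2;
    float8      N,
                sumX,
                sumX2;

    tv1 = check_float8_array(transarray1, "float8_avg_combine_sketch", 3);
    tv2 = check_float8_array(transarray2, "float8_avg_combine_sketch", 3);

    /* element-wise merge of the {N, sumX, sumX2} states */
    N = tv1[0] + tv2[0];
    sumX = tv1[1] + tv2[1];
    CHECKFLOATVAL(sumX, isinf(tv1[1]) || isinf(tv2[1]), true);
    sumX2 = tv1[2] + tv2[2];
    CHECKFLOATVAL(sumX2, isinf(tv1[2]) || isinf(tv2[2]), true);

    tv1[0] = N;
    tv1[1] = sumX;
    tv1[2] = sumX2;

    PG_RETURN_ARRAYTYPE_P(transarray1);
}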

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Combining Aggregates

2015-12-23 Thread Haribabu Kommi
On Wed, Dec 23, 2015 at 7:50 PM, David Rowley
 wrote:
> This is just my serial format that I've come up with for NumericAggState,
> which is basically: "N sumX maxScale maxScaleCount NaNcount". Perhaps we can
> come up with something better, maybe just packing the ints and int64s into a
> bytea type and putting the text version of sumX on the end... I'm sure we
> can think of something more efficient between us, but I think the serial
> state should definitely be cross platform e.g if we do the bytea thing, then
> the ints should be in network byte order so that a server cluster can have a
> mix of little and big-endian processors.

Instead of adding serialization and deserialization functions to every
aggregate whose transition type is internal, how about adding these as
send and recv functions for the internal type, usable only in an
aggregate context? The data could then be sent and received like other
types. Would that be possible?

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-23 Thread Haribabu Kommi
On Thu, Dec 24, 2015 at 2:37 AM, Tom Lane  wrote:
> "Shulgin, Oleksandr"  writes:
>> 1. Have you considered re-loading the HBA file upon call to this function
>> in a local context instead of keeping it in the backends memory?
>
> Aside from the security questions, please consider that this feature should
> work similarly to the current implementation of the pg_file_settings view,
> namely it tells you about what is *currently* in the on-disk files, not
> necessarily what is the active setting in the postmaster's memory.
> A backend could not be entirely sure about the postmaster's state anyway;
> and even if it could be, one of the major applications for features like
> this is testing manual changes to the files before you SIGHUP the
> postmaster.  So re-reading the files on each usage is a Good Thing, IMO,
> even if it sounds inefficient.
>
>> 2. I also wonder why JSONB arrays for database/user instead of TEXT[]?
>
> Yes, that seems rather random to me too.

Here I have attached an updated patch with the following changes:
- local loading of the HBA file to show the authentication data
- changed the database and user types to text[]

Regards,
Hari Babu
Fujitsu Australia


pg_hba_lookup_poc_v8.patch
Description: Binary data



Re: [HACKERS] Parallel Aggregate

2015-12-21 Thread Haribabu Kommi
On Tue, Dec 22, 2015 at 2:16 AM, Paul Ramsey  wrote:
> Shouldn’t parallel aggregate come into play regardless of scan selectivity?
> I know in PostGIS land there’s a lot of stuff like:
>
> SELECT ST_Union(geom) FROM t GROUP BY areacode;
>
> Basically, in the BI case, there’s often no filter at all. Hoping that’s
> considered a prime case for parallel agg :)

Yes, the latest patch attached in the thread addresses this issue.
But it still lacks proper cost calculation and comparison with the
original aggregate cost.

Parallel aggregate is selected only when the number of groups
is less than 1/4 of the rows being selected.
Otherwise, aggregating twice over a larger number of
records leads to a performance drop compared to the original aggregate.

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Parallel Aggregate

2015-12-21 Thread Haribabu Kommi
On Mon, Dec 21, 2015 at 6:48 PM, David Rowley
<david.row...@2ndquadrant.com> wrote:
> On 21 December 2015 at 17:23, Haribabu Kommi <kommi.harib...@gmail.com>
> wrote:
>>
>>
>> Attached latest performance report. Parallel aggregate is having some
>> overhead
>> in case of low selectivity.This can be avoided with the help of cost
>> comparison
>> between normal and parallel aggregates.
>>
>
> Hi, Thanks for posting an updated patch.
>
> Would you be able to supply a bit more detail on your benchmark? I'm
> surprised by the slowdown reported with the high selectivity version. It
> gives me the impression that the benchmark might be producing lots of groups
> which need to be pushed through the tuple queue to the main process. I think
> it would be more interesting to see benchmarks with varying number of
> groups, rather than scan selectivity. Selectivity was important for parallel
> seqscan, but less so for this, as it's aggregated groups we're sending to
> main process, not individual tuples.

Yes, the query produces more groups as the selectivity increases.
For example, with a scan selectivity of 40, the number of groups is 400.

Following is the query:

SELECT tenpoCord,
       SUM(yokinZandaka)      AS yokinZandakaxGOUKEI,
       SUM(kashikoshiZandaka) AS kashikoshiZandakaxGOUKEI,
       SUM(kouzasuu)          AS kouzasuuxGOUKEI,
       SUM(sougouKouzasuu)    AS sougouKouzasuuxGOUKEI
  FROM public.test01
 WHERE tenpoCord <= '001' AND
       kamokuCord = '01' AND
       kouzaKatujyoutaiCord = '0'
 GROUP BY kinkoCord, tenpoCord;


Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Multi-tenancy with RLS

2015-12-16 Thread Haribabu Kommi
Rebased patch is attached, as the previous version had an OID conflict with
the latest set of changes
in the master branch.

Regards,
Hari Babu
Fujitsu Australia


4_database_catalog_tenancy_v3.patch
Description: Binary data



Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-16 Thread Haribabu Kommi
On Wed, Dec 16, 2015 at 8:19 AM, Tomas Vondra
 wrote:
> Hi,
>
> I've reviewed the patch today, after re-reading the whole discussion.

Thanks for the review.

> The one unsolved issue is what to do with 1e24cf64. My understanding is that
> the current patch still requires reverting of that patch, but my feeling is
> TL won't be particularly keen about doing that. Or am I missing something?

Before the pg_hba_lookup function, the parsed HBA lines were not used in the
backend; they were used only by the postmaster process for authentication.
The parsed HBA lines would therefore occupy extra memory in the backend
process for no benefit. For this reason, Tom Lane changed the allocation to
PostmasterContext instead of TopMemoryContext.

> The current patch (v6) also triggers a few warnings during compilation,
> about hostname/address being unitialized in pg_hba_lookup(). That happens
> because 'address' is only set when (! PG_ARGISNULL(2)). Fixing it is as
> simple as
>
> char*address = NULL;
> char*hostname = NULL;
>
> at the beginning of the function (this seems correct to me).

corrected.

> The current patch also does not handle 'all' keywords correctly - it
> apparently just compares the values as strings, which is incorrect. For
> example this
>
> SELECT * FROM pg_hba_lookup('all', 'all')
>
> matches this pg_hba.conf line
>
> local   all   all   trust
>
> That's clearly incorrect, as Alvaro pointed out.

In the above case, 'all' is taken as the database and user names. The
pg_hba line contains the keyword 'all' for both database and user, so it
can match any database and user names provided by the caller. For this
reason, it matches the first line
of pg_hba.conf.

I feel this is fine. Please let me know if you are expecting a different
behavior.

> I'm also wondering whether we really need three separate functions in
> pg_proc.
>
> pg_hba_lookup(database, user)
> pg_hba_lookup(database, user, address)
> pg_hba_lookup(database, user, address, ssl_inuse)
>
> Clearly, that's designed to match the local/host/hostssl/hostnossl cases
> available in pg_hba. But why not to simply use default values instead?
>
> pg_hba_lookup(database TEXT, user TEXT,
>   address TEXT DEFAULT NULL,
>   ssl_inuse BOOLEAN DEFAULT NULL)
>

The function is changed to accept default values.

Apart from the above, I added a local memory context to allocate the memory
required for forming the tuple for each line. This context is reset for every
HBA line, to avoid consuming unnecessary memory with very large pg_hba.conf
files.
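
The per-line reset pattern looks roughly like this sketch (a fragment from
inside the SRF; variable names are illustrative, not taken from the patch):

MemoryContext tupcxt;
ListCell   *lc;

tupcxt = AllocSetContextCreate(CurrentMemoryContext,
							   "pg_hba line tuple cxt",
							   ALLOCSET_DEFAULT_MINSIZE,
							   ALLOCSET_DEFAULT_INITSIZE,
							   ALLOCSET_DEFAULT_MAXSIZE);
foreach(lc, parsed_hba_lines)
{
	MemoryContext oldcxt = MemoryContextSwitchTo(tupcxt);

	/* ... form the result tuple for this HBA line ... */

	MemoryContextSwitchTo(oldcxt);
	MemoryContextReset(tupcxt);		/* discard this line's allocations */
}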

The revert_hba_context_release_in_backend patch, apart from reverting
commit 1e24cf64, also changes tokenize_file() to allocate the new context
from CurrentMemoryContext instead of TopMemoryContext.

Patch apply process:
1. revert_hba_context_release_in_backend_2.patch
2. pg_hba_lookup_poc_v7.patch

Regards,
Hari Babu
Fujitsu Australia


revert_hba_context_release_in_backend_2.patch
Description: Binary data


pg_hba_lookup_poc_v7.patch
Description: Binary data



Re: [HACKERS] Parallel Aggregate

2015-12-15 Thread Haribabu Kommi
On Tue, Dec 15, 2015 at 8:04 AM, Paul Ramsey  wrote:
> But the run dies.
>
> NOTICE:  SRID value -32897 converted to the officially unknown SRID value 0
> ERROR:  Unknown geometry type: 2139062143 - Invalid type
>
> From the message it looks like geometry gets corrupted at some point,
> causing a read to fail on very screwed up metadata.

Thanks for the test. There was a problem in advance_combination_function
in handling pass-by-reference data. Here I attached an updated patch with the fix.


Regards,
Hari Babu
Fujitsu Australia


parallelagg_poc_v2.patch
Description: Binary data



Re: [HACKERS] Parallel Aggregate

2015-12-13 Thread Haribabu Kommi
On Fri, Dec 11, 2015 at 5:42 PM, Haribabu Kommi
<kommi.harib...@gmail.com> wrote:
> 3. Performance test to observe the effect of parallel aggregate.

Here I attached the performance test report of parallel aggregate.
Summary of the results:
1. Parallel aggregate gives no improvement, or only a very small
overhead, compared to parallel scan in the case of low
selectivity.

2. Parallel aggregate performs more than 60% better than
parallel scan, because of much lower data transfer overhead: the
hash aggregate operation reduces the number of tuples that
must be transferred from the workers to the backend.

The parallel aggregate plan depends on an underlying parallel seq scan.
If no parallel seq scan plan is generated, because the tuple transfer
overhead cost is too high at higher selectivity, then parallel
aggregate is also not possible. But with parallel aggregate,
the number of records that must be transferred from
worker to backend may be smaller than with parallel seq scan, so
the overall cost of parallel aggregate may still be better.

To handle this problem, how about the following approach?

Add one more member to RelOptInfo, called cheapest_parallel_path,
used to store the parallel path where one is possible. Wherever a
parallel plan is possible, this field is set to that plan. If a parallel
plan is not possible at a parent node, the field is set to NULL;
otherwise the parallel plan at that node is recalculated based on the
parallel plan of the node below.

Once all the paths are finalized, the grouping planner prepares a plan
for the normal aggregate and one for the parallel aggregate, compares
their costs, and chooses the cheaper plan, as sketched below.
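
The comparison step itself would be trivial; a sketch with a hypothetical
helper name (not from a patch):

static Plan *
cheapest_agg_plan(Plan *normal_agg, Plan *parallel_agg)
{
	if (parallel_agg != NULL &&
		parallel_agg->total_cost < normal_agg->total_cost)
		return parallel_agg;
	return normal_agg;
}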

I have not yet evaluated the feasibility of the above solution. Suggestions?

Regards,
Hari Babu
Fujitsu Australia


performance_test_result.xlsx
Description: MS-Excel 2007 spreadsheet



Re: [HACKERS] Parallel Aggregate

2015-12-12 Thread Haribabu Kommi
On Sat, Dec 12, 2015 at 8:42 AM, David Rowley
 wrote:
> On 12 December 2015 at 04:00, Robert Haas  wrote:
>>
>> I'd like to commit David Rowley's patch from the other thread first,
>> and then deal with this one afterwards.  The only thing I feel
>> strongly needs to be changed in that patch is CFUNC -> COMBINEFUNC,
>> for clarity.
>
>
> I have addressed that in my local copy. I'm now just working on adding some
> test code which uses the new infrastructure. Perhaps I'll just experiment
> with the parallel aggregate stuff instead now.
>

Here I attached a patch with the following changes; I feel it is better to
include them as part
of the combine aggregate patch.

1. Added missing outfuncs.c changes for the newly added variables in
the Aggref structure.
2. Kept the aggregate function in the finalize aggregate stage to do the
final aggregation
on the tuples received from all the workers.

Patch still needs a fix for correcting the explain plan output issue.

postgres=# explain analyze verbose select count(*), sum(f1) from tbl
where f1 % 100 = 0 group by f3;
                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Finalize HashAggregate  (cost=1853.75..1853.76 rows=1 width=12) (actual time=92.428..92.429 rows=1 loops=1)
   Output: pg_catalog.count(*), pg_catalog.sum((sum(f1))), f3
   Group Key: tbl.f3
   ->  Gather  (cost=0.00..1850.00 rows=500 width=12) (actual time=92.408..92.416 rows=3 loops=1)
         Output: f3, (count(*)), (sum(f1))


Regards,
Hari Babu
Fujitsu Australia


set_ref_final_agg.patch
Description: Binary data



Re: [HACKERS] Parallel Aggregate

2015-12-10 Thread Haribabu Kommi
On Thu, Dec 3, 2015 at 6:06 PM, David Rowley
<david.row...@2ndquadrant.com> wrote:
> On 3 December 2015 at 19:24, Haribabu Kommi <kommi.harib...@gmail.com>
> wrote:
>>
>> On Thu, Dec 3, 2015 at 4:18 PM, David Rowley
>> <david.row...@2ndquadrant.com> wrote:
>> >
>> > Hi,
>> >
>> > I just wanted to cross post here to mark that I've posted an updated
>> > patch
>> > for combining aggregate states:
>> >
>> > http://www.postgresql.org/message-id/CAKJS1f9wfPKSYt8CG=t271xbymzjrzwqbjeixiqrf-olh_u...@mail.gmail.com
>> >
>> > I also wanted to check if you've managed to make any progress on
>> > Parallel
>> > Aggregation? I'm very interested in this myself and would like to
>> > progress
>> > with it, if you're not already doing so.
>>
>> Yes, the parallel aggregate basic patch is almost ready.
>> This patch is based on your earlier combine state patch.
>> I will post it to community with in a week or so.
>
>
> That's great news!
>
> Also note that there's some bug fixes in the patch I just posted on the
> other thread for combining aggregate states:
>
> For example:  values[Anum_pg_aggregate_aggcombinefn - 1] =
> ObjectIdGetDatum(combinefn);
> was missing from AggregateCreate().
>
> It might be worth diffing to the updated patch just to pull in anything else
> that's changed.

Here I attached a POC patch of parallel aggregate based on combine
aggregate patch. This patch contains the combine aggregate changes
also. This patch generates and executes the parallel aggregate plan
as discussed in earlier threads.

Changes:

1. The aggregate reference in the Finalize Aggregate node was getting
overwritten with an OUTER_VAR reference. But to do the final aggregation
we need the aggregate there, so for now this is avoided by checking the
combine states.

2. Check whether the aggregate functions present in the targetlist
and qual can be executed in parallel or not. Based on this, the targetlist
is formed and passed to the partial aggregate.

3. Replace the seq scan lefttree with a partial aggregate plan and
generate the full parallel aggregate plan.

Todo:
1. Code cleanup; it is just a prototype.
2. Explain plan with proper instrumentation data.
3. Performance tests to observe the effect of parallel aggregate.
4. Separate the combine aggregate patch from the additional changes
made.


Regards,
Hari Babu
Fujitsu Australia


parallelagg_poc.patch
Description: Binary data



Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-09 Thread Haribabu Kommi
On Wed, Dec 9, 2015 at 5:31 PM, Amit Kapila  wrote:
>
> Another bigger issue I see in the above part of code is that it doesn't
> seem to be safe to call load_hba() at that place in PostgresMain() as
> currently load_hba() is using a context created from PostmasterContext
> to perform the parsing and some other stuff, the PostmasterContext
> won't be available at that time.  It is deleted immediately after
> InitPostgres
> is completed.  So either we need to make PostmasterContext don't go
> away after InitPostgres() or load_hba shouldn't use it and rather use
> CurrentMemroyContext similar to ProcessConfigFile or may be use
> TopMemoryContext instead of PostmasterContext if possible.  I think
> this needs some more thoughts.
>
> Apart from above, this patch doesn't seem to work on Windows or
> for EXEC_BACKEND builds as we are loading the hba file in a
> temporary context (PostmasterContext, refer PerformAuthentication)
> which won't be alive for the backends life.  This works on linux because
> of fork/exec mechanism which allows to inherit the parsed file
> (parsed_hba_lines). This is slightly tricky problem to solve and we
> have couple of options (a) use TopMemoryContext instead of Postmaster
> Context to load hba; (b) Use CurrentMemoryContext (c) pass the parsed
> hba file for Windows/Exec_Backend using save_backend_variables/
> restore_backend_variables mechanism or if you have any other idea.
> If you don't have any better idea, then you can evaluate above ideas
> and see which one makes more sense.

The patch reverting the context release was already provided in the first
mail of this
thread [1]; I forgot to mention it in the later mails.

Here I attached the same patch. It needs to be applied before the
pg_hba_lookup patch. I tested it on Windows as well.

[1] - 
http://www.postgresql.org/message-id/cajrrpgffyf45mfk7ub+qhwhxn_ttmknrvhtudefqzuzzrwe...@mail.gmail.com

Regards,
Hari Babu
Fujitsu Australia


revert_hba_context_release_in_backend.patch
Description: Binary data



Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-09 Thread Haribabu Kommi
On Thu, Dec 10, 2015 at 4:22 AM, Alvaro Herrera
<alvhe...@2ndquadrant.com> wrote:
> Haribabu Kommi wrote:
>
>> Reverting the context release patch is already provided in the first
>> mail of this
>> thread [1]. Forgot to mention about the same in further mails.
>>
>> Here I attached the same patch. This patch needs to be applied first before
>> pg_hba_lookup patch. I tested it in windows version also.
>
> So if you change the file and reload repeatedly, we leak all the memory
> allocated for HBA lines in TopMemoryContext?  This doesn't sound great.
> Perhaps we need a dedicated context which can be reset at will so that
> it can be refilled with the right info when we reload the file.

No, there are no leaks associated with pg_hba.conf parsing. We already have
a memory context called "hba parser context", allocated from the Postmaster
context; the "revert_hba_context_release_in_backend" patch changes its
parent to TopMemoryContext. The memory required for parsing and storing the
parsed hba lines is obtained from this context.


Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-09 Thread Haribabu Kommi
On Thu, Dec 10, 2015 at 2:29 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Thu, Dec 10, 2015 at 6:46 AM, Haribabu Kommi <kommi.harib...@gmail.com>
> wrote:
>>
>> On Thu, Dec 10, 2015 at 4:22 AM, Alvaro Herrera
>> <alvhe...@2ndquadrant.com> wrote:
>> > Haribabu Kommi wrote:
>> >
>> >> Reverting the context release patch is already provided in the first
>> >> mail of this
>> >> thread [1]. Forgot to mention about the same in further mails.
>> >>
>> >> Here I attached the same patch. This patch needs to be applied first
>> >> before
>> >> pg_hba_lookup patch. I tested it in windows version also.
>> >
>> > So if you change the file and reload repeatedly, we leak all the memory
>> > allocated for HBA lines in TopMemoryContext?  This doesn't sound great.
>> > Perhaps we need a dedicated context which can be reset at will so that
>> > it can be refilled with the right info when we reload the file.
>>
>> No. There is no leaks associated with pg_hba.conf parsing. we already have
>> a memory context called "hba parser context" allocated from Postmaster
>> context. The "revert_hba_context_release_in_backend" patch changes it to
>> TopMemoryContext. The memory required for parsing and storing parsed
>> hba lines is obtained from this context.
>>
>
> tokenize_file() is called before creation of hba parser context, so below
> change would be problem.
>
> *** 386,392  tokenize_file(const char *filename, FILE *file,
>   MemoryContext linecxt;
>   MemoryContext oldcxt;
>
> ! linecxt = AllocSetContextCreate(CurrentMemoryContext,
>   "tokenize file cxt",
>   ALLOCSET_DEFAULT_MINSIZE,
>   ALLOCSET_DEFAULT_INITSIZE,
> --- 386,392
>   MemoryContext linecxt;
>   MemoryContext oldcxt;
>
> ! linecxt = AllocSetContextCreate(TopMemoryContext,
>   "tokenize file cxt",
>   ALLOCSET_DEFAULT_MINSIZE,
>   ALLOCSET_DEFAULT_INITSIZE,
>
> How about creating "hba parser context" and "ident parser context"
> at the beginning of their respective functions and don't change
> anything in tokenize_file()?

The tokenize file cxt is deleted after a successful load of the pg_hba.conf
or pg_ident.conf file. We don't need this memory once the file is
loaded; for that reason it is created as a
separate context and deleted later.

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-09 Thread Haribabu Kommi
On Thu, Dec 10, 2015 at 4:33 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Thu, Dec 10, 2015 at 9:51 AM, Haribabu Kommi <kommi.harib...@gmail.com>
> wrote:
>>
>> On Thu, Dec 10, 2015 at 2:29 PM, Amit Kapila <amit.kapil...@gmail.com>
>> wrote:
>
>> > How about creating "hba parser context" and "ident parser context"
>> > at the beginning of their respective functions and don't change
>> > anything in tokenize_file()?
>>
>> The tokenize file cxt is deleted after a successful load of pg_hba.conf or
>> pg_ident.conf files. we don't need this memory once the pg_hba.conf
>> or pg_ident file is loaded, because of this reason, it is created as a
>> separate context and deleted later.
>>
>
> What about the error case?

Yes, one error case is possible: when the length of a line crosses
the MAX_LINE size.
If we allocate the tokenize file cxt inside CurrentMemoryContext (i.e.
MessageContext)
instead of TopMemoryContext, it will automatically be freed later in case
it still exists.


Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-08 Thread Haribabu Kommi
On Wed, Dec 9, 2015 at 2:36 AM, Amit Kapila  wrote:
>
> Few assorted comments:

Thanks for the review.

> 1.
> + /*
> + * SQL-accessible SRF to return all the settings from the pg_hba.conf
> + * file.
> + */
> + Datum
> + pg_hba_lookup_2args(PG_FUNCTION_ARGS)
> + {
> + return pg_hba_lookup(fcinfo);
> + }
> +
> + /*
> +  * SQL-accessible SRF to return all the settings from the pg_hba.conf
> +  * file.
> +  */
> + Datum
> + pg_hba_lookup_3args(PG_FUNCTION_ARGS)
> + {
> + return pg_hba_lookup(fcinfo);
> + }
>
> I think it is better to have check on number of args in the
> above functions similar to what we have in ginarrayextract_2args.

ginarrayextract_2args is a deprecated function that checks and returns an
error if the user calls it with two arguments. But for the pg_hba_lookup
function, providing two arguments is a valid scenario. A check could only
verify whether the provided number of arguments is two or not. Is it
really required?

> 2.
> +
> + /*
> + * Reload authentication config files too to refresh
> + * pg_hba_conf view data.
> + */
> + if (!load_hba())
> + {
> + ereport(DEBUG1,
> + (errmsg("Falure in reloading pg_hba.conf, pg_hba_conf view may show stale
> information")));
> + load_hba_failure = true;
> + }
> +
> + load_hba_failure = false;
>
> Won't the above code set load_hba_failure as false even in
> failure case.

Fixed.

> 3.
> + Datum
> + pg_hba_lookup(PG_FUNCTION_ARGS)
> + {
> + char *user;
> + char *database;
> + char *address;
> + char*hostname;
> + bool ssl_inuse = false;
> + struct sockaddr_storage addr;
> + hba_lookup_args_mode args_mode = TWO_ARGS_MODE; /* Minimum number of
> arguments */
> +
> + /*
> + * We must use the Materialize mode to be safe against HBA file reloads
> + * while the cursor is open. It's also more efficient than having to look
> + * up our current position in the parsed list every time.
> + */
> + ReturnSetInfo *rsi = (ReturnSetInfo *)fcinfo->resultinfo;
> +
> + if (!superuser())
> + ereport(ERROR,
> + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> + (errmsg("only superuser can view pg_hba.conf settings"))));
>
> To be consistent with other similar messages, it is better to
> start this message with "must be superuser ..", refer other
> similar superuser checks in xlogfuncs.c

Updated as "must be superuser to view".

Attached an updated patch after taking care of the review comments.

Regards,
Hari Babu
Fujitsu Australia


pg_hba_lookup_poc_v6.patch
Description: Binary data



Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-07 Thread Haribabu Kommi
On Sat, Dec 5, 2015 at 3:31 AM, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
> Haribabu Kommi wrote:
>
>> How about as follows?
>>
>> postgres=# select * from pg_hba_lookup('all','all','::1');
>>  line_number | type  | database |  user   |  address  | hostname | method | 
>> options |  mode
>> -+---+--+-+---+--++-+-
>> 84   | local  | ["all"] | ["all"] |   |  | trust  | 
>> {}  | skipped
>> 86   | host   | ["all"] | ["all"] | 127.0.0.1 |  | trust  | 
>> {}  | skipped
>> 88   | host   | ["all"] | ["all"] | ::1   |  | trust  | 
>> {}  | matched
>> (3 rows)
>
> What did you do to the whitespace when posting that table?  I had to
> reformat it pretty heavily to understand what you had.
> Anyway, I think the "mode" column should be right after the line number.
> I assume the "reason" for skipped lines is going to be somewhere in the
> table too.

When I copy-pasted the output from psql, it didn't come through properly, so
I adjusted it to look right, but after sending the mail it looked ugly.

I added the reason column as the last column of the table and moved the mode
column to the second position.

> What happens if a "reject" line is matched?  I hope the lookup
> would terminate there.

Whenever a line matches the given arguments, the function stops
processing further lines.

> What does it mean to query for "all"?  Do you have database and user
> named "all"?  Because otherwise that seems wrong to me; you should be
> able to query for specific databases/users, but not for special
> keywords; maybe I am wrong and there is a use case for this, in which
> case please state what it is.

The 'all' is just passed as the database and user name. In my configuration
I just set every database to match, so I did it that way for a test. There is
no special handling for keywords.

> I see three problems in your code.  One is that the translation of
> auth_method enum to text should be a separate function, not the SQL
> function layer;

Moved into a different function.

>another is that the code to put keywords as JSON object
> values is way too repetitive; the other is that messing with the JSON
> API like that is not nice.  (I don't think we're closed to doing that,
> but that would be a separate discussion).  I think this patch should
> just use the "push value" interface rather than expose add_jsonb.
>
> (I assume the usage of JSON rather than a regular array was already
> discussed and JSON was chosen for some reason.)

The repetitive jsonb object code is moved into helper functions, which are
now used. All jsonb calls are changed to the push-value interface.

Regards,
Hari Babu
Fujitsu Australia


pg_hba_lookup_poc_v5.patch
Description: Binary data



Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-03 Thread Haribabu Kommi
On Fri, Dec 4, 2015 at 7:45 AM, Pavel Stehule  wrote:
>
>
> this tracing can be implemented to main pg_hba processing. When you are
> connect from some specific client - and you can see, why you cannot to
> connect to Postgres

The trace messages to be printed don't reach the client until the
connection succeeds, so the traces may not be useful for clients
trying to find out
why a connection is failing. But they may be useful for administrators.
How about the attached patch?

[kommih@localhost bin]$ ./psql postgres -h ::1
psql (9.6devel)
Type "help" for help.

postgres=#

ServerLog:
NOTICE:  Skipped 84 pg_hba line, because of host connection type.
NOTICE:  Skipped 86 pg_hba line, because of non matching IP.

Regards,
Hari Babu
Fujitsu Australia


hba_trace.patch
Description: Binary data



Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-03 Thread Haribabu Kommi
On Fri, Dec 4, 2015 at 8:05 AM, Alvaro Herrera  wrote:
>> >> Here I attached the patch with the suggested changes.
>> >> Along with line number, I kept the options column also with authentication
>> >> options as a jsonb datatype.
>> >>
>> >> Example output:
>> >>
>> >> postgres=# select pg_hba_lookup('test','all','::1');
>> >> NOTICE:  Skipped 84 Hba line, because of non matching IP.
>> >> NOTICE:  Skipped 86 Hba line, because of non matching database.
>> >> NOTICE:  Skipped 87 Hba line, because of non matching role.
>> >>  pg_hba_lookup
>> >> ---
>> >>  (89,trust,{})
>> >> (1 row)
>> >>
>> >> comments?
>
> I don't like this interface.  It's nice for psql, but everybody else is
> going to lose.  I think these should be reported in the SRF result set
> as well; perhaps add a "mode" column that says "skipped" for such rows,
> and "matched" for the one that, uh, matches.  (Please try calling your
> function with "select * from" which should give nicer output.)
>

How about as follows?

postgres=# select * from pg_hba_lookup('all','all','::1');
 line_number | type  | database |  user   |  address  | hostname | method | options |  mode
-------------+-------+----------+---------+-----------+----------+--------+---------+---------
          84 | local | ["all"]  | ["all"] |           |          | trust  | {}      | skipped
          86 | host  | ["all"]  | ["all"] | 127.0.0.1 |          | trust  | {}      | skipped
          88 | host  | ["all"]  | ["all"] | ::1       |          | trust  | {}      | matched
(3 rows)


In the above case, all the columns are displayed. Based on the
feedback we can keep
only the required columns. I have not yet removed the NOTICE messages in the
attached version.
Are they still required?


Regards,
Hari Babu
Fujitsu Australia


pg_hba_lookup_poc_v4.patch
Description: Binary data



Re: [HACKERS] Parallel Aggregate

2015-12-02 Thread Haribabu Kommi
On Thu, Dec 3, 2015 at 4:18 PM, David Rowley
 wrote:
>
> Hi,
>
> I just wanted to cross post here to mark that I've posted an updated patch
> for combining aggregate states:
> http://www.postgresql.org/message-id/CAKJS1f9wfPKSYt8CG=t271xbymzjrzwqbjeixiqrf-olh_u...@mail.gmail.com
>
> I also wanted to check if you've managed to make any progress on Parallel
> Aggregation? I'm very interested in this myself and would like to progress
> with it, if you're not already doing so.

Yes, the basic parallel aggregate patch is almost ready.
It is based on your earlier combine state patch.
I will post it to the community within a week or so.

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-02 Thread Haribabu Kommi
On Wed, Nov 25, 2015 at 7:18 PM, Pavel Stehule <pavel.steh...@gmail.com> wrote:
>
>
> 2015-11-25 8:05 GMT+01:00 Haribabu Kommi <kommi.harib...@gmail.com>:
>>
>>
>> Thanks. Here I attached the poc patch that returns authentication method
>> of the
>> first matched hba entry in pg_hba.conf with the given input values.
>> Currently these
>> functions returns text type. Based on the details required to be
>> printed, it can
>> be changed.
>>
>> postgres=# select pg_hba_lookup('all', 'all');
>>  pg_hba_lookup
>> ---
>>  trust
>> (1 row)
>>
>> comments for the approach?
>
>
> From my perspective, it shows too less informations.
>
> What I am expecting:
>
> 1. line num of choosed rule
> 2. some tracing - via NOTICE, what and why some rules was skipped.

Here I attached the patch with the suggested changes.
Along with the line number, I kept an options column containing the
authentication options as a jsonb datatype.

Example output:

postgres=# select pg_hba_lookup('test','all','::1');
NOTICE:  Skipped 84 Hba line, because of non matching IP.
NOTICE:  Skipped 86 Hba line, because of non matching database.
NOTICE:  Skipped 87 Hba line, because of non matching role.
 pg_hba_lookup
---
 (89,trust,{})
(1 row)

comments?

Regards,
Hari Babu
Fujitsu Australia


pg_hba_lookup_poc_v3.patch
Description: Binary data



Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-11-24 Thread Haribabu Kommi
On Tue, Nov 17, 2015 at 9:37 AM, Peter Eisentraut <pete...@gmx.net> wrote:
> On 11/16/15 2:37 AM, Haribabu Kommi wrote:
>> On Mon, Nov 16, 2015 at 2:30 PM, Peter Eisentraut <pete...@gmx.net> wrote:
>>> On 7/21/15 5:15 AM, Haribabu Kommi wrote:
>>>> With the output of this view, administrator can identify the lines
>>>> that are matching for the given
>>>> criteria easily without going through the file.
>>>
>>> How is this useful?  I could see the use if you want to debug cases of
>>> user foo on host bar says they can't connect, but you can't impersonate
>>> them to verify it.  But then all you need is a function with a scalar
>>> result, not a result set.
>>
>> Do you mean the function should return true or false based on the connection
>> status with the provided arguments?
>>
>> I also feel difficult to understand the function result as compared to a 
>> view.
>
> An hba lookup is essentially a lookup by user name, database name,
> client address, yielding an authentication method (possibly with
> parameters).  So I think this function should work that way as well:
> arguments are user name, database name, and so on, and the return value
> is an authentication method.  Maybe it would be some kind of record,
> with line number and some parameters.
>
> That would address the use case I put forth above.  I don't know whether
> that's what you were going for.

Thanks. Here I attached the POC patch that returns the authentication method
of the
first matching HBA entry in pg_hba.conf for the given input values.
Currently these
functions return the text type. Based on the details required to be
printed, this can
be changed.

postgres=# select pg_hba_lookup('all', 'all');
 pg_hba_lookup
---
 trust
(1 row)

comments for the approach?

Regards,
Hari Babu
Fujitsu Australia


pg_hba_lookup_poc_v2.patch
Description: Binary data



Re: [HACKERS] [PATCH] Skip ALTER x SET SCHEMA if the schema didn't change

2015-11-17 Thread Haribabu Kommi
On Wed, Nov 18, 2015 at 6:02 AM, Robert Haas  wrote:
> On Mon, Nov 16, 2015 at 4:27 AM, Marti Raudsepp  wrote:
>> Thank you so much for the review and patch update. I should have done that
>> myself, but I've been really busy for the last few weeks. :(
>
> Maybe I'm having an attack of the stupids today, but it looks to me
> like the changes to pg_constraint.c look awfully strange to me.  In
> the old code, if object_address_present() returns true, we continue,
> skipping the rest of the loop.  In the new code, we instead set
> alreadyChanged to true.  That causes both of the following if
> statements, as revised, to fall out, so that we skip the rest of the
> loop.  Huh?  Wouldn't a one line change to add oldNspId != newNspId to
> the criteria for a simple_heap_update be just as good?

Yes, that's correct; the above change can be written as you suggested.
An updated patch is attached with the correction.

> Backing up a bit, maybe we should be a bit more vigorous in treating a
> same-namespace move as a no-op.  That is, don't worry about calling
> the post-alter hook in that case - just have AlterConstraintNamespaces
> start by checking whether oldNspId == newNspid right at the top; if
> so, return.  The patch seems to have the idea that it is important to
> call the post-alter hook even in that case, but I'm not sure whether
> that's true.  I'm not sure it's false, but I'm also not sure it's
> true.

I am also not sure whether calling the post-alter hook is necessarily
required in the constraint case, but it was being done for the other
objects, so I suggested it that way.

Regards,
Hari Babu
Fujitsu Australia


0001-Skip-ALTER-x-SET-SCHEMA-if-the-schema-didn-t-change_v3.patch
Description: Binary data



Re: [HACKERS] [PATCH] Skip ALTER x SET SCHEMA if the schema didn't change

2015-11-15 Thread Haribabu Kommi
On Thu, Nov 5, 2015 at 10:20 PM, Haribabu Kommi
<kommi.harib...@gmail.com> wrote:
> On Tue, Sep 29, 2015 at 12:17 AM, Marti Raudsepp <ma...@juffo.org> wrote:
>> Hi list
>>
>> The attached patch changes the behavior of multiple ALTER x SET SCHEMA
>> commands, to skip, rather than fail, when the old and new schema is
>> the same.
>>
>> The advantage is that it's now easier to write DDL scripts that are 
>> indempotent.
>>
>> This already matches the behavior of ALTER EXTENSION SET SCHEMA in
>> earlier versions, as well as many other SET-ish commands, e.g. ALTER
>> TABLE SET TABLESPACE, OWNER TO, CLUSTER ON, SET (storage_parameter...)
>> etc. I don't see why SET SCHEMA should be treated any differently.
>>
>> The code is written such that object_access_hook is still called for
>> each object.
>>
>> Regression tests included. I couldn't find any documentation that
>> needs changing.
>
> I went through the patch, following are my observations,
>
> Patch applied with hunks and compiled with out warnings.
> Basic tests are passed.
>
> In AlterTableNamespaceInternal function, if a table or matview called
> for set schema,
> If the object contains any constraints, the constraint gets updated
> with new schema.
>
> In AlterTypeNamespaceInternal function, the InvokeObjectPostAlterHook function
> doesn't get called if the type is of composite type, domain and array
> types as because
> it just returns from top of the function.

Most of the community members didn't find any problem with changing the
behavior, so here I attached an updated patch with the above two corrections.

Regards,
Hari Babu
Fujitsu Australia


0001-Skip-ALTER-x-SET-SCHEMA-if-the-schema-didn-t-change_v2.patch
Description: Binary data



Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-11-15 Thread Haribabu Kommi
On Mon, Nov 16, 2015 at 2:30 PM, Peter Eisentraut <pete...@gmx.net> wrote:
> On 7/21/15 5:15 AM, Haribabu Kommi wrote:
>> With the output of this view, administrator can identify the lines
>> that are matching for the given
>> criteria easily without going through the file.
>
> How is this useful?  I could see the use if you want to debug cases of
> user foo on host bar says they can't connect, but you can't impersonate
> them to verify it.  But then all you need is a function with a scalar
> result, not a result set.

Do you mean the function should return true or false based on whether a
connection with the provided arguments would succeed?

I also find a function result harder to understand than a view.

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] [PATCH] Skip ALTER x SET SCHEMA if the schema didn't change

2015-11-05 Thread Haribabu Kommi
On Tue, Sep 29, 2015 at 12:17 AM, Marti Raudsepp  wrote:
> Hi list
>
> The attached patch changes the behavior of multiple ALTER x SET SCHEMA
> commands, to skip, rather than fail, when the old and new schema is
> the same.
>
> The advantage is that it's now easier to write DDL scripts that are 
> indempotent.
>
> This already matches the behavior of ALTER EXTENSION SET SCHEMA in
> earlier versions, as well as many other SET-ish commands, e.g. ALTER
> TABLE SET TABLESPACE, OWNER TO, CLUSTER ON, SET (storage_parameter...)
> etc. I don't see why SET SCHEMA should be treated any differently.
>
> The code is written such that object_access_hook is still called for
> each object.
>
> Regression tests included. I couldn't find any documentation that
> needs changing.

I went through the patch; following are my observations.

The patch applied with hunks and compiled without warnings.
Basic tests passed.

In the AlterTableNamespaceInternal function, when SET SCHEMA is called for
a table or matview,
any constraints the object contains are updated
with the new schema.

In the AlterTypeNamespaceInternal function, InvokeObjectPostAlterHook
doesn't get called for composite types, domains, and array
types, because
the function just returns near its top for those cases.


Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] NOTIFY in Background Worker

2015-11-05 Thread Haribabu Kommi
On Fri, Nov 6, 2015 at 4:57 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Thu, Nov 5, 2015 at 12:34 AM, Haribabu Kommi
> <kommi.harib...@gmail.com> wrote:
>> I marked this patch as ready for committer.
>
> The patch says:
>
> If a background worker registers to receive asynchronous notifications
> with the LISTEN through SPI,
> there is currently no way for incoming notifications to be received.
>
> But wouldn't it be more correct to say:
>
> If a background worker registers to receive asynchronous notifications
> with the LISTEN through SPI, the
> worker will log those notifications, but there is no programmatic way
> for the worker to intercept and respond to those notifications.

Yes, the above description is good.

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] NOTIFY in Background Worker

2015-11-04 Thread Haribabu Kommi
On Sat, Aug 29, 2015 at 12:55 PM, Thomas Munro
 wrote:
> On Sat, Aug 29, 2015 at 9:03 AM, Thomas Munro
>  wrote:
>>
>> On Fri, Aug 28, 2015 at 10:30 PM, jacques klein
>>  wrote:
>>>
>>> Hello,
>>>
>>> I added a "NOFITY chan" to the SQL arg of an SPI_execute(), (I did it
>>> also with just the NOTIFY statement),
>>> but the listeners (other workers) don't get the notification until a
>>> "NOTIFY chan" is done for example with pgadmin,
>>>
>>> They don't get lost, just not emited after the "not forgotten" call of
>>> CommitTransactionCommand().
>>>
>>> Is this normal ( i.e. not supported (yet) ), a bug, or did I overlook
>>> some doc. (or source code) ?.
>>>
>>> For now, I will try to "emit" the NOTIFY via libpq.
>>
>>
>> That's because ProcessCompletedNotifies isn't being called.  For regular
>> backends it is called inside the top level loop PostgresMain.  I think you
>> need to include "commands/async.h" and add a call to
>> ProcessCompletedNotifies() after your background worker commits to make this
>> work.
>
>
> For the record, Jacques confirmed off-list that this worked, and I also did
> a couple of tests.
>
> Is this expected?  If so, should it be documented -- perhaps with something
> like the attached?  Alternatively there may be some way to make
> CommitTransactionCommand do it, though the comments in
> ProcessCompletedNotifies explain why that was rejected, at least as far as
> AtCommit_Notify goes.
>
> This made me wonder what happens if a background worker calls LISTEN.
> NotifyMyFrontEnd simply logs the notifications, since there is no remote
> libpq to sent a message to.  Perhaps a way of delivering to background
> workers could be developed, though of course there are plenty of other kinds
> of IPC available already.

Yes, NOTIFY command execution is possible in a background worker with a
call to the ProcessCompletedNotifies
function, and LISTEN is not usable in a
background worker because no client is associated with it.
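
For reference, a minimal sketch of the pattern discussed in this thread: a
background worker (already connected to a database) issuing NOTIFY through
SPI and then calling ProcessCompletedNotifies() itself, since no
PostgresMain() loop will do it for it. Names and flow are illustrative:

#include "postgres.h"
#include "access/xact.h"
#include "commands/async.h"
#include "executor/spi.h"

static void
bgworker_notify_once(void)
{
	StartTransactionCommand();
	SPI_connect();
	SPI_execute("NOTIFY chan", false, 0);
	SPI_finish();
	CommitTransactionCommand();

	/* normally done by the top-level loop in PostgresMain() */
	ProcessCompletedNotifies();
}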

The documentation patch gives the user a better understanding of the
NOTIFY and LISTEN commands.

I marked this patch as ready for committer.

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Parallel Seq Scan

2015-11-04 Thread Haribabu Kommi
On Tue, Nov 3, 2015 at 9:41 PM, Amit Kapila  wrote:
> On Fri, Oct 23, 2015 at 4:41 PM, Amit Kapila 
> wrote:
>>
>> On Fri, Oct 23, 2015 at 10:33 AM, Robert Haas 
>> wrote:
>
> Please find the rebased partial seq scan patch attached with this
> mail.
>
> Robert suggested me off list that we should once try to see if we
> can use Seq Scan node instead of introducing a new Partial Seq Scan
> node. I have analyzed to see if we can use the SeqScan node (containing
> parallel flag) instead of introducing new partial seq scan and found that
> we primarily need to change most of the functions in nodeSeqScan.c to
> have a parallel flag check and do something special for Partial Seq Scan
> and apart from that we need special handling in function
> ExecSupportsBackwardScan().  In general, I think we can make
> SeqScan node parallel-aware by having some special paths without
> introducing much complexity and that can save us code-duplication
> between nodeSeqScan.c and nodePartialSeqScan.c.  One thing that makes
> me slightly uncomfortable with this approach is that for partial seq scan,
> currently the plan looks like:
>
> QUERY PLAN
> --
>  Gather  (cost=0.00..2588194.25 rows=9990667 width=4)
>Number of Workers: 1
>->  Partial Seq Scan on t1  (cost=0.00..89527.51 rows=9990667 width=4)
>  Filter: (c1 > 1)
> (4 rows)
>
> Now instead of displaying Partial Seq Scan, if we just display Seq Scan,
> then it might confuse user, so it is better to add some thing indicating
> parallel node if we want to go this route.

IMO, the change from Partial Seq Scan to Seq Scan may not confuse users
if we clearly specify in the documentation that all plans under a Gather
node are parallel plans.

This works for execution nodes that execute fully under a Gather node.
The same is not possible for parallel aggregates, so we still have to
label the aggregate node below the Gather node as partial.

I feel this suggestion arises maybe because of some duplicate code between
Partial Seq Scan and Seq Scan. How about using only the Seq Scan node, but
displaying it as Partial Seq Scan based on a flag stored in the plan, as
sketched below? This avoids the need to add
new plan nodes.
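
A sketch of that idea; the is_partial field is hypothetical (today SeqScan
is just a typedef of Scan), shown only to illustrate the shape of the
change:

typedef struct SeqScan
{
	Scan	scan;			/* existing scan fields */
	bool	is_partial;		/* if true, EXPLAIN displays "Partial Seq Scan" */
} SeqScan;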


Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] NOTIFY in Background Worker

2015-11-02 Thread Haribabu Kommi
On Sat, Aug 29, 2015 at 12:55 PM, Thomas Munro
 wrote:
> On Sat, Aug 29, 2015 at 9:03 AM, Thomas Munro
>  wrote:
>>
>> On Fri, Aug 28, 2015 at 10:30 PM, jacques klein
>>  wrote:
>>>
>>> Hello,
>>>
>>> I added a "NOFITY chan" to the SQL arg of an SPI_execute(), (I did it
>>> also with just the NOTIFY statement),
>>> but the listeners (other workers) don't get the notification until a
>>> "NOTIFY chan" is done for example with pgadmin,
>>>
>>> They don't get lost, just not emited after the "not forgotten" call of
>>> CommitTransactionCommand().
>>>
>>> Is this normal ( i.e. not supported (yet) ), a bug, or did I overlook
>>> some doc. (or source code) ?.
>>>
>>> For now, I will try to "emit" the NOTIFY via libpq.
>>
>>
>> That's because ProcessCompletedNotifies isn't being called.  For regular
>> backends it is called inside the top level loop PostgresMain.  I think you
>> need to include "commands/async.h" and add a call to
>> ProcessCompletedNotifies() after your background worker commits to make this
>> work.
>
>
> For the record, Jacques confirmed off-list that this worked, and I also did
> a couple of tests.
>
> Is this expected?  If so, should it be documented -- perhaps with something
> like the attached?  Alternatively there may be some way to make
> CommitTransactionCommand do it, though the comments in
> ProcessCompletedNotifies explain why that was rejected, at least as far as
> AtCommit_Notify goes.
>
> This made me wonder what happens if a background worker calls LISTEN.
> NotifyMyFrontEnd simply logs the notifications, since there is no remote
> libpq to sent a message to.  Perhaps a way of delivering to background
> workers could be developed, though of course there are plenty of other kinds
> of IPC available already.

With commit bde39eed0cafb82bc94c40e95d96b5cf47b6f719, it is no longer
possible to execute NOTIFY commands inside a parallel worker. Can't we
change it to disable
both LISTEN and NOTIFY commands inside a background worker?

Regards,
Hari Babu
Fujitsu Australia




[HACKERS] security_barrier view option type mistake in create view document

2015-10-28 Thread Haribabu Kommi
The security_barrier view option is classified as a string in the CREATE
VIEW documentation,
but it is actually a boolean. The type is mentioned correctly under ALTER
VIEW. Here I attached
a patch with the correction.

-security_barrier (string)
+security_barrier (boolean)

Regards,
Hari Babu
Fujitsu Australia


security_barrier_type.patch
Description: Binary data



Re: [HACKERS] Multi-tenancy with RLS

2015-10-26 Thread Haribabu Kommi
On Wed, Oct 21, 2015 at 2:42 PM, Haribabu Kommi
<kommi.harib...@gmail.com> wrote:
> Pending items:
> 1. Need to add some more tests to verify all database catalog tables.
> 2. Documentation changes for database catalog tenancy.

Here I attached the updated database-catalog-security patch with more tests,
including system views
and information schema views, and documentation.

>Known issues:
>2. If user (U2) executes a query on an object (tbl2) which the user
>(U2) don't have
>permissions, as he cannot able to see that object from catalog 
> views/tables,
>but the query returns an error message as "permission denied", but in case
>if multi-tenancy is enabled, the error message should be "relation
>doesn't exist".

To handle the above problem, we can add a check in all *_aclmask or
similar functions to verify whether row-level security is enabled on the
corresponding catalog relation. Based on the ACL result, if row security
is enabled, throw an "object does not exist" error instead of the
"permission denied" error from the aclcheck_error function. This will add
extra processing time for queries, irrespective of whether
multi-tenancy is enabled.
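
A rough sketch of that check; catalog_has_row_security() is a hypothetical
helper, not an existing function:

static void
check_catalog_access(Oid relid, const char *relname, AclMode mask)
{
	if (mask == 0 && catalog_has_row_security(relid))
		ereport(ERROR,
				(errcode(ERRCODE_UNDEFINED_TABLE),
				 errmsg("relation \"%s\" does not exist", relname)));
	else if (mask == 0)
		aclcheck_error(ACLCHECK_NO_PRIV, ACL_KIND_CLASS, relname);
}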

comments?

Regards,
Hari Babu
Fujitsu Australia


4_database_catalog_tenancy_v2.patch
Description: Binary data



Re: [HACKERS] On columnar storage (2)

2015-10-21 Thread Haribabu Kommi
On Tue, Sep 1, 2015 at 8:53 AM, Alvaro Herrera  wrote:
> As discussed in
> https://www.postgresql.org/message-id/20150611230316.gm133...@postgresql.org
> we've been working on implementing columnar storage for Postgres.
> Here's some initial code to show our general idea, and to gather
> comments related to what we're building.  This is not a complete patch,
> and we don't claim that it works!  This is in very early stages, and we
> have a lot of work to do to get this in working shape.
>
> This was proposed during the Developer's Unconference in Ottawa earlier
> this year.  While some questions were raised about some elements of our
> design, we don't think they were outright objections, so we have pressed
> forward on the expectation that any limitations can be fixed before this
> is final if they are critical, or in subsequent commits if not.
>
> The commit messages for each patch should explain what we've done in
> enough technical detail, and hopefully provide a high-level overview of
> what we're developing.
>
> The first few pieces are "ready for comment" -- feel free to speak up
> about the catalog additions, the new COLUMN STORE bits we added to the
> grammar, the way we handle column stores in the relcache, or the
> mechanics to create column store catalog entries.
>
> The later half of the patch series is much less well cooked yet; for
> example, the colstore_dummy module is just a simple experiment to let us
> verify that the API is working.  The planner and executor code are
> mostly stubs, and we are not yet sure of what are the executor nodes
> that we would like to have: while we have discussed this topic
> internally a lot, we haven't yet formed final opinions, and of course
> the stub implementations are not doing the proper things, and in many
> cases they are even not doing anything at all.


Fujitsu is also interested in implementing a columnar storage extension.
First we thought of implementing this extension using index access methods.
The following is the basic design idea of the columnar extension; it may
need to be redesigned according to columnar access methods.

The idea is to create a vertical columnar index (VCI) on a table for the
specified columns that need to be stored in columnar format. To provide a
performance benefit for both read and write operations, the data is stored
in two formats: 1) write optimized storage (WOS), and 2) read optimized
storage (ROS). This is useful for workloads where newly added data has a
good chance of being modified.

Because of the two storages, we need to maintain multiple entries in the
pg_class table:
one for the WOS, and one for each column in columnar storage.

Insert:

In write optimized storage, the data of all the columns that are part of
the VCI is stored in a row-wise format. All newly added data is stored in
the WOS relation along with its xmin/xmax information. If the user wants
to update or delete the newly added data, this doesn't affect performance
much compared to
deleting the data from columnar storage.

Tuples that no longer have multiple copies, or whose data is frozen, are
moved from WOS to ROS periodically by a background worker process or the
autovacuum process. Each column's data is stored separately in its own
relation file. There is no transaction information present in ROS; the
data in ROS is
referenced by tuple ID.

In this approach, the column data is present in both the heap and columnar
storage, whereas with columnar access methods the column data is not
present in the heap.

Select:

Because of the two storage formats, during a select operation the data in
WOS is converted into a local ROS for the statement being executed. The
conversion cost depends on the number of tuples present in the WOS file.
This may add some performance overhead to select statements.

Delete:

During a delete operation, whenever data is deleted in the heap, the
corresponding data in the WOS file is marked as deleted at the same time,
just as in the heap. But if the data has already been migrated from WOS to
ROS, then we maintain a delete vector storing the tuple ID, transaction
information, and so on.
When data is read from a ROS file, it is checked against the delete
vector to confirm
whether the record is visible or not. All the delete vector data is
applied to ROS
periodically.
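
A hypothetical layout for one delete-vector entry as described above; the
struct and field names are illustrative only, not from an actual patch:

#include "postgres.h"
#include "storage/itemptr.h"

typedef struct VciDeleteVectorEntry
{
	ItemPointerData	tid;	/* tuple ID of the deleted row in ROS */
	TransactionId	xmax;	/* transaction that deleted the row */
	CommandId		cmax;	/* command within that transaction */
} VciDeleteVectorEntry;

Visibility of a ROS row would then be decided by looking up its tuple ID
here and testing the stored transaction information against the snapshot.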

The concept of the columnar extension is from Fujitsu Labs, Japan.
Any comments on further evaluation of this approach with respect to
columnar access
methods?

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Multi-tenancy with RLS

2015-10-20 Thread Haribabu Kommi
On Sat, Oct 10, 2015 at 1:54 AM, Stephen Frost <sfr...@snowman.net> wrote:
> * Haribabu Kommi (kommi.harib...@gmail.com) wrote:
>> On Fri, Oct 9, 2015 at 2:04 PM, Stephen Frost <sfr...@snowman.net> wrote:
>> > * Robert Haas (robertmh...@gmail.com) wrote:
>> >> We've got one reloption for views already - security_barrier.  Maybe
>> >> we could have another one that effectively changes a particular view
>> >> from "security definer" as it is today to "security invoker".
>> >
>> > As I recall, there was a previous suggestion (honestly, I thought it was
>> > your idea) to have a reloption which made views "fully" security
>> > definer, in that functions in the view definition would run as the view
>> > owner instead of the view invoker.
>> >
>> > I liked that idea, though we would need to have a function to say "who
>> > is the 'outer' user?" (CURRENT_USER always being the owner with the
>> > above described reloption).
>> >
>> > I'm less sure about the idea of having a view which runs entirely as the
>> > view invoker, but I'm not against it either.
>>
>> I changed in function check_enable_rls to use the invoker id instead of 
>> owner id
>> for all the system objects, the catalog table policies are getting
>> applied and it is
>> working fine till now in my multi-tenancy testing.
>>
>> Currently I am writing tests to validate it against all user objects also.
>> If this change works for all user objects also, then we may not needed
>> the security invoker
>> reloption.
>
> The reloption would be to allow the user to decide which behavior they
> wanted, as there are use-cases for both.


Any_privilege_option:
Patch that adds 'any' as a privilege option, to verify whether the user has
any privilege on the object instead of specifying each and every privilege
type that the object supports. Using this option in GRANT and REVOKE
commands throws an error.
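
A hedged usage sketch; has_table_privilege() is the existing function, and
'any' is the option this patch adds (behavior as described above):

-- True if user u2 holds at least one privilege of any type on tbl1:
SELECT has_table_privilege('u2', 'tbl1', 'any');

-- Expected to fail: 'any' is not accepted by GRANT/REVOKE:
GRANT any ON tbl1 TO u2;   -- ERROR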

View_security_definer:
Patch that adds "security_definer" as a view option to specify whether the
view owner needs to be used for all operations on the view, otherwise the
current user is used.

Currently, by default, the view owner is used for all privilege checks, so
changing it to the invoker instead of the owner leads to backward
compatibility problems, such as "permission denied" on the base relation. To
minimize the impact, the invoker id is currently used only when the view is
rewritten to the base relation: 1) for updatable views, and 2) while
applying the row security policies to the base relations.

Instead of the above change, if we treat all views by default as security
definer, then to support multi-tenancy we need to change all the system
views to security_definer=false.
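
An illustrative use of the proposed reloption (the view definition itself is
only an example):

-- A view that should run with the privileges of the caller:
CREATE VIEW my_tables WITH (security_definer = false) AS
    SELECT relname FROM pg_class WHERE relkind = 'r';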

Comments?

shared_catalog_tenancy:
Patch that adds an initdb option, -C or --shared-catalog-security, to add
row level security policies on the shared catalog tables that are eligible
for tenancy. With this option, the user gets tenancy at the database level,
meaning a user can see only the databases on which he has some privilege,
not all of them. It is not possible to disable shared catalog security once
it is set at initdb time.


database_catalog_tenancy:
Patch that adds a database option, "catalog security". This can be used only
with the ALTER DATABASE command, not with the CREATE DATABASE command. With
this option, the user gets tenancy at the table level. Once the user enables
catalog security at the database level, row level security policies are
created on the catalog tables that are eligible. The user can disable
catalog security if he wants.
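
Conceptually, the policies created internally would resemble the following
hand-written sketch (illustrative only: the policy name and predicate are
assumptions, and attaching policies to system catalogs is something the
patch does internally, not something plain SQL allows):

-- A catalog row is visible if the current user owns the object or
-- holds some privilege on it (reusing the 'any' option from above):
CREATE POLICY tenancy_policy ON pg_class
    USING (pg_has_role(relowner, 'USAGE')
           OR has_table_privilege(oid, 'any'));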

Known issues:
1. If user (U1) grants permissions on an object (tbl1) to user (U2), then U2
can learn that a user (U1) exists in the system, but U2 cannot get the
details of U1.

2. If user (U2) executes a query on an object (tbl2) on which U2 doesn't
have permissions, U2 cannot see that object in the catalog views/tables, yet
the query returns a "permission denied" error message. With multi-tenancy
enabled, the error message should instead be "relation doesn't exist".

Pending items:
1. Need to add some more tests to verify all database catalog tables.
2. Documentation changes for database catalog tenancy.

Regards,
Hari Babu
Fujitsu Australia


1_any_privilege_option_v1.patch
Description: Binary data


2_view_security_definer_v1.patch
Description: Binary data


3_shared_catalog_tenancy_v1.patch
Description: Binary data


4_database_catalog_tenancy_v1.patch
Description: Binary data



Re: [HACKERS] Parallel Seq Scan

2015-10-15 Thread Haribabu Kommi
On Thu, Oct 15, 2015 at 6:32 PM, Amit Kapila wrote:
> On Wed, Oct 14, 2015 at 3:29 AM, Robert Haas wrote:
> I think this got messed up while rebasing on top of Gather node
> changes, but nonetheless, I have changed it such that PartialSeqScan
> node handling is after SeqScan.

Currently, the EXPLAIN ANALYZE output of a parallel seq scan plan does not
show the number of workers allocated, nor the number of workers planned. I
feel this information is good for users in understanding the performance
difference that comes with parallel seq scan. It may have been missed in the
recent patch series. It was discussed in [1].

Currently there is no qualification evaluation at the Result and Gather
nodes; because of this, any query that contains a parallel restricted
function is not chosen for a parallel scan. For the same reason, there is
currently no difference between parallel restricted and parallel unsafe
functions. Is this fine for the first version?

[1]
http://www.postgresql.org/message-id/ca+tgmobhq0_+yobmlbjexvt4qef6xblfudax1owl-ivgan5...@mail.gmail.com
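
To illustrate the restricted-function point above, a hedged sketch (the
PARALLEL RESTRICTED labeling syntax shown is the one that was eventually
committed; the labeling mechanism in this patch series may differ):

-- plus_one() is parallel restricted: safe to run in the leader, but
-- not in a worker.  With no qual evaluation at the Gather node, the
-- query below is not chosen for a parallel scan, the same outcome as
-- if plus_one() were parallel unsafe:
CREATE FUNCTION plus_one(int) RETURNS int
    LANGUAGE sql PARALLEL RESTRICTED AS 'SELECT $1 + 1';

SELECT * FROM f WHERE plus_one(num) > 100;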


Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] Parallel Seq Scan

2015-10-15 Thread Haribabu Kommi
On Fri, Oct 16, 2015 at 2:10 PM, Haribabu Kommi
<kommi.harib...@gmail.com> wrote:
> On Thu, Oct 15, 2015 at 11:45 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>> On Thu, Oct 15, 2015 at 5:39 PM, Haribabu Kommi <kommi.harib...@gmail.com>
>> wrote:
>>>
>>>
>>>
>>> On Thu, Oct 15, 2015 at 6:32 PM, Amit Kapila <amit.kapil...@gmail.com>
>>> wrote:
>>> > On Wed, Oct 14, 2015 at 3:29 AM, Robert Haas <robertmh...@gmail.com>
>>> > wrote:
>>> > I think this got messed up while rebasing on top of Gather node
>>> > changes, but nonetheless, I have changed it such that PartialSeqScan
>>> > node handling is after SeqScan.
>>>
>>> Currently, the explain analyze of parallel seq scan plan is not showing
>>> the allocated number of workers
>>> including the planned workers.I feel this information is good for users in
>>> understanding the performance
>>> difference that is coming with parallel seq scan. It may be missed in
>>> recent patch series. It was discussed
>>> in[1].
>>>
>>
>> I am aware of that and purposefully kept it for a consecutive patch.
>> There are other things as well which I have left out from this patch
>> and those are:
>> a. Early stop of executor for Rescan purpose
>> b. Support of pushdown for plans containing InitPlan and SubPlans
>>
>> Then there is more related work like
>> a. Support for prepared statements
>>
>
> OK.
>
> During the test with latest patch, I found a dead lock between worker
> and backend
> on relation lock. To minimize the test scenario, I changed the number
> of pages required
> to start one worker to 1 and all parallel cost parameters as zero.
>
> Backend is waiting for the tuples from workers, workers are waiting on
> lock of relation.
> Attached is the sql script that can reproduce this issue.

Some more tests that failed with similar configuration settings:
1. A table that is created inside a BEGIN block is not visible in the worker.
2. A permission problem on the worker side for the SET ROLE command.


Regards,
Hari Babu
Fujitsu Australia


parallel_table_doesn't_exist_problem.sql
Description: Binary data


parallel_set_role_permission_problem.sql
Description: Binary data



Re: [HACKERS] Parallel Seq Scan

2015-10-15 Thread Haribabu Kommi
On Thu, Oct 15, 2015 at 11:45 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Thu, Oct 15, 2015 at 5:39 PM, Haribabu Kommi <kommi.harib...@gmail.com>
> wrote:
>>
>>
>>
>> On Thu, Oct 15, 2015 at 6:32 PM, Amit Kapila <amit.kapil...@gmail.com>
>> wrote:
>> > On Wed, Oct 14, 2015 at 3:29 AM, Robert Haas <robertmh...@gmail.com>
>> > wrote:
>> > I think this got messed up while rebasing on top of Gather node
>> > changes, but nonetheless, I have changed it such that PartialSeqScan
>> > node handling is after SeqScan.
>>
>> Currently, the explain analyze of parallel seq scan plan is not showing
>> the allocated number of workers
>> including the planned workers.I feel this information is good for users in
>> understanding the performance
>> difference that is coming with parallel seq scan. It may be missed in
>> recent patch series. It was discussed
>> in[1].
>>
>
> I am aware of that and purposefully kept it for a consecutive patch.
> There are other things as well which I have left out from this patch
> and those are:
> a. Early stop of executor for Rescan purpose
> b. Support of pushdown for plans containing InitPlan and SubPlans
>
> Then there is more related work like
> a. Support for prepared statements
>

OK.

During testing with the latest patch, I found a deadlock between a worker
and the backend on a relation lock. To minimize the test scenario, I changed
the number of pages required to start one worker to 1, and set all parallel
cost parameters to zero.
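
Roughly, the cost settings look like the following sketch (the exact GUC
names in this patch series may differ from the later-committed
parallel_setup_cost/parallel_tuple_cost):

SET parallel_setup_cost = 0;   -- pretend worker startup is free
SET parallel_tuple_cost = 0;   -- pretend tuple transfer is free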

The backend is waiting for tuples from the workers, while the workers are
waiting on a lock on the relation. Attached is an SQL script that can
reproduce this issue.

Regards,
Hari Babu
Fujitsu Australia


parallel_hang_with_cluster.sql
Description: Binary data



Re: [HACKERS] Parallel Aggregate

2015-10-13 Thread Haribabu Kommi
On Tue, Oct 13, 2015 at 5:53 PM, David Rowley
<david.row...@2ndquadrant.com> wrote:
> On 13 October 2015 at 17:09, Haribabu Kommi <kommi.harib...@gmail.com>
> wrote:
>>
>> On Tue, Oct 13, 2015 at 12:14 PM, Robert Haas <robertmh...@gmail.com>
>> wrote:
>> > Also, I think the path for parallel aggregation should probably be
>> > something like FinalizeAgg -> Gather -> PartialAgg -> some partial
>> > path here.  I'm not clear whether that is what you are thinking or
>> > not.
>>
>> No. I am thinking of the following way.
>> Gather->partialagg->some partial path
>>
>> I want the Gather node to merge the results coming from all workers,
>> otherwise
>> it may be difficult to merge at parent of gather node. Because in case
>> the partial
>> group aggregate is under the Gather node, if any of two workers are
>> returning
>> same group key data, we need to compare them and combine it to make it a
>> single group. If we are at Gather node, it is possible that we can
>> wait till we get
>> slots from all workers. Once all workers returns the slots we can compare
>> and merge the necessary slots and return the result. Am I missing
>> something?
>
>
> My assumption is the same as Robert's here.
> Unless I've misunderstood, it sounds like you're proposing to add logic into
> the Gather node to handle final aggregation? That sounds like a modularity
> violation of the whole node concept.
>
> The handling of the final aggregate stage is not all that different from the
> initial aggregate stage. The primary difference is just that your calling
> the combine function instead of the transition function, and the values

Yes, you are correct. Till now I was thinking of using transition types as
the approach; for that reason I proposed the Gather node to handle the
finalize aggregation.

> being aggregated are aggregates states rather than the type of the values
> which were initially aggregated. The handling of GROUP BY is all the same,
> yet you only apply the HAVING clause during final aggregation. This is why I
> ended up implementing this in nodeAgg.c instead of inventing some new node
> type that's mostly a copy and paste of nodeAgg.c [1]

After going through your Partial Aggregation / GROUP BY before JOIN patch,
the following is my understanding of parallel aggregate:

Finalize [Hash] Aggregate
  -> Gather
        -> Partial [Hash] Aggregate

The data that comes from the Gather node contains the group key and the
partial grouping results. Based on these, in the hash aggregate case we can
build another hash table at the finalize aggregate step and return the final
results. This approach works for both plain and hash aggregates.
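
A self-contained SQL illustration of the two-stage idea (no parallel
infrastructure needed: the WHERE clauses merely stand in for the rows each
worker would see, and avg(x) is decomposed into a combinable (count, sum)
state):

-- t(g, x): g is the grouping key, x an integer value to average.
SELECT g,
       sum(part_sum)::numeric / sum(part_cnt) AS avg_x  -- finalize step
FROM (
    SELECT g, count(x) AS part_cnt, sum(x) AS part_sum  -- "worker 1"
    FROM t WHERE x % 2 = 0
    GROUP BY g
    UNION ALL
    SELECT g, count(x), sum(x)                          -- "worker 2"
    FROM t WHERE x % 2 = 1
    GROUP BY g
) AS partial_states
GROUP BY g;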

For group aggregate support of parallel aggregate, the plan should be as
follows:

Finalize Group Aggregate
  -> Sort
        -> Gather
              -> Partial Group Aggregate
                    -> Sort

The data that comes from the Gather node needs to be sorted again based on
the grouping key, then the data is merged to generate the final grouping
result.

With this approach, we don't need to change anything in the Gather node. Is
my understanding correct?

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Parallel Seq Scan

2015-10-12 Thread Haribabu Kommi
On Mon, Oct 5, 2015 at 11:20 PM, Amit Kapila  wrote:
> For now, I have fixed this by not preserving the startblock incase of rescan
> for parallel scan. Note that, I have created a separate patch
> (parallel_seqscan_heaprescan_v1.patch) for support of rescan (for parallel
> scan).

While testing parallel seqscan, my colleague Jing Wang found a problem in
parallel_seqscan_heapscan_v2.patch.

In function initscan, the allow_sync flag is set to false because the number
of pages in the table is less than NBuffers/4:

if (!RelationUsesLocalBuffers(scan->rs_rd) &&
    scan->rs_nblocks > NBuffers / 4)

As the allow_sync flag is false, the function
heap_parallelscan_initialize_startblock is not called in initscan to
initialize the parallel_scan->phs_cblock parameter. Because of this, while
getting the next page in heap_parallelscan_nextpage, InvalidBlockNumber is
returned, and the scan ends without returning any results.


Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Parallel Aggregate

2015-10-12 Thread Haribabu Kommi
On Tue, Oct 13, 2015 at 12:14 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Sun, Oct 11, 2015 at 10:07 PM, Haribabu Kommi
> <kommi.harib...@gmail.com> wrote:
>> Parallel aggregate is the feature doing the aggregation job parallel
>> with the help of Gather and
>> partial seq scan nodes. The following is the basic overview of the
>> parallel aggregate changes.
>>
>> Decision phase:
>>
>> Based on the following conditions, the parallel aggregate plan is generated.
>>
>> - check whether the below plan node is Gather + partial seq scan only.
>>
>> This is because to check whether the plan nodes that are present are
>> aware of parallelism or not?
>
> This is really not the right way of doing this.  We should do
> something more general.  Most likely, parallel aggregate should wait
> for Tom's work refactoring the upper planner to use paths.  But either
> way, it's not a good idea to limit ourselves to parallel aggregation
> only in the case where there is exactly one base table.

Ok. Thanks for the details.

> One of the things I want to do pretty early on, perhaps in time for
> 9.6, is create a general notion of partial paths.  A Partial Seq Scan
> node creates a partial path.  A Gather node turns a partial path into
> a complete path.  A join between a partial path and a complete path
> creates a new partial path.  This concept lets us consider,
> essentially, pushing joins below Gather nodes.  That's quite powerful
> and could make Partial Seq Scan applicable to a much broader variety
> of use cases.  If there are worthwhile partial paths for the final
> joinrel, and aggregation of that joinrel is needed, we can consider
> parallel aggregation using that partial path as an alternative to
> sticking a Gather node on there and then aggregating.
>
>> - Set the single_copy mode as true, in case if the below node of
>> Gather is a parallel aggregate.
>
> That sounds wrong.  Single-copy mode is for when we need to be certain
> of running exactly one copy of the plan.  If you're trying to have
> several workers aggregate in parallel, that's exactly what you don't
> want.

What I mean by setting the flag is to avoid the backend executing the child
plan.

> Also, I think the path for parallel aggregation should probably be
> something like FinalizeAgg -> Gather -> PartialAgg -> some partial
> path here.  I'm not clear whether that is what you are thinking or
> not.

No, I am thinking of the following way:
Gather -> PartialAgg -> some partial path

I want the Gather node to merge the results coming from all workers;
otherwise it may be difficult to merge at the parent of the Gather node. If
the partial group aggregate is under the Gather node and any two workers
return data for the same group key, we need to compare them and combine them
into a single group. If we are at the Gather node, it is possible to wait
until we get slots from all workers. Once all workers return their slots, we
can compare and merge the necessary slots and return the result. Am I
missing something?

Regards,
Hari Babu
Fujitsu Australia




[HACKERS] Parallel Aggregate

2015-10-11 Thread Haribabu Kommi
Parallel aggregate is a feature that does the aggregation job in parallel,
with the help of the Gather and partial seq scan nodes. The following is a
basic overview of the parallel aggregate changes.

Decision phase:

Based on the following conditions, the parallel aggregate plan is generated.

- Check whether the plan below consists of a Gather node plus a partial seq
scan only.

This is to check whether the plan nodes that are present are aware of
parallelism or not.

- Check whether any projection or qual condition is present in the Gather
node.

If there are any quals or projection expressions that must be performed in
the Gather node, because they use functions that can only be executed in the
master backend, the parallel aggregate plan is not chosen.

- Check whether the aggregate supports parallelism or not.

For the first patch, I thought of supporting only some aggregates for
parallel aggregate. The supported aggregates are mainly the aggregate
functions that have variable-length data types as their final and transition
types. This is to avoid changing the target list return types: because the
types are variable length, even the transition type can be returned to the
backend without applying the aggregate's final function. To identify the
aggregates supported for parallelism, a new member is added to the
pg_aggregate system catalog table.
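
A hypothetical catalog query showing the idea; the boolean column name below
is purely illustrative, since the text above only says "a new member is
added":

-- List the aggregates flagged as supporting the parallel path
-- (aggpartialsafe is an assumed name for the proposed new column):
SELECT aggfnoid
FROM   pg_aggregate
WHERE  aggpartialsafe;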

- Currently only Group and plain aggregates are supported, for simplicity.

This patch doesn't change anything in the aggregate plan decision. If the
planner decides that a group or plain aggregate is the best plan, then we
will check whether it can be converted into a parallel aggregate or not.


Planning phase:

- Generate the target list items that need to be passed to the child
aggregate nodes, by separating bare aggregates and group by expressions.
This is required to take care of any expressions that are involved in the
target list.

Example:
Output: (sum(id1)), (3 + (sum((id2 - 3)))), (max(id1)), ((count(id1)) - (max(id1)))
   ->  Aggregate
         Output: sum(id1), sum((id2 - 3)), max(id1), count(id1)

- Don't push the HAVING clause down to the child aggregate node; it needs to
be executed at the Gather node only, after combining all results from the
workers with the matching key (and also after the final function is called
for the aggregate function, if one exists).
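
A small example of why this matters: evaluating the HAVING predicate on a
worker's partial counts would filter groups too early.

-- Each worker may see only part of a group, e.g. 6 rows in one worker
-- and 7 in another; HAVING count(*) > 10 holds only for the combined
-- group, so it must run after the combine step at the top level:
SELECT id1, count(*)
FROM   t
GROUP  BY id1
HAVING count(*) > 10;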

- Get the details of the Gather plan, remove its plan node from the actual
plan, and prepare the Gather plan on top of the aggregate plan.


Execution phase:

- By passing some execution flag like EXEC_PARALLEL or similar, the
aggregate operations skip the final function calculation on the worker side.

- Set the single_copy mode to true in case the node below the Gather is a
parallel aggregate.

- Add support for getting a slot from a particular worker. This support is
required to merge the slots from different workers based on the grouping
key.

- Merge the slots received from the workers based on the grouping key. If
there is no grouping key, then merge all slots without waiting to receive
slots from all workers.

- If there is a grouping key, the backend has to wait until it gets slots
from all running workers. Once all slots are received, they need to be
compared against the grouping key and merged accordingly. The merged slot
needs to be processed further to apply the final function, qualification and
projection.

I will try to provide a POC patch by next commit-fest.

Comments?


Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Parallel Aggregate

2015-10-11 Thread Haribabu Kommi
On Mon, Oct 12, 2015 at 2:25 PM, David Rowley
<david.row...@2ndquadrant.com> wrote:
> On 12 October 2015 at 15:07, Haribabu Kommi <kommi.harib...@gmail.com>
> wrote:
>>
>> - check whether the aggregate supports parallelism or not.
>>
>> As for first patch, I thought of supporting only some aggregates for
>> this parallel aggregate.
>> The supported aggregates are mainly the aggregate functions that have
>> variable length data types as final and transition types. This is to
>> avoid changing the target list return types. Because of variable
>> lengths, even the transition type can be returned to backend without
>> applying the final function in aggregate. To identify the supported
>> aggregates for parallelism, a new member is added to pg_aggregate
>> system catalog table.
>>
>> - currently Group and plain aggregates are only supported for simplicity.
>>
>> This patch doesn't change anything in aggregate plan decision. If the
>> planner decides the group
>> or plain aggregates as the best plan, then we will check whether this
>> can be converted into
>> parallel aggregate or not?
>
>
> Hi,
>
> I've never previously proposed any implementation for parallel aggregation,
> but I have previously proposed infrastructure to allow aggregation to happen
> in multiple steps. It seems your plan sounds very different from what I've
> proposed.
>
> I attempted to convey my idea on this to the community here
> http://www.postgresql.org/message-id/cakjs1f-tmwi-4c5k6cblrdtfgsvxojhadefzje7swuvbgms...@mail.gmail.com
> which Simon and I proposed an actual proof of concept patch here
> https://commitfest.postgresql.org/5/131/

My plan is also to use the combine_aggregate_state_v2.patch, or something
similar to what you have proposed, to merge the partial aggregate results
and combine them in the backend process. As a POC patch, I just want to
limit this functionality to aggregates that have variable-length datatypes
as their transition and final arguments.

> I've since expanded on that work in the form of a WIP patch which implements
> GROUP BY before JOIN here
> http://www.postgresql.org/message-id/CAKJS1f9kw95K2pnCKAoPmNw==7fgjsjc-82cy1rb+-x-jz0...@mail.gmail.com
>
> It's pretty evident that we both need to align the way we plan to handle
> this multiple step aggregation, there's no sense at all in having 2
> different ways of doing this. Perhaps you could look over my patch and let
> me know the parts which you disagree with, then we can resolve these
> together and come up with the best solution for each of us.

Thanks for the details. I will go through it. At a basic level, this patch
is an enhancement of combine_aggregate_state_v2.patch.

> It may also be useful for you to glance at how Postgres-XL handles this
> partial aggregation problem, as it, where possible, will partially aggregate
> the results on each node, pass the partially aggregates state to the master
> node to have it perform the final aggregate stage on each of the individual
> aggregate states from each node. Note that this requires giving the
> aggregates with internal aggregate states an SQL level type and it also
> means implementing an input and output function for these types. I've
> noticed that XL mostly handles this by making the output function build a
> string something along the lines of : for aggregates such as
> AVG(). I believe you'll need something very similar to this to pass the
> partial states between worker and master process.

Yes, we may need something like this, or to add support for passing internal
datatypes between the worker and backend processes, in order to support all
aggregate functions.

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] RLS bug in expanding security quals

2015-10-08 Thread Haribabu Kommi
On Fri, Oct 9, 2015 at 3:50 AM, Dean Rasheed  wrote:
> On 8 October 2015 at 15:05, Dean Rasheed  wrote:
>> Attached is a simple patch that appears to work, but it needs more
>> testing (and some regression tests).
>>
>
> Here's an updated patch with an extra regression test case that
> triggers the issue.
>
> I've also updated the function comment for expand_security_quals() to
> better explain the situations where it actually has work to do --
> tables with RLS and updates to auto-updatable security barrier views,
> but not SELECTs from security berrier views. This explains why this
> bug doesn't affect security barrier views (UNION ALL views aren't
> auto-updatable), so only 9.5 and HEAD need to be patched.

Thanks for the patch. I didn't find any problems in my testing with it.

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] removing set_latch_on_sigusr1

2015-10-08 Thread Haribabu Kommi
On Fri, Oct 9, 2015 at 2:41 AM, Robert Haas  wrote:
>
> As it happens, the TupleQueueFunnelNext function I recently committed
> has such a hazard, which I failed to spot during review and testing.
> If people don't like this, I can instead cause that function to set
> the flag.  But every place that sets the flag has to use a
> PG_TRY()/PG_CATCH() block to make sure the old value of the flag gets
> restored.  I'm pretty sure that's going to burn more cycles than the
> flag can ever hope to save, not to mention the risk of bugs due to
> people forgetting to add necessary volatile qualifiers.  We've already
> got four PG_TRY() blocks in the code to cater to this stupid flag, and
> if we keep it around I'm sure we'll accumulate at least a few more.
>
> Patch attached.  Objections?  Suggestions?  Comments?

I also once faced a problem with the set_latch_on_sigusr1 flag in our
development.

+1 for removal.

Regards,
Hari Babu
Fujitsu Australia



