Re: [HACKERS] Allowing NOT IN to use ANTI joins

2014-07-05 Thread David Rowley
On Wed, Jul 2, 2014 at 9:25 PM, Jeevan Chalke jeevan.cha...@enterprisedb.com
wrote:


 On Sun, Jun 29, 2014 at 4:18 PM, David Rowley dgrowle...@gmail.com
 wrote:

 I think I'm finally ready for a review again, so I'll update the
 commitfest app.


 I have reviewed this on code level.

 1. Patch gets applied cleanly.
 2. make/make install/make check all are fine

 No issues found till now.

 However some cosmetic points:

 1.
  * The API of this function is identical to convert_ANY_sublink_to_join's,
 * except that we also support the case where the caller has found NOT EXISTS,
 * so we need an additional input parameter under_not.

 Since we now support NOT IN handling in convert_ANY_sublink_to_join(), we
 have an under_not parameter there too. So the above comment near
 convert_EXISTS_sublink_to_join() is no longer valid.


Nice catch. I've removed the last 2 lines from that comment now.


 2. Following is the unnecessary change. Can be removed:

 @@ -506,6 +560,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 return NULL;
 }
 }
 +
 }
 /* Else return it unmodified */
 return node;


Removed




3. Typos:

 a.
 + * queryTargetListListCanHaveNulls
 ...
 +queryTargetListCanHaveNulls(Query *query)

 Function name is queryTargetListCanHaveNulls() not
 queryTargetListListCanHaveNulls()


Fixed.


 b.
  *For example the presense of; col IS NOT NULL, or col = 42 would allow

 presense = presence



Fixed.


 4. get_attnotnull() throws an error for invalid relid/attnum.
 But I see other get_att* functions which do not throw an error but rather
 return some invalid value, e.g. NULL in case of attname, InvalidOid in
 case of atttype and InvalidAttrNumber in case of attnum. I have observed
 that we cannot return such an invalid value for type boolean, and I guess
 that's the reason you are throwing an error. But it somehow looks unusual.
 You have put a note there, which is good. But yes, callers of this function
 should be careful enough and should not assume it works like the other
 get_att*() functions.
 However, for this patch, I would simply accept this solution. But I wonder
 if someone thinks the other way round.


hmmm, I remember thinking about that at the time. It was a choice between
returning false or throwing an error. I decided that it probably should be
an error, as that's what get_relid_attribute_name() was doing. Just search
lsyscache.c for "cache lookup failed for attribute %d of relation %u".
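
For reference, the shape I mean is roughly this (a sketch following the usual
lsyscache.c idiom, not the exact patch code):

#include "postgres.h"
#include "access/htup_details.h"
#include "catalog/pg_attribute.h"
#include "utils/syscache.h"

/*
 * get_attnotnull - sketch only.  Unlike most get_att* functions there is
 * no "invalid" boolean value to hand back, so a failed syscache lookup
 * has to be reported with elog(ERROR).
 */
bool
get_attnotnull(Oid relid, AttrNumber attnum)
{
	HeapTuple	tp;
	Form_pg_attribute att_tup;
	bool		result;

	tp = SearchSysCache2(ATTNUM,
						 ObjectIdGetDatum(relid),
						 Int16GetDatum(attnum));
	if (!HeapTupleIsValid(tp))
		elog(ERROR, "cache lookup failed for attribute %d of relation %u",
			 attnum, relid);

	att_tup = (Form_pg_attribute) GETSTRUCT(tp);
	result = att_tup->attnotnull;
	ReleaseSysCache(tp);

	return result;
}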



 Testing more on SQL level.


Thank you for reviewing this. I think in the attached I've managed to nail
down the logic in find_inner_rels(). It was actually simpler than I
thought. While working on this today I also noticed that RIGHT joins can still
exist at this stage of planning, and also that FULL joins are not yet
transformed. E.g.: a FULL JOIN b ON a.id = b.id WHERE a.id IS NOT NULL would
later become a LEFT JOIN, but at this stage in planning, it's still a FULL
JOIN. I've added some new regression tests which test some of these join
types.

With further testing I noticed that the patch was not allowing ANTI joins
in cases like this:

explain select * from a where id not in(select x from b natural join c);

even if b.x and c.x were NOT NULL columns. This is because the TargetEntry
for x has a varno which belongs to neither b nor c; it's actually the varno
for the join itself. I've added some code to detect this in
find_inner_rels(), but I'm not quite convinced yet that it's in the correct
place... I'm wondering if a future use for find_inner_rels() would need to
only see relations rather than join varnos. The code I added does get the
above query using ANTI joins, but it does have a bit of a weird quirk, in
that for it to perform an ANTI JOIN, b.x must be NOT NULL. If c.x is NOT
NULL and b.x is nullable, then there will be no ANTI JOIN. There must be
some code somewhere that chooses which of the 2 vars should go into
the target list for natural joins.
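
To make the quirk concrete, a sketch of what I'm seeing (table definitions
assumed here, not taken from the regression tests):

create table a (id int not null);
create table b (x int not null);
create table c (x int);            -- nullable

-- b.x is NOT NULL, so the join's x is proven non-nullable and the
-- NOT IN can become an ANTI JOIN:
explain select * from a where id not in (select x from b natural join c);

-- The quirk: declare the NOT NULL on c.x instead of b.x and the ANTI
-- JOIN is no longer chosen, even though b.x = c.x already rules out
-- NULLs in the join output.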

The above got me thinking that the join conditions can also be used to
prove not-nullness too. Take the query above as an example: x can never
actually be NULL; the condition b.x = c.x ensures that. I've only gone as
far as adding some comments which explain that this is a possible future
optimisation. I've not had time to look at this yet, and I'd rather see the
patch go in without it than risk delaying this until the next commitfest...
unless of course someone thinks it's just too weird a quirk to have in the
code.

Attached is the updated version of the patch.

Regards

David Rowley


However, assigning it to the author to think about the above cosmetic issues.

 Thanks

 --
 Jeevan B Chalke
 Principal Software Engineer, Product Development
 EnterpriseDB Corporation
 The Enterprise PostgreSQL Company




not_in_anti_join_5257082_2014-07-05.patch
Description: Binary data


Re: [HACKERS] Proposing pg_hibernate

2014-07-05 Thread MauMau

Hello,

I'm reviewing this patch.  I find this feature useful, so keep up the good work.

I've just begun the review of pg_hibernate.c, and finished reviewing other 
files.  pg_hibernate.c will probably take some time to review, so let me 
give you the result of my review so far.  I'm sorry for trivial comments.



(1)
The build on Windows with MSVC 2008 Express failed.  The error messages are
as follows (they were in Japanese; translated here):




 .\contrib\pg_hibernator\pg_hibernator.c(50): error C2373:
'CreateDirectoryA' : redefinition; different type modifiers
 .\contrib\pg_hibernator\pg_hibernator.c(196): error C2373:
'CreateDirectoryA' : redefinition; different type modifiers
 .\contrib\pg_hibernator\pg_hibernator.c(740): error C2198:
'CreateDirectoryA' : too few arguments for call



The cause is that CreateDirectory() is a Win32 API.  When I renamed it, the
build succeeded.  I don't think you need to make it a function, because its
processing is simple and it is used in only one place.
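
For example, the one call site could just do something like this inline (the
directory name and error wording here are assumptions, not the patch's code):

	if (mkdir("pg_hibernator", S_IRWXU) != 0 && errno != EEXIST)
		ereport(ERROR,
				(errcode_for_file_access(),
				 errmsg("could not create directory \"%s\": %m",
						"pg_hibernator")));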



(2)
Please look at the following messages.  The Block Reader read all blocks
successfully but exited with 1, while the Buffer Saver exited with 0.  I
think the Block Reader also should exit 0 when it completes its job
successfully, because the exit code 0 gives the impression of success.


LOG:  worker process: Buffer Saver (PID 2984) exited with exit code 0
...
LOG:  Block Reader 1: all blocks read successfully
LOG:  worker process: Block Reader 2 (PID 6064) exited with exit code 1

In addition, the names "Buffer Saver" and "Block Reader" don't correspond,
while they both operate on the same objects.  I suggest using the word
"Buffer" or "Block" for both processes.



(3)
Please add the definition of variable PGFILEDESC in Makefile.  See 
pg_upgrade_support's Makefile for an example.  It is necessary for the 
versioning info of DLLs on Windows.  Currently, other contrib libraries lack 
the versioning info.  Michael Paquier is adding the missing versioning info 
to other modules for 9.5.



(4)
Remove the following #ifdef, because you are attempting to include this 
module in 9.5.


#if PG_VERSION_NUM >= 90400


(5)
Add header comments at the beginning of source files like other files.


(6)
Add user documentation SGML file in doc/src/sgml instead of README.md.
For reference, I noticed the following mistakes in README.md:

instalation -> installation
`Block Reader` threads -> `Block Reader` processes


(7)
The message

could not open \"%s\": %m

should better be:

could not open file \"%s\": %m

because this message is already used in many places.  Please find and use 
existing messages for other places as much as possible.  That will reduce 
the translation efforts for other languages.
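
For instance, the existing wording is used elsewhere in the backend like this
(the variable name is illustrative):

	ereport(ERROR,
			(errcode_for_file_access(),
			 errmsg("could not open file \"%s\": %m", filePath)));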



(8)
fileClose() never returns false despite its comment:

/* Returns true on success, doesn't return on error */


(9)
I think it would be better for the Block Reader to include the name and/or 
OID of the database in its success message:


LOG:  Block Reader 1: restored 14 blocks


(10)
I think the save file name should better be "<database OID>.save" instead of
some arbitrary "<integer>.save".  That also means "%u.save" instead of "%d.save".
What do you think?
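
Something like this is what I have in mind (sketch only; MyDatabaseId is the
backend's database OID global):

	char		savefile[MAXPGPATH];

	/* OIDs are unsigned, hence %u rather than %d */
	snprintf(savefile, sizeof(savefile), "%u.save", MyDatabaseId);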



(11)
Why don't you remove misc.c, removing unnecessary functions and keeping just
the really useful ones in pg_hibernator.c?  For example, I don't think
fileOpen(), fileClose(), fileRead() and fileWrite() need to be separate
functions (see src/backend/postmaster/pgstat.c).  And, there's only one call
site for each of the following functions:
site of the following functions:


readDBName
getSavefileNameBeingRestored
markSavefileBeingRestored

Regards
MauMau





[HACKERS] unused local variable

2014-07-05 Thread Kevin Grittner
Commit 8b6010b8350a1756cd85595705971df81b5ffc07 eliminated the only
usage of a variable called typeStruct in plpy_spi.c, but left the
declaration and the code that sets a value for it.  This is
generating a warning when I build.  I would have just pushed a fix,
but I was concerned that it might have been left on purpose for
some follow-on patch. Any objections to the attached, to silence
the warning?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

diff --git a/src/pl/plpython/plpy_spi.c b/src/pl/plpython/plpy_spi.c
index 6c3eeff..465b316 100644
--- a/src/pl/plpython/plpy_spi.c
+++ b/src/pl/plpython/plpy_spi.c
@@ -93,7 +93,6 @@ PLy_spi_prepare(PyObject *self, PyObject *args)
 			HeapTuple	typeTup;
 			Oid			typeId;
 			int32		typmod;
-			Form_pg_type typeStruct;
 
 			optr = PySequence_GetItem(list, i);
 			if (PyString_Check(optr))
@@ -129,7 +128,6 @@ PLy_spi_prepare(PyObject *self, PyObject *args)
 			optr = NULL;
 
 			plan->types[i] = typeId;
-			typeStruct = (Form_pg_type) GETSTRUCT(typeTup);
 			PLy_output_datum_func(plan->args[i], typeTup);
 			ReleaseSysCache(typeTup);
 		}



Re: [HACKERS] delta relations in AFTER triggers

2014-07-05 Thread Kevin Grittner
Kevin Grittner kgri...@ymail.com wrote:

 Because this review advances the patch so far, it may be feasible
 to get it committed in this CF.  I'll see what is needed to get
 there and maybe have a patch toward that end in a few days.

It appears that I need to create a new execution node and a way for
SPI calls to use it.  That seems beyond the scope of what is fair
to include in this CF, even if I got something put together in the
next couple days.

FWIW, I think that once I have the other pieces, what I initially
posted is committable as the first patch of a series.  A second
patch would add the new execution node and code to allow SPI calls
to use it.  The patch that David submitted, as modified by myself
and with further refinements that David is working on, would be the
third patch.  An implementation in plpgsql would be the fourth.
Other PLs would be left for people more familiar with those
languages to implement.

What I was hoping for in this CF was a review to confirm the
approach before proceeding to build on this foundation.  David
found nothing to complain about, and went to the trouble of writing
code to confirm that it was actually generating complete results
which were usable.  Robert doesn't feel this constitutes a serious
code review.  I'm not aware of any changes which are needed to the
pending patch once the follow-on patches are complete.  I'm moving
this to Needs Review status.  People will have another chance to
review this patch when the other code is available, but if we want
incremental maintenance of materialized views in 9.5, delaying
review of what I have submitted in this CF until the next CF will
put that goal in jeopardy.

The one thing I don't feel great about is that it's using
tuplestores of the actual tuples rather than just their TIDs; but
it seems to me that we need the full tuple to support triggers on
FDWs, so the TID approach would be an optimization for a subset of
the cases, and would probably be more appropriate, if we do it at
all, in a follow-on patch after this is working (although I think
it is an optimization we should get into 9.5 if we are going to do
it).  If you disagree with that assessment, now would be a good
time to explain your reasoning.

A minor point is psql tab-completion for the REFERENCING clause. 
It seems to me that's not critical, but I might slip it in anyway
before commit.

I took a look at whether I could avoid making OLD and NEW
non-reserved keywords, but I didn't see how to do that without
making FOR at least partially reserved.  If someone sees a way to
do this without creating three new unreserved keywords
(REFERENCING, OLD, and NEW) I'm all ears.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] postgresql.auto.conf and reload

2014-07-05 Thread Tom Lane
Amit Kapila amit.kapil...@gmail.com writes:
 Please find the patch attached to address the above concern.  I have
 updated docs, so that users can be aware of such behaviour.

I'm in the camp that says that we need to do something about this, not
just claim that it's operating as designed.  AFAICS, the *entire* argument
for having ALTER SYSTEM at all is ease of use; but this behavior is not
what I would call facilitating ease of use.  In particular, if you are
conveniently able to edit postgresql.conf, what the heck do you need
ALTER SYSTEM for?

One possible answer is to modify guc-file.l so that only the last value
supplied for any one GUC gets processed.  The naive way of doing that
would be O(N^2) in the number of uncommented settings, which might or
might not be a problem; if it is, we could no doubt devise a smarter
data structure.

regards, tom lane




Re: [HACKERS] unused local variable

2014-07-05 Thread Tom Lane
Kevin Grittner kgri...@ymail.com writes:
 Commit 8b6010b8350a1756cd85595705971df81b5ffc07 eliminated the only
 usage of a variable called typeStruct in plpy_spi.c, but left the
 declaration and the code that sets a value for it.  This is
 generating a warning when I build.  I would have just pushed a fix,
 but I was concerned that it might have been left on purpose for
 some follow-on patch. Any objections to the attached, to silence
 the warning?

Huh.  Odd that I did not see such a warning here.  No objection to
removing the dead variable.

regards, tom lane




Re: [HACKERS] Allowing join removals for more join types

2014-07-05 Thread Tom Lane
David Rowley dgrowle...@gmail.com writes:
 Attached is a delta patch between version 1.2 and 1.3, and also a
 completely updated patch.

Just to note that I've started looking at this, and I've detected a rather
significant omission: there's no check that the join operator has anything
to do with the subquery's grouping operator.  I think we need to verify
that they are members of the same opclass, as
relation_has_unique_index_for does.

regards, tom lane




Re: [HACKERS] unused local variable

2014-07-05 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kgri...@ymail.com writes:


 typeStruct in plpy_spi.c

 No objection to removing the dead variable.

Done.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] 9.4 documentation: duplicate paragraph in logical decoding example

2014-07-05 Thread Christoph Moench-Tegeder
Hi,

while reading the logical decoding docs, I came across a duplicated
paragraph in doc/src/sgml/logicaldecoding.sgml - in the current
master branch, lines 108 to 115 are the same as lines 117 to 124.
I've attached a patch which removes the second instance of that
paragraph.
In case it is intended to demonstrate that the changes in the stream
were not consumed by pg_logical_slot_peek_changes(), the comment
in line 117 should be removed, or reworded like "the changes have
not been consumed by the previous command", just to avoid making
it look like that paragraph had been duplicated by accident :)

Regards,
Christoph

-- 
Spare Space
diff --git a/doc/src/sgml/logicaldecoding.sgml b/doc/src/sgml/logicaldecoding.sgml
index a2108d6..5fa2a1e 100644
--- a/doc/src/sgml/logicaldecoding.sgml
+++ b/doc/src/sgml/logicaldecoding.sgml
@@ -114,15 +114,6 @@ postgres=# SELECT * FROM pg_logical_slot_peek_changes('regression_slot', NULL, N
  0/16E0B90 | 690 | COMMIT 690
 (3 rows)
 
-postgres=# -- You can also peek ahead in the change stream without consuming changes
-postgres=# SELECT * FROM pg_logical_slot_peek_changes('regression_slot', NULL, NULL);
- location  | xid | data
------------+-----+----------------------------------------------------------
- 0/16E09C0 | 690 | BEGIN 690
- 0/16E09C0 | 690 | table public.data: INSERT: id[integer]:3 data[text]:'3'
- 0/16E0B90 | 690 | COMMIT 690
-(3 rows)
-
 postgres=# -- options can be passed to output plugin, to influence the formatting
 postgres=# SELECT * FROM pg_logical_slot_peek_changes('regression_slot', NULL, NULL, 'include-timestamp', 'on');
  location  | xid | data



[HACKERS] Performance regression: 9.2+ vs. ScalarArrayOpExpr vs. ORDER BY

2014-07-05 Thread Andrew Gierth
Spent some time analyzing a severe performance regression on a 9.1 -> 9.3
upgrade for a user on IRC. Narrowed it down to this:

commit 807a40c5 fixed a bug in handling of (new in 9.2) functionality
of ScalarArrayOpExpr in btree index quals, forcing the results of
scans including such a qual to be treated as unordered (because the
order can in fact be wrong).

However, this kills any chance of using the same index _without_ the
SAOP to get the benefit of its ordering.

For example:

regression=# create index on tenk1 (ten,unique1,thousand);
CREATE INDEX
regression=# set enable_sort = false;
SET
regression=# explain analyze select * from tenk1 where thousand in
(19,29,39,49,57,66,77,8,90,12,22,32) and (ten >= 5) and (ten > 5 or unique1 >=
5000) order by ten,unique1 limit 1;
                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1000294.71..1000294.71 rows=1 width=244) (actual time=0.889..0.891 rows=1 loops=1)
   ->  Sort  (cost=1000294.71..1000294.81 rows=42 width=244) (actual time=0.884..0.884 rows=1 loops=1)
         Sort Key: ten, unique1
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Bitmap Heap Scan on tenk1  (cost=44.34..294.50 rows=42 width=244) (actual time=0.194..0.687 rows=80 loops=1)
               Recheck Cond: (thousand = ANY ('{19,29,39,49,57,66,77,8,90,12,22,32}'::integer[]))
               Filter: ((ten >= 5) AND ((ten > 5) OR (unique1 >= 5000)))
               Rows Removed by Filter: 40
               Heap Blocks: exact=100
               ->  Bitmap Index Scan on tenk1_thous_tenthous  (cost=0.00..44.33 rows=120 width=0) (actual time=0.148..0.148 rows=120 loops=1)
                     Index Cond: (thousand = ANY ('{19,29,39,49,57,66,77,8,90,12,22,32}'::integer[]))
 Planning time: 0.491 ms
 Execution time: 1.023 ms
(13 rows)

(real case had larger rowcounts of course, but showed the same effect
of using a Sort plan even when enable_sort=false.)

Obviously one can create a (ten,unique1) index (which would otherwise
be almost completely redundant with the 3-column one), but that wastes
space and eliminates some index-only-scan opportunities. Similarly one
can hack the query, e.g. using WHERE (thousand+0) IN (...) but that is
as ugly as all sin.

I've experimented with the attached patch, which detects when this
situation might have occurred and does another pass to try and build
ordered scans without the SAOP condition. However, the results may not
be quite ideal, because at least in some test queries (not all) the
scan with the SAOP included in the indexquals is being costed higher
than the same scan with the SAOP moved to a Filter, which seems
unreasonable.  (One of the affected queries is the regression test for
the original bug, which this patch does not yet try and fix and
therefore currently fails regression on.)

Thoughts?

-- 
Andrew (irc:RhodiumToad)

diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index 42dcb11..6ae8e33 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -50,7 +50,8 @@ typedef enum
 {
 	SAOP_PER_AM,	/* Use ScalarArrayOpExpr if amsearcharray */
 	SAOP_ALLOW,		/* Use ScalarArrayOpExpr for all indexes */
-	SAOP_REQUIRE	/* Require ScalarArrayOpExpr to be used */
+	SAOP_REQUIRE,	/* Require ScalarArrayOpExpr to be used */
+	SAOP_SKIP_LOWER	/* Require lower ScalarArrayOpExpr to be eliminated */
 } SaOpControl;
 
 /* Whether we are looking for plain indexscan, bitmap scan, or either */
@@ -118,7 +119,8 @@ static void get_index_paths(PlannerInfo *root, RelOptInfo *rel,
 static List *build_index_paths(PlannerInfo *root, RelOptInfo *rel,
   IndexOptInfo *index, IndexClauseSet *clauses,
   bool useful_predicate,
-  SaOpControl saop_control, ScanTypeControl scantype);
+  SaOpControl saop_control, ScanTypeControl scantype,
+  bool *saop_retry);
 static List *build_paths_for_OR(PlannerInfo *root, RelOptInfo *rel,
    List *clauses, List *other_clauses);
 static List *generate_bitmap_or_paths(PlannerInfo *root, RelOptInfo *rel,
@@ -734,6 +736,7 @@ get_index_paths(PlannerInfo *root, RelOptInfo *rel,
 {
 	List	   *indexpaths;
 	ListCell   *lc;
+	bool   saop_retry = false;
 
 	/*
 	 * Build simple index paths using the clauses.  Allow ScalarArrayOpExpr
@@ -742,7 +745,23 @@ get_index_paths(PlannerInfo *root, RelOptInfo *rel,
 	indexpaths = build_index_paths(root, rel,
    index, clauses,
    index->predOK,
-   SAOP_PER_AM, ST_ANYSCAN);
+   SAOP_PER_AM, ST_ANYSCAN, &saop_retry);
+
+	/*
+	 * If we allowed any ScalarArrayOpExprs on an index with a useful sort
+	 * ordering, then try again without them, since otherwise we miss important
+	 * paths where the index 

Re: [HACKERS] Allowing join removals for more join types

2014-07-05 Thread David Rowley
On 6 July 2014 03:20, Tom Lane t...@sss.pgh.pa.us wrote:

 David Rowley dgrowle...@gmail.com writes:
  Attached is a delta patch between version 1.2 and 1.3, and also a
  completely updated patch.

 Just to note that I've started looking at this, and I've detected a rather
 significant omission: there's no check that the join operator has anything
 to do with the subquery's grouping operator.  I think we need to verify
 that they are members of the same opclass, as
 relation_has_unique_index_for does.


hmm, good point. If I understand this correctly, we can just ensure that the
same operator is used for both the grouping and the join condition.
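
Perhaps something along these lines (a sketch only; join_opno and group_opno
are invented names, but equality_ops_are_compatible() is the existing
lsyscache.c helper that checks whether two equality operators share an
operator family):

	/*
	 * join_opno is the operator from the join qual; group_opno is the
	 * equality operator the subquery's DISTINCT/GROUP BY is based on.
	 */
	if (join_opno != group_opno &&
		!equality_ops_are_compatible(join_opno, group_opno))
		return false;	/* can't rely on the subquery's uniqueness */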

I've attached a small delta patch which fixes this, and also attached the
full updated patch.

Regards

David Rowley


subquery_leftjoin_removal_v1.4.patch
Description: Binary data


subquery_leftjoin_removal_v1.4_delta.patch
Description: Binary data



Re: [HACKERS] DISTINCT with btree skip scan

2014-07-05 Thread Thomas Munro
On 5 July 2014 02:03, Vik Fearing vik.fear...@dalibo.com wrote:
 [1] http://wiki.postgresql.org/wiki/Loose_indexscan

Thanks. It talks about DISTINCT, and also about using an index when you
don't have the leading column in your WHERE clause (which MySQL
(loose), Oracle (skip), SQLite (skip) and DB2 (jump) can
do).  It looks like at least MySQL can also use loose index scans
to implement GROUP BY in certain cases involving MIN or MAX aggregate
functions (things like SELECT a, MIN(b) FROM foo GROUP BY a, given an
index on (a, b)).
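
That wiki page also shows how the same effect can be emulated by hand today
with a recursive CTE; roughly, given an index on foo (a):

with recursive t as (
   (select a from foo order by a limit 1)
   union all
   select (select a from foo where a > t.a order by a limit 1)
   from t
   where t.a is not null
)
select a from t where a is not null;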

But I'm only trying to implement the lowest hanging index skipping
plan: plain old DISTINCT.  I think I see roughly how to cost, plan and
execute it...  now to learn a lot more about PG internals...




Re: [HACKERS] tweaking NTUP_PER_BUCKET

2014-07-05 Thread Stephen Frost
* Greg Stark (st...@mit.edu) wrote:
 On Thu, Jul 3, 2014 at 11:40 AM, Atri Sharma atri.j...@gmail.com wrote:
  IIRC, last time when we tried doing bloom filters, I was short of some real
  world useful hash functions that we could use for building the bloom filter.
 
 Last time, we wanted to use bloom filters in hash joins to filter
 out tuples that won't match any of the future hash batches, to reduce
 the amount of tuples that need to be spilled to disk. However, the
 problem was that it was unclear, for a given amount of memory usage, how
 to pick the right size of bloom filter and how to model how much it would
 save versus how much it would cost in reduced hash table size.

Right.  There's only one hash function available, really (we don't
currently support multiple hash functions..), unless we want to try and
re-hash the 32bit hash value that we get back (not against trying that,
but it isn't what I'd start with), and it would hopefully be sufficient
for this.
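
Re-deriving several probe positions from the one 32bit value would presumably
just be the standard double-hashing trick; a sketch (none of this is existing
code, and the typedefs are PG's usual uint32/uint64):

/*
 * Derive nhashes bloom-filter bit positions from a single 32-bit hash
 * (Kirsch & Mitzenmacher style double hashing) and set them in the filter.
 */
static void
bloom_add(uint64 *bits, uint32 hashvalue, int nhashes, uint32 nbits)
{
	uint32		h1 = hashvalue & 0xFFFF;
	uint32		h2 = (hashvalue >> 16) | 1;	/* force odd so probes spread */
	int			i;

	for (i = 0; i < nhashes; i++)
	{
		uint32		bit = (h1 + (uint32) i * h2) % nbits;

		bits[bit / 64] |= UINT64CONST(1) << (bit % 64);
	}
}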

 I think it just required some good empirical tests and hash join heavy
 workloads to come up with some reasonable guesses. We don't need a
 perfect model just some reasonable bloom filter size that we're pretty
 sure will usually help more than it hurts.

This would help out a lot of things, really..  Perhaps what Tomas is
developing regarding test cases would help here also.

Thanks,

Stephen




Re: [HACKERS] postgresql.auto.conf and reload

2014-07-05 Thread Amit Kapila
On Sat, Jul 5, 2014 at 8:22 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Amit Kapila amit.kapil...@gmail.com writes:
  Please find the patch attached to address the above concern.  I have
  updated docs, so that users can be aware of such behaviour.

 I'm in the camp that says that we need to do something about this, not
 just claim that it's operating as designed.  AFAICS, the *entire* argument
 for having ALTER SYSTEM at all is ease of use; but this behavior is not
 what I would call facilitating ease of use.  In particular, if you are
 conveniently able to edit postgresql.conf, what the heck do you need
 ALTER SYSTEM for?

 One possible answer is to modify guc-file.l so that only the last value
 supplied for any one GUC gets processed.

Another could be that during initdb all the uncommented settings are
written to the postgresql.auto.conf file rather than to postgresql.conf.
I think we can do this by changing code in initdb.c's setup_config().
This will ensure that unless the user changes settings in a mixed way
(some by ALTER SYSTEM and some manually by editing postgresql.conf),
there will be no such problem.

 The naive way of doing that
 would be O(N^2) in the number of uncommented settings, which might or
 might not be a problem; if it is, we could no doubt devise a smarter
 data structure.

Okay.  To implement it, we can make sure while parsing the config file
that only unique elements are present in the list.  We can modify
function ParseConfigFp() to achieve this functionality.  Another
way could be that after the list is formed, we eliminate
duplicate entries from it, though we might need to do this at multiple places.
Do you have anything else in mind?
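
As a sketch of that first approach (names follow guc-file.l's ConfigVariable
list, but the ignore flag is hypothetical, and this is the naive O(N^2) pass):

/*
 * After ParseConfigFp() has built the list, mark every entry that is
 * shadowed by a later setting of the same name so that only the last
 * value supplied for any one GUC gets processed.
 */
static void
mark_shadowed_settings(ConfigVariable *head)
{
	ConfigVariable *item,
			   *later;

	for (item = head; item != NULL; item = item->next)
	{
		for (later = item->next; later != NULL; later = later->next)
		{
			if (strcmp(item->name, later->name) == 0)
			{
				item->ignore = true;	/* hypothetical field */
				break;
			}
		}
	}
}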

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] RLS Design

2014-07-05 Thread Stephen Frost
Kaigai,

* Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
  Can you clarify where this is coming from..?  It sounds like you're
  referring to an existing implementation and, if so, it'd be good to get
  more information on how that works exactly.
 
 Oracle VPD - Multiple Policies for Each Table, View, or Synonym
 http://docs.oracle.com/cd/B19306_01/network.102/b14266/apdvpoli.htm#i1008351
 
 It says - "Note that all policies applied to a table are enforced with AND
 syntax."

While I'm not against using this as an example to consider, it's much
more complex than what we're talking about here- and it supports
application contexts which allow groups of RLS rights to be applied or
not applied; essentially it allows both AND and OR for sets of RLS
policies, along with default policies which are applied no matter
what.

 Not only Oracle VPD, it fits the attitude of defense in depth.
 Please assume a system with a network firewall, unix permissions
 and selinux installed. If somebody wants to reference an information asset
 within a file, he has to connect to the server from a network address allowed
 by the firewall configuration, AND both DAC and MAC have to allow his
 access.

These are not independent systems and your argument would apply to our
GRANT system also, which I hope it's agreed would make it far less
useful.  Note also that SELinux brings in another complexity- it needs
to make system calls out to check the access.

 Usually, we have to pass all of the access controls to reference the target
 information, not just one of the access control mechanisms installed.

This is true in some cases, and not in others.  Only one role you are a
member of needs to have access to a relation, not all of them.  There
are other examples of 'OR'-style security policies; this is merely one.
I'm simply not convinced that it applies in the specific case we're
talking about.

In the end, I expect that either way people will be upset because they
won't be able to specify fully which should be AND vs. which should be
OR with the kind of flexibility other systems provide.  What I'm trying
to get to is an initial implementation which is generally useful and is
able to add such support later.

  This is similar to how roles work- your overall access includes all access
  granted to any roles you are a member of. You don't need SELECT rights 
  granted
  to every role you are a member of to select from the table. Additionally,
  if an admin wants to AND the quals together then they can simply create
  a policy which does that rather than have 2 policies.
  
 It seems to me a pain on database administration, if we have to pay attention
 not to conflict each RLS-policy. 

This notion of 'conflict' doesn't make much sense to me.  What is
'conflicting' here?  Each policy would simply need to stand on its own
for the role which it's being applied to.  That's very simple and
straight-forward.
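
To make the two combination behaviours concrete as plain quals (the table and
column names here are made up, and no policy syntax is assumed):

-- two policies OR'd together (role-like): a row is visible if either
-- policy's qual passes
select * from accounts where (owner = current_user) or (is_public);

-- an admin who wants AND semantics simply folds both quals into a
-- single policy
select * from accounts where (owner = current_user) and (is_public);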

 I expect 90% of RLS policies will be configured for the PUBLIC user, to
 apply the same rules to everybody on access. In this case, the DBA has to
 ensure the target table has no policy, or that the existing policy does not
 conflict with the new policy to be set.
 I don't think it is a good idea to force these checks on the DBA.

If the DBA only uses PUBLIC then they have to ensure that each policy
they set up for PUBLIC can stand on its own- though, really, I expect if
they go that route they'd end up with just one policy that calls a
stored procedure...

   You are suggesting instead that if application 2 sets up policies on the
  table and then application 1 adds another policy that it should reduce what
  application 2's users can see?  That doesn't make any sense to me.  I'd
  actually expect these applications to at least use different roles anyway,
  which means they could each have a single role specific policy which only
  returns what that application is allowed to see.
  
 I don't think this assumption is reasonable.
 Please consider two applications: app-X, a database security product
 that applies access control based on the remote IP address of the client
 for any table access by any database role, and app-Y, a usual enterprise
 package for daily business data, with its own RLS policy.
 What is the expected behavior in this case?

That the DBA manage the rights on the tables.  I expect that will be
required for quite a while with PG.  It's nice to think of these
application products that will manage all access for users by setting up
their own policies, but we have yet to even discuss how they would have
appropriate rights on the table to be able to do so (and to not
interfere with each other..).

Let's at least get something which is generally useful in.  I'm all for
trying to plan out how to get there and would welcome suggestions you
have which are specific to PG on what we could do here (I'm not keen on
just trying to mimic another product completely...), but at the level
we're talking about (either AND them all or OR them all), I don't think
we'd actually solve the