(2010/06/08 11:15), Robert Haas wrote:
> 2010/6/7 KaiGai Kohei<kai...@ak.jp.nec.com>:
>> Our headache is on functions categorized to middle-threat. It enables to
>> leak the given arguments using error messages. Here are several ideas,
>> but they have good and bad points.
> 
> I think we are altogether off in the weeds here.  We ought to start
> with an implementation that pushes nothing down, and then try to
> figure out how much we can relax that without too much compromising
> security.
> 

The attached patch tries to prevent pushing down anything into subqueries
from outside of them.

The distribute_qual_to_rels() tries to distribute the given qualifier
into a certain scanning-plan based on the dependency of qualifier.

E.g) SELECT * FROM (SELECT * FROM t1 JOIN t2 ON t1.a = t2.x WHERE 
f_policy(t1.a)) WHERE f_user(t2.x);

In this case, f_user() function depends on only t2 table, so it is
reasonable to attach on the scanning plan of t2 from perspective of
performance.

However, f_user() may have a side-effect which writes arguments into
somewhere. If here is such a possibility, f_user() should not be called
before the joined tuples being filtered by f_policy().

In the case when we can ensure all functions within the qualifier are
enough trustable, we don't need to prevent them to be pushed down.
But the algorithm to determine it is under discussion. So, right now,
we prevent all the possible pushing down.

Example.1)  CREATE VIEW v1 AS SELECT * FROM t1 JOIN t2 ON a = x WHERE 
f_policy(a);
            SELECT * FROM v1 WHERE f_malicious(b);

 * without this patch
    postgres=# EXPLAIN SELECT * FROM v1 WHERE f_malicious(b);
                                QUERY PLAN
    -------------------------------------------------------------------
     Hash Join  (cost=639.01..667.29 rows=137 width=72)
       Hash Cond: (t2.x = t1.a)
       ->  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
       ->  Hash  (cost=637.30..637.30 rows=137 width=36)
             ->  Seq Scan on t1  (cost=0.00..637.30 rows=137 width=36)
                   Filter: (f_policy(a) AND f_malicious(b))
    (6 rows)

 * with this patch
    postgres=# EXPLAIN SELECT * FROM v1 WHERE f_malicious(b);
                                QUERY PLAN
    -------------------------------------------------------------------
     Hash Join  (cost=334.93..468.44 rows=137 width=72)
       Hash Cond: (t2.x = t1.a)
       Join Filter: f_malicious(t1.b)
       ->  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
       ->  Hash  (cost=329.80..329.80 rows=410 width=36)
             ->  Seq Scan on t1  (cost=0.00..329.80 rows=410 width=36)
                   Filter: f_policy(a)
    (7 rows)

It prevents to push down f_malicious() inside of the join loop.


Example.2)  CREATE VIEW v1 AS SELECT * FROM t1 JOIN t2 ON a = x WHERE 
f_policy(a);
            SELECT * FROM v1 JOIN t3 ON v1.a=t3.s WHERE f_malicious(b);

  * without this patch
    postgres=# EXPLAIN SELECT * FROM v1 JOIN t3 ON v1.a=t3.s WHERE 
f_malicious(b);
                                      QUERY PLAN
    
-------------------------------------------------------------------------------
     Hash Join  (cost=669.01..697.29 rows=137 width=108)
       Hash Cond: (t3.s = t1.a)
       ->  Seq Scan on t3  (cost=0.00..22.30 rows=1230 width=36)
       ->  Hash  (cost=667.29..667.29 rows=137 width=72)
             ->  Hash Join  (cost=639.01..667.29 rows=137 width=72)
                   Hash Cond: (t2.x = t1.a)
                   ->  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
                   ->  Hash  (cost=637.30..637.30 rows=137 width=36)
                         ->  Seq Scan on t1  (cost=0.00..637.30 rows=137 
width=36)
                               Filter: (f_policy(a) AND f_malicious(b))
    (10 rows)

  * with this patch
    postgres=# EXPLAIN SELECT * FROM v1 JOIN t3 ON v1.a=t3.s WHERE 
f_malicious(b);
                                      QUERY PLAN
    
-------------------------------------------------------------------------------
     Hash Join  (cost=470.15..498.43 rows=137 width=108)
       Hash Cond: (t3.s = t1.a)
       ->  Seq Scan on t3  (cost=0.00..22.30 rows=1230 width=36)
       ->  Hash  (cost=468.44..468.44 rows=137 width=72)
             ->  Hash Join  (cost=334.93..468.44 rows=137 width=72)
                   Hash Cond: (t2.x = t1.a)
                   Join Filter: f_malicious(t1.b)
                   ->  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
                   ->  Hash  (cost=329.80..329.80 rows=410 width=36)
                         ->  Seq Scan on t1  (cost=0.00..329.80 rows=410 
width=36)
                               Filter: f_policy(a)
    (11 rows)

It also prevents f_malisious() to be pushed down into the join loop within view,
but we can push it down into same level of the query.


Please note that it specially handles equality operator at the bottom half of
the distribute_qual_to_rels(), so this patch does not care about these cases.
However, I'm not in hustle to prevent these optimization, because I guess
these should be entirely trusted. So, the patch is in just a start up phase,
not commitable anyway.

    postgres=# EXPLAIN SELECT * FROM v1 WHERE b = 'aaa';
                                   QUERY PLAN
    -------------------------------------------------------------------------
     Nested Loop  (cost=0.00..349.44 rows=2 width=72)
       ->  Seq Scan on t1  (cost=0.00..332.88 rows=2 width=36)
             Filter: ((b = 'aaa'::text) AND f_policy(a))
       ->  Index Scan using t2_pkey on t2  (cost=0.00..8.27 rows=1 width=36)
             Index Cond: (t2.x = t1.a)
    (5 rows)

Thanks,
-- 
KaiGai Kohei <kai...@ak.jp.nec.com>
*** a/src/backend/optimizer/plan/initsplan.c
--- b/src/backend/optimizer/plan/initsplan.c
***************
*** 45,57 **** static SpecialJoinInfo *make_outerjoininfo(PlannerInfo *root,
  				   Relids left_rels, Relids right_rels,
  				   Relids inner_join_rels,
  				   JoinType jointype, List *clause);
! static void distribute_qual_to_rels(PlannerInfo *root, Node *clause,
  						bool is_deduced,
  						bool below_outer_join,
  						JoinType jointype,
  						Relids qualscope,
  						Relids ojscope,
  						Relids outerjoin_nonnullable);
  static bool check_outerjoin_delay(PlannerInfo *root, Relids *relids_p,
  					  Relids *nullable_relids_p, bool is_pushed_down);
  static bool check_redundant_nullability_qual(PlannerInfo *root, Node *clause);
--- 45,60 ----
  				   Relids left_rels, Relids right_rels,
  				   Relids inner_join_rels,
  				   JoinType jointype, List *clause);
! static void distribute_qual_to_rels(PlannerInfo *root,
! 						Node *clause,
! 						Node *jtnode,
  						bool is_deduced,
  						bool below_outer_join,
  						JoinType jointype,
  						Relids qualscope,
  						Relids ojscope,
  						Relids outerjoin_nonnullable);
+ static Relids prevent_untrusted_pushdown(Node *jtnode, Relids relids, bool recursing);
  static bool check_outerjoin_delay(PlannerInfo *root, Relids *relids_p,
  					  Relids *nullable_relids_p, bool is_pushed_down);
  static bool check_redundant_nullability_qual(PlannerInfo *root, Node *clause);
***************
*** 334,340 **** deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
  		{
  			Node	   *qual = (Node *) lfirst(l);
  
! 			distribute_qual_to_rels(root, qual,
  									false, below_outer_join, JOIN_INNER,
  									*qualscope, NULL, NULL);
  		}
--- 337,343 ----
  		{
  			Node	   *qual = (Node *) lfirst(l);
  
! 			distribute_qual_to_rels(root, qual, jtnode,
  									false, below_outer_join, JOIN_INNER,
  									*qualscope, NULL, NULL);
  		}
***************
*** 457,463 **** deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
  		{
  			Node	   *qual = (Node *) lfirst(l);
  
! 			distribute_qual_to_rels(root, qual,
  									false, below_outer_join, j->jointype,
  									*qualscope,
  									ojscope, nonnullable_rels);
--- 460,466 ----
  		{
  			Node	   *qual = (Node *) lfirst(l);
  
! 			distribute_qual_to_rels(root, qual, jtnode,
  									false, below_outer_join, j->jointype,
  									*qualscope,
  									ojscope, nonnullable_rels);
***************
*** 728,733 **** make_outerjoininfo(PlannerInfo *root,
--- 731,737 ----
   *	  EquivalenceClasses.
   *
   * 'clause': the qual clause to be distributed
+  * 'jtnode': the join tree node where the 'clause' was chained in original.
   * 'is_deduced': TRUE if the qual came from implied-equality deduction
   * 'below_outer_join': TRUE if the qual is from a JOIN/ON that is below the
   *		nullable side of a higher-level outer join
***************
*** 748,754 **** make_outerjoininfo(PlannerInfo *root,
   * all and only those special joins that are syntactically below this qual.
   */
  static void
! distribute_qual_to_rels(PlannerInfo *root, Node *clause,
  						bool is_deduced,
  						bool below_outer_join,
  						JoinType jointype,
--- 752,760 ----
   * all and only those special joins that are syntactically below this qual.
   */
  static void
! distribute_qual_to_rels(PlannerInfo *root,
! 						Node *clause,
! 						Node *jtnode,
  						bool is_deduced,
  						bool below_outer_join,
  						JoinType jointype,
***************
*** 771,776 **** distribute_qual_to_rels(PlannerInfo *root, Node *clause,
--- 777,807 ----
  	relids = pull_varnos(clause);
  
  	/*
+ 	 * Prevent untrusted function calls to be pushed down to inside of
+ 	 * subqueries already pulled up; which might be views.
+ 	 * When a view is used to row-level security purpose, it may have
+ 	 * WHERE clause to filter out invisible tuples. However, functions
+ 	 * with side-effect allow us to leak contents of invisible tuples,
+ 	 * if user given qualifiers are evaluated prior to WHERE clause of
+ 	 * views.
+ 	 * For example, when a view contains A JOIN B and user's qualifier
+ 	 * takes arguments that reference columns of A, it prefers to be
+ 	 * distributed on scan plan of A. But it also means the qualifier
+ 	 * may be launched with arguments come from invisible tuples.
+ 	 * In this case, the contents of invisible tuples may be leaked
+ 	 * via user's qualifier which has side-effects, such as inserting
+ 	 * the arguments into other tables.
+ 	 * Hence, it needs to prevent untrusted function calls to be pushed
+ 	 * down to inside of subqueries.
+ 	 *
+ 	 * XXX - To minimize performance impact, it should be prevented
+ 	 * only when the clause contains 'untrusted' function calls.
+ 	 * But the algorithm to determine whether a certain function is
+ 	 * trusted, or not, is under discussion.
+ 	 */
+ 	relids = prevent_untrusted_pushdown(jtnode, relids, false);
+ 
+ 	/*
  	 * Cross-check: clause should contain no relids not within its scope.
  	 * Otherwise the parser messed up.
  	 */
***************
*** 1075,1080 **** distribute_qual_to_rels(PlannerInfo *root, Node *clause,
--- 1106,1169 ----
  }
  
  /*
+  * prevent_untrusted_pushdown
+  *
+  * It expands the given relids when it tries to reference a part of
+  * subqueries, to prevent unexpected pushing down due to the security
+  * reason.
+  */
+ static Relids
+ prevent_untrusted_pushdown(Node *jtnode, Relids relids, bool recursing)
+ {
+ 	if (!jtnode || IsA(jtnode, RangeTblRef))
+ 		return relids;
+ 
+ 	if (IsA(jtnode, FromExpr))
+ 	{
+ 		FromExpr   *f = (FromExpr *) jtnode;
+ 		ListCell   *l;
+ 		Relids		temp;
+ 
+ 		if (!recursing)
+ 		{
+ 			/*
+ 			 * The FromExpr in the top-level does not mean subqueries.
+ 			 * It is same level of the clause originally chained.
+ 			 */
+ 			foreach (l, f->fromlist)
+ 				relids = prevent_untrusted_pushdown(lfirst(l), relids, true);
+ 		}
+ 		else
+ 		{
+ 			/*
+ 			 * If we find out FromExpr except for the top-level join tree,
+ 			 * it means subqueries pulled up on the earlier stage.
+ 			 * In the case when the given relids references a part of
+ 			 * relations within the FromExpr, it shall be expanded to
+ 			 * whole of the subquery to prevent unexpected pushing-down.
+ 			 */
+ 			temp = get_relids_in_jointree(jtnode, false);
+ 
+ 			if (bms_overlap(relids, temp))
+ 				relids = bms_add_members(relids, temp);
+ 
+ 			bms_free(temp);
+ 		}
+ 	}
+ 	else if (IsA(jtnode, JoinExpr))
+ 	{
+ 		JoinExpr   *j = (JoinExpr *) jtnode;
+ 
+ 		relids = prevent_untrusted_pushdown(j->larg, relids, true);
+ 		relids = prevent_untrusted_pushdown(j->rarg, relids, true);
+ 	}
+ 	else
+ 		elog(ERROR, "unrecognized node type: %d", (int) nodeTag(jtnode));
+ 
+ 	return relids;
+ }
+ 
+ /*
   * check_outerjoin_delay
   *		Detect whether a qual referencing the given relids must be delayed
   *		in application due to the presence of a lower outer join, and/or
***************
*** 1357,1363 **** process_implied_equality(PlannerInfo *root,
  	/*
  	 * Push the new clause into all the appropriate restrictinfo lists.
  	 */
! 	distribute_qual_to_rels(root, (Node *) clause,
  							true, below_outer_join, JOIN_INNER,
  							qualscope, NULL, NULL);
  }
--- 1446,1452 ----
  	/*
  	 * Push the new clause into all the appropriate restrictinfo lists.
  	 */
! 	distribute_qual_to_rels(root, (Node *) clause, NULL,
  							true, below_outer_join, JOIN_INNER,
  							qualscope, NULL, NULL);
  }
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to