Re: [HACKERS] How does postgres store the join predicate for a relation in a given query

2017-10-15 Thread Ashutosh Bapat
On Sat, Oct 14, 2017 at 3:15 AM, Gourav Kumar  wrote:
> Why does have_relevant_joinclause() and have_relevant_eclass_joinclause()
> return true for all possible joins for the query given below.
> Even when they have no join predicate between them.
> e.g. join between ss1 & ws3, ss2 & ws3 etc.
>

The prologues of those functions and comments within those explain that.

/*
 * have_relevant_joinclause
 *  Detect whether there is a joinclause that involves
 *  the two given relations.
 *
 * Note: the joinclause does not have to be evaluable with only these two
 * relations.  This is intentional.  For example consider
 *  SELECT * FROM a, b, c WHERE a.x = (b.y + c.z)
 * If a is much larger than the other tables, it may be worthwhile to
 * cross-join b and c and then use an inner indexscan on a.x.  Therefore
 * we should consider this joinclause as reason to join b to c, even though
 * it can't be applied at that join step.
 */

/*
 * have_relevant_eclass_joinclause
 *  Detect whether there is an EquivalenceClass that could produce
 *  a joinclause involving the two given relations.
 *
 * This is essentially a very cut-down version of
 * generate_join_implied_equalities().  Note it's OK to occasionally say "yes"
 * incorrectly.  Hence we don't bother with details like whether the lack of a
 * cross-type operator might prevent the clause from actually being generated.
 */
May be you want to see whether those comments are applicable in your
case and also see how the callers handle the return values.


-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How does postgres store the join predicate for a relation in a given query

2017-10-13 Thread Gourav Kumar
Why does have_relevant_joinclause() and have_relevant_eclass_joinclause()
return true for all possible joins for the query given below.
Even when they have no join predicate between them.
e.g. join between ss1 & ws3, ss2 & ws3 etc.

The query is :
TPC-DS query 50

 -- query 50 in stream 0 using template query31.tpl
with ss as
 (select ca_county,d_qoy, d_year,sum(ss_ext_sales_price) as store_sales
 from store_sales,date_dim,customer_address
 where ss_sold_date_sk = d_date_sk
  and ss_addr_sk=ca_address_sk
 group by ca_county,d_qoy, d_year),
 ws as
 (select ca_county,d_qoy, d_year,sum(ws_ext_sales_price) as web_sales
 from web_sales,date_dim,customer_address
 where ws_sold_date_sk = d_date_sk
  and ws_bill_addr_sk=ca_address_sk
 group by ca_county,d_qoy, d_year)
 select /* tt */
ss1.ca_county
   ,ss1.d_year
   ,ws2.web_sales/ws1.web_sales web_q1_q2_increase
   ,ss2.store_sales/ss1.store_sales store_q1_q2_increase
   ,ws3.web_sales/ws2.web_sales web_q2_q3_increase
   ,ss3.store_sales/ss2.store_sales store_q2_q3_increase
 from
ss ss1
   ,ss ss2
   ,ss ss3
   ,ws ws1
   ,ws ws2
   ,ws ws3
 where
ss1.d_qoy = 1
and ss1.d_year = 2000
and ss1.ca_county = ss2.ca_county
and ss2.d_qoy = 2
and ss2.d_year = 2000
 and ss2.ca_county = ss3.ca_county
and ss3.d_qoy = 3
and ss3.d_year = 2000
and ss1.ca_county = ws1.ca_county
and ws1.d_qoy = 1
and ws1.d_year = 2000
and ws1.ca_county = ws2.ca_county
and ws2.d_qoy = 2
and ws2.d_year = 2000
and ws1.ca_county = ws3.ca_county
and ws3.d_qoy = 3
and ws3.d_year =2000
and case when ws1.web_sales > 0 then ws2.web_sales/ws1.web_sales else
null end
   > case when ss1.store_sales > 0 then ss2.store_sales/ss1.store_sales
else null end
and case when ws2.web_sales > 0 then ws3.web_sales/ws2.web_sales else
null end
   > case when ss2.store_sales > 0 then ss3.store_sales/ss2.store_sales
else null end
 order by web_q2_q3_increase;

-- end


On 13 October 2017 at 01:00, Gourav Kumar  wrote:

> Well for this given query it is possible. I haven't come across any such
> query yet.
>
> Possibly because I am more concerned about the TPCDS and TPCH benchmarks,
> where it's less likely to occur.
>
> On 13 October 2017 at 00:52, Tom Lane  wrote:
>
>> Gourav Kumar  writes:
>> > A Join clause/predicate will only mention 2 relations. It can't have 3
>> or
>> > more relations.
>>
>> Really?  What of, say,
>>
>> select ... from a,b,c where (a.x + b.y) = c.z;
>>
>> regards, tom lane
>>
>
>
>
> --
> Thanks,
> Gourav Kumar
> Computer Science and Automation
> Indian Institute of Science
>



-- 
Thanks,
Gourav Kumar
Computer Science and Automation
Indian Institute of Science


Re: [HACKERS] How does postgres store the join predicate for a relation in a given query

2017-10-12 Thread Gourav Kumar
Well for this given query it is possible. I haven't come across any such
query yet.

Possibly because I am more concerned about the TPCDS and TPCH benchmarks,
where it's less likely to occur.

On 13 October 2017 at 00:52, Tom Lane  wrote:

> Gourav Kumar  writes:
> > A Join clause/predicate will only mention 2 relations. It can't have 3 or
> > more relations.
>
> Really?  What of, say,
>
> select ... from a,b,c where (a.x + b.y) = c.z;
>
> regards, tom lane
>



-- 
Thanks,
Gourav Kumar
Computer Science and Automation
Indian Institute of Science


Re: [HACKERS] How does postgres store the join predicate for a relation in a given query

2017-10-12 Thread Tom Lane
Gourav Kumar  writes:
> A Join clause/predicate will only mention 2 relations. It can't have 3 or
> more relations.

Really?  What of, say,

select ... from a,b,c where (a.x + b.y) = c.z;

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How does postgres store the join predicate for a relation in a given query

2017-10-12 Thread Gourav Kumar
A Join clause/predicate will only mention 2 relations. It can't have 3 or
more relations.

On 12 October 2017 at 23:14, Tom Lane  wrote:

> Gourav Kumar  writes:
> > My objective is to construct join graph from a given query.
> > A join graph, has a node for each relation involved in a join, and an
> edge
> > between two relations if they share a join predicate among them.
>
> Hm, well, you could adapt the logic in have_relevant_joinclause() and
> have_relevant_eclass_joinclause().  Or maybe you could just use them
> as-is ... depends on what you have in mind to do with join clauses
> that mention 3 or more relations.
>
> regards, tom lane
>



-- 
Thanks,
Gourav Kumar
Computer Science and Automation
Indian Institute of Science


Re: [HACKERS] How does postgres store the join predicate for a relation in a given query

2017-10-12 Thread Tom Lane
Gourav Kumar  writes:
> My objective is to construct join graph from a given query.
> A join graph, has a node for each relation involved in a join, and an edge
> between two relations if they share a join predicate among them.

Hm, well, you could adapt the logic in have_relevant_joinclause() and
have_relevant_eclass_joinclause().  Or maybe you could just use them
as-is ... depends on what you have in mind to do with join clauses
that mention 3 or more relations.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How does postgres store the join predicate for a relation in a given query

2017-10-12 Thread Gourav Kumar
What is meant by "unstructured Join"?

Thanks,
Gourav

On 12 October 2017 at 22:47, Gourav Kumar  wrote:

> My objective is to construct join graph from a given query.
> A join graph, has a node for each relation involved in a join, and an edge
> between two relations if they share a join predicate among them.
>
> To do this I first tried to use the make_join_rel() function
>  - There I checked if they root->join->cur->level is 2, just write the
> relation names to a file.
>  - But this strategy failed, because if there is somewhere a Cartesian
> product among two relations, then they can't have an edge in the join graph.
>  - So, along with writing the relation name, I need to know if they
> share a join predicate among them or not.
>
>
> On 12 October 2017 at 22:08, Tom Lane  wrote:
>
>> Gourav Kumar  writes:
>> > I have the RelOptInfo data structure for the relations which are to be
>> > joined but when I check their joininfo, it is empty.
>>
>> You aren't telling us anything much about the case you're studying,
>> but if the join clauses have the form of equality comparisons, they
>> likely got converted into EquivalenceClass data structures instead.
>> These days the joininfo lists only contain "unstructured" join
>> conditions.
>>
>> regards, tom lane
>>
>
>
>
> --
> Thanks,
> Gourav Kumar
> Computer Science and Automation
> Indian Institute of Science
>



-- 
Thanks,
Gourav Kumar
Computer Science and Automation
Indian Institute of Science


Re: [HACKERS] How does postgres store the join predicate for a relation in a given query

2017-10-12 Thread Gourav Kumar
My objective is to construct join graph from a given query.
A join graph, has a node for each relation involved in a join, and an edge
between two relations if they share a join predicate among them.

To do this I first tried to use the make_join_rel() function
 - There I checked if they root->join->cur->level is 2, just write the
relation names to a file.
 - But this strategy failed, because if there is somewhere a Cartesian
product among two relations, then they can't have an edge in the join graph.
 - So, along with writing the relation name, I need to know if they
share a join predicate among them or not.


On 12 October 2017 at 22:08, Tom Lane  wrote:

> Gourav Kumar  writes:
> > I have the RelOptInfo data structure for the relations which are to be
> > joined but when I check their joininfo, it is empty.
>
> You aren't telling us anything much about the case you're studying,
> but if the join clauses have the form of equality comparisons, they
> likely got converted into EquivalenceClass data structures instead.
> These days the joininfo lists only contain "unstructured" join
> conditions.
>
> regards, tom lane
>



-- 
Thanks,
Gourav Kumar
Computer Science and Automation
Indian Institute of Science


Re: [HACKERS] How does postgres store the join predicate for a relation in a given query

2017-10-12 Thread Tom Lane
Gourav Kumar  writes:
> I have the RelOptInfo data structure for the relations which are to be
> joined but when I check their joininfo, it is empty.

You aren't telling us anything much about the case you're studying,
but if the join clauses have the form of equality comparisons, they
likely got converted into EquivalenceClass data structures instead.
These days the joininfo lists only contain "unstructured" join
conditions.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How does postgres store the join predicate for a relation in a given query

2017-10-12 Thread Gourav Kumar
Hi Ashutosh,

I have the RelOptInfo data structure for the relations which are to be
joined but when I check their joininfo, it is empty.
Does baserestrictinfo contains base predicates ?

Thanks
Gourav.

On 11 October 2017 at 12:00, Ashutosh Bapat  wrote:

> On Tue, Oct 10, 2017 at 7:29 PM, Gourav Kumar 
> wrote:
> > Hi all,
> >
> > When you fire a query in postgresql, it will first parse the query and
> > create the data structures for storing various aspects of the query and
> > executing the query. (Like RangeTblEntry, PlannerInfo, RangeOptInfo
> etc.).
> >
> > I want to know how does postgresql stores the join predicates of a query.
> > Like which data structure is used to store the join predicates.
> >
> > How can we find the join predicates applied on a relation from relid,
> Oid or
> > RangeTblEntry ?
> >
>
> Every relation has a RelOptInfo associated with it. Predicates
> applicable to it are stored in this RelOptInfo as a list. For base
> relations (simple tables) it's in baserestrictinfo. The join
> predicates applicable are in joininfo. You can get RelOptInfo of a
> given simple table using find_base_rel().
>
> HTH.
>
>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
>


Re: [HACKERS] How does postgres store the join predicate for a relation in a given query

2017-10-11 Thread Ashutosh Bapat
On Tue, Oct 10, 2017 at 7:29 PM, Gourav Kumar  wrote:
> Hi all,
>
> When you fire a query in postgresql, it will first parse the query and
> create the data structures for storing various aspects of the query and
> executing the query. (Like RangeTblEntry, PlannerInfo, RangeOptInfo etc.).
>
> I want to know how does postgresql stores the join predicates of a query.
> Like which data structure is used to store the join predicates.
>
> How can we find the join predicates applied on a relation from relid, Oid or
> RangeTblEntry ?
>

Every relation has a RelOptInfo associated with it. Predicates
applicable to it are stored in this RelOptInfo as a list. For base
relations (simple tables) it's in baserestrictinfo. The join
predicates applicable are in joininfo. You can get RelOptInfo of a
given simple table using find_base_rel().

HTH.


-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How does postgres store the join predicate for a relation in a given query

2017-10-10 Thread Nico Williams
On Tue, Oct 10, 2017 at 07:29:24PM +0530, Gourav Kumar wrote:
> When you fire a query in postgresql, it will first parse the query and
> create the data structures for storing various aspects of the query and
> executing the query. (Like RangeTblEntry, PlannerInfo, RangeOptInfo etc.).
> 
> I want to know how does postgresql stores the join predicates of a query.
> Like which data structure is used to store the join predicates.
> 
> How can we find the join predicates applied on a relation from relid, Oid
> or RangeTblEntry ?
> 
> I want to construct a join graph for a given query, for which I need the
> join predicates between two relations.

In the usingClause or quals fields of a JoinExpr.  See
src/backend/parser/gram.y, search for join_qual.

Of course, WHERE clauses have to be inspected as well, which go into the
whereClause of of a SelectStmt; search for where_clause in
src/backend/parser/gram.y.

Nico
-- 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] How does postgres store the join predicate for a relation in a given query

2017-10-10 Thread Gourav Kumar
Hi all,

When you fire a query in postgresql, it will first parse the query and
create the data structures for storing various aspects of the query and
executing the query. (Like RangeTblEntry, PlannerInfo, RangeOptInfo etc.).

I want to know how does postgresql stores the join predicates of a query.
Like which data structure is used to store the join predicates.

How can we find the join predicates applied on a relation from relid, Oid
or RangeTblEntry ?

I want to construct a join graph for a given query, for which I need the
join predicates between two relations.

-- 
Thanks,
Gourav Kumar