Re: SPARK-13900 - Join with simple OR conditions take too long

2016-04-05 Thread Mich Talebzadeh
gt;>|   |   |   |   |  With LRU Buffer Replacement Strategy for data
>> pages.
>>
>> Total estimated I/O cost for statement 4 (at line 4): 2147483647.
>>
>> HTH
>>
>> Dr Mich Talebzadeh
>>
>>
>>
>> LinkedIn * 
>> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>> <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>>
>>
>>
>> http://talebzadehmich.wordpress.com
>>
>>
>>
>> On 1 April 2016 at 13:19, ashokkumar rajendran <
>> ashokkumar.rajend...@gmail.com> wrote:
>>
>>> I agree with Hemant's comment. But it does not give good results for
>>> simple usecases like 2 OR conditions. Ultimately we need good results from
>>> Spark for end users. shall we consider this as a request to support SQL
>>> hints then? Is there any plan to support SQL hint in Spark in upcoming
>>> release?
>>>
>>> Regards
>>> Ashok
>>>
>>> On Fri, Apr 1, 2016 at 5:04 PM, Robin East 
>>> wrote:
>>>
>>>> Yes and even today CBO (e.g. in Oracle) will still require hints in
>>>> some cases so I think it is more like:
>>>>
>>>> RBO -> RBO + Hints -> CBO + Hints. Most relational databases meet
>>>> significant numbers of corner cases where CBO plans simply don’t do what
>>>> you would want. I don’t know enough about Spark SQL to comment on whether
>>>> the same problems would afflict Spark.
>>>>
>>>>
>>>>
>>>>
>>>> On 31 Mar 2016, at 15:54, Yong Zhang  wrote:
>>>>
>>>> I agree that there won't be a generic solution for these kind of cases.
>>>>
>>>> Without the CBO from Spark or Hadoop ecosystem in short future, maybe
>>>> Spark DataFrame/SQL should support more hints from the end user, as in
>>>> these cases, end users will be smart enough to tell the engine what is the
>>>> correct way to do.
>>>>
>>>> Weren't the relational DBs doing exactly same path? RBO -> RBO + Hints
>>>> -> CBO?
>>>>
>>>> Yong
>>>>
>>>> --
>>>> Date: Thu, 31 Mar 2016 16:07:14 +0530
>>>> Subject: Re: SPARK-13900 - Join with simple OR conditions take too long
>>>> From: hemant9...@gmail.com
>>>> To: ashokkumar.rajend...@gmail.com
>>>> CC: user@spark.apache.org
>>>>
>>>> Hi Ashok,
>>>>
>>>> That's interesting.
>>>>
>>>> As I understand, on table A and B, a nested loop join (that will
>>>> produce m X n rows) is performed and than each row is evaluated to see if
>>>> any of the condition is met. You are asking that Spark should instead do a
>>>> BroadcastHashJoin on the equality conditions in parallel and then union the
>>>> results like you are doing in a different query.
>>>>
>>>> If we leave aside parallelism for a moment, theoretically, time taken
>>>> for nested loop join would vary little when the number of conditions are
>>>> increased while the time taken for the solution that you are suggesting
>>>> would increase linearly with number of conditions. So, when number of
>>>> conditions are too many, nested loop join would be faster than the solution
>>>> that you suggest. Now the question is, how should Spark decide when to do
>>>> what?
>>>>
>>>>
>>>> Hemant Bhanawat <https://www.linkedin.com/in/hemant-bhanawat-92a3811>
>>>> www.snappydata.io
>>>>
>>>> On Thu, Mar 31, 2016 at 2:28 PM, ashokkumar rajendran <
>>>> ashokkumar.rajend...@gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I have filed ticket SPARK-13900. There was an initial reply from a
>>>> developer but did not get any reply on this. How can we do multiple hash
>>>> joins together for OR conditions based joins? Could someone please guide on
>>>> how can we fix this?
>>>>
>>>> Regards
>>>> Ashok
>>>>
>>>>
>>>>
>>>
>>
>


Re: SPARK-13900 - Join with simple OR conditions take too long

2016-04-05 Thread ashokkumar rajendran
>> Yes and even today CBO (e.g. in Oracle) will still require hints in some
>>> cases so I think it is more like:
>>>
>>> RBO -> RBO + Hints -> CBO + Hints. Most relational databases meet
>>> significant numbers of corner cases where CBO plans simply don’t do what
>>> you would want. I don’t know enough about Spark SQL to comment on whether
>>> the same problems would afflict Spark.
>>>
>>>
>>>
>>>
>>> On 31 Mar 2016, at 15:54, Yong Zhang  wrote:
>>>
>>> I agree that there won't be a generic solution for these kind of cases.
>>>
>>> Without the CBO from Spark or Hadoop ecosystem in short future, maybe
>>> Spark DataFrame/SQL should support more hints from the end user, as in
>>> these cases, end users will be smart enough to tell the engine what is the
>>> correct way to do.
>>>
>>> Weren't the relational DBs doing exactly same path? RBO -> RBO + Hints
>>> -> CBO?
>>>
>>> Yong
>>>
>>> --
>>> Date: Thu, 31 Mar 2016 16:07:14 +0530
>>> Subject: Re: SPARK-13900 - Join with simple OR conditions take too long
>>> From: hemant9...@gmail.com
>>> To: ashokkumar.rajend...@gmail.com
>>> CC: user@spark.apache.org
>>>
>>> Hi Ashok,
>>>
>>> That's interesting.
>>>
>>> As I understand, on table A and B, a nested loop join (that will produce
>>> m X n rows) is performed and than each row is evaluated to see if any of
>>> the condition is met. You are asking that Spark should instead do a
>>> BroadcastHashJoin on the equality conditions in parallel and then union the
>>> results like you are doing in a different query.
>>>
>>> If we leave aside parallelism for a moment, theoretically, time taken
>>> for nested loop join would vary little when the number of conditions are
>>> increased while the time taken for the solution that you are suggesting
>>> would increase linearly with number of conditions. So, when number of
>>> conditions are too many, nested loop join would be faster than the solution
>>> that you suggest. Now the question is, how should Spark decide when to do
>>> what?
>>>
>>>
>>> Hemant Bhanawat <https://www.linkedin.com/in/hemant-bhanawat-92a3811>
>>> www.snappydata.io
>>>
>>> On Thu, Mar 31, 2016 at 2:28 PM, ashokkumar rajendran <
>>> ashokkumar.rajend...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> I have filed ticket SPARK-13900. There was an initial reply from a
>>> developer but did not get any reply on this. How can we do multiple hash
>>> joins together for OR conditions based joins? Could someone please guide on
>>> how can we fix this?
>>>
>>> Regards
>>> Ashok
>>>
>>>
>>>
>>
>


Re: SPARK-13900 - Join with simple OR conditions take too long

2016-04-04 Thread Mich Talebzadeh
Actually this may not be a bug. It just the Optimizer decides to do a
nested loop join over Hash Join when more that two OR joins are involved

With one equality predicate Hash JOin is chosen

4> SELECT COUNT(SALES.PROD_ID) from SALES, SALES2
5> WHERE SALES.CUST_ID = SALES2.CUST_ID
6> go
QUERY PLAN FOR STATEMENT 1 (at line 4).
Optimized using Parallel Mode

STEP 1
The type of query is SELECT.
4 operator(s) under root
   |ROOT:EMIT Operator (VA = 4)
   |
   |   |SCALAR AGGREGATE Operator (VA = 3)
   |   |  Evaluate Ungrouped COUNT AGGREGATE.
   |   |
   |   |   |*HASH JOIN Operator* (VA = 2) (Join Type: Inner Join)
   |   |   | Using Worktable1 for internal storage.
   |   |   |  Key Count: 1
   |   |   |
   |   |   |   |SCAN Operator (VA = 0)
   |   |   |   |  FROM TABLE
   |   |   |   |  SALES2
   |   |   |   |  Table Scan.
   |   |   |   |  Forward Scan.
   |   |   |   |  Positioning at start of table.
   |   |   |   |  Using I/O Size 64 Kbytes for data pages.
   |   |   |   |  With LRU Buffer Replacement Strategy for data pages.
   |   |   |
   |   |   |   |SCAN Operator (VA = 1)
   |   |   |   |  FROM TABLE
   |   |   |   |  SALES
   |   |   |   |  Table Scan.
   |   |   |   |  Forward Scan.
   |   |   |   |  Positioning at start of table.
   |   |   |   |  Using I/O Size 64 Kbytes for data pages.
   |   |   |   |  With MRU Buffer Replacement Strategy for data pages.

Total estimated I/O cost for statement 1 (at line 4): 783206.

Now if I chose two predicates it reverts to Nested Loop Join

4> SELECT COUNT(SALES.PROD_ID) from SALES, SALES2
5> WHERE SALES.CUST_ID = SALES2.CUST_ID
6> OR
7> SALES.TIME_ID = SALES2.TIME_ID
8> go

STEP 1
The type of query is SET OPTION ON.
Total estimated I/O cost for statement 3 (at line 3): 0.

QUERY PLAN FOR STATEMENT 4 (at line 4).
Optimized using Parallel Mode

STEP 1
The type of query is SELECT.
5 operator(s) under root
   |ROOT:EMIT Operator (VA = 5)
   |
   |   |SCALAR AGGREGATE Operator (VA = 4)
   |   |  Evaluate Ungrouped COUNT AGGREGATE.
   |   |
   |   |   |*NESTED LOOP JOIN* Operator (VA = 3) (Join Type: Inner Join)
   |   |   |
   |   |   |   |SCAN Operator (VA = 0)
   |   |   |   |  FROM TABLE
   |   |   |   |  SALES
   |   |   |   |  Table Scan.
   |   |   |   |  Forward Scan.
   |   |   |   |  Positioning at start of table.
   |   |   |   |  Using I/O Size 64 Kbytes for data pages.
   |   |   |   |  With MRU Buffer Replacement Strategy for data pages.
   |   |   |
   |   |   |   |RESTRICT Operator (VA = 2)(0)(0)(0)(7)(0)
   |   |   |   |
   |   |   |   |   |SCAN Operator (VA = 1)
   |   |   |   |   |  FROM TABLE
   |   |   |   |   |  SALES2
   |   |   |   |   |  Table Scan.
   |   |   |   |   |  Forward Scan.
   |   |   |   |   |  Positioning at start of table.
   |   |   |   |   |  Using I/O Size 64 Kbytes for data pages.
   |   |   |   |   |  With LRU Buffer Replacement Strategy for data
pages.

Total estimated I/O cost for statement 4 (at line 4): 2147483647.

HTH

Dr Mich Talebzadeh



LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
<https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*



http://talebzadehmich.wordpress.com



On 1 April 2016 at 13:19, ashokkumar rajendran <
ashokkumar.rajend...@gmail.com> wrote:

> I agree with Hemant's comment. But it does not give good results for
> simple usecases like 2 OR conditions. Ultimately we need good results from
> Spark for end users. shall we consider this as a request to support SQL
> hints then? Is there any plan to support SQL hint in Spark in upcoming
> release?
>
> Regards
> Ashok
>
> On Fri, Apr 1, 2016 at 5:04 PM, Robin East  wrote:
>
>> Yes and even today CBO (e.g. in Oracle) will still require hints in some
>> cases so I think it is more like:
>>
>> RBO -> RBO + Hints -> CBO + Hints. Most relational databases meet
>> significant numbers of corner cases where CBO plans simply don’t do what
>> you would want. I don’t know enough about Spark SQL to comment on whether
>> the same problems would afflict Spark.
>>
>>
>>
>>
>> On 31 Mar 2016, at 15:54, Yong Zhang  wrote:
>>
>> I agree that there won't be a generic solution for these kind of cases.
>>
>> Without the CBO from Spark or Hadoop ecosystem in short future, maybe
>> Spark DataFrame/SQL should support more hints from the end user, as in
>> these cases, end users will be smart enough to tell the engine what is the
>> correct way to do.
>>
>> Weren't the relational DBs doing exactly same path? RBO -&g

Re: SPARK-13900 - Join with simple OR conditions take too long

2016-04-01 Thread ashokkumar rajendran
I agree with Hemant's comment. But it does not give good results for simple
usecases like 2 OR conditions. Ultimately we need good results from Spark
for end users. shall we consider this as a request to support SQL hints
then? Is there any plan to support SQL hint in Spark in upcoming release?

Regards
Ashok

On Fri, Apr 1, 2016 at 5:04 PM, Robin East  wrote:

> Yes and even today CBO (e.g. in Oracle) will still require hints in some
> cases so I think it is more like:
>
> RBO -> RBO + Hints -> CBO + Hints. Most relational databases meet
> significant numbers of corner cases where CBO plans simply don’t do what
> you would want. I don’t know enough about Spark SQL to comment on whether
> the same problems would afflict Spark.
>
>
>
>
> On 31 Mar 2016, at 15:54, Yong Zhang  wrote:
>
> I agree that there won't be a generic solution for these kind of cases.
>
> Without the CBO from Spark or Hadoop ecosystem in short future, maybe
> Spark DataFrame/SQL should support more hints from the end user, as in
> these cases, end users will be smart enough to tell the engine what is the
> correct way to do.
>
> Weren't the relational DBs doing exactly same path? RBO -> RBO + Hints ->
> CBO?
>
> Yong
>
> ----------
> Date: Thu, 31 Mar 2016 16:07:14 +0530
> Subject: Re: SPARK-13900 - Join with simple OR conditions take too long
> From: hemant9...@gmail.com
> To: ashokkumar.rajend...@gmail.com
> CC: user@spark.apache.org
>
> Hi Ashok,
>
> That's interesting.
>
> As I understand, on table A and B, a nested loop join (that will produce m
> X n rows) is performed and than each row is evaluated to see if any of the
> condition is met. You are asking that Spark should instead do a
> BroadcastHashJoin on the equality conditions in parallel and then union the
> results like you are doing in a different query.
>
> If we leave aside parallelism for a moment, theoretically, time taken for
> nested loop join would vary little when the number of conditions are
> increased while the time taken for the solution that you are suggesting
> would increase linearly with number of conditions. So, when number of
> conditions are too many, nested loop join would be faster than the solution
> that you suggest. Now the question is, how should Spark decide when to do
> what?
>
>
> Hemant Bhanawat <https://www.linkedin.com/in/hemant-bhanawat-92a3811>
> www.snappydata.io
>
> On Thu, Mar 31, 2016 at 2:28 PM, ashokkumar rajendran <
> ashokkumar.rajend...@gmail.com> wrote:
>
> Hi,
>
> I have filed ticket SPARK-13900. There was an initial reply from a
> developer but did not get any reply on this. How can we do multiple hash
> joins together for OR conditions based joins? Could someone please guide on
> how can we fix this?
>
> Regards
> Ashok
>
>
>


Re: SPARK-13900 - Join with simple OR conditions take too long

2016-04-01 Thread Robin East
Yes and even today CBO (e.g. in Oracle) will still require hints in some cases 
so I think it is more like:

RBO -> RBO + Hints -> CBO + Hints. Most relational databases meet significant 
numbers of corner cases where CBO plans simply don’t do what you would want. I 
don’t know enough about Spark SQL to comment on whether the same problems would 
afflict Spark.




> On 31 Mar 2016, at 15:54, Yong Zhang  wrote:
> 
> I agree that there won't be a generic solution for these kind of cases.
> 
> Without the CBO from Spark or Hadoop ecosystem in short future, maybe Spark 
> DataFrame/SQL should support more hints from the end user, as in these cases, 
> end users will be smart enough to tell the engine what is the correct way to 
> do.
> 
> Weren't the relational DBs doing exactly same path? RBO -> RBO + Hints -> CBO?
> 
> Yong
> 
> Date: Thu, 31 Mar 2016 16:07:14 +0530
> Subject: Re: SPARK-13900 - Join with simple OR conditions take too long
> From: hemant9...@gmail.com <mailto:hemant9...@gmail.com>
> To: ashokkumar.rajend...@gmail.com <mailto:ashokkumar.rajend...@gmail.com>
> CC: user@spark.apache.org <mailto:user@spark.apache.org>
> 
> Hi Ashok,
> 
> That's interesting. 
> 
> As I understand, on table A and B, a nested loop join (that will produce m X 
> n rows) is performed and than each row is evaluated to see if any of the 
> condition is met. You are asking that Spark should instead do a 
> BroadcastHashJoin on the equality conditions in parallel and then union the 
> results like you are doing in a different query. 
> 
> If we leave aside parallelism for a moment, theoretically, time taken for 
> nested loop join would vary little when the number of conditions are 
> increased while the time taken for the solution that you are suggesting would 
> increase linearly with number of conditions. So, when number of conditions 
> are too many, nested loop join would be faster than the solution that you 
> suggest. Now the question is, how should Spark decide when to do what? 
> 
> 
> Hemant Bhanawat <https://www.linkedin.com/in/hemant-bhanawat-92a3811>
> www.snappydata.io <http://www.snappydata.io/> 
> 
> On Thu, Mar 31, 2016 at 2:28 PM, ashokkumar rajendran 
> mailto:ashokkumar.rajend...@gmail.com>> 
> wrote:
> Hi,
> 
> I have filed ticket SPARK-13900. There was an initial reply from a developer 
> but did not get any reply on this. How can we do multiple hash joins together 
> for OR conditions based joins? Could someone please guide on how can we fix 
> this? 
> 
> Regards
> Ashok



Re: SPARK-13900 - Join with simple OR conditions take too long

2016-04-01 Thread Hemant Bhanawat
As Mich has already noticed, Spark defaults to NL join if there are more
than one condition. Oracle is probably doing cost-based optimizations in
this scenario. You can call it a bug but in my opinion it is an area where
Spark is still evolving.

>> Hemant has mentioned the nested loop time will be very little.
I had mentioned that NL time will *vary *little with more number of
conditions.  What I meant was that instead of 3 conditions if you would
have 15 conditions, the NL loop would still take 13-15 mins while the hash
join would take more than that.

Hemant

Hemant Bhanawat 
www.snappydata.io

On Fri, Apr 1, 2016 at 3:08 PM, ashokkumar rajendran <
ashokkumar.rajend...@gmail.com> wrote:

> Hi Mich,
>
> Thanks for the input.
>
> Yes, it seems to be a bug. Is it possible to fix this in next release?
>
> Regards
> Ashok
>
> On Fri, Apr 1, 2016 at 2:06 PM, Mich Talebzadeh  > wrote:
>
>> hm.
>>
>> Sounds like it ends up in Nested Loop Join (NLJ) as opposed to Hash Join
>> (HJ) when OR  is used for more than one predicate comparison.
>>
>> In below I have a table dummy created as ORC with 1 billion rows. Just
>> created another one called dummy1 with 60K rows
>>
>> A simple join results in Hash Join good!
>>
>> scala> sql("select d.id, d1.id from dummy d, dummy2 d1 where
>> d.random_string = d1.random_string").explain(true)
>>
>> == Physical Plan ==
>> Project [id#212,id#219]
>>
>> *+- BroadcastHashJoin [random_string#216], [random_string#223],
>> BuildRight*   :- ConvertToUnsafe
>>:  +- HiveTableScan [id#212,random_string#216], MetastoreRelation
>> test, dummy, Some(d)
>>+- ConvertToUnsafe
>>   +- HiveTableScan [id#219,random_string#223], MetastoreRelation
>> test, dummy2, Some(d1)
>>
>> When the join is done using OR on other predicates I see it starts doing
>> NLJ
>>
>> scala> sql("select d.id, d1.id from dummy d, dummy2 d1 where
>> d.random_string = d1.random_string OR d.small_vc =
>> d1.small_vc").explain(true)
>>
>> == Physical Plan ==
>> Project [id#241,id#248]
>> +- B*roadcastNestedLoopJoin *BuildRight, Inner, Some(((random_string#245
>> = random_string#252) || (small_vc#246 = small_vc#253)))
>>:- HiveTableScan [small_vc#246,id#241,random_string#245],
>> MetastoreRelation test, dummy, Some(d)
>>+- HiveTableScan [id#248,random_string#252,small_vc#253],
>> MetastoreRelation test, dummy2, Some(d1)
>>
>> in contrast the same identical tables in Oracle use Hash Join with OR
>> which is expected
>>
>> scratch...@mydb.mich.LOCAL> select d.id, d1.id from dummy d, dummy2 d1
>> where d.random_string = d1.random_string OR d.small_vc = d1.small_vc;
>>
>> Execution Plan
>> --
>> Plan hash value: 4163534687
>>
>> --
>> | Id  | Operation   | Name   | Rows  | Bytes |TempSpc| Cost
>> (%CPU)| Time |
>>
>> --
>> |   0 | SELECT STATEMENT|| 63207 |  8332K|   |  1280K
>> (1)| 04:16:05 |
>> |   1 |  CONCATENATION  ||   |   |   |
>> |  |
>> |*  2 |  * HASH JOIN *|| 60183 |  7934K|  4632K|   640K
>> (1)| 02:08:03 |
>> |   3 |TABLE ACCESS FULL| DUMMY2 | 6 |  3925K|   |   157
>> (1)| 00:00:02 |
>> |   4 |TABLE ACCESS FULL| DUMMY  |   100M|  6484M|   |   261K
>> (1)| 00:52:13 |
>> |*  5 |   *HASH JOIN *||  3024 |   398K|  4632K|   640K
>> (1)| 02:08:03 |
>> |   6 |TABLE ACCESS FULL| DUMMY2 | 6 |  3925K|   |   157
>> (1)| 00:00:02 |
>> |   7 |TABLE ACCESS FULL| DUMMY  |   100M|  6484M|   |   261K
>> (1)| 00:52:13 |
>>
>> --
>>
>> So this looks like a bug!
>>
>>
>>
>> Dr Mich Talebzadeh
>>
>>
>>
>> LinkedIn * 
>> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>> *
>>
>>
>>
>> http://talebzadehmich.wordpress.com
>>
>>
>>
>> On 1 April 2016 at 04:53, ashokkumar rajendran <
>> ashokkumar.rajend...@gmail.com> wrote:
>>
>>> Thanks for the reply everyone.
>>>
>>> Let me provide more detail on the dataset as well.
>>> 1. The big table (A) contains more than 3 billion records in parquet
>>> format, which is few TBs.
>>> 2. The second table (B) is only of 60K rows which is less than 10MB.
>>> 3. The column on which I perform JOIN is mostly on the String datatype
>>> columns.
>>> 4. I used 20 machines that were of 16 cores each and 120GB RAMs for
>>> testing this.
>>>
>>> The pseudo OR query is as below.
>>>
>>> sql(Select field1, field2 from A, B where A.dimension1 = B.dimension1 OR 
>>> A.dimension2 = B.dimension2 OR A.dimension3 = B.dimension3 OR A.dimension4 
>>> = B.dimension4).explain(true)
>>>
>>>
>>> Pseudo union qu

Re: SPARK-13900 - Join with simple OR conditions take too long

2016-04-01 Thread ashokkumar rajendran
Hi Mich,

Thanks for the input.

Yes, it seems to be a bug. Is it possible to fix this in next release?

Regards
Ashok

On Fri, Apr 1, 2016 at 2:06 PM, Mich Talebzadeh 
wrote:

> hm.
>
> Sounds like it ends up in Nested Loop Join (NLJ) as opposed to Hash Join
> (HJ) when OR  is used for more than one predicate comparison.
>
> In below I have a table dummy created as ORC with 1 billion rows. Just
> created another one called dummy1 with 60K rows
>
> A simple join results in Hash Join good!
>
> scala> sql("select d.id, d1.id from dummy d, dummy2 d1 where
> d.random_string = d1.random_string").explain(true)
>
> == Physical Plan ==
> Project [id#212,id#219]
>
> *+- BroadcastHashJoin [random_string#216], [random_string#223], BuildRight*
> :- ConvertToUnsafe
>:  +- HiveTableScan [id#212,random_string#216], MetastoreRelation test,
> dummy, Some(d)
>+- ConvertToUnsafe
>   +- HiveTableScan [id#219,random_string#223], MetastoreRelation test,
> dummy2, Some(d1)
>
> When the join is done using OR on other predicates I see it starts doing
> NLJ
>
> scala> sql("select d.id, d1.id from dummy d, dummy2 d1 where
> d.random_string = d1.random_string OR d.small_vc =
> d1.small_vc").explain(true)
>
> == Physical Plan ==
> Project [id#241,id#248]
> +- B*roadcastNestedLoopJoin *BuildRight, Inner, Some(((random_string#245
> = random_string#252) || (small_vc#246 = small_vc#253)))
>:- HiveTableScan [small_vc#246,id#241,random_string#245],
> MetastoreRelation test, dummy, Some(d)
>+- HiveTableScan [id#248,random_string#252,small_vc#253],
> MetastoreRelation test, dummy2, Some(d1)
>
> in contrast the same identical tables in Oracle use Hash Join with OR
> which is expected
>
> scratch...@mydb.mich.LOCAL> select d.id, d1.id from dummy d, dummy2 d1
> where d.random_string = d1.random_string OR d.small_vc = d1.small_vc;
>
> Execution Plan
> --
> Plan hash value: 4163534687
>
> --
> | Id  | Operation   | Name   | Rows  | Bytes |TempSpc| Cost
> (%CPU)| Time |
>
> --
> |   0 | SELECT STATEMENT|| 63207 |  8332K|   |  1280K
> (1)| 04:16:05 |
> |   1 |  CONCATENATION  ||   |   |   |
> |  |
> |*  2 |  * HASH JOIN *|| 60183 |  7934K|  4632K|   640K
> (1)| 02:08:03 |
> |   3 |TABLE ACCESS FULL| DUMMY2 | 6 |  3925K|   |   157
> (1)| 00:00:02 |
> |   4 |TABLE ACCESS FULL| DUMMY  |   100M|  6484M|   |   261K
> (1)| 00:52:13 |
> |*  5 |   *HASH JOIN *||  3024 |   398K|  4632K|   640K
> (1)| 02:08:03 |
> |   6 |TABLE ACCESS FULL| DUMMY2 | 6 |  3925K|   |   157
> (1)| 00:00:02 |
> |   7 |TABLE ACCESS FULL| DUMMY  |   100M|  6484M|   |   261K
> (1)| 00:52:13 |
>
> --
>
> So this looks like a bug!
>
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn * 
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> *
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
> On 1 April 2016 at 04:53, ashokkumar rajendran <
> ashokkumar.rajend...@gmail.com> wrote:
>
>> Thanks for the reply everyone.
>>
>> Let me provide more detail on the dataset as well.
>> 1. The big table (A) contains more than 3 billion records in parquet
>> format, which is few TBs.
>> 2. The second table (B) is only of 60K rows which is less than 10MB.
>> 3. The column on which I perform JOIN is mostly on the String datatype
>> columns.
>> 4. I used 20 machines that were of 16 cores each and 120GB RAMs for
>> testing this.
>>
>> The pseudo OR query is as below.
>>
>> sql(Select field1, field2 from A, B where A.dimension1 = B.dimension1 OR 
>> A.dimension2 = B.dimension2 OR A.dimension3 = B.dimension3 OR A.dimension4 = 
>> B.dimension4).explain(true)
>>
>>
>> Pseudo union query is as below.
>>
>> sql{Select field1, field2 from A, B where A.dimension1 = B.dimension1
>> UNION ALL
>> Select field1, field2 from A, B where A.dimension2 = B.dimension2
>> UNION ALL
>> Select field1, field2 from A, B where A.dimension3 = B.dimension3
>> UNION ALL
>> Select field1, field2 from A, B where A.dimension4 = 
>> B.dimension4}.explain(true)
>>
>> You can look at the explain plan in the ticket
>> https://issues.apache.org/jira/browse/SPARK-13900
>>
>> Hemant has mentioned the nested loop time will be very little. But the
>> time taken by inner query for this kind of join is too long. Union of 3
>> HashJoins take only 3 minutes (each dimension hashjoin takes 1 minute),
>> nested loop join takes nearly 13 mins. I agree that the performance of
>> HashedJoin on OR condition will be linear but that will be very optimized
>> comparing to the nested loop join.
>>
>> As

Re: SPARK-13900 - Join with simple OR conditions take too long

2016-04-01 Thread Mich Talebzadeh
hm.

Sounds like it ends up in Nested Loop Join (NLJ) as opposed to Hash Join
(HJ) when OR  is used for more than one predicate comparison.

In below I have a table dummy created as ORC with 1 billion rows. Just
created another one called dummy1 with 60K rows

A simple join results in Hash Join good!

scala> sql("select d.id, d1.id from dummy d, dummy2 d1 where
d.random_string = d1.random_string").explain(true)

== Physical Plan ==
Project [id#212,id#219]

*+- BroadcastHashJoin [random_string#216], [random_string#223], BuildRight*
:- ConvertToUnsafe
   :  +- HiveTableScan [id#212,random_string#216], MetastoreRelation test,
dummy, Some(d)
   +- ConvertToUnsafe
  +- HiveTableScan [id#219,random_string#223], MetastoreRelation test,
dummy2, Some(d1)

When the join is done using OR on other predicates I see it starts doing NLJ

scala> sql("select d.id, d1.id from dummy d, dummy2 d1 where
d.random_string = d1.random_string OR d.small_vc =
d1.small_vc").explain(true)

== Physical Plan ==
Project [id#241,id#248]
+- B*roadcastNestedLoopJoin *BuildRight, Inner, Some(((random_string#245 =
random_string#252) || (small_vc#246 = small_vc#253)))
   :- HiveTableScan [small_vc#246,id#241,random_string#245],
MetastoreRelation test, dummy, Some(d)
   +- HiveTableScan [id#248,random_string#252,small_vc#253],
MetastoreRelation test, dummy2, Some(d1)

in contrast the same identical tables in Oracle use Hash Join with OR which
is expected

scratch...@mydb.mich.LOCAL> select d.id, d1.id from dummy d, dummy2 d1
where d.random_string = d1.random_string OR d.small_vc = d1.small_vc;

Execution Plan
--
Plan hash value: 4163534687
--
| Id  | Operation   | Name   | Rows  | Bytes |TempSpc| Cost (%CPU)|
Time |
--
|   0 | SELECT STATEMENT|| 63207 |  8332K|   |  1280K  (1)|
04:16:05 |
|   1 |  CONCATENATION  ||   |   |   |
|  |
|*  2 |  * HASH JOIN *|| 60183 |  7934K|  4632K|   640K
(1)| 02:08:03 |
|   3 |TABLE ACCESS FULL| DUMMY2 | 6 |  3925K|   |   157   (1)|
00:00:02 |
|   4 |TABLE ACCESS FULL| DUMMY  |   100M|  6484M|   |   261K  (1)|
00:52:13 |
|*  5 |   *HASH JOIN *||  3024 |   398K|  4632K|   640K
(1)| 02:08:03 |
|   6 |TABLE ACCESS FULL| DUMMY2 | 6 |  3925K|   |   157   (1)|
00:00:02 |
|   7 |TABLE ACCESS FULL| DUMMY  |   100M|  6484M|   |   261K  (1)|
00:52:13 |
--

So this looks like a bug!



Dr Mich Talebzadeh



LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
*



http://talebzadehmich.wordpress.com



On 1 April 2016 at 04:53, ashokkumar rajendran <
ashokkumar.rajend...@gmail.com> wrote:

> Thanks for the reply everyone.
>
> Let me provide more detail on the dataset as well.
> 1. The big table (A) contains more than 3 billion records in parquet
> format, which is few TBs.
> 2. The second table (B) is only of 60K rows which is less than 10MB.
> 3. The column on which I perform JOIN is mostly on the String datatype
> columns.
> 4. I used 20 machines that were of 16 cores each and 120GB RAMs for
> testing this.
>
> The pseudo OR query is as below.
>
> sql(Select field1, field2 from A, B where A.dimension1 = B.dimension1 OR 
> A.dimension2 = B.dimension2 OR A.dimension3 = B.dimension3 OR A.dimension4 = 
> B.dimension4).explain(true)
>
>
> Pseudo union query is as below.
>
> sql{Select field1, field2 from A, B where A.dimension1 = B.dimension1
> UNION ALL
> Select field1, field2 from A, B where A.dimension2 = B.dimension2
> UNION ALL
> Select field1, field2 from A, B where A.dimension3 = B.dimension3
> UNION ALL
> Select field1, field2 from A, B where A.dimension4 = 
> B.dimension4}.explain(true)
>
> You can look at the explain plan in the ticket
> https://issues.apache.org/jira/browse/SPARK-13900
>
> Hemant has mentioned the nested loop time will be very little. But the
> time taken by inner query for this kind of join is too long. Union of 3
> HashJoins take only 3 minutes (each dimension hashjoin takes 1 minute),
> nested loop join takes nearly 13 mins. I agree that the performance of
> HashedJoin on OR condition will be linear but that will be very optimized
> comparing to the nested loop join.
>
> As Yong pointed out, if we can provide hints in SQL engine, it will be
> awesome for these kind of cases.
>
> Any idea on how we can optimize this will be helpful. Please let me know
> if any other detail is needed to provide input.
>
> Regards
> Ashok
>
> On Thu, Mar 31, 2016 at 8:32 PM, Mich Talebzadeh <
> mich.talebza...@gmail.com> wrote:
>
>> a hash join come into play 

RE: SPARK-13900 - Join with simple OR conditions take too long

2016-03-31 Thread Yong Zhang
I agree that there won't be a generic solution for these kind of cases.
Without the CBO from Spark or Hadoop ecosystem in short future, maybe Spark 
DataFrame/SQL should support more hints from the end user, as in these cases, 
end users will be smart enough to tell the engine what is the correct way to do.
Weren't the relational DBs doing exactly same path? RBO -> RBO + Hints -> CBO?
Yong

Date: Thu, 31 Mar 2016 16:07:14 +0530
Subject: Re: SPARK-13900 - Join with simple OR conditions take too long
From: hemant9...@gmail.com
To: ashokkumar.rajend...@gmail.com
CC: user@spark.apache.org

Hi Ashok,

That's interesting. 

As I understand, on table A and B, a nested loop join (that will produce m X n 
rows) is performed and than each row is evaluated to see if any of the 
condition is met. You are asking that Spark should instead do a 
BroadcastHashJoin on the equality conditions in parallel and then union the 
results like you are doing in a different query. 

If we leave aside parallelism for a moment, theoretically, time taken for 
nested loop join would vary little when the number of conditions are increased 
while the time taken for the solution that you are suggesting would increase 
linearly with number of conditions. So, when number of conditions are too many, 
nested loop join would be faster than the solution that you suggest. Now the 
question is, how should Spark decide when to do what? 

Hemant Bhanawat
www.snappydata.io 


On Thu, Mar 31, 2016 at 2:28 PM, ashokkumar rajendran 
 wrote:
Hi,

I have filed ticket SPARK-13900. There was an initial reply from a developer 
but did not get any reply on this. How can we do multiple hash joins together 
for OR conditions based joins? Could someone please guide on how can we fix 
this? 
Regards
Ashok


  

Re: SPARK-13900 - Join with simple OR conditions take too long

2016-03-31 Thread Hemant Bhanawat
Hi Ashok,

That's interesting.

As I understand, on table A and B, a nested loop join (that will produce m
X n rows) is performed and than each row is evaluated to see if any of the
condition is met. You are asking that Spark should instead do a
BroadcastHashJoin on the equality conditions in parallel and then union the
results like you are doing in a different query.

If we leave aside parallelism for a moment, theoretically, time taken for
nested loop join would vary little when the number of conditions are
increased while the time taken for the solution that you are suggesting
would increase linearly with number of conditions. So, when number of
conditions are too many, nested loop join would be faster than the solution
that you suggest. Now the question is, how should Spark decide when to do
what?


Hemant Bhanawat 
www.snappydata.io

On Thu, Mar 31, 2016 at 2:28 PM, ashokkumar rajendran <
ashokkumar.rajend...@gmail.com> wrote:

> Hi,
>
> I have filed ticket SPARK-13900. There was an initial reply from a
> developer but did not get any reply on this. How can we do multiple hash
> joins together for OR conditions based joins? Could someone please guide on
> how can we fix this?
>
> Regards
> Ashok
>


SPARK-13900 - Join with simple OR conditions take too long

2016-03-31 Thread ashokkumar rajendran
Hi,

I have filed ticket SPARK-13900. There was an initial reply from a
developer but did not get any reply on this. How can we do multiple hash
joins together for OR conditions based joins? Could someone please guide on
how can we fix this?

Regards
Ashok