Yes, I'd say so.

2018-06-29 4:43 GMT+02:00 吴晓菊 <chrysan...@gmail.com>:

> And it should apply to hash joins in general, not only broadcast join, right?
>
>
> Chrysan Wu
> 吴晓菊
> Phone:+86 17717640807
>
>
> 2018-06-29 10:42 GMT+08:00 吴晓菊 <chrysan...@gmail.com>:
>
>> Sorry for the mistake. You are right: the output ordering of a broadcast
>> join can be the ordering of the big table for some join types. I will
>> prepare a PR for you to review later. Thanks a lot!
>>
>>
>> Chrysan Wu
>> 吴晓菊
>> Phone:+86 17717640807
>>
>>
>> 2018-06-29 0:00 GMT+08:00 Wenchen Fan <cloud0...@gmail.com>:
>>
>>> SortMergeJoin sorts its children by join key, but broadcast join does
>>> not. I think the output ordering of broadcast join has nothing to do with
>>> join key.
>>>
>>> On Thu, Jun 28, 2018 at 11:28 PM Marco Gaido <marcogaid...@gmail.com>
>>> wrote:
>>>
>>>> I think the outputOrdering would be that of the big table (if it has one),
>>>> and it wouldn't matter whether it involves the join keys or not. Am I wrong?
>>>>
>>>> 2018-06-28 17:01 GMT+02:00 吴晓菊 <chrysan...@gmail.com>:
>>>>
>>>>> Thanks for the reply.
>>>>> Looking into SortMergeJoinExec, I think we can follow what
>>>>> SortMergeJoin does: for some join types, if the children are ordered on
>>>>> the join keys, we can report the ordered join keys as the output ordering.
>>>>>
>>>>>
>>>>> Chrysan Wu
>>>>> 吴晓菊
>>>>> Phone:+86 17717640807
>>>>>
>>>>>
>>>>> 2018-06-28 22:53 GMT+08:00 Wenchen Fan <cloud0...@gmail.com>:
>>>>>
>>>>>> SortMergeJoin only reports ordering of the join keys, not the output
>>>>>> ordering of any child.
>>>>>>
>>>>>> It seems reasonable to me that broadcast join should respect the
>>>>>> output ordering of the children. Feel free to submit a PR to fix it,
>>>>>> thanks!
>>>>>>
>>>>>> On Thu, Jun 28, 2018 at 10:07 PM 吴晓菊 <chrysan...@gmail.com> wrote:
>>>>>>
>>>>>>> Why can't we use the output ordering of the big table?
>>>>>>>
>>>>>>>
>>>>>>> Chrysan Wu
>>>>>>> Phone:+86 17717640807
>>>>>>>
>>>>>>>
>>>>>>> 2018-06-28 21:48 GMT+08:00 Marco Gaido <marcogaid...@gmail.com>:
>>>>>>>
>>>>>>>> The easy answer to this is that SortMergeJoin ensures an
>>>>>>>> outputOrdering, while BroadcastHashJoin doesn't, i.e. after running a
>>>>>>>> BroadcastHashJoin you don't know what the order of the output is
>>>>>>>> going to be, since nothing enforces it.
>>>>>>>>
>>>>>>>> Hope this helps.
>>>>>>>> Thanks.
>>>>>>>> Marco
>>>>>>>>
>>>>>>>> 2018-06-28 15:46 GMT+02:00 吴晓菊 <chrysan...@gmail.com>:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> We see that SortMergeJoinExec implements both outputPartitioning
>>>>>>>>> and outputOrdering, while BroadcastHashJoinExec implements only
>>>>>>>>> outputPartitioning. What is the reason for this design?
>>>>>>>>>
>>>>>>>>> Chrysan Wu
>>>>>>>>> Phone:+86 17717640807
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>
>
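
To illustrate the point the thread converges on, here is a minimal toy model of a hash join on a single partition. It is not Spark's implementation: the function name `broadcast_hash_join`, the row layout, and the sample data are all assumptions made for this sketch, and it covers only an inner join. It shows that probing the hashed small side while scanning the big side row by row keeps the big side's existing order, whether or not that order involves the join key.

```python
from typing import Dict, List, Tuple

Row = Tuple[int, str]  # (join key, payload); hypothetical layout for this sketch


def broadcast_hash_join(stream: List[Row], build: List[Row]) -> List[Tuple[int, str, str]]:
    """Toy inner hash join: hash the small (build) side, scan the big (stream) side."""
    # Build phase: index the small side by join key.
    table: Dict[int, List[str]] = {}
    for key, payload in build:
        table.setdefault(key, []).append(payload)

    # Probe phase: the stream side is read in its existing order, and
    # matches are emitted as each stream row is visited.
    out: List[Tuple[int, str, str]] = []
    for key, payload in stream:
        for match in table.get(key, []):
            out.append((key, payload, match))
    return out


# The big table is sorted by payload, not by the join key.
stream = [(3, "a"), (1, "b"), (2, "c"), (1, "d")]
build = [(1, "x"), (3, "y")]

result = broadcast_hash_join(stream, build)
for row in result:
    print(row)
```

In the output the payload column of the big table is still ascending while the join keys are not sorted, which matches the observation that the ordering comes from the big table and has nothing to do with the join key.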
