Re: Probing of simple repartition hash join

2019-07-25 Thread Stephan Ewen
Hi! The join implementations used for the DataSet API and for the Blink Planner are quite intricate. They use custom memory segments to operate on bytes as much as possible, to control JVM memory utilization, and to save serialization costs. That makes the implementation super compli

Re: Probing of simple repartition hash join

2019-07-25 Thread Benjamin Burkhardt
Hi, during a join, a MutableHashTable is created and filled in the build phase. After building, it is closed and the probing can begin. I would like to start probing while the hash table is still being built (ignoring the fact that this would lead to join misses...). Does anyone have an idea how on

Re: Probing of simple repartition hash join

2019-07-23 Thread Caizhi Weng
Hi Benjamin, as you mentioned hash join, I assume that you are referring to `HashJoinOperator` in the Blink planner. The input is selected by the `nextSelection` method. As you can see, it first reads all records on the build side and then reads all records on the probe side. So the probing will only start
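
To make the build-then-probe ordering described above concrete, here is a minimal sketch in plain Java. It is not Flink's `HashJoinOperator` or `MutableHashTable`; the class `BuildThenProbeJoin`, the `Row` type, and the `join` method are hypothetical names used only for illustration, and an ordinary `HashMap` stands in for the memory-segment-based hash table.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not Flink code.
final class BuildThenProbeJoin {

    /** Hypothetical record type: a join key plus a payload. */
    static final class Row {
        final int key;
        final String value;

        Row(int key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Joins two inputs on Row.key, fully building before any probing. */
    static List<String> join(List<Row> buildSide, List<Row> probeSide) {
        // Build phase: consume the entire build side first.
        Map<Integer, List<Row>> table = new HashMap<>();
        for (Row b : buildSide) {
            table.computeIfAbsent(b.key, k -> new ArrayList<>()).add(b);
        }

        // Probe phase: starts only after the build input is exhausted,
        // so every probe record sees the complete hash table.
        List<String> out = new ArrayList<>();
        for (Row p : probeSide) {
            for (Row b : table.getOrDefault(p.key, Collections.<Row>emptyList())) {
                out.add(b.value + "|" + p.value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> build = new ArrayList<>();
        build.add(new Row(1, "a"));
        build.add(new Row(2, "b"));

        List<Row> probe = new ArrayList<>();
        probe.add(new Row(2, "x"));
        probe.add(new Row(3, "y"));
        probe.add(new Row(1, "z"));

        System.out.println(join(build, probe));
    }
}
```

In this sketch, probing before the build loop finishes would miss matches for build records that have not been inserted yet, which is the "join misses" effect mentioned elsewhere in the thread.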

Probing of simple repartition hash join

2019-07-23 Thread Benjamin Burkhardt
Hi all, let’s imagine a simple repartition hash join of two tables. As soon as the first table is hashed completely (all EndOfPartition events sent), the shipping and probing of the second table starts. What I can’t find: 1. What exactly triggers the start of the probing? 2. Where can I find it in