stdpain opened a new issue #5300:
URL: https://github.com/apache/incubator-doris/issues/5300


   **Describe**
   In the original logic, the hash table stores its actual data in a 
vector-like structure. While the hash table is being built, roughly a quarter 
of the time may be spent repeatedly copying data, and the cost rises with the 
number of build columns. So I changed the storage to a raw pointer to avoid 
the extra copy overhead. This noticeably speeds up the hash table construction 
phase: in the test below, build time drops from about 1.43s to 0.66s.
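
To make the idea concrete, here is a minimal sketch of raw-pointer node storage. It is not the actual Doris code: `Node`, `ChunkedNodeStore`, the chunk-doubling policy, and the fixed row size are all assumptions made for illustration. The point it shows is that when capacity runs out, a new raw chunk is allocated and the existing nodes stay where they are, whereas a growing `std::vector` of nodes would copy every element on reallocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <new>
#include <vector>

// Hypothetical node header; the row bytes are stored right after it.
struct Node {
    Node* next;
    uint32_t hash;
    uint8_t* data() { return reinterpret_cast<uint8_t*>(this + 1); }
};

// Hypothetical store: nodes live in raw malloc'ed chunks and never move.
class ChunkedNodeStore {
public:
    ChunkedNodeStore(size_t row_size, size_t first_chunk_nodes)
            : node_size_(round_up(sizeof(Node) + row_size, alignof(Node))),
              row_size_(row_size),
              next_chunk_nodes_(first_chunk_nodes) {
        assert(first_chunk_nodes > 0);
    }
    ~ChunkedNodeStore() {
        for (uint8_t* chunk : chunks_) std::free(chunk);
    }
    ChunkedNodeStore(const ChunkedNodeStore&) = delete;
    ChunkedNodeStore& operator=(const ChunkedNodeStore&) = delete;

    // Copies one row into the next free slot and returns a stable pointer.
    Node* append(uint32_t hash, const void* row) {
        if (used_ == capacity_) grow();
        uint8_t* slot = current_ + used_ * node_size_;
        Node* node = new (slot) Node{nullptr, hash};
        std::memcpy(node->data(), row, row_size_);
        ++used_;
        ++size_;
        return node;
    }

    size_t size() const { return size_; }
    size_t chunk_count() const { return chunks_.size(); }

private:
    static size_t round_up(size_t n, size_t align) {
        return (n + align - 1) / align * align;
    }

    // Allocates a fresh chunk twice as large as the last one. Only the
    // small vector of chunk pointers can reallocate; node bytes are not copied.
    void grow() {
        chunks_.push_back(nullptr);  // reserve the slot before allocating
        void* chunk = std::malloc(next_chunk_nodes_ * node_size_);
        if (chunk == nullptr) {
            chunks_.pop_back();
            throw std::bad_alloc();
        }
        chunks_.back() = static_cast<uint8_t*>(chunk);
        current_ = chunks_.back();
        capacity_ = next_chunk_nodes_;
        used_ = 0;
        next_chunk_nodes_ *= 2;
    }

    size_t node_size_;
    size_t row_size_;
    size_t next_chunk_nodes_;
    uint8_t* current_ = nullptr;
    size_t capacity_ = 0;  // node slots in the current chunk
    size_t used_ = 0;      // node slots already filled in the current chunk
    size_t size_ = 0;      // total nodes across all chunks
    std::vector<uint8_t*> chunks_;
};
```

Because nodes never move, bucket chains can hold plain `Node*` links that remain valid for the whole build phase. The trade-off in this sketch is that nodes are no longer addressable by one contiguous index.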
   
   Here is my test case; LINE_ORDER and LINE_ORDER_V2 are from the SSB dataset:
   
   ```
   SELECT count(*) FROM LINE_ORDER t1 join LINE_ORDER_V2 t2 WHERE 
t1.LO_ORDERKEY=t2.LO_ORDERKEY;
   ```
   
   | Type   | Right Table Rows | Build Time | Probe Time | Time Cost (s) |
   | ------ | ---------------- | ---------- | ---------- | ------------- |
   | After  | 6001215          | 658.288ms  | 1s451ms    | 4.07          |
   | Before | 6001215          | 1s428ms    | 1s512ms    | 4.69          |
   
   
   
   

