stdpain opened a new issue #5300: URL: https://github.com/apache/incubator-doris/issues/5300
**Describe**

In the original logic, the hash table uses a vector-like structure to store the actual data. While the hash table is being built, roughly a quarter of the time can be spent repeatedly copying that data, and the cost grows as more columns are built. I changed the storage to a raw pointer to avoid the extra copy overhead. This gives a clear improvement in the hash table build phase: in the test below, build time drops from 1s428ms to 658.288ms.

Here is my test case. LINE_ORDER and LINE_ORDER_V2 are from the SSB dataset:

```
SELECT count(*) FROM LINE_ORDER t1 join LINE_ORDER_V2 t2 WHERE t1.LO_ORDERKEY=t2.LO_ORDERKEY;
```

| Type | Right Table Rows | Build Time | Probe Time | Time Cost (s) |
|--| ------------ | ---------- | ---------- | ---- |
| After | 6001215 | 658.288ms | 1s451ms | 4.07 |
| Before | 6001215 | 1s428ms | 1s512ms | 4.69 |
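
To illustrate the idea described in this issue, here is a minimal sketch of one way to keep hash table nodes behind raw pointers so that growing the table never copies existing entries. It is not the actual Doris code: `Node`, `NodeArena`, and `JoinHashTable` are hypothetical names, and the chunked allocation scheme is an assumption about how the copy could be avoided. A growing `std::vector<Node>` would instead copy or move every existing node each time it reallocates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical node type (assumption, not the Doris definition).
struct Node {
    std::uint64_t key;
    const void* row;  // raw pointer to the build-side row; row bytes are not copied
    Node* next;       // next node in the same bucket chain
};

// Hypothetical chunked arena: it grows by adding fixed-size chunks, so nodes
// handed out earlier are never moved or copied when more space is needed.
class NodeArena {
public:
    explicit NodeArena(std::size_t chunk_size = 4096)
        : chunk_size_(chunk_size == 0 ? 1 : chunk_size) {}

    Node* allocate() {
        if (chunks_.empty() || used_in_last_ == chunk_size_) {
            // Only the small vector of chunk owners may reallocate here;
            // the node arrays themselves stay at their original addresses.
            chunks_.push_back(std::make_unique<Node[]>(chunk_size_));
            used_in_last_ = 0;
        }
        return &chunks_.back()[used_in_last_++];
    }

    std::size_t size() const {
        return chunks_.empty()
                   ? 0
                   : (chunks_.size() - 1) * chunk_size_ + used_in_last_;
    }

private:
    std::size_t chunk_size_;
    std::size_t used_in_last_ = 0;
    std::vector<std::unique_ptr<Node[]>> chunks_;
};

// Hypothetical chained hash table for the build side of a join.
class JoinHashTable {
public:
    explicit JoinHashTable(std::size_t num_buckets)
        : buckets_(num_buckets, nullptr) {
        assert(num_buckets > 0);
    }

    // Build phase: link a new node into its bucket; nothing already stored is copied.
    void insert(std::uint64_t key, const void* row) {
        Node* n = arena_.allocate();
        const std::size_t b = key % buckets_.size();
        n->key = key;
        n->row = row;
        n->next = buckets_[b];
        buckets_[b] = n;
    }

    // Probe phase: count build-side rows whose key equals the probe key.
    std::size_t count_matches(std::uint64_t key) const {
        std::size_t count = 0;
        for (const Node* n = buckets_[key % buckets_.size()]; n != nullptr; n = n->next) {
            if (n->key == key) ++count;
        }
        return count;
    }

    std::size_t size() const { return arena_.size(); }

private:
    NodeArena arena_;
    std::vector<Node*> buckets_;
};
```

In this sketch the build phase does constant work per inserted row regardless of how many rows are already stored, which is the property the change relies on. The caller must keep the rows alive for as long as the table is in use, since only pointers to them are stored.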
