[ 
https://issues.apache.org/jira/browse/HIVE-17783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202938#comment-16202938
 ] 

Ferdinand Xu commented on HIVE-17783:
-------------------------------------

Thanks [~sershe] for your input. I think the degradation in my test is 
caused by unnecessary spilling. I looked at the longest-running task in each 
case (enabled and disabled), both of which process the same number of 
records, and made a rough estimate of the time spent in each phase from the 
logs.
!screenshot-1.png!

The extra reprocessing takes longer, and in the disabled case that data is 
never spilled to disk at all. The following logs show that the number of 
spilled rows is very small (e.g. partition 0: 0 rows, partition 3: 1 row) 
while the estimated memory is relatively high (e.g. partition 0: 65636, 
partition 3: 589924). This is because the estimated memory is obtained from 
the WBS in BytesBytesMultiHashMap. In the disabled case, by contrast, the 
overhead of creating BytesBytesMultiHashMap is limited because only one such 
hash map is maintained. I think we need a better way to estimate memory so 
that we avoid unnecessary spills triggered by the memory estimate. Any 
thoughts or suggestions on this?

{noformat}
2017-10-13 09:14:43,666 [INFO] [pool-29-thread-1] 
|persistence.HybridHashTableContainer|: Spilling hash partition 0 (Rows: 0, Mem 
size: 65636): 
/ssd/ssd-pcie/hadoop/data/local-dirs/usercache/root/appcache/application_1506652027239_1242/container_1506652027239_1242_01_000042/tmp/partition-0-1870407313239959464.tmp
2017-10-13 09:14:43,666 [INFO] [pool-29-thread-1] 
|persistence.HybridHashTableContainer|: Memory usage before spilling: 1050880
2017-10-13 09:14:43,666 [INFO] [pool-29-thread-1] 
|persistence.HybridHashTableContainer|: Memory usage after spilling: 985244
2017-10-13 09:14:43,667 [INFO] [pool-29-thread-1] |common.FileUtils|: Local 
directories not specified; created a tmp file: 
/ssd/ssd-pcie/hadoop/data/local-dirs/usercache/root/appcache/application_1506652027239_1242/container_1506652027239_1242_01_000042/tmp/partition-3-8788724383233259683.tmp
2017-10-13 09:14:43,667 [INFO] [pool-29-thread-1] 
|persistence.HybridHashTableContainer|: Trying to spill hash partition 3 ...
2017-10-13 09:14:43,669 [INFO] [pool-29-thread-1] 
|persistence.HybridHashTableContainer|: Spilling hash partition 3 (Rows: 1, Mem 
size: 589924): 
/ssd/ssd-pcie/hadoop/data/local-dirs/usercache/root/appcache/application_1506652027239_1242/container_1506652027239_1242_01_000042/tmp/partition-3-8788724383233259683.tmp
2017-10-13 09:14:43,669 [INFO] [pool-29-thread-1] 
|persistence.HybridHashTableContainer|: Memory usage before spilling: 1509532
2017-10-13 09:14:43,669 [INFO] [pool-29-thread-1] 
|persistence.HybridHashTableContainer|: Memory usage after spilling: 919608
2017-10-13 09:14:43,669 [INFO] [pool-29-thread-1] |common.FileUtils|: Local 
directories not specified; created a tmp file: 
/ssd/ssd-pcie/hadoop/data/local-dirs/usercache/root/appcache/application_1506652027239_1242/container_1506652027239_1242_01_000042/tmp/partition-15-1832304146451074287.tmp
2017-10-13 09:14:43,669 [INFO] [pool-29-thread-1] 
|persistence.HybridHashTableContainer|: Trying to spill hash partition 15 ...
2017-10-13 09:14:43,676 [INFO] [pool-29-thread-1] 
|persistence.HybridHashTableContainer|: Spilling hash partition 15 (Rows: 2, 
Mem size: 589924): 
/ssd/ssd-pcie/hadoop/data/local-dirs/usercache/root/appcache/application_1506652027239_1242/container_1506652027239_1242_01_000042/tmp/partition-15-1832304146451074287.tmp
{noformat}
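
As a rough illustration of the mismatch in the log above, here is a small 
Python sketch (not Hive code) that pulls the row count and estimated memory 
out of the "Spilling hash partition" messages and applies a row-count guard. 
The function names and the min_rows threshold are hypothetical, made up only 
for this example; min_rows is not an existing Hive setting.

```python
import re

# Matches the "Spilling hash partition N (Rows: R, Mem size: M)" messages.
SPILL_RE = re.compile(
    r"Spilling hash partition (\d+) \(Rows: (\d+), Mem size: (\d+)\)"
)


def parse_spills(log_text):
    """Return (partition, rows, mem_size) tuples found in the log text."""
    # The mail client wraps long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return [tuple(int(g) for g in m.groups()) for m in SPILL_RE.finditer(flat)]


def worth_spilling(rows, min_rows=1000):
    """Hypothetical guard: only spill partitions holding at least min_rows rows.

    min_rows is an illustrative threshold, not a real Hive configuration.
    """
    return rows >= min_rows


# Message fragments copied from the log above (timestamps and paths omitted).
LOG = """
Spilling hash partition 0 (Rows: 0, Mem
size: 65636):
Trying to spill hash partition 3 ...
Spilling hash partition 3 (Rows: 1, Mem
size: 589924):
Trying to spill hash partition 15 ...
Spilling hash partition 15 (Rows: 2,
Mem size: 589924):
"""

SPILLS = parse_spills(LOG)

for partition, rows, mem_size in SPILLS:
    print(f"partition {partition}: rows={rows}, mem={mem_size}, "
          f"worth spilling: {worth_spilling(rows)}")
```

With the three messages above, none of the partitions would pass such a 
guard, even though their reported memory sizes are 65636 and 589924.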

> Hybrid Grace Hash Join has performance degradation for N-way join using Hive 
> on Tez
> -----------------------------------------------------------------------------------
>
>                 Key: HIVE-17783
>                 URL: https://issues.apache.org/jira/browse/HIVE-17783
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>         Environment: 8*Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
> 1 master + 7 workers
> TPC-DS at 3TB data scales
> Hive version : 2.2.0
>            Reporter: Ferdinand Xu
>         Attachments: Hybrid_Grace_Hash_Join.xlsx, screenshot-1.png
>
>
> Most configurations use their default values. The benchmark compares 
> enabling against disabling hybrid grace hash join using TPC-DS queries at 
> the 3TB data scale. Many queries involving N-way joins show performance 
> degradation over three test runs. Detailed results are attached.



