Hi Gopal,
Thanks a lot for your reply!
Do you mean that if I restrict 
VectorizedLogicBench.IfExprLongColumnLongColumnBench to a single CPU, the 
variation will be small? If so, that matches what I see: the variation became 
smaller after I ran taskset -cp 1 $pid. But I am still confused: if all the 
tests in VectorizedLogicBench are well pipelined and vectorized, why do the 
other tests in VectorizedLogicBench not show a large variation? My guess is 
that the complex expression used in 
VectorizedLogicBench.IfExprLongColumnLongColumnBench actually uses more CPU 
than the other expressions.

The expression used in VectorizedLogicBench.IfExprLongColumnLongColumnBench: 
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/IfExprLongColumnLongColumn.java#L90





Best Regards
Kelly Zhang/Zhang,Liyun




 
-----Original Message-----
From: Gopal Vijayaraghavan [mailto:gop...@apache.org] 
Sent: Thursday, November 16, 2017 5:40 AM
To: dev@hive.apache.org
Cc: Zhang, Liyun <liyun.zh...@intel.com>; Teddy Choi <tc...@hortonworks.com>
Subject: Re: Anyone knows the problem I found in 
VectorizedLogicBench.IfExprLongColumnLongColumnBench?

Hi,

>   You can see that there is a large fluctuation for 
> IfExprLongColumnLongColumnBench.bench: the fluctuation is 583775 and the 
> average value is 1621602. 

In my tests, single-core runs tended to show huge variations on Intel CPUs 
with Turbo Boost.

CPU operations which are fast when stressing the CPU in single-threaded mode 
tended to get really slow when the other cores spin up and hit thermal limits.

For most memory-bound operations this is not easily visible, but the better 
pipelined and vectorized the loops get, the worse the impact of dynamic CPU 
frequency scaling.

Can you collect the active CPU frequency while running this benchmark and use 
"taskset -c 1" to force the run to stick to a single CPU?

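One possible way to collect the frequency, as a minimal sketch only: it assumes 
a Linux x86 machine where /proc/cpuinfo reports a "cpu MHz" line per logical 
CPU. The function names, the sample count and the interval are made up for 
illustration and are not part of Hive or the benchmark.

```python
import re
import time

# Assumption: Linux on x86, where /proc/cpuinfo has one "cpu MHz" line per logical CPU.
CPUINFO_PATH = "/proc/cpuinfo"


def parse_cpu_mhz(cpuinfo_text):
    """Return the 'cpu MHz' values found in /proc/cpuinfo-style text, in file order."""
    matches = re.findall(r"^cpu MHz\s*:\s*([0-9.]+)", cpuinfo_text, flags=re.MULTILINE)
    return [float(value) for value in matches]


def sample_frequencies(samples=5, interval_s=1.0, path=CPUINFO_PATH):
    """Read the per-CPU frequencies `samples` times, pausing `interval_s` between reads."""
    readings = []
    for _ in range(samples):
        with open(path) as f:
            readings.append(parse_cpu_mhz(f.read()))
        time.sleep(interval_s)
    return readings
```

Calling sample_frequencies() in a second terminal while the benchmark runs 
under "taskset -c 1" gives one list of MHz values per sample, so a drop in 
frequency on the pinned CPU can be compared against the slow iterations.
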
Cheers,
Gopal


