Status: Accepted
Owner: [email protected]
CC: [email protected]
Labels: Type-Bug Priority-Medium
New issue 2424 by [email protected]: Test case runs slower on V8 than on
JSC
http://code.google.com/p/v8/issues/detail?id=2424
Reported by Dmitry Chestnykh:
"""
Chrome runs at ~1 sec with any array type, Safari 248 ms (with Int32Array,
slower with Uint32Array, heh), Firefox - 670 ms.
"""
https://twitter.com/dchest/status/271974857124749312
Code: https://gist.github.com/4135716, hottest function salsa20_8_xor.
Results of the standalone test (attached) on my machine:
∮ out/ia32.release/d8
test.js scrypt: 686
ms 66,19,172,174,118,47,129,147,213,139,191,0,100,201
scrypt: 551 ms 66,19,172,174,118,47,129,147,213,139,191,0,100,201
Snippet from the function code (in the loop):
u = x0 + x12;
x4 ^= (u << 7) | (u >>> (32 - 7));
u = x4 + x0;
...
Here x0 and x12 are known to be int32 (they are result of ^). Thus I would
expect that u will be a truncating addition, however it is not:
0 2 d1238 Change i359 i to d range[-2147483648,2147483647,m0=0] <|@
0 1 d1258 Change i371 i to d range[-2147483648,2147483647,m0=0] <|@
0 2 d391 Add d1238 d1258 ! <|@
0 2 i1265 Change d391 d to i truncating-int32
range[-2147483648,2147483647,m0=0] <|@
0 1 i394 Shl i1265 i393 range[-2147483648,2147483647,m0=0] <|@
0 1 i397 Shr i1265 i396 range[0,127,m0=0] <|@
0 1 i399 Bitwise BIT_OR i394 i397 range[-2147483648,2147483647,m0=0]
<|@
0 3 i401 Bitwise BIT_XOR i363 i399 range[-2147483648,2147483647,m0=0]
<|@
It seems that we failed to infer that d391 should be actually i391 because
it has only truncating uses.
Furthermore even if I rewrite code like:
u = (x0 + x12) | 0;
x4 ^= (u << 7) | (u >>> (32 - 7));
u = (x4 + x0) | 0;
The addition remains double. I think now we should be able to revert both
case with and case without |0 to integer addition (previously there were I
think some issues with interfering simulates etc-etc). Adding the following
case to HAdd::Canonicalize
if (representation().IsDouble() &&
left()->IsChange() && HChange::cast(left())->from().IsInteger32() &&
right()->IsChange() && HChange::cast(right())->from().IsInteger32()
&&
CheckUsesForFlag(kTruncatingToInt32)) {
HAdd* instr = new(block()->zone()) HAdd(context(),
HChange::cast(left())->value(),
HChange::cast(right())->value());
instr->set_representation(Representation::Integer32());
instr->SetFlag(kTruncatingToInt32);
instr->ClearFlag(kCanOverflow);
instr->InsertBefore(this);
return instr;
}
Drastically improves both code and our results:
∮ out/ia32.release/d8 test.js
scrypt: 223 ms 66,19,172,174,118,47,129,147,213,139,191,0,100,201
scrypt: 197 ms 66,19,172,174,118,47,129,147,213,139,191,0,100,201
This is a factor of 3. I can upload my patch as a CL if you don't think
that we should approach this problem in some other more generic way.
[there are also seem to be some representation inference problems in cold
function SHA256.blocks which deopts a lot on non-truncating double-to-i for
some phi that either has truncating uses and a use at int32-to-double, but
I did not investigate that].
Attachments:
test.js 21.6 KB
--
v8-dev mailing list
[email protected]
http://groups.google.com/group/v8-dev