It might be worth investigating which memcpy calls are problematic, if you
haven't already. I suspect that MemoryRead and MemoryWrite are the main
culprits, but that is just a gut feeling.

Also, with a default -O0 debug build, GCC probably doesn't try to optimise
memcpy, so the memcpy calls might have more impact in a debug build than in a
release build. However, there's no reason why a debug build has to be -O0. (I
find that ARM code is easier to read at -O2 anyway, on the rare occasions that
I have to delve into disassembled code.)

If we can't get a type-aliasing-safe implementation (using memcpy) to optimise
properly, another option is to pass -fno-strict-aliasing to GCC (at least for
the simulator), which disables all optimisations relating to strict aliasing.
Clang accepts the same flag. It's a bit of a hack though, and it might disable
optimisations elsewhere. (I've no idea what performance impact that might
have. It might be negligible.)

https://codereview.chromium.org/169223004/diff/1/src/a64/instructions-a64.h
File src/a64/instructions-a64.h (left):
https://codereview.chromium.org/169223004/diff/1/src/a64/instructions-a64.h#oldcode120
src/a64/instructions-a64.h:120: memcpy(&bits, this, sizeof(bits));
These simple (fixed-size) memcpy calls should be free. The compiler
should recognise each one and compile it as a no-op.
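
To illustrate the pattern, here is a minimal sketch of a fixed-size memcpy
used as a type pun. FloatBits is a hypothetical helper, not something from
the patch. Because the size is a compile-time constant, an optimising build
will typically reduce the call to a register move, though that depends on
the compiler and flags.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of the patch): returns the raw bits of a
// float without violating the strict aliasing rules.
inline std::uint32_t FloatBits(float value) {
  static_assert(sizeof(std::uint32_t) == sizeof(float), "size mismatch");
  std::uint32_t bits;
  // The size is a compile-time constant, so the compiler can see exactly
  // how many bytes move and does not need a real library call.
  std::memcpy(&bits, &value, sizeof(bits));
  return bits;
}
```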
https://codereview.chromium.org/169223004/diff/1/src/a64/simulator-a64.cc
File src/a64/simulator-a64.cc (left):
https://codereview.chromium.org/169223004/diff/1/src/a64/simulator-a64.cc#oldcode142
src/a64/simulator-a64.cc:142: memcpy(stack, &(*it), sizeof(*it));
This copy might not be free because the address needs to be derived from
the iterator. However, this shouldn't be performance-sensitive code.
https://codereview.chromium.org/169223004/diff/1/src/a64/simulator-a64.cc#oldcode1472
src/a64/simulator-a64.cc:1472: memcpy(&read, address, num_bytes);
Here the size is variable, so this copy can't be free.
It might be possible for the compiler to optimise it if you split the
num_bytes cases (as you did for reinterpret_cast) but use a fixed-size
memcpy in each case.
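
A rough sketch of what splitting the cases could look like is below.
ReadFixed is a hypothetical stand-in for MemoryRead; the real signature and
error handling in the simulator will differ. Each case copies into a
correctly sized local, so every memcpy has a constant size.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for MemoryRead (names and signature are assumed).
// Reads num_bytes from address and zero-extends the value to 64 bits.
inline std::uint64_t ReadFixed(const std::uint8_t* address,
                               unsigned num_bytes) {
  switch (num_bytes) {
    case 1: {
      std::uint8_t value;
      std::memcpy(&value, address, sizeof(value));
      return value;
    }
    case 2: {
      std::uint16_t value;
      std::memcpy(&value, address, sizeof(value));
      return value;
    }
    case 4: {
      std::uint32_t value;
      std::memcpy(&value, address, sizeof(value));
      return value;
    }
    case 8: {
      std::uint64_t value;
      std::memcpy(&value, address, sizeof(value));
      return value;
    }
    default:
      // Unsupported size. Real code would probably assert here instead.
      return 0;
  }
}
```

On a little-endian host this should give the same result as copying
num_bytes into a zero-initialised 64-bit value.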
https://codereview.chromium.org/169223004/diff/1/src/a64/simulator-a64.cc#oldcode1514
src/a64/simulator-a64.cc:1514: memcpy(address, &value, num_bytes);
Ditto.
https://codereview.chromium.org/169223004/