On Tue, 22 Mar 2022 02:52:07 GMT, Jatin Bhateja <jbhat...@openjdk.org> wrote:
>>> A read from the constant table will incur at minimum an L1I access
>>> penalty to access the code blob, or worse if the data is not present in
>>> the first-level cache.
>>
>> But your approach comes at the cost of frontend bandwidth and port
>> contention, which imo are more important than latency in this case, since
>> a constant load does not prolong dependency chains. A load has very good
>> throughput, so it is often performant unless the load depends on its input
>> (the memory location or the registers used for address calculation). Thanks
>
> Thanks for going into details. A multi-cycle memory load will also defer
> dispatch of dependent instructions to the execution ports. Port congestion
> becomes the bottleneck when multiple ready instructions cannot be issued
> due to a lack of execution resources or throughput constraints imposed by
> the instruction, but a single-cycle dependency chain may still win over
> the latency due to pending memory operations.

I think I get it now, thanks a lot for your detailed explanation.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094