On 7/21/2011 7:19 AM, Shmuel Metz (Seymour J.) wrote:
> In <4e24b31c.5080...@phoenixsoftware.com>, on 07/18/2011
>    at 03:26 PM, Edward Jaffe <edja...@phoenixsoftware.com> said:
>
>> Absolutely! There is a multi-stage pipeline that allows the processor
>> to get ahead of the current instruction's execution to fetch and
>> decode instructions, resolve addresses, fetch operands, etc. in
>> advance of the actual instruction execution.
>
> That doesn't address my question. The questions are when the processor
> starts filling in a new segment of the pipeline for a branch and
> whether it takes the opcode into account. Keep in mind that there is
> not a separate general register for the target address mode.

The last time I thoroughly investigated System z branch prediction logic
(BPL) was on the z9.
On that model, the BPL runs early in the pipeline, before instruction decode,
and it runs essentially asynchronously to the rest of the pipeline. It
prefetches instructions and predicts direction and target based on the path it
thinks the rest of the pipeline will later execute. It puts those
prefetched streams into fairly large instruction buffers, where they wait until
the decode logic needs them.
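
A minimal Python sketch of that run-ahead behavior, purely illustrative. The
function name, the fixed block length, and the dictionary of predicted-taken
branches are all assumptions made for the example; none of them describe the
actual z9 hardware.

```python
from collections import deque

def run_ahead(start, predicted_targets, block_len, buffer_slots):
    """Follow the predicted path from `start`, queueing fetch addresses.

    predicted_targets: hypothetical map of branch address -> predicted target
    for branches predicted taken. Any other address falls through to
    addr + block_len. Stops once the (assumed) instruction buffer is full.
    """
    buffer = deque()
    addr = start
    while len(buffer) < buffer_slots:
        buffer.append(addr)  # prefetch this block for the decoder
        # Redirect on a predicted-taken branch, otherwise fall through.
        addr = predicted_targets.get(addr, addr + block_len)
    return buffer

# The decoder would later drain the buffer from the left with popleft().
prefetched = run_ahead(0, {8: 100}, block_len=4, buffer_slots=5)
```

The point of the sketch is only that the predictor, not the decoder, decides
which addresses get fetched next.
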
The BPL itself uses what appears to be a unified Branch Target Buffer (BTB) and
Direction Buffer (they are physically built of separate arrays but are logically
the same). The BTB contains 8K entries. Being clever, IBM decided not to
'remember' not-taken conditional branches, thus allowing the BTB to appear much
bigger than it really is. This was considered a reasonable performance trade-off
(the rationale for keeping not-taken branches in the BTB would be to handle
branches that frequently change direction more accurately). The z9 BTB tracks
each branch it holds with a strongly-taken/weakly-taken state.
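
A rough Python sketch of a taken-only BTB with strong/weak states, to make the
idea concrete. The state transitions (new entries start weakly taken, a
not-taken outcome demotes strong to weak and then drops the entry) and the
oldest-first eviction are assumptions for illustration, not documented z9
behavior. Only the 8K capacity and the "no entry means not taken" rule come
from the description above.

```python
from enum import Enum

class State(Enum):
    WEAK_TAKEN = 1
    STRONG_TAKEN = 2

class TakenOnlyBTB:
    """Hypothetical model: only taken branches occupy an entry."""

    def __init__(self, capacity=8192):
        self.capacity = capacity
        self.entries = {}  # branch address -> (target, State)

    def predict(self, branch_addr):
        """Return the predicted target, or None (no entry = predict not taken)."""
        entry = self.entries.get(branch_addr)
        return entry[0] if entry else None

    def update(self, branch_addr, taken, target=None):
        entry = self.entries.get(branch_addr)
        if taken:
            if entry is None:
                if len(self.entries) >= self.capacity:
                    # Assumed policy: evict the oldest inserted entry.
                    self.entries.pop(next(iter(self.entries)))
                self.entries[branch_addr] = (target, State.WEAK_TAKEN)
            else:
                self.entries[branch_addr] = (target, State.STRONG_TAKEN)
        elif entry is not None:
            if entry[1] is State.STRONG_TAKEN:
                self.entries[branch_addr] = (entry[0], State.WEAK_TAKEN)
            else:
                del self.entries[branch_addr]  # back to "not remembered"
        # A not-taken branch with no entry costs nothing: it is never stored.
```

In this model a branch that is never taken never consumes an entry, which is
why the table behaves as if it were larger than its entry count.
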
Another clever thing IBM did on the z9 was to implement "just in time"
prefetching down non-predicted branch paths to enhance performance. If the
hardware predicts a branch will be not taken, it prefetches the taken path just
before the branch direction is resolved in the execution stage of the pipeline.
If the prediction turns out to be wrong, the taken path is already in a
"recovery instruction buffer", from which it can be sent to the decoder on the
next cycle.
Slick! :-)
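
The recovery-buffer idea can be sketched in a few lines of Python. This is a
toy model under stated assumptions: the class and function names, the
four-instruction fetch width, and the dictionary standing in for storage are
all hypothetical, and no cycle timing is modeled.

```python
def fetch(memory, addr, count=4):
    """Hypothetical fetch: up to `count` instructions starting at addr."""
    return [memory[a] for a in range(addr, addr + count) if a in memory]

class InFlightBranch:
    def __init__(self, predicted_taken, taken_target, fallthrough):
        self.predicted_taken = predicted_taken
        self.taken_target = taken_target
        self.fallthrough = fallthrough
        self.recovery_buffer = None  # filled just before resolution

    def prefetch_alternate(self, memory):
        """Prefetch the path the predictor did NOT choose."""
        alt = self.fallthrough if self.predicted_taken else self.taken_target
        self.recovery_buffer = fetch(memory, alt)

    def resolve(self, actual_taken):
        """Return (source, instructions) feeding the decoder after resolution."""
        if actual_taken == self.predicted_taken:
            return ("predicted path", None)  # nothing to recover
        if self.recovery_buffer is not None:
            return ("recovery buffer", self.recovery_buffer)
        return ("refetch", None)  # no prefetch happened: full restart

memory = {a: f"insn@{a}" for a in range(64)}
branch = InFlightBranch(predicted_taken=False, taken_target=40, fallthrough=12)
branch.prefetch_alternate(memory)
source, insns = branch.resolve(actual_taken=True)
```

Without the prefetch step the mispredicted case falls into the "refetch"
branch, which is the penalty the recovery buffer is meant to avoid.
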
--
Edward E Jaffe
Chief Technology Officer
Phoenix Software International, Inc
831 Parkview Drive North
El Segundo, CA 90245
310-338-0400 x318
edja...@phoenixsoftware.com
http://www.phoenixsoftware.com/