"Wesley W. Terpstra" <wes...@terpstra.ca> writes: > On Mon, Aug 18, 2014 at 9:01 AM, Niels Möller <ni...@lysator.liu.se> wrote: >> I don't think it has to be that bad. First, the prefix flag and register >> should be saved and restored on irq, so there should be no problem with >> irq:s or page faults and the like in the "middle" of an instruction. > > Yes, that's the advantage. Keep in mind, though, that dealing with > variable-length instructions is a well understood and not-so-difficult > problem. I just need to report the PC as being at the start of the > prefix-chain. This change is local to the decoder.
I think the decoder could implement the prefix instruction, as I've
defined it, in that way, treating a sequence of prefix instructions +
non-prefix instruction as an indivisible longer instruction.
Supervisor mode/kernel mode code might need to know if there really is
a prefix register or not, but otherwise, it's an implementation detail
not visible to user code.

> The value in the decoder is ahead of the values seen in the execution
> units. If an exception occurs, you need to be able to rewind/reset the
> value in the decoder to the state it would have had if execution had
> gone to the correct destination at that point.

How to deal with exceptions in an out-of-order cpu is a bit of a
mystery to me. We're drifting off-topic, but if you can educate me a
bit on that, I'd appreciate it.

E.g., for a page fault at instruction fetch, or an external irq, it
seems reasonably simple to stop decoding and issuing any new
instructions, then wait until all previously issued instructions have
completed, and at that point transfer control to the exception
handler.

But if you get a page fault from a reordered load or store, or some
other exception associated with the execution of a particular
instruction, how do you stop the instruction flow at the correct point
before the control transfer to the handler? Thinking aloud, it seems
one needs to somehow

(1) cancel execution of all later (in instruction order) instructions,
    or discard any results or exceptions they might generate.

(2) complete all earlier (in instruction order) instructions. And in
    case one of those generates another exception, you need to
    "rewind" further and forget the original exception and its
    corresponding instruction.

and then wait until the dust settles, with no pending instructions in
the machine, and ready to handle the first (in instruction order)
exception. And one would need particular attention to stores, or other
instructions with side effects.

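Steps (1) and (2) above are roughly what a reorder buffer with in-order
retirement provides. The following is only a toy sketch under that
assumption, not a description of any particular CPU: the names Entry,
ReorderBuffer, complete and retire are made up for illustration.
Instructions may finish in any order, a fault is merely recorded when it
happens, and it is acted on only once the faulting instruction becomes
the oldest one in the buffer. Stores are held back until retirement, so
a squashed store never reaches memory.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    name: str                     # label for the instruction
    is_store: bool = False
    done: bool = False            # set when an execution unit finishes it
    fault: Optional[str] = None   # recorded at execute time, not yet taken


class ReorderBuffer:
    """Hypothetical model: entries are kept in program order."""

    def __init__(self, program):
        self.entries = [Entry(name, is_store) for name, is_store in program]
        self.head = 0             # oldest instruction not yet retired
        self.committed = []       # instructions whose results are visible
        self.memory_writes = []   # stores reach memory only at retirement

    def complete(self, index, fault=None):
        """An execution unit finishes entry `index`, possibly out of order."""
        entry = self.entries[index]
        entry.done = True
        entry.fault = fault

    def retire(self):
        """Retire finished instructions strictly in program order.

        Returns (name, fault) when an exception is taken, else None.
        """
        while self.head < len(self.entries):
            entry = self.entries[self.head]
            if not entry.done:
                return None       # oldest instruction still executing: wait
            if entry.fault is not None:
                # All older instructions have retired (step 2). Discard the
                # faulting one and everything younger, including any faults
                # the younger ones recorded (step 1).
                del self.entries[self.head:]
                return (entry.name, entry.fault)
            self.committed.append(entry.name)
            if entry.is_store:
                self.memory_writes.append(entry.name)
            self.head += 1
        return None


program = [("load A", False), ("store B", True),
           ("load C", False), ("store D", True)]
rob = ReorderBuffer(program)

rob.complete(3)                                # youngest store finishes first
rob.complete(2, fault="page fault on load C")  # a reordered load faults
first = rob.retire()                           # nothing taken yet: load A pending

rob.complete(1, fault="page fault on store B") # an older instruction faults too
rob.complete(0)
taken = rob.retire()                           # the oldest fault is the one taken
```

In this model the "rewind further" case needs no special handling: the
fault on load C is never seen by the handler, because store B, which is
older, reaches the head of the buffer first and everything after it is
discarded. Store D had already finished executing, but since it never
retired, nothing was written to memory.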
> That said, from what you describe, it sounds to me like they've
> actually decomposed the FMA into two micro-ops.

I also don't know the ARM internals. But short latency between carry
in and carry out is important to make the umaal and umlal instructions
useful for bignum multiplication.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.