> Can atom execute two IMUL in parallel? Or what exactly is the pipeline
> behavior?
As I understand from Intel's optimization reference manual, the behavior is as follows: if the instruction immediately following an IMUL has a shorter latency, execution is stalled for 4 cycles (which is IMUL's latency); otherwise, an instruction with a latency of 4 or more cycles can be issued after the IMUL without a stall. In other words, IMUL is pipelined with respect to other long-latency instructions, but not with respect to short-latency instructions.

From reading the patch, I could not understand the link between this pipeline behavior and what the patch appears to do.

Hope that helps.

Alexander
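
P.S. To make the rule above concrete, here is a toy model of it in C. It is only a sketch of my reading of the manual, not a description of the real Atom pipeline: the struct insn type and the schedule() function are hypothetical names made up for illustration, the model is in-order and single-issue, it ignores data dependencies entirely, and it assumes that "stalled for 4 cycles" means the following instruction issues 4 cycles after the IMUL does.

```c
#include <stdio.h>

#define IMUL_LATENCY 4  /* IMUL latency in cycles, as stated above */

/* Hypothetical instruction record: only what this toy model needs. */
struct insn {
    const char *name;
    int latency;  /* result latency in cycles */
    int is_imul;  /* nonzero if the instruction is an IMUL */
};

/*
 * Fill issue[i] with the cycle at which seq[i] issues and return the
 * issue cycle of the last instruction (-1 for an empty sequence).
 *
 * Rule modeled (an assumption based on the description above):
 *   - an instruction right after an IMUL with latency < 4 waits until
 *     IMUL's issue cycle + 4;
 *   - any other instruction issues one cycle after its predecessor.
 */
int schedule(const struct insn *seq, int n, int *issue)
{
    int cycle = 0;

    for (int i = 0; i < n; i++) {
        if (i > 0) {
            const struct insn *prev = &seq[i - 1];

            if (prev->is_imul && seq[i].latency < IMUL_LATENCY)
                cycle = issue[i - 1] + IMUL_LATENCY;  /* stall */
            else
                cycle = issue[i - 1] + 1;             /* back-to-back */
        }
        issue[i] = cycle;
    }
    return n > 0 ? issue[n - 1] : -1;
}
```

Under these assumptions, an IMUL followed by a 1-cycle ADD issues the ADD at cycle 4, while an IMUL followed by a second IMUL issues the second one at cycle 1. That difference is what a scheduler description would presumably need to capture.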